AlgoTurk

Baseten Brief

B2B SaaS · Inference Platform · Reported late-stage financing

Summary

Baseten is an enterprise-focused inference platform that positions itself between low-cost token-API and model-catalog players (Together AI, DeepInfra, Replicate) and general serverless GPU providers (Modal, Fireworks) by emphasizing hybrid/self-hosted deployments, low-latency routing and compliance (SOC 2, HIPAA). The company's site-published signals (a $300M round dated April 2026, total funding listed as $585M and a $5B valuation), plus a January 2026 acquisition and a May 2026 Benchling partnership, suggest an aggressive push into regulated, high-availability accounts; named customers include Stability AI, Patreon, Writer and Rime. Product feedback praises its developer ergonomics and uptime (99.99%) but flags steep MLOps overhead and billing friction. The key tension: can Baseten convert long enterprise procurement cycles into revenue and justify a premium GTM/ops model while competing against much lower-cost, API-first alternatives and navigating mixed public reporting about later fundraising and valuation activity?

Market (TAM): USD 18.84 billion (2025), global AI Inference Platform-as-a-Service (PaaS) market
Total raised: $2B
Momentum: Accelerating
Risk profile: Clean

Bull case

  • Substantive capital and public valuation signal: the company’s site lists a $300M raise (April 2026), total funding $585M and a $5B valuation, giving Baseten credibility and runway to invest in enterprise GTM and compliance.
  • Enterprise-focused differentiators: SOC 2 and HIPAA certifications, published global/hybrid deployment options, and named customers (Stability AI, Patreon, Writer, Rime) enable sales into regulated and high-availability use cases that token-API providers typically miss.
  • Product and reliability signals: site-published 99.99% uptime, positive developer visibility (Product Hunt 5.0/5) and customer testimonials (e.g., Rime, OpenEvidence) point to a capable inference runtime and packaging (Truss-based) for production models.
  • Acceleration through partnerships and M&A: a January 2026 acquisition and a May 2026 Benchling partnership—plus press chatter of a larger raise—indicate active commercial expansion and platform integration moves.

Watch-outs

  • Crowded, price-competitive landscape: Together AI, DeepInfra and Replicate offer lower-cost token APIs and model catalogs, while Modal and Fireworks compete on raw serverless GPU throughput—pricing and feature parity could compress Baseten’s enterprise margins.
  • Mixed user sentiment and operational friction: public reviews flag steep MLOps requirements, pricing/idle-cost complexity, setup overhead and at least one reported billing-support failure, all of which can slow enterprise rollouts.
  • Inconsistent external funding reports: the company site shows a $300M raise and $585M total funding, but outside press references larger $1.5B raises and different totals; these conflicting public reports create valuation/runway ambiguity.
  • Concentration and GTM execution risk: only four customers are named on the site despite the large reported funding and rapid expansion, implying the company must prove it can scale enterprise sales and reduce customer concentration.

From their website

Stats they publish

Raised $300M in April 2026 (funding round)
Total funding $585M
Valuation $5B
Employees ≈250
Platform uptime claim: 99.99%

Pricing

Three plans: Basic (free; $0 per month, pay-as-you-go), Pro (quote-based, with volume discounts) and Enterprise (custom quote). Example Model API token rates: Z AI GLM 5.2 at $4.50 per 1M tokens, GLM 5 at $3.15 per 1M tokens, OpenAI GPT OSS 120B at $0.50 per 1M tokens. Example dedicated compute rates (per minute): T4 $0.01052, H100 $0.10833, CPU 1x2 $0.00058.
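Per-minute rates are hard to compare against a monthly budget, so the following is a minimal sketch that converts the published examples into hourly and always-on monthly figures. The rates are copied from the pricing line above; the function names, the 30-day month and the always-on utilization are illustrative assumptions, not Baseten billing rules.

```python
# Per-minute rates copied from the published pricing examples (USD per minute).
PER_MINUTE_USD = {"T4": 0.01052, "H100": 0.10833, "CPU 1x2": 0.00058}


def hourly_cost(rate_per_minute: float) -> float:
    """Cost of one instance running for a full hour."""
    return rate_per_minute * 60


def monthly_cost(rate_per_minute: float, hours_per_day: float = 24.0, days: int = 30) -> float:
    """Cost of one instance over a month (assumption: 30 days, always on by default)."""
    minutes = 60 * hours_per_day * days
    return rate_per_minute * minutes


def token_cost(tokens: int, rate_per_million: float) -> float:
    """Cost of a token volume at a per-1M-token Model API rate."""
    return tokens / 1_000_000 * rate_per_million


if __name__ == "__main__":
    for name, rate in PER_MINUTE_USD.items():
        print(f"{name}: ${hourly_cost(rate):.4f}/hour, "
              f"${monthly_cost(rate):,.2f} per 30-day month if always on")
```

On these numbers a single H100 works out to roughly $6.50 per hour, which is why the idle-cost complaints in the reviews matter: a dedicated instance accrues that charge whether or not it is serving traffic.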

Product suite

  • Dedicated Inference

    Inference-optimized infrastructure for high-scale model serving

  • Model APIs

    Pre‑optimized, ready-to-use model APIs for rapid prototyping and evaluation

  • Training on Baseten

    Train and deploy models on inference-optimized infrastructure

  • Frontier Gateway

    Deploy inference APIs to monetize and serve models

  • Baseten Chains
  • Truss

Target segments

ML Engineering Teams · AI Research Labs · Startup Product Teams · Enterprise Software Teams

Testimonials

  • The deployment process used to take up so much of our time. Now, it's as simple as a few commands, and we're done. What used to take hours now takes less than one, and the reduced maintenance means we can focus on improving our core product.
    Jagath Jai Kumar · Full Stack Engineer, OpenEvidence · baseten.co
  • Rime's state-of-the-art p99 latency and 100% uptime is driven by our shared laser focus on fundamentals, and we're excited to push the frontier even further with Baseten.
    Lily Clifford · Co-founder and CEO, Rime · baseten.co
  • Thanks to Baseten, we're getting cost-effective high-performance model serving without any extra burden on our internal engineering teams. Instead, we get to focus our expertise on creating the best possible domain-specific LLMs for our customers.
    Waseem Alshikh · CTO and Co-Founder, Writer · baseten.co
  • I want the best possible experience for our users, but also for our company. Baseten has hands down provided both. We really appreciate the level of commitment and support from your entire team.
    Nathan Sobo · Co-Founder, Zed Industries · baseten.co
  • With Baseten, we gained a lot of control over our entire inference pipeline and worked with Baseten's team to optimize each step.
    Sahaj Garg · baseten.co

Compliance & certifications

SOC 2 · HIPAA · CCPA
Trade name: BaseTen Labs, Inc.
Registered address: 201 Spear St, Suite 1600, San Francisco, CA 94105
Social
80% confidence

Crunchbase

Verified profile
BaseTen Labs, Inc. · Active · Private · For Profit
Heat: 80 (+6%)
Growth: 92 (-5%)
CB rank: #275
Monthly visits: 105,818 (+19.3% MoM)
HQ: San Francisco, California, United States
Employees: 101-250
Industries: AI Infrastructure, Artificial Intelligence (AI), Developer Tools, Machine Learning, Software, Software Engineering
Technology: DNSSEC, Cloudflare CDN, Mobile Non Scaleable Content, IPv6, HSTS IncludeSubdomains PreLoad, Google Cloud, Apple Mobile Web App Capable, CrUX Top 50m, Google Universal Analytics, Google

Founders & Team

5
Team size: 11-50 employees (profile listings)
Confidence: High
  • Tuhin Srivastava (Founder)
    CEO and Co-Founder. Previously co-founded Shape (an HR analytics platform) and has held roles across ML and product; raised venture capital for Baseten.
  • Amir Haghighat (Founder)
    CTO and Co-Founder. Prior engineering leadership roles at Gumroad and other engineering teams before Baseten.
  • Phil Howes (Founder)
    Co-Founder & Chief Scientist. Co-founder with Srivastava on prior work (Shape) and holds an advanced degree in mathematics (as reported in public profiles).
  • Pankaj Gupta (Founder)
    Co-Founder. Previously worked as a software engineer on large-scale backend and data infrastructure at companies including Uber and Twitter (reported in public profiles).
  • Vlad Shulman
    Lead, Forward Deployed Engineering. Founder of Retain.ai (acquired by Dagster) and technical lead at fast-growing companies; now leads the FDE function at Baseten.

Baseten's About page lists four co-founders — Tuhin Srivastava (CEO), Amir Haghighat (CTO), Phil Howes (Chief Scientist) and Pankaj Gupta — who are presented as the founding team. Company blog posts identify additional leaders (for example, Vlad Shulman leads Forward Deployed Engineering); broader employee counts differ across external profiles.

  • The company's About page explicitly names its four co-founders and their titles; background items (prior employers, degrees, acquisitions) are drawn from third‑party profiles and press cited elsewhere in the provided text.
  • Public employee/headcount figures vary across aggregators (examples in the provided text show both 11–50 and ~238); Baseten's website does not publish an authoritative headcount.
  • Several other individual names appear in aggregator sources (Crunchbase, etc.) included in the provided text; only people explicitly tied to Baseten in company pages or Baseten blog posts were listed here.
Sources
How we map the team: the founders and key leaders, taken from the company's own team page and cross-checked against public profiles.
Searches we ran: Baseten founder OR co-founder OR CEO; Baseten leadership team OR management; Baseten about us team

Market Size

TAM (total addressable): USD 18.84 billion (2025), global AI Inference Platform-as-a-Service (PaaS) market
Source: MarketsandMarkets global AI Inference PaaS market size for 2025 (publisher: MarketsandMarkets via PR Newswire). This PaaS figure most closely matches Baseten's product (managed/platform inference).
Confidence: High

The most relevant published top‑down figure for Baseten's product is MarketsandMarkets' global AI Inference Platform-as-a-Service (PaaS) estimate of USD 18.84B (2025), which is selected as the TAM because it measures platform/PaaS revenue (closest match to Baseten's managed inference offering). A defensible bottom-up SAM could not be calculated from available public sources because there is no published, reliable count of addressable ML/AI engineering teams nor a representative annual spend (ACV) per customer that maps Baseten's unit pricing to yearly customer revenue; therefore SAM and SOM are not estimated here.

Cited sources
  • MarketsandMarkets (reported via PR Newswire): USD 18.84 billion in 2025 and USD 105.22 billion by 2030, registering a CAGR of 41.1% · Global · 41.1% CAGR
  • Grand View Research: USD 97.24 billion in 2024 and USD 253.75 billion by 2030, CAGR 17.5% (2025-2030) · Global · 17.5% CAGR
  • Technavio: The AI inference market is expected to grow by USD 128.85 billion from 2024 to 2029, at a CAGR of 19.4% · Global · 19.4% CAGR
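The MarketsandMarkets endpoints and growth rate can be checked against each other with the standard compound-growth formula. The sketch below uses only the two published endpoint values and the 2025-2030 window; the function names are illustrative.

```python
def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by two endpoint values."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value: float, cagr: float, years: int) -> float:
    """Value after compounding start_value at cagr for the given number of years."""
    return start_value * (1 + cagr) ** years


# MarketsandMarkets figures quoted above: USD 18.84B (2025) to USD 105.22B (2030).
mm_cagr = implied_cagr(18.84, 105.22, 2030 - 2025)

if __name__ == "__main__":
    print(f"Implied CAGR 2025-2030: {mm_cagr:.1%}")
    print(f"18.84 compounded at 41.1% for 5 years: {project(18.84, 0.411, 5):.2f}")
```

The implied rate rounds to the published 41.1%, so the PaaS figures are internally consistent. The same check is less direct for the other two providers, whose quoted base years and scopes differ from their stated CAGR windows.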
Assumptions
  • Different research providers use different scopes: some 'AI inference' totals include hardware and services (much broader than PaaS/platform software) and therefore are larger than the platform-specific PaaS figure used as TAM.
  • MarketsandMarkets' PaaS figure is the best published proxy for a platform vendor TAM but may double-count managed service revenue and cloud provider offerings; treat it as a market-level upper bound for PaaS vendors.
  • A bottom-up SAM requires (a) a grounded count of target customers (ML/AI teams, enterprise product teams) and (b) a representative annual spend per customer. Baseten publishes per-model pricing but not typical customer usage or ACV, so SAM cannot be reliably derived from public data.
  • Public traction and revenue estimates for Baseten in third-party aggregator sites are inconsistent (examples in public sources range widely), so anchoring SOM to claimed ARR without audited disclosures would be unreliable.
Sources
How we size the market: an AI web search across published market research, cross-checked against a bottom-up estimate for this company's specific space.
Searches we ran: ai inference platform market size; ai inference platform market size CAGR forecast; number of ai inference platform customers worldwide

Competitors

Top 5

The competitive set splits between low-cost token APIs and model catalogs (DeepInfra, Together AI, Replicate) and serverless compute providers (Modal, Fireworks AI). Baseten sits between them as an enterprise-focused, inference-specialized platform emphasizing hybrid/self-hosted deployments, low latency and compliance.

  • Together AI
    together.ai · Serverless per-token API with broad open-model catalog and dedicated GPU clusters; Baseten differentiates via enterprise governance, multi-cloud/self-hosted deployment and SOC 2/HIPAA compliance.
  • DeepInfra
    deepinfra.com · Low-cost, OpenAI-compatible serverless APIs and dedicated GPU clusters; Baseten wins on low-latency multi-cloud routing, custom runtimes, enterprise compliance, and hybrid/self-hosted options.
  • Replicate
    replicate.com · API-first, pay-per-prediction model catalog for prototyping; Baseten is stronger for production: enterprise controls, SLAs, self-hosting and high availability.
  • Fireworks AI
    fireworks.ai · High-throughput serverless inference (claims trillions of tokens/day) and voice-agent stack; Baseten competes on custom model runtimes, multi-cloud low latency and enterprise compliance.
  • Modal
    modal.com · General-purpose serverless GPU compute with Python-native developer ergonomics; Baseten is inference-specialized with optimized runtimes, low-latency SLAs and hybrid deployments.
Sources
How we find competitors: an AI web search for the company's closest rivals and how it's positioned against them. Each name is verified before it's listed.
Searches we ran: Baseten ai inference platform alternatives; Baseten ai inference platform competitors; ai inference platform companies like Baseten

Funding

Total raised: $2B
Confidence: Medium
From Crunchbase
  • Venture Round · Undisclosed · June 28, 2021
    Source: Crunchbase
  • Series A · $20M · April 26, 2022
    Investors: Greylock (lead), South Park Commons, AI Fund, Caffeinated Capital, Greg Brockman, Dylan Field, Mustafa Suleyman, DJ Patil
    Source: Crunchbase
  • Seed Round · Undisclosed · April 26, 2022
    Investors: Greylock (lead), South Park Commons (lead)
    Source: Crunchbase
  • Series B · Undisclosed · March 4, 2024
    Investors: IVP (lead), Spark Capital (lead)
    Source: Crunchbase
  • Series C · Undisclosed · February 19, 2025
    Investors: IVP (lead), Spark Capital (lead)
    Source: Crunchbase
  • Series D · $150M · September 5, 2025
    Investors: Bond (lead)
    Source: TechCrunch
  • Series E · $300M · January 23, 2026
    Investors: CapitalG (lead), IVP (lead), BOND, 01 Advisors, Greylock, Altimeter, BoxGroup, Battery Ventures, NVIDIA, Blackbird Ventures, Conviction, 01A
    Source: Baseten (company blog)
  • Reported late-stage financing · $1.5B · June 18, 2026
    Investors: Spark Capital (lead), Sands Capital (lead), Altimeter Capital (lead), Wellington Management (lead)
    Source: TechCrunch

Public records show Baseten raised a $300M Series E at a reported $5B valuation (company blog, Jan 23, 2026). Crunchbase lists prior rounds, including a $20M Series A (Apr 26, 2022) and several undisclosed early- and mid-stage rounds. TechCrunch and other press report a $1.5B fundraise in June 2026 and reference a prior $150M Series D. The documented, closed amounts in the provided sources sum to $470M ($20M + $150M + $300M).

  • The $1.5B raise reported in multiple press outlets (TechCrunch, Wall Street Journal referenced) is described as "close to finalizing"/"reported" in the available excerpts — it is included as a reported financing but not as a company-confirmed closed round in the supplied sources.
  • Crunchbase entries supplied in the corpus list additional rounds with undisclosed amounts; several early/mid-stage rounds lack amounts in the provided feeds, so the round-level sum here ($470M) is a floor based only on rounds with explicit USD amounts in the supplied texts.
  • Series D ($150M) is cited in press copy (TechCrunch referencing the WSJ) but the provided excerpts do not include a standalone company or press release confirming details (lead investors and exact date are not specified in the supplied TechCrunch snippet).
  • Multiple aggregator pages cited elsewhere in the corpus (PitchBook/Tracxn/Sacra excerpts appearing in the input) claim different totals (commonly ~$585M prior to the Series E); those pages did not have explicit [SOURCE] markers in the supplied blocks, so their figures are noted but not treated as canonical here.
  • Some supplied snippets conflict on early-stage/seed details (different seed leads or amounts); rounds and investors are only attributed to specific rounds when the provided source text explicitly ties them to that round.
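The two totals discussed above can be reproduced from the round list. This is a minimal sketch using only the amounts shown in this section (in millions of USD); the `Round` type and the `closed` flag are illustrative constructs, and treating the Series D as closed follows the $470M floor stated above.

```python
from typing import List, NamedTuple, Optional


class Round(NamedTuple):
    name: str
    amount_musd: Optional[float]  # None when the amount is undisclosed
    closed: bool                  # False for press-reported, unconfirmed rounds


ROUNDS: List[Round] = [
    Round("Venture Round (2021)", None, True),
    Round("Seed Round (2022)", None, True),
    Round("Series A (2022)", 20, True),
    Round("Series B (2024)", None, True),
    Round("Series C (2025)", None, True),
    Round("Series D (2025)", 150, True),
    Round("Series E (2026)", 300, True),
    Round("Reported late-stage financing (2026)", 1500, False),
]


def total_musd(rounds: List[Round], include_reported: bool) -> float:
    """Sum disclosed amounts, optionally including unconfirmed reported rounds."""
    return sum(
        r.amount_musd
        for r in rounds
        if r.amount_musd is not None and (r.closed or include_reported)
    )


closed_floor = total_musd(ROUNDS, include_reported=False)
with_reported = total_musd(ROUNDS, include_reported=True)

if __name__ == "__main__":
    print(f"Closed rounds with disclosed amounts: ${closed_floor:,.0f}M")
    print(f"Including the reported $1.5B round:   ${with_reported:,.0f}M")
```

The first figure is the $470M floor. The second, $1,970M, sits close to the $2B headline total, which suggests (the sources do not state it) that the headline includes the reported but unconfirmed round. Undisclosed rounds are excluded from both sums, so the true total may be higher.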
Sources
How we track funding: an AI web search across funding databases and press for disclosed rounds, amounts and investors. Conflicting totals are reconciled to the best-evidenced figure.
Searches we ran: Baseten funding round raised; Baseten investors; Baseten seed round; Baseten ai-inference-platform funding; Baseten funding 2026

Momentum

Trajectory: Accelerating
Confidence: High
  • Press · June 19, 2026
    Baseten is reportedly nearing a $1.5 billion funding round that would value the company at up to $13 billion · Tech Funding News
  • Press · June 19, 2026
    Baseten reportedly closing a $1.5 billion funding round (report) · FinSMEs
  • Press · June 18, 2026
    Baseten is close to finalizing a $1.5 billion funding round at a $13 billion valuation (report) · TechCrunch
  • Press · June 18, 2026
    Baseten is finalizing a $1.5 billion fundraising round (report) · PYMNTS.com
  • Press · May 27, 2026
    Baseten in talks to raise $1 billion at an $11 billion valuation (report) · The Information
  • Partnership · May 20, 2026
    Benchling and Baseten announced Benchling Inference to give biotech customers scalable GPU capacity · Benchling / PR Newswire
  • Funding · January 23, 2026
    Baseten raised $300M at a $5B valuation (Series E) · Baseten (company announcement)
  • Milestone · January 23, 2026
    Baseten announced it has acquired Parsed, a reinforcement learning startup · Baseten / Business Wire
  • Funding · September 5, 2025
    Baseten raised $150M in a Series D · TechCrunch
  • Funding · February 19, 2025
    Baseten closed a Series C (undisclosed) on February 19, 2025 · Crunchbase
  • Funding · March 4, 2024
    Baseten closed a Series B (undisclosed) on March 4, 2024 · Crunchbase
  • Funding · April 26, 2022
    Baseten closed a $20M Series A on April 26, 2022 · Crunchbase
  • Funding · April 26, 2022
    Baseten raised a Seed round (undisclosed) on April 26, 2022 · Crunchbase
  • Funding · June 28, 2021
    Early venture round (undisclosed) dated June 28, 2021 · Crunchbase

Baseten shows accelerating momentum: a January 2026 $300M round and acquisition, a May 2026 partnership with Benchling, and multiple June 2026 press reports that the company is pursuing a new $1.5B raise at double‑digit billion-dollar valuations. Historical fundraising activity (Series D 2025, Series C 2025, earlier rounds) supports a rapid growth trajectory.

  • Several June 2026 items are press reports about an in-progress or rumored financing (reported $1.5B / split-priced $11B–$13B structure) and are not presented as closed transactions in the supplied text.
  • Some company announcements and the Series E announcement appear in the supplied text without a direct source URL; those entries use the verbatim company wording but have null URLs in the record.
  • Historical round details are taken from the provided Crunchbase lines; discrepancies in exact announcement dates (e.g., Jan vs. Apr 2026 for the $300M figure in different sources) appear in the supplied material.
Sources
How we read momentum: an AI web search for recent dated signals (funding, hires, launches and partnerships), with same-name companies filtered out.
Searches we ran: Baseten news announcement; Baseten 2026; Baseten funding investment round 2026; Baseten ai-inference-platform launch partnership

Customer Sentiment

11
Overall: Mixed
Confidence: Medium
What customers praise
  • Fast, dependable model serving
    Multiple reviews praise Baseten for low-latency, reliable model serving and the ability to handle production workloads with consistent throughput.
    "Reviews praise Baseten for dependable, fast model serving and smooth scaling, with an easy path from models to live APIs and low ops overhead." · Product Hunt
  • Easy path from model to live API / intuitive tooling
    Users report quick deployment from models to live APIs, with intuitive onboarding, clear tooling, and dashboards for versions, metrics and logs.
    "Users echo the sentiment: intuitive onboarding, clear tooling, and scalable deployments that handle growth without friction." · Product Hunt
  • Truss packaging standardizes model handoffs
    The Truss framework is highlighted as a key feature that bundles models and their runtime dependencies into a single, standardized package to ease productionization.
    "Truss helps solve this by bundling the model and everything it needs to run into a single, standardized package." · eesel.ai
  • Designed for ML engineers / MLOps teams
    Reviewers consistently call Baseten developer-first, well suited to ML engineers and MLOps teams who want control over dedicated deployments and hardware.
    "Baseten is a platform for builders. It was designed from the ground up for ML engineers, data scientists, and developers at startups or big companies with dedicated AI teams." · eesel.ai
  • Positive hands-on support cited by some customers
    At least one customer review thanks Baseten for responsive help that enabled a real-time product to meet its reliability needs.
    "A huge thanks to Baseten! Fast, reliable AI inference is critical for real-time horror experiences, and your team delivered beyond expectations." · Product Hunt
Common complaints
  • Poor customer support / billing responsiveness
    One customer reported prolonged silence from chat support, repeated incorrect invoices, and unresolved billing updates over several days.
    "If they can't fix a basic billing field in over a week, and can't communicate while doing it, I honestly don't know how anyone can rely on them for anything critical." · Product Hunt
  • Not suitable for non-technical users
    Multiple reviews and analyst articles warn the platform is developer-first and will frustrate non-technical product/support teams.
    "However, Baseten is absolutely not for non-technical users." · eesel.ai
  • Requires MLOps expertise and platform learning (Truss, hardware choices)
    Deploying and optimizing models requires learning Baseten's Truss framework and understanding GPU trade-offs and autoscaling to achieve efficiency.
    "Of course, this also means you need someone on your team who understands the difference between a T4 and an A100 GPU and knows how to make those trade-offs." · eesel.ai
  • Pricing complexity and potential idle-costs from dedicated instances
    Baseten's dedicated-instance deployment model can be cost-effective at high utilization but reviewers warn dedicated instances mean paying for idle time and more complex cost management.
    "Baseten’s dedicated instances can be cost-effective at very high utilization rates. But dedicated instances also mean you pay for idle time and manage scaling yourself." · wavespeed.ai
  • Longer time-to-first-generation / setup overhead
    Analyst comparisons call out hours of setup, Truss learning and configuration versus minutes to first result on fully managed generation APIs.
    "Time to first generation Hours (setup, deploy, configure) Minutes" · wavespeed.ai
  • Limited out-of-the-box generation APIs (overkill for simple image/video needs)
    For teams that only need ready-to-use image or video generation, reviewers say Baseten is more infrastructure and setup than necessary compared with fully managed APIs.
    "If you just need an image or video generation API, Baseten is more infrastructure than you need." · wavespeed.ai

Public user and analyst reviews portray Baseten as a powerful, developer-first inference platform praised for fast, dependable model serving, Truss-based packaging, and scalable deployments; however reviewers also flag steep MLOps requirements, pricing/idle-cost complexity, setup overhead, and at least one reported instance of poor billing support.

  • Most cited signals are a mix of a small number of user reviews (e.g., Product Hunt), blog/analyst reviews, and aggregator snippets rather than large-scale verified customer review panels.
  • Glassdoor rating is an employee review and not a customer signal; some aggregator scores (tenereteam) may be unverified.
  • Coverage is limited to the provided public reviews and blog posts; public customer feedback volume appears small.
Sources
How we gauge sentiment: an AI web search of third-party review sites for real customer ratings and feedback, shown only when there's enough signal to be fair.
Searches we ran: Baseten reviews; Baseten trustpilot OR g2 OR capterra; Baseten app reviews OR ratings

AI Visibility

We ask ChatGPT WITH web search about the company and show its answers plus the sources it cited — the real AI-search experience. The key signal is share of voice: does the company come up when AI is asked about its space, not just when asked by name.

Models asked: ChatGPT · web search

AI awareness: Strong
Confidence: High

Accuracy of what AI said: Accurate — Direct answer correctly identifies Baseten as an inference platform and matches the profile on product focus, founding team, reported headcount, and the cited late-stage funding/valuation.

ChatGPT · web search · Found
Q: What do you know about Baseten, the company at baseten.co? What do they do, who founded it, size/stage, and any notable facts? Cite your sources.
A:

Baseten is a San Francisco-based company specializing in AI infrastructure, focusing on deploying, serving, and scaling machine learning models in production environments. Their platform offers tools and expertise to transform complex ML models into scalable, production-ready applications, emphasizing performance, scalability, and cost-effectiveness. (baseten.co)

Founders and Leadership: Established in 2019, Baseten was founded by:

  • Tuhin Srivastava (CEO): Former data scientist at Gumroad and co-founder of Shape.
  • Amir Haghighat (CTO): Previously led data platform engineering at Clover Health and served as Head of Engineering at Gumroad.
  • Phil Howes (Chief Scientist): Co-founder of Shape and former ML engineer at Gumroad, holding a PhD in mathematics from the University of Sydney.
  • Pankaj Gupta: Co-founder with a background in engineering and machine learning. (baseten.co)

Company Size and Stage: As of February 2026, Baseten employed approximately 200 people. The company has raised around $585 million across multiple funding rounds, including a $300 million Series E in January 2026, valuing the company at $5 billion. (aiwiki.ai)

Notable Facts:

  • Baseten's platform supports various AI modalities, including large language models (LLMs), transcription, image generation, text-to-speech, and embeddings. (devtune.ai)
  • The company has experienced significant growth, with a 100x increase in inference volume during 2025. (aiwiki.ai)
  • Notable customers include Cursor, Notion, Quora, Patreon, and Clay. (aiwiki.ai)

Baseten continues to focus on providing efficient and scalable solutions for AI model deployment, addressing challenges in performance, scalability, and cost-effectiveness for businesses integrating machine learning into their operations.

ChatGPT · web search · Among leaders
Q: Who are the most notable companies or products in the inference-platform space? Give a concise list with one line each, and cite your sources.
A:

Here is a concise list of notable companies and products in the AI inference platform space:

  1. OpenRouter: Provides access to over 200 models through a single API, offering automatic routing and fallback mechanisms. (oversite.org)
  2. fal.ai: Specializes in low-latency image and video model inference, delivering rapid processing times. (oversite.org)
  3. Together AI: Offers affordable open-source large language model (LLM) inference and fine-tuning services. (oversite.org)
  4. Replicate: Features a broad catalog of models with easy deployment options for various AI applications. (oversite.org)
  5. Groq: Utilizes custom silicon to achieve ultra-low latency in AI inference tasks, making it ideal for real-time applications. (gmicloud.ai)
  6. Cerebras: Employs wafer-scale engine technology to deliver high throughput for large-scale AI inference workloads. (gmicloud.ai)
  7. Baseten: Focuses on enterprise custom model deployment with a wide selection of GPUs and the open-source Truss framework. (rywalker.com)
  8. DeepInfra: Provides cost-effective AI inference services with a wide range of open-source models. (nortonmediaenterprise.com)
  9. WaveSpeed: Offers exclusive models with a 99.9% service level agreement (SLA), ensuring high reliability. (apidog.com)
  10. Novita AI: Provides affordable multi-model inference services with GPU infrastructure support. (blogs.novita.ai)

These companies and products represent significant advancements in AI inference platforms, catering to various needs such as low latency, high throughput, affordability, and ease of deployment.

ChatGPT · web search · Recommended
Q: If someone asked you to recommend companies or products in the inference-platform space, who would you name? Cite your sources.
A:

When considering companies and products in the AI inference platform space, several notable options stand out:

  1. OpenRouter: Recognized for providing access to over 200 models through a single API, OpenRouter offers automatic routing and fallback mechanisms. Its pay-per-token pricing model typically includes a markup of 0-20% over provider pricing. (oversite.org)
  2. fal.ai: Specializing in image and video model inference with minimal latency, fal.ai operates on a pay-per-request basis. For instance, the FLUX.1 Pro model is priced at approximately $0.05 per image, while the Wan 2.2 model costs around $0.15 per video. (oversite.org)
  3. Together AI: This platform is noted for its affordable open-source large language model (LLM) inference and fine-tuning capabilities. Pricing includes Llama 3.3 70B at $0.54 per million input tokens, Mixtral 8x22B at $0.60 per million, and FLUX.1 at $0.04 per image. (oversite.org)
  4. Replicate: Offering a broad catalog of models and straightforward deployment, Replicate charges on a pay-per-second of compute basis. For example, the FLUX.1 Pro model is priced at approximately $0.05 per image, with variable pricing for LLMs. (oversite.org)
  5. WaveSpeed: Known for delivering maximum speed on supported image and video models, WaveSpeed operates on a pay-per-request model, offering competitive pricing comparable to fal.ai for supported models. (oversite.org)
  6. Groq: Distinguished by its proprietary Language Processing Unit (LPU), Groq provides ultra-low latency inference, making it suitable for real-time applications like chatbots and voice AI. (gmicloud.ai)
  7. Cerebras: Utilizing its Wafer-Scale Engine (WSE), Cerebras offers high throughput inference, ideal for bulk processing and offline generation tasks. (gmicloud.ai)
  8. Novita AI: An AI and agent cloud platform, Novita AI provides over 200 APIs covering LLMs, image, video, and audio processing. LLM inference starts at $0.02 per million input tokens, offering a wide range of models under a single account and billing system. (blogs.novita.ai)
  9. DeepInfra: Recognized for its cost-effective per-token rates across a broad catalog of open-source models, DeepInfra is suitable for budget-conscious deployments. (nortonmediaenterprise.com)
  10. Baseten: Catering to enterprise custom model deployment, Baseten offers a wide selection of GPUs and utilizes the open-source Truss framework for model serving. (rywalker.com)

These platforms offer diverse features and pricing structures, catering to various AI inference needs.

Baseten is surfaced unprompted by the AI in both category and recommendation answers, indicating strong visibility within the inference-platform space. The AI's direct description aligns with the company's product, team, and the profile's funding claims, but reported funding/valuation figures should be independently verified before reliance.

  • AI answers can vary run-to-run; presence in one set of responses does not guarantee consistent surfacing across other queries or times.
  • Funding totals, valuation, and growth claims cited by the AI come from secondary aggregators and should be validated from primary sources or major press coverage.
  • Public web results for private startups can be incomplete or out of date; treat team, customer, and headcount lists as indicative and verify during diligence.

Risks & Red Flags

No material public risk signals found. An absence of publicly reported concerns is not a clean bill of health: early-stage or private companies may simply not generate coverage, and private disputes, confidential investigations, or unreported incidents could exist.

Search was limited to publicly-available sources and news as of 2026-06-20; no non-public or proprietary records were reviewed.

Sources reviewed
How we check for red flags: an AI web search for any publicly reported lawsuits, regulatory action, negative press, layoffs or security incidents about this specific company. Only verifiable, clearly-attributed concerns are shown, and an absence of findings isn't a clean bill of health.
Searches we ran: Search NEWS, regulators, courts and watchdog reports for publicly reported RISKS or red flags about Baseten (the company at baseten.co; Baseten is an AI inference platform for deploying, optimizing, and scaling low-latency model runtimes across cloud or self-hosted infrastructure). Look for: lawsuits or legal disputes; regulatory actions, fines, investigations or bans; negative press or controversy; distress signals (layoffs, shutdown, key-executive departures, funding trouble); security incidents (data breach, hack, major outage); and serious customer complaints or fraud/scam allegations. Use INDEPENDENT third-party sources (news outlets, regulators, courts, reputable trade press), NOT baseten.co's own website or company-owned blogs. Check Global sources. Make sure every result is about THIS company, not a different organization with a similar name. Cite each source. If genuinely nothing material is found, say so.

Frequently asked questions

What does Baseten do?

Baseten provides an inference platform to deploy, serve, and scale open-source and custom AI models in production.

How much funding has Baseten raised?

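Baseten's total raised is listed as $2B, a figure that appears to include the reported $1.5B late-stage financing, which the supplied sources do not confirm as closed; rounds with disclosed amounts sum to $470M. Its most recent company-confirmed round was a $300M Series E in January 2026, led by CapitalG and IVP.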

Who are Baseten's main competitors?

Baseten's named competitors include Together AI, DeepInfra, Replicate, Fireworks AI and Modal.

Who founded Baseten?

Baseten was founded by Tuhin Srivastava, Amir Haghighat, Phil Howes and Pankaj Gupta.

Where is Baseten headquartered?

Baseten is headquartered in San Francisco, California, United States.

What market does Baseten operate in?

Baseten operates in the B2B inference platform market. Its total addressable market is estimated at USD 18.84 billion (2025), based on the global AI Inference Platform-as-a-Service (PaaS) market.

How does Baseten's pricing work?

Baseten offers three plans: Basic (free; $0 per month, pay-as-you-go), Pro (quote-based, with volume discounts) and Enterprise (custom quote). Example Model API token rates: Z AI GLM 5.2 at $4.50 per 1M tokens, GLM 5 at $3.15 per 1M tokens, OpenAI GPT OSS 120B at $0.50 per 1M tokens. Example dedicated compute rates (per minute): T4 $0.01052, H100 $0.10833, CPU 1x2 $0.00058.

Compiled by AlgoTurk from public web sources. Not investment advice.
