Key takeaways
- Seat-based SaaS pricing assumed headcount scales with revenue — AI broke that chain.
- A new line item, token spend ("variable cognitive labor"), now sits between headcount and infrastructure — routinely $15K–$150K+ per month.
- The denominator is collapsing: new customers hit the same output with far fewer seats, so they never buy them in the first place — a structural pricing collapse, not churn.
- SaaS economics is now a three-body problem: falling headcount cost (but pricier talent), rising token spend (but cheaper per unit), and revenue squeezed from both sides.
- Winners price on outcomes, treat token cost as a core competency (lowest cost per resolution), and build proprietary feedback loops that compound.
The SaaS business model was designed for a world where headcount scaled linearly with revenue. That world ended about eighteen months ago. Most CEOs and CFOs just haven't updated the spreadsheet.
A typical Series A SaaS company in 2022–23 needed forty to fifty people to reach the operational velocity investors expected. The same company in 2025–26 needs fifteen to twenty. The engineering team that once required eight backend developers, three frontend specialists, and a dedicated DevOps hire now ships equivalent output with four engineers and an AI compute budget that would have seemed absurd three years ago.
The cost structure did not shrink. It shifted. And that shift is rewriting the fundamental economics of how software companies price, grow, and survive.
01 The vanishing denominator
Seat-based pricing was never really about seats, which were a proxy for organizational complexity. The more people a company employed, the more software licenses it needed, and the more a SaaS vendor could charge. Salesforce built a $30 billion annual revenue business on this assumption. So did ServiceNow, Atlassian, and virtually every mid-sized SaaS company founded between 2005 and 2020. When I was the CPO of social media and e-commerce platforms, we charged enterprises a flat platform fee and then a seat-based licensing fee for different teams.
The assumption was elegant and, for two decades, correct. Revenue growth at the customer correlated with headcount growth. Headcount growth correlated with seat expansion. Seat expansion correlated with net revenue retention above 120 percent. The entire SaaS valuation framework — from public market multiples to venture capital term sheets — was constructed on this chain of correlations.
AI broke the chain.
When Klarna announced in early 2024 that its AI assistant was handling two-thirds of customer service conversations, doing the equivalent work of 700 full-time agents, the implication for seat-based vendors was immediate. Every SaaS tool that sold per-seat licenses to Klarna's customer service organization just lost the majority of its addressable market inside that single account. Not because Klarna stopped needing the capability, but because Klarna stopped needing the humans who required the seats.
Klarna is not an outlier. It is a leading indicator. As of mid-2026, we have observed the same pattern across dozens of Series A and Series B companies. Customer support teams that once numbered thirty operate with eight. Marketing departments that required twelve specialists now run with four people and a token budget. Engineering organizations that justified twenty headcount slots deliver more code, more reliably, with teams of six.
The denominator in the per-seat equation is collapsing. And it is not coming back.
02 The token line item
Something strange has appeared in the financial models of well-run startups. Between headcount costs and infrastructure costs — the two line items that historically defined a SaaS company's burn rate — a new category has emerged: token spend.
At OpenAI's current pricing, a company making heavy use of GPT-5.5 pays $5.00 per million input tokens and $30.00 per million output tokens. Anthropic's frontier models sit in a comparable range. For a Series A company running AI across customer interactions, code generation, content production, data analysis, and internal operations, monthly token costs now routinely land between $15,000 and $80,000. Each month's bill is comparable to the annual entry-level salary of a marketing analyst or a customer support representative. At the upper end, companies processing high volumes of complex reasoning tasks report token bills exceeding $150,000 per month.
To put that in context, a senior software engineer in San Francisco costs roughly $250,000 per year fully loaded, or about $21,000 per month. A company spending $80,000 monthly on tokens is paying the equivalent of four senior engineers — except those tokens are doing work that previously required eight to twelve people across multiple functions.
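The arithmetic behind that comparison can be sketched in a few lines of Python. The per-million-token prices and the $250,000 fully loaded cost come from the text above. The monthly token volumes are hypothetical, chosen only so the bill lands on the $80,000 figure, and the function names are illustrative.

```python
# Prices quoted in the text, in dollars per million tokens.
INPUT_PRICE_PER_M = 5.00
OUTPUT_PRICE_PER_M = 30.00


def monthly_token_bill(input_tokens: float, output_tokens: float) -> float:
    """Dollar cost of one month of token usage at the quoted prices."""
    return (input_tokens / 1e6) * INPUT_PRICE_PER_M + (
        output_tokens / 1e6
    ) * OUTPUT_PRICE_PER_M


def engineer_equivalents(monthly_spend: float, loaded_annual_cost: float = 250_000) -> float:
    """How many fully loaded senior engineers the same monthly spend would fund."""
    return monthly_spend / (loaded_annual_cost / 12)


# Hypothetical usage: 7 billion input tokens and 1.5 billion output tokens per month.
bill = monthly_token_bill(7e9, 1.5e9)      # 35,000 + 45,000 = 80,000 dollars
equivalents = engineer_equivalents(bill)   # a little under four engineers

print(f"Monthly token bill: ${bill:,.0f}")
print(f"Senior-engineer equivalents: {equivalents:.2f}")
```

Note how the output side dominates: under these assumed volumes, fewer than a fifth of the tokens account for more than half of the bill.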
This is not a technology expense in the traditional sense. Token costs represent a fundamentally new category: variable cognitive labor. The company is purchasing thinking by the unit, and the units are getting cheaper while the volume increases.
The trajectory matters enormously. OpenAI's pricing for frontier models has dropped roughly 90 percent over the past two years when measured by capability-adjusted cost per token. Anthropic and Google have followed similar curves. The models are getting dramatically more capable while the per-unit cost falls. This creates a compounding advantage for companies that architect their operations around token consumption rather than human headcount.
But it also creates a problem that no one in the SaaS pricing conversation seems to be addressing honestly.
03 The pricing model paradox
Consider a hypothetical SaaS company (call it WorkflowCo) that sells project management software at $25 per seat per month. In 2023, a typical mid-market customer with 300 employees had 40 WorkflowCo licenses, generating $12,000 in annual recurring revenue. By 2026, a comparable new customer achieves similar operational output with 100 employees and needs only 10 licenses. WorkflowCo's revenue from the new account drops to $3,000. WorkflowCo just lost 75 percent of its new revenue from a successful, growing market.
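The WorkflowCo numbers are easy to check by hand, but a small sketch makes the sensitivity to seat count explicit. The seat counts and the $25 price are the ones given above; the function name is illustrative.

```python
def seat_arr(seats: int, price_per_seat_month: float) -> float:
    """Annual recurring revenue from one account under per-seat pricing."""
    return seats * price_per_seat_month * 12


arr_2023 = seat_arr(40, 25)        # 12,000 dollars
arr_2026 = seat_arr(10, 25)        # 3,000 dollars
drop = 1 - arr_2026 / arr_2023     # 0.75, i.e. a 75 percent decline

print(f"2023 account: ${arr_2023:,.0f}")
print(f"2026 account: ${arr_2026:,.0f}")
print(f"Revenue decline per new account: {drop:.0%}")
```

Because price per seat is unchanged, revenue per account falls in exact proportion to seats: going from 40 licenses to 10 removes three quarters of the account's value.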
The evidence from seed investors is not that existing SaaS customers are immediately cutting seats at scale. The stronger signal is that the next generation of startups is being advised to build around agents, services replacement, and outcome-based economics from the beginning.
YC is explicitly looking for companies that replace services, not merely improve them. a16z argues AI can productize outsourced work and push pricing toward outcomes rather than human users. Sequoia and Bessemer are making similar arguments around work performed instead of access sold. The result is subtle but dangerous for seat-based SaaS: the new customer does not churn seats later. They simply never buy them in the first place.
This is not a churn problem. This is a structural collapse of the pricing model.
The reflex response from the SaaS industry has been to pivot toward usage-based or outcome-based pricing. Charge for what gets consumed, not who consumes it. On paper, this sounds reasonable. In practice, it introduces a set of problems that most companies are deeply unprepared for.
Usage-based pricing requires the vendor to absorb variable costs — including token costs — that scale unpredictably with customer behavior. A customer that discovers a particularly effective AI workflow might increase token consumption by 400 percent in a single quarter. The SaaS vendor's cost of goods sold spikes, but the customer expects the price to reflect the marginal cost of computation, not the full cost of the vendor's R&D, go-to-market, and operational overhead.
This is the trap that a16z identified years ago in their analysis of AI economics. The gross margins that defined SaaS — typically 75 to 85 percent — are structurally incompatible with a business model where the primary input cost is variable AI compute. Companies that pass through token costs to customers end up looking less like software companies and more like services businesses. Companies that absorb token costs to preserve margin end up subsidizing their heaviest users.
Neither option produces the economics that venture capital has spent two decades optimizing for.
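A rough sketch shows why both choices erode margin. Every dollar figure below is a hypothetical assumption for illustration, not data from any company. The only inputs taken from the text are the 400 percent consumption spike and the 75 to 85 percent margin band used as the benchmark.

```python
def margin_absorbed(revenue: float, other_cogs: float, token_cost: float) -> float:
    """Gross margin when the vendor keeps price fixed and eats the token bill."""
    return (revenue - other_cogs - token_cost) / revenue


def margin_passed_through(revenue: float, other_cogs: float, token_cost: float) -> float:
    """Gross margin when tokens are billed to the customer at cost, on top of the base price."""
    total_revenue = revenue + token_cost
    return (total_revenue - other_cogs - token_cost) / total_revenue


# Hypothetical monthly figures for a single account.
REVENUE = 10_000          # base subscription price
OTHER_COGS = 1_500        # hosting, support, and other non-token cost of goods sold
BASELINE_TOKENS = 500     # token cost before the customer finds a heavy workflow
SPIKED_TOKENS = BASELINE_TOKENS * 5   # a 400 percent increase

for label, tokens in [("baseline", BASELINE_TOKENS), ("after spike", SPIKED_TOKENS)]:
    absorbed = margin_absorbed(REVENUE, OTHER_COGS, tokens)
    passed = margin_passed_through(REVENUE, OTHER_COGS, tokens)
    print(f"{label:12s} absorbed: {absorbed:.0%}   passed through: {passed:.0%}")
```

Under these assumed numbers, absorbing the spike takes gross margin from 80 percent to 60 percent. Passing tokens through at cost preserves gross profit in dollars but dilutes the margin to 68 percent, because the added revenue carries no markup. Both outcomes fall below the 75 to 85 percent band.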
04 The compression effect
What makes this moment genuinely different from previous SaaS transitions — the shift from on-premise to cloud, or from perpetual licenses to subscriptions — is the speed at which organizational compression is occurring.
We have spoken with over forty founders in the past several months who describe the same pattern. A company raises a Series A with twelve to fifteen people. Within six months, that team is shipping products at a velocity that would have required thirty-five to forty people in 2022. The AI tooling — Cursor for engineering, Claude for strategy and analysis, various agents for customer operations — does not just make individuals faster. It eliminates entire roles.
The product manager who spent three days writing specifications now works with an AI that generates comprehensive specs in three hours. The QA team of four becomes a single engineer running AI-powered test generation. The data analyst who built dashboards is replaced by a natural language interface that any operator can query directly. The customer success manager handling fifty accounts can now manage two hundred because AI drafts every email, summarizes every call, and flags every risk signal automatically.
Each of these compressions is individually unremarkable. Collectively, they represent a structural transformation of what a company needs to look like at each stage of growth.
Cursor's parent company, Anysphere, reportedly crossed $300 million in annualized revenue by early 2025 with a team of roughly fifty people — $6 million in revenue per employee, a figure that would have been considered implausible for a software company at that scale three years ago. They are not an exception. They are the new template.
Harvey, the legal AI company, is handling entire legal workflows: document review, predictive case analysis, and contract drafting that previously required teams of junior associates. Their product does not make lawyers faster. It replaces the need for certain categories of legal work to be performed by humans at all.
The companies being built today are not leaner versions of their predecessors. They are architecturally different organisms.
05 The three-body problem of SaaS economics
The old SaaS equation had two variables: headcount cost and infrastructure cost. The margin between revenue and those two cost categories determined whether a company was investable. The new equation has three variables that interact in ways that break conventional financial modeling.
Headcount cost is falling — but talent is pricier
Fewer people, more skilled, paid more each.
Headcount cost is declining as a percentage of total spend, but the humans who remain are more expensive because they need to be significantly more skilled. A team of four engineers who can effectively orchestrate AI-augmented development workflows commands higher individual salaries than a team of twelve engineers performing more narrowly defined tasks. Total headcount cost drops, but cost per head increases.
Token cost is rising — but cheaper per unit
Bigger absolute budget, falling cost per unit of output.
Token cost is rising as a percentage of total spend, but on a per-capability basis it is falling rapidly. A company might spend $80,000 per month on tokens today to achieve what would have cost $300,000 per month in human labor eighteen months ago. Next year, the same capability might cost $40,000 in tokens. The absolute token budget grows because companies keep finding new applications, but the cost per unit of output keeps declining.
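The interaction between a falling unit cost and rising volume can be sketched as a simple projection. The halving of unit cost mirrors the $80,000-to-$40,000 example above. The tripling of volume each year, and the definition of a "unit of output", are assumptions made purely for illustration.

```python
def project_token_budget(
    unit_cost: float,
    units: float,
    years: int,
    cost_decline: float = 0.5,    # unit cost halves each year (from the example above)
    volume_growth: float = 3.0,   # hypothetical: usage triples each year
) -> list[tuple[int, float, float]]:
    """Return (year, unit_cost, monthly_budget) for each projected year."""
    rows = []
    for year in range(years):
        rows.append((year, unit_cost, unit_cost * units))
        unit_cost *= cost_decline
        units *= volume_growth
    return rows


# Hypothetical starting point: 10,000 units of output at $8 each, or $80,000 per month.
projection = project_token_budget(unit_cost=8.0, units=10_000, years=3)

for year, cost, budget in projection:
    print(f"year {year}: ${cost:.2f} per unit, ${budget:,.0f} per month")
```

With these assumptions the monthly budget climbs from $80,000 to $120,000 to $180,000 while the cost of each unit falls from $8 to $4 to $2. The budget grows whenever volume growth outpaces the decline in unit cost.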
Revenue is under structural pressure
Customers pay less while demanding more.
The revenue model is under structural pressure because customers are simultaneously shrinking their teams (reducing seat-based revenue) and demanding more AI-native capabilities (increasing the vendor's cost to serve). The customer is getting more value while paying less, and the vendor is delivering more capability while spending more on compute.
The companies that figure out how to navigate this three-body problem will define the next era of software economics.
The companies that do not will find their unit economics deteriorating quarter by quarter, even as their products get better and their customers get happier.
06 What actually works
The emerging answer is not a single pricing model. It is a fundamental rethinking of what a SaaS company sells. The companies we see navigating this transition most effectively share three characteristics. They price on outcomes rather than inputs. They treat token costs as a core competency rather than a pass-through expense. And they build proprietary feedback loops that make their AI capabilities compound over time, creating a moat that raw token access cannot replicate.
We recently worked with a healthcare company to develop a pricing model based on per-patient retention. The clinics pay when a problem gets solved, regardless of whether a human or an AI solved it. This aligns the vendor's revenue with the customer's value creation rather than the customer's organizational size. It also means the vendor is intensely motivated to reduce its own token costs per resolution — a competitive advantage that compounds with every improvement in model efficiency.
This is the structural insight that most SaaS founders are missing. In a world where the primary input is variable AI compute, the competitive advantage belongs to whoever can deliver the most value per token consumed. Not per seat licensed. Not per API call billed. Per unit of customer outcome achieved.
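One way to make "value per token" measurable is to track token cost per resolved outcome. The sketch below is a minimal illustration under stated assumptions: the price per resolution, token counts, and blended token price are all hypothetical, and the class and field names are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class OutcomeEconomics:
    """Unit economics of a single customer outcome under outcome-based pricing."""

    price_per_resolution: float    # what the customer pays per solved problem
    tokens_per_resolution: float   # blended input and output tokens consumed
    blended_price_per_m: float     # dollars per million tokens

    @property
    def token_cost(self) -> float:
        """Token cost in dollars to deliver one resolution."""
        return self.tokens_per_resolution / 1e6 * self.blended_price_per_m

    @property
    def margin(self) -> float:
        """Share of the outcome price left after token cost (other costs ignored)."""
        return 1 - self.token_cost / self.price_per_resolution


# Hypothetical: same price and same model pricing, before and after workflow optimization.
before = OutcomeEconomics(price_per_resolution=2.00, tokens_per_resolution=50_000, blended_price_per_m=10.0)
after = OutcomeEconomics(price_per_resolution=2.00, tokens_per_resolution=20_000, blended_price_per_m=10.0)

for label, econ in [("before", before), ("after", after)]:
    print(f"{label}: ${econ.token_cost:.2f} token cost per resolution, {econ.margin:.0%} margin")
```

In this illustration, cutting tokens per resolution from 50,000 to 20,000 lifts the token-level margin from 75 percent to 90 percent with no change in price. That is the sense in which lowering cost per resolution acts as a competitive lever under outcome pricing.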
That requires a depth of domain-specific optimization that generic AI tools cannot provide. It requires proprietary data flywheels, fine-tuned models, and workflow architecture that turns raw intelligence into reliable business outcomes. It requires, in other words, exactly the kind of deep vertical expertise that the best SaaS companies were always supposed to build — but that seat-based pricing never forced them to develop.