How do AI inference costs and AI product gross margins work in 2027?
Published Jun 14, 2026 · Updated Jun 14, 2026
Direct Answer
AI products carry structurally lower gross margins than traditional SaaS — roughly 50–60% versus 75–90% — because every AI query incurs a real compute cost, which means COGS matter again and reshape pricing, unit economics, and the Rule of 40. Per ICONIQ's January 2026 data, AI gross margins average about 52% (up from 41% in 2024), while mature SaaS runs 75–90%.
The reason is inference: AI companies run COGS of 40–50% of revenue, with inference alone at roughly 23%, versus 10–25% for SaaS. Frontier model inference costs $2–15 per million input tokens and $10–75 per million output tokens, though prices have fallen 10–100x since 2023 thanks to efficiency gains, better hardware, and competition, and Anthropic and OpenAI now offer roughly 90% discounts on cached input tokens.
The structural floor is real: AI margins will likely climb toward 60–65% but are unlikely to reach SaaS's 80%+, because the marginal cost of a query is no longer near zero.
For operators, AI economics offer a clean lesson in three things: why COGS matter again, how to price to cover variable cost, and how to manage margin under real unit costs.
1. COGS Matter Again
The end of near-zero marginal cost
Traditional SaaS had a near-magical property: the marginal cost of serving one more user was near zero, which produced 75–90% gross margins. AI breaks that — every query runs real compute (inference), a genuine variable cost. The near-zero marginal cost that defined SaaS is gone for AI products.
The margin gap
The numbers are stark: AI companies run 40–50% COGS (inference alone is roughly 23%), which yields 50–60% gross margins, versus SaaS's 10–25% COGS and 75–90% margins. ICONIQ's January 2026 data puts the AI average at about 52%. The gap of roughly 20–30 points is structural, not a maturity issue: it reflects the real cost of compute.
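The arithmetic behind the gap can be sketched in a few lines. This is only an illustration: the $100 revenue figure and the specific COGS shares (48% for AI, 20% for SaaS) are assumed values picked from inside the ranges quoted above, not reported data.

```python
def gross_margin(revenue: float, cogs: float) -> float:
    """Return gross margin as a fraction of revenue."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return (revenue - cogs) / revenue


# Assumed, illustrative inputs: $100 of revenue for each business.
ai_margin = gross_margin(100.0, 48.0)    # 48% COGS, inside the 40-50% range
saas_margin = gross_margin(100.0, 20.0)  # 20% COGS, inside the 10-25% range

# Difference between the two margins, expressed in percentage points.
gap_points = (saas_margin - ai_margin) * 100

print(f"AI margin:   {ai_margin:.0%}")
print(f"SaaS margin: {saas_margin:.0%}")
print(f"Gap:         {gap_points:.0f} points")
```

With these assumed inputs the AI business lands at 52% and the SaaS business at 80%, a 28-point gap that comes entirely from the COGS line.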
2. The Inference Cost Curve
Prices falling fast
The one relief is that inference cost is falling fast: 10–100x since 2023, driven by model efficiency, hardware advances, and competitive pricing. Frontier models now cost $2–15 per million input tokens and $10–75 per million output tokens, and token caching discounts repeated input tokens by roughly 90%.
The cost curve is bending down sharply.
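To make the per-token prices concrete, here is a minimal sketch of the cost of one request with and without cached input. The prices ($3 and $15 per million tokens) are assumed values inside the ranges above, the 90% discount is the approximate figure quoted in the text, and the function name and request sizes are hypothetical. It ignores any extra charge a provider may apply when content is first written to the cache.

```python
def request_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: float,
    output_price_per_m: float,
    cached_fraction: float = 0.0,
    cache_discount: float = 0.90,
) -> float:
    """Dollar cost of one request; cached input tokens are billed at a discount."""
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    # Cached tokens cost (1 - discount) of the normal input price.
    billable_input = fresh + cached * (1.0 - cache_discount)
    input_cost = billable_input * input_price_per_m / 1_000_000
    output_cost = output_tokens * output_price_per_m / 1_000_000
    return input_cost + output_cost


# Assumed request: 10,000 input tokens, 1,000 output tokens.
uncached = request_cost(10_000, 1_000, 3.0, 15.0)
cached = request_cost(10_000, 1_000, 3.0, 15.0, cached_fraction=0.8)

print(f"No caching:  ${uncached:.4f}")
print(f"80% cached:  ${cached:.4f}")
```

Under these assumptions the request costs about $0.045 uncached and about $0.023 with 80% of the input served from cache. Output tokens are not discounted, so caching helps most on workloads with long, repeated prompts.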
Why margins still lag
Even with falling costs, AI margins are projected to reach only 60–65%, not SaaS's 80%+ — because demand and usage grow as costs fall, and there is a floor to compute cost. The savings get partly consumed by more usage, so the structural gap narrows but does not close. COGS remain a real line on the P&L.
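A toy model shows how usage growth eats part of the savings. Everything here is an assumption for illustration: revenue is held flat at $100, inference starts at $23 (the ~23% figure above), other COGS is set to $25 so the baseline margin is 52%, and the 50% cost drop and 1.5x usage growth are invented inputs.

```python
def margin_after_shift(
    revenue: float,
    inference_cost: float,
    other_cogs: float,
    unit_cost_factor: float,
    usage_factor: float,
) -> float:
    """Gross margin after unit inference cost and usage volume both change.

    Assumes revenue stays flat, i.e. the extra usage is not separately billed.
    """
    new_inference = inference_cost * unit_cost_factor * usage_factor
    return (revenue - new_inference - other_cogs) / revenue


baseline = margin_after_shift(100.0, 23.0, 25.0, 1.0, 1.0)
# Unit cost halves and usage stays the same.
cost_drop_only = margin_after_shift(100.0, 23.0, 25.0, 0.5, 1.0)
# Unit cost halves, but each customer uses 1.5x as much.
cost_drop_with_growth = margin_after_shift(100.0, 23.0, 25.0, 0.5, 1.5)

print(f"Baseline:                {baseline:.2%}")
print(f"Cost halves:             {cost_drop_only:.2%}")
print(f"Cost halves, usage 1.5x: {cost_drop_with_growth:.2%}")
```

In this sketch a 50% price cut alone would lift margin from 52% to 63.5%, but with 1.5x usage it reaches only 57.75%. The non-inference COGS also does not shrink, which is one way to picture the floor.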
3. The Pricing Implication
Price must cover variable cost
The biggest implication: AI pricing must cover variable COGS. A flat per-seat price that ignores usage can lose money on a heavy user whose inference cost exceeds their fee. This is exactly why usage-based and outcome-based pricing spread in AI — the price must track the cost of serving each customer, which a flat seat price does not.
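The flat-seat problem is easy to see with a small worked example. The $30 monthly seat price, the $0.045 cost per request, and the request counts are all assumed numbers chosen for illustration, and only inference cost is counted.

```python
def seat_margin(seat_price: float, requests: int, cost_per_request: float) -> float:
    """Gross margin on one seat, counting only inference cost."""
    return (seat_price - requests * cost_per_request) / seat_price


def breakeven_requests(seat_price: float, cost_per_request: float) -> float:
    """Monthly requests at which inference cost equals the seat fee."""
    return seat_price / cost_per_request


SEAT_PRICE = 30.0         # assumed flat monthly fee, in dollars
COST_PER_REQUEST = 0.045  # assumed average inference cost per request

light_user = seat_margin(SEAT_PRICE, 200, COST_PER_REQUEST)
heavy_user = seat_margin(SEAT_PRICE, 1_000, COST_PER_REQUEST)
breakeven = breakeven_requests(SEAT_PRICE, COST_PER_REQUEST)

print(f"Light user margin: {light_user:.0%}")
print(f"Heavy user margin: {heavy_user:.0%}")
print(f"Break-even at about {breakeven:.0f} requests per month")
```

With these inputs the light user earns a 70% margin, the heavy user loses 50 cents for every dollar of revenue, and the seat stops being profitable at roughly 667 requests a month.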
The margin-aware pricing model
AI-native companies design pricing margin-first — usage tiers, credits, or outcome fees that ensure each unit of consumption is profitable. The lesson from the 23% inference COGS is that pricing and cost must be linked; decoupling them (flat price, variable cost) erodes margin invisibly until it shows up in the P&L.
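One simple way to express margin-first pricing is to derive the unit price from the unit cost and a target margin: price = cost / (1 - target margin). The sketch below applies that to a hypothetical plan with a base fee, an included allowance, and metered overage. The cost, target margin, and plan shape are assumptions, not any vendor's actual pricing.

```python
def unit_price_for_margin(unit_cost: float, target_margin: float) -> float:
    """Price per unit that yields the target gross margin on that unit."""
    if not 0.0 <= target_margin < 1.0:
        raise ValueError("target_margin must be in [0, 1)")
    return unit_cost / (1.0 - target_margin)


def monthly_bill(requests: int, base_fee: float, included: int, overage_price: float) -> float:
    """Base fee plus metered overage beyond the included allowance."""
    overage = max(0, requests - included)
    return base_fee + overage * overage_price


UNIT_COST = 0.045  # assumed inference cost per request, in dollars
overage_price = unit_price_for_margin(UNIT_COST, 0.60)  # target 60% margin

# Hypothetical plan: $30 base fee that includes 300 requests.
bill = monthly_bill(1_000, base_fee=30.0, included=300, overage_price=overage_price)
cost = 1_000 * UNIT_COST
heavy_user_margin = (bill - cost) / bill

print(f"Overage price per request: ${overage_price:.4f}")
print(f"Heavy user bill: ${bill:.2f}, margin: {heavy_user_margin:.1%}")
```

Because every request beyond the allowance is priced above its cost, a heavy user stays profitable (about 59% margin here) instead of going negative as under a flat seat price.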
4. The RevOps and Finance Lessons
Reintroduce COGS into the model
The clearest lesson is that COGS matter again for AI products. RevOps and finance teams accustomed to SaaS's near-zero marginal cost must reintroduce COGS into pricing, forecasting, and unit economics. Every customer now has a real cost to serve, so margin must be managed at the per-customer level, not assumed away as in classic SaaS.
Price to cover variable cost
The flat-price-variable-cost mismatch is the trap. RevOps should ensure pricing tracks consumption — usage or outcome components — so heavy users do not become unprofitable. The discipline is to know the cost to serve each customer and price above it, the way any business with real COGS must.
Watch margin as a first-class metric
With AI margins structurally lower and pressured by usage, gross margin becomes a first-class metric to manage, not a given. RevOps and finance should track gross margin by product and customer, optimize inference cost (caching, model selection, efficient routing), and treat margin as a lever — because the Rule of 40 and valuation depend on it, and AI does not hand it to you for free.
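Tracking margin per customer can start as a very small report. The sketch below is a hypothetical structure, not a prescribed schema: the `CustomerUsage` fields, the customer names, the dollar figures, and the 50% floor are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class CustomerUsage:
    name: str
    revenue: float         # monthly revenue, in dollars
    inference_cost: float  # monthly inference spend attributed to the customer
    other_cogs: float      # hosting, support, and other cost of revenue


def margin_report(customers, floor=0.50):
    """Return (name, gross_margin, below_floor) rows, worst margin first."""
    rows = []
    for c in customers:
        cogs = c.inference_cost + c.other_cogs
        margin = (c.revenue - cogs) / c.revenue
        rows.append((c.name, margin, margin < floor))
    return sorted(rows, key=lambda row: row[1])


# Assumed sample data.
customers = [
    CustomerUsage("light-user", revenue=1_000.0, inference_cost=100.0, other_cogs=150.0),
    CustomerUsage("typical-user", revenue=1_000.0, inference_cost=230.0, other_cogs=250.0),
    CustomerUsage("heavy-user", revenue=1_000.0, inference_cost=700.0, other_cogs=250.0),
]

report = margin_report(customers)
for name, margin, flagged in report:
    status = "REVIEW" if flagged else "ok"
    print(f"{name:<14}{margin:>7.0%}  {status}")
```

Sorting worst-first puts the accounts that need a pricing or routing change at the top. In this sample only the heavy user falls below the assumed floor.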
5. What to Watch
The questions for 2027 are how far inference costs fall, whether AI margins climb past 65%, and how pricing models mature to protect margin. With AI gross margins at 52% versus SaaS's 80%+ and the gap structural, COGS discipline is now central to AI economics. The durable lessons stand: reintroduce COGS into the model, price to cover variable cost, and watch gross margin as a first-class metric.
FAQ
Why do AI products have lower gross margins than SaaS? Because every query incurs a real compute cost (inference), unlike SaaS's near-zero marginal cost. AI runs 40–50% COGS for 50–60% gross margins, versus SaaS's 10–25% COGS and 75–90% margins — about a 52% average for AI per ICONIQ.
How much does AI inference cost? Frontier models cost $2–15 per million input tokens and $10–75 per million output tokens; standard models cost less. Costs have fallen 10–100x since 2023, and Anthropic and OpenAI offer roughly 90% discounts on cached input tokens.
Will AI margins reach SaaS levels? Probably not. AI gross margins are projected to improve toward 60–65% but are unlikely to reach SaaS's 80%+, because there is a floor to compute cost and usage grows as costs fall, keeping COGS a real expense.
Why does this change pricing? Because pricing must cover variable COGS. A flat per-seat price can lose money on a heavy user whose inference cost exceeds their fee, which is why usage-based and outcome-based pricing spread in AI — the price must track the cost to serve.
What can RevOps learn from AI economics? Reintroduce COGS into pricing and forecasting (the near-zero-marginal-cost era is over for AI), price to cover variable cost so heavy users stay profitable, and watch gross margin as a first-class metric tied to valuation.
Bottom Line
AI products carry structurally lower gross margins, about 52% versus SaaS's 80%+, because inference makes every query a real compute cost, so COGS matter again. Inference prices have fallen 10–100x since 2023, but margins will likely reach only 60–65%, and pricing must cover variable cost, which is what drives usage-based and outcome-based models.
For operators, the lessons are clear: reintroduce COGS into the model, price to cover variable cost, and manage gross margin as a first-class metric.
Sources
- SaaS Mag — The AI COGS problem: SaaS gross margin compression 2026
- The SaaS CFO — Your AI feature is quietly destroying your gross margin
- Startups.com — Inference cost: the per-token economics of running AI
- SoftwareSeni — Why AI gross margins are lower than SaaS and what it means
- Bessemer Venture Partners — The AI pricing and monetization playbook
- CloudZero — Inference cost explained: how to reduce LLM and AI inference spend