How do AI inference costs and AI product gross margins work in 2027?

Pulse RevOps · The Machine · Accepted Answer

Published Jun 14, 2026 · Updated Jun 14, 2026

## Direct Answer

**AI products carry structurally lower gross margins than traditional SaaS — roughly 50–60% versus 75–90% — because every AI query incurs a real compute cost, which means COGS matter again and reshape pricing, unit economics, and the Rule of 40.** Per **ICONIQ's** January 2026 data, AI gross margins average about **52%** (up from **41%** in 2024), while mature **SaaS** runs **75–90%**. The reason is **inference**: AI companies run **40–50%** COGS — with inference alone roughly **23%** — versus SaaS's **10–25%**. Frontier model inference costs **$2–15 per million input tokens** and **$10–75 per million output tokens**, though prices have fallen **10–100x** since 2023 from efficiency gains, hardware, and competition, and **Anthropic** and **OpenAI** now offer roughly **90%** discounts on cached input tokens. The structural floor is real: AI margins will likely climb toward **60–65%** but are unlikely to reach SaaS's **80%+**, because the marginal cost of a query is no longer near zero.

For operators, AI economics are a clean lesson in three things: **why COGS matter again, how to price to cover variable cost, and how to manage margin under real unit costs.**

## 1. COGS Matter Again

### The end of near-zero marginal cost

Traditional **SaaS** had a near-magical property: the marginal cost of serving one more user was **near zero**, which produced **75–90%** gross margins. **AI** breaks that — **every query** runs real **compute** (inference), a genuine variable cost. The near-zero marginal cost that defined SaaS is gone for AI products.

### The margin gap

The numbers are stark: AI companies run **40–50%** COGS (inference ~**23%**) for **50–60%** gross margins, versus SaaS's **10–25%** COGS and **75–90%** margins. **ICONIQ** pegs AI at about **52%**. The **20–30 point** gross-margin gap is structural, not a maturity issue — it reflects the real cost of compute.

```mermaid
flowchart TD
  A[Gross Margin] --> B[Traditional SaaS]
  A --> C[AI Products]
  B --> D[COGS 10-25%]
  D --> E[Margin 75-90%]
  C --> F[COGS 40-50%, Inference ~23%]
  F --> G[Margin 50-60%, Avg ~52%]
  E --> H[Near-Zero Marginal Cost]
  G --> I[Real Compute Cost per Query]
```
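The margin gap is simple arithmetic, and a short sketch makes it concrete. The example below assumes a hypothetical $100 of revenue and picks one COGS share from inside each range quoted above (20% for SaaS, 48% for AI); the function name and the specific figures are illustrative, not taken from ICONIQ's data.

```python
def gross_margin(revenue: float, cogs: float) -> float:
    """Return gross margin as a fraction of revenue."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return (revenue - cogs) / revenue


# Hypothetical $100 of revenue; COGS shares chosen from the quoted ranges.
saas_margin = gross_margin(100.0, 20.0)  # SaaS COGS of 20% (range: 10-25%)
ai_margin = gross_margin(100.0, 48.0)    # AI COGS of 48% (range: 40-50%)

# Gap in percentage points between the two margin profiles.
gap_points = (saas_margin - ai_margin) * 100

print(f"SaaS margin: {saas_margin:.0%}")
print(f"AI margin:   {ai_margin:.0%}")
print(f"Gap:         {gap_points:.0f} points")
```

With these assumed inputs the AI margin lands at the roughly 52% average cited above, and the gap falls inside the 20–30 point range.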

## 2. The Inference Cost Curve

### Prices falling fast

The one relief is that **inference cost is falling fast**: **10–100x** since 2023, driven by model efficiency, hardware advances, and competitive pricing. Frontier models now cost **$2–15 per million input tokens** and **$10–75 per million output tokens**, and **token caching** offers ~**90%** discounts on repeated input. The cost curve is bending down sharply.
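To see how per-token prices and caching translate into a per-query cost, here is a minimal sketch. It assumes a hypothetical query of 10,000 input tokens and 1,000 output tokens, priced at $3 and $15 per million tokens (both inside the ranges quoted above), and it ignores any cache-write surcharge a provider may apply. The function and parameter names are illustrative, not a vendor API.

```python
def query_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: float,
    output_price_per_m: float,
    cached_fraction: float = 0.0,
    cache_discount: float = 0.90,
) -> float:
    """Estimate the cost of one query in USD.

    cached_fraction is the share of input tokens served from cache;
    cache_discount is the price reduction on those tokens (assumed ~90%).
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    # Cached tokens are billed at (1 - discount) of the normal input price.
    billable_input = fresh + cached * (1.0 - cache_discount)
    input_cost = billable_input * input_price_per_m / 1_000_000
    output_cost = output_tokens * output_price_per_m / 1_000_000
    return input_cost + output_cost


# Hypothetical workload: 10k input tokens, 1k output tokens per query.
uncached_cost = query_cost_usd(10_000, 1_000, 3.0, 15.0)
cached_cost = query_cost_usd(10_000, 1_000, 3.0, 15.0, cached_fraction=0.8)

print(f"No caching:  ${uncached_cost:.4f} per query")
print(f"80% cached:  ${cached_cost:.4f} per query")
```

Under these assumptions caching cuts the query cost by roughly half, because the discount applies only to input tokens while output tokens are billed in full.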

### Why margins still lag

Even with falling costs, AI margins are projected to reach only **60–65%**, not SaaS's **80%+** — because **demand and usage grow** as costs fall, and there is a **floor** to compute cost. The savings get partly consumed by more usage, so the structural gap narrows but does not close. COGS remain a real line on the P&L.

```mermaid
flowchart LR
  A[Inference Cost] --> B[Down 10-100x Since 2023]
  B --> C[Efficiency + Hardware + Competition]
  B --> D[90% Caching Discounts]
  C --> E[Lower Per-Query Cost]
  D --> E
  E --> F[Margins Improve Toward 60-65%]
  F --> G[Still Below SaaS 80%+]
```

## 3. The Pricing Implication

### Price must cover variable cost

The biggest implication: AI pricing **must cover variable COGS**. A flat per-seat price that ignores usage can **lose money** on a heavy user whose inference cost exceeds their fee. This is exactly why **usage-based** and **outcome-based** pricing spread in AI — the price must track the **cost** of serving each customer, which a flat seat price does not.
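The flat-seat problem can be illustrated with a break-even calculation. The sketch below assumes a hypothetical $30 monthly seat price and a $0.045 inference cost per query; both numbers and both function names are invented for illustration and count only inference as COGS.

```python
def breakeven_queries(seat_price_usd: float, cost_per_query_usd: float) -> float:
    """Monthly query count at which inference cost equals the seat fee."""
    if cost_per_query_usd <= 0:
        raise ValueError("cost_per_query_usd must be positive")
    return seat_price_usd / cost_per_query_usd


def seat_margin(seat_price_usd: float, queries: int, cost_per_query_usd: float) -> float:
    """Gross margin on one seat, counting inference as the only COGS."""
    return (seat_price_usd - queries * cost_per_query_usd) / seat_price_usd


SEAT_PRICE = 30.0       # hypothetical flat monthly fee
COST_PER_QUERY = 0.045  # hypothetical inference cost per query

breakeven = breakeven_queries(SEAT_PRICE, COST_PER_QUERY)
light_user = seat_margin(SEAT_PRICE, 100, COST_PER_QUERY)
heavy_user = seat_margin(SEAT_PRICE, 1_000, COST_PER_QUERY)

print(f"Break-even: about {breakeven:.0f} queries per month")
print(f"Light user (100 queries):   {light_user:.0%} margin")
print(f"Heavy user (1,000 queries): {heavy_user:.0%} margin")
```

In this scenario the same seat price yields a healthy margin on the light user and a loss on the heavy one, which is the mismatch that usage-based pricing is meant to remove.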

### The margin-aware pricing model

AI-native companies design pricing **margin-first** — usage tiers, credits, or outcome fees that ensure each unit of consumption is **profitable**. The lesson from the **23%** inference COGS is that pricing and cost must be **linked**; decoupling them (flat price, variable cost) erodes margin invisibly until it shows up in the P&L.
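One way to express margin-first pricing is to derive the unit price from the unit cost and a target margin. The sketch below assumes a hypothetical $0.045 cost per unit of consumption and a 60% target gross margin; the function name is illustrative, and real pricing would also account for non-inference COGS.

```python
def price_for_target_margin(unit_cost_usd: float, target_margin: float) -> float:
    """Price per unit such that (price - cost) / price equals target_margin."""
    if not 0.0 <= target_margin < 1.0:
        raise ValueError("target_margin must be in [0, 1)")
    return unit_cost_usd / (1.0 - target_margin)


UNIT_COST = 0.045     # hypothetical cost to serve one unit (e.g. one query)
TARGET_MARGIN = 0.60  # hypothetical target gross margin

unit_price = price_for_target_margin(UNIT_COST, TARGET_MARGIN)
realized_margin = (unit_price - UNIT_COST) / unit_price

print(f"Unit price:      ${unit_price:.4f}")
print(f"Realized margin: {realized_margin:.0%}")
```

Because the price is computed from the cost, every unit sold carries the target margin regardless of how much a given customer consumes.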

## 4. The RevOps and Finance Lessons

### Reintroduce COGS into the model

The clearest lesson is that **COGS matter again** for AI products. RevOps and finance teams accustomed to SaaS's near-zero marginal cost must **reintroduce COGS** into pricing, forecasting, and unit economics. Every customer now has a real **cost to serve**, so margin must be managed at the **per-customer** level, not assumed away as in classic SaaS.

### Price to cover variable cost

The **flat-price-variable-cost** mismatch is the trap. RevOps should ensure pricing **tracks consumption** — usage or outcome components — so heavy users do not become **unprofitable**. The discipline is to know the **cost to serve** each customer and price above it, the way any business with real COGS must.
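Knowing the cost to serve each customer can start as a simple per-customer margin report. The sketch below is one possible shape for it, assuming hypothetical customers, revenue, and cost figures, and an assumed 50% margin floor; none of these values come from the source.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    name: str
    monthly_revenue_usd: float
    monthly_inference_cost_usd: float
    other_cogs_usd: float = 0.0  # hosting, support, etc. (assumed)

    @property
    def gross_margin(self) -> float:
        """Gross margin as a fraction of this customer's revenue."""
        cogs = self.monthly_inference_cost_usd + self.other_cogs_usd
        return (self.monthly_revenue_usd - cogs) / self.monthly_revenue_usd


def flag_low_margin(customers: list[Customer], floor: float = 0.50) -> list[str]:
    """Names of customers below the margin floor, worst margin first."""
    below = [c for c in customers if c.gross_margin < floor]
    return [c.name for c in sorted(below, key=lambda c: c.gross_margin)]


# Hypothetical accounts for illustration only.
customers = [
    Customer("acme", 1_000.0, 200.0, 150.0),
    Customer("globex", 1_000.0, 700.0, 150.0),
    Customer("initech", 500.0, 450.0, 100.0),
]

flagged = flag_low_margin(customers)
for c in customers:
    print(f"{c.name:8s} margin {c.gross_margin:.0%}")
print("Below floor:", flagged)
```

Sorting the flagged list worst-first puts the accounts that are losing money at the top, which is where a pricing or usage-limit conversation is most urgent.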

### Watch margin as a first-class metric

With AI margins structurally lower and pressured by usage, **gross margin** becomes a first-class metric to manage.

