How do you calculate 'true' LTV when you have variable churn by cohort age, and some customers never expand?
Direct Answer
"True" LTV is not a single number you pull from a billing dashboard — it is a cohort-weighted, survival-adjusted, margin-discounted estimate of the future cash a customer will generate, built from the actual retention curve rather than a constant-churn assumption. The single biggest error operators make is dividing ARPA by a blended monthly churn rate (the 1/churn shortcut), which silently assumes churn is constant forever and that every customer expands like the average.
When churn declines with cohort age (it almost always does) and a large share of customers never expand (also the norm), the 1/churn formula can misstate LTV by 40-300%, and the sign of the error depends on the cohort. The correct approach is to model the survival curve with Kaplan-Meier or a parametric hazard model, segment the population into expanders and non-expanders (or model expansion as its own stochastic process), apply gross margin and a discount rate, and report a range with confidence intervals rather than a point estimate.
This guide shows the full calculation, the data you need, the four modeling tiers from spreadsheet to Bayesian, and the board-reporting framing that survives diligence.
TLDR
- The `LTV = ARPA / churn` formula is wrong whenever churn varies by cohort age or customers do not expand uniformly, and that describes nearly every real SaaS business.
- Real churn curves are convex and decaying: early-life churn is high, late-life churn is low. A constant rate fits neither end and biases LTV badly.
- The fix is survival analysis: build a retention curve (Kaplan-Meier for non-parametric, Weibull/log-logistic for parametric), then LTV = sum over months of (survival probability x expected revenue x gross margin), discounted.
- Segment before you average. Split expanders from flat/contracting accounts; a blended NRR of 110% can hide a bimodal population of 140% expanders and 85% leakers.
- Always apply gross margin (not revenue) and a discount rate (8-12% for venture-stage SaaS). Undiscounted gross-revenue "LTV" is a vanity number.
- Report LTV as a range with a confidence interval and a payback period, never a single hero number. Boards and diligence teams trust the operator who shows the error bars.
- Tooling ladder: ChartMogul / Maxio for cohort retention exports, Python `lifelines` for Kaplan-Meier and Cox models, and a Markov-chain or Bayesian model for the most rigorous treatment.
- Cap the LTV horizon (36-60 months) and never let a single whale cohort dominate the blended figure.
I. Why The Textbook LTV Formula Breaks
1. The 1/churn shortcut and the assumption it hides
Almost every SaaS finance deck contains the formula LTV = ARPA / monthly churn rate, sometimes dressed up as LTV = ARPA x gross margin / churn. It is seductive because it needs only two numbers and one division. It is also, in most real businesses, materially wrong.
The formula is the closed-form solution to a very specific model: a customer pays a fixed amount every month, and in every month faces an identical, constant probability of cancelling. Under that model the expected lifetime is 1/churn months and LTV is ARPA/churn. The math is correct. The model is not.
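For reference, the closed form is a geometric series. The derivation below is a sketch that assumes a constant monthly churn probability c, a fixed monthly payment collected at the start of each month, and no discounting; the symbols c and L (lifetime in months) are introduced here for illustration only.

$$
\mathbb{E}[L] = \sum_{t=0}^{\infty} (1-c)^{t} = \frac{1}{c},
\qquad
\mathrm{LTV} = \mathrm{ARPA} \cdot \mathbb{E}[L] = \frac{\mathrm{ARPA}}{c}
$$

Every step relies on c being identical in every month. Once churn depends on tenure, the series no longer sums to 1/c.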
The constant-hazard assumption is the load-bearing wall, and it almost never holds. Real retention curves are convex — churn is high in the first few months while customers are still deciding whether the product earns its place in the stack, then it flattens dramatically as survivors become habituated, integrated, and contractually entrenched.
A cohort that loses 8% in month one might lose only 1.5% in month twenty-four. The 1/churn formula has to pick a single number to represent that whole curve, and whichever number it picks, it is wrong somewhere.
| Assumption in 1/churn | What actually happens in real SaaS | Direction of the error |
|---|---|---|
| Churn is constant across cohort age | Churn decays with tenure (convex survival curve) | Understates lifetime of survivors |
| Every customer expands at the average rate | Population is bimodal: expanders vs. flat/contractors | Overstates LTV for non-expanders |
| Revenue per account is fixed over time | ARPA drifts up (expansion) or down (downgrades) | Both directions, segment-dependent |
| One blended rate represents all segments | SMB churns 3-5x faster than enterprise | Blends a fast-decaying and slow-decaying curve into a meaningless middle |
| Lifetime can be infinite | Boards discount cash; horizons are finite | Overstates by ignoring time value of money |
| Gross margin is 100% | SaaS gross margin is 70-85% | Overstates LTV by roughly 18-43% if you use revenue (revenue is 1/GM times margin) |
2. The magnitude of the error is not small
Operators sometimes accept that 1/churn is "approximate" and assume the error is a rounding issue. It is not. Consider a cohort with a true retention curve that loses 6% in month one, decaying to 1% per month by month twelve and holding there.
The implied average monthly churn over the first year might be 2.5%, giving a naive lifetime of 40 months. But the survivors past month twelve are churning at 1%, implying a conditional lifetime of 100 more months for that group. The naive blended figure understates the value of the loyal tail by more than 2x.
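A small calculation makes the gap concrete. The sketch below assumes the monthly hazard decays geometrically from 6% in month one to 1% in month twelve and then holds at 1%; the geometric shape, the 2,000-month horizon, and all names are illustrative assumptions, so the first-year average will not match the 2.5% quoted above exactly.

```python
def survival_curve(hazards):
    """S(0) = 1 and S(t) = S(t-1) * (1 - h_t) for each monthly hazard h_t."""
    curve = [1.0]
    for h in hazards:
        curve.append(curve[-1] * (1.0 - h))
    return curve


def expected_lifetime(hazards):
    """Expected paid months: sum of S(t) for t = 0 .. len(hazards) - 1."""
    return sum(survival_curve(hazards)[:-1])


HORIZON = 2000  # long enough that the truncated tail is negligible

# Assumed shape: geometric decay from 6% (month 1) to 1% (month 12), then flat.
ratio = (0.01 / 0.06) ** (1 / 11)
first_year = [0.06 * ratio ** m for m in range(12)]
decaying = first_year + [0.01] * (HORIZON - 12)

avg_first_year = sum(first_year) / 12
naive_life = 1.0 / avg_first_year                      # the 1/churn shortcut
constant_life = expected_lifetime([avg_first_year] * HORIZON)
true_life = expected_lifetime(decaying)

print(f"average first-year churn: {avg_first_year:.2%}")
print(f"naive lifetime (1/churn): {naive_life:.1f} months")
print(f"lifetime under the decaying curve: {true_life:.1f} months")
```

Under these assumptions the shortcut and the constant-hazard simulation agree with each other, and both fall well short of the lifetime implied by the decaying curve.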
Now layer in expansion. If 30% of customers expand at 130% NRR and 70% stay flat or contract at 95%, the blended NRR is 0.3*1.30 + 0.7*0.95 = 1.055. A model that applies 105.5% NRR to every customer overstates the lifetime value of the 70% non-expanding majority — the segment that, by headcount, dominates your base — while understating the whales.
The blended number is wrong for everyone; it is right only for a customer who does not exist.
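The arithmetic is easy to check. The sketch below uses the 30/70 split from the paragraph above and compares what a blended-NRR model assumes against what each segment does over three years; the three-year window and the decision to ignore churn are simplifying assumptions.

```python
# Illustrative mix from the text: 30% expanders, 70% non-expanders.
SEGMENTS = {
    "expanders": {"share": 0.30, "nrr": 1.30},
    "non_expanders": {"share": 0.70, "nrr": 0.95},
}

blended_nrr = sum(seg["share"] * seg["nrr"] for seg in SEGMENTS.values())


def revenue_multiple(annual_nrr, years):
    """Revenue after `years` as a multiple of starting revenue (churn ignored)."""
    return annual_nrr ** years


YEARS = 3  # assumed comparison window
assumed = revenue_multiple(blended_nrr, YEARS)
for name, seg in SEGMENTS.items():
    actual = revenue_multiple(seg["nrr"], YEARS)
    print(f"{name}: actual x{actual:.2f}, blended model assumes x{assumed:.2f}")
```

The blended multiple sits between the two segments, so it overstates the non-expanders and understates the expanders at the same time.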
3. The "some customers never expand" problem specifically
The question calls out a critical real-world pattern: a meaningful share of customers buy once and never grow. They are not unprofitable — they pay reliably — but they break any model built on average expansion. If your LTV model bakes in net revenue retention above 100%, it is implicitly assuming every customer is on an upward revenue trajectory.
The non-expanders are not. For them, NRR is a *drag* (downgrades, seat reductions) or, at best, flat.
This is why expansion must be modeled as its own stochastic process, not folded into a blended growth rate. The cleanest framing: every customer has (a) a survival curve governing whether they are still paying, and (b) a revenue-trajectory process governing how much they pay conditional on survival.
The non-expanders simply have a flat or slightly declining revenue trajectory. Average them with expanders only at the very end, weighted by their true population share.
Cross-link: see q420 ("What is 'burn multiple'...") for how LTV feeds capital-efficiency metrics, and q424 ("What metrics should you include in a board-ready unit economics dashboard...") for where LTV sits in the reporting stack.
II. Retention Curves, Survival Functions, And Hazard Rates
1. The three objects you must distinguish
Survival analysis gives you three related but distinct functions, and confusing them is the second most common LTV error after 1/churn itself.
- The survival function S(t): the probability a customer is still active at month `t`. It starts at 1.0 (or 100%) and decays monotonically toward zero. This is your "retention curve."
- The hazard rate h(t): the probability a customer who survived to month `t` churns *during* month `t`. This is "conditional churn." In real SaaS it is high early and low late.
- The retention rate per period: `S(t) / S(t-1)`, the share of last month's survivors who remain. This is what your cohort triangle in ChartMogul actually shows.
The 1/churn formula assumes h(t) is a flat horizontal line. Your job is to recover the real, downward-sloping h(t) from data.
| Function | Symbol | Range | Real-SaaS shape | What it answers |
|---|---|---|---|---|
| Survival | S(t) | 1.0 down to 0 | Convex, decaying | "What % of the cohort is still here?" |
| Hazard | h(t) | 0 to 1 per period | Decreasing | "What % of survivors leave this month?" |
| Cumulative hazard | H(t) | 0 upward | Increasing, concave | "Total accumulated churn pressure" |
| Period retention | S(t)/S(t-1) | 0 to 1 | Rising toward an asymptote | "Month-over-month stickiness" |
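The four objects in the table can all be derived from one column of cohort counts. The sketch below assumes a single fully observed cohort (no censoring) and uses made-up counts; the function name and the choice to accumulate hazards as a simple running sum are illustrative.

```python
def survival_objects(active_counts):
    """active_counts[t] = customers still active t months after signup (t = 0 is signup)."""
    n0 = active_counts[0]
    survival = [n / n0 for n in active_counts]
    hazard = [None]            # no hazard is defined at signup
    cumulative_hazard = [0.0]
    for t in range(1, len(active_counts)):
        prev, cur = active_counts[t - 1], active_counts[t]
        h = (prev - cur) / prev          # share of last month's survivors who left
        hazard.append(h)
        cumulative_hazard.append(cumulative_hazard[-1] + h)
    period_retention = [None] + [1.0 - h for h in hazard[1:]]
    return survival, hazard, cumulative_hazard, period_retention


# Hypothetical cohort of 1,000 customers observed for five months.
counts = [1000, 920, 870, 835, 810, 790]
survival, hazard, cumulative_hazard, period_retention = survival_objects(counts)

for t in range(1, len(counts)):
    print(f"month {t}: S={survival[t]:.3f}  h={hazard[t]:.3f}  "
          f"H={cumulative_hazard[t]:.3f}  retention={period_retention[t]:.3f}")
```

In this example S(t) falls every month while h(t) also falls, which is the convex pattern described above: the curve keeps dropping, but more slowly.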
2. Kaplan-Meier: the non-parametric workhorse
The Kaplan-Meier estimator builds the survival curve directly from observed data without assuming any functional form. For each month t, it computes the period retention (n_t - d_t) / n_t, where n_t is the number of customers "at risk" at the start of the month and d_t is the number who churned.
The survival function is the running product of those period retention rates.
Kaplan-Meier's superpower is right-censoring. Most of your customers have not churned yet — they are still paying. A customer who joined eight months ago and is still active contributes eight months of "survived" observations and then drops out of the at-risk pool without being counted as a churn.
Naive cohort math either ignores these customers (throwing away data) or wrongly treats the end of your observation window as a churn event (overstating churn). Kaplan-Meier handles censoring correctly by definition. Python's lifelines library implements it in three lines: fit KaplanMeierFitter on a duration column and an event-observed boolean.
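To show what the estimator does with censored customers, here is a dependency-free sketch of the product-limit calculation. It is not a substitute for `lifelines`; the function name and the eight-customer dataset are invented for illustration.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Product-limit estimate. observed[i] is True if customer i churned,
    False if they were still active when observation ended (right-censored)."""
    churn_events = Counter(d for d, e in zip(durations, observed) if e)
    exits = Counter(durations)          # churned or censored at each duration
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d_t = churn_events.get(t, 0)
        if d_t:
            survival *= (at_risk - d_t) / at_risk
        curve.append((t, survival))
        at_risk -= exits[t]             # censored customers leave the risk set silently
    return curve


# Hypothetical data: months observed, and whether a churn was actually seen.
durations = [1, 2, 2, 3, 5, 5, 8, 8]
observed = [True, True, False, True, False, True, False, False]

km_curve = kaplan_meier(durations, observed)
for month, s in km_curve:
    print(f"month {month}: S = {s:.3f}")
```

Four of the eight customers are censored. Treating them as churned would drive the curve to zero at month 8, and dropping them would discard half the data. The estimator keeps them in the at-risk denominator for exactly as long as they were observed.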
3. Parametric hazard models: Weibull, log-logistic, exponential
Kaplan-Meier gives you a step function that ends where your data ends. To compute LTV you need the curve to *continue* past your observation window. Parametric models fit a smooth mathematical form to the observed data and let you extrapolate.
- Exponential: the constant-hazard model. This is `1/churn` in disguise. Use it only as a baseline to show how wrong it is.
- Weibull: a two-parameter model whose hazard can rise or fall monotonically. With a shape parameter below 1, hazard *decreases* with age, exactly the convex retention SaaS exhibits. This is the recommended default parametric model for subscription businesses.
- Log-logistic: allows a hazard that rises then falls, useful when there is an early "honeymoon" before churn risk peaks.
- Generalized Gamma: a flexible super-family that nests Weibull and log-normal; use it when you want the data to choose the shape.
Fit two or three forms, compare with AIC/BIC, and pick the one that both fits the observed window and extrapolates sensibly. Sanity-check the extrapolation: a Weibull tail that implies 0.1% monthly churn at month 60 may be too optimistic; cap it.
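The Weibull form and the "cap it" advice can be sketched in a few lines. The code below assumes the common parameterization S(t) = exp(-(t/scale)^shape); the scale, shape, and churn-floor values are hypothetical and would normally come from a fitted model and a judgment call respectively.

```python
import math


def weibull_survival(t, scale, shape):
    """S(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


def weibull_hazard(t, scale, shape):
    """Continuous-time hazard for t > 0; decreasing in t when shape < 1."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def monthly_survival_with_floor(scale, shape, horizon, churn_floor):
    """Monthly curve S(0..horizon) where implied monthly churn never drops below churn_floor."""
    curve = [1.0]
    for t in range(1, horizon + 1):
        implied = 1.0 - weibull_survival(t, scale, shape) / weibull_survival(t - 1, scale, shape)
        curve.append(curve[-1] * (1.0 - max(implied, churn_floor)))
    return curve


SCALE, SHAPE = 60.0, 0.7       # hypothetical fitted parameters
raw = monthly_survival_with_floor(SCALE, SHAPE, 60, churn_floor=0.0)
floored = monthly_survival_with_floor(SCALE, SHAPE, 60, churn_floor=0.015)

print(f"month-60 survival, raw extrapolation: {raw[60]:.3f}")
print(f"month-60 survival, 1.5% churn floor:  {floored[60]:.3f}")
```

The floor only bites once the fitted churn falls below it, so the observed window is left alone and only the extrapolated tail is made more conservative.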
4. Cox proportional hazards: bringing in covariates
The Cox proportional hazards model lets you estimate how customer attributes — plan tier, acquisition channel, company size, onboarding completion, time-to-first-value — shift the hazard up or down. Instead of one survival curve you get a baseline curve plus multipliers per covariate.
This is how you move from "the average customer" to "this *kind* of customer," which is the entire point when your population is heterogeneous. The hazard ratio for "completed onboarding milestone" might be 0.45 — onboarding-completers churn at less than half the rate — a finding that is both an LTV input and a product roadmap mandate.
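Under the proportional-hazards assumption, a hazard ratio translates into survival by raising the baseline curve to that power. The sketch below applies the 0.45 ratio from the text to a made-up baseline curve for customers who did not complete onboarding; in practice both the baseline and the ratio would come from a fitted Cox model.

```python
def apply_hazard_ratio(baseline_survival, hazard_ratio):
    """Proportional hazards: S(t | x) = S0(t) ** hazard_ratio."""
    return [s ** hazard_ratio for s in baseline_survival]


# Hypothetical baseline: survival at months 0, 6, 12, 18, 24 for non-completers.
baseline = [1.00, 0.80, 0.68, 0.60, 0.54]
completers = apply_hazard_ratio(baseline, 0.45)

for months, (b, c) in zip((0, 6, 12, 18, 24), zip(baseline, completers)):
    print(f"month {months:>2}: non-completers {b:.2f}, completers {c:.2f}")
```

A ratio below 1 lifts the whole curve, and the gap widens with tenure, which is why a modest-looking covariate can move LTV substantially.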
Cross-link: q445 ("hiring formula for local Account Executives in unfamiliar APAC/EMEA markets") depends on regional LTV estimates; never set a hiring budget against a blended global LTV.
5. Reading a Kaplan-Meier curve correctly — the four diagnostic regions
A fitted survival curve is not just an input to a formula; it is a diagnostic instrument, and operators who learn to read it catch retention problems months before they show up in MRR. Divide the curve into four regions and interrogate each.
The onboarding cliff (months 0-3) is the steepest portion of nearly every SaaS survival curve. A 6-12% drop here is normal; a 20%+ drop is a product-activation emergency. This region is governed almost entirely by time-to-first-value — whether the customer reached the "aha" moment the product was sold on.
If your curve has a brutal early cliff, no amount of late-stage save motion will fix LTV; the leak is upstream, in onboarding and activation.
The habituation bend (months 4-12) is where a healthy curve visibly flattens. The slope should be measurably gentler than the onboarding cliff. If it is not — if churn stays linear into month 12 — you have a product that customers tolerate but never come to depend on.
That is a stickiness problem that expansion revenue can mask on the NRR line but cannot fix in the survival curve.
The plateau (months 13-30) is where your most valuable signal lives. In a strong SaaS business the survival curve is nearly flat here — the survivors have integrated the product into their workflow, built switching costs, and stopped evaluating alternatives. The monthly hazard in this region is the single most important number for LTV, because the discounted tail of the curve is dominated by it.
A plateau hazard of 0.5%/month versus 1.5%/month is the difference between a 60-month effective life and a 24-month one.
The renewal-driven sawtooth (annual-contract businesses) is a pattern, not a region: survival is flat for eleven months then drops sharply at the twelve-month renewal anniversary, then flat again. If you bill annually, your Kaplan-Meier curve will look like a staircase, and the "churn" is concentrated entirely at renewal dates.
This changes everything about how you model S(t) — you cannot smooth a staircase into an exponential — and it concentrates your retention risk into a few predictable calendar moments, which is operationally a gift.
| Curve region | Months | Healthy signal | Warning signal | Lever that moves it |
|---|---|---|---|---|
| Onboarding cliff | 0-3 | < 12% cumulative drop | > 20% drop | Activation, time-to-first-value |
| Habituation bend | 4-12 | Visible flattening | Linear, no bend | Product depth, feature adoption |
| Plateau | 13-30 | Near-flat, < 1%/mo hazard | Continued decay | Switching costs, multi-product |
| Renewal sawtooth | Anniversary | Predictable, < 10% renewal loss | Large renewal cliffs | CS renewal motion, multi-year terms |
III. From Survival Curve To A Defensible LTV Number
1. The master formula
Once you have a survival curve, true LTV is a discounted, margin-adjusted sum:
$$\mathrm{LTV} = \sum_{t=1}^{T} S(t) \times R(t) \times GM \times \frac{1}{(1+d)^{t}}$$
Where:
- `S(t)` = survival probability at month `t` (from Kaplan-Meier / parametric fit)
- `R(t)` = expected revenue per surviving account at month `t` (incorporates expansion/contraction)
- `GM` = gross margin (decimal, e.g. 0.80)
- `d` = monthly discount rate (annual rate divided by 12)
- `T` = horizon cap (36-60 months recommended)
Notice this formula collapses to ARPA x GM / churn only if S(t) is a constant-hazard exponential, R(t) is constant, and d=0 and T=infinity. Every real refinement you add moves you away from the shortcut.
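The master formula translates almost line for line into code. The sketch below assumes the convention that month 1 is paid with certainty, so `survival[0]` is S(1) = 1.0; the function name and the test inputs are illustrative. It also checks the collapse to the shortcut numerically.

```python
def ltv(survival, revenue, gross_margin, annual_discount_rate, horizon):
    """Sum over t = 1..horizon of S(t) * R(t) * GM / (1 + d)^t.

    survival[t - 1] and revenue[t - 1] hold S(t) and R(t); d is the annual rate / 12.
    """
    d = annual_discount_rate / 12.0
    total = 0.0
    for t in range(1, horizon + 1):
        total += survival[t - 1] * revenue[t - 1] * gross_margin / (1.0 + d) ** t
    return total


# Constant 3% hazard, constant $1,000 revenue, no discount, very long horizon:
churn = 0.03
months = 5000
constant_survival = [(1.0 - churn) ** (t - 1) for t in range(1, months + 1)]
flat_revenue = [1000.0] * months

naive_equivalent = ltv(constant_survival, flat_revenue, 0.80, 0.0, months)
capped_discounted = ltv(constant_survival, flat_revenue, 0.80, 0.10, 48)

print(f"no discount, no cap:   ${naive_equivalent:,.0f}")   # matches ARPA x GM / churn
print(f"10% discount, 48-month cap: ${capped_discounted:,.0f}")
```

Swapping in an empirical survival list and a segment-weighted revenue list is the only change needed to move from the shortcut to the full calculation.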
2. Building R(t): the expansion process
R(t) is where the "some customers never expand" problem lives. Do not model a single average revenue trajectory. Instead:
- Split the base into expanders, flat accounts, and contractors using historical revenue trajectories (a customer is an "expander" if revenue at month 12 exceeds revenue at month 1 by a threshold).
- Estimate a revenue path for each segment. Expanders: a compounding monthly uplift, capped (no account expands forever). Flat: revenue constant. Contractors: a slow decay until they hit a floor or churn.
- Weight by true population share. If 28% of accounts are expanders, 55% are flat, and 17% are contractors, the blended `R(t)` is `0.28 x R_exp(t) + 0.55 x R_flat(t) + 0.17 x R_con(t)`.
- Or model LTV per segment and report all three. This is more honest and more useful: the expander LTV is your "good-fit ICP ceiling," the flat LTV is your "base case," and the contractor LTV is your "warning track."
| Segment | Share of base (illustrative) | Month-1 ARPA | Month-24 ARPA | NRR contribution |
|---|---|---|---|---|
| Expanders | 28% | $1,000 | $1,720 | ~135% |
| Flat | 55% | $1,000 | $1,000 | ~100% |
| Contractors | 17% | $1,000 | $760 | ~88% |
| Blended | 100% | $1,000 | $1,160 | ~108% |
The blended bottom row is the number most dashboards show. The three rows above it are the number you should actually manage.
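The blended row can be reproduced from the three segment rows. The sketch below uses the shares and ARPA endpoints from the table and assumes a straight-line path between month 1 and month 24 (held flat afterwards); the path shape is an assumption made only to keep the example short.

```python
SEGMENTS = {
    "expanders":   {"share": 0.28, "month_1": 1000.0, "month_24": 1720.0},
    "flat":        {"share": 0.55, "month_1": 1000.0, "month_24": 1000.0},
    "contractors": {"share": 0.17, "month_1": 1000.0, "month_24": 760.0},
}


def segment_revenue(segment, month):
    """Assumed straight-line path from month 1 to month 24, flat after month 24."""
    m = min(max(month, 1), 24)
    fraction = (m - 1) / 23
    return segment["month_1"] + fraction * (segment["month_24"] - segment["month_1"])


def blended_revenue(month):
    """Population-weighted R(t) across the three segments."""
    return sum(seg["share"] * segment_revenue(seg, month) for seg in SEGMENTS.values())


for month in (1, 12, 24):
    parts = ", ".join(f"{name} ${segment_revenue(seg, month):,.0f}"
                      for name, seg in SEGMENTS.items())
    print(f"month {month:>2}: blended ${blended_revenue(month):,.0f} ({parts})")
```

Keeping `segment_revenue` separate from `blended_revenue` makes it easy to report per-segment LTV alongside the blended figure rather than instead of it.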
3. Worked example: naive vs. true LTV
Take a cohort: month-1 ARPA $1,000, gross margin 80%, and a real retention curve. The naive approach observes ~3% average monthly churn over 18 months and computes LTV = 1000 x 0.80 / 0.03 = $26,667.
The true approach uses the survival curve below and the segmented revenue paths, discounted at 10% annual (0.83%/month), capped at 48 months.
| Month range | Survival S(t) | Avg revenue R(t) | Margin-adj discounted contribution |
|---|---|---|---|
| 1-6 | 1.00 to 0.83 | $1,010 | $4,310 |
| 7-12 | 0.83 to 0.74 | $1,055 | $3,560 |
| 13-24 | 0.74 to 0.62 | $1,120 | $5,890 |
| 25-36 | 0.62 to 0.54 | $1,170 | $4,420 |
| 37-48 | 0.54 to 0.49 | $1,205 | $3,560 |
| Total (48-mo cap) | | | ~$21,740 |
The true LTV ($21,740) is *below* the naive figure ($26,667) here — because discounting and the 48-month cap bite harder than the convex-curve uplift. In other cohorts with very low late-life churn the true figure exceeds the naive one. The point is not the direction; it is that you cannot know the direction without doing the work, and an 18-23% gap routinely flips a CAC payback decision.
4. The discount rate and horizon cap
Two judgment calls quietly swing LTV by 30%+:
- Discount rate. Venture-stage SaaS commonly uses 8-12% annually as a weighted-average-cost-of-capital proxy. Higher rate, lower LTV. Pick one, document it, hold it constant across reporting periods so trends are comparable.
- Horizon cap. Letting the sum run to infinity gives a curve-fit artifact undue weight. Cap at 36-60 months. A 48-month cap is a defensible default: it captures the bulk of discounted value while refusing to bet the business on a 7-year extrapolation.
| Discount rate (annual) | LTV (illustrative cohort) | vs. zero-discount |
|---|---|---|
| 0% | $27,900 | baseline |
| 8% | $23,100 | -17% |
| 10% | $21,740 | -22% |
| 12% | $20,550 | -26% |
| 15% | $18,900 | -32% |
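A sensitivity sweep like the table above takes only a few lines once the LTV sum exists. The sketch below uses a hypothetical cohort (3% monthly churn decaying 5% per month toward a 0.8% floor, flat $1,000 revenue), so its dollar figures will not match the table; it also shows the small difference between dividing the annual rate by 12 and compounding it.

```python
def monthly_rate(annual_rate, compounding=False):
    """annual / 12 matches the convention in the text; compounding uses (1 + a)^(1/12) - 1."""
    if compounding:
        return (1.0 + annual_rate) ** (1.0 / 12.0) - 1.0
    return annual_rate / 12.0


def capped_ltv(survival, revenue, gross_margin, annual_rate, horizon):
    d = monthly_rate(annual_rate)
    return sum(
        survival[t - 1] * revenue[t - 1] * gross_margin / (1.0 + d) ** t
        for t in range(1, horizon + 1)
    )


# Hypothetical cohort inputs, 60 months long.
survival, s = [], 1.0
for t in range(1, 61):
    survival.append(s)
    s *= 1.0 - max(0.008, 0.03 * 0.95 ** (t - 1))
revenue = [1000.0] * 60

RATES = (0.0, 0.08, 0.10, 0.12, 0.15)
sweep = {rate: capped_ltv(survival, revenue, 0.80, rate, 48) for rate in RATES}
baseline = sweep[0.0]
for rate, value in sweep.items():
    print(f"{rate:>4.0%}: ${value:,.0f} ({value / baseline - 1:+.0%} vs. zero-discount)")
```

Running the same sweep over the horizon cap (36, 48, 60) gives the second axis of the sensitivity grid.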
5. Predictive LTV vs. descriptive LTV — two different questions
There are two distinct LTV questions, and conflating them is a common and costly error. Descriptive LTV asks: "What were the historical cohorts actually worth?" You answer it with full-history data, including post-hoc expander/contractor labels. It is a backward-looking accounting of value already largely realized, useful for understanding which channels and segments performed.
Predictive LTV asks: "What will the cohort we are acquiring *today* be worth?" This is the number that should drive CAC decisions, and it is much harder. You cannot use a customer's month-24 expansion behavior to predict a customer you signed yesterday — that information does not exist yet.
Predictive LTV must be built from early signals only: plan tier at signup, seat count at day 30, onboarding-milestone completion, acquisition channel, firmographic fit score. The Cox model is the right tool because it maps those early covariates onto a hazard multiplier and an expected revenue path.
The trap: many companies compute descriptive LTV from mature cohorts, then spend CAC against it as if it were predictive. If your ICP, pricing, or onboarding has improved, today's cohort may be worth *more* than the historical descriptive number — you are under-investing. If the market has gotten more competitive or you have drifted down-market, today's cohort is worth *less* — you are over-investing and the descriptive number is flattering you into a leaky bucket.
| Dimension | Descriptive LTV | Predictive LTV |
|---|---|---|
| Question answered | "What were past cohorts worth?" | "What will today's cohort be worth?" |
| Data used | Full history, post-hoc labels | Early signals only |
| Primary tool | Cohort triangle, Kaplan-Meier | Cox model, Bayesian with covariates |
| Drives | Channel attribution, segment P&L | CAC budgets, payback decisions |
| Main risk | Treated as predictive when stale | Overfitting thin early-signal data |
6. The expansion-cap discipline
A subtle modeling error inflates LTV more quietly than 1/churn itself: letting expansion compound without a cap. If your expander segment grows revenue 2.5%/month and you run that for 48 months, a $1,000 account becomes a roughly $3,270 account (1.025^48 is about 3.27), and the model treats that as routine. In reality, expansion saturates.
Customers run out of seats to add, departments to onboard, and use cases to fund. Every expander curve has an asymptote.
Model expansion as an S-curve toward a ceiling, not a straight exponential. A defensible approach: estimate the median expander's revenue at month 36 from data, treat that as roughly the ceiling, and have R_exp(t) approach it asymptotically. Capping expansion typically removes 10-20% from a naively-modeled expander LTV — and that is 10-20% of error you do not want a diligence team to find for you.
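One simple way to impose the ceiling is a saturating exponential, sketched below. This is a stand-in for a true logistic S-curve: it matches the uncapped curve's initial growth rate and then levels off. The $1,800 ceiling is a hypothetical value standing in for the median expander's month-36 revenue.

```python
import math


def uncapped_revenue(start, monthly_growth, month):
    """Straight compounding: start * (1 + g)^month."""
    return start * (1.0 + monthly_growth) ** month


def capped_revenue(start, ceiling, monthly_growth, month):
    """Saturating path that starts growing at about monthly_growth and approaches ceiling."""
    k = monthly_growth * start / (ceiling - start)
    return ceiling - (ceiling - start) * math.exp(-k * month)


START, GROWTH, CEILING = 1000.0, 0.025, 1800.0   # CEILING is an assumed value

for month in (0, 12, 24, 36, 48):
    print(f"month {month:>2}: uncapped ${uncapped_revenue(START, GROWTH, month):,.0f}, "
          f"capped ${capped_revenue(START, CEILING, GROWTH, month):,.0f}")
```

The two curves track each other in the early months and diverge sharply after the first year, which is where an uncapped model manufactures most of its phantom LTV.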
IV. Why You Must Segment Before You Average
1. Blended metrics hide bimodal populations
The single most dangerous habit in unit economics is reporting one blended LTV for a heterogeneous base. A company with a blended NRR of 110% can be either (a) every customer expanding at a healthy 110%, or (b) 30% of customers expanding at 145% while 70% slowly leak at 95%. Those are completely different businesses with completely different LTVs, CAC strategies, and product priorities — and they produce the identical blended number.
Frank Slootman's tenure at Snowflake (SNOW) is the canonical expansion story — consumption-based NRR famously ran above 150% — but even there, the *median* customer behaved very differently from the mean, which was dragged up by a tail of hyperscaler-grade accounts. Aaron Levie's Box (BOX) and the Atlassian (TEAM) co-founders Mike Cannon-Brookes and Scott Farquhar have all spoken about how seat-based and consumption-based expansion produce different curves.
The lesson generalizes: report the distribution, not just the mean.
2. The segmentation cuts that matter
| Segmentation axis | Why it changes the survival curve | Typical churn spread |
|---|---|---|
| Plan tier (SMB vs Mid vs Enterprise) | Enterprise has procurement friction, multi-year contracts, switching costs | SMB churns 3-5x faster |
| Acquisition channel | Self-serve / paid-search cohorts churn faster than sales-led or referral | 2-3x spread |
| Annual vs monthly billing | Annual contracts suppress churn mechanically (and front-load risk to renewal) | Monthly churns ~2x |
| Onboarding completion | Customers reaching first-value milestone churn far less | Cox hazard ratio ~0.4-0.5 |
| Industry / use case | Mission-critical use cases are stickier than nice-to-have | Wide, case-specific |
| Cohort vintage | Recent cohorts may reflect ICP drift, pricing changes, or improved onboarding | Trend signal |
3. The expander vs. non-expander split done right
For the specific problem in the question — customers who never expand — the cleanest treatment is a two-stage model. First, classify each historical account at a fixed observation point (e.g., month 12) as expander / flat / contractor. Second, fit a *separate* survival curve and revenue trajectory for each class.
Non-expanders often have *lower* churn than you would expect — they are not unhappy, they are simply at a stable equilibrium — so their flat-revenue, long-survival profile can still produce a respectable LTV. Do not write them off; price and serve them deliberately.
A subtle trap: classification leakage. If you classify a customer as an "expander" using month-12 data, you cannot then use that label to predict their month-3 behavior — that is using the future to predict the past. For predictive LTV (new cohorts), classify on early signals only (onboarding depth, seat count at day 30, plan tier).
For descriptive LTV (existing cohorts), full-history classification is fine.
Cross-link: q432 ("fastest partner enablement curriculum to get partners selling within 30 days") — partner-sourced cohorts deserve their own LTV curve; partner economics rarely match direct.
4. NRR, GRR, and where they fit
- Gross Revenue Retention (GRR) — revenue retained excluding all expansion; caps at 100%. It measures pure leakage and maps directly to the *downside* of your survival curve.
- Net Revenue Retention (NRR) — GRR plus expansion; can exceed 100%. It captures the revenue-trajectory process.
- For LTV, GRR informs `S(t)` and the contraction part of `R(t)`; NRR informs the expansion part of `R(t)`. Reporting both, by segment, gives a board the full picture. A business with 95% GRR and 115% NRR is healthy-but-leaky; one with 88% GRR and 115% NRR is masking a serious retention problem with aggressive expansion.
| Metric | Includes expansion? | Can exceed 100%? | What it tells you about LTV |
|---|---|---|---|
| GRR | No | No | Floor on survival; leakage rate |
| NRR | Yes | Yes | Revenue-trajectory health |
| Logo retention | N/A (counts logos) | No | Survival curve in customer-count terms |
| Cohort dollar retention | Yes | Yes | The triangle your LTV model consumes |
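The three retention metrics differ only in how each account's ending revenue is counted. The sketch below computes them from a list of (starting MRR, ending MRR) pairs for customers active at the start of the period; the five accounts are invented, and a churned account is represented by an ending MRR of zero.

```python
def retention_metrics(accounts):
    """accounts: (starting_mrr, ending_mrr) for customers active at period start."""
    starting = sum(start for start, _ in accounts)
    gross_retained = sum(min(start, end) for start, end in accounts)  # expansion excluded
    net_retained = sum(end for _, end in accounts)                    # expansion included
    logos_kept = sum(1 for _, end in accounts if end > 0)
    return {
        "grr": gross_retained / starting,
        "nrr": net_retained / starting,
        "logo_retention": logos_kept / len(accounts),
    }


# Hypothetical period: one expander, two flat, one contractor, one churned.
accounts = [(1000, 1400), (1000, 1000), (1000, 1000), (1000, 800), (1000, 0)]
metrics = retention_metrics(accounts)
print({name: f"{value:.0%}" for name, value in metrics.items()})
```

Capping each account at its starting revenue is what keeps GRR at or below 100%, while NRR lets the single expander offset part of the leakage.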
V. Spreadsheet To Bayesian — Pick Your Rigor Level
1. Tier 1 — the cohort-triangle spreadsheet
The entry-level method that beats 1/churn without any statistics: export a cohort retention triangle (rows = signup cohort, columns = months since signup, cells = dollar retention). Average down each column to get an empirical S(t) for each month of age, directly from data.
Sum S(t) x R(t) x GM x discount. This is non-parametric, intuitive, and board-legible. Its limit: it cannot extrapolate past your oldest cohort's age, and small recent cohorts make late columns noisy.
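The column-averaging step can be sketched on a ragged triangle. The version below pools dollars across every cohort old enough to have reached each age, which weights cohorts by starting MRR rather than treating them equally; the cohort labels and figures are made up.

```python
def column_retention(triangle):
    """triangle: {cohort: [MRR at age 0, age 1, ...]}; rows may have different lengths.

    Returns pooled dollar retention by age, using only cohorts that have reached that age.
    """
    max_age = max(len(row) for row in triangle.values())
    curve = []
    for age in range(max_age):
        rows = [row for row in triangle.values() if len(row) > age]
        curve.append(sum(row[age] for row in rows) / sum(row[0] for row in rows))
    return curve


# Hypothetical cohorts: the oldest has four observations, the youngest two.
triangle = {
    "2024-01": [10000, 9000, 8500, 8200],
    "2024-02": [20000, 18500, 17600],
    "2024-03": [15000, 13800],
}

retention = column_retention(triangle)
for age, value in enumerate(retention):
    print(f"age {age}: {value:.3f}")
```

Because the cells are dollar retention, the resulting curve already blends logo survival with revenue change, so it should not be multiplied by a separate expansion path. The last column rests on a single cohort, which is the late-column noise problem noted above.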
2. Tier 2 — Kaplan-Meier plus a parametric tail
Use Kaplan-Meier (in lifelines or even a careful spreadsheet) for the observed window, then fit a Weibull to extrapolate. This correctly handles censoring and lets the curve continue to your horizon cap. This is the recommended default for any company past Series A — rigorous enough for diligence, cheap enough to maintain quarterly.
3. Tier 3 — Cox regression and Markov chains
The Cox model segments survival by covariates. A Markov-chain model treats the customer as moving between states — new, active, expanded, contracted, churned, reactivated — with transition probabilities estimated from data. The Markov approach is especially good for the "never expand" problem because "stays flat" is just one transition probability among several, and reactivation (win-back) becomes a first-class modeled event rather than an awkward exception.
It naturally produces LTV as the expected discounted reward of the Markov reward process.
4. Tier 4 — Bayesian hierarchical / BG-NBD-style models
The most rigorous tier puts a probability distribution on every parameter. A Bayesian hierarchical model shares information across cohorts: a small, young cohort's churn estimate is "shrunk" toward the all-cohort prior, which stabilizes noisy late-life columns. The output is not a point estimate but a full posterior distribution of LTV — you can report the median and a 90% credible interval directly.
For contractual subscription businesses, hierarchical survival models are the gold standard; for non-contractual / usage-based revenue, the BG/NBD and Gamma-Gamma family (Fader and Hardie's "buy-till-you-die" models) are the established choice.
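The shrinkage idea can be illustrated without a full hierarchical model. The sketch below uses a conjugate Beta-Binomial update, in which the all-cohort prior behaves like a block of pseudo-customers; it is a simplified stand-in for what PyMC or Stan would estimate, and the prior churn, prior strength, and cohort counts are all hypothetical.

```python
def shrunk_churn(churned, at_risk, prior_churn, prior_strength):
    """Posterior mean churn under a Beta prior worth `prior_strength` pseudo-customers."""
    alpha = prior_churn * prior_strength + churned
    beta = (1.0 - prior_churn) * prior_strength + (at_risk - churned)
    return alpha / (alpha + beta)


PRIOR_CHURN = 0.02      # assumed all-cohort late-life monthly churn
PRIOR_STRENGTH = 200    # assumed weight of the prior, in pseudo-customers

small = shrunk_churn(churned=3, at_risk=20, prior_churn=PRIOR_CHURN, prior_strength=PRIOR_STRENGTH)
large = shrunk_churn(churned=150, at_risk=1000, prior_churn=PRIOR_CHURN, prior_strength=PRIOR_STRENGTH)

print(f"small cohort: raw 15.0%, shrunk {small:.1%}")
print(f"large cohort: raw 15.0%, shrunk {large:.1%}")
```

Both cohorts show the same raw 15% churn, but the 20-customer cohort is pulled most of the way back to the prior while the 1,000-customer cohort barely moves.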
| Tier | Method | Handles censoring | Extrapolates | Gives CI | Effort | When to use |
|---|---|---|---|---|---|---|
| 1 | Cohort triangle | Partially | No | No | Low | Pre-A, quick sanity check |
| 2 | Kaplan-Meier + Weibull | Yes | Yes | Crude | Medium | Series A+, default |
| 3 | Cox / Markov chain | Yes | Yes | Yes | Medium-high | Heterogeneous base, win-back matters |
| 4 | Bayesian hierarchical / BG-NBD | Yes | Yes | Yes (posterior) | High | Diligence, IPO-grade reporting |
Cross-link: q424 ("board-ready unit economics dashboard") — show Tier 2 numbers to the board with Tier 3/4 as the back-up appendix.
5. The Markov-chain model in depth — why it suits the "never expand" problem
The question's hardest sub-problem — customers who survive but never expand — is handled most naturally by a Markov-chain model, so it is worth a closer look. In a Markov model, every customer at every month occupies exactly one state, and you estimate the probability of moving between states from historical transitions.
A practical state set for SaaS: Onboarding, Active-Flat, Active-Expanded, Active-Contracted, At-Risk, Churned, and Reactivated. Churned is an absorbing state in the simplest version (once gone, gone) or a transient one if you model win-back. Each state carries a reward — the monthly margin-adjusted revenue earned while in that state.
LTV is then the expected total discounted reward of the process starting from Onboarding, a quantity with a clean closed-form solution from the transition matrix.
Why this fits the non-expander problem: "stays flat forever" is simply a customer who keeps transitioning Active-Flat to Active-Flat with high probability. There is no awkward exception, no blended growth rate that misrepresents them. Expanders are customers with meaningful Active-Flat to Active-Expanded transition probability.
Contractors flow toward Active-Contracted. Each behavior is a transition probability estimated from data, and the model produces a *distribution* of customer journeys rather than forcing one average path.
The Markov approach also makes reactivation a first-class citizen. The Churned to Reactivated transition, however small, can be material for LTV in products with seasonal or project-based usage. The 1/churn formula and even basic Kaplan-Meier cannot represent win-back at all; the Markov model does it for free.
| State | Description | Typical monthly reward | Key outbound transitions |
|---|---|---|---|
| Onboarding | First 1-3 months, pre-activation | Initial ARPA x margin | to Active-Flat, to Churned (cliff) |
| Active-Flat | Stable, no revenue change | Flat ARPA x margin | to Active-Expanded, to At-Risk |
| Active-Expanded | Grew revenue this period | Higher ARPA x margin | stays, to Active-Flat |
| Active-Contracted | Shrank revenue this period | Lower ARPA x margin | to At-Risk, to Active-Flat |
| At-Risk | Usage/health signals declining | Flat ARPA x margin | to Churned, recovers to Active-Flat |
| Churned | No longer paying | 0 | to Reactivated (small), else absorbing |
| Reactivated | Returned after churn | ARPA x margin | to Active-Flat |
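The expected discounted reward can be computed by repeatedly applying v = r + gamma x P x v until it converges. The sketch below uses a reduced three-state chain rather than the full seven-state table, with invented transition probabilities and rewards, and assumes the reward is earned in the current state before the monthly transition happens.

```python
def markov_ltv(transition, reward, annual_discount_rate, start_state, n_iter=5000):
    """Expected discounted reward of a Markov reward process, by fixed-point iteration."""
    gamma = 1.0 / (1.0 + annual_discount_rate / 12.0)
    value = {state: 0.0 for state in transition}
    for _ in range(n_iter):
        value = {
            state: reward[state]
            + gamma * sum(p * value[nxt] for nxt, p in transition[state].items())
            for state in transition
        }
    return value[start_state]


# Hypothetical monthly transition probabilities (each row sums to 1).
TRANSITION = {
    "active_flat":     {"active_flat": 0.95, "active_expanded": 0.02, "churned": 0.03},
    "active_expanded": {"active_expanded": 0.97, "active_flat": 0.02, "churned": 0.01},
    "churned":         {"churned": 1.0},
}
# Hypothetical margin-adjusted monthly reward per state.
REWARD = {"active_flat": 800.0, "active_expanded": 1200.0, "churned": 0.0}

ltv_from_flat = markov_ltv(TRANSITION, REWARD, 0.10, "active_flat")
ltv_from_expanded = markov_ltv(TRANSITION, REWARD, 0.10, "active_expanded")
print(f"LTV starting Active-Flat:     ${ltv_from_flat:,.0f}")
print(f"LTV starting Active-Expanded: ${ltv_from_expanded:,.0f}")
```

A customer who never expands is simply one who keeps taking the 0.95 self-transition. Adding a small `churned` to `reactivated` probability would model win-back without changing the function.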
6. Validating the model — backtesting and holdout
A survival or Markov LTV model is a forecast, and forecasts must be validated, not trusted. The discipline borrowed from forecasting practice:
- Holdout backtest. Fit the model on cohorts through, say, two years ago, then "predict" the retention of the subsequent year and compare to what actually happened. If the model's month-30 survival prediction was 0.58 and reality was 0.51, your model is optimistic — recalibrate before you let it set CAC budgets.
- Vintage stability. Re-estimate the model each quarter and watch whether the parameters drift. Large quarter-over-quarter swings in the Weibull shape parameter mean either your business is changing or your data is noisy; either way, investigate.
- Sensitivity bands. Report LTV under low / base / high assumptions for the two parameters that swing it most (usually discount rate and plateau hazard). If the band is wide, say so; a wide band honestly disclosed is more credible than a narrow one quietly assumed.
VI. The Stack: From Billing Events To LTV
1. Subscription analytics platforms
ChartMogul and Maxio (the merged Chargify + SaaSOptics entity) both produce cohort retention triangles, GRR/NRR, and segment filters out of the box — they are the fastest path to Tier 1 and a clean input to Tier 2. Recurly and Stripe Billing with Stripe Sigma can produce the raw event data.
The critical requirement: you need *event-level* data (when each subscription started, changed MRR, and ended) — not just monthly aggregates — to do survival analysis properly.
2. The analysis layer
| Layer | Tool options | Role in LTV |
|---|---|---|
| Billing source of truth | Stripe Billing, Recurly, Chargebee, Zuora | Raw subscription + MRR-change events |
| Subscription analytics | ChartMogul, Maxio, Baremetrics | Cohort triangles, GRR/NRR, segmentation |
| Warehouse | Snowflake (SNOW), BigQuery, Databricks | Joins billing with product + CRM data |
| Transformation | dbt | Builds the duration + event-observed columns |
| Survival modeling | Python lifelines, R survival, lifetimes (BG/NBD) | Kaplan-Meier, Weibull, Cox; BG/NBD for non-contractual revenue |
| Bayesian | PyMC, Stan | Posterior LTV distributions |
| BI / dashboard | Looker, Tableau, Hex, Mode | Board-facing LTV ranges |
3. The data-quality checklist
Survival models are only as good as the event log. Before modeling, verify:
- Churn date is unambiguous. A downgrade to a free tier is not the same event as a full cancellation — decide and be consistent.
- Reactivations are tracked. A customer who churns and returns three months later is a new survival "spell"; decide whether to model reactivation explicitly (Markov) or treat returns as new cohorts.
- Trials are excluded. Free-trial users who never convert are not paying customers and will wreck early-life churn if included.
- Pauses and migrations are flagged. A customer migrated from monthly to annual billing is not a churn; a billing-system migration that re-creates subscription records is not 1,000 new customers.
- The observation window is recorded. Every customer needs a "last date we have data for" so censoring is computed correctly.
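The last item on the checklist is the one most often handled incorrectly, so here is a minimal sketch of how a subscription spell might be turned into a duration and a censoring flag. The function names and the whole-month convention are assumptions made for illustration; your own definition of tenure may differ.

```python
from datetime import date


def months_between(start, end):
    """Whole months elapsed from start to end (never negative)."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the final month is not yet complete
    return max(months, 0)


def to_survival_row(start, churned_on, observed_through):
    """Return (tenure_months, event_observed) for one customer spell.

    A churn date inside the observation window is an observed event.
    Anything else (still active, or churn recorded after the window)
    is censored at the end of the window, not counted as churn.
    """
    if churned_on is not None and churned_on <= observed_through:
        return months_between(start, churned_on), True
    return months_between(start, observed_through), False


window_end = date(2024, 1, 1)
print(to_survival_row(date(2023, 1, 15), date(2023, 7, 20), window_end))
print(to_survival_row(date(2023, 1, 15), None, window_end))
```

The second call returns a censored row: the customer contributes eleven months of observed survival without being counted as a cancellation.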
Cross-link: q420 ("burn multiple") — the same warehouse and dbt layer that feeds LTV should feed the burn-multiple calculation; build the pipeline once.
4. A reference dbt and modeling pipeline
For teams building this in-house, the pipeline has a predictable shape and it is worth documenting so it survives staff turnover. The billing system emits raw events. A dbt staging layer normalizes them into one row per subscription-month with a status.
A dbt intermediate model collapses that into one row per customer with a tenure_months and an event_observed boolean — the exact two columns Kaplan-Meier consumes. A Python job (scheduled in Airflow, Dagster, or a cloud function) reads that table, fits the survival and Cox models, and writes an ltv_by_segment table back to the warehouse.
The BI tool reads only that final table. This separation matters: the statistics live in version-controlled Python, the SQL stays simple, and the dashboard is a thin presentation layer.
| Pipeline stage | Technology | Output | Refresh cadence |
|---|---|---|---|
| Raw event ingestion | Fivetran / Stripe export | raw_subscription_events | Hourly |
| Staging normalization | dbt staging models | stg_subscription_months | Daily |
| Customer-level shaping | dbt intermediate model | int_customer_survival (tenure + event flag) | Daily |
| Survival/Cox modeling | Python lifelines, scheduled | ltv_by_segment table | Weekly or quarterly |
| Presentation | Looker / Hex / Mode | Board LTV dashboard | On-read |
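To make the modeling stage concrete, the sketch below hand-rolls the Kaplan-Meier estimator over the two columns the intermediate model produces, then converts the curve into a discounted margin LTV. It is an illustration in plain Python, not the pipeline's actual job: in practice a library such as lifelines would do the fitting. The sample rows, margin, and discount rate are hypothetical, and tenure is assumed to be counted in whole paid months (at least 1).

```python
def kaplan_meier(rows, horizon):
    """rows: list of (tenure_months, event_observed).

    Returns [S(0), S(1), ..., S(horizon)], where S(t) is the estimated
    probability that a customer's tenure exceeds t months.
    """
    survival = [1.0]
    s = 1.0
    for t in range(1, horizon + 1):
        at_risk = sum(1 for tenure, _ in rows if tenure >= t)
        churned = sum(1 for tenure, event in rows if tenure == t and event)
        if at_risk > 0:
            s *= 1 - churned / at_risk
        survival.append(s)
    return survival


def discounted_ltv(survival, monthly_margin, annual_discount):
    """Sum of survival x margin per month, discounted monthly.

    Month m (0-indexed) is paid by customers whose tenure exceeds m,
    so it is weighted by S(m); the final S value is not billed.
    """
    monthly_d = (1 + annual_discount) ** (1 / 12) - 1
    return sum(
        s * monthly_margin / (1 + monthly_d) ** m
        for m, s in enumerate(survival[:-1])
    )


# Hypothetical customers: three observed churns, two still active (censored).
rows = [(1, True), (2, True), (3, False), (4, True), (5, False)]
curve = kaplan_meier(rows, horizon=5)
print([round(s, 2) for s in curve])
print(round(discounted_ltv(curve, monthly_margin=100.0, annual_discount=0.10), 2))
```

Note that the two censored customers keep the curve above zero at month 5. Treating them as churned would drive it to zero, which is the bias Kaplan-Meier removes.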
5. The cohort triangle export — getting it right in ChartMogul / Maxio
Even if you never leave Tier 1, the cohort triangle export must be configured deliberately. Three settings determine whether the triangle is usable: (a) revenue retention vs. logo retention — for LTV you want dollar retention; logo retention undercounts the value of expanders; (b) the cohort grain — monthly cohorts give resolution but thin cells, quarterly cohorts give stable cells but blur fast-moving trends; quarterly is the safer default for companies under ~$10M ARR; (c) whether trials and free accounts are excluded — they must be, or the early columns are meaningless.
ChartMogul and Maxio both expose all three settings; the default export is rarely the right one for LTV work.
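The gap between logo retention and dollar retention in setting (a) is easy to see on a toy cohort. The sketch below uses hypothetical account names and MRR figures, and is independent of any particular export format.

```python
# Hypothetical cohort: account -> (MRR at month 0, MRR at month 12).
cohort = {
    "acct_a": (100.0, 0.0),     # churned
    "acct_b": (100.0, 100.0),   # flat
    "acct_c": (100.0, 100.0),   # flat
    "acct_d": (500.0, 900.0),   # expander
}


def logo_retention(cohort):
    """Share of starting accounts still paying at the later date."""
    starting = [a for a, (mrr0, _) in cohort.items() if mrr0 > 0]
    retained = [a for a in starting if cohort[a][1] > 0]
    return len(retained) / len(starting)


def dollar_retention(cohort):
    """Later-date MRR from the cohort divided by its starting MRR."""
    start_mrr = sum(mrr0 for mrr0, _ in cohort.values())
    end_mrr = sum(mrr12 for _, mrr12 in cohort.values())
    return end_mrr / start_mrr


print(f"logo retention:   {logo_retention(cohort):.1%}")
print(f"dollar retention: {dollar_retention(cohort):.1%}")
```

The same cohort reads as 75% on a logo basis and 137.5% on a dollar basis, with the entire difference coming from one expanding account, which is why the choice of setting has to be deliberate.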
BANNER: REPORTING LTV TO A BOARD
VII. Framing The Number So It Survives Diligence
1. Never show a single hero number
The fastest way to lose credibility with a sophisticated board or a diligence team is to present "Our LTV is $48,000" as a fact. It invites the obvious question — "computed how?" — and if the answer is ARPA/churn, the rest of your unit economics gets discounted too. Instead, present:
- A range, e.g. "$19K-$24K depending on cohort and discount rate."
- The method, named: "Kaplan-Meier survival curve, Weibull tail, 10% discount, 48-month cap."
- The segmentation, e.g. enterprise LTV vs. SMB LTV shown separately.
- The confidence interval if you are on Tier 3/4.
2. LTV:CAC and payback — the ratios that matter
LTV is rarely the headline; the LTV:CAC ratio and CAC payback period are. The widely cited benchmarks — popularized by firms like Bessemer Venture Partners, OpenView, and ICONIQ Growth in their SaaS benchmark reports — are roughly:
| Metric | Weak | Acceptable | Strong | Notes |
|---|---|---|---|---|
| LTV:CAC ratio | < 1.5:1 | 3:1 | > 4:1 | Use *gross-margin* LTV, fully-loaded CAC |
| CAC payback (months) | > 24 | 12-18 | < 12 | The cash-flow-honest twin of LTV:CAC |
| GRR | < 80% | 85-90% | > 90% | Enterprise should exceed SMB |
| NRR | < 100% | 105-115% | > 120% | Segment it — see Section IV |
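A 3:1 ratio computed from an inflated 1/churn LTV can shrink sharply once the LTV is rebuilt: if the shortcut overstates LTV by roughly two-thirds, the real ratio is 1.8:1. That is the difference between a fundable growth motion and a leaky bucket, which is precisely why diligence teams rebuild your LTV from raw data.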
3. Pair LTV with payback because LTV ignores time
LTV says nothing about *when* the cash arrives. A customer worth $30K over five years is very different from one worth $30K in eighteen months, especially if you are capital-constrained. CAC payback period — months of gross-margin revenue to recoup acquisition cost — is the cash-flow-honest counterweight.
Always report them together. A long payback with a great LTV:CAC ratio is a financing problem disguised as a good business.
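A short sketch shows why the two metrics have to travel together. The segments and dollar figures below are hypothetical; the payback formula is the one defined above (acquisition cost divided by monthly gross-margin revenue).

```python
def ltv_to_cac(gross_margin_ltv, fully_loaded_cac):
    """LTV:CAC ratio using gross-margin LTV and fully-loaded CAC."""
    return gross_margin_ltv / fully_loaded_cac


def cac_payback_months(fully_loaded_cac, monthly_arpa, gross_margin):
    """Months of gross-margin revenue needed to recoup acquisition cost."""
    return fully_loaded_cac / (monthly_arpa * gross_margin)


# Two hypothetical segments with the same ratio but different cash timing.
segments = {
    "fast": {"ltv": 48_000.0, "cac": 12_000.0, "arpa": 1_000.0, "gm": 0.80},
    "slow": {"ltv": 48_000.0, "cac": 12_000.0, "arpa": 500.0, "gm": 0.80},
}

for name, s in segments.items():
    ratio = ltv_to_cac(s["ltv"], s["cac"])
    payback = cac_payback_months(s["cac"], s["arpa"], s["gm"])
    print(f"{name}: LTV:CAC {ratio:.1f}:1, payback {payback:.0f} months")
```

Both segments report 4:1, but one recoups its acquisition cost in 15 months and the other in 30, which the ratio alone cannot show.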
4. The Bessemer / ICONIQ / OpenView framing
These growth-equity and benchmarking firms have done more than anyone to professionalize SaaS metrics, and their reports are useful precisely because they show *distributions* — top-quartile vs. median vs. bottom-quartile NRR, payback, and growth-efficiency figures. Use them as a mirror: if your segmented LTV implies a payback far outside the benchmark band, either you have found an edge or you have a model error.
Investigate before you celebrate. ICONIQ's "Growth & Efficiency" reports and OpenView's SaaS Benchmarks survey are the most-cited public sources; Bessemer's "State of the Cloud" frames the macro.
Cross-link: q445 — regional AE hiring economics must use the regional LTV:CAC, not the global blend.
5. The five questions a sharp board member will ask
When you present a survival-analysis LTV, expect interrogation from anyone who has sat through SaaS diligence. Prepare answers to these five before the meeting.
"What churn assumption is baked into the tail?" They want to know your plateau hazard and whether your Weibull extrapolation is optimistic. Have the observed-vs-extrapolated split ready: "We have data through month 30 at 0.7% monthly churn; the tail assumes 0.7% holds, not an improvement."
"Is this gross-margin LTV?" If you cannot immediately say "yes, at our 81% blended gross margin, defined identically to the P&L," the number is discounted. Revenue LTV is a vanity metric and they know it.
"What is the payback, not just the ratio?" A 4:1 LTV:CAC with a 30-month payback is a financing problem. Lead with payback for the cash-constrained reality.
"How does this differ by segment?" Never let the blended number stand alone. Have SMB and enterprise LTV ready, and the expander/flat/contractor split. A board member who has seen a bimodal base will not accept a single mean.
"What changed since last quarter and why?" LTV should move slowly. If it jumped, you either found something real or changed an assumption. Be able to attribute the delta to a specific cause — a cohort maturing, a discount-rate change, a data fix.
| Board question | Weak answer | Strong answer |
|---|---|---|
| Tail churn assumption? | "The model handles it" | "0.7%/mo plateau, observed through M30, held flat in tail" |
| Gross-margin LTV? | "It's revenue-based" | "Yes, 81% GM, matches the P&L line" |
| Payback? | "Ratio is 4:1" | "4:1 ratio, 14-month payback, segment range 9-22 months" |
| Segment differences? | "Blended is $22K" | "Enterprise $41K, SMB $11K, here is why" |
| Quarter-over-quarter delta? | "It went up" | "+6%, driven by the 2024-Q2 cohort crossing M24" |
6. LTV in fundraising vs. LTV in operating
The same number serves two audiences with different needs. In a fundraise, LTV is part of a narrative about capital efficiency and durability; investors will rebuild it from your data room, so the operating discipline must already be in place — there is no separate "fundraising LTV." In operating, LTV is a decision tool: it sets CAC ceilings by channel and segment, informs pricing, and flags retention problems.
The mistake is maintaining two versions. Keep one model, one set of assumptions, governed by finance; present the same number to the board, the data room, and the growth team. Divergence between an "operating LTV" and a "fundraising LTV" is exactly the kind of inconsistency diligence is designed to surface.
BANNER: ACCOUNTING, COMPLIANCE, AND LTV
VIII. ASC 606, Bookings vs. Revenue, And LTV Discipline
1. LTV is not a GAAP number — keep it that way
LTV is a management metric, not a financial-statement figure. ASC 606 governs how you *recognize* revenue (ratably over the service period for most SaaS), and your LTV model should be consistent with recognized revenue, not bookings. A customer who signs a three-year prepaid deal generates one large *booking*, but the recognized revenue spreads across 36 months. That recognized stream, multiplied by gross margin and discounted, is the LTV-relevant value.
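The three-year prepaid example can be worked through numerically. The sketch below assumes straight-line (ratable) monthly recognition; the booking size, margin, and discount rate are hypothetical placeholders.

```python
def recognized_stream_value(booking, term_months, gross_margin, annual_discount):
    """Present value of gross margin on revenue recognized ratably over the term."""
    monthly_revenue = booking / term_months
    monthly_d = (1 + annual_discount) ** (1 / 12) - 1
    return sum(
        monthly_revenue * gross_margin / (1 + monthly_d) ** m
        for m in range(term_months)
    )


booking = 36_000.0  # hypothetical three-year prepaid deal
naive = booking * 0.80  # whole booking counted in one period, at margin
modeled = recognized_stream_value(booking, 36, 0.80, 0.10)

print(f"booking-based value:    {naive:,.0f}")
print(f"recognized, discounted: {modeled:,.0f}")
```

The recognized, discounted figure is always below the booking-based one whenever the discount rate is positive, and the gap widens with contract length.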
2. Common LTV-accounting mismatches
| Mistake | Why it inflates/distorts LTV | Fix |
|---|---|---|
| Using bookings instead of recognized revenue | Front-loads multi-year deals into one period | Model the recognized monthly stream |
| Ignoring contraction in deferred revenue | Hides mid-contract downgrades | Reconcile LTV revenue to the deferred-revenue rollforward |
| Counting one-time services revenue in ARPA | Inflates recurring R(t) | LTV uses recurring revenue only |
| Mixing gross and net of channel fees | Overstates margin | Use the same gross-margin definition as the P&L |
3. Consistency with the audited stack
When a company approaches a funding round, an audit, or an IPO, the diligence team will reconcile your LTV inputs against audited financials. If your LTV model's "revenue" cannot be tied back to recognized revenue under ASC 606, the metric — and your credibility — is discounted. The discipline: define R(t) as recurring recognized revenue per account, define GM identically to the P&L's gross-margin line, and keep a documented reconciliation.
Cross-link: q424 ("board-ready unit economics dashboard") — the dashboard should footnote that LTV is a non-GAAP management metric reconciled to recognized revenue.
4. The ten LTV errors that survive into board decks
Even teams that have abandoned 1/churn carry a predictable set of residual errors. A pre-flight checklist before any LTV reaches a board:
- Survivorship bias in the cohort triangle. Averaging only mature cohorts that "made it" while ignoring recent weak ones flatters the curve.
- Censoring treated as churn. Counting the end of the observation window as a cancellation overstates churn — the exact bug Kaplan-Meier exists to fix.
- Revenue, not margin. Reporting LTV on gross revenue overstates by the full cost-of-goods percentage.
- No discount rate. Undiscounted lifetime cash treats month-48 dollars as worth the same as month-1 dollars.
- Infinite horizon. Letting the sum run forever lets a fragile tail extrapolation dominate.
- Blended NRR applied per customer. Folding expansion into one rate misrepresents the non-expanding majority.
- Bookings instead of recognized revenue. Front-loads multi-year deals and breaks the ASC 606 reconciliation.
- One-time services in ARPA. Inflates the recurring R(t) with non-recurring revenue.
- Whale-dominated blends. A few hyperscale accounts drag the mean far from the median; report both.
- Stale descriptive LTV used as predictive. Spending today's CAC against the value of cohorts acquired under an old ICP or old pricing.
| Error | Detection check | Fix |
|---|---|---|
| Survivorship bias | Are recent weak cohorts in the average? | Include all cohorts; weight by recency |
| Censoring as churn | Does churn spike at the data-window edge? | Use Kaplan-Meier |
| Revenue not margin | Does the formula use GM? | Multiply by gross margin |
| No discount rate | Is d documented? | Apply 8-12% annual |
| Infinite horizon | Does the sum cap at 36-60 months? | Cap the horizon |
| Blended NRR per customer | Is expansion modeled per segment? | Split expander/flat/contractor |
| Bookings vs revenue | Does LTV tie to recognized revenue? | Reconcile to ASC 606 |
| Services in ARPA | Is R(t) recurring-only? | Strip one-time revenue |
| Whale domination | Is the median reported next to the mean? | Show distribution |
| Stale predictive use | Was the model fit on current-ICP cohorts? | Re-fit; use early signals |
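Of the detection checks in the table, whale domination is the quickest to automate. The sketch below uses made-up per-account LTV values and a hypothetical 2x threshold to flag a mean that has drifted far from the median.

```python
from statistics import mean, median

# Hypothetical per-account LTV estimates; one hyperscale account at the end.
account_ltv = [8_000.0, 9_000.0, 10_000.0, 11_000.0, 12_000.0, 250_000.0]

avg = mean(account_ltv)
mid = median(account_ltv)

# Hypothetical rule of thumb: flag the blend if the mean is over 2x the median.
whale_dominated = avg > 2 * mid

print(f"mean {avg:,.0f} vs. median {mid:,.0f}; whale-dominated: {whale_dominated}")
```

Here a single account lifts the mean to 50,000 while the typical account is worth about 10,500, so reporting only the mean would misdescribe five of the six customers.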
BANNER: COUNTER-CASE — WHEN THIS ADVICE DOES NOT APPLY
IX. When Full Survival Modeling Is Overkill Or Misleading
Rigorous survival-analysis LTV is the right answer for *most* established SaaS businesses — but not all. Knowing when to *not* do the full model is a sign of judgment, not laziness.
1. Pre-product-market-fit and pre-Series-A
If you have fewer than ~100 paying customers and less than 12 months of history, a Kaplan-Meier curve will be almost entirely censored and a Weibull tail will be fantasy. Do not build a Tier 3/4 model at this stage. A simple cohort triangle plus brutal honesty ("we do not yet know our true LTV; here is the 6-month retention we have observed") is more credible than a precise-looking number built on 40 data points.
Spurious precision is worse than admitted uncertainty.
2. Usage-based / non-contractual revenue
For pure consumption businesses where customers never formally "churn" — they just stop using the product, then maybe come back — the *survival* framing is the wrong tool. There is no clean churn event. Here the BG/NBD + Gamma-Gamma family (the "buy-till-you-die" models) is the correct choice, modeling latent churn and spend separately.
Applying contractual Kaplan-Meier to a non-contractual business produces nonsense.
3. Very long sales cycles with tiny n
A business selling six-figure-plus contracts to a few dozen logos a year does not have the sample size for statistical survival modeling. Here, account-by-account, bottoms-up estimation — renewal probability assessed per named account by the CS team — beats any curve fit. The law of large numbers is not on your side; named-account judgment is.
4. When the business model is changing faster than the data
If you repriced six months ago, changed your ICP, or launched a new product line, your historical survival curve describes a company that no longer exists. Survival models assume the data-generating process is stable. After a major strategic change, weight recent cohorts heavily, flag the discontinuity explicitly, and treat old-cohort LTV as a separate, legacy number. A blended LTV across a strategy change is a blend of two different companies.
5. When LTV is being used to justify, not to decide
The honest counter-case: if leadership has already decided to spend aggressively on acquisition and wants an LTV number to *justify* it, no amount of modeling rigor will help — the model will be tuned until the ratio looks fundable. The fix is governance, not statistics: agree on the discount rate, horizon cap, and margin definition *before* running the model, and have finance, not growth, own the assumptions.
| Situation | Recommended approach | Avoid |
|---|---|---|
| Pre-PMF, < 100 customers | Cohort triangle + stated uncertainty | Weibull extrapolation, hero LTV number |
| Usage-based / non-contractual | BG/NBD + Gamma-Gamma | Contractual Kaplan-Meier |
| Enterprise, few dozen logos/yr | Bottoms-up named-account estimate | Statistical survival curve |
| Post-repricing / ICP change | Recent-cohort weighting, flagged discontinuity | Blended all-cohort LTV |
| LTV used to justify a decision | Pre-agree assumptions; finance owns model | Tuning the model to the desired ratio |
BANNER: IMPLEMENTATION ROADMAP
X. A 90-Day Plan To Replace 1/churn
1. Days 1-30 — instrument and triangulate
- Audit the event log. Confirm subscription start, MRR-change, and end events are captured cleanly. Resolve the data-quality checklist in Section VI.
- Build the cohort triangle (Tier 1). Get an empirical S(t) and a first honest LTV range into the room. This alone usually corrects a wrong board number.
- Define the constants. Get finance to sign off on gross-margin definition, discount rate, and horizon cap. Document them.
2. Days 31-60 — model and segment
- Fit Kaplan-Meier + Weibull (Tier 2). Move from triangle to a curve that extrapolates to the horizon cap.
- Segment. Produce separate curves for SMB vs. enterprise (minimum), and expander vs. flat vs. contractor.
- Fit a Cox model. Identify the two or three covariates that move churn most — these become both LTV inputs and product/CS priorities.
3. Days 61-90 — operationalize
- Wire it into the board dashboard. LTV as a range, segmented, with method named and CAC payback alongside.
- Set a refresh cadence. Quarterly is enough for most; the model is not a real-time metric.
- Decide if you need Tier 3/4. If you are heading into a raise or IPO diligence, build the Markov or Bayesian model as the rigorous back-up. Otherwise Tier 2, well-segmented, is sufficient.
| Phase | Deliverable | Owner | Tier reached |
|---|---|---|---|
| Days 1-30 | Clean event log + cohort triangle + signed-off constants | Data + Finance | Tier 1 |
| Days 31-60 | KM/Weibull curve, segmented; Cox covariate findings | Data / Analytics | Tier 2-3 |
| Days 61-90 | Board dashboard, refresh cadence, optional Bayesian back-up | Finance + Data | Tier 2-4 |
4. The cultural shift that matters most
The technical upgrade from 1/churn to survival analysis is real, but the durable win is cultural: a leadership team that reports LTV as a range, names its method, segments its base, and pairs LTV with payback will make better capital-allocation decisions than one chasing a single hero number.
The model is a means; honest, decision-useful unit economics is the end. The companies that compound — the ones diligence teams pass quickly — are not the ones with the highest LTV slide, but the ones whose LTV number survives being rebuilt from raw data by a skeptical outsider. Build the number you would be comfortable defending line by line, and you will rarely have to defend it at all.
Cross-links recap: q420 (burn multiple), q424 (board unit-economics dashboard), q432 (partner enablement curriculum), q445 (regional AE hiring economics) — LTV is the connective tissue across all of them.
Sources
- Fader, P. & Hardie, B. — "Probability Models for Customer-Base Analysis," Journal of Interactive Marketing.
- Fader, P., Hardie, B. & Lee, K. — "Counting Your Customers the Easy Way: An Alternative to the Pareto/NBD Model" (BG/NBD).
- Fader, P. & Hardie, B. — "The Gamma-Gamma Model of Monetary Value."
- Kaplan, E. & Meier, P. — "Nonparametric Estimation from Incomplete Observations," JASA, 1958.
- Cox, D.R. — "Regression Models and Life-Tables," Journal of the Royal Statistical Society, 1972.
- lifelines Python library documentation — survival regression and Kaplan-Meier.
- lifetimes Python library documentation — BG/NBD and Gamma-Gamma implementations.
- ChartMogul — "SaaS Metrics Guide: Cohort Analysis and Customer Lifetime Value."
- ChartMogul — "Net Revenue Retention and Gross Revenue Retention explained."
- Maxio (Chargify + SaaSOptics) — "SaaS Cohort Analysis and Retention Reporting."
- Baremetrics — "Customer Lifetime Value: the complete guide."
- Bessemer Venture Partners — "State of the Cloud" annual report.
- Bessemer Venture Partners — "The SaaS Mendoza Line / Efficiency Score."
- ICONIQ Growth — "Growth & Efficiency" SaaS benchmark report.
- ICONIQ Growth — "The Topline Growth and Operational Excellence" series.
- OpenView Partners — "SaaS Benchmarks Report" (annual survey).
- KeyBanc Capital Markets — "Private SaaS Company Survey."
- David Skok, "For Entrepreneurs" — "SaaS Metrics 2.0" and "Unit Economics."
- a16z — "16 Startup Metrics" and "The Other 16 Startup Metrics."
- a16z — "16 More Startup Metrics" on LTV pitfalls.
- Snowflake (SNOW) investor relations — disclosed net revenue retention methodology.
- Atlassian (TEAM) shareholder letters — expansion and seat-growth commentary.
- Box (BOX) investor materials — net retention discussion.
- FASB ASC 606 — "Revenue from Contracts with Customers."
- AICPA — Revenue Recognition implementation guidance for software/SaaS.
- Klipfolio — "SaaS Metrics: LTV and CAC."
- Mosaic.tech — "The CFO's guide to SaaS LTV."
- Recurly Research — "State of Subscriptions / Churn benchmarks."
- Stripe — "Billing analytics and Sigma documentation."
- Andrew Gelman et al. — "Bayesian Data Analysis" (hierarchical models reference).
- Klein & Moeschberger — "Survival Analysis: Techniques for Censored and Truncated Data."
- SaaS Capital — "Retention and the Spending Benchmarks for Private SaaS Companies."
- Bain & Company — "The Economics of Customer Loyalty / loyalty effect."
- McKinsey & Company — "Grow fast or die slow: SaaS retention economics."
- Tomasz Tunguz (Theory Ventures / formerly Redpoint) — essays on SaaS retention and LTV modeling.