Implementation of Customer Lifetime Value Prediction
Customer Lifetime Value (LTV) is the total profit expected from a customer over the entire relationship. An accurate LTV estimate makes it possible to set acquisition spend (CAC) appropriately, segment the customer base, and prioritize retention. An ML approach typically gives forecasts that are 30-50% more accurate than simple historical averages.
Two Approaches to LTV Prediction
Contractual Models (SaaS, Subscriptions): the customer is either active or gone, and the departure is observed. The task breaks down into:
- Churn prediction: probability of leaving in each period
- Revenue prediction: payment size given the customer is active
- Combination: LTV = Σ_t P(alive at t) × Expected_Revenue(t) × Discount_Factor(t)
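The sum above can be sketched in a few lines. This is a minimal illustration, not a production implementation. The function name `contractual_ltv`, the monthly period length, and the 10% annual discount rate are assumptions made for the example. Churn probabilities are read as "probability of leaving during period t, given the customer was active at its start".

```python
def contractual_ltv(churn_probs, expected_revenue, annual_discount_rate=0.10):
    """Sum of P(alive at t) * revenue(t) * discount(t) over monthly periods.

    churn_probs[t]      -- P(customer leaves during period t | active at its start)
    expected_revenue[t] -- expected payment in period t if the customer is active
    """
    # Assumption: periods are months, so the annual rate is converted to monthly.
    monthly_rate = (1 + annual_discount_rate) ** (1 / 12) - 1
    survival = 1.0  # assumption: the customer is active and pays in period 0
    ltv = 0.0
    for t, (churn, revenue) in enumerate(zip(churn_probs, expected_revenue)):
        discount = 1 / (1 + monthly_rate) ** t
        ltv += survival * revenue * discount
        survival *= 1 - churn  # probability of still being active next period
    return ltv


# Hypothetical subscriber: 5% monthly churn, 100 per month, 12-month horizon.
example_ltv = contractual_ltv([0.05] * 12, [100.0] * 12)
```

In practice the two input lists would come from the churn and revenue models, one value per future period.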
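Non-Contractual Models (E-commerce, Retail): the customer doesn't announce departure, so churn has to be inferred from gaps in purchasing. The classic approach is the BG/NBD model (Beta-Geometric/Negative Binomial Distribution), usually paired with a gamma-gamma model:
- Frequency model: the number of transactions while the customer is active follows an NBD (Poisson purchases with gamma-distributed rates)
- Dropout model: the probability of customer "death" after each transaction follows a Beta-Geometric distribution
- Monetary value model: gamma-gamma model for average purchase value
The Python library lifetimes implements BG/NBD and gamma-gamma out of the box. Required data per customer: customer_id, frequency, recency, T (customer age), monetary_value.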
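To make the input columns concrete, here is a sketch that derives them from a raw transaction log using only the standard library. It assumes the usual BG/NBD conventions: frequency counts repeat purchase days (the first purchase is excluded), recency is the customer's age at the last purchase, T is the age at the end of the observation window, and monetary_value averages repeat-purchase days only. The function name `rfm_summary` and the sample data are hypothetical. The lifetimes library ships its own helper for this step (`summary_data_from_transaction_data`), which would normally be used instead.

```python
from collections import defaultdict
from datetime import date


def rfm_summary(transactions, observation_end):
    """Build per-customer frequency, recency, T and monetary_value (in days).

    transactions -- iterable of (customer_id, purchase_date, amount)
    """
    # Collapse several purchases on the same day into one daily total.
    daily = defaultdict(lambda: defaultdict(float))
    for customer_id, day, amount in transactions:
        if day <= observation_end:
            daily[customer_id][day] += amount

    summary = {}
    for customer_id, by_day in daily.items():
        days = sorted(by_day)
        first, last = days[0], days[-1]
        repeat_days = days[1:]  # the first purchase day is not a "repeat"
        frequency = len(repeat_days)
        monetary = (
            sum(by_day[d] for d in repeat_days) / frequency if frequency else 0.0
        )
        summary[customer_id] = {
            "frequency": frequency,
            "recency": (last - first).days,
            "T": (observation_end - first).days,
            "monetary_value": monetary,
        }
    return summary


# Hypothetical transaction log.
log = [
    ("a", date(2024, 1, 1), 50.0),
    ("a", date(2024, 1, 11), 30.0),
    ("a", date(2024, 1, 11), 10.0),
    ("a", date(2024, 1, 31), 20.0),
    ("b", date(2024, 2, 1), 99.0),
]
summary = rfm_summary(log, observation_end=date(2024, 3, 1))
```

Customer "b" bought only once, so frequency, recency and monetary_value are all zero. Such customers still inform the dropout model but are normally excluded when fitting gamma-gamma.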
ML Approach: Direct Prediction
An alternative to probabilistic models is to predict 12-month LTV directly with regression:
Features:
- RFM in first 30/60/90 days after onboarding
- Acquisition channel (paid search, organic, referral)
- Cohort characteristics (acquisition season)
- Behavioral: feature usage, session depth
- Segment: B2B vs. B2C, geography, company size
Algorithm: LightGBM Regressor with quantile loss to estimate uncertainty. Metric: MAPE on a holdout cohort (customers onboarded 12+ months ago, so their actual 12-month LTV is known).
Typical accuracy: MAPE of 25-40% for a 12-month forecast, which is sufficient for segmentation but not for precise CAC calculation.
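For reference, the two quantities mentioned above can be written out in plain Python. This is only a sketch of the definitions, not of LightGBM itself. The quantile (pinball) loss is what a quantile regressor minimizes for a chosen quantile, and MAPE is the holdout metric. Skipping customers with zero actual LTV in MAPE is an assumption made here to avoid division by zero. The function names and sample numbers are illustrative.

```python
def pinball_loss(actual, predicted, quantile):
    """Mean quantile loss: under-prediction costs `quantile` per unit,
    over-prediction costs `1 - quantile` per unit."""
    total = 0.0
    for y, q in zip(actual, predicted):
        diff = y - q
        total += quantile * diff if diff >= 0 else (quantile - 1) * diff
    return total / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumption: rows with actual == 0 are skipped, since the percentage
    error is undefined for them.
    """
    pairs = [(y, p) for y, p in zip(actual, predicted) if y != 0]
    return 100 * sum(abs(y - p) / abs(y) for y, p in pairs) / len(pairs)


# Hypothetical holdout cohort: actual vs. predicted 12-month LTV.
actual_ltv = [100.0, 200.0]
predicted_ltv = [80.0, 250.0]
holdout_mape = mape(actual_ltv, predicted_ltv)             # 22.5
p90_loss = pinball_loss(actual_ltv, predicted_ltv, 0.9)    # about 11.5
```

Training one model per quantile (for example 0.1, 0.5, 0.9) gives a prediction interval per customer rather than a single point estimate.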
Early LTV Predictor
A valuable refinement: predict LTV within the first 7-30 days after registration, when data is still sparse:
Early Life Signals:
- Onboarding completion rate
- Number of key actions in first week (product activation)
- NPS score from first survey
- Usage depth: number of modules/features
A Random Forest on these features can classify "whales" (high-LTV customers) with 60-70% precision as early as 7 days after registration. This lets Customer Success focus on the right customers from day one.
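How such a precision figure might be measured is sketched below. The definition of a "whale" as the top 10% of customers by actual LTV, the rule of flagging the same number of customers by model score, and the function name `whale_precision` are all assumptions for illustration. The scores would come from the early classifier.

```python
def whale_precision(actual_ltv, whale_scores, top_share=0.10):
    """Precision of the top-scored customers against the true top-LTV group.

    actual_ltv   -- realized LTV per customer
    whale_scores -- model score per customer from early-life signals
    top_share    -- assumed share of the base treated as whales
    """
    n = len(actual_ltv)
    k = max(1, round(n * top_share))
    by_actual = sorted(range(n), key=lambda i: actual_ltv[i], reverse=True)
    by_score = sorted(range(n), key=lambda i: whale_scores[i], reverse=True)
    true_whales = set(by_actual[:k])
    flagged = by_score[:k]  # customers Customer Success would be sent to
    hits = sum(1 for i in flagged if i in true_whales)
    return hits / k


# Hypothetical data: 10 customers, top 20% counted as whales.
ltv = [10, 500, 30, 20, 900, 40, 15, 25, 35, 45]
scores = [0.1, 0.8, 0.2, 0.9, 0.95, 0.3, 0.1, 0.2, 0.3, 0.4]
precision = whale_precision(ltv, scores, top_share=0.2)  # 0.5
```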
LTV-Based Segmentation
The forecast LTV is then used to segment the customer base:
| Segment | LTV Percentile | Strategy |
|---|---|---|
| Champions | > 90th | VIP support, referral programs |
| High Potential | 70-90th | Active CS, upsell |
| Core | 30-70th | Automated nurturing |
| At Risk | < 30th | Watchlist, fit check |
Segments are reviewed quarterly or after a significant change in behavior.
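The table translates into a simple percentile-rank rule. The sketch below assumes a rank-based percentile (position in the sorted base) and assigns the exact 30th, 70th and 90th percentile boundaries to the Core and High Potential bands. The table does not specify boundary handling, so that choice is an assumption, as is the function name.

```python
def assign_segments(predicted_ltv):
    """Map each customer to a segment by percentile rank of predicted LTV.

    predicted_ltv -- dict of customer_id -> predicted LTV
    """
    ranked = sorted(predicted_ltv, key=predicted_ltv.get)  # lowest LTV first
    n = len(ranked)
    segments = {}
    for position, customer_id in enumerate(ranked):
        percentile = 100 * (position + 1) / n
        if percentile > 90:
            segments[customer_id] = "Champions"
        elif percentile >= 70:
            segments[customer_id] = "High Potential"
        elif percentile >= 30:
            segments[customer_id] = "Core"
        else:
            segments[customer_id] = "At Risk"
    return segments


# Hypothetical base of 10 customers with predicted LTV 10, 20, ..., 100.
base = {f"c{i}": 10.0 * i for i in range(1, 11)}
segments = assign_segments(base)
```

Re-running the assignment on refreshed predictions is one way to implement the periodic review.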
Marketing Spend Optimization
The main application of the LTV model is CAC optimization:
Bidding in Paid Channels:
- Google Ads and Meta Ads support passing predicted LTV as the conversion value
- Value-based bidding (for example Google's Smart Bidding) then optimizes for predicted LTV rather than short-term ROAS
- Result: budget shifts to channels with a better LTV/CAC ratio
Cohort Analysis: LTV by acquisition cohort (month × channel × campaign) shows which campaigns attract customers with real value, not just customers who are cheap to acquire.
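A cohort report of this kind reduces to grouping predicted LTV and spend by the cohort key. The sketch below uses (month, channel) tuples as keys. The data, field names and function name are hypothetical, and cohorts with zero spend get no ratio rather than a division error.

```python
from collections import defaultdict


def ltv_to_cac_by_cohort(customers, spend):
    """Average predicted LTV, CAC and their ratio per acquisition cohort.

    customers -- iterable of (cohort_key, predicted_ltv), one row per customer
    spend     -- dict of cohort_key -> total acquisition spend
    """
    totals = defaultdict(lambda: [0.0, 0])  # cohort -> [ltv_sum, customer_count]
    for cohort, ltv in customers:
        totals[cohort][0] += ltv
        totals[cohort][1] += 1

    report = {}
    for cohort, (ltv_sum, count) in totals.items():
        avg_ltv = ltv_sum / count
        cac = spend.get(cohort, 0.0) / count
        report[cohort] = {
            "avg_ltv": avg_ltv,
            "cac": cac,
            # No ratio for zero-spend cohorts such as organic traffic.
            "ltv_to_cac": avg_ltv / cac if cac else None,
        }
    return report


# Hypothetical January cohorts.
paid = ("2024-01", "paid_search")
referral = ("2024-01", "referral")
rows = [(paid, 300.0), (paid, 500.0), (referral, 900.0), (referral, 700.0), (referral, 800.0)]
report = ltv_to_cac_by_cohort(rows, {paid: 400.0, referral: 300.0})
```

In this made-up example paid search has a ratio of 2.0 and referral 8.0, which is the kind of gap that would justify shifting budget.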
Model Monitoring
LTV is a long-term forecast, so it is difficult to validate quickly. Approaches:
- Shortened Horizon Validation: train on a cohort with 24 months of history, predict 12-month LTV, and compare with actuals once 12 months have passed
- Relative Ranking Accuracy: absolute accuracy matters less than ordering customers correctly by LTV
- Early vs. Final LTV Correlation: how strongly 7-day LTV correlates with actual 12-month LTV
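Ranking accuracy and the early-vs-final check can both be measured with a rank correlation. The sketch below implements Spearman's coefficient with the standard library. Choosing Spearman rather than another ranking metric is an assumption, and the sample values are made up.

```python
def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks.

    Assumption: neither input is constant (otherwise the variance is zero).
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical customers: 7-day LTV vs. actual 12-month LTV.
early_ltv = [5.0, 20.0, 12.0, 3.0, 40.0]
final_ltv = [100.0, 400.0, 150.0, 120.0, 900.0]
rank_corr = spearman(early_ltv, final_ltv)  # about 0.9
```

The same function applies to predicted vs. actual 12-month LTV, and tracking the value per cohort over time is a cheap early warning that the model's ordering is degrading.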
Timeline: a BG/NBD + gamma-gamma model built with lifetimes takes 2-3 weeks. An ML system with an early predictor, monitoring, and CRM and paid-channel integration takes 10-14 weeks.







