Implementing AI-Personalized Push Notifications in Mobile Application
A news app sends the same "Top News of the Day" notification to all 500K users at 8:00 AM; CTR is 2.1%. After segmenting by interests, the same content with personalized headlines yields a 7–12% CTR. AI personalization means working with user behavior data, not a "smart algorithm" operating in a vacuum.
Data as Personalization Foundation
Without behavior data there is no personalization. The minimal event set for model training:
- notification_received: the notification was displayed
- notification_opened: tap on the notification
- notification_dismissed: swiped away without opening
- content_viewed: viewing specific content in the app
- content_shared, content_saved, content_liked
Each event is logged with a feature set: content category, time of day, day of week, device type, OS version, headline length.
Storage: ClickHouse or BigQuery, both optimized for analytical columnar queries. PostgreSQL becomes unsuitable at >10M events/day.
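A sketch of how one event record might be assembled before it lands in the analytical store; the build_event helper and its field names are illustrative assumptions, not a fixed schema:

```python
from datetime import datetime, timezone

ALLOWED_EVENTS = {
    "notification_received", "notification_opened", "notification_dismissed",
    "content_viewed", "content_shared", "content_saved", "content_liked",
}

def build_event(event_type: str, user_id: str, content_id: str, **features) -> dict:
    """Assemble one behavior event for the analytics pipeline."""
    if event_type not in ALLOWED_EVENTS:
        raise ValueError(f"unknown event type: {event_type}")
    return {
        "event_type": event_type,
        "user_id": user_id,
        "content_id": content_id,
        "ts": datetime.now(timezone.utc).isoformat(),
        # contextual features logged with every event
        "category": features.get("category"),
        "hour_of_day": features.get("hour_of_day"),
        "day_of_week": features.get("day_of_week"),
        "device_type": features.get("device_type"),
        "os_version": features.get("os_version"),
        "headline_length": features.get("headline_length"),
    }
```

Keeping absent features as explicit nulls (rather than omitting keys) makes the rows uniform for columnar storage.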
Personalization Models: Simple to Complex
Level 1: Collaborative filtering. "Users like you clicked this." Implemented via matrix factorization (the Surprise library in Python, or implicit for implicit feedback). Trained once daily on the last 30 days of data.
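The core idea of matrix factorization can be sketched with plain NumPy SGD; this toy factorize function is a stand-in for Surprise/implicit (whose real APIs differ), and its hyperparameters are illustrative:

```python
import numpy as np

def factorize(R: np.ndarray, k: int = 2, steps: int = 500,
              lr: float = 0.05, reg: float = 0.01, seed: int = 0):
    """Toy matrix factorization by SGD: R ~ U @ V.T on observed cells only."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = rng.normal(scale=0.1, size=(n_users, k))
    V = rng.normal(scale=0.1, size=(n_items, k))
    observed = [(u, i) for u in range(n_users)
                for i in range(n_items) if R[u, i] > 0]
    for _ in range(steps):
        for u, i in observed:
            err = R[u, i] - U[u] @ V[i]
            U[u] += lr * (err * V[i] - reg * U[u])
            V[i] += lr * (err * U[u] - reg * V[i])
    return U, V

# click matrix: rows = users, cols = content items (0 = no interaction)
R = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
U, V = factorize(R)
scores = U @ V.T  # predicted affinity, including never-seen pairs
```

The predicted scores for unobserved (user, item) cells are what feed the recommendations.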
Level 2: Content-based filtering. Analyze the content a user has read: extract keywords and categories via TF-IDF or sentence embeddings (all-MiniLM-L6-v2 from HuggingFace via transformers on an inference server). For new content, compute cosine similarity against the user's history.
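The TF-IDF-plus-cosine step fits in pure Python; this tiny tfidf_vectors implementation is a simplified stand-in for sklearn or sentence embeddings, with toy headlines as data:

```python
import math
from collections import Counter

def tfidf_vectors(docs: list[str]) -> list[dict]:
    """Minimal TF-IDF: term frequency times inverse document frequency."""
    tokenized = [d.lower().split() for d in docs]
    n = len(docs)
    df = Counter(t for doc in tokenized for t in set(doc))
    return [{t: (c / len(doc)) * math.log(n / df[t])
             for t, c in Counter(doc).items()}
            for doc in tokenized]

def cosine(a: dict, b: dict) -> float:
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

history = ["championship final goal score", "transfer rumors goal striker"]
candidates = ["striker scores goal in final", "parliament passes budget bill"]
vecs = tfidf_vectors(history + candidates)

# user profile = average of the history vectors
profile: dict = {}
for v in vecs[: len(history)]:
    for t, w in v.items():
        profile[t] = profile.get(t, 0.0) + w / len(history)

sims = [cosine(profile, v) for v in vecs[len(history):]]
```

The sports candidate scores closer to this reader's profile than the politics one; with embeddings, the vectors change but the cosine step stays the same.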
Level 3: CTR prediction. A binary "click / no click" for each (user, content) pair. Model: LightGBM or XGBoost on tabular features, plus CatBoost for heavy categorical ones (category, day of week). Inference is fast: tens of milliseconds.
In practice: start with level 1 (quick to deploy, interpretable) and move to level 3 as data accumulates (a minimum of 50–100K events for stable training).
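Level 3 scoring can be illustrated with a deliberately simple stand-in: hand-set weights in place of a trained LightGBM/XGBoost model, and assumed feature names:

```python
import math

# Illustrative weights; in production these come from a trained
# gradient-boosting model, not a hand-tuned linear formula.
WEIGHTS = {
    "bias": -3.0,
    "user_avg_open_rate": 8.0,    # strongest signal: past open behavior
    "category_match": 1.2,        # content category is in the user's top prefs
    "is_morning": 0.4,            # send falls in the user's active window
    "headline_length_norm": -0.3, # normalized length; longer clicks slightly worse
}

def predict_ctr(features: dict) -> float:
    """Click probability for one (user, content) pair: sigmoid of a weighted sum."""
    z = WEIGHTS["bias"] + sum(w * features.get(name, 0.0)
                              for name, w in WEIGHTS.items() if name != "bias")
    return 1.0 / (1.0 + math.exp(-z))
```

Whatever the model, the serving contract is the same: a feature dict in, a probability out, compared against the suppression threshold.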
Personalizing Notification Text
One news item, different headlines for different segments. Not LLM generation on every send (too slow and expensive at scale). The approach:
- Editor creates 3–5 headline variants for one piece
- Multi-armed bandit (Thompson Sampling) picks variant per user based on their prior CTR with similar headlines
- After 24 hours, analyze the results and identify the winner
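Thompson Sampling over headline variants fits in a few lines; the HeadlineBandit class and the simulated CTRs below are illustrative:

```python
import random

class HeadlineBandit:
    """Thompson Sampling: each variant keeps a Beta(clicks+1, skips+1)
    posterior over its CTR; we sample from each and send the argmax."""
    def __init__(self, variants):
        self.stats = {v: [1, 1] for v in variants}  # [alpha, beta]

    def pick(self) -> str:
        return max(self.stats, key=lambda v: random.betavariate(*self.stats[v]))

    def update(self, variant: str, clicked: bool):
        self.stats[variant][0 if clicked else 1] += 1

random.seed(42)
bandit = HeadlineBandit(["neutral", "question", "statistic"])
# simulate sends: "question" has true CTR 0.12, the others 0.04
true_ctr = {"neutral": 0.04, "question": 0.12, "statistic": 0.04}
for _ in range(2000):
    v = bandit.pick()
    bandit.update(v, random.random() < true_ctr[v])
```

After a couple thousand sends, the bandit routes the bulk of traffic to the winning variant while still occasionally exploring the others.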
Auto-generating variants: an LLM (GPT-4o or Claude via API) creates 5 headline variants in different styles (neutral, clickbait, question, statistic, quote). The editor chooses from the proposals instead of writing from scratch.
Serving Layer: How It Works at Send
On every notification send (event-triggered or scheduled), the personalization service:
- Gets the target user list
- For each user, requests a recommendation score from the feature store (Redis with pre-computed vectors)
- If the score is below a threshold, skips this user (suppression)
- If above, picks a personalized text variant
- Logs the decision for future training
Feature store: Redis hashes of the form user:{id}:features → {category_prefs: "...", avg_open_rate: 0.08, ...}. Updated nightly, plus incrementally on significant events.
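A sketch of the serving decision, with a plain dict standing in for the Redis feature store and a deliberately simplified scoring rule; the names and the threshold value are assumptions:

```python
# In production this dict is Redis (HGETALL user:{id}:features).
FEATURE_STORE = {
    "user:1:features": {"category_prefs": "sport,tech", "avg_open_rate": 0.09},
    "user:2:features": {"category_prefs": "politics", "avg_open_rate": 0.01},
}

SCORE_THRESHOLD = 0.03  # tuned empirically via A/B testing

def serve(user_ids, content_category, variants):
    """Per user: suppress, or choose a headline variant to send."""
    decisions = []
    for uid in user_ids:
        feats = FEATURE_STORE.get(f"user:{uid}:features")
        if feats is None:
            decisions.append((uid, "suppress"))  # no data: stay conservative
            continue
        # toy score: base open rate, doubled on a category-preference match
        score = feats["avg_open_rate"]
        if content_category in feats["category_prefs"].split(","):
            score *= 2
        if score < SCORE_THRESHOLD:
            decisions.append((uid, "suppress"))
        else:
            # in production the bandit / CTR model picks the variant per user
            decisions.append((uid, variants[0]))
        # production code would also log (uid, score, decision) for training
    return decisions
```

Users with no features are suppressed rather than sent a generic blast, which keeps the opt-out metric honest.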
Suppression is a key tool. Better not to send at all than to send something irrelevant and get an unsubscribe. The threshold is determined empirically (A/B test).
A/B Testing and Metrics
A mandatory A/B test before global rollout: 10% of users get personalized notifications, 90% get the standard ones. Metrics after 2 weeks:
- CTR: the primary metric
- Notification opt-out rate: did the unsubscribe percentage decrease
- Session starts per notification: sessions generated per send
- Revenue per notification: for e-commerce apps
Firebase A/B Testing plus Remote Config covers the basics. For advanced statistical analysis, use your own framework or Statsig/Eppo.
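As a sanity check outside Firebase, a two-proportion z-test on CTR needs only the standard library; the counts below reuse the article's 2.1% baseline and are otherwise illustrative:

```python
import math

def two_proportion_z(clicks_a, sends_a, clicks_b, sends_b):
    """z-test for the CTR difference between control (a) and treatment (b)."""
    p_a, p_b = clicks_a / sends_a, clicks_b / sends_b
    p = (clicks_a + clicks_b) / (sends_a + sends_b)  # pooled rate
    se = math.sqrt(p * (1 - p) * (1 / sends_a + 1 / sends_b))
    z = (p_b - p_a) / se
    # two-sided p-value from the normal CDF
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# control: 2.1% CTR on 90K sends; personalized: 2.6% on 10K sends
z, p = two_proportion_z(1890, 90_000, 260, 10_000)
```

With these sample sizes even a half-point CTR lift is clearly significant; with much smaller groups it would not be, which is exactly what the test guards against.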
Mobile Client: What Changes
On the client side, nothing changes. Pushes arrive via standard FCM and are handled normally. All personalization logic lives server-side; the client just sends behavior events.
If payload encryption is needed, a UNNotificationServiceExtension on iOS decrypts it before display.
Implementation Stages
- Audit existing notification system and event logging
- Set up analytical storage (ClickHouse / BigQuery)
- Develop event pipeline (mobile SDK → server → storage)
- Train first model (collaborative filtering), A/B test
- Feature store and serving layer
- Iterate on more complex models per test results
Timeline: minimal personalization (segmentation + bandit) takes 4–6 weeks. A full ML pipeline takes 12–16 weeks and requires a data engineer plus an ML engineer.