Notification System Architecture: Channels, Fan-Out, and Delivery at Scale
Every modern application needs to tell users about things: a new message, a completed deployment, a failed payment. What starts as a single email sender quickly becomes one of the most complex subsystems in your architecture. This guide covers how to design a notification system that scales across channels, respects user preferences, and tracks every notification through delivery.
Push vs Pull#
The first architectural decision is how notifications reach the consumer.
Push Model#
The server sends notifications to clients the moment events occur. Examples: WebSocket messages, mobile push notifications, email.
Pros: Low latency, real-time user experience. Cons: Requires persistent connections or third-party push infrastructure. Must handle offline users and retry delivery.
Pull Model#
The client periodically polls the server for new notifications. Examples: badge count APIs, notification inbox endpoints.
Pros: Simple to implement, no persistent connection needed. Cons: Higher latency (bounded by poll interval), wasted requests when there is nothing new, harder to scale with short intervals.
Hybrid (Most Common)#
Push a lightweight signal (WebSocket ping or silent push) telling the client to pull the full notification payload. This gives real-time awareness without pushing large payloads over persistent channels.
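As a rough illustration of the hybrid model, the sketch below keeps full payloads in a server-side inbox and pushes only a small signal that tells the client to pull. The names `NotificationInbox`, `Client`, and the `notifications.changed` signal type are hypothetical, and the transport (WebSocket or silent push) is replaced by a direct method call.

```python
from dataclasses import dataclass, field


@dataclass
class NotificationInbox:
    """Hypothetical server-side store that holds full payloads until pulled."""
    items: list = field(default_factory=list)

    def add(self, user_id, payload):
        """Store the payload and return only a lightweight signal to push."""
        item = {"id": len(self.items) + 1, "user_id": user_id, "payload": payload}
        self.items.append(item)
        # The signal carries no content, just enough for the client to know to pull.
        return {"type": "notifications.changed", "user_id": user_id, "latest_id": item["id"]}

    def fetch_since(self, user_id, last_seen_id):
        """Pull endpoint: everything for this user newer than the client's cursor."""
        return [i for i in self.items if i["user_id"] == user_id and i["id"] > last_seen_id]


class Client:
    """Hypothetical client that pulls only when a signal reports something new."""

    def __init__(self, user_id, inbox):
        self.user_id = user_id
        self.inbox = inbox
        self.last_seen_id = 0
        self.received = []

    def on_signal(self, signal):
        if signal["latest_id"] <= self.last_seen_id:
            return  # stale or duplicate signal, nothing to fetch
        for item in self.inbox.fetch_since(self.user_id, self.last_seen_id):
            self.received.append(item["payload"])
            self.last_seen_id = item["id"]
```

Because the client tracks a cursor, a dropped or repeated signal is harmless: the next signal (or a fallback poll) fetches everything that was missed.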
Notification Channels#
A mature system supports multiple delivery channels, each with different latency, cost, and reliability characteristics.
In-App#
- Delivered via WebSocket or pulled from an API.
- Stored in a notification inbox (database table or document).
- Cheapest channel — no third-party cost.
- User must have the app open to see real-time updates.
Email#
- Highest reach: nearly every user has an email address.
- High latency (seconds to minutes depending on provider).
- Rich formatting (HTML templates, images, CTAs).
- Deliverability is a discipline unto itself: SPF, DKIM, DMARC, warm-up, bounce handling.
- Tools: Amazon SES, SendGrid, Postmark.
SMS#
- High open rates (often cited at around 98%) but expensive per message.
- Regulatory requirements vary by country (opt-in, STOP handling, quiet hours).
- Best for time-critical alerts: 2FA codes, fraud alerts, outage notifications.
- Tools: Twilio, Vonage, MessageBird.
Mobile Push#
- Delivered through platform-specific services: Firebase Cloud Messaging (FCM) for Android, Apple Push Notification service (APNs) for iOS.
- Silent push can wake the app to fetch data without showing a visible notification.
- Payload size limits (~4 KB) — keep the push lean and let the app fetch details.
- Token management: device tokens expire and rotate; maintain a registry.
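The token registry mentioned above could look something like the following in-memory sketch. `DeviceTokenRegistry`, its method names, and the 60-day staleness window are assumptions made for illustration; a real registry would live in a database and be updated from FCM and APNs feedback.

```python
import time


class DeviceTokenRegistry:
    """Hypothetical in-memory registry of push tokens per user."""

    def __init__(self, max_age_seconds=60 * 24 * 60 * 60):  # assumed 60-day staleness window
        self.max_age_seconds = max_age_seconds
        self._tokens = {}  # token -> (user_id, platform, last_seen_timestamp)

    def register(self, user_id, platform, token, now=None):
        """Record a token, or refresh its last-seen time if already known."""
        self._tokens[token] = (user_id, platform, time.time() if now is None else now)

    def rotate(self, old_token, new_token, now=None):
        """Replace a token the platform has rotated, keeping the same owner."""
        entry = self._tokens.pop(old_token, None)
        if entry is not None:
            user_id, platform, _ = entry
            self.register(user_id, platform, new_token, now)

    def invalidate(self, token):
        """Drop a token the push provider reported as unregistered."""
        self._tokens.pop(token, None)

    def tokens_for(self, user_id, now=None):
        """Return the user's tokens that were seen within the staleness window."""
        now = time.time() if now is None else now
        return sorted(
            token
            for token, (owner, _platform, seen) in self._tokens.items()
            if owner == user_id and now - seen <= self.max_age_seconds
        )
```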
WebSocket#
- Ideal for real-time in-app experiences: chat messages, live dashboards, collaborative editing.
- Requires connection management, heartbeats, and reconnection logic.
- Scale with a pub/sub backbone (Redis Pub/Sub, NATS) so any server can push to any connected client.
Priority and Throttling#
Not all notifications are equal. A fraud alert must arrive instantly; a weekly digest can wait.
Priority Levels#
| Priority | Examples | Delivery Target |
|---|---|---|
| Critical | Security alerts, 2FA, payment failures | Immediate, multi-channel |
| High | Direct messages, mentions | Seconds, push + in-app |
| Medium | Comments, reactions | Minutes, batched |
| Low | Marketing, digests | Hours, email only |
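One way to make the table above executable is a small policy lookup. The concrete channel lists and delay values below are assumptions chosen to match the table loosely (for example, the table does not say which channel medium-priority batches use, so in-app is assumed); tune them to your own product.

```python
from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3


@dataclass(frozen=True)
class DeliveryPolicy:
    channels: tuple          # channels to attempt, in order
    max_delay_seconds: int   # assumed delivery target, not a guarantee
    batched: bool            # whether notifications may be aggregated


# Hypothetical values that mirror the priority table.
POLICIES = {
    Priority.CRITICAL: DeliveryPolicy(("push", "sms", "email", "in_app"), 0, False),
    Priority.HIGH: DeliveryPolicy(("push", "in_app"), 10, False),
    Priority.MEDIUM: DeliveryPolicy(("in_app",), 5 * 60, True),
    Priority.LOW: DeliveryPolicy(("email",), 6 * 60 * 60, True),
}


def policy_for(priority):
    """Look up how a notification of this priority should be delivered."""
    return POLICIES[priority]
```

Using an `IntEnum` keeps priorities comparable, which is convenient later when shedding low-priority traffic under load.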
Throttling Strategies#
- Rate limiting per user — cap notifications per channel per time window (e.g., max 5 push notifications per hour).
- Batching — aggregate low-priority notifications into a single digest. "You have 12 new comments" beats 12 separate pushes.
- De-duplication — if the same event triggers multiple notifications (e.g., someone edits a comment you were already notified about), collapse them.
- Quiet hours: respect user-configured or locale-based do-not-disturb windows. Queue notifications and deliver them when the quiet window ends.
- Back-pressure — if the notification pipeline is overloaded, shed low-priority traffic first.
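Two of these strategies, per-user rate limiting and batching, are sketched below. This is a minimal in-memory version assuming a sliding window keyed by user and channel; `PerUserRateLimiter` and `build_digest` are hypothetical names, and a production system would more likely keep the counters in a shared store such as Redis.

```python
from collections import Counter, defaultdict, deque


class PerUserRateLimiter:
    """Sliding-window limiter: at most `limit` sends per user and channel per window."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        self._sent = defaultdict(deque)  # (user_id, channel) -> send timestamps

    def allow(self, user_id, channel, now):
        stamps = self._sent[(user_id, channel)]
        # Forget sends that have fallen out of the window.
        while stamps and now - stamps[0] >= self.window_seconds:
            stamps.popleft()
        if len(stamps) >= self.limit:
            return False
        stamps.append(now)
        return True


def build_digest(pending_kinds):
    """Collapse pending low-priority events into one summary line per kind."""
    counts = Counter(pending_kinds)
    return [
        f"You have {n} new {kind}{'s' if n != 1 else ''}"
        for kind, n in sorted(counts.items())
    ]
```

A caller might route anything rejected by `allow` into a pending list and later hand that list to `build_digest`, so rate-limited notifications are batched rather than lost.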
Fan-Out Patterns#
When an event needs to notify many users — a post in a channel with 100K members — fan-out becomes the bottleneck.
Fan-Out on Write#
When the event occurs, immediately write a notification record for every recipient.
- Pros: Read path is fast — each user queries their own notification inbox.
- Cons: Expensive for events with many recipients. Writing 100K records synchronously blocks the producer.
- Mitigation: Use a message queue (SQS, Kafka) to fan out asynchronously.
Fan-Out on Read#
Store the event once. When a user opens their notification feed, query for events relevant to them.
- Pros: Write path is O(1). No wasted writes for users who never check notifications.
- Cons: Read path is expensive — must compute relevance at query time.
Hybrid Fan-Out#
Use fan-out on write for accounts with low follower counts. For "celebrity" accounts (millions of followers), store the event once and merge it into feeds at read time. Large social platforms such as Twitter and Instagram are commonly cited as using this kind of hybrid.
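The hybrid idea can be sketched in a few lines. The example assumes an in-memory follower map and a hypothetical `threshold` that separates ordinary accounts from celebrity ones; it is meant to show the write-time and read-time split, not how any particular platform implements it.

```python
from collections import defaultdict


class HybridFeed:
    """Hypothetical feed that fans out on write below a threshold and on read above it."""

    def __init__(self, followers, threshold=10_000):
        self.followers = followers                 # author -> set of follower ids
        self.threshold = threshold                 # assumed cut-off for "celebrity" accounts
        self.inboxes = defaultdict(list)           # user -> [(ts, author, text)]
        self.celebrity_posts = defaultdict(list)   # author -> [(ts, author, text)]

    def publish(self, author, text, ts):
        event = (ts, author, text)
        audience = self.followers.get(author, set())
        if len(audience) >= self.threshold:
            # Fan-out on read: a single write, merged into feeds later.
            self.celebrity_posts[author].append(event)
        else:
            # Fan-out on write: one inbox record per follower.
            for user in audience:
                self.inboxes[user].append(event)

    def read(self, user):
        merged = list(self.inboxes.get(user, []))
        for author, posts in self.celebrity_posts.items():
            if user in self.followers.get(author, set()):
                merged.extend(posts)
        return sorted(merged, reverse=True)  # newest first
```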
Implementation Sketch#
Event → Message Queue → Fan-Out Workers → Notification Store (per user)
                              ↓
                      Channel Dispatchers
                    (Email, SMS, Push, WS)
Fan-out workers read from the queue, determine recipients and channels based on preferences, write to the notification store, and enqueue channel-specific delivery jobs.
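A minimal sketch of that worker loop follows, with plain Python containers standing in for the message queue, the notification store, and the per-channel job queues. `resolve_recipients` and the shape of `prefs` (user id mapped to enabled channels) are assumptions for illustration.

```python
from collections import defaultdict, deque


def fan_out(event, recipients, prefs, store, channel_queues):
    """Write one inbox record per recipient and enqueue per-channel delivery jobs."""
    for user_id in recipients:
        store[user_id].append(event)
        for channel in prefs.get(user_id, ()):  # channels this user has enabled
            channel_queues[channel].append({"user_id": user_id, "event_id": event["id"]})


def run_worker(event_queue, resolve_recipients, prefs, store, channel_queues):
    """Drain the event queue; returns how many events were processed."""
    processed = 0
    while event_queue:
        event = event_queue.popleft()
        fan_out(event, resolve_recipients(event), prefs, store, channel_queues)
        processed += 1
    return processed


# Usage with in-memory stand-ins for the queue and stores.
events = deque([{"id": "e1", "channel_id": "general"}])
members = {"general": ["alice", "bob"]}
prefs = {"alice": ["push", "email"], "bob": ["email"]}
store, jobs = defaultdict(list), defaultdict(list)
count = run_worker(events, lambda e: members[e["channel_id"]], prefs, store, jobs)
```

Keeping the dispatch step as "enqueue a job" rather than "send now" is what lets each channel dispatcher scale and retry on its own.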
Delivery Tracking#
You cannot improve what you do not measure. Track every notification through its lifecycle.
Status States#
CREATED → QUEUED → SENT → DELIVERED → READ
                    ↓
                 FAILED → RETRYING → SENT
                             ↓
                          DROPPED (max retries exceeded)
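The lifecycle can be enforced with a small transition table. The sketch assumes the failure branch leaves SENT and that DROPPED is reached from RETRYING, which is one reading of the diagram; adjust `ALLOWED` if your pipeline can fail at other points.

```python
from enum import Enum


class Status(Enum):
    CREATED = "created"
    QUEUED = "queued"
    SENT = "sent"
    DELIVERED = "delivered"
    READ = "read"
    FAILED = "failed"
    RETRYING = "retrying"
    DROPPED = "dropped"


# Assumed transition table derived from the lifecycle diagram.
ALLOWED = {
    Status.CREATED: {Status.QUEUED},
    Status.QUEUED: {Status.SENT},
    Status.SENT: {Status.DELIVERED, Status.FAILED},
    Status.DELIVERED: {Status.READ},
    Status.FAILED: {Status.RETRYING},
    Status.RETRYING: {Status.SENT, Status.DROPPED},
    Status.READ: set(),      # terminal
    Status.DROPPED: set(),   # terminal
}


def advance(current, new):
    """Return the new status, or raise if the lifecycle does not allow the move."""
    if new not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {new.name}")
    return new
```

Rejecting illegal transitions at write time keeps late or out-of-order provider callbacks from corrupting the audit trail.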
Key Metrics#
| Metric | What It Tells You |
|---|---|
| Send rate | Throughput of the pipeline |
| Delivery rate | Percentage reaching the endpoint (email inbox, device) |
| Open/read rate | User engagement per channel |
| Failure rate | Infrastructure or provider issues |
| Latency (event → delivered) | End-to-end pipeline health |
| Unsubscribe rate | Content relevance and frequency tuning |
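Several of these metrics can be derived directly from lifecycle records. The sketch below assumes each record is a dict with a `status` string and, once delivered, `created_at` and `delivered_at` timestamps in seconds; those field names and the exact rate definitions are assumptions, since teams define these ratios differently.

```python
from statistics import median

ATTEMPTED = {"SENT", "DELIVERED", "READ", "FAILED", "DROPPED"}
REACHED = {"DELIVERED", "READ"}
FAILED = {"FAILED", "DROPPED"}


def summarize(records):
    """Compute delivery, read, and failure rates plus median delivery latency."""
    attempted = [r for r in records if r["status"] in ATTEMPTED]
    reached = [r for r in attempted if r["status"] in REACHED]
    failed = [r for r in attempted if r["status"] in FAILED]
    read = [r for r in reached if r["status"] == "READ"]
    latencies = [r["delivered_at"] - r["created_at"] for r in reached]

    def ratio(part, whole):
        return len(part) / len(whole) if whole else 0.0

    return {
        "delivery_rate": ratio(reached, attempted),
        "read_rate": ratio(read, reached),
        "failure_rate": ratio(failed, attempted),
        "median_latency_seconds": median(latencies) if latencies else None,
    }
```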
Retry Strategy#
- Exponential backoff with jitter for transient failures.
- Channel fallback — if push fails after N retries, escalate to SMS for critical notifications.
- Dead letter queue — park permanently failed notifications for manual investigation.
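The three retry ideas above fit together roughly as follows. This sketch uses "full jitter" (a uniform delay between zero and the capped exponential value), tries channels in the order given, and parks the notification in a list standing in for a dead letter queue. `TransientError`, `deliver_with_retry`, and the default base and cap values are hypothetical.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def backoff_delay(attempt, base=1.0, cap=60.0):
    """Full jitter: a random delay in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0, min(cap, base * 2 ** attempt))


def deliver_with_retry(send, notification, channels, max_retries=3,
                       sleep=time.sleep, dead_letters=None):
    """Try each channel in order with backoff; return the channel that succeeded."""
    for channel in channels:
        for attempt in range(max_retries):
            try:
                send(channel, notification)
                return channel
            except TransientError:
                if attempt < max_retries - 1:
                    sleep(backoff_delay(attempt))
        # Retries exhausted on this channel: fall through to the next one.
    if dead_letters is not None:
        dead_letters.append(notification)  # park for manual investigation
    return None
```

For a critical notification a caller might pass `channels=["push", "sms"]`, while lower priorities would pass a single channel so a failure never escalates to a more expensive one.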
Preference Management#
Respecting user preferences is both a UX requirement and a legal one (GDPR, CAN-SPAM, TCPA).
Preference Model#
UserPreferences {
  userId: string
  channels: {
    email: { enabled: bool, frequency: "instant" | "daily" | "weekly" }
    push:  { enabled: bool, quietHours: { start: "22:00", end: "08:00" } }
    sms:   { enabled: bool, onlyCritical: bool }
    inApp: { enabled: bool }
  }
  categories: {
    marketing:     { enabled: bool }
    social:        { enabled: bool }
    transactional: { enabled: bool }  // often legally required; cannot disable
    security:      { enabled: bool }  // also usually mandatory
  }
}
Rules Engine#
Before dispatching a notification, the preference engine evaluates:
- Is this category enabled for this user?
- Is this channel enabled?
- Is the user in quiet hours? If yes, queue for later.
- Has the rate limit been hit? If yes, batch or drop.
- Is this a mandatory notification (2FA, legal)? If yes, it overrides the preference checks above and is sent regardless.
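The checks above can be expressed as one decision function. This is a sketch under a few assumptions: preferences arrive as a nested dict shaped like the preference model, quiet hours are `datetime.time` values rather than strings, the rate-limit result is computed elsewhere and passed in, and the function returns one of the hypothetical outcomes `"send"`, `"defer"`, `"batch"`, or `"drop"`.

```python
from datetime import time


def in_quiet_hours(now, start, end):
    """True if `now` falls in [start, end), including windows that wrap past midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def decide(prefs, category, channel, now, rate_limited, mandatory=False):
    """Evaluate the preference rules in order and return a delivery decision."""
    if mandatory:
        return "send"  # 2FA, legal notices: preferences do not apply
    if not prefs["categories"].get(category, {}).get("enabled", False):
        return "drop"
    channel_prefs = prefs["channels"].get(channel, {})
    if not channel_prefs.get("enabled", False):
        return "drop"
    quiet = channel_prefs.get("quietHours")
    if quiet and in_quiet_hours(now, quiet["start"], quiet["end"]):
        return "defer"  # queue until the quiet window ends
    if rate_limited:
        return "batch"
    return "send"
```

The mandatory check runs first so that a disabled category or an active quiet window can never suppress a security notification.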
Tools and Services#
| Tool | Channel | Notes |
|---|---|---|
| Firebase Cloud Messaging (FCM) | Mobile push, web push | Free, unified API for Android/iOS/web |
| Amazon SNS | Push, SMS, email, HTTP | Fan-out via topics, integrates with SQS and Lambda |
| Twilio | SMS, voice, WhatsApp | Programmable messaging, global reach |
| Amazon SES | Email | Low cost, high deliverability with proper setup |
| SendGrid | Email | Template management, analytics |
| OneSignal | Push, in-app, email, SMS | Unified notification platform with segmentation |
| Novu | All channels | Open-source notification infrastructure, preference management built in |
Build vs Buy#
For startups, a managed service like Novu or OneSignal accelerates shipping. For large-scale systems with custom fan-out needs, a purpose-built pipeline on Kafka/SQS with channel-specific dispatchers gives full control.
Architecture Summary#
A complete notification system has five layers:
- Event ingestion — services emit events to a message bus (Kafka, SNS).
- Fan-out and routing — workers determine recipients, channels, and priority using the preference engine.
- Channel dispatchers — dedicated workers per channel (email sender, push sender, SMS sender) with retry logic.
- Notification store — per-user inbox for in-app notifications and audit trail.
- Tracking and analytics — lifecycle status, delivery metrics, and feedback loops (bounces, unsubscribes).
Design each layer to scale independently. The email sender should not block push delivery. The fan-out workers should not block event ingestion. Queues between every layer provide the buffering and back-pressure that keep the system healthy under load.
This is article #152 in our system design and software architecture series.