News Feed System Design: Architecture, Fan-Out Strategies & Ranking
Designing a news feed is one of the most frequently asked system design interview questions — and for good reason. It touches on fan-out strategies, ranking algorithms, caching, real-time delivery, and content moderation all at once.
This guide breaks down news feed system design from first principles so you can reason through every layer confidently.
Why News Feed Design Is Hard#
A news feed looks simple on the surface: show users a list of posts from people they follow. In practice it involves:
- Write amplification — a single post may need to reach millions of followers.
- Read latency — users expect sub-100ms feed loads.
- Ranking quality — chronological order is not enough; relevance matters.
- Real-time freshness — new posts should appear without a full page refresh.
High-Level Architecture#
A typical news feed system consists of four services:
- Post Service — accepts new posts, stores them, triggers fan-out.
- Fan-Out Service — distributes posts to follower feeds.
- Feed Service — assembles and returns a user's feed.
- Ranking Service — scores and orders feed items.
Feed Generation: Push vs Pull vs Hybrid#
Fan-Out on Write (Push)#
When a user publishes a post, the system immediately writes it to every follower's feed cache.
- Pros: Read is fast — the feed is pre-computed.
- Cons: Write is expensive, especially for users with millions of followers.
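The push model can be sketched in a few lines. This is a minimal in-memory stand-in (plain dicts for the follower graph and feed caches; in production these would be a graph store and Redis), not a production implementation:

```python
from collections import defaultdict

# In-memory stand-ins: follower graph and per-user pre-computed feed caches.
followers = defaultdict(set)    # author_id -> set of follower ids
feed_cache = defaultdict(list)  # user_id -> list of (timestamp, post_id), newest first

def fan_out_on_write(author_id, post_id, timestamp):
    """Push a new post into every follower's pre-computed feed at publish time."""
    for follower_id in followers[author_id]:
        feed_cache[follower_id].append((timestamp, post_id))
        # Keep each cache sorted newest-first so reads need no extra work.
        feed_cache[follower_id].sort(reverse=True)

followers["alice"] = {"bob", "carol"}
fan_out_on_write("alice", "post-1", 100)
fan_out_on_write("alice", "post-2", 200)
print(feed_cache["bob"][0])  # (200, 'post-2')
```

Note that the loop runs once per follower: this is exactly the write amplification the cons bullet describes.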
Fan-Out on Read (Pull)#
The feed is assembled at read time by fetching recent posts from every user the reader follows.
- Pros: Write is cheap — no fan-out at publish time.
- Cons: Read latency is high because it requires merging many sorted lists.
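The read-time merge is a classic k-way merge of per-author post lists. A sketch using Python's `heapq.merge` (the author names and timestamps are illustrative):

```python
import heapq

# Each author's posts, stored newest-first as (timestamp, post_id).
posts_by_user = {
    "alice": [(300, "a2"), (100, "a1")],
    "bob":   [(250, "b1")],
}

def fan_out_on_read(followees, limit=20):
    """Assemble a feed at read time by k-way merging each followee's post list."""
    streams = [posts_by_user.get(u, []) for u in followees]
    # heapq.merge lazily merges already-sorted inputs; reverse=True keeps newest first.
    merged = heapq.merge(*streams, reverse=True)
    return [post_id for _, post_id in list(merged)[:limit]]

print(fan_out_on_read(["alice", "bob"]))  # ['a2', 'b1', 'a1']
```

The merge itself is cheap per item, but a reader following thousands of accounts still pays for thousands of fetches on every load, which is where the read-latency cost comes from.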
Hybrid Approach#
Most production systems use a hybrid:
- Regular users → fan-out on write (push model).
- Celebrities / high-follower accounts → fan-out on read (pull model).
This is exactly how Twitter (now X) solved the celebrity problem: a user with 50 million followers would create 50 million writes on every tweet under a pure push model.
The Celebrity Problem#
When a celebrity posts, pure fan-out on write is impractical. The hybrid strategy handles this by:
- Skipping the push step for celebrity posts.
- At read time, merging the pre-computed feed with recent celebrity posts.
- Caching the merged result so subsequent reads are fast.
A follower-count threshold (e.g., 500K) determines which path a user takes.
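The threshold routing above can be sketched as follows. This is a simplified model (plain dicts, no timestamps or ranking on the merged result, which a real system would apply):

```python
CELEBRITY_THRESHOLD = 500_000  # follower count above which we skip push fan-out

celebrity_posts = {}  # author_id -> posts; written once, pulled at read time
pushed_feeds = {}     # user_id -> pre-computed feed entries from push fan-out

def publish(author_id, post_id, follower_count, follower_ids):
    """Route a new post down the push or pull path based on follower count."""
    if follower_count >= CELEBRITY_THRESHOLD:
        # Pull path: store once, no per-follower writes.
        celebrity_posts.setdefault(author_id, []).append(post_id)
    else:
        # Push path: fan out to every follower's cache.
        for f in follower_ids:
            pushed_feeds.setdefault(f, []).append(post_id)

def read_feed(user_id, followed_celebrities):
    """Merge the pre-computed feed with recent celebrity posts at read time."""
    feed = list(pushed_feeds.get(user_id, []))
    for celeb in followed_celebrities:
        feed.extend(celebrity_posts.get(celeb, []))
    return feed  # a real system would now sort/rank and cache this merged list
```

A celebrity post costs one write regardless of audience size, while regular posts keep the fast pre-computed read path.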
Ranking Algorithms#
Chronological#
Sort posts by timestamp, newest first. Simple and transparent, but it tends to produce low engagement because relevance is ignored and high-volume accounts dominate.
ML-Based Ranking#
Modern feeds use a scoring model that considers:
- Affinity — how often the reader interacts with the author.
- Content type weight — videos, images, and links may rank differently.
- Recency decay — newer posts get a boost that fades over time.
- Engagement velocity — posts gaining likes quickly are promoted.
The ranking service scores each candidate item and returns a sorted list.
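One way to combine the four signals is a weighted linear score multiplied by an exponential recency decay. The weights, half-life, and content-type multipliers below are illustrative assumptions; real systems learn them from engagement data:

```python
import math

# Illustrative weights (assumed, not from any real system).
W_AFFINITY, W_TYPE, W_VELOCITY = 2.0, 1.0, 1.5
HALF_LIFE_HOURS = 6  # recency half-life: score halves every 6 hours
TYPE_WEIGHTS = {"video": 1.2, "image": 1.0, "link": 0.8, "text": 0.9}

def score(post, affinity, now):
    age_hours = (now - post["created_at"]) / 3600
    recency = 0.5 ** (age_hours / HALF_LIFE_HOURS)  # exponential recency decay
    velocity = post["likes_last_hour"]              # engagement-velocity proxy
    return (W_AFFINITY * affinity
            + W_TYPE * TYPE_WEIGHTS.get(post["type"], 1.0)
            + W_VELOCITY * math.log1p(velocity)) * recency

def rank(candidates, affinities, now):
    """Score each candidate and return them best-first."""
    return sorted(candidates,
                  key=lambda p: score(p, affinities.get(p["author"], 0.0), now),
                  reverse=True)
```

`log1p` dampens the velocity term so a viral post cannot drown out everything else, and the multiplicative decay means even high-affinity posts fade as they age.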
Feed Storage#
| Layer | Technology | Purpose |
|---|---|---|
| Hot cache | Redis sorted sets | Pre-computed feeds, keyed by user ID |
| Warm store | Cassandra / DynamoDB | Persistent feed items for cold-start users |
| Blob store | S3 | Media attachments referenced by feed items |
Each entry in a Redis sorted set is a post ID with the score set to the ranking value (or timestamp for chronological feeds).
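The access pattern maps directly onto `ZADD` (insert with score) and `ZREVRANGE` (read highest-score-first). A minimal in-memory analogue with the same semantics, so the shape of the calls is clear without a Redis server:

```python
class FeedCache:
    """In-memory stand-in for one Redis sorted set per user.

    In production these would be ZADD / ZREVRANGE calls against a key
    like "feed:{user_id}"; the semantics mirrored here are the same.
    """
    def __init__(self):
        self._feeds = {}  # user_id -> {post_id: score}

    def zadd(self, user_id, post_id, score):
        self._feeds.setdefault(user_id, {})[post_id] = score

    def zrevrange(self, user_id, start, stop):
        members = self._feeds.get(user_id, {})
        ordered = sorted(members, key=members.get, reverse=True)
        return ordered[start:stop + 1]  # Redis stop is inclusive (negative indexes not handled here)

cache = FeedCache()
cache.zadd("u1", "p1", 100)
cache.zadd("u1", "p2", 300)
cache.zadd("u1", "p3", 200)
print(cache.zrevrange("u1", 0, 1))  # ['p2', 'p3']
```

Because the score is the ranking value, pagination becomes a range query over scores rather than a re-sort on every read.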
Cache Strategies#
- Write-through cache — update the cache during fan-out so reads never miss.
- TTL expiry — expire stale feed entries after 24–48 hours.
- Cache warming — when a user logs in after a long absence, rebuild their feed asynchronously.
- Layered caching — L1 (local in-memory) for the feed service, L2 (Redis cluster) for shared state.
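The L1/L2 lookup path can be sketched as below; the TTL value is arbitrary and the L2 store is a plain dict standing in for a Redis cluster:

```python
import time

class LayeredFeedCache:
    """L1: per-process dict with a short TTL; L2: shared store (dict standing in for Redis)."""
    def __init__(self, l2, l1_ttl=30):
        self.l1, self.l2, self.l1_ttl = {}, l2, l1_ttl

    def get(self, user_id, now=None):
        now = time.time() if now is None else now
        entry = self.l1.get(user_id)
        if entry and now - entry[1] < self.l1_ttl:
            return entry[0]                  # L1 hit: no network round trip
        feed = self.l2.get(user_id)          # fall through to the shared cache
        if feed is not None:
            self.l1[user_id] = (feed, now)   # populate L1 on the way back
        return feed
```

The short L1 TTL bounds staleness: within the TTL window a feed-service instance serves from local memory, after it the next read refreshes from L2.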
Real-Time Updates#
Users expect new posts to appear without refreshing. Two approaches:
Long Polling#
The client opens a request that the server holds until new data is available. Simple but inefficient at scale.
WebSockets#
A persistent connection allows the server to push new feed items instantly. This is preferred for high-traffic feeds because it reduces connection overhead.
A notification service listens to the fan-out pipeline and pushes lightweight payloads (post IDs) to connected clients, which then fetch full post data.
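A sketch of that lightweight payload, assuming a simple JSON message shape (the field names are made up for illustration); the client hydrates the post through its normal read path:

```python
import json

def make_notification(post_id, author_id):
    """Server side: a lightweight payload carrying IDs only, not the full post."""
    return json.dumps({"type": "new_post", "post_id": post_id, "author_id": author_id})

def handle_notification(raw, fetch_post):
    """Client side: on a new_post message, fetch the full post via the regular API."""
    msg = json.loads(raw)
    if msg["type"] == "new_post":
        return fetch_post(msg["post_id"])
```

Sending IDs instead of full posts keeps WebSocket frames tiny and lets the fetch go through the same cached, access-controlled read path as the rest of the feed.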
Cursor-Based Pagination#
Offset-based pagination breaks when new items are inserted (users see duplicates or miss posts). Cursor-based pagination solves this:
```
GET /feed?cursor=eyJsYXN0X2lkIjoiMTIzNDU2Nzg5MCJ9&limit=20
```
- The cursor encodes the last seen item (e.g., base64-encoded post ID or timestamp).
- The server fetches items after the cursor.
- The response includes a `next_cursor` for the following page.
This guarantees stable pagination even as new items arrive.
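The cursor mechanics can be sketched end to end; the cursor format (base64-encoded JSON with a `last_id` field) matches the example request above:

```python
import base64, json

def encode_cursor(last_id):
    payload = json.dumps({"last_id": last_id}).encode()
    return base64.urlsafe_b64encode(payload).decode()

def decode_cursor(cursor):
    return json.loads(base64.urlsafe_b64decode(cursor))["last_id"]

def paginate(items, cursor=None, limit=20):
    """items: feed entries sorted newest-first, each with a unique 'id'."""
    start = 0
    if cursor:
        last_id = decode_cursor(cursor)
        ids = [it["id"] for it in items]
        # Resume *after* the last item the client saw, wherever it now sits.
        start = ids.index(last_id) + 1 if last_id in ids else 0
    page = items[start:start + limit]
    next_cursor = encode_cursor(page[-1]["id"]) if len(page) == limit else None
    return page, next_cursor
```

Because the cursor anchors on the last item's identity rather than its position, inserting new items at the head shifts offsets but not the resume point.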
Content Moderation#
A production feed must filter harmful content before it reaches users:
- Pre-publish moderation — ML classifiers scan text and images at upload time.
- Post-publish moderation — a queue of flagged items is reviewed by human moderators.
- Feed-time filtering — blocked users, muted keywords, and policy-violating posts are stripped at read time.
- Appeal flow — authors can contest removals, triggering a secondary review.
Moderation should happen as early as possible in the pipeline to avoid caching harmful content.
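The feed-time filtering step is the simplest of the four to show. A minimal sketch (field names are illustrative; real matching would use tokenization rather than substring checks):

```python
def filter_feed(items, blocked_authors, muted_keywords, removed_post_ids):
    """Strip blocked authors, muted keywords, and removed posts at read time."""
    kept = []
    for item in items:
        if item["author"] in blocked_authors:
            continue  # reader has blocked this author
        if item["id"] in removed_post_ids:
            continue  # taken down by moderation after publish
        text = item.get("text", "").lower()
        if any(kw in text for kw in muted_keywords):
            continue  # matches a keyword the reader muted
        kept.append(item)
    return kept
```

Feed-time filtering is the last line of defense: it catches per-reader rules (blocks, mutes) that cannot be applied earlier because they differ for every reader.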
Putting It All Together#
- User publishes a post → Post Service stores it and emits an event.
- Fan-Out Service checks follower count. Regular user → push to follower caches. Celebrity → skip.
- Reader requests feed → Feed Service reads pre-computed cache, merges celebrity posts.
- Ranking Service scores and sorts the merged list.
- Response is paginated with a cursor and sent to the client.
- WebSocket connection pushes new item notifications in real time.
Key Takeaways#
- Use a hybrid fan-out strategy to balance write and read costs.
- Solve the celebrity problem with a follower-count threshold.
- Prefer cursor-based pagination over offset-based.
- Layer Redis + Cassandra for hot and warm feed storage.
- Apply content moderation as early in the pipeline as possible.
Mastering news feed system design prepares you for a wide range of interview scenarios — from social networks to content platforms.
Ready to level up your system design and coding skills? Visit codelit.io for interactive tutorials, real-world projects, and expert-led courses.
This is article #181 on the Codelit blog.