Distributed Counters: Scaling Counts to Millions of Writes per Second
Counting sounds trivial until you need to do it at scale. A single row in a relational database can handle a few hundred increments per second before lock contention turns your counter into a bottleneck. Social platforms count likes, views, and shares across billions of events per day. This article explores the architecture patterns that make that possible.
Why Naive Counters Break#
A straightforward UPDATE counters SET value = value + 1 WHERE id = ? acquires a row-level lock on every write. Under high concurrency the lock queue grows, latency spikes, and throughput collapses. The fundamental issue is write contention on a single hot key.
Thread A ──▶ LOCK row ──▶ INCREMENT ──▶ UNLOCK
Thread B ──▶ ........wait........── LOCK ──▶ INCREMENT ──▶ UNLOCK
Thread C ──▶ ................wait................── LOCK ──▶ ...
Even optimistic concurrency (compare-and-swap) suffers under high contention because most retries fail, wasting CPU and network round-trips.
Sharded Counters#
The classic solution is to spread writes across N shards and sum them on read.
┌──────────────────────────────────────────┐
│ Logical Counter: post_likes │
├──────────┬──────────┬──────────┬─────────┤
│ Shard 0 │ Shard 1 │ Shard 2 │ Shard 3 │
│ 1,204 │ 1,187 │ 1,195 │ 1,210 │
└──────────┴──────────┴──────────┴─────────┘
Total = SUM(shards) = 4,796
Write Path#
- Hash the writer (or pick randomly) to select a shard index: i = hash(writer_id) % N.
- Increment shard i. Contention drops by a factor of N.
Read Path#
- Read all N shards.
- Return the sum.
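The write and read paths above can be sketched in a few lines of Python. This is an in-memory model for illustration; in a real deployment each shard would be its own database row or key:

```python
import hashlib

class ShardedCounter:
    """N independent shards; each write touches one, each read sums all."""

    def __init__(self, num_shards=4):
        self.shards = [0] * num_shards

    def increment(self, writer_id, delta=1):
        # Write path: hash the writer to pick a shard, i = hash(writer_id) % N
        digest = hashlib.sha256(writer_id.encode()).hexdigest()
        i = int(digest, 16) % len(self.shards)
        self.shards[i] += delta

    def value(self):
        # Read path: fetch every shard and return the sum
        return sum(self.shards)

counter = ShardedCounter(num_shards=4)
for user in range(1000):
    counter.increment(f"user-{user}")
print(counter.value())  # 1000, spread across 4 shards
```

Because each writer hashes to a single shard, concurrent writers mostly touch different rows, which is exactly where the factor-of-N contention reduction comes from.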
Trade-offs#
| Dimension | Impact |
|---|---|
| Write throughput | Scales linearly with shard count |
| Read cost | Increases linearly — must read N shards |
| Consistency | Eventual — a reader may see a stale sum |
| Complexity | Need to manage shard creation and rebalancing |
A common optimization is to cache the sum with a short TTL (1–5 seconds) so that reads remain cheap while the displayed count is only slightly stale.
Approximate Counters with HyperLogLog#
When you need distinct counts (unique visitors, unique likers), exact counting requires storing every seen ID — memory grows linearly. HyperLogLog (HLL) trades accuracy for space.
How HLL Works#
- Hash each element to a uniformly distributed bit string.
- Observe the run of leading zeros, i.e. the position of the leftmost 1-bit.
- Track the maximum run length across many sub-streams (registers).
- Estimate cardinality from the harmonic mean of the register values.
With about 12 KB of memory (16,384 six-bit registers), HLL achieves roughly 0.81% standard error for cardinalities in the billions.
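The four steps above fit in a short toy implementation. This is a sketch assuming a SHA-256 hash and the standard bias constant; production systems should use a battle-tested library or Redis's built-in HLL:

```python
import hashlib
import math

def hll_estimate(items, p=14):
    """Minimal HyperLogLog: m = 2^p registers (~0.81% error at p=14)."""
    m = 1 << p
    registers = [0] * m
    for item in items:
        # Hash each element to a uniform 64-bit value
        h = int(hashlib.sha256(item.encode()).hexdigest(), 16) & ((1 << 64) - 1)
        idx = h >> (64 - p)                # first p bits pick a register
        rest = h & ((1 << (64 - p)) - 1)   # remaining bits give the run
        # Position of the leftmost 1-bit in the remaining 64-p bits
        rank = (64 - p) - rest.bit_length() + 1
        registers[idx] = max(registers[idx], rank)
    # Cardinality from the harmonic mean of register values
    alpha = 0.7213 / (1 + 1.079 / m)
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)
    # Small-range correction (linear counting) when many registers are empty
    zeros = registers.count(0)
    if raw <= 2.5 * m and zeros > 0:
        return m * math.log(m / zeros)
    return raw
```

Duplicates never change a register's maximum, which is why the estimate counts distinct elements only.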
Redis HyperLogLog#
PFADD page:home:visitors "user-42"
PFADD page:home:visitors "user-99"
PFCOUNT page:home:visitors -- returns approximate unique count
PFMERGE page:all:visitors page:home:visitors page:about:visitors
HLL is ideal when you can tolerate a small error margin — analytics dashboards, trending calculations, and spam detection all fit this profile.
Eventual Consistency for Counts#
Most users do not notice if a like count is off by a handful for a few seconds. This insight unlocks powerful optimizations.
Write-Behind Buffering#
- Incoming increments land in a local buffer (in-memory or a fast queue).
- A background process flushes the buffer periodically, batching increments into a single database write.
- The read path adds the buffer delta to the last-known persisted value.
Client ──▶ Buffer (memory) ──▶ Batch flush every 1s ──▶ Database
│
Read = DB value + buffer delta
This reduces database writes by orders of magnitude and smooths out traffic spikes.
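A minimal write-behind buffer looks like the following sketch (in-memory, with a dict standing in for the durable store; a real deployment would call `flush` from a background thread on a timer):

```python
import threading

class WriteBehindCounter:
    """Buffers increments in memory; flushes them to the store in batches."""

    def __init__(self, store, key):
        self.store = store          # dict-like stand-in for the database
        self.key = key
        self.buffer = 0
        self.lock = threading.Lock()

    def increment(self, delta=1):
        with self.lock:
            self.buffer += delta    # cheap in-memory write, no DB round-trip

    def flush(self):
        # Called periodically: one batched database write per interval
        with self.lock:
            delta, self.buffer = self.buffer, 0
        self.store[self.key] = self.store.get(self.key, 0) + delta

    def read(self):
        # Read path: last persisted value plus the unflushed delta
        with self.lock:
            return self.store.get(self.key, 0) + self.buffer
```

If the process crashes between flushes, the buffered delta is lost, which is the durability trade-off this pattern accepts in exchange for far fewer database writes.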
CRDT Counters#
In multi-region deployments, G-Counters (grow-only CRDT counters) allow each node to increment independently. The merge operation takes the element-wise max across all replicas.
Node A: [5, 0, 0]
Node B: [0, 3, 0]
Node C: [0, 0, 7]
Merged: [5, 3, 7] → Total = 15
For counters that also decrement, a PN-Counter pairs a G-Counter for increments with a G-Counter for decrements. The value is the difference.
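Both CRDT counters can be sketched directly from the definitions above (a simplified model with a fixed node count; real implementations use maps keyed by node ID):

```python
class GCounter:
    """Grow-only CRDT counter: one slot per node, merge = element-wise max."""

    def __init__(self, num_nodes, node_id):
        self.counts = [0] * num_nodes
        self.node_id = node_id

    def increment(self, delta=1):
        # Each node only ever writes its own slot
        self.counts[self.node_id] += delta

    def merge(self, other):
        self.counts = [max(a, b) for a, b in zip(self.counts, other.counts)]

    def value(self):
        return sum(self.counts)


class PNCounter:
    """Supports decrements by pairing two G-Counters."""

    def __init__(self, num_nodes, node_id):
        self.p = GCounter(num_nodes, node_id)  # increments
        self.n = GCounter(num_nodes, node_id)  # decrements

    def increment(self):
        self.p.increment()

    def decrement(self):
        self.n.increment()

    def merge(self, other):
        self.p.merge(other.p)
        self.n.merge(other.n)

    def value(self):
        return self.p.value() - self.n.value()
```

Because merge is a max, it is idempotent and commutative: replicas can exchange state in any order, any number of times, and still converge to the same total.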
Redis INCR — The Workhorse#
Redis executes commands on a single thread, so INCR is atomic by construction; concurrent increments are safe without locks.
Common Patterns#
Simple counter:
INCR article:42:views
GET article:42:views
Rate limiter (fixed window):
key = "rate:{user_id}:{current_minute}"
count = INCR key
if count == 1 then EXPIRE key 60   -- set the TTL only on the first hit
if count > limit then reject
Keying by the current minute gives a fixed window (not a true sliding window), and setting the TTL only on the first increment avoids refreshing it on every request.
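The same pattern can be modeled in-process for testing. This sketch uses a dict keyed by `(user_id, window_index)` in place of the Redis key; unlike Redis, it never expires old entries, so a real version would also evict them:

```python
import time

class FixedWindowLimiter:
    """In-memory model of the Redis INCR + EXPIRE rate-limit pattern."""

    def __init__(self, limit, window_seconds=60):
        self.limit = limit
        self.window = window_seconds
        self.counts = {}  # (user_id, window_index) -> count

    def allow(self, user_id, now=None):
        now = time.time() if now is None else now
        # Same role as the "rate:{user_id}:{current_minute}" Redis key
        key = (user_id, int(now // self.window))
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key] <= self.limit
```

Requests in the same window share one counter; the first request of a new window starts a fresh count, just as a new Redis key would.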
Batched persistence:
-- Every 10 seconds, a worker atomically reads and resets the counter
value = GETSET article:42:views 0
-- Persist `value` to the durable store
Scaling Redis Counters#
For counters that exceed a single Redis instance's throughput (rare, since Redis handles around 100K ops/sec), use Redis Cluster with hash tags to control slot assignment, or use client-side sharding across multiple keys.
Firestore Distributed Counters#
Google's Firestore documentation explicitly recommends distributed counters because a single Firestore document supports roughly one write per second.
Implementation#
- Create a subcollection counters/{counterId}/shards/{0..N}.
- Each write picks a random shard and increments its count field.
- Reads aggregate all shards with a collection query.
import { collection, doc, getDocs, increment, updateDoc } from "firebase/firestore";

// Write — pick a random shard
const shardId = Math.floor(Math.random() * NUM_SHARDS);
const shardRef = doc(db, "counters", counterId, "shards", String(shardId));
await updateDoc(shardRef, { count: increment(1) });

// Read — sum all shards
const snapshot = await getDocs(collection(db, "counters", counterId, "shards"));
let total = 0;
snapshot.forEach((d) => { total += d.data().count; });
Choosing Shard Count#
- Start with 10 shards — supports ~10 writes/sec.
- Scale to 100+ shards for viral content.
- Use Cloud Functions to auto-scale shard count based on write rate.
Real-Time Leaderboard Counts#
Leaderboards combine counting with ranking. Redis sorted sets are the canonical solution.
ZINCRBY leaderboard:daily 1 "player:007"
ZREVRANGE leaderboard:daily 0 9 WITHSCORES -- top 10
ZREVRANK leaderboard:daily "player:007" -- player rank
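The three commands above can be modeled in-process, which helps when unit-testing leaderboard logic without a Redis server. This sketch ignores Redis's lexicographic tie-breaking for brevity:

```python
class Leaderboard:
    """In-memory model of the Redis sorted-set commands used above."""

    def __init__(self):
        self.scores = {}  # member -> score

    def zincrby(self, delta, member):
        self.scores[member] = self.scores.get(member, 0) + delta
        return self.scores[member]

    def zrevrange_withscores(self, start, stop):
        # Highest score first, like ZREVRANGE ... WITHSCORES (stop inclusive)
        ranked = sorted(self.scores.items(), key=lambda kv: -kv[1])
        return ranked[start:stop + 1]

    def zrevrank(self, member):
        if member not in self.scores:
            return None
        ranked = sorted(self.scores, key=lambda m: -self.scores[m])
        return ranked.index(member)  # 0-based rank, like Redis
```

Note that ranks are 0-based, matching Redis: the top player has rank 0, not 1.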
Architecture for Scale#
┌────────────┐ ┌──────────────┐ ┌──────────────┐
│ Game Srvr │────▶│ Kafka Topic │────▶│ Leaderboard │
│ (events) │ │ (scores) │ │ Service │
└────────────┘ └──────────────┘ └──────┬───────┘
│
┌──────▼───────┐
│ Redis Sorted │
│ Set (ZADD) │
└──────────────┘
- Game servers emit score events to Kafka.
- A leaderboard consumer processes events and issues ZINCRBY commands.
- The API layer reads from Redis for real-time rankings.
- A periodic job snapshots the leaderboard to a durable store for historical queries.
Multi-Region Leaderboards#
For global leaderboards, run a Redis sorted set per region and merge periodically. Alternatively, use a single global Redis cluster with read replicas for rank queries.
Choosing the Right Pattern#
| Scenario | Recommended Pattern |
|---|---|
| Simple page views | Redis INCR with periodic persistence |
| High-write social counters | Sharded counters (database or Firestore) |
| Unique visitor counts | HyperLogLog |
| Multi-region counters | CRDT G-Counter / PN-Counter |
| Real-time leaderboards | Redis sorted sets |
| Analytics aggregations | Write-behind buffer + batch processing |
Key Takeaways#
- Shard writes, cache reads — the universal principle behind every distributed counter pattern.
- Approximate when you can — HyperLogLog saves orders of magnitude of memory for distinct counts.
- Embrace eventual consistency — users tolerate slightly stale counts; your infrastructure will thank you.
- Redis is your friend — INCR, sorted sets, and HLL cover most counting use cases out of the box.
- Plan for viral spikes — auto-scaling shard counts prevents write bottlenecks when content goes viral.
Distributed counters appear in nearly every system design interview that involves social features, analytics, or gaming. Master these patterns and you will handle them with confidence.
This is article #378 in the Codelit system design series.