# Backpressure & Flow Control in Distributed Systems
When a producer generates data faster than a consumer can process it, something has to give. Without backpressure the system eventually fails: queues overflow, memory is exhausted, latency spikes, and users see errors.
Backpressure is how systems communicate "slow down" upstream.
## The Problem

```text
Producer (1,000 msg/sec) → Queue → Consumer (100 msg/sec)

After 10 seconds:  9,000 messages queued
After 100 seconds: 90,000 messages queued → OOM crash
```
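The arithmetic above follows from a linear growth model: an unbounded queue grows at the producer rate minus the consumer rate. A minimal sketch:

```javascript
// Net queue growth = producer rate - consumer rate (msg/sec).
// Without a bound, depth grows linearly until memory runs out.
function queueDepth(producerRate, consumerRate, seconds) {
  return Math.max(0, (producerRate - consumerRate) * seconds);
}

console.log(queueDepth(1000, 100, 10));  // 9000
console.log(queueDepth(1000, 100, 100)); // 90000
```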
## Three Strategies
### 1. Drop (Load Shedding)

Drop excess messages when overloaded:

```text
Producer → Queue (max 10K) → Consumer
               ↓ full
         drop new messages (or the oldest)
```
**When:** real-time data where old data is worthless (metrics, live video, stock tickers). Missing a few data points is better than crashing.
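A drop policy can be sketched as a bounded queue; `BoundedQueue` and its options are hypothetical names for illustration, not a real library's API:

```javascript
// Bounded queue that sheds load when full. With dropOldest, the oldest
// message is evicted to make room; otherwise the new message is rejected.
class BoundedQueue {
  constructor(maxSize, { dropOldest = false } = {}) {
    this.maxSize = maxSize;
    this.dropOldest = dropOldest;
    this.items = [];
    this.dropped = 0;
  }
  push(item) {
    if (this.items.length >= this.maxSize) {
      this.dropped++;
      if (!this.dropOldest) return false; // shed the new message
      this.items.shift();                 // or evict the oldest
    }
    this.items.push(item);
    return true;
  }
}

const q = new BoundedQueue(2, { dropOldest: true });
[1, 2, 3].forEach((n) => q.push(n));
console.log(q.items);   // [2, 3] — oldest evicted
console.log(q.dropped); // 1
```

Evicting the oldest keeps the freshest data, which suits metrics and tickers; rejecting the new message instead lets the producer detect the failure and back off.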
### 2. Buffer (Absorb Spikes)

Queue messages and process them eventually:

```text
Producer → Kafka (days of retention) → Consumer (catches up when it can)
```
**When:** all messages must be processed but latency is flexible (event sourcing, analytics, email queues).
### 3. Signal (Slow Down the Producer)

Tell the producer to reduce its rate:

```text
Consumer → "I'm overloaded" → Producer reduces its rate
```
**When:** the producer can actually slow down (API rate limiting, TCP flow control, reactive streams).
## Patterns
### Queue Depth Monitoring

Watch queue size and alert before overflow:

```text
depth < 1,000:  normal ✓
depth > 5,000:  warning ⚠  (consumers may be falling behind)
depth > 10,000: critical 🔴 (add consumers or shed load)
```
```javascript
// Auto-scale consumers based on queue depth (AWS SDK for JavaScript v2)
const { Attributes } = await sqs
  .getQueueAttributes({ QueueUrl, AttributeNames: ["ApproximateNumberOfMessages"] })
  .promise();
const depth = Number(Attributes.ApproximateNumberOfMessages);
if (depth > 5000) scaleUp(workers);
else if (depth < 100) scaleDown(workers);
```
### Circuit Breaker + Backpressure

When a downstream service is slow, stop sending to it:

```text
Service A → Circuit Breaker → Service B (slow)
                ↓ open
         return 503 to the caller (shed load)
         retry after a timeout
```
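A minimal sketch of that breaker, with hypothetical names and a consecutive-failure threshold (real libraries such as resilience4j add half-open trial limits, metrics, and fallbacks):

```javascript
// After `threshold` consecutive failures the breaker opens and calls
// fail fast until `resetMs` elapses, shedding load instead of piling
// requests onto a struggling downstream service.
class CircuitBreaker {
  constructor({ threshold = 3, resetMs = 30000, now = Date.now } = {}) {
    this.threshold = threshold;
    this.resetMs = resetMs;
    this.now = now;        // injectable clock, for testing
    this.failures = 0;
    this.openedAt = null;
  }
  get isOpen() {
    if (this.openedAt === null) return false;
    if (this.now() - this.openedAt >= this.resetMs) {
      // Half-open: let a trial request through.
      this.openedAt = null;
      this.failures = 0;
      return false;
    }
    return true;
  }
  async call(fn) {
    if (this.isOpen) throw new Error("circuit open: shedding load");
    try {
      const result = await fn();
      this.failures = 0; // success resets the failure count
      return result;
    } catch (err) {
      if (++this.failures >= this.threshold) this.openedAt = this.now();
      throw err; // propagate the original failure
    }
  }
}
```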
### Reactive Streams

The consumer tells the producer how much it can handle:

```text
Consumer: "send me 100 items"
Producer: sends 100
Consumer: processes... "send me 50 more"
Producer: sends 50
```
Tools: Project Reactor (Java), RxJS (JavaScript), Akka Streams (Scala)
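The handshake above can be sketched as a pull-based producer that never sends more than the outstanding demand; `makeProducer` and `request` are illustrative names, and real Reactive Streams implementations add asynchronous delivery, cancellation, and error signalling:

```javascript
// Pull-based demand: the producer delivers at most the requested
// number of items, so a slow consumer is never flooded.
function makeProducer(items) {
  let cursor = 0;
  return {
    request(n, onNext) {
      for (let i = 0; i < n && cursor < items.length; i++) {
        onNext(items[cursor++]);
      }
    },
  };
}

const received = [];
const producer = makeProducer([...Array(200).keys()]);
producer.request(100, (x) => received.push(x)); // "send me 100 items"
producer.request(50, (x) => received.push(x));  // "send me 50 more"
console.log(received.length); // 150
```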
### TCP Flow Control

TCP has built-in backpressure: the receive window.

```text
Sender → [data] → Receiver
       ← window: 64KB   ("I can accept 64KB more")
Sender → [64KB data] → Receiver
       ← window: 0      ("stop! I'm processing")
         ... receiver processes ...
       ← window: 32KB   ("OK, send more")
```
Every TCP connection does this automatically, and protocols layered on top inherit it; HTTP/2, and therefore gRPC, adds its own stream-level flow control on top of TCP's.
### Kafka Consumer Lag

Kafka consumers track their position in each partition as an offset. Lag is how far behind the latest message that position is:

```text
Partition:       [msg1, msg2, ..., msg1000]  ← latest (log-end offset)
Consumer offset: msg800                      ← current position
Lag:             200 messages behind
```
Monitor lag. If it grows, add consumers or investigate processing bottleneck.
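The lag computation itself is plain offset arithmetic, per partition: the log-end offset minus the consumer's committed offset. A sketch using plain numbers rather than a real Kafka client API:

```javascript
// Lag per partition = log-end offset - committed consumer offset.
function consumerLag(partitions) {
  return partitions.map(({ partition, endOffset, committedOffset }) => ({
    partition,
    lag: endOffset - committedOffset,
  }));
}

const lags = consumerLag([
  { partition: 0, endOffset: 1000, committedOffset: 800 },
  { partition: 1, endOffset: 500, committedOffset: 500 },
]);
console.log(lags); // partition 0 is 200 behind, partition 1 is caught up
```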
### Rate Limiting as Backpressure

API rate limits are a form of backpressure:

```text
Client: 1,000 req/sec
Server: 429 Too Many Requests  (limit: 100/sec)
        Retry-After: 1
Client: slows to 100 req/sec   ← backpressure worked
```
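Server-side, limits like this are commonly enforced with a token bucket: tokens refill at the steady rate, the bucket capacity sets the allowed burst, and a request without a token gets a 429. A sketch (illustrative, not any particular limiter library; the injectable clock is just for testing):

```javascript
class TokenBucket {
  constructor(capacity, refillPerSec, now = Date.now) {
    this.capacity = capacity;       // burst size
    this.refillPerSec = refillPerSec; // steady rate
    this.tokens = capacity;
    this.now = now;
    this.last = now();
  }
  allow() {
    const t = this.now();
    this.tokens = Math.min(
      this.capacity,
      this.tokens + ((t - this.last) / 1000) * this.refillPerSec
    );
    this.last = t;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;  // 200 OK
    }
    return false;   // 429 Too Many Requests + Retry-After
  }
}

// 100 req/sec with a burst of 100: request 101 in the same instant is rejected.
let fakeTime = 0;
const bucket = new TokenBucket(100, 100, () => fakeTime);
const results = Array.from({ length: 101 }, () => bucket.allow());
console.log(results[99], results[100]); // true false
```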
## Architecture Examples
### Event Processing Pipeline

```text
App events → Kafka (buffer) → consumer group (auto-scaled)
                 ↓ if lag > threshold:  scale up consumers
                 ↓ if processing fails: send to a dead-letter topic (don't block)
```
### API with Backpressure

```text
Client → Rate Limiter (token bucket)
             ↓ allowed
         API Server ──→ Circuit Breaker ──→ Database
             ↓ DB slow        ↓ open
   queue request (buffer)     return cached/degraded response
```
### Streaming Data Pipeline

```text
Sensors (10K/sec) → Load Balancer → Ingestion Service
                                        ↓ if queue > 80% full:
                                          drop low-priority events,
                                          keep critical events
                                        ↓
                                    Stream Processor → Storage
```
## Anti-Patterns

- **Unbounded queues**: always set a maximum size; an unbounded queue is an eventual OOM crash.
- **Retry storms**: when every client retries a failed request immediately and simultaneously, the retries amplify the overload. Use exponential backoff plus jitter.
- **Ignoring lag**: Kafka consumer lag that grows silently until the consumer is hours behind.
- **No circuit breakers**: one slow service takes down everything upstream of it.
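The retry-storm fix can be sketched concretely. This is the "full jitter" variant described on the AWS Architecture Blog: each retry waits a random delay in [0, min(cap, base·2^attempt)], so clients spread out instead of retrying in synchronized waves:

```javascript
// Exponential backoff with full jitter. The injectable `rand` is only
// for deterministic testing; callers just use the defaults.
function backoffWithJitter(attempt, baseMs = 100, capMs = 30000, rand = Math.random) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return rand() * ceiling; // anywhere in [0, ceiling)
}

// attempt 0 → up to 100ms, attempt 5 → up to 3.2s, attempt 9+ → capped at 30s
for (let attempt = 0; attempt < 4; attempt++) {
  console.log(`attempt ${attempt}: retry in ${backoffWithJitter(attempt).toFixed(0)}ms`);
}
```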
## Monitoring
| Metric | Alert Threshold | Action |
|---|---|---|
| Queue depth | > 5x normal | Scale consumers |
| Consumer lag | > 5 minutes | Investigate bottleneck |
| Error rate | > 5% | Circuit breaker / shed load |
| P99 latency | > 3x normal | Add capacity or degrade |
| Memory usage | > 80% | Drop or buffer to disk |
## Summary

- Every system needs backpressure: unbounded growth always ends in a crash
- **Drop** when data is time-sensitive (metrics, video)
- **Buffer** when all data matters (events, orders)
- **Signal** when producers can slow down (APIs, streams)
- Monitor queue depth and consumer lag, and alert before the crash
- Auto-scale consumers based on queue depth