Data Consistency Patterns: From Eventual to Linearizable
In a single-database system, consistency is simple — read after write, get the latest value. In distributed systems, that guarantee becomes expensive. The art is choosing how much consistency you actually need for each operation.
The Consistency Spectrum#
Consistency is not binary. It is a spectrum from weakest to strongest:
Eventual → Monotonic Reads → Read-Your-Writes → Session
→ Causal → Bounded Staleness → Linearizable
Each step up costs latency, throughput, or availability. The CAP theorem says that during a network partition a system must choose between consistency and availability. PACELC is more practical: during a Partition, choose Availability or Consistency; Else (normal operation), choose Latency or Consistency.
Eventual Consistency#
The weakest useful guarantee: if no new writes occur, all replicas will eventually converge to the same value.
Write to primary → Replicate async → Replica 1 (stale for ~50ms)
→ Replica 2 (stale for ~120ms)
→ Replica 3 (stale for ~80ms)
When it works: social media feeds, analytics dashboards, product catalogs, activity logs. Data that is "close enough" is fine.
When it breaks: a user submits a comment, reloads the page, and the comment is missing. The write succeeded, but the read hit a stale replica.
Most systems labeled "eventually consistent" actually provide additional guarantees on top. Pure eventual consistency with no ordering is rarely useful in practice.
Read-Your-Writes#
A session guarantee: after a client writes a value, subsequent reads by that same client will see the write (or a later value).
Client A writes: SET name = "Alice" → Primary
Client A reads: GET name → Must return "Alice" (not stale)
Client B reads: GET name → May return old value (no guarantee)
Implementation Strategies#
Sticky sessions: route all of a client's requests to the same replica. Simple but breaks on failover.
Consistency token: the write response includes a token (e.g., a logical timestamp). The client sends this token with subsequent reads, and the server ensures the serving replica is at least that fresh.
POST /profile → 200 OK, X-Consistency-Token: ts_1042
GET /profile (header: X-Consistency-Token: ts_1042)
→ Server checks replica is at or past ts_1042
→ If not, either wait or route to primary
Read from primary after write: for a short window after writing (e.g., 5 seconds), route that user's reads to the primary. Simple and effective for most web applications.
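The token strategy above can be sketched in a few lines of Python. This is a toy model, not a real server: `Replica`, `Primary`, and the token plumbing are illustrative names standing in for your replication layer.

```python
# Sketch of consistency tokens: the primary hands out a logical timestamp
# on every write; reads carry the token and are served only by a replica
# that has applied at least that timestamp, falling back to the primary.

class Replica:
    def __init__(self):
        self.applied_ts = 0      # highest write timestamp this replica has applied
        self.data = {}

    def apply(self, ts, key, value):
        self.data[key] = value
        self.applied_ts = ts

class Primary:
    def __init__(self):
        self.ts = 0
        self.data = {}

    def write(self, key, value):
        self.ts += 1
        self.data[key] = value
        return self.ts           # returned to the client, e.g. X-Consistency-Token

    def replicate_to(self, replica):
        # async replication, modeled as an explicit catch-up step
        for k, v in self.data.items():
            replica.apply(self.ts, k, v)

def read(key, token, replicas, primary):
    # serve from any replica fresh enough for the client's token;
    # otherwise wait or (as here) route to the primary
    for r in replicas:
        if r.applied_ts >= token:
            return r.data.get(key)
    return primary.data.get(key)
```

The key property: a client holding token `ts_1042` can never observe a state older than its own write, regardless of which replica the load balancer picks.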
Monotonic Reads#
A guarantee that once a client sees a value, it will never see an older value in subsequent reads. Time does not go backwards.
Without this guarantee, load balancing across replicas can cause a user to see a comment, refresh, and have it disappear — then see it again on the next refresh.
Read 1 → Replica A (has version 5) → Returns version 5
Read 2 → Replica B (has version 3) → Returns version 3 ← violation
Read 3 → Replica A (has version 5) → Returns version 5
Fix: track the version each client has seen and ensure subsequent reads go to a replica that is at least as fresh.
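That fix can be sketched as a client-side filter, again with illustrative names. The client remembers the highest version it has observed and refuses to read from any replica behind that mark:

```python
class MonotonicClient:
    """Remembers the highest version seen so reads never go backwards."""
    def __init__(self):
        self.min_version = 0

    def read(self, key, replicas):
        # only replicas at least as fresh as what this client has seen qualify
        fresh = [r for r in replicas if r["version"] >= self.min_version]
        if not fresh:
            raise RuntimeError("no replica fresh enough; wait or go to primary")
        replica = fresh[0]  # in production, load-balance among the eligible set
        self.min_version = replica["version"]
        return replica["data"].get(key)
```

After the client has seen version 5, the replica stuck at version 3 is simply ineligible, so the disappearing-comment anomaly cannot occur.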
Causal Consistency#
If operation A causally precedes operation B (B depends on A or saw A's result), then every node that sees B must also have seen A.
Alice posts: "Anyone want lunch?" (msg_1)
Bob replies: "Sure! Where?" (re: msg_1) (msg_2, depends on msg_1)
Causal consistency guarantees:
Every user who sees msg_2 has already seen msg_1
But msg_3 (unrelated) may appear in any order relative to msg_1
This is weaker than total ordering but captures the ordering humans expect. It is the strongest consistency level achievable without sacrificing availability during partitions.
Implementation#
Track causal dependencies using vector clocks or hybrid logical clocks. Each operation carries metadata about which operations it depends on.
msg_1: {author: "alice", clock: {alice: 1}}
msg_2: {author: "bob", clock: {alice: 1, bob: 1}, depends_on: [msg_1]}
A replica only makes msg_2 visible after it has received and applied msg_1.
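A toy sketch of that buffering rule in Python (message shapes are illustrative): the replica holds back any message whose `depends_on` list has not been fully applied, then re-checks the pending set whenever something new becomes visible.

```python
class CausalReplica:
    def __init__(self):
        self.visible = []   # ids of applied messages, in application order
        self.pending = []   # messages waiting on unmet dependencies

    def receive(self, msg):
        self.pending.append(msg)
        self._drain()

    def _drain(self):
        # repeatedly apply any pending message whose deps are all visible;
        # applying one message may unblock others, so loop to a fixed point
        progress = True
        while progress:
            progress = False
            for msg in list(self.pending):
                if all(d in self.visible for d in msg.get("depends_on", [])):
                    self.visible.append(msg["id"])
                    self.pending.remove(msg)
                    progress = True
```

Even if replication delivers msg_2 before msg_1, no reader on this replica ever sees the reply without its parent.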
Session Consistency#
Combines read-your-writes, monotonic reads, and monotonic writes within a single client session. This is the most practical consistency level for user-facing applications.
Session guarantees:
1. Read-your-writes — see your own changes
2. Monotonic reads — time never goes backwards
3. Monotonic writes — your writes apply in order
Many databases offer this as a built-in option. MongoDB's causal sessions and Azure Cosmos DB's session consistency both provide this level.
Bounded Staleness#
A middle ground: reads may return stale data, but the staleness is bounded — by time (no more than N seconds old) or by versions (no more than K versions behind).
Configuration: max_staleness = 5 seconds
Primary: version 100 at T=now
Replica: version 97 at T=now-3s → Acceptable (within 5s)
Replica: version 92 at T=now-8s → Not acceptable, wait or redirect
This is excellent for geo-distributed systems where strong consistency would require cross-continent round trips. Azure Cosmos DB popularized this as a named consistency level.
Use case: a global e-commerce site. Product prices can be 5 seconds stale — nobody notices. But inventory counts with 5-second staleness during a flash sale could oversell.
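The routing decision behind bounded staleness reduces to a freshness check. A minimal sketch, with illustrative field names and a time-based bound:

```python
def pick_replica(replicas, now, max_staleness_s=5.0):
    """Return the first replica within the staleness bound, or None
    (the caller then waits for catch-up or redirects to the primary)."""
    for r in replicas:
        if now - r["last_applied_at"] <= max_staleness_s:
            return r
    return None
```

A version-based bound (no more than K versions behind the primary) is the same shape, comparing version counters instead of timestamps.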
Linearizability#
The strongest guarantee. Every operation appears to take effect at a single, instantaneous point between its invocation and response. All clients see the same order.
If Client A's write completes before Client B's read begins:
Client B MUST see Client A's write.
No ambiguity. No staleness. Global real-time ordering.
This is what single-node databases provide naturally. In distributed systems, it requires consensus protocols (Raft, Paxos, Multi-Paxos) and costs at least one round-trip to a quorum.
When You Need It#
- Distributed locks and leader election
- Unique constraint enforcement across nodes
- Financial transactions requiring global ordering
- Configuration changes that must be seen atomically
The Cost#
Linearizable reads in a 3-node cluster:
Client → Leader → Confirm leadership with quorum → Respond
Latency: ~2x a local read
For geo-distributed clusters with nodes across continents, this can add hundreds of milliseconds per read.
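The quorum confirmation step can be sketched in the style of Raft's ReadIndex optimization. This is a drastically simplified model with illustrative data shapes: real implementations also wait for the leader's state machine to catch up to the confirmed index.

```python
def linearizable_read(leader, followers, key):
    """Simplified ReadIndex-style read: the leader confirms it is still
    the leader with a quorum before serving the read from local state."""
    cluster_size = 1 + len(followers)
    acks = 1  # the leader counts itself
    for f in followers:
        if f["leader_term"] == leader["term"]:  # follower still recognizes us
            acks += 1
    if acks * 2 > cluster_size:                 # majority confirms leadership
        return leader["data"].get(key)
    raise RuntimeError("leadership not confirmed; client must retry")
```

The round trip to the followers is exactly the cost the section describes: cheap within one datacenter, painful across continents.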
Jepsen Testing#
How do you verify that your database actually provides the consistency it claims? Jepsen is the industry-standard framework for testing distributed systems under failure conditions.
Jepsen works by:
- Setting up a distributed database cluster
- Running concurrent operations (reads, writes, transactions)
- Injecting failures (network partitions, node crashes, clock skew)
- Checking whether the history of operations is consistent with the claimed guarantees
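The checking step is the hard part of Jepsen; a full linearizability checker is a research problem in itself. But the flavor of it can be shown with a drastically simplified single-client checker (the history format here is invented for illustration):

```python
def find_stale_reads(history):
    """Scan a single-client history of ("write", v) / ("read", v) ops and
    flag reads that do not return the most recent completed write."""
    last_written = None
    anomalies = []
    for i, (op, value) in enumerate(history):
        if op == "write":
            last_written = value
        elif op == "read" and value != last_written:
            anomalies.append(i)  # read-your-writes violation at index i
    return anomalies
```

Real checkers must additionally handle concurrency, operation overlap, and indeterminate results, which is why histories that look fine per-client can still hide violations.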
Jepsen has found bugs in:
- PostgreSQL (serializable isolation edge cases)
- MongoDB (stale reads during failover)
- CockroachDB (causal reverse anomalies)
- Redis (split-brain data loss)
- Elasticsearch (lost acknowledged writes)
If your system claims a consistency level, test it. If a vendor has not published Jepsen results, be cautious about their claims.
Running Your Own Consistency Tests#
Even without full Jepsen, you can catch common issues:
1. Write a value, immediately read from a different node
2. Perform concurrent increments, verify final count
3. Kill a node mid-transaction, check for partial writes
4. Introduce network delay between nodes, verify ordering
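Test 1 can be sketched against a simulated asynchronously replicated store; swap the toy `AsyncReplicatedStore` (an illustrative stand-in) for your real database client to run it against a live cluster:

```python
class AsyncReplicatedStore:
    """Two-node toy store: writes land on the primary immediately and
    reach the replica only when sync() is called."""
    def __init__(self):
        self.primary, self.replica = {}, {}

    def write(self, key, value):
        self.primary[key] = value

    def read_replica(self, key):
        return self.replica.get(key)

    def sync(self):
        self.replica.update(self.primary)

def check_read_your_writes(store):
    # write, then immediately read from a different node;
    # False means a stale read was observed
    store.write("k", "v1")
    return store.read_replica("k") == "v1"
```

Against a real system, run this in a loop under load and during failovers: a check that passes once proves little, while one that fails even occasionally tells you exactly which guarantee is missing.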
Choosing the Right Consistency Level#
Match consistency to the operation, not the entire system:
Operation | Recommended Level
---------------------------|---------------------------
Social media timeline | Eventual
User viewing own profile | Read-your-writes
Chat messages | Causal
Shopping cart | Session
Global inventory count | Bounded staleness (tight)
Payment processing | Linearizable
Distributed lock | Linearizable
Analytics dashboard | Eventual
Most applications need mixed consistency — strong for writes that matter, relaxed for reads that can tolerate staleness.
Practical Guidelines#
- Default to eventual consistency and strengthen where needed
- Implement read-your-writes for any user-facing write-then-read flow
- Use session consistency as the baseline for authenticated users
- Reserve linearizability for coordination operations (locks, leader election)
- Measure actual replication lag — "eventual" might mean 50ms or 5 minutes
- Test under failure conditions, not just happy-path scenarios
- Document the consistency guarantees of every API endpoint
This is article #256 in the Codelit engineering series. Browse all posts at codelit.io for deep dives on distributed systems, databases, and backend architecture.