CRDTs — Conflict-Free Replicated Data Types for Distributed Systems
The problem CRDTs solve#
Two users edit the same document on different devices. Both are offline. When they reconnect, their changes must merge automatically — without conflicts, without a central server deciding who wins.
Traditional databases use locks or last-write-wins. CRDTs take a different approach: they make every possible merge deterministic and correct by construction.
What is a CRDT?#
A Conflict-free Replicated Data Type (CRDT) is a data structure that can be replicated across multiple nodes, updated independently and concurrently, and merged automatically into a consistent state — without coordination.
The mathematical guarantee: all replicas that have received the same set of updates will converge to the same state, regardless of the order updates were applied.
Two families of CRDTs#
State-based CRDTs (CvRDTs)#
Each replica maintains the full state. Replicas periodically send their entire state to peers. States are merged using a join operation that must be:
- Commutative: merge(A, B) = merge(B, A)
- Associative: merge(merge(A, B), C) = merge(A, merge(B, C))
- Idempotent: merge(A, A) = A
These properties form a join-semilattice. As long as the merge function satisfies them, convergence is guaranteed.
Trade-off: simple to reason about, but sending full state is expensive for large data structures.
Operation-based CRDTs (CmRDTs)#
Instead of sending state, replicas broadcast operations (e.g., "increment counter by 1"). The delivery layer must guarantee:
- Causal delivery: if operation A happened before B, every replica receives A before B
- Exactly-once delivery: no duplicate operations
Trade-off: smaller messages, but requires a more reliable delivery layer.
Core CRDT data types#
G-Counter (grow-only counter)#
Each node maintains its own counter. The value is the sum of all node counters.
- Increment: node
iincrements its own entry - Merge: take the max of each node's entry
- Query: sum all entries
This works because max is commutative, associative, and idempotent.
Limitation: can only increment, never decrement.
PN-Counter (positive-negative counter)#
Two G-Counters: one for increments (P), one for decrements (N). The value is P - N.
- Increment: add to the P counter
- Decrement: add to the N counter
- Merge: merge both G-Counters independently
Used in distributed like/dislike systems, inventory counts, and online user counters.
LWW-Register (last-writer-wins register)#
Stores a value with a timestamp. On merge, the value with the highest timestamp wins.
- Update: set value with current timestamp
- Merge: keep the entry with the larger timestamp
Caveat: requires synchronized clocks. If clocks drift, you get unexpected results. Hybrid logical clocks (HLC) mitigate this.
OR-Set (observed-remove set)#
The most practical CRDT set. Each element is tagged with a unique identifier when added.
- Add: insert element with a new unique tag
- Remove: remove all tags currently associated with the element
- Merge: union of all tagged elements
If one replica adds an element while another removes it concurrently, the add wins — because the add created a new tag that the remove did not observe.
CRDTs for collaborative editing#
Text editing is the killer use case for CRDTs. Two approaches dominate:
Sequence CRDTs#
Assign each character a unique, ordered position identifier. Concurrent inserts at different positions never conflict. Algorithms include:
- RGA (Replicated Growable Array): used by Automerge
- YATA (Yet Another Transformation Approach): used by Yjs
- Fugue: newer algorithm with better interleaving properties
Tree CRDTs#
Model the document as a tree (for rich text, JSON, or file systems). Each node has a unique ID and a parent reference. Concurrent moves and edits are resolved using CRDT merge rules.
Tools and libraries#
| Tool | Language | Approach | Use case |
|---|---|---|---|
| Yjs | JavaScript | YATA sequence CRDT | Real-time collaborative editors |
| Automerge | JS / Rust | RGA + JSON CRDT | Document sync, local-first apps |
| Riak | Erlang | State-based CRDTs | Distributed key-value store |
| Diamond Types | Rust | Fugue sequence CRDT | High-performance text editing |
| cr-sqlite | SQLite extension | CRDTs over SQL tables | Offline-first databases |
Yjs in practice#
Yjs provides shared types (Y.Text, Y.Array, Y.Map) that sync across peers. It supports:
- WebSocket, WebRTC, and custom network providers
- Awareness protocol for cursor positions and presence
- Sub-document loading for large workspaces
- Bindings for ProseMirror, CodeMirror, Monaco, Quill, and TipTap
Automerge in practice#
Automerge models documents as JSON-like objects where every field is a CRDT:
- Tracks full change history
- Supports branching and merging (like Git for data)
- Binary sync protocol for efficient replication
- Rust core with JavaScript bindings for cross-platform use
CRDTs vs OT (Operational Transformation)#
| Aspect | CRDTs | OT |
|---|---|---|
| Central server | Not required | Usually required |
| Offline support | Native | Complex |
| Correctness | Mathematically proven | Historically error-prone |
| Metadata overhead | Higher (unique IDs per element) | Lower |
| Adoption | Yjs, Automerge, Riak | Google Docs, SharePoint |
CRDTs trade memory overhead for decentralization and correctness. For local-first applications, they are the clear winner.
Eventual consistency and CRDTs#
CRDTs provide strong eventual consistency (SEC):
- Eventual delivery: every update eventually reaches every replica
- Convergence: replicas that have received the same updates are in the same state
- No coordination: no consensus protocol, no leader election
This is stronger than plain eventual consistency because convergence is guaranteed without conflict resolution logic.
Garbage collection#
CRDTs accumulate metadata (tombstones for deleted elements, unique tags). Without cleanup, memory grows unbounded.
Strategies:
- Causal stability: once all replicas have observed a deletion, the tombstone can be removed
- Epoch-based GC: periodically snapshot the state and discard history before the epoch
- Peer tracking: maintain a version vector; GC metadata that all peers have acknowledged
This remains one of the hardest practical problems in CRDT implementations.
When to use CRDTs#
Good fit:
- Collaborative editing (text, diagrams, whiteboards)
- Offline-first mobile and desktop apps
- Multi-region databases without cross-region coordination
- Shopping carts, shared counters, distributed configuration
Poor fit:
- Strong consistency requirements (bank transfers)
- Small number of writers with reliable connectivity (use a database)
- Simple last-write-wins semantics suffice (use a timestamp)
Explore CRDTs in your architecture#
On Codelit, generate a collaborative editing system or a multi-region database to see how CRDTs enable conflict-free replication. Click the sync layer to explore merge strategies and consistency guarantees.
This is article #229 in the Codelit engineering blog series.
Design distributed architectures visually at codelit.io.
Try it on Codelit
Chaos Mode
Simulate node failures and watch cascading impact across your architecture
Related articles
Try these templates
Uber Real-Time Location System
Handles 5M+ GPS pings per second using H3 hexagonal geospatial indexing.
6 componentsReal-Time Collaborative Editor
Notion-like document editor with real-time collaboration, conflict resolution, and rich media.
9 componentsDistributed Rate Limiter
API rate limiting with sliding window, token bucket, and per-user quotas.
7 componentsBuild this architecture
Generate an interactive architecture for CRDTs in seconds.
Try it in Codelit →
Comments