Vector Clocks — Tracking Causality in Distributed Systems
The problem with physical clocks#
In a distributed system, you cannot rely on wall-clock time to order events. Clocks drift, NTP corrections introduce jumps, and two nodes may disagree on the current time by milliseconds — or seconds. When you need to know which write happened first, physical time is unreliable.
Logical clocks solve this by tracking causality — the order in which events depend on each other — without relying on synchronized clocks.
The happened-before relation#
Leslie Lamport defined the happened-before relation (denoted a → b) with three rules:
- If a and b are events in the same process and a occurs before b, then a → b
- If a is the sending of a message and b is the receipt of that message, then a → b
- If a → b and b → c, then a → c (transitivity)
If neither a → b nor b → a, the events are concurrent — the system cannot determine which happened first. This is not an error; it is a fundamental property of distributed systems.
Lamport timestamps#
A Lamport timestamp is a single integer counter that provides a partial ordering of events.
How it works#
- Each process maintains a counter, starting at 0
- Before each local event, increment the counter
- When sending a message, include the current counter value
- When receiving a message, set your counter to `max(local, received) + 1`
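The rules above fit in a few lines. Here is a minimal sketch (the `LamportClock` class name and its method names are illustrative, not from any particular library):

```python
class LamportClock:
    """Single-counter logical clock implementing Lamport's three rules."""

    def __init__(self):
        self.counter = 0

    def tick(self):
        """Increment before each local event; return the event's timestamp."""
        self.counter += 1
        return self.counter

    def send(self):
        """Sending is itself an event: tick, then attach the counter."""
        return self.tick()

    def receive(self, received):
        """On receipt, jump past both clocks: max(local, received) + 1."""
        self.counter = max(self.counter, received) + 1
        return self.counter
```

After `b.receive(a.send())`, process `b`'s counter is guaranteed to exceed the timestamp of the send, so the send orders before the receive.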
Limitations#
Lamport timestamps guarantee: if a → b, then L(a) < L(b). But the converse is not true — L(a) < L(b) does not mean a → b. Two concurrent events may have different timestamps by coincidence.
This means Lamport timestamps can detect ordering when causality exists, but they cannot detect concurrency. For that, you need vector clocks.
Vector clocks#
A vector clock extends the Lamport timestamp from a single counter to a vector of counters — one per process in the system.
How vector clocks work#
For a system with N processes, each process maintains a vector of N integers:
- Process i increments `VC[i]` before each local event
- When sending a message, include the entire vector
- When receiving a message with vector `VC_msg`, update each entry: `VC[j] = max(VC[j], VC_msg[j])` for all j, then increment `VC[i]`
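These update rules can be sketched directly (a minimal illustration; `VectorClockProcess` and its API are hypothetical):

```python
def vc_merge(local, received):
    """Entry-wise maximum of two equal-length vector clocks."""
    return [max(a, b) for a, b in zip(local, received)]

class VectorClockProcess:
    """Process i in a fixed system of n processes."""

    def __init__(self, i, n):
        self.i = i
        self.vc = [0] * n

    def local_event(self):
        """Increment own entry before each local event."""
        self.vc[self.i] += 1

    def send(self):
        """Sending is an event; attach a copy of the whole vector."""
        self.local_event()
        return list(self.vc)

    def receive(self, vc_msg):
        """Merge entry-wise, then count the receive as a local event."""
        self.vc = vc_merge(self.vc, vc_msg)
        self.vc[self.i] += 1
```

With three processes, `p2.receive(p1.send())` leaves P1 at `[1,0,0]` and P2 at `[1,1,0]`, matching the worked example later in this article.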
Comparing vector clocks#
Given two vector clocks A and B:
- A < B (A happened before B) if every entry in A is less than or equal to the corresponding entry in B, and at least one is strictly less
- A = B if every entry is equal
- A and B are concurrent if neither A < B nor B < A — meaning some entries in A are greater and some entries in B are greater
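The comparison reduces to two entry-wise checks. A small sketch (function name is illustrative):

```python
def compare(a, b):
    """Compare two vector clocks of equal length.

    Returns "before" if a happened before b, "after" if b happened
    before a, "equal" if identical, and "concurrent" otherwise.
    """
    if a == b:
        return "equal"
    a_le_b = all(x <= y for x, y in zip(a, b))
    b_le_a = all(y <= x for x, y in zip(a, b))
    if a_le_b:
        return "before"   # every entry <=, and at least one strictly less
    if b_le_a:
        return "after"
    return "concurrent"   # mixed: some entries favor a, some favor b
```

Note that `compare([1,1,0], [0,0,1])` returns `"concurrent"`: neither vector dominates the other.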
This is the key advantage: vector clocks can detect concurrent events where Lamport timestamps cannot.
Example#
Three processes — P1, P2, P3 — start with vectors [0,0,0]:
- P1 performs a write: `[1,0,0]`
- P1 sends to P2: P2 updates to `[1,1,0]`
- P3 performs a write independently: `[0,0,1]`
Comparing P2's `[1,1,0]` with P3's `[0,0,1]`: P2 has larger values in the first two entries, but P3 has a larger value in the third. Neither vector dominates the other, so the events are concurrent.
Version vectors#
Version vectors are closely related to vector clocks but serve a different purpose. While vector clocks track events, version vectors track data versions.
In a replicated database:
- Each replica maintains a version vector for each data item
- When a replica updates an item, it increments its own entry in the vector
- When replicas synchronize, they compare version vectors to detect conflicts
The key difference: version vectors are attached to data items rather than to processes or events. This makes them more practical for key-value stores where you need to know whether two copies of the same key are in conflict.
Conflict detection with vector clocks#
When a client reads a value from a distributed store, it receives the value along with its vector clock. When writing back:
- The client sends the new value along with the vector clock it received
- The server compares the incoming clock with the stored clock
- If the incoming clock descends from the stored clock, the write is an update — safe to overwrite
- If the clocks are concurrent, a conflict exists — the system must either merge or keep both versions (siblings)
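Steps 2 through 4 of this write path can be sketched as a single server-side check. This is an illustrative function, not any store's real API; clocks are modeled as dicts mapping node id to counter, with missing entries treated as 0:

```python
def handle_write(stored_clock, stored_value, incoming_clock, new_value):
    """Decide whether an incoming write supersedes the stored version.

    Returns a (decision, values) pair: "overwrite" when the incoming
    clock descends from the stored one, "stale" when it is older, and
    "siblings" when the clocks are concurrent.
    """
    nodes = set(stored_clock) | set(incoming_clock)
    descends = all(incoming_clock.get(n, 0) >= stored_clock.get(n, 0)
                   for n in nodes)
    older = all(incoming_clock.get(n, 0) <= stored_clock.get(n, 0)
                for n in nodes)
    if descends:
        return ("overwrite", [new_value])       # causal descendant: safe
    if older:
        return ("stale", [stored_value])        # incoming write is outdated
    return ("siblings", [stored_value, new_value])  # concurrent: keep both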
Conflict resolution strategies#
- Last-writer-wins (LWW) — discard one version based on timestamp. Simple but loses data.
- Application-level merge — return both versions to the client and let application logic merge them. Used by Riak.
- CRDTs — conflict-free replicated data types that merge automatically by mathematical design.
Dotted Version Vectors#
Standard version vectors have a problem: sibling explosion. When concurrent writes create siblings, subsequent writes that do not resolve the conflict create even more siblings.
Dotted Version Vectors (DVVs) solve this by associating each value with a single "dot" — a (node, counter) pair — rather than a full vector. This allows the system to precisely track which node wrote which version and prune obsolete siblings.
DVVs provide:
- Accurate sibling tracking without exponential growth
- Efficient garbage collection of old versions
- Correct identification of which values supersede which
Riak adopted DVVs to replace traditional vector clocks for exactly these reasons.
Causal consistency#
Vector clocks enable causal consistency — a consistency model that guarantees:
- If operation A causally precedes operation B, then every node sees A before B
- Concurrent operations may be seen in different orders by different nodes
Causal consistency is stronger than eventual consistency but weaker than linearizability. It is often the strongest consistency level achievable without coordination — making it attractive for systems that need low latency.
Implementing causal consistency#
- Attach vector clocks to all operations
- A node delays applying an operation until all causally preceding operations have been applied
- Concurrent operations can be applied in any order
This requires tracking dependencies, which adds overhead proportional to the number of nodes.
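The "delay until dependencies are applied" rule is the standard causal-delivery condition: an operation from process `j` with vector `V` is applicable once `V[j]` is exactly one ahead of what we have applied from `j`, and every other entry of `V` is already covered. A sketch under those assumptions (names are illustrative):

```python
def can_apply(op_clock, sender, applied_clock):
    """Causal delivery check for an operation from `sender`.

    op_clock: the operation's vector clock.
    applied_clock: entry-wise max over all operations applied locally.
    """
    for k, v in enumerate(op_clock):
        if k == sender:
            # Must be the very next operation from the sender.
            if v != applied_clock[k] + 1:
                return False
        elif v > applied_clock[k]:
            # Depends on an operation we have not applied yet.
            return False
    return True
```

A node that receives operations out of order buffers those for which `can_apply` is false and retries them after each successful apply.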
Scaling challenges#
Vector clocks grow linearly with the number of processes. In a system with thousands of clients, this becomes impractical:
- Space overhead — each clock entry adds bytes to every message
- Comparison cost — comparing two vectors is O(N)
- Garbage collection — pruning old entries requires careful coordination
Solutions#
- Bounded vector clocks — limit the number of entries and evict the oldest. Risks false conflicts but keeps size manageable.
- Server-side clocks — only track server replicas, not individual clients. Amazon's original Dynamo took this approach.
- Hash-based schemes — compress the vector using hash functions at the cost of occasional false positives.
- Interval tree clocks — dynamic structures that grow and shrink as nodes join and leave.
Real-world applications#
Riak#
Riak used vector clocks (later DVVs) to track object versions. Clients receive a causal context with each read and must return it with writes. Concurrent writes produce siblings that the application resolves.
Amazon DynamoDB#
The original Dynamo system, DynamoDB's predecessor, used vector clocks as described in the 2007 Dynamo paper. DynamoDB itself moved toward server-side reconciliation with last-writer-wins, trading conflict detection for operational simplicity.
CockroachDB#
CockroachDB uses a hybrid approach — Hybrid Logical Clocks (HLC) that combine physical time with logical counters. This provides the ordering guarantees of logical clocks while staying close to real time.
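A hedged sketch of the HLC update rules (following the Kulkarni et al. formulation; the class name and the injected `physical_now` clock source are assumptions for illustration):

```python
class HybridLogicalClock:
    """Sketch of Hybrid Logical Clock update rules.

    `l` tracks the largest physical timestamp observed so far;
    `c` is a logical counter that breaks ties while `l` is stalled.
    """

    def __init__(self, physical_now):
        self.physical_now = physical_now  # callable returning physical time
        self.l = 0
        self.c = 0

    def tick(self):
        """Local or send event."""
        pt = self.physical_now()
        if pt > self.l:
            self.l, self.c = pt, 0   # physical time advanced: reset counter
        else:
            self.c += 1              # stalled: advance logically
        return (self.l, self.c)

    def receive(self, l_msg, c_msg):
        """Merge the sender's (l, c) pair with local state."""
        pt = self.physical_now()
        new_l = max(self.l, l_msg, pt)
        if new_l == self.l == l_msg:
            self.c = max(self.c, c_msg) + 1
        elif new_l == self.l:
            self.c += 1
        elif new_l == l_msg:
            self.c = c_msg + 1
        else:
            self.c = 0               # fresh physical time dominates
        self.l = new_l
        return (self.l, self.c)
```

Because `l` stays within clock-skew bounds of real time, HLC timestamps can double as approximate wall-clock values while still ordering causally related events.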
Vector clocks vs. other approaches#
| Mechanism | Detects causality | Detects concurrency | Overhead |
|---|---|---|---|
| Physical timestamps | No | No | Minimal |
| Lamport timestamps | Partial | No | Single integer |
| Vector clocks | Yes | Yes | O(N) integers |
| Dotted Version Vectors | Yes | Yes | O(N) with better pruning |
| Hybrid Logical Clocks | Yes | Partial | Timestamp + counter |
Visualize vector clocks#
On Codelit, generate a distributed key-value store to see how vector clocks propagate across replicas. Click on a data item to trace its version history and observe conflict detection in action.
This is article #238 in the Codelit engineering blog series.
Build and explore distributed system architectures visually at codelit.io.