# Distributed Locking Patterns: Redis, ZooKeeper, Database Locks & Beyond
When multiple processes across different machines need to coordinate access to a shared resource, you need a distributed lock. Getting it wrong leads to data corruption, double-processing, or deadlocks.
This guide covers the most important distributed locking patterns, from Redis single-instance locks to ZooKeeper recipes, along with the pitfalls that catch most engineers off guard.
## Why Distributed Locks?
In a single-process application, a mutex or semaphore is enough. In a distributed system you face additional challenges:
- No shared memory — processes run on different machines.
- Partial failures — a lock holder can crash without releasing the lock.
- Clock skew — nodes disagree on the current time.
- Network partitions — a node may be isolated but still believe it holds the lock.
Distributed locks provide mutual exclusion across these failure modes — when implemented correctly.
## Redis-Based Locking
### SET NX (Single Instance)
The simplest Redis lock uses a single command:
```
SET resource_name unique_value NX PX 30000
```
- NX — only set if the key does not exist (acquire).
- PX 30000 — auto-expire after 30 seconds (safety net).
- unique_value — a UUID so only the holder can release the lock.
Release with a Lua script to make the check-and-delete atomic:
```lua
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end
```
Limitation: If the single Redis instance fails, the lock is lost.
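To make the semantics concrete, here is a minimal in-memory sketch of the same acquire/release logic. The class and its storage are stand-ins for a real Redis client: NX and PX are simulated with a dict and timestamps, and the unique-value check mirrors the Lua script above.

```python
import time
import uuid

class InMemoryLock:
    """Simulates SET NX PX plus the Lua check-and-delete release."""

    def __init__(self):
        self._store = {}  # key -> (value, expiry timestamp)

    def acquire(self, key, ttl_ms):
        now = time.monotonic()
        entry = self._store.get(key)
        if entry is not None and entry[1] > now:
            return None  # key exists and has not expired: NX fails
        token = str(uuid.uuid4())  # unique value identifying this holder
        self._store[key] = (token, now + ttl_ms / 1000.0)
        return token

    def release(self, key, token):
        entry = self._store.get(key)
        if entry is not None and entry[0] == token:  # only the holder may delete
            del self._store[key]
            return True
        return False

lock = InMemoryLock()
t1 = lock.acquire("resource", 30_000)  # succeeds, returns a token
t2 = lock.acquire("resource", 30_000)  # fails while t1 holds the lock
released = lock.release("resource", t1)
```

The same shape maps directly onto a Redis client: `acquire` becomes the SET command, `release` becomes the Lua script.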
### Redlock Algorithm
Martin Kleppmann famously critiqued the safety of Redis-based distributed locks, Redlock in particular. The Redlock algorithm, proposed by Salvatore Sanfilippo, uses N independent Redis instances (typically 5):
- Acquire the lock on all N instances with the same key and unique value.
- Consider the lock acquired only if a majority (N/2 + 1) succeed within a validity window.
- If acquisition fails, release on all instances.
Trade-offs:
- More resilient than a single instance.
- Still relies on clock assumptions — Kleppmann argues this makes it unsafe under certain GC pauses or clock jumps.
- In practice, Redlock works well for efficiency locks (preventing duplicate work) but may not be suitable for correctness locks (where safety is critical).
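The quorum logic can be sketched with in-memory dicts standing in for the N instances. This is only the majority-acquire/rollback skeleton; real Redlock also subtracts elapsed acquisition time from the validity window and applies per-instance timeouts.

```python
import uuid

def redlock_acquire(instances, key, token):
    """Try to set key on every instance; succeed only on a majority."""
    acquired = []
    for inst in instances:
        if key not in inst:          # stand-in for SET NX on that instance
            inst[key] = token
            acquired.append(inst)
    if len(acquired) >= len(instances) // 2 + 1:
        return True
    for inst in acquired:            # quorum failed: release wherever we succeeded
        del inst[key]
    return False

instances = [{} for _ in range(5)]
instances[0]["job"] = "someone-else"           # one instance is already locked
token = str(uuid.uuid4())
ok = redlock_acquire(instances, "job", token)  # 4 of 5 succeed: majority reached
```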
## ZooKeeper Locks
ZooKeeper provides strong consistency guarantees via its ZAB consensus protocol, making it a popular choice for correctness-critical locks.
### Ephemeral Sequential Nodes
The standard ZooKeeper lock recipe:
- Create an ephemeral sequential znode under /locks/resource, e.g. /locks/resource/lock-0000000001.
- List all children of /locks/resource.
- If your node has the lowest sequence number, you hold the lock.
- Otherwise, set a watch on the node with the next-lower sequence number.
- When that node is deleted (lock released or holder crashed), re-check.
Advantages:
- Ephemeral nodes auto-delete when the session expires, preventing orphaned locks.
- Sequential ordering prevents the herd effect — only one waiter is notified per release.
Disadvantages:
- Higher latency than Redis (consensus round-trips).
- Operational complexity of running a ZooKeeper ensemble.
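The recipe can be simulated without a ZooKeeper ensemble. In this sketch, sequence assignment and the "lowest number holds the lock" check are modeled with a sorted list; a real client (e.g. the kazoo library's lock recipe) would create ephemeral sequential znodes and use watches instead.

```python
class LockQueue:
    """Models /locks/resource: lowest sequence number holds the lock."""

    def __init__(self):
        self._seq = 0
        self._children = []  # znode names, e.g. "lock-0000000001"

    def create_node(self):
        name = f"lock-{self._seq:010d}"  # sequential suffix assigned by the server
        self._seq += 1
        self._children.append(name)
        return name

    def holds_lock(self, name):
        return min(self._children) == name

    def watch_target(self, name):
        """The node with the next-lower sequence (what this waiter would watch)."""
        lower = sorted(n for n in self._children if n < name)
        return lower[-1] if lower else None

    def delete_node(self, name):
        self._children.remove(name)

q = LockQueue()
a = q.create_node()          # lowest sequence: holds the lock
b = q.create_node()          # waits
target = q.watch_target(b)   # b watches only a, avoiding the herd effect
q.delete_node(a)             # holder releases or crashes; b re-checks and wins
```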
## Database Advisory Locks
If you already run a relational database, advisory locks avoid introducing another system:
### PostgreSQL
```sql
-- Acquire (blocks until available)
SELECT pg_advisory_lock(12345);

-- Try to acquire (non-blocking)
SELECT pg_try_advisory_lock(12345);

-- Release
SELECT pg_advisory_unlock(12345);
```
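pg_advisory_lock takes a 64-bit integer key, so applications typically derive one from the resource name. The hash choice and truncation below are a common convention, not something PostgreSQL prescribes:

```python
import hashlib

def advisory_key(name: str) -> int:
    """Map a resource name to a signed 64-bit key for pg_advisory_lock."""
    digest = hashlib.sha256(name.encode()).digest()
    unsigned = int.from_bytes(digest[:8], "big")
    # Wrap into Postgres's signed bigint range.
    return unsigned - 2**64 if unsigned >= 2**63 else unsigned

key = advisory_key("orders:reindex")
# e.g. with psycopg2: cur.execute("SELECT pg_try_advisory_lock(%s)", (key,))
```

Deriving the key deterministically means every process computes the same integer for the same resource name.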
### MySQL
```sql
SELECT GET_LOCK('resource_name', 10); -- 10s timeout
SELECT RELEASE_LOCK('resource_name');
```
Trade-offs:
- No extra infrastructure needed.
- Tied to a database connection — if the connection drops, the lock is released.
- Not suitable for high-throughput locking (database becomes the bottleneck).
## Fencing Tokens
Even with a correct lock implementation, a process may believe it holds the lock after it has expired (due to a GC pause, for example). Fencing tokens solve this:
- Each lock acquisition returns a monotonically increasing token (e.g., ZooKeeper's zxid or a counter).
- The lock holder includes the token in every write to the shared resource.
- The resource rejects writes with a token lower than the last accepted token.
This ensures that a stale lock holder cannot corrupt data, even if two processes briefly believe they hold the lock.
Without fencing tokens, no distributed lock is truly safe for correctness.
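A resource guarded by fencing tokens can be sketched as follows. Token issuance and storage are simplified here; a real system would persist the last accepted token durably alongside the data.

```python
class FencedStore:
    """Rejects writes carrying a token older than the newest one seen."""

    def __init__(self):
        self.last_token = 0
        self.value = None

    def write(self, token, value):
        if token < self.last_token:
            return False          # stale holder: reject the write
        self.last_token = token   # accept and raise the bar
        self.value = value
        return True

store = FencedStore()
store.write(33, "from old holder")                    # accepted
store.write(34, "from new holder")                    # accepted, bar is now 34
stale = store.write(33, "late write after GC pause")  # rejected
```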
## Lock Expiry and Renewal
Lock expiry prevents deadlocks when a holder crashes, but introduces a race condition: what if the holder is still working when the lock expires?
### Renewal (Heartbeat Extension)
The holder periodically extends the lock before it expires:
```python
# Pseudocode
while work_in_progress:
    if time_until_expiry < threshold:
        extend_lock(lock_key, new_ttl)
    do_work_chunk()
```
Rules of thumb:
- Set the initial TTL to 3-5x the expected operation duration.
- Renew at 1/3 of the TTL interval.
- If renewal fails, abort the operation — another process may have acquired the lock.
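Applying those rules of thumb numerically (the 4x and 1/3 multipliers below are taken from the guideline above, not universal constants):

```python
def lock_timings(expected_op_seconds):
    """Derive a lock TTL and renewal interval from the expected duration."""
    ttl = expected_op_seconds * 4   # within the suggested 3-5x margin
    renew_every = ttl / 3           # renew at 1/3 of the TTL
    return ttl, renew_every

ttl, renew_every = lock_timings(10)  # a 10s operation gets a 40s TTL
```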
## Deadlock Prevention
Distributed deadlocks occur when two processes each hold a lock the other needs.
### Strategies
- Lock ordering — always acquire locks in a globally consistent order (e.g., sorted by resource name).
- Timeout-based — if a lock cannot be acquired within a deadline, release all held locks and retry with back-off.
- Try-lock with rollback — attempt to acquire all required locks without blocking. If any acquisition fails, release the ones you obtained and retry.
In distributed systems, timeout-based prevention is the most common because enforcing global ordering is difficult across services.
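Lock ordering can be illustrated locally, with threading locks standing in for distributed ones; sorting by resource name is the assumed global order:

```python
import threading

# Stand-ins for distributed locks, one per shared resource.
locks = {name: threading.Lock() for name in ("accounts", "inventory", "orders")}

def acquire_in_order(names):
    """Acquire locks sorted by name so every process agrees on the order."""
    ordered = sorted(names)
    for name in ordered:
        locks[name].acquire()
    return ordered

def release_all(names):
    for name in reversed(sorted(names)):
        locks[name].release()

# Both of these requests lock "accounts" before "orders", so they
# cannot deadlock against each other regardless of arrival order.
held = acquire_in_order(["orders", "accounts"])
release_all(["orders", "accounts"])
```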
## Leader Election with Locks
Distributed locks naturally extend to leader election:
- All candidates attempt to acquire a lock on a well-known key (e.g., /election/leader).
- The one that succeeds is the leader.
- Other candidates watch for lock release.
- When the leader crashes or resigns, the lock expires and a new candidate acquires it.
ZooKeeper's ephemeral nodes make this particularly clean. In Redis, you need a renewal loop to maintain leadership.
Caution: Leader election via locks is simpler than full consensus (Raft/Paxos) but offers weaker guarantees. It works well for leader-worker patterns where brief dual-leadership is tolerable.
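The election steps above can be sketched with a try-acquire primitive; the class below is an in-memory stand-in for a lock on a well-known key, with resignation standing in for lock expiry:

```python
class Election:
    """First candidate to claim the key becomes leader; release frees it."""

    def __init__(self):
        self.leader = None

    def campaign(self, candidate):
        if self.leader is None:      # stand-in for try-acquire on /election/leader
            self.leader = candidate
            return True
        return False                 # lost: would now watch for release

    def resign(self, candidate):
        if self.leader == candidate: # stand-in for expiry / ephemeral node loss
            self.leader = None

e = Election()
won_a = e.campaign("node-a")   # node-a becomes leader
won_b = e.campaign("node-b")   # node-b loses and waits
e.resign("node-a")             # leader crashes or resigns
won_b2 = e.campaign("node-b")  # node-b takes over
```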
## Tools Comparison
| Tool | Consistency | Latency | Ops Complexity | Best For |
|---|---|---|---|---|
| Redis SET NX | Weak (single instance) | Very low | Low | Efficiency locks, dedup |
| Redlock | Moderate | Low | Medium | Distributed efficiency locks |
| ZooKeeper | Strong (ZAB) | Medium | High | Correctness locks, elections |
| etcd | Strong (Raft) | Medium | Medium | Kubernetes-native systems |
| PostgreSQL advisory | Strong (single node) | Medium | Low | Existing Postgres deployments |
| Consul | Strong (Raft) | Medium | Medium | Service-mesh environments |
## Choosing the Right Pattern
Ask yourself two questions:
1. What happens if two processes enter the critical section simultaneously?
   - Duplicate work (wasteful but harmless) → Redis SET NX is fine.
   - Data corruption → use ZooKeeper or etcd with fencing tokens.
2. What infrastructure do you already run?
   - Already have Redis → start with SET NX.
   - Already have PostgreSQL → advisory locks.
   - Need strong guarantees → ZooKeeper or etcd.
## Key Takeaways
- SET NX + TTL is the simplest distributed lock but is unsafe without fencing tokens.
- Redlock improves availability but does not eliminate clock-skew risks.
- ZooKeeper ephemeral sequential nodes provide the strongest lock semantics.
- Fencing tokens are essential for correctness — no lock algorithm alone is enough.
- Prefer timeout-based deadlock prevention in distributed environments.
- Use lock renewal to avoid premature expiry, but always handle renewal failure gracefully.
Understanding distributed locking patterns is fundamental to building reliable systems at scale.
Ready to deepen your distributed systems knowledge? Visit codelit.io for hands-on courses, system design practice, and real-world engineering content.
This is article #182 on the Codelit blog.