Circuit Breaker Pattern Explained — Prevent Cascading Failures
Why services cascade-fail#
Service A calls Service B. Service B is slow (overloaded database). Service A's threads pile up waiting. Service A becomes slow. Service C calls Service A. Service C becomes slow. Your entire system is down because of one slow database.
The circuit breaker prevents this by failing fast instead of waiting forever.
The three states#
Like an electrical circuit breaker that trips when current is too high:
Closed (normal)#
Requests flow through normally. The breaker monitors failures.
Open (tripped)#
Too many failures detected. The breaker rejects all requests immediately — no waiting, no retrying. Returns a fallback response or error.
Half-Open (testing)#
After a timeout, the breaker allows a few test requests through. If they succeed, close the breaker (recovery). If they fail, open it again.
[Closed] → failures exceed threshold → [Open]
[Open] → timeout expires → [Half-Open]
[Half-Open] → test succeeds → [Closed]
[Half-Open] → test fails → [Open]
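The transition table above can be sketched as a pure function — a minimal illustration of the state machine, with event names chosen here for clarity:

```typescript
// The circuit breaker state machine as a pure transition function.
// Event names ("failure-threshold", etc.) are illustrative labels.
type State = "closed" | "open" | "half-open";
type Event = "failure-threshold" | "timeout" | "test-success" | "test-failure";

function transition(state: State, event: Event): State {
  if (state === "closed" && event === "failure-threshold") return "open";
  if (state === "open" && event === "timeout") return "half-open";
  if (state === "half-open" && event === "test-success") return "closed";
  if (state === "half-open" && event === "test-failure") return "open";
  return state; // any other event leaves the state unchanged
}
```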
Configuration#
Failure threshold: How many failures before opening. Common: 5 failures in 60 seconds.
Timeout duration: How long to stay open before testing. Common: 30-60 seconds.
Success threshold: How many successes in half-open before closing. Common: 3 consecutive successes.
What counts as failure: Timeouts, 5xx errors, connection refused. Not 4xx (client errors).
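These knobs can be captured in an options object. The shape below is a sketch with hypothetical names — real libraries use their own option names:

```typescript
// Illustrative configuration shape for a circuit breaker.
// All names here are hypothetical, not from any specific library.
interface BreakerOptions {
  failureThreshold: number;             // failures within the window before opening
  windowMs: number;                     // rolling window for counting failures
  openTimeoutMs: number;                // how long to stay open before half-open
  successThreshold: number;             // consecutive half-open successes to close
  isFailure: (err: unknown) => boolean; // classify errors (e.g. ignore 4xx)
}

const defaults: BreakerOptions = {
  failureThreshold: 5,
  windowMs: 60_000,
  openTimeoutMs: 30_000,
  successThreshold: 3,
  isFailure: () => true, // refine: count timeouts and 5xx, ignore client errors
};
```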
Implementation#
```typescript
class CircuitBreaker {
  private state: "closed" | "open" | "half-open" = "closed";
  private failures = 0;
  private lastFailure = 0;
  private readonly threshold = 5;
  private readonly timeout = 30000; // 30 seconds

  async call<T>(fn: () => Promise<T>, fallback?: T): Promise<T> {
    if (this.state === "open") {
      if (Date.now() - this.lastFailure > this.timeout) {
        // Timeout expired: let a test request through.
        this.state = "half-open";
      } else {
        // Still open: fail fast without touching the dependency.
        if (fallback !== undefined) return fallback;
        throw new Error("Circuit breaker is open");
      }
    }

    try {
      const result = await fn();
      this.onSuccess();
      return result;
    } catch (err) {
      this.onFailure();
      if (fallback !== undefined) return fallback;
      throw err;
    }
  }

  private onSuccess() {
    this.failures = 0;
    this.state = "closed";
  }

  private onFailure() {
    this.failures++;
    this.lastFailure = Date.now();
    if (this.failures >= this.threshold) {
      this.state = "open";
    }
  }
}
```

Note that this sketch closes after a single half-open success; production libraries typically require several consecutive successes, as described in the configuration section above.
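To see the fail-fast behavior in isolation, here is a compact closure-based variant. It is illustrative only — it omits the timeout and half-open logic, so once tripped it stays open:

```typescript
// Minimal closure-based breaker demonstrating fail-fast (no half-open logic).
function makeBreaker(threshold: number) {
  let failures = 0;
  let calls = 0; // how many times the underlying fn actually ran
  return {
    async call<T>(fn: () => Promise<T>, fallback: T): Promise<T> {
      if (failures >= threshold) return fallback; // open: fail fast
      try {
        calls++;
        return await fn();
      } catch {
        failures++;
        return fallback;
      }
    },
    stats: () => ({ failures, calls }),
  };
}

async function demo() {
  const breaker = makeBreaker(3);
  const failing = async (): Promise<string> => {
    throw new Error("down"); // simulate a dead dependency
  };
  for (let i = 0; i < 10; i++) await breaker.call(failing, "fallback");
  return breaker.stats();
}
```

Across ten attempts, the failing dependency is only invoked three times; the remaining calls return the fallback immediately instead of piling up.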
Fallback strategies#
When the circuit is open, don't just throw errors:
Cached data: Return the last known good response.
Default value: Return a sensible default. Product recommendations → return popular items.
Graceful degradation: Disable the feature. Review service down → show "Reviews temporarily unavailable."
Alternative service: Try a backup provider. Primary payment processor down → try secondary.
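The cached-data strategy can be sketched as a small wrapper. The name `withCachedFallback` is illustrative, not from any library:

```typescript
// Serve the last known good response when the underlying call fails.
function withCachedFallback<T>(fn: () => Promise<T>) {
  let lastGood: T | undefined;
  return async (): Promise<T> => {
    try {
      lastGood = await fn(); // refresh the cache on every success
      return lastGood;
    } catch (err) {
      if (lastGood !== undefined) return lastGood; // serve stale data
      throw err; // nothing cached yet: propagate the failure
    }
  };
}
```

Stale data is usually better than an error page, but consider attaching a timestamp so callers can tell how old the cached value is.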
Real-world usage#
Netflix Hystrix (now in maintenance): The library that popularized circuit breakers in microservices. Every service-to-service call wrapped in a circuit breaker.
Resilience4j: Modern Java replacement for Hystrix. Lightweight, functional API.
Polly (.NET): Circuit breaker, retry, timeout, bulkhead — all in one library.
opossum (Node.js): Simple circuit breaker for Node applications.
Related patterns#
Retry with backoff: Retry failed requests with increasing delays. Use WITH a circuit breaker — retry when closed, fail fast when open.
Bulkhead: Isolate resources per dependency. If Service B fails, only the threads allocated to B are affected, not all threads.
Timeout: Always set timeouts on external calls. Without timeouts, a slow service ties up your threads indefinitely. Timed-out calls count as failures, which is what feeds the circuit breaker's failure counter.
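Retry with exponential backoff can be sketched like this (attempt counts and delays are illustrative starting points):

```typescript
// Retry a failing async call with exponentially increasing delays:
// baseDelayMs, 2*baseDelayMs, 4*baseDelayMs, ...
async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 100,
): Promise<T> {
  let lastErr: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      // Skip the sleep after the final attempt.
      if (i < attempts - 1) {
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** i));
      }
    }
  }
  throw lastErr;
}
```

Combined with a breaker: run retries only while the circuit is closed, so a hard outage does not multiply load by the retry count.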
When NOT to use#
- Local function calls — no network, no need
- Idempotent retries are sufficient — if retry alone fixes it
- The dependency is critical — if there's no fallback, failing fast doesn't help the user
Visualize your resilience architecture#
See how circuit breakers, retries, and bulkheads fit into your microservices — try Codelit to generate an interactive diagram.
Key takeaways#
- Closed → Open → Half-Open — the three states of a circuit breaker
- Fail fast when a dependency is down — don't wait and pile up
- Always have a fallback — cached data, defaults, or graceful degradation
- Combine with retries + timeouts — circuit breaker is one layer of resilience
- Monitor breaker state — an open circuit breaker is an alert-worthy event
- 5 failures / 30-second timeout is a reasonable starting configuration