Saga Pattern: Managing Distributed Transactions in Microservices
In a monolith, transactions are simple — wrap everything in a database transaction, and it either all commits or all rolls back. In microservices, each service has its own database. There's no shared transaction.
The saga pattern solves this: break the transaction into a sequence of local transactions, one per service, and pair each with a compensating action that undoes it if a later step fails.
The Problem#
E-commerce checkout in microservices:
1. Order Service: create order ✓
2. Payment Service: charge card ✓
3. Inventory Service: reserve items ✗ (out of stock!)
4. ??? — order exists, card charged, but items unavailable
Without a saga, you have inconsistent state. With a saga, step 4 triggers compensation:
4. Compensate Payment: refund card
5. Compensate Order: cancel order
Two Approaches#
Choreography (Event-Driven)#
Services react to each other's events. No central coordinator.
Order Service → publishes "OrderCreated"
↓
Payment Service → listens → charges card → publishes "PaymentCompleted"
↓
Inventory Service → listens → reserves items → publishes "ItemsReserved"
↓
Shipping Service → listens → creates shipment → publishes "ShipmentCreated"
On failure:
Inventory Service → publishes "ItemsReservationFailed"
↓
Payment Service → listens → refunds → publishes "PaymentRefunded"
↓
Order Service → listens → cancels order
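The event chain above can be sketched with a tiny in-memory event bus. This is only an illustration under stated assumptions: `EventBus` and `wireServices` are hypothetical names, the `OrderCancelled` event is added here so the cancellation is visible in the log, and a real system would publish through a message broker rather than call handlers synchronously.

```typescript
// Minimal in-memory event bus (hypothetical); stands in for a message broker.
type SagaEvent = { type: string; orderId: string };
type Handler = (event: SagaEvent) => void;

class EventBus {
  private handlers = new Map<string, Handler[]>();
  readonly log: string[] = []; // every published event type, in order

  subscribe(type: string, handler: Handler): void {
    const list = this.handlers.get(type) ?? [];
    list.push(handler);
    this.handlers.set(type, list);
  }

  publish(event: SagaEvent): void {
    this.log.push(event.type);
    for (const handler of this.handlers.get(event.type) ?? []) handler(event);
  }
}

// Each service only knows which events it listens to and which it emits.
function wireServices(bus: EventBus, inStock: boolean): void {
  // Payment Service: charge on OrderCreated, refund on ItemsReservationFailed.
  bus.subscribe("OrderCreated", (e) =>
    bus.publish({ type: "PaymentCompleted", orderId: e.orderId }));
  bus.subscribe("ItemsReservationFailed", (e) =>
    bus.publish({ type: "PaymentRefunded", orderId: e.orderId }));

  // Inventory Service: reserve items, or report the failure.
  bus.subscribe("PaymentCompleted", (e) =>
    bus.publish({
      type: inStock ? "ItemsReserved" : "ItemsReservationFailed",
      orderId: e.orderId,
    }));

  // Shipping Service: create a shipment once items are reserved.
  bus.subscribe("ItemsReserved", (e) =>
    bus.publish({ type: "ShipmentCreated", orderId: e.orderId }));

  // Order Service: cancel the order once the payment is refunded.
  // "OrderCancelled" is an assumed event name, not part of the flow above.
  bus.subscribe("PaymentRefunded", (e) =>
    bus.publish({ type: "OrderCancelled", orderId: e.orderId }));
}

const bus = new EventBus();
wireServices(bus, false); // simulate "out of stock"
bus.publish({ type: "OrderCreated", orderId: "o-1" });
console.log(bus.log.join(" -> "));
// OrderCreated -> PaymentCompleted -> ItemsReservationFailed -> PaymentRefunded -> OrderCancelled
```

Notice that no single place in the code describes the whole flow; it only emerges from the subscriptions, which is exactly what makes choreography hard to trace as it grows.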
Pros: Simple, decoupled, no single point of failure
Cons: Hard to track flow, implicit logic, debugging is painful
Best for: Simple flows (3-4 steps)
Orchestration (Coordinator)#
A saga orchestrator explicitly controls the flow:
Saga Orchestrator:
1. Tell Order Service: "Create order" → success
2. Tell Payment Service: "Charge card" → success
3. Tell Inventory Service: "Reserve items" → FAILED
4. Tell Payment Service: "Refund card" (compensate)
5. Tell Order Service: "Cancel order" (compensate)
6. Mark saga as failed
Pros: Explicit flow, easy to debug, centralized state
Cons: Orchestrator is a single point of failure, coupling to coordinator
Best for: Complex flows (5+ steps), flows with conditional logic
Comparison#
| Factor | Choreography | Orchestration |
|---|---|---|
| Coupling | Low | Medium (to orchestrator) |
| Visibility | Hard to trace | Easy (saga state machine) |
| Complexity | Grows with steps | Centralized |
| Failure handling | Distributed | Centralized |
| Testing | Hard (event chains) | Easier (test orchestrator) |
| Best for | Simple flows | Complex flows |
Compensating Actions#
Every step needs a compensating action for rollback:
| Step | Action | Compensation |
|---|---|---|
| Create order | INSERT INTO orders | Cancel order (UPDATE status = 'cancelled') |
| Charge card | Stripe charge | Stripe refund |
| Reserve inventory | Decrement stock | Increment stock |
| Create shipment | Schedule pickup | Cancel pickup |
| Send email | Send confirmation | Send cancellation email |
Rules:
- Compensations must be idempotent (safe to retry)
- Compensations must eventually succeed (retry them until they do, and escalate to manual handling when retries run out)
- Some actions can't be compensated (sent emails) — use semantic compensation (send correction)
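To make the idempotency rule concrete, here is a minimal sketch of a refund that is safe to call more than once. `PaymentLedger` is a hypothetical in-memory stand-in for a payment provider, not a real API; the point is only that the compensation records what it has already undone and turns a repeat call into a no-op.

```typescript
// Hypothetical in-memory ledger standing in for a payment provider.
class PaymentLedger {
  private charges = new Map<string, number>(); // chargeId -> captured amount
  private refunded = new Set<string>();        // chargeIds already refunded
  refundedTotal = 0;

  charge(chargeId: string, amount: number): void {
    this.charges.set(chargeId, amount);
  }

  // Idempotent compensation: a second refund for the same charge does nothing.
  refund(chargeId: string): "refunded" | "already_refunded" {
    if (this.refunded.has(chargeId)) return "already_refunded";
    const amount = this.charges.get(chargeId);
    if (amount === undefined) throw new Error(`unknown charge ${chargeId}`);
    this.refunded.add(chargeId);
    this.refundedTotal += amount;
    return "refunded";
  }
}
```

With this shape, an orchestrator that retries `refund("ch_1")` after a timeout cannot refund the customer twice.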
Saga State Machine#
Track saga progress:
States: STARTED → PAYMENT_PENDING → PAYMENT_COMPLETED
→ INVENTORY_PENDING → INVENTORY_RESERVED
→ SHIPPING_PENDING → COMPLETED
Failure states: PAYMENT_FAILED → COMPENSATING → COMPENSATED
INVENTORY_FAILED → COMPENSATING_PAYMENT → COMPENSATED
interface SagaState {
  id: string;
  status: "started" | "completed" | "compensating" | "compensated" | "failed";
  steps: {
    name: string;
    status: "pending" | "completed" | "failed" | "compensated";
    result?: unknown;
  }[];
}
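One way to enforce the state machine is to reject transitions that should never happen. The sketch below uses the saga-level status values from `SagaState`; the `allowed` table is an assumption about which moves are legal (in particular, it treats `failed` as "compensation could not complete"), so adjust it to your own flow.

```typescript
type SagaStatus = "started" | "completed" | "compensating" | "compensated" | "failed";

// Assumed transition table: terminal states have no outgoing transitions.
const allowed: Record<SagaStatus, SagaStatus[]> = {
  started: ["completed", "compensating"],
  compensating: ["compensated", "failed"],
  completed: [],
  compensated: [],
  failed: [],
};

// Returns the next status, or throws if the move is not in the table.
function transition(current: SagaStatus, next: SagaStatus): SagaStatus {
  if (allowed[current].indexOf(next) === -1) {
    throw new Error(`illegal saga transition: ${current} -> ${next}`);
  }
  return next;
}
```

Routing every status change through a guard like this catches bugs such as compensating a saga that has already completed.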
Implementation Example#
Orchestrator in TypeScript#
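// orderService, paymentService, inventoryService and the OrderInput type are
// assumed to be defined elsewhere. Each execute() must resolve to an object
// carrying the ID its compensation needs (orderId, chargeId, reservationId).
type SagaContext = Record<string, any>;

interface SagaStep {
  name: string;
  execute: (ctx: SagaContext) => Promise<object>;
  compensate: (ctx: SagaContext) => Promise<unknown>;
}

type SagaResult =
  | { status: "completed"; ctx: SagaContext }
  | { status: "failed"; error: unknown; compensated: boolean };

class OrderSaga {
  private steps: SagaStep[] = [
    {
      name: "createOrder",
      execute: (ctx) => orderService.create(ctx.order),
      compensate: (ctx) => orderService.cancel(ctx.orderId),
    },
    {
      name: "chargePayment",
      execute: (ctx) => paymentService.charge(ctx.amount, ctx.paymentMethod),
      compensate: (ctx) => paymentService.refund(ctx.chargeId),
    },
    {
      name: "reserveInventory",
      execute: (ctx) => inventoryService.reserve(ctx.items),
      compensate: (ctx) => inventoryService.release(ctx.reservationId),
    },
  ];

  async execute(input: OrderInput): Promise<SagaResult> {
    const ctx: SagaContext = { ...input };
    const completedSteps: SagaStep[] = [];

    for (const step of this.steps) {
      try {
        const result = await step.execute(ctx);
        Object.assign(ctx, result); // expose returned IDs to later steps
        completedSteps.push(step);
      } catch (error) {
        // Compensate completed steps in reverse order, on a copy so the
        // original list is not mutated. A compensation that throws here
        // propagates to the caller, so it needs its own retry handling.
        for (const completed of [...completedSteps].reverse()) {
          await completed.compensate(ctx);
        }
        return { status: "failed", error, compensated: true };
      }
    }
    return { status: "completed", ctx };
  }
}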
Failure Scenarios#
What if compensation fails?#
Retry with exponential backoff. If it still fails, send to a dead letter queue for manual resolution.
Step fails → compensate → compensation fails → retry (3x)
→ still fails → dead letter queue → alert ops team
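A minimal sketch of that retry path, assuming three attempts in total and an in-memory array standing in for the dead letter queue. `compensateWithRetry` and `DeadLetter` are hypothetical names; a real system would publish the dead letter to a queue and page the ops team.

```typescript
type DeadLetter = { step: string; error: string };

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs a compensation with exponential backoff. Returns true on success,
// or false after recording a dead letter once all attempts are used up.
async function compensateWithRetry(
  step: string,
  compensate: () => Promise<void>,
  deadLetters: DeadLetter[], // stand-in for a real dead letter queue
  maxAttempts = 3,
  baseDelayMs = 100,
): Promise<boolean> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await compensate();
      return true;
    } catch (error) {
      if (attempt === maxAttempts) {
        deadLetters.push({ step, error: String(error) });
        return false;
      }
      // Exponential backoff: baseDelayMs, then 2x, then 4x, ...
      await sleep(baseDelayMs * 2 ** (attempt - 1));
    }
  }
  return false; // only reached when maxAttempts < 1
}
```

In production you would usually add jitter to the delay so that many failing sagas do not retry in lockstep.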
What if the orchestrator crashes?#
Store saga state in a durable store (PostgreSQL, or Redis with persistence enabled) and write it after every step. On restart, load the unfinished sagas and resume each from its last recorded step.
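Resuming can be sketched as "skip whatever is already marked completed". In the example below, `SagaStore` is a hypothetical store backed by a `Map` in place of a database table, and `runOrResume` is an assumed helper name; compensation is left out to keep the focus on recovery.

```typescript
type StepStatus = "pending" | "completed";

interface PersistedSaga {
  id: string;
  steps: { name: string; status: StepStatus }[];
}

// Hypothetical durable store; the Map stands in for a database table.
class SagaStore {
  private rows = new Map<string, string>();

  save(saga: PersistedSaga): void {
    this.rows.set(saga.id, JSON.stringify(saga));
  }

  load(id: string): PersistedSaga | undefined {
    const row = this.rows.get(id);
    return row === undefined ? undefined : (JSON.parse(row) as PersistedSaga);
  }
}

// Runs every step that is not yet completed, persisting progress as it goes.
async function runOrResume(
  saga: PersistedSaga,
  store: SagaStore,
  handlers: Record<string, () => Promise<void>>,
): Promise<void> {
  for (const step of saga.steps) {
    if (step.status === "completed") continue; // finished before the crash
    const handler = handlers[step.name];
    if (!handler) throw new Error(`no handler for step ${step.name}`);
    await handler();
    step.status = "completed";
    store.save(saga); // persist after each step so a restart knows where to resume
  }
}
```

A crash between a handler finishing and the following `save` means that step runs again on resume, which is one more reason every step has to be idempotent.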
Network partition during saga?#
Timeout + retry. Design all steps to be idempotent so retries are safe.
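Here is a rough sketch of timeout plus retry, assuming the remote service deduplicates requests by an idempotency key. `withTimeout` and `callWithRetry` are hypothetical helpers; the key detail is that every retry reuses the same key, so a request that timed out but actually went through is not applied twice.

```typescript
// Rejects if the work does not settle within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}

// Retries a call, passing the SAME idempotency key on every attempt.
async function callWithRetry<T>(
  call: (idempotencyKey: string) => Promise<T>,
  idempotencyKey: string,
  timeoutMs: number,
  maxAttempts: number,
): Promise<T> {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await withTimeout(call(idempotencyKey), timeoutMs);
    } catch (error) {
      lastError = error; // timeout or failure: try again with the same key
    }
  }
  throw lastError;
}
```

A timeout only tells the caller that no answer arrived in time, not that the step failed, so the retry is safe only because the receiving side recognizes the key.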
When NOT to Use Sagas#
- Single database — use a regular transaction
- All-or-nothing required — sagas are eventually consistent, not ACID
- Simple workflows — if you have 2 steps, just handle errors directly
- Performance critical — sagas add latency (multiple service calls)
Architecture Example#
E-Commerce Order Saga#
Client → API Gateway → Order Saga Orchestrator
→ Order Service (create) ← compensate: cancel
→ Payment Service (charge) ← compensate: refund
→ Inventory Service (reserve) ← compensate: release
→ Shipping Service (schedule) ← compensate: cancel
→ Notification Service (email) ← compensate: correction email
Saga state stored in PostgreSQL
Failed sagas → Dead Letter Queue → Manual review
Summary#
- Choreography for simple flows (3-4 steps), decoupled services
- Orchestration for complex flows (5+ steps), better visibility
- Every step needs a compensating action — design them upfront
- Make everything idempotent — retries are inevitable
- Store saga state durably — survive orchestrator crashes
- Dead letter queue for unrecoverable failures
- Don't use sagas when a database transaction works — simpler is better