deployment, DevOps, CI/CD, infrastructure, architecture

Deployment Strategies Compared: Rolling, Blue-Green, Canary, and Beyond

March 29, 2026 · 6 min read · By Codelit Team

Deployment Strategies Compared

Every deployment is a risk. The strategy you choose determines how much risk you absorb and how fast you can recover. There is no single best strategy — only the right one for your situation.

Recreate (Big Bang)

Stop the old version entirely, then start the new version.

v1 [████████████] STOP
                       START [████████████] v2

Downtime window: seconds to minutes

Pros: simple, no version compatibility issues, clean cutover. Cons: downtime is guaranteed, no rollback without redeployment.

Use when: development/staging environments, scheduled maintenance windows, applications that tolerate downtime.

Rolling Update

Replace instances of the old version with the new version one at a time (or in batches).

v1 [████] [████] [████] [████]
v1 [████] [████] [████] [xxxx] v2 [████]
v1 [████] [████] [xxxx] v2 [████] [████]
v1 [████] [xxxx] v2 [████] [████] [████]
v1 [xxxx] v2 [████] [████] [████] [████]

Pros: zero downtime, simple to implement, built into Kubernetes by default. Cons: both versions run simultaneously (requires backward compatibility), slow rollback (must re-roll), limited traffic control.

Configuration (Kubernetes):

strategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 1
    maxSurge: 1

Use when: stateless services, backward-compatible changes, standard deployments without special risk.
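
To make the effect of maxUnavailable and maxSurge concrete, here is a toy simulation. It is a sketch, not the algorithm the Kubernetes controller actually runs: the function name `rolling_update` is hypothetical, and it assumes every new instance becomes ready instantly.

```python
def rolling_update(replicas: int, max_unavailable: int = 1, max_surge: int = 1):
    """Return (old_ready, new_ready) after each step of a simplified rollout."""
    if max_unavailable == 0 and max_surge == 0:
        raise ValueError("maxUnavailable and maxSurge cannot both be 0")
    old, new = replicas, 0
    history = [(old, new)]
    while old > 0 or new < replicas:
        # Surge: start new instances without exceeding replicas + max_surge.
        start = max(0, min(replicas - new, replicas + max_surge - (old + new)))
        new += start
        # Stop old instances while keeping at least
        # replicas - max_unavailable instances ready.
        stop = max(0, min(old, (old + new) - (replicas - max_unavailable)))
        old -= stop
        history.append((old, new))
    return history


if __name__ == "__main__":
    for old, new in rolling_update(4):
        print(f"v1: {old}  v2: {new}")
```

With four replicas and both settings at 1, `rolling_update(4)` returns `[(4, 0), (2, 1), (0, 3), (0, 4)]`: the total never drops below 3 or rises above 5. In this model, setting `max_unavailable=0` gives the strict one-at-a-time replacement shown in the diagram above.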

Blue-Green Deployment

Run two identical environments. Route all traffic to one (blue), deploy to the other (green), then switch.

Load Balancer --> [Blue: v1] (serving traffic)
                  [Green: idle]

Deploy v2 to Green:
Load Balancer --> [Blue: v1] (serving traffic)
                  [Green: v2] (ready, tested)

Switch:
Load Balancer --> [Green: v2] (serving traffic)
                  [Blue: v1] (standby for rollback)

Pros: instant rollback (switch back to blue), zero downtime, full testing of new version before switch. Cons: double infrastructure cost, database migrations must be backward-compatible, stateful applications need special handling.

Use when: critical production systems, need instant rollback, can afford double infrastructure.
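
The switch and the rollback are the same operation: flipping a pointer between two environments. The sketch below models that idea in memory. `BlueGreenRouter` and its methods are hypothetical names for illustration; a real setup would change a load balancer target or DNS record instead.

```python
from dataclasses import dataclass, field


@dataclass
class BlueGreenRouter:
    """Toy router: all traffic goes to exactly one of two environments."""
    versions: dict = field(default_factory=lambda: {"blue": "v1", "green": None})
    live: str = "blue"

    @property
    def idle(self) -> str:
        return "green" if self.live == "blue" else "blue"

    def deploy(self, version: str) -> None:
        # Deploy only to the idle environment; live traffic is untouched.
        self.versions[self.idle] = version

    def switch(self, healthy: bool = True) -> None:
        # Cut over only if the idle side has a version and passed its checks.
        if self.versions[self.idle] is None or not healthy:
            raise RuntimeError("idle environment is not ready")
        self.live = self.idle

    def rollback(self) -> None:
        # Rollback is the same pointer flip, back to the previous environment.
        self.switch()

    def serving(self) -> str:
        return self.versions[self.live]


if __name__ == "__main__":
    router = BlueGreenRouter()
    router.deploy("v2")
    router.switch()
    print(router.live, router.serving())
```

Because the old environment keeps running after the switch, `rollback()` needs no redeployment, which is where the "instant rollback" property comes from.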

Canary Deployment

Route a small percentage of traffic to the new version. Gradually increase if metrics look healthy.

Step 1:  v1 [95% traffic]  |  v2 [5% traffic]
Step 2:  v1 [80% traffic]  |  v2 [20% traffic]
Step 3:  v1 [50% traffic]  |  v2 [50% traffic]
Step 4:  v1 [0% traffic]   |  v2 [100% traffic]

Key metrics to watch during canary:

Error rate (4xx, 5xx)
Latency (p50, p95, p99)
Business metrics (conversion rate, cart abandonment)
Resource usage (CPU, memory)

Automated canary analysis:

if error_rate_v2 > error_rate_v1 * 1.1:
    rollback()      # error rate more than 10% worse than baseline
elif p99_latency_v2 > p99_latency_v1 * 1.2:
    rollback()      # p99 latency more than 20% worse than baseline
else:
    promote_to_next_percentage()

Pros: low risk (only a fraction of users affected), data-driven promotion, catches issues early. Cons: requires traffic splitting infrastructure, monitoring must be excellent, slower deployment.

Tools: Argo Rollouts, Flagger, Istio, AWS App Mesh.

Use when: high-traffic services, risky changes, need confidence before full rollout.
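
The traffic steps and the two thresholds from the analysis rule above can be combined into one promotion loop. This is a minimal sketch under stated assumptions: `Metrics`, `canary_decision`, `run_canary`, and the `observe` callback are hypothetical names, and real tools such as Argo Rollouts or Flagger express the same idea through their own configuration.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    error_rate: float       # fraction of failed requests, e.g. 0.01
    p99_latency_ms: float


STEPS = [5, 20, 50, 100]    # percent of traffic on v2, as in the steps above


def canary_decision(baseline: Metrics, canary: Metrics,
                    max_error_ratio: float = 1.1,
                    max_latency_ratio: float = 1.2) -> str:
    """Return 'rollback' or 'promote' using the two thresholds above."""
    if canary.error_rate > baseline.error_rate * max_error_ratio:
        return "rollback"
    if canary.p99_latency_ms > baseline.p99_latency_ms * max_latency_ratio:
        return "rollback"
    return "promote"


def run_canary(observe) -> int:
    """Walk through STEPS; return the final v2 traffic share (0 on rollback).

    `observe(percent)` is a hypothetical callback returning
    (baseline, canary) metrics measured while v2 gets `percent` of traffic.
    """
    for percent in STEPS:
        baseline, canary = observe(percent)
        if canary_decision(baseline, canary) == "rollback":
            return 0
    return 100


if __name__ == "__main__":
    healthy = lambda pct: (Metrics(0.010, 200.0), Metrics(0.010, 210.0))
    print(run_canary(healthy))
```

In practice each step would also wait long enough to collect a meaningful sample before deciding; the sketch leaves that timing to `observe`.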

A/B Testing Deployment

Route traffic based on user attributes (not random percentage). Measure business outcomes.

User segment A (returning users)  --> v1 (current checkout)
User segment B (new users)        --> v2 (redesigned checkout)

Measure: conversion rate, average order value, bounce rate

Difference from canary: canary validates technical health; A/B testing validates business hypotheses.

Pros: business-outcome driven, targeted testing, supports product experimentation. Cons: requires user segmentation infrastructure, longer test duration, statistical rigor needed.

Use when: product experiments, UX changes, pricing tests, feature validation.
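
As an illustration of attribute-based routing and of the "statistical rigor" point above, the sketch below assigns a variant from a user attribute and compares conversion rates with a two-proportion z-test. The function names and the `is_new` field are assumptions made for this example; a real experiment would usually rely on an experimentation platform and a sample size planned in advance.

```python
import math


def assign_variant(user: dict) -> str:
    """Route by user attribute, mirroring the example: new users see v2."""
    return "v2" if user.get("is_new") else "v1"


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0     # no variation at all, nothing to distinguish
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    z, p = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
    print(f"z={z:.2f}, p={p:.4f}")
```

A small p-value suggests the difference in conversion rate is unlikely to be noise, but it says nothing about whether the two segments were comparable to begin with.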

Shadow (Dark Launch)

Send a copy of production traffic to the new version without serving its responses to users.

User request --> Load Balancer --> v1 (response served to user)
                               --> v2 (response discarded, only logged)

Compare: latency, errors, response differences

Pros: zero user impact, real production traffic testing, catches performance issues. Cons: doubles infrastructure load, write operations need careful handling (do not duplicate side effects), complex setup.

Use when: major rewrites, database migration validation, performance benchmarking with real traffic.
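
A minimal sketch of the mirroring logic follows. It is synchronous for readability, whereas real mirroring usually happens asynchronously at the proxy or service-mesh layer. `handle_with_shadow`, `record_diff`, and the request shape are hypothetical. Restricting the mirror to read-only methods is one assumed way to avoid duplicating side effects.

```python
SAFE_METHODS = {"GET", "HEAD"}  # assumption: only read-only requests are mirrored


def handle_with_shadow(request: dict, primary, shadow, record_diff):
    """Serve the primary (v1) response; call the shadow (v2) only to compare."""
    response = primary(request)
    if request["method"] in SAFE_METHODS:
        try:
            shadow_response = shadow(request)
            if shadow_response != response:
                record_diff(request, response, shadow_response)
        except Exception as exc:
            # A shadow failure is logged and never reaches the user.
            record_diff(request, response, exc)
    return response


if __name__ == "__main__":
    diffs = []
    result = handle_with_shadow(
        {"method": "GET", "path": "/cart"},
        primary=lambda req: {"items": 2},
        shadow=lambda req: {"items": 3},
        record_diff=lambda req, old, new: diffs.append((req["path"], old, new)),
    )
    print(result, diffs)
```

The user always receives the v1 response; v2's output only feeds the comparison log.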

Feature Flags

Deploy code to production but control visibility through runtime flags. Not a deployment strategy per se, but a powerful complement to any strategy.

deploy(v2_with_flag)

flag("new-checkout"):
  enabled: false       // deployed but hidden
  rollout: 0%

// Gradually enable:
  rollout: 5%          // canary via flag
  rollout: 50%         // wider rollout
  rollout: 100%        // full rollout

// Instant rollback:
  enabled: false       // no redeployment needed

Pros: instant rollback without redeployment, granular control, supports canary and A/B testing at the application layer. Cons: code complexity (if-else branches), flag debt accumulates, testing matrix grows.

Use when: combined with any deployment strategy for additional safety.
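
One common way to implement a percentage rollout is to hash the flag name and user ID into a stable bucket, so a given user keeps the same experience as the percentage grows. The sketch below assumes a flag is a plain dict with `name`, `enabled`, and `rollout` keys; these names are illustrative and not taken from any particular flag service.

```python
import hashlib


def bucket(flag_name: str, user_id: str) -> int:
    """Map (flag, user) to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag: dict, user_id: str) -> bool:
    """flag = {"name": str, "enabled": bool, "rollout": percent 0..100}."""
    if not flag["enabled"]:
        return False            # kill switch: rollback without redeploying
    return bucket(flag["name"], user_id) < flag["rollout"]


if __name__ == "__main__":
    flag = {"name": "new-checkout", "enabled": True, "rollout": 5}
    users = [f"user-{i}" for i in range(1000)]
    print(sum(is_enabled(flag, u) for u in users), "of 1000 users see the flag")
```

Because the bucket depends only on the flag name and user ID, raising `rollout` from 5 to 50 keeps every user who already had the feature and adds new ones.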

Comparison Table

Strategy      | Downtime | Rollback Speed | Risk    | Infra Cost | Complexity
--------------|----------|----------------|---------|------------|----------
Recreate      | Yes      | Slow (redeploy)| High    | 1x         | Low
Rolling       | No       | Slow (re-roll) | Medium  | 1x-1.25x   | Low
Blue-Green    | No       | Instant        | Low     | 2x         | Medium
Canary        | No       | Fast           | Low     | 1x-1.1x    | High
A/B Testing   | No       | Fast           | Low     | 1x-1.1x    | High
Shadow        | No       | N/A            | None    | 2x         | High
Feature Flags | No       | Instant        | Low     | 1x         | Medium

Choosing the Right Strategy

Decision flow:

Is downtime acceptable?
  Yes --> Recreate (simplest)
  No  --> Continue

Need instant rollback?
  Yes --> Blue-Green or Feature Flags
  No  --> Continue

Is this a risky change?
  Yes --> Canary (technical validation)
  No  --> Rolling Update (default)

Is this a product experiment?
  Yes --> A/B Testing

Is this a major rewrite?
  Yes --> Shadow launch first, then Canary
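
The decision flow can be written as a small function. This is one possible reading, not the only one: the sketch assumes the two special-case questions (product experiment, major rewrite) take priority over the main chain, since the flow itself does not say how they interact with the first three questions. The function and parameter names are hypothetical.

```python
def choose_strategy(downtime_ok: bool,
                    need_instant_rollback: bool,
                    risky_change: bool,
                    product_experiment: bool = False,
                    major_rewrite: bool = False) -> str:
    """Return a suggested strategy following the decision flow above."""
    # Assumption: special cases are checked before the main chain.
    if product_experiment:
        return "A/B Testing"
    if major_rewrite:
        return "Shadow launch first, then Canary"
    # Main chain, in the order the questions are asked.
    if downtime_ok:
        return "Recreate"
    if need_instant_rollback:
        return "Blue-Green or Feature Flags"
    if risky_change:
        return "Canary"
    return "Rolling Update"


if __name__ == "__main__":
    print(choose_strategy(downtime_ok=False,
                          need_instant_rollback=False,
                          risky_change=True))
```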

Common combinations:

Rolling + Feature Flags — deploy fast, control exposure in code
Canary + Automated Analysis — data-driven promotion with auto-rollback
Blue-Green + Feature Flags — instant infra rollback plus granular feature control
Shadow + Canary — validate with real traffic, then gradually expose

Key Takeaways

Rolling update is the sensible default for most services
Blue-green when instant rollback is non-negotiable
Canary when you need data-driven confidence before full rollout
Feature flags complement every strategy — use them
Shadow launches are underused and invaluable for major changes
No strategy eliminates risk — monitoring, alerting, and runbooks are still essential

The best deployment pipeline combines multiple strategies. Use rolling updates for routine changes, canary for risky ones, and feature flags everywhere.

This is article #262 in the Codelit engineering series. We publish in-depth technical guides on architecture, infrastructure, and modern engineering practices. Explore more at codelit.dev.


