# Strangler Fig Pattern — Incremental Migration Without the Big Bang Rewrite
## What is the strangler fig pattern?
The strangler fig pattern is a migration strategy where you incrementally replace a legacy system by building new functionality alongside it, gradually routing traffic from old to new until the legacy system can be decommissioned.
The name comes from the strangler fig tree, which grows around a host tree, eventually replacing it entirely while the host decomposes.
## Why not rewrite from scratch?
Big bang rewrites fail for predictable reasons:
- Multi-year timelines — the business cannot wait that long for new features
- Moving target — the legacy system keeps changing while the rewrite is in progress
- All-or-nothing risk — you ship everything at once and discover critical bugs in production
- Lost institutional knowledge — undocumented behavior in the legacy system gets dropped
The strangler fig pattern avoids all of these by delivering incremental value. Each migrated feature goes to production independently, reducing risk and providing early feedback.
## How it works: the proxy layer
The core mechanism is a routing proxy (or facade) that sits between clients and both systems. The proxy decides which system handles each request based on predefined rules.
### Step-by-step process
1. Deploy a proxy in front of the legacy system — initially, 100% of traffic goes to the legacy system
2. Build the first feature in the new system
3. Update the proxy to route requests for that feature to the new system
4. Verify correctness by monitoring both systems
5. Repeat for each subsequent feature
6. Decommission the legacy system when no traffic remains
The proxy can be an API gateway (Kong, NGINX, Envoy), a load balancer with path-based routing, or even application-level middleware.
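When the proxy is application-level middleware, the routing decision can be a few lines of code. A minimal sketch in Python, with the route table and backend names as hypothetical placeholders:

```python
# Minimal strangler facade: route each request path to the new or the
# legacy backend based on a prefix table. Names here are illustrative.
MIGRATED_PREFIXES = {"/api/orders", "/api/users"}

def choose_backend(path: str) -> str:
    """Return which system should handle this request path."""
    for prefix in MIGRATED_PREFIXES:
        if path.startswith(prefix):
            return "new-system"
    return "legacy-system"  # default: everything not yet migrated
```

As features migrate, adding a prefix to the table is the only change needed.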
## Proxy-based routing strategies
### Path-based routing
Route entire URL paths to the new system:
```nginx
location /api/orders {
    proxy_pass http://new-system;
}

location / {
    proxy_pass http://legacy-system;
}
```
This is the simplest approach and works well when features map cleanly to URL paths.
### Header-based routing
Route based on request headers, allowing gradual rollout:
```yaml
# Envoy route configuration
routes:
  - match:
      prefix: "/api/orders"
      headers:
        - name: "x-use-new-system"
          exact_match: "true"
    route:
      cluster: new_system
  - match:
      prefix: "/api/orders"
    route:
      cluster: legacy_system
```
### Percentage-based routing
Send a percentage of traffic to the new system, increasing over time as confidence grows. This is essentially a canary deployment applied to migration.
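A percentage split is usually implemented by hashing a stable request attribute (such as a user ID) into buckets, so each user consistently lands on the same system while the rollout percentage grows. A minimal sketch; the 10% threshold is an example value:

```python
import hashlib

NEW_SYSTEM_PERCENT = 10  # raise gradually as confidence grows

def route_to_new_system(user_id: str) -> bool:
    """Deterministically place a user into a 0-99 bucket; route the
    lowest NEW_SYSTEM_PERCENT buckets to the new system."""
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < NEW_SYSTEM_PERCENT
```

Hashing rather than random sampling matters: a user who saw the new system on one request keeps seeing it, which avoids inconsistent behavior mid-session.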
## Feature-by-feature migration
Not every feature is equally suited for early migration. Prioritize based on:
- Business value — migrate features that benefit most from the new architecture
- Independence — start with features that have minimal dependencies on other legacy components
- Risk tolerance — migrate low-risk features first to build confidence
- Data isolation — features with self-contained data are easier to migrate
A typical migration order for a monolith-to-microservices transition:
1. Authentication and user profiles (well-defined boundary)
2. Notifications and emails (low coupling, easy to verify)
3. Search and catalog (read-heavy, benefits from new tech)
4. Orders and payments (high value, migrate after gaining confidence)
5. Reporting and analytics (often the last to move)
## Parallel running
Parallel running sends the same request to both systems and compares the results. This is the highest-confidence verification strategy.
### Shadow traffic
The proxy sends a copy of each request to the new system but only returns the legacy response to the client. The new system's response is logged for comparison.
```python
import asyncio

# legacy_system, new_system, and compare_responses are application-
# specific objects; this sketch shows only the shadowing mechanism.
async def handle_request(request):
    # Serve the client from the legacy system, as before.
    legacy_response = await legacy_system.process(request)
    # Fire-and-forget: shadow the request to the new system and log
    # any differences without delaying the client response.
    asyncio.create_task(
        compare_responses(request, legacy_response, new_system)
    )
    return legacy_response
```
### Diff comparison
A comparison service analyzes responses from both systems:
- Exact match — responses are identical (ideal)
- Semantic match — responses differ in format but contain equivalent data
- Mismatch — a genuine discrepancy that needs investigation
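A minimal comparison function along these lines normalizes both responses before diffing, so formatting differences (key order, whitespace) are classified as semantic matches rather than mismatches. The JSON normalization here is an assumption about the response format:

```python
import json

def compare(legacy_body: str, new_body: str) -> str:
    """Classify a response pair as exact, semantic, or mismatch."""
    if legacy_body == new_body:
        return "exact"
    try:
        # Parse both bodies; equal parsed values differ only in format.
        if json.loads(legacy_body) == json.loads(new_body):
            return "semantic"
    except ValueError:
        pass  # not valid JSON on one or both sides
    return "mismatch"
```

In practice you would also ignore fields that legitimately differ between systems, such as timestamps or generated IDs.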
GitHub used this approach when migrating from a MySQL-backed permissions system to a new service, running both in parallel for months before cutting over.
## Data synchronization during migration
The hardest part of any migration is the data layer. Both systems need access to consistent data during the transition period.
### Shared database
Both systems read from and write to the same database. Simple but creates tight coupling and makes it hard to evolve the new system's schema.
### Change Data Capture (CDC)
Use CDC to stream changes from the legacy database to the new system's data store:
1. Legacy system writes to its database
2. Debezium (or similar) captures changes from the transaction log
3. Changes are published to Kafka
4. The new system consumes events and updates its own store
This allows each system to own its data model while staying synchronized.
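On the consuming side, the new system applies each change event to its own store. A sketch of the apply step, assuming simplified Debezium-style events with an `op` field (`c` create, `u` update, `d` delete) and leaving out delivery and ordering concerns:

```python
def apply_change_event(store: dict, event: dict) -> None:
    """Apply one change event to a key-value store standing in
    for the new system's database."""
    key = event["key"]
    if event["op"] in ("c", "u"):      # create or update
        store[key] = event["after"]    # the new row image
    elif event["op"] == "d":           # delete
        store.pop(key, None)

# Usage: replay a stream of events into the new system's store.
store = {}
apply_change_event(store, {"op": "c", "key": 1, "after": {"status": "new"}})
apply_change_event(store, {"op": "u", "key": 1, "after": {"status": "paid"}})
```

Because the consumer only sees row images, the new system is free to reshape the data into its own model inside this apply step.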
### Dual writes
The application writes to both databases on each operation. This is fragile — if one write fails, the systems diverge. Use this only as a short-term bridge with compensating transactions or an outbox pattern.
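The outbox variant makes the bridge safer: the business write and an outbox record commit in the same local transaction, and a separate relay later forwards outbox rows to the second system. A sqlite sketch of the transactional half; the table names are illustrative:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute(
    "CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT)"
)

def create_order(order_id: int, status: str) -> None:
    """Write the order and its outbox event atomically:
    both rows commit, or neither does."""
    with conn:  # one transaction
        conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, status))
        conn.execute(
            "INSERT INTO outbox (payload) VALUES (?)",
            (json.dumps({"order_id": order_id, "status": status}),),
        )

create_order(1, "created")
```

The relay process then reads unsent outbox rows, delivers them to the other system, and marks them done, so a failed delivery is retried rather than silently lost.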
### Data migration phases
1. Bulk migration — copy historical data from legacy to new store
2. Continuous sync — CDC keeps both stores in sync during the transition
3. Cutover — stop writing to the legacy store and make the new store authoritative
4. Cleanup — archive or delete the legacy data store
## Monitoring both systems
During migration, you need visibility into both systems simultaneously.
### Key metrics to track
- Response time — compare p50, p95, and p99 latencies between legacy and new
- Error rates — any increase in errors after routing a feature to the new system
- Data consistency — periodic reconciliation jobs that compare data across both stores
- Feature coverage — what percentage of traffic still hits the legacy system
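A reconciliation job for the data-consistency check above can be as simple as diffing keys and values between the two stores. A sketch over in-memory dicts standing in for the two databases:

```python
def reconcile(legacy: dict, new: dict) -> dict:
    """Report keys missing from either store and rows that differ."""
    legacy_keys, new_keys = set(legacy), set(new)
    return {
        "missing_in_new": sorted(legacy_keys - new_keys),
        "missing_in_legacy": sorted(new_keys - legacy_keys),
        "mismatched": sorted(
            k for k in legacy_keys & new_keys if legacy[k] != new[k]
        ),
    }
```

Against real databases you would compare row hashes in batches rather than full rows, but the three output categories are the same.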
### Dashboards
Maintain a migration dashboard showing:
- Traffic split per feature (legacy vs new)
- Error rate comparison
- Data sync lag
- Overall migration progress (percentage of features migrated)
## Rollback strategy
Every migrated feature must be individually reversible. The proxy makes this straightforward:
- Instant rollback — update the proxy to route the feature back to the legacy system
- Data rollback — if the new system modified data, replay changes back to the legacy store using CDC in reverse
- Partial rollback — roll back a single feature without affecting other migrated features
Rollback should be a single configuration change, not a code deployment. Pre-test rollback procedures for every feature before migrating it.
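If routing is driven by a per-feature flag table, rollback really is one configuration change. A sketch; the feature names are placeholders:

```python
# Per-feature routing flags; flipping one value is the whole rollback.
ROUTES = {"orders": "new", "payments": "legacy", "search": "new"}

def rollback(feature: str) -> None:
    """Route a single feature back to the legacy system."""
    ROUTES[feature] = "legacy"

rollback("orders")  # instant, per-feature, no code deployment
```

Other features keep their routing, which is what makes the rollback partial rather than all-or-nothing.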
## Real-world examples
### Amazon
Amazon migrated from a monolithic bookstore application to a service-oriented architecture over several years. Each team extracted their domain into an independent service while the monolith continued serving traffic. A routing layer directed requests to the appropriate service.
### Shopify
Shopify incrementally extracted services from their Ruby on Rails monolith. They used a pattern called "components" — modular boundaries within the monolith that were later extracted into services. The storefront rendering layer was one of the last pieces to migrate.
### Spotify
Spotify migrated from a monolithic Python backend to a microservices architecture. They used an API gateway as the strangler proxy, routing individual endpoints to new Java/Kotlin services while the Python monolith continued handling the rest.
### GOV.UK
The UK Government Digital Service replaced a legacy government portal by building new pages on a modern stack (Ruby on Rails) and routing URLs one by one from the old system to the new one. Over two years, every page was migrated without a single big-bang cutover.
## Anti-patterns to avoid
- Migrating the database first — migrate features first, data follows
- Building the new system in isolation — without production traffic, you are guessing at requirements
- Keeping the legacy system on life support — set a decommission deadline and stick to it
- Skipping parallel running — the cost of comparison testing is far less than the cost of production bugs
- Ignoring the proxy layer — without a routing proxy, you end up with spaghetti integration
## Visualize the strangler fig pattern in your architecture
On Codelit, generate a migration architecture with proxy routing, legacy and new systems, and data synchronization flows. Click on the proxy layer to explore routing rules and traffic split configurations.
This is article #233 in the Codelit engineering blog series.
Build and explore migration architectures visually at codelit.io.