Database Migration Strategies: Zero-Downtime Schema Changes and Data Migrations
Changing a database schema in production is one of the riskiest operations in software engineering. A bad migration can lock tables, corrupt data, or bring down an application. Database migration strategies exist to make these changes safe, repeatable, and reversible — even under live traffic.
Schema Migration Fundamentals
A schema migration is a versioned, ordered change to the database structure. Each migration has an up (apply) and optionally a down (rollback) operation. The migration tool tracks which migrations have been applied so they run exactly once.
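The tracking behavior described above can be sketched in a few lines. This is a hypothetical miniature runner (SQLite in-memory database, made-up table and migration names), not how Flyway or Liquibase are actually implemented:

```python
import sqlite3

# Minimal sketch of a migration runner with a history table, in the spirit
# of Flyway's flyway_schema_history (names and logic are illustrative).
MIGRATIONS = {
    1: "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)",
    2: "ALTER TABLE users ADD COLUMN email_verified BOOLEAN DEFAULT FALSE",
}

def migrate(conn):
    """Apply pending migrations in version order; return what was applied."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_history (version INTEGER PRIMARY KEY)")
    applied = {v for (v,) in conn.execute("SELECT version FROM schema_history")}
    newly_applied = []
    for version in sorted(MIGRATIONS):
        if version in applied:
            continue  # the history table guarantees each migration runs exactly once
        conn.execute(MIGRATIONS[version])
        conn.execute("INSERT INTO schema_history (version) VALUES (?)", (version,))
        newly_applied.append(version)
    conn.commit()
    return newly_applied

conn = sqlite3.connect(":memory:")
first_run = migrate(conn)   # applies versions 1 and 2
second_run = migrate(conn)  # applies nothing; both are already recorded
```

Because applied versions are recorded in the same database, re-running the tool is safe: it only applies what is missing.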
Migration File Example
-- V003__add_email_verified_column.sql (Flyway naming)
ALTER TABLE users ADD COLUMN email_verified BOOLEAN DEFAULT FALSE;
CREATE INDEX idx_users_email_verified ON users (email_verified);
Every schema change — adding a column, creating an index, modifying a constraint — gets its own migration file, checked into version control alongside application code.
Migration Tools Compared
Flyway
- SQL-first approach: write plain SQL migration files.
- Naming convention: V1__description.sql, V2__description.sql.
- Tracks applied migrations in a flyway_schema_history table.
- Supports Java, CLI, Maven, and Gradle execution.
- Best for teams that prefer raw SQL control.
Liquibase
- Supports SQL, XML, YAML, and JSON changelog formats.
- Changesets are identified by author and ID, not file names.
- Built-in rollback generation for many change types.
- Diff and snapshot commands help compare database states.
- Best for teams that need multi-database portability.
Prisma Migrate
- Generates SQL migrations from changes to the Prisma schema file.
- prisma migrate dev creates and applies migrations in development.
- prisma migrate deploy applies pending migrations in production.
- A shadow database technique detects schema drift.
- Best for TypeScript/JavaScript teams already using the Prisma ORM.
Comparison
| Feature | Flyway | Liquibase | Prisma Migrate |
|---|---|---|---|
| Language | SQL | SQL/XML/YAML | Generated SQL |
| Rollback | Manual (undo scripts) | Auto-generated for some changes | Manual |
| Drift detection | Validate checksum | Diff command | Shadow database |
| CI/CD integration | CLI, Maven, Gradle | CLI, Maven, Gradle | CLI, npm scripts |
| Learning curve | Low | Medium | Low (if using Prisma) |
Zero-Downtime Migrations
The challenge: your application is running, serving requests, and you need to change the schema without downtime. Standard ALTER TABLE statements can acquire locks that block reads and writes for the duration of the operation.
Techniques for Zero-Downtime Changes
1. Online DDL (MySQL, PostgreSQL)
Modern databases support non-blocking schema changes for many operations:
- PostgreSQL: ALTER TABLE ... ADD COLUMN with a default value is instant in PostgreSQL 11+ (no table rewrite).
- MySQL: ALTER TABLE ... ALGORITHM=INPLACE, LOCK=NONE performs many changes without blocking DML.
2. Ghost Tables (pt-online-schema-change, gh-ost)
For operations that still require table rewrites:
- Create a ghost table with the new schema.
- Copy existing rows in small batches.
- Capture ongoing changes via triggers (pt-osc) or binlog (gh-ost).
- Swap the original and ghost tables atomically.
gh-ost is preferred because it avoids trigger overhead and provides better control over replication lag.
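The four ghost-table steps can be walked through in miniature. The sketch below uses SQLite, and a plain Python list stands in for the trigger/binlog change stream that pt-osc and gh-ost actually use; table names like _users_gho are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(i, f"user{i}") for i in range(1, 11)])

# Step 1: create the ghost table with the new schema (extra column).
conn.execute("""CREATE TABLE _users_gho (
    id INTEGER PRIMARY KEY, name TEXT,
    email_verified BOOLEAN DEFAULT FALSE)""")

# Step 2: copy existing rows in small batches, keyed on the primary key.
last_id, BATCH = 0, 4
while True:
    rows = conn.execute(
        "SELECT id, name FROM users WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, BATCH)).fetchall()
    if not rows:
        break
    conn.executemany("INSERT INTO _users_gho (id, name) VALUES (?, ?)", rows)
    last_id = rows[-1][0]

# Step 3: apply changes captured while the copy was running
# (stand-in for the trigger/binlog stream).
captured = [(11, "user11")]  # pretend this row was inserted mid-copy
conn.executemany("INSERT INTO _users_gho (id, name) VALUES (?, ?)", captured)

# Step 4: swap the tables (MySQL's RENAME TABLE does this atomically).
conn.executescript("""
    ALTER TABLE users RENAME TO _users_old;
    ALTER TABLE _users_gho RENAME TO users;
""")
```

After the swap, "users" has the new schema and all eleven rows, including the one written during the copy.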
3. Concurrent Index Creation
-- PostgreSQL: does not lock the table for writes
CREATE INDEX CONCURRENTLY idx_orders_customer ON orders (customer_id);
Always use CONCURRENTLY in PostgreSQL for production index creation. Without it, the CREATE INDEX statement holds a lock that blocks inserts and updates.
The Expand-Contract Pattern
The expand-contract pattern (also called parallel change) is the safest strategy for schema changes that affect application code. It splits a breaking change into three non-breaking phases:
Phase 1 — Expand
Add the new structure alongside the old one. Both exist simultaneously.
-- Add new column (old column "name" stays)
ALTER TABLE users ADD COLUMN first_name VARCHAR(100);
ALTER TABLE users ADD COLUMN last_name VARCHAR(100);
Deploy application code that writes to both old and new columns. Reads still use the old column.
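The dual-write phase might look like the following hypothetical application helper (using SQLite and str.partition to mimic the split; the upsert syntax is SQLite's ON CONFLICT clause):

```python
import sqlite3

# Hypothetical Expand-phase write path: the application keeps writing the
# legacy "name" column while also populating the new split columns.
def save_user(conn, user_id, full_name):
    first, _, last = full_name.partition(" ")
    conn.execute(
        """INSERT INTO users (id, name, first_name, last_name)
           VALUES (?, ?, ?, ?)
           ON CONFLICT(id) DO UPDATE SET
               name = excluded.name,
               first_name = excluded.first_name,
               last_name = excluded.last_name""",
        (user_id, full_name, first, last),
    )

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE users (
    id INTEGER PRIMARY KEY, name TEXT,
    first_name VARCHAR(100), last_name VARCHAR(100))""")
save_user(conn, 1, "Ada Lovelace")
row = conn.execute("SELECT name, first_name, last_name FROM users").fetchone()
```

Because every write keeps both representations in sync, switching reads later requires no data catch-up for rows written after this deployment.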
Phase 2 — Migrate
Backfill the new columns with data from the old column:
UPDATE users SET
first_name = split_part(name, ' ', 1),
last_name = split_part(name, ' ', 2)
WHERE first_name IS NULL;
Run this in batches to avoid long-running transactions. Once complete, switch reads to the new columns.
Phase 3 — Contract
Remove the old column once no code references it:
ALTER TABLE users DROP COLUMN name;
Each phase is a separate deployment. If anything goes wrong, you roll back the application code — the database still has both columns.
Timeline:
┌─────────┐     ┌────────────┐     ┌──────────┐
│ Expand  │────▶│  Migrate   │────▶│ Contract │
│  (add)  │     │ (backfill) │     │ (remove) │
└─────────┘     └────────────┘     └──────────┘
     │                │                 │
 Deploy v2        Deploy v3         Deploy v4
(dual write)     (read new)        (drop old)
Data Migration Strategies
Schema migrations change structure; data migrations change content. Examples include reformatting phone numbers, splitting a full name into first and last name, or encrypting a previously plaintext column.
Best Practices
- Batch processing — Never update millions of rows in a single transaction. Use batches of 1,000 to 10,000 rows with a short sleep between batches to avoid overwhelming the database.
-- Batch update pattern
UPDATE users
SET phone = format_phone(phone)
WHERE id IN (
SELECT id FROM users
WHERE phone NOT LIKE '+%'
LIMIT 5000
);
- Idempotency — Design migrations so they can be re-run safely. The WHERE phone NOT LIKE '+%' clause above ensures already-migrated rows are skipped.
- Progress tracking — Log progress (e.g., the last processed ID) so a failed migration can resume from where it stopped.
- Validation — After migration, run queries that verify data integrity:
-- Ensure no rows were missed
SELECT COUNT(*) FROM users WHERE phone NOT LIKE '+%';
-- Should return 0
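Putting the practices together, a backfill driver might look like this sketch. It is hedged throughout: format_phone is a stand-in normalizer (here it just prefixes "+1"), and SQLite substitutes for the production database:

```python
import sqlite3
import time

def format_phone(raw):
    # Assumed normalization for illustration only: prefix a country code.
    return "+1" + raw

def backfill_phones(conn, batch_size=1000, pause=0.01):
    """Batched, idempotent, resumable backfill; returns rows migrated."""
    migrated = 0
    while True:
        # Idempotency: already-migrated rows (starting with '+') are
        # never selected again, so the whole function is safe to re-run.
        rows = conn.execute(
            "SELECT id, phone FROM users WHERE phone NOT LIKE '+%' LIMIT ?",
            (batch_size,)).fetchall()
        if not rows:
            return migrated
        conn.executemany("UPDATE users SET phone = ? WHERE id = ?",
                         [(format_phone(p), uid) for uid, p in rows])
        conn.commit()  # one short transaction per batch
        migrated += len(rows)
        last_id = rows[-1][0]  # progress tracking: log/persist this to resume
        time.sleep(pause)      # brief pause to reduce load on the database

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, phone TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(i, f"555000{i}") for i in range(1, 6)])
first = backfill_phones(conn, batch_size=2)   # migrates all five rows
second = backfill_phones(conn, batch_size=2)  # no-op on re-run
```

The second call returning zero demonstrates the idempotency property: re-running after a crash simply finishes whatever the first run left undone.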
Blue-Green Database Deployments
Blue-green deployments for databases extend the classic blue-green pattern to the data layer:
┌──────────────────┐
│ Load Balancer │
└────────┬─────────┘
│
┌───────────┴───────────┐
▼ ▼
┌──────────┐ ┌──────────┐
│ App Blue │ │ App Green│
└────┬─────┘ └────┬─────┘
│ │
▼ ▼
┌──────────┐ ┌──────────┐
│ DB Blue │ ──sync──▶ │ DB Green │
└──────────┘ └──────────┘
How it works:
- Blue is the active environment. Green is a copy.
- Apply schema migrations to Green.
- Deploy the new application version to Green.
- Run smoke tests against Green.
- Switch traffic from Blue to Green.
- If problems arise, switch back to Blue.
Challenges:
- Data synchronization — Writes to Blue during migration must replicate to Green. Logical replication (PostgreSQL) or binlog replication (MySQL) handles this.
- Backward compatibility — The Green schema must remain compatible with Blue application code during the switchover window.
- Cost — You maintain two full database instances. Use this for critical migrations, not routine changes.
Testing Migrations
Never run a migration in production without testing it first:
- Clone production schema — Use pg_dump --schema-only or the equivalent to create a test database with the production schema (no data).
- Run the migration against the clone — Verify it completes without errors.
- Load test data — Use a representative, anonymized subset of production data and re-run the migration to test with realistic volumes.
- Measure lock duration — Use pg_stat_activity or SHOW PROCESSLIST to verify the migration does not hold locks beyond acceptable thresholds.
- Test rollback — If you have down migrations, verify they work correctly.
- CI/CD integration — Run migrations as part of your pipeline. Flyway, Liquibase, and Prisma Migrate all support CLI execution in CI.
# Example CI step
- name: Test migrations
run: |
createdb migration_test
flyway -url=jdbc:postgresql://localhost/migration_test migrate
flyway -url=jdbc:postgresql://localhost/migration_test validate
Key Takeaways
- Version every schema change as a migration file in source control.
- Use the expand-contract pattern for breaking schema changes — it decouples database changes from application deployments.
- Prefer online DDL and tools like gh-ost over locking ALTER TABLE statements.
- Batch data migrations and make them idempotent so they can be safely re-run.
- Blue-green database deployments provide a safety net for high-risk migrations but add operational complexity.
- Always test migrations against a production-like schema and dataset before applying them to production.