PostgreSQL Autovacuum Tuning — Workers, Scale Factors, Cost Limits, and Wraparound Prevention
Why autovacuum matters more than you think
PostgreSQL does not reclaim disk space automatically when you delete or update rows. Instead, old row versions (called "dead tuples") accumulate in every table. The autovacuum daemon is the background process responsible for cleaning them up.
If autovacuum falls behind, you get table bloat, slower queries, wasted disk, and eventually the dreaded transaction ID wraparound — a hard limit that forces PostgreSQL into single-user mode to protect data integrity.
Most production PostgreSQL issues trace back to autovacuum not keeping up. This guide covers how to tune it properly.
How VACUUM works under the hood
When a row is updated or deleted, PostgreSQL does not remove the old version immediately. Instead it marks it as dead. VACUUM does three things:
- Reclaims dead tuples — marks the space as reusable for future inserts
- Updates the visibility map — enables index-only scans
- Freezes old transaction IDs — prevents wraparound
VACUUM does not return disk space to the OS (that requires VACUUM FULL, which locks the table). Regular VACUUM simply makes space reusable within the table file.
Autovacuum architecture
The autovacuum launcher starts worker processes that vacuum individual tables. Key components:
- Launcher — a single process that wakes up every autovacuum_naptime and checks which tables need vacuuming
- Workers — up to autovacuum_max_workers concurrent processes, each vacuuming one table
- Cost-based throttling — workers pause periodically to avoid overwhelming I/O
Default configuration (and why it is too conservative)
-- These defaults suit a small dev database, not production
autovacuum_naptime = 1min
autovacuum_max_workers = 3
autovacuum_vacuum_scale_factor = 0.2
autovacuum_vacuum_cost_limit = -1   -- inherits vacuum_cost_limit (200)
autovacuum_vacuum_cost_delay = 2ms
On a 100-million-row table, a scale factor of 0.2 means autovacuum waits until 20 million dead tuples accumulate before it starts. That is far too late.
Tuning autovacuum workers
autovacuum_max_workers
Controls how many tables can be vacuumed in parallel. The default of 3 is often insufficient for databases with hundreds of active tables.
-- For databases with 50+ active tables
autovacuum_max_workers = 6
Important: The autovacuum_vacuum_cost_limit is shared across all workers. Doubling workers without adjusting cost limits means each worker gets half the I/O budget.
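The arithmetic is worth making explicit. This is a simplified model (the launcher actually rebalances budgets dynamically as workers start and finish, and tables with per-table cost settings are excluded), but it captures the trade-off:

```python
# Simplified sketch: the global autovacuum cost budget is split across
# active workers, so adding workers without raising the limit just slices
# the same I/O budget thinner per worker.

def per_worker_budget(global_cost_limit: int, active_workers: int) -> float:
    """Approximate cost budget each autovacuum worker receives."""
    return global_cost_limit / active_workers

print(round(per_worker_budget(200, 3), 1))  # 3 workers share 200 points
print(round(per_worker_budget(200, 6), 1))  # 6 workers: each gets half as much
print(round(per_worker_budget(400, 6), 1))  # raise the limit alongside workers
```

The practical rule: if you double `autovacuum_max_workers`, consider doubling `autovacuum_vacuum_cost_limit` too, or each individual table will be vacuumed more slowly than before.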
autovacuum_naptime
How often the launcher checks for work. The default of 60 seconds is fine for most workloads, but high-churn databases benefit from shorter intervals:
-- Check every 10 seconds on high-write databases
autovacuum_naptime = 10s
Scale factor and threshold tuning
Autovacuum triggers a vacuum on a table when its dead tuples exceed:
autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples
where reltuples is the table's estimated live row count from pg_class.
The default scale factor of 0.2 (20%) is problematic for large tables. For a 500-million-row table, that means 100 million dead tuples before vacuum starts.
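The trigger formula above is easy to sanity-check for your own table sizes. A minimal sketch (the 50-tuple threshold is the PostgreSQL default):

```python
# Sketch of the autovacuum trigger condition:
# vacuum fires once n_dead_tup exceeds threshold + scale_factor * reltuples

def vacuum_trigger_point(reltuples: int, scale_factor: float = 0.2,
                         threshold: int = 50) -> float:
    """Dead-tuple count at which autovacuum kicks in for a table."""
    return threshold + scale_factor * reltuples

# Default 0.2 scale factor on large tables:
print(vacuum_trigger_point(100_000_000))           # ~20 million dead tuples
print(vacuum_trigger_point(500_000_000))           # ~100 million dead tuples
# Lowered to 0.05:
print(vacuum_trigger_point(500_000_000, 0.05))     # ~25 million
# Fixed threshold with scale_factor = 0: size-independent
print(vacuum_trigger_point(500_000_000, 0, 10_000))  # 10,000 dead tuples
```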
Recommended approach: lower global defaults, override per table
-- Global: more aggressive defaults
ALTER SYSTEM SET autovacuum_vacuum_scale_factor = 0.05;
ALTER SYSTEM SET autovacuum_vacuum_threshold = 50;
ALTER SYSTEM SET autovacuum_analyze_scale_factor = 0.02;
ALTER SYSTEM SET autovacuum_analyze_threshold = 50;
SELECT pg_reload_conf();
For very large or very active tables, use per-table settings:
-- High-churn orders table: vacuum after 10,000 dead tuples
ALTER TABLE orders SET (
autovacuum_vacuum_scale_factor = 0,
autovacuum_vacuum_threshold = 10000,
autovacuum_analyze_scale_factor = 0,
autovacuum_analyze_threshold = 5000
);
Setting scale_factor = 0 with a fixed threshold gives predictable vacuum frequency regardless of table size.
Cost limit tuning
Autovacuum uses cost-based throttling to limit I/O impact. Each page read or dirtied costs points. When the cost limit is reached, the worker sleeps for cost_delay.
-- Default: very conservative
autovacuum_vacuum_cost_limit = 200
autovacuum_vacuum_cost_delay = 2ms
-- Production: let vacuum work harder
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 1000;
ALTER SYSTEM SET autovacuum_vacuum_cost_delay = 2ms;
SELECT pg_reload_conf();
On modern SSDs, you can push the cost limit to 2000 or higher. The goal is to let vacuum finish quickly so it does not compete with queries for extended periods.
Per-table cost overrides
-- Critical table: vacuum aggressively, independent of global cost limit
ALTER TABLE events SET (
autovacuum_vacuum_cost_limit = 2000,
autovacuum_vacuum_cost_delay = 0
);
Setting cost_delay = 0 removes all throttling for that table. Use this only for tables where vacuum falling behind causes real problems.
Monitoring bloat
You cannot tune what you do not measure. Key queries for monitoring:
Dead tuple count per table
SELECT schemaname, relname,
n_dead_tup,
n_live_tup,
round(n_dead_tup::numeric / greatest(n_live_tup, 1) * 100, 2) AS dead_pct,
last_autovacuum,
last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 20;
Table, index, and TOAST sizes
This query does not measure bloat directly — for a true bloat estimate, use the pgstattuple extension or a community bloat query — but it shows where disk space is going and which tables are worth investigating:
SELECT schemaname, tablename,
       pg_size_pretty(pg_total_relation_size(format('%I.%I', schemaname, tablename)::regclass)) AS total_size,
       pg_size_pretty(
         pg_total_relation_size(format('%I.%I', schemaname, tablename)::regclass) -
         pg_relation_size(format('%I.%I', schemaname, tablename)::regclass)
       ) AS index_and_toast_size
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(format('%I.%I', schemaname, tablename)::regclass) DESC
LIMIT 20;
Autovacuum currently running
SELECT pid, datname, relid::regclass AS table_name,
phase, heap_blks_total, heap_blks_scanned, heap_blks_vacuumed
FROM pg_stat_progress_vacuum;
Set up alerts when dead_pct exceeds 10% on any large table or when autovacuum has not run on an active table in 24 hours.
Transaction ID wraparound prevention
PostgreSQL uses 32-bit transaction IDs. Comparisons between them are circular, so of the roughly 4 billion possible IDs only about 2 billion can be "in the past" at any moment — and a busy database can burn through them. When a table's oldest unfrozen transaction ID approaches the limit, PostgreSQL forces an anti-wraparound vacuum that cannot be cancelled and takes priority over other autovacuum work.
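The circular comparison can be sketched as a toy model (real XIDs reserve a few special values, so this is illustrative only):

```python
# Toy model of circular XID arithmetic: IDs live on a ring of size 2^32,
# and an ID more than 2^31 transactions "behind" the current one would
# suddenly compare as being in the future -- that is wraparound.
XID_SPACE = 2**32

def xid_age(current_xid: int, frozen_xid: int) -> int:
    """Distance from frozen_xid to current_xid on the circular XID space."""
    return (current_xid - frozen_xid) % XID_SPACE

print(xid_age(1_000_000, 500_000))    # 500,000 transactions old
print(xid_age(100, XID_SPACE - 100))  # 200: the counter wrapped past zero
print(2**31)                          # the comparison horizon (~2.1 billion)
```

Freezing replaces old XIDs with a permanent "frozen" marker, which is what keeps a tuple's age from ever reaching the 2^31 horizon.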
Key settings
-- When to trigger aggressive (anti-wraparound) vacuum
autovacuum_freeze_max_age = 200000000 -- default
-- What age to freeze tuples during normal vacuum
vacuum_freeze_min_age = 50000000 -- default
Monitoring wraparound risk
SELECT c.oid::regclass AS table_name,
greatest(age(c.relfrozenxid), age(t.relfrozenxid)) AS xid_age,
pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
FROM pg_class c
JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
LEFT JOIN pg_class t ON c.reltoastrelid = t.oid
WHERE c.relkind IN ('r', 'm')
AND n.nspname NOT IN ('pg_catalog', 'information_schema')
ORDER BY greatest(age(c.relfrozenxid), age(t.relfrozenxid)) DESC
LIMIT 20;
PostgreSQL forces an aggressive vacuum once a table's age passes autovacuum_freeze_max_age (default 200 million), so under normal operation xid_age should hover near that mark. Alert when any table's xid_age exceeds 500 million — that means anti-wraparound vacuums are not keeping up. At 1 billion you are entering dangerous territory, and the hard stop is at roughly 2 billion.
Per-table tuning strategy
Not every table needs the same vacuum settings. Categorize your tables:
| Category | Scale Factor | Threshold | Cost Limit | Example |
|---|---|---|---|---|
| High-churn, large | 0 | 10,000 | 2000 | events, logs, audit_trail |
| High-churn, small | 0.05 | 50 | default | sessions, tokens |
| Append-only | 0.1 | 1000 | default | analytics, metrics |
| Mostly-read | 0.2 | 50 | default | users, config |
Apply settings with ALTER TABLE ... SET (storage_parameter = value). On PostgreSQL 13+, append-only tables are better served by the insert-triggered settings (autovacuum_vacuum_insert_threshold, autovacuum_vacuum_insert_scale_factor), since they accumulate few dead tuples but still need freezing and visibility-map updates. Review quarterly as table usage patterns change.
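One way to keep the category table above enforceable is to generate the per-table DDL from a single source of truth. A minimal sketch — the `CATEGORIES` dict, table names, and `alter_statement` helper are all hypothetical:

```python
# Hypothetical mapping from the categories above to storage parameters.
# Values mirror the "High-churn, large" and "High-churn, small" rows.
CATEGORIES = {
    "high_churn_large": {
        "autovacuum_vacuum_scale_factor": 0,
        "autovacuum_vacuum_threshold": 10_000,
        "autovacuum_vacuum_cost_limit": 2_000,
    },
    "high_churn_small": {
        "autovacuum_vacuum_scale_factor": 0.05,
        "autovacuum_vacuum_threshold": 50,
    },
}

def alter_statement(table: str, category: str) -> str:
    """Render an ALTER TABLE ... SET (...) statement for a category."""
    opts = ",\n  ".join(f"{k} = {v}" for k, v in CATEGORIES[category].items())
    return f"ALTER TABLE {table} SET (\n  {opts}\n);"

print(alter_statement("events", "high_churn_large"))
```

Checking the generated DDL into version control makes quarterly reviews a diff instead of an archaeology exercise.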
Common mistakes
- Leaving defaults on large databases — the 20% scale factor is almost always too high for tables over 10 million rows
- Adding workers without adjusting cost limits — more workers with the same total budget means each worker is slower
- Disabling autovacuum on any table — this guarantees eventual wraparound. Never do this.
- Running VACUUM FULL during peak hours — it takes an ACCESS EXCLUSIVE lock. Use pg_repack instead for online table compaction
- Ignoring TOAST tables — large text/jsonb columns are stored in TOAST tables that also need vacuuming
Recommended production baseline
ALTER SYSTEM SET autovacuum_max_workers = 6;
ALTER SYSTEM SET autovacuum_naptime = '15s';
ALTER SYSTEM SET autovacuum_vacuum_scale_factor = 0.05;
ALTER SYSTEM SET autovacuum_vacuum_threshold = 50;
ALTER SYSTEM SET autovacuum_analyze_scale_factor = 0.02;
ALTER SYSTEM SET autovacuum_analyze_threshold = 50;
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 1000;
ALTER SYSTEM SET autovacuum_vacuum_cost_delay = '2ms';
SELECT pg_reload_conf();
Then add per-table overrides for your highest-churn tables. Monitor with the queries above and adjust based on observed bloat and vacuum duration.
Article #440 in the Codelit engineering series. Explore our full library of system design, infrastructure, and architecture guides at codelit.io.