Time Series Database Architecture — Storage, Compression & Query Patterns
Data with a timestamp changes everything#
Time series data is any data where each point is associated with a timestamp: server CPU metrics every 10 seconds, stock prices every millisecond, IoT sensor readings every minute. The defining characteristics are that data arrives ordered by time, is appended rather than updated, and queries almost always filter by time range.
This access pattern is so different from general-purpose workloads that it demands specialized storage engines.
What makes time series data different#
Traditional databases assume random reads and writes across the entire dataset. Time series workloads have unique properties:
- Write-heavy — millions of data points per second, rarely updated or deleted
- Time-ordered — data arrives roughly in chronological order
- Recent data is hot — most queries target the last hours or days
- Old data is cold — historical data is queried rarely and can be lower resolution
- High cardinality metadata — thousands of unique series (host + metric combinations)
- Predictable patterns — values change slowly, enabling aggressive compression
Write-optimized storage engines#
Time series databases optimize for high-throughput sequential writes.
Log-Structured Merge Trees (LSM)#
Many TSDBs use LSM trees or variants. Writes go to an in-memory buffer (memtable), then flush to sorted, immutable files on disk (SSTables). Background compaction merges files.
- Write amplification is traded for write throughput
- No random I/O on writes — everything is sequential
- InfluxDB's TSM storage engine (paired with its TSI, the Time Series Index, for series lookup) takes this approach
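To make the write path concrete, here is a minimal toy sketch of the LSM idea: writes land in an in-memory buffer, and full buffers are flushed as sorted, immutable runs. All names are illustrative, and real engines add write-ahead logs, bloom filters, and background compaction.

```python
import bisect

class MiniLSM:
    """Toy LSM tree: writes buffer in a memtable, then flush to sorted, immutable runs."""

    def __init__(self, memtable_limit=4):
        self.memtable = {}        # in-memory write buffer
        self.sstables = []        # sorted, immutable (key, value) runs, oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # Sequential write: the whole buffer becomes one sorted run on "disk".
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Newest run wins, mimicking the LSM read path (newer data shadows older).
        for run in reversed(self.sstables):
            keys = [k for k, _ in run]
            i = bisect.bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                return run[i][1]
        return None
```

Note that `put` never touches old runs — every write is either an in-memory update or a sequential flush, which is exactly the trade the article describes.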
Time-Structured Merge Tree (TSM)#
InfluxDB's TSM engine is a purpose-built variant. Data is organized into shards by time range, each shard containing a set of TSM files. This makes retention policies trivial — dropping old data means deleting entire shards.
Append-only columnar storage#
Prometheus uses a custom append-only storage format. Each time series gets its own chunk of samples. Chunks are immutable once full and are memory-mapped for fast reads.
ClickHouse uses a columnar MergeTree engine where data is stored column-by-column, enabling vectorized query execution and extreme compression ratios.
Compression — fitting billions of points in memory#
Time series data compresses extraordinarily well because consecutive values are similar.
Gorilla compression (Facebook)#
Facebook's Gorilla paper introduced a compression scheme that achieves 12x compression on real-world metrics:
- Timestamps: Delta-of-delta encoding. Most consecutive timestamps have the same delta (e.g., exactly 10 seconds apart), so the delta-of-delta is zero — encoded in a single bit.
- Values: XOR encoding. Consecutive floating-point values are XORed — when values change slowly, most bits are zero, requiring very few bits to encode.
Raw: 1709251200, 45.2, 1709251210, 45.3, 1709251220, 45.2
Deltas: 10, 10, 10 (timestamps)
Delta-of-delta: 0, 0 → 1 bit each
XOR values: small differences → few bits each
This compression is used by Prometheus, VictoriaMetrics, and Thanos.
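A rough way to see why this works is to count the bits a Gorilla-style encoder would spend. The sketch below is a simplified cost model, not the real bitstream format: the control-bit sizes are approximations, and only the 14-bit first-delta detail matches the paper.

```python
import struct

def float_bits(x):
    """Reinterpret a float64 as a 64-bit integer so we can XOR bit patterns."""
    return struct.unpack('>Q', struct.pack('>d', x))[0]

def gorilla_cost(points):
    """Approximate bit cost of Gorilla-style encoding for (timestamp, value) samples.

    Timestamps: delta-of-delta (zero costs a single bit; other cases are
    approximated). Values: XOR with the previous value; a zero XOR costs one
    bit, otherwise we charge roughly the significant bits of the XOR.
    """
    bits = 64 + 64                  # first sample stored raw
    prev_ts, prev_val = points[0]
    prev_delta = None
    for ts, val in points[1:]:
        delta = ts - prev_ts
        if prev_delta is None:
            bits += 14              # Gorilla stores the first delta in 14 bits
        elif delta == prev_delta:
            bits += 1               # the common case: a single '0' bit
        else:
            bits += 9 + abs(delta - prev_delta).bit_length()
        xor = float_bits(val) ^ float_bits(prev_val)
        bits += 1 if xor == 0 else 2 + xor.bit_length()
        prev_delta, prev_ts, prev_val = delta, ts, val
    return bits
```

Feeding it 100 samples taken exactly 10 seconds apart with a constant value costs a few hundred bits instead of the 12,800 bits of raw storage — the regular-interval, slow-changing case the article describes.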
Delta-of-delta encoding#
Beyond Gorilla, integer metrics (counters, gauges with integer values) benefit from simple delta-of-delta encoding:
Raw values: 100, 105, 110, 115, 120
Deltas: 5, 5, 5, 5
Delta-of-delta: 0, 0, 0 → nearly free to store
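The integer case is simple enough to implement end-to-end. This is a minimal, lossless sketch: the encoder keeps the first value and first delta, then stores only delta-of-deltas, and the decoder reverses the process.

```python
def dod_encode(values):
    """Encode integers as (first_value, first_delta, delta-of-deltas)."""
    deltas = [b - a for a, b in zip(values, values[1:])]
    dods = [b - a for a, b in zip(deltas, deltas[1:])]
    return values[0], deltas[0], dods

def dod_decode(first, first_delta, dods):
    """Rebuild the original sequence from the delta-of-delta encoding."""
    values = [first, first + first_delta]
    delta = first_delta
    for dod in dods:
        delta += dod
        values.append(values[-1] + delta)
    return values
```

For the linear sequence above, the delta-of-delta stream is all zeros, which a real encoder would pack into roughly one bit per point.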
Dictionary encoding for tags#
High-cardinality tag values (hostnames, region names) repeat constantly. Dictionary encoding maps each unique string to a small integer, drastically reducing storage for metadata.
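A dictionary encoder is just a bidirectional mapping between strings and integer IDs. A minimal sketch:

```python
class TagDictionary:
    """Map repeated tag strings to small integer IDs and back."""

    def __init__(self):
        self.str_to_id = {}
        self.id_to_str = []

    def encode(self, tag):
        # Assign the next ID on first sight; reuse it for every repeat.
        if tag not in self.str_to_id:
            self.str_to_id[tag] = len(self.id_to_str)
            self.id_to_str.append(tag)
        return self.str_to_id[tag]

    def decode(self, tag_id):
        return self.id_to_str[tag_id]
```

A hostname like `eu-west-1-prod-api-42` that appears in millions of rows is stored once, with each row holding only a small integer.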
Retention policies and tiered storage#
Not all data deserves the same storage treatment.
Retention policies automatically delete data older than a threshold. A typical setup:
| Tier | Resolution | Retention | Storage |
|---|---|---|---|
| Hot | Raw (10s) | 7 days | SSD / memory |
| Warm | 1-minute avg | 90 days | SSD |
| Cold | 1-hour avg | 2 years | HDD / object storage |
| Archive | Daily summary | Forever | S3 / GCS |
This tiered approach keeps costs manageable while preserving historical trends.
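Routing data to a tier is usually just an age check against configured thresholds. A sketch mirroring the table above (tier names and cutoffs are illustrative):

```python
from datetime import timedelta

# Ordered youngest-first; thresholds mirror the tier table above.
TIERS = [
    (timedelta(days=7),   "hot"),      # raw resolution, SSD / memory
    (timedelta(days=90),  "warm"),     # 1-minute averages, SSD
    (timedelta(days=730), "cold"),     # 1-hour averages, HDD / object storage
]

def tier_for_age(age):
    """Return the storage tier for data of the given age."""
    for max_age, name in TIERS:
        if age <= max_age:
            return name
    return "archive"                   # daily summaries, kept forever
```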
Downsampling — trading resolution for efficiency#
Downsampling aggregates high-resolution data into lower-resolution summaries. Instead of keeping every 10-second CPU reading for a year, store 5-minute averages after 30 days.
Common aggregation functions:
- avg — average value in the window
- min / max — extremes for alerting and review
- sum — total for counters
- count — number of raw points (for weighted re-aggregation)
- percentile — p50, p95, p99 for latency data
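The core of downsampling is grouping samples into fixed windows and applying one of these aggregates. A minimal sketch:

```python
from collections import defaultdict

def downsample(samples, window, agg):
    """Aggregate (unix_ts, value) samples into fixed-width windows.

    window is the bucket width in seconds; agg is any function over the
    values in a bucket (average, max, sum, ...).
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[ts - ts % window].append(value)   # align to the window start
    return {start: agg(vals) for start, vals in sorted(buckets.items())}

def avg(xs):
    return sum(xs) / len(xs)
```

Keeping `count` alongside `avg` per bucket is what makes later re-aggregation correct: averaging two averages is only valid when weighted by how many raw points each one summarizes.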
Tools handling downsampling:
- InfluxDB — continuous queries and tasks
- Prometheus — recording rules
- TimescaleDB — continuous aggregates (materialized views that auto-refresh)
- VictoriaMetrics — downsampling via the -downsampling.period flag
The tool landscape#
InfluxDB#
Purpose-built TSDB with its own query language (Flux, InfluxQL). Strong ecosystem, cloud-hosted option, built-in dashboarding with Chronograf.
- Best for: Metrics, IoT, and application monitoring
- Storage: TSM engine with built-in compression
- Query: Flux (functional) or InfluxQL (SQL-like)
TimescaleDB#
PostgreSQL extension that adds time series superpowers. Full SQL compatibility, joins with relational data, and hypertables that auto-partition by time.
- Best for: Teams already on PostgreSQL who need time series alongside relational data
- Storage: PostgreSQL heap with chunk-based partitioning
- Query: Full PostgreSQL SQL — joins, CTEs, window functions
Prometheus#
Pull-based monitoring system and TSDB. The standard for Kubernetes and cloud-native monitoring. Paired with Grafana for visualization.
- Best for: Infrastructure and application monitoring, alerting
- Storage: Local append-only chunks with Gorilla compression
- Query: PromQL — purpose-built for time series aggregation
- Limitation: Local storage only — use Thanos or Cortex for long-term retention
ClickHouse#
Columnar OLAP database that excels at time series analytics. Handles petabytes of data with sub-second query performance.
- Best for: Analytics on time series data, high-cardinality workloads, log analysis
- Storage: MergeTree with columnar compression
- Query: SQL with time series extensions
QuestDB#
High-performance TSDB written in Java and C++, optimized for fast SQL queries on time series data. Uses memory-mapped files and SIMD instructions.
- Best for: Financial data, high-frequency ingestion, SQL-native teams
- Storage: Column-based with append-only design
- Query: PostgreSQL-compatible SQL with time series extensions
Query patterns for time series#
Time series queries follow predictable patterns:
Range queries: "Give me CPU usage for host-42 in the last 6 hours"
SELECT time, cpu_usage FROM metrics
WHERE host = 'host-42'
AND time > now() - INTERVAL '6 hours'
ORDER BY time;
Aggregation over windows: "Average CPU per 5-minute bucket"
SELECT time_bucket('5 minutes', time) AS bucket,
avg(cpu_usage) AS avg_cpu
FROM metrics
WHERE time > now() - INTERVAL '24 hours'
GROUP BY bucket ORDER BY bucket;
Top-N series: "Which 10 hosts have the highest p99 latency?"
SELECT host, percentile_cont(0.99) WITHIN GROUP (ORDER BY latency) AS p99
FROM requests
WHERE time > now() - INTERVAL '1 hour'
GROUP BY host ORDER BY p99 DESC LIMIT 10;
Rate of change: "What's the request rate per second?"
rate(http_requests_total[5m])
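The idea behind rate() can be sketched in a few lines: sum the increases of a monotonic counter within the window, treat any drop as a counter reset, and divide by the elapsed span. This is a simplification — real PromQL also extrapolates to the window boundaries — but it captures the reset-handling logic.

```python
def rate(samples, window):
    """Per-second rate of a monotonically increasing counter.

    samples: list of (unix_ts, counter_value), sorted by time.
    window: lookback in seconds, measured from the newest sample.
    """
    start = samples[-1][0] - window
    in_window = [(t, v) for t, v in samples if t >= start]
    if len(in_window) < 2:
        return 0.0
    increase = 0.0
    for (_, prev), (_, cur) in zip(in_window, in_window[1:]):
        # A decrease means the counter reset (e.g. process restart);
        # count the full new value as the post-reset increase.
        increase += cur - prev if cur >= prev else cur
    span = in_window[-1][0] - in_window[0][0]
    return increase / span
```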
Use cases by industry#
Monitoring and observability#
Server metrics, application traces, log volumes. Prometheus + Grafana is the de facto stack. At scale, companies use Thanos, Cortex, or VictoriaMetrics for long-term storage.
IoT and industrial#
Sensor readings from thousands of devices — temperature, pressure, vibration. InfluxDB and TimescaleDB dominate this space. Key challenge: handling device cardinality and intermittent connectivity.
Financial markets#
Tick data, order book snapshots, trade execution metrics. QuestDB and ClickHouse handle the throughput. Microsecond-precision timestamps and out-of-order ingestion are critical requirements.
Visualize your time series architecture#
See how ingestion pipelines, storage tiers, and query layers connect — try Codelit to generate an interactive diagram showing your time series infrastructure from collectors to dashboards.
Key takeaways#
- Time series data has unique access patterns — write-heavy, time-ordered, recent-hot
- Gorilla compression is transformative — 12x compression makes in-memory storage viable
- Retention policies are mandatory — raw data at full resolution cannot be kept forever
- Downsampling preserves trends — trade resolution for storage efficiency on historical data
- Choose tools by use case — Prometheus for monitoring, TimescaleDB for SQL, ClickHouse for analytics, QuestDB for finance
- Columnar storage wins — column-oriented designs enable both compression and fast aggregation
This is article #174 on the Codelit engineering blog — we publish in-depth guides on system design, infrastructure, and software architecture. Explore all of them at codelit.io.