databasesperformancesystem-designbackend

Database Index Design Patterns — Covering, Partial, Expression, and Beyond

March 29, 2026 7 min readBy Codelit Team Discussion

Why index design matters more than index existence#

Adding an index is easy. Adding the right index is the difference between a 2ms query and a 2-second one. Most teams create indexes reactively — a query is slow, so they slap an index on the WHERE column. That works until it doesn't.

Real index design requires understanding your query patterns, data distribution, and the trade-offs between read speed and write overhead.

B-tree indexes — the default workhorse#

B-tree is the default index type in PostgreSQL, MySQL, and most relational databases. It works for equality checks, range scans, and sorting.

CREATE INDEX idx_users_email ON users (email);

B-trees are balanced — lookups are O(log n). They support operators like =, <, >, <=, >=, and BETWEEN.

When B-tree falls short: full-text search, JSON queries, array containment, and geometric lookups all need specialized index types.

Covering indexes — eliminate table lookups#

A covering index includes all columns a query needs. The database can answer the query entirely from the index without touching the heap (table data).

-- Query: SELECT email, name FROM users WHERE email = 'x@y.com'
-- Covering index:
CREATE INDEX idx_users_email_name ON users (email) INCLUDE (name);

The INCLUDE clause (PostgreSQL 11+) adds columns to the leaf pages without affecting the sort order. MySQL achieves this with composite indexes.

Index-only scans#

When a covering index satisfies the query, PostgreSQL performs an index-only scan. You can verify this in EXPLAIN ANALYZE:

Index Only Scan using idx_users_email_name on users
  Index Cond: (email = 'x@y.com')
  Heap Fetches: 0

Heap Fetches: 0 means the table was never touched. This is the fastest possible read path.

Caveat: index-only scans require the visibility map to be up-to-date. Run VACUUM regularly on heavily updated tables.

Multi-column indexes — column order matters#

A multi-column index on (a, b, c) supports queries filtering on:

a alone
a and b
a, b, and c

It does not efficiently support queries filtering only on b or c. This is the leftmost prefix rule.

-- Good: uses the index
SELECT * FROM orders WHERE customer_id = 5 AND status = 'shipped';

-- Bad: cannot use index on (customer_id, status)
SELECT * FROM orders WHERE status = 'shipped';

Choosing column order#

Equality columns first — columns compared with = go leftmost
Range columns last — columns with <, >, BETWEEN go rightmost
High cardinality first — when multiple equality columns exist, put the most selective one first

-- Optimal for: WHERE tenant_id = ? AND created_at > ?
CREATE INDEX idx_orders_tenant_created
  ON orders (tenant_id, created_at);

Partial indexes — index only what you query#

A partial index includes only rows matching a predicate. Smaller index = faster scans, less storage, less write overhead.

-- Only 2% of orders are 'pending', but you query them constantly
CREATE INDEX idx_orders_pending
  ON orders (created_at)
  WHERE status = 'pending';

This index is a fraction of the size of a full index on created_at. PostgreSQL uses it when the query includes the matching WHERE clause.

Common use cases#

Soft deletes: WHERE deleted_at IS NULL — index only active records
Status filters: WHERE status = 'active' — index the hot subset
Null exclusion: WHERE column IS NOT NULL — skip sparse data

Expression indexes — index computed values#

When queries filter on a function or expression, a regular column index won't help. Expression indexes solve this.

-- Query: WHERE LOWER(email) = 'x@y.com'
CREATE INDEX idx_users_lower_email ON users (LOWER(email));

-- Query: WHERE (data ->> 'type') = 'premium'
CREATE INDEX idx_events_type ON events ((data ->> 'type'));

The expression in the index must exactly match the expression in the query. PostgreSQL compares expressions literally.

Performance tip: expression indexes are recomputed on every INSERT and UPDATE. Use them when the expression is cheap and the read benefit is large.

GIN indexes — full-text search and JSON#

GIN (Generalized Inverted Index) maps values to the rows containing them. It's the backbone of full-text search, array operations, and JSONB queries in PostgreSQL.

Full-text search#

-- Create a tsvector column or use an expression index
CREATE INDEX idx_articles_search
  ON articles USING GIN (to_tsvector('english', title || ' ' || body));

-- Query
SELECT * FROM articles
WHERE to_tsvector('english', title || ' ' || body)
  @@ to_tsquery('english', 'kubernetes & scaling');

GIN breaks the text into lexemes and builds an inverted index. Searches are fast even on millions of documents.

JSONB containment#

CREATE INDEX idx_events_data ON events USING GIN (data);

-- Query: find events where data contains {"type": "click"}
SELECT * FROM events WHERE data @> '{"type": "click"}';

Trade-off: GIN indexes are slower to update than B-tree. They use a pending list that gets merged during VACUUM. For write-heavy tables, tune gin_pending_list_limit.

BRIN indexes — time-series and append-only data#

BRIN (Block Range Index) stores summary information (min/max) for ranges of physical pages. It's tiny — often 1000x smaller than a B-tree — and perfect for naturally ordered data.

-- Time-series table where rows are inserted in timestamp order
CREATE INDEX idx_events_time ON events USING BRIN (created_at);

BRIN works because physically adjacent rows have similar created_at values. When you query WHERE created_at BETWEEN '2026-03-01' AND '2026-03-29', BRIN skips entire page ranges that fall outside the window.

When BRIN works well#

Append-only or insert-mostly tables
Data is physically ordered by the indexed column
Table is large (millions of rows)
You need low storage overhead

When BRIN fails#

Randomly inserted data (physical order doesn't match logical order)
Frequent updates that scatter values across pages
Point lookups (B-tree is faster for exact matches)

Choosing the right index type — decision guide#

Pattern	Index type	Why
Equality and range	B-tree	Default, fast for most queries
Query reads only indexed columns	Covering (INCLUDE)	Avoids heap fetches
Subset of rows queried	Partial	Smaller, faster index
Function in WHERE clause	Expression	Matches computed predicates
Full-text search	GIN	Inverted index for lexemes
JSONB containment	GIN	Supports `@>` and `?` operators
Time-series range scans	BRIN	Tiny index, huge tables
Geospatial queries	GiST/SP-GiST	R-tree for geometry

Anti-patterns to avoid#

Indexing every column — each index slows writes and consumes storage. Only index columns that appear in WHERE, JOIN, and ORDER BY.

Redundant indexes — an index on (a, b) makes a separate index on (a) redundant. Audit with pg_stat_user_indexes.

Unused indexes — check idx_scan in pg_stat_user_indexes. If it's zero for weeks, drop the index.

Wrong column order — placing a low-cardinality column first in a composite index wastes selectivity.

Monitoring index health#

-- Find unused indexes
SELECT schemaname, indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;

-- Check index size vs table size
SELECT pg_size_pretty(pg_indexes_size('orders')) AS index_size,
       pg_size_pretty(pg_table_size('orders')) AS table_size;

If your total index size exceeds your table size, you likely have redundant or unused indexes.

Visualize your database architecture#

Map out your index strategy alongside your schema and query patterns — try Codelit to generate interactive architecture diagrams.

Key takeaways#

Covering indexes eliminate heap fetches — use INCLUDE for read-heavy queries
Column order matters — equality first, range last, high cardinality first
Partial indexes reduce size and overhead for hot subsets
Expression indexes match computed WHERE clauses exactly
GIN powers full-text search and JSONB — accept slower writes
BRIN is ideal for time-series data with natural physical ordering
Audit regularly — unused and redundant indexes cost you on every write

Article #428 in the Codelit engineering series. Explore our full library of system design, infrastructure, and architecture guides at codelit.io.

{ }

Explore the Discord architecture interactively

Try it →

Try it on Codelit

Chaos Mode

Simulate node failures and watch cascading impact across your architecture

AI Architecture Review

Get an AI audit covering security gaps, bottlenecks, and scaling risks

Build this architecture →

Comments

API design

API Backward Compatibility: Ship Changes Without Breaking Consumers

6 min read

api design

Batch API Endpoints — Patterns for Bulk Operations, Partial Success, and Idempotency

8 min read

system design

Circuit Breaker Implementation — State Machine, Failure Counting, Fallbacks, and Resilience4j

7 min read

Try these templates

Multiplayer Game Backend

Real-time multiplayer game server with matchmaking, state sync, leaderboards, and anti-cheat.

8 components

Figma Collaborative Design Platform

Browser-based design tool with real-time multiplayer editing, component libraries, and developer handoff.

10 components

Build this architecture

Generate an interactive architecture for Database Index Design Patterns in seconds.

Try it in Codelit →

databasesperformancesystem-designbackend

Database Index Design Patterns — Covering, Partial, Expression, and Beyond

March 29, 2026 7 min readBy Codelit Team Discussion

Why index design matters more than index existence#

Real index design requires understanding your query patterns, data distribution, and the trade-offs between read speed and write overhead.

B-tree indexes — the default workhorse#

B-tree is the default index type in PostgreSQL, MySQL, and most relational databases. It works for equality checks, range scans, and sorting.

CREATE INDEX idx_users_email ON users (email);

B-trees are balanced — lookups are O(log n). They support operators like =, <, >, <=, >=, and BETWEEN.

When B-tree falls short: full-text search, JSON queries, array containment, and geometric lookups all need specialized index types.

Covering indexes — eliminate table lookups#

A covering index includes all columns a query needs. The database can answer the query entirely from the index without touching the heap (table data).

-- Query: SELECT email, name FROM users WHERE email = 'x@y.com'
-- Covering index:
CREATE INDEX idx_users_email_name ON users (email) INCLUDE (name);

The INCLUDE clause (PostgreSQL 11+) adds columns to the leaf pages without affecting the sort order. MySQL achieves this with composite indexes.

Index-only scans#

When a covering index satisfies the query, PostgreSQL performs an index-only scan. You can verify this in EXPLAIN ANALYZE:

Index Only Scan using idx_users_email_name on users
  Index Cond: (email = 'x@y.com')
  Heap Fetches: 0

Heap Fetches: 0 means the table was never touched. This is the fastest possible read path.

Caveat: index-only scans require the visibility map to be up-to-date. Run VACUUM regularly on heavily updated tables.

Multi-column indexes — column order matters#

A multi-column index on (a, b, c) supports queries filtering on:

a alone
a and b
a, b, and c

It does not efficiently support queries filtering only on b or c. This is the leftmost prefix rule.

-- Good: uses the index
SELECT * FROM orders WHERE customer_id = 5 AND status = 'shipped';

-- Bad: cannot use index on (customer_id, status)
SELECT * FROM orders WHERE status = 'shipped';

Choosing column order#

Equality columns first — columns compared with = go leftmost
Range columns last — columns with <, >, BETWEEN go rightmost
High cardinality first — when multiple equality columns exist, put the most selective one first

-- Optimal for: WHERE tenant_id = ? AND created_at > ?
CREATE INDEX idx_orders_tenant_created
  ON orders (tenant_id, created_at);

Partial indexes — index only what you query#

A partial index includes only rows matching a predicate. Smaller index = faster scans, less storage, less write overhead.

-- Only 2% of orders are 'pending', but you query them constantly
CREATE INDEX idx_orders_pending
  ON orders (created_at)
  WHERE status = 'pending';

This index is a fraction of the size of a full index on created_at. PostgreSQL uses it when the query includes the matching WHERE clause.

Common use cases#

Soft deletes: WHERE deleted_at IS NULL — index only active records
Status filters: WHERE status = 'active' — index the hot subset
Null exclusion: WHERE column IS NOT NULL — skip sparse data

Expression indexes — index computed values#

When queries filter on a function or expression, a regular column index won't help. Expression indexes solve this.

-- Query: WHERE LOWER(email) = 'x@y.com'
CREATE INDEX idx_users_lower_email ON users (LOWER(email));

-- Query: WHERE (data ->> 'type') = 'premium'
CREATE INDEX idx_events_type ON events ((data ->> 'type'));

The expression in the index must exactly match the expression in the query. PostgreSQL compares expressions literally.

Performance tip: expression indexes are recomputed on every INSERT and UPDATE. Use them when the expression is cheap and the read benefit is large.

GIN indexes — full-text search and JSON#

GIN (Generalized Inverted Index) maps values to the rows containing them. It's the backbone of full-text search, array operations, and JSONB queries in PostgreSQL.

Full-text search#

-- Create a tsvector column or use an expression index
CREATE INDEX idx_articles_search
  ON articles USING GIN (to_tsvector('english', title || ' ' || body));

-- Query
SELECT * FROM articles
WHERE to_tsvector('english', title || ' ' || body)
  @@ to_tsquery('english', 'kubernetes & scaling');

GIN breaks the text into lexemes and builds an inverted index. Searches are fast even on millions of documents.

JSONB containment#

CREATE INDEX idx_events_data ON events USING GIN (data);

-- Query: find events where data contains {"type": "click"}
SELECT * FROM events WHERE data @> '{"type": "click"}';

Trade-off: GIN indexes are slower to update than B-tree. They use a pending list that gets merged during VACUUM. For write-heavy tables, tune gin_pending_list_limit.

BRIN indexes — time-series and append-only data#

BRIN (Block Range Index) stores summary information (min/max) for ranges of physical pages. It's tiny — often 1000x smaller than a B-tree — and perfect for naturally ordered data.

-- Time-series table where rows are inserted in timestamp order
CREATE INDEX idx_events_time ON events USING BRIN (created_at);

When BRIN works well#

Append-only or insert-mostly tables
Data is physically ordered by the indexed column
Table is large (millions of rows)
You need low storage overhead

When BRIN fails#

Randomly inserted data (physical order doesn't match logical order)
Frequent updates that scatter values across pages
Point lookups (B-tree is faster for exact matches)

Choosing the right index type — decision guide#

Pattern	Index type	Why
Equality and range	B-tree	Default, fast for most queries
Query reads only indexed columns	Covering (INCLUDE)	Avoids heap fetches
Subset of rows queried	Partial	Smaller, faster index
Function in WHERE clause	Expression	Matches computed predicates
Full-text search	GIN	Inverted index for lexemes
JSONB containment	GIN	Supports `@>` and `?` operators
Time-series range scans	BRIN	Tiny index, huge tables
Geospatial queries	GiST/SP-GiST	R-tree for geometry

Anti-patterns to avoid#

Indexing every column — each index slows writes and consumes storage. Only index columns that appear in WHERE, JOIN, and ORDER BY.

Redundant indexes — an index on (a, b) makes a separate index on (a) redundant. Audit with pg_stat_user_indexes.

Unused indexes — check idx_scan in pg_stat_user_indexes. If it's zero for weeks, drop the index.

Wrong column order — placing a low-cardinality column first in a composite index wastes selectivity.

Monitoring index health#

-- Find unused indexes
SELECT schemaname, indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;

-- Check index size vs table size
SELECT pg_size_pretty(pg_indexes_size('orders')) AS index_size,
       pg_size_pretty(pg_table_size('orders')) AS table_size;

If your total index size exceeds your table size, you likely have redundant or unused indexes.

Visualize your database architecture#

Map out your index strategy alongside your schema and query patterns — try Codelit to generate interactive architecture diagrams.

Key takeaways#

Covering indexes eliminate heap fetches — use INCLUDE for read-heavy queries
Column order matters — equality first, range last, high cardinality first
Partial indexes reduce size and overhead for hot subsets
Expression indexes match computed WHERE clauses exactly
GIN powers full-text search and JSONB — accept slower writes
BRIN is ideal for time-series data with natural physical ordering
Audit regularly — unused and redundant indexes cost you on every write

Article #428 in the Codelit engineering series. Explore our full library of system design, infrastructure, and architecture guides at codelit.io.

{ }

Explore the Discord architecture interactively

Try it →

Try it on Codelit

Chaos Mode

Simulate node failures and watch cascading impact across your architecture

AI Architecture Review

Get an AI audit covering security gaps, bottlenecks, and scaling risks

Build this architecture →

Comments

API design

Try these templates

Multiplayer Game Backend

Real-time multiplayer game server with matchmaking, state sync, leaderboards, and anti-cheat.

8 components

Figma Collaborative Design Platform

Browser-based design tool with real-time multiplayer editing, component libraries, and developer handoff.

10 components

Build this architecture

Generate an interactive architecture for Database Index Design Patterns in seconds.

Try it in Codelit →

Database Index Design Patterns — Covering, Partial, Expression, and Beyond

Why index design matters more than index existence#

B-tree indexes — the default workhorse#

Covering indexes — eliminate table lookups#

Index-only scans#

Multi-column indexes — column order matters#

Choosing column order#

Partial indexes — index only what you query#

Common use cases#

Expression indexes — index computed values#

GIN indexes — full-text search and JSON#

Full-text search#

JSONB containment#

BRIN indexes — time-series and append-only data#

When BRIN works well#

When BRIN fails#

Choosing the right index type — decision guide#

Anti-patterns to avoid#

Monitoring index health#

Visualize your database architecture#

Key takeaways#

Comments

Related articles

API Backward Compatibility: Ship Changes Without Breaking Consumers

Batch API Endpoints — Patterns for Bulk Operations, Partial Success, and Idempotency

Circuit Breaker Implementation — State Machine, Failure Counting, Fallbacks, and Resilience4j

Try these templates

Multiplayer Game Backend

Figma Collaborative Design Platform

Build this architecture

Database Index Design Patterns — Covering, Partial, Expression, and Beyond

Why index design matters more than index existence#

B-tree indexes — the default workhorse#

Covering indexes — eliminate table lookups#

Index-only scans#

Multi-column indexes — column order matters#

Choosing column order#

Partial indexes — index only what you query#

Common use cases#

Expression indexes — index computed values#

GIN indexes — full-text search and JSON#

Full-text search#

JSONB containment#

BRIN indexes — time-series and append-only data#

When BRIN works well#

When BRIN fails#

Choosing the right index type — decision guide#

Anti-patterns to avoid#

Monitoring index health#

Visualize your database architecture#

Key takeaways#

Comments

Related articles

API Backward Compatibility: Ship Changes Without Breaking Consumers

Batch API Endpoints — Patterns for Bulk Operations, Partial Success, and Idempotency

Circuit Breaker Implementation — State Machine, Failure Counting, Fallbacks, and Resilience4j

Try these templates

Multiplayer Game Backend

Figma Collaborative Design Platform

Build this architecture