PostgreSQL JSONB Patterns — Operators, GIN Indexes, and Hybrid Document-Relational Design
Why JSONB exists in PostgreSQL#
Relational databases are great at structured data. But some data doesn't fit neatly into columns: user preferences, API payloads, event metadata, dynamic form submissions. You have two choices: create a dozen nullable columns, or store the flexible part as JSONB.
JSONB (Binary JSON) stores JSON data in a decomposed binary format. Unlike the older json type, JSONB supports indexing, efficient querying, and in-place updates.
CREATE TABLE orders (
id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
customer_id bigint NOT NULL REFERENCES customers(id),
status text NOT NULL DEFAULT 'pending',
metadata jsonb NOT NULL DEFAULT '{}'::jsonb,
created_at timestamptz NOT NULL DEFAULT now()
);
The metadata column holds whatever the order needs: shipping preferences, gift messages, discount codes, A/B test variants. The structured fields stay as proper columns.
JSONB operators#
Accessing values#
-- Arrow operator: returns JSON
SELECT metadata->'shipping' FROM orders;
-- Double arrow: returns text
SELECT metadata->>'shipping_method' FROM orders;
-- Path access: nested values
SELECT metadata#>'{address,city}' FROM orders;
SELECT metadata#>>'{address,city}' FROM orders; -- as text
Containment operators#
-- Does the JSONB contain this key-value pair?
SELECT * FROM orders
WHERE metadata @> '{"priority": "express"}'::jsonb;
-- Is the JSONB contained by this value?
SELECT * FROM orders
WHERE metadata <@ '{"priority": "express", "gift": true}'::jsonb;
-- Does the JSONB have this key?
SELECT * FROM orders
WHERE metadata ? 'discount_code';
-- Does the JSONB have any of these keys?
SELECT * FROM orders
WHERE metadata ?| array['discount_code', 'coupon'];
-- Does the JSONB have all of these keys?
SELECT * FROM orders
WHERE metadata ?& array['shipping_method', 'address'];
Modification operators#
-- Concatenate (merge) JSONB values
UPDATE orders
SET metadata = metadata || '{"tracked": true}'::jsonb
WHERE id = 42;
-- Remove a key
UPDATE orders
SET metadata = metadata - 'temporary_flag'
WHERE id = 42;
-- Remove a nested path
UPDATE orders
SET metadata = metadata #- '{address,apartment}'
WHERE id = 42;
GIN indexing for JSONB#
Without an index, every JSONB query does a sequential scan. GIN (Generalized Inverted Index) indexes solve this.
Default GIN index#
CREATE INDEX idx_orders_metadata ON orders USING GIN (metadata);
This supports the @>, ?, ?|, and ?& operators. It does not support -> or ->> comparisons.
-- Uses the GIN index (containment check)
SELECT * FROM orders WHERE metadata @> '{"priority": "express"}';
-- Does NOT use the GIN index (comparison on extracted value)
SELECT * FROM orders WHERE metadata->>'priority' = 'express';
jsonb_path_ops GIN index#
A smaller, faster alternative when you only need @> containment queries:
CREATE INDEX idx_orders_metadata_path ON orders
USING GIN (metadata jsonb_path_ops);
This index is 2-3x smaller than the default GIN index and faster for containment queries. The trade-off: it only supports @>, not ?, ?|, or ?&.
Expression indexes for specific paths#
When you query a specific JSON path frequently:
CREATE INDEX idx_orders_shipping ON orders
((metadata->>'shipping_method'));
-- Now this uses the B-tree expression index
SELECT * FROM orders
WHERE metadata->>'shipping_method' = 'overnight';
This is smaller and faster than a full GIN index when you only query one path.
Partial indexes on JSON paths#
Combine expression indexes with WHERE clauses for surgical precision:
-- Only index orders that have a discount code
CREATE INDEX idx_orders_discount ON orders
((metadata->>'discount_code'))
WHERE metadata ? 'discount_code';
-- Only index express priority orders
CREATE INDEX idx_express_orders ON orders (created_at)
WHERE metadata @> '{"priority": "express"}'::jsonb;
Partial indexes are dramatically smaller. If 5% of orders are express priority, the index is 95% smaller than indexing all orders.
Composite partial indexes#
CREATE INDEX idx_pending_express ON orders (customer_id, created_at)
WHERE status = 'pending'
AND metadata @> '{"priority": "express"}'::jsonb;
This index only covers pending express orders. A query that filters on all three conditions (status, priority, customer) reads a tiny index.
Document vs relational hybrid#
The hybrid approach uses relational columns for data you query, join, or constrain, and JSONB for everything else.
When to use JSONB#
- Dynamic attributes that vary per row (product specifications, user preferences)
- Event metadata where the schema changes over time
- External API payloads you store for auditing
- Configuration blobs that don't need relational queries
- Rapid prototyping when the schema isn't settled yet
When to use separate tables#
- Data you JOIN on — JSONB joins are expensive and unnatural
- Data with referential integrity — JSONB can't enforce foreign keys
- High-cardinality queries — B-tree indexes on columns outperform GIN on JSONB
- Aggregation targets —
SUM(price)on a column is far faster than extracting from JSONB - Data that's always present — if every row has the same structure, use columns
The migration path#
Start flexible, then promote fields to columns when patterns emerge:
-- Step 1: Field lives in JSONB during prototyping
SELECT metadata->>'loyalty_tier' FROM orders;
-- Step 2: Pattern is stable — promote to a column
ALTER TABLE orders ADD COLUMN loyalty_tier text;
UPDATE orders SET loyalty_tier = metadata->>'loyalty_tier';
ALTER TABLE orders ALTER COLUMN loyalty_tier SET NOT NULL;
-- Step 3: Remove from JSONB
UPDATE orders SET metadata = metadata - 'loyalty_tier';
Performance considerations#
JSONB storage size#
JSONB stores keys in every row. A million rows with {"shipping_method": "standard"} stores the string shipping_method a million times. Long key names waste storage. Use concise keys in high-volume tables.
TOAST compression#
PostgreSQL automatically compresses JSONB values larger than ~2 KB using TOAST (The Oversized-Attribute Storage Technique). Large JSONB documents don't bloat the main table, but they add overhead on read because they require decompression.
Query plan analysis#
Always verify your indexes are used:
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders
WHERE metadata @> '{"priority": "express"}'::jsonb;
Look for Bitmap Index Scan on your GIN index, not Seq Scan.
Avoid extracting in WHERE clauses#
-- Slow: extracts value, then compares (no GIN index)
WHERE metadata->>'priority' = 'express'
-- Fast: containment check (uses GIN index)
WHERE metadata @> '{"priority": "express"}'::jsonb
Avoid deep nesting#
Deeply nested JSONB (5+ levels) is hard to index, hard to query, and hard to debug. Flatten when possible.
jsonb_agg and jsonb_build_object#
Build JSONB results in queries without application-side serialization:
SELECT customer_id,
jsonb_agg(jsonb_build_object(
'order_id', id,
'status', status,
'created', created_at
)) AS recent_orders
FROM orders
GROUP BY customer_id;
The practical takeaway#
JSONB in PostgreSQL gives you document database flexibility with relational guarantees. The implementation strategy:
- Use columns for structured, queryable data — status, foreign keys, timestamps
- Use JSONB for flexible, variable data — metadata, preferences, payloads
- Index with GIN for containment queries, expression indexes for specific paths
- Partial indexes keep index size small when only a subset of rows matters
- Promote JSONB fields to columns when query patterns stabilize
- Always EXPLAIN ANALYZE to confirm your indexes are actually used
Article #452 in the Codelit engineering series. Explore our full library of system design, infrastructure, and architecture guides at codelit.io.
Try it on Codelit
Chaos Mode
Simulate node failures and watch cascading impact across your architecture
AI Architecture Review
Get an AI audit covering security gaps, bottlenecks, and scaling risks
Related articles
API Backward Compatibility: Ship Changes Without Breaking Consumers
6 min read
api designBatch API Endpoints — Patterns for Bulk Operations, Partial Success, and Idempotency
8 min read
system designCircuit Breaker Implementation — State Machine, Failure Counting, Fallbacks, and Resilience4j
7 min read
Try these templates
Multiplayer Game Backend
Real-time multiplayer game server with matchmaking, state sync, leaderboards, and anti-cheat.
8 componentsFigma Collaborative Design Platform
Browser-based design tool with real-time multiplayer editing, component libraries, and developer handoff.
10 componentsPostgreSQL High Availability Cluster
Production PostgreSQL with streaming replication, connection pooling, automated failover, and monitoring.
10 componentsBuild this architecture
Generate an interactive architecture for PostgreSQL JSONB Patterns in seconds.
Try it in Codelit →
Comments