Ticket Booking System Design: Handling Concurrency, Payments, and Flash Sales
Ticket Booking System Design#
Ticket booking systems face a unique combination of challenges: millions of users competing for limited inventory, payments that can fail mid-flow, and bots trying to grab everything. This guide covers the architecture decisions that make a booking system reliable under extreme load.
High-Level Architecture#
Client (Web/Mobile)
│
├── API Gateway ──────── Rate Limiter + Auth
├── Seat Map Service ──── Venue/Event Catalog
├── Reservation Service ── Lock Manager + TTL Store
├── Payment Service ────── Payment Gateway Integration
├── Order Service ──────── Event Store + Projections
├── Waitlist Service ───── Priority Queue
└── Notification Service ── Email / Push / SMS
Each service owns its data and communicates through async events where possible, with synchronous calls only on the critical booking path.
Seat Selection and Inventory#
The seat map is the core data model. Each event has a venue layout with sections, rows, and individual seats.
Event
└── Venue
    └── Section (e.g., "Orchestra Left")
        └── Row (e.g., "Row A")
            └── Seat (e.g., "Seat 12")
                ├── status: available | held | reserved | sold
                ├── price_tier: string
                └── held_until: timestamp | null
Seat states:
- Available: Open for selection.
- Held: Temporarily locked while a user is in the checkout flow. Expires after a TTL (typically 5-10 minutes).
- Reserved: Payment initiated. Locked until payment confirms or times out.
- Sold: Payment confirmed. Final state.
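The four states above form a small state machine. A minimal sketch follows, assuming the transition table shown here; the article does not spell out every edge, so `ALLOWED_TRANSITIONS` and `transition` are illustrative names and the edges are one reasonable reading of the list.

```python
from enum import Enum


class SeatStatus(Enum):
    AVAILABLE = "available"
    HELD = "held"
    RESERVED = "reserved"
    SOLD = "sold"


# Assumed edges: "sold" is final, so it has no outgoing transitions.
# held -> available models hold expiry; reserved -> available models a
# payment that failed or timed out.
ALLOWED_TRANSITIONS = {
    SeatStatus.AVAILABLE: {SeatStatus.HELD},
    SeatStatus.HELD: {SeatStatus.RESERVED, SeatStatus.AVAILABLE},
    SeatStatus.RESERVED: {SeatStatus.SOLD, SeatStatus.AVAILABLE},
    SeatStatus.SOLD: set(),
}


def transition(current: SeatStatus, target: SeatStatus) -> SeatStatus:
    """Return the new status, or raise if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Keeping the table in one place makes it harder for a stray code path to move a sold seat back into inventory.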
General admission: For events without assigned seats, inventory is a counter rather than individual seat records. Atomic decrement operations replace seat-level locking.
Concurrency Control#
The hardest problem. Thousands of users may try to book the same seat simultaneously.
Optimistic Locking#
Each seat record has a version number. To reserve a seat:
1. Read seat record (version = 5, status = available)
2. Attempt update: SET status = 'held', version = 6
   WHERE seat_id = X AND version = 5
3. If rows_affected = 0 → conflict, seat was taken
4. If rows_affected = 1 → success, seat is held
Optimistic locking works well when contention is moderate. No external lock infrastructure is needed. The database handles atomicity.
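The versioned update can be tried out with nothing more than SQLite. This is a sketch under assumptions: the `seats` table, its columns, and the helper names `make_db` and `try_hold` are made up for illustration, and a production system would use its own schema and database.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory table with one available seat (illustrative schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE seats (seat_id TEXT PRIMARY KEY, status TEXT, version INTEGER)"
    )
    conn.execute("INSERT INTO seats VALUES ('A12', 'available', 5)")
    conn.commit()
    return conn


def try_hold(conn: sqlite3.Connection, seat_id: str, seen_version: int) -> bool:
    """Compare-and-set: succeeds only if the row is unchanged since it was read."""
    cur = conn.execute(
        "UPDATE seats SET status = 'held', version = version + 1 "
        "WHERE seat_id = ? AND version = ? AND status = 'available'",
        (seat_id, seen_version),
    )
    conn.commit()
    # rowcount is 1 when our version matched, 0 when someone else got there first.
    return cur.rowcount == 1
```

Two callers that both read version 5 can both call `try_hold(conn, "A12", 5)`; only the first update matches a row, and the second sees zero rows affected and reports a conflict.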
Distributed Locks (Redis / ZooKeeper)#
For high-contention scenarios like flash sales, optimistic locking generates too many retries. A distributed lock serializes access.
1. ACQUIRE lock on seat_id (SET seat:123 <unique_token> NX EX 10)
2. If acquired:
   a. Check seat status
   b. If available → mark as held
   c. RELEASE lock, but only if it still holds <unique_token>
3. If not acquired → seat is being processed, retry or fail fast
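Lock TTL is critical. If a service crashes while holding a lock, the TTL ensures the lock auto-releases. Set it long enough to cover the slowest expected operation but short enough that a crashed holder does not block the seat for long. Because a lock can expire while its holder is still working, keep the database update itself conditional (for example, on status = available) so a late writer cannot overwrite a newer hold.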
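To make the acquire/release and TTL behavior concrete, here is a single-process sketch. `InMemoryLockStore` is a hypothetical stand-in for a real lock server such as Redis, not a client for one. It also assumes each acquisition gets a random token so that only the current owner can release the lock, and it takes an injected clock so expiry can be exercised without sleeping.

```python
import time
import uuid
from typing import Callable, Dict, Optional, Tuple


class InMemoryLockStore:
    """Stand-in for a lock server: set-if-absent with expiry, compare-and-delete."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._locks: Dict[str, Tuple[str, float]] = {}  # key -> (token, expires_at)

    def acquire(self, key: str, ttl_seconds: float) -> Optional[str]:
        """Return a fresh token if the lock was free or expired, else None."""
        now = self._clock()
        held = self._locks.get(key)
        if held is not None and held[1] > now:
            return None  # someone else holds an unexpired lock
        token = uuid.uuid4().hex
        self._locks[key] = (token, now + ttl_seconds)
        return token

    def release(self, key: str, token: str) -> bool:
        """Delete the lock only if it is unexpired and the caller still owns it."""
        held = self._locks.get(key)
        if held is None or held[0] != token or held[1] <= self._clock():
            return False
        del self._locks[key]
        return True
```

The token check matters after a TTL expiry: if holder A stalls, the lock expires, and holder B acquires it, then A's late release must not delete B's lock. This sketch is not thread-safe and is meant only to show the semantics.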
Hybrid Approach#
Use optimistic locking for normal traffic and switch to distributed locks during detected flash-sale conditions based on request rate monitoring.
Reservation Expiry#
A held seat must expire if the user abandons checkout. Without expiry, inventory leaks.
Hold Flow:
User selects seat → status = held, held_until = now + 8min
│
├── User completes payment → status = sold
├── User abandons → TTL expires → status = available
└── Background sweeper checks held_until < now() every 30s
Two expiry mechanisms for reliability:
- Lazy expiry: On any read, check if held_until has passed. If so, treat as available.
- Active sweeper: A background job scans for expired holds and resets them. Catches cases where no one reads the seat before expiry matters.
Redis sorted sets work well here: score = expiry timestamp, member = seat ID. A worker polls ZRANGEBYSCORE for expired entries.
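Both expiry mechanisms fit in one small class. The sketch below is an in-memory approximation: a min-heap keyed by expiry time plays the role the sorted set would play in Redis, and `HoldTable` with its method names is hypothetical. Time is passed in explicitly so the behavior is easy to follow.

```python
import heapq
from typing import Dict, List, Tuple


class HoldTable:
    """Tracks held seats with lazy expiry on read plus an active sweep."""

    def __init__(self) -> None:
        self._held_until: Dict[str, float] = {}        # seat_id -> expiry time
        self._by_expiry: List[Tuple[float, str]] = []  # min-heap of (expiry, seat_id)

    def is_available(self, seat_id: str, now: float) -> bool:
        """Lazy expiry: a hold whose deadline has passed counts as available."""
        expires = self._held_until.get(seat_id)
        return expires is None or expires < now

    def hold(self, seat_id: str, now: float, ttl: float) -> bool:
        """Place a hold if the seat is free (or its old hold has lapsed)."""
        if not self.is_available(seat_id, now):
            return False
        expires = now + ttl
        self._held_until[seat_id] = expires
        heapq.heappush(self._by_expiry, (expires, seat_id))
        return True

    def sweep(self, now: float) -> List[str]:
        """Active sweeper: release every hold with held_until < now."""
        released = []
        while self._by_expiry and self._by_expiry[0][0] < now:
            expires, seat_id = heapq.heappop(self._by_expiry)
            # Skip stale heap entries left behind by a newer hold on the same seat.
            if self._held_until.get(seat_id) == expires:
                del self._held_until[seat_id]
                released.append(seat_id)
        return released
```

A background worker would call `sweep` on a timer (the flow above suggests every 30 seconds) and hand the released seat IDs to whatever restores inventory or notifies the waitlist.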
Payment Timeout#
Payment is the most failure-prone step. Networks fail, gateways time out, users close browsers.
Payment State Machine:
pending → processing → succeeded
                     → failed
                     → timed_out
Timeout handling:
- When payment is initiated, set a reservation TTL (e.g., 10 minutes).
- If the payment gateway does not respond within the timeout, mark the payment as timed_out.
- Release the seat back to available inventory.
- If a late payment confirmation arrives, issue an automatic refund.
Idempotency keys: Every payment request includes a unique idempotency key. If the client retries due to a timeout, the gateway deduplicates and returns the original result instead of charging twice.
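The deduplication behavior is easy to demonstrate with a fake gateway. Everything in this sketch is hypothetical: `FakeGateway`, `ChargeResult`, and `on_gateway_confirmation` are illustrative names, and a real payment provider defines its own idempotency API.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ChargeResult:
    transaction_id: str
    amount: int
    status: str


class FakeGateway:
    """Hypothetical gateway that deduplicates charges by idempotency key."""

    def __init__(self) -> None:
        self._results: Dict[str, ChargeResult] = {}
        self.charges_made = 0

    def charge(self, idempotency_key: str, amount: int) -> ChargeResult:
        # A retry with a known key returns the stored result; no second charge.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges_made += 1
        result = ChargeResult(f"txn_{self.charges_made}", amount, "succeeded")
        self._results[idempotency_key] = result
        return result


def on_gateway_confirmation(payment_status: str) -> str:
    """Decide what to do when a success confirmation arrives for a payment."""
    # If we already marked the payment timed_out, the seat has been released,
    # so the late confirmation leads to a refund instead of a ticket.
    return "refund" if payment_status == "timed_out" else "issue_ticket"
```

The client should generate the key once per purchase attempt and reuse it on every retry of that attempt; generating a new key per retry defeats the deduplication.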
Waitlist#
When an event sells out, users join a waitlist. When seats become available (cancellations, expired holds), waitlisted users get notified.
Waitlist Queue:
┌──────────────────────────────┐
│ User A (joined 10:01) ─ next │──→ Seat released → Notify + auto-hold
│ User B (joined 10:03) │
│ User C (joined 10:05) │
└──────────────────────────────┘
Implementation: A priority queue ordered by join time. When a seat is released, dequeue the next user and create a time-limited hold. If that user does not complete purchase within the window, move to the next.
Fairness: The queue is strictly FIFO. No user can jump ahead regardless of how many times they refresh.
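A FIFO waitlist with time-limited offers can be sketched with a deque. The `Waitlist` class and its method names are hypothetical, and the sketch assumes that joining twice keeps the original position, which is one way to read the fairness rule.

```python
from collections import deque
from typing import Deque, Optional, Tuple


class Waitlist:
    """FIFO waitlist: the earliest joiner is always offered the next seat."""

    def __init__(self, offer_window: float) -> None:
        self._queue: Deque[str] = deque()
        self._offer_window = offer_window

    def join(self, user_id: str) -> None:
        # Joining again does not create a second place in line.
        # (Linear scan; fine for a sketch, a real queue would index members.)
        if user_id not in self._queue:
            self._queue.append(user_id)

    def offer_next(self, seat_id: str, now: float) -> Optional[Tuple[str, str, float]]:
        """Pop the head of the queue and return (user, seat, hold deadline)."""
        if not self._queue:
            return None
        user_id = self._queue.popleft()
        return (user_id, seat_id, now + self._offer_window)
```

When an offer's deadline passes without a purchase, the caller invokes `offer_next` again for the same seat, which moves the offer to the next user in line.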
Flash Sale Scalability#
Flash sales (concert tickets going on sale at noon) create extreme traffic spikes. The system must handle 100x normal load for a few minutes.
Request Queuing#
Instead of letting all requests hit the booking service simultaneously, funnel them through a queue.
Clients → API Gateway → Virtual Waiting Room → Booking Queue → Reservation Service
Virtual waiting room: Users are assigned a random position in a queue. The frontend polls for their turn. This converts a thundering herd into a controlled stream.
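Here is a rough sketch of the waiting room idea, assuming users who arrive before the sale opens are shuffled once and then admitted in fixed-size batches. `WaitingRoom`, `admit_batch`, and `is_admitted` are hypothetical names, and a real deployment would keep this state in a shared store, not in one process.

```python
import random
from typing import Dict, List, Optional


class WaitingRoom:
    """Shuffles waiting users into a queue, then admits them in batches."""

    def __init__(self, user_ids: List[str], seed: Optional[int] = None) -> None:
        order = list(user_ids)
        random.Random(seed).shuffle(order)  # random position, not arrival order
        self._order = order
        self._position: Dict[str, int] = {u: i for i, u in enumerate(order)}
        self._admitted = 0  # users with position < _admitted may proceed

    def admit_batch(self, size: int) -> List[str]:
        """Called on a timer; lets the next `size` users through."""
        batch = self._order[self._admitted:self._admitted + size]
        self._admitted += len(batch)
        return batch

    def is_admitted(self, user_id: str) -> bool:
        """What the frontend polls to learn whether its turn has come."""
        return self._position[user_id] < self._admitted
```

The batch size and timer interval together cap the request rate that reaches the booking queue, regardless of how many users are waiting.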
Inventory Caching#
Pre-load available seat counts into Redis. Decrement atomically with DECR. Only users who successfully decrement proceed to the actual booking flow. This filters out the majority of requests before they hit the database.
1. DECR available_seats:{event_id}
2. If result >= 0 → proceed to seat selection
3. If result < 0 → sold out, reject immediately (INCR to correct)
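The three steps above can be sketched in-process. `SeatCounter` is a hypothetical stand-in for the Redis counter, with a mutex providing the atomicity that DECR and INCR provide on the server; `try_enter_booking` is likewise an illustrative name.

```python
import threading


class SeatCounter:
    """In-process stand-in for an atomic counter such as Redis DECR/INCR."""

    def __init__(self, available: int) -> None:
        self._value = available
        self._lock = threading.Lock()

    def decr(self) -> int:
        with self._lock:
            self._value -= 1
            return self._value

    def incr(self) -> int:
        with self._lock:
            self._value += 1
            return self._value

    def value(self) -> int:
        with self._lock:
            return self._value


def try_enter_booking(counter: SeatCounter) -> bool:
    """Admit the caller only if a seat was still left after the decrement."""
    if counter.decr() >= 0:
        return True
    counter.incr()  # undo the overshoot so the counter does not drift negative
    return False
```

With 100 seats and far more concurrent callers, exactly 100 calls return True and the counter settles back at zero. Seats released later (expired holds, failed payments) would need a matching increment to re-open the gate.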
Database Sharding#
Shard seat inventory by event ID. Each event's data lives on a single shard, avoiding cross-shard transactions. Hot events get dedicated resources.
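One simple way to route by event ID is a stable hash with an override table for hot events. This is a sketch, not a prescription: `shard_for_event`, `route`, the `shard-N` naming, and the `dedicated` mapping are all made up for illustration.

```python
import hashlib
from typing import Dict


def shard_for_event(event_id: str, num_shards: int) -> int:
    """Map an event ID to a shard index using a stable hash."""
    # hashlib gives the same answer in every process, unlike Python's built-in
    # hash(), which is randomized per process for strings.
    digest = hashlib.sha256(event_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def route(event_id: str, num_shards: int, dedicated: Dict[str, str]) -> str:
    """Send hot events to their dedicated database, everything else by hash."""
    if event_id in dedicated:
        return dedicated[event_id]
    return f"shard-{shard_for_event(event_id, num_shards)}"
```

Plain modulo hashing reshuffles most events when `num_shards` changes; systems that resize often tend to use consistent hashing or a lookup table instead.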
Auto-Scaling#
Pre-warm infrastructure before announced sale times. Scale API servers, database read replicas, and cache nodes based on expected demand.
Event Sourcing for Bookings#
Instead of storing only the current state of a booking, store every state change as an immutable event.
Event Stream for Booking #456:
1. SeatSelected { seat: "A12", user: "u789", at: "10:00:01" }
2. HoldCreated { seat: "A12", expires: "10:08:01" }
3. PaymentInitiated { amount: 150, gateway: "stripe" }
4. PaymentSucceeded { transaction_id: "txn_abc" }
5. TicketIssued { ticket_id: "tkt_xyz", qr_code: "..." }
Benefits:
- Complete audit trail: Every action is recorded. Dispute resolution becomes straightforward.
- Temporal queries: Reconstruct the state of any booking at any point in time.
- Event replay: Rebuild read models or fix bugs by replaying events through corrected logic.
- Decoupled projections: Different services consume events to build their own views (analytics, reporting, notifications).
Projections: Materialized views built from the event stream. The seat availability view, the user's booking history, and the revenue dashboard are all projections of the same event stream.
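To illustrate replay and projections, the sketch below folds an event stream like the one for Booking #456 into a current state and a revenue figure. The event names come from the example stream; the status labels (`selected`, `held`, and so on) and the function names are assumptions made for this sketch.

```python
from typing import Any, Dict, List, Tuple

Event = Tuple[str, Dict[str, Any]]  # (event type, payload)


def booking_state(events: List[Event]) -> Dict[str, Any]:
    """Fold a booking's event stream into its current state."""
    state: Dict[str, Any] = {"status": "new"}
    for kind, data in events:
        if kind == "SeatSelected":
            state.update(status="selected", seat=data["seat"], user=data["user"])
        elif kind == "HoldCreated":
            state.update(status="held", hold_expires=data["expires"])
        elif kind == "PaymentInitiated":
            state.update(status="payment_pending", amount=data["amount"])
        elif kind == "PaymentSucceeded":
            state.update(status="paid", transaction_id=data["transaction_id"])
        elif kind == "TicketIssued":
            state.update(status="ticketed", ticket_id=data["ticket_id"])
    return state


def revenue(events: List[Event]) -> int:
    """A second projection over the same stream: money actually collected."""
    total, pending = 0, 0
    for kind, data in events:
        if kind == "PaymentInitiated":
            pending = data["amount"]
        elif kind == "PaymentSucceeded":
            total += pending
            pending = 0
    return total
```

A temporal query is just a fold over a prefix: `booking_state(events[:2])` gives the booking as it stood right after the hold was created.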
Anti-Bot Measures#
Bots and scalpers are a constant threat. A layered defense is necessary.
Rate Limiting#
Layer 1: IP-based rate limiting at the API gateway (e.g., 10 requests/second)
Layer 2: User-based rate limiting (e.g., 5 seat holds per user per event)
Layer 3: Device fingerprinting to catch distributed bot networks
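For Layer 1, a token bucket is one common way to enforce a requests-per-second limit. The article does not name an algorithm, so treat this as one possible choice; `TokenBucket` is an illustrative name, and a gateway would keep one bucket per IP or per user.

```python
class TokenBucket:
    """Allows bursts up to `capacity` and refills at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float, now: float = 0.0) -> None:
        self.rate = rate
        self.capacity = capacity
        self._tokens = capacity
        self._last = now

    def allow(self, now: float) -> bool:
        """Spend one token if available; `now` is a timestamp in seconds."""
        elapsed = max(0.0, now - self._last)
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = max(self._last, now)
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False
```

With `TokenBucket(rate=10, capacity=10)`, a client can send ten requests at once, then has to wait for tokens to refill at ten per second before more are accepted.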
CAPTCHA and Proof-of-Work#
Insert a CAPTCHA challenge before seat selection during high-demand events. Alternatively, require a lightweight proof-of-work computation that is trivial for a single browser but expensive for bot farms running thousands of sessions.
Behavioral Analysis#
Monitor session patterns in real-time:
- Time between page load and seat selection (too fast = bot)
- Mouse movement patterns (absent = headless browser)
- Request header consistency (missing or uniform headers = scripted)
Flag suspicious sessions and route them to a challenge flow or block them entirely.
Purchase Limits#
Enforce per-user and per-household purchase limits. Cross-reference payment methods, shipping addresses, and account creation dates to detect scalper accounts.
Key Numbers to Know#
| Metric | Approximate Value |
|---|---|
| Seat hold TTL | 5-10 minutes |
| Payment timeout | 10 minutes |
| Flash sale peak QPS | 100K-500K |
| Reservation DB write latency | < 50ms (p99) |
| Waitlist notification latency | < 5 seconds |
| Bot traffic during flash sales | 50-80% of total |
Summary#
A ticket booking system design revolves around inventory integrity under concurrency. Optimistic locking handles normal load while distributed locks manage flash sales. Reservation expiry prevents inventory leaks. Event sourcing provides auditability and flexibility. Anti-bot measures protect fairness. The virtual waiting room pattern transforms uncontrollable spikes into manageable throughput.
Design and diagram booking system architectures on codelit.io — the developer-first diagramming and architecture tool.
This is article #190 in the Codelit engineering blog series.