Google Maps System Design: Architecture Behind Real-Time Navigation at Scale
Google Maps serves over a billion users, rendering map tiles, computing routes, estimating arrival times, and indexing hundreds of millions of places. Designing a system like this requires solving problems across rendering, geospatial data, graph algorithms, and distributed infrastructure. This guide breaks down each layer.
High-Level Architecture#
Client (Mobile/Web)
│
├── Map Tile Service ──── Tile Storage (CDN + Object Store)
├── Routing Service ───── Road Graph Engine
├── Traffic Service ────── Real-Time Data Pipeline
├── Place Search Service ── Search Index + POI Database
├── ETA Service ────────── ML Prediction Layer
└── Offline Maps ────────── Packaged Tile Bundles
The system splits into independent services, each optimized for its workload. A shared geospatial index underpins most of them.
Map Tile Rendering#
Maps are not rendered as a single image. The world is divided into tiles at multiple zoom levels; each zoom level doubles the grid resolution in each dimension, quadrupling the tile count.
Zoom 0: 1 tile (whole world)
Zoom 1: 4 tiles (2x2)
Zoom 2: 16 tiles (4x4)
...
Zoom 20: ~1 trillion tiles
Pre-rendering vs dynamic rendering: Base map tiles (roads, terrain, water) are pre-rendered and stored in object storage behind a CDN. Dynamic overlays (traffic, transit) are rendered on the fly or cached with short TTLs.
Vector tiles vs raster tiles: Modern implementations use vector tiles. The server sends geometry data and the client renders it. This reduces bandwidth, enables smooth rotation and 3D views, and allows client-side styling.
Tile addressing: Each tile is identified by (zoom, x, y). The client calculates which tiles are visible based on the viewport and zoom level, then fetches them in parallel.
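This addressing scheme can be sketched with the standard Web Mercator formulas (the same scheme used by most slippy-map implementations; whether Google's internal tile math matches exactly is an assumption):

```python
import math

def latlon_to_tile(lat_deg, lon_deg, zoom):
    """Convert WGS84 coordinates to a (zoom, x, y) tile address (Web Mercator)."""
    n = 2 ** zoom                                        # tiles per axis at this zoom
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return zoom, x, y

def visible_tiles(lat_min, lat_max, lon_min, lon_max, zoom):
    """Tile addresses covering a viewport; y grows southward in this scheme."""
    _, x0, y0 = latlon_to_tile(lat_max, lon_min, zoom)   # top-left corner
    _, x1, y1 = latlon_to_tile(lat_min, lon_max, zoom)   # bottom-right corner
    return [(zoom, x, y) for x in range(x0, x1 + 1) for y in range(y0, y1 + 1)]
```

The client runs something like `visible_tiles` on every pan or zoom, diffs the result against tiles already cached, and fetches only the missing ones in parallel.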
Geospatial Indexing#
Efficient spatial queries are the backbone of the system. Three structures matter most.
Quadtree#
A quadtree recursively divides 2D space into four quadrants. Each node splits when it exceeds a capacity threshold. This adapts naturally to density: urban areas get deeper subdivision, oceans stay shallow.
┌───────┬───────┐
│ │ ┌──┬──┤
│ │ ├──┼──┤
├───┬───┼─┴──┴──┤
│ │ │ │
└───┴───┴───────┘
Use case: nearest-neighbor searches for places, clustering points of interest.
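A minimal point quadtree shows the capacity-split behavior described above. This is an illustrative sketch, not a production structure; the capacity of 4 is arbitrary:

```python
class Quadtree:
    """Point quadtree: a node splits into four children when it exceeds capacity."""
    def __init__(self, cx, cy, half, capacity=4):
        self.cx, self.cy, self.half = cx, cy, half   # square centered at (cx, cy)
        self.capacity = capacity
        self.points = []
        self.children = None

    def insert(self, px, py):
        if abs(px - self.cx) > self.half or abs(py - self.cy) > self.half:
            return False                             # outside this node's square
        if self.children is None:
            self.points.append((px, py))
            if len(self.points) > self.capacity:
                self._split()
            return True
        return any(c.insert(px, py) for c in self.children)

    def _split(self):
        h = self.half / 2
        self.children = [Quadtree(self.cx + dx, self.cy + dy, h, self.capacity)
                         for dx in (-h, h) for dy in (-h, h)]
        pts, self.points = self.points, []
        for px, py in pts:                           # redistribute into quadrants
            any(c.insert(px, py) for c in self.children)

    def query(self, x0, y0, x1, y1):
        """All stored points inside the rectangle [x0, x1] x [y0, y1]."""
        if x1 < self.cx - self.half or x0 > self.cx + self.half or \
           y1 < self.cy - self.half or y0 > self.cy + self.half:
            return []                                # prune subtrees that cannot match
        out = [(px, py) for px, py in self.points if x0 <= px <= x1 and y0 <= py <= y1]
        if self.children:
            for c in self.children:
                out += c.query(x0, y0, x1, y1)
        return out
```

Dense clusters of inserts drive repeated splits in one quadrant while empty quadrants stay as single shallow nodes, which is exactly the urban-vs-ocean adaptivity mentioned above.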
R-Tree#
R-trees index rectangles (bounding boxes). They group nearby objects into minimum bounding rectangles at each tree level. R-trees excel at range queries like "find all restaurants within this viewport."
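The core R-tree trick, testing a group's minimum bounding rectangle before looking at its members, can be shown with a flat two-level sketch (a real R-tree nests these groups recursively and balances them on insert):

```python
from typing import NamedTuple

class Rect(NamedTuple):
    x0: float
    y0: float
    x1: float
    y1: float

    def intersects(self, other: "Rect") -> bool:
        # Two rectangles overlap unless one is entirely left/right/above/below the other.
        return not (self.x1 < other.x0 or other.x1 < self.x0 or
                    self.y1 < other.y0 or other.y1 < self.y0)

def mbr(rects):
    """Minimum bounding rectangle enclosing a group of rectangles."""
    return Rect(min(r.x0 for r in rects), min(r.y0 for r in rects),
                max(r.x1 for r in rects), max(r.y1 for r in rects))

def range_query(groups, viewport):
    """Test each group's MBR first; only groups whose MBR overlaps are opened."""
    hits = []
    for box, members in groups:
        if box.intersects(viewport):                 # prune the whole group on a miss
            hits += [m for m in members if m.intersects(viewport)]
    return hits
```

For a viewport query like "restaurants on screen", most groups fail the single MBR test and their members are never touched.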
S2 Geometry (S2 Cells)#
Google's S2 library projects the Earth's surface onto a cube, then uses a Hilbert curve to map 2D regions to 1D cell IDs. Each cell ID is a 64-bit integer.
Earth surface → Cube projection → Hilbert curve → Cell ID (uint64)
Why S2 wins at scale: Cell IDs are integers, so they work with standard B-tree indexes in databases. Covering a region means finding a set of cell ID ranges. Containment and intersection checks become integer comparisons. Google uses S2 cells extensively across Maps, Earth, and other geo products.
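The integer-comparison claim can be illustrated with the bit tricks S2's cell ids are built around: the lowest set bit of an id encodes its level, so parent lookup and containment are pure integer arithmetic. The id below is a hypothetical odd leaf value; a real S2 id also encodes one of six cube faces in its top bits:

```python
def lsb(cell_id: int) -> int:
    return cell_id & -cell_id            # lowest set bit encodes the cell's level

def parent(cell_id: int) -> int:
    new_lsb = lsb(cell_id) << 2          # one level up = two fewer position bits
    return (cell_id & -new_lsb) | new_lsb

def contains(a: int, b: int) -> bool:
    """True if cell a contains cell b: a single integer range comparison."""
    return a - (lsb(a) - 1) <= b <= a + (lsb(a) - 1)
```

Because containment reduces to `range_min <= id <= range_max`, covering a region becomes a handful of `BETWEEN` scans on an ordinary B-tree index.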
Routing Algorithms#
Dijkstra's Algorithm#
The baseline. Explores nodes in order of increasing distance from the source. Guarantees the shortest path but explores too many nodes for continent-scale graphs.
A* Search#
Adds a heuristic (typically haversine distance to the destination) to prioritize exploration toward the target. Faster than Dijkstra in practice, but still too slow for cross-country routes on a raw road graph.
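A compact A* over a toy road graph, using haversine as the heuristic exactly as described. Edge weights are made-up road distances in km; they must be at least the straight-line distance for the heuristic to stay admissible:

```python
import heapq
import math

def haversine(a, b):
    """Great-circle distance in km between (lat, lon) pairs in degrees."""
    R = 6371.0
    la1, lo1, la2, lo2 = map(math.radians, (*a, *b))
    h = math.sin((la2 - la1) / 2) ** 2 + \
        math.cos(la1) * math.cos(la2) * math.sin((lo2 - lo1) / 2) ** 2
    return 2 * R * math.asin(math.sqrt(h))

def a_star(graph, coords, source, target):
    """graph: {node: [(neighbor, km)]}; coords: {node: (lat, lon)}."""
    best = {source: 0.0}
    pq = [(haversine(coords[source], coords[target]), source)]
    while pq:
        _, u = heapq.heappop(pq)
        if u == target:
            return best[u]
        for v, w in graph.get(u, []):
            d = best[u] + w
            if d < best.get(v, math.inf):
                best[v] = d
                # priority = cost so far + straight-line estimate to the target
                heapq.heappush(pq, (d + haversine(coords[v], coords[target]), v))
    return math.inf
```

The heuristic term biases the frontier toward the destination, so far fewer nodes are settled than with plain Dijkstra on the same graph.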
Contraction Hierarchies#
The real workhorse for production routing. The road graph is preprocessed by iteratively contracting less important nodes and adding shortcut edges.
Preprocessing:
1. Rank all nodes by importance (highways > local roads)
2. Contract low-rank nodes, add shortcuts
3. Repeat until hierarchy is built
Query time:
1. Bidirectional search: forward from source, backward from target
2. Only explore edges going "up" in hierarchy
3. Meet in the middle at a highway-level node
Query time drops from seconds to milliseconds. The tradeoff is preprocessing time and storage for shortcut edges.
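The query phase above can be sketched assuming preprocessing has already assigned ranks and added shortcuts. In the toy graph below, node 1 was contracted, leaving shortcut 0→2; for a symmetric graph the same adjacency can serve both search directions:

```python
import heapq
import math

def upward_dijkstra(edges, rank, source):
    """Settle distances following only edges that go up in the hierarchy."""
    dist = {source: 0.0}
    pq = [(0.0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue
        for v, w in edges.get(u, []):
            if rank[v] > rank[u] and d + w < dist.get(v, math.inf):
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist

def ch_query(edges, redges, rank, s, t):
    fwd = upward_dijkstra(edges, rank, s)      # forward search from the source
    bwd = upward_dijkstra(redges, rank, t)     # backward search on reversed edges
    # the shortest path meets at some node settled by both upward searches
    return min((fwd[v] + bwd[v] for v in fwd.keys() & bwd.keys()), default=math.inf)
```

Both searches climb toward high-rank (highway-level) nodes, so each explores a tiny cone of the graph; correctness relies on the shortcuts preserving all shortest-path distances.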
Multi-Modal Routing#
Walking, cycling, transit, and driving each use different graph overlays. Transit routing requires time-dependent graphs where edge weights change by schedule. The system composes legs across modes using transfer points.
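Time-dependent edges can be modeled as cost functions of arrival time. The sketch below (hypothetical schedule values, minutes as the time unit) shows a scheduled transit edge next to a constant-cost walking edge inside a small earliest-arrival search:

```python
import heapq
import math

def transit_edge_cost(departures, ride_minutes, arrive_at):
    """Cost of a scheduled edge when reaching the stop at minute `arrive_at`:
    wait for the next departure, then ride."""
    future = [d for d in departures if d >= arrive_at]
    if not future:
        return math.inf                      # no more service today
    return (min(future) - arrive_at) + ride_minutes

def earliest_arrival(graph, source, start):
    """graph: {stop: [(next_stop, cost_fn)]}; cost_fn(t) = minutes to traverse
    an edge entered at minute t."""
    best = {source: start}
    pq = [(start, source)]
    while pq:
        t, u = heapq.heappop(pq)
        if t > best.get(u, math.inf):
            continue
        for v, cost_fn in graph.get(u, []):
            t2 = t + cost_fn(t)              # weight depends on when you arrive
            if t2 < best.get(v, math.inf):
                best[v] = t2
                heapq.heappush(pq, (t2, v))
    return best
```

Composing modes then amounts to mixing edge types in one graph: walking edges with constant costs, transit edges with schedule-driven costs, joined at transfer points.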
Real-Time Traffic#
Traffic data comes from multiple sources: GPS traces from phones, road sensors, incident reports, and historical patterns.
GPS Probes → Kafka → Map Matching → Speed Aggregation ─┬─→ Traffic Tiles
                                                       └─→ Route Weight Updates
Map matching: Raw GPS coordinates are noisy. A Hidden Markov Model snaps GPS traces to the most likely road segments.
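A toy Viterbi pass shows the HMM idea. Each segment is reduced to a single reference point (a real matcher measures distance to the segment's polyline), the transition model is a flat penalty for switching segments, and `sigma`/`beta` are made-up tuning constants:

```python
import math

def viterbi_match(points, segments, sigma=10.0, beta=5.0):
    """Snap a GPS trace to the most likely sequence of road segments.
    segments: {segment_id: (x, y) reference location}."""
    def emission(pt, seg):
        d = math.dist(pt, segments[seg])
        return -(d * d) / (2 * sigma * sigma)   # log-Gaussian, constant dropped
    states = list(segments)
    score = {s: emission(points[0], s) for s in states}
    back = []
    for pt in points[1:]:
        prev, score, col = score, {}, {}
        for s in states:
            # best predecessor: stay on the same segment free, pay beta to switch
            b, r = max((prev[r] - (0.0 if r == s else beta), r) for r in states)
            score[s] = b + emission(pt, s)
            col[s] = r
        back.append(col)
    path = [max(score, key=score.get)]          # backtrack the best path
    for col in reversed(back):
        path.append(col[path[-1]])
    return path[::-1]
```

The transition penalty is what makes the output robust: a single noisy fix near the wrong road is absorbed rather than causing a spurious segment switch.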
Speed aggregation: For each road segment, the system computes current average speed from recent probes. Segments without fresh data fall back to historical averages for that time of day and day of week.
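The fallback logic is simple enough to state directly. This sketch assumes a minimum probe count (the threshold of 3 is arbitrary) below which the segment is considered stale:

```python
def segment_speed(probes, historical, segment, hour, dow, min_probes=3):
    """Current speed for a segment; fall back to the historical average
    for this hour-of-day / day-of-week when fresh probes are too sparse."""
    fresh = probes.get(segment, [])
    if len(fresh) >= min_probes:
        return sum(fresh) / len(fresh)           # live average from recent probes
    return historical[segment][(hour, dow)]      # e.g. "Tuesdays at 8am"
```
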
Traffic tiles: Color-coded overlays (green/yellow/red) are generated from segment speeds and served as dynamic tile layers with short cache TTLs.
Route weight updates: The routing engine periodically refreshes edge weights in the road graph based on current traffic. Live rerouting uses these updated weights.
ETA Estimation#
ETA is not just distance divided by speed. Google uses ML models trained on billions of historical trips.
Features include: current traffic speeds, historical speed patterns, time of day, day of week, weather, road type, number of turns, traffic signal density, and real-time incident data.
Model architecture: A combination of graph neural networks (capturing road network structure) and sequence models (capturing temporal patterns). The model predicts travel time for each road segment, then sums across the route.
Accuracy matters: Even small percentage improvements in ETA accuracy affect user trust and downstream products like ride-sharing pricing.
Place Search#
Place search combines text search with geospatial filtering.
Query: "coffee near me"
│
├── Text matching: tokenize, fuzzy match against place names/categories
├── Geo filtering: S2 cell range scan around user location
├── Ranking: relevance + distance + ratings + popularity + freshness
└── Result: ranked list of places with metadata
Index structure: An inverted index for text terms, cross-referenced with S2 cell IDs. A query first narrows by text match, then filters by geographic proximity using cell ID ranges.
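The narrow-then-filter flow can be sketched in a few lines. Cell ids and the covering ranges here are hypothetical small integers standing in for 64-bit S2 ids:

```python
def search(inverted, place_cells, query_terms, cell_ranges):
    """Intersect text postings, then keep places whose cell id falls inside
    one of the covering ranges for the query area."""
    postings = [inverted.get(t, set()) for t in query_terms]
    candidates = set.intersection(*postings) if postings else set()
    return sorted(p for p in candidates
                  if any(lo <= place_cells[p] <= hi for lo, hi in cell_ranges))
```

In a real system the geographic filter runs as index range scans rather than a Python loop, but the shape is the same: cheap set intersection on terms, then integer range checks on cell ids.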
Autocomplete: A prefix trie with geo-aware scoring. As the user types, candidates are ranked by both string similarity and proximity to the current viewport.
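A minimal geo-aware trie. For brevity this version stores every completion at every prefix node and ranks by proximity alone; a production system caps candidates per node and blends in string similarity and popularity:

```python
import math

class TrieNode:
    def __init__(self):
        self.children = {}
        self.places = []             # (name, (lat, lon)) completions under this prefix

def build_trie(places):
    root = TrieNode()
    for name, loc in places:
        node = root
        for ch in name.lower():
            node = node.children.setdefault(ch, TrieNode())
            node.places.append((name, loc))
    return root

def autocomplete(root, prefix, viewport_center, k=3):
    """Walk to the prefix node, then rank its completions by proximity."""
    node = root
    for ch in prefix.lower():
        if ch not in node.children:
            return []
        node = node.children[ch]
    ranked = sorted(node.places, key=lambda p: math.dist(p[1], viewport_center))
    return [name for name, _ in ranked[:k]]
```
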
Offline Maps#
Users download map regions for use without connectivity.
Packaging: The system precomputes tile bundles for predefined regions (cities, states). Each bundle includes vector tiles across zoom levels, a compressed road graph for routing, and POI data.
Storage optimization: Delta updates let users refresh offline maps without re-downloading everything. Only changed tiles and graph segments are transferred.
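One way to compute such a delta, assuming each tile carries a version hash (the hash values and tile keys below are made up):

```python
def delta_update(local_versions, remote_versions):
    """Diff two {tile: version_hash} maps.
    Fetch anything new or changed upstream; delete anything removed."""
    fetch = [t for t, v in remote_versions.items() if local_versions.get(t) != v]
    stale = [t for t in local_versions if t not in remote_versions]
    return fetch, stale
```
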
Routing offline: A simplified contraction hierarchy is bundled with the offline package. It covers the downloaded region with boundary nodes connecting to neighboring regions.
Download package:
├── Vector tiles (zoom 0-16)
├── Road graph (contraction hierarchy)
├── POI database (compressed)
└── Metadata + version info
Scaling Strategies#
CDN-first for tiles: The vast majority of tile requests hit CDN cache. Only cold or dynamic tiles reach origin servers.
Sharding the road graph: The global road graph is partitioned by geographic region. Cross-region queries are handled by hierarchical routing that stitches regional results.
Read replicas for place data: Place search is read-heavy. Multiple replicas serve queries while writes go through a single primary.
Async processing for traffic: Traffic data flows through a streaming pipeline. Map matching and speed aggregation run as stream processors, decoupled from serving.
Regional deployment: Services are deployed in multiple regions. Users connect to the nearest region for low latency. Geo-data is replicated where needed but partitioned to avoid full global replication.
Key Numbers to Know#
| Metric | Approximate Value |
|---|---|
| Monthly active users | 1 billion+ |
| Road graph edges (global) | ~1 billion |
| Tile requests per day | Hundreds of billions |
| GPS probes per second | Millions |
| Place database size | 200M+ places |
| Routing query latency | < 200ms (p99) |
Summary#
Google Maps system design is a composition of specialized systems: tile rendering for visualization, geospatial indexing for spatial queries, contraction hierarchies for fast routing, streaming pipelines for traffic, ML models for ETA, and careful caching and sharding for scale. Each piece solves a distinct problem, and the architecture keeps them loosely coupled.
Build and visualize system designs like this on codelit.io — the developer-first diagramming and architecture tool.
This is article #189 in the Codelit engineering blog series.