API Gateway Design Patterns — Routing, Rate Limiting, and Beyond
The front door of every modern system#
Every request to your backend passes through something — whether it's a reverse proxy, a load balancer, or a full API gateway. The difference matters.
An API gateway sits between your clients and your services. It handles cross-cutting concerns like authentication, rate limiting, request routing, and protocol translation so your services don't have to.
If you've ever wondered why companies like Netflix, Stripe, and Uber invest heavily in their gateway layer, this post explains why.
Why not just call services directly?#
Without a gateway, every client needs to:
- Know the address of every service
- Handle authentication on every request
- Deal with different protocols (REST, gRPC, WebSocket)
- Implement retry logic and circuit breaking
- Manage API versioning per service
That's a lot of responsibility pushed to the client. An API gateway centralizes all of this.
Core gateway patterns#
1. Request routing#
The most basic pattern. The gateway maps public URLs to internal services:
GET /api/users/* → User Service
GET /api/orders/* → Order Service
GET /api/products/* → Product Service
This decouples your public API from your internal service topology. You can split, merge, or migrate services without changing client code.
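As a rough illustration of the mapping above, here is a minimal prefix-routing sketch. The upstream host names and the choice to strip the public prefix before forwarding are assumptions made for the example, not something the routes above prescribe.

```python
from typing import Optional

# Hypothetical route table: public path prefix -> internal upstream base URL.
ROUTES = {
    "/api/users/": "http://user-service.internal:8080",
    "/api/orders/": "http://order-service.internal:8080",
    "/api/products/": "http://product-service.internal:8080",
}


def resolve_upstream(path: str) -> Optional[str]:
    """Return the upstream URL for a public path, or None if no route matches."""
    # Check the longest prefixes first so a more specific route wins.
    for prefix in sorted(ROUTES, key=len, reverse=True):
        if path.startswith(prefix):
            # Assumption: the service only sees the part after the public prefix.
            return ROUTES[prefix] + "/" + path[len(prefix):]
    return None
```

Because clients only ever see the public prefixes, moving a service means editing `ROUTES` and nothing else.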
2. Authentication and authorization#
The gateway validates credentials (for example JWTs or OAuth 2.0 access tokens) before requests reach your services. This means:
- Services can trust that requests are already authenticated, provided they are reachable only through the gateway
- Token validation logic lives in one place
- You can swap auth providers without touching services
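Pattern: The gateway validates the token, extracts the user context, and forwards it as headers (X-User-Id, X-User-Role) to downstream services. It should first strip any client-supplied copies of those headers so callers cannot spoof an identity.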
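The pattern can be sketched with the standard library alone, assuming HS256-signed JWTs with `sub`, `role`, and `exp` claims. Those claim names, the shared secret, and the `sign_token` helper are assumptions for the demo. A production gateway would more likely use a maintained JWT library and asymmetric keys.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_decode(segment: str) -> bytes:
    # JWT segments are base64url without padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def sign_token(claims: dict, secret: bytes) -> str:
    """Demo helper (hypothetical): issue an HS256 token for testing."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"


def authenticate(headers: dict, secret: bytes) -> Optional[dict]:
    """Return the headers to forward upstream, or None to reject the request."""
    auth = headers.get("Authorization", "")
    if not auth.startswith("Bearer "):
        return None
    try:
        header_b64, payload_b64, sig_b64 = auth[len("Bearer "):].split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(sig_b64)
    except ValueError:
        return None  # malformed token
    if not isinstance(header, dict) or not isinstance(claims, dict):
        return None
    if header.get("alg") != "HS256":
        return None  # never accept an algorithm the gateway did not choose
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(signature, expected):
        return None
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp < time.time():
        return None  # missing or expired
    if "sub" not in claims:
        return None
    # Drop client-supplied identity headers so they cannot be spoofed.
    forwarded = {k: v for k, v in headers.items() if not k.lower().startswith("x-user-")}
    forwarded["X-User-Id"] = str(claims["sub"])
    forwarded["X-User-Role"] = str(claims.get("role", ""))
    return forwarded
```

Downstream services then read `X-User-Id` and `X-User-Role` without parsing tokens themselves.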
3. Rate limiting#
Protect your services from abuse and overload:
- Per-user limits: 100 requests/minute per API key
- Per-endpoint limits: Write endpoints get stricter limits than reads
- Global limits: Total system capacity protection
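Rate limiting at the gateway is more effective than at individual services because excess traffic is rejected at the edge, before it consumes downstream capacity, and because limits are enforced consistently in one place.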
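One common way to implement a per-key limit is a token bucket. The sketch below assumes a single gateway process with in-memory state; the class name and the injectable `clock` parameter are choices made for the example. With several gateway instances, the counters would need to live in a shared store.

```python
import time


class RateLimiter:
    """Token bucket per API key: allows bursts up to the limit, then refills steadily."""

    def __init__(self, limit_per_minute: int = 100, clock=time.monotonic):
        self.capacity = float(limit_per_minute)
        self.refill_per_sec = limit_per_minute / 60.0
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.buckets = {}   # api_key -> (tokens_left, last_update_time)

    def allow(self, api_key: str) -> bool:
        now = self.clock()
        tokens, last = self.buckets.get(api_key, (self.capacity, now))
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.refill_per_sec)
        if tokens >= 1.0:
            self.buckets[api_key] = (tokens - 1.0, now)
            return True
        self.buckets[api_key] = (tokens, now)
        return False
```

When `allow` returns `False`, the gateway would typically answer with HTTP 429 rather than forwarding the request.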
4. Request aggregation (BFF pattern)#
Here the gateway combines multiple service calls into one client response. This is the core of the Backend for Frontend (BFF) pattern, where the aggregation is tailored to a specific client:
Client: GET /api/dashboard
Gateway internally calls:
→ User Service (profile)
→ Order Service (recent orders)
→ Analytics Service (stats)
Returns: Combined JSON response
This reduces client round trips and is especially valuable for mobile clients, where each round trip over a high-latency network is expensive.
5. Protocol translation#
Your gateway can accept REST from web clients and translate to gRPC for internal services. Clients get the simplicity of REST; services get the performance of gRPC.
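Two pieces of that translation can be sketched without a gRPC library: mapping a REST request onto an RPC method name, and mapping the gRPC status that comes back onto an HTTP status code. The service and method names here are hypothetical. The status table is a subset of the HTTP mapping documented for `google.rpc.Code`, and falling back to 500 for unlisted codes is an assumption.

```python
import re
from typing import Optional, Tuple

# Hypothetical REST-to-RPC table: (HTTP verb, path pattern, RPC method).
REST_TO_RPC = [
    ("GET", re.compile(r"^/api/users/(?P<user_id>[^/]+)$"), "UserService/GetUser"),
    ("GET", re.compile(r"^/api/orders/(?P<order_id>[^/]+)$"), "OrderService/GetOrder"),
]

# Subset of the gRPC status code to HTTP status mapping.
GRPC_TO_HTTP = {
    "OK": 200,
    "INVALID_ARGUMENT": 400,
    "UNAUTHENTICATED": 401,
    "PERMISSION_DENIED": 403,
    "NOT_FOUND": 404,
    "ALREADY_EXISTS": 409,
    "RESOURCE_EXHAUSTED": 429,
    "INTERNAL": 500,
    "UNIMPLEMENTED": 501,
    "UNAVAILABLE": 503,
    "DEADLINE_EXCEEDED": 504,
}


def to_rpc(method: str, path: str) -> Optional[Tuple[str, dict]]:
    """Return (rpc_method, request_fields) for a REST call, or None if unmapped."""
    for verb, pattern, rpc in REST_TO_RPC:
        match = pattern.match(path)
        if verb == method and match:
            # Path parameters become fields of the RPC request message.
            return rpc, match.groupdict()
    return None


def to_http_status(grpc_code: str) -> int:
    # Assumption: unknown codes are reported to REST clients as 500.
    return GRPC_TO_HTTP.get(grpc_code, 500)
```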
6. Circuit breaking#
When a downstream service is failing, the gateway can short-circuit requests instead of letting them pile up:
- Closed: Normal operation, requests pass through
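- Open: The service is failing, so the gateway fails fast and returns an error or a cached/fallback response immediately
- Half-open: After a cooldown, a limited number of trial requests are let through to test whether the service has recovered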
This prevents cascading failures across your system.
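The three states can be sketched as a small state machine. This version is single-threaded and counts consecutive failures; the thresholds, the `fallback` callable, and the injectable `clock` are assumptions for the example. Gateway implementations often use rolling error rates instead.

```python
import time


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.state = "closed"
        self.failures = 0
        self.opened_at = 0.0

    def call(self, func, fallback):
        """Invoke func() through the breaker; use fallback() when it cannot succeed."""
        if self.state == "open":
            if self.clock() - self.opened_at < self.reset_timeout:
                return fallback()  # short-circuit: do not touch the failing service
            self.state = "half_open"  # cooldown elapsed: allow one trial request
        try:
            result = func()
        except Exception:
            self._record_failure()
            return fallback()
        self._record_success()
        return result

    def _record_failure(self):
        self.failures += 1
        # A failed trial request, or too many consecutive failures, opens the circuit.
        if self.state == "half_open" or self.failures >= self.failure_threshold:
            self.state = "open"
            self.opened_at = self.clock()

    def _record_success(self):
        self.failures = 0
        self.state = "closed"
```

A gateway would usually keep one breaker per upstream service, so that one failing dependency does not trip requests to healthy ones.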
Gateway architectures#
Single gateway#
One gateway handles everything. Simple to operate, but becomes a bottleneck at scale. Works well for small-to-medium systems.
BFF gateways#
Separate gateways per client type (web, mobile, IoT). Each gateway is optimized for its client's needs — different aggregation, different rate limits, different response formats.
Mesh gateway#
In a service mesh (Istio, Linkerd), every service gets a sidecar proxy. The "gateway" logic is distributed. Better for microservices at scale but more complex to operate.
Common mistakes#
Over-engineering the gateway. Your gateway should route and protect, not contain business logic. If you're writing if/else statements about order processing in your gateway, that logic belongs in a service.
Single point of failure. Your gateway needs to be highly available. Run multiple instances behind a load balancer. Use health checks. Plan for gateway failures.
Ignoring observability. The gateway sees every request. Add structured logging, distributed tracing (correlation IDs), and metrics (latency percentiles, error rates) here.
Not versioning. API versioning at the gateway (/v1/users, /v2/users) lets you evolve your API without breaking existing clients.
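For the versioning point, a path-prefix scheme such as /v1/users can be routed with a few lines. The upstream addresses and the decision to drop the version prefix before forwarding are assumptions for this sketch.

```python
import re
from typing import Optional

# Hypothetical upstreams, one deployment per API version.
VERSION_UPSTREAMS = {
    "v1": "http://api-v1.internal:8080",
    "v2": "http://api-v2.internal:8080",
}

_VERSIONED_PATH = re.compile(r"^/(v\d+)(/.*)$")


def route_versioned(path: str) -> Optional[str]:
    """Map /v1/users to the v1 upstream; return None for unknown or missing versions."""
    match = _VERSIONED_PATH.match(path)
    if not match:
        return None
    version, rest = match.groups()
    base = VERSION_UPSTREAMS.get(version)
    if base is None:
        return None
    # Assumption: the upstream does not expect the version prefix in the path.
    return base + rest
```

Retiring a version then becomes a matter of removing its entry and returning an error for that prefix.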
Popular API gateways#
| Gateway | Best for | Protocol support |
|---|---|---|
| Kong | Plugin ecosystem | REST, gRPC, WebSocket |
| AWS API Gateway | Serverless/AWS | REST, WebSocket |
| Envoy | Service mesh | HTTP/1.1, HTTP/2, gRPC, TCP |
| Nginx | Raw performance | HTTP, TCP, UDP |
| Traefik | Docker/K8s | HTTP, TCP, gRPC |
When you don't need a gateway#
- Monolithic apps — if you have one service, a reverse proxy (Nginx) is enough
- Internal tools — low traffic, trusted clients, no need for rate limiting
- Prototypes — add the gateway when you have actual traffic to manage
Visualize your gateway architecture#
The best way to understand how a gateway fits into your system is to visualize it. Try describing your architecture in Codelit — it will generate an interactive diagram showing how your gateway connects to services, databases, and external APIs.
Key takeaways#
- Centralize cross-cutting concerns at the gateway — auth, rate limiting, logging
- Keep the gateway thin — route and protect, don't embed business logic
- Plan for availability — the gateway is on the critical path for every request
- Use BFF pattern when different clients need different API shapes
- Add observability — the gateway is the best place to measure your API's health