API Gateway Patterns: Authentication, Rate Limiting, and Routing at Scale
Every microservices architecture needs an API gateway. It's the front door — handling authentication, routing, rate limiting, and protocol translation before requests reach your services.
But which patterns matter? And which gateway should you use?
What an API Gateway Does#
Without a gateway:
Client → Auth Service (authenticate)
Client → User Service (get profile)
Client → Order Service (get orders)
Client → Payment Service (get payment methods)
Four separate calls. Client knows about every service. No centralized auth.
With a gateway:
Client → API Gateway → Auth (middleware)
                     → /users → User Service
                     → /orders → Order Service
                     → /payments → Payment Service
One entry point. Auth happens once. Client doesn't know about internal services.
Core Patterns#
1. Request Routing#
Route requests to the correct backend based on path, method, or headers:
GET /api/users/* → User Service (port 3001)
POST /api/orders/* → Order Service (port 3002)
GET /api/products/* → Product Service (port 3003)
WS /ws/* → WebSocket Service (port 3004)
Path-based routing is the most common. Header-based routing (e.g., X-Version: v2) is typically used for API versioning.
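The routing rules above can be sketched as a small lookup function. This is a minimal illustration, not a production router: the upstream host names and the `V2_UPSTREAMS` override table are hypothetical, and only the ports come from the example above.

```python
from typing import Optional

# Hypothetical route table: (method, path prefix, upstream address).
ROUTES = [
    ("GET", "/api/users/", "http://user-service:3001"),
    ("POST", "/api/orders/", "http://order-service:3002"),
    ("GET", "/api/products/", "http://product-service:3003"),
]

# Hypothetical header-based override used for API versioning.
V2_UPSTREAMS = {"/api/users/": "http://user-service-v2:3001"}


def resolve_upstream(method: str, path: str, headers: dict) -> Optional[str]:
    """Return the upstream for a request, or None if no route matches."""
    candidates = [
        (prefix, upstream)
        for route_method, prefix, upstream in ROUTES
        if route_method == method and path.startswith(prefix)
    ]
    if not candidates:
        return None  # the gateway would answer 404 here
    # Prefer the longest matching prefix so more specific routes win.
    prefix, upstream = max(candidates, key=lambda c: len(c[0]))
    # Header-based routing: X-Version: v2 selects the v2 backend if one exists.
    if headers.get("X-Version") == "v2" and prefix in V2_UPSTREAMS:
        return V2_UPSTREAMS[prefix]
    return upstream
```

A request for `GET /api/users/42` resolves to the user service, and the same request with `X-Version: v2` resolves to the v2 backend.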
2. Authentication & Authorization#
Verify identity at the gateway, not in every service:
Client → Gateway → Verify JWT → Extract user_id → Add X-User-ID header → Backend
Patterns:
- JWT validation — Gateway verifies signature, extracts claims
- OAuth2 introspection — Gateway calls auth server to validate token
- API key lookup — Gateway checks key in Redis/DB
- mTLS — Mutual TLS for service-to-service auth
Best practice: Gateway handles authentication (who are you), services handle authorization (what can you do).
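As a rough sketch of the JWT validation flow, the following verifies an HS256-signed token with only the standard library and derives the `X-User-ID` header for the backend. It assumes a shared secret (HS256) and a `user_id` claim; `sign_hs256` and `forward_headers` are hypothetical helpers for the demo, and a real gateway would more likely rely on a vetted JWT library and asymmetric keys.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_decode(segment: str) -> bytes:
    # JWT segments are base64url without padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Build a demo token; in practice the auth service issues tokens."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode("ascii"), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"


def verify_hs256(token: str, secret: bytes, now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if signature and expiry check out, else None."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(sig_b64)
    except ValueError:
        return None  # malformed token
    if not isinstance(header, dict) or not isinstance(claims, dict):
        return None
    if header.get("alg") != "HS256":
        return None  # never accept an algorithm the gateway did not expect
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, signature):
        return None
    current = time.time() if now is None else now
    exp = claims.get("exp")
    if isinstance(exp, (int, float)) and current >= exp:
        return None  # token expired
    return claims


def forward_headers(token: str, secret: bytes, now: Optional[float] = None) -> Optional[dict]:
    """Headers to add before proxying, or None if the gateway should answer 401."""
    claims = verify_hs256(token, secret, now)
    if claims is None or "user_id" not in claims:
        return None
    return {"X-User-ID": str(claims["user_id"])}
```

The backend can then trust `X-User-ID` as long as it is only reachable through the gateway.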
3. Rate Limiting#
Protect backends from abuse and ensure fair usage:
| Algorithm | How It Works | Best For |
|---|---|---|
| Token Bucket | Tokens refill at fixed rate, each request consumes one | Burst-friendly, most common |
| Sliding Window | Count requests in rolling time window | Smooth rate limiting |
| Fixed Window | Count requests per time interval | Simple, but allows bursts at boundaries |
| Leaky Bucket | Requests queue and process at fixed rate | Smooth output rate |
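To make the token bucket row concrete, here is a minimal single-process sketch. The class name, parameters, and injectable `clock` are illustrative assumptions; a gateway running on several nodes would need shared state (for example in Redis) rather than an in-memory object.

```python
import time
from typing import Callable


class TokenBucket:
    """Allows bursts up to `capacity`, refilled at `refill_per_sec` tokens/second."""

    def __init__(self, capacity: int, refill_per_sec: float,
                 clock: Callable[[], float] = time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        self.tokens = float(capacity)  # start full so an initial burst is allowed
        self.last_refill = clock()

    def allow(self) -> bool:
        now = self.clock()
        elapsed = now - self.last_refill
        # Add the tokens earned since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1  # each request consumes one token
            return True
        return False
```

Passing a fake clock makes the behavior easy to test without sleeping.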
Implementation:
Redis key: rate_limit:{user_id}:{window}
INCR + EXPIRE for fixed window
Sorted set for sliding window
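The two Redis approaches can be mimicked in memory to show the difference in behavior. This is a sketch under stated assumptions: a dict stands in for `INCR` + `EXPIRE` and a deque stands in for the sorted set; the class names are hypothetical and nothing here talks to Redis.

```python
import time
from collections import defaultdict, deque


class FixedWindowLimiter:
    """Counts requests per (user, window), like INCR + EXPIRE on
    rate_limit:{user_id}:{window}."""

    def __init__(self, limit: int, window_sec: int, clock=time.time):
        self.limit, self.window_sec, self.clock = limit, window_sec, clock
        self.counts = {}

    def allow(self, user_id: str) -> bool:
        window = int(self.clock() // self.window_sec)
        # Drop counters from earlier windows (Redis would do this via EXPIRE).
        self.counts = {k: v for k, v in self.counts.items() if k[1] == window}
        key = (user_id, window)
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key] <= self.limit


class SlidingWindowLimiter:
    """Keeps request timestamps per user, like a sorted set scored by time."""

    def __init__(self, limit: int, window_sec: int, clock=time.time):
        self.limit, self.window_sec, self.clock = limit, window_sec, clock
        self.hits = defaultdict(deque)

    def allow(self, user_id: str) -> bool:
        now = self.clock()
        hits = self.hits[user_id]
        # Evict timestamps that have fallen out of the rolling window.
        while hits and hits[0] <= now - self.window_sec:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

With a limit of 2 per 60 seconds, the fixed window admits two requests at second 59 and another at second 60 (the boundary burst noted in the table), while the sliding window keeps rejecting until the earlier requests age out.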
Typical limits:
- Free tier: 100 req/min
- Pro: 1000 req/min
- Enterprise: 10000 req/min + custom
4. Request Aggregation (BFF Pattern)#
Combine multiple service calls into one response for the client:
Client: GET /api/dashboard
Gateway aggregates:
→ User Service: GET /users/123
→ Order Service: GET /orders?user=123&limit=5
→ Analytics Service: GET /stats?user=123
Response: { user: {...}, recentOrders: [...], stats: {...} }
Backend for Frontend (BFF): One gateway per client type (web, mobile, internal).
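One way to sketch the dashboard aggregation is to fan out to the three services concurrently and merge the results. The `fetch_*` coroutines below are hypothetical stand-ins for real HTTP calls, and the returned data is made up for the demo.

```python
import asyncio


# Hypothetical stand-ins for HTTP calls to the backend services.
async def fetch_user(user_id: int) -> dict:
    await asyncio.sleep(0.01)  # simulate network latency
    return {"id": user_id, "name": "demo"}


async def fetch_recent_orders(user_id: int, limit: int = 5) -> list:
    await asyncio.sleep(0.01)
    return [{"orderId": 1, "user": user_id}][:limit]


async def fetch_stats(user_id: int) -> dict:
    await asyncio.sleep(0.01)
    return {"orderCount": 1}


async def get_dashboard(user_id: int) -> dict:
    # Run the three calls concurrently; total latency is roughly the slowest call.
    user, orders, stats = await asyncio.gather(
        fetch_user(user_id),
        fetch_recent_orders(user_id, limit=5),
        fetch_stats(user_id),
    )
    return {"user": user, "recentOrders": orders, "stats": stats}
```

Calling `asyncio.run(get_dashboard(123))` returns a single object with the `user`, `recentOrders`, and `stats` keys. A real aggregator would also need a policy for partial failures, such as returning the sections that succeeded.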
5. Circuit Breaking#
Stop forwarding requests to a failing service:
Gateway → Service A: timeout!
Gateway → Service A: timeout!
Gateway → Service A: timeout! (3 failures)
Gateway: Circuit OPEN — return 503 immediately for 30s
Gateway: (30s later) Circuit HALF-OPEN — try one request
Gateway → Service A: 200 OK! Circuit CLOSED
Prevents cascade failures. Gives the failing service time to recover.
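The state machine in this trace can be sketched as follows. It is a simplified, single-threaded illustration: the class name, the `(status, body)` return shape, and the choice of 502 for upstream failures are assumptions; the threshold of 3 failures and the 30s open period mirror the example.

```python
import time


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is CLOSED

    @property
    def state(self) -> str:
        if self.opened_at is None:
            return "CLOSED"
        if self.clock() - self.opened_at >= self.reset_timeout:
            return "HALF_OPEN"  # allow a trial request through
        return "OPEN"

    def call(self, fn):
        """Return (status, body); answers 503 without calling fn while OPEN."""
        if self.state == "OPEN":
            return 503, None
        try:
            body = fn()
        except Exception:
            self.failures += 1
            # A failed trial request, or too many failures, (re)opens the circuit.
            if self.state == "HALF_OPEN" or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            return 502, None
        # Any success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return 200, body
```

A concurrent gateway would additionally need to ensure that only one trial request is in flight while the circuit is half-open.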
6. Request/Response Transformation#
Modify requests and responses passing through:
- Header injection: add X-Request-ID and X-User-ID (from the JWT)
- Body transformation: convert XML to JSON for legacy backends
- Response filtering: strip internal fields before returning to the client
- Protocol translation: HTTP/REST → gRPC for internal services
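Header injection and response filtering from the list above might look like this in a simplified form. The `INTERNAL_FIELDS` names and the `claims` shape are hypothetical; the header names come from the list.

```python
import uuid

# Hypothetical names of fields that must never leave the internal network.
INTERNAL_FIELDS = {"password_hash", "internal_notes"}


def inject_headers(headers: dict, claims: dict) -> dict:
    """Return a copy of the request headers with gateway-owned headers set."""
    out = dict(headers)
    # Keep an incoming request ID if present so traces stay connected.
    out.setdefault("X-Request-ID", str(uuid.uuid4()))
    # Always overwrite X-User-ID so a client cannot spoof it.
    out["X-User-ID"] = str(claims["user_id"])
    return out


def strip_internal(body):
    """Recursively remove internal fields from a JSON-like response body."""
    if isinstance(body, dict):
        return {k: strip_internal(v) for k, v in body.items()
                if k not in INTERNAL_FIELDS}
    if isinstance(body, list):
        return [strip_internal(item) for item in body]
    return body
```

Overwriting `X-User-ID` rather than trusting the client's value is the important design choice here: the header is only meaningful if the gateway is its sole author.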
7. Caching#
Cache GET responses at the gateway to reduce backend load:
Client → Gateway → Cache hit? → Return cached response (1ms)
                 → Cache miss? → Backend (50ms) → Cache → Return
Cache invalidation: TTL-based (simple), event-based (Kafka consumer), or manual purge API.
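A TTL-based cache with a manual purge can be sketched in a few lines. This is an in-memory illustration with hypothetical names; it keys on the path only, whereas a real gateway cache would usually also account for query strings, `Vary` headers, and per-user responses.

```python
import time


class TTLCache:
    def __init__(self, ttl_sec: float, clock=time.monotonic):
        self.ttl_sec = ttl_sec
        self.clock = clock
        self.entries = {}  # path -> (expires_at, response)

    def get_or_fetch(self, method: str, path: str, fetch):
        """Serve GETs from cache when fresh; otherwise call the backend."""
        if method != "GET":
            return fetch()  # only GET responses are cached
        now = self.clock()
        entry = self.entries.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit
        response = fetch()  # cache miss or expired entry
        self.entries[path] = (now + self.ttl_sec, response)
        return response

    def purge(self, path: str) -> None:
        """Manual invalidation, e.g. triggered by a purge API or an event."""
        self.entries.pop(path, None)
```

An event-based invalidation scheme would call `purge` from a consumer that listens for change events.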
8. Observability#
The gateway sees all traffic — perfect for:
- Access logs — every request with latency, status, user
- Metrics — request rate, error rate, P99 latency per route
- Distributed tracing — inject trace ID into every request
- Alerting — spike in 5xx responses, latency degradation
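As an illustration of the per-route metrics mentioned above, the sketch below records each request and derives an error rate and a nearest-rank P99 latency. The class is hypothetical and keeps every sample in memory; real systems typically export histograms to a metrics backend instead.

```python
from collections import defaultdict


class RouteMetrics:
    def __init__(self):
        self.latencies_ms = defaultdict(list)
        self.errors = defaultdict(int)

    def record(self, route: str, status: int, latency_ms: float) -> None:
        self.latencies_ms[route].append(latency_ms)
        if status >= 500:
            self.errors[route] += 1

    def error_rate(self, route: str) -> float:
        total = len(self.latencies_ms.get(route, []))
        return self.errors.get(route, 0) / total if total else 0.0

    def p99(self, route: str):
        """Nearest-rank 99th percentile latency, or None without samples."""
        samples = sorted(self.latencies_ms.get(route, []))
        if not samples:
            return None
        # Integer form of ceil(0.99 * n), avoiding float rounding issues.
        rank = (99 * len(samples) + 99) // 100
        return samples[rank - 1]
```

An alerting rule could then compare `error_rate` or `p99` against a threshold per route.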
Gateway Options Compared#
| Gateway | Type | Best For | Complexity |
|---|---|---|---|
| Nginx | Reverse proxy | Simple routing, SSL termination | Low |
| Kong | API gateway | Plugins (auth, rate limit, logging) | Medium |
| AWS API Gateway | Managed | Serverless (Lambda), zero ops | Low |
| Envoy | Service proxy | Service mesh (Istio), gRPC | High |
| Traefik | Cloud-native | Docker/K8s auto-discovery | Medium |
| Express Gateway | Node.js | Custom logic, JS ecosystem | Low |
| Apigee | Enterprise | API management, monetization | High |
Quick Decision#
- Serverless app (Lambda): AWS API Gateway
- Simple microservices: Nginx or Kong
- Kubernetes: Traefik or Envoy (with Istio)
- Enterprise API management: Kong or Apigee
- Custom logic needed: Express Gateway or custom Node.js
Anti-Patterns#
1. God Gateway#
Don't put business logic in the gateway. Keep it to cross-cutting concerns (auth, rate limiting, routing).
2. Single Point of Failure#
Always deploy gateways in a cluster behind a load balancer:
DNS → Load Balancer → Gateway 1
                    → Gateway 2
                    → Gateway 3
3. Tight Coupling#
If every new service requires hand-edited gateway config, your gateway is too tightly coupled. Use service discovery (Consul, K8s DNS) so the gateway resolves backends dynamically.
Architecture Examples#
E-Commerce API#
Mobile App / Web App → API Gateway → Auth Middleware
                                   → /products → Product Service → PostgreSQL
                                   → /cart → Cart Service → Redis
                                   → /orders → Order Service → PostgreSQL + Kafka
                                   → /payments → Payment Service → Stripe
Multi-Tenant SaaS#
Client → API Gateway → Tenant Resolver (from subdomain/header)
                     → Rate Limiter (per tenant plan)
                     → Route to tenant's service cluster
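The tenant resolution and per-plan limit lookup could be sketched like this. Everything named here is an assumption for illustration: the `X-Tenant-ID` header, the `example.com` base domain, and the tenant-to-plan table are hypothetical, and the limits reuse the typical tiers listed earlier.

```python
from typing import Optional

PLAN_LIMITS = {"free": 100, "pro": 1000, "enterprise": 10000}  # requests/minute
TENANT_PLANS = {"acme": "pro", "globex": "free"}  # hypothetical tenants
BASE_DOMAIN = "example.com"  # hypothetical base domain


def resolve_tenant(host: str, headers: dict) -> Optional[str]:
    """Resolve the tenant from a header first, then from the subdomain."""
    # Only trust this header if it is set by a trusted upstream, not by clients.
    tenant = headers.get("X-Tenant-ID")
    if tenant:
        return tenant
    hostname = host.split(":")[0].lower()  # drop an optional port
    suffix = "." + BASE_DOMAIN
    if hostname.endswith(suffix):
        subdomain = hostname[: -len(suffix)]
        if subdomain and "." not in subdomain:
            return subdomain
    return None


def limit_for(tenant: str) -> Optional[int]:
    """Requests per minute for the tenant's plan, or None if unknown."""
    plan = TENANT_PLANS.get(tenant)
    return PLAN_LIMITS.get(plan)
```

The resolved tenant and its limit would then feed the rate limiter and the routing step that selects the tenant's service cluster.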
Summary#
- Every microservices system needs a gateway — don't expose services directly
- Authentication at the gateway, authorization in services
- Rate limit by user/API key — protect backends and enforce plans
- Circuit break failing services — prevent cascade failures
- Cache at the gateway for common read-heavy endpoints
- Keep the gateway thin — cross-cutting concerns only, no business logic
- Deploy in a cluster — the gateway can't be a single point of failure