Tags: API gateway, microservices, infrastructure, system design

API Gateway Patterns: Authentication, Rate Limiting, and Routing at Scale

March 28, 2026 | 6 min read | By Codelit Team

Every microservices architecture needs an API gateway. It's the front door — handling authentication, routing, rate limiting, and protocol translation before requests reach your services.

But which patterns matter? And which gateway should you use?

What an API Gateway Does#

Without a gateway:

Client → Auth Service (authenticate)
Client → User Service (get profile)
Client → Order Service (get orders)
Client → Payment Service (get payment methods)

Four separate calls. Client knows about every service. No centralized auth.

With a gateway:

Client → API Gateway → Auth (middleware)
                     → /users → User Service
                     → /orders → Order Service
                     → /payments → Payment Service

One entry point. Auth happens once. Client doesn't know about internal services.

Core Patterns#

1. Request Routing#

Route requests to the correct backend based on path, method, or headers:

GET  /api/users/*     → User Service (port 3001)
POST /api/orders/*    → Order Service (port 3002)
GET  /api/products/*  → Product Service (port 3003)
WS   /ws/*            → WebSocket Service (port 3004)

Path-based routing is the most common. Header-based routing (e.g., X-Version: v2) is typically used for API versioning.
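
As a rough illustration of path- and method-based routing, here is a minimal sketch that assumes a static route table. The names `ROUTES` and `resolve` and the `localhost` upstream addresses are made up for this example and are not any particular gateway's API; the ports follow the routing table above.

```python
from typing import Optional

# Hypothetical route table: (method, path prefix, upstream address).
ROUTES = [
    ("GET", "/api/users/", "http://localhost:3001"),
    ("POST", "/api/orders/", "http://localhost:3002"),
    ("GET", "/api/products/", "http://localhost:3003"),
]


def resolve(method: str, path: str) -> Optional[str]:
    """Return the upstream for the longest matching prefix, or None if nothing matches."""
    best = None
    for route_method, prefix, upstream in ROUTES:
        if method == route_method and path.startswith(prefix):
            # Prefer the most specific (longest) prefix when several match.
            if best is None or len(prefix) > len(best[0]):
                best = (prefix, upstream)
    return best[1] if best else None
```

A real gateway would return 404 (or 405 for a known path with the wrong method) when `resolve` yields `None`.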

2. Authentication & Authorization#

Verify identity at the gateway, not in every service:

Client → Gateway → Verify JWT → Extract user_id → Add X-User-ID header → Backend

Patterns:

JWT validation — Gateway verifies signature, extracts claims
OAuth2 introspection — Gateway calls auth server to validate token
API key lookup — Gateway checks key in Redis/DB
mTLS — Mutual TLS for service-to-service auth

Best practice: Gateway handles authentication (who are you), services handle authorization (what can you do).
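
The JWT validation flow can be sketched with the standard library alone. This example assumes HS256 (a shared secret) and a `sub` claim holding the user ID; both are assumptions for illustration, and production gateways usually rely on a maintained JWT library and often on asymmetric keys. The `sign_hs256` helper exists only to produce a token for the example.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_decode(segment: str) -> bytes:
    # JWT segments omit base64 padding, so restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Helper used only to create a token for this example."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(signature)}"


def verify_hs256(token: str, secret: bytes, now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if signature and expiry check out, otherwise None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        # Reject unexpected algorithms instead of trusting the token's header.
        if not isinstance(header, dict) or header.get("alg") != "HS256":
            return None
        expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256).digest()
        # Constant-time comparison avoids leaking signature bytes through timing.
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        claims = json.loads(_b64url_decode(payload_b64))
    except ValueError:
        return None
    if not isinstance(claims, dict):
        return None
    exp = claims.get("exp")
    current = time.time() if now is None else now
    if isinstance(exp, (int, float)) and current >= exp:
        return None
    return claims


def forwarded_headers(token: str, secret: bytes) -> Optional[dict]:
    """Build the identity header the gateway would add before proxying."""
    claims = verify_hs256(token, secret)
    if claims is None or "sub" not in claims:
        return None
    return {"X-User-ID": str(claims["sub"])}
```

Backends should only trust `X-User-ID` when the request is known to have come through the gateway, which is one reason mTLS between gateway and services is listed above.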

3. Rate Limiting#

Protect backends from abuse and ensure fair usage:

Algorithm	How It Works	Best For
Token Bucket	Tokens refill at fixed rate, each request consumes one	Burst-friendly, most common
Sliding Window	Count requests in rolling time window	Smooth rate limiting
Fixed Window	Count requests per time interval	Simple, but allows bursts at boundaries
Leaky Bucket	Requests queue and process at fixed rate	Smooth output rate
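
To make the token bucket row concrete, here is a small in-process sketch. The class name, parameters, and the injectable `clock` are choices made for this example; a gateway running several instances would keep this state in a shared store rather than in memory.

```python
import time
from typing import Callable


class TokenBucket:
    """Token bucket: refills at `rate` tokens per second, up to `capacity`."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity  # start full so an initial burst is allowed
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        # Add the tokens earned since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

The `capacity` controls how large a burst is tolerated, while `rate` sets the sustained throughput, which is why this algorithm is described as burst-friendly.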

Implementation:

Redis key: rate_limit:{user_id}:{window}
INCR + EXPIRE for fixed window
Sorted set for sliding window
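
The sorted-set approach can be sketched in memory to show the logic. This is a stand-in, not Redis code: a per-user deque of timestamps plays the role of the sorted set, and the comments name the Redis commands each step would roughly correspond to. The class and parameter names are assumptions for the example.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class SlidingWindowLimiter:
    """Sliding-window log: allow at most `limit` requests per rolling window."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.hits = defaultdict(deque)  # user_id -> timestamps, oldest first

    def allow(self, user_id: str) -> bool:
        now = self.clock()
        hits = self.hits[user_id]
        # Drop timestamps that fell out of the window (ZREMRANGEBYSCORE in Redis).
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        # Count what remains (ZCARD in Redis).
        if len(hits) >= self.limit:
            return False
        # Record this request (ZADD in Redis).
        hits.append(now)
        return True
```

With Redis, the three steps would need to run atomically (for example in a transaction or script) so concurrent gateway instances do not over-admit.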

Typical limits:

Free tier: 100 req/min
Pro: 1000 req/min
Enterprise: 10000 req/min + custom

4. Request Aggregation (BFF Pattern)#

Combine multiple service calls into one response for the client:

Client: GET /api/dashboard

Gateway aggregates:
  → User Service: GET /users/123
  → Order Service: GET /orders?user=123&limit=5
  → Analytics Service: GET /stats?user=123

Response: { user: {...}, recentOrders: [...], stats: {...} }

Backend for Frontend (BFF): One gateway per client type (web, mobile, internal).
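
A sketch of the dashboard aggregation using `asyncio.gather`. The three `fetch_*` coroutines are stand-ins for HTTP calls to the backend services (their names and return values are invented for this example), and the analytics call is made to fail to show one way of degrading gracefully.

```python
import asyncio


# Stand-ins for HTTP calls; real code would use an async HTTP client.
async def fetch_user(user_id: str) -> dict:
    await asyncio.sleep(0.01)
    return {"id": user_id, "name": "Ada"}


async def fetch_recent_orders(user_id: str, limit: int = 5) -> list:
    await asyncio.sleep(0.01)
    return [{"orderId": 1}, {"orderId": 2}][:limit]


async def fetch_stats(user_id: str) -> dict:
    raise RuntimeError("analytics unavailable")  # simulate one failing dependency


async def dashboard(user_id: str) -> dict:
    # Fan out concurrently; return_exceptions keeps one failure from
    # sinking the whole response.
    user, orders, stats = await asyncio.gather(
        fetch_user(user_id),
        fetch_recent_orders(user_id),
        fetch_stats(user_id),
        return_exceptions=True,
    )
    return {
        "user": None if isinstance(user, Exception) else user,
        "recentOrders": [] if isinstance(orders, Exception) else orders,
        "stats": None if isinstance(stats, Exception) else stats,
    }
```

Because the calls run concurrently, the response time is roughly that of the slowest backend rather than the sum of all three.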

5. Circuit Breaking#

Stop forwarding requests to a failing service:

Gateway → Service A: timeout!
Gateway → Service A: timeout!
Gateway → Service A: timeout! (3 failures)
Gateway: Circuit OPEN — return 503 immediately for 30s
Gateway: (30s later) Circuit HALF-OPEN — try one request
Gateway → Service A: 200 OK! Circuit CLOSED

Prevents cascade failures. Gives the failing service time to recover.
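
The closed, open, and half-open states can be sketched as a small class. The threshold of 3 failures and the 30 second timeout follow the example above; the class itself, its method names, and the injectable `clock` are assumptions for illustration. This simplified version is single-threaded and does not limit how many probes run while half-open.

```python
import time
from typing import Callable, Optional


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without contacting the backend."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None  # None means the circuit is closed

    @property
    def state(self) -> str:
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.reset_timeout:
            return "half-open"
        return "open"

    def call(self, func: Callable, *args, **kwargs):
        if self.state == "open":
            raise CircuitOpenError("failing fast; the gateway would return 503")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed half-open probe, or too many failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Any success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```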

6. Request/Response Transformation#

Modify requests and responses passing through:

Header injection — Add X-Request-ID, X-User-ID from JWT
Body transformation — Convert between JSON and XML for legacy backends
Response filtering — Strip internal fields before returning to client
Protocol translation — HTTP/REST → gRPC for internal services
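
Header injection and response filtering might look like the following sketch. The field names in `INTERNAL_FIELDS` and the `sub` claim are hypothetical, chosen only for this example.

```python
import uuid

# Hypothetical internal-only fields that should never reach the client.
INTERNAL_FIELDS = {"password_hash", "internal_notes"}


def transform_request(headers: dict, claims: dict) -> dict:
    """Return a copy of the request headers with gateway-injected values."""
    # Drop any client-supplied identity header so it cannot be spoofed.
    out = {k: v for k, v in headers.items() if k.lower() != "x-user-id"}
    # Keep an existing request ID, otherwise generate one.
    if not any(k.lower() == "x-request-id" for k in out):
        out["X-Request-ID"] = str(uuid.uuid4())
    out["X-User-ID"] = str(claims["sub"])
    return out


def filter_response(body: dict) -> dict:
    """Strip internal fields from a top-level response object."""
    return {k: v for k, v in body.items() if k not in INTERNAL_FIELDS}
```

Note that `filter_response` only handles top-level keys; nested objects would need a recursive version.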

7. Caching#

Cache GET responses at the gateway to reduce backend load:

Client → Gateway → Cache hit? → Return cached response (1ms)
                 → Cache miss? → Backend (50ms) → Cache → Return

Cache invalidation: TTL-based (simple), event-based (Kafka consumer), or manual purge API.
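
A minimal sketch of the hit/miss flow with TTL-based invalidation and a manual purge. The `TTLCache` class and its method names are assumptions for this example; a real gateway would also bound memory use and respect HTTP cache headers.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries: Dict[str, Tuple[float, Any]] = {}  # key -> (expires_at, value)

    def get_or_fetch(self, key: str, fetch: Callable[[], Any]) -> Any:
        now = self.clock()
        entry = self.entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: skip the backend
        value = fetch()  # cache miss or expired: call the backend
        self.entries[key] = (now + self.ttl, value)
        return value

    def purge(self, key: str) -> None:
        """Manual invalidation, e.g. triggered by a purge API or an event consumer."""
        self.entries.pop(key, None)
```

A cache key would typically combine the method, path, and any headers or query parameters that change the response.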

8. Observability#

The gateway sees all traffic — perfect for:

Access logs — every request with latency, status, user
Metrics — request rate, error rate, P99 latency per route
Distributed tracing — inject trace ID into every request
Alerting — spike in 5xx responses, latency degradation
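
As an illustration of per-route metrics, the sketch below records latency and status for each request and derives an error rate and a nearest-rank P99. The `RouteMetrics` class is hypothetical; real deployments usually export histograms to a metrics system instead of keeping raw samples in memory.

```python
from collections import defaultdict


class RouteMetrics:
    def __init__(self):
        self.latencies_ms = defaultdict(list)  # route -> latency samples
        self.errors = defaultdict(int)         # route -> count of 5xx responses

    def record(self, route: str, status: int, latency_ms: float) -> None:
        self.latencies_ms[route].append(latency_ms)
        if status >= 500:
            self.errors[route] += 1

    def error_rate(self, route: str) -> float:
        total = len(self.latencies_ms.get(route, []))
        return self.errors.get(route, 0) / total if total else 0.0

    def p99(self, route: str) -> float:
        """Nearest-rank 99th percentile of the recorded latencies."""
        samples = sorted(self.latencies_ms.get(route, []))
        if not samples:
            return 0.0
        # Integer form of ceil(0.99 * n), avoiding floating-point rounding.
        rank = (99 * len(samples) + 99) // 100
        return samples[rank - 1]
```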

Gateway Options Compared#

Gateway	Type	Best For	Complexity
Nginx	Reverse proxy	Simple routing, SSL termination	Low
Kong	API gateway	Plugins (auth, rate limit, logging)	Medium
AWS API Gateway	Managed	Serverless (Lambda), zero ops	Low
Envoy	Service proxy	Service mesh (Istio), gRPC	High
Traefik	Cloud-native	Docker/K8s auto-discovery	Medium
Express Gateway	Node.js	Custom logic, JS ecosystem	Low
Apigee	Enterprise	API management, monetization	High

Quick Decision#

Serverless app (Lambda): AWS API Gateway
Simple microservices: Nginx or Kong
Kubernetes: Traefik or Envoy (with Istio)
Enterprise API management: Kong or Apigee
Custom logic needed: Express Gateway or custom Node.js

Anti-Patterns#

1. God Gateway#

Don't put business logic in the gateway. Keep it to cross-cutting concerns (auth, rate limiting, routing).

2. Single Point of Failure#

Always deploy gateways in a cluster behind a load balancer:

DNS → Load Balancer → Gateway 1
                    → Gateway 2
                    → Gateway 3

3. Tight Coupling#

If adding a new service requires manual gateway config changes, your gateway is too tightly coupled to your services. Use service discovery (Consul, K8s DNS) so the gateway can find new services and instances on its own.
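
To show the idea behind service discovery, here is an in-memory registry sketch. It is not Consul's or Kubernetes' API; `ServiceRegistry`, `upstream_for`, and the convention that the first path segment names the service are all assumptions made for this example.

```python
from collections import defaultdict


class ServiceRegistry:
    """Services register their own addresses; the gateway only looks them up."""

    def __init__(self):
        self.instances = defaultdict(list)  # service name -> addresses
        self.cursors = {}                   # service name -> round-robin position

    def register(self, service: str, address: str) -> None:
        if address not in self.instances[service]:
            self.instances[service].append(address)

    def deregister(self, service: str, address: str) -> None:
        if address in self.instances.get(service, []):
            self.instances[service].remove(address)

    def resolve(self, service: str) -> str:
        instances = self.instances.get(service)
        if not instances:
            raise LookupError(f"no healthy instance for {service!r}")
        # Simple round-robin across the registered instances.
        index = self.cursors.get(service, 0) % len(instances)
        self.cursors[service] = index + 1
        return instances[index]


def upstream_for(registry: ServiceRegistry, path: str) -> str:
    # Assumed convention: the first path segment is the service name.
    service = path.strip("/").split("/")[0]
    return registry.resolve(service)
```

With a convention like this, a new service becomes routable as soon as it registers, without touching gateway configuration.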

Architecture Examples#

E-Commerce API#

Mobile App  → API Gateway → Auth Middleware
Web App                  → /products → Product Service → PostgreSQL
                         → /cart → Cart Service → Redis
                         → /orders → Order Service → PostgreSQL + Kafka
                         → /payments → Payment Service → Stripe

Multi-Tenant SaaS#

Client → API Gateway → Tenant Resolver (from subdomain/header)
                     → Rate Limiter (per tenant plan)
                     → Route to tenant's service cluster
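
The tenant resolver and per-plan limit lookup could be sketched as follows. The `X-Tenant-ID` header name, the `example.com` base domain, and the tenant data are hypothetical; the per-minute limits reuse the tiers listed under Rate Limiting. In practice a tenant header should only be honored from trusted callers.

```python
from typing import Optional

# Requests per minute, following the tiers listed under Rate Limiting.
PLAN_LIMITS = {"free": 100, "pro": 1000, "enterprise": 10000}
# Hypothetical tenant-to-plan mapping; a real system would load this from a database.
TENANT_PLANS = {"acme": "pro", "globex": "enterprise"}


def resolve_tenant(host: str, headers: dict, base_domain: str = "example.com") -> Optional[str]:
    """Resolve the tenant from an explicit header, falling back to the subdomain."""
    explicit = headers.get("X-Tenant-ID")
    if explicit:
        return explicit
    hostname = host.split(":")[0].lower()  # strip any port
    suffix = "." + base_domain
    if hostname.endswith(suffix):
        subdomain = hostname[: -len(suffix)]
        # Accept only a single-label subdomain such as "acme".
        if subdomain and "." not in subdomain:
            return subdomain
    return None


def limit_for(tenant: str) -> int:
    """Unknown tenants fall back to the free tier."""
    return PLAN_LIMITS[TENANT_PLANS.get(tenant, "free")]
```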

Summary#

Every microservices system needs a gateway — don't expose services directly
Authenticate at the gateway, authorize in services
Rate limit by user/API key — protect backends and enforce plans
Circuit break failing services — prevent cascade failures
Cache at the gateway for common read-heavy endpoints
Keep the gateway thin — cross-cutting concerns only, no business logic
Deploy in a cluster — the gateway can't be a single point of failure
