Load Balancer Types & Algorithms: L4 vs L7, Round Robin to Consistent Hashing
Load Balancer Types & Algorithms Explained#
Every system that runs on more than one server needs a load balancer. But which type? Which algorithm? When do you need L7 instead of L4?
This guide covers load balancer types and algorithms, from basic round robin to consistent hashing, with example architectures.
Why Load Balance?#
Without a load balancer:
Client → Server 1 (overloaded, 95% CPU)
Client → Server 1 (still the same one)
Client → Server 1 (it crashes)
         Server 2 (idle, 5% CPU)
         Server 3 (idle, 5% CPU)
With a load balancer:
Client → Load Balancer → Server 1 (33% CPU)
                       → Server 2 (33% CPU)
                       → Server 3 (33% CPU)
L4 vs L7 Load Balancing#
Layer 4 (Transport)#
Routes based on IP address and TCP port. Doesn't inspect the request content.
Client:443 → L4 LB → Backend:3000 (TCP connection forwarded)
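Pros: Extremely fast (no payload inspection), low latency, handles any TCP/UDP-based protocol
Cons: Can't route by URL path, headers, or cookies
Use when: Raw TCP/UDP traffic, database proxying, maximum performance. gRPC can also pass through an L4 balancer, but traffic is then balanced per connection rather than per request, so long-lived HTTP/2 connections can pin load to a few backends.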
Tools: AWS NLB, Nginx (stream), HAProxy (TCP mode), Envoy
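As a rough sketch of what "routes based on IP address and TCP port" means, the function below picks a backend from the connection's addresses and ports alone. The backend list and the name `pick_backend_l4` are hypothetical, and hashing the 4-tuple is only one possible selection rule; production L4 balancers typically do this in the kernel or in hardware and keep a flow table per connection.

```python
import hashlib

# Hypothetical backend pool; the addresses are placeholders.
BACKENDS = ["10.0.0.1:3000", "10.0.0.2:3000", "10.0.0.3:3000"]


def pick_backend_l4(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> str:
    """Choose a backend using only the connection 4-tuple.

    Nothing above the transport layer (URL, headers, cookies) is visible here.
    """
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(BACKENDS)
    return BACKENDS[index]
```

Every packet of the same connection carries the same 4-tuple, so it maps to the same backend without the balancer ever parsing the payload.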
Layer 7 (Application)#
Routes based on HTTP headers, URL path, cookies, query params.
Client → L7 LB → /api/* → API servers
               → /static/* → CDN/static servers
               → /ws/* → WebSocket servers
Pros: Content-based routing, SSL termination, compression, caching, auth
Cons: Higher latency (must parse HTTP), more resource intensive
Use when: HTTP traffic, microservices routing, A/B testing, canary deploys
Tools: AWS ALB, Nginx, HAProxy, Envoy, Traefik, Cloudflare
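The path-based routing in the diagram above can be sketched as a prefix lookup. This is a minimal illustration, not how any particular proxy implements it: the pool names, the `default-servers` fallback, and the longest-prefix rule are all assumptions made for the example.

```python
# Hypothetical routing table mirroring the diagram: path prefix -> pool name.
ROUTES = [
    ("/api/", "api-servers"),
    ("/static/", "static-servers"),
    ("/ws/", "websocket-servers"),
]
DEFAULT_POOL = "default-servers"  # assumption: where unmatched paths go


def pick_pool_l7(path: str) -> str:
    """Return the backend pool for an HTTP request path (longest prefix wins)."""
    matches = [(prefix, pool) for prefix, pool in ROUTES if path.startswith(prefix)]
    if not matches:
        return DEFAULT_POOL
    # Prefer the most specific (longest) matching prefix.
    return max(matches, key=lambda match: len(match[0]))[1]
```

For instance, `pick_pool_l7("/api/users")` selects the API pool, while a path such as `/about` falls through to the default pool. A real L7 balancer can apply the same idea to headers, cookies, or query parameters.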
Quick Decision#
| Need | Use |
|---|---|
| Route by URL path | L7 |
| Route by header/cookie | L7 |
| SSL termination | L7 |
| Raw TCP/UDP (e.g., database traffic) | L4 |
| gRPC with per-request balancing | L7 |
| Maximum throughput | L4 |
| WebSocket + HTTP mixed | L7 |
Load Balancing Algorithms#
1. Round Robin#
Requests distributed sequentially: Server 1, 2, 3, 1, 2, 3...
Request 1 → Server 1
Request 2 → Server 2
Request 3 → Server 3
Request 4 → Server 1
Pros: Simple, even distribution
Cons: Ignores server load, so a slow server gets the same traffic as a fast one
Best for: Homogeneous servers with similar request costs
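A minimal round-robin picker might look like the following. The class name `RoundRobin` and the server labels are illustrative only.

```python
from itertools import cycle


class RoundRobin:
    """Hand out servers in a fixed repeating order."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("need at least one server")
        self._cycle = cycle(servers)

    def pick(self):
        # Each call advances to the next server and wraps around at the end.
        return next(self._cycle)
```

With three servers, four consecutive `pick()` calls return the first, second, third, and then the first server again, matching the sequence above.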
2. Weighted Round Robin#
Servers with more capacity get proportionally more traffic.
Server 1 (weight: 3) → gets 3 out of 6 requests
Server 2 (weight: 2) → gets 2 out of 6 requests
Server 3 (weight: 1) → gets 1 out of 6 requests
Best for: Mixed server sizes (e.g., during rolling upgrades with old + new hardware)
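One way to implement weights is the variant often called smooth weighted round robin, which interleaves picks instead of sending bursts to the heaviest server. The sketch below assumes static positive integer weights; the class name is hypothetical, and other implementations (for example, expanding the list by weight) are equally valid.

```python
class SmoothWeightedRoundRobin:
    """Pick servers in proportion to their weights, interleaving the picks."""

    def __init__(self, weights):
        # weights: mapping of server name -> positive integer weight
        if not weights or any(w <= 0 for w in weights.values()):
            raise ValueError("weights must be positive")
        self._weights = dict(weights)
        self._current = {name: 0 for name in weights}
        self._total = sum(weights.values())

    def pick(self):
        # Step 1: every server gains its own weight.
        for name, weight in self._weights.items():
            self._current[name] += weight
        # Step 2: the server with the highest running score is chosen...
        chosen = max(self._current, key=self._current.get)
        # Step 3: ...and pays back the total weight, so others catch up.
        self._current[chosen] -= self._total
        return chosen
```

With weights 3, 2, and 1, any six consecutive picks starting from a fresh instance contain the servers three, two, and one times respectively.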
3. Least Connections#
Route to the server with the fewest active connections.
Server 1: 50 active connections ← skip
Server 2: 12 active connections ← ROUTE HERE
Server 3: 30 active connections ← skip
Best for: Long-lived connections (WebSocket, database) where request duration varies
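A least-connections picker only needs a counter per server. The sketch below is a simplified, single-threaded illustration; the `acquire`/`release` names are assumptions, and a real balancer would update the counters as connections open and close.

```python
class LeastConnections:
    """Route each new connection to the server with the fewest active ones."""

    def __init__(self, servers):
        self.active = {name: 0 for name in servers}

    def acquire(self):
        # Choose the server with the smallest active-connection count.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when a connection closes; never drop below zero.
        if self.active[server] > 0:
            self.active[server] -= 1
```

Seeded with the counts from the example above (50, 12, 30), the next `acquire()` returns Server 2 and raises its count to 13.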
4. Least Response Time#
Route to the server with the lowest average response time.
Best for: Heterogeneous backends where speed matters more than connection count
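How the "average" is computed is an implementation choice. The sketch below uses an exponentially weighted moving average (EWMA) so that recent requests count more; this is an assumption for the example, not something the algorithm requires, and the smoothing factor `alpha=0.2` is arbitrary.

```python
class LeastResponseTime:
    """Pick the server with the lowest moving-average latency."""

    def __init__(self, servers, alpha=0.2):
        self.alpha = alpha  # weight given to the newest sample
        self.avg_ms = {name: None for name in servers}

    def record(self, server, latency_ms):
        previous = self.avg_ms[server]
        if previous is None:
            self.avg_ms[server] = float(latency_ms)
        else:
            self.avg_ms[server] = (
                self.alpha * latency_ms + (1 - self.alpha) * previous
            )

    def pick(self):
        # Servers with no samples yet are tried first so they get measured.
        for name, avg in self.avg_ms.items():
            if avg is None:
                return name
        return min(self.avg_ms, key=self.avg_ms.get)
```

A single slow response raises a server's average but does not exclude it permanently; later fast responses pull the average back down.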
5. IP Hash#
Hash the client IP to deterministically route to the same server.
hash("192.168.1.100") % 3 → Server 2 (always, as long as the pool stays at 3 servers)
Pros: Session affinity without cookies
Cons: Uneven distribution if client IPs cluster (for example, many users behind one NAT); most clients are remapped to a different server when servers are added or removed
Best for: Simple session stickiness without load balancer state
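The remapping problem is easy to see in code. The sketch below hashes the client IP with SHA-256 (an arbitrary choice for the example; any stable hash works) and measures how many clients change servers when the pool grows. The helper names and the generated IP addresses are hypothetical.

```python
import hashlib


def ip_hash(client_ip: str, servers: list) -> str:
    """Map a client IP to a server with hash(ip) % len(servers)."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


def moved_fraction(ips, before, after):
    """Fraction of clients whose server differs between two pools."""
    moved = sum(1 for ip in ips if ip_hash(ip, before) != ip_hash(ip, after))
    return moved / len(ips)


# Synthetic client addresses, used only for the measurement.
sample_ips = [f"10.0.{i // 256}.{i % 256}" for i in range(1000)]
fraction = moved_fraction(
    sample_ips, ["s1", "s2", "s3"], ["s1", "s2", "s3", "s4"]
)
```

Going from 3 to 4 servers, a client stays put only when `hash % 3 == hash % 4`, which holds for 3 out of every 12 hash values, so roughly three quarters of clients are expected to move.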
6. Consistent Hashing#
Like IP hash, but adding or removing a server remaps only about 1/N of the keys (N = number of servers) instead of nearly all of them.
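Hash Ring: [0 ----S1---- 0.33 ----S2---- 0.66 ----S3---- 1.0]
(S1 owns 0 to 0.33, S2 owns 0.33 to 0.66, S3 owns 0.66 to 1.0)
hash(client_ip) = 0.45 → routes to S2
Add S4 at 0.5: S4 takes over 0.5 to 0.66 from S2, so 0.45 still routes to S2 and keys owned by S1 and S3 do not move (minimal disruption)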
Best for: Caching layers (Memcached, CDN), stateful services, minimizing cache invalidation on scale events
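A minimal ring can be built with a sorted list and binary search. The sketch below makes several assumptions that are not in the text above: it uses SHA-256 for positions, places 100 virtual nodes per server to even out the distribution, and routes each key to the first server point clockwise from the key's hash (the mirror image of the range picture above, with the same remapping behavior). Class and method names are hypothetical.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Stable 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class ConsistentHashRing:
    def __init__(self, servers=(), vnodes=100):
        self.vnodes = vnodes   # virtual nodes per server (assumed value)
        self._points = []      # sorted ring positions
        self._owner = {}       # ring position -> server name
        for server in servers:
            self.add(server)

    def add(self, server):
        for i in range(self.vnodes):
            point = _hash(f"{server}#{i}")
            if point in self._owner:
                continue  # extremely unlikely collision; skip this point
            bisect.insort(self._points, point)
            self._owner[point] = server

    def remove(self, server):
        self._points = [p for p in self._points if self._owner[p] != server]
        self._owner = {p: s for p, s in self._owner.items() if s != server}

    def pick(self, key):
        if not self._points:
            raise LookupError("ring is empty")
        # First point clockwise from the key, wrapping around at the end.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]
```

When a fourth server is added to a three-server ring, the only keys that change owner are those that now land on the new server; every other key keeps its previous mapping, which is what keeps cache hit rates stable during scale events.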
Algorithm Decision Matrix#
| Scenario | Algorithm |
|---|---|
| All servers identical, short requests | Round Robin |
| Mixed server sizes | Weighted Round Robin |
| Long-lived connections (WebSocket) | Least Connections |
| Need session affinity | IP Hash or Cookie-based |
| Caching layer (minimize cache misses) | Consistent Hashing |
| Latency-sensitive | Least Response Time |
Health Checks#
The load balancer must detect unhealthy servers:
Active health checks — LB periodically pings each server:
LB → GET /health → Server 1: 200 OK ✓
LB → GET /health → Server 2: 200 OK ✓
LB → GET /health → Server 3: timeout ✗ (removed from pool)
Passive health checks — LB monitors real traffic for errors:
- 5 consecutive 5xx responses → mark unhealthy
- Connection timeout → mark unhealthy
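Best practice: Use both. Active checks catch a dead server even when it is receiving no traffic; passive checks catch failures that real requests hit but a simple /health probe can miss.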
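Both kinds of check can feed the same bookkeeping. The sketch below is one possible shape for it, assuming a server is removed after 5 consecutive failures (the threshold from the example above) and restored after 2 consecutive successes; the recovery threshold, the class name, and the method names are assumptions added for illustration.

```python
class HealthTracker:
    """Track per-server health from probe results and real responses."""

    def __init__(self, servers, fail_threshold=5, rise_threshold=2):
        self.fail_threshold = fail_threshold  # consecutive failures to eject
        self.rise_threshold = rise_threshold  # consecutive successes to restore (assumed)
        self.healthy = {s: True for s in servers}
        self._fails = {s: 0 for s in servers}
        self._passes = {s: 0 for s in servers}

    def record(self, server, ok):
        """Feed one observation: an active probe result or a real response."""
        if ok:
            self._fails[server] = 0
            self._passes[server] += 1
            if not self.healthy[server] and self._passes[server] >= self.rise_threshold:
                self.healthy[server] = True
        else:
            self._passes[server] = 0
            self._fails[server] += 1
            if self._fails[server] >= self.fail_threshold:
                self.healthy[server] = False

    def pool(self):
        """Servers currently eligible to receive traffic."""
        return [s for s, up in self.healthy.items() if up]
```

An active checker would call `record(server, ok)` after each `/health` probe, and the request path would call it with `ok=False` on a 5xx response or a connection timeout.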
Cloud Load Balancers#
AWS#
| Service | Layer | Best For |
|---|---|---|
| ALB | L7 | HTTP routing, path-based, gRPC |
| NLB | L4 | TCP/UDP, extreme performance |
| CLB | L4/L7 | Legacy (don't use for new projects) |
GCP#
| Service | Layer | Best For |
|---|---|---|
| HTTP(S) LB | L7 | Global HTTP routing |
| TCP/UDP LB | L4 | Regional TCP balancing |
Self-Hosted#
| Tool | Best For |
|---|---|
| Nginx | Widely used, easy config, L4 + L7 |
| HAProxy | High performance, advanced health checks |
| Envoy | Service mesh, gRPC, observability |
| Traefik | Docker/K8s native, auto-discovery |
Architecture Patterns#
Basic Web App#
Internet → CloudFront (CDN) → ALB (L7) → Auto Scaling Group
                                         → EC2 Instance 1
                                         → EC2 Instance 2
                                         → EC2 Instance 3
Microservices with Service Mesh#
Client → API Gateway (L7 LB)
         → Service A → Envoy sidecar → Service B
         → Service C → Envoy sidecar → Service D
                             ↓
         Envoy Control Plane (Istio)
Global Load Balancing#
User (US) → DNS (GeoDNS) → US Load Balancer → US Servers
User (EU) → DNS (GeoDNS) → EU Load Balancer → EU Servers
User (AP) → DNS (GeoDNS) → AP Load Balancer → AP Servers
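At its simplest, the GeoDNS step in the diagram is a lookup from the client's region to a regional endpoint. The sketch below is purely illustrative: the `example.com` hostnames are placeholders, and falling back to the US endpoint for unknown regions is an assumption.

```python
# Hypothetical region -> load balancer hostnames (placeholder domains).
REGIONAL_LBS = {
    "US": "us-lb.example.com",
    "EU": "eu-lb.example.com",
    "AP": "ap-lb.example.com",
}
DEFAULT_REGION = "US"  # assumption: unknown regions fall back to US


def resolve_lb(client_region: str) -> str:
    """Return the load balancer hostname a GeoDNS-style resolver might answer with."""
    return REGIONAL_LBS.get(client_region, REGIONAL_LBS[DEFAULT_REGION])
```

Each regional load balancer then applies one of the algorithms above to its own server pool.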
Summary#
- Start with L7 (ALB/Nginx) for HTTP traffic — routing flexibility matters
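- Use L4 (NLB) for raw TCP/UDP or when latency is critical; gRPC works through L4 but is then balanced per connection, not per request
- Round Robin is a sensible default when servers are identical and requests are short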
- Least Connections for WebSocket/long-lived connections
- Consistent Hashing for caching layers
- Always add health checks — active + passive
- Consider global LB (GeoDNS) for multi-region deployments
Design your load balanced architecture at codelit.io — generate interactive diagrams with performance audits and infrastructure exports.