Load Balancer Types & Algorithms: L4 vs L7, Round Robin to Consistent Hashing

March 28, 2026 | 6 min read | By Codelit Team

Load Balancer Types & Algorithms Explained#

Any system that runs on more than one server needs a load balancer. But which type? Which algorithm? When do you need L7 instead of L4?

This guide covers everything from basic round robin to consistent hashing — with real architecture examples.

Why Load Balance?#

Without a load balancer:

Client → Server 1 (overloaded, 95% CPU)
Client → Server 1 (still the same one)
Client → Server 1 (it crashes)
         Server 2 (idle, 5% CPU)
         Server 3 (idle, 5% CPU)

With a load balancer:

Client → Load Balancer → Server 1 (33% CPU)
                       → Server 2 (33% CPU)
                       → Server 3 (33% CPU)

L4 vs L7 Load Balancing#

Layer 4 (Transport)#

Routes based on IP address and TCP/UDP port. It doesn't inspect the request content.

Client:443 → L4 LB → Backend:3000 (TCP connection forwarded)

Pros: Extremely fast (no payload inspection), low latency, works with any TCP- or UDP-based protocol
Cons: Can't route by URL path, headers, or cookies
Use when: Raw TCP/UDP traffic, database proxying, maximum performance (gRPC passes through, but is balanced per connection, not per call)

Tools: AWS NLB, Nginx (stream), HAProxy (TCP mode), Envoy
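
As a rough illustration of connection-level routing, the sketch below picks a backend when a connection first appears and then keeps every later packet on that same backend. The `L4Balancer` class, the `ConnKey` alias, and the addresses are hypothetical names made up for this example; a real L4 balancer does this in the kernel or in hardware, not in Python.

```python
from itertools import cycle
from typing import Dict, Tuple

# (client_ip, client_port, listener_ip, listener_port): hypothetical alias
ConnKey = Tuple[str, int, str, int]


class L4Balancer:
    """Chooses a backend per connection using only addresses and ports."""

    def __init__(self, backends):
        self._next_backend = cycle(backends)
        self._table: Dict[ConnKey, str] = {}

    def route(self, conn: ConnKey) -> str:
        # First packet of a connection: choose a backend and remember it.
        if conn not in self._table:
            self._table[conn] = next(self._next_backend)
        # Later packets on the same connection follow the stored mapping.
        return self._table[conn]

    def close(self, conn: ConnKey) -> None:
        # Forget the mapping once the connection ends.
        self._table.pop(conn, None)


lb = L4Balancer(["10.0.0.1:3000", "10.0.0.2:3000"])
conn_a = ("203.0.113.7", 51000, "198.51.100.1", 443)
conn_b = ("203.0.113.7", 51001, "198.51.100.1", 443)
```

Because the payload is never read, a long-lived connection that carries many requests (an HTTP/2 or gRPC channel, for instance) stays pinned to whichever backend it was first assigned.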

Layer 7 (Application)#

Routes based on HTTP headers, URL path, cookies, query params.

Client → L7 LB → /api/* → API servers
               → /static/* → CDN/static servers
               → /ws/* → WebSocket servers

Pros: Content-based routing, SSL termination, compression, caching, auth
Cons: Higher latency (must parse HTTP), more resource intensive
Use when: HTTP traffic, microservices routing, A/B testing, canary deploys

Tools: AWS ALB, Nginx, HAProxy, Envoy, Traefik, Cloudflare
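
The path-based routing in the diagram can be sketched as a prefix lookup. The pool names, the `x-canary` header rule, and the default pool are assumptions added for illustration; they are not the configuration of any particular product.

```python
from typing import Dict, List, Tuple

# Hypothetical route table mirroring the diagram: path prefix -> backend pool.
ROUTES: List[Tuple[str, str]] = [
    ("/api/", "api-servers"),
    ("/static/", "static-servers"),
    ("/ws/", "websocket-servers"),
]
DEFAULT_POOL = "web-servers"  # assumed fallback, not part of the diagram


def choose_pool(path: str, headers: Dict[str, str]) -> str:
    """Pick a pool from the request path and headers (keys assumed lowercase)."""
    # Header-based rule (assumed): opted-in clients reach a canary API pool.
    if path.startswith("/api/") and headers.get("x-canary") == "1":
        return "api-canary"
    # Path-based rules: first matching prefix wins.
    for prefix, pool in ROUTES:
        if path.startswith(prefix):
            return pool
    return DEFAULT_POOL
```

For example, `choose_pool("/api/users", {})` returns `"api-servers"`, while the same path with `{"x-canary": "1"}` returns `"api-canary"`. None of this is possible at L4, where the path and headers are never parsed.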

Quick Decision#

Need	Use
Route by URL path	L7
Route by header/cookie	L7
SSL termination	L7
Raw TCP/UDP (database, gRPC passthrough)	L4
Maximum throughput	L4
WebSocket + HTTP mixed	L7

Load Balancing Algorithms#

1. Round Robin#

Requests distributed sequentially: Server 1, 2, 3, 1, 2, 3...

Request 1 → Server 1
Request 2 → Server 2
Request 3 → Server 3
Request 4 → Server 1

Pros: Simple, even distribution of request counts
Cons: Ignores server load, so a slow server gets the same traffic as a fast one
Best for: Homogeneous servers with similar request costs
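
A minimal sketch of the rotation, using placeholder server names:

```python
from itertools import cycle

servers = ["server-1", "server-2", "server-3"]  # placeholder names
rotation = cycle(servers)


def next_server() -> str:
    """Return the next server in a fixed, repeating order."""
    return next(rotation)


first_four = [next_server() for _ in range(4)]
print(first_four)  # ['server-1', 'server-2', 'server-3', 'server-1']
```

The only state is the position in the rotation, which is why round robin is cheap to implement and easy to reason about.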

2. Weighted Round Robin#

Servers with more capacity get proportionally more traffic.

Server 1 (weight: 3) → gets 3 out of 6 requests
Server 2 (weight: 2) → gets 2 out of 6 requests
Server 3 (weight: 1) → gets 1 out of 6 requests

Best for: Mixed server sizes (e.g., during rolling upgrades with old + new hardware)
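
One way to implement the weights is the "smooth" variant sketched below, which interleaves picks instead of sending three requests in a row to the heaviest server. This is one possible approach under that assumption, not the only one; the class name is made up for the example.

```python
from collections import Counter
from typing import Dict


class SmoothWeightedRoundRobin:
    def __init__(self, weights: Dict[str, int]):
        self.weights = dict(weights)
        self.current = {name: 0 for name in weights}
        self.total = sum(weights.values())

    def next_server(self) -> str:
        # Every server earns credit equal to its weight on each pick.
        for name, weight in self.weights.items():
            self.current[name] += weight
        # The server with the most credit wins and pays back the total.
        chosen = max(self.current, key=self.current.get)
        self.current[chosen] -= self.total
        return chosen


wrr = SmoothWeightedRoundRobin({"server-1": 3, "server-2": 2, "server-3": 1})
picks = [wrr.next_server() for _ in range(6)]
counts = Counter(picks)
```

Over any six consecutive picks the counts match the 3:2:1 weights, and the credit values return to zero so the pattern repeats.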

3. Least Connections#

Route to the server with the fewest active connections.

Server 1: 50 active connections ← skip
Server 2: 12 active connections ← ROUTE HERE
Server 3: 30 active connections ← skip

Best for: Long-lived connections (WebSocket, database) where request duration varies
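
A minimal sketch, assuming the balancer keeps one counter per server that goes up when a connection opens and down when it closes. The class and method names are hypothetical.

```python
from typing import Dict


class LeastConnections:
    def __init__(self, servers):
        self.active: Dict[str, int] = {name: 0 for name in servers}

    def acquire(self) -> str:
        # Pick the server with the fewest open connections and count the new one.
        chosen = min(self.active, key=self.active.get)
        self.active[chosen] += 1
        return chosen

    def release(self, server: str) -> None:
        # Call when the connection closes.
        self.active[server] -= 1


lb = LeastConnections(["server-1", "server-2", "server-3"])
# Seed the counters with the numbers from the example above.
lb.active.update({"server-1": 50, "server-2": 12, "server-3": 30})
chosen = lb.acquire()
print(chosen)  # server-2
```

Forgetting to call `release` is the usual failure mode: the counter drifts upward and the server slowly stops receiving traffic.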

4. Least Response Time#

Route to the server with the lowest average response time.

Best for: Heterogeneous backends where speed matters more than connection count
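
The text does not say how the average is computed. The sketch below assumes an exponentially weighted moving average (EWMA) so that recent requests count more than old ones; the `alpha` value and the class name are illustrative choices.

```python
from typing import Dict, Optional


class LeastResponseTime:
    def __init__(self, servers, alpha: float = 0.2):
        self.alpha = alpha  # weight given to the newest sample (assumed value)
        self.avg_ms: Dict[str, Optional[float]] = {name: None for name in servers}

    def record(self, server: str, latency_ms: float) -> None:
        previous = self.avg_ms[server]
        if previous is None:
            self.avg_ms[server] = latency_ms
        else:
            self.avg_ms[server] = (1 - self.alpha) * previous + self.alpha * latency_ms

    def pick(self) -> str:
        # Servers with no samples yet sort first so they get measured.
        def sort_key(server: str) -> float:
            avg = self.avg_ms[server]
            return -1.0 if avg is None else avg

        return min(self.avg_ms, key=sort_key)


lrt = LeastResponseTime(["fast", "slow"])
lrt.record("fast", 20.0)
lrt.record("slow", 120.0)
lrt.record("fast", 40.0)  # average becomes 0.8 * 20 + 0.2 * 40 = 24.0
```

Here `lrt.pick()` returns `"fast"`. A production balancer would usually combine this with the connection count so that one quick server is not flooded.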

5. IP Hash#

Hash the client IP to deterministically route to the same server.

hash("192.168.1.100") % 3 = Server 2 (always)

Pros: Session affinity without cookies
Cons: Uneven distribution if client IPs cluster (for example, many users behind one NAT); most clients are remapped when servers are added or removed
Best for: Simple session stickiness without load balancer state
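
The sketch below shows both the determinism and the remapping problem. It uses SHA-256 as a stand-in hash (Python's built-in `hash()` on strings changes between runs) and a made-up list of client IPs.

```python
import hashlib


def pick_index(client_ip: str, server_count: int) -> int:
    """Map a client IP to a server index with hash(ip) % N."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % server_count


# 1,000 synthetic client addresses, purely for illustration.
clients = [f"10.0.{i // 256}.{i % 256}" for i in range(1000)]

# How many clients land on a different index when a 4th server is added?
moved = sum(1 for ip in clients if pick_index(ip, 3) != pick_index(ip, 4))
moved_fraction = moved / len(clients)
```

Going from 3 to 4 servers, a client keeps its index only when `h % 3 == h % 4`, which holds for about a quarter of hash values, so roughly three quarters of clients move. That is the cost consistent hashing is designed to avoid.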

6. Consistent Hashing#

Like IP hash, but adding or removing one of N servers remaps only about 1/N of the clients instead of most of them.

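Hash Ring: [0 ----S1---- 0.33 ----S2---- 0.66 ----S3---- 1.0]
           (S1 owns 0 to 0.33, S2 owns 0.33 to 0.66, S3 owns 0.66 to 1.0)

hash(client_ip) = 0.45 → routes to S2
Add S4 at 0.5:   S4 takes over 0.5 to 0.66 from S2
                 0.45 → still routes to S2 (only keys between 0.5 and 0.66 move)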

Best for: Caching layers (Memcached, CDN), stateful services, minimizing cache invalidation on scale events
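
A minimal ring can be sketched as follows. Two things here are assumptions of the sketch rather than anything stated above: each server is placed at 100 hashed positions ("virtual nodes") to even out the arcs, and a point owns the arc from its own position up to the next point. SHA-256 is a stand-in hash and the class name is made up.

```python
import bisect
import hashlib


def _position(label: str) -> int:
    digest = hashlib.sha256(label.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    def __init__(self, servers, vnodes: int = 100):
        self.vnodes = vnodes
        self._positions = []  # sorted ring positions
        self._owners = []     # _owners[i] is the server at _positions[i]
        for server in servers:
            self.add(server)

    def add(self, server: str) -> None:
        # Place the server at several pseudo-random points on the ring.
        for i in range(self.vnodes):
            pos = _position(f"{server}#{i}")
            idx = bisect.bisect(self._positions, pos)
            self._positions.insert(idx, pos)
            self._owners.insert(idx, server)

    def lookup(self, key: str) -> str:
        # Owner is the last point at or before the key's hash.
        # An index of -1 wraps around to the final point on the ring.
        idx = bisect.bisect_right(self._positions, _position(key)) - 1
        return self._owners[idx]


ring = HashRing(["S1", "S2", "S3"])
clients = [f"10.0.{i // 256}.{i % 256}" for i in range(1000)]
before = {ip: ring.lookup(ip) for ip in clients}
ring.add("S4")
after = {ip: ring.lookup(ip) for ip in clients}
moved = [ip for ip in clients if before[ip] != after[ip]]
```

Every client in `moved` now maps to `S4`, and no client moves between the three original servers. With four servers the moved share should be around a quarter, compared with roughly three quarters for `hash % N`.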

Algorithm Decision Matrix#

Scenario	Algorithm
All servers identical, short requests	Round Robin
Mixed server sizes	Weighted Round Robin
Long-lived connections (WebSocket)	Least Connections
Need session affinity	IP Hash or Cookie-based
Caching layer (minimize cache misses)	Consistent Hashing
Latency-sensitive	Least Response Time

Health Checks#

The load balancer must detect unhealthy servers:

Active health checks — LB periodically pings each server:

LB → GET /health → Server 1: 200 OK ✓
LB → GET /health → Server 2: 200 OK ✓
LB → GET /health → Server 3: timeout ✗ (removed from pool)

Passive health checks — LB monitors real traffic for errors:

5 consecutive 5xx responses → mark unhealthy
Connection timeout → mark unhealthy

Best practice: Use both. Active checks find a dead server even when it is receiving no traffic; passive checks catch failures that only show up on real requests.
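
The two kinds of check can be combined in one small tracker, sketched below. The `probe` callable stands in for a real `GET /health` request with a timeout, and the threshold of 5 comes from the example above; the class and method names are hypothetical.

```python
from typing import Callable, Dict, List


class HealthTracker:
    def __init__(self, servers, probe: Callable[[str], bool], max_failures: int = 5):
        self.probe = probe  # returns True when the server answers its health check
        self.max_failures = max_failures
        self.failures: Dict[str, int] = {name: 0 for name in servers}
        self.healthy: Dict[str, bool] = {name: True for name in servers}

    def run_active_checks(self) -> None:
        # Active: probe every server, whether or not it is serving traffic.
        for server in self.healthy:
            self.healthy[server] = self.probe(server)
            if self.healthy[server]:
                self.failures[server] = 0

    def record_response(self, server: str, status: int) -> None:
        # Passive: count consecutive 5xx responses seen on real requests.
        if status >= 500:
            self.failures[server] += 1
            if self.failures[server] >= self.max_failures:
                self.healthy[server] = False
        else:
            self.failures[server] = 0

    def pool(self) -> List[str]:
        return [name for name, ok in self.healthy.items() if ok]


def fake_probe(server: str) -> bool:
    return server != "server-3"  # pretend server-3 times out


tracker = HealthTracker(["server-1", "server-2", "server-3"], probe=fake_probe)
tracker.run_active_checks()
```

In this sketch a server removed by passive checks returns to the pool as soon as an active probe succeeds. Real balancers often require several successful probes in a row before re-admitting a server.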

Cloud Load Balancers#

AWS#

Service	Layer	Best For
ALB	L7	HTTP routing, path-based, gRPC
NLB	L4	TCP/UDP, extreme performance
CLB	L4/L7	Legacy (don't use for new projects)

GCP#

Service	Layer	Best For
HTTP(S) LB	L7	Global HTTP routing
TCP/UDP LB	L4	Regional TCP balancing

Self-Hosted#

Tool	Best For
Nginx	Most popular, easy config, L4 + L7
HAProxy	High performance, advanced health checks
Envoy	Service mesh, gRPC, observability
Traefik	Docker/K8s native, auto-discovery

Architecture Patterns#

Basic Web App#

Internet → CloudFront (CDN) → ALB (L7) → Auto Scaling Group
                                              → EC2 Instance 1
                                              → EC2 Instance 2
                                              → EC2 Instance 3

Microservices with Service Mesh#

Client → API Gateway (L7 LB)
              → Service A → Envoy sidecar → Service B
              → Service C → Envoy sidecar → Service D
                                    ↓
                             Envoy Control Plane (Istio)

Global Load Balancing#

User (US) → DNS (GeoDNS) → US Load Balancer → US Servers
User (EU) → DNS (GeoDNS) → EU Load Balancer → EU Servers
User (AP) → DNS (GeoDNS) → AP Load Balancer → AP Servers
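
The region selection can be sketched as a lookup with failover. Real GeoDNS infers the region from the resolver or client address; here the region is simply passed in, and the hostnames and fallback order are placeholders invented for the example.

```python
from typing import Set

# Placeholder endpoints, one regional load balancer per region.
REGION_ENDPOINTS = {
    "US": "us-lb.example.com",
    "EU": "eu-lb.example.com",
    "AP": "ap-lb.example.com",
}
FALLBACK_ORDER = ["US", "EU", "AP"]  # assumed preference when the home region is down


def resolve(client_region: str, healthy_regions: Set[str]) -> str:
    """Return the load balancer hostname a client in client_region should use."""
    if client_region in REGION_ENDPOINTS and client_region in healthy_regions:
        return REGION_ENDPOINTS[client_region]
    # Home region unknown or unhealthy: fail over to the first healthy region.
    for region in FALLBACK_ORDER:
        if region in healthy_regions:
            return REGION_ENDPOINTS[region]
    raise RuntimeError("no healthy region available")
```

For example, `resolve("EU", {"US", "EU", "AP"})` returns `"eu-lb.example.com"`, and `resolve("EU", {"US", "AP"})` fails over to `"us-lb.example.com"`.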

Summary#

Start with L7 (ALB/Nginx) for HTTP traffic — routing flexibility matters
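Use L4 (NLB) for raw TCP/UDP or when latency is critical; gRPC works through L4 but is balanced per connection, so per-call balancing needs L7
Round Robin is a sensible default when servers are identical and requests are short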
Least Connections for WebSocket/long-lived connections
Consistent Hashing for caching layers
Always add health checks — active + passive
Consider global LB (GeoDNS) for multi-region deployments
