DNS-Based Load Balancing — Distributing Traffic at the Edge
What is DNS-based load balancing?#
DNS-based load balancing distributes traffic across multiple servers by returning different IP addresses in response to DNS queries. Instead of a single A record pointing to one server, the authoritative DNS server answers with one or more addresses from a pool — steering clients to different endpoints before they even open a TCP connection.
This happens at the very edge of the request path, before any application-layer load balancer gets involved.
Why use DNS for load balancing?#
- Global distribution — route users to the nearest data centre without a central proxy
- No single point of failure — DNS is inherently distributed
- Layer independence — works with any protocol (HTTP, gRPC, TCP, UDP)
- Cost efficiency — no dedicated load balancer hardware for cross-region routing
- Massive scale — DNS handles billions of queries per day with minimal overhead
Round-robin DNS#
The simplest form of DNS load balancing. The authoritative DNS server rotates through a list of IP addresses, returning them in a different order for each query.
How it works#
; Zone file
api.example.com. 300 IN A 203.0.113.1
api.example.com. 300 IN A 203.0.113.2
api.example.com. 300 IN A 203.0.113.3
Query 1 returns: 203.0.113.1, 203.0.113.2, 203.0.113.3
Query 2 returns: 203.0.113.2, 203.0.113.3, 203.0.113.1
Query 3 returns: 203.0.113.3, 203.0.113.1, 203.0.113.2
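The rotation above can be sketched in a few lines of Python — a minimal model of an authoritative server that shifts the record set one position per query (the IPs are the illustrative pool from the zone file):

```python
from itertools import cycle

# Record pool from the example zone file above.
POOL = ["203.0.113.1", "203.0.113.2", "203.0.113.3"]

def rotated_responses(pool):
    """Yield the full record set, rotated one position per query."""
    n = len(pool)
    for start in cycle(range(n)):
        yield [pool[(start + i) % n] for i in range(n)]

responses = rotated_responses(POOL)
print(next(responses))  # ['203.0.113.1', '203.0.113.2', '203.0.113.3']
print(next(responses))  # ['203.0.113.2', '203.0.113.3', '203.0.113.1']
```

Note that the server still returns the *whole* set each time; which address a client actually uses depends on its own selection behaviour.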
Limitations#
- No health awareness — if a server goes down, DNS keeps returning its IP
- Uneven distribution — DNS caching means many clients share the same resolved IP
- No session affinity — consecutive requests from the same client may hit different servers
- Client behaviour varies — some clients always use the first IP, others pick randomly
Weighted DNS#
Weighted DNS assigns a weight to each record, controlling the probability that a particular IP is returned. This allows gradual traffic shifting — useful for canary deployments, capacity-proportional routing, and migrations.
Configuration example (Route 53)#
{
"Name": "api.example.com",
"Type": "A",
"SetIdentifier": "primary",
"Weight": 80,
"TTL": 60,
"ResourceRecords": [{"Value": "203.0.113.1"}]
}
{
"Name": "api.example.com",
"Type": "A",
"SetIdentifier": "canary",
"Weight": 20,
"TTL": 60,
"ResourceRecords": [{"Value": "203.0.113.2"}]
}
Roughly 80% of DNS responses return the primary IP and 20% the canary (caching skews the exact split seen by clients). Adjusting the weights over time enables zero-downtime migrations.
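The selection logic behind weighted records can be sketched as a probability-proportional pick — a toy model, not Route 53's actual implementation, using the IPs and weights from the config above:

```python
import random

# Weighted record set mirroring the Route 53 example above: (IP, weight).
RECORDS = [("203.0.113.1", 80), ("203.0.113.2", 20)]

def weighted_answer(records, rng=random):
    """Pick one record with probability proportional to its weight."""
    ips, weights = zip(*records)
    return rng.choices(ips, weights=weights, k=1)[0]

# Over many queries, roughly 80% of answers are the primary IP.
hits = sum(weighted_answer(RECORDS) == "203.0.113.1" for _ in range(10_000))
print(hits / 10_000)  # ~0.8
```

Shifting traffic for a canary rollout is then just editing the weight values, with no client-side changes.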
GeoDNS#
GeoDNS returns different IP addresses based on the geographic location of the DNS resolver (or the client, if EDNS Client Subnet is supported).
Use cases#
- Latency reduction — route European users to Frankfurt, US users to Virginia
- Data sovereignty — keep EU data in EU regions to comply with GDPR
- Content localisation — serve region-specific content from nearby servers
How location is determined#
- Resolver IP — the DNS server maps the recursive resolver's IP to a geographic location using a GeoIP database
- EDNS Client Subnet (ECS) — the resolver forwards a truncated version of the client's IP, giving the authoritative server a more accurate location
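The lookup step can be sketched as a longest-prefix-style match against a GeoIP table — the CIDR blocks, regions, and IPs below are illustrative, not real geography:

```python
import ipaddress

# Toy GeoIP table: CIDR prefix -> region (illustrative mappings only).
GEO_TABLE = {
    ipaddress.ip_network("198.51.100.0/24"): "eu-west",
    ipaddress.ip_network("203.0.113.0/24"): "us-east",
}
REGION_IPS = {"eu-west": "192.0.2.10", "us-east": "192.0.2.20"}
DEFAULT_REGION = "us-east"

def geo_answer(source_ip: str) -> str:
    """Map the resolver's IP (or the ECS subnet) to a regional endpoint."""
    addr = ipaddress.ip_address(source_ip)
    for net, region in GEO_TABLE.items():
        if addr in net:
            return REGION_IPS[region]
    return REGION_IPS[DEFAULT_REGION]  # fall back when no prefix matches

print(geo_answer("198.51.100.7"))  # 192.0.2.10 (eu-west)
```

With ECS, `source_ip` is the truncated client subnet rather than the resolver's address, which is what improves accuracy for users behind public resolvers.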
Limitations#
- GeoIP databases are imperfect — corporate VPNs and public DNS resolvers (8.8.8.8) can mislocate users
- ECS adoption is not universal
- Geographic proximity does not always equal network proximity
Latency-based routing#
Latency-based routing goes beyond geography. Instead of mapping IPs to locations, it measures actual network latency between clients and endpoints, then returns the IP with the lowest latency.
How Route 53 implements it#
- AWS continuously measures latency between its edge locations and each AWS region
- When a query arrives, Route 53 identifies the edge location closest to the resolver
- It returns the record associated with the lowest-latency region from that edge location
{
"Name": "api.example.com",
"Type": "A",
"SetIdentifier": "us-east-1",
"Region": "us-east-1",
"TTL": 60,
"ResourceRecords": [{"Value": "203.0.113.1"}]
}
{
"Name": "api.example.com",
"Type": "A",
"SetIdentifier": "eu-west-1",
"Region": "eu-west-1",
"TTL": 60,
"ResourceRecords": [{"Value": "203.0.113.2"}]
}
Latency-based vs GeoDNS#
| Aspect | GeoDNS | Latency-based |
|---|---|---|
| Routing signal | Geographic location | Measured network latency |
| Accuracy | Good for most cases | Better for edge cases (VPNs, anycast) |
| Setup complexity | Moderate | Lower (cloud-managed) |
| Provider examples | NS1, Cloudflare | Route 53, Azure Traffic Manager |
Health-check integration#
DNS load balancing without health checks is dangerous — it sends traffic to dead servers. Modern DNS providers integrate health checks directly into the routing decision.
Health check flow#
- The DNS provider periodically sends health-check probes to each endpoint (HTTP, HTTPS, TCP, or custom)
- If an endpoint fails consecutive checks, it is marked unhealthy
- The DNS server stops returning the unhealthy endpoint's IP
- When the endpoint recovers, it is gradually re-added
Configuration considerations#
- Check interval — 10–30 seconds is typical. Faster detection means more probe traffic.
- Failure threshold — require 2–3 consecutive failures before marking unhealthy to avoid flapping
- Recovery threshold — require 2–3 consecutive successes before marking healthy again
- Check path — use a dedicated health endpoint that verifies downstream dependencies
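The failure and recovery thresholds above amount to a small state machine: an endpoint flips state only after enough consecutive results in the opposite direction. A minimal sketch (parameter defaults are illustrative):

```python
class HealthChecker:
    """Flip healthy/unhealthy only after N consecutive contrary probe results."""

    def __init__(self, failure_threshold=3, recovery_threshold=3):
        self.failure_threshold = failure_threshold
        self.recovery_threshold = recovery_threshold
        self.healthy = True
        self._streak = 0  # consecutive probes contradicting the current state

    def record(self, probe_ok: bool) -> bool:
        if probe_ok == self.healthy:
            self._streak = 0  # current state confirmed; reset the streak
        else:
            self._streak += 1
            threshold = (self.failure_threshold if self.healthy
                         else self.recovery_threshold)
            if self._streak >= threshold:
                self.healthy = probe_ok  # enough evidence: flip the state
                self._streak = 0
        return self.healthy

hc = HealthChecker(failure_threshold=2)
print([hc.record(ok) for ok in [True, False, False, False]])
# [True, True, False, False] — two consecutive failures flip the state
```

A single failed probe never removes an endpoint, which is exactly the anti-flapping behaviour the thresholds exist to provide.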
The TTL problem#
Even after DNS removes an unhealthy IP, clients with cached responses continue sending traffic to the dead server until their cached TTL expires. This is the fundamental limitation of DNS-based health checks.
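The worst case is easy to put numbers on. A back-of-envelope calculation, assuming the illustrative parameters below (not tied to any specific provider):

```python
check_interval_s = 30     # seconds between health-check probes
failure_threshold = 3     # consecutive failures before marking unhealthy
ttl_s = 60                # TTL on the cached record

# Detection alone can take interval * threshold seconds...
detection_s = check_interval_s * failure_threshold   # up to 90 s
# ...and clients that cached the answer just before removal lag by a full TTL.
worst_case_s = detection_s + ttl_s
print(worst_case_s)  # 150 s before the last client stops hitting the dead IP
```

This is why failover time budgets must include the TTL, not just the health-check detection window.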
Route 53 failover#
Route 53 offers a dedicated failover routing policy for active-passive setups:
Primary-secondary failover#
Primary record -> 203.0.113.1 (active, health-checked)
Secondary record -> 203.0.113.2 (standby, returned only when primary is unhealthy)
Multi-level failover#
Combine failover with other routing policies:
- Top level — latency-based routing to the nearest region
- Second level — failover within each region (primary AZ to secondary AZ)
- Third level — weighted routing within each AZ for canary deployments
This creates a routing tree where each level handles a different concern.
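Resolution through such a tree can be sketched as a walk from coarse to fine decisions — the tree below is a toy with made-up AZ names, weights, and IPs:

```python
import random

# Toy routing tree: region (latency) -> AZ (failover) -> weighted records.
TREE = {
    "eu-west-1": {
        "primary_az":   {"stable": ("203.0.113.1", 90),
                         "canary": ("203.0.113.3", 10)},
        "secondary_az": {"stable": ("203.0.113.2", 100)},
    },
}

def resolve(region: str, primary_healthy: bool, rng=random) -> str:
    """Walk the tree: region -> AZ failover -> weighted pick within the AZ."""
    az = "primary_az" if primary_healthy else "secondary_az"
    ips, weights = zip(*TREE[region][az].values())
    return rng.choices(ips, weights=weights, k=1)[0]

print(resolve("eu-west-1", primary_healthy=False))  # 203.0.113.2
```

Each level stays independently configurable: weights change for canaries, health checks drive the AZ failover, and latency measurements pick the region.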
DNS TTL strategies#
TTL (Time To Live) controls how long resolvers and clients cache a DNS response. It is the single most important parameter in DNS-based load balancing.
Low TTL (30–60 seconds)#
- Pro — fast failover, traffic shifts take effect quickly
- Con — higher query volume to authoritative servers, slightly higher latency for first request after cache expiry
High TTL (300–3600 seconds)#
- Pro — fewer DNS queries, faster resolution for cached clients
- Con — slow failover, stale records served during outages
Recommended strategy#
| Scenario | Recommended TTL |
|---|---|
| Active-passive failover | 30–60 seconds |
| Latency-based routing | 60 seconds |
| Stable multi-region | 300 seconds |
| Static content CDN | 3600 seconds |
| During a migration | 30 seconds (lower before, raise after) |
TTL floor reality#
Many recursive resolvers enforce a minimum TTL (often 30 seconds). Some corporate resolvers cache for much longer regardless of the TTL you set. Plan for worst-case cache staleness, not just the TTL you configure.
Combining DNS with application-layer load balancing#
DNS load balancing works best as the first layer in a multi-tier strategy:
- DNS layer — GeoDNS or latency-based routing directs users to the nearest region
- Edge layer — a regional load balancer (ALB, Envoy) handles TLS termination and HTTP routing
- Service layer — service mesh or internal load balancer distributes traffic across pods
DNS handles coarse-grained, cross-region routing. Application-layer load balancers handle fine-grained, request-level decisions.
Common pitfalls#
- Ignoring DNS caching — always account for cached TTLs in failover time calculations
- No health checks — round-robin DNS without health checks is a ticking time bomb
- Over-relying on DNS — DNS cannot do connection draining, rate limiting, or request-level routing
- Forgetting EDNS Client Subnet — without ECS, GeoDNS accuracy drops for users behind public resolvers
- TTL too high during migrations — lower TTL well before the migration, not during it