DNS Architecture Design: From Resolution to Global Traffic Management
DNS Architecture Design#
DNS is the invisible backbone of the internet. Every HTTP request, every API call, every email delivery begins with a DNS lookup. Getting DNS architecture right impacts latency, availability, security, and disaster recovery.
How DNS Works#
DNS translates human-readable domain names into IP addresses. The resolution process is a hierarchical chain:
Browser cache → OS cache → Recursive resolver → Root nameserver
→ TLD nameserver → Authoritative nameserver → IP address returned
The Resolution Flow in Detail#
- Browser cache — checks if the domain was resolved recently
- OS resolver cache — the operating system maintains its own cache
- Recursive resolver — your ISP or configured resolver (e.g., 8.8.8.8) queries on your behalf
- Root nameserver — 13 named root server identities (a through m), each served by many anycast instances, refer the resolver to the correct TLD nameservers
- TLD nameserver — .com, .io, .dev servers point to the authoritative NS
- Authoritative nameserver — holds the actual DNS records and returns the answer
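A typical uncached lookup takes 50-200ms, depending on the network distance to each nameserver in the chain. Once the answer is cached locally, subsequent lookups take under 1ms until the TTL expires.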
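The walk from root to authoritative server can be sketched with in-memory tables. This is a toy model rather than a real resolver: the server names (`tld-com`, `auth-example`) are hypothetical, the zone is assumed to be exactly two labels long, and the cache ignores TTLs for brevity.

```python
# Toy delegation data. Server names are made up for illustration only.
ROOT = {"com.": "tld-com"}  # root knows which server handles each TLD
TLD_SERVERS = {"tld-com": {"example.com.": "auth-example"}}  # TLD knows each zone's NS
AUTH_SERVERS = {"auth-example": {"www.example.com.": "93.184.216.34"}}

cache = {}  # simplified: a real cache expires entries by TTL


def resolve(name):
    """Return (address, path) where path lists the servers consulted."""
    if name in cache:
        return cache[name], ["cache"]
    labels = name.rstrip(".").split(".")
    tld = labels[-1] + "."                 # e.g. "com."
    zone = ".".join(labels[-2:]) + "."     # assumes a two-label zone, e.g. "example.com."
    path = ["root"]
    tld_server = ROOT[tld]                 # step 1: root refers us to the TLD server
    path.append(tld_server)
    auth_server = TLD_SERVERS[tld_server][zone]  # step 2: TLD refers us to the zone's NS
    path.append(auth_server)
    address = AUTH_SERVERS[auth_server][name]    # step 3: authoritative answer
    cache[name] = address
    return address, path
```

The first call for a name walks all three levels; a repeated call is answered from the cache, which is where the latency difference between cold and warm lookups comes from.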
DNS Record Types#
A and AAAA Records#
The most fundamental records. A maps a domain to an IPv4 address; AAAA maps to IPv6.
example.com. A 93.184.216.34
example.com. AAAA 2606:2800:220:1:248:1893:25c8:1946
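When generating records programmatically, the record type can be derived from the address itself. A minimal sketch using Python's standard `ipaddress` module (the function name `record_type_for` is a hypothetical helper):

```python
import ipaddress


def record_type_for(address: str) -> str:
    """Return "A" for an IPv4 address and "AAAA" for an IPv6 address."""
    ip = ipaddress.ip_address(address)  # raises ValueError on malformed input
    return "A" if ip.version == 4 else "AAAA"
```

For the two records above, `record_type_for("93.184.216.34")` gives `"A"` and `record_type_for("2606:2800:220:1:248:1893:25c8:1946")` gives `"AAAA"`.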
CNAME Records#
Aliases one domain to another. The resolver follows the chain until it finds an A/AAAA record.
www.example.com. CNAME example.com.
Key constraint: a CNAME cannot coexist with other record types at the same name. The zone apex always carries SOA and NS records, so a CNAME is not allowed there, which is why many providers offer ALIAS/ANAME records as a workaround at the apex.
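Both behaviours, chain following and the coexistence rule, can be illustrated over a zone held as a plain dictionary. This is a simplified sketch: the data layout and the helper names `follow_cname` and `cname_conflicts` are assumptions, and the hop limit of 8 is an arbitrary illustrative value.

```python
ZONE = {
    "www.example.com.": [("CNAME", "example.com.")],
    "example.com.": [("A", "93.184.216.34")],
}


def follow_cname(name, zone, max_hops=8):
    """Follow CNAMEs from name until A/AAAA records are found."""
    seen = set()
    while name not in seen and len(seen) < max_hops:
        seen.add(name)
        records = zone.get(name, [])
        targets = [value for rtype, value in records if rtype == "CNAME"]
        if not targets:
            # End of the chain: return whatever address records exist here.
            return [value for rtype, value in records if rtype in ("A", "AAAA")]
        name = targets[0]
    raise ValueError("CNAME loop or chain too long")


def cname_conflicts(zone):
    """List names where a CNAME shares the name with any other record."""
    return [
        name
        for name, records in zone.items()
        if any(rtype == "CNAME" for rtype, _ in records) and len(records) > 1
    ]
```

Running `cname_conflicts` as a pre-deploy check would flag an apex that carries both a CNAME and other records before the zone is published.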
MX Records#
Mail exchange records direct email to the correct mail servers, with priority values (lower = higher priority).
example.com. MX 10 mail1.example.com.
example.com. MX 20 mail2.example.com.
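A sending server tries the exchangers in ascending preference order. A minimal sketch of that ordering, assuming the records are already parsed into `(preference, host)` tuples:

```python
MX_RECORDS = [(20, "mail2.example.com."), (10, "mail1.example.com.")]


def delivery_order(mx_records):
    """Return mail hosts sorted so the lowest preference value comes first."""
    return [host for _, host in sorted(mx_records, key=lambda record: record[0])]
```

Here `delivery_order(MX_RECORDS)` puts `mail1.example.com.` ahead of `mail2.example.com.`. This sketch keeps hosts with equal preference in input order; real senders typically spread load across them.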
TXT Records#
Arbitrary text data used for verification, SPF, DKIM, DMARC, and domain ownership proofs.
example.com. TXT "v=spf1 include:_spf.google.com ~all"
example.com. TXT "google-site-verification=abc123..."
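To show how an SPF string is structured, the sketch below splits it into qualifier, mechanism, and value. It is deliberately simplified: it does not handle modifiers such as `redirect=` or CIDR suffixes on `a`/`mx`, and `parse_spf` is a hypothetical helper name.

```python
def parse_spf(txt):
    """Split an SPF record into (qualifier, mechanism, value) tuples."""
    terms = txt.split()
    if not terms or terms[0] != "v=spf1":
        raise ValueError("not an SPF record")
    parsed = []
    for term in terms[1:]:
        qualifier = "+"  # "+" (pass) is the default when no qualifier is written
        if term[0] in "+-~?":
            qualifier, term = term[0], term[1:]
        mechanism, _, value = term.partition(":")
        parsed.append((qualifier, mechanism, value or None))
    return parsed
```

For the record above, the result is `[("+", "include", "_spf.google.com"), ("~", "all", None)]`: accept senders authorized by the included policy, and soft-fail everything else.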
SRV Records#
Service locator records specify host, port, priority, and weight for specific services.
_sip._tcp.example.com. SRV 10 60 5060 sipserver.example.com.
Format: priority weight port target. Used heavily in VoIP, LDAP, and Kubernetes service discovery.
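The four fields can be parsed into a small typed record. This is a sketch under the assumption that the record data arrives as a whitespace-separated string; `Srv`, `parse_srv`, and `preferred` are illustrative names.

```python
from typing import NamedTuple


class Srv(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def parse_srv(rdata):
    """Parse "priority weight port target" into an Srv tuple."""
    priority, weight, port, target = rdata.split()
    return Srv(int(priority), int(weight), int(port), target)


def preferred(records):
    """Keep only the records in the lowest (most preferred) priority group."""
    lowest = min(record.priority for record in records)
    return [record for record in records if record.priority == lowest]
```

Within the preferred group, a client would then choose among targets in proportion to their weight values.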
TTL Strategies#
Time-to-Live controls how long resolvers cache a record. Getting TTL right is a balancing act:
| TTL Value | Use Case | Trade-off |
|---|---|---|
| 60s | Pre-migration, failover-ready | Higher query volume, more resolver load |
| 300s | Active services, moderate change frequency | Good balance for most workloads |
| 3600s | Stable services | Lower query cost, slower failover |
| 86400s | Rarely changing records | Minimal queries, very slow propagation |
TTL Best Practices#
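- Lower TTL before migrations — drop to 60s at least one full old-TTL period before the change (24-48 hours covers most cases), so caches still holding the longer TTL have expired by cutover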
- Raise TTL after stabilization — once a migration is confirmed, increase to reduce load
- Use different TTLs per record — MX records can have long TTLs; A records for active services should be shorter
- Account for resolver non-compliance — some resolvers ignore low TTLs or enforce minimums
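The effect of a resolver that enforces a minimum TTL can be modelled with a small cache. This is an illustrative sketch, not how any particular resolver is implemented; `TtlCache` and its `min_ttl` parameter are assumptions, and time is passed in explicitly so the behaviour is easy to follow.

```python
class TtlCache:
    """Cache whose entries expire after their TTL, with an optional TTL floor."""

    def __init__(self, min_ttl=0):
        self.min_ttl = min_ttl  # a non-compliant resolver raises low TTLs to this
        self._entries = {}

    def put(self, name, value, ttl, now):
        effective_ttl = max(ttl, self.min_ttl)
        self._entries[name] = (value, now + effective_ttl)

    def get(self, name, now):
        entry = self._entries.get(name)
        if entry is None or now >= entry[1]:
            self._entries.pop(name, None)  # drop expired entries
            return None
        return entry[0]
```

With `min_ttl=300`, a record published with a 60s TTL is still served at second 200, which is why a lowered TTL does not guarantee that every client sees a change within a minute.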
GeoDNS and Global Traffic Management#
GeoDNS returns different IP addresses based on the requester's geographic location.
User in Europe → resolves to eu-west-1 load balancer
User in Asia → resolves to ap-southeast-1 load balancer
User in US → resolves to us-east-1 load balancer
Implementation Patterns#
- Latency-based routing — answer with the endpoint that has the lowest measured latency from the requester's network (for example, Route 53 latency-based routing across AWS Regions)
- Geographic routing — map continents, countries, or states to specific endpoints
- Weighted routing — distribute traffic by percentage across endpoints (useful for blue/green deploys)
- Multivalue answer — return multiple IPs and let the client choose (poor man's load balancing)
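Geographic and weighted routing both reduce to a small selection step on the authoritative side. The sketch below assumes the requester's region is already known; the fallback endpoint and the function names are hypothetical.

```python
import random

GEO_ENDPOINTS = {
    "Europe": "eu-west-1",
    "Asia": "ap-southeast-1",
    "US": "us-east-1",
}
DEFAULT_ENDPOINT = "us-east-1"  # assumption: where unmapped regions are sent


def route_by_region(region):
    """Geographic routing: a fixed region-to-endpoint mapping with a fallback."""
    return GEO_ENDPOINTS.get(region, DEFAULT_ENDPOINT)


def pick_weighted(endpoints, rng):
    """Weighted routing: choose one endpoint in proportion to its weight."""
    names = list(endpoints)
    weights = [endpoints[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]
```

For a blue/green rollout, `pick_weighted({"blue": 90, "green": 10}, random.Random())` sends roughly one query in ten to the new stack. Because resolvers cache answers, the split seen by actual users only approximates the configured weights.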
DNS Failover#
DNS-based failover detects unhealthy endpoints and removes them from responses.
Health check → endpoint /health returns 5xx
→ DNS provider removes endpoint from rotation
→ Next query returns only healthy endpoints
→ Recovery detected → endpoint re-added
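Limitations: failover speed is bounded by health-check detection time plus TTL. With a 300s TTL, clients that cached the record just before the failure can keep using the dead endpoint for up to 5 minutes after the provider removes it. Combine DNS failover with application-level retries for faster recovery.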
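The worst-case outage window seen by a cached client can be estimated with simple arithmetic. In this sketch the health-check interval and failure threshold are assumed parameters (providers expose similar settings under their own names), and the addresses come from documentation ranges.

```python
def worst_case_failover_seconds(check_interval, failure_threshold, ttl):
    """Detection time (consecutive failed checks) plus one full TTL of caching."""
    detection = check_interval * failure_threshold
    return detection + ttl


def healthy_answers(endpoints, health):
    """Return only endpoints whose latest health check passed."""
    return [ip for ip in endpoints if health.get(ip, False)]
```

With a 30s check interval, 3 failures required, and a 300s TTL, `worst_case_failover_seconds(30, 3, 300)` is 390 seconds. Dropping the TTL to 60s brings the same scenario down to 150 seconds.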
DNSSEC#
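DNSSEC adds cryptographic signatures to DNS responses so that validating resolvers can detect forged or modified answers, which prevents cache poisoning and on-path tampering. It authenticates data but does not encrypt it.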
The Chain of Trust#
Root zone (signed) → TLD zone (signed) → Your zone (signed)
Each level signs the delegation to the next
Key Components#
- RRSIG — signature over a record set
- DNSKEY — public key used to verify signatures
- DS — delegation signer, links parent zone to child zone's key
- NSEC/NSEC3 — authenticated denial of existence
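To make the DS-to-DNSKEY link concrete, the sketch below computes a SHA-256 DS digest: a hash over the owner name in lowercase wire format followed by the DNSKEY record data (flags, protocol, algorithm, public key). The key bytes in the usage note are placeholders, not a real key, and this is an illustration only; production zones should rely on their DNS software's own tooling.

```python
import hashlib
import struct


def name_to_wire(name):
    """Encode a domain name as lowercase, length-prefixed labels."""
    wire = b""
    for label in name.rstrip(".").lower().split("."):
        if label:
            wire += bytes([len(label)]) + label.encode("ascii")
    return wire + b"\x00"  # the root label terminates the name


def ds_digest_sha256(owner, flags, protocol, algorithm, public_key):
    """Return the hex SHA-256 digest a DS record would carry for this DNSKEY."""
    rdata = struct.pack("!HBB", flags, protocol, algorithm) + public_key
    return hashlib.sha256(name_to_wire(owner) + rdata).hexdigest()
```

A call such as `ds_digest_sha256("example.com.", 257, 3, 13, b"\x00" * 64)` uses flags 257 (key-signing key), protocol 3, and algorithm 13 (ECDSA P-256 with SHA-256). The parent zone publishes this digest in its DS record, and a validator recomputes it from the child's DNSKEY to confirm the delegation.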
DNSSEC Considerations#
- Increases response size (UDP fragmentation risk)
- Key rotation must be planned carefully
- Not all resolvers validate DNSSEC
- Misconfiguration can make your domain unresolvable
Private DNS Zones#
Private DNS zones resolve names only within a private network (VPC, VPN, or internal infrastructure).
Use Cases#
- Service discovery — api.internal.company.com resolves to private IPs
- Split-horizon DNS — same domain returns different IPs internally vs externally
- Database endpoints — postgres.db.internal avoids hardcoding IPs
Implementation#
VPC DNS (e.g., AmazonProvidedDNS at 169.254.169.253)
→ Private hosted zone: internal.company.com
→ Records resolve only within associated VPCs
→ Names outside the private zone fall through to public DNS
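Split-horizon behaviour comes down to choosing a view based on where the query originates. A minimal sketch, assuming a single VPC range of 10.0.0.0/16 and made-up records (`app.company.com.` and all addresses are hypothetical):

```python
import ipaddress

VPC_CIDR = ipaddress.ip_network("10.0.0.0/16")  # assumed private network range

PRIVATE_ZONE = {
    "api.internal.company.com.": "10.0.1.25",
    "app.company.com.": "10.0.2.40",
}
PUBLIC_ZONE = {
    "app.company.com.": "203.0.113.10",
}


def answer(name, client_ip):
    """Serve the private view to clients inside the VPC, the public view otherwise."""
    inside = ipaddress.ip_address(client_ip) in VPC_CIDR
    if inside and name in PRIVATE_ZONE:
        return PRIVATE_ZONE[name]
    return PUBLIC_ZONE.get(name)  # None when the name does not exist publicly
```

A client at 10.0.3.7 gets the private address for `app.company.com.`, an outside client gets the public one, and `api.internal.company.com.` does not resolve at all from outside.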
Tools and Services#
Amazon Route 53#
Full-featured DNS with health checks, failover, latency-based routing, and tight AWS integration. Supports alias records for AWS resources (CloudFront, ALB, S3) at no extra query cost.
Cloudflare DNS#
Extremely fast authoritative DNS (sub-10ms globally). Free tier available. Pairs with Cloudflare's proxy for DDoS protection and CDN. Supports DNSSEC with one click.
NS1#
Advanced traffic management with filter chains — combine geographic, cost, performance, and availability signals. Strong API-first approach. Popular for complex multi-CDN and multi-cloud architectures.
Other Notable Tools#
- Google Cloud DNS — managed DNS with a 100% availability SLA
- Azure DNS — native Azure integration with alias record support
- CoreDNS — Kubernetes-native DNS server, extensible via plugins
- PowerDNS — open-source authoritative server with API and Lua scripting
Architecture Checklist#
- Use ALIAS/ANAME records at zone apex instead of CNAME
- Set appropriate TTLs per record type and change frequency
- Enable DNSSEC for public-facing zones
- Configure health checks and failover for critical endpoints
- Use private hosted zones for internal service discovery
- Monitor DNS query latency and error rates
- Plan key rotation for DNSSEC
- Test failover scenarios regularly
- Document your DNS architecture and zone delegation chain
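For the monitoring item, a basic probe can time lookups through the operating system's resolver and record failures. This is a sketch: `timed_lookup` is a hypothetical helper, and because `socket.getaddrinfo` goes through local caches, it measures what applications experience rather than raw authoritative latency.

```python
import socket
import time


def timed_lookup(name, resolve=socket.getaddrinfo):
    """Return (succeeded, elapsed_ms) for one lookup of name."""
    start = time.perf_counter()
    try:
        resolve(name, None)
        succeeded = True
    except OSError:  # socket.gaierror is a subclass of OSError
        succeeded = False
    elapsed_ms = (time.perf_counter() - start) * 1000
    return succeeded, elapsed_ms
```

The `resolve` parameter exists so a different lookup function can be substituted, for example a stub in tests. Feeding the results into an existing metrics pipeline gives both a latency distribution and an error rate over time.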
Key Takeaways#
DNS architecture is deceptively simple on the surface but deeply impactful at scale. The difference between a 50ms and 5ms DNS lookup multiplied across millions of requests is measurable. Proper TTL management, failover configuration, and security hardening with DNSSEC form the foundation of reliable infrastructure.
Build and visualize your DNS architecture on codelit.io — the system design tool for developers who ship.
This is article #155 in the Codelit engineering blog series.