Kubernetes Architecture Explained — Components, Patterns, and When to Use It
Kubernetes is everywhere — but do you actually need it?#
Kubernetes (K8s) runs some of the largest systems in the world. Google, Spotify, Airbnb, and thousands of companies use it to orchestrate containers at scale.
But Kubernetes is also the most over-adopted technology in the industry. Teams with 3 engineers running 2 services don't need a container orchestration platform. They need a PaaS.
Let's break down what Kubernetes actually does, how it works, and when it earns its complexity.
The 30-second version#
Kubernetes manages containers across multiple machines. You tell it "I want 3 copies of this service running" and it figures out where to put them, restarts them if they crash, and routes traffic to them.
That's it. Everything else is details.
Architecture overview#
Kubernetes has two layers: the control plane (the brain) and worker nodes (the muscle).
Control plane components#
API Server (kube-apiserver): The front door. Every interaction with the cluster goes through the API server — kubectl commands, dashboard requests, internal components talking to each other. It validates requests and updates the cluster state in etcd.
etcd: The cluster's database. A distributed key-value store that holds all cluster state — what pods exist, what services are configured, what secrets are stored. If etcd dies, the cluster is brain-dead.
Scheduler (kube-scheduler): Decides where to run new pods. Considers resource requests, node capacity, affinity rules, and constraints. "This pod needs 2GB RAM and a GPU — put it on node-3."
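That quoted constraint maps directly to a pod spec's resource requests. A minimal sketch (the image name is hypothetical, and the `nvidia.com/gpu` resource assumes the NVIDIA device plugin is installed on the node):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-job
spec:
  containers:
    - name: trainer
      image: my-trainer:1.0      # hypothetical image
      resources:
        requests:
          memory: "2Gi"
          nvidia.com/gpu: 1      # extended resource from the NVIDIA device plugin
        limits:
          nvidia.com/gpu: 1      # GPUs must be set in limits as well
```

The scheduler filters out nodes that can't satisfy these requests, then scores the remaining candidates to pick one.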
Controller Manager: Runs control loops that watch cluster state and make corrections. If a deployment says "3 replicas" but only 2 are running, the ReplicaSet controller creates a new one.
Worker node components#
kubelet: The agent on each node. Talks to the API server, pulls container images, starts/stops containers, reports node health. One kubelet per node.
kube-proxy: Manages network rules on each node. Handles service discovery and load balancing — when you hit a Service IP, kube-proxy routes the request to an actual pod.
Container runtime: Actually runs the containers. Usually containerd or CRI-O (Docker support via the dockershim was deprecated in 1.20 and removed in K8s 1.24 — images built with Docker still run fine on containerd).
Key abstractions#
Pods#
The smallest deployable unit. A pod is one or more containers that share network and storage. In practice, most pods run a single container.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-server
spec:
  containers:
    - name: app
      image: nginx:latest
      ports:
        - containerPort: 80
```
Deployments#
Manage pod replicas. You declare the desired state ("3 replicas of v2.1") and the deployment controller makes it happen — rolling updates, rollbacks, scaling.
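Declaratively, "3 replicas of v2.1" looks something like this (the image name and tag are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:                # pod template stamped out for each replica
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: app
          image: myapp:2.1    # hypothetical image/tag
          ports:
            - containerPort: 8080
```

Change `image:` to `myapp:2.2` and re-apply, and the Deployment controller performs a rolling update; `kubectl rollout undo deployment/web` reverts it.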
Services#
Stable network endpoint for a set of pods. Pods come and go (new IPs every time), but a Service provides a consistent DNS name and load balancing.
- ClusterIP: Internal-only (default)
- NodePort: Exposes on each node's IP
- LoadBalancer: Provisions a cloud load balancer
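A ClusterIP Service fronting the pods labeled `app: web` might look like this sketch (label and ports assume the Deployment example's conventions):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: ClusterIP        # default; shown for clarity
  selector:
    app: web             # routes to any pod carrying this label
  ports:
    - port: 80           # the Service's stable port
      targetPort: 8080   # the containerPort on the pods
```

Inside the cluster, other pods can now reach `http://web` (or `web.<namespace>.svc.cluster.local`) regardless of which pods are currently alive.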
Ingress#
HTTP routing rules. Maps external URLs to internal services:
api.example.com → api-service:8080
app.example.com → frontend-service:3000
example.com/docs → docs-service:4000
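The first two mappings above translate into an Ingress resource roughly like this (an ingress controller such as ingress-nginx must be installed for these rules to take effect):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: main
spec:
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-service
                port:
                  number: 8080
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: frontend-service
                port:
                  number: 3000
```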
When Kubernetes is the right choice#
Multiple services that scale independently. If your frontend scales to 10 replicas during peak hours while your backend stays at 3, Kubernetes handles this naturally.
You need zero-downtime deployments. Rolling updates, canary deployments, blue-green deployments — all built in.
Multi-cloud or hybrid cloud. Kubernetes provides a consistent API regardless of where it runs — AWS, GCP, Azure, on-premise, or a mix.
Your team has DevOps expertise. Kubernetes requires someone who understands networking, storage, RBAC, and can debug pod scheduling issues at 3 AM.
When Kubernetes is overkill#
Fewer than 5 services. Use a managed platform (Railway, Fly.io, Cloud Run) or even a single VM with Docker Compose.
Small team without DevOps. The learning curve is steep. If your team is 3 engineers, you'll spend more time managing K8s than building your product.
Serverless fits better. If your workloads are event-driven with unpredictable traffic, Lambda/Cloud Functions might be simpler and cheaper.
You're not running containers yet. Adopt Docker first. Kubernetes orchestrates containers — if you don't have containers, you don't need an orchestrator.
Common patterns#
Sidecar pattern#
Add a helper container alongside your main container in the same pod. Common for logging agents, service mesh proxies (Envoy), or TLS termination.
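A log-shipping sidecar is a representative sketch: both containers mount the same `emptyDir` volume, so the agent can read what the app writes (the Fluent Bit config is omitted for brevity):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-with-logger
spec:
  volumes:
    - name: logs
      emptyDir: {}           # shared scratch volume, lives as long as the pod
  containers:
    - name: app
      image: nginx:latest
      volumeMounts:
        - name: logs
          mountPath: /var/log/nginx
    - name: log-agent        # sidecar: ships the app's logs elsewhere
      image: fluent/fluent-bit:latest
      volumeMounts:
        - name: logs
          mountPath: /logs
          readOnly: true
```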
Init containers#
Run setup tasks before the main container starts — database migrations, config file generation, dependency checks.
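A common dependency-check sketch: block startup until a database Service is reachable (the service name, port, and app image are hypothetical):

```yaml
spec:
  initContainers:
    - name: wait-for-db
      image: busybox:1.36
      # pod's main containers won't start until this exits successfully
      command: ["sh", "-c", "until nc -z db-service 5432; do sleep 2; done"]
  containers:
    - name: app
      image: myapp:2.1   # hypothetical image
```

Init containers run to completion, in order, before any regular container starts.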
Horizontal Pod Autoscaler (HPA)#
Automatically scale pods based on CPU, memory, or custom metrics. "Scale to 10 pods when CPU exceeds 70%."
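The quoted policy, expressed with the `autoscaling/v2` API (targeting the Deployment from the earlier example):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when average CPU exceeds 70%
```

Note that CPU utilization is measured against the pods' resource *requests*, so HPA only works if those are set.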
Namespaces#
Logical isolation within a cluster. Use for environments (dev/staging/prod), teams, or tenants.
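A namespace is just another resource; pairing it with a ResourceQuota is a common way to cap what an environment or team can consume (the quota numbers here are arbitrary examples):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: staging
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: staging-quota
  namespace: staging
spec:
  hard:
    requests.cpu: "10"       # total CPU the namespace's pods may request
    requests.memory: 20Gi    # total memory the namespace's pods may request
```

Deploy into it with `kubectl apply -f app.yaml -n staging`; names only need to be unique within a namespace, so dev and staging can both run a Deployment called `web`.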
Managed vs. self-hosted#
| Option | Pros | Cons |
|---|---|---|
| EKS (AWS) | Deep AWS integration | Expensive, complex IAM |
| GKE (Google) | Best managed K8s, Autopilot mode | GCP lock-in |
| AKS (Azure) | Free control plane | Azure ecosystem required |
| Self-hosted | Full control | You maintain everything |
Recommendation: Unless you have specific compliance requirements for self-hosting, use a managed service. GKE Autopilot is the lowest-effort option.
Visualize your K8s architecture#
Understanding how your control plane, worker nodes, services, and ingress connect is critical before deploying. Try Codelit to generate an interactive diagram of your Kubernetes setup — see how pods, services, and external traffic flow together.
Key takeaways#
- Control plane = brain (API server, etcd, scheduler, controllers)
- Worker nodes = muscle (kubelet, kube-proxy, container runtime)
- Pods are the unit of deployment — usually one container per pod
- Services provide stable networking — pods are ephemeral, services aren't
- Use managed K8s unless you have a strong reason to self-host
- Don't adopt K8s for fewer than 5 services — simpler tools exist
- Learn Docker first — containers before orchestration