Kubernetes Custom Resource Definitions — Extending the API Server
What are custom resources#
Kubernetes has built-in resources: Pods, Services, Deployments, ConfigMaps. A Custom Resource Definition (CRD) lets you add your own resource types to the Kubernetes API server.
After you create a CRD for Database, you can run kubectl get databases and manage your custom objects exactly like built-in resources. The API server handles storage, RBAC, versioning, and watch events — you just define the schema.
Why CRDs exist#
Before CRDs, extending Kubernetes meant either:
- ConfigMaps with conventions — unstructured, no validation, no tooling support
- API aggregation — running a separate API server (complex, hard to maintain)
- ThirdPartyResources — the deprecated predecessor to CRDs
CRDs solved this by letting you register new resource types declaratively. No code needed just to define the resource — the API server does the rest.
Creating a CRD#
Here is a CRD that defines a Database resource:
    apiVersion: apiextensions.k8s.io/v1
    kind: CustomResourceDefinition
    metadata:
      name: databases.example.com
    spec:
      group: example.com
      versions:
        - name: v1
          served: true
          storage: true
          schema:
            openAPIV3Schema:
              type: object
              properties:
                spec:
                  type: object
                  required:
                    - engine
                    - version
                  properties:
                    engine:
                      type: string
                      enum:
                        - postgres
                        - mysql
                        - redis
                    version:
                      type: string
                    replicas:
                      type: integer
                      minimum: 1
                      maximum: 10
                      default: 3
                    storage:
                      type: string
                      pattern: "^[0-9]+(Gi|Ti)$"
                status:
                  type: object
                  properties:
                    ready:
                      type: boolean
                    replicas:
                      type: integer
                    endpoint:
                      type: string
          subresources:
            status: {}
          additionalPrinterColumns:
            - name: Engine
              type: string
              jsonPath: .spec.engine
            - name: Version
              type: string
              jsonPath: .spec.version
            - name: Ready
              type: boolean
              jsonPath: .status.ready
      scope: Namespaced
      names:
        plural: databases
        singular: database
        kind: Database
        shortNames:
          - db
Apply it with kubectl apply -f database-crd.yaml. Now you can create Database objects:
    apiVersion: example.com/v1
    kind: Database
    metadata:
      name: orders-db
      namespace: production
    spec:
      engine: postgres
      version: "16"
      replicas: 3
      storage: 100Gi
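Once the CRD is registered, the API server serves the new type under a group/version path, which is what kubectl get databases queries. The following is a minimal Python sketch of how that path is put together, using the group, version, and plural name from the manifests above. The function name resource_path is illustrative and not part of any Kubernetes client.

```python
from typing import Optional


def resource_path(group: str, version: str, plural: str,
                  namespace: Optional[str] = None,
                  name: Optional[str] = None) -> str:
    """Build the REST path under which a custom resource is served."""
    parts = ["/apis", group, version]
    if namespace is not None:
        # Namespaced resources live under /namespaces/<namespace>/
        parts += ["namespaces", namespace]
    parts.append(plural)
    if name is not None:
        parts.append(name)
    return "/".join(parts)


print(resource_path("example.com", "v1", "databases", "production", "orders-db"))
# /apis/example.com/v1/namespaces/production/databases/orders-db
```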
CRD schema validation#
The openAPIV3Schema field is where you define validation rules. Kubernetes rejects any resource that does not match the schema at admission time.
Key validation features:
| Feature | Example | Purpose |
|---|---|---|
| required | required: [engine, version] | Mandatory fields |
| enum | enum: [postgres, mysql, redis] | Allowed values |
| minimum/maximum | minimum: 1, maximum: 10 | Numeric bounds |
| pattern | `pattern: "^[0-9]+(Gi\|Ti)$"` | Regex constraint on strings |
| default | default: 3 | Default values |
| format | format: date-time | String formats |
| maxLength | maxLength: 63 | String length limits |
Structural schemas are required. Every CRD created through apiextensions.k8s.io/v1 (generally available since Kubernetes 1.16) must have a structural schema. In practice, every field must declare a type, and an object cannot combine properties with additionalProperties.
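To make the rules concrete, here is a rough Python model of what the Database schema above enforces on spec. It is a hand-written sketch of the same constraints, not how the API server implements OpenAPI validation, and the error strings are illustrative.

```python
import re

ENGINES = {"postgres", "mysql", "redis"}
STORAGE_PATTERN = re.compile(r"^[0-9]+(Gi|Ti)$")


def validate_spec(spec):
    """Return (spec_with_defaults, errors) for a Database spec."""
    result = dict(spec)
    errors = []

    # required: [engine, version]
    for field in ("engine", "version"):
        if field not in result:
            errors.append(f"spec.{field}: required")

    # enum on engine
    if "engine" in result and result["engine"] not in ENGINES:
        errors.append("spec.engine: unsupported value")

    # type: string on version (an unquoted YAML 16 would arrive as an int)
    if "version" in result and not isinstance(result["version"], str):
        errors.append("spec.version: must be a string")

    # default: 3, then minimum/maximum
    result.setdefault("replicas", 3)
    replicas = result["replicas"]
    if isinstance(replicas, bool) or not isinstance(replicas, int):
        errors.append("spec.replicas: must be an integer")
    elif not 1 <= replicas <= 10:
        errors.append("spec.replicas: must be between 1 and 10")

    # pattern on the optional storage field
    storage = result.get("storage")
    if storage is not None:
        if not (isinstance(storage, str) and STORAGE_PATTERN.search(storage)):
            errors.append("spec.storage: must match ^[0-9]+(Gi|Ti)$")

    return result, errors


spec, errors = validate_spec({"engine": "postgres", "version": "16"})
print(spec["replicas"], errors)  # 3 []
```

The string check on version is why the example manifest quotes "16": unquoted, YAML parses it as an integer and the object fails validation.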
The controller pattern#
A CRD without a controller is just data storage. The controller is what makes custom resources do things.
The reconciliation loop:
1. Watch for changes to Database resources
2. Compare desired state (spec) with actual state
3. Take action to converge actual state toward desired state
4. Update the status subresource with current state
5. Repeat
Controller pseudocode:
    func reconcile(database Database):
        // Check if the actual database exists
        actual = checkCloudProvider(database.spec)

        if actual == nil:
            // Create the database
            createDatabase(database.spec)
            updateStatus(database, {ready: false, replicas: 0})
            return requeue(after=30s)

        if actual.replicas != database.spec.replicas:
            // Scale the database
            scaleDatabase(actual, database.spec.replicas)
            updateStatus(database, {ready: true, replicas: actual.replicas})
            return requeue(after=30s)

        // Everything matches desired state
        updateStatus(database, {ready: true, replicas: actual.replicas, endpoint: actual.endpoint})
        return done()
Key principles:
- Level-triggered, not edge-triggered — reconcile based on current state, not events. If the controller crashes and restarts, it re-reads current state and converges.
- Idempotent — running reconcile twice with the same input produces the same result.
- Status reflects reality — the status subresource reports what actually exists, not what was requested.
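These principles can be tried out in a small runnable Python simulation. It is a sketch under heavy assumptions: FakeCloud is an in-memory stand-in for a provider API, Database holds only a replica count, and a boolean return value replaces the requeue mechanism of a real controller.

```python
from dataclasses import dataclass, field


@dataclass
class FakeCloud:
    """In-memory stand-in for a cloud provider (hypothetical)."""
    databases: dict = field(default_factory=dict)  # name -> replica count


@dataclass
class Database:
    name: str
    spec_replicas: int
    status: dict = field(default_factory=dict)


def reconcile(db, cloud):
    """One level-triggered pass. Returns True if another pass is needed."""
    actual = cloud.databases.get(db.name)
    if actual is None:
        cloud.databases[db.name] = 1  # create with a single replica
        db.status = {"ready": False, "replicas": 0}
        return True
    if actual != db.spec_replicas:
        cloud.databases[db.name] = db.spec_replicas  # scale to the desired count
        db.status = {"ready": True, "replicas": actual}
        return True
    # Actual state matches desired state: just report it.
    db.status = {"ready": True, "replicas": actual}
    return False


def run_until_settled(db, cloud, max_passes=10):
    """Call reconcile until it reports nothing left to do."""
    passes = 0
    while passes < max_passes:
        passes += 1
        if not reconcile(db, cloud):
            break
    return passes


cloud = FakeCloud()
db = Database("orders-db", spec_replicas=3)
print(run_until_settled(db, cloud), db.status)
# 3 {'ready': True, 'replicas': 3}
```

Because reconcile only looks at current state, it does not matter whether a pass was triggered by an event, a timer, or a restart. If the provider drifts (say the replica count drops to 2), the next pass notices and corrects it.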
The operator pattern#
An operator is a controller that encodes domain knowledge. It does not just create resources — it handles the full lifecycle: upgrades, backups, failover, scaling, monitoring.
What operators manage:
- Day 1: Initial deployment and configuration
- Day 2: Upgrades, backups, scaling, recovery
- Day N: Decommissioning, data migration
Real-world operators:
| Operator | Manages | CRDs |
|---|---|---|
| CloudNativePG | PostgreSQL clusters | Cluster, Backup, ScheduledBackup |
| Strimzi | Apache Kafka | Kafka, KafkaTopic, KafkaUser |
| Cert-Manager | TLS certificates | Certificate, Issuer, ClusterIssuer |
| Prometheus Operator | Monitoring | Prometheus, ServiceMonitor, AlertmanagerConfig |
The status subresource#
The status subresource separates user intent (spec) from system state (status).
Why this matters:
- Users update spec: "I want 3 replicas"
- Controllers update status: "There are currently 2 replicas, scaling in progress"
- These are separate API endpoints with separate RBAC
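Enable it in the CRD with subresources: { status: {} }. A controller writes status through the /status endpoint with its API client. The equivalent manual call, which needs kubectl 1.24 or later for the --subresource flag, is:
    kubectl patch database orders-db -n production --type=merge --subresource=status \
      -p '{"status":{"ready":true,"replicas":3,"endpoint":"orders-db.prod.svc:5432"}}'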
Without the status subresource, a user editing the spec could accidentally overwrite the status, and vice versa.
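The split can be modelled in a few lines of Python. This is a simplified sketch of the documented behaviour once the status subresource is enabled: writes to the main endpoint ignore changes to status, and writes to /status ignore everything except status. The function names are illustrative, and metadata handling is left out.

```python
import copy


def update_main(stored, incoming):
    """Write to the main resource endpoint: status changes are ignored."""
    result = copy.deepcopy(incoming)
    result["status"] = copy.deepcopy(stored.get("status", {}))
    return result


def update_status(stored, incoming):
    """Write to the /status endpoint: only the status stanza is taken."""
    result = copy.deepcopy(stored)
    result["status"] = copy.deepcopy(incoming.get("status", {}))
    return result


stored = {"spec": {"replicas": 3}, "status": {"ready": True, "replicas": 3}}

# A user edits spec and (accidentally) status through the main endpoint.
stored = update_main(stored, {"spec": {"replicas": 5}, "status": {"ready": False}})
print(stored)
# {'spec': {'replicas': 5}, 'status': {'ready': True, 'replicas': 3}}

# The controller reports progress through /status; its stale spec is ignored.
stored = update_status(stored, {"spec": {"replicas": 1},
                                "status": {"ready": False, "replicas": 3}})
print(stored)
# {'spec': {'replicas': 5}, 'status': {'ready': False, 'replicas': 3}}
```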
Webhook validation#
Schema validation catches structural issues (wrong types, missing fields). Webhook validation catches semantic issues (business logic).
Two types of webhooks:
Validating webhooks#
Reject invalid resources. The API server sends the resource to your webhook, and the webhook returns allow or deny.
Examples:
- Reject if storage is being decreased (data loss risk)
- Reject if engine is changed (cannot migrate in place)
- Reject if total replicas across all databases exceed cluster capacity
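As an illustration, the first two rules could be checked by a handler like the Python sketch below. It takes an admission.k8s.io/v1 AdmissionReview request as a dict and returns the response body. The HTTP server and TLS setup a real webhook needs are omitted, and the helper parse_gi and the denial messages are assumptions made for this example.

```python
def parse_gi(quantity):
    """Convert '100Gi' or '2Ti' to GiB (only the suffixes the schema allows)."""
    value, unit = int(quantity[:-2]), quantity[-2:]
    return value * 1024 if unit == "Ti" else value


def review(admission_review):
    """Build the AdmissionReview response for a Database CREATE or UPDATE."""
    request = admission_review["request"]
    new = request["object"]["spec"]
    # oldObject is null on CREATE, so there is nothing to compare against.
    old = (request.get("oldObject") or {}).get("spec")

    reason = None
    if old is not None:
        if new["engine"] != old["engine"]:
            reason = "spec.engine is immutable"
        elif ("storage" in old and "storage" in new
              and parse_gi(new["storage"]) < parse_gi(old["storage"])):
            reason = "spec.storage cannot be decreased"

    response = {"uid": request["uid"], "allowed": reason is None}
    if reason is not None:
        response["status"] = {"code": 403, "message": reason}
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }


shrink = {"request": {
    "uid": "abc-123",
    "object": {"spec": {"engine": "postgres", "storage": "50Gi"}},
    "oldObject": {"spec": {"engine": "postgres", "storage": "100Gi"}},
}}
print(review(shrink)["response"]["allowed"])  # False
```

The third rule (cluster-wide replica totals) needs to read other objects, so it would require an API client and is not shown.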
Mutating webhooks#
Modify resources before they are stored. The API server sends the resource, and the webhook returns a JSON patch.
Examples:
- Inject default labels and annotations
- Set resource requests/limits based on the database engine
- Add sidecar containers for monitoring
Execution order: Mutating webhooks run first, then validating webhooks. This ensures validation sees the final, mutated version of the resource.
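A mutating webhook's response carries the patch as a base64-encoded JSON Patch. The Python sketch below shows one way to build such a response for the "inject default labels" case. The label key example.com/engine is a made-up example, and the HTTP serving layer is again left out.

```python
import base64
import json


def mutate(admission_review):
    """Return an AdmissionReview response that adds an engine label."""
    request = admission_review["request"]
    obj = request["object"]
    engine = obj["spec"]["engine"]

    if obj["metadata"].get("labels") is None:
        # No labels map yet: create it together with the label.
        patch = [{"op": "add", "path": "/metadata/labels",
                  "value": {"example.com/engine": engine}}]
    else:
        # "/" inside a key is escaped as "~1" in a JSON Pointer (RFC 6901).
        patch = [{"op": "add", "path": "/metadata/labels/example.com~1engine",
                  "value": engine}]

    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": {
            "uid": request["uid"],
            "allowed": True,
            "patchType": "JSONPatch",
            "patch": base64.b64encode(json.dumps(patch).encode()).decode(),
        },
    }


incoming = {"request": {
    "uid": "abc-123",
    "object": {"metadata": {"name": "orders-db"}, "spec": {"engine": "postgres"}},
}}
response = mutate(incoming)["response"]
print(json.loads(base64.b64decode(response["patch"])))
# [{'op': 'add', 'path': '/metadata/labels', 'value': {'example.com/engine': 'postgres'}}]
```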
Versioning CRDs#
As your CRD evolves, you need to support multiple API versions simultaneously.
Version strategy:
    versions:
      - name: v1alpha1
        served: true
        storage: false
      - name: v1beta1
        served: true
        storage: false
      - name: v1
        served: true
        storage: true
Rules:
- Only one version can be storage: true (the version stored in etcd)
- Multiple versions can be served: true (accepted by the API server)
- Set old versions to served: false when you want to stop accepting them
Conversion webhooks translate between versions. When a client requests v1alpha1 but the resource is stored as v1, the API server calls your conversion webhook to translate.
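The core of a conversion webhook is a function from a ConversionReview request to a response listing the converted objects. The Python sketch below assumes, purely for illustration, that v1alpha1 named the storage field size and v1 renamed it to storage. That rename is hypothetical and not something defined earlier in this article, and only the v1alpha1 and v1 pair is handled.

```python
import copy

# Hypothetical schema difference: v1alpha1 "size" became v1 "storage".
RENAMES = {
    ("example.com/v1alpha1", "example.com/v1"): ("size", "storage"),
    ("example.com/v1", "example.com/v1alpha1"): ("storage", "size"),
}


def convert_object(obj, desired):
    """Convert one object to the desired apiVersion without mutating it."""
    converted = copy.deepcopy(obj)
    source = obj["apiVersion"]
    if source != desired:
        old_key, new_key = RENAMES[(source, desired)]
        spec = converted.setdefault("spec", {})
        if old_key in spec:
            spec[new_key] = spec.pop(old_key)
        converted["apiVersion"] = desired
    return converted


def convert(review):
    """Build the ConversionReview response for a request."""
    request = review["request"]
    objects = [convert_object(o, request["desiredAPIVersion"])
               for o in request["objects"]]
    return {
        "apiVersion": "apiextensions.k8s.io/v1",
        "kind": "ConversionReview",
        "response": {
            "uid": request["uid"],
            "result": {"status": "Success"},
            "convertedObjects": objects,
        },
    }


review = {"request": {
    "uid": "abc-123",
    "desiredAPIVersion": "example.com/v1alpha1",
    "objects": [{"apiVersion": "example.com/v1", "kind": "Database",
                 "metadata": {"name": "orders-db"},
                 "spec": {"engine": "postgres", "storage": "100Gi"}}],
}}
print(convert(review)["response"]["convertedObjects"][0]["spec"])
# {'engine': 'postgres', 'size': '100Gi'}
```

Conversions should round-trip: converting an object to another version and back should yield the original, otherwise data is lost whenever the stored version differs from the requested one.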
Best practices for CRD versioning:
- Never remove fields — deprecate them and stop using them in the controller
- Add new fields as optional with defaults so existing resources remain valid
- Use conversion webhooks to translate between versions, not data migration scripts
- Promote through alpha, beta, stable — each promotion is a signal of API stability
Building controllers with frameworks#
Writing controllers from scratch means handling: API watches, work queues, caching, leader election, metrics, and error handling. Use a framework instead.
Kubebuilder (Go):
A Kubernetes SIG project (kubernetes-sigs/kubebuilder) built on the controller-runtime library. Generates boilerplate, test scaffolding, and RBAC manifests.
    kubebuilder init --domain example.com
    kubebuilder create api --group infra --version v1 --kind Database
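This scaffolds the Go API types, a controller skeleton, and a Dockerfile; the CRD manifest is generated from the Go types. You fill in the reconcile function. With these flags the resulting API group is infra.example.com, not the example.com group used in the earlier manifests.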
Operator SDK:
Built on Kubebuilder. Adds support for Helm-based and Ansible-based operators in addition to Go.
Metacontroller:
Declarative controller framework. You write the reconciliation logic as a webhook (any language), and Metacontroller handles the Kubernetes API interaction.
Production checklist#
Before deploying a CRD to production:
- Schema validation: comprehensive openAPIV3Schema with required fields, enums, patterns
- Status subresource: separate spec from status
- Printer columns: make kubectl get output useful
- Webhook validation: catch business logic violations at admission time
- RBAC: restrict who can create, update, delete your custom resources
- Finalizers: clean up external resources when custom resources are deleted
- Metrics: expose reconciliation latency, error rate, queue depth
- Leader election: prevent multiple controller instances from conflicting
- Conversion webhooks: if you have or plan multiple API versions
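Of these items, finalizers are the one most often implemented incorrectly, so here is a minimal Python sketch of the usual logic. It assumes a hypothetical finalizer name and a delete_external callback standing in for the provider call. A real controller would write the returned list back to metadata.finalizers through the API.

```python
FINALIZER = "example.com/database-cleanup"  # hypothetical finalizer name


def handle_finalizer(obj, delete_external):
    """Return the metadata.finalizers list the controller should write back."""
    meta = obj["metadata"]
    finalizers = list(meta.get("finalizers", []))

    if meta.get("deletionTimestamp") is None:
        # Live object: make sure our finalizer is registered before we
        # create anything external, so deletion cannot skip the cleanup.
        if FINALIZER not in finalizers:
            finalizers.append(FINALIZER)
        return finalizers

    # Deletion requested: clean up once, then release the object.
    if FINALIZER in finalizers:
        delete_external(meta["name"])
        finalizers.remove(FINALIZER)
    return finalizers


deleted = []
being_deleted = {"metadata": {"name": "orders-db",
                              "deletionTimestamp": "2024-01-01T00:00:00Z",
                              "finalizers": [FINALIZER]}}
print(handle_finalizer(being_deleted, deleted.append), deleted)
# [] ['orders-db']
```

The API server only removes the object from storage once the finalizers list is empty, which is what gives the controller time to delete the external database first.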
Summary#
- CRDs extend the Kubernetes API with your own resource types
- Schema validation catches structural errors at admission time
- Controllers reconcile desired state with actual state in a continuous loop
- Operators encode domain knowledge for full lifecycle management
- Status subresource separates user intent from system state
- Webhook validation enforces business logic beyond what schemas can express
- Versioning uses alpha/beta/stable progression with conversion webhooks
Article #457 in the Codelit engineering series. Explore our full library of system design, infrastructure, and architecture guides at codelit.io.