# Kubernetes Resource Quotas — Namespace Limits, LimitRange, and Quota Scopes Explained
## Why resource quotas exist
Without quotas, a single team can consume all cluster resources. One runaway deployment requesting 64 GB of memory starves every other namespace. One developer creating 500 ConfigMaps hits etcd storage limits. One batch job spinning up 200 pods exhausts the node pool.
Resource quotas are Kubernetes' mechanism for fair sharing. They set hard limits on what each namespace can consume.
## ResourceQuota vs LimitRange
These two resources solve different problems:
| | ResourceQuota | LimitRange |
|---|---|---|
| Scope | Entire namespace | Individual pod/container |
| What it limits | Total CPU, memory, storage, object count | Per-container min/max/default |
| Enforcement | Rejects creation when quota exceeded | Sets defaults, rejects outliers |
| When you need it | Multi-tenant clusters | Preventing a single pod from being too greedy |
Use both together. ResourceQuota caps the namespace total. LimitRange ensures individual pods are reasonable.
## ResourceQuota: namespace-level limits

### Compute quotas
Limit the total CPU and memory a namespace can request and consume:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: team-backend
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 40Gi
    limits.cpu: "40"
    limits.memory: 80Gi
```
Important distinction:

- `requests.cpu` — the sum of all CPU requests in the namespace
- `limits.cpu` — the sum of all CPU limits in the namespace
If a namespace has a compute quota, every pod must specify requests and limits. Pods without them are rejected. This is where LimitRange defaults become essential.
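For example, a pod that fits under this quota declares both requests and limits explicitly. The pod name and image below are illustrative:

```yaml
# Counts against the namespace quota: requests toward requests.cpu and
# requests.memory, limits toward limits.cpu and limits.memory.
apiVersion: v1
kind: Pod
metadata:
  name: api-server                # illustrative name
  namespace: team-backend
spec:
  containers:
    - name: app
      image: registry.example.com/api:1.0   # placeholder image
      resources:
        requests:
          cpu: 250m
          memory: 256Mi
        limits:
          cpu: "1"
          memory: 1Gi
```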
### Storage quotas
Limit persistent volume claims:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota
  namespace: team-backend
spec:
  hard:
    requests.storage: 500Gi
    persistentvolumeclaims: "10"
    fast-storage.storageclass.storage.k8s.io/requests.storage: 100Gi
    fast-storage.storageclass.storage.k8s.io/persistentvolumeclaims: "3"
```
This limits:

- Total storage across all PVCs to 500 Gi
- Maximum 10 PVCs in the namespace
- Maximum 100 Gi on the `fast-storage` StorageClass specifically
- Maximum 3 PVCs using `fast-storage`
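A PVC that draws against both the namespace-wide and the per-class limits might look like this (the claim name is illustrative):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-data                    # illustrative name
  namespace: team-backend
spec:
  storageClassName: fast-storage   # counted against the per-class quota
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi                # also counts toward requests.storage
```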
### Object count quotas
Limit how many Kubernetes objects a namespace can create:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: object-quota
  namespace: team-backend
spec:
  hard:
    pods: "50"
    services: "10"
    services.loadbalancers: "2"
    services.nodeports: "5"
    secrets: "20"
    configmaps: "20"
    replicationcontrollers: "10"
    resourcequotas: "1"
```
Why limit object counts?
- LoadBalancers provision cloud resources (each one costs money)
- NodePorts consume ports from a limited range (30000-32767)
- Pods consume scheduler and kubelet resources even when idle
- ConfigMaps/Secrets are stored in etcd, which has storage limits
## LimitRange: per-pod guardrails
LimitRange sets defaults and boundaries for individual containers:
```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: container-limits
  namespace: team-backend
spec:
  limits:
    - type: Container
      default:
        cpu: 500m
        memory: 512Mi
      defaultRequest:
        cpu: 100m
        memory: 128Mi
      max:
        cpu: "4"
        memory: 8Gi
      min:
        cpu: 50m
        memory: 64Mi
    - type: Pod
      max:
        cpu: "8"
        memory: 16Gi
    - type: PersistentVolumeClaim
      max:
        storage: 50Gi
      min:
        storage: 1Gi
```
### What each field does

- `default` — applied as the `limits` for containers that don't specify them
- `defaultRequest` — applied as the `requests` for containers that don't specify them
- `max` — no container can request or be limited above this
- `min` — no container can request below this
### Why LimitRange matters with ResourceQuota
When a ResourceQuota exists for compute resources, every pod must have requests and limits set. Without LimitRange defaults, developers must manually specify these on every container — or their pods get rejected.
LimitRange solves this by automatically injecting defaults.
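For instance, a container submitted with no `resources` block is persisted with the LimitRange defaults injected at admission time. Under the `container-limits` LimitRange above, the stored spec would be roughly:

```yaml
# Submitted with only name and image; the admission controller fills in
# resources from the LimitRange's defaultRequest and default fields.
spec:
  containers:
    - name: app
      image: nginx:1.27
      resources:
        requests:        # from defaultRequest
          cpu: 100m
          memory: 128Mi
        limits:          # from default
          cpu: 500m
          memory: 512Mi
```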
## Priority class quotas
In clusters with priority-based preemption, you can set quotas per priority class:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: high-priority-quota
  namespace: team-backend
spec:
  hard:
    pods: "10"
    requests.cpu: "10"
    requests.memory: 20Gi
  scopeSelector:
    matchExpressions:
      - scopeName: PriorityClass
        operator: In
        values:
          - high-priority
```
This ensures that a namespace can only run 10 high-priority pods consuming up to 10 CPU and 20 Gi of memory. The same namespace might have a separate, larger quota for normal-priority workloads.
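A pod counts against this quota only when it references the priority class by name (pod name and image below are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: payment-worker             # illustrative name
  namespace: team-backend
spec:
  priorityClassName: high-priority # matched by the quota's scopeSelector
  containers:
    - name: worker
      image: registry.example.com/worker:1.0   # placeholder image
      resources:
        requests:
          cpu: "1"
          memory: 2Gi
        limits:
          cpu: "2"
          memory: 4Gi
```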
## Quota scopes
Scopes let you apply quotas to specific subsets of resources:
| Scope | What it matches |
|---|---|
| Terminating | Pods with `activeDeadlineSeconds` set |
| NotTerminating | Pods without `activeDeadlineSeconds` |
| BestEffort | Pods with no resource requests/limits |
| NotBestEffort | Pods with at least one request or limit |
| PriorityClass | Pods matching specific priority classes |
| CrossNamespacePodAffinity | Pods with cross-namespace affinity terms |
### Example: separate quotas for batch and long-running workloads
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: long-running-quota
  namespace: team-backend
spec:
  hard:
    pods: "20"
    requests.cpu: "10"
    requests.memory: 20Gi
  scopes:
    - NotTerminating
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: batch-quota
  namespace: team-backend
spec:
  hard:
    pods: "50"
    requests.cpu: "30"
    requests.memory: 60Gi
  scopes:
    - Terminating
```
Batch jobs (with activeDeadlineSeconds) get a larger allocation because they finish and release resources. Long-running services get a smaller, more predictable allocation.
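A batch Job lands in the Terminating-scoped quota by setting `activeDeadlineSeconds` on its pod template (Job name and image below are illustrative):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: nightly-report             # illustrative name
  namespace: team-backend
spec:
  template:
    spec:
      activeDeadlineSeconds: 3600  # makes the pod match the Terminating scope
      restartPolicy: Never
      containers:
        - name: report
          image: registry.example.com/report:1.0   # placeholder image
          resources:
            requests:
              cpu: "2"
              memory: 4Gi
            limits:
              cpu: "2"
              memory: 4Gi
```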
## Monitoring quota usage

### kubectl commands
```shell
# View quota usage for a namespace
kubectl describe resourcequota -n team-backend
# Output:
# Name:            compute-quota
# Namespace:       team-backend
# Resource         Used  Hard
# --------         ----  ----
# limits.cpu       12    40
# limits.memory    24Gi  80Gi
# requests.cpu     6     20
# requests.memory  12Gi  40Gi

# View all quotas across all namespaces
kubectl get resourcequota --all-namespaces
```
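For scripting, jsonpath gives machine-readable values; note that dots inside resource names must be escaped with a backslash:

```shell
# Print used vs hard CPU requests for one quota, e.g. "6/20"
kubectl get resourcequota compute-quota -n team-backend \
  -o jsonpath='{.status.used.requests\.cpu}/{.status.hard.requests\.cpu}'
```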
### Prometheus metrics
The kube-state-metrics exporter exposes quota data:
```promql
# Current usage
kube_resourcequota{namespace="team-backend", resource="requests.cpu", type="used"}

# Hard limit
kube_resourcequota{namespace="team-backend", resource="requests.cpu", type="hard"}
```
### Useful alerts
The two series differ in their `type` label, so the division needs `ignoring(type)` to match them:

```yaml
# Alert when quota usage exceeds 80%
- alert: NamespaceQuotaNearLimit
  expr: |
    kube_resourcequota{type="used"}
      / ignoring(type)
    kube_resourcequota{type="hard"}
      > 0.8
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Namespace {{ $labels.namespace }} quota for {{ $labels.resource }} is above 80%"
```
### Grafana dashboard essentials
Build a dashboard showing:
- Quota utilization percentage per namespace per resource
- Trend lines — are namespaces growing toward their limits?
- Rejected requests — how often are pods being rejected due to quota?
- Top consumers — which deployments use the most resources within each namespace?
## Common pitfalls

### 1. Forgetting LimitRange when adding ResourceQuota
Adding a compute ResourceQuota without LimitRange defaults breaks all deployments that don't specify requests/limits. Pods are rejected immediately.
Fix: Always create a LimitRange alongside a ResourceQuota.
### 2. Quotas that are too tight
Teams hit quota limits during legitimate scaling events (traffic spikes, deployments). They file urgent tickets to increase quotas.
Fix: Set quotas at 2-3x normal usage to accommodate bursts. Monitor and adjust quarterly.
### 3. Not accounting for system overhead
DaemonSets, monitoring agents, and sidecar containers (Istio, Linkerd) consume resources in every namespace.
Fix: Account for system pods when setting quotas. If Istio injects a 128 Mi sidecar into every pod, and your quota allows 50 pods, that's 6.25 Gi of memory just for sidecars.
### 4. Ignoring object count quotas
Teams create hundreds of ConfigMaps or Secrets during CI/CD runs without cleanup.
Fix: Set object count quotas and implement garbage collection for stale resources.
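One possible cleanup sketch, assuming your CI pipeline labels the objects it creates (the `created-by=ci` label is an assumption, and `date -d` is GNU date):

```shell
# Delete ConfigMaps labeled by CI that are older than 7 days.
kubectl get configmaps -n team-backend -l created-by=ci \
  -o jsonpath='{range .items[*]}{.metadata.name} {.metadata.creationTimestamp}{"\n"}{end}' \
| while read -r name created; do
    # Compare creation time against the cutoff (GNU date required)
    if [ "$(date -d "$created" +%s)" -lt "$(date -d '7 days ago' +%s)" ]; then
      kubectl delete configmap "$name" -n team-backend
    fi
  done
```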
## A practical multi-tenant setup
```yaml
# Namespace
apiVersion: v1
kind: Namespace
metadata:
  name: team-backend
  labels:
    team: backend
---
# Resource quota
apiVersion: v1
kind: ResourceQuota
metadata:
  name: default-quota
  namespace: team-backend
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 40Gi
    limits.cpu: "40"
    limits.memory: 80Gi
    pods: "100"
    services: "20"
    services.loadbalancers: "3"
    persistentvolumeclaims: "20"
    requests.storage: 200Gi
---
# LimitRange defaults
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: team-backend
spec:
  limits:
    - type: Container
      default:
        cpu: 500m
        memory: 512Mi
      defaultRequest:
        cpu: 100m
        memory: 128Mi
      max:
        cpu: "4"
        memory: 8Gi
```
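Assuming the three manifests above are saved together as `team-backend.yaml`, applying and verifying the setup looks like:

```shell
# Create the namespace, quota, and defaults in one pass
kubectl apply -f team-backend.yaml

# Confirm the quota and the injected defaults are active
kubectl describe resourcequota default-quota -n team-backend
kubectl describe limitrange default-limits -n team-backend
```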
## The practical takeaway
Resource quotas are how you prevent the "noisy neighbor" problem in multi-tenant Kubernetes clusters. The implementation order:
1. Start with LimitRange — set sensible defaults so pods without explicit requests/limits still work
2. Add ResourceQuota — cap total namespace consumption for compute, storage, and object counts
3. Use scopes — separate quotas for batch vs long-running workloads
4. Monitor — track utilization with Prometheus and alert at the 80% threshold
5. Review quarterly — adjust quotas as team needs change
Article #451 in the Codelit engineering series. Explore our full library of system design, infrastructure, and architecture guides at codelit.io.