# Kubernetes Resource Quotas — Namespace Limits, LimitRange, and Quota Scopes Explained
## Why resource quotas exist
Without quotas, a single team can consume all cluster resources. One runaway deployment requesting 64 GB of memory starves every other namespace. One developer creating 500 ConfigMaps hits etcd storage limits. One batch job spinning up 200 pods exhausts the node pool.
Resource quotas are Kubernetes' mechanism for fair sharing. They set hard limits on what each namespace can consume.
## ResourceQuota vs LimitRange
These two resources solve different problems:
| | ResourceQuota | LimitRange |
|---|---|---|
| Scope | Entire namespace | Individual pod/container |
| What it limits | Total CPU, memory, storage, object count | Per-container min/max/default |
| Enforcement | Rejects creation when quota exceeded | Sets defaults, rejects outliers |
| When you need it | Multi-tenant clusters | Preventing a single pod from being too greedy |
Use both together. ResourceQuota caps the namespace total. LimitRange ensures individual pods are reasonable.
## ResourceQuota: namespace-level limits

### Compute quotas
Limit the total CPU and memory a namespace can request and consume:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: team-backend
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 40Gi
    limits.cpu: "40"
    limits.memory: 80Gi
```
Important distinction:

- `requests.cpu` — the sum of all CPU requests in the namespace
- `limits.cpu` — the sum of all CPU limits in the namespace
If a namespace has a compute quota, every pod must specify requests and limits. Pods without them are rejected. This is where LimitRange defaults become essential.
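For example, a pod that fits under this quota declares both requests and limits explicitly. The pod name and image below are illustrative:

```yaml
# Counts against the namespace quota: requests toward requests.cpu and
# requests.memory, limits toward limits.cpu and limits.memory.
apiVersion: v1
kind: Pod
metadata:
  name: api-server                # illustrative name
  namespace: team-backend
spec:
  containers:
    - name: app
      image: registry.example.com/api:1.0   # placeholder image
      resources:
        requests:
          cpu: 250m
          memory: 256Mi
        limits:
          cpu: "1"
          memory: 1Gi
```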
### Storage quotas
Limit persistent volume claims:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota
  namespace: team-backend
spec:
  hard:
    requests.storage: 500Gi
    persistentvolumeclaims: "10"
    fast-storage.storageclass.storage.k8s.io/requests.storage: 100Gi
    fast-storage.storageclass.storage.k8s.io/persistentvolumeclaims: "3"
```
This limits:

- Total storage across all PVCs to 500 Gi
- Maximum 10 PVCs in the namespace
- Maximum 100 Gi on the `fast-storage` StorageClass specifically
- Maximum 3 PVCs using `fast-storage`
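A PVC that draws against both the namespace-wide and the per-class limits might look like this (the claim name is illustrative):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-data                    # illustrative name
  namespace: team-backend
spec:
  storageClassName: fast-storage   # counted against the per-class quota
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi                # also counts toward requests.storage
```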
### Object count quotas
Limit how many Kubernetes objects a namespace can create:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: object-quota
  namespace: team-backend
spec:
  hard:
    pods: "50"
    services: "10"
    services.loadbalancers: "2"
    services.nodeports: "5"
    secrets: "20"
    configmaps: "20"
    replicationcontrollers: "10"
    resourcequotas: "1"
```
Why limit object counts?
- LoadBalancers provision cloud resources (each one costs money)
- NodePorts consume ports from a limited range (30000-32767)
- Pods consume scheduler and kubelet resources even when idle
- ConfigMaps/Secrets are stored in etcd, which has storage limits
## LimitRange: per-pod guardrails
LimitRange sets defaults and boundaries for individual containers:
```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: container-limits
  namespace: team-backend
spec:
  limits:
    - type: Container
      default:
        cpu: 500m
        memory: 512Mi
      defaultRequest:
        cpu: 100m
        memory: 128Mi
      max:
        cpu: "4"
        memory: 8Gi
      min:
        cpu: 50m
        memory: 64Mi
    - type: Pod
      max:
        cpu: "8"
        memory: 16Gi
    - type: PersistentVolumeClaim
      max:
        storage: 50Gi
      min:
        storage: 1Gi
```
### What each field does

- `default` — applied as the `limits` for containers that don't specify them
- `defaultRequest` — applied as the `requests` for containers that don't specify them
- `max` — no container can request or be limited above this
- `min` — no container can request below this
### Why LimitRange matters with ResourceQuota
When a ResourceQuota exists for compute resources, every pod must have requests and limits set. Without LimitRange defaults, developers must manually specify these on every container — or their pods get rejected.
LimitRange solves this by automatically injecting defaults.
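For instance, a container submitted with no `resources` block is persisted with the LimitRange defaults injected at admission time. Under the `container-limits` LimitRange above, the stored spec would be roughly:

```yaml
# Submitted with only name and image; the admission controller fills in
# resources from the LimitRange's defaultRequest and default fields.
spec:
  containers:
    - name: app
      image: nginx:1.27
      resources:
        requests:        # from defaultRequest
          cpu: 100m
          memory: 128Mi
        limits:          # from default
          cpu: 500m
          memory: 512Mi
```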
## Priority class quotas
In clusters with priority-based preemption, you can set quotas per priority class:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: high-priority-quota
  namespace: team-backend
spec:
  hard:
    pods: "10"
    requests.cpu: "10"
    requests.memory: 20Gi
  scopeSelector:
    matchExpressions:
      - scopeName: PriorityClass
        operator: In
        values:
          - high-priority
```
This ensures that a namespace can only run 10 high-priority pods consuming up to 10 CPU and 20 Gi of memory. The same namespace might have a separate, larger quota for normal-priority workloads.
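A pod counts against this quota only when it references the priority class by name (pod name and image below are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: payment-worker             # illustrative name
  namespace: team-backend
spec:
  priorityClassName: high-priority # matched by the quota's scopeSelector
  containers:
    - name: worker
      image: registry.example.com/worker:1.0   # placeholder image
      resources:
        requests:
          cpu: "1"
          memory: 2Gi
        limits:
          cpu: "2"
          memory: 4Gi
```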
## Quota scopes
Scopes let you apply quotas to specific subsets of resources:
| Scope | What it matches |
|---|---|
| Terminating | Pods with `activeDeadlineSeconds` set |
| NotTerminating | Pods without `activeDeadlineSeconds` |
| BestEffort | Pods with no resource requests/limits |
| NotBestEffort | Pods with at least one request or limit |
| PriorityClass | Pods matching specific priority classes |
| CrossNamespacePodAffinity | Pods with cross-namespace affinity terms |
### Example: separate quotas for batch and long-running workloads
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: long-running-quota
  namespace: team-backend
spec:
  hard:
    pods: "20"
    requests.cpu: "10"
    requests.memory: 20Gi
  scopes:
    - NotTerminating
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: batch-quota
  namespace: team-backend
spec:
  hard:
    pods: "50"
    requests.cpu: "30"
    requests.memory: 60Gi
  scopes:
    - Terminating
```
Batch jobs (with activeDeadlineSeconds) get a larger allocation because they finish and release resources. Long-running services get a smaller, more predictable allocation.
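A batch Job lands in the Terminating-scoped quota by setting `activeDeadlineSeconds` on its pod template (Job name and image below are illustrative):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: nightly-report             # illustrative name
  namespace: team-backend
spec:
  template:
    spec:
      activeDeadlineSeconds: 3600  # makes the pod match the Terminating scope
      restartPolicy: Never
      containers:
        - name: report
          image: registry.example.com/report:1.0   # placeholder image
          resources:
            requests:
              cpu: "2"
              memory: 4Gi
            limits:
              cpu: "2"
              memory: 4Gi
```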
## Monitoring quota usage

### kubectl commands
```shell
# View quota usage for a namespace
kubectl describe resourcequota -n team-backend
# Output:
# Name:            compute-quota
# Namespace:       team-backend
# Resource         Used  Hard
# --------         ----  ----
# limits.cpu       12    40
# limits.memory    24Gi  80Gi
# requests.cpu     6     20
# requests.memory  12Gi  40Gi

# View all quotas across all namespaces
kubectl get resourcequota --all-namespaces
```
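For scripting, jsonpath gives machine-readable values; note that dots inside resource names must be escaped with a backslash:

```shell
# Print used vs hard CPU requests for one quota, e.g. "6/20"
kubectl get resourcequota compute-quota -n team-backend \
  -o jsonpath='{.status.used.requests\.cpu}/{.status.hard.requests\.cpu}'
```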
### Prometheus metrics
The kube-state-metrics exporter exposes quota data:
```promql
# Current usage
kube_resourcequota{namespace="team-backend", resource="requests.cpu", type="used"}

# Hard limit
kube_resourcequota{namespace="team-backend", resource="requests.cpu", type="hard"}
```
### Useful alerts
The two series differ in their `type` label, so the division needs `ignoring(type)` to match them:

```yaml
# Alert when quota usage exceeds 80%
- alert: NamespaceQuotaNearLimit
  expr: |
    kube_resourcequota{type="used"}
      / ignoring(type)
    kube_resourcequota{type="hard"}
      > 0.8
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Namespace {{ $labels.namespace }} quota for {{ $labels.resource }} is above 80%"
```
### Grafana dashboard essentials
Build a dashboard showing:
- Quota utilization percentage per namespace per resource
- Trend lines — are namespaces growing toward their limits?
- Rejected requests — how often are pods being rejected due to quota?
- Top consumers — which deployments use the most resources within each namespace?
## Common pitfalls

### 1. Forgetting LimitRange when adding ResourceQuota
Adding a compute ResourceQuota without LimitRange defaults breaks all deployments that don't specify requests/limits. Pods are rejected immediately.
Fix: Always create a LimitRange alongside a ResourceQuota.
### 2. Quotas that are too tight
Teams hit quota limits during legitimate scaling events (traffic spikes, deployments). They file urgent tickets to increase quotas.
Fix: Set quotas at 2-3x normal usage to accommodate bursts. Monitor and adjust quarterly.
### 3. Not accounting for system overhead
DaemonSets, monitoring agents, and sidecar containers (Istio, Linkerd) consume resources in every namespace.
Fix: Account for system pods when setting quotas. If Istio injects a 128 Mi sidecar into every pod, and your quota allows 50 pods, that's 6.25 Gi of memory just for sidecars.
### 4. Ignoring object count quotas
Teams create hundreds of ConfigMaps or Secrets during CI/CD runs without cleanup.
Fix: Set object count quotas and implement garbage collection for stale resources.
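One possible cleanup sketch, assuming your CI pipeline labels the objects it creates (the `created-by=ci` label is an assumption, and `date -d` is GNU date):

```shell
# Delete ConfigMaps labeled by CI that are older than 7 days.
kubectl get configmaps -n team-backend -l created-by=ci \
  -o jsonpath='{range .items[*]}{.metadata.name} {.metadata.creationTimestamp}{"\n"}{end}' \
| while read -r name created; do
    # Compare creation time against the cutoff (GNU date required)
    if [ "$(date -d "$created" +%s)" -lt "$(date -d '7 days ago' +%s)" ]; then
      kubectl delete configmap "$name" -n team-backend
    fi
  done
```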
## A practical multi-tenant setup
```yaml
# Namespace
apiVersion: v1
kind: Namespace
metadata:
  name: team-backend
  labels:
    team: backend
---
# Resource quota
apiVersion: v1
kind: ResourceQuota
metadata:
  name: default-quota
  namespace: team-backend
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 40Gi
    limits.cpu: "40"
    limits.memory: 80Gi
    pods: "100"
    services: "20"
    services.loadbalancers: "3"
    persistentvolumeclaims: "20"
    requests.storage: 200Gi
---
# LimitRange defaults
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: team-backend
spec:
  limits:
    - type: Container
      default:
        cpu: 500m
        memory: 512Mi
      defaultRequest:
        cpu: 100m
        memory: 128Mi
      max:
        cpu: "4"
        memory: 8Gi
```
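Assuming the three manifests above are saved together as `team-backend.yaml`, applying and verifying the setup looks like:

```shell
# Create the namespace, quota, and defaults in one pass
kubectl apply -f team-backend.yaml

# Confirm the quota and the injected defaults are active
kubectl describe resourcequota default-quota -n team-backend
kubectl describe limitrange default-limits -n team-backend
```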
## The practical takeaway
Resource quotas are how you prevent the "noisy neighbor" problem in multi-tenant Kubernetes clusters. The implementation order:
1. Start with LimitRange — set sensible defaults so pods without explicit requests/limits still work
2. Add ResourceQuota — cap total namespace consumption for compute, storage, and object counts
3. Use scopes — separate quotas for batch vs long-running workloads
4. Monitor — track utilization with Prometheus and alert at the 80% threshold
5. Review quarterly — adjust quotas as team needs change
Article #451 in the Codelit engineering series. Explore our full library of system design, infrastructure, and architecture guides at codelit.io.