Kubernetes DaemonSet Guide — Logging, Monitoring, Node Agents, and Rolling Updates
What is a DaemonSet and when do you need one#
A DaemonSet ensures that a copy of a Pod runs on every node in your cluster (or a selected subset). When a new node joins, the DaemonSet controller automatically schedules a Pod on it. When a node is removed, the Pod is garbage collected.
This makes DaemonSets the right choice for workloads that must run on every node rather than being scheduled by the default scheduler to arbitrary nodes.
Core use cases#
Logging agents#
Every node generates container logs, kubelet logs, and kernel logs. A DaemonSet running Fluentd, Fluent Bit, or a Vector agent collects them all and ships them to your logging backend.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: logging
spec:
  selector:
    matchLabels:
      app: fluent-bit
  template:
    metadata:
      labels:
        app: fluent-bit
    spec:
      containers:
      - name: fluent-bit
        image: fluent/fluent-bit:3.1
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: containers
          mountPath: /var/lib/docker/containers
          readOnly: true
        resources:
          requests:
            cpu: 50m
            memory: 64Mi
          limits:
            cpu: 200m
            memory: 256Mi
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: containers
        hostPath:
          path: /var/lib/docker/containers
Monitoring and metrics#
Node-level metrics exporters like Prometheus Node Exporter or Datadog Agent need access to the host's /proc and /sys filesystems. A DaemonSet guarantees one exporter per node.
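As an illustration, a minimal Node Exporter DaemonSet might mount the host filesystems read-only and point the exporter at them; the image tag, namespace, and flags below are typical choices, not taken from the original:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      hostNetwork: true   # expose metrics on the node's own IP
      hostPID: true       # see host processes, not just the container's
      containers:
      - name: node-exporter
        image: prom/node-exporter:v1.8.1
        args:
        - --path.procfs=/host/proc
        - --path.sysfs=/host/sys
        volumeMounts:
        - name: proc
          mountPath: /host/proc
          readOnly: true
        - name: sys
          mountPath: /host/sys
          readOnly: true
      volumes:
      - name: proc
        hostPath:
          path: /proc
      - name: sys
        hostPath:
          path: /sys
```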
Networking#
CNI plugins (Calico, Cilium, Flannel) and kube-proxy itself run as DaemonSets. They configure networking on each node and must be present before other Pods can communicate.
Storage#
CSI node plugins that mount volumes (e.g., EBS CSI driver, Longhorn) run as DaemonSets to handle volume staging, mounting, and unmounting on each node (attach/detach is handled by the controller plugin).
Security#
Runtime security tools like Falco or Tetragon need kernel-level access on every node to monitor syscalls and enforce policies.
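A hedged sketch of the Pod-level settings such a tool typically needs — the exact fields vary by tool, and the Falco image tag here is an assumption:

```yaml
spec:
  template:
    spec:
      hostPID: true                # observe host processes
      containers:
      - name: falco
        image: falcosecurity/falco:0.38.0
        securityContext:
          privileged: true         # needed for kernel module or eBPF access
        volumeMounts:
        - name: dev
          mountPath: /host/dev
          readOnly: true
      volumes:
      - name: dev
        hostPath:
          path: /dev
```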
Node affinity and node selectors#
By default, a DaemonSet runs on every schedulable node. Use nodeSelector or nodeAffinity to restrict which nodes get the Pod.
Simple node selector#
spec:
  template:
    spec:
      nodeSelector:
        node-role: worker
This ensures the DaemonSet only runs on nodes labeled node-role=worker, skipping control plane nodes or specialized node pools.
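Nodes get such labels from your provisioner or by hand; for example (the node name here is hypothetical):

```shell
# Label a worker node so the DaemonSet's nodeSelector matches it
kubectl label node worker-node-1 node-role=worker
```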
Node affinity for complex rules#
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/arch
                operator: In
                values:
                - amd64
              - key: node-type
                operator: NotIn
                values:
                - spot
This runs the DaemonSet only on amd64 nodes that are not spot instances. Useful when your agent binaries are architecture-specific or when spot nodes churn too frequently.
Tolerations#
Kubernetes taints nodes to repel Pods. Control plane nodes are tainted with node-role.kubernetes.io/control-plane:NoSchedule by default. DaemonSets that must run everywhere need tolerations.
Tolerate control plane nodes#
spec:
  template:
    spec:
      tolerations:
      - key: node-role.kubernetes.io/control-plane
        operator: Exists
        effect: NoSchedule
Tolerate all taints (for critical infrastructure)#
spec:
  template:
    spec:
      tolerations:
      - operator: Exists
A toleration with only operator: Exists and no key matches every taint. Use this for truly essential agents (logging, monitoring) that must run on every node regardless of taints.
Tolerate specific workload taints#
spec:
  template:
    spec:
      tolerations:
      - key: dedicated
        value: gpu
        effect: NoSchedule
      - key: dedicated
        value: high-memory
        effect: NoSchedule
This lets your monitoring agent run on GPU and high-memory node pools that reject regular workloads.
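For context, the matching taints would have been applied to the dedicated pools beforehand, along these lines (node names hypothetical):

```shell
# Taint the dedicated pools so regular Pods are repelled
kubectl taint nodes gpu-node-1 dedicated=gpu:NoSchedule
kubectl taint nodes himem-node-1 dedicated=high-memory:NoSchedule
```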
Rolling updates#
DaemonSets support two update strategies: RollingUpdate (default) and OnDelete.
RollingUpdate configuration#
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 0
- maxUnavailable — how many nodes can have their DaemonSet Pod down during the update. The default is 1. For large clusters, set a percentage such as maxUnavailable: 25%.
- maxSurge — how many extra Pods can be created during the update. Setting maxSurge: 1 with maxUnavailable: 0 enables zero-downtime updates by starting the new Pod before terminating the old one.
Zero-downtime update strategy#
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0
      maxSurge: 1
This creates the new Pod first, waits for it to be ready, then terminates the old Pod. Essential for networking DaemonSets where even brief gaps cause dropped connections.
OnDelete strategy#
spec:
  updateStrategy:
    type: OnDelete
With OnDelete, Pods are only updated when they are manually deleted. Useful for agents that cannot tolerate automatic restarts or when you need full control over the rollout timing.
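With OnDelete you drive the rollout yourself, typically one node at a time. A sketch of that workflow — the namespace, label, and node name are assumptions for illustration:

```shell
# After updating the DaemonSet spec, roll one node at a time:
kubectl delete pod -n logging -l app=fluent-bit \
  --field-selector spec.nodeName=worker-node-1

# The controller recreates the Pod from the new template; verify
# it is Ready before moving on to the next node:
kubectl get pods -n logging -l app=fluent-bit -o wide
```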
Priority classes#
DaemonSet Pods should almost never be evicted. When a node runs low on resources, the kubelet evicts Pods by priority. Give your DaemonSet a high priority to prevent eviction.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: daemonset-critical
value: 1000000
globalDefault: false
description: "Priority for DaemonSet infrastructure Pods"
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
spec:
  template:
    spec:
      priorityClassName: daemonset-critical
For truly critical system DaemonSets (CNI, kube-proxy), use the built-in system-node-critical priority class:
spec:
  template:
    spec:
      priorityClassName: system-node-critical
This is the highest priority available. Only use it for DaemonSets that the node cannot function without.
Resource limits and requests#
DaemonSet Pods compete for resources with application Pods on each node. Set resource requests and limits carefully.
Guidelines#
| Agent Type | CPU Request | CPU Limit | Memory Request | Memory Limit |
|---|---|---|---|---|
| Logging (Fluent Bit) | 50m | 200m | 64Mi | 256Mi |
| Metrics (Node Exporter) | 25m | 100m | 32Mi | 128Mi |
| Full agent (Datadog) | 100m | 500m | 256Mi | 512Mi |
| CNI (Cilium) | 100m | 1000m | 128Mi | 512Mi |
Always set requests#
Without resource requests, your DaemonSet Pods are in the BestEffort QoS class and will be the first evicted under memory pressure. Always set at least requests:
resources:
  requests:
    cpu: 50m
    memory: 64Mi
  limits:
    cpu: 200m
    memory: 256Mi
Account for DaemonSet overhead in capacity planning#
If your DaemonSet consumes 256Mi per node and you have 100 nodes, that is 25Gi of cluster memory dedicated to that single DaemonSet. Factor this into node sizing. A common mistake is adding monitoring agents without adjusting node capacity, causing application Pod evictions.
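The arithmetic is worth making explicit; in shell:

```shell
# 256Mi per node across 100 nodes:
echo "$(( 256 * 100 )) MiB total"         # 25600 MiB
echo "$(( 256 * 100 / 1024 )) GiB total"  # 25 GiB
```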
Health checks#
DaemonSet Pods should have liveness and readiness probes just like any other workload:
containers:
- name: agent
  livenessProbe:
    httpGet:
      path: /healthz
      port: 8080
    initialDelaySeconds: 15
    periodSeconds: 30
  readinessProbe:
    httpGet:
      path: /ready
      port: 8080
    initialDelaySeconds: 5
    periodSeconds: 10
Without a liveness probe, a hung agent still counts as Running — the container has not exited, so Kubernetes will not restart it.
Common mistakes#
- No resource limits — a logging agent with a memory leak can OOM-kill application Pods on the same node
- Missing tolerations — your monitoring agent does not run on tainted GPU nodes, leaving a blind spot
- Using Deployments instead of DaemonSets — a Deployment might schedule two Pods on one node and zero on another. DaemonSets guarantee one per node.
- Ignoring update strategy — the default maxUnavailable: 1 means only one node updates at a time. For a 500-node cluster, a DaemonSet rollout takes hours. Increase maxUnavailable for large clusters.
- No priority class — under resource pressure, the kubelet evicts your monitoring agent first, exactly when you need it most
Debugging DaemonSet issues#
# Check which nodes are missing DaemonSet Pods
kubectl get ds -n logging fluent-bit
# Find nodes without the expected Pod
kubectl get nodes -o name | while read node; do
  kubectl get pods -n logging -l app=fluent-bit \
    --field-selector spec.nodeName=$(echo $node | cut -d/ -f2) \
    --no-headers 2>/dev/null | grep -q . || echo "Missing on $node"
done
# Check why a Pod is not scheduled on a specific node
kubectl describe pod -n logging fluent-bit-xxxxx
Look for taint/toleration mismatches, insufficient resources, or node selector mismatches in the Events section.
Article #441 in the Codelit engineering series. Explore our full library of system design, infrastructure, and architecture guides at codelit.io.