A DevOps and SRE AI Agent Workflow That Does Not Make Incidents Worse
How to design a DevOps and SRE AI agent workflow for alerts, deploys, logs, traces, runbooks, approvals, and post-incident learning.
Field guides for agent workflows, MCP tooling, evals, production handoffs, and the architecture behind software that actually ships.
How to evolve APIs safely — additive changes, field deprecation, default values, Postel's law, schema evolution, consumer-driven contracts, and breaking change detection in CI.
A complete guide to API testing — unit tests, integration tests, contract testing (Pact, Dredd), load testing (k6, Artillery), security testing (OWASP ZAP), mocking (MSW, WireMock), and automation.
Complete guide to blue-green deployment — DNS vs load balancer switching, database migration challenges, rollback strategies, smoke testing, tools (AWS CodeDeploy, Kubernetes, Argo Rollouts), and cost considerations.
Master the canary deployment pattern — traffic splitting, automated canary analysis with Kayenta, rollback triggers, and tools like Argo Rollouts, Flagger, and Istio for progressive delivery.
Practical strategies for cutting cloud spend — right-sizing, reserved instances, spot instances, auto-scaling, idle resource detection, FinOps, tagging, cost allocation, and tools like Infracost, Kubecost, and AWS Cost Explorer.
Master online schema changes with pt-online-schema-change, gh-ost, expand-contract pattern, shadow columns, dual-write strategies, data backfill pipelines, and migration testing frameworks.
A complete comparison of deployment strategies — rolling update, blue-green, canary, A/B testing, shadow launch, feature flags, and recreate. Learn when to use each and how to choose.
A comprehensive guide to immutable infrastructure — golden images, Phoenix servers, blue-green deployments, Packer, container images, and why mutable infrastructure causes drift.
A complete guide to incident management — incident lifecycle, severity levels, on-call rotation, war rooms, runbooks, blameless post-mortems, SLO-based detection, and tools like PagerDuty, incident.io, and FireHydrant.
A deep dive into Helm charts — chart structure, values.yaml, templates, hooks, dependencies, chart repositories, Helmfile, and chart testing for reliable Kubernetes deployments.
Deep dive into Kubernetes ingress controllers — NGINX Ingress, Traefik, Istio Gateway, AWS ALB Controller. TLS termination, path-based routing, rate limiting, and production configuration.
Master Kubernetes init containers — use cases for DB migrations, config loading, dependency checks, ordering guarantees, failure handling, resource sharing with app containers, and sidecar comparison.
How PodDisruptionBudgets work in Kubernetes — minAvailable vs maxUnavailable, voluntary vs involuntary disruptions, rolling updates, node drain, and cluster autoscaler interaction.
K8s Secrets are base64-encoded, not encrypted. Learn External Secrets Operator, Sealed Secrets, Vault CSI, SOPS, secret rotation, RBAC hardening, and production best practices.
Build a complete observability stack for microservices with distributed tracing, service maps, golden signals, SLO dashboards, incident response workflows, and the Grafana LGTM stack.
Monorepo vs polyrepo trade-offs, tooling (Nx, Turborepo, Bazel, Lerna), dependency management, build caching, CI/CD strategies, and code ownership with CODEOWNERS.
A comprehensive guide to platform engineering — internal developer platforms, golden paths, self-service infrastructure, developer experience, Backstage, Crossplane, platform team responsibilities, and measuring success.
Understand SLI, SLO, and SLA differences, error budgets, burn rate alerts, choosing SLIs, and tools like Nobl9 and Sloth. A practical guide to the Google SRE approach.
Master all 12 factors of cloud-native application design with modern examples — codebase, dependencies, config, backing services, build/release/run, processes, port binding, concurrency, disposability, dev/prod parity, logs, and admin processes.
Design CI/CD pipelines that actually work — build, test, deploy patterns for monorepos and microservices. GitHub Actions, GitLab CI, and ArgoCD with real examples.
Learn what container orchestration solves, how Kubernetes architecture works, key patterns like sidecar and ambassador, deployment strategies, service mesh, Helm charts, and when to use K8s vs Docker Swarm vs Nomad.
Decouple deployment from release with feature flags — percentage rollouts, A/B testing, kill switches, and tools (LaunchDarkly, Unleash, PostHog). Architecture patterns and best practices.
Complete IaC guide — Terraform, Pulumi, CloudFormation, and CDK compared. State management, modules, multi-cloud patterns, and best practices for managing cloud infrastructure.
Design a production logging architecture — structured logging, log levels, centralized aggregation with ELK/EFK, log shipping (Fluentd, Filebeat, Vector), storage options, querying, alerting, and cost management.
A comprehensive guide to zero downtime migration — expand-contract patterns, shadow writes, dual reads, blue-green deployments, feature flags, data backfill, rollback strategies, and monitoring.
Complete guide to secret management — vault architecture, tools (HashiCorp Vault, AWS Secrets Manager, Doppler, 1Password), rotation strategies, dynamic secrets, encryption patterns, and Kubernetes integration.