Platform Engineering: Building Internal Developer Platforms That Actually Work
Platform Engineering#
Platform engineering is the discipline of designing and building toolchains and workflows that give software engineering organizations self-service capabilities. Instead of every team reinventing infrastructure, a platform team builds once and lets everyone benefit.
Why Platform Engineering Exists#
DevOps promised that every developer would own their infrastructure. In practice, this created a different problem: cognitive overload. Developers now needed to understand Kubernetes, Terraform, CI/CD pipelines, observability stacks, and security policies — on top of writing application code.
Platform engineering fixes this by abstracting complexity behind self-service interfaces. Developers get the autonomy DevOps promised without the operational burden.
The Internal Developer Platform (IDP)#
An IDP is the product a platform team builds. It sits between developers and the underlying infrastructure, providing:
- Self-service infrastructure — spin up databases, queues, and services without filing tickets
- Golden paths — opinionated, pre-configured workflows that represent the recommended way to build
- Guardrails — security, compliance, and cost controls built into the platform itself
- Visibility — service catalogs, dependency maps, and ownership information
The key insight: an IDP is a product, not a project. It has users (developers), it needs user research, and it requires iteration.
Golden Paths#
A golden path is a supported, well-lit route through the platform. It is not the only way to do things — it is the recommended way.
What makes a good golden path#
- Opinionated but not restrictive — covers 80% of use cases with escape hatches for the rest
- End-to-end — from "git init" to production traffic, the path is clear
- Self-documenting — templates, scaffolding, and sensible defaults reduce the need for documentation
- Maintained — the platform team keeps golden paths up to date with security patches and best practices
Example golden path for a new microservice#
- Developer runs a scaffolding command or clicks "New Service" in the portal
- Template generates repo with CI/CD pipeline, Dockerfile, Helm chart, and observability config
- Service registers automatically in the service catalog
- Developer writes application code and pushes
- Pipeline builds, tests, and deploys to staging
- Promotion to production requires a single approval
No Kubernetes YAML was written. No Terraform was authored. The developer focused on business logic.
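The template step can be sketched in a few lines of Python. This is a minimal illustration, not how any particular portal or CLI implements scaffolding: the `scaffold_service` function, the file names, and the template contents are all hypothetical.

```python
from pathlib import Path

# Hypothetical templates; a real platform would ship far richer ones.
TEMPLATES = {
    "Dockerfile": "FROM python:3.12-slim\nCOPY . /app\n",
    "ci/pipeline.yaml": "service: {name}\nstages: [build, test, deploy-staging]\n",
    "chart/values.yaml": "name: {name}\nowner: {owner}\nreplicas: 2\n",
    "catalog-info.yaml": "name: {name}\nowner: {owner}\n",
}


def scaffold_service(root: Path, name: str, owner: str) -> list[str]:
    """Render every template into root/name and return the created relative paths."""
    created = []
    for rel_path, template in TEMPLATES.items():
        target = root / name / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        # Fill in the service-specific values so the developer edits nothing by hand.
        target.write_text(template.format(name=name, owner=owner))
        created.append(rel_path)
    return sorted(created)
```

Calling `scaffold_service(Path("."), "orders", "team-checkout")` would create an `orders/` directory holding the four files, which is the point of the golden path: the developer supplies a name and an owner, and the platform supplies everything else.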
Self-Service Infrastructure#
Self-service does not mean "do whatever you want." It means "get what you need without waiting."
Levels of self-service#
Level 1 — Ticketing: Developer files a ticket, platform team provisions. Slow but controlled.
Level 2 — Templated: Developer selects from a menu of pre-approved configurations. Faster, still constrained.
Level 3 — Declarative: Developer describes what they need in a config file. Platform reconciles the desired state automatically.
Level 4 — Intent-based: Developer expresses intent ("I need a database for 10K reads/sec") and the platform chooses the implementation.
Most mature platform teams operate at Level 3, with Level 4 emerging for common patterns.
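The difference between Level 3 and Level 4 is easier to see in code. The sketch below assumes resources are plain dictionaries keyed by name; `reconcile`, `plan_database`, and the capacity tiers are hypothetical names and numbers chosen for illustration, not part of any real tool.

```python
def reconcile(desired: dict[str, dict], actual: dict[str, dict]) -> list[tuple[str, str]]:
    """Level 3: compare declared state with real state and list the actions needed."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in actual:
        if name not in desired:
            actions.append(("delete", name))
    return actions


# Hypothetical capacity tiers: (maximum reads per second, instance size).
TIERS = [(1_000, "small"), (20_000, "medium"), (100_000, "large")]


def plan_database(reads_per_sec: int) -> str:
    """Level 4: turn a stated intent into an implementation choice."""
    for max_reads, size in TIERS:
        if reads_per_sec <= max_reads:
            return size
    raise ValueError(f"no tier supports {reads_per_sec} reads/sec")
```

At Level 3 the developer still picks `size: medium` and the platform makes it so. At Level 4 the developer states the load, and `plan_database(10_000)` picks `medium` on their behalf.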
Developer Experience (DevEx)#
Platform engineering succeeds or fails based on developer experience. If developers avoid the platform, it does not matter how well-engineered it is.
Measuring developer experience#
- Time to first deploy — how long from joining to shipping code in production
- Lead time for changes — how long from commit to production
- Cognitive load surveys — do developers feel overwhelmed by infrastructure concerns
- Platform adoption rate — what percentage of services use the golden path
- Escape hatch frequency — how often developers bypass the platform (signals gaps)
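Two of these measurements are simple ratios once the underlying data exists. The sketch below assumes you can export, per service, whether it uses the golden path and how many deploys bypassed the platform; the `Service` record and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    uses_golden_path: bool
    bypassed_deploys: int  # deploys performed outside the platform
    total_deploys: int


def adoption_rate(services: list[Service]) -> float:
    """Share of services built on the golden path."""
    if not services:
        return 0.0
    return sum(s.uses_golden_path for s in services) / len(services)


def escape_hatch_frequency(services: list[Service]) -> float:
    """Share of all deploys that went around the platform."""
    total = sum(s.total_deploys for s in services)
    if total == 0:
        return 0.0
    return sum(s.bypassed_deploys for s in services) / total
```

Tracking both numbers together matters: a rising adoption rate alongside a rising escape hatch frequency suggests teams start on the golden path and then leave it.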
Common DevEx failures#
- Forcing developers to use a portal when they prefer CLI or code
- Providing self-service that is slower than asking someone directly
- Building abstractions that leak — developers still need to understand the underlying system when things break
- Ignoring feedback because "we know better"
Backstage: The Service Catalog#
Backstage, originally built at Spotify, is one of the most widely adopted open-source frameworks for building developer portals. It provides:
- Software catalog — a registry of all services, libraries, and infrastructure components with ownership information
- Software templates — scaffolding for new projects that follow golden paths
- TechDocs — documentation that lives alongside code and renders in the portal
- Plugin ecosystem — integrations with CI/CD, monitoring, incident management, and more
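To make the catalog idea concrete, here is a toy in-memory version in Python. It is not Backstage's API or data model; the `Component` and `Catalog` classes are hypothetical and only show the two questions a catalog answers: who owns a component, and what depends on it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    owner: str
    depends_on: tuple[str, ...] = ()


class Catalog:
    """A toy registry that maps component names to ownership and dependencies."""

    def __init__(self) -> None:
        self._components: dict[str, Component] = {}

    def register(self, component: Component) -> None:
        self._components[component.name] = component

    def owner_of(self, name: str) -> str:
        return self._components[name].owner

    def dependents_of(self, name: str) -> list[str]:
        """Components that would be affected if `name` changed or broke."""
        return sorted(
            c.name for c in self._components.values() if name in c.depends_on
        )
```

With a handful of components registered, `owner_of("orders-db")` answers "who owns this" and `dependents_of("orders-db")` answers "who do I need to warn before changing it".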
When Backstage makes sense#
- You have 50+ services and ownership is unclear
- Developers spend significant time figuring out "who owns this" or "how do I deploy this"
- You want a single pane of glass for the developer experience
When Backstage is overkill#
- Small teams (fewer than 5 developers) where everyone knows everything
- You only need one specific feature — consider a lighter tool instead
Crossplane: Infrastructure as Code, Kubernetes-Native#
Crossplane extends Kubernetes to manage external infrastructure. Instead of writing Terraform, developers define infrastructure as Kubernetes custom resources.
Why Crossplane for platform engineering#
- Kubernetes-native — if your team already knows Kubernetes, the learning curve is lower
- Compositions — platform teams define abstractions (an "Application" resource that creates a database, cache, and queue together)
- Drift detection — Crossplane continuously reconciles desired vs actual state
- GitOps compatible — infrastructure definitions live in Git and are applied through standard Kubernetes workflows
Example Crossplane composition#
A platform team creates a "DatabaseClaim" abstraction. Developers submit:
apiVersion: platform.company.io/v1alpha1
kind: DatabaseClaim
metadata:
  name: orders-db
spec:
  engine: postgresql
  size: medium
  backup: daily
Crossplane translates this into the appropriate cloud provider resources — RDS instance, security groups, parameter groups, backup configurations. The developer never sees the complexity.
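Conceptually, that translation is a mapping from a small claim to a larger set of provider resources. Real Crossplane Compositions are declared as Kubernetes resources rather than written in Python, so treat the following as a sketch of the idea only: `expand_claim`, the resource kinds, the instance classes, and the retention values are all assumptions made for illustration.

```python
# Hypothetical mappings a platform team might maintain.
SIZE_TO_INSTANCE_CLASS = {
    "small": "db.t3.small",
    "medium": "db.t3.medium",
    "large": "db.m5.large",
}
BACKUP_RETENTION_DAYS = {"none": 0, "daily": 7}


def expand_claim(claim: dict) -> list[dict]:
    """Expand a DatabaseClaim-shaped dict into the resources it implies."""
    name = claim["metadata"]["name"]
    spec = claim["spec"]
    if spec["size"] not in SIZE_TO_INSTANCE_CLASS:
        raise ValueError(f"unsupported size: {spec['size']}")
    return [
        {
            "kind": "DatabaseInstance",
            "name": name,
            "engine": spec["engine"],
            "instanceClass": SIZE_TO_INSTANCE_CLASS[spec["size"]],
            "backupRetentionDays": BACKUP_RETENTION_DAYS[spec.get("backup", "none")],
        },
        {"kind": "SecurityGroup", "name": f"{name}-sg"},
        {"kind": "ParameterGroup", "name": f"{name}-params", "family": spec["engine"]},
    ]
```

The developer writes three fields and the platform team owns the mapping tables. Changing what `medium` means is then a platform change, not a change in every service repository.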
Platform Team Responsibilities#
A platform team is not an infrastructure team with a new name. The responsibilities are different:
Build products, not tools#
- Conduct user research with developer teams
- Prioritize features based on developer impact, not technical interest
- Maintain SLAs for the platform itself
- Write documentation and provide onboarding
Define and maintain abstractions#
- Create Crossplane compositions, Backstage templates, and CI/CD pipelines
- Keep abstractions up to date as underlying infrastructure evolves
- Provide escape hatches for cases the abstraction does not cover
Enable, do not gate#
- The platform team should make the right thing easy, not the wrong thing impossible
- Avoid becoming a bottleneck — if developers are waiting on you, the platform has failed
- Provide guardrails (cost limits, security policies) that are automatic, not manual
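An automatic guardrail is a check that runs when a request is submitted, with no human in the loop. A minimal sketch, assuming requests arrive as dictionaries; the field names, the `check_guardrails` function, and the three rules are hypothetical examples rather than a recommended policy set.

```python
def check_guardrails(request: dict, monthly_budget_usd: float) -> list[str]:
    """Return the list of policy violations; an empty list means the request may proceed."""
    violations = []
    cost = request.get("estimated_monthly_cost_usd", 0)
    if cost > monthly_budget_usd:
        violations.append(f"cost {cost} exceeds budget {monthly_budget_usd}")
    if request.get("public_access", False):
        violations.append("public network access is not allowed")
    if not request.get("encrypted", True):
        violations.append("storage encryption must be enabled")
    return violations
```

Because the function returns reasons rather than a bare yes or no, the platform can tell the developer exactly what to change, which keeps the guardrail from turning into a ticket.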
Typical platform team structure#
- Platform product manager — prioritizes based on developer needs
- Platform engineers — build and maintain the IDP
- Developer advocates — bridge between platform team and users
- SRE/Infrastructure engineers — manage underlying infrastructure
As a rough rule of thumb, mature organizations often staff about 1 platform engineer per 15-25 application developers.
Measuring Platform Success#
DORA metrics#
The four DORA metrics remain relevant:
- Deployment frequency — are teams deploying more often
- Lead time for changes — is the time from commit to production shrinking
- Change failure rate — are deployments causing fewer incidents
- Mean time to recovery — when things break, do they recover faster
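All four metrics can be derived from a log of deployments. The sketch below assumes each record carries a commit time, a deploy time, a failure flag, and a restore time; the `Deployment` record is hypothetical, and using the median for lead time is one reasonable choice among several.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False
    restored_at: Optional[datetime] = None  # set once a failed deploy is recovered


def dora_metrics(deploys: list[Deployment], period_days: int) -> dict:
    """Compute the four DORA metrics over a reporting period."""
    if not deploys:
        raise ValueError("need at least one deployment")
    failures = [d for d in deploys if d.failed]
    recoveries = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    return {
        "deploys_per_day": len(deploys) / period_days,
        "median_lead_time": median(d.deployed_at - d.committed_at for d in deploys),
        "change_failure_rate": len(failures) / len(deploys),
        "mean_time_to_recovery": (
            sum(recoveries, timedelta()) / len(recoveries) if recoveries else None
        ),
    }
```

Comparing these numbers for teams on and off the platform gives a more honest picture than tracking them for the organization as a whole.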
Platform-specific metrics#
- Adoption rate — percentage of teams using the platform vs going around it
- Time to onboard — how quickly a new developer or team becomes productive
- Toil reduction — hours saved on repetitive infrastructure tasks
- Support ticket volume — fewer tickets means better self-service
- Developer satisfaction (NPS) — survey developers quarterly
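Net Promoter Score has a fixed formula: on a 0 to 10 scale, the percentage of respondents scoring 9 or 10 (promoters) minus the percentage scoring 0 to 6 (detractors). A minimal sketch, with `net_promoter_score` as an illustrative function name:

```python
def net_promoter_score(scores: list[int]) -> float:
    """Return NPS in the range -100 to 100 for survey scores on a 0-10 scale."""
    if not scores:
        raise ValueError("need at least one survey response")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)
```

Scores of 7 and 8 count toward the total but toward neither group, so a platform that everyone finds "fine" scores 0.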
Anti-metrics#
Watch out for metrics that look good but hide problems:
- High adoption with low satisfaction means developers feel forced, not helped
- Low support ticket volume with high escape-hatch usage means developers have stopped asking for help and are working around the platform
- Fast onboarding with high change failure rate means golden paths have quality gaps
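These combinations are easy to check automatically once the individual metrics are collected. In the sketch below the dictionary keys and every threshold are hypothetical; the useful part is the pattern of testing metrics in pairs rather than one at a time.

```python
def anti_metric_warnings(m: dict[str, float]) -> list[str]:
    """Flag metric combinations that look healthy in isolation but are not."""
    warnings = []
    # Thresholds are placeholders; tune them to your own baselines.
    if m["adoption_rate"] >= 0.8 and m["nps"] < 0:
        warnings.append("high adoption, low satisfaction: developers may feel forced")
    if m["tickets_per_dev"] < 0.1 and m["escape_hatch_rate"] > 0.2:
        warnings.append("few tickets, many bypasses: developers may be working around the platform")
    if m["onboarding_days"] <= 2 and m["change_failure_rate"] > 0.15:
        warnings.append("fast onboarding, high failure rate: golden paths may have quality gaps")
    return warnings
```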
Getting Started#
If you are building a platform team from scratch:
- Start with pain points — survey developers about their biggest time sinks
- Pick one golden path — do not try to platform everything at once
- Build the thinnest viable platform — a CLI tool and some templates can be a platform
- Measure and iterate — treat the platform like a product with regular releases
- Resist the urge to mandate — adoption through value beats adoption through policy
The best platforms are the ones developers choose to use.