Service-to-Service Authentication: Securing Internal Communication
When a user authenticates to your application, their identity is well understood. But what happens when Service A calls Service B, which then calls Service C? Each hop needs its own layer of trust. Service-to-service authentication ensures that every internal request is verified, authorized, and auditable.
Why Service-to-Service Auth Matters
Monoliths communicate through in-process function calls — no network, no trust boundary. Microservices communicate over the network, and the network is hostile. Without service-to-service authentication:
- A compromised container can impersonate any service.
- Lateral movement after a breach is trivial.
- Compliance evidence is hard to produce: frameworks such as SOC 2 and PCI-DSS expect access to sensitive systems to be authenticated and attributable to an identity.
Mutual TLS (mTLS)
Standard TLS authenticates the server to the client. Mutual TLS (mTLS) authenticates both sides. Each service presents a certificate, and both parties verify the other's identity before exchanging data.
┌──────────┐ ┌──────────┐
│ Service A │──── ClientHello ──────▶│ Service B │
│ │◀─── ServerHello ───────│ │
│ │◀─── Server Cert ───────│ │
│ │──── Client Cert ──────▶│ │
│ │◀─── Finished ──────────│ │
└──────────┘ └──────────┘
Key considerations:
- Certificate authority (CA): Use an internal CA — never self-signed certs in production.
- Short-lived certificates: Rotate certificates every 24 hours or less to limit exposure from a compromised key.
- Certificate revocation: Maintain a CRL or use OCSP so compromised certs are rejected quickly. Revocation data takes time to propagate, which is one more reason to keep certificate lifetimes short.
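As a minimal sketch of the considerations above, the following uses Python's standard `ssl` module to build one context for the receiving service and one for the calling service. The function names and the file paths passed to `load_identity` are illustrative assumptions, not part of any particular deployment.

```python
import ssl


def server_context() -> ssl.SSLContext:
    """TLS server context that refuses clients without a valid certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # CERT_REQUIRED is what turns ordinary TLS into mutual TLS on the server side.
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx


def client_context() -> ssl.SSLContext:
    """TLS client context; PROTOCOL_TLS_CLIENT verifies the server by default."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def load_identity(ctx: ssl.SSLContext, cert: str, key: str, internal_ca: str) -> None:
    """Attach this service's certificate and trust only the internal CA.

    The three paths are hypothetical; in practice they point at files
    written by whatever issues the short-lived certificates.
    """
    ctx.load_cert_chain(certfile=cert, keyfile=key)
    ctx.load_verify_locations(cafile=internal_ca)
```

Because neither context loads the system trust store, only certificates chaining to the internal CA are accepted. With short-lived certificates, `load_identity` would need to be called again (or the context rebuilt) whenever the files are rotated.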
Service Accounts and Static Credentials
The simplest form of service auth uses a shared secret — an API key or service account token. While easy to implement, static credentials carry risk:
| Approach | Pros | Cons |
|---|---|---|
| API keys | Simple, language-agnostic | Must rotate manually, easy to leak |
| Service account tokens | Scoped permissions, auditable | Long-lived, stored in env vars or vaults |
| OAuth 2.0 client credentials | Standard flow, token expiry built in | Requires an authorization server |
API key rotation strategy:
- Generate a new key and add it to the allow-list.
- Deploy the new key to the calling service.
- Monitor logs to confirm zero traffic uses the old key.
- Remove the old key from the allow-list.
Overlap both keys during rotation to avoid downtime. Automate this with a secrets manager like HashiCorp Vault or AWS Secrets Manager.
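The rotation steps can be sketched as a small in-memory allow-list. This is an illustration under stated assumptions: `KeyAllowList` and its methods are hypothetical names, and a real deployment would keep the list in a secrets manager or database rather than in process memory.

```python
import hashlib
import hmac
import time


def _digest(key: str) -> str:
    # Store only a hash so the allow-list itself does not hold usable keys.
    return hashlib.sha256(key.encode()).hexdigest()


class KeyAllowList:
    """Hypothetical allow-list that records when each key was last used."""

    def __init__(self) -> None:
        # Maps key digest -> last-used timestamp (None if never used).
        self._last_used = {}

    def add(self, key: str) -> None:
        self._last_used[_digest(key)] = None

    def remove(self, key: str) -> None:
        self._last_used.pop(_digest(key), None)

    def verify(self, presented: str) -> bool:
        candidate = _digest(presented)
        for stored in self._last_used:
            # Constant-time comparison avoids leaking digest prefixes via timing.
            if hmac.compare_digest(stored, candidate):
                self._last_used[stored] = time.time()
                return True
        return False

    def idle_for(self, key: str, seconds: float) -> bool:
        """True if the key has not been used within the last `seconds`."""
        last = self._last_used.get(_digest(key))
        return last is None or time.time() - last >= seconds
```

During the overlap window both keys pass `verify`. Once `idle_for(old_key, ...)` has held for long enough to be confident that no caller still uses the old key, `remove(old_key)` completes the rotation.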
JWT Propagation
In request chains (A calls B calls C), the original caller's identity must propagate without re-authentication at every hop. JWT propagation solves this by passing a signed token through the chain.
User ──▶ API Gateway ──▶ Service A ──▶ Service B ──▶ Service C
         (issues JWT)    (validates,   (validates,   (validates)
                          forwards)     forwards)
Best practices:
- Do not blindly forward JWTs. Each service should validate the token signature and claims before acting.
- Use audience (aud) claims to scope tokens to specific services.
- Keep tokens short-lived (5-15 minutes) and refresh at the gateway.
- Embed minimal claims — only what downstream services need for authorization.
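To make the per-hop validation concrete, here is a minimal sketch that checks signature, expiry, and audience using only the standard library. It assumes HS256 (a shared secret) purely to stay dependency-free; production systems more commonly verify asymmetric signatures with a maintained JWT library. `sign_hs256` is a hypothetical helper included only so the validator can be exercised.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_decode(part: str) -> bytes:
    # JWTs strip base64 padding; restore it before decoding.
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Hypothetical helper: mint a token for demonstration purposes."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"


def validate(token: str, secret: bytes, expected_aud: str,
             now: Optional[float] = None) -> dict:
    """Return the claims if the token is valid for this service, else raise ValueError."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        signature = _b64url_decode(sig_b64)
    except ValueError as exc:
        raise ValueError("malformed token") from exc
    # Pin the algorithm; never let the token choose how it is verified.
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(),
                        hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= current:
        raise ValueError("token expired or missing exp")
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if expected_aud not in audiences:
        raise ValueError("token not intended for this service")
    return claims
```

Each service in the chain would call `validate` with its own name as `expected_aud` before acting on the request, so a token scoped to one service is rejected by another.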
Token Exchange (RFC 8693)
When Service A needs to call Service B on behalf of the user but with different scopes, use the OAuth 2.0 Token Exchange flow. Service A sends the original token to the authorization server and receives a new token scoped for Service B.
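As a rough illustration, the form body Service A would POST to the authorization server's token endpoint can be built as follows. The parameter names and URN values come from RFC 8693; the function name, the audience, and the scope value are assumptions for the example, and the request would also carry Service A's own client authentication.

```python
from urllib.parse import urlencode

# Identifiers defined by RFC 8693.
TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def token_exchange_body(subject_token: str, audience: str, scope: str) -> str:
    """Build the application/x-www-form-urlencoded body for a token exchange."""
    return urlencode({
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": subject_token,          # the original caller's token
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "audience": audience,                    # the service the new token is for
        "scope": scope,                          # narrower scopes for the next hop
    })
```

The response, if the server permits the exchange, contains a new `access_token` that Service A sends to Service B in place of the original token.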
SPIFFE and SPIRE
SPIFFE (Secure Production Identity Framework for Everyone) provides a standard for workload identity. Every workload gets a cryptographic identity document, an SVID (SPIFFE Verifiable Identity Document), which carries a SPIFFE ID of the form spiffe://trust-domain/path.
SPIRE is the reference implementation:
┌────────────────────────────────────────────┐
│ SPIRE Server │
│ ┌──────────┐ ┌───────────┐ ┌────────┐ │
│ │ Node │ │ Workload │ │ CA │ │
│ │ Attestor │ │ Registrar │ │ Plugin │ │
│ └──────────┘ └───────────┘ └────────┘ │
└────────────────────┬───────────────────────┘
│
┌────────────┼────────────┐
▼ ▼ ▼
┌───────────┐ ┌───────────┐ ┌───────────┐
│SPIRE Agent│ │SPIRE Agent│ │SPIRE Agent│
│ (Node 1) │ │ (Node 2) │ │ (Node 3) │
└───────────┘ └───────────┘ └───────────┘
How it works:
- SPIRE agents attest node identity using platform-specific methods (AWS instance metadata, Kubernetes service accounts).
- The SPIRE server validates attestation and issues SVIDs.
- Workloads request SVIDs from their local agent via the Workload API — no secrets stored in environment variables.
- SVIDs can be X.509 certificates (for mTLS) or JWTs (for header-based auth).
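Once a peer's SVID has been verified, the receiving service still has to decide whether that identity is allowed. A minimal sketch of that check, assuming the SPIFFE ID has already been extracted from the verified SVID, might look like this. The trust domain and paths are made-up examples, and the parser covers only the basic `spiffe://trust-domain/path` shape rather than every rule in the SPIFFE specification.

```python
from typing import Iterable, Tuple
from urllib.parse import urlparse


def parse_spiffe_id(spiffe_id: str) -> Tuple[str, str]:
    """Split a SPIFFE ID into (trust_domain, path), or raise ValueError."""
    parsed = urlparse(spiffe_id)
    if parsed.scheme != "spiffe" or not parsed.netloc:
        raise ValueError("not a SPIFFE ID")
    if parsed.query or parsed.fragment:
        raise ValueError("SPIFFE IDs carry no query or fragment")
    return parsed.netloc, parsed.path


def is_allowed(spiffe_id: str, trust_domain: str, allowed_paths: Iterable[str]) -> bool:
    """Allow only known workloads from the expected trust domain."""
    try:
        domain, path = parse_spiffe_id(spiffe_id)
    except ValueError:
        return False
    return domain == trust_domain and path in set(allowed_paths)
```

For example, `is_allowed("spiffe://example.org/ns/prod/sa/billing", "example.org", {"/ns/prod/sa/billing"})` accepts the caller, while the same path under a different trust domain is rejected.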
Workload Identity on Cloud Platforms
GKE Workload Identity
Google Kubernetes Engine maps Kubernetes service accounts to Google Cloud service accounts, eliminating the need for exported key files.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-service
  annotations:
    iam.gke.io/gcp-service-account: my-service@project.iam.gserviceaccount.com
The GKE metadata server, which runs on every node, intercepts the pod's credential requests and exchanges its Kubernetes service account token for a short-lived Google access token, so no static credentials touch the container.
EKS Pod Identity
Amazon EKS uses IAM Roles for Service Accounts (IRSA) or the newer EKS Pod Identity to map Kubernetes service accounts to AWS IAM roles.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-service
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/my-service-role
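With IRSA, which is what the annotation above configures, the AWS SDK automatically discovers the projected token and exchanges it for temporary AWS credentials via STS. EKS Pod Identity needs no annotation: the role is bound to the service account through a pod identity association, and the SDK obtains credentials from the Pod Identity Agent running on the node.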
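For IRSA, the SDK's discovery relies on two environment variables injected into the pod, `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE`. The sketch below is a small diagnostic, not something the SDK requires: `irsa_settings` is a hypothetical helper that reports whether those variables are present, which can help when debugging a pod that falls back to node credentials.

```python
import os
from typing import Mapping, Optional


def irsa_settings(env: Mapping[str, str] = os.environ) -> Optional[dict]:
    """Return the IRSA settings visible to this process, or None if absent."""
    role_arn = env.get("AWS_ROLE_ARN")
    token_file = env.get("AWS_WEB_IDENTITY_TOKEN_FILE")
    if not role_arn or not token_file:
        # Without both values the SDK moves on to other credential sources.
        return None
    return {"role_arn": role_arn, "token_file": token_file}
```

The helper only inspects configuration; the token exchange itself is still performed by the AWS SDK.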
Zero Trust Between Services
Service-to-service auth is the backbone of zero trust inside the network:
- Identity everywhere — Every service has a cryptographic identity (mTLS cert or SPIFFE SVID).
- Authenticate every request — No implicit trust based on network location.
- Authorize at the service — Each service enforces its own authorization policies, not just perimeter rules.
- Encrypt all traffic — mTLS ensures encryption in transit even inside the private network.
- Observe and audit — Log every service-to-service call with caller identity for forensic analysis.
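The "authorize at the service" and "observe and audit" principles can be sketched together: a default-deny policy lookup keyed by caller identity, with every decision written to an audit log. The policy table, the identities, and the logger name below are all hypothetical; a real service would load policy from configuration and take the caller identity from the verified certificate or token.

```python
import json
import logging
import time

# Hypothetical policy: caller identity -> set of (method, path) it may invoke.
POLICY = {
    "spiffe://example.org/orders": {("GET", "/inventory"), ("POST", "/reservations")},
    "spiffe://example.org/reports": {("GET", "/inventory")},
}

audit_log = logging.getLogger("s2s.audit")


def authorize(caller_id: str, method: str, path: str, policy=POLICY) -> bool:
    """Default-deny check that records every decision with the caller identity."""
    allowed = (method, path) in policy.get(caller_id, frozenset())
    audit_log.info(json.dumps({
        "ts": time.time(),
        "caller": caller_id,
        "method": method,
        "path": path,
        "allowed": allowed,
    }))
    return allowed
```

Unknown callers and unlisted endpoints are denied, and denied attempts are logged alongside allowed ones so that forensic analysis can see both.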
Network Policy Is Not Enough
Kubernetes NetworkPolicies restrict which pods can communicate but do not verify identity. A compromised pod within an allowed namespace can impersonate any service. Combine network policy with mTLS for defense in depth.
Service Mesh Integration
Service meshes like Istio, Linkerd, and Consul Connect automate mTLS between services:
- Sidecar proxies handle certificate management and TLS termination transparently.
- Authorization policies define which services can call which endpoints.
- Certificate rotation happens automatically — typically every 24 hours.
┌──────────────────┐ ┌──────────────────┐
│ Service A │ │ Service B │
│ ┌──────────────┐ │ mTLS │ ┌──────────────┐ │
│ │ App Code │ │◀───────▶│ │ App Code │ │
│ │ │ │ │ │ │ │
│ └──────┬───────┘ │ │ └──────┬───────┘ │
│ ┌──────▼───────┐ │ │ ┌──────▼───────┐ │
│ │ Envoy Proxy │ │ │ │ Envoy Proxy │ │
│ └──────────────┘ │ │ └──────────────┘ │
└──────────────────┘ └──────────────────┘
The application code makes plain HTTP calls. The sidecar proxy upgrades them to mTLS transparently.
Choosing the Right Strategy
| Scenario | Recommended Approach |
|---|---|
| Small team, few services | OAuth 2.0 client credentials |
| Kubernetes-native | Workload identity (GKE/EKS) + SPIFFE |
| Service mesh deployed | Mesh-managed mTLS |
| Cross-cloud or hybrid | SPIFFE/SPIRE |
| Legacy integration | API keys with automated rotation |
Key Takeaways
- Never rely on network location as a proxy for identity. Authenticate every service-to-service call.
- mTLS provides both authentication and encryption — prefer it over token-only approaches when possible.
- SPIFFE/SPIRE standardizes workload identity across platforms and avoids vendor lock-in.
- Cloud-native workload identity (GKE, EKS) eliminates static credentials for cloud API access.
- Automate API key rotation with overlap periods to prevent downtime.
- JWT propagation carries caller identity across service hops — validate at every hop.
Service-to-service authentication transforms your internal network from a flat trust zone into a verified, encrypted, and auditable communication fabric.