gRPC Architecture: High-Performance APIs with Protocol Buffers and HTTP/2
REST dominated API design for two decades. gRPC challenges that dominance with binary serialization, HTTP/2 multiplexing, and first-class streaming — enabling communication patterns that REST cannot match.
Why gRPC#
REST/JSON: Text parsing → large payloads → one request at a time per connection
gRPC/Protobuf: Binary parsing → compact payloads → multiplexed streams
gRPC delivers:
- Payloads often several times smaller than JSON (Protocol Buffers binary encoding)
- Multiplexed requests over a single TCP connection (HTTP/2)
- Bidirectional streaming — both client and server send message sequences
- Code generation — type-safe clients and servers in 12+ languages
- Deadlines and cancellation — built into the protocol
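The compactness comes from the wire format: each field is a tag byte (field number and wire type) followed by the value, with integers encoded as base-128 varints. A hand-rolled sketch of that encoding (not the real protobuf library; the JSON key `a` is chosen only for comparison):

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeVarintField encodes one protobuf varint field: a tag byte
// (field_number<<3 | wire_type) followed by the value as a varint.
func encodeVarintField(fieldNumber int, value uint64) []byte {
	buf := make([]byte, 0, 11)
	buf = append(buf, byte(fieldNumber<<3)|0) // wire type 0 = varint
	return binary.AppendUvarint(buf, value)
}

func main() {
	// The classic example from the protobuf encoding docs:
	// field 1 = 150 encodes to three bytes: 08 96 01.
	encoded := encodeVarintField(1, 150)
	fmt.Printf("protobuf: % x (%d bytes)\n", encoded, len(encoded))

	// The same datum as minimal JSON is three times larger.
	json := `{"a":150}`
	fmt.Printf("json:     %s (%d bytes)\n", json, len(json))
}
```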
Protocol Buffers (Protobuf)#
Protobuf is gRPC's interface definition language and serialization format.
syntax = "proto3";

package ecommerce;

import "google/protobuf/timestamp.proto";

service OrderService {
  rpc CreateOrder (CreateOrderRequest) returns (Order);
  rpc GetOrder (GetOrderRequest) returns (Order);
  rpc StreamOrderUpdates (GetOrderRequest) returns (stream OrderUpdate);
  rpc BulkCreateOrders (stream CreateOrderRequest) returns (BulkCreateResponse);
}

message CreateOrderRequest {
  string customer_id = 1;
  repeated OrderItem items = 2;
  Address shipping_address = 3;
}

message Order {
  string id = 1;
  string customer_id = 2;
  repeated OrderItem items = 3;
  OrderStatus status = 4;
  google.protobuf.Timestamp created_at = 5;
}

enum OrderStatus {
  ORDER_STATUS_UNSPECIFIED = 0;
  ORDER_STATUS_PENDING = 1;
  ORDER_STATUS_CONFIRMED = 2;
  ORDER_STATUS_SHIPPED = 3;
}
Key Protobuf rules:
- Fields are identified by number, not name — safe to rename fields
- Use `repeated` for arrays, `map` for dictionaries
- Never reuse or change a field number after deployment
- Always include an `UNSPECIFIED = 0` value for enums
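Proto3 can enforce the never-reuse rule at compile time with `reserved`; a sketch with a hypothetical retired field:

```proto
message Order {
  reserved 6;              // number of a field deleted in an earlier version
  reserved "coupon_code";  // and its old name, so neither can be reused

  string id = 1;
  // ... remaining fields as above
}
```

With these declarations, `protoc` rejects any later attempt to declare a field numbered 6 or named `coupon_code` in this message.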
HTTP/2 Multiplexing#
HTTP/1.1 handles one request at a time per connection (keep-alive reuses the connection, but only sequentially). HTTP/2 multiplexes many concurrent requests over a single connection using streams.
HTTP/1.1:
Connection 1: Request A → Response A
Connection 2: Request B → Response B
Connection 3: Request C → Response C
HTTP/2:
Connection 1:
  Stream 1: Request A → Response A
  Stream 2: Request B → Response B
  Stream 3: Request C → Response C
Benefits for gRPC:
- No head-of-line blocking at the HTTP layer
- Fewer TCP connections = less overhead
- Header compression (HPACK) reduces repetitive metadata
- Binary framing carries gRPC's length-prefixed messages without any text parsing
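Demultiplexing by stream ID is the core of the trick: the receiver regroups interleaved frames into per-request sequences. A toy Go sketch (frames and payloads invented for illustration; real HTTP/2 framing carries much more state):

```go
package main

import "fmt"

// frame is a toy stand-in for an HTTP/2 DATA frame. The stream ID is
// what lets one TCP connection carry many RPCs concurrently
// (client-initiated streams use odd IDs: 1, 3, 5, ...).
type frame struct {
	streamID int
	payload  string
}

// demux groups interleaved frames back into per-stream sequences,
// as the receiving end of an HTTP/2 connection does.
func demux(frames []frame) map[int][]string {
	streams := map[int][]string{}
	for _, f := range frames {
		streams[f.streamID] = append(streams[f.streamID], f.payload)
	}
	return streams
}

func main() {
	// Three requests interleaved on one connection.
	connection := []frame{
		{1, "GetOrder part 1"}, {3, "CreateOrder part 1"},
		{1, "GetOrder part 2"}, {5, "StreamOrderUpdates"},
		{3, "CreateOrder part 2"},
	}
	for id, msgs := range demux(connection) {
		fmt.Println("stream", id, "->", len(msgs), "frames")
	}
}
```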
Streaming Patterns#
gRPC supports four communication patterns:
1. Unary RPC#
Standard request-response. One message in, one message out.
rpc GetOrder (GetOrderRequest) returns (Order);
2. Server Streaming#
Client sends one request, server returns a stream of messages.
rpc StreamOrderUpdates (GetOrderRequest) returns (stream OrderUpdate);
Use case: live dashboard updates, log tailing, real-time notifications.
3. Client Streaming#
Client sends a stream of messages, server returns one response.
rpc BulkCreateOrders (stream CreateOrderRequest) returns (BulkCreateResponse);
Use case: file uploads, batch ingestion, telemetry reporting.
4. Bidirectional Streaming#
Both sides send message streams independently.
rpc Chat (stream ChatMessage) returns (stream ChatMessage);
Use case: real-time chat, collaborative editing, game state sync.
Client                    Server
│── ChatMessage ──────────────▶│
│── ChatMessage ──────────────▶│
│◀────────────── ChatMessage ──│
│── ChatMessage ──────────────▶│
│◀────────────── ChatMessage ──│
│◀────────────── ChatMessage ──│
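Of the four patterns, client streaming is the easiest to model in plain Go: a channel stands in for the request stream and one struct for the summary response. This is an analogy using names from the earlier .proto, not generated gRPC code:

```go
package main

import "fmt"

// CreateOrderRequest stands in for the protobuf message of the same
// name; the channel below stands in for the client's request stream.
type CreateOrderRequest struct {
	CustomerID string
	Items      int
}

type BulkCreateResponse struct {
	OrdersCreated int
	ItemsTotal    int
}

// bulkCreateOrders models the client-streaming handler: it consumes
// the entire stream, then returns a single summary response.
func bulkCreateOrders(stream <-chan CreateOrderRequest) BulkCreateResponse {
	var resp BulkCreateResponse
	for req := range stream {
		resp.OrdersCreated++
		resp.ItemsTotal += req.Items
	}
	return resp
}

func main() {
	stream := make(chan CreateOrderRequest, 3)
	stream <- CreateOrderRequest{CustomerID: "c1", Items: 2}
	stream <- CreateOrderRequest{CustomerID: "c2", Items: 5}
	stream <- CreateOrderRequest{CustomerID: "c1", Items: 1}
	close(stream) // closing the channel plays the role of CloseSend

	resp := bulkCreateOrders(stream)
	fmt.Printf("created=%d items=%d\n", resp.OrdersCreated, resp.ItemsTotal)
}
```

In real gRPC the handler would loop on `stream.Recv()` until `io.EOF`; the channel's `range`/`close` pair mirrors that shape.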
gRPC vs REST#
| Aspect | gRPC | REST |
|---|---|---|
| Serialization | Protobuf (binary) | JSON (text) |
| Transport | HTTP/2 | HTTP/1.1 or HTTP/2 |
| Streaming | Native (4 patterns) | Limited (SSE, WebSocket) |
| Code generation | Built-in | OpenAPI/Swagger (separate) |
| Browser support | Via gRPC-Web proxy | Native |
| Human readability | Binary (needs tooling) | Plain text |
| Caching | No native HTTP caching | HTTP cache headers |
| Adoption | Internal services | Public APIs |
Use gRPC for: service-to-service communication, high-throughput pipelines, streaming, polyglot microservices.
Use REST for: public APIs, browser clients, simple CRUD, when HTTP caching matters.
Deadlines and Timeouts#
gRPC has first-class support for deadlines that propagate across service calls.
Client (deadline: 5s)
  → Service A (remaining: 4.8s)
    → Service B (remaining: 3.2s)
      → Service C (remaining: 1.1s) → DEADLINE_EXCEEDED
import (
	"context"
	"time"

	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()

resp, err := client.GetOrder(ctx, &pb.GetOrderRequest{Id: "123"})
if err != nil {
	st, _ := status.FromError(err)
	if st.Code() == codes.DeadlineExceeded {
		// Handle timeout
	}
}
Best practice: Always set deadlines. A missing deadline means a request can hang forever.
Interceptors (Middleware)#
Interceptors are gRPC's middleware pattern — they wrap every RPC call for cross-cutting concerns.
import (
	"context"
	"log"
	"time"

	"google.golang.org/grpc"
)

// Unary server interceptor for logging
func loggingInterceptor(
	ctx context.Context,
	req interface{},
	info *grpc.UnaryServerInfo,
	handler grpc.UnaryHandler,
) (interface{}, error) {
	start := time.Now()
	resp, err := handler(ctx, req)
	log.Printf("method=%s duration=%s error=%v",
		info.FullMethod, time.Since(start), err)
	return resp, err
}

server := grpc.NewServer(
	grpc.UnaryInterceptor(loggingInterceptor),
)
Common interceptor uses:
- Authentication — validate tokens from metadata
- Logging — record method, duration, status
- Metrics — Prometheus counters and histograms
- Rate limiting — throttle by client identity
- Retry — automatic retry with backoff
Load Balancing#
gRPC uses long-lived HTTP/2 connections, so a traditional L4 (TCP) load balancer pins all of a client's RPCs to the single backend it first connected to. Balancing individual RPCs requires L7 (application-layer) load balancing.
┌─────────┐     L7 Load Balancer      ┌──────────┐
│ Client  │──▶  (Envoy / Linkerd)  ──▶│ Server 1 │
│         │    inspects HTTP/2        │ Server 2 │
│         │    frames per-RPC         │ Server 3 │
└─────────┘                           └──────────┘
Options:
- Client-side — client discovers servers and balances (grpc-go built-in)
- Proxy-based — Envoy, NGINX, or HAProxy with HTTP/2 support
- Service mesh — Istio or Linkerd handle it transparently
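At its core, client-side balancing can be as simple as a round-robin picker over discovered backends. A toy sketch (grpc-go's built-in `round_robin` policy does this per-RPC under the hood; the addresses are illustrative):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// roundRobin is a minimal client-side picker: each RPC takes the next
// backend in turn, decoupling balancing from connection lifetime.
type roundRobin struct {
	backends []string
	next     atomic.Uint64
}

func (r *roundRobin) pick() string {
	n := r.next.Add(1) - 1
	return r.backends[n%uint64(len(r.backends))]
}

func main() {
	lb := &roundRobin{backends: []string{
		"10.0.0.1:50051", "10.0.0.2:50051", "10.0.0.3:50051",
	}}
	for i := 0; i < 4; i++ {
		fmt.Println("rpc", i, "->", lb.pick()) // cycles .1, .2, .3, .1
	}
}
```

The atomic counter makes the picker safe to call from concurrent RPCs, which is exactly the situation a multiplexed gRPC client creates.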
Service Mesh Integration#
gRPC works naturally with service meshes because both operate at L7:
  Pod A                          Pod B
┌────────────────┐           ┌────────────────┐
│   App (gRPC)   │           │   App (gRPC)   │
│       │        │           │       ▲        │
│       ▼        │           │       │        │
│  Envoy Sidecar │── mTLS ───│  Envoy Sidecar │
└────────────────┘           └────────────────┘
The mesh provides:
- Mutual TLS without application code changes
- Per-RPC load balancing and retries
- Distributed tracing headers propagation
- Circuit breaking on error rates
Developer Tools#
grpcurl#
Command-line tool for interacting with gRPC servers (like curl for gRPC).
# List services
grpcurl -plaintext localhost:50051 list
# Describe a service
grpcurl -plaintext localhost:50051 describe ecommerce.OrderService
# Call an RPC
grpcurl -plaintext -d '{"id": "order-123"}' \
localhost:50051 ecommerce.OrderService/GetOrder
Evans#
Interactive gRPC client with a REPL interface.
evans --host localhost --port 50051 -r repl
# Inside Evans REPL
ecommerce.OrderService@localhost:50051> call GetOrder
id (TYPE_STRING) => order-123
Other Tools#
- Buf — linting, breaking change detection, and code generation for Protobuf
- BloomRPC — GUI client for gRPC (like Postman; the project is now archived)
- gRPC Gateway — generates a RESTful JSON reverse proxy from Protobuf definitions
gRPC is the backbone of modern service-to-service communication. Its combination of Protobuf efficiency, HTTP/2 multiplexing, and native streaming makes it the default choice for internal microservice APIs.