gRPC Architecture: High-Performance APIs with Protocol Buffers and HTTP/2
REST dominated API design for two decades. gRPC challenges that dominance with binary serialization, HTTP/2 multiplexing, and first-class streaming — enabling communication patterns that REST cannot match.
Why gRPC#
REST/JSON: Text parsing → large payloads → one request at a time per connection
gRPC/Protobuf: Binary parsing → compact payloads → multiplexed streams
gRPC delivers:
- Payloads often several times smaller than JSON (Protocol Buffers binary encoding)
- Multiplexed requests over a single TCP connection (HTTP/2)
- Bidirectional streaming — both client and server send message sequences
- Code generation — type-safe clients and servers in 12+ languages
- Deadlines and cancellation — built into the protocol
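The compactness comes from the wire format: each field is a tag byte (field number and wire type) followed by the value, with integers encoded as base-128 varints. A hand-rolled sketch of that encoding (not the real protobuf library; the JSON key `a` is chosen only for comparison):

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeVarintField encodes one protobuf varint field: a tag byte
// (field_number<<3 | wire_type) followed by the value as a varint.
func encodeVarintField(fieldNumber int, value uint64) []byte {
	buf := make([]byte, 0, 11)
	buf = append(buf, byte(fieldNumber<<3)|0) // wire type 0 = varint
	return binary.AppendUvarint(buf, value)
}

func main() {
	// The classic example from the protobuf encoding docs:
	// field 1 = 150 encodes to three bytes: 08 96 01.
	encoded := encodeVarintField(1, 150)
	fmt.Printf("protobuf: % x (%d bytes)\n", encoded, len(encoded))

	// The same datum as minimal JSON is three times larger.
	json := `{"a":150}`
	fmt.Printf("json:     %s (%d bytes)\n", json, len(json))
}
```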
Protocol Buffers (Protobuf)#
Protobuf is gRPC's interface definition language and serialization format.
syntax = "proto3";

package ecommerce;

import "google/protobuf/timestamp.proto";

service OrderService {
  rpc CreateOrder (CreateOrderRequest) returns (Order);
  rpc GetOrder (GetOrderRequest) returns (Order);
  rpc StreamOrderUpdates (GetOrderRequest) returns (stream OrderUpdate);
  rpc BulkCreateOrders (stream CreateOrderRequest) returns (BulkCreateResponse);
}

message CreateOrderRequest {
  string customer_id = 1;
  repeated OrderItem items = 2;
  Address shipping_address = 3;
}

message Order {
  string id = 1;
  string customer_id = 2;
  repeated OrderItem items = 3;
  OrderStatus status = 4;
  google.protobuf.Timestamp created_at = 5;
}

enum OrderStatus {
  ORDER_STATUS_UNSPECIFIED = 0;
  ORDER_STATUS_PENDING = 1;
  ORDER_STATUS_CONFIRMED = 2;
  ORDER_STATUS_SHIPPED = 3;
}
Key Protobuf rules:
- Fields are identified by number, not name — safe to rename fields
- Use `repeated` for arrays, `map` for dictionaries
- Never reuse or change a field number after deployment
- Always include an `UNSPECIFIED = 0` value for enums
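Proto3 can enforce the never-reuse rule at compile time with `reserved`; a sketch with a hypothetical retired field:

```proto
message Order {
  reserved 6;              // number of a field deleted in an earlier version
  reserved "coupon_code";  // and its old name, so neither can be reused

  string id = 1;
  // ... remaining fields as above
}
```

With these declarations, `protoc` rejects any later attempt to declare a field numbered 6 or named `coupon_code` in this message.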
HTTP/2 Multiplexing#
HTTP/1.1 handles one request at a time per connection (keep-alive reuses the connection, but only sequentially). HTTP/2 multiplexes many concurrent requests over a single connection using streams.
HTTP/1.1:
Connection 1: Request A → Response A
Connection 2: Request B → Response B
Connection 3: Request C → Response C
HTTP/2:
Connection 1:
  Stream 1: Request A → Response A
  Stream 2: Request B → Response B
  Stream 3: Request C → Response C
Benefits for gRPC:
- No head-of-line blocking at the HTTP layer
- Fewer TCP connections = less overhead
- Header compression (HPACK) reduces repetitive metadata
- Binary framing carries gRPC's length-prefixed messages without any text parsing
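Demultiplexing by stream ID is the core of the trick: the receiver regroups interleaved frames into per-request sequences. A toy Go sketch (frames and payloads invented for illustration; real HTTP/2 framing carries much more state):

```go
package main

import "fmt"

// frame is a toy stand-in for an HTTP/2 DATA frame. The stream ID is
// what lets one TCP connection carry many RPCs concurrently
// (client-initiated streams use odd IDs: 1, 3, 5, ...).
type frame struct {
	streamID int
	payload  string
}

// demux groups interleaved frames back into per-stream sequences,
// as the receiving end of an HTTP/2 connection does.
func demux(frames []frame) map[int][]string {
	streams := map[int][]string{}
	for _, f := range frames {
		streams[f.streamID] = append(streams[f.streamID], f.payload)
	}
	return streams
}

func main() {
	// Three requests interleaved on one connection.
	connection := []frame{
		{1, "GetOrder part 1"}, {3, "CreateOrder part 1"},
		{1, "GetOrder part 2"}, {5, "StreamOrderUpdates"},
		{3, "CreateOrder part 2"},
	}
	for id, msgs := range demux(connection) {
		fmt.Println("stream", id, "->", len(msgs), "frames")
	}
}
```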
Streaming Patterns#
gRPC supports four communication patterns:
1. Unary RPC#
Standard request-response. One message in, one message out.
rpc GetOrder (GetOrderRequest) returns (Order);
2. Server Streaming#
Client sends one request, server returns a stream of messages.
rpc StreamOrderUpdates (GetOrderRequest) returns (stream OrderUpdate);
Use case: live dashboard updates, log tailing, real-time notifications.
3. Client Streaming#
Client sends a stream of messages, server returns one response.
rpc BulkCreateOrders (stream CreateOrderRequest) returns (BulkCreateResponse);
Use case: file uploads, batch ingestion, telemetry reporting.
4. Bidirectional Streaming#
Both sides send message streams independently.
rpc Chat (stream ChatMessage) returns (stream ChatMessage);
Use case: real-time chat, collaborative editing, game state sync.
Client                    Server
│── ChatMessage ──────────────▶│
│── ChatMessage ──────────────▶│
│◀────────────── ChatMessage ──│
│── ChatMessage ──────────────▶│
│◀────────────── ChatMessage ──│
│◀────────────── ChatMessage ──│
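Of the four patterns, client streaming is the easiest to model in plain Go: a channel stands in for the request stream and one struct for the summary response. This is an analogy using names from the earlier .proto, not generated gRPC code:

```go
package main

import "fmt"

// CreateOrderRequest stands in for the protobuf message of the same
// name; the channel below stands in for the client's request stream.
type CreateOrderRequest struct {
	CustomerID string
	Items      int
}

type BulkCreateResponse struct {
	OrdersCreated int
	ItemsTotal    int
}

// bulkCreateOrders models the client-streaming handler: it consumes
// the entire stream, then returns a single summary response.
func bulkCreateOrders(stream <-chan CreateOrderRequest) BulkCreateResponse {
	var resp BulkCreateResponse
	for req := range stream {
		resp.OrdersCreated++
		resp.ItemsTotal += req.Items
	}
	return resp
}

func main() {
	stream := make(chan CreateOrderRequest, 3)
	stream <- CreateOrderRequest{CustomerID: "c1", Items: 2}
	stream <- CreateOrderRequest{CustomerID: "c2", Items: 5}
	stream <- CreateOrderRequest{CustomerID: "c1", Items: 1}
	close(stream) // closing the channel plays the role of CloseSend

	resp := bulkCreateOrders(stream)
	fmt.Printf("created=%d items=%d\n", resp.OrdersCreated, resp.ItemsTotal)
}
```

In real gRPC the handler would loop on `stream.Recv()` until `io.EOF`; the channel's `range`/`close` pair mirrors that shape.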
gRPC vs REST#
| Aspect | gRPC | REST |
|---|---|---|
| Serialization | Protobuf (binary) | JSON (text) |
| Transport | HTTP/2 | HTTP/1.1 or HTTP/2 |
| Streaming | Native (4 patterns) | Limited (SSE, WebSocket) |
| Code generation | Built-in | OpenAPI/Swagger (separate) |
| Browser support | Via gRPC-Web proxy | Native |
| Human readability | Binary (needs tooling) | Plain text |
| Caching | No native HTTP caching | HTTP cache headers |
| Adoption | Internal services | Public APIs |
Use gRPC for: service-to-service communication, high-throughput pipelines, streaming, polyglot microservices.
Use REST for: public APIs, browser clients, simple CRUD, when HTTP caching matters.
Deadlines and Timeouts#
gRPC has first-class support for deadlines that propagate across service calls.
Client (deadline: 5s)
  → Service A (remaining: 4.8s)
    → Service B (remaining: 3.2s)
      → Service C (remaining: 1.1s) → DEADLINE_EXCEEDED
import (
	"context"
	"time"

	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()

resp, err := client.GetOrder(ctx, &pb.GetOrderRequest{Id: "123"})
if err != nil {
	st, _ := status.FromError(err)
	if st.Code() == codes.DeadlineExceeded {
		// Handle timeout
	}
}
Best practice: Always set deadlines. A missing deadline means a request can hang forever.
Interceptors (Middleware)#
Interceptors are gRPC's middleware pattern — they wrap every RPC call for cross-cutting concerns.
import (
	"context"
	"log"
	"time"

	"google.golang.org/grpc"
)

// Unary server interceptor for logging
func loggingInterceptor(
	ctx context.Context,
	req interface{},
	info *grpc.UnaryServerInfo,
	handler grpc.UnaryHandler,
) (interface{}, error) {
	start := time.Now()
	resp, err := handler(ctx, req)
	log.Printf("method=%s duration=%s error=%v",
		info.FullMethod, time.Since(start), err)
	return resp, err
}

server := grpc.NewServer(
	grpc.UnaryInterceptor(loggingInterceptor),
)
Common interceptor uses:
- Authentication — validate tokens from metadata
- Logging — record method, duration, status
- Metrics — Prometheus counters and histograms
- Rate limiting — throttle by client identity
- Retry — automatic retry with backoff
Load Balancing#
gRPC uses long-lived HTTP/2 connections, so a traditional L4 (TCP) load balancer pins all of a client's RPCs to the single backend it first connected to. Balancing individual RPCs requires L7 (application-layer) load balancing.
┌─────────┐     L7 Load Balancer      ┌──────────┐
│ Client  │──▶  (Envoy / Linkerd)  ──▶│ Server 1 │
│         │    inspects HTTP/2        │ Server 2 │
│         │    frames per-RPC         │ Server 3 │
└─────────┘                           └──────────┘
Options:
- Client-side — client discovers servers and balances (grpc-go built-in)
- Proxy-based — Envoy, NGINX, or HAProxy with HTTP/2 support
- Service mesh — Istio or Linkerd handle it transparently
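At its core, client-side balancing can be as simple as a round-robin picker over discovered backends. A toy sketch (grpc-go's built-in `round_robin` policy does this per-RPC under the hood; the addresses are illustrative):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// roundRobin is a minimal client-side picker: each RPC takes the next
// backend in turn, decoupling balancing from connection lifetime.
type roundRobin struct {
	backends []string
	next     atomic.Uint64
}

func (r *roundRobin) pick() string {
	n := r.next.Add(1) - 1
	return r.backends[n%uint64(len(r.backends))]
}

func main() {
	lb := &roundRobin{backends: []string{
		"10.0.0.1:50051", "10.0.0.2:50051", "10.0.0.3:50051",
	}}
	for i := 0; i < 4; i++ {
		fmt.Println("rpc", i, "->", lb.pick()) // cycles .1, .2, .3, .1
	}
}
```

The atomic counter makes the picker safe to call from concurrent RPCs, which is exactly the situation a multiplexed gRPC client creates.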
Service Mesh Integration#
gRPC works naturally with service meshes because both operate at L7:
  Pod A                          Pod B
┌────────────────┐           ┌────────────────┐
│   App (gRPC)   │           │   App (gRPC)   │
│       │        │           │       ▲        │
│       ▼        │           │       │        │
│  Envoy Sidecar │── mTLS ───│  Envoy Sidecar │
└────────────────┘           └────────────────┘
The mesh provides:
- Mutual TLS without application code changes
- Per-RPC load balancing and retries
- Distributed tracing headers propagation
- Circuit breaking on error rates
Developer Tools#
grpcurl#
Command-line tool for interacting with gRPC servers (like curl for gRPC).
# List services
grpcurl -plaintext localhost:50051 list
# Describe a service
grpcurl -plaintext localhost:50051 describe ecommerce.OrderService
# Call an RPC
grpcurl -plaintext -d '{"id": "order-123"}' \
localhost:50051 ecommerce.OrderService/GetOrder
Evans#
Interactive gRPC client with a REPL interface.
evans --host localhost --port 50051 -r repl
# Inside Evans REPL
ecommerce.OrderService@localhost:50051> call GetOrder
id (TYPE_STRING) => order-123
Other Tools#
- Buf — linting, breaking change detection, and code generation for Protobuf
- BloomRPC — GUI client for gRPC (like Postman; the project is now archived)
- gRPC Gateway — generates a RESTful JSON reverse proxy from Protobuf definitions
gRPC is the backbone of modern service-to-service communication. Its combination of Protobuf efficiency, HTTP/2 multiplexing, and native streaming makes it the default choice for internal microservice APIs.