# Microservices API Composition — Aggregating Data Across Services

## The composition problem
Your frontend needs data from five services to render one page. User profile from the users service, order history from orders, recommendations from ML, notifications from messaging, loyalty points from billing.
Who is responsible for stitching this together?
## Option 1 — Client-side composition
The frontend calls each service directly.
```
GET /api/users/123
GET /api/orders?user=123
GET /api/recommendations?user=123
GET /api/notifications?user=123
GET /api/loyalty/123
```
Problems:
- Multiple round trips from the client (high latency on mobile)
- Client must know about every service
- No place to handle cross-cutting concerns
- CORS configuration for every service
This works for simple cases but falls apart at scale.
## Option 2 — API Gateway composition
The gateway aggregates responses from downstream services into a single response.
```python
import asyncio

async def get_user_dashboard(user_id: str) -> dict:
    # Fan out to all five services concurrently; total latency is the
    # slowest call, not the sum.
    user, orders, recs, notifs, loyalty = await asyncio.gather(
        fetch_user(user_id),
        fetch_orders(user_id),
        fetch_recommendations(user_id),
        fetch_notifications(user_id),
        fetch_loyalty(user_id),
    )
    return {
        "user": user,
        "orders": orders,
        "recommendations": recs,
        "notifications": notifs,
        "loyalty": loyalty,
    }
```
Parallel fetching is critical here. Five sequential calls at 100ms each take 500ms; issued in parallel, they take roughly 100ms.
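That arithmetic is easy to demonstrate with simulated services. In this sketch, `fake_service` and the 100ms `asyncio.sleep` are stand-ins for real network calls:

```python
import asyncio
import time

SERVICES = ["user", "orders", "recommendations", "notifications", "loyalty"]

async def fake_service(name: str) -> str:
    await asyncio.sleep(0.1)  # stand-in for a ~100ms network call
    return name

async def sequential() -> float:
    # Await each call in turn: latencies add up.
    start = time.monotonic()
    for name in SERVICES:
        await fake_service(name)
    return time.monotonic() - start

async def parallel() -> float:
    # Issue all calls at once: latency is the slowest single call.
    start = time.monotonic()
    await asyncio.gather(*(fake_service(n) for n in SERVICES))
    return time.monotonic() - start

seq = asyncio.run(sequential())  # roughly the sum of the five delays
par = asyncio.run(parallel())    # roughly one delay
```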
## Option 3 — Backend for Frontend (BFF)
A dedicated backend per client type. The mobile BFF returns compact payloads. The web BFF returns richer data. The admin BFF returns everything.
```
Mobile App  → Mobile BFF → Services
Web App     → Web BFF    → Services
Admin Panel → Admin BFF  → Services
```
When to use BFF:
- Different clients need very different data shapes
- Mobile needs minimal payloads, web needs rich responses
- Teams own their BFF alongside their frontend
When NOT to use BFF:
- One client type (just use a gateway)
- BFFs start duplicating logic (extract shared services)
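"Different data shapes" concretely means each BFF applies its own projection to the same upstream data. A sketch with hypothetical `shape_for_mobile` and `shape_for_web` helpers:

```python
def shape_for_mobile(user: dict, orders: list) -> dict:
    # Mobile BFF: headline numbers only, smallest possible payload.
    return {"name": user["name"], "order_count": len(orders)}

def shape_for_web(user: dict, orders: list) -> dict:
    # Web BFF: full profile plus per-order summaries.
    return {
        "user": user,
        "orders": [{"id": o["id"], "total": o["total"]} for o in orders],
    }

# Illustrative upstream responses.
user = {"id": "123", "name": "Ada", "email": "ada@example.com"}
orders = [{"id": "o1", "total": 42.0}, {"id": "o2", "total": 7.5}]

mobile_payload = shape_for_mobile(user, orders)
web_payload = shape_for_web(user, orders)
```

When the shaping logic is this thin, the BFF stays cheap to own; once shaping turns into business logic, that is the signal to extract a shared service.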
## Option 4 — GraphQL federation
Each service exposes a GraphQL subgraph. A gateway (Apollo Router, Cosmo) federates them into one schema.
```graphql
# Users subgraph
type User @key(fields: "id") {
  id: ID!
  name: String!
  email: String!
}

# Orders subgraph
type User @key(fields: "id") {
  id: ID!
  orders: [Order!]!
}

type Order {
  id: ID!
  total: Float!
  status: String!
}
```
The client sends one query, and the gateway resolves it across services:

```graphql
query {
  user(id: "123") {
    name
    orders {
      total
      status
    }
    # resolved by a loyalty subgraph (not shown above)
    loyaltyPoints
  }
}
```
Advantages: clients fetch exactly what they need, no over-fetching, typed schema.
Disadvantages: operational complexity, query planning overhead, federation versioning.
## Handling partial failures
Service B is down. Do you fail the entire response or return what you have?
### Strategy 1 — Fail open with defaults

```python
import asyncio

async def safe_fetch(coro, default=None, timeout: float = 2.0):
    # Swallow timeouts and downstream errors; hand back a default instead.
    try:
        return await asyncio.wait_for(coro, timeout=timeout)
    except Exception:
        return default

async def build_dashboard(uid: str) -> dict:
    # Fetch concurrently: awaiting each safe_fetch in turn would serialize
    # the calls and stack their timeouts.
    user, orders, recs = await asyncio.gather(
        safe_fetch(fetch_user(uid)),
        safe_fetch(fetch_orders(uid), default=[]),
        safe_fetch(fetch_recs(uid), default=[]),
    )
    return {"user": user, "orders": orders, "recommendations": recs}
```
Return partial data with an `_errors` field so the client knows what is missing.
### Strategy 2 — Required vs optional fields
Mark some fields as required (user profile) and others as optional (recommendations). Fail the whole request only if required fields fail.
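A sketch of that policy, assuming hypothetical fetcher coroutines. `asyncio.gather(..., return_exceptions=True)` collects failures instead of raising, so required and optional fields can be treated differently:

```python
import asyncio

class RequiredFieldError(Exception):
    """Raised when a field marked required cannot be fetched."""

async def compose(fetchers: dict, required: set) -> dict:
    # Run every fetcher concurrently, collecting exceptions as values.
    results = await asyncio.gather(*fetchers.values(), return_exceptions=True)
    composed, errors = {}, {}
    for name, result in zip(fetchers, results):
        if isinstance(result, Exception):
            if name in required:
                # A required field failed: fail the whole request.
                raise RequiredFieldError(name)
            composed[name] = None
            errors[name] = str(result)
        else:
            composed[name] = result
    composed["_errors"] = errors
    return composed

async def ok(value):
    return value

async def down():
    raise RuntimeError("recommendations service unavailable")

dashboard = asyncio.run(compose(
    {"user": ok({"id": "123"}), "recommendations": down()},
    required={"user"},
))
# The optional failure is reported in _errors; the response still succeeds.
```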
### Strategy 3 — Cached fallbacks

If the orders service is down, return the last cached response with a `stale: true` flag.
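A minimal in-process sketch of that fallback. The `CachedFallback` name and fetcher callables are illustrative; a real deployment would use a shared cache (e.g. Redis) and cap how old a stale entry may be:

```python
class CachedFallback:
    """Serve the last good response, flagged stale, when a fetch fails."""

    def __init__(self):
        self._cache = {}

    def fetch(self, key: str, fetcher):
        try:
            value = fetcher()
            self._cache[key] = value  # remember the last good response
            return {"data": value, "stale": False}
        except Exception:
            if key in self._cache:
                # Upstream is down but we have something: serve it, flagged.
                return {"data": self._cache[key], "stale": True}
            raise  # nothing cached: the failure must propagate

cache = CachedFallback()

def orders_ok():
    return [{"id": "o1"}]

def orders_down():
    raise ConnectionError("orders service unavailable")

fresh = cache.fetch("orders:123", orders_ok)
stale = cache.fetch("orders:123", orders_down)  # served from cache
```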
## Timeout cascades

The silent killer. Service A calls B with a 5s timeout, B calls C with its own 5s timeout, and C gives its datastore 5s more. No hop knows the caller's remaining budget, so retries and the work each service does around its downstream call can push the worst case toward 15 seconds, long after the original client has given up.
Rules for timeouts in composition:
- The outer timeout must bound the whole composition — give the gateway a 3s budget, not the 15s sum of the inner timeouts
- Deadline propagation — pass the remaining time budget downstream via a header such as `X-Request-Deadline`
- Circuit breakers on each call — stop calling a service that is consistently slow
- Shed load early — if 2 seconds have passed and you still need 3 more calls, return what you have
```python
import asyncio
import time

async def compose_with_budget(user_id: str, budget_ms: int = 3000) -> dict:
    # One wall-clock budget for the whole composition; each call gets
    # whatever time the earlier calls left over.
    deadline = time.monotonic() + budget_ms / 1000
    results = {}
    for name, coro in [("user", fetch_user(user_id)), ...]:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            coro.close()  # budget exhausted: shed this call instead of starting it
            results[name] = None
            continue
        try:
            results[name] = await asyncio.wait_for(coro, timeout=remaining)
        except asyncio.TimeoutError:
            results[name] = None
    return results
```
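Deadline propagation itself can be sketched with two small helpers. The header name follows the rules above; encoding the deadline as a Unix-epoch timestamp is an assumption, as is the 100ms reserve each hop keeps for itself:

```python
import time

DEADLINE_HEADER = "X-Request-Deadline"  # absolute unix-epoch deadline

def remaining_budget(headers: dict, default_ms: int = 3000) -> float:
    # Parse the propagated deadline; fall back to a local default budget.
    raw = headers.get(DEADLINE_HEADER)
    now = time.time()
    deadline = float(raw) if raw else now + default_ms / 1000
    return max(0.0, deadline - now)

def downstream_headers(headers: dict, reserve_ms: int = 100) -> dict:
    # Forward a slightly tighter deadline so this hop keeps time to
    # assemble and return its own response.
    budget = remaining_budget(headers)
    forwarded = time.time() + max(0.0, budget - reserve_ms / 1000)
    return {DEADLINE_HEADER: str(forwarded)}
```

Each service reads its remaining budget from the incoming header, uses it as the outer bound for its own composition, and forwards a smaller value downstream.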
## Data consistency in composition
Composed responses can be inconsistent. The orders service shows a new order, but the loyalty service has not processed the points yet.
Accept eventual consistency in reads. If the user just placed an order, the composed dashboard might show 0 loyalty points for a few seconds. This is fine for most use cases.
For strong consistency — don't compose. Use a single service that owns both pieces of data, or use a saga that guarantees both are updated before responding.
## Performance patterns

### Response caching
Cache composed responses at the gateway level. Even 5-second TTLs dramatically reduce load.
### Request collapsing
If 100 users request the same product page in the same second, make one call to the product service, not 100.
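This is the single-flight pattern: concurrent requests for the same key share one in-flight task. A sketch with an in-memory registry (the key scheme and fetcher are illustrative):

```python
import asyncio

class SingleFlight:
    """Collapse concurrent identical requests into one upstream call."""

    def __init__(self):
        self._inflight = {}

    async def do(self, key: str, fetcher):
        task = self._inflight.get(key)
        if task is None:
            # First caller for this key starts the real fetch...
            task = asyncio.create_task(fetcher())
            self._inflight[key] = task
            # ...and clears the slot on completion, so later bursts refetch.
            task.add_done_callback(lambda _: self._inflight.pop(key, None))
        # Every concurrent caller awaits the same task.
        return await task

calls = 0

async def fetch_product():
    global calls
    calls += 1
    await asyncio.sleep(0.05)  # simulated upstream latency
    return {"id": "p1", "price": 9.99}

async def main():
    sf = SingleFlight()
    # 100 concurrent requests for the same product key.
    return await asyncio.gather(
        *(sf.do("product:p1", fetch_product) for _ in range(100))
    )

results = asyncio.run(main())
```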
### Selective fetching

Include a `fields` parameter so clients request only what they need:

```
GET /api/dashboard?fields=user,orders
```
Skip the recommendations and loyalty calls entirely.
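Server-side, this can be a small whitelist parse. The field names mirror the dashboard above; falling back to all fields on a missing or unrecognized value is an assumed policy:

```python
ALL_FIELDS = {"user", "orders", "recommendations", "notifications", "loyalty"}

def parse_fields(query_value) -> set:
    # "user,orders" -> {"user", "orders"}; missing or unknown -> everything.
    if not query_value:
        return set(ALL_FIELDS)
    requested = {f.strip() for f in query_value.split(",")} & ALL_FIELDS
    return requested or set(ALL_FIELDS)

fields = parse_fields("user,orders")
# Dispatch only the fetchers whose name survived the parse, e.g. skip
# the recommendations and loyalty calls when those fields are absent.
```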
## Choosing the right pattern
| Scenario | Pattern |
|---|---|
| Simple API, one client | API Gateway composition |
| Multiple client types | BFF per client |
| Complex data relationships | GraphQL federation |
| Real-time + REST mix | BFF with WebSocket layer |
| High traffic, simple reads | Gateway + aggressive caching |
## Visualize your composition architecture
Map out your service dependencies, BFF layers, and data flow — try Codelit to generate an interactive architecture diagram.
## Key takeaways
- Parallel fetching is non-negotiable — never call services sequentially
- Handle partial failures gracefully — return what you have with error metadata
- Propagate deadlines — outer timeouts must be shorter than inner sums
- BFF pattern works when different clients need different data shapes
- GraphQL federation solves composition at schema level but adds operational cost
- Cache composed responses — even short TTLs make a huge difference
- Accept eventual consistency in read-side composition
Article #404 in the Codelit engineering series. Explore our full library of system design, infrastructure, and architecture guides at codelit.io.