Batch API Endpoints — Patterns for Bulk Operations, Partial Success, and Idempotency
Why batch endpoints exist#
Every HTTP request carries overhead: DNS resolution, TCP handshake, TLS negotiation, HTTP headers, serialization, and deserialization. When a client needs to create 500 records, making 500 individual API calls is wasteful and slow.
Batch endpoints let clients send multiple operations in a single HTTP request. The server processes them together and returns results for each operation. This reduces round trips, lowers latency, and improves throughput for both client and server.
When to use batch endpoints#
Batch endpoints make sense when:
- Clients regularly perform the same operation on many resources (bulk create, bulk update, bulk delete)
- Mobile or high-latency clients need to minimize round trips
- You need atomic or semi-atomic operations across multiple resources
- Import/export workflows process hundreds or thousands of records
They are not needed when individual operations are infrequent or when real-time, one-at-a-time processing is the natural flow.
Pattern 1: Simple bulk operation endpoint#
The simplest batch pattern accepts an array of items and processes them all with the same operation.
// POST /api/v1/users/bulk-create
{
  "users": [
    { "name": "Alice", "email": "alice@example.com" },
    { "name": "Bob", "email": "bob@example.com" },
    { "name": "Carol", "email": "carol@example.com" }
  ]
}
Response:
{
  "created": 3,
  "users": [
    { "id": "u_001", "name": "Alice", "email": "alice@example.com" },
    { "id": "u_002", "name": "Bob", "email": "bob@example.com" },
    { "id": "u_003", "name": "Carol", "email": "carol@example.com" }
  ]
}
Pros: Simple to implement, easy to understand, efficient for homogeneous operations.
Cons: Only supports one operation type per request. No way to mix creates, updates, and deletes.
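As a sketch, a server-side handler for this pattern might look like the following. The function name and the in-memory id counter are illustrative stand-ins, not a real API:

```python
import itertools

# Naive in-memory id generator; a real service would use database-assigned ids.
_id_counter = itertools.count(1)

def bulk_create_users(payload: dict) -> dict:
    """Create every user in payload['users'] and echo them back with ids."""
    created = []
    for user in payload["users"]:
        # In a real server this loop would be replaced by one bulk insert.
        record = {"id": f"u_{next(_id_counter):03d}", **user}
        created.append(record)
    return {"created": len(created), "users": created}
```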
Pattern 2: Google-style JSON batch#
Google APIs use a pattern where each request in the batch is a self-contained HTTP request embedded in JSON. This lets you mix different operations in a single batch.
// POST /api/v1/batch
{
  "requests": [
    {
      "id": "req-1",
      "method": "POST",
      "url": "/api/v1/users",
      "body": { "name": "Alice", "email": "alice@example.com" }
    },
    {
      "id": "req-2",
      "method": "PATCH",
      "url": "/api/v1/users/u_042",
      "body": { "name": "Alice Smith" }
    },
    {
      "id": "req-3",
      "method": "DELETE",
      "url": "/api/v1/users/u_099"
    }
  ]
}
Response:
{
  "responses": [
    {
      "id": "req-1",
      "status": 201,
      "body": { "id": "u_001", "name": "Alice" }
    },
    {
      "id": "req-2",
      "status": 200,
      "body": { "id": "u_042", "name": "Alice Smith" }
    },
    {
      "id": "req-3",
      "status": 204,
      "body": null
    }
  ]
}
Each sub-request has its own status code and response body. The outer HTTP response is always 200 OK — individual failures are reported per sub-request.
Pros: Flexible, supports mixed operations, each sub-request is independently addressable.
Cons: More complex to implement, requires internal request routing.
Pattern 3: Bulk mutation with operation field#
A middle ground between simple bulk and full JSON batch. Each item specifies its operation:
// POST /api/v1/users/batch
{
  "operations": [
    { "action": "create", "data": { "name": "Alice", "email": "alice@example.com" } },
    { "action": "update", "id": "u_042", "data": { "name": "Alice Smith" } },
    { "action": "delete", "id": "u_099" }
  ]
}
This is simpler than the Google pattern because it does not embed full HTTP requests, but still supports mixed operation types.
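A minimal dispatcher for this shape switches on the action field. In this sketch a plain dict stands in for the data store and the id scheme is deliberately naive:

```python
def apply_operation(op: dict, store: dict) -> dict:
    """Dispatch one batch item on its 'action' field; store maps id -> record."""
    action = op["action"]
    if action == "create":
        new_id = f"u_{len(store) + 1:03d}"  # naive: ids can collide after deletes
        store[new_id] = dict(op["data"])
        return {"status": "success", "id": new_id}
    if action == "update":
        if op["id"] not in store:
            return {"status": "error", "error": {"code": "NOT_FOUND"}}
        store[op["id"]].update(op["data"])
        return {"status": "success", "id": op["id"]}
    if action == "delete":
        store.pop(op["id"], None)  # deleting a missing id is treated as success
        return {"status": "success", "id": op["id"]}
    return {"status": "error", "error": {"code": "UNKNOWN_ACTION"}}
```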
Partial success handling#
The hardest design decision in batch APIs: what happens when some operations succeed and others fail?
Strategy 1: All-or-nothing (transactional)#
// If any operation fails, all are rolled back
{
  "status": "failed",
  "error": "Operation 3 failed: user u_099 not found",
  "operations_applied": 0
}
Use database transactions to ensure atomicity. Simple to reason about, but one bad record blocks the entire batch.
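With SQLite as a stand-in database, the transactional strategy can be sketched like this. The table schema and response shape are assumptions for illustration:

```python
import sqlite3

def batch_insert_all_or_nothing(conn: sqlite3.Connection, emails: list) -> dict:
    """Insert every email or none: one transaction, rolled back on first failure."""
    try:
        # sqlite3's connection context manager commits on success and
        # rolls back automatically if an exception escapes the block.
        with conn:
            conn.executemany("INSERT INTO users (email) VALUES (?)",
                             [(e,) for e in emails])
        return {"status": "ok", "operations_applied": len(emails)}
    except sqlite3.IntegrityError as exc:
        return {"status": "failed", "error": str(exc), "operations_applied": 0}
```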
Strategy 2: Partial success (non-transactional)#
{
  "status": "partial",
  "succeeded": 2,
  "failed": 1,
  "results": [
    { "index": 0, "status": "success", "id": "u_001" },
    { "index": 1, "status": "success", "id": "u_042" },
    { "index": 2, "status": "error", "error": { "code": "NOT_FOUND", "message": "User u_099 not found" } }
  ]
}
The HTTP status code should be 207 Multi-Status (from WebDAV, but widely adopted) or 200 OK with per-item status in the body.
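The partial-success loop is straightforward to sketch: run each item independently, record per-index outcomes, and never let one failure abort the rest. The handler contract and error code here are illustrative:

```python
def run_batch_partial(operations: list, handler) -> dict:
    """Apply handler to each operation; collect per-index success/error results."""
    results, succeeded = [], 0
    for index, op in enumerate(operations):
        try:
            outcome = handler(op)  # handler returns a dict or raises on failure
            results.append({"index": index, "status": "success", **outcome})
            succeeded += 1
        except Exception as exc:
            results.append({"index": index, "status": "error",
                            "error": {"code": "OPERATION_FAILED", "message": str(exc)}})
    failed = len(operations) - succeeded
    return {"status": "ok" if failed == 0 else "partial",
            "succeeded": succeeded, "failed": failed, "results": results}
```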
Choosing a strategy#
| Factor | All-or-nothing | Partial success |
|---|---|---|
| Data consistency | Strong | Eventual |
| User experience | Frustrating for large batches | Better — successes are not lost |
| Implementation | Simpler (single transaction) | Complex (per-item error handling) |
| Recovery | Retry entire batch | Retry only failed items |
| Best for | Financial transactions, linked records | Imports, bulk updates, notifications |
Most production batch APIs use partial success because retrying an entire 1,000-item batch due to one validation error is a poor experience.
Idempotency in batch requests#
Batch requests are especially vulnerable to retry issues. If a network timeout occurs after the server processed 400 of 500 items, the client does not know which 400 succeeded.
Client-provided idempotency keys#
Each operation in the batch includes an idempotency key:
{
  "operations": [
    {
      "idempotency_key": "import-2026-03-29-item-001",
      "action": "create",
      "data": { "name": "Alice" }
    },
    {
      "idempotency_key": "import-2026-03-29-item-002",
      "action": "create",
      "data": { "name": "Bob" }
    }
  ]
}
The server stores each idempotency key with its result. On retry, operations with previously seen keys return the cached result instead of executing again.
Implementation approach#
For each operation in the batch:
1. Check idempotency key in cache/database
2. If found: return cached result (skip execution)
3. If not found: execute operation
4. Store idempotency key + result (with TTL, e.g., 24 hours)
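The four steps above can be sketched with a dict standing in for the key store; in production this would typically be Redis or a database table with a TTL:

```python
import time

def execute_with_idempotency(op: dict, cache: dict, execute, ttl_seconds=86400):
    """Return the cached result for a previously seen key; otherwise run and cache."""
    key = op["idempotency_key"]
    entry = cache.get(key)
    if entry is not None and time.time() - entry["stored_at"] < ttl_seconds:
        return entry["result"]  # retry replay: skip execution entirely
    result = execute(op)
    cache[key] = {"result": result, "stored_at": time.time()}
    return result
```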
Batch-level idempotency key#
For all-or-nothing batches, a single idempotency key on the entire batch is simpler:
POST /api/v1/users/batch
Idempotency-Key: batch-2026-03-29-import-A
If the batch has already been processed, return the full cached result.
Performance considerations#
Batch size limits#
Always enforce a maximum batch size. Without limits, a single request can consume all server resources.
// Typical limits
Max items per batch: 100-1,000
Max request body size: 10MB
Max processing time: 30 seconds
Return 413 Payload Too Large or a clear error when limits are exceeded.
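A guard like the following, run before any work starts, enforces the item limit. The constant and response shape are illustrative:

```python
MAX_BATCH_ITEMS = 1000  # illustrative limit; tune per endpoint

def validate_batch_size(items: list):
    """Reject oversized batches up front; None means the batch is acceptable."""
    if len(items) > MAX_BATCH_ITEMS:
        return {"http_status": 413,
                "error": f"Batch exceeds {MAX_BATCH_ITEMS} items"}
    return None
```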
Sequential vs parallel processing#
- Sequential: Process items one by one. Simple, predictable, easy to implement transactions. Slow for large batches.
- Parallel: Process items concurrently. Faster, but harder to manage transactions and ordering. Be careful with database connection pool exhaustion.
A practical approach: process in parallel with a concurrency limit (e.g., 10 concurrent operations within a batch).
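In Python, the bounded-concurrency approach is a semaphore wrapped around each item's worker. This sketch assumes an async worker function:

```python
import asyncio

async def process_batch(items: list, worker, limit: int = 10) -> list:
    """Run worker over items concurrently, capping in-flight operations at limit."""
    sem = asyncio.Semaphore(limit)

    async def guarded(item):
        async with sem:  # at most `limit` workers run at once
            return await worker(item)

    # gather returns results in input order, regardless of completion order
    return await asyncio.gather(*(guarded(item) for item in items))
```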
Performance comparison#
| Approach | 100 creates | Network round trips | Typical latency |
|---|---|---|---|
| Individual calls | 100 requests | 100 | 5-15 seconds |
| Batch endpoint | 1 request | 1 | 200-800ms |
| Batch (with DB batch insert) | 1 request | 1 | 50-200ms |
The biggest gain comes from combining the batch endpoint with batch database operations (e.g., INSERT INTO ... VALUES (...), (...), (...) instead of 100 individual inserts).
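Building that multi-row statement is a small exercise in placeholder generation. This helper is a sketch for databases that use ? placeholders:

```python
def bulk_insert_sql(table: str, columns: list, rows: list):
    """Build one parameterized multi-row INSERT instead of len(rows) statements."""
    row_placeholder = "(" + ", ".join("?" for _ in columns) + ")"
    sql = (f"INSERT INTO {table} ({', '.join(columns)}) "
           f"VALUES {', '.join([row_placeholder] * len(rows))}")
    params = [value for row in rows for value in row]  # flatten row tuples
    return sql, params
```

The table and column names must come from trusted code, never user input, since they are interpolated directly into the SQL string; only the values are parameterized.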
Rate limiting batch requests#
Batch endpoints complicate rate limiting. A single batch request containing 500 operations should not count the same against a client's quota as a single ordinary request.
Approaches#
- Count by operations — each operation in the batch counts as one request against the rate limit
- Count by batch — each batch counts as one request, with a separate operations-per-batch limit
- Weighted — batch requests cost more against the rate limit (e.g., 1 + 0.1 per operation)
Most APIs count by operations because it is the fairest option and prevents clients from sidestepping limits by packing more work into each batch.
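The weighted option reduces to a one-line cost function; the base and per-operation weights below are the illustrative values from the text:

```python
def batch_request_cost(num_operations: int, base: float = 1.0,
                       per_op: float = 0.1) -> float:
    """Cost charged against the rate limit: a base fee plus a fraction per item."""
    return base + per_op * num_operations
```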
Error response design#
Batch error responses should make it easy for clients to identify and retry failed operations:
{
  "status": "partial",
  "succeeded": 48,
  "failed": 2,
  "results": [
    { "index": 12, "status": "error", "error": { "code": "VALIDATION_ERROR", "field": "email", "message": "Invalid email format" } },
    { "index": 37, "status": "error", "error": { "code": "DUPLICATE", "field": "email", "message": "Email already exists" } }
  ]
}
Only include failed items in the response to keep payloads small. Clients can assume unlisted items succeeded. Alternatively, include a succeeded_ids array so clients can verify.
Key design decisions checklist#
- Operation scope — single operation type (bulk create) or mixed operations (JSON batch)?
- Atomicity — all-or-nothing or partial success?
- Idempotency — per-operation keys, batch-level key, or both?
- Size limits — max items, max body size, max processing time?
- Processing order — sequential, parallel, or parallel with concurrency limit?
- Rate limiting — count by batch or by individual operations?
- Response format — full results for every item or only errors?
Common mistakes#
- No batch size limit — a client sends 100,000 items, your server runs out of memory
- No idempotency support — network retries create duplicate records
- Using 200 OK for partial failures — clients assume everything succeeded. Use 207 Multi-Status or include explicit status per item.
- Blocking on the entire batch — for very large batches, accept the request and process asynchronously. Return a job ID the client can poll.
- Ignoring database efficiency — processing 500 items as 500 individual SQL statements loses most of the batch performance benefit. Use bulk inserts and updates.
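The asynchronous-job fix for the blocking mistake can be sketched as an accept-then-poll pair. The endpoint paths, in-memory job store, and job states here are all illustrative:

```python
import uuid

jobs = {}  # stand-in job store: job_id -> state; use a real queue in production

def submit_batch_async(operations: list) -> dict:
    """Accept a large batch immediately and return a job id the client can poll."""
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"state": "pending", "operations": operations, "results": None}
    # a background worker would pick the job up from here
    return {"job_id": job_id, "status_url": f"/api/v1/batch-jobs/{job_id}"}

def poll_job(job_id: str) -> dict:
    """Report a job's current state; unknown ids get a not_found status."""
    job = jobs.get(job_id)
    return {"status": "not_found"} if job is None else {"status": job["state"]}
```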
Article #442 in the Codelit engineering series. Explore our full library of system design, infrastructure, and architecture guides at codelit.io.