Batch API Endpoints — Patterns for Bulk Operations, Partial Success, and Idempotency
Why batch endpoints exist#
Every HTTP request carries overhead: DNS resolution, TCP handshake, TLS negotiation, HTTP headers, serialization, and deserialization. When a client needs to create 500 records, making 500 individual API calls is wasteful and slow.
Batch endpoints let clients send multiple operations in a single HTTP request. The server processes them together and returns results for each operation. This reduces round trips, lowers latency, and improves throughput for both client and server.
When to use batch endpoints#
Batch endpoints make sense when:
- Clients regularly perform the same operation on many resources (bulk create, bulk update, bulk delete)
- Mobile or high-latency clients need to minimize round trips
- You need atomic or semi-atomic operations across multiple resources
- Import/export workflows process hundreds or thousands of records
They are not needed when individual operations are infrequent or when real-time, one-at-a-time processing is the natural flow.
Pattern 1: Simple bulk operation endpoint#
The simplest batch pattern accepts an array of items and processes them all with the same operation.
// POST /api/v1/users/bulk-create
{
  "users": [
    { "name": "Alice", "email": "alice@example.com" },
    { "name": "Bob", "email": "bob@example.com" },
    { "name": "Carol", "email": "carol@example.com" }
  ]
}
Response:
{
  "created": 3,
  "users": [
    { "id": "u_001", "name": "Alice", "email": "alice@example.com" },
    { "id": "u_002", "name": "Bob", "email": "bob@example.com" },
    { "id": "u_003", "name": "Carol", "email": "carol@example.com" }
  ]
}
Pros: Simple to implement, easy to understand, efficient for homogeneous operations.
Cons: Only supports one operation type per request. No way to mix creates, updates, and deletes.
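As a sketch, a server-side handler for this pattern might look like the following. The function name and the in-memory id counter are illustrative stand-ins, not a real API:

```python
import itertools

# Naive in-memory id generator; a real service would use database-assigned ids.
_id_counter = itertools.count(1)

def bulk_create_users(payload: dict) -> dict:
    """Create every user in payload['users'] and echo them back with ids."""
    created = []
    for user in payload["users"]:
        # In a real server this loop would be replaced by one bulk insert.
        record = {"id": f"u_{next(_id_counter):03d}", **user}
        created.append(record)
    return {"created": len(created), "users": created}
```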
Pattern 2: Google-style JSON batch#
Google APIs use a pattern where each request in the batch is a self-contained HTTP request embedded in JSON. This lets you mix different operations in a single batch.
// POST /api/v1/batch
{
  "requests": [
    {
      "id": "req-1",
      "method": "POST",
      "url": "/api/v1/users",
      "body": { "name": "Alice", "email": "alice@example.com" }
    },
    {
      "id": "req-2",
      "method": "PATCH",
      "url": "/api/v1/users/u_042",
      "body": { "name": "Alice Smith" }
    },
    {
      "id": "req-3",
      "method": "DELETE",
      "url": "/api/v1/users/u_099"
    }
  ]
}
Response:
{
  "responses": [
    {
      "id": "req-1",
      "status": 201,
      "body": { "id": "u_001", "name": "Alice" }
    },
    {
      "id": "req-2",
      "status": 200,
      "body": { "id": "u_042", "name": "Alice Smith" }
    },
    {
      "id": "req-3",
      "status": 204,
      "body": null
    }
  ]
}
Each sub-request has its own status code and response body. The outer HTTP response is always 200 OK — individual failures are reported per sub-request.
Pros: Flexible, supports mixed operations, each sub-request is independently addressable.
Cons: More complex to implement, requires internal request routing.
Pattern 3: Bulk mutation with operation field#
A middle ground between simple bulk and full JSON batch. Each item specifies its operation:
// POST /api/v1/users/batch
{
  "operations": [
    { "action": "create", "data": { "name": "Alice", "email": "alice@example.com" } },
    { "action": "update", "id": "u_042", "data": { "name": "Alice Smith" } },
    { "action": "delete", "id": "u_099" }
  ]
}
This is simpler than the Google pattern because it does not embed full HTTP requests, but still supports mixed operation types.
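A minimal dispatcher for this shape switches on the action field. In this sketch a plain dict stands in for the data store and the id scheme is deliberately naive:

```python
def apply_operation(op: dict, store: dict) -> dict:
    """Dispatch one batch item on its 'action' field; store maps id -> record."""
    action = op["action"]
    if action == "create":
        new_id = f"u_{len(store) + 1:03d}"  # naive: ids can collide after deletes
        store[new_id] = dict(op["data"])
        return {"status": "success", "id": new_id}
    if action == "update":
        if op["id"] not in store:
            return {"status": "error", "error": {"code": "NOT_FOUND"}}
        store[op["id"]].update(op["data"])
        return {"status": "success", "id": op["id"]}
    if action == "delete":
        store.pop(op["id"], None)  # deleting a missing id is treated as success
        return {"status": "success", "id": op["id"]}
    return {"status": "error", "error": {"code": "UNKNOWN_ACTION"}}
```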
Partial success handling#
The hardest design decision in batch APIs: what happens when some operations succeed and others fail?
Strategy 1: All-or-nothing (transactional)#
// If any operation fails, all are rolled back
{
  "status": "failed",
  "error": "Operation 3 failed: user u_099 not found",
  "operations_applied": 0
}
Use database transactions to ensure atomicity. Simple to reason about, but one bad record blocks the entire batch.
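With SQLite as a stand-in database, the transactional strategy can be sketched like this. The table schema and response shape are assumptions for illustration:

```python
import sqlite3

def batch_insert_all_or_nothing(conn: sqlite3.Connection, emails: list) -> dict:
    """Insert every email or none: one transaction, rolled back on first failure."""
    try:
        # sqlite3's connection context manager commits on success and
        # rolls back automatically if an exception escapes the block.
        with conn:
            conn.executemany("INSERT INTO users (email) VALUES (?)",
                             [(e,) for e in emails])
        return {"status": "ok", "operations_applied": len(emails)}
    except sqlite3.IntegrityError as exc:
        return {"status": "failed", "error": str(exc), "operations_applied": 0}
```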
Strategy 2: Partial success (non-transactional)#
{
  "status": "partial",
  "succeeded": 2,
  "failed": 1,
  "results": [
    { "index": 0, "status": "success", "id": "u_001" },
    { "index": 1, "status": "success", "id": "u_042" },
    { "index": 2, "status": "error", "error": { "code": "NOT_FOUND", "message": "User u_099 not found" } }
  ]
}
The HTTP status code should be 207 Multi-Status (from WebDAV, but widely adopted) or 200 OK with per-item status in the body.
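The partial-success loop is straightforward to sketch: run each item independently, record per-index outcomes, and never let one failure abort the rest. The handler contract and error code here are illustrative:

```python
def run_batch_partial(operations: list, handler) -> dict:
    """Apply handler to each operation; collect per-index success/error results."""
    results, succeeded = [], 0
    for index, op in enumerate(operations):
        try:
            outcome = handler(op)  # handler returns a dict or raises on failure
            results.append({"index": index, "status": "success", **outcome})
            succeeded += 1
        except Exception as exc:
            results.append({"index": index, "status": "error",
                            "error": {"code": "OPERATION_FAILED", "message": str(exc)}})
    failed = len(operations) - succeeded
    return {"status": "ok" if failed == 0 else "partial",
            "succeeded": succeeded, "failed": failed, "results": results}
```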
Choosing a strategy#
| Factor | All-or-nothing | Partial success |
|---|---|---|
| Data consistency | Strong | Eventual |
| User experience | Frustrating for large batches | Better — successes are not lost |
| Implementation | Simpler (single transaction) | Complex (per-item error handling) |
| Recovery | Retry entire batch | Retry only failed items |
| Best for | Financial transactions, linked records | Imports, bulk updates, notifications |
Most production batch APIs use partial success because retrying an entire 1,000-item batch due to one validation error is a poor experience.
Idempotency in batch requests#
Batch requests are especially vulnerable to retry issues. If a network timeout occurs after the server processed 400 of 500 items, the client does not know which 400 succeeded.
Client-provided idempotency keys#
Each operation in the batch includes an idempotency key:
{
  "operations": [
    {
      "idempotency_key": "import-2026-03-29-item-001",
      "action": "create",
      "data": { "name": "Alice" }
    },
    {
      "idempotency_key": "import-2026-03-29-item-002",
      "action": "create",
      "data": { "name": "Bob" }
    }
  ]
}
The server stores each idempotency key with its result. On retry, operations with previously seen keys return the cached result instead of executing again.
Implementation approach#
For each operation in the batch:
1. Check idempotency key in cache/database
2. If found: return cached result (skip execution)
3. If not found: execute operation
4. Store idempotency key + result (with TTL, e.g., 24 hours)
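The four steps above can be sketched with a dict standing in for the key store; in production this would typically be Redis or a database table with a TTL:

```python
import time

def execute_with_idempotency(op: dict, cache: dict, execute, ttl_seconds=86400):
    """Return the cached result for a previously seen key; otherwise run and cache."""
    key = op["idempotency_key"]
    entry = cache.get(key)
    if entry is not None and time.time() - entry["stored_at"] < ttl_seconds:
        return entry["result"]  # retry replay: skip execution entirely
    result = execute(op)
    cache[key] = {"result": result, "stored_at": time.time()}
    return result
```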
Batch-level idempotency key#
For all-or-nothing batches, a single idempotency key on the entire batch is simpler:
POST /api/v1/users/batch
Idempotency-Key: batch-2026-03-29-import-A
If the batch has already been processed, return the full cached result.
Performance considerations#
Batch size limits#
Always enforce a maximum batch size. Without limits, a single request can consume all server resources.
// Typical limits
Max items per batch: 100-1,000
Max request body size: 10MB
Max processing time: 30 seconds
Return 413 Payload Too Large or a clear error when limits are exceeded.
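A guard like the following, run before any work starts, enforces the item limit. The constant and response shape are illustrative:

```python
MAX_BATCH_ITEMS = 1000  # illustrative limit; tune per endpoint

def validate_batch_size(items: list):
    """Reject oversized batches up front; None means the batch is acceptable."""
    if len(items) > MAX_BATCH_ITEMS:
        return {"http_status": 413,
                "error": f"Batch exceeds {MAX_BATCH_ITEMS} items"}
    return None
```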
Sequential vs parallel processing#
- Sequential: Process items one by one. Simple, predictable, easy to implement transactions. Slow for large batches.
- Parallel: Process items concurrently. Faster, but harder to manage transactions and ordering. Be careful with database connection pool exhaustion.
A practical approach: process in parallel with a concurrency limit (e.g., 10 concurrent operations within a batch).
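In Python, the bounded-concurrency approach is a semaphore wrapped around each item's worker. This sketch assumes an async worker function:

```python
import asyncio

async def process_batch(items: list, worker, limit: int = 10) -> list:
    """Run worker over items concurrently, capping in-flight operations at limit."""
    sem = asyncio.Semaphore(limit)

    async def guarded(item):
        async with sem:  # at most `limit` workers run at once
            return await worker(item)

    # gather returns results in input order, regardless of completion order
    return await asyncio.gather(*(guarded(item) for item in items))
```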
Performance comparison#
| Approach | 100 creates | Network round trips | Typical latency |
|---|---|---|---|
| Individual calls | 100 requests | 100 | 5-15 seconds |
| Batch endpoint | 1 request | 1 | 200-800ms |
| Batch (with DB batch insert) | 1 request | 1 | 50-200ms |
The biggest gain comes from combining the batch endpoint with batch database operations (e.g., INSERT INTO ... VALUES (...), (...), (...) instead of 100 individual inserts).
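Building that multi-row statement is a small exercise in placeholder generation. This helper is a sketch for databases that use ? placeholders:

```python
def bulk_insert_sql(table: str, columns: list, rows: list):
    """Build one parameterized multi-row INSERT instead of len(rows) statements."""
    row_placeholder = "(" + ", ".join("?" for _ in columns) + ")"
    sql = (f"INSERT INTO {table} ({', '.join(columns)}) "
           f"VALUES {', '.join([row_placeholder] * len(rows))}")
    params = [value for row in rows for value in row]  # flatten row tuples
    return sql, params
```

The table and column names must come from trusted code, never user input, since they are interpolated directly into the SQL string; only the values are parameterized.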
Rate limiting batch requests#
Batch endpoints complicate rate limiting. A single batch request containing 500 operations should not count the same against a client's quota as a single ordinary request.
Approaches#
- Count by operations — each operation in the batch counts as one request against the rate limit
- Count by batch — each batch counts as one request, with a separate operations-per-batch limit
- Weighted — batch requests cost more against the rate limit (e.g., 1 + 0.1 per operation)
Most APIs count by operations because it is the fairest option and prevents clients from sidestepping limits by packing more work into each batch.
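The weighted option reduces to a one-line cost function; the base and per-operation weights below are the illustrative values from the text:

```python
def batch_request_cost(num_operations: int, base: float = 1.0,
                       per_op: float = 0.1) -> float:
    """Cost charged against the rate limit: a base fee plus a fraction per item."""
    return base + per_op * num_operations
```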
Error response design#
Batch error responses should make it easy for clients to identify and retry failed operations:
{
  "status": "partial",
  "succeeded": 48,
  "failed": 2,
  "results": [
    { "index": 12, "status": "error", "error": { "code": "VALIDATION_ERROR", "field": "email", "message": "Invalid email format" } },
    { "index": 37, "status": "error", "error": { "code": "DUPLICATE", "field": "email", "message": "Email already exists" } }
  ]
}
Only include failed items in the response to keep payloads small. Clients can assume unlisted items succeeded. Alternatively, include a succeeded_ids array so clients can verify.
Key design decisions checklist#
- Operation scope — single operation type (bulk create) or mixed operations (JSON batch)?
- Atomicity — all-or-nothing or partial success?
- Idempotency — per-operation keys, batch-level key, or both?
- Size limits — max items, max body size, max processing time?
- Processing order — sequential, parallel, or parallel with concurrency limit?
- Rate limiting — count by batch or by individual operations?
- Response format — full results for every item or only errors?
Common mistakes#
- No batch size limit — a client sends 100,000 items, your server runs out of memory
- No idempotency support — network retries create duplicate records
- Using 200 OK for partial failures — clients assume everything succeeded. Use 207 Multi-Status or include explicit status per item.
- Blocking on the entire batch — for very large batches, accept the request and process asynchronously. Return a job ID the client can poll.
- Ignoring database efficiency — processing 500 items as 500 individual SQL statements loses most of the batch performance benefit. Use bulk inserts and updates.
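The asynchronous-job fix for the blocking mistake can be sketched as an accept-then-poll pair. The endpoint paths, in-memory job store, and job states here are all illustrative:

```python
import uuid

jobs = {}  # stand-in job store: job_id -> state; use a real queue in production

def submit_batch_async(operations: list) -> dict:
    """Accept a large batch immediately and return a job id the client can poll."""
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"state": "pending", "operations": operations, "results": None}
    # a background worker would pick the job up from here
    return {"job_id": job_id, "status_url": f"/api/v1/batch-jobs/{job_id}"}

def poll_job(job_id: str) -> dict:
    """Report a job's current state; unknown ids get a not_found status."""
    job = jobs.get(job_id)
    return {"status": "not_found"} if job is None else {"status": job["state"]}
```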
Article #442 in the Codelit engineering series. Explore our full library of system design, infrastructure, and architecture guides at codelit.io.