API Error Handling: Status Codes, Problem Details & Best Practices
API Error Handling Patterns
Good APIs are defined not by how they handle success, but by how they handle failure. Clear, consistent error responses save developers hours of debugging and prevent cascading failures.
1. HTTP Status Codes: Use Them Correctly
The status code is the first thing clients check. Get it right.
Client Errors (4xx)
| Code | Meaning | Use When |
|---|---|---|
| 400 | Bad Request | Malformed JSON, missing required fields |
| 401 | Unauthorized | No credentials or invalid token |
| 403 | Forbidden | Valid credentials but insufficient permissions |
| 404 | Not Found | Resource does not exist |
| 409 | Conflict | Duplicate creation, version conflict |
| 422 | Unprocessable Entity | Valid syntax but semantic errors (e.g., email format) |
| 429 | Too Many Requests | Rate limit exceeded |
Server Errors (5xx)
| Code | Meaning | Use When |
|---|---|---|
| 500 | Internal Server Error | Unexpected bug, unhandled exception |
| 502 | Bad Gateway | Upstream service returned invalid response |
| 503 | Service Unavailable | Maintenance, overload, dependency down |
| 504 | Gateway Timeout | Upstream service did not respond in time |
Common mistakes:
- Returning 200 with `{"error": "not found"}`: clients cannot distinguish success from failure by status code.
- Using 500 for validation errors: that is a client error (4xx).
- Using 403 when you mean 401: 401 means "who are you?", 403 means "I know who you are but you cannot do this."
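One way to keep these choices consistent is to decide the status code in a single place. The sketch below assumes four hypothetical exception classes (`NotAuthenticated`, `PermissionDenied`, `ResourceNotFound`, `InvalidInput`) that are not part of any framework; anything the mapping does not recognise falls through to 500.

```python
from http import HTTPStatus


class NotAuthenticated(Exception):
    """Hypothetical: no credentials or an invalid token."""


class PermissionDenied(Exception):
    """Hypothetical: the caller is known but may not do this."""


class ResourceNotFound(Exception):
    """Hypothetical: the requested resource does not exist."""


class InvalidInput(Exception):
    """Hypothetical: the request failed validation."""


# Mapping from application error type to the status code in the tables above.
STATUS_BY_ERROR = {
    NotAuthenticated: HTTPStatus.UNAUTHORIZED,      # 401
    PermissionDenied: HTTPStatus.FORBIDDEN,         # 403
    ResourceNotFound: HTTPStatus.NOT_FOUND,         # 404
    InvalidInput: HTTPStatus.UNPROCESSABLE_ENTITY,  # 422
}


def status_for(error: Exception) -> int:
    """Return the HTTP status for an error; unrecognised errors become 500."""
    for error_type, status in STATUS_BY_ERROR.items():
        if isinstance(error, error_type):
            return int(status)
    return int(HTTPStatus.INTERNAL_SERVER_ERROR)
```

Because validation failures have their own entry, they cannot fall through to 500 by accident.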
2. RFC 7807: Problem Details for HTTP APIs
RFC 7807 (since superseded by RFC 9457, which keeps the same core fields) defines a standard JSON format for error responses that every client can parse the same way.
```json
{
  "type": "https://api.example.com/errors/insufficient-funds",
  "title": "Insufficient Funds",
  "status": 422,
  "detail": "Account balance is $10.00 but transfer requires $25.00.",
  "instance": "/transfers/abc123",
  "balance": 10.00,
  "required": 25.00
}
```
Core fields (the RFC technically makes every member optional and defaults `type` to `about:blank`, but treat these three as required in your own API):
- `type`: URI identifying the error type (use as documentation link).
- `title`: Short human-readable summary.
- `status`: HTTP status code (repeated for convenience).
Optional fields:
- `detail`: Longer explanation specific to this occurrence.
- `instance`: URI identifying this specific error occurrence.
- Custom fields (like `balance`, `required`) for machine-readable context.
```python
from flask import jsonify


def problem_response(type_uri, title, status, detail=None, **kwargs):
    """Build a Flask response carrying an RFC 7807 problem document."""
    body = {"type": type_uri, "title": title, "status": status}
    if detail:
        body["detail"] = detail
    # Extension members (e.g. balance, required) are merged into the top level.
    body.update(kwargs)
    response = jsonify(body)
    response.status_code = status
    response.content_type = "application/problem+json"
    return response
```
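On the consuming side, a client can separate the standard members from the custom ones. This is a minimal sketch using only the standard library; the `Problem` dataclass and `parse_problem` function are illustrative names, not part of the RFC or of any library.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

# The five members defined by the RFC; anything else is an extension member.
STANDARD_MEMBERS = ("type", "title", "status", "detail", "instance")


@dataclass
class Problem:
    type: str = "about:blank"  # the RFC's default when "type" is absent
    title: Optional[str] = None
    status: Optional[int] = None
    detail: Optional[str] = None
    instance: Optional[str] = None
    extensions: Dict[str, Any] = field(default_factory=dict)


def parse_problem(body: str) -> Problem:
    """Parse an application/problem+json body into a Problem."""
    data = json.loads(body)
    known = {name: data[name] for name in STANDARD_MEMBERS if name in data}
    extensions = {k: v for k, v in data.items() if k not in STANDARD_MEMBERS}
    return Problem(extensions=extensions, **known)
```

With the insufficient-funds example, `balance` and `required` end up in `extensions`, so code that only understands the standard members keeps working when new custom fields appear.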
3. Consistent Error Response Format
Even if you do not adopt RFC 7807 fully, pick a format and never deviate.
```json
{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Request validation failed.",
    "details": [
      { "field": "email", "issue": "Must be a valid email address." },
      { "field": "age", "issue": "Must be at least 18." }
    ],
    "request_id": "req_abc123"
  }
}
```
Rules:
- Always include a machine-readable error `code`.
- Always include a human-readable `message`.
- Always include a `request_id` for support and debugging.
- Return all validation errors at once, not one at a time.
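The last rule is the one most often broken, because validators tend to raise on the first failure. The sketch below collects every issue before responding. `validate_signup` and `error_body` are hypothetical helpers, and the email check is deliberately simplistic.

```python
import uuid
from typing import Dict, List, Optional


def validate_signup(payload: dict) -> List[Dict[str, str]]:
    """Check every field and return all issues, rather than stopping at the first."""
    issues = []
    email = payload.get("email")
    if not isinstance(email, str) or "@" not in email:  # simplistic check, for illustration
        issues.append({"field": "email", "issue": "Must be a valid email address."})
    age = payload.get("age")
    if not isinstance(age, int) or age < 18:
        issues.append({"field": "age", "issue": "Must be at least 18."})
    return issues


def error_body(issues: List[Dict[str, str]], request_id: Optional[str] = None) -> dict:
    """Wrap the issues in the envelope shown above, generating a request ID if needed."""
    return {
        "error": {
            "code": "VALIDATION_ERROR",
            "message": "Request validation failed.",
            "details": issues,
            "request_id": request_id or f"req_{uuid.uuid4().hex}",
        }
    }
```

A payload with both a bad email and an underage value produces two entries in `details`, so the caller can fix both in one round trip.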
4. Retry-After Header
When returning 429 or 503, tell the client when to try again:
```http
HTTP/1.1 429 Too Many Requests
Retry-After: 30
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1711700000

{
  "type": "https://api.example.com/errors/rate-limited",
  "title": "Rate Limit Exceeded",
  "status": 429,
  "detail": "You have exceeded 100 requests per minute.",
  "retry_after_seconds": 30
}
```
Without Retry-After, clients guess — and usually guess wrong, hammering your server immediately.
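A client has to be ready for both forms the header can take: a number of seconds, as above, or an HTTP-date. The following is a rough sketch using the standard library; `parse_retry_after` and its 5-second fallback are illustrative choices, not a standard.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def parse_retry_after(value: Optional[str],
                      now: Optional[datetime] = None,
                      default: float = 5.0) -> float:
    """Convert a Retry-After header value into a number of seconds to wait."""
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():
        return float(value)  # delay-seconds form, e.g. "30"
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return default  # unparseable header: fall back instead of failing
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means "retry now", never a negative wait.
    return max(0.0, (when - now).total_seconds())
```

Passing `now` explicitly is only there to make the function testable; production callers would omit it.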
5. Idempotency on Errors
Network failures mean clients will retry. Without idempotency, retries cause duplicate operations.
```text
Client: POST /api/payments
        Idempotency-Key: pay_xyz
        { amount: 100 }
Server: 201 Created { id: "payment_123" }
# Network drops the response. Client retries with the same key:
Client: POST /api/payments
        Idempotency-Key: pay_xyz
        { amount: 100 }
Server: 201 Created { id: "payment_123" }  # stored response replayed, no duplicate charge
```
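```python
def create_payment(request):
    key = request.headers.get("Idempotency-Key")
    if not key:
        # The type URI and title arguments are elided here.
        return problem_response(..., status=400, detail="Idempotency-Key header required")
    existing = cache.get(f"idempotency:{key}")
    if existing is not None:
        return existing  # replay the stored response (status + body)
    result = process_payment(request.json)
    # Keep the response for 24 h. Note: get-then-set is not atomic, so two
    # concurrent requests with the same key can both reach process_payment.
    cache.set(f"idempotency:{key}", result, ex=86400)
    return result
```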
Key rules:
- Store the full response (status + body), not just the result.
- Idempotency keys should expire (24 hours is common).
- Return the cached response with the same status code as the original.
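The three rules can be seen together in a small in-memory sketch. `IdempotencyStore` and `handle_once` are hypothetical names; a real deployment would use a shared cache and an atomic "set if absent" operation, which this single-process version does not attempt.

```python
import time
from typing import Callable, Dict, Optional, Tuple

Response = Tuple[int, dict]  # (status code, JSON body): the full response, not just the result


class IdempotencyStore:
    """In-memory store of responses keyed by idempotency key, with expiry."""

    def __init__(self, ttl_seconds: float = 86400,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Response]] = {}

    def get(self, key: str) -> Optional[Response]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, response = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired keys behave as if never seen
            return None
        return response

    def put(self, key: str, response: Response) -> None:
        self._entries[key] = (self._clock() + self._ttl, response)


def handle_once(store: IdempotencyStore, key: str,
                operation: Callable[[], Response]) -> Response:
    """Run the operation at most once per live key; replay the stored response otherwise."""
    cached = store.get(key)
    if cached is not None:
        return cached  # same status code and body as the original
    response = operation()
    store.put(key, response)
    return response
```

Storing the `(status, body)` pair is what lets a retry receive the original 201 rather than a different code for the same payment.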
6. Client-Side Error Handling
APIs are only as good as how clients consume them.
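```typescript
const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

// RetryableError and ClientError are application-defined Error subclasses.
async function apiCall(url: string, options?: RequestInit, retried = false): Promise<unknown> {
  const response = await fetch(url, options);
  if (response.ok) {
    return response.json();
  }
  // Error bodies are not always JSON (for example, a proxy's HTML error page).
  const error = await response.json().catch(() => null);
  if (response.status === 429 && !retried) {
    // Retry-After may also be an HTTP-date; fall back to 5 seconds if it is not a number.
    const retryAfter = Number(response.headers.get("Retry-After")) || 5;
    await sleep(retryAfter * 1000);
    return apiCall(url, options, true); // retry once, then give up
  }
  if (response.status === 429 || response.status >= 500) {
    // Still rate limited, or a server error: the caller may retry with backoff
    throw new RetryableError(error);
  }
  // Other client errors (4xx): do NOT retry, fix the request
  throw new ClientError(error);
}
```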
Client-side rules:
- 4xx errors (other than 429): Do not retry. Fix the request.
- 5xx errors: Retry with exponential backoff, and only if the request is idempotent or carries an idempotency key.
- 429 errors: Respect `Retry-After`.
- Network errors: Retry with backoff if the request is idempotent.
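"Exponential backoff" in the rules above can be sketched as follows. This version adds full jitter (a random wait between zero and the current delay), which is a common refinement rather than something the rules require. `RetryableError` here is a hypothetical Python stand-in for the class thrown by the client code, and the base, cap, and attempt count are arbitrary defaults.

```python
import random
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class RetryableError(Exception):
    """Hypothetical: raised for 5xx responses and other transient failures."""


def backoff_delays(count: int, base: float = 0.5, cap: float = 30.0) -> List[float]:
    """Upper bounds for each wait: base, 2*base, 4*base, ... limited to cap."""
    return [min(cap, base * 2 ** n) for n in range(count)]


def call_with_backoff(operation: Callable[[], T],
                      attempts: int = 4,
                      base: float = 0.5,
                      cap: float = 30.0,
                      sleep: Callable[[float], None] = time.sleep,
                      rng: Callable[[], float] = random.random) -> T:
    """Call operation, retrying on RetryableError up to `attempts` times in total."""
    delays = backoff_delays(attempts - 1, base, cap)
    for attempt in range(attempts):
        try:
            return operation()
        except RetryableError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Full jitter: wait a random fraction of the current upper bound.
            sleep(delays[attempt] * rng())
    raise RuntimeError("attempts must be at least 1")
```

The `sleep` and `rng` parameters exist so the behaviour can be tested without real waiting. The jitter spreads out clients that failed at the same moment so they do not all retry together.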
7. Logging Errors Effectively
Log differently based on severity:
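```python
import traceback
import uuid

from flask import request
from werkzeug.exceptions import HTTPException

# `app`, `logger` and `problem_response` are defined elsewhere. ValidationError is
# whichever class your validation library raises (it must expose `.messages`).


@app.errorhandler(Exception)
def handle_error(error):
    request_id = request.headers.get("X-Request-ID", str(uuid.uuid4()))
    if isinstance(error, HTTPException):
        # Let Flask's own 404/405/etc. through instead of turning them into 500s.
        return error
    if isinstance(error, ValidationError):
        # 4xx: INFO level, client's problem
        logger.info("Validation error", extra={
            "request_id": request_id,
            "errors": error.messages,
            "path": request.path,
        })
        return problem_response(..., status=422)
    # 5xx: ERROR level, our problem
    logger.error("Unhandled exception", extra={
        "request_id": request_id,
        "path": request.path,
        "method": request.method,
        "traceback": traceback.format_exc(),
    })
    return problem_response(..., status=500,
                            detail="An internal error occurred. Reference: " + request_id)
```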
Never expose stack traces to clients. Return the request_id so support can correlate.
8. Error Budgets
An error budget defines how much failure is acceptable in a given period.
```text
SLO: 99.9% success rate
Budget: 0.1% = 1,000 errors per 1,000,000 requests
Current month:
  Total requests:   800,000
  Error budget:     800 errors (0.1% of 800,000)
  Total errors:     450
  Budget consumed:  56% (450 / 800)
  Budget remaining: 350 errors
```
How to use error budgets:
- Budget healthy (under 50%): Ship features aggressively.
- Budget tightening (50-80%): Ship with caution, increase testing.
- Budget nearly exhausted (80%+): Freeze features, focus on reliability.
- Budget blown: All engineering effort goes to stability.
```python
def check_error_budget():
    # `metrics` stands in for your monitoring client.
    total = metrics.get_counter("http_requests_total", period="30d")
    errors = metrics.get_counter("http_errors_total", period="30d")
    budget = total * 0.001  # errors allowed under a 99.9% SLO
    # With no traffic there is no budget; report it as fully consumed.
    consumed = errors / budget if budget > 0 else 1.0
    return {
        "budget_total": budget,
        "budget_consumed_pct": round(consumed * 100, 1),
        "errors_remaining": max(0, budget - errors),
    }
```
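To connect the numbers to the policy tiers, the thresholds can be encoded directly. This is a sketch: `budget_consumed_pct` and `budget_policy` are illustrative names, the returned labels are shorthand for the four tiers listed earlier, and treating 100% or more as "blown" is an assumption.

```python
def budget_consumed_pct(total_requests: int, errors: int,
                        allowed_error_rate: float = 0.001) -> float:
    """Percentage of the error budget used; 0.001 corresponds to a 99.9% SLO."""
    budget = total_requests * allowed_error_rate
    if budget <= 0:
        return 100.0  # no traffic means no budget to spend
    return errors / budget * 100


def budget_policy(consumed_pct: float) -> str:
    """Map budget consumption to the release policy tiers described above."""
    if consumed_pct >= 100:
        return "all effort on stability"  # budget blown (assumed to mean 100% or more)
    if consumed_pct >= 80:
        return "freeze features"
    if consumed_pct >= 50:
        return "ship with caution"
    return "ship aggressively"
```

For the worked month (800,000 requests, 450 errors), consumption is about 56%, which lands in the "ship with caution" tier.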
Quick Reference
| Pattern | Solves |
|---|---|
| Correct status codes | Client parsing, debugging |
| RFC 7807 | Standardized error format |
| Retry-After | Thundering herd on rate limits |
| Idempotency keys | Duplicate operations on retry |
| Request IDs | Correlating logs to user reports |
| Error budgets | Balancing velocity vs reliability |
Key Takeaways
- Status codes are not optional. 200-with-error-body is an anti-pattern.
- RFC 7807 gives you a standard — adopt it or at least be consistent.
- Retry-After protects your server and helps clients behave.
- Idempotency makes retries safe for state-changing operations.
- Error budgets turn reliability into a data-driven decision.
This is article #268 in the Codelit engineering series. Explore more API design, backend architecture, and system design guides at codelit.io.