API Design for Mobile: Bandwidth, Offline-First, and Beyond
Mobile clients operate under constraints that desktop and server-to-server APIs rarely face: unreliable networks, limited bandwidth, battery budgets, and small screens that only need a fraction of the data a web dashboard consumes. Designing APIs that ignore these realities leads to sluggish apps, drained batteries, and frustrated users.
Bandwidth Optimization#
Every unnecessary byte costs the user money on metered connections and time on slow networks.
Response Shaping#
Return only the fields the client needs. Two common approaches:
Sparse fieldsets (REST):
GET /api/posts?fields=id,title,thumbnail_url,like_count
Projections (GraphQL):
query {
posts {
id
title
thumbnailUrl
likeCount
}
}
Both eliminate over-fetching. The key is to never return a full object when the client only renders a card.
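On the server side, a sparse fieldset can be handled with a small projection step. The sketch below is one possible approach, not a prescribed implementation: `ALLOWED_POST_FIELDS`, `parse_fields`, and `project` are hypothetical names, and the allow-list is an assumption added so that clients cannot request internal columns.

```python
from typing import Any, Optional

# Hypothetical allow-list: fields a client is permitted to request for a post.
ALLOWED_POST_FIELDS = {"id", "title", "body", "thumbnail_url", "like_count", "author_id"}


def parse_fields(raw: Optional[str]) -> Optional[set[str]]:
    """Parse a `fields=a,b,c` query value; None means no projection was requested."""
    if not raw:
        return None
    requested = {name.strip() for name in raw.split(",") if name.strip()}
    # Unknown names are dropped rather than rejected (an assumption of this sketch).
    return requested & ALLOWED_POST_FIELDS


def project(record: dict[str, Any], fields: Optional[set[str]]) -> dict[str, Any]:
    """Return only the requested keys, preserving the record's key order."""
    if fields is None:
        return dict(record)
    return {key: value for key, value in record.items() if key in fields}


post = {
    "id": 1,
    "title": "Hello",
    "body": "long text ...",
    "thumbnail_url": "/t/1.webp",
    "like_count": 7,
    "author_id": 42,
}
card = project(post, parse_fields("id,title,thumbnail_url,like_count"))
```

Here `card` contains only the four fields a feed card renders; the `body` and `author_id` values never leave the server.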
Compression#
- Enable gzip or Brotli on every JSON response. Brotli typically achieves 15–20% better compression than gzip on JSON payloads.
- Where payload size and parse time matter most, consider a binary format such as Protocol Buffers or FlatBuffers; these are often 2–5x smaller than the equivalent JSON and faster to parse on constrained devices.
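To get a feel for how much compression helps on repetitive JSON, the following sketch compresses a made-up feed payload with gzip from the Python standard library. The payload shape is an assumption for illustration, and the exact ratio will vary with real data. Brotli is not in the standard library, so it is not shown here.

```python
import gzip
import json

# Hypothetical feed payload: 200 cards with the repeated keys typical of JSON lists.
items = [
    {"id": i, "title": f"Post {i}", "thumbnail_url": f"/thumbs/{i}.webp", "like_count": i % 50}
    for i in range(200)
]
raw = json.dumps({"items": items}, separators=(",", ":")).encode("utf-8")
compressed = gzip.compress(raw, compresslevel=6)

ratio = len(compressed) / len(raw)
print(f"raw={len(raw)} bytes, gzip={len(compressed)} bytes, ratio={ratio:.2f}")
```

In practice compression is usually switched on at the web server or CDN rather than in application code, with the client advertising support through the `Accept-Encoding` header.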
Envelope Reduction#
Avoid deep nesting and verbose envelopes. Instead of:
{ "status": "ok", "data": { "results": { "items": [...] } } }
Prefer:
{ "items": [...], "nextCursor": "abc123" }
Pagination for Mobile#
Mobile screens display limited content at a time. Efficient pagination prevents loading thousands of records the user will never scroll to.
Cursor-Based Pagination#
Cursor pagination outperforms offset pagination for mobile because it is stable under concurrent writes and naturally supports infinite scroll.
GET /api/feed?limit=20&cursor=eyJpZCI6MTAwfQ==
Response:
{
"items": [...],
"nextCursor": "eyJpZCI6ODB9",
"hasMore": true
}
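The cursors in this example are base64-encoded JSON: `eyJpZCI6MTAwfQ==` decodes to `{"id":100}`. A minimal server-side sketch of that scheme follows, assuming a feed sorted newest-first by a numeric `id`. The function names, the in-memory `rows` list standing in for a database table, and the page-size cap of 100 are all assumptions of this sketch.

```python
import base64
import json
from typing import Any, Optional


def encode_cursor(last_id: int) -> str:
    """Serialize the last-seen id as compact JSON, then base64."""
    payload = json.dumps({"id": last_id}, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(payload).decode("ascii")


def decode_cursor(cursor: str) -> int:
    return int(json.loads(base64.urlsafe_b64decode(cursor))["id"])


def feed_page(rows: list[dict[str, Any]], limit: int = 20, cursor: Optional[str] = None) -> dict[str, Any]:
    """Keyset pagination over rows sorted newest-first (descending id)."""
    limit = max(1, min(limit, 100))  # clamp the client-supplied page size (hypothetical cap)
    before_id = decode_cursor(cursor) if cursor else None
    ordered = sorted(rows, key=lambda r: r["id"], reverse=True)
    # Equivalent to: WHERE id < :before_id ORDER BY id DESC
    candidates = [r for r in ordered if before_id is None or r["id"] < before_id]
    page = candidates[:limit]
    has_more = len(candidates) > limit
    return {
        "items": page,
        "nextCursor": encode_cursor(page[-1]["id"]) if has_more else None,
        "hasMore": has_more,
    }


rows = [{"id": i} for i in range(1, 151)]  # stand-in for a database table
page = feed_page(rows, limit=20, cursor="eyJpZCI6MTAwfQ==")
```

Because the cursor names a position in the ordering rather than a row count, rows inserted at the top of the feed between requests do not shift the next page. A production cursor would normally be signed or encrypted so clients cannot tamper with it.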
Best Practices#
| Practice | Reason |
|---|---|
| Default page size of 20 | Balances payload size with scroll depth |
| Include hasMore flag | Lets the client know when to stop fetching |
| Never use offset for infinite scroll | Duplicates and skips appear under concurrent writes |
| Support limit parameter | Different screens need different page sizes |
Image Optimization#
Images dominate mobile bandwidth. A single unoptimized hero image can exceed the entire JSON payload.
Server-Side Strategies#
- Responsive image URLs — Accept width and quality parameters:
  GET /images/abc123?w=400&q=80&fmt=webp
- Content negotiation — Return WebP or AVIF when the Accept header indicates support.
- Blur hash placeholders — Return a tiny blur hash string (20–30 bytes) alongside the image URL. The client renders the placeholder instantly while the full image loads.
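The first two strategies can be sketched in a few lines. This is an illustrative simplification: `pick_format` and `image_url` are hypothetical helpers, the width and quality bounds are arbitrary, and the `Accept` parsing ignores q-values, which a production implementation should honor.

```python
from urllib.parse import urlencode

# Preference order for modern formats; JPEG is the fallback every client can decode.
PREFERRED_FORMATS = [("image/avif", "avif"), ("image/webp", "webp")]


def pick_format(accept_header: str) -> str:
    """Choose an output format from the request's Accept header (q-values ignored)."""
    accepted = {part.split(";")[0].strip().lower() for part in accept_header.split(",")}
    for mime, fmt in PREFERRED_FORMATS:
        if mime in accepted:
            return fmt
    return "jpeg"


def image_url(image_id: str, width: int, quality: int = 80, fmt: str = "webp") -> str:
    """Build a variant URL with the w, q, and fmt parameters."""
    width = min(max(width, 16), 2048)     # hypothetical bounds to cap the number of variants
    quality = min(max(quality, 1), 100)
    return f"/images/{image_id}?" + urlencode({"w": width, "q": quality, "fmt": fmt})
```

For example, `image_url("abc123", 400)` yields `/images/abc123?w=400&q=80&fmt=webp`. When the format is chosen from the `Accept` header rather than the URL, the response should also carry `Vary: Accept` so caches keep the variants apart.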
CDN Integration#
┌─────────┐ ┌──────────┐ ┌──────────────┐
│ Mobile │────▶│ CDN │────▶│ Image │
│ Client │◀────│ (edge) │◀────│ Processing │
└─────────┘ └──────────┘ │ Service │
└──────────────┘
Serve images from edge locations. The image processing service generates variants on first request and caches them at the CDN. Subsequent requests for the same variant are cache hits served from the edge, with no round-trip to the processing service.
Offline-First Architecture#
Mobile users lose connectivity in elevators, subways, and rural areas. Offline-first APIs treat the network as an enhancement, not a requirement.
Local-First Data Flow#
┌──────────────┐ ┌──────────────┐
│ UI Layer │ │ Remote API │
└──────┬───────┘ └──────▲───────┘
│ │
┌──────▼───────┐ ┌──────┴───────┐
│ Local DB │◀────────▶│ Sync Engine │
│ (SQLite / │ │ (queue + │
│ IndexedDB) │ │ reconcile) │
└──────────────┘ └──────────────┘
- Reads always hit the local database — instant response, zero network dependency.
- Writes queue locally — the sync engine pushes them to the server when connectivity returns.
- Conflict resolution — last-write-wins, server-wins, or custom merge logic depending on the domain.
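The read and write paths above can be sketched with SQLite and an "outbox" table. This is a minimal illustration under stated assumptions: the `notes` and `outbox` tables, the function names, and the `send` callback (which stands in for the network call and returns `True` on success) are all hypothetical, and conflict resolution is left out.

```python
import json
import sqlite3
from typing import Callable, Optional

db = sqlite3.connect(":memory:")  # a real app would use a file on device storage
db.execute("CREATE TABLE notes (id TEXT PRIMARY KEY, body TEXT)")
db.execute("CREATE TABLE outbox (seq INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT)")


def save_note(note_id: str, body: str) -> None:
    """Write locally and enqueue the change in a single transaction."""
    with db:  # commits on success, rolls back if either statement fails
        db.execute("INSERT OR REPLACE INTO notes VALUES (?, ?)", (note_id, body))
        db.execute(
            "INSERT INTO outbox (payload) VALUES (?)",
            (json.dumps({"op": "upsert", "id": note_id, "body": body}),),
        )


def read_note(note_id: str) -> Optional[str]:
    """Reads never touch the network."""
    row = db.execute("SELECT body FROM notes WHERE id = ?", (note_id,)).fetchone()
    return row[0] if row else None


def flush_outbox(send: Callable[[dict], bool]) -> int:
    """Push queued writes in order; stop at the first failure and retry later."""
    sent = 0
    for seq, payload in db.execute("SELECT seq, payload FROM outbox ORDER BY seq").fetchall():
        if not send(json.loads(payload)):
            break
        with db:
            db.execute("DELETE FROM outbox WHERE seq = ?", (seq,))
        sent += 1
    return sent
```

Writing the note and the outbox row in one transaction means a crash cannot leave a local change that is never synced. Because a queued write may be sent more than once after a failed flush, the server endpoint should be idempotent.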
Optimistic UI#
The app updates the UI immediately on user action. If the server later rejects the write, the sync engine rolls back and notifies the user. This makes the app feel instantaneous regardless of network conditions.
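A stripped-down sketch of the update-then-rollback sequence is shown below. It is synchronous for brevity, whereas a real app would await the server response asynchronously. `LikeButtonState` and `send_like` are hypothetical, with `send_like` returning `False` when the server rejects the write.

```python
from typing import Callable, Optional


class LikeButtonState:
    """Holds the values the UI renders for one like button."""

    def __init__(self, like_count: int) -> None:
        self.like_count = like_count
        self.error: Optional[str] = None


def like_optimistically(state: LikeButtonState, send_like: Callable[[], bool]) -> None:
    previous = state.like_count
    state.like_count += 1                          # 1. update what the user sees immediately
    if not send_like():                            # 2. then attempt the write
        state.like_count = previous                # 3. roll back on rejection
        state.error = "Could not save your like"   # 4. surface the failure to the user
```

The important detail is that the previous value is captured before the optimistic change, so the rollback restores exactly what was shown earlier.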
Delta Sync#
Fetching the entire dataset on every sync wastes bandwidth and battery. Delta sync transfers only what changed.
Implementation Patterns#
Timestamp-based:
GET /api/contacts?updatedSince=2026-03-29T10:00:00Z
Response:
{
"updated": [...],
"deleted": ["id-77", "id-102"],
"syncToken": "ts:1711706400"
}
Sync token / changelog:
GET /api/sync?token=abc123
Response:
{
"changes": [
{ "op": "upsert", "doc": {...} },
{ "op": "delete", "id": "xyz" }
],
"nextToken": "def456"
}
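On the client, applying a changelog response is a short loop over the operations. The sketch below assumes each upserted `doc` carries an `id` field and that the local store is a plain dictionary keyed by that id; both are assumptions, since the response above elides the document body.

```python
from typing import Any


def apply_changes(store: dict[str, dict[str, Any]], response: dict[str, Any]) -> str:
    """Apply one sync response to a local id -> document store and return the next token."""
    for change in response["changes"]:
        if change["op"] == "upsert":
            doc = change["doc"]
            store[doc["id"]] = doc
        elif change["op"] == "delete":
            store.pop(change["id"], None)  # deleting an id we never had is a no-op
        else:
            raise ValueError(f"unknown op: {change['op']!r}")
    return response["nextToken"]
```

The returned token should be persisted only after every change has been applied, ideally in the same local transaction, so an interrupted sync is retried from the old token rather than skipped.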
Handling Full Re-Sync#
When the sync token expires or the gap is too large, the server returns a 410 Gone status. The client performs a full re-sync by fetching the complete dataset with a fresh token.
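The fallback can be sketched as a single client function. The two callbacks stand in for HTTP calls and are hypothetical, as is the snapshot shape `{"docs": [...], "token": "..."}`; the changelog shape is assumed to use `op`, `doc`, `id`, and `nextToken` fields, with each `doc` carrying an `id`.

```python
from typing import Any, Callable, Optional

GONE = 410


def sync(
    store: dict[str, Any],
    token: Optional[str],
    fetch_changes: Callable[[str], tuple[int, dict]],
    fetch_snapshot: Callable[[], dict],
) -> str:
    """Try a delta sync; fall back to a full re-sync when the token is rejected."""
    if token is not None:
        status, body = fetch_changes(token)
        if status == 200:
            for change in body["changes"]:
                if change["op"] == "upsert":
                    store[change["doc"]["id"]] = change["doc"]
                else:
                    store.pop(change["id"], None)
            return body["nextToken"]
        if status != GONE:
            raise RuntimeError(f"sync failed with HTTP {status}")
    # No token yet, or the server answered 410 Gone: replace local state wholesale.
    snapshot = fetch_snapshot()
    store.clear()
    store.update({doc["id"]: doc for doc in snapshot["docs"]})
    return snapshot["token"]
```

Clearing the store before loading the snapshot matters: records deleted on the server during the gap would otherwise linger locally, because the snapshot carries no delete operations. Unsynced local writes should be flushed or preserved before this step.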
GraphQL for Mobile#
GraphQL addresses several mobile-specific pain points out of the box.
Advantages#
- No over-fetching — the client declares exactly which fields it needs.
- Single round-trip — one query can fetch a post, its author, and its comments. A REST API without a purpose-built aggregate endpoint would typically need three requests.
- Strong typing — the schema serves as a contract, and code generation produces type-safe client code.
Mobile-Specific Considerations#
| Concern | Mitigation |
|---|---|
| Query complexity attacks | Enforce depth limits and query cost analysis |
| Caching | Use persisted queries — the client sends a hash, not the full query string |
| Bandwidth | Enable automatic persisted queries (APQ) to eliminate query text from requests |
| Offline | Cache normalized entities locally and resolve queries from the cache |
Persisted Queries#
-- First request (registers the query)
POST /graphql
{ "query": "query Feed { posts { id title } }", "extensions": { "persistedQuery": { "sha256Hash": "abc..." } } }
-- Subsequent requests (hash only)
POST /graphql
{ "extensions": { "persistedQuery": { "sha256Hash": "abc..." } } }
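This saves bandwidth whenever the query text is longer than its hash. On its own, the hash does not restrict which queries can run; to prevent arbitrary query execution in production, the server must also reject any hash that is not on a pre-registered allowlist.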
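The two request bodies can be built with the standard library alone. The sketch below assumes the hash is the SHA-256 hex digest of the exact query text, which the client and server must agree on byte for byte. The function names are hypothetical, and real client libraries may add further fields to the extension object.

```python
import hashlib
import json

QUERY = "query Feed { posts { id title } }"


def query_hash(query: str) -> str:
    """SHA-256 hex digest of the exact query text (always 64 characters)."""
    return hashlib.sha256(query.encode("utf-8")).hexdigest()


def registration_body(query: str) -> str:
    """Full request: query text plus its hash."""
    return json.dumps(
        {"query": query, "extensions": {"persistedQuery": {"sha256Hash": query_hash(query)}}}
    )


def hash_only_body(query: str) -> str:
    """Follow-up request: the hash stands in for the query text."""
    return json.dumps({"extensions": {"persistedQuery": {"sha256Hash": query_hash(query)}}})
```

Note that the example query is only 33 characters long, shorter than its 64-character hash, so the saving is real only for the much larger queries typical of production screens.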
Backend for Frontend (BFF)#
A BFF is a thin server layer tailored to a specific client.
┌──────────┐ ┌──────────┐ ┌──────────────────┐
│ iOS App │────▶│ Mobile │────▶│ Microservices │
└──────────┘ │ BFF │ │ (users, posts, │
┌──────────┐ │ │ │ notifications) │
│ Android │────▶│ │ └──────────────────┘
└──────────┘ └──────────┘
┌──────────┐ ┌──────────┐
│ Web App │────▶│ Web BFF │────▶ (same services)
└──────────┘ └──────────┘
Why BFF for Mobile#
- Aggregation — One BFF call replaces multiple microservice calls, reducing round-trips.
- Transformation — The BFF shapes responses for the mobile screen (smaller payloads, fewer fields).
- Versioning — The BFF isolates mobile clients from backend changes. Deploy a new BFF version without touching microservices.
- Authentication — The BFF can handle token refresh and session management, simplifying the mobile client.
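Aggregation and transformation are the easiest of these to illustrate. In the sketch below, three stub coroutines stand in for microservice calls, and the endpoint name, stub functions, and field names are all hypothetical. The BFF handler fans out concurrently and returns only what one mobile screen needs.

```python
import asyncio
from typing import Any


# Stubs standing in for calls to three microservices (hypothetical response shapes).
async def fetch_user(user_id: str) -> dict[str, Any]:
    return {"id": user_id, "name": "Ada", "email": "ada@example.com", "created_at": "2020-01-01"}


async def fetch_posts(user_id: str) -> list[dict[str, Any]]:
    return [{"id": "p1", "title": "Hello", "body": "long text ...", "author_id": user_id}]


async def fetch_unread_count(user_id: str) -> int:
    return 3


async def home_screen(user_id: str) -> dict[str, Any]:
    """One BFF endpoint: call the services concurrently, then shape the result."""
    user, posts, unread = await asyncio.gather(
        fetch_user(user_id), fetch_posts(user_id), fetch_unread_count(user_id)
    )
    return {
        "displayName": user["name"],
        "unreadCount": unread,
        "posts": [{"id": p["id"], "title": p["title"]} for p in posts],
    }


result = asyncio.run(home_screen("u1"))
```

The three upstream calls happen inside the data center, where latency is low, while the phone pays for a single round-trip over the mobile network and receives none of the unused fields such as `email` or `body`.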
Push vs Pull#
Choosing between push and pull determines how fresh the data feels and how much battery the app consumes.
Pull (Polling)#
Client: GET /api/notifications?since=last_check (every 30s)
- Simple to implement.
- Wastes bandwidth when there is nothing new.
- Acceptable for low-frequency data (settings, profile updates).
Push (WebSocket / SSE / Push Notifications)#
Server ──▶ WebSocket ──▶ Client (real-time)
Server ──▶ APNs / FCM ──▶ Device (background)
- Delivers updates instantly.
- WebSocket and SSE hold a persistent connection open (battery and memory cost); APNs and FCM instead share a connection managed by the operating system.
- Essential for chat, live scores, and collaborative editing.
Hybrid Approach#
The best mobile APIs combine both:
| Data Type | Strategy |
|---|---|
| Chat messages | Push via WebSocket |
| Feed updates | Push notification triggers a pull |
| Profile data | Pull on app open |
| Live scores | Push via SSE |
| Settings sync | Pull on demand |
Push notifications wake the app and trigger a targeted pull, combining freshness with efficiency.
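The push-triggers-pull pattern amounts to treating the notification as a hint rather than as data. A minimal sketch follows, assuming a hypothetical payload with a `topic` key and a `pull` callback that performs the actual HTTP request.

```python
from typing import Callable

# Hypothetical mapping from a push payload's topic to the endpoint to refresh.
PULL_ENDPOINTS = {
    "feed": "/api/feed?limit=20",
    "notifications": "/api/notifications",
}


def on_push(payload: dict, pull: Callable[[str], None]) -> bool:
    """Fetch the resource the push refers to; ignore topics this client does not know."""
    endpoint = PULL_ENDPOINTS.get(payload.get("topic", ""))
    if endpoint is None:
        return False
    pull(endpoint)
    return True
```

Keeping the push payload to a topic name keeps it small and means the client always reads current state through the normal API, even if a notification is delayed or arrives out of order.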
Key Takeaways#
- Shape responses to the screen — never return more data than the client will render.
- Use cursor pagination — stable, efficient, and natural for infinite scroll.
- Optimize images aggressively — responsive URLs, modern formats, CDN caching, and blur hash placeholders.
- Design for offline — local-first reads, queued writes, and delta sync keep the app usable without connectivity.
- Consider a BFF — one aggregation layer saves the mobile client from chatty microservice calls.
- Push where it matters, pull where it doesn't — balance freshness against battery and bandwidth.
Mobile API design is one of the most practical system design topics because every interviewer has experienced a slow or broken mobile app. Mastering these patterns demonstrates real-world engineering judgment.
This is article #379 in the Codelit system design series.