Design a File Upload System — Chunked, Resumable, and Scalable
Uploads are deceptively complex#
"Let users upload files." Four words that hide enormous complexity. Large files fail over flaky connections. Virus-infected files endanger your users. Unprocessed images crash your CDN.
Here's how to build file uploads that actually work.
The naive approach (and why it fails)#
Client → POST /upload (entire file in body) → Server → Save to disk
Fails because:
- 500MB file over 3G? Request times out
- Connection drops at 80%? Start over from scratch
- Server runs out of memory buffering the file
- No virus scanning, no processing, no CDN
Chunked uploads#
Split the file into small pieces (5-10MB each) and upload them individually.
Flow:
- Client splits file into chunks
- Client uploads each chunk with chunk index + upload ID
- Server stores chunks temporarily
- When all chunks arrive, server reassembles the file
Why it works: Each chunk is a small, fast request. If one fails, retry just that chunk, not the entire file.
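The flow above can be sketched in a few lines. This is a minimal in-memory illustration, not a production design: `split_into_chunks` and `ChunkAssembler` are hypothetical names, and a real server would keep chunks in temporary storage keyed by upload ID rather than in a dict.

```python
from typing import Dict, Iterator, List, Tuple

CHUNK_SIZE = 5 * 1024 * 1024  # 5 MB, the low end of the range suggested above


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> Iterator[Tuple[int, bytes]]:
    """Yield (chunk_index, chunk_bytes) pairs that cover the whole payload."""
    for index, start in enumerate(range(0, len(data), chunk_size)):
        yield index, data[start:start + chunk_size]


class ChunkAssembler:
    """Hypothetical server-side holder for the chunks of one upload ID."""

    def __init__(self, total_chunks: int) -> None:
        self.total_chunks = total_chunks
        self.chunks: Dict[int, bytes] = {}

    def receive(self, index: int, chunk: bytes) -> None:
        # Storing by index makes retries idempotent: a re-sent chunk
        # simply overwrites the earlier copy.
        if not 0 <= index < self.total_chunks:
            raise ValueError(f"chunk index {index} out of range")
        self.chunks[index] = chunk

    def missing(self) -> List[int]:
        """Indices the client still has to send (or retry)."""
        return [i for i in range(self.total_chunks) if i not in self.chunks]

    def assemble(self) -> bytes:
        """Join the chunks in index order once every one has arrived."""
        if self.missing():
            raise RuntimeError(f"missing chunks: {self.missing()}")
        return b"".join(self.chunks[i] for i in range(self.total_chunks))
```

Because chunks are keyed by index, they can arrive out of order or more than once, and `missing()` tells the client exactly which pieces to retry.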
Resumable uploads#
The gold standard. If the upload is interrupted, resume from where it stopped.
Protocol (tus.io or similar):
- Client: POST /uploads → server returns upload URL + upload ID
- Client: PATCH /uploads/{id} with Upload-Offset: 0 + first chunk
- Client: PATCH /uploads/{id} with Upload-Offset: 5242880 + next chunk
- Connection drops at chunk 3
- Client reconnects: HEAD /uploads/{id} → server returns Upload-Offset: 10485760
- Client resumes from byte 10485760
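To make the offset handshake concrete, here is a small in-memory simulation of the exchange. It is a sketch of the idea only, not the tus protocol or the tusd implementation: `UploadSession`, `upload`, and the `fail_after_chunks` switch are hypothetical names, and no HTTP is involved.

```python
from typing import Optional


class UploadSession:
    """In-memory stand-in for the server side of one resumable upload."""

    def __init__(self, total_size: int) -> None:
        self.total_size = total_size
        self.buffer = bytearray()

    def head(self) -> int:
        """Plays the role of HEAD /uploads/{id}: report the stored byte count."""
        return len(self.buffer)

    def patch(self, offset: int, chunk: bytes) -> int:
        """Plays the role of PATCH /uploads/{id}: append only at the current offset."""
        if offset != len(self.buffer):
            raise ValueError(f"offset {offset} != server offset {len(self.buffer)}")
        if offset + len(chunk) > self.total_size:
            raise ValueError("chunk exceeds the declared upload size")
        self.buffer.extend(chunk)
        return len(self.buffer)


def upload(session: UploadSession, data: bytes, chunk_size: int,
           fail_after_chunks: Optional[int] = None) -> int:
    """Send chunks starting at the server's offset; optionally simulate a drop."""
    offset = session.head()  # always ask the server where to resume
    sent = 0
    while offset < len(data):
        if fail_after_chunks is not None and sent == fail_after_chunks:
            raise ConnectionError("simulated connection drop")
        offset = session.patch(offset, data[offset:offset + chunk_size])
        sent += 1
    return offset
```

The client never trusts its own bookkeeping after a failure. It calls `head()` first, so with 5242880-byte chunks and a drop after the second chunk, the next attempt starts at byte 10485760, matching the exchange above.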
Libraries: tus-js-client (frontend) and tusd (server). An alternative that needs no tus server is S3 multipart upload, with each part sent through a presigned URL.
Direct-to-cloud uploads#
Don't route files through your server. Upload directly to cloud storage.
Presigned URL flow:
- Client requests an upload URL from your API
- Server generates a presigned S3/GCS URL (valid for 15 minutes)
- Client uploads directly to cloud storage
- Cloud storage emits an event (or calls a webhook) when the upload completes
- Your server processes the file asynchronously
Why this is better: Your server never handles file bytes. No memory pressure, no bandwidth bottleneck. The cloud provider handles the heavy lifting.
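The core idea behind a presigned URL is a signature over the method, the object path, and an expiry time, so storage can verify a request without calling your API. The sketch below illustrates that idea with a plain HMAC from the standard library. It is not the actual S3 or GCS signing scheme, and `SECRET_KEY`, `make_upload_url`, `verify_upload_url`, and the `storage.example.com` host are all hypothetical; in practice you would call your cloud SDK's presign function instead. It also assumes the object key is already URL-safe.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

SECRET_KEY = b"demo-secret"  # hypothetical; real providers sign with account credentials
DEFAULT_TTL = 15 * 60        # 15 minutes, as in the flow above


def _signature(method: str, path: str, expires: int) -> str:
    message = f"{method}\n{path}\n{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def make_upload_url(base_url: str, object_key: str,
                    ttl_seconds: int = DEFAULT_TTL,
                    now: Optional[float] = None) -> str:
    """API side: issue a URL that authorizes one PUT to one path until expiry."""
    issued = time.time() if now is None else now
    expires = int(issued) + ttl_seconds
    path = f"/{object_key}"
    query = urlencode({"expires": expires,
                       "signature": _signature("PUT", path, expires)})
    return f"{base_url}{path}?{query}"


def verify_upload_url(url: str, method: str = "PUT",
                      now: Optional[float] = None) -> bool:
    """Storage side: accept only unexpired URLs whose signature matches."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    current = time.time() if now is None else now
    if current > expires:
        return False
    expected = _signature(method, parsed.path, expires)
    return hmac.compare_digest(signature, expected)
```

Because the path and method are part of the signed message, a client cannot reuse the URL to overwrite a different object or to read one back.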
Processing pipeline#
After upload, files need processing:
Upload Complete → Queue → Virus Scan → Process → Store → Notify
For images:
- Virus scan (ClamAV)
- EXIF data extraction
- Thumbnail generation (multiple sizes)
- Format conversion (WebP for web)
- Store originals + thumbnails in cloud storage
- Update database with file metadata
For videos:
- Virus scan
- Transcoding (multiple resolutions + HLS)
- Thumbnail extraction
- Store in cloud storage
- Serve via CDN with adaptive bitrate
For documents:
- Virus scan
- Text extraction (for search indexing)
- Preview generation (PDF thumbnail)
- Store with access controls
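One way to organize these per-type steps is as ordered lists of stage functions behind a single worker entry point. The sketch below assumes that structure. Every stage except the scan is a stub that only records its name, the stage names are hypothetical, and the `EICAR` substring check is a placeholder for a real scanner call such as ClamAV, not a scanner itself.

```python
from typing import Callable, Dict, List

Stage = Callable[[dict], None]


def virus_scan(job: dict) -> None:
    # Placeholder check only; a real deployment would hand the bytes
    # to a scanner such as ClamAV at this point.
    if b"EICAR" in job["data"]:
        raise ValueError("infected file rejected")
    job["completed"].append("virus_scan")


def stub(name: str) -> Stage:
    """Build a stand-in stage that only records that it ran."""
    def stage(job: dict) -> None:
        job["completed"].append(name)
    return stage


# The scan always runs first, so nothing downstream touches unscanned bytes.
PIPELINES: Dict[str, List[Stage]] = {
    "image": [virus_scan, stub("extract_exif"), stub("generate_thumbnails"),
              stub("convert_to_webp"), stub("store"), stub("update_metadata")],
    "video": [virus_scan, stub("transcode"), stub("extract_thumbnail"),
              stub("store")],
    "document": [virus_scan, stub("extract_text"), stub("generate_preview"),
                 stub("store")],
}


def process(kind: str, data: bytes) -> dict:
    """Run every stage for this file type, stopping at the first rejection."""
    job = {"kind": kind, "data": data, "completed": [], "status": "pending"}
    try:
        for stage in PIPELINES[kind]:
            stage(job)
        job["status"] = "ready"
    except ValueError as exc:
        job["status"] = "rejected"
        job["error"] = str(exc)
    return job
```

In a real system `process` would be the body of a queue consumer, and the final status would drive the notify step.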
Storage strategy#
| Data | Storage | Why |
|---|---|---|
| Original files | S3/GCS (Standard) | Durable, cheap |
| Thumbnails | S3 + CDN | Fast delivery |
| Temporary chunks | S3 (with lifecycle) | Auto-delete after 24h |
| File metadata | PostgreSQL | Queryable, relational |
| Processing queue | SQS/Kafka | Reliable async |
Security considerations#
- Virus scanning on every upload before making files accessible
- File type validation — check magic bytes, not just extension
- Size limits — per-file and per-user quotas
- Access control — signed URLs with expiration for private files
- Content-Type enforcement — prevent XSS via uploaded HTML files
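As an illustration of the magic-byte and size checks, the sketch below inspects the first bytes of a file instead of trusting its name or the client's Content-Type header. The signature table covers only four common formats, and `MAX_FILE_BYTES`, `sniff_type`, and `validate_upload` are hypothetical names and limits chosen for the example.

```python
from typing import FrozenSet, Optional

# Leading bytes ("magic numbers") of a few common formats.
MAGIC_SIGNATURES = {
    "image/png": b"\x89PNG\r\n\x1a\n",
    "image/jpeg": b"\xff\xd8\xff",
    "image/gif": b"GIF8",
    "application/pdf": b"%PDF-",
}

MAX_FILE_BYTES = 50 * 1024 * 1024  # hypothetical per-file limit


def sniff_type(head: bytes) -> Optional[str]:
    """Return the MIME type implied by the file's first bytes, if known."""
    for mime, signature in MAGIC_SIGNATURES.items():
        if head.startswith(signature):
            return mime
    return None


def validate_upload(head: bytes, size: int,
                    allowed: FrozenSet[str] = frozenset(MAGIC_SIGNATURES)) -> str:
    """Reject oversized or unrecognized files; return the type to serve them as."""
    if size > MAX_FILE_BYTES:
        raise ValueError("file exceeds size limit")
    detected = sniff_type(head)
    if detected is None or detected not in allowed:
        raise ValueError("file type not allowed")
    return detected
```

Serving the file with the returned type, rather than whatever the client claimed, is what keeps an HTML file renamed to `photo.png` from being rendered as a page.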
See file storage architectures#
On Codelit, search "file storage" or "Dropbox" in ⌘K to see complete file management architectures — chunked sync, deduplication, CDN delivery, and processing pipelines.
Design your upload system: search "file storage" on Codelit.io and explore the architecture.