PDF Generation Architecture: A Complete System Design Guide

March 28, 2026 · 6 min read · By Codelit Team

PDF Generation Architecture#

Generating PDFs at scale is deceptively complex. What starts as a simple "export to PDF" button quickly becomes a distributed system problem involving template rendering, async processing, storage, and compliance. This guide walks through the architecture of a production-grade PDF generation system.

Why PDF Generation Is Hard#

PDFs seem straightforward until you face:

High concurrency — hundreds of reports generated simultaneously
Large documents — catalogs, invoices, or legal filings with hundreds of pages
Pixel-perfect output — matching a designer's mockup exactly
Performance expectations — users expect documents in seconds, not minutes

HTML-to-PDF Approaches#

The most popular strategy is rendering HTML/CSS into PDF. Two dominant tools lead this space.

Puppeteer / Playwright#

Headless Chrome renders your HTML and prints it to PDF. This gives you full CSS support including flexbox, grid, and web fonts.

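import puppeteer from "puppeteer";
const browser = await puppeteer.launch();
let pdf; // will hold the generated PDF bytes
try {
  const page = await browser.newPage();
  await page.setContent(htmlString, { waitUntil: "networkidle0" }); // wait for fonts and images
  pdf = await page.pdf({ format: "A4", printBackground: true });
} finally {
  await browser.close(); // always release the Chrome process, even on errors
}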

Pros: Full CSS support, JavaScript execution, accurate rendering.
Cons: Heavy memory footprint (each Chrome instance uses roughly 50-100MB), slow cold starts.

wkhtmltopdf#

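A lighter alternative built on an older Qt WebKit engine. It consumes less memory but has weaker CSS support: no grid and only limited flexbox. The project is no longer actively maintained, so weigh that before adopting it for a new system.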

Best suited for simple layouts like invoices or receipts where CSS complexity is low.

Template Engines#

Before rendering, you need to populate templates with data. Common approaches:

Handlebars / Mustache — logic-less templates, great for invoices and reports
React-pdf — define PDF layout using React components (no browser needed)
LaTeX — ideal for scientific or mathematical documents
DOCX-to-PDF — run a tool like LibreOffice in headless mode to convert Word-based templates

Choose based on your team's skills and document complexity. For most web teams, HTML templates with Handlebars work well.
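To make the "populate a template with data" step concrete, here is a minimal sketch of logic-less placeholder substitution. It is a stand-in written for illustration, not the Handlebars API: `renderTemplate` and `escapeHtml` are hypothetical names, and a real engine adds loops, partials, and helpers on top of this idea.

```javascript
// Minimal stand-in for a logic-less template engine (NOT the Handlebars API).
// renderTemplate and escapeHtml are hypothetical names for this sketch.
const HTML_ESCAPES = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };

// Escape values so user data cannot inject markup into the document.
function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

// Replaces {{path.to.value}} placeholders with escaped values from `data`.
// Unknown placeholders render as an empty string.
function renderTemplate(template, data) {
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (_match, path) => {
    const value = path
      .split(".")
      .reduce((obj, key) => (obj == null ? undefined : obj[key]), data);
    return value == null ? "" : escapeHtml(value);
  });
}

const invoiceHtml = renderTemplate(
  "<h1>Invoice {{invoice.number}}</h1><p>Bill to: {{customer.name}}</p>",
  { invoice: { number: "INV-1001" }, customer: { name: "Acme & Sons" } }
);
console.log(invoiceHtml);
// prints: <h1>Invoice INV-1001</h1><p>Bill to: Acme &amp; Sons</p>
```

The resulting HTML string is what gets handed to the renderer. Escaping at substitution time matters because headless Chrome will execute any markup or script that ends up in the page.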

Async Generation with Queues#

Synchronous PDF generation blocks HTTP requests and does not scale. A queue-based architecture solves this.

Architecture Flow#

Client sends a generation request via API
API server validates the request, creates a job record, returns a job ID
Message queue (RabbitMQ, SQS, or Redis) holds the job
Worker pool picks up jobs, renders PDFs, uploads to storage
Client polls the job status or receives a webhook notification

Client → API → Queue → Worker → Storage
                ↓
           Job Status DB
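The flow above can be sketched with in-memory stand-ins. This is an illustration under stated assumptions, not a production design: the `Map` plays the job status database, the array plays the message queue, and `submitJob`, `processNextJob`, `getJobStatus`, `renderPdf`, and `uploadPdf` are hypothetical names.

```javascript
const { randomUUID } = require("node:crypto");

// In-memory stand-ins: a real system would use a database and a message broker.
const jobs = new Map(); // job status store
const queue = [];       // message queue holding job IDs

// API side: validate, record the job, enqueue it, and return the ID immediately.
function submitJob(request) {
  if (!request || typeof request.templateId !== "string") {
    throw new Error("templateId is required");
  }
  const jobId = randomUUID();
  jobs.set(jobId, { status: "queued", request, resultKey: null, error: null });
  queue.push(jobId);
  return jobId;
}

// Worker side: take one job, render it, upload it, and record the outcome.
// renderPdf and uploadPdf are injected so the worker stays storage-agnostic.
async function processNextJob(renderPdf, uploadPdf) {
  const jobId = queue.shift();
  if (jobId === undefined) return null; // nothing to do
  const job = jobs.get(jobId);
  job.status = "processing";
  try {
    const pdfBytes = await renderPdf(job.request);
    job.resultKey = await uploadPdf(jobId, pdfBytes);
    job.status = "completed";
  } catch (err) {
    job.status = "failed";
    job.error = err.message;
  }
  return jobId;
}

// Polling endpoint: expose status without echoing the original request.
function getJobStatus(jobId) {
  const job = jobs.get(jobId);
  if (!job) return null;
  return { status: job.status, resultKey: job.resultKey, error: job.error };
}
```

The key property is that `submitJob` returns before any rendering happens, so the HTTP request stays fast no matter how large the document is. A webhook would be fired at the point where the status becomes `completed` or `failed`.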

Worker Scaling#

Workers are stateless and horizontally scalable. Use Kubernetes Jobs or AWS Lambda (with container images for Puppeteer) to auto-scale based on queue depth.

Keep a warm pool of browser instances to avoid cold-start latency. Reuse a long-lived browser process and open a fresh page or browser context for each job, rather than launching a new browser per request.
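As a rough illustration of scaling on queue depth, the sketch below sizes the worker pool so the current backlog drains within a target time. The rule and all parameter names (`avgJobSeconds`, `targetDrainSeconds`, `minWorkers`, `maxWorkers`) are assumptions of this example; an autoscaler would apply its own policy.

```javascript
// Hypothetical sizing rule: enough workers to drain the backlog within a target time.
// minWorkers keeps a warm pool alive; maxWorkers caps memory and cost.
function desiredWorkerCount({ queueDepth, avgJobSeconds, targetDrainSeconds, minWorkers, maxWorkers }) {
  // How many jobs a single worker can finish inside the target window.
  const jobsPerWorker = Math.max(1, Math.floor(targetDrainSeconds / avgJobSeconds));
  const needed = Math.ceil(queueDepth / jobsPerWorker);
  return Math.min(maxWorkers, Math.max(minWorkers, needed));
}

console.log(
  desiredWorkerCount({ queueDepth: 120, avgJobSeconds: 3, targetDrainSeconds: 60, minWorkers: 2, maxWorkers: 20 })
);
// prints: 6
```

With an empty queue the function returns `minWorkers`, which is what keeps warm browser instances available for the next request.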

Storage and Caching#

Storage#

Store generated PDFs in object storage (S3, GCS, or MinIO). Use a consistent naming scheme:

s3://pdf-bucket/{tenant_id}/{document_type}/{year}/{uuid}.pdf

Generate pre-signed URLs for secure, time-limited downloads.
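A small helper can enforce the naming scheme in one place. This is a sketch under assumptions: `buildPdfKey` is a hypothetical name, and the character whitelist is one possible way to keep tenant-supplied values from escaping their prefix.

```javascript
const { randomUUID } = require("node:crypto");

// Only allow characters that are safe inside a single object-key segment.
const SAFE_SEGMENT = /^[A-Za-z0-9_-]+$/;

// Builds "{tenant_id}/{document_type}/{year}/{uuid}.pdf" (bucket name excluded).
function buildPdfKey({ tenantId, documentType, createdAt = new Date(), id = randomUUID() }) {
  for (const [name, value] of Object.entries({ tenantId, documentType, id })) {
    if (typeof value !== "string" || !SAFE_SEGMENT.test(value)) {
      throw new Error(`Invalid ${name}: ${JSON.stringify(value)}`);
    }
  }
  const year = createdAt.getUTCFullYear(); // UTC avoids year flips near midnight
  return `${tenantId}/${documentType}/${year}/${id}.pdf`;
}

console.log(
  buildPdfKey({
    tenantId: "acme",
    documentType: "invoice",
    createdAt: new Date("2026-03-28T00:00:00Z"),
    id: "abc-123",
  })
);
// prints: acme/invoice/2026/abc-123.pdf
```

Putting the tenant ID first means access policies and lifecycle rules can be scoped by prefix.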

Caching#

Many PDFs are requested repeatedly (monthly statements, product catalogs). Cache aggressively:

Content hash — hash the input data; if the hash matches an existing PDF, return it
TTL-based — cache invoices for 24 hours, regenerate after that
Invalidation — purge cache when source data changes

In workloads where the same documents are requested repeatedly, this alone can reduce generation load by roughly 40-60% in typical SaaS applications.
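The content-hash strategy can be sketched as follows. The helper names `canonicalize` and `pdfCacheKey` are hypothetical, and the sketch assumes the input is plain JSON-compatible data (no Dates, Maps, or functions). Including a template version in the hash is an added assumption, so that a template change also produces a new key.

```javascript
const { createHash } = require("node:crypto");

// Serialize with sorted object keys so logically equal inputs hash identically.
// Assumes plain JSON-compatible data (no Dates, Maps, or functions).
function canonicalize(value) {
  if (Array.isArray(value)) {
    return `[${value.map(canonicalize).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const entries = Object.keys(value)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${canonicalize(value[key])}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value);
}

// Cache key = SHA-256 over the template identity plus the input data.
function pdfCacheKey(templateId, templateVersion, data) {
  return createHash("sha256")
    .update(canonicalize({ templateId, templateVersion, data }))
    .digest("hex");
}

const a = pdfCacheKey("invoice", 3, { total: 100, customer: "Acme" });
const b = pdfCacheKey("invoice", 3, { customer: "Acme", total: 100 });
console.log(a === b);
// prints: true
```

Before enqueueing a job, look up the key in storage or a cache index and return the existing PDF when it is present.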

Watermarks and Branding#

Watermarks serve security and branding purposes. Implementation strategies:

At render time — overlay watermark text/images in the HTML template using CSS positioning
Post-processing — use libraries like pdf-lib (JavaScript) or pypdf (Python, formerly PyPDF2) to stamp watermarks onto existing PDFs
Dynamic watermarks — embed the recipient's email or a unique ID to trace leaked documents

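import { PDFDocument, rgb, degrees } from "pdf-lib";

// existingPdfBytes: the bytes of the PDF to stamp (Uint8Array or ArrayBuffer)
const pdfDoc = await PDFDocument.load(existingPdfBytes);
const pages = pdfDoc.getPages();
pages.forEach((page) => {
  // Draw light-gray diagonal text on every page
  page.drawText("CONFIDENTIAL", {
    x: 150, y: 400, size: 60,
    color: rgb(0.9, 0.9, 0.9), rotate: degrees(45),
  });
});
// Serialize the stamped document back to bytes for upload
const watermarkedBytes = await pdfDoc.save();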

Digital Signatures#

For legal and financial documents, digital signatures provide authenticity and tamper detection.

PAdES (PDF Advanced Electronic Signatures) is the standard
Use libraries like node-signpdf or iText for embedding signatures
Store the private signing keys in HSMs (Hardware Security Modules) or a managed KMS, never alongside application code
Timestamps from a TSA (Time Stamping Authority) prove when the document was signed

Accessibility (PDF/UA)#

Accessible PDFs are not optional — they are legally required in many jurisdictions.

Tagged PDF — structure content with headings, paragraphs, and lists
Alt text — describe images and charts
Reading order — ensure logical flow for screen readers
Language metadata — specify the document language

Puppeteer-generated PDFs often lack proper tagging. Validate the output with tools like axe-pdf or Adobe Acrobat, and repair the tagging in a post-processing step where validation fails.

Tools and Services#

Tool	Type	Best For
Gotenberg	Self-hosted API	Docker-based, wraps LibreOffice and Chromium
DocRaptor	SaaS	High-fidelity output using Prince engine
Prince	Commercial engine	Print-quality CSS, supports CSS Paged Media
React-pdf	Library	React-based PDF generation without a browser
WeasyPrint	Python library	CSS-based PDF generation, lighter than Puppeteer
pdf-lib	JS library	Manipulating existing PDFs (merge, split, stamp)

Gotenberg#

Gotenberg deserves special mention. It wraps Chromium and LibreOffice behind a clean HTTP API, runs in Docker, and handles concurrent requests gracefully. It is often the fastest path to production.

curl --request POST http://localhost:3000/forms/chromium/convert/html \
  --form files=@index.html -o result.pdf
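The same request can be made from a Node.js worker. This is a sketch that mirrors the curl command above, assuming Node 18+ (which provides `fetch`, `FormData`, and `Blob` as globals). `convertHtmlToPdf`, `baseUrl`, and `fetchImpl` are names chosen for this example; `fetchImpl` is injectable only so the function can be exercised without a running Gotenberg instance.

```javascript
// Sends HTML to Gotenberg's Chromium route, mirroring the curl command above.
// baseUrl and fetchImpl are parameters of this sketch, not Gotenberg settings.
async function convertHtmlToPdf(html, { baseUrl = "http://localhost:3000", fetchImpl = fetch } = {}) {
  const form = new FormData();
  // The main document is uploaded under the file name "index.html".
  form.append("files", new Blob([html], { type: "text/html" }), "index.html");

  const response = await fetchImpl(`${baseUrl}/forms/chromium/convert/html`, {
    method: "POST",
    body: form,
  });
  if (!response.ok) {
    throw new Error(`Gotenberg returned HTTP ${response.status}`);
  }
  // The response body is the PDF itself.
  return Buffer.from(await response.arrayBuffer());
}
```

A worker would call `await convertHtmlToPdf(renderedHtml)` and upload the returned buffer to object storage. Treating non-2xx responses as errors lets the job be marked failed and retried.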

Monitoring and Observability#

Track these metrics in production:

Generation latency — p50, p95, p99 per document type
Queue depth — alerts when backlog grows
Failure rate — broken templates, timeout errors, OOM kills
Storage usage — cost tracking and cleanup policies
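To make the latency metrics concrete, here is a minimal sketch that computes p50, p95, and p99 from raw samples using the nearest-rank method. In production a metrics library or monitoring backend would normally do this; `percentile` and `latencySummary` are hypothetical helper names.

```javascript
// Nearest-rank percentile over a list of latency samples (milliseconds).
function percentile(samples, p) {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  // Multiply before dividing so integer inputs avoid floating-point drift.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(0, rank - 1)];
}

function latencySummary(samplesMs) {
  return {
    p50: percentile(samplesMs, 50),
    p95: percentile(samplesMs, 95),
    p99: percentile(samplesMs, 99),
  };
}

console.log(latencySummary([120, 80, 200, 95, 3000]));
// prints: { p50: 120, p95: 3000, p99: 3000 }
```

The example shows why tail percentiles matter here: one slow 3000 ms document barely moves the median but dominates p95 and p99, which is the experience of the users waiting on large reports.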

Key Takeaways#

Use HTML-to-PDF with Puppeteer or Gotenberg for most use cases
Always generate PDFs asynchronously via a job queue
Cache aggressively using content hashing
Plan for accessibility and digital signatures early
Monitor latency and failure rates — PDF generation is resource-intensive

PDF generation architecture is one of those systems that benefits enormously from thoughtful upfront design. Get the queue and caching layers right, and the rest follows.

Want to explore more system design topics? Visit codelit.io for interactive guides and tools.

This is article #216 in the Codelit engineering blog series.
