
Collaborative Editing System Design: Real-Time Co-Authoring at Scale

March 28, 2026 · 7 min read · By Codelit Team

Collaborative Editing System Design#

Real-time collaborative editing is one of the hardest distributed systems problems disguised as a simple product feature. When two people type in the same document simultaneously, the system must resolve conflicts, maintain consistency, and feel instant — all at once.

Why It's Hard#

Single-user editing is trivial: one writer, one document, no conflicts. Add a second user and everything changes:

Concurrency — two users edit the same paragraph at the same time
Latency — network round-trips mean each user sees a stale version
Ordering — operations arrive in different orders at different clients
Intent preservation — the merged result should reflect what both users meant

The core challenge: eventual consistency without losing anyone's work.

High-Level Architecture#

┌─────────┐     WebSocket      ┌──────────────┐     ┌───────────┐
│ Client A │◄──────────────────►│  Sync Server │◄───►│ Database  │
└─────────┘                    │  (Stateful)  │     │ (Versions)│
┌─────────┐     WebSocket      │              │     └───────────┘
│ Client B │◄──────────────────►│              │
└─────────┘                    └──────────────┘

Key components:

Rich text editor — renders the document and captures user operations
Sync server — receives operations, resolves ordering, broadcasts to peers
Persistence layer — stores document snapshots and operation history
Presence service — tracks cursors, selections, and online users

OT vs CRDT: The Two Approaches#

Operational Transformation (OT)#

OT predates Google Docs: it comes out of groupware research from the late 1980s, and Google Docs is its best-known production use. Each edit is an operation (insert, delete, retain). When concurrent operations arrive, the server transforms one against the other so both can be applied in sequence.

User A: insert("X", pos=3)
User B: delete(pos=1)

Server transforms A's op against B's:
  B deleted before pos 3, so A's insert shifts to pos 2
  Result: insert("X", pos=2)
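The transformation above can be sketched in a few lines. This is a minimal illustration covering only the insert-versus-delete case from the example; the `Insert` and `Delete` types and the function names are hypothetical, and a real OT engine would need a transform for every pair of operation types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    text: str
    pos: int


@dataclass(frozen=True)
class Delete:
    pos: int  # removes the single character at this index


def transform_insert_against_delete(ins: Insert, dele: Delete) -> Insert:
    """Rewrite `ins` so it can be applied after `dele` has already run."""
    if dele.pos < ins.pos:
        # A character before the insertion point is gone, so shift left by one.
        return Insert(ins.text, ins.pos - 1)
    return ins


def apply_op(doc: str, op) -> str:
    if isinstance(op, Insert):
        return doc[:op.pos] + op.text + doc[op.pos:]
    return doc[:op.pos] + doc[op.pos + 1:]


doc = "abcdef"
a = Insert("X", pos=3)
b = Delete(pos=1)

# Server order: B first, then A transformed against B.
via_server = apply_op(apply_op(doc, b), transform_insert_against_delete(a, b))

# A client that applied A first, then B, must end up with the same text.
via_client = apply_op(apply_op(doc, a), b)
```

Both paths produce the same document, which is the property the transform exists to protect.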

Strengths:

Battle-tested (Google Docs, Etherpad)
Smaller payloads per operation
Centralized server simplifies conflict resolution

Weaknesses:

Server is a single point of ordering — hard to decentralize
Transform functions grow complex with rich-text formatting
N-way transformation is notoriously error-prone

Conflict-Free Replicated Data Types (CRDT)#

CRDTs assign each character a unique, ordered ID so operations commute naturally — no transformation needed. Libraries like Yjs and Automerge implement this.

User A's "X" gets ID (A, seq=7) between IDs (_, seq=5) and (_, seq=6)
User B's "Y" gets ID (B, seq=4) between the same IDs
→ Deterministic ordering by ID resolves the conflict automatically
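To make the idea of ordering by ID concrete, here is a toy sketch. It assumes each character carries a fractional position plus a replica name as a tie-breaker, and that merging is plain set union. This scheme is chosen for brevity and is not how Yjs or Automerge represent documents internally; deletion and tombstones are not modelled.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Char:
    position: Fraction  # where the character sits between its neighbours
    replica: str        # tie-breaker when two replicas pick the same position
    value: str


def insert_between(left: Fraction, right: Fraction, replica: str, value: str) -> Char:
    """Create a character positioned midway between two existing positions."""
    return Char((left + right) / 2, replica, value)


def render(chars) -> str:
    # Sorting by (position, replica) is the order every replica agrees on.
    ordered = sorted(chars, key=lambda c: (c.position, c.replica))
    return "".join(c.value for c in ordered)


# Shared starting state: "ab", with 'a' at position 1 and 'b' at position 2.
base = {Char(Fraction(1), "init", "a"), Char(Fraction(2), "init", "b")}

# Both users insert between 'a' and 'b' at the same time.
x = insert_between(Fraction(1), Fraction(2), "A", "X")
y = insert_between(Fraction(1), Fraction(2), "B", "Y")

# Merging is set union, so the order in which updates arrive does not matter.
replica_1 = base | {x} | {y}
replica_2 = base | {y} | {x}
```

Rendering either replica gives the same string, because the sort key depends only on the IDs and not on arrival order.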

Strengths:

No central server required — works peer-to-peer
Mathematically guaranteed convergence
Naturally supports offline editing

Weaknesses:

Higher memory overhead (tombstones for deleted characters)
Document size grows over time without garbage collection
Debugging merged states is harder

Which to Choose?#

Factor	OT	CRDT
Central server available	Yes	Optional
Offline-first required	Difficult	Natural fit
Rich-text complexity	Manageable	Growing ecosystem
Proven in production	Yes (Google Docs)	Yes (Figma uses a CRDT-inspired design)

For most new projects, Yjs (CRDT) offers the best developer experience with solid performance.

Conflict Resolution in Practice#

True conflicts are rarer than you might think. Most edits happen in different parts of the document. When they do collide:

Same position insert — deterministic tie-breaking by user ID
Concurrent delete and edit — delete wins (the text no longer exists)
Formatting conflicts — last-writer-wins per attribute (bold, italic, etc.)
Structural conflicts — e.g., two users moving the same block. Requires application-level policies

The key insight: conflict resolution must be deterministic and identical on every client.
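As one illustration of a deterministic rule, the sketch below implements last-writer-wins per formatting attribute. It assumes every update carries a logical timestamp and a client ID, with the client ID breaking timestamp ties; the class and method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FormatState:
    # attribute name -> (timestamp, client_id, value)
    attrs: dict = field(default_factory=dict)

    def set_attr(self, name, value, timestamp, client_id):
        current = self.attrs.get(name)
        # Compare (timestamp, client_id) only; client_id breaks timestamp ties.
        if current is None or (timestamp, client_id) > current[:2]:
            self.attrs[name] = (timestamp, client_id, value)

    def value(self, name):
        entry = self.attrs.get(name)
        return entry[2] if entry else None


updates = [
    ("bold", True, 10, "A"),
    ("bold", False, 10, "B"),  # same timestamp: "B" > "A", so this one wins
    ("italic", True, 7, "A"),
]

# Two clients receive the same updates in opposite orders.
client_1, client_2 = FormatState(), FormatState()
for update in updates:
    client_1.set_attr(*update)
for update in reversed(updates):
    client_2.set_attr(*update)
```

Because the winner depends only on the `(timestamp, client_id)` pair, both clients settle on the same formatting regardless of delivery order.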

Cursor Presence and Awareness#

Users expect to see collaborators' cursors, selections, and names in real-time.

Implementation:

Each client broadcasts cursor position on every change
Presence data is ephemeral — stored in memory, not persisted
Updates are throttled (at most one every 50-100 ms) to avoid flooding
Cursor positions reference document-relative IDs, not character offsets, so they survive concurrent edits
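The throttling described above might look like the following sketch. It assumes the caller supplies the current time (which keeps the example deterministic) and a `send` callable that stands in for the real transport; `ThrottledPresence` and the `anchor` field are illustrative names.

```python
class ThrottledPresence:
    """Send at most one cursor update per `interval` seconds, keeping the latest."""

    def __init__(self, send, interval=0.05):
        self.send = send        # callable that transmits a cursor payload
        self.interval = interval
        self.last_sent = None   # time of the last transmission
        self.pending = None     # newest cursor not yet transmitted

    def update(self, cursor, now):
        if self.last_sent is None or now - self.last_sent >= self.interval:
            self.send(cursor)
            self.last_sent = now
            self.pending = None
        else:
            self.pending = cursor  # overwrite: only the latest position matters

    def flush(self, now):
        """Call from a timer so a trailing update is not lost."""
        if self.pending is not None and now - self.last_sent >= self.interval:
            self.send(self.pending)
            self.last_sent = now
            self.pending = None


sent = []
presence = ThrottledPresence(sent.append, interval=0.05)
presence.update({"anchor": "id-1"}, now=0.00)  # sent immediately
presence.update({"anchor": "id-2"}, now=0.01)  # held back
presence.update({"anchor": "id-3"}, now=0.02)  # replaces id-2
presence.flush(now=0.06)                       # sends id-3
```

Intermediate positions are dropped on purpose: collaborators only need to see where the cursor ended up.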

Awareness features beyond cursors:

User avatars in the toolbar
"User X is viewing Section 3" indicators
Typing indicators per paragraph

WebSocket Sync Protocol#

HTTP polling is too slow for real-time editing. WebSockets provide the persistent, bidirectional channel needed.

A typical sync protocol:

1. Client connects → sends auth token + document ID
2. Server sends current document state (snapshot)
3. Client applies snapshot, enters "synced" state
4. On local edit:
   a. Apply operation locally (optimistic)
   b. Send operation to server
   c. Server assigns sequence number
   d. Server broadcasts to all other clients
5. On receiving remote operation:
   a. Transform against pending local operations (OT)
      — or merge via CRDT
   b. Apply to local document
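Steps 4c and 4d are the heart of the server. The sketch below is an in-memory stand-in with plain lists playing the role of WebSocket connections; `SyncServer` and its methods are hypothetical names, and authentication, persistence, and transformation are left out.

```python
class SyncServer:
    """In-memory stand-in for steps 4c and 4d: sequence, then broadcast."""

    def __init__(self):
        self.next_seq = 1
        self.log = []      # ordered (seq, client_id, op) records
        self.clients = {}  # client_id -> inbox list standing in for a socket

    def connect(self, client_id):
        self.clients[client_id] = []
        return list(self.log)  # stands in for the snapshot sent in step 2

    def submit(self, client_id, op):
        seq = self.next_seq
        self.next_seq += 1
        record = (seq, client_id, op)
        self.log.append(record)
        for other_id, inbox in self.clients.items():
            if other_id != client_id:  # the sender already applied it locally
                inbox.append(record)
        return seq  # acknowledgement for the sender


server = SyncServer()
server.connect("A")
server.connect("B")
ack_1 = server.submit("A", "insert X")
ack_2 = server.submit("B", "delete 1")
```

The single counter is what gives every operation a global order; it is also why the server in this design is stateful.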

Latency targets: operations should appear on remote clients in under 100 ms within the same region and under 300 ms cross-region.

Version History#

Users need to browse, compare, and restore previous versions.

Design considerations:

Store operation log — replay operations to reconstruct any point in time
Periodic snapshots — avoid replaying thousands of operations from the beginning
Snapshot every N operations or every M minutes
Named versions — let users manually save checkpoints ("Draft v2")
Diff view — highlight what changed between two versions using the operation log

Storage strategy:

Operations table:  doc_id | seq | user_id | op_data | timestamp
Snapshots table:   doc_id | seq | snapshot_blob | timestamp

Compact operations older than 30 days by folding them into snapshots. This gives up operation-level replay for that period in exchange for lower storage cost.
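Reconstructing a past version from the two tables can be sketched as below. The function assumes snapshots and operations are already loaded as lists sorted by `seq`, that a seq-0 snapshot of the empty document always exists, and that the caller supplies an `apply_op` function; all of these are illustrative choices.

```python
def document_at(target_seq, snapshots, operations, apply_op):
    """Rebuild the document as of `target_seq`.

    snapshots:  list of (seq, content), including a seq-0 entry for the empty document
    operations: list of (seq, op) sorted by seq
    apply_op:   function (content, op) -> content
    """
    # Start from the latest snapshot taken at or before the target.
    base_seq, content = max(
        (s for s in snapshots if s[0] <= target_seq), key=lambda s: s[0]
    )
    # Replay only the operations between that snapshot and the target.
    for seq, op in operations:
        if base_seq < seq <= target_seq:
            content = apply_op(content, op)
    return content


def append(content, op):
    # Toy operation: each op appends one character.
    return content + op


snapshots = [(0, ""), (3, "abc")]
operations = [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e")]

version_2 = document_at(2, snapshots, operations, append)
version_5 = document_at(5, snapshots, operations, append)
```

Here `version_5` starts from the seq-3 snapshot and replays two operations instead of five, which is the saving periodic snapshots are meant to provide.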

Permission Model#

Collaborative documents need fine-grained access control:

Owner — full control, can delete document
Editor — can edit content
Commenter — can add comments and suggestions
Viewer — read-only access

Implementation patterns:

Store permissions per document in a separate ACL table
Check permissions on WebSocket connect and on every operation
Share links with embedded tokens for frictionless access
Organization-level defaults ("everyone at Acme Corp can edit")
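A per-operation permission check could be as small as the sketch below. It assumes the four roles form a strict hierarchy and that the ACL is available as a dictionary keyed by document and user; the action names are illustrative and not taken from any particular product.

```python
from enum import Enum


class Role(Enum):
    VIEWER = 1
    COMMENTER = 2
    EDITOR = 3
    OWNER = 4


# Minimum role needed for each action (action names are illustrative).
REQUIRED_ROLE = {
    "read": Role.VIEWER,
    "comment": Role.COMMENTER,
    "edit": Role.EDITOR,
    "delete_document": Role.OWNER,
}


def is_allowed(acl, doc_id, user_id, action):
    """acl maps (doc_id, user_id) -> Role; a missing entry means no access."""
    role = acl.get((doc_id, user_id))
    if role is None:
        return False
    return role.value >= REQUIRED_ROLE[action].value


acl = {
    ("doc-1", "alice"): Role.OWNER,
    ("doc-1", "bob"): Role.COMMENTER,
}

bob_can_comment = is_allowed(acl, "doc-1", "bob", "comment")
bob_can_edit = is_allowed(acl, "doc-1", "bob", "edit")
```

Running the same check on connect and on every incoming operation means a role change takes effect without waiting for the user to reconnect.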

For suggestion mode (like Google Docs "Suggesting"):

Track suggested edits as pending operations tied to a user
Owner or editor accepts/rejects each suggestion
Rejected suggestions are discarded; accepted ones become real operations

Offline Editing#

Offline support is where CRDTs shine. The approach:

Cache the document locally (IndexedDB or SQLite)
Queue operations while offline
On reconnect, sync queued operations with the server
CRDT properties guarantee convergence without special handling
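The queue-and-replay steps above can be sketched as follows. `OfflineQueue` is a hypothetical helper that keeps its queue in memory; a real client would persist it in the local cache so queued edits survive a restart.

```python
class OfflineQueue:
    def __init__(self, send):
        self.send = send     # callable that delivers one operation to the server
        self.online = False
        self.queued = []

    def submit(self, op):
        if self.online:
            self.send(op)
        else:
            self.queued.append(op)  # a real client would also persist this locally

    def reconnect(self):
        self.online = True
        while self.queued:
            self.send(self.queued.pop(0))  # preserve the original edit order


delivered = []
queue = OfflineQueue(delivered.append)
queue.submit("op1")  # offline: queued
queue.submit("op2")  # offline: queued
queue.reconnect()    # flushes op1, then op2
queue.submit("op3")  # online: sent straight away
```

With a CRDT the flushed operations merge as they are; with OT each one would first have to be transformed against everything the server accepted in the meantime.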

For OT-based systems, offline is harder:

Queued operations must be transformed against all server operations that happened while offline
Long offline periods create large transformation chains
Risk of surprising merges increases with time apart

Scaling Considerations#

Horizontal scaling — shard by document ID. Each document lives on one sync server at a time
Server affinity — use consistent hashing to route WebSocket connections for the same document to the same server
Large documents — split into blocks/pages that sync independently
Hot documents — a document with 100 concurrent editors needs dedicated capacity. Monitor and auto-scale
Storage — operation logs grow fast. Compress, compact, and archive aggressively
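Routing by document ID with consistent hashing might look like this sketch. The server names, the number of virtual nodes, and the choice of SHA-256 are assumptions made for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    return int(hashlib.sha256(key.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, servers, vnodes=64):
        # Each server gets several virtual nodes to even out the distribution.
        self.ring = sorted(
            (_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self.keys = [h for h, _ in self.ring]

    def server_for(self, doc_id: str) -> str:
        # First virtual node clockwise from the document's hash, wrapping around.
        index = bisect.bisect_right(self.keys, _hash(doc_id)) % len(self.ring)
        return self.ring[index][1]


ring = HashRing(["sync-1", "sync-2", "sync-3"])
docs = [f"doc-{i}" for i in range(200)]
before = {d: ring.server_for(d) for d in docs}

# Remove one server: only the documents it owned should move.
smaller = HashRing(["sync-1", "sync-2"])
after = {d: smaller.server_for(d) for d in docs}
```

Every connection for a given document resolves to the same server, and removing `sync-3` leaves documents on the other two servers where they were.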

Technology Choices#

Component	Options
CRDT library	Yjs, Automerge, Diamond Types
Editor framework	ProseMirror, TipTap, Slate, Lexical
Transport	WebSocket, WebRTC (peer-to-peer)
Persistence	PostgreSQL, Redis (presence), S3 (snapshots)
Auth	JWT tokens validated on WebSocket handshake

Key Takeaways#

OT in practice relies on a central server to order operations; CRDTs do not. Choose based on your architecture
Yjs + TipTap is the most practical stack for new collaborative editors
Presence is separate from document sync — treat it as ephemeral, high-frequency data
Version history = operation log + periodic snapshots
Offline editing is natural with CRDTs, painful with OT
Shard by document ID for horizontal scaling

Ready to design systems like this in interviews and on the job? codelit.io gives you the tools to practice system design interactively — from collaborative editors to distributed databases.

This is article #211 in the Codelit engineering blog series.


