Event Loop Architecture: How Modern Runtimes Handle Concurrency
A single thread handling thousands of concurrent connections sounds impossible — until you understand the event loop. The event loop is the concurrency model at the heart of Node.js, Python asyncio, Nginx, and Redis. Instead of spawning a thread per connection, it multiplexes I/O events on a single thread, achieving high concurrency without the overhead of context switching.
The Core Idea: Non-Blocking I/O#
Traditional blocking I/O looks like this: a thread calls read(), the kernel puts the thread to sleep until data arrives, and the thread wakes up to process the result. With 10,000 concurrent connections, you need 10,000 threads — each consuming memory for its stack and competing for CPU time during context switches.
Non-blocking I/O flips the model. The thread registers interest in events ("tell me when socket 42 has data") and continues processing other work. When data arrives, the operating system notifies the thread, which then handles the event. One thread can manage thousands of connections because it never blocks.
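This "register interest, react when ready" pattern can be sketched with Python's standard-library selectors module, which wraps epoll/kqueue/select. The echo server below is a hypothetical illustration, not code from any particular runtime:

```python
import selectors
import socket

sel = selectors.DefaultSelector()

def accept(server):
    # The selector told us the listening socket is ready, so this won't block.
    conn, _addr = server.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, echo)

def echo(conn):
    data = conn.recv(4096)       # ready to read, so this returns immediately
    if data:
        conn.sendall(data)       # echo it back
    else:
        sel.unregister(conn)     # empty read: peer closed the connection
        conn.close()

def make_server(port=0):
    server = socket.socket()
    server.bind(("127.0.0.1", port))
    server.listen()
    server.setblocking(False)
    sel.register(server, selectors.EVENT_READ, accept)
    return server.getsockname()[1]   # actual port (0 means "pick one")

def run_once(timeout=None):
    # One loop iteration: block until at least one fd is ready, then
    # dispatch each ready fd to the callback registered for it.
    for key, _mask in sel.select(timeout):
        key.data(key.fileobj)
```

One thread, many sockets: the selector holds the interest list, and the loop only ever touches connections the OS has reported as ready.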
How the Event Loop Works#
┌───────────────────────────────────────────┐
│ Event Loop │
│ │
│ 1. Poll for I/O events (epoll/kqueue) │
│ 2. Execute callbacks for ready events │
│ 3. Process timers │
│ 4. Run microtasks / next-tick callbacks │
│ 5. Check for pending I/O │
│ 6. Go to step 1 │
│ │
└───────────────────────────────────────────┘
Each iteration of the loop is called a tick. The loop polls the OS for ready file descriptors, invokes their callbacks, processes any expired timers, and repeats. As long as callbacks are fast and non-blocking, the loop processes events at high throughput.
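The tick structure can be illustrated with a toy loop: timers live in a min-heap keyed by deadline, and a `sleep` stands in for the OS poll step. This is a teaching sketch, not any real runtime's code:

```python
import heapq
import itertools
import time

timers = []              # min-heap of (deadline, seq, callback)
seq = itertools.count()  # tie-breaker so callbacks are never compared
results = []

def call_later(delay, callback):
    heapq.heappush(timers, (time.monotonic() + delay, next(seq), callback))

def run():
    while timers:
        deadline, _, callback = timers[0]        # peek the soonest deadline
        timeout = max(0.0, deadline - time.monotonic())
        # A real loop would block here in epoll_wait/kqueue with this timeout,
        # running I/O callbacks for any descriptors that become ready first.
        time.sleep(timeout)
        heapq.heappop(timers)
        callback()                               # timer expired: run it
```

Timers fire in deadline order regardless of insertion order, and the poll timeout is always "time until the next deadline", so the loop never wakes up for no reason.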
OS-Level Event Notification#
The event loop relies on the operating system's I/O multiplexing facility:
epoll (Linux)#
epoll maintains an interest list of file descriptors in the kernel. When any descriptor becomes ready, epoll_wait() returns only the ready ones — O(1) per ready event rather than O(n) for all watched descriptors (the old select/poll problem).
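Python exposes the raw interface as select.epoll on Linux (on macOS/BSD the analogue is select.kqueue). A minimal sketch, assuming Linux:

```python
import select
import socket

def wait_for_readable(socks, timeout=1.0):
    # The kernel holds the interest list; poll() returns only the
    # descriptors that are actually ready - O(ready), not O(watched).
    ep = select.epoll()
    by_fd = {s.fileno(): s for s in socks}
    for fd in by_fd:
        ep.register(fd, select.EPOLLIN)   # add fd to the interest list
    ready = [by_fd[fd] for fd, _events in ep.poll(timeout)]
    ep.close()
    return ready
```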
kqueue (macOS, BSD)#
kqueue serves the same role on BSD-derived systems. It supports file descriptors, signals, timers, and file system events in a unified interface. macOS and FreeBSD both use kqueue.
IOCP (Windows)#
Windows uses I/O Completion Ports, a proactor-style model where the OS completes I/O operations and notifies the application when results are ready — slightly different from the reactor model of epoll/kqueue but achieving the same goal.
libuv: Cross-Platform Event Loop#
Node.js abstracts over these OS-specific APIs through libuv, a C library that provides a cross-platform event loop. libuv handles:
- Network I/O — TCP, UDP, pipes, TTY via epoll/kqueue/IOCP.
- File system I/O — file operations are blocking on most OSes, so libuv runs them on a thread pool (default size: 4 threads).
- DNS resolution — also delegated to the thread pool.
- Timers — managed in a min-heap for efficient expiration checks.
- Child processes — spawning and communicating with subprocesses.
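asyncio uses the same trick for blocking file I/O, for example via asyncio.to_thread (Python 3.9+). A sketch; note asyncio's default pool is sized min(32, cpu_count + 4), not libuv's 4:

```python
import asyncio

def read_file(path):
    # Ordinary blocking read - fine on a pool thread, fatal on the loop thread.
    with open(path, "rb") as f:
        return f.read()

async def main(paths):
    # Each read runs on a pool thread; the event loop stays free to
    # serve other callbacks while the threads block in read().
    return await asyncio.gather(*(asyncio.to_thread(read_file, p) for p in paths))
```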
┌──────────────────────────────┐
│ Node.js │
│ JavaScript (V8 engine) │
│ ▲ │
│ │ callbacks │
│ ┌──────┴───────┐ │
│ │ libuv │ │
│ │ event loop │ │
│ │ │ │
│ │ ┌────────┐ │ │
│ │ │ thread │ │ │
│ │ │ pool │ │ │
│ │ └────────┘ │ │
│ └──────────────┘ │
│ epoll / kqueue / IOCP │
└──────────────────────────────┘
Node.js Event Loop Phases#
The Node.js event loop has distinct phases, each with its own callback queue:
- Timers — execute setTimeout and setInterval callbacks whose threshold has elapsed.
- Pending callbacks — execute I/O callbacks deferred from the previous tick.
- Idle/Prepare — internal housekeeping.
- Poll — retrieve new I/O events and execute their callbacks. If the poll queue is empty and no timers are scheduled, the loop blocks here waiting for events.
- Check — execute setImmediate callbacks.
- Close callbacks — execute close handlers (e.g., socket.on('close', ...)).
Between phases, Node.js drains the process.nextTick queue and then the microtask queue (resolved Promises). This is why process.nextTick callbacks fire before any I/O callback in the next phase.
Python asyncio#
Python's asyncio module provides a similar event loop model using coroutines (async/await). Under the hood, asyncio uses selectors — a Python abstraction over epoll, kqueue, or select depending on the platform.
```python
import asyncio

async def fetch(host, path):
    reader, writer = await asyncio.open_connection(host, 443, ssl=True)
    request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    writer.write(request.encode())
    await writer.drain()
    data = await reader.read(4096)
    writer.close()
    await writer.wait_closed()
    return data

async def main():
    results = await asyncio.gather(
        fetch("api.example.com", "/a"),
        fetch("api.example.com", "/b"),
        fetch("api.example.com", "/c"),
    )
    return results

asyncio.run(main())
```
All three fetches run concurrently on a single thread. When one awaits network I/O, the event loop switches to another coroutine that is ready.
uvloop#
uvloop is a drop-in replacement for asyncio's default event loop, built on libuv. It typically delivers 2-4x higher throughput for I/O-bound Python applications.
Thread Pool for CPU-Bound Work#
The event loop excels at I/O-bound concurrency. But CPU-bound work — image processing, JSON parsing of large payloads, cryptographic operations — blocks the loop and starves all other connections.
Solutions:
- Worker threads (Node.js) — the worker_threads module runs JavaScript in separate threads with message passing.
- Thread pool (libuv) — file system operations and DNS already run on the pool. You can offload custom C++ addons there too.
- ProcessPoolExecutor (Python) — asyncio.run_in_executor() delegates CPU-bound work to a process pool, bypassing the GIL.
- Dedicated services — move heavy computation to a separate service behind a queue, keeping the event loop lean.
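The run_in_executor pattern looks like this. The sketch below uses a ThreadPoolExecutor so it stays portable; for work that truly saturates the CPU you would pass a ProcessPoolExecutor instead, which bypasses the GIL:

```python
import asyncio
import concurrent.futures
import hashlib

# CPU-bound helper; kept top-level so a ProcessPoolExecutor could pickle it too.
def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

async def main(payloads):
    loop = asyncio.get_running_loop()
    # Offload each hash to the pool; the event loop stays responsive while
    # pool workers grind through the computation.
    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = [loop.run_in_executor(pool, digest, p) for p in payloads]
        return await asyncio.gather(*futures)
```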
The Actor Model#
The actor model takes a different approach to concurrency. Instead of shared state protected by locks, every actor is an independent unit with its own state and a mailbox. Actors communicate exclusively through asynchronous messages. No shared memory, no locks, no data races.
Erlang/OTP#
Erlang pioneered the actor model in production systems. Each Erlang process (not an OS process) is a lightweight actor with its own heap. The BEAM virtual machine schedules millions of actors across CPU cores with preemptive scheduling. OTP provides supervision trees — if an actor crashes, its supervisor restarts it automatically. This "let it crash" philosophy powers telecom switches, WhatsApp, and RabbitMQ.
Akka (JVM)#
Akka brings the actor model to the JVM. Akka actors are lightweight objects dispatched onto a shared thread pool. Akka supports location transparency — actors communicate the same way whether they are in the same JVM or across a network — enabling distributed systems without rewriting communication logic.
Coroutines#
Coroutines are the language-level primitive that makes event loop programming ergonomic. Unlike OS threads (preemptively scheduled by the kernel), coroutines are cooperatively scheduled — they explicitly yield control at await points.
| Property | OS Thread | Coroutine |
|---|---|---|
| Scheduling | Preemptive (kernel) | Cooperative (runtime) |
| Stack size | 1-8 MB | Kilobytes |
| Creation cost | Microseconds | Nanoseconds |
| Context switch | Expensive (kernel) | Cheap (user-space) |
| Concurrency model | Parallel | Concurrent (single thread) |
Languages with native coroutine support: Python (async/await), JavaScript (async/await), Kotlin (suspend), Go (goroutines — a hybrid with M:N scheduling), Rust (async/await with explicit executor).
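The cooperative hand-off at await points is easy to observe. In this small asyncio sketch, await asyncio.sleep(0) is an explicit yield: each coroutine runs until that point, then the loop switches to the other:

```python
import asyncio

order = []

async def worker(name, steps):
    for i in range(steps):
        order.append((name, i))
        await asyncio.sleep(0)   # explicit yield: hand control back to the loop

async def main():
    # Two coroutines interleave on one thread; neither is ever preempted
    # mid-step - switches happen only at the await.
    await asyncio.gather(worker("a", 2), worker("b", 2))

asyncio.run(main())
```

Remove the await and each worker runs to completion before the other starts: cooperative scheduling means a coroutine that never yields monopolizes the loop.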
Concurrency vs Parallelism#
These terms are often confused:
- Concurrency — dealing with multiple tasks that can make progress in overlapping time periods. An event loop is concurrent: it interleaves I/O operations on a single thread.
- Parallelism — executing multiple tasks simultaneously on multiple CPU cores. A multi-threaded program with four threads on four cores is parallel.
Concurrency is about structure. Parallelism is about execution. An event loop provides concurrency without parallelism. Adding worker threads or processes adds parallelism on top.
Concurrency (event loop): Parallelism (threads):
Task A ──▶ wait ──▶ done Core 1: Task A ──────▶ done
Task B ──▶ wait ──▶ done Core 2: Task B ──────▶ done
Task C ──▶ wait ──▶ done Core 3: Task C ──────▶ done
(interleaved on 1 thread) (simultaneous on 3 cores)
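The left-hand column above can be demonstrated with a quick timing experiment (a sketch; timings are approximate):

```python
import asyncio
import time

async def io_task(delay):
    await asyncio.sleep(delay)   # stand-in for waiting on network I/O

async def main():
    start = time.monotonic()
    # Three 100 ms waits interleaved on one thread: the waits overlap,
    # so the total is roughly 100 ms, not 300 ms.
    await asyncio.gather(io_task(0.1), io_task(0.1), io_task(0.1))
    return time.monotonic() - start

elapsed = asyncio.run(main())
```

No second core is involved: while one task waits, the loop simply resumes another. That is concurrency without parallelism.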
When to Use an Event Loop#
Event loops shine for I/O-bound, high-concurrency workloads:
- Web servers handling thousands of concurrent HTTP connections.
- API gateways and reverse proxies.
- Real-time applications (chat, notifications, live dashboards).
- Database connection proxies.
They struggle with CPU-heavy workloads unless you offload computation to threads or separate processes. For CPU-bound parallelism, traditional multi-threaded or multi-process architectures (or languages like Go with M:N scheduling) may be a better fit.
Understanding event loop architecture is essential for diagnosing performance issues, designing scalable services, and making informed choices about runtime and language for your next project.
That is article #277 on Codelit. Explore the full archive for deep dives on concurrency, distributed systems, algorithms, and software engineering fundamentals. New articles every week.