# Message Queues Explained — Kafka, RabbitMQ, SQS, and When to Use Each
## Why message queues exist
Service A needs to tell Service B something happened. The simplest approach: A calls B directly. This works until B is slow, down, or overwhelmed.
A message queue decouples them: A publishes a message, the queue stores it, and B processes it when ready. If B crashes, the message waits. If B is slow, messages queue up instead of failing.
## The three models
### Point-to-point (task queue)
One producer, one consumer (or a pool of competing workers). Each message is delivered to exactly one of them and processed once.
Use for: Background jobs, email sending, image processing. Any work that needs to be done once.
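The point-to-point model can be sketched in a few lines with Python's standard library (in-memory only; a real deployment would use one of the brokers below). The key property: workers compete for messages, and each message goes to exactly one of them.

```python
import queue
import threading

# A minimal in-memory task queue: one producer, two competing workers.
tasks = queue.Queue()
results = []
lock = threading.Lock()

def worker(worker_id):
    while True:
        task = tasks.get()
        if task is None:          # sentinel: shut this worker down
            tasks.task_done()
            return
        with lock:
            results.append((worker_id, task))
        tasks.task_done()

threads = [threading.Thread(target=worker, args=(i,)) for i in range(2)]
for t in threads:
    t.start()

for job in ["send-email", "resize-image", "generate-report"]:
    tasks.put(job)
for _ in threads:
    tasks.put(None)               # one sentinel per worker
for t in threads:
    t.join()

# Every job was processed exactly once, by some worker.
assert sorted(job for _, job in results) == ["generate-report", "resize-image", "send-email"]
```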
### Pub/sub (fan-out)
One producer, multiple consumers. Each consumer gets a copy of every message.
Use for: Event notifications, real-time updates, analytics pipelines. When multiple services need to react to the same event.
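Contrast this with the task queue above: instead of competing for messages, every subscriber gets its own copy. A minimal sketch of the fan-out semantics:

```python
# Minimal pub/sub fan-out: every subscriber receives a copy of each event.
class Topic:
    def __init__(self):
        self.subscribers = []

    def subscribe(self, handler):
        self.subscribers.append(handler)

    def publish(self, event):
        for handler in self.subscribers:   # each consumer reacts independently
            handler(event)

notifications, analytics = [], []
orders = Topic()
orders.subscribe(notifications.append)
orders.subscribe(analytics.append)

orders.publish({"order_id": 1, "total": 42})

# Both consumers saw the same event.
assert notifications == analytics == [{"order_id": 1, "total": 42}]
```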
### Streaming (log)
Messages are stored in an ordered, append-only log. Consumers read at their own pace and can replay from any point.
Use for: Event sourcing, change data capture, real-time analytics. When you need message ordering and replay capability.
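What makes the log model different is that consuming doesn't delete: each consumer tracks its own offset, so a new consumer can replay history from offset zero. A sketch:

```python
# Append-only log: consumers track their own offsets and can replay.
class Log:
    def __init__(self):
        self.entries = []

    def append(self, event):
        self.entries.append(event)
        return len(self.entries) - 1      # offset of the new entry

    def read(self, offset):
        return self.entries[offset:]      # everything from `offset` onward

log = Log()
for event in ["created", "paid", "shipped"]:
    log.append(event)

# Two independent consumers: one replays from the start, one catches up.
assert log.read(0) == ["created", "paid", "shipped"]   # full replay
assert log.read(2) == ["shipped"]                       # resume from offset 2
```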
## The tools
### Apache Kafka
What it is: A distributed event streaming platform. Messages are stored in partitioned topics as an immutable log.
Strengths:
- Massive throughput (millions of messages/second)
- Message replay — consumers can re-read old messages
- Ordered within partitions
- Multi-consumer — many consumers can independently read the same topic
Weaknesses:
- Complex to operate (ZooKeeper/KRaft, partition management)
- Not great for task queues (consumers commit offsets rather than acknowledging individual messages)
- Overkill for simple use cases
Best for: Event streaming, real-time analytics, change data capture, microservice event buses at scale.
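The "ordered within partitions" guarantee works because messages with the same key always hash to the same partition, so one consumer reads them in order. A sketch of that routing logic (Kafka's default partitioner uses murmur2; `crc32` here is an illustrative stand-in):

```python
import zlib

NUM_PARTITIONS = 3
partitions = {p: [] for p in range(NUM_PARTITIONS)}

def produce(key, value):
    # Same key -> same partition -> those messages stay ordered
    # relative to each other.
    p = zlib.crc32(key.encode()) % NUM_PARTITIONS
    partitions[p].append((key, value))

for i in range(3):
    produce("user-42", f"event-{i}")
produce("user-7", "other-event")

# All of user-42's events landed in one partition, in production order.
p = zlib.crc32(b"user-42") % NUM_PARTITIONS
assert [v for k, v in partitions[p] if k == "user-42"] == ["event-0", "event-1", "event-2"]
```

There is no ordering guarantee *across* partitions, which is why choosing a good key (user ID, order ID) matters.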
### RabbitMQ
What it is: A traditional message broker supporting the AMQP protocol with rich routing.
Strengths:
- Flexible routing (direct, fanout, topic, headers exchanges)
- Per-message acknowledgment and retry
- Dead letter queues built-in
- Easier to operate than Kafka
Weaknesses:
- Lower throughput than Kafka
- Messages are deleted after consumption (no replay)
- Can become a bottleneck at very high scale
Best for: Task queues, RPC patterns, complex routing rules, when you need per-message reliability.
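The "flexible routing" point is easiest to see with topic exchanges, where routing keys are matched against patterns: `*` matches exactly one dot-separated word, `#` matches zero or more. A self-contained sketch of that matching rule:

```python
def topic_matches(pattern, routing_key):
    """AMQP-style topic match: '*' = exactly one word, '#' = zero or more words."""
    def match(p, k):
        if not p:
            return not k
        if p[0] == "#":
            # '#' can absorb any number of remaining words, including none.
            return any(match(p[1:], k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        return (p[0] == "*" or p[0] == k[0]) and match(p[1:], k[1:])
    return match(pattern.split("."), routing_key.split("."))

assert topic_matches("logs.*.error", "logs.auth.error")         # '*' = one word
assert not topic_matches("logs.*.error", "logs.auth.db.error")  # too many words
assert topic_matches("logs.#", "logs.auth.db.error")            # '#' = any depth
```

A queue bound with `logs.#` sees everything under `logs`, while a queue bound with `logs.*.error` sees only single-service error streams; the broker does the filtering.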
### AWS SQS
What it is: Fully managed message queue service. Zero infrastructure to manage.
Strengths:
- Zero ops — no servers, no clusters, no tuning
- Scales automatically to very high throughput (standard queues are effectively unbounded; FIFO queues have lower limits)
- Dead letter queues, FIFO ordering available
- Dirt cheap at low-to-medium volume
Weaknesses:
- 256KB message size limit
- At-least-once delivery on standard queues (consumers must handle duplicates)
- Limited to AWS ecosystem
- No message replay
Best for: Serverless architectures, AWS-native apps, when you don't want to manage infrastructure.
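The at-least-once behavior comes from SQS's visibility timeout: a received message becomes invisible for a window, and if the consumer doesn't delete it before the window closes (e.g. it crashed), the message reappears. A simulation with a logical clock, not the real SQS API:

```python
# Simulated SQS-style queue with a visibility timeout.
class VisibilityQueue:
    def __init__(self, visibility_timeout):
        self.timeout = visibility_timeout
        self.messages = {}            # id -> (body, invisible_until)
        self.clock = 0                # logical clock standing in for wall time
        self.next_id = 0

    def send(self, body):
        self.messages[self.next_id] = (body, 0)
        self.next_id += 1

    def receive(self):
        for mid, (body, until) in self.messages.items():
            if until <= self.clock:
                # Hide the message; it reappears if not deleted in time.
                self.messages[mid] = (body, self.clock + self.timeout)
                return mid, body
        return None

    def delete(self, mid):
        self.messages.pop(mid)

q = VisibilityQueue(visibility_timeout=30)
q.send("charge-card")

mid, body = q.receive()            # first delivery
assert q.receive() is None         # invisible while being processed
q.clock += 31                      # consumer crashed; timeout elapses
assert q.receive() == (mid, body)  # redelivered: at-least-once
```

This is exactly why the idempotent-consumer pattern below is non-negotiable with SQS.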
### Redis Streams
What it is: Lightweight stream data structure in Redis with consumer groups.
Strengths:
- Sub-millisecond latency
- Consumer groups with acknowledgment
- Already in your stack if you use Redis
- Simple to set up
Weaknesses:
- Persistence depends on Redis config (not as durable as Kafka)
- Limited tooling compared to Kafka/RabbitMQ
- Memory-bound
Best for: Low-latency event processing, real-time features, when you already have Redis.
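Consumer groups are the piece that makes Redis Streams usable as more than a firehose: entries delivered to a group sit on a pending list until explicitly acknowledged, so unacked work survives a consumer crash. A sketch of those semantics (mirroring `XADD` / `XREADGROUP` / `XACK`, not the redis-py API):

```python
# Sketch of Redis Streams consumer-group semantics.
class Stream:
    def __init__(self):
        self.entries = []                 # (entry_id, data)
        self.delivered = 0                # group's last-delivered position
        self.pending = {}                 # entry_id -> data, awaiting ack

    def add(self, data):                  # like XADD
        entry_id = f"0-{len(self.entries) + 1}"
        self.entries.append((entry_id, data))
        return entry_id

    def read_group(self):                 # like XREADGROUP with '>'
        if self.delivered >= len(self.entries):
            return None
        entry_id, data = self.entries[self.delivered]
        self.delivered += 1
        self.pending[entry_id] = data     # tracked until acknowledged
        return entry_id, data

    def ack(self, entry_id):              # like XACK
        self.pending.pop(entry_id)

s = Stream()
eid = s.add({"temp": 21})
assert s.read_group() == (eid, {"temp": 21})
assert eid in s.pending        # unacked work is recoverable after a crash
s.ack(eid)
assert not s.pending
```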
## The decision framework
| Requirement | Best choice |
|---|---|
| Simple background jobs | SQS or RabbitMQ |
| Complex routing rules | RabbitMQ |
| Event streaming at scale | Kafka |
| Serverless / zero-ops | SQS |
| Message replay needed | Kafka |
| Already using Redis | Redis Streams |
| Per-message reliability | RabbitMQ |
| Multi-consumer fan-out | Kafka or SNS+SQS |
## Common patterns
Outbox pattern: Write the event to an "outbox" table in the same database transaction as the business data. A separate process reads the outbox and publishes to the queue. Guarantees at-least-once delivery without distributed transactions.
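A minimal outbox sketch using SQLite (table names and the `place_order` helper are illustrative, not a library API). The point is that both inserts share one transaction:

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL);
    CREATE TABLE outbox (id INTEGER PRIMARY KEY, payload TEXT, published INTEGER DEFAULT 0);
""")

def place_order(total):
    with db:  # one atomic transaction: both rows commit, or neither does
        cur = db.execute("INSERT INTO orders (total) VALUES (?)", (total,))
        db.execute(
            "INSERT INTO outbox (payload) VALUES (?)",
            (json.dumps({"event": "order_placed", "order_id": cur.lastrowid}),),
        )

def relay(publish):
    # A separate process polls the outbox and publishes unpublished rows.
    rows = db.execute("SELECT id, payload FROM outbox WHERE published = 0").fetchall()
    for row_id, payload in rows:
        publish(json.loads(payload))
        db.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    db.commit()

published = []
place_order(99.0)
relay(published.append)
assert published == [{"event": "order_placed", "order_id": 1}]
```

If the relay crashes between publishing and marking the row, the event is published again on the next poll, which is where at-least-once delivery comes from.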
Dead letter queue: Messages that fail processing N times get moved to a separate queue for manual inspection. Essential for debugging production issues.
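The retry-then-park logic is simple to sketch (in-memory lists stand in for real queues; brokers like RabbitMQ and SQS implement this for you via a redrive/DLQ policy):

```python
MAX_RETRIES = 3

main_queue = [{"body": "bad-payload", "retries": 0}]
dead_letter_queue = []

def process(message):
    raise ValueError("can't parse payload")   # simulated poison message

while main_queue:
    msg = main_queue.pop(0)
    try:
        process(msg)
    except ValueError:
        msg["retries"] += 1
        if msg["retries"] >= MAX_RETRIES:
            dead_letter_queue.append(msg)     # park it for manual inspection
        else:
            main_queue.append(msg)            # redeliver for another attempt

# The poison message stopped blocking the queue after MAX_RETRIES attempts.
assert dead_letter_queue == [{"body": "bad-payload", "retries": 3}]
```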
Idempotent consumers: Since most queues deliver at-least-once, every consumer must handle duplicates gracefully. Use a deduplication ID.
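The dedup check itself is a few lines; the hard part in production is storing the seen-ID set somewhere durable (a database table or a Redis set with a TTL), which this sketch glosses over:

```python
processed_ids = set()   # in production: a DB table or Redis set with a TTL
charges = []

def handle(message):
    # Deduplicate on a stable message ID so redeliveries are no-ops.
    if message["id"] in processed_ids:
        return
    processed_ids.add(message["id"])
    charges.append(message["amount"])

msg = {"id": "m-1", "amount": 50}
handle(msg)
handle(msg)            # duplicate delivery from an at-least-once queue

assert charges == [50]  # the customer was charged exactly once
```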
## See queues in your architecture
On Codelit, generate any system with async processing and you'll see message queues connecting services. Click the queue node to audit throughput, failure handling, and scaling characteristics.
Explore message queue architectures: describe your system on Codelit.io and see how async processing flows between services.