Message Queues Deep Dive — RabbitMQ vs Kafka vs SQS with Routing and Delivery Guarantees

DodaTech Updated Jun 20, 2026 6 min read

Message queues enable asynchronous communication between distributed services by buffering messages between producers and consumers, providing decoupling, load leveling, and fault tolerance. This deep dive covers routing strategies, delivery guarantees, consumer group architecture, backpressure handling, and dead letter queues across RabbitMQ, Kafka, SQS, and Google Pub/Sub.

Why Message Queues Matter at Scale

LinkedIn processes over 7 trillion messages per day through Kafka. Every Uber trip generates dozens of events through a message bus. Without queues, a spike in orders would directly overwhelm downstream databases. Queues buffer the spike, letting consumers process at their own pace. At DodaTech, message queue patterns power background file processing in DodaZIP and real-time threat analysis in Durga Antivirus Pro.

Broker Architecture

    graph TD
    P1[Order Service] -->|Publish| Exchange{Exchange}
    P2[Payment Service] -->|Publish| Exchange
    Exchange -->|Routing key match| Q1[Queue: orders]
    Exchange -->|Topic match| Q2[Queue: payments]
    Exchange -->|Fanout broadcast| Q3[Queue: analytics]
    Q1 --> C1[Consumer: Inventory]
    Q2 --> C2[Consumer: Fraud Detection]
    Q3 --> C3[Consumer: Data Warehouse]
    subgraph DLQ[Dead Letter Handling]
        Q1 -->|Nack/expired| DLQ1[DLQ: orders]
        C1 -->|Retry exhausted| DLQ1
    end
    style Exchange fill:#e67e22,color:#fff
    style Q1 fill:#3498db,color:#fff
    style Q2 fill:#27ae60,color:#fff
    style Q3 fill:#9b59b6,color:#fff
    style DLQ1 fill:#e74c3c,color:#fff

RabbitMQ vs Kafka vs SQS vs Pub/Sub

Feature	RabbitMQ	Kafka	SQS	Pub/Sub
Model	Message broker	Event log	Managed queue	Managed pub/sub
Delivery	Push to consumer	Pull (consumer poll)	Pull (long poll)	Push + pull
Persistence	Configurable	Configurable retention	Up to 14 days	Up to 7 days
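Ordering	Per queue	Per partition	FIFO queues only (per message group)	Per ordering key (when enabled)
Throughput	~10K-50K msgs/s	1M+ msgs/s	Nearly unlimited (standard); capped for FIFO	1M+ msgs/s
Routing	Exchanges (direct, topic, fanout, headers)	Topic-based	Single queue	Topic + filter
Operations	Self-hosted or managed service	Self-hosted or managed service	Fully managed	Fully managed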

Routing Strategies

RabbitMQ exchanges provide four routing strategies:

Direct exchange — a message is routed to queues whose binding key exactly matches the routing key. Used for task distribution.

Topic exchange — routing keys support wildcards: * matches one word, # matches zero or more. Example: order.*.completed matches order.us.completed.

Fanout exchange — broadcasts every message to all bound queues regardless of routing key. Used for event notifications.

Headers exchange — routes based on message header attributes instead of routing key.

# RabbitMQ topic exchange with Python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

# Declare topic exchange
channel.exchange_declare(exchange='order_events', exchange_type='topic')

# Bind queues with routing patterns
channel.queue_declare(queue='us_orders', durable=True)
channel.queue_bind(queue='us_orders', exchange='order_events', routing_key='order.us.*')

channel.queue_declare(queue='eu_orders', durable=True)
channel.queue_bind(queue='eu_orders', exchange='order_events', routing_key='order.eu.*')

# Publish with routing key
channel.basic_publish(
    exchange='order_events',
    routing_key='order.us.completed',
    body='{"order_id": 1234, "status": "completed"}',
    properties=pika.BasicProperties(delivery_mode=2)
)
print("Message published with routing key: order.us.completed")
connection.close()
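
The wildcard rules for topic exchanges can be checked without a broker. The function below is a minimal sketch of the matching semantics described above (`*` for exactly one word, `#` for zero or more words); `topic_matches` is a hypothetical helper written for illustration, not part of pika or RabbitMQ.

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """Return True if a topic-exchange binding pattern matches a routing key."""
    pattern_words = pattern.split(".")
    key_words = routing_key.split(".") if routing_key else []

    def match(p: int, k: int) -> bool:
        if p == len(pattern_words):
            # Pattern exhausted: match only if the key is exhausted too.
            return k == len(key_words)
        word = pattern_words[p]
        if word == "#":
            # '#' may absorb zero or more words; try every possible split.
            return any(match(p + 1, nxt) for nxt in range(k, len(key_words) + 1))
        if k < len(key_words) and (word == "*" or word == key_words[k]):
            # '*' consumes exactly one word; a literal must match exactly.
            return match(p + 1, k + 1)
        return False

    return match(0, 0)


print(topic_matches("order.*.completed", "order.us.completed"))       # True
print(topic_matches("order.*.completed", "order.us.east.completed"))  # False
print(topic_matches("order.#", "order"))                              # True
```

The second call fails because `*` cannot span two words; a binding of `order.#` would be needed to catch keys of any depth.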

Delivery Guarantees

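At-most-once: the message is delivered once and never retried. If the consumer fails before finishing, the message is lost. Highest throughput, lowest reliability.

At-least-once: the broker redelivers until the consumer acknowledges. Duplicates are possible, so the consumer must be idempotent.

Exactly-once: each message takes effect exactly once. No broker can guarantee this on its own; it requires coordination between producer, broker, and consumer. In Kafka, the idempotent producer removes duplicates caused by producer retries, and the transactional API extends this to atomic read-process-write pipelines within Kafka. Side effects outside Kafka (a database write, an email) still need idempotent handling.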

# Kafka idempotent producer: producer retries cannot create duplicate records
# (needs a kafka-python release that supports enable_idempotence)
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers=['localhost:9092'],
    enable_idempotence=True,           # Broker de-duplicates producer retries
    acks='all',                        # Wait for all in-sync replicas
    retries=3,
)
producer.send('orders', value=b'{"order_id": 1234}')
producer.flush()
print("Message sent with idempotent delivery to the broker")
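
On the consumer side, at-least-once delivery is usually made safe by de-duplicating on a message ID. The following is a minimal in-memory sketch of that idea; the `PaymentLedger` class and the `message_id` and `amount_cents` fields are assumptions for illustration, and a real system would keep the processed IDs in durable storage, ideally updated in the same transaction as the side effect.

```python
class PaymentLedger:
    """Applies each payment message at most once, keyed by message ID."""

    def __init__(self) -> None:
        self.processed_ids = set()  # IDs of messages already applied
        self.total = 0

    def handle(self, message: dict) -> bool:
        """Return True if the message was applied, False if it was a duplicate."""
        message_id = message["message_id"]
        if message_id in self.processed_ids:
            return False  # Redelivery: skip the side effect
        self.total += message["amount_cents"]
        self.processed_ids.add(message_id)
        return True


ledger = PaymentLedger()
delivery = {"message_id": "msg-1", "amount_cents": 500}
ledger.handle(delivery)
ledger.handle(delivery)  # Simulated redelivery of the same message
print(ledger.total)      # 500
```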

Consumer Groups and Partition Assignment

Kafka partitions are the unit of parallelism. Within a consumer group, each partition is assigned to exactly one consumer, while one consumer may own several partitions. Adding consumers increases throughput only up to the partition count; any consumers beyond that sit idle.

from kafka import KafkaConsumer

def process_order(payload):
    # Placeholder for application logic (hypothetical; replace with real handling)
    print(f"Processing {payload!r}")

consumer = KafkaConsumer(
    'orders',
    bootstrap_servers=['localhost:9092'],
    group_id='order-processor',
    enable_auto_commit=False,          # Manual offset management
    auto_offset_reset='earliest',
)

for message in consumer:
    print(f"Partition: {message.partition}, Offset: {message.offset}, Value: {message.value}")
    process_order(message.value)
    consumer.commit()                  # Commit only after successful processing (at-least-once)
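
To see why consumers beyond the partition count add nothing, it helps to simulate an assignment. This is a simplified round-robin sketch, not Kafka's actual assignor implementation; `assign_partitions` is a hypothetical helper.

```python
def assign_partitions(partitions: list, consumers: list) -> dict:
    """Spread partitions over consumers round-robin; each partition gets one owner."""
    if not consumers:
        raise ValueError("a consumer group needs at least one member")
    members = sorted(consumers)
    assignment = {member: [] for member in members}
    for index, partition in enumerate(sorted(partitions)):
        owner = members[index % len(members)]
        assignment[owner].append(partition)
    return assignment


print(assign_partitions([0, 1, 2], ["c1", "c2"]))
# {'c1': [0, 2], 'c2': [1]}
print(assign_partitions([0, 1, 2], ["c1", "c2", "c3", "c4"]))
# {'c1': [0], 'c2': [1], 'c3': [2], 'c4': []}
```

With three partitions and four consumers, `c4` receives nothing, which is the idle-consumer case.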

Backpressure and Prefetch

Consumer overwhelm is a common failure mode. If consumers can’t keep up, queues grow unboundedly, consuming memory and increasing latency.

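Consumer-side prefetch: limit how many unacknowledged messages a consumer can hold. In RabbitMQ, basic_qos(prefetch_count=1) gives a consumer one message at a time, which spreads work most evenly but costs throughput; a larger value such as 10 keeps the consumer busy while still bounding in-flight work.

Kafka consumer max.poll.records (max_poll_records in kafka-python): caps the number of records returned per poll, preventing batch overload.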

# RabbitMQ prefetch to prevent consumer overwhelm
channel.basic_qos(prefetch_count=10)  # Max 10 unacked messages
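
The effect of a prefetch limit can be modelled in a few lines. The class below is a toy sketch of the broker's view of a single consumer, assuming delivery stops once the number of unacknowledged messages reaches the limit; `PrefetchWindow` is a hypothetical name and does not mirror RabbitMQ internals.

```python
from collections import deque


class PrefetchWindow:
    """Toy broker-side view of one consumer with a prefetch limit."""

    def __init__(self, prefetch_count: int) -> None:
        self.prefetch_count = prefetch_count
        self.ready = deque()   # Messages waiting in the queue
        self.unacked = set()   # Delivered but not yet acknowledged

    def enqueue(self, message_id: str) -> None:
        self.ready.append(message_id)

    def deliver(self) -> list:
        """Deliver messages until the unacked window is full."""
        delivered = []
        while self.ready and len(self.unacked) < self.prefetch_count:
            message_id = self.ready.popleft()
            self.unacked.add(message_id)
            delivered.append(message_id)
        return delivered

    def ack(self, message_id: str) -> None:
        self.unacked.discard(message_id)


window = PrefetchWindow(prefetch_count=2)
for i in range(5):
    window.enqueue(f"m{i}")
print(window.deliver())  # ['m0', 'm1']
print(window.deliver())  # [] because the window is full
window.ack("m0")
print(window.deliver())  # ['m2']
```

Messages that are not delivered stay in the queue, where other consumers can take them, instead of piling up in one slow consumer's memory.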

Dead Letter Queues

Messages that can’t be processed after exhausting retries move to a dead letter queue for manual inspection. This prevents poison messages from blocking the main queue.

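# RabbitMQ dead letter configuration
# Assumes `pika` and `channel` from the routing example above
channel.exchange_declare(exchange='main', exchange_type='direct')
channel.exchange_declare(exchange='dlx', exchange_type='direct')

channel.queue_declare(queue='processing_queue', arguments={
    'x-dead-letter-exchange': 'dlx',
    'x-dead-letter-routing-key': 'dead',
    'x-message-ttl': 300000,              # Unconsumed messages dead-letter after 5 minutes
})
# Routing key 'process' is illustrative; the original snippet left this queue unbound
channel.queue_bind(queue='processing_queue', exchange='main', routing_key='process')

channel.queue_declare(queue='dead_letter_queue')
channel.queue_bind(queue='dead_letter_queue', exchange='dlx', routing_key='dead')

MAX_RETRIES = 3

# Consumer with retry logic (process_message is the application's own handler)
def process_with_retry(ch, method, properties, body):
    try:
        process_message(body)
        ch.basic_ack(delivery_tag=method.delivery_tag)
    except Exception:
        headers = dict(properties.headers or {})
        retry_count = int(headers.get('retry-count', 0))
        if retry_count < MAX_RETRIES:
            # A plain requeue cannot change headers, so republish a copy with
            # an incremented retry count and ack the original delivery.
            headers['retry-count'] = retry_count + 1
            ch.basic_publish(
                exchange='main',
                routing_key='process',
                body=body,
                properties=pika.BasicProperties(headers=headers, delivery_mode=2),
            )
            ch.basic_ack(delivery_tag=method.delivery_tag)
        else:
            # Rejecting without requeue dead-letters the message immediately via the DLX
            ch.basic_nack(delivery_tag=method.delivery_tag, requeue=False)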

Common Errors

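No message persistence: When RabbitMQ restarts, non-persistent messages are lost. Set delivery_mode=2 on critical messages and declare their queues durable, since a persistent message in a non-durable queue still disappears with the queue. In Kafka, use acks=all on the producer together with min.insync.replicas of at least 2, so an acknowledged write survives the loss of one broker.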
Infinite retry loops: A consumer retrying a bad message forever blocks the queue. Always implement a retry budget (e.g., 3 attempts) and route failures to a DLQ.
Consumer lag blindness: Without monitoring consumer lag (Kafka) or queue depth (RabbitMQ), a growing backlog silently causes hours-long delays. Alert on threshold breaches.
Single partition bottleneck: Kafka with one partition serializes all consumption. Partition count determines parallelism. Size your partition count for peak throughput.
Not handling rebalancing: When a Kafka consumer joins or leaves a group, partitions are reassigned. Your consumer must handle on_partitions_revoked gracefully (commit offsets, stop processing).
No retry with backoff: Immediate retries on transient failures (database deadlock, network blip) often fail again. Implement exponential backoff or scheduled retry queues.
Forgetting idempotency: At-least-once delivery means duplicates. Every consumer should be idempotent — processing the same message twice produces the same result.
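
The consumer lag item above comes down to simple arithmetic: lag per partition is the log end offset minus the group's committed offset. The sketch below works on plain dictionaries so it runs without a cluster; the function names and the choice to treat a partition with no committed offset as offset 0 are assumptions for illustration.

```python
def partition_lag(end_offsets: dict, committed_offsets: dict) -> dict:
    """Lag per partition = log end offset minus last committed offset."""
    return {
        partition: end - committed_offsets.get(partition, 0)
        for partition, end in end_offsets.items()
    }


def partitions_over_threshold(lag: dict, threshold: int) -> list:
    """Partitions whose backlog is large enough to alert on."""
    return sorted(p for p, behind in lag.items() if behind > threshold)


end_offsets = {0: 1500, 1: 900, 2: 40}
committed = {0: 1450, 1: 100}          # Partition 2 has no commit yet
lag = partition_lag(end_offsets, committed)
print(lag)                                  # {0: 50, 1: 800, 2: 40}
print(partitions_over_threshold(lag, 500))  # [1]
```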

Practice Questions

1. What is the difference between a message queue and an event stream?

Message queues (RabbitMQ, SQS) deliver each message to one consumer and remove it after acknowledgment. Event streams (Kafka, Kinesis) persist messages and allow multiple independent consumer groups.
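
The difference is easy to see in a toy model. In this sketch a queue removes a message once it is received (acknowledgment is folded into `receive` for brevity), while a log keeps every record and tracks a separate read position per consumer group. `WorkQueue` and `EventLog` are hypothetical classes, not real client APIs.

```python
from collections import deque


class WorkQueue:
    """Queue semantics: each message goes to one consumer and is then gone."""

    def __init__(self) -> None:
        self._messages = deque()

    def publish(self, message: str) -> None:
        self._messages.append(message)

    def receive(self):
        return self._messages.popleft() if self._messages else None


class EventLog:
    """Stream semantics: records persist; each group keeps its own offset."""

    def __init__(self) -> None:
        self._records = []
        self._offsets = {}  # group name -> next offset to read

    def append(self, record: str) -> None:
        self._records.append(record)

    def poll(self, group: str):
        offset = self._offsets.get(group, 0)
        if offset >= len(self._records):
            return None
        self._offsets[group] = offset + 1
        return self._records[offset]


queue = WorkQueue()
queue.publish("order-1")
print(queue.receive())        # order-1
print(queue.receive())        # None

log = EventLog()
log.append("order-1")
print(log.poll("billing"))    # order-1
print(log.poll("analytics"))  # order-1
print(log.poll("billing"))    # None
```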

2. How does Kafka achieve high throughput?

Sequential disk I/O (messages are append-only logs), zero-copy data transfer, batching in producer and consumer, and partitioning for parallel processing.

3. What is a dead letter queue and when should you use it?

A DLQ stores messages that could not be processed after exhausting retries. Use it to prevent poison messages from blocking main queues while preserving them for debugging.

4. Challenge: Design a retry queue with exponential backoff.

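Create a secondary “retry” queue with a message TTL and a dead-letter exchange that points back to the main queue. A failed message is published to the retry queue, waits there until its TTL expires, and is then dead-lettered back to the main queue for another attempt. Increase the delay exponentially per attempt (1s, 4s, 16s, 64s), and move the message to the DLQ after 5 attempts. In RabbitMQ, use one retry queue per delay tier: per-message TTLs only take effect when a message reaches the head of its queue, so mixing delays in one queue lets a long delay hold up shorter ones.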
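
The schedule in this answer can be sketched as two small helpers: one that computes the delay for a given attempt, and one that builds the RabbitMQ queue arguments for a delay tier. The function names, the constants, and the `main` exchange and `process` routing key are assumptions for illustration; the argument keys are the standard RabbitMQ queue arguments.

```python
MAX_ATTEMPTS = 5       # Total tries before the message goes to the DLQ
BASE_DELAY_MS = 1000   # First retry waits 1 second
BACKOFF_FACTOR = 4     # Each retry waits 4x longer than the previous one


def retry_delay_ms(failed_attempts: int):
    """Delay before the next try, or None once the retry budget is spent."""
    if failed_attempts >= MAX_ATTEMPTS:
        return None
    return BASE_DELAY_MS * BACKOFF_FACTOR ** (failed_attempts - 1)


def retry_queue_arguments(delay_ms: int) -> dict:
    """Arguments for a retry queue that returns messages after delay_ms."""
    return {
        "x-message-ttl": delay_ms,
        "x-dead-letter-exchange": "main",        # Hypothetical main exchange
        "x-dead-letter-routing-key": "process",  # Hypothetical routing key
    }


for attempt in range(1, 6):
    print(attempt, retry_delay_ms(attempt))
# 1 1000
# 2 4000
# 3 16000
# 4 64000
# 5 None
```

Each distinct delay would get its own queue, declared with `retry_queue_arguments(delay)`, so that messages in a tier all expire in order.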

Mini Project

Build a multi-service order processing pipeline:

import json

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

# Declare exchanges and queues
channel.exchange_declare(exchange='orders', exchange_type='topic')
channel.queue_declare(queue='inventory', durable=True)
channel.queue_declare(queue='payments', durable=True)
channel.queue_declare(queue='notifications', durable=True)
channel.queue_declare(queue='orders_dlq', durable=True)

# Bind queues
channel.queue_bind(queue='inventory', exchange='orders', routing_key='order.*.created')
channel.queue_bind(queue='payments', exchange='orders', routing_key='order.*.created')
channel.queue_bind(queue='notifications', exchange='orders', routing_key='order.*.*')

def place_order(order_id, user_id, amount, region):
    message = json.dumps({
        "order_id": order_id, "user_id": user_id,
        "amount": amount, "region": region
    })
    routing_key = f"order.{region}.created"
    channel.basic_publish(exchange='orders', routing_key=routing_key, body=message,
                          properties=pika.BasicProperties(delivery_mode=2))
    print(f"Published: {routing_key} -> {order_id}")

place_order("ORD-001", "USR-42", 99.99, "us")
place_order("ORD-002", "USR-7", 149.99, "eu")
# View RabbitMQ management UI at http://localhost:15672
connection.close()

Expected output:

Published: order.us.created -> ORD-001
Published: order.eu.created -> ORD-002
