Module 5: Consumer Groups in Kafka

Chapter 5 • Intermediate

40 min

Consumer Groups in Kafka

Consumer groups are one of Kafka's core mechanisms for scaling message processing and distributing load across multiple consumers. They enable parallel processing while preserving message order within each partition. Delivery is typically at-least-once; effectively-once processing is achievable with the right patterns, such as idempotent processing or transactions.

In this module, you'll learn how consumer groups work, how partitions are assigned, how offsets are managed, and how to scale consumers safely.

🎯 What You Will Learn

By the end of this module, you will be able to:

Explain what a consumer group is and why it exists
Describe how partitions are assigned to consumers in a group
Understand rebalancing, its impact, and different assignment strategies
Manage offsets, consumer lag, and reset policies
Scale consumers horizontally and apply best practices for performance
Recognize common patterns and troubleshoot typical consumer group issues

🔍 What is a Consumer Group?

A Consumer Group is a collection of consumers that work together to consume data from one or more topics.

All consumers in the same group share the load
Each partition of a topic is processed by at most one consumer in that group
Different consumer groups can read the same topic independently

This lets you scale processing while preserving ordering per partition.

🧠 Key Concepts

1. Partition Assignment

Each partition in a topic is assigned to only one consumer within a group at a time
Multiple consumers in the same group cannot read the same partition simultaneously
If you have more consumers than partitions, some consumers will be idle

Rule of thumb:

Parallelism within a group is capped by the partition count: keep consumers ≤ partitions, and run consumers = partitions for maximum parallelism.
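The rule of thumb can be checked with a few lines of arithmetic. This is a minimal sketch, not Kafka API; the function name `active_and_idle` is made up for illustration.

```python
def active_and_idle(partition_count, consumer_count):
    """Return (consumers doing work, consumers sitting idle) for one group."""
    # Each partition has exactly one owner, so at most partition_count consumers are busy.
    active = min(partition_count, consumer_count)
    idle = max(0, consumer_count - partition_count)
    return active, idle


print(active_and_idle(partition_count=3, consumer_count=2))  # (2, 0)
print(active_and_idle(partition_count=3, consumer_count=3))  # (3, 0)
print(active_and_idle(partition_count=3, consumer_count=5))  # (3, 2)
```

Idle consumers are not wasted entirely: they act as hot standbys that can take over a partition if an active consumer fails.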

2. Load Balancing and Rebalancing

Kafka automatically distributes partitions among consumers in a group
When a consumer joins, leaves, or crashes, Kafka triggers a rebalance
During a rebalance:
Consumers temporarily stop processing
Partitions are reassigned
Consumers resume with their new assignments

Rebalancing is necessary for fault tolerance and elasticity, but too frequent rebalancing hurts performance.

3. Offset Management

Offsets track how far each consumer group has progressed in each partition.

Each consumer group maintains its own offsets per partition
Different groups can read the same messages at different speeds
Offsets are typically stored in a special internal topic: __consumer_offsets

Proper offset management is key to avoiding duplicates or data loss.

🏗️ Consumer Group Architecture

Consider a topic with three partitions:


    Topic: user-events (3 partitions)
    ├── Partition 0: [msg1, msg4, msg7, ...]
    ├── Partition 1: [msg2, msg5, msg8, ...]
    └── Partition 2: [msg3, msg6, msg9, ...]

Consumer Group "analytics"


    Consumer Group "analytics":
    ├── Consumer 1 → Partition 0
    ├── Consumer 2 → Partition 1
    └── Consumer 3 → Partition 2

Here, the group processes the topic in parallel, with each consumer handling a different partition.

Consumer Group "notifications"


    Consumer Group "notifications" (reads all partitions independently of "analytics"):
    ├── Consumer A → Partition 0
    ├── Consumer B → Partition 1
    └── Consumer C → Partition 2

Another group can read the same data but perform different processing (e.g., notifications vs analytics).
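The independence of the two groups can be sketched with an in-memory stand-in for a single partition. Nothing here is a Kafka client call; the `poll` helper and the `positions` dictionary are illustrative assumptions.

```python
# In-memory stand-in for partition 0 of "user-events"; not a Kafka client.
partition_log = ["msg1", "msg4", "msg7", "msg10"]

# Each group keeps its own position (the next offset to read) for the partition.
positions = {"analytics": 0, "notifications": 0}


def poll(group, max_records):
    """Return up to max_records for the group and advance only that group's position."""
    start = positions[group]
    records = partition_log[start:start + max_records]
    positions[group] = start + len(records)
    return records


analytics_batch = poll("analytics", 3)
notifications_batch = poll("notifications", 1)

print(analytics_batch)      # ['msg1', 'msg4', 'msg7']
print(notifications_batch)  # ['msg1']
print(positions)            # {'analytics': 3, 'notifications': 1}
```

Both groups see `msg1`, and the faster group does not move the slower group's position.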

🧭 Consumer Group Coordination

Group Coordinator

Kafka designates one broker as the group coordinator for each consumer group. It:

Manages consumer group membership (joins, leaves, and heartbeats)
Coordinates rebalancing and distributes partition assignments to members
Stores and updates consumer group metadata, including committed offsets

Rebalancing Process (High-Level)

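Consumer Joins – A new consumer starts and sends a join request to the coordinator.
Rebalance Trigger – The coordinator detects the membership change (join, leave, or failure).
Stop Processing – Consumers pause and give up partitions (all of them under eager rebalancing; only the partitions that move under cooperative rebalancing).
Partition Assignment – A new assignment is computed. In the classic protocol, one consumer elected as group leader runs the assignment strategy, and the coordinator distributes the result.
Resume Processing – Consumers resume with their new partition assignments.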

Rebalancing is necessary but should be controlled to avoid constant redistributions.
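The process above can be modeled with a toy simulator. This sketch assumes an eager-style rebalance with a simple round-robin split; the class `GroupSimulator` and its methods are illustrative and do not mirror the real group protocol.

```python
class GroupSimulator:
    """Toy model of a consumer group; names are illustrative, not Kafka API."""

    def __init__(self, partitions):
        self.partitions = list(partitions)
        self.members = []
        self.assignment = {}
        self.generation = 0  # bumped on every rebalance

    def _rebalance(self):
        # Eager-style rebalance: drop every assignment, then hand out all partitions again.
        self.generation += 1
        self.assignment = {member: [] for member in self.members}
        for index, partition in enumerate(self.partitions):
            if self.members:
                owner = self.members[index % len(self.members)]
                self.assignment[owner].append(partition)

    def join(self, member):
        self.members.append(member)
        self._rebalance()

    def leave(self, member):
        self.members.remove(member)
        self._rebalance()


group = GroupSimulator(partitions=[0, 1, 2])
group.join("c1")
group.join("c2")
after_join = dict(group.assignment)
group.leave("c1")  # crash or shutdown: c2 takes over every partition

print(after_join)        # {'c1': [0, 2], 'c2': [1]}
print(group.assignment)  # {'c2': [0, 1, 2]}
print(group.generation)  # 3
```

Every membership change costs one rebalance, which is why frequent restarts or flapping consumers show up as repeated processing pauses.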

Rebalancing Strategies

Kafka supports several partition assignment strategies:

Range

Assigns consecutive partitions to consumers (e.g., 0–3 to C1, 4–7 to C2).

Round Robin

Distributes partitions evenly in a round-robin fashion.

Sticky

Tries to minimize partition movement between rebalances.

Cooperative Sticky (Kafka 2.4+)

Performs incremental rebalancing, reducing pauses and improving stability.

Choosing the right strategy can reduce disruption and improve throughput.
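A simplified sketch of the first two strategies shows where they differ. It assumes two hypothetical topics (`orders` and `payments`) and consumers that all subscribe to both; the function names are illustrative, and the real assignors in the Java client (`RangeAssignor`, `RoundRobinAssignor`) handle more cases than this.

```python
def range_assign(topics, consumers):
    """Per topic, give each consumer a contiguous block of partitions."""
    assignment = {c: [] for c in consumers}
    for topic, partition_count in sorted(topics.items()):
        base, extra = divmod(partition_count, len(consumers))
        start = 0
        for index, consumer in enumerate(consumers):
            # The first `extra` consumers get one additional partition.
            size = base + (1 if index < extra else 0)
            assignment[consumer] += [(topic, p) for p in range(start, start + size)]
            start += size
    return assignment


def round_robin_assign(topics, consumers):
    """Deal all topic-partitions out one at a time across consumers."""
    assignment = {c: [] for c in consumers}
    all_partitions = [(t, p) for t, n in sorted(topics.items()) for p in range(n)]
    for index, topic_partition in enumerate(all_partitions):
        assignment[consumers[index % len(consumers)]].append(topic_partition)
    return assignment


topics = {"orders": 3, "payments": 3}
consumers = ["c1", "c2"]
range_counts = {c: len(tps) for c, tps in range_assign(topics, consumers).items()}
rr_counts = {c: len(tps) for c, tps in round_robin_assign(topics, consumers).items()}

print(range_counts)  # {'c1': 4, 'c2': 2}
print(rr_counts)     # {'c1': 3, 'c2': 3}
```

Because the range strategy works topic by topic, the first consumer picks up the extra partition of every topic, so imbalance grows with the number of topics. Round robin spreads the total evenly in this scenario.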

📍 Offset Management

Offsets tell Kafka:

“For this consumer group, up to which message in this partition have we successfully processed?”

Offset Storage

Stored in Kafka’s internal topic: __consumer_offsets
Each consumer group has its own offset entries
Enables independent processing per group

Commit Strategies

Automatic Commit

properties.js

      enable.auto.commit=true
      auto.commit.interval.ms=5000

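The consumer commits offsets automatically as part of its poll loop, at most once per interval
Simple, but commits are decoupled from processing: a crash after processing but before the next commit causes duplicates on restart, and an offset committed before processing finishes (for example, when records are handed to another thread) can cause lost messages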

Manual Commit (Recommended for control)

Your application decides when to commit, usually after successful processing.

Synchronous Commit – Blocking, safer, simpler error handling
Asynchronous Commit – Non-blocking, higher throughput, requires careful error handling
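Committing after processing gives at-least-once behavior, which the following simulation illustrates. It is a sketch with an in-memory log rather than a real client (in the Java client the corresponding calls are `commitSync()` and `commitAsync()`); `run_consumer` and its `crash_after` parameter are hypothetical.

```python
def run_consumer(log, committed, crash_after=None):
    """Process records from the committed offset, committing after each one.

    crash_after simulates a crash right after processing that offset
    but before its commit. Returns (processed records, committed offset).
    """
    processed = []
    offset = committed
    while offset < len(log):
        processed.append(log[offset])    # the processing side effect happens here
        if crash_after == offset:
            return processed, committed  # crash: this offset was never committed
        committed = offset + 1           # commit the NEXT offset to read
        offset += 1
    return processed, committed


log = ["a", "b", "c", "d"]
first_run, committed = run_consumer(log, committed=0, crash_after=1)
second_run, committed = run_consumer(log, committed=committed)

print(first_run)   # ['a', 'b']
print(second_run)  # ['b', 'c', 'd']
print(committed)   # 4
```

Record `b` is processed twice and nothing is skipped. This is why at-least-once consumers usually need idempotent processing.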

Offset Reset Policies

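`earliest` – Start from the oldest message still retained in the partition if no offset is found
`latest` – Start from the end of the partition and read only messages produced afterwards (the default)
`none` – Fail if no offset is found (forces explicit offset handling)

These policies apply only when the group has no committed offset for a partition (for example, on first start) or when the committed offset is out of range, such as after retention has deleted the data.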
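The decision the consumer makes at startup can be sketched as a small function. This is an illustration of the rules above under simplified assumptions, not client code; `resolve_start_offset` and `NoOffsetForPartition` are hypothetical names.

```python
class NoOffsetForPartition(Exception):
    """Raised by this sketch when the policy is 'none' and no valid offset exists."""


def resolve_start_offset(committed, log_start, log_end, policy):
    """Pick the offset to start reading from for one partition."""
    # A committed offset inside the retained range is always used as-is.
    if committed is not None and log_start <= committed <= log_end:
        return committed
    if policy == "earliest":
        return log_start
    if policy == "latest":
        return log_end
    if policy == "none":
        raise NoOffsetForPartition("no valid committed offset")
    raise ValueError(f"unknown policy: {policy}")


print(resolve_start_offset(None, 100, 250, "earliest"))  # 100
print(resolve_start_offset(None, 100, 250, "latest"))    # 250
print(resolve_start_offset(180, 100, 250, "latest"))     # 180 (policy ignored)
print(resolve_start_offset(50, 100, 250, "earliest"))    # 100 (offset 50 was deleted)
```

The reset policy is a fallback only: once a group has a valid committed offset, changing `auto.offset.reset` has no effect on where it resumes.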

⏱️ Consumer Lag and Monitoring

What is Consumer Lag?

Lag = Log end offset of the partition − Consumer group's committed offset (measured per partition)
High Lag: The consumer is falling behind the producers
Zero (or low) Lag: The consumer is keeping up with real-time data

Lag is one of the most important metrics in Kafka systems.
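The lag formula is applied per partition and then summed or ranked. The offsets below are made-up sample values, and `compute_lag` is an illustrative helper rather than part of any monitoring tool.

```python
def compute_lag(log_end_offsets, committed_offsets):
    """Return lag per partition: log end offset minus committed offset."""
    return {
        partition: end - committed_offsets[partition]
        for partition, end in log_end_offsets.items()
    }


# Sample values for a three-partition topic (assumed, for illustration only).
log_end = {0: 1500, 1: 1480, 2: 1510}
committed = {0: 1500, 1: 1200, 2: 1505}

lag = compute_lag(log_end, committed)
total_lag = sum(lag.values())
worst_partition = max(lag, key=lag.get)

print(lag)              # {0: 0, 1: 280, 2: 5}
print(total_lag)        # 285
print(worst_partition)  # 1
```

Looking at lag per partition matters: a healthy total can hide one partition that is far behind, which often points to a hot key or a slow consumer.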

Monitoring Consumer Groups

Common ways to monitor consumer lag and health:

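kafka-consumer-groups.sh – Command-line tool shipped with Kafka; --describe shows current offset, log end offset, and lag per partition
CMAK (formerly Kafka Manager) – Web-based cluster management tool
Confluent Control Center – Enterprise monitoring
JMX Metrics – Built-in client and broker metrics, including consumer lag (for example, records-lag-max)
Grafana / Prometheus – Popular combo for dashboards and alerts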

Set alerts for high lag to detect slow consumers or insufficient capacity.

📈 Scaling Consumers Horizontally

Adding More Consumers to a Group

Scale Up – Start more consumer instances with the same group.id.
Partition Limit – Each partition is consumed by only one member of the group, so consumers beyond the partition count sit idle.
Rebalance – Kafka automatically reassigns partitions across consumers.
Result – Improved throughput and better fault tolerance (if one consumer fails, others take over).

Best Practices

Plan partition count with future scaling in mind.
Try to keep consumer count ≤ partition count.
Avoid frequent restarts/rescaling that cause constant rebalancing.
Always monitor consumer lag and rebalance frequency.

🌍 Real-World Examples

Example 1: E-commerce Analytics


    Topic: user-events (12 partitions)
    
    Analytics Group (4 consumers):
    ├── Consumer 1 → Partitions 0, 1, 2
    ├── Consumer 2 → Partitions 3, 4, 5
    ├── Consumer 3 → Partitions 6, 7, 8
    └── Consumer 4 → Partitions 9, 10, 11
    
    Throughput: 100,000 events/second
    Latency: < 100ms

This group processes user behavior in parallel, keeping up with high event volumes.
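The sizing in Example 1 can be sanity-checked with simple arithmetic. In this sketch the target throughput and partition count come from the example, while the per-consumer rate of 25,000 events/second is an assumed benchmark figure; `consumers_needed` is a hypothetical helper.

```python
import math


def consumers_needed(target_events_per_sec, per_consumer_events_per_sec, partition_count):
    """Estimate group size; never plan more consumers than partitions."""
    needed = math.ceil(target_events_per_sec / per_consumer_events_per_sec)
    if needed > partition_count:
        raise ValueError(
            f"need {needed} consumers but only {partition_count} partitions; "
            "add partitions first"
        )
    return needed


# 100,000 events/second and 12 partitions come from Example 1;
# 25,000 events/second per consumer is an assumption.
group_size = consumers_needed(100_000, 25_000, partition_count=12)
print(group_size)  # 4
```

With 12 partitions, the group could grow to 12 consumers before hitting the partition limit, which leaves headroom for traffic growth.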

Example 2: Real-time Notifications


    Topic: notifications (6 partitions)
    
    Notification Group (3 consumers):
    ├── Consumer A → Partitions 0, 1
    ├── Consumer B → Partitions 2, 3
    └── Consumer C → Partitions 4, 5
    
    Processing: Email, SMS, Push notifications
    Latency: < 50ms

Here, the group ensures that notification events are processed quickly and independently.

⚙️ Consumer Group Configuration

Key Configuration Parameters

properties.js

    # Group Settings
    group.id=my-consumer-group
    group.instance.id=consumer-1
    
    # Session Management
    session.timeout.ms=30000
    heartbeat.interval.ms=3000
    
    # Offset Management
    enable.auto.commit=true
    auto.commit.interval.ms=5000
    auto.offset.reset=latest
    
    # Fetch Settings
    fetch.min.bytes=1
    fetch.max.wait.ms=500
    max.partition.fetch.bytes=1048576

Performance Tuning Tips

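Session Timeout (session.timeout.ms) – Lower values detect failed consumers sooner; higher values tolerate brief pauses without triggering a rebalance
Heartbeat Interval (heartbeat.interval.ms) – Keep it well below the session timeout (typically no more than one third) so that several heartbeats can be missed before the consumer is considered dead
Fetch Size (fetch.min.bytes, fetch.max.wait.ms, max.partition.fetch.bytes) – Larger fetches improve throughput at the cost of latency and memory
Commit Frequency – Frequent commits reduce reprocessing after a failure but add overhead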
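A small check can catch a risky heartbeat-to-session ratio before deployment. This sketch parses properties-style text and applies the one-third guideline; `parse_properties` and `check_timeouts` are illustrative helpers, not part of Kafka.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip()
    return config


def check_timeouts(config):
    """Return warnings for heartbeat/session settings that look risky."""
    warnings = []
    session_ms = int(config["session.timeout.ms"])
    heartbeat_ms = int(config["heartbeat.interval.ms"])
    if heartbeat_ms * 3 > session_ms:
        warnings.append(
            f"heartbeat.interval.ms={heartbeat_ms} is more than 1/3 of "
            f"session.timeout.ms={session_ms}"
        )
    return warnings


module_config = parse_properties("""
# Session Management
session.timeout.ms=30000
heartbeat.interval.ms=3000
""")
# Same settings with a heartbeat interval that is too long (assumed bad value).
risky_config = {**module_config, "heartbeat.interval.ms": "15000"}

print(check_timeouts(module_config))  # []
print(check_timeouts(risky_config))
```

The values from this module (30000 and 3000) pass the check; the modified copy produces one warning.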

🧱 Common Patterns with Consumer Groups

Pattern 1: Fan-out Processing

Multiple consumer groups process the same data
Each group has different processing logic (e.g., analytics, notifications, billing)
Independent scaling and fault tolerance for each group

Pattern 2: Pipeline Processing

Sequential processing through multiple topics
Each stage has its own consumer group
Enables complex data transformations and streaming ETL

Pattern 3: Microservices Integration

Each microservice has its own consumer group
Decoupled processing and scaling
Independent deployment and monitoring per service

🛠️ Troubleshooting Consumer Groups

Common Issues

Consumer Lag – Consumers are falling behind.
Frequent Rebalancing – Constant partition movement, causing pauses.
Offset Management Problems – Lost or duplicate messages.
Uneven Partition Assignment – Some consumers do much more work than others.

Solutions

Scale Consumers – Add more consumers or partitions to reduce lag.
Optimize Processing – Improve consumer processing speed (DB, I/O, logic).
Tune Configuration – Adjust session timeouts, fetch sizes, and commit strategies; if slow processing triggers rebalances, raise max.poll.interval.ms or lower max.poll.records.
Monitor Metrics – Track consumer lag, rebalancing, and error rates.

✅ Key Takeaways

A consumer group is how Kafka scales message processing horizontally while preserving partition ordering.
Partition assignment ensures only one consumer per partition within a group.
Kafka handles rebalancing, but too many rebalances can hurt performance.
Offsets and commit strategies determine your guarantees around duplicates and data loss.
Consumer lag is the main health signal for consumer groups.
Consumer groups power patterns like fan-out, pipelines, and microservice integration.

📚 What’s Next?

After understanding consumer groups conceptually, the next step is to work with real code:

Build Kafka consumers, run them as a group, observe rebalancing and lag, and experiment with offset commit strategies.

That’s where you’ll see these concepts turn into hands-on experience.

Hands-on Examples

Consumer Group Load Balancing Demo

    # Consumer Group Load Balancing Example
    
    ## Scenario: E-commerce Event Processing
    Topic: "order-events" (6 partitions)
    Consumer Group: "order-processors"
    
    ## Initial Setup (3 consumers):
    ┌─────────────────────────────────────────────────────────┐
    │                Consumer Group: order-processors         │
    ├─────────────────────────────────────────────────────────┤
    │  Consumer 1    Consumer 2    Consumer 3                 │
    │      │             │             │                      │
    │   Partition 0   Partition 2   Partition 4               │
    │   Partition 1   Partition 3   Partition 5               │
    └─────────────────────────────────────────────────────────┘
    
    ## Adding Consumer 4 (Rebalancing):
    ┌─────────────────────────────────────────────────────────┐
    │                Consumer Group: order-processors         │
    ├─────────────────────────────────────────────────────────┤
    │  Consumer 1    Consumer 2    Consumer 3    Consumer 4   │
    │      │             │             │             │        │
    │   Partition 0   Partition 1   Partition 3   Partition 5 │
    │   Partition 2   Partition 4                             │
    └─────────────────────────────────────────────────────────┘
    
    ## Processing Flow:
    1. Order created → Topic "order-events"
    2. Message routed to partition (hash(order_id) % 6)
    3. Consumer processes message from assigned partition
    4. Offset committed after successful processing
    5. Next message processed from same partition
    
    ## Benefits:
    - Parallel processing across partitions
    - Automatic load balancing
    - Fault tolerance through automatic partition reassignment
    - Independent scaling of consumers

This example shows how consumer groups automatically distribute partitions among consumers, enabling parallel processing and fault tolerance.
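Step 2 of the processing flow, routing by `hash(order_id) % 6`, is what keeps all events for one order in the same partition and therefore in order. The sketch below uses `zlib.crc32` as a stand-in hash, which is an assumption for illustration: Kafka's default partitioner uses a murmur2 hash of the serialized key, so real partition numbers will differ. The order IDs are made up.

```python
import zlib

PARTITION_COUNT = 6  # matches the "order-events" topic in the demo


def partition_for(order_id):
    """Map a key to a partition; the same key always lands on the same partition."""
    # crc32 is a stand-in; Kafka's default partitioner uses murmur2.
    return zlib.crc32(order_id.encode("utf-8")) % PARTITION_COUNT


events = [
    ("order-1001", "created"),
    ("order-1002", "created"),
    ("order-1001", "paid"),
    ("order-1001", "shipped"),
]

partitions_by_order = {}
for order_id, event_type in events:
    partitions_by_order.setdefault(order_id, set()).add(partition_for(order_id))

# Each order maps to exactly one partition, so its events stay in order.
print(partitions_by_order)
```

Because one partition has one consumer in the group, all events for `order-1001` are also handled by the same consumer until the next rebalance.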


Related Tutorials

Previous: Module 4: Kafka Architecture (Deep Dive)

Next: Module 6: Kafka Setup & Hands-On