Kafka architecture—a deep dive

Kafka offset

Apache Kafka® is a streaming data platform and a distributed event store. Under the hood, Kafka’s architecture divides messages in a topic into partitions to allow parallel processing. Within a partition, Kafka identifies each message through the message’s offset. Offset is a continuously increasing identifier that represents the order of a message from the beginning of the partition.

This article dives deep into the concept of Kafka offset and explains how it facilitates its highly scalable and robust distributed architecture.

Summary of key Kafka offset concepts

Concept | Description
Kafka offset | Kafka offsets are identifiers of messages within a Kafka partition. They represent the order of a message from the beginning of a partition.
Log-end offset | The offset of the last message present in a Kafka partition.
Committed offset | The offset of the last message the consumer processed within a partition.
Consumer lag | The difference between the log-end offset and the committed offset.
Scalability | Dividing data into partitions and using offsets to track progress is crucial to enabling parallel processing and scalability in Kafka architecture.
Message ordering | Offsets help consumers process messages in the same order they arrive in.
Fault tolerance | Kafka offsets enable fault tolerance by allowing failed consumers to restart from the last message they processed.

What is a Kafka offset?

Kafka’s operational structure consists of producers, consumers, and the broker.

  • Producers are applications that produce messages to Kafka topics.
  • The Kafka broker receives messages from producers and stores them durably.
  • Consumers are applications that read the data in Kafka topics.

A Kafka topic is a logical group of data that deals with a specific business objective. Topics are divided into partitions to aid parallelism. Kafka keeps track of the messages in a partition using an identifier called the offset. Every message in a partition has its own unique offset.

Internally, Kafka writes messages to logs, and the Kafka offset represents the position of a message within that partition’s log. It indicates how far the message is from the beginning of the partition log.
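As a rough illustration of this idea, the sketch below models one partition as an append-only list in which a message's offset is simply its position. The PartitionLog class and the message values are hypothetical and not part of Kafka; real partition logs live on disk in segments, and their offsets can contain gaps (for example, after log compaction).

```python
class PartitionLog:
    """Toy in-memory stand-in for one partition's append-only log (not Kafka's API)."""

    def __init__(self):
        self._records = []

    def append(self, value):
        """Append a record and return the offset assigned to it."""
        offset = len(self._records)  # offsets start at 0 and grow by 1 per record
        self._records.append(value)
        return offset

    def read(self, offset):
        """Return the record stored at the given offset."""
        return self._records[offset]


log = PartitionLog()
offsets = [log.append(v) for v in ("order-created", "order-paid", "order-shipped")]
print(offsets)      # [0, 1, 2]
print(log.read(1))  # order-paid
```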

Kafka offset

Why are Kafka offsets necessary?

Multiple producers can write messages to a Kafka topic simultaneously, and multiple consumers can process messages from a topic in parallel. A consumer group is a set of consumers that consume from the same topic. Within a consumer group, Kafka ensures that a partition is assigned to only one consumer at a time. This spares Kafka the complexity of sharing messages within a single partition among multiple consumers.

The only remaining problem is ensuring the consumers get reliable data from the assigned partitions. This is where offsets come into the picture.

Kafka uses offsets to track messages from the initial write to final processing. Thanks to offsets, Kafka ensures consumers fetch messages from a partition in the same order producers wrote them. Your system can also recover gracefully from failures and continue from the point of failure.

While every message has its own offset, specific terms are used to track key offsets in the stream.

  • Log-end offset is the offset of the last message present in a Kafka partition.
  • High watermark offset is the point in the log up to which all messages are synced to partition replicas.
  • Committed offset indicates the last message a consumer successfully processed.

Below, we give some examples of ways that Kafka uses offsets.

Managing partition replication

Once a message arrives in a topic, it takes some time to sync with all the replicas of that partition. When a consumer fetches messages, the broker only returns messages up to the high watermark offset. This helps Kafka ensure that consumers only see fully replicated messages.
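The following sketch shows one way to picture this. It assumes the convention Kafka itself uses internally, where a replica's log-end offset is the position the next message would take, and where the high watermark is the smallest log-end offset among the in-sync replicas. The function names, replica names, and values are made up for illustration.

```python
def high_watermark(in_sync_replica_log_end_offsets):
    """Smallest log-end offset among the in-sync replicas (simplified model)."""
    return min(in_sync_replica_log_end_offsets.values())


def visible_to_consumers(leader_log, hw):
    """Messages below the high watermark are the only ones consumers can read."""
    return leader_log[:hw]


# Hypothetical state: the leader holds five messages, one follower is still catching up.
replica_log_end_offsets = {"leader": 5, "follower-1": 5, "follower-2": 3}
leader_log = ["m0", "m1", "m2", "m3", "m4"]

hw = high_watermark(replica_log_end_offsets)
print(hw)                                    # 3
print(visible_to_consumers(leader_log, hw))  # ['m0', 'm1', 'm2']
```

Messages m3 and m4 become readable only once the slowest in-sync follower has copied them.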

Managing consumer failure

If the consumer crashes between two adjacent commit intervals, it can lose messages. If the consumer fails just after processing a set of messages between the commit intervals, it can fetch duplicate messages. Offsets give Kafka an identifier to keep track of consumers' progress. When a consumer terminates and restarts, the committed offset lets it resume from the correct message.

Message delivery guarantees with Kafka offsets

By default, a Kafka consumer pulls in a micro-batch of messages and commits the offset at periodic intervals. This mechanism is called the auto-commit strategy. In auto-commit mode, Kafka does not guarantee message delivery. Depending on the failure scenarios, messages may get lost or be delivered more than once. The producer simply fires and forgets the messages.

Tweaking configurations related to offset committing can establish stricter message delivery guarantees. However, it depends on both producer and consumer settings. Kafka supports three kinds of delivery guarantees.

At most once

You set the producer acknowledgment configuration parameter ‘acks’ to ‘1’ or ‘all’. On the consumer side, offsets are committed as soon as messages arrive, before they are processed. In this case, consumer failure during processing may result in lost messages, but no messages are processed more than once.

At least once

At-least-once guarantee ensures that no message will be missed, but there is a chance of messages being processed more than once. You enable producer acknowledgment and implement consumers such that they commit offset only after processing the messages. Consumer failure can result in messages being processed twice, but the risk of losing messages is very low.
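The difference between committing before processing (at most once) and committing after processing (at least once) can be sketched with a small simulation. Everything here is a hypothetical in-memory model rather than a Kafka client: the consume function, the crash_at parameter, and the message values exist only for illustration.

```python
def consume(messages, commit_before_processing, crash_at):
    """Simulate a consumer that crashes at offset `crash_at`, then restarts.

    Returns every message that was processed across both runs.
    """
    committed = 0  # offset the consumer would resume from after a restart
    processed = []

    # First run: the consumer crashes while handling offset `crash_at`.
    for offset in range(len(messages)):
        if commit_before_processing:
            committed = offset + 1           # commit first ...
            if offset == crash_at:
                break                        # ... crash before processing: message lost
            processed.append(messages[offset])
        else:
            processed.append(messages[offset])
            if offset == crash_at:
                break                        # crash after processing, before the commit
            committed = offset + 1

    # Second run: restart from the last committed offset.
    for offset in range(committed, len(messages)):
        processed.append(messages[offset])
    return processed


messages = ["a", "b", "c", "d"]
at_most_once = consume(messages, commit_before_processing=True, crash_at=2)
at_least_once = consume(messages, commit_before_processing=False, crash_at=2)
print(at_most_once)   # ['a', 'b', 'd']
print(at_least_once)  # ['a', 'b', 'c', 'c', 'd']
```

In the first case message "c" is never processed; in the second it is processed twice, so downstream logic needs to tolerate duplicates.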

Exactly once

Exactly once semantics ensure that there are no lost messages or duplicate messages. You use the idempotence configuration in the producer to prevent it from sending the same message twice to a partition.

  1. Configure the parameter enable.idempotence=true in your producer.
  2. Set the parameter processing.guarantee=exactly_once in your stream application.

At the time of writing, this is possible only when the consumer’s output is sent to a Kafka topic itself and not to other remote sinks. In this case, the offset commit and the consumer processing of messages happen as a single transaction, thereby removing the possibility of a message being processed twice on the consumer side. If the transaction fails, all its effects are reverted, making the whole action atomic.
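To give a feel for what producer idempotence achieves, the sketch below models a broker that remembers the highest sequence number it has appended per producer and silently drops a retried send. This is a deliberately simplified, hypothetical model: the Broker class and its fields are invented here, and Kafka's actual bookkeeping is per producer and per partition and more involved.

```python
class Broker:
    """Toy broker that ignores sends it has already appended (not Kafka's API)."""

    def __init__(self):
        self.log = []
        self._last_sequence = {}  # producer id -> highest sequence number appended

    def append(self, producer_id, sequence, value):
        """Append the value unless this sequence number was already seen."""
        if sequence <= self._last_sequence.get(producer_id, -1):
            return False  # duplicate caused by a retry: drop it
        self.log.append(value)
        self._last_sequence[producer_id] = sequence
        return True


broker = Broker()
broker.append("producer-1", 0, "payment-42")
broker.append("producer-1", 0, "payment-42")  # retry after a lost acknowledgment
broker.append("producer-1", 1, "payment-43")
print(broker.log)  # ['payment-42', 'payment-43']
```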

Monitoring consumer lag

Monitoring offsets helps identify performance issues in a Kafka cluster. One of Kafka's key performance indicators is consumer lag, the difference between the log-end offset and the committed offset. A small lag is expected during normal operations. However, if the lag keeps increasing, consumers are falling behind producers, and unread messages may eventually be deleted by the topic's retention policy before they are processed.
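As a minimal sketch, lag can be computed per partition by subtracting the committed offset from the log-end offset. The consumer_lag function and the offset values below are made up for illustration; in practice both numbers come from the broker or a monitoring tool.

```python
def consumer_lag(log_end_offsets, committed_offsets):
    """Per-partition lag: log-end offset minus committed offset."""
    return {
        partition: log_end_offsets[partition] - committed_offsets[partition]
        for partition in log_end_offsets
    }


# Hypothetical offsets for a three-partition topic, keyed by partition number.
log_end = {0: 1200, 1: 950, 2: 400}
committed = {0: 1195, 1: 950, 2: 100}

lag = consumer_lag(log_end, committed)
print(lag)                # {0: 5, 1: 0, 2: 300}
print(sum(lag.values()))  # 305
```

Looking at lag per partition rather than only the total makes it easier to spot a single slow or overloaded partition, such as partition 2 here.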

Calculating consumer lag from Kafka offsets

The most common reason for high lag is unpredictable surges in incoming messages.

Uneven data distribution across partitions within a topic can also increase consumer lag. By default, Kafka splits data into partitions by considering the hash of the message key. If you customize the message key and the message volume with a specific key is higher than others, the consumer catering to that partition experiences a high load, leading to high lag.
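The skew described above can be sketched as follows. Note the assumptions: zlib.crc32 is used only as a stand-in hash (Kafka's default partitioner uses a different hash function), and the keys and message counts are invented. The point is only that every message with the same key lands on the same partition, so one hot key overloads one partition.

```python
import zlib
from collections import Counter


def partition_for(key, num_partitions):
    """Map a key to a partition by hashing it (crc32 is a stand-in hash here)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Hypothetical workload: one key produces 80% of the traffic.
keys = ["customer-1"] * 80 + ["customer-2"] * 10 + ["customer-3"] * 10
load = Counter(partition_for(key, 3) for key in keys)

# The partition that owns "customer-1" receives at least 80 of the 100 messages.
print(load.most_common())
```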

Inherently slow processing jobs and errors in pipeline components are other reasons behind consumer lag.

Working with Kafka offsets

Kafka provides several options to fine-tune the behavior of the offset handling process and achieve optimal performance. Let us look at some key offset configurations and when to use them.

Resetting the consumer offset

At times, developers need to reset the consumer’s offset and restart its processing from the very beginning or end of the partition. This mostly happens when you identify a problem with the consumer logic and need to recreate the state by re-running the logic for all the messages.

In such cases, the following command is useful. Note that it applies to a specific consumer group and not to all consumers. The group must have no active members while its offsets are reset.

./kafka-consumer-groups.sh --bootstrap-server localhost:9000 --group groupName --reset-offsets --to-earliest --topic topic_name --execute

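Alternatively, you can switch the consumer to a new consumer group name. A new group has no committed offsets, so where it starts reading is decided by the auto offset reset configuration described next.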

Auto offset reset configuration

This configuration defines the behavior of consumers when they are started without any initially committed offsets. One example is when a consumer is started for the very first time. This can also happen in failure scenarios with no committed offsets or if the consumer cannot locate the next offset after the committed offset.

You can configure how Kafka reacts to such scenarios using the parameter ‘auto.offset.reset’. Possible values are ‘earliest’, ‘latest’, or ‘none’. If the value is set to ‘earliest’, the consumer starts processing from the first offset present in that partition. The ‘latest’ value instructs the consumer to begin at the end of the partition, so it only sees messages that arrive after it starts. Setting the parameter to ‘none’ results in an exception if there are no committed offsets.

If a committed offset is present, the consumer reads from the last committed offset, irrespective of the value of this setting.
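The decision described in this section can be summarized in a short sketch. The starting_offset function is hypothetical and only mirrors the rules above; here ‘earliest’ is the first offset still present in the partition and ‘latest’ is the position just past the last message.

```python
def starting_offset(committed, earliest, latest, auto_offset_reset):
    """Pick the offset a consumer starts from (simplified model of the rules above)."""
    # A committed offset that still exists in the log always wins.
    if committed is not None and earliest <= committed <= latest:
        return committed
    # Otherwise fall back to the reset policy.
    if auto_offset_reset == "earliest":
        return earliest
    if auto_offset_reset == "latest":
        return latest
    if auto_offset_reset == "none":
        raise LookupError("no usable committed offset and auto.offset.reset=none")
    raise ValueError(f"unknown auto.offset.reset value: {auto_offset_reset}")


print(starting_offset(None, 100, 500, "earliest"))  # 100
print(starting_offset(None, 100, 500, "latest"))    # 500
print(starting_offset(250, 100, 500, "latest"))     # 250 (committed offset wins)
print(starting_offset(40, 100, 500, "earliest"))    # 100 (offset 40 no longer exists)
```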

Changing automatic offset committing interval

By default, Kafka commits offsets based on a periodic interval. This interval can be changed using the ‘auto.commit.interval.ms’ parameter. The default value is 5 seconds.

If this parameter is set to lower values, the committing frequency increases, and there is a chance that committing happens even before the consumers have completed processing. In such cases, if a consumer fails, messages can be lost. On the other hand, setting this to higher values increases the risk of consumers processing duplicate messages.
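A back-of-the-envelope sketch can make the trade-off more concrete. Assuming a steady processing rate, the number of messages handled since the last automatic commit (and therefore potentially re-read after a crash) grows with the interval, while the number of commits sent to the broker shrinks. The function names and the throughput figure are illustrative assumptions, not measurements.

```python
def max_uncommitted_messages(messages_per_second, auto_commit_interval_ms):
    """Rough upper bound on messages processed since the last automatic commit."""
    return messages_per_second * auto_commit_interval_ms // 1000


def commits_per_minute(auto_commit_interval_ms):
    """How many automatic commits one consumer issues per minute."""
    return 60_000 // auto_commit_interval_ms


# Hypothetical consumer handling 2,000 messages per second.
for interval_ms in (1000, 5000):
    print(
        interval_ms,
        max_uncommitted_messages(2000, interval_ms),
        commits_per_minute(interval_ms),
    )
# 1000 2000 60
# 5000 10000 12
```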

Disabling automatic offset committing

Automatic offset committing spares consumers from having to implement elaborate logic to commit offset. However, this strategy is not practical for critical use cases where message loss or duplicate messages are detrimental. One can disable automatic offset committing by setting the ‘enable.auto.commit’ property to false. This opens up the possibility of enforcing at-most-once, at-least-once, or exactly-once guarantees.
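As a configuration sketch, the property names below are standard Kafka consumer settings, written as a Python dictionary in the style that dictionary-configured clients accept. The broker address and group name are placeholders, and the exact way to pass these settings depends on the client library you use.

```python
# Consumer settings with automatic offset committing turned off.
# The address and group name are placeholders, not real endpoints.
consumer_config = {
    "bootstrap.servers": "localhost:9000",
    "group.id": "groupName",
    "enable.auto.commit": False,      # the application commits offsets itself
    "auto.offset.reset": "earliest",  # used only when no committed offset exists
}

print(consumer_config["enable.auto.commit"])  # False
```

With this setting, the application decides when to commit: before processing for at-most-once behavior, or after processing for at-least-once behavior.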

Best practices while configuring the Kafka offset

Kafka offset configurations require balancing throughput with message delivery guarantees. When message delivery guarantees increase, throughput decreases. This is because enforcing message guarantees creates overhead in the system. For example, adopting acknowledgment or idempotence instead of the fire-and-forget approach reduces some producer throughput. Likewise, consumers take on some additional load for committing offset through various strategies.

While using the auto-commit strategy, it is better to go with the default value or higher. Setting a lower value creates more overhead for the broker and can result in unnecessary CPU utilization. Setting higher values lowers the risk of lost messages at the expense of duplicate ones.

Kafka’s default auto-committing strategy is best for use cases where only the final state is important, and the messages represent the system's final state. In such cases, losing a few messages or getting a few duplicate messages does not matter much in the long term. For example, a small amount of data loss is inconsequential for most monitoring use cases, and you can use either the default approach or the at-most-once guarantee.

The auto offset reset configuration has to be decided based on the use case. Suppose this value is set as ‘earliest,’ and the broker has already accumulated a lot of data before consumers start. In that case, consumers spend more time catching up until they can access the latest data. For use cases where missing the previous data is unimportant, one can set the parameter as ‘latest.’

Conclusion

Kafka offset represents the order of messages inside a partition from the beginning of that partition. This numerical value helps Kafka keep track of progress within a partition. It also allows Kafka to scale horizontally while staying fault-tolerant. The logic used for offset storage is also integral to Kafka’s message delivery guarantees.

Kafka provides several configuration parameters to alter the behavior of its offset management process. To achieve the optimal configuration, one must carefully consider the use case and analyze the impact of each parameter.

If you’re looking for a simple, interactive approach to monitoring your Kafka consumers, as well as consumer lag and editing your consumer group offsets, try Redpanda Console for free!
