Kafka Utils
Release 0.5.3
Yelp Inc.
Contents
Description
How to install
2.1 Configuration
2.2 Cluster Manager
2.3 Consumer Manager
2.4 Rolling Restart
2.5 Kafka Check
2.6 Corruption Check
2.7 Indices and tables
CHAPTER 1
Description
Kafka-Utils is a library containing tools to interact with and manage Kafka clusters. It provides utilities for
listing all the clusters, balancing the partition distribution across brokers and replication-groups, managing
consumer groups, performing a rolling restart of the cluster, and running cluster healthchecks.
For more information about Apache Kafka see the official Kafka documentation.
CHAPTER 2
How to install
2.1 Configuration
Kafka-Utils reads the cluster configuration needed to access Kafka clusters from yaml files. Each cluster is identified
by type and name. Multiple clusters of the same type should be listed in the same type.yaml file. The yaml files
are read from $KAFKA_DISCOVERY_DIR, $HOME/.kafka_discovery and /etc/kafka_discovery, in that
order; a location earlier in the list overrides the later ones.
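The lookup order can be sketched in Python as follows. This is a minimal illustration, not the library's own loader: the function names discovery_dirs and find_cluster_config are hypothetical, and the sketch assumes that the first directory containing type.yaml wins outright rather than files being merged.

```python
import os
from pathlib import Path


def discovery_dirs(environ=None):
    """Return the candidate config directories, highest priority first."""
    environ = os.environ if environ is None else environ
    dirs = []
    if environ.get("KAFKA_DISCOVERY_DIR"):
        dirs.append(Path(environ["KAFKA_DISCOVERY_DIR"]))
    if environ.get("HOME"):
        dirs.append(Path(environ["HOME"]) / ".kafka_discovery")
    dirs.append(Path("/etc/kafka_discovery"))
    return dirs


def find_cluster_config(cluster_type, environ=None):
    """Return the first existing <cluster_type>.yaml, or None if absent.

    Assumption: the first match wins; later directories are not consulted.
    """
    for directory in discovery_dirs(environ):
        candidate = directory / f"{cluster_type}.yaml"
        if candidate.is_file():
            return candidate
    return None
```

Calling find_cluster_config("sample_type") returns the path of the file that would take effect, which can help when the same cluster type is defined in more than one location.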
Sample configuration for sample_type cluster at /etc/kafka_discovery/sample_type.yaml:
---
clusters:
  cluster-1:
    broker_list:
      - "cluster-elb-1:9092"
    zookeeper: "11.11.11.111:2181,11.11.11.112:2181,11.11.11.113:2181/kafka-1"
  cluster-2:
    broker_list:
      - "cluster-elb-2:9092"
    zookeeper: "11.11.11.211:2181,11.11.11.212:2181,11.11.11.213:2181/kafka-2"
local_config:
  cluster: cluster-1
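When no cluster name is given, a command run with --cluster-type sample_type will pick up the default cluster cluster-1 from the local_config at /etc/kafka_discovery/sample_type.yaml; the stats command, for example, then displays statistics for that default cluster.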
from kafka_utils.kafka_cluster_manager.cluster_info.replication_group_parser \
    import ReplicationGroupParser


class SampleGroupParser(ReplicationGroupParser):

    def get_replication_group(self, broker):
        """Extract the replication group from a Broker instance.

        Suppose each broker hostname is in the form broker-rack<n>, this
        function will return "rack<n>" as replication group
        """
        if broker.inactive:
            # Can't extract replication group from inactive brokers because they
            # don't have metadata
            return None
        # Keep only the part after the last '-', e.g. "broker-rack1" -> "rack1"
        hostname = broker.metadata['host']
        return hostname.rsplit('-', 1)[1]
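The hostname rule used by the parser can be tried without a running cluster. The sketch below is only an illustration: FakeBroker is a hypothetical stand-in that mimics the two attributes the parser reads (inactive and metadata), and replication_group_from_hostname repeats the parser logic as a plain function.

```python
from collections import namedtuple

# Hypothetical stand-in for a Broker instance; only the attributes
# used by the parser (inactive, metadata) are modelled.
FakeBroker = namedtuple("FakeBroker", ["inactive", "metadata"])


def replication_group_from_hostname(broker):
    """Return the text after the last '-' in the hostname, or None."""
    if broker.inactive:
        # Inactive brokers carry no metadata, so no group can be derived.
        return None
    hostname = broker.metadata["host"]
    return hostname.rsplit("-", 1)[1]


brokers = [
    FakeBroker(inactive=False, metadata={"host": "broker-rack1"}),
    FakeBroker(inactive=False, metadata={"host": "kafka-broker-rack2"}),
    FakeBroker(inactive=True, metadata=None),
]
groups = [replication_group_from_hostname(b) for b in brokers]
print(groups)  # ['rack1', 'rack2', None]
```

Splitting from the right means that extra dashes earlier in the hostname do not affect the result. A hostname with no dash at all would raise an IndexError, so a real parser may want to handle that case explicitly.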
Replica-distribution
Uniform distribution of replicas across replication groups.
$ kafka-cluster-manager --cluster-type sample_type rebalance --replication-groups
Partition distribution
Uniform distribution of partitions across groups and brokers.
$ kafka-cluster-manager --cluster-type sample_type rebalance --brokers
Topic-partition distribution
Uniform distribution of partitions of the same topic across brokers.
The command provides the ability to balance one or more of these layers except for the topic-partition imbalance layer
which will be balanced implicitly with replica or partition rebalancing.
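To make the layers concrete, the sketch below counts replicas per broker and per topic for a toy assignment and reports the gap between the most and least loaded broker. It is an illustration only: the function names are hypothetical and the max-minus-min spread is a stand-in measure, not the imbalance metric that kafka-cluster-manager computes.

```python
from collections import Counter


def replicas_per_broker(assignment, brokers):
    """Count replicas per broker.

    assignment maps (topic, partition) to the list of broker ids
    holding a replica of that partition.
    """
    counts = Counter({broker: 0 for broker in brokers})
    for replicas in assignment.values():
        counts.update(replicas)
    return dict(counts)


def replicas_per_topic_and_broker(assignment, brokers):
    """Count replicas per broker separately for each topic."""
    per_topic = {}
    for (topic, _partition), replicas in assignment.items():
        counts = per_topic.setdefault(topic, {b: 0 for b in brokers})
        for broker in replicas:
            counts[broker] = counts.get(broker, 0) + 1
    return per_topic


def spread(counts):
    """Gap between the most and least loaded broker (0 or 1 is even)."""
    return max(counts.values()) - min(counts.values())


assignment = {
    ("t1", 0): [1, 2],
    ("t1", 1): [1, 3],
    ("t2", 0): [1, 2],
}
brokers = [1, 2, 3]
partition_layer = replicas_per_broker(assignment, brokers)
topic_layer = replicas_per_topic_and_broker(assignment, brokers)
print(partition_layer, spread(partition_layer))
print({topic: spread(c) for topic, c in topic_layer.items()})
```

In this toy cluster broker 1 holds three replicas and broker 3 holds one, so the partition layer is uneven even though each topic on its own is spread almost evenly.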
kafka_utils.kafka_cluster_manager.cluster_topology provides APIs to create a cluster-topology
object based on the distribution of topics, partitions, brokers and replication-groups across the cluster.
Rebalancing all layers
Rebalance all layers for the given cluster. This command rebalances the cluster across all the layers discussed above
and generates a plan with a maximum of 10 partition movements and 25 leader-only changes, which is then sent to Zookeeper.
$ kafka-cluster-manager --group-parser $HOME/parser:sample_parser --apply --cluster-type sample_type rebalance --replication-groups --brokers --leaders --max-partition-movements 10 --max-leader-changes 25
Note: While decommissioning brokers we need to ensure that at least n active brokers remain, where n
is the maximum replication-factor of any partition.
$ kafka-cluster-manager --cluster-type sample_type decommission 123456 123457 123458
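The condition in the note can be checked ahead of time with a few lines of Python. This is a minimal sketch under stated assumptions: can_decommission is a hypothetical helper, and the broker ids and replication factors are supplied by hand rather than read from a cluster.

```python
def can_decommission(active_brokers, to_decommission, replication_factors):
    """Return True if enough active brokers remain after decommissioning.

    replication_factors holds the replication factor of each partition;
    the remaining broker count must be at least the largest of them.
    """
    remaining = set(active_brokers) - set(to_decommission)
    return len(remaining) >= max(replication_factors)


active = [123455, 123456, 123457, 123458]
print(can_decommission(active, [123458], [3, 2]))          # True
print(can_decommission(active, [123457, 123458], [3, 2]))  # False
```

With a maximum replication factor of 3, removing one of four brokers leaves enough room for every replica, while removing two does not.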
2.2.5 Stats
This command provides statistics for the current imbalance state of the cluster. It also provides imbalance statistics
of the cluster if a given partition-assignment plan were to be applied to the cluster. The details include the imbalance
value of each of the above layers for the overall cluster, each broker and across each replication-group.
$ kafka-cluster-manager --group-parser $HOME/parser:sample_parser --cluster-type sample_type stats
2.3.1 Subcommands
copy_group
delete_group
list_groups
list_topics
offset_advance
offset_get
offset_restore
offset_rewind
offset_save
offset_set
rename_group
unsubscribe_topics
If list_groups is called with the --storage option, then the groups will be fetched only from the specified
storage, either Zookeeper or Kafka.
The offsets for all topics in the consumer group will be shown by default. A single topic can be specified using the
--topic option. If a topic is specified, then a list of partitions can also be specified using the --partitions
option.
By default, the offsets will be fetched from both Zookeeper and Kafka's internal offset storage. A specific offset
storage location can be specified using the --storage option.
The saved offsets file can then be used to restore the consumer group.
The offsets can also be set directly using the offset_set command. This command takes a group id, and a set of
topics, partitions, and offsets.
There is also an offset_advance command, which will advance the current offset to the same value as the high
watermark of a topic, and an offset_rewind command, which will rewind to the low watermark.
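The effect of the two commands on a group's offsets can be illustrated with plain dictionaries. The sketch below is an assumption-laden model, not the tool's implementation: Watermarks, advance and rewind are hypothetical names, and the offsets and watermarks are hard-coded instead of being fetched from Kafka.

```python
from collections import namedtuple

# Hypothetical container for the low and high watermark of a partition.
Watermarks = namedtuple("Watermarks", ["low", "high"])


def advance(offsets, watermarks):
    """Move every committed offset to the partition's high watermark."""
    return {tp: watermarks[tp].high for tp in offsets}


def rewind(offsets, watermarks):
    """Move every committed offset to the partition's low watermark."""
    return {tp: watermarks[tp].low for tp in offsets}


committed = {("topic1", 0): 120, ("topic1", 1): 75}
marks = {
    ("topic1", 0): Watermarks(low=100, high=200),
    ("topic1", 1): Watermarks(low=50, high=90),
}
print(advance(committed, marks))  # {('topic1', 0): 200, ('topic1', 1): 90}
print(rewind(committed, marks))   # {('topic1', 0): 100, ('topic1', 1): 50}
```

After advancing, the group skips every message currently in the log; after rewinding, it re-reads everything that is still retained.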
If the offset needs to be modified for a consumer group that does not already exist, then the --force option can be used.
This option can be used with offset_set, offset_rewind, and offset_advance.
When the group is copied, if a topic is specified using the --topic option, then only the offsets for that topic will
be copied. If a topic is specified, then a set of partitions of that topic can also be specified using the --partitions
option.
A consumer group can be unsubscribed from topics using the unsubscribe_topics subcommand. If a single topic
is specified using the --topic option, then the group will be unsubscribed from only that topic.
2.4.2 Parameters
The parameters specific for kafka-rolling-restart are:
--check-interval INTERVAL: the number of seconds between each check. Default 10.
--check-count COUNT: the number of consecutive checks that must result in cluster healthy before restarting the next server. Default 12.
--unhealthy-time-limit LIMIT: the maximum time in seconds that a cluster can be unhealthy for. If
the limit is reached, the script will terminate with an error. Default 600.
--jolokia-port PORT: The Jolokia port. Default 8778.
--jolokia-prefix PREFIX: The Jolokia prefix. Default jolokia/.
--no-confirm: If specified, the script will not ask for confirmation.
--skip N: Skip the first N servers. Useful to recover from a partial rolling restart. Default 0.
--verbose: Turn on verbose output.
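How --check-interval, --check-count and --unhealthy-time-limit interact can be sketched as a small wait loop. This is an illustration under stated assumptions, not the script's actual code: wait_for_stable_cluster and ClusterUnhealthyError are hypothetical names, the health check is passed in as a callable instead of querying Jolokia, and unhealthy time is assumed to accumulate across checks.

```python
import time


class ClusterUnhealthyError(Exception):
    """Hypothetical error raised when the unhealthy time limit is hit."""


def wait_for_stable_cluster(is_healthy, check_interval=10, check_count=12,
                            unhealthy_time_limit=600, sleep=time.sleep):
    """Block until check_count consecutive checks report a healthy cluster.

    Returns the number of checks performed. Raises ClusterUnhealthyError
    once the accumulated unhealthy time reaches unhealthy_time_limit.
    """
    healthy_streak = 0
    unhealthy_time = 0
    checks = 0
    while True:
        checks += 1
        if is_healthy():
            healthy_streak += 1
            if healthy_streak >= check_count:
                return checks
        else:
            # Any failed check resets the streak of healthy results.
            healthy_streak = 0
            unhealthy_time += check_interval
            if unhealthy_time >= unhealthy_time_limit:
                raise ClusterUnhealthyError(
                    "cluster unhealthy for %d seconds" % unhealthy_time)
        sleep(check_interval)
```

With the defaults, the next broker would be restarted only after 12 consecutive healthy checks taken 10 seconds apart, that is roughly two minutes of sustained health.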
2.4.3 Examples
Restart the generic dev cluster, checking the JMX metrics every 30 seconds, and restarting the next broker after 5
consecutive checks have confirmed the health of the cluster:
Restart the generic prod cluster. The script will terminate with an error if the cluster is unhealthy for more than 900 seconds:
$ kafka-rolling-restart --cluster-type generic --cluster-name prod --unhealthy-time-limit 900
2.6.1 Parameters
The parameters specific for kafka-corruption-check are:
--minutes N: check the log files modified in the last N minutes.
--start-time START_TIME: check the log files modified after START_TIME. Example format: --start-time "2015-11-26 11:00:00"
--end-time END_TIME: check the log files modified before END_TIME. Example format: --end-time "2015-11-26 12:00:00"
--data-path: the path to the data files on the Kafka broker.
--java-home: the JAVA_HOME on the Kafka broker.
--batch-size BATCH_SIZE: the number of files that will be checked in parallel on each broker. Default: 5.
--check-replicas: if set it will also check the data on replicas. Default: false.
--verbose: enable verbose output.
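The time-selection parameters describe a window on the files' modification time. The sketch below shows one way to turn them into a window and test a timestamp against it. It is an illustration only: modification_window and in_window are hypothetical helpers, and the sketch works on local datetime values rather than inspecting files on a broker.

```python
from datetime import datetime, timedelta

# Matches the example format shown above, e.g. "2015-11-26 11:00:00".
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def modification_window(minutes=None, start_time=None, end_time=None,
                        now=None):
    """Return (start, end) datetimes; None means that side is unbounded."""
    now = now or datetime.now()
    if minutes is not None:
        # --minutes N: everything modified in the last N minutes.
        return now - timedelta(minutes=minutes), now
    start = datetime.strptime(start_time, TIME_FORMAT) if start_time else None
    end = datetime.strptime(end_time, TIME_FORMAT) if end_time else None
    return start, end


def in_window(mtime, window):
    """True if a modification time falls inside the window."""
    start, end = window
    return ((start is None or mtime >= start)
            and (end is None or mtime <= end))


window = modification_window(start_time="2015-11-26 11:00:00",
                             end_time="2015-11-26 12:00:00")
print(in_window(datetime(2015, 11, 26, 11, 30), window))  # True
print(in_window(datetime(2015, 11, 26, 12, 30), window))  # False
```

Leaving out one of the two bounds gives an open-ended window, which mirrors using --start-time or --end-time on its own.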
2.6.2 Examples
Check all the files (leaders only) in the generic dev cluster that were modified in the last 30 minutes:
Check all the files that were modified in the specified range:
$ kafka-corruption-check [...] --start-time "2015-11-26 11:00:00" --end-time "2015-11-26 12:00:00"