Getting kafkacat

Kafkacat is an open source tool available from GitHub at https://github.com/edenhill/kafkacat. Installation instructions are provided in the README.

Configuration file

While you can specify all parameters directly on the command line, we find it easiest to store the connection parameters in a common configuration file.

NOTE: This example requires kafkacat version 1.4.0-RC1 or greater (the "-F" option was added after version 1.3.1). With earlier versions you can specify the same options on the command line.

Here's an example:

bootstrap.servers=demo-kafka.htn-aiven-demo.aivencloud.com:17072
security.protocol=ssl
ssl.key.location=service.key
ssl.certificate.location=service.crt
ssl.ca.location=ca.crt

You can now refer to this configuration with the -F flag:

kafkacat -F kafkacat.config

Alternatively, you can specify the same settings directly on the command line:

kafkacat \
    -b demo-kafka.htn-aiven-demo.aivencloud.com:17072 \
    -X security.protocol=ssl \
    -X ssl.key.location=service.key \
    -X ssl.certificate.location=service.crt \
    -X ssl.ca.location=ca.crt

If you're using SASL authentication instead of client certificates, the command would look something like this:

kafkacat \
    -b demo-kafka.htn-aiven-demo.aivencloud.com:17072 \
    -X ssl.ca.location=ca.crt \
    -X security.protocol=SASL_SSL \
    -X sasl.mechanisms=SCRAM-SHA-256 \
    -X sasl.username=avnadmin \
    -X sasl.password=yourpassword

Producing data

Here's a simple example of how to produce a single message to the topic test-topic:

echo test-message-content | kafkacat -F kafkacat.config -P -t test-topic -k test-message-key

The -P flag selects produce mode, -t specifies the topic, and -k sets the message key.

Run kafkacat without parameters to see the help output. For bulk loading of data, you can use file input and specify a delimiter (-D) for splitting the input into individual records.
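As a sketch of such a bulk load, the following produces three records in one invocation by splitting stdin on a semicolon delimiter (the message contents and the choice of delimiter here are only illustrative):

```shell
# -D ';' tells kafkacat to treat each semicolon-separated chunk of
# stdin as a separate record; the topic and payloads are examples
printf 'first-message;second-message;third-message' | \
    kafkacat -F kafkacat.config -P -t test-topic -D ';'
```

The same approach works with a file redirect instead of printf, which is convenient for replaying captured data into a topic.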

Consuming data

Here's a simple example of how to read data from the same topic:

kafkacat -F kafkacat.config -C -t test-topic -o -1 -e

The -C flag selects consume mode, and -t selects the topic again. -o sets the offset; negative values are interpreted relative to the latest message. -e stops kafkacat once it reaches the end of the log; without it, kafkacat continues to poll for new messages.

Again, you can consult the help output for the full list of options. Of note, -p lets you specify a single partition for inspection.
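For instance, to inspect a single partition from the start of its log (the topic name and partition number are just examples):

```shell
# -p 0 restricts consumption to partition 0; -o beginning starts from
# the earliest retained offset, and -e exits at the end of the log
kafkacat -F kafkacat.config -C -t test-topic -p 0 -o beginning -e
```

This is handy when checking whether a keyed producer is distributing messages across partitions the way you expect.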

I commonly debug message flow issues with the -f flag and the format string "%t-%p: %o %S". This prints the topic, partition, offset, and size of each message. An absolute offset (-o) paired with a suitably sized message count (-c) is a great help in figuring out error cases in data pipelines.
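Putting those pieces together, a debugging session might look like the following (the offset and count values are arbitrary examples; note the trailing \n added to the format string so each message prints on its own line):

```shell
# Print "topic-partition: offset size" for ten messages starting at
# absolute offset 1000, then exit
kafkacat -F kafkacat.config -C -t test-topic \
    -o 1000 -c 10 -f '%t-%p: %o %S\n'
```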
