You've decided to migrate your Kafka data to another broker and need an efficient, straightforward solution.
In this guide, we will explore how to use kcat, a powerful and flexible Kafka command-line tool, to transfer data from one broker to another.
This method is particularly useful when you need a simple way to move a small amount of data quickly.
In my case, I used this approach to transfer data between my development server and my local Kafka server.
When to Use the kcat Approach
Simple and Fast Transfer: For moving small datasets between brokers.
Development Environments: Ideal when simplicity and speed matter more than the durability guarantees of a full replication setup.
Installing kcat
Before you begin, make sure kcat is installed on your system. You can install it with a package manager, such as Homebrew on macOS or apt-get on Debian/Ubuntu-based systems. Note that older distributions ship the tool under its former name, kafkacat; apart from the binary name, usage is identical.
brew install kcat # For macOS using Homebrew
# OR
sudo apt-get install kafkacat # For Debian/Ubuntu (newer releases package it as kcat)
Transferring an Entire Topic
kcat can run in producer mode, where it sends messages to a specific topic, or in consumer mode, where it consumes messages and prints them to the terminal. The trick is simply to pipe the output of kcat in consumer mode into kcat in producer mode: every message read by the first command is written by the second.
See the script below with the necessary commands and details:
#!/usr/bin/env bash
# Defining the source and target brokers, as well as the corresponding topics.
SOURCE_BROKER=source_host_or_ip:9092
SOURCE_TOPIC=topic
TARGET_BROKER=target_host_or_ip:9092
TARGET_TOPIC=topic
# Using kcat to consume messages from the source broker and produce them to the target broker.
# -C: Consumer mode
# -P: Producer mode
# -b: Specifies the broker to be used
# -t: Specifies the topic to be consumed or produced
# -o beginning: Reads messages from the beginning of the topic
# -K: Sets the delimiter between the key and value of messages
# -e: Exits the command after receiving the last available message
# | : Pipe, redirects the output from the first command to the second.
# Consuming messages from the source broker and producing them to the target broker using kcat.
kcat -C -b "$SOURCE_BROKER" -t "$SOURCE_TOPIC" -o beginning -K: -e | \
kcat -b "$TARGET_BROKER" -P -t "$TARGET_TOPIC" -K:
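To make the pipeline concrete, here is a small simulation of what flows through the pipe. With -K:, the consumer emits each record as a single key:value line; the producer side then re-splits each line on the first colon. In this sketch, printf stands in for the kcat consumer, and a read loop stands in for the producer, just to show the format:

```shell
# Simulated consumer output: one "key:value" line per record,
# as produced by kcat -C ... -K:
# IFS=: splits each line on the FIRST colon, mirroring how the
# producer side recovers the key and value.
printf 'user1:login\nuser2:logout\n' |
while IFS=: read -r key value; do
  echo "key=$key value=$value"
done
```

One caveat of this delimiter scheme: if your keys can themselves contain a colon, pick a delimiter character that cannot appear in the key.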
Transferring Messages Created within a Specific Date and Time Range
We can also transfer only part of a topic with different options. In the code below, we will transfer only messages created between 10 AM and 8 PM on a specific day.
#!/usr/bin/env bash
# Defining the source and target brokers, as well as the corresponding topics.
SOURCE_BROKER=source_host_or_ip:9092
SOURCE_TOPIC=topic
TARGET_BROKER=target_host_or_ip:9092
TARGET_TOPIC=topic
# Millisecond Unix timestamps for the window, built with GNU date
# (on macOS, install coreutils and use gdate instead).
START_TIMESTAMP=$(date -d '2023-10-25 10:00:00 -03' +%s000)
END_TIMESTAMP=$(date -d '2023-10-25 20:00:00 -03' +%s000)
# Instead of -o beginning, we pass -o s@<start> -o e@<end>, where each is a
# millisecond timestamp, e.g. -o s@1568276612443 -o e@1568276617901.
kcat -C -b "$SOURCE_BROKER" -t "$SOURCE_TOPIC" -o s@"$START_TIMESTAMP" -o e@"$END_TIMESTAMP" -K: -e | \
kcat -b "$TARGET_BROKER" -P -t "$TARGET_TOPIC" -K:
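The s@ and e@ offsets expect Unix timestamps in milliseconds. A quick way to sanity-check the conversion before running the transfer (GNU date syntax; the datetime and UTC-3 offset below are just the example values from the script):

```shell
# Convert a local datetime (UTC-3 here) to epoch milliseconds.
# +%s yields whole seconds; appending "000" supplies the
# millisecond precision kcat expects.
START_TIMESTAMP=$(date -d '2023-10-25 10:00:00 -03' +%s000)
echo "$START_TIMESTAMP"   # 1698238800000

# Round-trip check: strip the milliseconds and convert back to UTC.
date -u -d "@$((START_TIMESTAMP / 1000))" '+%Y-%m-%d %H:%M:%S'
# 2023-10-25 13:00:00  (10:00 at UTC-3)
```

If the round-trip prints a time you don't expect, check the timezone offset in the input string before trusting the transfer window.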
Filtering Data
If your messages are not in binary format, you can filter them by any string before sending them to the target broker.
Please note: You will still be fetching all messages, and the filtering will occur on your machine.
#!/usr/bin/env bash
# Defining the source and target brokers, as well as the corresponding topics.
SOURCE_BROKER=source_host_or_ip:9092
SOURCE_TOPIC=topic
TARGET_BROKER=target_host_or_ip:9092
TARGET_TOPIC=topic
FILTER_STRING=some_string
kcat -C -b "$SOURCE_BROKER" -t "$SOURCE_TOPIC" -o beginning -K: -e | \
grep "$FILTER_STRING" | \
kcat -b "$TARGET_BROKER" -P -t "$TARGET_TOPIC" -K:
Note that we added the grep utility between the two kcat commands: it receives the messages from the consumer, keeps only those containing the desired string, and passes them on to the producer. Quoting "$FILTER_STRING" matters here, as an unquoted filter containing spaces or shell metacharacters would break the command.
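One subtlety worth knowing: because each record arrives as a single key:value line, grep matches against the key and the value alike. A simulated example (printf stands in for the kcat consumer):

```shell
# Simulated consumer output; grep keeps only lines containing the
# filter string. Note the match can hit the key as well as the value.
FILTER_STRING=error
printf 'k1:ok\nk2:error in payment\nerror_key:ok\n' | grep "$FILTER_STRING"
# k2:error in payment
# error_key:ok
```

If you only want to match on the value, a stricter pattern such as grep ":.*$FILTER_STRING" (assuming keys contain no colon) keeps key-only matches out.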
Bonus: Searching for a Message in Kafka
We can use the above command to search for a message by removing the producer command:
#!/usr/bin/env bash
# Defining the source broker and topic, as well as the filter string.
SOURCE_BROKER=source_host_or_ip:9092
SOURCE_TOPIC=topic
FILTER_STRING=some_string
kcat -C -b "$SOURCE_BROKER" -t "$SOURCE_TOPIC" -o beginning -K: -e | \
grep "$FILTER_STRING"
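If you only want to know how many messages match rather than see them, swap grep for grep -c, which prints a count of matching lines instead of the lines themselves (shown here on a simulated stream):

```shell
# Count matching messages instead of printing them.
# printf stands in for the kcat consumer output.
printf 'a:foo\nb:bar\nc:foo bar\n' | grep -c foo
# 2
```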
Conclusion
In this guide, we explored an efficient and straightforward way to transfer small amounts of data from one Kafka broker to another using the powerful tool kcat. By following the provided examples, you've learned how to migrate data between Kafka brokers, whether by transferring an entire topic, restricting the transfer to a time interval, or filtering messages by keyword.
It's crucial to highlight that this approach is best suited to smaller datasets or quick tasks in a development environment. If you're dealing with large volumes of data or a production environment, consider more robust transfer strategies, such as a dedicated replication tool (for example, Kafka's MirrorMaker 2) or a proper data pipeline, to ensure efficiency and safety during migration.
We hope this guide has been helpful for your Kafka broker data transfer needs.