Setting up a Multi-Broker Kafka Cluster

Kafka is an open source distributed messaging system used by many organizations. Its use cases include stream processing, log aggregation, and metrics collection.

Note: This tutorial is based on a Red Hat 7 derivative. The steps should also work on most other Linux systems once the package manager commands are adjusted.

Multi-Node Kafka Cluster Setup

This tutorial guides you through setting up a multi-node Kafka cluster from scratch.

Prerequisites

1. You need a ZooKeeper cluster before setting up a Kafka cluster. Refer to this ZooKeeper cluster setup if you don't have one.

2. Launch three instances. Make sure the security groups allow traffic between the ZooKeeper and Kafka instances.

3. Set a hostname on each of the three instances for identification using the following command. Use node1 on the first instance, node2 on the second, and node3 on the third.

sudo hostnamectl set-hostname node1
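Before moving on, it can help to confirm that the security group rules from prerequisite 2 really let a Kafka instance reach ZooKeeper. The following is a minimal sketch, not part of the original setup: port_open and check_hosts are hypothetical helper names, zk1, zk2, and zk3 are placeholders for your ZooKeeper hosts, and 2181 is assumed to be the ZooKeeper client port.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds if a TCP connection to host:port opens
# within 3 seconds, fails otherwise.
port_open() {
  local host="$1" port="$2"
  # bash's /dev/tcp pseudo-device opens a TCP socket, so no extra tools are needed.
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Hypothetical helper: report reachability of one port on several hosts.
check_hosts() {
  local port="$1"
  shift
  local host
  for host in "$@"; do
    if port_open "$host" "$port"; then
      echo "${host}:${port} reachable"
    else
      echo "${host}:${port} NOT reachable"
    fi
  done
}

# Example, run from each Kafka instance:
# check_hosts 2181 zk1 zk2 zk3
```

A host reported as NOT reachable usually points at a security group rule, a firewall, or a ZooKeeper process that is not running.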

Kafka Installation

Perform the following tasks on all the servers.

1. Update the server.

sudo yum update -y

2. Install Java 8.

sudo yum -y install java-1.8.0-openjdk

3. Download the Kafka binary release (0.10.0.0 in this tutorial) to /opt.

cd /opt
sudo wget http://mirror.fibergrid.in/apache/kafka/0.10.0.0/kafka_2.11-0.10.0.0.tgz

4. Untar the Kafka binary.

sudo tar -xvf kafka_2*

5. Rename the extracted, versioned Kafka folder to kafka.

sudo mv kafka_2.11-0.10.0.0 kafka
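As an alternative to steps 4 and 5, the archive can be unpacked straight into an unversioned folder. The sketch below assumes GNU tar (the default on Red Hat derivatives); extract_kafka is a hypothetical helper name, not something shipped with Kafka.

```shell
#!/usr/bin/env bash

# Hypothetical helper: unpack a Kafka .tgz into target_dir without the
# versioned top-level folder.
extract_kafka() {
  local archive="$1" target_dir="$2"
  mkdir -p "$target_dir"
  # --strip-components=1 drops the leading kafka_<scala>-<version>/ path element.
  tar -xzf "$archive" -C "$target_dir" --strip-components=1
}

# Example, from a root shell because /opt is normally root-owned:
# extract_kafka /opt/kafka_2.11-0.10.0.0.tgz /opt/kafka
```

This leaves the same /opt/kafka layout that the rest of the tutorial expects, so the later paths do not change.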

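6. Open the /opt/kafka/config/server.properties file, find zookeeper.connect near the bottom, and enter the ZooKeeper addresses as shown below. Replace zk1, zk2, and zk3 with the IPs or DNS names of your ZooKeeper instances. Each broker also needs a unique broker.id in the same file (for example 1, 2, and 3), because the shipped default is 0 on every node.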

zookeeper.connect=zk1:2181,zk2:2181,zk3:2181
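The same edit can be scripted so that it is applied consistently on every broker. This is a sketch under a few assumptions: configure_broker is a hypothetical helper name, GNU sed is available, and both broker.id and zookeeper.connect already exist as uncommented keys in the file, as they do in the stock server.properties.

```shell
#!/usr/bin/env bash

# Hypothetical helper: set broker.id and zookeeper.connect in a properties file.
# The file must be writable by the calling user (for example, from a root shell).
configure_broker() {
  local props_file="$1" broker_id="$2" zk_connect="$3"
  # Rewrite only the two keys; every other line is left untouched.
  # The trailing "=" in each pattern keeps zookeeper.connection.timeout.ms intact.
  sed -i \
    -e "s|^broker\.id=.*|broker.id=${broker_id}|" \
    -e "s|^zookeeper\.connect=.*|zookeeper.connect=${zk_connect}|" \
    "$props_file"
}

# Example: use 1 on node1, 2 on node2, and 3 on node3.
# configure_broker /opt/kafka/config/server.properties 1 zk1:2181,zk2:2181,zk3:2181
```

Giving each broker its own ID matters because brokers register themselves in ZooKeeper under that ID, and two brokers cannot share one.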

Create a Kafka Service

1. Create a systemd unit file.

sudo vi /lib/systemd/system/kafka.service

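Copy the following contents into the kafka.service unit file. The unit runs Kafka as ec2-user, so that user needs write access to /opt/kafka, where Kafka writes its logs directory by default.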

[Unit]
Description=Kafka
After=network.target

[Service]
User=ec2-user
# Assumes Kafka is installed in /opt/kafka; adjust if your data directory differs.
WorkingDirectory=/opt/kafka
ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties
Restart=on-abort

[Install]
WantedBy=multi-user.target

2. Reload the daemon.

sudo systemctl daemon-reload

Managing Kafka Service

Once the Kafka service is created, you can manage it with the Linux service command.

1. To start the Kafka service,

sudo service kafka start

2. To stop and restart,

sudo service kafka stop
sudo service kafka restart

Testing The Kafka Cluster

To test the Kafka cluster setup, we will create a topic and a few messages. Then we will try to consume them from a different node to confirm that the cluster is working as intended.

To test, cd into the Kafka bin directory to get access to the Kafka scripts.