How To Set Up a Zookeeper Cluster – Beginner's Guide


Zookeeper is a distributed coordination service that many distributed workloads rely on. This article explains the steps needed to install and configure a three-node zookeeper cluster that forms a quorum.

Set Up a Zookeeper Cluster

Prerequisites

1. Three VMs, which form a quorum. For a more highly available cluster, you can use a larger odd number of servers, such as 5 or 7. For example, a cluster of 3 servers can handle one failed node, a cluster of 5 can handle two, and so on.

2. VMs that accept inbound connections on ports 2181, 2888 and 3888. If iptables is enabled, make sure these ports are open: clients connect on 2181, followers connect to the leader on 2888, and leader election uses 3888.

Note: If you are using AWS or any other cloud provider, open the zookeeper ports in the security groups or endpoints as well as in the server-level firewall.
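
The failure tolerance mentioned in the first prerequisite follows from majority arithmetic: a cluster stays available as long as more than half of its servers are up. A minimal shell sketch of that calculation (the function names are illustrative and not part of zookeeper):

```shell
# Smallest number of servers that forms a majority of a cluster of size n.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Number of servers that can fail while a majority still remains.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 3 4 5; do
    echo "servers=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

For 3, 4 and 5 servers this gives quorum sizes of 2, 3 and 3 and tolerated failures of 1, 1 and 2. A fourth server adds no extra tolerance over three, which is why odd cluster sizes are recommended.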

Install and Configure Zookeeper

Perform the following steps on all three VMs.

1. Update your server.

 sudo yum -y update

2. Install Java if it is not already installed.

 sudo yum -y install java-1.7.0-openjdk

3. Download zookeeper. If you wish to choose another version, you can get the download links from the Apache ZooKeeper download page.

 wget http://mirror.fibergrid.in/apache/zookeeper/zookeeper-3.5.2-alpha/zookeeper-3.5.2-alpha.tar.gz

4. Extract the archive into the /opt folder.

 sudo tar -xf zookeeper-3.5.2-alpha.tar.gz -C /opt/

5. Rename the zookeeper app directory.

 cd /opt
 sudo mv zookeeper-* zookeeper

6. Create a zoo.cfg file in /opt/zookeeper/conf directory with the configurations shown below.

tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
server.1=<ZooKeeper_IP/hostname>:2888:3888
server.2=<ZooKeeper_IP/hostname>:2888:3888
server.3=<ZooKeeper_IP/hostname>:2888:3888

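In the above configuration, server.1, server.2 and server.3 represent our three zookeeper servers. Replace each placeholder with the IP address or a resolvable hostname of the corresponding server. Port 2888 is used by followers to connect to the leader, and port 3888 is used for leader election.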
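
Since the same zoo.cfg goes on every server, it can help to generate it from a list of hosts rather than typing it three times. The sketch below prints the configuration from this step. The function name render_zoo_cfg and the addresses in the usage comment are hypothetical.

```shell
# Print a zoo.cfg for the hosts given as arguments, in server-id order.
render_zoo_cfg() {
    # Base settings from this step. initLimit and syncLimit are counted
    # in ticks, so with tickTime=2000 ms they allow 10 s and 4 s.
    printf '%s\n' \
        'tickTime=2000' \
        'dataDir=/var/lib/zookeeper' \
        'clientPort=2181' \
        'initLimit=5' \
        'syncLimit=2'
    local id=1 host
    for host in "$@"; do
        # 2888 is the peer port, 3888 the leader-election port.
        printf 'server.%d=%s:2888:3888\n' "$id" "$host"
        id=$((id + 1))
    done
}

# Example (hypothetical addresses):
#   render_zoo_cfg 10.0.0.11 10.0.0.12 10.0.0.13 | sudo tee /opt/zookeeper/conf/zoo.cfg
```

The order of the arguments decides which host becomes server.1, server.2 and so on, so use the same order on every machine.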

7. Create a zookeeper directory in /var/lib. This will be zookeeper's data directory, as set by dataDir in the zoo.cfg file.

 sudo mkdir /var/lib/zookeeper

8. Create a file named myid in the /var/lib/zookeeper/ directory.

 sudo touch /var/lib/zookeeper/myid

9. Each zookeeper server should have a unique number in the myid file, matching the number in its server.N entry in zoo.cfg. For example, server 1 will have value 1, server 2 will have value 2 and so on.

server 1

 sudo sh -c "echo '1' > /var/lib/zookeeper/myid"

server 2

 sudo sh -c "echo '2' > /var/lib/zookeeper/myid"

server 3

 sudo sh -c "echo '3' > /var/lib/zookeeper/myid"
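
Because the number in myid has to agree with the server.N entry for the same machine, one option is to look it up in zoo.cfg instead of typing it by hand. A minimal sketch, assuming each server is listed in zoo.cfg under exactly the IP or hostname you pass in; myid_for_host is a hypothetical helper name and the address in the usage comment is made up:

```shell
# Print the N of the "server.N=<host>:..." line in a zoo.cfg that matches <host>.
# Prints nothing if the host is not listed.
myid_for_host() {
    local cfg="$1" host="$2"
    awk -F= -v h="$host" '
        $1 ~ /^server\.[0-9]+$/ {
            split($2, addr, ":")            # addr[1] is the host part
            if (addr[1] == h) {
                sub(/^server\./, "", $1)    # keep only the number
                print $1
                exit
            }
        }' "$cfg"
}

# Example, run on each server with that server's own address:
#   id=$(myid_for_host /opt/zookeeper/conf/zoo.cfg 10.0.0.12)
#   [ -n "$id" ] && echo "$id" | sudo tee /var/lib/zookeeper/myid
```

Checking that the result is non-empty before writing avoids creating an empty myid when the address does not appear in zoo.cfg.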

Configuring Zookeeper as a Service

Zookeeper can be started and stopped using its bundled scripts, but running it as a service makes it easier to manage and lets it start at boot.

1. Open the zkServer.sh file for editing.

 sudo vi /opt/zookeeper/bin/zkServer.sh

2. Add the following lines below the shebang “#!/usr/bin/env bash” so that chkconfig can register zookeeper for system startup.

# description: Zookeeper Start Stop Restart
# processname: zookeeper
# chkconfig: 2345 30 80

3. Find the line that says “# use POSIX interface, symlink is followed automatically”. Replace the existing variable assignments after that line with the following.

ZOOSH=$(readlink -f "$0")
ZOOBIN=$(dirname "$ZOOSH")
ZOOBINDIR=$(cd "$ZOOBIN" && pwd)
ZOO_LOG_DIR="$ZOOBIN"
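
The point of these variables is that the script may be started through a symlink (such as the /etc/init.d entry created in the next step) and still has to find its real bin directory. The sketch below reproduces that in a throwaway directory. It uses readlink -f from GNU coreutils, which resolves the full path whether or not the argument is a symlink, whereas plain readlink prints nothing for a regular file. The helper name and the layout under the temporary folder are made up for the demonstration.

```shell
# Print the directory containing the real file behind a path or symlink.
resolve_script_dir() {
    dirname "$(readlink -f "$1")"
}

# Throwaway layout that mimics /opt/zookeeper/bin and /etc/init.d.
demo=$(mktemp -d)
mkdir -p "$demo/opt/zookeeper/bin" "$demo/etc/init.d"
touch "$demo/opt/zookeeper/bin/zkServer.sh"
ln -s "$demo/opt/zookeeper/bin/zkServer.sh" "$demo/etc/init.d/zookeeper"

# Both calls print the same .../opt/zookeeper/bin directory.
resolve_script_dir "$demo/etc/init.d/zookeeper"
resolve_script_dir "$demo/opt/zookeeper/bin/zkServer.sh"

rm -rf "$demo"
```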

4. Now, create a symlink for the zookeeper service.

sudo ln -s /opt/zookeeper/bin/zkServer.sh /etc/init.d/zookeeper

5. Enable the zookeeper service to start at boot.

 sudo chkconfig zookeeper on

6. Now, restart all the servers.

 sudo init 6

7. Once restarted, you can manage zookeeper servers using the following commands.

sudo service zookeeper status
sudo service zookeeper stop
sudo service zookeeper start
sudo service zookeeper restart

8. When you check the status, it should produce an output like the following.

/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost.
Mode: leader

Out of the three servers, one will be in leader mode and the other two will be in follower mode.
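
To confirm the leader/follower split without logging in to each server, you can ask every node for its mode over the client port. The sketch below sends ZooKeeper's srvr four-letter command, whose reply includes a Mode: line, using bash's /dev/tcp redirection. The function names and the host names in the usage comment are hypothetical, and it assumes port 2181 is reachable from the machine you run it on.

```shell
# Extract the value of the first "Mode:" line from text on stdin.
parse_zk_mode() {
    awk -F': ' '/^Mode:/ { print $2; exit }'
}

# Ask one server for its mode; prints nothing if it cannot be reached.
zk_mode() {
    local host="$1" port="${2:-2181}"
    # The inner bash opens a TCP connection via /dev/tcp, sends "srvr"
    # and reads the reply; timeout caps the wait at 3 seconds.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1" && printf srvr >&3 && cat <&3' \
        "$host" "$port" 2>/dev/null | parse_zk_mode
}

# Example (hypothetical host names):
#   for h in zk1 zk2 zk3; do echo "$h: $(zk_mode "$h")"; done
```

In a healthy three-node cluster, one host should report leader and the other two follower. An empty result suggests the server is down or the port is blocked.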

7 comments
  1. You always need at least 3 nodes to avoid split brain. always an odd number. So 3 or 5 nodes.

  2. Hello,
    Thanks for the great article
    I would like to understand more about the commands in step 2 & 3 in Configuring Zookeeper As A Service

  3. Hi,

    Thanks for the detailed explanation on creating a cluster. I followed the steps to configure the settings, except for Configuring Zookeeper As A Service. When I start it using zkServer.sh, the console displays STARTED, but no pid is created. When I try to start it using java -cp zookeeper-3.4.10.jar:lib/slf4j-api-1.6.1.jar:lib/log4j-1.2.16.jar:conf org.apache.zookeeper.server.quorum.QuorumPeerMain conf/zoo.cfg, it shows the following:
    SLF4J: Failed to load class “org.slf4j.impl.StaticLoggerBinder”.
    SLF4J: Defaulting to no-operation (NOP) logger implementation
    SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
    Invalid config, exiting abnormally

  4. Hi,
    Thanks for the above post. I am currently working on trying to use zookeeper in a two node cluster. I have my own cluster formation algorithm running on the nodes based on configuration. We only need Zookeeper’s distributed DB functionality.

    1. Is it possible to use Zookeeper in a two node cluster ? Do you know of any solutions where this has been done ?
    2. Can we still retain the zookeepers DB functionality without forming a quorum ?

    Note: Fault tolerance is not the main concern in this project. If one of the nodes goes down, we have enough code logic to run without the zookeeper service. We use zookeeper to share data while both nodes are alive.

    Would greatly appreciate any help.
