ZooKeeper is a distributed coordination service that can be used for various distributed workloads. This article explains the steps needed to install and configure a three-node ZooKeeper cluster that forms a quorum.
Set Up a Zookeeper Cluster
Prerequisites
1. Three VMs, which together form a quorum. For a high-availability cluster, you can use any odd number of servers from three upward. The cluster stays available as long as a majority of its servers are up, so three servers can handle one failed node, five servers can handle two failed nodes, and so on.
2. VMs that accept inbound connections on ports 2181, 2888 and 3888. If iptables is enabled, make sure you open these ports: clients connect on 2181, followers connect to the leader on 2888, and leader election uses 3888.
Note: If you are using AWS or any other cloud provider, apart from server-level inbound rules, make sure the security groups or endpoints also allow the ZooKeeper ports.
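To make the sizing rule concrete: an ensemble of N servers needs a majority, floor(N/2) + 1, to stay up, and it tolerates the remaining servers failing. The sketch below is only that arithmetic in shell; the function name quorum_info is made up for illustration and is not part of ZooKeeper.

```shell
#!/usr/bin/env bash
# Print the majority (quorum) size and the number of failures an
# ensemble of N servers can tolerate.
# quorum_info is an illustrative helper, not a ZooKeeper command.
quorum_info() {
  local n majority tolerated
  n="$1"
  majority=$(( n / 2 + 1 ))      # smallest group that is more than half
  tolerated=$(( n - majority ))  # servers that may fail while a majority remains
  echo "servers=$n quorum=$majority tolerated_failures=$tolerated"
}

for n in 3 4 5; do
  quorum_info "$n"
done
```

Running it shows that four servers tolerate one failure, the same as three, which is why odd ensemble sizes are the usual choice.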
Install and Configure Zookeeper
Perform the following steps on all three VMs.
1. Update your server.
sudo yum -y update
2. Install Java if it is not already installed.
sudo yum -y install java-1.7.0-openjdk
3. Download ZooKeeper. If you wish to choose another version, you can get the download links from the Apache ZooKeeper download page.
wget http://mirror.fibergrid.in/apache/zookeeper/zookeeper-3.5.2-alpha/zookeeper-3.5.2-alpha.tar.gz
4. Extract the archive into the /opt directory.
sudo tar -xf zookeeper-3.5.2-alpha.tar.gz -C /opt/
5. Rename the zookeeper app directory
cd /opt
sudo mv zookeeper-* zookeeper
6. Create a zoo.cfg file in the /opt/zookeeper/conf directory with the configuration shown below.
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
server.1=<ZooKeeper_IP/hostname>:2888:3888
server.2=<ZooKeeper_IP/hostname>:2888:3888
server.3=<ZooKeeper_IP/hostname>:2888:3888
In the configuration above, server.1, server.2 and server.3 represent our three ZooKeeper servers. Replace each ZooKeeper_IP/hostname placeholder with that server's IP address or a resolvable hostname.
7. Create a zookeeper directory under /var/lib. This will be ZooKeeper's data directory, as set by dataDir in the zoo.cfg file.
sudo mkdir /var/lib/zookeeper
8. Create a file named myid in the /var/lib/zookeeper/ directory.
sudo touch /var/lib/zookeeper/myid
9. Each ZooKeeper server must have a unique number in its myid file, matching its server.N entry in zoo.cfg. For example, server 1 will have the value 1, server 2 will have the value 2, and so on.
server 1
sudo sh -c "echo '1' > /var/lib/zookeeper/myid"
server 2
sudo sh -c "echo '2' > /var/lib/zookeeper/myid"
server 3
sudo sh -c "echo '3' > /var/lib/zookeeper/myid"
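Steps 6 to 9 are identical on every machine except for the myid value, so they lend themselves to a small script. The following is a minimal sketch under stated assumptions: the hostnames zk1.example.com to zk3.example.com and the helper names render_zoo_cfg and myid_for are placeholders invented for this example, and the files are written to a scratch directory instead of /opt/zookeeper/conf and /var/lib/zookeeper.

```shell
#!/usr/bin/env bash
# Sketch: render zoo.cfg and look up a host's myid from one ordered host list.
# The hostnames and helper names are placeholders, not values from this article.
ZK_HOSTS="zk1.example.com zk2.example.com zk3.example.com"

# Print a zoo.cfg with the settings used above and one server.N line per host.
render_zoo_cfg() {
  local id host
  printf '%s\n' "tickTime=2000" "dataDir=/var/lib/zookeeper" \
    "clientPort=2181" "initLimit=5" "syncLimit=2"
  id=1
  for host in $ZK_HOSTS; do
    echo "server.$id=$host:2888:3888"
    id=$(( id + 1 ))
  done
}

# Print the myid (position in ZK_HOSTS) for the given host; return 1 if absent.
myid_for() {
  local id host
  id=1
  for host in $ZK_HOSTS; do
    if [ "$host" = "$1" ]; then
      echo "$id"
      return 0
    fi
    id=$(( id + 1 ))
  done
  return 1
}

# Write both files into a scratch directory so the sketch is safe to run anywhere.
out_dir="$(mktemp -d)"
render_zoo_cfg > "$out_dir/zoo.cfg"
myid_for "zk2.example.com" > "$out_dir/myid"
cat "$out_dir/zoo.cfg"
echo "myid: $(cat "$out_dir/myid")"
```

On a real server you would pass that machine's own hostname to myid_for and copy the two files into place with sudo. Deriving both files from the same list keeps the number in myid in step with the server.N line for that host.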
Configuring Zookeeper as a Service
ZooKeeper can be started and stopped using the scripts in /opt/zookeeper/bin, but running it as a service makes it easier to manage.
1. Open the zkServer.sh file for editing.
sudo vi /opt/zookeeper/bin/zkServer.sh
2. Add the following lines below the shebang “#!/usr/bin/env bash” so that chkconfig can add ZooKeeper to the system startup.
# description: Zookeeper Start Stop Restart
# processname: zookeeper
# chkconfig: 244 30 80
3. Find a line which says “# use POSTIX interface, symlink is followed automatically”. Replace the existing variables after that line with the following.
ZOOSH=`readlink $0`
ZOOBIN=`dirname $ZOOSH`
ZOOBINDIR=`cd $ZOOBIN; pwd`
ZOO_LOG_DIR=`echo $ZOOBIN`
4. Now, create a symlink for the zookeeper service.
sudo ln -s /opt/zookeeper/bin/zkServer.sh /etc/init.d/zookeeper
5. Enable the zookeeper service at boot.
sudo chkconfig zookeeper on
6. Now, restart all the servers.
sudo init 6
7. Once restarted, you can manage zookeeper servers using the following commands.
sudo service zookeeper status
sudo service zookeeper stop
sudo service zookeeper start
sudo service zookeeper restart
8. When you check the status, it should produce an output like the following.
/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost.
Mode: leader
Out of the three servers, one will be in leader mode and the other two will be in follower mode.
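Once all three services are up, a quick sanity check is that exactly one server reports leader mode. The sketch below counts the Mode: lines in status output read from standard input. The helper name summarize_modes is invented for this example, and the sample text stands in for output you would collect by running sudo service zookeeper status on each server.

```shell
#!/usr/bin/env bash
# Sketch: read status output from one or more servers on stdin and check
# that the ensemble has exactly one leader.
# summarize_modes is an illustrative helper, not a ZooKeeper command.
summarize_modes() {
  local input leaders followers
  input="$(cat)"
  # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
  leaders="$(printf '%s\n' "$input" | grep -c '^Mode: leader$' || true)"
  followers="$(printf '%s\n' "$input" | grep -c '^Mode: follower$' || true)"
  echo "leaders=$leaders followers=$followers"
  [ "$leaders" -eq 1 ]
}

# Sample Mode lines standing in for the status output of three servers.
sample="Mode: follower
Mode: leader
Mode: follower"

if printf '%s\n' "$sample" | summarize_modes; then
  echo "ensemble looks healthy"
else
  echo "expected exactly one leader"
fi
```

The function returns a non-zero status when there is no leader or more than one, so it can also be used as a guard in a larger script.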
7 comments
Better update the “&gt;” with “>”
Thanks for the update. It's corrected now!
You always need at least 3 nodes to avoid split brain, and always an odd number. So 3 or 5 nodes.
At least in my environment (Latest Amazon Linux)
I had to change
ZOOSH=`readlink $0`
to
ZOOSH=`readlink -f $0`
Hello,
Thanks for the great article
I would like to understand more about the commands in step 2 & 3 in Configuring Zookeeper As A Service
Hi,
Thanks for the detailed explanation on creating the cluster. I followed the steps to configure the settings, except for Configuring Zookeeper As A Service. When I start it using zkServer.sh, the console displays STARTED, but no pid is created. When I start it using java -cp zookeeper-3.4.10.jar:lib/slf4j-api-1.6.1.jar:lib/log4j-1.2.16.jar:conf org.apache.zookeeper.server.quorum.QuorumPeerMain conf/zoo.cfg, it shows the following:
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Invalid config, exiting abnormally
Hi,
Thanks for the above post. I am currently working on trying to use zookeeper in a two node cluster. I have my own cluster formation algorithm running on the nodes based on configuration. We only need Zookeeper’s distributed DB functionality.
1. Is it possible to use Zookeeper in a two node cluster? Do you know of any solutions where this has been done?
2. Can we still retain Zookeeper's DB functionality without forming a quorum?
Note: Fault tolerance is not the main concern in this project. If one of the nodes goes down, we have enough code logic to run without the zookeeper service. We use zookeeper to share data when the 2 nodes are alive.
Would greatly appreciate any help.