How To Setup an Elasticsearch Cluster

In part I, we learned the basic concepts of elasticsearch. In this tutorial, we will learn how to set up an elasticsearch cluster with a client node, a master node and a data node.

Setup an Elasticsearch Cluster

As a prerequisite for this setup, you need three virtual machines with enough memory. This tutorial is based on Ubuntu server 14.04. You can set up an Ubuntu server using Vagrant, or on any cloud provider.

Do the following before we start configuring the server for elasticsearch.

1. Create three Ubuntu 14.04 VMs with 1GB RAM each.

2. Update all the servers using the following command.

sudo apt-get update

3. Change the hostnames to es-client-01, es-master-01 and es-data-01 to match the client, master and data node roles.

4. Edit the /etc/hosts file on all the nodes and add an entry for each node's hostname as shown below. Replace the IP addresses with the IP addresses of your VMs.

192.168.4.40			es-client-01
192.168.4.41			es-master-01
192.168.4.42			es-data-01

The above configuration is very important because we will be using the hostname for the nodes to communicate with each other.
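The hosts-file step can also be scripted. The following is a minimal sketch, not a definitive procedure: the function names `hosts_entries` and `unresolved_hosts` are made up for illustration, and the IP addresses are the example values used above.

```shell
# Print the /etc/hosts entries for the three nodes.
# The IP addresses are the example values from this tutorial; use your own.
hosts_entries() {
  printf '%s\t%s\n' \
    "192.168.4.40" "es-client-01" \
    "192.168.4.41" "es-master-01" \
    "192.168.4.42" "es-data-01"
}

# Print every node hostname that does not resolve on this machine.
unresolved_hosts() {
  for h in es-client-01 es-master-01 es-data-01; do
    getent hosts "$h" >/dev/null 2>&1 || echo "$h"
  done
}
```

To append the entries you could run `hosts_entries | sudo tee -a /etc/hosts` on each node. Calling `unresolved_hosts` afterwards should print nothing once every name resolves.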

Setting up Client Node (es-client-01)

Now we have the base VM. Let's start with the elasticsearch configuration.

Install Latest Java

Elasticsearch needs a Java runtime because it is written in Java. You can install Java 8 by executing the following commands.

1. Add the WebUpd8 PPA, which provides the Oracle Java installer packages.

sudo add-apt-repository ppa:webupd8team/java

2. Now, refresh the package list.

sudo apt-get update

3. You can now install java using the following command.

sudo apt-get install oracle-java8-installer

4. Once installed, verify the installation by checking the java version.

java -version
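If you want to check the installed version from a script, the version string can be parsed. This is a rough sketch that assumes the usual `version "..."` line printed by `java -version`; the helper name `java_major` is hypothetical.

```shell
# Read the output of `java -version` on stdin and print the major version.
# Handles both the old scheme (1.8.0_73 -> 8) and the newer one (11.0.2 -> 11).
java_major() {
  sed -n 's/.*version "\([0-9][0-9._]*\).*/\1/p' | head -n 1 |
    awk -F'[._]' '{ if ($1 == 1) print $2; else print $1 }'
}
```

Because `java -version` writes to standard error, the call would look like `java -version 2>&1 | java_major`.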

Install Elasticsearch

1. Download the elasticsearch installation file.

wget https://download.elasticsearch.org/elasticsearch/release/org/elasticsearch/distribution/deb/elasticsearch/2.2.0/elasticsearch-2.2.0.deb

Note: At the time of writing, the release of elasticsearch is 2.2.0

2. Install the downloaded package.

Note: If you have downloaded any version other than 2.2.0, change the package name accordingly.

sudo dpkg -i elasticsearch-2.2.0.deb

3. Start the elasticsearch service.

sudo service elasticsearch start

4. Our node es-client-01 now has the elasticsearch service running, and we will use it as the client node. You also need to set elasticsearch to start automatically on bootup. Use the following command to do that.

sudo update-rc.d elasticsearch defaults 95 10

5. Verify the elasticsearch service by sending an HTTP request to port 9200. By default, elasticsearch runs on port 9200.

curl http://localhost:9200

You should see a JSON response, which looks like the following.

{
  "name" : "Crusher",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "2.2.0",
    "build_hash" : "8ff36d139e16f8720f2947ef62c8167a888992fe",
    "build_timestamp" : "2016-01-27T13:32:39Z",
    "build_snapshot" : false,
    "lucene_version" : "5.4.1"
  },
  "tagline" : "You Know, for Search"
}

The above output shows the name of the node, cluster name, and a few other details.

If you do not specify a node name in the configuration, elasticsearch assigns a random name on every restart.

All the elasticsearch configurations are present in elasticsearch.yml file, which is located in /etc/elasticsearch folder.

6. Now, the elasticsearch.yml file has to be edited to configure the node as a client node. Open the elasticsearch.yml file located in the /etc/elasticsearch directory and change the configuration as follows.

The configuration file has many sections like cluster, node, paths etc.

Note: Refer to this config file for all the configurations explained below.

Under the cluster section, change the cluster name parameter.

cluster.name: devopscube-production

Under node section, change the node name parameter and add other parameters as shown below.

node.name: es-client-01
node.client: true
node.data: false

Under the network section, set the "network.host" parameter to the IP address of your client node.

network.host: 192.168.4.40

Under the discovery section, add the following.

discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["es-client-01", "es-master-01",  "es-data-01"]

The above parameters disable multicast discovery and make the node send unicast messages to the specified hosts. As we have already added /etc/hosts entries for all the hostnames, the unicast messages will go to the respective nodes.

7. Save the file and restart the elasticsearch service for the changes to take effect.

sudo service elasticsearch restart
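After the restart, the node should report the new node name and cluster name. A small sketch for checking this from the shell follows. The helper `json_string_field` is a hypothetical name, and it does a plain text match on the pretty-printed response rather than real JSON parsing.

```shell
# Print the value of a top-level string field from the JSON on stdin.
# This is a plain text match, not a JSON parser; it is only meant for
# the small, pretty-printed response shown in step 5.
json_string_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}
```

For example, `curl -s http://es-client-01:9200 | json_string_field cluster_name` would be expected to print `devopscube-production`. The hostname is used here on the assumption that the node now listens on the address set in "network.host" rather than on localhost.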

Now, we need to make some system level changes. Open the /etc/security/limits.conf file to raise the limit on open files. By default, it is 1024 on Ubuntu. You can check this by running the “ulimit -n” command.

Add the following lines at the end of the file.

*        soft   nofile   64000
*        hard   nofile   64000
root     soft   nofile   64000
root     hard   nofile   64000

Open /etc/pam.d/common-session file and add the following line.

session required                        pam_limits.so
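To confirm that the new limit is picked up, you can compare the output of `ulimit -n` against the configured value after logging in again. The helper below is only a sketch with a made-up name, and it checks the limit of the shell that runs it, not of the elasticsearch process itself.

```shell
# Succeed if the current shell's open-file limit is at least $1.
# `ulimit -n` prints either a number or the word "unlimited".
nofile_at_least() {
  current=$(ulimit -n)
  [ "$current" = "unlimited" ] || [ "$current" -ge "$1" ]
}
```

A call such as `nofile_at_least 64000 && echo "limit ok"` prints the message only when the limit is high enough.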

It is recommended to set the heap size to half of the RAM. This tutorial is based on a 1 GB RAM VM, so we will configure a 512 MB heap.

You need to set an environment variable for elasticsearch heap size. You can do this by editing the /etc/environment file. The file should look like the following.

PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games"
ES_HEAP_SIZE="512M"

Once edited, you should reboot the server.
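The "half of the RAM" rule can be worked out from the memory the machine reports. The following sketch uses hypothetical helper names and assumes a Linux system with /proc/meminfo.

```shell
# Print half of the given amount of memory (in kB) as whole megabytes.
half_ram_mb() {
  echo $(( $1 / 1024 / 2 ))
}

# Read MemTotal (in kB) from /proc/meminfo on Linux.
total_ram_kb() {
  awk '/^MemTotal:/ { print $2 }' /proc/meminfo
}
```

Running `half_ram_mb "$(total_ram_kb)"` prints a value you could use for ES_HEAP_SIZE. On a 1 GB VM the result is usually a little below 512, because the kernel reports slightly less than the full 1 GB as MemTotal.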

Setting Up Master and Data Node

For the master and data nodes, follow all the steps we used to set up the client node. The only difference is in the elasticsearch.yml file, where you should use the values given below. All the other steps are the same for all the nodes.

For master node (elasticsearch.yml)

Under the node section of the elasticsearch.yml file, add the following. Refer to this file.

node.name: es-master-01
node.master: true
node.data: false

Under network section, change the “network.host” parameter. Change the IP address accordingly.

network.host: 192.168.4.41

For data node (elasticsearch.yml)

Under the node section, add the following. Refer to this file for the configurations.

node.name: es-data-01
node.client: false
node.data: true

Under the network section, set the "network.host" parameter to the data node's IP address, as you did for the client and master nodes.
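The settings that differ between the three roles can be summarized in a small generator. This is a sketch only: `es_node_config` is a made-up helper, it prints just the settings changed in this tutorial, and its output is meant to be compared with or merged into /etc/elasticsearch/elasticsearch.yml by hand, not to replace the file.

```shell
# Print the elasticsearch.yml settings used in this tutorial for one node.
# Usage: es_node_config <client|master|data> <node-name> <ip-address>
es_node_config() {
  # Reject unknown roles before printing anything.
  case "$1" in
    client|master|data) ;;
    *) echo "unknown role: $1" >&2; return 1 ;;
  esac
  echo "cluster.name: devopscube-production"
  echo "node.name: $2"
  # Role-specific lines, exactly as listed in the sections above.
  case "$1" in
    client) printf '%s\n' "node.client: true" "node.data: false" ;;
    master) printf '%s\n' "node.master: true" "node.data: false" ;;
    data)   printf '%s\n' "node.client: false" "node.data: true" ;;
  esac
  echo "network.host: $3"
  echo "discovery.zen.ping.multicast.enabled: false"
  echo 'discovery.zen.ping.unicast.hosts: ["es-client-01", "es-master-01", "es-data-01"]'
}
```

For example, `es_node_config data es-data-01 192.168.4.42` prints the seven lines for the data node.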

Once you configure all the three nodes, restart the elasticsearch service on all the three nodes.

sudo service elasticsearch restart

Now you will have a working elasticsearch cluster.
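One way to confirm that all three nodes have joined is to list the node names and compare them with the expected hostnames. The sketch below assumes that the `_cat/nodes` API with the `h=name` column selection is available in your elasticsearch version; `missing_nodes` is a hypothetical helper name.

```shell
# Read one node name per line on stdin and print every expected
# node name that is missing from that list.
missing_nodes() {
  joined=$(cat)
  for n in es-client-01 es-master-01 es-data-01; do
    printf '%s\n' "$joined" | grep -qx "[[:space:]]*$n[[:space:]]*" || echo "$n"
  done
}
```

A call such as `curl -s 'http://es-client-01:9200/_cat/nodes?h=name' | missing_nodes` should print nothing when the cluster is complete.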

Installing elasticsearch GUI plugin

Once you set up an elasticsearch cluster, you can view the cluster status on the client node (es-client-01) using the following command.

curl http://es-client-01:9200/_cluster/stats

But the output is not that easy to comprehend. So you can make use of the elasticsearch head plugin to view the cluster details in the browser UI.

We will install this plugin on our client node. To install the plugin, navigate to "/usr/share/elasticsearch/bin" directory and execute the following command.

./plugin install mobz/elasticsearch-head

Restart the elasticsearch service for the plugin to work.

sudo service elasticsearch restart

Now, if you access http://<IP>:9200/_plugin/head/ in your browser, you will be able to see all the cluster details.

Wrapping Up

In this tutorial, I have explained all the steps to set up a three-node elasticsearch cluster. In the next article, I will cover more on indexing strategies for elasticsearch.

Also, you can take a look at the devopscube vagrant repository for setting up the three-node cluster: Elasticsearch vagrant cluster setup