How to Set Up EFK Stack on Kubernetes: Step-by-Step Guide

In this Kubernetes tutorial, you’ll learn how to set up the EFK stack on a Kubernetes cluster for log streaming, log analysis, and log monitoring.

Check out part 1 in this Kubernetes logging series, where we have covered Kubernetes logging fundamentals and patterns for beginners.

When running multiple applications and services on a Kubernetes cluster, it makes more sense to stream all of your application and Kubernetes cluster logs to one centralized logging infrastructure for easy log analysis.

This beginner’s guide aims to walk you through the important technical aspects of Kubernetes logging through the EFK stack.

What is EFK Stack?

EFK stands for Elasticsearch, Fluentd, and Kibana. EFK is a popular open-source choice for Kubernetes log aggregation and analysis.

  1. Elasticsearch is a distributed and scalable search engine commonly used to sift through large volumes of log data. It is a NoSQL database based on the Lucene search engine (a search library from Apache). In this stack, its primary job is to store the logs shipped by Fluentd and make them available for retrieval.
  2. Fluentd is a log shipper. It is an open-source log collection agent that supports multiple data sources and output formats. It can also forward logs to solutions like Stackdriver, CloudWatch, Elasticsearch, Splunk, BigQuery, and many more. In short, it is a unifying layer between systems that generate log data and systems that store log data.
  3. Kibana is a UI tool for querying, data visualization, and dashboards. It lets you explore your log data through a web interface, build visualizations for event logs, and write specific queries to filter information when detecting issues. You can build virtually any type of dashboard using Kibana. Kibana Query Language (KQL) is used for querying Elasticsearch data. Here, we use Kibana to query the data indexed in Elasticsearch.

Elasticsearch also helps solve the problem of searching through huge amounts of unstructured data and is in use by many organizations. It is commonly deployed alongside Kibana.

Note: When it comes to Kubernetes, Fluentd is a better choice than Logstash because Fluentd can parse container logs without any extra configuration. Moreover, it is a CNCF project.

Setup EFK Stack on Kubernetes

We will look at the step-by-step process for setting up EFK using Kubernetes manifests. You can find all the manifests used in this blog in the Kubernetes EFK GitHub repo. Each EFK component’s manifests are organized in individual folders.

You can clone the repo and use the manifests while you follow along with the article.

git clone https://github.com/scriptcamp/kubernetes-efk

Note: All the EFK components get deployed in the default namespace.

EFK Architecture

The following diagram shows the high level architecture of EFK stack that we are going to build.

EFK Setup Architecture

EFK components get deployed as follows:

  1. Fluentd: Deployed as a DaemonSet because it needs to collect the container logs from all the nodes. It connects to the Elasticsearch service endpoint to forward the logs.
  2. Elasticsearch: Deployed as a StatefulSet because it holds the log data. We also expose a service endpoint for Fluentd and Kibana to connect to it.
  3. Kibana: Deployed as a Deployment. It connects to the Elasticsearch service endpoint.

Deploy Elasticsearch Statefulset

Elasticsearch is deployed as a StatefulSet, and its replicas connect with each other using a headless service. The headless service gives each pod a stable DNS name, which the replicas use to discover one another.

Save the following manifest as es-svc.yaml

apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
  labels:
    app: elasticsearch
spec:
  selector:
    app: elasticsearch
  clusterIP: None
  ports:
    - port: 9200
      name: rest
    - port: 9300
      name: inter-node

Let’s create it now.

kubectl create -f es-svc.yaml

Before we begin creating the StatefulSet for Elasticsearch, let’s recall that a StatefulSet needs a storage class to be defined beforehand, which it uses to create volumes whenever required.

Note: In a production environment, Elasticsearch typically needs 400 to 500 GB of volume storage. Here we are deploying with 3 GiB PVCs for demonstration.

Let’s create the Elasticsearch statefulset now. Save the following manifest as es-sts.yaml

Note: The statefulset creates the PVC with the default available storage class. If you have a custom storage class for PVC, you can add it in the volumeClaimTemplates by uncommenting the storageClassName parameter.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-cluster
spec:
  serviceName: elasticsearch
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
      - name: elasticsearch
        image: docker.elastic.co/elasticsearch/elasticsearch:7.5.0
        resources:
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        ports:
        - containerPort: 9200
          name: rest
          protocol: TCP
        - containerPort: 9300
          name: inter-node
          protocol: TCP
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
        env:
          - name: cluster.name
            value: k8s-logs
          - name: node.name
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: discovery.seed_hosts
            value: "es-cluster-0.elasticsearch,es-cluster-1.elasticsearch,es-cluster-2.elasticsearch"
          - name: cluster.initial_master_nodes
            value: "es-cluster-0,es-cluster-1,es-cluster-2"
          - name: ES_JAVA_OPTS
            value: "-Xms512m -Xmx512m"
      initContainers:
      - name: fix-permissions
        image: busybox
        command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
        securityContext:
          privileged: true
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
      - name: increase-vm-max-map
        image: busybox
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        securityContext:
          privileged: true
      - name: increase-fd-ulimit
        image: busybox
        command: ["sh", "-c", "ulimit -n 65536"]
        securityContext:
          privileged: true
  volumeClaimTemplates:
  - metadata:
      name: data
      labels:
        app: elasticsearch
    spec:
      accessModes: [ "ReadWriteOnce" ]
      # storageClassName: ""
      resources:
        requests:
          storage: 3Gi

Let’s create the statefulset.

kubectl create -f es-sts.yaml

Verify Elasticsearch Deployment

After the Elasticsearch pods come into the running state, let us verify the Elasticsearch StatefulSet. The easiest way to do this is to check the status of the cluster. To check the status, port-forward the Elasticsearch pod’s 9200 port.

kubectl port-forward es-cluster-0 9200:9200

To check the health of the Elasticsearch cluster, run the following command in the terminal.

curl http://localhost:9200/_cluster/health/?pretty

The output will display the status of the Elasticsearch cluster. If all the steps were followed correctly, the status should come up as ‘green’.

{
  "cluster_name" : "k8s-logs",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 8,
  "active_shards" : 16,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
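
The status check can also be scripted. The following is a minimal sketch, assuming the JSON layout shown above; the helper names es_status and es_is_green are hypothetical, and the field is extracted with sed instead of a full JSON parser.

```shell
#!/bin/sh
# es_status: read a _cluster/health JSON document on stdin and print the
# value of its "status" field (green, yellow, or red).
# Hypothetical helper; uses sed rather than a real JSON parser.
es_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# es_is_green: succeed only when the status read from stdin is "green".
es_is_green() {
  [ "$(es_status)" = "green" ]
}

# Usage against the port-forwarded pod from the step above:
#   curl -s http://localhost:9200/_cluster/health | es_is_green && echo "cluster is healthy"
```

A yellow status generally means some replica shards are unassigned, so a script like this is most useful as a gate before moving on to the next component.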

Tip on Elasticsearch Headless Service

As you know, a headless service does not work as a load balancer; it is used to address a group of pods together. There is another use case for headless services.

We can use it to get the address of individual pods. Let’s take an example to understand this.

We have three pods running as part of the Elasticsearch StatefulSet.

Pod name        Pod address
es-cluster-0    172.20.20.134
es-cluster-1    172.20.10.134
es-cluster-2    172.20.30.89

Elasticsearch pods and their addresses

A headless service named “elasticsearch” points to these pods.

If you run nslookup from a pod in the same namespace of your cluster, you can get the address of each of the above pods through the headless service.

nslookup es-cluster-0.elasticsearch.default.svc.cluster.local

Server:		10.100.0.10
Address:	10.100.0.10#53

Name:	es-cluster-0.elasticsearch.default.svc.cluster.local
Address: 172.20.20.134

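This concept is used very commonly in Kubernetes, so it should be understood clearly. In fact, the StatefulSet env var “discovery.seed_hosts” relies on it: each entry, such as es-cluster-0.elasticsearch, is a per-pod DNS name served by the headless service. The “cluster.initial_master_nodes” env var lists the matching node names, which are set to the pod names through node.name.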
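
To make the naming scheme concrete, here is a small sketch that builds the per-pod DNS names for a StatefulSet behind a headless service. The helper names seed_hosts and pod_fqdn are hypothetical, and the fully qualified form assumes the default cluster domain cluster.local.

```shell
#!/bin/sh
# seed_hosts STATEFULSET SERVICE REPLICAS
# Print the comma-separated per-pod DNS names that a headless service
# publishes for a StatefulSet (pods are named <statefulset>-<ordinal>).
# Hypothetical helper for illustration.
seed_hosts() {
  sts=$1; svc=$2; replicas=$3
  i=0; out=""
  while [ "$i" -lt "$replicas" ]; do
    out="${out:+$out,}${sts}-${i}.${svc}"
    i=$((i + 1))
  done
  printf '%s\n' "$out"
}

# pod_fqdn STATEFULSET ORDINAL SERVICE NAMESPACE
# Print the fully qualified name, assuming the default cluster domain
# "cluster.local" (clusters can be configured with a different one).
pod_fqdn() {
  printf '%s-%s.%s.%s.svc.cluster.local\n' "$1" "$2" "$3" "$4"
}
```

Calling seed_hosts es-cluster elasticsearch 3 produces the same value used for discovery.seed_hosts in the StatefulSet manifest, which is handy if you change the replica count and need to regenerate the list.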

Now that we have a running Elasticsearch cluster, let’s move on to Kibana.

Deploy Kibana Deployment & Service

Kibana can be created as a simple Kubernetes Deployment. In the following Kibana deployment manifest, the env var ELASTICSEARCH_HOSTS configures the Elasticsearch cluster endpoint. Kibana uses this endpoint URL to connect to Elasticsearch. (Kibana 7.x reads ELASTICSEARCH_HOSTS; the older ELASTICSEARCH_URL variable applies to Kibana 6.x.)

Create the Kibana deployment manifest as kibana-deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana
  labels:
    app: kibana
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kibana
  template:
    metadata:
      labels:
        app: kibana
    spec:
      containers:
      - name: kibana
        image: docker.elastic.co/kibana/kibana:7.5.0
        resources:
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        env:
          - name: ELASTICSEARCH_HOSTS
            value: http://elasticsearch:9200
        ports:
        - containerPort: 5601

Create the manifest now.

kubectl create -f kibana-deployment.yaml

Let’s create a service of type NodePort to access the Kibana UI over a node IP address. We are using NodePort for demonstration purposes. For an actual project implementation, a Kubernetes Ingress with a ClusterIP service is the usual choice.

Save the following manifest as kibana-svc.yaml

apiVersion: v1
kind: Service
metadata:
  name: kibana-np
spec:
  selector: 
    app: kibana
  type: NodePort  
  ports:
    - port: 8080
      targetPort: 5601 
      nodePort: 30000

Create the kibana-svc now.

kubectl create -f kibana-svc.yaml

Now you will be able to access Kibana over http://<node-ip>:30000
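
As a quick sanity check on the address, the sketch below builds the Kibana URL from a node IP and the nodePort value in the service manifest (30000). The helper name kibana_url is hypothetical. It rejects ports outside 30000-32767, which is the default NodePort range in Kubernetes; a cluster administrator may have configured a different range.

```shell
#!/bin/sh
# kibana_url NODE_IP NODE_PORT
# Print the URL for a NodePort service, rejecting ports outside the default
# Kubernetes NodePort range of 30000-32767 (an administrator can change
# this range on the API server). Hypothetical helper.
kibana_url() {
  ip=$1; port=$2
  if [ "$port" -lt 30000 ] || [ "$port" -gt 32767 ]; then
    echo "nodePort $port is outside the default range 30000-32767" >&2
    return 1
  fi
  printf 'http://%s:%s\n' "$ip" "$port"
}

# Usage: pick a node address from "kubectl get nodes -o wide", then
#   kibana_url <node-ip> 30000
```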

Verify Kibana Deployment

After the pods come into the running state, let us try and verify Kibana deployment. The easiest method to do this is through the UI access of the cluster.

To check the status, port-forward the Kibana pod’s 5601 port. If you have created the nodePort service, you can also use that.

kubectl port-forward <kibana-pod-name> 5601:5601

After this, access the UI through the web browser or make a request using curl

curl http://localhost:5601/app/kibana

If the Kibana UI loads or a valid curl response comes up, then we can conclude that Kibana is running correctly.

Let’s move on to the Fluentd component now.

Deploy Fluentd Kubernetes Manifests

Fluentd is deployed as a DaemonSet since it has to stream logs from all the nodes in the cluster. In addition, it requires special permissions to list and extract pod metadata in all namespaces.

Kubernetes service accounts, along with cluster roles and cluster role bindings, are used to grant permissions to a component in Kubernetes. Let’s go ahead and create the required service account and roles.

Create Fluentd Cluster Role

A cluster role in kubernetes contains rules that represent a set of permissions. For fluentd, we want to give permissions for pods and namespaces.

Create a manifest fluentd-role.yaml

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluentd
  labels:
    app: fluentd
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - namespaces
  verbs:
  - get
  - list
  - watch

Apply the manifest

kubectl create -f fluentd-role.yaml

Create Fluentd Service Account

A service account in kubernetes is an entity to provide identity to a pod. Here, we want to create a service account to be used with fluentd pods.

Create a manifest fluentd-sa.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  labels:
    app: fluentd

Apply the manifest

kubectl create -f fluentd-sa.yaml

Create Fluentd Cluster Role Binding

A cluster rolebinding in kubernetes grants permissions defined in a cluster role to a service account. We want to create a rolebinding between the role and the service account created above.

Create a manifest fluentd-rb.yaml

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd
roleRef:
  kind: ClusterRole
  name: fluentd
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: fluentd
  namespace: default

Apply the manifest

kubectl create -f fluentd-rb.yaml
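
Once the binding exists, you can ask the API server whether the service account really has the permissions from the cluster role. This is a minimal sketch using kubectl auth can-i with impersonation; the helper names sa_subject and check_fluentd_rbac are hypothetical, and the check assumes your own user is allowed to impersonate service accounts.

```shell
#!/bin/sh
# sa_subject NAMESPACE NAME
# Print the user name Kubernetes assigns to a service account, which is
# the form that "kubectl auth can-i --as" expects.
sa_subject() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# check_fluentd_rbac: ask the API server whether the fluentd service
# account may get, list, and watch pods and namespaces.
# Requires kubectl and cluster access, so it is only defined here and
# has to be called explicitly.
check_fluentd_rbac() {
  subject=$(sa_subject default fluentd)
  for resource in pods namespaces; do
    for verb in get list watch; do
      printf '%s %s: ' "$verb" "$resource"
      kubectl auth can-i "$verb" "$resource" --as="$subject"
    done
  done
}
```

Running check_fluentd_rbac should print "yes" for all six combinations if the role, service account, and binding were created as described.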

Deploy Fluentd DaemonSet

Let us deploy the daemonset now.

Save the following as fluentd-ds.yaml

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  labels:
    app: fluentd
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      serviceAccount: fluentd
      serviceAccountName: fluentd
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1.4.2-debian-elasticsearch-1.1
        env:
          - name:  FLUENT_ELASTICSEARCH_HOST
            value: "elasticsearch.default.svc.cluster.local"
          - name:  FLUENT_ELASTICSEARCH_PORT
            value: "9200"
          - name: FLUENT_ELASTICSEARCH_SCHEME
            value: "http"
          - name: FLUENTD_SYSTEMD_CONF
            value: disable
        resources:
          limits:
            memory: 512Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers

Note: If you check the DaemonSet, we have used two env vars, "FLUENT_ELASTICSEARCH_HOST" and "FLUENT_ELASTICSEARCH_PORT". Fluentd uses these Elasticsearch values to ship the collected logs.

Let’s apply the Fluentd manifest.

kubectl create -f fluentd-ds.yaml

Verify Fluentd Setup

In order to verify the fluentd installation, let us start a pod that creates logs continuously. We will then try to see these logs inside Kibana.

Save the following as test-pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  containers:
  - name: count
    image: busybox
    args: [/bin/sh, -c, 'i=0; while true; do echo "Thanks for visiting devopscube! $i"; i=$((i+1)); sleep 1; done']

Apply the manifest

kubectl create -f test-pod.yaml
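
Before opening Kibana, it can help to confirm that Fluentd has actually created indices in Elasticsearch. The sketch below assumes the port-forward to es-cluster-0 from the Elasticsearch verification step is still running and that Fluentd writes indices with the logstash- prefix, which is what the index pattern in the steps below expects. The helper name count_logstash_indices is hypothetical.

```shell
#!/bin/sh
# count_logstash_indices: read index names (one per line) on stdin and
# print how many start with "logstash-". Prints 0 when none match.
count_logstash_indices() {
  grep -c '^logstash-' || true
}

# Usage with the _cat/indices API, asking only for the index column:
#   curl -s 'http://localhost:9200/_cat/indices?h=index' | count_logstash_indices
```

If the count is 0, the index pattern will not match anything in Kibana yet; checking the Fluentd pod logs with kubectl logs is a reasonable next step.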

Now, let’s head to Kibana to check whether the logs from this pod are being picked up by Fluentd and stored in Elasticsearch. Follow the steps below:

Step 1: Open the Kibana UI using a proxy or the NodePort service endpoint. Head to the Management console inside it.

Step 2: Select the “Index Patterns” option under the Kibana section.

Step 3: Create a new index pattern using the pattern “logstash-*”.

Step 4: Select “@timestamp” in the timestamp option.

Step 5: Now the index pattern has been created. Head to the Discover console. Here, you will be able to see all the logs being exported by Fluentd, like the logs from our test pod, as shown in the image below.

Kubernetes logs in Kibana from fluentd

That’s it!

We have covered all the components required for a logging solution in Kubernetes and also verified each of our components separately. Let us go through the best practices for using the EFK stack.

Kubernetes EFK Best Practices

  1. Elasticsearch uses heap memory extensively for filtering and caching to improve query performance, so ample memory should be available for Elasticsearch.

    Giving more than half of the total memory to Elasticsearch could leave too little memory for OS functions, which could in turn hamper Elasticsearch’s capabilities.

    So be mindful of this! Allocating 40-50% of the total memory to the Elasticsearch heap is good enough.
  2. Elasticsearch indices can fill up quickly, so it’s important to clean up old indices regularly. Kubernetes cron jobs can help you do this in an automated fashion.
  3. Having data replicated across multiple nodes can help in disaster recovery and also improve query performance. By default, the replication factor in Elasticsearch is set to 1.

    Consider adjusting this value according to your use case. Having at least 2 is a good practice.
  4. Data that is known to be accessed more frequently can be placed on different nodes with more resources allocated. This can be achieved by running a cron job that moves the indices to different nodes at regular intervals.

    Though this is an advanced use case, it is good for a beginner to at least know that something like this can be done.
  5. In Elasticsearch, you can archive indices to low-cost cloud storage such as AWS S3 and restore them when you need data from those indices.

    This is a best practice if you need to retain logs for audit and compliance.
  6. Having multiple nodes like master, data, and client nodes with dedicated functions is good for high availability and fault tolerance.
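
The index cleanup from point 2 can be sketched as a small script that a Kubernetes cron job could run. This is only an illustration: it assumes daily indices named logstash-YYYY.MM.DD, and the helper name old_indices is hypothetical. It only selects the indices to remove; the actual deletion is shown as a commented usage line.

```shell
#!/bin/sh
# old_indices CUTOFF
# Read index names on stdin (one per line) and print those named
# logstash-YYYY.MM.DD whose date is strictly before CUTOFF (YYYY.MM.DD).
# Names that do not match the pattern are ignored.
old_indices() {
  cutoff=$(printf '%s' "$1" | tr -d '.')
  while IFS= read -r name; do
    case $name in
      logstash-[0-9][0-9][0-9][0-9].[0-9][0-9].[0-9][0-9])
        # Turn 2022.02.10 into 20220210 so dates compare as integers.
        day=$(printf '%s' "${name#logstash-}" | tr -d '.')
        if [ "$day" -lt "$cutoff" ]; then
          printf '%s\n' "$name"
        fi
        ;;
    esac
  done
}

# Usage (assumes Elasticsearch is reachable on localhost:9200):
#   curl -s 'http://localhost:9200/_cat/indices?h=index' | old_indices 2022.02.05 |
#     while IFS= read -r idx; do curl -s -X DELETE "http://localhost:9200/$idx"; done
```

The cutoff is passed in explicitly so the selection logic stays easy to test; on systems with GNU date, a value such as $(date -d '7 days ago' +%Y.%m.%d) could supply it.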

Beyond EFK – Further Research

This guide was just a small use case of setting up the Elastic stack on Kubernetes. Elastic stack has tons of other features which help in logging and monitoring solutions. 

For example, it can ship logs from virtual machines and managed services of various cloud providers. You can even ship logs from data engineering tools like Kafka into the elastic stack. 

The elastic stack has other powerful components worth looking into, such as:

  1. Elastic Metrics: Ships metrics from multiple sources across your entire infrastructure and makes them available in Elasticsearch and Kibana.
  2. APM: Expands Elastic Stack capabilities and lets you analyze exactly where an application is spending time, so you can quickly fix issues in production.
  3. Uptime: Helps in monitoring and analyzing availability issues across your apps and services before they start appearing in production.

Explore and research them!

Conclusion

In this Kubernetes EFK setup guide, we have learned how to set up the logging infrastructure on Kubernetes.

If you want to become a DevOps engineer, it is very important to understand all the concepts involved in the Kubernetes logging.

In the next part of this series, we are going to explore Kibana dashboards and visualization options.

In Kibana, it is a good practice to visualize data through graphs wherever possible as it gives a much more clear picture of your application state. So don’t forget to check out the next part of this series.

Till then, keep learning and exploring.

10 comments
  1. I have created everything as mentioned in this page.

    I am not able to create index pattern logstash-* in step 3. (Your Index pattern doesn’t match any indices.)

  2. Hi i am facing this issue. fluentd cant collect logs from nodes.
    “xxx.log unreadable. It is excluded and would be examined next time.”

    2022-02-10 10:15:55 +0000 [warn]: #0 [in_tail_container_logs] /var/log/containers/task-deps-cronjob-hert-bbu-v500r011c10-27407940-48ntw_development_task-deps-cronjob-hert-bbu-v500r011c10-80d636778693bc7e09a655a8a171596b7ca76b776db316e1ac0d122ee07e7258.log unreadable. It is excluded and would be examined next time.
    2022-02-10 10:15:55 +0000 [warn]: #0 [in_tail_container_logs] /var/log/containers/redis-replicas-0_databases_metrics-c43fc5dc025d4232d87d4b340e07cc53e44b61a4faf68a5ea42b0869bceddf7c.log unreadable. It is excluded and would be examined next time.
    2022-02-10 10:15:55 +0000 [warn]: #0 [in_tail_container_logs] /var/log/containers/agreement-hert-bbu-v500r012c00-db4d456b8-t5pck_development_agreement-hert-bbu-v500r012c00-650f6f4cc3995b9e837c48ca447696c3ff73bea1e2b50a6d6e53278df6a7f006.log unreadable. It is excluded and would be examined next time.
    2022-02-10 10:15:55 +0000 [warn]: #0 [in_tail_container_logs] /var/log/containers/csi-provisioner-6ccbfbf86f-f7mr7_longhorn-system_csi-provisioner-c5f778844ed5ce04caa2422ea33459c2c407125d6f1ed3903d4cace1a6dd6b35.log unreadable. It is excluded and would be examined next time.

  3. This is a useful guide. Thank you.
    What I can’t see though, is where the fluentd config is. Usually I have a config file where I can specify the logs to follow, parse, filter etc.
    Then there is a section which sends them on to elastic.
    In this guide it looks like all of this is happening in the fluentd daemonset yaml.
    Is the config not needed anymore? How would I modify my fluentd, to only follow logs of certain containers? Or add another source?
