How to Set Up a Kubernetes Cluster Using Kubeadm

Kubeadm cluster setup guide

In this blog post, I cover a step-by-step guide to setting up a Kubernetes cluster using Kubeadm with one master node and two worker nodes.

Kubeadm is an excellent tool for getting a working Kubernetes cluster up quickly. It does the heavy lifting of setting up all the Kubernetes cluster components, and it follows the configuration best practices for a Kubernetes cluster.

What is Kubeadm?

Kubeadm is a tool to set up a minimum viable Kubernetes cluster without much complex configuration. It also makes the whole process easier by running a series of preflight checks to ensure that the server has all the essential components and configs to run Kubernetes.

It is developed and maintained by the official Kubernetes community. There are other options like minikube, kind, etc., that are pretty easy to set up. You can check out my minikube tutorial. Those are good options with minimal hardware requirements if you only need to deploy and test applications on Kubernetes.

But if you want to play around with the cluster components or test utilities that are part of cluster administration, Kubeadm is the best option. You can also create a production-like cluster locally on a workstation for development and testing purposes.

Kubeadm Setup Prerequisites

Following are the prerequisites for the Kubeadm Kubernetes cluster setup.

  1. A minimum of two Ubuntu nodes [one master and one worker node]. You can have more worker nodes as per your requirement.
  2. The master node should have a minimum of 2 vCPU and 2 GB RAM.
  3. For the worker nodes, a minimum of 1 vCPU and 2 GB RAM is recommended.
  4. A 10.X.X.X/X network range with static IPs for the master and worker nodes. We will use the 192.x.x.x series as the pod network range for the Calico network plugin. Make sure the node IP range and the pod IP range don’t overlap.
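
If you want to sanity-check the node and pod ranges before you start, here is a minimal sketch that compares two IPv4 CIDR blocks. The function names ip_to_int and cidr_overlap are made up for this illustration, and it assumes plain dotted-quad IPv4 input without leading zeros.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  # Intentionally unquoted so the address splits on the dots.
  set -- $1
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Return 0 (true) if the two CIDR blocks overlap, 1 otherwise.
# Two blocks overlap when they agree on the shorter of the two prefixes.
cidr_overlap() {
  ip1=${1%/*}; len1=${1#*/}
  ip2=${2%/*}; len2=${2#*/}
  if [ "$len1" -lt "$len2" ]; then min=$len1; else min=$len2; fi
  mask=$(( (4294967295 << (32 - $min)) & 4294967295 ))
  n1=$(ip_to_int "$ip1")
  n2=$(ip_to_int "$ip2")
  [ $(( $n1 & $mask )) -eq $(( $n2 & $mask )) ]
}

# Example:
#   if cidr_overlap "10.0.0.0/24" "192.168.0.0/16"; then
#     echo "node and pod ranges overlap, pick another pod CIDR"
#   fi
```

Pass your node subnet as the first argument and the pod CIDR you plan to give kubeadm as the second.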

Note: If you are setting up the cluster in a corporate network behind a proxy, make sure you set the proxy variables and have access to the container registry and Docker Hub. Otherwise, ask your network administrator to allow registry.k8s.io so the nodes can pull the required images.

Kubeadm Port Requirements

Please refer to the following image and make sure all the ports are allowed for the control plane (master) and the worker nodes. If you are setting up the kubeadm cluster on cloud servers, ensure you allow the ports in the firewall configuration.

Kubeadm kubernetes cluster port requirements

If you are using vagrant-based Ubuntu VMs, the firewall will be disabled by default. So you don’t have to do any firewall configurations.
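
For quick reference in a terminal, here is a small sketch that prints the inbound TCP ports per node role. The port numbers are the defaults as I know them from the official Kubernetes "Ports and Protocols" page, so verify them against the documentation for your version. The function name required_ports is only an illustration.

```shell
# Print the default inbound TCP ports for a node role.
# Usage: required_ports control-plane | worker
required_ports() {
  case "$1" in
    control-plane)
      # 6443 API server, 2379-2380 etcd, 10250 kubelet,
      # 10257 controller manager, 10259 scheduler
      echo "6443 2379-2380 10250 10257 10259"
      ;;
    worker)
      # 10250 kubelet, 10256 kube-proxy, 30000-32767 NodePort services
      echo "10250 10256 30000-32767"
      ;;
    *)
      echo "unknown role: $1" >&2
      return 1
      ;;
  esac
}

# Example:
#   required_ports control-plane
```

You can feed the single ports into a tool like nc from another node to confirm the firewall lets them through.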

Kubeadm for Kubernetes Certification Exams

If you are preparing for Kubernetes certifications like CKA, CKAD, or CKS, you can use the local kubeadm clusters to practice for the certification exam. In fact, kubeadm itself is part of the CKA and CKS exam. For CKA you might be asked to bootstrap a cluster using Kubeadm. For CKS, you have to upgrade the cluster using kubeadm.

If you use Vagrant-based VMs on your workstation, you can start and stop the cluster whenever you need. By having the local Kubeadm clusters, you can play around with all the cluster configurations and learn to troubleshoot different components in the cluster.

Important Note: If you are planning for Kubernetes certification, make use of the CKA/CKAD/CKS coupon Codes before the price increases.

Vagrantfile, Kubeadm Scripts & Manifests

Also, all the commands used in this guide for master and worker nodes config are hosted in GitHub. You can clone the repository for reference.

git clone https://github.com/techiescamp/kubeadm-scripts

This guide intends to make you understand each config required for the Kubeadm setup. If you don’t want to run the commands one by one, you can run the script file directly.

If you are using Vagrant to set up the Kubernetes cluster, you can make use of my Vagrantfile. It is a basic, self-explanatory Vagrantfile that launches 3 VMs. If you are new to Vagrant, check the Vagrant tutorial.

If you are a Terraform and AWS user, you can make use of the Terraform script present under the Terraform folder to spin up EC2 instances.

Also, I have created a video demo of the whole kubeadm setup. You can refer to it during the setup.

Kubernetes Cluster Setup Using Kubeadm

Following are the high-level steps involved in setting up a kubeadm-based Kubernetes cluster.

  1. Install a container runtime on all the nodes. We will be using CRI-O.
  2. Install kubeadm, kubelet, and kubectl on all the nodes.
  3. Initialize the kubeadm control plane configuration on the master node.
  4. Save the node join command with the token.
  5. Install the Calico network plugin.
  6. Join the worker node to the master node (control plane) using the join command.
  7. Validate all cluster components and nodes.
  8. Install the Kubernetes Metrics Server.
  9. Deploy a sample app and validate it.

All the steps in this guide are taken from the official Kubernetes documentation and related GitHub project pages.

If you want to understand every cluster component in detail, refer to the comprehensive Kubernetes Architecture.

Now let’s get started with the setup.

Step 1: Enable iptables Bridged Traffic on all the Nodes

Execute the following commands on all the nodes so that iptables can see bridged traffic. The commands load the overlay and br_netfilter kernel modules and set the required kernel parameters using sysctl.

cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF

sudo modprobe overlay
sudo modprobe br_netfilter

# sysctl params required by setup, params persist across reboots
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF

# Apply sysctl params without reboot
sudo sysctl --system
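
To double-check that the config file carries all three settings, you could use something like the sketch below. The function name check_k8s_sysctl is hypothetical, and it only inspects the file contents, not the live kernel values.

```shell
# Check that a sysctl config file sets every key this step needs to 1.
# Usage: check_k8s_sysctl [file]   (defaults to /etc/sysctl.d/k8s.conf)
check_k8s_sysctl() {
  conf=${1:-/etc/sysctl.d/k8s.conf}
  rc=0
  for key in net.bridge.bridge-nf-call-iptables \
             net.bridge.bridge-nf-call-ip6tables \
             net.ipv4.ip_forward; do
    # Take the last assignment of the key, ignoring surrounding whitespace.
    val=$(awk -F= -v k="$key" '
      { gsub(/[ \t]/, "", $1); gsub(/[ \t]/, "", $2) }
      $1 == k { v = $2 }
      END { print v }' "$conf")
    if [ "$val" = "1" ]; then
      echo "ok      $key"
    else
      echo "MISSING $key"
      rc=1
    fi
  done
  return $rc
}

# Example:
#   check_k8s_sysctl
```

For the live values, lsmod | grep br_netfilter and sysctl net.ipv4.ip_forward show what the kernel is actually using.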

Step 2: Disable swap on all the Nodes

For kubeadm to work properly, you need to disable swap on all the nodes using the following command.

sudo swapoff -a
(crontab -l 2>/dev/null; echo "@reboot /sbin/swapoff -a") | crontab - || true

The crontab entry runs swapoff at every boot, so swap stays off across system reboots. Note that this command does not modify /etc/fstab.
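
To confirm that swap is really off on a node, here is a small sketch that reads the kernel's swap table. The function name swap_is_active is made up for this example. It relies on /proc/swaps starting with a header line, so any line after the header means a swap area is active.

```shell
# Return 0 if the swap table lists at least one active swap area.
# Usage: swap_is_active [file]   (defaults to /proc/swaps)
swap_is_active() {
  table=${1:-/proc/swaps}
  # Count non-empty lines after the header line.
  active=$(awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }' "$table")
  [ "$active" -gt 0 ]
}

# Example:
#   if swap_is_active; then
#     echo "swap is still on, run: sudo swapoff -a"
#   fi
```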

You can also suppress swap errors using the kubeadm parameter --ignore-preflight-errors Swap. We will look at it in a later part of this guide.

Note: From 1.28 kubeadm has beta support for using swap with kubeadm clusters. Read this to understand more.

Step 3: Install CRI-O Runtime On All The Nodes

Note: We are using CRI-O instead of containerd because, in Kubernetes certification exams, CRI-O is used as the container runtime in the exam clusters.

The basic requirement for a Kubernetes cluster is a container runtime. You can have any one of the following container runtimes.

  1. CRI-O
  2. containerd
  3. Docker Engine (using cri-dockerd)

We will be using CRI-O instead of Docker for this setup because Kubernetes removed its built-in Docker Engine integration (dockershim) in v1.24.

Execute the following commands on all the nodes to install the required dependencies and the latest version of CRI-O.


sudo apt-get update -y
sudo apt-get install -y software-properties-common gpg curl apt-transport-https ca-certificates

# Writing under /etc/apt needs root, so gpg and tee run through sudo
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/addons:/cri-o:/prerelease:/main/deb/Release.key |
    sudo gpg --dearmor -o /etc/apt/keyrings/cri-o-apt-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/cri-o-apt-keyring.gpg] https://pkgs.k8s.io/addons:/cri-o:/prerelease:/main/deb/ /" |
    sudo tee /etc/apt/sources.list.d/cri-o.list

sudo apt-get update -y
sudo apt-get install -y cri-o

sudo systemctl daemon-reload
# --now enables the service at boot and starts it immediately
sudo systemctl enable crio --now

Install crictl.

VERSION="v1.30.0"
wget https://github.com/kubernetes-sigs/cri-tools/releases/download/$VERSION/crictl-$VERSION-linux-amd64.tar.gz
sudo tar zxvf crictl-$VERSION-linux-amd64.tar.gz -C /usr/local/bin
rm -f crictl-$VERSION-linux-amd64.tar.gz

crictl is a CLI utility to interact with the containers created by the container runtime.

When you use container runtimes other than Docker, you can use the crictl utility to debug containers on the nodes. It is also useful in the CKS certification, where you need to debug containers.

Step 4: Install Kubeadm & Kubelet & Kubectl on all Nodes

Download the GPG key for the Kubernetes APT repository on all the nodes.

KUBERNETES_VERSION=1.30

sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v$KUBERNETES_VERSION/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v$KUBERNETES_VERSION/deb/ /" | sudo tee /etc/apt/sources.list.d/kubernetes.list

Update apt repo

sudo apt-get update -y

Note: If you are preparing for a Kubernetes certification, install the specific Kubernetes version used in the exam. For example, the current Kubernetes version for the CKA, CKAD, and CKS exams is 1.30.

You can use the following command to list the available versions. Install the first 1.30 release so that you can practice the cluster upgrade task.

apt-cache madison kubeadm | tac

Specify the version as shown below. Here I am using 1.30.0-1.1

sudo apt-get install -y kubelet=1.30.0-1.1 kubectl=1.30.0-1.1 kubeadm=1.30.0-1.1
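
If you would rather not pick the version by eye, here is a rough sketch that pulls the lowest package version of a given minor release out of the apt-cache madison output. The function name first_patch_version is my own, and it assumes GNU sort with the -V (version sort) option, which Ubuntu ships.

```shell
# Read `apt-cache madison <package>` output on stdin and print the lowest
# package version that belongs to the given minor release (for example 1.30).
first_patch_version() {
  minor=$1
  # Column 2 of the madison output is the version; keep those starting with "<minor>."
  awk -F'|' -v m="$minor." '{ gsub(/[ \t]/, "", $2); if (index($2, m) == 1) print $2 }' |
    sort -V |
    head -n 1
}

# Example:
#   apt-cache madison kubeadm | first_patch_version 1.30
```

The printed value can be used as the version suffix for kubelet, kubeadm, and kubectl in the install command.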

Or, to install the latest version from the repo use the following command without specifying any version.

sudo apt-get install -y kubelet kubeadm kubectl

Add hold to the packages to prevent upgrades.

sudo apt-mark hold kubelet kubeadm kubectl

Now we have all the required utilities and tools for configuring Kubernetes components using kubeadm.

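Add the node IP to KUBELET_EXTRA_ARGS. The command below reads the address from eth0, so change the interface name if your nodes use a different one.

sudo apt-get install -y jq
local_ip="$(ip --json addr show eth0 | jq -r '.[0].addr_info[] | select(.family == "inet") | .local')"
# /etc/default/kubelet is owned by root, so write it through sudo tee
cat <<EOF | sudo tee /etc/default/kubelet
KUBELET_EXTRA_ARGS=--node-ip=$local_ip
EOF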

Step 5: Initialize Kubeadm On Master Node To Setup Control Plane

Here you need to consider two options.

  1. Master Node with Private IP: If your nodes have only private IP addresses, the API server is accessed over the private IP of the master node.
  2. Master Node with Public IP: If you are setting up the Kubeadm cluster on a cloud platform and need to access the master API server over the public IP of the master node.

Only the Kubeadm initialization command differs for Public and Private IPs.

Execute the commands in this section only on the master node.

If you are using a Private IP for the master Node,

Set the following environment variables. Replace 10.0.0.10 with the IP of your master node.

IPADDR="10.0.0.10"
NODENAME=$(hostname -s)
POD_CIDR="192.168.0.0/16"

If you want to use the Public IP of the master node,

Set the following environment variables. The IPADDR variable is automatically set to the server’s public IP using a curl call to ifconfig.me. You can also replace it with a public IP address.

IPADDR=$(curl -s ifconfig.me)
NODENAME=$(hostname -s)
POD_CIDR="192.168.0.0/16"

Now, initialize the master node control plane configurations using the kubeadm command.

For a Private IP address-based setup use the following init command.

sudo kubeadm init --apiserver-advertise-address=$IPADDR  --apiserver-cert-extra-sans=$IPADDR  --pod-network-cidr=$POD_CIDR --node-name $NODENAME --ignore-preflight-errors Swap

--ignore-preflight-errors Swap is actually not required as we disabled the swap initially.

For Public IP address-based setup use the following init command.

Here instead of --apiserver-advertise-address we use --control-plane-endpoint parameter for the API server endpoint.

sudo kubeadm init --control-plane-endpoint=$IPADDR  --apiserver-cert-extra-sans=$IPADDR  --pod-network-cidr=$POD_CIDR --node-name $NODENAME --ignore-preflight-errors Swap

All the other steps are the same as configuring the master node with private IP.

On a successful kubeadm initialization, you should get an output with the kubeconfig file location and the join command with the token, as shown below. Copy the join command and save it to a file. We will need it for joining the worker node to the master.

Kubeadm init command output

Use the following commands from the output to create the kubeconfig in master so that you can use kubectl to interact with cluster API.

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Now, verify the kubeconfig by executing the following kubectl command to list all the pods in the kube-system namespace.

kubectl get po -n kube-system

You should see the following output. The two CoreDNS pods will be in a Pending state. This is the expected behavior. Once we install the network plugin, they will move to a Running state.

You can verify the health status of all the cluster components using the following command.

kubectl get --raw='/readyz?verbose'

You can get the cluster info using the following command.

kubectl cluster-info 

By default, apps won’t get scheduled on the master node. If you want to use the master node for scheduling apps, remove the control-plane taint from it. The trailing hyphen in the following command is what removes the taint.

kubectl taint nodes --all node-role.kubernetes.io/control-plane-

Note: You can also pass the kubeadm configs as a file when initializing the cluster. See Kubeadm Init with config file
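
As a rough illustration of the config file approach, the sketch below writes a minimal kubeadm config that mirrors the flags of the private IP init command. The function name write_kubeadm_config and the output file name are hypothetical. I am assuming the kubeadm.k8s.io/v1beta3 config API here, so compare it with the output of kubeadm config print init-defaults on your nodes before using it.

```shell
# Write a minimal kubeadm configuration file.
# Usage: write_kubeadm_config <output-file> <advertise-ip> <node-name> <pod-cidr>
write_kubeadm_config() {
  cat > "$1" <<EOF
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: "$2"
  bindPort: 6443
nodeRegistration:
  name: "$3"
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
networking:
  podSubnet: "$4"
apiServer:
  certSANs:
    - "$2"
EOF
}

# Example (uses the variables set earlier in this step):
#   write_kubeadm_config kubeadm-config.yaml "$IPADDR" "$NODENAME" "$POD_CIDR"
#   sudo kubeadm init --config kubeadm-config.yaml
```

Keeping the settings in a file makes the cluster setup easier to review and repeat than a long list of flags.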

Step 6: Join Worker Nodes To Kubernetes Master Node

We have set up cri-o, kubelet, and kubeadm utilities on the worker nodes as well.

Now, let’s join the worker node to the master node using the Kubeadm join command you have got in the output while setting up the master node.

If you missed copying the join command, execute the following command in the master node to recreate the token with the join command.

kubeadm token create --print-join-command

Here is what the command looks like. Use sudo if you are running as a normal user. This command performs the TLS bootstrapping for the nodes.

sudo kubeadm join 10.128.0.37:6443 --token j4eice.33vgvgyf5cxw4u8i \
    --discovery-token-ca-cert-hash sha256:37f94469b58bcc8f26a4aa44441fb17196a585b37288f85e22475b00c36f1c61

On successful execution, you will see the output saying, “This node has joined the cluster”.

kubeadm node join output.

Now execute the kubectl command from the master node to check if the node is added to the master.

kubectl get nodes

Example output,

root@controlplane:~# kubectl get nodes

NAME           STATUS   ROLES           AGE     VERSION
controlplane   Ready    control-plane   8m42s   v1.29.0
node01         Ready    worker          2m6s    v1.29.0

By default, the ROLES column shows <none> for worker nodes. The example output above shows worker because the node was already labeled. You can add the label to a worker node using the following command. Replace node01 with the hostname of the worker node you want to label.

kubectl label node node01  node-role.kubernetes.io/worker=worker
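
If you have several workers, labeling them one by one gets tedious. Here is a small sketch that picks out the nodes without a role from the kubectl get nodes output. The helper name unlabeled_nodes is made up for this example.

```shell
# Read `kubectl get nodes --no-headers` output on stdin and print the names
# of nodes whose ROLES column (third column) is <none>.
unlabeled_nodes() {
  awk '$3 == "<none>" { print $1 }'
}

# Example:
#   kubectl get nodes --no-headers | unlabeled_nodes | while read -r node; do
#     kubectl label node "$node" node-role.kubernetes.io/worker=worker
#   done
```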

You can further add more nodes with the same join command.

Step 7: Install Calico Network Plugin for Pod Networking

Kubeadm does not configure any network plugin. You need to install a network plugin of your choice for kubernetes pod networking and enable network policy.

I am using the Calico network plugin for this setup.

Note: Make sure you execute the kubectl command from where you have configured the kubeconfig file, either from the master or from your workstation with connectivity to the Kubernetes API.

Execute the following command to install the Calico network plugin on the cluster.

kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml

After a couple of minutes, if you check the pods in kube-system namespace, you will see calico pods and running CoreDNS pods.

kubectl get po -n kube-system

Kubeadm Calico and CoreDNS pods in running state.

Step 8: Setup Kubernetes Metrics Server

Kubeadm doesn’t install metrics server component during its initialization. We have to install it separately.

To verify this, if you run the top command, you will see the Metrics API not available error.

root@controlplane:~# kubectl top nodes

error: Metrics API not available

To install the metrics server, apply the following metrics server manifest file. It deploys metrics server version v0.7.1.

kubectl apply -f https://raw.githubusercontent.com/techiescamp/kubeadm-scripts/main/manifests/metrics-server.yaml

This manifest is taken from the official metrics server repo. I have added the --kubelet-insecure-tls flag to the container to make it work in the local setup and hosted it separately. Or else, you will get the following error.

 because it doesn't contain any IP SANs" node=""

Once the metrics server objects are deployed, it takes a minute for you to see the node and pod metrics using the top command.

kubectl top nodes

You should be able to view the node metrics as shown below.

root@controlplane:~# kubectl top nodes

NAME           CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
controlplane   142m         7%     1317Mi          34%
node01         36m          1%     915Mi           23%

You can also view the pod CPU and memory metrics using the following command.

kubectl top pod -n kube-system
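
Once the metrics are flowing, you can script simple checks on top of them. The sketch below flags nodes whose memory usage is above a threshold. The function name nodes_over_memory is my own, and it assumes the five-column kubectl top nodes layout shown above.

```shell
# Read `kubectl top nodes` output on stdin (header line included) and print
# the nodes whose MEMORY% is above the given integer threshold.
nodes_over_memory() {
  awk -v limit="$1" 'NR > 1 {
    pct = $5
    sub(/%/, "", pct)          # strip the percent sign before comparing
    if (pct + 0 > limit + 0) print $1, $5
  }'
}

# Example:
#   kubectl top nodes | nodes_over_memory 80
```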

Step 9: Deploy A Sample Nginx Application

Now that we have all the components to make the cluster and applications work, let’s deploy a sample Nginx application and see if we can access it over a NodePort

Create an Nginx deployment. Execute the following directly on the command line. It deploys the pod in the default namespace.

cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 2 
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80      
EOF

Expose the Nginx deployment on NodePort 32000.

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector: 
    app: nginx
  type: NodePort  
  ports:
    - port: 80
      targetPort: 80
      nodePort: 32000
EOF

Check the pod status using the following command.

kubectl get pods

Once the deployment is up, you should be able to access the Nginx home page on the allocated NodePort.
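
The pods may take a little while to pull the image, so the first request can fail. Here is a generic retry helper you could wrap around curl. The name retry and its argument order are just my convention for this sketch.

```shell
# Run a command until it succeeds, up to <attempts> times, sleeping <delay>
# seconds between tries. Returns 0 on the first success, 1 if every try fails.
# Usage: retry <attempts> <delay> <command> [args...]
retry() {
  attempts=$1
  delay=$2
  shift 2
  try=1
  while [ "$try" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    try=$((try + 1))
    if [ "$try" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Example (replace <node-ip> with the IP of any node in the cluster):
#   retry 10 3 curl -fsS -o /dev/null "http://<node-ip>:32000"
```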

For example, with the service above you can open http://<node-ip>:32000 in a browser.

Kubeadm Nginx test deployment

Step 10: Add Kubeadm Config to Workstation

If you prefer to connect the Kubeadm cluster using kubectl from your workstation, you can merge the kubeadm admin.conf with your existing kubeconfig file.

Follow the steps given below for the configuration.

Step 1: Copy the contents of admin.conf from the control plane node and save it in a file named kubeadm-config.yaml in your workstation.

Step 2: Take a backup of the existing kubeconfig.

cp ~/.kube/config ~/.kube/config.bak

Step 3: Set the KUBECONFIG variable to include both the default config and kubeadm-config.yaml.

export KUBECONFIG=~/.kube/config:/path/to/kubeadm-config.yaml

Step 4: Merge the configs into a single file.

kubectl config view --flatten > ~/.kube/merged_config.yaml

Step 5: Replace the old config with the new config

mv ~/.kube/merged_config.yaml ~/.kube/config

Step 6: List all the contexts

kubectl config get-contexts -o name

Step 7: Set the current context to the kubeadm cluster.

kubectl config use-context <cluster-name-here>

Now, you should be able to connect to the Kubeadm cluster from your local workstation kubectl utility.

Possible Kubeadm Issues

Following are the possible issues you might encounter in the kubeadm setup.

  1. Pods out of memory and CPU: The master node should have a minimum of 2 vCPU and 2 GB memory.
  2. Nodes cannot connect to Master: Check the firewall between nodes and make sure all the nodes can talk to each other on the required kubernetes ports.
  3. Calico Pod Restarts: Sometimes, if you use the same IP range for the node and pod network, Calico pods may not work as expected. So make sure the node and pod IP ranges don’t overlap. Overlapping IP addresses could result in issues for other applications running on the cluster as well.

For other pod errors, check out the kubernetes pod troubleshooting guide.

If your server doesn’t have a minimum of 2 vCPU, you will get the following error.

[ERROR NumCPU]: the number of available CPUs 1 is less than the required 2

If you use a public IP with --apiserver-advertise-address parameter, you will have failed master node components with the following error. To rectify this error, use --control-plane-endpoint parameter with the public IP address.

[kubelet-check] Initial timeout of 40s passed.


Unfortunately, an error has occurred:
        timed out waiting for the condition

This error is likely caused by:
        - The kubelet is not running
        - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
        - 'systemctl status kubelet'
        - 'journalctl -xeu kubelet'

You will get the following error in worker nodes when you try to join a worker node with a new token after the master node reset. To rectify this error, reset the worker node using the command kubeadm reset.

[ERROR FileAvailable--etc-kubernetes-kubelet.conf]: /etc/kubernetes/kubelet.conf already exists
        [ERROR Port-10250]: Port 10250 is in use
        [ERROR FileAvailable--etc-kubernetes-pki-ca.crt]: /etc/kubernetes/pki/ca.crt already exists

Kubernetes Cluster Important Configurations

Following are the important Kubernetes cluster configurations you should know.

Configuration | Location
Static Pods Location (etcd, api-server, controller manager and scheduler) | /etc/kubernetes/manifests
TLS Certificates location (kubernetes-ca, etcd-ca and kubernetes-front-proxy-ca) | /etc/kubernetes/pki
Admin Kubeconfig File | /etc/kubernetes/admin.conf
Kubelet configuration | /var/lib/kubelet/config.yaml
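
A quick way to see which of these exist on a node is a loop like the sketch below. The function name check_kubeadm_paths and the optional root prefix argument are only for illustration. Keep in mind that a worker node normally does not have admin.conf or the static pod manifests, so "missing" there is expected.

```shell
# Report which of the standard kubeadm paths exist.
# Usage: check_kubeadm_paths [root]   (root is an optional prefix, empty by default)
check_kubeadm_paths() {
  root=${1:-}
  rc=0
  for path in /etc/kubernetes/manifests \
              /etc/kubernetes/pki \
              /etc/kubernetes/admin.conf \
              /var/lib/kubelet/config.yaml; do
    if [ -e "$root$path" ]; then
      echo "found   $path"
    else
      echo "missing $path"
      rc=1
    fi
  done
  return $rc
}

# Example (run on the control plane node; may need sudo to read the directories):
#   check_kubeadm_paths
```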

There are configurations that are part of Kubernetes feature gates. If you want to use the features that are part of feature gates, you need to enable them during the Kubeadm initialization using a kubeadm configuration file.

You can refer to enabling feature gates in Kubeadm blog to understand more.

Upgrading Kubeadm Cluster

Using Kubeadm, you can upgrade the Kubernetes cluster to a newer patch release of the same version or to a new version.

Kubeadm upgrade doesn’t introduce any downtime if you upgrade one node at a time.

To do hands-on, please refer to my step-by-step guide on Kubeadm cluster upgrade

Backing Up ETCD Data

etcd backup is one of the key tasks in real-world projects and for the CKA certification.

You can follow the etcd backup guide to learn how to perform etcd backup and restore.

Setup Prometheus Monitoring

As a next step, you can try setting up the Prometheus monitoring stack on the Kubeadm cluster.

I have published a detailed guide for the setup. Refer to the Prometheus on Kubernetes guide for step-by-step instructions. The stack contains Prometheus, Alertmanager, kube-state-metrics, and Grafana.

How Does Kubeadm Work?

Here is how the Kubeadm setup works.

When you initialize a Kubernetes cluster using Kubeadm, it does the following.

  1. When you initialize kubeadm, first it runs all the preflight checks to validate the system state and it downloads all the required cluster container images from the registry.k8s.io container registry.
  2. It then generates required TLS certificates and stores them in the /etc/kubernetes/pki folder.
  3. Next, it generates all the kubeconfig files for the cluster components in the /etc/kubernetes folder.
  4. Then it starts the kubelet service, generates the static pod manifests for all the cluster components, and saves them in the /etc/kubernetes/manifests folder.
  5. Next, it starts all the control plane components from the static pod manifests.
  6. Then it installs the CoreDNS and kube-proxy components.
  7. Finally, it generates the node bootstrap token.
  8. Worker nodes use this token to join the control plane.

Kubeadm Workflow

As you can see all the key cluster configurations will be present under the /etc/kubernetes folder.

Kubeadm FAQs

How to use Custom CA Certificates With Kubeadm?

By default, kubeadm creates its own CA certificates. However, if you wish to use custom CA certificates, they should be placed in the /etc/kubernetes/pki folder. When kubeadm is run, it will make use of existing certificates if they are found, and will not overwrite them.

How to generate the Kubeadm Join command?

You can use kubeadm token create --print-join-command command to generate the join command.

Conclusion

In this post, we learned to install Kubernetes step by step using kubeadm.

As a DevOps engineer, it is good to have an understanding of the Kubernetes cluster components. With companies using managed Kubernetes services, we miss learning the basic building blocks of kubernetes.

This Kubeadm setup is good for learning and playing around with kubernetes.

Also, there are many other Kubeadm configs that I did not cover in this guide as it is out of the scope of this guide. Please refer to the official Kubeadm documentation. By having the whole cluster setup in VMs, you can learn all the cluster components configs and troubleshoot the cluster on component failures.

Also, with Vagrant, you can create simple automation to bring up and tear down Kubernetes clusters on-demand in your local workstation. Check out my guide on automated kubernetes vagrant setup using kubeadm.

If you are learning kubernetes, check out the comprehensive Kubernetes tutorial for beginners.

60 comments
  1. Hello,
    I’ve an openstack account and for that I’ve used the public IP based kubeadm init command but after that I’m getting this error in my journalctl log
    Aug 29 07:11:45 k8smaster kubelet[22595]: I0829 07:11:45.323434 22595 trace.go:236] Trace[1105118061]: “Reflector ListAndWatch” name:vendor/k8s.io/client-go/informers/factory.go:159 (29-Aug-2024 07:11:29.958) (total time: 15365ms):
    Aug 29 07:11:45 k8smaster kubelet[22595]: Trace[1105118061]: —“Objects listed” error:Get “https://131.228.66.23:6443/api/v1/nodes?fieldSelector=metadata.name%3Dk8smaster&limit=500&resourceVersion=0”: dial tcp 131.228.66.23:6443: connect: network is unreachable 15364ms (07:11:45.323)
    Aug 29 07:11:45 k8smaster kubelet[22595]: Trace[1105118061]: [15.365067411s] [15.365067411s] END
    Aug 29 07:11:45 k8smaster kubelet[22595]: E0829 07:11:45.323479 22595 reflector.go:147] vendor/k8s.io/client-go/informers/factory.go:159: Failed to watch *v1.Node: failed to list *v1.Node: Get “https://131.228.66.23:6443/api/v1/nodes?fieldSelector=metadata.name%3Dk8smaster&limit=500&resourceVersion=0”: dial tcp 131.228.66.23:6443: connect: network is unreachable
    Aug 29 07:11:46 k8smaster kubelet[22595]: I0829 07:11:46.336447 22595 kubelet_node_status.go:73] “Attempting to register node” node=”k8smaster”
    Aug 29 07:11:46 k8smaster kubelet[22595]: E0829 07:11:46.339932 22595 kubelet_node_status.go:96] “Unable to register node with API server” err=”Post \”https://131.228.66.23:6443/api/v1/nodes\”: dial tcp 131.228.66.23:6443: connect: network is unreachable” node=”k8smaster”
    Aug 29 07:11:46 k8smaster kubelet[22595]: E0829 07:11:46.730741 22595 eviction_manager.go:282] “Eviction manager: failed to get summary stats” err=”failed to get node info: node \”k8smaster\” not found”
    Aug 29 07:11:51 k8smaster kubelet[22595]: E0829 07:11:51.246289 22595 event.go:355] “Unable to write event (may retry after sleeping)” err=”Patch \”https://131.228.66.23:6443/api/v1/namespaces/default/events/k8smaster.17f00ed2fec9554d\”: dial tcp 131.228.66.23:6443: connect: network is unreachable” event=”&Event{ObjectMeta:{k8smaster.17f00ed2fec9554d default 0 0001-01-01 00:00:00 +0000 UTC map[] map[] [] [] []},InvolvedObject:ObjectReference{Kind:Node,Namespace:,Name:k8smaster,UID:k8smaster,APIVersion:,ResourceVersion:,FieldPath:,},Reason:NodeHasSufficientMemory,Message:Node k8smaster status is now: NodeHasSufficientMemory,Source:EventSource{Component:kubelet,Host:k8smaster,},FirstTimestamp:2024-08-29 06:59:16.663428429 +0530 IST m=+0.419664855,LastTimestamp:2024-08-29 06:59:16.747006091 +0530 IST m=+0.503242519,Count:2,Type:Normal,EventTime:0001-01-01 00:00:00 +0000 UTC,Series:nil,Action:,Related:nil,ReportingController:kubelet,ReportingInstance:k8smaster,}”
    Aug 29 07:11:52 k8smaster kubelet[22595]: E0829 07:11:52.363072 22595 controller.go:145] “Failed to ensure lease exists, will retry” err=”Get \”https://131.228.66.23:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/k8smaster?timeout=10s\”: dial tcp 131.228.66.23:6443: connect: network is unreachable” interval=”7s”

    But I’ve enabled ingress on port 6443 in my security group. Please help me troubleshoot this

  2. Hi Bibin/Team,
    There is an error with gpg key and we may need to define the exact version of gpg key which we gonna use. Please check and update
    vagrant@controlplane:~$ echo "deb [signed-by=/usr/share/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list
    deb [signed-by=/usr/share/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main
    vagrant@controlplane:~$ sudo apt-get update -y
    Get:2 http://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable:/cri-o:/1.28/xUbuntu_22.04 InRelease [1632 B]
    Hit:3 http://ports.ubuntu.com/ubuntu-ports jammy InRelease
    Get:4 https://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable/xUbuntu_22.04 InRelease [1639 B]
    Ign:1 https://packages.cloud.google.com/apt kubernetes-xenial InRelease
    Hit:5 http://ports.ubuntu.com/ubuntu-ports jammy-updates InRelease
    Err:6 https://packages.cloud.google.com/apt kubernetes-xenial Release
    404 Not Found [IP: 142.250.182.142 443]
    Hit:7 http://ports.ubuntu.com/ubuntu-ports jammy-backports InRelease
    Hit:8 http://ports.ubuntu.com/ubuntu-ports jammy-security InRelease
    Reading package lists… Done
    E: The repository ‘https://apt.kubernetes.io kubernetes-xenial Release’ does not have a Release file.
    N: Updating from such a repository can’t be done securely, and is therefore disabled by default.
    N: See apt-secure(8) manpage for repository creation and user configuration details.

    For example:

    curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.28/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg

    1. Hi Sathish,

      We have updated the repo details for CRIO and kubeadm. Also updated the guide to deploy 1.29 cluster.

  3. Hello. I tried this tutorial.
    Right after applying the kubeadm init… command, I can see the coredns pods are running. This the full command:
    kubeadm init --apiserver-advertise-address=192.168.56.2 --apiserver-cert-extra-sans=192.168.56.2 --pod-network-cidr=10.100.0.0/16 --node-name master01

    Also I’m using v1.28
    I would appreciate it if you can help me to understand it.

  4. Getting following error
    The following packages have unmet dependencies:
    cri-o-runc : Depends: libc6 (>= 2.34) but 2.31-0ubuntu9.14 is to be installed
    E: Unable to correct problems, you have held broken packages.

    lsb_release -a
    Distributor ID: Ubuntu
    Description: Ubuntu 20.04.6 LTS
    Release: 20.04
    Codename: focal

    1. Please use Ubuntu 22.x server. We are using the CRI-O package for version 22. Otherwise, you need to find the package for Ubuntu 20.x.

  5. Additionally, a control plane component may have crashed or exited when started by the container runtime.

    To troubleshoot, list all containers using your preferred container runtimes CLI.

    Here is one example how you may list all running Kubernetes containers by using crictl:

    - 'crictl --runtime-endpoint unix:///var/run/crio/crio.sock ps -a | grep kube | grep -v pause'

    Once you have found the failing container, you can inspect its logs with:

    - 'crictl --runtime-endpoint unix:///var/run/crio/crio.sock logs CONTAINERID'

    error execution phase wait-control-plane: couldn’t initialize a Kubernetes cluster

    To see the stack trace of this error execute with –v=5 or higher

    I am getting this error with the public IP based method.
    Can anyone help here?

    1. @sapalding, are you able to get the public IP through the command? Which environment are you using for the setup?

  6. Unable to initialize Master Node With Public IP:
    Getting below error
    [init] Using Kubernetes version: v1.28.4
    [preflight] Running pre-flight checks
    [preflight] Pulling images required for setting up a Kubernetes cluster
    [preflight] This might take a minute or two, depending on the speed of your internet connection
    [preflight] You can also perform this action in beforehand using ‘kubeadm config images pull’
    W1129 08:55:24.751257 17228 checks.go:835] detected that the sandbox image “registry.k8s.io/pause:3.8” of the container runtime is inconsistent with that used by kubeadm. It is recommended that using “registry.k8s.io/pause:3.9” as the CRI sandbox image.
    [certs] Using certificateDir folder “/etc/kubernetes/pki”
    [certs] Generating “ca” certificate and key
    [certs] Generating “apiserver” certificate and key
    [certs] apiserver serving cert is signed for DNS names [ip-172-31-32-251 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 172.31.32.251 16.170.201.70]
    [certs] Generating “apiserver-kubelet-client” certificate and key
    [certs] Generating “front-proxy-ca” certificate and key
    [certs] Generating “front-proxy-client” certificate and key
    [certs] Generating “etcd/ca” certificate and key
    [certs] Generating “etcd/server” certificate and key
    [certs] etcd/server serving cert is signed for DNS names [ip-172-31-32-251 localhost] and IPs [172.31.32.251 127.0.0.1 ::1]
    [certs] Generating “etcd/peer” certificate and key
    [certs] etcd/peer serving cert is signed for DNS names [ip-172-31-32-251 localhost] and IPs [172.31.32.251 127.0.0.1 ::1]
    [certs] Generating “etcd/healthcheck-client” certificate and key
    [certs] Generating “apiserver-etcd-client” certificate and key
    [certs] Generating “sa” key and public key
    [kubeconfig] Using kubeconfig folder “/etc/kubernetes”
    [kubeconfig] Writing “admin.conf” kubeconfig file
    [kubeconfig] Writing “kubelet.conf” kubeconfig file
    [kubeconfig] Writing “controller-manager.conf” kubeconfig file
    [kubeconfig] Writing “scheduler.conf” kubeconfig file
    [etcd] Creating static Pod manifest for local etcd in “/etc/kubernetes/manifests”
    [control-plane] Using manifest folder “/etc/kubernetes/manifests”
    [control-plane] Creating static Pod manifest for “kube-apiserver”
    [control-plane] Creating static Pod manifest for “kube-controller-manager”
    [control-plane] Creating static Pod manifest for “kube-scheduler”
    [kubelet-start] Writing kubelet environment file with flags to file “/var/lib/kubelet/kubeadm-flags.env”
    [kubelet-start] Writing kubelet configuration to file “/var/lib/kubelet/config.yaml”
    [kubelet-start] Starting the kubelet
    [wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory “/etc/kubernetes/manifests”. This can take up to 4m0s
    [kubelet-check] Initial timeout of 40s passed.

    1. @Garry, are you able to get the public IP address in the IPADDR variable? In which environment are you trying to set this up?

  7. Nice tutorial.
    But after the entire deployment, when I try to ping google.com from a busybox pod, I get a “bad address google.com” error. I believe that the egress traffic is working. There are no firewall rules or network policies on the node or pod.

  8. I am able to set up the k8s cluster with 1 master and 1 worker, but when I try to add another master node to the existing cluster, I am unable to do so. Can anyone please help with this? What is the process for joining another master to a running cluster?

    1. I’m also interested in learning how to join another master node to my Kubernetes cluster. Have you been able to do that?

  9. Hi Bibin, thank you for providing a great article to initiate the Kubernetes master and worker nodes within a few minutes using simple steps.

    I got an error while executing the following command to schedule apps on the master node:
    kubectl taint nodes --all node-role.kubernetes.io/master-

    But it is resolved by using the following command:
    kubectl taint nodes --all node-role.kubernetes.io/control-plane-

  10. Hi

    One of my worker nodes is in a different subnet. I am able to ping the physical IPs from both the master and worker nodes, and I can schedule jobs on the worker node. However, pods on the worker node are not able to communicate with pods on the master node, and telnet does not work either:
    telnet 10.96.0.10 53
    Trying 10.96.0.10…
    telnet: Unable to connect to remote host: Connection timed out

    10.96.0.10 is the CoreDNS service IP. Can you please help me? I have been stuck here for a long time.

    1. It seems like the issue you’re facing might be related to network policies, security groups, or routing configurations. Since you can ping and schedule jobs between the master and worker nodes but are unable to establish a connection between their respective pods, it is likely that there’s a networking misconfiguration or firewall blocking the communication between the two.

      Here are some steps to troubleshoot and resolve the issue:

      Verify routing and subnet configuration:
      Ensure that the subnets and routing configurations for both the master and worker nodes are set up correctly. If there’s a misconfiguration, it might be blocking communication between the pods. If it is a corporate network, check with the network team about the routing rules.

      If the host has custom DNS servers configured, use the same servers for the cluster as well.

      Check network policies:
      Network policies define how groups of pods are allowed to communicate with each other and other network endpoints. Ensure that there are no network policies blocking communication between the master and worker node pods. If necessary, create a new network policy to allow the required communication.

      Examine firewall rules and security groups:
      Check the firewall rules and security group configurations for both the master and worker nodes to ensure that they allow the necessary traffic. You might need to open up specific ports or allow certain IP ranges to enable communication between the pods.
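The telnet check from the question can also be scripted so it is easy to repeat from each node while adjusting firewall rules. The following is a minimal sketch, assuming bash (for its `/dev/tcp` redirection) and the coreutils `timeout` command are available on the node. The function name `tcp_port_open` and its output strings are hypothetical, chosen only for illustration.

```shell
#!/usr/bin/env bash
# Probe a TCP port using bash's built-in /dev/tcp redirection.
# tcp_port_open is a hypothetical helper: host, port, timeout in seconds.
tcp_port_open() {
  local host=$1 port=$2 secs=${3:-3}
  # The inner shell exits non-zero if the connection is refused, and
  # timeout kills it if no connection is made within $secs seconds.
  if timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null; then
    echo "reachable ${host}:${port}"
  else
    echo "unreachable ${host}:${port}"
    return 1
  fi
}
```

For example, `tcp_port_open 10.96.0.10 53` run on the worker node mirrors the `telnet 10.96.0.10 53` test above. A result of "unreachable" does not say whether the port is closed or filtered, only that no TCP connection could be made in time.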

  11. Hey, great tutorial. I am able to deploy Kubernetes and Kubeflow. Thank you for sharing it.
    But I have a few issues. Can you help with them?

    My worker node is not able to communicate with the master node cluster IP. When I run
    telnet 10.96.0.10 53 from the worker node, it does not reach the master. There is no cross-pod communication from the worker node to the master node, and finally, from a worker node pod I am not able to ping google.com either. Can you please help with this?

    1. Hi triu,

      Did you install the CNI plugin to enable pod networking?

      Are the pod CIDR range and node CIDR range different? Ensure there are no IP conflicts.

      Also, check if the CoreDNS pod is running without any issues.
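The CIDR conflict check mentioned in this reply can be done with plain shell arithmetic. The following is a minimal sketch for IPv4 only, assuming bash. The helper names (`ip_to_int`, `cidr_range`, `cidrs_overlap`) and the example ranges are hypothetical and not part of the guide.

```shell
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Print the first and last address of an IPv4 CIDR block as integers.
cidr_range() {
  local ip=${1%/*} prefix=${1#*/}
  local base mask
  base=$(ip_to_int "$ip")
  mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  echo "$(( base & mask )) $(( (base & mask) | (~mask & 0xFFFFFFFF) ))"
}

# Report whether two CIDR blocks share any address.
cidrs_overlap() {
  local s1 e1 s2 e2
  read -r s1 e1 <<< "$(cidr_range "$1")"
  read -r s2 e2 <<< "$(cidr_range "$2")"
  if (( s1 <= e2 && s2 <= e1 )); then
    echo "overlap"
  else
    echo "no overlap"
  fi
}

# Example values (assumed): a node network and a pod network.
cidrs_overlap 10.0.0.0/24 192.168.0.0/16
```

With the example values above the script prints "no overlap". Passing the actual node subnet and the value given to `--pod-network-cidr` shows whether the two ranges collide.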

  12. [kubelet-check] Initial timeout of 40s passed.

    Unfortunately, an error has occurred:
    timed out waiting for the condition

    This error is likely caused by:
    – The kubelet is not running
    – The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

    If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
    – ‘systemctl status kubelet’
    – ‘journalctl -xeu kubelet’

    Additionally, a control plane component may have crashed or exited when started by the container runtime.
    To troubleshoot, list all containers using your preferred container runtimes CLI.
    Here is one example how you may list all running Kubernetes containers by using crictl:
    - 'crictl --runtime-endpoint unix:///var/run/cri-dockerd.sock ps -a | grep kube | grep -v pause'
    Once you have found the failing container, you can inspect its logs with:
    - 'crictl --runtime-endpoint unix:///var/run/cri-dockerd.sock logs CONTAINERID'
    error execution phase wait-control-plane: couldn’t initialize a Kubernetes cluster
    To see the stack trace of this error execute with --v=5 or higher

    I’m using docker as container service and cri-dockerd is also installed.

    This error still appears after running the command:
    kubeadm init --control-plane-endpoint=$IPADDR --apiserver-cert-extra-sans=$IPADDR --pod-network-cidr=$POD_CIDR --node-name $NODENAME --cri-socket=unix:///var/run/cri-dockerd.sock

    1. @priyabrata Where are you trying to run this setup?

      Is it a private IP-based setup or a public IP-based setup?

      If you are trying to use a public IP for the API server endpoint, you need to use the --control-plane-endpoint parameter with the public IP instead of --apiserver-advertise-address, as mentioned in the guide.

      1. No, I’m using a private IP.

        I also tried the setup using the public IP, as you mentioned in the thread, and that gave the same kind of error.

        1. @priyabrata

          Try once with CRI-O instead of dockerd. Also, make sure there is no IP conflict between the node IP and the pod CIDR.
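The difference between the private and public IP variants discussed in this thread comes down to one flag. The following is a minimal sketch that only prints the command instead of running it, assuming bash. The function name `build_init_cmd` and the example IP, pod CIDR, and node name are hypothetical placeholders. The flags themselves are the ones quoted in the comments above.

```shell
#!/usr/bin/env bash
# Print (not run) a kubeadm init command for a private or public IP setup.
# build_init_cmd is a hypothetical helper; mode is "private" or "public".
build_init_cmd() {
  local mode=$1 ipaddr=$2 pod_cidr=$3 nodename=$4
  local endpoint_flag
  case "$mode" in
    # Private IP: advertise the API server on the node's own address.
    private) endpoint_flag="--apiserver-advertise-address=${ipaddr}" ;;
    # Public IP: use the public address as the control plane endpoint.
    public)  endpoint_flag="--control-plane-endpoint=${ipaddr}" ;;
    *) echo "unknown mode: ${mode}" >&2; return 1 ;;
  esac
  echo "kubeadm init ${endpoint_flag} --apiserver-cert-extra-sans=${ipaddr} --pod-network-cidr=${pod_cidr} --node-name ${nodename}"
}

# Placeholder values for illustration only.
build_init_cmd private 10.0.0.10 192.168.0.0/16 master01
```

Printing the command first makes it easy to confirm which endpoint flag will be used before running it with sudo on the control plane node.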

  13. @Bibin:
    NODEIP:NODEPORT is not working (from a Windows machine in the same subnet with the same SG) for nodes other than the nodes where pods are deployed.
    i.e masternodeip:nodeport — not working
    worker1:nodeport —- working (pods are running here)
    worker2:nodeport — not working
    Please suggest.

  14. Hi @Bibin,

    I am unable to join my worker nodes to the cluster and am getting the timeout error below.
    [preflight] Running pre-flight checks
    error execution phase preflight: couldn’t validate the identity of the API Server: Get “https://:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s”: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
    To see the stack trace of this error execute with --v=5 or higher

    The kubelet service is also not starting, but only on the worker nodes.

    "command failed" err="failed to validate kubelet flags: the container runtime endpoint address was not specified or empty, use --container-runtime-endpoint to set"

    Am I missing anything? Please help me resolve this.

    1. @Sathish, could you please give some more information about the environment in which you are setting up the cluster? Did you add the required firewall rules for the nodes to communicate?

      1. Was the port requirements part recently added to the blog? I haven’t seen that before. After opening the ports, it worked.

        Thanks

  15. Hello, thanks for this article. I get an error when running vagrant up. Below is the output:
    Bringing machine ‘master’ up with ‘virtualbox’ provider…
    Bringing machine ‘node01’ up with ‘virtualbox’ provider…
    Bringing machine ‘node02’ up with ‘virtualbox’ provider…
    ==> master: Checking if box ‘bento/ubuntu-22.04’ version ‘202212.11.0’ is up to date…
    ==> master: Clearing any previously set network interfaces…
    There was an error while executing `VBoxManage`, a CLI used by Vagrant
    for controlling VirtualBox. The command and stderr is shown below.

    Command: [“hostonlyif”, “create”]

    Stderr: 0%…NS_ERROR_FAILURE
    VBoxManage: error: Failed to create the host-only adapter
    VBoxManage: error: VBoxNetAdpCtl: Error while adding new interface: failed to open /dev/vboxnetctl: No such file or directory
    VBoxManage: error: Details: code NS_ERROR_FAILURE (0x80004005), component HostNetworkInterfaceWrap, interface IHostNetworkInterface
    VBoxManage: error: Context: “RTEXITCODE handleCreate(HandlerArg *)” at line 105 of file VBoxManageHostonly.cpp

    Has anyone experienced the same issue?

    1. Hi Mourad,

      Which version of VirtualBox are you using?

      Also, did you add the following to /etc/vbox/networks.conf?

      * 0.0.0.0/0 ::/0

    1. Thanks for the update. Please use the following YAML
      https://raw.githubusercontent.com/projectcalico/calico/v3.25.0/manifests/calico.yaml
      Updated the article as well.

  16. Hi @Bibin Wilson,
    This article is really great. It helped us a lot in installing a Kubernetes cluster.
    Thanks a lot!!

  17. This is awesome. I’m glad I found it. I have been struggling with getting my Kubernetes cluster working. I think it is selecting the version that did it for me. Thank you.

  18. First of all, thanks for this article!
    But as I started following the different steps, in some code blocks in the article I see “cat <" instead of the commands to execute.

    Is this a temporary issue?

    1. I had to use this instead:

      cat > /etc/containerd/config.toml <<EOF
      [plugins."io.containerd.grpc.v1.cri"]
      systemd_cgroup = true
      EOF
      systemctl restart containerd
