In this blog post, I have covered a step-by-step guide to setting up a Kubernetes cluster using Kubeadm with one master and two worker nodes.
Kubeadm is an excellent tool for setting up a working Kubernetes cluster quickly. It does all the heavy lifting of setting up the cluster components, and it follows configuration best practices for a Kubernetes cluster.
What is Kubeadm?
Kubeadm is a tool to set up a minimum viable Kubernetes cluster without much complex configuration. Also, Kubeadm makes the whole process easy by running a series of prechecks to ensure that the server has all the essential components and configs to run Kubernetes.
It is developed and maintained by the official Kubernetes community. There are other options like minikube, kind, etc., that are pretty easy to set up. You can check out my minikube tutorial. Those are good options with minimum hardware requirements if you are deploying and testing applications on Kubernetes.
But if you want to play around with the cluster components or test utilities that are part of cluster administration, Kubeadm is the best option. Also, you can create a production-like cluster locally on a workstation for development and testing purposes.
Kubeadm Setup Prerequisites
Following are the prerequisites for Kubeadm Kubernetes cluster setup.
- Minimum two Ubuntu nodes [One master and one worker node]. You can have more worker nodes as per your requirement.
- The master node should have a minimum of 2 vCPU and 2GB RAM.
- For the worker nodes, a minimum of 1vCPU and 2 GB RAM is recommended.
- A 10.X.X.X/X network range with static IPs for the master and worker nodes. We will use 192.168.0.0/16 as the pod network range, which the Calico network plugin will use. Make sure the node IP range and the pod IP range don't overlap.
Note: If you are setting up the cluster in a corporate network behind a proxy, ensure you set the proxy variables and have access to the container registry and Docker Hub. Or talk to your network administrator to whitelist registry.k8s.io to pull the required images.
Kubeadm Port Requirements
Please refer to the following image and make sure all the ports are allowed for the control plane (master) and the worker nodes. If you are setting up the kubeadm cluster on cloud servers, ensure you allow the ports in the firewall configuration.
If you are using vagrant-based Ubuntu VMs, the firewall will be disabled by default. So you don’t have to do any firewall configurations.
Kubeadm for Kubernetes Certification Exams
If you are preparing for Kubernetes certifications like CKA, CKAD, or CKS, you can use the local kubeadm clusters to practice for the certification exam. In fact, kubeadm itself is part of the CKA and CKS exam. For CKA you might be asked to bootstrap a cluster using Kubeadm. For CKS, you have to upgrade the cluster using kubeadm.
If you use Vagrant-based VMs on your workstation, you can start and stop the cluster whenever you need. By having the local Kubeadm clusters, you can play around with all the cluster configurations and learn to troubleshoot different components in the cluster.
Important Note: If you are planning for Kubernetes certification, make use of the CKA/CKAD/CKS coupon Codes before the price increases.
Vagrantfile, Kubeadm Scripts & Manifests
All the commands used in this guide for the master and worker node configuration are hosted on GitHub. You can clone the repository for reference.
git clone https://github.com/techiescamp/kubeadm-scripts
This guide intends to make you understand each config required for the Kubeadm setup. If you don’t want to run the commands one by one, you can run the script file directly.
If you are using Vagrant to set up the Kubernetes cluster, you can make use of my Vagrantfile. It launches three VMs and is a basic, self-explanatory Vagrantfile. If you are new to Vagrant, check the Vagrant tutorial.
If you are a Terraform and AWS user, you can make use of the Terraform script present under the Terraform folder to spin up ec2 instances.
Also, I have created a video demo of the whole kubeadm setup. You can refer to it during the setup.
Kubernetes Cluster Setup Using Kubeadm
Following are the high-level steps involved in setting up a kubeadm-based Kubernetes cluster.
- Install container runtime on all nodes. We will be using CRI-O.
- Install Kubeadm, Kubelet, and kubectl on all the nodes.
- Initiate Kubeadm control plane configuration on the master node.
- Save the node join command with the token.
- Install the Calico network plugin (operator).
- Join the worker node to the master node (control plane) using the join command.
- Validate all cluster components and nodes.
- Install Kubernetes Metrics Server
- Deploy a sample app and validate the app
All the steps given in this guide are referred from the official Kubernetes documentation and related GitHub project pages.
If you want to understand every cluster component in detail, refer to the comprehensive Kubernetes Architecture.
Now let’s get started with the setup.
Step 1: Enable iptables Bridged Traffic on all the Nodes
Execute the following commands on all the nodes so that iptables can see bridged traffic. Here we are loading the required kernel modules and tweaking kernel parameters, which we set using sysctl.
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
# sysctl params required by setup, params persist across reboots
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
# Apply sysctl params without reboot
sudo sysctl --system
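Optionally, you can run a quick sanity check to confirm the modules are loaded and the sysctl values took effect. Each of the three parameters should print 1.
# Verify the kernel modules are loaded
lsmod | grep -E 'overlay|br_netfilter'
# Verify the sysctl values (each should print 1)
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward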
Step 2: Disable swap on all the Nodes
For kubeadm to work properly, you need to disable swap on all the nodes using the following command.
sudo swapoff -a
(crontab -l 2>/dev/null; echo "@reboot /sbin/swapoff -a") | crontab - || true
The crontab entry above keeps swap turned off across system reboots. Alternatively, you can comment out the swap entry in /etc/fstab, as shown in the sketch below.
You can also suppress swap-related preflight errors using the kubeadm flag --ignore-preflight-errors Swap; we will look at that later in this guide.
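If you prefer the /etc/fstab approach instead of the crontab entry, the following sed command comments out any swap entry. This assumes a standard Ubuntu fstab with a line containing " swap ".
# Comment out swap entries in /etc/fstab so swap stays off after reboot
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab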
Note: From 1.28 kubeadm has beta support for using swap with kubeadm clusters. Read this to understand more.
Step 3: Install CRI-O Runtime On All The Nodes
Note: We are using CRI-O instead of containerd because, in the Kubernetes certification exams, CRI-O is used as the container runtime in the exam clusters.
The basic requirement for a Kubernetes cluster is a container runtime. You can have any one of the following container runtimes.
- CRI-O
- containerd
- Docker Engine (using cri-dockerd)
We will be using CRI-O instead of Docker for this setup, as Kubernetes removed dockershim (the built-in Docker Engine support) in v1.24.
Execute the following commands on all the nodes to install the required dependencies and the latest version of CRI-O.
sudo apt-get update -y
sudo apt-get install -y software-properties-common gpg curl apt-transport-https ca-certificates
# Create the keyrings directory if it does not already exist
sudo mkdir -p /etc/apt/keyrings

curl -fsSL https://pkgs.k8s.io/addons:/cri-o:/prerelease:/main/deb/Release.key |
    sudo gpg --dearmor -o /etc/apt/keyrings/cri-o-apt-keyring.gpg

echo "deb [signed-by=/etc/apt/keyrings/cri-o-apt-keyring.gpg] https://pkgs.k8s.io/addons:/cri-o:/prerelease:/main/deb/ /" |
    sudo tee /etc/apt/sources.list.d/cri-o.list
sudo apt-get update -y
sudo apt-get install -y cri-o
sudo systemctl daemon-reload
sudo systemctl enable crio --now
sudo systemctl start crio.service
Install crictl.
VERSION="v1.30.0"
wget https://github.com/kubernetes-sigs/cri-tools/releases/download/$VERSION/crictl-$VERSION-linux-amd64.tar.gz
sudo tar zxvf crictl-$VERSION-linux-amd64.tar.gz -C /usr/local/bin
rm -f crictl-$VERSION-linux-amd64.tar.gz
crictl is a CLI utility to interact with the containers created by the container runtime.
When you use container runtimes other than Docker, you can use crictl to debug containers on the nodes. It is also useful for the CKS certification, where you need to debug containers.
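To avoid passing the runtime endpoint on every invocation, you can point crictl at the CRI-O socket once; the socket path below is CRI-O's default. Then list the containers and images on the node as a quick check.
# Point crictl at the CRI-O socket (default CRI-O socket path)
cat <<EOF | sudo tee /etc/crictl.yaml
runtime-endpoint: unix:///var/run/crio/crio.sock
image-endpoint: unix:///var/run/crio/crio.sock
EOF
# List containers and pulled images
sudo crictl ps -a
sudo crictl images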
Step 4: Install Kubeadm & Kubelet & Kubectl on all Nodes
Download the GPG key for the Kubernetes APT repository on all the nodes.
KUBERNETES_VERSION=1.30
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v$KUBERNETES_VERSION/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v$KUBERNETES_VERSION/deb/ /" | sudo tee /etc/apt/sources.list.d/kubernetes.list
Update apt repo
sudo apt-get update -y
Note: If you are preparing for a Kubernetes certification, install the specific Kubernetes version used in the exam. For example, the current Kubernetes version for the CKA, CKAD, and CKS exams is 1.30.
You can use the following command to list the available versions. Install an early patch release of 1.30 so that you can later practice the cluster upgrade task.
apt-cache madison kubeadm | tac
Specify the version as shown below. Here I am using 1.30.0-1.1
sudo apt-get install -y kubelet=1.30.0-1.1 kubectl=1.30.0-1.1 kubeadm=1.30.0-1.1
Or, to install the latest version from the repo use the following command without specifying any version.
sudo apt-get install -y kubelet kubeadm kubectl
Add hold to the packages to prevent upgrades.
sudo apt-mark hold kubelet kubeadm kubectl
Now we have all the required utilities and tools for configuring Kubernetes components using kubeadm.
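As a quick sanity check, you can confirm the installed versions on each node before moving on.
kubeadm version -o short
kubelet --version
kubectl version --client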
Add the node IP to KUBELET_EXTRA_ARGS.
sudo apt-get install -y jq

# Replace eth0 with your node's primary network interface if it is named
# differently (for example, enp0s8 on Vagrant VMs)
local_ip="$(ip --json addr show eth0 | jq -r '.[0].addr_info[] | select(.family == "inet") | .local')"

cat <<EOF | sudo tee /etc/default/kubelet
KUBELET_EXTRA_ARGS=--node-ip=$local_ip
EOF
Step 5: Initialize Kubeadm On Master Node To Setup Control Plane
Here you need to consider two options.
- Master node with a private IP: If your nodes have only private IP addresses, the API server will be accessed over the private IP of the master node.
- Master node with a public IP: If you are setting up a Kubeadm cluster on a cloud platform and need API server access over the public IP of the master node.
Only the Kubeadm initialization command differs for Public and Private IPs.
Execute the commands in this section only on the master node.
If you are using a private IP for the master node, set the following environment variables. Replace 10.0.0.10 with the IP of your master node.
IPADDR="10.0.0.10"
NODENAME=$(hostname -s)
POD_CIDR="192.168.0.0/16"
If you want to use the public IP of the master node, set the following environment variables. The IPADDR variable is automatically set to the server's public IP using a curl call to ifconfig.me. You can also replace it with your public IP address directly.
IPADDR=$(curl ifconfig.me && echo "")
NODENAME=$(hostname -s)
POD_CIDR="192.168.0.0/16"
Now, initialize the master node control plane configurations using the kubeadm command.
For a Private IP address-based setup use the following init command.
sudo kubeadm init --apiserver-advertise-address=$IPADDR --apiserver-cert-extra-sans=$IPADDR --pod-network-cidr=$POD_CIDR --node-name $NODENAME --ignore-preflight-errors Swap
--ignore-preflight-errors Swap is not strictly required here, as we disabled swap earlier.
For Public IP address-based setup use the following init command.
Here, instead of --apiserver-advertise-address, we use the --control-plane-endpoint parameter for the API server endpoint.
sudo kubeadm init --control-plane-endpoint=$IPADDR --apiserver-cert-extra-sans=$IPADDR --pod-network-cidr=$POD_CIDR --node-name $NODENAME --ignore-preflight-errors Swap
All the other steps are the same as configuring the master node with private IP.
On a successful kubeadm initialization, you should get an output with the kubeconfig file location and the join command with the token, as shown below. Copy that and save it to a file; we will need it for joining the worker nodes to the master.
Use the following commands from the output to create the kubeconfig on the master so that you can use kubectl to interact with the cluster API.
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Now, verify the kubeconfig by executing the following kubectl command to list all the pods in the kube-system namespace.
kubectl get po -n kube-system
You should see the following output. The two CoreDNS pods will be in a Pending state; this is expected behavior. Once we install the network plugin, they will move to the Running state.
You can verify all the cluster component health statuses using the following command.
kubectl get --raw='/readyz?verbose'
You can get the cluster info using the following command.
kubectl cluster-info
By default, apps won't get scheduled on the master node. If you want to use the master node for scheduling apps, remove the control-plane taint from the master node.
kubectl taint nodes --all node-role.kubernetes.io/control-plane-
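You can confirm the taint was removed by describing the node; the Taints field should show <none>. Replace controlplane with your master node's hostname.
kubectl describe node controlplane | grep Taints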
Note: You can also pass the kubeadm configs as a file when initializing the cluster. See Kubeadm Init with config file
Step 6: Join Worker Nodes To Kubernetes Master Node
We have set up cri-o, kubelet, and kubeadm utilities on the worker nodes as well.
Now, let’s join the worker node to the master node using the Kubeadm join command you have got in the output while setting up the master node.
If you missed copying the join command, execute the following command in the master node to recreate the token with the join command.
kubeadm token create --print-join-command
Here is what the command looks like. Use sudo if you are running as a normal user. This command performs the TLS bootstrapping for the nodes.
sudo kubeadm join 10.128.0.37:6443 --token j4eice.33vgvgyf5cxw4u8i \
--discovery-token-ca-cert-hash sha256:37f94469b58bcc8f26a4aa44441fb17196a585b37288f85e22475b00c36f1c61
On successful execution, you will see the output saying, “This node has joined the cluster”.
Now execute the kubectl command from the master node to check if the node is added to the master.
kubectl get nodes
Example output,
root@controlplane:~# kubectl get nodes
NAME STATUS ROLES AGE VERSION
controlplane Ready control-plane 8m42s v1.29.0
node01 Ready worker 2m6s v1.29.0
By default, the ROLES column shows <none> for the worker nodes. You can add a role label to a worker node using the following command. Replace node01 with the hostname of the worker node you want to label.
kubectl label node node01 node-role.kubernetes.io/worker=worker
You can further add more nodes with the same join command.
Step 7: Install Calico Network Plugin for Pod Networking
Kubeadm does not configure any network plugin. You need to install a network plugin of your choice for kubernetes pod networking and enable network policy.
I am using the Calico network plugin for this setup.
Note: Make sure you execute the kubectl command from wherever you have configured the kubeconfig file, either from the master node or from a workstation with connectivity to the Kubernetes API.
Execute the following command to install the Calico network plugin on the cluster.
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.25.0/manifests/calico.yaml
After a couple of minutes, if you check the pods in the kube-system namespace, you will see the Calico pods and the CoreDNS pods in a Running state.
kubectl get po -n kube-system
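You can also confirm that the nodes report a Ready status once the network plugin is running.
# Nodes move to Ready once the CNI plugin is up
kubectl get nodes -o wide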
Step 8: Setup Kubernetes Metrics Server
Kubeadm doesn't install the metrics server component during its initialization. We have to install it separately.
To verify this, if you run the kubectl top command, you will see a Metrics API not available error.
root@controlplane:~# kubectl top nodes
error: Metrics API not available
To install the metrics server, apply the following metrics server manifest file. It deploys metrics server version v0.7.1.
kubectl apply -f https://raw.githubusercontent.com/techiescamp/kubeadm-scripts/main/manifests/metrics-server.yaml
This manifest is taken from the official metrics server repo. I have added the --kubelet-insecure-tls flag to the container to make it work in a local setup and hosted it separately. Without that flag, you will get a certificate error similar to the following.
because it doesn't contain any IP SANs" node=""
Once the metrics server objects are deployed, it takes a minute for you to see the node and pod metrics using the top command.
kubectl top nodes
You should be able to view the node metrics as shown below.
root@controlplane:~# kubectl top nodes
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
controlplane 142m 7% 1317Mi 34%
node01 36m 1% 915Mi 23%
You can also view the pod CPU and memory metrics using the following command.
kubectl top pod -n kube-system
Step 9: Deploy A Sample Nginx Application
Now that we have all the components needed for the cluster and applications to work, let's deploy a sample Nginx application and see if we can access it over a NodePort.
Create an Nginx deployment. Execute the following directly on the command line. It deploys the pod in the default namespace.
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
EOF
Expose the Nginx deployment on NodePort 32000.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  type: NodePort
  ports:
    - port: 80
      targetPort: 80
      nodePort: 32000
EOF
Check the pod status using the following command.
kubectl get pods
Once the deployment is up, you should be able to access the Nginx home page on the allocated NodePort.
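For example, from any machine that can reach the node IPs, a plain curl should return the Nginx welcome page. Here <worker-node-ip> is a placeholder for one of your node IPs, and 32000 is the NodePort we set in the service.
curl http://<worker-node-ip>:32000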
Step 10: Add Kubeadm Config to Workstation
If you prefer to connect to the Kubeadm cluster using kubectl from your workstation, you can merge the kubeadm admin.conf with your existing kubeconfig file.
Follow the steps given below for the configuration.
Step 1: Copy the contents of admin.conf from the control plane node and save it in a file named kubeadm-config.yaml on your workstation.
Step 2: Take a backup of the existing kubeconfig.
cp ~/.kube/config ~/.kube/config.bak
Step 3: Merge the default config with kubeadm-config.yaml and export it to the KUBECONFIG variable.
export KUBECONFIG=~/.kube/config:/path/to/kubeadm-config.yaml
Step 4: Merge the configs into a single file.
kubectl config view --flatten > ~/.kube/merged_config.yaml
Step 5: Replace the old config with the new config
mv ~/.kube/merged_config.yaml ~/.kube/config
Step 6: List all the contexts
kubectl config get-contexts -o name
Step 7: Set the current context to the kubeadm cluster.
kubectl config use-context <cluster-name-here>
Now, you should be able to connect to the Kubeadm cluster from your local workstation kubectl utility.
Possible Kubeadm Issues
Following are the possible issues you might encounter in the kubeadm setup.
- Pod Out of memory and CPU: The master node should have a minimum of 2vCPU and 2 GB memory.
- Nodes cannot connect to Master: Check the firewall between nodes and make sure all the nodes can talk to each other on the required kubernetes ports.
- Calico Pod Restarts: Sometimes, if you use the same IP range for the node and pod network, Calico pods may not work as expected. So make sure the node and pod IP ranges don’t overlap. Overlapping IP addresses could result in issues for other applications running on the cluster as well.
For other pod errors, check out the kubernetes pod troubleshooting guide.
If your server doesn’t have a minimum of 2 vCPU, you will get the following error.
[ERROR NumCPU]: the number of available CPUs 1 is less than the required 2
If you use a public IP with the --apiserver-advertise-address parameter, the master node components will fail to come up with the following error. To rectify this error, use the --control-plane-endpoint parameter with the public IP address.
[kubelet-check] Initial timeout of 40s passed.
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
You will get the following error on the worker nodes when you try to join a worker node with a new token after the master node has been reset. To rectify this error, reset the worker node using the kubeadm reset command, as shown in the sketch after the error output.
[ERROR FileAvailable--etc-kubernetes-kubelet.conf]: /etc/kubernetes/kubelet.conf already exists
[ERROR Port-10250]: Port 10250 is in use
[ERROR FileAvailable--etc-kubernetes-pki-ca.crt]: /etc/kubernetes/pki/ca.crt already exists
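Here is a minimal recovery sketch for that situation, to be run on the affected worker node: reset the kubeadm state and then re-run the join command. Note that kubeadm reset does not revert CNI configuration or iptables rules, so clean those up manually if required. The IP, token, and hash below are placeholders.
# On the worker node: wipe the previous kubeadm state
sudo kubeadm reset -f
# Re-join using a fresh token generated on the control plane with:
#   kubeadm token create --print-join-command
sudo kubeadm join <control-plane-ip>:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash>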
Kubernetes Cluster Important Configurations
Following are the important Kubernetes cluster configurations you should know.
| Configuration | Location |
|---|---|
| Static pod manifests (etcd, api-server, controller manager and scheduler) | /etc/kubernetes/manifests |
| TLS certificates (kubernetes-ca, etcd-ca and kubernetes-front-proxy-ca) | /etc/kubernetes/pki |
| Admin kubeconfig file | /etc/kubernetes/admin.conf |
| Kubelet configuration | /var/lib/kubelet/config.yaml |
Some configurations are controlled by Kubernetes feature gates. If you want to use features that are behind feature gates, you need to enable them during Kubeadm initialization using a kubeadm configuration file, as shown in the sketch below.
You can refer to enabling feature gates in Kubeadm blog to understand more.
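For illustration, here is a minimal sketch of a kubeadm configuration file, assuming the kubeadm.k8s.io/v1beta3 ClusterConfiguration schema. The feature gate name is a placeholder; replace it with the gate you actually need. Also note that kubeadm init --config cannot be mixed with most of the individual flags used earlier in this guide.
cat <<EOF > kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.30.0
networking:
  podSubnet: 192.168.0.0/16
apiServer:
  extraArgs:
    feature-gates: "SomeFeatureGate=true"
EOF

sudo kubeadm init --config kubeadm-config.yaml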
Upgrading Kubeadm Cluster
Using Kubeadm, you can upgrade the Kubernetes cluster to a newer patch release or to a new minor version.
Kubeadm upgrade doesn’t introduce any downtime if you upgrade one node at a time.
For hands-on practice, please refer to my step-by-step guide on Kubeadm cluster upgrade. A high-level sketch of the control plane upgrade flow is shown below.
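This is a hedged outline only, assuming the apt packages from pkgs.k8s.io used earlier; the target version shown here (1.30.1-1.1) is just an example, and the per-worker-node steps are covered in the upgrade guide.
# On the control plane node: upgrade kubeadm first, then plan and apply
sudo apt-get update
sudo apt-get install -y --allow-change-held-packages kubeadm=1.30.1-1.1
sudo kubeadm upgrade plan
sudo kubeadm upgrade apply v1.30.1

# Drain the node, upgrade kubelet and kubectl, then bring it back
kubectl drain controlplane --ignore-daemonsets
sudo apt-get install -y --allow-change-held-packages kubelet=1.30.1-1.1 kubectl=1.30.1-1.1
sudo systemctl daemon-reload
sudo systemctl restart kubelet
kubectl uncordon controlplane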
Backing Up ETCD Data
etcd backup is one of the key tasks in real-world projects and for the CKA certification.
You can follow the etcd backup guide to learn how to perform etcd backup and restore.
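For reference, here is a minimal snapshot sketch using etcdctl, assuming the default kubeadm certificate paths under /etc/kubernetes/pki/etcd and etcd listening on 127.0.0.1:2379. The backup path is just an example.
# Run on the control plane node (requires the etcdctl binary, e.g. from the etcd-client package)
sudo ETCDCTL_API=3 etcdctl snapshot save /opt/etcd-backup.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

# Check the snapshot details
sudo ETCDCTL_API=3 etcdctl --write-out=table snapshot status /opt/etcd-backup.db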
Setup Prometheus Monitoring
As a next step, you can try setting up the Prometheus monitoring stack on the Kubeadm cluster.
I have published a detailed guide for the setup. Refer to the Prometheus on Kubernetes guide for step-by-step instructions. The stack contains Prometheus, Alertmanager, kube-state-metrics, and Grafana.
How Does Kubeadm Work?
Here is how the Kubeadm setup works.
When you initialize a Kubernetes cluster using Kubeadm, it does the following.
- When you initialize kubeadm, first it runs all the preflight checks to validate the system state and it downloads all the required cluster container images from the registry.k8s.io container registry.
- It then generates required TLS certificates and stores them in the /etc/kubernetes/pki folder.
- Next, it generates all the kubeconfig files for the cluster components in the /etc/kubernetes folder.
- Then it starts the kubelet service, generates the static pod manifests for all the cluster components, and saves them in the /etc/kubernetes/manifests folder.
- Next, it starts all the control plane components from the static pod manifests.
- Then it installs the CoreDNS and kube-proxy components.
- Finally, it generates the node bootstrap token.
- Worker nodes use this token to join the control plane.
As you can see, all the key cluster configurations are present under the /etc/kubernetes folder.
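You can see this breakdown for yourself: kubeadm exposes each stage as a separate init phase, and the generated artifacts land under /etc/kubernetes. A couple of quick checks:
# kubeadm exposes each stage of init as a separate phase subcommand
kubeadm init phase --help

# Inspect the generated artifacts on the control plane node
ls /etc/kubernetes/manifests   # static pod manifests
ls /etc/kubernetes/pki         # TLS certificates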
Kubeadm FAQs
How to use Custom CA Certificates With Kubeadm?
By default, kubeadm creates its own CA certificates. However, if you wish to use custom CA certificates, place them in the /etc/kubernetes/pki folder. When kubeadm runs, it will use existing certificates if they are found and will not overwrite them.
How to generate the Kubeadm Join command?
You can use the kubeadm token create --print-join-command command to generate the join command.
Conclusion
In this post, we learned to install Kubernetes step by step using kubeadm.
As a DevOps engineer, it is good to have an understanding of the Kubernetes cluster components. With companies using managed Kubernetes services, we miss learning the basic building blocks of kubernetes.
This Kubeadm setup is good for learning and playing around with kubernetes.
Also, there are many other Kubeadm configurations that I did not cover here, as they are out of the scope of this guide; please refer to the official Kubeadm documentation. By having the whole cluster setup in VMs, you can learn all the cluster component configurations and troubleshoot the cluster on component failures.
Also, with Vagrant, you can create simple automation to bring up and tear down Kubernetes clusters on-demand in your local workstation. Check out my guide on automated kubernetes vagrant setup using kubeadm.
If you are learning kubernetes, check out the comprehensive Kubernetes tutorial for beginners.