If you are a sysadmin or someone trying to get into DevOps / SRE roles related to the Kubernetes platform, you might be wondering about the activities involved in the setup, administration, and support of a production-ready Kubernetes cluster.
This blog will look into the key activities involved in setting up a production-ready Kubernetes cluster.
We have spoken to more than five Kubernetes SMEs and compiled the practices they followed to deploy Kubernetes clusters that host applications in production.
Types of Kubernetes Cluster Implementation
There are primarily two types of Kubernetes implementation:
- Managed Kubernetes clusters (GKE, AKS, EKS, etc.): A managed cluster is the easiest way to get started with a highly available Kubernetes cluster on the cloud. It takes away much of the administrative overhead of managing a Kubernetes cluster. The control plane (master) components are managed by the cloud provider, which takes care of the scaling and availability of the control plane.
- Self-hosted clusters (On-Cloud & On-Prem): A high level of expertise is required in setting up a highly available self-hosted Kubernetes cluster. It involves networks, storage, and sometimes even custom scaling solutions. Self-hosted Kubernetes clusters are common in on-prem environments.
High-Level Kubernetes Cluster Setup Activities
Irrespective of the type of cluster, the following is a generic list of activities carried out for a production-ready Kubernetes cluster.
1. Kubernetes POC
When you start with Kubernetes, you need to figure out many things before starting the actual development: networking, scaling, deployments, upgrade strategy, and a lot more.
The order in which these things are carried out will vary depending on the timelines and the type of application hosted on the cluster.
In most cases, POCs are carried out in sandboxed environments, because images and utilities from public repositories are used as part of testing.
Kubernetes POC Checklist
The following is a generic list of activities that you can make part of the Kubernetes POC.
- Network setup for Nodes & Pods: Test all the network options available with the cloud provider or on-prem environment.
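- Cluster Security: When it comes to Kubernetes, pod-level security controls, RBAC, and network policies have to be tested out as per your project requirements. Note that PodSecurityPolicy (PSP) was removed in Kubernetes v1.25, so on current versions test Pod Security Admission or a policy engine instead.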
- Ingress: Do the research and try a couple of ingress controllers that are production-ready and see which one solves your requirements.
- Cluster Scaling: Most cloud providers provide cluster scaling capabilities out of the box. However, if you plan to host mission-critical apps, make sure you have mechanisms for the graceful eviction of pods.
- Cluster Upgrade: Test the best way to upgrade the cluster (upgrading the same cluster in place vs. migrating to a new cluster).
- High Availability: Test clusters spanning multiple zones, pod topology spread constraints, and multi-regional clusters. If you are on Google Cloud, there are even multi-cluster services.
- Storage: Choose the right storage solution for persistent volumes to deal with stateful applications.
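Two items from the checklist above, graceful eviction and spreading pods across zones, map to concrete Kubernetes objects that are easy to try out in a POC. The following is a minimal sketch, not a definitive setup: it builds a PodDisruptionBudget and a topology spread constraint as plain Python dictionaries and prints them as JSON. The names `checkout`, `shop`, and `checkout-pdb` are hypothetical, and the values (`minAvailable` of 2, `maxSkew` of 1) are assumptions you would tune for your own workload.

```python
import json


def pod_disruption_budget(name, namespace, app_label, min_available):
    """Build a policy/v1 PodDisruptionBudget manifest as a plain dict.

    A PDB limits how many matching pods a voluntary disruption
    (for example a node drain during scale-down) may take away at once.
    """
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "minAvailable": min_available,
            "selector": {"matchLabels": {"app": app_label}},
        },
    }


def zone_spread_constraint(app_label, max_skew=1):
    """Build one entry for a pod spec's topologySpreadConstraints list.

    It asks the scheduler to keep the pod count per zone within
    max_skew of each other for pods carrying the given app label.
    """
    return {
        "maxSkew": max_skew,
        "topologyKey": "topology.kubernetes.io/zone",
        "whenUnsatisfiable": "DoNotSchedule",
        "labelSelector": {"matchLabels": {"app": app_label}},
    }


if __name__ == "__main__":
    # Hypothetical application and namespace names, for illustration only.
    pdb = pod_disruption_budget("checkout-pdb", "shop", "checkout", 2)
    print(json.dumps(pdb, indent=2))
    print(json.dumps(zone_spread_constraint("checkout"), indent=2))
```

The PodDisruptionBudget output is a complete manifest, while the spread constraint is only a fragment that belongs inside a pod template's `spec.topologySpreadConstraints` list.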
2. Admin Activities on Managed Kubernetes Clusters
The majority of Kubernetes implementations on the cloud are done using managed services like GKE, AKS, or EKS. You can create a highly available GKE cluster with a few clicks.
However, in real projects, clusters are usually provisioned with automation tools as part of IaC (Infrastructure as Code).
When using managed Kubernetes services, the administrative overhead is much lower, and most of the focus can be put on hosting applications and managing them efficiently.
Following are examples of admin activities on a cloud-managed Kubernetes cluster.
Note: Most of these activities would be part of automation (IaC). The tools used for automation depend on the organisation's or team's decision. For example, you could use Terraform to provision a Kubernetes cluster on AWS, Google Cloud, or Azure, or use cloud-specific automation services like AWS CloudFormation, Google Cloud Deployment Manager, or Azure Resource Manager.
- Cluster provisioning
- Deploying and managing platform tools for cluster management (Ingress, Prometheus, Grafana, billing utilities, SSL management utilities, etc.)
- Developing reusable deployment modules using tools like Helm.
- Setting up logging/alerting systems
- Enabling/deploying and managing security tools and features (RBAC, pod security controls such as Pod Security Admission, etc.)
- Developing automation to provision resources on the cluster (resource quotas, namespaces, etc.)
- Cluster/Node upgrades
- SSL Certificate management.
- 24/7 support for hosted apps.
- Access management
- Pre-scaling cluster resources for peak and off-peak loads.
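As an illustration of the "automation to provision resources on the cluster" item above, here is a small sketch that generates a Namespace and a matching ResourceQuota for a team. It is only one possible approach under stated assumptions: the function names, the `team` label, the team name `payments`, and the quota values are all hypothetical, and a real setup would more likely live in Helm or Terraform alongside the rest of your IaC.

```python
import json


def namespace_with_quota(team, cpu, memory, max_pods):
    """Return a Namespace and a ResourceQuota manifest for one team."""
    namespace = {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": team, "labels": {"team": team}},
    }
    quota = {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": f"{team}-quota", "namespace": team},
        "spec": {
            "hard": {
                # Caps on the sum of resource requests in the namespace.
                "requests.cpu": cpu,
                "requests.memory": memory,
                # Cap on the number of pods in the namespace.
                "pods": str(max_pods),
            }
        },
    }
    return [namespace, quota]


def as_list(manifests):
    """Wrap manifests in a v1 List so they can be applied in one call."""
    return {"apiVersion": "v1", "kind": "List", "items": manifests}


if __name__ == "__main__":
    # Hypothetical team name and limits, for illustration only.
    manifests = namespace_with_quota("payments", "8", "16Gi", 50)
    print(json.dumps(as_list(manifests), indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed output could be piped into `kubectl apply -f -` in a sandbox cluster to try it out.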
3. Admin Activities on Self-Hosted Kubernetes Clusters
Custom implementations of Kubernetes are normally done in data centers. Here, there could be more administrative work in terms of initial setup, upgrades, monitoring, resource allocation, and so on. The list includes:
- Cluster provisioning
- Cluster upgrades
- Cluster resource allocation
- Implementing solutions for cluster scaling and availability
- Applying security patches to cluster nodes.
- Load balancer & Ingress setup
- Performance benchmarking/Optimization and tuning
- Disaster management solutions
- Detailed documentation of SOPs.
- Network management
Plus all the activities we discussed under the managed Kubernetes cluster section.
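To make the node patching activity above more concrete, the sketch below plans a rolling patch: nodes are split into batches so that only a limited number are out of service at a time, and each node is drained before patching and uncordoned afterwards. This is an assumption-laden outline rather than a recommended procedure. The node names and the batch size are hypothetical, the patch step itself is left as a placeholder because it depends on your OS and tooling, and the script only prints the kubectl commands instead of running them.

```python
def plan_batches(nodes, max_unavailable):
    """Split nodes into batches of at most max_unavailable nodes."""
    if max_unavailable < 1:
        raise ValueError("max_unavailable must be at least 1")
    return [
        nodes[i:i + max_unavailable]
        for i in range(0, len(nodes), max_unavailable)
    ]


def node_patch_steps(node):
    """Return the ordered steps for patching one node.

    Each step is (name, command); the patch step has no command
    because it is site-specific (package manager, image rebuild, etc.).
    """
    return [
        ("drain", ["kubectl", "drain", node,
                   "--ignore-daemonsets", "--delete-emptydir-data"]),
        ("patch", None),
        ("uncordon", ["kubectl", "uncordon", node]),
    ]


if __name__ == "__main__":
    # Hypothetical node names, for illustration only.
    nodes = ["worker-1", "worker-2", "worker-3"]
    for number, batch in enumerate(plan_batches(nodes, 2), start=1):
        print(f"Batch {number}")
        for node in batch:
            for name, command in node_patch_steps(node):
                shown = " ".join(command) if command else "(site-specific)"
                print(f"  {node}: {name}: {shown}")
```

Draining respects any PodDisruptionBudgets defined for the workloads, so the batch size here and the budgets on the cluster should be chosen together.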
Who is Responsible For Managing the Kubernetes Cluster?
To answer this question, you must understand the different types of teams that exist in organizations today.
These days, there are enterprise solutions in the market that offer hybrid Kubernetes implementations. Organizations with a good budget are likely to opt for solutions like that. For example, EKS Anywhere.