Kubernetes Built-in AI/ML Features You Should Know

The Kubernetes community has started offering several built-in features to deploy, manage, and scale AI/ML applications efficiently.

In this blog, I will keep track of all the native AI/ML features offered by Kubernetes (Alpha, Beta, and GA features).

Gateway API Inference Extension

The Kubernetes Gateway API Inference Extension is an official Kubernetes project for serving ML models.

It addresses the traffic-routing challenges for modern GenAI and LLM inference workloads.

Mounting Container Images as Volumes (Beta)

Kubernetes version 1.31 introduced image volumes as an alpha feature, and the feature was promoted to beta in version 1.33. It allows you to use OCI images as volumes directly within Kubernetes pods.

OCI images are images that follow Open Container Initiative specifications. You can use this feature to store binary artifacts in images and mount them to pods.

This is particularly useful for ML projects dealing with LLMs. Large Language Model deployment often involves pulling models from various sources like cloud object storage or other URIs.

OCI images containing model data make it much easier to manage and switch between different models. One project already experimenting with a similar feature is KServe, which has a feature called Modelcars.

Modelcars allows you to use OCI images that contain model data. With native OCI volume support in Kubernetes, some of the current challenges are simplified, making the process smoother.

Hands On Example: Image Volume With a Pod
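
The following is a minimal sketch of what such a pod could look like, written as a Python script that builds the manifest and prints it as JSON (kubectl accepts JSON as well as YAML). The image references, the pod name, and the /models mount path are hypothetical placeholders, and the sketch assumes the ImageVolume feature gate is enabled and the container runtime supports image volumes.

```python
import json


def image_volume_pod(name: str, model_image: str, mount_path: str = "/models") -> dict:
    """Build a Pod manifest that mounts an OCI image as a volume."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": "inference",
                    # Hypothetical serving image; replace with your own runtime.
                    "image": "registry.example.com/serving/runtime:latest",
                    # The model files appear under mount_path inside the container.
                    "volumeMounts": [{"name": "model", "mountPath": mount_path}],
                }
            ],
            "volumes": [
                {
                    "name": "model",
                    # The "image" volume source points at an OCI image or artifact.
                    "image": {"reference": model_image, "pullPolicy": "IfNotPresent"},
                }
            ],
        },
    }


# Hypothetical model image; the name and tag are placeholders.
manifest = image_volume_pod("llm-server", "registry.example.com/models/my-llm:v1")
print(json.dumps(manifest, indent=2))
```

You could save the output to a file and apply it with kubectl apply -f. Switching models then comes down to changing the image reference.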

Kubernetes Device Plugins (Stable)

GPUs are one of the key requirements for AI and ML applications.

To support this need, Kubernetes offers a feature called device plugins.

These plugins advertise a node's hardware resources to the kubelet, giving containers access to specialized devices such as NVIDIA and AMD GPUs. Device plugins typically run as DaemonSets.

This setup enables Kubernetes to efficiently manage and allocate the necessary hardware for running AI and ML workloads.

For example, you can run GPU nodes on EKS using the NVIDIA device plugin.
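
Once a device plugin is running on a node, a pod asks for the device through its resource limits. Here is a minimal sketch, again as a Python script that prints a JSON manifest. It assumes the NVIDIA device plugin is installed, which exposes GPUs under the nvidia.com/gpu resource name. The pod name and container image are hypothetical placeholders.

```python
import json


def gpu_pod(name: str, image: str, gpus: int = 1) -> dict:
    """Build a Pod manifest that requests GPUs via the nvidia.com/gpu resource."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "trainer",
                    "image": image,
                    # GPUs are requested under limits; the scheduler places the
                    # pod only on a node that advertises enough of this resource.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


# Hypothetical training image; replace with your own.
manifest = gpu_pod("gpu-test", "registry.example.com/ml/trainer:latest", gpus=1)
print(json.dumps(manifest, indent=2))
```

If no node advertises the resource, the pod stays in Pending, which is a quick way to check whether the device plugin is working.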

Official Kubernetes Community Projects

The following are the other key official Kubernetes community projects to watch.

  1. JobSet: For distributed training orchestration
  2. Kueue: For intelligent job queueing with topology awareness
  3. LeaderWorkerSet: API for deploying groups of pods as units, specifically designed for multi-host inference workloads.
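
To give a feel for these projects, here is a rough sketch of a JobSet manifest for a small distributed training run, built in Python and printed as JSON. It assumes the JobSet CRD is installed in the cluster and served under the jobset.x-k8s.io/v1alpha2 API version, which may differ in your release. The names and the training image are hypothetical placeholders.

```python
import json


def training_jobset(name: str, image: str, workers: int = 2) -> dict:
    """Build a JobSet manifest with one replicated Job running `workers` pods."""
    return {
        # Assumed API version; check the version your JobSet installation serves.
        "apiVersion": "jobset.x-k8s.io/v1alpha2",
        "kind": "JobSet",
        "metadata": {"name": name},
        "spec": {
            "replicatedJobs": [
                {
                    "name": "workers",
                    "replicas": 1,
                    # Each replicated job wraps a regular batch/v1 Job template.
                    "template": {
                        "spec": {
                            "parallelism": workers,
                            "completions": workers,
                            "completionMode": "Indexed",
                            "template": {
                                "spec": {
                                    "restartPolicy": "Never",
                                    "containers": [{"name": "trainer", "image": image}],
                                }
                            },
                        }
                    },
                }
            ]
        },
    }


# Hypothetical training image; replace with your own.
manifest = training_jobset("demo-training", "registry.example.com/ml/trainer:latest", workers=4)
print(json.dumps(manifest, indent=2))
```
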
About the author
Bibin Wilson

Bibin Wilson (authored over 300 tech tutorials) is a cloud and DevOps consultant with over 12 years of IT experience. He has extensive hands-on experience with public cloud platforms and Kubernetes.
