How to Add Persistent Volume in Google Kubernetes Engine

Add Persistent Volume to Pods in Google Kubernetes Engine

In this blog, you will learn how to set up persistent volumes for a GKE Kubernetes cluster.

If you want to preserve data even after pod deletions or pod failures, you should use persistent volumes.

About GKE Persistent Volumes

In GKE, you can provision a Google Cloud persistent disk (a Compute Engine disk) to be used as a persistent volume in the Kubernetes cluster.

You can dynamically provision the persistent volumes on demand using Kubernetes persistent volume claim manifests.

Based on the persistent disk type defined in the storage class, GKE provisions the volume and makes it available for use in the cluster.

Following are some important concepts you need to be aware of when using persistent volumes with GKE.

  1. You can use both Google persistent disks and Google Filestore (NFS) as persistent volumes for GKE.
  2. Persistent disks can be regional (which offer higher durability) or zonal, based on your requirements.
  3. You can also use a pre-existing persistent disk with data on it as a GKE persistent volume.
  4. By default, dynamic provisioning uses the pd-standard disk type as the storage class. It is one of the four persistent disk types offered by Google Cloud.
  5. You can back up persistent volumes using a volume snapshot. This feature works only if the Compute Engine persistent disk CSI driver is enabled in the cluster (see the command after this list).
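
If you plan to use volume snapshots (point 5 above), you can enable the Compute Engine persistent disk CSI driver on an existing cluster with gcloud. A minimal sketch, assuming a hypothetical cluster named my-cluster in us-central1-a:

# my-cluster and the zone are placeholders; replace them with your cluster details
gcloud container clusters update my-cluster \
    --zone=us-central1-a \
    --update-addons=GcePersistentDiskCsiDriver=ENABLED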

Setup Persistent Volume For GKE

We will do the following.

  1. Create a storage class
  2. Provision a Persistent volume using the storage class.
  3. Test a deployment with the persistent volume.

Let's get started with the setup.

Create a storage class for GKE

A storage class is a simple way of segregating storage options.

To put it simply, a storage class defines what type of storage is to be provisioned.

For example, we can classify our storage classes as gold and silver. These names are arbitrary; use names that are meaningful to you.

The gold storage class uses the pd-ssd persistent disk type for high-IOPS applications (for example, databases), while the silver storage class uses the pd-standard disk type for backups and normal disk operations.

How you segregate storage classes is entirely based on your project requirements.

Note: There are default storage classes available in GKE, backed by pd-standard disks. If you don’t specify a storage class while provisioning a PV, the default storage class is used.
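
You can list the storage classes that already exist in the cluster (the one marked as default is used when no storage class is specified):

kubectl get storageclass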

Let's create a gold storage class.

Save the following manifest as storage-class.yaml.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gold
provisioner: kubernetes.io/gce-pd
volumeBindingMode: Immediate
allowVolumeExpansion: true
reclaimPolicy: Delete
parameters:
  type: pd-ssd
  fstype: ext4
  replication-type: none

Create the storage class.

kubectl apply -f storage-class.yaml

A quick explanation of the parameters:

  1. type: Supports pd-standard and pd-ssd. If you don’t specify anything, it defaults to pd-standard.
  2. fstype: Supports ext4 and xfs. Defaults to ext4.
  3. replication-type: Decides whether the disk is zonal or regional. If you don’t specify regional-pd, it defaults to a zonal disk (see the regional storage class example after this list).
  4. allowVolumeExpansion: When set to true, you can expand the persistent volume later if required.
  5. volumeBindingMode: There are two modes, Immediate and WaitForFirstConsumer. In cases where the storage is not accessible from all nodes, use WaitForFirstConsumer so that volume binding happens only after the pod is created.
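
To illustrate point 3 above, here is a sketch of a storage class that provisions regional persistent disks. The name regional-gold is arbitrary, and the replica zones must belong to the same region as your cluster:

# Illustrative regional storage class; the name is arbitrary
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: regional-gold
provisioner: kubernetes.io/gce-pd
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
reclaimPolicy: Delete
parameters:
  type: pd-ssd
  replication-type: regional-pd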

Create a Persistent Volume using PVC on GKE

To dynamically create a persistent volume, you need to create a PersistentVolumeClaim (PVC).

A PersistentVolumeClaim is the way to request storage based on a storage class and use it with a pod. The pod-to-persistent-volume mapping happens through the PVC.

Meaning, when you create a PVC, a persistent volume (a Google persistent disk) is dynamically provisioned based on the pd-standard or pd-ssd type you specified in the storage class.

So PVC (Request) –> Storage Class (Defines Type of disk) –> Persistent Volume (Google Persistent Disk)

There are three access modes for Kubernetes PVCs:

  1. ReadWriteOnce: The volume can be mounted as read-write by a single node at a time.
  2. ReadOnlyMany: Multiple nodes can mount the volume read-only.
  3. ReadWriteMany: Multiple nodes can mount the volume as read-write.

Note: The ReadWriteMany option is not supported by persistent volumes backed by Google persistent disks. If you have a use case that needs ReadWriteMany volumes, consider Google Filestore-backed persistent volumes for GKE.
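
For reference, here is a minimal sketch of a Filestore-backed storage class that supports ReadWriteMany, assuming the Filestore CSI driver is enabled on your cluster. The class name and network are illustrative:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: filestore-rwx   # illustrative name
provisioner: filestore.csi.storage.gke.io
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
parameters:
  tier: standard
  network: default      # assumes the default VPC network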

In our example, we are creating a PVC using the gold storage class with 50Gi of storage in the default namespace. You can also assign a custom namespace under metadata.

Save the manifest as pvc.yaml

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: webapps-storage
spec:
  storageClassName: gold
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi

Now, create the PVC.

kubectl apply -f pvc.yaml

You can check the pv and pvc using the following commands.

kubectl get pv
kubectl get pvc

If you check the Compute Engine disks in the Google Cloud console, you will see a 50GB disk created.
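
You can also verify this from the command line. The filter below assumes the dynamically provisioned disk name contains "pvc", which is the usual naming pattern:

gcloud compute disks list --filter="name~pvc"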

Since the gold storage class volumeBindingMode is Immediate, you will see the volume provisioned and the claim available for pods to use.

If your storage class binding mode is WaitForFirstConsumer, the PVC status will show as Pending after you deploy it. This is because Kubernetes creates the persistent volume only after a pod that uses the PVC is created.

The following image shows the difference between immediate and WaitForFirstConsumer modes.
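
Also, since the gold storage class sets allowVolumeExpansion: true, you can expand the volume later by increasing the storage request on the PVC. A quick sketch, using 100Gi purely as an example size:

kubectl patch pvc webapps-storage -p '{"spec":{"resources":{"requests":{"storage":"100Gi"}}}}'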

Creating Persistent Volumes From Existing Google Compute Disks

You can create persistent volumes from existing Google Compute Engine disks.

For demonstration purposes, I am creating a compute disk named gke-pv with a size of 50GB.

gcloud compute disks create gke-pv  --zone=us-central1-a --size=50GB

Now we have a disk available to be used as PV in GKE.

The next steps are to:

  1. Create a persistent volume named app-storage from the gke-pv disk.
  2. Create a persistent volume claim with the same name used in the PV claimRef (i.e., app-storage-claim), so the pod can use the persistent volume.

Save the following manifest as disk-pv.yaml

apiVersion: v1
kind: PersistentVolume
metadata:
  name: app-storage
spec:
  storageClassName: "apps"
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteOnce
  claimRef:
    namespace: default
    name: app-storage-claim
  gcePersistentDisk:
    pdName: gke-pv
    fsType: ext4
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-storage-claim
spec:
  storageClassName: "apps"
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi

Create the PV & PVC.

kubectl apply -f disk-pv.yaml

You can check the pv and pvc using the following commands.

kubectl get pv
kubectl get pvc
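
To confirm that the pre-existing disk was bound through the claimRef, describe the persistent volume and check that its status is Bound and that the claim points to default/app-storage-claim:

kubectl describe pv app-storage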

Example GKE Pod With Persistent Volume

Now that we know the two ways to use Google persistent disks as GKE persistent volumes, let's look at using a persistent volume in a pod.

To mount a persistent volume to a pod, we reference the persistent volume claim name in the volumes section, and we reference the volume name in the volumeMounts section along with the container path to mount.

In our example, we will mount the /usr/share/nginx/html path to the persistent volume.

apiVersion: v1
kind: Pod
metadata:
  name: nginx-app-pod
spec:
  volumes:
    - name: app-storage
      persistentVolumeClaim:
        claimName: app-storage-claim
  containers:
    - name: nginx-app-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: app-storage
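
Create the pod and confirm that the claim is mounted at the expected path. The file name nginx-pod.yaml is an assumption; use whatever name you saved the manifest as:

kubectl apply -f nginx-pod.yaml
kubectl exec -it nginx-app-pod -- df -h /usr/share/nginx/html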

Example GKE Deployment With Persistent Volume Claim

Let's try using a persistent volume in a Jenkins deployment.

This Jenkins deployment creates a pod with all its data mounted to the persistent volume. So, even if you delete the pod, a new pod will come up and mount itself to the persistent volume, preserving the existing Jenkins state.
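
Once the deployment below is running, you can verify this behavior by deleting the Jenkins pod and watching the replacement pod come up with the same data. A quick sketch:

kubectl delete pod -l app=jenkins
kubectl get pods -l app=jenkins -w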

Here is a short explanation of the deployment:

  1. Create a persistent volume claim with the gold storage class.
  2. In the deployment, create a volumes definition named jenkins-data and add jenkins-pv-claim as the persistent volume claim backing it.
  3. In the container spec, under volumeMounts, define the volume name and the mount path /var/jenkins_home for the container.
  4. A service exposes Jenkins on NodePort 32000.

Save the following manifest as jenkins.yaml. It has the PVC, deployment and service definitions.

# Persistent Volume Claim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: jenkins-pv-claim
spec:
  storageClassName: gold
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi

# Deployment Config
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: jenkins-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jenkins
  template:
    metadata:
      labels:
        app: jenkins
    spec:
      securityContext:
        fsGroup: 1000
        runAsUser: 1000
      containers:
        - name: jenkins
          image: jenkins/jenkins:lts
          resources:
            limits:
              memory: "2Gi"
              cpu: "1000m"
            requests:
              memory: "500Mi"
              cpu: "500m"
          ports:
            - name: httpport
              containerPort: 8080
            - name: jnlpport
              containerPort: 50000
          livenessProbe:
            httpGet:
              path: "/login"
              port: 8080
            initialDelaySeconds: 90
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 5
          readinessProbe:
            httpGet:
              path: "/login"
              port: 8080
            initialDelaySeconds: 60
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          volumeMounts:
            - name: jenkins-data
              mountPath: /var/jenkins_home         
      volumes:
        - name: jenkins-data
          persistentVolumeClaim:
              claimName: jenkins-pv-claim

# Service Config
---
apiVersion: v1
kind: Service
metadata:
  name: jenkins-service
  annotations:
      prometheus.io/scrape: 'true'
      prometheus.io/path:   /
      prometheus.io/port:   '8080'
spec:
  selector: 
    app: jenkins
  type: NodePort  
  ports:
    - port: 8080
      targetPort: 8080
      nodePort: 32000

Create the deployment.

kubectl apply -f jenkins.yaml

Once the deployment is up and running, you will be able to access the Jenkins server on port 32000 of any node in the cluster.
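
To reach the NodePort from outside the cluster, you typically need a node's external IP and a firewall rule that allows traffic on port 32000. A hedged sketch (the firewall rule name is arbitrary, and your nodes must have external IPs):

kubectl get nodes -o wide
gcloud compute firewall-rules create allow-jenkins-nodeport --allow tcp:32000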

Conclusion

We have seen how to set up persistent volumes on a GKE cluster with a few examples using pods and deployments.

Hope this article helps.

Let me know in the comment section if you face any issues.

8 comments
  1. Hi Bibin,
    It’s a fantastic blog. I followed your blog and successfully created a PV and a PVC and mounted the volume in a pod. I am also trying to create a second gce-pd, PV, and PVC. The second pod was created successfully, but I am not able to exec into that pod; I get an error like the one below.
    error: Internal error occurred: error executing command in container: failed to exec in container: failed to start exec “b71e2357410a9aad1e4ef6259a090dffba8ed2b4e3e5181edbb38ac965a82de5”: OCI runtime exec failed: write /tmp/runc-process008951230: no space left on device: unknown
    Below are the statuses of both PVs and PVCs.
    pv
    NAME                      CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                             STORAGECLASS   REASON   AGE
    bfk-storage               420Gi      RWO            Retain           Bound    default/bfk-storage-claim         standard                27h
    statemine-storage         200Gi      RWO            Retain           Bound    default/statemine-storage-claim   standard                20m
    pvc
    NAME                      STATUS   VOLUME              CAPACITY   ACCESS MODES   STORAGECLASS   AGE
    bfk-storage-claim         Bound    bfk-storage         420Gi      RWO            standard       27h
    statemine-storage-claim   Bound    statemine-storage   200Gi      RWO            standard       21m
    It’s functioning for the first case but not for the second.
    There is no error for the pod/deployment (logs/describe show nothing).

    Are multiple PVs and PVCs not possible in the same namespace?
    Thank you, waiting for your comment.

  2. Thanks for the awesome article, well explained.
    What should be the recommended access mode in the PVC for the above Jenkins deployment if we have more than one copy of the pod running on different nodes?

    1. Hi Rohit,

      I am not sure if running multiple copies of the Jenkins pod is a good idea. It might lead to file inconsistencies, as Jenkins uses flat files for storing data. For multiple pods, the access mode would need to be ReadWriteMany.
