Kubernetes Persistent Volumes

Kubernetes persistent storage gives applications running on Kubernetes a simple way to request and use storage. Volumes are an essential part of the Kubernetes storage architecture. Persistent volumes are volumes that exist at the cluster level and can outlive the pods that use them, so data is kept for longer than the lifetime of any single pod.

There are two more key Kubernetes storage concepts: Persistent Volume Claims (PVCs), which are requests for storage resources made by users on behalf of their pods, and Storage Classes, which describe types of storage and let Kubernetes resources use them without dealing with the underlying implementation. Let’s scroll down the blog to learn about them in detail, along with their features, lifecycles, use cases, and setup process.

What are Kubernetes Persistent Volumes?

PVs, or Persistent Volumes, are units of storage in a Kubernetes cluster that are provisioned by an administrator or, as described later, dynamically through Storage Classes. A PV is a cluster resource, just as a node is a cluster resource. The data in a persistent volume is not lost when a pod goes down, because the volume’s lifecycle is independent of the pod that uses it.

They are specified by an API object that captures the implementation details of the storage, such as an NFS share or a cloud-provider-specific storage service. The filesystem, size, and identifiers such as the volume ID and name are all defined on that object. To use such a volume, a pod must first request it through a persistent volume claim (PVC).

PVCs describe the storage capacity and characteristics a pod needs, and the cluster tries to match that request and provide the appropriate persistent volume. Kubernetes persistent volumes have the following attributes:

They can be provisioned either dynamically or manually
They are built on a specific filesystem
They have a defined size
They carry identifying attributes such as a volume ID and a name

To dig deeper into Kubernetes persistent volumes, one must be aware of the following two concepts:

1. Persistent Volume Claim (PVC)

A PVC is a request for storage made by a user, typically on behalf of a pod. In a Persistent Volume Claim, an application can specify storage attributes such as how much storage it needs and the kind of access it requires, for example read/write by a single node or read-only by many nodes.

Kubernetes searches for a PV that satisfies the PVC’s requirements and, if one exists, binds the claim to that PV. This is referred to as binding. You can also set up the cluster to provision a PV for a claim on the fly.
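As a rough illustration of such a claim, the manifest below sketches a PVC that asks for 5Gi of storage with single-node read/write access. The name demo-claim and the requested size are placeholders chosen for this example, not values from a real cluster.

```yaml
# Hypothetical claim: the name and size are illustrative assumptions.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-claim
spec:
  accessModes:
    - ReadWriteOnce        # read/write by a single node
  resources:
    requests:
      storage: 5Gi         # minimum capacity the bound PV must offer
```

A claim like this can bind to an available PV that offers at least 5Gi, supports ReadWriteOnce, and matches the claim's storage class.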

2. StorageClass

The StorageClass object lets administrators define classes of Persistent Volumes with different characteristics, including performance, size, and access parameters. It allows you to offer persistent storage to users while abstracting away implementation details. Many Kubernetes distributions ship with preconfigured StorageClasses, or you can create one yourself.

Administrators can create several Storage Classes to give users a range of performance options. For example, one class might be backed by fast SSD storage with limited capacity, and another by slower storage with a large capacity.
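The fast and slow tiers described above could be sketched as two StorageClass objects. This example assumes a cluster on AWS with the EBS CSI driver installed; the class names fast-ssd and slow-hdd are made up for illustration, and other environments would need a different provisioner and parameters.

```yaml
# Assumption: the cluster runs the AWS EBS CSI driver (ebs.csi.aws.com).
# Swap the provisioner and parameters for whatever your cluster supports.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd           # hypothetical name for the SSD-backed tier
provisioner: ebs.csi.aws.com
parameters:
  type: gp3                # general-purpose SSD volume type
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: slow-hdd           # hypothetical name for the high-capacity tier
provisioner: ebs.csi.aws.com
parameters:
  type: sc1                # cold HDD volume type
```

Users then pick a tier by name in their claims and never need to know which provisioner or disk type sits behind it.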

Features of Kubernetes Persistent Volumes

1. Capacity

You set the PV’s storage capacity using the capacity attribute. To keep amounts consistent across storage services and devices, capacity is expressed as a quantity in bytes, usually written with a suffix such as Gi.

2. Volume Mode

By default, Kubernetes creates a filesystem on the PV (volumeMode: Filesystem), but you can also expose the PV as a raw block device with no filesystem layer (volumeMode: Block) if you wish.

3. Access Modes

The access modes for a PV are as follows:

ReadWriteOnce: The volume can be mounted as read-write by a single node
ReadOnlyMany: The volume can be mounted as read-only by many nodes
ReadWriteMany: The volume can be mounted as read-write by many nodes

It’s worth noting that certain storage plugins may only offer a subset of these access options.

4. Reclaim Policy

The reclaim policy determines what happens to a PV after its claim has been deleted and the storage is no longer required. There are three policies: Retain, which keeps the PV and its data until an administrator reclaims or deletes it explicitly; Delete, which removes the PV together with its underlying storage and permanently erases the data; and Recycle, which scrubs the volume so that it can be claimed again (Recycle is deprecated in favor of dynamic provisioning). It’s worth noting that certain storage plugins support only a subset of these reclaim policies.
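The four features above correspond to fields in a PV's spec. The sketch below puts them together on a raw block volume. It assumes a local disk at /dev/sdb on a node called worker-node-1 and a storage class named local-block; all three are hypothetical and would differ in a real cluster.

```yaml
# Hypothetical PV: device path, node name, and class name are assumptions.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: block-pv-demo
spec:
  capacity:
    storage: 10Gi                        # 1. Capacity
  volumeMode: Block                      # 2. Volume mode (default is Filesystem)
  accessModes:
    - ReadWriteOnce                      # 3. Access modes
  persistentVolumeReclaimPolicy: Retain  # 4. Reclaim policy
  storageClassName: local-block
  local:
    path: /dev/sdb                       # raw device on the node
  nodeAffinity:                          # local volumes must be pinned to a node
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - worker-node-1
```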

Lifecycle of Persistent Volumes

Persistent Volumes are resources in the cluster. Persistent volume claims, in contrast, are requests for those resources and also act as claim checks on them. How PVs and PVCs interact during their lifecycle can be summarized as follows:

1. Provisioning

Persistent Volumes can be provisioned in one of the following two ways:

2. Static

Static PVs are created by a cluster administrator. They carry the details of the actual storage that is available to cluster users. They exist in the Kubernetes API and are available for consumption.

3. Dynamic

If none of the administrator’s static PVs matches a user’s Persistent Volume Claim, the cluster may try to dynamically provision a volume specifically for that PVC. For dynamic provisioning to take place, the PVC must request a storage class, and the administrator must have created and configured that class.

To enable dynamic storage provisioning based on storage class, the cluster administrator must enable the DefaultStorageClass admission controller on the API server.
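A claim that triggers dynamic provisioning might look like the sketch below. It assumes the administrator has already created a storage class called fast-ssd; both that class name and the claim name are hypothetical.

```yaml
# Hypothetical claim that asks for a dynamically provisioned volume.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-claim
spec:
  storageClassName: fast-ssd   # assumption: this class exists in the cluster
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
```

If no existing PV of that class satisfies the request, the provisioner named in the storage class is asked to create a new volume for this claim.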

4. Binding

Here’s how the binding process works in a nutshell:

A user creates a PVC, indicating the amount of storage needed and the access modes required. A control loop in the control plane then watches for new PVCs and:

Static – The control loop finds a matching PV, if one exists, and binds it to the PVC.
Dynamic – If a PV was dynamically provisioned for the PVC, the control loop always binds that PV to the PVC.

Once a PVC and a PV are bound, the binding is exclusive, because a PVC-to-PV binding is a one-to-one mapping. A ClaimRef is used to establish this bi-directional binding between the PV and the PVC.
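Normally the control plane fills in the ClaimRef itself at binding time. To make the two directions of the binding visible, the sketch below pre-binds a PV and a PVC by hand: the PV points at the claim through claimRef, and the claim points at the PV through volumeName. The names demo-pv and demo-claim, the default namespace, and the hostPath location are assumptions for illustration only.

```yaml
# Hypothetical pre-bound pair; names, namespace, and path are assumptions.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: demo-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""       # no class, so only class-less claims can match
  claimRef:                  # PV -> PVC direction of the binding
    namespace: default
    name: demo-claim
  hostPath:
    path: /mnt/data          # single-node test storage only
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-claim
  namespace: default
spec:
  storageClassName: ""       # empty string disables dynamic provisioning
  volumeName: demo-pv        # PVC -> PV direction of the binding
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```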

5. Using

Pods use claims as volumes. The cluster inspects the claim to find the bound volume and mounts that volume for the pod. For volumes that support multiple access modes, the user specifies which mode they want when using the claim as a volume in a pod.

Once a user has a claim and that claim is bound, the bound PV belongs to the user for as long as they need it. Users schedule Pods and access their claimed PVs by including a persistentVolumeClaim section in a Pod’s volumes block.
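A minimal sketch of a Pod that consumes a claim this way is shown below. It assumes a bound claim named demo-claim already exists in the same namespace; the Pod name, the nginx image, and the mount path are placeholders.

```yaml
# Hypothetical Pod; assumes a bound PVC called demo-claim in this namespace.
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod
spec:
  containers:
    - name: app
      image: nginx:1.27                  # stand-in image for illustration
      volumeMounts:
        - name: data                     # must match the volume name below
          mountPath: /usr/share/nginx/html
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: demo-claim            # the claim, not the PV, is referenced
```

The Pod never names the PV directly. It refers only to the claim, and the cluster resolves the claim to whichever volume it is bound to.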

6. Storage Object in Use Protection

The purpose of Storage Object in Use Protection is to make sure that PVCs in active use by a Pod, and PVs that are bound to PVCs, are not removed from the system, because removing them could cause data loss.

If a user deletes a PVC that a Pod is still using, the PVC is not removed immediately; removal is postponed until no Pod is using it. Likewise, if an administrator deletes a PV that is bound to a PVC, the PV is not removed until it is no longer bound.

When a PVC’s status is displayed as “Terminating” and kubernetes.io/pvc-protection is included in its Finalizers list, the PVC is protected. To check, run the command below:

kubectl describe pvc hostpath

Similarly, a PV is protected when its status is Terminating and its Finalizers list includes kubernetes.io/pv-protection. To check, run the command below:

kubectl describe pv task-pv-volume

7. Reclaiming

When users are finished with their volume, they can delete the PVC object from the API, which allows the resource to be reclaimed. A PersistentVolume’s reclaim policy tells the cluster what to do with the volume once it has been released from its claim. Volumes can be Retained, Recycled, or Deleted.

8. Retain

The Retain reclaim policy allows for manual reclamation of the resource. When the PersistentVolumeClaim is deleted, the PersistentVolume still exists and the volume is considered “released.” However, because the previous claimant’s data is still on the volume, it is not yet available for another claim.

9. Delete

For volume plugins that support it, the Delete reclaim policy removes both the PersistentVolume object from Kubernetes and the associated storage asset in the external infrastructure, such as an AWS EBS, GCE PD, Azure Disk, or Cinder volume.

10. Recycle

If the underlying volume plugin supports it, the Recycle reclaim policy performs a basic scrub of the volume (rm -rf /thevolume/*) and makes it available again for a new claim.
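For dynamically provisioned volumes, the reclaim policy is inherited from the StorageClass, where it defaults to Delete. A class that keeps its volumes after the claim is gone could be sketched as follows; the class name and the AWS EBS CSI provisioner are assumptions, so substitute whatever your cluster uses.

```yaml
# Hypothetical class; assumes the AWS EBS CSI driver is installed.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: retained-ssd
provisioner: ebs.csi.aws.com
reclaimPolicy: Retain        # PVs created from this class are kept on release
parameters:
  type: gp3
```

For manually created PVs, the same choice is made per volume through the persistentVolumeReclaimPolicy field in the PV spec.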

Why do we use persistent volumes?

In the early days of containerization there were some established best practices. One of them was that containers should be stateless. Now that Kubernetes has matured and container-native storage solutions such as Portworx are available, there is no reason to exclude stateful applications. Kubernetes likewise provides features for storing, retaining, and backing up the data generated or used by applications running in containers.

Databases are the most popular use case for Persistent Volumes in Kubernetes. A database must have constant access to its data, so running databases like MySQL, Cassandra, CockroachDB, or even MS SQL in containers without persistent storage isn’t a good idea. PVs therefore make it possible to run complex, stateful workloads in containers, not merely stateless ones, while keeping the data consistent.

You can make distributed, stateful applications easier to deploy by using persistent volumes. For each pod, the deployment needs to ensure the following:

Every pod is created from scratch (with the proper config and environment variables)
The pod has a persistent volume linked to it (through a persistent volume claim)
The claimed storage is mounted inside the pod

These high-level steps are repeated for each pod in the application set. This way, we can be sure that the first pod deploys with the storage it needs, and that each subsequent replica gets the same storage attachments and mounts (required for any clustered application). This stateful set of pods can readily scale, allowing us to add more replicas to the distributed application. If any pod fails, it can be replaced and its storage reattached.
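One way Kubernetes expresses this per-pod storage pattern is a StatefulSet with volumeClaimTemplates, sketched below. The set name demo-db, the nginx stand-in image, the mount path, and the requested size are assumptions for illustration, and the example assumes a headless Service named demo-db already exists.

```yaml
# Hypothetical StatefulSet; names, image, and sizes are assumptions.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: demo-db
spec:
  serviceName: demo-db          # assumes a headless Service with this name
  replicas: 3
  selector:
    matchLabels:
      app: demo-db
  template:
    metadata:
      labels:
        app: demo-db
    spec:
      containers:
        - name: app
          image: nginx:1.27     # stand-in for a real database image
          volumeMounts:
            - name: data
              mountPath: /data
  volumeClaimTemplates:         # one PVC is created per replica
    - metadata:
        name: data
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 5Gi
```

Each replica gets its own claim (data-demo-db-0, data-demo-db-1, data-demo-db-2), and a replacement pod with the same ordinal reattaches to the same claim.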

How to use Kubernetes Persistent Volumes

Now that we have covered persistent volumes, how they differ from regular volumes, and how they are used, we are ready to go over how to use them. The following steps will help you create a Kubernetes persistent volume of a defined size. Before you start, make sure you have a working Kubernetes environment in place:

To build a persistent volume, create a file named pv-demo.yaml in your preferred editor
Edit the file and paste in the spec below

apiVersion: v1
kind: PersistentVolume
metadata:
  name: <pv_name>
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  portworxVolume:
    volumeID: "<volume_id>"

Replace <pv_name> with the PersistentVolume’s name. For instance, pv0001.

Adjust the storage capacity to the amount you want. The configuration above creates a 10 GiB volume.

Substitute <volume_id> with the volume’s appropriate ID. For example, “pv0001” might be used.

Save the file and run the command below for creating the Persistent Volume

kubectl create -f pv-demo.yaml

You should see a confirmation like the following that the Persistent Volume has been created (the exact format depends on your kubectl version)

persistentvolume "pv0001" created

For seeing the created PV, execute the below-mentioned command

kubectl get pv

Conclusion

In Kubernetes, Persistent Volumes are units of storage that administrators provision as part of a cluster. An API object captures the implementation details of the storage, such as an NFS share or a cloud-provider-specific storage service. To use such a volume, a Pod must first request it through a persistent volume claim (PVC). The StorageClass object lets administrators define classes of Persistent Volumes with different characteristics, including performance, size, and access parameters.

PVs interact with Persistent Volume Claims at several stages of their lifecycle, such as provisioning, binding, and reclaiming. Databases are the most popular use case for Persistent Volumes in Kubernetes, which is why PVs are used to run complex, stateful workloads inside containers. To set up a Kubernetes Persistent Volume in your infrastructure, follow the steps in the last section of this blog.