Kubernetes Fundamentals: Volumes

Gayatri S Ajith
FAUN — Developer Community 🐾
8 min read · Mar 19, 2021

Data storage in a distributed environment is complex in itself. Add to that virtualisation and automation and you have multiple layers of planning to do. So, how does data work in a Kubernetes cluster?

Here I assume you are familiar with the basic objects in Kubernetes and how they interact with each other.

There are 2 Parts

There are 2 components to volumes in Kubernetes —

  • The Cluster Object.
  • The Physical Storage.

The persistence and shareability of these components are independent of each other. If the object dies, that does not mean the physical data is lost or cleared: a new object can be created and attached to the same physical storage. Beginners often find this confusing because they equate persistence of the object with persistence of the data. The two are not the same.
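
As a rough illustration of that independence, here is a minimal sketch of a Pod whose volume is backed by a folder on the node. The Pod name `demo-pod`, the volume name `node-folder` and the path `/mnt/data` are assumptions made up for this example. Deleting the Pod removes the volume object, but the files under `/mnt/data` stay on the node, and a new Pod that declares the same `hostPath` on that node would see them again.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod              # hypothetical name
spec:
  volumes:
    - name: node-folder       # cluster object: lives and dies with the Pod
      hostPath:
        path: /mnt/data       # physical storage: a folder on the node (assumed path)
        type: DirectoryOrCreate
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: node-folder
          mountPath: /data    # where the folder appears inside the container
```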

Figure: Volumes in Kubernetes. Multiple objects can point to the same physical storage. Persistence of the object is independent of the persistence of the physical storage.

I — Cluster Objects

Kubernetes supports 2 types of volume objects:

  • Volumes
  • Persistent Volumes

Remember, we are talking about the cluster object, not the physical storage. Whether it is a Volume or a Persistent Volume, you still have to plan which physical storage will back the object, that is, where the files will actually be saved.

Kubernetes Volumes

Containers inside a Pod can share Kubernetes Volume objects. Hence, the life of such objects is tightly associated with the life of the Pod. These objects do not have a life independent of the Pod. As soon as the Pod dies this object dies. This object persists across container restarts.

Anytime you want to access data in a container of a Pod you have to use a Kubernetes Volume object — whether the data is from a Folder, ConfigMap, Secret, or a Kubernetes Persistent Volume.
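
To make this concrete, here is a minimal sketch of a Pod that reads a ConfigMap and a Secret through Volume objects. The names `my-config` and `my-secret` are assumptions: they stand for a ConfigMap and a Secret that would already have to exist in the same namespace.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: config-demo              # hypothetical name
spec:
  volumes:
    - name: app-config
      configMap:
        name: my-config          # assumed existing ConfigMap
    - name: app-secret
      secret:
        secretName: my-secret    # assumed existing Secret
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: app-config
          mountPath: /etc/app/config   # each key shows up as a file here
          readOnly: true
        - name: app-secret
          mountPath: /etc/app/secret
          readOnly: true
```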

Figure: Kubernetes Volumes

Kubernetes Persistent Volumes (PV)

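Pods can share Kubernetes Persistent Volume objects, irrespective of the node on which a Pod is created. Hence, the life of a persistent volume object is independent of the Pods that use it. Remember, Kubernetes never deletes a manually created Persistent Volume object on its own: such objects default to the Retain reclaim policy, so you always have to trigger the deletion yourself. (Dynamically provisioned volumes can behave differently, because their storage class may set the reclaim policy to Delete.)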
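
One way to make this keep-until-deleted-manually behaviour explicit is the `persistentVolumeReclaimPolicy` field of the PV. The sketch below is only illustrative: the object name `retained-pv`, the class name `manual` and the path `/mnt/data` are assumptions.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: retained-pv                        # hypothetical name
spec:
  storageClassName: manual                 # assumed class name
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain    # keep the PV and its data after the claim is released
  hostPath:
    path: /mnt/data                        # assumed folder on the node
```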

Just because the persistent volume object can be shared across Pods, it does not mean Pods on different nodes will have access to the same files. That depends on the physical storage. If the physical storage is a simple folder on a node, Pods on different nodes need not see the same files. If, on the other hand, the physical storage is an NFS volume, Pods on different nodes can see the same files. Beginners tend to stumble on this.
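
For comparison, a PV backed by NFS might look like the sketch below. The server address `nfs.example.internal`, the export path `/exports/shared` and the class name `nfs-manual` are placeholders invented for this example, not values from a real setup.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv                     # hypothetical name
spec:
  storageClassName: nfs-manual     # assumed class name
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany                # many nodes may mount it read-write
  nfs:
    server: nfs.example.internal   # placeholder NFS server
    path: /exports/shared          # placeholder export path
```

Because every node mounts the same export, Pods on different nodes end up looking at the same files.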

For a container to access a Kubernetes Persistent Volume, its Pod has to define a Kubernetes Volume backed by that Persistent Volume. Remember, containers can only mount Kubernetes Volumes.

How to define and use a Persistent Volume

Defining and using a Kubernetes Persistent Volume isn’t as simple as using a Kubernetes Volume. The easiest way to understand this is by breaking it down into 2 steps:

  • Splitting the physical storage into multiple cluster objects of defined size: the Kubernetes Persistent Volume object (PV)
  • Describing the type of storage your application needs and finding the cluster object that matches this requirement: the Kubernetes Persistent Volume Claim object (PVC)

When a PVC object is created, K8s searches through all available PV objects and finds the one that matches the PVC. As soon as a match is found, the PV object is attached to the claim. It is a 1–1 relationship i.e., one PV object can be attached to only one PVC and vice versa. This PVC object is then used by the Pod.
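
If you would rather not leave the choice to the search, a claim can name the PV it wants through `spec.volumeName`. This is a minimal sketch, assuming a PV called `pv-volume` with class `manual` already exists; the claim name `pinned-claim` is made up for the example.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pinned-claim          # hypothetical name
spec:
  storageClassName: manual    # has to agree with the class of the named PV
  volumeName: pv-volume       # bind to this specific PV (assumed to exist)
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```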

Figure: Kubernetes Persistent Volumes

II — Physical Storage

Kubernetes supports a lot of volume types (that is, physical storage solutions). You can find a complete list in the official Kubernetes documentation. The parameters you need to pass depend on the volume type (storage solution) you choose.

An interesting point to remember is that some of these storage solutions can be dynamically provisioned, which is especially useful when you work with StatefulSet objects.
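
As a sketch of how that looks with a StatefulSet: the `volumeClaimTemplates` section asks for one claim per replica, and a dynamic provisioner creates the matching PVs. The class name `standard` is an assumption (it has to be a storage class in your cluster that supports dynamic provisioning), and the other names are invented for the example.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web                          # hypothetical name
spec:
  serviceName: web                   # assumed headless Service name
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: nginx
          image: nginx
          volumeMounts:
            - name: www
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:              # one PVC is created per replica from this template
    - metadata:
        name: www
      spec:
        storageClassName: standard   # assumed dynamically provisioned class
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
```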

Kubernetes Volume Syntax

Remember,

  • K8s volumes do not have independent existence — tightly coupled with the life of the Pod. Hence, they are defined as part of your pod spec.
  • You can have multiple K8s volume objects defined inside a Pod
  • The same K8s volume object can be mounted inside multiple containers in a Pod
apiVersion: v1
kind: Pod
metadata:
  name: [POD NAME]
spec:
  volumes:
    - name: [K8s VOL OBJ NAME]
      [STORAGE SOLUTION TYPE]:
        [STORAGE SOLUTION PARAMETERS]
    - ....
    - ....
  containers:
    - name: [CONTAINER NAME]
      image: [CONTAINER IMAGE]
      volumeMounts:
        - name: [ANY K8s VOL OBJ NAME DEFINED ABOVE]
          mountPath: [PATH INSIDE THE CONTAINER TO MOUNT THE VOL]
    - name: [CONTAINER NAME]
      image: [CONTAINER IMAGE]
      volumeMounts:
        - name: [ANY K8s VOL OBJ NAME DEFINED ABOVE]
          mountPath: [PATH INSIDE THE CONTAINER TO MOUNT THE VOL]
    - ...
    - ...

Example:

  • A K8s Volume object with the name shared-data is defined
  • Both containers, first and second, mount the same object. Note that the mount paths need not be the same.
  • Container second is manipulating a file in the mount path which will be available in the container first since they are sharing the same K8s volume object.
  • Here the physical storage solution used is emptyDir — which is an ephemeral storage solution.
apiVersion: v1
kind: Pod
metadata:
  name: two-containers
  labels:
    app: two
spec:
  volumes:
    - name: shared-data
      emptyDir: {}

  containers:
    - name: first
      image: nginx
      ports:
        - containerPort: 80
      volumeMounts:
        - name: shared-data
          mountPath: /usr/share/nginx/html

    - name: second
      image: debian
      volumeMounts:
        - name: shared-data
          mountPath: /pod-data

      command: ["/bin/sh"]
      args:
        - "-c"
        - >
          while true; do
          date >> /pod-data/index.html;
          echo Hello from the second container more >> /pod-data/index.html;
          sleep 20;
          done
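
An `emptyDir` can also be tuned a little. The sketch below assumes you want a small in-memory scratch area; the names `scratch-demo` and `scratch` and the 64Mi limit are arbitrary choices for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: scratch-demo        # hypothetical name
spec:
  volumes:
    - name: scratch
      emptyDir:
        medium: Memory      # back the volume with RAM (tmpfs) instead of node disk
        sizeLimit: 64Mi     # arbitrary cap for this example
  containers:
    - name: worker
      image: debian
      command: ["/bin/sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: scratch
          mountPath: /scratch
```

Either way the data is gone once the Pod is removed, which is what makes this storage ephemeral.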

Kubernetes Persistent Volume Syntax

Recalling the breakdown described earlier, these are the steps to use a PV.

Step 1: Define the Kubernetes Persistent Volume Object
This is where you define the cluster object and the physical storage that backs the object. Every Kubernetes PV object defines these 4 attributes:

  • Capacity: the size of the volume object. The size is written as a quantity with a unit suffix, typically Gi (gibibytes); a bare number is read as bytes.
  • AccessModes: describes how the object can be accessed by the nodes: ReadWriteOnce, ReadWriteMany, or ReadOnlyMany. This also depends on the underlying storage solution. For example, NFS can support multiple readers and writers, but a specific NFS volume object can still be defined as read-only.
  • StorageClassName: the class of physical storage this PV belongs to, which Kubernetes uses when matching claims to volumes
  • Additional storage parameters: any values needed for the physical storage type
apiVersion: v1
kind: PersistentVolume
metadata:
  name: [OBJECT NAME]
spec:
  storageClassName: [STORAGE CLASS NAME]
  capacity:
    storage: [SIZE WITH UNIT, e.g. 2Gi]
  accessModes:
    - [ACCESS MODE; ReadWriteOnce/ReadOnlyMany/ReadWriteMany]
  [PHYSICAL STORAGE TYPE]:
    [PHYSICAL STORAGE PARAMETERS]

Step 2: Define the Kubernetes Persistent Volume Claim Object
Once the volume objects are ready, we need to describe the type of volume our application needs. For this we create a PVC object. The PVC object describes the StorageClassName, AccessMode and Capacity needed for our application. Kubernetes then searches for a PV object that matches this claim and binds it to the claim.

A point to note here is that the StorageClassName has to match exactly and the PV has to offer the requested AccessMode, but the Capacity can be equal to or greater than the request. So, if a 1Gi object isn’t available, Kubernetes looks for the next larger object (≥ 1Gi).

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: [OBJECT NAME]
  labels:
    type: local
spec:
  storageClassName: [STORAGE CLASS YOU NEED]
  accessModes:
    - [ACCESS MODE YOU NEED]
  resources:
    requests:
      storage: [SIZE YOU NEED WITH UNIT, e.g. 1Gi]
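
A claim can narrow the search further with a label selector. This is a minimal sketch, assuming the PVs you want carry the label `type: local` and use a class named `manual`; the claim name is made up.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: labelled-claim        # hypothetical name
spec:
  storageClassName: manual    # assumed class name
  selector:
    matchLabels:
      type: local             # only PVs carrying this label are considered
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi            # any matching PV of 1Gi or more can satisfy this
```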

Step 3: Use the Volume Claim object to create a Kubernetes Volume object in the Pod and mount it into the containers
As mentioned before, only Kubernetes Volume objects can be mounted inside a container. So,

  • first, you define a Kubernetes Volume in the Pod backed by the PVC
  • then, you mount it inside the container as usual

Remember, the rules for Kubernetes Volumes apply here too — you can mount this same Kubernetes Volume object inside multiple containers in the Pod.

apiVersion: v1
kind: Pod
metadata:
  name: [POD NAME]
spec:
  volumes:
    - name: [K8s VOL OBJ NAME]
      persistentVolumeClaim:
        claimName: [PVC OBJ NAME]
    - ....
    - ....
  containers:
    - name: [CONTAINER NAME]
      image: [CONTAINER IMAGE]
      volumeMounts:
        - name: [K8s VOL OBJ NAME]
          mountPath: [PATH INSIDE THE CONTAINER TO MOUNT THE VOL]
    - ...
    - ...
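
Filled in, the template could look like the sketch below, where two containers in one Pod mount the same PVC-backed volume and one of them mounts it read-only. The claim name `my-claim` and the other names are assumptions; the claim would have to exist and be bound already.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-claim-demo        # hypothetical name
spec:
  volumes:
    - name: site-data
      persistentVolumeClaim:
        claimName: my-claim      # assumed existing, bound PVC
  containers:
    - name: writer
      image: debian
      command: ["/bin/sh", "-c", "date > /out/index.html; sleep 3600"]
      volumeMounts:
        - name: site-data
          mountPath: /out
    - name: web
      image: nginx
      volumeMounts:
        - name: site-data
          mountPath: /usr/share/nginx/html
          readOnly: true         # this container can read but not modify the files
```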

Example:

  • A PV object named pv-volume is created. The physical storage solution is a folder on the host (/mnt/data), hence the access mode is ReadWriteOnce (i.e., only 1 node can mount it read-write)
  • A PVC object named pv-volume-claim is created. Its capacity requirement (1Gi) is smaller than the capacity of the available PV object (2Gi), hence a successful match will happen.
  • The same PVC is connected to 2 pods — task-pv-pod and task-pv-pod2 — via Kubernetes Volume objects defined in the Pod spec (called pv-storage and pv-storage2 respectively)
  • This Kubernetes volume object is then mounted inside the container.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-volume
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pv-volume-claim
  labels:
    type: local
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: task-pv-pod
spec:
  volumes:
    - name: pv-storage
      persistentVolumeClaim:
        claimName: pv-volume-claim
  containers:
    - name: pv-container
      image: nginx
      ports:
        - containerPort: 80
      volumeMounts:
        - name: pv-storage
          mountPath: "/usr/share/nginx/html"
---
apiVersion: v1
kind: Pod
metadata:
  name: task-pv-pod2
spec:
  volumes:
    - name: pv-storage2
      persistentVolumeClaim:
        claimName: pv-volume-claim
  containers:
    - name: pv-container2
      image: nginx
      ports:
        - containerPort: 80
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: pv-storage2
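
Since the PV in this example is a folder on a node, the two Pods only see the same files if they land on the same node. A minimal sketch of one way to pin a Pod to a node follows, assuming a node whose hostname label is `node-1`; both that node name and the Pod name are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: task-pv-pod-pinned             # hypothetical name
spec:
  nodeSelector:
    kubernetes.io/hostname: node-1     # assumed node name; schedule only on this node
  volumes:
    - name: pv-storage
      persistentVolumeClaim:
        claimName: pv-volume-claim
  containers:
    - name: pv-container
      image: nginx
      volumeMounts:
        - name: pv-storage
          mountPath: "/usr/share/nginx/html"
```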

Putting it all together!

The easiest way to plan volumes in Kubernetes is by following these steps:

  • Why do you need the volume in the application?
  • How do you want the actual data to persist?
  • Which Physical Storage solution will match the above requirements?
  • Create a cluster object backed by the physical storage
  • Mount the object in the containers of the Pod

Corporate Trainer for DevOps/Docker/Kubernetes/Ansible/Splunk/Jenkins/Chef/Puppet/AWS