CKA road trip - Kubernetes PV, PVC and StorageClass

If volumes and volumeMounts are about how containers access storage, then PersistentVolumes, PersistentVolumeClaims and StorageClasses are about where that storage actually comes from.


The Chain

pod → PVC → PV → physical storage on node

The pod only knows about the PVC. The PVC only knows about the PV. The PV knows where the actual disk is. Each layer is deliberately decoupled so you can swap out the underlying storage without touching anything above it.


PersistentVolume (PV)

The PV represents actual storage. It's a cluster-level resource that describes a piece of storage: how big it is, what access modes it supports, and where it physically lives.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: fast-pv-cka
spec:
  capacity:
    storage: 50Mi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle   # Recycle is deprecated; a hand-made PV follows this field, not the StorageClass reclaimPolicy
  storageClassName: fast-storage
  hostPath:
    path: /tmp/fast-data

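hostPath means the storage lives at /tmp/fast-data on the node itself. The fast-storage class uses kubernetes.io/no-provisioner (see StorageClass below), so Kubernetes will not provision anything for this PV. You create the PV yourself, and you should make sure the directory exists on the node before a pod uses it.

The PV has 50Mi available. A PVC can bind to it as long as it requests 50Mi or less. Binding is one-to-one, so the claim gets the whole PV even if it asked for less.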
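
If you'd rather have Kubernetes check the path than rely on remembering to create it, hostPath accepts an optional type field. Here is a minimal sketch of the same PV, assuming you want the mount to fail when /tmp/fast-data is missing. The type value is my addition; everything else is copied from the manifest above.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: fast-pv-cka
spec:
  capacity:
    storage: 50Mi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  storageClassName: fast-storage
  hostPath:
    path: /tmp/fast-data
    type: Directory   # assumption: fail the mount if the directory does not exist
```

Swapping Directory for DirectoryOrCreate should instead create an empty directory at that path when it is missing.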


PersistentVolumeClaim (PVC)

The PVC is a request for storage. It says "I need X amount of storage with these access modes from this storage class." Kubernetes matches it to a suitable PV.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fast-pvc-cka
  namespace: default
spec:
  storageClassName: fast-storage
  accessModes:
    - ReadWriteOnce
  volumeName: fast-pv-cka
  resources:
    requests:
      storage: 30Mi

The PVC requests 30Mi and the PV offers 50Mi, so the request fits: the PV only needs to offer at least as much as the claim asks for. The volumeName field ties this PVC to fast-pv-cka specifically rather than letting Kubernetes pick any available PV.
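
For comparison, here is a sketch of a claim that leaves the choice to Kubernetes by omitting volumeName. The name fast-pvc-auto is hypothetical; the rest mirrors the claim above.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fast-pvc-auto          # hypothetical name, not part of the original exercise
  namespace: default
spec:
  storageClassName: fast-storage   # only PVs with this class are candidates
  accessModes:
    - ReadWriteOnce                # the PV must support this mode
  resources:
    requests:
      storage: 30Mi                # the PV must offer at least this much
  # no volumeName: any Available PV that satisfies the three fields above can be bound
```

Because a PV binds to only one claim, this one would bind only if fast-pv-cka (or another matching PV) is still Available; otherwise it stays Pending.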


StorageClass

The StorageClass defines how storage gets provisioned. It's the glue between PVCs and PVs.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-storage
  annotations:
    storageclass.kubernetes.io/is-default-class: "false"
provisioner: kubernetes.io/no-provisioner
reclaimPolicy: Retain
allowVolumeExpansion: true
volumeBindingMode: Immediate

Two important fields here:

provisioner: kubernetes.io/no-provisioner -> Kubernetes will not create storage dynamically for this class. The PV, and the storage behind it, must already exist; you're just telling Kubernetes to use it.

volumeBindingMode: Immediate -> the PVC binds to the PV the moment the PVC is created, without waiting for a pod to use it. The alternative is WaitForFirstConsumer, which delays binding until a pod actually needs it.
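
To see the other binding mode side by side, here is a sketch of a second class that differs only in volumeBindingMode. The name fast-storage-wffc is hypothetical.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-storage-wffc              # hypothetical name for illustration
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer   # binding waits for a pod that uses the claim
```

A PVC that names this class should sit in Pending until a pod referencing it is created, which is expected behavior rather than an error.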


The Pod

This is where I often get it wrong. The pod references the PVC, not the PV, and not the hostPath directly.

Wrong:

volumes:
- name: fast-pvc-cka
  hostPath:
    path: /app/data        # bypasses the PVC entirely

Correct:

volumes:
- name: fast-pvc-cka
  persistentVolumeClaim:
    claimName: fast-pvc-cka   # references the PVC

The full pod spec:

apiVersion: v1
kind: Pod
metadata:
  name: fast-pod-cka
spec:
  containers:
  - image: nginx:latest
    name: fast-pod-cka
    volumeMounts:
    - name: fast-pvc-cka
      mountPath: /app/data     # container sees the storage here
  volumes:
  - name: fast-pvc-cka
    persistentVolumeClaim:
      claimName: fast-pvc-cka  # points to the PVC

The container sees the storage at /app/data. Under the hood, that maps through PVC → PV → /tmp/fast-data on the node. The container doesn't need to know any of that.
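
One way to convince yourself of that mapping is a throwaway pod that writes a file through the claim. This is a sketch under assumptions: the pod name, container name and busybox image are mine, and it expects to run on the same node as the PV's hostPath.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: fast-writer-cka            # hypothetical name
spec:
  containers:
  - name: writer
    image: busybox:1.36
    # write one file into the mounted volume, then stay alive for inspection
    command: ["sh", "-c", "echo hello > /app/data/hello.txt && sleep 3600"]
    volumeMounts:
    - name: fast-pvc-cka
      mountPath: /app/data
  volumes:
  - name: fast-pvc-cka
    persistentVolumeClaim:
      claimName: fast-pvc-cka      # same claim as above
```

If the chain is wired correctly, hello.txt should show up under /tmp/fast-data on the node where the pod runs.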


Why the Decoupling?

The pod uses persistentVolumeClaim instead of hostPath directly because:

  • The pod shouldn't care where storage physically lives
  • You can swap the underlying storage (different node, different disk, cloud storage) without changing the pod
  • Access control and capacity management happen at the PV/PVC layer, not the pod layer

Think of it like a power socket. Your laptop (pod) plugs into the socket (PVC). The socket connects to the grid (PV). The grid pulls power from a source (hostPath/cloud disk/etc). Your laptop doesn't need to know anything about power stations.
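
As a sketch of what swapping the underlying storage could look like, here is a PV with the same name, size and class, backed by NFS instead of hostPath. The server and export path are hypothetical, and Retain is my choice since Recycle is deprecated.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: fast-pv-cka
spec:
  capacity:
    storage: 50Mi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: fast-storage
  nfs:
    server: nfs.example.internal   # hypothetical NFS server
    path: /exports/fast-data       # hypothetical export path
```

The pod spec and the PVC manifest stay exactly as they are. The backing source of an existing PV can't be edited in place, so in practice this means deleting and recreating the PV and re-binding the claim.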


Quick Reference

| Resource     | Level     | What it does                                   |
|--------------|-----------|------------------------------------------------|
| PV           | Cluster   | Describes actual storage and where it lives    |
| PVC          | Namespace | Requests storage from a PV                     |
| StorageClass | Cluster   | Defines how storage is provisioned and bound   |
| volume       | Pod       | Names the PVC for use inside the pod           |
| volumeMount  | Container | Maps the volume to a path inside the container |

The One-Line Summary

PV = the actual disk. PVC = the request for that disk. StorageClass = the rules for how binding happens. Pod = just references the PVC and doesn't care about anything below it.

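Side note that trips me up all the time:

The SC is just a set of rules: it defines how provisioning and binding work (no-provisioner, Immediate, Retain, etc.). Its reclaimPolicy only applies to PVs the class provisions dynamically; a hand-made PV like fast-pv-cka follows its own persistentVolumeReclaimPolicy.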

Both the PVC and PV reference the SC by name (storageClassName: fast-storage), and that's how Kubernetes knows they belong together. The SC is the matchmaker, not the storage.

The actual connection to node storage is:

PV → hostPath → /tmp/fast-data on node

SC never touches the disk directly.
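
The matchmaker role is easiest to see when the names don't line up. Here is a sketch of a claim that asks for a different class; both slow-pvc-cka and slow-storage are hypothetical names.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: slow-pvc-cka               # hypothetical name
  namespace: default
spec:
  storageClassName: slow-storage   # hypothetical class; does not match fast-pv-cka
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 30Mi
```

Size and access mode both fit fast-pv-cka, but the class name differs, so this claim should stay Pending as long as no PV with storageClassName: slow-storage exists and nothing provisions one.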