Kubernetes Storage
category: DevOps
tags: kubernetes, k8s, storage, volumes, persistentvolume, pvc, storageclass
Volume
What it is: A directory accessible to containers in a pod, providing storage that can persist beyond individual container lifecycles and be shared between containers.
Why it matters: Containers are stateless and ephemeral by default. Volumes provide data persistence, enable data sharing between containers in a pod, and allow applications to maintain state across container restarts.
Volume vs Container filesystem:
- Container filesystem - Ephemeral, lost when container dies
- Volume - Can persist beyond container lifecycle
- Shared access - Multiple containers in pod can access same volume
- External storage - Can connect to external storage systems
Volume types:
- emptyDir - Temporary storage, lifecycle tied to pod
- hostPath - Mount directory from host node
- configMap/secret - Mount configuration data as files
- persistentVolumeClaim - Request for persistent storage
- nfs, cephfs, glusterfs - Network storage systems
- cloud volumes - AWS EBS, GCP PD, Azure Disk
Common commands:
# Volume inspection
kubectl describe pod <pod-name> # Show pod's volumes
kubectl get pv # List persistent volumes
kubectl get pvc # List persistent volume claims
# Troubleshooting volumes
kubectl exec <pod-name> -- df -h # Check mounted filesystems
kubectl exec <pod-name> -- ls -la /path/to/volume # List volume contents
kubectl logs <pod-name> -c <container-name> # Check for mount errors
emptyDir Volumes
What it is: Temporary storage that exists for the lifetime of a pod, shared between all containers in the pod.
Characteristics:
- Pod lifecycle - Created when pod starts, deleted when pod terminates
- Shared storage - All containers in pod can access
- Node storage - Uses node's local storage (disk or memory)
- No persistence - Data lost when pod is deleted
Use cases:
- Temporary files - Scratch space for computations
- Shared data - Communication between containers in same pod
- Cache - Temporary caching that doesn't need persistence
- Logs - Temporary log storage before shipping elsewhere
Example:
apiVersion: v1
kind: Pod
metadata:
name: web-with-cache
spec:
containers:
- name: web-server
image: nginx
volumeMounts:
- name: cache-volume
mountPath: /var/cache/nginx
- name: cache-warmer
image: busybox
command: ['sh', '-c', 'while true; do echo "warming cache" > /cache/warm; sleep 3600; done']
volumeMounts:
- name: cache-volume
mountPath: /cache
volumes:
- name: cache-volume
emptyDir: {}
emptyDir with memory storage:
volumes:
- name: memory-volume
emptyDir:
medium: Memory # Use RAM instead of disk
sizeLimit: 1Gi # Limit size to 1GB
hostPath Volumes
What it is: Mount a file or directory from the host node's filesystem into the pod.
Use cases:
- Node monitoring - Access to /proc, /sys for system monitoring
- Docker socket - Access Docker daemon from container
- Log collection - Access host log directories
- Development - Mount source code during development
Security considerations:
- Host access - Container can access host filesystem
- Privilege escalation - Potential security risk
- Node dependency - Pod tied to specific node
- Not portable - Doesn't work across different nodes
Example:
apiVersion: v1
kind: Pod
metadata:
name: host-access-pod
spec:
containers:
- name: monitor
image: busybox
command: ['sh', '-c', 'while true; do df -h /host-root; sleep 30; done']
volumeMounts:
- name: host-root
mountPath: /host-root
readOnly: true
volumes:
- name: host-root
hostPath:
path: /
type: Directory
hostPath types:
- DirectoryOrCreate - Create directory if it doesn't exist
- Directory - Directory must exist
- FileOrCreate - Create file if it doesn't exist
- File - File must exist
- Socket - Unix socket must exist
- CharDevice - Character device must exist
- BlockDevice - Block device must exist
configMap and secret Volumes
What they are: Volumes that mount ConfigMaps and Secrets as files in the container filesystem.
Use cases:
- Configuration files - App configs, nginx.conf, etc.
- Environment-specific settings - Different configs per environment
- Certificates - SSL certificates and keys
- API keys - Sensitive configuration data
ConfigMap volume example:
apiVersion: v1
kind: ConfigMap
metadata:
name: app-config
data:
database.conf: |
host=localhost
port=5432
database=myapp
cache.conf: |
redis_host=redis-service
redis_port=6379
---
apiVersion: v1
kind: Pod
metadata:
name: app-with-config
spec:
containers:
- name: app
image: myapp
volumeMounts:
- name: config-volume
mountPath: /etc/config
volumes:
- name: config-volume
configMap:
name: app-config
Secret volume example:
apiVersion: v1
kind: Secret
metadata:
name: app-secrets
type: Opaque
data:
username: YWRtaW4= # base64 encoded
password: MWYyZDFlMmU2N2Rm # base64 encoded
---
apiVersion: v1
kind: Pod
metadata:
name: app-with-secrets
spec:
containers:
- name: app
image: myapp
volumeMounts:
- name: secret-volume
mountPath: /etc/secrets
readOnly: true
volumes:
- name: secret-volume
secret:
secretName: app-secrets
defaultMode: 0400 # Read-only for owner only
PersistentVolume (PV)
What it is: A piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes.
Why it matters: PersistentVolumes provide durable storage that exists independently of pods. They enable stateful applications, data persistence across pod restarts, and centralized storage management in Kubernetes clusters.
PV characteristics:
- Cluster resource - Independent of any individual pod
- Lifecycle independent - Exists beyond pod lifecycle
- Admin provisioned - Usually created by cluster administrators
- Capacity and access modes - Define storage size and access patterns
- Reclaim policies - What happens when PV is released
Access modes:
- ReadWriteOnce (RWO) - Volume can be mounted read-write by single node
- ReadOnlyMany (ROX) - Volume can be mounted read-only by many nodes
- ReadWriteMany (RWX) - Volume can be mounted read-write by many nodes
- ReadWriteOncePod (RWOP) - Volume can be mounted read-write by single pod
Reclaim policies:
- Retain - Manual reclamation of the resource
- Recycle - Basic scrub (rm -rf /thevolume/) - deprecated
- Delete* - Delete the volume from infrastructure
Common commands:
# PersistentVolume operations
kubectl get pv # List all persistent volumes
kubectl describe pv <pv-name> # Detailed PV information
kubectl delete pv <pv-name> # Delete persistent volume
# PV status and troubleshooting
kubectl get pv -o wide # PV with additional info
kubectl get events --field-selector involvedObject.kind=PersistentVolume
PV lifecycle states:
- Available - Free resource not yet bound to claim
- Bound - Volume is bound to a claim
- Released - Claim has been deleted but resource not reclaimed
- Failed - Volume has failed its automatic reclamation
Example PersistentVolume (NFS):
apiVersion: v1
kind: PersistentVolume
metadata:
name: nfs-pv
labels:
type: nfs
spec:
capacity:
storage: 10Gi
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Retain
storageClassName: slow
nfs:
server: 192.168.1.100
path: /exported/path
Example PersistentVolume (AWS EBS):
apiVersion: v1
kind: PersistentVolume
metadata:
name: ebs-pv
spec:
capacity:
storage: 20Gi
accessModes:
- ReadWriteOnce
persistentVolumeReclaimPolicy: Delete
storageClassName: gp2
awsElasticBlockStore:
volumeID: vol-12345678
fsType: ext4
Local PersistentVolume:
apiVersion: v1
kind: PersistentVolume
metadata:
name: local-pv
spec:
capacity:
storage: 100Gi
accessModes:
- ReadWriteOnce
persistentVolumeReclaimPolicy: Delete
storageClassName: local-storage
local:
path: /mnt/disks/ssd1
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- worker-node-1
PersistentVolumeClaim (PVC)
What it is: A request for storage by a user, similar to how pods consume node resources, PVCs consume PV resources.
Why it matters: PVCs provide an abstraction layer between applications and storage. Applications request storage through PVCs without needing to know the underlying storage implementation details.
PVC workflow:
1. User creates PVC - Specifies storage requirements
2. Kubernetes finds matching PV - Based on size, access mode, storage class
3. Binding occurs - PVC is bound to suitable PV
4. Pod uses PVC - Mounts the bound storage
5. Release - When PVC is deleted, PV follows reclaim policy
PVC specifications:
- Storage size - Amount of storage requested
- Access modes - How the storage will be accessed
- Storage class - Type of storage required
- Label selectors - Additional criteria for matching PVs
Common commands:
# PVC operations
kubectl get pvc # List all PVCs
kubectl describe pvc <pvc-name> # Detailed PVC information
kubectl delete pvc <pvc-name> # Delete PVC
# PVC troubleshooting
kubectl get pvc -o wide # PVC with additional info
kubectl get events --field-selector involvedObject.kind=PersistentVolumeClaim
PVC states:
- Pending - PVC is waiting to be bound to a PV
- Bound - PVC is bound to a PV
- Lost - PV backing the PVC is lost
Example PVC:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: database-pvc
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 20Gi
storageClassName: fast-ssd
selector:
matchLabels:
environment: production
Using PVC in Pod:
apiVersion: v1
kind: Pod
metadata:
name: database-pod
spec:
containers:
- name: database
image: postgres:13
env:
- name: POSTGRES_DB
value: myapp
- name: POSTGRES_USER
value: admin
- name: POSTGRES_PASSWORD
value: secretpassword
volumeMounts:
- name: database-storage
mountPath: /var/lib/postgresql/data
volumes:
- name: database-storage
persistentVolumeClaim:
claimName: database-pvc
PVC in StatefulSet:
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: database
spec:
serviceName: database
replicas: 3
selector:
matchLabels:
app: database
template:
metadata:
labels:
app: database
spec:
containers:
- name: postgres
image: postgres:13
volumeMounts:
- name: data
mountPath: /var/lib/postgresql/data
volumeClaimTemplates:
- metadata:
name: data
spec:
accessModes: ["ReadWriteOnce"]
resources:
requests:
storage: 10Gi
storageClassName: fast-ssd
StorageClass
What it is: A way to describe different "classes" of storage available in a cluster, enabling dynamic provisioning of PersistentVolumes.
Why it matters: StorageClasses eliminate the need for administrators to pre-provision PersistentVolumes. They enable on-demand storage provisioning with different performance characteristics, backup policies, and other parameters.
Key features:
- Dynamic provisioning - Automatically create PVs when PVCs are created
- Storage types - Define different tiers (fast SSD, slow HDD, etc.)
- Parameters - Configure storage-specific settings
- Provisioners - Backend storage systems that create volumes
- Reclaim policies - Default behavior when PVCs are deleted
Common provisioners:
- kubernetes.io/aws-ebs - Amazon Elastic Block Store
- kubernetes.io/gce-pd - Google Compute Engine Persistent Disk
- kubernetes.io/azure-disk - Azure Managed Disk
- kubernetes.io/cinder - OpenStack Cinder
- kubernetes.io/vsphere-volume - vSphere
- kubernetes.io/no-provisioner - Local volumes
Common commands:
# StorageClass operations
kubectl get storageclass # List all storage classes
kubectl get sc # Short form
kubectl describe storageclass <sc-name> # Detailed SC information
kubectl delete storageclass <sc-name> # Delete storage class
# Set default storage class
kubectl patch storageclass <sc-name> -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
Example StorageClass (AWS EBS):
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: fast-ssd
annotations:
storageclass.kubernetes.io/is-default-class: "false"
provisioner: kubernetes.io/aws-ebs
parameters:
type: gp3
fsType: ext4
encrypted: "true"
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
Example StorageClass (GCP Persistent Disk):
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: regional-ssd
provisioner: kubernetes.io/gce-pd
parameters:
type: pd-ssd
replication-type: regional-pd
zones: us-central1-a,us-central1-b
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
Example StorageClass (Local storage):
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
StorageClass parameters by provider:
AWS EBS parameters:
- type - gp2, gp3, io1, io2, sc1, st1
- fsType - ext4, xfs
- encrypted - true/false
- kmsKeyId - KMS key for encryption
GCP PD parameters:
- type - pd-standard, pd-ssd, pd-balanced
- replication-type - none, regional-pd
- zones - Comma-separated zone list
Azure Disk parameters:
- skuName - Standard_LRS, Premium_LRS, StandardSSD_LRS
- location - Azure region
- storageAccount - Storage account name
Volume binding modes:
- Immediate - PV created immediately when PVC is created
- WaitForFirstConsumer - Wait until pod using PVC is scheduled
Dynamic provisioning workflow:
1. PVC created - User creates PVC with storageClassName
2. StorageClass matched - Kubernetes finds matching StorageClass
3. Provisioner called - Storage provisioner creates actual volume
4. PV created - Kubernetes creates PV representing the volume
5. Binding - PVC is bound to the newly created PV
6. Pod mount - Pod can now use the storage
When you'll use it: Any time you need persistent storage in Kubernetes - databases, file storage, application data, logs, or any stateful workload requiring data persistence.