Kubernetes Storage

category: DevOps
tags: kubernetes, k8s, storage, volumes, persistentvolume, pvc, storageclass

Volume

What it is: A directory accessible to containers in a pod, providing storage that can persist beyond individual container lifecycles and be shared between containers.

Why it matters: Container filesystems are ephemeral by default, so anything a container writes to its own filesystem is lost when it restarts. Volumes provide data persistence, enable data sharing between containers in a pod, and allow applications to maintain state across container restarts.

Volume vs Container filesystem:
- Container filesystem - Ephemeral, lost when container dies
- Volume - Can persist beyond container lifecycle
- Shared access - Multiple containers in pod can access same volume
- External storage - Can connect to external storage systems

Volume types:
- emptyDir - Temporary storage, lifecycle tied to pod
- hostPath - Mount directory from host node
- configMap/secret - Mount configuration data as files
- persistentVolumeClaim - Request for persistent storage
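- nfs, cephfs, glusterfs - Network storage systems (the in-tree glusterfs plugin was removed in Kubernetes 1.26; check which plugins your cluster version still supports)
- cloud volumes - AWS EBS, GCP PD, Azure Disk (normally provided through CSI drivers in current clusters)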

Common commands:

# Volume inspection
kubectl describe pod <pod-name>                     # Show pod's volumes
kubectl get pv                                     # List persistent volumes
kubectl get pvc                                    # List persistent volume claims

# Troubleshooting volumes
kubectl exec <pod-name> -- df -h                   # Check mounted filesystems
kubectl exec <pod-name> -- ls -la /path/to/volume  # List volume contents
kubectl logs <pod-name> -c <container-name>        # Container logs (mount failures usually appear in pod events; see describe pod)

emptyDir Volumes

What it is: Temporary storage that exists for the lifetime of a pod, shared between all containers in the pod.

Characteristics:
- Pod lifecycle - Created when the pod is assigned to a node, deleted when the pod is removed from that node; survives container restarts
- Shared storage - All containers in pod can access
- Node storage - Uses node's local storage (disk or memory)
- No persistence - Data lost when pod is deleted

Use cases:
- Temporary files - Scratch space for computations
- Shared data - Communication between containers in same pod
- Cache - Temporary caching that doesn't need persistence
- Logs - Temporary log storage before shipping elsewhere

Example:

apiVersion: v1
kind: Pod
metadata:
  name: web-with-cache
spec:
  containers:
  - name: web-server
    image: nginx
    volumeMounts:
    - name: cache-volume
      mountPath: /var/cache/nginx
  - name: cache-warmer
    image: busybox
    command: ['sh', '-c', 'while true; do echo "warming cache" > /cache/warm; sleep 3600; done']
    volumeMounts:
    - name: cache-volume
      mountPath: /cache
  volumes:
  - name: cache-volume
    emptyDir: {}

emptyDir with memory storage:

volumes:
- name: memory-volume
  emptyDir:
    medium: Memory      # Use RAM (tmpfs) instead of disk; usage counts toward container memory limits
    sizeLimit: 1Gi      # Limit size to 1 GiB

hostPath Volumes

What it is: Mount a file or directory from the host node's filesystem into the pod.

Use cases:
- Node monitoring - Access to /proc, /sys for system monitoring
- Docker socket - Access Docker daemon from container
- Log collection - Access host log directories
- Development - Mount source code during development

Security considerations:
- Host access - Container can access host filesystem
- Privilege escalation - Potential security risk
- Node dependency - Pod tied to specific node
- Not portable - Doesn't work across different nodes

Example:

apiVersion: v1
kind: Pod
metadata:
  name: host-access-pod
spec:
  containers:
  - name: monitor
    image: busybox
    command: ['sh', '-c', 'while true; do df -h /host-root; sleep 30; done']
    volumeMounts:
    - name: host-root
      mountPath: /host-root
      readOnly: true
  volumes:
  - name: host-root
    hostPath:
      path: /
      type: Directory

hostPath types:
- DirectoryOrCreate - Create directory if it doesn't exist
- Directory - Directory must exist
- FileOrCreate - Create file if it doesn't exist
- File - File must exist
- Socket - Unix socket must exist
- CharDevice - Character device must exist
- BlockDevice - Block device must exist
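
The type field acts as a check performed before the mount. As a minimal sketch, assuming a node that runs Docker and exposes its socket at /var/run/docker.sock (the pod name and the busybox command are illustrative and not part of the original notes), a Socket-typed hostPath keeps the pod from starting if no Unix socket exists at that path:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: docker-socket-check        # hypothetical name
spec:
  containers:
  - name: inspector
    image: busybox
    # Lists the socket once, then idles so the pod stays running
    command: ['sh', '-c', 'ls -l /var/run/docker.sock; sleep 3600']
    volumeMounts:
    - name: docker-sock
      mountPath: /var/run/docker.sock
  volumes:
  - name: docker-sock
    hostPath:
      path: /var/run/docker.sock   # assumes Docker is installed on the node
      type: Socket                 # a Unix socket must already exist at this path
```

Mounting the Docker socket gives the container control over the node's Docker daemon, so the security considerations listed earlier apply in full.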

configMap and secret Volumes

What they are: Volumes that mount ConfigMaps and Secrets as files in the container filesystem.

Use cases:
- Configuration files - App configs, nginx.conf, etc.
- Environment-specific settings - Different configs per environment
- Certificates - SSL certificates and keys
- API keys - Sensitive configuration data

ConfigMap volume example:

apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  database.conf: |
    host=localhost
    port=5432
    database=myapp
  cache.conf: |
    redis_host=redis-service
    redis_port=6379
---
apiVersion: v1
kind: Pod
metadata:
  name: app-with-config
spec:
  containers:
  - name: app
    image: myapp
    volumeMounts:
    - name: config-volume
      mountPath: /etc/config
  volumes:
  - name: config-volume
    configMap:
      name: app-config
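
By default every key in the ConfigMap becomes a file under the mount path (here /etc/config/database.conf and /etc/config/cache.conf). The following sketch reuses the app-config ConfigMap above, with a hypothetical pod name, and uses the items field to project only one key to a chosen relative path:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-single-config     # hypothetical name
spec:
  containers:
  - name: app
    image: myapp
    volumeMounts:
    - name: config-volume
      mountPath: /etc/config
      readOnly: true
  volumes:
  - name: config-volume
    configMap:
      name: app-config
      items:
      - key: database.conf         # key defined in the ConfigMap
        path: db/database.conf     # file appears at /etc/config/db/database.conf
```

When items is set, keys that are not listed (cache.conf in this case) are not mounted.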

Secret volume example:

apiVersion: v1
kind: Secret
metadata:
  name: app-secrets
type: Opaque
data:
  username: YWRtaW4=  # base64 encoded
  password: MWYyZDFlMmU2N2Rm  # base64 encoded
---
apiVersion: v1
kind: Pod
metadata:
  name: app-with-secrets
spec:
  containers:
  - name: app
    image: myapp
    volumeMounts:
    - name: secret-volume
      mountPath: /etc/secrets
      readOnly: true
  volumes:
  - name: secret-volume
    secret:
      secretName: app-secrets
      defaultMode: 0400  # Read-only for owner only
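
The values under data must be base64 encoded by hand, and base64 is an encoding, not encryption. As an alternative sketch, the same Secret can be written with the stringData field, which accepts plain text and lets the API server do the encoding. The plain-text values below are simply the decoded forms of the two strings in the example above:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: app-secrets
type: Opaque
stringData:
  username: "admin"                # stored as YWRtaW4=
  password: "1f2d1e2e67df"         # stored as MWYyZDFlMmU2N2Rm
```

Either form produces the same files under /etc/secrets when mounted. A manifest written this way contains the credentials in clear text, so it needs the same care as the credentials themselves.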

PersistentVolume (PV)

What it is: A piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes.

Why it matters: PersistentVolumes provide durable storage that exists independently of pods. They enable stateful applications, data persistence across pod restarts, and centralized storage management in Kubernetes clusters.

PV characteristics:
- Cluster resource - Independent of any individual pod
- Lifecycle independent - Exists beyond pod lifecycle
- Provisioning - Created statically by cluster administrators or dynamically through a StorageClass
- Capacity and access modes - Define storage size and access patterns
- Reclaim policies - What happens when PV is released

Access modes:
- ReadWriteOnce (RWO) - Volume can be mounted read-write by single node
- ReadOnlyMany (ROX) - Volume can be mounted read-only by many nodes
- ReadWriteMany (RWX) - Volume can be mounted read-write by many nodes
- ReadWriteOncePod (RWOP) - Volume can be mounted read-write by single pod
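
The access mode is requested on the claim. Below is a hedged sketch of a claim that asks for ReadWriteOncePod. The claim name and the csi-ssd class name are hypothetical, and the sketch assumes a cluster on Kubernetes 1.22 or later whose class is backed by a CSI driver, since this mode is supported only for CSI volumes:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: single-writer-pvc          # hypothetical name
spec:
  accessModes:
    - ReadWriteOncePod             # only one pod in the whole cluster may use the claim
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-ssd        # hypothetical class, assumed to use a CSI driver
```

With ReadWriteOnce, by contrast, several pods can share the volume as long as they run on the same node.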

Reclaim policies:
- Retain - Manual reclamation of the resource
- Recycle - Basic scrub (rm -rf /thevolume/*) - deprecated
- Delete - Delete the volume from the underlying infrastructure

Common commands:

# PersistentVolume operations
kubectl get pv                                     # List all persistent volumes
kubectl describe pv <pv-name>                      # Detailed PV information
kubectl delete pv <pv-name>                        # Delete persistent volume

# PV status and troubleshooting
kubectl get pv -o wide                             # PV with additional info
kubectl get events --field-selector involvedObject.kind=PersistentVolume

PV lifecycle states:
- Available - Free resource not yet bound to claim
- Bound - Volume is bound to a claim
- Released - Claim has been deleted but resource not reclaimed
- Failed - Volume has failed its automatic reclamation

Example PersistentVolume (NFS):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv
  labels:
    type: nfs
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: slow
  nfs:
    server: 192.168.1.100
    path: /exported/path

Example PersistentVolume (AWS EBS, using the legacy in-tree awsElasticBlockStore source; current clusters rely on the EBS CSI driver):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: ebs-pv
spec:
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: gp2
  awsElasticBlockStore:
    volumeID: vol-12345678
    fsType: ext4

Local PersistentVolume:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv
spec:
  capacity:
    storage: 100Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: local-storage
  local:
    path: /mnt/disks/ssd1
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - worker-node-1

PersistentVolumeClaim (PVC)

What it is: A request for storage by a user. Just as pods consume node resources, PVCs consume PV resources.

Why it matters: PVCs provide an abstraction layer between applications and storage. Applications request storage through PVCs without needing to know the underlying storage implementation details.

PVC workflow:
1. User creates PVC - Specifies storage requirements
2. Kubernetes finds matching PV - Based on size, access mode, storage class
3. Binding occurs - PVC is bound to suitable PV
4. Pod uses PVC - Mounts the bound storage
5. Release - When PVC is deleted, PV follows reclaim policy
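
Steps 1 to 3 can be sketched with a statically provisioned pair. All names below are hypothetical, and the hostPath backing is an assumption suitable only for a single-node test cluster. The claim binds because the PV offers the requested access mode, has at least the requested capacity, carries the same storageClassName, and matches the label selector:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: manual-pv                  # hypothetical name
  labels:
    environment: staging
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual         # only needs to match the claim; no StorageClass object required
  hostPath:
    path: /mnt/data                # single-node test clusters only
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: manual-pvc                 # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce                # must be offered by the PV
  resources:
    requests:
      storage: 3Gi                 # PV capacity (5Gi) must be at least this much
  storageClassName: manual         # must equal the PV's storageClassName
  selector:
    matchLabels:
      environment: staging         # optional extra filter on PV labels
```

Binding is one-to-one, so the claim receives the whole 5Gi volume even though it asked for 3Gi.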

PVC specifications:
- Storage size - Amount of storage requested
- Access modes - How the storage will be accessed
- Storage class - Type of storage required
- Label selectors - Additional criteria for matching PVs

Common commands:

# PVC operations
kubectl get pvc                                    # List all PVCs
kubectl describe pvc <pvc-name>                    # Detailed PVC information
kubectl delete pvc <pvc-name>                     # Delete PVC

# PVC troubleshooting
kubectl get pvc -o wide                           # PVC with additional info
kubectl get events --field-selector involvedObject.kind=PersistentVolumeClaim

PVC states:
- Pending - PVC is waiting to be bound to a PV
- Bound - PVC is bound to a PV
- Lost - PV backing the PVC is lost

Example PVC:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
  storageClassName: fast-ssd
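  selector:                        # A non-empty selector prevents dynamic provisioning,
    matchLabels:                   # so this claim binds only to a pre-created PV with this label
      environment: production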

Using PVC in Pod:

apiVersion: v1
kind: Pod
metadata:
  name: database-pod
spec:
  containers:
  - name: database
    image: postgres:13
    env:
    - name: POSTGRES_DB
      value: myapp
    - name: POSTGRES_USER
      value: admin
    - name: POSTGRES_PASSWORD
      value: secretpassword        # Demo only; load from a Secret (secretKeyRef) in real deployments
    volumeMounts:
    - name: database-storage
      mountPath: /var/lib/postgresql/data
  volumes:
  - name: database-storage
    persistentVolumeClaim:
      claimName: database-pvc

PVC in StatefulSet:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: database
spec:
  serviceName: database
  replicas: 3
  selector:
    matchLabels:
      app: database
  template:
    metadata:
      labels:
        app: database
    spec:
      containers:
      - name: postgres
        image: postgres:13
        volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi
      storageClassName: fast-ssd

StorageClass

What it is: A way to describe different "classes" of storage available in a cluster, enabling dynamic provisioning of PersistentVolumes.

Why it matters: StorageClasses eliminate the need for administrators to pre-provision PersistentVolumes. They enable on-demand storage provisioning with different performance characteristics, backup policies, and other parameters.

Key features:
- Dynamic provisioning - Automatically create PVs when PVCs are created
- Storage types - Define different tiers (fast SSD, slow HDD, etc.)
- Parameters - Configure storage-specific settings
- Provisioners - Backend storage systems that create volumes
- Reclaim policy - Applied to the PVs the class provisions once their PVCs are deleted (defaults to Delete)

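Common provisioners (legacy in-tree names; these plugins are deprecated or removed in current Kubernetes releases, where the CSI driver shown in parentheses does the work):
- kubernetes.io/aws-ebs - Amazon Elastic Block Store (CSI: ebs.csi.aws.com)
- kubernetes.io/gce-pd - Google Compute Engine Persistent Disk (CSI: pd.csi.storage.gke.io)
- kubernetes.io/azure-disk - Azure Managed Disk (CSI: disk.csi.azure.com)
- kubernetes.io/cinder - OpenStack Cinder (CSI: cinder.csi.openstack.org)
- kubernetes.io/vsphere-volume - vSphere (CSI: csi.vsphere.vmware.com)
- kubernetes.io/no-provisioner - Local volumes (no dynamic provisioning; PVs are created manually)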

Common commands:

# StorageClass operations
kubectl get storageclass                           # List all storage classes
kubectl get sc                                    # Short form
kubectl describe storageclass <sc-name>           # Detailed SC information
kubectl delete storageclass <sc-name>             # Delete storage class

# Set default storage class
kubectl patch storageclass <sc-name> -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

Example StorageClass (AWS EBS):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
  annotations:
    storageclass.kubernetes.io/is-default-class: "false"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp3
  fsType: ext4
  encrypted: "true"
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
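
Because this class sets allowVolumeExpansion: true, a bound claim can be grown by raising its storage request. The sketch below assumes a hypothetical claim named expandable-pvc that was created earlier from fast-ssd with a 20Gi request; only the storage value differs from that original manifest:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: expandable-pvc             # hypothetical, assumed to exist already with 20Gi
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 40Gi                # raised from 20Gi; requests cannot be reduced
  storageClassName: fast-ssd
```

Whether the filesystem grows while the pod is running or only after a restart depends on the storage driver.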

Example StorageClass (GCP Persistent Disk):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: regional-ssd
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
  replication-type: regional-pd
  zones: us-central1-a,us-central1-b
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer

Example StorageClass (Local storage):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete

StorageClass parameters by provider:

AWS EBS parameters:
- type - gp2, gp3, io1, io2, sc1, st1
- fsType - ext4, xfs
- encrypted - true/false
- kmsKeyId - KMS key for encryption

GCP PD parameters:
- type - pd-standard, pd-ssd, pd-balanced
- replication-type - none, regional-pd
- zones - Comma-separated zone list (deprecated in favor of allowedTopologies)

Azure Disk parameters:
- skuName - Standard_LRS, Premium_LRS, StandardSSD_LRS
- location - Azure region
- storageAccount - Storage account name

Volume binding modes:
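- Immediate - Volume is provisioned and bound as soon as the PVC is created, before any pod is scheduled
- WaitForFirstConsumer - Provisioning and binding are delayed until a pod using the PVC is scheduled, so the volume is placed where that pod can actually run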
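
Delayed binding is often combined with a zone restriction on the class. This is a minimal sketch, assuming a placeholder provisioner name and a driver that reports the standard topology.kubernetes.io/zone key (some drivers use their own key):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: zonal-delayed              # hypothetical name
provisioner: example.com/example   # placeholder; substitute the driver your cluster uses
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
- matchLabelExpressions:
  - key: topology.kubernetes.io/zone
    values:
    - us-central1-a
    - us-central1-b
```

With this class, a volume is created only after the scheduler has picked a node for the pod, and only in one of the listed zones.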

Dynamic provisioning workflow:
1. PVC created - User creates PVC with storageClassName
2. StorageClass matched - Kubernetes finds matching StorageClass
3. Provisioner called - Storage provisioner creates actual volume
4. PV created - Kubernetes creates PV representing the volume
5. Binding - PVC is bound to the newly created PV
6. Pod mount - Pod can now use the storage
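
How step 1 is written decides whether the workflow runs at all. The sketch below uses hypothetical claim names and assumes the cluster has a default StorageClass. The first claim omits storageClassName and is provisioned by the default class, while the second sets it to an empty string and therefore waits for a manually created PV:

```yaml
# Uses the cluster's default StorageClass (storageClassName omitted)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: default-class-pvc          # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
---
# Opts out of dynamic provisioning; binds only to a pre-created PV that has no class
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: static-only-pvc            # hypothetical name
spec:
  storageClassName: ""
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```

If the class uses WaitForFirstConsumer, the first claim stays Pending until a pod that references it is scheduled.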

When you'll use it: Any time you need persistent storage in Kubernetes - databases, file storage, application data, logs, or any stateful workload requiring data persistence.

Last updated: 2025-08-26 20:00 UTC