
Pod Lifecycle - Comprehensive Study Guide

category: Kubernetes Certification
tags: cka, kubernetes, exam, kubectl, certification

WHY Pod Lifecycle Matters (Conceptual Foundation)

Pods as the Atomic Unit of Kubernetes

Understanding pod lifecycle is understanding how Kubernetes actually runs your applications:

  • Atomic Deployment Unit - Pods are the smallest deployable units, not containers
  • Shared Execution Context - Containers in a pod share network, storage, and lifecycle
  • State Transition System - Pods move through predictable phases with clear triggers
  • Resource Management Boundary - Scheduling, networking, and storage happen at pod level
  • Failure and Recovery Model - Pod restarts, rescheduling, and health management patterns

Exam Context: Why Pod Lifecycle Mastery is Critical

  • Debugging foundation - most application failures first surface as pod lifecycle events
  • Networking troubleshooting - Pod-to-pod, pod-to-service communication
  • Storage management - Volume mounting, persistent storage, data persistence
  • Performance optimization - Resource requests, limits, quality of service
  • Security implementation - Pod security contexts, network policies, RBAC

Key Insight: Every Kubernetes abstraction (Deployments, Services, Jobs) ultimately manages pods. Understanding pod lifecycle means understanding how Kubernetes works at the foundational level.


Pod Lifecycle Overview

The Complete Pod Journey

┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   kubectl   │───▶│ API Server  │───▶│  Scheduler  │───▶│   kubelet   │
│   apply     │    │ Validation  │    │ Node Select │    │ Container   │
│             │    │ Admission   │    │             │    │ Runtime     │
└─────────────┘    └─────────────┘    └─────────────┘    └─────────────┘
                           │                                       │
                           ▼                                       ▼
┌─────────────────────────────────────────────────────────────────────┐
│                         etcd                                        │
│  Pod Object: spec.nodeName = ""  → spec.nodeName = "worker-1"      │
└─────────────────────────────────────────────────────────────────────┘

Pod Phases and States

# Pod status progression
apiVersion: v1
kind: Pod
status:
  phase: Pending     # → Running → Succeeded/Failed
  conditions:        # Detailed state information
  - type: PodScheduled
    status: "True"
    lastTransitionTime: "2024-01-15T10:00:00Z"
  - type: Initialized
    status: "True"
    lastTransitionTime: "2024-01-15T10:00:15Z"
  - type: ContainersReady
    status: "True"
    lastTransitionTime: "2024-01-15T10:00:30Z"
  - type: Ready
    status: "True"
    lastTransitionTime: "2024-01-15T10:00:30Z"
  containerStatuses:
  - name: nginx
    state:
      running:
        startedAt: "2024-01-15T10:00:25Z"
    ready: true
    restartCount: 0

Phase Definitions:
- Pending: Pod accepted but containers not yet created
- Running: Pod bound to node, at least one container running
- Succeeded: All containers terminated successfully (exit 0)
- Failed: All containers terminated, at least one failed
- Unknown: Pod state cannot be determined (node communication issues)


Pod Creation Process (Deep Dive)

1. Client Request and API Server Processing

kubectl to API Server

# This simple command triggers a complex workflow
kubectl run nginx --image=nginx

# Equivalent REST API call
curl -X POST \
  https://kube-apiserver:6443/api/v1/namespaces/default/pods \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "nginx"},
    "spec": {
      "containers": [{"name": "nginx", "image": "nginx"}]
    }
  }'

API Server Validation and Admission

# API Server processing steps:
# 1. Authentication: Is the user who they claim to be?
# 2. Authorization: Can this user create pods in this namespace?
# 3. Admission Controllers: Policy enforcement and resource mutation

# Example admission controller effects:
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  namespace: default
spec:
  serviceAccountName: default  # ← Added by ServiceAccount admission controller
  containers:
  - name: nginx
    image: nginx
    resources: {}              # ← May be modified by LimitRanger
    securityContext:           # ← May be added by a mutating policy webhook (PodSecurityPolicy was removed in v1.25)
      allowPrivilegeEscalation: false
      runAsNonRoot: true

2. Storage in etcd

Pod Object Storage

# Pod stored in etcd at this path
/registry/pods/default/nginx

# Initial pod object (spec.nodeName is empty)
{
  "apiVersion": "v1",
  "kind": "Pod",
  "metadata": {
    "name": "nginx",
    "namespace": "default",
    "uid": "12345678-1234-1234-1234-123456789abc",
    "creationTimestamp": "2024-01-15T10:00:00Z"
  },
  "spec": {
    "containers": [...],
    "nodeName": ""  # ← Empty until scheduled
  },
  "status": {
    "phase": "Pending"
  }
}

3. Scheduler Assignment

Scheduler Decision Process

# Scheduler filtering and scoring
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx
    resources:
      requests:
        memory: "128Mi"
        cpu: "100m"
  # ↓ After scheduling
  nodeName: "worker-2"  # ← Scheduler adds this field

Scheduling Constraints Example

# Complex scheduling scenario
apiVersion: v1
kind: Pod
metadata:
  name: complex-pod
spec:
  nodeSelector:               # Basic node filtering
    disktype: ssd
  affinity:
    nodeAffinity:            # Advanced node selection
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/arch
            operator: In
            values: ["amd64"]
    podAntiAffinity:         # Avoid co-location
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values: ["database"]
          topologyKey: kubernetes.io/hostname
  tolerations:               # Tolerate taints
  - key: "special"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"
  containers:
  - name: app
    image: myapp
    resources:
      requests:
        memory: "1Gi"
        cpu: "500m"
        nvidia.com/gpu: 1    # GPU requirement
      limits:
        memory: "2Gi"
        cpu: "1000m"
        nvidia.com/gpu: 1

4. kubelet Takes Over

kubelet Pod Management

# kubelet workflow on assigned node:
# 1. Watch API server for pods with nodeName matching this node
# 2. Pull container images
# 3. Create container runtime objects
# 4. Set up networking (CNI)
# 5. Mount volumes
# 6. Start containers
# 7. Monitor container health
# 8. Report status back to API server

Container Creation Process

# kubelet → container runtime interaction
# Example with containerd/CRI-O:

# 1. Image pulling
# kubelet calls: ImageService.PullImage()
# Tag resolution: "nginx" → "nginx:latest" (reported in status.containerStatuses[0].image)

# 2. Container creation
# kubelet calls: RuntimeService.CreateContainer()
# With pod sandbox (shared network/IPC namespace)

# 3. Container start
# kubelet calls: RuntimeService.StartContainer()
# Updates pod status: containerStatuses[0].state.running.startedAt

Pod Networking (Complete Model)

1. Pod Network Model Fundamentals

The Kubernetes Network Model

┌─────────────────────────────────────────────────────────────┐
│                    Cluster Network                         │
│                                                             │
│  ┌─────────────┐           ┌─────────────┐                 │
│  │   Node-1    │           │   Node-2    │                 │
│  │ 10.244.1.0  │           │ 10.244.2.0  │                 │
│  │             │           │             │                 │
│  │ ┌─────────┐ │           │ ┌─────────┐ │                 │
│  │ │ Pod A   │ │           │ │ Pod C   │ │                 │
│  │ │10.244.1.│ │           │ │10.244.2.│ │                 │
│  │ │    10   │ │           │ │    15   │ │                 │
│  │ └─────────┘ │           │ └─────────┘ │                 │
│  │ ┌─────────┐ │           │ ┌─────────┐ │                 │
│  │ │ Pod B   │ │           │ │ Pod D   │ │                 │
│  │ │10.244.1.│ │◀─────────▶│ │10.244.2.│ │                 │
│  │ │    20   │ │           │ │    25   │ │                 │
│  │ └─────────┘ │           │ └─────────┘ │                 │
│  └─────────────┘           └─────────────┘                 │
└─────────────────────────────────────────────────────────────┘

# Network principles:
# 1. Every pod gets a unique IP address
# 2. Pods can communicate with all other pods without NAT
# 3. Nodes can communicate with all pods without NAT
# 4. The IP that a pod sees itself as is the same IP others see it as

Pod Networking Setup Process

# When kubelet creates a pod:

# 1. Create network namespace for pod
ip netns add pod-namespace-12345

# 2. Call CNI (Container Network Interface) plugin
# /opt/cni/bin/bridge < CNI_CONFIG
# CNI plugin configures:
# - Virtual ethernet pair (veth)
# - Bridge connection
# - IP address assignment
# - Route configuration

# 3. Result: Pod has network connectivity
# Pod can reach: other pods, services, external networks

2. Container-to-Container Communication (Same Pod)

Shared Network Namespace

# Multi-container pod networking
apiVersion: v1
kind: Pod
metadata:
  name: shared-network-pod
spec:
  containers:
  - name: web
    image: nginx
    ports:
    - containerPort: 80
  - name: sidecar
    image: busybox
    command: ["sleep", "3600"]
---
# Inside the pod:
# Both containers share the same network interface
# web container binds to 0.0.0.0:80
# sidecar can reach web via localhost:80
# External clients see both containers at same pod IP

Localhost Communication Example

# From sidecar container:
kubectl exec shared-network-pod -c sidecar -- wget -qO- localhost:80
# ↑ This works because containers share network namespace

# Port conflicts within pod:
# If both containers try to bind to port 80 → conflict!
# Solution: Use different ports per container

3. Pod-to-Pod Communication

Direct IP Communication

# Pod A wants to communicate with Pod B
apiVersion: v1
kind: Pod
metadata:
  name: client-pod
spec:
  containers:
  - name: client
    image: busybox
    command: ["sleep", "3600"]
---
apiVersion: v1
kind: Pod
metadata:
  name: server-pod
spec:
  containers:
  - name: server
    image: nginx
    ports:
    - containerPort: 80
# Get pod IPs
kubectl get pods -o wide
# NAME         READY   STATUS    RESTARTS   AGE   IP            NODE
# client-pod   1/1     Running   0          1m    10.244.1.10   worker-1
# server-pod   1/1     Running   0          1m    10.244.2.15   worker-2

# Direct pod-to-pod communication
kubectl exec client-pod -- wget -qO- 10.244.2.15:80
# ↑ This works across nodes due to CNI networking

Service Discovery and DNS

# Service for stable endpoint
apiVersion: v1
kind: Service
metadata:
  name: web-service
spec:
  selector:
    app: web
  ports:
  - port: 80
    targetPort: 80
---
apiVersion: v1
kind: Pod
metadata:
  name: web-pod
  labels:
    app: web
spec:
  containers:
  - name: nginx
    image: nginx
# DNS-based service discovery
kubectl exec client-pod -- nslookup web-service
# Returns: web-service.default.svc.cluster.local → 10.96.55.123

# Service communication
kubectl exec client-pod -- wget -qO- web-service:80
kubectl exec client-pod -- wget -qO- web-service.default.svc.cluster.local:80
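The short name `web-service` resolves because the pod's resolv.conf carries search domains; the fully qualified name always follows the same pattern. A quick sketch, runnable without a cluster (names taken from the example above; `cluster.local` is the default cluster domain and may differ in your cluster):

```shell
# Build the DNS name a Service gets: <name>.<namespace>.svc.<cluster-domain>
svc=web-service
ns=default
cluster_domain=cluster.local   # default; configurable per cluster
fqdn="${svc}.${ns}.svc.${cluster_domain}"
echo "$fqdn"   # → web-service.default.svc.cluster.local
```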

4. Network Troubleshooting

Common Network Issues and Diagnostics

# 1. Pod cannot reach external internet
kubectl exec test-pod -- nslookup google.com
kubectl exec test-pod -- wget -qO- https://google.com

# 2. Pod-to-pod communication failure
kubectl exec pod-a -- ping <pod-b-ip>      # Note: some CNIs block ICMP; prefer a TCP check
kubectl exec pod-a -- telnet <pod-b-ip> 80

# 3. Service discovery not working
kubectl exec test-pod -- nslookup kubernetes.default
kubectl exec test-pod -- nslookup web-service

# 4. CNI plugin issues
kubectl describe pod test-pod  # Look for network setup errors
kubectl logs -n kube-system <cni-pod>

# 5. DNS resolution problems
kubectl exec test-pod -- cat /etc/resolv.conf
kubectl get svc -n kube-system kube-dns

Network Policy Impact

# Network policies can block communication
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all
spec:
  podSelector: {}  # Applies to all pods in namespace
  policyTypes:
  - Ingress
  - Egress
  # No ingress/egress rules = deny all traffic
---
# Allow specific communication
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-web-traffic
spec:
  podSelector:
    matchLabels:
      app: web
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 80

Pod Storage (Volumes and Persistence)

1. Volume Types and Use Cases

Temporary Storage (emptyDir)

apiVersion: v1
kind: Pod
metadata:
  name: shared-storage-pod
spec:
  containers:
  - name: writer
    image: busybox
    command: ["/bin/sh"]
    args: ["-c", "while true; do echo $(date) >> /shared/output.log; sleep 5; done"]
    volumeMounts:
    - name: shared-volume
      mountPath: /shared
  - name: reader
    image: busybox
    command: ["/bin/sh"]
    args: ["-c", "tail -f /shared/output.log"]
    volumeMounts:
    - name: shared-volume
      mountPath: /shared
  volumes:
  - name: shared-volume
    emptyDir:
      sizeLimit: 1Gi  # Optional size limit

Host Path Volumes

apiVersion: v1
kind: Pod
metadata:
  name: hostpath-pod
spec:
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - name: host-logs
      mountPath: /var/log/nginx
    - name: host-config
      mountPath: /etc/nginx/conf.d
      readOnly: true
  volumes:
  - name: host-logs
    hostPath:
      path: /var/log/nginx  # Node filesystem path
      type: DirectoryOrCreate
  - name: host-config
    hostPath:
      path: /etc/nginx/conf.d
      type: Directory
  nodeSelector:
    kubernetes.io/hostname: specific-node  # Pin to specific node

Gotcha: hostPath volumes tie pods to specific nodes and pose security risks.

ConfigMap and Secret Volumes

apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  config.yaml: |
    server:
      port: 8080
      debug: true
    database:
      host: db.example.com
      port: 5432
---
apiVersion: v1
kind: Secret
metadata:
  name: app-secrets
type: Opaque
data:
  username: YWRtaW4=  # admin (base64)
  password: cGFzc3dvcmQ=  # password (base64)
---
apiVersion: v1
kind: Pod
metadata:
  name: config-pod
spec:
  containers:
  - name: app
    image: myapp
    volumeMounts:
    - name: config-volume
      mountPath: /etc/config
      readOnly: true
    - name: secret-volume
      mountPath: /etc/secrets
      readOnly: true
    env:
    - name: CONFIG_FILE
      value: /etc/config/config.yaml
  volumes:
  - name: config-volume
    configMap:
      name: app-config
      defaultMode: 0644
  - name: secret-volume
    secret:
      secretName: app-secrets
      defaultMode: 0400  # Read-only for owner only
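The base64 strings in the Secret above can be produced and verified locally; note the `-n` flag, which keeps a trailing newline out of the encoded value:

```shell
# Encode Secret values (use -n so the newline is not encoded)
echo -n 'admin' | base64      # → YWRtaW4=
echo -n 'password' | base64   # → cGFzc3dvcmQ=

# Decode to confirm what the Secret actually stores
echo 'YWRtaW4=' | base64 -d   # → admin
```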

2. Persistent Storage

PersistentVolume and PersistentVolumeClaim

# Cluster administrator creates PV
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-1
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/vol1
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - worker-node-1
---
# User requests storage via PVC
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-storage
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: local-storage
---
# Pod uses PVC
apiVersion: v1
kind: Pod
metadata:
  name: persistent-pod
spec:
  containers:
  - name: app
    image: postgres:13
    env:
    - name: POSTGRES_DB
      value: myapp
    - name: PGDATA
      value: /var/lib/postgresql/data/pgdata
    volumeMounts:
    - name: postgres-storage
      mountPath: /var/lib/postgresql/data
  volumes:
  - name: postgres-storage
    persistentVolumeClaim:
      claimName: app-storage

Storage Classes and Dynamic Provisioning

# StorageClass for dynamic provisioning
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com  # CSI driver (gp3/iops parameters require it; in-tree kubernetes.io/aws-ebs is deprecated)
parameters:
  type: gp3
  iops: "3000"
  encrypted: "true"
allowVolumeExpansion: true
reclaimPolicy: Delete
---
# PVC using StorageClass
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-storage
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
  storageClassName: fast-ssd  # Triggers dynamic provisioning

3. Volume Mounting and Lifecycle

Volume Mount Options and Behavior

apiVersion: v1
kind: Pod
metadata:
  name: mount-options-pod
spec:
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: data-volume
      mountPath: /data
      readOnly: false
      subPath: app-data      # Mount subdirectory only
    - name: config-volume
      mountPath: /etc/config
      readOnly: true
    - name: shared-volume
      mountPath: /shared
      mountPropagation: HostToContainer  # Propagate host mounts
  volumes:
  - name: data-volume
    persistentVolumeClaim:
      claimName: app-data
  - name: config-volume
    configMap:
      name: app-config
      items:  # Mount specific keys only
      - key: config.yaml
        path: app.yaml
        mode: 0644
  - name: shared-volume
    hostPath:
      path: /mnt/shared

Init Containers and Volume Preparation

apiVersion: v1
kind: Pod
metadata:
  name: init-volume-pod
spec:
  initContainers:
  - name: volume-setup
    image: busybox
    command: ["/bin/sh"]
    args:
    - -c
    - |
      mkdir -p /data/app /data/logs
      chown 1000:1000 /data/app /data/logs
      chmod 755 /data/app /data/logs
    volumeMounts:
    - name: app-data
      mountPath: /data
  containers:
  - name: app
    image: myapp
    securityContext:
      runAsUser: 1000
      runAsGroup: 1000
    volumeMounts:
    - name: app-data
      mountPath: /app/data
      subPath: app
    - name: app-data
      mountPath: /app/logs
      subPath: logs
  volumes:
  - name: app-data
    persistentVolumeClaim:
      claimName: app-storage

Pod Logging (Complete Logging Strategy)

1. Container Log Management

Standard Output/Error Logging

# Application logging to stdout/stderr
apiVersion: v1
kind: Pod
metadata:
  name: logging-pod
spec:
  containers:
  - name: app
    image: busybox
    command: ["/bin/sh"]
    args:
    - -c
    - |
      while true; do
        echo "$(date): INFO - Application is running" >&1
        echo "$(date): ERROR - Something went wrong" >&2
        sleep 10
      done
# Accessing container logs
kubectl logs logging-pod
kubectl logs logging-pod --previous  # Previous container instance
kubectl logs logging-pod --since=1h  # Time-based filtering
kubectl logs logging-pod --tail=50   # Last N lines
kubectl logs logging-pod -f          # Follow real-time

# Multi-container pod logs
kubectl logs logging-pod -c app      # Specific container
kubectl logs logging-pod --all-containers=true  # All containers

Log Rotation and Retention

# kubelet log rotation configuration
# /var/lib/kubelet/config.yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
logging:
  format: json
containerLogMaxSize: "10Mi"     # Rotate when log file reaches 10MB
containerLogMaxFiles: 5         # Keep 5 rotated files
# Container log files on node
ls -la /var/log/pods/default_logging-pod_*/app/
# 0.log      (current log file)
# 0.log.1    (rotated log file)
# 0.log.2    (older rotated log file)
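The size check kubelet performs can be sketched in a few lines of shell. This is a simplified illustration with hypothetical `/tmp` paths, not kubelet's actual code (kubelet also compresses older rotated files):

```shell
# Simplified sketch of size-based container log rotation
log=/tmp/demo-0.log
max_bytes=$((10 * 1024 * 1024))        # containerLogMaxSize: "10Mi"

head -c 11000000 /dev/zero > "$log"    # simulate a log that grew past 10Mi
size=$(wc -c < "$log")
if [ "$size" -ge "$max_bytes" ]; then
  mv "$log" "$log.1"                   # keep up to containerLogMaxFiles rotated files
  : > "$log"                           # start a fresh current file
fi
ls "$log" "$log.1"
```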

2. Sidecar Logging Patterns

Log Aggregation Sidecar

apiVersion: v1
kind: Pod
metadata:
  name: sidecar-logging-pod
spec:
  containers:
  - name: app
    image: busybox
    command: ["/bin/sh"]
    args:
    - -c
    - |
      while true; do
        echo "$(date): Application log entry" >> /var/log/app.log
        sleep 5
      done
    volumeMounts:
    - name: log-volume
      mountPath: /var/log
  - name: log-shipper
    image: fluent/fluent-bit:latest
    volumeMounts:
    - name: log-volume
      mountPath: /var/log
      readOnly: true
    - name: fluent-config
      mountPath: /fluent-bit/etc
  volumes:
  - name: log-volume
    emptyDir: {}
  - name: fluent-config
    configMap:
      name: fluent-bit-config
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
data:
  fluent-bit.conf: |
    [INPUT]
        Name tail
        Path /var/log/app.log
        Tag app.logs
    [OUTPUT]
        Name forward
        Match *
        Host logging-server.monitoring.svc.cluster.local
        Port 24224

Multi-Stream Logging

apiVersion: v1
kind: Pod
metadata:
  name: multi-stream-logging
spec:
  containers:
  - name: app
    image: myapp
    volumeMounts:
    - name: log-volume
      mountPath: /var/log
  # Separate sidecar for each log stream
  - name: access-log-streamer
    image: busybox
    command: ["/bin/sh"]
    args: ["-c", "tail -f /var/log/access.log"]
    volumeMounts:
    - name: log-volume
      mountPath: /var/log
      readOnly: true
  - name: error-log-streamer
    image: busybox
    command: ["/bin/sh"]
    args: ["-c", "tail -f /var/log/error.log"]
    volumeMounts:
    - name: log-volume
      mountPath: /var/log
      readOnly: true
  volumes:
  - name: log-volume
    emptyDir: {}

3. Centralized Logging Architecture

ELK Stack Integration

# Filebeat DaemonSet for log collection
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: logging
spec:
  selector:
    matchLabels:
      app: filebeat
  template:
    metadata:
      labels:
        app: filebeat
    spec:
      containers:
      - name: filebeat
        image: elastic/filebeat:8.5.0
        volumeMounts:
        - name: config
          mountPath: /usr/share/filebeat/filebeat.yml
          subPath: filebeat.yml
        - name: varlog
          mountPath: /var/log
          readOnly: true
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: varlogpods
          mountPath: /var/log/pods
          readOnly: true
      volumes:
      - name: config
        configMap:
          name: filebeat-config
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: varlogpods
        hostPath:
          path: /var/log/pods

4. Log Troubleshooting

Common Logging Issues

# 1. Pod logs not appearing
kubectl describe pod problematic-pod
# Check: Events section for container creation issues

# 2. Container keeps restarting
kubectl logs problematic-pod --previous
# Previous instance logs often show crash cause

# 3. Log volume full
kubectl exec -it pod-name -- df -h /var/log
# Check volume usage in pod

# 4. Log rotation not working
# Check kubelet configuration and node disk space
df -h /var/log/pods/

# 5. Sidecar not collecting logs
kubectl logs pod-name -c sidecar-container
# Check sidecar container for collection errors

Log Analysis Patterns

# Search for specific patterns
kubectl logs app-pod | grep "ERROR"
kubectl logs app-pod | grep -E "(ERROR|WARN|FATAL)"

# Time-based analysis
kubectl logs app-pod --since=2h | grep "database"
kubectl logs app-pod --since-time="2024-01-15T10:00:00Z"

# Multi-pod log aggregation
for pod in $(kubectl get pods -l app=web -o name); do
  echo "=== $pod ==="
  kubectl logs $pod --tail=10
done
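The same grep patterns can be practiced without a cluster against a synthetic file standing in for `kubectl logs` output (hypothetical log lines, written to `/tmp`):

```shell
# Synthetic log lines standing in for `kubectl logs app-pod` output
printf '%s\n' \
  '2024-01-15T10:00:00Z INFO  server started' \
  '2024-01-15T10:00:05Z ERROR database connection timeout' \
  '2024-01-15T10:00:10Z WARN  retrying database connection' \
  '2024-01-15T10:00:15Z INFO  connected' > /tmp/app-pod.log

grep -c 'ERROR' /tmp/app-pod.log                # → 1
grep -cE '(ERROR|WARN|FATAL)' /tmp/app-pod.log  # → 2
grep 'database' /tmp/app-pod.log                # both database-related lines
```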

Pod Health and Monitoring

1. Health Probes

Liveness, Readiness, and Startup Probes

apiVersion: v1
kind: Pod
metadata:
  name: health-check-pod
spec:
  containers:
  - name: web
    image: nginx
    ports:
    - containerPort: 80
    # Startup probe - initial health check
    startupProbe:
      httpGet:
        path: /health
        port: 80
      failureThreshold: 30    # Allow 30 * 10 = 300 seconds for startup
      periodSeconds: 10
    # Liveness probe - restart container if unhealthy
    livenessProbe:
      httpGet:
        path: /health
        port: 80
        httpHeaders:
        - name: Custom-Header
          value: health-check
      initialDelaySeconds: 30
      periodSeconds: 10
      timeoutSeconds: 5
      failureThreshold: 3
      successThreshold: 1
    # Readiness probe - remove from service if not ready
    readinessProbe:
      httpGet:
        path: /ready
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 5
      timeoutSeconds: 3
      failureThreshold: 3
      successThreshold: 1
    # Resource limits affect scheduling and QoS
    resources:
      requests:
        memory: "128Mi"
        cpu: "100m"
      limits:
        memory: "256Mi"
        cpu: "200m"

Probe Types and Examples

# HTTP GET probe
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
    scheme: HTTPS

# TCP socket probe
livenessProbe:
  tcpSocket:
    port: 3306

# Command execution probe
livenessProbe:
  exec:
    command:
    - cat
    - /tmp/healthy

2. Resource Management and QoS

Quality of Service Classes

# Guaranteed QoS (highest priority)
apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-pod
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        memory: "1Gi"
        cpu: "500m"
      limits:
        memory: "1Gi"    # requests == limits
        cpu: "500m"      # requests == limits
---
# Burstable QoS (medium priority)
apiVersion: v1
kind: Pod
metadata:
  name: burstable-pod
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        memory: "512Mi"
        cpu: "250m"
      limits:
        memory: "1Gi"    # limits > requests
        cpu: "1000m"     # limits > requests
---
# BestEffort QoS (lowest priority)
apiVersion: v1
kind: Pod
metadata:
  name: besteffort-pod
spec:
  containers:
  - name: app
    image: nginx
    # No resource requests or limits specified
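The kubelet's QoS assignment can be sketched as a small decision function. This is deliberately simplified to one container and one resource; the real rule examines every container and both cpu and memory:

```shell
# Simplified QoS classification for a single container (illustration only)
qos_class() {
  req=$1; lim=$2                 # e.g. "1Gi" "1Gi"
  if [ -z "$req" ] && [ -z "$lim" ]; then
    echo BestEffort              # nothing specified
  elif [ -n "$req" ] && [ "$req" = "$lim" ]; then
    echo Guaranteed              # requests == limits for every resource
  else
    echo Burstable               # anything in between
  fi
}

qos_class "1Gi" "1Gi"     # → Guaranteed
qos_class "512Mi" "1Gi"   # → Burstable
qos_class "" ""           # → BestEffort
```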

3. Pod Disruption and Termination

Graceful Termination Process

apiVersion: v1
kind: Pod
metadata:
  name: graceful-termination-pod
spec:
  containers:
  - name: app
    image: nginx
    lifecycle:
      preStop:  # Hook executed before SIGTERM
        exec:
          command:
          - /bin/sh
          - -c
          - |
            echo "Graceful shutdown initiated"
            nginx -s quit  # Graceful nginx shutdown
            sleep 5
  terminationGracePeriodSeconds: 30  # Time allowed for graceful shutdown
# Pod termination sequence:
# 1. Pod marked for deletion (grace period starts)
# 2. Pod removed from service endpoints
# 3. preStop hook executed (if defined)
# 4. SIGTERM sent to main container process
# 5. Grace period countdown (terminationGracePeriodSeconds)
# 6. SIGKILL sent if process still running after grace period
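Step 4 can be demonstrated with a plain shell trap; handling SIGTERM like this is exactly what a well-behaved entrypoint does during the grace period:

```shell
# Minimal sketch: a process that exits cleanly on SIGTERM (as kubelet sends it)
sh -c '
  trap "echo caught SIGTERM, shutting down; exit 0" TERM
  kill -TERM $$    # simulate kubelet signaling PID 1
  sleep 5          # never reached: the trap fires first
  echo killed uncleanly
'
# prints: caught SIGTERM, shutting down
```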

Pod Disruption Budgets

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 2  # or maxUnavailable: 1
  selector:
    matchLabels:
      app: web

Exam-Specific Pod Scenarios

1. Common Exam Tasks

Multi-Container Pod Creation

# Create pod with specific requirements
apiVersion: v1
kind: Pod
metadata:
  name: exam-pod
  labels:
    app: exam-app
spec:
  containers:
  - name: main
    image: nginx:1.20
    ports:
    - containerPort: 80
    env:
    - name: ENV_VAR
      value: "production"
    resources:
      requests:
        memory: "128Mi"
        cpu: "100m"
      limits:
        memory: "256Mi"
        cpu: "200m"
    volumeMounts:
    - name: shared-data
      mountPath: /data
  - name: sidecar
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: shared-data
      mountPath: /data
  volumes:
  - name: shared-data
    emptyDir: {}
  restartPolicy: Always
  nodeSelector:
    disktype: ssd

Troubleshooting Failed Pods

# Systematic troubleshooting approach
# 1. Check pod status and events
kubectl get pods
kubectl describe pod <pod-name>

# 2. Check container logs
kubectl logs <pod-name>
kubectl logs <pod-name> --previous

# 3. Check resource availability
kubectl describe nodes
kubectl top nodes
kubectl top pods

# 4. Check image and registry issues
kubectl get events --sort-by='.lastTimestamp'

# 5. Interactive debugging
kubectl exec -it <pod-name> -- /bin/sh

2. Pod Security and Best Practices

Security Context Implementation

apiVersion: v1
kind: Pod
metadata:
  name: secure-pod
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: secure-app
    image: nginx
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop:
        - ALL
        add:
        - NET_BIND_SERVICE
    volumeMounts:
    - name: tmp-volume
      mountPath: /tmp
    - name: var-cache-nginx
      mountPath: /var/cache/nginx
    - name: var-run
      mountPath: /var/run
  volumes:
  - name: tmp-volume
    emptyDir: {}
  - name: var-cache-nginx
    emptyDir: {}
  - name: var-run
    emptyDir: {}

Conceptual Mastery Framework

Pod Lifecycle State Machine

Understanding pods as state machines that transition through phases:
1. Pending → API accepted, awaiting scheduling
2. Running → At least one container running
3. Succeeded/Failed → Terminal states
4. Unknown → Communication failure with node

Resource Allocation and Scheduling

Pods are resource consumption units where:
- Requests = guaranteed resources for scheduling
- Limits = maximum allowed resource usage
- QoS classes = eviction priority during resource pressure

Networking and Communication

Pods are network endpoints with:
- Unique IP addresses within cluster network
- Shared network namespace between containers
- Service discovery through DNS and environment variables

Storage and Data Management

Pods are data processing units with:
- Ephemeral storage for temporary data
- Volume mounts for shared data between containers
- Persistent volumes for data that survives pod restarts


Troubleshooting Decision Tree

Pod Issue?
├── Not Starting?
│   ├── Check: kubectl describe pod
│   ├── Look for: Image pull errors, resource constraints
│   └── Fix: Verify image, node resources, scheduling constraints
├── Running but Not Working?
│   ├── Check: kubectl logs pod
│   ├── Look for: Application errors, configuration issues
│   └── Fix: Review config, environment variables, volumes
├── Networking Issues?
│   ├── Check: kubectl exec pod -- ping/telnet
│   ├── Look for: CNI issues, NetworkPolicy blocks
│   └── Fix: Verify CNI, check policies, test connectivity
└── Performance Issues?
    ├── Check: kubectl top pods
    ├── Look for: Resource limits, node capacity
    └── Fix: Adjust requests/limits, scale resources

Conceptual Mastery Checklist

  • Understand pods as atomic scheduling and execution units
  • Master the pod lifecycle from creation to termination
  • Comprehend networking model and communication patterns
  • Know volume types and persistent storage strategies
  • Practice logging collection and troubleshooting workflows
  • Internalize health checking and monitoring approaches
  • Understand resource management and QoS implications


Mastering pod lifecycle means understanding how Kubernetes transforms your application definitions into running, networked, monitored workloads. Every other Kubernetes concept builds upon this foundational understanding of how pods actually work.

Last updated: 2025-08-26 20:00 UTC