alexsusanu@docs:YAML Manifests - Comprehensive Study Guide $
alexsusanu@docs
:~$ cat YAML Manifests - Comprehensive Study Guide.md

HomeNOTES → YAML Manifests - Comprehensive Study Guide

YAML Manifests - Comprehensive Study Guide

category: Kubernetes Certification
tags: cka, kubernetes, exam, kubectl, certification

WHY YAML Manifests Matter (Conceptual Foundation)

The Infrastructure as Code Paradigm

YAML manifests are declarative infrastructure definitions - the DNA of your Kubernetes applications:

  • Desired State Declaration - You describe WHAT you want, not HOW to achieve it
  • API Object Blueprints - Every manifest creates or modifies API objects in etcd
  • Reproducible Infrastructure - Same manifest = same result across environments
  • Version Control Integration - Infrastructure changes tracked like code
  • GitOps Foundation - Manifests enable automated deployment pipelines

Exam Context: Why YAML Mastery is Critical

  • 70% of exam tasks involve writing or modifying YAML
  • No syntax highlighting - you must know structure by heart
  • Complex nested relationships - understanding object hierarchy is crucial
  • Troubleshooting broken manifests - rapid YAML debugging skills essential
  • Template generation - converting kubectl commands to production-ready YAML

Core Architectural Understanding

Kubernetes API Object Model

Every YAML manifest represents a desired state declaration to the API server:

# This is NOT just configuration - it's a state transition request
apiVersion: apps/v1  # Which API group handles this?
kind: Deployment     # What type of object?
metadata:           # How to identify this object?
  name: web-app
  namespace: production
spec:               # What is the desired state?
  replicas: 3
  # ... more spec details
status:             # What is the current state? (managed by controllers)
  # Never write this section - it's controller-managed

Key Concept: manifests define spec (desired), controllers reconcile to achieve it, and populate status (actual).

YAML Syntax Foundation for Kubernetes

# Indentation is CRITICAL (2 spaces standard)
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
  labels:           # Key-value pairs
    app: web
    tier: frontend
  annotations:      # Extended metadata
    description: "Main web application pod"
spec:
  containers:       # Array of objects
  - name: web       # Array item starts with dash
    image: nginx:1.20
    ports:
    - containerPort: 80
      protocol: TCP
  - name: sidecar   # Second container
    image: busybox
    command: ["sleep", "3600"]

Critical YAML Rules:
- Indentation = hierarchy (spaces only, never tabs)
- Hyphens (-) = array items
- Colons (:) = key-value pairs
- Quotes preserve strings with special characters


Essential Manifest Patterns

1. Pod Manifests (Building Block)

Basic Pod Structure

apiVersion: v1
kind: Pod
metadata:
  name: basic-pod
  namespace: default
  labels:
    app: myapp
    version: v1
spec:
  containers:
  - name: main-container
    image: nginx:1.20
    ports:
    - containerPort: 80
    env:
    - name: ENV_VAR
      value: "production"
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
  restartPolicy: Always
  nodeSelector:
    disktype: ssd

Multi-Container Pod (Sidecar Pattern)

apiVersion: v1
kind: Pod
metadata:
  name: sidecar-pod
spec:
  containers:
  - name: main-app
    image: nginx
    volumeMounts:
    - name: shared-logs
      mountPath: /var/log/nginx
  - name: log-shipper
    image: fluent/fluent-bit
    volumeMounts:
    - name: shared-logs
      mountPath: /var/log/nginx
      readOnly: true
  volumes:
  - name: shared-logs
    emptyDir: {}

Conceptual Insight: Multi-container pods share network, storage, and lifecycle - they're atomic deployment units.

2. Deployment Manifests (Production Workloads)

Complete Deployment Pattern

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-deployment
  labels:
    app: web
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  selector:
    matchLabels:        # MUST match template labels
      app: web
  template:            # Pod template
    metadata:
      labels:
        app: web        # Referenced by selector above
        version: v1
    spec:
      containers:
      - name: web
        image: nginx:1.20
        ports:
        - containerPort: 80
        livenessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /ready
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 5
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "256Mi"
            cpu: "200m"

Critical Gotcha: spec.selector.matchLabels MUST match spec.template.metadata.labels exactly.

Advanced Deployment Strategies

# Blue-Green Deployment Pattern
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-blue
spec:
  replicas: 3
  strategy:
    type: Recreate  # All pods replaced simultaneously
  selector:
    matchLabels:
      app: myapp
      version: blue
  template:
    metadata:
      labels:
        app: myapp
        version: blue
    spec:
      containers:
      - name: app
        image: myapp:v1.0
---
# Switch traffic by updating service selector
apiVersion: v1
kind: Service
metadata:
  name: app-service
spec:
  selector:
    app: myapp
    version: blue  # Change to 'green' for traffic switch
  ports:
  - port: 80
    targetPort: 8080

3. Service Manifests (Network Abstraction)

ClusterIP Service (Internal)

apiVersion: v1
kind: Service
metadata:
  name: internal-service
spec:
  type: ClusterIP  # Default type
  selector:
    app: web       # Matches pod labels
  ports:
  - name: http
    port: 80       # Service port
    targetPort: 8080  # Container port
    protocol: TCP
  sessionAffinity: ClientIP  # Optional: sticky sessions

NodePort Service (External Access)

apiVersion: v1
kind: Service
metadata:
  name: nodeport-service
spec:
  type: NodePort
  selector:
    app: web
  ports:
  - port: 80
    targetPort: 8080
    nodePort: 30080  # Optional: specify port (30000-32767)

LoadBalancer Service (Cloud Provider)

apiVersion: v1
kind: Service
metadata:
  name: loadbalancer-service
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
spec:
  type: LoadBalancer
  selector:
    app: web
  ports:
  - port: 80
    targetPort: 8080
  loadBalancerSourceRanges:  # Restrict access
  - "10.0.0.0/8"

Headless Service (StatefulSet Pattern)

apiVersion: v1
kind: Service
metadata:
  name: headless-service
spec:
  clusterIP: None  # Headless - no cluster IP assigned
  selector:
    app: database
  ports:
  - port: 5432
    targetPort: 5432

Conceptual Insight: Services create stable network endpoints for ephemeral pods via label selection.

4. ConfigMap and Secret Patterns

ConfigMap Varieties

# Literal values
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  database_url: "postgresql://db:5432/myapp"
  debug_mode: "false"
  config.yaml: |
    server:
      port: 8080
      host: "0.0.0.0"
    database:
      driver: postgres
      maxConnections: 10
---
# Using ConfigMap in Pod
apiVersion: v1
kind: Pod
metadata:
  name: config-pod
spec:
  containers:
  - name: app
    image: myapp
    env:
    - name: DATABASE_URL
      valueFrom:
        configMapKeyRef:
          name: app-config
          key: database_url
    volumeMounts:
    - name: config-volume
      mountPath: /etc/config
  volumes:
  - name: config-volume
    configMap:
      name: app-config

Secret Patterns

# Opaque Secret (base64 encoded)
apiVersion: v1
kind: Secret
metadata:
  name: db-secret
type: Opaque
data:
  username: YWRtaW4=  # admin (base64)
  password: cGFzc3dvcmQ=  # password (base64)
stringData:  # Plain text (automatically encoded)
  connection-string: "postgresql://admin:password@db:5432/myapp"
---
# TLS Secret
apiVersion: v1
kind: Secret
metadata:
  name: tls-secret
type: kubernetes.io/tls
data:
  tls.crt: LS0tLS1CRUdJTi... # Base64 certificate
  tls.key: LS0tLS1CRUdJTi... # Base64 private key
---
# Docker Registry Secret
apiVersion: v1
kind: Secret
metadata:
  name: registry-secret
type: kubernetes.io/dockerconfigjson
data:
  .dockerconfigjson: eyJhdXRocyI6... # Base64 docker config

5. Volume and Storage Patterns

PersistentVolume and PersistentVolumeClaim

# PersistentVolume (cluster-wide resource)
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/vol1
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - worker-node-1
---
# PersistentVolumeClaim (namespace resource)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: local-storage
---
# Using PVC in Pod
apiVersion: v1
kind: Pod
metadata:
  name: storage-pod
spec:
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - name: storage
      mountPath: /data
  volumes:
  - name: storage
    persistentVolumeClaim:
      claimName: app-pvc

Volume Types Reference

apiVersion: v1
kind: Pod
metadata:
  name: volume-examples
spec:
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: empty-dir
      mountPath: /tmp/empty
    - name: host-path
      mountPath: /tmp/host
    - name: config-volume
      mountPath: /etc/config
    - name: secret-volume
      mountPath: /etc/secrets
  volumes:
  - name: empty-dir
    emptyDir:
      sizeLimit: 1Gi
  - name: host-path
    hostPath:
      path: /var/log
      type: Directory
  - name: config-volume
    configMap:
      name: app-config
  - name: secret-volume
    secret:
      secretName: app-secret
      defaultMode: 0400  # Read-only for owner

Advanced Manifest Patterns

1. StatefulSet for Stateful Applications

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: database
spec:
  serviceName: "database-headless"
  replicas: 3
  selector:
    matchLabels:
      app: database
  template:
    metadata:
      labels:
        app: database
    spec:
      containers:
      - name: postgres
        image: postgres:13
        env:
        - name: POSTGRES_DB
          value: myapp
        - name: POSTGRES_USER
          valueFrom:
            secretKeyRef:
              name: db-secret
              key: username
        volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:  # Unique PVC per pod
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi

Key Concept: StatefulSets provide stable network identity and persistent storage per replica.

2. DaemonSet for Node-level Services

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      hostNetwork: true  # Use host networking
      hostPID: true      # Access host process namespace
      containers:
      - name: node-exporter
        image: prom/node-exporter:latest
        args:
        - '--path.rootfs=/host'
        ports:
        - containerPort: 9100
          hostPort: 9100  # Expose on host
        volumeMounts:
        - name: rootfs
          mountPath: /host
          readOnly: true
      volumes:
      - name: rootfs
        hostPath:
          path: /
      tolerations:  # Run on all nodes, including masters
      - operator: Exists

3. Job and CronJob Patterns

# One-time Job
apiVersion: batch/v1
kind: Job
metadata:
  name: data-migration
spec:
  completions: 1
  parallelism: 1
  backoffLimit: 3
  template:
    spec:
      containers:
      - name: migrator
        image: migrate/migrate
        command: ["migrate", "-path", "/migrations", "-database", "$(DATABASE_URL)", "up"]
        env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: db-secret
              key: connection-string
      restartPolicy: Never
---
# Scheduled CronJob
apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup-job
spec:
  schedule: "0 2 * * *"  # Daily at 2 AM
  concurrencyPolicy: Forbid  # Don't run concurrent jobs
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            image: postgres:13
            command: ["/bin/bash"]
            args:
            - -c
            - "pg_dump $DATABASE_URL > /backup/backup-$(date +%Y%m%d-%H%M%S).sql"
            volumeMounts:
            - name: backup-storage
              mountPath: /backup
          volumes:
          - name: backup-storage
            persistentVolumeClaim:
              claimName: backup-pvc
          restartPolicy: OnFailure

4. Ingress for HTTP Routing

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
  tls:
  - hosts:
    - app.example.com
    secretName: app-tls
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 80
      - path: /
        pathType: Prefix
        backend:
          service:
            name: frontend-service
            port:
              number: 80

YAML Generation and Templating

1. kubectl Generators (Exam Essential)

# Generate deployment YAML
kubectl create deployment web --image=nginx --dry-run=client -o yaml > deployment.yaml

# Generate service YAML
kubectl expose deployment web --port=80 --target-port=8080 --dry-run=client -o yaml > service.yaml

# Generate configmap YAML
kubectl create configmap app-config --from-literal=key1=value1 --dry-run=client -o yaml > configmap.yaml

# Generate secret YAML
kubectl create secret generic app-secret --from-literal=password=secret123 --dry-run=client -o yaml > secret.yaml

# Generate job YAML
kubectl create job data-job --image=busybox --dry-run=client -o yaml -- echo "hello world" > job.yaml

2. Common Modifications Patterns

# Start with generated base
apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null  # Remove this line
  labels:
    app: web
  name: web
spec:
  replicas: 1  # Change to desired count
  selector:
    matchLabels:
      app: web
  strategy: {}  # Replace with specific strategy
  template:
    metadata:
      creationTimestamp: null  # Remove this line
      labels:
        app: web
    spec:
      containers:
      - image: nginx
        name: nginx
        resources: {}  # Add actual resource limits
status: {}  # Remove entire status section

Exam Tip: Always remove creationTimestamp: null, resources: {}, strategy: {}, and entire status: {} sections.


Validation and Troubleshooting

1. YAML Syntax Validation

# Validate without applying
kubectl apply --dry-run=client -f manifest.yaml

# Validate against cluster
kubectl apply --dry-run=server -f manifest.yaml

# Explain API structure
kubectl explain deployment.spec.template.spec.containers
kubectl explain pod.spec --recursive

2. Common YAML Errors and Fixes

Indentation Errors

# WRONG - inconsistent indentation
apiVersion: v1
kind: Pod
metadata:
 name: bad-pod
spec:
   containers:
 - name: nginx
    image: nginx

# CORRECT - consistent 2-space indentation
apiVersion: v1
kind: Pod
metadata:
  name: good-pod
spec:
  containers:
  - name: nginx
    image: nginx

Label Selector Mismatches

# WRONG - selector doesn't match template labels
spec:
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: frontend  # Mismatch!

# CORRECT - labels match exactly
spec:
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web       # Match!

Resource Reference Errors

# WRONG - referencing non-existent resources
env:
- name: CONFIG_VALUE
  valueFrom:
    configMapKeyRef:
      name: missing-config  # ConfigMap doesn't exist
      key: missing-key      # Key doesn't exist

# CORRECT - verify resources exist first
kubectl get configmap app-config
kubectl describe configmap app-config

3. Debugging Workflow

# 1. Validate YAML syntax
kubectl apply --dry-run=client -f manifest.yaml

# 2. Check resource creation
kubectl apply -f manifest.yaml
kubectl get all -l app=myapp

# 3. Investigate issues
kubectl describe deployment myapp
kubectl describe pod <pod-name>
kubectl logs <pod-name>

# 4. Check events
kubectl get events --sort-by='.lastTimestamp'

Best Practices and Patterns

1. Production-Ready Manifest Structure

apiVersion: apps/v1
kind: Deployment
metadata:
  name: production-app
  labels:
    app: production-app
    version: v1.2.3
    tier: frontend
  annotations:
    deployment.kubernetes.io/revision: "1"
    description: "Production web application"
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  selector:
    matchLabels:
      app: production-app
  template:
    metadata:
      labels:
        app: production-app
        version: v1.2.3
        tier: frontend
    spec:
      containers:
      - name: web
        image: myapp:v1.2.3
        ports:
        - containerPort: 8080
          name: http
        env:
        - name: ENVIRONMENT
          value: "production"
        resources:
          requests:
            memory: "256Mi"
            cpu: "200m"
          limits:
            memory: "512Mi"
            cpu: "500m"
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 3
        volumeMounts:
        - name: config
          mountPath: /etc/config
          readOnly: true
        - name: secrets
          mountPath: /etc/secrets
          readOnly: true
      volumes:
      - name: config
        configMap:
          name: app-config
      - name: secrets
        secret:
          secretName: app-secrets
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        fsGroup: 2000
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - production-app
              topologyKey: kubernetes.io/hostname

2. Multi-Resource Manifest Organization

# namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: myapp
  labels:
    name: myapp
---
# configmap.yaml in same file (using ---)
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: myapp
data:
  app.properties: |
    server.port=8080
    database.url=jdbc:postgresql://db:5432/myapp
---
# secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: app-secrets
  namespace: myapp
type: Opaque
stringData:
  database-password: "supersecret"
---
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: myapp
# ... rest of deployment spec

3. Resource Naming Conventions

# Follow consistent naming patterns
metadata:
  name: myapp-frontend-deployment    # app-component-type
  labels:
    app.kubernetes.io/name: myapp
    app.kubernetes.io/component: frontend
    app.kubernetes.io/version: "1.2.3"
    app.kubernetes.io/managed-by: kubectl

Exam-Specific Strategies

1. Speed Optimization Techniques

# Use generators + modifications instead of writing from scratch
kubectl create deployment web --image=nginx -o yaml --dry-run=client > base.yaml

# Edit efficiently:
# 1. Remove unnecessary fields (creationTimestamp, status, etc.)
# 2. Add required specifications (resources, probes, etc.)
# 3. Validate quickly with --dry-run=client

2. Common Exam Patterns

# Multi-container pod with shared volume
apiVersion: v1
kind: Pod
metadata:
  name: multi-container
spec:
  containers:
  - name: producer
    image: busybox
    command: ["/bin/sh"]
    args: ["-c", "while true; do echo $(date) >> /shared/output.log; sleep 5; done"]
    volumeMounts:
    - name: shared
      mountPath: /shared
  - name: consumer
    image: busybox
    command: ["/bin/sh"]
    args: ["-c", "tail -f /shared/output.log"]
    volumeMounts:
    - name: shared
      mountPath: /shared
  volumes:
  - name: shared
    emptyDir: {}

3. Critical Validation Checklist

YAML syntax is valid (proper indentation, no tabs)
apiVersion matches the resource type
Required fields are present (name, image, etc.)
Label selectors match template labels
Resource references exist (ConfigMaps, Secrets, PVCs)
Namespace consistency across related resources
Security contexts are appropriate
Resource limits are specified


Conceptual Mastery Framework

Understanding the API Object Lifecycle

  1. Declaration: YAML defines desired state
  2. Submission: kubectl sends to API server
  3. Validation: API server validates syntax and permissions
  4. Storage: etcd stores the object definition
  5. Reconciliation: Controllers work to achieve desired state
  6. Status Updates: Controllers update status fields
  7. Monitoring: You observe actual vs desired state

YAML as Infrastructure Language

  • Declarative: Describe the end state, not the steps
  • Idempotent: Same input = same result
  • Composable: Complex systems built from simple primitives
  • Version-controlled: Infrastructure changes tracked over time
  • Portable: Same manifests work across clusters

Mastering YAML manifests means understanding Kubernetes as a declarative system where you define desired state and controllers work to achieve it. Every manifest is a conversation with the API server about how you want your infrastructure to look.

Last updated: 2025-08-26 20:00 UTC