CKA Study Guide: kubeadm Cluster Installation
category: Kubernetes Certification
tags: cka, kubernetes, exam, kubectl, certification
The Fundamental Problem kubeadm Solves
Before kubeadm, installing Kubernetes was a nightmare of manual configuration, certificate management, and low-level system administration. You had to:
- Generate and distribute TLS certificates manually
- Configure etcd cluster with proper certificates
- Set up the API server with correct flags and certificates
- Bootstrap the kubelet on each node with proper authentication
- Configure networking, DNS, and service discovery
- Handle certificate rotation and cluster lifecycle
kubeadm abstracts this complexity into a simple, opinionated workflow while maintaining production-grade security and best practices.
Why NOT Use Managed Services for Learning?
While cloud providers offer managed Kubernetes (EKS, GKE, AKS), understanding kubeadm is crucial because:
- Debugging skills: When things go wrong, you need to understand the underlying components
- Cost management: Self-managed clusters can be significantly cheaper
- Compliance requirements: Some organizations require on-premises or specific configurations
- Educational value: Understanding how the sausage is made makes you a better Kubernetes operator
- Career advancement: Many companies run self-managed clusters
Understanding Kubernetes Cluster Architecture
The Control Plane Components
Before diving into kubeadm, you must understand what you're actually building:
etcd: The distributed key-value store that holds all cluster state. Everything else is stateless and can be rebuilt from etcd data.
kube-apiserver: The REST API frontend that validates and stores resources in etcd. Everything talks to the cluster through this component.
kube-controller-manager: Runs the control loops that watch cluster state and make changes to achieve desired state (deployment scaling, node health management, etc.).
kube-scheduler: Decides which nodes should run which pods based on resource requirements, constraints, and policies.
kube-proxy: Runs on each node and implements service networking (load balancing to pod endpoints).
kubelet: The node agent that manages pods, containers, and communicates with the API server.
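On a working kubeadm cluster you can see most of these components directly; a quick check, assuming kubectl is already configured:
# Control plane components run as static pods; kube-proxy runs as a DaemonSet, CoreDNS as a Deployment
kubectl get pods -n kube-system -o wide
# Typical names on a kubeadm cluster: etcd-<node>, kube-apiserver-<node>,
# kube-controller-manager-<node>, kube-scheduler-<node>, kube-proxy-xxxxx, coredns-xxxxx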
Why This Architecture Matters
This design implements several critical patterns:
- Separation of concerns: Each component has a single responsibility
- Declarative state: Everything is stored as desired state, controllers reconcile to that state
- API-driven: All interactions go through the API server, enabling consistent access control and auditing
- Horizontal scaling: Multiple instances of stateless components for high availability
- Pluggable: Components can be replaced or enhanced (different CNI, CRI, storage)
How kubeadm Approaches the Bootstrapping Problem
The Bootstrap Paradox
How do you start a Kubernetes cluster when Kubernetes manages itself? This is a chicken-and-egg problem:
- kubelet needs to know about the API server to get pod specs
- API server needs to be running for kubelet to communicate with it
- But API server itself runs as pods managed by kubelet
kubeadm's Solution: Static Pods
kubeadm solves this with static pods - pods that kubelet manages directly from local files without requiring an API server:
- Phase 1: kubelet starts and finds static pod manifests in /etc/kubernetes/manifests/
- Phase 2: kubelet starts control plane components (API server, controller-manager, scheduler) as static pods
- Phase 3: Once API server is running, kubeadm uses it to configure the rest of the cluster
- Phase 4: Install networking, DNS, and other cluster components through the API
This bootstrapping approach is elegant because it uses Kubernetes to manage Kubernetes, but doesn't require Kubernetes to already be running.
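You can see this mechanism on any kubeadm control plane node: the manifests live on local disk, and kubelet watches the directory.
# Static pod manifests that kubelet manages directly (control plane node)
ls /etc/kubernetes/manifests/
# etcd.yaml  kube-apiserver.yaml  kube-controller-manager.yaml  kube-scheduler.yaml
# Removing or editing a file here stops or restarts that component - no API server required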
The kubeadm Workflow Philosophy
kubeadm follows these principles:
- Minimal and opinionated: Provides a working cluster with sensible defaults
- Composable: Can be used as part of larger automation
- Production-ready: Implements security best practices by default
- Extensible: Allows customization through configuration files and phases
Prerequisites and System Requirements
Hardware Requirements
# Minimum requirements for each node:
# - 2 CPUs (for control plane nodes)
# - 2GB RAM
# - Network connectivity between nodes
# - Unique hostname, MAC address, and product_uuid for each node
# Check system requirements
lscpu | grep "CPU(s)" # Check CPU count
free -h # Check memory
ip link # Check network interfaces
cat /sys/class/dmi/id/product_uuid # Check product UUID
Why These Requirements Exist
2 CPUs minimum: The control plane components (especially etcd) are CPU-intensive during cluster operations. Single CPU nodes will be slow and potentially unstable.
2GB RAM minimum: Control plane components have memory overhead, plus space for system pods, networking, and monitoring.
Unique identifiers: Kubernetes uses these to distinguish nodes. Cloned VMs often have identical values, causing cluster issues.
Network Requirements
# Required ports for control plane:
# 6443: kube-apiserver
# 2379-2380: etcd server client API
# 10250: kubelet API
# 10259: kube-scheduler
# 10257: kube-controller-manager
# Check if ports are available
netstat -tlnp | grep :6443
ss -tlnp | grep :2379
# Required ports for worker nodes:
# 10250: kubelet API
# 30000-32767: NodePort services
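If a host firewall is active, these ports must be reachable between nodes. A minimal sketch using ufw (assuming an Ubuntu host with ufw; adapt to firewalld or cloud security groups as needed):
# Control plane node
ufw allow 6443/tcp        # kube-apiserver
ufw allow 2379:2380/tcp   # etcd client/peer
ufw allow 10250/tcp       # kubelet API
ufw allow 10257/tcp       # kube-controller-manager
ufw allow 10259/tcp       # kube-scheduler
# Worker nodes
ufw allow 10250/tcp       # kubelet API
ufw allow 30000:32767/tcp # NodePort services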
Container Runtime Setup
Kubernetes needs a CRI-compatible container runtime. Docker Engine is no longer supported directly (the dockershim was removed in v1.24); containerd is the recommended choice:
# Install containerd
apt-get update
apt-get install -y containerd
# Configure containerd for Kubernetes
mkdir -p /etc/containerd
containerd config default | tee /etc/containerd/config.toml
# Enable systemd cgroup driver (required for kubelet)
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
# Restart containerd
systemctl restart containerd
systemctl enable containerd
Why systemd cgroup driver? kubelet and the container runtime must use the same cgroup driver to manage resource limits. systemd is the standard on most Linux distributions.
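A quick way to confirm both sides agree on the driver (a sketch; kubeadm v1.22+ also defaults the kubelet to the systemd driver):
# Confirm containerd picked up the change
containerd config dump | grep SystemdCgroup
# SystemdCgroup = true
# After the cluster is up, the kubelet side is visible in its config file
grep cgroupDriver /var/lib/kubelet/config.yaml
# cgroupDriver: systemd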
System Configuration
# Disable swap (Kubernetes requires this)
swapoff -a
sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
# Load required kernel modules
cat <<EOF | tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
modprobe overlay
modprobe br_netfilter
# Configure sysctl for networking
cat <<EOF | tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
sysctl --system
Why disable swap? kubelet's memory management assumes predictable memory allocation. Swap can cause unpredictable performance and OOM behavior.
Why these kernel modules?
- overlay: Required for container filesystems
- br_netfilter: Required for bridge networking and iptables rules
Why these sysctl settings?
- Bridge netfilter: Allows iptables to process bridged traffic (pod-to-pod networking)
- IP forwarding: Required for routing between pods and services
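Verify the modules and settings actually took effect before continuing:
# Confirm kernel modules are loaded
lsmod | grep -E "overlay|br_netfilter"
# Confirm the sysctl values are applied
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward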
Installing kubeadm, kubelet, and kubectl
Package Installation
# Add Kubernetes apt repository
apt-get update
apt-get install -y apt-transport-https ca-certificates curl
mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.28/deb/Release.key | gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.28/deb/ /' | tee /etc/apt/sources.list.d/kubernetes.list
# Install specific versions (important for cluster consistency)
apt-get update
apt-get install -y kubelet='1.28.0-*' kubeadm='1.28.0-*' kubectl='1.28.0-*'
# Prevent automatic updates
apt-mark hold kubelet kubeadm kubectl
# Start kubelet (it will fail until cluster is initialized)
systemctl enable --now kubelet
Why Version Pinning Matters
Cluster consistency: All nodes should run the same kubelet version to avoid compatibility issues.
Controlled upgrades: Kubernetes has a strict upgrade path (N to N+1 minor versions only). Automatic updates can break this.
API compatibility: kubectl should be within one minor version of the API server to ensure full compatibility.
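To see what the repository offers and confirm that versions match across nodes (a quick sketch):
# List available versions before pinning
apt-cache madison kubeadm | head -5
# Confirm installed versions are consistent on every node
kubeadm version -o short
kubectl version --client
kubelet --version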
Understanding the Components
kubeadm: The cluster bootstrapping tool. Only used during installation and upgrades.
kubelet: The node agent that runs on every node. Manages pods and communicates with the API server.
kubectl: The CLI client for interacting with the cluster. Can be installed on any machine with network access to the API server.
Initializing the Control Plane
Basic Cluster Initialization
# Initialize control plane with specific pod subnet
kubeadm init --pod-network-cidr=10.244.0.0/16
# Alternative with custom API server address
kubeadm init \
--apiserver-advertise-address=192.168.1.100 \
--pod-network-cidr=10.244.0.0/16 \
--service-cidr=10.96.0.0/12
Understanding the Parameters
--pod-network-cidr: Defines the IP range for pod networking. Must not overlap with node or service networks. Different CNI plugins have different requirements:
- Flannel: typically uses 10.244.0.0/16
- Calico: flexible, often 192.168.0.0/16
- Weave: typically 10.32.0.0/12
--apiserver-advertise-address: The IP address the API server advertises to other cluster members. Critical for multi-node clusters and load balancers.
--service-cidr: IP range for cluster services (ClusterIP). Default is 10.96.0.0/12, which provides ~1M service IPs.
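If you want to preview the effect of these flags without touching the node, kubeadm supports a dry run:
# Print the manifests and certificates kubeadm would generate, without changing the host
kubeadm init --pod-network-cidr=10.244.0.0/16 --dry-run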
What kubeadm init Actually Does
- Preflight checks: Validates system requirements, ports, container runtime
- Certificate generation: Creates CA and component certificates with proper SANs
- Control plane static pods: Generates manifests for API server, controller-manager, scheduler
- etcd setup: Initializes etcd cluster (local or external)
- kubectl configuration: Sets up admin kubeconfig
- Bootstrap tokens: Creates tokens for node joining
- Add-ons: Installs CoreDNS and kube-proxy
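These steps correspond to kubeadm phases, which can also be run or repeated individually - useful for debugging a failed init or building custom automation:
# List all init phases
kubeadm init phase --help
# Run individual phases, e.g. regenerate certificates or control plane manifests
kubeadm init phase certs all
kubeadm init phase control-plane all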
The Generated Certificates
kubeadm creates a complete PKI infrastructure:
# Certificate files location
ls -la /etc/kubernetes/pki/
# Key certificates:
# ca.crt/ca.key - Cluster CA (signs all other certs)
# apiserver.crt/apiserver.key - API server TLS cert
# apiserver-kubelet-client.crt/key - API server to kubelet authentication
# front-proxy-ca.crt/key - Front proxy CA
# etcd/ca.crt/key - etcd CA (if using local etcd)
# sa.pub/sa.key - Service account token signing keys
Why so many certificates? Each component needs its own identity and encryption. This implements zero-trust networking where every connection is authenticated and encrypted.
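kubeadm also tracks expiration for these certificates; two quick checks:
# Show expiration of all kubeadm-managed certificates
kubeadm certs check-expiration
# Inspect a single certificate's subject and SANs (OpenSSL 1.1.1+ for -ext)
openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -subject -ext subjectAltName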
Setting Up kubectl Access
# Copy admin config (as root)
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
# Verify cluster status
kubectl cluster-info
kubectl get nodes
kubectl get pods -n kube-system
Understanding and Installing Container Network Interface (CNI)
Why CNI is Required
After kubeadm init, nodes show "NotReady" status because there's no pod networking:
kubectl get nodes
# NAME STATUS ROLES AGE VERSION
# control-plane NotReady control-plane 5m v1.28.0
Kubernetes defines the networking model but doesn't implement it. CNI plugins provide:
- Pod-to-pod networking across nodes
- Network policies for security
- Service load balancing (in cooperation with kube-proxy)
Installing Flannel (Simple Example)
# Install Flannel CNI (manifest is now published by the flannel-io project)
kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
# Watch nodes become ready
kubectl get nodes -w
Installing Calico (Production Example)
# Install Calico operator
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/tigera-operator.yaml
# Configure Calico with custom pod CIDR
cat <<EOF | kubectl apply -f -
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  calicoNetwork:
    ipPools:
    - blockSize: 26
      cidr: 10.244.0.0/16
      encapsulation: VXLANCrossSubnet
      natOutgoing: Enabled
      nodeSelector: all()
EOF
CNI Plugin Comparison
Flannel:
- Pros: Simple, minimal configuration, stable
- Cons: Limited network policy support, basic routing
- Use case: Development, simple clusters
Calico:
- Pros: Rich network policies, BGP routing, performance
- Cons: More complex, requires understanding of networking
- Use case: Production, security-focused environments
Weave:
- Pros: Built-in encryption, automatic mesh networking
- Cons: Performance overhead, less actively maintained
- Use case: Secure environments, multi-cloud
Verifying CNI Installation
# Check that nodes are Ready
kubectl get nodes
# Verify CNI pods are running
kubectl get pods -n kube-system | grep -E "(flannel|calico|weave)"
# Test pod networking
kubectl run test-pod --image=nginx --restart=Never
kubectl get pod test-pod -o wide # Note the pod IP
kubectl exec test-pod -- ip addr show eth0
Adding Worker Nodes
The Join Process
During kubeadm init, you get a join command:
kubeadm join 192.168.1.100:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:1234567890abcdef...
Understanding the Join Parameters
--token: A bootstrap token that allows the node to authenticate with the API server during the join process. Tokens are time-limited (24 hours by default).
--discovery-token-ca-cert-hash: A hash of the cluster CA certificate. This prevents man-in-the-middle attacks during bootstrap.
--discovery-token-unsafe-skip-ca-verification: Alternative to CA hash for testing (never use in production).
What Happens During Node Join
- Token validation: New node presents token to API server
- CA verification: Node verifies it's talking to the correct cluster
- TLS bootstrap: Node requests client certificate from API server
- kubelet registration: Node registers itself with the API server
- Pod scheduling: Node becomes available for pod scheduling
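The TLS bootstrap step is visible from the control plane while a node joins:
# Watch the joining node's certificate signing request and registration
kubectl get csr
kubectl get nodes
# Client CSRs created during TLS bootstrap are auto-approved by the built-in approver;
# kubelet serving-certificate CSRs need manual approval if serverTLSBootstrap is enabled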
Managing Join Tokens
# List existing tokens
kubeadm token list
# Create new token (if original expired)
kubeadm token create
# Create token with custom TTL
kubeadm token create --ttl 2h
# Get CA cert hash for join command
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
# Generate complete join command
kubeadm token create --print-join-command
Worker Node Setup
# On worker node: Install container runtime and kubelet (same as control plane)
# Then join the cluster
kubeadm join 192.168.1.100:6443 --token abc123.xyz789 \
--discovery-token-ca-cert-hash sha256:hash...
# Verify from control plane
kubectl get nodes
kubectl describe node worker-node-1
Advanced Configuration with kubeadm
Using Configuration Files
For complex setups, use configuration files instead of command-line flags:
# kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.1.100
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
  kubeletExtraArgs:
    node-labels: "environment=production,zone=us-west-1a"
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.28.0
clusterName: production-cluster
controlPlaneEndpoint: "k8s-api.company.com:6443"
networking:
  serviceSubnet: "10.96.0.0/12"
  podSubnet: "10.244.0.0/16"
  dnsDomain: "cluster.local"
apiServer:
  extraArgs:
    audit-log-maxage: "30"
    audit-log-maxbackup: "10"
    audit-log-maxsize: "100"
    audit-log-path: "/var/log/audit.log"
  extraVolumes:
  - name: audit-log
    hostPath: "/var/log"
    mountPath: "/var/log"
    readOnly: false
    pathType: DirectoryOrCreate
controllerManager:
  extraArgs:
    bind-address: "0.0.0.0"
scheduler:
  extraArgs:
    bind-address: "0.0.0.0"
etcd:
  local:
    dataDir: "/var/lib/etcd"
    extraArgs:
      listen-metrics-urls: "http://0.0.0.0:2381"
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
serverTLSBootstrap: true
rotateCertificates: true
# Initialize with config file
kubeadm init --config=kubeadm-config.yaml
Why Configuration Files Matter
Version control: Cluster configuration can be stored in git and reviewed
Repeatability: Identical clusters can be created consistently
Complexity management: Advanced configurations become manageable
Documentation: Serves as documentation of cluster setup
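Rather than writing the file from scratch, kubeadm can emit its defaults as a starting point (a sketch):
# Generate default configuration to edit and commit to version control
kubeadm config print init-defaults > kubeadm-config.yaml
kubeadm config print init-defaults --component-configs KubeletConfiguration
# Validate a config file before using it (available in newer kubeadm releases)
kubeadm config validate --config kubeadm-config.yaml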
External etcd Setup
For production high availability, use external etcd:
# In ClusterConfiguration
etcd:
  external:
    endpoints:
    - https://etcd1.company.com:2379
    - https://etcd2.company.com:2379
    - https://etcd3.company.com:2379
    caFile: /etc/etcd/ca.crt
    certFile: /etc/etcd/etcd-client.crt
    keyFile: /etc/etcd/etcd-client.key
Why external etcd?
- Isolation: etcd failures don't affect control plane components
- Scaling: etcd can be scaled independently
- Backup/restore: Easier to manage etcd lifecycle
- Performance: Dedicated resources for the most critical component
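Before running kubeadm init against external etcd, confirm the endpoints are healthy from the control plane node (a sketch reusing the certificate paths from the configuration above):
ETCDCTL_API=3 etcdctl endpoint health \
  --endpoints=https://etcd1.company.com:2379,https://etcd2.company.com:2379,https://etcd3.company.com:2379 \
  --cacert=/etc/etcd/ca.crt \
  --cert=/etc/etcd/etcd-client.crt \
  --key=/etc/etcd/etcd-client.key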
Troubleshooting kubeadm Installations
Common Initialization Failures
Port conflicts:
# Check if required ports are in use
netstat -tlnp | grep -E ":(6443|2379|2380|10250|10259|10257)"
# Kill processes using required ports
lsof -ti:6443 | xargs kill -9
Container runtime issues:
# Check containerd status
systemctl status containerd
# Check container runtime detection
crictl info
# Verify kubelet can communicate with runtime
journalctl -u kubelet -f
Certificate issues:
# Check certificate validity
openssl x509 -in /etc/kubernetes/pki/ca.crt -text -noout
# Regenerate certificates if needed
kubeadm certs renew all
Node Join Failures
Token expiration:
# Check token status
kubeadm token list
# Create new token
kubeadm token create --print-join-command
Network connectivity:
# Test connectivity to API server
telnet 192.168.1.100 6443
# Check firewall rules
iptables -L INPUT -n | grep 6443
CA hash mismatch:
# Regenerate correct hash
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
kubelet Troubleshooting
The kubelet logs are crucial for diagnosing issues:
# Check kubelet status
systemctl status kubelet
# Follow kubelet logs
journalctl -u kubelet -f
# Check kubelet configuration
cat /var/lib/kubelet/config.yaml
# Check kubelet's own health endpoint (plain HTTP on localhost, port 10248)
curl http://localhost:10248/healthz
# API server connectivity problems show up in the kubelet logs above
Common Log Messages and Solutions
"failed to run Kubelet: unable to load bootstrap kubeconfig"
- Solution: Regenerate bootstrap tokens and kubeconfig
"node not found"
- Solution: Check node registration and API server connectivity
"pod sandbox changed, it will be killed and re-created"
- Solution: Usually normal during networking setup, but check CNI
"failed to create pod sandbox"
- Solution: Check container runtime and CNI configuration
Cluster Validation and Health Checks
Comprehensive Cluster Testing
# Check all nodes are ready
kubectl get nodes -o wide
# Verify system pods
kubectl get pods -n kube-system
# Test DNS resolution
kubectl run dnsutils --image=gcr.io/kubernetes-e2e-test-images/dnsutils:1.3 --restart=Never
kubectl exec dnsutils -- nslookup kubernetes.default
kubectl delete pod dnsutils
# Test service connectivity
kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --port=80 --type=ClusterIP
kubectl run test --image=busybox --restart=Never -- wget -qO- nginx
kubectl delete deployment nginx
kubectl delete service nginx
kubectl delete pod test
# Check cluster info
kubectl cluster-info
kubectl cluster-info dump > cluster-dump.txt
Performance and Resource Validation
# Check resource allocation
kubectl top nodes
kubectl top pods -A
# Verify scheduler is placing pods
kubectl get events --sort-by=.metadata.creationTimestamp
# Test horizontal scaling
kubectl create deployment test-scale --image=nginx
kubectl scale deployment test-scale --replicas=3
kubectl get pods -l app=test-scale -o wide
kubectl delete deployment test-scale
Security Validation
# Check RBAC is working
kubectl auth can-i create pods
kubectl auth can-i create pods --as=system:anonymous
# Verify network policies (if using Calico/other NP-capable CNI)
kubectl get networkpolicies -A
# Check Pod Security Standards
kubectl get pods -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.securityContext}{"\n"}{end}'
Cluster Lifecycle Management
Backing Up the Cluster
etcd backup (most critical):
# Create etcd snapshot
ETCDCTL_API=3 etcdctl snapshot save backup.db \
--endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
--key=/etc/kubernetes/pki/etcd/healthcheck-client.key
# Verify backup
ETCDCTL_API=3 etcdctl snapshot status backup.db
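Restoring is the mirror image; a minimal sketch (the exact data directory and manifest handling depend on your setup):
# Stop the control plane's etcd by moving its static pod manifest out of the way,
# then restore the snapshot into a fresh data directory
mv /etc/kubernetes/manifests/etcd.yaml /tmp/
ETCDCTL_API=3 etcdctl snapshot restore backup.db --data-dir=/var/lib/etcd-restored
# Point the etcd manifest's hostPath volume at /var/lib/etcd-restored, then move it back
mv /tmp/etcd.yaml /etc/kubernetes/manifests/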
Certificate backup:
# Backup certificates and configs
tar -czf k8s-backup-$(date +%Y%m%d).tar.gz \
/etc/kubernetes/pki/ \
/etc/kubernetes/admin.conf \
/etc/kubernetes/kubelet.conf \
/etc/kubernetes/controller-manager.conf \
/etc/kubernetes/scheduler.conf
Node Maintenance
Draining nodes safely:
# Cordon node (prevent new pods)
kubectl cordon worker-node-1
# Drain node (move existing pods)
kubectl drain worker-node-1 --ignore-daemonsets --delete-emptydir-data
# Perform maintenance...
# Uncordon node (allow scheduling)
kubectl uncordon worker-node-1
Removing nodes:
# Drain and remove from cluster
kubectl drain worker-node-1 --ignore-daemonsets --force
kubectl delete node worker-node-1
# On the node itself: reset and clean up
kubeadm reset
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
ipvsadm -C # if using ipvs
Security Best Practices for kubeadm Clusters
Certificate Management
- Regular rotation: Use kubeadm certs renew all before certificates expire
- Secure storage: Protect private keys and backup certificates securely
- Monitoring: Set up alerts for certificate expiration
Network Security
# Ensure the API server is not exposed over plain HTTP
# (the legacy --insecure-port flag was removed in v1.24; drop it from any older manifests)
# Use network policies to restrict pod-to-pod communication
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: default
spec:
  podSelector: {}
  policyTypes:
  - Ingress
EOF
Access Control
- Disable default ServiceAccount auto-mount: Add automountServiceAccountToken: false
- Use dedicated ServiceAccounts: Don't use default SA for applications
- Implement proper RBAC: Follow principle of least privilege
- Regular audits: Review who has cluster-admin access
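A minimal sketch of least-privilege RBAC for an application (the names here are illustrative):
# Dedicated ServiceAccount with read-only access to pods in one namespace
kubectl create serviceaccount app-reader -n default
kubectl create role pod-reader --verb=get,list,watch --resource=pods -n default
kubectl create rolebinding app-reader-binding --role=pod-reader --serviceaccount=default:app-reader -n default
# Verify the binding does only what you expect
kubectl auth can-i list pods -n default --as=system:serviceaccount:default:app-reader
kubectl auth can-i delete pods -n default --as=system:serviceaccount:default:app-reader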
System Hardening
# Set up audit logging in API server
# Add to /etc/kubernetes/manifests/kube-apiserver.yaml:
# --audit-log-path=/var/log/audit.log
# --audit-policy-file=/etc/kubernetes/audit-policy.yaml
# Restrict kubelet permissions
# In kubelet config: authorization mode Webhook and anonymous auth disabled (kubeadm's defaults)
# Enable Pod Security Standards
# Add to API server: --enable-admission-plugins=PodSecurity
Exam Tips
Time Management
- Practice the full workflow: From fresh VMs to working cluster in under 30 minutes
- Use configuration files: Faster than remembering all command-line flags
- Know the common troubleshooting steps: Port conflicts, token expiration, CNI issues
Key Commands to Master
# Fast cluster setup
kubeadm init --pod-network-cidr=10.244.0.0/16
kubectl apply -f flannel.yaml
# Quick troubleshooting
journalctl -u kubelet -f
kubectl get pods -n kube-system
kubeadm token create --print-join-command
# Validation
kubectl get nodes
kubectl run test --image=nginx --restart=Never
kubectl delete pod test
Common Scenarios
- Initialize control plane with specific networking
- Add worker nodes to existing cluster
- Troubleshoot node join failures
- Verify cluster networking and DNS
- Backup and restore cluster state
Things to Remember
- Always check prerequisites (swap, ports, container runtime)
- CNI must match the pod-network-cidr specified during init
- Tokens expire - know how to generate new ones
- Node names must be unique and resolvable
- kubelet logs are your best friend for troubleshooting