CKA Road Trip: Node NotReady + etcd Backup
Two tasks, one exercise.
Part 1: Node NotReady
k describe node controlplane
# Conditions:
# Ready Unknown NodeStatusUnknown Kubelet stopped posting node status.
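The condition message is the signal. "Kubelet stopped posting node status" means the control plane has stopped receiving heartbeats from that node's kubelet. In an exam scenario the usual cause is a stopped kubelet service, so check that first.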
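To read just that message without scanning the full describe output, a JSONPath query can pull the Ready condition directly. A minimal sketch: the function name ready_message is illustrative, and k is assumed to be an alias for kubectl.

```shell
# Print the message of a node's Ready condition.
# $1: node name (for example: controlplane)
ready_message() {
  kubectl get node "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].message}'
}

# Usage, on a machine with cluster access:
#   ready_message controlplane
```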
ssh controlplane
systemctl status kubelet
# Active: inactive (dead)
systemctl start kubelet
systemctl status kubelet
# Active: active (running)
exit
k get nodes
# controlplane Ready
The kubelet was stopped. Start it, node recovers.
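If the kubelet should also come back after a reboot, or the start fails, the same fix can be wrapped a little more defensively. This is a sketch under assumptions, not part of the original exercise: ensure_kubelet is a hypothetical helper name, and it is meant to run as root on the affected node.

```shell
# Start the kubelet if it is not running, and enable it at boot.
ensure_kubelet() {
  if systemctl is-active --quiet kubelet; then
    echo "kubelet already running"
    return 0
  fi
  # enable --now both starts the unit and enables it for future boots.
  if systemctl enable --now kubelet; then
    echo "kubelet started"
  else
    # Show recent log lines to find out why the start failed.
    journalctl -u kubelet --no-pager -n 20
    return 1
  fi
}

# Usage, on the node:
#   ensure_kubelet
```

A failed start usually points at configuration rather than a stopped service, which is why the sketch falls through to the journal instead of retrying.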
Part 2: etcd Backup
Verify etcd is running first:
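One way to run that check, assuming a kubeadm-built cluster where etcd runs as a static pod labelled component=etcd in kube-system. The function name etcd_is_running is illustrative.

```shell
# Succeeds (exit 0) if at least one etcd pod in kube-system is in phase Running.
etcd_is_running() {
  kubectl -n kube-system get pods -l component=etcd --no-headers \
    -o custom-columns=NAME:.metadata.name,PHASE:.status.phase |
    awk '$2 == "Running" { found = 1 } END { exit !found }'
}

# Usage:
#   etcd_is_running && echo "etcd is up"
```

If the API server itself is unreachable, kubectl cannot answer; listing containers on the node with crictl ps --name etcd is an alternative worth trying in that case.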
Take the snapshot:
ETCDCTL_API=3 etcdctl \
--endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/apiserver-etcd-client.crt \
--key=/etc/kubernetes/pki/apiserver-etcd-client.key \
snapshot save /opt/cluster_backup.db > backup.txt 2>&1
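All three TLS files (CA certificate, client certificate, client key) are required: on a kubeadm cluster etcd enforces client certificate authentication and rejects connections that lack them. Find them at: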
/etc/kubernetes/pki/etcd/ca.crt
/etc/kubernetes/pki/apiserver-etcd-client.crt
/etc/kubernetes/pki/apiserver-etcd-client.key
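If the paths are hard to recall, they can be read back from the API server's own configuration, since the API server connects to etcd with the same client files. A sketch, assuming a kubeadm layout with the static pod manifest at /etc/kubernetes/manifests/kube-apiserver.yaml; etcd_client_flags is a hypothetical helper name.

```shell
# Print the etcd-related flags from a kube-apiserver static pod manifest.
# $1: manifest path (assumed kubeadm default:
#     /etc/kubernetes/manifests/kube-apiserver.yaml)
etcd_client_flags() {
  grep -oE -e '--etcd-(cafile|certfile|keyfile|servers)=[^[:space:]]+' "$1"
}

# Usage, on the control plane node:
#   etcd_client_flags /etc/kubernetes/manifests/kube-apiserver.yaml
```

The --etcd-cafile, --etcd-certfile and --etcd-keyfile values map onto etcdctl's --cacert, --cert and --key options respectively.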
> backup.txt 2>&1 redirects both stdout and stderr to backup.txt. Without the > before backup.txt, etcdctl treats backup.txt as a second positional argument and fails with the error "snapshot save expects one argument".
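The redirect behaviour is plain shell and can be tried without a cluster. In this sketch emit_both is a hypothetical stand-in for any CLI that writes to both streams; it is not etcdctl.

```shell
# Stand-in for a CLI that writes to both streams (hypothetical, not etcdctl).
emit_both() {
  echo "result on stdout"
  echo "log line on stderr" >&2
}

workdir=$(mktemp -d)

# Same form as in the snapshot command: stdout goes to the file,
# then stderr is pointed at wherever stdout now goes (the file).
emit_both > "$workdir/both.txt" 2>&1

# Reversed order: stderr is pointed at the current stdout (the terminal)
# before stdout is redirected, so only stdout lands in the file.
emit_both 2>&1 > "$workdir/stdout_only.txt"

# Compare how many lines each file captured.
wc -l < "$workdir/both.txt"
wc -l < "$workdir/stdout_only.txt"
```

Order matters: 2>&1 has to come after the > file redirect for the file to capture both streams.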
The Diagnostic Chain
node NotReady
↓
k describe node → "Kubelet stopped posting node status"
↓
ssh into node
↓
systemctl status kubelet → inactive
↓
systemctl start kubelet
↓
node Ready
"Kubelet stopped posting node status" points straight at the kubelet. Check the service on the node first, and look elsewhere only if it turns out to be running.