CKA Road Trip: Kubernetes Networking — From the Ground Up

Kubernetes networking is confusing because there are multiple layers of "network" stacked on top of each other. Once you understand each layer and what it owns, it stops being magic.


Layer 0 — The Linux Host Network

Before Kubernetes exists, you have a Linux machine with a network interface:

controlplane node
  eth0: 172.30.1.2     ← the real IP of this machine
  lo:   127.0.0.1      ← loopback, local to this machine only

This is the node network. Machines talk to each other here. 172.30.1.2 is reachable from node01 at 172.30.2.2. Normal networking.


Layer 1 — Linux Network Namespaces

This is where containers come in. When a container starts, the kernel creates a network namespace for it. Think of a network namespace as a completely separate, isolated copy of the networking stack.

Inside a network namespace, a container gets:
- its own network interfaces
- its own IP address
- its own routing table
- its own loopback (127.0.0.1)

The container has no idea the host network exists. Its localhost is its own loopback, not the node's.

host network namespace          container network namespace
  eth0: 172.30.1.2               eth0: 192.168.1.5  ← pod IP
  lo:   127.0.0.1                lo:   127.0.0.1     ← container's OWN loopback

These are two completely separate localhosts. This is the source of most networking confusion.


Layer 2 — The veth Pair (The Wire)

A container in its own namespace can't talk to anything. It needs a wire connecting it to the outside world.

That wire is a veth pair — two virtual network interfaces connected like a cable. What goes in one end comes out the other.

host side                    container side
  veth_abc123  ←──────────→  eth0 (inside container)
  (on the bridge)              (pod IP: 192.168.1.5)

The host end plugs into a bridge (think: a virtual network switch). The container end is the pod's eth0. Every pod gets one veth pair.
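The cross-wired behaviour can be sketched as two queues connected back to back. This is purely illustrative (the kernel does not implement veth this way); the class and variable names are made up:

```python
from collections import deque

class VethEnd:
    """One end of a toy veth pair: sending on one end delivers to the peer."""
    def __init__(self):
        self.inbox = deque()
        self.peer = None

    def send(self, frame):
        self.peer.inbox.append(frame)   # whatever goes in one end...

    def recv(self):
        return self.inbox.popleft()     # ...comes out the other

def veth_pair():
    """Create two ends wired to each other, like `ip link add ... type veth`."""
    host_end, container_end = VethEnd(), VethEnd()
    host_end.peer, container_end.peer = container_end, host_end
    return host_end, container_end

host_end, pod_eth0 = veth_pair()
pod_eth0.send("packet from 192.168.1.5")
print(host_end.recv())  # packet from 192.168.1.5
```

The key property is symmetry: neither end is special, they are just two names for the same wire.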


Layer 3 — The Bridge (The Switch)

The bridge connects all the veth pairs on a node. Pods on the same node talk through the bridge.

         bridge (cni0: 192.168.1.1)
         /            \
   veth_pod_A      veth_pod_B
       |                |
   pod A             pod B
192.168.1.2      192.168.1.3

Pod A pings Pod B:
1. Pod A sends a packet to 192.168.1.3
2. It goes through veth_pod_A to the bridge
3. The bridge forwards it to veth_pod_B
4. Pod B receives it

No iptables, no routing — pure L2 switching on the same node.
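That L2 switching is just MAC learning plus a forwarding-table lookup. A minimal sketch of a learning switch, with made-up port names and MAC addresses:

```python
# Toy model of the node-local bridge: a learning switch mapping MAC
# addresses to ports (the host-side veth ends). Addresses illustrative.
fdb = {}  # forwarding database: MAC -> port

def bridge_receive(in_port: str, src_mac: str, dst_mac: str) -> str:
    """Learn which port the sender is on, then forward toward the destination."""
    fdb[src_mac] = in_port          # MAC learning: remember where src lives
    out_port = fdb.get(dst_mac)
    return out_port if out_port else "flood"  # unknown destination: flood all ports

# Pod B has spoken before, so the bridge has learned its port:
bridge_receive("veth_pod_B", "mac:B", "mac:A")
# Now Pod A sends to Pod B:
print(bridge_receive("veth_pod_A", "mac:A", "mac:B"))  # veth_pod_B
```

A real Linux bridge does the same thing in the kernel, with ageing timers on the learned entries.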


Layer 4 — Pod IPs (Cross-Node)

Pods on different nodes need to reach each other too. The CNI plugin (Flannel, Cilium, Calico) handles this.

Each node gets a block of pod IPs:

controlplane: pods get 192.168.0.0/24
node01:       pods get 192.168.1.0/24

Cross-node traffic goes through the CNI plugin — either encapsulated in a tunnel (Flannel VXLAN) or routed directly (Calico BGP). The pod doesn't know or care. It just sends to the destination pod IP and the CNI handles getting it there.

Key point: every pod in the cluster gets a unique IP. Any pod can reach any other pod directly by IP — no NAT, no port mapping needed. This is the Kubernetes networking model.
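Using the example CIDRs above, the node-to-pod-IP mapping can be checked with Python's standard ipaddress module:

```python
import ipaddress

# Per-node pod CIDR blocks, taken from the example above.
node_pod_cidrs = {
    "controlplane": ipaddress.ip_network("192.168.0.0/24"),
    "node01":       ipaddress.ip_network("192.168.1.0/24"),
}

def node_for_pod_ip(pod_ip: str) -> str:
    """Return which node's pod CIDR block a given pod IP falls into."""
    ip = ipaddress.ip_address(pod_ip)
    for node, cidr in node_pod_cidrs.items():
        if ip in cidr:
            return node
    raise LookupError(f"{pod_ip} is not in any node's pod CIDR")

print(node_for_pod_ip("192.168.0.2"))  # controlplane
print(node_for_pod_ip("192.168.1.3"))  # node01
```

This is essentially the routing decision a CNI plugin makes: destination pod IP in my block means deliver locally via the bridge, otherwise tunnel or route to the owning node.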


Layer 5 — Services and ClusterIP

Pod IPs are ephemeral. A pod dies, its IP is gone. A new pod gets a new IP. You can't hardcode pod IPs.

A Service gives you a stable IP that never changes. It's called a ClusterIP.

nginx-service   ClusterIP: 10.96.45.123:80
              load balances to:
                pod-A: 192.168.1.2:80
                pod-B: 192.168.1.3:80

But here's the thing — the ClusterIP is not assigned to any interface. You can't ping it. It exists only as iptables rules written by kube-proxy on every node (assuming kube-proxy's default iptables mode; IPVS mode achieves the same effect differently).

When a pod sends traffic to 10.96.45.123:80:
1. The packet hits the iptables KUBE-SERVICES chain
2. iptables randomly picks a pod IP (load balancing)
3. DNAT rewrites the destination to that pod IP
4. The packet routes normally to the pod

The ClusterIP is just a hook in iptables. That's it.
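A minimal sketch of that DNAT step, using the example service and pods from above. This models the rule's effect, not real iptables:

```python
import random

# Toy model of kube-proxy's iptables-mode load balancing. The ClusterIP
# is not a real interface, just a match-and-rewrite rule. Endpoints are
# the example pods from above.
services = {
    ("10.96.45.123", 80): [("192.168.1.2", 80), ("192.168.1.3", 80)],
}

def dnat(dst_ip: str, dst_port: int) -> tuple:
    """Rewrite a ClusterIP destination to a randomly chosen pod endpoint."""
    endpoints = services.get((dst_ip, dst_port))
    if endpoints is None:
        return (dst_ip, dst_port)     # not a service: route unchanged
    return random.choice(endpoints)   # iptables picks a backend probabilistically

print(dnat("10.96.45.123", 80))  # one of the two pod endpoints
```

After the rewrite, the packet is just ordinary pod-to-pod traffic and follows the layers already described.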


Layer 6 — DNS

Nobody remembers 10.96.45.123. DNS maps service names to ClusterIPs.

Every pod has /etc/resolv.conf pointing at CoreDNS:

nameserver 10.96.0.10      ← CoreDNS ClusterIP
search default.svc.cluster.local svc.cluster.local cluster.local

When a pod does curl nginx-service:
1. DNS lookup: nginx-service → resolves to nginx-service.default.svc.cluster.local
2. CoreDNS returns 10.96.45.123
3. The pod sends to 10.96.45.123
4. iptables DNAT → pod IP
5. The packet reaches the pod
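The search-path expansion in step 1 can be sketched like this. The zone entry and lookup order are a simplified model, not CoreDNS's actual implementation:

```python
# Toy model of how the search path in a pod's /etc/resolv.conf expands a
# short name. Kubernetes sets "options ndots:5" by default, so a name with
# fewer than five dots is tried with each search domain first, and only
# then as-is. The zone data below is illustrative.
SEARCH = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]

# Pretend CoreDNS zone: only the fully qualified service name exists.
ZONE = {"nginx-service.default.svc.cluster.local": "10.96.45.123"}

def resolve(name: str) -> str:
    """Try each search-domain expansion, then the literal name."""
    candidates = [f"{name}.{d}" for d in SEARCH] + [name]
    for fqdn in candidates:
        if fqdn in ZONE:
            return ZONE[fqdn]
    raise LookupError(name)

print(resolve("nginx-service"))  # 10.96.45.123
```

This is also why a bare service name only works for services in the pod's own namespace: the first search domain bakes in default.svc.cluster.local.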


The localhost confusion — fully explained

Now that you understand network namespaces:

controlplane node (host namespace)
  127.0.0.1:6443  ← apiserver listening here

pod on that node (its own namespace)
  127.0.0.1       ← the pod's OWN loopback
                    completely separate from the node's loopback

When you kubectl exec into a pod and run curl localhost:6443 — you are inside the pod's network namespace. Its localhost is its own loopback. The apiserver is on the node's loopback, which is a completely different network namespace. The packet never leaves the pod's namespace, never reaches the node, never reaches the apiserver.

To reach the apiserver from inside a pod:

# use the kubernetes service — this IS reachable from any pod
curl -k https://kubernetes.default.svc.cluster.local/healthz
# or
curl -k https://10.96.0.1/healthz   # kubernetes service ClusterIP

This works because 10.96.0.1 is a ClusterIP — iptables on the node rewrites it to the apiserver's actual IP and port.


The Full Picture

                    ┌───────────────────────────────────┐
                    │         controlplane node         │
                    │  eth0: 172.30.1.2  (node IP)      │
                    │  lo:   127.0.0.1   (node loopback)│
                    │    apiserver: 127.0.0.1:6443      │
                    │    etcd:      127.0.0.1:2379      │
                    │                                   │
                    │  ┌─────────────┐                  │
                    │  │   pod A     │ ← own namespace  │
                    │  │ 192.168.0.2 │                  │
                    │  │ lo:127.0.0.1│ ← pod's loopback │
                    │  └──────┬──────┘                  │
                    │      veth pair                    │
                    │      bridge cni0                  │
                    └───────────────────────────────────┘
                              │ cross-node via CNI
                    ┌───────────────────────────────────┐
                    │           node01                  │
                    │  eth0: 172.30.2.2                 │
                    │  ┌─────────────┐                  │
                    │  │   pod B     │                  │
                    │  │ 192.168.1.3 │                  │
                    │  └─────────────┘                  │
                    └───────────────────────────────────┘

services (ClusterIP) — exist only as iptables rules, on every node
DNS (CoreDNS)        — a pod, reachable via its own ClusterIP 10.96.0.10

The Rules Worth Memorising

Pod to pod (same node): direct via bridge — no NAT, no routing.

Pod to pod (different node): CNI handles it — pod just sends to the pod IP.

Pod to service: iptables DNAT on the node — ClusterIP gets rewritten to a pod IP.

Pod to apiserver: use kubernetes.default.svc.cluster.local — never localhost.

Node to component: localhost works — you're on the same host.

localhost inside a container: the container's own loopback only — nothing else.
