kubectl + Linux Tools — Output Manipulation Reference

The Key Lesson From This Exercise

For any task that involves filtering, sorting, or formatting kubectl output — use standard Linux tools, not complex kubectl-specific patterns like --sort-by with jsonpath chains.

Why: awk, sort, tail, and grep are:

- On every Linux system
- Covered in HTB/OverTheWire, so you already know them
- Findable in any Linux cheatsheet
- Composable: combine them for any output format

The pattern is: kubectl get <resource> -o wide gives you the raw data → Linux tools shape it into whatever format you need.

The Exercise

Part I: Create a ClusterIP service named nginx-service, exposing nginx-deployment on port 8080 → targetPort 80.

Part II: Get pod IPs sorted ascending, save to pod_ips.txt with header:

IP_ADDRESS
192.168.1.4
192.168.1.5
192.168.1.6

Part I — ClusterIP Service (YAML approach)

Always use YAML for services — it's explicit, reviewable, and matches the exam pattern.

apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  type: ClusterIP
  selector:
    app: nginx          # must match pod labels exactly
  ports:
  - port: 8080          # service listens on 8080
    targetPort: 80      # routes to container port 80

kubectl apply -f nginx-service.yml
kubectl describe service nginx-service    # verify Endpoints are populated
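
If you are creating the manifest straight from the terminal, a quoted heredoc is one way to do it. This is only a sketch: it assumes the filename nginx-service.yml from the apply command above, and the kubectl step is left as a comment because it needs a live cluster.

```shell
# Assumed filename, taken from the kubectl apply command above.
manifest=nginx-service.yml

# Quoting 'EOF' stops the shell from expanding anything inside the YAML.
cat > "$manifest" << 'EOF'
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  type: ClusterIP
  selector:
    app: nginx
  ports:
  - port: 8080
    targetPort: 80
EOF

# Sanity check before applying: list every port-related line (case-insensitive).
grep -in "port" "$manifest"

# Then, against a live cluster:
# kubectl apply -f "$manifest"
```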

Selector Mismatch — The Most Common Mistake

In the raw notes, Endpoints: <none> appeared because the selector app.kubernetes.io/name: nginx didn't match the actual pod labels.

Before writing the service YAML, always check pod labels:

kubectl get pods --show-labels

Output:

NAME                              LABELS
nginx-deployment-5bdb4f8cbb-hl8w9  app=nginx,pod-template-hash=5bdb4f8cbb

The label is app: nginx, not app.kubernetes.io/name: nginx. Use app: nginx in the selector.

After applying, verify endpoints:

kubectl get endpoints nginx-service
# Should show: 192.168.1.4:80,192.168.1.5:80,192.168.1.6:80
# If <none>: selector still doesn't match
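
The label check can also be scripted with the same Linux tools. The sketch below runs on the sample --show-labels output from above rather than a live cluster; has_label is a hypothetical helper name, not a kubectl feature, and it relies on LABELS being the last column.

```shell
# Sample `kubectl get pods --show-labels` output, copied from above.
labels_output='NAME                              LABELS
nginx-deployment-5bdb4f8cbb-hl8w9  app=nginx,pod-template-hash=5bdb4f8cbb'

# has_label KEY=VALUE (hypothetical helper): reads --show-labels output on
# stdin and succeeds if any pod row carries exactly that label.
has_label() {
  tail -n +2 | awk -v want="$1" '
    {
      n = split($NF, labels, ",")        # LABELS is the last column
      for (i = 1; i <= n; i++) if (labels[i] == want) found = 1
    }
    END { exit !found }'
}

for selector in "app=nginx" "app.kubernetes.io/name=nginx"; do
  if printf '%s\n' "$labels_output" | has_label "$selector"; then
    echo "$selector: matches a pod"
  else
    echo "$selector: no pod has this label"
  fi
done
```

Against a real cluster the input would come from kubectl get pods --show-labels piped into has_label.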

Part II — Pod IPs Sorted, Saved to File

The Working Command

echo "IP_ADDRESS" > pod_ips.txt
kubectl get pods -o wide | tail -n +2 | awk '{print $6}' | sort >> pod_ips.txt
cat pod_ips.txt

Full Breakdown — Every Part

Step 1:

echo "IP_ADDRESS" > pod_ips.txt

Creates the file with just the header line. > overwrites (creates if missing).

Step 2:

kubectl get pods -o wide

Output:

NAME        READY  STATUS   RESTARTS  AGE  IP            NODE
pod-abc     1/1    Running  0         5m   192.168.1.4   node01
pod-def     1/1    Running  0         5m   192.168.1.5   node01
pod-ghi     1/1    Running  0         5m   192.168.1.6   node01

-o wide adds the IP and NODE columns (real output also shows NOMINATED NODE and READINESS GATES after them). IP is column 6.

Step 3:

| tail -n +2

Skips the first line (the column headers). tail -n +2 = "start from line 2 onwards." Without this, IP (the header text) would appear in your output.

Step 4:

| awk '{print $6}'

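Extracts column 6 (the IP) from each line. $6 = 6th whitespace-separated field. Caveat: newer kubectl versions print RESTARTS as, for example, 2 (3m ago) once a pod has restarted. That value is three whitespace-separated fields instead of one, so on those rows the IP moves to $8.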

Step 5:

| sort

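Sorts the IPs in ascending order. Without sort, the order is whatever kubectl returns. Note that plain sort compares text, so 192.168.1.10 would land before 192.168.1.4. When the addresses differ in digit count, sort numerically per octet instead: sort -t . -k1,1n -k2,2n -k3,3n -k4,4n.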
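
To see how text order and numeric order differ for IPs, here is a small sketch on hypothetical addresses (they are not from the exercise; 192.168.1.10 is included only to trigger the difference).

```shell
# Hypothetical IPs chosen so that text order and numeric order differ.
ips='192.168.1.10
192.168.1.4
192.168.1.5'

# Plain sort compares character by character, so .10 sorts before .4.
text_sorted=$(printf '%s\n' "$ips" | sort)

# Split on dots and compare each octet as a number.
numeric_sorted=$(printf '%s\n' "$ips" | sort -t . -k1,1n -k2,2n -k3,3n -k4,4n)

echo "plain sort:";             echo "$text_sorted"
echo "per-octet numeric sort:"; echo "$numeric_sorted"
```

For the exercise's three single-digit addresses both variants give the same result, so plain sort is enough there.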

Step 6:

>> pod_ips.txt

Appends to the file (not overwrites). The header is already there from step 1 — you append the IPs after it.

Result in pod_ips.txt:

IP_ADDRESS
192.168.1.4
192.168.1.5
192.168.1.6
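
You can rehearse the whole sequence without a cluster by feeding it the Step 2 sample output. In this sketch the rows are deliberately shuffled and a temp file stands in for pod_ips.txt; on a real cluster, printf '%s\n' "$sample" becomes kubectl get pods -o wide.

```shell
# Sample `kubectl get pods -o wide` output from Step 2, deliberately out of order.
sample='NAME        READY  STATUS   RESTARTS  AGE  IP            NODE
pod-ghi     1/1    Running  0         5m   192.168.1.6   node01
pod-abc     1/1    Running  0         5m   192.168.1.4   node01
pod-def     1/1    Running  0         5m   192.168.1.5   node01'

out=$(mktemp)                      # stand-in for pod_ips.txt

echo "IP_ADDRESS" > "$out"         # step 1: header, overwriting the file
printf '%s\n' "$sample" |          # step 2: the raw table
  tail -n +2 |                     # step 3: drop the header row
  awk '{print $6}' |               # step 4: keep the IP column
  sort >> "$out"                   # steps 5-6: sort, then append

cat "$out"
```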

Why Not --sort-by or jsonpath for This

--sort-by='.status.podIP' and the jsonpath range pattern technically work, but:

- .status.podIP is not a field path shown on the kubectl cheatsheet
- You'd have to derive it by reading raw JSON (kubectl get pods -o json)
- Under exam pressure, this adds unnecessary complexity
- You can't verify you have the right field path without testing it

The Linux pipeline approach (-o wide | tail -n +2 | awk '{print $6}' | sort) uses tools you already know. Each step is visible and debuggable on its own, and the logic is obvious.

The Linux Pipeline Pattern for kubectl Output

Any time you need to manipulate kubectl output, start from the first line and add only the stages you need:

kubectl get <resource> -o wide          ← get the data with all columns
  | tail -n +2                          ← skip the header row
  | awk '{print $N}'                    ← extract column N
  | sort                                ← sort
  | sort -u                             ← sort + deduplicate
  | grep "pattern"                      ← filter rows
  | grep -v "pattern"                   ← exclude rows
  | head -1                             ← first result only
  | wc -l                               ← count results
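
A few of these stages combined, as a sketch on hypothetical output (two nodes, one Pending pod). The rows helper is only there to avoid repeating the sample; with a cluster it would be kubectl get pods -o wide | tail -n +2.

```shell
# Hypothetical `kubectl get pods -o wide` output with two nodes and one Pending pod.
sample='NAME     READY  STATUS   RESTARTS  AGE  IP           NODE
pod-abc  1/1    Running  0         5m   192.168.1.4  node01
pod-def  1/1    Running  0         5m   192.168.1.5  node02
pod-ghi  0/1    Pending  0         1m   <none>       node01'

rows() { printf '%s\n' "$sample" | tail -n +2; }   # data rows without the header

# Distinct nodes hosting pods: extract column 7, then sort + deduplicate.
nodes=$(rows | awk '{print $7}' | sort -u)

# How many pods are not Running: exclude matching rows, then count.
not_running=$(rows | grep -v "Running" | wc -l)

# First pod name in alphabetical order.
first_pod=$(rows | awk '{print $1}' | sort | head -1)

echo "$nodes"
echo "not running: $((not_running))"   # $(( )) trims any padding wc adds
echo "first pod: $first_pod"
```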

Common column positions in kubectl get pods -o wide:

Column	$N	Content
NAME	$1	Pod name
READY	$2	Ready count
STATUS	$3	Running/Pending/etc
RESTARTS	$4	Restart count
AGE	$5	Time since the pod was created
IP	$6	Pod IP
NODE	$7	Node name
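
These positions hold only while every value is a single token. If a value contains spaces (for example a RESTARTS entry like 2 (3m ago)), field numbers shift. One workaround, sketched below, is to locate a column by the character offset of its header, assuming kubectl's left-aligned column layout. column_by_header is a hypothetical helper and the rows are made-up data built with printf so the alignment is exact.

```shell
# Build aligned sample rows with printf so the column offsets are exact.
# The data is hypothetical; pod-def has restarted, so RESTARTS holds "2 (3m ago)".
fmt='%-9s %-7s %-9s %-13s %-5s %-13s %s\n'
sample=$(
  printf "$fmt" NAME READY STATUS RESTARTS AGE IP NODE
  printf "$fmt" pod-abc 1/1 Running 0 5m 192.168.1.4 node01
  printf "$fmt" pod-def 1/1 Running "2 (3m ago)" 5m 192.168.1.5 node01
)

# column_by_header NAME (hypothetical helper): print the value under header
# NAME for every data row, located by the header's character offset.
column_by_header() {
  awk -v name="$1" '
    NR == 1 { start = index(" " $0 " ", " " name " "); next }
    start > 0 { split(substr($0, start), parts, " "); print parts[1] }'
}

echo "by field number:";  printf '%s\n' "$sample" | tail -n +2 | awk '{print $6}'
echo "by header offset:"; printf '%s\n' "$sample" | column_by_header IP
```

On the restarted pod the field-number version prints a fragment of the RESTARTS value, while the offset version still prints the IP. For a quick exam answer, eyeballing the raw table first is usually enough.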

kubectl get nodes -o wide columns:

Column	$N	Content
NAME	$1	Node name
STATUS	$2	Ready/NotReady
ROLES	$3	control-plane, or <none> for nodes without a role label
AGE	$4	Time since the node was created
VERSION	$5	Kubernetes version
INTERNAL-IP	$6	Node IP

Writing Output to Files — Pattern Reference

# Write header then append data
echo "HEADER" > file.txt
kubectl get pods -o wide | tail -n +2 | awk '{print $6}' | sort >> file.txt

# Write all at once with heredoc
cat > file.txt << EOF
line1
line2
EOF

# Overwrite with command output
kubectl get pods -o wide > output.txt

# Append to existing file
kubectl get pods -o wide >> output.txt
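
A variation on the header-then-append pattern is to group the commands and redirect once. This sketch uses a temp file in place of file.txt and a printf of two made-up IPs in place of the kubectl pipeline.

```shell
out=$(mktemp)   # temporary file standing in for file.txt

# Everything inside { ...; } shares one redirect, so the header and the
# sorted data land in the file together and a rerun starts from scratch.
{
  echo "IP_ADDRESS"
  printf '%s\n' 192.168.1.5 192.168.1.4 | sort   # stand-in for the kubectl pipeline
} > "$out"

cat "$out"
```

Because there is a single >, there is no risk of appending a second copy of the data if you rerun only the last line.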

Quick Reference

# Get pod IPs sorted, with header
echo "IP_ADDRESS" > pod_ips.txt
kubectl get pods -o wide | tail -n +2 | awk '{print $6}' | sort >> pod_ips.txt

# Get pod names only
kubectl get pods -o wide | tail -n +2 | awk '{print $1}'

# Get pods on a specific node
kubectl get pods -o wide | grep node01

# Get running pods only
kubectl get pods | grep Running | awk '{print $1}'

# Count pods
kubectl get pods | tail -n +2 | wc -l

# Check selector before creating service
kubectl get pods --show-labels

# Verify service has endpoints
kubectl get endpoints <svc-name>

# Full service debug sequence
kubectl get pods --show-labels          # check labels
kubectl describe svc <name>             # check Selector + Endpoints
kubectl get endpoints <name>            # see actual pod IPs
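
One thing to watch with the grep one-liners: grep matches anywhere on the line, not in a specific column. The sketch below uses hypothetical output in which a pod name happens to contain node01, and compares grep with an exact awk column match.

```shell
# Hypothetical output: one pod name happens to contain the text "node01".
sample='NAME            READY  STATUS   RESTARTS  AGE  IP           NODE
web-abc         1/1    Running  0         5m   192.168.1.4  node01
node01-exporter 1/1    Running  0         5m   192.168.1.5  node02
web-def         0/1    Pending  0         1m   <none>       node01'

# grep matches anywhere on the line, so the exporter on node02 is included.
by_grep=$(printf '%s\n' "$sample" | grep node01 | awk '{print $1}')

# awk compares only the NODE column (7), and NR > 1 skips the header.
by_awk=$(printf '%s\n' "$sample" | awk 'NR > 1 && $7 == "node01" {print $1}')

# Same idea for status: match column 3 exactly instead of grepping for "Running".
running=$(printf '%s\n' "$sample" | awk 'NR > 1 && $3 == "Running" {print $1}')

echo "grep node01:";       echo "$by_grep"
echo "awk \$7 == node01:"; echo "$by_awk"
echo "running:";           echo "$running"
```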