kubectl + Linux Tools: Output Manipulation Reference
The Key Lesson From This Exercise
For any task that involves filtering, sorting, or formatting kubectl output — use standard Linux tools, not complex kubectl-specific patterns like --sort-by with jsonpath chains.
Why: awk, sort, tail, grep are:
- On every Linux system
- Covered in HTB/OverTheWire — you already know them
- Findable in any Linux cheatsheet
- Composable — combine them for any output format
The pattern is: kubectl get <resource> -o wide gives you the raw data → Linux tools shape it into whatever format you need.
The Exercise
Part I: Create a ClusterIP service named nginx-service, exposing nginx-deployment on port 8080 → targetPort 80.
Part II: Get pod IPs sorted ascending, save to pod_ips.txt with the header line IP_ADDRESS.
Part I: ClusterIP Service (YAML approach)
Always use YAML for services — it's explicit, reviewable, and matches the exam pattern.
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  type: ClusterIP
  selector:
    app: nginx        # must match pod labels exactly
  ports:
    - port: 8080      # service listens on 8080
      targetPort: 80  # routes to container port 80
kubectl apply -f nginx-service.yml
kubectl describe service nginx-service # verify Endpoints are populated
Selector Mismatch: The Most Common Mistake
In the raw notes, Endpoints: <none> appeared because the selector app.kubernetes.io/name: nginx didn't match the actual pod labels.
Before writing the service YAML, always check pod labels with kubectl get pods --show-labels.
In this exercise the output showed the label app: nginx, not app.kubernetes.io/name: nginx. Use app: nginx in the selector.
After applying, verify endpoints:
kubectl get endpoints nginx-service
# Should show: 192.168.1.4:80,192.168.1.5:80,192.168.1.6:80
# If <none>: selector still doesn't match
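The endpoint check above can be scripted. The following is a minimal sketch, not a definitive tool: check_endpoints is a hypothetical helper name, and the two sample strings stand in for real `kubectl get endpoints nginx-service` output so the snippet runs without a cluster. It assumes the usual NAME / ENDPOINTS / AGE column layout.

```shell
# Hypothetical helper: reads `kubectl get endpoints <svc>` output on stdin
# and reports whether the ENDPOINTS column (field 2) is populated.
check_endpoints() {
  # Skip the header row, then take field 2 of the first data row.
  endpoints=$(tail -n +2 | awk 'NR == 1 {print $2}')
  if [ -z "$endpoints" ] || [ "$endpoints" = "<none>" ]; then
    echo "no endpoints: compare the service selector with the pod labels"
    return 1
  fi
  echo "endpoints: $endpoints"
}

# Sample output standing in for a real cluster (addresses taken from the notes).
sample_ok='NAME            ENDPOINTS                                      AGE
nginx-service   192.168.1.4:80,192.168.1.5:80,192.168.1.6:80   5m'
sample_bad='NAME            ENDPOINTS   AGE
nginx-service   <none>      5m'

printf '%s\n' "$sample_ok"  | check_endpoints
printf '%s\n' "$sample_bad" | check_endpoints || true
```

Against a real cluster the same function would be fed with `kubectl get endpoints nginx-service | check_endpoints`.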
Part II: Pod IPs Sorted, Saved to File
The Working Command
echo "IP_ADDRESS" > pod_ips.txt
kubectl get pods -o wide | tail -n +2 | awk '{print $6}' | sort >> pod_ips.txt
cat pod_ips.txt
Full Breakdown: Every Part
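Step 1:
echo "IP_ADDRESS" > pod_ips.txt
Creates the file with just the header line. > overwrites the file (and creates it if missing).
Step 2:
kubectl get pods -o wide
Output:
NAME      READY   STATUS    RESTARTS   AGE   IP            NODE
pod-abc   1/1     Running   0          5m    192.168.1.4   node01
pod-def   1/1     Running   0          5m    192.168.1.5   node01
pod-ghi   1/1     Running   0          5m    192.168.1.6   node01
-o wide adds the IP column. IP is column 6 as long as no earlier column contains spaces. Recent kubectl versions print RESTARTS as, for example, 2 (3m ago) once a pod has restarted, which pushes the IP into a later field on that row.
Step 3:
| tail -n +2
Skips the first line (the column headers). tail -n +2 means "start from line 2 onwards." Without this, IP (the header text) would appear in your output.
Step 4:
| awk '{print $6}'
Extracts column 6 (the IP) from each line. $6 is the 6th whitespace-separated field.
Step 5:
| sort
Sorts the IPs. Plain sort compares lines as text, so the result is truly ascending only when the octets have the same number of digits (192.168.1.10 would sort before 192.168.1.4). To sort each octet numerically, use sort -t . -k1,1n -k2,2n -k3,3n -k4,4n. Without sort, the order depends on whatever kubectl returns.
Step 6:
>> pod_ips.txt
Appends to the file instead of overwriting it. The header is already there from step 1, so the IPs land after it.
Result in pod_ips.txt:
IP_ADDRESS
192.168.1.4
192.168.1.5
192.168.1.6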
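One detail worth trying by hand is how sort orders addresses whose last octet has a different number of digits. The sketch below replays the pipeline on canned text: sample_pods is a hypothetical stand-in for `kubectl get pods -o wide`, and the .10 address is made up to expose the difference between text order and numeric order.

```shell
# Stand-in for `kubectl get pods -o wide`; pod names and the .10 address
# are illustrative, not taken from a real cluster.
sample_pods() {
  cat <<'EOF'
NAME      READY   STATUS    RESTARTS   AGE   IP             NODE
pod-abc   1/1     Running   0          5m    192.168.1.10   node01
pod-def   1/1     Running   0          5m    192.168.1.4    node01
pod-ghi   1/1     Running   0          5m    192.168.1.6    node01
EOF
}

# Plain sort compares text, so 192.168.1.10 is listed before 192.168.1.4.
sample_pods | tail -n +2 | awk '{print $6}' | sort

# Sorting each dot-separated octet numerically gives ascending address order.
echo "IP_ADDRESS" > pod_ips.txt
sample_pods | tail -n +2 | awk '{print $6}' \
  | sort -t . -k1,1n -k2,2n -k3,3n -k4,4n >> pod_ips.txt
cat pod_ips.txt
```

When every pod IP has the same number of digits in each octet, as in the exercise, plain sort and the numeric form give the same result.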
Why Not --sort-by or jsonpath for This
--sort-by='.status.podIP' and the jsonpath range pattern technically work, but:
- .status.podIP is not a field path shown on the kubectl cheatsheet
- You'd have to derive it by reading raw JSON (kubectl get pods -o json)
- Under exam pressure, this adds unnecessary complexity
- You can't verify you have the right field path without testing it
The Linux pipeline approach (-o wide | tail -n +2 | awk '{print $6}' | sort) uses tools you already know, each step is visible and debuggable, and the logic is obvious.
The Linux Pipeline Pattern for kubectl Output
Any time you need to manipulate kubectl output:
kubectl get <resource> -o wide   ← get the data, including the extra wide columns
  | tail -n +2                   ← skip the header row
  | awk '{print $N}'             ← extract column N
  | sort                         ← sort lines as text
  | sort -u                      ← sort + deduplicate
  | grep "pattern"               ← keep matching rows
  | grep -v "pattern"            ← exclude matching rows
  | head -1                      ← first result only
  | wc -l                        ← count results
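As a rough illustration of how these stages combine, the sketch below answers three small questions from one block of canned output. sample_pods is a hypothetical stand-in for `kubectl get pods -o wide`, and the pod and node names are invented for the example.

```shell
# Stand-in for `kubectl get pods -o wide`; all names are illustrative.
sample_pods() {
  cat <<'EOF'
NAME      READY   STATUS    RESTARTS   AGE   IP            NODE
pod-abc   1/1     Running   0          5m    192.168.1.4   node01
pod-def   1/1     Running   0          5m    192.168.1.5   node02
pod-ghi   0/1     Pending   0          5m    <none>        <none>
pod-jkl   1/1     Running   0          5m    192.168.1.6   node01
EOF
}

# Distinct nodes that have pods scheduled on them (column 7, deduplicated).
nodes=$(sample_pods | tail -n +2 | awk '{print $7}' | grep -v '<none>' | sort -u)
echo "$nodes"

# Number of pods on node01; tr strips the padding some wc versions add.
on_node01=$(sample_pods | tail -n +2 | grep -w node01 | wc -l | tr -d ' ')
echo "$on_node01"

# Name of the first pod that is not Running.
first_not_running=$(sample_pods | tail -n +2 | grep -v Running | head -1 | awk '{print $1}')
echo "$first_not_running"
```

Each stage can be removed from the right-hand end of a pipeline to inspect the intermediate output, which is the main debugging advantage of this style.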
Common column positions in kubectl get pods -o wide:
| Column | $N | Content |
|---|---|---|
| NAME | $1 | Pod name |
| READY | $2 | Ready count |
| STATUS | $3 | Running/Pending/etc |
| RESTARTS | $4 | Restart count |
| AGE | $5 | How long running |
| IP | $6 | Pod IP |
| NODE | $7 | Node name |
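These positions hold only while every earlier column is a single word. The sketch below shows one possible workaround, under the assumption that pod IPs are IPv4: pick the field that looks like an address instead of relying on $6. sample_pods is a hypothetical stand-in for `kubectl get pods -o wide`, with one row using the "2 (3m ago)" RESTARTS format that recent kubectl versions print.

```shell
# Stand-in for `kubectl get pods -o wide`; the second pod has restarted,
# so its RESTARTS value spans three whitespace-separated fields.
sample_pods() {
  cat <<'EOF'
NAME      READY   STATUS    RESTARTS     AGE   IP            NODE
pod-abc   1/1     Running   0            5m    192.168.1.4   node01
pod-def   1/1     Running   2 (3m ago)   5m    192.168.1.5   node01
EOF
}

# Fixed position: correct for pod-abc, wrong for pod-def.
by_position=$(sample_pods | tail -n +2 | awk '{print $6}')
echo "$by_position"

# Pattern match: print whichever field looks like a dotted IPv4 address.
by_pattern=$(sample_pods | tail -n +2 | awk '{
  for (i = 1; i <= NF; i++)
    if ($i ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/) print $i
}')
echo "$by_pattern"
```

The pattern version trades a longer awk program for not depending on the column count, which may or may not be worth it under time pressure.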
kubectl get nodes -o wide columns:
| Column | $N | Content |
|---|---|---|
| NAME | $1 | Node name |
| STATUS | $2 | Ready/NotReady |
| ROLES | $3 | control-plane, or <none> for unlabeled workers |
| AGE | $4 | Time since the node was created |
| VERSION | $5 | Kubernetes version |
| INTERNAL-IP | $6 | Node IP |
Writing Output to Files: Pattern Reference
# Write header then append data
echo "HEADER" > file.txt
kubectl get pods -o wide | tail -n +2 | awk '{print $6}' | sort >> file.txt
# Write all at once with heredoc
cat > file.txt << EOF
line1
line2
EOF
# Overwrite with command output
kubectl get pods -o wide > output.txt
# Append to existing file
kubectl get pods -o wide >> output.txt
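The difference between > and >> is easy to check locally. The sketch below also shows one alternative to the header-then-append pattern: grouping both commands in braces so a single > writes the whole file. sorted_ips is a hypothetical stand-in for the kubectl pipeline, and the file names are made up for the example.

```shell
# Stand-in for `kubectl get pods -o wide | tail -n +2 | awk '{print $6}' | sort`.
sorted_ips() {
  printf '%s\n' 192.168.1.4 192.168.1.5 192.168.1.6
}

# Header and data in one redirect: the file is truncated once and written
# once, so re-running the block never leaves stale or duplicated lines.
{
  echo "IP_ADDRESS"
  sorted_ips
} > pod_ips_grouped.txt
cat pod_ips_grouped.txt

# ">" replaces the file contents; ">>" adds to them.
echo "first"  > demo.txt
echo "second" > demo.txt     # demo.txt now holds only "second"
echo "third" >> demo.txt     # demo.txt now holds "second" and "third"
cat demo.txt
```

The two-step echo > then >> form from the notes produces the same file; the grouped form is just harder to get wrong when a command is re-run.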
Quick Reference
# Get pod IPs sorted, with header
echo "IP_ADDRESS" > pod_ips.txt
kubectl get pods -o wide | tail -n +2 | awk '{print $6}' | sort >> pod_ips.txt
# Get pod names only
kubectl get pods -o wide | tail -n +2 | awk '{print $1}'
# Get pods on a specific node
kubectl get pods -o wide | grep node01
# Get running pods only
kubectl get pods | grep Running | awk '{print $1}'
# Count pods
kubectl get pods | tail -n +2 | wc -l
# Check selector before creating service
kubectl get pods --show-labels
# Verify service has endpoints
kubectl get endpoints <svc-name>
# Full service debug sequence
kubectl get pods --show-labels # check labels
kubectl describe svc <name> # check Selector + Endpoints
kubectl get endpoints <name> # see actual pod IPs