- .woodpecker.yaml: image paths -> library/autojanet-{agent,dispatcher}
- .woodpecker.yaml: secret names RS_HARBOR_USER / RS_HARBOR_PASS (global)
- container/Dockerfile: restore COPY skills/, skills/ populated from opencode config
- skills/: 84 opencode skills bundled into image
- k8s/manifests: update image refs to library/
| name | description |
|---|---|
| network-debugging | Use when diagnosing network connectivity issues in Zoe's homelab or work environments — DNS not resolving, TLS cert stuck, service unreachable, ingress not routing, Cilium dropping packets, or Pangolin tunnel not working. |
Network Debugging
Overview
Systematic outside-in debugging for Zoe's homelab stack: DigitalOcean DNS + BIND9 split-horizon, cert-manager DNS-01, Traefik IngressRoute, Cilium CNI, and Pangolin tunnels.
Rule: Always work from outside in. DNS → TLS → Ingress → Pod → Cilium → Pangolin.
Quick Symptom → First Command
| Symptom | First command |
|---|---|
| Can't reach service from browser | dig <hostname> @8.8.8.8 |
| Certificate expired / not trusted | kubectl get certificate -n <ns> |
| cert-manager stuck in Pending | kubectl get challenge -A |
| Service resolves but connection refused | kubectl get endpoints <svc> -n <ns> |
| Works internally, not externally | Check Pangolin annotations + external-dns target |
| Works externally, not from cluster | kubectl run -it --rm nettest --image=nicolaka/netshoot --restart=Never -- bash |
| Pod can't reach external internet | Check Cilium NetworkPolicy egress rules |
| DNS resolves wrong IP | Compare dig @8.8.8.8 vs dig @10.0.6.6 (split-horizon issue) |
Level 1: DNS
# Public DNS
dig <hostname> @8.8.8.8
dig <hostname> @ns1.digitalocean.com
# Internal DNS (from within cluster)
kubectl run -it --rm dnsutils --image=busybox --restart=Never -- nslookup <hostname>
# ACME challenge record (cert-manager DNS-01)
dig TXT _acme-challenge.<hostname> @ns1.digitalocean.com
# ExternalDNS registration
kubectl logs -n external-dns -l app.kubernetes.io/name=external-dns --tail=20
Stack: DigitalOcean (ctz.fyi public) + BIND9 (10.0.6.6, split-horizon internal)
Public NS: ns1/ns2/ns3.digitalocean.com
Domains: *.ctz.fyi (public), *.i.ctz.fyi (internal only)
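The split-horizon comparison from the symptom table can be wrapped in a small helper. This is a minimal sketch, not part of the original skill: the function name `compare_split_horizon` is illustrative, and it assumes `dig` is installed and that 10.0.6.6 is reachable from wherever it runs.

```shell
# Compare the public and internal answers for one hostname.
# Resolver addresses come from the stack notes above; the function name is hypothetical.
compare_split_horizon() {
  local host="$1"
  local public internal
  public="$(dig +short "$host" @8.8.8.8 | sort | tr '\n' ' ')"
  internal="$(dig +short "$host" @10.0.6.6 | sort | tr '\n' ' ')"
  echo "public:   ${public:-<no answer>}"
  echo "internal: ${internal:-<no answer>}"
  if [ "$public" = "$internal" ]; then
    echo "same answer from both resolvers"
  else
    # Expected for names that are deliberately split; a problem otherwise.
    echo "answers differ between resolvers"
  fi
}
```

Usage: `compare_split_horizon <hostname>`. A difference is normal for names served on both sides of the split; it only points at a fault when the name should resolve identically everywhere.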
Level 2: TLS / cert-manager
# Certificate status
kubectl get certificate -n <namespace>
kubectl describe certificate <name> -n <namespace>
# Active ACME challenge
kubectl get challenge -A
kubectl describe challenge <name> -n <namespace>
# cert-manager errors
kubectl logs -n cert-manager deploy/cert-manager | grep -i error | tail -20
# Verify cert in secret
kubectl get secret <name>-tls -n <namespace> \
-o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -noout -dates
Common issue: cert-manager can't create DNS TXT record
- Check DigitalOcean token: kubectl get secret digitalocean-dns -n cert-manager
- Check outbound UDP 53: a Cilium NetworkPolicy may block cert-manager egress
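The "verify cert in secret" pipeline earlier in this level prints the validity dates; the sketch below turns it into a yes/no expiry check using `openssl x509 -checkend`. The function name `tls_secret_expiring` and the 14-day default are assumptions for illustration, and the secret is assumed to hold the certificate under `tls.crt`.

```shell
# Report whether the certificate stored in a TLS secret expires soon.
# Usage: tls_secret_expiring <secret-name> <namespace> [days]
# The function name and the 14-day default are hypothetical.
tls_secret_expiring() {
  local secret="$1" ns="$2" days="${3:-14}"
  local pem
  pem="$(kubectl get secret "$secret" -n "$ns" \
    -o jsonpath='{.data.tls\.crt}' | base64 -d)" || return 2
  if [ -z "$pem" ]; then
    echo "no tls.crt found in $ns/$secret"
    return 2
  fi
  # -checkend exits 0 when the cert is still valid that many seconds from now.
  if printf '%s\n' "$pem" | openssl x509 -noout -checkend "$((days * 86400))" >/dev/null; then
    echo "$ns/$secret: valid for more than $days days"
  else
    echo "$ns/$secret: expires within $days days (or is already expired)"
    return 1
  fi
}
```

Usage: `tls_secret_expiring <name>-tls <namespace> 30`. A non-zero return makes it easy to chain into the challenge checks above when a renewal looks stuck.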
Level 3: Ingress / Traefik
# Check IngressRoute
kubectl get ingressroute -n <namespace> -o yaml
# Traefik logs for hostname
kubectl logs -n traefik deploy/traefik | grep <hostname>
Critical gotcha: cert-manager reads Ingress objects, not IngressRoute CRDs.
You must have both:
- IngressRoute: actual routing
- Ingress: cert-manager TLS issuance + external-dns registration
Missing the companion Ingress = cert never issued, hostname never registered.
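As a rough illustration of the companion Ingress, the helper below prints a manifest that can be reviewed or piped to kubectl. Everything passed in is an assumption: the issuer name, the backing service, the port, and the `<name>-tls` secret naming (borrowed from the secret check in Level 2). The `cert-manager.io/cluster-issuer` annotation assumes a ClusterIssuer is in use; adjust it if the cluster uses namespaced Issuers.

```shell
# Print a companion Ingress so cert-manager and external-dns can see the hostname.
# Usage: companion_ingress <name> <namespace> <host> <service> <port> <cluster-issuer>
# The function name and all argument values are hypothetical.
companion_ingress() {
  local name="$1" ns="$2" host="$3" svc="$4" port="$5" issuer="$6"
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ${name}
  namespace: ${ns}
  annotations:
    cert-manager.io/cluster-issuer: ${issuer}
spec:
  tls:
    - hosts:
        - ${host}
      secretName: ${name}-tls
  rules:
    - host: ${host}
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: ${svc}
                port:
                  number: ${port}
EOF
}
```

Usage: `companion_ingress <name> <namespace> <hostname> <service> 80 <issuer> | kubectl apply --dry-run=client -f -` validates the manifest without creating anything.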
Level 4: Pod Connectivity
# Test from inside cluster
kubectl run -it --rm nettest --image=nicolaka/netshoot --restart=Never -- bash
# curl http://<service>.<namespace>.svc.cluster.local
# nslookup <service>.<namespace>.svc.cluster.local
# curl -v https://<external-hostname>
# Check service has endpoints (pod actually behind service?)
kubectl get endpoints <service> -n <namespace>
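The endpoints check can be made scriptable by asking only for the ready pod IPs. A minimal sketch, assuming the classic Endpoints object is still populated for the service; the function name `svc_has_endpoints` is illustrative.

```shell
# Fail loudly when a Service has no ready pod addresses behind it.
# Usage: svc_has_endpoints <service> <namespace>
# The function name is hypothetical.
svc_has_endpoints() {
  local svc="$1" ns="$2" ips
  ips="$(kubectl get endpoints "$svc" -n "$ns" \
    -o jsonpath='{.subsets[*].addresses[*].ip}')" || return 2
  if [ -z "$ips" ]; then
    echo "$ns/$svc: no ready endpoints (check selector labels and pod readiness)"
    return 1
  fi
  echo "$ns/$svc: $ips"
}
```

An empty result usually means the Service selector does not match any pod labels, or the matching pods are failing their readiness probes.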
Level 5: Cilium
# Cilium status
kubectl exec -n kube-system ds/cilium -- cilium status
# Dropped flows
kubectl exec -n kube-system ds/cilium -- \
hubble observe --namespace <ns> --verdict DROPPED
# Active policies
kubectl get networkpolicy -n <namespace>
kubectl get ciliumnetworkpolicy -n <namespace>
# Pod identity
kubectl exec -n kube-system ds/cilium -- cilium endpoint list | grep <pod-ip>
Level 6: Pangolin Tunnel
# Check annotations on IngressRoute
kubectl get ingressroute <name> -n <namespace> -o yaml | grep pangolin
# Pangolin/Newt pod health
kubectl get pods -n pangolin
kubectl logs -n pangolin <newt-pod>
Required annotations for Pangolin-routed services:
annotations:
  pangolin.fossorial.io/enabled: "true"
  external-dns.alpha.kubernetes.io/target: "external"
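Checking for both annotations can be done in one pass over the manifest. The sketch below reads YAML on stdin and only tests that the keys are present, not that their values are correct; the function name `check_pangolin_annotations` is an assumption for illustration.

```shell
# Read a manifest on stdin and report which Pangolin-related annotations are missing.
# The function name is hypothetical; only key presence is checked, not values.
check_pangolin_annotations() {
  local manifest key missing=0
  manifest="$(cat)"
  # The trailing colon limits the match to YAML keys, so the JSON copy inside
  # kubectl's last-applied-configuration annotation does not count.
  for key in 'pangolin.fossorial.io/enabled:' \
             'external-dns.alpha.kubernetes.io/target:'; do
    if ! printf '%s\n' "$manifest" | grep -qF -- "$key"; then
      echo "missing annotation: ${key%:}"
      missing=1
    fi
  done
  [ "$missing" -eq 0 ] && echo "both annotations present"
  return "$missing"
}
```

Usage: `kubectl get ingressroute <name> -n <namespace> -o yaml | check_pangolin_annotations`.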
EKS / Cloud Extras
# CoreDNS logs (with -l, kubectl returns only the last 10 lines per pod unless --tail is set)
kubectl logs -n kube-system -l k8s-app=kube-dns --tail=50
# Security group check
aws ec2 describe-security-groups --group-ids sg-xxxx
Also check: VPC flow logs, ALB access logs, inbound/outbound security group rules.
Common Mistakes
| Mistake | Fix |
|---|---|
| Only created IngressRoute, no Ingress | Add companion Ingress for cert-manager + external-dns |
| cert-manager can't do DNS-01 | Check DigitalOcean API token secret exists in cert-manager ns |
| Split-horizon confusion | Always compare @8.8.8.8 vs @10.0.6.6 explicitly |
| Pangolin service not externally reachable | Verify both annotations are present |
| Cilium blocking cert-manager | Check egress NetworkPolicy for UDP 53 and TCP 443 |
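For the last row, an egress allowance for UDP 53 and TCP 443 might look like the sketch below, written as a standard Kubernetes NetworkPolicy (which Cilium enforces) and not as a CiliumNetworkPolicy. The policy name and the empty pod selector are assumptions. It only makes sense where an egress policy already restricts the namespace: applied to an unrestricted namespace, it would itself limit egress to these two ports, which may cut off other traffic such as the Kubernetes API.

```shell
# Print a NetworkPolicy that allows DNS (UDP 53) and HTTPS (TCP 443) egress.
# Usage: cert_manager_egress_policy [namespace]
# The function name, policy name, and pod selector are hypothetical.
cert_manager_egress_policy() {
  local ns="${1:-cert-manager}"
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-and-https-egress
  namespace: ${ns}
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 443
EOF
}
```

Usage: `cert_manager_egress_policy | kubectl apply --dry-run=client -f -` to validate it first, then re-run the Hubble dropped-flow check from Level 5 to confirm the drops stop.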