Debug a CrashLoopBackOff

Why this matters

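CrashLoopBackOff is one of the most common Kubernetes incidents. The status itself only tells you that a container keeps exiting and that Kubernetes is waiting longer between each restart. The useful question is not “what does CrashLoopBackOff mean?” but “why is this pod crashing and what should I try next?”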

KubeGraf helps you go from a red pod to a plausible root cause and fix plan by combining logs, events, and an incident timeline.

Scenario: payments API keeps crashing in prod-cluster

kubectl config use-context prod-cluster
kubectl get pods -n payments
NAME                                   READY   STATUS             RESTARTS   AGE
payments-api-66cbd9d4dc-7xg9n          0/1     CrashLoopBackOff   5          2m31s
payments-api-66cbd9d4dc-87zc2          1/1     Running            0          5m12s
redis-payments-0                       1/1     Running            0          10m

Step‑by‑step flow

1. Confirm the problem with kubectl

kubectl get pods -n payments
kubectl describe pod payments-api-66cbd9d4dc-7xg9n -n payments | sed -n '1,40p'
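
The describe output is long, and the fields that matter most for a crash loop are the termination reason and exit code of the previous container run. The following is a minimal sketch that pulls only those two fields with a JSONPath query. The helper name last_exit is hypothetical, and the query assumes the crashing container is the first (or only) container in the pod.

```shell
# Hypothetical helper: print "<reason> <exitCode>" for the previous run
# of the first container in a pod. Assumes a single-container pod.
# Arguments: $1 = pod name, $2 = namespace.
last_exit() {
  kubectl get pod "$1" -n "$2" \
    -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason} {.status.containerStatuses[0].lastState.terminated.exitCode}{"\n"}'
}

# Usage:
# last_exit payments-api-66cbd9d4dc-7xg9n payments
```

As a rough guide, an exit code of 137 usually means the container was killed (for example by the OOM killer), whereas a small code such as 1 usually means the application exited on its own.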

2. Open KubeGraf on the right cluster and namespace

kubegraf
  • Press c and select prod-cluster if needed.
  • Press n and select the payments namespace.
  • Switch to the Pods view and filter with /payments-api.

Tip: Use status filters (if available) to quickly highlight only unhealthy workloads (CrashLoopBackOff, Error, ImagePullBackOff).
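
If status filters are not available, a similar view can be approximated from plain kubectl output. This is a sketch only: the helper name unhealthy_pods is hypothetical, and it treats every status other than Running or Completed as unhealthy, so it will miss pods that are Running but not ready.

```shell
# Hypothetical helper: read `kubectl get pods` output on stdin and print
# name, status, and restart count for every pod whose STATUS column is
# neither Running nor Completed. The header row (line 1) is skipped.
unhealthy_pods() {
  awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1, $3, $4 }'
}

# Usage:
# kubectl get pods -n payments | unhealthy_pods
```

Run against the pod list from the scenario above, this prints a single line: payments-api-66cbd9d4dc-7xg9n CrashLoopBackOff 5.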

3. Inspect logs and events through KubeGraf

The logs for payments-api-66cbd9d4dc-7xg9n show why the container exits:

2025-03-22T12:01:03Z ERROR payments-api Failed to start HTTP server: DB_CONNECTION_STRING not set
2025-03-22T12:01:03Z ERROR payments-api Exiting with code 1
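
To cross-check the same lines outside KubeGraf, kubectl can read the logs of the previous container instance, which is the one that crashed. A minimal sketch follows. The helper name crash_errors is hypothetical, and the grep pattern assumes the log format shown above, with the level as a separate word.

```shell
# Hypothetical helper: print ERROR lines from the previous (crashed)
# instance of a pod's container, looking at the last 50 log lines.
# Arguments: $1 = pod name, $2 = namespace.
crash_errors() {
  kubectl logs "$1" -n "$2" --previous --tail=50 | grep ' ERROR '
}

# Usage:
# crash_errors payments-api-66cbd9d4dc-7xg9n payments
```

Without --previous, kubectl logs reads the current container instance, which in a crash loop may have produced little or no output yet.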

4. Use the Incident Timeline and Brain Panel

  • Incident Timeline shows a new deployment of payments-api, a config map update, and failing probes.
  • Brain Panel summarizes: “payments-api started crashing after a new rollout. Logs show DB_CONNECTION_STRING not set. Check the associated config or secret.”
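
The signals listed above (rollouts, config changes, failing probes) can also be approximated by hand from raw kubectl data. The sketch below is not how KubeGraf builds its timeline. It only lists recent namespace events in time order, followed by the deployment's revision history. The helper name incident_timeline is hypothetical.

```shell
# Hypothetical helper: approximate an incident timeline from kubectl.
# Prints the 20 most recent events in the namespace (oldest first),
# then the revision history of the deployment.
# Arguments: $1 = namespace, $2 = deployment name.
incident_timeline() {
  kubectl get events -n "$1" --sort-by=.lastTimestamp | tail -n 20
  kubectl rollout history "deployment/$2" -n "$1"
}

# Usage:
# incident_timeline payments payments-api
```

Events are kept only for a limited time by the cluster, so this view works best while the incident is still recent.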

5. Fix the underlying issue

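# Roll back to the previous revision to restore service quickly
kubectl rollout undo deployment/payments-api -n payments
# Then correct the config the new rollout depends on (here, the missing DB_CONNECTION_STRING)
kubectl edit configmap payments-api-config -n payments
# Wait for the rollout to finish and confirm that all pods are Running
kubectl rollout status deployment/payments-api -n payments
kubectl get pods -n payments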
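
Pods that read a ConfigMap through environment variables do not pick up edits until they are restarted, so it can help to verify the key and then restart the deployment. The following is a minimal sketch. The helper name configmap_has_key is hypothetical, and it assumes the ConfigMap key has the same name as the environment variable, which may not hold in your setup.

```shell
# Hypothetical helper: succeed only if the ConfigMap holds a non-empty
# value for the given key. The value itself is never printed.
# Arguments: $1 = ConfigMap name, $2 = namespace, $3 = key.
configmap_has_key() {
  value="$(kubectl get configmap "$1" -n "$2" -o "jsonpath={.data.$3}")"
  [ -n "$value" ]
}

# Usage: restart the pods only once the key is present.
# configmap_has_key payments-api-config payments DB_CONNECTION_STRING \
#   && kubectl rollout restart deployment/payments-api -n payments
```

If the value actually lives in a Secret, as the Brain Panel summary allows, the same check applies with kubectl get secret instead.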

Expected outcome

After following this workflow you should be able to:

  • Take a CrashLoopBackOff from a red pod to a concrete, likely root cause.
  • Use KubeGraf’s logs, events, and Incident Timeline instead of guessing from raw kubectl output alone.
  • Apply and validate a fix confidently, knowing what changed and why.