KubeGraf Documentation
Local-first Kubernetes tool for detecting incidents, understanding root causes with evidence,
and safely responding to failures, without SaaS lock-in.
What KubeGraf Does
Incident Detection
Automatically monitors for common Kubernetes failures:
- CrashLoopBackOff - Containers that keep crashing and are restarted with increasing back-off delays
- OOMKilled - Containers killed for exceeding their memory limit
- ImagePullBackOff - Failed image pulls
- Probe failures - Liveness and readiness check failures
- Restart storms - Excessive pod restarts
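To make the list above concrete, here is a minimal Python sketch of how these failure types can be read from a Pod object as returned by the Kubernetes API (for example via kubectl get pods -o json). It is not KubeGraf's implementation. The function name classify_pod, the RestartStorm label, and the threshold of 5 restarts are assumptions made for illustration; the status field names are the standard Kubernetes ones. Probe failures are left out because they surface as events (reason Unhealthy) rather than in the container status.

```python
from typing import Dict, List

# Assumption: a restart count at or above this value counts as a "restart storm".
RESTART_STORM_THRESHOLD = 5

# Waiting reasons reported by the kubelet in status.containerStatuses[].state.waiting.
WAITING_REASONS = {"CrashLoopBackOff", "ImagePullBackOff"}


def classify_pod(pod: Dict) -> List[str]:
    """Return the failure labels found in one Pod object (decoded JSON)."""
    findings: List[str] = []
    statuses = pod.get("status", {}).get("containerStatuses", [])
    for cs in statuses:
        # Container is currently waiting, e.g. backing off after repeated crashes.
        waiting = cs.get("state", {}).get("waiting") or {}
        if waiting.get("reason") in WAITING_REASONS:
            findings.append(waiting["reason"])
        # The previous container instance was killed by the OOM killer.
        terminated = cs.get("lastState", {}).get("terminated") or {}
        if terminated.get("reason") == "OOMKilled":
            findings.append("OOMKilled")
        # Hypothetical label for an excessive number of restarts.
        if cs.get("restartCount", 0) >= RESTART_STORM_THRESHOLD:
            findings.append("RestartStorm")
    return findings


# Made-up sample data shaped like a real Pod status.
sample_pod = {
    "metadata": {"name": "api-7d9f", "namespace": "prod"},
    "status": {
        "containerStatuses": [
            {
                "name": "api",
                "restartCount": 7,
                "state": {"waiting": {"reason": "CrashLoopBackOff"}},
                "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}},
            }
        ]
    },
}

print(classify_pod(sample_pod))
```

A single container can match several labels at once, which is why the function returns a list rather than one value.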
Evidence-Based Diagnosis
Correlates multiple data sources to explain failures:
- Event timeline showing when failures occurred
- Recent changes (deployments, configs, secrets)
- Container logs with error extraction
- Resource usage patterns
- Related failures across the cluster
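The correlation idea above can be sketched as a merge of two timestamped sources into one timeline. This is an illustration only, not how KubeGraf builds its timeline: the function names, the record layout (time and message keys), and the sample data are all assumptions.

```python
from datetime import datetime
from typing import Dict, List


def parse_ts(value: str) -> datetime:
    """Parse a UTC timestamp such as 2024-05-01T10:02:00Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def build_timeline(events: List[Dict], changes: List[Dict]) -> List[str]:
    """Merge events and changes into one chronologically ordered list of lines."""
    entries = [(parse_ts(e["time"]), "event", e["message"]) for e in events]
    entries += [(parse_ts(c["time"]), "change", c["message"]) for c in changes]
    entries.sort(key=lambda entry: entry[0])  # oldest first
    return [f"{ts.strftime('%H:%M:%S')} [{kind}] {msg}" for ts, kind, msg in entries]


# Made-up sample data for illustration.
events = [
    {"time": "2024-05-01T10:03:10Z", "message": "Back-off restarting failed container api"},
    {"time": "2024-05-01T10:02:30Z", "message": "Container api terminated: OOMKilled"},
]
changes = [
    {"time": "2024-05-01T10:01:00Z", "message": "Deployment api rolled out revision 12"},
]

timeline = build_timeline(events, changes)
for line in timeline:
    print(line)
```

Placing changes and failures on one axis makes it easy to see that the rollout in this sample precedes the first failure by 90 seconds, which is the kind of evidence a diagnosis can point to.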
Safe Fix Previews
Preview changes before applying them:
- Dry-run validation before any changes
- Diff view showing exactly what will change
- Rollback suggestions for failed deployments
- Read-only mode by default
- All actions require explicit confirmation
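A diff preview followed by an explicit confirmation step can be sketched with Python's standard difflib module. This is a sketch under stated assumptions, not KubeGraf's code: preview_diff and confirmed are hypothetical helper names, and the manifest fragments are made up.

```python
import difflib


def preview_diff(current: str, proposed: str) -> str:
    """Return a unified diff between two manifest texts ('' if identical)."""
    diff = difflib.unified_diff(
        current.splitlines(keepends=True),
        proposed.splitlines(keepends=True),
        fromfile="current",
        tofile="proposed",
    )
    return "".join(diff)


def confirmed(answer: str) -> bool:
    """Treat only an explicit 'yes' as confirmation; anything else declines."""
    return answer.strip().lower() == "yes"


# Made-up manifest fragments for illustration.
current_manifest = """resources:
  limits:
    memory: 128Mi
"""
proposed_manifest = """resources:
  limits:
    memory: 256Mi
"""

diff_text = preview_diff(current_manifest, proposed_manifest)
print(diff_text)
```

Nothing in this sketch touches a cluster: it only renders the change and decides whether the user agreed. For server-side validation, kubectl itself offers kubectl diff and kubectl apply --dry-run=server; whether KubeGraf relies on them is not stated here.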
Knowledge Bank
Local incident storage for learning and reporting:
- SQLite database stores all incident history
- Search by pod, namespace, error type, or fix
- Export reports for postmortems
- Track recurring patterns across time
- No data leaves your machine
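As an illustration of local incident storage, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table name, columns, helper functions, and rows are assumptions for this example; KubeGraf's actual schema is not documented here.

```python
import sqlite3

# In-memory database for the example; a real store would live in a local file.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE incidents (
        id          INTEGER PRIMARY KEY,
        occurred_at TEXT NOT NULL,
        namespace   TEXT NOT NULL,
        pod         TEXT NOT NULL,
        error_type  TEXT NOT NULL,
        fix         TEXT
    )
    """
)

# Made-up incident history.
rows = [
    ("2024-05-01T10:02:30Z", "prod", "api-7d9f", "OOMKilled", "raise memory limit to 256Mi"),
    ("2024-05-03T08:15:00Z", "prod", "api-5c2b", "OOMKilled", "raise memory limit to 512Mi"),
    ("2024-05-04T12:00:00Z", "staging", "web-1a2b", "ImagePullBackOff", "fix image tag"),
]
conn.executemany(
    "INSERT INTO incidents (occurred_at, namespace, pod, error_type, fix) "
    "VALUES (?, ?, ?, ?, ?)",
    rows,
)


def search(conn, namespace, error_type):
    """Find incidents by namespace and error type, oldest first."""
    return conn.execute(
        "SELECT pod, fix FROM incidents "
        "WHERE namespace = ? AND error_type = ? ORDER BY occurred_at",
        (namespace, error_type),
    ).fetchall()


def recurring(conn, min_count=2):
    """List (namespace, error_type, count) groups seen at least min_count times."""
    return conn.execute(
        "SELECT namespace, error_type, COUNT(*) FROM incidents "
        "GROUP BY namespace, error_type HAVING COUNT(*) >= ? "
        "ORDER BY COUNT(*) DESC",
        (min_count,),
    ).fetchall()


print(search(conn, "prod", "OOMKilled"))
print(recurring(conn))
```

Because SQLite is a single local file (or memory, as here), searching history and spotting recurring patterns needs no network access.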
Three Ways to Use KubeGraf
Terminal UI
Keyboard-driven interface for SSH sessions and power users. Works over slow connections.
Web Dashboard
Browser-based UI with resource map visualization and real-time updates. Run locally with kubegraf web.
Modern SPA
Single-page app interface for teams. All three interfaces work with the same local backend.
Requirements
- A working kubectl configuration
- Access to a Kubernetes cluster (local or remote)
- macOS, Linux, or Windows
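One way to check the first requirement before installing is to look for a kubeconfig file where kubectl looks for it: the paths listed in the KUBECONFIG environment variable, or ~/.kube/config when that variable is unset. The sketch below follows that documented kubectl behavior; the function names are hypothetical and the check is not part of KubeGraf.

```python
import os
from pathlib import Path
from typing import List, Mapping


def kubeconfig_paths(env: Mapping[str, str], home: Path) -> List[Path]:
    """Return the kubeconfig files kubectl would consult, in order."""
    raw = env.get("KUBECONFIG", "")
    # KUBECONFIG may list several files, separated by ':' (or ';' on Windows).
    paths = [Path(p) for p in raw.split(os.pathsep) if p]
    return paths or [home / ".kube" / "config"]


def has_kubeconfig(env: Mapping[str, str], home: Path) -> bool:
    """True if at least one candidate kubeconfig file exists on disk."""
    return any(p.is_file() for p in kubeconfig_paths(env, home))


print(kubeconfig_paths(os.environ, Path.home()))
print(has_kubeconfig(os.environ, Path.home()))
```

Finding a file only shows that a configuration exists. Running kubectl cluster-info afterwards confirms that the cluster it points at is reachable.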
Quick Install
curl -sSL https://kubegraf.io/install.sh | bash
See the Installation Guide for more options, including
Homebrew, Scoop, and manual installation.