Troubleshooting toolkits by stack
Pick your stack and get a guided path: the top production errors, the best troubleshooting prompts, a free validator or assistant, a repeatable runbook, and a senior review when you need one.
Kubernetes
Fix Kubernetes deployment, pod, ingress, YAML, and cluster issues faster with prompts, validators, runbooks, and incident workflows built for real DevOps engineers.
OpenStack
Diagnose Horizon 504s, Cinder and RabbitMQ timeouts, dead Neutron agents, Keystone auth failures, and Kolla-Ansible issues with control-plane runbooks and AI workflows.
Terraform
Review plans, validate HCL, and untangle state, provider, and drift issues before you apply, with prompts and a browser-based Terraform validator.
GitLab CI
Debug stuck pipelines, offline runners, .gitlab-ci.yml syntax errors, missing variables, and failing deployment jobs with prompts and a CI validator.
Prometheus
Fix down targets, missing metrics, noisy or silent alerts, and slow PromQL with alert-rule tooling and monitoring prompts.
Linux
Fix failed services, full disks, permission errors, high load, OOM kills, DNS, and SSH problems with command-driven guides and prompts.
RabbitMQ
Diagnose RPC timeouts, queue backlogs, missed heartbeats, memory/disk alarms, and cluster partitions with queue-level runbooks and prompts.
Docker
Fix restarting containers, image-pull failures, Dockerfile build errors, port conflicts, volume permissions, and container networking with prompts and a Dockerfile validator.
Ansible
Debug playbook failures, become/connection errors, Jinja2 templating, inventory issues, and idempotency problems with prompts and a playbook validator.
Helm
Debug chart templating, failed releases and upgrades, values precedence, and rendering errors with prompts and a Helm chart validator.
NGINX
Debug reverse-proxy and upstream errors, TLS, location-block precedence, rate limiting, and 502/504s with prompts and an nginx.conf validator.
Kafka
Diagnose broker and controller issues, under-replicated partitions, consumer lag, rebalances, and retention problems with prompts and Kafka runbooks.
PostgreSQL
Diagnose slow queries and EXPLAIN plans, locks and deadlocks, bloat and vacuum, replication lag, and connection exhaustion with prompts and runbooks.
Grafana
Debug datasource errors, no-data panels, templating, alerting, auth/SSO, and provisioning issues with prompts and Grafana runbooks.
Redis
Diagnose OOM and eviction, MISCONF persistence, replication and Sentinel failover, Cluster slot errors, and latency spikes with prompts and Redis runbooks.
MySQL
Diagnose slow queries and EXPLAIN plans, locks and deadlocks, replication lag, connection exhaustion, and InnoDB issues with prompts and runbooks.