Postgres Slow Query EXPLAIN Triage Prompt

Turn a confusing EXPLAIN (ANALYZE, BUFFERS) plan into a ranked, plain-English diagnosis with concrete index/rewrite fixes and a verification step — so you fix the real bottleneck, not a guess.

Target user

Backend and platform engineers chasing a single slow Postgres query

Difficulty

Intermediate

Tools

Claude, ChatGPT, Cursor

You are a senior PostgreSQL DBA who reads execution plans for a living. You diagnose from evidence in the plan, never from folklore, and you always confirm a fix with a fresh plan. I will paste: - The SQL query: [SQL QUERY] - The full plan from `EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT) <query>`: [EXPLAIN OUTPUT] - Postgres version and relevant settings if known (work_mem, random_page_cost): [VERSION / SETTINGS] - Table sizes and existing indexes if known: [SCHEMA / INDEXES] Work through this in order: 1. **Find the dominant node** — identify where actual time and buffer reads concentrate. Flag any node where `actual rows` diverges sharply from `estimated rows` (a planner mis-estimate), large `Rows Removed by Filter`, external Sort/Hash spills to disk, or high `shared read` (cold cache) vs `shared hit`. 2. **Name the root cause** — e.g. missing index forcing a Seq Scan, a non-sargable predicate (function on an indexed column), stale statistics, bad join order from a bad estimate, or under-sized work_mem causing a spill. Tie each conclusion to a specific line in the plan. 3. **Propose fixes, ranked by impact and risk** — for each: the exact DDL or rewrite, why it helps, and the downside (write overhead, lock cost, plan regression risk). Prefer the smallest change that addresses the dominant node. 4. **Give a verification step** — the exact `EXPLAIN (ANALYZE, BUFFERS)` to re-run, what numbers should change (e.g. Seq Scan → Index Scan, buffers down, no spill), and an `ANALYZE <table>` to run first if statistics look stale. Output format: (a) one-line verdict, (b) a ranked table of fixes [change | impact | risk | how to verify], (c) the exact commands to run, in order. Guardrails: validate every index/rewrite on a replica or a snapshot of prod first. Build new indexes with `CREATE INDEX CONCURRENTLY`. Never DROP an index on prod until you have confirmed nothing relies on it (check pg_stat_user_indexes idx_scan) and you can recreate it. Wrap any DDL you do apply in a transaction where the storage engine allows it.

Why this prompt works

EXPLAIN plans are dense, and the natural failure mode is to skim for the word “Seq Scan” and bolt on an index. This prompt forces a disciplined read: locate the node that actually dominates wall-clock time and buffer traffic, then explain the cost from the evidence in front of you. By anchoring the estimate-vs-actual gap and Rows Removed by Filter, it surfaces the planner mis-estimates and non-sargable predicates that an index alone won’t fix.

The ranked-fix table keeps you honest about trade-offs. A new index speeds reads but taxes every write and can itself be the wrong shape; a query rewrite may be safer than new storage. Asking for impact and risk side by side means the model proposes the smallest effective change rather than a pile of speculative indexes.

Finally, the verification step closes the loop. A tuning change is only real once a fresh EXPLAIN (ANALYZE, BUFFERS) shows the dominant node shrink and the spill or cold reads disappear — and the guardrails (replica testing, CONCURRENTLY, baselines) keep the human in control of anything that touches production.

Postgres Slow Query EXPLAIN Triage Prompt

Why this prompt works

Related prompts

Postgres postgresql.conf Workload Tuning Prompt

Postgres Index Advisor from pg_stat_statements Prompt

Why this prompt works

Related prompts

Postgres postgresql.conf Workload Tuning Prompt

Postgres Index Advisor from pg_stat_statements Prompt

Free: the DevOps AI Incident-Triage Cheat Sheet