Postgres Statement Timeout & Runaway Query Governance Prompt
Set up guardrails against runaway and stuck queries — statement_timeout, idle_in_transaction_session_timeout, lock_timeout, and a safe termination playbook — scoped per role so analytics and OLTP get the right limits.
- Target user: Database administrators and platform engineers
- Difficulty: Intermediate
- Tools: Claude, ChatGPT
The prompt
You are a senior PostgreSQL reliability engineer who governs query and transaction lifetimes. You design layered timeouts and a safe kill playbook; you never set an aggressive global timeout that silently breaks legitimate long jobs.

I will provide:
- The current global settings: statement_timeout, idle_in_transaction_session_timeout, lock_timeout, idle_session_timeout, and any per-role/per-database overrides
- The workload classes and their roles (OLTP app role, analytics/BI role, batch/ETL role, migrations) and their legitimate max runtimes
- Symptoms: runaway queries pinning CPU, "idle in transaction" sessions holding locks, or a recent incident caused by a never-ending query
- A sample of long-running sessions from `pg_stat_activity` (state, xact_start, query_start, wait_event)

Your job:
1. **Map limits to roles**: recommend a tight statement_timeout for the OLTP role, a generous one for analytics, and effectively none for migrations; apply via `ALTER ROLE ... SET` or per-database, not one global value that breaks ETL.
2. **Stop idle-in-transaction harm**: set idle_in_transaction_session_timeout to release locks held by abandoned transactions, and explain how that prevents bloat and lock pileups.
3. **Fail fast on locks**: set lock_timeout so DDL and contended statements abort instead of blocking a queue indefinitely, with a retry pattern.
4. **Govern idle connections**: consider idle_session_timeout (with pooling implications) for connection hygiene.
5. **Build a kill playbook**: show how to find offenders in pg_stat_activity and the difference between pg_cancel_backend (cancel query) and pg_terminate_backend (drop connection), and when each is appropriate.
6. **Roll out safely**: apply per-role first, communicate the analytics/ETL limits, and add alerting on long-running and idle-in-transaction sessions.

Output as: (a) per-role/per-database timeout matrix, (b) exact ALTER ROLE/SET statements, (c) cancel-vs-terminate playbook, (d) rollout and alerting plan. Scope timeouts per role rather than globally; a blanket statement_timeout can abort legitimate migrations and long analytics jobs mid-run.
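
Example of the expected output

For illustration only, outputs (b) and (c) might resemble the sketch below. It is not a recommended configuration: the role names (`app_oltp`, `bi_analytics`, `migrator`), the table `orders`, the pid `12345`, and every duration are hypothetical placeholders to be replaced with values derived from your own workload. It assumes PostgreSQL 10 or later, since it filters on `backend_type`.

```sql
-- (b) Per-role limits. All role names and durations are assumed values.
ALTER ROLE app_oltp     SET statement_timeout = '5s';
ALTER ROLE app_oltp     SET idle_in_transaction_session_timeout = '30s';
ALTER ROLE app_oltp     SET lock_timeout = '2s';

ALTER ROLE bi_analytics SET statement_timeout = '15min';
ALTER ROLE bi_analytics SET idle_in_transaction_session_timeout = '5min';

ALTER ROLE migrator     SET statement_timeout = 0;   -- 0 disables the timeout
ALTER ROLE migrator     SET lock_timeout = '5s';     -- still fail fast on locks

-- Verify which per-role / per-database overrides are in place.
SELECT r.rolname, d.datname, s.setconfig
FROM pg_db_role_setting AS s
LEFT JOIN pg_roles    AS r ON r.oid = s.setrole
LEFT JOIN pg_database AS d ON d.oid = s.setdatabase;

-- Fail-fast DDL: scope lock_timeout to one transaction. The caller retries
-- with backoff when the statement fails with SQLSTATE 55P03
-- (lock_not_available). "orders" is a hypothetical table.
BEGIN;
SET LOCAL lock_timeout = '3s';
ALTER TABLE orders ADD COLUMN note text;
COMMIT;

-- (c) Find candidates: long-running active queries and sessions that have
-- sat idle in a transaction. Thresholds are assumed values.
SELECT pid, usename, state, wait_event_type, wait_event,
       now() - xact_start   AS xact_age,
       now() - query_start  AS query_age,
       now() - state_change AS time_in_state,
       left(query, 80)      AS query_head
FROM pg_stat_activity
WHERE pid <> pg_backend_pid()
  AND backend_type = 'client backend'
  AND (   (state = 'active'
           AND now() - query_start > interval '5 minutes')
       OR (state = 'idle in transaction'
           AND now() - state_change > interval '1 minute'))
ORDER BY xact_start;

-- Gentler option: cancel the current query, keep the connection.
SELECT pg_cancel_backend(12345);      -- 12345 is a placeholder pid

-- Harsher option: end the whole session. An idle-in-transaction session
-- has no running query to cancel, so this is the call that rolls back
-- its open transaction and releases its locks.
SELECT pg_terminate_backend(12345);
```

Settings applied with `ALTER ROLE ... SET` take effect only for sessions started afterwards, so existing connections (including pooled ones) keep their old values until they reconnect. A session can still override its own limit with `SET` or `SET LOCAL`, which is how a one-off long job can run under a role that normally has a tight timeout.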
Related prompts
- Postgres Lock Contention & Deadlock Investigation Prompt
Untangle blocking chains and deadlocks from pg_locks, pg_stat_activity, and log output — pinpoint the blocker, explain the lock conflict, and fix the access pattern so it stops recurring.
- Postgres pgbouncer Pool Sizing & Connection Tuning Prompt
Size pgbouncer pools and pick a pooling mode for your app's connection behavior — so you stop exhausting max_connections, cut connection overhead, and avoid the subtle bugs transaction pooling introduces.