Postgres Major Version Upgrade Planner Prompt
Turn a 'we need to upgrade Postgres' ask into a concrete, sequenced upgrade plan — pg_upgrade vs logical-replication cutover, pre-checks, extension compatibility, a rollback, and post-upgrade statistics — so the upgrade lands with known downtime instead of a surprise outage.
- Target user
- DBAs and platform engineers planning a Postgres major version upgrade
- Difficulty
- Advanced
- Tools
- Claude, ChatGPT, Cursor
The prompt
You are a senior PostgreSQL DBA who has run major version upgrades on busy production systems. You choose the upgrade method from the downtime budget and the risk tolerance, you front-load every compatibility check before touching anything, and you never start an upgrade without a rehearsed rollback. You know that catalog-level upgrades and logical-replication cutovers have very different downtime and risk profiles, and you pick deliberately. I will paste: - Current and target Postgres major versions, and database size: [CURRENT / TARGET / SIZE] - The maximum acceptable downtime for the cutover: [DOWNTIME BUDGET] - Installed extensions and their versions, plus any custom catalog dependencies: [EXTENSIONS] - Topology — primary, replicas, connection pooler, deployment style: [TOPOLOGY] - Constraints — maintenance window, replication slots in use, large objects, unusual types: [CONSTRAINTS] Work through this in order: 1. **Pick the method** — recommend `pg_upgrade` versus a logical-replication upgrade based on the downtime budget. For `pg_upgrade`, compare `--link` (hardlink, fast, but the old cluster is no longer safely usable once the new one starts) against `--copy` (slower, leaves the old cluster intact as a fallback). For near-zero downtime, lay out logical replication: build the new-version subscriber, sync, then cut over. State the expected downtime for each and which fits the budget. 2. **Run pre-checks** — list the checks to pass before scheduling: `pg_upgrade --check`, extension availability on the target version, deprecated/removed features and changed defaults in the target release notes, sufficient disk for copy mode, and replication slot / `wal_level` requirements for the logical path. Flag known blockers like data types or extensions without a target-version build. 3. **Plan extension and dependency compatibility** — for each extension, identify the version available on the target, whether `ALTER EXTENSION ... UPDATE` is needed, and the correct ordering relative to the upgrade. Call out anything that touches the catalog or stores version-specific on-disk state. 4. **Write the rollback plan** — define the exact point of no return for the chosen method and how to revert before it: for `--copy`, fall back to the untouched old cluster; for `--link`, why a fresh restore from backup is the only safe revert; for logical, how to fail traffic back to the still-running old primary. Pair this with a verified backup taken immediately before cutover. 5. **Sequence the cutover and post-upgrade steps** — give the ordered runbook: quiesce writes, switch the application/pooler, and crucially run `ANALYZE` (or `vacuumdb --analyze-in-stages`) on the new cluster, because `pg_upgrade` does not carry over planner statistics and queries will plan badly until it completes. Include smoke-test queries and how long to keep the old cluster before reclaiming it. Output format: (a) one-line recommendation of method and expected downtime, (b) a pre-check table [check | how | pass criteria], (c) the ordered cutover runbook, and (d) the rollback procedure with its point of no return. Guardrails: rehearse the entire upgrade end to end on a prod-sized copy and time it before you schedule the real window. Take and verify a fresh backup immediately before cutover, and never use `pg_upgrade --link` without that backup, since link mode forecloses an in-place rollback. Keep a human approving the cutover and the decision to retire the old cluster, and do not delete the old cluster until the new one has run clean under real load.
Why this prompt works
Major version upgrades go wrong when the method is chosen by habit rather than by the downtime budget. pg_upgrade in link mode is fast but burns the in-place rollback; copy mode is slower but keeps the old cluster as a fallback; a logical-replication cutover trades setup complexity for near-zero downtime. Forcing the model to map each method’s real downtime and risk against the stated budget produces a defensible choice instead of whichever path the team did last time.
It also front-loads the failures that turn an upgrade into an outage. The big ones are predictable: an extension with no build on the target version, a removed feature or changed default in the new release, and missing planner statistics after pg_upgrade. The pre-check table and the explicit post-upgrade ANALYZE step target exactly these, so the surprises surface during rehearsal rather than at 2 a.m. in the maintenance window.
Finally, the prompt insists on a rollback with a clearly named point of no return, which is the discipline most upgrade plans lack. Knowing that link mode forecloses in-place revert — so a verified backup is the only way back — changes how carefully the cutover is staged. Paired with a full timed rehearsal on a prod-sized copy and a human approving the cutover, it keeps an irreversible operation under deliberate control.