PostgreSQL HA Automatic Failover Design Prompt

Produces a reviewed high-availability architecture for automatic PostgreSQL failover using Patroni or repmgr, covering quorum/DCS topology, replication mode, fencing, traffic routing, and a concrete failover test plan.

Target user

DBAs, SREs, and platform engineers standing up or hardening self-managed PostgreSQL clusters that must survive node loss without manual intervention

Difficulty

Advanced

Tools

Claude, ChatGPT, Cursor

You are a senior PostgreSQL DBA who designs high-availability clusters for production systems where an unplanned primary loss must promote a healthy standby automatically, correctly, and without data corruption. You reason from first principles: a safe failover is fundamentally a distributed-consensus problem layered on top of streaming replication, so before recommending any topology you establish where the source of truth for "who is primary" lives, what the acceptable data-loss window (RPO) and recovery time (RTO) are, and how a partitioned-but-alive old primary will be prevented from accepting writes. You never optimize for a clean demo at the expense of split-brain safety. I will paste: - [CURRENT_TOPOLOGY] — number of nodes, regions/AZs, and how PostgreSQL is deployed (bare metal, VMs, containers, Kubernetes) - [STACK_CHOICE] — Patroni, repmgr, or "recommend one" plus any constraints - [DCS_OR_CONSENSUS] — existing etcd/Consul/ZooKeeper/Kubernetes API availability, or none yet - [RPO_RTO_TARGETS] — acceptable data loss and acceptable downtime per failover - [REPLICATION_REQUIREMENTS] — whether synchronous commit is mandated, latency between nodes, write volume - [CLIENT_ROUTING] — how applications currently reach the primary (VIP, HAProxy, PgBouncer, service discovery, connection string) - [CONSTRAINTS] — version, OS, network/firewall rules, change-control limits, and any existing backups/WAL archiving Then work through the following steps: 1. **Clarify RPO/RTO and derive replication mode** — Translate the stated targets into a concrete choice: asynchronous replication (lower latency, possible data loss on failover) versus synchronous (`synchronous_commit` with `synchronous_standby_names`, zero/bounded data loss but write availability depends on a healthy synchronous standby). Call out the quorum-commit option (`ANY k (...)`) and the failure modes of each. 2. **Design the consensus/DCS layer** — Specify the quorum store (etcd/Consul/ZooKeeper, or the Kubernetes API for Patroni). Require an odd number of voting members spread across failure domains so the cluster keeps a majority when one domain dies. Explain why the DCS, not PostgreSQL, owns the leader lock. 3. **Map the node topology to failure domains** — Place primary and standbys across AZs/regions so no single failure (host, rack, AZ) loses both a quorum and all writable candidates. Note cascading-replication or witness-node options for small clusters. 4. **Define fencing and split-brain prevention** — Detail how the old primary is demoted or isolated: DCS leader-lock expiry plus Patroni `pg_rewind`/demote, watchdog/`softdog` for self-fencing, STONITH or cloud API stop where available. Spell out what happens to a primary that loses DCS connectivity but is still running. 5. **Specify client routing and connection draining** — Define how clients are steered to the new primary: callback-driven HAProxy health checks against the Patroni REST API (`/primary`, `/replica`), VIP move, or DNS/service-discovery update, fronted by PgBouncer. Include connection-reset behavior and `target_session_attrs=read-write` where applicable. 6. **Author the config artifacts** — Produce the key fields of the Patroni YAML (or repmgr.conf): `ttl`, `loop_wait`, `retry_timeout`, `maximum_lag_on_failover`, `synchronous_mode`, and the bootstrap/`pg_hba` essentials, with each value justified against the RTO/RPO. 7. **Build the failover test plan** — Enumerate drills: graceful switchover, hard primary kill, DCS-node loss, network partition of the primary, and full-AZ loss. For each, state expected behavior, what to observe, and the data-integrity check afterward. Output format: Return a markdown architecture document containing (a) a topology table with columns Node | Role | Failure Domain | Sync? | Notes, (b) a fenced config block of the recommended Patroni/repmgr settings with inline justification comments, and (c) a failover test matrix with columns Scenario | Trigger | Expected Outcome | Observation | Pass Criteria. Guardrails: Treat every recommendation as a proposal, not an instruction to execute. Validate the entire design on a replica or staging cluster that mirrors production topology before any production change, and confirm a tested, restorable backup plus WAL archiving exist before enabling automatic failover. Never run a destructive failover drill in production without an explicit maintenance window and a named human approving the change; document a manual rollback path for every scenario. Flag any place where synchronous mode could block writes so the operator can make an informed availability-versus-durability decision.

Why this prompt works

High-availability failover is one of the most dangerous areas of PostgreSQL operations because the failure that triggers it is, by definition, the moment your assumptions are tested. This prompt forces the model to reason about consensus before replication and replication before routing, which mirrors how experienced DBAs actually design clusters: the question is never “how do I promote a standby” but “how do I guarantee exactly one node believes it is primary.” By demanding RPO/RTO targets up front and translating them into a concrete synchronous-versus-asynchronous decision, the prompt prevents the common failure of copying a Patroni config from a blog post without understanding its durability trade-offs.

The structured placeholders ensure the model has the context that genuinely changes the answer, especially failure-domain layout and the existing DCS. A three-node cluster in a single AZ and a five-node cluster spread across three AZs demand fundamentally different quorum and witness decisions, and the prompt’s insistence on mapping nodes to failure domains surfaces single points of failure that are easy to miss. Requiring an odd number of voting members and explicit fencing language steers the model away from the most catastrophic outcome, split-brain, which silently corrupts data rather than causing an obvious outage.

Finally, the output format makes the design reviewable and testable rather than aspirational. A topology table, a justified config block, and a failover test matrix give an operator concrete artifacts to inspect, challenge, and rehearse on staging. The guardrails close the loop by insisting on a tested backup, a staging validation, and a human approver before any production change, which is exactly the discipline that separates a resilient cluster from one that only appears resilient until the first real primary loss.

PostgreSQL HA Automatic Failover Design Prompt

Why this prompt works

Related prompts

Postgres Replication Lag Debugging Prompt

Why this prompt works

Related prompts

Postgres Replication Lag Debugging Prompt

Free: the DevOps AI Incident-Triage Cheat Sheet