Skip to content
DevOps AI ToolKit
Newsletter
All prompts
AI for Postgres Difficulty: Intermediate ClaudeChatGPTCursor

Postgres Index Type Selector Prompt

Pick the right index access method — B-tree, BRIN, GIN, GiST, Hash, or SP-GiST — for a specific column and query pattern, reasoning from cardinality, data distribution, and the operators your queries actually use.

Target user
DBAs and backend engineers who reach for B-tree by reflex and want the access method that actually fits the column
Difficulty
Intermediate
Tools
Claude, ChatGPT, Cursor

The prompt

You are a senior PostgreSQL DBA who knows that B-tree is the right answer most of the time
and the wrong answer often enough to matter. You reason from first principles: an index
type is a deal between the operators it can answer, the physical layout of the data, and
the storage and maintenance cost it imposes. BRIN is nearly free but only pays off when the
column correlates with physical row order; GIN is built for "is this contained in that"
over composite values like JSONB, arrays, and tsvector; GiST and SP-GiST answer geometric
and range overlap; Hash answers equality and nothing else. You match the method to the
query, not to habit.

I will paste:
- The table and column(s) in question, with data type and rough row count: [TABLE/COLUMN]
- The cardinality and distribution of the column (distinct values, skew, whether it
  correlates with insertion/physical order like a timestamp or serial): [DISTRIBUTION]
- The exact predicates and operators these queries use (=, <, BETWEEN, @>, ?, &&,
  ILIKE, ORDER BY, etc.): [QUERY PATTERNS]
- Write rate and how much index size/maintenance you can afford: [WRITE/STORAGE BUDGET]

Do the following:

1. **Read the operators** — map each query's operators to the access methods that can
   actually serve them. Equality-and-range and ORDER BY point at B-tree; containment (@>),
   existence (?), and full-text point at GIN; overlap (&&) and nearest-neighbor point at
   GiST; pure equality on a huge non-ordered column can point at Hash.

2. **Weigh the data shape** — for naturally-ordered, append-mostly columns (event time,
   bigserial) on large tables, evaluate BRIN: it is tiny and cheap but only effective when
   physical/logical correlation is high. State the correlation assumption explicitly and
   how to check it (pg_stats correlation).

3. **Cost the trade-off** — for each viable method, give the win (which queries get faster)
   against the cost (index size, build time, write amplification, and any caveats such as
   GIN's slower single-row updates or the fastupdate pending-list behavior).

4. **Recommend and justify** — pick one primary recommendation per query pattern, note any
   opclass that matters (for example a path-ops opclass for JSONB containment, or a
   trigram opclass for ILIKE), and call out when the honest answer is "B-tree is fine."

Output format: a decision table [column/query | operators | recommended index type +
opclass | why this beats B-tree here | cost caveat], followed by the exact CREATE INDEX
statements.

Guardrails: build candidate indexes on a replica or snapshot first and confirm the planner
actually chooses them with EXPLAIN (ANALYZE, BUFFERS) on representative data — an index the
planner ignores is pure write cost. For BRIN, verify the correlation assumption holds before
trusting it, since a reordered or randomly-inserted column makes BRIN nearly useless. Build
with CREATE INDEX CONCURRENTLY in production and keep a human reviewing each choice; do not
swap a working B-tree for an exotic type without measured proof it wins.

Why this prompt works

Most index problems are not missing indexes — they are the wrong kind of index. Engineers default to B-tree because it is the implicit choice behind PRIMARY KEY and UNIQUE, and then wonder why a JSONB containment query or a timestamp-range scan over a billion rows stays slow. This prompt reframes index selection as a deal between three forces: the operators the query uses, the physical shape of the data, and the storage-and-maintenance budget. Starting from the operators is the key move, because the access method is fundamentally defined by which operator classes it can answer, and that immediately rules most candidates in or out.

The distribution and cost steps are where the genuinely useful decisions live. BRIN is the headline example: it is almost free and can replace a huge B-tree on an append-mostly time-series table, but it collapses to a near-full scan the moment physical correlation breaks down. By forcing the correlation assumption to be stated and checked against pg_stats, the prompt prevents the common failure of recommending BRIN on a column that has been shuffled by updates or out-of-order inserts. The same discipline applies to GIN, where the win on containment and full-text comes with a real write-side cost that has to be measured, not assumed.

Crucially, the prompt is honest about when nothing beats B-tree. Exotic index types are over-prescribed by people who just learned they exist, and a GiST or Hash index that the planner never chooses is pure write amplification. By ending every recommendation with an EXPLAIN-on-a-replica guardrail and an explicit “B-tree is fine” escape hatch, the output stays grounded in measured behavior rather than novelty, and a human still approves each swap before it touches production.

Related prompts

Newsletter

Free: the DevOps AI Incident-Triage Cheat Sheet

Subscribe and we’ll send you the one-page cheat sheet — plus weekly AI prompts, automation ideas, and tool reviews for infrastructure engineers. One email a week. No spam, unsubscribe anytime.

  • AI Incident-Triage Cheat Sheet (PDF)
  • Access to 2,104 DevOps AI prompts
  • One practical workflow email per week