Skip to content
DevOps AI ToolKit
Newsletter
All prompts
AI for Postgres Difficulty: Intermediate ClaudeChatGPTCursor

Postgres Full-Text Search Design Prompt

Turn a fuzzy search requirement into a concrete Postgres full-text design — tsvector storage, the right index, a text search configuration, and a ranked query — so search is fast, relevant, and maintainable instead of a pile of ILIKE scans.

Target user
Backend engineers and DBAs adding or fixing search on a Postgres-backed app
Difficulty
Intermediate
Tools
Claude, ChatGPT, Cursor

The prompt

You are a senior PostgreSQL DBA who designs full-text search from first principles. You
reason from how the documents are queried — match semantics, language, latency, write
rate — rather than reaching for a default recipe. You know that full-text search and
fuzzy/substring matching are different problems with different indexes, and you pick the
one that fits the actual queries.

I will paste:
- The table and columns to search, with types and approximate row count: [TABLE / COLUMNS]
- Representative example searches users will run, including any phrase or prefix needs: [EXAMPLE QUERIES]
- The match semantics required — whole words, stemming, phrase, prefix/autocomplete, typo tolerance: [MATCH SEMANTICS]
- Language(s) of the content, and write/update rate on the searched columns: [LANGUAGE / WRITE RATE]
- Postgres version and any latency or relevance requirements: [VERSION / SLA]

Work through this in order:

1. **Classify the problem** — decide whether this is true full-text search (word/stemming/
   phrase matching, ranking) or substring/fuzzy matching (autocomplete, typo tolerance,
   "contains" on short strings). Full-text search points to `tsvector`/`tsquery`; the
   latter points to `pg_trgm` trigram indexes. State which applies, and call out cases
   that need both.

2. **Choose a text search configuration** — pick the right configuration and dictionary
   for the language (for example `english` vs `simple`), and explain the effect of
   stemming and stop words on the example queries. Note where `unaccent` or a custom
   configuration is warranted so accented and unaccented forms match.

3. **Decide tsvector storage** — recommend a generated `tsvector` column (a stored
   `GENERATED ALWAYS AS (...) STORED` column, indexed) versus a functional expression
   index built directly on `to_tsvector(...)`. Weigh write amplification, the ability to
   apply per-field weights, and query ergonomics. Be explicit that the indexed expression
   and the query expression must use the same configuration or the index will not be used.

4. **Apply weighting and ranking** — show how to combine multiple fields with `setweight`
   (for example title at weight A, body at weight D) and how to order results with
   `ts_rank` or `ts_rank_cd`. Explain when ranking is worth its cost and when a simple
   boolean match plus an ORDER BY on another column is enough.

5. **Specify the index and the query** — give the exact `CREATE INDEX` (GIN for typical
   full-text workloads; mention GiST and the RUM extension only where they genuinely fit)
   and the matching `WHERE` clause using `@@` with `websearch_to_tsquery` or
   `to_tsquery`. For autocomplete or typo tolerance, give the `pg_trgm` GIN/GiST index and
   a `%`/`similarity()` or `ILIKE` query instead, and say why.

Output format: (a) one-line verdict on full-text vs trigram (or both), (b) the exact DDL
for the column and index, (c) the exact search query with ranking, and (d) a short table
[decision | choice | reason | trade-off].

Guardrails: build the design on a replica or a snapshot of prod first and confirm with
`EXPLAIN (ANALYZE, BUFFERS)` that the GIN/trigram index is actually used before rollout.
Create indexes with `CREATE INDEX CONCURRENTLY` so you do not hold an ACCESS EXCLUSIVE
lock, and remember that adding a STORED generated column rewrites the table — schedule it
in a maintenance window and back up first. Keep a human reviewing the relevance results on
real data before this replaces an existing search path.

Why this prompt works

The most common full-text mistake in Postgres is solving the wrong problem. Teams reach for tsvector when they actually need substring or typo-tolerant matching (a trigram job), or they sprinkle ILIKE '%term%' everywhere when they really want stemmed, ranked word search. This prompt forces the classification step up front, so the design follows the match semantics the users actually need rather than whichever feature was remembered first.

It also makes the two decisions that quietly determine whether search is fast: the text search configuration and the storage/index shape. Stemming and stop words from the wrong configuration change which queries match, and a mismatch between the indexed expression and the query expression is the classic reason a GIN index sits unused while every search does a sequential scan. By demanding the same configuration on both sides and asking for an EXPLAIN check, the prompt closes the gap where most “why is search slow” tickets originate.

Finally, weighting and ranking are treated as deliberate choices with costs, not defaults. setweight plus ts_rank gives genuinely better ordering, but ranking is not free, and sometimes a boolean match with a cheaper sort is the right call. Pairing that with backup-first, CONCURRENTLY, and replica-testing guardrails keeps a schema-rewriting change — which a generated column is — under human control rather than something that surprises production.

Related prompts

Newsletter

Free: the DevOps AI Incident-Triage Cheat Sheet

Subscribe and we’ll send you the one-page cheat sheet — plus weekly AI prompts, automation ideas, and tool reviews for infrastructure engineers. One email a week. No spam, unsubscribe anytime.

  • AI Incident-Triage Cheat Sheet (PDF)
  • Access to 2,104 DevOps AI prompts
  • One practical workflow email per week