Query Plan Hints and Steering the Postgres Planner With AI

Every engineer who comes to Postgres from another database eventually asks the same question: where are the query hints? You want to write something like /*+ INDEX(orders idx_customer) */ and force the planner to do the obvious thing. The answer is that core Postgres deliberately does not have them, and that decision shapes how you fix bad plans here. Instead of bolting a directive onto a query, you diagnose why the planner made the choice it did, and then you fix the input that misled it. AI is a genuinely useful partner for this work because reading a large EXPLAIN plan and reasoning about row estimates is exactly the kind of pattern-matching it does well. It still gets the final say wrong sometimes, so you verify on real plans before you trust it.

This is how I think about steering the Postgres planner without reaching for hints first.

Why Postgres Has No Hints

The Postgres community position is that hints encode a snapshot of what was optimal on one day, against one data distribution, on one version. Data grows, distributions skew, and a hint that pinned a nested loop in 2024 quietly becomes a performance landmine in 2026 when the joined table is ten times larger. A bad cost estimate is a bug you can fix once and benefit from everywhere; a hint is a workaround you have to remember, audit, and eventually remove.

That philosophy means the planner is doing its honest best with the statistics it has. When it picks a sequential scan over an index you were sure it should use, the productive question is not “how do I override this” but “what does the planner believe about this table that I don’t.” Most of the time the answer is a row estimate that is off by an order of magnitude.

Diagnose First With EXPLAIN

Start with EXPLAIN (ANALYZE, BUFFERS) so you can compare estimated rows against actual rows. The ratio between them is the single most valuable number on the page.

EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT o.id, o.total
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE c.region = 'EMEA'
  AND o.created_at >= now() - interval '30 days';

When a node reads rows=42 estimated but actual rows=38000, the planner chose its join strategy on a fantasy. A nested loop is cheap for 42 rows and catastrophic for 38,000. The fix is not to forbid the nested loop — it is to teach the planner that the row count is large.

For a structured triage of a confusing plan, I lean on the workflow in the Postgres slow query EXPLAIN triage prompt, which walks through reading each node and separating estimation errors from genuine cost problems.

Use enable_* Flags as a Diagnostic, Not a Fix

Postgres exposes session-level toggles like enable_seqscan, enable_nestloop, and enable_hashjoin. These are not hints — they are blunt instruments that make a plan node nearly infinitely expensive so the planner avoids it. Their real value is diagnostic. If you suspect the index scan would be faster, prove it in a session:

SET enable_seqscan = off;
EXPLAIN (ANALYZE, BUFFERS)
SELECT ... ;  -- the same query
RESET enable_seqscan;

If the forced index plan is dramatically faster, you have confirmed the planner’s cost model is wrong, not your intuition. That confirmation is the goal. You do not leave enable_seqscan = off in production, because the next query that genuinely needs a sequential scan will suffer for it. You use the result to go fix the statistics or cost settings that produced the bad estimate.

Prompt: “Here is an EXPLAIN ANALYZE plan. The Bitmap Heap Scan on orders estimates 50 rows but actually returns 41,000. The filter is status = 'shipped' AND created_at >= now() - interval '7 days'. What is the most likely cause and what would you check?”

Output (abridged): “The two-order-of-magnitude underestimate suggests the planner is treating status and created_at as independent and multiplying their selectivities, when they are correlated — recent rows are far more likely to be ‘shipped’. Check default_statistics_target for the table, then create extended statistics with CREATE STATISTICS ... (dependencies) on the pair so the planner models the correlation. Run ANALYZE afterward and recheck the estimate before changing any cost GUCs.”

That answer is correct in shape, and it points at the real lever: statistics.

Fix the Root Cause: Statistics

Most “the planner is dumb” complaints are really “the planner has stale or thin statistics.” Three tools handle the vast majority of cases.

First, make sure ANALYZE has actually run after a big data change. Autovacuum usually handles it, but after a bulk load you may be querying against statistics from before the load.

ANALYZE orders;

Second, increase the statistics resolution on columns with skewed distributions. The default histogram has 100 buckets; for a column with a long tail of values, that can be too coarse.

ALTER TABLE orders ALTER COLUMN status SET STATISTICS 1000;
ANALYZE orders;

You can raise the global default_statistics_target in postgresql.conf, but doing it per-column is more surgical and keeps ANALYZE cheap on tables that do not need it.

Third — and this is the one people forget — use extended statistics for correlated columns. The planner assumes columns are independent. When they are not, estimates collapse. CREATE STATISTICS teaches it about the relationship.

CREATE STATISTICS orders_status_created (dependencies, mcv)
  ON status, created_at FROM orders;
ANALYZE orders;

After this, the planner understands that status = 'shipped' and a recent created_at are not independent, and the 50-vs-41000 gap from the prompt above usually shrinks to something sane. This is the highest-leverage fix in the whole article, because it corrects the model rather than patching one query.

When You Genuinely Need pg_hint_plan

Sometimes the data is genuinely pathological, the statistics are already accurate, and the planner still picks a worse plan — often around complex joins where the search space is large. For those cases there is the pg_hint_plan extension, which adds Oracle-style hints in a comment block.

CREATE EXTENSION pg_hint_plan;

/*+ IndexScan(orders idx_orders_customer) NestLoop(orders customers) */
SELECT o.id FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE c.region = 'EMEA';

Treat this as the exception, not the tool. Every hint you add is a maintenance liability that must be revisited as data grows. If you do use it, document why the planner was wrong and what you tried first, so a future engineer knows whether the hint is still earning its keep. AI is useful here for drafting the hint syntax, which is fiddly, but the decision to hint at all should be yours and well-justified.

A Practical Order of Operations

When a query is slow because of a bad plan, work the list in order and stop as soon as the plan is good:

# 1. Capture the real plan
psql -c "EXPLAIN (ANALYZE, BUFFERS) <query>"

# 2. Confirm the better plan is actually better
psql -c "SET enable_seqscan=off; EXPLAIN ANALYZE <query>; RESET enable_seqscan;"

# 3. Fix statistics
psql -c "ANALYZE <table>;"
psql -c "ALTER TABLE <table> ALTER COLUMN <col> SET STATISTICS 1000; ANALYZE <table>;"
psql -c "CREATE STATISTICS ... ON <col_a>, <col_b> FROM <table>; ANALYZE <table>;"

# 4. Only then consider pg_hint_plan

AI accelerates steps 1 and 3 enormously — it reads the plan, spots the estimation error, and proposes the statistics object. You still run the commands and re-check the plan, because a confident wrong suggestion costs you nothing only if you verify it. If you want a deeper read of recurring planner pain across a workload, the Postgres index advisor over pg_stat_statements prompt is a good companion for finding the queries worth this attention.

The mindset shift is the whole point. Hints ask “how do I force the answer I want.” Postgres asks “why does the planner believe something false,” and that question, answered well, fixes a class of queries at once. For more on tuning the planner and the rest of the engine, the Postgres category collects the related guides.