Partitioning Large Postgres Tables With AI

A 900GB events table is the kind of object that makes every operation scary. Vacuum takes forever, an index build is an all-day affair, and dropping old data means a DELETE that bloats the table worse than the data it removed. Partitioning is the right answer for tables like this — splitting one giant table into many smaller child tables by a key — but it’s also easy to get wrong in ways that are expensive to undo, starting with picking the wrong partition key. AI is a strong design partner here: it reasons clearly about partition strategy, key choice, and the migration sequence. What it can’t do is know your query patterns or your data distribution. You bring those; it designs; you verify on a replica.

This is how I approach partitioning a table that’s already too big.

Decide whether to partition at all

Partitioning isn’t free. It adds operational complexity, the partition key has to be in most queries to get the benefit (partition pruning), and a bad key gives you all the cost and none of the gain. The clear wins: time-series data you age out (drop old partitions instantly instead of DELETE), tables huge enough that per-table maintenance is painful, and queries that always filter on a natural key like tenant or date. I describe the table and its top queries to AI and have it argue whether partitioning actually helps before I commit to a multi-week migration.

Choose the partition key from real query patterns

This is the decision that matters most, and it must come from how the table is queried, not how it feels. If most queries filter by created_at, range-partition by time. If they filter by tenant_id, consider partitioning by tenant. The key has to appear in queries so the planner can prune to a single partition.

-- what do queries actually filter on? pull the evidence
SELECT substr(query, 1, 100) AS query, calls
FROM pg_stat_statements
WHERE query ILIKE '%events%'
ORDER BY calls DESC
LIMIT 20;

Here are the top 20 queries hitting the events table by call count, plus its schema. Most filter on created_at and some on tenant_id. Recommend a partition strategy: range by month on created_at, list/hash by tenant_id, or composite. Explain which queries will get partition pruning and which won’t, and flag any query that would have to scan every partition.

The model is good at mapping queries to pruning behavior. The query that can’t prune is the one to watch — if a hot query doesn’t filter on your partition key, partitioning could make it slower by forcing a scan across all children.

Design the partitioned table

For time-series, range partitioning by month is the common, boring, correct choice. Create the parent as partitioned and add child partitions.

CREATE TABLE events_p (
  id          bigint GENERATED ALWAYS AS IDENTITY,
  tenant_id   bigint NOT NULL,
  created_at  timestamptz NOT NULL,
  payload     jsonb,
  PRIMARY KEY (id, created_at)        -- key must include partition column
) PARTITION BY RANGE (created_at);

CREATE TABLE events_p_2026_06 PARTITION OF events_p
  FOR VALUES FROM ('2026-06-01') TO ('2026-07-01');

CREATE TABLE events_p_2026_07 PARTITION OF events_p
  FOR VALUES FROM ('2026-07-01') TO ('2026-08-01');

Note the primary key has to include the partition column — a real constraint people hit immediately. I ask AI to confirm the PK and unique-constraint implications, because they ripple into your foreign keys.

Automate partition creation

You don’t want to create partitions by hand forever. Either use pg_partman to manage them, or script ahead-of-time creation so you never write into a missing partition.

CREATE EXTENSION IF NOT EXISTS pg_partman;
SELECT partman.create_parent(
  p_parent_table => 'public.events_p',
  p_control      => 'created_at',
  p_type         => 'range',
  p_interval     => '1 month',
  p_premake      => 3
);

AI is handy for drafting the pg_partman config and the retention policy, but verify the premake count covers your write horizon — a write into a date with no partition fails outright unless you have a default partition.

Migrate the existing data without a long lock

This is the dangerous part. You can’t just ALTER TABLE a giant table into a partitioned one — you build the new partitioned table alongside, backfill in batches, then cut over. The lock-aware sequence matters as much as it does for any schema migration.

-- backfill in time-ranged batches so each statement is bounded
INSERT INTO events_p
SELECT * FROM events
WHERE created_at >= '2026-06-01' AND created_at < '2026-06-08';
-- repeat per range, committing each; app dual-writes during cutover

I need to migrate a 900GB non-partitioned events table to a monthly range-partitioned table with minimal lock time. Lay out the cutover: build the partitioned table, backfill in batches, dual-write from the app, verify counts, then rename/swap. Include the lock level of the final swap and how to make it fast. Give me a rollback at each step.

The final rename swap takes a brief exclusive lock, so you do it in a low-traffic window with lock_timeout set. AI lays out the dual-write-and-swap sequence well; you prove the whole thing on a replica with real data first.

The payoff and the verification

Once partitioned, dropping old data is instant — DROP TABLE events_p_2025_01 instead of a bloating DELETE — and maintenance runs per partition instead of over the whole monster. Confirm pruning actually works on your real queries:

EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM events_p
WHERE created_at >= '2026-06-01' AND created_at < '2026-06-15';
-- the plan should touch one partition, not all of them

If the plan scans every partition, your key or your query is wrong and you fix it before cutover. AI designs the strategy, the key, and the migration sequence quickly and accurately — but it can’t see your data distribution or run your queries, so the replica dry-run with production-scale data is the step that actually proves it. The rest of the scaling material is in the Postgres guides, and my partitioning-design prompts live in the prompt library.