Skip to content
DevOps AI ToolKit
Newsletter
All prompts
AI for MySQL Difficulty: Intermediate ClaudeChatGPTCursor

MySQL JSON Column Schema & Query Design Prompt

Decide where JSON fits in a MySQL 8.0 schema, then design storage, validation, indexing, and queries for it.

Target user
Backend engineers and DBAs modeling semi-structured data in MySQL 8.0
Difficulty
Intermediate
Tools
Claude, ChatGPT, Cursor

The prompt

You are a senior MySQL 8.0 DBA who has cleaned up many "everything is a JSON blob" schemas. You understand the native JSON type, the `->` and `->>` (JSON_UNQUOTE(JSON_EXTRACT())) operators, JSON_TABLE for relational extraction, indexing JSON via generated columns, and validating shape with CHECK constraints using JSON_SCHEMA_VALID(). You know JSON columns cannot be indexed directly and that values are stored as a binary document, not text.

I will provide:
- The data I want to store and how it is queried (filters, sorts, joins, aggregates): [DESCRIBE]
- The proposed or existing table DDL, including any JSON columns: [PASTE]
- Representative sample JSON documents (2-3, redacted): [PASTE]
- Read/write volume, document size, and which keys are queried most: [DESCRIBE]
- MySQL version and any ORM or app-layer constraints: [DESCRIBE]

Work through this:

1. **JSON vs normalized.** Decide which fields belong in real typed columns versus a JSON column. Push hot, indexed, frequently-filtered attributes into columns; keep sparse, optional, or schema-flexible attributes in JSON. State the trade-off explicitly.
2. **Validate the shape.** Where JSON is justified, add a CHECK constraint using JSON_SCHEMA_VALID('<schema>', col) so malformed or wrong-typed documents are rejected at write time. Provide the JSON Schema document.
3. **Index what you query.** For each predicate on a JSON path, add a STORED or VIRTUAL generated column extracting that value (e.g. `val VARCHAR(64) AS (data->>'$.status')`) and an index on it. Recommend VIRTUAL for read-mostly extraction, STORED only when needed. Show the exact ALTER TABLE.
4. **Rewrite the queries.** Convert blind `data->>'$.x'` filters to hit the generated-column index, and use JSON_TABLE to flatten nested arrays into rows for joins and aggregation rather than app-side parsing.
5. **Check correctness traps.** Call out collation/case sensitivity on `->>` results, NULL vs JSON null, type coercion in comparisons, and CAST where ordering matters.

Output: revised DDL (columns, generated columns, indexes, CHECK), a JSON Schema for validation, and before/after query rewrites with the EXPLAIN expectation for each.

Guardrails: prototype the new DDL and run EXPLAIN on a replica or staging copy with production-like data before touching prod; back up the table (and confirm a restore path) before any ALTER, since adding STORED generated columns rewrites the table.

Why this prompt works

The most common JSON mistake in MySQL 8.0 is treating the JSON type as a license to stop designing a schema. JSON is excellent for sparse, optional, or genuinely schema-flexible data, and it is a liability when used to hide attributes you filter, sort, and join on every request. This prompt forces the first and most important decision up front — what belongs in a typed column versus a document — instead of letting the model bless a blob and move on.

From there it leans on the features that make JSON in MySQL actually performant. You cannot index a JSON column directly, so the only real path to indexed access is a generated column that extracts the path you query, indexed like any other column; the prompt makes the model produce that ALTER TABLE explicitly and recommend VIRTUAL over STORED unless there is a concrete reason, because STORED rewrites the table. Pairing extraction with JSON_TABLE means nested arrays become rows you can join and aggregate in SQL rather than parsing in application code.

Finally it closes the correctness gaps that bite teams later: validating document shape with a JSON_SCHEMA_VALID CHECK constraint so bad data never lands, and flagging the subtle traps around ->> collation, JSON null versus SQL NULL, and comparison coercion. The guardrails keep the risky parts — table-rewriting STORED columns and constraints that can reject the existing back-catalog — behind a replica test and a verified backup.

Related prompts

Newsletter

Free: the DevOps AI Incident-Triage Cheat Sheet

Subscribe and we’ll send you the one-page cheat sheet — plus weekly AI prompts, automation ideas, and tool reviews for infrastructure engineers. One email a week. No spam, unsubscribe anytime.

  • AI Incident-Triage Cheat Sheet (PDF)
  • Access to 2,104 DevOps AI prompts
  • One practical workflow email per week