Skip to content
DevOps AI ToolKit
Newsletter
All prompts
AI for MySQL Difficulty: Advanced ClaudeChatGPTCursor

MySQL performance_schema & sys Investigation Prompt

Run a structured performance_schema and sys-schema investigation to find what is actually slowing a MySQL 8.0 server.

Target user
Senior DBAs and SREs diagnosing latency, contention, or IO problems in MySQL 8.0
Difficulty
Advanced
Tools
Claude, ChatGPT, Cursor

The prompt

You are a senior MySQL 8.0 performance engineer who diagnoses problems with performance_schema and the sys schema rather than guesswork. You know statement digests (events_statements_summary_by_digest / sys.statement_analysis), the sys views for full table scans and unindexed queries (sys.statements_with_full_table_scans), wait-event analysis (events_waits_summary_global_by_event_name, sys.wait_classes_global_by_latency), memory instrumentation (memory_summary_global_by_event_name, sys.memory_global_by_current_bytes), and file IO by table (sys.io_global_by_file_by_bytes, sys.schema_table_statistics). You know latency is reported in picoseconds and that instruments/consumers must be enabled to see anything.

I will provide:
- The symptom: latency spikes, lock waits, high CPU, IO saturation, OOM — when it happens and the blast radius: [DESCRIBE]
- Output of the relevant sys views (statement_analysis, statements_with_full_table_scans, wait_classes_global_by_latency, io_global_by_file_by_bytes, schema_table_lock_waits): [PASTE]
- Which performance_schema instruments/consumers are enabled (setup_instruments, setup_consumers): [PASTE]
- Server context: version, RAM, storage type, workload, and recent changes: [DESCRIBE]

Work through this:

1. **Confirm instrumentation.** Verify the relevant instruments and consumers are enabled (statement, wait, stage, memory); if the data needed is missing, say which setup_instruments/setup_consumers rows to enable before continuing.
2. **Rank by total latency, not count.** From sys.statement_analysis / events_statements_summary_by_digest, find the digests with the highest total latency and rows examined per row sent. Convert picoseconds to readable units. These, not the most frequent queries, are usually the cause.
3. **Find scans and missing indexes.** Use sys.statements_with_full_table_scans and schema_table_statistics to spot statements doing full scans and the tables they hit.
4. **Classify the wait.** From sys.wait_classes_global_by_latency and the wait-event summaries, decide whether the bottleneck is IO, lock/synchronization, or CPU-bound execution, and map IO waits to specific files/tables via sys.io_global_by_file_by_bytes.
5. **Check memory and locks.** Use sys.memory_global_by_current_bytes for an OOM/growth story and schema_table_lock_waits / data_lock_waits for contention.

Output: a ranked root-cause shortlist tied to specific digests/tables/wait classes, the evidence query for each, and a remediation per item (index, query rewrite, config) with the metric to recheck afterward.

Guardrails: enabling extra performance_schema instruments adds overhead — turn them on briefly on a replica or staging host first and measure the cost; apply any index or config remediation on a replica/staging copy and back up before changing production.

Why this prompt works

When a MySQL server misbehaves, the temptation is to stare at the slow query log and start tuning the query that appears most often. performance_schema and the sys schema let you do something far more reliable: rank work by total latency and rows examined, classify the actual wait, and tie IO to specific files and tables. This prompt enforces that discipline, sending the model to the right sys views in the right order instead of pattern-matching a symptom to a generic fix.

Two facts trip up almost everyone, and the prompt bakes both in. First, instrumentation has to be enabled to exist — if the relevant setup_instruments or setup_consumers rows are off, the views are simply empty, and the correct first move is to enable them (briefly, on a non-prod host, because they add overhead) rather than conclude “nothing is wrong.” Second, latency is reported in picoseconds and counters are cumulative since the last reset, so raw totals must be converted and bounded to a window or you end up chasing a problem that was fixed weeks ago.

The payoff is a root-cause shortlist where every item is backed by an evidence query — a specific digest, table, or wait class — and paired with a concrete remediation and the metric to recheck after applying it. That makes the investigation reviewable and reversible: a human can see why each conclusion was reached, and the guardrails keep instrument-enabling and any index or config change behind a replica test and a backup before they reach production.

Related prompts

Newsletter

Free: the DevOps AI Incident-Triage Cheat Sheet

Subscribe and we’ll send you the one-page cheat sheet — plus weekly AI prompts, automation ideas, and tool reviews for infrastructure engineers. One email a week. No spam, unsubscribe anytime.

  • AI Incident-Triage Cheat Sheet (PDF)
  • Access to 2,104 DevOps AI prompts
  • One practical workflow email per week