MySQL Metadata Lock & Lock-Wait Debug Prompt
Trace which transaction is holding a metadata lock and blocking your DDL or DML before lock_wait_timeout fires.
- Target user
- DBAs and engineers whose schema changes or queries stall on lock waits
- Difficulty
- Advanced
- Tools
- Claude, ChatGPT, Cursor
The prompt
You are a senior MySQL 8.0 DBA who debugs lock waits and stalled schema changes. You understand metadata locks (MDL) and the way any open transaction that touched a table holds an MDL on it until it ends, blocking DDL behind it; you read performance_schema.metadata_locks, performance_schema.data_locks, and performance_schema.data_lock_waits fluently; and you know how lock_wait_timeout governs how long a statement waits for an MDL before failing with error 1205. I will provide: - The blocked statement and the exact error or hang behavior (e.g. "Waiting for table metadata lock"): [PASTE] - Output of `SELECT * FROM performance_schema.metadata_locks`, `data_locks`, and `data_lock_waits`: [PASTE] - `SHOW PROCESSLIST` (or `information_schema.processlist`) and `SELECT * FROM information_schema.innodb_trx`: [PASTE] - The current lock_wait_timeout and innodb_lock_wait_timeout values: [DESCRIBE] Work through this: 1. **Separate MDL from row locks.** Decide whether the wait is a metadata lock (the "Waiting for table metadata lock" state, visible in metadata_locks with a PENDING lock_status) or an InnoDB row/gap lock (visible in data_lock_waits). The fix differs, so name the lock layer first. 2. **Find the holder.** For an MDL wait, join performance_schema.metadata_locks to find the GRANTED lock on the same object whose owning thread is blocking the PENDING request, then map that thread back to a processlist id and a transaction in innodb_trx. Surface how long that transaction has been open and its last statement. 3. **Explain why it is held.** Show that the blocker is typically a long-running or idle-in-transaction session that selected from or modified the table and never committed, holding the MDL for the life of the transaction even though it is doing nothing now. 4. **Prescribe the fix in order.** Prefer ending the blocking transaction (commit/rollback, or KILL the specific connection identified by thread id) over killing the DDL; then address the cause — shorter transactions, no DDL behind a long reporting query, an appropriate lock_wait_timeout so the DDL fails fast instead of hanging, and online DDL (ALGORITHM=INPLACE) where it reduces MDL hold time. 5. **State what to monitor.** Recommend watching for long-open transactions in innodb_trx, idle sessions inside a transaction, and recurring "Waiting for table metadata lock" states before the next schema change. Output: (a) Lock layer (MDL vs row lock) identified, (b) The exact blocking thread/transaction with its age, (c) Why it holds the lock, (d) Ranked remediation including the precise KILL target if needed, (e) Monitoring guidance to prevent recurrence. Guardrails: validate any change on a replica or staging copy first and back up before schema changes; KILL only the specific identified connection after confirming what it is doing, never kill connections blindly; treat lock_wait_timeout changes as behavior-affecting and review their effect on application retry logic; never run blocking DDL on a busy production table without a maintenance window and a rollback plan.
Why this prompt works
A stalled ALTER TABLE stuck in “Waiting for table metadata lock” is one of the most confusing failures in MySQL, because the thing blocking it is usually invisible in the obvious places. The running query you see in SHOW PROCESSLIST is often not the culprit; the culprit is an open transaction that touched the table minutes ago and is now sitting idle, holding the MDL until it commits. This prompt forces the model to separate the metadata-lock layer from the InnoDB row-lock layer first, because mistaking one for the other sends you chasing the wrong fix entirely.
The diagnostic backbone is the performance_schema join: correlate the PENDING lock request in metadata_locks against the GRANTED lock on the same object, map the holding thread back through processlist to innodb_trx, and read how long that transaction has been open. That chain turns a vague hang into a named connection with an age and a last statement. Without it, engineers tend to kill the DDL and retry, which never resolves the underlying long transaction and simply repeats the stall.
The guardrails are where this earns its keep operationally. The right move is almost always to end the specific blocking transaction, not to kill the DDL or to start killing connections at random — and killing a connection rolls back its work, which on a large write is not free. By insisting the engineer identify the exact thread, confirm what it is doing, and set lock_wait_timeout so DDL fails fast rather than hanging indefinitely, the prompt keeps a routine schema change from turning into an outage.