GitLab CI/CD Merge Train Throughput & Failure Recovery Prompt
Tune an existing merge train so it merges MRs faster, recovers cleanly from a failing train member, and stops thrashing during peak hours.
- Target user: Maintainers operating merge trains on a busy default branch
- Difficulty: Advanced
- Tools: Claude, ChatGPT
The prompt
You are a senior delivery engineer who specializes in GitLab merge trains and merge-throughput tuning.

I will provide:
- Whether merge trains and merged results pipelines are enabled
- Average pipeline duration and MR volume per day
- Symptoms (trains restarting, long queues, frequent train drops)
- My `interruptible:` and `workflow:rules` setup

Your job:
1. **Model the train**: explain how a failing MR forces re-runs of every MR behind it, and quantify the throughput cost.
2. **Speed the critical path**: recommend `interruptible: true` on safe jobs, `needs:` DAG ordering, and `workflow:auto_cancel` so superseded pipelines stop fast.
3. **Train-only fast lane**: show how to use `$CI_MERGE_REQUEST_EVENT_TYPE == "merge_train"` in `rules` to skip non-essential jobs on the train.
4. **Failure recovery**: define a policy for when to remove a failing MR from the train versus retry it, and explain why `allow_failure` must NOT mask train-breaking jobs.
5. **Concurrency**: tune train parallelism and runner capacity to avoid starvation.
6. **Guardrails**: protect against flaky tests dropping good MRs.

Output as:
(a) the tuned `.gitlab-ci.yml` rules/needs snippets,
(b) a runbook for a stuck or thrashing train,
(c) a before/after throughput estimate.

Highlight any change that lets unverified code merge to the default branch, and give a back-out to disable merge trains.
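
For orientation, output (a) of the prompt might resemble the sketch below. It is an illustration rather than a recommended configuration. The job names (`build`, `unit-tests`, `docs-lint`, `deploy`), the stage layout, and the `make` targets are hypothetical placeholders. It also assumes a GitLab version recent enough to support `workflow:auto_cancel`.

```yaml
# Illustrative sketch only. Job names, stages, and scripts are hypothetical.
workflow:
  auto_cancel:
    # When a newer commit supersedes a pipeline, cancel only jobs
    # that are marked interruptible.
    on_new_commit: interruptible
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH

stages: [build, test, deploy]

build:
  stage: build
  interruptible: true        # safe to cancel: no external side effects
  script:
    - make build

unit-tests:
  stage: test
  needs: [build]             # DAG ordering: start as soon as build finishes
  interruptible: true
  script:
    - make test

# Non-essential job: runs on ordinary MR pipelines, skipped on the merge train.
docs-lint:
  stage: test
  needs: []                  # no dependencies, can start immediately
  interruptible: true
  rules:
    - if: $CI_MERGE_REQUEST_EVENT_TYPE == "merge_train"
      when: never
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
  script:
    - make docs-lint

deploy:
  stage: deploy
  interruptible: false       # never cancel a deploy midway
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
  script:
    - make deploy
```

Any job skipped on the train, such as the hypothetical `docs-lint` above, no longer gates the merge. This is exactly the kind of change the prompt asks the model to highlight as a possible route for unverified code to reach the default branch.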