GitLab CI/CD DORA Metrics & Pipeline Instrumentation Prompt
Instrument pipelines to feed accurate DORA metrics — deployment frequency, lead time, change failure rate, and MTTR — using GitLab environments, deployment APIs, and incident links.
- Target user: Platform/DevEx leads measuring and improving delivery performance
- Difficulty: Advanced
- Tools: Claude, ChatGPT
The prompt
You are a delivery-performance engineer who instruments GitLab so the four DORA metrics are accurate and trusted, not vanity numbers.

I will provide:
- My deploy jobs and `environment:` config (do they set `environment: production`?)
- Whether deployments are recorded in GitLab (vs raw kubectl with no environment)
- How incidents are tracked (GitLab incidents, PagerDuty, labels)
- My GitLab tier (DORA charts need Ultimate; otherwise we compute manually)

Your job:
1. **Fix the data source first**: DORA deployment frequency + lead time come from GitLab DEPLOYMENT records, which only exist when a job has `environment:` with a recognized tier. Audit my jobs and show the `environment:` + `deployment_tier: production` additions needed so deploys are actually counted.
2. **Define each metric precisely** for my setup:
   - Deployment frequency = successful prod deployments / time.
   - Lead time for changes = merge-to-deploy duration (GitLab uses MR merge → deployment).
   - Change failure rate = deploys linked to an incident / total deploys.
   - MTTR = incident open → resolved.

   Clarify what GitLab measures automatically vs what I must wire.
3. **Wire change failure rate**: show how to link an incident to a deployment (GitLab incident `deployment`/`error budget`, or label convention) so CFR isn't guesswork; for non-Ultimate, a script that posts deploy+incident events to a metrics store.
4. **Avoid distortions**: re-deploys, rollbacks-as-deploys, manual hotfixes outside CI, and multi-environment pipelines that double-count. Recommend rules so only real prod releases count.
5. **Surface it**: Ultimate DORA charts (or an export via the GraphQL `dora` API, which also requires Ultimate) vs a custom dashboard (e.g. Prometheus) for teams without Ultimate; show the query.
6. **Tie metrics to action**: what a high CFR or long lead time points to (flaky tests, big batch sizes, manual gates) and the pipeline change that moves each metric.
7. **Validate**: reconcile counted deployments against actual prod releases for one week; the numbers must match reality or nobody trusts them.

Output: (a) the `environment`/`deployment_tier` job changes, (b) precise metric definitions for my pipeline, (c) the incident→deployment linkage, (d) a dashboard/API query, (e) a reconciliation check.

Bias toward: trustworthy data over impressive charts, counting only real prod releases, and metrics that drive a specific pipeline change.
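For teams without Ultimate, the four metric definitions in the prompt can be computed by hand from exported deployment and incident records. The Python sketch below is one hypothetical way to do that, not GitLab's own implementation. The `Deployment` and `Incident` record shapes, the `dora_metrics` function, and the use of medians for lead time and restore time are all assumptions made for illustration; adapt the fields to whatever your export actually contains.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import Optional


@dataclass(frozen=True)
class Deployment:
    """One deployment record (hypothetical shape, not a GitLab API type)."""
    id: int
    environment_tier: str   # e.g. "production", "staging"
    status: str             # e.g. "success", "failed"
    merged_at: datetime     # when the merge request was merged
    finished_at: datetime   # when the deployment finished


@dataclass(frozen=True)
class Incident:
    """One incident record (hypothetical shape)."""
    id: int
    opened_at: datetime
    resolved_at: Optional[datetime]  # None while the incident is still open
    deployment_id: Optional[int]     # deployment linked to the incident, if any


def _hours(start: datetime, end: datetime) -> float:
    """Duration between two timestamps, in hours."""
    return (end - start).total_seconds() / 3600


def dora_metrics(deployments, incidents, window_days):
    """Compute the four metrics over a window of `window_days` days.

    Only successful production deployments are counted, so staging
    deploys and failed jobs do not inflate the numbers.
    """
    prod = [
        d for d in deployments
        if d.environment_tier == "production" and d.status == "success"
    ]
    prod_ids = {d.id for d in prod}

    # A deployment counts as failed once, however many incidents link to it.
    failed_ids = {
        i.deployment_id for i in incidents if i.deployment_id in prod_ids
    }

    lead_times = [_hours(d.merged_at, d.finished_at) for d in prod]
    restore_times = [
        _hours(i.opened_at, i.resolved_at)
        for i in incidents
        if i.resolved_at is not None
    ]

    return {
        "deployment_frequency_per_day": len(prod) / window_days,
        "median_lead_time_hours": median(lead_times) if lead_times else None,
        "change_failure_rate": len(failed_ids) / len(prod) if prod else None,
        "median_time_to_restore_hours": (
            median(restore_times) if restore_times else None
        ),
    }
```

Returning `None` instead of `0` when there are no deployments or resolved incidents keeps an empty week from looking like a perfect one on a dashboard. The same filtered `prod` list is also a natural input for the one-week reconciliation: compare its IDs against the releases you know actually shipped.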