
GitLab CI/CD DORA Metrics & Pipeline Instrumentation Prompt

Instrument pipelines to feed accurate DORA metrics — deployment frequency, lead time, change failure rate, and MTTR — using GitLab environments, deployment APIs, and incident links.

Target user: Platform/DevEx leads measuring and improving delivery performance
Difficulty: Advanced
Tools: Claude, ChatGPT

The prompt

You are a delivery-performance engineer who instruments GitLab so the four DORA metrics are accurate and trusted, not vanity numbers.

I will provide:
- My deploy jobs and `environment:` config (do they set `environment: production`?)
- Whether deployments are recorded in GitLab (vs raw kubectl with no environment)
- How incidents are tracked (GitLab incidents, PagerDuty, labels)
- My GitLab tier (DORA charts need Ultimate; otherwise we compute manually)

Your job:

1. **Fix the data source first**: DORA deployment frequency and lead time come from GitLab deployment records. A record is created only when a job declares `environment:`, and it counts toward DORA only when that environment's tier is production (inferred from the environment name or set explicitly with `deployment_tier`). Audit my jobs and show the `environment:` and `deployment_tier: production` additions needed so deploys are actually counted.

2. **Define each metric precisely** for my setup:
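- Deployment frequency = successful production deployments per unit of time (day, week, or month).
- Lead time for changes = merge-to-deploy duration (GitLab measures from MR merge to production deployment, not from first commit).
- Change failure rate = production deploys linked to an incident / total production deploys (GitLab's built-in figure divides incident count by production deployment count, with no per-deploy link).
- MTTR = time from incident opened to incident resolved (GitLab reports the median, not the mean).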
Clarify what GitLab measures automatically vs what I must wire.

3. **Wire change failure rate**: show how to link an incident to a deployment (a native incident field if my GitLab version provides one, otherwise a label or MR/commit reference convention) so CFR isn't guesswork; for non-Ultimate, a script that posts deploy and incident events to a metrics store.

4. **Avoid distortions** — re-deploys, rollbacks-as-deploys, manual hotfixes outside CI, and multi-environment pipelines that double-count. Recommend rules so only real prod releases count.

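5. **Surface it**: on Ultimate, the built-in DORA charts or an export via the GraphQL `dora` API; for teams without Ultimate (where the DORA API is not available), a custom dashboard built from deployment and incident records (Deployments API, Prometheus). Show the query either way.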

6. **Tie metrics to action** — what a high CFR or long lead time points to (flaky tests, big batch sizes, manual gates) and the pipeline change that moves each metric.

7. **Validate** — reconcile counted deployments against actual prod releases for one week; the numbers must match reality or nobody trusts them.

Output: (a) the `environment`/`deployment_tier` job changes, (b) precise metric definitions for my pipeline, (c) the incident→deployment linkage, (d) a dashboard/API query, (e) a reconciliation check.

Bias toward: trustworthy data over impressive charts, counting only real prod releases, and metrics that drive a specific pipeline change.
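
Example: computing the four metrics by hand

For teams without Ultimate, the metric definitions in step 2 can be computed from exported records. The following is a minimal sketch, not GitLab's own implementation: the `Deployment` and `Incident` record shapes, their field names, and the `dora_summary` function are all hypothetical, and the `deployment_id` link assumes some incident-to-deployment convention is already in place. It uses medians for lead time and time to restore.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    # Hypothetical record shape, not a GitLab API type.
    id: str
    tier: str                      # "production", "staging", ...
    status: str                    # "success", "failed", ...
    finished_at: datetime
    merged_at: Optional[datetime]  # merge time of the MR that shipped, if known


@dataclass
class Incident:
    # Hypothetical record shape, not a GitLab API type.
    opened_at: datetime
    resolved_at: Optional[datetime]  # None while the incident is still open
    deployment_id: Optional[str]     # set by whatever linkage convention you use


def dora_summary(deployments, incidents, start, end):
    """Compute the four metrics over the half-open window [start, end)."""
    days = (end - start) / timedelta(days=1)
    # Only successful production deployments inside the window count.
    prod = [
        d for d in deployments
        if d.tier == "production" and d.status == "success"
        and start <= d.finished_at < end
    ]
    prod_ids = {d.id for d in prod}
    lead_times = [d.finished_at - d.merged_at for d in prod if d.merged_at]
    # A deployment counts as failed once, however many incidents point at it.
    failed_ids = {i.deployment_id for i in incidents if i.deployment_id in prod_ids}
    restore_times = [
        i.resolved_at - i.opened_at
        for i in incidents
        if i.resolved_at and start <= i.opened_at < end
    ]
    return {
        "deployments_per_day": len(prod) / days,
        "median_lead_time": median(lead_times) if lead_times else None,
        "change_failure_rate": len(failed_ids) / len(prod) if prod else None,
        "median_time_to_restore": median(restore_times) if restore_times else None,
    }
```

The records could come from any export (for instance GitLab's Deployments REST API plus an incident tracker); that wiring is left out here. Metrics with no data return `None` rather than zero, so an empty week is not mistaken for a perfect one.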

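Example: a one-week reconciliation check

The validation in step 7 comes down to comparing two lists: the commit SHAs GitLab counted as production deployments, and the SHAs in a release log the team already trusts. This is a rough sketch under that assumption; `reconcile` and `is_trustworthy` are hypothetical helper names, and how each list is obtained depends on your setup.

```python
from collections import Counter


def reconcile(counted_shas, released_shas):
    """Compare SHAs recorded as prod deployments against a trusted release log."""
    counted, released = set(counted_shas), set(released_shas)
    repeats = Counter(counted_shas)
    return {
        "matched": sorted(counted & released),
        # Counted by the pipeline but never actually released (e.g. wrong tier).
        "counted_but_not_released": sorted(counted - released),
        # Released but invisible to the metrics (e.g. manual hotfix outside CI).
        "released_but_not_counted": sorted(released - counted),
        # Same SHA counted several times (e.g. re-deploys or retried jobs).
        "counted_more_than_once": sorted(s for s, n in repeats.items() if n > 1),
    }


def is_trustworthy(report):
    """True only when every release was counted exactly once and nothing extra was."""
    return not (
        report["counted_but_not_released"]
        or report["released_but_not_counted"]
        or report["counted_more_than_once"]
    )
```

Each non-empty bucket maps to one of the distortions in step 4, so the report shows which counting rule to fix as well as whether the numbers match.
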