Instrumenting GitLab Pipelines With AI-Generated OpenTelemetry Traces

Our pipeline takes eleven minutes. That’s the number I quote in standups, and it’s the number nobody can do anything about, because “eleven minutes” is not a diagnosis. Is it the test stage? The Docker build? Some flaky npm ci that occasionally redownloads the world? I had no idea. The GitLab UI shows me per-job durations in a list, but the moment a pipeline fans out across parallel jobs and stages, that list stops telling a story. There’s no waterfall. There’s no trace. There’s just a wall of green checkmarks and a slow clock.

So I did what I should have done a year ago: I wrapped the pipeline in OpenTelemetry traces. And because I’m not interested in hand-writing OTLP JSON payloads at 4pm on a Friday, I let AI generate most of the boilerplate. The result is a pipeline that emits a span per job, propagates a single trace ID across every stage, and ships everything to a collector I can actually query. Here’s how it fits together — and where I made the AI keep its hands off the keyboard.

Why per-job spans beat the duration list

A trace is a tree of spans. For CI, the natural shape is: one root span per pipeline, one child span per job, with the job spans timestamped so a viewer can reconstruct the waterfall. Once that’s in place, the bottleneck stops being a mystery — you open the trace, find the widest bar, and that’s your stage. Parallel jobs overlap visually. Retries show up as siblings. Queue time (the gap between “pipeline created” and “job started”) becomes a visible, measurable thing instead of a vibe.

The trick in GitLab is that jobs are isolated. Each job runs in a fresh environment with no shared memory, so you can’t keep a tracer object alive across them. Instead you treat each job as a self-contained span emitter that’s handed the same trace ID and the pipeline’s root span ID as its parent. A dotenv report artifact carries those IDs from job to job, and GitLab’s predefined variables supply the span attributes, so the propagation is almost free.

Generating the span emitter with AI

I gave the model a tight prompt: “Write a Bash function that emits a single OpenTelemetry span over OTLP/HTTP using curl. Inputs: trace ID, parent span ID, span name, start time (unix nanos), end time (unix nanos), and a set of key/value attributes. Output valid OTLP JSON to $OTLP_ENDPOINT/v1/traces.”

Treat the model like a fast junior engineer here. It’s genuinely good at remembering the shape of the OTLP JSON envelope — the resourceSpans → scopeSpans → spans nesting that I always have to look up. It produced this in one shot, and I only had to fix the timestamp units:

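# .gitlab-ci.yml: reusable span emitter, pulled in via !reference or YAML anchor
.otel_emit: &otel_emit |
  # emit_span NAME PARENT_SPAN_ID START_NS END_NS
  # Reads TRACE_ID, OTLP_ENDPOINT and OTLP_AUTH_HEADER from the environment.
  # Values are interpolated into the JSON unescaped, so keep quotes out of job names.
  emit_span() {
    local name="$1" parent="$2" start_ns="$3" end_ns="$4"
    local span_id; span_id=$(openssl rand -hex 8)   # 8 random bytes = 16 hex chars
    curl -sS --fail --max-time 10 \
      -X POST "${OTLP_ENDPOINT}/v1/traces" \
      -H "Content-Type: application/json" \
      -H "Authorization: ${OTLP_AUTH_HEADER}" \
      -d @- <<JSON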
    {
      "resourceSpans": [{
        "resource": { "attributes": [
          { "key": "service.name", "value": { "stringValue": "gitlab-ci" } }
        ]},
        "scopeSpans": [{
          "spans": [{
            "traceId": "${TRACE_ID}",
            "spanId": "${span_id}",
            "parentSpanId": "${parent}",
            "name": "${name}",
            "kind": 1,
            "startTimeUnixNano": "${start_ns}",
            "endTimeUnixNano": "${end_ns}",
            "attributes": [
              { "key": "ci.pipeline.id", "value": { "stringValue": "${CI_PIPELINE_ID}" } },
              { "key": "ci.job.id",      "value": { "stringValue": "${CI_JOB_ID}" } },
              { "key": "ci.job.stage",   "value": { "stringValue": "${CI_JOB_STAGE}" } },
              { "key": "ci.job.name",    "value": { "stringValue": "${CI_JOB_NAME}" } }
            ]
          }]
        }]
      }]
    }
  JSON
  }

That’s the part AI is great at: structurally fiddly, well-documented, low-judgement boilerplate. One thing it got wrong on the first pass, which I caught in review: it hardcoded a Bearer token. Never let that survive. OTLP_AUTH_HEADER is a masked, protected CI/CD variable, set in Settings → CI/CD → Variables, never a literal in YAML and never something you paste into a chat window. Keep in mind that a protected variable is only exposed to pipelines on protected branches and tags, so jobs on other branches will not see it.

Pro Tip: Ask the model for the OTLP JSON envelope and the attribute map, but inject every secret and endpoint as a CI/CD variable yourself. The model should never see your real OTLP_AUTH_HEADER, collector URL, or registry creds. Anything pasted into a prompt can come back in generated output, and from there it is one careless commit away from your repository.
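Since every secret and endpoint arrives through a CI/CD variable, it helps to fail loudly when one is missing instead of letting curl post to an empty URL. Here is a minimal sketch of such a guard. The function name require_vars is hypothetical, and it is written to run under plain sh as well as Bash. It prints variable names only, never values, so nothing secret reaches the job log.

```shell
# require_vars NAME...
# Returns 1 if any named variable is unset or empty, listing the missing
# names on stderr. Pass literal variable names only: they go through eval.
require_vars() {
  local missing=0 var value
  for var in "$@"; do
    # Indirect lookup: read the value of the variable whose name is in $var
    eval "value=\${$var:-}"
    if [ -z "$value" ]; then
      echo "missing required CI/CD variable: $var" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Calling require_vars OTLP_ENDPOINT OTLP_AUTH_HEADER TRACE_ID ROOT_SPAN_ID before emit_span turns a silent authentication failure into one readable log line.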

Minting the trace ID once, propagating it everywhere

The root span’s IDs have to be minted before any traced job runs, and the trace ID needs to reach every downstream job. The cleanest place is a prepare job in the first stage that mints the IDs and writes them to a dotenv artifact, which GitLab automatically loads into later jobs as environment variables.

stages: [prepare, build, test, deploy]

prepare:trace:
  stage: prepare
  script:
    - export TRACE_ID=$(openssl rand -hex 16)
    - export ROOT_SPAN_ID=$(openssl rand -hex 8)
    - echo "TRACE_ID=${TRACE_ID}"       >> trace.env
    - echo "ROOT_SPAN_ID=${ROOT_SPAN_ID}" >> trace.env
    - echo "PIPELINE_START_NS=$(date +%s%N)" >> trace.env
  artifacts:
    reports:
      dotenv: trace.env

Now TRACE_ID, ROOT_SPAN_ID, and PIPELINE_START_NS are present in every job in a later stage. (A job that uses needs: only inherits them if prepare:trace is in its needs list.) That single trace ID is the spine of the whole thing: every job span points its parentSpanId at ROOT_SPAN_ID, so they all hang off the same root and render as one waterfall. This is also how you’d propagate a W3C traceparent if you wanted spans to nest under an external trigger (say, a deploy orchestrated from another system): you’d accept the incoming traceparent as a pipeline variable instead of minting your own.
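The root span itself still has to be sent once, with spanId set to ROOT_SPAN_ID and no parentSpanId. Otherwise the job spans point at a parent the backend never receives. Here is a minimal sketch of building that payload, assuming the dotenv variables above are in the environment. The function name root_span_json and the span name format are hypothetical.

```shell
# root_span_json END_NS
# Prints an OTLP/JSON payload for the pipeline's root span.
# Assumes TRACE_ID, ROOT_SPAN_ID, PIPELINE_START_NS and CI_PIPELINE_ID are set.
root_span_json() {
  local end_ns="$1"
  cat <<JSON
{
  "resourceSpans": [{
    "resource": { "attributes": [
      { "key": "service.name", "value": { "stringValue": "gitlab-ci" } }
    ]},
    "scopeSpans": [{
      "spans": [{
        "traceId": "${TRACE_ID}",
        "spanId": "${ROOT_SPAN_ID}",
        "name": "pipeline ${CI_PIPELINE_ID}",
        "kind": 1,
        "startTimeUnixNano": "${PIPELINE_START_NS}",
        "endTimeUnixNano": "${end_ns}",
        "attributes": [
          { "key": "ci.pipeline.id", "value": { "stringValue": "${CI_PIPELINE_ID}" } }
        ]
      }]
    }]
  }]
}
JSON
}

# Possible use from a job in the last stage (hypothetical name: finish:trace):
#   root_span_json "$(date +%s%N)" | curl -sS --fail --max-time 10 \
#     -X POST "${OTLP_ENDPOINT}/v1/traces" \
#     -H "Content-Type: application/json" \
#     -H "Authorization: ${OTLP_AUTH_HEADER}" -d @-
```

Running that job with when: always keeps the root span coming even when an earlier stage fails, which is when the waterfall is most useful.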

Timing each job with before_script / after_script

The per-job span needs honest start and end timestamps. before_script and after_script run around the job body, so they’re the right hooks. after_script is especially valuable because it runs even when the job fails, so you still capture a span (and its duration) for the failing job, which is exactly when you most want the data.

.traced:
  before_script:
    # after_script runs in a separate shell, so hand the start time over in a file
    - date +%s%N > "${CI_PROJECT_DIR}/.job_start_ns"
  after_script:
    - *otel_emit                          # define emit_span() in this shell
    - JOB_START_NS=$(cat "${CI_PROJECT_DIR}/.job_start_ns")
    - emit_span "${CI_JOB_NAME}" "${ROOT_SPAN_ID}" "${JOB_START_NS}" "$(date +%s%N)"

build:app:
  extends: .traced
  stage: build
  script:
    - docker build -t app:${CI_COMMIT_SHORT_SHA} .

test:unit:
  extends: .traced
  stage: test
  script:
    - npm ci
    - npm test

Every job that extends: .traced now emits a span bounded by real wall-clock time, attributed with its stage and job IDs. One caveat the AI flagged, and on GitLab it is the rule rather than the exception: after_script runs in a new shell, separate from before_script and script, so exported variables and shell functions do not carry over. Anything after_script needs, JOB_START_NS and emit_span() included, has to be persisted to a file or redefined there. Worth testing on your actual runner.
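The file hand-off can be wrapped in a couple of small helpers. The sketch below also guards against date implementations that do not support %N (macOS is one; some minimal container images may be another), falling back to second resolution so the timestamp keeps its 19 digits. The names now_ns, save_job_start, load_job_start and JOB_START_FILE are all hypothetical.

```shell
# now_ns: current time as Unix nanoseconds (19 digits).
# If `date +%s%N` yields anything but 19 digits, fall back to whole seconds.
now_ns() {
  local t
  t=$(date +%s%N)
  case "$t" in
    *[!0-9]*) t="" ;;          # %N was printed literally or garbled
  esac
  [ "${#t}" -eq 19 ] || t="$(date +%s)000000000"
  printf '%s\n' "$t"
}

# Hypothetical location; any path inside the project directory survives
# from before_script to after_script.
JOB_START_FILE="${CI_PROJECT_DIR:-.}/.job_start_ns"

# Call from before_script.
save_job_start() { now_ns > "$JOB_START_FILE"; }

# Call from after_script. If before_script never got as far as writing the
# file, report "now" so the span is zero-length instead of malformed.
load_job_start() {
  if [ -s "$JOB_START_FILE" ]; then cat "$JOB_START_FILE"; else now_ns; fi
}
```

With these loaded in both hooks, the after_script call becomes emit_span "${CI_JOB_NAME}" "${ROOT_SPAN_ID}" "$(load_job_start)" "$(now_ns)".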

otel-cli when curl gets old

Hand-rolled curl is fine, but once you want span events, status codes, and proper error spans, reach for otel-cli. It wraps all of this in a single binary and speaks OTLP natively:

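.traced_cli:
  before_script:
    # otel-cli's README shows epoch seconds or seconds.nanos for --start/--end,
    # not a bare nanosecond integer; verify against your otel-cli version
    - date +%s.%N > "${CI_PROJECT_DIR}/.job_start"
  after_script:
    # Join the pipeline trace: otel-cli reads W3C trace context from TRACEPARENT
    - export TRACEPARENT="00-${TRACE_ID}-${ROOT_SPAN_ID}-01"
    - |
      otel-cli span \
        --service "gitlab-ci" \
        --name "${CI_JOB_NAME}" \
        --tp-print \
        --attrs "ci.pipeline.id=${CI_PIPELINE_ID},ci.job.stage=${CI_JOB_STAGE},ci.job.id=${CI_JOB_ID}" \
        --start "$(cat "${CI_PROJECT_DIR}/.job_start")" --end "$(date +%s.%N)" \
        --endpoint "${OTLP_ENDPOINT}" \
        --otlp-headers "authorization=${OTLP_AUTH_HEADER}"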

It’s less code to maintain and it handles the gRPC-vs-HTTP and timestamp-format quirks that always bit me with raw curl. The AI is useful for translating your curl version into the equivalent otel-cli flags — a mechanical refactor it rarely gets wrong.
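One detail to check in any such translation: for an otel-cli span to land in the pipeline’s trace, it needs the trace context. As far as I can tell from its README, otel-cli picks that up from a TRACEPARENT environment variable, so verify this against your version. Below is a minimal sketch of building and validating a version-00 W3C traceparent value from the trace and span IDs. Both function names are hypothetical.

```shell
# make_traceparent TRACE_ID SPAN_ID
# Prints "00-<trace>-<span>-01": version 00 with the sampled flag set.
make_traceparent() {
  printf '00-%s-%s-01\n' "$1" "$2"
}

# parse_traceparent HEADER
# Validates a version-00 traceparent and prints "TRACE_ID SPAN_ID".
# Returns 1 if the value is malformed.
parse_traceparent() {
  local tp="$1" version trace span flags rest
  IFS=- read -r version trace span flags rest <<EOF
$tp
EOF
  # Exactly four dash-separated fields, version 00
  [ "$version" = "00" ] && [ -z "$rest" ] || return 1
  # Fixed widths: 32 hex chars, 16 hex chars, 2 hex chars
  [ "${#trace}" -eq 32 ] && [ "${#span}" -eq 16 ] && [ "${#flags}" -eq 2 ] || return 1
  case "${trace}${span}${flags}" in
    *[!0-9a-f]*) return 1 ;;      # lowercase hex only
  esac
  # All-zero trace or span IDs are invalid
  case "$trace" in *[!0]*) ;; *) return 1 ;; esac
  case "$span"  in *[!0]*) ;; *) return 1 ;; esac
  printf '%s %s\n' "$trace" "$span"
}
```

In a job, export TRACEPARENT="$(make_traceparent "$TRACE_ID" "$ROOT_SPAN_ID")" covers the outgoing side, and parse_traceparent is the check to run on a traceparent handed in as a pipeline variable before trusting it.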

The webhook / exporter alternative

If you’d rather not touch .gitlab-ci.yml at all, GitLab can push the data to you. A webhook-based exporter subscribes to GitLab’s Pipeline events and Job events webhooks; a small receiver converts each event’s timestamps into spans server-side. The OpenTelemetry Collector contrib repository also includes a GitLab receiver that can build pipeline traces for you; check its README for whether your version consumes webhooks, polls the API, or both.

The tradeoff: webhooks give you traces with zero pipeline changes and no secrets in your YAML, but you’re limited to the timing data GitLab reports (created/started/finished), so you lose the granular inside-the-job spans you’d get from instrumenting script steps directly. I run both — coarse webhook spans for coverage, in-job spans for the jobs I’m actively optimizing.
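To make the coarse version concrete, here is a sketch of the arithmetic a receiver would do with those three timestamps: one span for queue time (created to started) and one for run time (started to finished). It assumes the timestamps have already been converted to Unix epoch seconds. The function name and the output format are hypothetical and not taken from any particular exporter.

```shell
# job_timing_ns CREATED_S STARTED_S FINISHED_S
# Inputs are Unix epoch seconds. Prints two lines of span bounds in
# nanoseconds:
#   queue <start_ns> <end_ns>
#   run   <start_ns> <end_ns>
# Returns 1 if the timestamps are out of order.
job_timing_ns() {
  local created="$1" started="$2" finished="$3"
  [ "$created" -le "$started" ] && [ "$started" -le "$finished" ] || return 1
  # Appending nine zeros converts whole seconds to nanoseconds exactly
  printf 'queue %s000000000 %s000000000\n' "$created" "$started"
  printf 'run %s000000000 %s000000000\n' "$started" "$finished"
}
```

Second resolution is all these events can give you, which is the granularity limit described above.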

Where AI helped and where I drew the line

The model wrote roughly 80% of the YAML above, and it saved me a genuinely tedious afternoon. But I reviewed every line before it went near main. It tried to inline a token once. It used millisecond timestamps where OTLP wants nanoseconds. It assumed after_script shared the before_script shell. Each of those is the kind of plausible-but-wrong output you get from a fast junior who’s never run your specific runner. If you want a second set of eyes on a pipeline diff like this, running it through an automated code review pass catches the secret-handling mistakes before a human even looks.

For the prompting itself, I keep a few reusable scaffolds in my prompt library — “convert this curl OTLP call to otel-cli,” “generate an attribute map from these CI variables” — and the CI-specific ones live in a prompt pack so I’m not rewriting them per project. Any capable assistant works for this; I tend to use Claude for the longer YAML refactors.

Conclusion

OpenTelemetry turned “the pipeline is slow” into “the test:integration job spends four of its six minutes pulling a base image we could cache.” That’s an actionable sentence, and I couldn’t say it before. AI got me there faster by generating the OTLP envelopes, the attribute maps, and the otel-cli translation — the boilerplate that’s boring to write and easy to get subtly wrong. Just remember what it is: a fast junior engineer. Let it draft the spans, review every line, and keep your auth headers and CI secrets in masked variables where neither the model nor a leaked commit can ever reach them.
