Using AI to Explain and Document an Inherited GitLab Pipeline

You join a team, open the .gitlab-ci.yml, and it’s 800 lines spread across six include files, with hidden template jobs, extends chains, cryptic rules, and zero comments. Nobody who wrote it is still around. You’re now responsible for a pipeline you don’t understand, and the only documentation is the YAML itself. This is one of the most genuinely useful things to point AI at: reverse-engineering an undocumented pipeline into something a human can reason about. The model reads YAML faster than you and never gets bored tracing an extends chain — a fast junior engineer doing the tedious archaeology. You still verify, because a confidently-wrong explanation of a deploy job is dangerous. Here’s how I do it.

Start with a plain-English walkthrough

My first prompt to Claude is broad: “Here’s a complete .gitlab-ci.yml with all its includes. Walk me through what this pipeline does, stage by stage, in plain English. For each job, tell me what triggers it and what it produces.” Reading the model’s narrative is dramatically faster than parsing the YAML myself, and it immediately orients me — “ah, there are three deploy paths gated on different refs.”

The model is generally reliable on the mechanical reading, provided it has actually been given every included file: which jobs are in which stage, what extends resolves to, which artifacts flow where. Where I stay skeptical is anything involving intent: “this job appears to roll back the deploy” might be the model pattern-matching a script that does something subtly different. I treat the walkthrough as a map drawn by someone who’s read the territory but never walked it.
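When I want to spot-check what the model says an extends chain resolves to, a few lines of Python are enough. The sketch below is a simplified approximation of GitLab's merge behavior as I understand it (nested hashes merge, arrays and scalars are replaced, later parents win over earlier ones, and the job's own keys win over everything). It assumes the config has already been parsed into a plain dict; the job names and values are hypothetical.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Merge override into a copy of base: nested dicts merge, everything else is replaced."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def resolve_extends(config: dict, job_name: str, _seen: tuple = ()) -> dict:
    """Return the effective definition of job_name after following its extends chain."""
    if job_name in _seen:
        raise ValueError("circular extends: " + " -> ".join(_seen + (job_name,)))
    job = config[job_name]
    parents = job.get("extends", [])
    if isinstance(parents, str):
        parents = [parents]
    resolved: dict = {}
    for parent in parents:  # later parents override earlier ones
        resolved = deep_merge(resolved, resolve_extends(config, parent, _seen + (job_name,)))
    own_keys = {k: v for k, v in job.items() if k != "extends"}
    return deep_merge(resolved, own_keys)  # the job's own keys win


# Hypothetical parsed config, for illustration only.
config = {
    ".base": {"image": "alpine:3.19", "variables": {"LOG_LEVEL": "info"}, "script": ["echo base"]},
    ".deploy": {"extends": ".base", "variables": {"TARGET": "staging"}, "script": ["./deploy.sh"]},
    "deploy_prod": {"extends": ".deploy", "variables": {"TARGET": "prod"}},
}

effective = resolve_extends(config, "deploy_prod")
print(effective)
```

If the dict this prints disagrees with the model's description of the job, that job goes on the verification list. It deliberately ignores include, default, and !reference, so treat it as a cross-check rather than a replacement for GitLab's own merged view.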

Build a visual DAG of the pipeline

Text only goes so far for a tangled pipeline. I ask the model to generate a Mermaid diagram of the job dependency graph:

graph LR
  build --> test
  build --> lint
  test --> deploy_staging
  deploy_staging --> deploy_prod

A rendered DAG instantly shows the critical path, parallel branches, and any job that’s weirdly isolated. The model builds this from the needs/stage relationships in the YAML, and it’s usually accurate because it’s pure mechanical translation. I cross-check it against the pipeline editor’s own DAG view — if they match, I trust the diagram. This diagram becomes the centerpiece of the docs I write, because a picture saves the next person hours.
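The translation from needs and stage relationships to Mermaid edges is mechanical enough to reproduce by hand as a second opinion. Here is a minimal sketch, assuming the job definitions are already parsed into dicts. It draws an edge from each needs entry, and for jobs without needs it falls back to "depends on every job in the immediately preceding stage", which is a simplification of how stage ordering really works. The stage and job names are hypothetical.

```python
def mermaid_edges(stages: list[str], jobs: dict[str, dict]) -> str:
    """Build a Mermaid 'graph LR' diagram from stage order and needs entries."""
    by_stage: dict[str, list[str]] = {stage: [] for stage in stages}
    for name, job in jobs.items():
        by_stage[job["stage"]].append(name)

    lines = ["graph LR"]
    for name, job in jobs.items():
        if "needs" in job:
            # needs entries may be plain job names or mappings with a "job" key
            upstream = [n["job"] if isinstance(n, dict) else n for n in job["needs"]]
        else:
            # simplification: no needs means "wait for the previous stage"
            idx = stages.index(job["stage"])
            upstream = by_stage[stages[idx - 1]] if idx > 0 else []
        for dep in upstream:
            lines.append(f"  {dep} --> {name}")
    return "\n".join(lines)


# Hypothetical pipeline, for illustration only.
stages = ["build", "test", "staging", "production"]
jobs = {
    "build": {"stage": "build"},
    "test": {"stage": "test"},
    "lint": {"stage": "test"},
    "deploy_staging": {"stage": "staging", "needs": ["test"]},
    "deploy_prod": {"stage": "production", "needs": [{"job": "deploy_staging"}]},
}

diagram = mermaid_edges(stages, jobs)
print(diagram)
```

For this input the output matches the five-line diagram shown earlier. Job names containing spaces or punctuation may need quoting before Mermaid will render them, which this sketch does not attempt.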

Pro Tip: Ask the AI to specifically call out any job whose trigger conditions it’s unsure about. Phrase it as “list every job where you’re not fully confident about when it runs, and explain your uncertainty.” This flips the model’s tendency to sound confident into a feature — it surfaces exactly the rules clauses you need to test yourself rather than burying them in a confident-sounding paragraph. The jobs it flags are your verification to-do list.

Decode the gnarly rules together

The rules blocks are where inherited pipelines hide their real behavior, and they’re the highest-value thing to document accurately. I take each non-obvious rules block and ask the model to “explain in plain English when this job runs and when it’s skipped, covering branch pushes, merge requests, tags, and scheduled pipelines.” Then — critically — I verify the model’s interpretation against a real run history. GitLab’s pipeline list shows what actually ran on past commits; if the model says “this only runs on tags” but I see it ran on a branch push last Tuesday, the model is wrong and I dig in. The rules and workflow prompts guide covers the evaluation-order subtleties that trip up these explanations.
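Checking a claim against run history can be made systematic. The sketch below assumes I have already copied past pipelines into a small hand-built record type; PipelineRun and its fields are my own hypothetical structure, not a GitLab API object. The model's claim is expressed as a predicate, and every run where the job's presence disagrees with the predicate is a counterexample.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineRun:
    """Hypothetical record of one past pipeline, filled in from the pipeline list."""
    pipeline_id: int
    source: str            # e.g. "push", "merge_request_event", "schedule"
    ref: str
    is_tag: bool
    jobs: frozenset[str]   # names of jobs that were created in this pipeline


def counterexamples(runs, job, claim):
    """Return runs where the job's presence contradicts claim(run)."""
    return [run for run in runs if (job in run.jobs) != claim(run)]


# Hypothetical history, for illustration only.
runs = [
    PipelineRun(101, "push", "main", False, frozenset({"build", "test"})),
    PipelineRun(102, "push", "v1.4.0", True, frozenset({"build", "test", "release_notes"})),
    PipelineRun(103, "push", "feature/login", False, frozenset({"build", "test", "release_notes"})),
]

# The model's claim: "release_notes only runs on tags".
mismatches = counterexamples(runs, "release_notes", lambda run: run.is_tag)
for run in mismatches:
    print(f"pipeline {run.pipeline_id} on {run.ref} contradicts the claim")
```

Here pipeline 103 is flagged, because the job appeared on a branch push. One caveat: each past pipeline ran the config as it existed at that commit, so a mismatch can also mean the rules changed since then, not only that the model misread them.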

Generate the documentation — then edit it

Once I trust my understanding, I have the AI draft the actual docs: a README section with the DAG diagram, a per-stage description, a table of CI/CD variables the pipeline expects (extracted from the YAML’s $VARIABLE references), and a “gotchas” list. The variable table is especially valuable: the model scans the whole config and lists every variable referenced, a task that is tedious by hand and where it is easy to miss one.
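The variable scan is also easy to reproduce independently, which gives me something to compare the model's table against. A minimal sketch, assuming the combined YAML is available as one string: it records each $NAME or ${NAME} reference with the line numbers where it appears. The regex is my own approximation, so it will also pick up shell-local variables and references inside comments, and it does not handle %NAME% syntax. The sample YAML is hypothetical.

```python
import re
from collections import defaultdict

# Matches $NAME and ${NAME}; deliberately ignores $(command) and $1-style references.
VAR_REF = re.compile(r"\$\{?([A-Za-z_][A-Za-z0-9_]*)\}?")


def variable_references(yaml_text: str) -> dict[str, list[int]]:
    """Map each referenced variable name to the 1-based line numbers where it appears."""
    refs: dict[str, list[int]] = defaultdict(list)
    for lineno, line in enumerate(yaml_text.splitlines(), start=1):
        for match in VAR_REF.finditer(line):
            refs[match.group(1)].append(lineno)
    return dict(refs)


# Hypothetical config fragment, for illustration only.
sample = """\
deploy_prod:
  script:
    - docker login -u "$CI_REGISTRY_USER" -p "$DEPLOY_TOKEN" "$CI_REGISTRY"
    - ./deploy.sh --env "${TARGET_ENV}"
  rules:
    - if: $CI_COMMIT_TAG
"""

refs = variable_references(sample)
for name in sorted(refs):
    print(name, refs[name])
```

Names starting with CI_ are mostly GitLab's predefined variables, so the ones left over are usually the ones the team has to supply, and those are the rows that matter most in the table.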

But I edit the draft before committing it. The model writes documentation that’s accurate to the YAML but misses the why — the historical reasons, the “we do it this weird way because of a vendor quirk” context that no amount of reading YAML reveals. I add that human context. AI-drafted docs that nobody reviewed are how teams end up with confidently-wrong documentation, which is worse than none.

Don’t let it expose what variables actually contain

When the model builds the variable table, it lists variable names and where they’re used — $DEPLOY_TOKEN is referenced in the prod deploy job. That’s exactly what documentation should say. What it must never do, and what you must never feed it, is the actual values of those secrets. Documenting “this pipeline needs a $DEPLOY_TOKEN with registry push scope” is great onboarding info; pasting the real token into the chat so the model can “understand the deploy” is a breach. The names and purposes go in the docs; the values stay in masked CI/CD variables where they belong.
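Inherited configs sometimes have values hardcoded where they shouldn't be, so I like a safety net before anything gets pasted into a chat. The sketch below is one possible pre-paste filter, not a guarantee: it blanks the value of any simple KEY: value line whose key name looks sensitive. The list of key patterns is my own assumption, and secrets embedded in script lines or under innocuous names will slip through. The sample values are fake.

```python
import re

# Assumed heuristics for "this key probably holds a secret"; extend for your own naming.
SENSITIVE_KEY = re.compile(r"(TOKEN|SECRET|PASSWORD|PASSWD|PRIVATE_KEY|API_KEY)", re.IGNORECASE)
# A simple "KEY: value" mapping line with a non-empty value.
ASSIGNMENT = re.compile(r"^(\s*)([A-Za-z_][A-Za-z0-9_]*)(\s*:\s*)(\S.*)$")


def redact_sensitive_values(yaml_text: str) -> str:
    """Replace values of sensitive-looking keys with a placeholder, keeping the key names."""
    cleaned = []
    for line in yaml_text.splitlines():
        match = ASSIGNMENT.match(line)
        if match and SENSITIVE_KEY.search(match.group(2)):
            indent, key, separator, _value = match.groups()
            line = f'{indent}{key}{separator}"<REDACTED>"'
        cleaned.append(line)
    return "\n".join(cleaned)


# Hypothetical fragment with a fake hardcoded value, for illustration only.
sample = """\
variables:
  DEPLOY_TOKEN: "abc123-not-a-real-token"
  TARGET_ENV: production
"""

cleaned = redact_sensitive_values(sample)
print(cleaned)
```

The key name survives, so the model can still document that the pipeline needs a DEPLOY_TOKEN, while the value never leaves my machine. Any hardcoded secret this turns up should also be rotated and moved into a masked CI/CD variable.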

Verify before you trust the map

The whole point of this exercise is to be able to safely change the pipeline later, so the documentation has to be trustworthy. My verification pass before I commit the docs:

Cross-check the DAG diagram against the pipeline editor’s rendered graph.
Validate the model’s rules explanations against actual run history in the pipeline list.
Confirm the variable table against a grep of the YAML for $ references — no hallucinated or missed variables.
Add the human “why” context the model couldn’t know.
Have a teammate who’s touched the pipeline read it, if one exists.
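The variable-table check in that list reduces to a set comparison. A minimal sketch, assuming one set holds the names from the model's table and the other holds the names found by grepping the YAML; both sets here are hypothetical.

```python
def audit_variable_table(documented: set[str], referenced: set[str]) -> dict[str, list[str]]:
    """Compare the model's variable table against names actually referenced in the YAML."""
    return {
        # in the docs but nowhere in the config: possibly invented by the model
        "hallucinated": sorted(documented - referenced),
        # in the config but absent from the docs: the model overlooked them
        "missed": sorted(referenced - documented),
    }


# Hypothetical inputs, for illustration only.
documented = {"DEPLOY_TOKEN", "TARGET_ENV", "SLACK_WEBHOOK"}
referenced = {"DEPLOY_TOKEN", "TARGET_ENV", "CI_COMMIT_TAG"}

report = audit_variable_table(documented, referenced)
print(report)
```

An entry under "hallucinated" is not automatically wrong: the variable may be read by a script the pipeline calls rather than by the YAML itself. It does mean I go and find where it is used before leaving it in the table.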

This is the human-in-the-loop part. The AI did the archaeology (reading 800 lines, tracing includes, building the graph) in a fraction of the time it would have taken me. I did the verification and added the institutional knowledge. That split is what makes the docs trustworthy enough to act on. When the inherited pipeline does eventually break, good docs plus an incident-response workflow are the difference between a calm fix and a panicked one.

Conclusion

An undocumented inherited pipeline is a liability, and reverse-engineering it by hand is exactly the kind of tedious reading that AI accelerates dramatically. Let the model walk you through it, draw the DAG, and decode the rules — then verify its DAG against the editor, its rules against run history, and its variable table against a grep, and add the human context it can’t know. Keep secret values out of the chat; document only their names and purpose. The result is documentation you can actually trust enough to change the pipeline safely. Fast junior engineer, human-in-the-loop, review before merge. More onboarding and pipeline guides are in the GitLab CI/CD category, with reusable explain-this-pipeline prompts in the prompts library.
