AI-Assisted .gitlab-ci.yml Refactors That Don't Break Prod

Every long-lived GitLab project ends up with the same .gitlab-ci.yml: 600 lines, copy-pasted before_script blocks, six jobs that differ only in an environment name, and a tag-rules clause nobody dares touch. Refactoring it by hand is tedious and risky, because the only way to truly test a CI config is to push it and watch what happens. That feedback loop makes people leave the mess alone.

This is exactly the kind of grunt work AI is good at — pattern-matching duplication and proposing a cleaner structure in seconds. But CI config has a nasty property: a refactor that looks equivalent can silently change which jobs run, when they run, or what they inherit. So I treat the model like a fast junior engineer who’s never seen production: useful for the first draft, but every diff gets reviewed before it merges. Here’s the workflow I’ve settled on.

Start by having AI map the duplication

Before changing anything, I paste the whole file into Claude and ask it to describe the structure, not rewrite it: “List every block of duplicated script or before_script content, and every place where jobs differ only by a variable.” This is a read-only ask, so there’s no risk, and it forces the model to build an accurate mental model before it starts editing.

The output is usually a clean inventory: “Jobs deploy_staging, deploy_qa, and deploy_prod are identical except for CI_ENVIRONMENT_NAME and the rules.” That inventory is the actual refactor plan. If the model misreads something here, I catch it now, on a description, instead of in a broken pipeline later.

Collapse repeated jobs with extends

The single highest-value refactor is replacing copy-pasted jobs with a hidden template job and extends. Take this before:

deploy_staging:
  stage: deploy
  image: "alpine/helm:3.14"
  script:
    - helm upgrade --install app ./chart --namespace staging
  environment:
    name: staging
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'

deploy_prod:
  stage: deploy
  image: "alpine/helm:3.14"
  script:
    - helm upgrade --install app ./chart --namespace prod
  environment:
    name: prod
  rules:
    - if: '$CI_COMMIT_TAG'

I ask the model to factor out the shared parts. The result:

.deploy:
  stage: deploy
  image: "alpine/helm:3.14"
  script:
    - helm upgrade --install app ./chart --namespace "$DEPLOY_NS"

deploy_staging:
  extends: .deploy
  variables:
    DEPLOY_NS: "staging"
  environment:
    name: staging
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'

deploy_prod:
  extends: .deploy
  variables:
    DEPLOY_NS: "prod"
  environment:
    name: prod
  rules:
    - if: '$CI_COMMIT_TAG'

The hidden job (leading dot) never runs on its own; it’s a template. The thing to verify here is merge semantics: extends deep-merges maps but replaces arrays. If the template has a script and the child also has a script, the child’s wins entirely. In my experience, AI gets this wrong about a third of the time, producing a child job that silently drops the template’s commands.
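To make the array-replacement rule concrete, here is a minimal sketch. The deploy_qa_* jobs, the RELEASE variable, and the helm lint step are hypothetical additions for illustration only. The !reference tag is GitLab's documented way to pull a template's script into a child's own array.

```yaml
.deploy:
  stage: deploy
  image: "alpine/helm:3.14"
  variables:
    RELEASE: "app"            # hypothetical variable, for illustration
  script:
    - helm lint ./chart       # hypothetical extra step in the template
    - helm upgrade --install "$RELEASE" ./chart --namespace "$DEPLOY_NS"

# Broken: the child's script array replaces the template's script entirely.
deploy_qa_broken:
  extends: .deploy
  variables:
    DEPLOY_NS: "qa"           # maps deep-merge, so RELEASE is still inherited
  script:
    - echo "deploying to qa"  # merged job runs ONLY this line; no helm commands

# One possible fix: re-include the template's commands explicitly.
deploy_qa_fixed:
  extends: .deploy
  variables:
    DEPLOY_NS: "qa"
  script:
    - echo "deploying to qa"
    - !reference [.deploy, script]   # expands to the template's two commands
```

In the merged view, deploy_qa_broken should show both variables but a one-line script, which is exactly the silent drop to look for.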

Pro Tip: After any extends refactor, open the pipeline editor’s “Full configuration” tab (or run a glab ci config-style validation from your terminal or editor). It shows the fully merged YAML for each job. Diff that against the old jobs: if the merged output matches, the refactor is behavior-preserving regardless of how clever the templating got.

Anchors vs. extends — make AI pick the right tool

Models love to reach for YAML anchors (&name / *name) because they’ve seen them everywhere. But anchors are resolved by the YAML parser inside a single file, with no GitLab awareness: they can’t reference jobs across include files, and they make the config harder to read. extends is GitLab-native, supports multiple parents, and works across includes.

My standing instruction is: “Prefer extends over YAML anchors unless the duplication is within a single small block.” When the model proposes anchors for cross-job reuse, I push back and ask it to convert to extends. This is a place where the human-in-the-loop matters: the model will happily generate technically-valid anchors that paint you into a corner six months later.
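A rough sketch of where I draw that line, with hypothetical job and template names (.retry_defaults, .helm_image, .main_only, unit_tests): an anchor for one small block reused inside the same file, and extends with two parents for anything shared across jobs.

```yaml
# Anchor: acceptable for a small block reused within this one file.
.retry_defaults: &retry_defaults
  max: 2
  when: runner_system_failure

unit_tests:
  stage: test
  retry: *retry_defaults      # alias resolved by the YAML parser, same file only
  script:
    - make test               # hypothetical test command

# extends: preferred for cross-job reuse; these templates could live in an included file.
.helm_image:
  image: "alpine/helm:3.14"

.main_only:
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'

deploy_staging:
  extends:
    - .helm_image             # contributes image
    - .main_only              # contributes rules
  stage: deploy
  script:
    - helm upgrade --install app ./chart --namespace staging
  environment:
    name: staging
```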

Split a monolith with includes

Once jobs are deduplicated, the next refactor is splitting one giant file into included fragments — say ci/build.yml, ci/test.yml, ci/deploy.yml. I have the AI propose the split and generate the top-level stitcher:

include:
  - local: "ci/build.yml"
  - local: "ci/test.yml"
  - local: "ci/deploy.yml"

stages:
  - build
  - test
  - deploy

The trap: stages is an array, and GitLab does not merge arrays across included files. If several fragments each declare stages, the last definition read replaces the earlier ones, and a definition in the root file overrides them all, so the final stage order may not be the one you intended. The model sometimes scatters stages across fragments anyway. I always pull the stage declaration up to the root and tell the model to do the same.
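For contrast, here is roughly what one of the fragments might look like, assuming the stage list lives only in the root file. The contents of ci/deploy.yml are a sketch built from the earlier deploy jobs, not a prescribed layout.

```yaml
# ci/deploy.yml (included fragment)
# Deliberately no top-level "stages:" key here; the root .gitlab-ci.yml owns it.

.deploy:
  stage: deploy               # must match a stage declared in the root file
  image: "alpine/helm:3.14"
  script:
    - helm upgrade --install app ./chart --namespace "$DEPLOY_NS"

deploy_staging:
  extends: .deploy
  variables:
    DEPLOY_NS: "staging"
  environment:
    name: staging
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
```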

Verify rules behavior didn’t change

The scariest silent breakage is in rules. A refactor that consolidates three jobs into one template plus three children can accidentally change when a job runs, turning a merge-request-only job into one that runs on every commit, or vice versa. AI reasons about rules poorly because the evaluation order (the first matching rule wins) and the defaults for the when keyword are subtle.

I never trust a rules refactor on inspection alone. I build a small matrix of scenarios — “MR to main,” “tag push,” “scheduled pipeline,” “push to feature branch” — and ask the model to predict which jobs run in each, both before and after. Then I sanity-check its prediction against my own reading. When they disagree, I’m usually right and the model is confidently wrong. If you want a second opinion on a risky diff, our code review dashboard is built for exactly this kind of “did this change behavior?” check.
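As an illustration of the scenario matrix, here is a hypothetical lint job with the expected outcome for each trigger written out as comments. The job name, script, and rules are made up for the example. The outcomes follow from first-match-wins evaluation and from a job being left out of the pipeline when no rule matches.

```yaml
lint:
  stage: test
  script:
    - make lint               # hypothetical command
  rules:
    - if: '$CI_PIPELINE_SOURCE == "schedule"'             # rule 1
      when: never
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'  # rule 2 (when defaults to on_success)
    - if: '$CI_COMMIT_TAG'                                # rule 3
      when: manual

# Scenario                 Expected result for lint
# MR to main               runs automatically (rule 2 matches first)
# tag push                 added as a manual job (rule 3)
# scheduled pipeline       excluded (rule 1, when: never)
# push to feature branch   excluded (no rule matches)
```

Writing the same table for the old and new config, then comparing them row by row, is the check I run before trusting the diff.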

Keep secrets out of the conversation

One hard rule: I never paste real CI/CD variables, tokens, or deploy keys into an AI chat, even when the surrounding YAML references them. If a job uses $VAULT_TOKEN or $AWS_SECRET_ACCESS_KEY, the variable name is fine to share — the value never is. When I need the model to understand a job that depends on a secret, I describe what the secret does (“this is a registry password”) rather than handing it over. CI secrets in a chat log are a breach waiting to happen, and no refactor is worth that.
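A sanitized job of the kind that is safe to share might look like the sketch below. The job name and the REGISTRY_* variable names are hypothetical; the point is that only names appear, and the values stay in the project's CI/CD variable settings.

```yaml
push_chart:
  stage: deploy
  image: "alpine/helm:3.14"
  script:
    # REGISTRY_PASSWORD is the registry password, defined as a masked CI/CD variable.
    # Only the variable NAME appears here; its value never goes into the chat.
    - echo "$REGISTRY_PASSWORD" | helm registry login "$REGISTRY_HOST" --username "$REGISTRY_USER" --password-stdin
```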

My review checklist before merge

Every AI-assisted CI refactor goes through the same gate before it merges:

Run the pipeline editor’s lint and check the merged “Full configuration.”
Diff merged job output (old vs. new) for at least the deploy and release jobs.
Trace rules across a handful of trigger scenarios.
Confirm no script arrays got dropped by extends replacement.
Push to a throwaway branch first and watch a real run, never straight to a protected branch.

If you do this kind of refactor often, it’s worth capturing the prompts as reusable templates. I keep a small library of “refactor this CI file” prompts in our prompts collection, and the prompt packs have a CI-focused set that bundles the merge-semantics warnings right into the instruction.

Conclusion

AI turns a dreaded all-day .gitlab-ci.yml cleanup into a 30-minute reviewed diff. The leverage is real — but so is the failure mode, because CI config breaks silently and only in production. Let the model do the pattern-matching and the first draft, then verify merge semantics and rules behavior yourself before anything touches a protected branch. Fast junior engineer, human-in-the-loop, review before merge. For more in this vein, browse the rest of the GitLab CI/CD guides.