By James Joyner IV · 10 min read

AI for GitLab CI parallel: and matrix: Jobs Without the Sprawl

GitLab parallel and matrix jobs multiply fast and get expensive. Here's how I use AI to generate matrices that test what matters without runner sprawl.

  • #gitlab
  • #ci-cd
  • #ai
  • #matrix
  • #performance

GitLab’s parallel:matrix is one of those features that’s genuinely powerful and genuinely easy to abuse. You want to test against three Python versions and two databases, so you write a matrix — and now every pipeline spins up six jobs. Add another axis and it’s twelve. Add another and you’re at twenty-four jobs per pipeline, each consuming runner minutes and a queue slot. The combinatorial explosion is real, and AI will happily generate a matrix that quadruples your CI bill if you don’t constrain it. Used carefully, though, it’s a great tool for generating the right matrix fast. Here’s how I keep the sprawl in check.

Generate the matrix from your actual support requirements

The first prompt isn’t “write a test matrix.” It’s: “I support Python 3.10, 3.11, and 3.12, and I test against PostgreSQL 14 and 16. Generate a parallel:matrix that covers these combinations.” Anchoring on your real support matrix keeps the model honest:

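test:
  stage: test
  image: "python:$PYTHON_VERSION"
  parallel:
    matrix:
      - PYTHON_VERSION: ["3.10", "3.11", "3.12"]
        PG_VERSION: ["14", "16"]
  services:
    - "postgres:$PG_VERSION"
  variables:
    # Assumption: a throwaway CI-only value. The official postgres image
    # refuses to start without a superuser password.
    POSTGRES_PASSWORD: "ci-only-password"
  script:
    - pip install -r requirements.txt
    - pytest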

That’s six jobs. The model produces this cleanly. The thing to check is that the variables actually flow where you think: image: "python:$PYTHON_VERSION" works because matrix variables are ordinary job variables and the runner expands them in image and services names when the job starts, but the model sometimes references a matrix variable in a keyword where GitLab doesn’t expand variables. Validate in the pipeline editor; it’ll show you the six expanded jobs by name.

Prune the matrix — you don’t need every cell

This is the judgment call the AI won’t make for you. A full cross-product tests every combination, but most combinations aren’t interesting. You probably don’t need Python 3.10 and 3.12 against both databases — testing the oldest and newest Python against your primary DB, plus your newest Python against the secondary DB, often catches the same bugs at half the cost.

GitLab supports this with multiple matrix entries instead of one big cross-product:

  parallel:
    matrix:
      - PYTHON_VERSION: ["3.10", "3.12"]
        PG_VERSION: ["16"]
      - PYTHON_VERSION: ["3.12"]
        PG_VERSION: ["14"]

That’s three jobs instead of six, covering the meaningful corners. I ask the model: “Propose a reduced matrix that covers the version boundaries and the primary/secondary database without the full cross-product, and explain what coverage I’m giving up.” Making it state the tradeoff explicitly is what keeps you from blindly accepting a matrix that’s either too thin or too fat. The cost angle is real — the pipeline cost optimization guide treats matrix sprawl as a top line-item.

Pro Tip: Run the full matrix on a schedule (nightly) and a pruned matrix on merge requests. Developers get fast, cheap signal on every MR; you still get full-cross-product coverage once a day. Ask the AI to generate two job definitions — one gated on $CI_PIPELINE_SOURCE == "schedule" with the full matrix, one on MRs with the reduced one. This is the single best way to have your cake and eat it on matrix cost.
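
A minimal sketch of that two-job split, reusing the matrices from earlier in this guide. The job names test-mr and test-nightly, the hidden .test-base template, and the POSTGRES_PASSWORD value are my own placeholders, not anything GitLab requires, and the merge_request_event rule assumes your project runs merge request pipelines.

```yaml
# Shared job body; jobs whose names start with a dot are templates and never run on their own.
.test-base:
  stage: test
  image: "python:$PYTHON_VERSION"
  services:
    - "postgres:$PG_VERSION"
  variables:
    # Assumption: throwaway CI-only value so the postgres service can start.
    POSTGRES_PASSWORD: "ci-only-password"
  script:
    - pip install -r requirements.txt
    - pytest

# Reduced matrix (3 jobs) on merge request pipelines.
test-mr:
  extends: .test-base
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
  parallel:
    matrix:
      - PYTHON_VERSION: ["3.10", "3.12"]
        PG_VERSION: ["16"]
      - PYTHON_VERSION: ["3.12"]
        PG_VERSION: ["14"]

# Full cross-product (6 jobs) on scheduled pipelines only.
test-nightly:
  extends: .test-base
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"
  parallel:
    matrix:
      - PYTHON_VERSION: ["3.10", "3.11", "3.12"]
        PG_VERSION: ["14", "16"]
```

The nightly job only appears in pipelines started by a pipeline schedule, so it needs a schedule configured on the project before it ever runs.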

parallel: N vs. parallel: matrix — pick the right one

People conflate two different features. parallel: 5 runs five copies of one logical job in the same environment and gives each copy CI_NODE_INDEX and CI_NODE_TOTAL; GitLab doesn’t divide the work itself, so your test runner has to use those two values to pick its slice. parallel: matrix creates distinct jobs with different variable combinations. They solve different problems: sharding a slow suite vs. testing across configurations.

The model occasionally reaches for matrix when you wanted simple sharding, or vice versa. I state the goal plainly: “I want to split my slow test suite across 5 identical runners” gets parallel: 5; “I want to test against multiple versions” gets parallel: matrix. A single job takes one form or the other, so combining them means adding a shard axis to the matrix by hand. I’d advise against that until the simple cases are solid: a matrix-of-shards is hard to reason about and harder to debug.
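
For contrast with the matrix jobs above, here is a rough sketch of plain sharding. It assumes the pytest-split plugin does the slicing (any splitter that accepts a shard index and a shard count would do); the job name test-sharded and the pinned Python image are placeholders of mine.

```yaml
test-sharded:
  stage: test
  image: "python:3.12"
  # Five copies of the same job, all in the same environment.
  parallel: 5
  script:
    # Assumption: pytest-split is installed here as the test splitter.
    - pip install -r requirements.txt pytest-split
    # GitLab sets CI_NODE_INDEX (1 to 5) and CI_NODE_TOTAL (5) on each copy;
    # the test runner uses them to pick which slice of the suite to run.
    - pytest --splits "$CI_NODE_TOTAL" --group "$CI_NODE_INDEX"
```

Without a splitter reading those two variables, every copy runs the whole suite and you pay five times for the same result.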

Watch the runner capacity math

A 24-cell matrix that all wants to start at once will saturate a shared runner pool, and not just yours — everyone else’s pipelines queue behind your wall of jobs. The model has no idea how many concurrent runner slots you have, so it can’t warn you. That’s your job.

I have the AI report the matrix width (“how many jobs does this generate?”) and I compare it to my known runner concurrency. If the matrix is wider than my fleet, the extra jobs just queue, so the matrix isn’t actually faster, it’s just bigger. Sometimes the right answer is a smaller matrix that fits the fleet, or a shared resource_group, which makes the jobs in that group run one at a time, to be a good neighbor. If you control the runners, scaling them to absorb matrix width is covered in the Kubernetes executor autoscaling guide.
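
As a sketch of the good-neighbor option, here is the pruned matrix with a resource_group attached. The group name matrix-tests is made up, and the POSTGRES_PASSWORD value is a placeholder. Note how blunt this tool is: every job that shares the group runs strictly one at a time, so it trades wall-clock time for queue pressure.

```yaml
test:
  stage: test
  image: "python:$PYTHON_VERSION"
  services:
    - "postgres:$PG_VERSION"
  variables:
    # Assumption: throwaway CI-only value so the postgres service can start.
    POSTGRES_PASSWORD: "ci-only-password"
  # Assumption: "matrix-tests" is a hypothetical group name.
  # Jobs in the same resource_group never run concurrently.
  resource_group: matrix-tests
  parallel:
    matrix:
      # Width: (2 x 1) + (1 x 1) = 3 jobs in total.
      - PYTHON_VERSION: ["3.10", "3.12"]
        PG_VERSION: ["16"]
      - PYTHON_VERSION: ["3.12"]
        PG_VERSION: ["14"]
  script:
    - pip install -r requirements.txt
    - pytest
```

If full serialization is too strict for your pipeline, shrinking the matrix until its width fits your runner concurrency is the gentler fix.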

Name your matrix jobs for debuggability

A failing matrix job shows up as test: [3.11, 14] in the pipeline view, which is fine until you have a dozen of them and you’re hunting for which combination broke. GitLab auto-names matrix jobs from their variable values only, so the variable names never appear in the job name. That is usually enough, but for complex matrices I ask the AI to keep the variable names human-readable (PYTHON_VERSION not PV) so the YAML and the job logs are easy to follow, and to keep the values distinctive enough that the auto-generated job names stay scannable. Small thing, big payoff when you’re staring at a wall of red at the end of a long day.

Keep the human in the loop on the tradeoff

Here’s the division of labor with matrices specifically. The AI is excellent at the mechanical generation: given your axes, it writes correct parallel:matrix YAML in seconds, far faster than hand-typing combinations. What it cannot do is decide which combinations are worth the money — that’s a risk-and-budget judgment about your product, your users, and your runner fleet, none of which the model can see. So I always make it propose the pruning and state the coverage tradeoff, then I make the call. And as always: no real secrets in the chat, even when a matrix job hits a service that needs credentials — those stay in masked CI variables.

Before merging any matrix change, I validate the expanded jobs in the pipeline editor, run it once on a branch, confirm every cell actually ran (a misconfigured matrix can silently produce fewer jobs than intended), and check the total runner-minute cost against the previous setup. If you do this often, the matrix-generation and pruning prompts are worth saving — ours live in the prompts collection.

Conclusion

parallel:matrix multiplies fast, and the multiplication shows up on your CI bill and your shared runner queue. AI generates correct matrix YAML in seconds, but it’ll generate a wastefully large one unless you constrain it to your real support requirements and make it justify every cell. Prune to the meaningful corners, run the full matrix nightly and a reduced one on MRs, check the width against your runner capacity, and verify every cell ran before merge. Fast junior engineer, human-in-the-loop, review before merge. More performance and cost guides are in the GitLab CI/CD category.
