GitLab CI/CD Pipeline Optimization Prompt

You are a senior DevOps engineer who has shaved hours off real GitLab pipelines in production. You know the difference between cache and artifacts, when DAG with `needs:` actually helps, and when an "optimization" is just complexity.

I will provide:
- The current pipeline timing: total duration, longest jobs, and where the critical path lives (from the `Pipelines → <id>` view, optionally the pipeline graph)
- The full `.gitlab-ci.yml` (or relevant excerpts)
- The runner executor type (Docker, K8s, shell)
- Constraints: must it run on every push? Specific compliance requirements? Single-runner cluster?
- Recent pipeline durations (e.g., median over the last 30 runs from analytics)

Your job:

1. **Identify the critical path**: the longest chain of dependent jobs. It dominates total pipeline time, so optimizations that don't shorten it don't reduce wall-clock duration.

2. **Apply optimizations in priority order** (highest impact first):
   - **Convert stages → DAG with `needs:`**: stages enforce sequential gates between groups; DAG lets each job start as soon as its inputs are ready. Can shave 30-50% off long pipelines whose jobs mostly wait on stage gates.
   - **Parallelize naturally splittable jobs** with `parallel: <N>` or `parallel: matrix:` (tests, lints, builds across N versions).
   - **Cache dependencies properly**: use `cache:key:files:` instead of a static key; cache `node_modules/`, the pip and Gradle caches, and target/build directories per language. Cache paths must live inside the project directory, so point tool caches there first (e.g., `PIP_CACHE_DIR: $CI_PROJECT_DIR/.cache/pip`, `GRADLE_USER_HOME: $CI_PROJECT_DIR/.gradle`). Set `cache:policy: pull` for read-only consumers.
   - **Use artifacts ONLY for cross-job handoff**, not as a "cache." Artifacts are uploaded by the producer and, by default, downloaded by every job in later stages.
   - **Pre-build CI base images** so jobs don't `apt-get install` on every run. Build a "ci-base" image with the toolchain baked in, push it to your registry, and use it as the job `image:`.
   - **Shallow clone**: `GIT_DEPTH: 50` (or smaller) for jobs that don't need full history. New projects on GitLab.com default to `20`; verify yours.
   - **Skip unchanged paths** with `rules:changes:`. Don't run frontend tests if only backend code changed.
   - **Dependency proxy / registry mirror**: avoids Docker Hub rate limits and speeds up pulls. Pull images through GitLab's Dependency Proxy (e.g., `image: $CI_DEPENDENCY_PROXY_DIRECT_GROUP_IMAGE_PREFIX/python:3.12`), or set the runner's `pull_policy` to `if-not-present`.
   - **`interruptible: true`** on jobs that should be cancelled when a newer pipeline starts on the same branch or MR. Saves CPU on outdated pipelines.

3. **Identify ANTI-optimizations** the user might be doing:
   - Excessive parallelization without enough runners → jobs queue instead of run
   - Caching too aggressively → cache restore time > rebuild time
   - Pre-building images for every commit → the image build itself becomes the bottleneck
   - Long-lived branch-specific caches that grow unbounded

4. **Estimate the win** for each recommendation, qualitatively (small / medium / large), so the user can prioritize.

5. **Watch for the trade-offs**: pipeline speed vs determinism, cost (more runners), or maintenance complexity (DAG is harder to reason about than stages).

6. **Recommend monitoring**: GitLab's pipeline analytics, job duration trends, runner utilization. Optimization is iterative.

Mark any change that requires runner / cluster reconfiguration (e.g., enabling the Dependency Proxy, adding runners) separately from `.gitlab-ci.yml`-only changes.

---

Current pipeline duration (median): [N minutes]
Critical path (longest jobs in order): [DESCRIBE]
Runner executor + count: [e.g., 4 shared K8s runners, 2 GPU specific]
Constraints: [must run X / cannot Y / regulatory]

Full `.gitlab-ci.yml`:
```yaml
[PASTE, or the relevant 70%]
```

Recent timing data: [PASTE job names + durations]

Why this prompt works

Pipeline optimization is a domain where there are many techniques but only a few apply to any given pipeline. The first question is always “where’s the critical path?” — without that, every recommendation is a guess. This prompt forces the model to optimize the critical path specifically rather than scattering generic advice.
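To make "critical path" concrete, here is a minimal Python sketch that finds the longest chain of dependent jobs. The job names, durations (in minutes), and `needs` lists are hypothetical, and the model assumes unlimited runners and no queue time, so treat the result as a lower bound rather than a prediction.

```python
from functools import lru_cache

# Hypothetical jobs: duration in minutes and the jobs each one waits for.
JOBS = {
    "build-app":        {"duration": 2,  "needs": []},
    "build-frontend":   {"duration": 3,  "needs": []},
    "lint":             {"duration": 2,  "needs": []},
    "test-unit":        {"duration": 5,  "needs": ["build-app"]},
    "test-integration": {"duration": 20, "needs": ["build-app", "build-frontend"]},
    "deploy":           {"duration": 2,  "needs": ["test-unit", "test-integration"]},
}


def critical_path(jobs):
    """Return (total_minutes, [job names]) for the longest dependency chain."""

    @lru_cache(maxsize=None)
    def longest_to(name):
        # Longest chain that ends with `name`, including `name` itself.
        best_total, best_chain = 0, ()
        for dep in jobs[name]["needs"]:
            total, chain = longest_to(dep)
            if total > best_total:
                best_total, best_chain = total, chain
        return best_total + jobs[name]["duration"], best_chain + (name,)

    total, chain = max(longest_to(name) for name in jobs)
    return total, list(chain)


total, chain = critical_path(JOBS)
print(total, " -> ".join(chain))  # 25 build-frontend -> test-integration -> deploy

# Halving the lint job changes nothing: it is not on the critical path.
faster_lint = {**JOBS, "lint": {"duration": 1, "needs": []}}
print(critical_path(faster_lint)[0])  # 25
```

Only changes to `build-frontend`, `test-integration`, or `deploy` can move the total in this example, which is why the prompt asks for the critical path before anything else.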

How to use it

Get real timing data first. “Pipeline is slow” tells the model nothing. The median duration over 30 runs plus the slowest jobs tells it where to look.
Identify the critical path explicitly. If your slowest job is test-integration at 20 min and the total pipeline is 25 min, optimizing a 2-min lint job that runs in parallel with it doesn’t matter.
Change one thing at a time. Apply the DAG change or the caching change, not both at once; otherwise you won’t know which one helped and which one hurt.
Measure after each change. Compare median durations before and after, because a single slow run can mislead.
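A before/after comparison on medians can be as small as the sketch below. The duration samples are made up for illustration (each contains one outlier run), and `compare_medians` is a hypothetical helper, not part of any GitLab tooling.

```python
from statistics import median


def compare_medians(before, after):
    """Return (median_before, median_after, percent_change); negative means faster."""
    m_before, m_after = median(before), median(after)
    change = (m_after - m_before) / m_before * 100
    return m_before, m_after, change


# Hypothetical pipeline durations in minutes; note the single slow run in each sample.
before = [25, 26, 24, 27, 25, 61, 26]
after = [18, 19, 17, 18, 44, 19, 18]

m_before, m_after, change = compare_medians(before, after)
print(f"median {m_before} -> {m_after} min ({change:.1f}%)")  # median 26 -> 18 min (-30.8%)
```

The median ignores the 61-minute and 44-minute outliers, which a mean would not, so one unlucky run does not hide or exaggerate the effect of a change.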

Useful diagnostics

# View pipeline timing
# In GitLab UI: Pipelines → click pipeline → "Pipeline graph" tab
# Or: Analytics → CI/CD analytics → pipeline durations chart

# Per-job durations via API (results are paginated; per_page=100 is the maximum)
curl -s --header "PRIVATE-TOKEN: <token>" \
  "https://gitlab.example.com/api/v4/projects/<id>/pipelines/<pipeline-id>/jobs?per_page=100" | \
  jq -r '.[] | "\(.duration // 0)s \(.name) [\(.stage)]"' | sort -nr | head

# Find the bottleneck: top 5 jobs by run time, with queue time for comparison
# (jobs that never ran have a null duration and sort last)
curl -s --header "PRIVATE-TOKEN: <token>" \
  "https://gitlab.example.com/api/v4/projects/<id>/pipelines/<pipeline-id>/jobs?per_page=100" | \
  jq 'map({name, stage, duration, queued_duration}) | sort_by(.duration) | reverse | .[0:5]'

# Runner utilization (admin)
# Admin → Monitoring → CI/CD → Runner utilization chart
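If the API output is saved to a file or piped into a script, the same ranking can be done in Python. The sketch below works on a hypothetical sample that mimics the `name`, `stage`, `duration`, and `queued_duration` fields used in the commands above; the values and the helper names `slowest` and `queue_bound` are assumptions for illustration.

```python
import json

# Hypothetical sample shaped like the jobs API response (only the fields used here).
SAMPLE_JOBS = json.loads("""
[
  {"name": "test-integration", "stage": "test", "duration": 1200.0, "queued_duration": 5.0},
  {"name": "build-frontend", "stage": "build", "duration": 180.0, "queued_duration": 240.0},
  {"name": "lint", "stage": "test", "duration": 120.0, "queued_duration": 3.0},
  {"name": "deploy", "stage": "deploy", "duration": null, "queued_duration": null}
]
""")


def slowest(jobs, top=5):
    """Jobs with a recorded duration, longest first."""
    finished = [j for j in jobs if j["duration"] is not None]
    return sorted(finished, key=lambda j: j["duration"], reverse=True)[:top]


def queue_bound(jobs):
    """Names of jobs that waited for a runner longer than they ran."""
    return [
        j["name"]
        for j in jobs
        if j["duration"] is not None and (j["queued_duration"] or 0) > j["duration"]
    ]


for job in slowest(SAMPLE_JOBS):
    queued = job["queued_duration"] or 0
    print(f'{job["duration"]:>6.0f}s run {queued:>5.0f}s queued  {job["name"]}')
print("queue-bound:", queue_bound(SAMPLE_JOBS))
```

Separating run time from queue time matters because the fixes differ: a slow job needs pipeline changes, while a queue-bound job usually points at runner capacity.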

High-impact patterns

Convert stages to DAG (`needs:`)

Before:

stages: [build, test, deploy]
build-app:        { stage: build, script: ./build.sh }
build-frontend:   { stage: build, script: ./build-fe.sh }
test-unit:        { stage: test, script: pytest }
test-integration: { stage: test, script: ./integ.sh }
deploy:           { stage: deploy, script: ./deploy.sh }

After (DAG):

stages: [build, test, deploy]   # still useful for UI grouping
build-app:        { stage: build, script: ./build.sh }
build-frontend:   { stage: build, script: ./build-fe.sh }
test-unit:        { stage: test, script: pytest, needs: ["build-app"] }
test-integration: { stage: test, script: ./integ.sh,
                    needs: ["build-app", "build-frontend"] }
deploy:           { stage: deploy, script: ./deploy.sh,
                    needs: ["test-unit", "test-integration"] }

Now test-unit starts as soon as build-app finishes — no wait for build-frontend.
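Whether the conversion pays off depends on the durations. The following Python sketch compares the stage-gated and DAG wall-clock times for the five jobs above, using hypothetical durations in minutes and assuming enough runners that no job ever queues.

```python
# Hypothetical durations (minutes) for the jobs in the example pipeline.
JOBS = {
    "build-app":        {"stage": "build",  "duration": 2,  "needs": []},
    "build-frontend":   {"stage": "build",  "duration": 8,  "needs": []},
    "test-unit":        {"stage": "test",   "duration": 15, "needs": ["build-app"]},
    "test-integration": {"stage": "test",   "duration": 6,  "needs": ["build-app", "build-frontend"]},
    "deploy":           {"stage": "deploy", "duration": 2,  "needs": ["test-unit", "test-integration"]},
}


def staged_duration(jobs, stage_order):
    """Wall-clock time when every stage waits for the whole previous stage."""
    return sum(
        max(j["duration"] for j in jobs.values() if j["stage"] == stage)
        for stage in stage_order
    )


def dag_duration(jobs):
    """Wall-clock time when each job starts as soon as its needs have finished."""
    finish = {}

    def finish_time(name):
        if name not in finish:
            start = max((finish_time(dep) for dep in jobs[name]["needs"]), default=0)
            finish[name] = start + jobs[name]["duration"]
        return finish[name]

    return max(finish_time(name) for name in jobs)


staged = staged_duration(JOBS, ["build", "test", "deploy"])
dag = dag_duration(JOBS)
print(f"stages: {staged} min, DAG: {dag} min")  # stages: 25 min, DAG: 19 min
```

With these numbers the gain comes entirely from `test-unit` no longer waiting 6 extra minutes for `build-frontend`. If `test-integration` were the long job instead, both models would give the same total, because it depends on the slowest build either way.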

Smart caching (Node example)

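.node-cache: &node-cache      # the cache mapping itself, so jobs can override single keys
  key:
    files:
      - package-lock.json
  paths:
    - node_modules/
  policy: pull-push           # producer default

install-deps:
  stage: build
  cache:
    <<: *node-cache           # producer: restores, then uploads
  script: npm ci

test-unit:
  cache:
    <<: *node-cache           # merge inside cache: so key and paths survive
    policy: pull              # consumer (read-only, fast)
  script: npm test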

With cache:key:files:, the cache key changes only when package-lock.json changes, so most pipelines reuse the existing cache instead of rebuilding it.
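One detail worth checking in anchor-based cache templates: YAML merge keys (`<<:`) are shallow, so a `cache:` key written directly on a job replaces the whole `cache:` mapping pulled in from the anchor instead of overriding a single field. The sketch below models that with plain Python dicts as a stand-in for parsed YAML. It illustrates the merge semantics only, it is not GitLab's parser, and the variable names are made up for the example.

```python
# Stand-in for an anchor that wraps the settings in a top-level "cache" key.
anchor = {
    "cache": {
        "key": {"files": ["package-lock.json"]},
        "paths": ["node_modules/"],
        "policy": "pull-push",
    }
}

# Merging at job level, then setting cache: on the job.
# Explicit keys win over merged ones, so the whole mapping is replaced.
job_level_merge = {**anchor, "cache": {"policy": "pull"}, "script": "npm test"}

# Merging inside the cache mapping keeps key and paths and overrides only policy.
inner_merge = {"cache": {**anchor["cache"], "policy": "pull"}, "script": "npm test"}

print(job_level_merge["cache"])       # {'policy': 'pull'}
print(inner_merge["cache"]["paths"])  # ['node_modules/']
```

A job that ends up with only `policy: pull` and no `paths:` has nothing to restore, so the merge has to happen one level down, inside `cache:`.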

`rules:changes:` to skip unchanged paths

test-frontend:
  rules:
    - changes:
        - frontend/**/*
        - package.json
  script: npm test --prefix frontend

Frontend tests skipped when only backend code changed.

`parallel: matrix:` for matrix testing

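test:
  parallel:
    matrix:
      - PYTHON_VERSION: ["3.10", "3.11", "3.12"]
        VARIANT: ["slim", "alpine"]     # tag suffixes of the official python image
  image: python:${PYTHON_VERSION}-${VARIANT}
  script: pytest

3 versions × 2 variants = 6 jobs, one per combination. They run in parallel only if enough runners are free.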
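The job count is the Cartesian product of the matrix axes. A short Python sketch of that expansion, with illustrative axis names and values (`expand_matrix` is a hypothetical helper, not a GitLab API):

```python
from itertools import product


def expand_matrix(matrix):
    """Return one dict of variables per generated job for a single matrix entry."""
    names = list(matrix)
    return [dict(zip(names, combo)) for combo in product(*(matrix[n] for n in names))]


# Illustrative axes; any two lists of 3 and 2 values behave the same way.
matrix = {"PYTHON_VERSION": ["3.10", "3.11", "3.12"], "VARIANT": ["slim", "alpine"]}

jobs = expand_matrix(matrix)
for variables in jobs:
    print(variables)
print(len(jobs), "jobs")  # 6 jobs
```

Adding a third axis with 3 values would produce 18 jobs, so matrix size grows quickly relative to the number of runners available.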

Pre-built CI base image

Instead of:

test:
  image: python:3.12
  before_script:
    - apt-get update && apt-get install -y postgresql-client libpq-dev
    - pip install -r requirements-dev.txt    # 90 seconds every job
  script: pytest

Build once:

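# In a separate ".images/Dockerfile.ci-python"
FROM python:3.12-slim
# Install system packages and drop the apt lists to keep the image small
RUN apt-get update \
 && apt-get install -y --no-install-recommends postgresql-client libpq-dev \
 && rm -rf /var/lib/apt/lists/*
COPY requirements-dev.txt /tmp/
RUN pip install --no-cache-dir -r /tmp/requirements-dev.txt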

Then in pipelines:

test:
  image: registry.example.com/team/ci-python:1.2
  script: pytest                              # no before_script!
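A rough way to check whether a pre-built image pays for itself is to compare the setup time it removes with the time spent rebuilding it. The sketch below uses the 90-second setup figure from the example above; every other number and the function name are hypothetical.

```python
def daily_seconds_saved(setup_seconds, jobs_per_pipeline, pipelines_per_day,
                        image_build_seconds, image_builds_per_day):
    """Runner seconds saved per day after subtracting the image rebuild cost."""
    saved = setup_seconds * jobs_per_pipeline * pipelines_per_day
    spent = image_build_seconds * image_builds_per_day
    return saved - spent


# Hypothetical workload: 4 jobs use the image, 30 pipelines a day, 300 s image build.
rebuilt_daily = daily_seconds_saved(90, 4, 30, 300, image_builds_per_day=1)
rebuilt_every_pipeline = daily_seconds_saved(90, 4, 30, 300, image_builds_per_day=30)

print(rebuilt_daily)           # 10500
print(rebuilt_every_pipeline)  # 1800
```

This counts runner time only. Rebuilding the image inside every pipeline also puts the image build on the critical path, so wall-clock duration can get worse even when the net runner time is still positive.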

`interruptible: true` for MR pipelines

default:
  interruptible: true       # cancel outdated MR pipelines automatically

deploy:
  interruptible: false      # never cancel deploys
  script: ./deploy.sh

Dependency proxy (avoid Docker Hub rate limits)

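# The CI_DEPENDENCY_PROXY_* variables are predefined by GitLab, so no variables: block is needed.
# The Dependency Proxy must be enabled for the group.
image: ${CI_DEPENDENCY_PROXY_DIRECT_GROUP_IMAGE_PREFIX}/python:3.12

GitLab caches the image in the group's Dependency Proxy; subsequent jobs pull from that cache instead of Docker Hub.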

Common pitfalls this catches

Caching everything: a 30s restore of a 200MB cache costs more than recomputing in 10s. Profile before caching.
DAG with hidden ordering deps: deploy runs before tests because its needs: list omits them. needs: takes explicit job names, not wildcards, so list every test job and validate visually in the pipeline graph.
Artifacts used as cache: by default, every job in a later stage downloads all artifacts from earlier stages. Use cache: for build outputs that don’t need to flow between jobs, and dependencies: [] on jobs that need no artifacts.
Excessive parallel: against few runners: jobs queue; no real speedup.
GIT_STRATEGY: clone instead of fetch: clone starts from scratch in every job; fetch reuses the existing working copy when the runner keeps one.
when: always on a cleanup job after a flaky deploy: cleanup also runs when deploy fails and may delete state needed for diagnosis.
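The parallelization pitfall can be estimated with simple arithmetic. The sketch below is a deliberately crude model, and all of it is an assumption: tests split evenly, every job pays a fixed overhead (clone, image pull, cache restore), and jobs run in waves of at most `runners` at a time.

```python
import math


def wall_clock_minutes(total_test_minutes, parallel_jobs, runners, overhead_minutes):
    """Estimated wall-clock time when jobs run in waves limited by runner count."""
    per_job = total_test_minutes / parallel_jobs + overhead_minutes
    waves = math.ceil(parallel_jobs / runners)
    return waves * per_job


# Hypothetical suite: 40 minutes of tests, 4 runners, 1 minute of overhead per job.
for n in (1, 2, 4, 8, 16):
    print(n, wall_clock_minutes(40, n, runners=4, overhead_minutes=1))
# 1 41.0
# 2 21.0
# 4 11.0
# 8 12.0
# 16 14.0
```

In this model the best setting equals the runner count. Past that point each extra split adds another wave and another round of overhead, which matches the "jobs queue; no real speedup" symptom.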

Estimating wins (qualitative)

Change	Typical win	Cost
Stages → DAG (`needs:`)	30-50% on long pipelines	Pipeline-graph complexity
Effective dependency cache	1-3 min per job	Cache invalidation risk
Pre-built CI base image	1-2 min per job	Image maintenance
`interruptible: true`	Frees runners for active MRs	None
`parallel: matrix:`	2-10× wall-clock on test jobs	More runners needed
`rules:changes:` to skip	Full duration of each skipped job	Risk: skipping a job that should have run
Dependency proxy	5-30s per pull	Setup once
`GIT_DEPTH: 50`	5-30s on big repos	Tools needing history break

When to escalate

Slow runner provisioning at scale — engage runner team to provision more / faster nodes.
Cache backend (S3, MinIO) saturated → larger cache backend, or smaller caches.
Pipeline doesn’t fit in 1 hour even after optimization — consider parent/child pipelines or scheduled jobs.
