
GitLab CI/CD Pipeline Optimization Prompt

Speed up slow GitLab pipelines — DAG with `needs:`, cache vs artifacts, parallel jobs, image pre-builds, dependency proxy, and shallow clones.

Target user
DevOps engineers wanting faster GitLab pipelines
Difficulty
Intermediate
Tools
Claude, ChatGPT

The prompt

You are a senior DevOps engineer who has shaved hours off real GitLab pipelines in production. You know the difference between cache and artifacts, when DAG with `needs:` actually helps, and when an "optimization" is just complexity.

I will provide:
- The current pipeline timing: total duration, longest jobs, where the critical path lives (`Pipelines → <id>` view, optionally the "PipelineGraph")
- The full `.gitlab-ci.yml` (or relevant excerpts)
- The runner executor type (Docker, K8s, shell)
- Constraints: must run on every push? Specific compliance requirements? Single-runner cluster?
- Recent pipeline durations (e.g., median over last 30 runs from analytics)

Your job:

1. **Identify the critical path** — the longest chain of dependent jobs. Total pipeline time is dominated by this. Optimizations that don't shorten the critical path don't help wall-clock duration.
2. **Apply optimizations in priority order** (highest impact first):
   - **Convert stages → DAG with `needs:`**: stages enforce sequential gates between groups; DAG lets each job start as soon as its inputs are ready. Often shaves 30-50% off long pipelines.
   - **Parallelize naturally splittable jobs** with `parallel: <N>` or `parallel: matrix:` — tests, lints, builds across N versions.
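   - **Cache dependencies properly**: `cache:key:files:` instead of a static key; cache `node_modules/`, the pip and Gradle caches, and target/build directories per language. Cache paths must sit inside the project directory, so relocate home-directory caches first (e.g., `PIP_CACHE_DIR: $CI_PROJECT_DIR/.cache/pip`, `GRADLE_USER_HOME: $CI_PROJECT_DIR/.gradle`). Set `cache:policy: pull` for read-only consumers.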
   - **Use artifacts ONLY for cross-job handoff**, not as a "cache." Artifacts upload AND download for every consumer.
   - **Pre-build CI base images** so jobs don't `apt-get install` on every run. Build a "ci-base" image with toolchain baked, push to your registry, use as job `image:`.
   - **Shallow clone**: `GIT_DEPTH: 50` (or smaller) for jobs that don't need full history. Newly created projects typically default to `20`, but older projects and self-managed instances may differ; verify yours.
   - **Skip unchanged paths** with `rules:changes:` — don't run frontend tests if only backend code changed.
   - **Dependency proxy / registry mirror**: avoid Docker Hub rate limits, faster pulls. Pull images through GitLab's Dependency Proxy (`$CI_DEPENDENCY_PROXY_GROUP_IMAGE_PREFIX/<image>`), or set the runner `pull_policy` to `if-not-present`.
   - **`interruptible: true`** on jobs that should cancel when a new pipeline starts on the same MR. Saves CPU on outdated pipelines.
3. **Identify ANTI-optimizations** the user might be doing:
   - Excessive parallelization without enough runners → jobs queue instead of run
   - Caching too aggressively → cache restore time > rebuild time
   - Pre-building images for every commit → image build itself becomes the bottleneck
   - Long-lived branch-specific caches that grow unbounded
4. **Estimate the win** for each recommendation, qualitatively (small / medium / large) so the user can prioritize.
5. **Watch for the trade-offs**: pipeline speed vs determinism, cost (more runners), or maintenance complexity (DAG is harder to reason about than stages).
6. **Recommend monitoring**: GitLab's pipeline analytics, job duration trends, runner utilization. Optimization is iterative.

Mark any change that requires runner / cluster reconfiguration (e.g., enable the dependency proxy, add more runners) separately from `.gitlab-ci.yml`-only changes.

---

Current pipeline duration (median): [N minutes]
Critical path (longest jobs in order): [DESCRIBE]
Runner executor + count: [e.g., 4 shared K8s runners, 2 GPU specific]
Constraints: [must run X / cannot Y / regulatory]
Full `.gitlab-ci.yml`:
```yaml
[PASTE — or relevant 70%]
```
Recent timing data:
[PASTE — job names + durations]

Why this prompt works

Pipeline optimization is a domain where there are many techniques but only a few apply to any given pipeline. The first question is always “where’s the critical path?” — without that, every recommendation is a guess. This prompt forces the model to optimize the critical path specifically rather than scattering generic advice.

How to use it

  1. Get real timing data first. “Pipeline is slow” tells the model nothing. Median duration over 30 runs + the slowest jobs tells everything.
  2. Identify the critical path explicitly. If your slowest job is test-integration at 20 min and total pipeline is 25 min, optimizing the 2-min lint doesn’t matter.
  3. One change at a time. Apply DAG OR caching changes, not both at once; otherwise you won’t know which change helped or hurt.
  4. Measure after each change. Compare median durations before/after; one slow run can mislead.

Useful diagnostics

# View pipeline timing
# In GitLab UI: Pipelines → click pipeline → "Pipeline graph" tab
# Or: Analytics → CI/CD analytics → pipeline durations chart

# Per-job durations via API
# (per_page=100: the API returns only 20 jobs per page by default;
#  "// 0" covers jobs that never ran and have a null duration)
curl -s --header "PRIVATE-TOKEN: <token>" \
  "https://gitlab.example.com/api/v4/projects/<id>/pipelines/<pipeline-id>/jobs?per_page=100" | \
  jq -r '.[] | "\(.duration // 0)s \(.name) [\(.stage)]"' | sort -nr | head

# Find the bottleneck (five longest jobs, with queue time)
curl -s --header "PRIVATE-TOKEN: <token>" \
  "https://gitlab.example.com/api/v4/projects/<id>/pipelines/<pipeline-id>/jobs?per_page=100" | \
  jq 'map({name, stage, duration, queued_duration}) | sort_by(.duration) | reverse | .[0:5]'

# Runner utilization (admin)
# Admin → Monitoring → CI/CD → Runner utilization chart

High-impact patterns

Convert stages to DAG (needs:)

Before:

stages: [build, test, deploy]
build-app:        { stage: build, script: ./build.sh }
build-frontend:   { stage: build, script: ./build-fe.sh }
test-unit:        { stage: test, script: pytest }
test-integration: { stage: test, script: ./integ.sh }
deploy:           { stage: deploy, script: ./deploy.sh }

After (DAG):

stages: [build, test, deploy]   # still useful for UI grouping
build-app:        { stage: build, script: ./build.sh }
build-frontend:   { stage: build, script: ./build-fe.sh }
test-unit:        { stage: test, script: pytest, needs: ["build-app"] }
test-integration: { stage: test, script: ./integ.sh,
                    needs: ["build-app", "build-frontend"] }
deploy:           { stage: deploy, script: ./deploy.sh,
                    needs: ["test-unit", "test-integration"] }

Now test-unit starts as soon as build-app finishes — no wait for build-frontend.
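A minimal sketch of two further `needs:` variations, assuming hypothetical `lint` and `check-migrations` jobs (and their scripts) that are not part of the original pipeline. An empty `needs: []` lets a job start as soon as the pipeline is created, and `artifacts: false` keeps the ordering dependency while skipping an artifact download the job does not use.

```yaml
stages: [build, test, deploy]

build-app:
  stage: build
  script: ./build.sh

# Hypothetical job: needs no build output, so it starts immediately
# even though it sits in the test stage.
lint:
  stage: test
  needs: []
  script: ./lint.sh

# Hypothetical job: must run after build-app, but does not download
# build-app's artifacts.
check-migrations:
  stage: test
  needs:
    - job: build-app
      artifacts: false
  script: ./check-migrations.sh
```

Jobs like `lint` usually sit off the critical path, so this mostly frees runner time earlier rather than shortening total duration.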

Smart caching (Node example)

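# YAML merge keys are shallow, so the anchor holds the cache body itself
# and each job merges it inside its own cache: block.
.node-cache: &node-cache
  key:
    files:
      - package-lock.json
  paths:
    - node_modules/

install-deps:
  cache:
    <<: *node-cache
    policy: pull-push       # producer
  script: npm ci

test-unit:
  cache:
    <<: *node-cache
    policy: pull            # consumer (read-only, fast)
  script: npm test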

cache:key:files: changes the cache key only when package-lock.json changes, so most pipeline runs reuse the existing cache.
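The same idea for a Python project might look like the sketch below. It assumes a hypothetical `requirements.txt` that also installs pytest. GitLab only caches paths inside the project directory, so the pip cache is moved there via `PIP_CACHE_DIR`, an environment variable pip reads.

```yaml
variables:
  # Relocate pip's download cache into the project directory so it can be cached
  PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"

test-python:
  image: python:3.12
  cache:
    key:
      files:
        - requirements.txt      # hypothetical dependency file
    paths:
      - .cache/pip/
  script:
    - pip install -r requirements.txt
    - pytest
```

This caches downloaded packages rather than an installed environment, so `pip install` still runs but skips most network fetches.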

rules:changes: to skip unchanged paths

test-frontend:
  rules:
    - changes:
        - frontend/**/*
        - package.json
  script: npm test --prefix frontend

Frontend tests skipped when only backend code changed.
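One hedged refinement: `rules:changes:` is most predictable in merge request pipelines, where GitLab compares against the target branch. On a newly pushed branch it may evaluate to true for every path. The sketch below assumes the project already runs merge request pipelines and that a full run on the default branch is acceptable.

```yaml
test-frontend:
  rules:
    # In merge request pipelines, run only when frontend files changed
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
      changes:
        - frontend/**/*
        - package.json
    # On the default branch, always run as a safety net
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
  script: npm test --prefix frontend
```

The second rule trades some speed on the default branch for protection against a skipped job hiding a regression.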

parallel: matrix: for matrix testing

test:
  parallel:
    matrix:
      - PYTHON_VERSION: ["3.10", "3.11", "3.12"]
        VARIANT: ["slim", "alpine"]    # tag suffixes published for the official python image
  image: python:$PYTHON_VERSION-$VARIANT
  script: pytest

3 × 2 = 6 jobs, one per combination, run in parallel when enough runners are available.
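For splitting a single test suite rather than testing several versions, plain `parallel: <N>` can be combined with a test runner that supports sharding. This is a sketch assuming a Jest-based project (Jest 28 or later for `--shard`); the job name and image are illustrative.

```yaml
test-unit:
  image: node:20
  parallel: 4                     # creates test-unit 1/4 ... 4/4
  script:
    - npm ci
    # CI_NODE_INDEX runs from 1 to CI_NODE_TOTAL, matching Jest's 1-based shards
    - npx jest --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL
```

Each of the four jobs runs roughly a quarter of the test files, so the gain depends on having four runners free at the same time.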

Pre-built CI base image

Instead of:

test:
  image: python:3.12
  before_script:
    - apt-get update && apt-get install -y postgresql-client libpq-dev
    - pip install -r requirements-dev.txt    # 90 seconds every job
  script: pytest

Build once:

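# In a separate ".images/Dockerfile.ci-python"
FROM python:3.12-slim
# Install system packages and drop the apt lists to keep the image small
RUN apt-get update \
    && apt-get install -y --no-install-recommends postgresql-client libpq-dev \
    && rm -rf /var/lib/apt/lists/*
COPY requirements-dev.txt /tmp/
RUN pip install --no-cache-dir -r /tmp/requirements-dev.txt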

Then in pipelines:

test:
  image: registry.example.com/team/ci-python:1.2
  script: pytest                              # no before_script!
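To keep the image build itself from becoming a bottleneck, the base image can be rebuilt only when its inputs change. The sketch below makes several assumptions: a runner that allows Docker-in-Docker (privileged mode), the project's own GitLab container registry rather than `registry.example.com`, and a hypothetical job name `build-ci-image`.

```yaml
build-ci-image:
  image: docker:24.0.5
  services:
    - docker:24.0.5-dind
  variables:
    DOCKER_TLS_CERTDIR: "/certs"
  rules:
    # Rebuild only on the default branch, and only when the image inputs changed
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
      changes:
        - .images/Dockerfile.ci-python
        - requirements-dev.txt
  script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
    - docker build -f .images/Dockerfile.ci-python -t "$CI_REGISTRY_IMAGE/ci-python:1.2" .
    - docker push "$CI_REGISTRY_IMAGE/ci-python:1.2"
```

Test jobs would then reference `$CI_REGISTRY_IMAGE/ci-python:1.2` in `image:`. Bumping the tag whenever the Dockerfile changes keeps older pipelines reproducible.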

interruptible: true for MR pipelines

default:
  interruptible: true       # cancel outdated MR pipelines automatically

deploy:
  interruptible: false      # never cancel deploys
  script: ./deploy.sh
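Recent GitLab versions also expose the cancellation behavior in the pipeline file itself through `workflow:auto_cancel`. The sketch below assumes an instance new enough to support that keyword (believed to be 16.8 or later; check your version) and that auto-cancellation of redundant pipelines is enabled for the project.

```yaml
workflow:
  auto_cancel:
    on_new_commit: interruptible   # cancel only jobs marked interruptible: true

default:
  interruptible: true

deploy:
  interruptible: false             # a running deploy is left to finish
  script: ./deploy.sh
```

With this setting, a new commit cancels the interruptible jobs of the older pipeline while a deploy that has already started keeps running.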

Dependency proxy (avoid Docker Hub rate limits)

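# The CI_DEPENDENCY_PROXY_* variables are predefined by GitLab; nothing to set.
# The project must belong to a group with the Dependency Proxy enabled.
image: $CI_DEPENDENCY_PROXY_DIRECT_GROUP_IMAGE_PREFIX/python:3.12

GitLab caches the image at the group level; subsequent jobs pull it from the Dependency Proxy instead of Docker Hub.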
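Service containers are pulled from Docker Hub too, so they can use the same prefix. This sketch assumes a hypothetical `test-db` job that needs a PostgreSQL service; the image tags are illustrative.

```yaml
test-db:
  image: $CI_DEPENDENCY_PROXY_DIRECT_GROUP_IMAGE_PREFIX/python:3.12
  services:
    # Pull the service image through the proxy as well
    - name: $CI_DEPENDENCY_PROXY_DIRECT_GROUP_IMAGE_PREFIX/postgres:16
      alias: postgres              # keeps the hostname short for the job
  script: pytest
```

Without the `alias:`, the service hostname would be derived from the full proxied image path, which is awkward to reference from test configuration.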

Common pitfalls this catches

  • Caching everything: cache restore at 30s + 200MB pull > recomputing in 10s. Profile.
  • DAG with hidden ordering deps: deploy job runs before tests because the user forgot to list every test job in needs: (wildcards such as "test-*" are not supported). Validate visually in the pipeline graph.
  • Artifacts used as cache: every job uploads + downloads. Use cache: for build outputs that don’t need to flow between jobs.
  • Excessive parallel: against few runners: jobs queue; no real speedup.
  • GIT_STRATEGY: clone instead of fetch: clones from scratch every job; fetch reuses.
  • when: always on a cleanup job after a flaky deploy: cleanup runs when deploy fails, may delete state needed for diagnosis.
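The Git-related pitfall above can be avoided with pipeline-wide defaults and a per-job override. This is a sketch assuming a hypothetical `release-notes` job that needs tags and full history; a depth of `0` is used here to disable shallow cloning for that job.

```yaml
variables:
  GIT_STRATEGY: fetch     # reuse the runner's existing working copy when available
  GIT_DEPTH: "20"         # shallow history for most jobs

# Hypothetical job that needs the full history
release-notes:
  variables:
    GIT_DEPTH: "0"        # 0 disables shallow cloning for this job
  script: git describe --tags
```

Overriding per job keeps the fast default while leaving history-dependent tools working.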

Estimating wins (qualitative)

| Change | Typical win | Cost |
| --- | --- | --- |
| Stages → DAG (needs:) | 30-50% on long pipelines | Pipeline-graph complexity |
| Effective dependency cache | 1-3 min per job | Cache invalidation risk |
| Pre-built CI base image | 1-2 min per job | Image maintenance |
| interruptible: true | Frees runners for active MRs | None |
| parallel: matrix: | 2-10× wall-clock on testable | More runners needed |
| rules:changes: to skip | 100% of skipped jobs | Risk: skipping when shouldn’t |
| Dependency proxy | 5-30s per pull | Setup once |
| GIT_DEPTH: 50 | 5-30s on big repos | Tools needing history break |

When to escalate

  • Slow runner provisioning at scale — engage runner team to provision more / faster nodes.
  • Cache backend (S3, MinIO) saturated → larger cache backend, or smaller caches.
  • Pipeline doesn’t fit in 1 hour even after optimization — consider parent/child pipelines or scheduled jobs.
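A parent/child split might look like the sketch below. The child file paths and job names are hypothetical, and `strategy: depend` is assumed to be available on your GitLab version (newer releases may offer other strategy values).

```yaml
# Parent .gitlab-ci.yml: each area of the repository gets its own child pipeline
frontend:
  trigger:
    include: frontend/.gitlab-ci.yml   # hypothetical child pipeline file
    strategy: depend                   # parent job waits for and reflects the child's result
  rules:
    - changes:
        - frontend/**/*

backend:
  trigger:
    include: backend/.gitlab-ci.yml    # hypothetical child pipeline file
    strategy: depend
  rules:
    - changes:
        - backend/**/*
```

Each child pipeline has its own stages and DAG, so an unchanged area is skipped entirely and the changed areas run independently.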
