GitLab CI/CD Cache vs Artifacts Design Prompt
Choose between cache and artifacts in GitLab CI/CD — design cache keys that invalidate correctly, set artifact expiry, and avoid the common 'cache as artifact' mistake.
- Target user: DevOps engineers designing pipeline data flow
- Difficulty: Intermediate
- Tools: Claude, ChatGPT
The prompt
You are a senior DevOps engineer with deep experience designing GitLab CI/CD caching and artifact strategy. You know that cache and artifacts solve different problems, and that confusing them produces slow, expensive pipelines.

I will provide:
- The current `.gitlab-ci.yml` cache/artifacts setup
- The runner type (Docker, K8s, shell) and cache backend (local-disk, S3, MinIO)
- The data being cached/artifacted: dependency dirs, build outputs, test reports, generated docs
- The pipeline's overall shape: how many jobs, which jobs need which data
- Symptom: slow restore, cache always missing, artifact upload timeouts, ballooning storage

Your job:

1. **Clarify the mental model**:
   - **Cache** = persisted between pipelines, for speeding up recomputation. Stored on the runner (or in S3) keyed by `cache:key:`. **Not** a job-to-job handoff. Best for `node_modules`, the pip download cache, and similar things you'd recompute from a lockfile.
   - **Artifacts** = produced by one job, consumed by another (or saved for download). Uploaded to GitLab server, downloaded by downstream jobs. Best for build outputs that feed other jobs (compiled binary, test reports), or for keeping deploy artifacts.
   - **Rule of thumb**: cache = "I could regenerate this." Artifact = "I need to pass this to another job or keep it for the user."
2. **Cache key strategy** (assess the current `cache:key:`):
   - **Static key** (e.g., `key: deps`) → never invalidates; great until dependencies change. Risky.
   - **`cache:key:files:`** → invalidates when listed files change. Use for `package-lock.json`, `requirements.txt`, `Gemfile.lock`, etc. Preferred for most.
   - **`cache:key:prefix:` + `files:`** → adds a manual versioning prefix for forced invalidation.
   - **`$CI_COMMIT_REF_SLUG`** in key → per-branch cache. Useful for feature-branch isolation but multiplies storage.
   - **`$CI_JOB_NAME`** in key → per-job cache. Avoids cross-contamination but redundant if jobs share deps.
3. **Cache scope policies**:
   - `policy: pull-push` (default) → reads and writes
   - `policy: pull` → read-only; for downstream jobs that don't modify deps. Faster, less write contention.
   - `policy: push` → write-only; rare; for "build cache" jobs that initialize.
4. **Cache paths**:
   - List EXACTLY the directories that hold cached state. Don't cache the project working dir; that's the git checkout.
   - `cache:paths:` entries must sit inside the project directory (`$CI_PROJECT_DIR`). Redirect home-directory caches into it first, for example with `PIP_CACHE_DIR`, `GRADLE_USER_HOME`, or `-Dmaven.repo.local`.
   - Common per-language: `node_modules/`, `.cache/pip`, `.gradle/caches`, `.m2/repository`, `target/`, `.venv/`
5. **Artifacts strategy**:
   - **`artifacts:paths:`** lists the files/dirs to upload. Avoid huge dirs (test fixtures, build caches).
   - **`artifacts:reports:`** declares typed artifacts (`junit`, `coverage_report`, `dotenv`, `codequality`, `dast`, `sast`). These get UI integration.
   - **`artifacts:expire_in:`** sets retention. The default comes from the instance-wide setting; set it explicitly for clarity. Use short for ephemeral (1 hour), long for releases (never).
   - **`artifacts:when:`** takes `on_success` (default), `on_failure`, or `always`. For failed-job logs/screenshots, use `on_failure`.
   - **`artifacts:exclude:`** strips noise (e.g., exclude `**/node_modules`).
6. **Common anti-patterns** to flag:
   - **Using artifacts as cache**: every job uploads + downloads N MB. Massive overhead vs `cache:`.
   - **Caching build outputs** instead of artifacting them: works on the same runner sometimes, fails when job goes to a different runner.
   - **Static key with no invalidation**: `key: dependencies` with no `files:` keeps serving a stale cache for months while it grows.
   - **Cache the project directory**: GitLab already clones the project; caching `./` duplicates the checkout in every restore.
   - **Artifact size > 100 MB on every job**: server storage + upload time. Trim or use a real artifact registry (Package Registry).
   - **No `artifacts:expire_in`**: relies on the instance default, which an admin may have set generously.
7. **For multi-runner / autoscaler setups**:
   - Local-disk cache is per-runner, so different runners miss each other's caches. Use distributed cache (S3/MinIO).
   - Configure `[runners.cache]` in `config.toml` for shared cache.

Provide concrete YAML diffs for each finding.

---

Runner type: [Docker / K8s / shell]
Cache backend: [local / S3 / MinIO]
Symptom: [DESCRIBE]

Current cache + artifacts config (from `.gitlab-ci.yml`):
```yaml
[PASTE]
```

Data sizes (rough): [node_modules: 500 MB, build/: 200 MB, etc.]
Pipeline shape: [DESCRIBE jobs and their data dependencies]
Why this prompt works
A common and costly GitLab CI/CD design mistake is using artifacts where cache belongs: a job uploads 500 MB of `node_modules` as artifacts and 10 downstream jobs each download it. The right answer is “cache the deps, artifact the build output.” This prompt forces explicit reasoning per piece of data.
How to use it
- Inventory the data flowing through your pipeline. For each directory: who produces it, who consumes it, can it be regenerated?
- Apply the rule of thumb: can regenerate → cache. Job-to-job handoff → artifact. Both criteria fail → maybe you don’t need to persist it at all.
- Set explicit `expire_in` on every artifact block. Don’t rely on defaults.
- For shared runner pools, ensure cache backend is distributed (S3/MinIO).
Decision matrix
| Data | Solution |
|---|---|
| `node_modules/`, `.cache/pip` (dependency dirs inside the project directory) | Cache with `files: [lockfile]` key |
| Compiled binary used by deploy job | Artifact with short `expire_in` |
| JUnit test report | Artifact `reports:junit` (UI integration) |
| Build cache (incremental compile state) | Cache with `policy: pull-push` |
| Generated documentation site for deploy | Artifact (passed to pages job) |
| Code coverage report | Artifact `reports:coverage_report` |
| Logs / screenshots from failed tests | Artifact `when: on_failure` |
| Release binary for download | Package Registry or long-lived artifact |
| Intermediate `target/` between two compile jobs | Cache if same runner, artifact if cross-runner |
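As a sketch of the dependency-directory row for a Python project: GitLab only caches paths inside the project directory, so pip's download cache has to be redirected there before it can be listed under `cache:paths:`. The job name, image tag, and `requirements.txt` below are assumptions for illustration, not part of the original setup.

```yaml
# Hypothetical job: redirect pip's cache into the project dir, then cache it.
test-python:
  image: python:3.12                              # assumed image tag
  variables:
    PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"   # pip honors this env var
  cache:
    key:
      files:
        - requirements.txt                        # new key when pins change
    paths:
      - .cache/pip                                # relative to $CI_PROJECT_DIR
  script:
    - pip install -r requirements.txt
    - pytest
```

This caches downloaded packages rather than an installed environment, so `pip install` still runs but mostly avoids the network.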
Cache key patterns
Single lockfile (Node.js)
```yaml
.node-cache: &node-cache
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
```
Multiple lockfiles (monorepo)
```yaml
.cache: &cache
  cache:
    key:
      files:
        - package-lock.json
        - subproject/package-lock.json
    paths:
      - node_modules/
      - subproject/node_modules/
```
Manually invalidatable (with prefix)
```yaml
.cache: &cache
  cache:
    key:
      prefix: v3   # bump to invalidate; GitLab joins prefix and file hash with "-"
      files: [package-lock.json]
    paths:
      - node_modules/
```
Per-branch (use sparingly)
```yaml
.cache: &cache
  cache:
    key: "$CI_COMMIT_REF_SLUG"
    paths: [node_modules/]
```
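Per-job (prefix with the job name)

The prompt also mentions `$CI_JOB_NAME` in the key. A minimal sketch of that variant, assuming a hypothetical `lint` job, combines the job name as a prefix with the lockfile hash so each job keeps its own cache that still invalidates when dependencies change:

```yaml
# Hypothetical job: the key becomes "<job name>-<hash of package-lock.json>".
lint:
  cache:
    key:
      prefix: $CI_JOB_NAME
      files:
        - package-lock.json
    paths:
      - node_modules/
  script:
    - npm run lint      # assumed npm script name
```

As with per-branch keys, this multiplies storage, so it is worth reserving for jobs whose cached contents genuinely differ.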
Producer / consumer with policy: pull
```yaml
install-deps:
  stage: setup
  cache:
    key:
      files: [package-lock.json]
    paths: [node_modules/]
    policy: pull-push   # producer; reads existing, updates
  script:
    - npm ci

test:
  stage: test
  cache:
    key:
      files: [package-lock.json]
    paths: [node_modules/]
    policy: pull        # consumer; read-only, fast
  script:
    - npm test
  needs: [install-deps]
```
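The producer and consumer above repeat the same key and paths. One way to keep them in sync, sketched here under the assumption that both jobs live in the same `.gitlab-ci.yml`, is a YAML anchor on a hidden key, merged into each job with only `policy:` overridden. The name `.deps-cache` is illustrative.

```yaml
# Hidden key holding the shared cache definition (never runs as a job).
.deps-cache: &deps-cache
  key:
    files: [package-lock.json]
  paths: [node_modules/]

install-deps:
  stage: setup
  cache:
    <<: *deps-cache       # merge shared key + paths
    policy: pull-push     # producer
  script:
    - npm ci

test:
  stage: test
  needs: [install-deps]
  cache:
    <<: *deps-cache
    policy: pull          # consumer; never uploads
  script:
    - npm test
```

Anchors only resolve within a single file; for definitions shared through `include:`, `extends` or `!reference` would be the equivalent tools.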
Artifact patterns
Build → deploy handoff
```yaml
build:
  stage: build
  script:
    - go build -o bin/app ./cmd/app
  artifacts:
    paths:
      - bin/app
    expire_in: 1 day

deploy:
  stage: deploy
  needs: [build]
  script:
    - ./deploy.sh bin/app
```
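When the handoff is a value rather than a file, the `dotenv` report type listed in the prompt can carry it. The following is a minimal sketch; the variable name `APP_VERSION`, its value, and the file name `build.env` are placeholders chosen for illustration.

```yaml
build:
  stage: build
  script:
    # Write KEY=value lines; GitLab turns them into variables for later jobs.
    - echo "APP_VERSION=1.2.3" >> build.env
  artifacts:
    reports:
      dotenv: build.env

deploy:
  stage: deploy
  needs: [build]          # pulls the dotenv report from build
  script:
    - echo "Deploying version $APP_VERSION"
```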
Test reports (UI integration)
```yaml
test:
  script:
    # --cov enables pytest-cov; --cov-report alone writes no coverage.xml
    - pytest --junitxml=report.xml --cov=. --cov-report=xml:coverage.xml
  artifacts:
    when: always
    paths:
      - report.xml
      - coverage.xml
    reports:
      junit: report.xml
      coverage_report:
        coverage_format: cobertura
        path: coverage.xml
    expire_in: 1 week
```
Failed-job diagnostics
```yaml
e2e:
  script:
    - npx playwright test
  artifacts:
    when: on_failure
    paths:
      - test-results/
      - playwright-report/
    expire_in: 3 days
```
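Trimming with exclude

The prompt recommends `artifacts:exclude:` for stripping noise from an uploaded directory. A short sketch, assuming the build writes source maps and logs into `dist/` (both patterns are illustrative):

```yaml
build:
  script:
    - npm run build
  artifacts:
    paths:
      - dist/
    exclude:
      - dist/**/*.map     # assumed: source maps are not needed downstream
      - dist/**/*.log
    expire_in: 1 day
```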
Don’t artifact this
```yaml
# DON'T: using artifacts as a poor man's cache
build:
  script:
    - npm ci              # always pulls 500 MB
    - npm run build
  artifacts:
    paths:
      - node_modules/     # WRONG: 500 MB upload per pipeline
      - dist/
```

```yaml
# DO
build:
  cache:
    key: { files: [package-lock.json] }
    paths: [.npm/]        # npm ci deletes node_modules, so cache npm's download cache
  script:
    - npm ci --cache .npm --prefer-offline   # installs from cached tarballs on most runs
    - npm run build
  artifacts:
    paths:
      - dist/             # only the build output (10s of MB)
    expire_in: 1 day
```
Common findings this catches
- `node_modules` in `artifacts:paths:` → switch to `cache:`. Big win.
- `cache:key: static-key` without invalidation → switch to `key:files:[lockfile]`.
- No `cache:policy:` on consumer jobs → they’re pushing the cache too. Set `policy: pull`.
- Artifacts > 100 MB on every job → look for `**/*.log` or accidentally-included caches. Use `artifacts:exclude:`.
- `artifacts:expire_in: never` → audit; usually unnecessary.
- Local-disk runner cache + multi-runner cluster → caches miss every cross-runner job. Configure S3 distributed cache.
When to escalate
- GitLab server storage usage spiking → audit artifact expiry settings org-wide; clean old pipelines.
- S3 cache backend showing high latency → consider regional placement; bandwidth between runner and S3.
- Pipeline reliability dropping due to cache flakiness → audit `policy:` settings; consider per-job key fragmentation.
Related prompts
- GitLab CI/CD `needs:` DAG Optimization Prompt: Convert stage-based GitLab pipelines to DAG (`needs:`), find hidden ordering bugs, design clean fan-out/fan-in patterns, and avoid `needs:` traps.
- GitLab CI/CD Pipeline Optimization Prompt: Speed up slow GitLab pipelines with DAG (`needs:`), cache vs artifacts, parallel jobs, image pre-builds, dependency proxy, and shallow clones.
- GitLab Runner Troubleshooting Prompt: Diagnose GitLab Runner failures such as runner offline, executor errors, Docker-in-Docker issues, autoscaler problems, slow job pickup, and resource exhaustion.