Top 25 GitLab CI/CD Pipeline Mistakes (and How to Avoid Them)
The top 25 GitLab CI/CD pipeline mistakes that hurt security, cost, and reliability — with real .gitlab-ci.yml fixes you can copy into your repo today.
The 25 most common GitLab CI/CD pipeline mistakes fall into five buckets: security (leaked secrets, over-privileged job tokens, no scanning), performance and cost (no caching, floating image tags, no interruptible), reliability (no timeouts, blind retries, no rollback path), maintainability (only/except sprawl, monolithic jobs, unreviewed pipeline code), and workflow (no rules:, deploying from feature branches, ignoring merge-request pipelines). Almost every painful .gitlab-ci.yml I have inherited makes a handful of these at once, and each one is cheap to fix once you can name it. Below are all 25, grouped by theme, with the exact YAML I use to fix them.
I have spent years cleaning up other people’s pipelines, and the failure modes rhyme. None of these require a platform migration or a new tool — they are config changes you can ship in an afternoon. Work through them in order of blast radius: secrets and privilege first, then cost and reliability, then the slow-burn maintainability problems.
Security mistakes
Security mistakes in CI are the worst kind because the pipeline runs with credentials a developer would never get directly. Fix these first.
1. Hardcoding secrets in .gitlab-ci.yml or CI variables
Putting a token, password, or kubeconfig directly in your YAML means it lives in git history forever. Storing it in a plain (non-masked, non-protected) CI/CD variable keeps it out of git, but it can still print in job logs and is exposed to pipelines on any branch. Use masked, protected variables for static secrets, and prefer short-lived credentials via OIDC so nothing long-lived is stored at all.
# Bad: secret baked into the file
deploy:
  script:
    - 'curl -H "Authorization: Bearer glpat-xxxxxxxxxxxx" https://api.internal/deploy'
# Good: injected from a masked + protected CI/CD variable
# (the command is single-quoted because a bare ": " would be parsed as a YAML mapping)
deploy:
  script:
    - 'curl -H "Authorization: Bearer $DEPLOY_TOKEN" https://api.internal/deploy'
Mark the variable Masked and Protected in Settings → CI/CD → Variables so it never prints and never reaches unprotected branches. For cloud credentials, see GitLab CI secrets management with OIDC — it removes the static key entirely.
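As a rough sketch of the OIDC approach, a job can request an ID token with the id_tokens keyword and let the AWS CLI exchange it for temporary credentials. This assumes AWS as the cloud, an IAM role that already trusts your GitLab instance as an OIDC provider, and a job image with the AWS CLI installed. The role ARN and the AWS_ID_TOKEN name are placeholders, not values from this article.

```yaml
deploy:
  id_tokens:
    AWS_ID_TOKEN:                  # placeholder name; GitLab injects a signed JWT under it
      aud: https://gitlab.com      # must match the audience configured on the IAM identity provider
  variables:
    AWS_ROLE_ARN: arn:aws:iam::123456789012:role/gitlab-deploy   # hypothetical role
    AWS_WEB_IDENTITY_TOKEN_FILE: /tmp/aws-web-identity-token
  script:
    # Write the token to the file the AWS CLI reads; nothing is printed to the log.
    - printf '%s' "$AWS_ID_TOKEN" > "$AWS_WEB_IDENTITY_TOKEN_FILE"
    # The CLI assumes the role via web identity and runs with temporary credentials.
    - aws sts get-caller-identity
```

No static access key is stored anywhere in this setup; the credentials exist only for the lifetime of the assumed-role session.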
2. Over-privileged CI_JOB_TOKEN
The job token is convenient, but depending on how its allowlist is configured it can reach more projects and APIs than a single job needs. Lock down the token’s allowlist so a compromised job in one repo cannot pull source or trigger pipelines across your whole group.
In Settings → CI/CD → Token Access (labeled Job token permissions in newer GitLab releases), set inbound access to only the projects that legitimately call this one, and disable the broad “All groups and projects” option. Treat CI_JOB_TOKEN like any other credential with a least-privilege scope.
3. Running every job in a privileged: true runner
A privileged container runs with nearly all host capabilities and device access, so a job inside it can break out to the host with little effort. People enable it once for Docker-in-Docker and then leave it on for the whole fleet. Run only your image-build jobs on a tagged privileged runner and keep everything else on an unprivileged one.
build-image:
  tags: [dind-privileged] # only this job lands on the privileged runner
  services: [docker:dind]
  script:
    - docker build -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA" .
unit-tests:
  tags: [shared-unprivileged]
  script:
    - make test
Better yet, replace DinD entirely with a builder that does not need privileged mode, such as Kaniko or BuildKit in rootless mode, so the privileged runner can go away.
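One way this can look, as a sketch rather than a drop-in job: BuildKit's rootless image builds and pushes without a Docker daemon. It assumes the GitLab container registry, a Dockerfile in the repository root, and a runner whose container security settings permit rootless BuildKit; the entrypoint override and the job name are my assumptions, so check them against your runner setup.

```yaml
build-image-rootless:
  image:
    name: moby/buildkit:rootless
    entrypoint: [""]               # assumption: clear the image entrypoint so the runner gets a shell
  variables:
    BUILDKITD_FLAGS: --oci-worker-no-process-sandbox
  before_script:
    # Registry credentials go into the Docker config file that buildctl reads.
    - mkdir -p ~/.docker
    - |
      printf '{"auths":{"%s":{"username":"%s","password":"%s"}}}' \
        "$CI_REGISTRY" "$CI_REGISTRY_USER" "$CI_REGISTRY_PASSWORD" > ~/.docker/config.json
  script:
    # buildctl-daemonless.sh starts a temporary buildkitd, builds, pushes, then exits.
    - |
      buildctl-daemonless.sh build \
        --frontend dockerfile.v0 \
        --local context=. \
        --local dockerfile=. \
        --output "type=image,name=$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA,push=true"
```

A job like this can run on the same unprivileged runners as the rest of the pipeline, which is the point of dropping DinD.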
4. No SAST or dependency scanning
If your pipeline does not run static analysis and dependency scanning, you are shipping known-vulnerable code and CVEs into production blind. GitLab ships templates that wire this in with one include:.
include:
  - template: Jobs/SAST.gitlab-ci.yml
  - template: Jobs/Dependency-Scanning.gitlab-ci.yml
  - template: Jobs/Secret-Detection.gitlab-ci.yml
The findings surface directly in the merge request. For a tuning walkthrough see using AI to harden GitLab CI security scanning.
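A possible starting point for tuning, assuming the default template jobs: the scanners attach to the test stage, so that stage has to exist, and SAST_EXCLUDED_PATHS trims noise from vendored or generated code. The path list is only an example.

```yaml
stages: [build, test, deploy]   # the scanning templates attach their jobs to "test"

include:
  - template: Jobs/SAST.gitlab-ci.yml
  - template: Jobs/Secret-Detection.gitlab-ci.yml

variables:
  SAST_EXCLUDED_PATHS: "spec, test, tests, tmp, vendor"   # example list; adjust to your repo
```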
5. Echoing secrets into job logs
Even a masked variable leaks if you transform it — base64, concatenation, or piping it through a command that prints its arguments. A masked variable only matches the exact stored string. Never echo a secret, never run set -x in a block that touches one, and avoid passing secrets as CLI flags that show up in process listings.
# Bad: derived value defeats masking and prints in the log
script:
  - echo "Bearer $(echo -n $TOKEN | base64)"
# Good: keep it in the environment, never print it
# ("apply" is a placeholder CLI that reads the token from the named environment variable)
script:
  - apply --token-env DEPLOY_TOKEN
6. Shell injection via unquoted variables
CI variables can contain attacker-controlled values (branch names, MR titles, webhook payloads). An unquoted expansion is subject to word splitting and globbing, so a crafted value can smuggle extra arguments into your command, and anything that re-parses the value (eval, sh -c, a templated script) runs it as code. Quote every expansion and never eval user-influenced input.
# Bad: unquoted, so the shell word-splits and glob-expands whatever the variable holds
script:
  - deploy --env=$CI_COMMIT_REF_NAME
# Good
script:
  - deploy --env="$CI_COMMIT_REF_NAME"
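A small sketch of defence in depth on top of quoting, reusing the placeholder deploy command from the snippet above: reject ref names that contain anything outside a conservative character set, and pass the pre-sanitized CI_COMMIT_REF_SLUG wherever a free-form name is not required.

```yaml
deploy:
  script:
    # Fail fast if the ref name holds anything beyond letters, digits, ".", "_", "/" or "-".
    - |
      case "$CI_COMMIT_REF_NAME" in
        *[!A-Za-z0-9._/-]*)
          echo "unexpected characters in ref name, aborting" >&2
          exit 1
          ;;
      esac
    # CI_COMMIT_REF_SLUG is lowercased by GitLab, with other characters replaced by "-".
    - deploy --env="$CI_COMMIT_REF_SLUG"
```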
Performance and cost mistakes
Slow pipelines burn runner minutes and developer patience. These five are where most of the waste hides.
7. Floating image tags like image: latest
image: node:latest makes builds non-reproducible — the same commit produces different results next week, and a breaking upstream change lands in your pipeline without a single line of your code changing. Pin to a specific tag, and ideally a digest, so builds are deterministic.
# Bad
image: node:latest
# Good
image: node:20.11.1-bookworm
# Best: immutable digest
image: node:20.11.1-bookworm@sha256:abc123...
8. No caching (or a cache key that never hits)
Reinstalling dependencies from scratch on every job adds minutes and external bandwidth. The classic failure is a cache key so generic it never matches, or so specific it never reuses. Key the cache on your lockfile and scope the paths tightly.
test:
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - .npm/
  script:
    - npm ci --cache .npm --prefer-offline
    - npm test
There is a lot of nuance here — fallback keys, policy: pull for read-only jobs, cache-vs-artifacts. The full treatment is in GitLab CI caching strategies: a deep dive.
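To make the policy: pull point concrete, here is a minimal sketch assuming an npm project: one job populates the cache, and read-only jobs pull it without spending time on an upload. The hidden .npm-cache job and the lint script are illustrative names.

```yaml
.npm-cache:
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - .npm/

install:
  extends: .npm-cache
  stage: build
  cache:
    policy: pull-push      # the only job that writes the cache
  script:
    - npm ci --cache .npm --prefer-offline

lint:
  extends: .npm-cache
  stage: test
  cache:
    policy: pull           # read-only: skips the cache upload at the end of the job
  script:
    - npm ci --cache .npm --prefer-offline
    - npm run lint
```

Because extends merges nested keys, each job inherits the key and paths and overrides only the policy.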
9. No interruptible, so superseded pipelines keep running
When you push three commits in a row, the first two pipelines are already obsolete but keep consuming runners. Marking jobs interruptible: true (with the project’s Auto-cancel redundant pipelines setting enabled) cancels stale pipelines automatically.
default:
  interruptible: true # applies to all jobs
deploy:
  interruptible: false # opt deploys back out so they finish
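On recent GitLab versions the cancellation behaviour can also be pinned in the file itself instead of relying only on the project setting. A minimal sketch, assuming your instance supports workflow:auto_cancel:

```yaml
workflow:
  auto_cancel:
    on_new_commit: interruptible   # on a new commit, cancel only jobs marked interruptible: true
```

With this mode a running non-interruptible deploy is left alone while the interruptible jobs around it are cancelled.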
10. Pulling images without a registry mirror
Every job that pulls from Docker Hub competes for a shared rate limit and pays a latency tax. Point your runners at a pull-through cache or your GitLab Dependency Proxy so images come from a warm, local mirror.
build:
  image: ${CI_DEPENDENCY_PROXY_GROUP_IMAGE_PREFIX}/node:20.11.1
  script:
    - npm ci
This alone can cut cold-start time substantially and keeps most image pulls from counting against Docker Hub rate limits.
11. No artifact expiry
Artifacts default to staying around far longer than they are useful, and they count against your storage quota. Set expire_in on every artifact and keep only what a human or downstream job actually needs.
build:
  artifacts:
    paths: [dist/]
    expire_in: 1 week
    when: on_success
Reserve long retention for release builds; everything else can expire in hours or days. Pair this with a container registry cleanup policy so old images do not pile up either.
Reliability mistakes
A pipeline that lies about its own health is worse than no pipeline. These mistakes erode trust in green checkmarks.
12. No timeout on jobs
A hung job — a flaky network call, a deadlocked test — will sit consuming a runner until the project-wide default kills it, often an hour later. Set an aggressive per-job timeout so failures fail fast.
integration-tests:
  timeout: 15 minutes
  script:
    - make integration
13. allow_failure: true hiding real failures
allow_failure is meant for genuinely optional jobs, but it gets sprinkled onto flaky tests to make pipelines “go green.” Now real regressions pass silently. Reserve it for advisory jobs and fix the flake instead.
# Acceptable: a non-blocking lint advisory
spell-check:
  allow_failure: true
  script: [make spellcheck]
# Not acceptable: hiding a broken test suite
unit-tests:
  allow_failure: true # delete this and fix the tests
  script: [make test]
14. Blind retry: masking flaky tests
Setting retry: 2 on a job without scoping the failure type re-runs everything — including legitimate failures — and hides flakiness that should be fixed. Scope retries to infrastructure errors only.
e2e:
  retry:
    max: 2
    when:
      - runner_system_failure
      - stuck_or_timeout_failure
That retries genuine infra hiccups but lets a real assertion failure fail immediately.
15. No environment or rollback path
Deploying with a bare kubectl apply and no environment: block means GitLab has no record of what is running where, and no one-click rollback. Declare environments so deploys are tracked and reversible.
deploy-prod:
  stage: deploy
  environment:
    name: production
    url: https://app.example.com
  script:
    - ./deploy.sh
With this, GitLab’s environment page shows the deploy history and offers a re-deploy of any prior commit as your rollback.
16. No resource_group on deploys
Without a resource group, two deploy jobs to the same environment can run concurrently and clobber each other’s state — especially painful with Terraform. A resource group serializes them.
deploy-prod:
  resource_group: production # only one prod deploy runs at a time
  environment: production
  script: [./deploy.sh]
I cover this alongside interruptible in taming GitLab pipeline concurrency.
17. Giant monolithic jobs
A single 400-line build-test-deploy job is impossible to debug, cache, or parallelize — one failure throws away all the work before it. Split work into focused jobs per stage so failures are isolated and re-runnable.
stages: [build, test, deploy]
build: { stage: build, script: [make build] }
lint: { stage: test, script: [make lint] }
test: { stage: test, script: [make test] }
deploy: { stage: deploy, script: [make deploy] }
Maintainability mistakes
These do not break anything today. They make your pipeline impossible to change six months from now.
18. Using only/except instead of rules
only/except is legacy, cannot be combined cleanly, and does not support changes plus if in one place. rules: is the modern, composable replacement and is what every new GitLab feature targets.
# Bad: legacy, hard to compose
deploy:
  only: [main]
# Good: explicit, extensible
deploy:
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
      when: on_success
    - when: never
See mastering rules:changes for path-scoped pipelines for the powerful changes: patterns.
19. No needs: / DAG, so everything runs stage-by-stage
Stage-based execution forces every job in a stage to finish before the next stage starts, even when there is no real dependency. A needs: DAG lets independent chains run as soon as their inputs are ready.
unit-tests:
  stage: test
  needs: [build] # starts the instant build finishes
deploy:
  stage: deploy
  needs: [unit-tests, integration-tests]
This often shaves whole minutes off wall-clock time. More in optimizing GitLab pipeline DAGs with needs.
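A related pattern, sketched here with a hypothetical lint job: an empty needs list tells GitLab the job depends on nothing, so it starts as soon as the pipeline is created instead of waiting for earlier stages.

```yaml
lint:
  stage: test
  needs: []          # no dependencies: does not wait for the build stage
  script:
    - make lint
```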
20. No .gitlab-ci.yml validation or lint in the workflow
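Pushing a YAML typo means waiting for the pipeline to fail before you find out. Validate before you push and again in a pre-merge job. GitLab exposes a CI Lint API, and the glab CLI wraps it so you can lint from your terminal (it calls your GitLab instance, so it needs network access and authentication).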
glab ci lint # validate the current file
# or hit the API:
# POST /projects/:id/ci/lint
Add a fast validate-ci job that runs on every MR so a broken config never reaches main.
21. Copy-pasted job definitions instead of extends/!reference
When the same five lines appear in eight jobs, every change is an eight-way edit and they drift. Factor shared config into a hidden .template job and reuse it with extends, or pull in a single section with !reference.
.docker-job:
  image: docker:24
  services: [docker:dind]
  before_script:
    # --password-stdin keeps the password out of the process list
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
build:
  extends: .docker-job
  script: [docker build -t "$CI_REGISTRY_IMAGE" .]
For cross-project reuse, publish reusable CI/CD catalog components.
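For completeness, a sketch of the other two reuse mechanisms: !reference copies one section of a hidden job into another job, and include:component pulls a published catalog component. The make targets, the component path, its version, and its stage input are all hypothetical.

```yaml
include:
  # Hypothetical catalog component; the inputs it accepts are defined by the component itself.
  - component: $CI_SERVER_FQDN/my-group/ci-components/docker-build@1.0.0
    inputs:
      stage: build

.setup:
  before_script:
    - make deps

test:
  before_script:
    - !reference [.setup, before_script]   # reuse only this section, then add to it
    - make fixtures
  script:
    - make test
```

Unlike extends, which replaces a list wholesale, !reference lets a job splice the shared steps into its own list.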
22. No review process for pipeline-as-code
.gitlab-ci.yml is production infrastructure, but teams let it merge without the scrutiny they apply to application code. Protect the file with a CODEOWNERS rule so a platform engineer reviews every pipeline change.
# CODEOWNERS
/.gitlab-ci.yml @platform-team
/ci/ @platform-team
Combined with required approvals on protected branches, no one ships a privilege escalation or a leaked-secret pattern unreviewed.
Workflow mistakes
The last bucket is about how pipelines fit into how your team actually ships.
23. Running the full pipeline on every commit (no rules: scoping)
Building the whole monorepo and running every test on a one-line docs change wastes minutes and money. Scope jobs to the paths they care about with rules:changes.
backend-tests:
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
      changes: [backend/**/*]
  script: [make -C backend test]
For monorepos, dynamic child pipelines take this further — see GitLab monorepo pipelines with child pipelines and rules.
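As a simpler stepping stone than a dynamically generated pipeline, a static child pipeline might look like the following sketch. It assumes the backend keeps its own pipeline file; the path is hypothetical.

```yaml
backend:
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
      changes:
        - backend/**/*
  trigger:
    include: backend/.gitlab-ci.yml   # hypothetical path to the child pipeline config
    strategy: depend                  # the trigger job waits for and mirrors the child's result
```

The parent pipeline stays small, and each component's jobs live next to the code they test.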
24. Deploying to production from feature branches without a manual gate
If any branch can trigger a production deploy, one mis-scoped rules: entry is an outage. Gate production behind a protected environment plus a when: manual action so a human deliberately promotes.
deploy-prod:
  stage: deploy
  environment: production
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
      when: manual # explicit click to ship
  script: [./deploy.sh]
Combine this with deployment approval gates and protected environments so only authorized users can press the button.
25. Ignoring merge-request pipelines (and double pipelines)
Running only branch pipelines means merge-request features (merged results pipelines, MR-scoped rules and variables) never apply, and mixing branch and merge-request triggers without a workflow: block creates two redundant pipelines per push. Use workflow:rules to run MR pipelines and suppress duplicates.
workflow:
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - if: '$CI_COMMIT_BRANCH && $CI_OPEN_MERGE_REQUESTS'
      when: never # no duplicate branch pipeline
    - if: '$CI_COMMIT_BRANCH'
Now each push to a branch with an open MR produces one merge-request pipeline, so you pay for one pipeline, not two. To test the actual merge result rather than the source branch alone, also enable merged results pipelines in the project’s merge request settings, if your tier includes them.
How AI helps catch these
The honest problem with a list of 25 mistakes is that you will not remember all 25 while reviewing a 300-line YAML file at 5pm on a Friday. That is exactly the kind of pattern-matching an AI assistant is good at. Paste your .gitlab-ci.yml into the code review dashboard and ask it to audit against this list — it will flag the floating tags, the missing interruptible, the unquoted variables, and the allow_failure that is hiding a broken test, with the specific line and a suggested fix.
I keep a set of reusable GitLab CI prompts for exactly this: “review this pipeline for security mistakes,” “convert this only/except to rules:,” “add a needs: DAG to this stage-based pipeline.” If you want a curated, ready-to-run bundle, the prompt packs include a GitLab CI hardening set. Any capable model works — I run these through both Claude and ChatGPT depending on the task, and the output is consistently good enough to catch the obvious offenders before a human ever opens the MR.
The workflow that sticks: AI does the first pass against the checklist, a human reviews the diff, and CODEOWNERS makes sure that human is on the platform team. You get breadth from the model and judgment from the person.
FAQ
What is the single most damaging GitLab CI mistake?
Hardcoded long-lived secrets. They leak into git history and logs, survive forever, and grant standing access to anyone who reads them. Move to masked, protected variables immediately and to OIDC short-lived credentials as soon as you can.
Should I use rules: or only/except?
Always rules: for new work. It is the actively developed mechanism, supports combining if, changes, and exists, and integrates with workflow:rules. only/except is legacy and will not receive new capabilities — migrate existing jobs as you touch them.
How do I stop redundant pipelines from running?
Add a workflow:rules block that runs merge-request pipelines and explicitly sets when: never for branch pipelines when an open MR exists. That kills the classic double-pipeline; enable merged results pipelines on top if you want tests to run against the merged result.
Is image: latest really that bad?
Yes. It makes builds non-reproducible and lets upstream changes break your pipeline with no commit of your own. Pin to a specific version tag, and use a digest for builds you need to reproduce exactly months later.
How can AI realistically help with pipeline review?
Treat it as a tireless first-pass reviewer. Paste the YAML, ask it to audit against a known checklist of security, cost, and reliability mistakes, and let it flag specific lines. It is excellent at catching the mechanical issues (floating tags, missing timeouts, unquoted variables) so your human reviewers can focus on intent and architecture.
Conclusion
None of these 25 mistakes require a re-platform. They are config changes you can land incrementally, and the payoff compounds: faster pipelines, lower runner bills, fewer 2am rollbacks, and a .gitlab-ci.yml your team can still understand next year. Start with the security bucket, then knock out caching and interruptible for the quick cost wins, and wire an AI review pass into your MR flow so you stop re-introducing the ones you just fixed. For more depth on any single topic, the GitLab CI/CD category has a focused deep dive on each.