GitLab CI Error Guide: 'ERROR: Job failed: exit code 1' Generic Script Failure
Fix GitLab's 'Job failed: exit code 1' by scrolling up the job log to the real failing command — set -e, pipefail, masked errors, and how to debug.
- #gitlab-cicd
- #troubleshooting
- #errors
- #scripts
Exact Error Message
The job ends red and the very last lines of the log read like this:
$ ./run-tests.sh
...
Cleaning up project directory and file based variables
00:01
ERROR: Job failed: exit code 1
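On some runners and shells the wording is exit status 1 instead. The variant below is tagged (system failure), which means the runner failed while preparing the environment, before any of your script ran: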
ERROR: Job failed (system failure): prepare environment: exit status 1
This is the single most common — and least specific — failure in all of GitLab CI. The number 1 is just the exit code of the last command the runner executed. It tells you the job failed; it tells you almost nothing about why.
What the Error Means
A GitLab job’s script is run line by line inside a shell. The runner watches the exit code of the shell. When any command exits non-zero (POSIX convention: 0 is success, anything else is failure), the shell stops and the runner marks the job failed with that exit code.
exit code 1 is the catch-all “general error” code that almost every CLI tool uses for “something went wrong.” pytest returns 1 on test failures, eslint returns 1 on lint errors, npm run build returns 1 on a compile error, grep returns 1 when it finds no match, and a plain failed shell command returns 1. So the error you actually need to fix happened earlier in the log — GitLab is merely reporting the exit code of whatever ran last.
The mental model: this message is a symptom, not the disease. The disease is several lines higher up.
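To make the exit-code convention concrete, here is a minimal bash sketch that records the exit code of a few everyday commands. The helper name exit_code_of is made up for this illustration; it is not part of GitLab or any tool.

```shell
#!/usr/bin/env bash
# exit_code_of CMD...: hypothetical helper that runs a command quietly and
# prints its exit code, without aborting this script even under `set -e`.
exit_code_of() {
  local rc=0
  "$@" >/dev/null 2>&1 || rc=$?
  echo "$rc"
}

rc_true=$(exit_code_of true)                     # success
rc_false=$(exit_code_of false)                   # generic failure
# grep exits 1 when no line matches, even though nothing is "broken".
rc_grep_nomatch=$(exit_code_of grep -q 'gamma' <<< $'alpha\nbeta')
rc_grep_match=$(exit_code_of grep -q 'beta' <<< $'alpha\nbeta')
# A command may return any code; CI treats every non-zero value as failure.
rc_custom=$(exit_code_of sh -c 'exit 3')

echo "true=$rc_true false=$rc_false grep_nomatch=$rc_grep_nomatch grep_match=$rc_grep_match custom=$rc_custom"
```

The point of the sketch is that a 1 from grep and a 1 from a crashed tool look identical to the runner, which is why the log above the final line matters.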
Common Causes
- A script command genuinely failed. A test suite, linter, compiler, or deploy step returned non-zero. This is the normal, healthy case: CI did its job.
- set -e (errexit) is active. GitLab’s shell wraps your script so the job aborts on the first failing command. The failing command’s output may be far above the final exit code 1.
- The real error is masked by later output. Cleanup steps, after_script, or a noisy teardown print after the failing command, pushing the actual error off the bottom of the screen.
- A multi-command line hides which command failed. A line like a && b && c or cmd1; cmd2 fails as a unit; you have to find which sub-command broke.
- A piped command and pipefail. cmd | tee log normally reports only tee’s exit code. With set -o pipefail, a failure anywhere in the pipe surfaces, often unexpectedly.
- An environment difference. A missing env var, secret, or file makes a command that passes locally fail in CI.
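The pipe behaviour in the list above is easy to check locally. The sketch below assumes bash (pipefail and PIPESTATUS are not available in every /bin/sh) and uses false and cat as stand-ins for a failing command and tee.

```shell
#!/usr/bin/env bash
# Default behaviour: a pipe's exit code is that of its LAST stage only.
set +o pipefail
rc_default=0
false | cat || rc_default=$?

# With pipefail, the pipe returns non-zero if any stage fails.
set -o pipefail
rc_pipefail=0
false | cat || rc_pipefail=$?
set +o pipefail

# PIPESTATUS (bash-only) records every stage's exit code, so you can see
# which stage failed. It must be read immediately after the pipe.
sh -c 'exit 2' | cat
stage_codes="${PIPESTATUS[*]}"

echo "default=$rc_default pipefail=$rc_pipefail stages=$stage_codes"
```

Without pipefail the failing first stage is invisible; with it, the same line fails the job.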
How to Reproduce the Error
Any non-zero command in script reproduces it. The smallest example:
reproduce:
script:
- echo "this line is fine"
- exit 1
- echo "this line never runs"
A more realistic one, where a failing test aborts the job before a later command runs:
test:
image: python:3.12
script:
- pip install -r requirements.txt
- pytest # returns 1 on a failing test
- echo "done" # never reached; pytest already aborted the job
The job log shows the pytest failure, then jumps straight to ERROR: Job failed: exit code 1. The echo "done" never runs because GitLab’s wrapper uses set -e.
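The same abort can be simulated outside CI. This is a rough sketch that assumes bash is installed and uses bash -e as a stand-in for the errexit behaviour described above; it does not reproduce the runner's actual generated wrapper, and false stands in for a failing pytest run.

```shell
#!/usr/bin/env bash
# Run a three-line "job script" in a child shell with errexit (-e) enabled.
rc=0
output=$(bash -e -c '
  echo "this line is fine"
  false                        # stand-in for the failing test command
  echo "this line never runs"
') || rc=$?

echo "captured output: $output"
echo "child exit code: $rc"
```

Only the first echo appears in the captured output, and the child shell exits with 1, mirroring what the job log shows.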
Diagnostic Commands
1. Read the entire job log, top to bottom. The actual error is above the final line. Search the log for Error, FAILED, Traceback, npm ERR!, or a non-zero summary line.
2. Turn on full trace output so you see each command and its expansion as it runs:
variables:
CI_DEBUG_TRACE: "true"
CI_DEBUG_TRACE: "true" makes the runner emit set -x-style output for the whole job, including the generated shell wrapper, so you can see exactly which command exited non-zero. (Note: it prints masked variables in clear text, so use it on a private branch and remove it afterward.)
3. Add set -x to a single noisy line when you do not want trace for the whole job:
deploy:
script:
- set -x
- ./deploy.sh
- set +x
4. Run the exact command locally, ideally in the same image:
docker run --rm -it -v "$PWD:/app" -w /app python:3.12 bash
pip install -r requirements.txt
pytest # reproduce the same exit 1 locally, where you can iterate fast
echo "exit code: $?"
echo "exit code: $?" after a command prints its exit status so you can confirm which step is returning non-zero.
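If you save the job log to a file, the search from step 1 can be scripted. The following is a sketch only: first_error_line is a hypothetical helper, the marker list comes from step 1 above, and the sample log content is made up for the demo.

```shell
#!/usr/bin/env bash
# first_error_line LOGFILE: print "line-number:text" for the first line that
# matches a common failure marker (case-sensitive, so extend it as needed).
first_error_line() {
  grep -n -m 1 -E 'Error|FAILED|Traceback|npm ERR!' "$1"
}

# Demo against a throwaway log; these lines are illustrative, not real output.
log=$(mktemp)
cat > "$log" <<'EOF'
$ pytest
collected 2 items
FAILED tests/test_math.py::test_add - assert 3 == 4
$ echo cleanup
Cleaning up project directory and file based variables
ERROR: Job failed: exit code 1
EOF

first_hit=$(first_error_line "$log")
rm -f "$log"
echo "$first_hit"
```

The helper reports line 3, the FAILED line, rather than the final ERROR line, which is the line worth reading first.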
Step-by-Step Resolution
1. Find the real failing command. Scroll up from ERROR: Job failed: exit code 1 to the first error, traceback, or FAILED line. That command — not the last line — is what you fix.
2. Fix the underlying problem. A failing test gets fixed (or the test gets corrected). A lint error gets resolved. A missing dependency gets installed. A missing env var gets added to the project’s CI/CD variables. CI is reporting a real defect — treat it as one.
3. Disambiguate multi-command lines. Split a chained line so the log shows which sub-command failed:
# before — which one failed?
script:
- npm ci && npm run lint && npm run build
# after — each is its own log line with its own exit code
script:
- npm ci
- npm run lint
- npm run build
4. Handle shell pipes deliberately. If you pipe, decide whether a mid-pipe failure should fail the job:
script:
- set -o pipefail # make the whole pipe fail if any stage fails
- terraform plan | tee plan.txt
5. Control exit codes only when failure is genuinely acceptable. || true swallows the failure and forces success — useful for a best-effort cleanup, dangerous for a real check:
script:
- flaky-optional-step || true # OK: this step is non-critical
- ./run-tests.sh # never append || true here; it would hide real failures
The caveat: || true makes the job green even when the command broke. Use allow_failure: true at the job level instead if you want the pipeline to continue but still see the job flagged with a warning rather than a plain pass:
optional-scan:
script:
- ./scan.sh
allow_failure: true
This keeps the signal visible instead of hiding it behind a forced 0.
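When a step inside a script really is best-effort, a middle ground between || true and failing the job is to record the exit code and log it before continuing. The sketch below assumes bash; run_optional is a hypothetical helper name, and sh -c 'exit 7' stands in for the flaky step.

```shell
#!/usr/bin/env bash
# run_optional NAME CMD...: run a best-effort step; on failure, log the exit
# code to stderr and carry on instead of silently discarding it.
run_optional() {
  local name=$1; shift
  local rc=0
  "$@" || rc=$?
  if [ "$rc" -ne 0 ]; then
    echo "WARNING: optional step '$name' failed with exit code $rc" >&2
  fi
  return 0   # deliberately succeed so the rest of the script continues
}

# Demo: capture the warning so it can be inspected.
warning=$(run_optional "flaky-optional-step" sh -c 'exit 7' 2>&1)
final_rc=$?
echo "$warning"
echo "final exit code: $final_rc"
```

The job still continues, but the failure and its exit code stay in the log for whoever reads it later.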
Prevention and Best Practices
- Make logs readable. Keep one logical command per script line so the failing line is obvious in the log.
- Fail fast, fail loud. Do not blanket-wrap commands in || true. Reserve it for genuinely optional steps and prefer allow_failure: for visibility.
- Use set -o pipefail in jobs that pipe through tee, grep, or head, so a real failure is not hidden by a successful final stage.
- Pin your image so a command that passes today does not silently break tomorrow when latest moves.
- Reproduce locally in the CI image (docker run with the same image:) before pushing; it is far faster than push-and-pray.
- Keep CI_DEBUG_TRACE handy but off by default; flip it on a branch when a failure is opaque, then turn it back off.
Related Errors
- GitLab CI Error Guide: script ‘No such file or directory’ (when the failing command is missing or not executable rather than returning 1).
- GitLab CI Error Guide: ‘Invalid CI config’ (when the config is rejected before any job, and any exit code, runs).
- GitLab CI Error Guide: shallow clone ‘reference is not a tree’ (a git-checkout failure that surfaces as a job failure).
Frequently Asked Questions
Why does the log just say exit code 1 with no detail?
Because 1 is only the exit status of the last command the runner ran. The actual error message was printed by that command earlier in the log. Scroll up from the final line to the first Error/FAILED/traceback and you will find the real cause.
What is the difference between exit code 1 and exit status 1?
They mean the same thing — a non-zero exit. The wording differs by runner shell and context (exit code 1 for a normal script failure, exit status 1 sometimes for a prepare/system step). Both say the command returned 1.
Should I add || true to make the job pass?
Only if that specific step is genuinely optional. || true forces a 0 exit and hides real failures, turning your CI green when it should be red. For steps that may fail but shouldn’t block the pipeline, use allow_failure: true on the job so the failure stays visible.
My script works locally but fails with exit code 1 in CI. Why?
The CI environment differs: a missing secret or variable, a different shell, a tool that is not installed, a different working directory, or set -e aborting on a command you ignored locally. Run the command with docker run in the same image the job uses (its image: value) to reproduce it, and check the CI_DEBUG_TRACE: "true" output for the divergence.
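One way to turn a missing variable into an explicit message instead of a bare exit code 1 is to check for it at the top of the script. This is a sketch assuming bash (it relies on indirect expansion); require_vars and the DEMO_ variable names are hypothetical.

```shell
#!/usr/bin/env bash
# require_vars NAME...: return 1 and name every variable that is unset or empty.
require_vars() {
  local missing=0 name
  for name in "$@"; do
    if [ -z "${!name:-}" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Demo with made-up names: one variable is set, the other is not.
export DEMO_PRESENT_VAR="value"
unset DEMO_ABSENT_VAR
rc=0
report=$(require_vars DEMO_PRESENT_VAR DEMO_ABSENT_VAR 2>&1) || rc=$?
echo "$report (exit code $rc)"
```

In a real job you would call require_vars without capturing its output, so the message lands in the log right before the job fails.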
How do I see exactly which command failed in a chained line?
Either split a && b && c into separate script lines so each gets its own log entry, or set CI_DEBUG_TRACE: "true" (or add set -x) so every command is echoed before it runs, making the failing one easy to spot.
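To see what the trace looks like for a chained line, you can run one in a child shell with tracing on. This is a sketch assuming bash, with true and false standing in for the real sub-commands; the trace is written to stderr, so it is redirected for capture.

```shell
#!/usr/bin/env bash
# Run a chained line under bash -x; each command is echoed before it runs.
rc=0
trace=$(bash -x -c 'true && false && echo "never printed"' 2>&1) || rc=$?

# The last traced command is the one that broke the chain.
last_traced=$(printf '%s\n' "$trace" | tail -n 1)

echo "$trace"
echo "last traced command: $last_traced"
echo "exit code: $rc"
```

The final traced line is the sub-command that returned non-zero; everything after it in the chain never appears.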