Keying GitLab CI Caches on Lockfiles With cache:key:files

There are two ways a dependency cache fails you. It can be too sticky: you bump a package, but the cache still serves the old node_modules, so the build either uses stale dependencies or wastes time reconciling them. Or it can be too volatile: a per-branch key means every new branch starts cold. GitLab’s cache:key:files addresses both by tying the cache identity to your lockfile. Change the lockfile, get a new cache; leave it alone, and branches share the warm one. Here’s how to set it up so it actually helps.

The problem with branch-keyed caches

The default-ish pattern most people start with keys the cache to the ref:

cache:
  key: "$CI_COMMIT_REF_SLUG"
  paths:
    - node_modules/

Every branch gets its own cache. That sounds tidy, but it means a brand-new feature branch downloads every dependency from scratch even though package-lock.json is identical to main. Multiply that across a busy team and the redundant installs add up quickly. Worse, within a single long-lived branch, the key never changes when a dependency does, so the old cache keeps being restored until someone remembers to clear it, and a stale cache can mask a dependency bump.

Keying on the lockfile

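cache:key:files derives the cache key from the listed files instead of the branch. (GitLab’s documentation has described the key as a SHA computed from the most recent commits that changed each listed file, so in practice it changes whenever the lockfile does.) Point it at your lockfile: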

cache:
  key:
    files:
      - package-lock.json
  paths:
    - node_modules/
  policy: pull-push

Now the key tracks package-lock.json. Two branches whose lockfile is unchanged share the same cache, so new branches start warm. The moment someone changes a dependency, the lockfile changes, the key changes, and you get a fresh cache automatically. No manual clearing, no stale deps. One caveat: by default GitLab keeps separate caches for protected and unprotected branches, so the sharing happens within each group unless you change that project setting.

You can list up to two files, which is handy when a repository carries more than one lockfile:

cache:
  key:
    files:
      - package-lock.json
      - yarn.lock
  paths:
    - node_modules/

Add a prefix to scope it

If multiple jobs cache different things from the same lockfile, add a prefix so they don’t collide:

cache:
  key:
    files:
      - package-lock.json
    prefix: "$CI_JOB_NAME"
  paths:
    - node_modules/

The final key becomes the prefix plus the file hash. This is also how you keep, say, a build job’s cache separate from a test job’s cache even when both derive from the same lockfile.
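As a rough sketch of that separation, here are two jobs that key on the same lockfile but cache different things. The job names, scripts, and the .eslintcache path are assumptions for illustration; GitLab’s documentation shows the resulting key as the prefix joined to the SHA with a hyphen.

```yaml
# Hypothetical jobs: names and scripts are placeholders, not from the article.
build:
  script:
    - npm install
    - npm run build
  cache:
    key:
      files:
        - package-lock.json
      prefix: "$CI_JOB_NAME"   # key looks like build-<sha>
    paths:
      - node_modules/

lint:
  script:
    - npm install
    - npx eslint --cache .
  cache:
    key:
      files:
        - package-lock.json
      prefix: "$CI_JOB_NAME"   # key looks like lint-<sha>
    paths:
      - node_modules/
      - .eslintcache           # ESLint's default cache file when --cache is used
```

Because the prefix differs per job, the lint job’s extra cached file never overwrites or pollutes the build job’s archive.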

The fallback for warm starts

When the lockfile does change, the new key has no cache yet — so that build is cold. cache:fallback_keys lets it warm-start from the previous cache while it builds the new one:

cache:
  key:
    files:
      - package-lock.json
  fallback_keys:
    - "deps-default"
  paths:
    - node_modules/

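The job tries the exact lockfile-derived key first, then tries each fallback key in order if there’s no exact match, so even a dependency bump restores most of node_modules and only installs the delta. Note that a fallback key is only ever read here: the job still saves its cache under the lockfile key, so deps-default helps only if some other job writes a cache under that name. Also keep fallbacks within the same trust zone: never let an untrusted pipeline’s cache become a fallback for a protected build.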
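One way to make sure the fallback key actually exists is a small job that writes it. The sketch below is an assumption, not something the config above does on its own: the job name is hypothetical, and it assumes the branches that read deps-default are allowed to see caches written on the default branch (GitLab separates protected and unprotected caches unless that project setting is changed).

```yaml
# Hypothetical seeding job: keeps a stable "deps-default" cache up to date.
seed-deps-cache:
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH   # only refresh from the default branch
  script:
    - npm ci            # clean install, so the saved node_modules matches the lockfile
  cache:
    key: deps-default   # must match the name listed under fallback_keys
    paths:
      - node_modules/
    policy: push        # write only; this job never restores the old archive
```

With this in place, a job whose lockfile key has no archive yet can restore the most recent default-branch dependencies instead of starting empty.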

Let AI translate your stack — then prove the hit rate

The lockfile name and install command differ per ecosystem (package-lock.json, yarn.lock, poetry.lock, Gemfile.lock, go.sum), and the right paths differ too. I let an LLM map my stack to the config:

Prompt: “I use pnpm in a monorepo. Write a GitLab CI cache: block keyed on the lockfile with cache:key:files, a per-job prefix, a fallback key for warm starts, and the correct paths for the pnpm store. Then tell me exactly how to confirm in the job log whether the cache was a hit or a miss.”
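For orientation, a config in the shape that prompt asks for might look like the sketch below. It is an illustration, not the model’s output: the job name, image tag, store directory, and fallback key name are all assumptions, and the fallback only helps if some job writes a cache under that key.

```yaml
# Hypothetical pnpm job: names, image, and store location are assumptions.
install:
  image: node:20
  before_script:
    - corepack enable                         # makes the pnpm shim available
    - pnpm config set store-dir .pnpm-store   # keep the store inside the project dir so it can be cached
  script:
    - pnpm install --frozen-lockfile
  cache:
    key:
      files:
        - pnpm-lock.yaml
      prefix: "$CI_JOB_NAME"
    fallback_keys:
      - pnpm-default                          # only useful if another job pushes this key
    paths:
      - .pnpm-store/
```

Caching the store rather than node_modules is the design choice here: pnpm links packages out of the store, so a restored store lets the install skip the downloads.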

The verification half is the part I care about:

Output (excerpt): ”…Confirm the result in the job log’s ‘Restoring cache’ section: a hit shows Successfully extracted cache; a miss shows No URL provided, cache will not be downloaded. Compare the cache key printed at the top of the section across two pipelines with no lockfile change — it should be identical, proving branches share the cache.”

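That log check is non-negotiable. Treat the quoted strings as a starting point and compare them against your own logs: the “No URL provided” line typically means the runner has no distributed cache configured and will look for a local copy, which is not by itself a miss. It’s easy to write a cache:key:files block that looks right but caches the wrong path, so the key matches while node_modules is never actually populated, and you won’t notice because the build still passes, just slowly. Read the cache section, confirm the hit. For the reusable versions of these patterns, the package manager cache keys prompt and the broader GitLab CI/CD category are where I keep them.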
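To complement reading the log, a job can report what the cache actually delivered before the install runs. This is a minimal sketch assuming a POSIX shell on the runner; the job name and messages are hypothetical.

```yaml
# Hypothetical job: prints whether node_modules arrived populated from the cache.
test:
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
  script:
    # This runs after cache restore and before install, so it reflects the cache alone.
    - |
      if [ -d node_modules ] && [ -n "$(ls -A node_modules)" ]; then
        echo "node_modules restored from cache"
      else
        echo "node_modules empty or missing before install"
      fi
    - npm install
    - npm test
```

If the second message shows up on a pipeline where the key matched, the paths entry is the first thing to check.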

The bottom line

Branch-keyed caches are either too sticky or too cold; lockfile-keyed caches are precise. cache:key:files keys the cache on your lockfile so identical dependency sets share one warm cache across branches, and any dependency change invalidates the cache automatically. Add a prefix to separate jobs, fallback_keys for warm starts on a bump, and always confirm the hit in the job log rather than assuming. It’s a few lines that turn caching from a maintenance chore into something that just works.