AI for GitLab CI/CD · By James Joyner IV · 11 min read

Using AI to Speed Up Docker Builds in GitLab CI

Cut Docker build times in GitLab CI using AI to fix layer ordering, wire up BuildKit registry cache with buildx, and push inline cache for fast, reliable rebuilds.

  • #gitlab
  • #ci-cd
  • #ai
  • #docker
  • #buildkit

I stared at the same pipeline stage for the third time that morning: build-image, 9 minutes 40 seconds, every single push. It didn’t matter that I’d only changed a comment in a README. The runner spun up docker:dind, pulled nothing useful, and rebuilt the entire image from FROM to ENTRYPOINT as if it had never seen my project before. Multiply that by a dozen pushes a day across the team and you’re burning hours of human waiting time on a problem a machine should have solved.

The fix wasn’t exotic. It was caching — the thing every Docker tutorial mentions and almost no CI pipeline actually configures correctly. What got me unstuck fast was pairing with an AI assistant to audit my Dockerfile and .gitlab-ci.yml together. I’ll show you exactly what we changed, why it works, and where you should absolutely not trust the robot.

The slow baseline (what most teams ship)

Here’s roughly what my original job looked like. It works, it’s just slow, because nothing persists between runs:

build-image:
  stage: build
  image: docker:27
  services:
    - docker:27-dind
  variables:
    IMAGE: $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA
  script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" "$CI_REGISTRY"
    - docker build -t "$IMAGE" .
    - docker push "$IMAGE"

Every CI run starts a fresh dind daemon with an empty layer store. docker build has no parent layers to reference, so it re-runs every instruction. No cache, no mercy.

Step one: turn on BuildKit and pull a cache image

The first thing the AI flagged when I pasted the job in: I wasn’t using BuildKit at all, and I wasn’t seeding the build with a previously-built image. BuildKit’s inline cache works by pointing the build at a known tag in the registry and offering that image’s layers as cache sources.

build-image:
  stage: build
  image: docker:27
  services:
    - docker:27-dind
  variables:
    DOCKER_BUILDKIT: "1"
    IMAGE: $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA
    CACHE_TAG: $CI_REGISTRY_IMAGE:buildcache
  script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
    - docker pull "$CACHE_TAG" || true
    - >
      docker build
      --build-arg BUILDKIT_INLINE_CACHE=1
      --cache-from "$CACHE_TAG"
      -t "$IMAGE"
      -t "$CACHE_TAG"
      .
    - docker push "$IMAGE"
    - docker push "$CACHE_TAG"

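Two things matter here. DOCKER_BUILDKIT: "1" selects the BuildKit builder; on Docker Engine 23.0 and later BuildKit is already the default, so with docker:27 the variable is redundant but harmless, and it makes the intent explicit. BUILDKIT_INLINE_CACHE=1 embeds cache metadata into the pushed image, so a future --cache-from "$CACHE_TAG" can actually reuse those layers. Pull the cache tag first (|| true so the very first run doesn’t fail), build against it, then push both the SHA tag and the rolling cache tag. Under BuildKit the explicit pull is optional, because --cache-from can import inline cache straight from the registry, but it does no harm. The || true is the kind of small resilience detail an assistant reliably remembers and I reliably forget.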

Step two: the layer-ordering bug AI is genuinely great at spotting

This is where the AI earned its keep. I pasted my Dockerfile and asked, “why does my cache always bust?” It immediately pointed at this:

FROM node:22-slim
WORKDIR /app
COPY . .
RUN npm ci
RUN npm run build
CMD ["node", "dist/server.js"]

COPY . . before npm ci means any file change — a README, a test, a typo fix — invalidates the layer that installs dependencies. So npm ci re-runs on every commit even though package.json hasn’t changed in weeks. The reordered version:

FROM node:22-slim
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build
CMD ["node", "dist/server.js"]

Copy the manifest and lockfile first, install, then copy source. Now the npm ci layer only rebuilds when package.json or package-lock.json actually changes. On a typical code-only commit, that cached install layer alone shaved ~4 minutes off my build. This is the canonical example of AI behaving like a sharp junior engineer: it spotted a mechanical, well-known anti-pattern instantly and explained it clearly. Cache-busting layer order is exactly the kind of thing it’s reliably good at.
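
To make the cache logic concrete, here is a small shell sketch. It imitates the idea that the npm ci layer is keyed only on the files copied before it, using a plain checksum. This illustrates the principle and is not BuildKit’s actual cache-key algorithm; deps_key and the throwaway demo project are hypothetical names invented for the example.

```shell
#!/bin/sh
# Illustration only: approximate "what feeds the npm ci layer" with a checksum
# of the two files copied before it. BuildKit computes its real cache keys
# differently; deps_key is a hypothetical helper.
deps_key() {
  # $1 = project directory containing package.json and package-lock.json
  cat "$1/package.json" "$1/package-lock.json" | cksum
}

# Throwaway demo project.
demo=$(mktemp -d)
printf '{"name":"app","version":"1.0.0"}\n' > "$demo/package.json"
printf '{"lockfileVersion":3}\n' > "$demo/package-lock.json"
printf 'hello\n' > "$demo/README.md"

before=$(deps_key "$demo")

# Source-only change: nothing that feeds the dependency layer moved.
printf 'hello, edited\n' > "$demo/README.md"
after_readme=$(deps_key "$demo")

# Dependency change: the inputs moved, so npm ci would re-run.
printf '{"lockfileVersion":3,"packages":{}}\n' > "$demo/package-lock.json"
after_lock=$(deps_key "$demo")

[ "$before" = "$after_readme" ] && echo "README edit: dependency layer inputs unchanged"
[ "$before" != "$after_lock" ] && echo "lockfile edit: dependency layer inputs changed"

rm -rf "$demo"
```

With the original COPY . . ordering, the README would be one of the inputs, and every edit would change the key.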

Pro Tip: ask the AI to “explain which layers will rebuild if I change a source file vs. a dependency.” Forcing it to reason layer-by-layer surfaces ordering bugs far better than asking “optimize my Dockerfile,” which tends to produce hand-wavy rewrites.

Step three: multi-stage builds with --target

If you ship a slim runtime image but still want to run tests, multi-stage plus --target lets you build only what you need per job. The AI suggested splitting the Dockerfile:

FROM node:22-slim AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci

FROM deps AS build
COPY . .
RUN npm run build

FROM node:22-slim AS runtime
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY --from=build /app/dist ./dist
CMD ["node", "dist/server.js"]

Then in CI you can build a single stage when that’s all a job needs:

test:
  stage: test
  image: docker:27
  services:
    - docker:27-dind
  variables:
    DOCKER_BUILDKIT: "1"
  script:
    - docker build --target build -t app-test --build-arg BUILDKIT_INLINE_CACHE=1 .
    - docker run --rm app-test npm test

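--target build stops the build at the build stage, skipping the runtime assembly entirely for the test job. As written, this job passes no --cache-from and never pushes app-test, so on a fresh dind daemon it still builds cold and the BUILDKIT_INLINE_CACHE build arg has no effect. Point it at the same cache tag as the build job if you want the test stage to reuse layers.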

Step four: graduate to buildx with type=registry cache

Inline cache is great, but it can only store the cache layers that end up in the final image. For richer caching — including intermediate stages — docker buildx with a dedicated registry cache export is the upgrade. This is the configuration that gave me the most consistent speedups.

build-image:
  stage: build
  image: docker:27
  services:
    - docker:27-dind
  variables:
    DOCKER_BUILDKIT: "1"
    IMAGE: $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA
    CACHE: $CI_REGISTRY_IMAGE:buildcache
  before_script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
    - docker buildx create --use --name ci-builder
  script:
    - >
      docker buildx build
      --cache-from type=registry,ref=$CACHE
      --cache-to type=registry,ref=$CACHE,mode=max
      --target runtime
      -t "$IMAGE"
      --push
      .

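The key flag is mode=max on --cache-to: it exports cache for every stage, not just the layers in the final image. Combined with type=registry, your cache lives in the GitLab Container Registry and survives across runners and across pipelines. The docker buildx create --use line matters too: it switches to a docker-container builder, because the default docker driver only supports registry cache export when the containerd image store is enabled. --push hands the result straight to the registry in one step. After this landed, a dependency-unchanged build dropped from 9:40 to under 2 minutes.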
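
A possible variation on this job, sketched under assumptions rather than taken from the setup above: buildx accepts --cache-from more than once, so a job can try a branch-specific cache first and fall back to the default branch’s cache. The cache_flags helper and the buildcache-<branch> tag naming are invented for illustration; CI_COMMIT_REF_SLUG and CI_DEFAULT_BRANCH are GitLab’s predefined variables.

```shell
#!/bin/sh
# cache_flags is a hypothetical helper. It prints buildx cache flags that read
# from the current branch's cache first, then the fallback branch's cache, and
# write only to the current branch's cache.
cache_flags() {
  image="$1"     # e.g. $CI_REGISTRY_IMAGE
  branch="$2"    # e.g. $CI_COMMIT_REF_SLUG (already safe to use in a tag)
  fallback="$3"  # e.g. $CI_DEFAULT_BRANCH (assumed to be tag-safe, like "main")
  printf '%s ' "--cache-from type=registry,ref=${image}:buildcache-${branch}"
  printf '%s ' "--cache-from type=registry,ref=${image}:buildcache-${fallback}"
  printf '%s\n' "--cache-to type=registry,ref=${image}:buildcache-${branch},mode=max"
}

# Usage inside a job script (word splitting of the output is intentional):
#   docker buildx build \
#     $(cache_flags "$CI_REGISTRY_IMAGE" "$CI_COMMIT_REF_SLUG" "$CI_DEFAULT_BRANCH") \
#     --target runtime -t "$IMAGE" --push .
```

The trade-off is storage: every active branch gets its own cache tag, which makes the cleanup policy mentioned in the tip that follows more important.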

Pro Tip: registry cache costs storage. Point --cache-to at a dedicated :buildcache tag rather than your release tags, and add a registry cleanup policy in GitLab so old cache blobs get garbage-collected instead of growing forever.
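
A cheap way to enforce that tip is a guard that runs before the build. The sketch below assumes a hypothetical check_cache_ref helper: it compares the cache reference against every tag the job is about to push and fails on a match.

```shell
#!/bin/sh
# check_cache_ref is a hypothetical guard: it fails when the cache reference
# is identical to any tag the job is about to push as a real image.
check_cache_ref() {
  cache_ref="$1"
  shift
  for image_ref in "$@"; do
    if [ "$cache_ref" = "$image_ref" ]; then
      echo "refusing to build: cache ref $cache_ref is also an image tag" >&2
      return 1
    fi
  done
  return 0
}

# Usage inside a job script, before docker buildx build:
#   check_cache_ref "$CACHE" "$IMAGE" "$CI_REGISTRY_IMAGE:latest" || exit 1
```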

Step five: review the diff like you would any junior’s PR

Everything above came together fast because the AI is a quick, well-read pair-programmer. But treat its output exactly like a pull request from an eager junior: read every line before you merge. A few real things to check:

  • It once suggested --cache-to mode=max pointed at my release tag, which would have written cache data to the same tag production pulls from. Wrong target, easy to miss.
  • It happily generated a docker login line — but it has no idea whether your variables are masked and protected in GitLab. That’s on you.
  • It can hallucinate buildx flags that don’t exist in your installed version. Pin your Docker version and verify against the real docker buildx build --help.
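
For the last point, a pipeline can check a flag against the installed CLI before relying on it. This is a minimal sketch: flag_in_help is a hypothetical helper, and it assumes flag names contain only letters, digits, and hyphens.

```shell
#!/bin/sh
# flag_in_help is a hypothetical helper: it succeeds only if the exact flag
# name appears in the supplied help text, delimited by whitespace or "=".
# Assumes the flag contains only letters, digits, and hyphens, so it is safe
# to drop into the regular expression unescaped.
flag_in_help() {
  help_text="$1"
  flag="$2"
  printf '%s\n' "$help_text" |
    grep -Eq -- "(^|[[:space:]])${flag}([[:space:]=]|\$)"
}

# Usage inside a job script, against the pinned Docker image's real CLI:
#   help_text=$(docker buildx build --help)
#   flag_in_help "$help_text" "--cache-to" || { echo "flag not supported" >&2; exit 1; }
```

The delimiter check is there so that a made-up flag such as --cache does not pass just because --cache-to exists.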

And the hard rule: never hand the assistant your registry credentials, CI_REGISTRY_PASSWORD, deploy tokens, or any CI secret. It doesn’t need them to suggest a flag or reorder a COPY. Keep secrets in GitLab CI/CD variables (masked, protected, scoped), and let the AI reason about structure, never about the values. If you want a second set of eyes on the security of a generated pipeline, run it through a dedicated code review pass rather than trusting the generator to grade its own homework.
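
If you do paste job logs or environment dumps into a chat, strip the values first. The filter below is a rough sketch, assuming NAME=value lines and a name-based heuristic (PASSWORD, TOKEN, SECRET, KEY). redact_env is a hypothetical helper, and pattern matching like this can miss secrets, so read the output before sharing it.

```shell
#!/bin/sh
# redact_env is a hypothetical filter: it reads NAME=value lines on stdin and
# replaces the value whenever NAME contains PASSWORD, TOKEN, SECRET, or KEY.
# It is a heuristic, not a guarantee; secrets under other names pass through.
redact_env() {
  sed -E 's/^([A-Za-z0-9_]*(PASSWORD|TOKEN|SECRET|KEY)[A-Za-z0-9_]*)=.*/\1=<redacted>/'
}

# Usage: sanitize an environment dump before copying it anywhere.
#   env | redact_env
```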

Putting it together

The pattern that worked: use the AI to audit layer ordering and propose buildx flags, validate each suggestion against the real CLI, and gate everything behind a human review before it touches a protected branch. I keep my battle-tested prompts for this in a reusable prompt library so I’m not re-explaining BuildKit caching to a fresh chat every week, and the heavier optimization recipes live in my prompt packs. If you’re doing this interactively, an editor-integrated assistant like Cursor or a chat model like Claude both handle Dockerfile reasoning well — pick whichever already sits in your workflow.

Conclusion

My build-image job went from a guaranteed 9-plus-minute tax on every push to a sub-2-minute step for the common case, mostly by fixing one COPY line and wiring up registry cache. None of it was novel — it was just configuration I’d been too busy to get right, and an AI assistant made the audit fast. Let it be the fast junior engineer it is: brilliant at spotting cache-busting layers and remembering buildx syntax, but never the one holding the keys. For more pipeline tuning, browse the rest of the GitLab CI/CD guides.
