CloudOps
AI for GitLab CI/CD

GitLab Container Registry Cleanup & Hygiene Prompt

Audit and clean up GitLab Container Registry storage — cleanup policies, tag retention, dependency proxy, garbage collection, image bloat across projects.

Target user
Platform engineers managing GitLab Container Registry storage
Difficulty
Intermediate
Tools
Claude, ChatGPT

The prompt

You are a senior platform engineer who has managed GitLab Container Registry storage in production — across self-managed GitLab instances and SaaS. You know how to right-size cleanup policies, configure GC, and use the dependency proxy without disabling features users rely on.

I will provide:
- GitLab type: self-managed or SaaS
- Storage backend for registry: filesystem, S3, GCS, Azure Blob
- Current registry size (rough), top 5 largest projects/repos
- The cleanup policy setting (if any) per project
- Tag patterns in use (e.g., commit SHA tags, branch tags, semantic version tags, mutable `:latest`)
- The goal: reduce storage cost, enforce retention, comply with audit, or recover from runaway growth

Your job:

1. **Audit current state**:
   - Total registry size
   - Number of repos, tags per repo, layers
   - Cleanup policy on each repo? Enabled? When does it run?
   - Garbage collection (self-managed only): last run, configured behavior
2. **Design or fix cleanup policy** at the project level:
   - `name_regex_keep` — keep tags matching regex (e.g., `^v\d+\.\d+\.\d+$` for semver releases)
   - `name_regex_delete` — delete tags matching regex (e.g., `^[0-9a-f]{40}$` for commit SHA tags)
   - `keep_n` — keep the N most recent matching tags
   - `older_than` — only consider tags older than N days
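   - **Order of evaluation matters**: the tag named `latest` is always excluded; tags matching `name_regex_delete` become candidates; `name_regex_keep` removes its matches from that list; the remaining candidates are sorted by creation date and the newest `keep_n` are kept; tags newer than `older_than` are kept; whatever is left is deleted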
3. **For self-managed**: garbage collection is required to reclaim actual disk space after tag deletes:
   - Soft delete (default): tag removed from registry metadata; layers remain on disk until GC
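   - `gitlab-ctl registry-garbage-collect` runs the GC; should be scheduled
   - `gitlab-ctl` stops the registry while GC runs; to avoid downtime, put the registry in read-only mode and run the registry binary's `garbage-collect` directly. The `-m` flag also deletes untagged manifests (it is not a preview mode), so use it carefully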
4. **For SaaS**: GitLab handles GC for you, but cleanup policies still need configuration per project
5. **Tag pattern recommendations**:
   - **Keep**: `^v\d+\.\d+\.\d+$` (semver releases), `^main$`, `^latest$` (if mutable latest is used)
   - **Aggressively clean**: `^[0-9a-f]{7,40}$` (commit SHA), `^merge-request-\d+$` (ephemeral MR tags), feature-branch tags
   - **Mutable tags are a hazard** for content-addressable deploys; consider digest pinning
6. **Dependency proxy**:
   - Pull-through cache for upstream images (Docker Hub in particular) to avoid rate limits
   - Configured at the group level (`gitlab.example.com/<group>/dependency_proxy/containers/image:tag`)
   - Storage grows with usage; can be cleaned via API
7. **Cross-cutting issues**:
   - Builds tagging with `$CI_COMMIT_SHA` AND `$CI_COMMIT_REF_SLUG` → tag explosion (every commit on every branch)
   - "Latest" tag pushed by every successful pipeline → each push leaves the previous image behind as an untagged manifest, which only GC can reclaim
   - Multi-arch manifests count as separate tags; cleanup policy needs to handle them
   - Preserve long-lived branches (release/* etc.): make sure the cleanup policy excludes their tags
8. **Tag strategy refactor**:
   - Tag with semver on release: `myapp:v1.2.3`
   - Push ephemeral SHA tags for review: `myapp:rev-abc1234` (clean aggressively)
   - Don't tag every CI commit with stable names you'd want to keep

Provide concrete cleanup-policy YAML examples and CLI/API commands to apply.

---

GitLab type: [self-managed v.X / SaaS]
Registry backend: [filesystem / S3 / GCS / Azure]
Total registry size: [N TB]
Top consumers: [DESCRIBE — top 5 projects/repos]
Current cleanup policy state: [DESCRIBE — enabled/per-project, sample policy]
Tag patterns in use:
```
[DESCRIBE — SHA tags, branch tags, semver, latest, etc.]
```
Goal: [reduce cost / enforce retention / audit / runaway recovery]

Why this prompt works

GitLab Container Registry storage grows unbounded by default; many self-managed instances accumulate terabytes of stale CI-build images over months. Cleanup policies exist but are off by default and require careful regex tuning. This prompt walks the tag-pattern analysis and policy design.
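
The tag-pattern analysis can be rehearsed offline before any policy is written. The following is a minimal Python sketch, assuming you have already exported a list of tag names; the category names and the `classify_tag` and `summarize` helpers are hypothetical, and the patterns are adapted from the prompt's recommendations rather than taken from GitLab.

```python
import re
from collections import Counter

# Hypothetical categories; patterns adapted from the prompt's recommendations.
TAG_PATTERNS = [
    ("release", re.compile(r"v\d+\.\d+\.\d+")),
    ("commit-sha", re.compile(r"[0-9a-f]{7,40}")),
    ("merge-request", re.compile(r"merge-request-\d+")),
    ("mutable", re.compile(r"latest|main")),
]


def classify_tag(name: str) -> str:
    """Return the first category whose pattern matches the whole tag name."""
    for category, pattern in TAG_PATTERNS:
        if pattern.fullmatch(name):
            return category
    return "other"


def summarize(tag_names):
    """Count tags per category so the dominant pattern stands out."""
    return Counter(classify_tag(name) for name in tag_names)


sample = ["v1.2.3", "latest", "main", "a1b2c3d", "merge-request-42", "feature-login"]
for category, count in summarize(sample).most_common():
    print(f"{category}: {count}")
```

A large "other" bucket usually means the tagging convention is undocumented, which is worth settling before choosing regexes.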

How to use it

  1. Inventory by project, not just total size. One bad project often dominates.
  2. Decide your tag strategy explicitly: which tags are kept forever (releases), which are ephemeral (SHA/branch), which are mutable (latest).
  3. Test cleanup policies on a low-stakes project first. First runs are heavy.
  4. Schedule GC for off-hours on self-managed. `gitlab-ctl registry-garbage-collect` stops the registry while it runs, so pushes and pulls are both unavailable.
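
To make step 1 concrete, here is a small Python sketch that ranks projects by registry size and shows each one's share of the total. The project paths and byte counts are made-up sample data, and `rank_by_size` is a hypothetical helper; in practice the sizes would come from whatever inventory you collected.

```python
def rank_by_size(sizes_bytes: dict[str, int]) -> list[tuple[str, int, float]]:
    """Sort projects by registry size and attach each one's share of the total."""
    total = sum(sizes_bytes.values())
    ranked = sorted(sizes_bytes.items(), key=lambda item: item[1], reverse=True)
    return [(path, size, size / total if total else 0.0) for path, size in ranked]


# Made-up sample data: project path -> registry size in bytes.
inventory = {
    "platform/api": 120 * 1024**3,
    "platform/web": 40 * 1024**3,
    "data/etl-runner": 1640 * 1024**3,
}

for path, size, share in rank_by_size(inventory):
    print(f"{path:20s} {size / 1024**3:8.0f} GiB  {share:6.1%}")
```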

Useful commands

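# Size and tag count for one container repo (this endpoint takes a repository ID;
# get IDs from the per-project call below; size may not be reported on every instance)
curl --header "PRIVATE-TOKEN: <t>" \
  "https://gitlab.example.com/api/v4/registry/repositories/<repo-id>?size=true&tags_count=true" | jq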

# Per-project repos
curl --header "PRIVATE-TOKEN: <t>" \
  "https://gitlab.example.com/api/v4/projects/<id>/registry/repositories?tags_count=true" | jq

# Tags in a repo
curl --header "PRIVATE-TOKEN: <t>" \
  "https://gitlab.example.com/api/v4/projects/<id>/registry/repositories/<repo-id>/tags" | jq

# Delete an individual tag (DESTRUCTIVE)
curl --request DELETE --header "PRIVATE-TOKEN: <t>" \
  "https://gitlab.example.com/api/v4/projects/<id>/registry/repositories/<repo-id>/tags/<tag>"

# Bulk delete tags by regex (DESTRUCTIVE) — schedule the cleanup policy instead
curl --request DELETE --header "PRIVATE-TOKEN: <t>" \
  "https://gitlab.example.com/api/v4/projects/<id>/registry/repositories/<repo-id>/tags" \
  --data "name_regex_delete=.*&name_regex_keep=release-.*&keep_n=10&older_than=14d"

# Set/update cleanup policy on a project
# (regex values use --data-urlencode so that "+" is not decoded as a space)
curl --request PUT --header "PRIVATE-TOKEN: <t>" \
  "https://gitlab.example.com/api/v4/projects/<id>" \
  --data 'container_expiration_policy_attributes[cadence]=1d' \
  --data 'container_expiration_policy_attributes[enabled]=true' \
  --data 'container_expiration_policy_attributes[keep_n]=10' \
  --data 'container_expiration_policy_attributes[older_than]=30d' \
  --data-urlencode 'container_expiration_policy_attributes[name_regex_delete]=.*' \
  --data-urlencode 'container_expiration_policy_attributes[name_regex_keep]=^(v\d+\.\d+\.\d+|main|latest)$'

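# Self-managed: garbage collection (run on registry host; stops the registry while it runs)
sudo gitlab-ctl registry-garbage-collect          # remove layers no manifest references
sudo gitlab-ctl registry-garbage-collect -m       # also remove untagged manifests (DESTRUCTIVE, not a preview)
# For a preview, the underlying `registry garbage-collect` binary has a --dry-run (-d) flag;
# check your version's docs for the exact invocation.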
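
When a cleanup policy is sent as form data, regex characters need URL-encoding, because a literal `+` in a form body is decoded as a space. The Python sketch below illustrates this with the standard library only; it assumes the server follows the usual `application/x-www-form-urlencoded` decoding rules and makes no network calls.

```python
from urllib.parse import parse_qs, urlencode

KEY = "container_expiration_policy_attributes[name_regex_keep]"
REGEX = r"^(v\d+\.\d+\.\d+|main)$"

# Properly encoded body: "+" becomes %2B and survives a round trip.
encoded_body = urlencode({KEY: REGEX})
round_trip = parse_qs(encoded_body)[KEY][0]

# Unencoded body (what sending the raw regex would produce): "+" decodes to a space.
raw_body = f"{KEY}={REGEX}"
mangled = parse_qs(raw_body)[KEY][0]

print("encoded body :", encoded_body)
print("round trip   :", round_trip)
print("raw decoded  :", mangled)
```

The mangled value is still a string the API may accept, so the policy can end up with a regex that silently matches nothing.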

Cleanup policy templates

Aggressive (CI-build heavy project)

# Settings → Packages and registries → Container registry → Cleanup policies
enabled: true
cadence: 1d                                  # daily
keep_n: 10                                   # always keep the 10 newest tags that are deletion candidates
older_than: 7d                               # only consider tags older than 7 days
name_regex_keep: '^(v\d+\.\d+\.\d+|main|latest|release-.*)$'
name_regex_delete: '.*'                      # everything else

Effect: keep all release/main/latest tags forever; for everything else, keep 10 newest, delete the rest if older than 7 days.
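
To check a policy's effect before enabling it, the evaluation order can be simulated locally. This is a minimal Python sketch under stated assumptions: it follows the order GitLab documents (exclude `latest`, apply the delete regex, apply the keep regex, sort by creation date, skip the newest `keep_n`, skip anything newer than `older_than`), it ignores protected tags and tags without manifests, and it uses Python's `re` rather than the RE2 engine GitLab uses. The `tags_to_delete` function and the `older_than_days` key are hypothetical names.

```python
import re
from datetime import datetime, timedelta


def tags_to_delete(tags, policy, now):
    """tags: list of (name, created_at). Returns the names the policy would delete."""
    delete_re = re.compile(policy["name_regex_delete"])
    keep_pattern = policy.get("name_regex_keep")
    keep_re = re.compile(keep_pattern) if keep_pattern else None

    # 1. "latest" is never a candidate; 2. only delete-regex matches are candidates.
    candidates = [t for t in tags if t[0] != "latest" and delete_re.fullmatch(t[0])]
    # 3. The keep regex rescues its matches.
    if keep_re:
        candidates = [t for t in candidates if not keep_re.fullmatch(t[0])]
    # 4. Newest first, then skip the newest keep_n.
    candidates.sort(key=lambda t: t[1], reverse=True)
    candidates = candidates[policy.get("keep_n", 0):]
    # 5. Only tags older than the cutoff are deleted.
    cutoff = now - timedelta(days=policy["older_than_days"])
    return [name for name, created in candidates if created < cutoff]


AGGRESSIVE = {
    "name_regex_delete": r".*",
    "name_regex_keep": r"^(v\d+\.\d+\.\d+|main|latest|release-.*)$",
    "keep_n": 10,
    "older_than_days": 7,
}

now = datetime(2024, 6, 1)
tags = [("latest", now), ("main", now - timedelta(days=1)), ("v1.0.0", now - timedelta(days=200))]
# Twelve review tags, one per day, newest is one day old.
tags += [(f"rev-{i:07x}", now - timedelta(days=i)) for i in range(1, 13)]

doomed = tags_to_delete(tags, AGGRESSIVE, now)
print(doomed)
```

With this sample data only the two oldest review tags fall outside both the `keep_n` window and the 7-day grace period.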

Conservative (production image, slow churn)

enabled: true
cadence: 7d
keep_n: 25
older_than: 90d
name_regex_keep: '^(v\d+\.\d+\.\d+|main|stable)$'
name_regex_delete: '^(rev-|branch-|mr-).*'

Effect: only delete review-style tags older than 90 days; keep semver/main/stable forever.

Explicit delete patterns only

enabled: true
cadence: 1d
keep_n: 5
older_than: 14d
name_regex_keep: ''                          # only delete what name_regex_delete matches
name_regex_delete: '^(rev-[0-9a-f]{7,40}|mr-\d+|branch-.*)$'

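Effect: only tags matching the explicit delete pattern are candidates; of those, the 5 newest and anything younger than 14 days are kept. Tags that match no pattern are never touched.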
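
Policy values are restricted to fixed option sets, so a template can be sanity-checked before it is applied. The Python sketch below assumes the option sets GitLab's documentation lists at the time of writing (verify them against your version); `validate_policy` is a hypothetical helper, and compiling with Python's `re` only approximates GitLab's RE2 syntax check.

```python
import re

# Assumption: option sets as listed in GitLab's docs; confirm for your version.
ALLOWED = {
    "cadence": {"1d", "7d", "14d", "1month", "3month"},
    "keep_n": {1, 5, 10, 25, 50, 100},
    "older_than": {"7d", "14d", "30d", "90d"},
}


def validate_policy(policy: dict) -> list[str]:
    """Return human-readable problems; an empty list means nothing was flagged."""
    problems = []
    for field, allowed in ALLOWED.items():
        value = policy.get(field)
        if value is not None and value not in allowed:
            problems.append(f"{field}={value!r} is not one of {sorted(allowed, key=str)}")
    for field in ("name_regex_delete", "name_regex_keep"):
        pattern = policy.get(field)
        if pattern:
            try:
                re.compile(pattern)
            except re.error as exc:
                problems.append(f"{field} does not compile: {exc}")
    return problems


explicit_only = {
    "enabled": True,
    "cadence": "1d",
    "keep_n": 5,
    "older_than": "14d",
    "name_regex_keep": "",
    "name_regex_delete": r"^(rev-[0-9a-f]{7,40}|mr-\d+|branch-.*)$",
}
print(validate_policy(explicit_only))
print(validate_policy({"cadence": "2d", "keep_n": 7, "name_regex_delete": "rev-("}))
```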

In .gitlab-ci.yml:

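build-image:
  before_script:
    # Authenticate to the project's registry with the job's predefined credentials
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
  script:
    # Every pipeline gets an ephemeral rev-<short SHA> tag (cleaned up by policy)
    - docker build -t "$CI_REGISTRY_IMAGE:rev-$CI_COMMIT_SHORT_SHA" .
    - docker push "$CI_REGISTRY_IMAGE:rev-$CI_COMMIT_SHORT_SHA"
    # Also tag main (branch pipelines on the default branch only):
    - |
      if [ "$CI_COMMIT_BRANCH" = "$CI_DEFAULT_BRANCH" ]; then
        docker tag "$CI_REGISTRY_IMAGE:rev-$CI_COMMIT_SHORT_SHA" "$CI_REGISTRY_IMAGE:main"
        docker push "$CI_REGISTRY_IMAGE:main"
      fi
    # Also tag semver on release (tag pipelines only):
    - |
      if [ -n "$CI_COMMIT_TAG" ]; then
        docker tag "$CI_REGISTRY_IMAGE:rev-$CI_COMMIT_SHORT_SHA" "$CI_REGISTRY_IMAGE:$CI_COMMIT_TAG"
        docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_TAG"
      fi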

Cleanup policy keeps main, vX.Y.Z; deletes rev-* older than 14d.
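
The tagging rules in that job can be expressed as a small pure function, which makes them easy to test outside CI. This Python sketch mirrors the shell conditions above; `image_tags` is a hypothetical helper, and the environment dictionaries are sample values rather than real pipeline output.

```python
def image_tags(env: dict[str, str]) -> list[str]:
    """Return the tags one pipeline would push, mirroring the job's shell logic."""
    tags = [f"rev-{env['CI_COMMIT_SHORT_SHA']}"]
    branch = env.get("CI_COMMIT_BRANCH", "")
    if branch and branch == env.get("CI_DEFAULT_BRANCH", ""):
        tags.append("main")
    if env.get("CI_COMMIT_TAG"):
        tags.append(env["CI_COMMIT_TAG"])
    return tags


# Sample environments: feature branch, default branch, and a release tag pipeline.
feature = {"CI_COMMIT_SHORT_SHA": "abc1234", "CI_COMMIT_BRANCH": "feature/login", "CI_DEFAULT_BRANCH": "main"}
default = {"CI_COMMIT_SHORT_SHA": "abc1234", "CI_COMMIT_BRANCH": "main", "CI_DEFAULT_BRANCH": "main"}
release = {"CI_COMMIT_SHORT_SHA": "abc1234", "CI_COMMIT_TAG": "v1.2.3", "CI_DEFAULT_BRANCH": "main"}

for name, env in [("feature", feature), ("default", default), ("release", release)]:
    print(name, image_tags(env))
```

Feature branches produce only the ephemeral `rev-` tag, so the number of long-lived tags grows with releases rather than with commits.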

Common findings this catches

  • Cleanup policy off across entire instance → quick win to enable with conservative defaults.
  • name_regex_delete: .* without name_regex_keep → every tag except latest becomes a deletion candidate, protected only by keep_n and older_than. Always pair.
  • No GC scheduled on self-managed → metadata-deleted tags still consume disk. Schedule weekly GC.
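  • :latest pushed every CI run → each push leaves the previous image behind as an untagged manifest that cleanup policies never see. Switch to semver + rev pattern; rebuild :latest only on release; on self-managed, run GC with -m to reclaim the untagged manifests.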
  • Multi-arch manifests not cleaned → cleanup policy might delete the OCI list but leave per-arch tags. Use manifest-aware tooling.
  • Forks have separate registries that accumulate — audit fork registries; same policy may not apply.
  • Dependency proxy growing unbounded → check if proxy has TTL settings in your GitLab version.
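
A few of these findings can be detected mechanically from exported policy settings. The following Python sketch is illustrative only: `lint_policy` is a hypothetical helper, the input dictionary shape is an assumption, and the checks cover just the first two findings in the list.

```python
# Patterns that match every tag name once the regex is anchored.
MATCH_ALL = {".*", "^.*$", ".+"}


def lint_policy(policy: dict) -> list[str]:
    """Flag the riskiest policy shapes from the findings list."""
    findings = []
    if not policy.get("enabled"):
        findings.append("cleanup policy is disabled")
    delete_re = policy.get("name_regex_delete") or ""
    keep_re = policy.get("name_regex_keep") or ""
    if delete_re in MATCH_ALL and not keep_re:
        findings.append("name_regex_delete matches every tag and name_regex_keep is empty")
    return findings


print(lint_policy({"enabled": False}))
print(lint_policy({"enabled": True, "name_regex_delete": ".*", "name_regex_keep": ""}))
print(lint_policy({"enabled": True, "name_regex_delete": ".*", "name_regex_keep": r"v\d+\.\d+\.\d+"}))
```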

Garbage collection (self-managed)

# Schedule GC weekly (cron)
0 3 * * 0 /usr/bin/gitlab-ctl registry-garbage-collect > /var/log/registry-gc.log 2>&1

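# gitlab-ctl registry-garbage-collect stops the registry for the duration of GC,
# so pushes and pulls both fail. Options:
# - run it off-hours
# - or set the registry to read-only mode and run the registry binary's garbage-collect directly (pulls keep working)
# - or use online garbage collection (newer registry versions; requires the registry metadata database)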
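
Why tag deletion alone does not free disk is easier to see with a toy model of mark-and-sweep. The Python sketch below is not the registry's implementation; it only assumes that GC keeps every blob reachable from a live manifest, and that untagged manifests count as live unless the equivalent of `-m` is requested. All digests are made-up placeholders.

```python
def sweepable_blobs(tags, manifests, all_blobs, delete_untagged=False):
    """Return the blobs a mark-and-sweep pass would remove.

    tags: tag name -> manifest digest
    manifests: manifest digest -> set of layer digests
    all_blobs: every digest currently stored (manifests and layers)
    """
    # Mark phase: decide which manifests are live, then mark their layers.
    live_manifests = set(tags.values()) if delete_untagged else set(manifests)
    marked = set(live_manifests)
    for digest in live_manifests:
        marked |= manifests[digest]
    # Sweep phase: anything unmarked is reclaimable.
    return all_blobs - marked


# ":latest" was re-pushed, so manifest m1 is now untagged; layer "base" is shared.
manifests = {"m1": {"base", "app-old"}, "m2": {"base", "app-new"}}
tags = {"latest": "m2"}
all_blobs = {"m1", "m2", "base", "app-old", "app-new"}

print(sorted(sweepable_blobs(tags, manifests, all_blobs)))
print(sorted(sweepable_blobs(tags, manifests, all_blobs, delete_untagged=True)))
```

In this model the default pass reclaims nothing, while the untagged-manifest pass reclaims the old manifest and its unshared layer.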

When to escalate

  • Registry size doubling/month → emergency policy + GC; communicate to users before the next cleanup run.
  • S3/GCS backend bill spiking → check lifecycle rules on the bucket too (stale incomplete multipart uploads are a common culprit); cheaper storage tiers can help.
  • Compliance audit requires immutable retention of certain tags — combine cleanup policy with manual “tag protection” on those release tags (newer GitLab feature).
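
To judge whether growth has reached the "doubling per month" threshold, two size measurements are enough. This Python sketch assumes exponential growth between the two samples, which is a simplification; `doubling_time_days` is a hypothetical helper and the figures are made up.

```python
import math


def doubling_time_days(size_start: float, size_end: float, days_between: float) -> float:
    """Estimate days until the registry doubles, assuming exponential growth."""
    growth = size_end / size_start
    if growth <= 1:
        return math.inf  # flat or shrinking: never doubles
    return days_between * math.log(2) / math.log(growth)


# Made-up measurements: 4.0 TB grew to 5.0 TB over 10 days.
estimate = doubling_time_days(4.0, 5.0, 10)
print(f"doubles roughly every {estimate:.0f} days")
```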
