GitLab Container Registry Cleanup & Hygiene Prompt
Audit and clean up GitLab Container Registry storage — cleanup policies, tag retention, dependency proxy, garbage collection, image bloat across projects.
- Target user: Platform engineers managing GitLab Container Registry storage
- Difficulty: Intermediate
- Tools: Claude, ChatGPT
The prompt
You are a senior platform engineer who has managed GitLab Container Registry storage in production — across self-managed GitLab instances and SaaS. You know how to right-size cleanup policies, configure GC, and use the dependency proxy without disabling features users rely on.
I will provide:
- GitLab type: self-managed or SaaS
- Storage backend for registry: filesystem, S3, GCS, Azure Blob
- Current registry size (rough), top 5 largest projects/repos
- The cleanup policy setting (if any) per project
- Tag patterns in use (e.g., commit SHA tags, branch tags, semantic version tags, mutable `:latest`)
- The goal: reduce storage cost, enforce retention, comply with audit, or recover from runaway growth
Your job:
1. **Audit current state**:
- Total registry size
- Number of repos, tags per repo, layers
- Cleanup policy on each repo? Enabled? When does it run?
- Garbage collection (self-managed only): last run, configured behavior
2. **Design or fix cleanup policy** at the project level:
- `name_regex_keep` — keep tags matching regex (e.g., `^v\d+\.\d+\.\d+$` for semver releases)
- `name_regex_delete` — delete tags matching regex (e.g., `^[0-9a-f]{40}$` for commit SHA tags)
- `keep_n` — keep the N most recent matching tags
- `older_than` — only consider tags older than N days
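- **Order of evaluation matters**: the `latest` tag is always excluded first; then only tags matching `name_regex_delete` become candidates; tags matching `name_regex_keep` are rescued; the remaining candidates are sorted by creation date and the newest `keep_n` are kept; finally, only candidates older than `older_than` are deleted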
3. **For self-managed**: garbage collection is required to reclaim actual disk space after tag deletes:
- Soft delete (default): tag removed from registry metadata; layers remain on disk until GC
- `registry-garbage-collect` runs the GC; should be scheduled
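- Offline GC needs the registry stopped or in read-only mode while it runs; `gitlab-ctl registry-garbage-collect` stops the registry for the duration. The `-m` flag also removes untagged manifests (it is not a preview), so use it carefully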
4. **For SaaS**: GitLab handles GC for you, but cleanup policies still need configuration per project
5. **Tag pattern recommendations**:
- **Keep**: `^v\d+\.\d+\.\d+$` (semver releases), `^main$`, `^latest$` (if mutable latest is used)
- **Aggressively clean**: `^[0-9a-f]{7,40}$` (commit SHA), `^merge-request-\d+$` (ephemeral MR tags), feature-branch tags
- **Mutable tags are a hazard** for content-addressable deploys; consider digest pinning
6. **Dependency proxy**:
- For mirroring `docker.io` and other registries to avoid rate limits
- Configured at the group level (`gitlab.example.com/<group>/dependency_proxy/containers/image:tag`)
- Storage grows with usage; can be cleaned via API
7. **Cross-cutting issues**:
- Builds tagging with `$CI_COMMIT_SHA` AND `$CI_COMMIT_REF_SLUG` → tag explosion (every commit on every branch)
- "Latest" tag pushed by every successful pipeline → multiple layers stack up
- Multi-arch manifests count as separate tags; cleanup policy needs to handle them
- Long-lived branches (`release/*` etc.): make sure the cleanup policy excludes their tags
8. **Tag strategy refactor**:
- Tag with semver on release: `myapp:v1.2.3`
- Push ephemeral SHA tags for review: `myapp:rev-abc1234` (clean aggressively)
- Don't tag every CI commit with stable names you'd want to keep
Provide concrete cleanup-policy YAML examples and CLI/API commands to apply.
---
GitLab type: [self-managed v.X / SaaS]
Registry backend: [filesystem / S3 / GCS / Azure]
Total registry size: [N TB]
Top consumers: [DESCRIBE — top 5 projects/repos]
Current cleanup policy state: [DESCRIBE — enabled/per-project, sample policy]
Tag patterns in use:
```
[DESCRIBE — SHA tags, branch tags, semver, latest, etc.]
```
Goal: [reduce cost / enforce retention / audit / runaway recovery]
Why this prompt works
GitLab Container Registry storage grows unbounded by default; many self-managed instances accumulate terabytes of stale CI-build images over months. Cleanup policies exist but are off by default and require careful regex tuning. This prompt walks the tag-pattern analysis and policy design.
How to use it
- Inventory by project, not just total size. One bad project often dominates.
- Decide your tag strategy explicitly: which tags are kept forever (releases), which are ephemeral (SHA/branch), which are mutable (`latest`).
- Test cleanup policies on a low-stakes project first. First runs are heavy.
- Schedule GC for off-hours on self-managed. The `gitlab-ctl` wrapper stops the registry while it runs, so pushes and pulls fail until it finishes.
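To make the per-project inventory concrete, here is a minimal Python sketch that ranks projects by registry size. It assumes the input is the JSON list returned by the projects API when called with `statistics=true`, and that each entry carries a `statistics.container_registry_size` value in bytes. Whether that field is populated depends on your GitLab version, so treat both as assumptions. The function name `top_registry_consumers` is hypothetical.

```python
def top_registry_consumers(projects, n=5):
    """Rank projects by container registry size.

    `projects` is assumed to be a list of dicts shaped like the GitLab
    projects API response with statistics=true (an assumption, not verified
    against any particular GitLab version).
    Returns (path, size_bytes, share_of_total) tuples, largest first.
    """
    sized = []
    for project in projects:
        stats = project.get("statistics") or {}
        # Missing or null sizes are treated as 0 bytes.
        size = stats.get("container_registry_size") or 0
        sized.append((project["path_with_namespace"], size))

    total = sum(size for _, size in sized)
    sized.sort(key=lambda item: item[1], reverse=True)
    return [
        (path, size, (size / total) if total else 0.0)
        for path, size in sized[:n]
    ]


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = [
        {"path_with_namespace": "group/api", "statistics": {"container_registry_size": 60}},
        {"path_with_namespace": "group/monolith", "statistics": {"container_registry_size": 900}},
        {"path_with_namespace": "group/docs"},
        {"path_with_namespace": "group/worker", "statistics": {"container_registry_size": 40}},
    ]
    for path, size, share in top_registry_consumers(sample, n=2):
        print(f"{path}: {size} bytes ({share:.0%} of registry storage)")
```

Printing the share next to the absolute size makes a single dominant project obvious, which is usually where the first cleanup policy should go.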
Useful commands
# Size of a single container repository (size is reported only when the registry tracks it, e.g. with the metadata database)
curl --header "PRIVATE-TOKEN: <t>" \
"https://gitlab.example.com/api/v4/registry/repositories/<repo-id>?size=true" | jq
# Per-project repos
curl --header "PRIVATE-TOKEN: <t>" \
"https://gitlab.example.com/api/v4/projects/<id>/registry/repositories?tags_count=true" | jq
# Tags in a repo
curl --header "PRIVATE-TOKEN: <t>" \
"https://gitlab.example.com/api/v4/projects/<id>/registry/repositories/<repo-id>/tags" | jq
# Delete an individual tag (DESTRUCTIVE)
curl --request DELETE --header "PRIVATE-TOKEN: <t>" \
"https://gitlab.example.com/api/v4/projects/<id>/registry/repositories/<repo-id>/tags/<tag>"
# Bulk delete tags by regex (DESTRUCTIVE) — schedule the cleanup policy instead
curl --request DELETE --header "PRIVATE-TOKEN: <t>" \
"https://gitlab.example.com/api/v4/projects/<id>/registry/repositories/<repo-id>/tags" \
--data "name_regex_delete=.*&name_regex_keep=release-.*&keep_n=10&older_than=14d"
# Set/update cleanup policy on a project
# (--data-urlencode keeps the literal "+" in a regex from being decoded as a space)
curl --request PUT --header "PRIVATE-TOKEN: <t>" \
"https://gitlab.example.com/api/v4/projects/<id>" \
--data 'container_expiration_policy_attributes[cadence]=1d' \
--data 'container_expiration_policy_attributes[enabled]=true' \
--data 'container_expiration_policy_attributes[keep_n]=10' \
--data 'container_expiration_policy_attributes[older_than]=30d' \
--data-urlencode 'container_expiration_policy_attributes[name_regex_delete]=.*' \
--data-urlencode 'container_expiration_policy_attributes[name_regex_keep]=^(v\d+\.\d+\.\d+|main|latest)$'
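# Self-managed: garbage collection (run on registry host; the registry is stopped while it runs)
sudo gitlab-ctl registry-garbage-collect # remove unreferenced layers
sudo gitlab-ctl registry-garbage-collect -m # also remove untagged manifests (more destructive, not a preview)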
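One detail worth checking when a cleanup policy is set through the API with form data: a literal `+` in a regex is decoded as a space unless it is percent-encoded. The Python sketch below builds the same PUT request as the curl example and lets the standard library do the encoding. It only constructs the request and does not send it; `build_policy_request` is a hypothetical helper name, and the project ID and token are placeholders.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_policy_request(base_url, project_id, token, attrs):
    """Build (but do not send) a PUT request that updates a cleanup policy."""
    form = {}
    for key, value in attrs.items():
        # Send lowercase true/false, matching the curl example.
        if isinstance(value, bool):
            value = "true" if value else "false"
        form[f"container_expiration_policy_attributes[{key}]"] = value

    # urlencode percent-encodes "+" as %2B, so the regex survives form decoding.
    body = urlencode(form).encode("ascii")
    return Request(
        url=f"{base_url}/api/v4/projects/{project_id}",
        data=body,
        method="PUT",
        headers={
            "PRIVATE-TOKEN": token,
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


if __name__ == "__main__":
    request = build_policy_request(
        "https://gitlab.example.com",
        42,  # placeholder project ID
        "<t>",  # placeholder token
        {"enabled": True, "keep_n": 10, "name_regex_keep": r"v\d+\.\d+\.\d+|main"},
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("ascii"))
```

Passing the returned object to `urllib.request.urlopen` would send it. Printing the body first is a cheap way to confirm the regex arrives intact.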
Cleanup policy templates
Aggressive (CI-build heavy project)
# Settings → Packages and registries → Container registry → Cleanup policies
enabled: true
cadence: 1d # daily
keep_n: 10 # keep the 10 newest tags among those eligible for deletion
older_than: 7d # only consider tags older than 7 days
name_regex_keep: '^(v\d+\.\d+\.\d+|main|latest|release-.*)$'
name_regex_delete: '.*' # everything else
Effect: keep all release/main/latest tags forever; for everything else, keep 10 newest, delete the rest if older than 7 days.
Conservative (production image, slow churn)
enabled: true
cadence: 7d
keep_n: 25
older_than: 90d
name_regex_keep: '^(v\d+\.\d+\.\d+|main|stable)$'
name_regex_delete: '^(rev-|branch-|mr-).*'
Effect: only delete review-style tags older than 90 days, always sparing the 25 newest of them; keep semver/main/stable forever.
Delete rev/MR/branch tags only (recommended starting point)
enabled: true
cadence: 1d
keep_n: 5
older_than: 14d
name_regex_keep: '' # only delete what name_regex_delete matches
name_regex_delete: '^(rev-[0-9a-f]{7,40}|mr-\d+|branch-.*)$'
Effect: deletes only tags that match the explicit pattern (beyond the 5 newest, once older than 14 days); any tag that does not match is kept.
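Before enabling any of these templates, it can help to dry-run the policy against a list of tag names and creation dates. The Python sketch below is a local approximation, not GitLab's implementation. It assumes the evaluation order described in GitLab's cleanup policy documentation: `latest` is excluded, tags matching `name_regex_delete` become candidates, tags matching `name_regex_keep` are rescued, the newest `keep_n` candidates are kept, and only candidates older than `older_than` are deleted. It also assumes patterns must match the whole tag name, and it uses Python's `re` module in place of the regex engine GitLab uses. `Tag` and `tags_to_delete` are hypothetical names.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Tag:
    name: str
    created_at: datetime


def tags_to_delete(tags, *, now, name_regex_delete, name_regex_keep="",
                   keep_n=None, older_than_days=None):
    """Return the names of tags the policy would delete, sorted alphabetically."""
    # `latest` is never removed by a cleanup policy.
    candidates = [t for t in tags if t.name != "latest"]

    # Only tags matching name_regex_delete are candidates (whole-name match assumed).
    delete_re = re.compile(name_regex_delete)
    candidates = [t for t in candidates if delete_re.fullmatch(t.name)]

    # Tags matching name_regex_keep are rescued.
    if name_regex_keep:
        keep_re = re.compile(name_regex_keep)
        candidates = [t for t in candidates if not keep_re.fullmatch(t.name)]

    # The newest keep_n remaining candidates are kept.
    candidates.sort(key=lambda t: t.created_at, reverse=True)
    if keep_n:
        candidates = candidates[keep_n:]

    # Only candidates older than older_than are deleted.
    if older_than_days is not None:
        cutoff = now - timedelta(days=older_than_days)
        candidates = [t for t in candidates if t.created_at < cutoff]

    return sorted(t.name for t in candidates)


if __name__ == "__main__":
    now = datetime(2024, 6, 1, tzinfo=timezone.utc)

    def aged(name, days):
        return Tag(name, now - timedelta(days=days))

    # Made-up tags for illustration only.
    tags = [
        aged("latest", 100), aged("v1.2.3", 100), aged("main", 1),
        aged("rev-aaaaaaa", 30), aged("rev-bbbbbbb", 20),
        aged("rev-ccccccc", 10), aged("rev-ddddddd", 2),
    ]
    print(tags_to_delete(
        tags, now=now,
        name_regex_delete=r".*",
        name_regex_keep=r"^(v\d+\.\d+\.\d+|main|latest|release-.*)$",
        keep_n=1, older_than_days=7,
    ))
```

With the sample data, the three oldest `rev-` tags are reported for deletion, while `rev-ddddddd` survives because of `keep_n=1` and `v1.2.3`, `main`, and `latest` are never candidates.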
Tag strategy (recommended)
In .gitlab-ci.yml:
build-image:
  script:
    # Log in with the job's predefined registry credentials
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
    - docker build -t "$CI_REGISTRY_IMAGE:rev-$CI_COMMIT_SHORT_SHA" .
    - docker push "$CI_REGISTRY_IMAGE:rev-$CI_COMMIT_SHORT_SHA"
    # Also tag main:
    - |
      if [ "$CI_COMMIT_BRANCH" = "$CI_DEFAULT_BRANCH" ]; then
        docker tag "$CI_REGISTRY_IMAGE:rev-$CI_COMMIT_SHORT_SHA" "$CI_REGISTRY_IMAGE:main"
        docker push "$CI_REGISTRY_IMAGE:main"
      fi
    # Also tag semver on release:
    - |
      if [ -n "$CI_COMMIT_TAG" ]; then
        docker tag "$CI_REGISTRY_IMAGE:rev-$CI_COMMIT_SHORT_SHA" "$CI_REGISTRY_IMAGE:$CI_COMMIT_TAG"
        docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_TAG"
      fi
Cleanup policy keeps main, vX.Y.Z; deletes rev-* older than 14d.
Common findings this catches
- Cleanup policy off across entire instance → quick win to enable with conservative defaults.
- `name_regex_delete: .*` without `name_regex_keep` → deletes everything. Always pair.
- No GC scheduled on self-managed → metadata-deleted tags still consume disk. Schedule weekly GC.
- `:latest` pushed every CI run → many historical `latest` layers. Switch to semver + rev pattern; rebuild `:latest` only on release.
- Multi-arch manifests not cleaned → cleanup policy might delete the OCI list but leave per-arch tags. Use manifest-aware tooling.
- Forks have separate registries that accumulate — audit fork registries; same policy may not apply.
- Dependency proxy growing unbounded → check if proxy has TTL settings in your GitLab version.
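Several of these findings can be checked mechanically before a policy goes live. The Python sketch below lints a policy given as a plain dict with the same keys as the templates. It is a hypothetical helper, not part of GitLab: `lint_policy` and its `must_keep` argument (a list of tag names you never want to expire) are names made up for illustration, and whole-name regex matching is assumed.

```python
import re


def lint_policy(policy, must_keep=()):
    """Return a list of warnings for a cleanup policy given as a plain dict."""
    warnings = []
    delete_re = policy.get("name_regex_delete") or ""
    keep_re = policy.get("name_regex_keep") or ""

    if not policy.get("enabled"):
        warnings.append("policy is disabled")
    if delete_re in (".*", ".+") and not keep_re:
        warnings.append("name_regex_delete matches every tag and name_regex_keep is empty")
    if not policy.get("keep_n") and not policy.get("older_than"):
        warnings.append("neither keep_n nor older_than is set")

    # Flag must-keep names that match the delete pattern and are not rescued.
    for name in must_keep:
        if name == "latest":
            continue  # cleanup policies never remove `latest`
        deletable = bool(delete_re) and re.fullmatch(delete_re, name) is not None
        rescued = bool(keep_re) and re.fullmatch(keep_re, name) is not None
        if deletable and not rescued:
            warnings.append(f"tag {name!r} is eligible for deletion")
    return warnings


if __name__ == "__main__":
    risky = {
        "enabled": True, "keep_n": 10, "older_than": "7d",
        "name_regex_delete": ".*", "name_regex_keep": "",
    }
    for warning in lint_policy(risky, must_keep=["v1.2.3", "main"]):
        print("WARN:", warning)
```

Running a check like this over every project's policy is a quick way to spot catch-all delete patterns that have no matching keep pattern.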
Garbage collection (self-managed)
# Schedule GC weekly (cron)
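0 3 * * 0 /usr/bin/gitlab-ctl registry-garbage-collect >> /var/log/registry-gc.log 2>&1
# gitlab-ctl registry-garbage-collect stops the registry while it runs (pushes and pulls fail). Use:
# - off-hours
# - or read-only mode plus a manual run of the registry's garbage-collect command, which keeps pulls working
# - or online garbage collection (newer registry; requires the metadata database)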
When to escalate
- Registry size doubling/month → emergency policy + GC; communicate to users before the next cleanup run.
- S3/GCS backend bill spiking — check lifecycle rules on the bucket too; layered storage tiers can help.
- Compliance audit requires immutable retention of certain tags — combine cleanup policy with manual “tag protection” on those release tags (newer GitLab feature).