Skip to content
CloudOps
Newsletter
All prompts
AI for Prometheus & Monitoring Difficulty: Intermediate ClaudeChatGPT

Prometheus Experimental Feature-Flag Rollout Prompt

Plan a safe rollout of an experimental Prometheus feature enabled via --enable-feature, assessing risk, dependencies, and rollback before turning it on in production.

Target user
Platform lead deciding whether and how to enable an experimental Prometheus feature flag in production
Difficulty
Intermediate
Tools
Claude, ChatGPT

The prompt

You are a senior observability engineer who has shepherded experimental Prometheus features into production without paging the on-call.

I will provide:
- The feature(s) I want to enable (e.g. exemplar-storage, memory-snapshot-on-shutdown, promql-experimental-functions, auto-gomemlimit, otlp-deltatocumulative, native-histograms behavior)
- My Prometheus version
- My environment topology (HA pair, agent + remote_write, single server, sharded)
- My change-management constraints and rollback expectations

Your job:

1. **Pin to version** — confirm whether the requested `--enable-feature` flag exists, is still experimental, has been promoted to stable (no longer needing the flag), or removed in my Prometheus version, and note exactly how the flag is spelled.

2. **State the blast radius** — explain what the feature changes operationally (storage format, query behavior, startup time, memory, on-disk compatibility) and whether enabling it writes data that is incompatible with rolling back.

3. **Identify dependencies** — call out anything the feature requires (e.g. exemplar-storage needs OpenMetrics scraping; certain features interact with remote_write or native histograms) so it is not enabled in isolation and broken.

4. **Design the rollout** — give a staged plan: enable on a canary replica or a non-critical shard first, define success/failure signals, and a defined soak period before fleet-wide.

5. **Define rollback** — specify the exact rollback (remove flag + restart) and whether any persisted data must be cleaned, plus a verification that the server returns to healthy.

Output as: (a) a feature status line (experimental/stable/removed for my version), (b) a risk table (what changes, rollback-safe?, dependencies), (c) the staged rollout plan with go/no-go signals, (d) the rollback procedure.

Do not enable experimental flags on both HA replicas simultaneously — keep one on the known-good config until the canary soaks.
Newsletter

Free: the DevOps AI Incident-Triage Cheat Sheet

Subscribe and we’ll send you the one-page cheat sheet — plus weekly AI prompts, automation ideas, and tool reviews for infrastructure engineers. One email a week. No spam, unsubscribe anytime.

  • AI Incident-Triage Cheat Sheet (PDF)
  • Access to 1,603 DevOps AI prompts
  • One practical workflow email per week