Dependency-Aware Remediation Ordering Prompt

Sequence multi-step, multi-service automated remediation correctly: build a dependency graph of services and actions, order remediation to respect startup/shutdown dependencies, and avoid fixes that trigger cascading failures or remediate a symptom while the root cause is still breaking things downstream.

Target user: Platform engineers building orchestrated, multi-service auto-remediation
Difficulty: Advanced
Tools: Claude, ChatGPT

The prompt

You are a senior automation/platform engineer who has watched a well-meaning remediation restart services in the wrong order and turn one outage into five. Design dependency-aware ordering for multi-service automated remediation.

I will provide:
- The services/components and their dependency relationships
- The remediation actions available per component (restart, failover, scale, drain)
- The startup/shutdown ordering constraints and health signals
- Past incidents where remediation order made things worse

Your job:

1. **Dependency graph** — model the services as a directed dependency graph and identify ordering constraints, cycles, and shared/critical-path components.
2. **Action ordering rules** — derive correct sequences for common remediations (e.g. drain before restart, fix dependency before dependent, failover order) and the reverse order for recovery.
3. **Cascade-avoidance** — flag actions that, if mis-ordered, cause cascading failure, and define guards (wait-for-healthy gates between steps, partial-degradation tolerance).
4. **Root-cause vs symptom** — add logic to avoid remediating a downstream symptom while the upstream cause is still failing, including when to hold and escalate instead.
5. **Parallel vs serial** — decide which actions can safely run in parallel and which must be serialized, respecting blast radius and shared dependencies.
6. **Back-out ordering** — define the reverse-ordered back-out so undoing a partial remediation doesn't itself cascade.

Output as: (a) the dependency graph and critical-path callouts, (b) ordered action sequences per remediation scenario, (c) cascade-avoidance guards and health gates, (d) root-cause-vs-symptom hold/escalate logic, (e) the back-out ordering plan.

Default to caution on ordering uncertainty: if the dependency graph is incomplete or an action's downstream impact is unclear, serialize with health gates between steps, hold and escalate to a human rather than guessing, and ensure the back-out sequence is itself dependency-aware and tested.
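
Illustrative sketch (not part of the prompt): to give a feel for the kind of ordering logic the prompt asks the model to design, here is a minimal Python example using the standard-library `graphlib` module (Python 3.9+). The service names, the `DEPENDS_ON` map, and the `restart` / `is_healthy` callbacks are hypothetical placeholders, and restart is the only action modeled. It groups services into dependency-ordered waves, gates each wave on health, escalates on a cycle, and returns a reverse-ordered list for back-out when a gate fails.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical topology: each service maps to the services it depends on.
DEPENDS_ON = {
    "postgres": set(),
    "cache": set(),
    "auth": {"postgres"},
    "api": {"auth", "cache"},
    "web": {"api"},
}


def restart_waves(depends_on):
    """Group services into waves; dependencies always land in an earlier wave."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises CycleError if the graph contains a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


def remediate(depends_on, restart, is_healthy):
    """Restart wave by wave, with a health gate after each wave.

    Returns (status, services). On "hold", services is the list of
    services already acted on, in reverse order, as a back-out sequence.
    """
    try:
        waves = restart_waves(depends_on)
    except CycleError:
        # An unorderable graph is a case for a human, not a guess.
        return "escalate", []

    acted = []
    for wave in waves:
        for svc in wave:
            restart(svc)
            acted.append(svc)
        # Health gate: do not touch dependents until this wave is healthy.
        if not all(is_healthy(svc) for svc in wave):
            return "hold", list(reversed(acted))
    return "ok", acted
```

Services within a wave have no dependencies on each other, so they are candidates for parallel execution; running them one at a time is the more cautious choice when blast radius or shared dependencies are unclear.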
