n8n Error-Handling and Retry Workflow Design Prompt
Design the error-handling layer for an n8n ops automation — error workflows, retries with backoff, partial-failure handling, and dead-letter routing — so a flaky third-party node does not silently drop work or fire duplicate side effects.
- Target user: Engineers building production n8n ops automations
- Difficulty: Beginner
- Tools: Claude, ChatGPT
The prompt
You are a senior automation engineer who has shipped n8n flows into ops use and learned that the happy path is the easy 20%; the error handling is what keeps it trustworthy.

I will provide:
- The n8n workflow and what it does (nodes, external services it calls)
- Which nodes have side effects and whether those side effects are idempotent
- The trigger (webhook, schedule, queue) and expected volume
- The failure modes seen (timeouts, rate limits, malformed input)

Your job:
1. **Failure classification**: for each node, classify failures as retryable (timeout, 429, 5xx) vs terminal (validation, auth), since they need different handling.
2. **Retry configuration**: set per-node retry count, delay, and backoff for retryable failures, and explain why non-idempotent side-effect nodes need guards before any retry.
3. **Error workflow**: design a dedicated error-trigger workflow that captures the failed execution, the input data, and the error, then routes it to alert and dead-letter storage.
4. **Idempotency guards**: add dedup/idempotency keys before side-effect nodes so a retry or replay does not create a duplicate record, message, or charge.
5. **Partial-failure handling**: for batch/loop nodes, decide continue-on-fail vs stop, and how failed items are collected rather than lost.
6. **Dead-letter and replay**: define where failed items land and a safe manual replay path that re-checks the guard before re-running.
7. **Observability**: list what to log/alert on (failure rate, dead-letter depth, retries exhausted) so a silently failing flow gets noticed.

Output as: a per-node failure/retry table, the error-workflow design, the idempotency-guard placement, and the dead-letter/replay procedure.

Require that any non-idempotent side-effect node sit behind an idempotency guard before retries are enabled, and that failed items go to a dead-letter with a human-reviewed replay rather than auto-retry forever.
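
Example sketch (not part of the prompt): to make the expected behaviour concrete, the following is a minimal plain-Python illustration of the logic the prompt asks the model to design: retryable vs terminal classification, exponential backoff with jitter, an idempotency guard before the side effect, and dead-letter collection for a batch. It is not n8n code and uses no n8n API. Every name here (`CallFailed`, `process_batch`, `seen_keys`, the `idempotency_key` field, the status set, the default limits) is a hypothetical chosen for illustration, and the in-memory set stands in for what would need to be a durable store in a real flow.

```python
import random
import time
from dataclasses import dataclass, field

# Assumption: these statuses are treated as transient; adjust per service.
RETRYABLE_STATUS = {408, 429, 500, 502, 503, 504}


class CallFailed(Exception):
    """Hypothetical error carrying the HTTP status of a failed call."""

    def __init__(self, status: int):
        super().__init__(f"call failed with status {status}")
        self.status = status


def is_retryable(status: int) -> bool:
    """Timeouts, rate limits and 5xx are retryable; everything else is terminal."""
    return status in RETRYABLE_STATUS


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff with full jitter; attempt counts from 0."""
    return random.uniform(0, min(cap, base * 2 ** attempt))


@dataclass
class Outcome:
    done: list = field(default_factory=list)         # keys that succeeded
    skipped: list = field(default_factory=list)      # keys blocked by the guard
    dead_letter: list = field(default_factory=list)  # failures kept for review


def process_batch(items, side_effect, seen_keys, max_attempts=3, sleep=time.sleep):
    """Run side_effect per item, continuing past failures instead of stopping."""
    out = Outcome()
    for item in items:
        key = item["idempotency_key"]
        # Idempotency guard: never re-run a side effect that already succeeded.
        if key in seen_keys:
            out.skipped.append(key)
            continue
        for attempt in range(max_attempts):
            try:
                side_effect(item)
                seen_keys.add(key)
                out.done.append(key)
                break
            except CallFailed as err:
                exhausted = attempt == max_attempts - 1
                if not is_retryable(err.status) or exhausted:
                    # Terminal error or retries exhausted: keep the item and
                    # its context for a human-reviewed replay.
                    out.dead_letter.append(
                        {"item": item, "status": err.status, "attempts": attempt + 1}
                    )
                    break
                sleep(backoff_delay(attempt))
    return out
```

A manual replay under this sketch would pass the dead-lettered items back through `process_batch` with the same `seen_keys`, so the guard is re-checked before anything runs again. Note that recording the key only after success does not cover a timeout where the remote service had already applied the change; for that case the key would also need to be forwarded to, and honoured by, the remote service.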