ALB Target Group Health Check Diagnosis Prompt

Diagnose unhealthy or flapping targets behind an Application Load Balancer by correlating target-group health-check config, target reachability, security groups, and application response codes.

Target user: DevOps and SRE teams running services behind AWS load balancers
Difficulty: Intermediate
Tools: Claude, ChatGPT

The prompt

You are a senior AWS networking engineer who troubleshoots load balancer target health.

I will provide:
- Output of `aws elbv2 describe-target-health --target-group-arn ...` (states + reason codes like Target.FailedHealthChecks, Target.Timeout, Elb.RegistrationInProgress)
- The target group config: protocol, port, health-check path, interval, timeout, healthy/unhealthy thresholds, matcher (expected status codes)
- The security group rules on the targets and on the ALB
- The application's actual response on the health-check path (status code, latency) and relevant access/error log lines
- Whether targets are EC2 instances, IPs, or a Lambda, and the AZ/subnet layout
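
Before pasting the first input, it can help to condense the `describe-target-health` output into one line per state and reason code. The sketch below assumes the CLI's JSON output format (`--output json`); the function name and the sample target IDs are hypothetical and only illustrate the shape of the data.

```python
import json
from collections import defaultdict


def summarize_target_health(raw_json: str) -> dict:
    """Group "id:port" strings by (state, reason) from describe-target-health JSON."""
    data = json.loads(raw_json)
    groups = defaultdict(list)
    for desc in data.get("TargetHealthDescriptions", []):
        target = desc.get("Target", {})
        health = desc.get("TargetHealth", {})
        # Healthy targets carry no Reason field, so fall back to "-".
        key = (health.get("State", "unknown"), health.get("Reason", "-"))
        groups[key].append(f'{target.get("Id", "?")}:{target.get("Port", "?")}')
    return dict(groups)


# Hypothetical sample shaped like the CLI's JSON output; the IDs are made up.
SAMPLE = json.dumps({
    "TargetHealthDescriptions": [
        {"Target": {"Id": "i-0aaa", "Port": 8080},
         "TargetHealth": {"State": "unhealthy", "Reason": "Target.Timeout"}},
        {"Target": {"Id": "i-0bbb", "Port": 8080},
         "TargetHealth": {"State": "healthy"}},
        {"Target": {"Id": "i-0ccc", "Port": 8080},
         "TargetHealth": {"State": "unhealthy", "Reason": "Target.Timeout"}},
    ]
})

if __name__ == "__main__":
    for (state, reason), targets in summarize_target_health(SAMPLE).items():
        print(f"{state:10} {reason:28} {', '.join(targets)}")
```

Grouping by reason code first makes it obvious whether every target fails the same way (usually a config or security group problem) or only some do (more likely an AZ or per-host problem).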

Your job:

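1. **Read the reason codes** — translate each unhealthy reason (Target.Timeout, Target.ResponseCodeMismatch, Target.FailedHealthChecks) into a concrete hypothesis. Refused connections have no reason code of their own and typically surface as Target.FailedHealthChecks.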
2. **Check reachability** — confirm the ALB SG can reach the target SG on the health-check port, and that the path responds without auth/redirects.
3. **Validate the matcher** — compare the app's real status code to the configured matcher; flag 301/302/403 responses that fail a check expecting 200.
4. **Tune timing** — assess interval, timeout, and thresholds against app cold-start/warm-up time so targets that are still warming up aren't marked unhealthy (and then replaced by an Auto Scaling group) prematurely.
5. **Cross-AZ and draining** — check cross-zone load balancing, deregistration delay, and AZ imbalance that can mask or amplify failures.
6. **Slow start** — recommend slow-start or a dedicated lightweight `/healthz` endpoint if warm-up is the cause.
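
Steps 3 and 4 above come down to simple checks that can be sketched in a few lines. The example below is a rough illustration, not AWS's own logic: it assumes the matcher uses the ALB HTTP code syntax (a single code, a comma-separated list, or a range such as `200-299`), and it estimates state-change time as interval times threshold, which ignores timeout effects and where in the interval a failure starts. All function names are hypothetical.

```python
def matcher_accepts(matcher: str, status: int) -> bool:
    """Return True if an HTTP status satisfies a matcher like "200", "200,301" or "200-299"."""
    for part in matcher.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            if int(low) <= status <= int(high):
                return True
        elif int(part) == status:
            return True
    return False


def seconds_to_unhealthy(interval_s: int, unhealthy_threshold: int) -> int:
    """Rough estimate: consecutive failed checks required, times the check interval."""
    return interval_s * unhealthy_threshold


def seconds_to_healthy(interval_s: int, healthy_threshold: int) -> int:
    """Rough estimate: consecutive passing checks required, times the check interval."""
    return interval_s * healthy_threshold


if __name__ == "__main__":
    # A redirect on the health-check path fails a matcher that expects only 200.
    print(matcher_accepts("200", 302))
    # Widening the matcher to a range accepts it (fixing the redirect is usually better).
    print(matcher_accepts("200-399", 302))
    # With a 30 s interval and a threshold of 2, a failing target is flagged after about 60 s.
    print(seconds_to_unhealthy(30, 2))
```

If the app's warm-up time is longer than the estimate from `seconds_to_unhealthy`, a freshly started target is likely to be flagged before it can answer, which points toward step 6 rather than a networking fault.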

Output: (a) most-likely root cause with the supporting reason code, (b) the exact health-check or SG change, (c) a verification command, (d) any app-side fix.

Read-only diagnosis: recommend config and SG changes but do not deregister targets or modify production listeners yourself.
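
In the same read-only spirit, the reachability check from step 2 can be done offline against saved `aws ec2 describe-security-groups` output. This is a minimal sketch under stated assumptions: it only evaluates inbound rules that reference the ALB's security group by ID (CIDR-based rules and network ACLs are not checked), and the function name and group IDs are hypothetical.

```python
def sg_allows_from_group(ip_permissions: list, source_group_id: str, port: int) -> bool:
    """Check whether inbound rules admit TCP traffic on `port` from `source_group_id`.

    `ip_permissions` is the IpPermissions list of the target's security group.
    Only security-group references are evaluated; CIDR rules are ignored here.
    """
    for perm in ip_permissions:
        proto = perm.get("IpProtocol")
        if proto == "-1":
            # "-1" means all protocols and all ports.
            port_ok = True
        elif proto == "tcp":
            port_ok = perm.get("FromPort", 0) <= port <= perm.get("ToPort", 65535)
        else:
            port_ok = False
        if not port_ok:
            continue
        pairs = perm.get("UserIdGroupPairs", [])
        if any(pair.get("GroupId") == source_group_id for pair in pairs):
            return True
    return False


# Hypothetical inbound rules for a target security group; the IDs are made up.
TARGET_SG_RULES = [
    {"IpProtocol": "tcp", "FromPort": 8080, "ToPort": 8080,
     "UserIdGroupPairs": [{"GroupId": "sg-alb123"}]},
    {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
     "IpRanges": [{"CidrIp": "10.0.0.0/8"}]},
]

if __name__ == "__main__":
    # Traffic port is open to the ALB's group.
    print(sg_allows_from_group(TARGET_SG_RULES, "sg-alb123", 8080))
    # A health check overridden to a different port would be blocked.
    print(sg_allows_from_group(TARGET_SG_RULES, "sg-alb123", 8081))
```

Run the check against the health-check port rather than the traffic port when the target group overrides it, since a rule that admits application traffic does not necessarily admit the health check.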
