Routing Azure Front Door and Application Gateway With AI

The page said 502 Bad Gateway. The backend was fine — I could curl it directly from a jumpbox and get a clean 200. But every request through Application Gateway came back 502, and the longer I stared at the backend the further I got from the answer, because the backend was never the problem. The health probe was sending the wrong host header, the backend pool was marked unhealthy, and the gateway was returning 502 for a service that was working perfectly. That is the defining feature of Azure’s L7 stack: the symptom at the edge almost never points at the cause.

Front Door and Application Gateway are powerful and unforgiving. Listeners, routing rules, backend pools, health probes, end-to-end TLS, and a WAF all sit between the client and your app, and any one of them can produce a 502 or a 404 that looks like an application failure. AI is genuinely useful here, not because it knows your topology but because it imposes structure: classify the failure first, then walk the layers. It will not change your routing for you — you own every az network command — but it turns a panicked stare into a methodical narrowing.

Classify before you touch anything

The single most valuable move is to decide what kind of failure you have before changing a setting. A 404 with no matching rule is a routing problem. A 502 with a probe failing is a health problem. A 403 from the WAF is a security-rule problem. Each has a different fix, and conflating them is how you spend an hour editing the wrong thing.
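The triage above can be sketched as a small shell function. This is a minimal illustration, not a diagnostic tool: the function name classify_failure and the category labels are hypothetical, and a real 404 or 403 can also come from the application itself, so treat the output as the first layer to inspect rather than a verdict.

```shell
#!/usr/bin/env bash
# classify_failure STATUS [BACKEND_HEALTH]
# Maps the status code seen at the edge, plus the backend health the gateway
# reports, to the layer worth inspecting first. Labels are illustrative only.
classify_failure() {
  local status="$1" health="${2:-unknown}"
  case "$status" in
    404) echo "routing" ;;            # likely no listener or path rule matched
    403) echo "waf" ;;                # likely a WAF rule; confirm in the firewall log
    502)
      if [ "$health" = "Unhealthy" ]; then
        echo "health"                 # probe is failing: check probe and host header
      else
        echo "backend-response"       # pool looks healthy: check timeouts and backend TLS
      fi
      ;;
    *) echo "unclassified" ;;
  esac
}
```

For example, classify_failure 502 Unhealthy prints health, which is the case the rest of this section walks through.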

Prompt: “Application Gateway WAF_v2 returns 502 for requests to app.example.com/api. The backend pool is an App Service that returns 200 when I curl it directly from a VM in the same VNet. Backend health shows ‘Unhealthy’. Classify the most likely cause and tell me the read-only checks to confirm it before I change anything.”

A good answer points straight at the health probe and the host header it sends, and gives you read-only commands to confirm rather than a config change to gamble on:

# What does the gateway think of the backend pool?
az network application-gateway show-backend-health \
  --name appgw-prod --resource-group rg-net \
  --query "backendAddressPools[].backendHttpSettingsCollection[].servers[].{addr:address, health:health}" -o table

# What host header does the probe send, and does it pick the host from the HTTP settings?
az network application-gateway probe list \
  --gateway-name appgw-prod --resource-group rg-net -o table

If the probe is sending appgw.local and your App Service expects app.example.com, the probe gets a 404 (or a redirect, which fails any probe whose match condition accepts only 200), marks the pool unhealthy, and the gateway returns 502 for everyone. That host-header mismatch is the most common phantom 502 on the platform, and you’d never find it by looking at the backend.
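One way to see the mismatch for yourself is to replay the probe from the jumpbox with each host header in turn. The sketch below assumes curl is available; the function name probe_status and the address 10.0.1.4 are hypothetical. It sets only the Host header, so the TLS SNI still follows the address you connect to, and -k skips certificate validation so the test isolates the host-header question.

```shell
#!/usr/bin/env bash
# probe_status ADDRESS HOST [PATH]
# Sends one HTTPS GET to ADDRESS with an explicit Host header and prints
# only the HTTP status code.
probe_status() {
  local address="$1" host="$2" path="${3:-/}"
  curl -sk -o /dev/null -w '%{http_code}' \
    -H "Host: ${host}" "https://${address}${path}"
}

# Usage (hypothetical address and host names):
#   probe_status 10.0.1.4 appgw.local /        # what the probe may be sending
#   probe_status 10.0.1.4 app.example.com /    # what the app expects
```

If the first call returns a 404 or a redirect and the second returns 200, the backend is fine and the probe's host header is the thing to fix.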

Host headers are where most 502s actually live

Application Gateway and Front Door each have their own notion of which host header reaches the backend, and getting it wrong breaks both the health probe and live traffic. The setting names differ (pickHostNameFromBackendAddress on an Application Gateway HTTP setting, exposed in the CLI as --host-name-from-backend-pool; origin host header on Front Door), but the failure is identical: the backend receives a host it doesn’t recognize and responds with a 404 or an unexpected redirect, which the gateway can treat as unhealthy.
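Both settings can be read without changing anything. The sketch below wraps the two read-only calls in a function; the function name and every resource name (appgw-prod, rg-net, fd-prod, og-app, app-origin) are placeholders, and the Front Door call assumes a Standard or Premium profile managed through az afd.

```shell
#!/usr/bin/env bash
# show_host_settings: read-only dump of the host-header settings on both products.
# All resource names are hypothetical placeholders; substitute your own.
show_host_settings() {
  # Application Gateway: does each HTTP setting take the host from the
  # backend address, or override it with a fixed hostName?
  az network application-gateway http-settings list \
    --gateway-name appgw-prod --resource-group rg-net \
    --query "[].{name:name, pickHost:pickHostNameFromBackendAddress, hostName:hostName}" \
    -o table

  # Front Door: which host header does this origin receive?
  az afd origin show \
    --profile-name fd-prod --origin-group-name og-app --origin-name app-origin \
    --resource-group rg-net \
    --query "{hostName:hostName, originHostHeader:originHostHeader}" \
    -o table
}
```

The output of these two calls is also a reasonable thing to paste into the prompt that follows, since it is the configuration the model needs to reason about.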

Prompt: “Here are my Application Gateway HTTP settings and probe config. The backend is an App Service that only responds correctly to its own hostname myapp.azurewebsites.net. Tell me whether the host header sent by the probe and by the routing rule will match what the backend expects, and the exact setting to fix if not.”

Let AI reason about the host-header path end to end, but verify the fix in a test rule before you touch the production listener. Changing the probe host header on a live gateway can flip a healthy pool to unhealthy in seconds — which is the same outage in reverse. This kind of careful, one-layer-at-a-time debugging is the through-line across all the Azure networking work.
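For the App Service case in the prompt above, the fix usually comes down to two flags, applied to a test HTTP setting and probe first. This is a sketch under assumptions: the function name and the resource names passed to it are hypothetical, and if the probe already has an explicit host configured you may need to clear it before the second command is accepted.

```shell
#!/usr/bin/env bash
# use_backend_host_header GATEWAY RESOURCE_GROUP HTTP_SETTINGS PROBE
# Points both live traffic and the probe at the backend's own hostname,
# then reads backend health back. Intended for a test setting and probe.
use_backend_host_header() {
  local gw="$1" rg="$2" settings="$3" probe="$4"

  # Live traffic: take the Host header from the backend pool entry's FQDN.
  az network application-gateway http-settings update \
    --gateway-name "$gw" --resource-group "$rg" --name "$settings" \
    --host-name-from-backend-pool true || return 1

  # Probe: reuse whatever host the HTTP settings resolve to.
  az network application-gateway probe update \
    --gateway-name "$gw" --resource-group "$rg" --name "$probe" \
    --host-name-from-http-settings true || return 1

  # Read-only confirmation that the pool is healthy after the change.
  az network application-gateway show-backend-health \
    --name "$gw" --resource-group "$rg" \
    --query "backendAddressPools[].backendHttpSettingsCollection[].servers[].{addr:address, health:health}" \
    -o table
}

# Usage (hypothetical names):
#   use_backend_host_header appgw-test rg-net settings-test probe-test
```

The function stops at the first failed update so a rejected change to the HTTP settings never leaves the probe half-reconfigured.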

Treat the WAF as a scalpel, not a switch

When the WAF returns 403 and a legitimate request is being blocked, the tempting move is to disable the WAF and watch the error vanish. Don’t. You’ve just removed a protection for every request to make one path work. The right move is to find the specific rule that fired and scope an exclusion to that rule and that path.

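// Find the rules that fired for blocked requests in the WAF logs (Log Analytics)
AzureDiagnostics
| where Category == "ApplicationGatewayFirewallLog"
// With anomaly scoring, the final block is often logged under rule 949110,
// while the rule that actually tripped is logged as "Matched", so keep both.
| where action_s in ("Blocked", "Matched")
| project TimeGenerated, action_s, ruleId_s, Message, requestUri_s
| order by TimeGenerated desc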

Prompt: “My Application Gateway WAF blocked a legitimate POST to /api/upload with rule ID 942100 (SQL injection). The request body contains a base64 field that’s triggering a false positive. Recommend the narrowest WAF exclusion — scoped to that rule ID and that request attribute — and how to validate it without disabling the WAF globally.”

AI is good at proposing the scoped exclusion or a switch to Detection mode for one path while you validate. You own the decision to apply it and the decision to re-enable enforcement afterward. The discipline is the same as everywhere in security: narrow the change, validate, restore the protection. The matching debug prompt for this whole workflow lives in the prompts library.
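Validation can be as simple as replaying the request that was blocked and checking whether the gateway still answers 403. The sketch below assumes curl and a JSON request body saved to a file; the function name waf_replay, the URL, and the file name are hypothetical.

```shell
#!/usr/bin/env bash
# waf_replay URL BODY_FILE
# Replays a POST through the gateway and reports whether it was blocked.
# Returns non-zero when the response is a 403.
waf_replay() {
  local url="$1" body_file="$2" status
  status="$(curl -s -o /dev/null -w '%{http_code}' -X POST \
    -H 'Content-Type: application/json' \
    --data-binary "@${body_file}" "$url")"
  if [ "$status" = "403" ]; then
    echo "blocked (${status})"
    return 1
  fi
  echo "allowed (${status})"
}

# Usage (hypothetical URL and file):
#   waf_replay https://app.example.com/api/upload upload-sample.json
```

Run it twice after applying the exclusion: once with the legitimate payload, which should now be allowed, and once with a request that ought to stay blocked, to confirm the exclusion did not open more than intended.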

End-to-end TLS adds a layer that hides correct routing

If you terminate TLS at the gateway and re-encrypt to the backend, a backend certificate problem can break an otherwise-perfect route. A self-signed backend cert, an expired cert, or a name that doesn’t match the backend address all surface as 502s that look exactly like a routing failure. When you’ve ruled out host headers and probes, this is the next layer.

Prompt: “End-to-end TLS is enabled on my Application Gateway. Routing and probes look correct but I still get 502. Walk me through what could be wrong with the backend TLS — cert trust, hostname mismatch, SNI — and the read-only checks to confirm which it is.”
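The read-only checks that prompt asks for can start with a look at the certificate the backend actually presents. This is a minimal sketch assuming OpenSSL 1.1.1 or later (for the -ext option); the function name backend_cert and the example address are hypothetical.

```shell
#!/usr/bin/env bash
# backend_cert ADDRESS SNI_NAME [PORT]
# Connects to the backend with the given SNI name and prints the subject,
# issuer, validity dates, and subject alternative names of the served cert.
backend_cert() {
  local address="$1" sni="$2" port="${3:-443}"
  # </dev/null closes stdin so s_client exits once the handshake completes.
  openssl s_client -connect "${address}:${port}" -servername "$sni" \
    </dev/null 2>/dev/null \
    | openssl x509 -noout -subject -issuer -dates -ext subjectAltName
}

# Usage (hypothetical address, SNI taken from the HTTP settings host name):
#   backend_cert 10.0.1.4 myapp.azurewebsites.net
```

Three things are worth reading off the output: whether notAfter is in the past, whether the subject or SAN matches the host name the gateway sends, and whether the issuer is a public CA or a private one whose root the gateway has to be told to trust.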

The loop

Front Door and Application Gateway reward structure and punish guessing. Classify the failure (routing, health, or WAF) before you touch a setting. Check host headers and probe results before you blame the application. Scope WAF exclusions to a rule and a path instead of disabling protection. Verify TLS layers last. AI accelerates every one of those steps by imposing the order and recalling the property paths, but you own each az network command and you verify probe health before any change reaches production traffic. Do that, and the 502 that used to mean an afternoon means ten minutes. There’s more L7 and networking material in the Azure category, and the full debug prompt is ready to copy from the prompts library.
