Hardening Cloud Armor With AI: WAF Rules and Rate Limits

I once watched someone enable the full OWASP rule set in Cloud Armor at maximum sensitivity, in enforce mode, on a Friday afternoon. Within minutes the support queue filled with users who couldn’t save their work, because the application’s rich-text editor sent HTML that the XSS rules flagged as an attack. The rules weren’t wrong, exactly — they were doing precisely what they were configured to do — but nobody had measured what “maximum sensitivity” would actually block before turning it on for real traffic. Cloud Armor is powerful and unforgiving in equal measure, and the difference between hardening your edge and taking down your own site comes down to two disciplines: getting rule order right, and never enforcing anything you haven’t first watched in preview mode. AI helps with both.

Rule order is the whole ballgame

Cloud Armor evaluates rules by priority, lowest number first, and the first match wins. That means a broad allow sitting above a specific deny silently disables the deny, and a misordered blanket deny can black-hole everything. Before I touch rule content, I have AI read the order.

gcloud compute security-policies describe edge-policy --format=yaml

Prompt: “Here’s a Cloud Armor security policy as YAML. Rules evaluate by priority, lowest first, first match wins. Walk the priorities and flag any rule that’s shadowed by an earlier one, any allow that sits above a deny it should have caught, and any deny broad enough to block legitimate traffic. Don’t propose new rules yet — just audit the ordering.”

This catches the structural mistakes that no clever rule content can compensate for. I’ve seen a deny(403) on an origin.region_code match get completely bypassed because an earlier allow rule matched first: the policy looked locked down and wasn’t.
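The shadowing check itself is mechanical enough to sketch. What follows is a deliberately simplified illustration, not a full policy analyzer: it assumes each rule has been reduced to a priority, an action, and a list of source CIDR ranges (the dictionary keys are my own naming, not the YAML field names), and it ignores CEL expressions such as region matches entirely. A later rule is flagged when every range it matches is already covered by an earlier rule.

```python
import ipaddress


def _networks(ranges):
    """Expand the "*" wildcard into explicit IPv4 and IPv6 networks."""
    nets = []
    for r in ranges:
        if r == "*":
            nets += [ipaddress.ip_network("0.0.0.0/0"), ipaddress.ip_network("::/0")]
        else:
            nets.append(ipaddress.ip_network(r))
    return nets


def _covered(inner, outer):
    """True if every network in `inner` sits inside some network in `outer`."""
    return all(
        any(n.version == o.version and n.subnet_of(o) for o in outer)
        for n in inner
    )


def find_shadowed(rules):
    """Return (priority, action, shadowed_by_priority, shadowed_by_action) tuples."""
    ordered = sorted(rules, key=lambda r: r["priority"])  # lowest number first
    findings = []
    for i, later in enumerate(ordered):
        for earlier in ordered[:i]:
            if _covered(_networks(later["src_ip_ranges"]),
                        _networks(earlier["src_ip_ranges"])):
                findings.append((later["priority"], later["action"],
                                 earlier["priority"], earlier["action"]))
                break  # first match wins, so the first covering rule is the culprit
    return findings


# Hypothetical policy: the deny at 2000 can never fire.
policy = [
    {"priority": 1000, "action": "allow", "src_ip_ranges": ["10.0.0.0/8"]},
    {"priority": 2000, "action": "deny(403)", "src_ip_ranges": ["10.1.0.0/16"]},
    {"priority": 2147483647, "action": "allow", "src_ip_ranges": ["*"]},
]
print(find_shadowed(policy))
```

A finding where the two actions differ is the dangerous case; a shadowed rule with the same action as the rule above it is merely redundant.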

WAF sensitivity: tune, don’t max

The preconfigured OWASP rules (sqli, xss, lfi, rce) are genuinely useful, but they’re tuned for a generic application and they generate false positives against real apps whose legitimate input happens to look like an attack. The right move is to enable them in preview mode, measure the false positives, then dial sensitivity to fit.

Prompt: “I want to add the preconfigured sqli-v33-stable and xss-v33-stable rules. Our app has a search box where users type free text and a rich-text editor that submits HTML. Recommend a starting sensitivity level, write the rules in PREVIEW mode (not enforce), and tell me which legitimate inputs are most likely to trigger false positives so I know what to watch for in the logs.”
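For reference, here is roughly what the resulting rules look like. This is a sketch under assumptions: the priorities (9000, 9001) and the starting sensitivity of 1 are placeholders I picked for illustration, not recommendations, and the commands are built as argument lists and printed rather than executed so they can be reviewed first. The important part is the --preview flag.

```python
import shlex


def waf_preview_rule(policy, priority, rule_set, sensitivity):
    """Build the gcloud argv for a preconfigured WAF rule created in preview mode."""
    expression = (
        f"evaluatePreconfiguredWaf('{rule_set}', {{'sensitivity': {sensitivity}}})"
    )
    return [
        "gcloud", "compute", "security-policies", "rules", "create", str(priority),
        f"--security-policy={policy}",
        f"--expression={expression}",
        "--action=deny-403",
        "--preview",  # log what would be blocked, block nothing
    ]


# Placeholder priorities and sensitivity; adjust to fit the existing rule order.
for priority, rule_set in [(9000, "sqli-v33-stable"), (9001, "xss-v33-stable")]:
    print(shlex.join(waf_preview_rule("edge-policy", priority, rule_set, sensitivity=1)))
```

Printing with shlex.join gives a copy-pasteable command with the expression safely quoted; passing the list to subprocess.run would execute it instead.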

Preview mode is the key word. A rule in preview logs what it would have blocked without actually blocking it, so I get a real measurement against production traffic before a single user is affected. After a day or two I read the matched requests:

resource.type="http_load_balancer"
jsonPayload.previewSecurityPolicy.name="edge-policy"
jsonPayload.previewSecurityPolicy.outcome="DENY"

Prompt: “Here are the requests my preview-mode WAF rules matched over 48 hours. For each, tell me whether it looks like a real attack or a false positive from legitimate app traffic, and if false positives dominate, suggest a rule exclusion or a lower sensitivity rather than disabling the rule entirely.”
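Before handing the matches to the model, it helps to shrink them into a tally, since false positives tend to cluster on one signature and one endpoint. A minimal sketch, assuming the log entries were exported as JSON and carry the previewSecurityPolicy block with its preconfiguredExprIds list; the sample entries, hostnames, and paths below are invented for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit


def tally_preview_matches(entries):
    """Count preview-mode denies per (signature ID, request path)."""
    counts = Counter()
    for entry in entries:
        preview = entry.get("jsonPayload", {}).get("previewSecurityPolicy")
        if not preview or preview.get("outcome") != "DENY":
            continue  # not a preview-mode match
        url = entry.get("httpRequest", {}).get("requestUrl", "")
        path = urlsplit(url).path
        for signature in preview.get("preconfiguredExprIds") or ["(no signature)"]:
            counts[(signature, path)] += 1
    return counts


# Invented sample data in the shape of exported load balancer log entries.
sample = [
    {"httpRequest": {"requestUrl": "https://example.com/api/docs/save"},
     "jsonPayload": {"previewSecurityPolicy": {
         "outcome": "DENY",
         "preconfiguredExprIds": ["owasp-crs-v030301-id941100-xss"]}}},
    {"httpRequest": {"requestUrl": "https://example.com/api/docs/save"},
     "jsonPayload": {"previewSecurityPolicy": {
         "outcome": "DENY",
         "preconfiguredExprIds": ["owasp-crs-v030301-id941100-xss"]}}},
    {"httpRequest": {"requestUrl": "https://example.com/search?q=1%27%20OR%201=1"},
     "jsonPayload": {"previewSecurityPolicy": {
         "outcome": "DENY",
         "preconfiguredExprIds": ["owasp-crs-v030301-id942100-sqli"]}}},
    {"httpRequest": {"requestUrl": "https://example.com/healthz"},
     "jsonPayload": {}},
]

for (signature, path), n in tally_preview_matches(sample).most_common():
    print(f"{n:5d}  {signature}  {path}")
```

A signature that fires almost exclusively on the editor's save endpoint is a strong hint of a false positive and a candidate for a scoped exclusion; one spread thinly across many paths looks more like scanning.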

Rate limiting that doesn’t punish the wrong people

Rate limiting is where the CDN-in-front detail bites. If a CDN or proxy sits between users and the load balancer, the source IP Cloud Armor sees is the proxy’s, so a naive rate limit either bans the proxy (everyone) or never triggers. The fix is keying on the forwarded client IP.

Prompt: “Design a Cloud Armor rate_based_ban for an API where a legitimate client makes at most a few requests per second. We have a CDN in front, so the real client IP is in the forwarded header. Key the limit on the client IP, not the proxy, set a threshold above legitimate usage with headroom for shared-NAT users, and give a ban duration. Show the gcloud command and the trusted-proxy / userIpRequestHeaders config.”

Getting the enforce-on-key right matters: key on the client IP with too low a threshold and you punish users behind a shared corporate NAT; key on the proxy and the limit is meaningless.
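As a sketch of the shape this takes, the snippet below builds two gcloud commands: one that tells the policy which header carries the real client IP, and one that creates the rate-based ban keyed on that IP. Every number is an assumption for illustration (600 requests per 60 seconds, a 300-second ban), as are the priority and the True-Client-IP header name, which depends on your CDN. I am also assuming the USER-IP enforce-on-key value and the --user-ip-request-headers flag are available in your gcloud version; XFF-IP is the alternative if the CDN only sets X-Forwarded-For. The rule is created in preview, consistent with everything else here.

```python
import shlex


def user_ip_header_cmd(policy, header):
    """Build the argv that sets which request header holds the real client IP."""
    return [
        "gcloud", "compute", "security-policies", "update", policy,
        f"--user-ip-request-headers={header}",
    ]


def rate_ban_cmd(policy, priority, count, interval_sec, ban_sec, key="USER-IP"):
    """Build the argv for a rate-based ban keyed on the client, not the proxy."""
    return [
        "gcloud", "compute", "security-policies", "rules", "create", str(priority),
        f"--security-policy={policy}",
        "--src-ip-ranges=*",
        "--action=rate-based-ban",
        f"--rate-limit-threshold-count={count}",
        f"--rate-limit-threshold-interval-sec={interval_sec}",
        f"--ban-duration-sec={ban_sec}",
        "--conform-action=allow",
        "--exceed-action=deny-429",
        f"--enforce-on-key={key}",
        "--preview",  # measure first, enforce later
    ]


# Assumed values: ~3 req/s legitimate peak is 180/min, so 600/min leaves NAT headroom.
print(shlex.join(user_ip_header_cmd("edge-policy", "True-Client-IP")))
print(shlex.join(rate_ban_cmd("edge-policy", 5000, count=600, interval_sec=60, ban_sec=300)))
```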

Roll it out like you mean to keep the site up

The rollout sequence is non-negotiable for me: preview, then enforce on a canary fraction, then full enforce, watching the matched-and-blocked counts at each stage. I have the model lay out the steps and the validation query so there’s no ambiguity about what “looks healthy” means before each promotion. Adaptive Protection for L7 DDoS goes through the same preview-first treatment, because its automatic rules can be aggressive against a sudden-but-legitimate traffic spike like a marketing launch.
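One way to pin down “looks healthy” is to write the promotion gate as code. This is a sketch, not a standard: the function name, the inputs, and the default tolerance of one confirmed false positive per 10,000 requests are all assumptions you would replace with your own numbers. The second helper builds the gcloud command that takes a rule out of preview.

```python
def ready_to_promote(total_requests, confirmed_false_positives, max_fp_rate=0.0001):
    """Gate a promotion on the measured false-positive rate from preview logs."""
    if total_requests <= 0:
        return False  # no traffic observed means no evidence either way
    return confirmed_false_positives / total_requests <= max_fp_rate


def promote_cmd(policy, priority):
    """Build the argv that switches an existing rule from preview to enforce."""
    return [
        "gcloud", "compute", "security-policies", "rules", "update", str(priority),
        f"--security-policy={policy}",
        "--no-preview",
    ]


# Hypothetical 48-hour window: 2.4M requests, 12 confirmed false positives.
if ready_to_promote(total_requests=2_400_000, confirmed_false_positives=12):
    print(" ".join(promote_cmd("edge-policy", 9000)))
else:
    print("hold: tune exclusions or sensitivity and keep the rule in preview")
```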

The honest division of labor

AI is strong on the structural and pattern parts of Cloud Armor: auditing rule order, classifying preview-mode matches as attack-versus-false-positive, and writing correct rate-limit and WAF rule syntax. Those are well-defined problems and the model handles them well. What it can’t see is your actual traffic, your users’ tolerance, or your business’s risk appetite — so it tells me which rules to consider and what they’d block, and I decide what to enforce and when.

The one rule I never bend: every new blocking rule goes to preview first, and I read the logs before enforcing. The reusable prompts are in my prompts library, and the GCP with AI series covers the network layer underneath the edge, including VPC firewall and routing debugging for when the problem is below the load balancer rather than at it. A good WAF is one you tuned with evidence, not one you turned up to eleven and hoped.