Debugging VPC Firewall and Routing on GCP With AI
When traffic vanishes inside a GCP VPC, the cause is buried in firewall priorities, route tables, and implied rules. Here's how I use AI to decode the path packets actually take.
- #gcp
- #ai
- #vpc
- #networking
- #firewall
A VM couldn’t reach an internal load balancer that was clearly healthy. Ping from a neighboring instance worked. The app’s own connectivity test failed every time. I’d been staring at it for forty minutes before I remembered that GCP VPC firewall rules have priorities, that there are two implied rules you never see in the console, and that a higher-priority deny somewhere can shadow the allow you’re looking at. GCP networking fails quietly: no error, just packets that don’t arrive. That silence is what makes it a strong AI debugging target — I can dump the full rule set and route table and ask the model to trace the path, instead of holding the whole evaluation order in my head.
Get the full picture before guessing
The mistake is debugging from the console one rule at a time. Pull everything at once so the model sees what you see:
# All firewall rules sorted by priority (lower number wins)
gcloud compute firewall-rules list \
--filter="network=prod-vpc" \
--sort-by=priority \
--format="table(name, priority, direction, sourceRanges.list(), allowed[].map().firewall_rule().list(), denied[].map().firewall_rule().list(), targetTags.list())"
# Routes defined in the VPC (this list is per network, not per instance)
gcloud compute routes list --filter="network=prod-vpc" \
--format="table(name, destRange, nextHopGateway, nextHopInstance, priority)"
Then paste both into the model with the actual question.
Prompt: “Below are all firewall rules for prod-vpc (sorted by priority) and the route table. A VM with network tag app-tier at 10.20.1.5 cannot open TCP 443 to an internal load balancer at 10.20.4.9. Trace egress from the VM and ingress to the LB. Tell me which specific rule allows or blocks this flow, accounting for the two implied GCP rules (default-deny ingress, default-allow egress) and priority ordering. If nothing explicitly allows it, say so.”
The model walks the evaluation: egress is allowed by the implied egress-allow, ingress to the VMs behind the LB needs an explicit rule, and the highest-priority matching ingress rule was a deny at priority 900 that I’d forgotten existed, sitting above my allow at priority 1000. Lower number wins. That’s the bug, and it’s exactly the kind of off-by-priority mistake humans miss when scanning rules top to bottom.
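The evaluation order the model walks through can be sketched in a few lines of Python. This is a simplified illustration, not GCP's implementation: it only looks at ingress, source ranges, and TCP ports, it ignores tags and protocols, and the rule names and CIDR ranges are hypothetical. It assumes the documented behavior that the lowest priority number wins and that a deny beats an allow at the same priority.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int         # 0-65535, lower number is evaluated first
    action: str           # "allow" or "deny"
    source_ranges: tuple  # CIDR strings
    ports: tuple          # TCP ports; empty means "all ports" in this sketch


# The implied ingress rule: lowest possible priority, denies everything.
IMPLIED_DENY_INGRESS = Rule("implied-deny-ingress", 65535, "deny", ("0.0.0.0/0",), ())


def matches(rule, src_ip, port):
    in_range = any(ip_address(src_ip) in ip_network(c) for c in rule.source_ranges)
    return in_range and (not rule.ports or port in rule.ports)


def deciding_rule(rules, src_ip, port):
    """Return the rule that decides the flow."""
    candidates = [r for r in rules if matches(r, src_ip, port)]
    candidates.append(IMPLIED_DENY_INGRESS)
    # Sort key: priority first, then deny before allow on a tie.
    return min(candidates, key=lambda r: (r.priority, r.action != "deny"))


# Hypothetical rules mirroring the incident: an allow at 1000 shadowed by a deny at 900.
rules = [
    Rule("allow-app-https", 1000, "allow", ("10.20.1.0/24",), (443,)),
    Rule("deny-legacy", 900, "deny", ("10.20.0.0/16",), (443,)),
]
winner = deciding_rule(rules, "10.20.1.5", 443)
print(winner.name, winner.action)
```

Scanning the list top to bottom, the allow looks like it should apply; sorting by the same key the platform uses makes the shadowing deny obvious.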
Let Connectivity Tests be the ground truth
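AI reasons about the config you pasted, but GCP’s Connectivity Tests reason about what the platform will actually do with the packet, including rules AI can’t see because they were never in your paste (hierarchical firewall policies, for example, don’t show up in gcloud compute firewall-rules list). I use both: AI to form a hypothesis fast, the test to confirm it against reality.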
gcloud network-management connectivity-tests create app-to-ilb \
--source-instance=projects/my-proj/zones/us-central1-a/instances/app-vm-1 \
--destination-ip-address=10.20.4.9 \
--destination-port=443 \
--protocol=TCP
Then pull the result as JSON (gcloud network-management connectivity-tests describe app-to-ilb --format=json) and feed the trace back to the model:
Prompt: “This is the JSON result of a GCP Connectivity Test. It dropped at a step. Read the drops and traces arrays, tell me the exact cause in one sentence, and the precise gcloud command to fix it without widening the rule more than necessary.”
I insist on “without widening more than necessary” because the lazy fix is a 0.0.0.0/0 allow, which trades a connectivity bug for a security hole. The model will happily suggest the broad fix unless you tell it not to.
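Before pasting a long trace, it can help to pull out only the steps that caused the drop. The sketch below assumes the result JSON nests traces under reachabilityDetails.traces, with each step carrying state, description, and a causesDrop flag; check those field names against your own output. The sample document is hand-written for illustration, not real test output.

```python
import json


def find_drops(test_result):
    """Return (trace_index, state, description) for every step flagged as causing a drop."""
    drops = []
    traces = test_result.get("reachabilityDetails", {}).get("traces", [])
    for i, trace in enumerate(traces):
        for step in trace.get("steps", []):
            if step.get("causesDrop"):
                drops.append((i, step.get("state"), step.get("description")))
    return drops


# Hand-written sample, trimmed to the fields used above (assumed shape).
sample = json.loads("""
{
  "reachabilityDetails": {
    "result": "UNREACHABLE",
    "traces": [
      {"steps": [
        {"state": "START_FROM_INSTANCE", "description": "Packet leaves the source VM."},
        {"state": "APPLY_INGRESS_FIREWALL_RULE",
         "description": "Ingress firewall rule denies the packet.",
         "causesDrop": true}
      ]}
    ]
  }
}
""")

for trace_index, state, description in find_drops(sample):
    print(trace_index, state, description)
```

The model still gets the full JSON when it needs it; the filtered view is just a quick way to see whether the hypothesis and the test agree.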
Have AI write the corrective rule with the right scope
Once the cause is clear, the model drafts a tightly-scoped rule that I review before applying. Network tags beat IP ranges for tier-to-tier rules because they survive re-IPing:
gcloud compute firewall-rules create allow-app-to-ilb \
--network=prod-vpc \
--direction=INGRESS \
--action=ALLOW \
--rules=tcp:443 \
--source-tags=app-tier \
--target-tags=ilb-backend \
--priority=850 \
--enable-logging
Note the priority: 850. The existing deny sits at 900 and lower numbers win, so the allow has to be numerically below 900 to be evaluated first; at 950 it would still be shadowed by the deny. This is the detail to check by hand: I ask the model to state which existing rule its new rule must out-prioritize, and I verify that number myself. I never trust an AI-chosen priority without seeing the neighbor it has to beat.
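The by-hand check can also be written down as a small pre-apply review. This is a minimal sketch with a hypothetical helper name: it takes the proposed priority, the proposed source ranges, and the priority of the existing rule that must be beaten, and it assumes a deny wins a tie, so the new allow needs a strictly lower number.

```python
def review_proposed_rule(priority, source_ranges, must_beat_priority):
    """Return a list of problems with a proposed allow rule; an empty list means it passed."""
    problems = []
    # Lower number wins, and a deny wins a tie, so strictly lower is required.
    if priority >= must_beat_priority:
        problems.append(
            f"priority {priority} does not beat the existing rule at {must_beat_priority}"
        )
    # Flag the lazy fix: a rule open to every address.
    if any(r in ("0.0.0.0/0", "::/0") for r in source_ranges):
        problems.append("source range is open to every address")
    return problems


# A tag-scoped rule has no source ranges, so the list is empty here.
print(review_proposed_rule(950, [], 900))
print(review_proposed_rule(850, [], 900))
```

With a deny at 900, a candidate at 950 is rejected and one at 850 passes, which is the comparison worth seeing before anything ships.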
Turn on logging and let AI read the firehose
Firewall rule logging produces a lot of noise. AI is good at turning that into a summary of what’s actually being denied:
Prompt: “Here are 200 GCP firewall log entries (JSON). Group them by rule_details.reference and disposition. Show me the top denied flows by count, with source range and destination port, so I can tell which denies are protecting me and which are blocking legitimate traffic that needs a rule.”
That report tells me whether a deny is doing its job or quietly breaking something — the question you actually care about, surfaced from logs you’d never read line by line.
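The same grouping is easy to cross-check locally. A minimal sketch, assuming each exported entry keeps the firewall record under jsonPayload with disposition, rule_details.reference, and a connection object holding src_ip and dest_port; it groups by source IP rather than source range, and the sample entries and rule names are made up.

```python
from collections import Counter


def top_denied(entries, limit=10):
    """Count denied flows by (rule reference, source IP, destination port)."""
    counts = Counter()
    for entry in entries:
        payload = entry.get("jsonPayload", {})
        if payload.get("disposition") != "DENIED":
            continue
        conn = payload.get("connection", {})
        key = (
            payload.get("rule_details", {}).get("reference"),
            conn.get("src_ip"),
            conn.get("dest_port"),
        )
        counts[key] += 1
    return counts.most_common(limit)


def entry(disposition, reference, src_ip, dest_port):
    """Build a hand-written sample entry with only the fields used above."""
    return {
        "jsonPayload": {
            "disposition": disposition,
            "rule_details": {"reference": reference},
            "connection": {"src_ip": src_ip, "dest_port": dest_port},
        }
    }


# Hypothetical entries for illustration only.
sample_entries = [
    entry("DENIED", "network:prod-vpc/firewall:deny-legacy", "10.20.1.5", 443),
    entry("DENIED", "network:prod-vpc/firewall:deny-legacy", "10.20.1.5", 443),
    entry("ALLOWED", "network:prod-vpc/firewall:allow-ssh", "10.20.9.2", 22),
    entry("DENIED", "network:prod-vpc/firewall:deny-legacy", "10.20.7.8", 8080),
]

for (reference, src_ip, dest_port), count in top_denied(sample_entries):
    print(count, reference, src_ip, dest_port)
```

If the model's summary and this count disagree, trust neither until you have looked at the raw entries.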
The division of labor
The model is excellent at the parts that are pure mechanical reasoning over config: priority ordering, implied rules, route precedence, log aggregation. It is not a substitute for the data-plane truth that Connectivity Tests give you, and it has no idea what your security posture should be. So I let it form hypotheses and draft rules, I confirm with the platform’s own tooling, and I personally own every priority number and source range that ships.
For the prompts I reuse across networking incidents, see my prompts library, and the rest of the GCP with AI series for adjacent problems. A VPC that fails silently doesn’t have to stay a mystery — it just needs the full rule set in front of a reader patient enough to trace it, and that reader can be AI as long as you stay the one who decides.