By James Joyner IV · 10 min read

DNS Egress Filtering: Closing the Exfiltration Channel Everyone Forgets

Lock down outbound name resolution: force DNS through a resolver, allowlist egress domains, log queries, and detect DNS tunneling and C2 before data leaves.

  • #security
  • #hardening
  • #dns
  • #networking
  • #detection

The first time I watched a red team walk a few gigabytes of “customer data” out of a locked-down VPC, they did it over port 53. No proxy bypass, no clever TLS trick — just a chatty stream of base32-encoded subdomains resolving against an attacker-controlled authoritative server. Every firewall rule we had was watching ports 80 and 443. DNS was wide open because DNS is always wide open. It’s the protocol nobody filters because everything breaks when you get it wrong.

That’s exactly why attackers love it. DNS is the blind spot in most egress strategies: it’s allowed outbound by default, it’s rarely logged, and it tunnels arbitrary data inside lookups that look like ordinary name resolution. After years of cleaning up after this, here’s how I close the channel without breaking the network — and how I use AI to review the rules before they touch production.

Why DNS Is the Channel You Forgot to Watch

Egress filtering usually means “block outbound except 80/443 through the proxy.” Fine — but UDP/53 and TCP/53 almost always stay open, often to any destination, so any internal host can talk to any DNS server on the internet. An attacker who lands on a box doesn’t need to beat your proxy. They run their own authoritative nameserver, encode the payload into subdomain labels, and let your own resolver forward the data to them, one query at a time.

The fix has two halves: control where DNS can go, and see what DNS is actually doing. Most shops do neither. Let’s do both.

Block Direct Outbound DNS, Force Everything Through One Resolver

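The single highest-value change is to stop letting workloads talk DNS to arbitrary servers. Every query should be forced through a resolver you run and log. On a Linux host, nftables makes this clean. On a NAT gateway or router, the same rules belong in a chain that hooks forward, because the output hook shown here only sees traffic the host itself generates. Allow 53 only to your resolver, drop everything else: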

table inet egress {
  chain output {
    type filter hook output priority 0; policy accept;

    # Allow DNS only to the sanctioned resolver
    ip daddr 10.0.0.53 udp dport 53 accept
    ip daddr 10.0.0.53 tcp dport 53 accept

    # Everything else on 53 is exfil-shaped: log and drop
    udp dport 53 log prefix "DNS-EGRESS-DROP " drop
    tcp dport 53 log prefix "DNS-EGRESS-DROP " drop
    # Port 853 carries DNS-over-TLS on TCP and DNS-over-QUIC on UDP: drop both
    udp dport 853 log prefix "DOQ-DROP " drop
    tcp dport 853 log prefix "DOT-DROP " drop
  }
}

Don’t forget port 853 (DNS-over-TLS) and outbound DoH (DNS-over-HTTPS rides 443, so you handle it at the proxy by blocking known DoH endpoints). The goal is simple: there is exactly one path for name resolution, and it runs through software you control.
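One way to confirm the ruleset behaves as intended is to probe from a filtered host: the sanctioned resolver should answer, and any other DNS server should not. The following is a minimal sketch using only the Python standard library. The function names `build_query` and `dns_reachable` are made up for illustration, and it only exercises UDP, so a TCP check would need a separate probe.

```python
import socket
import struct


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (recursion desired), 1 question, 0 answer/authority/additional
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN)
    return header + qname + struct.pack("!HH", 1, 1)


def dns_reachable(server: str, timeout: float = 2.0) -> bool:
    """Return True if `server` answers a UDP DNS query on port 53."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query("example.com"), (server, 53))
            sock.recvfrom(512)
            return True
        except OSError:
            # Timeouts, EPERM from a local drop rule, and ICMP errors all mean "no answer"
            return False
```

On a host behind the ruleset, `dns_reachable("10.0.0.53")` should come back `True` and a probe of any public resolver should come back `False`. If the second probe succeeds, the drop rules are not in the path the traffic actually takes.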

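Pro Tip: Block DoH explicitly. A workload that sends its lookups to cloudflare-dns.com or dns.google over 443 has just routed around every DNS control you built. Sinkhole those hostnames on your resolver so the DoH bootstrap fails, and block the providers' well-known resolver IPs at the proxy or firewall too, because a client that hardcodes the address never performs the bootstrap lookup.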

Allowlist Egress Domains, Not Just IPs

Once every query flows through your resolver, you can decide which names are even allowed to resolve outbound. For a server fleet that should only ever reach a known set of upstreams (package mirrors, your cloud APIs, your registry), an allowlist on the resolver is brutally effective. In CoreDNS, a Corefile that only answers for sanctioned zones and refuses the rest:

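# Sanctioned zones: forward to your upstream recursive resolver.
# 192.0.2.53 is a placeholder address; substitute your own upstream.
apt.example.com:53 s3.amazonaws.com:53 ghcr.io:53 {
    forward . 192.0.2.53 {
        policy sequential
    }
    # Log every query in dnstap for analysis
    dnstap unix:///var/run/dnstap.sock full
    log
    errors
}

# Everything not explicitly allowed gets refused (and still logged).
# This lives in its own server block because template runs before forward
# in CoreDNS's plugin chain and would otherwise refuse the allowed zones too.
.:53 {
    dnstap unix:///var/run/dnstap.sock full
    log
    errors
    template ANY ANY {
        rcode REFUSED
    }
}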

A refused lookup is a loud signal. When a host that should only talk to your package mirror suddenly asks for a8f3c.exfil.attacker.tld, the resolver says no and you get a log line. That single REFUSED is worth more than a dozen dashboards.
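The matching rule behind a zone allowlist is worth spelling out, because naive suffix checks get it wrong. Here is a small sketch of the intended semantics, useful for testing a candidate allowlist against sample queries before it goes live. The zone names come from the Corefile above, while `normalize` and `is_allowed` are illustrative names, not part of CoreDNS.

```python
ALLOWED_ZONES = {"apt.example.com", "s3.amazonaws.com", "ghcr.io"}


def normalize(name: str) -> str:
    """Lowercase a DNS name and strip the trailing root dot."""
    return name.strip().lower().rstrip(".")


def is_allowed(qname: str, zones=ALLOWED_ZONES) -> bool:
    """True if qname equals an allowed zone or is a subdomain of one."""
    name = normalize(qname)
    # Match on label boundaries: "evilghcr.io" must not pass as "ghcr.io"
    return any(name == zone or name.endswith("." + zone) for zone in zones)
```

The leading dot in the suffix check is the design choice that matters: without it, an attacker who registers a lookalike such as `evilghcr.io` would pass the allowlist.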

Detecting DNS Tunneling in the Query Logs

Allowlisting catches the obvious cases. Tunneling detection catches the clever ones. DNS exfiltration has a fingerprint, and once you’re logging queries (via dnstap or query logs) the fingerprint is easy to grep for:

  • Long labels / high entropy: encoded payloads produce subdomains far longer than human-chosen names. Anything over ~40 characters in a single label deserves a look (the protocol caps a label at 63).
  • High query volume to one parent domain: tunneling means hundreds or thousands of unique subdomains under one registered domain in a short window.
  • Unusual record types: TXT, NULL, and CNAME carry more bytes per response than A, so they’re favored for the download direction of C2.
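The entropy half of the first fingerprint is easy to score in a few lines. The sketch below computes Shannon entropy per character and combines it with the length check. The 3.5-bit threshold and the 16-character minimum are assumptions to tune against your own traffic, not established cutoffs, and `suspicious_label` is a hypothetical helper name.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of a string, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def suspicious_label(qname: str, max_len: int = 40, min_entropy: float = 3.5) -> bool:
    """Flag a query whose first label is very long, or fairly long and random-looking."""
    label = qname.rstrip(".").split(".")[0].lower()
    if len(label) > max_len:
        return True
    # Entropy is unreliable on short strings, so only score labels of 16+ characters
    return len(label) >= 16 and shannon_entropy(label) >= min_entropy
```

Some legitimate hostnames (generated cloud resource names, for example) will also score high, so treat a hit as a reason to look at the parent domain rather than as a verdict.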

A quick first pass over a query log to surface long-label suspects:

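# Both commands assume the queried name is the last field on each line.
# Adjust $NF for your format (CoreDNS's default log line puts it in $7).

# Flag queries with an unusually long first label
awk '{ split($NF, l, "."); if (length(l[1]) > 40) print }' /var/log/coredns/query.log

# Count unique subdomains per parent domain; tunneling spikes here.
# Strips the trailing root dot and dedupes names before counting.
# Taking the last two labels is naive for suffixes like co.uk.
awk '{ q=$NF; sub(/\.$/, "", q); print q }' /var/log/coredns/query.log | sort -u \
  | awk '{ n=split($0,p,"."); if (n>=2) print p[n-1]"."p[n] }' \
  | sort | uniq -c | sort -rn | head -20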

If one parent domain shows thousands of distinct children, that’s not a CDN — that’s a tunnel. From there, anomaly detection is mostly about baselining: learn each host’s normal query rate and unique-domain count, then alert on deviations. Feed the log slices into a code-review workflow or your monitoring pipeline rather than eyeballing them by hand.
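Baselining can start very simply. The sketch below counts unique queried names per host for one time window, then compares a host's current count with its own history using a mean-plus-k-standard-deviations threshold. The input shape (pairs of host and query name), the function names, and the defaults `k=3.0` and `floor=50` are all assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean, stdev


def unique_domains_per_host(events):
    """Count distinct queried names per host from (host, qname) pairs."""
    seen = defaultdict(set)
    for host, qname in events:
        seen[host].add(qname.lower().rstrip("."))
    return {host: len(names) for host, names in seen.items()}


def is_anomalous(history: list[int], current: int, k: float = 3.0, floor: int = 50) -> bool:
    """Compare the current window's unique-name count with the host's history.

    history: unique-name counts from earlier windows for one host.
    floor:   absolute minimum before alerting, so quiet hosts don't alert on noise.
    """
    if len(history) < 2:
        return False  # not enough data to form a baseline yet
    threshold = mean(history) + k * stdev(history)
    return current > max(threshold, floor)
```

A host that normally resolves around twenty names per window and suddenly resolves two thousand trips the check. The floor keeps a jump from 20 to 35 from generating an alert.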

Sinkhole Known-Bad and Newly-Registered Domains

You don’t have to detect everything from scratch. Threat intel feeds publish known C2 and malware domains; point them all at a sinkhole (0.0.0.0 or a logging honeypot) so any infected host announces itself the moment it tries to call home. Pi-hole, dnsmasq, and Unbound all do this with a blocklist:

# Unbound: sinkhole and log
server:
    local-zone: "evil-c2.example." redirect
    local-data: "evil-c2.example. A 192.0.2.1"   # honeypot, not the real host
    log-queries: yes
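A real feed has thousands of entries, so the stanza is usually generated rather than typed. Here is a rough sketch that turns a list of domains into the same local-zone and local-data lines shown above. The function name is hypothetical, the sinkhole address is the documentation address from the example, and the character filter is a deliberately strict assumption so that a malformed feed entry cannot break out of the quoted value.

```python
SINKHOLE_IP = "192.0.2.1"  # honeypot address from the Unbound example


def unbound_sinkhole_lines(domains, sinkhole_ip=SINKHOLE_IP):
    """Yield local-zone/local-data lines for each unique, plausible domain."""
    seen = set()
    for raw in domains:
        name = raw.strip().lower().rstrip(".")
        # Skip blanks, feed comments, and duplicates
        if not name or name.startswith("#") or name in seen:
            continue
        # Skip anything containing characters outside ASCII letters, digits, "-", "_", "."
        if not all(ch.isascii() and (ch.isalnum() or ch in "-._") for ch in name):
            continue
        seen.add(name)
        yield f'    local-zone: "{name}." redirect'
        yield f'    local-data: "{name}. A {sinkhole_ip}"'
```

Writing the output to an included file and checking the result with `unbound-checkconf` before reloading keeps a bad feed from taking the resolver down.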

Equally important: block newly-registered domains (NRDs). A huge share of phishing and C2 infrastructure is less than 30 days old. Feeds of NRDs let you refuse resolution for any domain registered in the last month — legitimate business almost never depends on a domain that fresh, and the few false positives are cheap to allowlist by hand.

Pro Tip: Sinkhole to a host that logs the full request, not to 0.0.0.0. A black hole tells you a query happened; a honeypot tells you which internal IP, how often, and with what payload — the difference between “something’s wrong” and a contained incident.

DNS Egress in Kubernetes

Containers make the blind spot worse: pods get DNS for free via CoreDNS, and a default cluster lets a compromised pod resolve anything. A NetworkPolicy should allow egress to kube-dns and nothing else on 53, then restrict the rest of egress to what the workload actually needs:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: dns-and-api-egress-only
  namespace: payments
spec:
  podSelector: {}
  policyTypes: ["Egress"]
  egress:
    # DNS only to the cluster resolver
    - to:
        - namespaceSelector:
            matchLabels: { kubernetes.io/metadata.name: kube-system }
          podSelector:
            matchLabels: { k8s-app: kube-dns }
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
    # Then the one upstream this namespace legitimately needs
    - to:
        - ipBlock: { cidr: 10.20.0.0/16 }
      ports:
        - protocol: TCP
          port: 443

With CoreDNS as the only resolver pods can reach, you apply the same allowlisting, dnstap logging, and tunneling detection from above, now covering the whole cluster. Pair it with a default-deny egress policy so anything not on the list simply can’t leave. NetworkPolicy is only enforced when the cluster’s CNI plugin implements it, so confirm that before you rely on any of this.
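A policy like this is easy to get subtly wrong, so a mechanical check helps before review. The sketch below takes a NetworkPolicy already parsed into a Python dict (for instance by a YAML loader, which is not shown) and reports any egress rule that permits port 53 to something other than kube-dns in kube-system. The function names are hypothetical and the check is conservative: named ports are treated as possibly being DNS.

```python
def _allows_port_53(rule: dict) -> bool:
    """True if the rule's port list could include port 53."""
    ports = rule.get("ports")
    if not ports:
        return True  # no ports listed means every port is allowed
    for p in ports:
        port = p.get("port")
        if port is None or not isinstance(port, int):
            return True  # all ports for the protocol, or a named port we can't resolve
        if port <= 53 <= p.get("endPort", port):
            return True
    return False


def _is_kube_dns_peer(peer: dict) -> bool:
    """True if the peer selects only kube-dns pods in the kube-system namespace."""
    ns = (peer.get("namespaceSelector") or {}).get("matchLabels") or {}
    pod = (peer.get("podSelector") or {}).get("matchLabels") or {}
    return (
        ns.get("kubernetes.io/metadata.name") == "kube-system"
        and pod.get("k8s-app") == "kube-dns"
    )


def dns_egress_findings(policy: dict) -> list[str]:
    """List egress rules that allow port 53 to anything but kube-dns."""
    findings = []
    for i, rule in enumerate(policy.get("spec", {}).get("egress", [])):
        if not _allows_port_53(rule):
            continue
        peers = rule.get("to")
        if not peers:
            findings.append(f"egress[{i}]: port 53 allowed to any destination")
            continue
        for peer in peers:
            if not _is_kube_dns_peer(peer):
                findings.append(f"egress[{i}]: port 53 allowed to non-kube-dns peer")
    return findings
```

An empty list for the policy above is the expected result. Any finding is a prompt for a human to look at the rule, not proof of a hole.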

Let AI Audit the Rules — Then Verify Before You Apply

Here’s where AI earns its keep. Generating and reviewing DNS-filtering config is tedious and error-prone — one wrong nft priority and you’ve blackholed the cluster. I treat the model like a fast junior engineer: it drafts the Corefile, sanity-checks the NetworkPolicy selectors, and spots the DoH endpoint I forgot to sinkhole. But it does not get the keys. A human verifies every rule against the real topology before it’s applied, because the model can’t see your network and will confidently allow a CIDR it shouldn’t.

Two hard rules: keep it strictly defensive — these techniques are for protecting your own egress, full stop — and never hand the model real secrets, resolver credentials, or live IP inventory. Paste a sanitized config and ask for review. Tools like Claude or Cursor are great for this kind of audit, and a reusable prompt pack keeps the review checklist consistent across the team. For more in this vein, the security hardening category collects the rest of the playbook.

Closing the Channel

DNS exfiltration works because almost nobody filters DNS. Flip that: one sanctioned resolver, an egress allowlist, full query logging, sinkholed bad and newly-registered domains, and tunneling detection on the logs. Do that on the hosts and inside Kubernetes, let AI accelerate the review while a human owns the apply, and the channel everyone forgets stops being a free ride out of your network.
