AI for DevOps Security & Hardening By James Joyner IV · 8 min read

WAF and Rate Limiting: Hardening the Edge Without Breaking Real Users

Your edge takes the first hit from every bot, scraper, and exploit scanner online. Here's how to layer a WAF and rate limiting that stops abuse without false-positiving your customers.

  • #security
  • #hardening
  • #waf
  • #rate-limiting
  • #edge
  • #nginx

The moment you put a service on the public internet, it starts getting probed. Not eventually — immediately. Within minutes there are bots hammering /wp-login.php on a Go service that has never seen PHP, scanners trying ../../../etc/passwd in every parameter, and credential-stuffing runs working through leaked password lists. Your edge is the front door, and it’s perpetually under low-grade siege.

A web application firewall and rate limiting are the two controls that absorb most of that noise. The trick is tuning them so they stop the abuse without throwing your actual customers a 403.

What a WAF is good at (and what it isn’t)

A WAF inspects HTTP requests and blocks ones matching known-bad patterns: SQL injection signatures, path traversal, command injection, common scanner fingerprints. It is excellent at killing the broad, automated, signature-based junk that makes up the vast majority of hostile traffic.

What it is not is a substitute for fixing the application. A WAF is a speed bump and an early-warning system, not a patch. Treat it as defense-in-depth: it buys you time and cuts noise, but a determined attacker who finds a real flaw will eventually craft a payload your rules don’t match. So you run the WAF and you fix the code.

The most common open ruleset is the OWASP Core Rule Set (CRS). It runs on ModSecurity and on Coraza, a newer engine that accepts the same rule language. With Nginx and ModSecurity:

# nginx.conf (server context)
modsecurity on;
modsecurity_rules_file /etc/nginx/modsec/main.conf;

location /api/ {
    proxy_pass http://backend;
}

# /etc/nginx/modsec/main.conf: start in DetectionOnly, then promote
SecRuleEngine DetectionOnly
Include /etc/nginx/modsec/coreruleset/crs-setup.conf
Include /etc/nginx/modsec/coreruleset/rules/*.conf

Note DetectionOnly. Never turn a fresh WAF straight to blocking. Run it in log-only mode for a week or two, watch what it would have blocked, and you’ll find it would have nuked your legitimate file-upload endpoint and your search box. Tune out those false positives, then flip to SecRuleEngine On.
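
Tuning usually means scoping an exclusion to the one endpoint that tripped a rule, not switching the rule off everywhere. A minimal sketch, assuming the ModSecurity-nginx connector’s modsecurity_rules inline directive is available in your build. The /api/upload and /search paths are hypothetical, and the rule IDs (920420 and 942100, both real CRS rules) are used purely as illustrations of whatever your own audit log reports:

```nginx
# Hypothetical endpoints and example rule IDs: substitute what your audit log shows.
location /api/upload {
    # Remove one rule for this location only; every other CRS rule stays active.
    modsecurity_rules 'SecRuleRemoveById 920420';
    proxy_pass http://backend;
}

location /search {
    # Same idea for the search box: drop only the rule that misfires here.
    modsecurity_rules 'SecRuleRemoveById 942100';
    proxy_pass http://backend;
}
```

Keeping each exclusion next to the endpoint it protects makes it easier to review later and to delete once the application no longer needs it.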

Rate limiting: the control that scales with abuse

Where a WAF matches content, rate limiting matches behavior. It doesn’t care what the request says, only how fast it’s coming. That makes it the right tool against credential stuffing, scraping, and application-layer floods.

Nginx’s limit_req is the workhorse:

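# Define zones. Each tracked IP costs 128 bytes on 64-bit platforms,
# so a 10m zone holds roughly 80k addresses (about 160k on 32-bit).
limit_req_zone $binary_remote_addr zone=api:10m rate=20r/s;
limit_req_zone $binary_remote_addr zone=login:10m rate=5r/m;

server {
    location /api/ {
        # nodelay: serve up to 40 burst requests at once, reject the excess
        limit_req zone=api burst=40 nodelay;
        limit_req_status 429;
    }

    location /login {
        # no nodelay: up to 3 extra requests queue and drain at 5r/m
        # (one every 12 seconds) before anything is rejected
        limit_req zone=login burst=3;
        limit_req_status 429;
    }
}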

A few things matter here:

  • Different limits for different endpoints. Login deserves 5 requests per minute; a read API can tolerate 20 per second. One global limit is always wrong for something.
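  • burst absorbs legitimate spikes. Real users click fast sometimes. The burst allowance covers that without letting a sustained flood through; without nodelay, the extra requests are queued and released at the zone’s rate.
  • Return 429, not 403. A rate-limited client should know to back off and retry, and well-behaved clients honor Retry-After when you send one. Note that limit_req does not add that header on its own.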
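
One way to attach Retry-After to rate-limit rejections is a map on $status, sketched here. The 60-second value and the $retry_after variable name are assumptions, and the login zone is the one defined above:

```nginx
# http context. $retry_after is empty unless the response status is 429.
map $status $retry_after {
    default "";
    429     60;     # assumed back-off in seconds; pick a value that fits your rate window
}

server {
    location /login {
        limit_req zone=login burst=3;
        limit_req_status 429;
        # add_header skips a header whose value is empty, and "always" is
        # required because 429 is not in add_header's default status list.
        add_header Retry-After $retry_after always;
        proxy_pass http://backend;
    }
}
```

Be aware that an add_header at location level replaces any add_header directives inherited from the server or http level, so repeat those here if you rely on them.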

Behind a load balancer, $binary_remote_addr may be the LB’s IP, not the user’s. Key off the real client address from X-Forwarded-For — but only trust that header from your own LB, or attackers will spoof it to dodge limits.
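
With the realip module, that trust boundary can be expressed directly. A minimal sketch, assuming Nginx was built with ngx_http_realip_module (it is not compiled in by default, though many distribution packages include it) and using 10.0.0.0/8 as a stand-in for your load balancer’s address range:

```nginx
# http or server context. 10.0.0.0/8 is a placeholder for your LB's range.
set_real_ip_from  10.0.0.0/8;        # only these peers may supply the client address
real_ip_header    X-Forwarded-For;
real_ip_recursive on;                # skip trusted hops, use the last untrusted address
```

Once this is in place, $remote_addr and $binary_remote_addr reflect the client address, so the limit_req_zone definitions above need no changes. Requests arriving from any other peer keep their connection address, which is what stops a spoofed header from dodging the limit.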

Bot management and reputation

Beyond raw rate, you want to distinguish who is knocking. Cheap, effective layers:

  • Geo and ASN awareness. If you don’t serve a region, you can rate-limit or challenge its traffic far more aggressively. The same goes for hosting-provider ASNs (where bots live) compared with residential ranges.
  • Challenge, don’t just block. For suspicious-but-uncertain traffic, a JS challenge or proof-of-work interstitial filters bots while letting humans through — far fewer false positives than an outright block.
  • Reputation feeds. Maintain a denylist of IPs that have tripped your WAF repeatedly. A short-TTL block on a repeat offender costs nothing and stops a lot of grinding.
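
The range-based tiers and the denylist can be sketched with Nginx’s geo and map modules. Everything specific here is an assumption: the addresses are documentation ranges (RFC 5737), the tier names and the api_strict zone are hypothetical, and the api zone definition replaces the earlier one instead of adding to it:

```nginx
# http context. All addresses below are placeholders.
geo $client_tier {
    default         normal;
    203.0.113.0/24  hosting;    # e.g. a hosting-provider range you see bots from
    198.51.100.7    denied;     # repeat WAF offender
}

# A request whose key is empty is not counted against that zone.
map $client_tier $hosting_key {
    hosting  $binary_remote_addr;
    default  "";
}

limit_req_zone $binary_remote_addr zone=api:10m        rate=20r/s;
limit_req_zone $hosting_key        zone=api_strict:10m rate=2r/s;

server {
    location /api/ {
        if ($client_tier = denied) { return 403; }
        limit_req zone=api burst=40 nodelay;
        limit_req zone=api_strict burst=5 nodelay;   # only applies to "hosting" clients
        limit_req_status 429;
        proxy_pass http://backend;
    }
}
```

A geo block is static, so the short TTL has to come from whatever process regenerates the list and reloads Nginx. Keeping the entries in a separate included file makes that easier to automate.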

Don’t forget the slow stuff

Rate limiting catches volume. Two abuse patterns sneak under it:

  • Slowloris-style attacks hold connections open with trickle-fed headers, exhausting your connection pool with very little traffic. Set aggressive timeouts: client_header_timeout 10s; client_body_timeout 10s; keepalive_timeout 15s;
  • Large-payload abuse. Cap request body size (client_max_body_size 1m; unless an endpoint genuinely needs more) so an attacker can’t tie up workers with giant uploads.
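
Put together, those directives might sit in the configuration like this. It is a fragment, not a complete file: the /api/upload path and its 25m cap are hypothetical, and an upstream named backend is assumed to be defined elsewhere:

```nginx
http {
    # Defaults inherited by every server and location.
    client_header_timeout 10s;
    client_body_timeout   10s;
    keepalive_timeout     15s;
    client_max_body_size  1m;

    server {
        location /api/upload {
            # Raise the cap only where an endpoint genuinely needs it.
            client_max_body_size 25m;
            proxy_pass http://backend;
        }
    }
}
```

Setting the strict value globally and loosening it per location means a new endpoint is protected by default.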

Observe before you tune, then keep watching

The recurring theme: every edge control has a false-positive cost, and the only way to set it correctly is data. Before tightening anything, log it. After tightening, alert on the 429 and 403 rates broken down by endpoint — a sudden spike means either an attack (good, it’s working) or a deploy that changed client behavior (bad, you’re blocking customers).

log_format edge '$remote_addr $status $request_time "$request" '
                'limit=$limit_req_status';

Wire those status codes into your dashboards next to your normal traffic. The edge is where security and reliability blur together, and the same metrics serve both. If you want the wider context on layering these controls, the security hardening guides cover how the edge fits with segmentation and identity, and reviewing rule changes through automated code review keeps an over-broad block from shipping unnoticed.

Start with the WAF in detection mode and per-endpoint rate limits in log-only, watch for a week, tune out the false positives, then enforce. The edge will get quieter, your logs will get more useful, and the real attacks will stand out from the background noise.
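
For the rate-limit half of that rollout, Nginx has a dry-run switch. A minimal sketch, assuming Nginx 1.17.6 or later (for both limit_req_dry_run and the $limit_req_status variable), the edge log format defined above, and an example log path:

```nginx
limit_req_zone $binary_remote_addr zone=api:10m rate=20r/s;

server {
    access_log /var/log/nginx/edge.log edge;   # example path, "edge" log_format

    location /api/ {
        limit_req zone=api burst=40 nodelay;
        limit_req_dry_run on;      # account for requests, but reject nothing
        limit_req_status 429;
        proxy_pass http://backend;
    }
}
```

While dry run is on, requests that would have been refused are logged with limit=REJECTED_DRY_RUN, which gives you the would-have-blocked count per endpoint. Removing the limit_req_dry_run line is the enforcement step.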

WAF and rate-limit configurations are starting points. Always run new rules in detection/log mode against real traffic before enforcing, to avoid blocking legitimate users.
