DAST in CI Without the Noise: Triaging OWASP ZAP Baseline Findings

A ZAP baseline scan is the easiest dynamic security test to add to a pipeline and the easiest to render useless. It is fast because it is passive: it spiders your app and inspects the responses without actively attacking. The flip side is that passive rules surface a lot of low-value alerts: missing headers, informational notices, and findings whose relevance depends entirely on what kind of app you are scanning. Drop that raw output into a CI gate and you get one of two failure modes: the build fails on every run and people disable the check, or it never fails and people stop reading it. Making DAST useful is almost entirely a triage problem, and triage is where this guide lives.

Passive Baseline vs Active Scan, and Why It Matters for Triage

The ZAP baseline scan runs passive rules only — it looks at traffic generated by spidering, but it does not send crafted attack payloads. That makes it safe to run against staging on every pull request, but it also means many alerts are suspicions, not confirmations. ZAP can flag that a parameter is reflected in a response, but the baseline scan cannot confirm it is actually exploitable as XSS without active testing.

This shapes how you triage. A baseline finding is a lead, not a verdict. Some leads are obviously real (a Set-Cookie without Secure on an HTTPS app), some need a manual reproduction to confirm, and some are noise for your particular target. Knowing which is which depends on context the scanner does not have, which is exactly why raw ZAP risk scores are a poor priority signal.

The CI Wiring

The mechanical part is simple — the ZAP baseline action against a deployed staging URL:

- name: ZAP Baseline Scan
  uses: zaproxy/action-baseline@v0.x
  with:
    target: 'https://staging.example.com'
    rules_file_name: '.zap/rules.tsv'   # tune alert thresholds here
    cmd_options: '-a'                    # include alpha passive rules
    fail_action: false                   # start non-blocking; gate after triage

Two choices matter. First, rules_file_name points at a TSV where you set each rule to WARN, FAIL, or IGNORE; this is your primary noise-control lever. Second, start with fail_action: false, and do not turn this into a blocking gate until you have triaged a few runs and know which alerts are real for your app. A DAST gate that blocks before it is tuned is a DAST gate that gets removed.
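To make the rules file concrete, here is a minimal sketch that renders triage decisions as tab-separated lines. It assumes the three-column layout (rule ID, action, parenthesized note) used by the config files ZAP generates. The `RuleDecision` type and `render_rules_tsv` helper are hypothetical names, and the rule IDs are illustrative, so check them against the alerts in your own report.

```python
from typing import NamedTuple

VALID_ACTIONS = {"WARN", "FAIL", "IGNORE"}


class RuleDecision(NamedTuple):
    rule_id: int  # ZAP passive rule (plugin) ID; verify against your report
    action: str   # WARN, FAIL, or IGNORE
    note: str     # rule name plus the reason for the decision


def render_rules_tsv(decisions: list[RuleDecision]) -> str:
    """Render decisions as 'id<TAB>action<TAB>(note)' lines, sorted by rule ID."""
    lines = []
    for d in sorted(decisions, key=lambda d: d.rule_id):
        if d.action not in VALID_ACTIONS:
            raise ValueError(f"rule {d.rule_id}: unknown action {d.action!r}")
        if "\t" in d.note or "\n" in d.note:
            raise ValueError(f"rule {d.rule_id}: note must not contain tabs or newlines")
        lines.append(f"{d.rule_id}\t{d.action}\t({d.note})")
    return "\n".join(lines) + "\n"


# Illustrative decisions only; IDs are assumed to match these passive rules.
decisions = [
    RuleDecision(10021, "WARN", "X-Content-Type-Options Header Missing: platform fixes at the edge"),
    RuleDecision(10011, "FAIL", "Cookie Without Secure Flag: HTTPS-only app"),
    RuleDecision(10027, "IGNORE", "Suspicious Comments: reviewed, nothing sensitive"),
]
print(render_rules_tsv(decisions), end="")
```

Keeping the decisions in code or a reviewed data file and generating `.zap/rules.tsv` from them is one option; editing the TSV by hand works just as well for a small rule set.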

Triage by Exposure and Exploitability, Not Risk Score

The single most important reframe is to stop sorting by ZAP’s risk rating. A “medium” missing-header finding on an internet-facing anonymous endpoint may matter more than a “high” finding on an internal, authenticated-only admin tool. Priority is a function of exposure (who can reach it) and exploitability (can it actually be turned into something), and ZAP knows neither.
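A rough sketch of that reframe: score each finding from the exposure and exploitability labels your team assigns, and carry ZAP's rating along only for reference. The label names, weights, and example findings are all hypothetical; the point is only that the sort key ignores the scanner's risk score.

```python
from dataclasses import dataclass

# Hypothetical labels and weights; tune them to your own environment.
EXPOSURE = {"internet-anonymous": 3, "internet-authenticated": 2, "internal": 1}
EXPLOITABILITY = {"confirmed": 3, "plausible": 2, "theoretical": 1}


@dataclass
class Finding:
    name: str
    zap_risk: str        # kept for reference only; not used for sorting
    exposure: str        # who can reach the affected endpoint
    exploitability: str  # how close the finding is to a working exploit


def priority(finding: Finding) -> int:
    """Higher means fix sooner. ZAP's own rating is deliberately ignored."""
    return EXPOSURE[finding.exposure] * EXPLOITABILITY[finding.exploitability]


# Made-up findings for illustration.
findings = [
    Finding("Error disclosure on internal admin tool", "High", "internal", "plausible"),
    Finding("Missing CSP on public login page", "Medium", "internet-anonymous", "plausible"),
]

for f in sorted(findings, key=priority, reverse=True):
    print(priority(f), f.zap_risk, f.name)
```

With these weights the "Medium" public finding sorts ahead of the "High" internal one, which matches the example in the paragraph above.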

A practical triage bucketing:

Exploitable now — confirmed or near-confirmed, internet-reachable. Fix first.
Needs manual verification — a reflected parameter, a potential injection point the baseline scan can only suspect. Confirm with a crafted request before spending fix effort.
Config, not code — missing CSP, X-Content-Type-Options, cookie flags. Real but cheap, and owned by whoever controls the edge configuration, not the app developers.
Low-value or false positive for this target — informational notices, findings that do not apply to a server-rendered app vs an SPA. Suppress in the rules file with a reason.

That last discipline — never suppress without a stated reason — is what keeps a tuned rules file from becoming a place where real findings go to die.
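One way to enforce that discipline is a small check that runs in review or CI. The sketch below assumes a team convention, not a ZAP feature: every IGNORE row carries the text `reason:` in its third column. The `unexplained_ignores` helper and the marker are hypothetical, and it is worth confirming that your ZAP version tolerates free text in that column.

```python
REASON_MARKER = "reason:"  # team convention assumed here, not enforced by ZAP


def unexplained_ignores(rules_tsv: str) -> list[str]:
    """Return the rule IDs set to IGNORE whose note column has no stated reason."""
    missing = []
    for line in rules_tsv.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = [p.strip() for p in line.split("\t")]
        if len(parts) < 2:
            continue  # malformed rows are a separate check
        rule_id, action = parts[0], parts[1].upper()
        note = parts[2] if len(parts) > 2 else ""
        if action == "IGNORE" and REASON_MARKER not in note.lower():
            missing.append(rule_id)
    return missing


# Illustrative rules file content.
sample = (
    "10021\tIGNORE\t(X-Content-Type-Options Header Missing)\n"
    "10027\tIGNORE\t(Suspicious Comments; reason: reviewed, no secrets)\n"
    "10011\tFAIL\t(Cookie Without Secure Flag)\n"
)
print(unexplained_ignores(sample))
```

Failing the pull request when this list is non-empty keeps silent suppressions out of the rules file.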

Confirming the Keepers Before Anyone Fixes Them

Every finding you keep should come with a concrete reproduction so an engineer can confirm it is real before opening a fix branch. For a reflected-parameter suspicion, that is a single curl:

# ZAP suspected reflection on the `q` parameter; confirm it's actually reflected unescaped.
# -G with --data-urlencode sends the probe as a properly URL-encoded query string.
curl -s -G "https://staging.example.com/search" --data-urlencode 'q=zap_probe_<b>x</b>' | grep -oF 'zap_probe_<b>x</b>'
# If the markup comes back unescaped in an HTML context, it's a real reflected-XSS lead.
# If it's escaped or the response is JSON with correct content-type, it's likely a false positive.

This step protects engineering time. DAST findings, especially from passive baseline scans, have a real false-positive rate, and confirming before fixing means you spend effort on actual vulnerabilities rather than chasing scanner ghosts. It also catches the inverse: a finding that looks minor but reproduces into something serious.
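The decision rule in that reproduction can also be written down so that every reviewer reads the result the same way. This is a sketch under assumptions: `classify_reflection` is a hypothetical helper, the body and Content-Type are whatever your HTTP client returned, and only standard HTML entity escaping is recognized, so a server that encodes differently will show up as "not-reflected" and still deserves a manual look.

```python
import html

HTML_TYPES = ("text/html", "application/xhtml+xml")


def classify_reflection(body: str, content_type: str, probe: str) -> str:
    """Classify how a markup probe came back in a response body."""
    media_type = content_type.split(";")[0].strip().lower()
    if probe in body:
        # Raw markup echoed back: a real lead only if a browser renders it as HTML.
        return "unescaped-html" if media_type in HTML_TYPES else "unescaped-non-html"
    if html.escape(probe) in body:
        return "escaped"  # reflected, but entity-encoded
    return "not-reflected"  # or encoded in a form this sketch does not recognize


probe = "zap_probe_<b>x</b>"
print(classify_reflection(f"<p>Results for {probe}</p>", "text/html; charset=utf-8", probe))
print(classify_reflection('{"q": "zap_probe_<b>x</b>"}', "application/json", probe))
```

Only the "unescaped-html" result is a reflected-XSS lead worth a fix branch; the other outcomes map to the false-positive cases described above.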

Using AI to Do the First-Pass Triage

The first pass over a ZAP report is mechanical enough to delegate to a model — bucketing, owner routing, and drafting reproductions — while you keep the verification and suppression calls:

Prompt: “Triage this ZAP baseline report for a server-rendered app behind a CDN, scanned unauthenticated. For each alert: bucket as exploitable / needs-verification / config / low-value-for-this-target, name the deciding factor, assign an owner (app vs platform), and for keepers give a curl reproduction. Don’t mark anything a false positive without a reason. Prioritize by exposure and exploitability, not ZAP’s risk score.”

Output (excerpt): “Alert: ‘X-Content-Type-Options missing’ — bucket: config; owner: platform (CDN/edge); cheap header add, not code. Alert: ‘Reflected parameter ref’ — bucket: needs-verification; the baseline scan can’t confirm XSS; reproduce: curl with a markup probe and check for unescaped reflection in HTML context. Alert: ‘Cross-Domain Misconfiguration’ — given the app is behind a CDN, verify whether ZAP saw the CDN’s permissive default vs the app’s real policy before treating as real.”

The model’s bucketing and routing save real time, but a human runs the reproductions and decides what gets suppressed. The AI drafts; the engineer verifies.
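Before handing a report to a model, it helps to condense it to the fields the prompt needs. The sketch below assumes the shape of ZAP's JSON report as commonly produced (a top-level `site` list, each entry with `alerts` that carry `pluginid`, `alert`, `riskdesc`, and `instances`). Treat those field names as assumptions to verify against your own report; `summarize_alerts` is a hypothetical helper.

```python
import json


def summarize_alerts(report_json: str, max_urls: int = 3) -> list[dict]:
    """Condense a ZAP JSON report into one compact row per alert."""
    report = json.loads(report_json)
    rows = []
    for site in report.get("site", []):
        for alert in site.get("alerts", []):
            instances = alert.get("instances", [])
            urls = [i.get("uri", "") for i in instances]
            params = sorted({i.get("param", "") for i in instances} - {""})
            rows.append({
                "rule_id": alert.get("pluginid"),
                "name": alert.get("alert") or alert.get("name"),
                "zap_risk": alert.get("riskdesc"),  # context only, not priority
                "params": params,
                "sample_urls": urls[:max_urls],
                "instance_count": len(urls),
            })
    return rows


# Tiny made-up report in the assumed shape.
sample_report = json.dumps({"site": [{"alerts": [{
    "pluginid": "10021",
    "alert": "X-Content-Type-Options Header Missing",
    "riskdesc": "Low (Medium)",
    "instances": [{"uri": "https://staging.example.com/", "param": "x-content-type-options"}],
}]}]})
print(json.dumps(summarize_alerts(sample_report), indent=2))
```

Pasting the condensed rows into the prompt keeps it short and keeps full response bodies, which may contain sensitive data, out of the model's context.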

From Non-Blocking to Gate

Once a few runs are triaged and the rules file reflects which alerts are real for your app, tighten the loop: set the genuinely actionable rules to FAIL, keep informational ones as WARN, and switch fail_action to true so that failures block a merge. Note that fail_action is a single on/off switch rather than a per-severity setting; which alerts block is decided by the WARN and FAIL assignments in the rules file, so check how your version of the action treats WARN-level results before flipping it. Set the known-accepted findings to IGNORE in the rules file, each with its reason, so the gate only fires on new or unaddressed issues. The result is a DAST check that engineers trust because every time it fails, it is telling them something true. If you want a structured triage pass over your own report, the ZAP baseline triage prompt sorts findings by exposure, drafts reproductions for the keepers, and recommends the rules-file tuning to keep the next run quiet.
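If you run the baseline script directly rather than through the action, the gate can be driven from its exit code. The sketch below assumes the documented convention for the baseline script (0 for a clean run, 1 when at least one FAIL rule fired, 2 when only WARN rules fired, 3 for any other failure). Confirm it against your ZAP version; `gate_decision` is a hypothetical helper.

```python
def gate_decision(exit_code: int, block_on_warn: bool = False) -> str:
    """Map a baseline-scan exit code to a CI outcome."""
    if exit_code == 0:
        return "pass"
    if exit_code == 1:
        return "block"  # at least one rule set to FAIL fired
    if exit_code == 2:
        # Only WARN-level rules fired; blocking on these is a policy choice.
        return "block" if block_on_warn else "pass-with-warnings"
    return "error"  # the scan itself failed; investigate rather than merge


for code in (0, 1, 2, 3):
    print(code, gate_decision(code))
```

Treating a scan error as distinct from a finding matters: a gate that reports "block" when staging was simply unreachable teaches people to distrust it.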
