Prometheus Error Guide: 'Bad Gateway' Grafana Datasource Error / No Data

Overview

Grafana queries Prometheus through a datasource. When something between Grafana and Prometheus breaks, panels render one of three failures: “Bad Gateway” (Grafana’s proxy could not reach Prometheus or got a 502 from something in front of it), a datasource error (Prometheus returned an HTTP error or Grafana could not parse the response), or “No data” (the request succeeded but the query returned an empty result). The first two are connectivity/config problems; the last is almost always a query, label, or time-range problem.

You will see these in the panel and in Grafana’s logs:

t=2026-06-23T14:14:02+0000 lvl=eror msg="Data proxy error" logger=data-proxy-log error="http: proxy error: dial tcp 10.0.2.9:9090: connect: connection refused"

The panel itself shows:

Bad Gateway

or, when the query runs but matches nothing:

No data

The distinction matters: “Bad Gateway” never reached Prometheus’s query engine, “datasource error” reached it and got a non-200, and “No data” got a clean 200 with an empty result set.

Symptoms

All panels on all dashboards fail with “Bad Gateway” (datasource-wide connectivity issue).
One panel shows a red datasource error with a Prometheus message (e.g., parse error, too many samples).
A panel renders axes but says “No data” while the metric clearly exists in Prometheus.
Grafana’s Explore view reproduces the error, isolating it from dashboard JSON.

# Confirm Grafana itself is up before investigating the datasource
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:3000/api/health

Common Root Causes

1. Wrong datasource URL or proxy access mode (Bad Gateway)

Grafana’s “proxy” access mode (labelled “Server” in the datasource settings) means the Grafana server, not your browser, connects to the URL. A URL that resolves on your laptop but not from the Grafana container yields “Bad Gateway.”

# Test from where Grafana actually runs (the container/host)
curl -s -o /dev/null -w '%{http_code}\n' http://prometheus:9090/-/ready

If this prints 000 (curl could not connect) when run from the Grafana host, the problem is reachability or the URL, even if Prometheus is up and reachable from elsewhere.

2. Prometheus down or not ready (Bad Gateway)

The datasource URL is correct but Prometheus is restarting, replaying its WAL, or crashed.

curl -s http://prometheus:9090/-/ready
curl -s http://prometheus:9090/-/healthy

Prometheus is not ready.

During WAL replay after a restart, /-/ready returns “not ready” and Grafana shows Bad Gateway until replay finishes.
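Since readiness only returns once replay finishes, polling the endpoint is often more useful than a single check. The following is a minimal sketch, assuming curl is available on the Grafana host; the function names probe_ready and wait_until_ready, the attempt count, and the delay are illustrative and not part of Prometheus or Grafana.

```shell
# probe_ready URL
# Succeeds only when the readiness endpoint answers with HTTP 200.
# Hypothetical helper; curl prints 000 when it cannot connect at all.
probe_ready() {
  [ "$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1")" = "200" ]
}

# wait_until_ready URL [ATTEMPTS] [DELAY_SECONDS]
# Polls probe_ready until it succeeds or the attempts run out.
wait_until_ready() {
  url=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=1
  while [ "$i" -le "$attempts" ]; do
    if probe_ready "$url"; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not ready after $attempts attempt(s)" >&2
  return 1
}

# Example (run from the Grafana host):
#   wait_until_ready http://prometheus:9090/-/ready 30 2
```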

3. A reverse proxy / auth layer returning 502 (Bad Gateway)

If Prometheus sits behind nginx, Caddy, or an auth proxy, that layer can 502 even when Prometheus is healthy.

curl -s -o /dev/null -w 'direct=%{http_code}\n' http://prometheus:9090/-/ready
curl -s -o /dev/null -w 'proxy=%{http_code}\n'  https://metrics.internal/prometheus/-/ready

direct=200
proxy=502

direct=200 but proxy=502 isolates the failure to the proxy/auth tier, not Prometheus.
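The comparison can be scripted so that the failing tier is named explicitly. This is a sketch only: isolate_layer is a hypothetical helper, and it treats anything other than 200 as a failure, which is a simplification.

```shell
# isolate_layer DIRECT_CODE PROXY_CODE
# Takes the two status codes printed by the curl commands above and names
# the tier that is failing. Hypothetical helper for illustration.
isolate_layer() {
  direct=$1
  proxy=$2
  if [ "$direct" != "200" ]; then
    echo "prometheus"   # Prometheus itself is unreachable or not ready
  elif [ "$proxy" != "200" ]; then
    echo "proxy"        # Prometheus is fine; the fronting layer is failing
  else
    echo "healthy"      # both paths answer 200
  fi
}

# Example:
#   isolate_layer \
#     "$(curl -s -o /dev/null -w '%{http_code}' http://prometheus:9090/-/ready)" \
#     "$(curl -s -o /dev/null -w '%{http_code}' https://metrics.internal/prometheus/-/ready)"
```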

4. Query returns an HTTP error (datasource error)

The datasource reaches Prometheus, but the query itself errors (bad syntax, too many samples, timeout). Reproduce the exact API call:

curl -s -G 'http://prometheus:9090/api/v1/query' \
  --data-urlencode 'query=rate(http_requests_total[5m]' | jq .

{"status":"error","errorType":"bad_data","error":"1:31: parse error: unclosed left parenthesis"}

Grafana surfaces this verbatim as a datasource error.
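When replaying many queries, it can help to reduce each API response to one line. A minimal sketch, assuming jq is installed; summarize_response is a hypothetical name, and it relies only on the status, errorType, error, and data.result fields of the Prometheus HTTP API response.

```shell
# summarize_response
# Reads a Prometheus HTTP API JSON response on stdin and prints either the
# error Prometheus reported or the number of series in a successful result.
summarize_response() {
  jq -r 'if .status == "success"
         then "success: \(.data.result | length) series"
         else "\(.errorType): \(.error)"
         end'
}

# Example:
#   curl -s -G 'http://prometheus:9090/api/v1/query' \
#     --data-urlencode 'query=rate(http_requests_total[5m]' | summarize_response
```

A line starting with "success: 0 series" points at the “No data” branch rather than a datasource error.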

5. Label or value mismatch causing No data

The query runs cleanly but matches nothing because a label name/value drifted (e.g., instance vs pod, job="api" vs job="api-server").

curl -s 'http://prometheus:9090/api/v1/label/job/values' | jq -r '.data[]'

node
kube-state-metrics
api-server

A panel filtering job="api" returns “No data” because the actual value is api-server.
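The same check can be automated by comparing the value a panel filters on against the values Prometheus actually has. This sketch assumes one label value per line on stdin (as produced by the jq command above); check_label_value is a hypothetical helper, and the substring match is only a rough way to surface likely drift.

```shell
# check_label_value WANTED
# Reads label values (one per line) on stdin and reports whether WANTED
# exists exactly. If not, lists values containing WANTED as candidates.
check_label_value() {
  wanted=$1
  values=$(cat)
  if printf '%s\n' "$values" | grep -Fxq -- "$wanted"; then
    echo "exact match: $wanted"
    return 0
  fi
  echo "no exact match for: $wanted"
  # Values that contain the wanted string often reveal a renamed label value.
  printf '%s\n' "$values" | grep -F -- "$wanted" | sed 's/^/  similar: /'
  return 1
}

# Example:
#   curl -s 'http://prometheus:9090/api/v1/label/job/values' \
#     | jq -r '.data[]' | check_label_value api
```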

6. Time range or step outside available data (No data)

The dashboard time range predates the data, or the resolution/step is too coarse/fine for a sparse metric.

curl -s -G 'http://prometheus:9090/api/v1/query' \
  --data-urlencode 'query=count_over_time(up[1h])' | jq '.data.result | length'

Zero results over the last hour with a non-empty result over a wider range means the panel’s time range is outside the data window (or retention expired it).
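To test a specific window rather than "now", the range API takes explicit start and end timestamps. The sketch below only computes those timestamps; range_window is a hypothetical helper, and the 24-hour window and 1h step in the commented usage are arbitrary choices.

```shell
# range_window NOW_EPOCH HOURS
# Prints "start=<epoch> end=<epoch>" for a window of HOURS ending at NOW_EPOCH.
range_window() {
  now=$1
  hours=$2
  echo "start=$((now - hours * 3600)) end=$now"
}

# Example: count samples of `up` per hour over the last 24 hours.
#   eval "$(range_window "$(date +%s)" 24)"
#   curl -s -G 'http://prometheus:9090/api/v1/query_range' \
#     --data-urlencode 'query=count_over_time(up[1h])' \
#     --data-urlencode "start=$start" --data-urlencode "end=$end" \
#     --data-urlencode 'step=1h' | jq '.data.result | length'
```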

Diagnostic Workflow

Step 1: Classify the failure

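“Bad Gateway” means the request never reached Prometheus’s engine. A datasource error with a message means it reached Prometheus and got a non-200. “No data” means a 200 with empty results. Each branch leads to a different step: “Bad Gateway” to Steps 2 and 3, a datasource error to Step 4, and “No data” to Step 5.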
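This classification can be sketched as a small shell function, assuming you already have the HTTP status code of the request and, for a 200, the number of series returned. The name classify_failure and its output labels are illustrative, and treating only 000 and 502 as “Bad Gateway” is a simplification.

```shell
# classify_failure HTTP_CODE [RESULT_COUNT]
# Maps a status code (as printed by curl -w '%{http_code}') and, for a 200,
# the number of series returned to one of the three branches.
classify_failure() {
  code=$1
  count=${2:-0}
  case "$code" in
    000|502)
      echo "bad-gateway"        # no connection, or a 502 from a fronting proxy
      ;;
    200)
      if [ "$count" -eq 0 ]; then
        echo "no-data"          # clean 200 with an empty result set
      else
        echo "ok"
      fi
      ;;
    *)
      echo "datasource-error"   # something answered with another non-200
      ;;
  esac
}
```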

Step 2: For Bad Gateway, test reachability from the Grafana host

curl -s -o /dev/null -w '%{http_code}\n' <DATASOURCE_URL>/-/ready

Run this from the Grafana container/host (not your laptop) because proxy mode connects server-side.

Step 3: Check Prometheus readiness and any fronting proxy

curl -s http://prometheus:9090/-/ready
curl -s -o /dev/null -w 'proxy=%{http_code}\n' <PUBLIC_URL>/-/ready

Separate Prometheus being down from a 502 at the proxy/auth tier.

Step 4: For datasource errors, replay the query against the API

curl -s -G 'http://prometheus:9090/api/v1/query' --data-urlencode 'query=<EXPR>' | jq .

The API returns the precise errorType/error Grafana is echoing.

Step 5: For No data, verify labels and time range

curl -s 'http://prometheus:9090/api/v1/label/<LABEL>/values' | jq -r '.data[]'
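curl -s -G 'http://prometheus:9090/api/v1/query' --data-urlencode 'query=<EXPR>' | jq '.data.result | length'

Confirm that the label values exist and that the expression returns series. The instant query evaluates at the current time, so for a panel looking at an older window, add a time=<UNIX_TIMESTAMP> parameter or use /api/v1/query_range with the panel’s start, end, and step.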

Example Root Cause Analysis

After a Prometheus host migration, every Grafana panel shows “Bad Gateway,” but https://prometheus.internal/graph loads fine in a browser.

Testing from the Grafana container:

docker exec grafana curl -s -o /dev/null -w '%{http_code}\n' http://prometheus.old:9090/-/ready

The command prints 000: the datasource URL still points at the old hostname prometheus.old, which no longer resolves from inside the Grafana container, even though the new host resolves in DNS and loads in the browser. Because Grafana’s proxy access mode connects server-side, the stale URL is the failure.

The fix updates the datasource URL to the new address and confirms reachability from Grafana’s network namespace:

# Update datasource URL to http://prometheus:9090, then:
docker exec grafana curl -s -o /dev/null -w '%{http_code}\n' http://prometheus:9090/-/ready

With the corrected URL, the proxy reaches Prometheus and all panels render. (Provisioned datasources should be fixed in the YAML, not just the UI, so the change survives restarts.)
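For a provisioned datasource, the corrected URL belongs in the provisioning file. The sketch below prints a minimal file in Grafana's datasource provisioning format; render_datasource is a hypothetical helper, and the datasource name, isDefault setting, and target path in the usage comment are assumptions to adapt to your setup.

```shell
# render_datasource URL
# Prints a minimal Grafana datasource provisioning file for Prometheus,
# using server-side (proxy) access and the given URL.
render_datasource() {
  cat <<EOF
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: $1
    isDefault: true
EOF
}

# Example (path assumes Grafana's default provisioning directory):
#   render_datasource http://prometheus:9090 \
#     > /etc/grafana/provisioning/datasources/prometheus.yaml
```

Grafana reads provisioning files at startup, so restart it after changing the file.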

Prevention Best Practices

Provision datasources as code (YAML) with a health check, and validate the URL resolves from Grafana’s network — not just your browser — after any host or DNS change.
Put a readiness gate in front of dashboards: alert on Prometheus /-/ready and on Grafana’s datasource health so “Bad Gateway” is caught before users report it.
Build panels with variables sourced from label_values(...) so filters can’t drift from real label values and silently produce “No data.”
Keep dashboard default time ranges within your Prometheus retention; “No data” on old ranges is expected once retention expires.
When Prometheus sits behind a proxy, monitor the proxy’s 5xx rate separately so a 502 there isn’t mistaken for Prometheus being down.

Quick Command Reference

# Reach Prometheus from where Grafana runs (proxy mode is server-side)
docker exec grafana curl -s -o /dev/null -w '%{http_code}\n' <DATASOURCE_URL>/-/ready

# Prometheus readiness/health
curl -s http://prometheus:9090/-/ready
curl -s http://prometheus:9090/-/healthy

# Direct vs proxied (isolate a fronting 502)
curl -s -o /dev/null -w 'direct=%{http_code}\n' http://prometheus:9090/-/ready
curl -s -o /dev/null -w 'proxy=%{http_code}\n'  <PUBLIC_URL>/-/ready

# Replay a failing query
curl -s -G 'http://prometheus:9090/api/v1/query' --data-urlencode 'query=<EXPR>' | jq .

# Verify label values exist (No data triage)
curl -s 'http://prometheus:9090/api/v1/label/<LABEL>/values' | jq -r '.data[]'

Conclusion

Grafana failures against Prometheus split cleanly by where they break:

“Bad Gateway” never reached Prometheus’s engine — test reachability from the Grafana host, not your browser.
Confirm Prometheus /-/ready and rule out a 502 at a fronting proxy.
A datasource error with a Prometheus message means the query failed — replay it against the API.
“No data” means a clean 200 with no results — verify label values and the panel’s time range.
Fix provisioned datasources in YAML so the change survives restarts.

Classify first, then dive into the right layer — connectivity, query, or empty result — and the fix follows directly.
