Prometheus Error Guide: 'connect: connection refused' Scrape Target DOWN
Fix Prometheus scrape 'connection refused', 'connection reset by peer', and 'no route to host' errors: diagnose dead exporters, wrong ports, firewalls, and bind addresses.
Exact Error Message
On the Prometheus Targets page (/targets) the endpoint shows DOWN with a red error string in the Error column:
Get "http://10.0.0.5:9100/metrics": dial tcp 10.0.0.5:9100: connect: connection refused
Two closely related variants appear for the same class of failure:
Get "http://10.0.0.5:9100/metrics": dial tcp 10.0.0.5:9100: connect: no route to host
Get "http://10.0.0.5:9100/metrics": read tcp 10.0.2.1:51234->10.0.0.5:9100: read: connection reset by peer
The scrape manager logs the same text, but only at debug level (run Prometheus with --log.level=debug to see it):
ts=2026-06-27T10:14:03.221Z caller=scrape.go:1351 level=debug component="scrape manager" scrape_pool=node target=http://10.0.0.5:9100/metrics msg="Scrape failed" err="Get \"http://10.0.0.5:9100/metrics\": dial tcp 10.0.0.5:9100: connect: connection refused"
What the Error Means
Prometheus tried to open a TCP connection to the target's /metrics endpoint, and the connection either never completed or was torn down before an HTTP response arrived. The failure is at the transport (TCP) level, so it is not an authentication or TLS problem.
- connection refused — the host is reachable and answered the TCP SYN with a RST: nothing is listening on that IP:port.
- no route to host — a router or firewall actively rejected the packet (ICMP unreachable), or the host is on an unreachable subnet.
- connection reset by peer — the connection was accepted then torn down mid-flight, typically a proxy, a service that crashed during the scrape, or a security appliance.
The result is up{job="..."} == 0 for that instance and a hole in every metric the target would have exposed.
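The three variants can be told apart from the error string alone. The sketch below is one way to do that in plain shell; the function name classify_scrape_error and its output labels are made up for this guide and are not part of Prometheus.

```shell
#!/bin/sh
# classify_scrape_error TEXT
# Hypothetical helper: maps a scrape error string (for example the lastError
# field from /api/v1/targets) to one of the failure classes described above.
classify_scrape_error() {
  case "$1" in
    *"connection refused"*)
      echo "refused: host reachable, nothing listening on that IP:port" ;;
    *"no route to host"*)
      echo "no-route: packet rejected by a firewall/router, or host unreachable" ;;
    *"connection reset by peer"*)
      echo "reset: connection accepted, then torn down mid-scrape" ;;
    *)
      echo "other: not one of the three transport errors covered here" ;;
  esac
}

# Example call:
#   classify_scrape_error 'dial tcp 10.0.0.5:9100: connect: connection refused'
```

The text before the first colon is a short label that is easy to group on when many targets are DOWN at once.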
Common Causes
- The exporter is not running. The process (node_exporter, a /metrics handler, etc.) died or never started.
- Wrong port in the scrape config or service discovery. 9100 vs 9090, or a relabel rule that built the wrong __address__.
- Firewall / security group blocking the port. AWS security group, ufw, firewalld, or a Kubernetes NetworkPolicy drops traffic on the scrape port → no route to host or a timeout.
- Exporter bound to 127.0.0.1 instead of 0.0.0.0. It listens, but only on loopback, so remote Prometheus gets connection refused.
- Container/port-mapping mismatch. The container exposes 9100 internally but the host or Service maps a different port, or the pod IP changed.
- Stale or wrong service discovery. SD returns a terminated instance, an old pod IP, or a node that no longer runs the exporter.
How to Reproduce the Error
Point a scrape job at a port where nothing listens:
scrape_configs:
  - job_name: "node"
    static_configs:
      - targets: ["10.0.0.5:9100"]
Then stop the exporter (or never start it) on 10.0.0.5. Within one scrape interval the Targets page flips that instance to DOWN with connect: connection refused. Binding the exporter to loopback only reproduces it identically from a remote Prometheus:
# On the target: bind to localhost only -> remote scrape is refused
node_exporter --web.listen-address=127.0.0.1:9100
Diagnostic Commands
Start from the API view of the failing target so you have the exact scrapeUrl and error:
# All DOWN targets with their last error and the exact URL Prometheus used
curl -s http://localhost:9090/api/v1/targets \
  | jq -r '.data.activeTargets[] | select(.health!="up")
      | [.labels.job, .scrapeUrl, .lastError] | @tsv'
Confirm the failure in the up metric:
curl -s http://localhost:9090/api/v1/query \
--data-urlencode 'query=up == 0' | jq -r '.data.result[].metric | "\(.job) \(.instance)"'
Reach the target’s /metrics from the Prometheus host (this is the test that matters — connectivity from anywhere else can mislead):
curl -v --max-time 5 http://10.0.0.5:9100/metrics
# connection refused -> nothing listening / wrong port
# no route to host -> firewall / routing / security group
# hangs, then times out -> packets silently dropped
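When the curl check runs from a script rather than by hand, the exit status carries most of the same information. The sketch below maps a few documented curl exit codes to the outcomes above; the function name explain_curl_exit and the labels are illustrative only. Exit code 7 covers both connection refused and no route to host, so the curl -v text is still needed to tell those two apart.

```shell
#!/bin/sh
# explain_curl_exit CODE
# Hypothetical helper: pass curl's exit status ($?) right after the request.
explain_curl_exit() {
  case "$1" in
    0)     echo "ok: target answered over HTTP (any status code)" ;;
    6)     echo "dns: could not resolve the host name" ;;
    7)     echo "connect-failed: refused or no route; check the curl -v text" ;;
    28)    echo "timeout: packets are probably being dropped silently" ;;
    52|56) echo "reset: connection opened, then closed or reset before a full reply" ;;
    *)     echo "other: see EXIT CODES in the curl man page" ;;
  esac
}

# Example call from the Prometheus host:
#   curl -s -o /dev/null --max-time 5 http://10.0.0.5:9100/metrics
#   explain_curl_exit $?
```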
On the target host, confirm what is actually listening and on which address:
ss -ltnp | grep 9100
# 127.0.0.1:9100 -> bound to loopback (remote refused)
# 0.0.0.0:9100 -> bound on all interfaces (good)
# (no output) -> exporter not running
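Note that grep 9100 also matches ports such as 19100, and a wildcard bind may print as *:9100 or [::]:9100 rather than 0.0.0.0:9100. The sketch below is a stricter check that reads ss -ltn output on stdin; the function name classify_listener and its three result words are assumptions of this guide, and it assumes the usual iproute2 column layout with the local address in the fourth column.

```shell
#!/bin/sh
# classify_listener PORT
# Hypothetical helper: reads `ss -ltn` output on stdin and prints one of
#   non-loopback   - PORT is bound on a wildcard or routable address
#   loopback-only  - PORT is bound only on 127.x.x.x or [::1]
#   not-listening  - no LISTEN socket on PORT
classify_listener() {
  awk -v port="$1" '
    $1 == "LISTEN" {
      addr = $4                          # "Local Address:Port" column
      n = split(addr, parts, ":")
      if (parts[n] != port) next         # port is the text after the last colon
      host = substr(addr, 1, length(addr) - length(port) - 1)
      if (host ~ /^127\./ || host == "[::1]") loopback = 1
      else other = 1
    }
    END {
      if (other)         print "non-loopback"
      else if (loopback) print "loopback-only"
      else               print "not-listening"
    }'
}

# Example call on the target host:
#   ss -ltn | classify_listener 9100
```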
Check whether the exporter process is alive and review its journal:
systemctl status node_exporter
journalctl -u node_exporter --no-pager | tail -20
Inspect the resolved config and any relabeling that builds __address__:
curl -s http://localhost:9090/api/v1/status/config | jq -r '.data.yaml' | grep -A8 'job_name: node'
promtool check config /etc/prometheus/prometheus.yml
For service discovery, the Service Discovery page (/service-discovery) shows the pre-relabel __address__ and any dropped targets — invaluable when SD returns the wrong endpoint.
Step-by-Step Resolution
Cause: exporter not running. Start and enable it, then confirm it listens.
systemctl enable --now node_exporter
ss -ltnp | grep 9100
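A local ss check only proves the socket exists on the target. To confirm the port is reachable over the network, the same check can be repeated from the Prometheus host. The sketch below retries a plain TCP connect once per second; wait_for_port is a hypothetical helper, and it assumes bash (for the /dev/tcp pseudo-path) and coreutils timeout are installed.

```shell
#!/bin/sh
# wait_for_port HOST PORT [TRIES]
# Hypothetical helper: tries a TCP connect up to TRIES times (default 10),
# sleeping one second between attempts. Returns 0 once the port accepts.
wait_for_port() {
  local host="$1" port="$2" tries="${3:-10}"
  while [ "$tries" -gt 0 ]; do
    # bash opens a TCP connection when redirecting to /dev/tcp/HOST/PORT
    if timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
      echo "listening"
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  echo "not listening"
  return 1
}

# Example call from the Prometheus host:
#   wait_for_port 10.0.0.5 9100 15
```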
Cause: bound to loopback. Change the listen address to all interfaces (in the unit's ExecStart line if the exporter runs under systemd) and restart.
node_exporter --web.listen-address=0.0.0.0:9100
Cause: wrong port. Fix the target/port in the scrape config (or the relabel rule), validate, and reload.
scrape_configs:
  - job_name: "node"
    static_configs:
      - targets: ["10.0.0.5:9100"] # match the port from `ss -ltnp`
promtool check config /etc/prometheus/prometheus.yml
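# The reload endpoint only works if Prometheus was started with --web.enable-lifecycle;
# otherwise send SIGHUP to the Prometheus process.
curl -X POST http://localhost:9090/-/reload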
Cause: firewall / security group (no route to host). Open the scrape port from the Prometheus source range.
# Example with firewalld on the target
firewall-cmd --add-port=9100/tcp --permanent && firewall-cmd --reload
For AWS, add an inbound rule on the target’s security group allowing TCP 9100 from the Prometheus security group/CIDR. In Kubernetes, allow the port in the relevant NetworkPolicy.
Cause: stale/wrong SD. Fix the relabel rule that builds __address__ so it points at the right pod/port; the Service Discovery page confirms the corrected address.
relabel_configs:
  - source_labels: [__meta_kubernetes_pod_ip]
    target_label: __address__
    replacement: "$1:9100"
After any change, refresh /targets; the instance should return to UP and up == 0 should clear.
Prevention and Best Practices
- Run an exporter availability alert on up == 0 for: 5m so a refused/dead target pages you instead of silently leaving a gap.
- Standardize exporter ports (node 9100, cAdvisor 8080, blackbox 9115) and keep them in one inventory so scrape configs and firewall rules stay in sync.
- Always bind exporters to a routable address (0.0.0.0 or a specific NIC), never loopback, when a remote Prometheus scrapes them.
- Manage firewall/security-group rules as code so opening the scrape port is part of provisioning, not a manual afterthought.
- Prefer service discovery (Kubernetes, EC2, Consul) over static IPs so terminated instances drop out automatically instead of lingering as DOWN.
- Alert on scrape_samples_scraped == 0 alongside up to catch targets that connect but return nothing.
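The two alerting suggestions above could look roughly like the rule file written by the sketch below. The function name write_exporter_alerts, the group and alert names, and the severity labels are placeholders, not anything Prometheus requires. The second rule adds up == 1 because scrape_samples_scraped is also 0 when the scrape itself fails, which the first rule already covers.

```shell
#!/bin/sh
# write_exporter_alerts FILE
# Hypothetical helper: writes a Prometheus rule file with two example alerts.
# The quoted 'EOF' keeps the shell from expanding the {{ $labels... }} templates.
write_exporter_alerts() {
  cat > "$1" <<'EOF'
groups:
  - name: exporter-availability
    rules:
      - alert: TargetDown
        expr: up == 0
        for: 5m
        labels:
          severity: page
        annotations:
          summary: "{{ $labels.job }} target {{ $labels.instance }} is DOWN"
      - alert: TargetReturnsNoSamples
        expr: up == 1 and scrape_samples_scraped == 0
        for: 5m
        labels:
          severity: warn
        annotations:
          summary: "{{ $labels.instance }} is UP but exposed no samples"
EOF
}

# Example (the path is an assumption; it must be listed under rule_files):
#   write_exporter_alerts /etc/prometheus/rules/exporters.yml
#   promtool check rules /etc/prometheus/rules/exporters.yml
```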
Related Errors
- Prometheus Error: target DOWN and up == 0 triage hub
- Prometheus Error: 401 Unauthorized / 403 Forbidden on scrape
- Prometheus Error: x509 certificate signed by unknown authority
- Prometheus Error: context deadline exceeded scrape timeout
Frequently Asked Questions
Why does the Targets page say “connection refused” when curl localhost:9100/metrics works on the target itself?
Because the exporter is bound to 127.0.0.1. A local curl on the target succeeds, but Prometheus connects over the network and is refused. Check ss -ltnp | grep 9100 — if it shows 127.0.0.1:9100, rebind to 0.0.0.0.
What is the difference between “connection refused” and “no route to host”?
connection refused means the host answered but nothing is listening on that port (or it is the wrong port). no route to host means a firewall or router rejected the packet, or the host could not be reached at all, before the packet got to a listening socket. A firewall or security group that silently drops packets shows up as a timeout instead. The first is usually an exporter/port problem; the second is a network/firewall problem.
The target was UP yesterday and is DOWN now with no config change — why?
The exporter process likely crashed or the pod/instance was replaced and its IP changed. Check systemctl status / journalctl on the target, and verify the Service Discovery page is returning the current address rather than a stale one.
Does connection reset by peer mean an authentication failure?
No. A reset is a transport-level teardown, not an HTTP 401/403. It usually means a proxy, load balancer, or the exporter itself dropped the connection mid-scrape. If you instead see HTTP status 401/403, the connection succeeded and the problem is auth — see the 401/403 guide.
How do I find the exact URL Prometheus is scraping?
Query /api/v1/targets and read .scrapeUrl, or open the Targets page and hover the Endpoint column. This shows the post-relabel address, which is what you should curl from the Prometheus host.