Grafana Error Guide: 'Failed to connect to database'

Overview

Grafana stores all of its state — dashboards, users, orgs, folders, API keys, alert rules — in a backend database. By default that is a local SQLite file, but production deployments almost always point Grafana at MySQL or Postgres so multiple replicas can share state. Grafana opens this connection at boot and runs schema migrations before it serves any HTTP traffic. If the database is unreachable, Grafana logs a fatal error and the process exits — the service never comes up.

The literal errors you see in journalctl or kubectl logs look like:

Failed to connect to database
dial tcp 10.0.0.5:5432: connect: connection refused

Error: ✗ dial tcp: lookup db on 10.96.0.10:53: no such host

migration failed: Error 1045: Access denied for user 'grafana'@'10.0.1.7' (using password: YES)

Because this happens before the web server starts, you will not see a login page or a 500 error: you get a dead port and a restart loop. When the database itself is healthy and reachable, the fix is almost always in the [database] section of grafana.ini or the equivalent GF_DATABASE_* environment variables.

Symptoms

Grafana service fails to start or crash-loops (systemctl status grafana-server shows activating (auto-restart) or failed).
Port 3000 refuses connections; no login page renders.
Logs contain Failed to connect to database followed by a dial tcp or Access denied message.
In Kubernetes the pod is in CrashLoopBackOff with the DB error in kubectl logs.
Intermittent too many connections errors under load even when startup succeeds.

Common Root Causes

1. Database server is down or the port is wrong

The most common cause: the DB host is unreachable because the service is stopped, the port is wrong, or a firewall/security group blocks it.

[database]
type = postgres
host = 10.0.0.5:5432
name = grafana
user = grafana
password = """s3cr3t"""
ssl_mode = require

Failed to connect to database
dial tcp 10.0.0.5:5432: connect: connection refused

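connection refused means the connection attempt reached the host and was actively rejected, which usually means nothing is listening on that port (a firewall configured to reject rather than drop produces the same error). A hang followed by i/o timeout instead means packets are being silently dropped, typically by a firewall or security group.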

2. Wrong or missing DNS name

If host uses a hostname (common in Kubernetes with a Service name like grafana-db), a DNS failure produces no such host.

Error: ✗ dial tcp: lookup grafana-db on 10.96.0.10:53: no such host

Confirm the Service/endpoint exists in the right namespace and that the name matches exactly, including any .namespace.svc.cluster.local suffix.
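
To check which form of the name actually resolves, it can help to try each one in turn. The sketch below is a hypothetical helper (svc_candidates is not part of Grafana or kubectl) that prints the short name, the namespaced name, and the fully qualified name; the monitoring namespace and the cluster.local domain are assumptions to replace with your own values.

```shell
# Print the DNS names a pod could use for a Service, shortest first.
# Arguments: service name, namespace, optional cluster domain.
svc_candidates() {
  svc=$1
  ns=$2
  domain=${3:-cluster.local}   # assumption: the common default cluster domain
  printf '%s\n' "$svc" "$svc.$ns" "$svc.$ns.svc.$domain"
}

# Example use (needs a running pod, so it is left commented out):
# for name in $(svc_candidates grafana-db monitoring); do
#   kubectl exec deploy/grafana -- nslookup "$name"
# done
```

The short name only resolves from a pod in the same namespace as the Service, so if only the longer forms resolve, Grafana and the database are probably in different namespaces.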

3. Bad credentials

MySQL reports Error 1045 (Access denied); Postgres reports password authentication failed.

migration failed: Error 1045: Access denied for user 'grafana'@'10.0.1.7' (using password: YES)

pq: password authentication failed for user "grafana"

Check that the password in grafana.ini matches the DB and that the grant covers the source IP ('grafana'@'%' vs 'grafana'@'localhost').

4. TLS / ssl_mode mismatch

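Postgres ssl_mode must match what the server expects. If Grafana is set to require but the server has TLS turned off, the driver fails with SSL is not enabled on the server. If the server only accepts encrypted connections but Grafana is set to disable, the server rejects the session with a pg_hba.conf error that ends in no encryption.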

pq: SSL is not enabled on the server

pq: no pg_hba.conf entry for host "10.0.1.7", user "grafana", database "grafana", no encryption

Valid Postgres values are disable, require, verify-ca, and verify-full. For MySQL use true, false, or skip-verify.

5. Connection pool exhausted

If max_open_conn is set too high across many replicas, the DB hits its own max_connections limit and rejects new sessions.

[database]
max_open_conn = 100
max_idle_conn = 25
conn_max_lifetime = 14400

Error 1040: Too many connections
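
The arithmetic behind this is simple enough to script. The following is a minimal sketch, assuming a hypothetical helper named pool_budget and an arbitrary headroom of 10 connections reserved for admin sessions and other clients; the replica count and the database limit in the example are illustrative, not taken from a real deployment.

```shell
# Compare the worst-case connection count from all Grafana replicas
# against the database's max_connections, minus some headroom.
# Arguments: replicas, max_open_conn, db max_connections, optional reserve.
pool_budget() {
  replicas=$1
  max_open=$2
  db_max=$3
  reserve=${4:-10}             # assumption: headroom for non-Grafana sessions
  total=$((replicas * max_open))
  limit=$((db_max - reserve))
  if [ "$total" -le "$limit" ]; then
    status=ok
  else
    status=over
  fi
  printf '%s total=%d limit=%d\n' "$status" "$total" "$limit"
  [ "$status" = ok ]           # non-zero exit status when over budget
}

# Example: 3 replicas with max_open_conn = 100 against max_connections = 200
# pool_budget 3 100 200
```

With those example numbers the replicas could open up to 300 connections against a usable limit of 190, so either max_open_conn or the replica count has to come down, or the database limit has to go up.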

Diagnostic Workflow

Step 1 — Read the exact error. The Go driver error tells you which failure mode you are in.

journalctl -u grafana-server -n 50 --no-pager | grep -iE "database|dial tcp|migration"
# Kubernetes:
kubectl logs deploy/grafana --tail=50 | grep -iE "database|dial tcp"
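
If you are triaging several hosts, the mapping from error text to failure mode can be sketched as a small function. This is an illustration only: classify_db_error and its labels are made up for this guide, and the patterns cover just the messages quoted here, so treat an unknown result as a prompt to read the log line yourself.

```shell
# Map one Grafana log line to the root-cause bucket used in this guide.
classify_db_error() {
  case $1 in
    *"connection refused"*)
      echo "db-down-or-wrong-port" ;;
    *"i/o timeout"*)
      echo "firewall-or-routing" ;;
    *"no such host"*)
      echo "dns" ;;
    *"Error 1045"*|*"password authentication failed"*)
      echo "credentials" ;;
    *"SSL is not enabled"*|*"no pg_hba.conf entry"*)
      echo "tls-or-pg_hba" ;;
    *"Error 1040"*|*"Too many connections"*|*"too many connections"*)
      echo "pool-exhausted" ;;
    *)
      echo "unknown" ;;
  esac
}

# Example use against live logs (left commented out):
# journalctl -u grafana-server -n 50 --no-pager |
#   while IFS= read -r line; do classify_db_error "$line"; done | sort | uniq -c
```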

Step 2 — Confirm what Grafana thinks it’s connecting to. Env vars override the ini file.

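grafana-server -v   # confirms the binary runs and prints its version
grep -A8 '^\[database\]' /etc/grafana/grafana.ini
env | grep GF_DATABASE_   # run in the service's environment (for example inside the container), not just your login shell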
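
Because the environment wins over the file, it is easy to read the ini value and miss an override. The sketch below resolves a single [database] key the same way: it checks GF_DATABASE_<KEY> first and falls back to the ini file. The function name effective_db_setting is hypothetical, and the parser is deliberately simple: it ignores defaults.ini, custom config paths, and any variable expansion inside values, so treat its answer as a hint rather than Grafana's own view.

```shell
# Print the effective value of one [database] key and where it came from.
# Arguments: key (for example ssl_mode), optional path to grafana.ini.
effective_db_setting() {
  key=$1
  ini=${2:-/etc/grafana/grafana.ini}
  # Only accept simple key names, since the name is used with eval below.
  case $key in
    ''|*[!a-z0-9_]*) echo "invalid key: $key" >&2; return 2 ;;
  esac
  var="GF_DATABASE_$(printf '%s' "$key" | tr '[:lower:]' '[:upper:]')"
  eval "val=\${$var-}"
  if [ -n "$val" ]; then
    printf 'env: %s\n' "$val"
    return 0
  fi
  # Fall back to the first uncommented match inside the [database] section.
  val=$(awk -v key="$key" '
    /^[ \t]*\[/ { in_db = ($0 ~ /^[ \t]*\[database\]/); next }
    in_db && /^[ \t]*[;#]/ { next }
    in_db {
      eq = index($0, "=")
      if (eq == 0) next
      k = substr($0, 1, eq - 1)
      v = substr($0, eq + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", k)
      gsub(/^[ \t]+|[ \t]+$/, "", v)
      if (k == key) { print v; found = 1; exit }
    }
    END { exit (found ? 0 : 1) }
  ' "$ini") || return 1
  printf 'ini: %s\n' "$val"
}

# Example: effective_db_setting ssl_mode
```

Avoid running it for the password key on a shared terminal, since it prints the value in clear text.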

Step 3 — Test raw TCP reachability from the Grafana host.

nc -vz 10.0.0.5 5432    # Postgres
nc -vz 10.0.0.5 3306    # MySQL
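
Grafana's host setting carries the address and port in one string, while nc wants them as separate arguments. A small sketch of the split follows; split_hostport is a hypothetical helper, the fallback to 5432 or 3306 when no port is given is an assumption based on the usual defaults for each engine, and bracketed IPv6 literals are not handled.

```shell
# Split a Grafana-style "host:port" value into "host port".
# Arguments: host value from [database], database type (postgres or mysql).
split_hostport() {
  hostport=$1
  case $2 in
    postgres) default_port=5432 ;;
    mysql)    default_port=3306 ;;
    *)        echo "unknown type: $2" >&2; return 2 ;;
  esac
  case $hostport in
    *:*) host=${hostport%:*}; port=${hostport##*:} ;;
    *)   host=$hostport;      port=$default_port ;;   # assumption: engine default
  esac
  printf '%s %s\n' "$host" "$port"
}

# Example (left commented out because it needs network access):
# set -- $(split_hostport "10.0.0.5:5432" postgres)
# nc -vz "$1" "$2"
```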

Step 4 — Test the actual credentials with a native client.

PGPASSWORD=s3cr3t psql -h 10.0.0.5 -p 5432 -U grafana -d grafana -c '\dt'
mysql -h 10.0.0.5 -P 3306 -u grafana -ps3cr3t grafana -e 'SHOW TABLES;'

Step 5 — In Kubernetes, exec into the pod so you test from the same network namespace and DNS resolver.

kubectl exec -it deploy/grafana -- sh -c 'nc -vz grafana-db 5432; nslookup grafana-db'

Example Root Cause Analysis

A team upgraded their Postgres RDS instance and Grafana went into CrashLoopBackOff. Logs showed:

Failed to connect to database
pq: no pg_hba.conf entry for host "10.0.4.9", user "grafana", database "grafana", no encryption

TCP reachability was fine (nc -vz succeeded), and the password was correct when tested from a bastion, but only with sslmode=require. The upgrade had enabled rds.force_ssl=1 in the RDS parameter group, while Grafana’s config still had ssl_mode = disable. Setting ssl_mode to require (here through the GF_DATABASE_SSL_MODE=require environment variable, which overrides the [database] section) let Grafana negotiate TLS, migrations ran, and the pod became Ready. Root cause: a server-side TLS requirement that no longer matched the client config.

Prevention Best Practices

Pin ssl_mode explicitly in config so an upstream default change doesn’t silently break you.
Add a startup readiness dependency: in systemd use After= / Requires= when the database runs on the same host; in Kubernetes use an init container that waits for the DB port.
Size the pool sanely: keep replicas * max_open_conn well under the DB’s max_connections.
Store credentials in a secret manager and inject via GF_DATABASE_PASSWORD, not plaintext ini.
Monitor the DB port with a blackbox exporter so you catch outages before Grafana crash-loops.
Least-privilege but sufficient grants: the Grafana DB user needs DDL (CREATE, ALTER) for migrations, not just DML.
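
The wait-for-the-port idea from the list above can be sketched as a generic retry loop, usable as an init container command or a systemd ExecStartPre script. retry_until is a hypothetical helper, and the attempt count and delay in the example are arbitrary choices rather than recommended values.

```shell
# Run a command until it succeeds or the attempts run out.
# Arguments: max attempts, delay in seconds, then the command to run.
retry_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep between attempts, but not after the final failure.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example: wait up to about a minute for the database port to open.
# retry_until 30 2 nc -z grafana-db 5432
```

A bounded loop is used on purpose: if the database never appears, the init step fails visibly instead of hanging forever.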

Quick Command Reference

# What is Grafana logging?
journalctl -u grafana-server -n 50 --no-pager | grep -iE "database|dial tcp"
kubectl logs deploy/grafana --tail=50 | grep -i database

# See the effective config
grep -A8 '^\[database\]' /etc/grafana/grafana.ini
env | grep GF_DATABASE_

# Reachability + auth tests
nc -vz 10.0.0.5 5432
PGPASSWORD=s3cr3t psql -h 10.0.0.5 -U grafana -d grafana -c '\dt'
mysql -h 10.0.0.5 -u grafana -ps3cr3t grafana -e 'SHOW TABLES;'

# From inside the pod (DNS + network parity)
kubectl exec -it deploy/grafana -- sh -c 'nslookup grafana-db; nc -vz grafana-db 5432'

# Restart after fixing config
systemctl restart grafana-server
kubectl rollout restart deploy/grafana

Conclusion

When Grafana logs Failed to connect to database and exits at boot, work through these root causes in order:

Database down or wrong port — connection refused / i/o timeout; verify with nc -vz.
DNS resolution failure — no such host; confirm the Service/hostname resolves from the Grafana host.
Bad credentials — Error 1045 / password authentication failed; test with psql/mysql.
TLS / ssl_mode mismatch — SSL is not enabled / no pg_hba.conf entry; align ssl_mode with the server.
Connection pool exhaustion — too many connections; lower max_open_conn across replicas.

