OpenStack Error Guide: 'Something went wrong!' Horizon HTTP 500 Internal Server Error

Overview

The OpenStack dashboard (Horizon) is a Django application served by Apache mod_wsgi. When a request raises an unhandled exception, Django returns HTTP 500 and Horizon shows a generic Something went wrong! page with no detail — by design, since DEBUG is off in production. The real cause is always in the web server error log, not the browser. The error is usually configuration, not a Horizon bug: a SECRET_KEY mismatch across nodes, a dead session backend, missing compiled static assets, or an unreachable Keystone endpoint.

In the browser you see only:

Something went wrong!
An unexpected error has occurred. Try refreshing the page. If that doesn't help, contact your local administrator.

The Apache error log carries the traceback, for example:

[wsgi:error] [pid 1421] [client 10.0.0.5:51422] Internal Server Error: /auth/login/
[wsgi:error] django.core.exceptions.ImproperlyConfigured: Error importing CACHES backend
[wsgi:error] pylibmc.ConnectionError: error 3 from memcached_get(...): (0x...) CONNECTION FAILURE

Because the load balancer can send any request to either node of a load-balanced pair, a fault on one node (mismatched key, stale assets) can make logins fail intermittently while the page otherwise loads.

Symptoms

The dashboard shows Something went wrong! or a bare HTTP 500 page.
Login succeeds sometimes and 500s other times (load-balanced nodes out of sync).
Static assets (CSS/JS) 404, leaving an unstyled page before the 500.
The Apache/httpd error log shows a Django traceback on every failing request.

# Kolla-Ansible
docker logs horizon 2>&1 | tail -30
# Traditional packages (Debian/Ubuntu)
sudo tail -50 /var/log/apache2/error.log
# Traditional packages (RHEL/CentOS)
sudo tail -50 /var/log/httpd/horizon_error.log

[wsgi:error] Internal Server Error: /project/instances/
[wsgi:error] django.core.exceptions.SuspiciousOperation: Invalid HTTP_HOST header: 'dash.example.com'. You may need to add 'dash.example.com' to ALLOWED_HOSTS.

Common Root Causes

1. SECRET_KEY mismatch across nodes

Behind a load balancer every Horizon node must share the same SECRET_KEY, or sessions/CSRF tokens signed on one node fail on another, raising 500s on form posts.

grep -R "SECRET_KEY" /etc/openstack-dashboard/local_settings.py
docker exec horizon grep SECRET_KEY /etc/openstack-dashboard/local_settings

SECRET_KEY = 'a1b2c3...node1'

Two nodes printing different values is the bug; pin a shared key (Kolla manages this via horizon_secret_key).
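To see why a key mismatch breaks requests that land on the other node, here is a minimal sketch using only the standard library. It is a simplified stand-in for Django's signing, not Horizon's actual code, and the key values and the sign/verify helper names are hypothetical.

```python
import hashlib
import hmac


def sign(value: str, secret_key: str) -> str:
    """Return 'value:signature', signed with HMAC-SHA256 keyed by secret_key."""
    mac = hmac.new(secret_key.encode(), value.encode(), hashlib.sha256).hexdigest()
    return f"{value}:{mac}"


def verify(signed: str, secret_key: str) -> bool:
    """Recompute the signature with secret_key and compare in constant time."""
    value, _, mac = signed.rpartition(":")
    expected = hmac.new(secret_key.encode(), value.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(mac, expected)


# Hypothetical keys standing in for the SECRET_KEY of two Horizon nodes.
NODE1_KEY = "key-from-node-1"
NODE2_KEY = "key-from-node-2"

cookie = sign("session-id-123", NODE1_KEY)
print(verify(cookie, NODE1_KEY))  # same key: the signature checks out
print(verify(cookie, NODE2_KEY))  # different key: the signature is rejected
```

Any value signed by the first node only validates on a node that holds the same key, which is why the failures follow the load balancer's routing.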

2. Memcached / session backend down

Most deployments follow the installation guide and store Horizon sessions in memcached through Django's cache session engine. If the backend is unreachable, every request that touches the session 500s with a cache backend error.

grep -A4 CACHES /etc/openstack-dashboard/local_settings.py
# 'quit' makes memcached close the connection so nc exits instead of hanging
printf 'stats\nquit\n' | nc 127.0.0.1 11211 | head

'BACKEND': 'django.core.cache.backends.memcached.PyMemcacheCache',
'LOCATION': '127.0.0.1:11211',

No response from nc to port 11211 means memcached is down or firewalled.
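The same probe can be scripted for a health check. This is a minimal sketch that sends the memcached text-protocol `version` command over a plain socket; the function name `memcached_alive` and the default timeout are assumptions, not part of Horizon.

```python
import socket


def memcached_alive(host: str = "127.0.0.1", port: int = 11211, timeout: float = 2.0) -> bool:
    """Return True if the server answers the text-protocol 'version' command."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"version\r\n")
            reply = sock.recv(64)
    except OSError:
        # Refused, timed out, unroutable: all count as "not reachable".
        return False
    return reply.startswith(b"VERSION")
```

Calling `memcached_alive()` from each Horizon node, against the address listed in CACHES, distinguishes a dead daemon from a firewall that only blocks some nodes.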

3. Missing compiled static assets

After an upgrade or fresh deploy, if collectstatic/compress never ran, Horizon can 500 on the offline-compressed templates (or render unstyled and break on JS).

ls /var/lib/openstack-dashboard/static/dashboard/ 2>/dev/null | head
docker exec horizon ls /var/lib/kolla/venv/lib/python*/site-packages/static 2>/dev/null | head

ls: cannot access ...: No such file or directory

A missing or empty static tree means assets were never collected/compressed.
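For a post-deploy check, the directory test can be sketched in a few lines. The helper name `missing_static_entries` is hypothetical, and the default expectation of a `dashboard` subdirectory simply mirrors the path listed above; adjust it to whatever your packaging produces.

```python
from pathlib import Path


def missing_static_entries(static_root, expected=("dashboard",)):
    """Return the expected subdirectories that are absent or empty."""
    root = Path(static_root)
    missing = []
    for name in expected:
        sub = root / name
        if not sub.is_dir() or not any(sub.iterdir()):
            missing.append(name)
    return missing
```

An empty list from `missing_static_entries("/var/lib/openstack-dashboard/static")` suggests the assets were collected; any returned name is a directory to regenerate.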

4. Keystone endpoint / SSL / CA issues

Horizon calls Keystone at OPENSTACK_KEYSTONE_URL. A wrong URL, an expired cert, or a CA the Horizon node does not trust raises an SSL or connection error mid-login.

grep -E 'OPENSTACK_KEYSTONE_URL|OPENSTACK_SSL_CACERT' /etc/openstack-dashboard/local_settings.py
curl -sk https://keystone.example.com:5000/v3 -o /dev/null -w '%{http_code}\n'

OPENSTACK_KEYSTONE_URL = "https://keystone.example.com:5000/v3"
000

000 (or a cert error without -k) means the Horizon node cannot reach or trust Keystone.
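curl's 000 does not say whether the failure was the network or the certificate. The following sketch separates the two with the standard library; the function name `probe_keystone` and its return labels are assumptions made for illustration, and `cafile` would be the bundle your deployment sets in OPENSTACK_SSL_CACERT.

```python
import ssl
import urllib.error
import urllib.request


def probe_keystone(url, cafile=None, timeout=5.0):
    """Classify a GET against url as 'ok', 'http-NNN', 'tls-error' or 'unreachable'."""
    # With cafile=None the system trust store is used.
    context = ssl.create_default_context(cafile=cafile)
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=context) as resp:
            status = resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, just not with a success code.
        status = exc.code
    except urllib.error.URLError as exc:
        if isinstance(exc.reason, ssl.SSLError):
            return "tls-error"
        return "unreachable"
    except OSError:
        # For example a read timeout after the connection was established.
        return "unreachable"
    return "ok" if 200 <= status < 400 else f"http-{status}"
```

A result of `tls-error` points at the certificate or CA bundle, while `unreachable` points at the URL, DNS, or a firewall.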

5. policy.yaml / local_settings syntax error

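A Python syntax error in local_settings.py or a bad override makes Django fail to import settings and 500 on every request. A malformed policy.yaml is parsed later, when a view first evaluates a policy rule, and fails the requests that reach that point.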

docker exec horizon python -c "import openstack_dashboard.settings" 2>&1 | tail -5
python3 -c "import ast; ast.parse(open('/etc/openstack-dashboard/local_settings.py').read())"

SyntaxError: invalid syntax (local_settings.py, line 212)

Any import/parse error here breaks the whole dashboard.
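The one-line parse check can be expanded into a small function that reports the file and line, which is convenient in a deploy pipeline. This is a sketch; the name `settings_syntax_error` is hypothetical, and it only catches syntax errors, not failing imports or bad values.

```python
import ast


def settings_syntax_error(path):
    """Return None if the file parses as Python, else 'path:line: message'."""
    with open(path, encoding="utf-8") as handle:
        source = handle.read()
    try:
        ast.parse(source, filename=path)
    except SyntaxError as exc:
        return f"{path}:{exc.lineno}: {exc.msg}"
    return None
```

Running it against /etc/openstack-dashboard/local_settings.py before restarting the web server keeps a typo from taking the dashboard down.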

6. ALLOWED_HOSTS, WSGI venv, or time skew

An ALLOWED_HOSTS list that omits the requested hostname raises DisallowedHost, a SuspiciousOperation subclass that Django answers with HTTP 400 rather than 500; mod_wsgi pointed at the wrong Python venv fails to import Horizon; clock skew between nodes invalidates signed sessions/tokens.

grep -E 'ALLOWED_HOSTS' /etc/openstack-dashboard/local_settings.py
grep -R 'WSGIDaemonProcess\|python-home' /etc/apache2/sites-enabled/ /etc/httpd/conf.d/ 2>/dev/null
timedatectl | grep 'synchronized'

ALLOWED_HOSTS = ['horizon.internal']
System clock synchronized: no

A missing hostname and an unsynced clock both surface as intermittent errors, typically only for the users or nodes affected.
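The host check itself is simple pattern matching. The sketch below is a simplified reimplementation of the matching rules Django documents for ALLOWED_HOSTS (exact match, a leading dot for subdomains, and `*` for everything); it is not Django's own code, and it expects the host with any port already stripped.

```python
def host_allowed(host, allowed_hosts):
    """Return True if host matches any ALLOWED_HOSTS-style pattern."""
    host = host.lower()
    for pattern in allowed_hosts:
        pattern = pattern.lower()
        if pattern == "*" or pattern == host:
            return True
        # A leading dot matches the bare domain and any of its subdomains.
        if pattern.startswith(".") and (host.endswith(pattern) or host == pattern[1:]):
            return True
    return False


# The situation from the example output: the requested name is not listed.
print(host_allowed("dash.example.com", ["horizon.internal"]))
print(host_allowed("dash.example.com", ["horizon.internal", ".example.com"]))
```

Every name users can reach the dashboard by, including the load balancer's virtual hostname, has to pass this check.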

Diagnostic Workflow

Step 1: Read the real traceback in the web server log

# Kolla-Ansible
docker logs horizon 2>&1 | grep -iE "Internal Server Error|Traceback|Error" | tail -30
# Debian/Ubuntu
sudo tail -f /var/log/apache2/error.log
# RHEL/CentOS
sudo tail -f /var/log/httpd/error_log   # or horizon_error.log if the vhost sets its own ErrorLog

The browser shows nothing useful; the Django exception type in this log names the subsystem (cache, keystone, settings, host).
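As a rough illustration of reading the exception type, the sketch below maps substrings of a log line to the causes covered in this guide. The hint table is an assumption built from the sample log lines here, not an exhaustive or official list, and the first matching hint wins.

```python
# (substring, subsystem) pairs, checked in order; all lowercase.
SUBSYSTEM_HINTS = [
    ("allowed_hosts", "ALLOWED_HOSTS"),
    ("memcache", "session/cache backend"),
    ("caches", "session/cache backend"),
    ("badsignature", "SECRET_KEY or session data"),
    ("certificate", "Keystone TLS/CA"),
    ("sslerror", "Keystone TLS/CA"),
    ("syntaxerror", "settings file"),
    ("improperlyconfigured", "settings"),
]


def guess_subsystem(log_line):
    """Return the first subsystem whose hint appears in the log line."""
    lowered = log_line.lower()
    for needle, subsystem in SUBSYSTEM_HINTS:
        if needle in lowered:
            return subsystem
    return "unknown"


print(guess_subsystem("[wsgi:error] django.core.signing.BadSignature: Session data corrupted"))
```

A guess of "unknown" just means the traceback needs to be read in full; the table is a triage shortcut, not a diagnosis.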

Step 2: Validate settings and policy import

# Kolla
docker exec horizon python -c "import openstack_dashboard.settings" 2>&1 | tail -5
# Traditional
sudo -u www-data python3 -c "import openstack_dashboard.settings" 2>&1 | tail -5   # user is apache on RHEL/CentOS

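A clean import rules out syntax and import errors in the settings files; a traceback points straight at the offending file and line. Policy files are read later, when a view first evaluates a rule, so a malformed policy.yaml shows up in the error log from Step 1 rather than here.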

Step 3: Check the session backend and SECRET_KEY consistency

# 'quit' makes memcached close the connection so nc exits instead of hanging
printf 'stats\nquit\n' | nc 127.0.0.1 11211 | grep -E 'curr_connections|uptime'
for n in horizon-01 horizon-02; do
  ssh $n "grep SECRET_KEY /etc/openstack-dashboard/local_settings.py"
done

Confirm memcached answers and that every node prints the same SECRET_KEY.
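When the settings text from each node is collected by a script, the comparison can be done on fingerprints so the raw key never lands in a terminal or log. This is a sketch with hypothetical helper names; it only understands a literal `SECRET_KEY = '...'` assignment, so a key loaded from a file or environment variable is reported as missing.

```python
import hashlib
import re

# Matches a top-level literal assignment with single or double quotes.
KEY_LINE = re.compile(r"""^\s*SECRET_KEY\s*=\s*(['"])(.*?)\1""", re.MULTILINE)


def key_fingerprint(settings_text):
    """Return a short SHA-256 fingerprint of the SECRET_KEY literal, or None."""
    match = KEY_LINE.search(settings_text)
    if match is None:
        return None
    return hashlib.sha256(match.group(2).encode()).hexdigest()[:12]


def keys_consistent(settings_by_node):
    """True when every node has a SECRET_KEY literal and all values are identical."""
    fingerprints = {key_fingerprint(text) for text in settings_by_node.values()}
    return len(fingerprints) == 1 and None not in fingerprints
```

Feeding it a mapping such as `{"horizon-01": text1, "horizon-02": text2}` returns False as soon as one node differs or has no literal key.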

Step 4: Verify Keystone reachability and static assets

# Assumes OPENSTACK_KEYSTONE_URL is set to a literal double-quoted URL
curl -sk "$(grep '^OPENSTACK_KEYSTONE_URL' /etc/openstack-dashboard/local_settings.py | cut -d'"' -f2)" \
  -o /dev/null -w 'keystone: %{http_code}\n'
ls /var/lib/openstack-dashboard/static/ | head

A non-2xx/3xx from Keystone or an empty static dir is your cause.

Step 5: Turn on DEBUG temporarily, reproduce, then turn it off

# In local_settings.py set DEBUG = True, then:
# Kolla
docker restart horizon
# Traditional
sudo systemctl restart apache2   # or httpd

Reload the page to see the full Django error in the browser, fix the cause, then set DEBUG = False and restart. Never leave DEBUG on in production, where the debug page would show tracebacks and settings to any user who hits an error.

Example Root Cause Analysis

After scaling Horizon from one node to two behind a load balancer, users report that login works, but the dashboard then randomly throws Something went wrong! on the next click.

The Apache error log on horizon-02 shows the giveaway:

[wsgi:error] Internal Server Error: /project/
[wsgi:error] django.core.signing.BadSignature: Session data corrupted

Sessions signed on one node are rejected by the other. Comparing the keys:

for n in horizon-01 horizon-02; do
  ssh $n "grep SECRET_KEY /etc/openstack-dashboard/local_settings.py"
done

SECRET_KEY = 'a1b2c3...one'
SECRET_KEY = 'd4e5f6...two'

The two nodes generated independent SECRET_KEY values, so whichever node serves the follow-up request cannot validate the session cookie.

Fix: set the same key on both nodes (and point them at shared memcached), then restart:

# set identical SECRET_KEY on both nodes (Kolla: horizon_secret_key)
sudo systemctl restart apache2     # docker restart horizon for Kolla

Logins and subsequent clicks now work regardless of which node the LB picks.
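The resulting settings on both nodes might look like the fragment below. It is a sketch of a local_settings.py excerpt, not a complete file: the key is a placeholder and the memcached hostnames are hypothetical.

```python
# local_settings.py fragment (sketch); keep it identical on horizon-01 and horizon-02.

# Placeholder only: generate one random value and deploy it to every node.
SECRET_KEY = "replace-with-one-shared-random-value"

# Store sessions in the cache so either node can read them.
SESSION_ENGINE = "django.contrib.sessions.backends.cache"

CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.memcached.PyMemcacheCache",
        # Hypothetical hosts; list the same servers in the same order on each node.
        "LOCATION": ["memcached-01:11211", "memcached-02:11211"],
    }
}
```

Listing the servers in the same order on every node matters because the client picks a server per key from that list.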

Prevention Best Practices

Pin a single shared SECRET_KEY (and shared memcached) across all Horizon nodes; in Kolla set horizon_secret_key once so every node deploys identically.
Health-check memcached and alert on it — a dead session backend takes the whole dashboard down with 500s.
Bake collectstatic + compress into the deploy/upgrade pipeline so static assets are never missing after a release.
Keep ALLOWED_HOSTS, OPENSTACK_KEYSTONE_URL, and the CA bundle in config management, and smoke-test curl to Keystone from each Horizon node post-deploy.
Enforce NTP/chrony across nodes; clock skew silently invalidates signed sessions and tokens.
Use DEBUG = True only to capture a traceback during triage, then switch it back off immediately.
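The last point can be enforced by a pipeline guard. The sketch below parses the settings source and reports whether DEBUG ends up assigned a literal True; the function name `debug_is_on` is hypothetical, and it ignores values computed at runtime or set in other files.

```python
import ast


def debug_is_on(settings_source):
    """True if the last top-level literal assignment to DEBUG is True."""
    value = False
    for node in ast.parse(settings_source).body:
        if isinstance(node, ast.Assign):
            targets = [t.id for t in node.targets if isinstance(t, ast.Name)]
            if "DEBUG" in targets and isinstance(node.value, ast.Constant):
                value = node.value.value is True
    return value
```

Failing the deploy when `debug_is_on(...)` returns True for a production settings file catches a DEBUG flag left behind after triage.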

Quick Command Reference

# Read the real traceback
docker logs horizon 2>&1 | grep -iE "Internal Server Error|Traceback" | tail -30
sudo tail -50 /var/log/apache2/error.log      # Debian/Ubuntu
sudo tail -50 /var/log/httpd/error_log        # RHEL/CentOS (or horizon_error.log if the vhost sets its own ErrorLog)

# Validate settings / policy import
docker exec horizon python -c "import openstack_dashboard.settings" 2>&1 | tail -5

# Session backend + SECRET_KEY consistency
printf 'stats\nquit\n' | nc 127.0.0.1 11211 | grep -E 'curr_connections|uptime'
grep SECRET_KEY /etc/openstack-dashboard/local_settings.py

# Keystone reachability and static assets
curl -sk https://keystone.example.com:5000/v3 -o /dev/null -w '%{http_code}\n'
ls /var/lib/openstack-dashboard/static/ | head

# Re-collect static after upgrade (traditional)
sudo /usr/share/openstack-dashboard/manage.py collectstatic --noinput
sudo /usr/share/openstack-dashboard/manage.py compress --force

# Toggle DEBUG and restart
# set DEBUG = True/False in local_settings.py, then:
sudo systemctl restart apache2    # or httpd; docker restart horizon for Kolla

Conclusion

A Horizon Something went wrong! / HTTP 500 is a generic mask over a Django exception that is always recorded in the Apache/httpd (or Kolla container) error log. The usual root causes:

A SECRET_KEY mismatch across load-balanced nodes.
A down or unreachable memcached/session backend.
Missing compiled static assets (collectstatic/compress never ran).
An unreachable Keystone endpoint or untrusted SSL CA.
A policy.yaml or local_settings.py syntax/import error.
A missing ALLOWED_HOSTS entry, a wrong WSGI venv, or node time skew.

Always start with the web server error log to get the real traceback, flip DEBUG on only long enough to reproduce, and fix the named subsystem — for load-balanced deployments the culprit is most often an out-of-sync key or session backend.
