OVN Control Plane Deep Dive Prompt

Debug the OVN control plane: Northbound/Southbound databases, ovn-northd, ovn-controller, logical flows, and raft cluster health.

Target user
Senior network engineers running OVN-based OpenStack networking
Difficulty
Advanced
Tools
Claude, ChatGPT

The prompt

You are a senior OVN engineer who has operated OVN at large scale — multi-DB raft clusters, distributed gateway routers, logical flow troubleshooting, network agent replacement.

I will provide:
- The symptom (logical flow not present, controller out of sync, NB/SB DB raft issue, scale problem with high flow count)
- Output of `ovn-nbctl show`, `ovn-sbctl show`, and `cluster/status` for both databases (`ovn-appctl -t <db-control-socket> cluster/status OVN_Northbound` and `... OVN_Southbound`)
- ovn-northd / ovn-controller logs
- The cluster topology (number of controllers, gateway chassis)

Your job:

1. **Understand the data flow**:
   - Neutron writes to NB DB (logical topology: switches, routers, ports)
   - ovn-northd reads NB, computes logical flows, writes to SB DB
   - ovn-controller on each chassis reads SB, programs OVS flows
2. **For "logical flow missing"**:
   - Verify NB entity exists (`ovn-nbctl show`, list specific)
   - Verify northd processed it (`ovn-sbctl lflow-list <datapath>`, or `ovn-sbctl find <table> <column>=<value>`)
   - Verify chassis processed it (OVS flows on the node)
   - Each step has logs
3. **For raft cluster issues**:
   - `ovn-appctl -t <db-control-socket> cluster/status <DB>` shows raft state for the member it runs on; check every member of each DB
   - Leader, term, votes
   - Lost quorum = no writes possible
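   - Recovery after permanent quorum loss: convert a surviving member to standalone (`ovsdb-tool cluster-to-standalone`), rebuild the cluster from it, then join the other members again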
4. **For ovn-controller per-chassis issues**:
   - `ovs-vsctl show` for OVS state
   - `ovs-ofctl dump-flows br-int` for actual flows
   - Controller reads SB and programs OVS; a disconnected controller receives no updates, so its flows go stale
5. **For gateway chassis**:
   - East-west routing is distributed: every chassis runs the logical router pipeline locally (OVN's native counterpart to ML2/OVS DVR)
   - Centralized routing via gateway chassis (NAT, North-South traffic)
   - HA gateway = multiple chassis with priority
6. **For scale**:
   - Large NB/SB DBs slow processing
   - Northd reprocessing on every change
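   - Parallel logical-flow build in northd where the OVN release supports it (`ovn-appctl -t ovn-northd parallel-build/set-n-threads <N>`)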
7. **For DB compaction**:
   - DBs grow with operations
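   - Clustered DBs: `ovsdb-server/compact` via `ovn-appctl` on each member, one at a time; `ovsdb-tool compact` only works offline on standalone DBs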

Mark DESTRUCTIVE: forcing leader election on a healthy cluster, modifying NB/SB DBs outside ovn tools (corrupts), restarting ovn-controller on many nodes simultaneously (cluster-wide flow update).

---

Topology: [DESCRIBE]
Symptom: [DESCRIBE]
`ovn-appctl -t <nb-db-control-socket> cluster/status OVN_Northbound`:
```
[PASTE]
```
`ovn-appctl -t <sb-db-control-socket> cluster/status OVN_Southbound`:
```
[PASTE]
```
ovn-northd / ovn-controller logs:
```
[PASTE]
```

Why this prompt works

OVN is the modern Neutron backend, but its debugging tools differ from those used with ML2/OVS. This prompt walks through them layer by layer: NB DB, ovn-northd, SB DB, ovn-controller, and OVS.

How to use it

  1. Verify raft cluster health first; without quorum, every downstream check is unreliable.
  2. Walk NB → SB → OVS for missing flows.
  3. For scale issues, monitor northd processing time.
  4. Stagger control plane operations to avoid storms.
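
Step 1 can be scripted. The following is a minimal shell sketch, assuming the `Key: value` layout that `cluster/status` prints (`Status`, `Role`, `Term`, `Leader`). `raft_summary` is a hypothetical helper name, and the assumption that a member without a known leader reports `Leader: unknown` should be checked against your OVN version.

```shell
# Summarize raft state from `cluster/status` text read on stdin.
# Assumption: fields appear as "Key: value" lines; names may differ by version.
raft_summary() {
    awk -F': ' '
        $1 == "Status" { status = $2 }
        $1 == "Role"   { role = $2 }
        $1 == "Term"   { term = $2 }
        $1 == "Leader" { leader = $2 }
        END {
            printf "status=%s role=%s term=%s leader=%s\n", status, role, term, leader
            # Assumption: a member that knows no leader prints "unknown" (or nothing).
            if (leader == "" || leader == "unknown") exit 1
        }'
}
```

A possible invocation, with the control socket path being an assumption that varies by distribution: `sudo ovn-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound | raft_summary`. A non-zero exit status means this member does not currently see a leader.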

Useful commands

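# Cluster status (run on each DB member; control socket paths vary by distribution)
sudo ovn-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound
sudo ovn-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound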

# NB inspection
ovn-nbctl show
ovn-nbctl list logical_switch
ovn-nbctl list logical_router
ovn-nbctl list acl
ovn-nbctl list logical_switch_port

# SB inspection (computed by northd)
ovn-sbctl show
ovn-sbctl list chassis
ovn-sbctl list datapath_binding
ovn-sbctl lflow-list <datapath>
ovn-sbctl get-ssl

# Trace a packet through OVN
ovn-trace <logical-switch> 'inport=="port-1" && eth.src==00:00:00:00:00:01 && eth.dst==00:00:00:00:00:02'

# Per-chassis (compute node)
sudo ovs-vsctl show
sudo ovs-ofctl dump-flows br-int | head
sudo ovs-appctl ofproto/trace br-int <packet-spec>

# Logs
sudo journalctl -u ovn-northd -n 100 --no-pager
sudo journalctl -u ovn-controller -n 100 --no-pager       # on each chassis
sudo journalctl -u ovsdb-server -n 100 --no-pager

# DB stats (file and socket paths vary by distribution)
ovsdb-tool show-log /etc/ovn/ovnnb_db.db | head
ovsdb-client list-dbs unix:/var/run/ovn/ovnnb_db.sock

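# Compact DB (during maintenance window; on each cluster member, one at a time)
# Control socket paths vary by distribution.
sudo ovn-appctl -t /var/run/ovn/ovnnb_db.ctl ovsdb-server/compact OVN_Northbound
sudo ovn-appctl -t /var/run/ovn/ovnsb_db.ctl ovsdb-server/compact OVN_Southbound
# Standalone (non-clustered) DB only: stop its ovsdb-server first, then
# sudo ovsdb-tool compact /etc/ovn/ovnnb_db.db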
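
The NB → SB → OVS walk for a single port can be sketched as one function. This is an illustrative sketch, not a definitive tool: `walk_port` and the override variables `OVN_NBCTL`, `OVN_SBCTL`, and `OVS_VSCTL` are hypothetical names introduced here, and the OVS lookup assumes the function runs on the chassis that hosts the port.

```shell
# Walk one logical port NB -> SB -> OVS and report the first layer where it is missing.
# OVN_NBCTL / OVN_SBCTL / OVS_VSCTL are hypothetical overrides (e.g. to add sudo or DB options).
walk_port() {
    port=$1
    # NB: does Neutron's logical switch port exist?
    nb=$(${OVN_NBCTL:-ovn-nbctl} --bare --columns=_uuid find Logical_Switch_Port "name=$port")
    [ -n "$nb" ] || { echo "NB: no Logical_Switch_Port named $port"; return 1; }
    # SB: did northd create a Port_Binding, and did a chassis claim it?
    sb=$(${OVN_SBCTL:-ovn-sbctl} --bare --columns=chassis find Port_Binding "logical_port=$port")
    [ -n "$sb" ] || { echo "SB: no Port_Binding for $port, or not bound to a chassis"; return 2; }
    # OVS: is there a local interface carrying this iface-id?
    of=$(${OVS_VSCTL:-ovs-vsctl} --bare --columns=ofport find Interface "external_ids:iface-id=$port")
    [ -n "$of" ] || { echo "OVS: no interface with iface-id=$port on this node"; return 3; }
    echo "ok: nb=$nb sb_chassis=$sb ofport=$of"
}
```

The return code identifies the failing layer (1 = NB, 2 = SB, 3 = OVS), which tells you whether to read Neutron, ovn-northd, or ovn-controller logs next.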

Common findings this catches

  • Raft cluster lost quorum → bring failed members back if possible; otherwise rebuild from a surviving member (`ovsdb-tool cluster-to-standalone`) and rejoin the others.
  • Logical flow missing → walk NB → SB; northd may not have processed.
  • ovn-controller disconnected → check SB connectivity; restart controller on chassis.
  • Slow northd processing → check NB/SB size; consider compaction and, where the release supports it, parallel or incremental flow computation.
  • Gateway chassis failover not happening → priorities misconfigured.
  • OVS flows mismatch SB → controller out of sync; check chassis registration.
  • DB growth unbounded → no compaction scheduled.
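
The gateway priority finding above can be checked mechanically. The sketch below assumes `ovn-nbctl lrp-get-gateway-chassis <lrp>` prints one gateway chassis per line with the priority as the last field; `gw_priority_check` is a hypothetical helper name, and treating equal priorities as a misconfiguration is an assumption (they leave the failover order ambiguous).

```shell
# Read `ovn-nbctl lrp-get-gateway-chassis <lrp>` output on stdin.
# Assumption: one "name priority" pair per line, priority in the last field.
gw_priority_check() {
    awk '
        BEGIN { bad = 0 }
        NF >= 2 { n++; seen[$NF]++ }
        END {
            if (n < 2) { print "no HA: fewer than two gateway chassis"; bad = 1 }
            for (p in seen) if (seen[p] > 1) { print "duplicate priority " p; bad = 1 }
            if (!bad) print "ok: " n " gateway chassis with distinct priorities"
            exit bad
        }'
}
```

A possible invocation: `ovn-nbctl lrp-get-gateway-chassis <lrp> | gw_priority_check`, where `<lrp>` is the logical router port that carries the gateway chassis list.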

When to escalate

  • Major scale events — engage OVN upstream / vendor.
  • Production raft failures — restoration from backup.
  • OVS / OVN version mismatch — coordinated upgrade.
