You are a senior Kubernetes engineer who has designed multi-zone HA workload placement with `topologySpreadConstraints`. You know that misconfigured spread (`DoNotSchedule` + tiny cluster + low `maxSkew`) is a self-induced FailedScheduling.

I will provide:
- The workload (Deployment/StatefulSet name)
- Cluster topology: zones and node count per zone (`kubectl get nodes --show-labels | grep topology.kubernetes.io/zone`)
- Current pod placement (`kubectl get pods -l <selector> -o wide`)
- The `topologySpreadConstraints` block from the pod spec
- The symptom: pods stuck Pending, uneven distribution, or scaling causes Pending

Your job:

1. **Decode the constraint**:
   - **`topologyKey`**: the node label defining the bucket (e.g., `topology.kubernetes.io/zone` for AZ, `kubernetes.io/hostname` for node)
   - **`maxSkew`**: max difference in matching-pod count between buckets
   - **`whenUnsatisfiable`**: `DoNotSchedule` (hard; rejects the pod if violated) or `ScheduleAnyway` (soft; prefers but allows)
   - **`labelSelector`**: pods to count when computing skew
   - **`minDomains`** (1.27+): only valid with `DoNotSchedule`; while fewer eligible buckets than this exist, the scheduler treats the global minimum as 0, so no bucket receives more than `maxSkew` pods. Useful when more zones are expected to come online
2. **Compute the current skew**:
   - Group pods by topology label value of the node they run on
   - Skew = max(group counts) - min(group counts) among pods matching `labelSelector`
   - Adding a new pod: which bucket does the scheduler prefer? The smallest one that satisfies all other filters
3. **For "pods Pending after scale-up"**:
   - Current spread is already at `maxSkew` and only the fullest buckets have room; placing the pod there would increase skew further → blocked
   - Single-zone cluster + `topologyKey: zone` + `maxSkew: 1, DoNotSchedule` → skew is measured only across zones that exist, so with one zone it is always 0 and the constraint alone never blocks. If pods are still Pending, check `minDomains` (a value above the real zone count caps each zone at `maxSkew` pods), nodes missing the zone label (skipped under `DoNotSchedule`), and capacity
   - Fix: `ScheduleAnyway`, raise `maxSkew`, lower or remove `minDomains`, or add capacity/zones
4. **For "skew higher than expected"**:
   - `nodeAffinity` excluding nodes from a bucket → fewer schedulable nodes there
   - `whenUnsatisfiable: ScheduleAnyway` lets the scheduler exceed `maxSkew` under pressure
   - Other constraints conflict: `podAntiAffinity` is a separate filter, so a node must pass both, and it can rule out the node that spread would have picked
5. **For combining multiple constraints**:
   - Multiple `topologySpreadConstraints` entries all apply
   - Common: spread across zones AND nodes (`zone` constraint + `hostname` constraint)
   - All must be satisfied (effectively AND)
6. **For interaction with HPA**:
   - HPA scales replicas; new pods must fit the spread
   - Going from 3 → 4 replicas with 3 zones: where does pod 4 go? Any zone; the distribution becomes 2-1-1 (skew 1).
   - Going from 4 → 5: one of the two zones holding 1 pod; the distribution becomes 2-2-1 (skew 1). OK if `maxSkew` ≥ 1.
7. **For init / migration patterns**:
   - **`minDomains`** keeps early pods from piling into the first zones that exist (avoids a 1-zone start that skews future spread)
   - For day-1 single-zone clusters that plan to be multi-zone: start with `ScheduleAnyway`, switch to `DoNotSchedule` after expanding

Mark DESTRUCTIVE: changing `maxSkew` live (existing pods don't move; new pods may face unexpected constraints), `whenUnsatisfiable: DoNotSchedule` without verifying buckets exist (locks future scheduling).

---

Workload: [Deployment/StatefulSet + namespace]
Current pod count + intended replicas: [DESCRIBE]
Cluster zone topology: [PASTE `kubectl get nodes -L topology.kubernetes.io/zone`]
Pods now (with zone): [PASTE `kubectl get pods -l <selector> -o wide`]
Spread constraints from pod spec:
```yaml
[PASTE topologySpreadConstraints]
```
Symptom: [DESCRIBE]

Why this prompt works

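Topology spread is powerful, but misconfigured spread means “pods Pending forever.” The cluster might look big, yet under DoNotSchedule a new pod may only land in a zone that keeps the skew within maxSkew. If that zone is full, tainted, or (with minDomains set) does not exist yet, the scheduler refuses. This prompt enforces a topology-aware diagnosis.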
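
To make the skew rule concrete, here is a minimal Python sketch of the check a hard (DoNotSchedule) constraint applies. It is a simplified model and not the scheduler's own code: `zones_allowed` is a hypothetical helper, the zone names are invented, and node capacity, taints, and affinity are ignored.

```python
def zones_allowed(pods_per_zone: dict[str, int], max_skew: int) -> list[str]:
    """Zones where one more matching pod keeps skew <= max_skew.

    Skew for a candidate zone is (its count + 1) minus the global
    minimum across all zones that currently exist.
    """
    global_min = min(pods_per_zone.values())
    return sorted(
        zone
        for zone, count in pods_per_zone.items()
        if (count + 1) - global_min <= max_skew
    )


# Assumed example placement: 4 replicas over 3 zones.
current = {"zone-a": 2, "zone-b": 1, "zone-c": 1}
print("current skew:", max(current.values()) - min(current.values()))
print("next pod may go to:", zones_allowed(current, max_skew=1))

# With a single zone the global minimum is that zone's own count,
# so the constraint by itself never blocks.
print("single zone:", zones_allowed({"zone-a": 5}, max_skew=1))
```

In this model the next pod can only go to zone-b or zone-c; if neither has schedulable capacity, the pod stays Pending even though zone-a has room.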

How to use it

Confirm the cluster’s topology BEFORE designing constraints. Count actual zones / nodes.
Start with ScheduleAnyway in production; flip to DoNotSchedule only after verifying placement works.
For multi-zone clusters, use zone constraints; for single-zone, use node (hostname) constraints.
Combine carefully with anti-affinity: both are scheduler filters, so a node must satisfy the spread constraint and the anti-affinity rule at the same time.

Useful commands

# Cluster topology
kubectl get nodes --show-labels | grep -oE 'topology.kubernetes.io/zone=[a-z0-9-]+'
kubectl get nodes -o json | jq '.items[] | {name:.metadata.name, zone:.metadata.labels["topology.kubernetes.io/zone"]}'
# Count nodes per zone (nodes without the label are counted as "none")
kubectl get nodes -o json | jq -r '.items[].metadata.labels["topology.kubernetes.io/zone"] // "none"' | sort | uniq -c

# Pod placement (the zone is a node label, not a pod label, so resolve it through the node)
kubectl get pods -l <selector> -o wide --sort-by=.spec.nodeName
kubectl get pods -l <selector> -o json | \
    jq '.items[] | {name:.metadata.name, node:(.spec.nodeName // "unscheduled"), phase:.status.phase}'

# Per-zone pod count (Pending pods have no nodeName and are skipped)
kubectl get pods -l <selector> -o json | \
    jq -r '.items[].spec.nodeName // empty' | \
    while read -r n; do kubectl get node "$n" -o jsonpath='{.metadata.labels.topology\.kubernetes\.io/zone}'; echo; done | \
    sort | uniq -c

# Scheduler decisions (most recent last)
kubectl get events --field-selector reason=FailedScheduling --sort-by=.lastTimestamp | tail

# Test changes safely (validate with a server-side dry run, then apply in staging first)
kubectl patch deploy <name> --type='strategic' -p '...' --dry-run=server -o yaml
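
As an alternative to one `kubectl get node` call per pod, the join between pods and node zones can be done offline. The following is a rough Python sketch, assuming the two dictionaries have the shape of `kubectl get nodes -o json` and `kubectl get pods -l <selector> -o json`; `pods_per_zone` and the sample data are hypothetical and only there to illustrate.

```python
from collections import Counter

ZONE_KEY = "topology.kubernetes.io/zone"


def pods_per_zone(nodes_json: dict, pods_json: dict, key: str = ZONE_KEY) -> Counter:
    """Count pods per topology domain by looking up each pod's node label."""
    node_zone = {
        node["metadata"]["name"]: node["metadata"].get("labels", {}).get(key, "<unlabeled>")
        for node in nodes_json["items"]
    }
    counts = Counter()
    for pod in pods_json["items"]:
        node_name = pod["spec"].get("nodeName")
        if node_name is None:
            counts["<pending>"] += 1  # not scheduled yet, so no zone
        else:
            counts[node_zone.get(node_name, "<unknown-node>")] += 1
    return counts


# Invented sample data in the same shape as the kubectl JSON output.
nodes = {"items": [
    {"metadata": {"name": "n1", "labels": {ZONE_KEY: "zone-a"}}},
    {"metadata": {"name": "n2", "labels": {ZONE_KEY: "zone-b"}}},
]}
pods = {"items": [
    {"spec": {"nodeName": "n1"}},
    {"spec": {"nodeName": "n1"}},
    {"spec": {"nodeName": "n2"}},
    {"spec": {}},  # a Pending pod
]}

for zone, count in sorted(pods_per_zone(nodes, pods).items()):
    print(f"{count:3d} {zone}")
```

With real data, the two dictionaries could be loaded with `json.load` from files produced by the kubectl commands.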

Patterns

Zone spread + soft

topologySpreadConstraints:
- maxSkew: 1
  topologyKey: topology.kubernetes.io/zone
  whenUnsatisfiable: ScheduleAnyway
  labelSelector:
    matchLabels: { app: web }

Strict zone spread (multi-zone cluster required)

topologySpreadConstraints:
- maxSkew: 1
  topologyKey: topology.kubernetes.io/zone
  whenUnsatisfiable: DoNotSchedule
  minDomains: 3                          # with fewer than 3 zones, each zone is capped at maxSkew pods
  labelSelector:
    matchLabels: { app: web }
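
The effect of minDomains is easy to misread, so here is a small Python sketch of the documented rule: while fewer domains exist than minDomains, the global minimum is treated as 0. `can_place` is a hypothetical helper and the model ignores capacity and other filters.

```python
from typing import Optional


def can_place(pods_per_zone: dict[str, int], zone: str, max_skew: int,
              min_domains: Optional[int] = None) -> bool:
    """True if one more pod in `zone` keeps the hard constraint satisfied."""
    global_min = min(pods_per_zone.values())
    if min_domains is not None and len(pods_per_zone) < min_domains:
        # Fewer zones than minDomains: count as if an empty zone existed.
        global_min = 0
    return (pods_per_zone[zone] + 1) - global_min <= max_skew


two_zones = {"zone-a": 1, "zone-b": 1}
print(can_place(two_zones, "zone-a", max_skew=1))                 # no minDomains
print(can_place(two_zones, "zone-a", max_skew=1, min_domains=3))  # minDomains: 3
```

Under these assumptions a two-zone cluster with `minDomains: 3` and `maxSkew: 1` accepts one pod per zone and then leaves further replicas Pending until a third zone appears.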

Zone + node spread (HA per zone AND per node)

topologySpreadConstraints:
- maxSkew: 1
  topologyKey: topology.kubernetes.io/zone
  whenUnsatisfiable: DoNotSchedule
  labelSelector:
    matchLabels: { app: db }
- maxSkew: 1
  topologyKey: kubernetes.io/hostname
  whenUnsatisfiable: DoNotSchedule
  labelSelector:
    matchLabels: { app: db }

(Combine with pod anti-affinity for “never two pods on the same node” if needed.)
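
To illustrate how two hard constraints combine, the Python sketch below keeps only the nodes that pass both the zone check and the hostname check. It is a simplified model: `feasible_nodes`, the node names, and the placements are hypothetical, and resources, taints, and affinity are left out.

```python
from collections import Counter

ZONE = "topology.kubernetes.io/zone"
HOST = "kubernetes.io/hostname"


def feasible_nodes(nodes: dict, pod_nodes: list, constraints: list) -> list:
    """Nodes where a new pod satisfies every (topology_key, max_skew) pair."""

    def passes(candidate: str, key: str, max_skew: int) -> bool:
        # Start every existing domain at 0 so empty domains count toward the minimum.
        counts = Counter({labels[key]: 0 for labels in nodes.values()})
        counts.update(nodes[n][key] for n in pod_nodes)
        domain = nodes[candidate][key]
        return counts[domain] + 1 - min(counts.values()) <= max_skew

    return sorted(
        name for name in nodes
        if all(passes(name, key, skew) for key, skew in constraints)
    )


nodes = {
    "n1": {ZONE: "zone-a", HOST: "n1"},
    "n2": {ZONE: "zone-a", HOST: "n2"},
    "n3": {ZONE: "zone-b", HOST: "n3"},
    "n4": {ZONE: "zone-b", HOST: "n4"},
}
pod_nodes = ["n1", "n3"]  # one matching pod already on n1 and on n3

print(feasible_nodes(nodes, pod_nodes, [(ZONE, 1), (HOST, 1)]))
```

Either zone would accept the pod on its own, but the hostname constraint rules out n1 and n3 because two nodes are still empty, so only n2 and n4 remain.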

Single-zone cluster (only node spread)

topologySpreadConstraints:
- maxSkew: 1
  topologyKey: kubernetes.io/hostname
  whenUnsatisfiable: ScheduleAnyway
  labelSelector:
    matchLabels: { app: web }

Cluster-wide default spread (1.24+)

# kube-scheduler config (cluster admin only)
apiVersion: kubescheduler.config.k8s.io/v1   # v1beta3 on 1.24
kind: KubeSchedulerConfiguration
profiles:
- pluginConfig:
  - name: PodTopologySpread
    args:
      defaultConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: ScheduleAnyway
      - maxSkew: 3
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: ScheduleAnyway
      defaultingType: List

Common findings this catches

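whenUnsatisfiable: DoNotSchedule + maxSkew: 1 + single-zone cluster → not a blocker by itself, because skew is measured only across zones that exist and stays 0. Pending pods in this setup usually point to minDomains, nodes missing the zone label, or capacity.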
labelSelector doesn’t match the pods you intend → spread is computed over wrong group.
Multiple constraints all DoNotSchedule → frequent FailedScheduling; loosen one to ScheduleAnyway.
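Skew increases after HPA scale-up → expected while it stays within maxSkew; with DoNotSchedule, pods go Pending when only the fullest zones have room.
minDomains: N on a cluster with fewer than N zones → the global minimum is treated as 0, so each zone accepts at most maxSkew pods; the rest stay Pending.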
Existing imbalance after node add — topology spread doesn’t rebalance existing pods; do kubectl rollout restart deploy <name>.
nodeAffinity excludes some zones → spread can’t use them; pods congregate in remaining zones.
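
The scale-up finding can be reproduced with a small simulation. The Python sketch below places new pods one at a time into the least-loaded zone that both has room and keeps skew within maxSkew. `scale_up` and the `free_slots` capacity model are assumptions made for illustration; the real scheduler evaluates individual nodes.

```python
def scale_up(pods_per_zone: dict, free_slots: dict, new_pods: int, max_skew: int):
    """Return (final per-zone counts, number of pods left Pending)."""
    counts = dict(pods_per_zone)
    free = dict(free_slots)
    pending = 0
    for _ in range(new_pods):
        global_min = min(counts.values())
        candidates = [
            zone for zone in counts
            if free[zone] > 0 and counts[zone] + 1 - global_min <= max_skew
        ]
        if not candidates:
            pending += 1  # no zone passes both capacity and skew
            continue
        target = min(candidates, key=lambda zone: (counts[zone], zone))
        counts[target] += 1
        free[target] -= 1
    return counts, pending


start = {"zone-a": 1, "zone-b": 1, "zone-c": 1}

# zone-c is full: it still counts toward the minimum but cannot take pods.
print(scale_up(start, {"zone-a": 5, "zone-b": 5, "zone-c": 0}, new_pods=3, max_skew=1))

# Same scale-up with room everywhere.
print(scale_up(start, {"zone-a": 5, "zone-b": 5, "zone-c": 5}, new_pods=3, max_skew=1))
```

In the first run the third new pod stays Pending even though zone-a and zone-b have free slots, which is the typical shape of a FailedScheduling event caused by a hard spread constraint.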

When to escalate

Cluster topology issues (no zone labels on nodes) — engage cluster admin; cloud node provisioning should set these.
Frequent FailedScheduling in HA-critical workloads — review entire scheduling decision; topology spread may not be the only constraint.
Scheduling profile customization (cluster-wide defaults) — coordinate with cluster admin; affects every workload.

Kubernetes Topology Spread Constraints Debug Prompt