Kubernetes Pod Priority & Preemption Prompt

Design PriorityClass hierarchies — critical system pods, tenant tiers, preemption policy, non-preemptive priority, scheduling guarantees.

Target user: Kubernetes platform engineers managing multi-priority workloads
Difficulty: Advanced
Tools: Claude, ChatGPT

The prompt

You are a senior Kubernetes platform engineer who has built priority hierarchies for multi-tenant clusters. You know that PriorityClass + Preemption is a sharp tool — wrong values evict the wrong workloads.

I will provide:
- The workload mix (system, prod, dev, batch)
- Current priority classes (if any)
- Symptom (preempting unexpected pods, critical pods evicted, no preemption when expected)

Your job:

1. **PriorityClass basics**:
   - Cluster-scoped object with integer value (higher = more important)
   - User-defined `value` can be any 32-bit integer up to 1,000,000,000; larger values are reserved for the built-in system classes
   - `system-cluster-critical` = 2,000,000,000 (built-in)
   - `system-node-critical` = 2,000,001,000 (built-in)
   - `globalDefault: true`: applies to pods without an explicit class (only one class in the cluster may set it)
2. **Preemption flow**:
   - Higher-priority pod can't schedule
   - Scheduler finds lower-priority pods to evict
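   - Victims get their graceful termination period; the pending pod is nominated to the node (`status.nominatedNodeName`) and schedules once resources free up
   - **`preemptionPolicy: Never`** disables preemption for pods of that PriorityClass (priority still determines scheduling-queue order)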
3. **Common hierarchy**:
   ```
   system-node-critical (built-in, ~2B)
   system-cluster-critical (built-in, ~2B)
   platform-critical (custom, 1,000,000)     # CSI driver, monitoring
   tenant-tier-1 (high prod, 100,000)
   tenant-tier-2 (standard prod, 50,000)
   tenant-tier-3 (dev, 10,000)
   batch (low, 100)
   ```
4. **For unintended preemption**:
   - Lower-priority pods evicted to make room for higher
   - If "wrong" pods evicted, check priority values and PodDisruptionBudgets
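   - Preemption honors PDBs on a best-effort basis only: the scheduler prefers victims whose eviction does not violate a PDB, but preempts anyway if none exist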
5. **For critical pod eviction**:
   - System pods should have `system-cluster-critical` or higher
   - Add `priorityClassName: system-cluster-critical` to system DaemonSets / Deployments
6. **For batch workloads**:
   - Low priority + `preemptionPolicy: Never` → patient, doesn't kick others
   - Schedule when capacity available
7. **For tiered tenants**:
   - Per-tenant priority classes
   - Higher tier = scheduling preference and the ability to preempt lower tiers, not guaranteed capacity
   - Combined with ResourceQuota for hard limits
8. **For non-preemptive scheduling**:
   - `preemptionPolicy: Never`: pod is queued ahead of lower-priority pods but never evicts running pods
   - Useful when a workload should jump the scheduling queue without disrupting anything already running

Mark DESTRUCTIVE: setting tenant priority above system (evicts CSI driver), `globalDefault: true` on non-default class (every untagged pod inherits), priority values colliding across classes.

---

Workload mix: [DESCRIBE]
Current PCs:
```
[PASTE `kubectl get priorityclasses`]
```
Symptom: [DESCRIBE]

Why this prompt works

Priority and preemption are powerful but often underused or misused. This prompt walks through the hierarchy design step by step, from the built-in system classes down to batch.

How to use it

  1. Map workloads to tiers explicitly.
  2. Reserve top for system components.
  3. Test preemption under capacity pressure.
  4. Coordinate with PDBs for survival.

Useful commands

# Priority classes
kubectl get priorityclass
kubectl describe priorityclass <name>

# Per-pod priority
kubectl get pod <pod> -o jsonpath='{.spec.priorityClassName} {.spec.priority}'

# Pods by priority (sorted high to low)
kubectl get pods -A -o json | jq -r '.items[] | "\(.spec.priority // 0) \(.metadata.namespace)/\(.metadata.name)"' | sort -rn | head -20

# Find untagged pods (no priorityClassName)
kubectl get pods -A -o json | jq -r '.items[] | select(.spec.priorityClassName == null) | "\(.metadata.namespace)/\(.metadata.name)"' | head -20

# List preemption events (append --watch to follow them live)
kubectl get events -A --field-selector reason=Preempted

# Test preemption (carefully)
# 1. Fill the cluster with low-priority pods
# 2. Create high-priority pod requesting all CPU
# 3. Observe events
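
The three test steps above could be sketched with two manifests along these lines. Everything here is an assumption for illustration: the `preempt-test` namespace (which must already exist), the object names, the replica count, and the CPU requests, which need sizing to the cluster so the filler pods actually exhaust allocatable CPU. The `batch` and `prod-high` classes are the ones defined in the hierarchy pattern, and the probe requests a modest amount of CPU rather than all of it so that only a few victims are needed.

```yaml
# Hypothetical test fixtures; run only in a disposable cluster.
# Step 1: low-priority filler pods that soak up CPU.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: preempt-filler            # illustrative name
  namespace: preempt-test         # assumed to exist already
spec:
  replicas: 20                    # assumption: raise until some replicas stay Pending
  selector:
    matchLabels:
      app: preempt-filler
  template:
    metadata:
      labels:
        app: preempt-filler
    spec:
      priorityClassName: batch
      terminationGracePeriodSeconds: 5   # short grace period keeps the test quick
      containers:
      - name: pause
        image: registry.k8s.io/pause:3.9
        resources:
          requests:
            cpu: 500m
---
# Step 2: a higher-priority pod that cannot fit without evicting fillers.
apiVersion: v1
kind: Pod
metadata:
  name: preempt-probe             # illustrative name
  namespace: preempt-test
spec:
  priorityClassName: prod-high
  containers:
  - name: pause
    image: registry.k8s.io/pause:3.9
    resources:
      requests:
        cpu: "1"
```

Apply the Deployment first, wait until the cluster is full, then apply the Pod and check the preemption events with the command shown earlier in this section.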

Hierarchy pattern

# Tier 1: Cluster-critical (use the built-in classes as-is; do not redefine them)
# system-node-critical:    2,000,001,000
# system-cluster-critical: 2,000,000,000

# Tier 2: Platform components (your custom system services)
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: platform-critical
value: 1000000
description: "Platform services (monitoring, ingress, CSI)"
preemptionPolicy: PreemptLowerPriority
---
# Tier 3: Production high
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: prod-high
value: 100000
description: "Production critical workloads"
preemptionPolicy: PreemptLowerPriority
---
# Tier 4: Production standard
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: prod-standard
value: 50000
description: "Standard production workloads"
preemptionPolicy: PreemptLowerPriority
---
# Tier 5: Dev/Staging
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: dev
value: 10000
preemptionPolicy: PreemptLowerPriority
---
# Tier 6: Batch (low, non-preemptive)
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: batch
value: 100
preemptionPolicy: Never                # batch waits for capacity, doesn't evict
description: "Batch workloads, non-preemptive"
globalDefault: false
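
Two optional additions to the tiers above, shown only as a sketch. The class names and values are hypothetical and not part of the original hierarchy. The first gives untagged pods a deliberate low tier instead of the implicit priority 0; the second illustrates non-preemptive priority for work that should be scheduled ahead of lower tiers without evicting anything.

```yaml
# Hypothetical: a single, deliberately low default for pods with no priorityClassName.
# Only one class in the cluster may set globalDefault: true.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: default-low               # illustrative name
value: 1000                       # assumption: sits between batch (100) and dev (10000)
globalDefault: true
preemptionPolicy: Never           # untagged pods never evict anything
description: "Default for pods without an explicit class"
---
# Hypothetical: queued ahead of prod-standard and below, but never preempts.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: queue-first               # illustrative name
value: 75000                      # assumption: between prod-standard and prod-high
globalDefault: false
preemptionPolicy: Never
description: "Scheduled early when capacity exists; does not evict running pods"
```

A global default only affects pods created after the class exists, so review the untagged-pod listing from the commands section before enabling one.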

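Workloads reference a class by name:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: web                           # illustrative label; must match the template labels
  template:
    metadata:
      labels:
        app: web
    spec:
      priorityClassName: prod-standard   # must name an existing PriorityClass
      containers:
      - name: app
        image: myapp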
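
To coordinate with PDBs as suggested under "How to use it", a budget for the `web` Deployment might look like the sketch below. It assumes the Deployment's pods carry the label `app: web` (an illustrative label; adjust it to whatever the pod template actually sets) and that keeping one replica alive is an acceptable floor. The name `web-pdb` is hypothetical.

```yaml
# Hypothetical PodDisruptionBudget for the web Deployment.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb                   # illustrative name
spec:
  minAvailable: 1                 # assumption: one replica is the survival floor
  selector:
    matchLabels:
      app: web                    # assumed pod label
```

Because preemption honors PDBs only on a best-effort basis, this steers the scheduler toward other victims but does not guarantee that `web` pods survive under severe pressure.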

Common findings this catches

  • System pods evicted → set system-cluster-critical on them.
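  • No preemption when expected → check whether the pending pod's class sets preemptionPolicy: Never, and whether evicting lower-priority pods would actually free enough room.
  • Untagged pods inheriting wrong default → a class was marked globalDefault: true unintentionally; only one class may carry it.
  • PDB violation during preemption → expected under pressure, since preemption honors PDBs only on a best-effort basis; confirm in events, then adjust priorities or add capacity.
  • Tenant priority too high → audit; lower tenant values below platform-critical.
  • Batch workloads stealing prod capacity → confirm batch has a lower value than prod and set preemptionPolicy: Never on the batch class.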
  • Critical services without priority → add explicitly.
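
For the "tenant priority too high" finding, one way to keep tenants within bounds is a ResourceQuota scoped to a priority class. The sketch below is an assumption-laden example: the namespace `tenant-a`, the quota name, and the limit of 10 pods are all hypothetical. It caps how many pods in that namespace may run at `prod-high`.

```yaml
# Hypothetical quota: at most 10 prod-high pods in the tenant-a namespace.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: prod-high-pods            # illustrative name
  namespace: tenant-a             # assumed tenant namespace
spec:
  hard:
    pods: "10"                    # assumption: set to "0" to forbid the class entirely
  scopeSelector:
    matchExpressions:
    - scopeName: PriorityClass
      operator: In
      values: ["prod-high"]
```

The quota only counts pods whose `priorityClassName` matches the listed values, so other tiers in the same namespace are unaffected by it.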

When to escalate

  • Designing priority for compliance / SLA — engage stakeholders.
  • Capacity / cost analysis — combine with autoscaling strategy.
  • Eviction storms in incident — escalate to platform.
