Kubernetes API Server Flow Control & Priority Prompt

Tune API Priority and Fairness (APF) — flow schemas, priority levels, fair queueing, debugging API throttling and 429s.

Target user
Cluster admins tuning apiserver under load
Difficulty
Advanced
Tools
Claude, ChatGPT

The prompt

You are a senior Kubernetes platform engineer who has tuned API Priority and Fairness (APF) at scale — preventing noisy clients from overwhelming apiserver, isolating workload tiers, debugging 429 throttling.

I will provide:
- The symptom (kubectl 429, controllers slow, apiserver high CPU)
- Current APF configuration
- Workload mix

Your job:

1. **APF basics**:
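   - Supersedes the plain `--max-requests-inflight` / `--max-mutating-requests-inflight` limits; their sum now sets the server's total concurrency, which APF divides among priority levels
   - Categorizes requests by priority level
   - Within a priority level: fair queueing per "flow"
   - Isolates important traffic from less important traffic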
2. **Components**:
   - **FlowSchema** — classifies requests (by user, group, verb, resource)
   - **PriorityLevelConfiguration** — defines queueing properties per priority
   - **Built-in**: workload-high, workload-low, leader-election, system, exempt, etc.
3. **For 429 errors**:
   - APF throttling kicked in
   - `apiserver_request_total{code="429"}` metric
   - Check which flow / priority is being throttled
4. **For controller starvation**:
   - Controller (kube-controller-manager) sharing flow with high-volume clients
   - Isolate via custom FlowSchema
5. **For investigating**:
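   - `apiserver_flowcontrol_current_inqueue_requests`
   - `apiserver_flowcontrol_dispatched_requests_total`
   - `apiserver_flowcontrol_rejected_requests_total`
   - apiserver concurrency settings (`--max-requests-inflight`, `--max-mutating-requests-inflight`)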
6. **For tuning**:
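   - `nominalConcurrencyShares` (called `assuredConcurrencyShares` in v1beta2 and earlier): the priority level's share of total server concurrency, relative to the other levels
   - `limitResponse.queuing.queueLengthLimit` for max queue depth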
   - Match more important traffic to higher priority
7. **For exempt priority**:
   - Bypasses APF entirely; system-critical
   - Don't add user workloads here
8. **For per-user flow distinction**:
   - `distinguisherMethod` field on the FlowSchema
   - User-based (`ByUser`) or namespace-based (`ByNamespace`)

Mark DESTRUCTIVE: disabling APF (apiserver overload risk), exempt-priority abuse (system breakage), aggressive queueing rejection.

---

Symptom: [DESCRIBE]
Current APF config:
```
[PASTE `kubectl get flowschema,prioritylevelconfiguration`]
```
Workload mix: [DESCRIBE]

Why this prompt works

APF is critical at scale but underexplained. This prompt walks through the tuning step by step: classify the traffic, find the throttled flow, then adjust shares and queues.

How to use it

  1. Verify APF metrics are scraped.
  2. For 429s, identify the flow.
  3. For controller issues, isolate.
  4. Tune cautiously.

Useful commands

# Inventory
kubectl get flowschema
kubectl get prioritylevelconfiguration

# Specific
kubectl describe flowschema <name>
kubectl describe prioritylevelconfiguration <name>

# Watch APF metrics
kubectl get --raw /metrics | grep apiserver_flowcontrol_

# Top metrics
kubectl get --raw /metrics | grep "apiserver_flowcontrol_rejected_requests_total" | head
kubectl get --raw /metrics | grep "apiserver_request_total.*code=\"429\"" | head

# Tracing 429
# 1. Get APF rejections per flow schema and priority level:
kubectl get --raw /metrics | grep "apiserver_flowcontrol_rejected_requests_total"
# 2. Map the flow_schema label back to the client via the FlowSchema's subjects
# 3. Adjust the FlowSchema or fix the client's request behavior
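
Mapping rejections back to a client is easier once the counter is summed per priority level and flow schema. The following is a minimal sketch, assuming the metric carries `flow_schema` and `priority_level` labels as in recent Kubernetes releases. The function name `apf_rejections_by_flow` is made up for illustration, and it reads Prometheus text on stdin so it can be tried on a saved dump as well as a live server.

```shell
# Sum APF rejections per "priority_level/flow_schema" pair.
# Reads Prometheus text-format metrics on stdin; prints "pair count", largest first.
apf_rejections_by_flow() {
  awk '
    /^apiserver_flowcontrol_rejected_requests_total[{]/ {
      fs = "unknown"; pl = "unknown"
      if (match($0, /flow_schema="[^"]*"/))
        fs = substr($0, RSTART + 13, RLENGTH - 14)
      if (match($0, /priority_level="[^"]*"/))
        pl = substr($0, RSTART + 16, RLENGTH - 17)
      total[pl "/" fs] += $NF   # the sample value is the last field
    }
    END { for (k in total) printf "%s %d\n", k, total[k] }
  ' | sort -k2,2nr -k1,1
}

# Against a live cluster (needs permission to read /metrics):
#   kubectl get --raw /metrics | apf_rejections_by_flow | head
```

The counters are cumulative since apiserver start, so comparing two runs a minute apart shows which flow is being rejected right now rather than historically.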

Patterns

Custom FlowSchema for a specific controller

apiVersion: flowcontrol.apiserver.k8s.io/v1
kind: FlowSchema
metadata:
  name: my-controller
spec:
  matchingPrecedence: 200
  priorityLevelConfiguration:
    name: workload-high
  rules:
  - subjects:
    - kind: ServiceAccount
      serviceAccount:
        name: my-operator
        namespace: my-operator-system
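    resourceRules:
    - apiGroups: ["*"]
      resources: ["*"]
      verbs: ["*"]
      clusterScope: true     # match cluster-scoped resources
      namespaces: ["*"]      # match namespaced resources in every namespace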
  distinguisherMethod:
    type: ByUser
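
After applying a FlowSchema like this one, it helps to confirm that the controller's requests actually land in it. The apiserver tags each response with `X-Kubernetes-PF-FlowSchema-UID` and `X-Kubernetes-PF-PriorityLevel-UID` headers, and the UID can be mapped back to a name. The following is a minimal sketch; both helper names are hypothetical, and the assumption that `kubectl -v=8` logs response headers one per line on stderr should be checked against your kubectl version.

```shell
# Pull the FlowSchema UID out of verbose HTTP output read on stdin.
# Header names are matched case-insensitively, since clients may log them in a different case.
apf_flowschema_uid() {
  awk 'tolower($0) ~ /x-kubernetes-pf-flowschema-uid:/ {
         v = $NF; sub(/\r$/, "", v); print v; exit
       }'
}

# Map a UID to a name. Expects "UID NAME" rows on stdin; exits non-zero if not found.
apf_name_for_uid() {
  awk -v uid="$1" '$1 == uid { print $2; found = 1; exit } END { exit !found }'
}

# Against a live cluster (run as the identity you want to check):
#   uid=$(kubectl get --raw /api/v1/namespaces -v=8 2>&1 >/dev/null | apf_flowschema_uid)
#   kubectl get flowschemas --no-headers \
#     -o custom-columns="uid:{metadata.uid},name:{metadata.name}" | apf_name_for_uid "$uid"
```

If the name that comes back is `service-accounts` or `global-default` rather than `my-controller`, another FlowSchema with a lower `matchingPrecedence` value is probably matching first, or the subject does not match.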

Custom PriorityLevel

apiVersion: flowcontrol.apiserver.k8s.io/v1
kind: PriorityLevelConfiguration
metadata:
  name: critical-controllers
spec:
  type: Limited
  limited:
    nominalConcurrencyShares: 200      # v1 field name (assuredConcurrencyShares in v1beta2 and earlier); higher = more share
    limitResponse:
      type: Queue
      queuing:
        queueLengthLimit: 100
        queues: 64
        handSize: 6
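
The share value only means something relative to the other priority levels. Per the Kubernetes APF documentation, a level's nominal concurrency limit is ceil(ServerCL × shares / sum of all levels' shares), where ServerCL is the sum of `--max-requests-inflight` and `--max-mutating-requests-inflight`. A rough sketch of that arithmetic follows. The function name and the example numbers (600 seats in total, 245 shares held by the other levels) are illustrative assumptions, not values read from a cluster.

```shell
# nominal_seats SERVER_CL SHARES TOTAL_SHARES
# Prints ceil(SERVER_CL * SHARES / TOTAL_SHARES), using integer arithmetic only.
nominal_seats() {
  echo $(( ($1 * $2 + $3 - 1) / $3 ))
}

# Example: 600 total seats, this level has 200 shares,
# and the other levels hold 245 shares between them (hypothetical numbers).
nominal_seats 600 200 $((200 + 245))   # prints 270
```

Adding a level with 200 shares therefore shrinks every other level's slice, since the denominator grows. The limit seen at runtime may also differ where levels lend and borrow seats.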

Common findings this catches

  • All controllers in same flow → mutual starvation: isolate via FlowSchema.
  • Many kubectl 429 from CI: rate-limit at CI side; or assign separate flow.
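  • APF rejections spike during incident → temporarily increase the affected level's shares.
  • Exempt priority abused: review which FlowSchemas point at it.
  • Concurrency too low on apiserver: raise `--max-requests-inflight` / `--max-mutating-requests-inflight` if the server has headroom, or add apiserver replicas behind the load balancer.
  • Reaper or controller swamping a shared priority level: isolate it with its own FlowSchema.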
  • No APF metrics visible → Prometheus scrape config.
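
For the CI case, the client-side half of the fix is usually to slow down and retry rather than hammer the apiserver. Below is a minimal sketch of exponential backoff around an arbitrary command. `retry_with_backoff` is a hypothetical helper, not part of kubectl, and it retries on any non-zero exit rather than on 429 specifically.

```shell
# retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached,
# doubling the sleep between attempts.
retry_with_backoff() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# In a CI job (not run here):
#   retry_with_backoff 5 2 kubectl apply -f manifests/
```

Backoff only reduces pressure; if many CI jobs share one identity, giving that identity its own FlowSchema and priority level keeps the retries from crowding out other workloads.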

When to escalate

  • API server overload during incident → emergency scaling.
  • Custom APF design across a multi-tenant cluster: a strategic, platform-level decision.
  • Cross-cluster API performance — federation considerations.
