Kubernetes Cluster Autoscaler / Karpenter Debug Prompt
Diagnose cluster autoscaling — scale-up delay, scale-down protection, node group selection, pod doesn't fit any template, Karpenter NodePool/NodeClaim issues.
- Target user: Kubernetes platform engineers operating autoscaled clusters
- Difficulty: Advanced
- Tools: Claude, ChatGPT
The prompt
You are a senior Kubernetes engineer who has operated both the upstream Cluster Autoscaler (CA) and Karpenter across AWS, GCP, and Azure. You know that CA can't add nodes a pod wouldn't fit on, and that Karpenter's "best instance" choice can surprise users.
I will provide:
- Autoscaler type and version (CA vs Karpenter)
- Cloud provider (EKS, GKE, AKS, etc.)
- The symptom (scale-up delay, scale-up never happens, scale-down too aggressive, wrong instance type chosen)
- CA: `kubectl -n kube-system logs deploy/cluster-autoscaler --tail=200`
- CA: `kubectl describe configmap cluster-autoscaler-status -n kube-system`
- Karpenter: `kubectl get nodepool`, `kubectl get nodeclaim`, `kubectl logs -n karpenter -l app.kubernetes.io/name=karpenter --tail=200`
- Pending pod: `kubectl describe pod <pending-pod>`
Your job:
1. **For "scale-up doesn't happen"**:
- **CA**: check `cluster-autoscaler-status` ConfigMap for last decision and reason
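- CA records a `NotTriggerScaleUp` event on the pod ("pod didn't trigger scale-up") when the pod would not fit on a new node from any node group: typically the pod's request exceeds the largest node template, or the matching node group is already at max size
- **Karpenter**: NodeClaim is created but never becomes Ready → cloud-side issue (IAM, AMI, subnet)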
- Pod has nodeSelector/tolerations that exclude all node groups
2. **For "scale-up is slow"**:
- Cloud provider EC2 / GCE / Azure VM provisioning latency (1-5 min typical)
- CA: `--scan-interval` (default 10s)
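- CA: it reacts only to pods the scheduler marks unschedulable; pods with missing or near-zero `requests` fit on existing nodes and never trigger scale-up, so ensure requests are set
- Karpenter: usually faster because it launches instances directly instead of resizing node groups (the launch decision takes seconds; the node still needs boot time to become Ready); if slow, check `kubectl get nodeclaims` for errors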
3. **For "scale-down too aggressive"**:
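- **CA**: `--scale-down-utilization-threshold` (default 0.5; a node becomes a removal candidate when the sum of pod requests is below 50% of allocatable, regardless of actual usage)
- **CA**: by default it will not remove nodes running pods without a controller, pods with local storage, or non-DaemonSet kube-system pods without a PDB; DaemonSet pods do not block removal. CA respects PDBs and the safe-to-evict annotation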
- **`cluster-autoscaler.kubernetes.io/safe-to-evict: "false"`** on critical pods prevents eviction
- Karpenter `disruption.budgets` and `disruption.consolidationPolicy` control aggressiveness
4. **For "wrong instance type chosen" (Karpenter especially)**:
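- Karpenter chooses among all instance types that satisfy the NodePool requirements and the pending pods, favoring the lowest price; the result can be smaller or larger than expected
- Pin to specific instance types: `requirements: [{key: node.kubernetes.io/instance-type, operator: In, values: [...]}]`; on AWS, restrict families with the `karpenter.k8s.aws/instance-family` key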
- Or restrict by capacity type, arch
5. **For "Karpenter NodeClaim stuck Pending"**:
- `kubectl describe nodeclaim <name>` — usually shows IAM, subnet quota, or launchTemplate issue
- Check cloud provider account quotas
6. **For CA + GPU workloads**:
- CA can scale GPU node groups but the pod needs `resources.limits: { nvidia.com/gpu: 1 }` (extended resources must be set in limits; requests default to the same value)
- Node template must advertise GPU as schedulable resource
7. **For "nodes idle but autoscaler won't remove"**:
- `--scale-down-delay-after-add`, `--scale-down-delay-after-delete`, and `--scale-down-delay-after-failure` pause scale-down for a period after each of those events
- `--scale-down-unneeded-time` (default 10m) — node must be unneeded for this duration
- PDB blocking eviction
- A pod not backed by a controller (orphan), or a kube-system pod without a PDB, on the node blocks removal
8. **For cost-aware tuning**:
- Karpenter consolidation: `disruption.consolidationPolicy: WhenEmptyOrUnderutilized` or `WhenEmpty` (in the older `v1beta1` API the first value is named `WhenUnderutilized`)
- Spot instance preference via `karpenter.sh/capacity-type: spot` requirement
- Cluster Autoscaler: run with `--expander=priority` and a `cluster-autoscaler-priority-expander` ConfigMap to bias node groups
Mark DESTRUCTIVE: raising `--scale-down-utilization-threshold` well above the default (aggressive removal), removing PDBs to enable scale-down (risk to availability), Karpenter `consolidationPolicy: WhenEmptyOrUnderutilized` (`WhenUnderutilized` in `v1beta1`) on workloads sensitive to restarts.
---
Autoscaler: [CA / Karpenter + version]
Cloud + cluster type: [DESCRIBE]
Symptom: [DESCRIBE]
Pending pod (if scale-up issue): `kubectl describe pod <pod>`:
```
[PASTE]
```
Autoscaler logs (last 200 lines):
```
[PASTE]
```
For CA: status configmap:
```
[PASTE `kubectl describe configmap cluster-autoscaler-status -n kube-system`]
```
For Karpenter: NodePool + NodeClaim:
```yaml
[PASTE `kubectl get nodepool -o yaml` and `kubectl get nodeclaims`]
```
Why this prompt works
Cluster autoscaling fails in three ways: not scaling up when it should, scaling down too aggressively, or refusing to remove idle nodes. Each cause is specific (pod doesn’t fit, PDB blocks eviction, spot instance reclaimed). This prompt walks through each direction.
How to use it
- State the autoscaler clearly. CA and Karpenter have very different behavior.
- For “won’t scale up”, the pending pod’s full spec is essential.
- For “won’t scale down”, identify what’s holding nodes (PDBs, pods without a controller, safe-to-evict=false).
- For Karpenter, the NodePool requirements determine what gets provisioned.
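To make the "what's holding nodes" check concrete, here is a minimal sketch of two objects that commonly block CA scale-down. All names (`checkout`, `batch-worker`, the image path) are hypothetical, and the PDB only blocks every eviction under the assumption that the workload runs exactly two replicas.

```yaml
# Hypothetical example objects; adjust names, labels, and counts to your workload.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: checkout-pdb
spec:
  minAvailable: 2            # with exactly 2 ready replicas, no voluntary eviction is allowed
  selector:
    matchLabels:
      app: checkout
---
apiVersion: v1
kind: Pod
metadata:
  name: batch-worker
  annotations:
    # CA will not remove the node while this pod is running on it
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
spec:
  containers:
    - name: worker
      image: registry.example.com/batch-worker:1.0   # placeholder image
      resources:
        requests:
          cpu: 500m
          memory: 512Mi
```

On Karpenter the comparable pod annotation is `karpenter.sh/do-not-disrupt: "true"`; the CA annotation shown here has no effect on Karpenter's disruption decisions.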
Useful commands
# Cluster Autoscaler
kubectl -n kube-system get pods -l app=cluster-autoscaler
kubectl -n kube-system logs deploy/cluster-autoscaler --tail=200
kubectl describe configmap cluster-autoscaler-status -n kube-system
# Status shows: scale-up decisions, candidate nodes, unschedulable pods
# Karpenter
kubectl get nodepool
kubectl describe nodepool <name>
kubectl get nodeclaim
kubectl describe nodeclaim <name>
kubectl -n karpenter logs -l app.kubernetes.io/name=karpenter --tail=200
# Pod-level reason for not fitting
kubectl describe pod <pending-pod>
# Per-node utilization (gauge of scale-down candidates)
kubectl top node # actual usage; CA decides on requests, not usage
kubectl describe nodes | grep -E "^Name:|^  (cpu|memory) " # requested cpu/memory per node
# PDBs that may block eviction
kubectl get pdb -A
kubectl describe pdb <pdb>
# Annotations on a pod
kubectl get pod <pod> -o yaml | yq '.metadata.annotations'
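# Pods annotated safe-to-evict=false (a pod annotation, not a node annotation) and the nodes they pin
kubectl get pods -A -o json | jq -r '.items[] | select(.metadata.annotations["cluster-autoscaler.kubernetes.io/safe-to-evict"] == "false") | "\(.metadata.namespace)/\(.metadata.name) on \(.spec.nodeName)"'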
# Re-allow CA scale-down on a node previously protected with scale-down-disabled=true (this does not force removal)
kubectl annotate node <node> cluster-autoscaler.kubernetes.io/scale-down-disabled=false --overwrite
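# Karpenter manual node removal (DESTRUCTIVE: evicts the node's pods)
kubectl delete node <node> # Karpenter's finalizer drains the node and terminates the instance; pods left pending trigger replacement capacity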
CA configuration knobs
# Cluster Autoscaler args
- --scale-down-utilization-threshold=0.5 # default
- --scale-down-unneeded-time=10m # default
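- --scale-down-delay-after-add=10m # default
- --scale-down-delay-after-delete=10s # default is the scan interval (10s)
- --scale-down-delay-after-failure=3m # default
- --max-node-provision-time=15m # default
- --skip-nodes-with-system-pods=false # default true; false lets CA remove nodes running kube-system pods (PDBs still apply)
- --skip-nodes-with-local-storage=false # default true; false lets CA remove nodes whose pods use local storage
- --balance-similar-node-groups=true # default false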
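The args above do not express a preference between node groups. A minimal sketch of the priority expander follows, assuming CA is started with `--expander=priority` and runs in `kube-system`; the patterns `.*spot.*` and `.*on-demand.*` are hypothetical node group names, so substitute regexes that match your own groups.

```yaml
# Assumes CA runs with --expander=priority in the kube-system namespace.
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-autoscaler-priority-expander   # CA looks for this exact name
  namespace: kube-system
data:
  priorities: |-
    50:
      - .*spot.*        # hypothetical pattern: prefer spot node groups
    10:
      - .*on-demand.*   # hypothetical pattern: fall back to on-demand groups
```

Higher numbers win, and each entry is a regular expression matched against node group names. A group that matches no pattern is only considered when nothing else matches.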
Karpenter NodePool example
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: node.kubernetes.io/instance-type
          operator: In
          values: ["m6a.large", "m6a.xlarge", "m6a.2xlarge"]
      nodeClassRef:
        group: karpenter.k8s.aws # v1 uses group; v1beta1 used apiVersion here
        kind: EC2NodeClass
        name: default
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m
  limits:
    cpu: 1000
    memory: 1000Gi
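Enumerating instance types works but needs upkeep as new types appear. As a sketch of an alternative on AWS, the fragment below constrains the same NodePool by category, generation, and size instead. It belongs under `spec.template.spec`, and the specific values are illustrative assumptions rather than recommendations.

```yaml
# Fragment for spec.template.spec of a NodePool (AWS provider labels).
requirements:
  - key: karpenter.k8s.aws/instance-category
    operator: In
    values: ["c", "m", "r"]          # illustrative: general, compute, memory families
  - key: karpenter.k8s.aws/instance-generation
    operator: Gt
    values: ["5"]                    # illustrative: generation 6 and newer
  - key: karpenter.k8s.aws/instance-size
    operator: NotIn
    values: ["nano", "micro", "small"]
```

Wider requirements give Karpenter more room to find available and cheaper capacity, at the cost of less predictable node shapes.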
Common findings this catches
- “max node provision time exceeded” in CA logs → cloud-side delay; check provider quotas, AZ capacity.
- “pod didn’t fit on any node group” → pod’s CPU/memory request larger than max instance template; add larger node group.
- Karpenter NodeClaim stuck Pending → `kubectl describe nodeclaim`; usually an IAM, subnet, or launch template error.
- Node idle for 30+ min but autoscaler won’t remove → safe-to-evict=false on a pod, or a PDB blocking eviction.
- All pods schedule to one node even with autoscaler → pod’s requests too small (everything fits), or anti-affinity missing.
- Karpenter pulls expensive instance types → `requirements` not restrictive; pin via instance-type or instance-family.
- Spot reclamation causes restart storms → workload not designed for interruption; use on-demand for critical pieces.
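For the spot reclamation finding, one way to keep a critical workload off spot capacity is a `nodeSelector` on the capacity-type label. This is a sketch with hypothetical names (`payments-api`, the image path), and it assumes at least one NodePool allows `on-demand`.

```yaml
# Hypothetical Deployment; only the nodeSelector is the point of the example.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payments-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: payments-api
  template:
    metadata:
      labels:
        app: payments-api
    spec:
      nodeSelector:
        karpenter.sh/capacity-type: on-demand   # never schedule onto spot nodes
      containers:
        - name: api
          image: registry.example.com/payments-api:1.0   # placeholder image
          resources:
            requests:
              cpu: "1"
              memory: 1Gi
```

If no NodePool permits on-demand capacity, these pods stay Pending, which shows up as a "scale-up never happens" symptom.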
When to escalate
- Cloud-side capacity issues (instance type unavailable in AZ): Karpenter / CA can’t fix; coordinate with cloud team.
- Persistent NodeClaim provisioning failures: engage cloud + cluster admin; usually IAM or network config.
- Aggressive scale-down causing customer-visible blips: review PDB coverage; consider Karpenter `disruption.budgets`.
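As a sketch of what a Karpenter disruption budget can look like, the fragment below goes under `spec` of a NodePool. The percentages, schedule, and duration are illustrative assumptions; tune them to your own traffic pattern.

```yaml
# Fragment for spec of a NodePool; values are illustrative.
disruption:
  consolidationPolicy: WhenEmptyOrUnderutilized
  consolidateAfter: 5m
  budgets:
    - nodes: "10%"                  # at most 10% of this pool's nodes disrupted at once
    - nodes: "0"                    # block voluntary disruption entirely...
      schedule: "0 9 * * mon-fri"   # ...starting 09:00 on weekdays (cron syntax, no time zone support)
      duration: 8h                  # ...for eight hours
```

When several budgets are active at the same time, the most restrictive one applies. Budgets limit voluntary disruption such as consolidation; they do not stop spot interruptions.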