Cluster Autoscaling With Karpenter and Cluster Autoscaler
Pods stuck Pending or a cloud bill that won't quit usually mean your node autoscaling is wrong. Here's how Cluster Autoscaler and Karpenter differ and when to use each.
- #kubernetes
- #autoscaling
- #karpenter
- #cluster-autoscaler
- #cost-optimization
- #nodes
There are two ways node autoscaling makes itself known, and neither is pleasant. Either pods sit Pending because no node has room and nothing’s adding capacity, or your cloud bill quietly doubles because you provisioned for peak and the cluster never scales back down. Both are the same root problem: the layer that adds and removes nodes isn’t matched to how your workloads actually behave.
This is a different problem from the Horizontal Pod Autoscaler, which adds pods. The HPA can ask for more replicas all it likes, but if there’s no node to schedule them on, they’re Pending. Cluster autoscaling is the layer underneath that supplies the nodes. The two leading tools — Cluster Autoscaler and Karpenter — take fundamentally different approaches, and picking the right one matters.
Cluster Autoscaler: scale node groups
Cluster Autoscaler (CA) is the long-standing, cloud-agnostic option. It works against pre-defined node groups — an EC2 Auto Scaling Group, a GKE node pool, an AKS scale set. When pods are Pending and a node group could host them, CA increases that group’s size. When nodes sit underutilized, it drains and removes them.
The model is “scale these fixed-shape groups up and down.” You define the instance types per group in advance:
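# CA reads node-group bounds from cloud tags or from flags.
# A typical setup: a few groups split by instance family, each with a fixed min and max.
# Format is --nodes=<min>:<max>:<group-name>; the group names here are placeholders.
--nodes=1:10:general-m5
--nodes=0:20:compute-c5
--scale-down-utilization-threshold=0.5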
CA respects scheduling constraints — taints, affinity, resource requests — when deciding which group can satisfy Pending pods. Its main tuning knobs:
- --scale-down-utilization-threshold: how empty a node must be before it’s a candidate for removal (0.5 = under 50% requested).
- --scale-down-unneeded-time: how long a node stays underutilized before CA removes it (default 10m; raise it if you see churn).
- --expander: how CA chooses among eligible groups (least-waste packs efficiently; priority lets you prefer cheaper groups).
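As a rough illustration of the priority expander, CA reads its preferences from a ConfigMap named cluster-autoscaler-priority-expander in the namespace where it runs. The sketch below assumes CA runs in kube-system, is started with --expander=priority, and that your node group names contain "spot" or "on-demand"; the regexes and numbers are placeholders to adapt. Higher numbers win.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-autoscaler-priority-expander   # name is fixed; CA looks it up by this name
  namespace: kube-system                       # assumption: CA is deployed here
data:
  priorities: |-
    50:
      - .*spot.*        # prefer groups whose name matches "spot" (assumed naming)
    10:
      - .*on-demand.*   # fall back to on-demand groups
```

Groups that match no pattern are not preferred by this expander, so a catch-all entry such as `.*` at the lowest priority is a common safety net.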
CA is the right call when you’re on a cloud or platform Karpenter doesn’t support, when you need predictable fixed node shapes, or when your org already standardizes on node pools.
Karpenter: provision nodes just-in-time
Karpenter (originally built by AWS, now a Kubernetes SIG Autoscaling project with providers for other clouds such as Azure) throws out fixed node groups. Instead of scaling predefined shapes, it looks at the actual Pending pods and provisions a node sized for them directly from the cloud’s instance catalog: the cheapest instance type that fits the pods and the constraints you set.
You give it a NodePool describing the bounds it’s allowed to operate within, and Karpenter picks instance types from that space per workload:
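apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: general
spec:
  template:
    spec:
      # karpenter.sh/v1 requires a nodeClassRef; "default" is an assumed EC2NodeClass name
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64", "arm64"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]
  # how and when Karpenter may repack or remove nodes
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m
  # hard ceiling on total CPU this NodePool may provision
  limits:
    cpu: "1000"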
Two things make Karpenter compelling. First, bin-packing flexibility: because it isn’t locked to fixed shapes, it can pick a large instance to consolidate many small pods or a small one for a single pod, whichever is cheaper. Second, consolidation: the disruption block lets Karpenter actively repack the cluster. It’ll notice three nodes that are each a third full, move their pods onto one, and terminate the other two; it can also replace a node with a cheaper instance type when the pods would still fit. CA removes a node only once it drops below the utilization threshold and its pods fit elsewhere; Karpenter actively defragments.
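If the pace of that repacking is a concern, the same disruption block also accepts budgets that cap how many nodes Karpenter may disrupt at once. The fragment below is a sketch with illustrative numbers: it allows 10% of nodes to be disrupted at any time and blocks disruption entirely during an assumed weekday business window (schedules are evaluated in UTC).

```yaml
# Fragment of a NodePool spec; values are illustrative, not recommendations
disruption:
  consolidationPolicy: WhenEmptyOrUnderutilized
  consolidateAfter: 1m
  budgets:
    - nodes: "10%"                  # at most 10% of this NodePool's nodes disrupted at once
    - nodes: "0"                    # no voluntary disruption at all...
      schedule: "0 9 * * mon-fri"   # ...starting 09:00 UTC on weekdays
      duration: 8h                  # ...for eight hours
```

When several budgets are active at the same moment, the most restrictive one applies.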
Letting Karpenter range over Spot, multiple architectures, and several instance families is where the cost savings come from — it’ll grab cheap arm64 Spot capacity when your pods don’t care, and fall back to on-demand when Spot is reclaimed.
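When a pod does care, it says so with ordinary scheduling constraints, and Karpenter narrows its choice accordingly. A minimal sketch, assuming a hypothetical workload that ships only amd64 images and cannot tolerate Spot interruptions:

```yaml
# Pod spec fragment for a workload that opts out of arm64 and Spot
spec:
  nodeSelector:
    kubernetes.io/arch: amd64               # image is built for amd64 only (assumption)
    karpenter.sh/capacity-type: on-demand   # never place this pod on Spot capacity
```

Pods without these selectors stay eligible for whatever is cheapest within the NodePool’s requirements.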
Make consolidation safe
Karpenter’s consolidation is aggressive by design, which is exactly what you want for cost and exactly what bites you if workloads aren’t protected. Two safeguards are non-negotiable:
PodDisruptionBudgets. Consolidation drains nodes. Without a PDB, Karpenter (or CA, or a node upgrade) can evict every replica of a service at once.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: payments-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: payments
do-not-disrupt for the genuinely sensitive. A pod mid-migration or a singleton batch job can opt out:
metadata:
  annotations:
    karpenter.sh/do-not-disrupt: "true"
The same logic applies to CA, which honors PDBs and a cluster-autoscaler.kubernetes.io/safe-to-evict: "false" annotation. The lesson learned the hard way: autoscalers will move your pods, so tell them the rules or they’ll discover them during an incident.
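One detail that is easy to get wrong: both annotations are read from the pod, so in a Deployment they belong on the pod template rather than on the Deployment’s own metadata. A sketch of the placement, showing both annotations side by side (you would normally set only the one for the autoscaler you run):

```yaml
# Deployment fragment: annotations go under spec.template.metadata, not top-level metadata
spec:
  template:
    metadata:
      annotations:
        karpenter.sh/do-not-disrupt: "true"                       # honored by Karpenter
        cluster-autoscaler.kubernetes.io/safe-to-evict: "false"   # honored by Cluster Autoscaler
```

Either annotation pins the node for as long as the pod runs, so reserve them for workloads that truly cannot be moved.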
Requests are the input to everything
Both tools schedule off resource requests, not actual usage. If your pods request 2Gi and use 200Mi, the autoscaler provisions nodes for the 2Gi — you pay for phantom capacity. If they request nothing, the scheduler packs them tight and the autoscaler never adds nodes until things are already on fire. Getting requests right is the single biggest lever on autoscaling cost and stability. Do that work first; the autoscaler only amplifies whatever your requests tell it.
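As a concrete shape for that work, here is a sketch of a container resources block for the pod in the example above, assuming observed memory usage of around 200Mi. The numbers are illustrative; derive yours from real usage metrics.

```yaml
# Container spec fragment; values are assumptions based on ~200Mi observed usage
resources:
  requests:
    cpu: 250m        # what the scheduler and the autoscaler reserve for this container
    memory: 256Mi    # modest headroom over observed usage, instead of a 2Gi guess
  limits:
    memory: 512Mi    # the container is OOM-killed if it exceeds this
```

With a 256Mi request, roughly eight of these pods fit in the memory that a single 2Gi request would have reserved.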
Watching it work
# why is a pod Pending?
kubectl describe pod mypod | sed -n '/Events/,$p'
# Karpenter's decisions
kubectl logs -n karpenter deploy/karpenter --since=15m | grep -i 'launched\|consolidat\|disrupt'
# what did it provision?
kubectl get nodeclaims
When pods are Pending, the pod’s own events tell you whether it’s “Insufficient cpu” (the autoscaler should be adding nodes) or “didn’t match node affinity” (no node group/NodePool can ever satisfy it — a config bug, not a capacity problem).
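To make the second case concrete, here is a sketch of a pod that the general NodePool from earlier can never satisfy: it asks for instance category g, while that NodePool allows only c, m, and r. The pod name and image are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-job                                    # hypothetical name
spec:
  nodeSelector:
    karpenter.k8s.aws/instance-category: g         # not in the NodePool's ["c", "m", "r"]
  containers:
    - name: main
      image: registry.example.com/gpu-job:latest   # placeholder image
      resources:
        requests:
          cpu: "1"
          memory: 1Gi
```

No amount of waiting fixes this pod; either the selector or the NodePool requirements have to change.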
Where AI helps
The autoscaling questions that eat time are diagnostic: why is this pod Pending, why did Karpenter pick that instance, why won’t this node scale down. I paste the pod events, the autoscaler logs, and the NodePool or node-group config and ask the model to walk the decision chain — it’s good at spotting the affinity rule no node satisfies or the PDB that’s blocking every scale-down. It’s also useful for drafting NodePool requirements from a description of your workload mix. Run your autoscaler config and PDBs through our AI code review tool to catch the dangerous gaps, like a Deployment with no PDB that consolidation could take fully offline.
Node autoscaling done right is invisible: pods schedule, nodes appear and vanish, the bill tracks load. Done wrong it’s Pending pods or a runaway invoice. Start with honest resource requests, protect workloads with PDBs, and pick the tool that matches your platform. For more, see our Kubernetes and Helm guides.
AI autoscaling diagnoses are assistive, not authoritative. Validate provisioning and disruption changes in a non-production cluster first.