AWS Error Guide: 'VcpuLimitExceeded' and LimitExceeded

Overview

VcpuLimitExceeded (and the broader LimitExceeded) means a launch would push you past a Service Quota — most often the per-region On-Demand vCPU limit for an instance family. Unlike InsufficientInstanceCapacity (an AWS supply problem), this is an account ceiling: AWS has the capacity, but your account is not permitted to use that much. The launch is rejected until you either free up usage or raise the quota.

EC2 phrases it by family group:

An error occurred (VcpuLimitExceeded) when calling the RunInstances operation: You have requested more vCPU capacity than your current vCPU limit of 64 allows for the instance bucket that the specified instance type belongs to. Please visit http://aws.amazon.com/contact-us/ec2-request to request an adjustment to this limit.

Other services raise the generic form:

An error occurred (LimitExceeded) when calling the CreateFunction operation: Code storage limit exceeded.

It occurs on RunInstances, ASG scale-out, Spot/Fleet requests, EKS/ECS capacity, and many create/allocate APIs (Elastic IPs, NAT gateways, VPCs, Lambda storage).

Symptoms

RunInstances fails with VcpuLimitExceeded naming a vCPU limit and an “instance bucket”.
ASG activity history shows VcpuLimitExceeded and the group cannot scale out.
A create call fails with LimitExceeded, ResourceLimitExceeded, or LimitExceededException.
The launch succeeds with a smaller count or in another region.

aws ec2 run-instances --instance-type c6i.8xlarge --count 4 \
  --image-id ami-0abcd1234ef567890 --subnet-id subnet-0aaa1111

An error occurred (VcpuLimitExceeded) when calling the RunInstances operation: You have requested more vCPU capacity than your current vCPU limit of 64 allows for the instance bucket that the specified instance type belongs to.

aws service-quotas get-service-quota --service-code ec2 \
  --quota-code L-1216C47A --query 'Quota.[QuotaName,Value]' --output text

Running On-Demand Standard (A, C, D, H, I, M, R, T, Z) instances	64.0

Common Root Causes

1. The Standard On-Demand vCPU quota is reached

This quota caps the combined vCPUs of all running Standard-family (A, C, D, H, I, M, R, T, Z) On-Demand instances in the region. The launch fails when your running fleet plus the new request exceeds it.

aws service-quotas get-service-quota --service-code ec2 \
  --quota-code L-1216C47A --query 'Quota.Value' --output text
aws ec2 describe-instances --filters Name=instance-state-name,Values=running \
  --query 'Reservations[].Instances[].InstanceType' --output text | tr '\t' '\n' | sort | uniq -c

64.0
      6 c6i.8xlarge

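Six c6i.8xlarge is 6 × 32 = 192 vCPUs. Against a limit of 64 the Standard bucket is already exhausted, so any further Standard On-Demand launch is rejected. Note that this listing also includes running Spot instances, which count against the separate Spot quota, so exclude them before comparing.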
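The tally above can be turned into a vCPU total with a small helper. This is a minimal sketch rather than part of the original workflow: sum_vcpus is a hypothetical function name, and the lookup file of "TYPE VCPUS" pairs is an assumption (one way to build it would be aws ec2 describe-instance-types with a query on VCpuInfo.DefaultVCpus).

```shell
# sum_vcpus LOOKUP_FILE   (hypothetical helper, not an AWS CLI command)
# Reads "COUNT TYPE" lines on stdin (the shape `sort | uniq -c` prints) and a
# non-empty lookup file with one "TYPE VCPUS" pair per line, then prints the
# total vCPU count. Exits non-zero if a type is missing from the lookup file.
sum_vcpus() {
  awk '
    NR == FNR { vcpus[$1] = $2; next }   # first input: load the lookup table
    !($2 in vcpus) {                     # type not present in the table
      print "unknown instance type: " $2 > "/dev/stderr"
      missing = 1
      next
    }
    { total += $1 * vcpus[$2] }          # COUNT x vCPUs per instance
    END { print total + 0; exit missing }
  ' "$1" -
}
```

With a lookup file containing the line "c6i.8xlarge 32", piping "6 c6i.8xlarge" into sum_vcpus prints 192, the figure compared against the quota above.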

2. A separate quota per instance-family group

The G and VT (GPU and video transcoding), P (GPU), X (high-memory), Inf, Trn, and F (FPGA) family groups each have their own On-Demand vCPU quota, and the defaults are often low. A launch in one of these groups counts against that group’s limit, not the Standard one.

aws service-quotas list-service-quotas --service-code ec2 \
  --query "Quotas[?contains(QuotaName,'On-Demand')].[QuotaName,QuotaCode,Value]" --output text

Running On-Demand Standard (A,C,D,H,I,M,R,T,Z) instances	L-1216C47A	64.0
Running On-Demand G and VT instances	L-DB2E81BA	8.0
Running On-Demand P instances	L-417A185B	0.0

A P (GPU) quota of 0.0 means you cannot launch any P instance until you request an increase.

3. Spot vCPU quota is separate from On-Demand

Spot has its own per-region vCPU quota, “All Standard (A, C, D, H, I, M, R, T, Z) Spot Instance Requests”. A Spot request or fleet can fail with MaxSpotInstanceCountExceeded even when the On-Demand quota has room.

aws service-quotas list-service-quotas --service-code ec2 \
  --query "Quotas[?contains(QuotaName,'Spot')].[QuotaName,Value]" --output text

All Standard (A,C,D,H,I,M,R,T,Z) Spot Instance Requests	32.0

If your Spot fleet needs more than 32 Standard vCPUs, raise the Spot quota specifically.

4. A non-EC2 resource quota (EIPs, NAT, VPCs, Lambda storage)

LimitExceeded also fires for Elastic IPs per region, NAT gateways per AZ, VPCs per region, Lambda code storage, and many others — each a distinct quota.

aws service-quotas get-service-quota --service-code ec2 \
  --quota-code L-0263D0A3 --query 'Quota.[QuotaName,Value]' --output text
aws ec2 describe-addresses --query 'length(Addresses)' --output text

Number of EIPs - VPC EIPs	5.0
5

Five EIPs against a quota of 5 means the next allocate-address returns AddressLimitExceeded.

5. Requesting in a region with a low default quota

New or rarely used regions often carry the default (sometimes low) quota. A workload moved to such a region hits limits the primary region never showed.

aws service-quotas get-service-quota --service-code ec2 \
  --quota-code L-1216C47A --region ap-south-1 --query 'Quota.Value' --output text

5.0

A Standard vCPU quota of 5.0 in a secondary region blocks all but the smallest launch — request increases per region.

6. A pending or denied quota-increase request

You requested a quota increase, but the request is still PENDING (or was denied), so the effective limit has not changed.

aws service-quotas list-requested-service-quota-change-history \
  --service-code ec2 --query 'RequestedQuotas[0:3].[QuotaName,DesiredValue,Status]' --output text

Running On-Demand Standard ... instances	256.0	PENDING

PENDING means the higher limit is not yet active. Wait for APPROVED (or CASE_CLOSED), then confirm the new Value with get-service-quota before retrying the launch.

Diagnostic Workflow

Step 1: Read which limit and bucket the message names

aws ec2 run-instances --instance-type <TYPE> --count <N> \
  --image-id <AMI> --subnet-id <SUBNET> 2>&1 | grep -oE 'vCPU limit of [0-9]+ .*bucket'

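The message gives the limit value but does not name the bucket. Infer the family group from the instance type you requested (for example, c6i belongs to Standard and g5 to G and VT), and use the limit value to confirm the match in Step 2.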
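If you want the limit as a bare number for scripting, the grep above can be swapped for a small sed filter. This is a sketch that assumes the message wording shown in this guide; parse_vcpu_limit is a hypothetical name, and the pattern would need adjusting if AWS changes the text.

```shell
# parse_vcpu_limit   (hypothetical helper)
# Reads a RunInstances error message on stdin and prints only the numeric
# vCPU limit it quotes. Prints nothing when the wording is not found.
parse_vcpu_limit() {
  sed -nE 's/.*vCPU limit of ([0-9]+) allows.*/\1/p'
}

# Example usage (placeholders as in the command above):
#   aws ec2 run-instances --instance-type <TYPE> --count <N> \
#     --image-id <AMI> --subnet-id <SUBNET> 2>&1 | parse_vcpu_limit
```

Fed the sample error from the Symptoms section, the filter prints 64.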

Step 2: Look up the matching quota and its code

aws service-quotas list-service-quotas --service-code ec2 \
  --query "Quotas[?contains(QuotaName,'On-Demand')].[QuotaName,QuotaCode,Value]" --output text

Match the family group from Step 1 to its QuotaCode and current Value.
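The matching step can be sketched as a lookup from instance type to family group and quota code. This is an illustration under stated assumptions: quota_code_for_type is a hypothetical helper, the prefix-to-group mapping is inferred from the quota names, and only the three quota codes that appear in this guide’s sample output are included. Treat the list-service-quotas output as the source of truth.

```shell
# quota_code_for_type INSTANCE_TYPE   (hypothetical helper)
# Prints "GROUP<TAB>QUOTA_CODE" for the family groups whose codes appear in
# this guide. Anything else prints "other<TAB>unknown" so it is looked up
# manually instead of guessed.
quota_code_for_type() {
  case "$1" in
    # Groups not covered here. Checked first because their prefixes overlap
    # Standard letters (inf/i, trn/t, dl/d, hpc/h, mac/m).
    inf*|trn*|dl*|hpc*|mac*|u-*) printf 'other\tunknown\n' ;;
    g[0-9]*|vt[0-9]*)            printf 'G and VT\tL-DB2E81BA\n' ;;
    p[0-9]*)                     printf 'P\tL-417A185B\n' ;;
    [acdhimrtz]*)                printf 'Standard\tL-1216C47A\n' ;;
    *)                           printf 'other\tunknown\n' ;;
  esac
}
```

For example, quota_code_for_type g5.12xlarge prints the G and VT group with code L-DB2E81BA, while an x2 or f1 type falls through to "other" because this guide does not list those codes.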

Step 3: Measure current usage against the quota

aws ec2 describe-instances --filters Name=instance-state-name,Values=running \
  --query 'Reservations[].Instances[].[InstanceType]' --output text \
  | tr '\t' '\n' | sort | uniq -c

Sum the vCPUs of running instances in that family group; compare to the quota to see how much headroom exists.
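The headroom comparison is simple arithmetic, sketched here as two hypothetical helpers (vcpu_headroom and max_launchable are illustrative names, not AWS tooling). Both accept the decimal form that Service Quotas prints, such as 64.0.

```shell
# vcpu_headroom QUOTA USED   (hypothetical helper)
# Prints the remaining vCPUs under the quota; negative means already over.
vcpu_headroom() {
  awk -v quota="$1" -v used="$2" 'BEGIN { printf "%d\n", quota - used }'
}

# max_launchable QUOTA USED VCPUS_PER_INSTANCE   (hypothetical helper)
# Prints how many more instances of the given size fit under the quota.
max_launchable() {
  awk -v quota="$1" -v used="$2" -v per="$3" 'BEGIN {
    free = quota - used
    n = (free > 0 && per > 0) ? int(free / per) : 0   # never below zero
    print n
  }'
}
```

With a quota of 64.0 and 32 vCPUs in use, vcpu_headroom prints 32 and max_launchable for a 32-vCPU type prints 1, so a request for --count 4 would be rejected.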

Step 4: Check for in-flight increase requests

aws service-quotas list-requested-service-quota-change-history --service-code ec2 \
  --query 'RequestedQuotas[].[QuotaName,DesiredValue,Status]' --output text

Avoid filing a duplicate if one is already PENDING.

Step 5: Request the increase (or reduce the request)

aws service-quotas request-service-quota-increase \
  --service-code ec2 --quota-code <QUOTA_CODE> --desired-value <NEW_LIMIT> \
  --query 'RequestedQuota.[Status,DesiredValue]' --output text

For an immediate workaround, launch fewer/smaller instances or use another region/family with headroom while the increase processes.

Example Root Cause Analysis

A new GPU training pipeline failed its first launch with VcpuLimitExceeded, blocking the data-science team. The On-Demand Standard quota was comfortably high, so the team was confused — they had “plenty of vCPUs”.

The limit reported in the message pointed to a different bucket:

aws ec2 run-instances --instance-type g5.12xlarge --count 1 \
  --image-id ami-0abcd1234ef567890 --subnet-id subnet-0aaa1111 2>&1 \
  | grep -oE 'vCPU limit of [0-9]+ .*'

vCPU limit of 0 allows for the instance bucket that the specified instance type belongs to

A limit of 0: the G and VT On-Demand quota, which is separate from Standard, had never been raised.

aws service-quotas list-service-quotas --service-code ec2 \
  --query "Quotas[?contains(QuotaName,'G and VT')].[QuotaName,QuotaCode,Value]" --output text

Running On-Demand G and VT instances	L-DB2E81BA	0.0

A g5.12xlarge has 48 vCPUs, so the team needed a quota of at least 48. Fix: request a G and VT quota increase to 96, enough for two such nodes (2 × 48) and therefore headroom beyond the first.

aws service-quotas request-service-quota-increase \
  --service-code ec2 --quota-code L-DB2E81BA --desired-value 96 \
  --query 'RequestedQuota.Status' --output text

PENDING

Once approved, the GPU launches succeeded. The lesson: each family group has its own vCPU quota.

Prevention Best Practices

Track usage against each family-group vCPU quota (Standard, G/VT, P, X, Inf/Trn, F) with CloudWatch usage metrics and alarm before you hit the ceiling.
Request quota increases per family and per region ahead of new workloads — GPU and high-memory defaults are often 0 or very low.
Remember Spot and On-Demand have separate vCPU quotas; raise the one your fleet actually uses.
Distinguish VcpuLimitExceeded (your account ceiling, fixable by a quota increase) from InsufficientInstanceCapacity (AWS supply, fixable by flexibility) — they look similar but the remedy differs.
Automate quota requests through Service Quotas (or a quota request template for new accounts in an AWS Organization) so a region rollout does not stall on default limits.
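The "alarm before you hit the ceiling" practice comes down to a threshold check on usage as a share of the quota. The sketch below shows only that arithmetic. The function names are hypothetical, the 80 percent threshold is an arbitrary example, and the usage figure is assumed to come from elsewhere (for instance the ResourceCount metric in the CloudWatch AWS/Usage namespace, or the instance tally from the diagnostic workflow).

```shell
# quota_usage_pct USED QUOTA   (hypothetical helper)
# Prints the integer percentage of the quota in use, rounded down.
quota_usage_pct() {
  awk -v used="$1" -v quota="$2" 'BEGIN {
    if (quota <= 0) { print 100; exit }   # treat a zero quota as fully used
    printf "%d\n", (used * 100) / quota
  }'
}

# over_threshold USED QUOTA THRESHOLD_PCT   (hypothetical helper)
# Succeeds (exit status 0) when usage is at or above the threshold.
over_threshold() {
  [ "$(quota_usage_pct "$1" "$2")" -ge "$3" ]
}

# Example usage:
#   over_threshold 52 64.0 80 && echo "request an increase now"
```

A zero quota is reported as 100 percent so that family groups with a default of 0 are flagged before the first launch attempt.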

Quick Command Reference

# Which family bucket and limit did you hit?
aws ec2 run-instances --instance-type <TYPE> --count <N> --image-id <AMI> \
  --subnet-id <SUBNET> 2>&1 | grep -oE 'vCPU limit of [0-9]+ .*'

# List On-Demand vCPU quotas with codes
aws service-quotas list-service-quotas --service-code ec2 \
  --query "Quotas[?contains(QuotaName,'On-Demand')].[QuotaName,QuotaCode,Value]" --output text

# Current running vCPU usage by type
aws ec2 describe-instances --filters Name=instance-state-name,Values=running \
  --query 'Reservations[].Instances[].InstanceType' --output text | tr '\t' '\n' | sort | uniq -c

# In-flight increase requests
aws service-quotas list-requested-service-quota-change-history --service-code ec2 \
  --query 'RequestedQuotas[].[QuotaName,DesiredValue,Status]' --output text

# Request an increase
aws service-quotas request-service-quota-increase \
  --service-code ec2 --quota-code <QUOTA_CODE> --desired-value <NEW_LIMIT>

Conclusion

VcpuLimitExceeded / LimitExceeded means a launch or create would exceed a Service Quota — your account ceiling, not AWS supply. The usual root causes:

The Standard On-Demand vCPU quota for the region is reached.
A separate, lower quota for a family group (G/VT, P, X, Inf/Trn, F).
The Spot vCPU quota (distinct from On-Demand) is exhausted.
A non-EC2 resource quota (EIPs, NAT gateways, VPCs, Lambda storage).
A low default quota in a secondary region.
A quota-increase request still PENDING or denied.

Read the limit in the message, map the requested instance type to its family bucket, find the matching quota code, compare usage to the limit, then request the increase for that exact family and region.
