AWS Error Guide: 'InstanceLimitExceeded' EC2 On-Demand Quota Reached

Exact Error Message

An error occurred (InstanceLimitExceeded) when calling the RunInstances operation: You have requested more instances (5) than your current instance limit of 4 allows for the specified instance type. Please visit http://aws.amazon.com/contact-us/ec2-request to request an adjustment to this limit.

Auto Scaling groups surface the same condition in activity history:

Launching a new EC2 instance failed: You have requested more instances than your current instance limit allows. (InstanceLimitExceeded)

What the Error Means

EC2 enforces Service Quotas that cap how many On-Demand instances you can run, measured in vCPUs per instance family (Standard, F, G, P, X, and so on) per region. InstanceLimitExceeded means launching the requested instances would push your running total past the account’s quota for that family in that region. This is an account-level quota, not a permissions problem, not a hardware-capacity shortage, and not a momentary rate limit: your account is not allowed to run that many vCPUs at once, and retrying the identical request will fail every time until you free up headroom or raise the limit.

A subtle but important detail is that modern EC2 quotas are counted in vCPUs, not instance counts, even though the legacy error text still says “instances.” A quota of 64 for the Standard family means 64 vCPUs across every On-Demand instance in the A, C, D, H, I, M, R, T, and Z families in that region. Eight m5.2xlarge instances (8 vCPUs each) exhaust it just as surely as sixty-four single-vCPU instances, because every type in the family draws down the same shared budget.
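
As a rough worked example of that arithmetic, the sketch below totals the vCPUs a mixed fleet would draw from one family quota. The helper names `vcpus_for` and `fleet_vcpus` are hypothetical, and the lookup table covers only a handful of M5 sizes for illustration; it is not a complete catalog.

```shell
#!/bin/sh
# vcpus_for TYPE: print the default vCPU count for a few M5 sizes.
# Illustrative subset only; unknown types are reported as an error.
vcpus_for() {
  case "$1" in
    m5.large)    echo 2 ;;
    m5.xlarge)   echo 4 ;;
    m5.2xlarge)  echo 8 ;;
    m5.4xlarge)  echo 16 ;;
    m5.12xlarge) echo 48 ;;
    *) echo "unknown instance type: $1" >&2; return 1 ;;
  esac
}

# fleet_vcpus TYPE COUNT [TYPE COUNT ...]: total vCPUs for the whole fleet.
fleet_vcpus() {
  total=0
  while [ "$#" -ge 2 ]; do
    per=$(vcpus_for "$1") || return 1
    total=$((total + per * $2))
    shift 2
  done
  echo "$total"
}

fleet_vcpus m5.2xlarge 8               # prints 64: exactly a 64-vCPU quota
fleet_vcpus m5.large 4 m5.12xlarge 2   # prints 104: over a 64-vCPU quota
```

The second call shows how two large instances dominate the total even when the instance count is small.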

Note that this differs from InsufficientInstanceCapacity, which means AWS has no hardware available right now. VcpuLimitExceeded, by contrast, is the newer vCPU-based phrasing of the same quota condition. The remedy for InstanceLimitExceeded is to reduce the request or raise the quota; the remedy for a capacity error is to wait, change the AZ, or pick a different type.

Common Causes

Default account quota. New accounts have modest On-Demand limits that batch or scaling workloads quickly exhaust. AWS deliberately starts accounts low to limit runaway spend, so the default Standard-family quota is often far below what a production workload needs.
Quota measured per family per region. You can hit the limit in one region or family while others have headroom. Because each is tracked independently, a launch that works in us-east-1 can fail in eu-west-1, and an m5 request can fail while the G-family (GPU) quota sits unused.
Auto Scaling spike. A scale-out event requests more instances than the quota permits. The group reports the failure in its activity history and stalls below desired capacity, so the application never gets the headroom it scaled out for.
Orphaned running instances. Forgotten instances consume the quota and leave no room for new launches. Long-lived test boxes, failed-deploy leftovers, and detached Auto Scaling instances silently eat into the budget until a launch fails.
Large instance types. A few large instances consume many vCPUs, hitting the family quota fast. Two m5.12xlarge instances alone burn 96 vCPUs, so right-sizing matters as much as count when you are near a limit.
Parallel CI/test fleets. Many short-lived instances launched at once exceed the cap. Parallel test runners or nightly batch jobs spike vCPU usage well above the steady-state baseline you sized the quota for.

How to Reproduce the Error

Request more instances than the family quota allows in a region:

aws ec2 run-instances --image-id ami-0abcd1234 --instance-type m5.large \
  --count 5 --region us-east-1

If the running m5/standard-family vCPU total plus this request exceeds the quota:

An error occurred (InstanceLimitExceeded) when calling the RunInstances operation: You have requested more instances (5) than your current instance limit of 4 allows ...

Diagnostic Commands

Confirm the caller and region:

aws sts get-caller-identity

Look up the current On-Demand Standard-family quota (code L-1216C47A) and its value:

aws service-quotas get-service-quota \
  --service-code ec2 --quota-code L-1216C47A \
  --query 'Quota.[QuotaName,Value]' --output text

List all EC2 quotas to find the right family code:

aws service-quotas list-service-quotas --service-code ec2 \
  --query 'Quotas[?contains(QuotaName, `On-Demand`)].[QuotaCode,QuotaName,Value]' \
  --output table

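List the running instances and their types so you can total the vCPUs drawn from the family quota (the listing also includes any Spot instances, which count against separate Spot quotas):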

aws ec2 describe-instances \
  --filters Name=instance-state-name,Values=running \
  --query 'Reservations[].Instances[].[InstanceId,InstanceType]' --output text
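
The listing above is per instance, so it can help to collapse it into a count per type before working out vCPU usage. The sketch below is one way to do that; `count_by_type` is a hypothetical helper name, and it assumes the two-column text output produced by the command above.

```shell
#!/bin/sh
# count_by_type: read "InstanceId InstanceType" lines on stdin and
# print "InstanceType COUNT" for each type, sorted by type name.
count_by_type() {
  awk 'NF >= 2 { n[$2]++ } END { for (t in n) print t, n[t] }' | sort
}

# Usage (needs AWS credentials, so it is left commented out here):
#   aws ec2 describe-instances \
#     --filters Name=instance-state-name,Values=running \
#     --query 'Reservations[].Instances[].[InstanceId,InstanceType]' \
#     --output text | count_by_type
#
# The vCPU count for a given type can then be looked up with:
#   aws ec2 describe-instance-types --instance-types m5.large \
#     --query 'InstanceTypes[].VCpuInfo.DefaultVCpus' --output text
```

Multiplying each count by its type's vCPU value and summing within the family gives the figure to compare against the quota.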

Check whether a quota-increase request is already pending:

aws service-quotas list-requested-service-quota-change-history \
  --service-code ec2 --query 'RequestedQuotas[].[QuotaName,Status]' --output table

Step-by-Step Resolution

Confirm it is a quota, not capacity. InstanceLimitExceeded is about your account limit; InsufficientInstanceCapacity is AWS having no hardware. The fix differs completely — a quota increase will never resolve a capacity shortfall, and waiting will never resolve a quota wall — so read the error code before acting.
Identify the family and region. The message names the instance type; map it to its quota family (Standard, G, P, etc.) and the region you launched in. This matters because you must request the increase against the exact family and region that threw the error, not a neighboring one.
Check current usage and quota with the diagnostic commands. Compare the family’s running vCPU total against the quota value to see how much headroom remains. If running instances leave no room, reclaim quota by terminating instances you no longer need before assuming you need a larger limit. Do the termination through your normal change-management process rather than ad hoc while diagnosing.
Request a quota increase for that family/region via Service Quotas:
```
aws service-quotas request-service-quota-increase \
  --service-code ec2 --quota-code L-1216C47A --desired-value 64
```
Small increases on common families are often approved automatically within minutes, while large jumps may be routed to AWS Support, so request realistic headroom rather than a round number far beyond your need.
Reduce the request in the meantime. Lower --count, choose a smaller instance type, or spread launches across regions with available headroom. This unblocks the immediate launch while the quota request is pending and keeps a deployment from sitting idle.
For Auto Scaling, raise the quota before increasing the group’s max size so scale-out does not fail. A group whose maximum exceeds the family quota will hit this error mid-scale-out, exactly when load is highest.
Verify by re-running the diagnostic quota query after approval, then retry the launch. Confirm the new value is active first, since the request status can read APPROVED slightly before the higher limit takes effect.
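
The headroom comparison in the steps above can be sketched as a small pre-flight check to run before a launch. The function names `max_launchable` and `fits` are hypothetical, and the numbers in the example (a quota of 64 with 56 vCPUs in use) are assumed purely for illustration.

```shell
#!/bin/sh
# max_launchable QUOTA USED PER_INSTANCE
# Print how many more instances of a type fit under the quota.
# Service Quotas reports values like "64.0", so the decimals are stripped.
max_launchable() {
  quota=${1%.*}
  used=$2
  per=$3
  headroom=$((quota - used))
  if [ "$headroom" -le 0 ]; then
    echo 0
    return 0
  fi
  echo $((headroom / per))
}

# fits QUOTA USED PER_INSTANCE COUNT
# Succeed (exit 0) only if the launch stays within the quota.
fits() {
  quota=${1%.*}
  [ $(( $2 + $3 * $4 )) -le "$quota" ]
}

max_launchable 64.0 56 2     # prints 4: four more 2-vCPU instances fit
if fits 64.0 56 2 5; then
  echo "launch fits"
else
  echo "launch would exceed quota"   # 56 + 5*2 = 66 > 64
fi
```

Running a check like this before raising `--count` or an Auto Scaling maximum turns a failed launch into a predictable planning step.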

Prevention and Best Practices

Track On-Demand quota usage per family per region and alarm before you approach the limit. CloudWatch publishes Service Quotas usage metrics, so you can alert at 80 percent of a family’s vCPU quota and act before a launch fails.
Request quota increases proactively ahead of known traffic spikes or large batch jobs. Approvals are not always instant, so treat a quota bump as a lead-time item rather than a launch-day scramble.
Clean up orphaned instances so they do not silently consume quota. Tag instances with an owner and purpose and reap untagged or expired ones automatically.
Right-size instance types so a few large instances do not exhaust a family’s vCPU budget, and prefer several smaller instances when that gives the same compute with more flexibility.
Spread fleets across regions or families where appropriate to balance quota pressure, which also improves availability.
Pre-flight Auto Scaling max sizes against the relevant quota so scale-out never hits the cap, and revisit the comparison whenever you change instance types.
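
The 80 percent rule mentioned above can also be sketched as a simple local check, for example in a scheduled job that already knows the family's vCPU usage and quota. `quota_alert` is a hypothetical helper, not an AWS tool, and the default threshold of 80 is only the figure suggested in this guide.

```shell
#!/bin/sh
# quota_alert USED QUOTA [THRESHOLD_PCT]
# Print "ALERT <pct>%" when usage is at or above the threshold,
# otherwise "OK <pct>%". Percentages are rounded down.
quota_alert() {
  used=$1
  quota=${2%.*}            # tolerate values such as "64.0"
  threshold=${3:-80}
  pct=$((used * 100 / quota))
  if [ "$pct" -ge "$threshold" ]; then
    echo "ALERT ${pct}%"
  else
    echo "OK ${pct}%"
  fi
}

quota_alert 40 64      # prints "OK 62%"
quota_alert 52 64      # prints "ALERT 81%"
```

A CloudWatch alarm on the usage metric is the more durable option; a check like this is only a lightweight complement.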

Related Errors

VcpuLimitExceeded: the vCPU-based phrasing of the same On-Demand quota.
InsufficientInstanceCapacity: AWS has no hardware available for that type/AZ right now (not a quota).
SpotMaxPriceTooLow / Spot capacity errors: capacity issues specific to Spot, not On-Demand quotas.
AddressLimitExceeded: the analogous quota error for Elastic IP addresses.

Frequently Asked Questions

Is this the same as InsufficientInstanceCapacity? No. InstanceLimitExceeded is your account quota — a limit you can raise — while InsufficientInstanceCapacity means AWS has no hardware available for that type and AZ at that moment. The first is fixed with a quota request; the second by waiting, switching AZ, or choosing a different type.

How do I raise the limit? Use Service Quotas (request-service-quota-increase) for the specific family and region. Approval for modest increases on common families is often automatic and near-instant, while larger requests can be reviewed by AWS Support, so request ahead of need.

Why is the limit per family and region? AWS measures On-Demand quotas by vCPUs per instance family per region, so each must be raised independently. Raising the Standard-family quota in us-east-1 does nothing for a GPU launch in eu-west-1 — identify and raise the exact quota that matches your workload.

Can I work around it without a quota increase? Yes, in the short term. Reduce --count, pick a smaller instance type, terminate unused instances to reclaim headroom, or launch in another region or family with room. These buy time, but a quota increase is the durable fix for a workload that genuinely needs the capacity.

Does this count Spot and On-Demand together? No. On-Demand and Spot have separate quotas, so heavy On-Demand usage does not consume your Spot budget. If On-Demand headroom is tight and the workload tolerates interruption, shifting part of the fleet to Spot sidesteps the On-Demand limit entirely.

Does this affect Auto Scaling? Yes. Scale-out fails with this error if the group’s max size exceeds the family quota. Raise the quota first. See the AWS guides for capacity-planning patterns.
