DynamoDB Capacity and Cost Optimization

DynamoDB has a reputation for being either suspiciously cheap or alarmingly expensive, with very little in between, and which one you get comes down to a handful of decisions you make early and rarely revisit. The service bills on read and write capacity units, storage, and a few peripheral features — but the headline number on your bill is almost always a story about your access pattern, not about DynamoDB’s pricing. Fix the access pattern and the bill follows.

I use an AI assistant constantly when reasoning about DynamoDB capacity, because it’s good at translating “this table gets bursty traffic at the top of every hour” into a concrete capacity-mode recommendation. But DynamoDB cost mistakes are expensive and quiet, so every recommendation gets checked against the actual CloudWatch metrics before it touches a table. Here’s the model I reason with.

On-demand vs provisioned: the real decision

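On-demand mode bills per request with no capacity to manage: you pay for what you consume, and the table absorbs traffic growth without any capacity planning (an abrupt jump far beyond the table’s previous peak can still throttle briefly). Provisioned mode reserves a fixed read/write throughput that you pay for whether you use it or not, at a lower per-unit rate.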

The decision isn’t “which is cheaper” in the abstract; it’s about your utilization. If your traffic is steady and predictable, provisioned with high utilization wins, because you’re paying the discounted rate on capacity you actually use. If your traffic is spiky or unpredictable, or you genuinely don’t know its shape yet (new tables), on-demand wins, because provisioned-with-low-utilization means paying for reserved capacity that sits idle.
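The utilization argument can be turned into a break-even number. The sketch below is a rough illustration, not a pricing calculator: the function name and both rates passed to it are placeholders I made up for the example, so substitute the current rates for your region. It assumes standard writes of items up to 1 KB, where one write capacity unit sustained for an hour supports up to 3,600 writes.

```shell
#!/usr/bin/env bash
# Break-even utilization: the fraction of provisioned write capacity you must
# actually consume before provisioned beats on-demand on price.
#
# $1 = on-demand price per million write request units (ASSUMED placeholder)
# $2 = provisioned price per WCU-hour                  (ASSUMED placeholder)
break_even_utilization() {
  LC_ALL=C awk -v od="$1" -v prov="$2" 'BEGIN {
    # One WCU held for an hour supports up to 3600 standard writes.
    cost_per_write_on_demand = od / 1000000
    printf "%.3f\n", prov / (3600 * cost_per_write_on_demand)
  }'
}

# Illustrative numbers only; replace them with rates from the pricing page.
break_even_utilization 1.25 0.00065
```

With these placeholder rates the function prints 0.144: under these assumptions, a table that averages less than about 14% utilization of its provisioned writes would cost less on on-demand, and one that averages more would cost less on provisioned.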

# Switch a table to on-demand
aws dynamodb update-table \
  --table-name Orders \
  --billing-mode PAY_PER_REQUEST

# Or provisioned with explicit capacity
aws dynamodb update-table \
  --table-name Orders \
  --billing-mode PROVISIONED \
  --provisioned-throughput ReadCapacityUnits=200,WriteCapacityUnits=100

A real cost trap: people put a new table on provisioned with generous capacity “to be safe,” then never tune it. That’s the worst of both worlds: you pay for provisioned capacity around the clock and mostly don’t touch it. New tables belong on on-demand until you have weeks of metrics to size provisioned correctly. You can switch billing mode later, but switches from provisioned to on-demand are limited within any 24-hour window (historically to one, so check the current quota), so plan the cutover.

Auto scaling tracks demand, with caveats

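If you go provisioned, auto scaling adjusts capacity toward a target utilization (commonly 70%). You register the table as a scalable target and attach a target-tracking policy to it. The commands below cover writes; reads need their own target and policy on the dynamodb:table:ReadCapacityUnits dimension with the DynamoDBReadCapacityUtilization metric.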

aws application-autoscaling register-scalable-target \
  --service-namespace dynamodb \
  --resource-id "table/Orders" \
  --scalable-dimension "dynamodb:table:WriteCapacityUnits" \
  --min-capacity 50 --max-capacity 500

aws application-autoscaling put-scaling-policy \
  --service-namespace dynamodb \
  --resource-id "table/Orders" \
  --scalable-dimension "dynamodb:table:WriteCapacityUnits" \
  --policy-name OrdersWriteTracking \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration '{
    "TargetValue": 70.0,
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "DynamoDBWriteCapacityUtilization"
    },
    "ScaleInCooldown": 60,
    "ScaleOutCooldown": 60
  }'

The caveat the AI flagged and I’d want everyone to know: auto scaling reacts, it doesn’t predict. It scales on a CloudWatch metric that’s already trending, so a sharp, instantaneous spike can hit your current ceiling and throttle before scaling catches up. For genuinely spiky workloads, on-demand handles the burst more gracefully than provisioned-plus-auto-scaling. Auto scaling shines on gradual diurnal curves, not on flash traffic.
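To make the headroom concrete, here is a simplified model of where a 70% target lands, using the 50 to 500 bounds from the commands above. The function is hypothetical and deliberately naive: the real service acts on CloudWatch alarms and cooldowns, so treat this as arithmetic for reasoning, not as a description of how Application Auto Scaling computes its adjustments.

```shell
#!/usr/bin/env bash
# Capacity a target-tracking policy would aim for, clamped to the scalable
# target's bounds. Simplified model: no alarms, no cooldowns, no lag.
#
# $1 = consumed units per second   $2 = target utilization in percent
# $3 = min capacity                $4 = max capacity
desired_capacity() {
  local consumed=$1 target=$2 min=$3 max=$4
  # Ceiling division: smallest capacity that keeps utilization <= target.
  local want=$(( (consumed * 100 + target - 1) / target ))
  if (( want < min )); then want=$min; fi
  if (( want > max )); then want=$max; fi
  echo "$want"
}

desired_capacity 280 70 50 500   # steady load settles at 400
desired_capacity 600 70 50 500   # spike wants 858 but is capped at 500
```

In the steady case the table sits at 400 provisioned units with 280 consumed, so a jump past 400 has to wait for scaling to react. In the spike case even a fully scaled table stops at the configured maximum, which is a reminder to size max-capacity from real peaks.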

Hot partitions: the silent throttle

DynamoDB spreads your data across partitions by the hash of the partition key. Adaptive capacity lets a busy partition borrow unused throughput from the rest of the table, but a single partition still has hard limits (roughly 3,000 read capacity units and 1,000 write capacity units per second). If your access concentrates on one partition key value (a status = ACTIVE flag everyone queries, a single tenant that dwarfs the rest, a coarse time bucket such as today’s date), you get a hot partition: throttling on one key while the table as a whole looks comfortably under its provisioned capacity.

The fix is design, not capacity. Choose a partition key with high cardinality and even access. If you’re stuck with a low-cardinality key, write-sharding (suffixing the key with a small random or hashed value and fanning reads across the shards) spreads the load.
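A minimal sketch of write-sharding, under stated assumptions: the shard count of 10, the "<base>#<n>" key layout, and both function names are illustrative choices of mine, not anything DynamoDB provides. It uses cksum as a cheap deterministic hash so the same item always maps to the same shard; in an application you would do the equivalent in your SDK language.

```shell
#!/usr/bin/env bash
# Write-sharding sketch. SHARDS and the "<base>#<n>" layout are
# illustrative assumptions, not a DynamoDB feature.
SHARDS=10

# Pick a shard deterministically from an item id, so the same item always
# maps to the same sharded partition key.
sharded_key() {
  local base=$1 item_id=$2 sum
  sum=$(printf '%s' "$item_id" | cksum | cut -d ' ' -f 1)
  echo "${base}#$(( sum % SHARDS ))"
}

# List every sharded key for a base key; a reader queries each one and
# merges the results.
all_shard_keys() {
  local base=$1 i
  for (( i = 0; i < SHARDS; i++ )); do
    echo "${base}#${i}"
  done
}

sharded_key "2024-05-01" "event-8731"
all_shard_keys "2024-05-01"
```

The trade-off is visible in the second function: every read of the base key becomes SHARDS queries, so keep the shard count as small as the write load allows.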

Prompt: Our Events table throttles on writes even though table-level consumed capacity is well under provisioned. The partition key is the event date. What’s happening and what do I change?

Reply: That’s a classic hot-partition signature. A date partition key means every event written on a given day lands on the same partition, so all of today’s write traffic concentrates on one physical partition while the rest of the table sits idle — table-level metrics look healthy while one partition is pinned. Adaptive capacity helps but won’t fully rescue a single key absorbing the whole write stream. Re-key so writes spread: a composite like eventType#date raises cardinality if your types are balanced, or add a write-shard suffix (date#<0-9>) and query all shards on read. Confirm the diagnosis before redesigning, though — enable CloudWatch Contributor Insights on the table and check whether one partition key really is dominating the throttled requests; the metric will name the offending key rather than leaving you to guess.

That reply correctly separates the table-level metric (healthy) from the per-partition reality (throttled), names the two standard fixes, and — importantly — tells me to confirm with Contributor Insights before re-keying a production table. I enabled it, watched it finger the exact date key as the top throttled contributor, and only then committed to the re-key. Redesigning a partition key on a hunch is how you turn one problem into two.

aws dynamodb update-contributor-insights \
  --table-name Events \
  --contributor-insights-action ENABLE

TTL: free deletes that shrink storage and cost

Time to Live deletes expired items automatically with no consumed write capacity: DynamoDB removes them in the background, typically within a few days of expiry rather than at the exact timestamp, so filter on the expiry attribute in reads if serving a stale item matters. For tables holding transient data (sessions, short-lived events, caches), TTL is the cheapest cleanup you’ll find: it trims storage cost and, because expired items leave the table, it keeps your indexes and scans smaller too.

aws dynamodb update-time-to-live \
  --table-name Sessions \
  --time-to-live-specification "Enabled=true, AttributeName=expiresAt"

The attribute must be a Number holding a Unix epoch timestamp in seconds. The classic bug, and one AI assistants reproduce because the model can’t see your data, is storing milliseconds. A millisecond timestamp read as seconds is a date tens of thousands of years out, so nothing ever expires and you quietly pay storage on data you meant to delete. Verify with a single item and a clock, not by assuming.
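A quick sanity check for the seconds-versus-milliseconds mistake can be scripted. This is a heuristic sketch, not a DynamoDB feature: the function name and the cutoff of 100000000000 are my own assumptions, chosen because that value read as seconds is a date past the year 5000, while read as milliseconds it falls in early 1973.

```shell
#!/usr/bin/env bash
# Heuristic unit check for an epoch value destined for a TTL attribute.
# The cutoff is an assumption: values at or above it are far too large
# to be a plausible expiry expressed in seconds.
ttl_unit() {
  if (( $1 >= 100000000000 )); then
    echo "milliseconds"
  else
    echo "seconds"
  fi
}

# Build a correct value: now plus one hour, in whole seconds.
expires_at=$(( $(date +%s) + 3600 ))
echo "$expires_at is in $(ttl_unit "$expires_at")"

# A value produced by a millisecond clock is flagged.
ttl_unit 1735689600000
```

Running it against one real item’s attribute value is the scripted version of “a single item and a clock.”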

The cost levers, in priority order

When a DynamoDB bill needs trimming, the levers in rough order of impact: get the billing mode right for the actual traffic shape; fix hot partitions so you’re not over-provisioning the whole table to feed one key; enable TTL so you’re not paying storage on dead data; prune unused global secondary indexes, since every GSI carries its own capacity and storage; and consider reserved capacity only once provisioned utilization is stable and high. Don’t buy reserved capacity for a table you haven’t yet tuned — you’ll lock in the wrong number.

The honest summary is that DynamoDB cost is an access-pattern problem wearing a pricing-page costume. The AI is excellent at proposing the lever — “this looks like on-demand,” “this smells like a hot partition,” “your TTL is in milliseconds” — and you are responsible for the metric that confirms it: CloudWatch utilization, Contributor Insights, a single expiring item.

For deeper dives, the AWS guides cover the surrounding data-layer decisions, and the database and cost entries in the prompt library include tested scaffolds for capacity sizing and hot-partition diagnosis. The combination I keep coming back to is a “size this table from its CloudWatch metrics” prompt paired with a Contributor Insights check — draft the recommendation fast, then make the data earn your trust before you change a production table.