CloudFront Caching and Performance With AI

The fastest way to break a CloudFront distribution is to get the cache key wrong. Forward one too many headers and your hit ratio collapses to single digits; forward one too few and a logged-in user sees another customer’s dashboard out of the edge cache. I’ve watched both happen, and the second one is the kind of mistake that ends up in an incident review. CloudFront’s caching model is deceptively deep — cache behaviors, cache policies, origin request policies, TTL precedence, origin shield, invalidation economics — and most of the defaults are tuned for a static marketing site, not the mixed static-plus-dynamic origins most of us actually run.

This is where AI earns its keep, and where it can also get you fired if you trust it blindly. The model is genuinely good at drafting cache policies and explaining why CloudFront chose a particular TTL given a confusing mix of origin headers. It is not good at knowing which of your paths return personalized content. So the division of labor I use is simple: AI drafts the cache behaviors and the cache-key strategy and explains its reasoning; I verify every path that could possibly carry user state before any of it ships.

Start by mapping behaviors to content types

CloudFront evaluates path patterns in the order the cache behaviors are listed and applies the first one that matches; the default behavior (*) is the catch-all and is always evaluated last. The first design decision is segmenting traffic so each class of content gets the right cache treatment. A typical app has at least three classes: long-lived versioned static assets, short-lived HTML, and never-cached API and authenticated routes.

aws cloudfront get-distribution-config --id E1ABCDEF234567 \
  --query 'DistributionConfig.CacheBehaviors.Items[].{Path:PathPattern,Policy:CachePolicyId,OriginReq:OriginRequestPolicyId}' \
  --output table

Reviewing the existing layout first matters because AI will happily propose a greenfield design that ignores the behaviors you already depend on. Feed it the current config and the path semantics, and constrain it to changes.
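To make the ordering rule concrete, here is a minimal Python sketch of first-match behavior selection. The patterns /static/* and /api/* are hypothetical, and fnmatchcase only approximates CloudFront's wildcard rules (it also gives [ ] a special meaning, which CloudFront does not), so treat this as a way to reason about a layout, not as a replica of the edge.

```python
from fnmatch import fnmatchcase

# Hypothetical ordered behaviors; the default behavior (*) is implicit and last.
BEHAVIORS = ["/static/*", "/api/*", "/dashboard*"]


def match_behavior(path, ordered_patterns=BEHAVIORS):
    """Return the first listed pattern that matches path, or "*" for the default."""
    for pattern in ordered_patterns:
        # Case-sensitive match; leading slashes are stripped so that
        # "/x/*" and "x/*" compare the same way.
        if fnmatchcase(path.lstrip("/"), pattern.lstrip("/")):
            return pattern
    return "*"


print(match_behavior("/static/app.3f9a1c.js"))
print(match_behavior("/dashboard/settings"))
print(match_behavior("/index.html"))
# Order matters: a broad pattern listed first shadows a narrower one after it.
print(match_behavior("/api/public/health", ["/api/*", "/api/public/*"]))
```

The last call is the one worth staring at: because /api/* is listed first, the narrower /api/public/* behavior never gets a chance to match.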

Cache policies own the cache key

Since CloudFront moved to managed and custom cache policies, the cache key is the distribution’s domain name and the URL path plus whatever the cache policy adds, and the policy can add exactly three kinds of input: query strings, headers, and cookies. Everything not in the cache key is invisible to caching but can still be forwarded to the origin through an origin request policy. Conflating those two is the single most common mistake I see, and it’s where I want AI to slow down and justify itself.
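The difference between "in the cache key" and "merely forwarded" is easier to see in a toy model. The Python sketch below is an illustration only, not how CloudFront builds keys internally; the CachePolicy class, the cache_key function, the host app.example.com, and the cookie name session are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CachePolicy:
    """Names of the request inputs that a policy adds to the cache key."""
    headers: tuple = ()
    cookies: tuple = ()
    query_strings: tuple = ()


def cache_key(host, path, policy, headers=None, cookies=None, query=None):
    """Build a toy key from host, path, and only the inputs the policy names."""
    headers = {k.lower(): v for k, v in (headers or {}).items()}
    cookies = cookies or {}
    query = query or {}
    parts = [host, path]
    parts += [f"header:{h.lower()}={headers.get(h.lower(), '')}"
              for h in sorted(policy.headers)]
    parts += [f"cookie:{c}={cookies.get(c, '')}" for c in sorted(policy.cookies)]
    parts += [f"query:{q}={query.get(q, '')}" for q in sorted(policy.query_strings)]
    return "|".join(parts)


no_cookies = CachePolicy()
with_session = CachePolicy(cookies=("session",))  # hypothetical cookie name

alice = {"session": "alice-token"}
bob = {"session": "bob-token"}

# Cookie outside the key: both users map to the same cached object.
shared = (cache_key("app.example.com", "/dashboard", no_cookies, cookies=alice)
          == cache_key("app.example.com", "/dashboard", no_cookies, cookies=bob))
# Cookie inside the key: each session gets its own cached object.
split = (cache_key("app.example.com", "/dashboard", with_session, cookies=alice)
         != cache_key("app.example.com", "/dashboard", with_session, cookies=bob))
print(shared, split)
```

Forwarding the cookie to the origin changes nothing in this model, which is the point: only inputs named by the cache policy can separate one user's object from another's.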

Here is a custom cache policy for versioned static assets. The asset URLs already carry a content hash, so the query string and cookies are irrelevant to the cache key, and a long TTL is safe.

{
  "Name": "static-versioned-assets",
  "DefaultTTL": 86400,
  "MaxTTL": 31536000,
  "MinTTL": 86400,
  "ParametersInCacheKeyAndForwardedToOrigin": {
    "EnableAcceptEncodingGzip": true,
    "EnableAcceptEncodingBrotli": true,
    "HeadersConfig": { "HeaderBehavior": "none" },
    "CookiesConfig": { "CookieBehavior": "none" },
    "QueryStringsConfig": { "QueryStringBehavior": "none" }
  }
}

For the HTML behavior, I want a short TTL and the origin’s Cache-Control to win when present, so MinTTL stays at 0 and the origin controls freshness. The point of separate policies is that the cache key and the TTL envelope travel together per content class.
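As a rough counterpart to the static policy, the sketch below builds a cache policy config for HTML as a Python dict and prints it as JSON. The name html-short-ttl and the DefaultTTL and MaxTTL values are assumptions chosen for illustration, and the empty cache key is only appropriate for HTML that is identical for every visitor.

```python
import json

# Hypothetical policy for HTML that is identical for every visitor.
html_policy = {
    "Name": "html-short-ttl",  # assumed name
    "MinTTL": 0,               # never clamp the origin's max-age upward
    "DefaultTTL": 60,          # assumed fallback when the origin sends no caching headers
    "MaxTTL": 3600,            # assumed ceiling on whatever the origin asks for
    "ParametersInCacheKeyAndForwardedToOrigin": {
        "EnableAcceptEncodingGzip": True,
        "EnableAcceptEncodingBrotli": True,
        "HeadersConfig": {"HeaderBehavior": "none"},
        "CookiesConfig": {"CookieBehavior": "none"},
        "QueryStringsConfig": {"QueryStringBehavior": "none"},
    },
}

# Emit JSON in the same shape as the static-asset policy.
print(json.dumps(html_policy, indent=2))
```

Keeping the policy in code makes it easy to diff the two content classes side by side before either one is applied.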

Let TTL precedence be explicit

The rule that trips people up: CloudFront uses the origin’s Cache-Control: max-age or Expires if it falls between MinTTL and MaxTTL. If the origin sends no caching headers, DefaultTTL applies. If the origin sends a value below MinTTL, CloudFront clamps up to MinTTL; if it sends a value above MaxTTL, CloudFront clamps down to MaxTTL. The upward clamp is exactly how a misconfigured MinTTL of 3600 silently caches a logout redirect for an hour. When I ask AI to reason about a reported staleness bug, I give it the origin response headers and the policy and make it walk the precedence.

Prompt: Origin returns Cache-Control: max-age=60 on /dashboard. The cache policy has MinTTL 300, DefaultTTL 86400, MaxTTL 31536000, and CookieBehavior “none”. Users report seeing another account’s data. Explain what’s wrong and the minimal fix.

Reply: Two compounding problems. First, MinTTL 300 overrides the origin’s max-age of 60: CloudFront clamps the TTL up to 300 seconds, so responses live at the edge five times longer than the origin intends. Second, and far more serious, CookieBehavior is “none”, so the session cookie is excluded from the cache key. Every authenticated user shares one cached object per URL, which is why one account’s dashboard is served to another. The minimal fix is to stop caching this path: assign a behavior for /dashboard* that uses the managed CachingDisabled policy, and forward the session cookie to the origin through an origin request policy. If the origin also needs the Authorization header, an origin request policy cannot list that header individually; use one that forwards all viewer headers, such as the managed AllViewer policy. Lowering MinTTL alone would not fix the cross-account leak, because the cookie still must enter the cache key (or caching must be disabled) for correctness.

That answer is correct, and it’s the kind of precedence reasoning that’s tedious to do by hand at 2 a.m. But notice the model only got there because I gave it the cookie behavior. It cannot infer that /dashboard is authenticated — I have to tell it, and I have to confirm the fix against the real distribution.
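The precedence walk is small enough to write down. This is a simplified Python sketch that models only an origin max-age (or its absence); it ignores s-maxage, Expires, and directives such as no-store, and the function name effective_ttl is my own.

```python
from typing import Optional


def effective_ttl(origin_max_age: Optional[int], min_ttl: int,
                  default_ttl: int, max_ttl: int) -> int:
    """Return the seconds an object stays cached under the simplified rules."""
    if origin_max_age is None:
        # No caching headers from the origin: the policy's DefaultTTL applies.
        return default_ttl
    # Otherwise the origin's value is clamped into [min_ttl, max_ttl].
    return max(min_ttl, min(origin_max_age, max_ttl))


# The /dashboard case from the prompt: max-age=60 against MinTTL 300.
print(effective_ttl(60, min_ttl=300, default_ttl=86400, max_ttl=31536000))
# Same origin header with MinTTL 0: the origin's value is honored.
print(effective_ttl(60, min_ttl=0, default_ttl=86400, max_ttl=31536000))
```

Running the reported policy through a function like this before reading any explanation, from a model or a colleague, gives a number to check the explanation against.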

Origin shield and hit ratio

Origin shield adds a regional caching layer between the edge POPs and your origin, which raises the effective cache hit ratio and shields the origin from the thundering herd of cache fills across dozens of edge locations. It’s worth enabling when your origin is expensive to hit or geographically distant from much of your traffic. The trade-off is a small added latency on misses and a per-request cost, so I don’t enable it reflexively on every distribution. AI is useful for modeling whether shield helps given your origin location and traffic distribution, but the decision is mine.

aws cloudfront get-distribution-config --id E1ABCDEF234567 \
  --query 'DistributionConfig.Origins.Items[].{Id:Id,Shield:OriginShield.Enabled,Region:OriginShield.OriginShieldRegion}'

Set the shield region to the AWS region nearest your origin, not nearest your users. Getting that backwards adds a cross-region hop on every fill.
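Both points can be put into a toy Python model. It assumes an idealized shield that collapses simultaneous misses for one object into a single origin fetch, and the latency figures are placeholders rather than measurements; the function names are hypothetical.

```python
def origin_fetches(caches_missing: int, shield_enabled: bool) -> int:
    """Origin fetches for one object when several upstream caches miss at once."""
    if caches_missing == 0:
        return 0
    # Idealized: the shield absorbs every miss and fills from the origin once.
    return 1 if shield_enabled else caches_missing


def pick_shield_region(latency_to_origin_ms: dict) -> str:
    """Choose the candidate region with the lowest latency to the origin."""
    return min(latency_to_origin_ms, key=latency_to_origin_ms.get)


# Placeholder numbers, not real data: latency from each candidate region
# to the origin (not to the users).
latency_to_origin_ms = {"us-east-1": 4, "eu-west-1": 78, "ap-southeast-1": 215}

print(pick_shield_region(latency_to_origin_ms))
print(origin_fetches(12, shield_enabled=False), origin_fetches(12, shield_enabled=True))
```

The input to pick_shield_region is deliberately latency to the origin; feeding it latency to users is the backwards choice described above.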

Invalidations are a smell, not a tool

Invalidations are slow and eventually consistent, and only the first 1,000 paths per month are free; after that, per-path charges kick in. Relying on them for routine deploys means you’ve given up on versioned URLs. The better pattern is content-hashed asset names, so a deploy simply references new URLs and the old ones age out. Reserve invalidation for genuine mistakes, such as a bad HTML page that has to die now.

aws cloudfront create-invalidation --distribution-id E1ABCDEF234567 \
  --paths "/index.html" "/sitemap.xml"

When I do need to invalidate broadly, I prefer a wildcard like /static/css/* over enumerating hundreds of paths, since a single wildcard counts as one path toward the free monthly allowance no matter how many objects it matches. I ask AI to help me find the narrowest wildcard that covers the changed set, then I confirm the blast radius before running it.
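Finding the narrowest wildcard and checking what else it catches can be sketched in a few lines of Python. This assumes absolute paths and treats a trailing * as a plain prefix match; the function names and the sample paths are hypothetical.

```python
import posixpath


def narrowest_wildcard(changed_paths):
    """Return a single path or directory wildcard that covers every changed path."""
    unique = sorted(set(changed_paths))
    if len(unique) == 1:
        return unique[0]  # one file needs no wildcard
    # Deepest directory shared by all changed files.
    parent = posixpath.commonpath([posixpath.dirname(p) for p in unique])
    return parent.rstrip("/") + "/*"


def blast_radius(wildcard, known_paths):
    """List every known path the invalidation would hit, changed or not."""
    if not wildcard.endswith("*"):
        return [p for p in known_paths if p == wildcard]
    prefix = wildcard[:-1]
    return [p for p in known_paths if p.startswith(prefix)]


changed = ["/static/css/site.css", "/static/css/print.css"]
known = changed + ["/static/css/vendor/grid.css", "/static/js/app.js"]

wildcard = narrowest_wildcard(changed)
print(wildcard)
print(blast_radius(wildcard, known))
```

In this sample the wildcard also catches /static/css/vendor/grid.css, which did not change. That extra hit is the blast radius to confirm before running the invalidation.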

Where the human stays in the loop

The whole workflow hinges on one discipline: AI never decides what is personalized. It drafts cache policies, explains TTL precedence, models origin shield trade-offs, and proposes the minimal invalidation — all faster than I could. But before any behavior ships, I personally trace every path that could carry a cookie, an auth header, or a query parameter that varies content per user, and I confirm those paths are either uncached or have the right inputs in the cache key. The model is a force multiplier on the analysis; it is not the owner of correctness.

If you want to push further, the cost side of the same distribution is worth a read in AWS cost optimization with AI, and you’ll find more edge and networking material under the AWS category. I keep my reusable CloudFront review prompts in the prompt library so the verification questions stay consistent across distributions.