Idempotency Keys for Safe API and Webhook Automation

I once watched an automation scale a node group up three times in ninety seconds. The trigger fired once. But the HTTP call to the cloud API timed out on the client side after the server had already accepted it, the retry logic fired, the second call also “timed out” the same way, and by the time the third succeeded we had three times the capacity and a very surprised finance team.

The bug was not the retries. Retries are correct — you want them. The bug was that the scaling endpoint had no way to know those three requests were the same intended action. That is precisely what an idempotency key fixes. In any automation that talks to APIs over an unreliable network, idempotency keys are the cheapest insurance you can buy.

Why “exactly once” is a lie

Every distributed system gives you one of two delivery guarantees: at-most-once (might drop messages) or at-least-once (might duplicate them). Nobody gives you exactly-once over the wire, because the sender can never be sure whether a request that timed out actually landed.

Webhook providers document this. Stripe warns that an endpoint can receive the same event more than once, and a GitHub delivery can be redelivered under the same delivery GUID. Retry libraries in your own code create the same situation from the other direction. So “exactly once” is not something you receive; it is something you construct on the receiving side, with idempotency keys.

The pattern in one diagram’s worth of code

The sender attaches a unique key to a request. The receiver records that key the first time it processes the request and refuses to process it again:

def handle_request(key, payload, store):
    existing = store.get(key)
    if existing is not None:
        # Already processed — return the SAME response we returned before
        return existing.response

    result = do_the_real_work(payload)
    store.put(key, Record(response=result, status="done"))
    return result

The key insight people miss: on a duplicate you must return the same response the first call produced, not just a bland “already handled.” The caller retried because it never saw the first response. If you return a different shape on the duplicate, you’ve just moved the inconsistency one layer up.
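To make the “same response” rule concrete, here is a minimal runnable sketch of the handler above. The `InMemoryStore` class, the `Record` dataclass, and the `do_the_real_work` stub are hypothetical stand-ins I made up for illustration; a real store would be a database, and this version still has the race discussed later.

```python
from dataclasses import dataclass


@dataclass
class Record:
    response: dict
    status: str


class InMemoryStore:
    """Hypothetical dict-backed store; a real one would be a database."""

    def __init__(self):
        self._rows = {}

    def get(self, key):
        return self._rows.get(key)

    def put(self, key, record):
        self._rows[key] = record


calls = []  # grows every time the side effect actually runs


def do_the_real_work(payload):
    # Stand-in for the real side effect (scaling, charging, paging).
    calls.append(payload)
    return {"scaled_to": payload["desired_count"], "request_no": len(calls)}


def handle_request(key, payload, store):
    existing = store.get(key)
    if existing is not None:
        return existing.response  # replay the stored response verbatim
    result = do_the_real_work(payload)
    store.put(key, Record(response=result, status="done"))
    return result


store = InMemoryStore()
first = handle_request("scale-42", {"desired_count": 6}, store)
second = handle_request("scale-42", {"desired_count": 6}, store)
```

Both calls return the response recorded on the first call, including its `request_no`, and `calls` holds a single entry, so the retrying caller cannot tell it was deduplicated.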

Choosing the key

For requests you originate, generate the key yourself and reuse it across retries of the same logical operation:

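key = f"scale-{cluster_id}-{desired_count}-{intent_id}"  # built once, outside the loop
for attempt in range(3):
    try:
        resp = api.post("/scale", json=body, headers={"Idempotency-Key": key})
    except TimeoutError:  # assumes the client raises this on a timeout
        continue          # outcome unknown: retry with the SAME key
    if resp.ok:
        break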

Note that the key is tied to intent, not to the attempt. All three retries send the same key. If you regenerate it per attempt — a depressingly common mistake — you’ve defeated the entire mechanism.
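The difference between a per-intent key and a per-attempt key is easy to see in a small simulation of the incident from the introduction. Everything here is an assumption for illustration: `FlakyApi` is a made-up server that applies the request but loses its first two responses, and `scale_with_retries` is a hypothetical client with no backoff, to keep the sketch short.

```python
import uuid


class FlakyApi:
    """Hypothetical server: applies the request, then loses the first two responses."""

    def __init__(self):
        self.seen = {}      # idempotency key -> stored response
        self.scale_ups = 0  # how many times the side effect really ran
        self.calls = 0

    def post(self, path, body, key):
        self.calls += 1
        if key not in self.seen:
            self.scale_ups += 1
            self.seen[key] = {"status": 200, "count": body["desired_count"]}
        if self.calls <= 2:
            raise TimeoutError("response lost after the server accepted it")
        return self.seen[key]


def scale_with_retries(api, body, make_key_per_attempt=False, attempts=3):
    intent_key = str(uuid.uuid4())  # one key for the whole logical operation
    for _ in range(attempts):
        # The buggy variant mints a fresh key on every attempt.
        key = str(uuid.uuid4()) if make_key_per_attempt else intent_key
        try:
            return api.post("/scale", body, key)
        except TimeoutError:
            continue  # outcome unknown: try again
    raise RuntimeError("gave up after retries")


good, bad = FlakyApi(), FlakyApi()
scale_with_retries(good, {"desired_count": 6})
scale_with_retries(bad, {"desired_count": 6}, make_key_per_attempt=True)
```

With the stable key, `good.scale_ups` ends at 1 despite three calls. With a fresh key per attempt, `bad.scale_ups` ends at 3, which is the tripled node group all over again.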

For inbound webhooks, the provider gives you an event ID (evt_... for Stripe, the delivery GUID for GitHub). Use that as your dedupe key. Never derive the key from a hash of the payload alone, because two genuinely separate events can carry identical payloads (think: two real “deploy succeeded” events seconds apart).
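A quick sketch shows why the payload hash fails. The event dictionaries, the `evt_001`-style IDs, and the `dedupe` helper are invented for this example and do not reflect any provider's actual event schema.

```python
import hashlib
import json


def payload_hash(event):
    # Hash of the payload only: two distinct events can collide here.
    body = json.dumps(event["data"], sort_keys=True).encode()
    return hashlib.sha256(body).hexdigest()


def dedupe(events, key_fn):
    """Return the IDs of events that would actually be processed."""
    seen, processed = set(), []
    for event in events:
        key = key_fn(event)
        if key in seen:
            continue  # treated as a duplicate delivery
        seen.add(key)
        processed.append(event["id"])
    return processed


deliveries = [
    {"id": "evt_001", "data": {"type": "deploy.succeeded", "service": "api"}},
    {"id": "evt_001", "data": {"type": "deploy.succeeded", "service": "api"}},  # redelivery
    {"id": "evt_002", "data": {"type": "deploy.succeeded", "service": "api"}},  # second real deploy
]

by_id = dedupe(deliveries, lambda e: e["id"])
by_hash = dedupe(deliveries, payload_hash)
```

Keying on the event ID processes `evt_001` and `evt_002` and drops only the redelivery. Keying on the payload hash silently drops the second real deploy as well.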

Pro Tip: Give idempotency records a TTL that comfortably exceeds your longest retry window, then expire them. A 24-hour TTL covers most client-side retry loops, but check it against each webhook provider's redelivery schedule, since some keep retrying for days. Keeping records forever is a slow-motion storage leak.
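As a rough illustration of the expiry behavior, here is a minimal in-memory sketch. `TtlStore` and its injectable `clock` parameter are hypothetical names chosen for this example; in Redis or DynamoDB you would normally lean on the store's own key expiry instead of writing this yourself.

```python
import time


class TtlStore:
    """Hypothetical in-memory dedupe store with per-record expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._rows = {}  # key -> (expires_at, response)

    def get(self, key):
        row = self._rows.get(key)
        if row is None:
            return None
        expires_at, response = row
        if self._clock() >= expires_at:
            del self._rows[key]  # lazily expire on read
            return None
        return response

    def put(self, key, response):
        self._rows[key] = (self._clock() + self._ttl, response)


now = [0.0]  # fake clock so the demo does not have to sleep
store = TtlStore(ttl_seconds=24 * 3600, clock=lambda: now[0])
store.put("evt_001", {"ok": True})

now[0] = 23 * 3600
within_window = store.get("evt_001")  # still deduplicated

now[0] = 25 * 3600
after_expiry = store.get("evt_001")   # record has expired
```

Note that lazy expiry only frees keys that are read again. A real store also needs a periodic sweep or native TTL support, otherwise never-retried keys still pile up.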

Handling the in-flight race

The naive code above has a race: two duplicates can both miss the store.get, both run the work, and both write. Under at-least-once delivery with parallel workers, this will happen.

Close it with a conditional insert that reserves the key before doing the work:

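def handle_request(key, payload, store):
    # Atomically reserve the key; only one caller gets True.
    reserved = store.put_if_absent(key, Record(status="processing"))
    if not reserved:
        existing = store.get(key)
        if existing is None or existing.status == "processing":
            raise Conflict("retry shortly")   # 409, let the caller back off
        return existing.response

    try:
        result = do_the_real_work(payload)
    except Exception:
        # Assumes store.delete exists and the work failed before its side effect.
        store.delete(key)   # release the reservation so retries are not stuck on 409
        raise
    store.update(key, response=result, status="done")
    return result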

put_if_absent must be atomic — a unique constraint in Postgres, a conditional write in DynamoDB, or SET key val NX in Redis. The atomicity is doing the real work here; without it you’re back to the race.
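Here is one way the unique-constraint approach might look. This sketch uses SQLite from the standard library as a stand-in for Postgres so it runs anywhere; the table name, column names, and the `mark_done` helper are assumptions for illustration. In Postgres the comparable statement would be `INSERT ... ON CONFLICT DO NOTHING`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE idempotency ("
    " key TEXT PRIMARY KEY,"  # the uniqueness constraint provides the atomicity
    " status TEXT NOT NULL,"
    " response TEXT)"
)


def put_if_absent(conn, key):
    """Return True only for the caller that created the row."""
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO idempotency (key, status) VALUES (?, 'processing')",
            (key,),
        )
    return cur.rowcount == 1  # 0 rows changed means the key already existed


def mark_done(conn, key, response):
    with conn:
        conn.execute(
            "UPDATE idempotency SET status = 'done', response = ? WHERE key = ?",
            (response, key),
        )


first = put_if_absent(conn, "scale-42")   # reserves the key
second = put_if_absent(conn, "scale-42")  # duplicate: reservation refused
mark_done(conn, "scale-42", '{"scaled_to": 6}')
row = conn.execute(
    "SELECT status, response FROM idempotency WHERE key = ?", ("scale-42",)
).fetchone()
```

The design point is that the check and the write are a single statement. The database decides the winner, so no amount of interleaving in application code can let two callers both see “absent.”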

Where AI helps and where you stay in charge

Writing this dedupe middleware is a great task to hand to Copilot or Claude. Treat the model like a fast junior engineer: describe your store (Postgres, Redis, DynamoDB) and it will draft the atomic put_if_absent and the TTL handling. It’s good at the mechanics.

What you do not delegate is the decision of which operations need idempotency at all. A human has to look at each side-effecting endpoint and judge the cost of double execution. Scaling a cluster, charging money, sending a page — those obviously need keys. A read-only status fetch does not. The model can’t weigh that blast radius for you, and it never gets the credentials to test against prod; generated middleware gets validated against a local store first. I keep a set of vetted dedupe prompts in my prompt workspace so this starts from patterns I’ve already reviewed.

Idempotency is your back-out path’s foundation

Here’s the part people underrate: idempotency keys make safe re-runs possible, which is the whole point of a back-out path. When a batch automation dies halfway, you want to just run it again. With idempotency keys, the already-completed items are no-ops and only the unfinished work executes. Without them, re-running risks double-applying everything that already succeeded.

So idempotency is not only about absorbing network retries — it’s what lets a human confidently hit “run again” after a partial failure instead of auditing every record by hand.
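A toy batch run shows the “run it again” property. The item names, the key format, and the `fail_on` switch that simulates a crash are all hypothetical, and the set-backed dedupe store is only a placeholder for a durable one.

```python
applied = []       # side effects that really happened, in order
done_keys = set()  # hypothetical dedupe store; durable in real life


def apply_once(item_id, fail_on=None):
    key = f"resize-job-7-{item_id}"  # key derived from the batch intent plus the item
    if key in done_keys:
        return "skipped"  # already completed in an earlier run
    if item_id == fail_on:
        raise RuntimeError(f"worker died on {item_id}")
    applied.append(item_id)
    done_keys.add(key)
    return "applied"


def run_batch(items, fail_on=None):
    return [apply_once(item, fail_on) for item in items]


items = ["vol-a", "vol-b", "vol-c", "vol-d"]
try:
    run_batch(items, fail_on="vol-c")  # first run dies halfway
except RuntimeError:
    pass

rerun = run_batch(items)  # operator simply runs the whole batch again
```

The rerun skips the two items that finished before the crash and applies only the remaining two, so each item ends up applied exactly once without anyone auditing the records by hand.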

Testing it honestly

The only way to trust this is to test the duplicate path explicitly. Fire the same request twice and assert that the side effect happened exactly once and both responses match:

def test_duplicate_is_noop():
    r1 = handle_request("k1", body, store)
    r2 = handle_request("k1", body, store)
    assert r1 == r2
    assert work_counter.value == 1

Then test the concurrent case by firing two requests in parallel threads against a shared store. If your put_if_absent isn’t truly atomic, this test flushes it out.
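A concurrent check along those lines might look like the sketch below. `LockedStore` is a hypothetical in-memory store that gets its atomicity from a lock; against a real database you would point the same threads at the real `put_if_absent`. A barrier releases all threads at once to maximize contention.

```python
import threading


class LockedStore:
    """Hypothetical in-memory store whose put_if_absent is atomic via a lock."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()

    def put_if_absent(self, key, value):
        with self._lock:  # check and write happen as one step
            if key in self._rows:
                return False
            self._rows[key] = value
            return True


store = LockedStore()
winners = []                   # one entry per thread that got to do the work
start = threading.Barrier(8)   # release all threads at the same moment


def worker():
    start.wait()
    if store.put_if_absent("k1", "processing"):
        winners.append(threading.get_ident())  # stands in for the side effect


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Exactly one thread should win the reservation. A passing run does not prove atomicity, since races are timing-dependent, so it is worth running a test like this many times in CI.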

Conclusion

Idempotency keys turn an unreliable, retry-happy network into something your automation can survive. Tie the key to intent, reserve it atomically before doing work, return the original response on duplicates, and expire keys on a TTL. Let AI draft the middleware, but keep the judgment about what needs protecting — and the production credentials — with a human.

For more on building automation that survives partial failure, the automation category covers retries, sagas, and reconciliation, and the prompts library has reviewed starting points for this kind of resilience code.