DevOps AI ToolKit

Redis Distributed Lock and Redlock Design Prompt

Design SET NX PX distributed locks safely, evaluate Redlock and its controversy, and add fencing tokens to avoid split-brain writes.

Target user
Engineers building distributed mutual exclusion on Redis
Difficulty
Advanced
Tools
Claude, ChatGPT

The prompt

You are a senior distributed-systems engineer who has built and debugged Redis locks in production and understands the Redlock debate in depth.

I will provide:
- What resource I need to protect
- My availability and correctness requirements
- My current locking code

Your job:

1. **Single-instance lock**: use `SET lock:<resource> <unique-token> NX PX <ttl>`. NX ensures only one holder; PX auto-expires so a crashed holder cannot deadlock. NEVER use SETNX + separate EXPIRE (non-atomic).
2. **Safe release**: release must be a Lua compare-and-delete — check the token matches before DEL, so a client cannot delete a lock it no longer owns after TTL expiry.
3. **Choose TTL carefully**: TTL must exceed worst-case critical-section time, or the lock expires mid-work and another client enters.
4. **Explain Redlock**: the multi-node algorithm that acquires the lock on a majority of N independent masters. State the Kleppmann critique (Redlock relies on timing assumptions, so GC pauses, clock jumps, or network delays can break mutual exclusion, which makes it unsafe for correctness-critical work) and the antirez rebuttal.
5. **Recommend fencing tokens**: a monotonically increasing token (e.g. from INCR), issued when the lock is acquired and passed to the protected resource so a stale lock holder's writes are rejected. Locks alone cannot guarantee safety without fencing.
6. **Decide correctness vs efficiency**: if the lock is only an optimization (avoid duplicate work), single-instance SET NX PX is fine. If correctness depends on mutual exclusion, Redis locks are the wrong tool — use a consensus system (ZooKeeper/etcd) with fencing.
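7. **Handle renewal**: long jobs need a watchdog that extends the TTL (PEXPIRE via Lua) only while the token still matches, and that stops renewing after a maximum hold time so a wedged worker cannot keep the lock forever.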
8. **Avoid pitfalls**: never DEL without a token check, bound the number of retries, and use jittered backoff on contention.

Mark as DESTRUCTIVE: DEL of a lock you do not own, FLUSHALL in tests (it wipes every key on the instance, including live locks), and disabling PX (permanent deadlock risk).

---

Resource to protect: [DESCRIBE]
Correctness vs efficiency: [DESCRIBE]
Current lock code: [PASTE]

Why this prompt works

Distributed locking on Redis is a minefield of subtly wrong implementations. This prompt makes the model separate the efficiency case (a single SET NX PX is fine) from the correctness case (where Redis locks are insufficient without fencing), and forces an honest treatment of the Redlock controversy instead of cargo-culting the algorithm.

How to use it

  1. State whether the lock is for efficiency or correctness — this changes everything.
  2. Provide worst-case critical-section duration so TTL is sane.
  3. Paste your acquire and release code — most bugs live in release.
  4. Ask explicitly about fencing tokens if writes must be safe.
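Step 2 can be turned into concrete numbers with a small helper. This is a rough sketch, not a rule from this page: the name `plan_lock_timing`, the default 2x safety factor, and the renew-at-one-third-of-TTL convention are all assumptions to tune for your workload.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LockTiming:
    ttl_ms: int          # value to pass as PX when acquiring
    renew_every_ms: int  # how often a watchdog would extend the TTL


def plan_lock_timing(worst_case_ms: int, safety_factor: float = 2.0) -> LockTiming:
    """Derive a TTL and renewal interval from the worst-case critical section.

    Assumption: the TTL is the worst case times a safety factor, and the
    watchdog renews at one third of the TTL so it gets more than one
    chance to renew before the lock expires.
    """
    if worst_case_ms <= 0:
        raise ValueError("worst_case_ms must be positive")
    if safety_factor < 1.0:
        raise ValueError("safety_factor below 1.0 would expire the lock mid-work")
    ttl_ms = int(worst_case_ms * safety_factor)
    renew_every_ms = max(1, ttl_ms // 3)
    return LockTiming(ttl_ms=ttl_ms, renew_every_ms=renew_every_ms)


if __name__ == "__main__":
    print(plan_lock_timing(10_000))
```

A margin like this only lowers the odds of expiry mid-work. It does not remove the need for fencing when correctness depends on the lock.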

Useful commands

# Acquire: atomic set-if-not-exists with TTL and a unique token
redis-cli SET lock:job:report "c3f9-uuid-token" NX PX 30000

# Release: compare-and-delete (only if we still own it)
redis-cli EVAL "if redis.call('GET', KEYS[1]) == ARGV[1] then \
  return redis.call('DEL', KEYS[1]) else return 0 end" \
  1 lock:job:report "c3f9-uuid-token"

# Renew (watchdog): extend only if token matches
redis-cli EVAL "if redis.call('GET', KEYS[1]) == ARGV[1] then \
  return redis.call('PEXPIRE', KEYS[1], ARGV[2]) else return 0 end" \
  1 lock:job:report "c3f9-uuid-token" 30000

# Fencing token source (monotonic)
redis-cli INCR fence:job:report
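
The same acquire, release, and renew commands could be wrapped in application code roughly as follows. This is a minimal sketch that assumes `client` behaves like a redis-py `redis.Redis` instance (its `set(..., nx=True, px=...)` and `eval(script, numkeys, *keys_and_args)` methods); the function names are illustrative and not part of any library.

```python
import uuid

# Lua: delete the key only if it still holds our token.
RELEASE_LUA = """
if redis.call('GET', KEYS[1]) == ARGV[1] then
  return redis.call('DEL', KEYS[1])
else
  return 0
end
"""

# Lua: extend the TTL only if the key still holds our token.
RENEW_LUA = """
if redis.call('GET', KEYS[1]) == ARGV[1] then
  return redis.call('PEXPIRE', KEYS[1], ARGV[2])
else
  return 0
end
"""


def acquire(client, key, ttl_ms):
    """Try once. Return the owner token on success, None if the lock is held."""
    token = str(uuid.uuid4())
    # Equivalent to: SET key token NX PX ttl_ms (atomic set-if-absent with expiry).
    if client.set(key, token, nx=True, px=ttl_ms):
        return token
    return None


def release(client, key, token):
    """Return True only if we still owned the lock and deleted it."""
    return client.eval(RELEASE_LUA, 1, key, token) == 1


def renew(client, key, token, ttl_ms):
    """Return True only if we still owned the lock and its TTL was extended."""
    return client.eval(RENEW_LUA, 1, key, token, ttl_ms) == 1
```

A `False` from `release` or `renew` means the lock expired or changed hands, so the caller should stop work rather than carry on as if it still held the lock.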

Example config

# Full lock lifecycle with a fencing token
TOKEN=$(uuidgen)

# 1. Acquire
ACQUIRED=$(redis-cli SET lock:resource:42 "$TOKEN" NX PX 20000)
# -> "OK" means acquired; anything else means someone else holds it
[ "$ACQUIRED" = "OK" ] || exit 1

# 2. Take the fencing token only after a successful acquire, so fence
#    order follows lock-ownership order.
FENCE=$(redis-cli INCR fence:resource:42)

# 3. Do work, passing $FENCE to the protected store.
#    The store must REJECT any write whose fence is lower than the highest
#    fence it has already accepted.

# 4. Release atomically by token
redis-cli EVAL "if redis.call('GET', KEYS[1]) == ARGV[1] then \
  return redis.call('DEL', KEYS[1]) else return 0 end" \
  1 lock:resource:42 "$TOKEN"
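
The lifecycle above relies on the protected store enforcing the fence. One way that check could look is sketched below as an in-memory model. `FencedStore` and `StaleFenceError` are hypothetical names, and a real store would have to persist the last accepted fence together with the data. The sketch rejects only strictly older fences so that the current holder can issue more than one write; a stricter store could reject equal fences as well.

```python
import threading


class StaleFenceError(Exception):
    """Raised when a write carries a fence older than one already accepted."""


class FencedStore:
    """In-memory model of a resource that enforces fencing tokens."""

    def __init__(self):
        self._mutex = threading.Lock()
        self._last_fence = 0
        self._value = None

    def write(self, fence, value):
        # The comparison and the update happen under one mutex, so two
        # concurrent writers cannot both pass the fence check.
        with self._mutex:
            if fence < self._last_fence:
                raise StaleFenceError(
                    f"fence {fence} is older than last accepted {self._last_fence}"
                )
            self._last_fence = fence
            self._value = value

    def read(self):
        with self._mutex:
            return self._value


if __name__ == "__main__":
    store = FencedStore()
    store.write(33, "from holder A")
    store.write(34, "from holder B")
    try:
        store.write(33, "late write from paused holder A")
    except StaleFenceError as exc:
        print("rejected:", exc)
```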

Common findings this catches

  • Un-tokened DEL → deletes another owner’s lock.
  • SETNX + EXPIRE → non-atomic, deadlock on crash.
  • No fencing → stale holder corrupts data after pause.
  • TTL too short → double entry into critical section.
  • Redlock for correctness → false safety under clock drift.
  • Unbounded retries → thundering herd on hot lock.
  • Runaway renewal → wedged worker holds lock forever.
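
For the unbounded-retries finding, a bounded retry loop with jittered exponential backoff could look like the sketch below. The function name, the defaults, and the "full jitter" strategy (sleep a uniform random time up to an exponentially growing cap) are assumptions chosen for illustration. `try_acquire` is any callable that returns a token on success or `None` when the lock is held.

```python
import random
import time


def acquire_with_backoff(
    try_acquire,
    max_attempts=5,
    base_delay=0.05,
    max_delay=1.0,
    sleep=time.sleep,
    rng=random.random,
):
    """Call try_acquire up to max_attempts times; return its token or None."""
    for attempt in range(max_attempts):
        token = try_acquire()
        if token is not None:
            return token
        if attempt < max_attempts - 1:
            # The cap doubles on each failed attempt, up to max_delay.
            cap = min(max_delay, base_delay * (2 ** attempt))
            # Full jitter: sleep a random fraction of the cap so contending
            # clients do not retry in lockstep.
            sleep(rng() * cap)
    return None
```

Returning `None` after the last attempt makes the caller decide what to do next (skip the job, queue it, or report an error) instead of retrying forever against a hot lock.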

When to escalate

  • Correctness-critical mutual exclusion — move to etcd/ZooKeeper with fencing.
  • Cross-region locking — reconsider the whole design.
  • Recurring split-brain writes — data platform and reliability review.
