DevOps AI ToolKit

Redis Connection Pool Tuning Prompt

Tune Redis client connection pools: pool sizing, timeouts, maxclients, TCP keepalive, and avoiding connection exhaustion and leaks.

Target user: Backend engineers tuning Redis client behavior
Difficulty: Intermediate
Tools: Claude, ChatGPT

The prompt

You are a senior backend engineer who has tuned Redis client pools under production load.

I will provide:
- My client library and language
- App concurrency (threads/workers/instances) and traffic
- Symptoms (connection timeouts, "max number of clients reached", latency)

Your job:

1. **Size the pool to concurrency, not traffic**: max pool connections should roughly match the number of concurrent in-flight commands per process (often ~ worker/thread count), not requests/sec. Oversized pools waste server connection slots; undersized pools queue and time out.
2. **Account for server limits**: `maxclients` (default 10000) caps total connections; sum(app instances * processes per instance * pool size) + replicas + monitoring must stay well under it, leaving headroom for deploys, when old and new instances overlap.
3. **Set sane timeouts**: connect timeout, command/socket timeout, and pool-wait timeout. A missing command timeout lets one slow op hang a whole worker.
4. **Reuse connections**: pipelining and single long-lived connections beat opening a connection per request. Blocking commands (BLPOP, BRPOP, XREAD BLOCK) and pub/sub must use dedicated connections, not the shared pool.
5. **Enable keepalive and idle handling**: TCP keepalive (`tcp-keepalive` on server, socket keepalive on client) detects dead peers; set an idle connection timeout and validate-on-borrow to avoid using stale sockets.
6. **Prevent leaks**: always return/close connections (context managers, try/finally); a leak slowly exhausts the pool then the server.
7. **Cluster/replica awareness**: Cluster clients keep a connection (often a whole pool) per node, so per-process connections multiply by node count; read replicas need their own pools/routing.
8. **Observe**: watch INFO clients (connected_clients, blocked_clients, rejected_connections) and pool metrics; alert before exhaustion.

Mark DESTRUCTIVE: raising maxclients beyond OS file-descriptor limits (Redis lowers the effective limit at startup or refuses the change, and excess clients are rejected), CLIENT KILL of active app connections, CONFIG REWRITE of untested limits.

---

Client library/language: [DESCRIBE]
Concurrency/traffic: [DESCRIBE]
Symptoms: [DESCRIBE]

Why this prompt works

Connection pool problems masquerade as Redis being “slow” when the real cause is client-side: pools sized to traffic, leaked connections, missing timeouts, or blocking commands hogging shared sockets. This prompt makes the model size the pool to concurrency, reconcile it with server maxclients and FD limits, and separate blocking/pub-sub traffic from the shared pool.

How to use it

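  1. State your client library: pool semantics differ per language and per library.
  2. Give concurrency (workers/threads per process) and instance count, not just RPS.
  3. Describe the symptom: client-side timeouts usually point at an undersized pool, a leak, or slow commands, while “max number of clients reached” points at the server-side maxclients limit.
  4. Ask for the maxclients math across all instances.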
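
The maxclients math from step 4 can be sketched in a few lines of Python. This is a minimal sketch, not a sizing tool: the `Fleet` fields, the example numbers, and the `surge_instances` allowance for rolling deploys are all illustrative assumptions. The 32 reserved descriptors reflect how Redis sets aside file descriptors for its own use.

```python
from dataclasses import dataclass

REDIS_RESERVED_FDS = 32  # descriptors Redis keeps for internal use


@dataclass
class Fleet:
    # All fields are hypothetical inputs you would fill in for your own fleet.
    instances: int               # app instances (hosts/pods)
    processes_per_instance: int  # worker processes, each with its own pool
    pool_size: int               # max connections per process
    other_clients: int = 0       # replicas, monitoring, admin sessions (estimate)
    surge_instances: int = 0     # extra instances alive during a rolling deploy


def peak_connections(fleet: Fleet) -> int:
    """Worst case: every pool full while old and new instances overlap."""
    app = (fleet.instances + fleet.surge_instances) \
        * fleet.processes_per_instance * fleet.pool_size
    return app + fleet.other_clients


def headroom(fleet: Fleet, maxclients: int = 10000) -> float:
    """Fraction of maxclients still free at peak (negative = over the limit)."""
    return 1 - peak_connections(fleet) / maxclients


def required_nofile(maxclients: int) -> int:
    """Minimum open-files limit the redis-server process needs."""
    return maxclients + REDIS_RESERVED_FDS


fleet = Fleet(instances=40, processes_per_instance=4, pool_size=16,
              other_clients=20, surge_instances=10)
print(peak_connections(fleet))       # → 3220
print(f"{headroom(fleet):.0%}")      # → 68%
print(required_nofile(10000))        # → 10032
```

Running the numbers with the surge term included is the useful part: a fleet that fits comfortably at steady state can still hit the limit while a deploy doubles the instance count.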

Useful commands

# Server-side connection state
redis-cli INFO clients
redis-cli CONFIG GET maxclients
redis-cli CONFIG GET tcp-keepalive
redis-cli CONFIG GET timeout            # idle client timeout (0 = never)

# Who is connected and doing what
redis-cli CLIENT LIST
redis-cli CLIENT INFO                    # this connection only (Redis 6.2+)
redis-cli INFO stats | grep rejected_connections

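# Check OS file-descriptor headroom (limit must exceed maxclients + 32 reserved by Redis)
ulimit -n                                # limit of the current shell, not necessarily of redis-server
grep 'open files' /proc/$(pgrep -o redis-server)/limits   # limit of the running server (Linux)
redis-cli INFO clients | grep connected_clients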
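
To alert before exhaustion rather than after, the `INFO clients` output can be turned into a utilization ratio. The sketch below assumes only the `key:value` line format that `INFO` prints; the `SAMPLE` text is made-up input for illustration (not captured from a real server), and the 0.8 alert threshold is an arbitrary assumption.

```python
def parse_info(text: str) -> dict[str, str]:
    """Parse INFO output: 'key:value' lines, with '#' section headers skipped."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def client_pressure(info: dict[str, str], maxclients: int) -> float:
    """Share of maxclients currently in use."""
    return int(info["connected_clients"]) / maxclients


# Hypothetical sample input; in practice, pipe in `redis-cli INFO clients`.
SAMPLE = "# Clients\r\nconnected_clients:8200\r\nblocked_clients:12\r\n"

info = parse_info(SAMPLE)
pressure = client_pressure(info, maxclients=10000)
if pressure > 0.8:  # assumed alert threshold
    print(f"connection pressure {pressure:.0%}: investigate before exhaustion")
```

Pair this with `rejected_connections` from `INFO stats`: a non-zero and growing value means the limit has already been hit at least once.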

Example config

# redis.conf server-side connection settings
maxclients 10000                   # keep total connections well under this
timeout 300                        # close idle client connections after 300s
tcp-keepalive 60                   # send keepalive probes every 60s
tcp-backlog 511

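# Ensure the OS allows enough file descriptors BEFORE raising maxclients:
#   /etc/security/limits.conf ->  redis  soft  nofile  65535
#                                 redis  hard  nofile  65535
#   (limits.conf covers login sessions; for a systemd-managed redis-server,
#    set LimitNOFILE=65535 in the service unit instead)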

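# Client-side pool guidance (pseudo; option names vary by client library):
#   pool.max_connections   = worker_threads_per_process   # ~ concurrency
#   pool.connect_timeout   = 2s
#   pool.command_timeout   = 1s        # NEVER leave unset
#   pool.wait_timeout      = 1s        # fail fast when the pool is exhausted
#   pool.socket_keepalive  = true
#   dedicated connection for BLPOP/XREAD BLOCK and pub/sub,
#     with a socket timeout longer than the blocking timeout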
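
The pool-wait timeout and the leak-prevention rule are easier to see in code than in config. Below is a minimal stdlib-only sketch of a bounded pool; `BoundedPool`, `PoolTimeout`, and `borrow` are hypothetical names for illustration, and `factory=object` stands in for a real connection constructor. Real clients ship their own pools (redis-py's `BlockingConnectionPool`, for example), so treat this as a model of the mechanics rather than something to deploy.

```python
import queue
from contextlib import contextmanager


class PoolTimeout(Exception):
    """Raised when no connection frees up within the pool-wait timeout."""


class BoundedPool:
    def __init__(self, factory, max_connections: int, wait_timeout: float):
        # One slot per allowed connection; None means "not connected yet".
        self._free = queue.LifoQueue(maxsize=max_connections)
        for _ in range(max_connections):
            self._free.put(None)
        self._factory = factory
        self._wait_timeout = wait_timeout

    @contextmanager
    def borrow(self):
        try:
            conn = self._free.get(timeout=self._wait_timeout)
        except queue.Empty:
            raise PoolTimeout("pool exhausted") from None
        try:
            if conn is None:
                conn = self._factory()  # connect lazily on first use
            yield conn
        finally:
            # The slot always goes back, even if the caller (or factory) raised.
            self._free.put(conn)

    def free_slots(self) -> int:
        return self._free.qsize()


pool = BoundedPool(factory=object, max_connections=2, wait_timeout=0.05)
with pool.borrow() as a, pool.borrow() as b:
    try:
        with pool.borrow():
            pass
    except PoolTimeout:
        print("third borrower timed out instead of hanging")
print(pool.free_slots())  # → 2
```

The `finally` clause is the whole leak story: code that takes a connection without a context manager or try/finally loses one slot per exception until the pool is empty.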

Common findings this catches

  • Pool sized to RPS → idle connection explosion.
  • maxclients > FD limit → Redis caps the effective limit below the configured value; excess clients are rejected.
  • No command timeout → one slow op hangs a worker.
  • Blocking on shared pool → BLPOP/pub-sub stalls others.
  • Connection leak → gradual pool exhaustion.
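  • No keepalive → NAT gateways and load balancers silently drop idle connections, leaving stale sockets in the pool.
  • Cluster single pool → per-node connections are not counted, so real connection totals exceed the estimate.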
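
For the keepalive finding, most client libraries expose a socket-keepalive option, and underneath it comes down to a few socket options. The sketch below uses only Python's standard `socket` module; `enable_keepalive` and its default values (60s idle, 10s interval, 3 probes) are assumptions for illustration. Pick an idle time shorter than the idle timeout of whatever NAT or load balancer sits in the path.

```python
import socket


def enable_keepalive(sock: socket.socket, idle_s: int = 60,
                     interval_s: int = 10, probes: int = 3) -> None:
    """Turn on TCP keepalive and, where the platform allows, tune its timing."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The timing options are platform-specific, so set only those that exist.
    tuning = (("TCP_KEEPIDLE", idle_s),      # seconds idle before first probe
              ("TCP_KEEPINTVL", interval_s),  # seconds between probes
              ("TCP_KEEPCNT", probes))        # failed probes before giving up
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)  # → True
sock.close()
```

Keepalive detects dead peers but does not replace validate-on-borrow: a connection closed by the server's `timeout` setting still has to be detected and replaced by the pool.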

When to escalate

  • Sustained maxclients pressure across the fleet — capacity/scale review.
  • FD-limit changes at the OS/orchestrator level — platform team.
  • Connection storms during deploys — coordinate rollout and warmup with SRE.
