Kafka Partition Reassignment & Broker Decommission Plan Prompt
Plan a safe partition reassignment or broker decommission using throttled data movement, staged batches, and verification, minimizing impact on live traffic.
- Target user: SRE and platform engineers
- Difficulty: Advanced
- Tools: Claude, ChatGPT
The prompt
You are a senior Kafka operator producing a partition reassignment plan to review before executing against a live cluster.

I will provide:
- The goal: rebalance skewed partitions, add new brokers, or decommission specific broker IDs
- Current cluster state: broker IDs, per-broker partition/leader counts, disk usage, and any known hot brokers
- Topic details for the partitions being moved: replication factor, size on disk, and partition counts
- Throughput headroom: current network and disk utilization, and the maintenance window available
- Kafka version and tooling available (kafka-reassign-partitions, Cruise Control, etc.)

Your job:
1. **Assess the move size**: estimate total bytes to be replicated and the resulting inter-broker traffic, and judge whether it fits the window at a safe throttle.
2. **Generate a staged plan**: break the reassignment into small batches rather than one giant move, so a single batch can be paused or rolled back, and order batches to relieve the hottest brokers first.
3. **Set throttles**: recommend a replication throttle (leader/follower bytes/s) that protects produce/consume latency, and explain how to raise it if there is spare capacity.
4. **Define verification**: specify how to confirm each batch completed (in-sync replicas, no under-replicated partitions) before starting the next, and how to remove throttles afterward.
5. **Plan rollback and decommission**: for decommission, ensure no partition has a replica left on the target broker before shutdown; for rollback, capture the original assignment JSON first.
6. **Watch the right metrics**: list the metrics to monitor during the move (URP, ISR shrink/expand, request latency, disk).

Output:
(a) move-size estimate vs. window
(b) batched reassignment plan
(c) throttle settings
(d) per-batch verification steps
(e) rollback/decommission checklist and metrics to watch

Advisory only; capture the current assignment as a rollback file and run during a low-traffic window.
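To sanity-check what the model returns for steps 1 and 2, the arithmetic can be sketched in a few lines of Python. This is a rough illustration, not a definitive tool: the input shape (`size_bytes`, `current`, `target`), the function names, and the sample numbers are all hypothetical. The duration estimate assumes the busiest destination broker, capped by the throttle, is the bottleneck, and it ignores new writes that arrive during the move. The batch output follows the `{"version": 1, "partitions": [...]}` layout read by `kafka-reassign-partitions --reassignment-json-file`.

```python
import json
from collections import defaultdict

GB = 10**9


def bytes_to_move(moves):
    """Bytes each destination broker must receive (new replicas only)."""
    inbound = defaultdict(int)
    for m in moves:
        # Only brokers that do not already hold the partition need a full copy.
        for broker in set(m["target"]) - set(m["current"]):
            inbound[broker] += m["size_bytes"]
    return dict(inbound)


def fits_window(moves, throttle_bytes_per_s, window_s):
    """Return (total_bytes, estimated_seconds, fits) for a per-broker throttle.

    Assumption: the busiest destination broker bounds the duration.
    """
    inbound = bytes_to_move(moves)
    total = sum(inbound.values())
    busiest = max(inbound.values(), default=0)
    est_seconds = busiest / throttle_bytes_per_s
    return total, est_seconds, est_seconds <= window_s


def build_batches(moves, batch_size, hot_brokers=()):
    """Split moves into reassignment JSON documents, hot-broker relief first."""
    hot = set(hot_brokers)

    def relieves_hot(m):
        # True when the move takes a replica off a hot broker.
        return bool((set(m["current"]) - set(m["target"])) & hot)

    # Moves that relieve hot brokers sort first, larger partitions before smaller.
    ordered = sorted(moves, key=lambda m: (not relieves_hot(m), -m["size_bytes"]))
    batches = []
    for i in range(0, len(ordered), batch_size):
        chunk = ordered[i:i + batch_size]
        batches.append({
            "version": 1,
            "partitions": [
                {"topic": m["topic"], "partition": m["partition"], "replicas": m["target"]}
                for m in chunk
            ],
        })
    return batches


# Hypothetical example: three partitions, broker 1 is hot and being drained.
example_moves = [
    {"topic": "orders", "partition": 0, "size_bytes": 40 * GB, "current": [1, 2, 3], "target": [2, 3, 4]},
    {"topic": "orders", "partition": 1, "size_bytes": 10 * GB, "current": [2, 3, 4], "target": [3, 4, 5]},
    {"topic": "clicks", "partition": 0, "size_bytes": 20 * GB, "current": [1, 3, 5], "target": [3, 5, 4]},
]
total, seconds, ok = fits_window(example_moves, throttle_bytes_per_s=50_000_000, window_s=3600)
print(f"total={total / GB:.0f} GB, estimate={seconds:.0f} s, fits window={ok}")
for n, batch in enumerate(build_batches(example_moves, batch_size=2, hot_brokers=[1]), start=1):
    print(f"batch {n}: {json.dumps(batch)}")
```

Each batch would be written to its own file and applied with `kafka-reassign-partitions --execute --throttle <bytes/s>`, then checked with `--verify` (which also clears the throttle) before the next batch starts. Capture the current assignment as a rollback file before running the first batch.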
Related prompts
- Kafka Cluster Sizing & Capacity Planning Prompt: Size a Kafka cluster end to end (broker count, partition counts, retention, disk, memory, and network) for a target throughput, with headroom for spikes and broker failure.
- Kafka Topic Design & Partitioning Strategy Prompt: Design a Kafka topic from first principles (partition count, keying, replication factor, min.insync.replicas, and retention vs. compaction) to match ordering, scale, and durability needs.