Kafka Exactly-Once Semantics Design Prompt
Design exactly-once processing across a produce-process-consume pipeline using the idempotent producer and transactions, with honest guidance on where EOS holds and where it does not.
- Target user: Backend and platform architects
- Difficulty: Advanced
- Tools: Claude, ChatGPT
The prompt
You are a senior Kafka architect designing exactly-once semantics (EOS) for a streaming pipeline, producing a design to review before implementation.

I will provide:
- The pipeline shape: pure producer, consume-transform-produce, or read-process-write-to-external-system
- The processing framework (plain clients, Kafka Streams, or a connector) and client versions
- What "exactly once" must mean for this use case: no duplicate output records, no double side effects, or both
- Throughput and latency budgets, since transactions add overhead
- Any external sinks (databases, APIs) the processing writes to outside Kafka

Your job:
1. **Clarify the guarantee boundary**: explain plainly that Kafka EOS covers Kafka-to-Kafka (consume-transform-produce); side effects to external systems are not automatically transactional, and steer toward idempotent writes or the outbox pattern where the pipeline touches non-Kafka systems.
2. **Enable idempotent production**: recommend enable.idempotence and the matching acks/retries/in-flight settings that prevent producer-retry duplicates.
3. **Design transactions**: specify transactional.id assignment (stable per logical producer for zombie fencing), how to wrap consume-process-produce so offsets are committed within the transaction, and consumer isolation.level=read_committed downstream.
4. **Handle the framework path**: if Kafka Streams is used, recommend the EOS processing guarantee setting and explain what it manages automatically.
5. **Address external sinks**: for writes outside Kafka, prescribe idempotency keys or transactional outbox instead of assuming Kafka transactions cover them.
6. **Weigh the cost**: quantify the latency/throughput overhead of transactions and confirm it fits the budget, suggesting fallback to at-least-once + idempotent consumers if EOS is overkill.
Output: (a) guarantee-boundary statement, (b) idempotent producer config, (c) transactional design with transactional.id and read_committed, (d) framework-specific guidance, (e) external-sink strategy, (f) cost/trade-off assessment. Advisory only; validate the end-to-end guarantee with fault-injection tests (kill producers mid-transaction) before relying on it.
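Example: idempotent external sink

As a rough illustration of the external-sink strategy in step 5, the sketch below shows one way an idempotency key can make a redelivered record harmless. It is not a Kafka client: it assumes the sink is a SQL database (SQLite here, so it runs anywhere) and that the key is the record's (topic, partition, offset) coordinate. The table names `processed` and `totals`, the function `apply_once`, and the account/delta payload are all hypothetical names chosen for illustration.

```python
import sqlite3


def open_sink(path=":memory:"):
    """Open the (hypothetical) sink database and create its tables."""
    conn = sqlite3.connect(path)
    # One row per record already applied; the primary key is the idempotency key.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed ("
        "topic TEXT, part INTEGER, pos INTEGER, "
        "PRIMARY KEY (topic, part, pos))"
    )
    # The business state that the side effect mutates.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS totals ("
        "account TEXT PRIMARY KEY, amount INTEGER NOT NULL)"
    )
    conn.commit()
    return conn


def apply_once(conn, topic, part, pos, account, delta):
    """Apply delta to account unless this record was already applied.

    The key insert and the state change share one database transaction,
    so a crash leaves either both or neither. Returns True if the record
    was applied, False if it was a duplicate delivery.
    """
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed (topic, part, pos) VALUES (?, ?, ?)",
            (topic, part, pos),
        )
        if cur.rowcount == 0:
            return False  # key already present: redelivery, skip the side effect
        conn.execute(
            "INSERT OR IGNORE INTO totals (account, amount) VALUES (?, 0)",
            (account,),
        )
        conn.execute(
            "UPDATE totals SET amount = amount + ? WHERE account = ?",
            (delta, account),
        )
        return True


def balance(conn, account):
    """Read the current amount for an account (0 if it does not exist)."""
    row = conn.execute(
        "SELECT amount FROM totals WHERE account = ?", (account,)
    ).fetchone()
    return row[0] if row else 0
```

With this shape, the consumer can stay at-least-once: calling `apply_once` twice for the same topic, partition, and offset changes the balance only once. A fault-injection test for the sink would kill the process between the two writes and confirm that neither is visible after restart.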
Related prompts
- Kafka Producer Throughput & Latency Tuning Prompt
  Tune Kafka producer batching, compression, acks, linger, and idempotence to hit a throughput or latency target while keeping the durability guarantees you actually need.
- Kafka Topic Design & Partitioning Strategy Prompt
  Design a Kafka topic from first principles (partition count, keying, replication factor, min.insync.replicas, and retention vs. compaction) to match ordering, scale, and durability needs.