Kubernetes CNI Plugin Selection & Migration Prompt
Choose between CNI plugins (Calico, Cilium, Weave, Antrea), plan migration, debug CNI install issues, evaluate eBPF mode.
- Target user: Platform engineers evaluating or migrating CNI plugins
- Difficulty: Advanced
- Tools: Claude, ChatGPT
The prompt
You are a senior Kubernetes platform engineer who has run multiple CNI plugins in production: Calico, Cilium (with eBPF), Weave Net, Antrea. You know the trade-offs and the migration risks.

I will provide:
- Current CNI (if any) and cluster size
- Requirements (network policy, observability, encryption, performance, multi-cluster)
- The goal: select new CNI / migrate / debug install

Your job:

1. **CNI feature matrix**:
- **Calico**: mature, BGP/VXLAN, NetworkPolicy, GlobalNetworkPolicy
- **Cilium**: eBPF-based, deep observability (Hubble), service mesh capability, IPv6
- **Weave**: simpler, smaller deployments
- **Antrea**: VMware-backed, OVS-based, antrea-native rules

2. **Encryption between nodes**:
- Calico: WireGuard support
- Cilium: IPSec or WireGuard
- For compliance / untrusted network, encryption matters

3. **eBPF advantages (Cilium)**:
- Better NetworkPolicy performance (no iptables)
- L7 visibility without sidecar mesh
- kube-proxy replacement (lower latency)
- Hubble for flow observability

4. **For migration**:
- Cluster typically supports one CNI at a time
- Migration is significant: drain nodes, swap CNI, redeploy
- Plan rollback (don't lose state)

5. **For new install**:
- Install via Helm / operator
- Verify pod CIDR
- Test connectivity between pods

6. **For CNI install failures**:
- CNI pod CrashLoopBackOff
- Underlying network issues (MTU, link)
- Conflict with existing rules (iptables, nftables)

7. **For NetworkPolicy enforcement**:
- All major CNIs support standard NetworkPolicy
- Vendor extensions (Calico GlobalNetworkPolicy, Cilium L7 policy)

8. **For multi-cluster**:
- Cilium ClusterMesh
- Submariner
- Service mesh integration

Mark DESTRUCTIVE: switching CNI in production without staging (potentially blocks pod scheduling), CNI configmap edits without verification, removing old CNI before new fully working.

---

Current CNI: [DESCRIBE]
Cluster size + workload: [DESCRIBE]
Requirements: [DESCRIBE]
Goal: [select / migrate / debug]
Why this prompt works
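The CNI choice affects every workload in the cluster, and changing it later is a disruptive operation. This prompt walks through the selection by comparing the candidates against your stated requirements (network policy, encryption, observability, performance, multi-cluster) and by flagging destructive steps before they are run.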
How to use it
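- Write down your requirements first (encryption, performance, observability).
- For new clusters, pick deliberately rather than accepting the distribution default.
- For migration, plan it as a major operation with a rollback path.
- For debugging, check the CNI pods and node state first.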
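The "Current CNI" field of the prompt can often be filled in from the CNI config directory on a node. The following is a minimal sketch, not a definitive detector: `detect_cni` is a hypothetical helper name, and the file-name patterns are assumptions based on common default installs (for example `10-calico.conflist` or `05-cilium.conflist`). It relies on the usual runtime behaviour of loading the first config file in lexical order.

```shell
# Guess the active CNI from file names in a CNI config directory.
# detect_cni is a hypothetical helper; the name patterns are assumptions.
detect_cni() {
  local dir="${1:-/etc/cni/net.d}" first
  [ -d "$dir" ] || { echo "none"; return 0; }
  # Runtimes generally pick the first config file in lexical order.
  first="$(ls "$dir" | LC_ALL=C sort | awk '/\.(conf|conflist|json)$/ && !seen { print; seen = 1 }')"
  case "$first" in
    "")       echo "none" ;;
    *calico*) echo "calico" ;;
    *cilium*) echo "cilium" ;;
    *weave*)  echo "weave" ;;
    *antrea*) echo "antrea" ;;
    *)        echo "unknown ($first)" ;;
  esac
}
```

Run it on a node as `detect_cni`, or pass another directory as the first argument. More than one config file in the directory is itself worth reporting, since it often means a previous CNI was not fully removed.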
Useful commands
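# CNI pods (kube-system, or a dedicated namespace such as calico-system for operator installs)
kubectl get pods -n kube-system -l k8s-app=cilium
kubectl get pods -n kube-system -l k8s-app=calico-node     # manifest install
kubectl get pods -n calico-system -l k8s-app=calico-node   # operator install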
# Calico
calicoctl version
calicoctl node status
# Cilium
cilium status
cilium connectivity test
cilium hubble port-forward &   # expose Hubble Relay locally
hubble observe                 # requires the separate hubble CLI
# Network policies (CNI-specific)
kubectl get globalnetworkpolicy # Calico
kubectl get ciliumnetworkpolicy # Cilium
# Verify pod connectivity
kubectl run test --rm -it --image=nicolaka/netshoot -- bash
# ping, nslookup, curl
# CNI config on node
ls /etc/cni/net.d/
cat /etc/cni/net.d/*.conflist
# IP allocation
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}: {.spec.podCIDR}{"\n"}{end}'
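The output of the IP allocation command above can be filtered to list nodes that have no pod CIDR assigned. This is a small sketch; `nodes_missing_pod_cidr` is a hypothetical helper that expects the `name: cidr` lines produced by that jsonpath expression on stdin. Note that an empty `spec.podCIDR` is not always a fault: some IPAM modes (for example Calico IPAM or Cilium cluster-pool) allocate pod addresses without using this field.

```shell
# Print the names of nodes whose "name: cidr" line has an empty CIDR.
# nodes_missing_pod_cidr is a hypothetical helper; it reads stdin.
nodes_missing_pod_cidr() {
  awk -F': *' 'NF && $2 == "" { print $1 }'
}

# Example (requires cluster access):
# kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}: {.spec.podCIDR}{"\n"}{end}' \
#   | nodes_missing_pod_cidr
```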
Patterns
Install Cilium (Helm)
helm repo add cilium https://helm.cilium.io
helm install cilium cilium/cilium \
  --namespace kube-system \
  --set ipam.mode=kubernetes \
  --set kubeProxyReplacement=true \
  --set hubble.enabled=true \
  --set hubble.ui.enabled=true \
  --set encryption.enabled=true \
  --set encryption.type=wireguard
# Verify
cilium status --wait
cilium connectivity test
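With `kubeProxyReplacement=true` and no kube-proxy running, the Cilium agent needs the API server address passed explicitly through the chart values `k8sServiceHost` and `k8sServicePort`. The sketch below derives both from a server URL; `api_server_host_port` is a hypothetical helper, it handles hostnames and IPv4 addresses only, and it assumes the address in your kubeconfig is also reachable from the nodes, which is not true for every cluster.

```shell
# Split an API server URL into "host port" (hostnames and IPv4 only).
# api_server_host_port is a hypothetical helper name.
api_server_host_port() {
  local url="$1" rest host port
  rest="${url#*://}"   # strip the scheme
  rest="${rest%%/*}"   # strip any path
  case "$rest" in
    *:*) host="${rest%:*}"; port="${rest##*:}" ;;
    *)   host="$rest"; port=443 ;;   # https default when no port is given
  esac
  echo "$host $port"
}

# Example (requires cluster access):
# server="$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}')"
# set -- $(api_server_host_port "$server")
# helm upgrade cilium cilium/cilium -n kube-system --reuse-values \
#   --set k8sServiceHost="$1" --set k8sServicePort="$2"
```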
Install Calico (operator)
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.27.0/manifests/tigera-operator.yaml
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.27.0/manifests/custom-resources.yaml
# Verify
kubectl get tigerastatus
calicoctl node status
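Before applying `custom-resources.yaml`, it is worth checking that the IP pool it defines matches the cluster's pod CIDR. The sketch below tests whether one IPv4 CIDR lies inside another. `ip_to_int` and `cidr_within` are hypothetical helpers, and the `192.168.0.0/16` in the usage example is an assumption about the manifest's default pool, so confirm it against the file you download.

```shell
# Convert a dotted IPv4 address to an integer (no leading zeros expected).
ip_to_int() {
  local IFS=.
  set -- $1
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if IPv4 CIDR $1 is fully contained in IPv4 CIDR $2.
cidr_within() {
  local inner="$1" outer="$2"
  local in_ip="${inner%/*}"
  local in_len="${inner#*/}"
  local out_ip="${outer%/*}"
  local out_len="${outer#*/}"
  # A shorter prefix is a larger network, so it cannot fit inside.
  [ "$in_len" -ge "$out_len" ] || return 1
  local mask=$(( (0xFFFFFFFF << (32 - out_len)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$in_ip") & mask )) -eq $(( $(ip_to_int "$out_ip") & mask )) ]
}

# Example: compare the cluster pod CIDR with the pool in the manifest.
# cidr_within 10.244.0.0/16 192.168.0.0/16 || echo "edit the IP pool CIDR first"
```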
Migration outline
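1. Choose the new CNI and install it in a test cluster.
2. Validate the features you depend on (NetworkPolicy, performance).
3. In production, drain nodes one at a time.
4. On each drained node:
- Stop kubelet
- Remove the old CNI's pods and its config in /etc/cni/net.d/
- Install the new CNI
- Reboot the node to clear stale interfaces and rules
- Uncordon once the node reports Ready
- Wait for pods to schedule and confirm connectivity
5. Repeat for all nodes.
6. Remove the old CNI's namespace and CRDs only after every node runs the new CNI.
Note: during the rollout, pods on migrated nodes can reach pods on unmigrated nodes only if the two pod networks are routable to each other. Otherwise, expect cross-node traffic loss until the migration completes.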
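The per-node part of the outline can be sketched as a plan generator that prints commands instead of running them, so the sequence can be reviewed before anything touches production. `node_migration_plan` is a hypothetical helper, the 10-minute timeouts are arbitrary, and the CNI swap itself is left as a placeholder because it depends on the CNIs involved.

```shell
# Print (do not run) the per-node command sequence for a rolling CNI swap.
# node_migration_plan is a hypothetical helper; timeouts are arbitrary.
node_migration_plan() {
  local node="$1"
  cat <<EOF
kubectl drain $node --ignore-daemonsets --delete-emptydir-data --timeout=10m
# on $node: stop kubelet, remove old CNI config, install new CNI, reboot
kubectl wait --for=condition=Ready node/$node --timeout=10m
kubectl uncordon $node
EOF
}

# Example (requires cluster access): print the plan for every node.
# for n in $(kubectl get nodes -o name | cut -d/ -f2); do
#   node_migration_plan "$n"
# done
```

Printing the plan keeps the sketch safe to run anywhere. Executing it for real should happen one node at a time, with a connectivity check between nodes.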
Common findings this catches
- CNI pod CrashLoopBackOff → check the pod logs; usually a config error or a kernel version mismatch.
- Pods can’t reach pods → MTU mismatch, or the CNI is not installed on the node.
- NetworkPolicy not enforced → a limitation of the CNI's data-plane mode (Cilium eBPF vs. Calico iptables); confirm the feature is supported in the mode you run.
- Performance regression after migration → tune the CNI; with Cilium, use eBPF mode.
- Encryption not enabled → enable it per the CNI's docs (WireGuard for Calico; IPsec or WireGuard for Cilium).
- BGP not advertising → check the peer switch config; an ASN mismatch is a common cause.
- CoreDNS slow after migration → NodeLocal DNSCache is recommended.
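For the MTU mismatch finding, the expected pod MTU is the underlay MTU minus the encapsulation overhead. The sketch below uses IPv4 overhead figures as commonly documented (IP-in-IP 20 bytes, VXLAN 50, WireGuard 60). `pod_mtu` is a hypothetical helper, and some CNIs reserve more headroom, so treat the numbers as assumptions and check your CNI's documentation.

```shell
# Expected pod MTU for a given underlay MTU and encapsulation (IPv4).
# pod_mtu is a hypothetical helper; overhead values are assumptions.
pod_mtu() {
  local underlay="$1" encap="$2" overhead
  case "$encap" in
    none)      overhead=0 ;;
    ipip)      overhead=20 ;;
    vxlan)     overhead=50 ;;
    wireguard) overhead=60 ;;
    *) echo "unknown encapsulation: $encap" >&2; return 1 ;;
  esac
  echo $(( underlay - overhead ))
}

# Example: compare with the MTU a pod actually sees.
# pod_mtu 1500 vxlan
# kubectl exec <pod> -- cat /sys/class/net/eth0/mtu
```

A pod interface MTU larger than the computed value is a typical cause of connections that complete the handshake but stall on larger payloads.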
When to escalate
- Major migration plan: stage it with stakeholders.
- Performance regression you cannot tune away: involve vendor support.
- Cross-cluster network design: treat it as a strategic decision.
Related prompts
- Kubernetes CoreDNS Debugging Prompt: Diagnose Kubernetes DNS issues (CoreDNS not resolving, ndots traps, search domain explosion, NXDOMAIN floods, conntrack DNS races).
- Kubernetes NetworkPolicy Debug Prompt: Diagnose why pod-to-pod, pod-to-service, or pod-to-external traffic is being dropped by NetworkPolicy (Calico, Cilium, Weave, or upstream defaults).
- Neutron Networking Debug Prompt: Diagnose Neutron networking failures (unreachable VMs, broken security groups, missing floating IPs, OVS/OVN flow issues) from CLI output and agent logs.