Kubernetes StorageClass Design Prompt
Design StorageClasses — provisioner, parameters, reclaim policy, volumeBindingMode, multi-tier (fast/slow), default class.
- Target user: Kubernetes platform engineers designing persistent storage
- Difficulty: Intermediate
- Tools: Claude, ChatGPT
The prompt
You are a senior Kubernetes engineer who has designed StorageClasses for production: fast SSD vs cheap HDD, zonal vs regional, encrypted vs not, expandable vs fixed-size.

I will provide:
- The cluster type (cloud provider + supported drivers)
- The workload mix (databases, batch, archive)
- The goal: design new SC / debug existing / migrate

Your job:

1. **StorageClass elements**:
   - **`provisioner`**: CSI driver name
   - **`parameters`**: driver-specific (type, iops, encryption)
   - **`reclaimPolicy`**: Delete or Retain on PVC delete
   - **`volumeBindingMode`**: Immediate or WaitForFirstConsumer
   - **`allowVolumeExpansion`**: required for PVC resize
   - **`mountOptions`**: fs-level options
2. **For tiered design**:
   - **Fast**: NVMe, high IOPS, expensive (databases, hot workloads)
   - **Standard**: SSD, balanced (general)
   - **Cheap**: HDD, cold data
   - **Archive**: S3-class, infrequent (backups)
3. **For volumeBindingMode**:
   - **Immediate**: PV created at PVC creation; may land in the wrong zone
   - **WaitForFirstConsumer**: wait for pod scheduling to know the zone; preferred for zonal disks
4. **For default StorageClass**:
   - Annotation `storageclass.kubernetes.io/is-default-class: "true"`
   - Only one should have this
   - PVCs without `storageClassName` use the default
5. **For encryption**:
   - Provider-managed (e.g., AWS KMS)
   - Per-volume encrypted parameter
6. **For reclaim policy**:
   - **Delete**: removes the backend volume on PVC delete (data lost)
   - **Retain**: keeps the backend volume; admin cleanup
   - For production data: Retain typical; for review apps: Delete
7. **For migration between SCs**:
   - Can't change the SC on an existing PVC
   - Migrate: snapshot + restore to new SC OR rsync to new PVC
8. **For multi-cluster**:
   - StorageClasses are cluster-scoped
   - Consistent naming across clusters helps portability

Mark DESTRUCTIVE: changing reclaimPolicy of an existing PV from Retain to Delete (next PVC delete = data loss), removing the default SC while PVCs depend on it, deleting an SC that still has bound PVs (the PVs are left referencing a missing class).

---

Cluster type: [DESCRIBE]
Workload mix: [DESCRIBE]
Current SCs: [DESCRIBE]
Goal: [design / debug / migrate]
Why this prompt works
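StorageClass is a small object with large consequences: its binding mode, reclaim policy, and parameters decide which zone a volume lands in, whether data survives a PVC delete, and what each volume costs. This prompt walks through those design decisions one field at a time and flags the destructive changes.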
How to use it
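- Pick the tier based on the workload: fast for databases, standard for general use, cheap for cold or batch data.
- Use WaitForFirstConsumer for zonal disks.
- Use Retain reclaim for production data.
- Allow expansion wherever the driver supports it.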
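As a minimal sketch of the first point, a workload selects its tier by naming the class in its PVC. The claim name, namespace, and size are hypothetical; `db-fast` is the fast-tier class defined under Patterns.

```yaml
# Hypothetical claim that places a database volume on the fast tier.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: reports-db-data        # assumed name
  namespace: reports           # assumed namespace
spec:
  accessModes:
    - ReadWriteOnce            # block volume attached to a single node
  storageClassName: db-fast    # selects the tier; cannot be changed later
  resources:
    requests:
      storage: 200Gi           # assumed size
```

With `volumeBindingMode: WaitForFirstConsumer`, this claim stays Pending until a pod that mounts it is scheduled. That is expected behavior, not a provisioning failure.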
Useful commands
# Inventory
kubectl get storageclass
kubectl describe storageclass <name>
kubectl get sc -o yaml
# Find PVCs per SC
kubectl get pvc -A -o json | jq -r --arg sc "<sc>" '.items[] | select(.spec.storageClassName == $sc) | "\(.metadata.namespace)/\(.metadata.name)"'
# Default SC
kubectl get sc -o jsonpath='{range .items[*]}{.metadata.name} {.metadata.annotations.storageclass\.kubernetes\.io/is-default-class}{"\n"}{end}'
# Set as default
kubectl patch storageclass <name> -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
# Remove default from another
kubectl patch storageclass <name> -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'
Patterns
Fast tier (databases on AWS gp3 with IOPS)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: db-fast
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  iops: "16000"
  throughput: "1000"
  encrypted: "true"
  kmsKeyId: "arn:aws:kms:us-east-1:123:key/abc"
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
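
For illustration, a database StatefulSet could consume the fast tier through `volumeClaimTemplates`. This is a sketch under assumptions: the workload name, image, mount path, replica count, and size are placeholders, and a matching headless Service is assumed to exist.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: orders-db                  # hypothetical workload
spec:
  serviceName: orders-db           # headless Service assumed to exist
  replicas: 3
  selector:
    matchLabels:
      app: orders-db
  template:
    metadata:
      labels:
        app: orders-db
    spec:
      containers:
        - name: db
          image: registry.example.com/orders-db:1.0   # placeholder image
          volumeMounts:
            - name: data
              mountPath: /var/lib/db                   # assumed data path
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: db-fast  # one fast-tier volume per replica
        resources:
          requests:
            storage: 100Gi         # assumed size
```

Each replica gets its own PVC (`data-orders-db-0`, `data-orders-db-1`, and so on), and with WaitForFirstConsumer each volume is created in the zone where its pod is scheduled. Because the class uses Retain, the backing volumes remain after their PVCs are deleted and must be cleaned up by an admin.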
Standard tier
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  encrypted: "true"
reclaimPolicy: Delete # OK for default / non-critical
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
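
To make the default-class behavior concrete, the two hypothetical claims below differ only in `storageClassName`. Claim names and sizes are assumptions for illustration.

```yaml
# Claim A: storageClassName omitted, so the default class ("standard") applies.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: scratch-default          # hypothetical
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 20Gi
---
# Claim B: an empty string opts out of the default class and of dynamic
# provisioning; the claim binds only to a pre-created PV with no class set.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prebuilt-volume          # hypothetical
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: ""
  resources:
    requests:
      storage: 20Gi
```

The distinction matters when changing the default: claims like A silently follow whichever class carries the annotation at creation time, while claims like B are unaffected.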
Cheap / batch
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: cheap
provisioner: ebs.csi.aws.com
parameters:
  type: sc1 # cold HDD
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
Multi-zone / replicated (e.g., Longhorn)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: replicated
provisioner: driver.longhorn.io
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "30"
  fromBackup: ""
  fsType: ext4
reclaimPolicy: Retain
volumeBindingMode: Immediate
allowVolumeExpansion: true
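
Migration between classes (snapshot + restore)

The prompt's migration step, snapshot then restore into a new class, can be sketched as follows. This assumes the CSI snapshot CRDs and controller are installed, that both classes use the same CSI driver, and that a VolumeSnapshotClass exists; `ebs-snapclass` and all claim names are hypothetical.

```yaml
# Step 1: snapshot the existing claim (quiesce or stop writers first).
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: orders-db-pre-migration
  namespace: orders
spec:
  volumeSnapshotClassName: ebs-snapclass         # hypothetical class
  source:
    persistentVolumeClaimName: orders-db-data-old
---
# Step 2: create a new claim in the target class, seeded from the snapshot.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: orders-db-data-new
  namespace: orders
spec:
  storageClassName: db-fast                      # target class
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: orders-db-pre-migration
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 200Gi                             # at least the source volume's size
```

After the new claim is bound and verified, the workload is repointed to it. The old PVC should be deleted only once the data has been checked, since the old class's reclaim policy decides whether its backend volume survives.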
Common findings this catches
- Pod stuck Pending with a volume node affinity conflict because a zonal SC uses `Immediate` binding and the volume was provisioned in a zone where the pod cannot run → use WaitForFirstConsumer.
- Multiple default SCs → only one should have the annotation.
- `Delete` reclaim on production = silent data loss; verify.
- `allowVolumeExpansion: false` blocks future resize.
- Backend parameter missing or invalid (e.g., wrong `type`) → provisioning fails.
- KMS key wrong → encryption fails.
- Mount option breaks fs check → driver-specific.
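
For the mount-option finding, a hedged sketch of where such options live on the class. The class name and the `noatime` option are illustrative assumptions; which options are valid depends on the filesystem and driver.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard-noatime                # hypothetical name
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  encrypted: "true"
  csi.storage.k8s.io/fstype: ext4       # filesystem the options apply to
mountOptions:
  - noatime                             # example option; not validated by the API
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
```

Kubernetes does not validate `mountOptions` when the class is created, so an invalid option only surfaces later as a mount failure when a pod starts. Testing a new class with a throwaway PVC and pod catches this early.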
When to escalate
- New CSI driver evaluation — POC in staging.
- Multi-tier rollout — staged.
- Default SC change — coordinate widely.
Related prompts
- Ceph + OpenStack Integration Tuning Prompt
  Tune Ceph as storage backend for OpenStack: Glance, Cinder, Nova ephemeral pools; performance tuning, capacity planning, snapshot/clone semantics.
- Kubernetes PV / PVC / CSI Storage Troubleshooting Prompt
  Diagnose stuck PVCs, failed pod mounts, StorageClass provisioning errors, CSI driver crashes, and orphaned volume cleanups.
- Kubernetes VolumeSnapshot & CSI Snapshot Prompt
  Use CSI volume snapshots: VolumeSnapshotClass, VolumeSnapshot, restore from snapshot, snapshotter sidecar issues.