Kubernetes CRD Design & Versioning Prompt
Design Custom Resource Definitions — schema validation, versioning (v1alpha1 → v1), conversion webhooks, status subresource, printer columns.
- Target user: Kubernetes engineers building operators and platform APIs
- Difficulty: Advanced
- Tools: Claude, ChatGPT
The prompt
You are a senior Kubernetes engineer who has designed CRDs for production operators. You know that getting the schema, versioning, and conversion webhook right upfront saves painful migrations later.

I will provide:
- The CRD use case (a description of the domain object)
- Current CRD (if any) or sketch
- Versioning needs (is v1alpha1 acceptable, or is this going to v1?)
- Symptom (if debugging: validation failing, conversion error, status not updating)

Your job:

1. **CRD basics**:
   - Define the resource type with a schema (OpenAPI v3)
   - Cluster-scoped or namespaced
   - Versions: alpha → beta → v1
   - Each version can have its own schema
2. **Schema validation**:
   - OpenAPI v3 schema for spec / status
   - Required fields, types, constraints
   - `x-kubernetes-validations` for CEL expressions (1.25+)
   - Reject invalid resources at admission
3. **Versioning**:
   - Multiple versions can be served simultaneously
   - Exactly one version is marked `storage: true` (the storage version)
   - Conversion webhook for schema changes between versions
   - `None` conversion strategy = only `apiVersion` is rewritten, so the schema must be the same across versions
4. **For schema changes**:
   - Add a new optional field → backward compatible
   - Remove a field → backward incompatible; a conversion webhook helps
   - Rename a field → conversion webhook required
5. **Status subresource**:
   - Separate `/status` endpoint
   - Controller updates only `/status`
   - Generation/observedGeneration pattern
   - Conditions array for state representation
6. **Printer columns** (kubectl):
   - `additionalPrinterColumns` in the CRD
   - Custom kubectl output (Age, Ready, etc.)
7. **For conversion webhook**:
   - Separate from admission webhooks: the API server (not clients or GitOps tools) calls it with a `ConversionReview`
   - Converts objects between versions
   - Round-trip safe (version → storage version → same version == original)
8. **For deprecation**:
   - Mark the old version `deprecated: true` with a warning message
   - Move `storage: true` to the new version and rewrite stored objects first, then set `served: false` on the old version; remove it only once it no longer appears in `status.storedVersions`

Mark DESTRUCTIVE: removing a field from the served storage version (data loss), `storage: true` on the wrong version (data location confusion), deleting a CRD with active CRs (all of its CRs are deleted with it).

---

Use case: [DESCRIBE]

Current CRD (or sketch):
```yaml
[PASTE]
```

Versioning need: [DESCRIBE]

Symptom: [DESCRIBE]
Why this prompt works
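CRD design errors are expensive to undo once custom resources exist in a cluster. This prompt walks through the design choices in order: schema, versioning, conversion, status, printer columns, and deprecation.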
How to use it
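- Get the schema right upfront; it is the hardest part to change once CRs exist.
- Plan versioning from day 1, even if you start at v1alpha1.
- Enable the status subresource so controllers write only to `/status`.
- Test round-trip conversion between every pair of served versions.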
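A round-trip test can be sketched in a few lines of Python. This is an illustration under assumptions, not a prescribed implementation: it assumes a v1alpha1 schema without `concurrencyPolicy` and a v1 schema with it, and the annotation key `batch.example.com/concurrency-policy` is a hypothetical name chosen here to carry the v1-only field through the older version.

```python
import copy

GROUP = "batch.example.com"
# Hypothetical annotation key (an assumption of this sketch) used to
# preserve the v1-only field while the object is represented as v1alpha1.
STASH = "batch.example.com/concurrency-policy"


def to_v1alpha1(obj):
    """Down-convert: v1alpha1 has no concurrencyPolicy, so stash it."""
    out = copy.deepcopy(obj)
    out["apiVersion"] = f"{GROUP}/v1alpha1"
    policy = out["spec"].pop("concurrencyPolicy", None)
    if policy is not None:
        out["metadata"].setdefault("annotations", {})[STASH] = policy
    return out


def to_v1(obj):
    """Up-convert: restore concurrencyPolicy from the stash annotation."""
    out = copy.deepcopy(obj)
    out["apiVersion"] = f"{GROUP}/v1"
    annotations = out["metadata"].get("annotations", {})
    policy = annotations.pop(STASH, None)
    if policy is not None:
        out["spec"]["concurrencyPolicy"] = policy
    if not annotations:
        # Drop the annotations map if the stash was its only entry.
        out["metadata"].pop("annotations", None)
    return out


def round_trips(obj):
    """True if v1 -> v1alpha1 -> v1 returns the original object."""
    return to_v1(to_v1alpha1(obj)) == obj
```

Calling `round_trips` on a representative set of v1 objects (with and without the optional field, with and without other annotations) is a cheap check to run before a conversion webhook is deployed.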
Useful commands
# Inventory
kubectl get crd
kubectl get crd <name> -o yaml
# Versions served
kubectl get crd <name> -o jsonpath='{.spec.versions}'
# Validation test
cat <<EOF | kubectl apply --dry-run=server -f -
apiVersion: example.com/v1
kind: MyResource
metadata: { name: test }
spec:
  invalidField: foo
EOF
# Watch CRs
kubectl get <crd-plural> -A
kubectl get <crd-plural> <name> -o yaml
# Conversion webhook test (read the object back at a non-storage version)
kubectl get <crd-plural>.v1alpha1.<group> <name> -o yaml
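When an object is requested at a version other than the one it is stored in, the API server sends the conversion webhook a `ConversionReview` and expects one back with the same `uid`. The Python sketch below shows only that request/response envelope. It rests on assumptions: `convert_object` is a hypothetical placeholder that merely rewrites `apiVersion`, a real converter would also map fields between schemas, and HTTPS serving is omitted.

```python
import copy


def convert_object(obj, desired_api_version):
    """Placeholder converter (assumption): only rewrites apiVersion."""
    out = copy.deepcopy(obj)
    out["apiVersion"] = desired_api_version
    return out


def handle_conversion_review(review, convert=convert_object):
    """Build the ConversionReview response for a ConversionReview request."""
    request = review["request"]
    # The response must echo the request uid.
    response = {"uid": request["uid"]}
    try:
        response["convertedObjects"] = [
            convert(obj, request["desiredAPIVersion"])
            for obj in request["objects"]
        ]
        response["result"] = {"status": "Success"}
    except Exception as exc:
        # On failure, return no converted objects, only the error.
        response["result"] = {"status": "Failed", "message": str(exc)}
    return {
        "apiVersion": review["apiVersion"],
        "kind": "ConversionReview",
        "response": response,
    }
```

The converted objects are returned in the same order as `request.objects`, and the converter should leave `metadata` alone apart from labels and annotations.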
CRD example with versioning
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: cronjobs.batch.example.com
spec:
  group: batch.example.com
  names:
    kind: CronJob
    plural: cronjobs
    singular: cronjob
    shortNames: [cj]
  scope: Namespaced
  conversion:
    strategy: Webhook
    webhook:
      clientConfig:
        service:
          name: conversion-webhook-svc
          namespace: my-operator-system
          path: /convert
        caBundle: <base64-CA>
      conversionReviewVersions: ["v1"]
  versions:
    - name: v1alpha1
      served: true
      storage: false
      deprecated: true
      deprecationWarning: "batch.example.com/v1alpha1 is deprecated; use v1"
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                schedule: { type: string }
                # Without preserve-unknown-fields, the template's contents are pruned.
                jobTemplate: { type: object, x-kubernetes-preserve-unknown-fields: true }
              required: [schedule, jobTemplate]
    - name: v1
      served: true
      storage: true  # storage version
      subresources:
        status: {}  # enable status subresource
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                schedule:
                  type: string
                  pattern: '^(\*|[0-9,/*-]+\s){4}(\*|[0-9,/*-]+)$'
                concurrencyPolicy:
                  type: string
                  enum: [Allow, Forbid, Replace]
                  default: Allow
                jobTemplate:
                  type: object
                  # Without preserve-unknown-fields, the template's contents are pruned.
                  x-kubernetes-preserve-unknown-fields: true
              required: [schedule, jobTemplate]
              x-kubernetes-validations:
                - rule: "self.schedule != ''"
                  message: "schedule cannot be empty"
            status:
              type: object
              properties:
                conditions:
                  type: array
                  items:
                    type: object
                    properties:
                      type: { type: string }
                      status: { type: string }
                      lastTransitionTime: { type: string, format: date-time }
                      reason: { type: string }
                      message: { type: string }
                lastScheduleTime:
                  type: string
                  format: date-time
                observedGeneration:
                  type: integer
                  format: int64
      # In apiextensions.k8s.io/v1, printer columns are set per version.
      additionalPrinterColumns:
        - name: Schedule
          type: string
          jsonPath: .spec.schedule
        - name: Last Schedule
          type: date
          jsonPath: .status.lastScheduleTime
        - name: Age
          type: date
          jsonPath: .metadata.creationTimestamp
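The `status` block in the CRD above (a `conditions` array plus `observedGeneration`) supports the generation/observedGeneration pattern. Here is a minimal Python sketch of how a controller might maintain it. The helper names `set_condition`, `mark_reconciled`, and `is_stale` are hypothetical, as is the choice of a single `Ready` condition. In a real controller the resulting status would be written through the `/status` endpoint.

```python
from datetime import datetime, timezone


def set_condition(status, cond_type, cond_status, reason, message, now=None):
    """Upsert a condition; lastTransitionTime changes only when status flips."""
    now = now or datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    conditions = status.setdefault("conditions", [])
    for cond in conditions:
        if cond["type"] == cond_type:
            if cond["status"] != cond_status:
                cond["lastTransitionTime"] = now
            cond.update(status=cond_status, reason=reason, message=message)
            return status
    conditions.append({
        "type": cond_type,
        "status": cond_status,
        "lastTransitionTime": now,
        "reason": reason,
        "message": message,
    })
    return status


def mark_reconciled(obj, ready, reason, message, now=None):
    """Record which spec generation the controller has acted on."""
    status = obj.setdefault("status", {})
    status["observedGeneration"] = obj["metadata"]["generation"]
    set_condition(status, "Ready", "True" if ready else "False",
                  reason, message, now)
    return status


def is_stale(obj):
    """True if the spec changed after the controller last reconciled it."""
    observed = obj.get("status", {}).get("observedGeneration", 0)
    return observed < obj["metadata"]["generation"]
```

Because `metadata.generation` increments on spec changes but not on status writes, comparing it with `status.observedGeneration` tells clients whether the conditions describe the current spec.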
Common findings this catches
- Status updates overwrite spec → enable `subresources.status`.
- Validation too lax → add OpenAPI constraints + CEL rules.
- Multi-version without conversion → API client compatibility breaks.
- `storage: true` on multiple versions → only one allowed.
- Removed field still in storage → conversion webhook needed.
- Custom kubectl output not showing → `additionalPrinterColumns` missing.
- CR validation error after CRD update → existing CRs may violate new schema.
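Several of these findings can be checked mechanically against a parsed CRD, for example the output of `kubectl get crd <name> -o json` loaded into a dict. The Python sketch below is a rough heuristic under assumptions: `lint_crd` is a hypothetical helper, the wording of its messages is invented for illustration, and it covers only a few of the findings listed above.

```python
def lint_crd(crd):
    """Return a list of human-readable findings for a parsed CRD dict."""
    findings = []
    spec = crd.get("spec", {})
    versions = spec.get("versions", [])

    # Exactly one version may be the storage version.
    storage = [v["name"] for v in versions if v.get("storage")]
    if len(storage) != 1:
        findings.append(
            f"expected exactly one storage version, found {len(storage)}")

    # Several served versions without a webhook only works if the
    # schemas are identical, so flag it for review.
    served = [v for v in versions if v.get("served")]
    strategy = (spec.get("conversion") or {}).get("strategy", "None")
    if len(served) > 1 and strategy != "Webhook":
        findings.append(
            "multiple served versions with conversion strategy None")

    for v in versions:
        name = v["name"]
        if "status" not in (v.get("subresources") or {}):
            findings.append(f"{name}: status subresource not enabled")
        if not v.get("additionalPrinterColumns"):
            findings.append(f"{name}: no additionalPrinterColumns")
    return findings
```

An empty list means none of these particular checks fired. It does not mean the CRD is free of the other problems in the list, such as existing CRs that violate a tightened schema.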
When to escalate
- Schema migration for production CRs → carefully staged.
- Conversion webhook implementation — engage operator team.
- Deprecation of widely-used CRs — comms plan.
Related prompts
- Kubernetes Admission Webhook Debug Prompt: diagnose admission webhook failures (timeout, TLS cert errors, mutating/validating semantics, failure policy traps, cluster-wide outages from webhook misconfig).
- Kubernetes Cluster Upgrade Pre-Flight Planning Prompt: pre-upgrade safety review of a Kubernetes cluster going N → N+1 (or N+2 skip), covering deprecated APIs, removed features, control-plane & node ordering, workload compatibility.
- Kubernetes Operator Reconcile Loop Debug Prompt: debug operator reconciliation issues (finalizers stuck, status not updating, requeue storms, owner references, leader election).