Testing Helm Charts Before They Reach Production

The first time a Helm upgrade took down a service I owned, the chart “worked.” helm template rendered, the CI pipeline was green, and helm upgrade exited zero. The problem was that the rendered manifest pointed a readiness probe at a port the container no longer exposed. Helm doesn’t know that. It renders text and ships it. Whether that text describes a healthy workload is your job to verify.

After enough of those, I stopped treating “the chart renders” as a passing test. A chart has several distinct failure modes, and each one needs its own layer of testing. Here’s the stack I run on every chart now, cheapest checks first.

Layer 1: lint catches the obvious

helm lint is the cheapest gate, and it catches a surprising amount: missing Chart.yaml fields, invalid YAML, templates that fail to render and, in --strict mode, templates that reference values missing from the values you passed.

helm lint ./mychart --strict

The --strict flag turns warnings into failures, which is what you want in CI. Run it against your real values files too, not just the defaults, because a chart can lint clean with default values and explode with production overrides:

helm lint ./mychart --strict -f values-prod.yaml

Lint is necessary but weak. It tells you the template is syntactically valid. It says nothing about whether the output is correct.
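
One way to make "lint every values file" routine is a small wrapper that keeps going after a failure, so a single run reports every broken override. This is a sketch: lint_all is a hypothetical helper, and the values file names in the usage line are placeholders.

```shell
# lint_all CHART VALUES_FILE...
# Hypothetical helper: run `helm lint --strict` once per values file and
# report every failing file instead of stopping at the first one.
lint_all() {
  local chart="$1" failed=0 values
  shift
  for values in "$@"; do
    if ! helm lint "$chart" --strict -f "$values"; then
      echo "lint failed for $values" >&2
      failed=1
    fi
  done
  # 0 only if every values file linted clean.
  return "$failed"
}

# Usage (file names are placeholders):
#   lint_all ./mychart values-staging.yaml values-prod.yaml
```

The function returns non-zero if any file fails, so it can replace the single lint line in a pipeline.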

Layer 2: template and diff the rendered output

The single highest-value habit is rendering the chart and reading the actual YAML that will hit the cluster:

helm template myrelease ./mychart -f values-prod.yaml > rendered.yaml

Read it. Every time. The number of bugs that are obvious in the rendered output and invisible in the templates is high — a misindented env block that silently drops a variable, a label selector that doesn’t match the pod template, a replicas: with an empty value because an override was misspelled.
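
Reading the output stays the main check, but a crude grep can catch the empty-value case mechanically. A minimal sketch, assuming rendered.yaml comes from the command above; empty_scalars is a hypothetical helper and its key list is illustrative, not exhaustive.

```shell
# empty_scalars FILE
# Hypothetical helper: print "line:text" for fields that should hold a scalar
# but were rendered with nothing after the colon. Exits 1 when none are found
# (that is grep's own exit status).
empty_scalars() {
  # Illustrative key list; extend it with the scalar fields your chart templates.
  local keys='replicas|image|containerPort|port|targetPort'
  grep -nE "^[[:space:]-]*(${keys}):[[:space:]]*$" "$1"
}

# Usage: fail the job if anything matches.
#   if empty_scalars rendered.yaml; then exit 1; fi
```

This is a heuristic, not a parser. It will not notice a selector mismatch or a dropped env entry, which is why the manual read still matters.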

Better still, diff the rendered output against what’s currently deployed before you upgrade. The helm-diff plugin does exactly this:

helm plugin install https://github.com/databus23/helm-diff
helm diff upgrade myrelease ./mychart -f values-prod.yaml

This is the Helm equivalent of terraform plan. It shows you precisely which fields change, which objects get created or deleted, and — critically — whether your “small config tweak” is about to recreate a StatefulSet. I won’t run a production upgrade without reading a diff first.
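
If the diff also gates a pipeline, the plugin's exit status can separate "nothing would change" from "changes to review" from "the diff itself broke". A minimal sketch, assuming the helm-diff --detailed-exitcode flag, which per the plugin's documentation exits with 2 when changes are detected; diff_status is a hypothetical helper name.

```shell
# diff_status RELEASE CHART [extra helm-diff args...]
# Hypothetical helper: classify a `helm diff upgrade` run by its exit status.
# Assumes --detailed-exitcode makes the plugin exit 2 when changes exist.
diff_status() {
  local rc=0
  helm diff upgrade "$@" --detailed-exitcode || rc=$?
  case "$rc" in
    0) echo "no changes" ;;
    2) echo "changes pending review" ;;
    *) echo "diff failed (exit $rc)" >&2
       return "$rc" ;;
  esac
}

# Usage:
#   diff_status myrelease ./mychart -f values-prod.yaml
```

Treating exit 2 as a success keeps "there are changes" from being reported as a broken job, while a real failure (bad kubeconfig, missing release) still stops the pipeline.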

Layer 3: validate the rendered manifests against the API

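Rendered YAML can be valid YAML and still be an invalid Kubernetes object. Catch that with a schema validator. kubeconform is fast and needs no cluster: it checks each object against the published schemas for the Kubernetes version you name, which it downloads by default and can read from a local copy when the CI runner is offline: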

helm template myrelease ./mychart -f values-prod.yaml \
  | kubeconform -strict -summary -kubernetes-version 1.30.0

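This catches APIs that no longer exist in the target version (the classic apiVersion that vanished in an upgrade), required fields you forgot, and type mismatches. An API that is deprecated but still served will pass, so validate against the version you are upgrading to, not only the one you run today. If you use CRDs, point kubeconform at their schemas so it validates custom resources too; without a schema it cannot check them. This step has saved me from helm upgrade failures that would otherwise only surface mid-rollout, after Helm has already partially applied.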
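
A removed API only shows up when you validate against the version that removed it, so it can help to check the chart against both the version you run and the one you plan to upgrade to. A sketch under those assumptions: validate_versions is a hypothetical helper, and the second version number in the usage line is a placeholder. It renders to a temporary file first so that a template failure is not hidden behind a pipe.

```shell
# validate_versions CHART VALUES_FILE K8S_VERSION...
# Hypothetical helper: render and schema-check the chart once per version.
validate_versions() {
  local chart="$1" values="$2" rendered failed=0 v
  shift 2
  rendered=$(mktemp) || return 1
  for v in "$@"; do
    # --kube-version sets .Capabilities.KubeVersion for templates that branch on it.
    if ! helm template myrelease "$chart" -f "$values" --kube-version "$v" > "$rendered" \
        || ! kubeconform -strict -summary -kubernetes-version "$v" "$rendered"; then
      echo "validation failed for Kubernetes $v" >&2
      failed=1
    fi
  done
  rm -f "$rendered"
  return "$failed"
}

# Usage (1.31.0 stands in for whatever you upgrade to next):
#   validate_versions ./mychart values-prod.yaml 1.30.0 1.31.0
```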

Layer 4: enforce a values schema

If your chart is consumed by other teams, a values.schema.json file turns “I typo’d replicaCount as replicaCounts” from a silent no-op into a hard error at install time. Helm validates values against this JSON schema automatically:

{
  "$schema": "https://json-schema.org/draft-07/schema#",
  "type": "object",
  "required": ["image", "replicaCount"],
  "properties": {
    "replicaCount": { "type": "integer", "minimum": 1 },
    "image": {
      "type": "object",
      "required": ["repository", "tag"],
      "properties": {
        "repository": { "type": "string" },
        "tag": { "type": "string" }
      }
    }
  }
}

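One caveat: this schema only rejects the replicaCounts typo when replicaCount has no default in values.yaml, because an unknown extra key is otherwise allowed. Setting "additionalProperties": false on an object closes that gap. It is aggressive but worth it for internal charts: typos fail loudly instead of being ignored.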
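
As a sketch of what that looks like, here is the same schema with "additionalProperties": false at the top level and on image. The property names are the ones from the example above; which objects you close off is a judgment call, not something Helm prescribes.

```json
{
  "$schema": "https://json-schema.org/draft-07/schema#",
  "type": "object",
  "additionalProperties": false,
  "required": ["image", "replicaCount"],
  "properties": {
    "replicaCount": { "type": "integer", "minimum": 1 },
    "image": {
      "type": "object",
      "additionalProperties": false,
      "required": ["repository", "tag"],
      "properties": {
        "repository": { "type": "string" },
        "tag": { "type": "string" }
      }
    }
  }
}
```

With this in place, an override such as replicaCounts: 3 is rejected as an unknown property instead of being dropped. The cost is that every legitimate key, including any your chart's values.yaml already defines, has to be declared under properties first.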

Layer 5: helm test for live behavior

Everything above is static analysis. helm test runs actual workloads against a deployed release. You define test pods as templates annotated with the test hook:

apiVersion: v1
kind: Pod
metadata:
  name: "{{ .Release.Name }}-connection-test"
  annotations:
    "helm.sh/hook": test
spec:
  restartPolicy: Never
  containers:
    - name: curl
      image: curlimages/curl:8.8.0
      command: ["curl"]
      args: ["--fail", "http://{{ .Release.Name }}-svc:8080/healthz"]

After installing into a throwaway namespace, run:

helm test myrelease --logs

A non-zero exit means the test pod failed. This is where you assert the things static checks can’t: the service actually resolves, the health endpoint actually responds, the database migration job actually completed. Run this against an ephemeral cluster (kind or a CI namespace) so a broken chart never reaches a real environment.
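
Put together, an ephemeral run is install, test, tear down, with the teardown happening even when an earlier step fails. A minimal sketch, assuming a disposable namespace you are allowed to create; run_chart_test is a hypothetical helper and the release name t is arbitrary.

```shell
# run_chart_test CHART VALUES_FILE NAMESPACE
# Hypothetical helper: install, run `helm test`, and always uninstall.
# Returns the install/test result, not the cleanup result.
run_chart_test() {
  local chart="$1" values="$2" ns="$3" rc=0
  helm install t "$chart" -f "$values" --namespace "$ns" --create-namespace --wait \
    && helm test t --namespace "$ns" --logs \
    || rc=$?
  # Cleanup runs whether or not the steps above succeeded.
  helm uninstall t --namespace "$ns" || echo "cleanup failed in $ns" >&2
  return "$rc"
}

# Usage:
#   run_chart_test ./mychart values-ci.yaml "ci-$CI_JOB_ID"
```

Keeping the exit code of the failing step means the job reports the real problem, while a leftover release from a failed run does not block the next one.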

Wire it into CI

The whole stack belongs in a pipeline that runs on every chart change:

test-chart:
  script:
    - helm lint ./mychart --strict -f values-prod.yaml
    - helm template r ./mychart -f values-prod.yaml | kubeconform -strict -summary -kubernetes-version 1.30.0
    - helm install t ./mychart -f values-ci.yaml --namespace ci-$CI_JOB_ID --create-namespace --wait
    - helm test t --namespace ci-$CI_JOB_ID --logs
  # after_script (GitLab CI) runs even when a script step fails, so the release is always removed
  after_script:
    - helm uninstall t --namespace ci-$CI_JOB_ID

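The --wait flag matters: it blocks until the release's pods and other resources report ready, and fails once the timeout (5 minutes by default, adjustable with --timeout) expires. A chart that deploys but never becomes healthy then fails the job instead of passing.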

Where AI fits

The tedious part of chart testing is writing the test cases and reading dense rendered output for subtle problems. That’s exactly where I lean on AI. Paste in a rendered manifest and ask it to flag mismatched selectors, probes pointing at undeclared ports, or resource blocks that don’t add up — it’s a fast second pair of eyes on the diff. It’s also good at scaffolding a values.schema.json from an existing values.yaml. If you want that review built into your workflow, our AI code review tool reads rendered manifests and chart diffs and calls out exactly these classes of bug before they merge.

The mindset shift is the whole point: a chart isn’t “done” when it renders. It’s done when you’ve proven the rendered output is valid, the values are constrained, and a real pod comes up healthy. For more Kubernetes practices like this, see the rest of our Kubernetes and Helm guides.

Generated chart reviews and test scaffolding are assistive, not authoritative. Always validate against your own cluster and version before upgrading production.