Kubernetes Error Guide: 'Error: UPGRADE FAILED' Helm Release Stuck in pending-upgrade
Fix the Helm Error: UPGRADE FAILED with another operation in progress: clear pending-upgrade releases, immutable field rejections, failed hooks, and resource conflicts.
- #kubernetes
- #troubleshooting
- #errors
- #helm
Overview
Error: UPGRADE FAILED is Helm’s catch-all for an upgrade that did not complete. The most common variant is a release left in a pending-upgrade (or pending-install) state after a previous helm upgrade was killed mid-flight — Helm records each release revision in a Kubernetes Secret and refuses to start a new operation while one appears in progress.
You will see one of these on the next attempt:
Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress
Or an upgrade that the API server rejected:
Error: UPGRADE FAILED: cannot patch "api" with kind Deployment: Deployment.apps "api" is invalid: spec.selector: Invalid value: ... : field is immutable
It occurs when you run helm upgrade against a release whose last revision is still marked pending (because the previous run was interrupted by Ctrl-C, a CI timeout, or a crash), or when the new chart asks the API server to do something it will not allow. Because Helm stores release state in-cluster, a stuck revision blocks every subsequent operation until it is resolved.
Symptoms
- helm upgrade fails immediately with another operation ... is in progress.
- helm list does not show the release (by default it lists only deployed and failed releases), but helm list -a shows it as pending-upgrade/pending-install.
- A rollout is half-applied: some new pods exist, the release status is stuck.
- Repeated upgrades keep failing the same way without changing anything.
helm list -a -n shop
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
api shop 8 2026-06-23 14:01:55 UTC pending-upgrade api-2.4.1 1.8.2
helm status api -n shop
NAME: api
LAST DEPLOYED: Mon Jun 23 14:01:55 2026
NAMESPACE: shop
STATUS: pending-upgrade
REVISION: 8
Common Root Causes
1. A previous operation crashed, leaving the release pending
The last helm upgrade was interrupted (Ctrl-C, CI job killed, network drop), so revision 8 is frozen at pending-upgrade and never resolved to deployed or failed.
helm history api -n shop
REVISION UPDATED STATUS CHART APP VERSION DESCRIPTION
6 2026-06-22 09:11:02 UTC superseded api-2.3.0 1.8.0 Upgrade complete
7 2026-06-23 10:40:18 UTC deployed api-2.4.0 1.8.1 Upgrade complete
8 2026-06-23 14:01:55 UTC pending-upgrade api-2.4.1 1.8.2 Preparing upgrade
Revision 8 stuck at pending-upgrade is the lock that blocks new upgrades.
2. The “another operation in progress” lock
Helm sees a pending revision and aborts before doing anything, even if no process is actually running.
helm upgrade api ./api -n shop
Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress
This is a state problem, not a chart problem — the pending revision must be cleared (rollback or delete the stuck revision Secret).
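A pre-flight check along the following lines can make a pipeline fail fast with a clearer message than the lock error. This is a minimal sketch, assuming Helm 3's default Secret storage backend and its owner, name, and status labels; the function name pending_revisions is hypothetical.

```shell
# Hypothetical helper: print the storage Secrets of any revision of a release
# that is still marked pending. Assumes Helm 3's default Secret backend.
pending_revisions() {
  local ns="$1" release="$2"
  kubectl get secret -n "$ns" \
    -l "owner=helm,name=${release},status in (pending-install,pending-upgrade,pending-rollback)" \
    -o name
}

# Usage (commented out so that sourcing this file changes nothing):
# if [ -n "$(pending_revisions shop api)" ]; then
#   echo "release api has a pending revision; resolve it before upgrading" >&2
#   exit 1
# fi
```

The check only reads labels, so it is safe to run before every deploy. It does not tell you whether another Helm process is genuinely still running; confirm that separately before clearing anything.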
3. An immutable field change rejected by the API
The new chart changes a field Kubernetes does not allow to be patched (e.g., a Deployment/StatefulSet spec.selector, a Service clusterIP, a Job template), so the upgrade fails partway.
helm upgrade api ./api -n shop 2>&1 | tail -3
Error: UPGRADE FAILED: cannot patch "api" with kind Deployment: Deployment.apps "api" is invalid: spec.selector: Invalid value: v1.LabelSelector{...}: field is immutable
Changing the selector labels means the existing object cannot be patched in place.
4. A failed hook job
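A pre-upgrade/post-upgrade hook (often a migration Job) failed or timed out, so Helm marks the release failed (or leaves it pending if the client is killed while it is still waiting on the hook). Hooks are marked with the helm.sh/hook annotation, not a label, so a -l selector does not match them unless the chart also sets a label. List the Jobs and find the hook by name:
kubectl get jobs -n shop
NAME COMPLETIONS DURATION AGE
api-db-migrate 0/1 5m 5m
kubectl logs job/api-db-migrate -n shop --tail=20
...
Error: relation "users" already exists (SQLSTATE 42P07)
migration 0014 failed; aborting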
The hook Job never completed, so Helm could not finish the upgrade.
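In a busy namespace it can help to filter Jobs down to the ones Helm created as hooks. Helm marks hooks with the helm.sh/hook annotation, which label selectors cannot match, so the sketch below reads the annotation through jsonpath instead. The function name list_hook_jobs is hypothetical.

```shell
# Hypothetical helper: list Jobs in a namespace that carry the helm.sh/hook
# annotation. Output is tab-separated: job name, then the hook type(s).
list_hook_jobs() {
  local ns="$1"
  kubectl get jobs -n "$ns" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.helm\.sh/hook}{"\n"}{end}' |
    awk -F'\t' '$2 != ""'   # keep only rows where the annotation is set
}

# Usage:
# list_hook_jobs shop
```

Whether a failed hook Job is still present depends on the chart's hook delete policy, so an empty result does not prove that no hook ran.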
5. A resource already exists, not owned by the release
The chart tries to create an object that already exists in the cluster but lacks the Helm ownership metadata, so Helm refuses to adopt it.
helm upgrade api ./api -n shop 2>&1 | tail -2
Error: UPGRADE FAILED: rendered manifests contain a resource that already exists. Unable to continue with update: ConfigMap "api-config" in namespace "shop" exists and cannot be imported into the current release: invalid ownership metadata; label validation error: missing key "app.kubernetes.io/managed-by"
The api-config ConfigMap was created out-of-band, so Helm will not take ownership.
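If the existing object should become part of the release, one option is to add the ownership metadata that Helm validates: the app.kubernetes.io/managed-by=Helm label named in the error above, plus the meta.helm.sh/release-name and meta.helm.sh/release-namespace annotations. The sketch below does that; the function name adopt_into_release is hypothetical.

```shell
# Hypothetical helper: add the ownership label and annotations Helm checks
# before it will adopt an existing object into a release.
adopt_into_release() {
  local ns="$1" release="$2" kind="$3" name="$4"
  kubectl label "$kind" "$name" -n "$ns" --overwrite \
    app.kubernetes.io/managed-by=Helm &&
  kubectl annotate "$kind" "$name" -n "$ns" --overwrite \
    "meta.helm.sh/release-name=${release}" \
    "meta.helm.sh/release-namespace=${ns}"
}

# Usage:
# adopt_into_release shop api configmap api-config
```

After adoption, the next upgrade reconciles the object to whatever the chart renders, so out-of-band contents may be overwritten. If the object is not meant to belong to the release, renaming it or removing it from the chart is the safer route.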
6. Timeout waiting for resources to become ready
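With --wait, Helm blocks until Pods, PVCs, Services, and the minimum number of Pods of each Deployment, StatefulSet, or ReplicaSet are ready. If they do not become ready within --timeout, the upgrade fails and the release is marked failed; it is left pending only if the Helm process is killed before it can record the result.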
helm upgrade api ./api -n shop --wait --timeout 5m 2>&1 | tail -2
Error: UPGRADE FAILED: timed out waiting for the condition
kubectl get pods -n shop -l app=api
NAME READY STATUS RESTARTS AGE
api-7c9d8f6b54-k2v8p 0/1 CrashLoopBackOff 4 5m
The new pods never became Ready, so the --wait upgrade timed out.
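For a CrashLoopBackOff like the one above, the logs of the previous container attempt usually explain why readiness was never reached. The following sketch loops over the pods matching a selector and prints those logs. The function name show_crash_logs is hypothetical, and it assumes single-container pods (multi-container pods would need -c or --all-containers).

```shell
# Hypothetical helper: for each pod matching a label selector, print a header
# and the tail of the previous container attempt's logs.
show_crash_logs() {
  local ns="$1" selector="$2" pod
  for pod in $(kubectl get pods -n "$ns" -l "$selector" -o name); do
    echo "== ${pod}"
    # --previous fails for pods that have never restarted; keep going anyway.
    kubectl logs -n "$ns" "$pod" --previous --tail=20 || true
  done
}

# Usage:
# show_crash_logs shop app=api
```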
Diagnostic Workflow
Step 1: See the real release state
helm list -a -n <NS>
helm status <RELEASE> -n <NS>
helm history <RELEASE> -n <NS>
-a is essential — a pending-upgrade/pending-install release is hidden from a plain helm list.
Step 2: Inspect the release storage Secrets
kubectl get secret -n <NS> -l owner=helm,name=<RELEASE> \
-L status,version --sort-by=.metadata.creationTimestamp
NAME STATUS VERSION
sh.helm.release.v1.api.v7 deployed 7
sh.helm.release.v1.api.v8 pending-upgrade 8
The latest Secret holds the stuck pending-upgrade revision.
Step 3: Read the actual error from the last attempt
helm upgrade <RELEASE> <CHART> -n <NS> --dry-run 2>&1 | tail -5
kubectl get events -n <NS> --sort-by=.lastTimestamp | tail -15
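Distinguish a state lock (another operation in progress) from a real rejection (immutable field, ownership, timeout, hook). A dry run does not patch objects, run hooks, or wait for readiness, so immutable-field rejections, hook failures, and --wait timeouts show up only in the output of the real upgrade, in the events, and in the pod or Job logs.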
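One way to get closer to the API server's verdict without changing anything is to render the chart and submit it with a server-side dry run. The sketch below assumes the same values flags as the real upgrade are passed as extra arguments; the function name server_side_preflight is hypothetical. kubectl apply does not compute patches the same way Helm does, so treat the result as an approximation that should surface validation errors such as immutable fields, not as a guarantee.

```shell
# Hypothetical helper: render the chart locally, then ask the API server to
# validate the manifests without persisting them.
server_side_preflight() {
  local ns="$1" release="$2" chart="$3"
  shift 3   # remaining arguments (for example -f values.yaml) go to helm
  helm template "$release" "$chart" -n "$ns" "$@" |
    kubectl apply -n "$ns" --dry-run=server -f -
}

# Usage:
# server_side_preflight shop api ./api -f values.yaml
```

helm template also renders hook manifests and does not read values stored in the release, so the output can differ from what helm upgrade would send.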
Step 4: Resolve the stuck state
If a known-good prior revision exists, roll back — this clears the pending state cleanly:
helm rollback <RELEASE> <LAST_GOOD_REVISION> -n <NS>
If rollback is not possible (e.g., the first install is pending-install), delete the stuck revision Secret so Helm forgets the in-progress operation:
kubectl delete secret -n <NS> sh.helm.release.v1.<RELEASE>.v<STUCK_VERSION>
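The choice between the two options can be sketched as a small planner that only prints the command it would recommend. It assumes Helm 3's default Secret backend and that the last good revision is still labelled deployed, as in the Secret listing from Step 2; the function name plan_unstick is hypothetical.

```shell
# Hypothetical helper: print (never execute) the command that would clear a
# stuck release. Review the output before running it yourself.
plan_unstick() {
  local ns="$1" release="$2" good stuck s
  # Highest revision whose storage Secret is still labelled "deployed".
  good="$(kubectl get secret -n "$ns" \
    -l "owner=helm,name=${release},status=deployed" \
    -o jsonpath='{range .items[*]}{.metadata.labels.version}{"\n"}{end}' |
    sort -n | tail -n 1)"
  if [ -n "$good" ]; then
    echo "helm rollback ${release} ${good} -n ${ns}"
    return 0
  fi
  # No deployed revision (for example a first install stuck in
  # pending-install): fall back to deleting the pending revision Secret(s).
  stuck="$(kubectl get secret -n "$ns" \
    -l "owner=helm,name=${release},status in (pending-install,pending-upgrade,pending-rollback)" \
    -o name)"
  for s in $stuck; do
    echo "kubectl delete -n ${ns} ${s}"
  done
}

# Usage:
# plan_unstick shop api
```

Printing instead of executing is deliberate: deleting a revision Secret discards Helm's record of that revision, so a human should confirm that no Helm process is still running first. The sketch does not handle a history in which every earlier revision is superseded or failed.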
Step 5: Re-run the upgrade after fixing the underlying cause
# fix the chart (immutable field, hook, ownership) first, then:
helm upgrade <RELEASE> <CHART> -n <NS> --atomic --timeout 5m
helm status <RELEASE> -n <NS>
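--atomic rolls back automatically when the upgrade fails, and it implies --wait, so a failed run does not leave the release in pending-upgrade. It cannot help if the Helm process itself is killed outright (for example by SIGKILL), which is why the CI job timeout must be longer than Helm's --timeout.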
Example Root Cause Analysis
A CI pipeline runs helm upgrade api ./api -n shop. The job hits its 5-minute limit and is killed mid-upgrade. The next run fails instantly:
Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress
Checking the real state:
helm list -a -n shop
NAME NAMESPACE REVISION STATUS CHART APP VERSION
api shop 8 pending-upgrade api-2.4.1 1.8.2
Revision 8 is stuck pending-upgrade — the lock the new run is hitting. The history shows revision 7 was a healthy deployed:
helm history api -n shop
REVISION STATUS CHART APP VERSION DESCRIPTION
7 deployed api-2.4.0 1.8.1 Upgrade complete
8 pending-upgrade api-2.4.1 1.8.2 Preparing upgrade
There is a known-good revision to fall back to, so a rollback both clears the lock and restores a consistent release:
helm rollback api 7 -n shop
Rollback was a success! Happy Helming!
helm status api -n shop
STATUS: deployed
REVISION: 9
With the release back to deployed, the upgrade can be retried, this time with --atomic so a failed run rolls back instead of leaving another pending revision, and with a CI job limit longer than Helm's --timeout so the pipeline does not kill Helm mid-operation.
Prevention Best Practices
- Always upgrade with --atomic --timeout <duration> so a failed or interrupted run auto-rolls-back instead of leaving the release pending-upgrade.
- Give CI jobs a timeout longer than Helm’s --timeout, so the pipeline never kills Helm mid-operation and creates a stuck revision.
- Avoid changing immutable fields (Deployment/StatefulSet spec.selector, Service clusterIP, Job templates); when unavoidable, delete and recreate the resource deliberately rather than patching.
- Make hook Jobs idempotent and bounded (sensible backoffLimit/activeDeadlineSeconds) so a pre-upgrade migration cannot hang the whole release.
- Manage every object through Helm: out-of-band kubectl create of a name the chart also renders causes ownership conflicts on upgrade. See the Kubernetes & Helm guides for chart hygiene.
- Before retrying a stuck upgrade, confirm with helm list -a and helm history so you roll back to a real known-good revision rather than guessing.
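The first two practices can be combined in a CI wrapper like the sketch below. It assumes GNU coreutils timeout is available on the runner. The 7m outer limit and the 60s grace period are illustrative values, and the function name safe_upgrade is hypothetical. Sending SIGTERM before SIGKILL is intended to give Helm a chance to clean up; verify how your Helm version reacts to SIGTERM before relying on it.

```shell
# Hypothetical CI wrapper: Helm gets a 5m budget, the surrounding command 7m.
# SIGTERM is sent first; SIGKILL follows only after a 60s grace period.
safe_upgrade() {
  local ns="$1" release="$2" chart="$3"
  shift 3   # remaining arguments (for example -f values.yaml) go to helm
  timeout --signal=TERM --kill-after=60s 7m \
    helm upgrade "$release" "$chart" -n "$ns" --atomic --timeout 5m "$@"
}

# Usage:
# safe_upgrade shop api ./api -f values.yaml
```

The same ordering applies to the CI system's own job timeout: it should exceed the outer limit here so that the runner is never the first thing to kill the process.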
Quick Command Reference
# See the real (including hidden) release state
helm list -a -n <NS>
helm status <RELEASE> -n <NS>
helm history <RELEASE> -n <NS>
# Release storage Secrets (Helm 3 stores state here)
kubectl get secret -n <NS> -l owner=helm,name=<RELEASE> -L status,version --sort-by=.metadata.creationTimestamp
# Find the underlying error
helm upgrade <RELEASE> <CHART> -n <NS> --dry-run 2>&1 | tail -5
kubectl get events -n <NS> --sort-by=.lastTimestamp | tail -15
# Inspect a failed hook job (helm.sh/hook is an annotation, so -l cannot select it)
kubectl get jobs -n <NS>
kubectl logs job/<HOOK_JOB> -n <NS> --tail=20
# Clear the stuck state
helm rollback <RELEASE> <LAST_GOOD_REVISION> -n <NS>
kubectl delete secret -n <NS> sh.helm.release.v1.<RELEASE>.v<STUCK_VERSION>
# Re-run safely
helm upgrade <RELEASE> <CHART> -n <NS> --atomic --timeout 5m
Conclusion
Error: UPGRADE FAILED means Helm could not complete an upgrade — most often because a prior run left the release locked in pending-upgrade/pending-install. The usual root causes:
- A previous operation crashed or was killed, freezing a revision in a pending state.
- The “another operation in progress” lock blocks every new run until cleared.
- An immutable field change (selector, clusterIP, Job template) is rejected by the API.
- A pre-/post-upgrade hook Job failed or timed out.
- A resource already exists in the cluster without Helm ownership metadata.
- A --wait upgrade timed out because new resources never became Ready.
Start with helm list -a and helm history to see the real state, then roll back to a known-good revision (or delete the stuck revision Secret) before fixing the underlying cause and retrying with --atomic.