AI for Kubernetes & Helm · By James Joyner IV · 9 min read

Kubernetes Error Guide: 'Failed to list *v1.Pod' Reflector / Informer Error

Fix reflector.go Failed to list *v1.Pod errors: RBAC Forbidden, Unauthorized tokens, and API connectivity that break controller and informer watch caches.

  • #kubernetes-helm
  • #troubleshooting
  • #errors
  • #rbac

Exact Error Message

Controllers, operators, and any client built on client-go use informers backed by a reflector that LISTs and then WATCHes a resource. When the initial LIST fails, the reflector logs a repeating error naming the Go type it could not list:

E0628 14:02:11.339481       1 reflector.go:147] k8s.io/client-go/tools/cache/reflector.go:229: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:monitoring:collector" cannot list resource "pods" in API group "" at the cluster scope

Other variants point at authentication or connectivity rather than RBAC:

reflector.go:229: Failed to list *v1.Pod: Unauthorized
reflector.go:229: Failed to list *v1.Endpoints: Get "https://10.96.0.1:443/api/v1/endpoints": dial tcp 10.96.0.1:443: connect: connection refused

The Failed to list *v1.Pod phrase, the reflector.go source location, and the repeating cadence are the signatures of a broken informer.

What the Error Means

An informer keeps a local in-memory cache of objects so a controller does not hammer the API server. The reflector populates that cache by first issuing a LIST to get the full set and a starting resourceVersion, then opening a WATCH for incremental changes. The LIST must succeed before the WATCH can start. When the LIST fails, the reflector logs Failed to list *v1.Pod and retries with backoff — forever — because without a populated cache the controller cannot function.

The Go type (*v1.Pod, *v1.Endpoints, a CRD type) tells you which resource the controller could not read. The tail of the message tells you why: forbidden is RBAC (authenticated but not authorized), Unauthorized is authentication (bad or missing token), and a dial tcp ... connection refused/no route to host is connectivity to the API server. The fix depends entirely on that tail.
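
The routing by message tail can be sketched as a small shell function. This is only an illustration: the function name and the category labels (rbac, auth, connectivity, unknown) are made up for this guide, and the patterns cover only the tails quoted above.

```shell
# classify_reflector_error LINE
# Prints a category for one reflector log line based on its tail.
# The function name and category labels are illustrative, not part of any tool.
classify_reflector_error() {
  case "$1" in
    *"is forbidden"*)
      echo "rbac" ;;          # authenticated, but RBAC denies the verb (403)
    *Unauthorized*)
      echo "auth" ;;          # token missing, invalid, or expired (401)
    *"connection refused"*|*"i/o timeout"*|*"no route to host"*)
      echo "connectivity" ;;  # the apiserver could not be reached at all
    *)
      echo "unknown" ;;
  esac
}

# Usage sketch (placeholders as elsewhere in this guide):
#   kubectl -n <ns> logs <controller-pod> | grep 'reflector.go' |
#     while IFS= read -r line; do classify_reflector_error "$line"; done |
#     sort | uniq -c
```

Checking for forbidden first is a deliberate choice in this sketch: a forbidden message also names the user and resource, so it is the most specific of the three tails.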

Common Causes

  • Missing RBAC — the ServiceAccount lacks a Role/ClusterRole granting list/watch on that resource (forbidden).
  • Wrong scope — RBAC grants namespace access but the controller lists cluster-wide (at the cluster scope).
  • Bad/expired token — the mounted ServiceAccount token is invalid, deleted, or the SA was removed (Unauthorized).
  • API server unreachable — DNS for kubernetes.default, the KUBERNETES_SERVICE_HOST env, or network policy blocks the apiserver (connection refused/timeout).
  • CRD not installed — an informer for a custom type whose CRD is absent fails to list it.
  • Aggregated/metrics API down — listing a type served by an aggregated APIService that is unavailable.
  • Throttling/overload — sustained 429s during LIST under apiserver pressure.

How to Reproduce the Error

Run a pod whose ServiceAccount has no permission to list pods. The example uses kubectl get pods -A -w, which issues a cluster-wide LIST followed by a WATCH, the same sequence an informer performs:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: collector
  namespace: monitoring
---
apiVersion: v1
kind: Pod
metadata:
  name: collector
  namespace: monitoring
spec:
  serviceAccountName: collector
  containers:
    - name: app
      image: bitnami/kubectl:latest
      command: ["sh","-c","kubectl get pods -A -w; sleep 3600"]
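
Save the manifest as collector.yaml, create the namespace if it does not exist yet, apply the manifest, and read the pod's log:

kubectl create namespace monitoring
kubectl apply -f collector.yaml
kubectl -n monitoring logs collector

Error from server (Forbidden): pods is forbidden: User "system:serviceaccount:monitoring:collector" cannot list resource "pods" in API group "" at the cluster scope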

A controller using client-go logs the same condition as the repeating reflector.go: Failed to list *v1.Pod line.

Diagnostic Commands

# Identify the ServiceAccount the failing pod runs as
kubectl get pod <pod> -n <ns> -o jsonpath='{.spec.serviceAccountName}'

# Test exactly the permission the reflector needs
kubectl auth can-i list pods \
  --as=system:serviceaccount:<ns>:<sa> -A

# Find the bindings (if any) for that ServiceAccount
kubectl get clusterrolebindings,rolebindings -A -o wide | grep <sa>

# Confirm the resource/type is even served (CRD installed?)
kubectl api-resources | grep -i <resource>

# Check apiserver reachability from the controller's namespace
# (an HTTP 401/403 reply still proves the network path works)
kubectl run netcheck --rm -it --image=busybox -n <ns> -- \
  sh -c 'nslookup kubernetes.default; wget -qO- --no-check-certificate https://kubernetes.default 2>&1 | head'

# Look for the repeating reflector lines in the controller log
kubectl -n <ns> logs <controller-pod> | grep reflector

kubectl auth can-i ... --as=system:serviceaccount:... reproduces the exact authorization decision the reflector hit.
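
An informer usually needs get, list, and watch together, so it can help to test all three in one pass. The function below is a minimal sketch built on kubectl auth can-i. The name check_informer_rbac is hypothetical, and the --as impersonation only works if your own user is allowed to impersonate ServiceAccounts.

```shell
# check_informer_rbac NAMESPACE SERVICEACCOUNT RESOURCE
# Prints one line per verb and returns non-zero if any verb is denied.
# The function name is made up for this guide.
check_informer_rbac() {
  ns=$1; sa=$2; resource=$3
  denied=0
  for verb in get list watch; do
    # kubectl auth can-i prints "yes" or "no"; -A asks about all namespaces.
    answer=$(kubectl auth can-i "$verb" "$resource" \
      --as="system:serviceaccount:${ns}:${sa}" -A 2>/dev/null)
    echo "${verb} ${resource}: ${answer:-error}"
    [ "$answer" = "yes" ] || denied=1
  done
  return "$denied"
}

# Usage sketch:
#   check_informer_rbac monitoring collector pods
```

For a controller that watches a single namespace, replace -A with -n and the namespace name; the cluster-wide form is used here because it matches the error shown at the top of this guide.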

Step-by-Step Resolution

1. Read the tail of the error. It routes you: forbidden → RBAC, Unauthorized → authentication/token, dial tcp ... refused/timeout → connectivity. Fix the matching category.

2. Fix forbidden (RBAC). Grant the ServiceAccount list and watch (and usually get) on the resource, at the scope the controller uses. For a cluster-wide informer, use a ClusterRole + ClusterRoleBinding:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: pod-reader
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]

Bind it to the SA, then confirm with kubectl auth can-i list pods --as=system:serviceaccount:<ns>:<sa> -A.
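
The binding step can be done imperatively with kubectl create clusterrolebinding. The sketch below wraps it in a function and re-runs the permission check afterwards. The function name bind_pod_reader and the binding name pattern pod-reader-<sa> are assumptions made for this example; the ClusterRole name pod-reader comes from the manifest above.

```shell
# bind_pod_reader NAMESPACE SERVICEACCOUNT
# Binds the pod-reader ClusterRole to the ServiceAccount cluster-wide,
# then re-checks the permission the reflector needs.
# The binding name "pod-reader-<sa>" is a made-up convention.
bind_pod_reader() {
  ns=$1; sa=$2
  kubectl create clusterrolebinding "pod-reader-${sa}" \
    --clusterrole=pod-reader \
    --serviceaccount="${ns}:${sa}" &&
  kubectl auth can-i list pods \
    --as="system:serviceaccount:${ns}:${sa}" -A
}

# Usage sketch:
#   bind_pod_reader monitoring collector
```

In a GitOps setup you would more likely commit an equivalent ClusterRoleBinding manifest next to the ClusterRole; the imperative form is shown only because it is quick to try during an incident.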

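3. Match the scope. at the cluster scope means the controller lists across all namespaces. If the ServiceAccount only has a namespaced RoleBinding, that cluster-wide request is still denied even though the same list inside the namespace would succeed. Either grant a ClusterRoleBinding, or restrict the informer to a single namespace.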
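
To tell a scope mismatch from a missing rule, you can compare the cluster-wide answer with the namespaced one. This is a rough sketch: the function name list_scope is hypothetical, and it only probes the ServiceAccount's own namespace, so access granted in some other namespace would go unnoticed.

```shell
# list_scope NAMESPACE SERVICEACCOUNT RESOURCE
# Prints "cluster", "namespace", or "none" for the list verb.
# "namespace" here means: allowed in the ServiceAccount's own namespace only.
list_scope() {
  ns=$1; sa=$2; resource=$3
  as="system:serviceaccount:${ns}:${sa}"
  if [ "$(kubectl auth can-i list "$resource" --as="$as" -A 2>/dev/null)" = "yes" ]; then
    echo "cluster"
  elif [ "$(kubectl auth can-i list "$resource" --as="$as" -n "$ns" 2>/dev/null)" = "yes" ]; then
    echo "namespace"
  else
    echo "none"
  fi
}

# Usage sketch:
#   list_scope monitoring collector pods
```

A result of namespace alongside an at the cluster scope error points at the mismatch described in this step rather than at a missing rule.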

4. Fix Unauthorized. The token is invalid or absent. Confirm the ServiceAccount still exists, the projected token volume is mounted, and the SA was not recreated (invalidating old tokens). Restart the pod to remount a fresh token.
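
ServiceAccount tokens are JWTs, so the claims inside the mounted token (such as exp) can be inspected directly. The helper below is a sketch that decodes the payload with standard tools. The name jwt_claims is hypothetical, the usage assumes the container image ships cat, and base64 -d is the GNU/BusyBox spelling of the decode flag.

```shell
# jwt_claims TOKEN
# Prints the decoded JSON payload (claims) of a JWT. It does not verify
# the signature; it is only meant for reading claims such as "exp".
jwt_claims() {
  # The payload is the second dot-separated field, base64url-encoded.
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the '=' padding that base64url encoding strips.
  case $(( ${#payload} % 4 )) in
    2) payload="${payload}==" ;;
    3) payload="${payload}=" ;;
  esac
  printf '%s' "$payload" | base64 -d
}

# Usage sketch:
#   token=$(kubectl -n <ns> exec <pod> -- \
#     cat /var/run/secrets/kubernetes.io/serviceaccount/token)
#   jwt_claims "$token"
```

If the exec itself fails because the file is missing, the token volume is not mounted, which is a different problem from an expired or invalidated token.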

5. Fix connectivity. A connection refused or timeout means the reflector cannot reach the apiserver. Verify kubernetes.default resolves, the kubernetes Service in default has endpoints, and no NetworkPolicy blocks egress to the apiserver. See the no endpoints available guide.
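
Two of these checks are easy to script: whether the kubernetes Service in default has endpoints, and which NetworkPolicies exist in the controller's namespace. The sketch below only gathers that information. The function name check_apiserver_path is made up, and deciding whether a listed policy actually blocks egress is still a manual review.

```shell
# check_apiserver_path NAMESPACE
# Returns non-zero if the kubernetes Service has no endpoints; otherwise
# prints the endpoint IPs and the NetworkPolicies in NAMESPACE.
check_apiserver_path() {
  ns=$1
  # The kubernetes Service in default should list at least one apiserver address.
  ips=$(kubectl get endpoints kubernetes -n default \
    -o jsonpath='{.subsets[*].addresses[*].ip}')
  if [ -z "$ips" ]; then
    echo "kubernetes Service has no endpoints"
    return 1
  fi
  echo "apiserver endpoints: ${ips}"
  # Policies in the controller's namespace may restrict egress; list them for review.
  kubectl get networkpolicy -n "$ns" -o name
}

# Usage sketch:
#   check_apiserver_path monitoring
```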

6. Install missing CRDs. If the type is custom and kubectl api-resources does not list it, install the CRD before the informer can list it.
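
After installing a CRD, the apiserver needs a moment before it serves the new type. A minimal sketch of waiting for that, using kubectl wait on the CRD's Established condition, is shown here. The function name wait_for_crd and the 60s default timeout are assumptions for illustration.

```shell
# wait_for_crd CRD_NAME [TIMEOUT]
# Fails fast if the CRD is absent; otherwise blocks until the apiserver
# reports it as Established or the timeout expires.
wait_for_crd() {
  crd=$1; timeout=${2:-60s}
  kubectl get crd "$crd" >/dev/null 2>&1 || {
    echo "CRD ${crd} is not installed"
    return 1
  }
  kubectl wait --for=condition=Established "crd/${crd}" --timeout="$timeout"
}

# Usage sketch (the CRD name is a placeholder):
#   wait_for_crd <plural>.<group> 30s
```

Running this before the controller starts avoids the window in which the informer lists a type the apiserver does not yet serve.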

Prevention and Best Practices

  • Ship every controller/operator with a Role or ClusterRole granting exactly the get/list/watch verbs its informers need — no more, no less.
  • Match RBAC scope to informer scope: cluster-wide informers need ClusterRoleBindings; single-namespace informers need RoleBindings.
  • Validate permissions in CI with kubectl auth can-i ... --as=system:serviceaccount:... so a missing rule fails the pipeline, not production.
  • Alert on repeating reflector.go ... Failed to list log lines — a stuck informer means the controller is operating on a stale or empty cache.
  • Install CRDs before the controllers that watch them (Helm hooks or ordered applies) to avoid startup list failures.
  • Keep ServiceAccount tokens projected and short-lived; do not hardcode long-lived tokens. More patterns in the Kubernetes & Helm guides.
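
As a starting point for the alerting idea above, the following sketch counts reflector failures per Go type from a log stream. It is not a replacement for a proper log-based alert rule. The function name count_reflector_failures is hypothetical, and the pattern assumes the message format quoted at the top of this guide.

```shell
# count_reflector_failures
# Reads controller logs on stdin and prints "<count> <type>" for each
# Go type the reflector failed to list or watch.
count_reflector_failures() {
  grep 'reflector.go' |
    grep -oE 'Failed to (list|watch) \*[A-Za-z0-9_.]+' |
    awk '{ print $NF }' |
    sort | uniq -c |
    awk '{ print $1, $2 }'
}

# Usage sketch:
#   kubectl -n <ns> logs <controller-pod> --since=10m | count_reflector_failures
```

A count that keeps growing across successive time windows suggests the informer is stuck rather than recovering from a transient outage.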

Frequently Asked Questions

Why does the log repeat forever instead of crashing? Reflectors are designed to retry with backoff so a transient outage self-heals. A controller cannot function without a populated cache, so it keeps trying the LIST rather than exiting. Persistent repetition means a real misconfiguration, not a blip.

What does *v1.Pod mean — is the pod itself broken? No. *v1.Pod is the Go type the informer watches. It tells you which resource the controller failed to list, not that any specific pod is unhealthy. Other types appear similarly (*v1.Endpoints, *v1.ConfigMap, CRD types).

The controller has a ClusterRole but still gets forbidden. Why? The ClusterRole exists but is probably not bound to the controller’s ServiceAccount, or it is bound with a RoleBinding (namespaced) while the informer lists cluster-wide. Check the bindings and their scope with kubectl get clusterrolebindings,rolebindings -A -o wide | grep <sa>. The -o wide flag matters: the default output omits the subject columns, so the ServiceAccount name would not appear.

How is Unauthorized different from forbidden here? Unauthorized (401) means the apiserver could not authenticate the request — a bad, missing, or expired token. forbidden (403) means it authenticated you but RBAC denies the verb. The fixes are different: token vs RBAC rule.

Could this be a network problem rather than permissions? Yes. A Failed to list ending in dial tcp ... connection refused, i/o timeout, or no route to host is connectivity to the apiserver, not RBAC. Check DNS for kubernetes.default, the kubernetes Service endpoints, and NetworkPolicy egress.
