Azure Key Vault Secrets and Rotation With AI as a Second Set of Eyes

A secret in a Key Vault had a not-before date from three years prior and no expiration set. The connection string it held was for a database that had been migrated twice since. Five different applications still referenced it by name, three of them dead. Nobody knew which app actually used the live value, so nobody dared rotate it, so it sat there — a long-lived credential with no expiry, readable by an access policy that granted get and list to a group half the company belonged to. That’s not a Key Vault problem. That’s a nobody-is-looking problem, and Key Vault is full of them.

Key Vault is the right place to put secrets. The failure is everything around it: access policies that grew permissive, secrets that never expire, rotation that’s “on the roadmap,” and references nobody can trace. The data to fix all of this is queryable. AI is good at reading vault metadata and access configs and telling you what’s stale, over-broad, or unrotatable. It does not hold your secrets or rotate them for you. You run the audit and approve the change; it reads the sprawl you’ve been avoiding.

Audit what’s actually in the vault

Start with the secrets’ metadata — you never need the values to find the problems. Expiry, enabled state, and age are all you need:

# Every secret with its expiry and last-updated, no values exposed
az keyvault secret list --vault-name "$VAULT" \
  --query "[].{name:name, enabled:attributes.enabled, expires:attributes.expires, updated:attributes.updated}" \
  -o table

The secrets with expires: null (an empty cell in the table view) are the ones from my opening story: credentials with no end date. Pipe the full list to AI:

Prompt: “Here is the metadata for every secret in a production Key Vault (names, enabled state, expiry, last-updated). Flag: (1) secrets with no expiration set, (2) secrets not updated in over a year, (3) any naming that suggests duplicates or dead apps (e.g. db-conn-old, db-conn-v2). For each flag, suggest the rotation or cleanup action. Do not ask for the secret values — you don’t need them.”

The AI builds you a remediation list from metadata alone. Reading thirty secrets by eye, you miss the one with the null expiry; AI doesn’t. You verify each flagged secret against which app uses it before touching anything.
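A deterministic pre-filter is cheap insurance before anything goes to a model. What follows is a minimal sketch, not a finished tool: flag_secrets and its cutoff argument are names I made up, and it assumes the metadata is exported as TSV with a list-form query (so the column order is fixed) and that a missing expiry shows up as an empty field, None, or null. Check what your CLI version actually prints.

```shell
# flag_secrets CUTOFF_DATE
# Reads TSV rows on stdin: name, enabled, expires, updated.
# Prints NO_EXPIRY for secrets without an expiry and STALE for secrets
# last updated before CUTOFF_DATE (YYYY-MM-DD). ISO dates sort as strings,
# so a plain string comparison is enough.
#
# Hypothetical wiring (GNU date assumed for computing the cutoff):
#   az keyvault secret list --vault-name "$VAULT" \
#     --query "[].[name, attributes.enabled, attributes.expires, attributes.updated]" \
#     -o tsv | flag_secrets "$(date -u -d '1 year ago' +%F)"
flag_secrets() {
  awk -F'\t' -v cutoff="$1" '
    {
      name = $1; expires = $3; updated = $4
      # Assumption: a missing expiry is rendered as empty, None, or null
      if (expires == "" || expires == "None" || expires == "null")
        print "NO_EXPIRY\t" name
      if (updated != "" && updated != "None" && substr(updated, 1, 10) < cutoff)
        print "STALE\t" name
    }'
}
```

The output is a short list of names you can check by hand, and it gives you something to compare the model’s flags against.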

Find the over-broad access before someone exploits it

Key Vault has two access models, legacy access policies and Azure RBAC, and the legacy policies are where over-grants hide: a policy applies to every secret in the vault and can’t be scoped to just one, so a grant meant for a single app ends up covering everything. Pull them:

# Legacy access policies: who can do what to this vault
az keyvault show --name "$VAULT" \
  --query "properties.accessPolicies[].{objectId:objectId, secrets:permissions.secrets, keys:permissions.keys, certs:permissions.certificates}" \
  -o json

# Is the vault on RBAC or legacy policies?
az keyvault show --name "$VAULT" --query "properties.enableRbacAuthorization" -o tsv

Feed the access policies to AI for the threat read:

Prompt: “Here are a Key Vault’s access policies. Flag any policy granting purge, delete, or set on secrets to a group rather than a specific service identity, and any granting all permissions. Explain what an attacker could do with each over-broad grant — note that purge defeats soft-delete recovery. Recommend the minimal permission set for an app that only needs to READ one secret at runtime.”

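purge rights are the quiet danger: they let someone permanently destroy a secret past soft-delete recovery, which is a denial-of-service and an evidence-destruction path in one. Purge protection, if it’s enabled on the vault, blocks that until the retention period runs out, so check that setting while you’re here. AI surfaces the grant from the permissions array; you decide who legitimately needs it (almost nobody).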
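The same check can be done mechanically before the model weighs in. This is a rough sketch under stated assumptions: flag_risky_policies is a hypothetical helper, and it expects TSV rows of object ID plus a comma-separated list of secret permissions. The query in the comment is one way to produce that shape, and it may need adjusting for policies that carry no secret permissions at all.

```shell
# flag_risky_policies
# Reads TSV rows on stdin: objectId, comma-separated secret permissions.
# Prints the objectId and the risky subset (purge, delete, set, all).
#
# Hypothetical wiring:
#   az keyvault show --name "$VAULT" \
#     --query "properties.accessPolicies[].[objectId, join(',', permissions.secrets)]" \
#     -o tsv | flag_risky_policies
flag_risky_policies() {
  awk -F'\t' '
    {
      # Permission names can come back in mixed case, so normalize first
      n = split(tolower($2), perms, ",")
      risky = ""
      for (i = 1; i <= n; i++) {
        p = perms[i]
        gsub(/^ +| +$/, "", p)
        if (p == "purge" || p == "delete" || p == "set" || p == "all")
          risky = risky (risky == "" ? "" : ",") p
      }
      if (risky != "") print $1 "\t" risky
    }'
}
```

It only tells you which identities hold destructive permissions. Whether an object ID is a group or a single service identity still takes a directory lookup and a human call.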

Move to RBAC, but verify the cutover doesn’t break apps

Migrating from access policies to Key Vault RBAC is the right modernization, but it’s where you break production if you’re sloppy — flip the model and every app whose identity lacks an RBAC role loses access instantly. Map the existing policies to RBAC roles before flipping:

# What identities currently have access (to map to RBAC roles)
az keyvault show --name "$VAULT" --query "properties.accessPolicies[].objectId" -o tsv

Prompt: “I’m migrating this Key Vault from access policies to RBAC. Here are the current access policies. For each identity, recommend the equivalent built-in Key Vault RBAC role — Key Vault Secrets User for read-only secret access, Key Vault Secrets Officer for management, etc. Give me the az role assignment create commands. Then list the exact verification steps to confirm every app still has access BEFORE I disable access policies.”

The verification-before-cutover step is the one people skip and regret. AI maps get+list on secrets to Key Vault Secrets User reliably because it’s a well-defined correspondence, but you run the test that confirms each app’s managed identity can still read its secret. Never flip enableRbacAuthorization on a live vault until that test passes. This is the same human-verifies-the-blast-radius discipline that applies to all Azure RBAC and identity work.
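To make that correspondence concrete, here is a sketch that turns policy rows into proposed role assignments and prints them for review rather than running them. suggest_role_commands is a hypothetical helper, the input is TSV of object ID and comma-separated secret permissions, and mapping anything beyond get and list to Key Vault Secrets Officer is my starting assumption for a human to narrow, not a rule.

```shell
# suggest_role_commands SCOPE
# Reads TSV rows on stdin: objectId, comma-separated secret permissions.
# Prints (does not run) one az role assignment create command per identity.
# Assumption: get/list only -> Key Vault Secrets User; anything broader ->
# Key Vault Secrets Officer, to be reviewed and narrowed by a human.
suggest_role_commands() {
  awk -F'\t' -v scope="$1" '
    {
      n = split(tolower($2), perms, ",")
      if (n == 0) next            # no secret permissions, nothing to map
      role = "Key Vault Secrets User"
      for (i = 1; i <= n; i++) {
        p = perms[i]
        gsub(/^ +| +$/, "", p)
        if (p != "get" && p != "list")
          role = "Key Vault Secrets Officer"
      }
      printf "az role assignment create --assignee-object-id %s --role \"%s\" --scope \"%s\"\n", $1, role, scope
    }'
}
```

Pass the vault’s resource ID as the scope, read every printed line, and run only the ones you agree with. The assignments can exist alongside the access policies, which is what lets you test before flipping the model.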

Automate rotation, with the human owning the trigger

The endgame is secrets that rotate themselves. Key Vault’s built-in rotation policies cover cryptographic keys; for secrets such as database passwords, you wire an Event Grid SecretNearExpiry event to a Function that rotates and updates the secret. AI is genuinely useful for drafting the rotation policy and the Function logic, but rotation that fails silently is worse than no rotation, so the design has to fail loud.

# Set an expiry so near-expiry events actually fire (this sets the date only, not a rotation policy)
az keyvault secret set-attributes --vault-name "$VAULT" --name "$SECRET" \
  --expires "2026-12-21T00:00:00Z"

Prompt: “Draft the architecture for auto-rotating a database password stored in Key Vault: Event Grid on SecretNearExpiry triggers a Function that generates a new password, updates the database, then writes the new version to Key Vault. Critically: what happens if the database update succeeds but the Key Vault write fails, or vice versa? Design it so a partial failure alerts a human instead of silently leaving apps with a stale secret.”

That ordering question — update the source of truth, verify, then update the vault, and alert on any mismatch — is the difference between rotation that helps and rotation that causes a 3 a.m. outage. AI reasons through the partial-failure modes well when you force it to; you own the decision to ship the design and the alerting that keeps a human in the loop.
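Here is that ordering as a skeleton, written as a sketch rather than a production Function. The four hooks (update_database, verify_database_login, write_vault_version, alert_human) are hypothetical placeholders you would replace with real calls; the stubs fail by default so an unconfigured run alerts instead of pretending to rotate.

```shell
# Placeholder hooks (hypothetical). Replace each with a real implementation.
update_database()       { echo "update_database not implemented" >&2; return 1; }
verify_database_login() { echo "verify_database_login not implemented" >&2; return 1; }
write_vault_version()   { echo "write_vault_version not implemented" >&2; return 1; }
alert_human()           { echo "ALERT: $1" >&2; }

# rotate_secret NEW_PASSWORD
# Order: update the source of truth, verify it, then write the vault.
# Every partial failure alerts and returns a distinct non-zero code.
rotate_secret() {
  new_password="$1"
  if ! update_database "$new_password"; then
    alert_human "database update failed; vault unchanged, old secret still in use"
    return 1
  fi
  if ! verify_database_login "$new_password"; then
    alert_human "database updated but the new credential failed verification"
    return 2
  fi
  if ! write_vault_version "$new_password"; then
    alert_human "database rotated but Key Vault write failed; apps hold a stale secret"
    return 3
  fi
  return 0
}
```

Return code 3 is the 3 a.m. case: the database already has the new password and the vault does not, so that alert needs to reach a person rather than a log nobody reads.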

The discipline

AI reads vault metadata, audits access, and drafts rotation logic; you approve every grant and every rotation. Secrets are the one place where “let the AI just run it” is genuinely dangerous — so keep the model on analysis and drafting, and keep the apply behind a human and a verification test. The loop: audit metadata for stale and unexpiring secrets, flag over-broad and purge-granting access, map to RBAC and test before cutover, and design rotation to fail loud. Do that and the three-year-old null-expiry secret never happens.

My Key Vault audit prompts live in the prompts library, and there’s more Azure security material in the Azure category. The vault did its job holding the secret; the failure was always that nobody read the sprawl around it. Let the model read it, and keep your hands on the rotation.
