Grafana Dashboards as Code with Grafonnet: A GitOps Workflow

I once inherited a Grafana instance with 140 dashboards and exactly zero source of truth. Someone had clicked them into existence over three years, exported a few to JSON when they remembered, and let the rest drift. When a panel broke during an incident, nobody could tell whether the query was wrong or whether the dashboard had quietly mutated under someone’s mouse. That night I made a decision I’ve never regretted: every dashboard I own from now on lives in Git, gets generated from code, and never gets edited in the UI as the canonical copy. This post is the workflow I landed on, built on Grafonnet and jsonnet.

Why hand-edited JSON and UI clicking don’t scale

Grafana dashboards are JSON. You can edit that JSON by hand, and you can certainly build dashboards by clicking around the UI and hitting export. Both work for one dashboard. Neither works for fifty.

The JSON is enormous, deeply nested, and full of fields you don’t care about (gridPos, id, fieldConfig.defaults.thresholds.steps…). A two-panel change produces a 600-line diff. Reviewers can’t tell signal from noise, so they rubber-stamp it. Worse, there’s no reuse: if your standard “p99 latency” panel needs to change, you’re editing it in every dashboard that copied it.

UI-clicking has the same problem plus a sharper edge: the dashboard in Grafana and the dashboard in your repo silently diverge the moment someone tweaks a threshold live during an incident. Now your “source of truth” is a lie.

Dashboards-as-code fixes both. You write a small, readable program; jsonnet expands it into the verbose JSON Grafana wants; and the program is what you review. If this sounds adjacent to the case for recording rules that make queries fast, it is — both are about treating your observability config as real, versioned engineering artifacts.

Grafonnet basics: panels as functions, not JSON

Grafonnet is a jsonnet library that gives you typed builders for Grafana objects. Instead of remembering JSON field names, you compose functions. Here’s a minimal dashboard with one time series panel:

local g = import 'g.libsonnet';  // grafonnet entrypoint
local dashboard = g.dashboard;
local timeSeries = g.panel.timeSeries;
local prometheus = g.query.prometheus;

dashboard.new('API Overview')
+ dashboard.withUid('api-overview')
+ dashboard.withTags(['api', 'generated'])
+ dashboard.withRefresh('30s')
+ dashboard.withPanels([
  timeSeries.new('Request rate (req/s)')
  + timeSeries.queryOptions.withTargets([
    prometheus.new(
      '$datasource',
      'sum(rate(http_requests_total{job="$job"}[5m]))',
    )
    + prometheus.withLegendFormat('total'),
  ])
  + timeSeries.gridPos.withW(12) + timeSeries.gridPos.withH(8),
])

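That + is jsonnet’s object-composition operator. Each builder returns a partial object, and because Grafonnet’s builders declare their nested fields with +:, composing them merges nested objects instead of overwriting them. The whole thing is data, so you can loop over it, parameterize it, and import shared pieces.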
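To see why the merge behavior matters, here is a small sketch in plain jsonnet with no Grafonnet involved. The field names mirror a panel's gridPos but are only illustrative:

```jsonnet
// Plain jsonnet; evaluate with: jsonnet merge.jsonnet (file name is hypothetical)
local base = { title: 'Request rate', gridPos: { w: 12, h: 8 } };

{
  // A plain field replaces the nested object wholesale, so h is lost:
  // gridPos becomes { w: 24 }
  replaced: base + { gridPos: { w: 24 } },

  // A +: field merges into the nested object, so h survives:
  // gridPos becomes { h: 8, w: 24 }
  merged: base + { gridPos+: { w: 24 } },
}
```

The second form is the one that lets you chain builders without each call clobbering what the previous one set.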

That PromQL target is worth lingering on. The query sum(rate(http_requests_total{job="$job"}[5m])) uses a $job template variable, so one panel serves every service. Get the query right once, in code, reviewed once.

Factor out the repeated panels

The real payoff is reuse. Define your house-style panels once and import them everywhere:

// lib/panels.libsonnet
local g = import 'g.libsonnet';
local timeSeries = g.panel.timeSeries;
local prometheus = g.query.prometheus;

{
  errorRate(job)::
    timeSeries.new('Error rate (%)')
    + timeSeries.standardOptions.withUnit('percent')
    + timeSeries.queryOptions.withTargets([
      prometheus.new(
        '$datasource',
        |||
          100 * sum(rate(http_requests_total{job="%(job)s",code=~"5.."}[5m]))
          / sum(rate(http_requests_total{job="%(job)s"}[5m]))
        ||| % { job: job },
      ),
    ]),

  p99Latency(job)::
    timeSeries.new('p99 latency (s)')
    + timeSeries.standardOptions.withUnit('s')
    + timeSeries.queryOptions.withTargets([
      prometheus.new(
        '$datasource',
        'histogram_quantile(0.99, sum by (le) '
        + ('(rate(http_request_duration_seconds_bucket{job="%s"}[5m])))' % job),
      ),
    ]),
}

Now any dashboard is three lines plus a list of panels:

local panels = import 'lib/panels.libsonnet';
local g = import 'g.libsonnet';

g.dashboard.new('Checkout Service')
+ g.dashboard.withPanels([
  panels.errorRate('checkout'),
  panels.p99Latency('checkout'),
])

Change p99Latency once and every dashboard that uses it updates on the next build. That is the thing the UI can never give you.
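The shared panels above carry no gridPos, so the layout still has to come from somewhere. One option, sketched here on the assumption that your Grafonnet release ships the g.util.grid.makeGrid helper, is to let the library place the panels on a grid. The UID is a hypothetical value:

```jsonnet
local g = import 'g.libsonnet';
local panels = import 'lib/panels.libsonnet';

g.dashboard.new('Checkout Service')
+ g.dashboard.withUid('checkout-service')  // hypothetical UID; keep it stable
+ g.dashboard.withPanels(
  // Assumption: makeGrid assigns gridPos to each panel, two 12-wide panels per row.
  g.util.grid.makeGrid([
    panels.errorRate('checkout'),
    panels.p99Latency('checkout'),
  ], panelWidth=12, panelHeight=8)
)
```

Keeping layout out of the panel library means the same panel can sit at full width on one dashboard and half width on another.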

Pro Tip: Add templating with g.dashboard.variable.query.new('job', 'label_values(http_requests_total, job)') and a datasource variable. Parameterizing the datasource means the same generated JSON works across staging and prod without find-and-replace.
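A sketch of what that templating can look like, assuming a recent Grafonnet release where the variable builders live under g.dashboard.variable. The variable names match the $datasource and $job references used earlier; check the builder names against the version you have vendored:

```jsonnet
local g = import 'g.libsonnet';
local var = g.dashboard.variable;

// Datasource picker restricted to Prometheus datasources; referenced as $datasource.
local datasource = var.datasource.new('datasource', 'prometheus');

// Query variable listing every job label on http_requests_total; referenced as $job.
local job =
  var.query.new('job')
  + var.query.withDatasourceFromVariable(datasource)
  + var.query.queryTypes.withLabelValues('job', 'http_requests_total');

g.dashboard.new('API Overview')
+ g.dashboard.withUid('api-overview')
+ g.dashboard.withVariables([datasource, job])
```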

Generate JSON and provision it from Git

jsonnet turns the program into Grafana’s JSON. With the jsonnet-bundler managing the Grafonnet dependency, the build is one command:

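# -o writes a single JSON document; -m is for files that evaluate to a map of file names
jsonnet -J vendor -o dashboards/generated/checkout.json dashboards/checkout.jsonnet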
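One detail worth knowing about the -m flag: in multi-file mode jsonnet expects the top-level value to be an object whose keys are output file names and whose values are the file contents. A dashboard file on its own does not have that shape. If you want a single command that emits every dashboard, a small entrypoint can provide it. The file names below are hypothetical:

```jsonnet
// dashboards/main.jsonnet (hypothetical entrypoint)
// Build with: jsonnet -J vendor -m dashboards/generated dashboards/main.jsonnet
{
  // key = file written under dashboards/generated, value = the dashboard object
  'checkout.json': import 'checkout.jsonnet',
  'api-overview.json': import 'api-overview.jsonnet',
}
```

Adding a dashboard is then a one-line change to this map, which also shows up clearly in review.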

Then let Grafana load those files itself via file-based provisioning. Grafana watches a directory and imports anything it finds — no API calls, no manual import:

# /etc/grafana/provisioning/dashboards/dashboards.yaml
apiVersion: 1

providers:
  - name: 'generated-dashboards'
    orgId: 1
    folder: 'Generated'
    folderUid: generated
    type: file
    disableDeletion: true
    editable: false
    updateIntervalSeconds: 30
    allowUiUpdates: false
    options:
      path: /var/lib/grafana/dashboards/generated
      foldersFromFilesStructure: false  # only works when folder and folderUid are left empty

editable: false and allowUiUpdates: false are the load-bearing lines: they stop anyone from saving UI edits to these dashboards, so nobody can fork the source of truth with a stray click. Your CI pipeline syncs the generated JSON into path, Grafana picks it up on its next scan of that directory (at most 30 seconds later, per updateIntervalSeconds), done.

Review the diff, not the dashboard

Because the generated JSON is deterministic, you commit it (or build it in CI) and review changes, not whole files. A good pipeline runs three checks on every pull request:

# .github/workflows/dashboards.yml (excerpt)
jobs:
  dashboards:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Lint
        run: jsonnet-lint -J vendor dashboards/*.jsonnet
      - name: Build
        run: |
          for f in dashboards/*.jsonnet; do
            jsonnet -J vendor -o "dashboards/generated/$(basename "$f" .jsonnet).json" "$f"
          done
      - name: Fail if generated JSON is stale
        run: git diff --exit-code dashboards/generated

That last step is the whole game. If someone edits the .jsonnet but forgets to regenerate, CI fails. If they hand-edit generated JSON, CI fails. The reviewer reads a tight diff of the jsonnet source — “p99 threshold went from 0.5 to 0.3” — instead of scrolling 600 lines of merged objects.

Where AI fits — and where it doesn’t

Grafonnet’s API is wide, and remembering whether it’s withUnit or withFormat, or how the latency histogram quantile query should be shaped, is exactly the kind of lookup that slows you down. This is where I lean on an AI assistant — through Cursor or Claude — to draft the jsonnet. Describe the panel (“p99 and p50 latency, stacked error rate, templated by job”) and you get a believable first pass in seconds.

Treat that output the way you’d treat a fast, eager junior engineer’s: useful, fast, and absolutely not to be merged unread. The model will confidently invent a Grafonnet function that doesn’t exist, or write a PromQL histogram_quantile with the by (le) clause in the wrong place — a query that parses fine and returns subtly wrong numbers. The dashboards-as-code workflow is what makes that safe: the AI’s draft becomes a reviewable diff, the linter catches the bad function name, and a human reads the PromQL before it ships. The output has to be explainable before it’s deployable.
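To make the by (le) failure mode concrete, here is an illustrative pair of expressions held as jsonnet strings. Both parse; only the first computes a p99 across all series. The field names are hypothetical and the metric is the one used in the panel library earlier:

```jsonnet
{
  // Correct: sum the per-bucket rates by le first, then take the quantile
  // of the combined histogram.
  good: 'histogram_quantile(0.99, sum by (le) (rate(http_request_duration_seconds_bucket[5m])))',

  // Wrong: the quantile is computed per series first, which drops the le label,
  // so the outer sum by (le) just adds the individual p99 values together.
  bad: 'sum by (le) (histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m])))',
}
```

A linter will not flag the second expression, which is why a human still has to read the query.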

Pro Tip: Pair AI-drafted dashboards with AI-drafted alerts, but review both the same way. Our free Alert Rule Generator turns a plain-English SLO into deterministic Prometheus alert YAML you can drop into the same Git repo as your Grafonnet — alerts and dashboards reviewed together, shipped together.

If you want reusable prompts for generating and reviewing this kind of config, the prompt library and prompt packs have starting points tuned for observability work.

Conclusion

Dashboards-as-code isn’t about being fancy — it’s about making your dashboards survivable. Grafonnet gives you reuse and readability, jsonnet gives you deterministic JSON, provisioning makes Grafana load it without humans, and CI turns every change into a small, reviewable diff. AI accelerates the writing, but the diff is what you actually ship, and the diff is what you review. For more on building dashboards people actually use, see this companion post, and browse the rest of the Prometheus monitoring category for the alerting and query-performance pieces that round out the stack.
