DevOps AI ToolKit
AI for Infrastructure as Code · By James Joyner IV · 9 min read

NixOS for Servers: Truly Reproducible Infrastructure

Most IaC describes desired state and hopes the package manager cooperates. NixOS makes the entire OS a single declarative artifact you can roll back instantly.

  • #iac
  • #nixos
  • #nix
  • #reproducibility
  • #immutable
  • #declarative

Every other tool in the IaC world describes part of a server and trusts the rest to behave. Ansible configures packages but relies on apt’s view of the world. Terraform provisions the VM but hands off to a provisioner. There’s always a gap where “desired state” meets “whatever the package manager actually did,” and that gap is where drift and “works on the old box but not the new one” live.

NixOS closes the gap. The entire operating system (kernel, packages, services, users, config files) is described by a single declarative expression, and the build is reproducible: the same expression evaluated against the same pinned inputs produces the same system, in most cases bit for bit. After running NixOS servers for a while, here’s why it’s worth the steep learning curve and how to actually use it.

The whole machine is one config file

A NixOS system is defined by configuration.nix. This isn’t a package list layered on top of a base image — it is the system definition.

{ config, pkgs, ... }:
{
  imports = [ ./hardware-configuration.nix ];

  boot.loader.grub.device = "/dev/sda";
  networking.hostName = "web-01";

  # Services are first-class, declarative, and typed
  services.nginx = {
    enable = true;
    virtualHosts."example.com" = {
      forceSSL = true;
      enableACME = true;
      root = "/var/www/example";
    };
  };

  # enableACME fails evaluation unless the CA terms are accepted and a
  # contact address is set (placeholder address; substitute your own)
  security.acme = {
    acceptTerms = true;
    defaults.email = "admin@example.com";
  };

  # The NixOS firewall is on by default; open HTTP (ACME challenge) and HTTPS
  networking.firewall.allowedTCPPorts = [ 80 443 ];

  services.openssh = {
    enable = true;
    settings.PasswordAuthentication = false;
  };

  environment.systemPackages = with pkgs; [ git htop jq ];

  users.users.deploy = {
    isNormalUser = true;
    openssh.authorizedKeys.keys = [ "ssh-ed25519 AAAA..." ];
  };

  system.stateVersion = "24.05";
}

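There is no separate “install nginx” then “configure nginx” then “enable the service.” It’s one declaration. nixos-rebuild switch makes the running system match it, and packages, services, and config files that are not in the config are not part of the system. (Mutable data such as /var and /home, and anything installed imperatively with nix-env, sits outside that guarantee.) That property is the killer feature: there’s no accumulated cruft, because the config is the complete truth about the system itself.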

Atomic upgrades and instant rollback

This is the part that converts skeptics. Every nixos-rebuild switch builds a new system generation and adds it as a boot entry. Upgrades are atomic at the system level: the new generation is built in full before anything is switched, so a failed build leaves the previous generation untouched. And rollback is genuinely instant:

# Apply a new config
sudo nixos-rebuild switch

# It broke. Roll back the entire OS to the previous generation:
sudo nixos-rebuild switch --rollback

# List generations; any that has not been garbage-collected can also be picked at the boot menu
nixos-rebuild list-generations

You’re not “undoing” changes the way you’d try to with Ansible. The previous generation still physically exists on disk; you’re just pointing the system at it. A bad kernel upgrade, a broken service config, a dependency that went sideways — one command and you’re back, with zero reimaging.
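
Rollback only reaches generations that are still in the Nix store, so retention is worth setting deliberately. A minimal sketch, assuming GRUB as in the config above; the limit of 10 entries and the 30-day window are illustrative values, not recommendations from this guide:

```nix
{
  # Keep the boot menu manageable: only the newest 10 generations get an entry
  boot.loader.grub.configurationLimit = 10;

  # Weekly, delete generations older than 30 days and free their store paths
  nix.gc = {
    automatic = true;
    dates = "weekly";
    options = "--delete-older-than 30d";
  };
}
```

Anything older than the window is no longer a rollback target, so size it to how long a bad change might go unnoticed. Rollback also leaves application data alone: a database migration that ran under the new generation is not reverted.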

Flakes: pinning the world

The reproducibility guarantee only holds if your inputs are pinned, and that’s what flakes do. A flake.nix pins the exact nixpkgs revision (and any other inputs) via a lockfile, so a build today and a build in two years produce the same result.

{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-24.05";

  outputs = { self, nixpkgs }: {
    nixosConfigurations.web-01 = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules = [ ./configuration.nix ];
    };
  };
}

flake.lock records the exact commit hash of nixpkgs. Commit it. Now “reproducible” isn’t aspirational: the locked inputs plus the config deterministically define the entire OS. That is a stronger guarantee than tools like Ansible or Terraform offer on their own.
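
In upstream Nix, flakes are still gated behind an experimental feature flag, so the host (and any machine that runs the build) typically needs them switched on first. A minimal sketch of the setting, meant to be merged into configuration.nix:

```nix
{
  # Enable the flake-aware CLI and flake evaluation
  nix.settings.experimental-features = [ "nix-command" "flakes" ];
}
```

With that in place, nix flake update refreshes flake.lock when you decide to move the pin, and the resulting lockfile diff can be reviewed like any other change.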

Deploying to fleets

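You don’t SSH into each box and rebuild. Tools like deploy-rs, colmena, or plain nixos-rebuild --target-host build a configuration (locally or on the target), copy it to the remote machines, and switch them to the new generation. deploy-rs can also roll back automatically if the host stops responding after activation; with plain nixos-rebuild, rollback is a step you run yourself.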

# Build locally, push the closure, switch the remote host
nixos-rebuild switch \
  --flake .#web-01 \
  --target-host deploy@web-01.example.com \
  --use-remote-sudo
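
For a push like this to work, the deploy user on the target has to be allowed to receive store paths and to run the activation under sudo. One way to arrange that, as a sketch: the wheel membership, the trusted-users entry, and passwordless sudo below are assumptions about your setup, not part of the config shown earlier.

```nix
{
  # Let the deploy user use sudo for the activation step
  users.users.deploy.extraGroups = [ "wheel" ];

  # Allow deploy to import closures copied from the build machine
  nix.settings.trusted-users = [ "deploy" ];

  # Assumption: non-interactive deploys; drop this if you prefer password prompts
  security.sudo.wheelNeedsPassword = false;
}
```

Trusted users are effectively root-equivalent as far as the Nix daemon is concerned, so treat the deploy key accordingly.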

Because the closure is built once and then copied, what you tested in CI is the same set of store paths that lands on the server. No “the build machine had a different library” surprises.
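
For more than one machine, the flake can describe the whole fleet. The sketch below is one possible layout, not a prescribed one: mkHost, common.nix, the hosts/ directory, and the second host name web-02 are all hypothetical names introduced for illustration.

```nix
{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-24.05";

  outputs = { self, nixpkgs }:
    let
      # Hypothetical helper: build one NixOS system per host name
      mkHost = name: nixpkgs.lib.nixosSystem {
        system = "x86_64-linux";
        modules = [
          ./common.nix                  # assumed module shared by every host
          (./hosts + "/${name}.nix")    # assumed per-host module, e.g. hosts/web-01.nix
        ];
      };
    in {
      # Produces nixosConfigurations.web-01 and nixosConfigurations.web-02
      nixosConfigurations = nixpkgs.lib.genAttrs [ "web-01" "web-02" ] mkHost;
    };
}
```

Each host is then addressed the same way as before, for example --flake .#web-02, and every host shares the single nixpkgs pin in flake.lock.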

Where AI fits

The honest downside of NixOS is the Nix language: it’s functional, lazily evaluated, and unfamiliar, and the error messages are notoriously cryptic. This is exactly where an assistant earns its place: translating “I want nginx with ACME and a Postgres service” into valid module config, and decoding a wall-of-text Nix evaluation error into the actual problem. I keep a few NixOS prompts for generating service blocks and debugging flake errors, then always run nixos-rebuild build (which builds without switching) to verify before deploying.
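
As an illustration of the kind of block such a prompt might produce for the Postgres half of that request, here is a minimal sketch. The database and role name app and the choice of PostgreSQL 16 are assumptions made for the example.

```nix
{ pkgs, ... }:
{
  services.postgresql = {
    enable = true;
    package = pkgs.postgresql_16;   # assumed major version; pin it explicitly

    # Create the database and a matching role on first start
    ensureDatabases = [ "app" ];
    ensureUsers = [
      {
        name = "app";               # must match the database name for ensureDBOwnership
        ensureDBOwnership = true;
      }
    ];
  };
}
```

Saved as its own file and added to imports, a block like this is checked by the same nixos-rebuild build step before it goes anywhere near a server.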

Where it fits and where it doesn’t

NixOS shines for servers you control end-to-end: reproducible fleets, build agents, infrastructure where rollback safety is worth real money. It’s a poor fit when you’re forced onto vendor images, when your team can’t invest in learning Nix, or when most of your stack is managed cloud services rather than servers.

But for the machines it fits, nothing else comes close to the guarantee: the entire OS as one versioned, pinned, reproducible artifact, with atomic upgrades and instant whole-system rollback. Once you’ve recovered from a bad deploy with a single --rollback, the rest of the IaC world feels like it’s held together with hope. For the broader landscape, see our Infrastructure as Code category.

Generated Nix configurations are assistive, not authoritative. Always run nixos-rebuild build to validate, and test on one host before pushing to a fleet.
