Creating a Multi-System Modular NixOS Configuration with Flakes

Kate Brock
~11 minutes
The reasoning behind and development of my multi-system, modular NixOS configuration, which serves multiple LXCs and VMs on Proxmox, as well as my personal laptop.

Over the life of my home server I have gone through many iterations of configuration management. I started with Docker Compose stacks on a single Debian host, with every other bit of maintenance done via SSH. Eventually I wised up and shifted over to Ansible, writing complex playbooks to configure the machine and deploy my Docker stacks. But hey, LXCs are cool - maybe I can put stuff in LXCs, which works well with Ansible and its multi-host support. And so on and so on it spiraled, until I eventually landed on Proxmox with multiple Debian LXCs and VMs. But I could never quite shake the feeling that Ansible just wasn’t the right tool for me. Time and time again I found myself fighting configuration drift and writing hacky scripts when Ansible modules didn’t do what I wanted (and I just didn’t have the motivation or time to write modules myself). Every time I used Ansible I knew there had to be a better solution - I just didn’t know what it was!

Eventually, I stumbled across NixOS and the absolute nerd in me saw the possibilities immediately. The idea of defining my entire server configuration in a single repo, with minimal hacky scripts required, just spoke to me. And the more I used it, the more I wanted to convert anything and everything over to NixOS. But how do I manage multiple servers and physical devices with NixOS? Well, that’s what this post is all about.

Now, in this article I’m intentionally not going to go into too much detail on some more advanced topics; I won’t be talking about home-manager or secrets management, but it is all there in my GitHub repo for you to look into if you’re interested, and I am itching to write up more detail about secrets management with Sops! What I will be talking about is how I set up and structured my NixOS configs, and how that led to a simple, modular configuration with custom defaults deployed out to multiple LXCs, VMs, and laptops.

Admittedly, there is a lot of code and some documentation out there that shows how to do this, but I found a real lack of explanation as to why someone would set things up in these ways, which is why I wanted to compile my findings.

Note: This article is aimed at people with a somewhat entry-level understanding of Nix and Nix Flakes; you should know roughly how Nix works and what Nix Flakes are.

Goals

My goal when setting up my NixOS configuration was simple: create a single unified repo that contains the configuration for each of my LXCs and VMs, as well as supporting any laptops or desktops.

I could break this down into a handful of discrete goals:

  1. Multiple machine support

    I want to be able to configure each of my machines from one repo. With (ideally) one command.

  2. Shared base configuration

    Each of the machines should share the absolute essentials - user configuration, utilities, shell setup.

  3. Maintenance and updates should be hassle-free

After doing some research into NixOS, I quickly discovered that the best - and perhaps only feasible - approach was to use flakes.

So, after a bit of time spent learning Nix and looking at how other people have used flakes, this is what I ended up with.

Structure

Note: The content in this post refers to my NixOS configuration as it was at the time of writing. This may change, so if you’re referring to my repo for guidance, make sure to use it as of this commit.

The general structure of my flake is actually pretty simple. We have a single flake.nix that defines each of our systems. Each system imports the common modules that I want every system to use (ZSH configs, SSH keys, etc.), then any host-specific modules and configuration, and finally any roles or modules that the system is going to use.

Okay so immediately some of you are going to have a question - what the hell are roles? Do you mean modules?

And yes, roles aren’t really a thing; it’s just a term I use to describe a larger set of modules that serve one specific purpose - the kind of thing where you look at a machine and go “ah yes, that is the proxy machine”: it uses the proxy role. Obviously a specific system can have multiple roles, but you get the gist.

In this article I’ll use the following definitions:

Module: A small, discrete, standalone piece of configuration that does one single thing (think ZSH configs)

Role: A much larger set of one or more modules which, when combined, fulfil the entire purpose of that role. Think an authentication server with Authelia, an LDAP database, backups, and maybe a Redis cache. Or even a media-downloading stack with Sonarr, Radarr, and qBittorrent.
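
To make the Module definition a little more concrete, here is a rough sketch of what a module could look like. The file name and the exact options are illustrative rather than lifted from my repo; the point is just that a module is one small, self-contained piece of configuration.

# nixos/modules/zsh/default.nix (illustrative sketch)
{ pkgs, ... }: {
  # One small, standalone thing: ZSH as the default shell
  programs.zsh = {
    enable = true;
    autosuggestions.enable = true;
    syntaxHighlighting.enable = true;
  };

  users.defaultUserShell = pkgs.zsh;
}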

File Structure

The file structure of my repo is relatively simple. An abridged version of the file structure can be seen below.

├── home-manager
│   ├── common
│   └── roles
│       └── ...
├── hosts
│   ├── physical
│   │   ├── laptop
│   │   │   ├── default.nix
│   │   │   └── hardware-configuration.nix
│   └── server
│       ├── server-1
│       │   └── default.nix
│       └── ...
├── nixos
│   ├── common
│   │   └── default.nix
│   ├── modules
│   │   ├── ...
│   │   └── ...
│   └── roles
│       ├── physical
│       │   └── ...
│       └── server
│           ├── lxc
│           │   └── default.nix
│           ├── proxy
│           │   └── default.nix
│           ├── vm
│           │   └── default.nix
│           └── default.nix
├── users
│   └── ...
├── base.nix
├── flake.lock
├── flake.nix
└── hosts.nix

While this might look quite intimidating due to the sheer number of files, it is actually incredibly simple. Let’s break it down a little bit.

base.nix defines the bare minimum configuration required to set up a system. It must be entirely standalone, requiring no flake inputs or external modules. I use this in combination with a golden.nix module to provision my LXCs and VMs from a golden image, but that is out of scope for this post.
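
To give a feel for what “bare minimum” means here, a standalone base.nix along these lines might contain little more than a user, SSH, and a couple of utilities. This is a hedged sketch rather than my actual file; every option shown is just an example of the kind of thing that belongs at this layer.

# base.nix (illustrative sketch, not the real file)
{ pkgs, ... }: {
  # Nothing in here depends on flake inputs or external modules
  services.openssh.enable = true;

  users.users.kate = {
    isNormalUser = true;
    extraGroups = [ "wheel" ];
  };

  environment.systemPackages = with pkgs; [ vim git ];

  system.stateVersion = "24.05";
}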

nixos/common/default.nix is essentially the same as base.nix; however, it can rely on flake inputs and external modules. The reasoning for splitting this out is to allow for a generic configuration that relies on external inputs. This is particularly helpful when I’m setting up sops-nix, as I can define shared secrets and keys easily. Splitting this up also allows me to define certain packages and configurations that I simply don’t need in a minimal configuration. If you don’t care about having these split up, just get rid of base.nix and use nixos/common/default.nix instead.
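
As a rough illustration of that split, nixos/common/default.nix is where the flake-dependent pieces can live. Again, this is a sketch rather than my real file, and the sops key path is purely hypothetical.

# nixos/common/default.nix (illustrative sketch)
{ pkgs, inputs, ... }: {
  # Unlike base.nix, this file can reference flake inputs,
  # e.g. pinning the system flake registry to this flake's nixpkgs
  nix.registry.nixpkgs.flake = inputs.nixpkgs;

  # Shared sops-nix defaults (the sops module itself is wired in from hosts.nix)
  sops.age.keyFile = "/var/lib/sops-nix/key.txt"; # hypothetical key location

  # Conveniences that don't belong in the minimal base image
  environment.systemPackages = with pkgs; [ htop tmux ];
}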

flake.nix is the central point of our system configuration. It takes a defined system configuration and imports the relevant modules and roles for that system. No matter what, every system imports the base.nix file, as well as the nixos/common/default.nix module to define some simple basics (users, common configs, etc.). It also imports the relevant host configuration for the system. For example, my laptop imports the hosts/physical/laptop/default.nix module, which defines any host-specific configuration (fingerprint sensor setup, hard drive layout, etc.).
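
As an example of what that host-specific file covers, a laptop’s hosts/physical/laptop/default.nix might look roughly like the sketch below; the fingerprint and bootloader options are placeholders for whatever that particular machine actually needs.

# hosts/physical/laptop/default.nix (illustrative sketch)
{ ... }: {
  imports = [ ./hardware-configuration.nix ];

  # Host-specific hardware, e.g. the fingerprint reader
  services.fprintd.enable = true;

  # Bootloader for a physical machine
  boot.loader.systemd-boot.enable = true;
  boot.loader.efi.canTouchEfiVariables = true;
}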

hosts.nix is basically a library containing the function that flake.nix uses to work out what to import for each system. We’ll discuss this later, but won’t go into too much detail - if you’re interested, take a look at the repo!

From there, it’s pretty straightforward. If a NixOS host uses any custom modules, they are imported from nixos/modules/[module]; if it uses a role, that role is imported from nixos/roles/[type]/[role]. The same goes for home-manager roles, except the directory root is home-manager/. Again, we won’t be going into any more detail on home-manager.
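
To give a feel for what a role directory actually contains, here is a hedged sketch of a proxy role’s default.nix. The module names it imports are hypothetical; a role is mostly just an aggregation of modules plus a bit of glue.

# nixos/roles/server/proxy/default.nix (illustrative sketch)
{ ... }: {
  imports = [
    ../../../modules/caddy      # hypothetical reverse-proxy module
    ../../../modules/fail2ban   # hypothetical hardening module
  ];

  # Role-level glue that doesn't warrant its own module
  networking.firewall.allowedTCPPorts = [ 80 443 ];
}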

flake.nix

Now let’s look at how we tie all this together. Here is an abridged version of the flake.nix file.

{
  inputs = {
    nixpkgs.url = "github:nixos/nixpkgs/nixos-unstable";
    sops-nix.url = "github:Mic92/sops-nix";
    home-manager = {
      url = "github:nix-community/home-manager";
      inputs.nixpkgs.follows = "nixpkgs";
    };
  };

  outputs = {
    self,
    nixpkgs,
    home-manager,
    ...
  } @ inputs: let
    inherit (self) outputs;

    commonInherits = {
      inherit (nixpkgs) lib;
      inherit inputs outputs nixpkgs home-manager;
    };

    # Set the primary/default user. Can be overwritten on a system level
    user = "kate";

    systems = {
      aurora = {
        systemType = "physical";
        roles = [
          /physical/desktop/gnome
        ];
        hmRoles = [
          /desktop/gnome
          /sops-management
        ];
        extraImports = [
          inputs.nixos-hardware.nixosModules.lenovo-thinkpad-e14-amd
        ];
      };

      # Server
      auth-01 = {
        systemType = "server";
        serverType = "lxc";
        roles = [
          /server/auth
        ];
      };

      prox-01 = {
        systemType = "server";
        serverType = "vm";
        roles = [
          /server/proxy
        ];
      };
    };

    mkSystem = host: system:
      import ./hosts.nix (commonInherits
        // {
          hostName = "${host}";
          user = system.user or user;
          serverType = system.serverType or null;
        }
        // system);
  in {
    nixosConfigurations = nixpkgs.lib.mapAttrs mkSystem systems;
  };
}

At first glance, this looks like a pretty standard flake; we define some inputs and some outputs. However, looking closer, we see that each of our systems is defined in the systems attrset, rather than being written out by hand under a top-level nixosConfigurations output.

    mkSystem = host: system:
      import ./hosts.nix (commonInherits
        // {
          hostName = "${host}";
          user = system.user or user;
          serverType = system.serverType or null;
        }
        // system);

To simplify this to the point of being almost incorrect: mapAttrs calls this function for each of the systems in the systems attribute, setting common configuration options - hostname, user - and passing along any shared modules which all systems should have (commonInherits). Essentially, we’ve replaced repeating the nixosSystem attributes for every host with a generator function in hosts.nix. Really, all this is is a way of avoiding a pitfall I kept falling into: lots of repetition, where updating shared configuration became a hassle.
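
In other words, with the three hosts defined above, mapAttrs effectively expands to something like this (conceptual, not literal code from the repo):

nixosConfigurations = {
  aurora = mkSystem "aurora" systems.aurora;
  auth-01 = mkSystem "auth-01" systems.auth-01;
  prox-01 = mkSystem "prox-01" systems.prox-01;
};

Each of those becomes a flake output that can be built and activated with a single command, e.g. nixos-rebuild switch --flake .#auth-01 when run on that host.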

One key point to note here: I am only importing the modules a specific system actually uses. This goes slightly against the usual Nix approach of “import all modules; enable the ones you’re using”, but personally I found that a bit tiresome, requiring either extensive documentation or digging through the code to work out what you need to enable. At least this way I can tell at a glance what each node is doing by what modules it imports!
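
For contrast, the conventional pattern I’m referring to usually looks something like the generic sketch below (not code from my repo): every module is always imported, and each one exposes an enable flag through the options system.

# A conventional "always imported, opt in" module (generic sketch)
{ lib, config, ... }: {
  options.my.proxy.enable = lib.mkEnableOption "the reverse proxy stack";

  config = lib.mkIf config.my.proxy.enable {
    networking.firewall.allowedTCPPorts = [ 80 443 ];
    # ...the rest of the proxy configuration...
  };
}

Hosts would then set my.proxy.enable = true; instead of importing the role directly - a perfectly good approach, just not the one I landed on.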

hosts.nix

Let’s have a quick look at the hosts.nix file. I don’t want to go into too much detail since it can be dense, but essentially what it does is take the inputs we pass to it and create the nixosSystem based on them.

{
  inputs,
  nixpkgs,
  home-manager,
  hostName,
  user,
  systemType ? "physical",
  serverType ? null,
  roles ? [],
  hmRoles ? [],
  extraImports ? [],
  ...
}:
# Inspired by https://github.com/Baitinq/nixos-config/blob/31f76adafbf897df10fe574b9a675f94e4f56a93/hosts/default.nix
let
  commonNixOSModules = hostName: systemType: [
    # Set common config options
    {
      networking.hostName = hostName;
      nix.settings.experimental-features = ["nix-command" "flakes"];
    }

    # Include our host specific config
    ./hosts/${systemType}/${hostName}

    # Absolute minimum config required
    ./base.nix

    # Include our shared configuration
    ./nixos/common

    # Add in sops
    inputs.sops-nix.nixosModules.sops
  ];

  mkNixRoles = roles: (map (n: ./nixos/roles/${n}) roles);
  mkHMRoles = roles: (map (n: ./home-manager/roles/${n}) roles);

  mkHost = hostName: user: systemType: serverType: roles: hmRoles: extraImports:
    nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";

      modules =
        # Shared modules
        commonNixOSModules hostName systemType
        # Add all our specified roles
        ++ mkNixRoles roles
        # Add any host-specific extra imports (e.g. nixos-hardware modules)
        ++ extraImports
        # Add all Home-Manager configurations + specified HM roles
        ++ [
          home-manager.nixosModules.home-manager
          {
            home-manager.extraSpecialArgs = {
              inherit inputs;
              inherit (inputs) nix-colors;
              inherit user;
            };

            home-manager.useUserPackages = true;
            home-manager.users.${user}.imports =
              [
                ./home.nix
                ./home-manager/common
              ]
              # Add specified home-manager roles
              ++ mkHMRoles hmRoles;
          }
        ];

      specialArgs = {
        inherit inputs;
        inherit user;
      };
    };
in
  mkHost hostName user systemType serverType roles hmRoles extraImports

There’s a lot of Nix-iness in this, so it may be best to do some research into how it all fits together if it’s unclear!

By using this generator function, we get systems with an incredibly consistent base, clear definitions of system roles, and an extraordinarily modular setup. Setting up a brand new configuration for a system becomes incredibly simple; all we need to do is define any host-specific configuration, and then select which modules and roles that system should use!
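
Concretely, bringing up a hypothetical new media LXC would mostly mean adding an entry like this to the systems attrset (plus a hosts/server/media-01/default.nix for anything host-specific):

# In flake.nix - a hypothetical new host
media-01 = {
  systemType = "server";
  serverType = "lxc";
  roles = [
    /server/media
  ];
};

From there it’s just a matter of pointing nixos-rebuild at the new .#media-01 output.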

Conclusion

By putting all of these parts together, I’ve ended up with exactly what I was looking for in a NixOS configuration:

  • It supports multiple machines, and adding additional systems is incredibly easy
  • Each machine shares basic configuration to allow for common defaults no matter which machine I’m using
  • Adding and removing roles and modules to a system is trivial
  • And by leveraging NixOS Flakes, updates are pretty much hassle-free (given the ease of rollbacks)

I’ve been using this setup for several months and I don’t see how I could ever go back to Ansible (for this purpose). Having everything in one configuration, where I can see every change and iteration, with minimal side effects from rolling back, just works perfectly for me.

Originally I was only intending to convert my server configurations over to NixOS, but I became obsessed with migrating everything to it - I’m currently typing on my NixOS laptop and I’m eyeing off converting my gaming PC too… Fully declarative systems, easy rollbacks, and proper secrets management have completely won me over.

Hopefully this article provides someone else with the basic framework and structure for setting up a multi-host, declarative NixOS flake.

Future Plans

With this modular approach and my current obsession, I have a lot of future plans for NixOS. I’m looking to implement easy deployments using colmena, as well as setting up automated CI/CD pipelines to run full-blown integration tests in Nix VMs.

Keep an eye out for the inevitable blog posts outlining those!

About the Author

Kate is a Software Engineer based on Kaurna land in South Australia. She's passionate about reproducible environments, and is often tinkering with her home server running NixOS.