Building and Importing NixOS AMIs on EC2

Posted on August 30, 2020 by Jack Kelly
Tags: aws, coding, nix

Update (2020-09-03): /u/zimbatm on lobste.rs suggested the nixos-generators project. Added link and brief discussion.

Update (2020-11-29): Added notes on the impact of instance limits on Packer builds.

The NixOS project publishes Amazon Machine Images (AMIs) that are a great base for reproducible servers. This post describes how NixOS and EC2 work together, first showing how to build upon the NixOS project’s public AMIs, and then digging all the way into the scripts maintainers use to build, import and distribute new AMIs on AWS.

Getting a NixOS AMI

There are a few good ways to get AMI IDs for NixOS project images:

Configuring NixOS inside the AMI

The NixOS AMIs can rebuild themselves from NixOS configuration in instance user data. To do this, the user data should look something like this:

### https://nixos.org/channels/nixos-unstable nixos
### https://example.com/path/to/another/channel channel-name

{ config, pkgs, ... }:
{
  # Normal NixOS config goes here
}

On each boot the system refreshes its configuration:

If you only want this to happen once, set systemd.services.amazon-init.enable = false; in the configuration you deliver via user data. The first boot will always refresh the configuration from user data (because amazon-init is enabled in the AMI), but the rebuilt system then turns the service off, so the refresh doesn’t happen on subsequent restarts.
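
As a minimal sketch of the run-once setup, the following shell snippet writes a user data file that switches amazon-init off after the first rebuild. The file name user-data.nix is an arbitrary choice, and the launch command in the trailing comment (including the placeholder AMI ID) is only an illustration, not something taken from the NixOS tooling.

```shell
#!/bin/sh
# Write a user data file whose configuration turns amazon-init off,
# so the rebuild from user data only happens on the first boot.
# The file name user-data.nix is an arbitrary choice for this sketch.
cat > user-data.nix <<'EOF'
### https://nixos.org/channels/nixos-unstable nixos

{ config, pkgs, ... }:
{
  # Normal NixOS config goes here

  # Applied by the first-boot rebuild; later boots skip the refresh.
  systemd.services.amazon-init.enable = false;
}
EOF

# Pass the file when launching (ami-XXXXXXXX is a placeholder):
#   aws ec2 run-instances --image-id ami-XXXXXXXX \
#     --instance-type t2.micro --user-data file://user-data.nix
```

Keeping the user data in a file rather than inline avoids quoting problems with the Nix braces and semicolons.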

How it Works

EC2 instance metadata and user data (if it exists) are downloaded from the Instance Metadata Service (IMDS) and applied to a NixOS AMI by the following mechanism:

Customising your NixOS AMIs

Evaluating a full NixOS configuration on each boot can take a lot of CPU and network resources, particularly if it needs to build uncached derivations. This can cause runaway autoscaling if you’re not careful: if autoscaling starts in response to CPU usage and the new instances spend a lot of CPU trying to nixos-rebuild, further autoscaling can happen before the new instances have finished coming online. On T2 instances, it can also burn through your launch credits for no real benefit.

A tool like Packer can help you build and distribute AMIs by customising the base NixOS AMI. The main steps to provision our image are very simple, because NixOS gives us declarative OS configuration:

There is one more very important step at the end: make sure the new image responds to its own instance metadata and user data when it boots, and not the metadata/user data from when Packer booted the NixOS AMI. As the final provisioning action, you must remove all the files created by the EC2 metadata fetcher, any SSH host keys, and most importantly root’s .ssh/authorized_keys file. If you do not do this, you will be locked out of your image.

Here’s a simple Packer configuration that provisions a NixOS AMI with git installed:

nixos-packer-example.json
{
  "builders": [
    {
      "type": "amazon-ebs",
      "ami_name": "nixos-packer-example {{timestamp}}",
      "instance_type": "t2.micro",
      "ssh_username": "root",
      "source_ami_filter": {
        "filters": {
          "architecture": "x86_64"
        },
        "most_recent": true,
        "owners": [
          "080433136561"
        ]
      }
    }
  ],
  "provisioners": [
    {
      "type": "file",
      "source": "./configuration.nix",
      "destination": "/tmp/"
    },
    {
      "type": "shell",
      "inline": [
        "mv /tmp/configuration.nix /etc/nixos/configuration.nix",
        "nixos-rebuild switch --upgrade",
        "nix-collect-garbage -d",
        "rm -rf /etc/ec2-metadata /etc/ssh/ssh_host_* /root/.ssh"
      ]
    }
  ]
}
configuration.nix
{ pkgs, ... }:

{
  imports = [ <nixpkgs/nixos/modules/virtualisation/amazon-image.nix> ];
  ec2.hvm = true;
  environment.systemPackages = with pkgs; [ git ];
}

Save the two files to the same directory, and run packer build nixos-packer-example.json from inside it. Remember to clean up any registered AMIs and EBS snapshots when you’re done playing around, otherwise Amazon will charge you to host them.
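
For the clean-up, something like the following sketch may help. It assumes the AWS CLI is installed and configured, and cleanup_ami is a hypothetical helper name invented for this example. It deregisters one AMI and deletes the EBS snapshots that backed it.

```shell
#!/bin/sh
# Deregister an AMI and delete the EBS snapshots that back it.
# Usage: cleanup_ami ami-0123456789abcdef0   (the ID is a placeholder)
cleanup_ami() {
  ami_id="$1"

  # Look up the backing snapshots first: the mapping is no longer
  # available once the image has been deregistered.
  snapshot_ids=$(aws ec2 describe-images \
    --image-ids "$ami_id" \
    --query 'Images[0].BlockDeviceMappings[].Ebs.SnapshotId' \
    --output text)

  aws ec2 deregister-image --image-id "$ami_id"

  # Intentionally unquoted: text output separates IDs with whitespace.
  for snapshot_id in $snapshot_ids; do
    aws ec2 delete-snapshot --snapshot-id "$snapshot_id"
  done
}
```

The order matters: a snapshot cannot be deleted while a registered AMI still refers to it, so the image goes first.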

Instance Resource Limits

nixos-rebuild can easily use all the instance’s disk space, especially when building against more recent NixOS channels than the one used to build the base NixOS AMI. You can ask for additional space by adding a launch_block_device_mappings stanza to the amazon-ebs builder:

"launch_block_device_mappings": [
  {
    "delete_on_termination": true,
    "device_name": "/dev/xvda",
    "volume_size": 10,
    "volume_type": "gp2"
  }
]

Some builds (e.g., anything that triggers a rebuild of NixOS documentation) use a lot of memory, and can exhaust the RAM of a t2.micro. If this happens, you’ll see nixos-rebuild (or one of its children) fail with exit code 137 and no useful error message. Exit code 137 is 128 + 9: the process was killed with SIGKILL, typically by the kernel’s out-of-memory killer. To fix this, you’ll have to use a larger instance, or create a swap file as a temporary provisioning step.
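
A temporary swap file could be set up along these lines. This is a sketch rather than a tested provisioning recipe: make_swap, remove_swap and the DRY_RUN switch are names invented for this example, and the path and size are placeholders. The commands need root, which is the user Packer logs in as here.

```shell
#!/bin/sh
# Create and enable a temporary swap file. Must run as root.
# Usage: make_swap /swapfile 2048   (path and size in MiB are examples)
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
  if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

make_swap() {
  swap_path="$1"
  size_mib="$2"
  # dd rather than a sparse file, so the blocks are really allocated.
  run dd if=/dev/zero of="$swap_path" bs=1M count="$size_mib"
  run chmod 600 "$swap_path"
  run mkswap "$swap_path"
  run swapon "$swap_path"
}

# Disable and delete the swap file again.
# Usage: remove_swap /swapfile
remove_swap() {
  run swapoff "$1"
  run rm -f "$1"
}
```

Call make_swap before nixos-rebuild and remove_swap after it, so the swap file does not end up in the finished image. The swap file takes space on the root volume, so it may need to be combined with a larger volume_size.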

Building NixOS AMIs from Scratch

Customising NixOS AMIs with a tool like Packer lets you prebuild almost-ready-to-go images, and delivering each instance’s configuration.nix via user data creates a very flexible configuration system with reasonable cold-start times. This is probably all you need unless you’re building images for multiple formats (e.g., ISO, EC2 AMI, OpenStack) or hacking on nixpkgs’ image-building support. But if you’re interested in the gory details, read on.

You can build a .vhd virtual HD image using the infrastructure in nixpkgs:

$ nix-build '<nixpkgs/nixos/release.nix>' \
    -A amazonImage.x86_64-linux \
    --arg configuration /path/to/configuration.nix

These builds boot a VM to finish the build, so you will want ample CPU, memory and storage. If building as root (which is the only user on a default NixOS AMI), you’ll probably want to set NIX_REMOTE=daemon so that the build takes place in /tmp.

The nixos-generators project provides a nice wrapper around the expressions in nixpkgs, and a single command to build NixOS images in selected formats. If you’re looking to build the same NixOS config into multiple formats, consider looking into it.

Either way, once you’ve built the .vhd file, you’ll need to get it into S3 so you can import it with Amazon’s VM Import/Export service. It’s probably easiest to do the build on an EC2 instance, to avoid pushing gigabytes of data across the public internet. I used a t3a.medium spot instance when writing this post, and that was fast enough.

(It should also be possible to specify -A amazonImage.aarch64-linux to have nix build an AArch64 image, but I couldn’t make it work. Any tips?)

The configuration argument is not strictly necessary, but if you omit it, you will get a “blank” image like the ones published by the NixOS project. The only real difference is that it will be built against your version of nixpkgs.

Once the build finishes, the symlink result will point to a directory in the nix store that contains the .vhd image, along with a nix-support directory containing image metadata.
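
To locate the image without relying on its exact file name, a small helper like this one may be enough. find_vhd is a hypothetical name for this sketch, and it assumes the build produced a single .vhd file.

```shell
#!/bin/sh
# Print the path of the first .vhd image under a nix-build result symlink.
# Usage: find_vhd result
find_vhd() {
  # -L makes find follow the result symlink into the nix store.
  find -L "$1" -type f -name '*.vhd' | head -n 1
}
```

For example, vhd=$(find_vhd result) followed by ls -lh "$vhd" shows how much data you are about to upload.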

Importing the image into an AMI

Once you have built your image, you need to import it into EC2 as an AMI. The tool to do this is VM Import/Export.

VM Import/Export needs an S3 bucket to store the images before triggering the import, and a role specifically called vmimport for the service to use. If you’re just mucking around, you might want to try the following CloudFormation template to create an S3 bucket and the necessary role:

template.yaml for VM Import
AWSTemplateFormatVersion: 2010-09-09
Description: Bucket and roles for VM import
Resources:
  VMImportBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketEncryption:
        ServerSideEncryptionConfiguration:
          - ServerSideEncryptionByDefault:
              SSEAlgorithm: AES256
      PublicAccessBlockConfiguration:
        BlockPublicAcls: true
        BlockPublicPolicy: true
        IgnorePublicAcls: true
        RestrictPublicBuckets: true

  VMImportExportServiceRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: vmimport
      Description: Service role for VM import/export
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service: vmie.amazonaws.com
            Action: sts:AssumeRole
            Condition:
              StringEquals:
                sts:Externalid: vmimport
      Policies:
        - PolicyName: vmimport
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - s3:GetBucketLocation
                  - s3:GetObject
                  - s3:ListBucket
                Resource:
                  - !GetAtt VMImportBucket.Arn
                  - !Sub "${VMImportBucket.Arn}/*"
              - Effect: Allow
                Action:
                  - ec2:ModifySnapshotAttribute
                  - ec2:CopySnapshot
                  - ec2:RegisterImage
                  - ec2:Describe*
                Resource: "*"
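
One way to deploy the template is through the AWS CLI, sketched below. The function names and the stack name in the usage comment are made up for this example, and the template is assumed to be saved as template.yaml in the current directory.

```shell
#!/bin/sh
# Deploy template.yaml as a CloudFormation stack.
# Usage: deploy_vmimport_stack vmimport-stack   (stack name is an example)
deploy_vmimport_stack() {
  # CAPABILITY_NAMED_IAM is required because the template sets an
  # explicit RoleName on the IAM role.
  aws cloudformation deploy \
    --template-file template.yaml \
    --stack-name "$1" \
    --capabilities CAPABILITY_NAMED_IAM
}

# Print the generated name of the bucket the stack created.
# Usage: vmimport_bucket_name vmimport-stack
vmimport_bucket_name() {
  aws cloudformation describe-stack-resource \
    --stack-name "$1" \
    --logical-resource-id VMImportBucket \
    --query 'StackResourceDetail.PhysicalResourceId' \
    --output text
}
```

Because the template does not set a BucketName, CloudFormation generates one, and the second function is a way to look it up afterwards.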

There are a few things to note before you use the template.yaml in your own environment:

Once you have the role and bucket set up, you can import your NixOS image as an EBS snapshot and then register it as an AMI. The NixOS maintainers use a script from nixpkgs at nixos/maintainers/scripts/ec2/create-amis.sh to release the new AMIs, but it does more than we need for our experiments. It:

For tinkering, it’s probably enough to comment out the calls to make_image_public, and also comment out the loop in upload_all that iterates across the regions and copies the AMI.
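
If you would rather not edit the maintainers’ script, the core steps can be approximated by hand with the AWS CLI. The sketch below is not the nixpkgs script: the function names are invented, the bucket, key and AMI names are whatever you pass in, and the register-image flags (HVM, ENA support, /dev/xvda as the root device) are assumptions that suit an x86_64 image.

```shell
#!/bin/sh
# Sketch: upload a .vhd, import it as an EBS snapshot, register an AMI.

# upload_image <vhd-path> <bucket> <key>
upload_image() {
  aws s3 cp "$1" "s3://$2/$3"
}

# start_import <bucket> <key>  -> prints the import task ID
start_import() {
  aws ec2 import-snapshot \
    --description "NixOS image $2" \
    --disk-container "Format=VHD,UserBucket={S3Bucket=$1,S3Key=$2}" \
    --query ImportTaskId --output text
}

# wait_for_snapshot <task-id>  -> prints the snapshot ID once completed
wait_for_snapshot() {
  while :; do
    status=$(aws ec2 describe-import-snapshot-tasks \
      --import-task-ids "$1" \
      --query 'ImportSnapshotTasks[0].SnapshotTaskDetail.Status' \
      --output text)
    case "$status" in
      completed) break ;;
      active) sleep 30 ;;
      *) echo "import failed: $status" >&2; return 1 ;;
    esac
  done
  aws ec2 describe-import-snapshot-tasks \
    --import-task-ids "$1" \
    --query 'ImportSnapshotTasks[0].SnapshotTaskDetail.SnapshotId' \
    --output text
}

# register_ami <name> <snapshot-id>  -> prints the new AMI ID
register_ami() {
  aws ec2 register-image \
    --name "$1" \
    --architecture x86_64 \
    --virtualization-type hvm \
    --ena-support \
    --root-device-name /dev/xvda \
    --block-device-mappings \
      "DeviceName=/dev/xvda,Ebs={SnapshotId=$2,DeleteOnTermination=true}" \
    --query ImageId --output text
}
```

A run would chain the functions: upload_image, then task=$(start_import ...), snap=$(wait_for_snapshot "$task"), and finally register_ami with a name of your choosing. import-snapshot uses the role named vmimport by default, which is why the role needs exactly that name.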

Conclusion

I think customising the NixOS project’s images with a tool like Packer and then configuring instances with custom configuration.nix user data is a very solid way to get started with NixOS on EC2. If you need to ship the same NixOS config in multiple image formats, or you have extremely unusual configuration needs, nixpkgs provides great tooling for fully-declarative image specifications. Odds are you won’t need this level of control, but it’s still interesting to see how the sausage is made.

Copyright © 2024 Jack Kelly