Docker and other containerization technologies are making the rounds in the Linux community, but a lowly hero now lurks beneath nearly every major Linux distro: enter systemd-nspawn containers.

systemd has arrived to mixed reviews -- you either love it or you hate it -- but one thing is certain: it gets the job done. Whatever your feelings towards it, systemd is likely here to stay for a while, so you might as well get as much out of its features as you can.

systemd-nspawn containers are more akin to FreeBSD Jails than to Docker containers. They're basically just a fancy chroot with some handy built-in integrations with systemd. You can start, stop, enable, and disable the containers as if they were regular services.

Keep in mind, however, that by its own admission, systemd-nspawn is an experimental feature that hasn't been thoroughly tested or audited. There are no guarantees of security or stability; it's probably best to keep these containers out of production for the time being. That said, here at BlackieOps we've been using systemd-nspawn containers for Jenkins, Stash, and Jira for a while now without any issues.

Creating the container

We will be working on a fresh install of CentOS 7, but this process is possible on any system using systemd. The only difference will be the first package installation step. Obviously, on Debian you will not be using yum, and on Fedora your repo names and release versions will be different (and you'll be using dnf)…

The first step is to "install" a new root filesystem into a directory.

# yum -y --releasever=7 --nogpg --installroot=/var/lib/machines/cool-container \
  --disablerepo='*' --enablerepo=base install systemd passwd yum centos-release

This will create a directory, /var/lib/machines/cool-container, and populate it with a new root filesystem and a few core packages.
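For other distros, only this bootstrap command changes. As a rough sketch, adapted from the examples in the systemd-nspawn man page and not verified against every release, here is a small helper that prints the command for a given distro family without running it. The function name bootstrap_cmd, the family labels, and the Debian mirror URL are assumptions made for illustration; repo names and release versions will need adjusting for your system. (Inside these blocks, # starts a comment rather than marking a root prompt.)

```shell
# Sketch: print (do not run) a root-filesystem bootstrap command.
# bootstrap_cmd and its "family" labels are hypothetical helpers.
bootstrap_cmd() {
    family=$1   # centos, fedora, or debian
    root=$2     # target directory, e.g. /var/lib/machines/cool-container
    release=$3  # release version or suite name

    case "$family" in
        centos)
            # Same command as in the article, with the values filled in.
            echo "yum -y --releasever=$release --nogpg --installroot=$root --disablerepo='*' --enablerepo=base install systemd passwd yum centos-release"
            ;;
        fedora)
            # Assumed repo names: fedora and updates.
            echo "dnf -y --releasever=$release --installroot=$root --disablerepo='*' --enablerepo=fedora --enablerepo=updates install systemd passwd dnf fedora-release"
            ;;
        debian)
            # Assumed mirror URL; any Debian mirror should do.
            echo "debootstrap $release $root http://deb.debian.org/debian"
            ;;
        *)
            echo "unsupported distro family: $family" >&2
            return 1
            ;;
    esac
}
```

Printing the command first lets you review it before running it as root, for example with: bootstrap_cmd fedora /var/lib/machines/cool-container 23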

Second, we need to enter the container to set up some basic things like the root password. We can use the systemd-nspawn command directly for this:

# systemd-nspawn -D /var/lib/machines/cool-container

This drops you to a shell in the container without actually "booting" anything inside of it (think of it as chroot-ing from a recovery mode). From here, we can run passwd to set the root password so we can log in later.

When you're done, just ^D out as usual and you'll be dropped back to your host machine.

Aside: kernel auditing and containers

If you ignore this section and continue trying to boot the container, you will likely get a warning before the container starts about the kernel auditing subsystem. There are supposedly odd bugs that can surface if auditing is enabled, so we're just going to disable it. If this worries you, feel free to inspect the issue further, but since this is not a production system it's probably fine.

We just need to add a flag to the kernel parameters in our bootloader. This will vary between distros, but for CentOS it's as easy as editing the /etc/sysconfig/grub file and changing the GRUB_CMDLINE_LINUX variable by appending audit=0 to the list of parameters.
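If you would rather script that edit than make it by hand, the following is a minimal sketch. It assumes the file contains a single double-quoted GRUB_CMDLINE_LINUX="..." line, and the function name add_audit_off is made up for illustration. It writes the result back with cat rather than sed -i so that a symlinked config file stays a symlink.

```shell
# Sketch: append audit=0 to GRUB_CMDLINE_LINUX in the given file, once.
# add_audit_off is a hypothetical helper, not part of any GRUB tooling.
add_audit_off() {
    file=$1

    # Nothing to do if the parameter is already on that line.
    if grep -q '^GRUB_CMDLINE_LINUX=.*audit=0' "$file"; then
        return 0
    fi

    tmp=$(mktemp) || return 1
    # Insert " audit=0" just before the closing double quote of the
    # variable, then copy the result over the original file's contents.
    sed '/^GRUB_CMDLINE_LINUX=/ s/"[[:space:]]*$/ audit=0"/' "$file" > "$tmp" \
        && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}
```

On CentOS this would be called as add_audit_off /etc/sysconfig/grub; running it a second time leaves the file unchanged.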

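After editing the parameters, we'll need to regenerate our GRUB configuration and then reboot the host so the new kernel parameter takes effect (on BIOS systems the file lives at /boot/grub2/grub.cfg; UEFI installs keep it under /boot/efi instead):

# grub2-mkconfig -o /boot/grub2/grub.cfg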

Configuring the base system

We now have a skeleton of a container installed, but we still need to actually configure what's inside of it, and get it prepped to start automatically, or at least as a service from systemd.

Since we now have access to the root account, we can fully "boot" the container:

# systemd-nspawn -bD /var/lib/machines/cool-container

The -b flag is short for --boot and basically means systemd-nspawn will search for an init binary and execute it. You'll see the standard boot log fly by, and then be dropped at a standard PTY login prompt. Log in with the root credentials you set up previously, and now we can start installing things as if we were on a brand new machine.

Once your container is set up and everything is running and configured, you can exit by "shutting down" the container as if it were a physical machine: poweroff or halt (or whatever you usually use).

Managing the container

The /var/lib/machines prefix we used earlier may have seemed arbitrary, but it was intentional -- containers in this directory will be auto-discovered by systemd, and we can enable and manage them by name.

To have your container start with everything else when your host boots:

# systemctl enable systemd-nspawn@cool-container

And as you can perhaps guess, we can start and stop our container just like any other service:

# systemctl start systemd-nspawn@cool-container
# systemctl stop systemd-nspawn@cool-container
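The naming convention in these commands is simply the container's directory name placed after systemd-nspawn@. As a small sketch of that mapping, the helper below prints the unit name for every directory it finds under a machines directory. The function name list_container_units and the optional directory argument are assumptions for illustration; names are used verbatim, as with cool-container above, and no unit-name escaping is attempted.

```shell
# Sketch: print a systemd unit name for each container directory.
# list_container_units is a hypothetical helper; the directory defaults
# to /var/lib/machines but can be overridden to try it on a scratch dir.
list_container_units() {
    dir=${1:-/var/lib/machines}

    for path in "$dir"/*/; do
        # Skip the unexpanded pattern when there are no subdirectories.
        [ -d "$path" ] || continue
        name=$(basename "$path")
        echo "systemd-nspawn@$name.service"
    done
}
```

Each printed name can be passed to systemctl enable, start, or stop in the same way as the commands above.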

Accessing the container

Accessing a running container can be a bit tricky; one option is to install openssh in the container and have it run on a non-standard port (as containers share the host's network interfaces). Alternatively, you can access the machine through machinectl.

Running machinectl without arguments will list all running containers (and other VMs, etc). Interestingly, the older version of machinectl on CentOS does not support the login argument (so you may want to install openssh). If you're on Fedora (or another more up-to-date distro), you can use the machinectl login command:

# machinectl login cool-container

... which will drop us at that familiar PTY prompt.

Since we don't necessarily want to halt the container to escape from this prompt, there is a panic button to disconnect: press Ctrl-] three times within a second (i.e., fast).
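If machinectl login isn't available and you go the SSH route instead, the container's sshd needs its own port. Because the container's root is just a directory on the host, its sshd_config can be edited from outside. The following is a minimal sketch; the function name set_sshd_port and the port 2222 in the usage note are assumptions chosen for illustration, and the sshd inside the container still has to be installed and restarted separately.

```shell
# Sketch: make "Port N" the only Port setting in an sshd_config file.
# set_sshd_port is a hypothetical helper, not an OpenSSH tool.
set_sshd_port() {
    file=$1
    port=$2

    tmp=$(mktemp) || return 1
    {
        # Put our Port line first so it sits outside any Match block.
        echo "Port $port"
        # Copy the rest, dropping existing or commented-out Port lines.
        grep -v -E '^[[:space:]]*#?[[:space:]]*Port[[:space:]]+[0-9]+[[:space:]]*$' "$file"
    } > "$tmp"
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

For the container built above, this could be called as set_sshd_port /var/lib/machines/cool-container/etc/ssh/sshd_config 2222, after which ssh -p 2222 root@localhost should reach the container once its sshd is running.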

In conclusion, systemd-nspawn is an interesting technology that shows promise. Its ubiquity through the proliferation of systemd means containers are quite portable, easy to set up, and well-integrated directly into the OS's init system.

Would I use it in production? Probably not. It's a very green technology, and its immaturity is worrisome enough that deploying it would cost me a lot of sleep. For production "containers", FreeBSD Jails still provide the best security and feature set.

For now, systemd-nspawn is staying on my internal infrastructure, running my Atlassian stack, Jenkins, etc.; and it is running those internal services quite well. But until its features are more solidified and someone has verified it is at least moderately secure, it won't be finding its way to my production stack for a few years yet.