Splitting Warewulf Images Between PXE and NFS
This article was also published via the CIQ blog on 6 December 2022.
Warewulf 4 introduced compatibility with the OCI container ecosystem, which greatly streamlines the process of defining, importing, and maintaining compute node images compared to other systems--even compared to Warewulf 3! But one aspect of compute node images remains unchanged: they can quickly grow in size.
Warewulf (and the technique of PXE-booting a node image more broadly) expects that a compute node image will remain relatively small. Larger sets of software, like you might provide via an Environment Modules stack or, perhaps, via Spack, are typically deployed via a central NFS share, which is then mounted at runtime by the booted compute node. Even OpenHPC, with software packaged as operating system packages, supports this paradigm, with packages installed on the head node, landing in /opt, and then being shared from the head node to compute nodes.
There are still benefits to maintaining this software as part of a compute node image, but such an image can quickly grow to tens of gigabytes, which makes network booting difficult.
In this article I'll demonstrate how a full software stack can be managed together with a given compute node image, but the resultant payload can be split in-place between PXE-served netbooting and an NFS-mounted file system.
NOTE
This procedure depends on support for /etc/warewulf/excludes, which was broken in Warewulf v4.3.0.
The root image
First, I start with the standard Rocky Linux 8 image as published by HPCng.
[root@wwctl1 ~]# wwctl container import docker://docker.io/warewulf/rocky:8 rocky-8-split
Installing some software
Using the OpenHPC project as a source, I install a set of typical
scientific software. Most OpenHPC packages install software in /opt
for distribution via NFS, which is what we're going to do: just a
little bit differently than usual.
[root@wwctl1 ~]# wwctl container shell rocky-8-split
[rocky-8-split] Warewulf> dnf -y install 'dnf-command(config-manager)'
[rocky-8-split] Warewulf> dnf config-manager --set-enabled powertools
[rocky-8-split] Warewulf> dnf -y install epel-release http://repos.openhpc.community/OpenHPC/2/CentOS_8/x86_64/ohpc-release-2-1.el8.x86_64.rpm
[rocky-8-split] Warewulf> dnf -y install valgrind-ohpc {netcdf,pnetcdf,hypre,boost}-gnu9-mpich-ohpc
After installing the software our image is approaching 2GB. This isn't egregious (and the compressed image as sent over the network is even smaller), but gives us a point of comparison for what comes next.
[root@wwctl1 ~]# du -h /var/lib/warewulf/container/rocky-8-split.img{,.gz}
1.8G    /var/lib/warewulf/container/rocky-8-split.img
651M    /var/lib/warewulf/container/rocky-8-split.img.gz
Excluding the software from the final image
Warewulf consults /etc/warewulf/excludes within the image itself to define files that should not be included in the built image. For our example here, I exclude the full contents of /opt/, in anticipation that we'll be mounting it via NFS instead.
[rocky-8-split] Warewulf> cat /etc/warewulf/excludes
/boot
/usr/share/GeoIP
/opt/*
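If the excludes file is maintained from a script rather than by hand, the entry can be added idempotently. This is only a minimal sketch and not part of Warewulf: add_exclude is a hypothetical helper, and the scratch directory stands in for the image's root file system.

```shell
#!/bin/sh
# add_exclude FILE PATTERN (hypothetical helper, not a Warewulf command)
# Append PATTERN to FILE unless an identical line is already present.
add_exclude() {
    file=$1
    pattern=$2
    mkdir -p "$(dirname "$file")"
    touch "$file"
    # -x matches whole lines and -F treats the pattern literally,
    # so the "*" in "/opt/*" is not interpreted as a regex.
    if ! grep -qxF -- "$pattern" "$file"; then
        printf '%s\n' "$pattern" >> "$file"
    fi
}

# Demonstration against a scratch directory rather than a real image.
demo_root=$(mktemp -d)
excludes="$demo_root/etc/warewulf/excludes"
add_exclude "$excludes" '/boot'
add_exclude "$excludes" '/usr/share/GeoIP'
add_exclude "$excludes" '/opt/*'
add_exclude "$excludes" '/opt/*'   # second call is a no-op
cat "$excludes"
rm -rf "$demo_root"
```

The helper assumes the file, if it already exists, ends with a newline; otherwise the appended pattern would be joined to the last line.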
After rebuilding the image with /opt/* excluded, the image is smaller, and further software installation under /opt no longer increases the size of the image delivered over PXE.
[root@wwctl1 ~]# du -h /var/lib/warewulf/container/rocky-8-split.img{,.gz}
1.1G    /var/lib/warewulf/container/rocky-8-split.img
483M    /var/lib/warewulf/container/rocky-8-split.img.gz
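To put the two listings side by side, the reduction in the compressed image can be worked out with integer shell arithmetic. This is only an illustrative sketch: percent_saved is a hypothetical helper, and the inputs are the rounded sizes reported by du in this article.

```shell
#!/bin/sh
# percent_saved BEFORE AFTER (hypothetical helper)
# Print the whole-number percentage by which AFTER is smaller than BEFORE.
percent_saved() {
    before=$1
    after=$2
    echo $(( (before - after) * 100 / before ))
}

# Compressed image sizes in megabytes, as rounded by du -h above.
percent_saved 651 483   # prints 25
```

Because du -h rounds its output, the result is approximate: the compressed image sent over the network shrank by roughly a quarter.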
Exporting the software via NFS
With the software in /opt excluded from the image, we need to export it via NFS instead. This is relatively easily done, though we must discover and hard-code paths to the container directory.
[root@wwctl1 ~]# readlink -f $(wwctl container show rocky-8-split)/opt
/var/lib/warewulf/chroots/rocky-8-split/rootfs/opt
Add an NFS export to /etc/warewulf/warewulf.conf, restart the Warewulf server, and configure NFS with wwctl. Note that I've specified mount: false for this export, as I want to control which nodes will mount it: presumably nodes that aren't using this image should not mount this image's software.
nfs:
  export paths:
  - path: /var/lib/warewulf/chroots/rocky-8-split/rootfs/opt
    export options: rw,sync,no_root_squash
    mount: false
[root@wwctl1 ~]# systemctl restart warewulfd
[root@wwctl1 ~]# wwctl configure nfs
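Since the export path has to be hard-coded, one way to avoid typing it by hand is to generate the list entry from the discovered rootfs path and paste it under "export paths" in warewulf.conf. The following is a rough sketch under that assumption: nfs_export_stanza is a hypothetical helper, it only prints text, and it does not edit warewulf.conf itself.

```shell
#!/bin/sh
# nfs_export_stanza ROOTFS SUBDIR (hypothetical helper)
# Print a warewulf.conf "export paths" list entry for ROOTFS/SUBDIR,
# using the same export options as the entry in this article.
nfs_export_stanza() {
    rootfs=${1%/}    # drop a trailing slash, if any
    subdir=${2#/}    # drop a leading slash, if any
    cat <<EOF
  - path: $rootfs/$subdir
    export options: rw,sync,no_root_squash
    mount: false
EOF
}

# The rootfs path is the one reported by readlink earlier in this article.
nfs_export_stanza /var/lib/warewulf/chroots/rocky-8-split/rootfs opt
```

On the Warewulf server the first argument could come from the readlink command shown earlier instead of a literal path.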
Mounting the software on the compute node
We can mount this new NFS share just like any other, by listing it in fstab.
Warewulf typically configures fstab as part of the wwinit overlay. In order to mount this NFS share without setting mount: true for all nodes, I copy fstab.ww to a new overlay and add an additional entry.
[root@wwctl1 ~]# wwctl overlay list -a rocky-8-split
OVERLAY NAME                   FILES/DIRS
rocky-8-split                  /etc/
rocky-8-split                  /etc/fstab.ww
[root@wwctl1 ~]# wwctl overlay show rocky-8-split /etc/fstab.ww | tail -n1
{{ .Ipaddr }}:/var/lib/warewulf/chroots/rocky-8-split/rootfs/opt /opt nfs defaults 0 0
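The copy-and-append step itself is plain file manipulation and could be sketched as follows. This is an assumption-laden illustration: append_fstab_entry is a hypothetical helper, the scratch files stand in for a copy of the wwinit fstab.ww, and getting the result into the new overlay is left to whichever wwctl overlay commands your Warewulf version provides.

```shell
#!/bin/sh
# append_fstab_entry SRC DST ENTRY (hypothetical helper)
# Copy SRC to DST, then append ENTRY unless DST already contains that exact line.
append_fstab_entry() {
    src=$1
    dst=$2
    entry=$3
    cp "$src" "$dst"
    # -x matches whole lines, -F disables regex interpretation of the entry.
    if ! grep -qxF -- "$entry" "$dst"; then
        printf '%s\n' "$entry" >> "$dst"
    fi
}

# Demonstration with scratch files; the single line written to fstab.ww.orig
# is a placeholder, not the real contents of the wwinit template.
work=$(mktemp -d)
printf '%s\n' 'tmpfs /dev/shm tmpfs defaults 0 0' > "$work/fstab.ww.orig"
entry='{{ .Ipaddr }}:/var/lib/warewulf/chroots/rocky-8-split/rootfs/opt /opt nfs defaults 0 0'
append_fstab_entry "$work/fstab.ww.orig" "$work/fstab.ww" "$entry"
tail -n1 "$work/fstab.ww"
rm -rf "$work"
```

The entry is single-quoted so that the shell leaves the {{ .Ipaddr }} template expression untouched for Warewulf to render.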
I can add the new overlay to our wwinit list, and the fstab in rocky-8-split will override the one in wwinit. (Note: --wwinit was specified as --system in Warewulf 4.3.0.)
[root@wwctl1 ~]# wwctl profile set --wwinit wwinit,rocky-8-split default
[root@wwctl1 ~]# wwctl profile set --container rocky-8-split default
From a compute node, we can see that /opt is mounted via NFS as expected.
[root@compute1 ~]# findmnt /opt
TARGET SOURCE                                                      FSTYPE OPTIONS
/opt   10.0.0.3:/var/lib/warewulf/chroots/rocky-8-split/rootfs/opt nfs4   rw,relatime,vers=4.2,rsize=262144,wsize=262144,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.0.0.4,local_lock=none,addr=10.0.0.3
We can further confirm that /opt is empty on the local, PXE-deployed file system.
[root@compute1 ~]# mount -o bind / /mnt
[root@compute1 ~]# du -s /mnt/opt
0       /mnt/opt
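The same check could be scripted, for example as a health check that runs on a compute node after boot. The sketch below is an assumption rather than anything Warewulf provides: fstype_is_nfs and mount_is_nfs are hypothetical helpers, and the second one relies on findmnt being installed on the node.

```shell
#!/bin/sh
# fstype_is_nfs TYPE (hypothetical helper)
# Succeed if TYPE is one of the NFS file system types reported by findmnt.
fstype_is_nfs() {
    case $1 in
        nfs|nfs4) return 0 ;;
        *)        return 1 ;;
    esac
}

# mount_is_nfs TARGET (hypothetical helper)
# Succeed if TARGET is a mount point backed by NFS.
# -n suppresses the header line, -o FSTYPE prints only that column.
mount_is_nfs() {
    fstype_is_nfs "$(findmnt -n -o FSTYPE "$1")"
}

# "nfs4" is the FSTYPE shown in the findmnt output above.
if fstype_is_nfs nfs4; then
    echo "nfs4 is an NFS file system type"
fi
```

On a compute node the check would be run as mount_is_nfs /opt, which fails if /opt is not a mount point or is backed by something other than NFS.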
Future work
As demonstrated here, we can implement split PXE/NFS images using functionality already in Warewulf, but future Warewulf development may simplify this process further:
Container path variables in warewulf.conf
We could support referring to compute node images in warewulf.conf. For example, it would be nice to be able to replace
nfs:
  export paths:
  - path: /var/lib/warewulf/chroots/rocky-8-split/rootfs/opt
    export options: rw,sync,no_root_squash
    mount: false
with something like
nfs:
  export paths:
  - path: {{ containers['rocky-8-split'] }}/opt
    export options: rw,sync,no_root_squash
    mount: false
This way, our configuration would not have to hard-code the path to the container chroot.
Move NFS mount settings to nodes and profiles
Right now, NFS client settings are stored in warewulf.conf as mount options, mount, and implicitly via path; but if these settings were moved to nodes and profiles we could configure per-profile and per-node NFS client behavior without having to manually edit or override fstab.