This section describes the boot process of Qlustar cluster-nodes.
The boot process of the compute-nodes follows precise rules. It takes place in six steps:
The PXE boot ROM of the network card sends a DHCP request. If the node is already registered in QluMan, the request is answered by the DHCP server running on the head-node(s), allowing the adapter to configure its basic IP settings.
The boot ROM requests a PXE loader program from the TFTP server on the head-node (the TFTP server specified by DHCP could also be on another node, but this is not the default). The PXE loader is then sent to the compute-node via TFTP.
PXELinux downloads the Qlustar Linux kernel and the assigned initial RAM-disk image and boots the kernel. This image doesn’t hold the final OS; it has just enough functionality to download the real OS image in the next step.
A Qlustar-specific script
/init is executed as the initial init process. This script sets up basic networking functionality for the boot NIC and starts the Qlustar multicast client ql-mcast-client to download the real node OS image assigned to the node. It does so by connecting to the Qlustar multicast image server ql-mcastd on the head-node to request a multicast IP and port. The OS image for the node is then streamed to all nodes that requested the same image at that time. Should multicast fail, a slower unicast fallback is used to download the OS image.
After the real OS image is downloaded and its checksum verified, a unionfs filesystem structure is set up under
/union. The image is then unpacked into
/union/image as one component of the union. A tmpfs to store runtime changes made to the image filesystem is created as a second component, and finally a third, empty one is reserved for an optional chroot to be added later.
At the end, the system moves to the next boot stage by changing the root filesystem to the just created unionfs. Control is then passed to the 2nd Qlustar init script
The latter first executes systemd-udevd to trigger the auto-loading of the full set of kernel drivers, and then starts QluMan execd in a one-shot configure mode. In this mode, execd (a) receives all the node-specific options from the head-node’s qlumand and (b) executes the corresponding scripts to process the options received.
This dynamic customization/configuration of the node must be done before systemd starts. Among others, the following tasks are performed at this stage: setup of systemd units for QluMan-defined Network FS mounts, Root FS Customization, synchronization of the system time with the head-node(s), pam/sssd customization, and enabling of NIS and OpenSM configuration (both optional). Finally, if configured and present, local disks are initialized/mounted before running any Root FS Customization scripts transferred to
When all the above is finished, control is finally passed to systemd as the final init process. From here on, the boot procedure continues in the standard Linux fashion.
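The download logic described in the steps above (fast path first, slower fallback on failure, checksum verification before use) can be sketched in plain shell. This is only an illustration under stated assumptions: `fetch_image` and its arguments are hypothetical, the two download commands stand in for the multicast and unicast transfers, and SHA-256 is assumed because the text does not say which checksum Qlustar uses.

```shell
# fetch_image DEST EXPECTED PRIMARY FALLBACK
# PRIMARY and FALLBACK are names of commands or shell functions that take the
# destination path as their only argument (hypothetical stand-ins for a
# multicast and a unicast download).
fetch_image() {
    dest=$1
    expected=$2
    primary=$3
    fallback=$4

    # Try the fast path first; use the slower one only if it fails.
    if ! "$primary" "$dest"; then
        echo "primary download failed, falling back" >&2
        "$fallback" "$dest" || return 1
    fi

    # Verify the download before using it. SHA-256 is an assumption here.
    actual=$(sha256sum "$dest" | cut -d ' ' -f 1)
    [ "$actual" = "$expected" ]
}
```

The function returns a non-zero status if both downloads fail or if the checksum does not match, so a caller can stop the boot sequence instead of unpacking a damaged image.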
Log files concerning the Qlustar specific boot phase are located under
The TFTP server component of
dnsmasq transfers the boot image to the compute-nodes. All
files that should be served by TFTP must reside in the directory
/var/lib/tftpboot. On a
Qlustar installation, it contains three symbolic links:
pxelinux.0 -> /usr/lib/syslinux/pxelinux.0
pxelinux.cfg -> /etc/qlustar/pxelinux.cfg
qlustar -> /var/lib/qlustar
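To check that these links exist and point where expected, a small helper can print every symbolic link in a directory together with its target. The function name `show_links` is a hypothetical choice for this sketch; only standard tools are used.

```shell
# show_links DIR: print "name -> target" for every symbolic link in DIR.
show_links() {
    for entry in "$1"/*; do
        # Skip regular files and directories; only links are of interest.
        [ -L "$entry" ] && printf '%s -> %s\n' "${entry##*/}" "$(readlink "$entry")"
    done
    return 0
}

# Example call on a head-node:
# show_links /var/lib/tftpboot
```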
/etc/qlustar/pxelinux.cfg contains the PXE boot configuration files for the
compute-nodes. There is a default configuration that applies to any node without an assigned
custom boot configuration in QluMan. For every host with a custom boot configuration, QluMan
adds a symbolic link pointing to the actual configuration file. The links are named after the
Hostid, which you can find out with the
gethostip command. For more details about
how to define boot configurations see the corresponding section of the
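PXELINUX looks up per-host configuration files by the upper-case hexadecimal form of the client's IPv4 address, which is also what gethostip reports. Assuming the Hostid mentioned above is this hexadecimal value, the following sketch derives it without any extra tools. The function name `ip_to_hex` and the sample address are illustrative only.

```shell
# ip_to_hex ADDRESS: print a dotted-quad IPv4 address as 8 upper-case hex
# digits. Hypothetical helper, not part of Qlustar or syslinux.
ip_to_hex() {
    # Turn the dots into spaces so the shell splits the four octets into
    # separate printf arguments.
    printf '%02X%02X%02X%02X\n' $(printf '%s' "$1" | tr '.' ' ')
}

# Example: the link for a node with address 192.168.1.5 would be expected
# under the name printed by
# ip_to_hex 192.168.1.5
```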
The squashfs-based RAM-disk image is the file-system holding the node OS that is mounted as the root filesystem of the compute-nodes. It is assembled on the head-node(s) from the image modules you select in QluMan. Every RAM-disk image contains at least the core module. See the corresponding section of the QluMan Guide for more details. All available image modules are displayed and selectable in QluMan, and the configuration and assembly of images is done automatically from within QluMan.
By default, the root password of a Qlustar OS image, and hence of the node booting it, is taken
from the head-node(s)
Any Qlustar node OS image contains changelogs of the various image modules it is composed of. They are located in the directory
/usr/share/doc/qlustar-image. The main changelog file is
core.changelog.gz. The other files are automatically generated. The files
.packages.version.gz list the packages each module is made of. The files
.contents.changelog*.gz list the files that were changed between each version, and
.packages.changelog.gz list differences in the package list and versions. Hence, you always have detailed information about what has been changed in new images as well as the package sources of their content.
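Since the changelogs are gzip-compressed, they have to be decompressed for reading. A minimal sketch, assuming a hypothetical helper named `show_changelog` that prints the first lines of one of these files:

```shell
# show_changelog FILE [DIR] [LINES]
# Print the first LINES lines (default 20) of the compressed changelog FILE.
# DIR defaults to the changelog directory named in the text.
show_changelog() {
    doc_dir=${2:-/usr/share/doc/qlustar-image}
    gzip -dc "$doc_dir/$1" | head -n "${3:-20}"
}

# Example call on a running node:
# show_changelog core.changelog.gz
```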
Node OS images are regenerated automatically when the image module packages they are based on are updated. This means that files can’t simply be modified or added to a generated image, as the changes would be lost on the next update.
Qlustar therefore provides a mechanism to add extra files to images every time they are rebuilt and hence make changes permanent. Files can be added to all images or only to one specific image using the qlustar-image-edit tool. All the commands in this section must be executed as root on the head-node.
To modify or add the file
/some/path/filename to all images execute:
0 root@cl-head ~ # qlustar-image-edit -e /some/path/filename
To modify/add the file to a specific image <img>:
0 root@cl-head ~ # qlustar-image-edit -e img /some/path/filename
To edit the file again later, simply run the same command again.
Files created this way will be located underneath the path
/etc/qlustar/images on the
head-node, either in the sub-directory
common (for files entering all images) or in the sub-directory
img (for files entering just the image img).
The whole directory structure of a file is created there so the full path of the above
examples would be
/etc/qlustar/images/img/copy/some/path/filename respectively. To undo adding such files to
the images, simply remove these files.
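The mapping from an image file to its persistent copy on the head-node can be written down as a small helper. This is a sketch only: `image_file_path` is a hypothetical name, and the assumption that files for all images use the same `copy` sub-directory below `common` is inferred from the per-image example, not stated in the text.

```shell
# image_file_path IMAGE PATH
# Print where the persistent copy of PATH for IMAGE is expected to live.
# Pass "common" as IMAGE for files that enter all images (assumed layout).
image_file_path() {
    image=$1
    path=${2#/}    # drop the leading slash of the absolute path
    printf '%s\n' "/etc/qlustar/images/$image/copy/$path"
}

# Example: locate the copy so that it can be inspected or removed.
# ls -l "$(image_file_path img /some/path/filename)"
```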
A second mode of qlustar-image-edit is to directly edit the generated images. Such changes are always temporary, meaning they will be overwritten by image module updates. This method is suitable for testing or for applying a quick fix for a problem that is known to be solved in subsequent image module versions.
To edit the initial RAM-disk of the image img execute:
0 root@cl-head ~ # qlustar-image-edit -i img
To edit the squashfs OS image do:
0 root@cl-head ~ # qlustar-image-edit -s img
In both cases, you will be placed into the root directory of the corresponding
initrd/image. You can then manipulate any file in the initrd/image or add new ones to it. When done, enter
exit and the initrd/image will be regenerated. Alternatively, enter
exit 1 to
abort and throw away any modifications.
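The difference between a plain exit and exit 1 is the exit status of the editing shell. How qlustar-image-edit evaluates it internally is not documented here; the following sketch merely illustrates the general pattern of a wrapper that commits or discards work based on that status. The function name `edit_session` and the printed words are hypothetical.

```shell
# edit_session WORKDIR COMMAND [ARGS...]
# Run COMMAND inside WORKDIR and report what a wrapper would do next:
# "commit" if the command exits with status 0, "abort" otherwise.
edit_session() {
    workdir=$1
    shift
    # The subshell keeps the directory change local to this call.
    if ( cd "$workdir" && "$@" ); then
        echo "commit"
    else
        echo "abort"
    fi
}

# Example with an interactive shell as the command:
# edit_session /tmp/unpacked-image "${SHELL:-/bin/sh}"
```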
By manipulating the image in this way, you can easily break things and in the worst case make
the OS unbootable. Please be aware that you’re on your own if you choose to experiment with
the above methods. In other words: The Qlustar team won’t be able to give support for problems
arising from a modified OS initrd/image. You can reset the initramfs and squashfs images to
the original content using
Rebuilding images is time-consuming, and changes made to an image apply to all nodes using the same image. This makes customizations somewhat inflexible. To improve on this, Qlustar provides another mechanism to customize/modify node OS images. It is applied in the pre-systemd boot phase to target node-specific customizations assignable via QluMan.
This is implemented by the Root FS Customization config class in qluman-qt. For details about how to create such a config class and how to assign it to nodes see the corresponding section of the QluMan Guide.
The files and directory structure for the Root FS Customization are stored below
/var/lib/qlustar/root-fs/<custom> on the head-node(s) where
<custom> is the name of the
config as defined in QluMan. The qlustar-image-edit tool provides shortcuts to create/edit or
delete files. More complex operations like changing file ownership or permissions must be done
from the shell directly using the full path.
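As an illustration of such a direct operation, the sketch below changes the permissions of a file inside a Root FS Customization config by building its full path on the head-node. The helper name `set_custom_mode` and the `ROOTFS_BASE` override are assumptions made for this example.

```shell
# set_custom_mode CONFIG PATH MODE
# Apply MODE to PATH inside the Root FS Customization config CONFIG.
# ROOTFS_BASE defaults to the directory named in the text and can be
# overridden, for example for a dry run in a scratch directory.
set_custom_mode() {
    base=${ROOTFS_BASE:-/var/lib/qlustar/root-fs}
    target="$base/$1/${2#/}"
    chmod "$3" "$target"
}

# Example (as root on the head-node):
# set_custom_mode custom /some/path/filename 0640
```

Ownership can be changed in the same way with chown on the full path.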
To create or edit the file
/some/path/filename for a Root FS Customization config named
0 root@cl-head ~ # qlustar-image-edit -r -e custom /some/path/filename
This will use sensible-editor to open the file in an editor, honoring your EDITOR and VISUAL settings or using the system default editor.
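The precedence between the two variables can be pictured with a simplified sketch. It assumes that VISUAL is consulted before EDITOR and that the generic `editor` command is the last resort; the real sensible-editor knows further fallbacks that are left out here, and `pick_editor` is a hypothetical name.

```shell
# pick_editor: print the editor a sensible-editor-like lookup would choose.
# Simplified: VISUAL first, then EDITOR, then the generic "editor" command.
pick_editor() {
    if [ -n "${VISUAL:-}" ]; then
        printf '%s\n' "$VISUAL"
    elif [ -n "${EDITOR:-}" ]; then
        printf '%s\n' "$EDITOR"
    else
        printf '%s\n' editor
    fi
}
```

To use a particular editor for a single invocation, set one of the variables for that command only, for example by prefixing the call with `EDITOR=vim`.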
To delete the file execute:
0 root@cl-head ~ # qlustar-image-edit -r -d custom /some/path/filename
The QluMan execution server qluman-execd runs on any node of a Qlustar cluster. It is one of Qlustar’s main components, responsible for executing remote commands, writing configurations to disk, as well as monitoring.
When a compute-node boots, qluman-execd initially starts in a one-shot fashion (starts and
exits when done with its configuration tasks) during the pre-systemd boot phase (see
booting for details). At this stage, it performs a number of initialization/configuration
tasks depending on the node’s configuration settings defined in QluMan. Generated option files
are written under
/etc/qlustar/options.d. The following is a list of these tasks:
- Network configuration
Configuration of all network parameters in the corresponding configuration files, so that they can be activated later on by systemd. The information is written to
/etc/network/interfaces.d/qluman (Ubuntu nodes), or in adapter specific files under
- Disk configuration
Writing of the host’s QluMan defined disk configuration into the file
/etc/qlustar/options.d/disk-config for later use by the disk initialization script.
- Setup of Network FS mounts
Writing of systemd (auto)mount unit files according to the Network FS mounts config assigned to the host in QluMan.
- Infiniband OpenSM activation
Activation of OpenSM in case the node is configured to run it.
- IPMI IP configuration
Reconfiguration of the node’s IPMI address, if activated for the node in QluMan.
- UnionFS chroot
An optionally assigned custom unionFS chroot will be set up instead of the one that is defined in the Qlustar image.
- SSH authorized_keys
The ssh keys that are configured in QluMan to allow password-less login to the node as root are copied into
- Root FS Customization
If one or more Root FS Customization configs are assigned to a node, the corresponding directory structure(s) under
/var/lib/qlustar/root-fs/config will be sent to the node, preserving the user, group and permissions of each file or directory. Any files ending up in
/lib/qlustar/init.d will be executed in alphanumeric order at the end of the 2nd stage boot script
/sbin/init.qlustar just before handing over control to systemd.
If configured in QluMan, mail transport will be activated on the node and ssh access for normal users will be limited to those having a running slurm job on the node (by making changes to the pam config).
For details about the configuration of the above components, see the corresponding sections of the QluMan Guide.
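The alphanumeric execution order of the Root FS Customization scripts mentioned in the list above can be used to sequence them with numeric prefixes such as 10- and 20-. The sketch below shows the general pattern of running every file of a directory in sorted order. It is not the code of the Qlustar boot script: `run_init_dir` is a hypothetical name, and running each file through sh is an assumption made to keep the example simple.

```shell
# run_init_dir DIR: run every regular file in DIR in sorted name order.
run_init_dir() {
    dir=$1
    # Pathname expansion returns the names sorted, so 10-foo runs before
    # 20-bar.
    for script in "$dir"/*; do
        [ -f "$script" ] || continue
        sh "$script" || echo "warning: $script failed" >&2
    done
}

# Example dry run on the head-node copy of a config named custom:
# run_init_dir /var/lib/qlustar/root-fs/custom/lib/qlustar/init.d
```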
All files being written to a node by qluman-execd in the pre-systemd boot phase can be previewed in the QluMan GUI.