Introduction

Qlustar Clusters

A Qlustar cluster is designed to boot and manage compute and/or storage nodes (hosts) over the network and make them run a minimal OS (Operating System) image in RAM. Local disks (if present) are only used to preserve log files across boots and for temporary storage (e.g. for compute jobs). Hence all Qlustar cluster nodes apart from head-nodes are always state-less.

One or more head-nodes deliver the OS boot images to the nodes. Additionally, a small NFS share containing part of the configuration space for the nodes is exported from one of the head-nodes. Optionally, the RAM-based root FS (file-system) can be supplemented by a global UnionFS chroot to support software not already contained in the boot images themselves. The head-node(s) of the cluster typically provide services such as TFTP/PXE boot, DHCP, NIS and/or slurm resource management to the cluster.

These components, and all other cluster-related components of a Qlustar installation, can be managed through a single administration interface: QluMan, the Qlustar Management interface. The QluMan GUI is both multi-user and multi-cluster capable. Different users can work with the GUI simultaneously, and changes made by one user become visible in real time in the windows opened by all other users. In addition, a virtually unlimited number of clusters can be managed at the same time within a single instance of the QluMan GUI. Each cluster is shown in a tab or in a separate main window.

Overview of basic Setup Principles

A central part of Qlustar is its set of pre-configured modular OS images. Different nodes may have different hardware or need to provide specific and varying functionality/services. Therefore, to optimize the use of hardware resources and increase stability/security, Qlustar does not come with just one boot image that covers every use-case. Instead, a number of image modules with different software components are provided, from which individual custom OS images can be created as needed. A Qlustar OS image contains only what is actually required to accomplish the tasks of a node, nothing more. See below for more details about configuring OS images.
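The modular idea can be illustrated with a minimal sketch in which an image is simply the union of the software components of the selected modules. All module names, package names and the compose_image helper below are hypothetical and chosen for illustration only; they do not reflect the actual Qlustar module catalogue or tooling.

```python
# Hypothetical module catalogue: module and package names are illustrative
# assumptions, not the real Qlustar image modules.
IMAGE_MODULES = {
    "core": {"kernel", "busybox", "sshd"},
    "slurm": {"slurmd", "munge"},
    "infiniband": {"opensm", "rdma-core"},
    "samba": {"smbd"},
}


def compose_image(selected):
    """Return the sorted component list of an image built from the selected modules."""
    unknown = [m for m in selected if m not in IMAGE_MODULES]
    if unknown:
        raise KeyError(f"unknown image module(s): {unknown}")
    components = set()
    for module in selected:
        components |= IMAGE_MODULES[module]  # an image is the union of its modules
    return sorted(components)


# A compute node and a storage node get different, minimal images.
compute_image = compose_image(["core", "slurm"])
storage_image = compose_image(["core", "samba"])
print(compute_image)
print(storage_image)
```

In this sketch the compute image carries no samba components and the storage image carries no slurm components, mirroring the principle that an image contains only what its node needs.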

Providing different OS images is still not enough for a flexible yet easily manageable cluster, however. A node booting a generated image also receives extra configuration options via DHCP, via qlumand and via NFS at boot time, which allows the OS configuration to be fine-tuned at run-time. For example, it is possible to determine how the local disks are to be used (if any are present), whether additional services like OpenSM or samba should be enabled or disabled, and a lot more. QluMan organizes these settings into the following configuration/property categories:

  • Generic Properties are simple on/off options or key+value pairs applicable to groups of nodes, e.g. to flag the reformatting of the local disks at the next boot, add SMTP mail functionality, etc.

  • Config Classes handle more complex configurations like boot/disk configs, DHCP, etc.

  • Hardware Properties are not used to configure the nodes themselves but describe their hardware configuration. They matter e.g. for the slurm workload manager and/or inventory management.

Of course, one can configure every host in a cluster individually. But in most clusters, there are large groups of hosts that need to be configured identically. Even if there are several such groups, they might share only some properties/configurations, but not all of them. To make such scenarios simple to handle while maintaining maximum flexibility, QluMan allows generic properties, hardware properties and config classes each to be combined into sets.

For settings that apply to all hosts of a cluster, there are global sets: A global Generic Property set, a global Hardware Property set and a global Config set.

Additionally, it is possible to combine exactly one Generic Property set, one Hardware Property set and one Config set into a Host Template. Assigning a Host Template to a group of hosts makes it possible to specify all of their specific properties and configuration settings with a single mouse-click.
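As a rough data-model sketch of this relationship, a Host Template can be thought of as a record that references exactly one set of each kind, and assigning it to a group of hosts is a single operation. The class names, field names, set names and the assign_template helper are assumptions made for illustration; they are not QluMan's internal schema or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HostTemplate:
    """Hypothetical model: exactly one set of each kind per template."""
    name: str
    generic_property_set: str
    hardware_property_set: str
    config_set: str


@dataclass
class Host:
    name: str
    template: Optional[HostTemplate] = None


def assign_template(hosts, template):
    """Assign one template to a whole group of hosts in a single step."""
    for host in hosts:
        host.template = template


# Illustrative set names only.
compute = HostTemplate("compute", "gp-compute", "hw-dual-socket", "cfg-diskless")
nodes = [Host(f"node{i:02d}") for i in range(1, 4)]
assign_template(nodes, compute)

for node in nodes:
    print(node.name, "->", node.template.name)
```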

For situations where more flexibility is required (e.g. one host in a group has a slightly different hardware configuration than all the others), it is also possible to override or extend the settings defined in the chosen Host Template by assigning any of the sets and/or individual properties/config classes directly to a host. In case of conflicts, values from individual properties/config classes have the highest priority, followed by values from directly assigned sets, then the Host Template values and finally the global values. The Enclosure View presents a graphical representation of this hierarchy of settings for each host. For more details on this, see Configuring Hosts.
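The priority order described above can be sketched as a layered lookup in which the first layer that defines a key wins. This is only an illustration of the precedence rule, assuming each layer can be flattened to simple key/value pairs; the function name, the keys and the values are hypothetical and do not represent how QluMan resolves settings internally.

```python
from collections import ChainMap


def effective_settings(individual, host_sets, template, global_sets):
    """Merge the four layers; earlier arguments win on conflicting keys.

    Order (highest to lowest priority): individually assigned values,
    sets assigned directly to the host, Host Template, global sets.
    """
    return dict(ChainMap(individual, host_sets, template, global_sets))


# Hypothetical example values for a single host.
global_sets = {"smtp": "off", "reformat_disks": "off", "cpu_cores": 16}
template = {"smtp": "on", "cpu_cores": 32}
host_sets = {"cpu_cores": 64}
individual = {"reformat_disks": "on"}

settings = effective_settings(individual, host_sets, template, global_sets)
for key in sorted(settings):
    print(key, "=", settings[key])
```

Here the host ends up with cpu_cores from its directly assigned set, smtp from the Host Template and reformat_disks from the individually assigned property, while any key defined only globally would fall through to the global value.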