Setting up an HPC cluster is much easier than it used to be, thanks to a wide array of prepackaged solutions.

In the early days of Linux HPC clustering, installing the necessary software was a customized and often tedious process. Each new cluster tended to bring with it a new way to install and manage each system. Fortunately, today's HPC cluster market offers more choices for cluster software than in the past. Depending on your needs, budget, and resources, turning a pile of servers into a real HPC resource can be a pain-free process. The end result is less time fiddling with software and more time doing meaningful (production-level) computing.

Before we talk about software options, let's take a look at the requirements for a simple but functional cluster. There is a head node server, which serves as the access point for users and administrators. In our simple cluster, the head node is also the Network File System (NFS) file server and runs the batch scheduler that manages users' jobs. Next, there are a number of compute nodes that are used to execute applications. Finally, there are one or more networks connecting all the nodes.

The software needs of a basic cluster are as follows:
  1. Provide a full software environment on the head node that includes user directories, software tools, and applications.
  2. Provide identical copies of the head node software on the compute nodes.
  3. Export the user directories from the head node to the compute nodes using NFS.
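
As a rough illustration of item 3, the sketch below builds an entry of the kind a Linux NFS server reads from /etc/exports. The directory /home, the subnet 10.0.0.0/24, the option list, and the helper name exports_line are all assumptions chosen for illustration; they are not taken from any particular cluster distribution.

```python
def exports_line(directory, network, options=("rw", "sync", "no_subtree_check")):
    """Return one /etc/exports entry: '<directory> <network>(<opt>,<opt>,...)'."""
    return f"{directory} {network}({','.join(options)})"


# Hypothetical values: user directories live under /home on the head node,
# and the compute nodes sit on the 10.0.0.0/24 network.
line = exports_line("/home", "10.0.0.0/24")
print(line)
```

On a real head node, a line like this would be placed in /etc/exports, and each compute node would then mount the exported directory at the same path so that user directories look identical everywhere.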

Since all of the nodes (including the head node) have identical software environments with shared user directories, parallel programs (multiple copies of a binary program) can be run on any number of the compute nodes. When the programs need to communicate, they can talk using the Message Passing Interface (MPI) over the high-speed compute network.
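
To make the "multiple copies of a binary program" idea concrete, here is a minimal sketch of how each copy might pick its own share of the work. It uses plain Python rather than MPI: in a real MPI program, the rank (which copy am I?) and size (how many copies are running?) would come from the MPI library, and the helper block_range is a hypothetical name used only for this example.

```python
def block_range(rank, size, n_items):
    """Return the half-open range [start, stop) of items owned by copy `rank`.

    Items are split as evenly as possible; the first `n_items % size`
    copies each take one extra item.
    """
    base, extra = divmod(n_items, size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


# Pretend four copies of the program are running and there are 10 work items.
size = 4
ranges = [block_range(rank, size, 10) for rank in range(size)]
print(ranges)
```

Every copy runs the same code but computes a different range, which is why identical software environments on all nodes matter: the same binary must be able to start anywhere.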

There are several popular methods for managing a Linux cluster software environment, including Image Based, Stateless, and Remote State management. The methods differ in how the software is installed on the nodes (provisioning), but in the end, all nodes provide the basic software environment mentioned above.

Image Based Management

Image-based methods store a complete operating system (OS) image on the node’s local disk. While the software needed on the compute nodes may be different from that needed on the head node, the software is identical on each compute node. A standard compute node OS image is maintained on the head node and copied to the nodes the first time they are booted. The next time the node boots, the image is ready for use.
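
The boot-time decision can be sketched as follows: a node copies the image when it has none, or when its local copy no longer matches the one kept on the head node. This is only an illustration under stated assumptions. Comparing SHA-256 digests is one plausible way to detect a changed image, not the mechanism of any specific package, and the names image_digest and needs_imaging are hypothetical.

```python
import hashlib


def image_digest(image_bytes):
    """Return a SHA-256 hex digest identifying the contents of an OS image."""
    return hashlib.sha256(image_bytes).hexdigest()


def needs_imaging(local_digest, head_digest):
    """True when the node has no image yet (None) or its image is out of date."""
    return local_digest is None or local_digest != head_digest


# Stand-in byte strings play the role of real OS images here.
head = image_digest(b"compute-image-v2")
old = image_digest(b"compute-image-v1")

print(needs_imaging(None, head))  # first boot: no local image yet
print(needs_imaging(head, head))  # local image matches the head node
print(needs_imaging(old, head))   # head node image has changed
```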

The advantage of this method is the control of node images by the head node. If the image is changed in some way, the head node can tell the nodes to re-image themselves with the new OS image. Of course, this requires time for the new images to propagate, but it allows tight control of exactly what is installed on the nodes. Below are several freely available and commercial distributions (indicated by a "$") that provide Image Based Management:

Stateless Provisioning

In a "stateless" mode, the entire OS image comes from a boot server (head node) and lives in the node's memory in a RAM disk. This image is usually small and is supplemented by NFS mounts from the head node. When a node is rebooted, all state information is lost and the image must be reloaded. With stateless booting, hard disk drives on the nodes are optional, and changes can be easily managed and quickly propagated by rebooting the node. It should also be noted that Rocks, Oscar, Kusu, and their commercial counterparts all provide some form of stateless (disk-less) provisioning. In addition, Bright Computing supports stateless nodes. The packages mentioned below are designed primarily for stateless operation:

Remote State Management

In this scenario, the node state is maintained on a remote file system (e.g., NFS) or a remote disk (e.g., iSCSI). One popular package for remote state management is oneSIS, an open-source software package aimed at simplifying disk-less cluster management. Remote State is similar to Stateless Provisioning in that a kernel is loaded over the network. The difference is that, instead of living in a RAM disk, all other files are mounted via NFS from the head node.
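
As a hedged illustration of an NFS-mounted root, the sketch below assembles the Linux kernel boot parameters (root=/dev/nfs, nfsroot=, ip=dhcp) that tell a network-booted kernel to mount its root file system over NFS. The server address 10.0.0.1, the path /export/nodes/root, and the helper name nfsroot_cmdline are assumptions for illustration, and this is not presented as how oneSIS itself configures nodes.

```python
def nfsroot_cmdline(server_ip, root_dir, nfs_options=("ro",)):
    """Build kernel parameters for a root file system mounted over NFS.

    Follows the kernel's documented form:
    nfsroot=<server-ip>:<root-dir>[,<nfs-options>]
    """
    nfsroot = f"{server_ip}:{root_dir}"
    if nfs_options:
        nfsroot += "," + ",".join(nfs_options)
    return f"root=/dev/nfs nfsroot={nfsroot} ip=dhcp"


# Hypothetical head node address and exported root directory.
cmdline = nfsroot_cmdline("10.0.0.1", "/export/nodes/root")
print(cmdline)
```

A string like this would typically be appended to the kernel line in the network boot configuration served to the compute nodes.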

Final Thoughts

Today’s array of options for cluster administrators is much better than what was available in the past. It includes both freely available packages and commercially supported options (or combinations of both). Your choice depends on your capabilities and needs. It may also be valuable to use a consultant with HPC experience to help make the right choices for your cluster.

Installing an HPC cluster is much easier than it used to be. The right choice of methodology and support level depends on your needs and budget. As with any maturing technology, choices and compute cycles abound.