EGSC HPCC Operating Systems and Software
Operating Systems

The 16 work nodes and the Worldly (Master Control) Node of the EGSC HPCC all run Linux, though not all the same version. The Worldly Node was created with the Red Hat 9 (Shrike) distribution. Nodes 1-5, 7, and 11 run version 2.6.27.5-117.fc10.x86_64 of the Linux kernel as provided by the Fedora Core 10 distribution. Nodes 6, 8-10, and 12-16 run version 2.6.16.21-0.8-smp of the Linux kernel as set up using SUSE Enterprise Linux 10. Each node runs its own installation of the OS from its own local disk.

The Worldly Node operates at run level 5, which provides access to a graphical desktop. The additional processes required for run level 5 impose a small but noticeable overhead on the system, but this is acceptable because the Worldly Node functions within the Cluster primarily as the front end for control and communication. (In particular, the Web server software runs on the Worldly Node and provides the focus for user interfaces, which are expected to be primarily Web based.)

By contrast, the work nodes typically operate at run level 3 (without the graphical desktop processes running) so that they can devote as much RAM and other processing resources as possible to computational applications. Each work node is nevertheless configured so that it may be rebooted into run level 5 during system maintenance. To free up further resources on the work nodes, the collection of background processes (daemons) launched on run-level initiation has been pared down to just those needed for computation and, to a lesser extent, for ease of administration, so that shutdowns and reboots can be minimized.

Although many Beowulf clusters use a distributed file system such as Network File System (NFS), the HPCC does not use NFS or any other distributed file system. Distributed file systems impose a significant overhead on clusters, which detracts from computational performance. The primary role of the HPCC is to provide a controlled and predictable computational environment, and the operations that a distributed file system supports can be performed through straightforward alternative means, such as secure-shell software, Message Passing Interface (MPI), and scripts.

The Windows adjunct node runs under Windows XP® Professional.
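As a rough illustration of the script-based alternative to a distributed file system, the sketch below builds one secure-copy (scp) command per work node so that an input file can be staged on each node's local disk. The node numbers come from the description above; the `nodeNN` host names, the file name, and the remote directory are hypothetical placeholders, not the HPCC's actual configuration.

```python
import shlex

# Work-node numbers grouped by operating system, as listed in the text.
FEDORA_NODES = [1, 2, 3, 4, 5, 7, 11]
SUSE_NODES = [6, 8, 9, 10, 12, 13, 14, 15, 16]


def node_hostname(number):
    """Return a host name for a work node (the 'nodeNN' scheme is an assumption)."""
    return f"node{number:02d}"


def build_scp_commands(local_path, remote_dir, nodes):
    """Build one scp argument list per node; nothing is executed here."""
    commands = []
    for number in sorted(nodes):
        target = f"{node_hostname(number)}:{remote_dir}"
        # -p asks scp to preserve modification times and modes.
        commands.append(["scp", "-p", local_path, target])
    return commands


if __name__ == "__main__":
    all_nodes = FEDORA_NODES + SUSE_NODES
    # 'input.dat' and '/tmp/job/' are placeholder paths for illustration.
    for argv in build_scp_commands("input.dat", "/tmp/job/", all_nodes):
        # Print the commands for review; passing argv to
        # subprocess.run(argv, check=True) would execute them.
        print(shlex.join(argv))
```

Printing the commands rather than running them keeps the sketch safe to try anywhere; on a real cluster the same list could be handed to `subprocess.run` one entry at a time.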
Other Software

Additional software available to users and developers on the HPCC includes:
None of the remote-shell family of software (such as rsh and rcp) is available on the HPCC because of the security vulnerabilities this family introduces. Users will find the secure-shell alternatives more than satisfactory replacements, except when large volumes of data need to be moved over the HPCC's private network. For large-volume transfers, all work nodes provide Very-Secure FTP (VSFTP) servers; because of security concerns, the Worldly Node does not run any FTP server. Using secure shell rather than remote shell does not diminish throughput for parallel computations, because the versions of the Message Passing Interface (MPI) used on the HPCC use secure shell only to initiate jobs on the work nodes. Once jobs have been initiated, MPI facilities take over both control and data transmissions for parallel MPI jobs.
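The choice between secure copy and FTP described above might be scripted roughly as follows. This is only a sketch: the 1 GiB cutoff is an assumed figure (the text does not say what counts as a large volume), and the host, user name, and password passed to the upload helper are placeholders that a real script would have to supply. The upload uses Python's standard `ftplib` module against a work node's FTP server.

```python
import ftplib
import os

# Assumed cutoff for a "large" transfer; the text gives no figure.
LARGE_TRANSFER_BYTES = 1 * 1024**3


def choose_transfer_method(size_bytes, threshold=LARGE_TRANSFER_BYTES):
    """Return 'ftp' for large transfers and 'scp' otherwise."""
    return "ftp" if size_bytes >= threshold else "scp"


def upload_via_ftp(host, user, password, local_path):
    """Upload one file to a work node's FTP server and return its remote name."""
    remote_name = os.path.basename(local_path)
    # FTP(host) connects immediately; the with-block closes the session.
    with ftplib.FTP(host) as ftp, open(local_path, "rb") as handle:
        ftp.login(user=user, passwd=password)
        ftp.storbinary(f"STOR {remote_name}", handle)
    return remote_name
```

Because only the work nodes run FTP servers, `upload_via_ftp` would be pointed at a work node and never at the Worldly Node.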