EGSC HPCC Operational Web Site

Help Using the Message Passing Interface (MPI)

About the Message Passing Interface (MPI)

The Message Passing Interface is a specification for one particular approach to enabling communication and coordination among processes running in parallel on multiple independent but connected computing nodes. Whether the Message Passing Interface (MPI) is the best approach for a particular computational problem cannot be answered in general, but MPI is widely used on Beowulf clusters, and it gives developers of parallel-processing algorithms considerable flexibility in algorithm design along with opportunities to keep the overhead of inter-node communication relatively low. (The realization of these opportunities, of course, depends on the developer's skill and on the efficiency of the MPI implementation.) Power and flexibility come at the cost of complexity — effective use of the Message Passing Interface in designing and implementing parallel algorithms and software requires that an already proficient scientific programmer become familiar with the concepts of MPI and then gain experience applying MPI to a variety of computational problems. Two languages widely used for Beowulf software development (with or without MPI) are FORTRAN and C; a number of MPI implementations, both commercial and open-source, are available for use with these two high-level, procedural programming languages.

The MPI topic page in the General Information Section provides additional introductory information about MPI.

Why Two Implementations of MPI?

The Message Passing Interface is a specification: it defines the syntax and semantics of additional statements to be added to the standard statements of a high-level, procedural programming language (such as FORTRAN or C) in order to extend the expressive range of the language with parallel-processing constructs. The MPI specification does not dictate how an MPI implementation is to get from source code, with MPI statements embedded, to real, functioning software; it dictates only the functionality that MPI is to add.
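
As a concrete illustration (a minimal sketch, not code taken from an EGSC HPCC application), the short C program below is ordinary C in which the MPI additions appear simply as calls to functions and constants declared in the implementation's mpi.h header; any conforming implementation, including LAM-MPI and MPICH-2, should accept it unchanged.

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, size;

        MPI_Init(&argc, &argv);                /* start the MPI environment           */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* which copy of the program is this?  */
        MPI_Comm_size(MPI_COMM_WORLD, &size);  /* how many copies are running in all? */

        printf("Hello from process %d of %d\n", rank, size);

        MPI_Finalize();                        /* shut the MPI environment down       */
        return 0;
    }

The same pattern applies in FORTRAN, where the MPI additions appear as subroutine calls.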

MPI specifies that a program, written as a single program, can be initiated (launched) on any number of the nodes of a Beowulf or other computational cluster so that the resulting copies run in parallel: each copy can identify itself individually, perform blocks of computation without outside interference, and also communicate with all of the other copies. Conceptually, the cleanest way to implement MPI might be: first, to create an operating system (OS) specifically designed for parallel cluster processing and equipped with low-level system calls to support interprocess communication among processes distributed over a number of nodes; and second, to create new language compilers which understand MPI constructs and syntax and produce executables tailored to the underlying parallel-cluster-processing operating system. Although the custom-OS/custom-compiler approach has intellectual appeal, it implies a substantial amount of development for implementers and significant constraints and specialization for users.
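
Before turning to how implementers actually realize this model, a minimal sketch of the model itself may be helpful (an assumed, illustrative example rather than code from this site): each copy of the program works independently on its own share of a problem, identified by its rank, and the copies then communicate to combine their partial results.

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, size;
        long i, local_count = 0, total_count = 0;
        const long N = 10000000L;   /* total number of items, divided among the copies */

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Each copy independently examines every size-th item, starting at its rank. */
        for (i = rank; i < N; i += size)
            if (i % 7 == 0)         /* stand-in for real per-item computation */
                local_count++;

        /* The copies then communicate: the partial counts are summed onto rank 0. */
        MPI_Reduce(&local_count, &total_count, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("Total count: %ld\n", total_count);

        MPI_Finalize();
        return 0;
    }

The same executable would typically be launched on the desired number of processes with a command along the lines of mpirun -np 8 ./program (LAM-MPI) or mpiexec -n 8 ./program (MPICH-2); each copy discovers its own rank at run time.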


Actual implementers have (reasonably) chosen to work with existing compilers, operating systems, and communication techniques. Consequently, implementations of MPI tend to have various strengths and weaknesses resulting from the particular combination of techniques applied to the problem of emulating a unified cluster OS on a cluster that actually consists of independent nodes, each running its own OS. Two implementations of MPI — LAM-MPI and MPICH-2 — have been made available on the EGSC HPCC to provide developers with a choice; developers may choose the implementation which provides the compromise of capabilities best suited to their particular computational problems.

Choosing LAM-MPI or MPICH-2

To choose between LAM-MPI and MPICH-2, a developer may write a simple, throw-away benchmarking application which is, in some reasonable sense, representative of the kind of processing and communication likely to be found in the final, working application. The developer can then run the benchmark under each implementation and make a choice. Alternatively, the developer can develop the application under one implementation, create a version for the other, and compare the two through short test runs. A program developed under one implementation ordinarily requires at least recompilation, and sometimes small source changes, before it can be run under the other, though the effort involved tends to be small relative to the overall development effort. Although it has not yet been tested, there appears to be no reason why LAM-MPI and MPICH-2 applications cannot run on the EGSC HPCC simultaneously, so the choice of implementation for a particular application need not be influenced by the choices made for other applications.
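
As one possibility (an assumed, illustrative sketch rather than a benchmark supplied with the EGSC HPCC), the throw-away benchmark could be as simple as the ping-pong timing program below, compiled with each implementation's compiler wrapper (typically mpicc) and run under each in turn; the message size and repetition count would be adjusted to resemble the traffic expected in the real application.

    #include <stdio.h>
    #include <mpi.h>

    #define MSG_SIZE 100000   /* bytes per message; adjust to match the real application */
    #define REPS     1000     /* number of round trips to time                           */

    int main(int argc, char *argv[])
    {
        static char buf[MSG_SIZE];   /* message buffer (contents are irrelevant) */
        int rank, i;
        double start, elapsed;
        MPI_Status status;

        /* Assumes the program is launched with at least two processes. */
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        MPI_Barrier(MPI_COMM_WORLD);
        start = MPI_Wtime();

        /* Ranks 0 and 1 bounce a fixed-size message back and forth. */
        for (i = 0; i < REPS; i++) {
            if (rank == 0) {
                MPI_Send(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &status);
            } else if (rank == 1) {
                MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &status);
                MPI_Send(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }

        elapsed = MPI_Wtime() - start;
        if (rank == 0)
            printf("Average round trip: %g seconds\n", elapsed / REPS);

        MPI_Finalize();
        return 0;
    }

The program must be started with at least two processes; ranks other than 0 and 1 take no part in the timing loop.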

Additional Information on Using MPI

The Developers' Information Section of this Web site provides additional information about how to use LAM-MPI and MPICH-2 on the EGSC HPCC.
