Virtual Shared Memory vs. Message-Passing
Existing software tools generally take one of two major approaches to
parallel program execution: message passing or virtual shared memory computing.
These two paradigms differ in many ways, but most importantly in their
approaches to storing the data that is shared among the various components of a
parallel program and to making the data available to the components that need it
as the program runs.
Message passing is a model that arises directly from the architecture of
distributed memory multiprocessors and networks of workstations; the best known
examples are MPI, a widely implemented standard message passing system, and PVM,
a system developed at Oak Ridge National Laboratory. In this approach, each
program datum belongs to some specific process, and it must be explicitly
transmitted to any other processes that need it as the program progresses.
Sending and receiving a single such message requires many steps by both the
transmitting and receiving processes, and parallel programs built with message
passing systems typically send a great many messages in the course of a single
run.
Virtual Shared Memory
The virtual shared memory (VSM) approach is built around a familiar paradigm
for writing parallel programs: multiple processes interacting and
communicating through shared memory. The best known products of this type
are SCAI's Linda® and Paradise® systems.
Linda provides a single,
logically shared memory to all of the processes in a parallel program. Each
process sees the same data space, and it can read or write shared data at will
using simple operations, often comprising only a single line of code. No process
ever has to worry about communicating directly with any other process; all such
low-level operations are handled by the system itself. The VSM works in this way
regardless of whether it resides physically in a single memory, or (as is more
often the case) it is actually distributed among the various processors
participating in a program execution.
Paradise generalizes the Linda model by offering multiple VSMs that can exist
independently of any particular application. This means that VSMs can be used to
share information among applications that run at entirely different times. In
addition to the capabilities of Linda, Paradise also includes a number of
features that are aimed more at flexible distributed computing environments than
at "pedal to the metal" parallel applications.
VSM-based systems like
Linda and Paradise also differ from message passing libraries in that they
include high-level coordination languages for parallel programming. They add
functionality to a standard programming language like C, C++, or Fortran for
managing ensembles of independent processes, usually via a small set of simple
operations that programmers use to implement parallelism.
The VSM approach has a number of benefits:
- It is very simple to learn and to use, enabling existing programs to be
parallelized rapidly and new ones to be developed easily.
- It makes creating portable programs much less complicated, since
architecture-specific low-level details are hidden from the user.
- It enables advanced parallel execution features like dynamic load balancing
to be implemented easily.
In order to illustrate the differences between the message passing and
VSM approaches to parallel programming in a concrete way, we present some sample
code taken from the PVM documentation, along with the code required to perform
the identical tasks using the C-Linda system. The entire program uses a
"master-worker" design in which the master handles data input and output and
oversees the entire execution. The program fragments below, which would be
executed by each of the component "worker" processes in the parallel program,
implement three distinct phases in the computation:
- obtain the necessary input data from the master
- perform a calculation (in a function work that is not shown)
- return the results to the master