Only the basic derived data types and some simple but useful applications are discussed here. Full details can be found in Chapter 3, Section 12 of "MPI: A Message-Passing Interface Standard". Transferring an entire array of integers or reals is relatively straightforward in MPI. When multiple non-contiguous sections of an array, or a mixture of integer and real variables, must be passed from one process to another, it is often more efficient to pack all of the data into one buffer and transfer it as a single message than to send a separate message for each section. Many programs have their own packing utilities for portability; for compatibility, MPI provides its own intrinsic packing and unpacking functions, MPI_Pack and MPI_Unpack, respectively. These intrinsic routines allow you to pack a buffer with any type of data, send it as an undifferentiated byte stream, receive that byte data, and unpack it.
Because MPI attempts to buffer messages, it makes sense to have MPI pack your data as it copies the data into a communications buffer or data area, thereby avoiding explicit use of packing routines and reducing memory requirements. There is a price for this convenience: you must declare the arrangement to MPI as a derived data type through a function call. This data typing is quite different from that found in languages such as Fortran 90 (F90), C, and C++: MPI data typing describes the layout of the data in memory, whereas data typing within a programming language groups disparate intrinsic data elements into a single object that can be referenced and manipulated with operators.
Whether the data of the derived data type is actually packed into a system buffer area or is retrieved directly from its original location and sent directly to a process is implementation dependent. For instance, the MPI standard allows the low-level routines of the implementation to bypass packing a buffer if a receive has been posted. It is within the specification of the standard that data can be sent directly from its storage locations and reconstructed on the receiving process.
As an example of an MPI derived data type, consider sending columns of an NxN matrix to another process. Instead of packing each column element into a vector and sending the vector to another process, a column data type (coltype) can be defined with an MPI function and then used as the data type in a send function argument list. The only difference in the MPI_Send call is that the count is in units of the derived data type and the data type is given by its derived data type variable, not by an intrinsic data type constant. For example,
ierr = MPI_Send(&a[0][i], 1, coltype, idest, itag, icomm);
will send the column of elements a[0][i], a[1][i], ..., a[N-1][i].
The predefined (intrinsic) data types of MPI are defined in the mpi.h include file. Derived data types are built on these basic intrinsic types, which include:
| Elementary MPI Data Type | Corresponding C Data Type |
|---|---|
| MPI_CHAR | char (compiler default) |
| MPI_SHORT | short (compiler default) |
| MPI_INT | int (compiler default) |
| MPI_LONG | long (compiler default) |
| MPI_FLOAT | float (compiler default) |
| MPI_DOUBLE | double (compiler default) |
| MPI_LONG_DOUBLE | long double (compiler default) |
| MPI_BYTE | BYTE (eight bit octets) |
| MPI_PACKED | undifferentiated, used to specify the count in bytes. |
| MPI_UNSIGNED_CHAR/_SHORT/_LONG | unsigned char/short/long (compiler default) |
New data types are defined by the MPI_Type_arrangement functions (for example, MPI_Type_contiguous, MPI_Type_vector, and MPI_Type_indexed), which describe the locations of intrinsic data type elements or previously defined derived data type elements. The locations (arrangements) can be contiguous, block-replicated, or indexed. The new data type is returned through an MPI_Datatype variable (newtype) in the function's argument list (e.g., MPI_Type_vector(..., newtype)).
Before a new data type can be used in a message-passing call, it must be
committed with the MPI_Type_commit function. A reference (pointer)
to the new MPI_Datatype, newtype, is passed in the committing call:
ierr= MPI_Type_commit(newtype)
where:
| Parameter | Description | Status |
|---|---|---|
| newtype (MPI_Datatype *) | data type to be committed (NOTE: newtype is a pointer.) | [IN] |
The simplest data type function, MPI_Type_contiguous, defines a contiguous sequence of data as a single data type element with the following syntax:
ierr= MPI_Type_contiguous(icount, oldtype, newtype)
where:
| Parameter | Description | Status |
|---|---|---|
| icount (int) | number of elements of oldtype data type | [IN] |
| oldtype (MPI_Datatype) | an intrinsic or previously defined data type (NOTE: oldtype is passed by value.) | [IN] |
| newtype (MPI_Datatype *) | the new data type definition (NOTE: newtype is a pointer.) | [OUT] |
The following section of code defines, commits, and uses a contiguous type to send M rows of a real NxN matrix to another process:
MPI_Datatype contigtype;
ierr= MPI_Type_contiguous(N, MPI_DOUBLE, &contigtype);
ierr= MPI_Type_commit(&contigtype);
ierr= MPI_Send(&a[irow][0], M, contigtype, idest, itag, icomm);
The vector type function, MPI_Type_vector, is very convenient and simple to use. The type it defines consists of equally spaced blocks of contiguous data. The distance from the first element of one block to the first element of the next block is the stride, measured in units of the defining data type (oldtype). The syntax is:
ierr= MPI_Type_vector(iblks, iblklen, istride, oldtype, newtype)
where:
| Parameter | Description | Status |
|---|---|---|
| iblks (int) | number of blocks | [IN] |
| iblklen (int) | number of oldtype elements in each block | [IN] |
| istride (int) | number of elements between the start of each block | [IN] |
| oldtype (MPI_Datatype) | data type of elements in block (an intrinsic or previously defined data type). This is also the data type of strided-over elements. (NOTE: oldtype is passed by value.) | [IN] |
| newtype (MPI_Datatype *) | new data type definition (NOTE: newtype is a pointer.) | [OUT] |
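To make the stride arithmetic concrete, the plain C sketch below (no MPI calls) lists which elements a vector layout selects, as offsets in units of oldtype from the start of the send buffer. The function name vector_offsets is hypothetical and exists only for this illustration; the parameter names follow the table above.

```c
/* Hypothetical illustration: fill offsets[] with the element offsets
   (in units of oldtype) covered by a vector layout, and return how many
   elements are selected. offsets must hold iblks * iblklen entries. */
int vector_offsets(int iblks, int iblklen, int istride, long *offsets)
{
    int n = 0;
    for (int b = 0; b < iblks; b++)          /* each block ...            */
        for (int k = 0; k < iblklen; k++)    /* ... is iblklen elements   */
            offsets[n++] = (long)b * istride + k;  /* starting at b*stride */
    return n;
}
```

For three blocks of three elements with a stride of four, this yields offsets 0, 1, 2, 4, 5, 6, 8, 9, 10: every fourth element is skipped.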
The following diagram depicts a vector data type of three 3-element blocks (e.g., a 3x3 matrix within a 4x4 matrix).

The following code section shows how to use a vector type to send a 3x3 matrix, embedded in a larger NxN matrix, to another process:
MPI_Datatype vectype;
double a[N][N];
...
ierr= MPI_Type_vector(3,3,N, MPI_DOUBLE, &vectype);
ierr= MPI_Type_commit(&vectype);
ierr= MPI_Send(&a[j][i], 1, vectype, idest,itag,icomm);
The MPI_Type_hvector function is identical to MPI_Type_vector except that the stride is given in bytes.
The most general arrangement of data of a single type can be defined by using the MPI_Type_indexed function. It uses two index arrays to define the block lengths and storage locations. You can think of MPI_Type_indexed as a generalized MPI_Type_vector function that permits variable block sizes and replaces a constant stride with an array of block locations. The syntax is:
ierr= MPI_Type_indexed(icount, iblklen, iblkloc, oldtype, newtype)
where:
| Parameter | Description | Status |
|---|---|---|
| icount (int) | number of blocks | [IN] |
| iblklen (int *) | length of each block (int array) | [IN] |
| iblkloc (int *) | location of each block, in units of oldtype (int array) | [IN] |
| oldtype (MPI_Datatype) | data type of elements, an intrinsic or previously defined data type | [IN] |
| newtype (MPI_Datatype *) | new data type definition (NOTE: newtype is a pointer.) | [OUT] |



