TACC

Derived Data Types

Only the basic derived data types and some simple but quite useful applications are discussed here. Full details can be found in Chapter 3, Section 12 of "MPI: A Message-Passing Interface Standard".

Transferring an entire array of integers or reals is relatively straightforward in MPI. When multiple non-contiguous sections of an array, or a mixture of integer and real variables, must be passed from one process to another, it is often more efficient to pack all of the data into one array and transfer it as a single message than to send a separate message for each section. Many programs have their own packing utilities for portability, and MPI provides its own packing and unpacking subroutines, mpi_pack and mpi_unpack, for compatibility. These routines allow you to pack an array with any type of data, send it as an undifferentiated byte stream, receive that byte data, and unpack it.
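The packing approach described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the program name pack_demo, the buffer size nbuf, and the variable names are assumptions, the `mpi` module is assumed to be available in place of the mpif.h include file, and the process sends the packed buffer to itself with mpi_sendrecv so that the sketch runs on a single process.

```fortran
program pack_demo
  use mpi
  implicit none
  integer, parameter :: nbuf = 256      ! buffer size in bytes (assumed large enough)
  character(len=1) :: sbuf(nbuf), rbuf(nbuf)
  integer :: ierr, irank, ipos, n, m
  integer :: istatus(MPI_STATUS_SIZE)
  double precision :: x(3), y(3)

  call mpi_init(ierr)
  call mpi_comm_rank(MPI_COMM_WORLD, irank, ierr)

  n = 3
  x = (/ 1.0d0, 2.0d0, 3.0d0 /)

  ! Pack one integer followed by three double-precision values.
  ! ipos is advanced by each call and ends up as the packed length in bytes.
  ipos = 0
  call mpi_pack(n, 1, MPI_INTEGER, sbuf, nbuf, ipos, MPI_COMM_WORLD, ierr)
  call mpi_pack(x, 3, MPI_DOUBLE_PRECISION, sbuf, nbuf, ipos, MPI_COMM_WORLD, ierr)

  ! Send the packed bytes as a single MPI_PACKED message (here, to ourselves).
  call mpi_sendrecv(sbuf, ipos, MPI_PACKED, irank, 0, &
                    rbuf, nbuf, MPI_PACKED, irank, 0, &
                    MPI_COMM_WORLD, istatus, ierr)

  ! Unpack in the same order in which the data were packed.
  ipos = 0
  call mpi_unpack(rbuf, nbuf, ipos, m, 1, MPI_INTEGER, MPI_COMM_WORLD, ierr)
  call mpi_unpack(rbuf, nbuf, ipos, y, 3, MPI_DOUBLE_PRECISION, MPI_COMM_WORLD, ierr)

  print *, 'unpacked:', m, y
  call mpi_finalize(ierr)
end program pack_demo
```

In a real application the send and receive would normally be separate mpi_send and mpi_recv calls on different processes.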

Because MPI attempts to buffer messages, it makes sense to have MPI pack your data as it copies the data into a communications buffer or data area, thereby avoiding explicit use of packing routines and reducing memory requirements. There is a price for this convenience: you must declare the arrangement to MPI as a derived data type through a subroutine call. This data typing is quite different from that found in some languages, such as FORTRAN-90 (F90): MPI data typing describes the layout of the data in memory, whereas data typing within a programming language (like F90) groups disparate intrinsic data elements into a single object that can be referenced and manipulated with operators.

Whether the data of the derived data type is actually packed into a system buffer area or is retrieved directly from its original location and sent directly to a process is implementation dependent. For instance, the MPI standard allows the low-level routines of the implementation to bypass packing a buffer if a receive has been posted. It is within the specification of the standard that data can be sent directly from its storage locations and reconstructed on the receiving process.

As an example of an MPI derived data type, consider sending rows of an NxN matrix to another process. Instead of packing each row into a vector and sending the vector to another process, a row data type (irowtype) can be defined with an MPI subroutine call and then used as the data type in a send subroutine argument list. The only differences in the mpi_send call are that the count is in units of the derived data type and that the data type is given by its integer reference, not as an intrinsic data type parameter. For example,

call mpi_send (a(i,1), 1, irowtype, idest, itag, icomm, ierr)

will send the row of elements A(i,1), A(i,2), ..., A(i,N).
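One way irowtype could be defined is sketched below, assuming the row type is built with mpi_type_vector (described under Vector Data Types): in Fortran's column-major storage, a row is N single-element blocks that are N elements apart. The program name row_demo, the matrix size, and the use of the `mpi` module and of default REAL data are assumptions for illustration. The process sends the row to itself with mpi_sendrecv so that the sketch runs on one process.

```fortran
program row_demo
  use mpi
  implicit none
  integer, parameter :: n = 4
  real :: a(n,n), row(n)
  integer :: irowtype, ierr, irank, i, j
  integer :: istatus(MPI_STATUS_SIZE)

  call mpi_init(ierr)
  call mpi_comm_rank(MPI_COMM_WORLD, irank, ierr)

  ! a(i,j) = 10*i + j makes each element easy to identify.
  do j = 1, n
    do i = 1, n
      a(i,j) = real(10*i + j)
    end do
  end do

  ! A row is n blocks of one element each, n elements apart in memory.
  call mpi_type_vector(n, 1, n, MPI_REAL, irowtype, ierr)
  call mpi_type_commit(irowtype, ierr)

  ! Send row i as one irowtype element; receive it as n contiguous reals.
  i = 2
  call mpi_sendrecv(a(i,1), 1, irowtype, irank, 0, &
                    row, n, MPI_REAL, irank, 0, &
                    MPI_COMM_WORLD, istatus, ierr)

  print *, 'row', i, ':', row
  call mpi_type_free(irowtype, ierr)
  call mpi_finalize(ierr)
end program row_demo
```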

Intrinsic Data Types

The predefined (intrinsic) data types of MPI are given as parameters in the mpif.h include file. Derived data types are built on these intrinsic types, which include:

Elementary MPI Data Type Corresponding FORTRAN Data Type
MPI_REAL REAL (compiler default)
MPI_INTEGER INTEGER (compiler default)
MPI_DOUBLE_PRECISION DOUBLE PRECISION (compiler default)
MPI_CHARACTER CHARACTER (compiler default)
MPI_LOGICAL LOGICAL (compiler default)
MPI_BYTE BYTE (eight bit octets)
MPI_PACKED undifferentiated, used to specify the count in bytes.
MPI_REAL8* 8-byte REAL
MPI_REAL4* 4-byte REAL
MPI_INTEGER4* 4-byte INTEGER
* These data types are optional and may not appear in all MPI implementations.

New data types are defined by the mpi_type_<arrangement> subroutines (for example, mpi_type_contiguous, mpi_type_vector, and mpi_type_indexed), which describe the locations of intrinsic data type elements or previously defined derived data type elements. The locations (arrangements) can be contiguous, block-replicated, or indexed. A new data type is returned as an integer reference value (inewtype) in the argument list of the defining subroutine (e.g., mpi_type_struct(..., inewtype, ...)).

Before a new data type can be used in a message-passing call, it must be committed with the mpi_type_commit subroutine. If itype is the newly defined data type, the syntax is:

call mpi_type_commit (itype, ierr)

where:

Parameter Description Status
itype data type to be committed [INOUT]
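The order of operations (define, commit, use, release) can be sketched as below. This is only an illustration under stated assumptions: the program name commit_demo is hypothetical, the `mpi` module is assumed in place of mpif.h, and mpi_type_size and mpi_type_free are standard MPI routines that are not otherwise covered in this text.

```fortran
program commit_demo
  use mpi
  implicit none
  integer :: itype, isize, ierr

  call mpi_init(ierr)

  ! Define a new type (five contiguous integers), then commit it
  ! before using it in any send or receive call.
  call mpi_type_contiguous(5, MPI_INTEGER, itype, ierr)
  call mpi_type_commit(itype, ierr)

  ! itype may now be passed as the data type of a message-passing call.
  ! Here we only query how many bytes of data one element describes.
  call mpi_type_size(itype, isize, ierr)
  print *, 'bytes of data in one itype element:', isize

  ! Release the type once it is no longer needed.
  call mpi_type_free(itype, ierr)
  call mpi_finalize(ierr)
end program commit_demo
```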

Contiguous Data Types

The simplest type subroutine, mpi_type_contiguous, defines a contiguous sequence of data as a single data type element with the following syntax:

call mpi_type_contiguous (icount, ioldtype, inewtype, ierr )

where:

Parameter Description Status
icount number of elements of ioldtype data type [IN]
ioldtype an intrinsic or previously defined data type [IN]
inewtype the new data type definition [OUT]

The following section of code defines, commits, and uses a contiguous type to send M columns of a real NxN matrix to another process:

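! One icontigtype element is one full column (N contiguous values).
	call mpi_type_contiguous(N, MPI_REAL8, icontigtype, ierr)
	call mpi_type_commit(icontigtype, ierr)
! Send M consecutive columns, starting at column icol.
	call mpi_send(a(1,icol), M, icontigtype, idest, itag, icomm, ierr)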
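A complete program built around this fragment might look like the following sketch. The program name contig_demo and the sizes n and m are assumptions; it uses DOUBLE PRECISION data with MPI_DOUBLE_PRECISION instead of the optional MPI_REAL8, assumes the `mpi` module in place of mpif.h, and sends to its own rank with mpi_sendrecv so that it runs on a single process.

```fortran
program contig_demo
  use mpi
  implicit none
  integer, parameter :: n = 4, m = 2
  double precision :: a(n,n), b(n,m)
  integer :: icontigtype, ierr, irank, icol, i, j
  integer :: istatus(MPI_STATUS_SIZE)

  call mpi_init(ierr)
  call mpi_comm_rank(MPI_COMM_WORLD, irank, ierr)

  do j = 1, n
    do i = 1, n
      a(i,j) = dble(10*i + j)
    end do
  end do

  ! One icontigtype element describes one column of n values.
  call mpi_type_contiguous(n, MPI_DOUBLE_PRECISION, icontigtype, ierr)
  call mpi_type_commit(icontigtype, ierr)

  ! Transfer m columns starting at column icol into b.
  icol = 2
  call mpi_sendrecv(a(1,icol), m, icontigtype, irank, 0, &
                    b, m, icontigtype, irank, 0, &
                    MPI_COMM_WORLD, istatus, ierr)

  ! b should now equal a(:, icol:icol+m-1).
  print *, 'columns match:', all(b == a(:, icol:icol+m-1))
  call mpi_type_free(icontigtype, ierr)
  call mpi_finalize(ierr)
end program contig_demo
```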

Vector Data Types

The vector type subroutine, mpi_type_vector, is convenient and simple to use. A vector type consists of equally spaced blocks of contiguous data. The distance from the first element of one block to the first element of the next block is the stride, measured in units of the defining data type. The syntax is:

call mpi_type_vector (iblks, iblklen, istride, ioldtype, inewtype, ierr)

where:

Parameter Description Status
iblks number of blocks [IN]
iblklen number of ioldtype elements in each block [IN]
istride number of ioldtype elements between the starts of consecutive blocks [IN]
ioldtype data type of the elements in each block (an intrinsic or previously defined data type); this is also the data type of the strided-over elements [IN]
inewtype new data type definition [OUT]

The accompanying diagram (matrix data representation) depicts a vector data type of three 3-element blocks (a 3x3 matrix within a 4x4 matrix). The following code section shows how to use a vector type to send a 3x3 matrix, embedded in a larger NxN matrix, to another process:

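	real*8 a(N,N)
        ...
! Three blocks (columns) of three elements each; block starts are N apart.
	call mpi_type_vector(3, 3, N, MPI_REAL8, ivectype, ierr)
	call mpi_type_commit(ivectype, ierr)
! a(i,j) is the upper-left corner of the 3x3 submatrix.
	call mpi_send(a(i,j), 1, ivectype, idest, itag, icomm, ierr)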
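The fragment above can be expanded into a small self-checking sketch. The program name vector_demo, the matrix size, and the corner indices are assumptions; DOUBLE PRECISION data with MPI_DOUBLE_PRECISION stands in for the optional MPI_REAL8, the `mpi` module is assumed in place of mpif.h, and the process sends to itself with mpi_sendrecv so that one process is enough to run it.

```fortran
program vector_demo
  use mpi
  implicit none
  integer, parameter :: n = 4
  double precision :: a(n,n), sub(3,3)
  integer :: ivectype, ierr, irank, i, j
  integer :: istatus(MPI_STATUS_SIZE)

  call mpi_init(ierr)
  call mpi_comm_rank(MPI_COMM_WORLD, irank, ierr)

  do j = 1, n
    do i = 1, n
      a(i,j) = dble(10*i + j)
    end do
  end do

  ! Three blocks of three elements; consecutive blocks start n elements apart.
  call mpi_type_vector(3, 3, n, MPI_DOUBLE_PRECISION, ivectype, ierr)
  call mpi_type_commit(ivectype, ierr)

  ! Send the 3x3 submatrix whose upper-left corner is a(i,j) and
  ! receive it as nine contiguous values.
  i = 2
  j = 2
  call mpi_sendrecv(a(i,j), 1, ivectype, irank, 0, &
                    sub, 9, MPI_DOUBLE_PRECISION, irank, 0, &
                    MPI_COMM_WORLD, istatus, ierr)

  ! sub should now equal a(i:i+2, j:j+2).
  print *, 'submatrix matches:', all(sub == a(i:i+2, j:j+2))
  call mpi_type_free(ivectype, ierr)
  call mpi_finalize(ierr)
end program vector_demo
```

Receiving with a plain count of nine elements works because the type signatures match: one ivectype element carries nine values of the underlying type.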

The mpi_type_hvector subroutine is identical to mpi_type_vector except that the stride is given in bytes.

Indexed Data Types

The most general arrangement of data can be defined by using the mpi_type_indexed subroutine. It uses two integer arrays to define the block lengths and storage locations. You can think of mpi_type_indexed as a generalized mpi_type_vector subroutine that permits variable block sizes and replaces the stride with block locations held in an integer array. The syntax is:

call mpi_type_indexed (icount, ivblklen, ivblkloc, ioldtype, inewtype, ierr)

where:

Parameter Description Status
icount number of blocks [IN]
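ivblklen integer array giving the length of each block, in elements of ioldtype [IN]
ivblkloc integer array giving the location of each block, as a displacement from the start of the buffer in units of ioldtype [IN]
ioldtype data type of elements, an intrinsic or previously defined data type [IN]
inewtype new data type definition [OUT]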
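As an illustration of variable block sizes, the following sketch uses mpi_type_indexed to describe the upper triangle of a matrix: column j contributes a block of j elements that starts (j-1)*n elements past the first element. The program name indexed_demo, the matrix size, and the choice of the upper triangle are assumptions made for this example; it also assumes the `mpi` module in place of mpif.h and sends to its own rank with mpi_sendrecv so that it runs on one process.

```fortran
program indexed_demo
  use mpi
  implicit none
  integer, parameter :: n = 4, ntri = n*(n+1)/2
  double precision :: a(n,n), tri(ntri)
  integer :: ivblklen(n), ivblkloc(n)
  integer :: itritype, ierr, irank, i, j, k
  integer :: istatus(MPI_STATUS_SIZE)
  logical :: ok

  call mpi_init(ierr)
  call mpi_comm_rank(MPI_COMM_WORLD, irank, ierr)

  do j = 1, n
    do i = 1, n
      a(i,j) = dble(10*i + j)
    end do
  end do

  ! Block j holds a(1:j,j): j elements, located (j-1)*n elements from a(1,1).
  ! Locations are zero-based offsets in units of the old data type.
  do j = 1, n
    ivblklen(j) = j
    ivblkloc(j) = (j-1)*n
  end do

  call mpi_type_indexed(n, ivblklen, ivblkloc, MPI_DOUBLE_PRECISION, &
                        itritype, ierr)
  call mpi_type_commit(itritype, ierr)

  ! Send the whole triangle as one element; receive it as a packed vector.
  call mpi_sendrecv(a, 1, itritype, irank, 0, &
                    tri, ntri, MPI_DOUBLE_PRECISION, irank, 0, &
                    MPI_COMM_WORLD, istatus, ierr)

  ! Compare the received vector with the triangle, column by column.
  ok = .true.
  k = 0
  do j = 1, n
    do i = 1, j
      k = k + 1
      if (tri(k) /= a(i,j)) ok = .false.
    end do
  end do
  print *, 'upper triangle received correctly:', ok

  call mpi_type_free(itritype, ierr)
  call mpi_finalize(ierr)
end program indexed_demo
```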