Buffered Send and Receive

The buffered send routine, mpi_bsend, uses a scratch array that you supply to buffer outgoing messages. Its arguments are identical to those of the standard send, mpi_send, and its name is distinguished by a "b" prefix:

call mpi_bsend (data, icount, itype, idest, itag, icomm, ierr)
where:

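Parameter   Description                                          Status
data        first element of the data to be sent                 [IN]
icount      number of elements to send                           [IN]
itype       MPI datatype of each element                         [IN]
idest       destination rank (process)                           [IN]
itag        message tag                                          [IN]
icomm       communicator (context) in which the message is sent  [IN]
ierr        MPI error number (0 = no error)                      [OUT]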

Note: There is no reference to the buffer area in the calling statement.

The buffered send subroutine returns when the data area may be reused. On return, the data will either have been packed safely away in the buffer area, if no corresponding receive has been posted, or have been sent directly to a posted receive. While the standard mode send (mpi_send) might have to wait for a receive to be posted (depending upon the buffer size used and the MPI implementation), the buffered mode send (mpi_bsend) completes whether or not a matching receive has been posted. Buffering is implemented and used only by the sender and, therefore, there is no buffered receive subroutine.

You must allocate the buffer area before calling mpi_bsend but it can be used for multiple mpi_bsend calls. When a buffered send is called, it packs the data into the buffer area at the next available free location. For each mpi_bsend call, the buffer area must have the capacity to hold the data portion as well as the request handle and a pointer to the next entry. For arrays of integers or reals, the size of the buffer for the data is easily determined, and the space for one request handle and pointer is given by the MPI_BSEND_OVERHEAD parameter.
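
The sizing rule above can also be evaluated at run time. The following is a minimal sketch, assuming the values N=10 and MAXBSEND=2 used in the example later in this section. It asks MPI for an upper bound on the packed size of one message with mpi_pack_size and adds MPI_BSEND_OVERHEAD for each message. The program name bufsz and the variable ipack are illustrative only.

```fortran
      program bufsz
C         Sketch: compute the attach-buffer size in bytes for
C         MAXBSEND outstanding buffered sends of N real*8 values.
      implicit none
      include 'mpif.h'
      integer N, MAXBSEND
      parameter (N=10, MAXBSEND=2)
      integer ipack, ibytes, ierr
      call mpi_init(ierr)
C         Upper bound on the packed size of one message, in bytes.
      call mpi_pack_size(N, MPI_REAL8, MPI_COMM_WORLD, ipack, ierr)
C         Each buffered message also needs MPI_BSEND_OVERHEAD bytes.
      ibytes = MAXBSEND*(ipack + MPI_BSEND_OVERHEAD)
      print *, 'bytes needed for the attached buffer:', ibytes
      call mpi_finalize(ierr)
      end
```

The value printed depends on the MPI implementation, because MPI_BSEND_OVERHEAD is defined by the implementation. Using mpi_pack_size rather than a hand count is most useful for derived data types, whose packed size is harder to determine by inspection.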

An array is assigned as the buffer area with the mpi_buffer_attach subroutine. All you need do is call mpi_buffer_attach once (for example, in your main program) and then mpi_bsend can be called in any subroutine without reference to the buffer since MPI stores the buffer address internally.
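
The pattern described above might look like the following minimal sketch. The program name battach, the subroutine name sendit, and the buffer length NW=1000 are assumptions made for illustration. Each process sends a short array to itself so that the sketch runs on any number of processes. It also calls mpi_buffer_detach, which releases the buffer and returns only after all buffered messages have been delivered.

```fortran
      program battach
C         Sketch: attach once in the main program, send from a
C         subroutine, detach before finalizing.
      implicit none
      include 'mpif.h'
      integer NW
      parameter (NW=1000)
      real*8 buffer(NW), x(4)
      integer ibytes, mype, ierr, istatus(MPI_STATUS_SIZE)
      call mpi_init(ierr)
      call mpi_comm_rank(MPI_COMM_WORLD, mype, ierr)
C         The buffer size is given in bytes: 8 bytes per real*8.
      ibytes = 8*NW
      call mpi_buffer_attach(buffer, ibytes, ierr)
      x = dble(mype)
C         sendit never sees the buffer array.
      call sendit(x, 4, mype)
      call mpi_recv(x, 4, MPI_REAL8, mype, 0, MPI_COMM_WORLD,
     &              istatus, ierr)
C         Detach waits until buffered messages have been delivered.
      call mpi_buffer_detach(buffer, ibytes, ierr)
      call mpi_finalize(ierr)
      end

      subroutine sendit(x, n, idest)
C         Buffered send of n real*8 values to rank idest, tag 0.
      implicit none
      include 'mpif.h'
      integer n, idest, ierr
      real*8 x(n)
      call mpi_bsend(x, n, MPI_REAL8, idest, 0, MPI_COMM_WORLD, ierr)
      return
      end
```

Only one buffer can be attached to a process at a time, so a single attach in the main program serves every subroutine that calls mpi_bsend.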

When an mpi_bsend call would fill the buffer beyond its capacity, the subroutine does not write beyond the end of the buffer area. Instead, it reports an error indicating that the buffer limit has been exceeded. Note that the error code is returned to the caller only if the communicator's error handler is MPI_ERRORS_RETURN; with the default handler, MPI_ERRORS_ARE_FATAL, the program aborts. The buffered send call uses the mpi_pack routine to pack the data into the communication buffer. This is of no great concern when sending simple arrays but becomes important when using MPI derived data types, which usually describe complex data structures.

The attachment subroutine syntax is:

call mpi_buffer_attach (buffer, ibytes, ierr)

where:

Parameter   Description                                          Status
buffer      scratch area to be used as the communication buffer  [IN]
ibytes      size of buffer array in bytes                        [IN]
ierr        MPI error number (0 = no error)                      [OUT]
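
To see the error number in ierr when the buffer is too small, the error handler of the communicator has to be changed, because by default MPI aborts the program on an error. The following is a minimal sketch, not a definitive recipe: the program name bovfl and the array name small are illustrative, and the state of MPI after an error is left to the implementation, so the behavior may differ between MPI libraries. The buffer is deliberately sized to hold the overhead but not the 800-byte message.

```fortran
      program bovfl
C         Sketch: provoke and detect an attached-buffer overflow.
      implicit none
      include 'mpif.h'
      integer N, NWS
      parameter (N=100, NWS=MPI_BSEND_OVERHEAD/8+2)
      real*8 v(N), small(NWS)
      integer ibytes, mype, ierr, ierr2, iclass
      integer istatus(MPI_STATUS_SIZE)
      call mpi_init(ierr)
      call mpi_comm_rank(MPI_COMM_WORLD, mype, ierr)
C         Ask MPI to return error codes instead of aborting.
      call mpi_comm_set_errhandler(MPI_COMM_WORLD,
     &                             MPI_ERRORS_RETURN, ierr)
C         Room for the overhead but not for an 800-byte message.
      ibytes = 8*NWS
      call mpi_buffer_attach(small, ibytes, ierr)
      v = 1.0d0
C         Send to self so the sketch runs on any process count.
      call mpi_bsend(v, N, MPI_REAL8, mype, 0, MPI_COMM_WORLD, ierr)
      if (ierr .eq. MPI_SUCCESS) then
C         Not expected; receive the message so detach cannot block.
         call mpi_recv(v, N, MPI_REAL8, mype, 0, MPI_COMM_WORLD,
     &                 istatus, ierr)
      else
         call mpi_error_class(ierr, iclass, ierr2)
         if (iclass .eq. MPI_ERR_BUFFER) then
            print *, 'rank', mype, ': attached buffer too small'
         endif
      endif
      call mpi_buffer_detach(small, ibytes, ierr)
      call mpi_finalize(ierr)
      end
```

Comparing the error class with MPI_ERR_BUFFER, rather than comparing ierr with a literal number, keeps the check portable, since the numeric error codes are chosen by the implementation.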

Below is an example of an mpi_bsend/mpi_recv pair. The program generates a different vector, V_i, in each process i. It then adds the vectors of its nearest neighbors, processes i-1 and i+1, to its own vector. To accomplish this, process i must send V_i to processes i+1 and i-1 while receiving V_{i+1} and V_{i-1} from the corresponding processes.

Explanation: The length N of the vector V is used to derive the size of the buffer, which must be able to hold up to two copies (MAXBSEND=2) of V_i, one for the send to process i+1 and one for the send to process i-1. For the sake of brevity, we have not written the code to conserve space, nor have we considered the most efficient message-passing algorithm for this particular problem. The size of each buffer section, in 8-byte words, is given by

N + MPI_BSEND_OVERHEAD/8 + 1

where the integer division truncates and the added 1 rounds the overhead up to a whole word. For convenience, the MPI_COMM_WORLD communicator handle is assigned to the variable iwcomm. mpi_buffer_attach is called with the array buffer as the scratch area argument, and the size is specified by the ibytes argument. After the arrays have been declared and the buffer attached, V_i is generated. The higher-numbered neighbor process, i+1, is derived with the modulo function:

ihi = mod (mype+1, npes)

and the low-number nearest neighbor, i-1, by:

ilo = mod (mype+npes-1, npes)

These values are used as the destination and source ranks in the bsend/recv pairs. Each process first sends a copy of V_i to its two neighbor processes, and then receives a copy of V_{i-1} and V_{i+1}. The mpi_recv calls in each process distinguish the two vectors by the message tags 0 and 1, respectively. A ring communication pattern is used: the process with rank npes-1 has neighbors npes-2 and 0, and the process with rank 0 has neighbors npes-1 and 1.
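
The neighbor formulas can be tried out without MPI. The following is a minimal serial sketch that lists both neighbors of every rank; the function names ihiof and iloof and the choice npes=4 are assumptions made for illustration.

```fortran
      program ring
C         Sketch: list the ring neighbors of every rank, serially.
C         npes = 4 is an arbitrary process count for illustration.
      implicit none
      integer npes, mype, ihiof, iloof
      parameter (npes=4)
      do mype = 0, npes-1
         print *, 'rank', mype, ' ilo', iloof(mype, npes),
     &            ' ihi', ihiof(mype, npes)
      enddo
      end

      integer function ihiof(mype, npes)
C         Higher-numbered neighbor; rank npes-1 wraps around to 0.
      implicit none
      integer mype, npes
      ihiof = mod(mype+1, npes)
      end

      integer function iloof(mype, npes)
C         Lower-numbered neighbor; adding npes keeps the argument
C         of mod nonnegative, so rank 0 wraps around to npes-1.
      implicit none
      integer mype, npes
      iloof = mod(mype+npes-1, npes)
      end
```

With npes=4, rank 0 has ilo=3 and ihi=1, and rank 3 has ilo=2 and ihi=0. The term npes in the ilo formula matters because the Fortran mod function takes the sign of its first argument, so mod(mype-1, npes) would give -1 on rank 0.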

      program bsr
C         Buffered Send/Receive Example.
C                         S = V     + V   + V
C                              i-1     i     i+1 
C         Form local vector sum (S) on process i
C         of vectors (V) stored on processes i-1, i ,& i+1
C         Assume "Ring" communication.
C                     ->(0)->(1)-> ... ->(j)->(k)-
C                     |__________________________|
C         Use Buffered Send: buffers 2 copies of V on i.
C         
      implicit real*8 (a-h,o-z)
      parameter(N=10)
      dimension v(N),s(N)
C
C         MPI: Declare status and buffer arrays.
C
      include 'mpif.h'
      dimension istatus(MPI_STATUS_SIZE)
      parameter(MAXPE=64,MAXBSEND=2)
      parameter(NWBUFF=MAXBSEND*(N+MPI_BSEND_OVERHEAD/8+1) )
      dimension buffer(NWBUFF)
C
C         MPI: Get size and rank, attach buffer.
C
      iwcomm=MPI_COMM_WORLD
      call mpi_init(ierr)
      call mpi_comm_rank(MPI_COMM_WORLD,mype,ierr)
      call mpi_comm_size(MPI_COMM_WORLD,npes,ierr)
      ibytes=MAXBSEND*(N+MPI_BSEND_OVERHEAD/8+1)*8 
      call mpi_buffer_attach(buffer,ibytes,ierr)
C
C          Generate vector V   N values: (mype,...,mype+N-1)
C                           i
      do i = 1, N 
        v(i)=dble(i-1+mype)
        s(i)=   v(i)
      enddo
c
      ihi = mod(mype+1,     npes)
      ilo = mod(mype+npes-1,npes)
      itag0 = 0
      itag1 = 1
      call mpi_bsend(v,N,MPI_REAL8, ihi,itag0,iwcomm, ierr)
      call mpi_bsend(v,N,MPI_REAL8, ilo,itag1,iwcomm, ierr)
c
      call mpi_recv (v,N,MPI_REAL8, ilo,itag0,iwcomm, istatus,ierr)
      s = s + v
      call mpi_recv (v,N,MPI_REAL8, ihi,itag1,iwcomm, istatus,ierr)
      s = s + v
c
      print*,mype,s(1),s(N)
      call mpi_finalize(ierr)
      end
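
The output of the program can be checked against values computed without MPI. The following is a minimal serial sketch: since element k of V on rank r is r+k-1, element k of the sum on rank i is the sum of three such terms. The function name sexp and the choice npes=4 are assumptions made for illustration.

```fortran
      program expect
C         Sketch: serial reference values for the bsr example.
C         npes = 4 is an arbitrary process count for illustration.
      implicit none
      integer N, npes
      parameter (N=10, npes=4)
      integer mype
      real*8 sexp
      do mype = 0, npes-1
         print *, mype, sexp(1, mype, npes), sexp(N, mype, npes)
      enddo
      end

      real*8 function sexp(k, mype, npes)
C         Expected s(k) on rank mype; v(k) on rank r is r+k-1.
      implicit none
      integer k, mype, npes, ihi, ilo
      ihi = mod(mype+1,      npes)
      ilo = mod(mype+npes-1, npes)
      sexp = dble(ilo + mype + ihi + 3*(k-1))
      end
```

For four processes the expected pairs s(1), s(N) are 4 and 31 on rank 0, 3 and 30 on rank 1, 6 and 33 on rank 2, and 5 and 32 on rank 3. The lines printed by the MPI program may appear in any rank order.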