XcalableMP is a language specification that emphasizes performance transparency: it should be easy for the user to see where each operation runs and what kind of execution it involves. The goal is to avoid performance drops the user does not expect and to make performance tuning simpler. Accordingly, no communication is generated unless the user explicitly requests it with a directive. In exchange, the specification is designed so that communication can be requested with as little code as possible. The reflect directive and the gmove construct are representative examples.

  • Using Sleeve Communication
  • Using the Gmove Directive Construct
    • Between arrays with different distribution types
    • Between arrays with different partition dimensions
    • Changing the proportions of an uneven distribution
    • Distribution of a given cross-section to all nodes

Using Sleeve Communication

The reflect directive updates the sleeve area of each node with the values held by its adjacent nodes. The example and figure below illustrate this.

  • Fortran
!$xmp nodes p(3)
!$xmp template t(30)
!$xmp distribute t(block) onto p
real a(30,30)
!$xmp align a(*,j) with t(j)
!$xmp shadow a(0,1)

...
!*** Set values in the sleeve area ***

!$xmp reflect a

!*** Refer to the sleeve area ***
!$xmp loop on t(j)
do j=1,30
	do i=1,30
		a(i,j) = a(i,j)+a(i-1,j)+a(i,j+1)+ ...
	enddo
enddo
  • C
#pragma xmp nodes p(3)
#pragma xmp template t(0:29)
#pragma xmp distribute t(block) onto p
double a[30][30];
#pragma xmp align a[j][*] with t(j)
#pragma xmp shadow a[1][0]

...
//*** Set values in the sleeve area ***

#pragma xmp reflect a

//*** Refer to the sleeve area ***
#pragma xmp loop on t(j)
for(j=0;j<30;j++)
	for(i=0;i<30;i++)
		a[j][i] = a[j][i] + a[j][i-1] + a[j+1][i] + ...
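
The exchange that reflect performs in the one-dimensional example above can be emulated in plain C. This is only a sketch under stated assumptions: the 30 template elements are split into three equal blocks, the shadow width is 1 on each side, the non-distributed dimension is ignored, and the sleeve is not periodic. The names `local_block`, `init_blocks`, and `emulate_reflect` are hypothetical and are not part of XcalableMP.

```c
#include <assert.h>

#define NNODES 3                    /* nodes p(3) */
#define NGLOBAL 30                  /* template t(0:29) */
#define CHUNK (NGLOBAL / NNODES)    /* elements owned by each node */
#define WIDTH 1                     /* assumed shadow width */

/* Local storage of one node: its own block plus one sleeve cell per side. */
typedef struct {
    double cell[CHUNK + 2 * WIDTH];
} local_block;

/* Store the global index in every owned cell; sleeve cells start at -1. */
void init_blocks(local_block node[NNODES]) {
    assert(NGLOBAL % NNODES == 0);  /* this sketch assumes equal blocks */
    for (int p = 0; p < NNODES; p++) {
        for (int k = 0; k < CHUNK + 2 * WIDTH; k++)
            node[p].cell[k] = -1.0;
        for (int k = 0; k < CHUNK; k++)
            node[p].cell[WIDTH + k] = (double)(p * CHUNK + k);
    }
}

/* Emulate reflect: copy the edge values of each neighbor into the sleeve. */
void emulate_reflect(local_block node[NNODES]) {
    for (int p = 0; p < NNODES; p++) {
        if (p > 0)           /* lower sleeve <- last owned cell of the left neighbor */
            node[p].cell[0] = node[p - 1].cell[WIDTH + CHUNK - 1];
        if (p < NNODES - 1)  /* upper sleeve <- first owned cell of the right neighbor */
            node[p].cell[WIDTH + CHUNK] = node[p + 1].cell[WIDTH];
    }
}
```

After `init_blocks` and `emulate_reflect`, the middle node holds global elements 10 to 19 plus copies of elements 9 and 20 in its sleeve, which is what lets the loop read `a[j+1][i]` without any further communication.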

reflect.png

The shadow and reflect directives can also be used with arrays partitioned in multiple dimensions. An example of two-dimensional partitioning is shown below. The central node receives sleeve data from the eight surrounding nodes (forward, backward, left, right, and the four diagonals), yet the whole access pattern is coded with a single reflect directive.

  • Fortran
!$xmp nodes p(3,3)
!$xmp template t(30,30)
!$xmp distribute t(block,block) onto p
real a(30,30)
!$xmp align a(i,j) with t(i,j)
!$xmp shadow a(1,1)

...
!*** Set values in the sleeve area ***

!$xmp reflect a

!*** Refer to the sleeve area ***
!$xmp loop (i,j) on t(i,j)
do j=1,30
	do i=1,30
		a(i,j) = a(i,j)+a(i-1,j)+a(i,j+1)+ ...
	enddo
enddo
  • C
#pragma xmp nodes p(3,3)
#pragma xmp template t(0:29,0:29)
#pragma xmp distribute t(block,block) onto p
double a[30][30];
#pragma xmp align a[i][j] with t(j,i)
#pragma xmp shadow a[1][1]

...
//*** Set values in the sleeve area ***

#pragma xmp reflect a

//*** Refer to the sleeve area ***
#pragma xmp loop (i,j) on t(i,j)
for(j=0;j<30;j++)
	for(i=0;i<30;i++)
		a[j][i] = a[j][i]+a[j][i-1]+a[j+1][i]+ ...

reflect2.png
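
To make the "eight surrounding nodes" of the two-dimensional case concrete, the following plain C sketch counts the nodes that supply sleeve data to a given node of the grid. It assumes a shadow width of 1 in both dimensions and a non-periodic sleeve; `count_sleeve_neighbors` is a hypothetical helper, not an XcalableMP function.

```c
#include <assert.h>

/* Count the nodes adjacent to node (r, c), diagonals included,
   in a rows x cols node grid without wrap-around. */
int count_sleeve_neighbors(int r, int c, int rows, int cols) {
    assert(r >= 0 && r < rows && c >= 0 && c < cols);
    int count = 0;
    for (int dr = -1; dr <= 1; dr++) {
        for (int dc = -1; dc <= 1; dc++) {
            if (dr == 0 && dc == 0)
                continue;                       /* skip the node itself */
            int nr = r + dr;
            int nc = c + dc;
            if (nr >= 0 && nr < rows && nc >= 0 && nc < cols)
                count++;
        }
    }
    return count;
}
```

For the 3 x 3 grid of the example, the central node (1, 1) has eight neighbors, an edge node has five, and a corner node has three. Coding these exchanges by hand would need one case per direction, whereas the directive version stays a single line.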

Using the Gmove Directive Construct

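The gmove construct can express a wide variety of communication patterns on its own: the user writes an ordinary array assignment, and the communication that is needed follows from how the arrays on each side are distributed. The examples and figures below show typical patterns.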

Between arrays with different distribution types

  • Fortran
real A(N), B(N)
!$xmp nodes P(4)
!$xmp template TA(N),TB(N)
!$xmp distribute TA(block) onto P
!$xmp distribute TB(cyclic) onto P
!$xmp align A(i) with TA(i)
!$xmp align B(i) with TB(i)
...
!$xmp gmove
B(:) = A(:)
  • C
double A[N], B[N];
#pragma xmp nodes P(4)
#pragma xmp template TA(0:N-1)
#pragma xmp template TB(0:N-1)
#pragma xmp distribute TA(block) onto P
#pragma xmp distribute TB(cyclic) onto P
#pragma xmp align A[i] with TA(i)
#pragma xmp align B[i] with TB(i)
...
#pragma xmp gmove
B[:] = A[:];

gmove.png
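
Which elements actually have to move in the block-to-cyclic assignment above can be worked out from the two ownership rules. The sketch below is plain C and makes two assumptions: indices are zero-based, and the block size is the ceiling of N divided by the number of nodes. The function names are hypothetical.

```c
#include <assert.h>

/* Owner of index i under a block distribution of n indices over nprocs nodes.
   Assumes the block size is ceil(n / nprocs). */
int block_owner(int i, int n, int nprocs) {
    assert(i >= 0 && i < n && nprocs > 0);
    int chunk = (n + nprocs - 1) / nprocs;
    return i / chunk;
}

/* Owner of index i under a cyclic distribution over nprocs nodes. */
int cyclic_owner(int i, int nprocs) {
    assert(i >= 0 && nprocs > 0);
    return i % nprocs;
}

/* Number of elements whose owner differs between the two distributions,
   that is, elements that must be communicated when B(:) = A(:) is executed. */
int count_moved(int n, int nprocs) {
    int moved = 0;
    for (int i = 0; i < n; i++) {
        if (block_owner(i, n, nprocs) != cyclic_owner(i, nprocs))
            moved++;
    }
    return moved;
}
```

With 16 elements on 4 nodes, only indices 0, 5, 10, and 15 have the same owner in both distributions, so `count_moved(16, 4)` is 12. The user never writes this bookkeeping; the gmove assignment implies it.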

Between arrays with different partition dimensions

  • Fortran
real A(M,N), B(M,N)
!$xmp nodes P(4)
!$xmp template T1(M),T2(N)
!$xmp distribute T1(block) onto P
!$xmp distribute T2(block) onto P
!$xmp align A(i,*) with T1(i)
!$xmp align B(*,j) with T2(j)
...
!$xmp gmove
B(:,:) = A(:,:)
  • C
double A[N][M], B[N][M];
#pragma xmp nodes P(4)
#pragma xmp template T1(0:M-1)
#pragma xmp template T2(0:N-1)
#pragma xmp distribute T1(block) onto P
#pragma xmp distribute T2(block) onto P
#pragma xmp align A[*][i] with T1(i)
#pragma xmp align B[j][*] with T2(j)
...
#pragma xmp gmove
B[:][:] = A[:][:];

gmove2.png
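
Switching the partitioned dimension means that every node ends up exchanging data with every other node. The plain C sketch below counts how many elements travel from one node to another, following the Fortran version above (A partitioned along its first index, B along its second). It assumes zero-based indices and a block size of ceil(extent / nodes); `block_owner` and `transfer_count` are hypothetical names.

```c
#include <assert.h>

/* Owner of index i under a block distribution of n indices over nprocs nodes.
   Assumes the block size is ceil(n / nprocs). */
int block_owner(int i, int n, int nprocs) {
    assert(i >= 0 && i < n && nprocs > 0);
    int chunk = (n + nprocs - 1) / nprocs;
    return i / chunk;
}

/* Number of elements of an m x n array that move from node src to node dst
   when the partitioned dimension changes from the first index (extent m)
   to the second index (extent n). */
int transfer_count(int m, int n, int nprocs, int src, int dst) {
    int count = 0;
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < n; j++) {
            if (block_owner(i, m, nprocs) == src &&
                block_owner(j, n, nprocs) == dst)
                count++;
        }
    }
    return count;
}
```

For an 8 x 8 array on 4 nodes, each ordered pair of nodes shares a 2 x 2 tile, so `transfer_count(8, 8, 4, 0, 3)` is 4, and only the four tiles with `src == dst` stay in place.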

Changing the proportions of an uneven distribution

  • Fortran
real A(22), B(22)
!$xmp nodes P(4)
!$xmp template TA(22),TB(22)
!$xmp distribute TA(gblock((/6,8,4,4/))) onto P
!$xmp distribute TB(gblock((/3,8,5,6/))) onto P
!$xmp align A(i) with TA(i)
!$xmp align B(i) with TB(i)
...
!$xmp gmove
B(:) = A(:)
  • C
double A[22], B[22];
#pragma xmp nodes P(4)
#pragma xmp template TA(0:21)
#pragma xmp template TB(0:21)
int a[4] = {6,8,4,4};
#pragma xmp distribute TA(gblock(a)) onto P
int b[4] = {3,8,5,6};
#pragma xmp distribute TB(gblock(b)) onto P
#pragma xmp align A[i] with TA(i)
#pragma xmp align B[i] with TB(i)
...
#pragma xmp gmove
B[:] = A[:];

gmove3.png
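
The traffic implied by the two gblock mappings above can be derived by comparing, index by index, who owns an element before and after the assignment. The following plain C sketch does this with zero-based indices; `gblock_owner` and `gblock_transfer` are hypothetical helpers written for illustration.

```c
#include <assert.h>

/* Owner of index i when consecutive blocks of widths[0], widths[1], ...
   are assigned to nodes 0, 1, ... Returns -1 if i lies beyond the last block. */
int gblock_owner(int i, const int widths[], int nprocs) {
    assert(i >= 0 && nprocs > 0);
    int upper = 0;                      /* one past the last index of block p */
    for (int p = 0; p < nprocs; p++) {
        upper += widths[p];
        if (i < upper)
            return p;
    }
    return -1;
}

/* Number of the n elements that move from node src (source mapping)
   to node dst (destination mapping). */
int gblock_transfer(const int src_w[], const int dst_w[], int nprocs,
                    int n, int src, int dst) {
    int count = 0;
    for (int i = 0; i < n; i++) {
        if (gblock_owner(i, src_w, nprocs) == src &&
            gblock_owner(i, dst_w, nprocs) == dst)
            count++;
    }
    return count;
}
```

With the widths {6,8,4,4} and {3,8,5,6} from the example, node 0 keeps 3 elements and sends 3 to node 1, node 1 keeps 5 and sends 3 to node 2, node 2 keeps 2 and sends 2 to node 3, and node 3 keeps all 4 of its elements.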

Distribution of a given cross-section to all nodes

  • Fortran
real A(M,N), B(M)
!$xmp nodes P(4)
!$xmp template T2(N)
!$xmp distribute T2(block) onto P
!$xmp align A(*,i) with T2(i)
...
k=...
!$xmp gmove
B(:) = A(:,k)
  • C
double A[N][M], B[M];
#pragma xmp nodes P(4)
#pragma xmp template T2(0:N-1)
#pragma xmp distribute T2(block) onto P
#pragma xmp align A[i][*] with T2(i)
...
k=...
#pragma xmp gmove
B[:] = A[k][:];

gmove4.png
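
Because B is not aligned with any template, every node holds its own copy, so the assignment above amounts to a broadcast from the node that owns cross-section k. The plain C sketch below emulates this with per-node arrays. It assumes the C layout of the example (A partitioned along its first index), small fixed extents that divide evenly among the nodes, and zero-based indices; all names are hypothetical.

```c
#include <assert.h>

#define NPROCS 4
#define NROWS 8                     /* distributed extent (N in the example) */
#define NCOLS 3                     /* non-distributed extent (M in the example) */
#define CHUNK (NROWS / NPROCS)      /* rows owned by each node */

/* local_a[p] is the block of rows owned by node p; local_b[p] is node p's
   private copy of B. Copy global row k of A into B on every node. */
void emulate_section_gmove(double local_a[NPROCS][CHUNK][NCOLS], int k,
                           double local_b[NPROCS][NCOLS]) {
    assert(k >= 0 && k < NROWS);
    int owner = k / CHUNK;          /* node that holds row k */
    int row = k % CHUNK;            /* position of row k inside that block */
    for (int p = 0; p < NPROCS; p++) {
        for (int c = 0; c < NCOLS; c++)
            local_b[p][c] = local_a[owner][row][c];
    }
}
```

For k = 5 the owner is node 2, and after the call all four copies of B contain row 5 of A.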