The first three directives that must be added when parallelizing the program are the nodes directive, the template directive, and the distribute directive. These are written into the declarations section of the program.

```
!$xmp nodes p(2)
!$xmp template t(10)
!$xmp distribute t(block) onto p
integer a(10)

do i=1,10
  a(i)=i*2
end do

write(*,*) a
```

```
#pragma xmp nodes p(2)
#pragma xmp template t(0:9)
#pragma xmp distribute t(block) onto p

int a[10];   /* array of 10 elements, matching the template size */
int i;

for(i=0;i<10;i++)
  a[i] = (i+1)*2;

for(i=0;i<10;i++)
  printf("%d ", a[i]);
```

In this example, we declare a node array p of size 2 and a template t of size 10, and distribute t onto p in “block” distribution. In Fortran, node p(1) handles template elements t(1) through t(5), and node p(2) handles elements t(6) through t(10). In C, node p(1) handles template elements t(0) through t(4), and node p(2) handles elements t(5) through t(9).
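The mapping just described can be reproduced with a small plain-C sketch. It is not part of XcalableMP: `block_width` and `block_owner` are hypothetical helper names, and the sketch assumes that a block distribution gives each node ceiling(size / nodes) consecutive template elements.

```c
#include <assert.h>

/* Block width under the assumed rule: ceiling(size / nodes). */
int block_width(int size, int nodes)
{
    assert(size > 0 && nodes > 0);
    return (size + nodes - 1) / nodes;
}

/* Hypothetical helper: 1-based index of the node that owns template
 * element idx, for a template whose index runs from lower to
 * lower + size - 1. */
int block_owner(int idx, int lower, int size, int nodes)
{
    assert(idx >= lower && idx < lower + size);
    return (idx - lower) / block_width(size, nodes) + 1;
}
```

For the Fortran template t(10), `block_owner(6, 1, 10, 2)` gives node 2; for the C template t(0:9), `block_owner(4, 0, 10, 2)` gives node 1, which agrees with the ranges listed above.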

The nodes directive is used to declare the name and shape of a node array. In this example, the name p is assigned to a one-dimensional node array that consists of two nodes. These two nodes might be all of the execution nodes, or only a portion of them. Specifying “*” for the number of nodes assigns all of the execution nodes to p; in that case, the size of p is determined when execution starts.
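As a sketch of the “*” form, the C declarations could be written as follows, assuming the same round-bracket directive syntax as the example above. With this form the block boundaries are no longer fixed in the source, because they depend on how many nodes the program is started with.

```c
/* Sketch: let p cover every execution node instead of exactly two. */
#pragma xmp nodes p(*)
#pragma xmp template t(0:9)
#pragma xmp distribute t(block) onto p
```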

The template directive is used to declare the template name and shape. A template is an abstract array that serves as an intermediary: it describes both how array variables are distributed to the nodes and how the iterations of a computation loop are divided among the nodes. In this program, both the size of the array and the number of loop iterations are 10, so it is convenient to give the template the same shape, that is, size 10. The C example uses a colon (:) to set the lower and upper bounds of the template index to 0 and 9, respectively. In the Fortran example, only the size of the template is declared; in this case the lower and upper bounds are 1 and 10.

The distribute directive is used to declare how the template is distributed onto the node array. The distribution types are block, cyclic, block-cyclic, and gblock (nonuniform division). Which type to choose depends on the nature of the program.
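To make the difference between the other distribution types concrete, here is a plain-C sketch of how each one would assign template elements to nodes. The function names are hypothetical, and the rules are the conventional definitions of these distributions (assumed here, not quoted from the XcalableMP specification). `off` is the 0-based offset of an element from the template's lower bound, and node indices are 1-based.

```c
#include <assert.h>

/* cyclic: elements are dealt out one at a time, round-robin. */
int cyclic_owner(int off, int nodes)
{
    assert(off >= 0 && nodes > 0);
    return off % nodes + 1;
}

/* block-cyclic: chunks of `width` consecutive elements are dealt
 * out round-robin. */
int block_cyclic_owner(int off, int width, int nodes)
{
    assert(off >= 0 && width > 0 && nodes > 0);
    return (off / width) % nodes + 1;
}

/* gblock: node k receives sizes[k-1] consecutive elements.
 * Returns 0 if off lies beyond the sum of all sizes. */
int gblock_owner(int off, const int sizes[], int nodes)
{
    int k, end = 0;
    assert(off >= 0 && nodes > 0);
    for (k = 0; k < nodes; k++) {
        end += sizes[k];
        if (off < end)
            return k + 1;
    }
    return 0;
}
```

With two nodes, a cyclic distribution alternates elements between p(1) and p(2), a block-cyclic distribution with width 2 alternates pairs of elements, and a gblock distribution with sizes {3, 7} gives the first three elements to p(1) and the remaining seven to p(2).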

Up to this point, node array p and template t have been declared, and t has been distributed onto p. This fixes the parallelization strategy, but array a has not yet been associated with the template, and no directives have been added to the DO loop and WRITE statement (Fortran) or to the for loops and printf call (C). Hence, if the program were compiled and run at this point, every node would redundantly execute the entire program, just as the sequential program does.