An example of a distributed application on a multi-node cluster.

This is a template for programming a distributed application on the AXEL multi-node cluster. The example adds two vectors, A and B, repeatedly and stores the result in vector A.

Parallelism is achieved by segmenting the vectors and distributing the segments to the nodes. The workload within a node is then further distributed among the processing elements (PEs) of that node. In this example, both the GPU and the FPGA are used.
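
The two-level segmentation described above can be illustrated with a small sequential sketch. It is not the AXEL implementation: the function names split_range and add_distributed are hypothetical, the segments are processed one after another rather than in parallel, and a PE count of 2 only stands in for the GPU and FPGA of a node.

```python
def split_range(length, parts):
    """Split range(length) into `parts` contiguous (start, stop) segments.

    Any remainder is spread over the first segments, so segment sizes
    differ by at most one element.
    """
    base, extra = divmod(length, parts)
    bounds = []
    start = 0
    for i in range(parts):
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def add_distributed(a, b, num_nodes, pes_per_node, iterations):
    """Repeatedly compute A = A + B, one PE-sized segment at a time.

    Assumption: on a real cluster each node segment would be handled by a
    separate MPI process and each PE segment by a GPU or FPGA; here the
    loops simply visit the segments in order.
    """
    for _ in range(iterations):
        # First level: one contiguous segment per node.
        for node_start, node_stop in split_range(len(a), num_nodes):
            node_len = node_stop - node_start
            # Second level: split the node's segment among its PEs.
            for pe_start, pe_stop in split_range(node_len, pes_per_node):
                for i in range(node_start + pe_start, node_start + pe_stop):
                    a[i] += b[i]
    return a
```

For example, add_distributed(a, b, 4, 2, 3) treats the vectors as if they were spread over four nodes with two PEs each and performs three additions of B into A. Every element is covered by exactly one PE segment, so the result matches a plain element-wise addition repeated the same number of times.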

To run the distributed application, enter the following:

qsub cfg/myapp.sh

The 'qsub' command assumes that a PBS/Torque cluster management system is up and running. To run the application without PBS/Torque, use the MPI runtime directly, as shown below:

mpirun -bynode \
  -host axel05 -np 1 ./myapp_m0 cfg/myapp.xml : \
  -host axel06 -np 1 ./myapp_m0 cfg/myapp.xml : \
  -host axel07 -np 1 ./myapp_m0 cfg/myapp.xml : \
  -host axel08 -np 1 ./myapp_m0 cfg/myapp.xml

The node assignment is random.