Wednesday, June 13, 2012

Running a simple MPI job on Ranger

This post walks through running a simple MPI job on Ranger [1], using a hello-world program as the example.

1) Write a hello world application in C.
#include <stdio.h>
#include <mpi.h>


int main(int argc, char *argv[])
{
  int rank, size;

  MPI_Init(&argc, &argv);                 /* start MPI */
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* get this process's rank */
  MPI_Comm_size(MPI_COMM_WORLD, &size);   /* get the total number of processes */
  printf("Hello world from process %d of %d\n", rank, size);
  MPI_Finalize();                         /* shut down MPI */
  return 0;
}

2) Compile it on Ranger.
mpicc -o mpi-hellow-world mpi-hellow-world.c

3) Write a scheduler script to run your application. (Let's assume it is saved under the name scheduler_sge_job_mpi_helloworld.)
#!/bin/bash
# Grid Engine batch job script built by Globus job manager

#$ -S /bin/bash
#$ -V
#$ -pe 16way 16
#$ -N MPI-Airavata-Testing-Script2
#$ -M heshan@ogce.org
#$ -m n
#$ -q development
#$ -A ***********
#$ -l h_rt=0:09:00
#$ -o /share/home/01437/ogce/airavata-test/mpi-hello.stdout
#$ -e /share/home/01437/ogce/airavata-test/mpi-hello.stderr
ibrun /share/home/01437/ogce/airavata-test/mpi-hellow-world
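
The -pe line is probably the least self-explanatory part of this script. As a rough sketch, assuming Ranger's 16-core nodes and the convention that "-pe <N>way <S>" reserves S cores (that is, S/16 whole nodes) and starts N MPI tasks on each node, the number of processes that ibrun launches can be estimated as follows. The function name ranger_task_count is hypothetical and not part of any TACC tooling.

```shell
# Assumption: 16 cores per node; "-pe <wayness>way <slots>" reserves
# slots/16 nodes and starts <wayness> MPI tasks on each of them.
ranger_task_count() {
  local wayness=$1 slots=$2 cores_per_node=16
  if [ $(( slots % cores_per_node )) -ne 0 ]; then
    echo "slot count must be a multiple of $cores_per_node" >&2
    return 1
  fi
  echo $(( wayness * slots / cores_per_node ))
}

ranger_task_count 16 16   # the request used in the script above: 16 tasks
ranger_task_count 8 32    # two nodes, 8 tasks on each: also 16 tasks
```

Under these assumptions, "-pe 16way 16" asks for a single node with one MPI task per core, which matches the "of 16" in the program's output.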

4) Use the qsub command to submit a batch job to Ranger.
ogce@login3.ranger.tacc.utexas.edu:/airavata-test/{13}> qsub scheduler_sge_job_mpi_helloworld
Once the job is submitted, the following output is displayed:
-------------------------------------------------------------------
------- Welcome to TACC's Ranger System, an NSF XD Resource -------
-------------------------------------------------------------------
--> Checking that you specified -V...
--> Checking that you specified a time limit...
--> Checking that you specified a queue...
--> Setting project...
--> Checking that you specified a parallel environment...
--> Checking that you specified a valid parallel environment name...
--> Checking that the minimum and maximum PE counts are the same...
--> Checking that the number of PEs requested is valid...
--> Ensuring absence of dubious h_vmem,h_data,s_vmem,s_data limits...
--> Requesting valid memory configuration (31.3G)...
--> Verifying WORK file-system availability...
--> Verifying HOME file-system availability...
--> Verifying SCRATCH file-system availability...
--> Checking ssh setup...
--> Checking that you didn't request more cores than the maximum...
--> Checking that you don't already have the maximum number of jobs...
--> Checking that you don't already have the maximum number of jobs in queue development...
--> Checking that your time limit isn't over the maximum...
--> Checking available allocation...
--> Submitting job...


Your job 2518464 ("MPI-Airavata-Testing-Script2") has been submitted
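
When the submission is driven from another script, the job ID in the last line is usually the only part of this output that matters. A minimal sketch for pulling it out, assuming the confirmation line always has the "Your job <id> (...) has been submitted" shape shown above (extract_job_id is a hypothetical helper name):

```shell
# Reads qsub output on stdin and prints the numeric job ID taken from the
# "Your job <id> (...) has been submitted" line; prints nothing otherwise.
extract_job_id() {
  sed -n 's/^Your job \([0-9][0-9]*\) .*has been submitted.*$/\1/p'
}

# Demonstration with the confirmation line shown above.
printf '%s\n' '--> Submitting job...' \
  'Your job 2518464 ("MPI-Airavata-Testing-Script2") has been submitted' |
  extract_job_id
```

In practice this could be used as job_id=$(qsub scheduler_sge_job_mpi_helloworld | extract_job_id), assuming the confirmation line is written to standard output.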

5) Use the qstat command to check the status of the job.
ogce@login3.ranger.tacc.utexas.edu:/airavata-test/{17}> qstat
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID 
-----------------------------------------------------------------------------------------------------------------
2518464 0.00000 MPI-Airava ogce         qw    04/19/2012 11:42:01                                   16        
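
In the state column, qw means the job is queued and waiting, and r means it is running. To read the state of one particular job from a script, the listing can be filtered by job ID. The sketch below assumes the column layout shown above, with the state in the fifth whitespace-separated field; job_state is a hypothetical helper name.

```shell
# Reads qstat output on stdin and prints the state of the job whose ID is
# given as the first argument. Returns non-zero if the job is not listed.
job_state() {
  awk -v id="$1" '
    $1 == id { print $5; found = 1 }
    END      { if (!found) exit 1 }'
}

# Demonstration with the line shown above.
printf '%s\n' \
  '2518464 0.00000 MPI-Airava ogce         qw    04/19/2012 11:42:01          16' |
  job_state 2518464
```

A typical call would be qstat | job_state 2518464. A non-zero return value only says that the job is not in the listing, which normally means it has already finished or was never submitted.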

6) The result of the batch job is written to the output file specified with the -o option.
TACC: Setting memory limits for job 2518464 to unlimited KB
TACC: Dumping job script:
--------------------------------------------------------------------------------
#!/bin/bash
# Grid Engine batch job script built by Globus job manager

#$ -S /bin/bash
#$ -V
#$ -pe 16way 16
#$ -N MPI-Airavata-Testing-Script2
#$ -M ***@ogce.org
#$ -m n
#$ -q development
#$ -A TG-STA110014S
#$ -l h_rt=0:09:00
#$ -o /share/home/01437/ogce/airavata-test/mpi-hello.stdout
#$ -e /share/home/01437/ogce/airavata-test/mpi-hello.stderr
ibrun /share/home/01437/ogce/airavata-test/mpi-hellow-world
--------------------------------------------------------------------------------
TACC: Done.
TACC: Starting up job 2518464
TACC: Setting up parallel environment for OpenMPI mpirun.
TACC: Setup complete. Running job script.
TACC: starting parallel tasks...
echo_mpi_output=Hello world from process 7 of 16
echo_mpi_output=Hello world from process 6 of 16
echo_mpi_output=Hello world from process 2 of 16
echo_mpi_output=Hello world from process 4 of 16
echo_mpi_output=Hello world from process 3 of 16
echo_mpi_output=Hello world from process 10 of 16
echo_mpi_output=Hello world from process 13 of 16
echo_mpi_output=Hello world from process 9 of 16
echo_mpi_output=Hello world from process 8 of 16
echo_mpi_output=Hello world from process 0 of 16
echo_mpi_output=Hello world from process 1 of 16
echo_mpi_output=Hello world from process 12 of 16
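
Each MPI process prints exactly one line, and the lines arrive in no particular order, so it can be handy to check which ranks have reported. The following is a sketch only: it assumes every line contains the "Hello world from process <rank> of <size>" text printed by the program, and missing_ranks is a hypothetical helper name.

```shell
# Reads job output on stdin and prints every rank in 0..size-1 for which
# no "Hello world from process <rank> of <size>" line was found.
missing_ranks() {
  awk '
    match($0, /Hello world from process [0-9]+ of [0-9]+/) {
      split(substr($0, RSTART, RLENGTH), w, " ")
      seen[w[5] + 0] = 1      # w[5] is the rank
      size = w[7] + 0         # w[7] is the total number of processes
    }
    END {
      for (r = 0; r < size; r++)
        if (!(r in seen)) print r
    }'
}
```

It could be run as missing_ranks < /share/home/01437/ogce/airavata-test/mpi-hello.stdout. Applied to only the twelve lines shown above, it would print 5, 11, 14 and 15.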

[1] - http://www.tacc.utexas.edu/user-services/user-guides/ranger-user-guide
