srun with OpenMPI, pmi2 plugin or openmpi plugin?
Trey Dockendorf
2014-09-19 21:44:35 UTC
I've been documenting for my users how to move from Torque to SLURM and what that means for running MPI jobs. Based on the SLURM documentation I've come up with the following:

$ slurm.conf


Then users run...


srun --mpi=openmpi --resv-ports /path/to/executable


srun --mpi=none /path/to/executable

To test this and make sure I'm not giving bad instructions, I've been running small 2-node HPL tests (which also exercise IB), and this is where things go bad:

$ salloc -N2 --ntasks-per-node=32 --cpus-per-task=1 --mem-per-cpu=1900 -p mpi-core32
$ module load gcc openmpi openblas
$ srun --mpi=openmpi --resv-ports $HOME/hpl/bin/openblas_openmpi/xhpl
<LOTS of errors>
Need at least 64 processes for these tests <<<

$ srun --mpi=pmi2 --resv-ports $HOME/hpl/bin/openblas_openmpi/xhpl

< no errors >

Our install of OpenMPI was compiled like so:

../openmpi-1.8.2/configure --prefix=/apps/gcc-4.8.2/openmpi/1.8.2 \
--libdir=/apps/gcc-4.8.2/openmpi/1.8.2/lib64 \
--with-slurm --with-pmi --with-verbs \
--enable-shared --enable-static \

The SLURM documentation [1] seems to indicate that the --mpi type should be openmpi. I'm finding, though, that if I set MpiDefault=pmi2 in slurm.conf, I can run both OpenMPI and MVAPICH2 without the "--mpi" or "--resv-ports" arguments.
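
A minimal sketch for checking which MpiDefault a slurm.conf-style file sets. The helper name mpi_default and the throwaway file are assumptions made for illustration; they are not part of SLURM, and the real slurm.conf location varies by site.

```shell
# Hypothetical helper: print the MpiDefault value from a slurm.conf-style file.
mpi_default() {
    # Use the last uncommented MpiDefault= line and keep only the value.
    grep -i '^[[:space:]]*MpiDefault=' "$1" | tail -n 1 | cut -d= -f2
}

# Demonstrate against a temporary file rather than a real slurm.conf.
conf=$(mktemp)
printf 'ClusterName=example\nMpiDefault=pmi2\n' > "$conf"
echo "MpiDefault is: $(mpi_default "$conf")"
rm -f "$conf"
```

On a running cluster, "scontrol show config" should report the same setting as seen by the controller.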

MVAPICH2 was compiled using "--with-pm=no --with-pmi=slurm".

Is it the case that if OpenMPI is compiled with "--with-pmi" and "--with-slurm" then the pmi2 MPI plugin should be used?

Is "--resv-ports" necessary given how OpenMPI was compiled?

- Trey

[1] http://slurm.schedmd.com/mpi_guide.html#open_mpi


Trey Dockendorf
Systems Analyst I
Texas A&M University
Academy for Advanced Telecommunications and Learning Technologies
Phone: (979)458-2396
Email: treydock-mRW4Vj+***@public.gmane.org
Jabber: treydock-mRW4Vj+***@public.gmane.org
Carlos Bederián
2014-09-19 22:51:37 UTC
AFAIK you don't need resv-ports with OpenMPI PMI2 (it works for us anyway),
and you can also set the SLURM_MPI_TYPE environment variable in your MPI
environment modules so users can run "srun /path/to/executable" whether
it's OpenMPI or MVAPICH2.
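
A minimal sketch of the environment-variable approach, assuming pmi2 is the right plugin for the loaded MPI stack. SLURM_MPI_TYPE is srun's environment equivalent of --mpi; in practice the assignment would live in the MPI modulefile rather than in the user's shell, and the echo line only stands in for the actual launch.

```shell
# Stand-in for what the MPI environment module would set on "module load".
# srun reads SLURM_MPI_TYPE as if --mpi=<value> had been given.
export SLURM_MPI_TYPE=pmi2

# With the variable exported, the launch line needs no --mpi option.
echo "srun /path/to/executable  (SLURM_MPI_TYPE=$SLURM_MPI_TYPE)"
```

An explicit --mpi option on the srun command line still overrides the variable, so users can opt out per job.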
Carlos S. Bederián
Instituto de Física Enrique Gaviola - CONICET
Medina Allende S/N, Ciudad Universitaria
X5000HUA Córdoba, Argentina
Trey Dockendorf
2014-09-20 01:08:35 UTC
Thanks for confirming. The idea of setting the environment variable is a good one, thanks!

- Trey


