Discussion:
srun with OpenMPI, pmi2 plugin or openmpi plugin?
Trey Dockendorf
2014-09-19 21:44:35 UTC
I've been documenting for my users how to move from Torque to SLURM and what that means for running MPI jobs. Based on the SLURM documentation I've come up with the following:

slurm.conf:

MpiDefault=none
MpiParams=ports=30000-39999

Then users run...

OpenMPI:

srun --mpi=openmpi --resv-ports /path/to/executable

MVAPICH2:

srun --mpi=none /path/to/executable

To test this and make sure I'm not giving bad instructions, I've been running small 2-node HPL tests (which also exercise IB), and this is where things go bad:

$ salloc -N2 --ntasks-per-node=32 --cpus-per-task=1 --mem-per-cpu=1900 -p mpi-core32
$ module load gcc openmpi openblas
$ srun --mpi=openmpi --resv-ports $HOME/hpl/bin/openblas_openmpi/xhpl
<LOTS of errors>
>>> Need at least 64 processes for these tests <<<
Then...

$ srun --mpi=pmi2 --resv-ports $HOME/hpl/bin/openblas_openmpi/xhpl

< no errors >
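
For context on the error above: the allocation asks for 2 nodes with 32 tasks each, which is exactly the 64 processes HPL wants, so the message points at how the ranks were launched rather than at the size of the allocation. One plausible reading (an assumption, not confirmed in this thread) is that with --mpi=openmpi the tasks did not see each other as one MPI job. A minimal shell sketch of the arithmetic, with illustrative variable names:

```shell
# Values taken from the salloc line above; the variable names are illustrative.
nodes=2
ntasks_per_node=32
required=64   # from the HPL message "Need at least 64 processes"

total=$((nodes * ntasks_per_node))

if [ "$total" -ge "$required" ]; then
    echo "allocation provides $total tasks, HPL needs $required"
else
    echo "allocation too small: $total tasks, HPL needs $required"
fi
```

Since the allocation is large enough, the difference between the two runs comes down to the --mpi plugin used for the launch.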

Our install of OpenMPI was compiled like so:

../openmpi-1.8.2/configure --prefix=/apps/gcc-4.8.2/openmpi/1.8.2 \
--libdir=/apps/gcc-4.8.2/openmpi/1.8.2/lib64 \
--with-slurm --with-pmi --with-verbs \
--enable-shared --enable-static \
CFLAGS=-m64 CXXFLAGS=-m64 FFLAGS=-m64 FCFLAGS=-m64

The SLURM documentation [1] seems to indicate that the --mpi type should be openmpi. However, I'm finding that if I set MpiDefault=pmi2, I can run both OpenMPI and MVAPICH2 without the "--mpi" or "--resv-ports" arguments.

MVAPICH2 was compiled using "--with-pm=no --with-pmi=slurm".

Is it the case that if OpenMPI is compiled with "--with-pmi" and "--with-slurm" then the pmi2 MPI plugin should be used?

Is "--resv-ports" necessary given how OpenMPI was compiled?

Thanks,
- Trey

[1] http://slurm.schedmd.com/mpi_guide.html#open_mpi

=============================

Trey Dockendorf
Systems Analyst I
Texas A&M University
Academy for Advanced Telecommunications and Learning Technologies
Phone: (979)458-2396
Email: treydock-mRW4Vj+***@public.gmane.org
Jabber: treydock-mRW4Vj+***@public.gmane.org
Carlos Bederián
2014-09-19 22:51:37 UTC
AFAIK you don't need resv-ports with OpenMPI PMI2 (it works for us anyway),
and you can also set the SLURM_MPI_TYPE environment variable in your MPI
environment modules so users can run "srun /path/to/executable" whether
it's OpenMPI or MVAPICH2.
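
As a rough sketch of the environment-variable approach described above, this is the shell equivalent of what an MPI environment module might set. It assumes both MPI builds work with the pmi2 plugin, as reported in this thread; the echo is only there to show the value.

```shell
# Sketch of what an MPI environment module could export. Assumes the
# site's OpenMPI and MVAPICH2 builds both work with Slurm's pmi2 plugin,
# as reported in this thread.
export SLURM_MPI_TYPE=pmi2

# srun treats SLURM_MPI_TYPE like --mpi=<type>, so after loading the
# module a plain launch should be enough:
#   srun /path/to/executable
echo "SLURM_MPI_TYPE=${SLURM_MPI_TYPE}"
```

In a Tcl modulefile the equivalent line would be "setenv SLURM_MPI_TYPE pmi2". An explicit --mpi option on the srun command line still takes precedence over the variable.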
--
Carlos S. Bederián
Instituto de Física Enrique Gaviola - CONICET
Medina Allende S/N, Ciudad Universitaria
X5000HUA Córdoba, Argentina
Trey Dockendorf
2014-09-20 01:08:35 UTC
Thanks for confirming. The idea of setting the environment variable is a good one, thanks!

- Trey

