Bjørn-Helge Mevik
2014-08-26 15:43:31 UTC
(Since sgather is in contrib, and I found no contact address in it, I
post the report here.)
sgather in slurm 14.04.1 has a bug that is triggered when nodes are set
up with different NodeName and NodeHostname (and hostname(1) returns the
NodeHostname). Changing

    nodelist=$($SRUN --ntasks=$SLURM_NNODES --ntasks-per-node=1 -l hostname) || exit $?
    nodelist=$(echo "$nodelist" | cut -d ' ' -f 2 | sort)

into

    nodelist=$($SCONTROL show hostnames "$SLURM_NODELIST" | sort)

should fix it (I am not sure if sort is even needed). It should also be
slightly more efficient, since it does not launch a job step.
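To illustrate the proposed change, here is a minimal sketch that wraps the node list expansion in a function. The name expand_nodelist is my own and not something in sgather; I am also assuming that $SCONTROL holds the path to scontrol, as the script's existing variable suggests.

```shell
#!/bin/bash
# Sketch only: expand_nodelist is a hypothetical helper, not part of sgather.
# SCONTROL is assumed to point at the scontrol binary.
SCONTROL=${SCONTROL:-scontrol}

expand_nodelist() {
    # "scontrol show hostnames" expands a hostlist expression such as
    # n[1-3] into one name per line. It works on the names in the list
    # itself, so it never asks the nodes what hostname(1) reports.
    "$SCONTROL" show hostnames "$1" | sort
}

# Intended use inside a job allocation (commented out so the sketch
# can be sourced outside of Slurm):
# nodelist=$(expand_nodelist "$SLURM_NODELIST")
```

Since the names come straight from SLURM_NODELIST, the result should not depend on what hostname(1) returns on each node.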
It would also be nice if the node-global destinations could be
configurable, instead of being hard-coded in the script (or at least be
set at the top of the script). For instance, on our system, the
node-global file systems are /work and /cluster, not /scratch and /home.
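One possible way to do this is sketched below. Both the environment variable SGATHER_GLOBAL_DIRS and the function is_global_dest are hypothetical names of mine, and I have not checked how the script currently tests the destination, so take it as an illustration of the idea rather than a patch.

```shell
#!/bin/bash
# Hypothetical setting near the top of the script; sgather 14.04.1 does
# not define it. Space-separated list of path prefixes that are visible
# on every node. The default keeps the currently hard-coded values.
GLOBAL_DIRS=${SGATHER_GLOBAL_DIRS:-"/scratch /home"}

# Return 0 if the destination path is one of the GLOBAL_DIRS or lies
# below one of them, 1 otherwise.
is_global_dest() {
    local dest=$1 dir
    for dir in $GLOBAL_DIRS; do
        case "$dest" in
            "$dir"|"$dir"/*) return 0 ;;
        esac
    done
    return 1
}
```

On our system one would then set SGATHER_GLOBAL_DIRS="/work /cluster" in the environment, or just edit the default in that one place.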
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo