I think modifying the init scripts is likely to be the only way:
When I built my own version of slurm 14.03 on ubuntu 10.04 I installed
both slurm and munge on an NFS filesystem to be sure that slurm.conf was
identical across the cluster. This meant that the default init.d scripts
would fail, as they would always try to start before the
/store/cluster/apps filesystem had been mounted. The way I fixed this
was to create an upstart job for munge (which I then use to trigger
slurm) that is started by the "remote-filesystems" event AND polls to
see whether the directory exists yet, only starting munge once it is a
valid path. You can do exactly the same thing to test for /dev/nvidia0.
Since all of the polling is done in my munge upstart script and not in
slurm, here is my /init/munge.conf.
Please note that this is the first upstart script I ever wrote, so I'm
not claiming this is the best way, only that it works; I've not even
gone back and cleaned it up.
--------------------------------------------------------------------------------
# Munge (My custom build)
#
description "Munge (My custom build for slurm)"
start on remote-filesystems
stop on runlevel [06S]
respawn
pre-start script
    # Paths mirror the configure-time prefix of my custom munge build
    prefix="/store/cluster/apps/munge/gcc"
    exec_prefix="${prefix}"
    sbindir="${exec_prefix}/sbin"
    sysconfdir="${prefix}/etc"
    localstatedir="${prefix}/var"
    DAEMON="$sbindir/munged"
    RETRYCOUNT=10
    RETRYDELAY=10

    logger -is -t "$UPSTART_JOB" "checking prefix ${prefix}"
    mkdir -p /var/run/munge

    # Wait for each required NFS directory to be mounted, retrying
    # RETRYCOUNT times with RETRYDELAY seconds between attempts
    for dir in /home/share /store/cluster/apps/munge ; do
        logger -is -t "$UPSTART_JOB" "checking dir \"$dir\" exists"
        mycount=0
        logger -is -t "$UPSTART_JOB" "RETRYCOUNT=$RETRYCOUNT and mycount=$mycount"
        while [ $mycount -lt ${RETRYCOUNT} ] ; do
            mycount=`expr $mycount + 1`
            if [ -d "$dir" ]
            then
                logger -is -t "$UPSTART_JOB" "$dir exists! lets go!"
                break
            else
                logger -is -t "$UPSTART_JOB" "WARNING: Required remote DIR \"$dir\" not yet mounted, waiting ${RETRYDELAY} seconds to retry (attempt ${mycount} of ${RETRYCOUNT})"
                sleep $RETRYDELAY
            fi
        done
        # If the directory still isn't there after all the retries, stop the job
        if [ ! -d "$dir" ]
        then
            logger -is -t "$UPSTART_JOB" "$dir does not exist, giving up!"
            stop
        fi
    done
end script
expect daemon
exec /store/cluster/apps/munge/gcc/sbin/munged 2>&1
--------------------------------------------------------------------------------------------------------------
I start slurm once munged has started, using the "start on started munge"
upstart directive.
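For slurmd itself, something along these lines should do the trick. This
one is only an untested sketch: the slurmd path, the retry count and the
delay are placeholders for whatever your install uses.
--------------------------------------------------------------------------------
# slurmd (sketch, untested) - waits for munge and for the GPU device file
description "slurmd (started after munge, waits for /dev/nvidia0)"
start on started munge
stop on runlevel [06S]
respawn
pre-start script
    RETRYCOUNT=10
    RETRYDELAY=10
    mycount=0
    # Poll for the device file created by the nvidia kernel module
    while [ $mycount -lt ${RETRYCOUNT} ] ; do
        mycount=`expr $mycount + 1`
        if [ -e /dev/nvidia0 ]
        then
            logger -is -t "$UPSTART_JOB" "/dev/nvidia0 exists, starting slurmd"
            exit 0
        fi
        logger -is -t "$UPSTART_JOB" "WARNING: /dev/nvidia0 not present yet, waiting ${RETRYDELAY} seconds (attempt ${mycount} of ${RETRYCOUNT})"
        sleep $RETRYDELAY
    done
    logger -is -t "$UPSTART_JOB" "/dev/nvidia0 never appeared, giving up"
    stop
end script
# Run slurmd in the foreground (-D) so upstart can track it directly;
# if you let it daemonize instead you will need the right "expect" stanza.
exec /usr/sbin/slurmd -D
--------------------------------------------------------------------------------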
Hopefully this is a useful example.
Antony
Post by Lev Givon
Post by Andy Riebs
Post by Lev Givon
I recently set up slurm 2.6.5 on a cluster of Ubuntu 14.04.1 systems hosting several
NVIDIA GPUs set up as generic resources. When the compute nodes are rebooted, I
noticed that they attempt to start slurmd before the device files initialized by
the nvidia kernel module appear, i.e., the following message appears in syslog
some number of lines before the GPU kernel driver load messages.
slurmd[1453]: fatal: can't stat gres.conf file /dev/nvidia0: No such file or directory
Is there a recommended way (on Ubuntu, at least) to ensure that slurmd isn't
started before any GPU device files appear?
One way to work around this is to set the node definition(s) in
slurm.conf with "State=DOWN". That way, manual intervention will be
required when a node is rebooted, allowing the rest of the system to
finish coming up.
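For example, a node entry along these lines (the node name and GPU count
here are made up, adjust to your own configuration):
--------------------------------------------------------------------------------
# Hypothetical compute node entry in slurm.conf: the node stays DOWN after
# a reboot until an admin resumes it, e.g. with
#   scontrol update NodeName=node01 State=RESUME
# once the GPU device files are present.
NodeName=node01 Gres=gpu:2 State=DOWN
--------------------------------------------------------------------------------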
Not sure how the above suggestion remedies the problem; as things stand,
I already need to manually start slurmd on the compute nodes after a
reboot because the absence of the device files prevents the daemon from starting.
Perhaps I should have phrased my question differently: is there a
recommended method on Ubuntu for ensuring that slurmd starts only after the GPU
device files appear if a GPU generic resource has been defined in a node's SLURM
configuration? One possibility that I'll try if no other solutions present
themselves involves modifying the init.d startup script to poll for the device
files if a GPU resource exists, but I'm curious whether there are any existing
fixes given that SLURM packages for Ubuntu have already existed for several years.