Discussion:
Interactive job array
Julien Collas
2014-07-17 08:01:05 UTC
Hi,

How would you run an interactive job array? By interactive, I mean
that the command only exits once the whole array has finished.

Regards,
Julien
David Bigagli
2014-07-17 17:29:33 UTC
Like this?

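#!/bin/sh

# Submit the array; sbatch prints "Submitted batch job <id>".
jid=`sbatch -o /dev/null --array=1-2 sleepme 30 | awk '{print $4}'`

# Poll until squeue lists no remaining task for the array.
while :
do
    t=`squeue --noheader -o %t -j "$jid"`
    if [ -n "$t" ]; then
        echo "array $jid still around..."
        sleep 2
    else
        break
    fi
done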
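
The polling loop above can also be wrapped in a small function so that it is reusable from other scripts. This is only a sketch: wait_for_array is a hypothetical helper name, the default 2-second interval is taken from the script above, and treating a squeue error as "array finished" is an assumption that may not suit every site.

```shell
#!/bin/sh
# Sketch: block until no task of a job array is left in the queue.
# wait_for_array is a hypothetical helper, not part of Slurm.

wait_for_array() {
    jid=$1
    interval=${2:-2}   # seconds between polls (assumed default)
    while :
    do
        # One state code per remaining task; empty once the array is gone.
        # Assumption: a squeue error (e.g. unknown job id) also means "done".
        states=$(squeue --noheader -o %t -j "$jid" 2>/dev/null)
        [ -z "$states" ] && break
        sleep "$interval"
    done
}
```

It would be called with the id captured from sbatch, for example: wait_for_array "$jid" 5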
--
Thanks,
/David/Bigagli

www.schedmd.com
Nicolas GRANDEMANGE
2014-07-18 16:32:03 UTC
Hi Julien,

I believe you can retrieve the job id (like David did) and then use the
'afterany' dependency with a dummy 'true' command:

bash$ JID=`sbatch --array=1-1000 -o /dev/null test.sh | awk '{print $4}'`
bash$ srun -d "afterany:$JID" true
srun: job 788910 queued and waiting for resources
srun: job 788910 has been allocated resources
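
The two commands above can be combined into one wrapper that returns only when the whole array has ended. This is a sketch under assumptions: submit_and_wait is a hypothetical name, its arguments are handed to sbatch unchanged, and the job id is taken from the fourth field of sbatch's output exactly as in the commands above.

```shell
#!/bin/sh
# Sketch: submit a job array, then block until every task has ended.
# submit_and_wait is a hypothetical wrapper; "$@" is passed to sbatch as-is.

submit_and_wait() {
    # sbatch prints "Submitted batch job <id>"; the id is the 4th field.
    jid=$(sbatch "$@" | awk '{print $4}')
    if [ -z "$jid" ]; then
        echo "submission failed" >&2
        return 1
    fi
    echo "submitted array $jid" >&2
    # afterany: start once all tasks have terminated, whatever their status.
    # 'true' does nothing; srun simply blocks until the dependency is met.
    srun -d "afterany:$jid" true
}
```

For example: submit_and_wait --array=1-1000 -o /dev/null test.sh
Note that the wrapper's exit status is that of the dummy srun, not of the array tasks.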

However, I don't think there is an easy way to get an aggregated return code,
as there is with the Sun Grid Engine qsub '-sync y' option:

man qsub
    If -sync y is used in conjunction with -t n[-m[:i]],
    qsub will wait for all the job's tasks to complete
    before exiting. If all the job's tasks complete
    successfully, qsub's exit code will be that of the first
    completed job tasks with a non-zero exit code, or 0 if
    all job tasks exited with an exit code of 0
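
One possible workaround, where job accounting is enabled, is to read the per-task exit codes back from sacct once the array has ended and reduce them to one value. The sketch below is an assumption-laden illustration, not an equivalent of '-sync y': max_exit_code is a hypothetical helper, it returns the highest exit code rather than that of the first failing task, and it expects "JobID|ExitCode" lines as printed by sacct with --parsable2.

```shell
#!/bin/sh
# Sketch: reduce per-task exit codes to a single value.
# Input: lines "JobID|ExitCode" where ExitCode is "code:signal",
# the shape printed by sacct with --parsable2 --format=JobID,ExitCode.
# max_exit_code is a hypothetical helper, not a Slurm command.

max_exit_code() {
    awk -F'|' '
        {
            split($2, parts, ":")          # parts[1] = exit code
            if (parts[1] + 0 > max) max = parts[1] + 0
        }
        END { print max + 0 }              # prints 0 for empty input
    '
}

# Possible use after the array has ended (requires accounting):
#   rc=$(sacct -X -n -P -j "$JID" --format=JobID,ExitCode | max_exit_code)
#   exit "$rc"
```

Accounting records may lag slightly behind job completion, so a short retry may be needed. Later Slurm releases also document an sbatch --wait option that blocks until the job ends and, for an array, reports the highest task exit code; whether it is available depends on the installed version.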

Regards
--
Nicolas Grandemange
Danny Auble
2014-07-18 17:03:36 UTC
Would running steps (multiple sruns) inside of an allocation give you
what you are looking for?
Julien Collas
2014-07-22 09:40:32 UTC
Hello,

Nicolas' answer is exactly what I was looking for. Thanks for your
feedback.

Regards,


Julien