Discussion:
Jobs stuck in RUNNING state after batch script completes
Poole, Ruth J.
2014-07-16 17:55:35 UTC
I just installed version 14.03.3-2 on our Cray XMT, built using the --enable-front-end option.
Since the new install, jobs submitted with sbatch remain in the queue in the RUNNING state indefinitely, long after the submitted script has exited.

I did notice that the slurm.conf generated by the new build has
SelectType=select/cons_res
SelectTypeParameters=CR_Core

Where our old version had
SelectType=select/linear

The following option remains unchanged from our previous version
ProctrackType=proctrack/pgid

I suspect the new configuration uses a different trigger for job completion, but I'm not sure how to go back to marking the job completed when the script exits.
Interactive allocations made with salloc do end properly when the shell is exited.