Discussion:
change in node sharing with new(er) version?
Michael Colonno
2014-09-26 18:32:30 UTC
Permalink
Hi All ~

I just upgraded a cluster several versions (from 2.5.2 to 14.03.8); no changes were made to the config file (slurm.conf). Prior to the upgrade the cluster was configured to allow more than one job to run on a given node (specifying cores, memory, etc.). After the upgrade, all jobs seem to be allocated as if they require exclusive nodes (as if the --exclusive flag had been used) and don't appear to be sharing nodes. I'm guessing there was a change in the config file syntax for resource allocation, but I can't find anything in the docs. Any thoughts?
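
For reference, here is what I've been looking at to confirm the behavior (I'm assuming sinfo's node-oriented output and scontrol show config are the right places to check):

# confirm which scheduler / selection plugins are actually in effect
scontrol show config | grep -E -i "SchedulerType|SelectType"

# per-node view of allocated vs. total CPUs, memory, and node state
sinfo -N -o "%N %C %m %T"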

Thanks,
~Mike C.
Morris Jette
2014-09-26 18:43:35 UTC
Permalink
I can't think of any relevant changes. Your config files would help a lot.
--
Sent from my Android phone with K-9 Mail. Please excuse my brevity.
Michael Colonno
2014-09-26 18:48:41 UTC
Permalink
Relevant portion of the config file is below; it's pretty vanilla, and after some more time spent debugging I don't think it's the cause. The nodes in question are in state “drng”, which I have not seen before. sinfo reports “Low RealMemory” for them, and this must be why additional jobs aren't being scheduled on the offending nodes. So it seems there have been some changes in resource monitoring. Prior to the upgrade more than one job would coexist on these systems without this warning (and may sometimes have been fighting for memory; unknown).



Thanks,

~Mike C.



# SCHEDULING
SchedulerType=sched/backfill
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory
FastSchedule=1
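
As a next step I'm going to compare the RealMemory value Slurm has for these nodes against what they actually report; I'm assuming slurmd's -C option (which prints the configuration slurmd detects) and free are the right things to compare:

# on a drained node: print the node configuration slurmd detects,
# including a RealMemory value to compare with slurm.conf
slurmd -C

# OS view of total / free memory in MB
free -m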





Michael Colonno
2014-09-26 18:53:47 UTC
Permalink
A bit more data: users are requesting both an allocation of cores and memory when submitting jobs, but there is no guarantee (that I'm aware of) that the application is actually limited to the memory requested. Could this be the root cause of this state?
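
For what it's worth, we are not confining jobs with cgroups on this cluster; my (possibly wrong) reading of the docs is that actually limiting a job to its requested memory would look roughly like this:

# slurm.conf (sketch): use cgroup-based task containment
TaskPlugin=task/cgroup

# cgroup.conf (sketch): confine each job to its requested cores and memory
CgroupAutomount=yes
ConstrainCores=yes
ConstrainRAMSpace=yes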



Thanks,

~Mike C.



Michael Colonno
2014-09-26 20:57:40 UTC
Permalink
After running a bunch of tests, it seems the problem is that, as of the upgrade, the memory requested with sbatch / srun is being treated as a hard limit. If a job process exceeds this amount, even for an instant, the job is (essentially) killed and the node is put into either a “drain” or “drng” state. To restore the previous behavior I need a set of srun / sbatch / slurm.conf options that equate to: use the requested memory for scheduling purposes, but allow processes to overrun / share memory (though not cores). The Shared option in the partition definition doesn't seem able to do this, based on the online docs I could find.
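
The closest thing I've come up with so far is to stop treating memory as a consumable resource and schedule on cores only; whether this really restores the old behavior is an assumption on my part, and the per-CPU numbers below are just placeholders:

# slurm.conf (sketch): schedule on cores only, so requested memory acts as
# a scheduling hint rather than an enforced consumable resource
SelectType=select/cons_res
SelectTypeParameters=CR_Core

# optionally still steer placement with per-CPU memory defaults / ceilings
DefMemPerCPU=4096
MaxMemPerCPU=8192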



Thanks,

~Mike C.



Michael Colonno
2014-09-26 22:00:38 UTC
Permalink
It seems that, for whatever reason, SLURM isn't tracking memory properly. Certain nodes keep going into the “drain” state after any job is submitted, even though no memory is actually being used. Example:



# scontrol show node f1
NodeName=cv-hpcf1 Arch=x86_64 CoresPerSocket=8
   CPUAlloc=0 CPUErr=0 CPUTot=16 CPULoad=0.01 Features=(null)
   Gres=(null)
   NodeAddr=f1 NodeHostName=f1 Version=14.03
   OS=Linux RealMemory=129023 AllocMem=0 Sockets=2 Boards=1
   State=IDLE+DRAIN ThreadsPerCore=1 TmpDisk=0 Weight=1
   BootTime=2014-09-26T08:56:17 SlurmdStartTime=2014-09-26T09:08:51
   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
   Reason=Low RealMemory



It seems to think the memory is low even though none is allocated. Not sure how to proceed here…
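
My working theory is that “Low RealMemory” means the RealMemory configured for the node is higher than what slurmd actually detects. If that turns out to be the case, I assume the fix is something like lowering RealMemory in the NodeName definition and then resuming the node (the value below is a placeholder):

# slurm.conf (sketch; the RealMemory value is a placeholder): keep it at
# or below what the node actually reports
NodeName=cv-hpcf1 Sockets=2 CoresPerSocket=8 ThreadsPerCore=1 RealMemory=128000

# push the change out and clear the drain state
scontrol reconfigure
scontrol update NodeName=cv-hpcf1 State=RESUME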



Thanks,

~Mike C.
