Discussion: Override memory limits with --exclusive?
Bjørn-Helge Mevik
2014-09-18 10:46:11 UTC
We are running slurm 2.6.9, with the following configs:

SelectType=select/cons_res
SelectTypeParameters=CR_CPU_Memory
TaskPlugin=task/cgroup
ProctrackType=proctrack/cgroup

i.e., we hand out CPUs and memory to jobs, and use cgroups to enforce
the memory limit. Jobs are required to use --mem-per-cpu.
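
For example, a typical submission under this policy would look roughly
like this (the numbers are just placeholders):

sbatch --ntasks=16 --mem-per-cpu=3900M job.sh

With cgroup enforcement, the memory cgroup on each node is then capped
at roughly the number of CPUs allocated there times the --mem-per-cpu
value.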

In some cases, it would be very nice to be able to «override» the memory
limit. Specifically, when a job specifies --exclusive, it would be very
useful if the job were allowed to use more memory than there is RAM on
the node, because some programs use a lot of memory for a short while.
(It would of course be the user's responsibility if the job or node
crashed.)

Does anyone have any idea about how this could be achieved? Some
job_start plugin that disables cgroups if a job specifies --exclusive
(or some other, plugin-implemented switch), perhaps? Has anyone tried
something like this?
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
Christopher Samuel
2014-09-18 14:28:32 UTC
Post by Bjørn-Helge Mevik
Does anyone have any idea about how this could be achieved? Some
job_start plugin that disables cgroups if a job specifies --exclusive
(or some other, plugin-implemented switch), perhaps?
I was going to suggest that you could have a check in your prolog script
that looked for those conditions and set the memory cgroup
*.limit_in_bytes values to -1 to disable enforcement.

But the prolog only gets executed for the first job step inside an
allocation, so you'd need to do something like:

srun /bin/true

in the batch script to ensure it was triggered.
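
Something along these lines in the prolog might do it (untested sketch;
the cgroup path, the Shared=0 test and the SLURM_JOB_UID variable are
assumptions that depend on your CgroupMountpoint, cgroup.conf and Slurm
release):

#!/bin/bash
# Node prolog sketch: lift the memory cgroup limits for jobs submitted
# with --exclusive.  Whether the job's cgroup already exists at this
# point may also depend on your setup.
CGDIR=/sys/fs/cgroup/memory/slurm/uid_${SLURM_JOB_UID}/job_${SLURM_JOB_ID}

# Assumption: exclusive jobs show up as "Shared=0" in scontrol output.
if scontrol show job "$SLURM_JOB_ID" | grep -q 'Shared=0'; then
    # -1 means "no limit"; raise the mem+swap limit before the RAM limit.
    echo -1 > "$CGDIR/memory.memsw.limit_in_bytes" 2>/dev/null
    echo -1 > "$CGDIR/memory.limit_in_bytes"
fi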

Hope this helps!
Chris
--
Christopher Samuel Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: samuel-***@public.gmane.org Phone: +61 (0)3 903 55545
http://www.vlsci.org.au/ http://twitter.com/vlsci
Bjørn-Helge Mevik
2014-09-19 07:06:31 UTC
Thanks for the tip!

We actually already have a setup where "srun
--ntasks=$SLURM_JOB_NUM_NODES /bin/true" is run at the start of every
job, so we're definitely going to look into this.
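
So a batch script would end up looking roughly like this (the numbers
and the program name are just placeholders):

#!/bin/bash
#SBATCH --exclusive
#SBATCH --ntasks=16
#SBATCH --mem-per-cpu=3900M

# Touch every node once so the prolog (and the cgroup override) runs
# everywhere before the real work starts.
srun --ntasks=$SLURM_JOB_NUM_NODES /bin/true

./my_program
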
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo