Satrajit Ghosh
2014-07-16 16:38:35 UTC
hi folks,
we are trying to set up a cluster for mixed usage. thus far we have had
two slurm partitions (all_nodes, interactive); interactive at present
contains a single node that is also part of all_nodes.
---
PartitionName=all_nodes Default=YES MinNodes=1 AllowGroups=ALL Priority=1
DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=FORCE:4 GraceTime=0
ReqResv=NO PreemptMode=GANG State=UP Nodes=node[001-030]
PartitionName=interactive Default=NO MinNodes=1 MaxNodes=1
DefaultTime=01:00:00 MaxTime=01:00:00 AllowGroups=ALL Priority=10
DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0
MaxCPUsPerNode=32 ReqResv=NO PreemptMode=GANG State=UP Nodes=node017
---
what we are trying to achieve is a balance between overall cluster
utilization and responsiveness for interactive jobs.
are there ways in which we can balance these two goals effectively?
this would be our list of constraints:
1. compute resources are time-sliced across jobs. (this is already the
case, but doesn't appear to be compatible with constraint #2)
2. an interactive job request should get priority and exclusive access
within at most one time-slicing window (we are using the default 30s),
independent of the number of jobs running on the node.
3. we would like to control the max number of slots an interactive job
could ask for.
4. we would like these partitions to overlap. i.e. we don't want to carve
out compute resources for the interactive partition.
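for context, one direction we have been eyeing (just an untested sketch
based on the slurm.conf man page, not something we have deployed) is
partition-priority preemption, so an interactive job suspends all_nodes
jobs on its node instead of time-slicing alongside them:
---
# untested sketch, slurm.conf -- names/values from the man page
PreemptType=preempt/partition_prio
SchedulerTimeSlice=30   # seconds; the gang-scheduling window

# all_nodes jobs can be suspended by a higher-priority partition
PartitionName=all_nodes Default=YES Priority=1 Shared=FORCE:4
PreemptMode=SUSPEND,GANG State=UP Nodes=node[001-030]

# interactive preempts rather than time-slices; MaxCPUsPerNode
# still caps the slots an interactive job can request
PartitionName=interactive Default=NO Priority=10 MaxTime=01:00:00
Shared=NO PreemptMode=OFF MaxCPUsPerNode=32 State=UP
Nodes=node[001-030]
---
no idea yet whether this plays nicely with constraint #1, hence the
question.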
any guidance would be much appreciated. also, these nodes have a 1:12
core-to-memory ratio, so many jobs can be launched and suspended on any
node.
cheers,
satra