Mike Johnson
2014-06-26 21:49:34 UTC
Hi
I have a question regarding splitting resources on a single node
across two partitions.
We have a number of nodes that each contain 24 cores and 4 GPUs. What
I'd like is one of the following two options:
1. One partition. CPU-only jobs are scheduled on a GPU node only until
16 of its cores are in use. This leaves the other 8 cores free for
jobs that need a GPU.
2. Two partitions, one CPU and one GPU. On each GPU node, 16 cores are
allocated to the CPU partition and the remaining 8 cores and 4 GPUs
are allocated to the GPU partition.
It gets more complex, though. Some nodes have 32, 48 or 160 cores and
no GPUs. Obviously these would be in the CPU partition only. A single
partition-wide limit of 16 CPUs per node would waste most of the cores
on these nodes, so the limit would have to be defined on a per-node
basis. Is it possible to have this level of control?
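The per-node split described above can be sketched as plain arithmetic. This is an illustration only, not Slurm configuration: the `Node` class, the two helper functions, and the node names are all hypothetical, and the 8-core reservation is taken from the 24-core GPU nodes in the question.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical description of one compute node."""
    name: str
    cores: int
    gpus: int = 0


# Cores held back for GPU jobs on any node that has GPUs
# (8 of the 24 cores in the question). Assumed to be the same on every GPU node.
GPU_RESERVED_CORES = 8


def cpu_partition_limit(node: Node) -> int:
    """Cores that CPU-only jobs may occupy on this node."""
    if node.gpus > 0:
        return max(0, node.cores - GPU_RESERVED_CORES)
    return node.cores  # no GPUs, so nothing needs to be held back


def gpu_partition_limit(node: Node) -> int:
    """Cores kept available for GPU jobs on this node."""
    return min(node.cores, GPU_RESERVED_CORES) if node.gpus > 0 else 0


if __name__ == "__main__":
    # Made-up node names covering the four node sizes mentioned.
    nodes = [
        Node("gpu01", cores=24, gpus=4),
        Node("cpu01", cores=32),
        Node("cpu02", cores=48),
        Node("big01", cores=160),
    ]
    for n in nodes:
        print(f"{n.name}: cpu={cpu_partition_limit(n)} gpu={gpu_partition_limit(n)}")
```

The point of the sketch is that the CPU limit depends on the node, not on the partition: only the GPU nodes are capped at 16, while the larger CPU-only nodes keep all of their cores.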
Has anyone got any tips? I think it should be possible; I just don't
know how I'd go about defining this.
Thanks!
Mike