Discussion:
Bug in displaying nodes for pending jobs with multiple CPUs per task
Jesse Stroik
2014-08-25 18:00:20 UTC
We noticed an inconsistency in the number of nodes Slurm reports that
pending jobs are expected to use. My understanding is that the number
should be calculated from the maximum node size (in CPUs) and the number
of CPUs a job needs.

A quick and dirty review of the code makes it look like min_nodes is
calculated from num_tasks alone and does not take into account the number
of CPUs per task. I could be missing something, however.

Many of our jobs are hybrid MPI + OpenMP. For example, a job might
use 4, 5 or even 10 CPUs per task. As a result, the minimum node count
shown for a pending job appears to be too low by that same factor.
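
To illustrate the arithmetic I have in mind, here is a rough sketch in Python. It is not Slurm's code: the function names, the 32-task job and the 16-CPU node size are all made up for the example, and it ignores the fact that a single task cannot be split across nodes.

```python
def ceil_div(a, b):
    """Integer ceiling division for positive integers."""
    return (a + b - 1) // b


def min_nodes_tasks_only(num_tasks, max_cpus_per_node):
    # What the display appears to do: treat every task as one CPU.
    return ceil_div(num_tasks, max_cpus_per_node)


def min_nodes_with_cpus_per_task(num_tasks, cpus_per_task, max_cpus_per_node):
    # What I would expect: count the total CPUs the job needs.
    total_cpus = num_tasks * cpus_per_task
    return ceil_div(total_cpus, max_cpus_per_node)


# Hypothetical hybrid job: 32 tasks, 4 CPUs per task, 16-CPU nodes.
print(min_nodes_tasks_only(32, 16))             # 2
print(min_nodes_with_cpus_per_task(32, 4, 16))  # 8
```

With these made-up numbers the two results differ by a factor of 4, which is the CPUs-per-task value, matching what we see.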

Best,
Jesse Stroik
