Discussion:
All cores being allocated / -n ignored
Gordon Wells
2014-07-30 06:53:35 UTC
Permalink
Hi

I get all the CPUs on a node allocated to a job, even when I request fewer.
This is a relatively new Slurm setup, but it uses basically the same
configuration as an older setup that worked correctly.

My slurm.conf looks as follows:
ClusterName=eslab
ControlMachine=riddley
SlurmUser=slurm
SlurmctldPort=6817
SlurmdPort=6818
AuthType=auth/munge
StateSaveLocation=/tmp
SlurmdSpoolDir=/tmp/slurmd
SwitchType=switch/none
MpiDefault=none
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmdPidFile=/var/run/slurmd.pid
ProctrackType=proctrack/pgid
CacheGroups=0
ReturnToService=0

GresTypes=gpu

SlurmctldTimeout=300
SlurmdTimeout=300
InactiveLimit=0
MinJobAge=300
KillWait=30
Waittime=0
SchedulerType=sched/backfill
SelectType=select/linear
FastSchedule=1
SlurmctldDebug=3
SlurmdDebug=3
JobCompType=jobcomp/none
NodeName=riddley CPUs=4 Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN Gres=gpu:1
PartitionName=debug Nodes=riddley Default=YES MaxTime=INFINITE State=UP

and in the batch file:
#SBATCH -J NAG_int_tip3p_rep2
#SBATCH -o NAG_int_tip3p_rep2.out
#SBATCH -e NAG_int_tip3p_rep2.err
#SBATCH -n 2
#SBATCH -p debug
#SBATCH -D /home/gordon/cpgh89/autodock/NAG_DNAP
#SBATCH -w riddley

Can anyone explain what I'm doing wrong in this setup?
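
(For reference, I'm checking the allocation with something like the command
below, where 1234 stands in for the actual job ID; NumCPUs comes back as 4
even though the job only asked for -n 2.)

scontrol show job 1234 | grep -E 'NumNodes|NumCPUs'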




-- max(∫(εὐδαιμονία)dt)
Ryan Cox
2014-07-30 14:18:34 UTC
Permalink
Shared=No is the default for a partition (slurm.conf manpage). That
might have something to do with it.
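
If that turns out to be the cause, the Shared option goes on the partition
line; a rough sketch, untested on my end (and note that with select/linear
it only controls whole-node sharing, not per-CPU allocation):

PartitionName=debug Nodes=riddley Default=YES MaxTime=INFINITE State=UP Shared=YES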

Ryan
--
Ryan Cox
Operations Director
Fulton Supercomputing Lab
Brigham Young University
j***@public.gmane.org
2014-07-30 14:56:32 UTC
Permalink
This configuration will always allocate all CPUs on a node to jobs:
SelectType=select/linear
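
If you want CPUs allocated individually, the consumable resources plugin is
the usual alternative. A minimal sketch; CR_Core is just one of the possible
SelectTypeParameters values, so adjust to taste:

SelectType=select/cons_res
SelectTypeParameters=CR_Core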
--
Morris "Moe" Jette
CTO, SchedMD LLC

Slurm User Group Meeting
September 23-24, Lugano, Switzerland
Find out more http://slurm.schedmd.com/slurm_ug_agenda.html
Gordon Wells
2014-07-31 12:04:37 UTC
Permalink
Thanks, I missed that setting. I'm using SelectType=select/cons_res now
instead, and that has fixed the problem.
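
(Checked with something along these lines, the job ID being a placeholder;
the job now shows only the two requested CPUs:)

squeue -j 1234 -o '%.10i %.6C %.10N'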


-- max(∫(εὐδαιμονία)dt)