Discussion:
Sharing GPU memory (gpu_mem)
Sergio Iserte Agut
2012-04-23 13:53:06 UTC
Hello,
I'm trying to configure my Slurm-2.3.2 in order to allow me to run multiple
jobs in the same GPU.

These are my configurations:

*slurm.conf*

SchedulerType=sched/backfill
SelectType=select/linear
GresTypes=gpu,gpu_mem
NodeName=enersis CPUs=1 Sockets=1 CoresPerSocket=1 ThreadsPerCore=1
RealMemory=1006 State=UNKNOWN
NodeName=compute0 NodeAddr=10.0.0.2 CPUs=4 RealMemory=7982 Sockets=1
CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN gres=gpu:1,gpu_mem:512
NodeName=compute1 NodeAddr=10.0.0.3 CPUs=4 RealMemory=7982 Sockets=1
CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN gres=gpu:1,gpu_mem:512
PartitionName=debug Nodes=compute[0-1] Default=YES MaxTime=INFINITE
State=UP
*gres.conf*
Name=gpu File=/dev/nvidia0
Name=gpu_mem Count=512
*$ srun -w"compute0" --gres=gpu:1,gpu_mem:250 sleep 100 &*
*$ srun -w"compute0" --gres=gpu:1,gpu_mem:250 sleep 100 &*

*$ squeue*
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
68 debug sleep root PD 0:00 1 (Resources)
67 debug sleep root R 0:04 1 compute0
I wonder whether it is possible to run both jobs sharing the GPU memory.

Thank you.

Regards!
Sergio Iserte.
Moe Jette
2012-04-23 15:15:05 UTC
The current logic allows a GRES to be allocated to one job at a time;
however, you could develop a new plugin to do what you want. You would
not use gres/gpu, but write a gres/gpu_mem plugin that duplicates a
lot of the code from gres/gpu to set GPU environment variables for
CUDA (the code in src/common/gres.c will already avoid
over-subscribing the gpu_mem count/size).
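As a rough sketch (not taken from this thread), such a plugin would start as a
copy of src/plugins/gres/gpu/gres_gpu.c with the plugin identification strings
changed and the environment-variable logic adjusted. The exact function
signatures and the plugin_version value should be copied from that file for
the Slurm 2.3 API; SLURM_GPU_MEM below is only an illustrative variable name.

/* Sketch of a gres/gpu_mem plugin, modeled on src/plugins/gres/gpu/gres_gpu.c.
 * Copy the exact signatures and plugin_version from that file; the
 * SLURM_GPU_MEM variable and the hard-coded value are placeholders. */
#include <slurm/slurm.h>
#include <slurm/slurm_errno.h>
#include "src/common/env.h"
#include "src/common/list.h"
#include "src/common/log.h"

const char plugin_name[] = "Gres GPU memory plugin";
const char plugin_type[] = "gres/gpu_mem";   /* must match GresTypes in slurm.conf */
const uint32_t plugin_version = 100;         /* use the value from gres_gpu.c */

/* Called by slurmd when it parses gres.conf on the compute node. */
extern int node_config_load(List gres_conf_list)
{
        debug("%s: node_config_load called", plugin_type);
        return SLURM_SUCCESS;
}

/* gres_gpu.c uses this hook to build CUDA_VISIBLE_DEVICES; a gpu_mem
 * plugin could instead export the allocated amount to the job. */
extern void job_set_env(char ***job_env_ptr, void *gres_ptr)
{
        /* Placeholder: the real code would derive the value from gres_ptr. */
        env_array_overwrite(job_env_ptr, "SLURM_GPU_MEM", "250");
        debug("%s: job_set_env called", plugin_type);
}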
Post by Sergio Iserte Agut
Hello,
I'm trying to configure my Slurm-2.3.2 in order to allow me to run multiple
jobs in the same GPU.
*slurm.conf*
SchedulerType=sched/backfill
SelectType=select/linear
GresTypes=gpu,gpu_mem
NodeName=enersis CPUs=1 Sockets=1 CoresPerSocket=1 ThreadsPerCore=1
RealMemory=1006 State=UNKNOWN
NodeName=compute0 NodeAddr=10.0.0.2 CPUs=4 RealMemory=7982 Sockets=1
CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN gres=gpu:1,gpu_mem:512
NodeName=compute1 NodeAddr=10.0.0.3 CPUs=4 RealMemory=7982 Sockets=1
CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN gres=gpu:1,gpu_mem:512
PartitionName=debug Nodes=compute[0-1] Default=YES MaxTime=INFINITE
State=UP
*gres.conf*
Name=gpu File=/dev/nvidia0
Name=gpu_mem Count=512
*$ srun -w"compute0" --gres=gpu:1,gpu_mem:250 sleep 100 &*
*$ srun -w"compute0" --gres=gpu:1,gpu_mem:250 sleep 100 &*
*$ squeue*
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
68 debug sleep root PD 0:00 1 (Resources)
67 debug sleep root R 0:04 1 compute0
I wonder whether it is possible to run both jobs sharing the GPU memory.
Thank you.
Regards!
Sergio Iserte.
Sergio Iserte Agut
2012-04-23 19:26:05 UTC
Thank you for your quick answer, I will get on with it!

Regards!
Post by Moe Jette
The current logic allows a GRES to be allocated to one job at a time,
however you could develop a new plugin to do what you want. You would
not use gres/gpu, but write a gres/gpu_mem plugin that duplicates a
lot of the code from gres/gpu to set GPU environment variables for
CUDA (the code in src/common/gres.c will already avoid
over-subscribing the gpu_mem count/size).
Post by Sergio Iserte Agut
Hello,
I'm trying to configure my Slurm-2.3.2 in order to allow me to run multiple
jobs in the same GPU.
*slurm.conf*
SchedulerType=sched/backfill
SelectType=select/linear
GresTypes=gpu,gpu_mem
NodeName=enersis CPUs=1 Sockets=1 CoresPerSocket=1 ThreadsPerCore=1
RealMemory=1006 State=UNKNOWN
NodeName=compute0 NodeAddr=10.0.0.2 CPUs=4 RealMemory=7982 Sockets=1
CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN gres=gpu:1,gpu_mem:512
NodeName=compute1 NodeAddr=10.0.0.3 CPUs=4 RealMemory=7982 Sockets=1
CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN gres=gpu:1,gpu_mem:512
PartitionName=debug Nodes=compute[0-1] Default=YES MaxTime=INFINITE
State=UP
*gres.conf*
Name=gpu File=/dev/nvidia0
Name=gpu_mem Count=512
*$ srun -w"compute0" --gres=gpu:1,gpu_mem:250 sleep 100 &*
*$ srun -w"compute0" --gres=gpu:1,gpu_mem:250 sleep 100 &*
*$ squeue*
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
68 debug sleep root PD 0:00 1 (Resources)
67 debug sleep root R 0:04 1 compute0
I wonder whether it is possible to run both jobs sharing the GPU memory.
Thank you.
Regards!
Sergio Iserte.
Sergio Iserte Agut
2012-04-25 08:24:03 UTC
Hello,

I have already created the plugin gres/gpu_mem, whose code is almost the
same as that of gres/gpu. But I have been doing tests and I have seen that
the program never enters the functions job_set_env or step_set_env.

And when I look at /var/log/slurm/slurmctld.log, I don't understand these
messages:

slurmctld: debug: Configuration for job 188 complete
slurmctld: debug: gres/gpu_mem: step_test 188.4294967294 gres_bit_alloc is NULL
slurmctld: debug: gres/gpu_mem: step_test 188.4294967294 gres_bit_alloc is NULL
slurmctld: debug: gres/gpu_mem: step_test 188.0 gres_bit_alloc is NULL
slurmctld: debug3: step_layout cpus = 4 pos = 0
slurmctld: debug: laying out the 1 tasks on 1 hosts compute0 dist 1
slurmctld: debug: gres/gpu_mem: step_alloc gres_bit_alloc for 188.0 is NULL
slurmctld: sched: _slurm_rpc_job_step_create: StepId=188.0 compute0 usec=1174
I hope somebody can help me.

Kind regards!

Sergio Iserte.
Thank you for your quick answer, I will get on with it!
Regards!
Post by Moe Jette
The current logic allows a GRES to be allocated to one job at a time,
however you could develop a new plugin to do what you want. You would
not use gres/gpu, but write a gres/gpu_mem plugin that duplicates a
lot of the code from gres/gpu to set GPU environment variables for
CUDA (the code in src/common/gres.c will already avoid
over-subscribing the gpu_mem count/size).
Post by Sergio Iserte Agut
Hello,
I'm trying to configure my Slurm-2.3.2 in order to allow me to run multiple
jobs in the same GPU.
*slurm.conf*
SchedulerType=sched/backfill
SelectType=select/linear
GresTypes=gpu,gpu_mem
NodeName=enersis CPUs=1 Sockets=1 CoresPerSocket=1 ThreadsPerCore=1
RealMemory=1006 State=UNKNOWN
NodeName=compute0 NodeAddr=10.0.0.2 CPUs=4 RealMemory=7982 Sockets=1
CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN gres=gpu:1,gpu_mem:512
NodeName=compute1 NodeAddr=10.0.0.3 CPUs=4 RealMemory=7982 Sockets=1
CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN gres=gpu:1,gpu_mem:512
PartitionName=debug Nodes=compute[0-1] Default=YES MaxTime=INFINITE
State=UP
*gres.conf*
Name=gpu File=/dev/nvidia0
Name=gpu_mem Count=512
*$ srun -w"compute0" --gres=gpu:1,gpu_mem:250 sleep 100 &*
*$ srun -w"compute0" --gres=gpu:1,gpu_mem:250 sleep 100 &*
*$ squeue*
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
68 debug sleep root PD 0:00 1 (Resources)
67 debug sleep root R 0:04 1 compute0
I wonder whether it is possible to run both jobs sharing the GPU memory.
Thank you.
Regards!
Sergio Iserte.
Moe Jette
2012-04-25 22:05:04 UTC
Did you install the plugin file on the head and compute nodes and
restart the slurmctld and slurmd daemons?

What does your code look like?
Post by Sergio Iserte Agut
Hello,
I have already created the plugin gres/gpu_mem, whose code is almost the
same as that of gres/gpu. But I have been doing tests and I have seen that
the program never enters the functions job_set_env or step_set_env.
And when I look at /var/log/slurm/slurmctld.log, I don't understand these
messages:
slurmctld: debug: Configuration for job 188 complete
slurmctld: debug: gres/gpu_mem: step_test 188.4294967294 gres_bit_alloc is NULL
slurmctld: debug: gres/gpu_mem: step_test 188.4294967294 gres_bit_alloc is NULL
slurmctld: debug: gres/gpu_mem: step_test 188.0 gres_bit_alloc is NULL
slurmctld: debug3: step_layout cpus = 4 pos = 0
slurmctld: debug: laying out the 1 tasks on 1 hosts compute0 dist 1
slurmctld: debug: gres/gpu_mem: step_alloc gres_bit_alloc for 188.0 is NULL
slurmctld: sched: _slurm_rpc_job_step_create: StepId=188.0 compute0 usec=1174
I hope somebody can help me.
Kind regards!
Sergio Iserte.
Thank you for your quick answer, I will get on with it!
Regards!
Post by Moe Jette
The current logic allows a GRES to be allocated to one job at a time,
however you could develop a new plugin to do what you want. You would
not use gres/gpu, but write a gres/gpu_mem plugin that duplicates a
lot of the code from gres/gpu to set GPU environment variables for
CUDA (the code in src/common/gres.c will already avoid
over-subscribing the gpu_mem count/size).
Post by Sergio Iserte Agut
Hello,
I'm trying to configure my Slurm-2.3.2 in order to allow me to run multiple
jobs in the same GPU.
*slurm.conf*
SchedulerType=sched/backfill
SelectType=select/linear
GresTypes=gpu,gpu_mem
NodeName=enersis CPUs=1 Sockets=1 CoresPerSocket=1 ThreadsPerCore=1
RealMemory=1006 State=UNKNOWN
NodeName=compute0 NodeAddr=10.0.0.2 CPUs=4 RealMemory=7982 Sockets=1
CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN gres=gpu:1,gpu_mem:512
NodeName=compute1 NodeAddr=10.0.0.3 CPUs=4 RealMemory=7982 Sockets=1
CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN gres=gpu:1,gpu_mem:512
PartitionName=debug Nodes=compute[0-1] Default=YES MaxTime=INFINITE
State=UP
*gres.conf*
Name=gpu File=/dev/nvidia0
Name=gpu_mem Count=512
*$ srun -w"compute0" --gres=gpu:1,gpu_mem:250 sleep 100 &*
*$ srun -w"compute0" --gres=gpu:1,gpu_mem:250 sleep 100 &*
*$ squeue*
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
68 debug sleep root PD 0:00 1 (Resources)
67 debug sleep root R 0:04 1 compute0
I wonder whether it is possible to run both jobs sharing the GPU memory.
Thank you.
Regards!
Sergio Iserte.
Sergio Iserte Agut
2012-04-26 10:27:04 UTC
Yes, I've installed the plugin on every node, and then I've restarted the
daemons.

The code of gres/gpu_mem is the same as that of gres/gpu, though I've put a
debug line in each function to see the call flow. That's why I know that when
I submit a job requesting gpu_mem, the functions node_config_load and
step_set_env are called.
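A trace line of this kind might look like the following sketch (using Slurm's
debug() from src/common/log.h; the actual lines added are not shown in the
thread, and the rest of each function body stays as copied from gres_gpu.c):

extern void step_set_env(char ***job_env_ptr, void *gres_ptr)
{
        /* Trace the call flow; appears in the slurmd log at debug level. */
        debug("gres/gpu_mem: %s called", __func__);
        /* ... remainder copied from gres_gpu.c ... */
}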

And if I run:

$ srun --gres=gpu:1,gpu_mem:100 -w"compute0" sleep 10
SLURM_NODELIST=compute0
CUDA_VISIBLE_DEVICES=0

I get the result in the files attached.

Thank you for your interest.

Kind regards!
Post by Moe Jette
Did you install the plugin file on the head and compute nodes and
restart the slurmctld and slurmd daemons?
What does your code look like?
Post by Sergio Iserte Agut
Hello,
I have already created the plugin gres/gpu_mem, whose code is almost the
same as that of gres/gpu. But I have been doing tests and I have seen that
the program never enters the functions job_set_env or step_set_env.
And when I look at /var/log/slurm/slurmctld.log, I don't understand these
messages:
slurmctld: debug: Configuration for job 188 complete
slurmctld: debug: gres/gpu_mem: step_test 188.4294967294 gres_bit_alloc is NULL
slurmctld: debug: gres/gpu_mem: step_test 188.4294967294 gres_bit_alloc is NULL
slurmctld: debug: gres/gpu_mem: step_test 188.0 gres_bit_alloc is NULL
slurmctld: debug3: step_layout cpus = 4 pos = 0
slurmctld: debug: laying out the 1 tasks on 1 hosts compute0 dist 1
slurmctld: debug: gres/gpu_mem: step_alloc gres_bit_alloc for 188.0 is NULL
slurmctld: sched: _slurm_rpc_job_step_create: StepId=188.0 compute0 usec=1174
I hope somebody can help me.
Kind regards!
Sergio Iserte.
Thank you for your quick answer, I will get on with it!
Regards!
Post by Moe Jette
The current logic allows a GRES to be allocated to one job at a time,
however you could develop a new plugin to do what you want. You would
not use gres/gpu, but write a gres/gpu_mem plugin that duplicates a
lot of the code from gres/gpu to set GPU environment variables for
CUDA (the code in src/common/gres.c will already avoid
over-subscribing the gpu_mem count/size).
Post by Sergio Iserte Agut
Hello,
I'm trying to configure my Slurm-2.3.2 in order to allow me to run multiple
jobs in the same GPU.
*slurm.conf*
SchedulerType=sched/backfill
SelectType=select/linear
GresTypes=gpu,gpu_mem
NodeName=enersis CPUs=1 Sockets=1 CoresPerSocket=1 ThreadsPerCore=1
RealMemory=1006 State=UNKNOWN
NodeName=compute0 NodeAddr=10.0.0.2 CPUs=4 RealMemory=7982 Sockets=1
CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN gres=gpu:1,gpu_mem:512
NodeName=compute1 NodeAddr=10.0.0.3 CPUs=4 RealMemory=7982 Sockets=1
CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN gres=gpu:1,gpu_mem:512
PartitionName=debug Nodes=compute[0-1] Default=YES MaxTime=INFINITE
State=UP
*gres.conf*
Name=gpu File=/dev/nvidia0
Name=gpu_mem Count=512
*$ srun -w"compute0" --gres=gpu:1,gpu_mem:250 sleep 100 &*
*$ srun -w"compute0" --gres=gpu:1,gpu_mem:250 sleep 100 &*
*$ squeue*
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
68 debug sleep root PD 0:00 1 (Resources)
67 debug sleep root R 0:04 1 compute0
I wonder whether it is possible to run both jobs sharing the GPU memory.
Thank you.
Regards!
Sergio Iserte.