Discussion:
BLCR Plugin not being loaded by slurmctld daemon
Arjun J Rao
2014-06-27 13:48:30 UTC
When starting the slurmctld daemon on a node where BLCR is installed and
its kernel modules are loaded (confirmed with lsmod | grep blcr), I get the
following errors and slurmctld refuses to start.

slurmctld : error : Couldn't find the specified plugin name for
checkpoint/blcr looking at all files
slurmctld : error : Cannot find checkpoint plugin for checkpoint/blcr
slurmctld : error : Cannot create checkpoint context for checkpoint/blcr
slurmctld : fatal : failed to initialize checkpoint plugin

Any ideas on why this might be occurring? I've created checkpoints using
BLCR and SLURM on this machine earlier.
j***@public.gmane.org
2014-06-27 14:53:29 UTC
I suspect that you do not have the blcr-dev (developer package with
headers and library) installed for Slurm to build the plugin.
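
One way to test this suspicion is to check whether the plugin file was ever built and installed. The sketch below assumes a default source install, where Slurm plugins land in /usr/local/lib/slurm unless PluginDir in slurm.conf says otherwise. The helper name and the SLURM_PLUGIN_DIR variable are hypothetical, introduced only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: report whether the BLCR checkpoint plugin exists
# in a given Slurm plugin directory.
has_blcr_plugin() {
    plugin_dir="$1"
    if [ -f "$plugin_dir/checkpoint_blcr.so" ]; then
        echo "found: $plugin_dir/checkpoint_blcr.so"
        return 0
    fi
    echo "missing: $plugin_dir/checkpoint_blcr.so"
    return 1
}

# Assumption: /usr/local/lib/slurm is the plugin directory. Override
# SLURM_PLUGIN_DIR if PluginDir is set differently in slurm.conf.
has_blcr_plugin "${SLURM_PLUGIN_DIR:-/usr/local/lib/slurm}" || true
```

If the file is reported missing, the plugin was most likely skipped at build time, which would fit the "Couldn't find the specified plugin name" error.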
Arjun J Rao
2014-06-28 13:14:34 UTC
I had previously run slurm-2.6.6-2 with BLCR just fine. I installed
mvapich2 with the options --with-pm=no --with-pmi=slurm --enable-ckpt and
checkpointing worked. Unfortunately, problems later developed where
the instructions to perform checkpoints were not succeeding.

So I decided to install slurm 14.03.4-2. I just did a normal ./configure,
make, make install.
I installed BLCR separately.
Is there a package such as blcr-dev that I need to install separately? I
couldn't find any such package on the SLURM website.
Trey Dockendorf
2014-06-28 17:03:31 UTC
You have to download BLCR from the BLCR website, then build and install it, before compiling Slurm. BLCR is not provided by Slurm; it is an external dependency.

- Trey
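
The build order above can be sketched as a quick pre-check before reconfiguring Slurm. This assumes BLCR was installed under a single prefix (here /usr/local, which is a guess) and that the build needs the libcr.h header and the libcr shared library. The helper name and the BLCR_PREFIX variable are hypothetical. Slurm's configure script accepts --with-blcr=PATH to point at the BLCR installation.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the BLCR header and library that a
# build would need are present under the given install prefix.
blcr_dev_files_present() {
    prefix="$1"
    [ -f "$prefix/include/libcr.h" ] || return 1
    for libdir in "$prefix/lib64" "$prefix/lib"; do
        [ -e "$libdir/libcr.so" ] && return 0
    done
    return 1
}

# Assumption: BLCR lives under /usr/local unless BLCR_PREFIX says otherwise.
BLCR_PREFIX="${BLCR_PREFIX:-/usr/local}"
if blcr_dev_files_present "$BLCR_PREFIX"; then
    echo "BLCR development files found under $BLCR_PREFIX; reconfigure Slurm with:"
    echo "  ./configure --with-blcr=$BLCR_PREFIX && make && make install"
else
    echo "libcr.h or libcr.so not found under $BLCR_PREFIX"
fi
```

The script only prints the suggested commands rather than running them, so it is safe to try on a production node.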

Arjun J Rao
2014-06-29 08:12:34 UTC
I have already installed it.
As I said in my original email, lsmod | grep blcr shows that the BLCR
kernel modules are indeed installed and loaded.
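
Note that lsmod only covers the kernel side. Whether Slurm's configure run noticed the userspace BLCR install is recorded in config.log in the Slurm build directory. A minimal sketch for inspecting that log follows. The helper name and the SLURM_BUILD_DIR variable are hypothetical, and the search is deliberately case-insensitive because the exact wording of configure's BLCR check is not assumed here.

```shell
#!/bin/sh
# Hypothetical helper: print any lines in a configure log that mention BLCR.
# Returns non-zero when the log is missing or contains no such lines.
show_blcr_in_log() {
    log="$1"
    [ -f "$log" ] || return 2
    grep -i 'blcr' "$log"
}

# Assumption: run from the Slurm source tree, or set SLURM_BUILD_DIR to it.
show_blcr_in_log "${SLURM_BUILD_DIR:-.}/config.log" \
    || echo "config.log is missing or has no BLCR lines"
```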