Erica Riello
2014-08-13 13:56:31 UTC
Hi all,

I've installed Slurm, and when I try to start slurmctld, I get these errors:
slurmctld -D -vvvv
slurmctld: pidfile not locked, assuming no running daemon
slurmctld: error: Configured MailProg is invalid
slurmctld: error: Job accounting information gathered, but not stored
slurmctld: fatal: Incorrect permissions on state save loc: /var/spool

Has anyone seen this before, and does anyone know what might cause these errors?

Thanks in advance.

Regards,
--
===============
Erica Riello
Computer Engineering Student PUC-Rio
Andy Riebs
2014-08-13 14:14:37 UTC
Oops; the other essential guideline for getting help is to include a
meaningful subject line!

Williams, Kevin E. (Federal SIP)
2014-08-13 14:28:57 UTC
Thanks for that. I was seeing a lot of newer messages without the [slurm-dev] header. Very annoying, but as a neophyte, I was mute on the subject… ;-)

From: Riebs, Andy
Sent: Wednesday, August 13, 2014 10:15 AM
To: slurm-dev
Subject: [slurm-dev] slurm-dev Slurm configuration questions, was Re:

Oops; the other essential guideline for getting help is to include a meaningful subject line!
Andy Riebs
2014-08-13 14:23:36 UTC
Hi Erica,

You'll find much of this discussion takes place frequently, most
recently about a week ago.

To get started,

- It looks like Slurm can't find a mail program. Use
  $ scontrol show config | grep MailProg
  to see what program Slurm is looking for.
- You probably want a subdirectory in /var/spool, such as
  /var/spool/slurm, for your state save location so that Slurm
  doesn't need full root privs to write to it.

For further help from the people on this list, please include

- What version of Slurm you are using
- Whether you built it yourself, or if it came from a pre-built
  distribution
- A copy of your slurm.conf file (you might want to obscure
  specific node names and other data that might be used to
  compromise your system)

Also, as noted above, much of this is covered frequently; check
the mail archives for more detail. (BTW, this is generally true of
open source projects; most of them have frequent "Hey, I've just
started using your program, and I've run into a hurdle..."
discussions. You gain immediate credibility if you start your
queries with "I've got a problem, and I can't find it in the mail
archive.")

Regards,
Andy
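
As a rough illustration of the state save suggestion above, the sketch below creates a dedicated directory and hands it to the user that slurmctld runs as. The function name prepare_state_dir is made up for this example, and mode 755 is an assumption rather than something stated in this thread. The user name in the usage comment is also an assumption; use whatever SlurmUser is set to in slurm.conf.

```shell
#!/bin/sh
# prepare_state_dir DIR OWNER (hypothetical helper, not part of Slurm):
# create DIR, give it to OWNER, and make it writable by OWNER only.
prepare_state_dir() {
    dir=$1
    owner=$2
    mkdir -p "$dir" || return 1      # create the directory and any missing parents
    chown "$owner" "$dir" || return 1 # the daemon's user must own it to write state
    chmod 755 "$dir" || return 1      # assumed mode: owner writes, others read only
}

# Example, run as root (path follows the suggestion above, user is assumed):
#   prepare_state_dir /var/spool/slurm slurm
```

After that, StateSaveLocation in slurm.conf would need to point at the new directory before slurmctld is started again.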

Erica Riello
2014-08-15 15:40:45 UTC
Hi Andy,

Thanks for the advice. The command you suggested fails:

$ scontrol show config | grep MailProg
slurm_load_ctl_conf error: Unable to contact slurm controller (connect failure)

There's a directory /var/spool/slurmd, which contains an empty file
called cred_state.

I'm using Slurm version 14.03.6, which I built myself.

The slurm.conf is copied below:

# slurm.conf file generated by configurator easy.html.
# Put this file on all nodes of your cluster.
# See the slurm.conf man page for more information.
#
ControlMachine=erica-VirtualBox
#ControlAddr=
#
#MailProg=/bin/mail
MpiDefault=none
#MpiParams=ports=#-#
ProctrackType=proctrack/pgid
ReturnToService=1
SlurmctldPidFile=/var/run/slurmctld.pid
#SlurmctldPort=6817
SlurmdPidFile=/var/run/slurmd.pid
#SlurmdPort=6818
SlurmdSpoolDir=/var/spool/slurmd
SlurmUser=slurm
#SlurmdUser=root
StateSaveLocation=/var/spool
SwitchType=switch/none
TaskPlugin=task/none
#
#
# TIMERS
#KillWait=30
#MinJobAge=300
#SlurmctldTimeout=120
#SlurmdTimeout=300
#
#
# SCHEDULING
FastSchedule=1
SchedulerType=sched/backfill
#SchedulerPort=7321
SelectType=select/linear
#
#
# LOGGING AND ACCOUNTING
AccountingStorageType=accounting_storage/none
ClusterName=cluster
#JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/linux
#SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurm/slurmctld
#SlurmdDebug=3
SlurmdLogFile=/var/log/slurm/slurmd
#
#
# COMPUTE NODES
NodeName=erica-VirtualBox CPUs=1 RealMemory=2002 Sockets=1 CoresPerSocket=1 ThreadsPerCore=1 State=UNKNOWN
PartitionName=particao1 Nodes=erica-VirtualBox Default=YES MaxTime=INFINITE State=UP

Is there any clue of what may be wrong?
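
The connect failure above is consistent with slurmctld not running: scontrol asks the controller for its configuration, so it has nothing to talk to while the daemon exits at startup. A minimal sketch for reading a setting straight from slurm.conf instead is shown below. The helper name conf_value is invented for this example, and it assumes simple one-per-line "Key=value" entries with the key spelled exactly as given.

```shell
#!/bin/sh
# conf_value KEY FILE (hypothetical helper, not part of Slurm):
# print the value of the last uncommented "KEY=value" line in FILE.
# Prints nothing when KEY is absent or only appears commented out.
conf_value() {
    key=$1
    file=$2
    # Lines starting with '#' do not match, so commented defaults are skipped.
    sed -n "s/^[[:space:]]*${key}=//p" "$file" | tail -n 1
}

# Example (the path to slurm.conf is an assumption and depends on the build):
#   conf_value StateSaveLocation /etc/slurm/slurm.conf
#   conf_value MailProg /etc/slurm/slurm.conf
```

With the slurm.conf posted above, an empty result for MailProg would simply mean the line is commented out and the built-in default applies.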

Regards,
--
===============
Erica Riello
Computer Engineering Student PUC-Rio
Andy Riebs
2014-08-15 15:52:48 UTC
Hi Erica,

Two suggestions:

1. By convention, a setting that is commented out, like "#MailProg=/bin/mail",
usually shows the default value. If you don't have /bin/mail, but you
do have /usr/bin/mail, you could specify that. More likely, you'll
need to install one of the classic mail packages, like "mailx".
2. I doubt it's a best practice, but we use /var/tmp for
"StateSaveLocation," and thus bypass questions about who has write
access to what.

Andy
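
The first suggestion above can be sketched as a small check for which mail program actually exists on the machine. The function name find_mail_prog is hypothetical, and the candidate paths in the usage comment are just the two mentioned in this message.

```shell
#!/bin/sh
# find_mail_prog CANDIDATE... (hypothetical helper, not part of Slurm):
# print the first candidate that is an executable regular file.
# Returns non-zero when none of the candidates qualifies.
find_mail_prog() {
    for candidate in "$@"; do
        if [ -f "$candidate" ] && [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Example:
#   find_mail_prog /bin/mail /usr/bin/mail || echo "no mail program found"
```

Whatever path it prints could then be set explicitly as MailProg in slurm.conf; if it prints nothing, installing a mail package is the likelier fix.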

Erica Riello
2014-08-15 16:19:58 UTC
Hi Andy,

I've made the changes you suggested, and I'm now experiencing connection
issues, which I'm trying to solve.

Thanks for the help.

Regards,
--
===============
Erica Riello
Computer Engineering Student PUC-Rio