Hi Erica,
Two suggestions:
1. By convention, a commented-out setting such as "MailProg" usually
shows the default value. If you don't have /bin/mail, but you
do have /usr/bin/mail, you could specify that instead. More likely,
you'll need to install one of the classic mail packages, like "mailx".
2. I doubt it's a best practice, but we use /var/tmp for
"StateSaveLocation," and thus bypass questions about who has write
access to what.
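For the first suggestion, a quick check from the shell might look like the sketch below. The helper name find_mail_prog is made up for illustration, and the two candidate paths are simply the ones mentioned above; adjust them for your system.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): print the first executable
# mail program among the candidate paths given as arguments.
# Returns non-zero if none of them is executable.
find_mail_prog() {
    for candidate in "$@"; do
        if [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Check the two locations discussed above and print a slurm.conf line.
if prog=$(find_mail_prog /bin/mail /usr/bin/mail); then
    echo "MailProg=$prog"
else
    echo "no mail program found; consider installing a package such as mailx" >&2
fi
```

If a program is found, the printed MailProg line can be pasted into slurm.conf, next to a line such as StateSaveLocation=/var/tmp for the second suggestion.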
Andy
On 08/15/2014 11:40 AM, Erica Riello wrote:
Re: [slurm-dev] Re:
Hi Andy,
thanks for the advice.
$ scontrol show config | grep MailProg
slurm_load_ctl_conf error: Unable to contact slurm controller (connect failure)
There's a directory /var/spool/slurmd, which contains an
empty file called cred_state.
I'm using Slurm version 14.03.6, which I built myself.
The slurm.conf is copied below:
# slurm.conf file generated by configurator easy.html.
# Put this file on all nodes of your cluster.
# See the slurm.conf man page for more information.
#
ControlMachine=erica-VirtualBox
#ControlAddr=
#
#MailProg=/bin/mail
MpiDefault=none
#MpiParams=ports=#-#
ProctrackType=proctrack/pgid
ReturnToService=1
SlurmctldPidFile=/var/run/slurmctld.pid
#SlurmctldPort=6817
SlurmdPidFile=/var/run/slurmd.pid
#SlurmdPort=6818
SlurmdSpoolDir=/var/spool/slurmd
SlurmUser=slurm
#SlurmdUser=root
StateSaveLocation=/var/spool
SwitchType=switch/none
TaskPlugin=task/none
#
#
# TIMERS
#KillWait=30
#MinJobAge=300
#SlurmctldTimeout=120
#SlurmdTimeout=300
#
#
# SCHEDULING
FastSchedule=1
SchedulerType=sched/backfill
#SchedulerPort=7321
SelectType=select/linear
#
#
# LOGGING AND ACCOUNTING
AccountingStorageType=accounting_storage/none
ClusterName=cluster
#JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/linux
#SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurm/slurmctld
#SlurmdDebug=3
SlurmdLogFile=/var/log/slurm/slurmd
#
#
# COMPUTE NODES
NodeName=erica-VirtualBox CPUs=1 RealMemory=2002 Sockets=1 CoresPerSocket=1 ThreadsPerCore=1 State=UNKNOWN
PartitionName=particao1 Nodes=erica-VirtualBox Default=YES MaxTime=INFINITE State=UP
Is there any clue as to what might be wrong?
Regards,
2014-08-13 11:23 GMT-03:00 Andy Riebs <andy.riebs-***@public.gmane.org>:
Hi Erica,
You'll find much of this discussion takes place
frequently, most recently about a week ago.
To get started,
* It looks like Slurm can't find a mail program. Use
  $ scontrol show config | grep MailProg
  to see what program Slurm is looking for.
* You probably want a subdirectory in /var/spool, such
  as /var/spool/slurm, for your state save location so
  that Slurm doesn't need full root privs to write to it.
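As a rough sketch of the second point, the directory could be set up along these lines. The helper name make_state_dir is hypothetical, the owner name "slurm" is an assumption (use whatever SlurmUser is set to in your slurm.conf), and mode 755 is just one reasonable choice.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): create a state directory that
# is owned by, and writable only by, the given user.
# Usage: make_state_dir DIR OWNER
# Changing ownership to a different user normally requires root.
make_state_dir() {
    dir=$1
    owner=$2
    mkdir -p "$dir" &&
    chown "$owner" "$dir" &&
    chmod 755 "$dir"
}

# Example, run as root, assuming SlurmUser=slurm:
#   make_state_dir /var/spool/slurm slurm
# and then in slurm.conf:
#   StateSaveLocation=/var/spool/slurm
```

The point of the dedicated subdirectory is that only it, and not all of /var/spool, has to be writable by the Slurm user.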
For further help from the people on this list, please
include:
* What version of Slurm you are using
* Whether you built it yourself, or if it came from a
  pre-built distribution
* A copy of your slurm.conf file (you might want to
  obscure specific node names and other data that might
  be used to compromise your system)
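One possible way to gather that information is sketched below. The helper name obscure_host and the host names in the example are invented for illustration; the first argument is treated as a sed pattern, so a name containing dots will match a little more loosely than a literal string would.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): replace every occurrence of a
# host name with a placeholder. Reads a config on stdin and writes the
# obscured copy to stdout.
# Usage: obscure_host REAL_NAME PLACEHOLDER < slurm.conf
obscure_host() {
    sed "s/$1/$2/g"
}

# Example, assuming the real host is called "myhost":
#   slurmctld -V                             # print the Slurm version
#   obscure_host myhost node01 < slurm.conf  # copy that is safe to post
```

Check the output by eye before posting it, since other identifying data (addresses, user names) is not touched by this.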
Also, as noted above, much of this is covered
frequently; check the mail archives for more detail.
(BTW, this is generally true of open source projects;
most of them have frequent "Hey, I've just started using
your program, and I've run into a hurdle..."
discussions. You gain immediate credibility if you start
your queries with "I've got a problem, and I can't find
it in the mail archive.")
Regards,
Andy
On 08/13/2014 09:56 AM, Erica Riello wrote:
Hi all,
I've installed Slurm, and when I try to run
$ slurmctld -D -vvvv
I get:
slurmctld: pidfile not locked, assuming no running daemon
slurmctld: error: Configured MailProg is invalid
slurmctld: error: Job accounting information gathered, but not stored
slurmctld: fatal: Incorrect permissions on state save loc: /var/spool
Has anyone seen this before, and does anyone know what
might be causing these errors?
Thanks in advance.
Regards,
--
===============
Erica Riello
Computer Engineering Student, PUC-Rio
--
===============
Erica Riello
Computer Engineering Student, PUC-Rio