Discussion:
Debugging slurmctld's abnormal behavior
Vsevolod Nikonorov
2014-07-30 11:25:35 UTC
Hello,

I have Slurm 14.03.6 installed on CentOS 5.10, and slurmctld is unable
to talk to slurmdbd:

slurmctld: error: slurmdbd: read: No error
slurmctld: error: slurmdbd: only read 72 of 872415232 bytes
slurmctld: error: slurmdbd: Sending DbdInit msg: Unspecified error
slurmctld: error: slurmdbd: read: No error
slurmctld: error: slurmdbd: only read 72 of 872415232 bytes
slurmctld: error: slurmdbd: Sending DbdInit msg: Unspecified error
slurmctld: debug2: Testing job time limits and checkpoints
slurmctld: error: slurmdbd: read: No error
slurmctld: error: slurmdbd: only read 72 of 872415232 bytes
slurmctld: error: slurmdbd: Sending DbdInit msg: Unspecified error
slurmctld: error: slurmdbd: read: No error
slurmctld: error: slurmdbd: only read 72 of 872415232 bytes
slurmctld: error: slurmdbd: Sending DbdInit msg: Unspecified error
slurmctld: error: slurmdbd: read: No error
slurmctld: error: slurmdbd: only read 72 of 872415232 bytes
slurmctld: error: slurmdbd: Sending DbdInit msg: Unspecified error
slurmctld: error: slurmdbd: read: No error
slurmctld: error: slurmdbd: only read 70 of 2613 bytes
slurmctld: error: slurmdbd: Sending DbdInit msg: Unspecified error
slurmctld: debug2: Testing job time limits and checkpoints
slurmctld: debug2: Performing purge of old job records
slurmctld: debug: sched: Running job scheduler
slurmctld: error: slurmdbd: read: No error
slurmctld: error: slurmdbd: only read 72 of 872415232 bytes
slurmctld: error: slurmdbd: Sending DbdInit msg: Unspecified error
slurmctld: error: slurmdbd: read: No error
slurmctld: error: slurmdbd: only read 72 of 872415232 bytes
slurmctld: error: slurmdbd: Sending DbdInit msg: Unspecified error

What can I do to debug my problem?

Thanks in advance!
--
Никоноров Всеволод Дмитриевич, ОИТТиС, НИКИЭТ

Vsevolod D. Nikonorov, JSC NIKET
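
One thing that may be worth checking in the log above is the reported message length itself. The value 872415232 is 0x34000000 in hexadecimal, which is the number 52 if the same four bytes are read in the opposite byte order. The sketch below only does that arithmetic. It assumes the length is a 4-byte unsigned prefix, and the helper name reinterpret_length is made up for illustration and is not part of Slurm. It does not identify the cause, but an implausible length like this may hint that the two ends disagree about how the message is framed.

```python
import struct


def reinterpret_length(value: int) -> dict:
    """Show a 32-bit length value in hex and in the opposite byte order.

    Hypothetical helper for illustration only; not a Slurm function.
    """
    # Lay the value out as four bytes, most significant byte first.
    raw = struct.pack(">I", value)
    return {
        "hex": f"0x{value:08x}",
        "bytes": raw.hex(),
        # Read the same four bytes back, least significant byte first.
        "byte_swapped": struct.unpack("<I", raw)[0],
    }


if __name__ == "__main__":
    # The two lengths that appear in the slurmctld log.
    for reported in (872415232, 2613):
        print(reported, reinterpret_length(reported))
```

If the byte-swapped value looks like a plausible message size while the reported one does not, that could be a reason to compare the versions and build architectures of slurmctld and slurmdbd, and to confirm which service is actually answering on the configured port.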
Vsevolod Nikonorov
2014-07-31 05:29:30 UTC
P.S. Sorry if some of my posts were duplicated; we have had some issues with our
mail server lately.
--
Никоноров Всеволод Дмитриевич, ОИТТиС, НИКИЭТ

Vsevolod D. Nikonorov, JSC NIKET