Paul Edmon
2013-01-30 15:02:02 UTC
Perhaps I missed the documentation on this, but what is the proper order
of operations for adding new nodes to slurm.conf? Currently, if we start
slurmd on the new nodes before they are listed in the conf, slurmd simply
fails on those nodes. However, if we later add them to the conf and do a
reconfigure on the master, the master process falls over and we have to
restart it. At that point the new nodes show up as unknown, waiting for
the slurmds on those nodes to connect. Ideally this wouldn't happen: the
master shouldn't tip over just because new hosts are added to the conf.
Once the hosts are in the conf, though, simply restarting slurmd on them
works fine.
So what is the proper order? Do you put the new hosts in the conf and
start their slurmds before you reconfigure the master?
-Paul Edmon-
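
For concreteness, the slurm.conf change in question would look roughly like
the fragment below. This is only an illustrative sketch: the host names, CPU
and memory figures, and the partition name are hypothetical and are not taken
from the actual site configuration.

```
# Hypothetical new nodes appended to slurm.conf (names and sizes are assumptions)
NodeName=newnode[01-04] CPUs=16 RealMemory=64000 State=UNKNOWN
# Existing partition extended to include the new nodes (partition name is an assumption)
PartitionName=general Nodes=oldnode[01-32],newnode[01-04] Default=YES State=UP
```

The reconfigure mentioned above is presumably `scontrol reconfigure` run
against the controller after this edit.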