Sanjay Tiwari (stiwari)
2014-06-20 19:19:58 UTC
Hello,
I created a dedicated user, slurm, for the installation.
I am using munge for the AuthType. Both the slurm and munge users work well together.
My cluster is operational with these users: I can monitor, control, and run jobs on the entire cluster.
However, no other user is able to schedule a job. I see the following error:
srun: error: Task launch for 26.0 failed on node bolsvc01: Invalid job credential
srun: error: Application launch failed: Invalid job credential
srun: Job step aborted: Waiting up to 2 seconds for job step to finish.
srun: error: Timed out waiting for job step to complete
Any help or direction to get past this issue would be appreciated.
Cheers
Sanjay