Didier GAZEN
2014-06-17 17:09:34 UTC
Hi,
The fix that is supposed to print each down/drained node once rather
than once per partition (commit object b5ace9a) when running "sinfo -R"
is not sufficient.
When the -R option (list_reasons) is enabled, all spawned threads
(_build_part_info), each working on a separate partition, receive an
empty sinfo_list as argument. In this "list_reasons" case, threads may
update the same partition data during the _insert_node_ptr function call.
The problem is that the _insert_node_ptr function and all the functions
it calls are not thread safe WHEN different threads are updating the SAME
partition member of sinfo_list.
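To illustrate the kind of guard that is missing, here is a minimal
stand-alone sketch. It is not Slurm code: struct entry, insert_node and
run_threads are hypothetical names, and the fixed-size array only stands in
for sinfo_list. It shows a find-or-append made atomic with a mutex, which is
a different remedy from simply not spawning threads.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define MAX_ENTRIES 16
#define MAX_THREADS 8

/* Hypothetical stand-in for one sinfo_list record (not the real type). */
struct entry {
    char part_name[32];
    int  node_cnt;
};

static struct entry    entries[MAX_ENTRIES];
static int             entry_cnt;
static pthread_mutex_t entries_lock = PTHREAD_MUTEX_INITIALIZER;

/* Find-or-append under one lock, so lookup and update cannot interleave. */
static void insert_node(const char *part_name)
{
    int i;

    pthread_mutex_lock(&entries_lock);
    for (i = 0; i < entry_cnt; i++) {
        if (strcmp(entries[i].part_name, part_name) == 0)
            break;
    }
    if (i == entry_cnt && entry_cnt < MAX_ENTRIES) {
        /* No record for this name yet: create it. */
        strncpy(entries[i].part_name, part_name,
                sizeof(entries[i].part_name) - 1);
        entries[i].node_cnt = 0;
        entry_cnt++;
    }
    if (i < entry_cnt)
        entries[i].node_cnt++;
    pthread_mutex_unlock(&entries_lock);
}

static void *worker(void *arg)
{
    insert_node((const char *)arg);
    return NULL;
}

/* Start n threads that all target the same record.
 * Returns that record's node count, or -1 if the record does not exist. */
static int run_threads(int n, const char *part_name)
{
    pthread_t tid[MAX_THREADS];
    int i, started = 0, cnt = -1;

    assert(n >= 0 && n <= MAX_THREADS);
    for (i = 0; i < n; i++) {
        if (pthread_create(&tid[started], NULL, worker,
                           (void *)part_name) == 0)
            started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    pthread_mutex_lock(&entries_lock);
    for (i = 0; i < entry_cnt; i++) {
        if (strcmp(entries[i].part_name, part_name) == 0)
            cnt = entries[i].node_cnt;
    }
    pthread_mutex_unlock(&entries_lock);
    return cnt;
}
```

Without the lock, two threads could both miss the record in the search loop
and both append it, or lose an increment of node_cnt.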
The simplest workaround I found was to disable thread spawning when
list_reasons is on.
static int _build_sinfo_data(List sinfo_list,
                             partition_info_msg_t *partition_msg,
                             node_info_msg_t *node_msg)
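The idea of the workaround can be sketched in a self-contained way as
follows. This is not the actual patch: build_all, build_part and
struct build_args are hypothetical names, and the per-partition work is
reduced to recording which thread ran it. The only point is the branch that
runs the work inline when list_reasons is set.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define PART_CNT 3

/* Hypothetical per-partition work item. */
struct build_args {
    int       part_index;
    pthread_t caller;      /* thread that called build_all() */
    bool      ran_inline;  /* true if the work ran on that same thread */
};

/* Stand-in for the per-partition worker: only records where it ran. */
static void *build_part(void *arg)
{
    struct build_args *a = arg;

    assert(a != NULL);
    a->ran_inline = pthread_equal(pthread_self(), a->caller) != 0;
    return NULL;
}

/* Process every partition; returns how many ran on the calling thread. */
static int build_all(bool list_reasons)
{
    struct build_args args[PART_CNT];
    pthread_t tid[PART_CNT];
    bool spawned[PART_CNT] = { false };
    int i, inline_cnt = 0;

    for (i = 0; i < PART_CNT; i++) {
        args[i].part_index = i;
        args[i].caller = pthread_self();
        args[i].ran_inline = false;

        if (list_reasons) {
            /* Serial path: no thread, so no concurrent updates. */
            build_part(&args[i]);
        } else if (pthread_create(&tid[i], NULL, build_part,
                                  &args[i]) == 0) {
            spawned[i] = true;
        } else {
            /* Could not spawn: fall back to the serial path. */
            build_part(&args[i]);
        }
    }

    for (i = 0; i < PART_CNT; i++) {
        if (spawned[i])
            pthread_join(tid[i], NULL);
        if (args[i].ran_inline)
            inline_cnt++;
    }
    return inline_cnt;
}
```

The cost is that the partitions are processed one after the other in the
list_reasons case.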