acct_gather_energy/ipmi configuration
Janne Blomqvist
2014-09-03 09:43:33 UTC
Hi,

has anyone got the acct_gather_energy/ipmi plugin to work correctly? In
acct_gather.conf I have the lines

EnergyIPMIFrequency=30
EnergyIPMICalcAdjustment=yes

and in slurm.conf

DebugFlags=Profile
AcctGatherNodeFreq=30
AcctGatherEnergyType=acct_gather_energy/ipmi

However, when slurmd starts, the following line appears in its log:

[2014-08-28T10:44:52.179] Power sensor not found.

I suspect the reason is that I cannot retrieve power readings with the
"ipmi-sensors" command. With "ipmi-sensors -W discretereading" I can
get a reading for the power supplies, but it seems to be the nameplate
capacity rather than the current consumption. The same is true with
ipmitool and ipmiutil.

However, using "ipmi-dcmi --get-system-power-statistics" (part of
freeipmi) does appear to work.

So my question is: is there some way to configure the
acct_gather_energy/ipmi plugin to retrieve these DCMI power values
instead of whatever it reads now? I looked briefly at the source code
and found a number of undocumented "EnergyIPMIxxxx" configuration
parameters, but I could not work out whether any of them can be used
to enable DCMI.

(The hardware in question is various HP ProLiant servers, between 1
and 4 years old.)
--
Janne Blomqvist, D.Sc. (Tech.), Scientific Computing Specialist
Aalto University School of Science, PHYS & BECS
+358503841576 || janne.blomqvist-***@public.gmane.org
Janne Blomqvist
2014-09-03 12:45:32 UTC
Hi,

thanks for confirming that my configuration is correct. Indeed, "ipmi-sensors --non-abbreviated-units | grep Watts" doesn't show anything, so I guess I'm out of luck until DCMI support is added to the plugin.
Thomas Cadeau
2014-09-03 12:54:29 UTC
Hi Janne,

The configuration is correct. Please try this command:
$ ipmi-sensors --non-abbreviated-units | grep Watts
If it prints nothing, then the IPMI plugin cannot be used on this hardware.
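
The check above can be scripted, for example to run it on many nodes before enabling the plugin. The following is only a minimal sketch in Python, not part of Slurm or FreeIPMI. The function names has_watts_sensor and node_has_power_sensor are hypothetical, and the sketch assumes that ipmi-sensors is installed and that the caller has enough privileges to query the BMC. It mirrors the grep by doing a case-sensitive search for "Watts" on each output line.

```python
import subprocess


def has_watts_sensor(sensor_output: str) -> bool:
    """Return True if any line of ipmi-sensors output mentions "Watts".

    Mirrors `grep Watts`: a case-sensitive substring match per line.
    """
    return any("Watts" in line for line in sensor_output.splitlines())


def node_has_power_sensor() -> bool:
    """Run ipmi-sensors on the local node and look for a Watts reading.

    Hypothetical helper: returns False when ipmi-sensors is not installed
    or when its output contains no line with "Watts".
    """
    try:
        result = subprocess.run(
            ["ipmi-sensors", "--non-abbreviated-units"],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        # The ipmi-sensors binary (freeipmi) is not available on this node.
        return False
    return has_watts_sensor(result.stdout)
```

If node_has_power_sensor() returns False on a node, the result matches the case described above: the plugin has no sensor to read on that hardware.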

We plan to add an option to use the DCMI power readings, which would support more hardware.

The undocumented "EnergyIPMIxxxx" options are wrappers for
"ipmi-sensors" options.
I have never had a reason to use them, except "timeout" and "reflush"
when I was working with a BMC in an unstable "dev" state, and in that
case I had trouble with ipmi-sensors too.

Thomas