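I'm trying to get sysstat to record temperature readings, so that I have historical temperature data to help diagnose future host failures.
I tried this command to get the temperature information: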
$ sar -m TEMP
Requested activities not available in file /var/log/sysstat/sa22
Here is what the sar man page says about it:
-m { keyword [,...] | ALL }
Report power management statistics. Note that these statistics depend on sadc's option "-S POWER" to
be collected.
Possible keywords are CPU, FAN, FREQ, IN, TEMP and USB.
[...]
With the TEMP keyword, statistics about devices temperature are reported. The following values are
displayed:
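According to this, power management information (of which temperature is a subset) is not recorded by default. So I edited the file /etc/sysstat/sysstat to enable it. I changed this: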
# Parameters for the system activity data collector (see sadc(8) manual page)
# which are used for the generation of log files.
# By default contains the `-S DISK' option responsible for generating disk
# statisitcs. Use `-S XALL' to collect all available statistics.
SADC_OPTIONS="-S DISK"
to this:
SADC_OPTIONS="-S DISK,POWER"
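To double-check that the edit is the one actually in effect, a small sketch like the following could be used. The function name has_power_option is hypothetical, and the path /etc/sysstat/sysstat is simply the one used above; other distributions may keep this file elsewhere. It only inspects the config file and says nothing about what sadc has already written to disk.

```shell
# Hypothetical helper: report whether a sysstat config file asks sadc
# to collect power management statistics (POWER, or XALL for everything).
has_power_option() {
    conf_file="$1"
    # Use the last uncommented SADC_OPTIONS assignment in the file.
    opts=$(grep -E '^[[:space:]]*SADC_OPTIONS=' "$conf_file" | tail -n 1)
    case "$opts" in
        *POWER*|*XALL*) return 0 ;;
        *) return 1 ;;
    esac
}

# Path taken from this post; skipped silently if the file is not readable.
if [ -r /etc/sysstat/sysstat ]; then
    if has_power_option /etc/sysstat/sysstat; then
        echo "POWER statistics requested"
    else
        echo "POWER statistics not requested"
    fi
fi
```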
Another issue on the sysstat issue tracker says that sysstat needs lm-sensors to work, so I installed that package as well. Here is the output of sensors:
$ sensors
acpitz-acpi-0
Adapter: ACPI interface
temp1: +27.8°C (crit = +119.0°C)
temp2: +29.8°C (crit = +119.0°C)
coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +89.0°C (high = +82.0°C, crit = +100.0°C)
Core 0: +86.0°C (high = +82.0°C, crit = +100.0°C)
Core 1: +88.0°C (high = +82.0°C, crit = +100.0°C)
Core 2: +89.0°C (high = +82.0°C, crit = +100.0°C)
Core 3: +89.0°C (high = +82.0°C, crit = +100.0°C)
Core 4: +88.0°C (high = +82.0°C, crit = +100.0°C)
Core 5: +87.0°C (high = +82.0°C, crit = +100.0°C)
nvme-pci-0800
Adapter: PCI adapter
Composite: +38.9°C (low = -273.1°C, high = +84.8°C)
(crit = +84.8°C)
Sensor 1: +38.9°C (low = -273.1°C, high = +65261.8°C)
Sensor 2: +37.9°C (low = -273.1°C, high = +65261.8°C)
So this appears to detect my temperature sensors correctly.
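As a cross-check that does not depend on lm-sensors, the kernel's hwmon sysfs files can be read directly. This is only a sketch: it assumes the sensors are exposed under /sys/class/hwmon, where temp*_input files hold millidegrees Celsius, and the helper name millideg_to_c is made up for illustration.

```shell
# Hypothetical helper: convert a hwmon reading in millidegrees Celsius
# to degrees Celsius with one decimal place.
millideg_to_c() {
    awk -v m="$1" 'BEGIN { printf "%.1f\n", m / 1000 }'
}

# Print every temperature input the kernel exposes through hwmon.
for f in /sys/class/hwmon/hwmon*/temp*_input; do
    [ -r "$f" ] || continue                  # glob did not match, or unreadable
    dir=$(dirname "$f")
    name=$(cat "$dir/name" 2>/dev/null || echo unknown)
    raw=$(cat "$f" 2>/dev/null) || continue  # some sensors fail on read
    printf '%s %s: %s°C\n' "$name" "$(basename "$f" _input)" "$(millideg_to_c "$raw")"
done
```

If this prints readings comparable to the sensors output above, the kernel side is working and the problem is more likely in how the data is collected or stored.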
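I also tried waiting ten minutes for another collection to happen. (My system is configured to record every ten minutes, at :05, :15, :25, and so on.)
Unfortunately, after all that, I still get the same error: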
$ sar -m TEMP
Requested activities not available in file /var/log/sysstat/sa22