I have the following service configuration to capture SNMP traps:
define service {
name SNMP_TRAP
service_description SNMP_TRAP
active_checks_enabled 1 ; Active service checks are enabled
passive_checks_enabled 1 ; Passive service checks are enabled/accepted
parallelize_check 1 ; Active service checks should be parallelized
process_perf_data 0
obsess_over_service 0 ; We should obsess over this service (if necessary)
check_freshness 0 ; Default is to NOT check service 'freshness'
notifications_enabled 1 ; Service notifications are enabled
event_handler_enabled 1 ; Service event handler is enabled
flap_detection_enabled 1 ; Flap detection is enabled
process_perf_data 1 ; Process performance data
retain_status_information 1 ; Retain status information across program restarts
retain_nonstatus_information 1 ; Retain non-status information across program restarts
check_command check-host-alive ; This will be used to reset the service to "OK"
is_volatile 1
check_period 24x7
max_check_attempts 1
normal_check_interval 1
retry_check_interval 1
notification_interval 120
notification_period 24x7
notification_options w,u,c,r
contact_groups admins
register 0
}
define service {
use SNMP_TRAP
service_description gigabitethernet16
hostgroup_name cisco
check_interval 120
}
I have several devices in the cisco group, for example:
define host {
use base-host
host_name cisco-sg300-28-4
alias CISCO-SG300-28 (VT-Registratur)
display_name Switch VT-Registratur
address 10.0.1.109
hostgroups switches,cisco,cisco28
}
The service shows up fine in the web interface.
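However, incoming service check results are not processed at all. My /var/lib/nagios3/rw/nagios.cmd file collects the results, but the file is never cleared and the results never show up in Nagios. nagios.cmd contains, for example: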
[1437659629] PROCESS_SERVICE_CHECK_RESULT;cisco-sg300-28-4;gigabitethernet16;2;gigabitethernet16 linkDown
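For reference, the line above follows the usual external command layout: a Unix timestamp in brackets, the command name, then the host, service description, return code, and plugin output separated by semicolons. A minimal Python sketch of building such a line is shown here; `format_passive_result` is a hypothetical helper name, not part of Nagios.

```python
import time


def format_passive_result(host, service, return_code, output, timestamp=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT line (hypothetical helper)."""
    # 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return_code must be 0, 1, 2 or 3")
    if timestamp is None:
        timestamp = int(time.time())
    # One command per line, so embedded newlines are flattened to spaces.
    output = output.replace("\n", " ")
    return (
        f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
        f"{host};{service};{return_code};{output}\n"
    )


if __name__ == "__main__":
    line = format_passive_result(
        "cisco-sg300-28-4",
        "gigabitethernet16",
        2,
        "gigabitethernet16 linkDown",
        timestamp=1437659629,
    )
    print(line, end="")
```

The host and service names in the line have to match the definitions above exactly, which they do in my case, so the format itself does not seem to be the problem.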
accept_passive_service_checks is enabled in nagios.cfg.
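A quick way to confirm what the main configuration really says, rather than trusting memory, is to parse it. The following is only a sketch: `read_directives` and `is_enabled` are hypothetical helpers, it assumes plain `key=value` lines with `#` comments, and it assumes the last occurrence of a repeated directive is the one that counts. A directive that is commented out or absent is reported as not enabled, even though Nagios may apply its own default in that case.

```python
def read_directives(text):
    """Parse 'key=value' lines into a dict mapping key -> list of values."""
    directives = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments (including commented-out directives).
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        directives.setdefault(key.strip(), []).append(value.strip())
    return directives


def is_enabled(directives, name):
    """True only if the directive is present and its last value is '1'."""
    values = directives.get(name)
    return bool(values) and values[-1] == "1"


if __name__ == "__main__":
    sample = "# main config\naccept_passive_service_checks=1\n"
    print(is_enabled(read_directives(sample), "accept_passive_service_checks"))
```

In practice the text would come from reading nagios.cfg; the path is left to the caller here.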
On further inspection, I realized that nagios.cmd should be a named pipe. In my case, it is just a plain old file.
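The same check can be scripted. This is a minimal sketch using Python's standard `os` and `stat` modules; `describe_command_file` is a hypothetical helper name, and the path is the one from my setup.

```python
import os
import stat


def describe_command_file(path):
    """Return 'fifo', 'regular file', 'missing' or 'other' for the path."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return "missing"
    if stat.S_ISFIFO(mode):
        return "fifo"
    if stat.S_ISREG(mode):
        return "regular file"
    return "other"


if __name__ == "__main__":
    print(describe_command_file("/var/lib/nagios3/rw/nagios.cmd"))
```

A result of "regular file" would explain the symptoms: anything written there simply piles up in the file, because there is no pipe for Nagios to read from.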
From our log archives, I can see that passive checks were processed at some point in the past, but they no longer work.
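I went through the configuration again looking for more details about nagios.cmd, and what I found there gave me the idea to look at README.Debian, located at /usr/share/doc/nagios3-common/README.Debian, which contains the relevant instructions. Although I was sure the directive was enabled, I double-checked and found that it actually was not.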
After enabling it (and performing the other tasks mentioned in the README), the named pipe was created.