I started with the default localhost.cfg found in /usr/local/nagios/etc/objects. I simply added a service definition at the top of the appropriate section, using the existing "PING" service definition as a template ...
###############################################################################
#
# SERVICE DEFINITIONS
#
###############################################################################
# Define a custom service
define service {
    use                     local-service           ; Name of service template to use
    host_name               localhost
    service_description     docker_testconn
    check_command           check_testconn_xxx22
}
define service {
    use                     local-service           ; Name of service template to use
    host_name               localhost
    service_description     PING
    check_command           check_ping!100.0,20%!500.0,60%
}
The check_testconn_xxx22.sh script lives in /usr/local/nagios/libexec and (for testing) simply returns a positive message ....
#!/bin/bash
# Test plugin: the count is hard-coded, so this always reports OK (exit 0).
countWarnings=2
if (( countWarnings <= 5 )); then
    echo "OK - $countWarnings services in Warning state"
    exit 0
elif (( 6 <= countWarnings && countWarnings <= 30 )); then
    # This case makes no sense because it only adds one warning.
    # It is just to make an example on all possible exits.
    echo "WARNING - $countWarnings services in Warning state"
    exit 1
elif (( countWarnings > 30 )); then
    echo "CRITICAL - $countWarnings services in Warning state"
    exit 2
else
    echo "UNKNOWN - $countWarnings"
    exit 3
fi
...
# ls -la check_testconn_xxx22.sh
-rwxr-xr-x 1 root root 663 Jul 20 12:07 check_testconn_xxx22.sh
# ./check_testconn_xxx22.sh
OK - 2 services in Warning state
# echo $?
0
# service nagios restart
Job for nagios.service failed. See 'systemctl status nagios.service' and 'journalctl -xn' for details.
# journalctl -xn
-- Logs begin at Thu 2018-07-19 16:28:44 CEST, end at Fri 2018-07-20 12:08:21 CEST. --
Jul 20 12:08:21 docker-server-1 nagios[2872]: ***> One or more problems was encountered while running the pre-flight check...
Jul 20 12:08:21 docker-server-1 nagios[2872]: Check your configuration file(s) to ensure that they contain valid
Jul 20 12:08:21 docker-server-1 nagios[2872]: directives and data definitions. If you are upgrading from a previous
Jul 20 12:08:21 docker-server-1 nagios[2872]: version of Nagios, you should be aware that some variables/definitions
Jul 20 12:08:21 docker-server-1 nagios[2872]: may have been removed or modified in this version. Make sure to read
Jul 20 12:08:21 docker-server-1 nagios[2872]: the HTML documentation regarding the config files, as well as the
Jul 20 12:08:21 docker-server-1 nagios[2872]: 'Whats New' section to find out what has changed.
Jul 20 12:08:21 docker-server-1 systemd[1]: nagios.service: control process exited, code=exited status=1
Jul 20 12:08:21 docker-server-1 systemd[1]: Failed to start Nagios Core 4.4.1.
-- Subject: Unit nagios.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit nagios.service has failed.
--
-- The result is failed.
I can't figure out why Nagios is unhappy with this section.
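One way to narrow this down is to compare the check_command names used by the service definitions with the command_name entries that actually exist. The sketch below is a hypothetical helper (undefined_commands is not part of Nagios) and assumes each directive sits on its own line, as in the sample config files.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios): print every check_command name
# used in a services file that has no matching command_name in a commands file.
undefined_commands() {
    # $1 = file with service definitions, $2 = file with command definitions
    awk '
        # First file (commands): remember every defined command_name.
        FNR == NR { if ($1 == "command_name") defined[$2] = 1; next }
        # Second file (services): strip arguments after the first "!".
        $1 == "check_command" {
            split($2, parts, "!")
            if (!(parts[1] in defined)) print parts[1]
        }
    ' "$2" "$1" | sort -u
}
```

It could be called as undefined_commands /usr/local/nagios/etc/objects/localhost.cfg /usr/local/nagios/etc/objects/commands.cfg, assuming the default layout. Any name it prints is referenced by a service but never defined as a command.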
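Check the commands.cfg configuration file (by default in the same objects directory), which must define every command referenced by check_command in localhost.cfg. The new service references check_testconn_xxx22, and if no command with that name is defined there, the pre-flight check fails.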
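As a minimal sketch, a matching command definition could look like the following. It assumes the $USER1$ macro points to /usr/local/nagios/libexec, which is the usual default set in resource.cfg; adjust the path if your installation differs.

```
# Assumption: $USER1$ expands to /usr/local/nagios/libexec (see resource.cfg)
define command {
    command_name    check_testconn_xxx22
    command_line    $USER1$/check_testconn_xxx22.sh
}
```

Before restarting, running /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg (assuming the default source-install paths) prints the specific configuration error, which the journalctl output above does not show.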