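A brief description of my setup: I have 11 Docker containers in a single-server Docker Compose configuration. Some of these containers produce logs, which I write to (mostly separate) host volumes. This produces eight log files, which are shared with an Rsyslog container, also via Docker volumes. Finally, this container forwards the logs to Papertrail, a cloud-based log aggregator.
This all works fine: I can view Mongo logs, Apache logs and various application-level logs in near real time. However, I have realised that when I restart the logger container, Rsyslog loses any record of which logs it has already pushed, and so pushes the whole lot again. Papertrail may do its own de-duplication, but it would be better to remove this duplication at the source.
I was planning to add another on-host volume so that this state persists even if the logger container is destroyed and recreated. However, when I checked a staging instance just now, the state directory /var/spool/rsyslog appeared to be empty. So, before I do that work, I want to be sure that Rsyslog is actually writing state files.
I believe this directory is set by the $WorkDirectory directive?
Here is a lightly edited snippet of my config file: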
# Load Modules
module(load="imfile")
# If we don't use this, we'll get the hostname of the Logger container
$LocalHostName missive-test
# See https://help.papertrailapp.com/kb/configuration/advanced-unix-logging-tips#rsyslog-2
$ActionResumeInterval 10
$ActionQueueSize 100000
$ActionQueueDiscardMark 97500
$ActionQueueHighWaterMark 80000
$ActionQueueType LinkedList
$ActionQueueFileName papertrailqueue
$ActionQueueCheckpointInterval 100
$ActionQueueMaxDiskSpace 2g
$ActionResumeRetryCount -1
$ActionQueueSaveOnShutdown on
$ActionQueueTimeoutEnqueue 2
$ActionQueueDiscardSeverity 0
# My own addition
$WorkDirectory /var/spool/rsyslog
# See https://help.papertrailapp.com/kb/configuration/encrypting-remote-syslog-with-tls-ssl/#download-root-certificates
$DefaultNetstreamDriverCAFile /etc/papertrail-bundle.pem # trust these CAs
$ActionSendStreamDriver gtls # use gtls netstream driver
$ActionSendStreamDriverMode 1 # require TLS
$ActionSendStreamDriverAuthMode x509/name # authenticate by hostname
$ActionSendStreamDriverPermittedPeer *.papertrailapp.com
# Slightly edited to obfuscate exact URL
*.* @@logs0.papertrailapp.com:00000
# Use http://www.rsyslog.com/rsyslog-configuration-builder/ to generate new
# log file watchers.
input(type="imfile"
File="/var/log/missive/controller/socket-listener.log"
Tag="socket"
Facility="local0")
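The final input directive is an example; there are eight of them, all very similar.
Might I be wrong about where the state data lives? I have restarted a few times recently, and each time Papertrail did indeed receive 25K new records. So either state is being written somewhere I cannot see, or it is not being written at all. Have I missed a directive?
I have tried this on Rsyslog 8.26.0 and 8.31.0, with the same result on both.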
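Update 1
It might help to see some logs from Rsyslog itself. However, I have checked in /var/log and found nothing (other than the host volumes for the logs I am pushing). Does this need to be turned on explicitly?
After posting, I found a command to validate the config file, and it looks fine; nothing appears to be preventing state files from working.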
/var # rsyslogd -N1
rsyslogd: version 8.26.0, config validation run (level 1), master config /etc/rsyslog.conf
rsyslogd: End of config validation run. Bye.
Update 2
I have now added the -dn (debug and foreground) switches to Rsyslog, and it is now producing a (very verbose) log file. Among 300K other lines, I found this:
2227.959643591:main thread : action 0 queue: starting queue
2227.959648334:main thread : action 0 queue: is disk-assisted, disk will be used on demand
2227.959656291:main thread : action 0 queue: params: type 1, enq-only 0, disk assisted 1, spoolDir '/var/spool/rsyslog', maxFileSz 1048576, maxQSize 100000, lqsize 0, pqsize 0, child 0, full delay 40000, light delay 70000, deq batch size 16, high wtrmrk 80000, low wtrmrk 70000, discardmrk 97500, max wrkr 1, min msgs f. wrkr 100000
2227.959661911:main thread : action 0 queue:Reg: finalizing construction of worker thread pool (numworkerThreads 1)
2227.959666678:main thread : action 0 queue:Reg/w0: finalizing construction of worker instance data (for 1 actions)
2227.959671472:main thread : action 0 queue:DAwpool: finalizing construction of worker thread pool (numworkerThreads 1)
2227.959675830:main thread : action 0 queue:DAwpool/w0: finalizing construction of worker instance data (for 1 actions)
2227.959682905:main thread : action 0 queue[DA]: starting queue
2227.959687826:main thread : action 0 queue[DA]: .qi file name is '/var/spool/rsyslog/papertrailqueue.qi', len 37
2227.959691356:main thread : action 0 queue[DA]: I am a child
2227.959702371:main thread : action 0 queue[DA]: clean startup, no .qi file found
2227.959706512:main thread : action 0 queue[DA]: state -2040 reading .qi file - can not read persisted info (if any)
2227.959711817:main thread : file stream N/A params: flush interval 0, async write 0
2227.959719817:main thread : file stream N/A params: flush interval 0, async write 0
2227.959724086:main thread : file stream N/A params: flush interval 0, async write 0
2227.959734668:main thread : action 0 queue[DA]: params: type 2, enq-only 0, disk assisted 0, spoolDir '/var/spool/rsyslog', maxFileSz 1048576, maxQSize 0, lqsize 0, pqsize 0, child 1, full delay -1, light delay -1, deq batch size 8, high wtrmrk 0, low wtrmrk 1, discardmrk 0, max wrkr 1, min msgs f. wrkr 0
2227.959739181:main thread : action 0 queue[DA]:Reg: finalizing construction of worker thread pool (numworkerThreads 1)
2227.959743488:main thread : action 0 queue[DA]:Reg/w0: finalizing construction of worker instance data (for 1 actions)
2227.959747045:main thread : action 0 queue[DA]: queue finished initialization
2227.959753324:main thread : action 0 queue: DA queue initialized, disk queue 0x5630e6079ea0
2227.959756804:main thread : action 0 queue: queue finished initialization
2227.959762404:main thread : Action builtin:omfwd[0x5630e605cd60]: queue 0x5630e605d1e0 started
2227.959766212:main thread : Activating Ruleset Queue[0] for Ruleset RSYSLOG_DefaultRuleset
2227.959769728:main thread : activateMainQueue: mainq cnf obj ptr is 0
2227.959773846:main thread : main Q: starting queue
2227.959781585:main thread : main Q: is NOT disk-assisted
2227.959789199:main thread : main Q: params: type 0, enq-only 0, disk assisted 0, spoolDir '/var/spool/rsyslog', maxFileSz 1048576, maxQSize 100000, lqsize 0, pqsize 0, child 0, full delay 97000, light delay 70000, deq batch size 256, high wtrmrk 80000, low wtrmrk 20000, discardmrk 98000, max wrkr 2, min msgs f. wrkr 40000
2227.959793605:main thread : main Q:Reg: finalizing construction of worker thread pool (numworkerThreads 2)
2227.959797797:main thread : main Q:Reg/w0: finalizing construction of worker instance data (for 1 actions)
2227.959807534:main thread : main Q:Reg/w1: finalizing construction of worker instance data (for 1 actions)
2227.959813976:main thread : main Q: queue finished initialization
2227.959819101:main thread : Main processing queue is initialized and running
2227.959823416:main thread : running module imfile with config 0x5630e605bda0, term mode: cooperative/SIGTTIN
2227.959847971:main thread : configuration 0x5630e6047ae0 activated
2227.959891721:main thread : rsyslog/glbl: using '127.0.0.1' as localhost IP
2227.959897527:main thread : signaling new internal message via SIGTTOU
2227.959907993:main thread : rsyslogd: writing pidfile '/var/run/rsyslogd.pid.tmp'.
2227.960012480:main thread : rsyslogd: initialization completed, transitioning to regular run mode
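This mentions /var/spool/rsyslog several times, so it looks as though that setting is indeed correct. However, I still have not found anything that explains why no state files are left behind after pushing to Papertrail.
I will keep looking through the logs to see if anything else jumps out. In the meantime, comments on this are welcome.
Update 3
Aha, I have found these lines: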
/ # grep -C 1 "NO state file" /var/log/rsyslog.log
2227.960253654:imfile.c : imfile: trying to open state for '/var/log/missive/controller/socket-listener.log', state file 'imfile-state:-var-log-missive-controller-socket-listener.log'
2227.960264463:imfile.c : imfile: NO state file (/var/spool/rsyslog/imfile-state:-var-log-missive-controller-socket-listener.log) exists for '/var/log/missive/controller/socket-listener.log'
2227.960268423:imfile.c : imfile: clean startup withOUT state file for '/var/log/missive/controller/socket-listener.log'
--
2227.971282226:imfile.c : imfile: trying to open state for '/var/log/missive/transmitter/queue.log', state file 'imfile-state:-var-log-missive-transmitter-queue.log'
2227.971298328:imfile.c : imfile: NO state file (/var/spool/rsyslog/imfile-state:-var-log-missive-transmitter-queue.log) exists for '/var/log/missive/transmitter/queue.log'
2227.971302589:imfile.c : imfile: clean startup withOUT state file for '/var/log/missive/transmitter/queue.log'
--
2228.510460963:imfile.c : imfile: trying to open state for '/var/log/missive/transmitter/worker-manager.log', state file 'imfile-state:-var-log-missive-transmitter-worker-manager.log'
2228.510480060:imfile.c : imfile: NO state file (/var/spool/rsyslog/imfile-state:-var-log-missive-transmitter-worker-manager.log) exists for '/var/log/missive/transmitter/worker-manager.log'
2228.510484619:imfile.c : imfile: clean startup withOUT state file for '/var/log/missive/transmitter/worker-manager.log'
--
2228.774990028:imfile.c : imfile: trying to open state for '/var/log/missive/storage/storage-server.log', state file 'imfile-state:-var-log-missive-storage-storage-server.log'
2228.775007074:imfile.c : imfile: NO state file (/var/spool/rsyslog/imfile-state:-var-log-missive-storage-storage-server.log) exists for '/var/log/missive/storage/storage-server.log'
2228.775011149:imfile.c : imfile: clean startup withOUT state file for '/var/log/missive/storage/storage-server.log'
--
2228.797432870:imfile.c : imfile: trying to open state for '/var/log/missive/outtray/outtray-server.log', state file 'imfile-state:-var-log-missive-outtray-outtray-server.log'
2228.797447810:imfile.c : imfile: NO state file (/var/spool/rsyslog/imfile-state:-var-log-missive-outtray-outtray-server.log) exists for '/var/log/missive/outtray/outtray-server.log'
2228.797452333:imfile.c : imfile: clean startup withOUT state file for '/var/log/missive/outtray/outtray-server.log'
--
2228.798755146:imfile.c : imfile: trying to open state for '/var/log/missive/interface/access.log', state file 'imfile-state:-var-log-missive-interface-access.log'
2228.798767248:imfile.c : imfile: NO state file (/var/spool/rsyslog/imfile-state:-var-log-missive-interface-access.log) exists for '/var/log/missive/interface/access.log'
2228.798771170:imfile.c : imfile: clean startup withOUT state file for '/var/log/missive/interface/access.log'
--
2228.802581105:imfile.c : imfile: trying to open state for '/var/log/missive/interface/error.log', state file 'imfile-state:-var-log-missive-interface-error.log'
2228.802592195:imfile.c : imfile: NO state file (/var/spool/rsyslog/imfile-state:-var-log-missive-interface-error.log) exists for '/var/log/missive/interface/error.log'
2228.802595916:imfile.c : imfile: clean startup withOUT state file for '/var/log/missive/interface/error.log'
--
2228.818228008:imfile.c : imfile: trying to open state for '/var/log/missive/mongo/mongodb.log', state file 'imfile-state:-var-log-missive-mongo-mongodb.log'
2228.818241043:imfile.c : imfile: NO state file (/var/spool/rsyslog/imfile-state:-var-log-missive-mongo-mongodb.log) exists for '/var/log/missive/mongo/mongodb.log'
2228.818244939:imfile.c : imfile: clean startup withOUT state file for '/var/log/missive/mongo/mongodb.log'
--
2228.826921335:imfile.c : imfile: trying to open state for '/var/log/missive/traffic/traefik.log', state file 'imfile-state:-var-log-missive-traffic-traefik.log'
2228.826933834:imfile.c : imfile: NO state file (/var/spool/rsyslog/imfile-state:-var-log-missive-traffic-traefik.log) exists for '/var/log/missive/traffic/traefik.log'
2228.826937674:imfile.c : imfile: clean startup withOUT state file for '/var/log/missive/traffic/traefik.log'
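These messages are to be expected on a first run, but of course one would then expect state files to be left behind so that the logs are not pushed again. I will search for these error strings just in case, and I will follow the log to see whether anything at the end prevents the final state from being recorded.
Update 4
I tried adding this after my input() definitions, also to no avail: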
$InputFilePersistStateInterval 100
$InputRunFileMonitor
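I did this because the manual says of the second directive:
This activates the current monitor. It has no parameters. If you forget this directive, no file monitoring will take place.
However, I think this refers to old-style log inputs declared with $ directives, rather than to the new-style input() syntax.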
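For contrast, an old-style imfile declaration, which is the form that $InputRunFileMonitor belongs to, might look roughly like the sketch below. The path, tag and facility are borrowed from the config above; the state file name is a hypothetical value chosen for illustration, and none of this is taken from the real config.

```
# Legacy-style imfile input (sketch only; the config above uses input() instead).
$ModLoad imfile
$InputFileName /var/log/missive/controller/socket-listener.log
$InputFileTag socket
# Hypothetical state file name, created under $WorkDirectory
$InputFileStateFile state-socket-listener
$InputFileFacility local0
# Persist the read position after every 100 lines
$InputFilePersistStateInterval 100
# Activates the monitor described by the $InputFile* directives above
$InputRunFileMonitor
```

In this style each monitored file needs its own group of $InputFile* directives ending in $InputRunFileMonitor, which would explain why placing the two directives after input() blocks made no difference.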
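Update 5
It occurred to me that my configuration above replaces the default Rsyslog configuration, and may be removing something critical. I have therefore modified the relevant COPY command so that the default rsyslog.conf is kept and my configuration is loaded in addition to it.
The new directives (the ones not commented out) are therefore as follows: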
# rsyslog v5: load input modules
# If you do not load inputs, nothing happens!
# You may need to set the module load path if modules are not found.
$ModLoad immark.so # provides --MARK-- message capability
$ModLoad imuxsock.so # provides support for local system logging (e.g. via logger command)
$ModLoad imklog.so # kernel logging (formerly provided by rklogd)
# default permissions for all log files.
$FileOwner root
$FileGroup adm
$FileCreateMode 0640
$DirCreateMode 0755
$Umask 0022
# Include configuration files from directory
$IncludeConfig /etc/rsyslog.d/*
# Log anything (except mail) of level info or higher.
# Don't log private authentication messages!
*.info;mail.none;authpriv.none;cron.none -/var/log/messages
# The authpriv file has restricted access.
authpriv.* /var/log/secure
# Log all the mail messages in one place.
mail.* -/var/log/maillog
# Log cron stuff
cron.* -/var/log/cron
# Everybody gets emergency messages
*.emerg :omusrmsg:*
# Save news errors of level crit and higher in a special file.
uucp,news.crit -/var/log/spooler
# Save boot messages also to boot.log
local7.* /var/log/boot.log
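This still pushes my logs to Papertrail, and still creates no spool files.
I did wonder whether this might be a permissions problem, but Rsyslog runs as root on Alpine, and I believe the queues are implemented as threads within the main process. They would therefore also run as root, and be able to write to /var/spool/rsyslog/*.
Update 6
I asked readers of the Rsyslog bug tracker whether this might be a bug, perhaps related to my use of Docker.
Some very helpful engineers who patrol the Rsyslog issue list came up with a good-enough solution. It seems that, by default, the state file for each log is only written when Rsyslog exits, which is why the files do not appear immediately.
In addition, it was suggested that Docker stops or kills the process before Rsyslog has a chance to write its state files. One contributor said that Rsyslog can take up to 90 seconds to write state files after a SIGTERM, and my understanding is that Docker will not usually wait that long (by default, it waits 10 seconds before sending SIGKILL). Oddly, in my case, Rsyslog takes only about 2 seconds to shut down (well under the 10-second default limit), which suggests it is exiting cleanly, albeit without writing any state files. I do not currently know why this happens.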
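The fix is as follows. In each log declaration, a PersistStateInterval parameter can be added to specify how often the state file is written (measured in processed log lines). This defaults to 0, which gives the "write on exit" behaviour.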
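As an illustration, the socket-listener input from the config above could be extended roughly as follows. The value of 100 is an assumption picked for the example rather than a recommendation; with it, the state file should be updated after every 100 lines read from the file.

```
# Sketch: the same imfile input as above, with periodic state persistence.
# 100 is an assumed value; tune it per file.
input(type="imfile"
      File="/var/log/missive/controller/socket-listener.log"
      Tag="socket"
      Facility="local0"
      PersistStateInterval="100")
```

The state files are written under the directory named by $WorkDirectory, so that directory still needs to live on a persistent volume for the state to survive the container being recreated.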
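I shall use a strategy of lower values (e.g. 20) on slow log files and higher values (e.g. 100) on fast-moving files. I think I am more comfortable with writing frequently in this way, since it guards against power failures or container crashes.
I do not think this fix is ideal, since a container shutdown still fails to record the exact position that sending had reached. The upshot is that Rsyslog will resend a few duplicate log lines, which should be de-duplicated by the aggregator target (an interval of X gives at most X-1 duplicate log lines, which is not terrible).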