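I'm running logstash on CentOS 7, and I think the NIC may be saturated.
From the logstash server I can also see receive queues (Recv-Q) building up for the servers that send logs. But I'm not sure whether those queues are high for my setup, or whether TCP tuning would help me.
Some information: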
sysctl -a | grep mem
net.core.optmem_max = 20480
net.core.rmem_default = 212992
net.core.rmem_max = 212992
net.core.wmem_default = 212992
net.core.wmem_max = 212992
net.ipv4.igmp_max_memberships = 20
net.ipv4.tcp_mem = 227763 303685 455526
net.ipv4.tcp_rmem = 4096 87380 6291456
net.ipv4.tcp_wmem = 4096 16384 4194304
net.ipv4.udp_mem = 229686 306249 459372
net.ipv4.udp_rmem_min = 4096
net.ipv4.udp_wmem_min = 4096
vm.lowmem_reserve_ratio = 256 256 32
vm.memory_failure_early_kill = 0
vm.memory_failure_recovery = 1
vm.nr_hugepages_mempolicy = 0
vm.overcommit_memory = 0
netstat -na --tcp | grep :9123
tcp6 0 0 :::9123 :::* LISTEN
tcp6 247834 0 192.168.123.123:9123 10.16.1.82:52289 ESTABLISHED
tcp6 241242 0 192.168.123.123:9123 10.31.31.232:65293 ESTABLISHED
tcp6 0 0 192.168.123.123:9123 10.16.1.198:53693 ESTABLISHED
tcp6 0 0 192.168.123.123:9123 10.16.1.198:56751 ESTABLISHED
tcp6 331114 0 192.168.123.123:9123 10.31.35.157:53998 ESTABLISHED
tcp6 256047 0 192.168.123.123:9123 10.16.2.155:52221 ESTABLISHED
tcp6 240498 0 192.168.123.123:9123 10.19.5.166:51805 ESTABLISHED
tcp6 312648 0 192.168.123.123:9123 10.18.16.155:57975 ESTABLISHED
tcp6 234321 0 192.168.123.123:9123 10.18.19.242:51664 ESTABLISHED
tcp6 255079 0 192.168.123.123:9123 10.19.4.51:51458 ESTABLISHED
tcp6 256328 0 192.168.123.123:9123 10.18.45.89:56821 ESTABLISHED
tcp6 237167 0 192.168.123.123:9123 10.18.33.26:49278 ESTABLISHED
tcp6 248204 0 192.168.123.123:9123 10.18.30.250:54267 ESTABLISHED
tcp6 248573 0 192.168.123.123:9123 10.16.1.198:57522 ESTABLISHED
tcp6 238348 0 192.168.123.123:9123 10.18.11.169:55147 ESTABLISHED
tcp6 243762 0 192.168.123.123:9123 10.31.22.48:60425 ESTABLISHED
tcp6 258035 0 192.168.123.123:9123 10.31.46.31:60432 ESTABLISHED
tcp6 241863 0 192.168.123.123:9123 10.18.45.113:63376 ESTABLISHED
tcp6 327889 0 192.168.123.123:9123 10.18.3.219:58640 ESTABLISHED
tcp6 317363 0 192.168.123.123:9123 10.31.37.249:65162 ESTABLISHED
tcp6 252394 0 192.168.123.123:9123 10.16.1.92:56360 ESTABLISHED
tcp6 326401 0 192.168.123.123:9123 10.31.17.74:53948 ESTABLISHED
tcp6 0 0 192.168.123.123:9123 10.16.2.12:53781 ESTABLISHED
tcp6 244669 0 192.168.123.123:9123 10.18.18.100:49281 ESTABLISHED
tcp6 250264 0 192.168.123.123:9123 10.18.32.116:56795 ESTABLISHED
tcp6 0 0 192.168.123.123:9123 10.16.1.82:49304 ESTABLISHED
tcp6 310864 0 192.168.123.123:9123 10.18.11.25:64230 ESTABLISHED
tcp6 247973 0 192.168.123.123:9123 10.18.22.230:55209 ESTABLISHED
tcp6 0 0 192.168.123.123:9123 10.17.1.8:51741 ESTABLISHED
tcp6 0 0 192.168.123.123:9123 10.16.2.12:54507 ESTABLISHED
tcp6 251552 0 192.168.123.123:9123 10.18.24.83:63499 ESTABLISHED
tcp6 251481 0 192.168.123.123:9123 10.16.2.72:57268 ESTABLISHED
tcp6 0 0 192.168.123.123:9123 10.16.1.198:53406 ESTABLISHED
tcp6 312263 0 192.168.123.123:9123 10.19.12.239:52173 ESTABLISHED
tcp6 238878 0 192.168.123.123:9123 10.18.5.198:57978 ESTABLISHED
tcp6 322460 0 192.168.123.123:9123 10.18.5.124:53117 ESTABLISHED
tcp6 0 0 192.168.123.123:9123 10.16.2.72:54883 ESTABLISHED
tcp6 237717 0 192.168.123.123:9123 10.16.2.12:56387 ESTABLISHED
tcp6 315963 0 192.168.123.123:9123 10.18.26.44:49197 ESTABLISHED
tcp6 248914 0 192.168.123.123:9123 10.18.41.101:51859 ESTABLISHED
tcp6 0 0 192.168.123.123:9123 10.16.2.155:49303 ESTABLISHED
tcp6 316994 0 192.168.123.123:9123 10.18.44.120:49375 ESTABLISHED
tcp6 236421 0 192.168.123.123:9123 10.31.43.130:51590 ESTABLISHED
tcp6 240929 0 192.168.123.123:9123 10.17.1.114:63546 ESTABLISHED
tcp6 306346 0 192.168.123.123:9123 10.17.2.159:61633 ESTABLISHED
tcp6 239360 0 192.168.123.123:9123 10.18.39.43:54080 ESTABLISHED
tcp6 245361 0 192.168.123.123:9123 10.19.13.107:52629 ESTABLISHED
tcp6 244398 0 192.168.123.123:9123 10.18.11.195:53257 ESTABLISHED
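You can certainly tune your NIC; the Red Hat network performance tuning guide offers a fairly complete (and recent, c. 2015) set of items to consider.
For example, that document suggests
16384 349520 16777216
for TCP rmem (net.ipv4.tcp_rmem). Whether you need to tune wmem depends on your setup; from what you've shared, where every Send-Q is 0, it doesn't seem necessary. However, to me this output suggests that logstash is the bottleneck: your receive queues are the buffers of the process listening on that port (logstash), so enlarging those buffers won't really address the underlying problem. It looks like logstash is already slow relative to the incoming load, and more network buffering is unlikely to make it any faster.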
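If you still want to try the suggested rmem values, a minimal sketch might look like the following. It assumes root privileges; the file name `90-tcp-rmem.conf` is only an illustrative choice, not something the guide prescribes.

```shell
# Show the current min/default/max TCP receive buffer sizes (bytes).
sysctl net.ipv4.tcp_rmem

# Apply the suggested values at runtime (needs root; lost on reboot).
sysctl -w net.ipv4.tcp_rmem="16384 349520 16777216"

# Persist the setting. The file name is a hypothetical example.
echo 'net.ipv4.tcp_rmem = 16384 349520 16777216' > /etc/sysctl.d/90-tcp-rmem.conf

# Reload all sysctl configuration files.
sysctl --system
```

Connections that are already established may keep their existing buffers, so it is probably worth comparing queue sizes on fresh connections before and after the change.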
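Check whether your logstash configuration/parsers can be optimized for your situation and needs (if you use regular expressions, that may be a good area to investigate).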
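To judge whether a configuration change actually helps, it may be useful to boil the netstat output down to a few numbers and compare them before and after. For an established socket, Recv-Q is the count of bytes that the listening process has not yet read. The helper below is a rough sketch; the function name `recvq_summary` is made up for illustration, and it assumes the usual `netstat -na --tcp` column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State).

```shell
# Summarise Recv-Q for ESTABLISHED connections on one local port.
# Reads `netstat -na --tcp` output on stdin; the port is the first argument.
recvq_summary() {
  awk -v port="$1" '
    # $2 = Recv-Q, $4 = local address, $6 = connection state
    $6 == "ESTABLISHED" && $4 ~ (":" port "$") {
      n++
      total += $2
      if ($2 > 0) backed++
      if ($2 > max) max = $2
    }
    END {
      printf "conns=%d backlogged=%d total_bytes=%d max_bytes=%d\n", n, backed, total, max
    }
  '
}

# Usage (run on the logstash server):
#   netstat -na --tcp | recvq_summary 9123
```

Running it every few seconds, for example under `watch`, should show whether the backlog drains or stays pinned near the buffer limit. If most connections stay backlogged regardless of load, that would point at the consumer rather than the network.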
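You may need to increase the resources available to logstash, and/or consider tiering your logstash infrastructure to take load off your client/source-facing instances.
In other words, have the front end receive the data and do almost no real work (just add some tags/information, and perhaps some routing to specific processors), then have it pass the data on to another instance for further processing (or to a message queue such as Kafka).
It's worth noting that if you use the DNS and geoIP plugins, it usually makes sense to run them in a central location, after the event/log/data has been fully parsed.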