AskOverflow.Dev

AskOverflow.Dev Logo AskOverflow.Dev Logo

AskOverflow.Dev Navigation

  • 主页
  • 系统&网络
  • Ubuntu
  • Unix
  • DBA
  • Computer
  • Coding
  • LangChain

Mobile menu

Close
  • 主页
  • 系统&网络
    • 最新
    • 热门
    • 标签
  • Ubuntu
    • 最新
    • 热门
    • 标签
  • Unix
    • 最新
    • 标签
  • DBA
    • 最新
    • 标签
  • Computer
    • 最新
    • 标签
  • Coding
    • 最新
    • 标签
主页 / user-374465

Alex Harvey's questions

Martin Hope
Alex Harvey
Asked: 2017-08-04 23:47:15 +0800 CST

未从 Docker 容器转发的 UDP 流量 -> Docker 主机

  • 12

我有一个 docker 容器,我无法从容器内部运行 DNS 查找,尽管它在 docker 主机上运行良好。

众所周知,构建 Docker 主机的配置管理代码可以在市场上的标准 RHEL 7 映像上运行,因此已知问题出在 SOE RHEL 7 映像内部。

RHEL 7.2 / Docker 版本 1.12.6,构建 88a4867/1.12.6。容器是 RHEL 7.3。SELinux 处于启用/许可模式。Docker 主机是一个 Amazon EC2 实例。

一些配置:

# /etc/sysconfig/docker
OPTIONS='--dns=10.0.0.10 --dns=10.0.0.11 --dns-search=example.com'
DOCKER_CERT_PATH=/etc/docker
ADD_REGISTRY='--add-registry registry.example.com'
no_proxy=169.254.169.254,localhost,127.0.0.1,registory.example.com
http_proxy=http://proxy.example.com:8080
https_proxy=http://proxy.example.com:8080
ftp_proxy=http://proxy.example.com:8080

容器和主机中的解析器配置相同:

# /etc/resolv.conf
search example.com
nameserver 10.0.0.10
nameserver 10.0.0.11

如果我重新启动 docker 守护程序,--debug我会看到以下内容journalctl -u docker.service:

Aug 08 11:44:23 myhost.example.com dockerd-current[17341]: time="2017-08-08T11:44:23.430769581+10:00" level=debug msg="Name To resolve: http://proxy.example.com."
Aug 08 11:44:23 myhost.example.com dockerd-current[17341]: time="2017-08-08T11:44:23.431488213+10:00" level=debug msg="Query http://proxy.example.com.[1] from 172.18.0.6:38189, forwarding to udp:10.162.182.101"
Aug 08 11:44:27 myhost.example.com dockerd-current[17341]: time="2017-08-08T11:44:27.431772666+10:00" level=debug msg="Read from DNS server failed, read udp 172.18.0.6:38189->10.162.182.101:53: i/o timeout"

进一步观察之后,如果我指定一个 IP 地址而不是代理的 DNS 名称,我可以让一些网络工作;尽管这实际上只是避免使用 DNS 的一种方式,而不是真正的修复。

事实上,(更新#3)事实证明,我可以通过简单地将DNS配置为使用TCP而不是UDP来完全避免这个问题,即

# head -1 /etc/sysconfig/docker
OPTIONS="--dns=10.0.0.10 --dns=10.0.0.11 --dns-search=example.com --dns-opt=use-vc"

(添加一行use-vc告诉解析器使用 TCP 而不是 UDP。)

我确实注意到了 iptables 中的一些可疑规则,但结果证明这些规则是正常的:

# iptables -n -L DOCKER-ISOLATION -v --line-numbers
Chain DOCKER-ISOLATION (1 references)
num   pkts bytes target     prot opt in     out     source               destination         
1        0     0 DROP       all  --  br-1d6a05c10468 docker0  0.0.0.0/0            0.0.0.0/0           
2        0     0 DROP       all  --  docker0 br-1d6a05c10468  0.0.0.0/0            0.0.0.0/0           
3    34903   11M RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0           

删除这两个 DROP 规则后,我继续看到问题。

完整的iptables:

# iptables -nL -v
Chain INPUT (policy ACCEPT 2518 packets, 1158K bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         
23348 9674K DOCKER-ISOLATION  all  --  *      *       0.0.0.0/0            0.0.0.0/0           
    0     0 DOCKER     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
    0     0 ACCEPT     all  --  docker0 !docker0  0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     all  --  docker0 docker0  0.0.0.0/0            0.0.0.0/0           
23244 9667K DOCKER     all  --  *      br-1d6a05c10468  0.0.0.0/0            0.0.0.0/0           
23232 9667K ACCEPT     all  --  *      br-1d6a05c10468  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
  104  6230 ACCEPT     all  --  br-1d6a05c10468 !br-1d6a05c10468  0.0.0.0/0            0.0.0.0/0           
   12   700 ACCEPT     all  --  br-1d6a05c10468 br-1d6a05c10468  0.0.0.0/0            0.0.0.0/0           

Chain OUTPUT (policy ACCEPT 2531 packets, 414K bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain DOCKER (2 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 ACCEPT     tcp  --  !br-1d6a05c10468 br-1d6a05c10468  0.0.0.0/0            172.18.0.2           tcp dpt:443
    0     0 ACCEPT     tcp  --  !br-1d6a05c10468 br-1d6a05c10468  0.0.0.0/0            172.18.0.2           tcp dpt:80
    0     0 ACCEPT     tcp  --  !br-1d6a05c10468 br-1d6a05c10468  0.0.0.0/0            172.18.0.3           tcp dpt:389

Chain DOCKER-ISOLATION (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 DROP       all  --  br-1d6a05c10468 docker0  0.0.0.0/0            0.0.0.0/0           
    0     0 DROP       all  --  docker0 br-1d6a05c10468  0.0.0.0/0            0.0.0.0/0           
23348 9674K RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0           

网桥配置

# ip addr show docker0
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN 
    link/ether 02:42:a8:73:db:bb brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
# ip addr show br-1d6a05c10468
3: br-1d6a05c10468: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP 
    link/ether 02:42:d5:b6:2d:f5 brd ff:ff:ff:ff:ff:ff
    inet 172.18.0.1/16 scope global br-1d6a05c10468
       valid_lft forever preferred_lft forever

和

# docker network inspect bridge 
[
    {
        "Name": "bridge",
        "Id": "e159ddd37386cac91e0d011ade99a51f9fe887b8d32d212884beace67483af44",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.17.0.0/16",
                    "Gateway": "172.17.0.1"
                }
            ]
        },
        "Internal": false,
        "Containers": {},
        "Options": {
            "com.docker.network.bridge.default_bridge": "true",
            "com.docker.network.bridge.enable_icc": "true",
            "com.docker.network.bridge.enable_ip_masquerade": "true",
            "com.docker.network.bridge.host_binding_ipv4": "0.0.0.0",
            "com.docker.network.bridge.name": "docker0",
            "com.docker.network.driver.mtu": "1500"
        },
        "Labels": {}
    }
]

在日志中:

Aug 04 17:33:32 myhost.example.com systemd[1]: Starting Docker Application Container Engine...
Aug 04 17:33:33 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:33.056770003+10:00" level=info msg="libcontainerd: new containerd process, pid: 2140"
Aug 04 17:33:34 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:34.740346421+10:00" level=info msg="Graph migration to content-addressability took 0.00 seconds"
Aug 04 17:33:34 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:34.741164354+10:00" level=info msg="Loading containers: start."
Aug 04 17:33:34 myhost.example.com dockerd-current[2131]: .........................time="2017-08-04T17:33:34.903371015+10:00" level=info msg="Firewalld running: true"
Aug 04 17:33:35 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:35.325581993+10:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address" 
Aug 04 17:33:36 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:36+10:00" level=info msg="Firewalld running: true"
Aug 04 17:33:37 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:37+10:00" level=info msg="Firewalld running: true"
Aug 04 17:33:37 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:37+10:00" level=info msg="Firewalld running: true"
Aug 04 17:33:38 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:38+10:00" level=info msg="Firewalld running: true"
Aug 04 17:33:39 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:39+10:00" level=info msg="Firewalld running: true"
Aug 04 17:33:40 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:40+10:00" level=info msg="Firewalld running: true"
Aug 04 17:33:40 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:40+10:00" level=info msg="Firewalld running: true"
Aug 04 17:33:42 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:42+10:00" level=info msg="Firewalld running: true"
Aug 04 17:33:42 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:42+10:00" level=info msg="Firewalld running: true"
Aug 04 17:33:43 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:43.541905145+10:00" level=info msg="Loading containers: done."
Aug 04 17:33:43 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:43.541975618+10:00" level=info msg="Daemon has completed initialization"
Aug 04 17:33:43 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:43.541998095+10:00" level=info msg="Docker daemon" commit="88a4867/1.12.6" graphdriver=devicemapper version=1.12.6
Aug 04 17:33:43 myhost.example.com dockerd-current[2131]: time="2017-08-04T17:33:43.548508756+10:00" level=info msg="API listen on /var/run/docker.sock"
Aug 04 17:33:43 myhost.example.com systemd[1]: Started Docker Application Container Engine.

从容器中,我可以 ping 默认网关,但所有名称解析都失败。

我注意到日志中有一件奇怪的事情(更新 #2我现在知道这是一条红鲱鱼 - 请参阅下面的讨论):

# journalctl -u docker.service |grep insmod > /tmp/log # \n's replaced below
Jul 26 23:59:02 myhost.example.com dockerd-current[3185]: time="2017-07-26T23:59:02.056295890+10:00" level=warning msg="Running modprobe bridge br_netfilter failed with message: insmod /lib/modules/3.10.0-514.26.2.el7.x86_64/kernel/net/bridge/bridge.ko 
sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-arptables: No such file or directory
sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-iptables: No such file or directory
sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-ip6tables: No such file or directory
modprobe: ERROR: Error running install command for bridge
modprobe: ERROR: could not insert 'bridge': Unknown error 253
insmod /lib/modules/3.10.0-514.26.2.el7.x86_64/kernel/net/llc/llc.ko 
insmod /lib/modules/3.10.0-514.26.2.el7.x86_64/kernel/net/802/stp.ko 
install /sbin/modprobe --ignore-install bridge && /sbin/sysctl -q -w net.bridge.bridge-nf-call-arptables=0 net.bridge.bridge-nf-call-iptables=0 net.bridge.bridge-nf-call-ip6tables=0 
insmod /lib/modules/3.10.0-514.26.2.el7.x86_64/kernel/net/bridge/br_netfilter.ko 
, error: exit status 1"

更新#1:这来自:

# tail -2 /etc/modprobe.d/dist.conf
# Disable netfilter on bridges when the bridge module is loaded
install bridge /sbin/modprobe --ignore-install bridge && /sbin/sysctl -q -w net.bridge.bridge-nf-call-arptables=0 net.bridge.bridge-nf-call-iptables=0 net.bridge.bridge-nf-call-ip6tables=0

还:

# cat /proc/sys/net/bridge/bridge-nf-call-{arp,ip,ip6}tables
1
1
1

但是,即使在我这样做之后:

# for i in /proc/sys/net/bridge/bridge-nf-call-{arp,ip,ip6}tables ; do echo 0 > $i ; done 

仍然没有运气。

我花了一整天的时间在这上面,所以现在把我的头发拉出来了。任何关于我可以尝试的其他方法或其他问题的想法可能会非常感激。

更新#4

我使用 Netcat 进行了一些实验,我已经证明如果从任何容器 -> 主机发送所有 UDP 数据包都不会转发。我尝试使用几个端口,包括 53、2115 和 50000。但是 TCP 数据包很好。如果我刷新 iptables 规则,这仍然是正确的iptables -F。

此外,我可以将 UDP 数据包从一个容器发送到另一个容器 - 只有来自容器的 UDP 流量 -> 主机不会被转发。

要设置测试:

在 IP 为 10.1.1.10 的主机上:

# nc -u -l 50000

在容器上:

# echo "foo" | nc -w1 -u 10.1.1.10 50000

在 TCP 转储捕获期间,我看到:

17:20:36.761214 IP (tos 0x0, ttl 64, id 48146, offset 0, flags [DF], proto UDP (17), length 32)
    172.17.0.2.41727 > 10.1.1.10.50000: [bad udp cksum 0x2afa -> 0x992f!] UDP, length 4
        0x0000:  4500 0020 bc12 4000 4011 53de ac11 0002  E.....@[email protected].....
        0x0010:  0aa5 7424 a2ff c350 000c 2afa 666f 6f0a  ..t$...P..*.foo.
        0x0020:  0000 0000 0000 0000 0000 0000 0000 0000  ................
17:20:36.761214 IP (tos 0x0, ttl 64, id 48146, offset 0, flags [DF], proto UDP (17), length 32)
    172.17.0.2.41727 > 10.1.1.10.50000: [bad udp cksum 0x2afa -> 0x992f!] UDP, length 4
        0x0000:  4500 0020 bc12 4000 4011 53de ac11 0002  E.....@[email protected].....
        0x0010:  0aa5 7424 a2ff c350 000c 2afa 666f 6f0a  ..t$...P..*.foo.
        0x0020:  0000 0000 0000 0000 0000 0000 0000 0000  ................

我尝试通过此修复错误的 UDP 校验和,但未成功。

然而,我注意到,即使在 UDP 数据包(主机 -> 主机)和容器 -> 容器的成功传输期间,也会看到错误的 UDP 校验和。

总之,我现在知道:

  • 路由没问题

  • iptables 被刷新

  • SELinux 是允许的

  • 所有 TCP 都在各个方向工作

  • 来自容器的所有 UDP -> 容器都很好

  • 来自主机的所有 UDP -> 主机都很好

  • 来自主机的所有 UDP -> 容器都很好

  • 但是没有来自容器-> 主机的 UDP 数据包被转发

linux-networking
  • 3 个回答
  • 17826 Views
Martin Hope
Alex Harvey
Asked: 2017-05-04 01:26:43 +0800 CST

Stunnel 代理发出文件未找到错误

  • 1

我在 Red Hat Linux 6.8 上有一个 Stunnel 4.29,它不会启动并发出“没有这样的文件或目录”错误:

# /usr/bin/stunnel /etc/stunnel/agent/dynatrace-agent.conf 
2017.05.03 19:04:26 LOG7[3880:140667243153344]: Snagged 64 random bytes from /root/.rnd
2017.05.03 19:04:26 LOG7[3880:140667243153344]: Wrote 1024 new random bytes to /root/.rnd
2017.05.03 19:04:26 LOG7[3880:140667243153344]: RAND_status claims sufficient entropy for the PRNG
2017.05.03 19:04:26 LOG7[3880:140667243153344]: PRNG seeded successfully
2017.05.03 19:04:26 LOG3[3880:140667243153344]: nil: No such file or directory (2)

使用 strace 我看到一个可疑的尝试统计文件“nil”:

# strace -e trace=stat -f /usr/bin/stunnel /etc/stunnel/agent/dynatrace-agent.conf 
stat("/root/.rnd", {st_mode=S_IFREG|0600, st_size=1024, ...}) = 0
stat("/root/.rnd", {st_mode=S_IFREG|0600, st_size=1024, ...}) = 0
stat("/root/.rnd", {st_mode=S_IFREG|0600, st_size=1024, ...}) = 0
stat("nil", 0x7ffe119643d0)             = -1 ENOENT (No such file or directory)
2017.05.03 19:11:30 LOG7[3916:140189915436992]: Snagged 64 random bytes from /root/.rnd
2017.05.03 19:11:30 LOG7[3916:140189915436992]: Wrote 1024 new random bytes to /root/.rnd
2017.05.03 19:11:30 LOG7[3916:140189915436992]: RAND_status claims sufficient entropy for the PRNG
2017.05.03 19:11:30 LOG7[3916:140189915436992]: PRNG seeded successfully
2017.05.03 19:11:30 LOG3[3916:140189915436992]: nil: No such file or directory (2)
+++ exited with 1 +++

我还看到尝试连接到套接字失败:

# strace -e trace=connect -f /usr/bin/stunnel /etc/stunnel/agent/dynatrace-agent.conf                                                                                               
connect(4, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(4, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(4, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(4, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(4, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(4, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
2017.05.03 19:12:54 LOG7[3928:139643326924736]: Snagged 64 random bytes from /root/.rnd
2017.05.03 19:12:54 LOG7[3928:139643326924736]: Wrote 1024 new random bytes to /root/.rnd
2017.05.03 19:12:54 LOG7[3928:139643326924736]: RAND_status claims sufficient entropy for the PRNG
2017.05.03 19:12:54 LOG7[3928:139643326924736]: PRNG seeded successfully
2017.05.03 19:12:54 LOG3[3928:139643326924736]: nil: No such file or directory (2)
+++ exited with 1 +++

这是我的配置文件:

# cat /etc/stunnel/agent/dynatrace-agent.conf 
; This stunnel config is managed by Puppet.

cert = nil
key = nil
CAfile = nil
CRLfile = nil
sslVersion = TLSv1
verify = 2

chroot = /var/lib/stunnel/dynatrace-agent
setuid = dtagent
setgid = dtagent
pid = dynatrace-agent.pid

socket = l:TCP_NODELAY=1
socket = r:TCP_NODELAY=1

debug = 7
output = /var/log/dynatrace-agent.log

client = yes

[dynatrace-agent]
accept = localhost:9998
connect = x.x.x.x:7443

版本信息:

# stunnel -version
stunnel 4.29 on x86_64-redhat-linux-gnu with OpenSSL 1.0.1e-fips 11 Feb 2013
Threading:PTHREAD SSL:ENGINE,FIPS Sockets:POLL,IPv6 Auth:LIBWRAP

Global options
debug           = 5
pid             = /var/run/stunnel.pid
RNDbytes        = 64
RNDfile         = /dev/urandom
RNDoverwrite    = yes

Service-level options
cert            = /etc/stunnel/stunnel.pem
ciphers         = ALL:!aNULL:!eNULL:!SSLv2:!EXPORT:!RC2:!DES
curve                  = prime256v1
key             = /etc/stunnel/stunnel.pem
session         = 300 seconds
stack           = 65536 bytes
sslVersion      = all
TIMEOUTbusy     = 300 seconds
TIMEOUTclose    = 60 seconds
TIMEOUTconnect  = 10 seconds
TIMEOUTidle     = 43200 seconds
verify          = none
stunnel
  • 1 个回答
  • 598 Views

Sidebar

Stats

  • 问题 205573
  • 回答 270741
  • 最佳答案 135370
  • 用户 68524
  • 热门
  • 回答
  • Marko Smith

    新安装后 postgres 的默认超级用户用户名/密码是什么?

    • 5 个回答
  • Marko Smith

    SFTP 使用什么端口?

    • 6 个回答
  • Marko Smith

    命令行列出 Windows Active Directory 组中的用户?

    • 9 个回答
  • Marko Smith

    什么是 Pem 文件,它与其他 OpenSSL 生成的密钥文件格式有何不同?

    • 3 个回答
  • Marko Smith

    如何确定bash变量是否为空?

    • 15 个回答
  • Martin Hope
    Tom Feiner 如何按大小对 du -h 输出进行排序 2009-02-26 05:42:42 +0800 CST
  • Martin Hope
    Noah Goodrich 什么是 Pem 文件,它与其他 OpenSSL 生成的密钥文件格式有何不同? 2009-05-19 18:24:42 +0800 CST
  • Martin Hope
    Brent 如何确定bash变量是否为空? 2009-05-13 09:54:48 +0800 CST
  • Martin Hope
    cletus 您如何找到在 Windows 中打开文件的进程? 2009-05-01 16:47:16 +0800 CST

热门标签

linux nginx windows networking ubuntu domain-name-system amazon-web-services active-directory apache-2.4 ssh

Explore

  • 主页
  • 问题
    • 最新
    • 热门
  • 标签
  • 帮助

Footer

AskOverflow.Dev

关于我们

  • 关于我们
  • 联系我们

Legal Stuff

  • Privacy Policy

Language

  • Pt
  • Server
  • Unix

© 2023 AskOverflow.DEV All Rights Reserve