I have an LXC container. I want to route all of its traffic through a different interface (tap0) than the one the host uses.
Host interfaces:
tap0: 172.30.0.3, gateway 172.30.0.1
lxcbr0: 192.168.12.104, with the container's veth as a member
Inside the container there is an eth0 with 192.168.12.105 and a default route via 192.168.12.104. Of course, I can ping the host from the container and vice versa.
The container's routing table is trivial:
# ip route show
default via 192.168.12.104 dev eth0
192.168.12.0/24 dev eth0 proto kernel scope link src 192.168.12.105
I created a separate routing table on the host:
# ip rule add from all fwmark 1234 table 1234
# ip route show table 1234
default via 172.30.0.1 dev tap0
The host's main routing table (again, nothing special):
# ip route show
default via 192.168.xxx.xxx dev eth0 proto dhcp src 192.168.xxx.xxx metric 2004 mtu 1500
172.30.0.0/16 dev tap0 proto kernel scope link src 172.30.0.3
192.168.xxx.0/24 dev eth0 proto dhcp scope link src 192.168.xxx.xxx metric 2004 mtu 1500
192.168.12.0/24 dev lxcbr0 proto kernel scope link src 192.168.12.104
I configured iptables like this:
iptables -t nat -A POSTROUTING -o tap0 -j MASQUERADE
iptables -t nat -A PREROUTING -i lxcbr0 -j MARK --set-mark 1234
Now I try to ping 8.8.8.8 from the container, and exactly every second ping is lost. Reliably.
# ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=109 time=48.9 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=109 time=47.1 ms
64 bytes from 8.8.8.8: icmp_seq=5 ttl=109 time=47.0 ms
64 bytes from 8.8.8.8: icmp_seq=7 ttl=109 time=46.9 ms
64 bytes from 8.8.8.8: icmp_seq=9 ttl=109 time=47.1 ms
64 bytes from 8.8.8.8: icmp_seq=11 ttl=109 time=47.3 ms
64 bytes from 8.8.8.8: icmp_seq=13 ttl=109 time=47.1 ms
64 bytes from 8.8.8.8: icmp_seq=15 ttl=109 time=47.0 ms
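To confirm the pattern without reading the output by eye, something like the following could list the missing sequence numbers. This is only a sketch: `missing_seqs` is a helper name made up for this post, and it assumes the usual Linux ping output format with an `icmp_seq=N` field.

```shell
#!/bin/sh
# missing_seqs (hypothetical helper): read ping output on stdin and print
# every icmp_seq between 1 and the highest one seen that has no reply line.
missing_seqs() {
    awk '
        match($0, /icmp_seq=[0-9]+/) {
            # "icmp_seq=" is 9 characters long; keep only the number
            seq = substr($0, RSTART + 9, RLENGTH - 9) + 0
            seen[seq] = 1
            if (seq > max) max = seq
        }
        END {
            out = ""
            for (i = 1; i <= max; i++)
                if (!(i in seen)) out = out (out == "" ? "" : " ") i
            print out
        }
    '
}

# Example using the first replies from the output above
missing_seqs <<'EOF'
64 bytes from 8.8.8.8: icmp_seq=1 ttl=109 time=48.9 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=109 time=47.1 ms
64 bytes from 8.8.8.8: icmp_seq=5 ttl=109 time=47.0 ms
64 bytes from 8.8.8.8: icmp_seq=7 ttl=109 time=46.9 ms
EOF
# prints: 2 4 6
```

On a live system the same function can be fed directly, for example `ping -c 20 8.8.8.8 | missing_seqs`.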
Traffic on the bridge:
# tcpdump -i lxcbr0 -n
listening on lxcbr0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
10:04:46.208273 IP 192.168.12.105 > 8.8.8.8: ICMP echo request, id 138, seq 1, length 64
10:04:46.257177 IP 8.8.8.8 > 192.168.12.105: ICMP echo reply, id 138, seq 1, length 64
10:04:47.209372 IP 192.168.12.105 > 8.8.8.8: ICMP echo request, id 138, seq 2, length 64
10:04:48.236402 IP 192.168.12.105 > 8.8.8.8: ICMP echo request, id 138, seq 3, length 64
10:04:48.283429 IP 8.8.8.8 > 192.168.12.105: ICMP echo reply, id 138, seq 3, length 64
10:04:49.237599 IP 192.168.12.105 > 8.8.8.8: ICMP echo request, id 138, seq 4, length 64
10:04:50.252397 IP 192.168.12.105 > 8.8.8.8: ICMP echo request, id 138, seq 5, length 64
10:04:50.299356 IP 8.8.8.8 > 192.168.12.105: ICMP echo reply, id 138, seq 5, length 64
10:04:51.253520 IP 192.168.12.105 > 8.8.8.8: ICMP echo request, id 138, seq 6, length 64
10:04:52.268435 IP 192.168.12.105 > 8.8.8.8: ICMP echo request, id 138, seq 7, length 64
10:04:52.315270 IP 8.8.8.8 > 192.168.12.105: ICMP echo reply, id 138, seq 7, length 64
10:04:53.270429 IP 192.168.12.105 > 8.8.8.8: ICMP echo request, id 138, seq 8, length 64
10:04:54.284396 IP 192.168.12.105 > 8.8.8.8: ICMP echo request, id 138, seq 9, length 64
10:04:54.331473 IP 8.8.8.8 > 192.168.12.105: ICMP echo reply, id 138, seq 9, length 64
Traffic on tap0:
# tcpdump -i tap0 -n
listening on tap0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
10:04:46.208342 IP 172.30.0.3 > 8.8.8.8: ICMP echo request, id 138, seq 1, length 64
10:04:46.257147 IP 8.8.8.8 > 172.30.0.3: ICMP echo reply, id 138, seq 1, length 64
10:04:48.236458 IP 172.30.0.3 > 8.8.8.8: ICMP echo request, id 138, seq 3, length 64
10:04:48.283402 IP 8.8.8.8 > 172.30.0.3: ICMP echo reply, id 138, seq 3, length 64
10:04:50.252446 IP 172.30.0.3 > 8.8.8.8: ICMP echo request, id 138, seq 5, length 64
10:04:50.299328 IP 8.8.8.8 > 172.30.0.3: ICMP echo reply, id 138, seq 5, length 64
10:04:52.268485 IP 172.30.0.3 > 8.8.8.8: ICMP echo request, id 138, seq 7, length 64
10:04:52.315242 IP 8.8.8.8 > 172.30.0.3: ICMP echo reply, id 138, seq 7, length 64
10:04:54.284445 IP 172.30.0.3 > 8.8.8.8: ICMP echo request, id 138, seq 9, length 64
10:04:54.331445 IP 8.8.8.8 > 172.30.0.3: ICMP echo reply, id 138, seq 9, length 64
10:04:56.300446 IP 172.30.0.3 > 8.8.8.8: ICMP echo request, id 138, seq 11, length 64
10:04:56.347598 IP 8.8.8.8 > 172.30.0.3: ICMP echo reply, id 138, seq 11, length 64
Whenever a ping goes out through tap0, there is always a reply (so everything behind tap0 works fine).
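The comparison between the two captures can also be done mechanically. The sketch below assumes both tcpdump runs were saved as text files; `dropped_between` and the file names are hypothetical, and the second capture has to cover at least the same time window as the first, otherwise late requests show up as false positives.

```shell
#!/bin/sh
# dropped_between (hypothetical helper): print the ICMP echo request sequence
# numbers that appear in capture $1 (ingress side) but not in capture $2
# (egress side). Both files are plain tcpdump -n text output.
dropped_between() {
    awk '
        /ICMP echo request/ {
            # find the field after the literal word "seq" and drop the comma
            for (i = 1; i < NF; i++)
                if ($i == "seq") { s = $(i + 1); sub(/,$/, "", s) }
            if (FILENAME == ARGV[1]) order[++n] = s
            else egress[s] = 1
        }
        END {
            for (k = 1; k <= n; k++)
                if (!(order[k] in egress)) print order[k]
        }
    ' "$1" "$2"
}

# Example with a few lines taken from the captures above
tmp=$(mktemp -d)
cat > "$tmp/lxcbr0.txt" <<'EOF'
10:04:46.208273 IP 192.168.12.105 > 8.8.8.8: ICMP echo request, id 138, seq 1, length 64
10:04:46.257177 IP 8.8.8.8 > 192.168.12.105: ICMP echo reply, id 138, seq 1, length 64
10:04:47.209372 IP 192.168.12.105 > 8.8.8.8: ICMP echo request, id 138, seq 2, length 64
10:04:48.236402 IP 192.168.12.105 > 8.8.8.8: ICMP echo request, id 138, seq 3, length 64
10:04:49.237599 IP 192.168.12.105 > 8.8.8.8: ICMP echo request, id 138, seq 4, length 64
EOF
cat > "$tmp/tap0.txt" <<'EOF'
10:04:46.208342 IP 172.30.0.3 > 8.8.8.8: ICMP echo request, id 138, seq 1, length 64
10:04:46.257147 IP 8.8.8.8 > 172.30.0.3: ICMP echo reply, id 138, seq 1, length 64
10:04:48.236458 IP 172.30.0.3 > 8.8.8.8: ICMP echo request, id 138, seq 3, length 64
EOF
dropped_between "$tmp/lxcbr0.txt" "$tmp/tap0.txt"   # prints 2 and 4, one per line
rm -r "$tmp"
```

Requests listed here reached the bridge but never left through tap0, which points at the host's routing or netfilter path rather than at anything beyond tap0.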
It looks like the host is dropping the container's outgoing traffic. How can I debug this situation?
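A few read-only checks that may help narrow this kind of problem down are sketched below. This is a generic checklist, not a confirmed diagnosis: it assumes iproute2, iptables and conntrack-tools are installed, reuses the addresses, interface names and mark value from this post, and `run` and `debug_checks` are helper names invented for the example. By default the commands are only printed; set `RUN=1` as root to actually execute them.

```shell
#!/bin/sh
# Dry-run wrapper: always print the command, execute it only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@" || echo "  (command failed or is not available)"
    fi
}

debug_checks() {
    # Is the fwmark rule present, and does its table have a default route?
    run ip rule show
    run ip route show table 1234
    # Ask the kernel which route a marked packet arriving on lxcbr0 would take
    run ip route get 8.8.8.8 from 192.168.12.105 iif lxcbr0 mark 1234
    # Per-rule packet counters: compare them with the number of pings sent
    run iptables -t nat -L PREROUTING -v -n
    run iptables -t nat -L POSTROUTING -v -n
    # Connection tracking entries for ICMP (from conntrack-tools)
    run conntrack -L -p icmp
    # Strict reverse-path filtering can drop policy-routed traffic
    run sysctl net.ipv4.conf.all.rp_filter net.ipv4.conf.tap0.rp_filter
}

debug_checks
```

If the PREROUTING counter in the nat table grows more slowly than the number of echo requests seen on lxcbr0, not every packet is reaching the MARK rule.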
I solved the problem by using the mangle table instead of the nat table. The magic configuration is: