描述
配置
我有 3 个节点,使用 Tinc VPN 连接在一起,我想在其中安装 HAproxy 并拥有一个 VIP,以便 HAproxy 本身处于高可用性模式。
以下是节点详细信息:
- 节点 1在接口vpn上的 IP 地址为10.0.0.222/32
- 节点 2在接口vpn上的 IP 地址为10.0.0.13/32
- 节点 3在接口vpn上的 IP 地址为10.0.0.103/32
为此,我keepalived
在每台机器上都安装了。
我还启用了以下 sysctl:
net.ipv4.ip_forward = 1
net.ipv4.ip_nonlocal_bind = 1
节点 1 具有以下/etc/keepalived/keepalived.conf文件:
global_defs {
enable_script_security
router_id node-1
}
vrrp_script haproxy-check {
script "/usr/bin/killall -0 haproxy"
interval 2
weight 2
}
vrrp_instance haproxy-vip {
state MASTER
priority 150
interface vpn
virtual_router_id 1
advert_int 1
virtual_ipaddress {
10.0.0.1/32
}
track_script {
haproxy-check
}
}
节点 2 和 3 具有以下/etc/keepalived/keepalived.conf文件:
global_defs {
enable_script_security
router_id node-2 # Node 3 has "node-3" here.
}
vrrp_script haproxy-check {
script "/usr/bin/killall -0 haproxy"
interval 2
weight 2
}
vrrp_instance haproxy-vip {
state BACKUP
priority 100
interface vpn
virtual_router_id 1
advert_int 1
virtual_ipaddress {
10.0.0.1/32
}
track_script {
haproxy-check
}
}
当所有节点都在运行keepalived
时,节点 1 是主节点,并且 VIP10.0.0.1
配置良好,其他 2 个节点 ping 它。
节点 1 日志
启动时的日志keepalived
:
Dec 5 14:07:53 node-1 systemd[1]: Starting Keepalive Daemon (LVS and VRRP)...
Dec 5 14:07:53 node-1 Keepalived[5870]: Starting Keepalived v1.3.2 (12/03,2016)
Dec 5 14:07:53 node-1 systemd[1]: Started Keepalive Daemon (LVS and VRRP).
Dec 5 14:07:53 node-1 Keepalived[5870]: WARNING - default user 'keepalived_script' for script execution does not exist - please create.
Dec 5 14:07:53 node-1 Keepalived[5870]: Opening file '/etc/keepalived/keepalived.conf'.
Dec 5 14:07:53 node-1 Keepalived[5871]: Starting Healthcheck child process, pid=5872
Dec 5 14:07:53 node-1 Keepalived_healthcheckers[5872]: Initializing ipvs
Dec 5 14:07:53 node-1 Keepalived_healthcheckers[5872]: Registering Kernel netlink reflector
Dec 5 14:07:53 node-1 Keepalived_healthcheckers[5872]: Registering Kernel netlink command channel
Dec 5 14:07:53 node-1 Keepalived_healthcheckers[5872]: Opening file '/etc/keepalived/keepalived.conf'.
Dec 5 14:07:53 node-1 Keepalived[5871]: Starting VRRP child process, pid=5873
Dec 5 14:07:53 node-1 Keepalived_vrrp[5873]: Registering Kernel netlink reflector
Dec 5 14:07:53 node-1 Keepalived_vrrp[5873]: Registering Kernel netlink command channel
Dec 5 14:07:53 node-1 Keepalived_vrrp[5873]: Registering gratuitous ARP shared channel
Dec 5 14:07:53 node-1 Keepalived_vrrp[5873]: Opening file '/etc/keepalived/keepalived.conf'.
Dec 5 14:07:53 node-1 Keepalived_healthcheckers[5872]: Using LinkWatch kernel netlink reflector...
Dec 5 14:07:53 node-1 Keepalived_vrrp[5873]: Using LinkWatch kernel netlink reflector...
Dec 5 14:07:53 node-1 Keepalived_vrrp[5873]: VRRP_Script(haproxy-check) succeeded
Dec 5 14:07:54 node-1 Keepalived_vrrp[5873]: VRRP_Instance(haproxy-vip) Transition to MASTER STATE
Dec 5 14:07:54 node-1 Keepalived_vrrp[5873]: VRRP_Instance(haproxy-vip) Changing effective priority from 150 to 152
Dec 5 14:07:55 node-1 Keepalived_vrrp[5873]: VRRP_Instance(haproxy-vip) Entering MASTER STATE
Dec 5 14:07:57 node-1 ntpd[946]: Listen normally on 45 vpn 10.0.0.1:123
节点 1 ip addr
:
vpn: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 500
link/none
inet 10.0.0.222/24 scope global vpn
valid_lft forever preferred_lft forever
inet 10.0.0.1/24 scope global secondary vpn
valid_lft forever preferred_lft forever
节点 2 和 3 日志
Dec 5 14:14:32 node-2 systemd[1]: Starting Keepalive Daemon (LVS and VRRP)...
Dec 5 14:14:32 node-2 Keepalived[13745]: Starting Keepalived v1.3.2 (12/03,2016)
Dec 5 14:14:32 node-2 Keepalived[13745]: WARNING - default user 'keepalived_script' for script execution does not exist - please create.
Dec 5 14:14:32 node-2 Keepalived[13745]: Opening file '/etc/keepalived/keepalived.conf'.
Dec 5 14:14:32 node-2 Keepalived[13746]: Starting Healthcheck child process, pid=13747
Dec 5 14:14:32 node-2 Keepalived_healthcheckers[13747]: Initializing ipvs
Dec 5 14:14:32 node-2 systemd[1]: Started Keepalive Daemon (LVS and VRRP).
Dec 5 14:14:32 node-2 Keepalived_healthcheckers[13747]: Registering Kernel netlink reflector
Dec 5 14:14:32 node-2 Keepalived_healthcheckers[13747]: Registering Kernel netlink command channel
Dec 5 14:14:32 node-2 Keepalived[13746]: Starting VRRP child process, pid=13748
Dec 5 14:14:32 node-2 Keepalived_healthcheckers[13747]: Opening file '/etc/keepalived/keepalived.conf'.
Dec 5 14:14:32 node-2 Keepalived_vrrp[13748]: Registering Kernel netlink reflector
Dec 5 14:14:32 node-2 Keepalived_vrrp[13748]: Registering Kernel netlink command channel
Dec 5 14:14:32 node-2 Keepalived_vrrp[13748]: Registering gratuitous ARP shared channel
Dec 5 14:14:32 node-2 Keepalived_vrrp[13748]: Opening file '/etc/keepalived/keepalived.conf'.
Dec 5 14:14:32 node-2 Keepalived_healthcheckers[13747]: Using LinkWatch kernel netlink reflector...
Dec 5 14:14:32 node-2 Keepalived_vrrp[13748]: Using LinkWatch kernel netlink reflector...
Dec 5 14:14:32 node-2 Keepalived_vrrp[13748]: VRRP_Instance(haproxy-vip) Entering BACKUP STATE
Dec 5 14:14:32 node-2 Keepalived_vrrp[13748]: VRRP_Script(haproxy-check) succeeded
Dec 5 14:14:33 node-2 Keepalived_vrrp[13748]: VRRP_Instance(haproxy-vip) Changing effective priority from 100 to 102
节点 2 和 3 ip addr
:
节点 2
vpn: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 500
link/none
inet 10.0.0.13/24 scope global vpn
valid_lft forever preferred_lft forever
节点 3
vpn: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 500
link/none
inet 10.0.0.103/24 scope global vpn
valid_lft forever preferred_lft forever
问题
但是,当我停keepalived
在节点 1 上时,节点 3 被选为主节点,并注册了 VIP,只有节点 3 ping 10.0.0.1。
节点 1 日志
停止时:
Dec 5 14:15:26 node-1 systemd[1]: Stopping Keepalive Daemon (LVS and VRRP)...
Dec 5 14:15:26 node-1 Keepalived[5871]: Stopping
Dec 5 14:15:26 node-1 Keepalived_healthcheckers[5872]: Stopped
Dec 5 14:15:26 node-1 Keepalived_vrrp[5873]: VRRP_Instance(haproxy-vip) sent 0 priority
Dec 5 14:15:27 node-1 Keepalived_vrrp[5873]: Stopped
Dec 5 14:15:27 node-1 Keepalived[5871]: Stopped Keepalived v1.3.2 (12/03,2016)
Dec 5 14:15:27 node-1 systemd[1]: Stopped Keepalive Daemon (LVS and VRRP).
Dec 5 14:15:28 node-1 ntpd[946]: Deleting interface #45 vpn, 10.0.0.1#123, interface stats: received=0, sent=0, dropped=0, active_time=451 secs
节点 1 ip addr
:
vpn: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 500
link/none
inet 10.0.0.222/24 scope global vpn
valid_lft forever preferred_lft forever
节点 2 日志
Dec 5 14:15:27 node-2 Keepalived_vrrp[13748]: VRRP_Instance(haproxy-vip) Transition to MASTER STATE
Dec 5 14:15:27 node-2 Keepalived_vrrp[13748]: VRRP_Instance(haproxy-vip) Received advert with higher priority 102, ours 102
Dec 5 14:15:27 node-2 Keepalived_vrrp[13748]: VRRP_Instance(haproxy-vip) Entering BACKUP STATE
节点 2 ip addr
:
vpn: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 500
link/none
inet 10.0.0.13/24 scope global vpn
valid_lft forever preferred_lft forever
节点 3 日志
Dec 5 14:15:27 node-3 Keepalived_vrrp[31252]: VRRP_Instance(haproxy-vip) Transition to MASTER STATE
Dec 5 14:15:27 node-3 Keepalived_vrrp[31252]: VRRP_Instance(haproxy-vip) Received advert with lower priority 102, ours 102, forcing new election
Dec 5 14:15:28 node-3 Keepalived_vrrp[31252]: VRRP_Instance(haproxy-vip) Entering MASTER STATE
Dec 5 14:15:29 node-3 ntpd[27734]: Listen normally on 36 vpn 10.0.0.1:123
节点 3 ip addr
:
vpn: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 500
link/none
inet 10.0.0.103/24 scope global vpn
valid_lft forever preferred_lft forever
inet 10.0.0.1/24 scope global secondary vpn
valid_lft forever preferred_lft forever
更多细节
跟踪路由
我用来traceroute
尝试获取有关该问题的更多信息。
当所有节点都在运行keepalived
并且 ping VIP 无处不在时,traceroute
显示所有节点:
$ traceroute 10.0.0.1
traceroute to 10.0.0.1 (10.0.0.1), 30 hops max, 60 byte packets
1 10.0.0.1 (10.0.0.1) 0.094 ms 0.030 ms 0.019 ms
当keepalived
节点1上停止时,节点3选举了,节点1无法弄清楚VIP在哪里:
$ traceroute 10.0.0.1
traceroute to 10.0.0.1 (10.0.0.1), 30 hops max, 60 byte packets
1 * * *
2 * * *
...
29 * * *
30 * * *
节点 2 期望节点 1 拥有 VIP:
$ traceroute 10.0.0.1
traceroute to 10.0.0.1 (10.0.0.1), 30 hops max, 60 byte packets
1 10.0.0.222 (10.0.0.222) 0.791 ms 0.962 ms 1.080 ms
2 * * *
3 * * *
...
并且节点 3 有 VIP,所以它可以工作。
Tinc 设备类型
我阅读了一些邮件存档,建议DeviceType = tap
在 Tinc 配置中使用 以便传输 ARP 包(据我了解),但它没有帮助。
实际上,随着选举的发生,我不确定 Tinc 是根本原因。
尝试不使用 Tinc
我更改了keepalived
配置,使其使用公共互联网接口,使用单播。
我在每个节点上的每个 keepalived 配置中添加了以下块(这里是 for node-1
):
unicast_src_ip XXX.XXX.XXX.XXX # node's public IP address
unicast_peer {
XXX.XXX.XXX.XXX # other node's public IP address
XXX.XXX.XXX.XXX # other node's public IP address
}
但是行为与上面描述的完全一样,所以 Tinc 不应该是相关的。
要求
谁能帮我找出问题所在并解决这个问题,以便在进行新的选举时,节点可以在新位置找到 VIP?