I am trying to deploy OpenStack Queens with kolla-ansible (7.0.0) on Ubuntu hosts, following the official guide.
After successful bootstrap-servers and prechecks runs, the deploy command fails:
RUNNING HANDLER [haproxy : Waiting for virtual IP to appear] **********************************************************
fatal: [testcloudcontrol01]: FAILED! => {"changed": false, "elapsed": 300, "msg": "Timeout when waiting for 10.52.41.98:3306"}
fatal: [testcloudcontrol02]: FAILED! => {"changed": false, "elapsed": 300, "msg": "Timeout when waiting for 10.52.41.98:3306"}
The check fails because the kolla_internal_vip_address never comes up.
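The failing handler waits for a TCP port on the VIP to accept connections and gives up after 300 seconds. As a rough illustration of that kind of check (a minimal Python sketch, not kolla-ansible's actual implementation; the function name wait_for_port and its parameters are made up for this example):

```python
import socket
import time


def wait_for_port(host, port, timeout=300.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper for illustration only.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means some host owns the address
            # and a service is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused, unreachable, or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Running wait_for_port("10.52.41.98", 3306, timeout=10) from a controller node shows whether anything answers on the VIP. While the address is not assigned to any host, it should return False.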
globals.yml:
config_strategy: "COPY_ALWAYS"
kolla_base_distro: "ubuntu"
kolla_install_type: "binary"
openstack_release: "queens"
kolla_internal_vip_address: "10.52.41.98"
kolla_internal_fqdn: "testcloudapi.example.com"
kolla_external_vip_address: "{{ kolla_internal_vip_address }}"
kolla_external_fqdn: "{{ kolla_internal_fqdn }}"
network_interface: "ens160"
api_interface: "ens160"
storage_interface: "ens161"
keepalived_virtual_router_id: "148"
I am currently pinned to Queens because I want to replicate our production environment for testing.
The output of ip addr on one of the nodes where haproxy should be deployed:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 00:50:56:a1:6a:2c brd ff:ff:ff:ff:ff:ff
inet 10.52.41.100/24 brd 10.52.41.255 scope global ens160
valid_lft forever preferred_lft forever
inet6 fe80::250:56ff:fea1:6a2c/64 scope link
valid_lft forever preferred_lft forever
3: ens161: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 00:50:56:a1:7d:07 brd ff:ff:ff:ff:ff:ff
inet 10.52.42.100/24 brd 10.52.42.255 scope global ens161
valid_lft forever preferred_lft forever
inet6 fe80::250:56ff:fea1:7d07/64 scope link
valid_lft forever preferred_lft forever
4: ens224: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 00:50:56:a1:23:6e brd ff:ff:ff:ff:ff:ff
inet 10.52.40.100/24 brd 10.52.40.255 scope global ens224
valid_lft forever preferred_lft forever
inet6 fe80::250:56ff:fea1:236e/64 scope link
valid_lft forever preferred_lft forever
5: ens256: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 00:50:56:a1:20:12 brd ff:ff:ff:ff:ff:ff
inet 10.52.44.100/24 brd 10.52.44.255 scope global ens256
valid_lft forever preferred_lft forever
inet6 fe80::250:56ff:fea1:2012/64 scope link
valid_lft forever preferred_lft forever
6: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
link/ether 02:42:b0:8a:93:e7 brd ff:ff:ff:ff:ff:ff
inet 172.17.0.1/16 scope global docker0
valid_lft forever preferred_lft forever
The nodes are VMware virtual machines with VMXNET3 adapters.
Output of docker logs keepalived:
+ sudo -E kolla_set_configs
INFO:__main__:Loading config file at /var/lib/kolla/config_files/config.json
INFO:__main__:Validating config file
INFO:__main__:Kolla config strategy set to: COPY_ALWAYS
INFO:__main__:Copying service configuration files
INFO:__main__:Deleting /etc/keepalived/keepalived.conf
INFO:__main__:Copying /var/lib/kolla/config_files/keepalived.conf to /etc/keepalived/keepalived.conf
INFO:__main__:Setting permission for /etc/keepalived/keepalived.conf
INFO:__main__:Writing out command to execute
++ cat /run_command
+ CMD='/usr/sbin/keepalived -nld -p /run/keepalived.pid'
+ ARGS=
+ [[ ! -n '' ]]
+ . kolla_extend_start
++ modprobe ip_vs
++ '[' -f /run/keepalived.pid ']'
+ echo 'Running command: '\''/usr/sbin/keepalived -nld -p /run/keepalived.pid'\'''
Running command: '/usr/sbin/keepalived -nld -p /run/keepalived.pid'
+ exec /usr/sbin/keepalived -nld -p /run/keepalived.pid
Thu Dec 13 12:10:26 2018: Starting Keepalived v1.3.9 (10/21,2017)
Thu Dec 13 12:10:26 2018: Opening file '/etc/keepalived/keepalived.conf'.
Thu Dec 13 12:10:26 2018: Starting Healthcheck child process, pid=11
Thu Dec 13 12:10:26 2018: Opening file '/etc/keepalived/keepalived.conf'.
Thu Dec 13 12:10:26 2018: Starting VRRP child process, pid=12
Thu Dec 13 12:10:26 2018: ------< Global definitions >------
Thu Dec 13 12:10:26 2018: Router ID = testcloudcontrol01.example.com
Thu Dec 13 12:10:26 2018: Default interface = eth0
Thu Dec 13 12:10:26 2018: LVS flush = false
Thu Dec 13 12:10:26 2018: VRRP IPv4 mcast group = 224.0.0.18
Thu Dec 13 12:10:26 2018: VRRP IPv6 mcast group = ff02::12
Thu Dec 13 12:10:26 2018: Gratuitous ARP delay = 5
Thu Dec 13 12:10:26 2018: Gratuitous ARP repeat = 5
Thu Dec 13 12:10:26 2018: Gratuitous ARP refresh timer = 0
Thu Dec 13 12:10:26 2018: Gratuitous ARP refresh repeat = 1
Thu Dec 13 12:10:26 2018: Gratuitous ARP lower priority delay = 4294
Thu Dec 13 12:10:26 2018: Gratuitous ARP lower priority repeat = -1
Thu Dec 13 12:10:26 2018: Send advert after receive lower priority advert = true
Thu Dec 13 12:10:26 2018: Send advert after receive higher priority advert = false
Thu Dec 13 12:10:26 2018: Gratuitous ARP interval = 0
Thu Dec 13 12:10:26 2018: Gratuitous NA interval = 0
Thu Dec 13 12:10:26 2018: VRRP default protocol version = 2
Thu Dec 13 12:10:26 2018: Iptables input chain = INPUT
Thu Dec 13 12:10:26 2018: Using ipsets = true
Thu Dec 13 12:10:26 2018: ipset IPv4 address set = keepalived
Thu Dec 13 12:10:26 2018: ipset IPv6 address set = keepalived6
Thu Dec 13 12:10:26 2018: ipset IPv6 address,iface set = keepalived_if6
Thu Dec 13 12:10:26 2018: VRRP check unicast_src = false
Thu Dec 13 12:10:26 2018: VRRP skip check advert addresses = false
Thu Dec 13 12:10:26 2018: VRRP strict mode = false
Thu Dec 13 12:10:26 2018: VRRP process priority = 0
Thu Dec 13 12:10:26 2018: VRRP don't swap = false
Thu Dec 13 12:10:26 2018: Checker process priority = 0
Thu Dec 13 12:10:26 2018: Checker don't swap = false
Thu Dec 13 12:10:26 2018: SNMP keepalived disabled
Thu Dec 13 12:10:26 2018: SNMP checker disabled
Thu Dec 13 12:10:26 2018: SNMP RFCv2 disabled
Thu Dec 13 12:10:26 2018: SNMP RFCv3 disabled
Thu Dec 13 12:10:26 2018: SNMP traps disabled
Thu Dec 13 12:10:26 2018: SNMP socket = default (unix:/var/agentx/master)
Thu Dec 13 12:10:26 2018: Network namespace = (default)
Thu Dec 13 12:10:26 2018: DBus disabled
Thu Dec 13 12:10:26 2018: DBus service name = (null)
Thu Dec 13 12:10:26 2018: Script security disabled
Thu Dec 13 12:10:26 2018: Default script uid:gid 0:0
Thu Dec 13 12:10:26 2018: Registering Kernel netlink reflector
Thu Dec 13 12:10:26 2018: Registering Kernel netlink command channel
Thu Dec 13 12:10:26 2018: Registering gratuitous ARP shared channel
Thu Dec 13 12:10:26 2018: Opening file '/etc/keepalived/keepalived.conf'.
Thu Dec 13 12:10:26 2018: WARNING - default user 'keepalived_script' for script execution does not exist - please create.
Thu Dec 13 12:10:26 2018: Truncating auth_pass to 8 characters
Thu Dec 13 12:10:26 2018: SECURITY VIOLATION - scripts are being executed but script_security not enabled.
Thu Dec 13 12:10:26 2018: ------< Global definitions >------
Thu Dec 13 12:10:26 2018: Router ID = testcloudcontrol01.example.com
Thu Dec 13 12:10:26 2018: Default interface = eth0
Thu Dec 13 12:10:26 2018: LVS flush = false
Thu Dec 13 12:10:26 2018: VRRP IPv4 mcast group = 224.0.0.18
Thu Dec 13 12:10:26 2018: VRRP IPv6 mcast group = ff02::12
Thu Dec 13 12:10:26 2018: Gratuitous ARP delay = 5
Thu Dec 13 12:10:26 2018: Gratuitous ARP repeat = 5
Thu Dec 13 12:10:26 2018: Gratuitous ARP refresh timer = 0
Thu Dec 13 12:10:26 2018: Gratuitous ARP refresh repeat = 1
Thu Dec 13 12:10:26 2018: Gratuitous ARP lower priority delay = 5
Thu Dec 13 12:10:26 2018: Gratuitous ARP lower priority repeat = 5
Thu Dec 13 12:10:26 2018: Send advert after receive lower priority advert = true
Thu Dec 13 12:10:26 2018: Send advert after receive higher priority advert = false
Thu Dec 13 12:10:26 2018: Gratuitous ARP interval = 0
Thu Dec 13 12:10:26 2018: Gratuitous NA interval = 0
Thu Dec 13 12:10:26 2018: VRRP default protocol version = 2
Thu Dec 13 12:10:26 2018: Iptables input chain = INPUT
Thu Dec 13 12:10:26 2018: Using ipsets = false
Thu Dec 13 12:10:26 2018: ipset IPv4 address set = keepalived
Thu Dec 13 12:10:26 2018: ipset IPv6 address set = keepalived6
Thu Dec 13 12:10:26 2018: ipset IPv6 address,iface set = keepalived_if6
Thu Dec 13 12:10:26 2018: VRRP check unicast_src = false
Thu Dec 13 12:10:26 2018: VRRP skip check advert addresses = false
Thu Dec 13 12:10:26 2018: VRRP strict mode = false
Thu Dec 13 12:10:26 2018: VRRP process priority = 0
Thu Dec 13 12:10:26 2018: VRRP don't swap = false
Thu Dec 13 12:10:26 2018: Checker process priority = 0
Thu Dec 13 12:10:26 2018: Checker don't swap = false
Thu Dec 13 12:10:26 2018: SNMP keepalived disabled
Thu Dec 13 12:10:26 2018: SNMP checker disabled
Thu Dec 13 12:10:26 2018: SNMP RFCv2 disabled
Thu Dec 13 12:10:26 2018: SNMP RFCv3 disabled
Thu Dec 13 12:10:26 2018: SNMP traps disabled
Thu Dec 13 12:10:26 2018: SNMP socket = default (unix:/var/agentx/master)
Thu Dec 13 12:10:26 2018: Network namespace = (default)
Thu Dec 13 12:10:26 2018: DBus disabled
Thu Dec 13 12:10:26 2018: DBus service name = (null)
Thu Dec 13 12:10:26 2018: Script security disabled
Thu Dec 13 12:10:26 2018: Default script uid:gid 0:0
Thu Dec 13 12:10:26 2018: ------< VRRP Topology >------
Thu Dec 13 12:10:26 2018: VRRP Instance = kolla_internal_vip_148
Thu Dec 13 12:10:26 2018: Using VRRPv2
Thu Dec 13 12:10:26 2018: Want State = BACKUP
Thu Dec 13 12:10:26 2018: Running on device = ens160
Thu Dec 13 12:10:26 2018: Skip checking advert IP addresses = no
Thu Dec 13 12:10:26 2018: Enforcing strict VRRP compliance = no
Thu Dec 13 12:10:26 2018: Using src_ip = 10.52.41.100
Thu Dec 13 12:10:26 2018: Gratuitous ARP delay = 5
Thu Dec 13 12:10:26 2018: Gratuitous ARP repeat = 5
Thu Dec 13 12:10:26 2018: Gratuitous ARP refresh timer = 0
Thu Dec 13 12:10:26 2018: Gratuitous ARP refresh repeat = 1
Thu Dec 13 12:10:26 2018: Gratuitous ARP lower priority delay = 5
Thu Dec 13 12:10:26 2018: Gratuitous ARP lower priority repeat = 5
Thu Dec 13 12:10:26 2018: Send advert after receive lower priority advert = true
Thu Dec 13 12:10:26 2018: Send advert after receive higher priority advert = false
Thu Dec 13 12:10:26 2018: Virtual Router ID = 148
Thu Dec 13 12:10:26 2018: Priority = 1
Thu Dec 13 12:10:26 2018: Advert interval = 1 sec
Thu Dec 13 12:10:26 2018: Accept enabled
Thu Dec 13 12:10:26 2018: Preempt disabled
Thu Dec 13 12:10:26 2018: Promote_secondaries disabled
Thu Dec 13 12:10:26 2018: Authentication type = SIMPLE_PASSWORD
Thu Dec 13 12:10:26 2018: Password = 0RXbQYFF
Thu Dec 13 12:10:26 2018: Tracked scripts = 1
Thu Dec 13 12:10:26 2018: check_alive weight 0
Thu Dec 13 12:10:26 2018: Virtual IP = 1
Thu Dec 13 12:10:26 2018: 10.52.41.98/32 dev ens160 scope global
Thu Dec 13 12:10:26 2018: ------< VRRP Scripts >------
Thu Dec 13 12:10:26 2018: VRRP Script = check_alive
Thu Dec 13 12:10:26 2018: Command = /check_alive.sh
Thu Dec 13 12:10:26 2018: Interval = 2 sec
Thu Dec 13 12:10:26 2018: Timeout = 0 sec
Thu Dec 13 12:10:26 2018: Weight = 0
Thu Dec 13 12:10:26 2018: Rise = 10
Thu Dec 13 12:10:26 2018: Fall = 2
Thu Dec 13 12:10:26 2018: Insecure = no
Thu Dec 13 12:10:26 2018: Status = INIT
Thu Dec 13 12:10:26 2018: Script uid:gid = 0:0
Thu Dec 13 12:10:26 2018: ------< NIC >------
Thu Dec 13 12:10:26 2018: Name = lo
Thu Dec 13 12:10:26 2018: index = 1
Thu Dec 13 12:10:26 2018: IPv4 address = 127.0.0.1
Thu Dec 13 12:10:26 2018: IPv6 address = ::
Thu Dec 13 12:10:26 2018: is UP
Thu Dec 13 12:10:26 2018: is RUNNING
Thu Dec 13 12:10:26 2018: MTU = 65536
Thu Dec 13 12:10:26 2018: HW Type = LOOPBACK
Thu Dec 13 12:10:26 2018: ------< NIC >------
Thu Dec 13 12:10:26 2018: Name = ens160
Thu Dec 13 12:10:26 2018: index = 2
Thu Dec 13 12:10:26 2018: IPv4 address = 10.52.41.100
Thu Dec 13 12:10:26 2018: IPv6 address = fe80::250:56ff:fea1:6a2c
Thu Dec 13 12:10:26 2018: MAC = 00:50:56:a1:6a:2c
Thu Dec 13 12:10:26 2018: is UP
Thu Dec 13 12:10:26 2018: is RUNNING
Thu Dec 13 12:10:26 2018: MTU = 1500
Thu Dec 13 12:10:26 2018: HW Type = ETHERNET
Thu Dec 13 12:10:26 2018: ------< NIC >------
Thu Dec 13 12:10:26 2018: Name = ens161
Thu Dec 13 12:10:26 2018: index = 3
Thu Dec 13 12:10:26 2018: IPv4 address = 10.52.42.100
Thu Dec 13 12:10:26 2018: IPv6 address = fe80::250:56ff:fea1:7d07
Thu Dec 13 12:10:26 2018: MAC = 00:50:56:a1:7d:07
Thu Dec 13 12:10:26 2018: is UP
Thu Dec 13 12:10:26 2018: is RUNNING
Thu Dec 13 12:10:26 2018: MTU = 1500
Thu Dec 13 12:10:26 2018: HW Type = ETHERNET
Thu Dec 13 12:10:26 2018: ------< NIC >------
Thu Dec 13 12:10:26 2018: Name = ens224
Thu Dec 13 12:10:26 2018: index = 4
Thu Dec 13 12:10:26 2018: IPv4 address = 10.52.40.100
Thu Dec 13 12:10:26 2018: IPv6 address = fe80::250:56ff:fea1:236e
Thu Dec 13 12:10:26 2018: MAC = 00:50:56:a1:23:6e
Thu Dec 13 12:10:26 2018: is UP
Thu Dec 13 12:10:26 2018: is RUNNING
Thu Dec 13 12:10:26 2018: MTU = 1500
Thu Dec 13 12:10:26 2018: HW Type = ETHERNET
Thu Dec 13 12:10:26 2018: ------< NIC >------
Thu Dec 13 12:10:26 2018: Name = ens256
Thu Dec 13 12:10:26 2018: index = 5
Thu Dec 13 12:10:26 2018: IPv4 address = 10.52.44.100
Thu Dec 13 12:10:26 2018: IPv6 address = fe80::250:56ff:fea1:2012
Thu Dec 13 12:10:26 2018: MAC = 00:50:56:a1:20:12
Thu Dec 13 12:10:26 2018: is UP
Thu Dec 13 12:10:26 2018: is RUNNING
Thu Dec 13 12:10:26 2018: MTU = 1500
Thu Dec 13 12:10:26 2018: HW Type = ETHERNET
Thu Dec 13 12:10:26 2018: ------< NIC >------
Thu Dec 13 12:10:26 2018: Name = docker0
Thu Dec 13 12:10:26 2018: index = 6
Thu Dec 13 12:10:26 2018: IPv4 address = 172.17.0.1
Thu Dec 13 12:10:26 2018: IPv6 address = ::
Thu Dec 13 12:10:26 2018: MAC = 02:42:b0:8a:93:e7
Thu Dec 13 12:10:26 2018: is UP
Thu Dec 13 12:10:26 2018: MTU = 1500
Thu Dec 13 12:10:26 2018: HW Type = ETHERNET
Thu Dec 13 12:10:26 2018: Using LinkWatch kernel netlink reflector...
Thu Dec 13 12:10:26 2018: VRRP_Instance(kolla_internal_vip_148) Entering BACKUP STATE
Thu Dec 13 12:10:26 2018: /check_alive.sh exited with status 1
Thu Dec 13 12:10:28 2018: /check_alive.sh exited with status 1
Thu Dec 13 12:10:30 2018: VRRP_Instance(kolla_internal_vip_148) Now in FAULT state
Thu Dec 13 12:10:30 2018: /check_alive.sh exited with status 1
Thu Dec 13 12:10:32 2018: /check_alive.sh exited with status 1
[message repeats until I stop the container]
That is it: both keepalived instances stay in the FAULT state, and the IP address is not brought up on either VM.
I went through this question and its answer, even though I do not have those error messages in my log files:
- keepalived_virtual_router_id has been changed and is unique
- I ran kolla-genpwd again. I confirmed that keepalived_password is set in /etc/kolla/passwords.yml
- kolla_internal_vip_address is reachable from network_interface. The primary IP of that interface is in the same network. I can set the additional IP address manually and it works.
- kolla-ansible prechecks passes
- SELinux is not active on Ubuntu
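The "same network" item in the list above can be double-checked with Python's standard ipaddress module. This is only a sketch; the helper name vip_on_interface_network is made up for illustration, and the addresses in the usage note come from globals.yml and the ip addr output.

```python
import ipaddress


def vip_on_interface_network(vip, interface_cidr):
    """Return True if vip lies in the interface's subnet and is not the
    interface's own primary address. Hypothetical helper for illustration."""
    iface = ipaddress.ip_interface(interface_cidr)
    address = ipaddress.ip_address(vip)
    return address in iface.network and address != iface.ip
```

With the values from this setup, vip_on_interface_network("10.52.41.98", "10.52.41.100/24") returns True, so the VIP itself is a valid choice for ens160.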
On the hypervisor side, I tried enabling Promiscuous mode on the port group of that interface. That made no difference.
After hitting the same problem on bare metal, I dug deeper. It turns out that the culprit was not keepalived but the haproxy container.
The haproxy container keeps restarting because haproxy is started with the command-line parameter -W, which does not exist in the haproxy version shipped in the container. The keepalived container, on the other hand, is configured with a check script that keeps exiting with an error (the "/check_alive.sh exited with status 1" lines in the log above).
This check script is very simple: it checks the status of haproxy through a socket file.
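The original script is not reproduced here. As a rough sketch of what a socket-based liveness check can look like, written in Python rather than shell: it connects to haproxy's admin socket, sends the show info command, and treats any reply as healthy. The function name and the socket path in the usage note are assumptions for illustration, not taken from the Kolla image.

```python
import socket


def haproxy_is_alive(socket_path, timeout=2.0):
    """Return True if something answers 'show info' on the UNIX socket.

    Hypothetical helper; any connection error or timeout counts as 'not alive'.
    """
    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            sock.connect(socket_path)
            # 'show info' is a command understood by haproxy's stats socket.
            sock.sendall(b"show info\n")
            return len(sock.recv(4096)) > 0
    except OSError:
        # Missing socket file, refused connection, or timeout.
        return False
```

A check script built on this would exit with status 0 when haproxy_is_alive("/var/lib/kolla/haproxy/haproxy.sock") returns True and with status 1 otherwise (the path is an assumption). A non-zero exit is what keepalived logs as "exited with status 1".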
So, as long as haproxy is called with the invalid parameter and does not start, keepalived stays in the FAULT state and the floating IP never comes up.
Using grep -R "haproxy -W" * I found that the haproxy command line is defined in the file /usr/local/share/kolla-ansible/ansible/roles/haproxy/templates/haproxy.json.j2. I removed the -W parameter from the command line, after which haproxy started correctly and keepalived switched to the MASTER state with the floating IP configured.
There is already an open bug report on Launchpad about this problem. The comments there also contain a slightly different solution (changing the same file).
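The edit described above amounts to dropping one token from a command string. A minimal Python sketch of that operation follows; the helper name drop_flag and the sample command line in the usage note are illustrative assumptions, not the actual contents of haproxy.json.j2.

```python
import shlex


def drop_flag(command, flag="-W"):
    """Return command with every standalone occurrence of flag removed.

    Hypothetical helper; tokens are split and re-quoted with shell rules.
    """
    kept = [token for token in shlex.split(command) if token != flag]
    return " ".join(shlex.quote(token) for token in kept)
```

For a made-up command such as "/usr/sbin/haproxy -W -db -f /etc/haproxy/haproxy.cfg", drop_flag returns "/usr/sbin/haproxy -db -f /etc/haproxy/haproxy.cfg". Only tokens exactly equal to the flag are removed.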
This modification will, of course, be reverted when the file is updated. If you have the same problem, log in to Launchpad and mark the bug (reported on 2018-06-08) as affecting you, so that it gets priority and is fixed.