I'm trying to set up a MariaDB Galera Cluster with 2 nodes:
Node 1 (172.23.0.2):
wsrep_on=ON
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
binlog_format=ROW
wsrep_cluster_address='gcomm://'
wsrep_sst_receive_address = '172.23.0.2:4444'
wsrep_cluster_name='cluster'
wsrep_node_name='n_01'
wsrep_sst_method=rsync
wsrep_sst_auth=cluster_user:cluster_pass
Node 2 (172.23.0.3):
wsrep_on=ON
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
binlog_format=ROW
wsrep_cluster_address='gcomm://172.23.0.2'
wsrep_sst_receive_address = '172.23.0.3:4444'
wsrep_cluster_name='cluster'
wsrep_node_name='n_02'
wsrep_sst_method=rsync
wsrep_sst_auth=cluster_user:cluster_pass
The first node starts without errors:
Variable_name Value
-------------------- ---------
wsrep_cluster_size 1
wsrep_cluster_status Primary
wsrep_connected ON
wsrep_ready ON
But when I start the second node, I get this:
mariadb.service - MariaDB database server
Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/mariadb.service.d
└─migrated-from-my.cnf-settings.conf
Active: failed (Result: exit-code) since jeu. 2017-08-24 19:11:32 CEST; 14s ago
Process: 14656 ExecStartPost=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
Process: 20222 ExecStart=/usr/sbin/mysqld $MYSQLD_OPTS $_WSREP_NEW_CLUSTER $_WSREP_START_POSITION (code=exited, status=1/FAILURE)
Process: 18861 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= || VAR=`/usr/bin/galera_recovery`; [ $? -eq 0 ] && systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1 (code=exited, status=0/SUCCESS)
Process: 18858 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
Main PID: 20222 (code=exited, status=1/FAILURE)
Status: "MariaDB server is down"
CGroup: /system.slice/mariadb.service
├─20357 /bin/bash -ue /usr//bin/wsrep_sst_rsync --role joiner --address 172.23.0.3 --datadir /home/mysql/ --parent 20309 --binlog /var/log/mariadb/binlog/mysql_binlog
├─20391 rsync --daemon --no-detach --port 4444 --config /home/mysql//rsync_sst.conf
├─22006 /bin/bash -ue /usr//bin/wsrep_sst_rsync --role joiner --address 172.23.0.3 --datadir /home/mysql/ --parent 21997 --binlog /var/log/mariadb/binlog/mysql_binlog
├─22638 sleep 0.2
└─22648 sleep 0.2
août 24 19:11:23 ovh38 mysqld[20222]: 2017-08-24 19:11:23 127079663351552 [ERROR] WSREP: Failed to read 'ready <addr>' from: wsrep_sst_rsync --role 'joiner' --address '172.23.0.3:4444' --datadir '/home/mysql/' --pa...log/mysql_binlog'
août 24 19:11:23 ovh38 mysqld[20222]: Read: '(null)'
août 24 19:11:23 ovh38 mysqld[20222]: 2017-08-24 19:11:23 127079663351552 [ERROR] WSREP: Process completed with error: wsrep_sst_rsync --role 'joiner' --address '172.23.0.3:4444' --datadir '/home/mysql/' --parent '...eady in progress)
août 24 19:11:23 ovh38 mysqld[20222]: 2017-08-24 19:11:23 127080155712256 [ERROR] WSREP: Failed to prepare for 'rsync' SST. Unrecoverable.
août 24 19:11:23 ovh38 mysqld[20222]: 2017-08-24 19:11:23 127080155712256 [ERROR] Aborting
août 24 19:11:32 ovh38 mysqld[20222]: Error in my_thread_global_end(): 1 threads didn't exit
août 24 19:11:32 ovh38 systemd[1]: mariadb.service: main process exited, code=exited, status=1/FAILURE
août 24 19:11:32 ovh38 systemd[1]: Failed to start MariaDB database server.
août 24 19:11:32 ovh38 systemd[1]: Unit mariadb.service entered failed state.
août 24 19:11:32 ovh38 systemd[1]: mariadb.service failed.
Hint: Some lines were ellipsized, use -l to show in full.
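Update:
This error occurred because an rsync process was already running and listening on the SST port (4444), so the solution was to kill it: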
Proto Recv-Q Send-Q Adresse locale Adresse distante Etat PID/Program name
tcp 0 0 0.0.0.0:21 0.0.0.0:* LISTEN 1087/proftpd: (acce
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 708/sshd
tcp 0 0 0.0.0.0:4444 0.0.0.0:* LISTEN 15510/rsync
tcp6 0 0 :::80 :::* LISTEN 19059/httpd
tcp6 0 0 :::22 :::* LISTEN 708/sshd
tcp6 0 0 :::443 :::* LISTEN 19059/httpd
tcp6 0 0 :::4444 :::* LISTEN 15510/rsync
tcp6 0 0 :::545 :::* LISTEN 19059/httpd
# kill -9 15510
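The check above can be sketched as a small shell helper that finds whatever is holding the SST port before the joiner is started again. `listener_pids` is a hypothetical name, not part of MariaDB or Galera; the sketch assumes `netstat -tlnp`-style columns as in the listing above (local address in column 4, PID/program in column 7) and that it is run as root so that PIDs are visible.

```shell
#!/bin/sh
# Hypothetical helper: print the unique PIDs of processes listening on a
# given TCP port, reading `netstat -tlnp`-style output from stdin.
listener_pids() {
    port="$1"
    awk -v port="$port" '
        # Only TCP rows (tcp / tcp6); header lines are skipped.
        $1 ~ /^tcp/ {
            # Column 4 is the local address, e.g. 0.0.0.0:4444 or :::4444.
            # The port is whatever follows the last colon.
            n = split($4, parts, ":")
            if (parts[n] == port) {
                # Column 7 is PID/Program name, e.g. 15510/rsync.
                split($7, proc, "/")
                if (proc[1] ~ /^[0-9]+$/) print proc[1]
            }
        }
    ' | sort -u
}

# Possible usage on the joiner node (assumes net-tools is installed):
#   netstat -tlnp | listener_pids 4444
# Then stop the stale process, trying a plain `kill <pid>` (SIGTERM)
# first and falling back to `kill -9 <pid>` only if it does not exit.
```

In the listing above this would report PID 15510, the rsync daemon left behind by an earlier failed SST attempt.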
I then tried starting the second node again: systemctl start mariadb
On the first node: SHOW STATUS LIKE 'wsrep_cluster%'
Variable_name Value
------------------------ --------------------------------------
wsrep_cluster_conf_id 2
wsrep_cluster_size 2
wsrep_cluster_state_uuid 00edfa0e-88d5-11e7-8f43-5ea901e83b3a
wsrep_cluster_status Primary
However, another error appeared:
# systemctl status mariadb.service
● mariadb.service - MariaDB database server
Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/mariadb.service.d
└─migrated-from-my.cnf-settings.conf
Active: failed (Result: timeout) since ven. 2017-08-25 10:42:09 CEST; 11min ago
Process: 14656 ExecStartPost=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
Process: 12697 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= || VAR=`/usr/bin/galera_recovery`; [ $? -eq 0 ] && systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1 (code=exited, status=0/SUCCESS)
Process: 12685 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
Main PID: 15310
CGroup: /system.slice/mariadb.service
├─15310 /usr/sbin/mysqld --wsrep_start_position=00000000-0000-0000-0000-000000000000:-1
├─15468 /bin/bash -ue /usr//bin/wsrep_sst_rsync --role joiner --address 172.23.0.3 --datadir /home/mysql/ --parent 15310 --binlog /var/log/mariadb/binlog/mysql_binlog
├─15510 rsync --daemon --no-detach --port 4444 --config /home/mysql//rsync_sst.conf
├─15980 /bin/bash -ue /usr//bin/wsrep_sst_rsync --role joiner --address 172.23.0.3 --datadir /home/mysql/ --parent 15901 --binlog /var/log/mariadb/binlog/mysql_binlog
├─18646 sleep 0.2
├─18670 sleep 0.2
├─18675 sleep 0.2
├─18676 sleep 0.2
├─18686 sleep 0.2
├─20357 /bin/bash -ue /usr//bin/wsrep_sst_rsync --role joiner --address 172.23.0.3 --datadir /home/mysql/ --parent 20309 --binlog /var/log/mariadb/binlog/mysql_binlog
├─22006 /bin/bash -ue /usr//bin/wsrep_sst_rsync --role joiner --address 172.23.0.3 --datadir /home/mysql/ --parent 21997 --binlog /var/log/mariadb/binlog/mysql_binlog
└─23982 /bin/bash -ue /usr//bin/wsrep_sst_rsync --role joiner --address 172.23.0.3 --datadir /home/mysql/ --parent 23794 --binlog /var/log/mariadb/binlog/mysql_binlog
août 25 10:39:10 ovh38 mysqld[15310]: 2017-08-25 10:39:10 115191544425216 [Note] WSREP: New cluster view: global state: 00edfa0e-88d5-11e7-8f43-5ea901e83b3a:0, view# 2: Primary, number of nodes: 2, my index: 0, protocol version 3
août 25 10:39:10 ovh38 mysqld[15310]: 2017-08-25 10:39:10 115191544425216 [Warning] WSREP: Gap in state sequence. Need state transfer.
août 25 10:39:10 ovh38 mysqld[15310]: 2017-08-25 10:39:10 115190975993600 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '172.23.0.3' --datadir '/home/mysql/' --parent '15310' --binlog '/var/log/...g/mysql_binlog' '
août 25 10:39:10 ovh38 rsyncd[15510]: rsyncd version 3.0.9 starting, listening on port 4444
août 25 10:39:13 ovh38 mysqld[15310]: 2017-08-25 10:39:13 115191014389504 [Note] WSREP: (42fd49d1, 'tcp://0.0.0.0:4567') turning message relay requesting off
août 25 10:40:39 ovh38 systemd[1]: mariadb.service start operation timed out. Terminating.
août 25 10:42:09 ovh38 systemd[1]: mariadb.service stop-final-sigterm timed out. Skipping SIGKILL. Entering failed mode.
août 25 10:42:09 ovh38 systemd[1]: Failed to start MariaDB database server.
août 25 10:42:09 ovh38 systemd[1]: Unit mariadb.service entered failed state.
août 25 10:42:09 ovh38 systemd[1]: mariadb.service failed.
Hint: Some lines were ellipsized, use -l to show in full.
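Any ideas on how to solve this?
You must specify the IP addresses of all nodes in the configuration file of every node.
See the Galera documentation.
Also, to avoid split-brain, you should add a third node or a Galera Arbitrator (garbd).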
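A minimal sketch of what listing all nodes could look like for the two nodes in the question, assuming the addresses shown there and leaving every other setting unchanged. The same line would go into both configuration files. The very first node of a new cluster then has to be bootstrapped explicitly once (on systemd-based MariaDB 10.1 and later this is typically done with `galera_new_cluster`) instead of relying on an empty `gcomm://` address in its config.

```ini
# On node 1 (172.23.0.2) and on node 2 (172.23.0.3):
# list every cluster member, not only the node being joined.
wsrep_cluster_address='gcomm://172.23.0.2,172.23.0.3'
```

If a third node or an arbitrator is added later, its address would be appended to the same comma-separated list on every node.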