操作系统:CentOS 7.3
PostgreSQL:9.6
/etc/hosts
:
192.168.0.1 node1
192.168.0.2 node2
192.168.0.3 # VIP
192.168.0.4 # VIP
按照本指南进行 PostgreSQL 复制:
我用这种方式做第一个初始数据库副本:
psql -c "select pg_start_backup('initial_backup');"
rsync -cva --inplace --exclude=*pg_xlog* ~postgres/9.6/data/ 192.168.0.2:~postgres/9.6/data/
psql -c "select pg_stop_backup();"
然后编辑到正确的 IP pg_hba.conf
。postgresql.conf
然后创建一个recovery.conf
文件插入这个:
standby_mode = 'on'
primary_conninfo = 'host=192.168.0.1 port=5432 user=rep password=password'
restore_command = 'cp /var/lib/pgsql/9.6/pg_archive/%f %p'
recovery_target_timeline = 'latest'
遵循官方指南:
https://wiki.clusterlabs.org/wiki/PgSQL_Replicated_Cluster
设置config.pcs
:
pcs cluster cib pgsql_cfg
pcs -f pgsql_cfg property set no-quorum-policy="ignore"
pcs -f pgsql_cfg property set stonith-enabled="false"
pcs -f pgsql_cfg resource defaults resource-stickiness="INFINITY"
pcs -f pgsql_cfg resource defaults migration-threshold="1"
pcs -f pgsql_cfg resource create vip-master IPaddr2 \
ip="192.168.0.3" \
nic="ens192" \
cidr_netmask="24" \
op start timeout="60s" interval="0s" on-fail="restart" \
op monitor timeout="60s" interval="10s" on-fail="restart" \
op stop timeout="60s" interval="0s" on-fail="block"
pcs -f pgsql_cfg resource create vip-rep IPaddr2 \
ip="192.168.0.4" \
nic="ens192" \
cidr_netmask="24" \
meta migration-threshold="0" \
op start timeout="60s" interval="0s" on-fail="stop" \
op monitor timeout="60s" interval="10s" on-fail="restart" \
op stop timeout="60s" interval="0s" on-fail="ignore"
pcs -f pgsql_cfg resource create pgsql pgsql \
pgctl="/usr/pgsql-9.6/bin/pg_ctl" \
psql="/usr/pgsql-9.6/bin/psql" \
pgdata="/var/lib/pgsql/9.6/data/" \
rep_mode="sync" \
node_list="node1 node2" \
restore_command="cp /var/lib/pgsql/9.6/pg_archive/%f %p" \
primary_conninfo_opt="keepalives_idle=60 keepalives_interval=5 keepalives_count=5" \
master_ip="192.168.0.4" \
restart_on_promote='true' \
op start timeout="60s" interval="0s" on-fail="restart" \
op monitor timeout="60s" interval="4s" on-fail="restart" \
op monitor timeout="60s" interval="3s" on-fail="restart" role="Master" \
op promote timeout="60s" interval="0s" on-fail="restart" \
op demote timeout="60s" interval="0s" on-fail="stop" \
op stop timeout="60s" interval="0s" on-fail="block" \
op notify timeout="60s" interval="0s"
pcs -f pgsql_cfg resource master msPostgresql pgsql \
master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
pcs -f pgsql_cfg resource group add master-group vip-master vip-rep
pcs -f pgsql_cfg constraint colocation add master-group with Master msPostgresql INFINITY
pcs -f pgsql_cfg constraint order promote msPostgresql then start master-group symmetrical=false score=INFINITY
pcs -f pgsql_cfg constraint order demote msPostgresql then stop master-group symmetrical=false score=0
pcs cluster cib-push pgsql_cfg
加载配置
sh config.pcs
再次查看状态:
crm_mon -Afr -1
结果:
Stack: corosync
Current DC: node1 (version 1.1.15-11.el7_3.5-e174ec8) - partition with quorum
Last updated: Thu Jul 27 07:29:35 2017 Last change: Thu Jul 27 07:13:11 2017 by hacluster via crmd on node1
2 nodes and 4 resources configured
Online: [ node1 node2 ]
Full list of resources:
Master/Slave Set: msPostgresql [pgsql]
Masters: [ node2 ]
Slaves: [ node1 ]
Resource Group: master-group
vip-master (ocf::heartbeat:IPaddr2): Started node2
vip-rep (ocf::heartbeat:IPaddr2): Started node2
Node Attributes:
* Node node1:
+ master-pgsql : -INFINITY
+ pgsql-data-status : DISCONNECT
+ pgsql-status : HS:alone
* Node node2:
+ master-pgsql : 1000
+ pgsql-data-status : LATEST
+ pgsql-master-baseline : 0000000011000140
+ pgsql-status : PRI
Migration Summary:
* Node node1:
* Node node2:
为什么node1无法连接。
此时检查recovery.conf
两个节点上的文件:
节点 1:
standby_mode = 'on'
primary_conninfo = 'host=192.168.0.4 port=5432 user=postgres application_name=node1 keepalives_idle=60 keepalives_interval=5 keepalives_count=5'
restore_command = 'cp /var/lib/pgsql/9.6/pg_archive/%f %p'
recovery_target_timeline = 'latest'
节点 2:
(无效的)