如何将数据库从 Postgres 8.4 复制到 Postgres 9.1？

Question

Jamess

Asked: 2015-08-07 21:33:40 +0800 CST2015-08-07 21:33:40 +0800 CST 2015-08-07 21:33:40 +0800 CST

Postgresql - 使用 recovery.conf 和触发器进行时间点恢复。如何最小化等待 .history 文件的时间

772

在http://www.postgresql.org/docs/9.1/static/continuous-archiving.html之后，我将 WAL 档案复制到另一台机器上并定期应用它们。最近由于我的主服务器上的 pg 服务器崩溃，我尝试通过推送触发器（一个空文件，期待智能故障转移）来恢复数据库，日志如下所示。

prodrestore_error.log

WAL file not present yet. Checking for trigger file...
trigger file found: smart failover
Trigger file:         /app/recovery-prod/trigger/pgsql.trigger.5432
Waiting for WAL file: 0000000100000001000000CC
WAL file path:        /backup/prod/db_backup/archive/0000000100000001000000CC
Restoring to:         pg_xlog/RECOVERYXLOG
Sleep interval:       60 seconds
Max wait interval:    0 forever
Command for restore:  cp "/backup/prod/db_backup/archive/0000000100000001000000CC" "pg_xlog/RECOVERYXLOG"
Keep archive history: 000000000000000000000000 and later
trigger file found: smart failover
running restore:      OK

Trigger file:         /app/recovery-prod/trigger/pgsql.trigger.5432
Waiting for WAL file: 00000002.history
WAL file path:        /backup/prod/db_backup/archive/00000002.history
Restoring to:         pg_xlog/RECOVERYHISTORY
Sleep interval:       60 seconds
Max wait interval:    0 forever
Command for restore:  cp "/backup/prod/db_backup/archive/00000002.history" "pg_xlog/RECOVERYHISTORY"
Keep archive history: 000000000000000000000000 and later
running restore:      cp: cannot stat `/backup/prod/db_backup/archive/00000002.history': No such file or directory
cp: cannot stat `/backup/prod/db_backup/archive/00000002.history': No such file or directory
cp: cannot stat `/backup/prod/db_backup/archive/00000002.history': No such file or directory
cp: cannot stat `/backup/prod/db_backup/archive/00000002.history': No such file or directory
not restored
history file not found
Trigger file:         /app/recovery-prod/trigger/pgsql.trigger.5432
Waiting for WAL file: 00000001.history
WAL file path:        /backup/prod/db_backup/archive/00000001.history
Restoring to:         pg_xlog/RECOVERYHISTORY
Sleep interval:       60 seconds
Max wait interval:    0 forever
Command for restore:  cp "/backup/prod/db_backup/archive/00000001.history" "pg_xlog/RECOVERYHISTORY"
Keep archive history: 000000000000000000000000 and later
running restore:      cp: cannot stat `/backup/prod/db_backup/archive/00000001.history': No such file or directory
cp: cannot stat `/backup/prod/db_backup/archive/00000001.history': No such file or directory
cp: cannot stat `/backup/prod/db_backup/archive/00000001.history': No such file or directory
cp: cannot stat `/backup/prod/db_backup/archive/00000001.history': No such file or directory
not restored
history file not found

postgresql-9.1-main.log

2015-08-03 08:08:44 IST::@:[5542]:LOG:  restored log file "0000000100000001000000CC" from archive
2015-08-03 08:11:44 IST::@:[5542]:LOG:  could not open file "pg_xlog/0000000100000001000000CD" (log file 1, segment 205): No such file or directory
2015-08-03 08:11:44 IST::@:[5542]:LOG:  redo done at 1/CC001E10
2015-08-03 08:11:44 IST::@:[5542]:LOG:  last completed transaction was at log time 2015-08-03 07:59:58.121908+05:30
2015-08-03 08:11:44 IST::@:[5542]:LOG:  restored log file "0000000100000001000000CC" from archive
2015-08-03 08:17:44 IST::@:[5542]:LOG:  selected new timeline ID: 2
2015-08-03 08:23:45 IST::@:[5542]:LOG:  archive recovery complete
2015-08-03 08:23:51 IST::@:[6088]:LOG:  autovacuum launcher started
2015-08-03 08:23:51 IST::@:[5541]:LOG:  database system is ready to accept connections

整个查找两个历史文件，4次，最后启动数据库的过程00000002.history大约00000001.history用了12分钟。然后，通过恢复新的基本备份进行的每次重复迭代都需要更多时间，因为它开始寻找更多 .history 文件。

如何使查找历史文件的速度尽可能快，如果不存在，则快速继续以读/写模式启动数据库？否则，根本不查找历史文件。

1 个回答

Voted

Kassandry · Answer 1 · 2015-08-07T23:52:32+08:00

这不是历史文件的错。

引用您引用的文档链接：

通常，恢复将通过所有可用的 WAL 段进行，从而将数据库恢复到当前时间点（或尽可能接近可用的 WAL 段）。因此，正常恢复将以“找不到文件”消息结束，错误消息的确切文本取决于您选择的 restore_command。对于名为 00000001.history 的文件，您可能还会在恢复开始时看到一条错误消息。这也是正常的，在简单的恢复情况下并不表示有问题；讨论见第24.3.4节

由于看起来您正在使用 pg_standby，因此您有几个选择。

调整-r maxretries和-s sleeptime。

默认情况下-r maxretries是 3，因为看起来你的-s sleeptime设置设置为 60，所以它会在第一次重试时等待 60 秒，然后在第二次重试时等待 120 秒，然后在第三次也是最后一次重试复制命令时等待 180 秒它放弃并将副本服务器带入独立模式。

因此，除了重播所有日志之外，在命令尝试复制 .history 文件结束时，您总共要等待 360 秒或 6 分钟，加上恢复所有 WAL 所需的时间智能模式下的细分市场。
执行快速故障转移而不是智能故障转移。

如果您迫切希望快速进行故障转移，并且可以忍受丢失任何未应用的 WAL 段，您可以执行快速故障转移，而不是触发的智能故障转移，如 pg_standby 文档中所述。

智能故障转移与快速故障转移的优点取决于您希望副本服务器运行的速度以及您对丢失事务的容忍度。

快速故障转移会忽略未应用的 WAL 段，因此事务会丢失。在以独立模式启动副本之前，智能故障转移会重播所有可能的 WAL 段。这可能需要很长时间，具体取决于您需要重播多少 WAL 段。

最后，由于您使用的是 9.1，因此您可以使用流复制和restore_command您的 arecovery.conf来保持服务器比标准日志传送更最新，方法是在生成单个 WAL 记录时对其进行流式传输。

此外，如果副本无法跟上流负载（由于网络问题或任何数量的事情），它可以交替使用 arestore_command从您已经设置的存档中获取 WAL 段。

但是，您不能pg_standby用于 restore_command。

从理论上讲，如果一切顺利，恢复 WAL 记录几乎没有延迟，而不是恢复完整的 WAL 段，您的副本应该更快地以独立模式出现。

可以在此处找到流复制和存档的示例：

http://evol-monkey.blogspot.com/2014/09/offsite-replication-problems-and-how-to.html

监控您的复制延迟也很有用，以确保一切都按照您的要求保持最新状态。示例如下：

http://www.keithf4.com/monitoring_streaming_slave_lag/

希望有帮助。=)

Postgresql - 使用 recovery.conf 和触发器进行时间点恢复。如何最小化等待 .history 文件的时间

prodrestore_error.log

postgresql-9.1-main.log

连接到 PostgreSQL 服务器：致命：主机没有 pg_hba.conf 条目

如何让sqlplus的输出出现在一行中？

选择具有最大日期或最晚日期的日期

如何列出 PostgreSQL 中的所有模式？

列出指定表的所有列

如何在不修改我自己的 tnsnames.ora 的情况下使用 sqlplus 连接到位于另一台主机上的 Oracle 数据库

你如何mysqldump特定的表？

使用 psql 列出数据库权限

如何从 PostgreSQL 中的选择查询中将值插入表中？

如何使用 psql 列出所有数据库和表？

Postgresql - 使用 recovery.conf 和触发器进行时间点恢复。如何最小化等待 .history 文件的时间

prodrestore_error.log

postgresql-9.1-main.log

1 个回答

相关问题