
Question

rdbmsNoob

Asked: 2021-07-03 05:39:55 +0800 CST

PANIC: could not write to file "pg_xlog/xlogtemp": No space left on device


We had an outage in our Postgres environment related to the following error:

Jul  1 00:36:04 test1 postgres[219259]: [770-2] user=,db=,app=client= CONTEXT:  writing block 199237 of relation pg_tblspc/16402/PG_9.6_201608131/7358881/41721132
Jul  1 00:36:05 test1 postgres[219252]: [3-1] user=,db=,app=client= LOG:  checkpointer process (PID 219259) was terminated by signal 6: Aborted
Jul  1 00:36:05 test1 postgres[219252]: [4-1] user=,db=,app=client= LOG:  terminating any other active server processes
Jul  1 00:36:05 test1 postgres[110539]: [5-1] user=postgres,db=product,app=psqlclient=[local] WARNING:  terminating connection because of crash of another server process
Jul  1 00:36:05 test1 postgres[110539]: [5-2] user=postgres,db=product,app=psqlclient=[local] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
Jul  1 00:36:05 test1 postgres[110539]: [5-3] user=postgres,db=product,app=psqlclient=[local] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
Jul  1 00:36:04 test1 postgres[219259]: [770-1] user=,db=,app=client= PANIC:  could not write to file "pg_xlog/xlogtemp.219259": No space left on device
Jul  1 00:36:04 test1 postgres[219259]: [770-2] user=,db=,app=client= CONTEXT:  writing block 199237 of relation pg_tblspc/16402/PG_9.6_201608131/7358881/41721132
Jul  1 00:36:05 test1 postgres[219252]: [3-1] user=,db=,app=client= LOG:  checkpointer process (PID 219259) was terminated by signal 6: Aborted
Jul  1 00:36:05 test1 postgres[219252]: [4-1] user=,db=,app=client= LOG:  terminating any other active server processes

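Postgres crashed and then recovered. Looking at the volume that holds pg_xlog, we can see that the drive filled up. It is an 80GB drive that normally sits at about 10% usage, and this happens at about the same time every night. I have tried to find the cause, but nothing in the Postgres log files points to a culprit.

We have Datadog monitoring the database server, and it shows the error when it is raised, but it gives no indication of what the cause might be.

Any help is appreciated.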

1 ms later we can see:

Jul  1 00:36:04 test1 postgres[219259]: [770-1] user=,db=,app=client= PANIC:  could not write to file "pg_xlog/xlogtemp.219259": No space left on device
Jul  1 00:36:04 test1 postgres[219259]: [770-2] user=,db=,app=client= CONTEXT:  writing block 199237 of relation pg_tblspc/16402/PG_9.6_201608131/7358881/41721132
Jul  1 00:36:05 test1 postgres[219252]: [3-1] user=,db=,app=client= LOG:  checkpointer process (PID 219259) was terminated by signal 6: Aborted
Jul  1 00:36:05 test1 postgres[219252]: [4-1] user=,db=,app=client= LOG:  terminating any other active server processes
Jul  1 00:36:05 test1 postgres[110539]: [5-1] user=postgres,db=product,app=psqlclient=[local] WARNING:  terminating connection because of crash of another server process
Jul  1 00:36:05 test1 postgres[110539]: [5-2] user=postgres,db=product,app=psqlclient=[local] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
Jul  1 00:36:05 test1 postgres[110539]: [5-3] user=postgres,db=product,app=psqlclient=[local] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
Jul  1 00:36:05 test1 postgres[110539]: [5-4] user=postgres,db=product,app=psqlclient=[local] CONTEXT:  SQL statement "INSERT INTO AGGREGATES.agg_item_part_count
Jul  1 00:36:05 test1 postgres[110539]: [5-5] #011#011#011SELECT b.item_id,
Jul  1 00:36:05 test1 postgres[110539]: [5-6] #011#011              count(bp.item_id) as item_parts
Jul  1 00:36:05 test1 postgres[110539]: [5-7] #011        #011FROM item.item b
Jul  1 00:36:05 test1 postgres[110539]: [5-8] #011#011LEFT JOIN item.item_part bp USING (item_id)
Jul  1 00:36:05 test1 postgres[110539]: [5-9] #011        #011WHERE b.item_id >= starting_item
Jul  1 00:36:05 test1 postgres[110539]: [5-10] #011#011        GROUP by b.item_id
Jul  1 00:36:05 test1 postgres[110539]: [5-11] #011#011ON CONFLICT (item_id) DO
Jul  1 00:36:05 test1 postgres[110539]: [5-12] #011#011#011UPDATE SET
Jul  1 00:36:05 test1 postgres[110539]: [5-13] #011#011#011item_parts = EXCLUDED.item_parts"
Jul  1 00:36:05 test1 postgres[110539]: [5-14] #011PL/pgSQL function etl.update_bpart_agg() line 39 at SQL statement

1 Answer


jjanes · Answer 1 · 2021-07-04T06:54:33+08:00

Best Answer


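You only have 1-second resolution in your logs (you should change that %t to %m), so these messages are from the next second. But we don't know how close you were to the second boundary, and these log messages are what you would expect.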
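As a rough sketch of the timestamp change, the prefix can be set from SQL and picked up with a reload. The prefix string below is an assumption modeled on the `user=,db=,app=client=` fields visible in the posted log lines, not the asker's actual setting; `%m` is the timestamp escape with milliseconds. If the server logs through syslog, the leading `Jul  1 00:36:04` stamp comes from syslog itself, so the `%m` value would appear inside the message text.

```sql
-- Inspect the current prefix before changing anything.
SHOW log_line_prefix;

-- Hypothetical prefix: same fields as in the posted logs, with a
-- millisecond timestamp (%m) added at the front.
ALTER SYSTEM SET log_line_prefix = '%m user=%u,db=%d,app=%a,client=%h ';

-- log_line_prefix only needs a configuration reload, not a restart.
SELECT pg_reload_conf();
```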

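It looks like the INSERT...SELECT statement is probably the culprit filling the WAL directory. If the SELECT returns a lot of rows, it makes sense that the INSERT would generate a lot of WAL quickly. We don't know why the directory filled up completely: maybe you are archiving and the archive command can't keep up, or there is a replication slot and the replica can't keep up, or maybe you have neither of those and it is just the checkpoints that can't keep up.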
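One way to narrow down which of those three cases applies is to query the server while the nightly job is running. This is a sketch for PostgreSQL 9.6, matching the `PG_9.6` path in the logs; from version 10 on, `pg_xlog` and the `xlog` functions were renamed to `pg_wal` and `pg_current_wal_lsn()` / `pg_wal_lsn_diff()`. The queries assume a superuser connection on the primary.

```sql
-- Settings that control how much WAL is kept around.
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('archive_mode', 'archive_command', 'wal_keep_segments',
               'max_wal_size', 'checkpoint_timeout');

-- Archiving: a growing failed_count, or a last_archived_time that lags
-- far behind, suggests archive_command cannot keep up.
SELECT archived_count, last_archived_wal, last_archived_time,
       failed_count, last_failed_wal, last_failed_time
FROM pg_stat_archiver;

-- Replication slots: how much WAL each slot forces the server to retain.
SELECT slot_name, slot_type, active, restart_lsn,
       pg_size_pretty(
           pg_xlog_location_diff(pg_current_xlog_location(), restart_lsn)
       ) AS retained_wal
FROM pg_replication_slots;

-- Number of WAL segment files currently in pg_xlog (16 MB each by default).
SELECT count(*) AS wal_segments
FROM pg_ls_dir('pg_xlog') AS f(name)
WHERE name ~ '^[0-9A-F]{24}$';
```

If archiving is off and there are no slots, the remaining suspect is checkpointing, and comparing `wal_segments` over time against `max_wal_size` should show whether WAL is being recycled at all during the job.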

If your temporary files are written to the same partition as the WAL files, they could also be contributing to filling up the partition.
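To check whether temporary files could share the partition with WAL, the following sketch lists where temporary files go and how much has been written. It assumes a superuser connection; with an empty `temp_tablespaces`, temporary files are created in the database's default tablespace, which is under the data directory unless the database was created in another tablespace.

```sql
-- Where temporary files are created, and where the data directory lives.
SHOW temp_tablespaces;
SHOW data_directory;

-- Filesystem locations of user-defined tablespaces
-- (an empty location means the tablespace is inside the data directory).
SELECT spcname, pg_tablespace_location(oid) AS location
FROM pg_tablespace;

-- Cumulative temporary file usage per database since the last stats reset.
SELECT datname, temp_files, pg_size_pretty(temp_bytes) AS temp_written
FROM pg_stat_database
ORDER BY temp_bytes DESC;

-- Log every temporary file with its size so the nightly job shows up
-- in the server log (0 means no size threshold).
ALTER SYSTEM SET log_temp_files = 0;
SELECT pg_reload_conf();
```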
