
Question

yaugenka

Asked: 2019-04-21 08:56:18 +0800 CST

Group by array overlap


I have a PostgreSQL table with ids and clusters, like this:

CREATE TABLE w (id bigint, clst int);
INSERT INTO w (id,clst)
VALUES 
  (1,0),
  (1,4),
  (2,1),
  (2,2),
  (2,3),
  (3,2),
  (4,2),
  (5,4),
  (6,5);

If I aggregate the clusters grouped by id, the resulting cluster arrays contain overlapping values:

select id, array_agg(clst) clst from w group by id order by id;
 id |  clst
----+---------
  1 | {0,4}
  2 | {1,2,3}
  3 | {2}
  4 | {2}
  5 | {4}
  6 | {5}

That is, cluster 4 covers ids 1 and 5, cluster 2 covers ids 2, 3 and 4, and cluster 5 corresponds to a single id only.
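
As a quick check of the overlaps described above, a sketch along these lines lists every cluster that is shared by more than one id. It uses only the `w` table defined earlier; the column alias `ids` is just an illustrative name.

```sql
-- List each cluster that appears under more than one id.
SELECT clst,
       array_agg(DISTINCT id ORDER BY id) AS ids
FROM w
GROUP BY clst
HAVING count(DISTINCT id) > 1
ORDER BY clst;

--  clst |   ids
-- ------+---------
--     2 | {2,3,4}
--     4 | {1,5}
```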

How can I now aggregate the ids, grouping them by overlapping cluster arrays? The expected result is:

 id      | clst
---------+-------
 {1,5}   | {0,4,4}
 {2,3,4} | {1,2,3,2,2}
 {6}     | {5}

I don't really care about the clst column; I only need the ids aggregated correctly.

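There is no limit on the number of possible overlaps. The number of clusters per id is also unlimited (it can be hundreds or even more). Clusters are not associated with ids in any particular order.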

The table has millions of rows!

Using PostgreSQL 11.

1 Answer


Jack Douglas · Answer 1 · 2019-04-23T02:53:20+08:00

Best Answer


"I don't really care about the clst column; I only need the ids aggregated correctly."

In that case, we can use the uniq and sort functions from the intarray extension:

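-- requires the intarray extension (it provides the | union operator on int[])
with recursive a as (
  -- one row per id with its distinct clusters
  select id, array_agg(distinct clst) clst from w group by id)
, t(id,pid,clst) as (
  select id,id,clst from a
  union all
  -- grow each id's cluster set with any overlapping set it does not already contain
  select t.id,a.id,t.clst|a.clst
  from t join a on a.id<>t.pid and t.clst&&a.clst and not t.clst@>a.clst)
, d as (
  -- keep the largest merged cluster set per id
  select distinct on(id) id, clst from t order by id, cardinality(clst) desc)
select array_agg(id), clst from d group by clst;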

array_agg | clst   
:-------- | :------
{6} | {5}    
{2,3,4} | {1,2,3}
{1,5} | {0,4}

db<>fiddle here

Bear in mind that this is unlikely to perform well on millions of rows.
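
The recursive query above relies on three operators on integer arrays: `|`, `&&` and `@>`. The following is a minimal sketch of the setup it appears to assume. The `|` union operator comes from intarray, so the extension has to be installed in the database before the query will run. The array literals are illustrative values taken from the sample data, not part of the original answer.

```sql
-- Assumption: the intarray contrib module is available on the server.
CREATE EXTENSION IF NOT EXISTS intarray;

-- intarray's | computes the union of two integer arrays.
SELECT '{0,4}'::int[] | '{4}'::int[] AS merged;

-- && tests whether two arrays overlap; @> tests whether the left contains the right.
SELECT '{1,2,3}'::int[] && '{2}'::int[] AS overlaps,
       '{1,2,3}'::int[] @> '{2}'::int[] AS contains;
```

In the recursive step, a row is extended only when the two cluster sets overlap and the accumulated set does not already contain the other one. Each extension therefore strictly enlarges the set, which is why the recursion terminates.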
