Self-referencing condition in an aggregate function

Asked by Rovanion on 2022-01-11 04:45:39 +0800 CST

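I have a real-life use case from fish farming, where the growth of the fish depends on the average size of the fish in the farm at the time of feeding. I have reduced the problem to what I believe is its core, which I cannot express in PostgreSQL: an aggregate function in which a condition depends on the previously calculated value of that same aggregate.

The data operated on is a series of transactions.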

create table transactions (
    id           bigserial primary key,
    feed_g       bigint  
);

insert into transactions
    (feed_g)
values
    (50),
    (50),
    (50),
    (50);

Calculating the running sum of these rows is simple.

select
    id,
    feed_g,
    sum(feed_g) over (order by id) as simple_sum
from transactions;

--  id | feed_g | simple_sum 
-- ----+--------+------------
--   1 |     50 |         50
--   2 |     50 |        100
--   3 |     50 |        150
--   4 |     50 |        200

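Calculating a sum with a condition that depends on the value of the input row is also simple. In the query below, the second case (the else branch) is always taken, because every feed_g is 50.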

select
    id,
    feed_g,
    sum(
        case when feed_g > 75 then feed_g
             else                  feed_g * 0.5
        end
    ) over (order by id) as row_weighted_sum
from transactions;

--  id | feed_g | row_weighted_sum 
-- ----+--------+------------------
--   1 |     50 |             25.0
--   2 |     50 |             50.0
--   3 |     50 |             75.0
--   4 |     50 |            100.0

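What I cannot figure out is how to write a query where the condition inside the aggregate function depends on the output of the same aggregate function as computed for the previous row.

Below is some non-working pseudo-SQL.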

select
    id,
    feed_g,
    sum(
        case when lag(recursive_sum) + feed_g  > 75 then feed_g
             else                                        feed_g * 0.5
        end
    ) over (order by id) as recursive_sum
from transactions;

-- The imagined output would be the following:
--  id | feed_g | recursive_sum 
-- ----+--------+---------------
--   1 |     50 |          25.0
--   2 |     50 |          50.0
--   3 |     50 |         100.0
--   4 |     50 |         150.0

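Using simple_sum as the input to recursive_sum does not seem to be a viable solution, since the two diverge over time. In the small example dataset given, this drift already affects the second row: simple_sum crosses the threshold on row 2, when the crossing should not happen until row 3.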

with estimate as (
    select
        id,
        feed_g,
        sum(feed_g) over (order by id) as simple_sum
    from transactions
)
select
    id,
    feed_g,
    simple_sum,
    sum(
        case when simple_sum > 75 then feed_g
             else                      feed_g * 0.5
        end
    ) over (order by id) as simple_sum_weighted_sum
from estimate;

--  id | feed_g | simple_sum | simple_sum_weighted_sum 
-- ----+--------+------------+-------------------------
--   1 |     50 |         50 |                    25.0
--   2 |     50 |        100 |                    75.0
--   3 |     50 |        150 |                   125.0
--   4 |     50 |        200 |                   175.0
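
The drift can be made explicit with a small Python sketch. It is only an illustration of the example data above, with the threshold of 75 taken from the queries; the variable names are chosen here for readability. It records, for each row, whether the threshold test fires when it is based on the plain running sum versus the weighted running sum.

```python
from itertools import accumulate

THRESHOLD = 75
feed_g = [50, 50, 50, 50]

# Plain running sum, as in sum(feed_g) over (order by id).
simple_sum = list(accumulate(feed_g))
crossed_simple = [s > THRESHOLD for s in simple_sum]

# Weighted running sum: each row is tested against the weighted total so far.
recursive_sum, crossed_recursive = [], []
total = 0.0
for value in feed_g:
    crossed = total + value > THRESHOLD
    total += value if crossed else value * 0.5
    recursive_sum.append(total)
    crossed_recursive.append(crossed)

# Print row id, then whether each variant considers the threshold crossed.
for row_id, (a, b) in enumerate(zip(crossed_simple, crossed_recursive), start=1):
    print(row_id, a, b)
```

The two flags disagree on row 2, which is exactly where simple_sum_weighted_sum (75.0) departs from the desired value (50.0).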

A third step that uses simple_sum_weighted_sum as the input in a call to lag does not work either, since it "forgets" the weights of everything except the last row.

with estimate as (
    select
        id,
        feed_g,
        sum(feed_g) over (order by id) as simple_sum
    from transactions
),
est2 as (
select
    id,
    feed_g,
    simple_sum,
    sum(
        case when simple_sum > 75 then feed_g
             else                      feed_g * 0.5
        end
    ) over (order by id) as simple_sum_weighted_sum
from estimate)
select
    id,
    feed_g,
    simple_sum,
    simple_sum_weighted_sum,
    coalesce(lag(simple_sum_weighted_sum) over (order by id), 0)
        + case when simple_sum_weighted_sum > 75 then feed_g
               else                                   feed_g * 0.5
          end as row_weighted_sum
from est2;

--  id | feed_g | simple_sum | simple_sum_weighted_sum | row_weighted_sum 
-- ----+--------+------------+-------------------------+------------------
--   1 |     50 |         50 |                    25.0 |             25.0
--   2 |     50 |        100 |                    75.0 |             50.0
--   3 |     50 |        150 |                   125.0 |            125.0
--   4 |     50 |        200 |                   175.0 |            175.0

I have written two working implementations of the algorithm in Python for reference. This is the first, in imperative style.

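data = (50, 50, 50, 50)
running_sum = 0  # named running_sum to avoid shadowing the built-in sum()
for value in data:
  # Add the full value once the running sum would exceed 75, otherwise half.
  if running_sum + value > 75:
    running_sum = running_sum + value
  else:
    running_sum = running_sum + value * 0.5
  print(value, running_sum)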

# 50 25.0
# 50 50.0
# 50 100.0
# 50 150.0

This second one is in a somewhat stunted functional style.

data = (50, 50, 50, 50)

def data_dependant_recursive_sum(iterator, last_sum):
  try:
    value = next(iterator)
  except StopIteration:
    return
  recursively_weighted_value = value if last_sum + value > 75 else value * 0.5
  recursive_sum = recursively_weighted_value + last_sum
  print(value, recursive_sum)
  data_dependant_recursive_sum(iterator, recursive_sum)
  
data_dependant_recursive_sum(iter(data), 0)

# 50 25.0
# 50 50.0
# 50 100.0
# 50 150.0
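
For comparison, the same computation can be sketched as a fold with itertools.accumulate, which avoids the explicit recursion. This is only one possible formulation; the helper name step is made up for this sketch, and the initial keyword requires Python 3.8 or later.

```python
from itertools import accumulate

def step(last_sum, value):
    """Add the full value once the threshold would be crossed, else half of it."""
    return last_sum + (value if last_sum + value > 75 else value * 0.5)

data = (50, 50, 50, 50)
# initial=0 seeds the fold; [1:] drops that seed from the output.
running = list(accumulate(data, step, initial=0))[1:]
for value, total in zip(data, running):
    print(value, total)
```

The important property is the same as in the recursive version: each step receives the previously computed sum, which is what a plain window aggregate cannot provide.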

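If this exercise feels contrived and absurd, a more complex but complete version of the problem can be found here: https://stackoverflow.com/questions/70158295

I am currently using Postgres 12, but upgrading to 14 would be easy if required.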

3 Answers

Isak · Answer 1 · 2022-01-11T09:58:44+08:00

Best Answer

This requires a recursive CTE. Here is an example in T-SQL (Postgres should be similar):

declare @transactions as table (
    id integer primary key identity(1,1),
    feed_g integer
);

insert into @transactions 
values (50), (50), (50), (50);

with indexed_transactions as (
    select *, row_number() over (order by id) as rn
    from @transactions
),
cte as (
    select cast(0 as bigint) as rn, 0 as id, 0 as feed_g, cast(0.0 as float) as row_weighted_sum

    union all
    select
        a.rn,
        a.id,
        a.feed_g,
        case when cte.row_weighted_sum + a.feed_g > 75 then cte.row_weighted_sum + a.feed_g
        else cte.row_weighted_sum + a.feed_g * 0.5 end as row_weighted_sum
    from indexed_transactions a
    join cte on cte.rn = a.rn - 1
)
select * from cte where id > 0

Result:

rn  id  feed_g  row_weighted_sum
1   1   50      25
2   2   50      50
3   3   50      100
4   4   50      150

Rovanion · Answer 2 · 2022-01-12T03:23:16+08:00

For posterity, here is Isak's answer translated to PostgreSQL.

with recursive indexed_transactions as (
    select *, row_number() over (order by id)
    from transactions
),
cte as (
    select 0::bigint as row_number, 0::bigint as id, 0::bigint as feed_g, 0::float as row_weighted_sum
    union all
    select
        a.row_number,
        a.id,
        a.feed_g,
        case when cte.row_weighted_sum + a.feed_g > 75
            then cte.row_weighted_sum + a.feed_g
            else cte.row_weighted_sum + a.feed_g * 0.5
        end as row_weighted_sum
    from indexed_transactions a
    join cte on cte.row_number = a.row_number - 1
)
select * from cte where id > 0;
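
To try the recursive CTE without a database server, the following Python sketch runs an equivalent query against an in-memory SQLite database. SQLite is only a stand-in here, not what the question uses: the casts are dropped, the column list is declared on the CTE, and the row-number column is named rn, all of which are adaptations made for this sketch. It assumes a SQLite build with window-function support (3.25 or later).

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("create table transactions (id integer primary key, feed_g integer)")
conn.executemany(
    "insert into transactions (feed_g) values (?)",
    [(50,), (50,), (50,), (50,)],
)

QUERY = """
with recursive indexed_transactions as (
    select id, feed_g, row_number() over (order by id) as rn
    from transactions
),
cte(rn, id, feed_g, row_weighted_sum) as (
    -- Seed row: position 0 with a running sum of 0.
    select 0, 0, 0, 0.0
    union all
    -- Each step joins the previous result row to the next transaction.
    select
        a.rn,
        a.id,
        a.feed_g,
        case when cte.row_weighted_sum + a.feed_g > 75
            then cte.row_weighted_sum + a.feed_g
            else cte.row_weighted_sum + a.feed_g * 0.5
        end
    from indexed_transactions a
    join cte on cte.rn = a.rn - 1
)
select id, feed_g, row_weighted_sum from cte where id > 0 order by id
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Joining on the row number rather than on id keeps the chain intact even when the id values have gaps, which can happen with sequence-generated keys.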

Laurenz Albe · Answer 3 · 2022-01-11T22:44:32+08:00

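If you have a working implementation in Python, you can simply turn it into a PL/Python function in PostgreSQL. While it should be possible to come up with a pure SQL solution, the task is really a procedural one, so a procedural solution is probably the most appropriate.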
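
As a rough sketch of the procedural core such a function might wrap, the loop from the question can be written as a generator. The name weighted_running_sum and the threshold parameter are hypothetical, chosen for this illustration. Inside PL/Python the input rows would come from the database (for example through plpy.execute) rather than from a Python list, and that wiring, along with the surrounding function definition, is not shown here.

```python
from typing import Iterable, Iterator, Tuple

def weighted_running_sum(
    feed_values: Iterable[int], threshold: float = 75
) -> Iterator[Tuple[int, float]]:
    """Yield (feed_g, running_sum) pairs, one per input value.

    A value counts in full once the running sum would exceed the
    threshold, and counts half otherwise.
    """
    total = 0.0
    for value in feed_values:
        total += value if total + value > threshold else value * 0.5
        yield value, total

for value, total in weighted_running_sum([50, 50, 50, 50]):
    print(value, total)
```

A generator fits this use because a set-returning function can hand back one row at a time, and the running state lives in an ordinary local variable.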
