While aggregating an array, I need to remove empty strings and then merge all adjacent identical values. For example:
["","product","product","","product","","product","product","","product","product","","product","","","collection","product","","","product","product","","collection","order","checkout",""]
should become:
["product","collection","product","collection","order","checkout"]
I have a working query with 4 nested selects:
SELECT array_agg(page_type_unique_pre) FILTER (WHERE page_type_unique_pre != '')
          OVER (ORDER BY event_time) AS page_type_journey_unique
FROM (
   SELECT CASE WHEN lag(last_page_type) OVER (ORDER BY event_time) LIKE '%' || page_type || '%' THEN ''
               ELSE page_type END AS page_type_unique_pre
        , page_type
        , event_time
   FROM (
      SELECT string_agg(page_type, ',') OVER (ORDER BY event_time) AS page_type_journey
           , first_value(page_type) OVER (PARTITION BY last_page_type_partition ORDER BY event_time) AS last_page_type
           , page_type
           , event_time
      FROM (
         SELECT sum(CASE WHEN page_type IS NULL OR page_type = '' THEN 0 ELSE 1 END) OVER (ORDER BY event_time) AS last_page_type_partition
              , page_type
              , event_time
         FROM (
            SELECT * FROM tes
         ) a
      ) b
   ) c
) d;
See the test case in this fiddle.
Surely there is a better way to achieve this?
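A single subquery should do it.
fiddle
Eliminate null and empty strings right away with WHERE page_type <> ''. A null page_type fails that comparison too, so no separate IS NOT NULL test is needed.
Then fetch the previous page_type with the window function lag(), defaulting to ''. This way, last_page_type can never be null (and the empty string '' cannot collide with an existing value, since empty strings have just been eliminated).
So we can use a plain <> (not the more expensive IS DISTINCT FROM) in the outer SELECT to identify rows with a new page type.
Feed the result set to an ARRAY constructor. Simplest and cheapest.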
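The query itself did not survive in this copy of the answer, so here is a minimal sketch reconstructed from the steps described above. It assumes the table tes and the columns page_type and event_time from the question, and that page_type is a text-like column; the subquery alias sub is made up for illustration.

```sql
-- Sketch only: assumes table tes(page_type text, event_time ...) as in the question.
SELECT ARRAY (
   SELECT page_type
   FROM  (
      SELECT page_type
           , event_time
             -- previous page type; '' for the first row, so the result is never null
           , lag(page_type, 1, '') OVER (ORDER BY event_time) AS last_page_type
      FROM   tes
      WHERE  page_type <> ''   -- drops empty strings and nulls in one test
      ) sub
   WHERE  page_type <> last_page_type   -- keep only rows that start a new run
   ORDER  BY event_time
   ) AS page_type_journey_unique;
```

Filtering before calling lag() matters: once the empty rows are gone, two "product" rows separated only by empty strings become adjacent and collapse into one, which is what the expected output in the question requires.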
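As already mentioned, it looks like you can simply use the LAG window function combined with array aggregation. If you want to see the array both before and after merging duplicates, you can use ARRAY_AGG once with and once without a FILTER clause.
db<>fiddle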
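The fiddle contents are not included here, so the following is only a rough sketch of what "ARRAY_AGG with and without a FILTER" could look like. It assumes the table tes and the columns from the question; the names prev_page_type, before_merge and after_merge are made up for illustration.

```sql
-- Sketch only: assumes table tes(page_type text, event_time ...) as in the question.
SELECT array_agg(page_type ORDER BY event_time) AS before_merge
     , array_agg(page_type ORDER BY event_time)
          FILTER (WHERE page_type <> prev_page_type) AS after_merge
FROM  (
   SELECT page_type
        , event_time
          -- previous non-empty page type; '' for the first row
        , lag(page_type, 1, '') OVER (ORDER BY event_time) AS prev_page_type
   FROM   tes
   WHERE  page_type <> ''   -- drops empty strings and nulls
   ) s;
```

Here before_merge holds every non-empty page type in event order, while after_merge keeps only the first row of each run of identical values.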