I am running queries against a dataset of 30 million rows that will only keep growing. The table is customer_actions (table size 2416 MB):
create table customer_actions
(
id bigint not null
constraint customer_actions_pkey
primary key,
action text,
customer_id bigint,
product_id bigint,
item_type text,
create_date timestamp
);
I have tried all sorts of indexes, but looking at the EXPLAIN output for the queries, none of them is being used.
SELECT customer_id, product_id, count(*) AS count
from customer_actions
WHERE action = 'a2b'
AND item_type = 'wine'
AND create_date BETWEEN current_timestamp - INTERVAL '2 Years' AND current_timestamp
GROUP BY customer_id, product_id;
SELECT customer_id, product_id, count(*) AS count
from customer_actions
WHERE action = 'view'
AND item_type = 'wine'
AND create_date BETWEEN current_timestamp - INTERVAL '2 Years' AND current_timestamp
GROUP BY customer_id, product_id;
SELECT customer_id, product_id, count(*) AS count
from customer_actions
WHERE action = 'buy'
AND item_type = 'wine'
AND create_date BETWEEN current_timestamp - INTERVAL '2 Years' AND current_timestamp
GROUP BY customer_id, product_id;
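These are the indexes I have tried. Some of them I knew would not work, but I was grasping at straws. Every index that ends with a WHERE condition was also tried without the condition. I don't suppose anyone can point me in the right direction? I am quite new to PSQL and do not yet have a deep understanding of indexes.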
CREATE INDEX IF NOT EXISTS idx_14 on customer_actions (customer_id, product_id, create_date, action, item_type) where action = 'a2b';
CREATE INDEX IF NOT EXISTS idx_15 on customer_actions (customer_id, product_id, action, item_type) where action = 'a2b';
CREATE INDEX IF NOT EXISTS idx_16 on customer_actions (customer_id, product_id) where action = 'a2b';
CREATE INDEX IF NOT EXISTS idx_11 on customer_actions (item_type, action ) where item_type = 'wine' and action = 'a2b';
CREATE INDEX IF NOT EXISTS idx_12 on customer_actions (item_type, action ) where item_type = 'wine' and action = 'view' ;
CREATE INDEX IF NOT EXISTS idx_13 on customer_actions (item_type, action ) where item_type = 'wine' and action = 'buy' ;
CREATE INDEX idx_time on customer_actions using brin (create_date);
create index idx_actions_a2b on customer_actions (action) where action = 'a2b';
CREATE INDEX IF NOT EXISTS idx_customer_actions_action_product_cardinality_order on customer_actions (customer_id, product_id, action);
CREATE INDEX id_time_and_other on customer_actions (action, item_type, create_date DESC);
CREATE INDEX IF NOT EXISTS idx_customer_actions_product_and_customer on customer_actions (customer_id, product_id);
CREATE INDEX IF NOT EXISTS idx_14 on customer_actions (customer_id, product_id, create_date, action, item_type);
CREATE INDEX IF NOT EXISTS idx_14 on customer_actions (customer_id, product_id, create_date, action);
CREATE INDEX IF NOT EXISTS idx_17 on customer_actions (customer_id, product_id);
The EXPLAIN output for the query is:
Finalize GroupAggregate  (cost=745877.49..1182094.60 rows=1527687 width=24)
  Group Key: customer_id, product_id
  ->  Gather Merge  (cost=745877.49..1143902.43 rows=3055374 width=24)
        Workers Planned: 2
        ->  Partial GroupAggregate  (cost=744877.46..790236.43 rows=1527687 width=24)
              Group Key: customer_id, product_id
              ->  Sort  (cost=744877.46..752397.99 rows=3008210 width=16)
                    Sort Key: customer_id, product_id
                    ->  Parallel Seq Scan on customer_actions  (cost=0.00..318363.94 rows=3008210 width=16)
                          Filter: ((action = 'a2b'::text) AND (item_type = 'wine'::text) AND (create_date <= CURRENT_TIMESTAMP) AND (create_date >= (CURRENT_TIMESTAMP - '2 years'::interval)))
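The plan above contains planner estimates only. As a rough sketch, actual row counts, timings, and buffer usage for the same query could be collected with PostgreSQL's EXPLAIN (ANALYZE, BUFFERS) options. Note that ANALYZE really executes the statement, so it takes about as long as the query itself.

```sql
-- EXPLAIN (ANALYZE, BUFFERS) runs the query and reports actual rows,
-- timings, and buffer hits/reads next to the planner's estimates.
EXPLAIN (ANALYZE, BUFFERS)
SELECT customer_id, product_id, count(*) AS count
FROM customer_actions
WHERE action = 'a2b'
  AND item_type = 'wine'
  AND create_date BETWEEN current_timestamp - INTERVAL '2 years' AND current_timestamp
GROUP BY customer_id, product_id;
```

Comparing the estimated and actual row counts on the scan node shows how selective the filter really is.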
The best index for a query is one that groups all of the wanted records together.
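As a minimal sketch of that idea for the queries in the question: putting the equality columns first and the range column after them keeps matching rows adjacent in the index. The index name below is hypothetical, and the INCLUDE clause assumes PostgreSQL 11 or later; on older versions customer_id and product_id could be appended as ordinary key columns instead.

```sql
-- Hypothetical index name; INCLUDE is an assumption (PostgreSQL 11+).
-- Equality columns (action, item_type) lead, the range column (create_date)
-- follows, so the rows for one action/item_type/date window are contiguous.
-- The included columns let the query be answered from the index alone.
CREATE INDEX IF NOT EXISTS idx_action_type_date_covering
    ON customer_actions (action, item_type, create_date)
    INCLUDE (customer_id, product_id);

-- Refresh statistics and the visibility map so the planner can
-- consider an index-only scan.
VACUUM (ANALYZE) customer_actions;
```

Even with such an index, the planner may still prefer a sequential scan if the two-year window matches a large share of the table.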
Imagine if you said
ORDER BY action, item_type, create_date