I am running a query against a data set that has 30 million rows and will only keep growing. The table is customer_actions (table size is 2416 MB):
create table customer_actions
(
    id bigint not null
        constraint customer_actions_pkey
            primary key,
    action text,
    customer_id bigint,
    product_id bigint,
    item_type text,
    create_date timestamp
);
I have tried a wide variety of indexes, but looking at the EXPLAIN output for the query, none of them is being used. These are the queries:
SELECT customer_id, product_id, count(*) AS count
FROM customer_actions
WHERE action = 'a2b'
  AND item_type = 'wine'
  AND create_date BETWEEN current_timestamp - INTERVAL '2 Years' AND current_timestamp
GROUP BY customer_id, product_id;

SELECT customer_id, product_id, count(*) AS count
FROM customer_actions
WHERE action = 'view'
  AND item_type = 'wine'
  AND create_date BETWEEN current_timestamp - INTERVAL '2 Years' AND current_timestamp
GROUP BY customer_id, product_id;

SELECT customer_id, product_id, count(*) AS count
FROM customer_actions
WHERE action = 'buy'
  AND item_type = 'wine'
  AND create_date BETWEEN current_timestamp - INTERVAL '2 Years' AND current_timestamp
GROUP BY customer_id, product_id;
These are the indexes I have tried. I knew some of them would not work, but I was grasping at straws. Every index with a condition at the end was also tried without the condition. I don't suppose someone could point me in the right direction? I am very new to PostgreSQL and do not yet have a deep understanding of indexes.
CREATE INDEX IF NOT EXISTS idx_14 on customer_actions (customer_id, product_id, create_date, action, item_type) where action = 'a2b';
CREATE INDEX IF NOT EXISTS idx_15 on customer_actions (customer_id, product_id, action, item_type) where action = 'a2b';
CREATE INDEX IF NOT EXISTS idx_16 on customer_actions (customer_id, product_id) where action = 'a2b';
CREATE INDEX IF NOT EXISTS idx_11 on customer_actions (item_type, action) where item_type = 'wine' and action = 'a2b';
CREATE INDEX IF NOT EXISTS idx_12 on customer_actions (item_type, action) where item_type = 'wine' and action = 'view';
CREATE INDEX IF NOT EXISTS idx_13 on customer_actions (item_type, action) where item_type = 'wine' and action = 'buy';
CREATE INDEX idx_time on customer_actions using brin (create_date);
create index idx_actions_a2b on customer_actions (action) where action = 'a2b';
CREATE INDEX IF NOT EXISTS idx_customer_actions_action_product_cardinality_order on customer_actions (customer_id, product_id, action);
CREATE INDEX id_time_and_other on customer_actions (action, item_type, create_date DESC);
CREATE INDEX IF NOT EXISTS idx_customer_actions_product_and_customer on customer_actions (customer_id, product_id);
CREATE INDEX IF NOT EXISTS idx_14 on customer_actions (customer_id, product_id, create_date, action, item_type);
CREATE INDEX IF NOT EXISTS idx_14 on customer_actions (customer_id, product_id, create_date, action);
CREATE INDEX IF NOT EXISTS idx_17 on customer_actions (customer_id, product_id);
The EXPLAIN output for the query is:
Finalize GroupAggregate  (cost=745877.49..1182094.60 rows=1527687 width=24)
  Group Key: customer_id, product_id
  ->  Gather Merge  (cost=745877.49..1143902.43 rows=3055374 width=24)
        Workers Planned: 2
        ->  Partial GroupAggregate  (cost=744877.46..790236.43 rows=1527687 width=24)
              Group Key: customer_id, product_id
              ->  Sort  (cost=744877.46..752397.99 rows=3008210 width=16)
                    Sort Key: customer_id, product_id
                    ->  Parallel Seq Scan on customer_actions  (cost=0.00..318363.94 rows=3008210 width=16)
                          Filter: ((action = 'a2b'::text) AND (item_type = 'wine'::text) AND (create_date <= CURRENT_TIMESTAMP) AND (create_date >= (CURRENT_TIMESTAMP - '2 years'::interval)))
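
The plan above contains only the planner's estimates. If actual row counts and timings would help, something like the following sketch should produce them for the 'a2b' query. It assumes nothing beyond PostgreSQL's standard EXPLAIN options ANALYZE and BUFFERS, and I have not included its output here.

```sql
-- Sketch only: the same 'a2b' query as above, wrapped in EXPLAIN.
-- ANALYZE actually executes the statement, so it takes as long as the query itself.
-- BUFFERS adds shared-buffer hit/read counts to each plan node.
EXPLAIN (ANALYZE, BUFFERS)
SELECT customer_id, product_id, count(*) AS count
FROM customer_actions
WHERE action = 'a2b'
  AND item_type = 'wine'
  AND create_date BETWEEN current_timestamp - INTERVAL '2 years' AND current_timestamp
GROUP BY customer_id, product_id;
```

Comparing the estimated "rows=" figures with the actual ones in that output would show whether the planner's choice of a sequential scan rests on accurate statistics.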