I have a query similar to the following:
FROM example_table
WHERE
`date` BETWEEN '2023-11-26' AND '2023-11-28'
AND location_id IN (3, 4, 6, 7, 8, 10, 11, 12, 14, 18, 19, 22, 23, 24, 28, 29, 30, 31, 32, 36, 39, 40, 41, 43, 45, 46, 48, 49, 50, 51, 52, 54, 55, 56, 57, 59, 60, 61, 62, 68, 69, 75, 121)
AND ( `type` IS NULL OR ( `type` IN ('type1', 'type2', 'type3') ) )
GROUP BY location_id;
My understanding is that, when creating a multi-column index, the column with the highest cardinality/selectivity should go first. I tested performance with two index definitions:
- (date, location_id, type, value)
- (location_id, date, type, value)
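For concreteness, the two definitions above could be created roughly as follows. This is a minimal sketch against the simplified example_table; the index names idx_date_first and idx_location_first are placeholders chosen for illustration, not the names used on the real table.

```sql
-- Sketch only: index names are hypothetical placeholders.
-- Candidate 1: the higher-cardinality column (`date`) leads.
ALTER TABLE example_table
  ADD INDEX idx_date_first (`date`, location_id, `type`, `value`);

-- Candidate 2: the lower-cardinality column (location_id) leads.
ALTER TABLE example_table
  ADD INDEX idx_location_first (location_id, `date`, `type`, `value`);
```

Both candidates contain every column the query touches, so either one can serve it as a covering index.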
In my real table, there are 11,833 distinct values in the date column and only 99 in location_id. The table currently has more than 63 million rows.
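Counts like these can be obtained with a query along the following lines. It is a sketch written against the simplified example_table rather than the real table, and the column aliases are made up for readability.

```sql
-- Sketch: distinct-value counts per column plus the total row count.
-- On a table this size the query may take a while to run.
SELECT
  COUNT(DISTINCT `date`)      AS distinct_dates,
  COUNT(DISTINCT location_id) AS distinct_locations,
  COUNT(*)                    AS total_rows
FROM example_table;
```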
However, MySQL 8 prefers the index that starts with location_id. Even when I try FORCE INDEX and EXPLAIN ANALYZE, it shows a higher cost/time for the index that starts with date.
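The comparison can be sketched like this. The SELECT list, the index name idx_date_first, and the shortened location_id list are assumptions made for illustration; only the WHERE and GROUP BY shape comes from the query above.

```sql
-- Sketch: force one candidate index and inspect the actual plan and timings.
-- idx_date_first and the SELECT list are hypothetical; adjust to the real schema.
EXPLAIN ANALYZE
SELECT location_id, SUM(`value`) AS total_value
FROM example_table FORCE INDEX (idx_date_first)
WHERE
  `date` BETWEEN '2023-11-26' AND '2023-11-28'
  AND location_id IN (3, 4, 6)  -- shortened list for readability
  AND (`type` IS NULL OR `type` IN ('type1', 'type2', 'type3'))
GROUP BY location_id;
```

Running the same statement again with the other index name in FORCE INDEX gives the second plan to compare against.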
What could be going on?
EDIT:
EXPLAIN ANALYZE:
- date-first index
-> Group aggregate: sum(ledger_entries.amount_cents) (cost=1897 rows=6236) (actual time=0.167..4.67 rows=43 loops=1)
-> Filter: ((ledger_entries.`date` = DATE'2023-11-28') and (ledger_entries.location_id in (3,4,6,7,8,10,11,12,14,18,19,22,23,24,28,29,30,31,32,36,39,40,41,43,45,46,48,49,50,51,52,54,55,56,57,59,60,61,62,68,69,75,121)) and ((ledger_entries.`type` is null) or (ledger_entries.`type` in ('Procedure','Adjustment','AncillarySale')))) (cost=1273 rows=6236) (actual time=0.0221..4.09 rows=6192 loops=1)
-> Covering index range scan on ledger_entries using index_le_date_location_type_amount_cents over (date = '2023-11-28' AND location_id = 3 AND type = NULL) OR (date = '2023-11-28' AND location_id = 3 AND type = 'Adjustment') OR (170 more) (cost=1273 rows=6236) (actual time=0.02..2.83 rows=6192 loops=1)
- location-first index
-> Group aggregate: sum(ledger_entries.amount_cents) (cost=1888 rows=6236) (actual time=0.171..4.74 rows=43 loops=1)
-> Filter: ((ledger_entries.`date` = DATE'2023-11-28') and (ledger_entries.location_id in (3,4,6,7,8,10,11,12,14,18,19,22,23,24,28,29,30,31,32,36,39,40,41,43,45,46,48,49,50,51,52,54,55,56,57,59,60,61,62,68,69,75,121)) and ((ledger_entries.`type` is null) or (ledger_entries.`type` in ('Procedure','Adjustment','AncillarySale')))) (cost=1265 rows=6236) (actual time=0.0244..4.15 rows=6192 loops=1)
-> Covering index range scan on ledger_entries using ledger_entries_location_date_type_amount_cents over (location_id = 3 AND date = '2023-11-28' AND type = NULL) OR (location_id = 3 AND date = '2023-11-28' AND type = 'Adjustment') OR (170 more) (cost=1265 rows=6236) (actual time=0.022..2.91 rows=6192 loops=1)