PostgreSQL does not seem to recognize that the sort is on a constant and that each constant unambiguously identifies its row source. As a result, it evaluates every row source before sorting:
create table t1 as
select generate_series(1, 10) as c1;
explain analyse
select c1
from (select 1, c1 from t1 where c1 = 1
union all
select 2, c1 from t1 where c1 = 2
union all
select 3, c1 from t1 where c1 = 3
union all
select 4, c1 from t1 where c1 = 4
order by 1
limit 1) q
Subquery Scan on q (cost=169.84..169.85 rows=1 width=4) (actual time=0.501..0.685 rows=1 loops=1)
  -> Limit (cost=169.84..169.84 rows=1 width=8) (actual time=0.479..0.622 rows=1 loops=1)
        -> Sort (cost=169.84..169.97 rows=52 width=8) (actual time=0.458..0.570 rows=1 loops=1)
              Sort Key: (1)
              Sort Method: top-N heapsort Memory: 25kB
              -> Append (cost=0.00..168.28 rows=52 width=8) (actual time=0.044..0.476 rows=4 loops=1)
                    -> Seq Scan on t1 (cost=0.00..41.88 rows=13 width=8) (actual time=0.023..0.046 rows=1 loops=1)
                          Filter: (c1 = 1)
                          Rows Removed by Filter: 9
                    -> Seq Scan on t1 t1_1 (cost=0.00..41.88 rows=13 width=8) (actual time=0.016..0.038 rows=1 loops=1)
                          Filter: (c1 = 2)
                          Rows Removed by Filter: 9
                    -> Seq Scan on t1 t1_2 (cost=0.00..41.88 rows=13 width=8) (actual time=0.016..0.038 rows=1 loops=1)
                          Filter: (c1 = 3)
                          Rows Removed by Filter: 9
                    -> Seq Scan on t1 t1_3 (cost=0.00..41.88 rows=13 width=8) (actual time=0.017..0.039 rows=1 loops=1)
                          Filter: (c1 = 4)
                          Rows Removed by Filter: 9
However, after removing order by 1, I get exactly what I want: the "never executed" notes in the query plan:
explain analyse
select c1
from (select 1, c1 from t1 where c1 = 1
union all
select 2, c1 from t1 where c1 = 2
union all
select 3, c1 from t1 where c1 = 3
union all
select 4, c1 from t1 where c1 = 4
--order by 1
limit 1) q
Subquery Scan on q (cost=0.00..3.25 rows=1 width=4) (actual time=0.092..0.195 rows=1 loops=1)
  -> Limit (cost=0.00..3.24 rows=1 width=8) (actual time=0.069..0.131 rows=1 loops=1)
        -> Append (cost=0.00..168.28 rows=52 width=8) (actual time=0.047..0.078 rows=1 loops=1)
              -> Seq Scan on t1 (cost=0.00..41.88 rows=13 width=8) (actual time=0.026..0.037 rows=1 loops=1)
                    Filter: (c1 = 1)
              -> Seq Scan on t1 t1_1 (cost=0.00..41.88 rows=13 width=8) (never executed)
                    Filter: (c1 = 2)
              -> Seq Scan on t1 t1_2 (cost=0.00..41.88 rows=13 width=8) (never executed)
                    Filter: (c1 = 3)
              -> Seq Scan on t1 t1_3 (cost=0.00..41.88 rows=13 width=8) (never executed)
                    Filter: (c1 = 4)
The question is: is it safe to omit order by in such cases?
The optimizer is not smart enough to figure out that the rows are already sorted, so it sorts them again, and to do that it needs all the rows. Simply omit the ORDER BY.
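Without an ORDER BY, SQL itself makes no promise about row order, so the query above depends on Append scanning its branches in the order they are written. If that feels too implicit, one possible alternative is to make the priority explicit in PL/pgSQL. The following is only a sketch under assumptions: the function name first_match is hypothetical and not part of the original post, and it assumes c1 is an integer column, as in the t1 example.

```sql
-- Hypothetical helper (name and signature are illustrative only).
-- Tries each branch in priority order and returns at the first hit,
-- so later branches are never run once an earlier one finds a row.
create or replace function first_match() returns integer
language plpgsql stable as $$
declare
  result integer;
begin
  -- Priority 1
  select c1 into result from t1 where c1 = 1 limit 1;
  if found then return result; end if;

  -- Priority 2
  select c1 into result from t1 where c1 = 2 limit 1;
  if found then return result; end if;

  -- Priority 3
  select c1 into result from t1 where c1 = 3 limit 1;
  if found then return result; end if;

  -- Priority 4
  select c1 into result from t1 where c1 = 4 limit 1;
  if found then return result; end if;

  -- No branch produced a row
  return null;
end;
$$;

select first_match();
```

This trades the single-statement form for control flow whose ordering does not depend on how the planner arranges the UNION ALL branches. Whether that trade is worthwhile depends on how costly the later branches are in the real query.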