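We have a query that performs poorly. The root of the problem can be reproduced with a simple query that accesses only one index to retrieve a single column (the indexed column) from eight rows.
The table has no statistics, but the index does. Gathering fresh statistics on the index did not change the plan, but gathering statistics on the table did. My understanding is that a query that can be satisfied from the index alone never has to visit the table, so my mental model was that table statistics are irrelevant in this case. Experience seems to show otherwise.
Both the explain plan and the autotrace plan show only index access, but cost and cardinality increase significantly when table statistics are absent. Autotrace also shows higher CPU, DB time, and consistent gets. I have not tried tracing it yet, but I can reproduce it by creating and deleting the table statistics, as shown below. Can anyone explain this behavior?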
set serveroutput on
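DECLARE
-- OUT parameters filled in by dbms_stats.get_index_stats.
nrow NUMBER; -- number of rows
nblk NUMBER; -- number of leaf blocks
numd NUMBER; -- distinct keys
avgl NUMBER; -- average leaf blocks per key
avgd NUMBER; -- average data blocks per key
cfac NUMBER; -- clustering factor
ilvl NUMBER; -- index level
gues NUMBER; -- IOT guess quality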
BEGIN
--Gather Stats.
dbms_stats.Gather_table_Stats(USER,'RESULTS');
--Gather Index Stats.
dbms_stats.Gather_index_Stats(USER,'I1');
--Show Index Stats.
dbms_stats.get_index_stats(USER, 'I1', NULL, NULL, NULL, nrow, nblk
, numd, avgl, avgd, cfac, ilvl, NULL, gues);
dbms_output.put_line('Number of rows: ' || TO_CHAR(nrow));
dbms_output.put_line('Number of blocks: ' || TO_CHAR(nblk));
dbms_output.put_line('Distinct keys: ' || TO_CHAR(numd));
dbms_output.put_line('Avg leaf blocks/key: ' || TO_CHAR(avgl));
dbms_output.put_line('Avg data blocks/key: ' || TO_CHAR(avgd));
dbms_output.put_line('Clustering factor: ' || TO_CHAR(cfac));
dbms_output.put_line('Index level: ' || TO_CHAR(ilvl));
dbms_output.put_line('IOT guess quality: ' || TO_CHAR(gues));
delete from plan_table;
END;
/
EXPLAIN PLAN FOR SELECT rsample_id FROM results
WHERE rsample_id = '0555103360';
SELECT cost, substr(lpad(' ', level-1) || operation || ' (' || options
|| ')',1,50 ) "Operation", object_name "Object"
FROM plan_table START WITH ID = 0 CONNECT BY PRIOR id=parent_id;
DECLARE
nrow NUMBER;
nblk NUMBER;
numd NUMBER;
avgl NUMBER;
avgd NUMBER;
cfac NUMBER;
ilvl NUMBER;
gues NUMBER;
BEGIN
--Delete Stats.
dbms_stats.delete_table_stats(USER,'RESULTS');
--Gather Index Stats.
dbms_stats.Gather_index_Stats(USER,'I1');
--Show Index Stats.
dbms_stats.get_index_stats(USER, 'I1', NULL, NULL, NULL, nrow, nblk
, numd, avgl, avgd, cfac, ilvl, NULL, gues);
dbms_output.put_line('Number of rows: ' || TO_CHAR(nrow));
dbms_output.put_line('Number of blocks: ' || TO_CHAR(nblk));
dbms_output.put_line('Distinct keys: ' || TO_CHAR(numd));
dbms_output.put_line('Avg leaf blocks/key: ' || TO_CHAR(avgl));
dbms_output.put_line('Avg data blocks/key: ' || TO_CHAR(avgd));
dbms_output.put_line('Clustering factor: ' || TO_CHAR(cfac));
dbms_output.put_line('Index level: ' || TO_CHAR(ilvl));
dbms_output.put_line('IOT guess quality: ' || TO_CHAR(gues));
delete from plan_table;
END;
/
EXPLAIN PLAN FOR SELECT rsample_id FROM results
WHERE rsample_id = '0555103360';
SELECT cost, substr(lpad(' ', level-1) || operation || ' (' || options
|| ')',1,50 ) "Operation", object_name "Object"
FROM plan_table START WITH ID = 0 CONNECT BY PRIOR id=parent_id;
This produces the following output (edited to fit):
anonymous block completed
Number of rows: 125226611
Number of blocks: 381090
Distinct keys: 5778886
Avg leaf blocks/key: 1
Avg data blocks/key: 3
Clustering factor: 19792294
Index level: 3
IOT guess quality:
plan FOR succeeded.
 COST Operation             Object
----- --------------------- ------
    4 SELECT STATEMENT()
    4  INDEX (RANGE SCAN)   I1
anonymous block completed
Number of rows: 119034073
Number of blocks: 362402
Distinct keys: 5353024
Avg leaf blocks/key: 1
Avg data blocks/key: 3
Clustering factor: 18852918
Index level: 3
IOT guess quality:
plan FOR succeeded.
 COST Operation             Object
----- --------------------- ------
    9 SELECT STATEMENT()
    9  INDEX (RANGE SCAN)   I1
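After putting this together, I noticed that the index statistics differ on each run, even though nothing in the table changed and the index statistics are re-gathered every time. My current theory is that when table statistics are gathered with the cascade option, something in the index statistics is retained even after the index statistics are gathered again.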
Granularity is set to AUTO and Cascade is set to AUTO_CASCADE.
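To double-check those settings, and to see whether sampling might explain the run-to-run differences, a sketch along these lines may help. It assumes 11g or later (where DBMS_STATS.GET_PREFS is available) and that both the table and the index live in the current schema. If SAMPLE_SIZE turns out to be well below NUM_ROWS, the index statistics were estimated from a sample, which could account for different numbers on each gather.

```sql
-- Effective statistics preferences for the table (assumes 11g or later).
SELECT dbms_stats.get_prefs('CASCADE', USER, 'RESULTS')          AS cascade_pref,
       dbms_stats.get_prefs('GRANULARITY', USER, 'RESULTS')      AS granularity_pref,
       dbms_stats.get_prefs('ESTIMATE_PERCENT', USER, 'RESULTS') AS estimate_pct
FROM   dual;

-- How many rows the most recent index gather actually examined.
SELECT index_name, num_rows, sample_size, distinct_keys,
       clustering_factor, last_analyzed
FROM   user_ind_statistics
WHERE  index_name = 'I1';
```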
My guess is that the CBO somehow derives approximate table statistics from the index statistics, or falls back on some rule of thumb.
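One way to test this guess, as a sketch only: DBMS_XPLAN.DISPLAY prints a Note section beneath the plan, and when the optimizer samples the table at parse time because statistics are missing, that section normally says so. The snippet assumes the default PLAN_TABLE and is meant to be run twice, once with table statistics in place and once after deleting them. The last query simply confirms which state the table is in.

```sql
EXPLAIN PLAN FOR
  SELECT rsample_id FROM results WHERE rsample_id = '0555103360';

-- The Note section at the bottom reports dynamic sampling, if it was used.
SELECT * FROM TABLE(dbms_xplan.display);

-- NUM_ROWS and LAST_ANALYZED are NULL when the table has no statistics.
SELECT table_name, num_rows, blocks, sample_size, last_analyzed
FROM   user_tab_statistics
WHERE  table_name = 'RESULTS';
```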
The differing statistics may be caused by