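I think the general advice in this community is to avoid temp tables in favor of CTEs. However, I sometimes run into cases where a CTE construct is very slow while its temp table equivalent is very fast.
For example, this one spins for hours and never seems to produce a result. The query plan is full of nested loops.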
CREATE TABLE #TRIANGLES
(
NODE_A VARCHAR(22),
NODE_B VARCHAR(22),
NODE_C VARCHAR(22)
)
;
INSERT INTO #TRIANGLES VALUES
/* 150,000 ROWS */
;
CREATE NONCLUSTERED INDEX IDX_A ON #TRIANGLES (NODE_A);
CREATE NONCLUSTERED INDEX IDX_B ON #TRIANGLES (NODE_B);
CREATE NONCLUSTERED INDEX IDX_C ON #TRIANGLES (NODE_C);
WITH
TRIANGLES_FILTERED AS
(
-- **** FILTERING OF THE TRIANGLE TABLE OCCURS IN A CTE ****
SELECT *
FROM #TRIANGLES AS T
WHERE LEN(T.NODE_A) = 2 AND
LEN(T.NODE_B) = 2 AND
LEN(T.NODE_C) = 2
),
CONNECTABLE_NODES AS
(
SELECT DISTINCT T1.NODE_C AS [NODE]
FROM TRIANGLES_FILTERED AS T1
INNER JOIN
TRIANGLES_FILTERED AS T2
ON T1.NODE_B = T2.NODE_A AND
T1.NODE_C = T2.NODE_B
INNER JOIN
TRIANGLES_FILTERED AS T3
ON T2.NODE_B = T3.NODE_A AND
T2.NODE_C = T3.NODE_B
WHERE T1.NODE_A <> T2.NODE_C AND
T1.NODE_A <> T3.NODE_C AND
T2.NODE_A <> T3.NODE_C
)
SELECT *
FROM #TRIANGLES AS T1
WHERE T1.NODE_A IN (SELECT * FROM CONNECTABLE_NODES) AND
T1.NODE_B IN (SELECT * FROM CONNECTABLE_NODES) AND
T1.NODE_C IN (SELECT * FROM CONNECTABLE_NODES)
;
Query plan: https://www.brentozar.com/pastetheplan/?id=rk_5TaiiP
Whereas the query plan for this one uses hash matches, and it runs in an instant:
CREATE TABLE #TRIANGLES
(
NODE_A VARCHAR(22),
NODE_B VARCHAR(22),
NODE_C VARCHAR(22)
)
;
INSERT INTO #TRIANGLES VALUES
/* 150,000 ROWS */
;
CREATE NONCLUSTERED INDEX IDX_A ON #TRIANGLES (NODE_A);
CREATE NONCLUSTERED INDEX IDX_B ON #TRIANGLES (NODE_B);
CREATE NONCLUSTERED INDEX IDX_C ON #TRIANGLES (NODE_C);
-- **** FILTERING OF THE TRIANGLE TABLE SAVED INTO A TEMP TABLE ****
SELECT *
INTO #TRIANGLES_FILTERED
FROM #TRIANGLES AS T
WHERE LEN(T.NODE_A) = 2 AND
LEN(T.NODE_B) = 2 AND
LEN(T.NODE_C) = 2
;
CREATE NONCLUSTERED INDEX IDX_A ON #TRIANGLES_FILTERED (NODE_A);
CREATE NONCLUSTERED INDEX IDX_B ON #TRIANGLES_FILTERED (NODE_B);
CREATE NONCLUSTERED INDEX IDX_C ON #TRIANGLES_FILTERED (NODE_C);
WITH
CONNECTABLE_NODES AS
(
SELECT DISTINCT T1.NODE_C AS [NODE]
FROM #TRIANGLES_FILTERED AS T1
INNER JOIN
#TRIANGLES_FILTERED AS T2
ON T1.NODE_B = T2.NODE_A AND
T1.NODE_C = T2.NODE_B
INNER JOIN
#TRIANGLES_FILTERED AS T3
ON T2.NODE_B = T3.NODE_A AND
T2.NODE_C = T3.NODE_B
WHERE T1.NODE_A <> T2.NODE_C AND
T1.NODE_A <> T3.NODE_C AND
T2.NODE_A <> T3.NODE_C
)
SELECT *
FROM #TRIANGLES AS T1
WHERE T1.NODE_A IN (SELECT * FROM CONNECTABLE_NODES) AND
T1.NODE_B IN (SELECT * FROM CONNECTABLE_NODES) AND
T1.NODE_C IN (SELECT * FROM CONNECTABLE_NODES)
;
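Query plan: https://www.brentozar.com/pastetheplan/?id=B1cZC6isD
How would I rewrite the first one to be as fast as the second?
By the way, in case you are wondering what all the geometry/topology is about, I needed to know how all the triangles connect to one another when creating this puzzle:
https://puzzling.stackexchange.com/questions/105275/dragon-summoning-spell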
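Sometimes the row estimates for a CTE are wrong. Temp tables handle this well, since the filtered rows are materialized and get their own statistics.
So the CTE version uses those indexes because the optimizer thinks there are fewer rows there. The reason the first one is slow is the RID Lookup. If you drop the indexes, or add the output columns to the indexes as included columns, it will be faster.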
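As a rough sketch of the second suggestion, the three indexes from the question could be declared with `INCLUDE` so that each one covers all three columns and the `SELECT *` no longer needs a RID Lookup. The table definition is copied from the question so that the script runs on its own in a fresh session. Whether this actually changes the plan on the real 150,000-row data is an assumption to verify, not a measured result.

```sql
-- Table definition copied from the question.
CREATE TABLE #TRIANGLES
(
    NODE_A VARCHAR(22),
    NODE_B VARCHAR(22),
    NODE_C VARCHAR(22)
);

-- Load the rows here, as in the question.

-- Each index keys on one column and carries the other two at the leaf level,
-- so a seek on any of them can return every column without touching the heap.
CREATE NONCLUSTERED INDEX IDX_A ON #TRIANGLES (NODE_A) INCLUDE (NODE_B, NODE_C);
CREATE NONCLUSTERED INDEX IDX_B ON #TRIANGLES (NODE_B) INCLUDE (NODE_A, NODE_C);
CREATE NONCLUSTERED INDEX IDX_C ON #TRIANGLES (NODE_C) INCLUDE (NODE_A, NODE_B);
```

If the plain indexes already exist, adding `WITH (DROP_EXISTING = ON)` to each statement rebuilds them in place. The other option mentioned above is simply to leave out the three `CREATE NONCLUSTERED INDEX` statements on `#TRIANGLES` and let the optimizer scan the heap.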
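There is a great blog post about this here.
I don't think there is an outright winner between them. Use them case by case, and try both on the same workload. That way you can see the cost of each.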
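One way to compare the two approaches side by side is to turn on `SET STATISTICS IO` and `SET STATISTICS TIME` and run both variants in the same session. The sketch below uses a tiny hypothetical table (`#DEMO_TRIANGLES`, `#DEMO_FILTERED`, and the sample rows are made up for illustration), so the numbers it reports say nothing about the real data. It only shows the shape of the comparison.

```sql
-- Hypothetical miniature of the question's table, only so the script runs on its own.
CREATE TABLE #DEMO_TRIANGLES
(
    NODE_A VARCHAR(22),
    NODE_B VARCHAR(22),
    NODE_C VARCHAR(22)
);

INSERT INTO #DEMO_TRIANGLES VALUES
    ('AA', 'BB', 'CC'),
    ('BB', 'CC', 'DD'),
    ('CC', 'DD', 'EE'),
    ('LONG', 'BB', 'CC');

-- Report logical reads and CPU/elapsed time for every statement that follows.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Variant 1: the filter lives in a CTE and is expanded inline at each reference.
WITH F AS
(
    SELECT *
    FROM #DEMO_TRIANGLES
    WHERE LEN(NODE_A) = 2 AND LEN(NODE_B) = 2 AND LEN(NODE_C) = 2
)
SELECT T1.NODE_A, T2.NODE_C
FROM F AS T1
INNER JOIN F AS T2
    ON T1.NODE_B = T2.NODE_A AND T1.NODE_C = T2.NODE_B;

-- Variant 2: the filter is materialized once into a temp table.
SELECT *
INTO #DEMO_FILTERED
FROM #DEMO_TRIANGLES
WHERE LEN(NODE_A) = 2 AND LEN(NODE_B) = 2 AND LEN(NODE_C) = 2;

SELECT T1.NODE_A, T2.NODE_C
FROM #DEMO_FILTERED AS T1
INNER JOIN #DEMO_FILTERED AS T2
    ON T1.NODE_B = T2.NODE_A AND T1.NODE_C = T2.NODE_B;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;

DROP TABLE #DEMO_FILTERED;
DROP TABLE #DEMO_TRIANGLES;
```

In SSMS the per-statement read counts and timings appear on the Messages tab. Comparing them alongside the actual execution plans shows where each variant spends its effort.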