AskOverflow.Dev

Quandary
Asked: 2012-03-19 20:38:03 +0800 CST

SQL query slows down from 1 second to 11 minutes - why?

  • 772

Problem: I am porting the following query (which lists tables in order of their foreign key dependencies) to PostgreSQL.

WITH Fkeys AS (

    SELECT DISTINCT 
         OnTable       = OnTable.name
        ,AgainstTable  = AgainstTable.name 
    FROM sysforeignkeys fk 

        INNER JOIN sysobjects onTable 
            ON fk.fkeyid = onTable.id 

        INNER JOIN sysobjects againstTable  
            ON fk.rkeyid = againstTable.id 

    WHERE 1=1
        AND AgainstTable.TYPE = 'U'
        AND OnTable.TYPE = 'U'
        -- ignore self joins; they cause an infinite recursion
        AND OnTable.Name <> AgainstTable.Name
    )

,MyData AS (

    SELECT 
         OnTable = o.name 
        ,AgainstTable = FKeys.againstTable 
    FROM sys.objects o 

    LEFT JOIN FKeys
        ON o.name = FKeys.onTable 

    WHERE (1=1) 
        AND o.type = 'U' 
        AND o.name NOT LIKE 'sys%' 
    )

,MyRecursion AS (

    -- base case
    SELECT  
         TableName    = OnTable
        ,Lvl        = 1
    FROM MyData
    WHERE 1=1
        AND AgainstTable IS NULL 

    -- recursive case
    UNION ALL 

    SELECT 
         TableName = OnTable 
        ,Lvl       = r.Lvl + 1 
    FROM MyData d 
        INNER JOIN MyRecursion r 
            ON d.AgainstTable = r.TableName 
)
SELECT 
     Lvl = MAX(Lvl)
    ,TableName
    --,strSql = 'delete from [' + tablename + ']'
FROM 
    MyRecursion
GROUP BY
    TableName

ORDER BY lvl

/*
ORDER BY 

     2 ASC
    ,1 ASC

*/

Using information_schema, the query looks like this:

WITH Fkeys AS 
(
    SELECT DISTINCT 
         KCU1.TABLE_NAME AS OnTable 
        ,KCU2.TABLE_NAME AS AgainstTable 
    FROM INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS RC 

    LEFT JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE KCU1 
        ON KCU1.CONSTRAINT_CATALOG = RC.CONSTRAINT_CATALOG  
        AND KCU1.CONSTRAINT_SCHEMA = RC.CONSTRAINT_SCHEMA 
        AND KCU1.CONSTRAINT_NAME = RC.CONSTRAINT_NAME 

    LEFT JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE KCU2 
        ON KCU2.CONSTRAINT_CATALOG =  RC.UNIQUE_CONSTRAINT_CATALOG  
        AND KCU2.CONSTRAINT_SCHEMA = RC.UNIQUE_CONSTRAINT_SCHEMA 
        AND KCU2.CONSTRAINT_NAME = RC.UNIQUE_CONSTRAINT_NAME 
        AND KCU2.ORDINAL_POSITION = KCU1.ORDINAL_POSITION 

    WHERE (1=1)
    AND KCU1.TABLE_NAME <> KCU2.TABLE_NAME 
)

,MyData AS 
( 
    SELECT 
         TABLE_NAME AS OnTable  
        ,FKeys.againstTable AS AgainstTable
    FROM INFORMATION_SCHEMA.TABLES 

    LEFT JOIN FKeys
        ON TABLE_NAME = FKeys.onTable  

    WHERE (1=1) 
        AND TABLE_TYPE = 'BASE TABLE'
        AND TABLE_NAME NOT IN ('sysdiagrams', 'dtproperties') 
)

,MyRecursion AS 
(
    -- base case
    SELECT  
         OnTable AS TableName 
        ,1 AS Lvl 
    FROM MyData
    WHERE 1=1
    AND AgainstTable IS NULL 

    -- recursive case
    UNION ALL 

    SELECT 
         OnTable AS TableName
        ,r.Lvl + 1 AS Lvl 
    FROM MyData d 

    INNER JOIN MyRecursion r 
        ON d.AgainstTable = r.TableName 
)

SELECT 
     MAX(Lvl) AS Lvl 
    ,TableName
    --,strSql = 'delete from [' + tablename + ']'
FROM 
    MyRecursion
GROUP BY
    TableName

ORDER BY lvl

/*
ORDER BY 

     2 ASC
    ,1 ASC

*/

My question now is:

In SQL Server (tested on 2008 R2): why does the query jump from 1 second to 11 minutes when I replace

SELECT DISTINCT 
     OnTable       = OnTable.name
    ,AgainstTable  = AgainstTable.name 
FROM sysforeignkeys fk 

    INNER JOIN sysobjects onTable 
        ON fk.fkeyid = onTable.id 

    INNER JOIN sysobjects againstTable  
        ON fk.rkeyid = againstTable.id 

WHERE 1=1
    AND AgainstTable.TYPE = 'U'
    AND OnTable.TYPE = 'U'
    -- ignore self joins; they cause an infinite recursion
    AND OnTable.Name <> AgainstTable.Name

with

SELECT DISTINCT 
     KCU1.TABLE_NAME AS OnTable 
    ,KCU2.TABLE_NAME AS AgainstTable 
FROM INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS RC 

LEFT JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE KCU1 
    ON KCU1.CONSTRAINT_CATALOG = RC.CONSTRAINT_CATALOG  
    AND KCU1.CONSTRAINT_SCHEMA = RC.CONSTRAINT_SCHEMA 
    AND KCU1.CONSTRAINT_NAME = RC.CONSTRAINT_NAME 

LEFT JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE KCU2 
    ON KCU2.CONSTRAINT_CATALOG =  RC.UNIQUE_CONSTRAINT_CATALOG  
    AND KCU2.CONSTRAINT_SCHEMA = RC.UNIQUE_CONSTRAINT_SCHEMA 
    AND KCU2.CONSTRAINT_NAME = RC.UNIQUE_CONSTRAINT_NAME 
    AND KCU2.ORDINAL_POSITION = KCU1.ORDINAL_POSITION 

WHERE (1=1)
AND KCU1.TABLE_NAME <> KCU2.TABLE_NAME 

???

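As far as I can tell, there is no significant speed difference when the two partial queries are run on their own. The result sets are also exactly the same (I checked every row in Excel), although the row order differs.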
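As an alternative to comparing rows in Excel, the two partial queries can be compared inside SQL Server with EXCEPT. This is only a sketch: the CTE names OldKeys and NewKeys and the Side column are made up for illustration, and it assumes both name columns share the database collation.

```sql
-- Sketch: list rows that appear in only one of the two partial queries.
-- OldKeys / NewKeys / Side are illustrative names, not part of the original query.
WITH OldKeys AS
(
    SELECT OnTable.name AS OnTable, AgainstTable.name AS AgainstTable
    FROM sysforeignkeys fk
    INNER JOIN sysobjects OnTable
        ON fk.fkeyid = OnTable.id
    INNER JOIN sysobjects AgainstTable
        ON fk.rkeyid = AgainstTable.id
    WHERE AgainstTable.type = 'U'
      AND OnTable.type = 'U'
      AND OnTable.name <> AgainstTable.name
)
,NewKeys AS
(
    SELECT KCU1.TABLE_NAME AS OnTable, KCU2.TABLE_NAME AS AgainstTable
    FROM INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS RC
    INNER JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE KCU1
        ON  KCU1.CONSTRAINT_CATALOG = RC.CONSTRAINT_CATALOG
        AND KCU1.CONSTRAINT_SCHEMA  = RC.CONSTRAINT_SCHEMA
        AND KCU1.CONSTRAINT_NAME    = RC.CONSTRAINT_NAME
    INNER JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE KCU2
        ON  KCU2.CONSTRAINT_CATALOG = RC.UNIQUE_CONSTRAINT_CATALOG
        AND KCU2.CONSTRAINT_SCHEMA  = RC.UNIQUE_CONSTRAINT_SCHEMA
        AND KCU2.CONSTRAINT_NAME    = RC.UNIQUE_CONSTRAINT_NAME
        AND KCU2.ORDINAL_POSITION   = KCU1.ORDINAL_POSITION
    WHERE KCU1.TABLE_NAME <> KCU2.TABLE_NAME
)
-- EXCEPT removes duplicates, so the DISTINCT of the original queries is implied.
SELECT 'only in old' AS Side, o.OnTable, o.AgainstTable
FROM (SELECT OnTable, AgainstTable FROM OldKeys
      EXCEPT
      SELECT OnTable, AgainstTable FROM NewKeys) AS o
UNION ALL
SELECT 'only in new', n.OnTable, n.AgainstTable
FROM (SELECT OnTable, AgainstTable FROM NewKeys
      EXCEPT
      SELECT OnTable, AgainstTable FROM OldKeys) AS n;
```

An empty result would indicate that both queries return the same set of (OnTable, AgainstTable) pairs.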

Below is the working PostgreSQL version (it completes in 35 milliseconds on exactly the same database contents [75 tables]...)
(provided without any guarantee)

WITH RECURSIVE Fkeys AS 
(
    SELECT DISTINCT 
         KCU1.TABLE_NAME AS OnTable 
        ,KCU2.TABLE_NAME AS AgainstTable 
    FROM INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS RC 

    LEFT JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE KCU1 
        ON KCU1.CONSTRAINT_CATALOG = RC.CONSTRAINT_CATALOG  
        AND KCU1.CONSTRAINT_SCHEMA = RC.CONSTRAINT_SCHEMA 
        AND KCU1.CONSTRAINT_NAME = RC.CONSTRAINT_NAME 

    LEFT JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE KCU2 
        ON KCU2.CONSTRAINT_CATALOG =  RC.UNIQUE_CONSTRAINT_CATALOG  
        AND KCU2.CONSTRAINT_SCHEMA = RC.UNIQUE_CONSTRAINT_SCHEMA 
        AND KCU2.CONSTRAINT_NAME = RC.UNIQUE_CONSTRAINT_NAME 
        AND KCU2.ORDINAL_POSITION = KCU1.ORDINAL_POSITION 
)

,MyData AS 
( 
    SELECT 
         TABLE_NAME AS OnTable  
        ,FKeys.againstTable AS AgainstTable
    FROM INFORMATION_SCHEMA.TABLES 

    LEFT JOIN FKeys
        ON TABLE_NAME = FKeys.onTable  

    WHERE (1=1) 
        AND TABLE_TYPE = 'BASE TABLE'
        AND TABLE_SCHEMA = 'public'
        --AND TABLE_NAME NOT IN ('sysdiagrams', 'dtproperties') 
)


,MyRecursion AS 
(
    -- base case
    SELECT  
         OnTable AS TableName 
        ,1 AS Lvl 
    FROM MyData
    WHERE 1=1
    AND AgainstTable IS NULL 

    -- recursive case
    UNION ALL 

    SELECT 
         OnTable AS TableName
        ,r.Lvl + 1 AS Lvl 
    FROM MyData d 

    INNER JOIN MyRecursion r 
        ON d.AgainstTable = r.TableName 
)

SELECT 
     MAX(Lvl) AS Lvl 
    ,TableName
    --,strSql = 'delete from [' + tablename + ']'
FROM 
    MyRecursion
GROUP BY
    TableName

ORDER BY lvl


/*
ORDER BY 

     2 ASC
    ,1 ASC

*/

It also seems that

AND KCU1.TABLE_NAME <> KCU2.TABLE_NAME

is redundant when using information_schema, so the query should actually be faster.

sql-server performance

2 Answers

  1. Best Answer
    Martin Smith
    2012-03-21T02:46:22+08:00

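    I would probably abandon the INFORMATION_SCHEMA views here and use the new sys. views (as opposed to the backward-compatibility views), or at least materialize the result of the JOIN into an indexed table first.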

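    Recursive CTEs always get the same basic plan in SQL Server, in which each row is added to a stack spool and processed one at a time. This means that the join between REFERENTIAL_CONSTRAINTS RC, KEY_COLUMN_USAGE KCU1, and KEY_COLUMN_USAGE KCU2 will be executed as many times as the number returned by SELECT COUNT(*) FROM MyRecursion.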
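    To make that row count concrete, here is a small self-contained sketch. The table variable @Deps and its five rows are hypothetical toy data standing in for MyData; the CTE has the same shape as MyRecursion in the question.

    ```sql
    -- Hypothetical toy data: one row per (table, referenced table) pair,
    -- with NULL for tables that reference nothing.
    DECLARE @Deps TABLE (OnTable sysname NOT NULL, AgainstTable sysname NULL);

    INSERT INTO @Deps (OnTable, AgainstTable) VALUES
         (N'Customer',  NULL)
        ,(N'Product',   NULL)
        ,(N'Orders',    N'Customer')
        ,(N'OrderLine', N'Orders')
        ,(N'OrderLine', N'Product');

    WITH MyRecursion AS
    (
        -- base case: tables without dependencies
        SELECT OnTable AS TableName, 1 AS Lvl
        FROM @Deps
        WHERE AgainstTable IS NULL

        UNION ALL

        -- recursive case: evaluated once for every row produced so far
        SELECT d.OnTable, r.Lvl + 1
        FROM @Deps d
        INNER JOIN MyRecursion r
            ON d.AgainstTable = r.TableName
    )
    SELECT COUNT(*)                  AS RowsProduced    -- 5 for this toy data
          ,COUNT(DISTINCT TableName) AS DistinctTables  -- 4 for this toy data
    FROM MyRecursion;
    ```

    OrderLine is reached along two paths, so the recursion produces more rows than there are tables. In a real schema with many foreign keys this gap can grow quickly, and every produced row triggers another evaluation of the recursive member.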

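    I assume that in your case (judging by the 11-minute execution time) this is probably many thousands of times, so you need the recursive part to be as efficient as possible. Your query will be doing the following kind of thing many thousands of times.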

       SELECT  
               KCU1.TABLE_CATALOG,
               KCU1.TABLE_SCHEMA,
               KCU1.TABLE_NAME
        FROM INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS RC 
        INNER JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE KCU1 
            ON KCU1.CONSTRAINT_CATALOG = RC.CONSTRAINT_CATALOG  
            AND KCU1.CONSTRAINT_SCHEMA = RC.CONSTRAINT_SCHEMA 
            AND KCU1.CONSTRAINT_NAME = RC.CONSTRAINT_NAME 
        INNER JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE KCU2 
            ON KCU2.CONSTRAINT_CATALOG =  RC.UNIQUE_CONSTRAINT_CATALOG  
            AND KCU2.CONSTRAINT_SCHEMA = RC.UNIQUE_CONSTRAINT_SCHEMA 
            AND KCU2.CONSTRAINT_NAME = RC.UNIQUE_CONSTRAINT_NAME 
            AND KCU2.ORDINAL_POSITION = KCU1.ORDINAL_POSITION 
        WHERE KCU2.TABLE_NAME = 'FOO' 
    

    (Side note: both versions of your query will return incorrect results if tables in different schemas have the same name.)
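    One way to address that side note is to carry the schema alongside the table name. The following is only a sketch of the Fkeys CTE; the column aliases OnSchema and AgainstSchema are assumptions introduced here, not part of the original query.

    ```sql
    -- Sketch: foreign key pairs identified by schema AND table name.
    -- OnSchema / AgainstSchema are illustrative aliases.
    WITH Fkeys AS
    (
        SELECT DISTINCT
             KCU1.TABLE_SCHEMA AS OnSchema
            ,KCU1.TABLE_NAME   AS OnTable
            ,KCU2.TABLE_SCHEMA AS AgainstSchema
            ,KCU2.TABLE_NAME   AS AgainstTable
        FROM INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS RC
        INNER JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE KCU1
            ON  KCU1.CONSTRAINT_CATALOG = RC.CONSTRAINT_CATALOG
            AND KCU1.CONSTRAINT_SCHEMA  = RC.CONSTRAINT_SCHEMA
            AND KCU1.CONSTRAINT_NAME    = RC.CONSTRAINT_NAME
        INNER JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE KCU2
            ON  KCU2.CONSTRAINT_CATALOG = RC.UNIQUE_CONSTRAINT_CATALOG
            AND KCU2.CONSTRAINT_SCHEMA  = RC.UNIQUE_CONSTRAINT_SCHEMA
            AND KCU2.CONSTRAINT_NAME    = RC.UNIQUE_CONSTRAINT_NAME
            AND KCU2.ORDINAL_POSITION   = KCU1.ORDINAL_POSITION
        -- Skip only true self-references: same schema AND same name.
        WHERE NOT (KCU1.TABLE_SCHEMA = KCU2.TABLE_SCHEMA
                   AND KCU1.TABLE_NAME = KCU2.TABLE_NAME)
    )
    SELECT OnSchema, OnTable, AgainstSchema, AgainstTable
    FROM Fkeys
    ORDER BY OnSchema, OnTable;
    ```

    The later CTEs would then need to join and group on both the schema and the name columns.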

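    As you can see, the plan for this INFORMATION_SCHEMA query is pretty horrific.

    (Image: plan)

    Compare this with the plan for your sys query, which is somewhat simpler.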

    SELECT OnTable = OnTable.name, 
           AgainstTable = AgainstTable.name 
    FROM   sysforeignkeys fk 
           INNER JOIN sysobjects OnTable 
             ON fk.fkeyid = OnTable.id 
           INNER JOIN sysobjects AgainstTable 
             ON fk.rkeyid = AgainstTable.id 
    WHERE  AgainstTable.name = 'FOO' 
    

    (Image: plan 2)

    You can encourage intermediate materialization, without explicitly creating a #temp table, by changing the definition of MyData to

    MyData AS 
    ( 
        SELECT TOP 99.999999 PERCENT
             TABLE_NAME AS OnTable  
            ,Fkeys.AgainstTable AS AgainstTable
        FROM INFORMATION_SCHEMA.TABLES 
    
        LEFT JOIN Fkeys
            ON TABLE_NAME = Fkeys.OnTable  
    
        WHERE (1=1) 
            AND TABLE_TYPE = 'BASE TABLE'
            AND TABLE_NAME NOT IN ('sysdiagrams', 'dtproperties') 
            ORDER BY TABLE_NAME
    )
    

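    Testing this on my machine against Adventureworks2008, it brought the run time down from about 10 seconds to 250 milliseconds (after the first run, because the plan takes 2 seconds to compile). It adds an eager spool to the plan that materializes the result of the join on the first recursive call and then replays it on subsequent calls. This behavior is not guaranteed, however, and you might want to upvote the Connect item requesting a hint to force intermediate materialization of CTEs or derived tables.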
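    Timings like these can be reproduced with the session statistics switches. The sketch below wraps a stand-in query; the Numbers CTE is a hypothetical placeholder for the real dependency query, not something from the question.

    ```sql
    -- Report compile/execution times and logical reads for each statement.
    SET STATISTICS TIME ON;
    SET STATISTICS IO ON;

    -- Placeholder workload: swap in the dependency query being measured.
    WITH Numbers AS
    (
        SELECT 1 AS n
        UNION ALL
        SELECT n + 1 FROM Numbers WHERE n < 50  -- stays below the default MAXRECURSION of 100
    )
    SELECT COUNT(*) AS RowsProduced
    FROM Numbers;

    SET STATISTICS IO OFF;
    SET STATISTICS TIME OFF;
    ```

    The Messages tab then shows parse and compile time separately from execution time, which helps tell the one-off compile cost apart from the cost of each run.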

    I would feel safer explicitly creating the #temp table, as below, rather than relying on this behavior.

    CREATE TABLE #MyData
    (
    OnTable SYSNAME,
    AgainstTable NVARCHAR(128) NULL,
    UNIQUE CLUSTERED (AgainstTable, OnTable)
    );
    
    WITH Fkeys AS 
    (
        SELECT DISTINCT 
             KCU1.TABLE_NAME AS OnTable 
            ,KCU2.TABLE_NAME AS AgainstTable 
        FROM INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS RC 
    
        LEFT JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE KCU1 
            ON KCU1.CONSTRAINT_CATALOG = RC.CONSTRAINT_CATALOG  
            AND KCU1.CONSTRAINT_SCHEMA = RC.CONSTRAINT_SCHEMA 
            AND KCU1.CONSTRAINT_NAME = RC.CONSTRAINT_NAME 
    
        LEFT JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE KCU2 
            ON KCU2.CONSTRAINT_CATALOG =  RC.UNIQUE_CONSTRAINT_CATALOG  
            AND KCU2.CONSTRAINT_SCHEMA = RC.UNIQUE_CONSTRAINT_SCHEMA 
            AND KCU2.CONSTRAINT_NAME = RC.UNIQUE_CONSTRAINT_NAME 
            AND KCU2.ORDINAL_POSITION = KCU1.ORDINAL_POSITION 
    
        WHERE (1=1)
        AND KCU1.TABLE_NAME <> KCU2.TABLE_NAME 
    )
    
    ,MyData AS 
    ( 
        SELECT 
             TABLE_NAME AS OnTable  
            ,Fkeys.AgainstTable AS AgainstTable
        FROM INFORMATION_SCHEMA.TABLES 
    
        LEFT JOIN Fkeys
            ON TABLE_NAME = Fkeys.OnTable  
    
        WHERE (1=1) 
            AND TABLE_TYPE = 'BASE TABLE'
            AND TABLE_NAME NOT IN ('sysdiagrams', 'dtproperties') 
    )
    INSERT INTO #MyData
    SELECT *
    FROM MyData;
    
    
    WITH MyRecursion AS 
    (
        -- base case
        SELECT  
             OnTable AS TableName 
            ,1 AS Lvl 
        FROM #MyData
        WHERE 1=1
        AND AgainstTable IS NULL 
    
        -- recursive case
        UNION ALL 
    
        SELECT 
             OnTable AS TableName
            ,r.Lvl + 1 AS Lvl 
        FROM #MyData d 
    
        INNER JOIN MyRecursion r 
            ON d.AgainstTable = r.TableName 
    )
    
    SELECT 
         MAX(Lvl) AS Lvl 
        ,TableName
        --,strSql = 'delete from [' + tablename + ']'
    FROM 
        MyRecursion
    GROUP BY
        TableName
    
    ORDER BY Lvl
    
    DROP TABLE #MyData
    

    Or

    • 12
  2. Jānis
    2012-03-20T03:09:56+08:00

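    In both cases you are querying views that are kept for compatibility: Compatibility Views and Information Schema Views.

    Use catalog views instead for the best performance (MSDN: "We recommend that you use catalog views because they are the most general interface to the catalog metadata and provide the most efficient way to obtain, transform, and present customized forms of this information").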
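    As a rough sketch of what that could look like for the foreign key part of the question, the query below uses the sys.foreign_keys and sys.tables catalog views. It is an illustration only, not a tested drop-in replacement for the Fkeys CTE.

    ```sql
    -- Sketch: (referencing table, referenced table) pairs from catalog views.
    -- sys.tables contains only user tables, so no explicit type = 'U' filter is needed.
    SELECT DISTINCT
         OnTable.name      AS OnTable
        ,AgainstTable.name AS AgainstTable
    FROM sys.foreign_keys fk
    INNER JOIN sys.tables OnTable
        ON fk.parent_object_id = OnTable.object_id
    INNER JOIN sys.tables AgainstTable
        ON fk.referenced_object_id = AgainstTable.object_id
    -- ignore self-references; they would cause infinite recursion later on
    WHERE fk.parent_object_id <> fk.referenced_object_id;
    ```

    Comparing object ids instead of names for the self-reference check also avoids treating same-named tables in different schemas as one table.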

    • 2
