SQL Server - 使用聚集索引时如何存储数据页

Question

Riley Major

Asked: 2017-09-20 09:04:41 +0800 CST2017-09-20 09:04:41 +0800 CST 2017-09-20 09:04:41 +0800 CST

为什么在 SQL Server 2016 上使用 XML 的函数编译时间长？

772

我们有将数据收集到 XML 中的内联函数，将派生的 XML 传递给其他函数，然后将其分解并重组为字符串。

（你的“你不应该在 T-SQL 中做那种事情”是另一天的讨论。）

这在 2005 和 2008 R2 中运行良好多年。我们现在升级到 2016 SP1。使用这些函数的查询在我们的生产服务器上运行不到一秒，现在在 2016 SP1 中运行得更快。那太好了，但是在 2016 SP1 上编译需要一个小时。严重地：

生产环境：

（它们都来自 SQL Sentry Plan Explorer。）

我们已经尝试使用 2008 (100) 和 2016 (130) 兼容级别的数据库（但我们还没有使用“Legacy Cardinality Estimation”和“Query Optimizer Fixes”设置，它们目前都是“OFF”）。我们试过使用QUERYTRACEON 9481，似乎没有效果。

同样，这与最终计划无关，因为它们都在很短的时间内运行。大约是制定计划所需的时间。

我们已经设法通过一组简化的代码在一定程度上重现了这个问题。从以下示例调用顶级函数的语句在 SQL Server 2016 (SP1-CU5) 上编译需要 30-60 秒，但在 SQL Server 2008 R2 (SP3) 上编译和运行只需不到一秒。

例子

/*

Create and populate table...

*/

CREATE TABLE TestXMLStuff (OrderID int, ProdLength int, ProdWidth int, ProdHeight int);
INSERT INTO TestXMLStuff (OrderID, ProdLength, ProdWidth, ProdHeight) VALUES
    (1, 10, 15, 20),
    (1, 15, 20, 25),
    (2, 20, 25, 30),
    (2, 25, 30, 35),
    (2, 30, 35, 40);
GO

/*

Function which accepts XML, shreds it and reforms it as a string...

*/


CREATE FUNCTION TestCalc
(   
    @T varchar(8000),
    @X xml
)
RETURNS TABLE 
AS
RETURN 
    WITH p AS
    (
        SELECT  
            LF = CHAR(13) + CHAR(10),
            Tab = CHAR(9),
            T = isNull(@T,'')
    ), pid AS
    (
        SELECT
            isNull(ProdInfoXMLTable.ProdInfoXML.query('(/ProdInfo)').value('(.)[1]','varchar(max)'),'') AS ProdInfoText
        FROM        (
                        SELECT
                            ProdInfoXML =
                                (
                                    SELECT
                                        ProdInfo = 
                                            CASE WHEN Products.ProdNum > 1 THEN '--' + p.LF ELSE '' END +
                                            'Product Number: ' + CONVERT(varchar(50),Products.ProdNum) + p.LF +
                                                CASE WHEN Products.ProdLength       = '' THEN '' ELSE p.Tab + 'Length: '                + Products.ProdLength       + p.LF END +
                                                CASE WHEN Products.ProdWidth        = '' THEN '' ELSE p.Tab + 'Width: '                 + Products.ProdHeight       + p.LF END +
                                                CASE WHEN Products.ProdHeight       = '' THEN '' ELSE p.Tab + 'Height: '                + Products.ProdHeight       + p.LF END
                                    FROM        (
                                                    SELECT
                                                        ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS ProdNum,
                                                        isNull(P.X.value('(./Length)[1]','varchar(500)'),'') AS ProdLength,
                                                        isNull(P.X.value('(./Width)[1]','varchar(500)'),'') AS ProdWidth,
                                                        isNull(P.X.value('(./Height)[1]','varchar(500)'),'') AS ProdHeight
                                                    FROM        @x.nodes('/Products/Product') AS P(X)
                                                ) AS Products
                                    CROSS JOIN  p
                                    FOR XML PATH(''), TYPE
                                )
                    ) AS ProdInfoXMLTable
    )
    SELECT
        Final = p.T + p.LF + p.LF + pid.ProdInfoText
    FROM        p
    CROSS JOIN  pid;

GO

/*

Function to create XML in the format required for TestCalc...

*/

CREATE FUNCTION TestGetXML
(   
    @N int
)
RETURNS TABLE 
AS
RETURN 
    WITH p AS
    (
        SELECT  
            N = isNull(@N,0)
    )
    SELECT
        ProdInfoXML =
            (
                SELECT
                    [Length] = ProdData.ProdLength,
                    [Width] = ProdData.ProdWidth,
                    [Height] = ProdData.ProdHeight
                FROM        TestXMLStuff ProdData
                WHERE       ProdData.OrderID = @N
                FOR XML PATH('Product'), ROOT('Products'), TYPE
            );
GO

/*

Function to join the other two functions, gathering the XML and feeding it to the string creator which shreds and reforms it...

*/

CREATE FUNCTION TestGetFromTableUsingFunc
(   
    @N int
)
RETURNS TABLE 
AS
RETURN 
    WITH p AS
    (
        SELECT  
            N = isNull(@N,0)
    )
    SELECT
        FinalResult = 'This is a ' + TestCalcResults.Final
    FROM        p
    CROSS APPLY TestGetXML
                (
                    p.N
                ) AS x
    CROSS APPLY TestCalc
                (
                    'test',
                    x.ProdInfoXML
                ) AS TestCalcResults;
GO

/*

Code to call the function. This is what takes around 60 seconds to compile on our 2016 system but basically no time on the 2008 R2 system.

*/

SELECT      *
FROM        TestXMLStuff
CROSS APPLY TestGetFromTableUsingFunc
            (
                OrderID
            )
OPTION      (RECOMPILE);
GO

@@version编译不是问题的生产：

Microsoft SQL Server 2008 R2 (SP3) - 10.50.6000.34 (X64)

测试@@version编译需要“永远”的地方：

Microsoft SQL Server 2016 (SP1-CU5) (KB4040714) - 13.0.4451.0 (X64)

问题

为什么从 2008 R2 到 2016 年的编译时间会如此下降？
这个答案是否揭示了一个简单的解决方案来解决这个难题，而不是改变所有这些工作的方式？（我希望有神奇的跟踪标志或即将推出的 Microsoft 更新。）

（如果这是 SQL Server 2017，我会使用 JSON 来收集和传递数据，这似乎更快且开销更低，然后我会使用 JSON 函数来切碎并重组STRING_AGG为文本。但是，唉，那还不是可用的。）

根据Joe Obbish的提示，我使用以下代码收集了使用跟踪标志 8675时的结果：

DBCC TRACEON(3604)
SELECT      *
FROM        TestXMLStuff
CROSS APPLY TestGetFromTableUsingFunc
            (
                OrderID
            )
OPTION      (RECOMPILE, QUERYTRACEON 8675);
DBCC TRACEOFF(3604)

在 2008 R2 实例上，它花费了不到一秒的时间，并产生了这个：

DBCC execution completed. If DBCC printed error messages, contact your system administrator. End of simplification, time: 0.008 net: 0.008 total: 0.008 net: 0.008

end exploration, tasks: 597 no total cost time: 0.005 net: 0.005 total: 0.014 net: 0.014

end search(0),  cost: 2071.66 tasks: 2267 time: 0.005 net: 0.005 total: 0.02 net: 0.02

end exploration, tasks: 2703 Cost = 2071.66 time: 0.002 net: 0.002 total: 0.022 net: 0.022

end search(1),  cost: 1731.11 tasks: 3362 time: 0.004 net: 0.004 total: 0.026 net: 0.026

end exploration, tasks: 3363 Cost = 1731.11 time: 0 net: 0 total:
0.026 net: 0.026

end search(1),  cost: 1731.11 tasks: 3382 time: 0 net: 0 total: 0.026 net: 0.026

end exploration, tasks: 3413 Cost = 1731.11 time: 0 net: 0 total:
0.027 net: 0.027

end search(2),  cost: 1731.11 tasks: 3515 time: 0 net: 0 total: 0.027 net: 0.027

End of post optimization rewrite, time: 0.001 net: 0.001 total: 0.029 net: 0.029

End of query plan compilation, time: 0.001 net: 0.001 total: 0.03 net:
0.03


(5 row(s) affected) DBCC execution completed. If DBCC printed error messages, contact your system administrator.

在 2016 SP1-CU5 实例上，它花费了 1 分 11 秒并产生了这个：

DBCC execution completed. If DBCC printed error messages, contact your system administrator. End of simplification, time: 0.004 net: 0.004 total: 0.004 net: 0.004

end exploration, tasks: 612 no total cost time: 0.003 net: 0.003 total: 0.008 net: 0.008

end exploration, tasks: 613 no total cost time: 0 net: 0 total: 0.008 net: 0.008

end exploration, tasks: 2305 no total cost time: 0.002 net: 0.002 total: 0.011 net: 0.011

end exploration, tasks: 2306 no total cost time: 0 net: 0 total: 0.011 net: 0.011

end search(0),  cost: 4402.32 tasks: 2306 time: 0 net: 0 total: 0.011 net: 0.011

end exploration, tasks: 2738 Cost = 4402.32 time: 0.001 net: 0.001 total: 0.013 net: 0.013

end exploration, tasks: 2739 Cost = 4402.32 time: 0 net: 0 total:
0.013 net: 0.013

end exploration, tasks: 3466 Cost = 4402.32 time: 0.002 net: 0.002 total: 0.015 net: 0.015

end exploration, tasks: 3467 Cost = 4402.32 time: 0 net: 0 total:
0.015 net: 0.015

end search(1),  cost: 3938.19 tasks: 3467 time: 0 net: 0 total: 0.015 net: 0.015

end exploration, tasks: 3468 Cost = 3938.19 time: 0 net: 0 total:
0.015 net: 0.015

end exploration, tasks: 3469 Cost = 3938.19 time: 0 net: 0 total:
0.015 net: 0.015

end exploration, tasks: 3489 Cost = 3938.19 time: 0 net: 0 total:
0.015 net: 0.015

end exploration, tasks: 3490 Cost = 3938.19 time: 0 net: 0 total:
0.015 net: 0.015

end search(1),  cost: 3938.19 tasks: 3490 time: 0 net: 0 total: 0.015 net: 0.015

end exploration, tasks: 3521 Cost = 3938.19 time: 0 net: 0 total:
0.015 net: 0.015

end exploration, tasks: 3522 Cost = 3938.19 time: 0 net: 0 total:
0.015 net: 0.015

end exploration, tasks: 3625 Cost = 3938.19 time: 0 net: 0 total:
0.016 net: 0.016

end exploration, tasks: 3626 Cost = 3938.19 time: 0 net: 0 total:
0.016 net: 0.016

end search(2),  cost: 3938.19 tasks: 3626 time: 0 net: 0 total: 0.016 net: 0.016

End of post optimization rewrite, time: 0 net: 0 total: 0.016 net:
0.016

End of query plan compilation, time: 0.001 net: 0.001 total: 0.018 net: 0.018


(5 row(s) affected) DBCC execution completed. If DBCC printed error messages, contact your system administrator.

尽管发生了更多变化，但看起来经过的时间仅为 0.018（秒？），小于 2008 R2 上的 0.03。所以编译/优化/执行的这个阶段一定不会花这么长时间。但肯定是。

我们最终的 2016 生产实例将具有与当前生产的 2008 R2 相同的“硬件”。Test/Dev 2016 实例具有不同的规格，因此这些比较不是同类比较。产品是 78 演出。开发是 16 演出。但我在另一个 20 gig 的 2008 R2 盒子上测试过，速度很快。此外，我们还在谈论索引良好的少量数据。在编译过程中，IO 很少，但 CPU 很多。

我可以看到统计数据是影响真实（大）表的真实函数的问题，但在我人为的简化示例中，需要 1 分钟以上的时间才能对5个充满整数的行执行一些简单的 XML/文本操作。我可以更快地输入结果。:) 我相信我们在生产中有自动统计，而且我没有看到其他看似与统计相关的性能问题。另一个非生产 2008 R2 开发/测试环境（具有与 2016 开发/测试类似的陈旧生产副本）是 lickety-split。

2 个回答

Voted

Joe Obbish · Answer 1 · 2018-01-05T08:29:45+08:00

Best Answer

Joe Obbish

2018-01-05T08:29:45+08:002018-01-05T08:29:45+08:00

SQL Server 2016 SP1 的 CU7可能会解决你的问题。相关的知识库文章似乎是KB 4056955 - FIX: Queries that casts string or binary data to XML take a long time to compile in SQL Server 2016。

我在 SQL Server 2016 SP1 CU6 中运行了您的重现代码，并获得了以下每个编译时间SET STATISTICS TIME ON：

SQL Server 解析和编译时间：CPU 时间 = 24437 毫秒，运行时间 = 24600 毫秒。

SQL Server 解析和编译时间：CPU 时间 = 48968 毫秒，运行时间 = 49451 毫秒。

以下是升级到 SP1 CU7 后编译时间的变化：

SQL Server 解析和编译时间：CPU 时间 = 16 毫秒，运行时间 = 16 毫秒。

SQL Server 解析和编译时间：CPU 时间 = 16 毫秒，运行时间 = 17 毫秒。

SQL Server 解析和编译时间：CPU 时间 = 16 毫秒，运行时间 = 17 毫秒。

SQL Server 解析和编译时间：CPU 时间 = 44 毫秒，运行时间 = 44 毫秒。

它快了大约 750 倍。

7

Riley Major · Answer 2 · 2017-09-20T09:04:41+08:00

我无法回答为什么 2016 年需要这么多时间，但解决方法似乎是隐藏优化器的一些操作。如果通过将其置于标量函数后面来屏蔽流程的 XML 构造部分，则编译时间会下降到大约一秒。它仍然比 2008 R2 慢很多，但可能已经足够快了。

/*

Add this scalar function as an opaque wrapper around the (transparent) inline XML gathering function...

*/

CREATE FUNCTION TestGetXMLScalar
(
    @N int
)
RETURNS xml
AS
BEGIN
    RETURN
    (
        SELECT
            ProdInfoXML
        FROM        TestGetXML
                    (   
                            @N
                    )
    );

END
GO

/*

Then, use the scalar function when orchestrating everything...

*/

CREATE FUNCTION TestGetFromTableUsingFuncScalar
(   
    @N int
)
RETURNS TABLE 
AS
RETURN 
    WITH p AS
    (
        SELECT  
            N = isNull(@N,0)
    )
    SELECT
        FinalResult = 'This is a ' + TestCalcResults.Final
    FROM        p
    CROSS APPLY TestCalc
                (
                    'test',
                    dbo.TestGetXMLScalar(@N)
                ) AS TestCalcResults;
GO

/*

Finally, try out the new function...

*/

SELECT      *
FROM        TestXMLStuff
CROSS APPLY TestGetFromTableUsingFuncScalar
            (
                OrderID
            )
OPTION      (RECOMPILE);
GO

为什么在 SQL Server 2016 上使用 XML 的函数编译时间长？

例子

问题

连接到 PostgreSQL 服务器：致命：主机没有 pg_hba.conf 条目

如何让sqlplus的输出出现在一行中？

选择具有最大日期或最晚日期的日期

如何列出 PostgreSQL 中的所有模式？

列出指定表的所有列

如何在不修改我自己的 tnsnames.ora 的情况下使用 sqlplus 连接到位于另一台主机上的 Oracle 数据库

你如何mysqldump特定的表？

使用 psql 列出数据库权限

如何从 PostgreSQL 中的选择查询中将值插入表中？

如何使用 psql 列出所有数据库和表？

为什么在 SQL Server 2016 上使用 XML 的函数编译时间长？

例子

问题

2 个回答

相关问题