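I have a query that populates an aggregate table for reporting. It was written by another developer at my company, but it is my job to make it run fast. So far, all of my early attempts have failed. I have tried several things with this query, and below is where it currently stands. I have shaved about half an hour off the load time, but I'm stuck, and I think I may just need to redo the whole thing. I'm hoping someone here can spot what I'm missing and give me some pointers on how to fix this query.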
SELECT P.CompanyID,
P.CompanyName,
P.StoreID,
P.StoreName,
P.ReportDate,
Isnull((SELECT Sum(FT.GrossSales - Isnull(FP.PaymentAmount, 0)) AS PullNet
FROM FactSalesTransaction AS FT
LEFT JOIN (SELECT TransactionID,
DimStoreID,
DimBusinessDateID,
Sum(PaymentAmount) AS PaymentAmount
FROM FactSalesPayment
WHERE DimPaymentTypeID <> 2
AND ModStatusFlg <> 'D'
GROUP BY TransactionID,
DimStoreID,
DimBusinessDateID) AS FP
ON FP.TransactionID = FT.TransactionID
AND FP.DimStoreID = FT.DimStoreID
INNER JOIN DimCalendar AS C
ON FT.DimBusinessDateID = C.DimCalendarID
AND FP.DimBusinessDateID = C.DimCalendarID AND C.CalendarDate >= '12/4/2012'
WHERE FT.DimStoreID = P.DimStoreID
AND FT.DimBusinessDateID = P.DimBusinessDateID
AND FT.ModStatusFlg <> 'D'), 0) AS StoreCash,
SR.CashDeposit AS StoreResp,
SN.StoreNet,
P.DimEmployeeID AS EmpID,
P.EmpName,
P.RegisterID,
P.PullNumber,
Isnull((SELECT Sum(FT.GrossSales - Isnull(FP.PaymentAmount, 0)) AS PullNet
FROM FactSalesTransaction AS FT
LEFT JOIN (SELECT TransactionID,
DimStoreID,
DimBusinessDateID,
Sum(PaymentAmount) AS PaymentAmount
FROM FactSalesPayment
WHERE DimPaymentTypeID <> 2
AND ModStatusFlg <> 'D'
GROUP BY TransactionID,
DimStoreID,
DimBusinessDateID) AS FP
ON FP.TransactionID = FT.TransactionID
AND FP.DimStoreID = FT.DimStoreID
INNER JOIN DimCalendar AS C
ON FT.DimBusinessDateID = C.DimCalendarID
AND FP.DimBusinessDateID = C.DimCalendarID AND C.CalendarDate >= '12/4/2012'
WHERE FT.DimStoreID = P.DimStoreID
AND FT.DimRegisterID = P.DimRegisterID
AND FT.TransactionDateTime BETWEEN P.PullDrawerStartTime AND P.PullDrawerEndTime
AND FT.ModStatusFlg <> 'D'), 0) AS PullCash,
P.PullResp + Isnull((SELECT Sum(SkimAmount)
FROM FactSkims
WHERE DimStoreID = P.DimStoreID
AND DimRegisterID = P.DimRegisterID
AND SkimDateTime BETWEEN P.PullDrawerStartTime AND P.PullDrawerEndTime), 0) AS PullResp,
Isnull((SELECT Sum(NetSales) AS PullNet
FROM FactSalesTransaction AS FT
INNER JOIN DimCalendar AS C
ON FT.DimBusinessDateID = C.DimCalendarID AND C.CalendarDate >= '12/4/2012'
WHERE DimStoreID = P.DimStoreID
AND DimRegisterID = P.DimRegisterID
AND TransactionDateTime BETWEEN P.PullDrawerStartTime AND P.PullDrawerEndTime
AND ModStatusFlg <> 'D'), 0) AS PullNet
FROM (SELECT C.CompanyID,
C.CompanyName,
S.StoreID,
S.StoreName,
F.DimEmployeeID,
E.FirstName + ' ' + E.LastName AS EmpName,
CASE
WHEN F.PullDrawerStartTime <> '1900-01-01' THEN F.PullDrawerStartTime
ELSE Isnull(Cast((SELECT TOP 1 Dateadd(SECOND, 1, PullDrawerEndTime)
FROM FactPullDrawer
WHERE PullDrawerEndTime < F.PullDrawerEndTime
AND DimStoreID = F.DimStoreID
AND DimRegisterID = F.DimRegisterID
AND DimBusinessDateID = F.DimBusinessDateID
ORDER BY PullDrawerEndTime DESC) AS DATETIME), BD.CalendarDate + Isnull(Cast(Cast(ST.SiteSettingValue AS TIME) AS DATETIME), Cast('4:00:00 AM' AS DATETIME)))
END AS PullDrawerStartTime,
F.PullDrawerEndTime,
BD.CalendarDate AS ReportDate,
R.RegisterID,
R.DimRegisterID,
(SELECT Count(PullDrawerEndTime)
FROM FactPullDrawer
WHERE PullDrawerEndTime < F.PullDrawerEndTime
AND DimStoreID = F.DimStoreID
AND DimRegisterID = F.DimRegisterID
AND DimBusinessDateID = F.DimBusinessDateID) + 1 AS PullNumber,
Isnull(F.Amount, 0) AS PullResp,
F.DimStoreID,
F.DimBusinessDateID
FROM FactPullDrawer AS F
INNER JOIN DimCompany AS C
ON C.DimCompanyID = F.DimCompanyID
INNER JOIN DimStore AS S
ON S.DimStoreID = F.DimStoreID
INNER JOIN DimCalendar AS BD
ON BD.DimCalendarID = F.DimBusinessDateID
AND BD.CalendarDate >= '12/4/2012'
INNER JOIN DimEmployee AS E
ON F.DimEmployeeID = E.DimEmployeeID
INNER JOIN DimRegister AS R
ON R.DimRegisterID = F.DimRegisterID
LEFT JOIN DimSiteSettings AS ST
ON S.StoreID = ST.StoreID
AND C.CompanyID = ST.CompanyID
AND ST.SiteSettingFieldID = 1412) AS P
INNER JOIN (SELECT DimStoreID,
DimBusinessDateID,
Sum(NetSales) AS StoreNet
FROM FactSalesTransaction
WHERE ModStatusFlg <> 'D'
GROUP BY DimStoreID,
DimBusinessDateID) AS SN
ON SN.DimStoreID = P.DimStoreID
AND SN.DimBusinessDateID = P.DimBusinessDateID
INNER JOIN (SELECT CompanyID,
StoreID,
ReportDate,
Sum(ValTotal) AS CashDeposit
FROM AgtAccountingReport
WHERE ReportCatOrder = 7
AND ReportElementOrder < 100
AND ReportElementOrder NOT IN ( 7, 9, 10, 16,17, 18, 19, 20, 21 )
AND ReportDate >= '10/28/2012'
GROUP BY CompanyID,
StoreID,
ReportDate) AS SR
ON SR.CompanyID = P.CompanyID
AND SR.StoreID = P.StoreID
AND SR.ReportDate = P.ReportDate
I suspect the problem is all the nested SELECTs, which is why I'm thinking of starting from scratch. Any help would be greatly appreciated.
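OK, here is what I actually did to get this query, which used to run in about an hour and a half, to run in under a minute. First, I did some more digging to understand exactly what it was doing. The outermost part of the query contains several major subselects. They looked like this.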
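For reference, the pattern can be sketched from the StoreCash column of the query in the question. This is a trimmed illustration rather than the exact snippet: FactPullDrawer stands in for the derived table P (an assumption, made only so the statement is complete on its own), and the other columns are omitted.

```sql
-- Correlated subselect: the inner query refers to the outer row (P),
-- so conceptually it is evaluated once for every row the outer query returns.
SELECT P.DimStoreID,
       P.DimBusinessDateID,
       Isnull((SELECT Sum(FT.GrossSales - Isnull(FP.PaymentAmount, 0))
               FROM   FactSalesTransaction AS FT
                      LEFT JOIN (SELECT TransactionID,
                                        DimStoreID,
                                        DimBusinessDateID,
                                        Sum(PaymentAmount) AS PaymentAmount
                                 FROM   FactSalesPayment
                                 WHERE  DimPaymentTypeID <> 2
                                        AND ModStatusFlg <> 'D'
                                 GROUP  BY TransactionID,
                                           DimStoreID,
                                           DimBusinessDateID) AS FP
                             ON FP.TransactionID = FT.TransactionID
                                AND FP.DimStoreID = FT.DimStoreID
                      INNER JOIN DimCalendar AS C
                              ON FT.DimBusinessDateID = C.DimCalendarID
                                 AND FP.DimBusinessDateID = C.DimCalendarID
                                 AND C.CalendarDate >= '12/4/2012'
               WHERE  FT.DimStoreID = P.DimStoreID               -- correlation to the outer row
                      AND FT.DimBusinessDateID = P.DimBusinessDateID
                      AND FT.ModStatusFlg <> 'D'), 0) AS StoreCash
FROM   FactPullDrawer AS P; -- stand-in for the derived table P (assumption)
```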
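These select statements join my two largest tables, FactSalesTransaction (105 million records) and FactSalesPayment (102 million records), and they do it in a select within a select. That essentially means the subquery is executed for every row returned. The query normally runs over about 7 days of data, which returns roughly 19,000 records, so the 3 subselects against these massive tables each need to execute 19,000 times. Bingo, I think I've found where my performance loss is. So I switched these subqueries to left joins. Nothing complicated, just enough that they only need to be joined once. The replacement left join looks like this.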
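A minimal sketch of what such a replacement could look like for the StoreCash column, assuming the inner joins are kept exactly as in the question and only the correlation is turned into a GROUP BY plus a join. The alias SC is hypothetical, and FactPullDrawer again stands in for the derived table P.

```sql
-- The aggregate is computed once per store and business date in a derived
-- table, then joined to the outer rows, instead of being re-evaluated per row.
SELECT P.DimStoreID,
       P.DimBusinessDateID,
       Isnull(SC.StoreCash, 0) AS StoreCash
FROM   FactPullDrawer AS P -- stand-in for the derived table P (assumption)
       LEFT JOIN (SELECT FT.DimStoreID,
                         FT.DimBusinessDateID,
                         Sum(FT.GrossSales - Isnull(FP.PaymentAmount, 0)) AS StoreCash
                  FROM   FactSalesTransaction AS FT
                         LEFT JOIN (SELECT TransactionID,
                                           DimStoreID,
                                           DimBusinessDateID,
                                           Sum(PaymentAmount) AS PaymentAmount
                                    FROM   FactSalesPayment
                                    WHERE  DimPaymentTypeID <> 2
                                           AND ModStatusFlg <> 'D'
                                    GROUP  BY TransactionID,
                                              DimStoreID,
                                              DimBusinessDateID) AS FP
                                ON FP.TransactionID = FT.TransactionID
                                   AND FP.DimStoreID = FT.DimStoreID
                         INNER JOIN DimCalendar AS C
                                 ON FT.DimBusinessDateID = C.DimCalendarID
                                    AND FP.DimBusinessDateID = C.DimCalendarID
                                    AND C.CalendarDate >= '12/4/2012'
                  WHERE  FT.ModStatusFlg <> 'D'
                  GROUP  BY FT.DimStoreID,
                            FT.DimBusinessDateID) AS SC -- hypothetical alias
              ON SC.DimStoreID = P.DimStoreID
                 AND SC.DimBusinessDateID = P.DimBusinessDateID;
```

The PullCash and PullNet columns are correlated on a time window (TransactionDateTime BETWEEN PullDrawerStartTime AND PullDrawerEndTime) rather than on equality, so they cannot be pre-aggregated by key in quite the same way; they would need a join on the range condition followed by a GROUP BY on the pull.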
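As you can see, I didn't change much in the select itself; I only changed it so that it doesn't run 19,000 times. The next thing I did was turn the query into a stored procedure that takes a date range from the user or, in this case, from the ETL process (usually the last 7 days) and looks up the DimCalendarID values for those dates in DimCalendar. That way the query filters on integers instead of datetimes, and there are fewer records to join overall. The final query ends up looking like this.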
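The idea can be sketched roughly as follows. The procedure name, parameter names, and variable names are hypothetical, and the sketch assumes that DimCalendarID is an integer key that increases with CalendarDate, so that a date range maps to a contiguous ID range. Only the StoreNet aggregate is shown; the rest of the query would be filtered the same way.

```sql
-- Hypothetical procedure: resolve the date range to calendar keys once,
-- then filter the fact table on the integer key instead of joining on dates.
CREATE PROCEDURE dbo.usp_LoadPullDrawerAggregate -- hypothetical name
    @StartDate DATE,
    @EndDate   DATE
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @StartDateID INT,
            @EndDateID   INT;

    -- Assumes DimCalendarID increases with CalendarDate.
    SELECT @StartDateID = Min(DimCalendarID),
           @EndDateID = Max(DimCalendarID)
    FROM   DimCalendar
    WHERE  CalendarDate >= @StartDate
           AND CalendarDate <= @EndDate;

    SELECT FT.DimStoreID,
           FT.DimBusinessDateID,
           Sum(FT.NetSales) AS StoreNet
    FROM   FactSalesTransaction AS FT
    WHERE  FT.DimBusinessDateID BETWEEN @StartDateID AND @EndDateID
           AND FT.ModStatusFlg <> 'D'
    GROUP  BY FT.DimStoreID,
              FT.DimBusinessDateID;
END;
```

It could then be called from the ETL job with something like `EXEC dbo.usp_LoadPullDrawerAggregate @StartDate = '20121204', @EndDate = '20121210';`.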
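After testing whether the speed increase was worth the index space, I added two nonclustered indexes on ModStatusFlg and DimBusinessDateID, with the other columns this query requests as included columns, on FactSalesTransaction and FactSalesPayment. There is probably more I could do to clean it up and make it run faster, but the gains would be minimal and there are bigger fish to fry right now. Long story short: be careful with your subselect statements.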
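As a rough illustration of that kind of covering index: the index names, the key column order, and the INCLUDE lists below are assumptions based on the columns the question's query touches, not the exact definitions that were used.

```sql
-- Hypothetical covering indexes: key on the filter columns, and INCLUDE the
-- other columns the query reads so it can be answered from the index alone.
CREATE NONCLUSTERED INDEX IX_FactSalesTransaction_BusinessDate_ModStatus
    ON FactSalesTransaction (DimBusinessDateID, ModStatusFlg)
    INCLUDE (DimStoreID, DimRegisterID, TransactionID, TransactionDateTime, GrossSales, NetSales);

CREATE NONCLUSTERED INDEX IX_FactSalesPayment_BusinessDate_ModStatus
    ON FactSalesPayment (DimBusinessDateID, ModStatusFlg)
    INCLUDE (DimStoreID, TransactionID, DimPaymentTypeID, PaymentAmount);
```

On tables of this size the included columns make the indexes fairly large, which is the space trade-off mentioned above.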
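I strongly recommend temp tables and indexes. Basically, I would recommend the following.
Use temp tables
Temp tables can greatly improve query performance. Filter on int columns wherever you use a WHERE clause against large tables or in subqueries.
Create indexes
Especially on the columns you filter and join on. It will make things much faster.
Run the subqueries individually
Find out which one is relatively slow.
Join the subqueries and check performance as you join
Find out which JOIN is causing the problem. It usually comes down to a single JOIN that takes all the time. Also, combining AND conditions with a JOIN can take a long time on larger tables. For example, you may need to rewrite this part of your query as follows.
Once you have found the JOIN that is hurting performance
Use a temp table there to optimize. Temp tables work best when you only need a small slice of a large table, for example 3k records out of a 100k-row table.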
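The tips above could be combined along these lines. This is a sketch only: the temp table name #Payments and the index name are hypothetical, and the final SELECT is a simplified per-store, per-day total rather than a drop-in replacement for the columns in the question.

```sql
-- Report elapsed and CPU time per statement, to see which step is slow.
SET STATISTICS TIME ON;

-- Materialize only the slice of the large table that is needed.
SELECT FP.TransactionID,
       FP.DimStoreID,
       FP.DimBusinessDateID,
       Sum(FP.PaymentAmount) AS PaymentAmount
INTO   #Payments -- hypothetical temp table
FROM   FactSalesPayment AS FP
       INNER JOIN DimCalendar AS C
               ON C.DimCalendarID = FP.DimBusinessDateID
WHERE  C.CalendarDate >= '12/4/2012'
       AND FP.DimPaymentTypeID <> 2
       AND FP.ModStatusFlg <> 'D'
GROUP  BY FP.TransactionID,
          FP.DimStoreID,
          FP.DimBusinessDateID;

-- Index the temp table on the columns used in the join.
CREATE CLUSTERED INDEX IX_Payments
    ON #Payments (TransactionID, DimStoreID, DimBusinessDateID);

-- Join the small, indexed temp table instead of the full fact table.
SELECT FT.DimStoreID,
       FT.DimBusinessDateID,
       Sum(FT.GrossSales - Isnull(P.PaymentAmount, 0)) AS StoreCash
FROM   FactSalesTransaction AS FT
       INNER JOIN DimCalendar AS C
               ON C.DimCalendarID = FT.DimBusinessDateID
                  AND C.CalendarDate >= '12/4/2012'
       LEFT JOIN #Payments AS P
              ON P.TransactionID = FT.TransactionID
                 AND P.DimStoreID = FT.DimStoreID
                 AND P.DimBusinessDateID = FT.DimBusinessDateID
WHERE  FT.ModStatusFlg <> 'D'
GROUP  BY FT.DimStoreID,
          FT.DimBusinessDateID;

DROP TABLE #Payments;
SET STATISTICS TIME OFF;
```

Running each statement separately with the timing output on is one way to find the single step or JOIN that accounts for most of the run time.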
These are just a few tips that may help you.