The real problem involves more data and more joins, but I created a small example to demonstrate the issue:
-- create example table (drop any previous copy first so the script can be re-run)
IF OBJECT_ID('dbo.EventRecords', 'U') IS NOT NULL
    DROP TABLE dbo.EventRecords
GO
CREATE TABLE dbo.EventRecords
(
    EventDate datetime NOT NULL,
    EventCount int NOT NULL
) ON [PRIMARY]
GO
ALTER TABLE dbo.EventRecords ADD CONSTRAINT
    PK_EventRecords PRIMARY KEY CLUSTERED
    (
        EventDate,
        EventCount
    ) WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
-- put in some random data for example
DECLARE @Counter INT=0
WHILE (@Counter<1000)
BEGIN
    DECLARE @SemiRandomCount1 INT=@Counter*589043%23
    DECLARE @SemiRandomCount2 INT=@Counter*85907%7
    IF @SemiRandomCount1>0 AND @Counter%7<>0 -- leave some dates empty
    BEGIN
        INSERT INTO dbo.EventRecords(EventDate,EventCount)
        VALUES (DATEADD(day,@Counter,'2013-01-01'),@SemiRandomCount1)
        PRINT CAST(@SemiRandomCount2 AS VARCHAR(MAX))
        -- some dates have multiple entries; skip the second row when both counts are
        -- equal (e.g. @Counter=262 or 292), because the primary key on
        -- (EventDate, EventCount) would reject the duplicate
        IF @SemiRandomCount2>0 AND @Counter%2=0 AND @SemiRandomCount2<>@SemiRandomCount1
            INSERT INTO dbo.EventRecords(EventDate,EventCount)
            VALUES (DATEADD(day,@Counter,'2013-01-01'),@SemiRandomCount2)
    END
    SET @Counter=@Counter+1
END
--SELECT * FROM dbo.EventRecords
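So some dates have multiple entries and some have none. I need report results that include every date in a specified range, together with the total count for that date (zero if the date has no entries). After a lot of googling and experimenting, I found a rather clever way to generate sequences on the fly and built this function from it. The sequences can be used to build a table of consecutive dates on the fly, which can then be joined to the EventRecords table and grouped by date, with no gaps: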
IF NOT EXISTS (SELECT * FROM dbo.sysobjects WHERE name='GetSequence')
    EXECUTE sp_executesql N'CREATE FUNCTION GetSequence() RETURNS @Table TABLE (Value SMALLINT NOT NULL) AS BEGIN RETURN END'
GO
ALTER FUNCTION [dbo].[GetSequence](@StartInclusive INT, @EndExclusive INT)
RETURNS @Sequence TABLE
(
    Value BIGINT NOT NULL
)
AS
BEGIN
    INSERT @Sequence
    SELECT Value=@StartInclusive+n-1
    FROM (SELECT ROW_NUMBER() OVER (ORDER BY o1.n)
          FROM (SELECT n=ROW_NUMBER() OVER (ORDER BY object_id) FROM sys.objects WITH (NOLOCK)) o1
          CROSS JOIN (SELECT n=ROW_NUMBER() OVER (ORDER BY object_id) FROM sys.objects WITH (NOLOCK)) o2
          CROSS JOIN (SELECT n=ROW_NUMBER() OVER (ORDER BY object_id) FROM sys.objects WITH (NOLOCK)) o3
          CROSS JOIN (SELECT n=ROW_NUMBER() OVER (ORDER BY object_id) FROM sys.objects WITH (NOLOCK)) o4
          CROSS JOIN (SELECT n=ROW_NUMBER() OVER (ORDER BY object_id) FROM sys.objects WITH (NOLOCK)) o5
          CROSS JOIN (SELECT n=ROW_NUMBER() OVER (ORDER BY object_id) FROM sys.objects WITH (NOLOCK)) o6
         ) D (n)
    WHERE n<=@EndExclusive-@StartInclusive
    RETURN
END
GO
Here are the example queries:
DECLARE @StartDate DATE='2013-01-01'
DECLARE @EndDate DATE='2015-01-01'
-- query with holes: not what I need
SELECT [EventDate], [TotalEventCount]=ISNULL(SUM(EventCount),0)
FROM dbo.EventRecords
GROUP BY [EventDate]
-- query with date holes filled in: this is what I need
SELECT
    [EventDate]=DATEADD(day,s.Value,@StartDate),
    [TotalEventCount]=ISNULL(SUM(EventCount),0)
FROM [dbo].[GetSequence](0,DATEDIFF(day,@StartDate,@EndDate)) s
LEFT JOIN dbo.EventRecords c
    ON DATEDIFF(day,@StartDate, EventDate)=s.Value
GROUP BY s.Value
So, my question is: is there a better (simpler or faster) way to get the sequence, or a better way to solve this problem in SQL altogether?
I would recommend using a calendar table or date dimension (whichever name you prefer). Here is an answer using a quick CTE.
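A minimal sketch of what a quick CTE approach might look like, not necessarily the query this answer had in mind. It assumes the dbo.EventRecords table from the question and treats @EndDate as exclusive, as the question's own query does; the CTE name Dates is made up for illustration.

```sql
DECLARE @StartDate DATE = '2013-01-01';
DECLARE @EndDate   DATE = '2015-01-01';  -- exclusive upper bound

-- Recursive CTE: start at @StartDate and add one day per step.
WITH Dates AS
(
    SELECT [EventDate] = @StartDate
    UNION ALL
    SELECT DATEADD(day, 1, [EventDate])
    FROM Dates
    WHERE DATEADD(day, 1, [EventDate]) < @EndDate
)
SELECT d.[EventDate],
       [TotalEventCount] = ISNULL(SUM(c.EventCount), 0)
FROM Dates AS d
LEFT JOIN dbo.EventRecords AS c
    ON c.EventDate >= d.[EventDate]
   AND c.EventDate <  DATEADD(day, 1, d.[EventDate])
GROUP BY d.[EventDate]
ORDER BY d.[EventDate]
OPTION (MAXRECURSION 0);  -- the default limit of 100 levels is too low for 730 days
```

A recursive CTE is convenient for a one-off query, but it produces its rows one step at a time, so a persisted calendar table is likely to scale better for repeated reporting.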
Some links about calendar tables:
Well, you should have a numbers table or a calendar table. I'll start with a numbers table:
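A numbers table is simply a table of consecutive integers that is populated once and then reused. The following is one possible way to build it, as a sketch: the table name dbo.Numbers, the column name Number, and the size of 10,000 rows are assumptions chosen for illustration.

```sql
-- Hypothetical numbers table holding the integers 0 through 9999.
CREATE TABLE dbo.Numbers
(
    Number INT NOT NULL PRIMARY KEY CLUSTERED
);

-- sys.all_columns cross joined with itself yields far more than 10,000 rows,
-- so numbering them and keeping the lowest 10,000 fills the table in one statement.
INSERT INTO dbo.Numbers (Number)
SELECT TOP (10000) n.Number
FROM (
    SELECT Number = ROW_NUMBER() OVER (ORDER BY c1.[object_id]) - 1
    FROM sys.all_columns AS c1
    CROSS JOIN sys.all_columns AS c2
) AS n
ORDER BY n.Number;
```

Ten thousand rows cover a daily range of a little over 27 years; adjust the TOP value if a longer range is needed.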
Then:
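Assuming a numbers table exists as dbo.Numbers with an INT column Number that starts at 0 (both names are hypothetical), the gap-free report could look roughly like this:

```sql
DECLARE @StartDate DATE = '2013-01-01';
DECLARE @EndDate   DATE = '2015-01-01';  -- exclusive upper bound

SELECT [EventDate]       = DATEADD(day, n.Number, @StartDate),
       [TotalEventCount] = ISNULL(SUM(c.EventCount), 0)
FROM dbo.Numbers AS n
LEFT JOIN dbo.EventRecords AS c
    -- Half-open range per day; no function is applied to c.EventDate itself.
    ON c.EventDate >= DATEADD(day, n.Number, @StartDate)
   AND c.EventDate <  DATEADD(day, n.Number + 1, @StartDate)
WHERE n.Number < DATEDIFF(day, @StartDate, @EndDate)
GROUP BY n.Number
ORDER BY n.Number;
```

Comparing EventDate against a day range, instead of wrapping the column in DATEDIFF as the question's query does, should allow the clustered index on EventDate to be used for seeks.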
As mentioned in another answer, a calendar table is even better, but both should perform much better than your function.
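For comparison, here is a rough sketch of the calendar table variant. The table name dbo.Calendar, the column name CalendarDate, and the 2000 to 2049 range are assumptions for illustration; a real calendar table usually carries extra columns (weekday, holiday flags, fiscal period) that are omitted here.

```sql
-- Hypothetical calendar table: one row per day, populated once and reused.
CREATE TABLE dbo.Calendar
(
    CalendarDate DATE NOT NULL PRIMARY KEY CLUSTERED
);

DECLARE @First DATE = '2000-01-01';
DECLARE @Last  DATE = '2049-12-31';

-- Stacked CTEs build 100,000 rows from constants, which are then numbered from 0.
WITH E1(n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS v(n)), -- 10 rows
     E2(n) AS (SELECT 1 FROM E1 AS a CROSS JOIN E1 AS b),                                -- 100 rows
     E5(n) AS (SELECT 1 FROM E2 AS a CROSS JOIN E2 AS b CROSS JOIN E1 AS c),             -- 100,000 rows
     Offsets(DayOffset) AS
     (
         SELECT CONVERT(INT, ROW_NUMBER() OVER (ORDER BY (SELECT NULL))) - 1 FROM E5
     )
INSERT INTO dbo.Calendar (CalendarDate)
SELECT DATEADD(day, o.DayOffset, @First)
FROM Offsets AS o
WHERE o.DayOffset <= DATEDIFF(day, @First, @Last);

-- The report then needs no date arithmetic to produce the list of days.
DECLARE @StartDate DATE = '2013-01-01';
DECLARE @EndDate   DATE = '2015-01-01';  -- exclusive upper bound

SELECT [EventDate]       = cal.CalendarDate,
       [TotalEventCount] = ISNULL(SUM(c.EventCount), 0)
FROM dbo.Calendar AS cal
LEFT JOIN dbo.EventRecords AS c
    ON c.EventDate >= cal.CalendarDate
   AND c.EventDate <  DATEADD(day, 1, cal.CalendarDate)
WHERE cal.CalendarDate >= @StartDate
  AND cal.CalendarDate <  @EndDate
GROUP BY cal.CalendarDate
ORDER BY cal.CalendarDate;
```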
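If you don't have a Numbers or Calendar table and don't want to create either one (please read the links at the bottom of this answer, and this article, first), then you can use something built in, such as spt_values, if your maximum date range is less than about 2,000 days.

If you need more than 2,000 days (and who consumes a report with individual dates spanning more than five years, anyway?), you can use a single ROW_NUMBER() OVER (ORDER BY [object_id]) FROM sys.all_columns, or a cross join of different catalog views, to satisfy your maximum range, rather than a sloppy cross join of many individual queries. But really, a calendar table is your best option.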
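Both built-in options can be sketched as follows, assuming the dbo.EventRecords table from the question. Note that master.dbo.spt_values is an undocumented system table; its rows with type = 'P' hold the integers 0 through 2047 on the versions I am aware of, which is where the limit of roughly 2,000 days comes from.

```sql
DECLARE @StartDate DATE = '2013-01-01';
DECLARE @EndDate   DATE = '2015-01-01';  -- exclusive upper bound

-- Option 1: spt_values, for ranges of up to 2,048 days.
SELECT [EventDate]       = DATEADD(day, v.number, @StartDate),
       [TotalEventCount] = ISNULL(SUM(c.EventCount), 0)
FROM master.dbo.spt_values AS v
LEFT JOIN dbo.EventRecords AS c
    ON c.EventDate >= DATEADD(day, v.number, @StartDate)
   AND c.EventDate <  DATEADD(day, v.number + 1, @StartDate)
WHERE v.type = N'P'
  AND v.number < DATEDIFF(day, @StartDate, @EndDate)
GROUP BY v.number
ORDER BY v.number;

-- Option 2: a single ROW_NUMBER() over sys.all_columns, for longer ranges.
-- The available range is limited by the number of rows in that view.
SELECT [EventDate]       = DATEADD(day, n.Number, @StartDate),
       [TotalEventCount] = ISNULL(SUM(c.EventCount), 0)
FROM (
    SELECT Number = CONVERT(INT, ROW_NUMBER() OVER (ORDER BY [object_id])) - 1
    FROM sys.all_columns
) AS n
LEFT JOIN dbo.EventRecords AS c
    ON c.EventDate >= DATEADD(day, n.Number, @StartDate)
   AND c.EventDate <  DATEADD(day, n.Number + 1, @StartDate)
WHERE n.Number < DATEDIFF(day, @StartDate, @EndDate)
GROUP BY n.Number
ORDER BY n.Number;
```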