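For each account, I need the date of the first transaction, the largest transaction amount, and the time elapsed between the first transaction and the largest one. The data is frequently truncated and reloaded, and I have read-only access (no schema or index changes). The table has over a million rows, so creating and updating a temp table is too slow; CROSS APPLY also seems inefficient. Using a CTE and window functions, I only have to hit the table twice. But is there a better way? A simplified example of the table and my query follow: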
CREATE TABLE [dbo].[trans](
    [Tran_ID] [int] IDENTITY(1,1) NOT NULL,
    [Account_ID] [varchar](50) NULL,
    [Tran_Date] [datetime] NOT NULL,
    [Tran_Amount] [money] NOT NULL
) ON [PRIMARY];
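Edit: @Peter The table has the following clustered index. It is the only index; there are no primary keys, foreign keys, or other constraints.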
CREATE CLUSTERED INDEX [idx_trans_ID_DATE] ON [dbo].[trans]
(
    [Account_ID] ASC,
    [Tran_Date] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
WITH HighestTran AS (
    SELECT
        Account_ID,
        Tran_Date,
        Tran_Amount,
        ROW_NUMBER() OVER (PARTITION BY Account_ID ORDER BY Tran_Date ASC) AS FirstTranRow,
        ROW_NUMBER() OVER (PARTITION BY Account_ID ORDER BY Tran_Amount DESC) AS MaxTranRow
    FROM
        dbo.trans
    WHERE
        Tran_Amount > 0
)
SELECT
    t1.Account_ID,
    t1.Tran_Date,
    t2.Tran_Date,
    t1.Tran_Amount,
    t2.Tran_Amount
FROM
    HighestTran t1
    INNER JOIN HighestTran t2
        ON t1.Account_ID = t2.Account_ID
        AND t1.FirstTranRow = 1
        AND t2.MaxTranRow = 1
ORDER BY t1.Account_ID;
This does it in a single table scan:
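A minimal sketch of one way to get there, not necessarily the exact query this answer was written around: keep the question's two ROW_NUMBER() columns, but collapse the rows with conditional aggregation instead of joining the CTE to itself, so the CTE is referenced only once. The output column names and the use of days for the elapsed time are assumptions; the question does not specify a unit.

```sql
WITH Ranked AS (
    SELECT
        Account_ID,
        Tran_Date,
        Tran_Amount,
        -- Same ranking columns as in the question's CTE.
        ROW_NUMBER() OVER (PARTITION BY Account_ID ORDER BY Tran_Date ASC) AS FirstTranRow,
        ROW_NUMBER() OVER (PARTITION BY Account_ID ORDER BY Tran_Amount DESC) AS MaxTranRow
    FROM dbo.trans
    WHERE Tran_Amount > 0
      -- The original self-join silently drops NULL accounts; keep that behavior.
      AND Account_ID IS NOT NULL
)
SELECT
    Account_ID,
    -- Each CASE matches exactly one row per account, so MAX() just picks it out.
    MAX(CASE WHEN FirstTranRow = 1 THEN Tran_Date END)   AS First_Tran_Date,
    MAX(CASE WHEN MaxTranRow   = 1 THEN Tran_Date END)   AS Max_Tran_Date,
    MAX(CASE WHEN FirstTranRow = 1 THEN Tran_Amount END) AS First_Tran_Amount,
    MAX(CASE WHEN MaxTranRow   = 1 THEN Tran_Amount END) AS Max_Tran_Amount,
    -- Assumption: elapsed time reported in whole days.
    DATEDIFF(DAY,
             MAX(CASE WHEN FirstTranRow = 1 THEN Tran_Date END),
             MAX(CASE WHEN MaxTranRow   = 1 THEN Tran_Date END)) AS Days_To_Max
FROM Ranked
WHERE FirstTranRow = 1 OR MaxTranRow = 1
GROUP BY Account_ID
ORDER BY Account_ID;
```

The first ranking can follow the clustered index order (Account_ID, Tran_Date), but the ranking by Tran_Amount will most likely still need a sort, so it is worth checking the actual execution plan. If several transactions tie for the largest amount, which of them supplies Max_Tran_Date is arbitrary, exactly as in the original query.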
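This query assumes that only a few rows are found per Account_ID.
You could improve the speed of the query (which performs a sort) by adding Tran_Amount to the index. Although your question does say that you cannot edit the schema, if you see an improvement on test data you could at least discuss it with the DBA.
This approach differs slightly from yours and Thomas's, and needs only a single index scan.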
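To try the index suggestion above on test data, something like the following could be a starting point. It is only a sketch: dbo.trans_test is a hypothetical copy of the table in an environment where you are allowed to create indexes, and the index name is made up for illustration.

```sql
-- Hypothetical test copy of dbo.trans; do not run this against the production table.
-- Keying on (Account_ID, Tran_Amount DESC) lets the largest amount per account
-- be read in index order instead of being sorted at query time.
CREATE NONCLUSTERED INDEX idx_trans_test_ID_AMOUNT
ON dbo.trans_test (Account_ID ASC, Tran_Amount DESC)
-- Tran_Date is listed explicitly for clarity; as a clustered index key column
-- it would be carried in the nonclustered index anyway.
INCLUDE (Tran_Date);
```

Comparing the execution plans and timings with and without this index on the test copy would give you concrete numbers to take to the DBA.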