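I am working with a table (which I control) that is populated by another application (which I do not control) every time a record of a particular type is created or updated in an external system. Below is a highly simplified illustration of the table, which I'll call "AllRecordVersions".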
RecordTimestamp | RecordID | RecordState | DataField1 | DataField2 | DataField3 |
---|---|---|---|---|---|
2023-11-29 09:08:00 | A0000001 | Processed | ABC123 | True | 102.11 |
2023-11-29 09:13:00 | A0000002 | Processed | DEF789 | False | 96.48 |
2023-11-29 09:20:00 | A0000001 | Unprocessed | ABC123 | False | 105.59 |
2023-11-29 09:22:00 | A0000001 | Unprocessed | ABC124 | False | 106.02 |
2023-11-29 09:37:00 | A0000002 | Unprocessed | DEF789 | False | 99.73 |
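To make the illustration concrete, the table could be declared roughly as follows. This is only a sketch: the data types and nullability are assumptions made for this simplified example, not the real schema.

```sql
-- Hypothetical declaration of the simplified table; all types are assumed.
CREATE TABLE [AllRecordVersions] (
    [RecordTimestamp] datetime2(0)  NOT NULL,
    [RecordID]        char(8)       NOT NULL,
    [RecordState]     varchar(20)   NOT NULL,  -- 'Processed' or 'Unprocessed'
    [DataField1]      varchar(20)   NULL,
    [DataField2]      bit           NULL,      -- True/False stored as 1/0
    [DataField3]      decimal(9, 2) NULL
);

-- The sample rows from the table above.
INSERT INTO [AllRecordVersions]
    ([RecordTimestamp], [RecordID], [RecordState], [DataField1], [DataField2], [DataField3])
VALUES
    ('2023-11-29 09:08:00', 'A0000001', 'Processed',   'ABC123', 1, 102.11),
    ('2023-11-29 09:13:00', 'A0000002', 'Processed',   'DEF789', 0,  96.48),
    ('2023-11-29 09:20:00', 'A0000001', 'Unprocessed', 'ABC123', 0, 105.59),
    ('2023-11-29 09:22:00', 'A0000001', 'Unprocessed', 'ABC124', 0, 106.02),
    ('2023-11-29 09:37:00', 'A0000002', 'Unprocessed', 'DEF789', 0,  99.73);
```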
The value of "RecordState" indicates whether my own application has picked up the record. I can update this column to track my application's activity.
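As a sketch of what that tracking might look like, marking one version as picked up could be a single-row update. The key values below are taken from the sample rows, and identifying a version by (RecordID, RecordTimestamp) is an assumption of this simplified example.

```sql
-- Mark the 09:37 version of record A0000002 as processed.
-- Assumes (RecordID, RecordTimestamp) identifies exactly one version.
UPDATE [AllRecordVersions]
SET [RecordState] = 'Processed'
WHERE [RecordID] = 'A0000002'
  AND [RecordTimestamp] = '2023-11-29 09:37:00'
  AND [RecordState] = 'Unprocessed';
```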
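I am using a query of the following form (which I have simplified/anonymised) to pick out the latest version of each record in the table, so that I can compare the value of each of its data fields with the value it held in the previous version of the same record that I processed.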
SELECT
    -- Here I can compare the latest version of each record
    -- with the last version that I had processed, e.g. with
    -- "CASE WHEN [LatestUnprocessedRecordVersion].[DataField1] <> [LatestProcessedRecordVersion].[DataField1] ...",
    -- so that I can check if and how certain fields have been changed,
    -- and use that to decide what updates to push into some other external system
FROM (
    [AllRecordVersions] AS [LatestUnprocessedRecordVersion]
    JOIN (
        SELECT
            MAX([RecordTimestamp]) AS [RecordTimestamp_Max],
            [RecordID]
        FROM [AllRecordVersions]
        WHERE [RecordState] = 'Unprocessed'
        GROUP BY [RecordID]
    ) AS [LatestUnprocessedRecordVersion_Timestamp]
        ON [LatestUnprocessedRecordVersion].[RecordTimestamp] = [LatestUnprocessedRecordVersion_Timestamp].[RecordTimestamp_Max]
        AND [LatestUnprocessedRecordVersion].[RecordID] = [LatestUnprocessedRecordVersion_Timestamp].[RecordID]
)
JOIN (
    [AllRecordVersions] AS [LatestProcessedRecordVersion]
    JOIN (
        SELECT
            MAX([RecordTimestamp]) AS [RecordTimestamp_Max],
            [RecordID]
        FROM [AllRecordVersions]
        WHERE [RecordState] = 'Processed'
        GROUP BY [RecordID]
    ) AS [LatestProcessedRecordVersion_Timestamp]
        ON [LatestProcessedRecordVersion].[RecordTimestamp] = [LatestProcessedRecordVersion_Timestamp].[RecordTimestamp_Max]
        AND [LatestProcessedRecordVersion].[RecordID] = [LatestProcessedRecordVersion_Timestamp].[RecordID]
)
    ON [LatestUnprocessedRecordVersion].[RecordID] = [LatestProcessedRecordVersion].[RecordID]
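To illustrate the kind of comparison the comments describe, here is a more compact sketch of the same idea using ROW_NUMBER() in place of the MAX/GROUP BY joins. This is a swapped-in technique, not the query I actually run; the `[..._Changed]` column names are made up for the example, and it should match the query above only if (RecordID, RecordTimestamp) is unique.

```sql
-- Rank versions within each (RecordID, RecordState); rank 1 is the latest.
WITH [Ranked] AS (
    SELECT
        [RecordID], [RecordState], [RecordTimestamp],
        [DataField1], [DataField2], [DataField3],
        ROW_NUMBER() OVER (
            PARTITION BY [RecordID], [RecordState]
            ORDER BY [RecordTimestamp] DESC
        ) AS [VersionRank]
    FROM [AllRecordVersions]
    WHERE [RecordState] IN ('Unprocessed', 'Processed')
)
SELECT
    [u].[RecordID],
    -- Note: "<>" is not true when either side is NULL, so a change
    -- to or from NULL is reported as 0 here.
    CASE WHEN [u].[DataField1] <> [p].[DataField1] THEN 1 ELSE 0 END AS [DataField1_Changed],
    CASE WHEN [u].[DataField2] <> [p].[DataField2] THEN 1 ELSE 0 END AS [DataField2_Changed],
    CASE WHEN [u].[DataField3] <> [p].[DataField3] THEN 1 ELSE 0 END AS [DataField3_Changed]
FROM [Ranked] AS [u]
JOIN [Ranked] AS [p]
    ON [p].[RecordID] = [u].[RecordID]
WHERE [u].[RecordState] = 'Unprocessed' AND [u].[VersionRank] = 1
  AND [p].[RecordState] = 'Processed'   AND [p].[VersionRank] = 1;
```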
My question is: which indexes should I put on the table "AllRecordVersions" to make the above query run efficiently? The database is Microsoft SQL Server.
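For reference, the shape of the query suggests candidates along the following lines. This is an untested sketch rather than a known answer: the index names are placeholders, and whether the optimizer actually uses either index would need to be confirmed against the real execution plan and data distribution.

```sql
-- Candidate 1: supports the two derived tables, which filter on
-- RecordState and then need MAX([RecordTimestamp]) per RecordID.
CREATE NONCLUSTERED INDEX [IX_AllRecordVersions_State_ID_Timestamp]
    ON [AllRecordVersions] ([RecordState], [RecordID], [RecordTimestamp]);

-- Candidate 2: supports the joins back to the full rows on
-- (RecordID, RecordTimestamp); the INCLUDE list covers the data
-- fields so that the comparison needs no key lookups.
CREATE NONCLUSTERED INDEX [IX_AllRecordVersions_ID_Timestamp]
    ON [AllRecordVersions] ([RecordID], [RecordTimestamp])
    INCLUDE ([RecordState], [DataField1], [DataField2], [DataField3]);
```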
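To keep things efficient, the table "AllRecordVersions" will be purged periodically of:
- all but the latest processed version of each record in the table (because the older ones will never be needed for comparison)
- all but the latest unprocessed version of each record in the table (because the older ones will never be processed)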
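The purge described above could be sketched as a single statement that keeps only the newest version per record and state. The CTE and column names here are illustrative, and the sketch assumes no two versions of a record share the same timestamp within a state (if they do, one of the tied rows is kept arbitrarily).

```sql
-- Rank versions within each (RecordID, RecordState), newest first,
-- then delete everything except the newest one in each group.
WITH [RankedVersions] AS (
    SELECT
        [RecordID], [RecordState], [RecordTimestamp],
        ROW_NUMBER() OVER (
            PARTITION BY [RecordID], [RecordState]
            ORDER BY [RecordTimestamp] DESC
        ) AS [VersionRank]
    FROM [AllRecordVersions]
    WHERE [RecordState] IN ('Processed', 'Unprocessed')
)
DELETE FROM [RankedVersions]
WHERE [VersionRank] > 1;
```

With the sample rows shown earlier, this would remove only the 09:20 unprocessed version of A0000001, since every other row is the latest version for its record and state.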