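I was asked to create a simple audit system for a web application. Due to some constraints, I chose a "quick and dirty" approach. This was feasible because activity in that particular application is low (a few hundred inserts per day).
However, I now have to implement it in another web application that generates much more activity, so I am thinking about refactoring it.
Notes:
All queries are generated by the ORM in use (Entity Framework), so they look quite ugly.
All tests were run against a single table filled with about 2K dummy records, evenly distributed across a few users, times, and so on.
Data definition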
CREATE TABLE dbo.AppEvent
(
AppEventId INT NOT NULL IDENTITY(1, 1) CONSTRAINT PK_AppEvent PRIMARY KEY CLUSTERED,
InsertTimestamp DATETIME2 NOT NULL CONSTRAINT DF_AppEvent DEFAULT(GETDATE()),
UserId INT NOT NULL CONSTRAINT FK_AppEvent_UserId REFERENCES dbo.AppUser,
EventTypeId INT NOT NULL CONSTRAINT FK_AppEvent_EventType REFERENCES dbo.EventType,
RegionId INT NULL CONSTRAINT FK_AppEvent_Region REFERENCES dbo.Region,
CountryId INT NULL CONSTRAINT FK_AppEvent_Country REFERENCES dbo.Country,
InsertDay AS (CAST(InsertTimestamp as DATE)),
InsertMonth AS (CAST(DATEADD(MONTH, DATEDIFF(MONTH, 0, InsertTimestamp), 0) AS DATE)),
InsertYear AS (CAST(DATEADD(YEAR, DATEDIFF(YEAR, 0, InsertTimestamp), 0) AS DATE)),
Description NVARCHAR(2000) NULL,
ProjectId INT NULL CONSTRAINT FK_AppEvent_Project REFERENCES dbo.Project,
ReminderActionId INT NULL CONSTRAINT FK_AppEvent_ReminderAction REFERENCES dbo.ReminderAction
)
GO
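This is the only relevant table. All FK references point to clustered primary keys, and most of the referenced tables contain fewer than 100 records.
Actual logging
The application tries to aggregate logging information by skipping inserts that are too close in time to an earlier one (same user, same event type).
Therefore, it first needs to run a SELECT: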
exec sp_executesql N'SELECT TOP (1)
[Project1].[InsertTimestamp] AS [InsertTimestamp]
FROM ( SELECT
[Extent1].[AppEventId] AS [AppEventId],
[Extent1].[InsertTimestamp] AS [InsertTimestamp]
FROM [dbo].[AppEvent] AS [Extent1]
WHERE ([Extent1].[UserId] = @p__linq__0) AND ([Extent1].[EventTypeId] = @p__linq__1) AND (([Extent1].[ReminderActionId] = @p__linq__2) OR (([Extent1].[ReminderActionId] IS NULL) AND (@p__linq__2 IS NULL)))
) AS [Project1]
ORDER BY [Project1].[AppEventId] DESC',N'@p__linq__0 int,@p__linq__1 int,@p__linq__2 int',@p__linq__0=1,@p__linq__1=4,@p__linq__2=27
This costs approximately: CPU = 16, Reads = 34, Writes = 0, Duration = 0
Looking at the execution plan, I thought an index might improve things:
CREATE INDEX IDX_AppEvent_User_EventType ON dbo.AppEvent (UserId, EventTypeId, ReminderActionId) INCLUDE (AppEventId, InsertTimestamp)
With the index in place, this gives: CPU = 0, Reads = 20, Writes = 0, Duration = 0
The actual INSERT statement looks like this:
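To repeat this kind of before/after comparison from a query window, one option is SET STATISTICS IO and SET STATISTICS TIME. The sketch below uses a hand-written equivalent of the lookup, not the text Entity Framework generates; the variable names are made up for illustration and the values are the ones from the trace above.

```sql
-- Report logical reads and CPU/elapsed time for each statement in this session.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Hypothetical variable names; values copied from the traced call.
DECLARE @UserId INT = 1,
        @EventTypeId INT = 4,
        @ReminderActionId INT = 27;

-- Latest event for the same user, event type and reminder action
-- (the OR branch treats two NULLs as a match).
SELECT TOP (1) e.InsertTimestamp
FROM dbo.AppEvent AS e
WHERE e.UserId = @UserId
  AND e.EventTypeId = @EventTypeId
  AND (e.ReminderActionId = @ReminderActionId
       OR (e.ReminderActionId IS NULL AND @ReminderActionId IS NULL))
ORDER BY e.AppEventId DESC;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Running it once without and once with IDX_AppEvent_User_EventType should show the change in logical reads in the messages output. As a side note, AppEventId is the clustered key, so it is already carried in every nonclustered index on the table; listing it in INCLUDE is most likely harmless but unnecessary.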
exec sp_executesql N'INSERT [dbo].[AppEvent]([InsertTimestamp], [UserId], [EventTypeId], [RegionId], [CountryId], [Description], [ProjectId], [ReminderActionId])
VALUES (@0, @1, @2, @3, @4, @5, @6, NULL)
SELECT [AppEventId], [InsertDay], [InsertMonth], [InsertYear]
FROM [dbo].[AppEvent]
WHERE @@ROWCOUNT > 0 AND [AppEventId] = scope_identity()',N'@0 datetime2(7),@1 int,@2 int,@3 int,@4 int,@5 nvarchar(2000),@6 int',
@0='2017-01-30 14:54:02.6469319',@1=1,@2=7,@3=5,@4=305,@5=N'Custom message',@6=1533
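The trailing SELECT statement is added by the ORM, which needs to know the identifier that was just created (I will have to see whether I can get rid of the extra columns in that SELECT that are not needed).
This takes about: CPU = 16, Reads = 32, Writes = 0, Duration = 9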
The execution plan shows the following: [two execution plan screenshots, not reproduced]
Reporting
The audit reports are very simple and are run rarely (a few times a day at most). Typical queries look like this:
SELECT
1 AS [C1],
[GroupBy1].[K2] AS [InsertDay],
[GroupBy1].[K1] AS [CountryId],
[GroupBy1].[A1] AS [C2]
FROM ( SELECT
[Extent1].[CountryId] AS [K1],
[Extent1].[InsertDay] AS [K2],
COUNT(1) AS [A1]
FROM [dbo].[AppEvent] AS [Extent1]
WHERE ([Extent1].[EventTypeId] IN (1, 6, 7, 9)) AND ([Extent1].[CountryId] IS NOT NULL)
GROUP BY [Extent1].[CountryId], [Extent1].[InsertDay]
) AS [GroupBy1]
`CPU = 16, Reads = 25, Writes = 0, Duration = 11`
SELECT
1 AS [C1],
[GroupBy1].[K2] AS [InsertMonth],
[GroupBy1].[K1] AS [CountryId],
[GroupBy1].[A1] AS [C2]
FROM ( SELECT
[Extent1].[CountryId] AS [K1],
[Extent1].[InsertMonth] AS [K2],
COUNT(1) AS [A1]
FROM [dbo].[AppEvent] AS [Extent1]
WHERE ([Extent1].[EventTypeId] IN (1, 6, 7)) AND ([Extent1].[CountryId] IS NOT NULL)
GROUP BY [Extent1].[CountryId], [Extent1].[InsertMonth]
) AS [GroupBy1]
`CPU = 16, Reads = 25, Writes = 0, Duration = 12`
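For readers who find the ORM aliasing hard to follow, here is a hand-written sketch of what the daily report computes. It is not what Entity Framework emits: the constant column C1 is dropped, and the alias EventCount is introduced here purely for illustration. The monthly report has the same shape, with InsertMonth in place of InsertDay and a different list of event types.

```sql
-- Hand-written equivalent of the daily report (illustrative, not ORM output).
-- EventCount is a hypothetical alias; the ORM names this column C2.
SELECT
    e.InsertDay,
    e.CountryId,
    COUNT(*) AS EventCount
FROM dbo.AppEvent AS e
WHERE e.EventTypeId IN (1, 6, 7, 9)
  AND e.CountryId IS NOT NULL
GROUP BY e.CountryId, e.InsertDay;
```

Both reports filter on EventTypeId and group on CountryId plus a computed date column, so they currently have to scan the table. At a few runs per day that may be perfectly acceptable.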
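Question: Considering that up to 100K events may be generated per day, should I consider rewriting the whole audit mechanism (from the database point of view)?
At the application layer, there are several improvements I can make, such as ensuring that auditing is not performed in the same transaction as the operational changes, and using a separate thread to execute the queries.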
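You can remove the SELECT from the insert code by changing the statement. (The before and after snippets are not preserved here.)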
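One way to drop the follow-up SELECT, which may or may not be what was meant here, is to let the INSERT itself return the generated values through an OUTPUT clause. This is only a sketch: the variable name is hypothetical and the literal values are copied from the traced INSERT above.

```sql
-- Hypothetical variable; value taken from the traced INSERT.
DECLARE @InsertTimestamp DATETIME2(7) = '2017-01-30 14:54:02.6469319';

-- OUTPUT returns the identity and computed columns of the new row
-- in the same statement, so no second read of dbo.AppEvent is needed.
INSERT dbo.AppEvent
    (InsertTimestamp, UserId, EventTypeId, RegionId, CountryId,
     Description, ProjectId, ReminderActionId)
OUTPUT inserted.AppEventId,
       inserted.InsertDay,
       inserted.InsertMonth,
       inserted.InsertYear
VALUES (@InsertTimestamp, 1, 7, 5, 305, N'Custom message', 1533, NULL);
```

Two caveats. An OUTPUT clause without INTO is not allowed when the target table has enabled triggers. Also, the ORM generates the INSERT text itself, so a change like this would probably have to go through something the ORM lets you control, such as a stored procedure mapped to the insert, if the version in use supports that.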
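I don't think the insert rate will be a problem at all; you are estimating 3 to 4 inserts per second at most (100K events per day averages out to just over one per second). Make sure the column definitions match the actual requirements; you mentioned in the comments that the Description column could actually be a varchar(255). Reducing it from varchar(2000) will have a measurable impact on both the insert and the select statements.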
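A minimal sketch of narrowing the column follows. It assumes the column should stay Unicode (it is declared NVARCHAR(2000) in the table definition above, so NVARCHAR(255) is used here rather than varchar) and that 255 characters really is enough; both assumptions should be checked against the application.

```sql
-- Check first that no existing value is longer than the new limit.
-- (LEN ignores trailing spaces; NULL means the column holds no values.)
SELECT MAX(LEN(Description)) AS MaxDescriptionLength
FROM dbo.AppEvent;

-- Only if the result above is 255 or less (or NULL):
-- the ALTER fails with a truncation error if any value is too long.
ALTER TABLE dbo.AppEvent
    ALTER COLUMN Description NVARCHAR(255) NULL;
```

The ORM model would also need its maximum length for Description updated, since the traced INSERT declares the parameter as nvarchar(2000).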