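I need to write a stored procedure that updates link references in my database. The links can appear in several nvarchar fields that hold JSON (which may contain some URLs).
To do this, I update the table in batches of 8196 rows per iteration, so that the machine doesn't hang (in theory).
But right now the code seems to hang anyway: it prints no messages and the procedure keeps running for many minutes, apparently without changing any data, until I have to kill it.
If I try the same logic on a toy example I have no problem, so I assume the issue is caused by the size of the table (several hundred thousand rows).
Here is a minimal example; exactly the same code on the bigger table apparently does nothing (tested on SQL Server 2019).
Procedure code: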
ALTER PROCEDURE [dbo].[SiteUrlChangeURL]
@FullOldUrl nvarchar(500),
@FullNewUrl nvarchar(500)
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
SET @FullOldUrl = ISNULL(@FullOldUrl,'');
SET @FullNewUrl = ISNULL(@FullNewUrl,'');
IF ( LEN(@FullOldUrl) <= 0 OR LEN(@FullNewUrl) <= 0 )
BEGIN
PRINT('Invalid parameters');
RETURN 1;
END
--ARTICLE
RAISERROR ('updating articles',0,1) WITH NOWAIT;
WHILE 1=1
BEGIN
UPDATE TOP (8196) [dbo].[tbl_ana_Articles]
SET [_ATTRIBUTES] = REPLACE([_ATTRIBUTES] , @FullOldUrl, @FullNewUrl)
,[_DOCUMENTS] = REPLACE([_DOCUMENTS] , @FullOldUrl, @FullNewUrl)
,[_SEO] = REPLACE([_SEO] , @FullOldUrl, @FullNewUrl)
,[_TRANSLATIONS] = REPLACE([_TRANSLATIONS] , @FullOldUrl, @FullNewUrl)
,[_TAGS] = REPLACE([_TAGS] , @FullOldUrl, @FullNewUrl)
,[_NOTES] = REPLACE([_NOTES] , @FullOldUrl, @FullNewUrl)
WHERE
[_ATTRIBUTES] like '%' + @FullOldUrl + '%' OR
[_DOCUMENTS] like '%' + @FullOldUrl + '%' OR
[_SEO] like '%' + @FullOldUrl + '%' OR
[_TRANSLATIONS] like '%' + @FullOldUrl + '%' OR
[_TAGS] like '%' + @FullOldUrl + '%' OR
[_NOTES] like '%' + @FullOldUrl + '%'
IF (@@ROWCOUNT <= 0)
BEGIN
BREAK;
END
END
RETURN 0;
END
Example:
CREATE TABLE [dbo].[tbl_ana_Articles](
[ID] [int] IDENTITY(1,1) NOT NULL,
[ID_BRAND] [int] NOT NULL,
[CODE] [nvarchar](40) NOT NULL,
[CODFOR] [nvarchar](40) NOT NULL,
[COD_ALT01] [nvarchar](50) NOT NULL,
[COD_ALT02] [nvarchar](50) NOT NULL,
[COD_ALT03] [nvarchar](50) NOT NULL,
[ID_UOM] [int] NOT NULL,
[IS_ACTIVE] [bit] NOT NULL,
[_ATTRIBUTES] [nvarchar](max) NOT NULL,
[_DOCUMENTS] [nvarchar](max) NOT NULL,
[_SEO] [nvarchar](max) NOT NULL,
[_TRANSLATIONS] [nvarchar](max) NOT NULL,
[_TAGS] [nvarchar](max) NOT NULL,
[_NOTES] [nvarchar](max) NOT NULL,
[_METADATA] [nvarchar](max) NOT NULL,
[IS_B2B] [bit] NOT NULL,
[IS_B2C] [bit] NOT NULL,
[IS_PROMO] [bit] NOT NULL,
[IS_NEWS] [bit] NOT NULL,
[CAN_BE_RETURNED] [bit] NOT NULL,
[IS_SHIPPABLE] [bit] NOT NULL,
[HAS_SHIPPING_COSTS] [bit] NOT NULL,
[IS_PURCHEASABLE] [bit] NOT NULL,
CONSTRAINT [PK_tbl_ana_articles] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
INSERT INTO [dbo].[tbl_ana_Articles]
([ID_BRAND]
,[CODE]
,[CODFOR]
,[COD_ALT01]
,[COD_ALT02]
,[COD_ALT03]
,[ID_UOM]
,[IS_ACTIVE]
,[_ATTRIBUTES]
,[_DOCUMENTS]
,[_SEO]
,[_TRANSLATIONS]
,[_TAGS]
,[_NOTES]
,[_METADATA]
,[IS_B2B]
,[IS_B2C]
,[IS_PROMO]
,[IS_NEWS]
,[CAN_BE_RETURNED]
,[IS_SHIPPABLE]
,[HAS_SHIPPING_COSTS]
,[IS_PURCHEASABLE])
VALUES
(1
,'COD1'
,'SUPPLIER1'
,'CATEGORY1'
,'CATEGORY1-BIS'
,'CATEGORY2'
,1
,1
,'{ "url" : "https://old.com" }'
,''
,''
,''
,''
,''
,''
,1
,0
,0
,0
,1
,1
,0
,1);
DECLARE @FullOldUrl AS NVARCHAR(50) = 'https://old.com';
DECLARE @FullNewUrl AS NVARCHAR(50) = 'https://new.com';
--ARTICLE
PRINT('updating articles');
WHILE 1=1
BEGIN
UPDATE TOP (8196) [dbo].[tbl_ana_Articles]
SET [_ATTRIBUTES] = REPLACE([_ATTRIBUTES] , @FullOldUrl, @FullNewUrl)
,[_DOCUMENTS] = REPLACE([_DOCUMENTS] , @FullOldUrl, @FullNewUrl)
,[_SEO] = REPLACE([_SEO] , @FullOldUrl, @FullNewUrl)
,[_TRANSLATIONS] = REPLACE([_TRANSLATIONS] , @FullOldUrl, @FullNewUrl)
,[_TAGS] = REPLACE([_TAGS] , @FullOldUrl, @FullNewUrl)
,[_NOTES] = REPLACE([_NOTES] , @FullOldUrl, @FullNewUrl)
WHERE
[_ATTRIBUTES] like '%' + @FullOldUrl + '%' OR
[_DOCUMENTS] like '%' + @FullOldUrl + '%' OR
[_SEO] like '%' + @FullOldUrl + '%' OR
[_TRANSLATIONS] like '%' + @FullOldUrl + '%' OR
[_TAGS] like '%' + @FullOldUrl + '%' OR
[_NOTES] like '%' + @FullOldUrl + '%'
IF (@@ROWCOUNT <= 0)
BEGIN
BREAK;
END
END
SELECT * FROM [dbo].[tbl_ana_Articles]
PRINT('Finished');
This is the execution plan generated by the toy example (I could not get the execution plan for the real scenario).
https://www.brentozar.com/pastetheplan/?id=SJhVTcMTo
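I'm really puzzled about what is causing the problem.
- Edit:
I ran the procedure again and found that if I let it run long enough (~30 minutes), I get the correct behaviour. So apparently I have a performance problem here. I'm not very concerned about performance, because the procedure is used rarely (and manually), during maintenance, only when the site changes domain.
But I'm curious whether I made some mistake here that leads to such poor performance.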
Workers
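I would probably try to avoid accessing the main table over and over again on each update, for two reasons visible in the query plan:
- The Filter operator, since predicates can't be pushed down when dealing with max data types
- The Table Spool for Halloween protection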
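It might be a good idea to provide some manual phase separation by storing the list of primary keys to process in a temp table.
At minimum, it may help you get a better idea of which part of the query is truly slow: finding the data or updating the data.
Adding a primary key to the temp table may also speed things up, but that's a horse you'll have to race yourself.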
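As a rough sketch of that phase separation, something like the following could be tried. The temp table names (#ArticlesToFix, #Batch), the @Pattern, @BatchSize and @Found variables, and the message text are assumptions made for illustration; the table, columns, and URL values come from the question.

```sql
-- Values from the toy example; inside the procedure these would be its parameters.
DECLARE @FullOldUrl nvarchar(500) = N'https://old.com';
DECLARE @FullNewUrl nvarchar(500) = N'https://new.com';
DECLARE @Pattern    nvarchar(502) = N'%' + @FullOldUrl + N'%';
DECLARE @BatchSize  int = 8196;
DECLARE @Found      int;

-- Hypothetical work tables (not part of the original post).
CREATE TABLE #ArticlesToFix (ID int NOT NULL PRIMARY KEY);
CREATE TABLE #Batch         (ID int NOT NULL PRIMARY KEY);

-- Phase 1: scan the big table once and keep only the keys of matching rows.
INSERT INTO #ArticlesToFix (ID)
SELECT a.[ID]
FROM [dbo].[tbl_ana_Articles] AS a
WHERE a.[_ATTRIBUTES]   LIKE @Pattern
   OR a.[_DOCUMENTS]    LIKE @Pattern
   OR a.[_SEO]          LIKE @Pattern
   OR a.[_TRANSLATIONS] LIKE @Pattern
   OR a.[_TAGS]         LIKE @Pattern
   OR a.[_NOTES]        LIKE @Pattern;

SET @Found = @@ROWCOUNT;
RAISERROR ('articles to update: %d', 0, 1, @Found) WITH NOWAIT;

-- Phase 2: update in batches, locating rows by primary key only.
WHILE 1 = 1
BEGIN
    TRUNCATE TABLE #Batch;

    -- Move the next batch of keys from the work list into #Batch.
    DELETE TOP (@BatchSize) FROM #ArticlesToFix
    OUTPUT deleted.ID INTO #Batch (ID);

    IF @@ROWCOUNT = 0 BREAK;

    UPDATE a
    SET [_ATTRIBUTES]   = REPLACE(a.[_ATTRIBUTES],   @FullOldUrl, @FullNewUrl)
       ,[_DOCUMENTS]    = REPLACE(a.[_DOCUMENTS],    @FullOldUrl, @FullNewUrl)
       ,[_SEO]          = REPLACE(a.[_SEO],          @FullOldUrl, @FullNewUrl)
       ,[_TRANSLATIONS] = REPLACE(a.[_TRANSLATIONS], @FullOldUrl, @FullNewUrl)
       ,[_TAGS]         = REPLACE(a.[_TAGS],         @FullOldUrl, @FullNewUrl)
       ,[_NOTES]        = REPLACE(a.[_NOTES],        @FullOldUrl, @FullNewUrl)
    FROM [dbo].[tbl_ana_Articles] AS a
    INNER JOIN #Batch AS b
        ON b.ID = a.[ID];
END;

DROP TABLE #Batch;
DROP TABLE #ArticlesToFix;
```

Timing phase 1 and phase 2 separately should show whether the expensive part is the LIKE search or the update itself. Whether the PRIMARY KEY on the temp tables helps is something to measure rather than assume.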
Of course, tracking the current primary key value, as Martin suggested by pointing to Michael's post, would also be useful.
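The code from that post is not reproduced here, so what follows is only a generic sketch of the idea, under the assumption that [ID] values are positive (the table uses IDENTITY(1,1)). The @LastID, @NextID and @BatchSize variables are hypothetical names. Each iteration touches only the next key range instead of searching the whole table again from the start.

```sql
-- Values from the toy example; inside the procedure these would be its parameters.
DECLARE @FullOldUrl nvarchar(500) = N'https://old.com';
DECLARE @FullNewUrl nvarchar(500) = N'https://new.com';
DECLARE @Pattern    nvarchar(502) = N'%' + @FullOldUrl + N'%';
DECLARE @BatchSize  int = 8196;
DECLARE @LastID     int = 0;   -- assumption: all IDs are greater than 0
DECLARE @NextID     int;

WHILE 1 = 1
BEGIN
    -- Upper bound of the next range of @BatchSize keys after @LastID.
    -- MAX over an empty set yields NULL, which ends the loop.
    SELECT @NextID = MAX(k.[ID])
    FROM (
        SELECT TOP (@BatchSize) a.[ID]
        FROM [dbo].[tbl_ana_Articles] AS a
        WHERE a.[ID] > @LastID
        ORDER BY a.[ID]
    ) AS k;

    IF @NextID IS NULL BREAK;

    -- Only rows in the current key range are examined and updated.
    UPDATE a
    SET [_ATTRIBUTES]   = REPLACE(a.[_ATTRIBUTES],   @FullOldUrl, @FullNewUrl)
       ,[_DOCUMENTS]    = REPLACE(a.[_DOCUMENTS],    @FullOldUrl, @FullNewUrl)
       ,[_SEO]          = REPLACE(a.[_SEO],          @FullOldUrl, @FullNewUrl)
       ,[_TRANSLATIONS] = REPLACE(a.[_TRANSLATIONS], @FullOldUrl, @FullNewUrl)
       ,[_TAGS]         = REPLACE(a.[_TAGS],         @FullOldUrl, @FullNewUrl)
       ,[_NOTES]        = REPLACE(a.[_NOTES],        @FullOldUrl, @FullNewUrl)
    FROM [dbo].[tbl_ana_Articles] AS a
    WHERE a.[ID] > @LastID
      AND a.[ID] <= @NextID
      AND (   a.[_ATTRIBUTES]   LIKE @Pattern
           OR a.[_DOCUMENTS]    LIKE @Pattern
           OR a.[_SEO]          LIKE @Pattern
           OR a.[_TRANSLATIONS] LIKE @Pattern
           OR a.[_TAGS]         LIKE @Pattern
           OR a.[_NOTES]        LIKE @Pattern);

    SET @LastID = @NextID;
END;
```

Note that the loop ends when the keys run out, not when an update affects zero rows, so a range with no matching rows does not stop it early.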
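There are other things to experiment with, like PAGLOCK and RECOMPILE hints on the UPDATE, the size of each batch, parallelism, batch mode on the initial SELECT that populates the temp table, etc. As with all query tuning advice, whether they work out or not will depend on various local factors, as a wise bird once said.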
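For reference, this is roughly where those two hints would go on the UPDATE from the question. It is a sketch for experimentation only: the post gives no measurements, so whether either hint helps on the real table is an open question, and the @BatchSize variable is a hypothetical addition that makes the batch size easy to vary.

```sql
-- Values from the toy example; inside the procedure these would be its parameters.
DECLARE @FullOldUrl nvarchar(500) = N'https://old.com';
DECLARE @FullNewUrl nvarchar(500) = N'https://new.com';
DECLARE @BatchSize  int = 8196;  -- hypothetical variable, to try different sizes

WHILE 1 = 1
BEGIN
    UPDATE TOP (@BatchSize) [dbo].[tbl_ana_Articles] WITH (PAGLOCK)  -- table hint
    SET [_ATTRIBUTES]   = REPLACE([_ATTRIBUTES],   @FullOldUrl, @FullNewUrl)
       ,[_DOCUMENTS]    = REPLACE([_DOCUMENTS],    @FullOldUrl, @FullNewUrl)
       ,[_SEO]          = REPLACE([_SEO],          @FullOldUrl, @FullNewUrl)
       ,[_TRANSLATIONS] = REPLACE([_TRANSLATIONS], @FullOldUrl, @FullNewUrl)
       ,[_TAGS]         = REPLACE([_TAGS],         @FullOldUrl, @FullNewUrl)
       ,[_NOTES]        = REPLACE([_NOTES],        @FullOldUrl, @FullNewUrl)
    WHERE [_ATTRIBUTES]   LIKE '%' + @FullOldUrl + '%'
       OR [_DOCUMENTS]    LIKE '%' + @FullOldUrl + '%'
       OR [_SEO]          LIKE '%' + @FullOldUrl + '%'
       OR [_TRANSLATIONS] LIKE '%' + @FullOldUrl + '%'
       OR [_TAGS]         LIKE '%' + @FullOldUrl + '%'
       OR [_NOTES]        LIKE '%' + @FullOldUrl + '%'
    OPTION (RECOMPILE);  -- query hint: compile with the current variable values

    -- Must directly follow the UPDATE so @@ROWCOUNT still refers to it.
    IF (@@ROWCOUNT <= 0) BREAK;
END;
```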