Questions asked by Michael B - dba

Michael B

Asked: 2021-02-17 19:21:43 +0800 CST

Does a procedure that uses a TVP get slower as the values in the TVP get larger?

3

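A legacy application has a nightly job that repeatedly calls some stored procedures using a TVP, passing in batches of 10,000 ids to process, in sequential order. Now that the ids are in the millions, the process seems to take noticeably longer. Roughly the same number of batch calls run each night, but profiling suggests the procedure itself is getting slower. We checked the usual culprits: we rebuilt the indexes, updated statistics on the tables in use, and tried recompiling the procedure. Nothing fixed the regression.

The procedure does some processing and returns a few result sets, each of which may have a cardinality of around 10,000 rows. A colleague looked at it and fixed the performance regression simply by adding the following to the top of the stored procedure: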

select id into #t from @ids

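and then replacing all usages of @ids with #t.

I was surprised by this simple fix and wanted to understand it better, so I tried to create a very simple reproduction.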

create table dbo.ids
(
   id int primary key clustered,
   timestamp
);

create type dbo.tvp as table(id int primary key clustered)

insert into dbo.ids(id)
select row_number() over (order by 1/0)
from string_split(space(1414),' ') a,string_split(space(1414),' ') b
go
create or alter procedure dbo.tvp_proc
(
    @ids dbo.tvp readonly
)
as
begin
    declare @_ int = 0, @r int = 5;
    while(@r > 0)
        select @_ = count(*), @r -= 1
        from dbo.ids i
        where exists (
            select 1
            from @ids t
            where t.id = i.id     
        );
end 
go
create or alter procedure dbo.temp_proc
(
    @ids dbo.tvp readonly
)
as
begin
    select * into #t from @ids
    declare @_ int = 0, @r int = 5;
    while(@r > 0)
        select @_ = count(*), @r -= 1
        from dbo.ids i
        where exists (
            select 1
            from #t t
            where t.id = i.id     
        );
end

Here is my simple benchmark.

set nocount on;
declare @s nvarchar(4000)=
'declare @ids tvp;
insert into @ids(id)
select @init + row_number() over (order by 1/0)
from string_split(space(99),char(32)) a,string_split(space(99),char(32)) b
declare @s datetime2 = sysutcdatetime()
create table #d(_ int)
insert into #d
exec dbo.tvp_proc @ids
print concat(right(concat(space(10),format(@init,''N0'')),10),char(9),datediff(ms, @s, sysutcdatetime()))',
@params nvarchar(20)=N'@init int'
print 'tvp result'
exec sp_executesql @s,@params,10000000
exec sp_executesql @s,@params,1000000
exec sp_executesql @s,@params,100000
exec sp_executesql @s,@params,10000
select @s=replace(@s,'tvp_proc','temp_proc')
print 'temp table result'
exec sp_executesql @s,@params,10000000
exec sp_executesql @s,@params,1000000
exec sp_executesql @s,@params,100000
exec sp_executesql @s,@params,10000

Running this benchmark on my machine produces the following results (starting id offset, then elapsed milliseconds):

tvp result
10,000,000  653
 1,000,000  341
   100,000  42
    10,000  12
temp table result
10,000,000  52
 1,000,000  60
   100,000  57
    10,000  59

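The results suggest that the TVP approach gets slower as the ids inside it get larger, while the temp table stays fairly consistent. Does anyone know why referencing a TVP with larger values is slower than referencing a temp table?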

Michael B

Asked: 2019-08-07 10:57:46 +0800 CST

How can I detect what is changing my scoped configuration?

4

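I've noticed that I have a database whose scoped configuration, namely MAXDOP, keeps getting reset.

Is there any log that shows who or what process is causing these configuration changes? I'm on Microsoft SQL Server 2016 (SP1-CU7-GDR) (KB4057119) - 13.0.4466.4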
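For context, the current setting can be read from the `sys.database_scoped_configurations` catalog view. The sketch below also snapshots it into a logging table so that repeated runs (for example from an Agent job) show roughly when the value changed. The table name `dbo.scoped_config_history` is hypothetical, and this only narrows down the time of a change; it does not identify who made it.

```sql
-- Read the current database-scoped MAXDOP value (SQL Server 2016 and later).
select configuration_id, name, value, value_for_secondary
from sys.database_scoped_configurations
where name = N'MAXDOP';

-- Hypothetical history table: an assumption for illustration only.
create table dbo.scoped_config_history
(
    captured_at datetime2    not null default sysutcdatetime(),
    name        nvarchar(60) not null,
    value       sql_variant  null
);

-- Run this periodically; comparing consecutive rows shows when a value changed.
insert into dbo.scoped_config_history(name, value)
select name, value
from sys.database_scoped_configurations;
```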

Michael B

Asked: 2019-06-15 06:33:22 +0800 CST

How do I force SQL Server to use my spatial index through a view?

6

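I have some tables containing transactions on properties, whose locations are stored as lat/long pairs. (The real tables have more columns and data points than my example schema.)

A common request is to find transactions that occurred within X miles of a particular point, and to retrieve only the 5 most recent transactions for each nearby property.

To accomplish this, I decided to add a view that encapsulates the most-recent logic: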

create or alter view dbo.v_example
with schemabinding as
select example_id
      ,transaction_dt
      ,latitude
      ,longitude
      ,latlong
      ,most_recent= iif(row_number() over (partition by latitude,longitude order by transaction_dt desc) < 5,1,null)
from dbo.example;

So a query might look like this:

select *
from dbo.v_example
where latlong.STDistance(geography::Point(40,-74,4326)) <=1609.344e1
and most_recent = 1

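Unfortunately, when I query through the view, SQL Server does not want to use the spatial index. If I remove schemabinding and try adding a hint on the view, I get an error that the query processor could not produce a plan.

How can I encapsulate the logic and still have it use my spatial index?

Here is a db<>fiddle with sample data and plan shapes.

The real table is much larger, and scanning it, then doing clustered index seeks, then finding the nearby points is much slower.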
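For comparison, this is roughly what hinting the spatial index looks like when the base table is queried directly rather than through the view. It is only a sketch: the index name `six_example_latlong` is hypothetical, since the real index name is not shown here, and the query leaves out the most-recent logic that the view adds.

```sql
-- Query the base table directly and hint the spatial index.
-- six_example_latlong is a placeholder; substitute the real spatial index name.
select example_id
      ,transaction_dt
      ,latitude
      ,longitude
from dbo.example with (index(six_example_latlong))
where latlong.STDistance(geography::Point(40,-74,4326)) <= 1609.344e1;
```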

Michael B

Asked: 2019-05-03 06:13:46 +0800 CST

Why does this derived table improve performance?

18

I have a query that takes a json string as a parameter. The json is an array of latitude, longitude pairs. An example input might look like this.

declare @json nvarchar(max)= N'[[40.7592024,-73.9771259],[40.7126492,-74.0120867]
,[41.8662374,-87.6908788],[37.784873,-122.4056546]]';

It calls a TVF that counts the number of POIs within 1, 3, 5, and 10 miles of a geography point.

create or alter function [dbo].[fn_poi_in_dist](@geo geography)
returns table
with schemabinding as
return 
select count_1  = sum(iif(LatLong.STDistance(@geo) <= 1609.344e * 1,1,0e))
      ,count_3  = sum(iif(LatLong.STDistance(@geo) <= 1609.344e * 3,1,0e))
      ,count_5  = sum(iif(LatLong.STDistance(@geo) <= 1609.344e * 5,1,0e))
      ,count_10 = count(*)
from dbo.point_of_interest
where LatLong.STDistance(@geo) <= 1609.344e * 10

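The purpose of the json query is to call this function in bulk. If I call it like this, performance is very poor, taking nearly 10 seconds for just 4 points: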

select row=[key]
      ,count_1
      ,count_3
      ,count_5
      ,count_10
from openjson(@json)
cross apply dbo.fn_poi_in_dist(
            geography::Point(
                convert(float,json_value(value,'$[0]'))
               ,convert(float,json_value(value,'$[1]'))
               ,4326))

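Plan = https://www.brentozar.com/pastetheplan/?id=HJDCYd_o4

However, moving the construction of the geography value into a derived table improves performance dramatically, and the query completes in about 1 second.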

select row=[key]
      ,count_1
      ,count_3
      ,count_5
      ,count_10
from (
select [key]
      ,geo = geography::Point(
                convert(float,json_value(value,'$[0]'))
               ,convert(float,json_value(value,'$[1]'))
               ,4326)
from openjson(@json)
) a
cross apply dbo.fn_poi_in_dist(geo)

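Plan = https://www.brentozar.com/pastetheplan/?id=HkSS5_OoE

The plans look virtually identical. Neither uses parallelism and both use the spatial index. The slow plan has an additional lazy spool, which I can eliminate with the hint option(no_performance_spool). But query performance does not change. It is still much slower.

Running both queries in a single batch with the hint added weighs the two queries equally.

SQL Server version = Microsoft SQL Server 2016 (SP1-CU7-GDR) (KB4057119) - 13.0.4466.4 (X64)

So my question is: why does this matter? How can I know when I should compute values inside a derived table?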
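For reference, this is how the hint attaches to the slow form of the query. The sketch simply combines the sample input and the first query above with the `option (no_performance_spool)` clause; it assumes `dbo.fn_poi_in_dist` and its underlying table exist as defined earlier.

```sql
declare @json nvarchar(max) = N'[[40.7592024,-73.9771259],[40.7126492,-74.0120867]
,[41.8662374,-87.6908788],[37.784873,-122.4056546]]';

-- Slow form of the query, with the spool suppressed by a query hint.
select row=[key]
      ,count_1
      ,count_3
      ,count_5
      ,count_10
from openjson(@json)
cross apply dbo.fn_poi_in_dist(
            geography::Point(
                convert(float,json_value(value,'$[0]'))
               ,convert(float,json_value(value,'$[1]'))
               ,4326))
option (no_performance_spool);
```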

Michael B

Asked: 2014-09-13 09:40:31 +0800 CST

Access write conflict with a view and INSTEAD OF triggers?

0

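I have a legacy linked table in an MS Access database that is linked to a SQL Server table.

The linked table is periodically synced to a different set of tables, which for various reasons are split up and have more columns than the users wish to see. I would like to get rid of the legacy table, and the headache of syncing it, by creating a view with INSTEAD OF triggers to handle the data access. Suppose my view looks like this: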

Create view Compatibility_View
with schemabinding,view_metadata
as
select   a.Name,a.col_1 aliasedName,a.col_2 aliasedName2
        ,b.col_1 aliasNamed3 ,convert(float,b.col_2) aliasedName4
        ,a.modified_date,a.modified_by
from dbo.some a
join dbo.other b
  on a.some_Id = b.some_Id
where a.isActive = 1;

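I also have INSTEAD OF triggers to soft-delete records and route attributes to the appropriate tables.

I also update the modified_date and modified_by columns automatically. (These used to be handled by triggers on the legacy table.)

Unfortunately, implementing this second set of requirements causes Access to warn after every edit that the data has changed. Tracing the queries that Access issues, it appears to run an update against the view that checks that none of the other columns have changed. For example: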

update "dbo"."Compatibility_View"
set "aliasedName"='v'
where "aliasedName2"=45 and "aliasedName3"='horse'
and "aliasedName4"=1.3e2 and "modified_date"='09/10/2014 12:35:00.00'
and "modified_by"='AD\SomeUser'

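Since my trigger changes the modified_date and modified_by columns, does Access see that as a discrepancy and raise the ugly error prompt?

Users like seeing the modified_date and modified_by columns in Access, so removing that part is not really an option. How can I keep these columns in the view without upsetting Access?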
