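Short version: Is there a way to find the partition scheme used for FILESTREAM data by a clustered index created on a partitioned table?
Longer version: Suppose you want to partition a table that stores FILESTREAM data. The documentation says:
If the table is partitioned, the FILESTREAM_ON clause must be included and must specify a partition scheme of FILESTREAM filegroups that uses the same partition function and partition columns as the partition scheme for the table. Otherwise, an error is raised.
So you create filegroups for row data and FILESTREAM data, one partition function, and two partition schemes (again, one for row data and one for FILESTREAM data), like this: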
USE [master]
GO
CREATE DATABASE [FSPartitionTest]
ON PRIMARY
( NAME = N'FSPartitionTest', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL16.MSSQLSERVER\MSSQL\DATA\FSPartitionTest.mdf'),
FILEGROUP [DataPartitionA]
( NAME = N'Data_A', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL16.MSSQLSERVER\MSSQL\DATA\Data_A.ndf'),
FILEGROUP [DataPartitionB]
( NAME = N'Data_B', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL16.MSSQLSERVER\MSSQL\DATA\Data_B.ndf'),
FILEGROUP [FSPartitionA] CONTAINS FILESTREAM DEFAULT
( NAME = N'FS_A', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL16.MSSQLSERVER\MSSQL\DATA\FS_A'),
FILEGROUP [FSPartitionB] CONTAINS FILESTREAM
( NAME = N'FS_B', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL16.MSSQLSERVER\MSSQL\DATA\FS_B')
LOG ON
( NAME = N'FSPartitionTest_log', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL16.MSSQLSERVER\MSSQL\DATA\FSPartitionTest_log.ldf')
GO
USE [FSPartitionTest]
GO
CREATE PARTITION FUNCTION [APartitionFunction] (INT)
AS RANGE LEFT FOR VALUES (1);
GO
CREATE PARTITION SCHEME [DataPartitionScheme]
AS PARTITION [APartitionFunction]
TO ([DataPartitionA], [DataPartitionB]);
GO
CREATE PARTITION SCHEME [FSPartitionScheme]
AS PARTITION [APartitionFunction]
TO ([FSPartitionA], [FSPartitionB]);
GO
CREATE TABLE [FilestreamTable] (
[Partition] INT NOT NULL
, [Id] [uniqueidentifier] ROWGUIDCOL NOT NULL
CONSTRAINT [UX_FilestreamTable_Id] UNIQUE NONCLUSTERED ON [PRIMARY]
, [FilestreamData] VARBINARY(MAX) FILESTREAM NULL
, INDEX [UX_FilestreamTable_Partition_Id] UNIQUE CLUSTERED (
[Partition],
[Id]
) ON [DataPartitionScheme]([Partition]) FILESTREAM_ON [FSPartitionScheme]
) ON [DataPartitionScheme]([Partition]) FILESTREAM_ON [FSPartitionScheme]
GO
You can then query the partition schemes and indexes like this:
SELECT *
FROM sys.partition_schemes
SELECT T.[name], I.[name], I.[data_space_id]
FROM sys.tables AS T
JOIN sys.indexes AS I
ON T.[object_id] = I.[object_id]
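Both partition schemes show up, both use the same partition function, and the clustered index references DataPartitionScheme in sys.indexes.
However, the partition scheme used for the FILESTREAM data is not referenced anywhere. In this case there is only one other partition scheme that uses the same partition function and targets FILESTREAM filegroups, so in any real-world scenario we would be done. Nobody would add another partition scheme, right?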
CREATE PARTITION SCHEME [FSPartitionSchemeB]
AS PARTITION [APartitionFunction]
TO ([FSPartitionB], [FSPartitionA]);
GO
CREATE TABLE [FilestreamTableB] (
[Partition] INT NOT NULL
, [Id] [uniqueidentifier] ROWGUIDCOL NOT NULL
CONSTRAINT [UX_FilestreamTableB_Id] UNIQUE NONCLUSTERED ON [PRIMARY]
, [FilestreamData] VARBINARY(MAX) FILESTREAM NULL
, INDEX [UX_FilestreamTableB_Partition_Id] UNIQUE CLUSTERED (
[Partition],
[Id]
) ON [DataPartitionScheme]([Partition]) FILESTREAM_ON [FSPartitionSchemeB]
) ON [DataPartitionScheme]([Partition]) FILESTREAM_ON [FSPartitionSchemeB]
GO
CREATE PARTITION SCHEME [FSPartitionSchemeC]
AS PARTITION [APartitionFunction]
ALL TO ([FSPartitionA]);
GO
CREATE TABLE [FilestreamTableC] (
[Partition] INT NOT NULL
, [Id] [uniqueidentifier] ROWGUIDCOL NOT NULL
CONSTRAINT [UX_FilestreamTableC_Id] UNIQUE NONCLUSTERED ON [PRIMARY]
, [FilestreamData] VARBINARY(MAX) FILESTREAM NULL
, INDEX [UX_FilestreamTableC_Partition_Id] UNIQUE CLUSTERED (
[Partition],
[Id]
) ON [DataPartitionScheme]([Partition]) FILESTREAM_ON [FSPartitionSchemeC]
) ON [DataPartitionScheme]([Partition]) FILESTREAM_ON [FSPartitionSchemeC]
GO
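How can we determine which table uses which partition scheme? Is there any view that directly references the data_space_id used for an index's FILESTREAM data?
Workaround:
I noticed one thing, but I am not entirely sure it is 100% reliable. The view sys.partitions has a filestream_filegroup_id column: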
SELECT T.[name]
, I.[name]
, I.[data_space_id]
, P.[partition_number]
, FG.[name]
FROM sys.tables AS T
JOIN sys.indexes AS I
ON T.[object_id] = I.[object_id]
JOIN sys.partitions AS P
ON I.[object_id] = P.[object_id]
AND I.[index_id] = P.[index_id]
JOIN sys.filegroups AS FG
ON P.[filestream_filegroup_id] = FG.[data_space_id]
The view sys.destination_data_spaces also has a destination_id column, which I believe matches partition_number:
SELECT PS.[name]
, DDS.[destination_id]
, FG.[name]
FROM sys.partition_schemes AS PS
JOIN sys.destination_data_spaces AS DDS
ON PS.[data_space_id] = DDS.[partition_scheme_id]
JOIN sys.filegroups AS FG
ON DDS.[data_space_id] = FG.[data_space_id]
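One way to sanity-check that assumption is to test it on row data, where sys.indexes names the partition scheme explicitly. The sketch below is only a check under that assumption: it pairs each partition with the scheme destination that has the same ordinal and compares the scheme's filegroup with the filegroup recorded for the partition's in-row allocation unit. The column aliases are made up for illustration.

```sql
-- Sanity check for row data: does destination_id line up with partition_number?
-- Every row should report OrdinalsAgree = 1 if the assumption holds.
SELECT T.[name]              AS [Table]
     , I.[name]              AS [Index]
     , P.[partition_number]
     , DDS.[destination_id]
     , SFG.[name]            AS [SchemeFilegroup]      -- filegroup the scheme maps this ordinal to
     , AFG.[name]            AS [AllocationFilegroup]  -- filegroup the partition's rows are allocated in
     , CASE WHEN SFG.[data_space_id] = AFG.[data_space_id] THEN 1 ELSE 0 END AS [OrdinalsAgree]
FROM sys.tables AS T
JOIN sys.indexes AS I
    ON T.[object_id] = I.[object_id]
-- only indexes stored on a partition scheme survive this join
JOIN sys.partition_schemes AS PS
    ON I.[data_space_id] = PS.[data_space_id]
JOIN sys.partitions AS P
    ON I.[object_id] = P.[object_id]
    AND I.[index_id] = P.[index_id]
-- the assumption under test: same ordinal on both sides
JOIN sys.destination_data_spaces AS DDS
    ON PS.[data_space_id] = DDS.[partition_scheme_id]
    AND DDS.[destination_id] = P.[partition_number]
JOIN sys.filegroups AS SFG
    ON DDS.[data_space_id] = SFG.[data_space_id]
-- in-row allocation units (type 1) are keyed by hobt_id
JOIN sys.allocation_units AS AU
    ON AU.[container_id] = P.[hobt_id]
    AND AU.[type] = 1
JOIN sys.filegroups AS AFG
    ON AU.[data_space_id] = AFG.[data_space_id];
```

Agreement on row data does not prove the same holds for FILESTREAM data, but a mismatch here would rule the assumption out.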
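So I suppose one could check, for each partition, whether the filegroup referenced by sys.destination_data_spaces matches the filegroup referenced by sys.partitions: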
WITH A AS (
SELECT T.[name] AS [Table]
, I.[name] AS [Index]
, PS.[name] AS [PartitionScheme] -- for row data
, FSPS.[name] AS [FS_PartitionScheme] -- candidate
, FG.[name] AS [FS_DestinationFilegroup] -- referenced by partition scheme
, PFG.[name] AS [FS_PartitionFilegroup] -- referenced by partition
, CASE WHEN FG.[name] = PFG.[name] THEN 1 ELSE 0 END AS [FilegroupsMatch]
, COUNT(*) OVER (PARTITION BY T.[object_id], I.[index_id], FSPS.[data_space_id]) AS [NumPartitions]
, SUM(CASE WHEN FG.[name] = PFG.[name] THEN 1 ELSE 0 END) OVER (PARTITION BY T.[object_id], I.[index_id], FSPS.[data_space_id]) AS [SumMatches]
FROM sys.tables AS T
JOIN sys.indexes AS I
ON T.[object_id] = I.[object_id]
--partition scheme used for row data
JOIN sys.partition_schemes AS PS
ON I.[data_space_id] = PS.[data_space_id]
--look for candidate partition schemes, that are used for FILESTREAM data
--must use the same partition function
JOIN sys.partition_schemes AS FSPS
ON PS.[function_id] = FSPS.[function_id]
AND PS.[data_space_id] <> FSPS.[data_space_id]
--destination must be a FILESTREAM filegroup
JOIN sys.destination_data_spaces AS DDS
ON FSPS.[data_space_id] = DDS.[partition_scheme_id]
JOIN sys.filegroups AS FG
ON DDS.[data_space_id] = FG.[data_space_id]
AND FG.[type_desc] = 'FILESTREAM_DATA_FILEGROUP'
--get the partition where partition_number matches destination_id
JOIN sys.partitions AS P
ON I.[object_id] = P.[object_id]
AND I.[index_id] = P.[index_id]
AND P.[partition_number] = DDS.[destination_id]
--find the filegroup referenced by the partition
LEFT JOIN sys.filegroups AS PFG
ON P.[filestream_filegroup_id] = PFG.[data_space_id]
)
SELECT [Table], [Index], [PartitionScheme], [FS_PartitionScheme]
FROM A
WHERE [NumPartitions] = [SumMatches]
GROUP BY [Table], [Index], [PartitionScheme], [FS_PartitionScheme]
This should return one row per index, unless some joker does this:
CREATE PARTITION SCHEME [FSPartitionSchemeB2]
AS PARTITION [APartitionFunction]
TO ([FSPartitionB], [FSPartitionA]);
GO
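So, if someone can confirm that partition_number really always matches destination_id, we can at least determine the partition scheme's unique definition (excluding NEXT USED), but there is still no explicit data_space_id.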
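Ambiguous cases such as FSPartitionSchemeB2 could at least be listed up front. The following is a rough sketch, assuming SQL Server 2017 or later for STRING_AGG: it builds an ordered list of destination filegroups per partition scheme and reports pairs of schemes that share both the partition function and that list. NEXT USED entries are left out by keeping only destinations up to the function's fanout. The CTE name and aliases are made up for illustration.

```sql
-- List partition schemes whose definitions are indistinguishable
-- (same partition function, same filegroups in the same order).
WITH SchemeDefinition AS (
    SELECT PS.[data_space_id]
         , PS.[name]
         , PS.[function_id]
         , STRING_AGG(CAST(DDS.[data_space_id] AS VARCHAR(10)), ',')
               WITHIN GROUP (ORDER BY DDS.[destination_id]) AS [Destinations]
    FROM sys.partition_schemes AS PS
    JOIN sys.partition_functions AS PF
        ON PS.[function_id] = PF.[function_id]
    JOIN sys.destination_data_spaces AS DDS
        ON PS.[data_space_id] = DDS.[partition_scheme_id]
        -- destinations beyond the fanout are NEXT USED filegroups; ignore them
        AND DDS.[destination_id] <= PF.[fanout]
    GROUP BY PS.[data_space_id], PS.[name], PS.[function_id]
)
SELECT A.[name] AS [PartitionScheme]
     , B.[name] AS [IdenticalTo]
FROM SchemeDefinition AS A
JOIN SchemeDefinition AS B
    ON A.[function_id] = B.[function_id]
    AND A.[Destinations] = B.[Destinations]
    -- report each pair once
    AND A.[data_space_id] < B.[data_space_id];
```

Any pair returned here is a pair the filegroup-matching workaround cannot tell apart.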
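Found a way to do this outside of SSMS using PowerShell. I used dbatools, but it probably works with the SqlServer module as well. It would still be interesting to know whether a management view provides this information.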
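One candidate worth checking is the filestream_data_space_id column of sys.tables, which the catalog view documentation describes as the data space ID of a FILESTREAM filegroup or of a partition scheme made up of FILESTREAM filegroups. A minimal sketch, not verified against the test database above, that resolves it through sys.data_spaces (the aliases are made up for illustration):

```sql
-- Resolve each table's FILESTREAM data space (filegroup or partition scheme) by name.
SELECT T.[name]       AS [Table]
     , DS.[name]      AS [FS_DataSpace]
     , DS.[type_desc] AS [FS_DataSpaceType]  -- a partition scheme or a FILESTREAM filegroup
FROM sys.tables AS T
JOIN sys.data_spaces AS DS
    ON T.[filestream_data_space_id] = DS.[data_space_id];
```

If this behaves as documented, FilestreamTable, FilestreamTableB and FilestreamTableC should each report their own FILESTREAM partition scheme. Note that the column lives on the table rather than on the index.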