SQL Server - How data pages are stored when using a clustered index

Question

Niels Broertjes

Asked: 2021-12-24 22:29:14 +0800 CST

Table partitioning on an existing table using different filegroups


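I have an existing table on the primary filegroup that I want to partition. The partition key is the year, which is a computed column. I want to partition the table so that eventually each year's data sits on its own filegroup. I want to split off 2 years first, so that later I can test how to split off the remaining data with the split command.

I can create the partition function and scheme, and I can see that the data for a given year is in the right partition, but I cannot get the physical data into the right filegroup. The data still seems to reside in the primary filegroup. I tried rebuilding the index, but that did not move the data to the right filegroup either.

In the end the table will have a clustered columnstore index, but I also tried a clustered rowstore index. The reason is that SQL Server does not seem to allow splitting and merging non-empty partitions on a columnstore index (I tried a few splits and merges, all with the same result), so I thought the rowstore index would at least work. If you have any suggestions or comments, please add them here. By the way, I am using SQL Server 2019.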

Now for the code. I am using the StackOverflow2013 database:

use StackOverflow2013;
go

-- Create file groups for partitions
alter database [StackOverflow2013]
add filegroup StackOverflow2013_2008;

ALTER DATABASE [StackOverflow2013]
    ADD FILE 
    (
    NAME = [StackOverflow2013_2008],
    FILENAME = 'E:\DATA\StackOverflow2013_2008.ndf',
        SIZE = 1024 KB, 
        MAXSIZE = UNLIMITED, 
        FILEGROWTH = 512 MB
    ) TO FILEGROUP [StackOverflow2013_2008]
    
alter database [StackOverflow2013]
add filegroup StackOverflow2013_2009;

ALTER DATABASE [StackOverflow2013]
    ADD FILE 
    (
    NAME = [StackOverflow2013_2009],
    FILENAME = 'E:\DATA\StackOverflow2013_2009.ndf',
        SIZE = 1024 KB, 
        MAXSIZE = UNLIMITED, 
        FILEGROWTH = 512 MB
    ) TO FILEGROUP [StackOverflow2013_2009]


-- Drop the current default index, we want to build one later on the partition key
ALTER TABLE [dbo].[Comments] DROP CONSTRAINT [PK_Comments_Id] WITH ( ONLINE = OFF )

-- Add partition key column
alter table [StackOverflow2013].[dbo].[Comments]
add [year] as (datepart(year, CreationDate));

go
-- Add partition function based on year 
-- For now we only want 2008 and 2009, other years will be migrated later to test with split function
create partition function fun_Comments(int)
as range left for values (2008, 2009);

-- Add partition scheme
create partition scheme scheme_Comments
as partition fun_Comments
to (StackOverflow2013_2008, StackOverflow2013_2009, [Primary]);

-- Check the partition numbers and who's next
SELECT DestinationId = DestinationDataSpaces.destination_id
    ,FilegroupName = Filegroups.name
    ,PartitionHighBoundaryValue = PartitionRangeValues.value
    ,IsNextUsed = CASE 
        WHEN DestinationDataSpaces.destination_id > 1
            AND LAG(PartitionRangeValues.value, 1) OVER (
                ORDER BY DestinationDataSpaces.destination_id ASC
                ) IS NULL
            THEN 1
        ELSE 0
        END
FROM sys.partition_schemes AS PartitionSchemes
INNER JOIN sys.destination_data_spaces AS DestinationDataSpaces ON PartitionSchemes.data_space_id = DestinationDataSpaces.partition_scheme_id
INNER JOIN sys.filegroups AS Filegroups ON DestinationDataSpaces.data_space_id = Filegroups.data_space_id
LEFT OUTER JOIN sys.partition_range_values AS PartitionRangeValues ON PartitionSchemes.function_id = PartitionRangeValues.function_id
    AND DestinationDataSpaces.destination_id = PartitionRangeValues.boundary_id
WHERE PartitionSchemes.name = N'scheme_Comments'
ORDER BY DestinationId ASC;

Check the rows in partition 1:

SELECT * FROM Comments 
WHERE $PARTITION.fun_Comments(year) = 1;

Check the rows in partition 2:

SELECT * FROM Comments 
WHERE $PARTITION.fun_Comments(year) = 2;
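
A per-partition summary can be easier to read than selecting every row. The following is a minimal sketch, assuming the `fun_Comments` function and the `[year]` column from the script above; the column aliases are only illustrative.

```sql
USE StackOverflow2013;
GO

-- Count rows per partition number as computed by the partition function.
-- This shows only the logical mapping of values to partitions; it does not
-- show which filegroup the rows are physically stored on.
SELECT
    $PARTITION.fun_Comments([year]) AS PartitionNumber,
    MIN([year])                     AS MinYear,
    MAX([year])                     AS MaxYear,
    COUNT(*)                        AS NumRows
FROM dbo.Comments
GROUP BY $PARTITION.fun_Comments([year])
ORDER BY PartitionNumber;
```

With `RANGE LEFT` and the boundary values 2008 and 2009, partition 1 covers years up to and including 2008, partition 2 covers 2009, and partition 3 covers everything after 2009.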

Check the file sizes:

(very large query)
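
The original query is not reproduced here. As a rough stand-in, and not the query the asker actually ran, a compact check of where each partition of the table physically lives could look like the sketch below. It assumes the table is `dbo.Comments` and reads only the standard catalog views `sys.partitions`, `sys.allocation_units`, and `sys.filegroups`.

```sql
USE StackOverflow2013;
GO

-- Show, for the heap or clustered index of dbo.Comments, which filegroup
-- each partition is allocated on and roughly how much space it uses.
SELECT
    p.partition_number        AS PartitionNumber,
    fg.name                   AS FilegroupName,
    p.rows                    AS NumRows,
    au.total_pages * 8 / 1024 AS TotalMB      -- pages are 8 KB each
FROM sys.partitions AS p
INNER JOIN sys.allocation_units AS au
    ON au.container_id = p.hobt_id
   AND au.type IN (1, 3)                      -- in-row and row-overflow data
INNER JOIN sys.filegroups AS fg
    ON fg.data_space_id = au.data_space_id
WHERE p.object_id = OBJECT_ID(N'dbo.Comments')
  AND p.index_id IN (0, 1)                    -- 0 = heap, 1 = clustered index
ORDER BY p.partition_number;
```

If the table is not actually built on the partition scheme, this should return a single partition on one filegroup, regardless of what `$PARTITION` reports for individual rows.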

-- Create a new clustered index to distribute the data correctly

create clustered index [CCIX_Comments] ON [dbo].[Comments] (year)

Check the file sizes again.

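So it looks to me as if all the data is actually still in the primary filegroup, because the new filegroups are empty. The table is 7 GB, so I would expect at least some data in them.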

So basically my question is: how do I correctly redistribute the data across the files in the filegroups in this situation?

1 Answer


MBuschi · Answer 1 · 2021-12-25T01:20:50+08:00

Best Answer


Your create index statement needs an ON clause in order to use the partition scheme you created:

create clustered index [CCIX_Comments] ON [dbo].[Comments]
( [year] ASC )
ON [scheme_Comments]([year])
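
Put together with the script from the question, the fix might look like the sketch below. Two points here are assumptions that go beyond the answer. First, the non-partitioned `CCIX_Comments` index from the question may already exist, so it is dropped before being recreated. Second, as far as I know SQL Server requires a computed column used as a partitioning column to be marked `PERSISTED`, so the `[year]` column is persisted before the index is built; if the column was already created as persisted, that step can be skipped.

```sql
USE StackOverflow2013;
GO

-- Assumption: the non-partitioned index from the question may already exist.
DROP INDEX IF EXISTS [CCIX_Comments] ON [dbo].[Comments];
GO

-- Assumption: a computed partitioning column has to be PERSISTED.
-- This rewrites the rows, so it can take a while on a 7 GB table.
ALTER TABLE [dbo].[Comments] ALTER COLUMN [year] ADD PERSISTED;
GO

-- Build the clustered index on the partition scheme instead of the default
-- filegroup; this is what physically moves the rows to the filegroups
-- that scheme_Comments maps each partition to.
CREATE CLUSTERED INDEX [CCIX_Comments] ON [dbo].[Comments]
( [year] ASC )
ON [scheme_Comments] ([year]);
GO
```

Without the `ON [scheme_Comments]([year])` clause, the index is created on the database's default filegroup, which is why the new filegroups stayed empty in the question.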
