我在数据仓库的暂存区中有一些表,我正在用从另一个系统中提取的一些扁平的、逗号分隔的文本中的数据填充这些表。当数据进入每个元素的父级层次结构时,将在标记为ParentCode01
...的列ParentCode11
中显示当前节点的直接父级所在的位置,ParentCode01
顶级父级可以在任何列中(ParentCode11
主要是NULL
)。
CREATE TABLE CostCentreHierarchy (
CostCentreCode varchar(10) NOT NULL CONSTRAINT pCostCentreCode_CostCentreHierarchy PRIMARY KEY,
CostCentreDesc varchar(100),
ValidFromDate varchar(10),
ValidToDate varchar(10),
ParentCode01 varchar(15),
ParentDesc01 varchar(100),
ParentCode02 varchar(15),
ParentDesc02 varchar(100),
ParentCode03 varchar(15),
ParentDesc03 varchar(100),
ParentCode04 varchar(15),
ParentDesc04 varchar(100),
ParentCode05 varchar(15),
ParentDesc05 varchar(100),
ParentCode06 varchar(15),
ParentDesc06 varchar(100),
ParentCode07 varchar(15),
ParentDesc07 varchar(100),
ParentCode08 varchar(15),
ParentDesc08 varchar(100),
ParentCode09 varchar(15),
ParentDesc09 varchar(100),
ParentCode10 varchar(15),
ParentDesc10 varchar(100),
ParentCode11 varchar(15),
ParentDesc11 varchar(100));
INSERT INTO CostCentreHierarchy
(CostCentreCode, CostCentreDesc, ValidFromDate, ValidToDate, ParentCode01, ParentDesc01, ParentCode02, ParentDesc02,
ParentCode03, ParentDesc03, ParentCode04, ParentDesc04, ParentCode05, ParentDesc05, ParentCode06, ParentDesc06)
VALUES
('0002000000', '0002000000', '01.07.1950', '31.12.9999', 'YA0201', 'YA0201', 'YA0200', 'YA0200', 'YA0000', 'Unit 1 - Admin', 'Y00000', 'Branch A - Admin', '1A', 'ADMINISTERED REPORTING', '1', 'Company'),
('0002000001', '0002000001', '01.07.1950', '31.12.9999', 'YA0301', 'YA0301', 'YA0300', 'YA0300', 'YA0000', 'Unit 1 - Admin', 'Y00000', 'Branch A - Admin', '1A', 'ADMINISTERED REPORTING', '1', 'Company'),
('0002000002', '0002000002', '01.07.1950', '31.12.9999', 'XA0101', 'XA0101', 'XA0100', 'XA0100', 'XA0000', 'Unit 3 - Admin', 'X00000', 'Branch B - Admin', '1A', 'ADMINISTERED REPORTING', '1', 'Company'),
('0002000003', '0002000003', '01.07.1950', '31.12.9999', 'YA0999', 'YA0999', 'YA0900', 'YA0900', 'YA0000', 'Unit 1 - Admin', 'Y00000', 'Branch A - Admin', '1A', 'ADMINISTERED REPORTING', '1', 'Company'),
('0002000004', '0002000004', '01.07.1950', '31.12.9999', 'YB0999', 'YB0999', 'YB0900', 'YB0900', 'YB0000', 'Unit 2 - Admin', 'Y00000', 'Branch A - Admin', '1A', 'ADMINISTERED REPORTING', '1', 'Company'),
('0002000005', '0002000005', '01.07.1950', '31.12.9999', 'YA0101', 'YA0101', 'YA0100', 'YA0100', 'YA0000', 'Unit 1 - Admin', 'Y00000', 'Branch A - Admin', '1A', 'ADMINISTERED REPORTING', '1', 'Company'),
('0002000006', '0002000006', '01.07.1950', '31.12.9999', 'XA0999', 'XA0999', 'XA0900', 'XA0900', 'XA0000', 'Unit 3 - Admin', 'X00000', 'Branch B - Admin', '1A', 'ADMINISTERED REPORTING', '1', 'Company'),
('0002000007', '0002000007', '01.07.1950', '31.12.9999', 'YA0302', 'YA0302', 'YA0300', 'YA0300', 'YA0000', 'Unit 1 - Admin', 'Y00000', 'Branch A - Admin', '1A', 'ADMINISTERED REPORTING', '1', 'Company'),
('0002000008', '0002000008', '01.07.1950', '31.12.9999', 'YA0999', 'YA0999', 'YA0900', 'YA0900', 'YA0000', 'Unit 1 - Admin', 'Y00000', 'Branch A - Admin', '1A', 'ADMINISTERED REPORTING', '1', 'Company'),
('0002000009', '0002000009', '01.07.1950', '31.12.9999', 'YA0999', 'YA0999', 'YA0900', 'YA0900', 'YA0000', 'Unit 1 - Admin', 'Y00000', 'Branch A - Admin', '1A', 'ADMINISTERED REPORTING', '1', 'Company');
当我将这些数据加载到我的数据仓库中时,我将它加载到一个具有如下父子关系的表中:
CREATE SEQUENCE CostCentreID_Sequence AS integer START WITH 1 NO CYCLE NO CACHE;
CREATE TABLE CostCentre (
CostCentreID int CONSTRAINT DF_Sequence_CostCentreID_CostCentre DEFAULT NEXT VALUE FOR CostCentreID_Sequence NOT NULL,
CostCentreCode varchar(15) Constraint uCostCentreCode_CostCentre Unique NOT NULL,
CostCentreDesc varchar(100) NOT NULL,
ValidFromDate date,
ValidToDate date,
ParentID int CONSTRAINT fParentID_CostCentre REFERENCES CostCentre(CostCentreID),
CONSTRAINT pCostCentreID_CostCentre PRIMARY KEY CLUSTERED (CostCentreID))
With (Data_Compression = Row);
CREATE INDEX iParentID ON CostCentre(ParentID);
因此,为了获得该格式,我有一个查询,该查询通过在每个级别中生成一个值来获取不同的值CostCentreCode
,如下所示:CostCentreDesc
UNION
WITH unflattened AS (
SELECT CCH.CostCentreCode,
CCH.CostCentreDesc,
CCH.ValidFromDate,
CCH.ValidToDate,
COALESCE(CCH.ParentCode01,'') AS ParentCostCentreCode
FROM CostCentreHierarchy AS CCH
UNION
SELECT CCH.ParentCode01,
CCH.ParentDesc01,
NULL AS ValidFromDate,
NULL AS ValidToDate,
COALESCE(CCH.ParentCode02,'') AS ParentCostCentreCode
FROM CostCentreHierarchy AS CCH
UNION
SELECT CCH.ParentCode02,
CCH.ParentDesc02,
NULL AS ValidFromDate,
NULL AS ValidToDate,
COALESCE(CCH.ParentCode03,'') AS ParentCostCentreCode
FROM CostCentreHierarchy AS CCH
UNION
SELECT CCH.ParentCode03,
CCH.ParentDesc03,
NULL AS ValidFromDate,
NULL AS ValidToDate,
COALESCE(CCH.ParentCode04,'') AS ParentCostCentreCode
FROM CostCentreHierarchy AS CCH
UNION
SELECT CCH.ParentCode04,
CCH.ParentDesc04,
NULL AS ValidFromDate,
NULL AS ValidToDate,
COALESCE(CCH.ParentCode05,'') AS ParentCostCentreCode
FROM CostCentreHierarchy AS CCH
UNION
SELECT CCH.ParentCode05,
CCH.ParentDesc05,
NULL AS ValidFromDate,
NULL AS ValidToDate,
COALESCE(CCH.ParentCode06,'') AS ParentCostCentreCode
FROM CostCentreHierarchy AS CCH
UNION
SELECT CCH.ParentCode06,
CCH.ParentDesc06,
NULL AS ValidFromDate,
NULL AS ValidToDate,
COALESCE(CCH.ParentCode07,'') AS ParentCostCentreCode
FROM CostCentreHierarchy AS CCH
UNION
SELECT CCH.ParentCode07,
CCH.ParentDesc07,
NULL AS ValidFromDate,
NULL AS ValidToDate,
COALESCE(CCH.ParentCode08,'') AS ParentCostCentreCode
FROM CostCentreHierarchy AS CCH
UNION
SELECT CCH.ParentCode08,
CCH.ParentDesc08,
NULL AS ValidFromDate,
NULL AS ValidToDate,
COALESCE(CCH.ParentCode09,'') AS ParentCostCentreCode
FROM CostCentreHierarchy AS CCH
UNION
SELECT CCH.ParentCode09,
CCH.ParentDesc09,
NULL AS ValidFromDate,
NULL AS ValidToDate,
COALESCE(CCH.ParentCode10,'') AS ParentCostCentreCode
FROM CostCentreHierarchy AS CCH
UNION
SELECT CCH.ParentCode10,
CCH.ParentDesc10,
NULL AS ValidFromDate,
NULL AS ValidToDate,
COALESCE(CCH.ParentCode11,'') AS ParentCostCentreCode
FROM CostCentreHierarchy AS CCH
UNION
SELECT CCH.ParentCode11,
CCH.ParentDesc11,
NULL AS ValidFromDate,
NULL AS ValidToDate,
'' AS ParentCostCentreCode
FROM CostCentreHierarchy AS CCH)
SELECT u.CostCentreCode,
u.CostCentreDesc,
u.ValidFromDate,
u.ValidToDate,
IIF(unflattened.ParentCostCentreCode = '', NULL, u.ParentCostCentreCode) AS ParentCostCentreCode
FROM unflattened AS u
WHERE u.CostCentreCode <> '';
也许我想太多了,但我真的不喜欢这个,因为它目前正在进行 12 次表扫描以获得结果。优化器目前也在选择对数据进行排序并进行合并连接以实现,UNION
但在这个阶段我并不担心它会选择这样做。
有没有另一种方法可以做到这一点,每次层次结构更深一层时都不会导致额外的表扫描?
您可以使用交叉应用和表值构造函数。
那是因为 union 试图删除重复的行。如果你不需要,你应该使用
union all
。如果删除重复项是您想要的,那么您应该添加
distinct
上面的查询。