AskOverflow.Dev

mendosi
Asked: 2017-04-19 18:47:50 +0800 CST

Unflattening hierarchical data in SQL Server


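I have some tables in the staging area of my data warehouse that I populate with data from flattened, comma-separated text extracted from another system. In the incoming data, the hierarchy of parents for each element is presented in columns labelled ParentCode01 ... ParentCode11, where the immediate parent of the current node is in ParentCode01 and the top-level parent could be in any of the columns (ParentCode11 is mostly NULL).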

CREATE TABLE CostCentreHierarchy (
    CostCentreCode varchar(10) NOT NULL CONSTRAINT pCostCentreCode_CostCentreHierarchy PRIMARY KEY,
    CostCentreDesc varchar(100),
    ValidFromDate varchar(10),
    ValidToDate varchar(10),
    ParentCode01 varchar(15),
    ParentDesc01 varchar(100),
    ParentCode02 varchar(15),
    ParentDesc02 varchar(100),
    ParentCode03 varchar(15),
    ParentDesc03 varchar(100),
    ParentCode04 varchar(15),
    ParentDesc04 varchar(100),
    ParentCode05 varchar(15),
    ParentDesc05 varchar(100),
    ParentCode06 varchar(15),
    ParentDesc06 varchar(100),
    ParentCode07 varchar(15),
    ParentDesc07 varchar(100),
    ParentCode08 varchar(15),
    ParentDesc08 varchar(100),
    ParentCode09 varchar(15),
    ParentDesc09 varchar(100),
    ParentCode10 varchar(15),
    ParentDesc10 varchar(100),
    ParentCode11 varchar(15),
    ParentDesc11 varchar(100));

INSERT INTO CostCentreHierarchy 
    (CostCentreCode, CostCentreDesc, ValidFromDate, ValidToDate, ParentCode01, ParentDesc01, ParentCode02, ParentDesc02, 
     ParentCode03, ParentDesc03, ParentCode04, ParentDesc04, ParentCode05, ParentDesc05, ParentCode06, ParentDesc06)
VALUES 
('0002000000', '0002000000', '01.07.1950', '31.12.9999', 'YA0201', 'YA0201', 'YA0200', 'YA0200', 'YA0000', 'Unit 1 - Admin', 'Y00000', 'Branch A - Admin', '1A', 'ADMINISTERED REPORTING', '1', 'Company'),
('0002000001', '0002000001', '01.07.1950', '31.12.9999', 'YA0301', 'YA0301', 'YA0300', 'YA0300', 'YA0000', 'Unit 1 - Admin', 'Y00000', 'Branch A - Admin', '1A', 'ADMINISTERED REPORTING', '1', 'Company'),
('0002000002', '0002000002', '01.07.1950', '31.12.9999', 'XA0101', 'XA0101', 'XA0100', 'XA0100', 'XA0000', 'Unit 3 - Admin', 'X00000', 'Branch B - Admin', '1A', 'ADMINISTERED REPORTING', '1', 'Company'),
('0002000003', '0002000003', '01.07.1950', '31.12.9999', 'YA0999', 'YA0999', 'YA0900', 'YA0900', 'YA0000', 'Unit 1 - Admin', 'Y00000', 'Branch A - Admin', '1A', 'ADMINISTERED REPORTING', '1', 'Company'),
('0002000004', '0002000004', '01.07.1950', '31.12.9999', 'YB0999', 'YB0999', 'YB0900', 'YB0900', 'YB0000', 'Unit 2 - Admin', 'Y00000', 'Branch A - Admin', '1A', 'ADMINISTERED REPORTING', '1', 'Company'),
('0002000005', '0002000005', '01.07.1950', '31.12.9999', 'YA0101', 'YA0101', 'YA0100', 'YA0100', 'YA0000', 'Unit 1 - Admin', 'Y00000', 'Branch A - Admin', '1A', 'ADMINISTERED REPORTING', '1', 'Company'),
('0002000006', '0002000006', '01.07.1950', '31.12.9999', 'XA0999', 'XA0999', 'XA0900', 'XA0900', 'XA0000', 'Unit 3 - Admin', 'X00000', 'Branch B - Admin', '1A', 'ADMINISTERED REPORTING', '1', 'Company'),
('0002000007', '0002000007', '01.07.1950', '31.12.9999', 'YA0302', 'YA0302', 'YA0300', 'YA0300', 'YA0000', 'Unit 1 - Admin', 'Y00000', 'Branch A - Admin', '1A', 'ADMINISTERED REPORTING', '1', 'Company'),
('0002000008', '0002000008', '01.07.1950', '31.12.9999', 'YA0999', 'YA0999', 'YA0900', 'YA0900', 'YA0000', 'Unit 1 - Admin', 'Y00000', 'Branch A - Admin', '1A', 'ADMINISTERED REPORTING', '1', 'Company'),
('0002000009', '0002000009', '01.07.1950', '31.12.9999', 'YA0999', 'YA0999', 'YA0900', 'YA0900', 'YA0000', 'Unit 1 - Admin', 'Y00000', 'Branch A - Admin', '1A', 'ADMINISTERED REPORTING', '1', 'Company');

When I load this data into my data warehouse, I load it into a table with a parent-child relationship, like this:

CREATE SEQUENCE CostCentreID_Sequence AS integer START WITH 1 NO CYCLE NO CACHE;
CREATE TABLE CostCentre (
    CostCentreID int CONSTRAINT DF_Sequence_CostCentreID_CostCentre DEFAULT NEXT VALUE FOR CostCentreID_Sequence NOT NULL,
    CostCentreCode varchar(15) CONSTRAINT uCostCentreCode_CostCentre UNIQUE NOT NULL,
    CostCentreDesc varchar(100) NOT NULL,
    ValidFromDate date,
    ValidToDate date,
    ParentID int CONSTRAINT fParentID_CostCentre REFERENCES CostCentre(CostCentreID),
    CONSTRAINT pCostCentreID_CostCentre PRIMARY KEY CLUSTERED (CostCentreID))
    WITH (DATA_COMPRESSION = ROW);
CREATE INDEX iParentID ON CostCentre(ParentID);

To get it into that format, I have a query that gets the distinct values of CostCentreCode and CostCentreDesc by doing a UNION of the values at each level, like this:

WITH unflattened AS (
    SELECT CCH.CostCentreCode,
        CCH.CostCentreDesc,
        CCH.ValidFromDate,
        CCH.ValidToDate,
        COALESCE(CCH.ParentCode01,'') AS ParentCostCentreCode
        FROM CostCentreHierarchy AS CCH
    UNION
    SELECT CCH.ParentCode01,
        CCH.ParentDesc01,
        NULL AS ValidFromDate,
        NULL AS ValidToDate,
        COALESCE(CCH.ParentCode02,'') AS ParentCostCentreCode
        FROM CostCentreHierarchy AS CCH
    UNION
    SELECT CCH.ParentCode02,
        CCH.ParentDesc02,
        NULL AS ValidFromDate,
        NULL AS ValidToDate,
        COALESCE(CCH.ParentCode03,'') AS ParentCostCentreCode
        FROM CostCentreHierarchy AS CCH
    UNION
    SELECT CCH.ParentCode03,
        CCH.ParentDesc03,
        NULL AS ValidFromDate,
        NULL AS ValidToDate,
        COALESCE(CCH.ParentCode04,'') AS ParentCostCentreCode
        FROM CostCentreHierarchy AS CCH
    UNION
    SELECT CCH.ParentCode04,
        CCH.ParentDesc04,
        NULL AS ValidFromDate,
        NULL AS ValidToDate,
        COALESCE(CCH.ParentCode05,'') AS ParentCostCentreCode
        FROM CostCentreHierarchy AS CCH
    UNION
    SELECT CCH.ParentCode05,
        CCH.ParentDesc05,
        NULL AS ValidFromDate,
        NULL AS ValidToDate,
        COALESCE(CCH.ParentCode06,'') AS ParentCostCentreCode
        FROM CostCentreHierarchy AS CCH
    UNION
    SELECT CCH.ParentCode06,
        CCH.ParentDesc06,
        NULL AS ValidFromDate,
        NULL AS ValidToDate,
        COALESCE(CCH.ParentCode07,'') AS ParentCostCentreCode
        FROM CostCentreHierarchy AS CCH
    UNION
    SELECT CCH.ParentCode07,
        CCH.ParentDesc07,
        NULL AS ValidFromDate,
        NULL AS ValidToDate,
        COALESCE(CCH.ParentCode08,'') AS ParentCostCentreCode
        FROM CostCentreHierarchy AS CCH
    UNION
    SELECT CCH.ParentCode08,
        CCH.ParentDesc08,
        NULL AS ValidFromDate,
        NULL AS ValidToDate,
        COALESCE(CCH.ParentCode09,'') AS ParentCostCentreCode
        FROM CostCentreHierarchy AS CCH
    UNION
    SELECT CCH.ParentCode09,
        CCH.ParentDesc09,
        NULL AS ValidFromDate,
        NULL AS ValidToDate,
        COALESCE(CCH.ParentCode10,'') AS ParentCostCentreCode
        FROM CostCentreHierarchy AS CCH
    UNION
    SELECT CCH.ParentCode10,
        CCH.ParentDesc10,
        NULL AS ValidFromDate,
        NULL AS ValidToDate,
        COALESCE(CCH.ParentCode11,'') AS ParentCostCentreCode
        FROM CostCentreHierarchy AS CCH
    UNION
    SELECT CCH.ParentCode11,
        CCH.ParentDesc11,
        NULL AS ValidFromDate,
        NULL AS ValidToDate,
        '' AS ParentCostCentreCode
        FROM CostCentreHierarchy AS CCH)
SELECT u.CostCentreCode,
    u.CostCentreDesc,
    u.ValidFromDate,
    u.ValidToDate,
    IIF(u.ParentCostCentreCode = '', NULL, u.ParentCostCentreCode) AS ParentCostCentreCode
  FROM unflattened AS u
  WHERE u.CostCentreCode <> '';

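Maybe I'm overthinking this, but I don't really like it, because it currently does 12 table scans to get the results. The optimizer also currently chooses to sort the data and use merge joins to implement the UNION, but I'm not worried about that choice at this stage.

Is there another way to do this that doesn't add a table scan each time the hierarchy goes one level deeper?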

sql-server performance

1 Answer

  1. Best Answer
    Mikael Eriksson
    2017-04-19T21:40:26+08:00

    You can use cross apply with a table value constructor.

    select U.CostCentreCode,
           U.CostCentreDesc,
           C.ValidFromDate,
           C.ValidToDate,
           coalesce(U.ParentCostCentreCode, '') as ParentCostCentreCode
    from dbo.CostCentreHierarchy as C
      cross apply (values(C.CostCentreCode, C.CostCentreDesc, C.ParentCode01),
                         (C.ParentCode01, C.ParentDesc01, C.ParentCode02),
                         (C.ParentCode02, C.ParentDesc02, C.ParentCode03),
                         (C.ParentCode03, C.ParentDesc03, C.ParentCode04),
                         (C.ParentCode04, C.ParentDesc04, C.ParentCode05),
                         (C.ParentCode05, C.ParentDesc05, C.ParentCode06),
                         (C.ParentCode06, C.ParentDesc06, C.ParentCode07),
                         (C.ParentCode07, C.ParentDesc07, C.ParentCode08),
                         (C.ParentCode08, C.ParentDesc08, C.ParentCode09),
                         (C.ParentCode09, C.ParentDesc09, C.ParentCode10),
                         (C.ParentCode10, C.ParentDesc10, C.ParentCode11),
                         (C.ParentCode11, C.ParentDesc11, '')
                  ) as U(CostCentreCode, CostCentreDesc, ParentCostCentreCode)
    where U.CostCentreCode <> '';
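To compare the two approaches on a small scale, here is a minimal sketch. It assumes a throwaway temp table #Demo with hypothetical columns Code, Parent1 and Parent2 (none of these names come from the question). It runs a three-branch UNION ALL and the equivalent cross apply over a table value constructor with SET STATISTICS IO ON, so the scan counts reported in the Messages output can be compared.

```sql
-- Hypothetical demo table; the names are illustrative only.
create table #Demo
(
    Code    varchar(10) not null,
    Parent1 varchar(10) null,
    Parent2 varchar(10) null
);

insert into #Demo (Code, Parent1, Parent2)
values ('A1', 'A', 'ROOT'),
       ('B1', 'B', null);

set statistics io on;

-- One branch per level: each branch reads #Demo again.
select D.Code, D.Parent1 as ParentCode from #Demo as D
union all
select D.Parent1, D.Parent2 from #Demo as D
union all
select D.Parent2, null from #Demo as D;

-- One read of #Demo: every source row is expanded into three rows.
select U.Code, U.ParentCode
from #Demo as D
  cross apply (values (D.Code,    D.Parent1),
                      (D.Parent1, D.Parent2),
                      (D.Parent2, null)
              ) as U(Code, ParentCode);

set statistics io off;

drop table #Demo;
```

Both statements return six rows here, because nothing filters out the NULL codes. The first statement should report a higher scan count for #Demo than the second, which is the effect the question is trying to avoid.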
    


    "The optimizer also currently chooses to sort the data and use merge joins to implement the UNION"

    That is because union tries to remove duplicate rows. If you don't need that, use union all instead.
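As a small illustration of that difference, the sketch below runs both operators over two inline row sets built with table value constructors. The aliases V and W are made up for the example; the codes are borrowed from the sample data only to keep it familiar.

```sql
-- union removes duplicates, which is why the optimizer needs a sort
-- (or a hash) somewhere in the plan.
select V.Code from (values ('YA0000'), ('YA0000'), ('Y00000')) as V(Code)
union
select W.Code from (values ('Y00000'), ('1A')) as W(Code);
-- 3 rows: '1A', 'Y00000', 'YA0000' (order is not guaranteed)

-- union all simply concatenates the inputs.
select V.Code from (values ('YA0000'), ('YA0000'), ('Y00000')) as V(Code)
union all
select W.Code from (values ('Y00000'), ('1A')) as W(Code);
-- 5 rows: every input row is kept
```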

    If removing duplicates is what you want, add distinct to the query above.

    select distinct 
           U.CostCentreCode,
           U.CostCentreDesc,
           C.ValidFromDate,
    ....
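One detail to watch if the goal is to reproduce the question's result exactly: the cross apply query above returns C.ValidFromDate and C.ValidToDate on every generated row, parents included, and returns '' rather than NULL for a missing parent, whereas the UNION version returns NULL dates for parent rows and a NULL parent at the top level. A possible variant is sketched below. It assumes the CostCentreHierarchy table from the question, moves the dates into the table value constructor so that only the cost centre's own row carries them, and swaps coalesce for nullif. Treat it as a starting point, not as the answer author's query.

```sql
-- Sketch: one scan of the staging table, output shaped like the UNION query.
select distinct
       U.CostCentreCode,
       U.CostCentreDesc,
       U.ValidFromDate,
       U.ValidToDate,
       -- '' and NULL both become NULL, matching the IIF in the question
       nullif(U.ParentCostCentreCode, '') as ParentCostCentreCode
from dbo.CostCentreHierarchy as C
  cross apply (values (C.CostCentreCode, C.CostCentreDesc, C.ValidFromDate, C.ValidToDate, C.ParentCode01),
                      (C.ParentCode01, C.ParentDesc01, null, null, C.ParentCode02),
                      (C.ParentCode02, C.ParentDesc02, null, null, C.ParentCode03),
                      (C.ParentCode03, C.ParentDesc03, null, null, C.ParentCode04),
                      (C.ParentCode04, C.ParentDesc04, null, null, C.ParentCode05),
                      (C.ParentCode05, C.ParentDesc05, null, null, C.ParentCode06),
                      (C.ParentCode06, C.ParentDesc06, null, null, C.ParentCode07),
                      (C.ParentCode07, C.ParentDesc07, null, null, C.ParentCode08),
                      (C.ParentCode08, C.ParentDesc08, null, null, C.ParentCode09),
                      (C.ParentCode09, C.ParentDesc09, null, null, C.ParentCode10),
                      (C.ParentCode10, C.ParentDesc10, null, null, C.ParentCode11),
                      (C.ParentCode11, C.ParentDesc11, null, null, '')
              ) as U(CostCentreCode, CostCentreDesc, ValidFromDate, ValidToDate, ParentCostCentreCode)
-- filters out both '' and NULL codes from unused levels
where U.CostCentreCode <> '';
```

Because the parent rows no longer carry child-specific dates, distinct can collapse the repeated parents (for example the many rows for 'YA0000') into one row each.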
    