AskOverflow.Dev

Martin Smith
Asked: 2011-02-24 09:19:23 +0800 CST

Performance of Non-Clustered Indexes on Heaps vs. Clustered Indexes

  • 772

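This 2007 white paper compares the performance of individual select/insert/delete/update and range select statements on a table organized as a clustered index versus a table organized as a heap with a nonclustered index on the same key columns as the CI table.

Generally the clustered index option performed better in the tests, as there is only one structure to maintain and no bookmark lookups are needed.

One potentially interesting case not covered by the paper is a comparison between a nonclustered index on a heap and a nonclustered index on a clustered index. In that case I would expect the heap might even perform better, because once at the NCI leaf level SQL Server has a RID it can follow directly, rather than needing to traverse the clustered index.

Is anyone aware of similar formal testing that has been carried out in this area and, if so, what were the results?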

sql-server clustered-index
  • 6085 Views

3 Answers

  1. Best Answer
    Filip De Vos
    2011-03-05T14:57:50+08:00

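    To check your request I created 2 tables following this scheme:

    • 7.9 million records representing balance information.
    • An identity field counting from 1 to 7.9 million.
    • A number field grouping the records into about 500k groups.

    The first table, called heap, has a nonclustered index on the field group. The second table, called clust, has a clustered index on the sequential field called key and a nonclustered index on the field group.

    The tests were run on an i5 M540 processor with 2 hyperthreaded cores, 4 GB of memory and 64-bit Windows 7.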

    Microsoft SQL Server 2008 R2 (RTM) - 10.50.1600.1 (X64) 
    Apr  2 2010 15:48:46 
    Developer Edition (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1)  
    

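    Update on 9 Mar 2011: I did a second, more extensive benchmark by running the following .NET code and logging Duration, CPU, Reads, Writes and RowCounts in SQL Server Profiler. (The CommandText used is mentioned with the results.)

    NOTE: CPU and Duration are expressed in milliseconds.

    • 1000 queries
    • Zero-CPU queries are eliminated from the results
    • Queries with 0 rows affected are eliminated from the results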
    // Requires: using System; using System.Data; using System.Data.SqlClient;
    int[] idList = new int[] { 6816588, 7086702, 6498815 ... }; // 1000 values here.
    using (var conn = new SqlConnection(@"Data Source=myserver;Initial Catalog=mydb;Integrated Security=SSPI;"))
    {
        conn.Open();
        using (var cmd = new SqlCommand())
        {
            cmd.Connection = conn;
            cmd.CommandType = CommandType.Text;
            // Swapped out per test; the exact CommandText is listed with each result set.
            cmd.CommandText = "select * from heap where common_key between @id and @id+1000";
            cmd.Parameters.Add("@id", SqlDbType.Int);
            cmd.Prepare(); // prepare once, then execute for every id
            foreach (int id in idList)
            {
                cmd.Parameters[0].Value = id;

                using (var reader = cmd.ExecuteReader())
                {
                    // Drain the reader so the whole result set is actually fetched.
                    int count = 0;
                    while (reader.Read())
                    {
                        count++;
                    }
                    Console.WriteLine(String.Format("key: {0} => {1} rows", id, count));
                }
            }
        }
    }
    

    End of update on 9 Mar 2011.
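The filtering rules listed for the second benchmark (drop statements that report zero CPU or affect zero rows, then summarize what is left) could be sketched in Python roughly as follows. This is a minimal sketch and not the author's tooling: the `TraceRow` record and its field names are hypothetical stand-ins for a Profiler trace export, and the sample rows are made up purely to exercise the filter.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TraceRow:
    """One traced statement (hypothetical shape of a Profiler export)."""
    cpu_ms: int
    reads: int
    writes: int
    duration_ms: float
    row_count: int


def keep(row: TraceRow) -> bool:
    # Same two rules as the benchmark: drop zero-CPU and zero-row statements.
    return row.cpu_ms > 0 and row.row_count > 0


def summarize(rows):
    """Return the kept count and (min, max, mean) per counter for kept rows."""
    kept = [r for r in rows if keep(r)]
    summary = {}
    for name in ("row_count", "cpu_ms", "reads", "writes", "duration_ms"):
        values = [getattr(r, name) for r in kept]
        summary[name] = (min(values), max(values), mean(values))
    return len(kept), summary


# Made-up sample rows, only to exercise the filter.
sample = [
    TraceRow(cpu_ms=15, reads=1069, writes=0, duration_ms=0.37, row_count=1001),
    TraceRow(cpu_ms=0, reads=3, writes=0, duration_ms=0.05, row_count=0),
    TraceRow(cpu_ms=31, reads=2709, writes=0, duration_ms=0.47, row_count=2700),
    TraceRow(cpu_ms=0, reads=1510, writes=0, duration_ms=0.31, row_count=1503),
]
kept_count, stats = summarize(sample)
print(kept_count, stats["reads"])
```

Note that the zero-CPU rule also discards statements that did real work but finished below the CPU timer resolution, which is why fewer than 1000 statements appear in each result table.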

    SELECT performance

    To check the performance numbers I ran the following queries once on the heap table and once on the clust table:

    select * from heap/clust where group between 5678910 and 5679410
    select * from heap/clust where group between 6234567 and 6234967
    select * from heap/clust where group between 6455429 and 6455729
    select * from heap/clust where group between 6655429 and 6655729
    select * from heap/clust where group between 6955429 and 6955729
    select * from heap/clust where group between 7195542 and 7155729
    

    The results of this benchmark for heap:

    rows  reads CPU   Elapsed 
    ----- ----- ----- --------
    1503  1510  31ms  309ms
    401   405   15ms  283ms
    2700  2709  0ms   472ms
    0     3     0ms   30ms
    2953  2962  32ms  257ms
    0     0     0ms   0ms
    

    Update on 9 Mar 2011: cmd.CommandText = "select * from heap where group between @id and @id+1000";

    • 721 rows have > 0 CPU and affect more than 0 rows
    Counter   Minimum    Maximum Average  Weighted
    --------- ------- ---------- ------- ---------
    RowCounts    1001      69788    6368         -         
    Cpu            15        374      37   0.00754
    Reads        1069      91459    7682   1.20155
    Writes          0          0       0   0.00000
    Duration   0.3716   282.4850 10.3672   0.00180
    

    End of update on 9 Mar 2011.
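The answer does not define the Weighted column. The insert results further down, where every statement affects exactly 1000 rows, are consistent with a per-row normalization (for example, an average of 6328 reads over 1000 rows gives about 6.328). A minimal sketch under that assumption, averaging the per-statement ratio of metric to rows affected; the function name and the sample data are hypothetical:

```python
def weighted(metric_values, row_counts):
    """Mean of per-statement (metric / rows affected).

    Assumption: this is how the 'Weighted' column was derived; the
    original answer does not state the formula.
    """
    ratios = [m / r for m, r in zip(metric_values, row_counts) if r > 0]
    return sum(ratios) / len(ratios)


# With a constant row count the result equals average / rows, which is
# the pattern visible in the insert tables (1000 rows per statement).
reads = [5212, 7069, 6703]   # hypothetical per-statement reads
rows = [1000, 1000, 1000]
print(round(weighted(reads, rows), 3))
```

When row counts vary, as in the select, update and delete tests, this mean of ratios generally differs from dividing the Average column by the average RowCounts, which may explain why the two do not line up exactly in those tables.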


    The results for the table clust are:

    rows  reads CPU   Elapsed 
    ----- ----- ----- --------
    1503  4827  31ms  327ms
    401   1241  0ms   242ms
    2700  8372  0ms   410ms
    0     3     0ms   0ms
    2953  9060  47ms  213ms
    0     0     0ms   0ms
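One way to read the two single-run SELECT tables is to divide logical reads by rows returned. The sketch below does only that arithmetic; the (rows, reads) pairs are copied from the tables above, and the helper name is just for illustration.

```python
# (rows, reads) pairs copied from the single-run SELECT tables,
# skipping the two ranges that returned no rows.
heap = [(1503, 1510), (401, 405), (2700, 2709), (2953, 2962)]
clust = [(1503, 4827), (401, 1241), (2700, 8372), (2953, 9060)]


def reads_per_row(results):
    """Logical reads per returned row for each (rows, reads) pair."""
    return [reads / rows for rows, reads in results]


for name, results in (("heap", heap), ("clust", clust)):
    ratios = reads_per_row(results)
    print(name, [round(x, 2) for x in ratios])
```

On these figures the heap costs roughly one read per row and clust a little over three, which fits an extra clustered index traversal per lookup, yet the CPU and elapsed times in the two tables are similar.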
    

    Update on 9 Mar 2011: cmd.CommandText = "select * from clust where group between @id and @id+1000";

    • 721 rows have > 0 CPU and affect more than 0 rows
    Counter   Minimum    Maximum Average  Weighted
    --------- ------- ---------- ------- ---------
    RowCounts    1001      69788    6056         -
    Cpu            15        468      38   0.00782
    Reads        3194     227018   20457   3.37618
    Writes          0          0       0       0.0
    Duration   0.3949   159.6223 11.5699   0.00214
    

    End of update on 9 Mar 2011.


    SELECT WITH JOIN performance

    cmd.CommandText = "select * from heap/clust h join keys k on h.group = k.group where h.group between @id and @id+1000";


    The results of this benchmark for heap:

    873 rows have > 0 CPU and affect more than 0 rows

    Counter   Minimum    Maximum Average  Weighted
    --------- ------- ---------- ------- ---------
    RowCounts    1009       4170    1683         -
    Cpu            15         47      18   0.01175
    Reads        2145       5518    2867   1.79246
    Writes          0          0       0   0.00000
    Duration   0.8215   131.9583  1.9095   0.00123
    

    The results of this benchmark for clust:

    865 rows have > 0 CPU and affect more than 0 rows

    Counter   Minimum    Maximum Average  Weighted
    --------- ------- ---------- ------- ---------
    RowCounts    1000       4143    1685         -
    Cpu            15         47      18   0.01193
    Reads        5320      18690    8237   4.97813
    Writes          0          0       0   0.00000
    Duration   0.9699    20.3217  1.7934   0.00109
    

    UPDATE performance

    The second batch of queries consists of update statements:

    update heap/clust set amount = amount + 0 where group between 5678910 and 5679410
    update heap/clust set amount = amount + 0 where group between 6234567 and 6234967
    update heap/clust set amount = amount + 0 where group between 6455429 and 6455729
    update heap/clust set amount = amount + 0 where group between 6655429 and 6655729
    update heap/clust set amount = amount + 0 where group between 6955429 and 6955729
    update heap/clust set amount = amount + 0 where group between 7195542 and 7155729
    

    The results of this benchmark for heap:

    rows  reads CPU   Elapsed 
    ----- ----- ----- -------- 
    1503  3013  31ms  175ms
    401   806   0ms   22ms
    2700  5409  47ms  100ms
    0     3     0ms   0ms
    2953  5915  31ms  88ms
    0     0     0ms   0ms
    

    Update on 9 Mar 2011: cmd.CommandText = "update heap set amount = amount + @id where group between @id and @id+1000";

    • 811 rows have > 0 CPU and affect more than 0 rows
    Counter   Minimum    Maximum Average  Weighted
    --------- ------- ---------- ------- ---------
    RowCounts    1001      69788    5598       811         
    Cpu            15        873      56   0.01199
    Reads        2080     167593   11809   2.11217
    Writes          0       1687     121   0.02170
    Duration   0.6705   514.5347 17.2041   0.00344
    

    End of update on 9 Mar 2011.


    The results of this benchmark for clust:

    rows  reads CPU   Elapsed 
    ----- ----- ----- -------- 
    1503  9126  16ms  35ms
    401   2444  0ms   4ms
    2700  16385 31ms  54ms
    0     3     0ms   0ms 
    2953  17919 31ms  35ms
    0     0     0ms   0ms
    

    Update on 9 Mar 2011: cmd.CommandText = "update clust set amount = amount + @id where group between @id and @id+1000";

    • 853 rows have > 0 CPU and affect more than 0 rows
    Counter   Minimum    Maximum Average  Weighted
    --------- ------- ---------- ------- ---------
    RowCounts    1001      69788    5420         -
    Cpu            15        594      50   0.01073
    Reads        6226     432237   33597   6.20450
    Writes          0       1730     110   0.01971
    Duration   0.9134   193.7685  8.2919   0.00155
    

    End of update on 9 Mar 2011.


    DELETE benchmark

    The third batch of queries I ran consists of delete statements:

    delete heap/clust where group between 5678910 and 5679410
    delete heap/clust where group between 6234567 and 6234967
    delete heap/clust where group between 6455429 and 6455729
    delete heap/clust where group between 6655429 and 6655729
    delete heap/clust where group between 6955429 and 6955729
    delete heap/clust where group between 7195542 and 7155729
    

    The results of this benchmark for heap:

    rows  reads CPU   Elapsed 
    ----- ----- ----- -------- 
    1503  10630 62ms  179ms
    401   2838  0ms   26ms
    2700  19077 47ms  87ms
    0     4     0ms   0ms
    2953  20865 62ms  196ms
    0     4     0ms   9ms
    

    Update on 9 Mar 2011: cmd.CommandText = "delete heap where group between @id and @id+1000";

    • 724 rows have > 0 CPU and affect more than 0 rows
    Counter   Minimum    Maximum Average  Weighted
    --------- ------- ---------- ------- ---------
    RowCounts     192      69788    4781         -
    Cpu            15        499      45   0.01247
    Reads         841     307958   20987   4.37880
    Writes          2       1819     127   0.02648
    Duration   0.3775  1534.3383 17.2412   0.00349
    

    End of update on 9 Mar 2011.


    The results of this benchmark for clust:

    rows  reads CPU   Elapsed 
    ----- ----- ----- -------- 
    1503  9228  16ms  55ms
    401   3681  0ms   50ms
    2700  24644 46ms  79ms
    0     3     0ms   0ms
    2953  26955 47ms  92ms
    0     3     0ms   0ms
    

    Update on 9 Mar 2011:

    cmd.CommandText = "delete clust where group between @id and @id+1000";

    • 751 rows have > 0 CPU and affect more than 0 rows
    Counter   Minimum    Maximum Average  Weighted
    --------- ------- ---------- ------- ---------
    RowCounts     144      69788    4648         -
    Cpu            15        764      56   0.01538
    Reads         989     458467   30207   6.48490
    Writes          2       1830     127   0.02694
    Duration   0.2938  2512.1968 24.3714   0.00555
    

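    End of update on 9 Mar 2011.


    INSERT benchmark

    The final part of the benchmark is the execution of insert statements.

    insert into heap/clust (...) values (...), (...), (...), (...), (...), (...)


    The results of this benchmark for heap: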

    rows  reads CPU   Elapsed 
    ----- ----- ----- -------- 
    6     38    0ms   31ms
    

    Update on 9 Mar 2011:

    string str = @"insert into heap (group, currency, year, period, domain_id, mtdAmount, mtdAmount, ytdAmount, amount, ytd_restated, restated, auditDate, auditUser)
                        values";
    
                        for (int x = 0; x < 999; x++)
                        {
                            str += string.Format(@"(@id + {0}, 'EUR', 2012, 2, 0, 100, 100, 1000 + @id,1000, 1000,1000, current_timestamp, 'test'),  ", x);
                        }
                        str += string.Format(@"(@id, 'CAD', 2012, 2, 0, 100, 100, 1000 + @id,1000, 1000,1000, current_timestamp, 'test') ", 1000);
    
                        cmd.CommandText = str;
    
    • 912 statements have > 0 CPU
    Counter   Minimum    Maximum Average  Weighted
    --------- ------- ---------- ------- ---------
    RowCounts    1000       1000    1000         -
    Cpu            15       2138      25   0.02500
    Reads        5212       7069    6328   6.32837
    Writes         16         34      22   0.02222
    Duration   1.6336   293.2132  4.4009   0.00440
    

    End of update on 9 Mar 2011.


    The results of this benchmark for clust:

    rows  reads CPU   Elapsed 
    ----- ----- ----- -------- 
    6     50    0ms   18ms
    

    Update on 9 Mar 2011:

    string str = @"insert into clust (group, currency, year, period, domain_id, mtdAmount, mtdAmount, ytdAmount, amount, ytd_restated, restated, auditDate, auditUser)
                        values";
    
                        for (int x = 0; x < 999; x++)
                        {
                            str += string.Format(@"(@id + {0}, 'EUR', 2012, 2, 0, 100, 100, 1000 + @id,1000, 1000,1000, current_timestamp, 'test'),  ", x);
                        }
                        str += string.Format(@"(@id, 'CAD', 2012, 2, 0, 100, 100, 1000 + @id,1000, 1000,1000, current_timestamp, 'test') ", 1000);
    
                        cmd.CommandText = str;
    
    • 946 statements have > 0 CPU
    Counter   Minimum    Maximum Average  Weighted
    --------- ------- ---------- ------- ---------
    RowCounts    1000       1000    1000         -      
    Cpu            15       2403      21   0.02157
    Reads        6810       8997    8412   8.41223
    Writes         16         25      19   0.01942
    Duration   1.5375   268.2571  6.1463   0.00614
    

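    End of update on 9 Mar 2011.


    Conclusions

    Although more logical reads take place when the table with both the clustered and the nonclustered index is accessed through the nonclustered index, the performance results are: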

    • SELECT statements are comparable
    • UPDATE statements are faster with a clustered index in place
    • DELETE statements are faster with a clustered index in place
    • INSERT statements are faster with a clustered index in place

    Of course my benchmark was very limited on a specific kind of table and with a very limited set of queries, but I think that based on this information we can already start saying that it is virtually always better to create a clustered index on your table.

    Update on 9 Mar 2011:

    As we can see from the added results, the conclusions on the limited tests were not correct in every case.

    [Chart: Weighted Duration]

    The results now indicate that the only statements which benefit from the clustered index are the update statements. The other statements are about 30% slower on the table with clustered index.
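The relative differences can be recomputed directly from the Weighted Duration rows of the result tables. The sketch below only does that arithmetic on the published figures; the dictionary layout and function name are illustrative, not part of the original benchmark.

```python
# Weighted Duration values copied from the result tables: (heap, clust).
weighted_duration = {
    "select": (0.00180, 0.00214),
    "join": (0.00123, 0.00109),
    "update": (0.00344, 0.00155),
    "delete": (0.00349, 0.00555),
    "insert": (0.00440, 0.00614),
}


def clust_vs_heap(heap, clust):
    """Relative change of clust against heap; positive means clust is slower."""
    return (clust - heap) / heap


changes = {op: clust_vs_heap(h, c) for op, (h, c) in weighted_duration.items()}
for op, change in changes.items():
    print(f"{op:<7}{change:+.0%}")
```

Taken at face value, these figures put the update well ahead on clust, the join slightly ahead on clust, and select, delete and insert slower on clust by varying amounts, so "about 30%" is best read as a rough figure rather than a constant.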

    Some additional charts where I plotted the weighted duration per query for heap vs clust:

    [Chart: Weighted Duration heap vs clustered for Select]

    [Chart: Weighted Duration heap vs clustered for Join]

    [Chart: Weighted Duration heap vs clustered for Update]

    [Chart: Weighted Duration heap vs clustered for Delete]

    As you can see, the performance profile for the insert statements is quite interesting. The spikes are caused by a few data points which take a lot longer to complete.

    [Chart: Weighted Duration heap vs clustered for Insert]

    End of Update on 9 Mar 2011.

    • 41
  2. marc_s
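    2011-02-24T09:50:14+08:00

    As Kimberly Tripp, the Queen of Indexing, explains quite nicely in her blog post "The Clustered Index Debate Continues...", having a clustering key on a database table pretty much speeds up all operations, not just SELECT.

    SELECT is generally slower on a heap than on a clustered table, as long as you pick a good clustering key, something like an INT IDENTITY. If you use a really bad clustering key, like a GUID or a compound key with lots of variable-length components, then, but only then, a heap might be faster. But in that case you really need to clean up your database design in the first place...

    So in general, I don't think there's any point in a heap. Pick a good, useful clustering key and you should benefit in all respects.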

    • 12
  3. Martin Smith
    2012-03-16T02:44:37+08:00

    Just happened to come across this article from Joe Chang that addresses this question. Pasted his conclusions below.

    Consider a table for which the indexes have depth 4, so that there is a root level, 2 intermediate levels and the leaf level. The index seek for a single index key (that is, no key lookup) would generate 4 logical IO (LIO). Now consider if a key lookup is required. If the table has a clustered index also of depth 4, each key lookup generates 4 LIO. If the table were a heap, each key lookup generates 1 LIO. In actuality, the key lookup to a heap is about 20-30% less expensive than a key lookup to a clustered index, not anywhere close to the 4:1 LIO ratio.
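The logical IO arithmetic in the quoted passage can be written out as a small model. This is a sketch of the counting rule only (one index seek plus one lookup per row); the function and parameter names are illustrative, and it ignores extra leaf pages read during range scans as well as caching effects.

```python
def logical_io(key_lookups, index_depth=4, lookup_cost=1):
    """LIO for one index seek followed by `key_lookups` bookmark lookups.

    lookup_cost is 1 for a heap (follow the RID to the page) or the
    clustered index depth for a clustered table (traverse root to leaf).
    """
    return index_depth + key_lookups * lookup_cost


lookups = 1000
heap_lio = logical_io(lookups, index_depth=4, lookup_cost=1)
clust_lio = logical_io(lookups, index_depth=4, lookup_cost=4)
print(heap_lio, clust_lio)
```

For many lookups the model approaches the 4:1 LIO ratio, whereas the quoted conclusion is that the real cost difference is only about 20-30%, so LIO counts alone overstate the gap.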

    • 7
