Questions asked by tail - dba

tail

Asked: 2022-05-14 12:45:46 +0800 CST

A table with smaller data types seems to take more space on disk?

0

I have these two otherwise identical tables:

                  Table "public.region"
   Column    |   Type    | Collation | Nullable | Default 
-------------+-----------+-----------+----------+---------
 r_regionkey | integer   |           | not null | 
 r_name      | char(25)  |           |          | 
 r_comment   | char(152) |           |          | 
Indexes:
    "region_pkey" PRIMARY KEY, btree (r_regionkey)

and

                  Table "public.region2"
   Column    |   Type   | Collation | Nullable | Default 
-------------+----------+-----------+----------+---------
 r_regionkey | smallint |           | not null | 
 r_name      | text     |           |          | 
 r_comment   | text     |           |          | 
Indexes:
    "region_pkey" PRIMARY KEY, btree (r_regionkey)

I am using smallint and text to save space, but strangely:

select pg_size_pretty(pg_table_size('region'))

returns 8192 bytes, while

select pg_size_pretty(pg_table_size('region2'))

returns 48 kB.

Why does region2 take up more space, even though I am using smallint instead of integer and text instead of char(n)?
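
One way to see where the extra space goes is to split the total into its parts. According to the PostgreSQL documentation, pg_table_size counts the main data fork plus the free space map, the visibility map, and any TOAST table. A breakdown like the sketch below may show which of these accounts for the difference. It assumes only the two tables from the question; which part is actually responsible here has not been verified.

```sql
-- Break pg_table_size down into its parts for both tables from the question.
-- 'main', 'fsm' and 'vm' are fork names accepted by pg_relation_size.
SELECT c.oid::regclass                           AS tbl,
       pg_relation_size(c.oid::regclass, 'main') AS main_bytes,
       pg_relation_size(c.oid::regclass, 'fsm')  AS fsm_bytes,
       pg_relation_size(c.oid::regclass, 'vm')   AS vm_bytes,
       -- reltoastrelid is 0 (shown as "-") when the table has no TOAST table
       c.reltoastrelid::regclass                 AS toast_table,
       pg_table_size(c.oid::regclass)            AS table_bytes
FROM pg_class AS c
WHERE c.oid IN ('region'::regclass, 'region2'::regclass);
```

If table_bytes is larger than the three fork columns combined, the remainder would be the TOAST table and its index.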

tail

Asked: 2022-05-14 07:16:33 +0800 CST

pg_class returns 0 as relpages

0

I have this table:

                  Table "public.region"
   Column    |   Type   | Collation | Nullable | Default 
-------------+----------+-----------+----------+---------
 r_regionkey | smallint |           | not null | 
 r_name      | text     |           |          | 
 r_comment   | text     |           |          | 
Indexes:
    "region_pkey" PRIMARY KEY, btree (r_regionkey)
Referenced by:
    TABLE "nation" CONSTRAINT "nation_n_regionkey_fkey" FOREIGN KEY (n_regionkey) REFERENCES region(r_regionkey)

Strangely, this query:

select oid::regclass as tbl, relpages
from pg_class
where relname='region'

returns relpages = 0.

Why, and how can I fix it? If I change smallint to int, it works (relpages = 1).
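
For context, relpages in pg_class is only an estimate used by the planner. Per the PostgreSQL documentation it is updated by VACUUM, ANALYZE, and a few DDL commands such as CREATE INDEX, not on every insert. A minimal sketch, assuming the table simply has not been analyzed since it was loaded:

```sql
-- Refresh the planner statistics; this also updates relpages and reltuples.
ANALYZE region;

-- Re-run the catalog lookup from the question.
SELECT oid::regclass AS tbl, relpages, reltuples
FROM pg_class
WHERE relname = 'region';
```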

tail

Asked: 2022-05-13 02:33:08 +0800 CST

Why doesn't the optimizer use the clustered index on my table?

1

I have this table:

                     Table "public.lineitem"
     Column      |     Type      | Collation | Nullable | Default 
-----------------+---------------+-----------+----------+---------
 l_orderkey      | integer       |           |          | 
 l_partkey       | integer       |           |          | 
 l_suppkey       | integer       |           |          | 
 l_linenumber    | integer       |           |          | 
 l_quantity      | integer       |           |          | 
 l_extendedprice | numeric(12,2) |           |          | 
 l_discount      | numeric(12,2) |           |          | 
 l_tax           | numeric(12,2) |           |          | 
 l_returnflag    | character(1)  |           |          | 
 l_linestatus    | character(1)  |           |          | 
 l_shipdate      | date          |           |          | 
 l_commitdate    | date          |           |          | 
 l_receiptdate   | date          |           |          | 
 l_shipinstruct  | character(25) |           |          | 
 l_shipmode      | character(10) |           |          | 
 l_comment       | character(44) |           |          | 
 l_partsuppkey   | character(20) |           |          | 
Indexes:
    "l_shipdate_c_idx" btree (l_shipdate) CLUSTER
    "l_shipmode_h_idx" hash (l_shipdate)
Foreign-key constraints:
    "lineitem_l_orderkey_fkey" FOREIGN KEY (l_orderkey) REFERENCES orders(o_orderkey)
    "lineitem_l_partkey_fkey" FOREIGN KEY (l_partkey) REFERENCES part(p_partkey)
    "lineitem_l_partsuppkey_fkey" FOREIGN KEY (l_partsuppkey) REFERENCES partsupp(ps_partsuppkey)
    "lineitem_l_suppkey_fkey" FOREIGN KEY (l_suppkey) REFERENCES supplier(s_suppkey)

This query:

explain analyze select
    l_returnflag,
    l_linestatus,
    sum(l_quantity) as sum_qty,
    sum(l_extendedprice) as sum_base_price,
    sum(l_extendedprice*(1 - l_discount)) as sum_disc_price,
    sum(l_extendedprice*(1 - l_discount)*(1 + l_tax)) as sum_charge,
    avg(l_quantity) as avg_qty,
    avg(l_extendedprice) as avg_price,
    avg(l_discount) as avg_disc,
    count(*) as count_order
from
    lineitem
where
    l_shipdate<='31/08/1998'
GROUP by
    l_returnflag,
    l_linestatus
ORDER by
    l_returnflag,
    l_linestatus

returns this query plan:

"Finalize GroupAggregate  (cost=2631562.25..2631564.19 rows=6 width=212) (actual time=28624.012..28624.466 rows=4 loops=1)"
"  Group Key: l_returnflag, l_linestatus"
"  ->  Gather Merge  (cost=2631562.25..2631563.65 rows=12 width=212) (actual time=28623.998..28624.442 rows=12 loops=1)"
"        Workers Planned: 2"
"        Workers Launched: 2"
"        ->  Sort  (cost=2630562.23..2630562.24 rows=6 width=212) (actual time=28620.633..28620.633 rows=4 loops=3)"
"              Sort Key: l_returnflag, l_linestatus"
"              Sort Method: quicksort  Memory: 27kB"
"              Worker 0:  Sort Method: quicksort  Memory: 27kB"
"              Worker 1:  Sort Method: quicksort  Memory: 27kB"
"              ->  Partial HashAggregate  (cost=2630562.03..2630562.15 rows=6 width=212) (actual time=28620.607..28620.611 rows=4 loops=3)"
"                    Group Key: l_returnflag, l_linestatus"
"                    Batches: 1  Memory Usage: 24kB"
"                    Worker 0:  Batches: 1  Memory Usage: 24kB"
"                    Worker 1:  Batches: 1  Memory Usage: 24kB"
"                    ->  Parallel Seq Scan on lineitem  (cost=0.00..1707452.35 rows=24616258 width=24) (actual time=0.549..19028.353 rows=19701655 loops=3)"
"                          Filter: (l_shipdate <= '1998-08-31'::date)"
"                          Rows Removed by Filter: 293696"
"Planning Time: 0.374 ms"
"Execution Time: 28624.523 ms"

Why does the optimizer prefer a sequential scan on lineitem over using the index l_shipdate_c_idx? Should I drop it?

Postgres version: PostgreSQL 14.2 on x86_64-apple-darwin20.6.0, compiled by Apple clang version 12.0.0 (clang-1200.0.32.29), 64-bit
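
In the plan above, the filter removes about 294 thousand rows per process and keeps about 19.7 million, so roughly 98.5% of the rows satisfy the predicate. With a predicate that unselective, an index is unlikely to save any work. The sketch below is one way to measure that fraction directly; it assumes only the lineitem table and the date from the question.

```sql
-- Percentage of lineitem rows that satisfy the WHERE clause of the query.
-- NULLIF avoids a division-by-zero error on an empty table.
SELECT round(
         100.0 * count(*) FILTER (WHERE l_shipdate <= DATE '1998-08-31')
               / NULLIF(count(*), 0),
         2
       ) AS pct_matching
FROM lineitem;
```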

tail

Asked: 2022-05-12 12:24:19 +0800 CST

Does a column's data type affect query performance?

6

Suppose I have this table:

                    Table "public.orders"
     Column      |     Type      | Collation | Nullable | Default 
-----------------+---------------+-----------+----------+---------
 o_orderkey      | integer       |           | not null | 
 o_custkey       | integer       |           |          | 
 o_orderstatus   | character(1)  |           |          | 
 o_totalprice    | numeric(12,2) |           |          | 
 o_orderdate     | date          |           |          | 
 o_orderpriority | character(15) |           |          | 
 o_clerk         | character(15) |           |          | 
 o_shippriority  | integer       |           |          | 
 o_comment       | character(79) |           |          | 

If I have queries involving the o_orderstatus, o_orderpriority, o_clerk, or o_comment columns, can I change their data type from char(n) to text to improve them?
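
The PostgreSQL documentation states that there is no performance difference among char(n), varchar(n), and text apart from the extra storage used by blank padding. Any gain would therefore likely come from smaller rows rather than faster comparisons. If one wanted to try the change, a sketch could look like the following. Note that converting char(n) to text drops trailing spaces, and that ALTER TABLE holds an ACCESS EXCLUSIVE lock on the table while it runs.

```sql
-- Convert the blank-padded columns of orders to text in a single statement.
-- Trailing padding spaces are removed by the char(n) -> text conversion.
BEGIN;

ALTER TABLE orders
    ALTER COLUMN o_orderstatus   TYPE text,
    ALTER COLUMN o_orderpriority TYPE text,
    ALTER COLUMN o_clerk         TYPE text,
    ALTER COLUMN o_comment       TYPE text;

COMMIT;

-- Compare the table size before and after to see the storage effect.
SELECT pg_size_pretty(pg_table_size('orders'));
```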

tail

Asked: 2022-05-12 08:01:50 +0800 CST

Expressing the predicates DATE >= X and DATE < Y as DATE in [X,Y)

0

According to this article:

I can express an upper and a lower bound as a single predicate.

How can I do that in PostgreSQL?
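
One possible reading of this is PostgreSQL's built-in range types: a daterange with '[)' bounds is inclusive at the lower end and exclusive at the upper end, and the <@ operator tests whether a value falls inside it. The sketch below uses made-up sample dates and may not be the technique the article had in mind.

```sql
-- Single containment predicate equivalent to: d >= X AND d < Y
-- The sample dates in VALUES are illustrative only.
SELECT d
FROM (VALUES (DATE '2022-01-01'),
             (DATE '2022-01-31'),
             (DATE '2022-02-01')) AS t(d)
WHERE d <@ daterange(DATE '2022-01-01', DATE '2022-02-01', '[)');
```

Here the first two dates match and 2022-02-01 does not, because the upper bound is exclusive.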

tail

Asked: 2022-05-07 00:47:26 +0800 CST

How can I check how many bytes my index takes to store?

0

Suppose we have this table:

CREATE TABLE CUSTOMER (
    C_CUSTKEY    INTEGER PRIMARY KEY,
    C_NAME       CHAR(25),
    C_ADDRESS    CHAR(40),
    C_NATIONKEY  INTEGER REFERENCES NATION(N_NATIONKEY),
    C_PHONE      CHAR(15),
    C_ACCTBAL    NUMERIC(12,2),
    C_MKTSEGMENT CHAR(10),
    C_COMMENT    CHAR(117)
)

As you can see, there is a PRIMARY KEY on the C_CUSTKEY attribute.

How can I check how many bytes that index takes to store?

I am using

SELECT
    pg_size_pretty (pg_indexes_size('customer'));

which returns 32 MB. Is that correct? Also, I use pg_table_size to check how many bytes a materialized view takes to store.
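
For comparison, pg_indexes_size reports the combined size of all indexes attached to a table, so it equals the primary key's size only when that is the table's sole index. A sketch that lists each index separately, assuming only the customer table from the question:

```sql
-- One row per index on customer, with its individual on-disk size.
SELECT i.indexrelid::regclass                                   AS index_name,
       pg_size_pretty(pg_relation_size(i.indexrelid::regclass)) AS index_size
FROM pg_index AS i
WHERE i.indrelid = 'customer'::regclass;
```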

tail

Asked: 2022-04-24 12:46:35 +0800 CST

Forcing a sequential scan instead of an index scan

0

I have this query:

select
    s_acctbal,s_name,n_name,p_partkey,p_mfgr,s_address,s_phone,s_comment
from
    part,supplier,partsupp,nation,region
where
    p_partkey=ps_partkey
and
    s_suppkey=ps_suppkey
and
    s_nationkey=n_nationkey
and
    n_regionkey=r_regionkey
and
    p_size=15
and
    p_type like '%BRASS'
and
    r_name='EUROPE'
and
    ps_supplycost=1.0
ORDER by
    s_acctbal desc ,n_name,s_name,p_partkey;

It uses p_partkey, s_suppkey, n_nationkey, and r_regionkey, which are primary keys, so the query plan is:

[
  {
    "Plan": {
      "Node Type": "Sort",
      "Parallel Aware": false,
      "Async Capable": false,
      "Actual Rows": 0,
      "Actual Loops": 1,
      "Sort Key": [
        "supplier.s_acctbal DESC",
        "nation.n_name",
        "supplier.s_name",
        "part.p_partkey"
      ],
      "Sort Method": "quicksort",
      "Sort Space Used": 25,
      "Sort Space Type": "Memory",
      "Plans": [
        {
          "Node Type": "Nested Loop",
          "Parent Relationship": "Outer",
          "Parallel Aware": false,
          "Async Capable": false,
          "Join Type": "Inner",
          "Actual Rows": 0,
          "Actual Loops": 1,
          "Inner Unique": true,
          "Join Filter": "(nation.n_regionkey = region.r_regionkey)",
          "Rows Removed by Join Filter": 0,
          "Plans": [
            {
              "Node Type": "Nested Loop",
              "Parent Relationship": "Outer",
              "Parallel Aware": false,
              "Async Capable": false,
              "Join Type": "Inner",
              "Actual Rows": 0,
              "Actual Loops": 1,
              "Inner Unique": true,
              "Plans": [
                {
                  "Node Type": "Nested Loop",
                  "Parent Relationship": "Outer",
                  "Parallel Aware": false,
                  "Async Capable": false,
                  "Join Type": "Inner",
                  "Actual Rows": 0,
                  "Actual Loops": 1,
                  "Inner Unique": true,
                  "Plans": [
                    {
                      "Node Type": "Gather",
                      "Parent Relationship": "Outer",
                      "Parallel Aware": false,
                      "Async Capable": false,
                      "Actual Rows": 0,
                      "Actual Loops": 1,
                      "Workers Planned": 2,
                      "Workers Launched": 2,
                      "Single Copy": false,
                      "Plans": [
                        {
                          "Node Type": "Nested Loop",
                          "Parent Relationship": "Outer",
                          "Parallel Aware": false,
                          "Async Capable": false,
                          "Join Type": "Inner",
                          "Actual Rows": 0,
                          "Actual Loops": 3,
                          "Inner Unique": true,
                          "Workers": [],
                          "Plans": [
                            {
                              "Node Type": "Seq Scan",
                              "Parent Relationship": "Outer",
                              "Parallel Aware": true,
                              "Async Capable": false,
                              "Relation Name": "partsupp",
                              "Alias": "partsupp",
                              "Actual Rows": 30,
                              "Actual Loops": 3,
                              "Filter": "(ps_supplycost = 1.0)",
                              "Rows Removed by Filter": 2666636,
                              "Workers": []
                            },
                            {
                              "Node Type": "Memoize",
                              "Parent Relationship": "Inner",
                              "Parallel Aware": false,
                              "Async Capable": false,
                              "Actual Rows": 0,
                              "Actual Loops": 91,
                              "Cache Key": "partsupp.ps_partkey",
                              "Cache Mode": "logical",
                              "Cache Hits": 0,
                              "Cache Misses": 29,
                              "Cache Evictions": 0,
                              "Cache Overflows": 0,
                              "Peak Memory Usage": 2,
                              "Workers": [
                                {
                                  "Worker Number": 0,
                                  "Cache Hits": 0,
                                  "Cache Misses": 29,
                                  "Cache Evictions": 0,
                                  "Cache Overflows": 0,
                                  "Peak Memory Usage": 2
                                },
                                {
                                  "Worker Number": 1,
                                  "Cache Hits": 0,
                                  "Cache Misses": 33,
                                  "Cache Evictions": 0,
                                  "Cache Overflows": 0,
                                  "Peak Memory Usage": 3
                                }
                              ],
                              "Plans": [
                                {
                                  "Node Type": "Index Scan",
                                  "Parent Relationship": "Outer",
                                  "Parallel Aware": false,
                                  "Async Capable": false,
                                  "Scan Direction": "Forward",
                                  "Index Name": "part_pkey",
                                  "Relation Name": "part",
                                  "Alias": "part",
                                  "Actual Rows": 0,
                                  "Actual Loops": 91,
                                  "Index Cond": "(p_partkey = partsupp.ps_partkey)",
                                  "Rows Removed by Index Recheck": 0,
                                  "Filter": "((p_type ~~ '%BRASS'::text) AND (p_size = 15))",
                                  "Rows Removed by Filter": 1,
                                  "Workers": []
                                }
                              ]
                            }
                          ]
                        }
                      ]
                    },
                    {
                      "Node Type": "Index Scan",
                      "Parent Relationship": "Inner",
                      "Parallel Aware": false,
                      "Async Capable": false,
                      "Scan Direction": "Forward",
                      "Index Name": "supplier_pkey",
                      "Relation Name": "supplier",
                      "Alias": "supplier",
                      "Actual Rows": 0,
                      "Actual Loops": 0,
                      "Index Cond": "(s_suppkey = partsupp.ps_suppkey)",
                      "Rows Removed by Index Recheck": 0
                    }
                  ]
                },
                {
                  "Node Type": "Index Scan",
                  "Parent Relationship": "Inner",
                  "Parallel Aware": false,
                  "Async Capable": false,
                  "Scan Direction": "Forward",
                  "Index Name": "nation_pkey",
                  "Relation Name": "nation",
                  "Alias": "nation",
                  "Actual Rows": 0,
                  "Actual Loops": 0,
                  "Index Cond": "(n_nationkey = supplier.s_nationkey)",
                  "Rows Removed by Index Recheck": 0
                }
              ]
            },
            {
              "Node Type": "Seq Scan",
              "Parent Relationship": "Inner",
              "Parallel Aware": false,
              "Async Capable": false,
              "Relation Name": "region",
              "Alias": "region",
              "Actual Rows": 0,
              "Actual Loops": 0,
              "Filter": "(r_name = 'EUROPE'::bpchar)",
              "Rows Removed by Filter": 0
            }
          ]
        }
      ]
    },
    "Triggers": []
  }
]

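As you can see, my query takes about 13 seconds to execute. I would like to see how many seconds it would take if those primary keys had not been created. That is, I want to force the optimizer to choose sequential scans instead of index scans.

How can I do that?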
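
One approach that does not require dropping the keys is to turn off the planner's enable_* settings for a single transaction. These settings discourage the planner from choosing the plan types rather than strictly forbidding them. The query on part below is a short stand-in, and the full query from the question would go in its place.

```sql
BEGIN;

-- SET LOCAL limits each change to this transaction only.
SET LOCAL enable_indexscan     = off;
SET LOCAL enable_indexonlyscan = off;
SET LOCAL enable_bitmapscan    = off;

-- Stand-in statement; replace it with the query being measured.
EXPLAIN ANALYZE
SELECT *
FROM part
WHERE p_partkey = 1;

-- Ending the transaction restores the previous planner settings.
ROLLBACK;
```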

tail

Asked: 2022-04-15 08:27:11 +0800 CST

Optimizing a query with GROUP BY, ORDER BY, and multiple SUM and AVG operations

0

I have this query, from the TPC-H benchmark:

explain analyze select
    l_returnflag,
    l_linestatus,
    sum(l_quantity) as sum_qty,
    sum(l_extendedprice) as sum_base_price,
    sum(l_extendedprice*(1 - l_discount)) as sum_disc_price,
    sum(l_extendedprice*(1 - l_discount)*(1 + l_tax)) as sum_charge,
    avg(l_quantity) as avg_qty,
    avg(l_extendedprice) as avg_price,
    avg(l_discount) as avg_disc,
    count(*) as count_order
from
    lineitem
where
    l_shipdate<='31/08/1998'
GROUP by
    l_returnflag,
    l_linestatus
ORDER by
    l_returnflag,
    l_linestatus

which returns this:

"Finalize GroupAggregate  (cost=2300777.06..2300779.00 rows=6 width=212) (actual time=38289.923..38290.426 rows=4 loops=1)"
"  Group Key: l_returnflag, l_linestatus"
"  ->  Gather Merge  (cost=2300777.06..2300778.46 rows=12 width=212) (actual time=38289.907..38290.390 rows=12 loops=1)"
"        Workers Planned: 2"
"        Workers Launched: 2"
"        ->  Sort  (cost=2299777.04..2299777.05 rows=6 width=212) (actual time=38284.169..38284.169 rows=4 loops=3)"
"              Sort Key: l_returnflag, l_linestatus"
"              Sort Method: quicksort  Memory: 27kB"
"              Worker 0:  Sort Method: quicksort  Memory: 27kB"
"              Worker 1:  Sort Method: quicksort  Memory: 27kB"
"              ->  Partial HashAggregate  (cost=2299776.84..2299776.96 rows=6 width=212) (actual time=38284.129..38284.133 rows=4 loops=3)"
"                    Group Key: l_returnflag, l_linestatus"
"                    Batches: 1  Memory Usage: 24kB"
"                    Worker 0:  Batches: 1  Memory Usage: 24kB"
"                    Worker 1:  Batches: 1  Memory Usage: 24kB"
"                    ->  Parallel Seq Scan on lineitem  (cost=0.00..1493832.54 rows=21491848 width=24) (actual time=0.281..29321.949 rows=17236798 loops=3)"
"                          Filter: (l_shipdate <= '1998-08-31'::date)"
"                          Rows Removed by Filter: 256933"
"Planning Time: 3.870 ms"
"Execution Time: 38290.784 ms"

It involves this relation:

CREATE TABLE LINEITEM (
    L_ORDERKEY      INTEGER REFERENCES ORDERS(O_ORDERKEY),
    L_PARTKEY       INTEGER REFERENCES PART(P_PARTKEY),
    L_SUPPKEY       INTEGER REFERENCES SUPPLIER(S_SUPPKEY),
    L_LINENUMBER    INTEGER,
    L_QUANTITY      INTEGER,
    L_EXTENDEDPRICE NUMERIC(12,2),
    L_DISCOUNT      NUMERIC(12,2),
    L_TAX           NUMERIC(12,2),
    L_RETURNFLAG    CHAR(1),
    L_LINESTATUS    CHAR(1),
    L_SHIPDATE      DATE,
    L_COMMITDATE    DATE,
    L_RECEIPTDATE   DATE,
    L_SHIPINSTRUCT  CHAR(25),
    L_SHIPMODE      CHAR(10),
    L_COMMENT       CHAR(44),
    L_PARTSUPPKEY   CHAR(20) REFERENCES PARTSUPP(PS_PARTSUPPKEY)
)

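As you can see, it takes about 40 seconds, and I would like to optimize it. I added a B-tree index on the L_SHIPDATE column (sort order ASC, NULLS LAST).

How can I do better?

As you can see here, the optimizer does not use the index on l_shipdate; it prefers a sequential scan of the lineitem table.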
