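In an app called "Links", users post links and photos of interesting content they have recently discovered (and other users comment on those posts).
The comments posted under photos are saved in a links_photocomment table in my postgresql 9.6.5 DB.
One SELECT query on the links_photocomment table consistently shows up in slow_log. It takes longer than 500 ms, and is about 10 times slower than what I see in most other postgresql operations.
Here is an example of the corresponding SQL from my slow log:
LOG: duration: 5071.112 ms statement: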
SELECT "links_photocomment"."abuse",
"links_photocomment"."text",
"links_photocomment"."id",
"links_photocomment"."submitted_by_id",
"links_photocomment"."submitted_on",
"auth_user"."username",
"links_userprofile"."score"
FROM "links_photocomment"
INNER JOIN "auth_user"
ON ( "links_photocomment"."submitted_by_id" = "auth_user"."id" )
LEFT OUTER JOIN "links_userprofile"
ON ( "auth_user"."id" = "links_userprofile"."user_id" )
WHERE "links_photocomment"."which_photo_id" = 3115087
ORDER BY "links_photocomment"."id" DESC
LIMIT 25;
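See the explain analyze results: https://explain.depesz.com/s/UuCk
According to that, the query ends up filtering out 19,100,179 rows!
What I've tried:
My gut feeling is that Postgres is basing this query plan on misleading statistics. So I ran VACUUM ANALYZE on the said table. However, that didn't change anything.
Being an accidental DBA of sorts, I'm looking for some quick expert guidance on the subject. Thanks in advance, and apologies for the noob question (if it is one).
Addendum:
Here is the entire output of \d links_photocomment: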
                      Table "public.links_photocomment"
     Column      |           Type           |                            Modifiers
-----------------+--------------------------+-----------------------------------------------------------------
 id              | integer                  | not null default nextval('links_photocomment_id_seq'::regclass)
 which_photo_id  | integer                  | not null
 text            | text                     | not null
 device          | character varying(10)    | not null
 submitted_by_id | integer                  | not null
 submitted_on    | timestamp with time zone | not null
 image_comment   | character varying(100)   | not null
 has_image       | boolean                  | not null
 abuse           | boolean                  | default false
Indexes:
"links_photocomment_pkey" PRIMARY KEY, btree (id)
"links_photocomment_submitted_by_id" btree (submitted_by_id)
"links_photocomment_which_photo_id" btree (which_photo_id)
Foreign-key constraints:
"links_photocomment_submitted_by_id_fkey" FOREIGN KEY (submitted_by_id) REFERENCES auth_user(id) DEFERRABLE INITIALLY DEFERRED
"links_photocomment_which_photo_id_fkey" FOREIGN KEY (which_photo_id) REFERENCES links_photo(id) DEFERRABLE INITIALLY DEFERRED
Referenced by:
TABLE "links_photo" CONSTRAINT "latest_comment_id_refs_id_f2566197" FOREIGN KEY (latest_comment_id) REFERENCES links_photocomment(id) DEFERRABLE INITIALLY DEFERRED
TABLE "links_report" CONSTRAINT "links_report_which_photocomment_id_fkey" FOREIGN KEY (which_photocomment_id) REFERENCES links_photocomment(id) DEFERRABLE INITIALLY DEFERRED
TABLE "links_photo" CONSTRAINT "second_latest_comment_id_refs_id_f2566197" FOREIGN KEY (second_latest_comment_id) REFERENCES links_photocomment(id) DEFERRABLE INITIALLY DEFERRED
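The plan does not use the index on (which_photo_id) but the PK (id) index instead. It walks that index in ORDER BY order and discards every row that fails the WHERE filter, so it has to read a big part of the index (all of it when fewer than 25 rows match the filter). In this specific execution that took about 4.4 seconds (the 25 rows were found only after reading and rejecting 19M rows).
I would try these: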
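1. Replace the index on (which_photo_id) with an index on (which_photo_id, id).
2. Rewrite the INNER join as a LEFT join (there is a FOREIGN KEY constraint, and submitted_by_id is NOT NULL, which ensures the two queries will produce the same results).
3. Rewrite with a subquery (derived table or CTE), moving the WHERE filter inside it, so that the 25 ids are fetched first (hopefully with an index-only scan) and only then joined to the other 2 tables.
Query (with a derived table):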
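Something along these lines; treat it as an untested sketch rather than a drop-in replacement. The aliases (ids, pc, u, up) are my own, and the LEFT JOIN to auth_user relies on the foreign key and NOT NULL constraint shown in the \d output above.

```sql
-- Sketch only: aliases are illustrative, not taken from the application.
SELECT pc.abuse,
       pc.text,
       pc.id,
       pc.submitted_by_id,
       pc.submitted_on,
       u.username,
       up.score
FROM (
        -- Fetch the 25 newest comment ids for this photo first.
        -- With an index on (which_photo_id, id) this can be a short
        -- backward index scan instead of a walk over the whole PK.
        SELECT id
        FROM links_photocomment
        WHERE which_photo_id = 3115087
        ORDER BY id DESC
        LIMIT 25
     ) AS ids
JOIN links_photocomment AS pc ON pc.id = ids.id
LEFT JOIN auth_user AS u ON u.id = pc.submitted_by_id
LEFT JOIN links_userprofile AS up ON up.user_id = u.id
ORDER BY pc.id DESC;
```

The outer ORDER BY is repeated because the order of the derived table is not guaranteed to survive the joins.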
Query (with a CTE):
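Again a sketch with my own aliases, not tested against your data. On 9.6 a CTE is always evaluated separately from the outer query, so the 25 ids are picked before any join happens.

```sql
-- Sketch only: the CTE name and aliases are illustrative.
WITH ids AS (
    -- The filter and LIMIT live inside the CTE, so only 25 ids leave it.
    SELECT id
    FROM links_photocomment
    WHERE which_photo_id = 3115087
    ORDER BY id DESC
    LIMIT 25
)
SELECT pc.abuse,
       pc.text,
       pc.id,
       pc.submitted_by_id,
       pc.submitted_on,
       u.username,
       up.score
FROM ids
JOIN links_photocomment AS pc ON pc.id = ids.id
LEFT JOIN auth_user AS u ON u.id = pc.submitted_by_id
LEFT JOIN links_userprofile AS up ON up.user_id = u.id
ORDER BY pc.id DESC;
```

For the index replacement, one possible way to do it without blocking writes is shown below. The new index name is hypothetical, and both statements have to run outside a transaction block because of CONCURRENTLY.

```sql
-- Hypothetical index name; build the composite index first.
CREATE INDEX CONCURRENTLY links_photocomment_which_photo_id_id
    ON links_photocomment (which_photo_id, id);

-- The old single-column index is then redundant, since the new one
-- has the same leading column.
DROP INDEX CONCURRENTLY links_photocomment_which_photo_id;
```

After that, run EXPLAIN ANALYZE on the query again to check that the plan now uses the new index.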