Question

Chloe

Asked: 2020-01-21 18:43:17 +0800 CST

Why doesn't MySQL use an index for the subquery?

This query takes forever to run (anywhere from 30+ minutes to never finishing).

select date, 
       sc, 
       ( select count(fingerprint_id) 
         from stats 
         where hit_date >= t.date 
           and hit_date < date_add('2020-01-20', interval 1 day) 
           and hit_type = 0 
           and fingerprint_id is not null ) as total_fingerprint
from ( select date(hit_date) as date, 
              sum(sc) as sc 
       from delayed_stats  
       where hit_date > date_sub(now(), interval 1 day) 
       group by date(hit_date) 
       order by hit_date) t;

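Run on their own, the two queries take 1 second and 8 seconds, but combined they never finish. I expected 8-9 seconds. If I replace t.date with the static '2020-01-20', it takes 8 seconds. Simply swapping the static date for t.date causes the query to "hang". The minimal query that reproduces the hang is: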

select date, 
       (select count(fingerprint_id) from stats where hit_date >= t.date and hit_date < date_add(t.date, interval 1 day) and hit_type = 0 and fingerprint_id is not null) as total_fingerprint
from (select '2020-01-01' as date union select '2020-01-02' as date) t;

Here is the EXPLAIN output for the query:

+----+--------------------+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+---------------+--------------+---------+------+-----------+----------+--------------------------------------------------------+
| id | select_type        | table         | partitions                                                                                                                                | type  | possible_keys | key          | key_len | ref  | rows      | filtered | Extra                                                  |
+----+--------------------+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+---------------+--------------+---------+------+-----------+----------+--------------------------------------------------------+
|  1 | PRIMARY            | <derived3>    | NULL                                                                                                                                       | ALL   | NULL          | NULL         | NULL    | NULL |      7496 |   100.00 | NULL                                                   |
|  3 | DERIVED            | delayed_stats | NULL                                                                                                                                       | range | hit_date_idx  | hit_date_idx | 5       | NULL |      7496 |   100.00 | Using index condition; Using temporary; Using filesort |
|  2 | DEPENDENT SUBQUERY | stats         | p20180101,p20180201,p20180301,p20180401,p20180501,p20180601,p20180701,p20180801,p20180901,p20181001,p20181101,p20181201,p20190101,p20190201,p20190301,p20190401,p20190501,p20190601,p20190701,p20190801,p20190901,p20191001,p20191101,p20191201,p20200101,p20200201 | ALL   | NULL          | NULL         | NULL    | NULL | 316867000 |     1.00 | Using where                                            |
+----+--------------------+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+---------------+--------------+---------+------+-----------+----------+--------------------------------------------------------+
3 rows in set, 2 warnings (0.11 sec)
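
In the plan above, the DEPENDENT SUBQUERY row lists every partition of stats with type ALL and roughly 316 million rows. One way to check whether the correlated reference to t.date is what prevents partition pruning is to explain the subquery on its own with a literal date and compare the partitions column. This is only a diagnostic sketch: it assumes stats is range-partitioned on hit_date, which the partition names suggest but the question does not confirm. The SHOW WARNINGS statement displays the notes that EXPLAIN reported (the "2 warnings" line above).

```sql
-- Diagnostic sketch: same predicates as the subquery, but with a literal
-- date instead of t.date. Compare the "partitions" and "rows" columns
-- with the DEPENDENT SUBQUERY row in the plan above.
explain
select count(fingerprint_id)
from stats
where hit_date >= '2020-01-20'
  and hit_date < date_add('2020-01-20', interval 1 day)
  and hit_type = 0
  and fingerprint_id is not null;

-- Run immediately after an EXPLAIN to read the notes it produced.
show warnings;
```

If the literal-date plan lists only one or two partitions while the correlated plan lists all of them, that difference would account for the gap between 8 seconds and a query that never finishes.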

It doesn't appear to use the hit_date index on the stats table (PRIMARY KEY (id, hit_date)) in the subquery. My ultimate goal is to combine these two queries (with interval 30 day):

select date(hit_date), 
       sum(sc) 
from delayed_stats 
where hit_date > date_sub(now(), interval 30 day) 
group by date(hit_date) 
order by hit_date;

select date(hit_date), 
       count(fingerprint_id) 
from stats 
where hit_date > date_sub(now(), interval 30 day) 
  and hit_type = 0 
  and fingerprint_id is not null 
group by date(hit_date) 
order by hit_date; -- 2m21s

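When I look at the query plan for the second query, on the stats table, it shows possible_keys as PRIMARY,source_id,stats_bag_id_idx. I tried another way of combining them, a join, but it takes 15m to run when it should only take 2m.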

select t.date, 
       sc, 
       fingerprint_count 
from ( select date(hit_date) date, 
              sum(sc) as sc 
       from delayed_stats 
       where hit_date > date_sub(now(), interval 30 day) 
       group by date(hit_date) 
       order by hit_date ) t 
join ( select date(hit_date) date, 
              count(fingerprint_id) as fingerprint_count 
       from stats 
       where hit_date > date_sub(now(), interval 30 day) 
         and hit_type = 0 
         and fingerprint_id is not null 
       group by date(hit_date) 
       order by hit_date ) t2 on t.date = t2.date;
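
A possible variant of the join above, offered as an untested sketch and not as a measured improvement: the inner order by clauses are dropped and the result is sorted once in the outer query, since the order of a join result is not guaranteed without an outer order by. Table and column names are taken from the question; nothing here has been run against these tables, so the timing may not change at all.

```sql
-- Sketch: aggregate each table per day in its own derived table,
-- join the two daily result sets, and sort only once at the end.
select t.date,
       t.sc,
       t2.fingerprint_count
from ( select date(hit_date) as date,
              sum(sc) as sc
       from delayed_stats
       where hit_date > date_sub(now(), interval 30 day)
       group by date(hit_date) ) t
join ( select date(hit_date) as date,
              count(fingerprint_id) as fingerprint_count
       from stats
       where hit_date > date_sub(now(), interval 30 day)
         and hit_type = 0
         and fingerprint_id is not null
       group by date(hit_date) ) t2 on t.date = t2.date
order by t.date;
```

Note that an inner join drops any day that is present in one derived table but missing from the other; a left join from t would keep every day that has rows in delayed_stats.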

1 Answer

Best Answer

Chloe · 2020-01-24T13:17:34+08:00

I solved this by using the last join example, and was able to string 8 different queries together using this structure:

select date, sc, fc, fc10s, fc30s, ...
from (select date(hit_date) as date, ...) t
join (select date(hit_date) as date, ...) t2 on t.date = t2.date
join (...) t3 on t.date = t3.date
...

It ran for about 20m, which is fine for a report and less than I expected.
