
Question

Chloe

Asked: 2018-12-13 11:40:19 +0800 CST

How can I speed up this 2m5s query that has indexes?


select urls.id as urlId, 
    count(case when s1.hit_type = 0 then 1 end) as aCount, 
    count(case when s1.hit_type = 1 then 1 end) as bCount, 
    count(case when s1.hit_type = 2 then 1 end) as cCount, 
    count(distinct s1.source_id) as sourcesCount 
from urls join stats s1 on urls.id = s1.url_id 
where s1.hit_date >= '2017-12-12' 
group by urls.id 
order by aCount desc 
limit 0,100;

mysql> show create table stats;

| stats | CREATE TABLE `stats` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `url_id` varchar(100) DEFAULT NULL,
  `hit_date` datetime DEFAULT NULL,
  `hit_type` tinyint(4) DEFAULT NULL,
  `source_id` bigint(20) unsigned DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `url_id_idx` (`url_id`),
  KEY `source_id` (`source_id`),
  KEY `stats_hit_date_idx` (`hit_date`),
  CONSTRAINT `stats_ibfk_1` FOREIGN KEY (`url_id`) REFERENCES `urls` (`ID`),
  CONSTRAINT `stats_ibfk_2` FOREIGN KEY (`source_id`) REFERENCES `sources` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=6027557 DEFAULT CHARSET=latin1 |

mysql> describe select...
| id | select_type | table   | type   | possible_keys                                                                                   | key     | key_len | ref                      | rows    | Extra                                        |
+----+-------------+---------+--------+-------------------------------------------------------------------------------------------------+---------+---------+--------------------------+---------+----------------------------------------------+
|  1 | SIMPLE      | s1      | ALL    | url_id_idx,stats_hit_date_idx                                                                   | NULL    | NULL    | NULL                     | 5869695 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | urls    | eq_ref | PRIMARY,urls_email_idx,urls_status_idx,deptId_idx,deptId_status_email_idx                       | PRIMARY | 102     | db.s1.url_id             |     1   | Using index                                  |

It doesn't seem to be using the hit_date index or the url_id index.

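I tried using a subselect, (select count(*) from stats where url_id = ... and hit_date >= ... and hit_type = 0) as aCount, which was faster and took 24 seconds. Is there a way to get it under 5 seconds? The limit for the whole request is 30 seconds.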

MySQL server version: 5.6.35-log MySQL Community Server (GPL)

2 Answers


Akina · Answer 1 · 2018-12-13T21:19:51+08:00

Your query is equivalent to

select /* urls.id */ s1.url_id as urlId, 
    count(case when s1.hit_type = 0 then 1 end) as aCount, 
    count(case when s1.hit_type = 1 then 1 end) as bCount, 
    count(case when s1.hit_type = 2 then 1 end) as cCount, 
    count(distinct s1.source_id) as sourcesCount 
from /* urls join */ stats s1 /* on urls.id = s1.url_id */
where s1.hit_date >= '2017-12-12' 
group by /* urls.id */ s1.url_id
order by aCount desc 
limit 0,100;

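except that your query returns only the stats records for which a matching ("paired") record exists in the urls table.

But the constraint

CONSTRAINT `stats_ibfk_1` FOREIGN KEY (`url_id`) REFERENCES `urls` (`ID`)

does not allow stats records without such a match.

So my query returns the same result as yours, and you can use it instead.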
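
One caveat may be worth checking before dropping the join: url_id is declared DEFAULT NULL, and a foreign key does not reject NULL values. Rows with a NULL url_id would be dropped by the join but grouped under a NULL urlId by the join-free query. A minimal check, using only the stats table shown in the question:

```sql
-- Count stats rows that the inner join to urls would have dropped.
-- If this returns 0, the join and join-free queries see the same rows.
SELECT COUNT(*) AS null_url_rows
FROM stats
WHERE url_id IS NULL;
```

If the count is non-zero, adding `AND s1.url_id IS NOT NULL` to the WHERE clause of the join-free query should restore the equivalence.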

To speed up this query, you can create a covering index

ALTER TABLE stats ADD INDEX idx (url_id, hit_date, hit_type, source_id)
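
To see whether the optimizer actually picks up the new index, the plan can be checked again after the ALTER TABLE. This is only a sketch: the plan depends on the data and on server settings, so no particular output is assumed here.

```sql
-- Re-check the plan for the join-free query once the index `idx` exists.
EXPLAIN
select s1.url_id as urlId,
    count(case when s1.hit_type = 0 then 1 end) as aCount,
    count(case when s1.hit_type = 1 then 1 end) as bCount,
    count(case when s1.hit_type = 2 then 1 end) as cCount,
    count(distinct s1.source_id) as sourcesCount
from stats s1
where s1.hit_date >= '2017-12-12'
group by s1.url_id
order by aCount desc
limit 0,100;
```

If the index is used as a covering index, the Extra column for s1 should include "Using index". The ORDER BY on the aggregate aCount will most likely still need a sort, since no index can supply that order.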

The best approach is to move url_id into a separate table and replace it with a reference of a numeric type (grouping by a VARCHAR field is expensive).
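
A rough sketch of that change follows. The names num_id, url_num_id, urls_num_id_idx and stats_url_num_idx are hypothetical and not part of the original schema. The sketch also assumes that urls has no AUTO_INCREMENT column yet, and it reuses the existing urls table for the numeric key rather than creating a new one. Backfilling about 6 million rows is a heavy operation, so this would need testing on a copy first.

```sql
-- 1. Give every url a compact numeric key (hypothetical column num_id).
ALTER TABLE urls
    ADD COLUMN num_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    ADD UNIQUE KEY urls_num_id_idx (num_id);

-- 2. Add the numeric reference to stats and backfill it from urls.
ALTER TABLE stats ADD COLUMN url_num_id INT UNSIGNED DEFAULT NULL;

UPDATE stats s
JOIN urls u ON u.ID = s.url_id
SET s.url_num_id = u.num_id;

-- 3. Covering index that leads with the numeric key instead of the VARCHAR.
ALTER TABLE stats
    ADD INDEX stats_url_num_idx (url_num_id, hit_date, hit_type, source_id);

-- 4. Aggregate on the numeric key, then look up the original id
--    for the top 100 groups only.
SELECT u.ID AS urlId, t.aCount, t.bCount, t.cCount, t.sourcesCount
FROM (
    SELECT s1.url_num_id,
           COUNT(CASE WHEN s1.hit_type = 0 THEN 1 END) AS aCount,
           COUNT(CASE WHEN s1.hit_type = 1 THEN 1 END) AS bCount,
           COUNT(CASE WHEN s1.hit_type = 2 THEN 1 END) AS cCount,
           COUNT(DISTINCT s1.source_id) AS sourcesCount
    FROM stats s1
    WHERE s1.hit_date >= '2017-12-12'
    GROUP BY s1.url_num_id
    ORDER BY aCount DESC
    LIMIT 0,100
) t
JOIN urls u ON u.num_id = t.url_num_id
ORDER BY t.aCount DESC;
```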

Also, count(case when s1.hit_type = N then 1 end) can be replaced with the shorter SUM(s1.hit_type = N).
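
As a sketch, the join-free query with that substitution would look like the following. It relies on MySQL evaluating a comparison to 1 or 0. One difference to keep in mind: for a group in which every hit_type is NULL, SUM returns NULL where COUNT returns 0.

```sql
-- Same aggregation as before, with SUM over boolean expressions
-- (each comparison evaluates to 1 or 0 in MySQL; NULLs are ignored by SUM).
select s1.url_id as urlId,
    sum(s1.hit_type = 0) as aCount,
    sum(s1.hit_type = 1) as bCount,
    sum(s1.hit_type = 2) as cCount,
    count(distinct s1.source_id) as sourcesCount
from stats s1
where s1.hit_date >= '2017-12-12'
group by s1.url_id
order by aCount desc
limit 0,100;
```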

To speed up the whole query, I suggest trying to split it into 4 separate queries:

SELECT urlId, 
       MAX(aCount) aCount, 
       MAX(bCount) bCount, 
       MAX(cCount) cCount, 
       MAX(sourcesCount) sourcesCount 
FROM (  select  s1.url_id as urlId, 
                COUNT(*) as aCount, 
                0 as bCount, 
                0 as cCount, 
                0 as sourcesCount 
        from stats s1 
        where s1.hit_date >= '2017-12-12' AND s1.hit_type = 0
        group by s1.url_id
      UNION ALL
        select  s1.url_id, 0, COUNT(*), 0, 0
        from stats s1 
        where s1.hit_date >= '2017-12-12'  AND s1.hit_type = 1
        group by s1.url_id
      UNION ALL
        select  s1.url_id as urlId, 0, 0, COUNT(*), 0
        from stats s1 
        where s1.hit_date >= '2017-12-12'  AND s1.hit_type = 2
        group by s1.url_id
      UNION ALL
        select  s1.url_id as urlId, 0, 0, 0, count(distinct s1.source_id)
        from stats s1 
        where s1.hit_date >= '2017-12-12' 
        group by s1.url_id
    ) x
GROUP BY urlId
order by aCount desc 
limit 0,100;

An index on (url_id, hit_type, hit_date) will speed up the first 3 subqueries, and an index on (url_id, hit_date, source_id) will speed up the last subquery.
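
For reference, those two indexes could be created as follows. The index names are made up for illustration; only the column lists come from the answer.

```sql
-- Hypothetical index names; the column order is the one suggested above.
ALTER TABLE stats
    ADD INDEX stats_url_type_date_idx (url_id, hit_type, hit_date),
    ADD INDEX stats_url_date_source_idx (url_id, hit_date, source_id);
```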

danblack · Answer 2 · 2018-12-13T20:46:33+08:00


Your query has to read 5869695+ rows, match them against another table, and only then produce the summary.

Getting this in < 5 seconds is a big ask.

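Since your data seems to be fairly stable once it has been inserted, I suggest creating a summary table based on the date and using it for the {a,b,c}Count values.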
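
A minimal sketch of such a summary table is shown below. The table name stats_daily, its column names, and the example day being refreshed are assumptions for illustration. The sketch covers only the three hit-type counts: a distinct source count cannot be added up across days, so sourcesCount would still need a separate query or a finer-grained summary.

```sql
-- Hypothetical summary table: one row per day and url.
CREATE TABLE stats_daily (
    hit_day date NOT NULL,
    url_id  varchar(100) NOT NULL,
    a_count int unsigned NOT NULL DEFAULT 0,
    b_count int unsigned NOT NULL DEFAULT 0,
    c_count int unsigned NOT NULL DEFAULT 0,
    PRIMARY KEY (hit_day, url_id)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

-- Rebuild the rows for a single day (the date here is only an example).
-- REPLACE overwrites any existing row with the same primary key.
REPLACE INTO stats_daily (hit_day, url_id, a_count, b_count, c_count)
SELECT DATE(hit_date),
       url_id,
       COUNT(CASE WHEN hit_type = 0 THEN 1 END),
       COUNT(CASE WHEN hit_type = 1 THEN 1 END),
       COUNT(CASE WHEN hit_type = 2 THEN 1 END)
FROM stats
WHERE hit_date >= '2018-12-12' AND hit_date < '2018-12-13'
  AND url_id IS NOT NULL
GROUP BY DATE(hit_date), url_id;

-- The report then sums the pre-aggregated daily rows instead of raw hits.
SELECT url_id AS urlId,
       SUM(a_count) AS aCount,
       SUM(b_count) AS bCount,
       SUM(c_count) AS cCount
FROM stats_daily
WHERE hit_day >= '2017-12-12'
GROUP BY url_id
ORDER BY aCount DESC
LIMIT 0,100;
```

The refresh statement could be run from a scheduled job once per day, or more often for the current day if the report must include recent hits.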
