AskOverflow.Dev


Chloe's questions

Chloe
Asked: 2020-01-21 18:43:17 +0800 CST

Why doesn't MySQL use an index for the subquery?

  • 0

This query takes forever to run (30+ minutes, possibly never finishing).

select date, 
       sc, 
       ( select count(fingerprint_id) 
         from stats 
         where hit_date >= t.date 
           and hit_date < date_add('2020-01-20', interval 1 day) 
           and hit_type = 0 
           and fingerprint_id is not null ) as total_fingerprint
from ( select date(hit_date) as date, 
              sum(sc) as sc 
       from delayed_stats  
       where hit_date > date_sub(now(), interval 1 day) 
       group by date(hit_date) 
       order by hit_date) t;

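Run individually, the two queries take 1 second and 8 seconds, but combined they never finish. I expected 8-9 seconds. If I replace t.date with the static '2020-01-20', it takes 8 seconds. Merely replacing one static date with t.date makes the query "hang". The minimal query that reproduces the hang is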

select date, 
       (select count(fingerprint_id) from stats where hit_date >= t.date and hit_date < date_add(t.date, interval 1 day) and hit_type = 0 and fingerprint_id is not null) as total_fingerprint
from (select '2020-01-01' as date union select '2020-01-02' as date) t;

Here is the EXPLAIN output:

+----+--------------------+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+---------------+--------------+---------+------+-----------+----------+--------------------------------------------------------+
| id | select_type        | table         | partitions                                                                                                                                | type  | possible_keys | key          | key_len | ref  | rows      | filtered | Extra                                                  |
+----+--------------------+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+---------------+--------------+---------+------+-----------+----------+--------------------------------------------------------+
|  1 | PRIMARY            | <derived3>    | NULL                                                                                                                                       | ALL   | NULL          | NULL         | NULL    | NULL |      7496 |   100.00 | NULL                                                   |
|  3 | DERIVED            | delayed_stats | NULL                                                                                                                                       | range | hit_date_idx  | hit_date_idx | 5       | NULL |      7496 |   100.00 | Using index condition; Using temporary; Using filesort |
|  2 | DEPENDENT SUBQUERY | stats         | p20180101,p20180201,p20180301,p20180401,p20180501,p20180601,p20180701,p20180801,p20180901,p20181001,p20181101,p20181201,p20190101,p20190201,p20190301,p20190401,p20190501,p20190601,p20190701,p20190801,p20190901,p20191001,p20191101,p20191201,p20200101,p20200201 | ALL   | NULL          | NULL         | NULL    | NULL | 316867000 |     1.00 | Using where                                            |
+----+--------------------+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+---------------+--------------+---------+------+-----------+----------+--------------------------------------------------------+
3 rows in set, 2 warnings (0.11 sec)

It doesn't appear to use the hit_date index (PRIMARY KEY (id, hit_date)) for the subquery on the stats table. My end goal is to combine these two queries (with interval 30 day):

select date(hit_date), 
       sum(sc) 
from delayed_stats 
where hit_date > date_sub(now(), interval 30 day) 
group by date(hit_date) 
order by hit_date;

select date(hit_date), 
       count(fingerprint_id) 
from stats 
where hit_date > date_sub(now(), interval 30 day) 
  and hit_type = 0 
  and fingerprint_id is not null 
group by date(hit_date) 
order by hit_date; -- 2m21s

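When I look at the query plan for the second query, on the stats table, it shows possible_keys as PRIMARY,source_id,stats_bag_id_idx. I tried another way of combining them, a join, but it takes 15m to run when it should only take about 2m.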

select t.date, 
       sc, 
       fingerprint_count 
from ( select date(hit_date) date, 
              sum(sc) as sc 
       from delayed_stats 
       where hit_date > date_sub(now(), interval 30 day) 
       group by date(hit_date) 
       order by hit_date ) t 
join ( select date(hit_date) date, 
              count(fingerprint_id) as fingerprint_count 
       from stats 
       where hit_date > date_sub(now(), interval 30 day) 
         and hit_type = 0 
         and fingerprint_id is not null 
       group by date(hit_date) 
       order by hit_date ) t2 on t.date = t2.date;
mysql performance
  • 1 answer
  • 1418 Views
Chloe
Asked: 2019-09-24 20:46:53 +0800 CST

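Is it faster to split a large table into 12 rolling monthly tables and use them for reporting, or to keep the large table and delete rows older than 1 year?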

  • 0

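My colleague wants to split a large 158M-row stats table into stats_jan, stats_feb, ... and report from them using UNION. Is this standard practice? Is it faster than just using the large table and deleting rows older than a year? The table has many small rows.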

mysql> describe stats;
+----------------+---------------------+------+-----+---------+----------------+
| Field          | Type                | Null | Key | Default | Extra          |
+----------------+---------------------+------+-----+---------+----------------+
| id             | bigint(20) unsigned | NO   | PRI | NULL    | auto_increment |
| badge_id       | bigint(20) unsigned | NO   | MUL | NULL    |                |
| hit_date       | datetime            | YES  | MUL | NULL    |                |
| hit_type       | tinyint(4)          | YES  |     | NULL    |                |
| source_id      | bigint(20) unsigned | YES  | MUL | NULL    |                |
| fingerprint_id | bigint(20) unsigned | YES  |     | NULL    |                |
+----------------+---------------------+------+-----+---------+----------------+

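I did split the table manually, copied the rows into the appropriate monthly tables, and built one huge UNION query. The UNION query took 14s, while the single-table query took 4.5m. Why do many smaller tables take so much less time than one large table when the total row count is the same?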

create table stats_jan (...);
create table stats_feb (...);
...
create index stats_jan_hit_date_idx on stats_jan (hit_date);
...
insert into stats_jan select * from stats where hit_date >= '2019-01-01' and hit_date < '2019-02-01';
...
delete from stats where hit_date < '2018-09-01';
...

The monthly tables have between 1.7M and 35M rows.

select host as `key`, count(*) as value from stats join sources on source_id = sources.id where hit_date >= '2019-08-21 19:43:19' and sources.host != 'NONE' group by source_id order by value desc limit 10;
4 min 30.39 sec

flush tables;
reset query cache;

select host as `key`, count(*) as value from stats_jan join sources on source_id = sources.id where hit_date >= '2019-08-21 19:43:19' and sources.host != 'NONE' group by source_id
UNION
...
order by value desc limit 10;
14.16 sec
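
One thing that may be worth checking before comparing the timings is whether the two reports return the same numbers. The sketch below assumes every UNION branch groups a single monthly table on its own, as the stats_jan branch shown above does; the other branches are elided in the post, so this is only an assumption. The two monthly tables and their rows are hypothetical, and SQLite (via Python's sqlite3) is used only so the example is runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical monthly tables with a handful of rows each.
    create table stats_aug (source_id integer, hit_date text);
    create table stats_sep (source_id integer, hit_date text);
    insert into stats_aug values (1, '2019-08-25'), (1, '2019-08-26'), (2, '2019-08-27');
    insert into stats_sep values (1, '2019-09-01'), (2, '2019-09-02'), (2, '2019-09-03');
""")

# Grouping inside each branch yields one row per source per month.
per_table = conn.execute("""
    select source_id, count(*) as value from stats_aug group by source_id
    union
    select source_id, count(*) as value from stats_sep group by source_id
    order by source_id, value
""").fetchall()

# Summing the per-month counts afterwards restores one row per source.
totals = conn.execute("""
    select source_id, sum(value) as value
      from (select source_id, count(*) as value from stats_aug group by source_id
            union all
            select source_id, count(*) as value from stats_sep group by source_id)
     group by source_id
     order by source_id
""").fetchall()

print(per_table)  # one row per (source, month)
print(totals)     # one row per source
```

With per-branch grouping, a source that appears in several months shows up several times with partial counts, so the top 10 can differ from the single-table report. UNION also removes duplicate rows, which is why the second query uses UNION ALL before summing.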
mysql database-design
  • 1 answer
  • 562 Views
Chloe
Asked: 2018-12-13 11:40:19 +0800 CST

How can I speed up this 2m5s query that has indexes?

  • 0

How can I speed up this 2m5s query that has indexes?

select urls.id as urlId, 
    count(case when s1.hit_type = 0 then 1 end) as aCount, 
    count(case when s1.hit_type = 1 then 1 end) as bCount, 
    count(case when s1.hit_type = 2 then 1 end) as cCount, 
    count(distinct s1.source_id) as sourcesCount 
from urls join stats s1 on urls.id = s1.url_id 
where s1.hit_date >= '2017-12-12' 
group by urls.id 
order by aCount desc 
limit 0,100;

mysql> show create table stats;

| stats | CREATE TABLE `stats` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `url_id` varchar(100) DEFAULT NULL,
  `hit_date` datetime DEFAULT NULL,
  `hit_type` tinyint(4) DEFAULT NULL,
  `source_id` bigint(20) unsigned DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `url_id_idx` (`url_id`),
  KEY `source_id` (`source_id`),
  KEY `stats_hit_date_idx` (`hit_date`),
  CONSTRAINT `stats_ibfk_1` FOREIGN KEY (`url_id`) REFERENCES `urls` (`ID`),
  CONSTRAINT `stats_ibfk_2` FOREIGN KEY (`source_id`) REFERENCES `sources` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=6027557 DEFAULT CHARSET=latin1 |

mysql> describe select...
| id | select_type | table   | type   | possible_keys                                                                                   | key     | key_len | ref                      | rows    | Extra                                        |
+----+-------------+---------+--------+-------------------------------------------------------------------------------------------------+---------+---------+--------------------------+---------+----------------------------------------------+
|  1 | SIMPLE      | s1      | ALL    | url_id_idx,stats_hit_date_idx                                                                   | NULL    | NULL    | NULL                     | 5869695 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | urls    | eq_ref | PRIMARY,urls_email_idx,urls_status_idx,deptId_idx,deptId_status_email_idx                       | PRIMARY | 102     | db.s1.url_id             |     1   | Using index                                  |

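It doesn't appear to use the hit_date index or the url_id index.

I tried using subselects, (select count(*) from stats where url_id = ... and hit_date >= ... and hit_type = 0) as aCount, which was faster at 24 seconds. Is there any way to get it under 5s? The limit for the entire request is 30 seconds.

MySQL server version: 5.6.35-log MySQL Community Server (GPL)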

mysql mysql-5.6
  • 2 answers
  • 68 Views
Chloe
Asked: 2018-05-17 14:54:08 +0800 CST

Is it possible to combine these three join queries into one?

  • 2

I want to combine the stat counts for the different hit_types into one query. Is that possible?

MariaDB [db]> select allurls.id, count(s1.id) from allurls inner join stats s1 on allurls.id = s1.allurl_id and s1.hit_type = 0 where s1.hit_date >= '2018-01-15'  group by allurls.id;
+-----+--------------+
| id  | count(s1.id) |
+-----+--------------+
| aaa |            1 |
| cnn |           16 |
+-----+--------------+

MariaDB [db]> select allurls.id, count(s1.id) from allurls inner join stats s1 on allurls.id = s1.allurl_id and s1.hit_type = 1 where s1.hit_date >= '2018-01-15'  group by allurls.id;
+-----+--------------+
| id  | count(s1.id) |
+-----+--------------+
| cnn |            1 |
+-----+--------------+

MariaDB [db]> select allurls.id, count(s1.id) from allurls inner join stats s1 on allurls.id = s1.allurl_id and s1.hit_type = 2 where s1.hit_date >= '2018-01-15'  group by allurls.id;
+-----+--------------+
| id  | count(s1.id) |
+-----+--------------+
| cnn |            4 |
+-----+--------------+

I tried to combine the first two, but the numbers came out wrong and the first result, 'aaa', was eliminated.

MariaDB [db]> select allurls.id, count(s1.id), count(s2.id) from allurls inner join stats s1 on allurls.id = s1.allurl_id and s1.hit_type = 0 inner join stats s2 on allurls.id = s2.allurl_id and s2.hit_type = 1 where s1.hit_date >= '2018-01-15' and s2.hit_date >= '2018-01-15' group by allurls.id;
+-----+--------------+--------------+
| id  | count(s1.id) | count(s2.id) |
+-----+--------------+--------------+
| cnn |           16 |           16 |
+-----+--------------+--------------+

I expected to see

+-----+--------------+--------------+
| id  | count(s1.id) | count(s2.id) |
+-----+--------------+--------------+
| aaa |            1 |            0 |
| cnn |           16 |            1 |
+-----+--------------+--------------+

Eventually I also want to include count(distinct(s4.source_id)).

Here is a fiddle: https://www.db-fiddle.com/f/tGP5SbC2AdGgeEwWTAgobf/0
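
For illustration, here is a minimal sketch of one way to get all the counts from a single join, using count(case when ... end), the same conditional-aggregation pattern that appears in the 2018-12-13 question above. It runs on SQLite through Python's sqlite3 rather than MariaDB. The rows are made up to reproduce the counts shown in the three results (aaa: 1 hit of type 0; cnn: 16, 1 and 4), and the source_id values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table allurls (id text primary key);
    create table stats (
        id integer primary key,
        allurl_id text,
        hit_type integer,
        hit_date text,
        source_id integer
    );
    insert into allurls values ('aaa'), ('cnn');
""")

# (allurl_id, hit_type, source_id); source ids are hypothetical.
data = [("aaa", 0, 1)]
data += [("cnn", 0, 1 + i % 3) for i in range(16)]
data += [("cnn", 1, 1)]
data += [("cnn", 2, 2)] * 4
conn.executemany(
    "insert into stats (allurl_id, hit_type, hit_date, source_id) "
    "values (?, ?, '2018-02-01', ?)",
    data,
)

# One join, one pass: each count only sees rows of its own hit_type.
combined = conn.execute("""
    select allurls.id,
           count(case when s.hit_type = 0 then 1 end) as type0,
           count(case when s.hit_type = 1 then 1 end) as type1,
           count(case when s.hit_type = 2 then 1 end) as type2,
           count(distinct s.source_id) as sources
      from allurls
      join stats s on allurls.id = s.allurl_id
     where s.hit_date >= '2018-01-15'
     group by allurls.id
     order by allurls.id
""").fetchall()

print(combined)
```

This also suggests why the two-join attempt printed 16 and 16: joining stats twice pairs every type-0 row with every type-1 row (16 x 1 = 16 combined rows for cnn), and 'aaa' disappears because it has no type-1 row for the second inner join to match.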

join count
  • 3 answers
  • 45 Views
Chloe
Asked: 2016-09-06 14:11:34 +0800 CST

How did this statement update 3 rows in Postgres?

  • 4

I'm curious how this statement updated 3 rows in Postgres. Every other time I ran it, it updated 0 or 1 rows. Is there a way to find out which rows?

bestsales=# update keyword set revenue = random()*10 where id = cast(random()*99999 as int);
UPDATE 3

id is the primary key.

 id               | integer                        | not null default nextval('keyword_id_seq'::regclass)
    "keyword_pkey" PRIMARY KEY, btree (id)

I tried running it as a SELECT:

bestsales=# select * from keyword where id = cast(random()*99999 as int);
  id   |       keyword       | seed_id | source | search_count | country | language | volume | cpc  | competition | modified_on | google_violation | revenue | bing_violation
-------+---------------------+---------+--------+--------------+---------+----------+--------+------+-------------+-------------+------------------+---------+----------------
  6833 | vizio m190mv        |         | GOOGLE |            0 |         |          |     70 | 0.38 |        0.90 |             |                  |         |
 65765 | shiatsu massage mat |         | SPYFU  |            0 |         |          |    110 | 0.69 |             |             |                  |         |
 87998 | granary flour       |         | SPYFU  |            0 |         |          |     40 | 0.04 |             |             |                  |         |
(3 rows)

Sometimes it returns more than one row. How is that possible?

PostgreSQL 9.5.3
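
A likely explanation, offered as an assumption rather than something the post confirms: random() is a volatile function in PostgreSQL, so the WHERE clause is evaluated for every row with a fresh random value instead of once per statement. Each row then has a small, independent chance of matching its own draw, which allows 0, 1, or several matches. The sketch below simulates that behaviour in plain Python with a hypothetical 1000-row table; round() stands in for the rounding done by cast(... as int).

```python
import random
from collections import Counter


def matching_rows(ids, rng, upper):
    """Return ids whose value equals a random number drawn afresh for that row."""
    return [row_id for row_id in ids if row_id == round(rng.random() * upper)]


rng = random.Random(42)   # fixed seed so the run is repeatable
ids = range(1, 1001)      # hypothetical table with ids 1..1000

# Run the "query" many times and tally how many rows matched each time.
counts = Counter(len(matching_rows(ids, rng, 999)) for _ in range(2000))

for matched, times in sorted(counts.items()):
    print(f"{matched} row(s) matched in {times} runs")
```

Most runs match 0 or 1 rows, but some match 2 or more. To see which rows an UPDATE actually touched, PostgreSQL accepts a RETURNING clause, for example appending returning id to the statement.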

postgresql update
  • 2 answers
  • 309 Views
Chloe
Asked: 2016-08-24 11:07:33 +0800 CST

How can this unique index allow duplicate rows?

  • 1

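Is there some way this unique index could allow duplicate rows? I thought there might be some extra whitespace characters, but I can't find them.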

=> select *, length(keyword), length(country), length(language) from keyword where id in (4588076, 4951423);
   id    |       keyword       | seed_id | source | search_count | country | language | volume | cpc  | competition | modified_on | violation | revenue | length | length | length
---------+---------------------+---------+--------+--------------+---------+----------+--------+------+-------------+-------------+-----------+---------+--------+--------+--------
 4588076 | power wallet review |         | SPYFU  |            0 |         |          |     70 | 0.11 |        0.31 |             |           |         |     19 |        |
 4951423 | power wallet review |         | SPYFU  |            2 |         |          |     70 | 0.11 |        0.31 |             |           |         |     19 |        |
(2 rows)

The index is

"keyword_keyword_country_language" UNIQUE, btree (keyword, country, language)

PostgreSQL 9.5.3

OK, I was going to drop the other two columns, but I thought I would test the keyword column first and found:

=> select k1.id, k1.keyword, k2.id, k2.keyword, k1.keyword=k2.keyword from keyword k1, keyword k2 where k1.id=4588076 and k2.id=4951423;
   id    |       keyword       |   id    |       keyword       | ?column?
---------+---------------------+---------+---------------------+----------
 4588076 | power wallet review | 4951423 | power wallet review | f
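
Two values can print identically, have the same length, and still compare unequal if one contains a look-alike character. The sketch below shows one way to locate such a difference once the raw values are in hand. The non-breaking space in the second string is purely hypothetical; the post does not say what actually differs between the two keywords.

```python
import unicodedata

a = "power wallet review"
b = "power\u00a0wallet review"  # hypothetical: U+00A0 instead of a plain space


def differences(left, right):
    """List (position, name in left, name in right) wherever the characters differ."""
    return [
        (i, unicodedata.name(x, "?"), unicodedata.name(y, "?"))
        for i, (x, y) in enumerate(zip(left, right))
        if x != y
    ]


print(len(a), len(b), a == b)
print(differences(a, b))
```

Both strings have length 19, matching the length column in the query output above, yet they are not equal. Comparing code points (or the encoded bytes) position by position reveals a difference that the terminal display hides.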
postgresql index
  • 2 answers
  • 3791 Views
