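How can I speed up the following query? I have a large table, url_meta, with about 90 million records, 38 GB of data and 6 GB of indexes.
The main unique ID of each row is url_hash (an md5 truncated to 16 characters).
I then created a large full-text index named url_meta_index, containing the following columns: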
- url_title
- url_description
- url_keywords
- url_paragraphs
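The table also contains a column named url_total_links_in. If I select only the URLs that have many inbound links, the query is very fast. Note that only 240 rows have more than 1000 links: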
SELECT * FROM url_meta WHERE url_total_links_in > 1000
240 rows in set (0.01 sec)
But if I take the same query (which selects the same rows as above) and additionally search the large text index, it takes a very long time:
mysql> SELECT * FROM url_meta WHERE url_total_links_in > 1000 AND match(url_title, url_description, url_keywords, url_paragraphs) against('computer store' IN BOOLEAN MODE) LIMIT 500;
10 rows in set (4 min 4.94 sec)
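I expected the second part, match ... against, to look only at the 240 records found by url_total_links_in > 1000, but apparently that is not the case? Judging by the long query time, I guess it looks at all 90 million records.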
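In PHP, I could select the 240 rows with the first query and then loop over those few rows to "match" the text shown in the second query. But how can I do this in MySQL itself?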
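Would it help if I included the unique row ID (url_hash) in the large multi-column text index url_meta_index? Or would it help if I added the column url_total_links_in to that same multi-column text index? Are there any other MySQL operators that would help in this case? Or any relevant MySQL variables?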
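To make the two-step idea above concrete, here is roughly what I imagine it would look like as a single statement. This is an untested sketch, not a known fix: it assumes the optimizer picks the existing url_total_links_in index for the range condition (as it does for the first query), and it swaps MATCH ... AGAINST for plain LIKE substring checks on the few remaining rows. That means it is not equivalent to the full-text search: there are no word boundaries, stopwords or relevance handling, and case sensitivity depends on the column collation.

```sql
-- Sketch only: narrow down to the ~240 rows via the B-tree index first,
-- then do simple string matching on those rows without the FULLTEXT index.
-- 'computer store' in boolean mode with no operators matches rows that
-- contain either word, hence the OR between the two LIKE checks.
-- CONCAT_WS skips NULL columns, so nullable text columns are safe here.
SELECT *
FROM url_meta
WHERE url_total_links_in > 1000
  AND (
        CONCAT_WS(' ', url_title, url_description, url_keywords, url_paragraphs) LIKE '%computer%'
     OR CONCAT_WS(' ', url_title, url_description, url_keywords, url_paragraphs) LIKE '%store%'
      )
LIMIT 500;
```

I would still prefer to keep the full-text semantics if there is a way to make MySQL apply the url_total_links_in filter first.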
Later edit, including more details:
DESCRIBE url_meta;
+--------------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------------+--------------+------+-----+---------+-------+
| url_hash | char(16) | NO | PRI | NULL | |
| url_sharding | char(2) | YES | MUL | NULL | |
| url | varchar(512) | NO | | NULL | |
| url_title | varchar(128) | YES | MUL | NULL | |
| url_description | text | YES | | NULL | |
| url_keywords | varchar(128) | YES | | NULL | |
| url_paragraphs | mediumtext | YES | | NULL | |
| url_total_links_in | smallint | NO | MUL | 0 | |
| url_meta_date | int | NO | | 0 | |
| url_misc | tinyint | YES | | NULL | |
+--------------------+--------------+------+-----+---------+-------+
EXPLAIN ANALYZE SELECT * FROM url_meta WHERE url_total_links_in > 1000 AND match(url_title, url_description, url_keywords, url_paragraphs) against('computer store' IN BOOLEAN MODE) LIMIT 500;
| EXPLAIN |
| -> Limit: 500 row(s) (cost=0.40 rows=0.05) (actual time=2582.991..21198.806 rows=10 loops=1)
    -> Filter: ((url_meta.url_total_links_in > 1000) and (match url_meta.url_title,url_meta.url_description,url_meta.url_keywords,url_meta.url_paragraphs against ('computer store' in boolean mode))) (cost=0.40 rows=0.05) (actual time=2582.991..21198.800 rows=10 loops=1)
        -> Full-text index search on url_meta using url_meta_index (url_title='computer store') (cost=0.40 rows=1) (actual time=312.284..21096.609 rows=1247864 loops=1)
|
1 row in set (23.36 sec)
SHOW INDEX FROM url_meta;
+----------+------------+--------------------+--------------+--------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+----------+------------+--------------------+--------------+--------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| url_meta | 0 | PRIMARY | 1 | url_hash | A | 81358144 | NULL | NULL | | BTREE | | | YES | NULL |
| url_meta | 1 | url_total_links_in | 1 | url_total_links_in | A | 2790 | NULL | NULL | | BTREE | | | YES | NULL |
| url_meta | 1 | url_sharding | 1 | url_sharding | A | 1590 | NULL | NULL | YES | BTREE | | | YES | NULL |
| url_meta | 1 | url_meta_index | 1 | url_title | NULL | 81358147 | NULL | NULL | YES | FULLTEXT | | | YES | NULL |
| url_meta | 1 | url_meta_index | 2 | url_description | NULL | 81358147 | NULL | NULL | YES | FULLTEXT | | | YES | NULL |
| url_meta | 1 | url_meta_index | 3 | url_keywords | NULL | 81358147 | NULL | NULL | YES | FULLTEXT | | | YES | NULL |
| url_meta | 1 | url_meta_index | 4 | url_paragraphs | NULL | 81358147 | NULL | NULL | YES | FULLTEXT | | | YES | NULL |
+----------+------------+--------------------+--------------+--------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
7 rows in set (0.00 sec)