Problem description
MySQL stops accepting connections and a "Too many connections" exception is thrown. After that, the port is still open, but mysqld no longer sends the initial handshake packet (the one carrying the server version and the password authentication plugin), so the mysql client blocks and never recovers.
MySQL error information
- Version: percona-server-5.7.24-27
- max_connections = 5000
- MySQL error log
- tcpdump captured while the mysql client tries to connect to the server
- strace output (strace -tt -T -v -f -p 15179 -o output.log)
17264 19:48:42.143662 set_robust_list(0x7fa5a22c39e0, 24 <unfinished ...>
15179 19:48:42.143673 <... clone resumed> child_stack=0x7fa5a22c2f30, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7fa5a22c39d0, tls=0x7fa5a22c3700, child_tidptr=0x7fa5a22c39d0) = 17264 <0.001116>
17264 19:48:42.144164 <... set_robust_list resumed> ) = 0 <0.000497>
17264 19:48:42.144535 gettid( <unfinished ...>
17264 19:48:42.144879 <... gettid resumed> ) = 17264 <0.000337>
17264 19:48:42.145227 futex(0x1e01e80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
- strace -f -p 15179 > 20minutes.log
9107 18:50:39.688588 futex(0x1e01e80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
8926 18:50:39.688703 futex(0x1e01e80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
8738 18:50:39.688719 futex(0x1e01e80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
8492 18:50:39.688730 futex(0x1e01e80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
8203 18:50:39.688742 futex(0x1e01e80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
8106 18:50:39.688752 futex(0x1e01e80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
7983 18:50:39.688763 futex(0x1e01e80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
7753 18:50:39.688774 futex(0x1e01e80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
7353 18:50:39.688785 futex(0x1e01e80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
7339 18:50:39.688796 futex(0x1e01e80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
7255 18:50:39.688806 futex(0x1e01e80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
7205 18:50:39.688817 futex(0x1e01e80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
7187 18:50:39.688827 futex(0x1e01e80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
7157 18:50:39.688836 futex(0x1e01e80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
7104 18:50:39.688846 futex(0x1e01e80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
7055 18:50:39.688856 futex(0x1e01e80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
7028 18:50:39.688871 futex(0x1e01e80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
- lsof -p 15179 > lsof.15179.log; 4410 lines contain "protocol: TCP"
mysqld 15179 mysql 5254u sock 0,7 0t0 111363206 protocol: TCP
mysqld 15179 mysql 5255u sock 0,7 0t0 111357243 protocol: TCP
mysqld 15179 mysql 5256u sock 0,7 0t0 111357244 protocol: TCP
mysqld 15179 mysql 5257u sock 0,7 0t0 111363207 protocol: TCP
mysqld 15179 mysql 5258u sock 0,7 0t0 111363208 protocol: TCP
mysqld 15179 mysql 5259u sock 0,7 0t0 111353396 protocol: TCP
mysqld 15179 mysql 5260u sock 0,7 0t0 111356603 protocol: TCP
mysqld 15179 mysql 5261u sock 0,7 0t0 111356604 protocol: TCP
mysqld 15179 mysql 5262u sock 0,7 0t0 111359747 protocol: TCP
mysqld 15179 mysql 5263u sock 0,7 0t0 111356606 protocol: TCP
mysqld 15179 mysql 5264u sock 0,7 0t0 111357250 protocol: TCP
mysqld 15179 mysql 5265u sock 0,7 0t0 111359748 protocol: TCP
mysqld 15179 mysql 5266u sock 0,7 0t0 111360201 protocol: TCP
mysqld 15179 mysql 5267u sock 0,7 0t0 111360202 protocol: TCP
mysqld 15179 mysql 5268u sock 0,7 0t0 111357251 protocol: TCP
mysqld 15179 mysql 5269u sock 0,7 0t0 111363211 protocol: TCP
mysqld 15179 mysql 5270u sock 0,7 0t0 111362371 protocol: TCP
mysqld 15179 mysql 5271u sock 0,7 0t0 111354590 protocol: TCP
mysqld 15179 mysql 5272u sock 0,7 0t0 111363212 protocol: TCP
mysqld 15179 mysql 5273u sock 0,7 0t0 111354591 protocol: TCP
- netstat shows many CLOSE_WAIT connections
- top -Hbp 15179 -n1 | wc -l shows that mysqld has 5000+ threads
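To make the strace evidence above easier to check, here is a rough sketch that tallies which threads are parked in a futex wait on the same address. The function name `tally_futex_waiters` and the `SAMPLE` list are made up for illustration, and the regex assumes the `strace -f` line layout shown in this report (thread ID, optional timestamp, then the syscall). It only counts waiters; it cannot tell which mutex the address belongs to.

```python
import re

# Matches strace -f lines such as:
#   9107 18:50:39.688588 futex(0x1e01e80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
# The timestamp column is optional so that output without -tt also parses.
FUTEX_WAIT = re.compile(
    r"^\s*(?P<tid>\d+)\s+(?:\S+\s+)?futex\((?P<addr>0x[0-9a-fA-F]+), FUTEX_WAIT\w*"
)


def tally_futex_waiters(lines):
    """Return a dict mapping futex address -> set of thread IDs seen waiting on it."""
    waiters = {}
    for line in lines:
        match = FUTEX_WAIT.match(line)
        if match:
            waiters.setdefault(match.group("addr"), set()).add(match.group("tid"))
    return waiters


# A few lines copied from the strace output in this report (illustrative subset).
SAMPLE = [
    "9107 18:50:39.688588 futex(0x1e01e80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>",
    "8926 18:50:39.688703 futex(0x1e01e80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>",
    "17264 19:48:42.144535 gettid( <unfinished ...>",
]

# Print the most contended addresses first.
for addr, tids in sorted(tally_futex_waiters(SAMPLE).items(), key=lambda kv: -len(kv[1])):
    print(addr, len(tids))
```

To run it against a full capture, pass an open file object instead of `SAMPLE`, for example `tally_futex_waiters(open("output.log"))`. If nearly all threads show up under a single address, that is consistent with every new connection thread queuing behind one lock.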
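Source code analysis
LOCK_thread_cache is not released as it normally would be, so threads block at mysql_mutex_lock(&LOCK_thread_cache);. I don't know why the lock is never released.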
@danblack
https://bugs.mysql.com/bug.php?id=91941
https://bugs.mysql.com/bug.php?id=92108
This looks like a known issue. I am trying to reproduce the problem described in those bug reports.