我们有用户表:
CREATE UNIQUE INDEX uk_users_login_socnet ON public.users USING btree (login, socnet)
我们对我们的服务器进行了维护,在此期间它被备份并复制到不同的服务器pg_basebackup
(由我们的系统管理员完成,据说没有错误和问题)。
从那时起,我们的日志中出现如下错误:
ERROR: duplicate key value violates unique constraint "uk_users_login_socnet"
DETAIL: Key (login, socnet)=([email protected], 1) already exists. (UPDATE users SET extra = COALESCE(extra, '') || hstore('udid', '') || hstore('udid_last_update', '1721646485') WHERE id = 1234567;)
同时,以下查询未返回任何结果:
SELECT u1.login, u1.socnet
FROM users u1
INNER JOIN users u2 ON u1.login = u2.login AND u1.socnet = u2.socnet
而它的EXPLAIN ANALYZE
是:
Hash Join (cost=2817129.95..6386837.44 rows=9847200 width=18) (actual time=13654.755..31681.796 rows=15584155 loops=1)
Hash Cond: (((u1.login)::text = (u2.login)::text) AND (u1.socnet = u2.socnet))
-> Seq Scan on users u1 (cost=0.00..2490077.18 rows=15678918 width=18) (actual time=0.026..8806.271 rows=15582973 loops=1)
-> Hash (cost=2490077.18..2490077.18 rows=15678918 width=18) (actual time=13653.093..13653.093 rows=15582973 loops=1)
Buckets: 524288 Batches: 32 Memory Usage: 28510kB
-> Seq Scan on users u2 (cost=0.00..2490077.18 rows=15678918 width=18) (actual time=0.186..10887.626 rows=15582973 loops=1)
Planning time: 1.749 ms
Execution time: 32166.739 ms
但是这个查询:
SELECT t1.login, t1.socnet, t2.login, t2.socnet
FROM users AS t1
LEFT JOIN (SELECT id, login, socnet FROM users) AS t2
ON t2.login = t1.login AND t1.socnet = t2.socnet
WHERE t1.id != t2.id
login
显示具有相同和的重复行的列表socnet
,对我来说这毫无意义。以下分别是EXPLAIN ANALYZE
:
Hash Join (cost=2817129.95..6411455.44 rows=9847199 width=36) (actual time=17015.349..33466.957 rows=1182 loops=1)
Hash Cond: (((t1.login)::text = (users.login)::text) AND (t1.socnet = users.socnet))
Join Filter: (t1.id <> users.id)
Rows Removed by Join Filter: 15583110
-> Seq Scan on users t1 (cost=0.00..2490077.18 rows=15678918 width=22) (actual time=0.034..9902.685 rows=15583110 loops=1)
-> Hash (cost=2490077.18..2490077.18 rows=15678918 width=22) (actual time=14344.722..14344.722 rows=15583110 loops=1)
Buckets: 524288 Batches: 32 Memory Usage: 30951kB
-> Seq Scan on users (cost=0.00..2490077.18 rows=15678918 width=22) (actual time=0.024..11382.363 rows=15583110 loops=1)
Planning time: 1.764 ms
Execution time: 33467.260 ms
PostgreSQL 服务器版本为 9.6
我的问题是:
- 发生了什么事?为什么 UNIQUE CONSTRAINT 中有重复数据?
- 为什么我对
INNER JOIN
和得出的结果不同LEFT JOIN
? - DB 结构可能发生了什么?如何调试?在哪里可以找到有关问题出现的提示?
- 如何修复此问题以达到一致状态?