Question
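Using SQLite v3.35.4 and v3.36.0, I have a first_name table and a surname table, each containing a list of common names. I want to generate N random pairs in a new table.
I wrote this recursive query: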
WITH RECURSIVE
cte(first_name, surname) AS (
SELECT first_name, surname from ( -- always returns the same value
select first_name, surname from (select first_name from first_name order by random() limit 1)
join (select surname from surname order by random() limit 1)
)
UNION ALL
SELECT first_name, surname
FROM cte
LIMIT 2000
)
SELECT first_name, surname FROM cte;
Unfortunately, the output looks like this:
+------------+---------+
| first_name | surname |
+------------+---------+
| james | smith |
| james | smith |
| james | smith |
| ... | ... |
+------------+---------+
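What I've tried
After reviewing the SQLite documentation, I tried the NOT MATERIALIZED hint and several of the conditions outlined in the recursive CTE and subquery flattening sections. I also moved the random name selection into a view. None of these changed the result.
Is there a way to do what I am trying to do?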
Edit
I also tried a window function combined with picking a random row number in the WHERE clause, without success (1998 is the size of the table):
with recursive
r_first_name as (
select first_name, ROW_NUMBER() over(order by random()) as rn from first_name
),
r_surname as (
select surname, ROW_NUMBER() over(order by random()) as rn from surname
),
rcte(first_name, surname) as (
select first_name, surname from r_first_name rf
join r_surname rs on rs.rn = (select abs(random() % 1998))
where rf.rn = (select abs(random() % 1998))
union all
select first_name, surname from rcte
limit 3000
)
select * from rcte
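!!! Solution !!!
After looking at the answers to similar questions, I found that a random() call in the recursive part of a CTE is re-evaluated for each generated row. Unfortunately it is not re-evaluated when it is nested inside a subquery, but when it sits at the "root" of the CTE recursion I can use it to get fresh random numbers.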
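The difference can be seen in isolation with a small sketch. It uses Python's built-in sqlite3 module purely as a harness (an assumption, the original work was done directly in SQLite), and the CTE name r and column x are made up for illustration. In the first query the recursive step only copies the previous row, so random() runs once in the anchor. In the second, random() appears directly in the recursive SELECT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Recursive step copies the previous row: random() is only called in the anchor.
copied = conn.execute("""
    WITH RECURSIVE r(x) AS (
        SELECT random()
        UNION ALL
        SELECT x FROM r
        LIMIT 5
    )
    SELECT x FROM r
""").fetchall()

# random() sits directly in the recursive SELECT: it is called for every new row.
fresh = conn.execute("""
    WITH RECURSIVE r(x) AS (
        SELECT random()
        UNION ALL
        SELECT random() FROM r
        LIMIT 5
    )
    SELECT x FROM r
""").fetchall()

print(len(set(copied)))  # number of distinct values when the row is copied
print(len(set(fresh)))   # number of distinct values when random() is re-run
```

The first count is 1, which matches the repeated "james smith" rows: the original query's recursive step was `SELECT first_name, surname FROM cte`, which just repeats the anchor row. The second count should be greater than 1.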
Below is the solution I came up with. It fits my particular use case and performs reasonably well compared to a cross join:
WITH RECURSIVE
cte AS (
-- random() at the top level of each SELECT is re-evaluated for every generated row;
-- "+ 1" maps the modulo result (0..count-1) onto ROW_NUMBER()'s range (1..count)
select abs(random() % (select count(*) from first_name)) + 1 as first_name_num, abs(random() % (select count(*) from surname)) + 1 as surname_num
union all
select abs(random() % (select count(*) from first_name)) + 1 as first_name_num, abs(random() % (select count(*) from surname)) + 1 as surname_num from cte
LIMIT 6000
),
result as (
select * from cte
join (select first_name, ROW_NUMBER() over (order by random()) as rn from first_name) fn -- this numbering is always the same result
on cte.first_name_num = fn.rn
join (select surname, ROW_NUMBER() over (order by random()) as rn from surname) sn -- random() at the CTE root updates on every loop, but subqueries seem to be compiled/cached, so they are unusable here if you want updated values
on cte.surname_num = sn.rn
)
select first_name, surname from result
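For anyone who wants to try the index-based approach end to end, here is a minimal sketch on tiny made-up tables. The Python sqlite3 harness, the sample names, the CTE name picks and the value of N are all assumptions for illustration. It also assumes the bundled SQLite is 3.25 or newer, since it uses window functions. Because ROW_NUMBER() starts at 1 while a modulo yields 0..count-1, the sketch adds 1 so that every pick lands on an existing row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE first_name (first_name TEXT);
    CREATE TABLE surname (surname TEXT);
    INSERT INTO first_name VALUES ('james'), ('mary'), ('john');
    INSERT INTO surname VALUES ('smith'), ('jones'), ('brown'), ('lee');
""")

N = 50  # hypothetical number of pairs to generate

pairs = conn.execute("""
    WITH RECURSIVE
    picks(first_name_num, surname_num) AS (
        -- random() is at the top level of both SELECTs, so each row gets new values
        SELECT abs(random() % (SELECT count(*) FROM first_name)) + 1,
               abs(random() % (SELECT count(*) FROM surname)) + 1
        UNION ALL
        SELECT abs(random() % (SELECT count(*) FROM first_name)) + 1,
               abs(random() % (SELECT count(*) FROM surname)) + 1
        FROM picks
        LIMIT ?
    )
    SELECT fn.first_name, sn.surname
    FROM picks
    -- any stable numbering works; ordering by the name keeps it deterministic
    JOIN (SELECT first_name, ROW_NUMBER() OVER (ORDER BY first_name) AS rn
          FROM first_name) fn ON picks.first_name_num = fn.rn
    JOIN (SELECT surname, ROW_NUMBER() OVER (ORDER BY surname) AS rn
          FROM surname) sn ON picks.surname_num = sn.rn
""", (N,)).fetchall()

print(len(pairs))
print(pairs[:5])
```

Each generated index matches exactly one row in each numbered subquery, so the query returns N rows. Wrapping the SELECT in an INSERT INTO ... or CREATE TABLE ... AS would store the pairs in a new table.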