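I need to join 2 tables (key: analysis_id <=> id) and group the result. My output should cover the last week, month, or year, grouped by time frame and by the boolean value "R" or "L". I have about 4000 user entries.
Example of the tables I have: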
+----------+-------------+-------------+ +------+---------+------------------------+
| t1 | | t2 |
+----------+-------------+-------------+ +------+---------+------------------------+
| user_id | Analyze_id | result | | id | boolean | date |
+----------+-------------+-------------+ +------+---------+------------------------+
| 1588 | 9001 | 0.753478 | | 9001 | "R" | 2022-10-30 06:38:29 |
| 1588 | 9000 | 0.758452 | | 9000 | "L" | 2022-10-30 06:39:30 |
| 1588 | 8554 | 0.853724 | | 8554 | "R" | 2022-10-22 11:48:42 |
| 1588 | 8553 | 0.603724 | | 8553 | "L" | 2022-10-22 11:47:35 |
| 1588 | 9887 | 0.931123 | | 9887 | "R" | 2022-10-01 14:48:40 |
| 1588 | 9886 | 0.756321 | | 9886 | "L" | 2022-10-01 14:01:57 |
| 1588 | 4832 | 0.755645 | | 4832 | "R" | 2022-10-01 17:18:14 |
| 1588 | 4831 | 0.987445 | | 4831 | "L" | 2022-10-01 17:17:24 |
| 1588 | 2458 | 0.662494 | | 2458 | "R" | 2022-10-01 21:18:12 |
| 1588 | 2458 | 0.864524 | | 2458 | "L" | 2022-10-01 21:17:12 |
+----------+-------------+-------------+ +------+---------+------------------------+
Time frames:
- 8h covers 6h to 9h29
- 11h covers 9h30 to 12h29
- 14h covers 12h30 to 15h29
- 17h covers 15h30 to 18h29
- 20h covers 18h30 to 23h59
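Read literally, these frames are contiguous, so each one runs up to the start of the next (9h29 meaning everything through 09:29:59). A minimal sketch of that mapping, assuming MySQL; the inline sample times and the alias s are made up for illustration. Half-open comparisons are used because an upper bound such as BETWEEN ... AND '09:29:00' would leave a time like 09:29:30 in no bucket at all.

```sql
-- Hypothetical sample times; s and t are illustrative names only.
SELECT s.t,
       CASE
           WHEN s.t >= '06:00:00' AND s.t < '09:30:00' THEN '08h'
           WHEN s.t >= '09:30:00' AND s.t < '12:30:00' THEN '11h'
           WHEN s.t >= '12:30:00' AND s.t < '15:30:00' THEN '14h'
           WHEN s.t >= '15:30:00' AND s.t < '18:30:00' THEN '17h'
           WHEN s.t >= '18:30:00'                      THEN '20h'
           -- Times before 06:00:00 match no branch and yield NULL.
       END AS time_intervals
FROM (
    SELECT CAST('09:29:30' AS TIME) AS t
    UNION ALL SELECT CAST('09:30:00' AS TIME)
    UNION ALL SELECT CAST('23:59:30' AS TIME)
) AS s;
```

With these bounds 09:29:30 lands in 08h, 09:30:00 in 11h, and 23:59:30 in 20h.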
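So far, I have the join, the time frames, and the per-group average working. The problem is that my code groups by time frame without taking the boolean "R" or "L" into account, so the output mixes "R" and "L" in one group and averages them together, and I need them kept separate.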
SELECT
max(t1.user_id) AS user_id,
max(t2.boolean) AS boolean,
AVG(t1.`result`* 8) AS result,
max(t2.`date`) max_date,
case
when TIME(`date`) between '06:00:00' and '09:29:00' then '08h'
when TIME(`date`) between '09:30:00' and '12:29:00' then '11h'
when TIME(`date`) between '12:30:00' and '15:29:00' then '14h'
when TIME(`date`) between '15:30:00' and '18:29:00' then '17h'
when TIME(`date`) between '18:30:00' and '23:59:00' then '20h'
end as 'time_intervals'
FROM table1_features t1
INNER JOIN table2 t2 ON t1.analysis_id = t2.id
WHERE `date` >= CURRENT_TIMESTAMP() - INTERVAL 1 month AND t1.user_id = 1588
GROUP BY time_intervals
ORDER BY max_date ASC
Actual output:
+----------+-------------+------------------------------+
| Actual output |
+----------+-------------+------------------------------+
| user_id | result | boolean | time_intervals |
+----------+-------------+------------------------------+
| 1588 | 0.753478 | R | 08h |
| 1588 | 0.603724 | R | 14h |
| 1588 | 0.931123 | R | 11h |
| 1588 | 0.755645 | R | 17h |
| 1588 | 0.662494 | R | 20h |
+----------+-------------+------------------------------+
Expected output:
+----------+-------------+------------------------------+
| Expected output |
+----------+-------------+------------------------------+
| user_id | result | boolean | time_intervals |
+----------+-------------+------------------------------+
| 1588 | 0.753478 | R | 08h |
| 1588 | 0.753478 | L | 08h |
| 1588 | 0.603724 | R | 14h |
| 1588 | 0.603724 | L | 14h |
| 1588 | 0.931123 | R | 11h |
| 1588 | 0.931123 | L | 11h |
| 1588 | 0.755645 | R | 17h |
| 1588 | 0.755645 | L | 17h |
| 1588 | 0.662494 | R | 20h |
| 1588 | 0.662494 | L | 20h |
+----------+-------------+------------------------------+
Or something similar, where I get the average result for each time frame and boolean.
As I already suggested in the comments, you need to change
max(t2.boolean) AS boolean
to
t2.boolean
and add t2.boolean to the GROUP BY clause:
GROUP BY time_intervals, t2.boolean
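To see why this matters, here is a minimal self-contained sketch, assuming MySQL. The inline rows, the alias s, and the column name side are made up for illustration and stand in for the joined t1/t2 data. MAX() on a string column returns the alphabetically greatest value, so a group containing both 'L' and 'R' is always labelled 'R' while AVG() still runs over both.

```sql
-- Grouping by time frame only: 'L' and 'R' rows are averaged together,
-- and MAX(side) labels the mixed 08h group as 'R'.
SELECT s.time_intervals, MAX(s.side) AS side, AVG(s.result) AS avg_result
FROM (
    SELECT 'R' AS side, '08h' AS time_intervals, 0.70 AS result
    UNION ALL SELECT 'L', '08h', 0.50
    UNION ALL SELECT 'R', '08h', 0.90
    UNION ALL SELECT 'L', '17h', 0.60
) AS s
GROUP BY s.time_intervals;

-- Grouping by time frame and side: one row per (time frame, side) pair.
SELECT s.time_intervals, s.side, AVG(s.result) AS avg_result
FROM (
    SELECT 'R' AS side, '08h' AS time_intervals, 0.70 AS result
    UNION ALL SELECT 'L', '08h', 0.50
    UNION ALL SELECT 'R', '08h', 0.90
    UNION ALL SELECT 'L', '17h', 0.60
) AS s
GROUP BY s.time_intervals, s.side
ORDER BY s.time_intervals, s.side;
```

The first query returns two rows, with 08h labelled R and averaging 0.70 over all three rows. The second returns three rows: 08h/L averages 0.50, 08h/R averages 0.80, and 17h/L averages 0.60.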
The final query looks like this: