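In the example below, I need to get the row containing the latest data, based on a combination of two dates. I can't simply take MAX(insert_date), MAX(update_date), because the two maxima can come from different rows, so that doesn't return the correct data. The way it works now is to get MAX(insert_date), then do a self-join to get MAX(update_date) for that insert_date, then do another self-join to return the row values.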
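Is there a better, more efficient way to do this? The example below contains only 4 rows, but in production I will be processing about 1 million rows every few minutes.
Example: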
create table #temp (
    iud         char(1)        not null,
    id          int            not null,
    date        date           not null,
    value       decimal(9,2)   not null,
    insert_date datetimeoffset not null,
    update_date datetime2      not null
);
insert #temp
values
    ('i', 1001, '2001-01-01', 2, '2001-01-01 00:00', '2001-01-01 00:00'),
    ('i', 1001, '2001-01-01', 9, '2001-01-01 00:00', '2001-01-01 01:00'),
    ('i', 1001, '2001-01-01', 7, '2001-01-02 00:00', '2001-01-01 00:30'),
    ('i', 1001, '2001-01-01', 4, '2001-01-02 00:00', '2001-01-01 00:00');
-- this is wrong as it returns no results
select t.*
from #temp as t
join (select iud, id, date, max(insert_date) as insert_date, max(update_date) as update_date
      from #temp
      group by iud, id, date) as x
    on t.iud = x.iud
    and t.id = x.id
    and t.date = x.date
    and t.insert_date = x.insert_date
    and t.update_date = x.update_date;
-- this works, but can it be simplified?
select n.*
from #temp as n
join (
    -- latest update_date among the rows that carry the latest insert_date
    select n.iud, n.id, n.date, n.insert_date, max(update_date) as update_date
    from #temp as n
    join (select iud, id, date, max(insert_date) as insert_date
          from #temp
          group by iud, id, date) as i
        on i.iud = n.iud
        and i.id = n.id
        and i.date = n.date  -- was missing; needed once a group spans more than one date
        and i.insert_date = n.insert_date
    group by n.iud, n.id, n.date, n.insert_date) as x
    on x.date = n.date
    and x.insert_date = n.insert_date
    and x.iud = n.iud
    and x.id = n.id
    and x.update_date = n.update_date
order by n.iud, n.id, n.date;
drop table #temp;
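If I've understood correctly what you're trying to do, then a single-pass query over the table may perform better than the nested self-joins, especially if there is a covering index on
(iud, id, date, insert_date DESC, update_date DESC)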
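One way such a query could look is sketched below. This is an illustration, not necessarily the query originally posted here: it assumes the #temp table from the question, assumes "latest" means the highest insert_date with ties broken by the highest update_date, and uses a hypothetical index name (ix_temp_latest). The include (value) clause is also an assumption, added so that the index covers every column the query reads.

```sql
-- Hypothetical covering index: key order matches the partition and sort below,
-- and value is included so the query never has to touch the base table.
create nonclustered index ix_temp_latest
    on #temp (iud, id, date, insert_date desc, update_date desc)
    include (value);

-- Number the rows in each (iud, id, date) group, newest first,
-- then keep only the first row of every group.
with ranked as (
    select t.iud, t.id, t.date, t.value, t.insert_date, t.update_date,
           row_number() over (
               partition by t.iud, t.id, t.date
               order by t.insert_date desc, t.update_date desc
           ) as rn
    from #temp as t
)
select r.iud, r.id, r.date, r.value, r.insert_date, r.update_date
from ranked as r
where r.rn = 1
order by r.iud, r.id, r.date;
```

Against the four sample rows this should return the single row with value 7.00 (insert_date 2001-01-02, update_date 2001-01-01 00:30). One behavioural difference to be aware of: if two rows in a group share both the same insert_date and the same update_date, the self-join version returns both, whereas row_number() keeps one of them arbitrarily. Swapping in rank() would preserve the ties if that matters.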