AskOverflow.Dev

Accepted
mpag
Asked: 2015-09-30 20:14:36 +0800 CST

Access (Jet) SQL: DateTime stamps in TableB flanking each DateTime stamp in TableA

  • 772

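First words

If you just want to take a crack at the code, you can safely ignore the sections below, up to and including "JOINs: Starting Off". The background and results only serve as context. If you want to see what the code looked like initially, look at the edit history before 2015-10-06.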


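Objective

Ultimately I want to calculate interpolated GPS coordinates for the transmitter (X or Xmit) based on the DateTime stamps of the available GPS data in table SecondTable that directly flank the observation in table FirstTable.

My immediate objective, on the way to that ultimate objective, is to figure out how best to join FirstTable to SecondTable to get those flanking time points. Later I can use that information to calculate intermediate GPS coordinates, assuming a linear fit along an equirectangular coordinate system (fancy words for saying I don't care that the Earth is a sphere at this scale).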
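
The interpolation step described above can be sketched outside the database. The Python below is only an illustration under stated assumptions and is not part of the original question: the function name `interpolate_position` and the sample fixes are hypothetical. It treats latitude and longitude as plain Cartesian values (the equirectangular simplification) and weights the two flanking GPS fixes by elapsed time.

```python
from datetime import datetime


def interpolate_position(t, before, after):
    """Linearly interpolate (lat, lon) at time t between two GPS fixes.

    before and after are (timestamp, latitude, longitude) tuples that
    flank t, i.e. before[0] <= t < after[0].
    """
    t0, lat0, lon0 = before
    t1, lat1, lon1 = after
    # Fraction of the interval elapsed at t: 0.0 at "before", 1.0 at "after".
    after_weight = (t - t0).total_seconds() / (t1 - t0).total_seconds()
    lat = lat0 * (1 - after_weight) + lat1 * after_weight
    lon = lon0 * (1 - after_weight) + lon1 * after_weight
    return lat, lon


# Hypothetical fixes 40 seconds apart; the observation is 10 s after the first,
# so the "after" fix contributes 25% and the "before" fix 75%.
before = (datetime(2015, 9, 30, 12, 0, 0), 10.0, 20.0)
after = (datetime(2015, 9, 30, 12, 0, 40), 10.4, 20.8)
lat, lon = interpolate_position(datetime(2015, 9, 30, 12, 0, 10), before, after)
print(round(lat, 6), round(lon, 6))
```

The same weighting works on Access DateTime values directly, since subtracting two of them yields a fraction of a day.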


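Questions

  1. Is there a more efficient way to generate the closest before and after timestamps?
    • Fixed by myself: just grab the "after", and then get the "before" only as it relates to the "after".
  2. Is there a more intuitive way that doesn't involve the (A<>B OR A=B) structure?
    • Byrdzeye provided the basic alternatives, but my "real-world" experience didn't line up with all 4 of his join strategies performing the same. Full credit to him, though, for addressing the alternative join styles.
  3. Any other thoughts, tricks and advice you may have.
    • So far both byrdzeye and Phrancis have been quite helpful in this regard. I found Phrancis' advice excellently laid out, and it helped at a critical stage, so I'll give him the edge here.

I would still appreciate any additional help with regard to question 3. The bullet points reflect who I believe helped me most on each individual question.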


Table definitions

Semi-visual representation

FirstTable

Fields
  RecTStamp | DateTime  --can contain milliseconds via VBA code (see Ref 1) 
  ReceivID  | LONG
  XmitID    | TEXT(25)
Keys and Indices
  PK_DT     | Primary, Unique, No Null, Compound
    XmitID    | ASC
    RecTStamp | ASC
    ReceivID  | ASC
  UK_DRX    | Unique, No Null, Compound
    RecTStamp | ASC
    ReceivID  | ASC
    XmitID    | ASC

SecondTable

Fields
  X_ID      | LONG AUTONUMBER -- seeded after main table has been created and already sorted on the primary key
  XTStamp   | DateTime --will not contain partial seconds
  Latitude  | Double   --these are in decimal degrees, not degrees/minutes/seconds
  Longitude | Double   --this way straight decimal math can be performed
Keys and Indices
  PK_D      | Primary, Unique, No Null, Simple
    XTStamp   | ASC
  UIDX_ID   | Unique, No Null, Simple
    X_ID      | ASC

ReceiverDetails table

Fields
  ReceivID                      | LONG
  Receiver_Location_Description | TEXT -- NULL OK
  Beginning                     | DateTime --no partial seconds
  Ending                        | DateTime --no partial seconds
  Lat                           | DOUBLE
  Lon                           | DOUBLE
Keys and Indices
  PK_RID  | Primary, Unique, No Null, Simple
    ReceivID | ASC

ValidXmitters table

Field (and primary key)
  XmitID    | TEXT(25) -- primary, unique, no null, simple

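SQL Fiddle...

...so that you can play with the table definitions and code. This question is about MS Access, but as Phrancis pointed out, there is no SQL Fiddle flavor for Access. So, you should be able to go here to see my table definitions and code based on Phrancis' answer:
http://sqlfiddle.com/#!6/e9942/4 (external link)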


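JOINs: Starting Off

My current "inner guts" JOIN strategy

First create a FirstTable_rekeyed table with the column order and compound primary key (RecTStamp, ReceivID, XmitID), all indexed/sorted ASC. I also created an index on each column individually. Then fill it like so.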

INSERT INTO FirstTable_rekeyed (RecTStamp, ReceivID, XmitID)
  SELECT DISTINCTROW RecTStamp, ReceivID, XmitID
  FROM FirstTable
  WHERE XmitID IN (SELECT XmitID from ValidXmitters)
  ORDER BY RecTStamp, ReceivID, XmitID;

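The above query fills the new table with 153006 records and returns within 10 seconds or so.

With the TOP 1 subquery method, the following completes within a second or two when the whole thing is wrapped in a "SELECT Count(*) FROM ( ... )":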

SELECT 
    ReceiverRecord.RecTStamp, 
    ReceiverRecord.ReceivID, 
    ReceiverRecord.XmitID,
    (SELECT TOP 1 XmitGPS.X_ID FROM SecondTable as XmitGPS WHERE ReceiverRecord.RecTStamp < XmitGPS.XTStamp ORDER BY XmitGPS.X_ID) AS AfterXmit_ID
    FROM FirstTable_rekeyed AS ReceiverRecord
    -- INNER JOIN SecondTable AS XmitGPS ON (ReceiverRecord.RecTStamp < XmitGPS.XTStamp)
         GROUP BY RecTStamp, ReceivID, XmitID;
-- No separate join needed for the Top 1 method, but it would be required for the other methods. 
-- Additionally no restriction of the returned set is needed if I create the _rekeyed table.
-- May not need GROUP BY either. Could try ORDER BY.
-- The three AfterXmit_ID alternatives below take longer than 3 minutes to complete (or do not ever complete).
  -- FIRST(XmitGPS.X_ID)
  -- MIN(XmitGPS.X_ID)
  -- MIN(SWITCH(XmitGPS.XTStamp > ReceiverRecord.RecTStamp, XmitGPS.X_ID, Null))

Previous "inner guts" JOIN queries

First (fastish... but not good enough)

SELECT 
  A.RecTStamp,
  A.ReceivID,
  A.XmitID,
  MAX(IIF(B.XTStamp<= A.RecTStamp,B.XTStamp,Null)) as BeforeXTStamp,
  MIN(IIF(B.XTStamp > A.RecTStamp,B.XTStamp,Null)) as AfterXTStamp
FROM FirstTable as A
INNER JOIN SecondTable as B ON 
  (A.RecTStamp<>B.XTStamp OR A.RecTStamp=B.XTStamp)
GROUP BY A.RecTStamp, A.ReceivID, A.XmitID
  -- alternative for BeforeXTStamp MAX(-(B.XTStamp<=A.RecTStamp)*B.XTStamp)
  -- alternatives for AfterXTStamp (see "Aside" note below)
  -- 1.0/(MAX(1.0/(-(B.XTStamp>A.RecTStamp)*B.XTStamp)))
  -- -1.0/(MIN(1.0/((B.XTStamp>A.RecTStamp)*B.XTStamp)))

Second (slower)

SELECT
  A.RecTStamp, AbyB1.XTStamp AS BeforeXTStamp, AbyB2.XTStamp AS AfterXTStamp
FROM (FirstTable AS A INNER JOIN 
  (select top 1 B1.XTStamp, A1.RecTStamp 
   from SecondTable as B1, FirstTable as A1
   where B1.XTStamp<=A1.RecTStamp
   order by B1.XTStamp DESC) AS AbyB1 --MAX (time points before)
ON A.RecTStamp = AbyB1.RecTStamp) INNER JOIN 
  (select top 1 B2.XTStamp, A2.RecTStamp 
   from SecondTable as B2, FirstTable as A2
   where B2.XTStamp>A2.RecTStamp
   order by B2.XTStamp ASC) AS AbyB2 --MIN (time points after)
ON A.RecTStamp = AbyB2.RecTStamp; 

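Background

I have a telemetry table (aliased as A) of just under 1 million entries with a compound primary key based on a DateTime stamp, a transmitter ID and a recording device ID. Due to circumstances beyond my control, my SQL dialect is the standard Jet DB in Microsoft Access (users will be on 2007 and later versions). Only about 200,000 of these entries are relevant to the query because of the transmitter ID.

There is a second telemetry table (aliased as B) with approximately 50,000 entries and a single DateTime primary key.

For the first step, I focused on finding the timestamps in the second table that are closest to the stamps in the first table.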
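
As a point of comparison for the SQL approaches, the flanking lookup itself is a binary search once the GPS timestamps are sorted. The following Python is a minimal sketch under that assumption; `flanking_indices` and the sample values are hypothetical and not taken from the question. It finds the "after" position first and takes its predecessor as "before", which mirrors the idea of grabbing the "after" row and deriving the "before" row from it.

```python
from bisect import bisect_right


def flanking_indices(gps_times, t):
    """Return (before_idx, after_idx) into sorted gps_times flanking t.

    "before" is the last entry <= t and "after" is the first entry > t,
    matching the <= / > split used in the queries. Either index is None
    when t falls outside the GPS track.
    """
    after_idx = bisect_right(gps_times, t)   # first position with value > t
    before_idx = after_idx - 1               # its immediate predecessor
    return (
        before_idx if before_idx >= 0 else None,
        after_idx if after_idx < len(gps_times) else None,
    )


gps_times = [100, 110, 120, 130]   # hypothetical sorted GPS timestamps
print(flanking_indices(gps_times, 115))   # strictly between two fixes
print(flanking_indices(gps_times, 120))   # exactly on a fix
print(flanking_indices(gps_times, 95))    # before the track starts
```

Each lookup costs O(log n) against the sorted list, which is what an index seek on XTStamp is expected to approximate inside the database.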


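JOIN results

Quirks that I've discovered...

...along the way during debugging

It feels really odd to be writing the JOIN logic as FROM FirstTable as A INNER JOIN SecondTable as B ON (A.RecTStamp<>B.XTStamp OR A.RecTStamp=B.XTStamp), which, as @byrdzeye pointed out in a comment (that has since disappeared), is a form of cross join. Note that substituting LEFT OUTER JOIN for INNER JOIN in the code above appears to have no effect on the number or identity of the rows returned. I also can't seem to leave off the ON clause or say ON (1=1). Just using a comma to join (rather than INNER or LEFT OUTER JOIN) results in Count(select * from A) * Count(select * from B) rows returned by this query, rather than just one row per table A row, as the explicit (A<>B OR A=B) JOIN returns. This is clearly not suitable. FIRST doesn't seem to be available for use given a compound primary key type.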
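
To illustrate why the (A<>B OR A=B) predicate behaves like a cross join, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for Jet, and Access may plan or restrict these queries differently; the tables A and B with a single ts column are hypothetical. For non-NULL values the predicate is always true, so the join yields every pairing, and it is the GROUP BY that collapses the result back to one row per row of A.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ts INTEGER NOT NULL);
    CREATE TABLE B (ts INTEGER NOT NULL);
    INSERT INTO A VALUES (1), (2), (3);
    INSERT INTO B VALUES (2), (5);
""")


def count(sql):
    """Run a single-value COUNT query and return the number."""
    return conn.execute(sql).fetchone()[0]


# The always-true predicate pairs every A row with every B row.
tautology = count(
    "SELECT COUNT(*) FROM A INNER JOIN B ON (A.ts <> B.ts OR A.ts = B.ts)"
)
# An explicit cross join produces the same number of rows.
cross = count("SELECT COUNT(*) FROM A CROSS JOIN B")
# Grouping on A's key is what brings the result back to one row per A row.
grouped = count(
    "SELECT COUNT(*) FROM ("
    " SELECT A.ts FROM A INNER JOIN B ON (A.ts <> B.ts OR A.ts = B.ts)"
    " GROUP BY A.ts)"
)
print(tautology, cross, grouped)
```

With 3 rows in A and 2 in B, both joins produce 3 * 2 pairings before grouping, which is why the intermediate result grows so quickly on the real tables.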

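The second JOIN style, although arguably more legible, suffers from being slower. This may be because an additional two inner JOINs are required against the larger table, on top of the two CROSS JOINs found in both options.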

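Aside: Replacing the IIF clause with MIN/MAX appears to return the same number of entries.
MAX(-(B.XTStamp<=A.RecTStamp)*B.XTStamp)
works for the "before" (MAX) timestamp, but the equivalent does not work directly for the "after" (MIN) timestamp, as follows:
MIN(-(B.XTStamp>A.RecTStamp)*B.XTStamp)
because the minimum is always 0 for the FALSE condition. This 0 is less than any post-epoch DOUBLE (a DateTime field is a subset of DOUBLE in Access, and this calculation transforms the field into a DOUBLE). The alternatives to the IIF and MIN/MAX methods proposed for the AfterXTStamp value work because division by zero (FALSE) generates null values, which the aggregate functions MIN and MAX skip over.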
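
The arithmetic behind this aside can be traced in a few lines of Python. This is a sketch of the reasoning only, not Access code: the helper names and sample values are hypothetical, Access's True = -1 / False = 0 convention is emulated by `access_bool`, and the division-by-zero-gives-Null behaviour is modelled as described in the aside rather than verified against Jet.

```python
def access_bool(condition):
    """Access represents True as -1 and False as 0."""
    return -1 if condition else 0


def reciprocal_or_null(x):
    """Mimic the behaviour described in the aside: 1/0 yields Null (None)."""
    return None if x == 0 else 1.0 / x


def skip_nulls(values):
    """Aggregates such as MIN and MAX ignore Null inputs."""
    return [v for v in values if v is not None]


rec_t = 115.0                        # hypothetical receiver timestamp (as a Double)
gps = [100.0, 110.0, 120.0, 130.0]   # hypothetical GPS timestamps

# MAX(-(B.XTStamp <= A.RecTStamp) * B.XTStamp): false rows contribute 0,
# which never beats a positive timestamp, so MAX finds the "before" value.
before = max(-access_bool(x <= rec_t) * x for x in gps)

# MIN(-(B.XTStamp > A.RecTStamp) * B.XTStamp): false rows contribute 0,
# and 0 is smaller than every real timestamp, so MIN is stuck at 0.
naive_after = min(-access_bool(x > rec_t) * x for x in gps)

# 1.0 / MAX(1.0 / (-(B.XTStamp > A.RecTStamp) * B.XTStamp)): false rows become
# Null and are skipped; the largest reciprocal belongs to the smallest value.
after = 1.0 / max(
    skip_nulls(reciprocal_or_null(-access_bool(x > rec_t) * x) for x in gps)
)

print(before, naive_after, after)
```

The reciprocal trick only holds for positive values, which is the case for any DateTime after the Access epoch.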

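Next steps

Taking this further, I wish to find the timestamps in the second table that directly flank the timestamps in the first table and perform a linear interpolation of the data values from the second table based on the time distance to those points (i.e. if the timestamp from the first table is 25% of the way between the "before" and the "after", I would like 25% of the calculated value to come from the second-table data associated with the "after" point and 75% from the "before" point). Using the revised join type as part of the inner guts, and following the suggested answers below, I produce...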

    SELECT
        AvgGPS.XmitID,
        StrDateIso8601Msec(AvgGPS.RecTStamp) AS RecTStamp_ms,
        -- StrDateIso8601MSec is a VBA function returning a TEXT string in yyyy-mm-dd hh:nn:ss.lll format
        AvgGPS.ReceivID,
        RD.Receiver_Location_Description,
        RD.Lat AS Receiver_Lat,
        RD.Lon AS Receiver_Lon,
        AvgGPS.Before_Lat * (1 - AvgGPS.AfterWeight) + AvgGPS.After_Lat * AvgGPS.AfterWeight AS Xmit_Lat,
        AvgGPS.Before_Lon * (1 - AvgGPS.AfterWeight) + AvgGPS.After_Lon * AvgGPS.AfterWeight AS Xmit_Lon,
        AvgGPS.RecTStamp AS RecTStamp_basic
    FROM ( SELECT 
        AfterTimestampID.RecTStamp,
        AfterTimestampID.XmitID,
        AfterTimestampID.ReceivID,
        GPSBefore.XTStamp AS BeforeXTStamp, 
        GPSBefore.Latitude AS Before_Lat, 
        GPSBefore.Longitude AS Before_Lon,
        GPSAfter.XTStamp AS AfterXTStamp, 
        GPSAfter.Latitude AS After_Lat, 
        GPSAfter.Longitude AS After_Lon,
        ( (AfterTimestampID.RecTStamp - GPSBefore.XTStamp) / (GPSAfter.XTStamp - GPSBefore.XTStamp) ) AS AfterWeight
        FROM (
            (SELECT 
                ReceiverRecord.RecTStamp, 
                ReceiverRecord.ReceivID, 
                ReceiverRecord.XmitID,
               (SELECT TOP 1 XmitGPS.X_ID FROM SecondTable as XmitGPS WHERE ReceiverRecord.RecTStamp < XmitGPS.XTStamp ORDER BY XmitGPS.X_ID) AS AfterXmit_ID
             FROM FirstTable AS ReceiverRecord 
             -- WHERE ReceiverRecord.XmitID IN (select XmitID from ValidXmitters)
             GROUP BY RecTStamp, ReceivID, XmitID
            ) AS AfterTimestampID INNER JOIN SecondTable AS GPSAfter ON AfterTimestampID.AfterXmit_ID = GPSAfter.X_ID
        ) INNER JOIN SecondTable AS GPSBefore ON AfterTimestampID.AfterXmit_ID = GPSBefore.X_ID + 1
    ) AS AvgGPS INNER JOIN ReceiverDetails AS RD ON (AvgGPS.ReceivID = RD.ReceivID) AND (AvgGPS.RecTStamp BETWEEN RD.Beginning AND RD.Ending)
    ORDER BY AvgGPS.RecTStamp, AvgGPS.ReceivID;

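...which returns 152928 records, conforming (at least approximately) to the expected final number of records. Run time is probably 5-10 minutes on my i7-4790, 16GB RAM, no SSD, Win 8.1 Pro system.


Reference 1: MS Access Can Handle Millisecond Time Values--Really, and the accompanying source file [08080011.txt]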

join ms-access

3 Answers

  1. Best Answer
    Phrancis
    2015-10-03T21:21:58+08:00

    I must first compliment you on your courage to do something like this with an Access DB, in which, from my experience, it is very difficult to do anything SQL-like. Anyway, on to the review.


    First join

    Your IIF field selections might benefit from using a Switch statement instead. It seems to be sometimes the case, especially with things SQL, that a SWITCH (more commonly known as CASE in typical SQL) is quite fast when just making simple comparisons in the body of a SELECT. The syntax in your case would be almost identical, although a switch can be expanded to cover a large chunk of comparisons in one field. Something to consider.

      SWITCH (
        expr1, val1,
        expr2, val2,
        True,  val3  -- default ("else") branch: Switch takes expression/value pairs
      )
    

    A switch can also help readability, in larger statements. In context:

      MAX(SWITCH(B.XTStamp <= A.RecTStamp,B.XTStamp,Null)) as BeforeXTStamp,
      --alternatively MAX(-(B.XTStamp<=A.RecTStamp)*B.XTStamp) as BeforeXTStamp,
      MIN(SWITCH(B.XTStamp>A.RecTStamp,B.XTStamp,Null)) as AfterXTStamp
    

    As for the join itself, I think (A.RecTStamp<>B.XTStamp OR A.RecTStamp=B.XTStamp) is about as good as you're going to get, given what you are trying to do. It's not that fast, but I wouldn't expect it to be either.


    Second join

    You said this is slower. It's also less readable from a code standpoint. Given equally satisfactory result sets between 1 and 2, I'd say go for 1. At least it's obvious what you are trying to do that way. Subqueries are often not very fast (though often unavoidable), especially in this case, where you are throwing an extra join into each one, which must certainly complicate the execution plan.

    One remark: I saw that you used the old ANSI-89 join syntax. It's best to avoid that; the performance will be the same or better with the more modern join syntax, which is also less ambiguous, easier to read, and harder to get wrong.

    FROM (FirstTable AS A INNER JOIN 
      (select top 1 B1.XTStamp, A1.RecTStamp 
       from SecondTable as B1
       inner join FirstTable as A1
         on B1.XTStamp <= A1.RecTStamp
       order by B1.XTStamp DESC) AS AbyB1 --MAX (time points before)
    

    Naming things

    I think the way your things are named is unhelpful at best, and cryptic at worst. A, B, A1, B1 etc. as table aliases I think could be better. Also, I think the field names are not very good, but I realize you may not have control over this. I will just quickly quote The Codeless Code on the topic of naming things, and leave it at that...

    “Invective!” answered the priestess. “Verb your expletive nouns!”


    "Next steps" query

    I couldn't make much sense of it as written, so I had to take it to a text editor and make some style changes to make it more readable. I know Access' SQL editor is beyond clunky, so I usually write my queries in a good editor like Notepad++ or Sublime Text. Some of the stylistic changes I applied to make it more readable:

    • 4 spaces indent instead of 2 spaces
    • Spaces around mathematical and comparison operators
    • More natural placing of braces and indentation (I went with Java-style braces, but could also be C-style, at your preference)

    So as it turns out, this is a very complicated query indeed. To make sense of it, I have to start from the innermost query, your ID data set, which I understand is the same as your First Join. It returns the IDs and timestamps of the devices where the before/after timestamps are the closest, within the subset of devices you are interested in. So instead of ID why not call it ClosestTimestampID.

    Your Det join is used only once, in the final WHERE Det.XmitID IN (<limited subset S>) filter.

    The rest of the time, it only joins the values you already have from ClosestTimestampID. So instead we should be able to just do this:

        ) AS ClosestTimestampID
        INNER JOIN SecondTable AS TL1 
            ON ClosestTimestampID.BeforeXTStamp = TL1.XTStamp) 
        INNER JOIN SecondTable AS TL2 
            ON ClosestTimestampID.AfterXTStamp = TL2.XTStamp
        WHERE ClosestTimestampID.XmitID IN (<limited subset S>)
    

    It may not be a huge performance gain, but anything we can do to help the poor Jet DB optimizer will help!


    I can't shake the feeling that the calculations/algorithm for BeforeWeight and AfterWeight which you use to interpolate could be done better, but unfortunately I'm not very good with those.

    One suggestion to avoid crashing (although it's not ideal depending on your application) would be to break out your nested subqueries into tables of their own and update those when needed. I'm not sure how often you need your source data to be refreshed, but if it is not too often you might think of writing some VBA code to schedule an update of the tables and derived tables, and just leave your outermost query to pull from those tables instead of the original source. Just a thought, like I said not ideal but given the tool you may not have a choice.


    Everything together:

    SELECT
        InGPS.XmitID,
        StrDateIso8601Msec(InGPS.RecTStamp) AS RecTStamp_ms,
           -- StrDateIso8601MSec is a VBA function returning a TEXT string in yyyy-mm-dd hh:nn:ss.lll format
        InGPS.ReceivID,
        RD.Receiver_Location_Description,
        RD.Lat AS Receiver_Lat,
        RD.Lon AS Receiver_Lon,
        InGPS.Before_Lat * InGPS.BeforeWeight + InGPS.After_Lat * InGPS.AfterWeight AS Xmit_Lat,
        InGPS.Before_Lon * InGPS.BeforeWeight + InGPS.After_Lon * InGPS.AfterWeight AS Xmit_Lon,
        InGPS.RecTStamp AS RecTStamp_basic
    FROM (
        SELECT 
            ClosestTimestampID.RecTStamp,
            ClosestTimestampID.XmitID,
            ClosestTimestampID.ReceivID,
            ClosestTimestampID.BeforeXTStamp, 
            TL1.Latitude AS Before_Lat, 
            TL1.Longitude AS Before_Lon,
            (1 - ((ClosestTimestampID.RecTStamp - ClosestTimestampID.BeforeXTStamp) 
                / (ClosestTimestampID.AfterXTStamp - ClosestTimestampID.BeforeXTStamp))) AS BeforeWeight,
            ClosestTimestampID.AfterXTStamp, 
            TL2.Latitude AS After_Lat, 
            TL2.Longitude AS After_Lon,
            (     (ClosestTimestampID.RecTStamp - ClosestTimestampID.BeforeXTStamp) 
                / (ClosestTimestampID.AfterXTStamp - ClosestTimestampID.BeforeXTStamp)) AS AfterWeight
            FROM (((
                SELECT 
                    A.RecTStamp, 
                    A.ReceivID, 
                    A.XmitID,
                    MAX(SWITCH(B.XTStamp <= A.RecTStamp, B.XTStamp, Null)) AS BeforeXTStamp,
                    MIN(SWITCH(B.XTStamp > A.RecTStamp, B.XTStamp, Null)) AS AfterXTStamp
                FROM FirstTable AS A
                INNER JOIN SecondTable AS B 
                    ON (A.RecTStamp <> B.XTStamp OR A.RecTStamp = B.XTStamp)
                WHERE A.XmitID IN (<limited subset S>)
                GROUP BY A.RecTStamp, ReceivID, XmitID
            ) AS ClosestTimestampID
            INNER JOIN FirstTable AS Det 
                ON (Det.XmitID = ClosestTimestampID.XmitID) 
                AND (Det.ReceivID = ClosestTimestampID.ReceivID) 
                AND (Det.RecTStamp = ClosestTimestampID.RecTStamp)) 
            INNER JOIN SecondTable AS TL1 
                ON ClosestTimestampID.BeforeXTStamp = TL1.XTStamp) 
            INNER JOIN SecondTable AS TL2 
                ON ClosestTimestampID.AfterXTStamp = TL2.XTStamp
            WHERE Det.XmitID IN (<limited subset S>)
        ) AS InGPS
    INNER JOIN ReceiverDetails AS RD 
        ON (InGPS.ReceivID = RD.ReceivID) 
        AND (InGPS.RecTStamp BETWEEN <valid parameters from another table>)
    ORDER BY StrDateIso8601Msec(InGPS.RecTStamp), InGPS.ReceivID;
    
    • 10
  2. byrdzeye
    2015-10-02T06:50:08+08:00
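    • Added additional attributes and filter conditions.
    • Any form of cross join is eliminated by using min and max nested queries. This is the biggest performance gain.
    • The min and max flank values returned by the innermost nested query are primary key values (scans), which are then used to retrieve the additional flank attributes (lat and lon) with a seek for the final calculations (Access does have an APPLY equivalent).
    • The primary table attributes are retrieved and filtered in the innermost query, which should help performance.
    • There is no need to format the time value (StrDateIso8601Msec) for sorting. Using the datetime value from the table is equivalent.

    SQL Server execution plans (because Access can't show this)
    Without the final ORDER BY, because it's expensive:
    Clustered Index Scan [ReceiverDetails].[PK_ReceiverDetails] Cost 16%
    Clustered Index Seek [FirstTable].[PK_FirstTable] Cost 19%
    Clustered Index Seek [SecondTable].[PK_SecondTable] Cost 16%
    Clustered Index Seek [SecondTable].[PK_SecondTable] Cost 16%
    Clustered Index Seek [SecondTable].[PK_SecondTable] [TL2] Cost 16%
    Clustered Index Seek [SecondTable].[PK_SecondTable] [TL1] Cost 16%

    With the final ORDER BY:
    Sort Cost 36%
    Clustered Index Scan [ReceiverDetails].[PK_ReceiverDetails] Cost 10%
    Clustered Index Seek [FirstTable].[PK_FirstTable] Cost 12%
    Clustered Index Seek [SecondTable].[PK_SecondTable] Cost 10%
    Clustered Index Seek [SecondTable].[PK_SecondTable] Cost 10%
    Clustered Index Seek [SecondTable].[PK_SecondTable] [TL2] Cost 10%
    Clustered Index Seek [SecondTable].[PK_SecondTable] [TL1] Cost 10%

    Code: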

    select
         ClosestTimestampID.XmitID
        --,StrDateIso8601Msec(InGPS.RecTStamp) AS RecTStamp_ms
        ,ClosestTimestampID.ReceivID
        ,ClosestTimestampID.Receiver_Location_Description
        ,ClosestTimestampID.Lat
        ,ClosestTimestampID.Lon
    ,[TL1].[Latitude] * (1 - ((ClosestTimestampID.RecTStamp - ClosestTimestampID.BeforeXTStamp) / (ClosestTimestampID.AfterXTStamp - ClosestTimestampID.BeforeXTStamp))) + [TL2].[Latitude] * ((ClosestTimestampID.RecTStamp - ClosestTimestampID.BeforeXTStamp) / (ClosestTimestampID.AfterXTStamp - ClosestTimestampID.BeforeXTStamp)) AS Xmit_Lat
    ,[TL1].[Longitude] * (1 - ((ClosestTimestampID.RecTStamp - ClosestTimestampID.BeforeXTStamp) / (ClosestTimestampID.AfterXTStamp - ClosestTimestampID.BeforeXTStamp))) + [TL2].[Longitude] * ((ClosestTimestampID.RecTStamp - ClosestTimestampID.BeforeXTStamp) / (ClosestTimestampID.AfterXTStamp - ClosestTimestampID.BeforeXTStamp)) AS Xmit_Lon
        ,ClosestTimestampID.RecTStamp as RecTStamp_basic
    from (
            (
                (
                    select
                         FirstTable.RecTStamp
                        ,FirstTable.ReceivID
                        ,FirstTable.XmitID
                        ,ReceiverDetails.Receiver_Location_Description
                        ,ReceiverDetails.Lat
                        ,ReceiverDetails.Lon
                        ,(
                            select max(XTStamp) as val
                            from SecondTable
                            where XTStamp <= FirstTable.RecTStamp
                         ) as BeforeXTStamp
                        ,(
                            select min(XTStamp) as val
                            from SecondTable
                            where XTStamp > FirstTable.RecTStamp
                         ) as AfterXTStamp
                    from FirstTable
                    inner join ReceiverDetails
                    on ReceiverDetails.ReceivID = FirstTable.ReceivID
                    where FirstTable.RecTStamp between #1/1/1990# and #1/1/2020#
                    and FirstTable.XmitID in (100,110)
                ) as ClosestTimestampID
                inner join SecondTable as TL1
                on ClosestTimestampID.BeforeXTStamp = TL1.XTStamp
            )
            inner join SecondTable as TL2
            on ClosestTimestampID.AfterXTStamp = TL2.XTStamp
        )
    order by ClosestTimestampID.RecTStamp, ClosestTimestampID.ReceivID;
    

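    Performance testing my query against the query containing the cross join.

    FirstTable was loaded with 13 records and SecondTable with 1,000,000.
    The execution plans for my query didn't change much from what has been posted.
    Execution plans for the cross join:
    Nested Loops Cost 81% using INNER JOIN SecondTable AS B ON (A.RecTStamp <> B.XTStamp OR A.RecTStamp = B.XTStamp)
    Nested Loops drops to 75% if using CROSS JOIN SecondTable AS B or ,SecondTable AS B
    Stream Aggregate 8%
    Index Scan [SecondTable][UK_ID][B] 6%
    Table Spool 5%
    Several other Clustered Index Seeks and Index Seeks (similar to my query as posted) with a cost of 0%.

    Execution time is 0.007 seconds for my query and 8-9 seconds for the CROSS JOIN.
    Cost comparison: 0% and 100%.

    I loaded FirstTable with 50,000 records, added a single record to ReceiverDetails for the join condition, and ran my query.
    50,013 rows were returned in between 0.9 and 1.0 seconds.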

    I ran second query with the cross join and allowed it to run for about 20 minutes before I killed it.
    If the cross join query is filtered to return only the original 13, execution time is again, 8-9 seconds.
    Placement of the filter condition was at inner most select, outer most select and both. No difference.

    There is a difference between these two join conditions in favor of the CROSS JOIN: the first uses a predicate, the CROSS JOIN does not:
    INNER JOIN SecondTable AS B ON (A.RecTStamp <> B.XTStamp OR A.RecTStamp = B.XTStamp)
    CROSS JOIN SecondTable AS B

    • 5
  3. byrdzeye
    2015-10-18T07:27:46+08:00

    Adding a second answer, not better than the first, but without changing any of the requirements presented there are a few ways to beat Access into submission and make it appear snappy. 'Materialize' the complications a bit at a time, effectively using 'triggers'. Access tables do not have triggers, so intercept and inject the CRUD processes.

    --*** Create a table for flank values.
        create table Flank (
             RecTStamp      datetime not null
            ,BeforeXTStamp  datetime null
            ,AfterXTStamp   datetime null
            ,constraint PK_Flank primary key clustered ( RecTStamp asc )
            )
    
    --*** Create a FlankUpdateLoop sub. (create what is missing)
        -- loop until rowcount < 5000 or rowcount = 0
        -- a 5K limit appears to be manageable for Access, especially for the initial population.
        insert into Flank (
             RecTStamp
            ,BeforeXTStamp
            ,AfterXTStamp
            )
        select top 5000 FirstTable.RecTStamp
            ,(
                select max(XTStamp) as val
                from SecondTable
                where XTStamp <= FirstTable.RecTStamp
                ) as BeforeXTStamp
            ,(
                select min(XTStamp) as val
                from SecondTable
                where XTStamp > FirstTable.RecTStamp
                ) as AfterXTStamp
        from FirstTable
        left join Flank
            on FirstTable.RecTStamp = Flank.RecTStamp
        where Flank.RecTStamp is null;
    
    --*** For FirstTable Adds, Changes or Deletes:
        delete from Flank where Flank.RecTStamp = CRUD_RecTStamp
        execute FlankUpdateLoop --See above. This will handle Adds, Changes or Deletes.
    
    --*** For SecondTable Adds, Changes or Deletes:
        --delete from Flank where the old value is immediately before and after the new flank value.
        --They may or may not be assigned a new value. Let FlankUpdate figure it out.
    
        --execute deletes for both beforextstamp and afterxtstamp
        --then update flank
    
        delete *
        from flank
        where beforextstamp between (
                        select min(beforextstamp)
                        from flank
                        where beforextstamp >= '3/16/2009 10:00:46 AM'
                        ) and (
                        select max(beforextstamp)
                        from flank
                        where beforextstamp <= '3/16/2009 10:00:46 AM'
                        );
    
        delete *
        from flank
        where afterxtstamp between (
                        select min(afterxtstamp)
                        from flank
                        where afterxtstamp >= '3/16/2009 10:00:46 AM'
                        ) and (
                        select max(afterxtstamp)
                        from flank
                        where afterxtstamp <= '3/16/2009 10:00:46 AM'
                        );
    
        execute FlankUpdateLoop
    
    --*** Final Report Query***--
        --Should execute without issues including 'deferred execution' problem.
        --Add filters as needed.
        select FirstTable.XmitID
            ,FirstTable.ReceivID
            ,ReceiverDetails.Lat
            ,ReceiverDetails.Lon
            ,BeforeTable.Latitude * (1 - ((FirstTable.RecTStamp - BeforeXTStamp) / (AfterXTStamp - BeforeXTStamp))) + AfterTable.Latitude * ((FirstTable.RecTStamp - BeforeXTStamp) / (AfterXTStamp - BeforeXTStamp)) as Xmit_Lat
            ,BeforeTable.Longitude * (1 - ((FirstTable.RecTStamp - BeforeXTStamp) / (AfterXTStamp - BeforeXTStamp))) + AfterTable.Longitude * ((FirstTable.RecTStamp - BeforeXTStamp) / (AfterXTStamp - BeforeXTStamp)) as Xmit_Lon
            ,FirstTable.RecTStamp as RecTStamp_basic
        from (((
            FirstTable
        inner join Flank on FirstTable.RecTStamp = Flank.RecTStamp)
        inner join SecondTable as BeforeTable on Flank.BeforeXTStamp = BeforeTable.XTStamp)
        inner join SecondTable as AfterTable on Flank.AfterXTStamp = AfterTable.XTStamp)
        inner join ReceiverDetails on FirstTable.ReceivID = ReceiverDetails.ReceivID
        order by FirstTable.RecTStamp;
    
    • 2
