Query optimisation - SQL

I have two tables. One holds all the trips the buses make:
dbo.Courses_Bus
|ID|ID_Bus|ID_Line|DateHour_Start_Course|DateHour_End_Course|
The other holds all payments made on these buses:
dbo.Payments
|ID|ID_Bus|DateHour_Payment|
The goal is to add the notion of a line to the payments table, to get something like this:
dbo.Payments
|ID|ID_Bus|DateHour_Payment|Line|
So I tried this:
/** I first added a Line column to the dbo.Payments table**/
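/** For illustration, that step might have looked like this (the column type is an assumption): **/
ALTER TABLE [dbo].[Payments] ADD Line INT NULL;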
UPDATE Table_A
SET Table_A.Line = Table_B.ID_Line
FROM [dbo].[Payments] AS Table_A
INNER JOIN [dbo].[Courses_Bus] AS Table_B
    ON Table_A.ID_Bus = Table_B.ID_Bus
    AND Table_A.DateHour_Payment BETWEEN Table_B.DateHour_Start_Course AND Table_B.DateHour_End_Course
And this:
UPDATE Table_A
SET Table_A.Line = Table_B.ID_Line
FROM [dbo].[Payments] AS Table_A
INNER JOIN (
    SELECT
        P.*,
        CP.ID_Line AS ID_Line
    FROM [dbo].[Payments] AS P
    INNER JOIN [dbo].[Courses_Bus] CP
        ON CP.ID_Bus = P.ID_Bus
        AND CP.DateHour_Start_Course <= P.DateHour_Payment
        AND CP.DateHour_End_Course >= P.DateHour_Payment
) AS Table_B ON Table_A.ID_Bus = Table_B.ID_Bus
The main problem, apart from the fact that these queries do not seem to work properly, is that each table has several million rows and grows every day. Because of the date-hour filter (mandatory, since a single bus can serve several lines in a day), SQL Server must compare each row of one table against every row of the other.
So it takes an enormous amount of time, and it will only get worse every day.
How can I make it work and optimise it?

Assuming that this is the logic you want:
UPDATE p
SET p.Line = cb.ID_Line
FROM [dbo].[Payments] p JOIN
[dbo].[Courses_Bus] cb
ON p.ID_Bus = cb.ID_Bus AND
p.DateHour_Payment BETWEEN cb.DateHour_Start_Course AND cb.DateHour_End_Course;
To optimize this query, you want an index on Courses_Bus(ID_Bus, DateHour_Start_Course, DateHour_End_Course).
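For instance (the index name is an assumption):
CREATE INDEX IX_Courses_Bus_Bus_Dates
    ON [dbo].[Courses_Bus] (ID_Bus, DateHour_Start_Course, DateHour_End_Course);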
There might be slightly more efficient ways to optimize the query, but your question doesn't have enough information -- is there always exactly one match, for instance?
Another big issue is that updating all the rows is quite expensive. You might find that it is better to do this in loops, one chunk at a time:
UPDATE TOP (10000) p
SET p.Line = cb.ID_Line
FROM [dbo].[Payments] p JOIN
[dbo].[Courses_Bus] cb
ON p.ID_Bus = cb.ID_Bus AND
p.DateHour_Payment BETWEEN cb.DateHour_Start_Course AND cb.DateHour_End_Course
WHERE p.Line IS NULL;
Once again, though, this structure depends on all the initial values being NULL and an exact match for all rows.
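For completeness, a minimal sketch of the loop wrapper (the batch size is illustrative; you may also want a transaction and a short pause per iteration):
DECLARE @rows INT = 1;
WHILE @rows > 0
BEGIN
    UPDATE TOP (10000) p
    SET p.Line = cb.ID_Line
    FROM [dbo].[Payments] p JOIN
         [dbo].[Courses_Bus] cb
         ON p.ID_Bus = cb.ID_Bus AND
            p.DateHour_Payment BETWEEN cb.DateHour_Start_Course AND cb.DateHour_End_Course
    WHERE p.Line IS NULL;
    SET @rows = @@ROWCOUNT;  -- stop once no row was updated
END;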

Thank you Gordon for your answer.
I investigated and came up with this query:
MERGE [dbo].[Payments] AS p
USING [dbo].[Courses_Bus] AS cb
ON p.ID_Bus = cb.ID_Bus AND
   p.DateHour_Payment >= cb.DateHour_Start_Course AND
   p.DateHour_Payment <= cb.DateHour_End_Course
WHEN MATCHED THEN
    UPDATE SET p.Line = cb.ID_Line;
As it seems to be the most suitable in an MS-SQL environment.
It came back with this error:
The MERGE statement attempted to UPDATE or DELETE the same row more than once. This happens when a target row matches more than one source row. A MERGE statement cannot UPDATE/DELETE the same row of the target table multiple times. Refine the ON clause to ensure a target row matches at most one source row, or use the GROUP BY clause to group the source rows.
I understood this to mean that it finds several rows with identical
[p.ID_Bus= cb.ID_Bus AND
p.DateHour_Payment >= cb.DateHour_Start_Course AND
p.DateHour_Payment <= cb.DateHour_End_Course]
Yes, this is a possible case; however, the ID is different each time.
For example, if two cards are beeped at the same time, or if there is a loss of network and the equipment was updated later, putting the beeps at the same time. These are different rows that must be treated separately, and you can obtain, for example:
|ID|ID_Bus|DateHour_Payment|Line|
|56|204|2021-01-01 10:00:00|15|
|82|204|2021-01-01 10:00:00|15|
How can I improve this query so that it takes into account different payment IDs?
I can't figure out how to do this with the help I find online. Maybe this method is not the right one in this context.
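For illustration, the error message's own suggestion (refine the source so that a target row matches at most one source row) might be sketched like this; it arbitrarily keeps one course per payment, which may or may not be acceptable here:
MERGE [dbo].[Payments] AS p
USING (
    SELECT Payment_ID, ID_Line
    FROM (
        SELECT pay.ID AS Payment_ID,
               cb.ID_Line,
               ROW_NUMBER() OVER (PARTITION BY pay.ID
                                  ORDER BY cb.DateHour_Start_Course DESC) AS rn
        FROM [dbo].[Payments] pay
        JOIN [dbo].[Courses_Bus] cb
            ON pay.ID_Bus = cb.ID_Bus
            AND pay.DateHour_Payment BETWEEN cb.DateHour_Start_Course AND cb.DateHour_End_Course
    ) x
    WHERE rn = 1  -- keep a single matching course per payment
) AS src
ON p.ID = src.Payment_ID
WHEN MATCHED THEN
    UPDATE SET p.Line = src.ID_Line;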

Oracle query with multiple joins taking far too long to run [closed]

I'm trying to develop a query that involves several tables. TBL3 has 500k records, TBL6 has 10 million records, and TBL2, TBL4, and TBL5 are the same table with 20 million records. TBL1 is a subquery with a number of joins. I cannot change the database structure, but I have checked the available indices and that hasn't helped. I used the OUTER APPLY join at the end because I thought it might speed up performance, but after much experimenting I still end up killing the query after 15-20 minutes.
SELECT TBL1.START_TIME,
       TBL1.DEST_TIME,
       TBL1.SRCADDRESS,
       TBL1.POS,
       TBL2.ADDRESSID AS TBL2
FROM (
    SELECT TBL3.EVENTTIME,
           TBL3.SOURCEADDRESS,
           TBL6.FROM_POS,
           TBL3.LC_NAME
    FROM CUSTOMER_OVERVIEW_V TBL3
    INNER JOIN CUSTOMER_SALE_RELATED TBL6
        ON TBL6.LC_NAME = TBL3.LC_NAME
        AND TBL6.FROM_LOC = TBL3.SOURCEADDRESS
    INNER JOIN CUSTOMER TBL4
        ON TBL4.CUSTID = TBL3.LC_NAME
        AND TBL4.AREATYPE = 'SW'
        AND TBL4.EVENTTIME <= TBL3.EVENTTIME + interval '1' second
        AND TBL4.EVENTTIME >= TBL3.EVENTTIME - interval '1' second
    INNER JOIN CUSTOMER TBL5
        ON TBL5.CUSTID = TBL3.LC_NAME
        AND TBL5.AREATYPE = 'SE'
        AND TBL5.EVENTTIME <= TBL3.EVENTTIME + interval '1' second
        AND TBL5.EVENTTIME >= TBL3.EVENTTIME - interval '1' second
    WHERE TBL3.SOURCEADDRESS IS NOT NULL
    AND extract(second from TBL5.EVENTTIME - TBL4.EventTime) * 1000 > 250
    ORDER BY TBL3.EVENTTIME DESC
    FETCH FIRST 500 ROWS ONLY) TBL1
OUTER APPLY (SELECT ADDRESSID
             FROM CUSTOMER
             WHERE AREATYPE = 'STH'
             AND EVENTTIME > TBL1.DEST_TIME
             ORDER BY EVENTTIME ASC
             FETCH FIRST 1 ROW ONLY) TBL2;
There must be a way to structure this query better to improve the performance so any suggestions would be appreciated.
You are asking for the first 500 rows, so you only want a tiny fraction of your overall data. Therefore you want to use nested loops joins with appropriate indexes in order to get those 500 rows and be done, rather than have to process millions of rows and only then take off the top 500.
That, however, is complicated by the fact that you want the first 500 only after ordering the results. That ORDER BY will require a sort, and a sort operation has to process 100% of the incoming rows before it can produce even its first row of output, which means you have to crunch 100% of your data, and that kills your performance.
If your inner joins to TBL6, TBL4 and TBL5 are not meant to drop rows that lack a matching key (i.e. if making those outer joins would return the same number of result rows from TBL3); if you don't really need the filter extract(second from TBL5.EVENTTIME - TBL4.EventTime) * 1000 > 250, which requires joining TBL5 and TBL4 first; and if this CUSTOMER_OVERVIEW_V view is a simple single-table view that applies no predicates (not likely), or you can bypass the view and hit the table directly, then you can do this:
Create a 2-column function-based index (e.g. customer_eventtime_idx) on (DECODE(sourceaddress,NULL,'Y','N'), eventtime) on the table underlying the customer_overview_v view, in that exact column order.
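For illustration, the DDL might look like this (the base table name is an assumption; use whatever table sits under the view):
CREATE INDEX customer_eventtime_idx
    ON table_under_customer_overview_v (DECODE(sourceaddress, NULL, 'Y', 'N'), eventtime);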
Rewrite the query to get the 500 rows as early as possible, preferably before any joins, using a hint to force a descending index scan on this index. You will also need to change your IS NOT NULL predicate to the same DECODE function used in the index definition:
SELECT /*+ LEADING(c) USE_NL(csr c1 c2 adr) */
       [columns needed]
FROM (SELECT /*+ NO_MERGE INDEX_DESC(x customer_eventtime_idx) */
             [columns you need]
      FROM TABLE_UNDER_CUSTOMER_OVERVIEW_V x
      WHERE DECODE(SOURCEADDRESS,NULL,'Y','N') = 'N'
      ORDER BY DECODE(SOURCEADDRESS,NULL,'Y','N') DESC, EVENTTIME DESC
      FETCH FIRST 500 ROWS ONLY) c
INNER JOIN CUSTOMER_SALE_RELATED ... csr
INNER JOIN CUSTOMER ... c1
INNER JOIN CUSTOMER ... c2
OUTER APPLY ([address subquery like you have it]) adr
WHERE extract(second from c.EVENTTIME - c1.EventTime) * 1000 > 250
Generate your explain plan and make sure you see the index scan descending operation on the index you created - it must say descending - and also ensure that you see an ORDER BY NOSORT step afterwards - it must say NOSORT. Creating the index as we have and ordering the query as we have was all about getting these plan operations to be selected. This is not easy to get right: every detail around the inner c query must be crafted precisely to achieve the trick.
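For reference, one common way to produce that plan in Oracle:
EXPLAIN PLAN FOR
SELECT ... ;  -- the full query above
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);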
Explanation:
Oracle should seek/access the index on the DECODE Y/N result: find the N records (those that have a non-null sourceaddress) in the first column of the index, starting at the leading edge of the N values, then get the corresponding row from the table, then step backward to the previous N value, get that table row, then the previous, and so on, emitting rows as they are found. Because the ORDER BY matches the index exactly and in the same direction (descending), Oracle will skip the SORT operation, as it knows the data coming out of the index step will already be in the correct order.
These rows therefore stream out of the c inline view as they are found, which allows the FETCH FIRST to stop processing when it gets to 500 rows of output, without having to wait for any buffering operation (like a SORT) to complete. You only ever hit those 500 records - it never visits the rest of the rows.
You then join via nested loops to the remaining tables, but you only have 500 driving rows. Obviously you must ensure appropriate indexing on those tables for your join clauses.
If your CUSTOMER_OVERVIEW_V view, however, does joins and applies more predicates, you simply cannot do this with the view. You will have to use the base tables: apply this trick on whichever base table has that eventtime column, then join in whatever you need that the view was joining to and reproduce any remaining needed view logic (though you might find it does more than you need, and you can omit much of it). In general, don't use views whenever you can help it. You always have more control against base tables.
Lastly, note that I did not follow your TBL1, TBL2, TBL3, etc. alias convention. That is hard to read, because you have to constantly look elsewhere to see what "TBL3" means. Far better to use aliases that immediately communicate which table they are, such as the initial letter, a couple of letters, or an acronym formed from the first letter of each word.

Complex SQL View with Joins & Where clause

My SQL skill level is pretty basic. I have certainly written some general queries and done some very generic views, but once we get into joins, I choke on getting the results I want in the view I am creating.
I feel like I am almost there; I just can't get the final piece.
SELECT dbo.ics_supplies.supplies_id,
dbo.ics_supplies.old_itemid,
dbo.ics_supplies.itemdescription,
dbo.ics_supplies.onhand,
dbo.ics_supplies.reorderlevel,
dbo.ics_supplies.reorderamt,
dbo.ics_supplies.unitmeasure,
dbo.ics_supplies.supplylocation,
dbo.ics_supplies.invtype,
dbo.ics_supplies.discontinued,
dbo.ics_supplies.supply,
dbo.ics_transactions.requsitionnumber,
dbo.ics_transactions.openclosed,
dbo.ics_transactions.transtype,
dbo.ics_transactions.originaldate
FROM dbo.ics_supplies
LEFT OUTER JOIN dbo.ics_orders
ON dbo.ics_supplies.supplies_id = dbo.ics_orders.suppliesid
LEFT OUTER JOIN dbo.ics_transactions
ON dbo.ics_orders.requisitionnumber =
dbo.ics_transactions.requsitionnumber
WHERE ( dbo.ics_transactions.transtype = 'PO' )
When I don't include the WHERE clause, I get 17,000+ records in my view. That is not correct. This happens because we are matching on a one-to-many table. The Supplies table has 12,000 records, and there should always be 12,000 records. Never more, never less.
The pieces that I am missing are:
I only need ONE matching record from the ICS_Transactions Table. Ideally, the one that I want is the most current 'ICS_Transactions.OriginalDate'.
I only want the ICS_Transactions Table fields to populate IF ICS_Transactions.TransType = 'PO'. Otherwise, these fields should remain null.
Sample code or anything would help a lot. I have done a lot of research on joins and it's still very confusing to get what I need for results.
EDIT/Update
I feel as if I asked my question in the wrong way, or didn't give a good overall view of what I am asking. For that, I apologize. I am still very new to SQL, but trying hard.
ICS_Supplies Table has 12,810 records
ICS_Orders Table has 3,666 records
ICS_Transaction Table has 4,701 records
In short, I expect to see a result of 12,810 records. No more and no less. I am trying to create a View of ALL records from the ICS_Supplies table.
Not all records in Supply Table are in Orders and or Transaction Table. But still, I want to see all 12,810 records, regardless.
My users have requested that IF any of these supplies have an open PO (ICS_Transactions.OpenClosed = 'Open' and ICS_Transactions.TransType = 'PO'), THEN I also want to see additional fields from ICS_Transactions (ICS_Transactions.OpenClosed, ICS_Transactions.TransType, ICS_Transactions.OriginalDate, ICS_Transactions.RequsitionNumber).
If there are no open PO's for supply record, then these additional fields should be blank/null (regardless to what data is in these added fields, they should display null if they don't meet the criteria).
The ICS_Orders table is only needed to hop from ICS_Supplies to ICS_Transactions (I first need to obtain the Requisition Number from the Orders table, if there is one).
I am sorry if I am not doing a good job to explain this. Please ask if you need clarification.
Here's a simplified version of Ross Bush's answer (it removes a join from the CTE to keep things more focused, speed things up, and cut down the code).
;WITH
ordered_ics_transactions AS
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY requsitionnumber
ORDER BY originaldate DESC
)
AS seq_id
FROM
dbo.ics_transactions
)
SELECT
s.supplies_id, s.old_itemid,
s.itemdescription, s.onhand,
s.reorderlevel, s.reorderamt,
s.unitmeasure, s.supplylocation,
s.invtype, s.discontinued,
s.supply,
t.requsitionnumber, t.openclosed,
t.transtype, t.originaldate
FROM
dbo.ics_supplies AS s
LEFT OUTER JOIN
dbo.ics_orders AS o
ON o.suppliesid = s.supplies_id
LEFT OUTER JOIN
ordered_ics_transactions AS t
ON t.requsitionnumber = o.requisitionnumber
AND t.transtype = 'PO'
AND t.seq_id = 1
This will only join the most recent transaction record for each requisitionnumber, and only if it has transtype = 'PO'
IF you want to reverse that (joining only transaction records that have transtype = 'PO', and of those only the most recent one), then move the transtype = 'PO' filter to be a WHERE clause inside the ordered_ics_transactions CTE.
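For illustration, that reversed variant's CTE might look like this (a sketch):
ordered_ics_transactions AS
(
    SELECT
        *,
        ROW_NUMBER() OVER (PARTITION BY requsitionnumber
                           ORDER BY originaldate DESC) AS seq_id
    FROM dbo.ics_transactions
    WHERE transtype = 'PO'  -- keep only POs, then rank those by recency
)
with t.transtype = 'PO' removed from the join condition.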
You can possibly work with the query below to get what you need.
1. I only need ONE matching record from the ICS_Transactions Table. Ideally, the one that I want is the most current 'ICS_Transactions.OriginalDate'.
I would solve this by creating a CTE with all the ICS_Transactions fields needed in the query, rank-ordered by OriginalDate and partitioned by SuppliesID.
2. I only want the ICS_Transactions Table fields to populate IF ICS_Transactions.TransType = 'PO'. Otherwise, these fields should remain null.
If you move the condition from the WHERE clause to the LEFT JOIN, then ICS_Transactions rows not matching the criteria will be dropped from the join and their columns filled with null values alongside the rest of the query's records.
;WITH ReqNumberRanked AS
(
SELECT
dbo.ICS_Orders.SuppliesID,
dbo.ICS_Transactions.RequsitionNumber,
dbo.ICS_Transactions.TransType,
dbo.ICS_Transactions.OriginalDate,
dbo.ICS_Transactions.OpenClosed,
RequisitionNumberRankReversed = RANK() OVER(PARTITION BY dbo.ICS_Orders.SuppliesID, dbo.ICS_Transactions.RequsitionNumber ORDER BY dbo.ICS_Transactions.OriginalDate DESC)
FROM
dbo.ICS_Orders
LEFT OUTER JOIN dbo.ICS_Transactions ON dbo.ICS_Orders.RequisitionNumber = dbo.ICS_Transactions.RequsitionNumber
)
SELECT
dbo.ICS_Supplies.Supplies_ID, dbo.ICS_Supplies.Old_ItemID,
dbo.ICS_Supplies.ItemDescription, dbo.ICS_Supplies.OnHand,
dbo.ICS_Supplies.ReorderLevel, dbo.ICS_Supplies.ReorderAmt,
dbo.ICS_Supplies.UnitMeasure,
dbo.ICS_Supplies.SupplyLocation, dbo.ICS_Supplies.InvType,
dbo.ICS_Supplies.Discontinued, dbo.ICS_Supplies.Supply,
ReqNumberRanked.RequsitionNumber,
ReqNumberRanked.OpenClosed,
ReqNumberRanked.TransType,
ReqNumberRanked.OriginalDate
FROM
dbo.ICS_Supplies
LEFT OUTER JOIN dbo.ICS_Orders ON dbo.ICS_Supplies.Supplies_ID = dbo.ICS_Orders.SuppliesID
LEFT OUTER JOIN ReqNumberRanked ON ReqNumberRanked.RequsitionNumber = dbo.ICS_Orders.RequisitionNumber
AND (ReqNumberRanked.TransType = 'PO')
AND ReqNumberRanked.RequisitionNumberRankReversed = 1

Improving Mass Update Query Performance

I'm looking to run some remediation on my database which requires adding some new fields to a table and backfilling them based on specific criteria. There are a LOT of records and I'd prefer the backfill to run relatively fast, as my current query is taking forever to run.
I attempted updating via subqueries, but this doesn't seem to be very performant.
Query below is edited to show the general process (so syntax might be a little off). Hopefully you can understand what I'm trying to do.
I'd like to update every single record in the accounts table.
I'd like to do this by going through each record and running a number of checks against the ID of that record prior to updating. For any records that don't match up in the join, I just want to set them to 0.
Doing this over the course of a few hundred thousand records seems to take forever. I'm sure there is a faster more efficient way to go about doing this. Using Postgres.
UPDATE accounts
SET account_age = a2.account_age
FROM (
    WITH dataPoints AS (
        SELECT
            account__c.account_id AS account_id,
            COALESCE(account__c.account_age, 0) AS account_age
        FROM account__c
        LEFT OUTER JOIN points ON points.account_id = account__c.id
        LEFT OUTER JOIN goal__c ON goal__c.id = points.goal_id
        GROUP BY account__c.account_id, account__c.account_age
    )
    SELECT
        dataPoints.account_id,
        MAX(dataPoints.account_age) AS account_age
    FROM dataPoints
    GROUP BY dataPoints.account_id
) AS a2
WHERE accounts.id = a2.account_id;
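A hedged sketch of one common alternative for this kind of backfill, assuming the simplified schema above: let a correlated subquery compute the value per account, with unmatched rows falling back to 0 in the same pass.
UPDATE accounts
SET account_age = COALESCE(
        (SELECT MAX(c.account_age)
         FROM account__c c
         WHERE c.account_id = accounts.id),
        0);  -- accounts with no matching row get 0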

Hive: Can't select one random match on right table in left outer join

EDIT - I don't care about the skewness or things being slow. I found out that the slowness was mostly caused by a many-to-many join producing many matches in my left outer join... Please skip down to the bottom.
I have an issue with a skewed table: some keys appear in far more rows than others, and more than one key appears that often.
Stats on this table and the table I am joining with:
Larger table: totalSize=47431500000, numRows=509500000, rawDataSize=47022050000, 21052 distinct keys
Smaller table: totalSize=1154984612, numRows=13780692, rawDataSize=1141203920, 39313 distinct keys
The smaller table also has repeated rows of keys. The other challenge is that I need to randomly select a matching key from the smaller table.
What I have tried so far:
set hive.auto.convert.join=true;
set hive.mapjoin.smalltable.filesize=1155mb;
and
CREATE TABLE joined_table AS SELECT * FROM (
select * from larger_table as a
LEFT OUTER JOIN smaller_table as b
on a.key = b.key
ORDER BY RAND()
)b;
It has been running for a day now.
I thought about manually doing something like this, but I have more than one key that appears a ton, so I would have to make a bunch of tables and merge them. Which I can do, if that is my only option :O
But I wanted to reach out to you all on SO.
Thanks for the help in advance
EDIT June 20th
I found these settings to try:
set hive.optimize.skewjoin = true;
set hive.skewjoin.key = 200000;
I had already created a few separate tables to split out and join the highest-appearing keys, so that the highest-appearing key in the remainder was 200k. Running the query to join the rest then took 25 minutes and finished all tasks successfully, according to the job tracker on the web interface. But on the command line in the Hive shell, it is still sitting there, and when I go to check, the table does not exist.
EDIT #2 - After a lot of reading and trying out a lot of Hive SQL... the one solution that should have worked in theory did not work; specifically, the ORDER BY rand() never even happened...
CREATE TABLE joined_table AS SELECT * FROM (
select * from larger_table as a
JOIN
(SELECT *, row_number() over (partition by key order by rand()) as row_num
from smaller_table) as b
on a.key = b.key
and b.row_num=1
)b;
In the results it is matched with the first row, not a random row at all.
Any other options or anything I did incorrectly here?
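For what it's worth, one hedged workaround sketch (table and column names as above): materialise rand() as a real column first, so the window function orders by a stable value rather than re-evaluating rand() during the sort.
CREATE TABLE joined_table AS
SELECT a.*, b.row_num  -- list columns explicitly to avoid duplicate names in CTAS
FROM larger_table a
JOIN (
    SELECT s.*, row_number() OVER (PARTITION BY s.key ORDER BY s.r) AS row_num
    FROM (SELECT *, rand() AS r FROM smaller_table) s
) b
ON a.key = b.key
WHERE b.row_num = 1;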

Why does my left join in Access have fewer rows than the left table?

I have two tables in an MS Access 2010 database: TBLIndividuals and TblIndividualsUpdates. They have a lot of the same data, but the primary key may not be the same for a given person's record in both tables. So I'm doing a join between the two tables on names and birthdates to see which records correspond. I'm using a left join so that I also get rows for the people who are in TblIndividualsUpdates but not in TBLIndividuals. That way I know which records need to be added to TBLIndividuals to get it up to date.
SELECT TblIndividuals.PersonID AS OldID,
TblIndividualsUpdates.PersonID AS UpdateID
FROM TblIndividualsUpdates LEFT JOIN TblIndividuals
ON ( (TblIndividuals.FirstName = TblIndividualsUpdates.FirstName)
and (TblIndividuals.LastName = TblIndividualsUpdates.LastName)
AND (TblIndividuals.DateBorn = TblIndividualsUpdates.DateBorn
or (TblIndividuals.DateBorn is null
and (TblIndividuals.MidName is null and TblIndividualsUpdates.MidName is null
or TblIndividuals.MidName = TblIndividualsUpdates.MidName))));
TblIndividualsUpdates has 4149 rows, but the query returns only 4103 rows. There are about 50 new records in TblIndividualsUpdates, but only 4 rows in the query result where OldID is null.
If I export the data from Access to PostgreSQL and run the same query there, I get all 4149 rows.
Is this a bug in Access? Is there a difference between Access's left join semantics and PostgreSQL's? Is my database corrupted (Compact and Repair doesn't help)?
ON (
TblIndividuals.FirstName = TblIndividualsUpdates.FirstName
and
TblIndividuals.LastName = TblIndividualsUpdates.LastName
AND (
TblIndividuals.DateBorn = TblIndividualsUpdates.DateBorn
or
(
TblIndividuals.DateBorn is null
and
(
TblIndividuals.MidName is null
and TblIndividualsUpdates.MidName is null
or TblIndividuals.MidName = TblIndividualsUpdates.MidName
)
)
)
);
What I would do is systematically remove all the join conditions except the first two until you find the records drop off. Then you will know where your problem is.
This should never happen. Unless rows are being inserted/deleted in the meantime,
the query:
SELECT *
FROM a LEFT JOIN b
ON whatever ;
should never return less rows than:
SELECT *
FROM a ;
If it happens, it's a bug. Are you sure the queries are exactly like this (and you haven't omitted some detail, like a WHERE clause)? Are you sure that the first returns 4149 rows and the second one 4103 rows? You could make another check by changing the * above to COUNT(*).
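For instance:
SELECT COUNT(*) FROM a LEFT JOIN b ON whatever;
SELECT COUNT(*) FROM a;
-- the first count should never be smaller than the second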
Drop any indexes from both tables which include those JOIN fields (FirstName, LastName, and DateBorn). Then see whether you get the expected 4,149 rows with this simplified query.
SELECT
i.PersonID AS OldID,
u.PersonID AS UpdateID
FROM
TblIndividualsUpdates AS u
LEFT JOIN TblIndividuals AS i
ON
(
(i.FirstName = u.FirstName)
AND (i.LastName = u.LastName)
AND (i.DateBorn = u.DateBorn)
);
For whatever it is worth, since this seems to be an elusive bug and any additional information could help resolve it: I have had the same problem.
The query is too big to post here and I don't have the time to reduce it now to something suitable, but I can report what I found. In the below, all joins are left joins.
I was gradually refining and changing my query. It had a derived table in it (D), and the whole thing was made into a derived table (T) and then joined to a last table (L). At one point in its development, no field in T that originated in D participated in the join to L. That was when the problem occurred: the total number of rows mysteriously became less than in the main table, which should be impossible. As soon as I again let a field from D participate (via T) in the join to L, the number returned to normal.
It was as if the join condition to D was moved to a WHERE clause when no field in it was participating (via T) in the join to L. But I don't really know what the explanation is.