Problem optimizing sql query with cross apply sub query - sql

So I have three tables:
MakerParts, that holds the primary information of a Vehicle Part:
Id  MakerId  PartNumber  Description
1   1        ABC1234     Tire
2   1        XYZ1234     Door
MakerPrices, that holds the price history variation for the parts (references MakerParts.Id on MakerPartNumberId, and the table MakerPriceUpdates on UpdateId):
Id  MakerPartNumberId  UpdateId  Price
1   1                  1         9.83
2   1                  2         11.23
MakerPriceUpdates, which holds the dates of the price updates. Each update is a CSV file uploaded to our system: one file produces one row in this table and many price changes in the MakerPrices table.
Id  Date                     FileName
1   2019-01-09 00:00:00.000  temp.csv
2   2019-01-11 00:00:00.000  temp2.csv
This means that one part (MakerParts) may have multiple prices (MakerPrices). The date of the price change is on the table MakerPricesUpdates.
I want to select all MakerParts where the most recent price is zero, filtering by the MakerId on table MakerParts.
What I've tried:
select mp.* from MakerParts mp cross apply
(select top 1 Price from MakerPrices inner join
MakerPricesUpdates on MakerPricesUpdates.Id = MakerPrices.UpdateId where
MakerPrices.MakerPartNumberId = mp.Id order by Date desc) as p
where mp.MakerId = 1 and p.Price = 0
But that is absurdly slow (we have about 100 million rows in the MakerPrices table), and I'm having a hard time optimizing it. The result is only two rows for MakerId 1, yet the query took 2 minutes to run. I also tried:
select * from (
select
mp.*,
(select top 1 Price from MakerPrices inner join
MakerPricesUpdates on MakerPricesUpdates.Id = MakerPrices.UpdateId
where MakerPrices.MakerPartNumberId = mp.Id order by Date desc) as Price
from MakerParts mp) as temp
where temp.Price = 0 and MakerId = 1
Same result, and same time. My query plan (for the first query) (no new indexes suggested by Management Studio):

I think you can avoid joining MakerPriceUpdates to MakerPrices: as long as UpdateId always increases with the update date, the highest UpdateId for a part identifies its latest price. That will save you some time.
select mp.* from MakerParts mp cross apply
(select top 1 Price from MakerPrices where
MakerPrices.MakerPartNumberId = mp.Id order by MakerPrices.UpdateId desc) as p
where mp.MakerId = 1 and p.Price = 0
You may be able to reduce the time further by avoiding the per-part TOP 1 sort, using a CTE with row_number() as below (the rn = 1 filter keeps only the latest price of each part):
;with LatestMakerPrices as
(
select *, row_number() over (partition by MakerPartNumberId order by UpdateId desc) as rn from MakerPrices
)
select mp.* from MakerParts mp cross apply
(select Price from LatestMakerPrices lmp where lmp.MakerPartNumberId = mp.Id and lmp.rn = 1) as p
where mp.MakerId = 1 and p.Price = 0
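To check the logic of the row_number() approach on a tiny data set, here is a minimal sketch in Python with sqlite3. It is not SQL Server: SQLite (3.25 or later, for window functions) stands in for it, a plain join replaces CROSS APPLY, and the price rows are made up for illustration. It also assumes, like the answer above, that a higher UpdateId means a more recent update.

```python
import sqlite3

def parts_with_zero_latest_price(conn, maker_id):
    """Return ids of parts whose most recent price (highest UpdateId) is 0."""
    sql = """
    WITH LatestMakerPrices AS (
        SELECT MakerPartNumberId, Price,
               ROW_NUMBER() OVER (PARTITION BY MakerPartNumberId
                                  ORDER BY UpdateId DESC) AS rn
        FROM MakerPrices
    )
    SELECT mp.Id
    FROM MakerParts mp
    JOIN LatestMakerPrices lmp
      ON lmp.MakerPartNumberId = mp.Id AND lmp.rn = 1  -- latest price only
    WHERE mp.MakerId = ? AND lmp.Price = 0
    ORDER BY mp.Id
    """
    return [row[0] for row in conn.execute(sql, (maker_id,))]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MakerParts (Id INTEGER PRIMARY KEY, MakerId INTEGER, PartNumber TEXT);
CREATE TABLE MakerPrices (Id INTEGER PRIMARY KEY, MakerPartNumberId INTEGER,
                          UpdateId INTEGER, Price REAL);
INSERT INTO MakerParts VALUES (1, 1, 'ABC1234'), (2, 1, 'XYZ1234');
-- Hypothetical prices:
-- part 1: old price 0, latest price 11.23 -> must not be returned
-- part 2: old price 5.00, latest price 0  -> must be returned
INSERT INTO MakerPrices VALUES (1, 1, 1, 0), (2, 1, 2, 11.23),
                               (3, 2, 1, 5.0), (4, 2, 2, 0);
""")

print(parts_with_zero_latest_price(conn, 1))
```

Part 1 had a zero price in the past but not as its latest price, so only part 2 comes back. Without the rn = 1 condition both parts would match.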
Execution plan difference between query in question and my answer:

try:
WITH tab AS (
SELECT *, NULL as Price FROM MakerParts
WHERE not exists (
SELECT Id
FROM MakerPrices
WHERE MakerPrices.MakerPartNumberId = MakerParts.Id
)
)
SELECT * from tab WHERE MakerId = 2
UNION ALL
SELECT a.* , Price
FROM [dbo].[MakerParts] a
LEFT JOIN [dbo].[MakerPrices] b
ON b.MakerPartNumberId = a.Id
WHERE MakerId = 2 AND Price = 0

Try your query:
select mp.* from MakerParts mp cross apply
(select top 1 Price from MakerPrices inner join
MakerPricesUpdates on MakerPricesUpdates.Id = MakerPrices.UpdateId where
MakerPrices.MakerPartNumberId = mp.Id order by Date desc) as p
where mp.MakerId = 1 and p.Price = 0
After creating the index below:
CREATE NONCLUSTERED INDEX [NCIdx_MakerPrices_MakerPartNumberId_UpdateId] ON [dbo].[MakerPrices]
(
[MakerPartNumberId] ASC,
[UpdateId] ASC
)
INCLUDE([Price])
And after making the Id column of the MakerPricesUpdates table its primary key.

Related

Remove unnecessary rows with different status degree

OrderNumber  OrderStatus
560          0002
560          0016
560          0028
180          0002
180          0215
180          0485
The order status numbers represent different statuses: 0002 means the order object has been created, 0485 means the order has been completed, and so on. What I want to achieve is this: if an order is completed or cancelled, I don't want to see any of its other statuses, such as object creation. I have three tables; let's call them A, B and C. OrderNumber comes from table A and OrderStatus from table C. Table B is the junction table where I keep my OrderNo and OrderStat.
select A.OrderNumber, C.OrderStatus
from(A inner join B on B.OrderNo = A.OrderNo
inner join C on C.OrderStatus = B.OrderStat)
where A.OrderNumber in (this is where I need some help I think);
You can use a CTE to collect all the completed orders. Then exclude those orders from your main result set and append them separately, showing only their completed status.
Note: based on the question, I have only considered completed orders. You can include cancelled orders in the CTE as well by adding the cancelled order status.
;WITH CTE_CompletedCancelledOrders AS
(
SELECT A.OrderNumber, B.OrderStat AS OrderStatus
FROM A inner join B on B.OrderNo = A.OrderNo
WHERE B.OrderStat = '0485' -- completed orders
)
select A.OrderNumber, C.OrderStatus
from A inner join B on B.OrderNo = A.OrderNo
inner join C on C.OrderStatus = B.OrderStat
WHERE NOT EXISTS (SELECT 1 FROM CTE_CompletedCancelledOrders cco
WHERE cco.OrderNumber = A.OrderNumber) -- orders that are not completed
UNION ALL
SELECT OrderNumber, OrderStatus FROM CTE_CompletedCancelledOrders -- completed orders, final status only
The best approach is probably window functions:
select o.*
from (select A.OrderNumber, C.OrderStatus,
sum(case when C.OrderStatus in ('0485', . . . ) then 1 else 0 end) over (partition by A.OrderNumber) as completed_or_canceled
from A inner join
B
on B.OrderNo = A.OrderNo inner join
C
on C.OrderStatus = B.OrderStat
) o
where completed_or_canceled = 0 or
OrderStatus in ('0485', . . . );
The . . . is for the statuses that define the completed/canceled conditions.
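For anyone who wants to see the window-function version run against the sample rows from the question, here is a small sketch in Python with sqlite3 (SQLite 3.25 or later, standing in for the real database). The table layouts, the OrderNo values and the FINAL_STATUSES tuple are assumptions for illustration; only '0485' is known from the question to be a final status.

```python
import sqlite3

# Assumption: only the "completed" status is known; add cancelled codes here.
FINAL_STATUSES = ("0485",)

def visible_statuses(conn):
    """Rows to show: every status of open orders, final status only otherwise."""
    placeholders = ",".join("?" for _ in FINAL_STATUSES)
    sql = f"""
    SELECT OrderNumber, OrderStatus
    FROM (
        SELECT A.OrderNumber, C.OrderStatus,
               SUM(CASE WHEN C.OrderStatus IN ({placeholders}) THEN 1 ELSE 0 END)
                   OVER (PARTITION BY A.OrderNumber) AS completed_or_canceled
        FROM A
        JOIN B ON B.OrderNo = A.OrderNo
        JOIN C ON C.OrderStatus = B.OrderStat
    ) o
    WHERE completed_or_canceled = 0
       OR OrderStatus IN ({placeholders})
    ORDER BY OrderNumber, OrderStatus
    """
    # The status list is bound twice, once per IN (...) clause.
    return conn.execute(sql, FINAL_STATUSES * 2).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (OrderNo INTEGER PRIMARY KEY, OrderNumber TEXT);
CREATE TABLE B (OrderNo INTEGER, OrderStat TEXT);
CREATE TABLE C (OrderStatus TEXT PRIMARY KEY);
INSERT INTO A VALUES (1, '560'), (2, '180');
INSERT INTO B VALUES (1, '0002'), (1, '0016'), (1, '0028'),
                     (2, '0002'), (2, '0215'), (2, '0485');
INSERT INTO C VALUES ('0002'), ('0016'), ('0028'), ('0215'), ('0485');
""")

for row in visible_statuses(conn):
    print(row)
```

Order 180 is completed, so only its 0485 row survives; order 560 is still open, so all three of its statuses are kept.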

Optimize query with a subquery With Group BY MAX and JOINED with another TABLE

I need help to optimize this SQL query, so that it would run much faster.
What I am trying to do is, get the latest values of DATA out of these tables:
TABLE: Quotes
ID QuoteNumber LastUpdated(inticks) PolicyId
1 C1000 1000000000000 100
1 D2000 1001111111110 200
2 A1000 1000000000000 300
2 B2000 1002222222222 400
TABLE: Policies
ID CustomerName Blah1(dummy column)
100 Mark someData
200 Lisa someData2
300 Brett someData3
400 Goku someData4
DESIRED RESULT:
LastUpdated Id(quoteId) QuoteNumber CustomerName
1001111111110 1 D2000 Lisa
1002222222222 2 B2000 Goku
Select DISTINCT subquery1.LastUpdated,
q2.Id,
q2.QuoteNumber,
p.CustomerName
FROM
(Select q.id,
Max(q.LastUpdated) as LastUpdated from Quotes q
where q.LastUpdated > #someDateTimeParameter
and q.QuoteNumber is not null
and q.IsDiscarded = 0
GROUP BY q.id) as subquery1
LEFT JOIN Quotes q2
on q2.id = subquery1.id
and q2.LastUpdated = subquery1.LastUpdated
INNER JOIN Policies p
on p.id = q2.PolicyId
where p.blah1 = #someBlahParameter
ORDER BY subquery1.LastUpdated
Here is the actual execution plan:
https://www.brentozar.com/pastetheplan/?id=SkD3fPdwD
I think you're looking for something like this
with q_cte as (
select q.Id, q.QuoteNumber, q.LastUpdated, q.PolicyId,
row_number() over (partition by q.id order by q.LastUpdated desc) rn
from Quotes q
where q.LastUpdated>#someDateTimeParameter
and q.QuoteNumber is not null
and q.IsDiscarded=0)
select q.LastUpdated, q.Id, q.QuoteNumber, p.CustomerName
from q_cte q
join Policies p on q.PolicyId=p.id
where q.rn=1 /* Only the latest date */
and p.blah1=#someBlahParameter
order by q.LastUpdated;
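Here is a runnable sketch of the same row_number() idea against the sample data from the question, using Python with sqlite3 (SQLite 3.25 or later) rather than SQL Server. A few things are assumptions: the IsDiscarded column is set to 0 everywhere, the cutoff is passed as an ordinary bound parameter, and the blah1 filter is left out because the sample rows all carry different values for it. Note that the CTE has to carry PolicyId so the join to Policies is possible.

```python
import sqlite3

def latest_quotes(conn, min_last_updated):
    """Latest quote per quote Id (after the cutoff), with the customer name."""
    sql = """
    WITH q_cte AS (
        SELECT q.Id, q.QuoteNumber, q.LastUpdated, q.PolicyId,
               ROW_NUMBER() OVER (PARTITION BY q.Id
                                  ORDER BY q.LastUpdated DESC) AS rn
        FROM Quotes q
        WHERE q.LastUpdated > ?
          AND q.QuoteNumber IS NOT NULL
          AND q.IsDiscarded = 0
    )
    SELECT q.LastUpdated, q.Id, q.QuoteNumber, p.CustomerName
    FROM q_cte q
    JOIN Policies p ON p.Id = q.PolicyId
    WHERE q.rn = 1            -- only the latest row per Id
    ORDER BY q.LastUpdated
    """
    return conn.execute(sql, (min_last_updated,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Quotes (Id INTEGER, QuoteNumber TEXT, LastUpdated INTEGER,
                     PolicyId INTEGER, IsDiscarded INTEGER);
CREATE TABLE Policies (Id INTEGER PRIMARY KEY, CustomerName TEXT, Blah1 TEXT);
INSERT INTO Quotes VALUES (1, 'C1000', 1000000000000, 100, 0),
                          (1, 'D2000', 1001111111110, 200, 0),
                          (2, 'A1000', 1000000000000, 300, 0),
                          (2, 'B2000', 1002222222222, 400, 0);
INSERT INTO Policies VALUES (100, 'Mark', 'someData'), (200, 'Lisa', 'someData2'),
                            (300, 'Brett', 'someData3'), (400, 'Goku', 'someData4');
""")

for row in latest_quotes(conn, 0):
    print(row)
```

With a cutoff of 0 this returns the two rows listed under DESIRED RESULT in the question.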

Select MAX() or Select TOP 1 on Join

I'm working with the following code to only get one associated person per case, using the MAX Associated Type to get the top 1.
Associated Type is not a GUID, rather looks like:
Responsible Party, Primary Physician, etc.
It just so happens that Responsible Party is the last alphabetical option, so it's a lucky workaround. Not every case has a responsible party, however, and if there isn't a responsible party, the next top associated person is 'good enough' and will be highlighted as a data error anyway.
The result shows every single associated person (rather than top 1), but shows all of them as Responsible Party, which is not true. What am I doing wrong here?
FROM T_LatestIFSP Ltst
LEFT OUTER JOIN (
SELECT
Clas.ClientCase_ID,
MAX(Astp.AssociatedType) AS AssociatedType
FROM
T_ClientAssociatedPerson Clas
Inner Join T_AssociatedType Astp
ON Clas.AssociatedType_ID = Astp.AssociatedType_ID
GROUP BY Clas.ClientCase_ID
) AS Astp ON Ltst.ClientCase_ID = Astp.ClientCase_ID
LEFT OUTER JOIN T_ClientAssociatedPerson Clas
on Clas.ClientCase_ID = Astp.ClientCase_ID
LEFT OUTER JOIN T_AssociatedPerson Aspr
ON Aspr.AssociatedPerson_ID = Clas.AssociatedPerson_ID
To get AssocId in the select, you have to do a self join.
LEFT OUTER JOIN
(your subselect with max(AssociatedType) in it) AS Astp
INNER JOIN T_AssociatedType AS Astp2
ON (whatever the primary key is on that table)
Then you can add astp2.AssociationTypeId to the original SELECT.
You can try this query.
Build an rn value from your ordering condition with CASE WHEN.
Then use RANK() as a window function in a subquery to number the rows, and keep only the rows where rnk = 1.
;WITH CTE AS (
SELECT ClientCase_ID,
AssociatedPerson_ID,
AssociatedPersonType,
AssociatedType_ID,
RANK() OVER(PARTITION BY ClientCase_ID ORDER BY rn desc,AssociatedPerson_ID) rnk
FROM (
SELECT t1.ClientCase_ID,
t1.AssociatedPerson_ID,
t1.AssociatedPersonType,
t1.AssociatedType_ID,
(CASE
WHEN t1.AssociatedPersonType = 'ResPonsible Party' then 16
WHEN t1.AssociatedPersonType = 'Primary Physician' then 15
ELSE 14
END) rn
FROM T t1
INNER JOIN T t2 ON t1.ClientCase_ID = t2.AssociatedPerson_ID
UNION ALL
SELECT t2.AssociatedPerson_ID,
t1.AssociatedPerson_ID,
t1.AssociatedPersonType,
t2.AssociatedType_ID,
(CASE
WHEN t2.AssociatedPersonType = 'ResPonsible Party' then 16
WHEN t2.AssociatedPersonType = 'Primary Physician' then 15
ELSE 14
END) rn
FROM T t1
INNER JOIN T t2 ON t1.ClientCase_ID = t2.AssociatedPerson_ID
) t1
)
select DISTINCT ClientCase_ID,AssociatedPerson_ID,AssociatedPersonType,AssociatedType_ID
FROM CTE
WHERE rnk = 1
Alternatively, you can use CROSS APPLY with VALUES instead of UNION ALL:
;with CTE AS (
SELECT v.*, (CASE
WHEN v.AssociatedPersonType = 'ResPonsible Party' then 16
WHEN v.AssociatedPersonType = 'Primary Physician' then 15
ELSE 14
END) rn
FROM T t1
INNER JOIN T t2 ON t1.ClientCase_ID = t2.AssociatedPerson_ID
CROSS APPLY (VALUES
(t1.ClientCase_ID,t1.AssociatedPerson_ID,t1.AssociatedPersonType, t1.AssociatedType_ID),
(t2.AssociatedPerson_ID,t1.AssociatedPerson_ID,t2.AssociatedPersonType, t2.AssociatedType_ID)
) v (ClientCase_ID,AssociatedPerson_ID,AssociatedPersonType,AssociatedType_ID)
)
SELECT distinct ClientCase_ID,AssociatedPerson_ID,AssociatedPersonType,AssociatedType_ID
FROM
(
SELECT *,
RANK() OVER(PARTITION BY ClientCase_ID ORDER BY rn desc,AssociatedPerson_ID) rnk
FROM CTE
) t1
WHERE rnk = 1
Note: you can add your own custom ordering values in the CASE WHEN expression.
[Results]:
| ClientCase_ID | AssociatedPerson_ID | AssociatedPersonType | AssociatedType_ID |
|---------------|---------------------|----------------------|-------------------|
| 01 | 01 | ResPonsible Party | 16 |
| 02 | 03 | Physician Therapist | 24 |
I solved the problem with the following code:
LEFT OUTER JOIN T_ClientAssociatedPerson Clas
on Clas.ClientCase_ID = Ltst.ClientCase_ID
and
CASE
WHEN Clas.AssociatedType_ID = 16 AND Clas.ClientCase_ID = Ltst.ClientCase_ID THEN 1
WHEN Clas.AssociatedType_ID <> 16 AND Clas.AssociatedType_ID = (
SELECT TOP 1 Clas.AssociatedType_ID
FROM T_ClientAssociatedPerson Clas
WHERE Clas.ClientCase_ID = Ltst.ClientCase_ID
ORDER BY AssociatedType_ID DESC
) THEN 1
ELSE 0
END = 1

SQL getting averages with multiple joins

I'm trying to write a single query using 3 tables.
The tables and their columns that I will be using are:
Sec – ID, Symbol
Hss – Code, HDate, Holiday
Fddd – ID, Date, Price
Given a symbol AAA, I need to get the ID from the first table and match it with the ID from the third table. The second table's date must match the third table's dates with the condition of Code=1 and Holiday=1.
The Dates in the second and third table are in ascending order with most recent dates at the bottom. I want to get the average 50 day and 200 day prices. The dates in the tables are in ascending order so I want to make it descending and select the top 50 and 200 to get the average prices.
So far I can only get one average. I cannot add a second SELECT TOP 50 or add a subquery within the second avg().
SELECT AVG(TwoHun)TwoHunAvg --, AVG(Fifty) AS FiftyAvg
FROM (SELECT TOP 200 Fddd.price AS TwoHun --, TOP 50 Fddd.price AS Fifty
FROM Sec
JOIN Fddd
ON Sec.ID = Fddd.ID AND Sec.symbol = 'AAA'
JOIN Hss
ON Fddd.date = Hss.Hdate AND Hss.Code = 1 AND Hss.Holiday = 1
ORDER BY Fddd.Date DESC) AS tmp;
Thanks in advance!
Consider a union query, which also scales to other averages. I added a Type column to label each average.
SELECT '200 DAY AVG' As Type, AVG(TwoHun) As Avg
FROM
(SELECT TOP 200 Fddd.price AS TwoHun
FROM Sec
INNER JOIN Fddd ON Sec.ID = Fddd.ID
INNER JOIN Hss ON Fddd.date = Hss.Hdate
WHERE Sec.symbol = 'AAA' AND Hss.Code = 1 AND Hss.Holiday = 1
ORDER BY Fddd.Date DESC) AS tmp
UNION
SELECT '50 DAY AVG' As Type, AVG(FiftyHun) As Avg
FROM
(SELECT TOP 50 Fddd.price AS FiftyHun
FROM Sec
INNER JOIN Fddd ON Sec.ID = Fddd.ID
INNER JOIN Hss ON Fddd.date = Hss.Hdate
WHERE Sec.symbol = 'AAA' AND Hss.Code = 1 AND Hss.Holiday = 1
ORDER BY Fddd.Date DESC) AS tmp;
Also, I moved some of your join conditions to the WHERE clause, which should not change performance but does improve readability.
I suspect you're using SQL Server or MS Access.
One quick solution would be to have your total query as a subquery and then copy a modified version as a second subquery.
Quick rough example:
SELECT (SELECT
AVG(Fifty) AS FiftyAvg
FROM (SELECT TOP 50
Fddd.price AS Fifty
FROM Sec
JOIN Fddd
ON Sec.ID = Fddd.ID
AND Sec.symbol = 'AAA'
JOIN Hss
ON Fddd.date = Hss.Hdate
AND Hss.Code = 1
AND Hss.Holiday = 1
ORDER BY Fddd.Date DESC) AS tmp)
AS FiftyAvg,
(SELECT
AVG(TwoHun) TwoHunAvg
FROM (SELECT TOP 200
Fddd.price AS TwoHun
FROM Sec
JOIN Fddd
ON Sec.ID = Fddd.ID
AND Sec.symbol = 'AAA'
JOIN Hss
ON Fddd.date = Hss.Hdate
AND Hss.Code = 1
AND Hss.Holiday = 1
ORDER BY Fddd.Date DESC) AS tmp)
AS TwoHundredAverage;
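Both answers above repeat the whole join once per average. A different option, not taken from either answer, is to number the qualifying rows by date once and compute both averages with conditional aggregation. The sketch below tries this in Python with sqlite3 (SQLite 3.25 or later, not SQL Server or Access). The table layouts, the six sample prices and the short window sizes (2 and 4 instead of 50 and 200) are all made up so the result is easy to check by hand.

```python
import sqlite3

def moving_averages(conn, symbol, short_n, long_n):
    """Average price over the latest short_n and long_n qualifying days."""
    sql = """
    WITH ranked AS (
        SELECT Fddd.Price,
               ROW_NUMBER() OVER (ORDER BY Fddd.Date DESC) AS rn
        FROM Sec
        JOIN Fddd ON Sec.ID = Fddd.ID
        JOIN Hss ON Fddd.Date = Hss.HDate
        WHERE Sec.Symbol = ? AND Hss.Code = 1 AND Hss.Holiday = 1
    )
    SELECT AVG(CASE WHEN rn <= ? THEN Price END) AS ShortAvg,  -- NULLs are ignored
           AVG(Price) AS LongAvg
    FROM ranked
    WHERE rn <= ?
    """
    return conn.execute(sql, (symbol, short_n, long_n)).fetchone()

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sec (ID INTEGER PRIMARY KEY, Symbol TEXT);
CREATE TABLE Hss (Code INTEGER, HDate TEXT, Holiday INTEGER);
CREATE TABLE Fddd (ID INTEGER, Date TEXT, Price REAL);
INSERT INTO Sec VALUES (1, 'AAA');
""")
# Hypothetical data: prices 10, 20, ..., 60 on six consecutive days.
days = ["2020-01-0%d" % d for d in range(1, 7)]
conn.executemany("INSERT INTO Fddd VALUES (1, ?, ?)",
                 [(day, 10.0 * (i + 1)) for i, day in enumerate(days)])
# 2020-01-05 is marked Holiday = 0, so it is filtered out by the join condition.
conn.executemany("INSERT INTO Hss VALUES (1, ?, ?)",
                 [(day, 0 if day == "2020-01-05" else 1) for day in days])

print(moving_averages(conn, "AAA", 2, 4))
```

With this data the qualifying prices, newest first, are 60, 40, 30, 20, 10, so the 2-day average is 50.0 and the 4-day average is 37.5. For the real tables you would pass 50 and 200.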

Selecting max value from 2nd table in first table results

I have 2 tables as below-
Table I
ID DATE
1 05/11/12
2 23/11/12
3 29/11/12
4 04/10/12
5 20/11/12
And another table (IH) with the following info-
ID RECNO NOTE
1 1 Open
1 2 Update
1 3 Close
2 1 Open
2 2 Update
2 3 Hold
2 4 Close
3 1 Open
4 1 Open
4 2 Update
5 1 Open
I would like to output a result as shown below, displaying the Note field using the highest value of RecNo for each ID. So using the data above the output should be-
ID DATE NOTE
2 23/11/12 Close
3 29/11/12 Open
The code I have is-
SELECT I.ID, I.DATE, IH.NOTE FROM
I I, IH IH
JOIN (SELECT MAX([RECNO]) [RECNO] FROM
IH
GROUP BY RECNO) IH2 ON IH2.ID = IH.ID AND
IH2.[RECNO] = IH.[RECNO]
JOIN I I2 ON I2.ID = IH.ID WHERE
(I2.DATE>={TS ‘2012-11-22 00:00:002}) GROUP BY I2.ID
However when I execute the code I get-
Invalid Column Name 'RECNO'. Statement(s) could not be prepared.
How about this? Note, haven't tried it, I'm on my Mac at the moment.
SELECT I.ID, I.DATE, IH.NOTE
FROM I I
OUTER APPLY
(SELECT TOP 1 *
FROM IH
WHERE IH.ID = I.ID
ORDER BY RECNO DESC) IH
WHERE I.DATE >= '2012-11-22'
Your SQL is rather, uh, messy.
Assuming you are using SQL Server 2005 or greater, you can use the row_number() function, as follows:
SELECT I.ID, I.DATE, IH.NOTE
FROM I join
(select ih.*, ROW_NUMBER() over (PARTITION by id order by recno desc) as seqnum
from IH
) ih
on ih.ID = I.ID and ih.seqnum = 1
WHERE I.DATE >= '2012-11-22'
This assigns a sequence number to each row of the IH table within its id, with the highest record number getting the value 1. The rest is just SQL.
Your original query is simply not correct syntactically, but I think this is what you want based on the description.
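As a quick check of the sequence-number approach, here is a sketch in Python with sqlite3 (SQLite 3.25 or later, not SQL Server) loaded with the rows from the question. The dates are stored as ISO strings, which is an assumption made so that plain string comparison orders them correctly, and the subquery alias `latest` is only there for readability.

```python
import sqlite3

def latest_notes(conn, since):
    """For each I row dated on or after `since`, return the highest-RECNO note."""
    sql = """
    SELECT I.ID, I.DATE, latest.NOTE
    FROM I
    JOIN (
        SELECT IH.*,
               ROW_NUMBER() OVER (PARTITION BY ID ORDER BY RECNO DESC) AS seqnum
        FROM IH
    ) latest ON latest.ID = I.ID AND latest.seqnum = 1
    WHERE I.DATE >= ?
    ORDER BY I.ID
    """
    return conn.execute(sql, (since,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE I (ID INTEGER PRIMARY KEY, DATE TEXT);
CREATE TABLE IH (ID INTEGER, RECNO INTEGER, NOTE TEXT);
INSERT INTO I VALUES (1, '2012-11-05'), (2, '2012-11-23'), (3, '2012-11-29'),
                     (4, '2012-10-04'), (5, '2012-11-20');
INSERT INTO IH VALUES (1, 1, 'Open'), (1, 2, 'Update'), (1, 3, 'Close'),
                      (2, 1, 'Open'), (2, 2, 'Update'), (2, 3, 'Hold'), (2, 4, 'Close'),
                      (3, 1, 'Open'),
                      (4, 1, 'Open'), (4, 2, 'Update'),
                      (5, 1, 'Open');
""")

for row in latest_notes(conn, "2012-11-22"):
    print(row)
```

This returns ID 2 with Close and ID 3 with Open, which matches the output asked for in the question.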
and another one
SELECT I.ID, I.DATE
,(Select TOP 1 IH.NOTE FROM IH where IH.ID=i.ID Order by Recno DESC) as Note
from I
WHERE
I.DATE>'20121122'
maybe this will help
SELECT a.ID, a.DATE, b.NOTE FROM a
inner join b on a.ID = b.ID
where b.recno in (select max(bb.recno)
from b as bb where bb.id = b.id)
http://sqlfiddle.com/#!3/fd141/2
If you don't mind the different identifiers, look at this solution:
select t1.MyID, t1.MyDate, y.Note
from t1
join
(
select MyID, max(RecNo) as RecNo
from t2
group by MyID
) x
on t1.MyID = x.MyID
left join
(
select *
from t2
) y
on t1.MyID = y.MyID
and x.RecNo = y.RecNo
where t1.MyDate >= '2012.11.22'
The complete solution is here: http://sqlfiddle.com/#!3/4ca09/3
Update: Oops, forgot to bring in the date in where clause. Updated SQL Fiddle and the query above.