SQL Server Outer Apply Query Optimization

I have two tables - Table A and Table B.
Table A has 121,903 rows. Table B has only 95 rows.
I need to join Table A with Table B so that, for each row of Table A, I get the first matching row of Table B according to a sort criterion.
I am using the following query to get the results. It returns the correct results but has performance issues.
;WITH [TableAB] AS
(
SELECT * FROM #TableA A
OUTER APPLY
(
SELECT TOP 1 * FROM #TableB
WHERE
([Col1] = A.[Col1] OR [Col1] IS NULL)
AND ([Col2] = A.[Col2] OR [Col2] IS NULL)
AND ([Col3] = A.[Col3] OR [Col3] IS NULL)
AND ([Col4] = A.[Col4] OR [Col4] IS NULL)
AND ([Col5] = A.[Col5] OR [Col5] IS NULL)
AND ([Col6] = A.[Col6] OR [Col6] IS NULL)
AND ([Col7] = A.[Col7] OR [Col7] IS NULL)
AND ([Col8] = A.[Col8] OR [Col8] IS NULL)
AND ([Col9] IS NULL)
AND ([Col10] IS NULL)
AND ([Col11] = A.[Col11] OR [Col11] IS NULL)
AND ([Col12] = A.[Col12] OR [Col12] IS NULL)
AND ([Col13] = A.[Col13] OR [Col13] IS NULL)
AND ([Col14] = A.[Col14] OR [Col14] IS NULL)
AND ([Col15] = A.[Col15] OR [Col15] IS NULL)
AND ([Col16] = A.[Col16] OR [Col16] IS NULL)
AND ([Col17] = A.[Col17] OR [Col17] IS NULL)
AND ([Col18] = A.[Col18] OR [Col18] IS NULL)
AND ([Col19] = A.[Col19] OR [Col19] IS NULL)
AND ([Col20] = A.[Col20] OR [Col20] IS NULL)
ORDER BY [SortCriteria]
) B
)
SELECT * FROM [TableAB]
It currently takes ~1 minute to execute this query. Is there any way I can rewrite the query to improve the performance?
Note that it is a data warehouse system, the above query is part of a large query which uses CTE table "TableAB".
Thanks.

Since the bulk of the execution is spent sorting TableB, the most likely candidate for improving performance would be to add an index that covers SortCriteria and INCLUDES all the columns in TableB that are selected in the query.

Related

oracle "merge into" too slow

I'm trying to update a column in a table using the id from another table, but only if one or two fields match. Sadly, the query runs very slowly and I don't understand why.
PS: the fields checked in table A may be null or have leading/trailing spaces.
MERGE INTO B B1
USING (
SELECT B2.LUSERINVENTORYID LUSERINVENTORYID, a1.lastid lastid
FROM B B2,
(SELECT lastid,
TRIM(UPPER(serialno)) AS serialno,
TRIM(UPPER(barcode)) AS barcode
FROM A) a1
WHERE (B2.loaded_serialno = a1.serialno AND B2.loaded_barcode = a1.barcode)
OR (B2.loaded_serialno = a1.serialno AND B2.loaded_barcode IS NULL)
OR (B2.loaded_serialno IS NULL AND B2.loaded_barcode = a1.barcode)
) res
ON (B1.luserinventoryid = res.luserinventoryid)
WHEN MATCHED THEN
UPDATE SET B1.lassetinvolvedid = res.lastid
Can somebody please tell me how I can improve the execution time of this merge?

Without looking at your execution plan or knowing your data, we can only guess. That being said, at first glance I can tell you that you are almost certain to have problems stemming from those OR clauses in your join. If you can rewrite this to use a definite join column instead of all these conditions, you'll be much better off.
If you can't, you may also try the hint /*+ use_concat */ and Oracle might rewrite it as three UNION ALL sets with a single-column definite join in each one, which is basically rewriting it for you.
MERGE INTO b b1
USING (
SELECT /*+ use_concat */ b2.luserinventoryid, a1.lastid
FROM b b2,
(SELECT lastid,
TRIM(UPPER(serialno)) AS serialno,
TRIM(UPPER(barcode)) AS barcode
FROM a) a1
WHERE (b2.loaded_serialno = a1.serialno AND b2.loaded_barcode = a1.barcode)
OR (b2.loaded_serialno = a1.serialno AND b2.loaded_barcode IS NULL)
OR (b2.loaded_serialno IS NULL AND b2.loaded_barcode = a1.barcode)
) res
ON (b1.luserinventoryid = res.luserinventoryid)
WHEN MATCHED THEN
UPDATE SET b1.lassetinvolvedid = res.lastid;
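To make the OR-expansion idea concrete, here is a minimal sketch that compares the OR join with a hand-written three-branch UNION ALL. It runs on SQLite through Python's sqlite3 module purely as a stand-in for Oracle, so it says nothing about Oracle's actual plan; the sample rows and the a_norm view are assumptions made up for illustration. The three branches are mutually exclusive here, which is why UNION ALL does not introduce duplicates.

```python
import sqlite3

# SQLite stands in for Oracle; table contents are invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (lastid INTEGER, serialno TEXT, barcode TEXT);
    CREATE TABLE b (luserinventoryid INTEGER, loaded_serialno TEXT, loaded_barcode TEXT);
    INSERT INTO a VALUES (1, ' sn1 ', 'bc1'), (2, 'sn2', NULL), (3, NULL, ' bc3');
    INSERT INTO b VALUES (10, 'SN1', 'BC1'), (20, 'SN2', NULL),
                         (30, NULL, 'BC3'), (40, 'SN9', 'BC9');
    -- a_norm is a hypothetical helper view with the normalized columns.
    CREATE VIEW a_norm AS
        SELECT lastid,
               TRIM(UPPER(serialno)) AS serialno,
               TRIM(UPPER(barcode))  AS barcode
        FROM a;
""")

# The join as written in the question: one predicate with three OR branches.
OR_JOIN = """
    SELECT b.luserinventoryid, a_norm.lastid
    FROM b, a_norm
    WHERE (b.loaded_serialno = a_norm.serialno AND b.loaded_barcode = a_norm.barcode)
       OR (b.loaded_serialno = a_norm.serialno AND b.loaded_barcode IS NULL)
       OR (b.loaded_serialno IS NULL AND b.loaded_barcode = a_norm.barcode)
"""

# Manual expansion: each branch has a plain equality join.
UNION_ALL_JOIN = """
    SELECT b.luserinventoryid, a_norm.lastid
    FROM b JOIN a_norm
      ON b.loaded_serialno = a_norm.serialno AND b.loaded_barcode = a_norm.barcode
    UNION ALL
    SELECT b.luserinventoryid, a_norm.lastid
    FROM b JOIN a_norm ON b.loaded_serialno = a_norm.serialno
    WHERE b.loaded_barcode IS NULL
    UNION ALL
    SELECT b.luserinventoryid, a_norm.lastid
    FROM b JOIN a_norm ON b.loaded_barcode = a_norm.barcode
    WHERE b.loaded_serialno IS NULL
"""

or_rows = sorted(conn.execute(OR_JOIN).fetchall())
union_rows = sorted(conn.execute(UNION_ALL_JOIN).fetchall())
print(or_rows == union_rows, union_rows)
```

If the hint is not honored, the same three-branch query could be used directly as the USING source of the MERGE.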

Without your data it is difficult to determine but you appear to perform a self-join on B in the ON clause of the merge and if there is a 1-to-1 correspondence (i.e. you are joining on a field with a UNIQUE key) then you could possibly skip that and merge A and B directly:
MERGE INTO B
USING (
SELECT lastid,
TRIM(UPPER(serialno)) AS serialno,
TRIM(UPPER(barcode)) AS barcode
FROM A
) A
ON ( (B.loaded_serialno = a.serialno AND B.loaded_barcode = a.barcode)
OR (B.loaded_serialno = a.serialno AND B.loaded_barcode IS NULL)
OR (B.loaded_serialno IS NULL AND B.loaded_barcode = a.barcode)
)
WHEN MATCHED THEN
UPDATE SET B.lassetinvolvedid = A.lastid;

Rewrite query without using temp table

I have a query that uses a temp table to insert some data and then runs another select against it to extract distinct results. The query by itself was fine, but now with Entity Framework it is causing all kinds of unexpected errors at the wrong time.
Is there any way I can rewrite the query not to use a temp table? When this is converted into a stored procedure and in entity framework the result set is of type int which throws an error:
Could not find an implementation of the query pattern Select not found.
Here is the query
Drop Table IF EXISTS #Temp
SELECT
a.ReceiverID,
a.AntennaID,
a.AntennaName into #Temp
FROM RFIDReceiverAntenna a
full join Station b ON (a.ReceiverID = b.ReceiverID) and (a.AntennaID = b.AntennaID)
where (a.ReceiverID is NULL or b.ReceiverID is NULL)
and (a.AntennaID IS NULL or b.antennaID is NULL)
select distinct r.ReceiverID, r.ReceiverName, r.receiverdescription
from RFIDReceiver r
inner join #Temp t on r.ReceiverID = t.ReceiverID;

No need for anything fancy, you can just replace the reference to #temp with an inner sub-query containing the query that generates #temp e.g.
select distinct r.ReceiverID, r.ReceiverName, r.receiverdescription
from RFIDReceiver r
inner join (
select
a.ReceiverID,
a.AntennaID,
a.AntennaName
from RFIDReceiverAntenna a
full join Station b ON (a.ReceiverID = b.ReceiverID) and (a.AntennaID = b.AntennaID)
where (a.ReceiverID is NULL or b.ReceiverID is NULL)
and (a.AntennaID IS NULL or b.antennaID is NULL)
) t on r.ReceiverID = t.ReceiverID;
PS: I haven't made any effort to improve the query overall like Gordon has but do consider his suggestions.

First, a full join makes no sense in the first query. You are selecting only columns from the first table, so a LEFT JOIN from that table is all you need.
Second, you can use a CTE.
Third, you should be able to get rid of the SELECT DISTINCT by using an EXISTS condition.
I would suggest:
WITH ra AS (
SELECT ra.*
FROM RFIDReceiverAntenna ra LEFT JOIN
Station s
ON s.ReceiverID = ra.ReceiverID AND
s.AntennaID = ra.AntennaID
WHERE s.ReceiverID is NULL
)
SELECT r.ReceiverID, r.ReceiverName, r.receiverdescription
FROM RFIDReceiver r
WHERE EXISTS (SELECT 1
FROM ra
WHERE r.ReceiverID = ra.ReceiverID
);
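As a rough check of the EXISTS suggestion, the following sketch runs a join without DISTINCT, a join with DISTINCT, and the EXISTS form against a few invented rows. SQLite (via Python's sqlite3) is only a stand-in for SQL Server, and the sample data and the CTE name unassigned are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE RFIDReceiver (ReceiverID INTEGER, ReceiverName TEXT, receiverdescription TEXT);
    CREATE TABLE RFIDReceiverAntenna (ReceiverID INTEGER, AntennaID INTEGER, AntennaName TEXT);
    CREATE TABLE Station (ReceiverID INTEGER, AntennaID INTEGER);
    INSERT INTO RFIDReceiver VALUES (1, 'R1', 'dock'), (2, 'R2', 'gate'), (3, 'R3', 'yard');
    INSERT INTO RFIDReceiverAntenna VALUES
        (1, 1, 'A11'), (1, 2, 'A12'), (2, 1, 'A21'), (3, 1, 'A31'), (3, 2, 'A32');
    INSERT INTO Station VALUES (2, 1), (3, 1);
""")

# Antennas with no matching Station row (LEFT JOIN ... IS NULL anti-join).
CTE = """
    WITH unassigned AS (
        SELECT ra.*
        FROM RFIDReceiverAntenna ra
        LEFT JOIN Station s
          ON s.ReceiverID = ra.ReceiverID AND s.AntennaID = ra.AntennaID
        WHERE s.ReceiverID IS NULL
    )
"""
COLS = "r.ReceiverID, r.ReceiverName, r.receiverdescription"
JOIN = " FROM RFIDReceiver r JOIN unassigned u ON u.ReceiverID = r.ReceiverID"

# A plain join repeats a receiver once per unassigned antenna.
join_rows = conn.execute(CTE + "SELECT " + COLS + JOIN).fetchall()
# DISTINCT removes the repeats after the fact.
distinct_rows = sorted(conn.execute(CTE + "SELECT DISTINCT " + COLS + JOIN).fetchall())
# EXISTS never produces the repeats in the first place.
exists_rows = sorted(conn.execute(
    CTE + "SELECT " + COLS + " FROM RFIDReceiver r"
    " WHERE EXISTS (SELECT 1 FROM unassigned u WHERE u.ReceiverID = r.ReceiverID)"
).fetchall())

print(len(join_rows), distinct_rows == exists_rows, exists_rows)
```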

You can use a CTE instead of the temp table:
WITH
CTE
AS
(
SELECT
a.ReceiverID,
a.AntennaID,
a.AntennaName
FROM
RFIDReceiverAntenna a
full join Station b
ON (a.ReceiverID = b.ReceiverID)
and (a.AntennaID = b.AntennaID)
where
(a.ReceiverID is NULL or b.ReceiverID is NULL)
and (a.AntennaID IS NULL or b.antennaID is NULL)
)
select distinct
r.ReceiverID, r.ReceiverName, r.receiverdescription
from
RFIDReceiver r
inner join CTE t on r.ReceiverID = t.ReceiverID
;
This query will return the same results as your original query with the temp table, but its performance may be quite different: not necessarily slower, and possibly faster. It is just something you should be aware of.

Slowness in update query using inner join

I am using the query below to update one column based on the specified conditions. It uses an inner join, but it takes more than 15 seconds to run even when there are no records to update (0 records).
UPDATE CONFIGURATION_LIST
SET DUPLICATE_SERIAL_NUM = 0
FROM CONFIGURATION_LIST
INNER JOIN (SELECT DISTINCT APPLIED_MAT_CODE, APPLIED_SERIAL_NUMBER, COUNT(*) AS NB
FROM CONFIGURATION_LIST
WHERE
PLANT = '0067'
AND APPLIED_SERIAL_NUMBER IS NOT NULL
AND APPLIED_SERIAL_NUMBER !=''
AND DUPLICATE_SERIAL_NUM = 1
GROUP BY
APPLIED_MAT_CODE, APPLIED_SERIAL_NUMBER
HAVING
COUNT(*) = 1) T2 ON T2.APPLIED_SERIAL_NUMBER = CONFIGURATION_LIST.APPLIED_SERIAL_NUMBER
AND T2.APPLIED_MAT_CODE = CONFIGURATION_LIST.APPLIED_MAT_CODE
WHERE
CONFIGURATION_LIST.PLANT = '0067'
AND DUPLICATE_SERIAL_NUM = 1
There is an index on APPLIED_SERIAL_NUMBER and APPLIED_MAT_CODE, and fragmentation is fine.
Could you please help me with the performance of the above query?

First, you don't need the DISTINCT when using GROUP BY. SQL Server probably ignores it, but it is a bad idea anyway:
UPDATE CONFIGURATION_LIST
SET DUPLICATE_SERIAL_NUM = 0
FROM CONFIGURATION_LIST INNER JOIN
(SELECT APPLIED_MAT_CODE, APPLIED_SERIAL_NUMBER, COUNT(*) AS NB
FROM CONFIGURATION_LIST cl
WHERE cl.PLANT = '0067' AND
cl.APPLIED_SERIAL_NUMBER IS NOT NULL AND
cl.APPLIED_SERIAL_NUMBER <> '' AND
cl.DUPLICATE_SERIAL_NUM = 1
GROUP BY cl.APPLIED_MAT_CODE, cl.APPLIED_SERIAL_NUMBER
HAVING COUNT(*) = 1
) T2
ON T2.APPLIED_SERIAL_NUMBER = CONFIGURATION_LIST.APPLIED_SERIAL_NUMBER AND
T2.APPLIED_MAT_CODE = CONFIGURATION_LIST.APPLIED_MAT_CODE
WHERE CONFIGURATION_LIST.PLANT = '0067' AND
DUPLICATE_SERIAL_NUM = 1;
For this query, you want the following index: CONFIGURATION_LIST(PLANT, DUPLICATE_SERIAL_NUM, APPLIED_SERIAL_NUMBER, APPLIED_MAT_CODE).
The HAVING COUNT(*) = 1 suggests that you might really want NOT EXISTS (which would normally be faster). But you don't really explain what the query is supposed to be doing, you only say that this code is slow.

Looks like you're checking the table for rows that exist in the same table with the same values, and if not, update the duplicate column to zero. If your table has a unique key (identity field or composite key), you could do something like this:
UPDATE C
SET C.DUPLICATE_SERIAL_NUM = 0
FROM
CONFIGURATION_LIST C
where
not exists (
select
1
FROM
CONFIGURATION_LIST C2
where
C2.APPLIED_SERIAL_NUMBER = C.APPLIED_SERIAL_NUMBER and
C2.APPLIED_MAT_CODE = C.APPLIED_MAT_CODE and
C2.UNIQUE_KEY_HERE != C.UNIQUE_KEY_HERE
) and
C.PLANT = '0067' and
C.DUPLICATE_SERIAL_NUM = 1
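A small sketch of the NOT EXISTS approach, run on SQLite through Python's sqlite3 as a stand-in for SQL Server. The id column plays the role of UNIQUE_KEY_HERE, and the rows are invented; both are assumptions. Because SQLite's UPDATE has no FROM alias in this form, the outer table is referenced by its full name and only the inner copy is aliased.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE configuration_list (
        id INTEGER PRIMARY KEY,          -- stands in for UNIQUE_KEY_HERE
        plant TEXT,
        applied_mat_code TEXT,
        applied_serial_number TEXT,
        duplicate_serial_num INTEGER
    );
    INSERT INTO configuration_list VALUES
        (1, '0067', 'M1', 'S1', 1),   -- true duplicate of row 2
        (2, '0067', 'M1', 'S1', 1),
        (3, '0067', 'M2', 'S2', 1),   -- flagged, but no other row shares its key
        (4, '0068', 'M3', 'S3', 1);   -- different plant, must stay untouched
""")

# Clear the flag only where no other row has the same material/serial pair.
conn.execute("""
    UPDATE configuration_list
    SET duplicate_serial_num = 0
    WHERE plant = '0067'
      AND duplicate_serial_num = 1
      AND NOT EXISTS (
          SELECT 1
          FROM configuration_list c2
          WHERE c2.applied_serial_number = configuration_list.applied_serial_number
            AND c2.applied_mat_code = configuration_list.applied_mat_code
            AND c2.id <> configuration_list.id
      )
""")

flags = dict(conn.execute("SELECT id, duplicate_serial_num FROM configuration_list"))
print(flags)
```

Note that, as in the answer's query, the inner lookup is not restricted to the same plant or to flagged rows, so it is not a line-for-line equivalent of the original GROUP BY ... HAVING COUNT(*) = 1 version.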

I would try with a select first:
select APPLIED_MAT_CODE, APPLIED_SERIAL_NUMBER, count(*) as n
from CONFIGURATION_LIST cl
where
cl.PLANT='0067' and
cl.APPLIED_SERIAL_NUMBER IS NOT NULL and
cl.APPLIED_SERIAL_NUMBER <> ''
group by APPLIED_MAT_CODE, APPLIED_SERIAL_NUMBER;
How many rows do you get with this and how long does it take?
If you remove your DUPLICATE_SERIAL_NUM column from your table it might be very simple. The DUPLICATE_SERIAL_NUM suggests that you are searching for duplicates. As you count your rows you could introduce a simple table that contains the counts:
create table CLCOUNT ( N int unsigned, C int /* or what APPLIED_MAT_CODE is */, S int /* or what APPLIED_SERIAL_NUMBER is */, PLANT char(20) /* or what PLANT is */, unique index (C,S,PLANT), index(PLANT,N));
insert into CLCOUNT select count(*), cl.APPLIED_MAT_CODE, cl.APPLIED_SERIAL_NUMBER, cl.PLANT
from CONFIGURATION_LIST cl
where
cl.PLANT='0067' and
cl.APPLIED_SERIAL_NUMBER IS NOT NULL and
cl.APPLIED_SERIAL_NUMBER <> ''
group by APPLIED_MAT_CODE, APPLIED_SERIAL_NUMBER;
How long does this take?
Now you can simply select * from CLCOUNT where PLANT='0067' and N=1;
This is all far from being perfect. But you should be able to analyze (EXPLAIN SELECT ...) your queries and find why it takes so long.

update rows from joined tables in oracle

I'm trying to migrate some tables into an existing table, I need to perform the updates only where DET_ATTACHMENT_ID equals DET_ATTACHMENT.ID, here's the query I have so far.
UPDATE DET_ATTACHMENT
SET attachment_type = 'LAB', -- being added by the query, to replace the table difference
payer_criteria_id = (
SELECT PAYER_CRITERIA_ID
FROM DET_LAB_ATTACHMENT
WHERE DET_LAB_ATTACHMENT.DET_ATTACHMENT_ID = DET_ATTACHMENT.ID)
WHERE exists(
SELECT DET_ATTACHMENT_ID
FROM DET_ATTACHMENT
JOIN DET_LAB_ATTACHMENT ON (ID = DET_ATTACHMENT_ID)
WHERE DET_ATTACHMENT_ID = DET_ATTACHMENT.ID
The problem with the existing query is that it's setting every row to have an attachment_type of "LAB", and nulling out the payer_criteria_id where it didn't match. What am I doing wrong?

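The problem is most likely that your exists(...) predicate always evaluates to true, thus making the update run for all rows of det_attachment. Inside the subquery, DET_ATTACHMENT.ID refers to the DET_ATTACHMENT in the subquery's own FROM clause, not to the row being updated, so the predicate is not correlated with the outer statement. Try it this way, with an alias on the outer table: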
UPDATE DET_ATTACHMENT X
SET X.attachment_type = 'LAB',
X.payer_criteria_id = (
SELECT C.PAYER_CRITERIA_ID
FROM DET_LAB_ATTACHMENT C
WHERE C.DET_ATTACHMENT_ID = X.ID
)
WHERE
exists(
SELECT 1
FROM DET_ATTACHMENT A
JOIN DET_LAB_ATTACHMENT B
ON B.DET_ATTACHMENT_ID = A.ID
where B.det_attachment_id = X.id
)
;
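The difference between the uncorrelated and the correlated EXISTS can be reproduced on a tiny data set. This is only a sketch on SQLite via Python's sqlite3, not Oracle; the rows, the run_update helper, and the slightly simplified correlated statement (without the redundant inner join to DET_ATTACHMENT) are assumptions for illustration.

```python
import sqlite3

def run_update(update_sql):
    """Build a fresh sample database, apply the UPDATE, return all rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE det_attachment (id INTEGER, attachment_type TEXT, payer_criteria_id INTEGER);
        CREATE TABLE det_lab_attachment (det_attachment_id INTEGER, payer_criteria_id INTEGER);
        INSERT INTO det_attachment VALUES (1, 'OLD', NULL), (2, 'OLD', NULL), (3, 'OLD', 7);
        INSERT INTO det_lab_attachment VALUES (1, 100), (3, 300);
    """)
    conn.execute(update_sql)
    return conn.execute("SELECT * FROM det_attachment ORDER BY id").fetchall()

SET_CLAUSE = """
    UPDATE det_attachment
    SET attachment_type = 'LAB',
        payer_criteria_id = (SELECT c.payer_criteria_id
                             FROM det_lab_attachment c
                             WHERE c.det_attachment_id = det_attachment.id)
"""

# As in the question: det_attachment.id binds to the inner FROM, so the
# subquery does not depend on the row being updated.
uncorrelated = run_update(SET_CLAUSE + """
    WHERE EXISTS (SELECT det_attachment_id
                  FROM det_attachment
                  JOIN det_lab_attachment ON (id = det_attachment_id)
                  WHERE det_attachment_id = det_attachment.id)
""")

# Correlated: the subquery only looks at lab rows for the current outer row.
correlated = run_update(SET_CLAUSE + """
    WHERE EXISTS (SELECT 1
                  FROM det_lab_attachment b
                  WHERE b.det_attachment_id = det_attachment.id)
""")

print(uncorrelated)
print(correlated)
```

In the first run, row 2 has no lab attachment yet still gets 'LAB' and a NULL payer_criteria_id, which is the symptom described in the question; in the second run it is left alone.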

Linq Left Join returns repeated same rows for all set

My LINQ query returns the same row repeated for the whole result set.
When I run this SQL in the DB:
select top 20 * from t1
left join t2
on t1.sid= t2.sid
and t1.pid=t2.pid
where(t2.sid is null and t1.pid='r')
I get 20 different rows in the result.
Then I wrote the LINQ:
Entities dbconn = new Entities();
List<t1> myResult = (
from t1Data in dbconn.t1
join t2Data in dbconn.t2
on new { sid = (int)t1.sid, pid= t1.pid}
equals new { sid= (int)t2.sid, pid= t2.pid}
into joinSet
from joinUnit in joinSet.DefaultIfEmpty()
where (joinUnit == null) && (t1.pid== "r")
select t1Data
).Take(20).ToList();
All rows of the result are the same row.

select t1Data is wrong as t1Data is from the original dataset.
Instead, select the joined result: select joinSet.
