SORT Operation in an Update Statement - sql

In the SQL below, why is my Update statement performing a SORT operation? The cost of the SORT operation is 41% and I would like to avoid it.
declare @m_table as table (oh_job_cons_id varchar(36))
Insert into @m_table
select top 100 oh_job_cons_id
from oh_job_cons with (nolock)
-- select * from @m_table
Update j
set oh_locked_by_user_id = null,
oh_locked_on = null
from oh_job_cons j with (nolock)
join @m_table m on j.oh_job_cons_id = m.oh_job_cons_id

The SORT in the update operation is probably due to the join condition (@m_table m on j.oh_job_cons_id = m.oh_job_cons_id), particularly if the "oh_job_cons_id" column is not the primary key of the oh_job_cons table.
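One thing that sometimes helps here (a sketch only, reusing the table and column names from the question) is giving the table variable a primary key on the join column, so the optimizer has a unique, ordered structure to join against and may not need a separate sort:
-- Sketch: same statement as above, but the table variable now has a
-- primary key on the join column (assumes the inserted ids are unique)
declare @m_table as table (oh_job_cons_id varchar(36) primary key)

Insert into @m_table
select top 100 oh_job_cons_id
from oh_job_cons with (nolock)

Update j
set oh_locked_by_user_id = null,
    oh_locked_on = null
from oh_job_cons j
join @m_table m on j.oh_job_cons_id = m.oh_job_cons_id
Whether the SORT actually disappears depends on the plan the optimizer picks; an index on oh_job_cons(oh_job_cons_id), if one does not already exist, is the other usual fix.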

Related

How to loop through Delete one row at a time

I am wondering if there is a way to restructure the SQL below so that one row is deleted at a time, as opposed to performing the delete in one mass operation. The reason is that the delete action fires a trigger on this table, and (in cases where a USER_ID has more than one row) the trigger attempts to insert data into another table that has a datetime stamp as a key; the same time (to the millisecond) is inserted for every row, causing a duplicate key insert error.
DELETE ORDERS
FROM LINE_ORDER ORDERS
INNER JOIN LINE_ORDER_XREF B ON B.OPRID = ORDERS.USER_ID
WHERE B.USERID = 'SYSACCT'
The thought was that if each row is deleted separately in its own transaction, then each datetime stamp will be unique. The number of delete operations will be low and the additional processing time is not a concern in this case. Is it possible to structure this as a loop or use a cursor? The primary ID columns in LINE_ORDER (USER_ID and USER_ROLE) are varchar columns, so I don't believe I can increment them.
USER_ID    USER_ROLE  DYNAMIC_SW
11000_600  E_SAML     N
11000_602  E_SAML     N
11000_602  SUPRV      N
11000_604  E_PRO      N
11000_605  E_SAML     N
Well, you can use TOP for this purpose:
DELETE o
FROM (SELECT TOP (1) o.*
      FROM LINE_ORDER o INNER JOIN
           LINE_ORDER_XREF lox
           ON lox.OPRID = o.USER_ID
      WHERE lox.USERID = 'SYSACCT'
     ) o;
You then need to embed this in a loop to delete all the matching values.
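A minimal sketch of that loop (table and column names taken from the question); each DELETE is its own statement, so the trigger fires once per row:
WHILE 1 = 1
BEGIN
    DELETE o
    FROM (SELECT TOP (1) o.*
          FROM LINE_ORDER o INNER JOIN
               LINE_ORDER_XREF lox
               ON lox.OPRID = o.USER_ID
          WHERE lox.USERID = 'SYSACCT'
         ) o;

    -- stop once no matching row is left to delete
    IF @@ROWCOUNT = 0
        BREAK;
END;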
You can try this:
DECLARE @i INT =
(
    SELECT COUNT(*)
    FROM LINE_ORDER ORDERS
    INNER JOIN LINE_ORDER_XREF B ON B.OPRID = ORDERS.USER_ID
    WHERE B.USERID = 'SYSACCT'
);
DECLARE @count INT = 1;
WHILE (@count <= @i)
BEGIN
    DELETE TOP (1) ORDERS
    FROM LINE_ORDER ORDERS
    INNER JOIN LINE_ORDER_XREF B ON B.OPRID = ORDERS.USER_ID
    WHERE B.USERID = 'SYSACCT';
    SET @count = @count + 1;
END;

Improve update performance in Oracle

I've generated two temporary tables and assigned a primary key to each of them to get an index on them.
Like this on both:
ALTER TABLE TEMP_MEASURINGS ADD PRIMARY KEY (MEASURINGID)
ALTER TABLE TEMP_VALUES ADD PRIMARY KEY (<some_other_col>)
The two temp tables are related by a date and another id, as you can see in the query. Now I need to update the "measuringid" in TEMP_VALUES based on the other table.
Can I make this query go faster in any way?
UPDATE TEMP_VALUES v
SET v.MEASURINGID =
(
SELECT MEASURINGID
FROM TEMP_MEASURINGS m
WHERE m.MEASURDATE = v.MEASUREDATE
AND m.ORDERID = v.ORDERID
)
The tables need to be generated first, so I can't do an insert directly.
SELECT COUNT(*) FROM TEMP_VALUES ~6M
SELECT COUNT(*) FROM TEMP_MEASURINGS ~1.5M
Your query is going to be slow, because so many rows are being updated.
You can speed it up with an index on TEMP_MEASURINGS(MEASURDATE, ORDERID, MEASURINGID). This is a covering index for the subquery, so the lookups should be fast.
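A sketch of that index (the index name is made up; the columns come from the query above):
CREATE INDEX ix_temp_measurings_lookup
    ON TEMP_MEASURINGS (MEASURDATE, ORDERID, MEASURINGID);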
You might find it faster just to create a new table:
create table new_temp_values as
    select v.*, m.measuringid as new_measuringid
    from temp_values v left join
         temp_measurings m
         on v.measuredate = m.measuredate and v.orderid = m.orderid;
The same index will work here (you can adjust the select columns to be what you really need).
Typically, creating a new table is much, much faster than updating all or even a significant number of rows in a given table.
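If the rest of your process expects the original table name, one option (assuming nothing else is using TEMP_VALUES at that point) is to swap the new table in afterwards:
-- drop the old table and rename the new one into its place
DROP TABLE temp_values;
ALTER TABLE new_temp_values RENAME TO temp_values;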
Try the MERGE below for performance:
MERGE INTO TEMP_VALUES v
USING (SELECT MEASURINGID,
              MEASURDATE,
              ORDERID
       FROM TEMP_MEASURINGS) m
ON (m.MEASURDATE = v.MEASUREDATE
    AND m.ORDERID = v.ORDERID)
WHEN MATCHED THEN
    UPDATE
    SET v.MEASURINGID = m.MEASURINGID;
Instead of updating all records, you can update only the records that have a match in the other temp table:
UPDATE TEMP_VALUES v
SET v.MEASURINGID =
(
SELECT MEASURINGID
FROM TEMP_MEASURINGS m
WHERE m.MEASURDATE = v.MEASUREDATE
AND m.ORDERID = v.ORDERID
)
where exists
    (select 1 from TEMP_MEASURINGS ttm
     where ttm.MEASURDATE = v.MEASUREDATE
     and ttm.ORDERID = v.ORDERID)

SQL Join taking too much time to run

The query shown below takes almost 2 hours to run and I want to reduce its execution time. Any help would be appreciated.
Currently:
If Exists (Select 1
From PRODUCTS prd
Join STORE_RANGE_GRP_MATCH srg On prd.Store_Range_Grp_Id = srg.Orig_Store_Range_Grp_ID
And srg.Match_Flag = 'Y'
And prd.Range_Event_Id = srg.LAR_Range_Event_Id
Where srg.Range_Event_Id Not IN (Select distinct Range_Event_Id
From Last_Authorised_Range)
)
I have tried replacing the NOT IN clause with NOT EXISTS and with a LEFT JOIN, but the runtime did not improve.
What I have used:
If Exists( Select top 1 *
From PRODUCTS prd
Join STORE srg
On prd.Store_Range_Grp_Id = srg.Orig_Store_Range_Grp_ID
And srg.Match_Flag = 'Y'
And prd.Range_Event_Id = srg.LAR_Range_Event_Id
and srg.Range_Event_Id ='45655'
Where NOT EXISTS (Select top 1 *
From Last_Authorised_Range where Range_Event_Id=srg.Range_Event_Id)
)
The PRODUCTS table has 432,837 records and the STORE table has almost the same number of records. I create this table in the stored procedure itself and then drop it at the end of the procedure.
Create Table PRODUCTS
(
Range_Event_Id int,
Store_Range_Grp_Id int,
Ranging_Prod_No nvarchar(14) collate database_default,
Space_Break_Code nchar(1) collate database_default
)
Create Clustered Index Idx_tmpLAR_PRODUCTS
ON PRODUCTS (Range_Event_Id, Ranging_Prod_No, Store_Range_Grp_Id, Space_Break_Code)
Should I use a non-clustered index on this table, or what else can I do to reduce the execution time? Thanks in advance.
First, you don't need TOP 1 or DISTINCT in EXISTS and IN subqueries. But this shouldn't affect performance.
This is the query, slightly re-arranged so I can understand it better:
Select 1
From PRODUCTS prd Join
     STORE srg
     On prd.Store_Range_Grp_Id = srg.Orig_Store_Range_Grp_ID and
        prd.Range_Event_Id = srg.LAR_Range_Event_Id
Where srg.Match_Flag = 'Y' and
      srg.Range_Event_Id = 45655 and
      NOT EXISTS (Select 1
                  From Last_Authorised_Range lar
                  where lar.Range_Event_Id = srg.Range_Event_Id)
Do note that I removed the quotes around 45655. I presume this column is actually a number. If so, don't confuse yourself and the optimizer by using a string for the comparison.
Then, try indexes. I think the best indexes are:
store(Range_Event_Id, Match_Flag, Orig_Store_Range_Grp_ID, LAR_Range_Event_Id)
products(Store_Range_Grp_Id, Range_Event_Id) (or any index, clustered or otherwise, that starts with these two columns in either order)
Last_Authorised_Range(Range_Event_Id)
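As a sketch, with made-up index names and assuming the column names above:
Create Index IX_store_lookup
    On STORE (Range_Event_Id, Match_Flag, Orig_Store_Range_Grp_ID, LAR_Range_Event_Id)

Create Index IX_products_lookup
    On PRODUCTS (Store_Range_Grp_Id, Range_Event_Id)

Create Index IX_lar_range_event
    On Last_Authorised_Range (Range_Event_Id)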
From what you describe as the volume of data, your query should not be taking hours. I think indexes can help.

Slowness in update query using inner join

I am using the query below to update one column based on the conditions specified. I am using an inner join, but the query takes more than 15 seconds to run even when it has no records to update (0 rows).
UPDATE CONFIGURATION_LIST
SET DUPLICATE_SERIAL_NUM = 0
FROM CONFIGURATION_LIST
INNER JOIN (SELECT DISTINCT APPLIED_MAT_CODE, APPLIED_SERIAL_NUMBER, COUNT(*) AS NB
FROM CONFIGURATION_LIST
WHERE
PLANT = '0067'
AND APPLIED_SERIAL_NUMBER IS NOT NULL
AND APPLIED_SERIAL_NUMBER !=''
AND DUPLICATE_SERIAL_NUM = 1
GROUP BY
APPLIED_MAT_CODE, APPLIED_SERIAL_NUMBER
HAVING
COUNT(*) = 1) T2 ON T2.APPLIED_SERIAL_NUMBER = CONFIGURATION_LIST.APPLIED_SERIAL_NUMBER
AND T2.APPLIED_MAT_CODE = CONFIGURATION_LIST.APPLIED_MAT_CODE
WHERE
CONFIGURATION_LIST.PLANT = '0067'
AND DUPLICATE_SERIAL_NUM = 1
An index exists on APPLIED_SERIAL_NUMBER and APPLIED_MAT_CODE, and fragmentation is also fine.
Could you please help me improve the performance of the above query?
First, you don't need the DISTINCT when using GROUP BY. SQL Server probably ignores it, but it is a bad idea anyway:
UPDATE CONFIGURATION_LIST
SET DUPLICATE_SERIAL_NUM = 0
FROM CONFIGURATION_LIST INNER JOIN
(SELECT APPLIED_MAT_CODE, APPLIED_SERIAL_NUMBER, COUNT(*) AS NB
FROM CONFIGURATION_LIST cl
WHERE cl.PLANT = '0067' AND
cl.APPLIED_SERIAL_NUMBER IS NOT NULL AND
cl.APPLIED_SERIAL_NUMBER <> '' AND
cl.DUPLICATE_SERIAL_NUM = 1
GROUP BY cl.APPLIED_MAT_CODE, cl.APPLIED_SERIAL_NUMBER
HAVING COUNT(*) = 1
) T2
ON T2.APPLIED_SERIAL_NUMBER = CONFIGURATION_LIST.APPLIED_SERIAL_NUMBER AND
T2.APPLIED_MAT_CODE = CONFIGURATION_LIST.APPLIED_MAT_CODE
WHERE CONFIGURATION_LIST.PLANT = '0067' AND
DUPLICATE_SERIAL_NUM = 1;
For this query, you want the following index: CONFIGURATION_LIST(PLANT, DUPLICATE_SERIAL_NUM, APPLIED_SERIAL_NUMBER, APPLIED_MAT_CODE).
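A sketch of that index (the name is made up):
CREATE INDEX IX_configuration_list_dup
    ON CONFIGURATION_LIST (PLANT, DUPLICATE_SERIAL_NUM, APPLIED_SERIAL_NUMBER, APPLIED_MAT_CODE);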
The HAVING COUNT(*) = 1 suggests that you might really want NOT EXISTS (which would normally be faster). But you don't really explain what the query is supposed to be doing, you only say that this code is slow.
It looks like you're checking whether rows exist in the same table with the same values and, if not, updating the duplicate column to zero. If your table has a unique key (an identity column or a composite key), you could do something like this:
UPDATE C
SET C.DUPLICATE_SERIAL_NUM = 0
FROM
CONFIGURATION_LIST C
where
not exists (
select
1
FROM
CONFIGURATION_LIST C2
where
C2.APPLIED_SERIAL_NUMBER = C.APPLIED_SERIAL_NUMBER and
C2.APPLIED_MAT_CODE = C.APPLIED_MAT_CODE and
C2.UNIQUE_KEY_HERE != C.UNIQUE_KEY_HERE
) and
C.PLANT = '0067' and
C.DUPLICATE_SERIAL_NUM = 1
I would try with a select first:
select APPLIED_MAT_CODE, APPLIED_SERIAL_NUMBER, count(*) as n
from CONFIGURATION_LIST cl
where
cl.PLANT='0067' and
cl.APPLIED_SERIAL_NUMBER IS NOT NULL and
cl.APPLIED_SERIAL_NUMBER <> ''
group by APPLIED_MAT_CODE, APPLIED_SERIAL_NUMBER;
How many rows do you get with this and how long does it take?
If you removed the DUPLICATE_SERIAL_NUM column from your table, this could become very simple. The DUPLICATE_SERIAL_NUM column suggests that you are searching for duplicates. Since you are already counting rows, you could introduce a simple table that holds the counts:
create table CLCOUNT (
    N int,
    C int,          /* or whatever type APPLIED_MAT_CODE is */
    S int,          /* or whatever type APPLIED_SERIAL_NUMBER is */
    PLANT char(20), /* or whatever type PLANT is */
    unique (C, S, PLANT)
);
create index IX_CLCOUNT_PLANT_N on CLCOUNT (PLANT, N);
insert into CLCOUNT select count(*), cl.APPLIED_MAT_CODE, cl.APPLIED_SERIAL_NUMBER, cl.PLANT
from CONFIGURATION_LIST cl
where
cl.PLANT='0067' and
cl.APPLIED_SERIAL_NUMBER IS NOT NULL and
cl.APPLIED_SERIAL_NUMBER <> ''
group by APPLIED_MAT_CODE, APPLIED_SERIAL_NUMBER;
How long does this take?
Now you can simply select * from CLCOUNT where PLANT='0067' and N=1;
This is all far from perfect. But you should be able to analyze your queries (with EXPLAIN SELECT ... or by viewing the execution plan) and find out why they take so long.
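In SQL Server specifically, a quick way to see where the time goes is to turn on runtime statistics before running the statement and then look at the actual execution plan:
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
-- run the UPDATE (or the diagnostic SELECT above) here, then check
-- the Messages tab for logical reads and CPU/elapsed time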

COUNT (DISTINCT column_name) Discrepancy vs. COUNT (column_name) in SQL Server 2008?

I'm running into a problem that's driving me nuts.
When running the query below, I get a count of 233,769
SELECT COUNT(distinct Member_List_Link.UserID)
FROM Member_List_Link with (nolock)
INNER JOIN MasterMembers with (nolock)
ON Member_List_Link.UserID = MasterMembers.UserID
WHERE MasterMembers.Active = 1 And
Member_List_Link.GroupID = 5 AND
MasterMembers.ValidUsers = 1 AND
Member_List_Link.Status = 1
But if I run the same query without the distinct keyword, I get a count of 233,748
SELECT COUNT(Member_List_Link.UserID)
FROM Member_List_Link with (nolock)
INNER JOIN MasterMembers with (nolock)
ON Member_List_Link.UserID = MasterMembers.UserID
WHERE MasterMembers.Active = 1 And Member_List_Link.GroupID = 5
AND MasterMembers.ValidUsers = 1 AND Member_List_Link.Status = 1
To test, I recreated all the tables and place them into temp tables and ran the queries again:
SELECT COUNT(distinct #Temp_Member_List_Link.UserID)
FROM #Temp_Member_List_Link with (nolock)
INNER JOIN #Temp_MasterMembers with (nolock)
ON #Temp_Member_List_Link.UserID = #Temp_MasterMembers.UserID
WHERE #Temp_MasterMembers.Active = 1 And
#Temp_Member_List_Link.GroupID = 5 AND
#Temp_MasterMembers.ValidUsers = 1 AND
#Temp_Member_List_Link.Status = 1
And without the distinct keyword
SELECT COUNT(#Temp_Member_List_Link.UserID)
FROM #Temp_Member_List_Link with (nolock)
INNER JOIN #Temp_MasterMembers with (nolock)
ON #Temp_Member_List_Link.UserID = #Temp_MasterMembers.UserID
WHERE #Temp_MasterMembers.Active = 1 And
#Temp_Member_List_Link.GroupID = 5 AND
#Temp_MasterMembers.ValidUsers = 1 AND
#Temp_Member_List_Link.Status = 1
On a side note, I recreated the temp tables by simply running (select * into #temp... from Member_List_Link)
And now when I check to see the difference between COUNT(column) vs. COUNT(distinct column) with these temp tables, I don't see any!
So why is there a discrepancy with the original tables?
I'm running SQL Server 2008 (Dev Edition).
UPDATE - Including statistics profile
PhysicalOp column only for the first query (without distinct)
NULL
Compute Scalar
Stream Aggregate
Clustered Index Seek
PhysicalOp column only for the second query (with distinct)
NULL
Compute Scalar
Stream Aggregate
Parallelism
Stream Aggregate
Hash Match
Hash Match
Bitmap
Parallelism
Index Seek
Parallelism
Clustered Index Scan
Rows and Executes for the 1st query (without distinct)
1 1
0 0
1 1
1 1
Rows and Executes for the 2nd query (with distinct)
Rows Executes
1 1
0 0
1 1
16 1
16 16
233767 16
233767 16
281901 16
281901 16
281901 16
234787 16
234787 16
Adding OPTION(MAXDOP 1) to the 2nd query (with distinct)
Rows Executes
1 1
0 0
1 1
233767 1
233767 1
281901 1
548396 1
And the resulting PhysicalOp
NULL
Compute Scalar
Stream Aggregate
Hash Match
Hash Match
Index Seek
Clustered Index Scan
FROM http://msdn.microsoft.com/en-us/library/ms187373.aspx
NOLOCK is equivalent to READUNCOMMITTED. For more information, see READUNCOMMITTED later in this topic.
READUNCOMMITTED will read rows twice if they are the subject of a transaction, since both the roll-forward and roll-back rows exist within the database while the transaction is in progress.
By default all queries are read committed, which excludes uncommitted rows.
When you insert into a temp table, the select will give you only committed rows. I believe this covers all the symptoms you are trying to explain.
I think I have the answer to your question, but first tell me: is UserID a primary key in your original table?
If yes, then the SELECT ... INTO (CTAS-style) query used to create the temp table would not copy the primary key of the original table; it only copies NOT NULL constraints that are not part of a primary key.
So your original table had a primary key, and COUNT(distinct column_name) does not include tuples with NULL values; when you created the temp tables the primary key was not copied, and hence the NOT NULL constraint did not make it into the temp table either.
It's hard to reproduce this behaviour, so I'm punching in the dark here:
The WITH (NOLOCK) hint enables reading of uncommitted data. I'm guessing you've added that so you don't lock anything for your users? If you remove those hints and issue a
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
prior to executing the query, you should get more reliable results. But then, the tables may be locked while the query executes.
If that doesn't work, my guess is that DISTINCT uses an index to optimize. Check the query plan, and rebuild indexes as necessary. That could be the source of your problem.
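A sketch of the rebuild, assuming the table names from the question:
-- rebuild all indexes on the two tables involved in the join
ALTER INDEX ALL ON Member_List_Link REBUILD;
ALTER INDEX ALL ON MasterMembers REBUILD;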
What result do you get with
SELECT count(*) FROM (
SELECT distinct Member_List_Link.UserID
FROM Member_List_Link with (nolock)
INNER JOIN MasterMembers with (nolock)
ON Member_List_Link.UserID = MasterMembers.UserID
WHERE MasterMembers.Active = 1 And
Member_List_Link.GroupID = 5 AND
MasterMembers.ValidUsers = 1 AND
Member_List_Link.Status = 1
) as m
And with:
SELECT count(*) FROM (
SELECT distinct Member_List_Link.UserID
FROM Member_List_Link
INNER JOIN MasterMembers
ON Member_List_Link.UserID = MasterMembers.UserID
WHERE MasterMembers.Active = 1 And
Member_List_Link.GroupID = 5 AND
MasterMembers.ValidUsers = 1 AND
Member_List_Link.Status = 1
) as m
Ray, please try the following
SELECT COUNT(*)
FROM
(
SELECT Member_List_Link.UserID, ROW_NUMBER() OVER (PARTITION BY Member_List_Link.UserID ORDER BY (SELECT NULL)) N
FROM Member_List_Link with (nolock)
INNER JOIN MasterMembers with (nolock)
ON Member_List_Link.UserID = MasterMembers.UserID
WHERE MasterMembers.Active = 1 And
Member_List_Link.GroupID = 5 AND
MasterMembers.ValidUsers = 1 AND
Member_List_Link.Status = 1
) A
WHERE N = 1
When you use COUNT with a DISTINCT column, it doesn't count rows where that column is NULL.
create table #tmp(name char(4) null)
insert into #tmp values(null)
insert into #tmp values(null)
insert into #tmp values("AAA")
Query:-
1> select count(*) from #tmp
2> go
3
1> select count(distinct name) from #tmp
2> go
1
1> select distinct name from #tmp
2> go
name
NULL
AAA
But it works in a derived table:
1> select count(*) from ( select distinct name from #tmp) a
2> go
2
Note: I tested this in Sybase.