Performance issue using MERGE in Azure SQL stored proc

Problem: We are experiencing SQL timeouts that we believe stem from a recent database change, which altered the schema and introduced a new stored procedure that handles deleting and inserting rows in a table. The table is fairly large, around 4.5 million rows (but only five columns, three of which are nullable), and its primary key is made up of two columns (UserID, GroupID). Unfortunately our database guy is unavailable, so we are stuck between a rock and a hard place.
Question: Is there anything in the following stored procedure that stands out as a performance problem or as incorrectly done?
Inputs:
UserID
GroupIDs (list of unique identifiers)
UpdateAdminID (unique identifier of the user who initiated the stored proc)
Expectations:
When calling this stored procedure the expectation is that a row for each groupID will be inserted into UserGroups where it does not exist already. Also if a row is found that has the UserID parameter and a GroupID that is not in the input list then it must be deleted.
Procedure:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [dbo].[sp_UpdateUserGroups] (
    @UserID uniqueidentifier,
    @Groups IDs READONLY,
    @UpdateAdminID uniqueidentifier = NULL
) AS
BEGIN
    DECLARE @EnforceUserGroupHeirarchy bit
    DECLARE @ClientID uniqueidentifier
    DECLARE @AttributeClientID uniqueidentifier
    select @ClientID = clientid from users where userid = @UserID
    select @AttributeClientID = isnull(ParentClientID, ClientID) from clients where ClientID = @ClientID
    select @EnforceUserGroupHeirarchy = value from v_clientpreferences where preferenceid = 323 and clientid = @AttributeClientID
    DECLARE @NumRows INT
    SET @NumRows = 1
    CREATE TABLE #AllGroups
    (
        GroupID uniqueidentifier
    )
    insert into #AllGroups (GroupID) select id from @Groups
    if @EnforceUserGroupHeirarchy = 1
    BEGIN
        WHILE @NumRows > 0
        BEGIN
            INSERT INTO #AllGroups (GroupID)
            (
                SELECT groupid from groups where (parentgroupid in (select groupid from #AllGroups)) and groupid not in (select groupid from #AllGroups)
            )
            SET @NumRows = ROWCOUNT_BIG()
        END
    END
    merge usergroups as T
    USING (select @UserID as UserId, g.groupid as GroupId from #AllGroups g) as S
    on (T.UserID = S.UserID and T.GroupID = S.GroupID)
    WHEN NOT MATCHED BY TARGET
        THEN INSERT (UserID, GroupID, DateCreated, UpdateAdminID) VALUES (S.UserId, S.GroupID, GetDate(), @UpdateAdminID)
    WHEN NOT MATCHED BY SOURCE and T.UserID = @UserID
        THEN DELETE;
END
GO
Edit: Fragmentation of the indexes is less than 30 percent.
Edit2: EnforceUserGroupHeirarchy looks at a group and recursively adds all of its children to #AllGroups.
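For comparison, and not as a confirmed diagnosis: the WHEN NOT MATCHED BY SOURCE branch of a MERGE has to consider target rows that match nothing in the source, and the ON clause here does not restrict the target to one user, so it may be worth checking the plan for a scan of the whole 4.5-million-row table. The stated expectations can also be written as two statements that both filter on UserID. This is a minimal sketch under assumptions: #UserGroups is a hypothetical stand-in for the real UserGroups table, and the sample GUIDs are made up so the script runs on its own.

```sql
-- Hypothetical stand-ins for the question's objects so the sketch is self-contained.
CREATE TABLE #UserGroups (
    UserID uniqueidentifier NOT NULL,
    GroupID uniqueidentifier NOT NULL,
    DateCreated datetime NULL,
    UpdateAdminID uniqueidentifier NULL,
    PRIMARY KEY (UserID, GroupID)
);
CREATE TABLE #AllGroups (GroupID uniqueidentifier NOT NULL PRIMARY KEY);

DECLARE @UserID uniqueidentifier = NEWID();
DECLARE @UpdateAdminID uniqueidentifier = NULL;
DECLARE @Keep uniqueidentifier = NEWID();   -- already assigned, still wanted
DECLARE @Stale uniqueidentifier = NEWID();  -- already assigned, no longer wanted
DECLARE @New uniqueidentifier = NEWID();    -- wanted, not assigned yet

INSERT INTO #UserGroups (UserID, GroupID, DateCreated)
VALUES (@UserID, @Keep, GETDATE()), (@UserID, @Stale, GETDATE());
INSERT INTO #AllGroups (GroupID) VALUES (@Keep), (@New);

-- Remove this user's rows whose group is not in the input list.
-- The UserID predicate lets the primary key (UserID, GroupID) narrow the search.
DELETE ug
FROM #UserGroups AS ug
WHERE ug.UserID = @UserID
  AND NOT EXISTS (SELECT 1 FROM #AllGroups AS g WHERE g.GroupID = ug.GroupID);

-- Add the groups this user does not have yet.
INSERT INTO #UserGroups (UserID, GroupID, DateCreated, UpdateAdminID)
SELECT @UserID, g.GroupID, GETDATE(), @UpdateAdminID
FROM #AllGroups AS g
WHERE NOT EXISTS (
    SELECT 1 FROM #UserGroups AS ug
    WHERE ug.UserID = @UserID AND ug.GroupID = g.GroupID
);

-- Two rows remain for the user: the @Keep group and the @New group.
SELECT GroupID FROM #UserGroups WHERE UserID = @UserID;

DROP TABLE #UserGroups;
DROP TABLE #AllGroups;
```

Whether this actually helps depends on the real execution plan, so comparing the plans of both versions against the production table would be the first step.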

Related

Selecting top 10 data from one table and inserting into other table

Pseudocode:
Create Procedure SP_GetAllData
(@Count, @EmailID)
Create table #tempTable
(
Id Int Not Null Identity (1,1),
Message nvarchar (max)
)
Insert into SecondTable
(EmailID, Message, Subject, MessageID)
(@EmailID, 'Select Message from #tempTable', 'Message Subject', 'Select ID from #tempTable')
How do I insert the data into the temp table and then into the second table?
In the statement above, I want to insert the records from FirstTable into SecondTable along with its existing columns.

I think this is what you need. You do not need a temp table for this operation, but you do need to order by some column; a TOP clause without an ORDER BY is pretty meaningless, because which rows come back is not guaranteed.
Create Procedure SP_GetAllData
    @Count INT
    ,@EmailID INT
AS
BEGIN
    SET NOCOUNT ON;
    Insert into SecondTable (EmailID, [Message], [Subject], MessageID)
    Select top (@Count) @EmailID, [Message], [Subject], ID
    from FirstTable
    -- ORDER BY SomeColumn
END

SQL Server : split column into table in a trigger

I have a table that looks something like this:
UserID  Email
-----------------------------------
1       1_0@email.com;1_1@email.com
2       2_0@email.com;2_1@email.com
3       3_0@email.com;3_1@email.com
And I need to create a temp table that will look like this:
UserID  Email
-----------------------------------
1       1_0@email.com
1       1_1@email.com
2       2_0@email.com
2       2_1@email.com
3       3_0@email.com
3       3_1@email.com
The temp table will be used in a update trigger and I was wondering if there is a more elegant approach than doing something like this:
-- Create temp table to hold the result table
CREATE TABLE #resultTable(
    UserID int,
    Email nvarchar(50)
)
-- Create temp table to help iterate through table
CREATE TABLE #tempTable(
    ID int IDENTITY(1,1),
    UserID int,
    Email nvarchar(50)
)
-- Insert data from updated table into temp table
INSERT INTO #tempTable
SELECT [UserId], [Email]
FROM inserted
-- Iterate through temp table
DECLARE @count int = @@ROWCOUNT
DECLARE @index int = 1
WHILE (@index <= @count)
BEGIN
    DECLARE @userID int
    DECLARE @email nvarchar(50)
    -- Get the user ID and email values
    SELECT
        @userID = [UserID], @email = [Email]
    FROM #tempTable
    WHERE [ID] = @index
    -- Insert the parsed email address into the result table
    INSERT INTO #resultTable([UserID], [Email])
    SELECT @userID, [Data]
    FROM myFunctionThatSplitsAColumnIntoATable(@email, ';')
    SET @index = @index + 1
END
-- Do stuff with the result table

You should avoid iterative approaches in T-SQL unless they are strictly necessary, especially inside triggers.
You can use the APPLY operator.
From MSDN:
The APPLY operator allows you to invoke a table-valued function for each row returned by an outer table expression of a query.
So you can try replacing all of your code with this (inside a trigger, the changed rows are exposed through the inserted pseudo-table):
INSERT INTO #resultTable(UserID, Email)
SELECT T1.UserID
    ,T2.Data
FROM inserted T1
CROSS APPLY myFunctionThatSplitsAColumnIntoATable(T1.Email, ';') AS T2
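To see the row-by-row expansion outside a trigger, here is a small sketch. It assumes SQL Server 2016 or later (database compatibility level 130+), where the built-in STRING_SPLIT function can stand in for the question's custom splitter; the @Users table variable and its sample rows are made up for illustration.

```sql
-- Hypothetical sample data mirroring the question's layout.
DECLARE @Users TABLE (UserID int, Email nvarchar(200));
INSERT INTO @Users (UserID, Email) VALUES
    (1, N'1_0@email.com;1_1@email.com'),
    (2, N'2_0@email.com;2_1@email.com');

-- CROSS APPLY runs the table-valued function once per outer row
-- and joins each returned piece back to that row's UserID.
SELECT u.UserID, s.value AS Email
FROM @Users AS u
CROSS APPLY STRING_SPLIT(u.Email, N';') AS s;
```

Each input row yields one output row per address, so the two sample rows produce four rows in total.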

How to lock a row in a table before updating it

To implement multithreading in SQL Server 2012 for one of my update tasks, I need different threads to each select a row from a table (Accounts) and mark that row as processed using an update in a stored procedure.
Something like this:
create procedure ChooseNextAccountToProcess (@Account_ID Int Output)
as
select top 1 @Account_ID = Account_ID
from Accounts
order by LastProcessDate Desc
update Accounts
set LastProcessDate = getdate()
where Account_ID = @Account_ID
go
The problem with this approach is that two threads might call this stored procedure at exactly the same time and process the same account. My goal is to select an account from the Accounts table and lock it exclusively before the update has a chance to run.
I tried SELECT ... WITH (UPDLOCK) and an exclusive lock hint, but neither of them actually puts an exclusive lock on the row when I select it.
Any suggestion?

You can use update top (n) ..., but you can't specify order by directly in the statement. So, a little trick is in order:
declare @t table (
    Id int primary key,
    LastProcessDate date not null
);
insert into @t (Id, LastProcessDate)
values
    (1, getdate() - 10),
    (2, getdate() - 7),
    (3, getdate() - 1),
    (4, getdate() - 4);
-- Your stored procedure code starts from here
declare @res table (Id int primary key);
declare @AccountId int;
update a set LastProcessDate = getdate()
output inserted.Id into @res(Id)
from (select top (1) * from @t order by LastProcessDate desc) a;
select @AccountId = Id from @res;
-- Returns 3
select @AccountId;
Not exactly a one-liner, but close to it, yes.
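Folded back into the question's procedure, the same pattern might look like the sketch below. The dbo.Accounts definition is a hypothetical stand-in for the real table, and the UPDLOCK and READPAST hints are an assumption added here for the concurrent case; they are not part of the demo above. The ordering is kept exactly as in the question.

```sql
-- Hypothetical stand-in for the question's Accounts table.
create table dbo.Accounts (
    Account_ID int primary key,
    LastProcessDate datetime not null
);
go
create procedure dbo.ChooseNextAccountToProcess (@Account_ID int output)
as
begin
    set nocount on;
    declare @res table (Account_ID int primary key);
    -- Pick and stamp the row in one statement, so no other session can
    -- slip in between the read and the write.
    -- Assumed hints: updlock locks the row while it is read,
    -- readpast skips rows that another session currently holds.
    update a set LastProcessDate = getdate()
    output inserted.Account_ID into @res (Account_ID)
    from (
        select top (1) Account_ID, LastProcessDate
        from dbo.Accounts with (updlock, readpast)
        order by LastProcessDate desc
    ) a;
    select @Account_ID = Account_ID from @res;
end
go
declare @next int;
exec dbo.ChooseNextAccountToProcess @Account_ID = @next output;
select @next as NextAccountID;  -- stays NULL when no row could be picked
```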

Creating a status update trigger

I have 2 tables like so:
JOBS table
Jobcode  UserId  Status
101      130     R
102      139     D
USERS table
UserId  Email
130     test@example.com
I want to create a trigger on insert and update that passes the email to my stored procedure:
EXEC dbo.SendMyEmail @email, @jobcode;
when the status is inserted as 'D' or updated to 'D'.

In my opinion, sending email in a trigger is not optimal.
Instead, you should just insert to a queue table, and have a process run frequently that checks the table and sends the email.
What happens if you get an error in your email procedure? It will force a rollback of your job completion status. Only you know whether that is minor or possibly catastrophic. But I can tell you for sure that DB best practice is to NOT do extended I/O during a DML operation.
CREATE TRIGGER TR_Jobs_EnqueueEmail_IU ON dbo.Jobs FOR INSERT, UPDATE
AS
SET NOCOUNT ON;
INSERT dbo.EmailQueue (UserID, JobCode)
SELECT I.UserID, I.JobCode -- qualified, since both Inserted and Deleted have these columns
FROM
    Inserted I
    LEFT JOIN Deleted D
        ON I.JobCode = D.JobCode -- or proper PK columns
WHERE
    IsNull(D.Status, 'R') <> 'D'
    AND I.Status = 'D';
Tables needed:
CREATE TABLE dbo.EmailQueue (
QueuedDate datetime NOT NULL
CONSTRAINT DF_EmailQueue_QeueueDate DEFAULT (GetDate()),
UserID int NOT NULL,
JobCode int NOT NULL,
CONSTRAINT PK_EmailQueue PRIMARY KEY CLUSTERED (QueuedDate, UserID, JobCode)
);
CREATE TABLE dbo.EmailSent (
SentDate datetime NOT NULL
CONSTRAINT DF_EmailSent_SentDate DEFAULT (GetDate()),
QueuedDate datetime NOT NULL,
UserID int NOT NULL,
JobCode int NOT NULL,
CONSTRAINT PK_EmailSent PRIMARY KEY CLUSTERED (SentDate, QueuedDate, UserID, JobCode)
);
Then, run the following stored procedure once a minute from a SQL Job:
CREATE PROCEDURE dbo.EmailProcess
AS
DECLARE @Email TABLE (
    QueuedDate datetime,
    UserID int,
    JobCode int
);
DECLARE
    @EmailAddress nvarchar(255),
    @JobCode int;
WHILE 1 = 1 BEGIN
    -- DELETE does not accept ORDER BY, so the ordered TOP (1) goes in a CTE
    -- and the DELETE targets the CTE; the lock hints are unchanged.
    WITH NextEmail AS (
        SELECT TOP (1) QueuedDate, UserID, JobCode
        FROM dbo.EmailQueue WITH (UPDLOCK, ROWLOCK, READPAST)
        ORDER BY QueuedDate
    )
    DELETE FROM NextEmail
    OUTPUT Deleted.QueuedDate, Deleted.UserID, Deleted.JobCode
    INTO @Email (QueuedDate, UserID, JobCode);
    IF @@ROWCOUNT = 0 RETURN;
    -- Table and column names follow the question (USERS.Email)
    SELECT @EmailAddress = U.Email, @JobCode = E.JobCode
    FROM
        @Email E
        INNER JOIN dbo.Users U
            ON E.UserID = U.UserID;
    EXEC dbo.SendMyEmail @EmailAddress, @JobCode;
    DELETE E
    OUTPUT Deleted.QueuedDate, Deleted.UserID, Deleted.JobCode
    INTO dbo.EmailSent (QueuedDate, UserID, JobCode)
    FROM @Email E;
END;
The delete pattern and locks I used are very specifically chosen. If you change them or change the delete pattern in any way it is almost certain you will break it. Handling locks and concurrency is hard. Don't change it.
Note: I typed all the above without checking anything on a SQL Server. It is likely there are typos. Please forgive any.

I'm not sure about data types etc but this should at least put you on the right track.
Hope it helps...
CREATE TRIGGER SendEmailOnStatusD
ON JOBS
-- trigger is fired when an update is made for the table
FOR UPDATE --You can add the same for INSERT
AS
-- holds the UserID so we know which Customer was updated
DECLARE @UserID int
DECLARE @JobCode int
SELECT @UserID = UserId, @JobCode = JobCode
FROM INSERTED WHERE [Status] = 'D' --If you want the old value before the update, use 'deleted' table instead of 'inserted' table
IF (@UserID IS NOT NULL)
BEGIN
    -- holds the email
    DECLARE @email varchar(250)
    SELECT @email = Email FROM USERS WHERE UserId = @UserID
    EXEC SendMyEmail @email, @JobCode;
END
GO
EDIT:
The code above does not handle multi-row updates, so the option below is better practice:
CREATE TRIGGER SendEmailOnStatusD ON JOBS
-- trigger is fired when an update is made for the table
FOR UPDATE --You can add the same for INSERT
AS
DECLARE @Updates table(UserID int, JobCode int, Email varchar(250))
INSERT INTO @Updates (UserID, JobCode, Email)
SELECT i.UserID, i.JobCode, u.Email
FROM INSERTED i
JOIN USERS u ON i.UserID = u.UserID
WHERE i.[Status] = 'D'
DECLARE @UserID int
DECLARE @JobCode int
DECLARE @Email varchar(250)
WHILE EXISTS(SELECT * FROM @Updates)
BEGIN
    -- Take any remaining row; filtering on @UserID here would never match, because it starts out NULL
    SELECT TOP 1
        @UserID = UserID,
        @Email = Email,
        @JobCode = JobCode
    FROM @Updates
    EXEC SendMyEmail @Email, @JobCode;
    -- Remove only the row that was just processed
    DELETE FROM @Updates
    WHERE UserID = @UserID AND JobCode = @JobCode
END
GO
Additionally, as discussed in the comments, sending emails from a trigger is also not the best, but as this is what the question asks for it has been included. I would recommend alternative options for sending emails such as a queue which has been mentioned in other answers.

Using temporary table in where clause

I want to delete many rows with the same set of field values in some (6) tables. I could do this by deleting the result of one subquery in every table (Solution 1), which would be redundant, because the subquery would be the same every time; so I want to store the result of the subquery in a temporary table and delete the value of each row (of the temp table) in the tables (Solution 2). Which solution is the better one?
First solution:
DELETE FROM dbo.SubProtocols
WHERE ProtocolID IN (
    SELECT ProtocolID
    FROM dbo.Protocols
    WHERE WorkplaceID = @WorkplaceID
)
DELETE FROM dbo.ProtocolHeaders
WHERE ProtocolID IN (
    SELECT ProtocolID
    FROM dbo.Protocols
    WHERE WorkplaceID = @WorkplaceID
)
-- ...
DELETE FROM dbo.Protocols
WHERE WorkplaceID = @WorkplaceID
Second solution:
DECLARE @Protocols table(ProtocolID int NOT NULL)
INSERT INTO @Protocols
SELECT ProtocolID
FROM dbo.Protocols
WHERE WorkplaceID = @WorkplaceID
DELETE FROM dbo.SubProtocols
WHERE ProtocolID IN (
    SELECT ProtocolID
    FROM @Protocols
)
DELETE FROM dbo.ProtocolHeaders
WHERE ProtocolID IN (
    SELECT ProtocolID
    FROM @Protocols
)
-- ...
DELETE FROM dbo.Protocols
WHERE WorkplaceID = @WorkplaceID
Is it possible to do solution 2 without the subquery? Say doing WHERE ProtocolID IN @Protocols (but syntactically correct)?
I am using Microsoft SQL Server 2005.

While you can avoid the subquery in SQL Server with a join, like so:
delete from sp
from subprotocols sp
inner join protocols p on
    sp.protocolid = p.protocolid
    and p.workplaceid = @WorkplaceID
you'll find that this doesn't really gain you any performance over either of your approaches. Generally, SQL Server 2005 optimizes a subquery like yours into an inner join, since it isn't correlated with the outer rows. SQL Server will also probably cache the subquery's result in your case, so shoving it into a temp table is most likely unnecessary.
The first way, though, would be susceptible to changes in Protocols during the transaction, where the second one wouldn't. Just something to think about.

You can try this:
DELETE FROM dbo.ProtocolHeaders
FROM dbo.ProtocolHeaders INNER JOIN
    dbo.Protocols ON ProtocolHeaders.ProtocolID = Protocols.ProtocolID
WHERE Protocols.WorkplaceID = @WorkplaceID

DELETE ... FROM is a T-SQL extension to the standard SQL DELETE that provides an alternative to using a subquery. From the help:
D. Using DELETE based on a subquery and using the Transact-SQL extension
The following example shows the Transact-SQL extension used to delete records from a base table that is based on a join or correlated subquery. The first DELETE statement shows the SQL-2003-compatible subquery solution, and the second DELETE statement shows the Transact-SQL extension. Both queries remove rows from the SalesPersonQuotaHistory table based on the year-to-date sales stored in the SalesPerson table.
-- SQL-2003 Standard subquery
USE AdventureWorks;
GO
DELETE FROM Sales.SalesPersonQuotaHistory
WHERE SalesPersonID IN
(SELECT SalesPersonID
FROM Sales.SalesPerson
WHERE SalesYTD > 2500000.00);
GO
-- Transact-SQL extension
USE AdventureWorks;
GO
DELETE FROM Sales.SalesPersonQuotaHistory
FROM Sales.SalesPersonQuotaHistory AS spqh
INNER JOIN Sales.SalesPerson AS sp
ON spqh.SalesPersonID = sp.SalesPersonID
WHERE sp.SalesYTD > 2500000.00;
GO
You would want, in your second solution, something like
-- untested!
DELETE FROM
    dbo.SubProtocols -- ProtocolHeaders, etc
FROM
    dbo.SubProtocols
    INNER JOIN @Protocols AS p ON SubProtocols.ProtocolID = p.ProtocolID -- a table variable needs an alias to qualify its columns
However!!
Is it not possible to alter your design so that all the subsidiary protocol tables have a FOREIGN KEY with ON DELETE CASCADE to the main Protocols table? Then you could just DELETE from Protocols and the rest would be taken care of...
edit to add:
If you already have FOREIGN KEYs set up, you would need to use DDL to alter them (I think a drop and recreate is required) in order for them to have ON DELETE CASCADE turned on. Once that is in place, a DELETE from the main table will automatically DELETE related records from the child tables.
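The drop-and-recreate step could look roughly like this. It is a sketch under assumptions: dbo.DemoProtocols, dbo.DemoSubProtocols and the constraint name are hypothetical, chosen so the script does not touch the real tables, and the sample rows are made up.

```sql
-- Hypothetical parent and child tables with a plain (non-cascading) foreign key.
CREATE TABLE dbo.DemoProtocols (
    ProtocolID int NOT NULL PRIMARY KEY,
    WorkplaceID int NOT NULL
);
CREATE TABLE dbo.DemoSubProtocols (
    SubProtocolID int NOT NULL PRIMARY KEY,
    ProtocolID int NOT NULL,
    CONSTRAINT FK_DemoSubProtocols_DemoProtocols
        FOREIGN KEY (ProtocolID) REFERENCES dbo.DemoProtocols (ProtocolID)
);

-- Drop the existing constraint and re-add it with the cascade action.
ALTER TABLE dbo.DemoSubProtocols
    DROP CONSTRAINT FK_DemoSubProtocols_DemoProtocols;
ALTER TABLE dbo.DemoSubProtocols
    ADD CONSTRAINT FK_DemoSubProtocols_DemoProtocols
    FOREIGN KEY (ProtocolID) REFERENCES dbo.DemoProtocols (ProtocolID)
    ON DELETE CASCADE;

INSERT INTO dbo.DemoProtocols (ProtocolID, WorkplaceID) VALUES (1, 7);
INSERT INTO dbo.DemoProtocols (ProtocolID, WorkplaceID) VALUES (2, 8);
INSERT INTO dbo.DemoSubProtocols (SubProtocolID, ProtocolID) VALUES (10, 1);
INSERT INTO dbo.DemoSubProtocols (SubProtocolID, ProtocolID) VALUES (20, 2);

-- A single delete on the parent now removes the matching child rows too.
DELETE FROM dbo.DemoProtocols WHERE WorkplaceID = 7;

-- Only the child row that belongs to ProtocolID 2 is left.
SELECT SubProtocolID, ProtocolID FROM dbo.DemoSubProtocols;

DROP TABLE dbo.DemoSubProtocols;
DROP TABLE dbo.DemoProtocols;
```

With six child tables, each would need its own foreign key recreated this way, after which the single DELETE on Protocols replaces the separate per-table deletes.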

Without the temp table you risk deleting a different set of rows in the second delete, but filling the temp table first makes it three operations.
You could instead delete from the first table and use the OUTPUT INTO clause to capture all the deleted IDs in a temp table, then use that temp table to delete from the second table. This makes sure you delete the same keys from both tables, with only two statements.
declare @x table(RowID int identity(1,1) primary key, ValueData varchar(3))
declare @y table(RowID int identity(1,1) primary key, ValueData varchar(3))
declare @temp table (RowID int)
insert into @x values ('aaa')
insert into @x values ('bab')
insert into @x values ('aac')
insert into @x values ('bad')
insert into @x values ('aae')
insert into @x values ('baf')
insert into @x values ('aag')
insert into @y values ('aaa')
insert into @y values ('bab')
insert into @y values ('aac')
insert into @y values ('bad')
insert into @y values ('aae')
insert into @y values ('baf')
insert into @y values ('aag')
DELETE @x
OUTPUT DELETED.RowID
INTO @temp
WHERE ValueData like 'a%'
DELETE y
FROM @y y
INNER JOIN @temp t ON y.RowID=t.RowID
select * from @x
select * from @y
SELECT OUTPUT:
RowID ValueData
----------- ---------
2 bab
4 bad
6 baf
(3 row(s) affected)
RowID ValueData
----------- ---------
2 bab
4 bad
6 baf
(3 row(s) affected)
