SQL deadlocks on INSERT/UPDATE statements in a stored procedure

I have a very basic table called Titles as below,
TitleID - auto identity and PK
UserID - reference key to User table
Title - varchar
IsPrimary - bit
It has only one index: the clustered primary-key index on TitleID.
Now I'm inserting records into this table via a stored procedure inside a READ COMMITTED transaction. The procedure inserts the new record with IsPrimary = 1 and updates all the user's other titles to IsPrimary = 0:
INSERT INTO Titles(...)
VALUES (...)
UPDATE T
SET IsPrimary = 0
FROM Titles T
WHERE T.UserID = @UserID AND T.Title != @Title
The moment I test this in a multi-user scenario I hit deadlock issues. If I remove the UPDATE command from the stored proc then everything works perfectly fine...
I tried creating non-clustered indexes on the lookup columns and also tried a WITH (ROWLOCK) hint on the UPDATE statement, but nothing seems to work.
When I ran the SQL statements and viewed the estimated execution plan, I could see that both statements update the clustered index, and I'm thinking this is where it fails in the multi-user scenario...
I believe it's a fairly simple scenario that lots of people must have implemented in high-transaction systems, but I can't find anything on how to approach/solve this issue, and your help will be appreciated.
Thank you.

Your deadlock is probably on the index resource.
In the execution plan look for bookmark/key lookups and create a non-clustered index covering those fields - that way the 'read' of the data for the UPDATE will not clash with the 'write' of the INSERT.
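For example, on the Titles table from the question, a non-clustered index like this (a sketch; the index name is made up) covers the UPDATE's WHERE clause, so the read side of the UPDATE no longer has to seek into the clustered index that the INSERT is writing to:
CREATE NONCLUSTERED INDEX IX_Titles_UserID_Title
ON dbo.Titles (UserID, Title)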

Related

Improve insert performance when checking existing rows

I have this simple query that inserts rows from one table (sn_users_main) into another (sn_users_history).
To make sure sn_users_history only has unique rows, it checks whether the column query_time already exists, and if it does, it doesn't insert. query_time is a kind of session identifier that is the same for every row in sn_users_main.
This works fine, but since sn_users_history is reaching 50k rows, running this query takes more than 2 minutes, which is too much. Is there anything I can do to improve performance and get the same result?
INSERT INTO sn_users_history(query_time,user_id,sn_name,sn_email,sn_manager,sn_active,sn_updated_on,sn_last_Login_time,sn_is_vip,sn_created_on,sn_is_team_lead,sn_company,sn_department,sn_division,sn_role,sn_employee_profile,sn_location,sn_employee_type,sn_workstation) --- Columns of history table
SELECT snm.query_time,
snm.user_id,
snm.sn_name,
snm.sn_email,
snm.sn_manager,
snm.sn_active,
snm.sn_updated_on,
snm.sn_last_Login_time,
snm.sn_is_vip,
snm.sn_created_on,
snm.sn_is_team_lead,
snm.sn_company,
snm.sn_department,
snm.sn_division,
snm.sn_role,
snm.sn_employee_profile,
snm.sn_location,
snm.sn_employee_type,
snm.sn_workstation
---Columns of main table
FROM sn_users_main snm
WHERE NOT EXISTS (SELECT snh.query_time
                  FROM sn_users_history snh
                  WHERE snh.query_time = snm.query_time) -- Don't insert rows that already exist in the history table
I think you are missing an extra condition on user_id when you insert into the history table. You have to check the combination of user_id and query_time.
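A sketch of the suggested change to the NOT EXISTS check, using the column names from the question:
WHERE NOT EXISTS (SELECT 1
                  FROM sn_users_history snh
                  WHERE snh.query_time = snm.query_time
                    AND snh.user_id = snm.user_id)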
For your question, I think you are trying to reinvent the wheel. SQL Server already has temporal tables to support exactly this kind of historical data. Read about SQL Server Temporal Tables.
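For illustration, a minimal system-versioned table might look like this (a sketch only; it requires SQL Server 2016 or later, and the two period columns are additions, not columns from the question):
CREATE TABLE sn_users_main
(
    user_id INT NOT NULL PRIMARY KEY,
    sn_name NVARCHAR(100) NOT NULL,
    -- period columns maintained automatically by SQL Server
    ValidFrom DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo   DATETIME2 GENERATED ALWAYS AS ROW END NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.sn_users_main_history))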
If you still want to continue with this approach, I would suggest doing it in batches:
Create a configuration table to hold the last processed query_time:
CREATE TABLE HistoryConfig(HistoryConfigId int, HistoryTableName SYSNAME,
lastProcessedQueryTime DATETIME)
Then you can do incremental historical inserts:
DECLARE @lastProcessedQueryTime DATETIME = (SELECT MAX(lastProcessedQueryTime) FROM HistoryConfig)
INSERT INTO sn_users_history(query_time,user_id,sn_name,sn_email,sn_manager,sn_active,sn_updated_on,sn_last_Login_time,sn_is_vip,sn_created_on,sn_is_team_lead,sn_company,sn_department,sn_division,sn_role,sn_employee_profile,sn_location,sn_employee_type,sn_workstation) --- Columns of history table
SELECT snm.query_time,
snm.user_id,
snm.sn_name,
snm.sn_email,
snm.sn_manager,
snm.sn_active,
snm.sn_updated_on,
snm.sn_last_Login_time,
snm.sn_is_vip,
snm.sn_created_on,
snm.sn_is_team_lead,
snm.sn_company,
snm.sn_department,
snm.sn_division,
snm.sn_role,
snm.sn_employee_profile,
snm.sn_location,
snm.sn_employee_type,
snm.sn_workstation
---Columns of main table
FROM sn_users_main snm
WHERE query_time > @lastProcessedQueryTime
Now you can update the configuration table again:
UPDATE HistoryConfig
SET lastProcessedQueryTime = (SELECT MAX(query_time) FROM sn_users_history)
WHERE HistoryTableName = 'sn_users_history'
I would also suggest creating a clustered index on (user_id, query_time) in the history table (if possible; otherwise a non-clustered index), which will improve performance.
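A sketch of that index (the name is made up; use CREATE NONCLUSTERED INDEX instead if the table already has a clustered index):
CREATE CLUSTERED INDEX IX_sn_users_history_user_query
ON sn_users_history (user_id, query_time)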
Other approaches you can think of:
Create a clustered index on (user_id, query_time) in the historical table, have the same clustered index on the main table, and perform a MERGE operation, as sketched below.
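A minimal sketch of that MERGE (assuming user_id plus query_time identifies a row; only a few of the columns are shown for brevity):
MERGE sn_users_history AS tgt
USING sn_users_main AS src
    ON tgt.user_id = src.user_id
   AND tgt.query_time = src.query_time
WHEN NOT MATCHED BY TARGET THEN
    INSERT (query_time, user_id, sn_name)
    VALUES (src.query_time, src.user_id, src.sn_name);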

Is update/select of a single column in a single row atomic and safe in SQL Server?

I'd like to use the following statement to update a column of a single row:
UPDATE Test SET Column1 = Column1 & ~2 WHERE Id = 1
The above seems to work. Is this safe in SQL Server? I remember reading about possible deadlocks when using similar statements in a non-SQL Server DBMS (I think it was PostgreSQL).
Example of a table and corresponding stored procs:
CREATE TABLE Test (Id int IDENTITY(1,1) NOT NULL, Column1 int NOT NULL, CONSTRAINT PK_Test PRIMARY KEY (Id ASC))
GO
INSERT INTO Test (Column1) Values(255)
GO
-- this will always affect a single row only
UPDATE Test SET Column1 = Column1 & ~2 WHERE Id = 1
For the table structure you have shown, both the UPDATE and the SELECT are standalone transactions and can use clustered index seeks to do their work without needing to read unnecessary rows and take unnecessary locks, so I would not be particularly concerned about deadlocks with this procedure.
I would be more concerned about the fact that you don't have the UPDATE and SELECT inside the same transaction. The X lock on the row will be released as soon as the UPDATE statement finishes, so it would be possible for another transaction to change the column value (or even delete the whole row) before the SELECT executes.
If you execute both statements inside the same transaction, I still wouldn't be concerned about deadlock potential, as the exclusive lock is taken first (it would be a different matter if the SELECT happened before the UPDATE).
You can also address the concurrency issue by getting rid of the SELECT entirely and using the OUTPUT clause to return the post-update value to the client.
UPDATE Test SET Column1 = Column1 & ~2
OUTPUT INSERTED.Column1
WHERE Id = 1
What do you mean by "is it safe"?
Your Id is a unique identifier for each row. I would strongly encourage you to declare it as a primary key; at a minimum you should have an index on the column.
Without an index, you have a potential performance (and deadlock) problem because SQL Server has to scan the entire table. But with an appropriate primary key declaration (or another index), you are only updating a single row in a single table. If there are no triggers on the table, there is not much going on that can interfere with other transactions.
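If the key were missing, adding it afterwards is a one-liner (a sketch only; the question's CREATE TABLE above already declares PK_Test):
ALTER TABLE Test ADD CONSTRAINT PK_Test PRIMARY KEY (Id)
-- or, as a plain index:
CREATE INDEX IX_Test_Id ON Test (Id)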

Improve Speed of Script with Indexing

I have the following script that used to be OK, but since our user base has now expanded to almost a million members, the script is very sluggish. I want to improve it and need expert assistance to make it faster, through code changes, new indexes, or both. Here is the code:
IF @MODE = 'CREATEREQUEST'
BEGIN
IF NOT EXISTS (SELECT * FROM FriendRequest WHERE FromMemberID = @FromMemberID AND ToMemberID = @ToMemberID)
AND NOT EXISTS (SELECT * FROM MemberConnection WHERE MemberID = @FromMemberID AND ConnMemberID = @ToMemberID)
AND NOT EXISTS (SELECT * FROM MemberConnection WHERE MemberID = @ToMemberID AND ConnMemberID = @FromMemberID)
BEGIN
INSERT INTO FriendRequest (
FromMemberID,
ToMemberID,
RequestMsg,
OnHold)
VALUES (
@FromMemberID,
@ToMemberID,
@RequestMsg,
@OnHold)
END
BEGIN
UPDATE Member SET FriendRequestCount = (FriendRequestCount + 1) WHERE MemberID = @ToMemberID
END
END
Any assistance you can provide would be greatly appreciated.
You can use SQL Server Management Studio to view the indexes on a table. If, for example, your FriendRequest table has a PK on FriendRequestID, you will be able to see that you have a clustered index on that field. You can have only one clustered index per table, and the table records are stored in that order.
You might want to try adding non-clustered indexes to your foreign key fields. You could use the New Index wizard, or else syntax like this:
CREATE NONCLUSTERED INDEX [IX_FromMemberID] ON [dbo].[FriendRequest] (FromMemberID)
CREATE NONCLUSTERED INDEX [IX_ToMemberID] ON [dbo].[FriendRequest] (ToMemberID)
But you should be aware that indexing will generally slow down the INSERT and UPDATE operations you showed in your code. It will generally tend to speed up the SELECT queries that can use the indexed fields (see Execution Plans).
You can try the Database Engine Tuning Advisor to get an idea of some possible indexes and their effect on your application's workload.
Indexing is a large subject and you may wish to take it a small step at a time.

Is it possible to add an index to a temp table? And what's the difference between CREATE #t and DECLARE @t?

I need to do a very complex query.
At one point, this query must have a join to a view that cannot be indexed unfortunately.
This view is also a complex view joining big tables.
View's output can be simplified as this:
PID (int), Kind (int), Date (date), D1,D2..DN
where PID and Date and Kind fields are not unique (there may be more than one row having same combination of pid,kind,date), but are those that will be used in join like this
left join ComplexView mkcs on mkcs.PID=q4.PersonID and mkcs.Date=q4.date and mkcs.Kind=1
left join ComplexView mkcl on mkcl.PID=q4.PersonID and mkcl.Date=q4.date and mkcl.Kind=2
left join ComplexView mkco on mkco.PID=q4.PersonID and mkco.Date=q4.date and mkco.Kind=3
Now, if I just do it like this, the query takes significant time to execute, because the complex view is run three times, I assume, and out of its huge number of rows only some are actually used (like 2,000 out of 40,000).
What I did is declare @temptable, and insert into @temptable select * from ComplexView where Date... - once per query I select only the rows I am going to use from ComplexView, and then I join this @temptable.
This reduced execution time significantly.
However, I noticed that if I make a real table in my database with a (non-unique) clustered index on PID, Kind, Date and take data from that table, then deleting everything from it and re-inserting from the complex view takes a few seconds (3 or 4), and using this table in my query (left joining it three times) cuts the query time in half, from 1 minute to 30 seconds!
So, my question is, first of all: is it possible to create indexes on declared @temptables?
And then - I've seen people talk about the "create #temptable" syntax. Maybe this is what I need? Where can I read about the difference between declare @temptable and create #temptable? Which should I use for a query like mine? (This query is for an MS Reporting Services report, if it matters.)
#tablename is a physical table, stored in tempdb, that the server will drop automatically when the connection that created it is closed; @tablename is a table stored in memory that lives for the lifetime of the batch/procedure that created it, just like a local variable.
You can only add a (non PK) index to a #temp table.
create table #blah (fld int)
create nonclustered index idx on #blah (fld)
It's not a complete answer, but #table creates a temporary table that you need to drop, or it will persist for the life of your connection. @table is a table variable that will not persist longer than your batch.
Also, I think this post will answer the other part of your question.
Creating an index on a table variable
Yes, you can create indexes on temp tables or table variables. http://sqlserverplanet.com/sql/create-index-on-table-variable/
The @tableName syntax is a table variable. They are rather limited. The syntax is described in the documentation for DECLARE @local_variable. You can kind of have indexes on table variables, but only indirectly, by specifying PRIMARY KEY and UNIQUE constraints on columns. So, if the data in the columns that you need an index on happens to be unique, you can do this. See this answer. This may be "enough" for many use cases, but only for small numbers of rows. If you don't have indexes on your table variable, the optimizer will generally treat table variables as if they contain one row (regardless of how many rows there actually are), which can result in terrible query plans if you have hundreds or thousands of rows in them instead.
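A sketch of the inline-constraint approach on a table variable:
DECLARE @t TABLE
(
    Id INT NOT NULL PRIMARY KEY,        -- gives you a clustered index
    Name NVARCHAR(50) NOT NULL UNIQUE   -- gives you a second, non-clustered index
)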
The #tableName syntax is a locally-scoped temporary table. You can create these using either SELECT…INTO #tableName or CREATE TABLE #tableName syntax. The scope of these tables is a little more complex than that of variables. If you have CREATE TABLE #tableName in a stored procedure, all references to #tableName in that stored procedure refer to that table. If you simply reference #tableName in a stored procedure (without creating it), it will look into the caller's scope. So you can create #tableName in one procedure, call another procedure, and in that other procedure read/update #tableName. However, once the procedure that created #tableName runs to completion, that table will be automatically unreferenced and cleaned up by SQL Server. So there is no reason to manually clean up these tables unless you have a procedure which is meant to loop/run indefinitely or for long periods of time.
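A quick sketch of that cross-procedure scoping (the procedure and table names here are made up):
CREATE PROCEDURE dbo.Inner_Proc AS
BEGIN
    -- sees the #work table created by the caller
    UPDATE #work SET Processed = 1;
END
GO
CREATE PROCEDURE dbo.Outer_Proc AS
BEGIN
    CREATE TABLE #work (Id INT, Processed BIT);
    INSERT INTO #work VALUES (1, 0);
    EXEC dbo.Inner_Proc;    -- the callee can read/update #work
    SELECT * FROM #work;
END                         -- #work is dropped automatically when this proc completes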
You can define complex indexes on temporary tables, just as if they are permanent tables, for the most part. So if you need to index columns but have duplicate values which prevents you from using UNIQUE, this is the way to go. You do not even have to worry about name collisions on indexes. If you run something like CREATE INDEX my_index ON #tableName(MyColumn) in multiple sessions which have each created their own table called #tableName, SQL Server will do some magic so that the reuse of the global-looking identifier my_index does not explode.
Additionally, temporary tables will automatically build statistics, etc., like normal tables. The query optimizer will recognize that temporary tables can have more than just 1 row in them, which can in itself result in great performance gains over table variables. Of course, this also is a tiny amount of overhead. Though this overhead is likely worth it and not noticeable if your query’s runtime is longer than one second.
To extend Alex K.'s answer, you can create the PRIMARY KEY on a temp table
IF OBJECT_ID('tempdb..#tempTable') IS NOT NULL
DROP TABLE #tempTable
CREATE TABLE #tempTable
(
Id INT PRIMARY KEY
,Value NVARCHAR(128)
)
INSERT INTO #tempTable
VALUES
(1, 'first value')
,(3, 'second value')
-- will cause Violation of PRIMARY KEY constraint 'PK__#tempTab__3214EC071AE8C88D'. Cannot insert duplicate key in object 'dbo.#tempTable'. The duplicate key value is (1).
--,(1, 'first value one more time')
SELECT * FROM #tempTable

SQL Indexing question

I've got a stored procedure which seems to be the bottleneck in my application. The issue is that the tables it operates on are updated very frequently (about once a second, with tens of records), so indexing is not trivial.
It seems that for every X runs of the SP there is one that takes about 1.5 seconds (whereas the others run in about 300-400 ms or less). My understanding is that it's the index tree being updated.
The RDBMS is SQL Server 2008 R2.
Here is the SP:
The PK for the archive and live tables is "pk1" (for example), which is not being used here.
The FK is userid (which is a PK in Table_Users).
INSERT INTO Table_Archive
SELECT userid, longitude, latitude, direction, RcvDate
FROM Table_Live
WHERE userid = @userid
DELETE FROM Table_Live WHERE userid = @userid
-- write down the new location
INSERT INTO
Table_Live (userid, longitude, latitude, direction)
VALUES (@userid, @lon, @lat, @dir)
UPDATE Table_Users
SET location = 'true'
WHERE loginid = (SELECT MAX(loginid) as loginid
FROM Logins
WHERE userid = @userid)
Any idea what could be done to make it run optimally? Preferably it should run under 200ms.
It isn't the index tree being updated: that happens as part of ACID. When the DML completes, all internal structures (which include indexes, constraint checks, foreign key checks, etc.) will be completed too. There is no deferral of such checks in SQL Server.
This is probably statistics update and compile time (plans are invalidated when stats are updated). A statistics update is (IIRC) triggered by 500 rows + 20% changes, so if you are inserting "tens of rows per second" into a table with "thousands" of rows, you'll be triggering statistics refreshes frequently.
My first thought would be to make statistics updates asynchronous; don't disable them.
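That is a database-level setting (the database name here is a placeholder):
ALTER DATABASE YourDatabase SET AUTO_UPDATE_STATISTICS_ASYNC ON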
The only obvious thing would be: are there indexes on loginid and userid in Table_Users and Logins?
Both are used in the WHERE clause of the UPDATE statement, and there's also a MAX() applied to loginid.
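If they are missing, something like this would support that UPDATE (the index names are made up):
CREATE NONCLUSTERED INDEX IX_Logins_userid_loginid ON Logins (userid, loginid)
CREATE NONCLUSTERED INDEX IX_Table_Users_loginid ON Table_Users (loginid)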
Another thing that would help quite a bit: don't actually delete the rows inside your stored proc. That'll save you a lot of time. Try to do the delete asynchronously, separately from your procedure. E.g. write the @userid values into a "command table" and have a SQL job delete those rows, e.g. once an hour or so.
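A rough sketch of that pattern (CleanupQueue and the job body are hypothetical, and it assumes RcvDate defaults to the insert time; the date guard keeps the job from removing the freshly inserted live row):
-- inside the stored procedure, instead of the DELETE:
INSERT INTO CleanupQueue (userid, QueuedAt) VALUES (@userid, GETDATE())

-- in a SQL Agent job that runs periodically:
DELETE L
FROM Table_Live L
INNER JOIN CleanupQueue q
    ON q.userid = L.userid AND L.RcvDate < q.QueuedAt
DELETE FROM CleanupQueue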