I created this pretty basic stored procedure that gets called by our CMS when a user creates a specific type of item. However, it looks like there are times when we get two rows for a single CMS item, with the same data but an off-by-one SourceID. I don't do much SQL work, so this might be something basic, but do I need to explicitly lock the table somehow in the stored procedure to keep this from happening?
Here is the stored procedure code:
BEGIN
    SET @newid = (SELECT MAX(SourceID) + 1 FROM [dbo].[sourcecode])

    IF NOT EXISTS (SELECT SourceId FROM [dbo].[sourcecode] WHERE SourceId = @newid)
        INSERT INTO [dbo].[sourcecode]
        (
            SourceID,
            Description,
            RunCounts,
            ShowOnReport,
            SourceParentID,
            ApprovedSource,
            Created
        )
        VALUES
        (
            @newid,
            @Desc,
            1,
            @ShowOnReport,
            1,
            1,
            GETDATE()
        )

    RETURN @newid
END
and here is an example of the duplicated data (less a couple of irrelevant columns):
SourceId Description Created
676 some text 2012-10-17 09:42:36.553
677 some text 2012-10-17 09:43:01.380
I am sure this has nothing to do with the stored procedure. As Oded mentioned, this could be the result of your code.
I don't see anything in the stored procedure which is capable of generating duplicates.
Also, I wouldn't use MAX(SourceId) + 1. Why don't you use "Auto Increment" (IDENTITY) if you want a new SourceId every time anyway?
As has been said in the comments, I think your issue is more in the code layer; none of the data seems to be violating any constraints. You may want to check whether the same user has submitted the same data "recently" before performing the insert.
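As a rough sketch of that "recently submitted" check (the one-minute window and the use of the Description column as the duplicate key are assumptions, not part of the question):

IF NOT EXISTS (
    SELECT 1
    FROM dbo.sourcecode
    WHERE Description = @Desc
      AND Created > DATEADD(MINUTE, -1, GETDATE())
)
BEGIN
    -- safe to perform the INSERT from the question here
END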
You can use locking when using stored procedures. In the ones I write, I usually use WITH (ROWLOCK). Locking is used to ensure data integrity, and a simple Google search should bring up lots of information about why you should be using locking.
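For illustration only, this is what a WITH (ROWLOCK) hint would look like on the question's INSERT; note that a row-level hint by itself is not guaranteed to prevent the duplicate described above:

INSERT INTO dbo.sourcecode WITH (ROWLOCK)
    (SourceID, Description, RunCounts, ShowOnReport, SourceParentID, ApprovedSource, Created)
VALUES
    (@newid, @Desc, 1, @ShowOnReport, 1, 1, GETDATE());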
But as other commenters have said, see if there isn't something in your code as well. Is something calling the same method twice? Are there 'events' referencing the method that is doing the updating?
The description is probably duplicated because you are calling the same function twice, by clicking the button twice, or whatever.
You should use an IDENTITY on your SourceID column and use the SCOPE_IDENTITY() function.
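A minimal sketch of that approach, assuming SourceID is redefined as an IDENTITY column (the procedure name and parameter types here are hypothetical):

CREATE PROCEDURE dbo.InsertSourceCode
    @Desc         varchar(200),
    @ShowOnReport bit
AS
BEGIN
    SET NOCOUNT ON;

    -- SourceID is no longer supplied; the IDENTITY generates it atomically.
    INSERT INTO dbo.sourcecode
        (Description, RunCounts, ShowOnReport, SourceParentID, ApprovedSource, Created)
    VALUES
        (@Desc, 1, @ShowOnReport, 1, 1, GETDATE());

    -- SCOPE_IDENTITY() returns the identity value generated by this scope's INSERT.
    RETURN CAST(SCOPE_IDENTITY() AS int);
END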
If you don't want to do that for some reason, then you should wrap the original MAX(SourceID) + 1 code in a transaction with the isolation level set to Serializable:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRAN
SET @newid = ....
COMMIT
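For illustration, a fuller sketch of that approach using the question's own MAX(SourceID) + 1 logic (the procedure name and parameter types are assumptions):

CREATE PROCEDURE dbo.InsertSourceCodeSerializable
    @Desc         varchar(200),
    @ShowOnReport bit
AS
BEGIN
    SET NOCOUNT ON;
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

    DECLARE @newid int;

    BEGIN TRAN;
        -- Under SERIALIZABLE the range read is protected until COMMIT, so two
        -- concurrent callers cannot both insert the same new SourceID (one will
        -- be blocked or chosen as a deadlock victim).
        SET @newid = (SELECT MAX(SourceID) + 1 FROM dbo.sourcecode);

        INSERT INTO dbo.sourcecode
            (SourceID, Description, RunCounts, ShowOnReport, SourceParentID, ApprovedSource, Created)
        VALUES
            (@newid, @Desc, 1, @ShowOnReport, 1, 1, GETDATE());
    COMMIT TRAN;

    RETURN @newid;
END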
I have an update trigger in SQL Server, and I want to remove this trigger and perform the update operation with a stored procedure instead of the trigger. But I have an UPDATE(end_date) check in the update trigger.
What can I use instead of the UPDATE(end_date) check below? How can I compare the old and new end_dates efficiently in a stored procedure?
Update trigger
ALTER TRIGGER [dbo].[trig_tbl_personnel_car_update]
ON [dbo].[tbl_personnel_cars]
FOR UPDATE
AS
IF (UPDATE(end_date))
UPDATE pc
SET pc.owner_changed = 1
FROM tbl_personnel_cars pc, inserted i
WHERE pc.pk_id = i.pk_id
Sample updated script in stored procedure
ALTER PROCEDURE [dbo].[personnel_car_update]
(@PkId INT)
UPDATE tbl_personnel_cars
SET end_date = GETDATE()
WHERE pk_id = #PkId
I update the tbl_personnel_cars table inside many stored procedures like this. How can I update this table the way the trigger does, but without the update trigger?
I tried the code below to get the old and new end_dates, but I couldn't get it to work.
Sample updated script in stored procedure:
ALTER PROCEDURE [dbo].[personnel_car_update]
(@PkId INT)
UPDATE tbl_personnel_cars
SET end_date = GETDATE()
WHERE pk_id = #PkId
EXEC update_operation_sp_instead_trigger @PkId
ALTER PROCEDURE [dbo].[update_operation_sp_instead_trigger]
(@PkId INT)
UPDATE pc
SET pc.owner_changed = 1
FROM tbl_personnel_cars pc
JOIN tbl_personnel_cars pc2 ON pc.pk_id = pc2.pk_id
WHERE pc.end_date <> pc2.end_date
And one last question: is it a correct choice to use a stored procedure instead of a trigger where the table is updated?
Firstly, I want to clarify a misunderstanding you appear to have about the UPDATE function in triggers. UPDATE returns a boolean result based on whether the column inside the function was assigned a value in the SET clause of the UPDATE statement. It does not check whether that value changed. This is both a documented feature and stated to be by design.
This means that if you had a TRIGGER with UPDATE(SomeColumn) the function would return TRUE for both of these statements, even though no data was changed:
UPDATE dbo.SomeTable
SET SomeColumn = SomeColumn;
UPDATE ST
SET SomeColumn = NULL
FROM dbo.SomeTable ST
WHERE SomeColumn IS NULL;
If, within a TRIGGER, you need to check whether a value has changed, you need to reference the inserted and deleted pseudo-tables. For non-NULLable columns, equality (=) can be checked; however, for NULLable columns you'll also need to check whether the column changed from/to NULL. In the latest version of the database engine (at the time of writing), IS DISTINCT FROM makes this far easier.
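For example, a sketch of that pattern applied to the question's tables: the trigger's owner_changed update, but firing only when end_date really changed (the trigger name here is hypothetical, and end_date is assumed to be NULLable):

CREATE TRIGGER dbo.trig_tbl_personnel_car_update_on_change
ON dbo.tbl_personnel_cars
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    UPDATE pc
    SET    pc.owner_changed = 1
    FROM   dbo.tbl_personnel_cars pc
    JOIN   inserted i ON i.pk_id = pc.pk_id
    JOIN   deleted  d ON d.pk_id = i.pk_id
    WHERE  i.end_date <> d.end_date
        OR (i.end_date IS NULL AND d.end_date IS NOT NULL)
        OR (i.end_date IS NOT NULL AND d.end_date IS NULL);
END;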
Now onto the problem you are actually trying to solve. It looks like you are, in truth, overcomplicating this. Firstly, you are setting the value to GETDATE(), so it is almost certainly impossible that the column will be set to the same value it is already set to; you have a 1/300-second window in which to run the same UPDATE twice, and once you add I/O operations, connection timing, etc., hitting that window twice is basically impossible.
For what you want, just UPDATE both columns in your procedure's definition:
ALTER PROCEDURE [dbo].[personnel_car_update] @PkId int AS --The original had a trailing comma, which is invalid syntax. The parentheses are also not needed; SPs aren't functions. You were also missing the AS.
BEGIN
SET NOCOUNT ON;
UPDATE dbo.tbl_personnel_cars --Always schema qualify
SET end_date = GETDATE(),
owner_changed = 1
WHERE pk_id = @PkId;
END;
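It could then be called like this, for example (the value 1 is just illustrative):

EXEC dbo.personnel_car_update @PkId = 1;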
Larnu gave you a great answer about the stored procedure logic, so I want to answer your question about "Is it a correct choice to use stored procedure instead of trigger where the table is updated?"
The upsides of DML triggers are the following, in my opinion:
When a lot of places manipulate a table and some common logic (like audit / logging) needs to run together with every such manipulation, a trigger can solve it nicely because you don't have to repeat your code in a lot of places (see the sketch after this list).
Triggers can prevent "stupid" actions like DELETEs / UPDATEs without a WHERE by performing some specific validation, etc. They can also make sure you are setting all mandatory fields (for example, a change date) when performing ad hoc updates.
They can simplify quick and dirty patches, since you can concentrate your logic in one place instead of in X number of procedures or in compiled code.
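As a minimal sketch of the audit / logging point from the first bullet (the audit table and its columns are hypothetical):

CREATE TRIGGER dbo.trig_tbl_personnel_cars_audit
ON dbo.tbl_personnel_cars
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- One audit row per updated car, no matter which procedure or ad hoc
    -- statement performed the update.
    INSERT INTO dbo.tbl_personnel_cars_audit (pk_id, end_date, audited_at)
    SELECT i.pk_id, i.end_date, GETDATE()
    FROM   inserted i;
END;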
The downsides of triggers are performance in some cases, as well as some more specific problems, like the OUTPUT clause not working on tables with triggers, and more complicated maintenance.
It's seldom that you can't solve these issues with non-trigger solutions, so I'd say if your shop already uses triggers, fine, but if it doesn't, there's seldom a really good reason to start either.
I have a stored procedure that runs the inputs through a series of validation queries and kicks out a parameter detailing whether the record will be saved with "complete" or "incomplete" information. Then we start a transaction, save the data to a couple of tables, commit. Done.
Essentially this:
EXEC dbo.ValidationProcedure --(sets output params)
BEGIN TRANSACTION
MERGE TABLE A (uses output params)
MERGE TABLE B
COMMIT TRANSACTION
In addition to this, we have a view that contains the same validation queries, written so that it returns all occurrences of invalid data within our tables; the queries in the procedure version merely check the inputs provided (both the validation procedure and the view look at data for both TABLE A and B). This view returns the exact error code, message, record ID, etc., and we use it in audit queries, to get error messages when displaying the input form, yada yada yada....
My issue is that we have the same logic in multiple places, which I hate. My question, and potential solution to this: would it be bad practice/poor design to remove the validation procedure altogether and do the following:
BEGIN TRANSACTION
MERGE TABLE A
MERGE TABLE B
IF EXISTS (SELECT * FROM ValidationView T WHERE T.ID = @ID)
BEGIN
UPDATE TABLE A SET Incomplete = 1 WHERE [ID] = @ID;
END
COMMIT TRANSACTION
I thought of doing it this way a while ago, but I was not fond of affecting the same row twice. It seemed wasteful, unnecessary, and an incorrect way to go about it; I'm hoping I am wrong about this. But now I'm having second thoughts and would like to know if we can unload some code overhead by going with the second example, or whether we should stay the course and maintain both the validation procedure and the validation view.
I'm using SQL Server 2008 R2.
I have a view; let's call it view1. This view is complex and slow. It cannot be made into an indexed view because it uses left joins and various other trickery. As such, we created a stored procedure which basically:
obtains an exclusive lock
selects * into computed_view1_tmp from view1; (slow)
creates indexes on the above computed table (slow)
renames computed_view1 to computed_view1_todelete; and does the same for its indexes (assumed fast)
renames computed_view1_tmp to computed_view1; and does the same for its indexes (assumed fast)
drops the table computed_view1_todelete (slow)
releases the lock.
We run this procedure when we know we're changing the data in our web application. We then have other views, such as view2 using computed_view1 instead of view1.
Once in a while, we get:
Invalid object name 'dbo.computed_view1'. Could not use view or
function 'dbo.view2' because of binding errors.
I assume this is because we're trying to access dbo.computed_view1 at the same time as it's being renamed. I assume this is a very short period, but the frequency I am seeing this error in my logs makes me wonder if something else might be at play. I'm getting the error many times per day on a site with about a dozen users active throughout the day.
In development, this procedure takes about five seconds given the amount of data in the view. Renaming is instantaneous. In production, it must be taking longer but I don't understand why. I once saw the procedure fail to obtain the exclusive lock within 90 seconds.
Any thoughts on how to fix or a better solution?
Edit: Extra notes on my locking - maybe I'm not doing this right:
BEGIN TRANSACTION
DECLARE @result int
EXEC @result = sp_getapplock @Resource = 'lock_computed_view1', @LockMode = 'Exclusive', @LockTimeout = 90
IF @result NOT IN ( 0, 1 ) -- Only successful return codes
BEGIN
    PRINT @result
    RAISERROR ( 'Lock failed to acquire...', 16, 1 )
END
ELSE
BEGIN
    -- rest of the magic
END
EXEC @result = sp_releaseapplock @Resource = 'lock_computed_view1'
COMMIT TRANSACTION
If your locking and transaction scope are right, I would expect other transactions to wait and never see the view missing. This might be a SQL Server idiosyncrasy that I don't know about.
It is often possible to do without dynamic DDL. Here are two ways to do it:
TRUNCATE the computed table and insert into it. This takes an exclusive lock automatically. No need to rename. All of this is atomic and supports rollback.
Use a staging table with the same schema. Work on that. So far, no service interruption at all. Then switch the staging table with the production table using ALTER TABLE ... SWITCH (see the sketch below). This is quick and atomic, and it does not require Enterprise Edition.
With these approaches the problem is solved by just not renaming.
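A sketch of that second option, assuming dbo.computed_view1_staging and dbo.computed_view1_old already exist with exactly the same columns and indexes as dbo.computed_view1 (SWITCH requires identical schemas and an empty target table; the table names are hypothetical):

BEGIN TRANSACTION;

    TRUNCATE TABLE dbo.computed_view1_old;  -- the SWITCH target must be empty

    -- Both statements are metadata-only operations, so the swap is effectively instant.
    ALTER TABLE dbo.computed_view1         SWITCH TO dbo.computed_view1_old;
    ALTER TABLE dbo.computed_view1_staging SWITCH TO dbo.computed_view1;

COMMIT TRANSACTION;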
I need to provide 4 MySQL stored procedures for each table in a database. They are for get, update, insert and delete.
"Get", "delete" and "insert" are straightforward. The problem is "update", because I don't know which parameters will be set and which ones not. Some parameters could be set to NULL, and other simply won't change so they won't be provided.
As I'm already working with XML, after some searching in Google I've found that it is possible to use a function called UpdateXML, but the examples are too complex and some articles are from 2007. So I don't know if there is a better technique at this moment, or something easier.
Any comment, documentation, link, or article about something you've used and are happy with will be much appreciated :D
Cheers.
Usually when you have data from a row in your database in the front-end, you should have all of the values that you might use to update that row in the database. You should pass all of those values into your update, regardless of whether or not they have actually changed. Otherwise, your database doesn't really know whether it's getting a NULL value for a column because that's what it's supposed to be or because you just didn't pass the real value along.
If you are going to have areas of the application where you don't need certain columns from a table, then it's possible to set up additional stored procedures that do not use those columns. It's often easier though to just retrieve all of the columns from the database when you fill your front-end object. The overhead of the extra columns is usually minimal and worth the saved maintenance of multiple update stored procedures.
Here's an example. It's MS SQL Server syntax, so you may have to alter it slightly, but hopefully it illustrates the idea:
CREATE PROCEDURE Update_My_Table
    @my_table_id INT,
    @name VARCHAR(40),
    @description VARCHAR(500),
    @some_other_col INT
AS
BEGIN
    UPDATE
        My_Table
    SET
        name = @name,
        description = @description,
        some_other_col = @some_other_col
    WHERE
        my_table_id = @my_table_id
END

CREATE PROCEDURE Update_My_Table_Limited
    @my_table_id INT,
    @name VARCHAR(40),
    @description VARCHAR(500)
AS
BEGIN
    UPDATE
        My_Table
    SET
        name = @name,
        description = @description
    WHERE
        my_table_id = @my_table_id
END
As you can see, just eliminate those columns that you're not updating from the UPDATE statement. Just don't go overboard and try to have a stored procedure for every possible combination of columns that you might want to update. It's much easier to just get the extra columns from the DB when you select from the table in the first place. You'll end up passing the same value back and your server will wind up updating the column with the same exact value, but that's not a big deal. You can code your front end to make sure that at least one column has changed before it will actually try to update anything in the database.
(Note: this is for MS SQL Server)
Say you have a table ABC with a primary key identity column, and a CODE column. We want every row in here to have a unique, sequentially-generated code (based on some typical check-digit formula).
Say you have another table DEF with only one row, which stores the next available CODE (imagine a simple autonumber).
I know logic like below would present a race condition, in which two users could end up with the same CODE:
1) Run a select query to grab next available code from DEF
2) Insert said code into table ABC
3) Increment the value in DEF so it's not re-used.
I know that two users could get stuck at step 1) and end up with the same CODE in the ABC table.
What is the best way to deal with this situation? I thought I could just wrap a "begin tran" / "commit tran" around this logic, but I don't think that worked. I had a stored procedure like this to test it, but I didn't avoid the race condition when I ran it from two different windows in Management Studio:
begin tran
declare @x int
select @x = nextcode from def
waitfor delay '00:00:15'
update def set nextcode = nextcode + 1
select @x
commit tran
Can someone shed some light on this? I thought the transaction would prevent another user from being able to access my NextCodeTable until the first transaction completed, but I guess my understanding of transactions is flawed.
EDIT: I tried moving the wait to after the "update" statement, and I got two different codes... but I suspected that. I have the waitfor statement there to simulate a delay so the race condition can be easily seen. I think the key problem is my incorrect perception of how transactions work.
Set the Transaction Isolation Level to Serializable.
At lower isolation levels, other transactions can read the data in a row that has been read (but not yet modified) in this transaction. So two transactions can indeed read the same value. At very low isolation (Read Uncommitted), other transactions can even read data after it's been modified (but before it's committed)...
Review details about SQL Server Isolation Levels here
So the bottom line is that the isolation level is the critical piece here to control what level of access other transactions get into this one.
NOTE. From the link, about Serializable
Statements cannot read data that has been modified but not yet committed by other transactions.
This is because the locks are placed when the row is modified, not when the BEGIN TRAN occurs. So what you have done may still allow another transaction to read the old value up until the point where you modify it. So I would change the logic to modify the value in the same statement that reads it, thereby putting the lock on it at the same time.
begin tran
declare @x int
update def set @x = nextcode, nextcode += 1
waitfor delay '00:00:15'
select @x
commit tran
As other responders have mentioned, you can set the transaction isolation level to ensure that anything you 'read' using a SELECT statement cannot change within a transaction.
Alternatively, you could take out a lock specifically on the DEF table by adding the syntax WITH HOLDLOCK after the table name, e.g.,
SELECT nextcode FROM DEF WITH HOLDLOCK
It doesn't make much difference here, as your transaction is small, but it can be useful to take out locks for some SELECTs and not others within a transaction. It's a question of 'repeatability versus concurrency'.
A couple of relevant MS SQL docs:
Isolation levels
Table hints
Late answer. You want to avoid a race condition...
"SQL Server Process Queue Race Condition"
Recap:
You began a transaction. This doesn't actually "do" anything in and of itself; it modifies subsequent behavior.
You read data from a table. The default isolation level is Read Committed, so the shared lock taken by this SELECT is released as soon as the read completes; it does not protect the value for the rest of the transaction.
You then wait 15 seconds
You then issue an update. With the declared transaction, this will generate a lock until the transaction is committed.
You then commit the transaction, releasing the lock.
So, guessing you ran this simultaneously in two windows (A and B):
A read the "next" value from table def, then went into wait mode
B read the same "next" value from the table, then went into wait mode. (Since A only did a read, the transaction did not lock anything.)
A then updated the table, and probably committed the change before B exited the wait state.
B then updated the table, after A's write was committed.
Try putting the wait statement after the update, before the commit, and see what happens.
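For reference, the rearranged test suggested here would look like this (the same statements as the question's procedure, only with the WAITFOR moved after the UPDATE so that the delay happens while the exclusive lock is held):

begin tran
declare @x int
select @x = nextcode from def
update def set nextcode = nextcode + 1
waitfor delay '00:00:15'
select @x
commit tran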
It's not a real race condition. It's more a common problem with concurrent transactions. One solution is to set a read lock on the table and therefore have serialization in place.
This is actually a common problem in SQL databases, and that is why most (all?) of them have some built-in features to take care of the issue of obtaining a unique identifier. Here are some things to look into if you are using MySQL or Postgres. If you are using a different database, I bet they provide something very similar.
A good example of this is postgres sequences which you can check out here:
Postgres Sequences
MySQL uses something called auto-increment.
Mysql auto increment
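Since the question notes it is for MS SQL Server, the closest built-in equivalent there (besides an IDENTITY column) is a SEQUENCE object, available from SQL Server 2012 onward. A sketch, with a hypothetical sequence name, ignoring the check-digit formatting mentioned in the question, and assuming CODE can take a plain integer value:

-- Create once
CREATE SEQUENCE dbo.NextCodeSequence
    AS int
    START WITH 1
    INCREMENT BY 1;

-- Each caller gets a distinct value, even under concurrency.
DECLARE @x int;
SET @x = NEXT VALUE FOR dbo.NextCodeSequence;
INSERT INTO dbo.ABC (CODE) VALUES (@x);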
You can set the column to a computed value that is persisted. This will take care of the race condition.
Persisted Computed Columns
NOTE
Using this method means you do not need to store the next code in a table. The code column becomes the reference point.
Implementation
Give the column the following properties under computed column specification.
Formula = dbo.GetNextCode()
Is Persisted = Yes
Create Function dbo.GetNextCode()
Returns VarChar(10)
As
Begin
    Declare @Return VarChar(10);
    Declare @MaxId Int;

    Select @MaxId = Max(Id)
    From Table;

    Select @Return = Code
    From Table
    Where Id = @MaxId;

    /* Generate New Code ... */

    Return @Return;
End