SQL Server transaction visibility issue - sql

When I execute the following code (Case 1) I get the value 2 for the count, which means that changes made to the table are visible inside the same transaction. So this behaves the way I expect.
Case 1
begin tran mytran
begin try
    CREATE TABLE [dbo].[ft](
        [ft_ID] [int] IDENTITY(1,1) NOT NULL,
        [ft_Name] [nvarchar](100) NOT NULL,
        CONSTRAINT [PK_FileType] PRIMARY KEY CLUSTERED
        (
            [ft_ID] ASC
        ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]

    INSERT INTO [dbo].[ft]([ft_Name])
    VALUES('xxxx')

    INSERT INTO [dbo].[ft]([ft_Name])
    VALUES('yyyy')

    select count(*) from [dbo].[ft]

    commit tran mytran
end try
begin catch
    rollback tran mytran
end catch
However, when I alter the table (e.g. add a new column within the transaction), the change is not visible to the same transaction (Case 2). Let's assume there is a Products table without a column called ft_ID, and I am adding the column within the same transaction and then reading it.
Case 2
begin tran mytran
begin try
    IF NOT EXISTS (
        SELECT *
        FROM sys.columns
        WHERE object_id = OBJECT_ID(N'dbo.Products')
          AND name = 'ft_ID'
    )
    begin
        alter table dbo.Products
        add ft_ID int null
    end

    select ft_ID from dbo.Products

    commit tran mytran
end try
begin catch
    rollback tran mytran
end catch
When trying to execute Case 2 I get the error "Invalid column name 'ft_ID'" because the newly added column is not visible within the same transaction.
Why does this discrepancy happen? CREATE TABLE (Case 1) is atomic and works the way I expect, but ALTER TABLE is not. Why are changes made within the same transaction not visible to the statements that follow (Case 2)?

You get a compile error. The batch is never launched into execution. See Understanding how SQL Server executes a query. Transaction visibility and boundaries have nothing to do with what you're seeing.
You should always separate DDL and DML into separate requests. Without going into too much detail: due to the way recovery works, mixing DDL and DML in the same transaction is just asking for trouble. Just take my word on this one.
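For example, one way to get Case 2 working is to put the ALTER in its own batch and reference the new column only in a later batch. A rough sketch (assuming the script runs from SSMS or sqlcmd, where GO is the client-side batch separator, and accepting that the two batches are no longer one transaction):

IF NOT EXISTS (
    SELECT * FROM sys.columns
    WHERE object_id = OBJECT_ID(N'dbo.Products') AND name = 'ft_ID'
)
BEGIN
    -- DDL in its own batch
    ALTER TABLE dbo.Products ADD ft_ID int NULL;
END
GO

-- DML in a second batch, compiled only after the column exists
SELECT ft_ID FROM dbo.Products;
GO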

Rules for Using Batches
...
A table cannot be changed and then the new columns referenced in the same batch.
See this
An alternative is to spawn a child batch and reference your new column from there, like:
exec('select ft_ID from dbo.Products')
However, as Remus said, be very careful about mixing schema changes and selects against that schema, especially in the same transaction. Even without a transaction this code has side effects: try wrapping this exec() workaround in a stored procedure and you will get a recompile every time you call it. Tough luck, but it simply works that way.
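For completeness, a minimal sketch of that workaround wrapped in a procedure (the procedure name is made up for illustration; this only shows the shape of the code, and the recompile caveat above still applies):

CREATE PROCEDURE dbo.usp_AddAndReadFtId   -- hypothetical name
AS
BEGIN
    SET NOCOUNT ON;

    IF NOT EXISTS (
        SELECT * FROM sys.columns
        WHERE object_id = OBJECT_ID(N'dbo.Products') AND name = 'ft_ID'
    )
    BEGIN
        ALTER TABLE dbo.Products ADD ft_ID int NULL;
    END

    -- The child batch is compiled only when it runs, so the outer
    -- procedure parses even while the column does not exist yet.
    EXEC('SELECT ft_ID FROM dbo.Products');
END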

Related

Transact-SQL transaction failing without GO

I'm looking at a trivial query and struggling to understand why SQL Server cannot execute it.
Say I have a table
CREATE TABLE [dbo].[t2](
[id] [nvarchar](36) NULL,
[name] [nvarchar](36) NULL
)
And I want to add a new column and set some value to it. So I do the following:
BEGIN TRANSACTION
ALTER TABLE [t2] ADD [name2] [nvarchar](255) NULL
UPDATE [t2] SET [name2] = CONCAT(name, '-XXXX')
COMMIT TRANSACTION
And if I execute the query, I get the error "Invalid column name 'name2'".
I know it is failing because SQL Server processes the batch in a different order for optimization purposes, and one way to fix it would be to separate those two statements with a GO statement. Thus the following query passes without issues.
BEGIN TRANSACTION
ALTER TABLE [t2] ADD [name2] [nvarchar](255) NULL
GO
UPDATE [t2] SET [name2] = CONCAT(name, '-XXXX')
COMMIT TRANSACTION
Actually, not exactly without issues, as I have to use the GO statement, which makes the transaction scope useless, as discussed in this Stack Overflow question.
So I have two questions:
How can I make that script work without using the GO statement?
Why is SQL Server not smart enough to figure out such a trivial case? (more of a rhetorical question)
This is a parser error. When you run a batch it is parsed beforehand; however, only certain DDL operations are "cached" by the parser so that it is aware of them later. CREATE is something it will "cache"; ALTER is not. That is why you can CREATE a table in the same batch and then reference it.
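For illustration, a quick sketch of the difference using a throwaway table (names invented):

-- Works: the parser "knows" about the CREATE later in the same batch
CREATE TABLE dbo.demo (id int NOT NULL);
INSERT INTO dbo.demo (id) VALUES (1);
SELECT id FROM dbo.demo;
GO

-- Fails: the parser does not pick up the ALTER, so the batch dies
-- with "Invalid column name 'id2'" before anything executes
ALTER TABLE dbo.demo ADD id2 int NULL;
SELECT id2 FROM dbo.demo;
GO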
Because you have an ALTER, when the parser parses the batch and gets to the UPDATE statement it fails, and the error you see is raised. One method is to defer the parsing of the statement:
BEGIN TRANSACTION;
ALTER TABLE [t2] ADD [name2] [nvarchar](255) NULL;
EXEC sys.sp_executesql N'UPDATE [t2] SET [name2] = CONCAT(name, N''-XXXX'');';
COMMIT TRANSACTION;
If, however, N'-XXXX' is meant to be the default value, you could qualify that in the DDL statement instead:
BEGIN TRANSACTION;
ALTER TABLE t2 ADD name2 nvarchar(255) NULL DEFAULT N'-XXXX' WITH VALUES;
COMMIT TRANSACTION;

How to get rid of a lock by getting an error instead (or skipping it) in SQL Server

I am trying to avoid getting blocked on a specific query in SQL Server.
Use case - High Load Parallel Processing
I need a way to handle "work queues" using SQL Server and its transactional system, to ensure that the work has been completed (with the integrated rollback of a SQL Server transaction in case of an unhandled failure like an IIS pool crash/recycle or an app crash).
The system has to be able to handle many workers (I call them "WorkerApps") which will have to do some random work ("Work Items") and do parallel processing; one work item should NOT be run twice in any case (even under high load).
I want to get an error (anything, even being chosen as a deadlock victim) or any other way to understand that a row is being used, instead of a real lock that results in blocks/deadlocks, which I really don't want because it would just result in poor performance in my use case.
SQL structure and value initialisation for this example:
CREATE TABLE [worker].[Item](
[ItemId] [bigint] IDENTITY(1,1) NOT NULL,
[Content] [xml] NULL,
[IsRunning] [bit] NOT NULL,
CONSTRAINT [PK_Item] PRIMARY KEY CLUSTERED
(
[ItemId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
ALTER TABLE [worker].[Item] ADD CONSTRAINT [DF_Item_IsRunning] DEFAULT ((0)) FOR [IsRunning]
GO
INSERT INTO worker.Item (IsRunning) VALUES (0)
INSERT INTO worker.Item (IsRunning) VALUES (0)
First script, simulating a long-running piece of work:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
BEGIN TRANSACTION

DECLARE @workId BIGINT;

SELECT TOP 1 @workId = ItemId
FROM worker.Item
WHERE IsRunning = 0;

SELECT @workId;

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

UPDATE worker.Item
SET IsRunning = 1
WHERE ItemId = @workId;

WAITFOR DELAY '00:05:00'
PRINT 'Finished'

COMMIT TRANSACTION
Second, problematic script, run in parallel, which gets blocked (which is what I am trying to avoid):
BEGIN TRANSACTION
SET DEADLOCK_PRIORITY LOW
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

DECLARE @workId BIGINT;

SELECT TOP 1 @workId = ItemId
FROM worker.Item
WHERE IsRunning = 1;

-- Here is where I don't want it to lock
UPDATE worker.Item
SET IsRunning = 0
WHERE ItemId = 1;

SELECT * FROM worker.Item

COMMIT TRANSACTION
What I am trying to achieve
The goal is for the second script to fail instantly (or for me to have a way to know that the update did not complete because of a lock) when it tries to update a record that is being held by the SERIALIZABLE transaction.
Any other solution that protects each work item (in the "IsRunning" state) would be interesting to me. SERIALIZABLE was just an attempt.
[ 1 ] First thing: instead of using a two-step approach (a SELECT TOP(1) ... and then an UPDATE ... plus one transaction), I would use a single UPDATE, thus:
DECLARE @ItemId BIGINT;

UPDATE TOP(1) worker.Item
SET IsRunning = 1,
    @ItemId = ItemId
WHERE IsRunning = 0;

SELECT @ItemId
Notes:
1.1 UPDATE TOP(1) will update at most one row WHERE IsRunning = 0.
1.2 And @ItemId = ItemId (yep, it's possible to do this) will copy the value of the ItemId column into the @ItemId scalar variable (see the sketch after these notes).
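Here is a tiny self-contained sketch of 1.1/1.2 (temp table and names invented just for the demo):

DECLARE @ItemId BIGINT;

CREATE TABLE #demo (ItemId BIGINT IDENTITY(1,1), IsRunning BIT NOT NULL DEFAULT 0);
INSERT INTO #demo (IsRunning) VALUES (0), (0);

UPDATE TOP (1) #demo
SET IsRunning = 1,
    @ItemId = ItemId        -- update the row and capture its key in one statement
WHERE IsRunning = 0;

SELECT @ItemId AS ClaimedItemId;    -- NULL if no row qualified

DROP TABLE #demo;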
[ 2 ] If you want to get an error / exception when the following source code

UPDATE worker.Item
SET IsRunning = 0
WHERE ItemId = 1

is executed while the current row is locked (by another concurrent connection / Tx), then I would use SET LOCK_TIMEOUT: -1 (the default) means wait forever if the row has an incompatible lock granted to a concurrent connection; 0 (non-default) means it will not wait and an error / exception is raised instead:
...
SET LOCK_TIMEOUT -1 -- Default behaviour

SELECT TOP 1 @workId = ItemId
FROM worker.Item
WHERE IsRunning = 1;

-- Here is where I don't want it to lock
SET LOCK_TIMEOUT 0 -- Raise an exception if the row is locked

UPDATE worker.Item
SET IsRunning = 0
WHERE ItemId = 1
...
Result: the second statement will raise error 1222, "Lock request time out period exceeded".
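If you prefer to treat that as a signal rather than a failure, you can catch it; a sketch (1222 is the lock request time out error number):

SET LOCK_TIMEOUT 0;

BEGIN TRY
    UPDATE worker.Item
    SET IsRunning = 0
    WHERE ItemId = 1;
END TRY
BEGIN CATCH
    IF ERROR_NUMBER() = 1222
        PRINT 'Row is locked by another worker - skipping it.';
    ELSE
        THROW;  -- SQL Server 2012+; use RAISERROR on older versions
END CATCH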
[ 3 ] Also, if for the second script the goal is to skip rows locked by other concurrent connections/Tx, then one solution is to use the READPAST (and also ROWLOCK) table hints together with the READ COMMITTED or REPEATABLE READ isolation level, thus:
[3.1]
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
...
SELECT TOP 1 @workId = ItemId
FROM worker.Item WITH(READPAST, ROWLOCK)
WHERE IsRunning = 1
or
[3.2]
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
...
;WITH cte
AS (
    SELECT TOP(1) q.ItemId, q.IsRunning
    FROM worker.Item AS q WITH (ROWLOCK, READPAST)
    WHERE q.IsRunning = 0
    ORDER BY q.ItemId
)
UPDATE cte
SET @workId = ItemId,
    IsRunning = 1;
[ 4 ] Anyway, the original requirement is not clear. If none of the above answers fits, then you should add more info.
[ 5 ] I would use the following approach:
Instead of IsRunning BIT I would use IsProcessed BIT NOT NULL CONSTRAINT DF_Item_IsProcessed DEFAULT(0), and then, within the transaction, I would use the single-step approach from [ 1 ], thus:
SET XACT_ABORT ON
BEGIN TRY
    BEGIN TRAN

    DECLARE @Id INT;

    WITH cte
    AS (
        SELECT TOP(1) q.Id, q.IsProcessed
        FROM Work.Item AS q WITH (ROWLOCK, READPAST)
        WHERE q.IsProcessed = 0
        ORDER BY q.Id
    )
    UPDATE cte
    SET @Id = Id,
        IsProcessed = 1;

    ...
    source code to process item @Id
    ...

    COMMIT
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
    BEGIN
        ROLLBACK
    END
    ... other code for ex/err management ...
END CATCH
This way, every Tx will lock a different row = @Id (READPAST), and if an error happens, XACT_ABORT/CATCH will automatically roll back the current Tx and IsProcessed will return to its initial value of 0. Work.Item should have an index on IsProcessed:
CREATE UNIQUE INDEX IUN_Work_Item_IsProcessed_Id
ON Work.Item (IsProcessed, Id)
Another indexing option is to create a filtered index, if the number of rows with IsProcessed = 0 will be [very] small compared with IsProcessed = 1:
CREATE UNIQUE INDEX IUN_Work_Item_Id_IsProcessed0
ON Work.Item (Id)
INCLUDE (IsProcessed)
WHERE IsProcessed = 0
Note: see the "Required SET Options for Filtered Indexes" section for the proper settings during filtered index creation and DML execution. From my point of view, processed rows should be deleted (see Remus's approach); this way the Work.Item table will remain small.
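For reference, the delete-based variant mentioned above (Remus's queue approach) looks roughly like this; a sketch, assuming rows can be removed once they are claimed:

SET XACT_ABORT ON;
BEGIN TRAN;

DECLARE @item TABLE (ItemId BIGINT, Content XML);

-- Atomically claim and remove one unlocked row; concurrent workers skip it
DELETE TOP (1) FROM worker.Item WITH (ROWLOCK, READPAST)
OUTPUT deleted.ItemId, deleted.Content INTO @item;

-- ... process the claimed item here; any failure before COMMIT rolls the
--     DELETE back and the row becomes available to other workers again ...

COMMIT;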
This is how the SQL engine works: UPDATE statements will always take exclusive locks on the rows they modify (and intent exclusive locks on the table).
The only way a query fails because of locks is if a deadlock happens or, as Bogdan Sahlean stated, with SET LOCK_TIMEOUT 0, but I strongly recommend against that behaviour.
In your scenario a deadlock will not happen, since the UPDATE in the first query has already happened by the time your second query comes in. After the first query finishes, the second query will execute normally (with "just" a long wait before it).
If every process that encountered a lock wait had to fail, your user experience would be very poor: error messages instead of simple slowness.

Execute SQL Asynchronously or change locking from Trigger

I have a complex unit of work from an application that might commit changes to 10-15 tables as a single transaction. The unit of work executes under snapshot isolation.
Some of the tables have a trigger which executes a stored procedure to log messages into a queue. The message contains the table name, key and change type. This is necessary to provide backwards compatibility with SQL 2005; I can't use the built-in queuing.
The problem is I am getting blocking and time-outs in the queue writing stored procedure. I either get a message saying:
Snapshot isolation transaction aborted due to update conflict. You cannot use snapshot isolation to access table 'dbo.tblObjectChanges' directly or indirectly in database
or I get a timeout writing to that table.
Is there a way to change the transaction isolation of the particular call to (or within) the stored procedure that does the message queue writing, from within the trigger? As a last resort, can I make the call to the delete or update parts of the stored procedure run asynchronously?
Here is the SQL for the Stored Procedure:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[usp_NotifyObjectChanges]
    @ObjectType varchar(20),
    @ObjectKey int,
    @Level int,
    @InstanceGUID varchar(50),
    @ChangeType int = 2
AS
SET NOCOUNT ON

DECLARE @ObjectChangeID int

--Clean up any messages older than 10 minutes
DELETE from tblObjectChanges Where CreatedTime < DATEADD(MINUTE, -10, GetDate())

--If the object is already in the queue, change the time and instanceID
SELECT @ObjectChangeID = [ObjectChangeID] FROM tblObjectChanges WHERE [ObjectType] = @ObjectType AND [ObjectKey] = @ObjectKey AND [Level] = @Level

IF @ObjectChangeID IS NOT NULL
BEGIN
    UPDATE [dbo].[tblObjectChanges] SET
        [CreatedTime] = GETDATE(), InstanceGUID = @InstanceGUID
    WHERE
        [ObjectChangeID] = @ObjectChangeID
END
ELSE
BEGIN
    INSERT INTO [dbo].[tblObjectChanges] (
        [CreatedTime],
        [ObjectType],
        [ObjectKey],
        [Level],
        ChangeType,
        InstanceGUID
    ) VALUES (
        GETDATE(),
        @ObjectType,
        @ObjectKey,
        @Level,
        @ChangeType,
        @InstanceGUID
    )
END
Definition of tblObjectChanges:
CREATE TABLE [dbo].[tblObjectChanges](
[CreatedTime] [datetime] NOT NULL,
[ObjectType] [varchar](20) NOT NULL,
[ObjectKey] [int] NOT NULL,
[Rowversion] [timestamp] NOT NULL,
[Level] [int] NOT NULL,
[ObjectChangeID] [int] IDENTITY(1,1) NOT NULL,
[InstanceGUID] [varchar](50) NULL,
[ChangeType] [int] NOT NULL,
CONSTRAINT [PK_tblObjectChanges] PRIMARY KEY CLUSTERED
(
[ObjectChangeID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 80) ON [PRIMARY]
) ON [PRIMARY]
GO
This line is almost certainly your problem:
DELETE from tblObjectChanges Where CreatedTime < DATEADD(MINUTE, -10, GetDate())
There are two BIG problems with this statement. First, according to your table definition, CreatedTime is not indexed. This means that in order to execute this statement, the entire table must be scanned, and that will cause the entire table to be locked for the duration of whatever transaction this happens to be a part of. So put an index on this column.
The second problem, is that even with an index, you really shouldn't be performing operational maintenance tasks like this from within a trigger. Besides slowing down the OLTP transactions that have to execute it, this statement only really needs to be executed once every 5-10 minutes. Instead, you are executing it any time (and every time) any of these tables are modified. That is a lot of additional load that gets worse as your system gets busier.
A better approach would be to take this statement out of the triggers entirely, and instead have a SQL Agent Job that runs every 5-10 minutes to execute this clean-up operation. If you do this along with adding the index, most of your problems should disappear.
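In code that means roughly the following (a sketch; the index name is invented, and the DELETE becomes the body of an Agent job step scheduled every 5-10 minutes):

-- Support the range predicate used by the clean-up
CREATE NONCLUSTERED INDEX IX_tblObjectChanges_CreatedTime
    ON dbo.tblObjectChanges (CreatedTime);
GO

-- Body of the SQL Agent job step
DELETE FROM dbo.tblObjectChanges
WHERE CreatedTime < DATEADD(MINUTE, -10, GETDATE());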
An additional problem is this statement:
SELECT @ObjectChangeID = [ObjectChangeID] FROM tblObjectChanges WHERE [ObjectType] = @ObjectType AND [ObjectKey] = @ObjectKey AND [Level] = @Level
Unlike the first statement above, this statement belongs in the trigger. However, like the first statement, it too will have (and cause) serious performance and locking issues under load, because again, according to your posted table definition, none of the columns being searched are indexed.
The solution again is to put an additional index on these columns as well.
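For example (a sketch, index name invented):

CREATE NONCLUSTERED INDEX IX_tblObjectChanges_Object
    ON dbo.tblObjectChanges (ObjectType, ObjectKey, [Level]);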
A few ideas:
Move the delete into a separate scheduled job if possible
Add an index on CreatedTime
Add an index on ObjectType, ObjectKey, Level
Add WITH(UPDLOCK, ROWLOCK) to the SELECT
Add WITH(ROWLOCK) to the INSERT and the UPDATE
You need to test all of these to see what helps. I would go through them in this order, but see the note below.
Even if you decide against all this, at least leave the WITH(UPDLOCK) on the SELECT, as you otherwise might lose updates.
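Applied to the stored procedure, the hint suggestions above look roughly like this (a sketch of just the affected statements):

SELECT @ObjectChangeID = [ObjectChangeID]
FROM dbo.tblObjectChanges WITH (UPDLOCK, ROWLOCK)
WHERE [ObjectType] = @ObjectType
  AND [ObjectKey] = @ObjectKey
  AND [Level] = @Level;

UPDATE dbo.tblObjectChanges WITH (ROWLOCK)
SET [CreatedTime] = GETDATE(), InstanceGUID = @InstanceGUID
WHERE [ObjectChangeID] = @ObjectChangeID;

INSERT INTO dbo.tblObjectChanges WITH (ROWLOCK)
    ([CreatedTime], [ObjectType], [ObjectKey], [Level], ChangeType, InstanceGUID)
VALUES
    (GETDATE(), @ObjectType, @ObjectKey, @Level, @ChangeType, @InstanceGUID);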

SCOPE_IDENTITY And Instead of Insert Trigger work-around

OK, I have a table with no natural key, only an integer identity column as its primary key. I'd like to insert and retrieve the identity value, but also use a trigger to ensure that certain fields are always set. Originally the design was to use INSTEAD OF INSERT triggers, but that breaks SCOPE_IDENTITY(). The OUTPUT clause on the INSERT statement is also broken by the INSTEAD OF INSERT trigger. So I've come up with an alternate plan and would like to know if there is anything obviously wrong with what I intend to do:
begin contrived example:
CREATE TABLE [dbo].[TestData] (
[TestId] [int] IDENTITY(1,1) PRIMARY KEY NOT NULL,
[Name] [nchar](10) NOT NULL)
CREATE TABLE [dbo].[TestDataModInfo](
[TestId] [int] PRIMARY KEY NOT NULL,
[RowCreateDate] [datetime] NOT NULL)
ALTER TABLE [dbo].[TestDataModInfo] WITH CHECK ADD CONSTRAINT
[FK_TestDataModInfo_TestData] FOREIGN KEY([TestId])
REFERENCES [dbo].[TestData] ([TestId]) ON DELETE CASCADE
CREATE TRIGGER [dbo].[TestData$AfterInsert]
ON [dbo].[TestData]
AFTER INSERT
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
INSERT INTO [dbo].[TestDataModInfo]
([TestId],
[RowCreateDate])
SELECT
[TestId],
current_timestamp
FROM inserted
-- Insert statements for trigger here
END
End contrived example.
No, I'm not doing this for one little date field - it's just an example.
The fields that I want to ensure are set have been moved to a separate table (in TestDataModInfo) and the trigger ensures that it's updated. This works, it allows me to use scope_identity() after inserts, and appears to be safe (if my after trigger fails, my insert fails). Is this bad design, and if so, why?
As you mentioned, SCOPE_IDENTITY is designed for this situation. It's not affected by AFTER trigger code, unlike @@IDENTITY.
Apart from using stored procs, this is OK.
I use AFTER triggers for auditing because they are convenient... that is, write to another table in my trigger.
Edit: SCOPE_IDENTITY and parallelism in SQL Server 2005 can have a problem.
Have you tried using OUTPUT to get the value back instead?
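Something like the following; a sketch (note that because the table has an enabled trigger, OUTPUT has to write INTO a table variable rather than return rows directly):

DECLARE @new TABLE (TestId int);

INSERT INTO dbo.TestData ([Name])
OUTPUT inserted.TestId INTO @new
VALUES (N'example');

SELECT TestId FROM @new;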
Have you tried using:
SELECT scope_identity();
http://wiki.alphasoftware.com/Scope_Identity+in+SQL+Server+with+nested+and+INSTEAD+OF+triggers
You can use an INSTEAD OF trigger just fine, by capturing the value in the trigger just after the insert to the main table, then spoofing the Scope_Identity() value into @@IDENTITY at the end of the trigger:
-- Inside of trigger
SET NOCOUNT ON;
DECLARE @YourTableID int;

INSERT dbo.YourTable VALUES(blah, blah, blah);
SET @YourTableID = Scope_Identity();

-- ... other DML that inserts to another identity-bearing table

-- Last statement in trigger
SELECT YourTableID INTO #Trash FROM dbo.YourTable WHERE YourTableID = @YourTableID;
Or, here's an alternate final statement that doesn't use any reads, but may cause permission issues if the executing user doesn't have rights (though there are solutions to this).
DECLARE @SQL varchar(300);
SET @SQL =
    'SELECT identity(smallint, ' + Str(@YourTableID) + ', 1) YourTableID INTO #Trash';
EXEC (@SQL);
Note that Scope_Identity() may return NULL on a table with an INSTEAD OF trigger on it in some cases, even if you use this spoofing method. But you can at least get the value using @@IDENTITY. This can make MS Access ADP projects start working right again after breaking because you put a trigger on a table that the front end inserts to.
Also, be aware that any parallelism at all can make @@IDENTITY and Scope_Identity() return incorrect values, so use OPTION (MAXDOP 1) or TOP 1 or a single-row VALUES clause to defeat this problem.
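For example, on an affected build (a sketch; the staging table is hypothetical):

INSERT INTO dbo.TestData ([Name])
SELECT [Name] FROM dbo.SomeStagingTable  -- hypothetical source
OPTION (MAXDOP 1);                       -- force a serial plan

SELECT SCOPE_IDENTITY();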

Atomic Upgrade Scripts

With my database upgrade scripts, I typically just have one long script that makes the necessary changes for that database version. However, if one statement fails halfway through the script, it leaves the database in an inconsistent state.
How can I make the entire upgrade script one atomic operation? I've tried just wrapping all of the statements in a transaction, but that does not work. Even with SET XACT_ABORT ON, if one statement fails and rolls back the transaction, the rest of the statements just keep going. I would like a solution that doesn't require me to write IF @@TRANCOUNT > 0... before each and every statement. For example:
SET XACT_ABORT ON;
GO
BEGIN TRANSACTION;
GO
CREATE TABLE dbo.Customer
(
CustomerID int NOT NULL
, CustomerName varchar(100) NOT NULL
);
GO
CREATE TABLE [dbo].[Order]
(
OrderID int NOT NULL
, OrderDesc varchar(100) NOT NULL
);
GO
/* This causes error and should terminate entire script. */
ALTER TABLE dbo.Order2 ADD
A int;
GO
CREATE TABLE dbo.CustomerOrder
(
CustomerID int NOT NULL
, OrderID int NOT NULL
);
GO
COMMIT TRANSACTION;
GO
The way Red-Gate and other comparison tools work is exactly as you describe... they check @@ERROR and @@TRANCOUNT after every statement, jam the result into a #temp table, and at the end they check the #temp table. If any errors occurred they roll back the transaction, otherwise they commit. I'm sure you could alter whatever tool generates your change scripts to add this kind of logic. (Or, instead of re-inventing the wheel, you could use a tool that already creates atomic scripts for you.)
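Stripped down, the shape of such a generated script is roughly this (a sketch; the #tmpErrors table and the two checks after each statement are the pattern, your actual DDL goes in between):

CREATE TABLE #tmpErrors (Error int);
GO
BEGIN TRANSACTION;
GO
CREATE TABLE dbo.Customer (CustomerID int NOT NULL, CustomerName varchar(100) NOT NULL);
GO
IF @@ERROR <> 0 AND @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
GO
IF @@TRANCOUNT = 0 BEGIN INSERT INTO #tmpErrors (Error) SELECT 1; BEGIN TRANSACTION; END
GO
ALTER TABLE dbo.Order2 ADD A int;   -- fails: dbo.Order2 does not exist
GO
IF @@ERROR <> 0 AND @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
GO
IF @@TRANCOUNT = 0 BEGIN INSERT INTO #tmpErrors (Error) SELECT 1; BEGIN TRANSACTION; END
GO
-- ... repeat the two checks after every statement ...
IF EXISTS (SELECT * FROM #tmpErrors) ROLLBACK TRANSACTION;
GO
IF @@TRANCOUNT > 0 COMMIT TRANSACTION;
GO
DROP TABLE #tmpErrors;
GO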
Something like:
BEGIN TRY
    ...
END TRY
BEGIN CATCH
    ROLLBACK TRAN
END CATCH
http://msdn.microsoft.com/en-us/library/ms175976.aspx
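Applied to the example script, that looks roughly like this (a sketch: the statements have to live in a single batch, i.e. no GO separators, and THROW needs SQL Server 2012+, use RAISERROR on older versions):

SET XACT_ABORT ON;

BEGIN TRY
    BEGIN TRANSACTION;

    CREATE TABLE dbo.Customer
    (
          CustomerID int NOT NULL
        , CustomerName varchar(100) NOT NULL
    );

    CREATE TABLE dbo.[Order]
    (
          OrderID int NOT NULL
        , OrderDesc varchar(100) NOT NULL
    );

    /* This fails at run time and control jumps to the CATCH block;
       even if an error were to bypass the CATCH, XACT_ABORT dooms the
       transaction so nothing is left half-applied. */
    ALTER TABLE dbo.Order2 ADD A int;

    CREATE TABLE dbo.CustomerOrder
    (
          CustomerID int NOT NULL
        , OrderID int NOT NULL
    );

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;
    THROW;  -- surface the original error to the caller
END CATCH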