I am trying to convert my DELETE statements to TRUNCATE, following "How to delete large data of table in SQL without log?".
Here is what I am trying:
-- Move recent records from Main table to a Temp table
-- TRUNCATE the Main table
-- Return back data from Temp table to Main table
During this period, I want to block any INSERT/UPDATE/DELETE statements from running on my Main table (until the TRUNCATE statement has run), because if I allow them we might lose some data during the TRUNCATE.
The TRUNCATE statement acquires a SCH-M lock, that is, a schema modification lock:
Second type of the lock is schema modification lock – SCH-M. This lock
type is acquired by sessions that are altering the metadata and live
for duration of transaction. This lock can be described as
super-exclusive lock and it’s incompatible with any other lock types
including intent locks
Locking in Microsoft SQL Server (Part 13 – Schema locks)
During this time, UPDATE, SELECT and DELETE statements will wait for the truncate operation. As a result, CRUD operations are blocked automatically until the TRUNCATE statement completes.
Below is an example script that reduces logging in the FULL recovery model using SWITCH and TRUNCATE. SWITCH is a fast, metadata-only operation. For larger tables (more than 128 extents, i.e. 8 MB), the space deallocation performed by TRUNCATE is done by an asynchronous background thread, so it is also fast and reduces logging greatly compared to DELETE.
A transaction is used to ensure all-or-none behavior and a schema modification lock is held for the duration of the transaction to quiesce data modifications during the process.
Below is the transaction log space used before and after the process by the example with 1M rows initially and 50K retained:
+--------+---------------+--------------------+
| | Log Size (MB) | Log Space Used (%) |
+--------+---------------+--------------------+
| Before | 1671.992 | 27.50415 |
| After | 1671.992 | 30.65533 |
+--------+---------------+--------------------+
Test setup:
--example main table
CREATE TABLE dbo.Main(
MainID int NOT NULL CONSTRAINT PK_Main PRIMARY KEY
, MainData char(1000) NOT NULL
);
--staging table with same schema and indexes as main table
CREATE TABLE dbo.MainStaging(
MainID int NOT NULL CONSTRAINT PK_MainStaging PRIMARY KEY
, MainData char(1000) NOT NULL
);
--load 1M rows into main table for testing
WITH
t10 AS (SELECT n FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) t(n))
,t1k AS (SELECT 0 AS n FROM t10 AS a CROSS JOIN t10 AS b CROSS JOIN t10 AS c)
,t1g AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS num FROM t1k AS a CROSS JOIN t1k AS b CROSS JOIN t1k AS c)
INSERT INTO dbo.Main WITH(TABLOCKX) (MainID, MainData)
SELECT num, CAST(num AS char(1000))
FROM t1g
WHERE num <= 1000000;
GO
Example script:
SET XACT_ABORT ON; --ensures transaction is rolled back immediately even if script is cancelled
BEGIN TRY
BEGIN TRAN;
--truncate in same transaction so entire script can be safely rerun
TRUNCATE TABLE dbo.MainStaging;
--ALTER TABLE will block other activity until committed due to schema modification lock
--main table will be empty after switch
ALTER TABLE dbo.Main SWITCH TO dbo.MainStaging;
--keep 5% of rows
INSERT INTO dbo.Main WITH(TABLOCKX) (MainID, MainData)
SELECT MainID, MainData
FROM dbo.MainStaging
WHERE MainID > 950000;
COMMIT;
END TRY
BEGIN CATCH
IF @@TRANCOUNT > 0 ROLLBACK;
THROW;
END CATCH;
GO
Try to use transaction:
BEGIN TRANSACTION
SELECT TOP 1 *
FROM table_name
WITH (TABLOCK, HOLDLOCK)
-- do your stuff
COMMIT
I am designing an incremental update process for a cloud based database (Azure). The only existing changelog is a .txt file that records every insert, delete, and update statement that the database processes. There is no change data capture table available, or any database table that records changes and I cannot enable watermarking on the database. The .txt file is structured as follows:
update [table] set x = 'data' where y = 'data'
go
insert into [table] values (data)
go
delete from [table] where x = data
go
I have built my process to convert the .txt file into a table in the cloud as follows:
update_id | db_operation | statement | user | processed_flag
----------|--------------|-------------------------------------------------|-------|---------------
1 | 'update' | 'update [table] set x = data where y = data' | user1 | 0
2 | 'insert' | 'insert into [table] values (data)' | user2 | 0
3 | 'delete' | 'delete from [table] where x = data' | user3 | 1
I use this code to create a temporary table of the unprocessed transactions, then loop over the table, building and executing one SQL statement per row.
CREATE TABLE temp_incremental_updates
WITH
(
DISTRIBUTION = HASH ( [user] ),
HEAP
)
AS
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS Sequence,
[user],
[statement]
FROM upd.incremental_updates
WHERE processed_flag = 0;
DECLARE @nbr_statements INT = (SELECT COUNT(*) FROM temp_incremental_updates),
@i INT = 1;
WHILE @i <= @nbr_statements
BEGIN
DECLARE @sql_code NVARCHAR(4000) = (SELECT [statement] FROM temp_incremental_updates WHERE Sequence = @i);
EXEC sp_executesql @sql_code;
SET @i +=1;
END
DROP TABLE temp_incremental_updates;
UPDATE incremental_updates SET processed_flag = 1
This is taking a very long time, upwards of an hour. Is there a different way I can quickly process multiple SQL statements that need to occur in a specific order? Order is relevant because, for example, if I try to process a delete statement before the insert statement that created that data, Azure Synapse will throw an error.
Less than 2 hours for 20k individual statements is pretty good for Synapse!
Synapse isn't meant to do transactional processing. You need to convert individual updates to batch updates and execute statements like MERGE for big batches of rows instead of INSERT, UPDATE and DELETE for each row.
In your situation, you could:
Group all inserts/updates by table name
Create a temp table for each group. E.g. table1_insert_updates
Run a MERGE-like statement from table1_insert_updates to table1.
For deletes:
Group primary keys by table name
Run one DELETE FROM table1 where key in (primary keys) per table.
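As a rough illustration of the grouping steps above, here is a minimal Python sketch that classifies changelog statements by table and operation before they are loaded. The regular expression, the function names and the "upsert"/"delete" bucket labels are assumptions made for this example; they only cover statements shaped like the ones in the question.

```python
import re
from collections import defaultdict

# Hypothetical pattern: matches only the three statement shapes shown in the question.
STATEMENT_RE = re.compile(
    r"^\s*(?:(update)\s+(\S+)|(insert)\s+into\s+(\S+)|(delete)\s+from\s+(\S+))",
    re.IGNORECASE,
)


def classify(statement):
    """Return (operation, table) for one changelog statement."""
    match = STATEMENT_RE.match(statement)
    if match is None:
        raise ValueError("unrecognised statement: " + statement)
    # Exactly one alternative matched, so two groups are non-empty.
    operation, table = [g for g in match.groups() if g is not None]
    return operation.lower(), table.strip("[]").lower()


def group_statements(statements):
    """Group statements per table into 'upsert' and 'delete' buckets,
    keeping the original sequence number of each statement."""
    groups = defaultdict(list)
    for seq, statement in enumerate(statements, start=1):
        operation, table = classify(statement)
        bucket = "delete" if operation == "delete" else "upsert"
        groups[(table, bucket)].append((seq, statement))
    return dict(groups)


log = [
    "update [t1] set x = 1 where y = 2",
    "insert into [t1] values (3)",
    "delete from [t2] where x = 4",
]
grouped = group_statements(log)
```

Each bucket could then be loaded into its own staging table and applied with one set-based statement per table. The sequence numbers are kept so that ordering conflicts between buckets can still be resolved.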
Frankly, 20k is an awkward number: it's not small, yet far from big enough. So even after "grouping" you could still have performance issues if your batch/group sizes are too small.
Synapse isn't meant for transaction processing. It'll merge a table with a million rows into a table with a billion rows in less than 5 minutes using a single MERGE statement to upsert a million rows, but if you run 1000 delete and 1000 insert statements one after the other it'll probably take longer!
EDIT: You'll also have to use PARTITION BY and RANK (or ROW_NUMBER) to de-duplicate in case there are multiple updates to the same row in a single batch. This is not easy: depending on what your input looks like (whether each update contains all columns, even unchanged ones, or only the changed columns), it might become very complicated.
Again Synapse is not meant for transaction processing.
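The de-duplication mentioned in the EDIT can be sketched in a few lines of Python. This assumes each change has already been parsed into a (sequence, key, payload) tuple and that every update carries the full row; both are assumptions for the example, not something the changelog format guarantees. It mirrors the effect of keeping the row where ROW_NUMBER() over (PARTITION BY key ORDER BY sequence DESC) equals 1.

```python
def latest_per_key(changes):
    """Keep only the most recent change for each primary key.

    `changes` is an iterable of (sequence, key, payload) tuples.
    """
    latest = {}
    for seq, key, payload in changes:
        # Replace the stored change only when this one is newer.
        if key not in latest or seq > latest[key][0]:
            latest[key] = (seq, key, payload)
    # Return the survivors in their original processing order.
    return sorted(latest.values(), key=lambda change: change[0])


changes = [
    (1, "a", {"x": 1}),
    (2, "b", {"x": 2}),
    (3, "a", {"x": 9}),  # supersedes sequence 1 for key "a"
]
deduplicated = latest_per_key(changes)
```

If updates contain only the changed columns, the payloads would have to be merged per key instead of replaced, which is the complication the answer warns about.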
Try to declare a cursor for selecting all the data from temp_incremental_updates at once, instead of making multiple reads:
CREATE TABLE temp_incremental_updates
WITH
(
DISTRIBUTION = HASH ( [user] ),
HEAP
)
AS
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS Sequence,
[user],
[statement]
FROM upd.incremental_updates
WHERE processed_flag = 0;
DECLARE @sql_code NVARCHAR(4000);
DECLARE cur CURSOR FOR SELECT [statement] FROM temp_incremental_updates ORDER BY Sequence;
OPEN cur;
FETCH NEXT FROM cur INTO @sql_code;
WHILE @@FETCH_STATUS = 0 BEGIN
EXEC sp_executesql @sql_code;
FETCH NEXT FROM cur INTO @sql_code;
END
-- release the cursor once all statements have run
CLOSE cur;
DEALLOCATE cur;
-- Rest of the code
I have a table named a that has 26 rows:
Select * from a
Output is 26
Begin Tran
Truncate table a
Select * from a
Output is zero rows
Rollback Tran
Select * from a
Output is again 26 rows
TRUNCATE is a DDL command, and I thought DDL commands cannot be rolled back. Why do we get the same number of rows back after the rollback?
Please explain in detail.
Yes, a TRUNCATE can be rolled back in a transaction in SQL Server. There are actually only a few things that can't be rolled back with a transaction in SQL Server. For example, you can even roll back other DDL statements (such as the DROP and CREATE below):
USE Sandbox;
GO
CREATE TABLE dbo.Table1 (I int);
CREATE TABLE dbo.Table2 (I int);
GO
WITH N AS (
SELECT N
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N(N))
INSERT INTO dbo.Table1
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS I
FROM N N1, N N2, N N3, N N4;
WITH N AS (
SELECT N
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N(N))
INSERT INTO dbo.Table2
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS I
FROM N N1, N N2, N N3, N N4;
GO
BEGIN TRANSACTION SampleTran;
TRUNCATE TABLE dbo.Table1;
CREATE TABLE dbo.Table3 (I int);
INSERT INTO dbo.Table3
SELECT I
FROM dbo.Table2;
DROP TABLE dbo.Table2;
ROLLBACK TRANSACTION SampleTran;
GO
--Contains 10,000 rows
SELECT *
FROM dbo.Table1;
GO
--Still exists
SELECT *
FROM dbo.Table2;
GO
--Doesn't exist
SELECT *
FROM dbo.Table3;
GO
--Clean up
DROP TABLE dbo.Table1;
DROP TABLE dbo.Table2;
Despite "intellisense" probably telling you that dbo.Table2 doesn't exist in the lower batches, it does, as that transaction was rolled back. (Intellisense will also think that dbo.Table3 still exists, which it will not.)
Unlike the myth that people seem to believe, TRUNCATE is logged. Unlike DELETE, however, TRUNCATE deletes the pages the data is stored on, not the individual rows. A log of what pages are deleted is still written, so a TRUNCATE can still be rolled back, as the deletion of those pages is simply not committed.
I think the misunderstanding comes from Oracle, where TRUNCATE can't be rolled back.
In SQL Server, TRUNCATE just removes all rows of the table more efficiently.
TRUNCATE does not delete the actual data, it just de-allocates the pages. The de-allocations are logged, which enables the transaction to be rolled back by simply re-allocating the original pages, with the data intact :-)
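To make the idea concrete, here is a toy Python model of a table whose truncate only logs page de-allocations. It is a deliberately simplified sketch: the class, its attributes and the log format are invented for illustration and do not reflect SQL Server's real storage engine or log records.

```python
class ToyTable:
    """Toy model: rows live on pages, and only allocated pages are visible."""

    def __init__(self, pages):
        # pages: mapping of page_id -> list of rows stored on that page
        self.pages = {page_id: list(rows) for page_id, rows in pages.items()}
        self.allocated = set(self.pages)
        self.log = []

    def rows(self):
        """Rows visible to queries, i.e. rows on allocated pages."""
        return [row for page_id in sorted(self.allocated)
                for row in self.pages[page_id]]

    def truncate(self):
        # One log record per page, not per row; page contents are untouched.
        for page_id in sorted(self.allocated):
            self.log.append(("dealloc", page_id))
        self.allocated.clear()

    def rollback(self):
        # Undo the log in reverse order by re-allocating the same pages.
        while self.log:
            action, page_id = self.log.pop()
            if action == "dealloc":
                self.allocated.add(page_id)


table = ToyTable({1: ["a", "b"], 2: ["c"]})
table.truncate()
rows_after_truncate = table.rows()
log_records = len(table.log)
table.rollback()
rows_after_rollback = table.rows()
```

In this model the truncate writes two log records for three rows, and the rollback makes all rows visible again without copying any data back.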
I am trying to insert 1,500,000 records into a table and am facing table lock issues during the insertion, so I came up with the batch insert below.
DECLARE @BatchSize INT = 50000
WHILE 1 = 1
BEGIN
INSERT INTO [dbo].[Destination]
(proj_details_sid,
period_sid,
sales,
units)
SELECT TOP(@BatchSize) s.proj_details_sid,
s.period_sid,
s.sales,
s.units
FROM [dbo].[SOURCE] s
WHERE NOT EXISTS (SELECT 1
FROM dbo.Destination d
WHERE d.proj_details_sid = s.proj_details_sid
AND d.period_sid = s.period_sid)
IF @@ROWCOUNT < @BatchSize
BREAK
END
I have a clustered index on the Destination table (proj_details_sid, period_sid). The NOT EXISTS part is just there to stop already inserted records from being inserted again.
Am I doing it right? Will this avoid the table lock, or is there a better way?
Note : Time taken is more or less same with batch and without batch insert
Lock escalation is not likely to be related to the SELECT part of your statement at all.
It is a natural consequence of inserting a large number of rows.
Lock escalation is triggered when lock escalation is not disabled on the table by using the ALTER TABLE SET LOCK_ESCALATION option, and when either of the following conditions exists:
A single Transact-SQL statement acquires at least 5,000 locks on a single nonpartitioned table or index.
A single Transact-SQL statement acquires at least 5,000 locks on a single partition of a partitioned table and the ALTER TABLE SET LOCK_ESCALATION option is set to AUTO.
The number of locks in an instance of the Database Engine exceeds memory or configuration thresholds.
If locks cannot be escalated because of lock conflicts, the Database Engine periodically triggers lock escalation at every 1,250 new locks acquired.
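The arithmetic in the quoted thresholds can be sketched as follows. This is only an illustration of the documented numbers (a first attempt at 5,000 locks, then a retry every 1,250 new locks while attempts keep failing); the function and parameter names are made up, and real behaviour also depends on memory and configuration thresholds.

```python
def escalation_attempts(locks_acquired, first_threshold=5000, retry_interval=1250):
    """Lock counts at which one statement would attempt escalation,
    assuming every attempt fails because of a lock conflict."""
    if locks_acquired < first_threshold:
        return []
    return list(range(first_threshold, locks_acquired + 1, retry_interval))


attempts_small = escalation_attempts(4999)
attempts_large = escalation_attempts(7500)
```

Under these assumptions a statement that takes 7,500 locks attempts escalation at 5,000, 6,250 and 7,500 locks, while one that stays below 5,000 never attempts it.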
You can easily see this for yourself by tracing the lock escalation event in Profiler or simply trying the below with different batch sizes. For me TOP (6228) shows 6250 locks held but with TOP (6229) it suddenly plummets to 1 as lock escalation kicks in. The exact numbers may vary (dependent on database settings and resources currently available). Use trial and error to find the threshold where lock escalation appears for you.
CREATE TABLE [dbo].[Destination]
(
proj_details_sid INT,
period_sid INT,
sales INT,
units INT
)
BEGIN TRAN --So locks are held for us to count in the next statement
INSERT INTO [dbo].[Destination]
SELECT TOP (6229) 1,
1,
1,
1
FROM master..spt_values v1,
master..spt_values v2
SELECT COUNT(*)
FROM sys.dm_tran_locks
WHERE request_session_id = @@SPID;
COMMIT
DROP TABLE [dbo].[Destination]
You are inserting 50,000 rows so almost certainly lock escalation will be attempted.
The article How to resolve blocking problems that are caused by lock escalation in SQL Server is quite old but a lot of the suggestions are still valid.
Break up large batch operations into several smaller operations (i.e. use a smaller batch size)
Lock escalation cannot occur if a different SPID is currently holding an incompatible table lock - The example they give is a different session executing
BEGIN TRAN
SELECT * FROM mytable (UPDLOCK, HOLDLOCK) WHERE 1=0
WAITFOR DELAY '1:00:00'
COMMIT TRAN
Disable lock escalation by enabling trace flag 1211 - However this is a global setting and can cause severe issues. There is a newer option 1224 that is less problematic but this is still global.
Another option would be to ALTER TABLE blah SET (LOCK_ESCALATION = DISABLE) but this is still not very targeted as it affects all queries against the table not just your single scenario here.
So I would opt for option 1 or possibly option 2 and discount the others.
Instead of checking whether the data exists in Destination on every batch, it seems better to store all the data in a temp table first and then batch insert into Destination.
Reference: Using ROWLOCK in an INSERT statement (SQL Server)
DECLARE @batch int = 100
DECLARE @curRecord int = 1
DECLARE @maxRecord int
-- remove (nolock) if you don't want to have dirty read
SELECT row_number() over (order by s.proj_details_sid, s.period_sid) as rownum,
s.proj_details_sid,
s.period_sid,
s.sales,
s.units
INTO #Temp
FROM [dbo].[SOURCE] s WITH (NOLOCK)
WHERE NOT EXISTS (SELECT 1
FROM dbo.Destination d WITH (NOLOCK)
WHERE d.proj_details_sid = s.proj_details_sid
AND d.period_sid = s.period_sid)
-- change this maxRecord if you want to limit the records to insert
SELECT @maxRecord = count(1) from #Temp
WHILE @maxRecord >= @curRecord
BEGIN
INSERT INTO [dbo].[Destination]
(proj_details_sid,
period_sid,
sales,
units)
SELECT proj_details_sid, period_sid, sales, units
FROM #Temp
WHERE rownum >= @curRecord and rownum < @curRecord + @batch
SET @curRecord = @curRecord + @batch
END
DROP TABLE #Temp
I added (NOLOCK) to your destination table: dbo.Destination(NOLOCK).
Now the NOT EXISTS check will not take shared locks on the table (the INSERT itself still acquires its own locks).
WHILE 1 = 1
BEGIN
INSERT INTO [dbo].[Destination]
(proj_details_sid,
period_sid,
sales,
units)
SELECT TOP(@BatchSize) s.proj_details_sid,
s.period_sid,
s.sales,
s.units
FROM [dbo].[SOURCE] s
WHERE NOT EXISTS (SELECT 1
FROM dbo.Destination(NOLOCK) d
WHERE d.proj_details_sid = s.proj_details_sid
AND d.period_sid = s.period_sid)
IF @@ROWCOUNT < @BatchSize
BREAK
END
To do this you can use WITH (NOLOCK) in your select statement.
BUT NOLOCK is not recommended on OLTP Databases.
I have a table which has more than 1 million records. I have created a stored procedure to insert data into that table. Before inserting the data I need to truncate the table, but the truncate is taking too long.
I have read that TRUNCATE can take a long time if the table is being used by someone else or if locks are held on it, but here I am the only user and I have applied no locks.
Also no other transactions are open when I tried to truncate the table.
As my database is on SQL Azure I am not supposed to drop the indexes as it does not allow me to insert the data without an index.
Drop all the indexes from the table and then truncate. If you then want to insert data, insert it first and recreate the indexes afterwards.
When deleting from Azure you can get into all sorts of trouble, but truncate is almost always an issue of locking. If you can't fix that you can always do this trick when deleting from Azure.
declare @iDeleteCounter int = 1
while @iDeleteCounter > 0
begin
begin transaction deletes;
with deleteTable as
(
select top 100000 * from mytable where mywhere
)
delete from deleteTable
-- capture the row count right after the DELETE, before COMMIT resets it
set @iDeleteCounter = @@ROWCOUNT
commit transaction deletes
print 'deleted ' + cast(@iDeleteCounter as varchar(10)) + ' rows from table'
end
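The same batching pattern, shown as a small runnable Python sketch. It uses the standard library's sqlite3 module with an in-memory database purely so the example is self-contained; it is not Azure SQL, and the table layout, the `flagged` column and the function name are assumptions for the example. The loop stops when a batch deletes no rows, so no separate COUNT query is needed.

```python
import sqlite3


def delete_in_batches(conn, batch_size):
    """Delete flagged rows in small committed batches; return rows deleted."""
    total = 0
    while True:
        cursor = conn.execute(
            "DELETE FROM mytable WHERE rowid IN ("
            " SELECT rowid FROM mytable WHERE flagged = 1 LIMIT ?)",
            (batch_size,),
        )
        # Commit per batch so each transaction stays short.
        conn.commit()
        if cursor.rowcount == 0:
            break
        total += cursor.rowcount
    return total


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER PRIMARY KEY, flagged INTEGER NOT NULL)"
)
# 1,000 rows, of which the 500 odd ids are flagged for deletion.
conn.executemany(
    "INSERT INTO mytable (id, flagged) VALUES (?, ?)",
    [(i, i % 2) for i in range(1, 1001)],
)
conn.commit()

deleted = delete_in_batches(conn, batch_size=100)
remaining = conn.execute("SELECT COUNT(*) FROM mytable").fetchone()[0]
```

Keeping each batch in its own short transaction is the point of the trick: locks are released between batches and the log does not have to hold one huge delete.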
If I run a SQL statement like
UPDATE table
SET col = value
WHERE X=Y
And no rows match, therefore no rows are changed, are any locks created by the update?
The DBMS is Sybase + SQL Server
You can play with this script and see for yourself that sometimes locks are acquired and held even when no rows are updated:
CREATE TABLE dbo.Test
(
i INT NOT NULL
PRIMARY KEY ,
j INT NULL
) ;
go
INSERT dbo.Test
( i, j )
VALUES ( 1, 2 ) ;
GO
SELECT @@spid ;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE ;
BEGIN TRANSACTION ;
UPDATE dbo.Test
SET j = 3
WHERE i = 3 ;
SELECT *
FROM sys.dm_tran_locks
WHERE request_session_id = @@spid;
COMMIT ;
If field x is indexed, then there will probably be a shared lock on that index while your UPDATE is checking it for matching records.
There should not be any row locks, but all locking behavior is contingent on your server-level isolation settings.
When an UPDATE statement does not affect any rows, an intent exclusive (IX) lock is still taken on the table for the duration of the transaction: the engine first has to locate the rows to change before updating them, and it requests the intent lock even though no rows end up being modified.