I have two SQL Server tables. Table1 is a live table that stores live events from users; the other table, Table2, stores these events as historical data for archival purposes.
When an INSERT happens on Table1, a trigger INSERTs that data into Table2 and DELETEs old rows from both tables, using different time intervals.
When the trigger only INSERTs into Table2 and DELETEs from Table1, everything works fine, but when it gets to deleting the old rows from Table2, I get a deadlock.
Can anyone give me some hints as to why this can happen and what the cause might be? (I'm a newbie, sorry.)
CREATE TRIGGER trigger_events ON dbo.Table1
FOR INSERT
AS
INSERT INTO dbo.Table2 (id, event_time, resource_type)
SELECT id, event_time, resource_type FROM inserted WITH (NOLOCK);

DECLARE @done_main bit = 0;
WHILE (@done_main = 0)
BEGIN
    DELETE TOP (1000) FROM dbo.Table1
    WHERE EVENT_TIME < CAST(DATEDIFF(second, '1970-01-01 00:00:00', DATEADD(HOUR, -1, GETDATE())) AS bigint) * 1000;
    IF @@ROWCOUNT < 1000 SET @done_main = 1;
END;

DECLARE @done bit = 0;
WHILE (@done = 0)
BEGIN
    DELETE TOP (1000) FROM dbo.Table2
    WHERE EVENT_TIME < CAST(DATEDIFF(second, '1970-01-01 00:00:00', DATEADD(MONTH, -1, GETDATE())) AS bigint) * 1000;
    IF @@ROWCOUNT < 1000 SET @done = 1;
END;
GO
Related
I am trying to batch insert rows from one table to another.
DECLARE @batch INT = 10000;

WHILE @batch > 0
BEGIN
    BEGIN TRANSACTION

    INSERT INTO table2
    SELECT TOP (@batch) *
    FROM table1

    SET @batch = @@ROWCOUNT

    COMMIT TRANSACTION
END
It runs on the first 10,000 and inserts them. Then I get the error message "Cannot insert duplicate key", meaning it's trying to insert the same primary keys, so I assume it's repeating the same batch. What logic am I missing here to loop through the batches? Probably something simple, but I can't figure it out.
Can anyone help? Thanks.
Your code keeps inserting the same rows. You can avoid it by "paginating" your inserts:
DECLARE @batch INT = 10000;
DECLARE @page INT = 0;
DECLARE @lastCount INT = 1;

WHILE @lastCount > 0
BEGIN
    BEGIN TRANSACTION

    INSERT INTO table2
    SELECT col1, col2, ... -- list columns explicitly
    FROM (SELECT ROW_NUMBER() OVER (ORDER BY YourPrimaryKey) AS RowNum, *
          FROM table1) AS RowConstrainedResult
    WHERE RowNum >= (@page * @batch) AND RowNum < ((@page + 1) * @batch)

    SET @lastCount = @@ROWCOUNT
    SET @page = @page + 1

    COMMIT TRANSACTION
END
You need some way to eliminate existing rows. You seem to have a primary key, so:
INSERT INTO table2
SELECT TOP (@batch) *
FROM table1 t1
WHERE NOT EXISTS (SELECT 1 FROM table2 t2 WHERE t2.id = t1.id);
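Dropped into the loop from the question, that looks like this (a sketch, assuming id is table1's primary key and both tables share the same column list):

DECLARE @batch INT = 10000;

WHILE @batch > 0
BEGIN
    BEGIN TRANSACTION

    INSERT INTO table2
    SELECT TOP (@batch) *
    FROM table1 t1
    WHERE NOT EXISTS (SELECT 1 FROM table2 t2 WHERE t2.id = t1.id)

    SET @batch = @@ROWCOUNT -- 0 once every row has been copied

    COMMIT TRANSACTION
END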
I have multiple tables with millions of rows in them. To be safe and not overflow the transaction log, I am deleting them in batches of 100,000 rows at a time. I have to first filter based on date, then delete all rows older than a certain date.
To do this, I am creating a table in my stored procedure which holds the IDs of the rows that need to be deleted:
I then insert into that table and delete the rows from the desired table using loops. This seems to run successfully, but it is extremely slow. Is this being done correctly? Is this the fastest way to do it?
DECLARE @FILL_ID_TABLE TABLE (
    FILL_ID varchar(16)
)

DECLARE @TODAYS_DATE date
SELECT @TODAYS_DATE = GETDATE()

--This deletes all data older than 2 weeks ago from today
DECLARE @_DATE date
SET @_DATE = DATEADD(WEEK, -2, @TODAYS_DATE)

DECLARE @BatchSize int
SELECT @BatchSize = 100000

BEGIN TRAN FUTURE_TRAN
BEGIN TRY
    INSERT INTO @FILL_ID_TABLE
    SELECT DISTINCT ID
    FROM dbo.ID_TABLE
    WHERE CREATED < @_DATE

    SELECT @BatchSize = 100000

    WHILE @BatchSize <> 0
    BEGIN
        DELETE TOP (@BatchSize) FROM TABLE1
        OUTPUT DELETED.* INTO dbo.TABLE1_ARCHIVE
        WHERE ID IN (SELECT ROLLUP_ID FROM @FILL_ID_TABLE)

        SET @BatchSize = @@ROWCOUNT
    END

    SELECT @BatchSize = 100000

    WHILE @BatchSize <> 0
    BEGIN
        DELETE TOP (@BatchSize) FROM TABLE2
        OUTPUT DELETED.* INTO dbo.TABLE2_ARCHIVE
        WHERE ID IN (SELECT FILL_ID FROM @FILL_ID_TABLE)

        SET @BatchSize = @@ROWCOUNT
    END

    PRINT 'Succeed'
    COMMIT TRANSACTION FUTURE_TRAN
END TRY
BEGIN CATCH
    PRINT 'Failed'
    ROLLBACK TRANSACTION FUTURE_TRAN
END CATCH
Try a join instead of a subquery:
DELETE TOP (@BatchSize) T1
OUTPUT DELETED.* INTO dbo.TABLE1_ARCHIVE
FROM TABLE1 AS T1
JOIN @FILL_ID_TABLE AS FIL ON FIL.ROLLUP_ID = T1.Id
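If it is still slow after that, one further tweak worth trying (my suggestion, an assumption rather than part of the answer above): declare the table variable with a primary key, so the join has an index to seek on:

DECLARE @FILL_ID_TABLE TABLE (
    FILL_ID varchar(16) PRIMARY KEY -- unique and indexed, so the join can seek instead of scan
)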
Fellow Techies--
I've got an endless loop condition happening here. Why is @@ROWCOUNT never getting set back to 0? I must not be understanding what @@ROWCOUNT really does, or I am setting the value in the wrong place. I think the value should be decrementing on each pass until I eventually hit zero.
DECLARE @ChunkSize int = 250000;

WHILE @ChunkSize <> 0
BEGIN
    BEGIN TRANSACTION
        INSERT TableName
        (col1, col2)
        SELECT TOP (@ChunkSize)
            col1, col2
        FROM TableName2
    COMMIT TRANSACTION;
    SET @ChunkSize = @@ROWCOUNT
END -- while-loop block
I'm not sure, from what you posted, how you are going to ensure you catch rows that you haven't already inserted; if you don't, it'll be an infinite loop, of course. Here is a way using test data, but naturally you'd want to base it off a PK or other unique column. Perhaps you just left that part off, or I'm missing something altogether. I'm just interested in what your final code for the chunking is and the logic behind it, so this is both an answer and an inquiry.
if object_id('tempdb..#source') is not null drop table #source
if object_id('tempdb..#destination') is not null drop table #destination
create table #source(c1 int, c2 int)
create table #destination (c1 int, c2 int)
insert into #source (c1,c2) values
(1,1),
(2,1),
(3,1),
(4,1),
(5,1),
(6,1),
(7,1),
(8,1),
(9,1),
(10,1),
(11,1),
(12,1)
DECLARE @ChunkSize int = 2;
WHILE @ChunkSize <> 0
BEGIN
    INSERT INTO #destination (c1, c2)
    SELECT TOP (@ChunkSize) c1, c2 FROM #source WHERE c1 NOT IN (SELECT DISTINCT c1 FROM #destination) ORDER BY ROW_NUMBER() OVER (ORDER BY c1)
    SET @ChunkSize = @@ROWCOUNT
    --SELECT @ChunkSize
END
select * from #source
select * from #destination
Nothing is happening because you're setting @ChunkSize to itself without ever looking at what you've already inserted. Using your example, @ChunkSize = 250000, so the first pass performs SELECT TOP (250000) and returns (presumably) 250000 rows. You then use @@ROWCOUNT to update @ChunkSize, but the row count returned will be 250000, so you just set it to 250000 again. Which could be fine, except that number will never change unless you rule out rows you've already inserted - you will keep inserting the same 250000 rows over and over.
You need something like NOT EXISTS to filter out the rows you've already inserted:
DECLARE @ChunkSize int = 250000;

WHILE @ChunkSize > 0
BEGIN
    BEGIN TRANSACTION
        INSERT INTO TableName
        (col1, col2)
        SELECT TOP (@ChunkSize)
            col1, col2
        FROM TableName2 T2
        WHERE NOT EXISTS (SELECT *
                          FROM TableName T
                          WHERE T.Col1 = T2.Col1
                            AND T.Col2 = T2.Col2)
        SET @ChunkSize = @@ROWCOUNT
        PRINT CONVERT(nvarchar(10), @ChunkSize) + ' Rows Inserted.';
    COMMIT TRANSACTION
END -- while-loop block
Implemented solution
In the end, I decided to pump the SQL through SSIS, where I could set the commit batch size accordingly. Had I not chosen that route, I would have had to follow @scsimon's suggestion and basically maintain a tracking table for the records completed and the records left to cycle through.
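For reference, a minimal sketch of that tracking approach (my reconstruction, not @scsimon's exact code; it assumes TableName2 has a unique ascending key Id):

DECLARE @LastId int = 0;
DECLARE @ChunkSize int = 250000;
DECLARE @Batch TABLE (Id int PRIMARY KEY, col1 int, col2 int);

WHILE 1 = 1
BEGIN
    DELETE FROM @Batch;

    -- grab the next chunk above the high-water mark
    INSERT INTO @Batch (Id, col1, col2)
    SELECT TOP (@ChunkSize) Id, col1, col2
    FROM TableName2
    WHERE Id > @LastId
    ORDER BY Id;

    IF @@ROWCOUNT = 0 BREAK;

    INSERT INTO TableName (col1, col2)
    SELECT col1, col2 FROM @Batch;

    SELECT @LastId = MAX(Id) FROM @Batch; -- advance the mark past the copied rows
END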
Browsing various examples of how to create a "good" UPSERT statement, I have created the following code (I have changed the column names):
BEGIN TRANSACTION
IF EXISTS (SELECT *
FROM Table1 WITH (UPDLOCK, SERIALIZABLE), Table2
WHERE Table1.Data3 = Table2.data3)
BEGIN
UPDATE Table1
SET Table1.someColumn = Table2.someColumn,
Table1.DateData2 = GETDATE()
FROM Table1
INNER JOIN Table2 ON Table1.Data3 = Table2.data3
END
ELSE
BEGIN
INSERT INTO Table1 (DataComment, Data1, Data2, Data3, Data4, DateData1, DateData2)
SELECT
'some comment', data1, data2, data3, data4, GETDATE(), GETDATE()
FROM
Table2
END
COMMIT TRANSACTION
My problem is that it never does the INSERT part; the current script only does the UPDATE part. The INSERT alone works fine.
My guess is that the INSERT only works if it can insert everything the SELECT finds? Otherwise it won't run. If so, how can I improve it?
I have also read about the MERGE clause and would like to avoid it.
//EDIT:
After trying out a few samples found on the internet and explained here, I re-did my logic as follows:
BEGIN TRANSACTION
BEGIN
UPDATE table1
SET something
WHERE condition is met
UPDATE table2
SET helpColumn = 'DONE'
WHERE condition is met
END
BEGIN
INSERT INTO table1(data)
SELECT data
FROM table2
WHERE helpColumn != 'DONE'
END
COMMIT TRANSACTION
When trying other solutions, the INSERT usually failed or ran for a long time (acceptable on a few tables, but not if you plan to migrate entire data sets from one database to another).
It's probably not the best solution, I think, but for now it works. Any comments?
Instead of

if (something)
    update query
else
    insert query

structure your logic like this:

update yourTable
etc
where some condition is met

insert into yourTable
etc
select etc
where some condition is met
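Applied to the tables in the question, that structure might look like this (a sketch; the column names are the ones the poster used above):

BEGIN TRANSACTION

UPDATE Table1
SET Table1.someColumn = Table2.someColumn,
    Table1.DateData2 = GETDATE()
FROM Table1
INNER JOIN Table2 ON Table1.Data3 = Table2.data3

INSERT INTO Table1 (DataComment, Data1, Data2, Data3, Data4, DateData1, DateData2)
SELECT 'some comment', data1, data2, data3, data4, GETDATE(), GETDATE()
FROM Table2
WHERE NOT EXISTS (SELECT 1 FROM Table1 WHERE Table1.Data3 = Table2.data3)

COMMIT TRANSACTION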
You cannot check this in general the way you are doing it. You have to check each ID from Table2: if it exists in Table1, update Table1; otherwise insert into Table1. This can be done in the following way.
We are going to iterate over Table2, one ID at a time, using a CURSOR in SQL:
DECLARE @ID int

DECLARE mycursor CURSOR FORWARD_ONLY
FOR
SELECT ID FROM Table2 --Any unique column

OPEN mycursor

FETCH NEXT FROM mycursor
INTO @ID

WHILE @@FETCH_STATUS = 0
BEGIN
    IF EXISTS (SELECT 1 FROM Table1 WHERE ID = @ID)
        UPDATE t1
        SET t1.data = t2.data --And whatever else you want to update
        FROM Table1 t1
        INNER JOIN Table2 t2 ON t1.ID = t2.ID --Joining column
        WHERE t1.ID = @ID
    ELSE
        INSERT INTO Table1
        SELECT * FROM Table2 WHERE ID = @ID

    FETCH NEXT FROM mycursor
    INTO @ID
END

CLOSE mycursor
DEALLOCATE mycursor
I need to write a T-SQL stored procedure that updates a row in a table. If the row doesn't exist, it should insert it, with all of these steps wrapped in a transaction.
This is for a booking system, so it must be atomic and reliable. It must return TRUE if the transaction was committed and the flight booked.
I'm not sure how to use @@ROWCOUNT. This is what I've written so far. Am I on the right road?
-- BEGIN TRANSACTION (HOW TO DO?)
UPDATE Bookings
SET TicketsBooked = TicketsBooked + @TicketsToBook
WHERE FlightId = @Id AND TicketsMax < (TicketsBooked + @TicketsToBook)

-- Here I need to insert only if the row doesn't exist.
-- If the row exists but the TicketsMax condition is violated, I must not insert
-- the row and must return FALSE
IF @@ROWCOUNT = 0
BEGIN
    INSERT INTO Bookings ... (omitted)
END
-- END TRANSACTION (HOW TO DO?)
-- Return TRUE (How to do?)
I assume a single row for each flight? If so:
IF EXISTS (SELECT * FROM Bookings WHERE FlightId = @Id)
BEGIN
    --UPDATE HERE
END
ELSE
BEGIN
    -- INSERT HERE
END
I make that assumption explicit because your way of doing things can overbook a flight: it will insert a new row when there are 10 tickets max and you are booking 20.
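Filled in, with the overbooking guard in the UPDATE and a flag to answer the "return TRUE" part of the question (a sketch; @TicketsMax in the INSERT branch is an assumed parameter):

DECLARE @Booked bit = 0;

BEGIN TRANSACTION

IF EXISTS (SELECT * FROM Bookings WHERE FlightId = @Id)
BEGIN
    UPDATE Bookings
    SET TicketsBooked = TicketsBooked + @TicketsToBook
    WHERE FlightId = @Id
      AND TicketsBooked + @TicketsToBook <= TicketsMax -- don't overbook

    IF @@ROWCOUNT > 0 SET @Booked = 1 -- read the count before any other statement changes it
END
ELSE
BEGIN
    INSERT INTO Bookings (FlightId, TicketsMax, TicketsBooked)
    VALUES (@Id, @TicketsMax, @TicketsToBook)

    SET @Booked = 1
END

COMMIT TRANSACTION

SELECT @Booked AS Booked -- 1 = booked, 0 = the flight would have been overbooked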
Take a look at the MERGE command: you can do UPDATE, INSERT, and DELETE in one statement.
Here is a working implementation using MERGE - it checks whether the flight is full before doing an update, else it does an insert.
if exists(select 1 from INFORMATION_SCHEMA.TABLES T
where T.TABLE_NAME = 'Bookings')
begin
drop table Bookings
end
GO
create table Bookings(
FlightID int identity(1, 1) primary key,
TicketsMax int not null,
TicketsBooked int not null
)
GO
insert Bookings(TicketsMax, TicketsBooked) select 1, 0
insert Bookings(TicketsMax, TicketsBooked) select 2, 2
insert Bookings(TicketsMax, TicketsBooked) select 3, 1
GO
select * from Bookings
And then ...
declare @FlightID int = 1
declare @TicketsToBook int = 2

--; This should add a new record
merge Bookings as T
using (select @FlightID as FlightID, @TicketsToBook as TicketsToBook) as S
on T.FlightID = S.FlightID
and T.TicketsMax > (T.TicketsBooked + S.TicketsToBook)
when matched then
    update set T.TicketsBooked = T.TicketsBooked + S.TicketsToBook
when not matched then
    insert (TicketsMax, TicketsBooked)
    values (S.TicketsToBook, S.TicketsToBook);

select * from Bookings
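A caveat worth hedging here (my addition, not part of the answer above): a bare MERGE, like the IF EXISTS pattern, can still race with a concurrent session between the match test and the insert. A common mitigation is a HOLDLOCK hint on the target:

merge Bookings with (holdlock) as T
using (select @FlightID as FlightID, @TicketsToBook as TicketsToBook) as S
on T.FlightID = S.FlightID
and T.TicketsMax > (T.TicketsBooked + S.TicketsToBook)
when matched then
    update set T.TicketsBooked = T.TicketsBooked + S.TicketsToBook
when not matched then
    insert (TicketsMax, TicketsBooked)
    values (S.TicketsToBook, S.TicketsToBook);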
Pass updlock, rowlock, holdlock hints when testing for existence of the row.
begin tran /* default read committed isolation level is fine */
if not exists (select * from Table with (updlock, rowlock, holdlock) where ...)
/* insert */
else
/* update */
commit /* locks are released here */
The updlock hint forces the query to take an update lock on the row if it already exists, preventing other transactions from modifying it until you commit or roll back.
The holdlock hint forces the query to take a range lock, preventing other transactions from adding a row matching your filter criteria until you commit or roll back.
The rowlock hint forces lock granularity to row level instead of the default page level, so your transaction won't block other transactions trying to update unrelated rows in the same page (but be aware of the trade-off between reduced contention and the increase in locking overhead - you should avoid taking large numbers of row-level locks in a single transaction).
See http://msdn.microsoft.com/en-us/library/ms187373.aspx for more information.
Note that locks are taken as the statements which take them are executed - invoking begin tran doesn't give you immunity against another transaction pinching locks on something before you get to it. You should try and factor your SQL to hold locks for the shortest possible time by committing the transaction as soon as possible (acquire late, release early).
Note that row-level locks may be less effective if your PK is a bigint, as the internal hashing on SQL Server is degenerate for 64-bit values (different key values may hash to the same lock id).
Here is my solution. My method doesn't need IF or MERGE; it's simple:
INSERT INTO TableName (col1, col2)
SELECT @par1, @par2
WHERE NOT EXISTS (SELECT col1, col2 FROM TableName
                  WHERE col1 = @par1 AND col2 = @par2)
For Example:
INSERT INTO Members (username)
SELECT 'Cem'
WHERE NOT EXISTS (SELECT username FROM Members
WHERE username='Cem')
Explanation:
(1) SELECT col1, col2 FROM TableName WHERE col1 = @par1 AND col2 = @par2
selects the searched-for values from TableName.
(2) SELECT @par1, @par2 WHERE NOT EXISTS (...)
produces a row only if subquery (1) finds nothing.
(3) The INSERT then inserts the row produced in step (2).
I finally was able to insert a row, on the condition that it didn't already exist, using the following model:
INSERT INTO table ( column1, column2, column3 )
(
SELECT $column1, $column2, $column3
WHERE NOT EXISTS (
SELECT 1
FROM table
WHERE column1 = $column1
AND column2 = $column2
AND column3 = $column3
)
)
which I found at:
http://www.postgresql.org/message-id/87hdow4ld1.fsf@stark.xeocode.com
This is something I just recently had to do:
set ANSI_NULLS ON
set QUOTED_IDENTIFIER ON
GO

ALTER PROCEDURE [dbo].[cjso_UpdateCustomerLogin]
(
    @CustomerID AS INT,
    @UserName AS VARCHAR(25),
    @Password AS BINARY(16)
)
AS
BEGIN
    IF ISNULL((SELECT CustomerID FROM tblOnline_CustomerAccount WHERE CustomerID = @CustomerID), 0) = 0
    BEGIN
        INSERT INTO [tblOnline_CustomerAccount] (
            [CustomerID],
            [UserName],
            [Password],
            [LastLogin]
        ) VALUES (
            /* CustomerID - int */ @CustomerID,
            /* UserName - varchar(25) */ @UserName,
            /* Password - binary(16) */ @Password,
            /* LastLogin - datetime */ NULL )
    END
    ELSE
    BEGIN
        UPDATE [tblOnline_CustomerAccount]
        SET UserName = @UserName,
            Password = @Password
        WHERE CustomerID = @CustomerID
    END
END
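For illustration, a hypothetical call (the parameter values are invented):

EXEC dbo.cjso_UpdateCustomerLogin
    @CustomerID = 42,
    @UserName = 'jdoe',
    @Password = 0x00112233445566778899AABBCCDDEEFF; -- binary(16), e.g. an MD5 hash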
You could use the MERGE functionality to achieve this. Otherwise you can do:

-- run your UPDATE here first, then:
declare @rowCount int
select @rowCount = @@ROWCOUNT
if @rowCount = 0
begin
    --insert....
    INSERT INTO [DatabaseName1].dbo.[TableName1]
    SELECT * FROM [DatabaseName2].dbo.[TableName2]
    WHERE [YourPK] not in (select [YourPK] from [DatabaseName1].dbo.[TableName1])
end
The full solution is below (including the cursor structure). Many thanks to Cassius Porcus for the begin tran ... commit code from the posting above.
declare @mystat6 bigint
declare @mystat6p varchar(50)
declare @mystat6b bigint

DECLARE mycur1 CURSOR for
select result1, picture, bittot from all_Tempnogos2results11

OPEN mycur1
FETCH NEXT FROM mycur1 INTO @mystat6, @mystat6p, @mystat6b

WHILE @@Fetch_Status = 0
BEGIN
    begin tran /* default read committed isolation level is fine */

    if not exists (select * from all_Tempnogos2results11_uniq with (updlock, rowlock, holdlock)
                   where all_Tempnogos2results11_uniq.result1 = @mystat6
                   and all_Tempnogos2results11_uniq.bittot = @mystat6b)
        insert all_Tempnogos2results11_uniq values (@mystat6, @mystat6p, @mystat6b)
    --else
    --    /* update */

    commit /* locks are released here */

    FETCH NEXT FROM mycur1 INTO @mystat6, @mystat6p, @mystat6b
END

CLOSE mycur1
DEALLOCATE mycur1
go
A simple way to copy data from T1 to T2 and avoid duplicates in T2:
--Insert a new record
INSERT INTO dbo.Table2(NoEtu, FirstName, LastName)
SELECT t1.NoEtuDos, t1.FName, t1.LName
FROM dbo.Table1 as t1
WHERE NOT EXISTS (SELECT (1) FROM dbo.Table2 AS t2
WHERE t1.FName = t2.FirstName
AND t1.LName = t2.LastName
AND t1.NoEtuDos = t2.NoEtu)
Another option is INSERT ... EXCEPT, which skips rows already present (note that EXCEPT treats two NULLs as equal, unlike the = comparisons in a NOT EXISTS test):

INSERT INTO table ( column1, column2, column3 )
SELECT $column1, $column2, $column3
EXCEPT
SELECT column1, column2, column3
FROM table
The best approach to this problem is to first make the database column UNIQUE:

ALTER TABLE table_name ADD UNIQUE (column_name)

Then use INSERT IGNORE INTO table_name; the value won't be inserted if it would result in a duplicate key. (Both of those are MySQL syntax; a SQL Server equivalent is sketched below.)
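On SQL Server, a rough equivalent of INSERT IGNORE is a unique index created with IGNORE_DUP_KEY = ON, which discards duplicate rows with a warning instead of raising an error. A minimal sketch, reusing the Members table from the earlier example:

CREATE UNIQUE INDEX UX_Members_username
ON Members (username)
WITH (IGNORE_DUP_KEY = ON);

INSERT INTO Members (username) VALUES ('Cem'); -- skipped with a warning if 'Cem' already exists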