SQL chunk update with JOIN

I have two tables in my DB. One contains data about clients (called Clients); the other contains ClientID, Guid, AddedTime and IsValid (called ClientsToUpdate).
ClientID refers to the Clients table, Guid is a unique identifier, AddedTime is the time when the record was added to the table, and IsValid is a bit indicating whether this ClientID was updated or not.
What I want to do is update all the Clients whose ID is in ClientsToUpdate. The problem is that the ClientsToUpdate table contains more than 80,000 records and I am getting deadlocks.
What I thought I could do is update 2000 clients at a time, using a while loop or something similar.
My stored procedure looks like:
UPDATE client SET LastLogin=GETDATE()
FROM Clients client
JOIN ClientsToUpdate ctu ON client.ID = ctu.ClientID;
Any idea how I can do it?

declare @done table (ClientID int primary key)
while 1=1
begin
    update top (2000) c
    set lastlogin = getdate()
    output deleted.id into @done
    from Clients c
    join ClientsToUpdate ctu
    on c.id = ctu.ClientID
    where not exists
    (
        select *
        from @done d
        where d.ClientID = ctu.ClientID
    )
    if @@rowcount = 0
        break
end
Example at SQL Fiddle.

If you experience deadlocks, updating in chunks might reduce the errors (assuming you manage your transactions carefully and commit each chunk), but it does not address the origin of the deadlocks. IMHO you should investigate the locking problems and find out why the deadlocks occur.
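A rough sketch of chunking with one committed transaction per batch, using the tables from the question. It assumes IsValid = 0 marks a row that has not been processed yet (the question does not say which value means what), and @batch and @rows are hypothetical names. It only shortens how long locks are held; it does not diagnose the deadlock itself.

```sql
-- Assumption: IsValid = 0 means "still to be processed".
DECLARE @batch TABLE (ClientID int);
DECLARE @rows int = 1;

WHILE @rows > 0
BEGIN
    DELETE FROM @batch;

    BEGIN TRANSACTION;

    -- Claim up to 2000 pending rows and remember their client ids.
    UPDATE TOP (2000) ctu
    SET IsValid = 1
    OUTPUT inserted.ClientID INTO @batch (ClientID)
    FROM ClientsToUpdate ctu
    WHERE ctu.IsValid = 0;

    SET @rows = @@ROWCOUNT;

    -- Update only the clients claimed in this batch.
    UPDATE c
    SET LastLogin = GETDATE()
    FROM Clients c
    WHERE EXISTS (SELECT 1 FROM @batch b WHERE b.ClientID = c.ID);

    -- Commit each chunk so its locks are released before the next one starts.
    COMMIT TRANSACTION;
END
```

The @batch table variable has no primary key on purpose, since nothing in the question rules out the same ClientID appearing more than once in ClientsToUpdate.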

Related

SQL Server : trigger firing every time

For my school project I need to add a trigger to my SQL Server database. I decided a 'no double usernames' trigger on my Users table would be relevant.
The problem is that this trigger is firing every time I execute an INSERT query. I can't figure out why this happens every time. I even tried different ways of writing my trigger.
The trigger I have now:
CREATE TRIGGER [Trigger_NoDuplicates]
ON [dbo].[Users]
FOR INSERT
AS
BEGIN
SET NOCOUNT ON
IF(EXISTS(SELECT Username FROM Users
WHERE Username = (SELECT Username FROM inserted)))
BEGIN;
RAISERROR('This username already exists!',15, 0)
ROLLBACK
END
END
Thanks in advance!

A trigger always fires every time. Do you mean it "raises an error every time"?
You currently have the following (expanded to multiple lines to make it clearer):
IF (
EXISTS (
SELECT Username
FROM users
WHERE Username = (SELECT Username FROM inserted)
)
)
The key point here is the name of the table: inserted. Past tense. It has already happened.
Anything in the inserted table has already been inserted into the target table.
So what you need to check is whether the username is now in the target table more than once.
However, it is possible to insert more than one record into a table at once. This means that Username = (SELECT Username FROM inserted) will cause its own error. (You can't compare a single value to a set of values with =, and inserted can contain more than one row, hence more than one username.)
This is how I would approach your trigger...
IF EXISTS (
SELECT
users.Username
FROM
users
INNER JOIN
inserted
ON inserted.Username = users.Username
GROUP BY
users.Username
HAVING
COUNT(*) > 1
)
This takes the (already inserted into) users table and picks out all the records that match the username of any record in the inserted table.
Then it GROUPs them by the username field.
Then it filters the results to only include groups with more than 1 record.
These groups (usernames) have duplicate entries and should cause your trigger to raise an error.
An alternative is a bit more similar to your approach, but many people won't recognise it, so I generally wouldn't recommend it:
IF EXISTS (
SELECT
users.Username
FROM
users
WHERE
users.Username = ANY (SELECT username FROM inserted)
GROUP BY
users.Username
HAVING
COUNT(*) > 1
)
The ANY keyword is very rarely used, but it does what it sounds like: it allows a single value to be compared to a set of values.
Finally, if your table has an IDENTITY column, you can avoid the GROUP BY by explicitly stating you don't want to compare a row to itself...
IF EXISTS (
SELECT
users.Username
FROM
users
INNER JOIN
inserted
ON inserted.Username = users.Username
AND inserted.id <> users.id
)
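For reference, here is a sketch of how the first (JOIN plus GROUP BY) check could be dropped into the trigger from the question. The trigger name, table, message, and RAISERROR arguments are copied from the question; the aliases u and i are only for illustration.

```sql
CREATE TRIGGER [Trigger_NoDuplicates]
ON [dbo].[Users]
FOR INSERT
AS
BEGIN
    SET NOCOUNT ON;

    -- The new rows are already in Users at this point, so a username is a
    -- duplicate only if it now matches more than one row.
    IF EXISTS (
        SELECT u.Username
        FROM dbo.Users AS u
        INNER JOIN inserted AS i
            ON i.Username = u.Username
        GROUP BY u.Username
        HAVING COUNT(*) > 1
    )
    BEGIN
        RAISERROR('This username already exists!', 15, 0);
        ROLLBACK TRANSACTION;
    END
END
```

Because the check joins to inserted rather than comparing against a scalar subquery, it also copes with multi-row INSERT statements, including two new rows that share a username.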

How to Bulk Update with SQL Server?

I have a table with 10 million rows that I need to join with another table and update all data. This is taking more than one hour and it is growing my transaction log by 10+ GB. Is there another way to improve this performance?
I believe that after each update, the indexes and constraints are checked and all information is logged. Is there a way to tell SQL Server to check constraints only after the update is finished and to minimally log the update action?
My query follows below. I've modified some names so it becomes easier to read.
UPDATE o
SET o.Info1 = u.Info1, o.Info2 = u.Info2, o.Info3 = u.Info3
FROM Orders o
INNER JOIN Users u
ON u.ID = o.User_ID
EDIT: as asked in comments, the table definition would be something like the following (simplifying again to create a generic question).
Table Orders
ID int PK
OrderNumber nvarchar(20)
User_ID int FK to table Users
Info1 int FK to table T1
Info2 int FK to table T2
Info3 int FK to table T3
Table Users
ID int PK
UserName nvarchar(20)
Info1 int FK to table T1
Info2 int FK to table T2
Info3 int FK to table T3

First of all, there is no such thing as BULK UPDATE. A few things that you can do are as follows:
If possible, put your database in the simple recovery model before doing this operation.
Drop indexes before doing the update and create them again once the update is completed.
Do the updates in smaller batches, something like
WHILE (1=1)
BEGIN
-- update 10,000 rows at a time
UPDATE TOP (10000) O
FROM Table O inner join ... bla bla
IF (@@ROWCOUNT = 0)
BREAK;
END
Note
If you go with the simple recovery option, don't forget to take a full backup after you switch the recovery model back to full, since simply switching it back to full recovery will not start logging until you take a full backup.
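As a sketch of the batching idea applied to the Orders and Users tables from the question: the WHERE filter below is an assumption added here, not part of the answer. Without some filter that excludes rows already updated, UPDATE TOP would keep hitting rows on every pass and @@ROWCOUNT would never reach 0. @rows is a hypothetical variable name.

```sql
DECLARE @rows int = 1;

WHILE @rows > 0
BEGIN
    -- Update at most 10,000 rows per pass.
    UPDATE TOP (10000) o
    SET o.Info1 = u.Info1,
        o.Info2 = u.Info2,
        o.Info3 = u.Info3
    FROM Orders o
    INNER JOIN Users u
        ON u.ID = o.User_ID
    -- Only touch rows that still differ; EXCEPT treats NULLs as equal,
    -- so nullable columns do not need separate IS NULL checks.
    WHERE EXISTS (
        SELECT o.Info1, o.Info2, o.Info3
        EXCEPT
        SELECT u.Info1, u.Info2, u.Info3
    );

    SET @rows = @@ROWCOUNT;
END
```

Run outside an enclosing transaction, each pass commits on its own, which lets the log space be reused between batches under the simple recovery model.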

In my case (a .NET WinForms app), I load the data as needed into a new table with a bulk insert, then update the base table by joining it to that bulk-loaded table. For 1M rows it takes me about 5 seconds.

Issue with UPDATE QUERY using EXISTS clause causing slow Tablespace scans in DB2

Our workplace has a database with a client table that holds 5 million records. Each time a client is updated, another row is added to a client_history table that holds 100 million records. All columns in the Client table are indexed. Only the Primary Key (ID), Foreign Key (FK_Client_ID) and Creation Timestamp in the Client History table are indexed.
I've been asked to update several hundred thousand client records, but only if the corresponding client history record indicates that the client record has not been updated since a certain date (e.g. 19th September 2012).
I've written an SQL update query that uses an EXISTS clause. I've been told by the DBAs that I shouldn't use an EXISTS clause, as this would trigger a tablespace scan that would slow down execution of the query. This is obviously an issue when updating several hundred thousand client records:
UPDATE Client_History SET Surname = 'MisterX',
Update_Timestamp = CURRENT_TIMESTAMP
WHERE (FK_Client_ID = 123 AND ID = 456)
AND NOT EXISTS
(SELECT *
FROM Client
WHERE Client.Client_Id = Client_History.FK_Client_ID
AND Client_History.Update_Timestamp > TIMESTAMP('2012-09-21-00:00:00')
AND Client_History.Update_Timestamp < TIMESTAMP('4000-12-31-00:00:00')
AND Client_History.Creation_Timestamp < NAME.Update_Timestamp);
Can anyone think of a better solution?

A shot in the dark: try hoisting all the constants up into the main query (where they belong)
UPDATE Client_History ch
SET Surname = 'MisterX'
, Update_Timestamp = CURRENT_TIMESTAMP
WHERE ch.FK_Client_ID = 123
AND ch.ID = 456
AND ch.Update_Timestamp > TIMESTAMP('2012-09-21-00:00:00')
AND ch.Update_Timestamp < TIMESTAMP('4000-12-31-00:00:00')
AND ch.Creation_Timestamp < NAME.Update_Timestamp
AND NOT EXISTS (
SELECT *
FROM Client cl
WHERE cl.Client_Id = ch.FK_Client_ID
)
;
BTW: what is NAME? Some kind of pseudo table, like Oracle's dual?

Update a SQL table with values from another nested query

I am currently using a SQL Server Agent job to create a master user table for my in-house web applications, pulling data from 3 other databases: SharePoint, our Practice Management System and our HR database.
Currently it goes...
truncate table my_tools.dbo.tb_staff
go
insert into my_tools.dbo.tb_staff
(username
,firstname
,surname
,chargeoutrate)
select right(wss.nt_user_name,
,hr.firstname
,hr.surname
,pms.chargeoutrate
from sqlserver.pms.dbo.staff as pms
inner join sqlserver.wss_content.dbo.vw_staffwss as wss
on pms.nt_user_name = wss.nt_user_name
inner join sqlserver.hrdb.dbo.vw_staffdetails as hr
on wss.fullname = hr.knownas
go
The problem is that the entire table is cleared as the first step so my auto increment primary key/identified on tb_staff is certain to change. Also if someone is removed from sharepoint or the PMS they will not be recreated on this table and this will cause inconsistencies throughout the database.
I want to preserve entries in this table, even after they are removed from one of the other systems.
I suppose what I want to do is:
1) Mark all existing entries in tb_staff as inactive (using a column called active and setting it to false)
2) Run the query on the three joined tables and update every found record, also marking them as active.
I can't see how I can nest a select statement within an Update statement like I have here with the Insert statement.
How can I achieve this please?
*please note I have edited my SQL down to 4 columns and simplified it so small errors are probably due to rushed editing. The real query is far bigger.

WITH source AS(
SELECT RIGHT(wss.nt_user_name, 10) nt_user_name, /*Or whatever - this is invalid in the original SQL*/
hr.firstname,
hr.surname,
pms.chargeoutrate
FROM staff AS pms
INNER JOIN vw_staffwss AS wss
ON pms.nt_user_name = wss.nt_user_name
INNER JOIN vw_staffdetails AS hr
ON wss.fullname = hr.knownas
)
MERGE
INTO tb_staff
USING source
ON source.nt_user_name= tb_staff.username /*Or whatever you are using as the key */
WHEN MATCHED
THEN UPDATE SET active=1 /*Can synchronise other columns here if needed*/
WHEN NOT MATCHED BY TARGET
THEN INSERT (username, firstname, surname, chargeoutrate, active) VALUES (nt_user_name,firstname, surname, chargeoutrate, 1)
WHEN NOT MATCHED BY source
THEN UPDATE SET active=0;

SQL Delete When column a and column b does not exist

Ok, so you have something like this working. You Insert into a table from a tmp table, where the Equipment Number and the Account Number are missing...
Insert INTO ClientEquipments(
SUB_ACCT_NO_SBB,
EquipmentDate,
EquipmentText,
EquipmentNumber)
Select
replace(a.SUB_ACCT_NO_SBB,'"','') as SUB_ACCT_NO_SBB,
getdate() as edate,'' as etext,
replace(a.equipmentNumber,'"','') equipmentNumber
from clientspaymenttemp a
where not exists
(select b.equipmentNumber
from clientEquipments b
where b.sub_acct_no_sbb=replace(a.SUB_ACCT_NO_SBB,'"','') and b.equipmentNumber=replace(a.equipmentNumber,'"',''))
group by SUB_ACCT_NO_SBB,equipmentNumber
But I found a problem: if the Equipment Number belonged to a different Account Number before, my previous query will insert a new row with the same Equipment Number but a new Account Number.
What I need it to do is simple:
If Account Number and Equipment Number exists, leave it alone no need to insert.
If Equipment Number exists, but it's assigned to a different Account Number, delete the old row. (Import file handles assignments so I am 100% sure that it needs to be assigned to new account)
Something Like this added somewhere in the previous code:
DELETE FROM ClientEquipments
WHERE (clientEquipmentId =
(SELECT clientEquipmentId
FROM ClientEquipments AS ClientEquipments_1
WHERE (equipmentNumber = '0012345CAEC6')))
If nothing exists then Insert a new row.
:::EDIT SOME MORE INFORMATION TO HELP ME OUT:::
I am reading a CSV file:
Sample Data:
Account | Name | Address | Some Extra Stuff | Equipment Number
"1234","First1,Last1","Address 1",etc etc... "ENum1234"
"1234","First1,Last1","Address 1",etc etc... "ENum5678"
"5678","First2,Last2","Address 2",etc etc... "ENum9123"
"9123","First3,Last3","Address 3",etc etc... "ENum4567"
This gets bulk imported into a temp table (dbo.clients_temp).
Notice how account 1234 has 2 equipment numbers.
From here I insert new accounts into dbo.clients by doing a query from dbo.clients_temp to dbo.clients.
Then I update dbo.clients with new information from dbo.clients_temp (i.e. Account 1234 might exist but now has a new address).
Now that my dbo.clients table is updated with new clients, and new information for existing clients, I need to update my dbo.equipments table. I was originally doing what you see above: Insert Where Not Exists Account Number and Equipment Number.
Now the problem is that equipment does change accounts. For example, Account Number 5678 might have become inactive, which I don't track or care about at the database level, but the Equipment Number might now belong to Account Number 1234. In this case, my original query will insert a new row into the database, since Account 1234 and the Equipment Number are not returned in the SELECT.
Ok, I have lost this now :P I will try and revisit the question later on the weekend because I just confused myself O.o

I had to modify Gordon's answer above a bit, but that did the trick...
Below is the relevant line of code that deletes the inactive accounts.
DELETE FROM ClientEquipments WHERE EquipmentNumber IN
(SELECT E.equipmentNumber FROM ClientEquipments As E INNER JOIN ClientsPaymentTemp AS T
on E.equipmentNumber = T.equipmentNumber and e.SUB_ACCT_NO_SBB <> T.SUB_ACCT_NO_SBB)

-- Fix Account Numbers and Equipment Numbers
update ClientsPaymentTemp
set SUB_ACCT_NO_SBB = replace(SUB_ACCT_NO_SBB,'"',''),
equipmentNumber = replace(equipmentNumber,'"','')
-- Delete Existing Accounts Mapped to New Equipment
delete e
from ClientEquipments e
inner join clientspaymenttemp t
on e.EquipmentNumber = t.EquipmentNumber
and e.SUB_ACCT_NO_SBB <> t.SUB_ACCT_NO_SBB
-- Insert New Accounts
insert into ClientEquipments
(SUB_ACCT_NO_SBB,
EquipmentDate,
EquipmentText,
EquipmentNumber)
Select
SUB_ACCT_NO_SBB,
getdate() as edate,
'' as etext,
equipmentNumber
from ClientsPaymentTemp a
where not exists (select 1 from ClientEquipments where SUB_ACCT_NO_SBB = a.SUB_ACCT_NO_SBB and EquipmentNumber = a.EquipmentNumber)

I may be misunderstanding, but if all you're looking to do is delete a record where the account number isn't equal to something and the equipment number is equal to something, can't you just perform a delete with multiple where conditions?
Example:
DELETE FROM table
WHERE
equipmentNumber = someNumber AND
accountNumber <> someAccount
You could then use @@ROWCOUNT to check the number of rows affected and insert if nothing was deleted. The TechNet link above uses the following example:
USE AdventureWorks;
GO
UPDATE HumanResources.Employee
SET Title = N'Executive'
WHERE NationalIDNumber = 123456789
IF @@ROWCOUNT = 0
PRINT 'Warning: No rows were updated';
GO
I would think you could easily adapt that to do what you're looking to do.
