MS SQL: Execute multiple updates based on a list

I have figured out how to fix the database; the only problem is that I have to insert the CaseNumber by hand for every execution.
In C# I would use some kind of string list for the broken records. Is there something similar in MS SQL?
In my code so far I have introduced a variable, CaseNumber. I have a table with a lot of CaseNumber records that are broken. Is there a way to execute this for every CaseNumber from a different table?
Like:
1. Take the first CaseNumber and run this script.
2. Then take the second one and run this script again, until every CaseNumber is fixed.
Thanks in advance for any ideas.
GO
DECLARE @CaseNumber VARCHAR(50)
SET @CaseNumber = '25615'
PRINT 'Start fixing broken records.'
PRINT 'Fixing FIELD2'
UPDATE t
SET t.FIELD2 = ( SELECT DISTINCT TOP 1 FIELD2
                 FROM {myTable} t2
                 WHERE IDFIELD = @CaseNumber
                   AND FIELD2 IS NOT NULL )
FROM {myTable} t
WHERE FIELD2 IS NULL
  AND IDFIELD = @CaseNumber
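The set-based answers below are generally preferable, but if you really do want to run the script once per case number, as described in steps 1 and 2, a cursor over a driver table is the closest T-SQL equivalent of a C# list. This is only a sketch; the table name BrokenCases is hypothetical.

```sql
-- Hypothetical driver table holding the broken case numbers.
DECLARE @CaseNumber VARCHAR(50);

DECLARE case_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT CaseNumber FROM dbo.BrokenCases;

OPEN case_cursor;
FETCH NEXT FROM case_cursor INTO @CaseNumber;

WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT 'Fixing case ' + @CaseNumber;

    -- Same fix-up as above, run once per case number.
    UPDATE t
    SET t.FIELD2 = ( SELECT TOP (1) FIELD2
                     FROM {myTable} t2
                     WHERE t2.IDFIELD = @CaseNumber
                       AND t2.FIELD2 IS NOT NULL )
    FROM {myTable} t
    WHERE t.FIELD2 IS NULL
      AND t.IDFIELD = @CaseNumber;

    FETCH NEXT FROM case_cursor INTO @CaseNumber;
END;

CLOSE case_cursor;
DEALLOCATE case_cursor;
```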

Here are a couple of different options...
-- This version will just "fix" everything that can be fixed.
UPDATE mt1 SET
mt1.FIELD2 = mtx.FIELD2
FROM
dbo.myTable mt1
CROSS APPLY (
SELECT TOP (1)
mt2.FIELD2
FROM
dbo.myTable mt2
WHERE
mt1.IDFIELD = mt2.IDFIELD
AND mt2.FIELD2 IS NOT NULL
) mtx
WHERE
mt1.FIELD2 IS NULL;
And if, for whatever reason, you don't want to fix the entire table in one go, you can restrict it to just the case numbers you specify...
-- This version works off the same principle but limits itself to only those values in the @CaseNumCSV parameter.
DECLARE @CaseNumCSV VARCHAR(8000) = '25615,25616,25617,25618,25619';
IF OBJECT_ID('tempdb..#CaseNum', 'U') IS NOT NULL
BEGIN DROP TABLE #CaseNum; END;
CREATE TABLE #CaseNum (
CaseNumber VARCHAR(50) NOT NULL,
PRIMARY KEY (CaseNumber)
WITH(IGNORE_DUP_KEY = ON) -- just in case the same CaseNumber is in the string multiple times.
);
INSERT #CaseNum(CaseNumber)
SELECT
CaseNumber = dsk.Item
FROM
dbo.DelimitedSplit8K(@CaseNumCSV, ',') dsk;
-- a copy of DelimitedSplit8K can be found here: http://www.sqlservercentral.com/articles/Tally+Table/72993/
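On SQL Server 2016 and later, the built-in STRING_SPLIT function can stand in for the custom DelimitedSplit8K splitter when you only need the items (assuming the @CaseNumCSV variable from the snippet above):

```sql
INSERT #CaseNum (CaseNumber)
SELECT ss.value
FROM STRING_SPLIT(@CaseNumCSV, ',') AS ss;
```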
UPDATE mt1 SET
mt1.FIELD2 = mtx.FIELD2
FROM
#CaseNum cn
JOIN dbo.myTable mt1
ON cn.CaseNumber = mt1.IDFIELD
CROSS APPLY (
SELECT TOP (1)
mt2.FIELD2
FROM
dbo.myTable mt2
WHERE
mt1.IDFIELD = mt2.IDFIELD
AND mt2.FIELD2 IS NOT NULL
) mtx
WHERE
mt1.FIELD2 IS NULL;

Related

Short-circuiting tables

I'm upgrading several identical copies of a database which may already be upgraded partially, and for some reason bool values were stored in an nvarchar(5).
So in the below, (which exists inside an INSERT > SELECT block), I need to check if the column ShowCol exists, fill it with 0 if it does not, or fill it with the result of evaluating the string bool if it does:
CASE
WHEN COL_LENGTH('dbo.TableName', 'ShowCol') IS NULL THEN 0
ELSE IIF(LOWER(ShowCol) = 'false', 0, 1)
END
...but I'm getting an error "Invalid column name 'ShowCol'". I can't seem to short-circuit this; can you help?
It's worth noting that the column, if it does exist, contains a mix of "false", "False" and "FALSE", so that's the point of the LOWER(). (The true values also have occasional trailing spaces to contend with, which is why I'm just dealing with false and treating everything else as true.)
I suspect it's this wrap in LOWER() which is causing the server to always evaluate the expression.
You can’t short circuit the existence of a column (and it has nothing to do with LOWER(); if you remove it, nothing will change).
You’ll need dynamic SQL, e.g.:
DECLARE @sql nvarchar(max) = N'UPDATE trg SET
    trg.col1 = src.col1,
    trg.col2 = src.col2';

IF COL_LENGTH('dbo.TableName', 'ShowCol') > 0
BEGIN
    SET @sql += N', trg.ShowCol = IIF(LOWER(src.ShowCol) = ''false'', 0, 1)';
END

SET @sql += N' ...
FROM dbo.TableName AS trg
INNER JOIN dbo.Origin AS src
    ON ...';

EXEC sys.sp_executesql @sql; -- ,N'params', @params;
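The commented-out arguments at the end hint at parameterizing the dynamic statement. A minimal sketch of binding a value through sp_executesql rather than concatenating it; the column SomeCol, the @MinId parameter and the WHERE clause are purely illustrative:

```sql
DECLARE @stmt nvarchar(max) = N'UPDATE dbo.TableName
    SET SomeCol = 0
    WHERE Id >= @MinId;';

EXEC sys.sp_executesql
    @stmt,
    N'@MinId int',  -- parameter declaration list
    @MinId = 100;   -- value is bound, never concatenated into the string
```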
When you're selecting data, you can fool the parser a little bit by introducing constants to take the place of columns, taking advantage of SQL Server's desire to find a column reference even at a different scope than the syntax would suggest. I talk about this in Make SQL Server DMV Queries Backward Compatible. I don't know of any straightforward way to make that work with writes without dynamic SQL, as the parser does more strict checking there, so it's harder to fool.
Imagine you have these tables:
CREATE TABLE dbo.SourceTable(a int, b int, c int);
INSERT dbo.SourceTable(a,b,c) VALUES(1,2,3);
CREATE TABLE dbo.DestinationWithAllColumns(a int, b int, c int);
INSERT dbo.DestinationWithAllColumns(a,b,c) VALUES(1,2,3);
CREATE TABLE dbo.DestinationWithoutAllColumns(a int, b int);
INSERT dbo.DestinationWithoutAllColumns(a,b) VALUES(1,2);
You can write a SELECT against either of them that produces an int output column called c:
;WITH optional_columns AS
(
SELECT c = CONVERT(int, NULL)
)
SELECT trg.a, trg.b, trg.c
FROM optional_columns
CROSS APPLY
(SELECT a,b,c FROM dbo.DestinationWithAllColumns) AS trg
INNER JOIN dbo.SourceTable AS src ON src.a = trg.a;
Output:
a   b   c
--  --  --
1   2   3
;WITH optional_columns AS
(
SELECT c = CONVERT(int, NULL)
)
SELECT trg.a, trg.b, trg.c
FROM optional_columns
CROSS APPLY
(SELECT a,b,c FROM dbo.DestinationWithoutAllColumns) AS trg
INNER JOIN dbo.SourceTable AS src ON src.a = trg.a;
Output:
a   b   c
--  --  --
1   2   NULL
So far, so good. But as soon as you try and update:
;WITH optional_columns AS
(
SELECT c = CONVERT(int, NULL)
)
UPDATE trg SET trg.b = src.b, trg.c = src.c
FROM optional_columns
CROSS APPLY
(SELECT a,b,c FROM dbo.DestinationWithoutAllColumns) AS trg
INNER JOIN dbo.SourceTable AS src ON src.a = trg.a;
Msg 4421, Level 16, State 1
Derived table 'trg' is not updatable because a column of the derived table is derived or constant.
Example db<>fiddle

SQL IN operator in update query takes a lot of time

Below is an update query which updates a table with about 40,000 records:
UPDATE tableName
SET colA = val, colB = val
WHERE ID IN (select RecordIDs from tableB where needUpdate = 'Y')
When the above query is executed, I found that the subquery below alone takes ~15 seconds:
SELECT RecordIDs
FROM tableB
WHERE needUpdate = 'Y'
But when I take away the WHERE clause (i.e. UPDATE tableName SET colA = val, colB = val), the query runs smoothly.
Why does this happen? Are there any ways to shorten the execution time?
Edited:
Below is the structure of both tables:
tableName:
ID int,
VehicleBrandID int,
VehicleLicenseExpiryDate nvarchar(25),
LicensePlateNo nvarchar(MAX),
ContactPerson nvarchar(MAX),
ContactPersonID nvarchar(MAX),
ContactPersonPhoneNumber nvarchar(MAX),
ContactPersonAddress nvarchar(MAX),
CreatedDate nvarchar(MAX),
CreatedBy nvarchar(MAX)
PRIMARY KEY (ID)
tableB:
RowNumber int
RecordIDs int
NeedUpdate char(1)
PRIMARY KEY (RowNumber)
Edited
The screenshot below is the execution plan for the update query.
The execution plan shows you are using table variables and are missing a useful index.
Keep the existing PK on @output:
DECLARE @output TABLE (
    ID INT PRIMARY KEY,
    VehicleBrandID INT,
    VehicleLicenseExpiryDate NVARCHAR(25),
    LicensePlateNo NVARCHAR(MAX),
    ContactPerson NVARCHAR(MAX),
    ContactPersonID NVARCHAR(MAX),
    ContactPersonPhoneNumber NVARCHAR(MAX),
    ContactPersonAddress NVARCHAR(MAX),
    CreatedDate NVARCHAR(MAX), /*<-- Don't store dates as strings*/
    CreatedBy NVARCHAR(MAX))
And add a new index to @tenancyEditable:
DECLARE @tenancyEditable TABLE (
    RowNumber INT PRIMARY KEY,
    RecordIDs INT,
    NeedUpdate CHAR(1),
    UNIQUE (NeedUpdate, RecordIDs, RowNumber))
With these indexes in place, the following query
UPDATE @output
SET LicensePlateNo = ''
WHERE ID IN (SELECT RecordIDs
             FROM @tenancyEditable
             WHERE NeedUpdate = 'Y')
OPTION (RECOMPILE)
can generate a more efficient-looking execution plan.
Also, you should use appropriate datatypes rather than storing everything as NVARCHAR(MAX). A person's name isn't going to need more than nvarchar(100) at most, and CreatedDate should be stored as date[time2], for example.
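For example, the table definition could be tightened along these lines; the exact sizes are illustrative guesses, not requirements:

```sql
CREATE TABLE dbo.tableName (
    ID INT PRIMARY KEY,
    VehicleBrandID INT,
    VehicleLicenseExpiryDate DATE,         -- was nvarchar(25)
    LicensePlateNo NVARCHAR(20),
    ContactPerson NVARCHAR(100),
    ContactPersonID NVARCHAR(50),
    ContactPersonPhoneNumber NVARCHAR(30),
    ContactPersonAddress NVARCHAR(400),
    CreatedDate DATETIME2 NOT NULL,        -- was nvarchar(MAX)
    CreatedBy NVARCHAR(100)
);
```

Narrower columns make rows smaller, and unlike NVARCHAR(MAX) columns they can participate in index keys.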
I suppose you are in one of the two cases below:
1. Statistics are not updated due to some recent modification of your table. In this case you should execute this:
UPDATE STATISTICS tableB
2. A wrong query plan is being used, in which case I recommend executing this in order to force recompilation of the query:
SELECT RecordIDs
FROM tableB
WHERE needUpdate = 'Y'
OPTION (RECOMPILE)
Tell us the result and we'll come back with more details.
This is an alternative. It is worth trying in your environment, as it has proven faster for others.
MERGE INTO tableName tn
USING (
    SELECT recordIDs
    FROM tableB
    WHERE needUpdate = 'Y'
) tb
ON tn.ID = tb.recordIDs
WHEN MATCHED THEN
    UPDATE
    SET colA = val,
        colB = val;
EDIT:
I am not claiming this to be faster in every case or in every setup/environment - just that it is worth a try as it has worked for me and others I have worked with or read about.
You can use an inner join instead of the IN clause:
UPDATE t
SET t.colA = val,
    t.colB = val
FROM tableName t
INNER JOIN tableB x
    ON t.ID = x.recordIDs
WHERE x.needUpdate = 'Y'
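A correlated EXISTS is a third equivalent form, sometimes clearer when recordIDs might appear more than once in tableB:

```sql
UPDATE t
SET t.colA = val,
    t.colB = val
FROM tableName t
WHERE EXISTS (SELECT 1
              FROM tableB x
              WHERE x.recordIDs = t.ID
                AND x.needUpdate = 'Y');
```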
Although the UPDATE...FROM syntax is essential in some circumstances, I prefer to use subqueries (via the IN clause) whenever possible.

SQL Azure doesn't support 'select into' - Is there another way?

I have a very complicated table I'd like to take a temporary backup of whilst I make some changes. Normally, I'd just do the following:
SELECT *
INTO temp_User
FROM dbo.[User] AS u
Unfortunately I'm using Azure, and it appears this isn't supported:
Msg 40510, Level 16, State 1, Line 2 Statement 'SELECT INTO' is not
supported in this version of SQL Server.
Is there a way to re-create this feature into a function, potentially? I could do this by scripting the table, creating it and then inserting data using a select statement but given how frequently I use Azure, and how many databases I need to work on in this area this is very unwieldy.
Azure requires a clustered index on all tables, therefore SELECT INTO is not supported.
You'll have to:
CREATE TABLE temp_User () --fill in table structure
INSERT INTO temp_User
SELECT *
FROM dbo.[User]
To script table easily you can write your own or use one of the answers to this question:
Script CREATE Table SQL Server
Update: As Jordan B pointed out, V12 will include support for heaps (no clustered index requirement), which means SELECT INTO will work. At the moment the V12 preview is available; Microsoft of course recommends upgrading with test databases only.
The new Azure DB Update preview has this problem resolved:
The V12 preview enables you to create a table that has no clustered
index. This feature is especially helpful for its support of the T-SQL
SELECT...INTO statement which creates a table from a query result.
http://azure.microsoft.com/en-us/documentation/articles/sql-database-preview-whats-new/
Unfortunately it can't be done directly. Here is how I worked around it:
Open SQL Server Management Studio
Right click on the table
Select Script as ... Create Table
Edit the generated script to change the table name to what you specified in your query
Execute your query
INSERT INTO temp_User
SELECT * FROM dbo.[User]
You can try the above. It's basically a select that is applied to an insert statement
http://blog.sqlauthority.com/2011/08/10/sql-server-use-insert-into-select-instead-of-cursor/
Let's assume you have a table with Id, Column1 and Column2. Then this could be your solution:
CREATE TABLE YourTableName_TMP ....
GO
SET IDENTITY_INSERT YourTableName_TMP ON
GO
INSERT INTO YourTableName_TMP
    ([Id], [Column1], [Column2])
SELECT [Id], [Column1], [Column2]
FROM
(
    SELECT [Id], [Column1], [Column2],
           ROW_NUMBER() OVER (ORDER BY Id DESC) AS RowNum
    FROM YourTableName
) AS numbered
WHERE RowNum BETWEEN 0 AND 500000
GO
SET IDENTITY_INSERT YourTableName_TMP OFF
GO
First you create a temporary table, then you insert the rows in batches using the windowed row number. It's a mess, I know. In my experience, executing this from a client using SQL Server Management Studio processes roughly 200,000 rows a minute.
As written above, you need to rewrite your query from SELECT INTO to CREATE TABLE plus INSERT.
Here is my sample. Before:
select emrID, displayName --select into
into #tTable
from emrs

declare @emrid int
declare @counter int = 1
declare @displayName nvarchar(max)

while exists (select * from #tTable)
begin
    -- some business logic
    select top 1 @displayName = displayname
    from #tTable
    group by displayname

    update emrs set groupId = @counter where @displayName = displayname

    delete #tTable
    where @displayName = displayname

    set @counter = @counter + 1
end
drop table #tTable
Modified:
CREATE TABLE #tTable ([displayName] nvarchar(max)) --create table
INSERT INTO #tTable -- insert with the next select:
select displayName
from emrs

declare @emrid int
declare @counter int = 1
declare @displayName nvarchar(max)

while exists (select * from #tTable)
begin
    -- some business logic
    select top 1 @displayName = t.displayName
    from #tTable as t
    group by t.displayname

    update emrs set groupId = @counter where @displayName = displayname

    delete #tTable
    where @displayName = displayname

    set @counter = @counter + 1
end
drop table #tTable
Do not forget to drop your temp table.
Also, you can find a simpler example with a description here:
http://www.dnnsoftware.com/wiki/statement-select-into-is-not-supported-in-this-version-of-sql-server

What's the best way to lock a record while it is being updated?

If I need to SELECT a value from a table column (which happens to be the primary key column) based on a relatively complex WHERE clause in a stored procedure, and I then want to update that record without any other concurrent stored procedure SELECTing the same record, is it as simple as just using a transaction? Or do I also need to raise the isolation level to REPEATABLE READ?
It looks like this:
Alter Procedure Blah
As
Declare @targetval int
update table1 set field9 = 1, @targetval = field1 where field1 = (
    SELECT TOP 1 field1
    FROM table1 t
    WHERE
        (t.field2 = 'this') AND (t.field3 = 'that') AND (t.field4 = 'yep') AND (t.field9 <> 1))
return
I then get my targetval in my program so that I can do work on it, and meanwhile I don't have to worry about other worker threads grabbing the same targetval.
I'm talking SQL 2000, SQL 2005, and SQL 2008 here.
Adding ROWLOCK,UPDLOCK to the sub query should do it.
ALTER PROCEDURE Blah
AS
DECLARE @targetval INT

UPDATE table1
SET field9 = 1,
    @targetval = field1
WHERE field1 = (SELECT TOP 1 field1
                FROM table1 t WITH (rowlock, updlock)
                WHERE ( t.field2 = 'this' )
                  AND ( t.field3 = 'that' )
                  AND ( t.field4 = 'yep' )
                  AND ( t.field9 <> 1 ))
RETURN
Updated
The currently accepted answer to this question does not use UPDLOCK, and I'm not at all convinced that it will work. As far as I can see from testing, in this type of query with a sub query SQL Server will only take S locks for the sub query. Sometimes, however, the sub query will get optimised out, so the approach might appear to work, as in Query 2.
Test Script - Setup
CREATE TABLE test_table
(
id int identity(1,1) primary key,
col char(40)
)
INSERT INTO test_table
SELECT NEWID() FROM sys.objects
Query 1
update test_table
set col=NEWID()
where id=(SELECT top (1) id from test_table )
Query 2
update test_table
set col=NEWID()
where id=(SELECT max(id) from test_table)
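On SQL Server 2005 and later, an OUTPUT clause is another way to have the UPDATE itself report which row it claimed, so no separate variable assignment is needed. A sketch using the tables from the question; the READPAST hint (which skips rows other sessions have locked, useful for queue-style pickup) is optional:

```sql
DECLARE @claimed TABLE (field1 int);

UPDATE t
SET field9 = 1
OUTPUT inserted.field1 INTO @claimed
FROM (SELECT TOP (1) *
      FROM table1 WITH (ROWLOCK, UPDLOCK, READPAST)
      WHERE field2 = 'this'
        AND field3 = 'that'
        AND field4 = 'yep'
        AND field9 <> 1) AS t;

SELECT field1 FROM @claimed;  -- empty if no row qualified
```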

How can I efficiently do a database massive update?

I have a table with some duplicate entries. I have to discard all but one, and then update this last remaining one. I've tried with a temporary table and a WHILE statement, in this way:
CREATE TABLE #tmp_ImportedData_GenericData
(
Id int identity(1,1),
tmpCode varchar(255) NULL,
tmpAlpha3Code varchar(50) NULL,
tmpRelatedYear int NOT NULL,
tmpPreviousValue varchar(255) NULL,
tmpGrowthRate varchar(255) NULL
)
INSERT INTO #tmp_ImportedData_GenericData
SELECT
MCS_ImportedData_GenericData.Code,
MCS_ImportedData_GenericData.Alpha3Code,
MCS_ImportedData_GenericData.RelatedYear,
MCS_ImportedData_GenericData.PreviousValue,
MCS_ImportedData_GenericData.GrowthRate
FROM MCS_ImportedData_GenericData
INNER JOIN
(
SELECT CODE, ALPHA3CODE, RELATEDYEAR, COUNT(*) AS NUMROWS
FROM MCS_ImportedData_GenericData AS M
GROUP BY M.CODE, M.ALPHA3CODE, M.RELATEDYEAR
HAVING count(*) > 1
) AS M2 ON MCS_ImportedData_GenericData.CODE = M2.CODE
AND MCS_ImportedData_GenericData.ALPHA3CODE = M2.ALPHA3CODE
AND MCS_ImportedData_GenericData.RELATEDYEAR = M2.RELATEDYEAR
WHERE
(MCS_ImportedData_GenericData.PreviousValue <> 'INDEFINITO')
-- SELECT * from #tmp_ImportedData_GenericData
-- DROP TABLE #tmp_ImportedData_GenericData
DECLARE @counter int
DECLARE @rowsCount int
DECLARE @Code varchar(255), @Alpha3Code varchar(50), @RelatedYear int
DECLARE @OldValue varchar(255), @GrowthRate varchar(255)
SET @counter = 1
SELECT @rowsCount = count(*) from #tmp_ImportedData_GenericData
-- PRINT @rowsCount
WHILE @counter <= @rowsCount
BEGIN
    SELECT
        @Code = tmpCode,
        @Alpha3Code = tmpAlpha3Code,
        @RelatedYear = tmpRelatedYear,
        @OldValue = tmpPreviousValue,
        @GrowthRate = tmpGrowthRate
    FROM
        #tmp_ImportedData_GenericData
    WHERE
        Id = @counter

    DELETE FROM MCS_ImportedData_GenericData
    WHERE
        Code = @Code
        AND Alpha3Code = @Alpha3Code
        AND RelatedYear = @RelatedYear
        AND (PreviousValue <> 'INDEFINITO' OR PreviousValue IS NULL)

    UPDATE
        MCS_ImportedData_GenericData
    SET
        PreviousValue = @OldValue, GrowthRate = @GrowthRate
    WHERE
        Code = @Code
        AND Alpha3Code = @Alpha3Code
        AND RelatedYear = @RelatedYear
        AND MCS_ImportedData_GenericData.PreviousValue = 'INDEFINITO'

    SET @counter = @counter + 1
END
but it takes too long, even though there are just 20,000 - 30,000 rows to process.
Does anyone have suggestions to improve performance?
Thanks in advance!
WITH q AS (
    SELECT m.*,
           ROW_NUMBER() OVER (PARTITION BY CODE, ALPHA3CODE, RELATEDYEAR
                              ORDER BY CASE WHEN PreviousValue = 'INDEFINITO' THEN 1 ELSE 0 END) AS rn
    FROM MCS_ImportedData_GenericData m
    WHERE PreviousValue <> 'INDEFINITO'
)
DELETE
FROM q
WHERE rn > 1
Quassnoi's answer uses SQL Server 2005+ syntax, so I thought I'd put in my tuppence worth using something more generic...
First, to delete all the duplicates but not the "original", you need a way of differentiating the duplicate records from each other (the ROW_NUMBER() part of Quassnoi's answer).
It would appear that in your case the source data has no identity column (you create one in the temp table). If that is the case, two choices come to mind:
1. Add an identity column to the data, then remove the duplicates.
2. Create a "de-duped" set of the data, delete everything from the original, and insert the de-duped data back into the original.
Option 1 could be something like...
(With the newly created ID field)
DELETE
[data]
FROM
MCS_ImportedData_GenericData AS [data]
WHERE
id > (
SELECT
MIN(id)
FROM
MCS_ImportedData_GenericData
WHERE
CODE = [data].CODE
AND ALPHA3CODE = [data].ALPHA3CODE
AND RELATEDYEAR = [data].RELATEDYEAR
)
OR...
DELETE
[data]
FROM
MCS_ImportedData_GenericData AS [data]
INNER JOIN
(
SELECT
MIN(id) AS [id],
CODE,
ALPHA3CODE,
RELATEDYEAR
FROM
MCS_ImportedData_GenericData
GROUP BY
CODE,
ALPHA3CODE,
RELATEDYEAR
)
AS [original]
ON [original].CODE = [data].CODE
AND [original].ALPHA3CODE = [data].ALPHA3CODE
AND [original].RELATEDYEAR = [data].RELATEDYEAR
AND [original].id <> [data].id
I don't understand the syntax used well enough to post an exact answer, but here's an approach.
1. Identify the rows you want to preserve (e.g. SELECT value, ... FROM ... WHERE ...).
2. Apply the update logic while identifying them (e.g. SELECT value + 1 ... FROM ... WHERE ...).
3. INSERT ... SELECT the result into a new table.
4. Drop the original, rename the new table to the original name, and recreate all grants/synonyms/triggers/indexes/FKs/... (or truncate the original and INSERT ... SELECT from the new table).
Obviously this has a pretty big overhead, but if you want to update/clear millions of rows, it will be the fastest way.
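A sketch of that rebuild approach for the table from the question, using the truncate-and-reload variant and assuming exclusive use of the table during the swap:

```sql
-- 1. Build a de-duplicated copy, keeping one row per key and
--    preferring a non-'INDEFINITO' PreviousValue.
SELECT Code, Alpha3Code, RelatedYear, PreviousValue, GrowthRate
INTO MCS_ImportedData_GenericData_new
FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY Code, Alpha3Code, RelatedYear
                              ORDER BY CASE WHEN PreviousValue = 'INDEFINITO'
                                            THEN 1 ELSE 0 END) AS rn
    FROM MCS_ImportedData_GenericData
) AS d
WHERE rn = 1;

-- 2. Reload the original and drop the scratch copy.
TRUNCATE TABLE MCS_ImportedData_GenericData;

INSERT INTO MCS_ImportedData_GenericData
    (Code, Alpha3Code, RelatedYear, PreviousValue, GrowthRate)
SELECT Code, Alpha3Code, RelatedYear, PreviousValue, GrowthRate
FROM MCS_ImportedData_GenericData_new;

DROP TABLE MCS_ImportedData_GenericData_new;
```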