Updating a column in every table in a schema in SQL Server

I want to update the column Last_Modified in every table in a given schema. This column should be set to the latest timestamp whenever another column in the same table (ENDTIME) is updated.
To do this I have the following script in SQL Server:
DECLARE @TotalRows FLOAT
SET @TotalRows = (SELECT COUNT(*) FROM table1)
DECLARE @TotalLoopCount INT
SET @TotalLoopCount = CEILING(@TotalRows / 100000)
DECLARE @InitialLoopCount INT
SET @InitialLoopCount = 1
DECLARE @AffectedRows INT
SET @AffectedRows = 0
DECLARE @Remaining INT
SET @Remaining = 0
DECLARE @initialrows INT
SET @initialrows = 1
DECLARE @lastrows INT
SET @lastrows = 100000
WHILE @InitialLoopCount <= @TotalLoopCount
BEGIN
    WITH updateRows AS
    (
        SELECT
            t1.*,
            ROW_NUMBER() OVER (ORDER BY caster) AS seqnum
        FROM
            table1 t1
    )
    UPDATE updateRows
    SET last_modified = ENDTIME AT TIME ZONE 'Central Standard Time'
    WHERE last_modified IS NULL
    AND updateRows.ENDTIME IS NOT NULL
    AND updateRows.seqnum BETWEEN @initialrows AND @lastrows;
    SET @AffectedRows = @AffectedRows + @@ROWCOUNT
    SET @initialrows = @initialrows + 100000
    SET @lastrows = @lastrows + 100000
    -- COMMIT
    SET @Remaining = @TotalRows - @AffectedRows
    SET @InitialLoopCount = @InitialLoopCount + 1
END
This script determines the row count of the table, divides it by 100,000, and runs only that many loops to perform the entire update. It breaks the update down into batches and updates one slice of rows per loop until they have all been updated.
This script is only for one table, i.e. table1. I now want to modify it so that it dynamically picks up all the tables in a given schema and runs the above logic for each of them. Let's say the schema name is schema1 and it has 32 tables; the script should then run for all 32 of them.
I am able to retrieve the table names in schema1, but I am not able to feed them into this script dynamically. Can anyone please help me with this?

To dynamically change table names at runtime you're going to need something like sp_executesql. See here for an example of its use: https://stackoverflow.com/a/3556554/22194
Then you could have an outer cursor that fetches the table names, assembles each query in a string, and executes it. It's going to look horrible, though.
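Something along these lines might work (a sketch only, not tested against your schema): it loops over the tables in schema1 via sys.tables, builds the batched UPDATE as a string for each table, and runs it with sp_executesql. It assumes every table in the schema has the Last_Modified and ENDTIME columns, and it uses a simpler UPDATE TOP (100000) loop in place of the ROW_NUMBER batching:
DECLARE @TableName SYSNAME, @Sql NVARCHAR(MAX);

DECLARE TableCursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT t.name
    FROM sys.tables AS t
    INNER JOIN sys.schemas AS s ON s.schema_id = t.schema_id
    WHERE s.name = N'schema1';

OPEN TableCursor;
FETCH NEXT FROM TableCursor INTO @TableName;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Batched update for this table: keep going until no NULL Last_Modified rows remain
    SET @Sql = N'
        WHILE 1 = 1
        BEGIN
            UPDATE TOP (100000) t
            SET    last_modified = ENDTIME AT TIME ZONE ''Central Standard Time''
            FROM   schema1.' + QUOTENAME(@TableName) + N' AS t
            WHERE  last_modified IS NULL
              AND  ENDTIME IS NOT NULL;

            IF @@ROWCOUNT = 0 BREAK;
        END';

    EXEC sys.sp_executesql @Sql;

    FETCH NEXT FROM TableCursor INTO @TableName;
END

CLOSE TableCursor;
DEALLOCATE TableCursor;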
If your schema doesn't change much, another approach would be to generate a long script with a section for each table. You generate the script by querying the table names and then repeating the script with each different table name. Excel is actually pretty good for doing that sort of thing: paste your table names into Excel, use Excel to generate the script, then copy/paste it back into SSMS.
This will be a long, repetitive script, but it avoids the disadvantage of having all the SQL in strings.
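If you prefer to stay in SSMS, the same kind of script can also be generated with a query; a sketch that emits one UPDATE per table in schema1 (again assuming the two columns exist everywhere), whose output you copy into a new query window and run:
SELECT 'UPDATE schema1.' + QUOTENAME(t.name)
     + ' SET last_modified = ENDTIME AT TIME ZONE ''Central Standard Time'''
     + ' WHERE last_modified IS NULL AND ENDTIME IS NOT NULL;'
FROM sys.tables AS t
INNER JOIN sys.schemas AS s ON s.schema_id = t.schema_id
WHERE s.name = 'schema1';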

Related

I have 203 tables in my database and I want to write a query which returns all of the latest rows for each table

I have 203 tables in my SQL database and I want to print the latest records for each table. I know the query to get the latest row of a single table; how do I query the latest row of every table in one go?
I do not know the names of your tables, so I assume they can be indexed in the way I am about to show:
DECLARE @Counter INT
DECLARE @Sql NVARCHAR(200)
SET @Counter = 1
WHILE (@Counter <= 203)
BEGIN
    SET @Sql = 'SELECT TOP (5) * FROM TABLE_' + CAST(@Counter AS NVARCHAR(10)) + ' ORDER BY [Date] DESC'
    EXEC(@Sql)
    SET @Counter = @Counter + 1
END
Make sure that you have defined everything using dynamic queries. In addition, I did not know what format you need the pulled results in.
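If the tables do not actually follow a TABLE_1 ... TABLE_203 naming pattern, a cursor over sys.tables can drive the same dynamic query instead; a sketch, assuming every table really has a Date column:
DECLARE @TableName NVARCHAR(300);

DECLARE TableCursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT QUOTENAME(SCHEMA_NAME(schema_id)) + '.' + QUOTENAME(name)
    FROM sys.tables;

OPEN TableCursor;
FETCH NEXT FROM TableCursor INTO @TableName;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Latest 5 rows of the current table
    EXEC('SELECT TOP (5) * FROM ' + @TableName + ' ORDER BY [Date] DESC');
    FETCH NEXT FROM TableCursor INTO @TableName;
END

CLOSE TableCursor;
DEALLOCATE TableCursor;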
In MySQL you could use SHOW TABLES and GROUP_CONCAT to build the statement and then run it as a prepared statement, along the lines of:
SET @Expression = SELECT CONCAT('SELECT...
SELECT GROUP_CONCAT(...
PREPARE myquery FROM @Expression;
EXECUTE myquery;
See also: Is it possible to execute a string in MySQL?

Adding/updating bulk data using SQL

We are inserting bulk data into one of our database tables using SQL Server Management Studio. Currently the data being sent to the database is added to a particular table one row at a time (this is controlled by a stored procedure). What we are finding is that a timeout occurs before the operation completes; at this point we think the operation is slow because of the WHILE loop, but we're unsure how to approach writing a faster equivalent.
-- Insert statements for procedure here
WHILE @i < @nonexistingTblCount
BEGIN
    INSERT INTO AlertRanking (MetricInstanceID, GreenThreshold, RedThreshold, AlertTypeID, MaxThreshold, MinThreshold)
    VALUES ((SELECT id FROM #nonexistingTbl ORDER BY id OFFSET @i ROWS FETCH NEXT 1 ROWS ONLY), @greenThreshold, @redThreshold, @alertTypeID, @maxThreshold, @minThreshold)
    SET @id = (SELECT ID FROM AlertRanking
               WHERE MetricInstanceID = (SELECT id FROM #nonexistingTbl ORDER BY id OFFSET @i ROWS FETCH NEXT 1 ROWS ONLY)
               AND GreenThreshold = @greenThreshold
               AND RedThreshold = @redThreshold
               AND AlertTypeID = @alertTypeID);
    SET @i = @i + 1;
END
Where @nonexistingTblCount is the total number of rows inside the temp table #nonexistingTbl. The #nonexistingTbl table is declared earlier and contains all the values we want to add to the table.
Instead of using a loop, you should be able to insert all of the records with a single statement.
INSERT INTO AlertRanking(MetricInstanceID,GreenThreshold,RedThreshold,AlertTypeID,MaxThreshold,MinThreshold)
SELECT id, #greenThreshold, #redThreshold, #alertTypeID, #maxThreshold, #minThreshold FROM #nonexistingTbl ORDER BY id
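If you also need the new AlertRanking IDs that the loop was capturing into @id, the set-based insert can return them all at once with an OUTPUT clause; a sketch, assuming ID is an INT identity column and MetricInstanceID is an INT:
DECLARE @NewIds TABLE (ID INT, MetricInstanceID INT);

INSERT INTO AlertRanking (MetricInstanceID, GreenThreshold, RedThreshold, AlertTypeID, MaxThreshold, MinThreshold)
OUTPUT inserted.ID, inserted.MetricInstanceID INTO @NewIds (ID, MetricInstanceID)
SELECT id, @greenThreshold, @redThreshold, @alertTypeID, @maxThreshold, @minThreshold
FROM #nonexistingTbl;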

SQL Server : update value if it does not exist in a column

I am using SQL Server 2008 and trying to create a statement which will update a single value within a row from another table if a certain parameter is met. I need to make this as simple as possible for a member of my team to use.
So in this case I want to store 2 values, the Sales Order and the reference. Unfortunately the Sales order has a unique identifier that I need to record and enter into the jobs table and NOT the actual sales order reference.
The parameter which needs to be met is that the sales order unique identifier cannot exist anywhere in the sales order column within the jobs table. I can get this to work when the WHILE value is set to 1, but not when it's set to 0, and in my head it should be set to 0. Anyone got any ideas why this doesn't work?
/****** Attach an SO to a WO ******/
/****** ONLY EDIT THE VALUES BETWEEN '' ******/
Declare @Reference nvarchar(30);
Set @Reference = 'WO16119';
Declare @SO nvarchar(30);
Set @SO = '0000016205';
/****** DO NOT ALTER THE CODE BEYOND THIS POINT!!!!!!!!!!!!! ******/
/* store more values */
Declare @SOID nvarchar(30);
Set @SOID = (Select SOPOrderReturnID
             FROM Test_DB.dbo.SOTable
             Where DocumentNo = @SO);
/* check if update should run */
Declare @Check nvarchar(30);
Set @Check = (Select case when exists (select *
                                       from Test_DB.dbo.Jobs
                                       where SalesOrderNumber != @SOID)
              then CAST(1 AS BIT)
              ELSE CAST(0 AS BIT) End)
While (@Check = 0)
/* if check is true run code below */
Begin
    Update Test_DB.dbo.jobs
    SET SalesOrderNumber = (Select SOPOrderReturnID
                            FROM Test_DB.dbo.SOPOrderReturn
                            Where DocumentNo = @SO)
    Where Reference = @Reference
END;
A few comments. First, in order to avoid getting into a never-ending loop, you may want to change your WHILE for an IF statement. You aren't changing the @Check value inside the loop, so it will run forever:
IF (@Check = 0)
BEGIN
    /* if check is true run code below */
    UPDATE Test_DB.dbo.jobs
    SET SalesOrderNumber = (SELECT SOPOrderReturnID
                            FROM Test_DB.dbo.SOPOrderReturn
                            WHERE DocumentNo = @SO)
    WHERE Reference = @Reference
END
Then, without knowing your application, I would say that the way you do the check is going to require you to lock your tables to avoid other users invalidating the results of your SELECTs.
I would instead create a UNIQUE constraint on the column that has to be unique and handle the error gracefully. That way you don't need to take big locks on your tables:
CREATE UNIQUE INDEX IX_UniqueIndex ON Test_DB.dbo.Jobs(SalesOrderNumber)
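With the unique index in place, a TRY/CATCH around the update lets you report the duplicate case gracefully; a minimal sketch (2601 and 2627 are the duplicate-key error numbers):
BEGIN TRY
    UPDATE Test_DB.dbo.Jobs
    SET SalesOrderNumber = @SOID
    WHERE Reference = @Reference;
END TRY
BEGIN CATCH
    -- 2601 / 2627 = duplicate key on a unique index / constraint
    IF ERROR_NUMBER() IN (2601, 2627)
        PRINT 'That sales order is already attached to a job.';
    ELSE
    BEGIN
        DECLARE @ErrMsg NVARCHAR(4000) = ERROR_MESSAGE();
        RAISERROR(@ErrMsg, 16, 1); -- re-raise anything unexpected
    END
END CATCH;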
As per your comment, if you cannot create a unique index you may want to try the following SQL, although it could cause too much locking and be affected by race conditions:
IF NOT EXISTS (SELECT 1 FROM Test_DB.dbo.Jobs j INNER JOIN Test_DB.dbo.SOTable so ON j.SalesOrderNumber = so.SOPOrderReturnID)
BEGIN
    UPDATE j
    SET SalesOrderNumber = so.SOPOrderReturnID
    FROM
        Test_DB.dbo.Jobs j
        INNER JOIN Test_DB.dbo.SOTable so ON j.SalesOrderNumber = so.SOPOrderReturnID
    WHERE
        Reference = @Reference
END
The risk here is that you are running two separate queries (the SELECT and the UPDATE), so the state of the database could change between them. The first query may report that nothing exists for that ID, but by the time of the UPDATE another user may have inserted or updated that data, so the previous result is no longer true.
You can try to avoid this problem by using an isolation level that locks the table on the read (like SERIALIZABLE), but that could cause blocking and even deadlocks in the database.
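If the unique index really is not possible, another way to narrow that window is to make the check and the update a single transaction and hold locks on the read (UPDLOCK, HOLDLOCK gives a serializable-style read on that range); a sketch, using the simpler check of whether @SOID already exists in the column, with the blocking caveats above still applying:
BEGIN TRANSACTION;

-- UPDLOCK + HOLDLOCK makes the existence check hold its locks until commit,
-- closing the gap between the read and the write (at the cost of blocking)
IF NOT EXISTS (SELECT 1
               FROM Test_DB.dbo.Jobs WITH (UPDLOCK, HOLDLOCK)
               WHERE SalesOrderNumber = @SOID)
BEGIN
    UPDATE Test_DB.dbo.Jobs
    SET SalesOrderNumber = @SOID
    WHERE Reference = @Reference;
END

COMMIT TRANSACTION;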
The best solution here is the unique index. If a certain column has to be unique inside a table, the best enforcement mechanism is the database itself, via constraints.

T-SQL setting int on a column based on random p-key

I created the script below. I am pretty much looking to update a column called "C1Int" with the number 1234567 in randomly chosen rows, based on the pkey of the row.
I created a random generator for the pkey that uses 1 as the min and the total rows as the max.
Then there is a loop that should update the rows over and over, based on the number in the WHILE statement. When I run it, it just updates one random row with the 1234567 number, and even though it's still running the loop, it never updates anything else. Am I missing something? Is there a better way to do this?
DECLARE @a INT
DECLARE @maxpkey INT
DECLARE @minpkey INT
DECLARE @randompkey INT
SET @a = 1
SET @maxpkey = (SELECT COUNT(*) FROM [LoadTestTwo].[dbo].[actbenchdb.Table1])
SET @minpkey = 1
SET @randompkey = ROUND(((@maxpkey - @minpkey - 1) * RAND() + @minpkey), 0)
WHILE @a < 500000000000000000000
BEGIN
    UPDATE [LoadTestTwo].[dbo].[actbenchdb.Table1]
    SET C1Int = (1234567)
    WHERE Pkey = @randompkey
    SET @a = @a + 1
END
Your current loop only updates a single row because you set the random key outside the loop and then just run the same update statement 500,000,000,000,000,000,000 times (which, assuming 1 million executions per second, would still take about 15 million years to complete, and would likely have hit all your records anyway).
It is clear what you are trying to do; I just don't know why you would want to randomly change data in your database. Anyway, I may not understand the why, but I can at least answer the how: if you want to update n random rows, then rather than running a loop n times, it is better to perform a single update:
DECLARE @n INT = 100; -- NUMBER OF RANDOM ROWS TO UPDATE
UPDATE t
SET C1Int = 1234567
FROM ( SELECT TOP (@n) *
       FROM [LoadTestTwo].[dbo].[actbenchdb.Table1]
       ORDER BY NEWID()
     ) AS t;
There are simpler and better ways to create a DB Load Generator!
Let me Google that for you

Batch commit on large INSERT operation in native SQL?

I have a couple of large tables (188m and 144m rows) I need to populate from views, but each view contains a few hundred million rows (pulling together pseudo-dimensionally modelled data into a flat form). The key on each table is a composite of over 50 bytes of columns. If the data were in tables, I could always think about using sp_rename to make the other new table, but that isn't really an option.
If I do a single INSERT operation, the process uses a huge amount of transaction log space, typically filling it up and prompting a bunch of hassle with the DBAs. (And yes, this is probably a job the DBAs should handle/design/architect.)
I can use SSIS and stream the data into the destination table with batch commits (but this does require the data to be transmitted over the network, since we are not allowed to run SSIS packages on the server).
Is there any approach other than dividing the process into multiple INSERT operations, using some kind of key to distribute the rows into different batches, and doing a loop?
Does the view have ANY kind of unique identifier / candidate key? If so, you could select those rows into a working table using:
SELECT key_columns INTO dbo.temp FROM dbo.HugeView;
(If it makes sense, maybe put this table into a different database, perhaps with SIMPLE recovery model, to prevent the log activity from interfering with your primary database. This should generate much less log anyway, and you can free up the space in the other database before you resume, in case the problem is that you have inadequate disk space all around.)
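A sketch of that staging step, assuming you are allowed to create a scratch database (here called StagingDB):
-- Scratch database with SIMPLE recovery so the key extract barely touches the log
CREATE DATABASE StagingDB;
ALTER DATABASE StagingDB SET RECOVERY SIMPLE;

-- Capture just the candidate key values to drive the batching
SELECT key_columns
INTO StagingDB.dbo.temp
FROM dbo.HugeView;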
Then you can do something like this, inserting 10,000 rows at a time, and backing up the log in between:
SET NOCOUNT ON;
DECLARE
    @batchsize INT,
    @ctr INT,
    @rc INT;
SELECT
    @batchsize = 10000,
    @ctr = 0;
WHILE 1 = 1
BEGIN
    WITH x AS
    (
        SELECT key_column, rn = ROW_NUMBER() OVER (ORDER BY key_column)
        FROM dbo.temp
    )
    INSERT dbo.PrimaryTable(a, b, c, etc.)
    SELECT v.a, v.b, v.c, etc.
    FROM x
    INNER JOIN dbo.HugeView AS v
    ON v.key_column = x.key_column
    WHERE x.rn > @batchsize * @ctr
    AND x.rn <= @batchsize * (@ctr + 1);
    IF @@ROWCOUNT = 0
        BREAK;
    BACKUP LOG PrimaryDB TO DISK = 'C:\db.bak' WITH INIT;
    SET @ctr = @ctr + 1;
END
That's all off the top of my head, so don't cut/paste/run, but I think the general idea is there. For more details (and why I back up the log / checkpoint inside the loop), see this post on sqlperformance.com:
Break large delete operations into chunks
Note that if you are taking regular database and log backups, you will probably want to take a full backup afterwards to start your log chain over again.
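For completeness, that full backup is just (the path is a placeholder):
BACKUP DATABASE PrimaryDB TO DISK = 'C:\PrimaryDB_full.bak' WITH INIT;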
You could partition your data and insert it in a cursor loop. That would be nearly the same as SSIS batch inserting, but it runs on your server.
create cursor ....
select YEAR(DateCol), MONTH(DateCol) from whatever
while ....
insert into yourtable(...)
select * from whatever
where YEAR(DateCol) = year and MONTH(DateCol) = month
end
I know this is an old thread, but I made a generic version of Arthur's cursor solution:
--Split a batch up into chunks using a cursor.
--This method can be used for most any large table with some modifications
--It could also be refined further with a @Day variable (for example)
DECLARE @Year INT
DECLARE @Month INT
DECLARE BatchingCursor CURSOR FOR
SELECT DISTINCT YEAR(<SomeDateField>), MONTH(<SomeDateField>)
FROM <Sometable>;
OPEN BatchingCursor;
FETCH NEXT FROM BatchingCursor INTO @Year, @Month;
WHILE @@FETCH_STATUS = 0
BEGIN
    --All logic goes in here
    --Any select statements from <Sometable> need to be suffixed with:
    --WHERE YEAR(<SomeDateField>) = @Year AND MONTH(<SomeDateField>) = @Month
    FETCH NEXT FROM BatchingCursor INTO @Year, @Month;
END;
CLOSE BatchingCursor;
DEALLOCATE BatchingCursor;
GO
This solved the problem on loads of our large tables.
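For illustration, the body of that loop for a batched insert might look like this (the table and column names are placeholders):
-- Hypothetical loop body: insert one month of data per iteration
INSERT INTO dbo.TargetTable (Col1, Col2, SomeDateField)
SELECT Col1, Col2, SomeDateField
FROM dbo.SourceTable
WHERE YEAR(SomeDateField) = @Year
  AND MONTH(SomeDateField) = @Month;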
There is no pixie dust, you know that.
Without knowing specifics about the actual schema being transferred, a generic solution would be exactly as you describe it: divide the processing into multiple inserts and keep track of the key(s). This is sort of pseudo-code T-SQL:
create table currentKeys ([table] sysname not null primary key, [key] sql_variant not null);
go
declare @keysInserted table ([key] sql_variant);
declare @key sql_variant;
begin transaction
while (1 = 1)
begin
    select @key = [key] from currentKeys where [table] = '<target>';
    insert into <target> (...)
    output inserted.[key] into @keysInserted ([key])
    select top (<batchsize>) ... from <source>
    where [key] > @key
    order by [key];
    if (0 = @@rowcount)
        break;
    update currentKeys
    set [key] = (select max([key]) from @keysInserted)
    where [table] = '<target>';
    commit;
    delete from @keysInserted;
    set @key = null;
    begin transaction;
end
commit
It would get more complicated if you want to allow for parallel batches and partition the keys.
You could use the BCP command to load the data and use the Batch Size parameter
http://msdn.microsoft.com/en-us/library/ms162802.aspx
Two-step process:
1. BCP OUT data from the views into text files
2. BCP IN data from the text files into the tables with the batch size parameter
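A sketch of those two bcp calls from the command line (server, database, and file names are placeholders; -T is a trusted connection, -c is character format, -b sets the batch size):
rem Step 1: export the view to a text file
bcp "SELECT col1, col2, col3 FROM MyDb.dbo.HugeView" queryout "C:\temp\hugeview.txt" -S MyServer -T -c

rem Step 2: load the target table, committing every 10,000 rows
bcp MyDb.dbo.PrimaryTable in "C:\temp\hugeview.txt" -S MyServer -T -c -b 10000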
This looks like a job for good ol' BCP.