Fastest way to copy sql table - sql

I'm looking for the fastest way to copy a table and its contents on my SQL server: just a simple copy of the table, with the source and destination on the same server/database.
Currently, using a SELECT * INTO statement in a stored procedure, it takes 6.75 minutes to copy over 4.7 million records. This is too slow.
CREATE PROCEDURE [dbo].[CopyTable1]
AS
BEGIN
    DECLARE @mainTable VARCHAR(255),
            @backupTable VARCHAR(255),
            @sql VARCHAR(255),
            @qry nvarchar(max);
    SET NOCOUNT ON;
    SET @mainTable = 'Table1'
    SET @backupTable = @mainTable + '_Previous'
    IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(@backupTable) AND type in (N'U'))
    BEGIN
        SET @Sql = 'if exists (select * from sysobjects '
        SET @Sql = @Sql + 'where id = object_id(N''[' + @backupTable + ']'') and '
        SET @Sql = @Sql + 'OBJECTPROPERTY(id, N''IsUserTable'') = 1) ' + CHAR(13)
        SET @Sql = @Sql + 'DROP TABLE [' + @backupTable + ']'
        EXEC (@Sql)
    END
    IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(@mainTable) AND type in (N'U'))
    BEGIN
        SET @Sql = 'SELECT * INTO dbo.[' + @backupTable + '] FROM dbo.[' + @mainTable + ']'
        EXEC (@Sql)
    END
END

If you are concerned about speed, it seems you have two alternatives: copying by block, or the BCP/bulk-insert method.
Block Transfer
DECLARE @CurrentRow bigint, @RowCount bigint, @CurrentBlock bigint

SET @CurrentRow = 1

SELECT @RowCount = COUNT(*)
FROM oldtable WITH (NOLOCK)

WHILE @CurrentRow <= @RowCount
BEGIN
    SET @CurrentBlock = @CurrentRow + 1000000

    INSERT INTO newtable
        (FIELDS, GO, HERE)
    SELECT
        FIELDS, GO, HERE
    FROM (
        SELECT
            FIELDS, GO, HERE,
            ROW_NUMBER() OVER (ORDER BY SomeColumn) AS RowNum
        FROM oldtable WITH (NOLOCK)
    ) AS MyDerivedTable
    WHERE MyDerivedTable.RowNum BETWEEN @CurrentRow AND @CurrentBlock

    SET @CurrentRow = @CurrentBlock + 1
END
How to copy a huge table data into another table in SQL Server
BCP/Bulk Insert
SELECT *
INTO NewTable
FROM OldTable
WHERE 1 = 2

BULK INSERT NewTable
FROM 'c:\temp\OldTable.txt'
WITH (DATAFILETYPE = 'native')
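Note that BULK INSERT assumes you have already exported OldTable to a native-format data file. As a sketch, one way to produce that file is the bcp command-line utility (the database and server names here are placeholders):

bcp MyDb.dbo.OldTable out c:\temp\OldTable.txt -n -T -S MyServer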
What is the fastest way to copy data from one table to another
http://www.databasejournal.com/features/mssql/article.php/3507171/Transferring-Data-from-One-Table-to-Another.htm

You seem to want to copy a table that is a heap and has no indexes. That is the easiest case to get right. Just do a
insert into Target with (tablock) select * from Source
Make sure that minimal logging for bulk operations is enabled (search for that term). Switch to the simple recovery model.
This will take up almost no log space, because with minimal logging only allocations are logged.
This just scans the source in allocation order and appends new bulk-allocated pages to the target.
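Putting that together, a minimal sketch (the database and table names are placeholders; switching the recovery model affects your log backup chain, so check with your DBA first):

ALTER DATABASE MyDb SET RECOVERY SIMPLE;  -- or BULK_LOGGED

-- TABLOCK on a heap target allows the minimally logged, bulk-allocated insert path
INSERT INTO dbo.Target WITH (TABLOCK)
SELECT * FROM dbo.Source;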
Again, you have asked about the easiest case. Things get more complicated when indexes come into play.
Why not insert in batches? It's not necessary: log space is not an issue, and because the target is not sorted (it is a heap) we don't need sort buffers.

Related

Verify all columns can convert from varchar to float

I have tried a bunch of different ways, like using cursors and dynamic SQL, but is there a fast way to verify that all columns in a given table can convert from varchar to float (without altering the table)?
I want to get a printout of which columns fail and which columns pass.
I am trying this method now, but it is slow and I cannot get the list of columns that pass or error out.
drop table users_1;

select *
into users_1
from users

declare @cols table (i int identity, colname varchar(100))

insert into @cols
select column_name
from information_schema.COLUMNS
where TABLE_NAME = 'users'
and COLUMN_NAME not in ('ID')

declare @i int, @maxi int
select @i = 1, @maxi = MAX(i) from @cols

declare @sql nvarchar(max)
while (@i <= @maxi)
begin
    select @sql = 'alter table users_1 alter column ' + colname + ' float NULL'
    from @cols
    where i = @i

    exec sp_executesql @sql

    select @i = @i + 1
end
I found this code on one of the SQL tutorials sites.
Why all the drop/create/alter nonsense? If you just want to know if a column could be altered, why leave your table in a wacky state, where the columns that can be altered are altered, and the ones that can't just raise errors?
Here's one way to accomplish this with dynamic SQL (and with some protections):
DECLARE @tablename nvarchar(513) = N'dbo.YourTableName';

IF OBJECT_ID(@tablename) IS NOT NULL
BEGIN
    DECLARE @sql nvarchar(max) = N'SELECT ',
            @tmpl nvarchar(max) = N'[Can $colP$ be converted?]
        = CASE WHEN EXISTS
          (
            SELECT 1 FROM ' + @tablename + N'
              WHERE TRY_CONVERT(float, COALESCE($colQ$,N''0'')) IS NULL
          )
          THEN ''No, $colP$ cannot be converted''
          ELSE ''Yes, $colP$ CAN be converted'' END';

    SELECT @sql += STRING_AGG(
        REPLACE(REPLACE(@tmpl, N'$colQ$',
            QUOTENAME(name)), N'$colP$', name), N',')
    FROM sys.columns
    WHERE object_id = OBJECT_ID(@tablename)
    AND name <> N'ID';

    EXEC sys.sp_executesql @sql;
END
Working db<>fiddle
This is never going to be "fast" - there is no great shortcut to having to read and validate every value in the table.
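To make the generated SQL easier to follow, here is the underlying per-column check written out by hand for a single hypothetical column col1 (note TRY_CONVERT requires SQL Server 2012+, and STRING_AGG above requires 2017+):

SELECT CASE WHEN EXISTS
(
    SELECT 1 FROM dbo.YourTableName
    WHERE TRY_CONVERT(float, COALESCE([col1], N'0')) IS NULL
)
THEN 'No, col1 cannot be converted'
ELSE 'Yes, col1 CAN be converted' END AS [Can col1 be converted?];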

Retrieve Max loaded date across all tables on a DB

The output I'm trying to get (database name = ATT):
Table name
Column name
MAX loaded date = MAX(loaded_date) for that column only

loaded_date is a column in around 50 tables in the database, with the same name and datatype (datetime) in each.
select * FROM sys.tables
select * FROM syscolumns
I've been exploring the system tables without much luck; looking at some posts, it seems it may be done with dynamic SQL, which I've never used.
You can write SQL that writes SQL:
SELECT REPLACE(
    'select ''{tn}'' as table_name, max(loaded_date) as ld from {tn} union all'
    ,'{tn}', table_name)
FROM information_schema.columns
WHERE column_name = 'loaded_date'
Run that, then copy all but the final UNION ALL out of the results window and into the query window, and run again
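For example, with two hypothetical tables t1 and t2 that each have a loaded_date column, the query above would emit:

select 't1' as table_name, max(loaded_date) as ld from t1 union all
select 't2' as table_name, max(loaded_date) as ld from t2 union all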
If you wanted to get all this into a single string for dynamic exec, I guess it'd look something like this (untested) inside a procedure:
DECLARE @x NVARCHAR(MAX);

SELECT @x =
    STRING_AGG(
        REPLACE(
            'select ''{tn}'' as table_name, max(loaded_date) as ld from {tn}'
            ,'{tn}', table_name)
        ,' union all ')
FROM information_schema.columns
WHERE column_name = 'loaded_date';

EXECUTE sp_executesql @x;
If your SQL Server is old and doesn't have STRING_AGG, it's a bit more awkward, but there are many examples of "turn rows into CSV" for SQL Server that use STUFF..FOR XML PATH - https://duckduckgo.com/?t=ffab&q=rows+to+CSV+SQLS&ia=web
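As a sketch of that older variant (untested here, same caveat as above; the 11 passed to STUFF is the length of the leading ' union all ' separator being stripped):

DECLARE @x NVARCHAR(MAX);

SELECT @x = STUFF((
    SELECT ' union all ' + REPLACE(
        'select ''{tn}'' as table_name, max(loaded_date) as ld from {tn}'
        ,'{tn}', table_name)
    FROM information_schema.columns
    WHERE column_name = 'loaded_date'
    FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)')
, 1, 11, '');

EXECUTE sp_executesql @x;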
I wrote up a more permanent type of script that does this. It returns a result set listing the tables in the current database that have a column named loaded_date, along with the MAX(loaded_date) result from each table. The script queries each table individually by looping through them, keeping track of the max value for each table in a table variable. It also has a @Debug variable that lets you see the text of the queries that would be run instead of actually running them, and it raises a custom error message to help troubleshoot any issues.
/*disable row count messages*/
SET NOCOUNT ON;

/*set to 1 to debug (aka just print queries instead of running)*/
DECLARE @Debug bit = 0;

/*get list of tables to query and assign a unique index to each row to assist in looping*/
DECLARE @TableList TABLE(
    SchemaAndTableName nvarchar(257) NOT NULL
    ,OrderToQuery bigint NOT NULL
    ,MaxLoadedDate datetime NULL
    ,PRIMARY KEY (OrderToQuery)
);

INSERT INTO @TableList (SchemaAndTableName, OrderToQuery)
SELECT
    CONCAT(QUOTENAME(s.name), N'.', QUOTENAME(t.name)) AS SchemaAndTableName
    ,ROW_NUMBER() OVER(ORDER BY s.name, t.name) AS OrderToQuery
FROM
    sys.columns AS c
    INNER JOIN sys.tables AS t ON c.object_id = t.object_id
    INNER JOIN sys.schemas AS s ON t.schema_id = s.schema_id
WHERE
    c.name = N'loaded_date';

/*declare and set some variables for loop*/
DECLARE @NumTables int = (SELECT TOP (1) OrderToQuery FROM @TableList ORDER BY OrderToQuery DESC);
DECLARE @I int = 1;
DECLARE @CurMaxDate datetime;
DECLARE @CurTable nvarchar(257);
DECLARE @CurQuery nvarchar(max);

/*start loop*/
WHILE @I <= @NumTables
BEGIN
    /*build text of current query*/
    SET @CurTable = (SELECT SchemaAndTableName FROM @TableList WHERE OrderToQuery = @I);
    SET @CurQuery = CONCAT(N'SELECT @MaxDateOut = MAX(loaded_date) FROM ', @CurTable, N';');

    /*check debugging status*/
    IF @Debug = 0
    BEGIN
        BEGIN TRY
            EXEC sys.sp_executesql @stmt = @CurQuery
                ,@params = N'@MaxDateOut datetime OUTPUT'
                ,@MaxDateOut = @CurMaxDate OUTPUT;
        END TRY
        BEGIN CATCH
            DECLARE @ErrorMessage nvarchar(max) = CONCAT(
                N'Error querying table ', @CurTable, N'.', NCHAR(13), NCHAR(10)
                ,N'Errored query: ', NCHAR(13), NCHAR(10), @CurQuery, NCHAR(13), NCHAR(10)
                ,N'Error message: ', ERROR_MESSAGE()
            );
            RAISERROR(@ErrorMessage, 16, 1) WITH NOWAIT;

            /*on error end loop so error can be investigated*/
            SET @I = @NumTables + 1;
        END CATCH;
    END
    ELSE /*currently debugging*/
    BEGIN
        PRINT(CONCAT(N'Debug output: ', @CurQuery));
    END;

    /*update value in our table variable*/
    UPDATE @TableList
    SET MaxLoadedDate = @CurMaxDate
    WHERE OrderToQuery = @I;

    /*increment loop*/
    SET @I = @I + 1;
END;

SELECT
    SchemaAndTableName AS TableName
    ,MaxLoadedDate AS Max_Loaded_date
FROM
    @TableList;
I like this solution better, as querying each table one at a time has much less system impact than attempting one large UNION ALL query. Querying a large set of tables all at once could cause some serious resource semaphore or locking contention (depending on the usage of your db).
It is fairly well commented, but let me know if something is not clear.
Also, just a note, dynamic SQL should be used as a last resort. I provided this script to answer your question, but you should explore better options than something like this.
You can go for the undocumented stored procedure sp_MSforeachtable. But don't use it in production code, as this stored procedure might not be available in future versions.
Read more on sp_MSforeachtable
EXEC sp_MSforeachtable 'SELECT ''?'' as tablename, max(loaded_Date) FROM ?'
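Note that the command above runs against every user table and will error on tables that have no loaded_Date column. As a sketch (with the same "undocumented" caveat), sp_MSforeachtable also accepts a @whereand parameter that is appended to its internal WHERE clause over the sysobjects alias o, which can restrict it to tables that actually have the column:

EXEC sp_MSforeachtable
    @command1 = 'SELECT ''?'' as tablename, max(loaded_Date) FROM ?'
    ,@whereand = ' AND o.id IN (SELECT object_id FROM sys.columns WHERE name = ''loaded_Date'')';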

Looping through a column in SQL table that contains names of other tables

I am fairly new to using SQL. Currently I have a table with a column that contains the names of all the tables I want to use for one query. What I want to do is loop through that column, go to every single one of these tables, and then search one of their columns for a value (there could be multiple values). Whenever a table contains the value, I will list the name of the table. Could someone give me a hint of how this is done? Is a cursor needed for this?
I don't have enough reputation to comment, but: are the table names all in one column, meaning that they are comma-separated or marked with some sort of separator? That would make the query a little more complicated, as you would have to take care of that before you start looping through your table.
Either way, this would require a cursor, as well as some dynamic SQL.
I will give a basic example of how you can go about this.
declare @value varchar(50)
declare @tableName varchar(50)
declare @sqlstring nvarchar(200)
declare @cnt int

set @value = 'whateveryouwant'

create table #temptable (tableName varchar(50))

declare @getTableName cursor
set @getTableName = cursor for
    select tableName from TablewithTableNames

OPEN @getTableName

fetch next
from @getTableName into @tableName

while @@FETCH_STATUS = 0
BEGIN
    set @sqlstring = N'select @cnt = count(*) from ' + @tableName + N' where ColumnNameYouwant = @value'
    exec sp_executesql @sqlstring, N'@cnt int output, @value varchar(50)', @cnt output, @value

    If @cnt > 0
        insert into #temptable values (@tableName)

    fetch next
    from @getTableName into @tableName
END

select * from #temptable

drop table #temptable
close @getTableName
deallocate @getTableName
I'm currently not able to test this out due to time constraints, but this is how I would go about doing it.
You could try something like this:
--Generate dynamic SQL
DECLARE @TablesToSearch TABLE (
    TableName VARCHAR(50));
INSERT INTO @TablesToSearch VALUES ('invoiceTbl');

DECLARE @SQL TABLE (
    RowNum INT,
    SQLText VARCHAR(500));

INSERT INTO @SQL
SELECT
    ROW_NUMBER() OVER (ORDER BY ts.TableName) AS RowNum,
    'SELECT * FROM ' + ts.TableName + ' WHERE ' + c.name + ' = 1;'
FROM
    @TablesToSearch ts
    INNER JOIN sys.tables t ON t.name = ts.TableName
    INNER JOIN sys.columns c ON c.object_id = t.object_id;

--Now run the queries
DECLARE @Count INT;
SELECT @Count = COUNT(*) FROM @SQL;
WHILE @Count > 0
BEGIN
    DECLARE @RowNum INT;
    DECLARE @SQLText VARCHAR(500);
    SELECT TOP 1 @RowNum = RowNum, @SQLText = SQLText FROM @SQL;
    EXEC (@SQLText);
    DELETE FROM @SQL WHERE RowNum = @RowNum;
    SELECT @Count = COUNT(*) FROM @SQL;
END;
You would need to change the "1" I am using as an example to the value you are looking for, and probably add a CONVERT/CAST to make sure the column is the right data type.
You actually said that you wanted the name of the table, so you would need to change the SQL to:
'SELECT ''' + ts.TableName + ''' FROM ' + ts.TableName + ' WHERE ' + c.name + ' = 1;'
Another thought: it would probably be best to insert the results from this into a temporary table, so you can dump out the results in one go at the end, as sketched below.
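A sketch of that tweak, using a hypothetical temp table #Results: create it up front, change the generated SQL to insert the table name into it, and select from it after the loop:

CREATE TABLE #Results (TableName VARCHAR(50));

-- generated per table/column, instead of the SELECT shown above:
-- 'INSERT INTO #Results SELECT ''' + ts.TableName + ''' FROM ' + ts.TableName + ' WHERE ' + c.name + ' = 1;'

-- after the WHILE loop:
SELECT DISTINCT TableName FROM #Results;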

How to backup and restore table

During my testing, I want to make a copy of a few tables within the same database before running the tests. After tests are complete, I want to restore the original table with the copy.
What is the best way to do this?
I also want to make sure all indexes and constraints are restored.
DECLARE @Tablename NVARCHAR(500)
DECLARE @BuildStr NVARCHAR(500)
DECLARE @SQL NVARCHAR(500)

SELECT @Tablename = 'my_Users'
SELECT @BuildStr = CONVERT(NVARCHAR(16), GETDATE(), 120)
SELECT @BuildStr = REPLACE(REPLACE(REPLACE(@BuildStr, ':', ''), '-', ''), ' ', '')

SET @SQL = 'select * into ' + @Tablename + '_' + @BuildStr + ' from ' + @Tablename
SELECT @SQL
EXEC (@SQL) -- Execute SQL statement
How do I restore if I use the above to make a copy?
SQL2005
Something like:
truncate table OriginalTable
insert into OriginalTable select * from CopiedTable
Depending on which database you're using, there are faster alternatives.
I think the script I recently used can be useful to somebody.
To back up a table you can use the following query:
DECLARE @tableName nvarchar(max), @tableName_bck nvarchar(max)
SET @tableName = 'SomeTable';
SET @tableName_bck = 'SomeTable_bck';

-- Backup
DECLARE @insertCommand nvarchar(max)
--SELECT INTO SomeTable_bck FROM SomeTable
SET @insertCommand = 'SELECT * INTO ' + @tableName_bck + ' FROM ' + @tableName
PRINT @insertCommand
EXEC sp_executesql @insertCommand
For the restore, because tables often have IDENTITY fields, you need to SET IDENTITY_INSERT ON, and you also need to provide the column list when inserting records. That's why the script is a bit more complex:
DECLARE @tableName nvarchar(max), @tableName_bck nvarchar(max)
SET @tableName = 'SomeTable';
SET @tableName_bck = 'SomeTable_bck';

-- Restore
DECLARE @columnList nvarchar(max)
DECLARE @insertCommand nvarchar(max)

SELECT @columnList = SUBSTRING(
    (
        SELECT ', ' + column_name AS [text()]
        FROM INFORMATION_SCHEMA.COLUMNS
        WHERE table_name = @tableName
        ORDER BY table_name
        FOR XML PATH ('')
    ), 2, 1000);

--INSERT INTO SomeTable(Column1, Column2) SELECT Column1, Column2 FROM SomeTable_bck
SELECT @insertCommand = 'INSERT INTO ' + @tableName + '(' + @columnList + ') SELECT ' + @columnList + ' FROM ' + @tableName_bck

IF EXISTS (
    SELECT column_name, table_name
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE table_schema = 'dbo' AND table_name = @tableName
    AND COLUMNPROPERTY(OBJECT_ID(table_name), column_name, 'IsIdentity') = 1
)
BEGIN
    SET @insertCommand =
        'SET IDENTITY_INSERT ' + @tableName + ' ON;'
        + 'TRUNCATE TABLE ' + @tableName + ';'
        + @insertCommand + ';'
        + 'SET IDENTITY_INSERT ' + @tableName + ' OFF;'
    /*
    SET IDENTITY_INSERT SomeTable ON
    TRUNCATE TABLE SomeTable
    INSERT INTO SomeTable(Column1, Column2) SELECT Column1, Column2 FROM SomeTable_bck
    SET IDENTITY_INSERT SomeTable OFF
    */
END
ELSE
BEGIN
    SET @insertCommand =
        'TRUNCATE TABLE ' + @tableName + ';'
        + @insertCommand
    /*
    TRUNCATE TABLE SomeTable
    INSERT INTO SomeTable(Column1, Column2) SELECT Column1, Column2 FROM SomeTable_bck
    */
END

PRINT @insertCommand
EXEC sp_executesql @insertCommand
It's easy to see that you can specify @tableName and @tableName_bck however you like. For example, this can go in a stored procedure, so the script is reusable.
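As a minimal sketch of that idea for the backup half (the procedure name is hypothetical; the restore half can be wrapped the same way):

CREATE PROCEDURE dbo.usp_BackupTable
    @tableName nvarchar(max),
    @tableName_bck nvarchar(max)
AS
BEGIN
    DECLARE @insertCommand nvarchar(max);
    SET @insertCommand = 'SELECT * INTO ' + @tableName_bck + ' FROM ' + @tableName;
    EXEC sp_executesql @insertCommand;
END
GO

-- usage:
EXEC dbo.usp_BackupTable 'SomeTable', 'SomeTable_bck';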
There are MANY methods to do this, but by far the simplest is to take a backup of the database, work with it, then restore from the backup when done.
Backing up the table is certainly viable, but it's not the easiest method, and once you start working with multiple tables it gets harder. So rather than address your specific example of restoring a single table, I'm offering general advice on better management of test data.
The safest way of doing this is to NOT restore the original, but rather to not even touch the original: take a backup of it, and then restore it to a new test server. Best practices dictate that you should never be doing test or development work on a live database anyway. This is also pretty easy, as well as safe.
Have you considered using a SQL Server unit-testing framework such as the open source tSQLt framework?
See http://tsqlt.org/
A tSQLt test runs in a transaction so whatever you do within your test will get rolled back.
It has a concept of a "FakeTable", which is a copy of the original table minus the constraints, if those get in the way of your test setup.
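As a minimal sketch of what that looks like (the test class name and the Name column are placeholders):

EXEC tSQLt.NewTestClass 'MyTests';
GO
CREATE PROCEDURE MyTests.[test something against my_Users]
AS
BEGIN
    -- swaps in an empty, constraint-free copy of the table for this test only
    EXEC tSQLt.FakeTable 'dbo.my_Users';

    INSERT INTO dbo.my_Users (Name) VALUES ('test user');  -- Name is a hypothetical column

    -- ...exercise the code under test, then assert, e.g.:
    DECLARE @cnt int = (SELECT COUNT(*) FROM dbo.my_Users);
    EXEC tSQLt.AssertEquals 1, @cnt;
END
GO
EXEC tSQLt.Run 'MyTests';

Because each test runs in a transaction that is rolled back, dbo.my_Users is untouched afterwards.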

using temp tables in SQL Azure

I am writing a query to pivot table elements, where the column names are generated dynamically.
SET @query = N'SELECT STUDENT_ID, ROLL_NO, TITLE, STUDENT_NAME, EXAM_NAME, ' +
    @cols +
    N' INTO ##FINAL
    FROM
    (
        SELECT * FROM #AVERAGES
        UNION
        SELECT * FROM #MARKS
        UNION
        SELECT * FROM #GRACEMARKS
        UNION
        SELECT * FROM #TOTAL
    ) p
    PIVOT
    (
        MAX([MARKS])
        FOR SUBJECT_ID IN ( ' + @cols + N' )
    ) AS FINAL
    ORDER BY STUDENT_ID ASC, DISPLAYORDER ASC, EXAM_NAME ASC;'

EXECUTE (@query)

select * from ##FINAL
This query works properly in my local database, but it doesn't work in SQL Azure, since global temp tables are not allowed there.
If I change ##FINAL to #FINAL, then even in my local database it gives me the error
Invalid object name '#FINAL'.
How can I resolve this issue?
Okay, after saying I didn't think it could be done, I might have a way. It's ugly, though. Hopefully you can play with the sample below and adapt it to your query (without having your schema and data, it's too tricky for me to attempt to write it):
declare @cols varchar(max)
set @cols = 'object_id,schema_id,parent_object_id'

--Create a temp table with the known columns
create table #Boris (
    ID int IDENTITY(1,1) not null
)

--Alter the temp table to add the varying columns. Thankfully, they're all ints.
--For unknown types, varchar(max) may be more appropriate, and will hopefully convert.
declare @tempcols varchar(max)
set @tempcols = @cols

while LEN(@tempcols) > 0
begin
    declare @col varchar(max)
    set @col = CASE WHEN CHARINDEX(',', @tempcols) > 0 THEN SUBSTRING(@tempcols, 1, CHARINDEX(',', @tempcols) - 1) ELSE @tempcols END
    set @tempcols = CASE WHEN LEN(@col) = LEN(@tempcols) THEN '' ELSE SUBSTRING(@tempcols, LEN(@col) + 2, 10000000) END

    declare @sql1 varchar(max)
    set @sql1 = 'alter table #Boris add [' + @col + '] int null'
    exec (@sql1)
end

declare @sql varchar(max)
set @sql = 'insert into #Boris (' + @cols + ') select ' + @cols + ' from sys.objects'
exec (@sql)

select * from #Boris

drop table #Boris
The key is to create the temp table in the outer scope; inner scopes (code running within EXEC statements) then have access to the same temp table. The above worked on SQL Server 2008, but I don't have an Azure instance to play with, so it's not tested there.
If you create a temp table, it's visible from dynamic SQL executed in your spid; if you create the table in dynamic SQL, it's not visible outside of that.
There is a workaround: you can create a stub table and alter it in your dynamic SQL. It requires a bit of string manipulation, but I've used this technique to generate dynamic datasets for tsqlunit.
CREATE TABLE #t1
(
    DummyCol int
)

EXEC (N'ALTER TABLE #t1 ADD foo INT')

EXEC ('insert into #t1 (DummyCol, foo) VALUES (1, 2)')

EXEC ('ALTER TABLE #t1 DROP COLUMN DummyCol')

select * from #t1