We have a database called AVL in SQL Server 2008 R2 SE. This database has many tables, but there is one in particular called ASSETLOCATION that has 46 millon rows right now and accounts for 99.9% of the total database size.
This table has information from 2008 to date, and the actual rate of growing is about 120k records daily.
Now, there are 2 situations we like to address:
Performance is starting to degrade slowly, and everything is
optimized so there's not much to do
Backup times are increasing and is becoming a problem (we do 1 full
backup everyday). The BAK file is 11GB, after winrar does his thing
the final size 2GB and then a script sends the file offsite. We have
a T1 and pulling 2GB through the wire is taking about 5 hours.
All this is normal but here's the catch I want to capitalize on: +90% of SQL statements use information only 3 months old or less, in other words, data from 2008, 2009 and 2010 don't get accessed often.
I was thinking on creating one new database for each year. Lets say:
- AVL2008 database, only table there will be ASSETLOCATION with records from 2008
- AVL2009 database, only table there will be ASSETLOCATION with records from 2009
- AVL2010 database, only table there will be ASSETLOCATION with records from 2010
As you have already guessed data from the past don't get changed, so this will be great from the backup perspective since the AVL database will only have the records from the current year. This approach will also help performance a lot.
Now for the problem. Assume the ASSETLOCATION table has the following columns:
- IDASSETLOCATION (int, PK identity)
- IDASSET (int, FK to ASSET table)
- WHEN (datetime)
- LATLONG (varchar(22), spatial info)
I need to create a view in the AVL database called "vASSETLOCATION", witch is quite simple, but I don't want the view accessing all the databases and joining the ASSETLOCATION tables via UNION, rather the only ones needed based on the WHEN field. For example:
select * from vASSETLOCATION where [WHEN] between '2008-01-01' and '2008-01-02'
In this case the view should ONLY ACCESS the AVL2008.ASSETLOCATION table
select * from vASSETLOCATION where [WHEN] between '2008-12-29' and '2009-01-05'
In this case the view should access AVL2008.ASSETLOCATION and AVL2009.ASSETLOCATION
select * from vASSETLOCATION where
([WHEN] between '2008-01-01' and '2008-01-01')
or
([WHEN] = getdate())
In this case the view should access AVL2008.ASSETLOCATION and AVL.ASSETLOCATION
I know a table scalar UDF in place of the view will solve the problem, but there are more than only 4 fields and [WHEN] is not the only field we may want to include in the where part.
Before anyone suggest it, the table partitioning feature will perhaps help in performance, but NOT in the backup problem.
If there a way to do this in a view?
Thanks.-
This sounds like a classic case for table partitioning or distributed partitioned views.
However you can work around this without ponying up the price for Enterprise edition (or doing all the prep work required to support those features) using some smart code that looks at the problem a little differently. You don't want a single view that accesses all of the tables across the different databases, but what if you had multiple views and a stored procedure to control how they're accessed?
Create views for the most common access patterns. Perhaps you have a view that covers date ranges for 2008-2010, 2008-2009, 2009-2010, etc. They might look like this:
CREATE VIEW dbo.vAL_2008_2009
AS
SELECT * FROM AVL2008.dbo.ASSETLOCATION
UNION ALL
SELECT * FROM AVL2009.dbo.ASSETLOCATION;
GO
CREATE VIEW dbo.vAL_2008_2010
AS
SELECT * FROM AVL2008.dbo.ASSETLOCATION
UNION ALL
SELECT * FROM AVL2009.dbo.ASSETLOCATION
UNION ALL
SELECT * FROM AVL2010.dbo.ASSETLOCATION;
GO
-- etc. etc.
Now your code that handles the queries can take the input date parameters and calculate which view it needs to query. For example:
CREATE PROCEDURE dbo.DetermineViews
#StartDate DATETIME,
#EndDate DATETIME,
#optionalToday BIT = 0
AS
BEGIN
SET NOCOUNT ON;
DECLARE #sql NVARCHAR(MAX) = N'';
SET #sql = #sql + N'SELECT * FROM ' + CASE
WHEN #StartDate >= '20080101' AND #EndDate < '20090101' THEN 'AVL2008.dbo.ASSETLOCATION'
WHEN #StartDate >= '20080101' AND #EndDate < '20100101' THEN 'dbo.vAL_2008_2009'
WHEN #StartDate >= '20080101' AND #EndDate < '20110101' THEN 'dbo.vAL_2008_2010'
-- etc. etc.
WHEN YEAR(#StartDate) = YEAR(CURRENT_TIMESTAMP) THEN 'AVL.dbo.ASSETLOCATION'
ELSE '' END;
IF #OptionalToday = 1 AND YEAR(#StartDate) <> YEAR(CURRENT_TIMESTAMP)
BEGIN
SET #sql = #sql + N'UNION ALL SELECT * FROM AVL.dbo.ASSETLOCATION'
END
SET #sql = #sql + ' WHERE [WHEN] BETWEEN '''
+ CONVERT(CHAR(8), #StartDate, 112) + ''' AND '''
+ CONVERT(CHAR(8), #EndDate, 112) + '''';
IF #OptionalToday = 1
BEGIN
SET #sql = #sql + ' OR ([WHEN] >= DATEDIFF(DAY, 0, CURRENT_TIMESTAMP)
AND [WHEN] < DATEADD(DAY, 1, DATEDIFF(DAY, 0, CURRENT_TIMESTAMP)';
END
PRINT #sql;
-- EXEC sp_executeSQL #sql;
END
GO
I'm probably missing some of your business logic, and you'll certainly want to add some error-handling in there and test the junk out of it, but this is a relatively easy-to-maintain solution that only requires updating that coincides with creating a new database to archive last year's data, which it sounds like only happens once a year.
Related
I have a table with 3.4 million rows. I want to copy this whole data into another table.
I am performing this task using the below query:
select *
into new_items
from productDB.dbo.items
I need to know the best possible way to do this task.
I had the same problem, except I have a table with 2 billion rows, so the log file would grow to no end if I did this, even with the recovery model set to Bulk-Logging:
insert into newtable select * from oldtable
So I operate on blocks of data. This way, if the transfer is interupted, you just restart it. Also, you don't need a log file as big as the table. You also seem to get less tempdb I/O, not sure why.
set identity_insert newtable on
DECLARE #StartID bigint, #LastID bigint, #EndID bigint
select #StartID = isNull(max(id),0) + 1
from newtable
select #LastID = max(ID)
from oldtable
while #StartID < #LastID
begin
set #EndID = #StartID + 1000000
insert into newtable (FIELDS,GO,HERE)
select FIELDS,GO,HERE from oldtable (NOLOCK)
where id BETWEEN #StartID AND #EndId
set #StartID = #EndID + 1
end
set identity_insert newtable off
go
You might need to change how you deal with IDs, this works best if your table is clustered by ID.
If you are copying into a new table, the quickest way is probably what you have in your question, unless your rows are very large.
If your rows are very large, you may want to use the bulk insert functions in SQL Server. I think you can call them from C#.
Or you can first download that data into a text file, then bulk-copy (bcp) it. This has the additional benefit of allowing you to ignore keys, indexes etc.
Also try the Import/Export utility that comes with the SQL Management Studio; not sure whether it will be as fast as a straight bulk-copy, but it should allow you to skip the intermediate step of writing out as a flat file, and just copy directly table-to-table, which might be a bit faster than your SELECT INTO statement.
I have been working with our DBA to copy an audit table with 240M rows to another database.
Using a simple select/insert created a huge tempdb file.
Using a the Import/Export wizard worked but copied 8M rows in 10min
Creating a custom SSIS package and adjusting settings copied 30M rows in 10Min
The SSIS package turned out to be the fastest and most efficent for our purposes
Earl
Here's another way of transferring large tables. I've just transferred 105 million rows between two servers using this. Quite quick too.
Right-click on the database and choose Tasks/Export Data.
A wizard will take you through the steps but you choosing your SQL server client as the data source and target will allow you to select the database and table(s) you wish to transfer.
For more information, see https://www.mssqltips.com/sqlservertutorial/202/simple-way-to-export-data-from-sql-server/
If it's a 1 time import, the Import/Export utility in SSMS will probably work the easiest and fastest. SSIS also seems to work better for importing large data sets than a straight INSERT.
BULK INSERT or BCP can also be used to import large record sets.
Another option would be to temporarily remove all indexes and constraints on the table you're importing into and add them back once the import process completes. A straight INSERT that previously failed might work in those cases.
If you're dealing with timeouts or locking/blocking issues when going directly from one database to another, you might consider going from one db into TEMPDB and then going from TEMPDB into the other database as it minimizes the effects of locking and blocking processes on either side. TempDB won't block or lock the source and it won't hold up the destination.
Those are a few options to try.
-Eric Isaacs
Simple Insert/Select sp's work great until the row count exceeds 1 mil. I've watched tempdb file explode trying to insert/select 20 mil + rows. The simplest solution is SSIS setting the batch row size buffer to 5000 and commit size buffer to 1000.
I know this is late, but if you are encountering semaphore timeouts then you can use row_number to set increments for your insert(s) using something like
INSERT INTO DestinationTable (column1, column2, etc)
FROM (
SELECT ROW_NUMBER() OVER (ORDER BY ID) AS RN , column1, column2, etc
FROM SourceTable ) AS A
WHERE A.RN >= 1 AND A.RN <= 10000 )
The size of the log file will grow, so there is that to contend with. You get better performance if you disable constraints and index when inserting into an existing table. Then enable the constraints and rebuild the index for the table you inserted into once the insertion is complete.
I like the solution from #Mathieu Longtin to copy in batches thereby minimising log file issues and created a version with OFFSET FETCH as suggested by #CervEd.
Others have suggested using the Import/Export Wizard or SSIS packages, but that's not always possible.
It's probably overkill for many but my solution includes some checks for record counts and outputs progress as well.
USE [MyDB]
GO
SET NOCOUNT ON;
DECLARE #intStart int = 1;
DECLARE #intCount int;
DECLARE #intFetch int = 10000;
DECLARE #strStatus VARCHAR(200);
DECLARE #intCopied int = 0;
SET #strStatus = CONVERT(VARCHAR(30), GETDATE()) + ' Getting count of HISTORY records currently in MyTable...';
RAISERROR (#strStatus, 10, 1) WITH NOWAIT;
SELECT #intCount = COUNT(*) FROM [dbo].MyTable WHERE IsHistory = 1;
SET #strStatus = CONVERT(VARCHAR(30), GETDATE()) + ' Count of HISTORY records currently in MyTable: ' + CONVERT(VARCHAR(20), #intCount);
RAISERROR (#strStatus, 10, 1) WITH NOWAIT; --(note: PRINT resets ##ROWCOUNT to 0 so using RAISERROR instead)
SET #strStatus = CONVERT(VARCHAR(30), GETDATE()) + ' Starting copy...';
RAISERROR (#strStatus, 10, 1) WITH NOWAIT;
WHILE #intStart < #intCount
BEGIN
INSERT INTO [dbo].[MyTable_History] (
[PK1], [PK2], [PK3], [Data1], [Data2])
SELECT
[PK1], [PK2], [PK3], [Data1], [Data2]
FROM [MyDB].[dbo].[MyTable]
WHERE IsHistory = 1
ORDER BY
[PK1], [PK2], [PK3]
OFFSET #intStart - 1 ROWS
FETCH NEXT #intFetch ROWS ONLY;
SET #intCopied = #intCopied + ##ROWCOUNT;
SET #strStatus = CONVERT(VARCHAR(30), GETDATE()) + ' Records copied so far: ' + CONVERT(VARCHAR(20), #intCopied);
RAISERROR (#strStatus, 10, 1) WITH NOWAIT;
SET #intStart = #intStart + #intFetch;
END
--Check the record count is correct.
IF #intCopied = #intCount
BEGIN
SET #strStatus = CONVERT(VARCHAR(30), GETDATE()) + ' Correct record count.';
RAISERROR (#strStatus, 10, 1) WITH NOWAIT;
END
ELSE
BEGIN
SET #strStatus = CONVERT(VARCHAR(30), GETDATE()) + ' Only ' + CONVERT(VARCHAR(20), #intCopied) + ' records were copied, expected: ' + CONVERT(VARCHAR(20), #intCount);
RAISERROR (#strStatus, 10, 1) WITH NOWAIT;
END
GO
If your focus is Archiving (DW) and are dealing with VLDB with 100+ partitioned tables and you want to isolate most of these resource intensive work on a non production server (OLTP) here is a suggestion (OLTP -> DW)
1) Use backup / Restore to get the data onto the archive server (so now, on Archive or DW you will have Stage and Target database)
2) Stage database: Use partition switch to move data to corresponding stage table
3) Use SSIS to transfer data from staged database to target database for each staged table on both sides
4) Target database: Use partition switch on target database to move data from stage to base table
Hope this helps.
select * into new_items from productDB.dbo.items
That pretty much is it. THis is the most efficient way to do it.
Our development team works on SQL Server and writes stored procedures for our product.
We need something like a version control system for those procedures or any other objects.
Sometimes I change a stored procedure, and someone else in my team changes it and I don't know any thing about it.
Is there any solution for that?
If you want to do it via code you could run this on a daily or hourly basis to get a list of all procs that were changed in the last day:
select *
from sys.objects
where datediff(dd, create_date, getdate()) < 1
or datediff(dd, modify_date, getdate() < 1)
and type = 'P';
or you could create a ddl trigger:
Create trigger prochanged On database
For create_procedure, alter_procedure, drop procedure
as
begin
set nocount on
Declare #data xml
set #data = Eventdata()
-- save #data to a table...
end
This will allow you to save all kinds of information every time a proc is created, changed or deleted.
This is my first post on Stackoverflow so I hope I'm correctly following all protocols!
I'm struggling with a stored procedure in which I create a table variable and filling this table with an insert statement using an inner join. The insert itself is simple, but it gets complicated because the inner join is done on a local variable. Since the optimizer doesn't have statistics for this variable my estimated row count is getting srewed up.
The specific piece of code that causes trouble:
declare #minorderid int
select #minorderid = MIN(lo.order_id)
from [order] lo with(nolock)
where lo.order_datetime >= #datefrom
insert into #OrderTableLog_initial
(order_id, order_log_id, order_id, order_datetime, account_id, domain_id)
select ot.order_id, lol.order_log_id, ot.order_id, ot.order_datetime, ot.account_id, ot.domain_id
from [order] ot with(nolock)
inner join order_log lol with(nolock)
on ot.order_id = lol.order_id
and ot.order_datetime >= #datefrom
where (ot.domain_id in (1,2,4) and lol.order_log_id not in ( select order_log_id
from dbo.order_log_detail lld with(nolock)
where order_id >= #minorderid
)
or
(ot.domain_id = 3 and ot.order_id not IN (select order_id
from dbo.order_log_detail_spa llds with(nolock)
where order_id >= #minorderid
)
))
order by lol.order_id, lol.order_log_id
The #datefrom local variable is also declared earlier in the stored procedure:
declare #datefrom datetime
if datepart(hour,GETDATE()) between 4 and 9
begin
set #datefrom = '2011-01-01'
end
else
begin
set #datefrom = DATEADD(DAY,-2,GETDATE())
end
I've also tested this with a temporary table in stead of a table variable, but nothing changes. However, when I replace the local variable >= #datefrom with a fixed datestamp then my estimates and actuals are almost the same.
ot.order_datetime >= #datefrom = SQL Sentry Plan Explorer
ot.order_datetime >= '2017-05-03 18:00:00.000' = SQL Sentry Plan Explorer
I've come to understand that there's a way to fix this by turning this code into a dynamic sp, but I'm not sure how to do this. I would be grateful if someone could give me suggestions on how to do this. Maybe I have to use a complete other approach? Forgive me if I forgot something to mention, this is my first post.
EDIT:
MSSQL version = 11.0.5636
I've also tested with trace flag 2453, but with no success
Best regards,
Peter
Indeed, the behavior what you are experiencing is because the variables. SQL Server won't store an execution plan for each and every possible inputs, thus for some queries the execution plan may or may not optimal.
To answer your explicit question: You'll have to create a varchar variable and build the query as a string, then execute it.
Some notes before the actual code:
This can be prone to SQL injection (in general)
SQL Server will store the plans separately, meaning they will use more memory and possibly knock out other plans from the cache
Using an imaginary setup, this is what you want to do:
DECLARE #inputDate DATETIME2 = '2017-01-01 12:21:54';
DELCARE #dynamiSQL NVARCHAR(MAX) = CONCAT('SELECT col1, col2 FROM MyTable WHERE myDateColumn = ''', FORMAT(#inputDate, 'yyyy-MM-dd HH:mm:ss'), ''';');
INSERT INTO #myTableVar (col1, col2)
EXEC sp_executesql #stmt = #dynamicSQL;
As an additional note:
you can try to use EXISTS and NOT EXISTS instead of IN and NOT IN.
You can try to use a temp table (#myTempTable) instead of a local variable and put some indexes on it. Physical temp tables can perform better with large amount of data and you can put indexes on it. (For more info you can go here: What's the difference between a temp table and table variable in SQL Server? or to the official documentation)
I am using the following SQL in my stored procedure to not filter by date parameters if they are null.
WHERE (Allocated >= ISNULL(#allocatedStartDate, '01/01/1900')
AND Allocated <= ISNULL(#allocatedEndDate,'01/01/3000'))
AND
(MatterOpened >= ISNULL(#matterOpenedStartDate, '01/01/1900')
AND MatterOpened <= ISNULL(#matterOpenedEndDate, '01/01/3000'))
Will this give any kind of performance hit when dealing with a lot of records?
Is there a better way to do this?
Number of records - around 500k
Or just let the query optimizer have it:
WHERE ( #allocatedStartDate is NULL or Allocated >= allocatedStartDate ) and
( #allocatedEndDate is NULL or Allocated <= #allocatedEndDate ) and
( #matterOpenedStartDate is NULL or MatterOpened >= #matterOpenedStartDate ) and
( #matterOpenedEndDate is NULL or MatterOpened <= #matterOpenedEndDate )
Note that this is not logically equivalent to your query. The last line uses column MatterOpened, not Allocated, as I assume that was a typographic error.
If performance is really an issue, you may want to consider adding indexes and changing the stored procedure to execute different queries based on the parameters. At least break it into: no filter, filter only on Allocated, filter only on MatterOpened, filter on both columns.
In a lot of cases, dynamic SQL can be better for you instead of trying to rely on the optimizer to cache a good plan for both NULL and non-NULL parameters.
DECLARE #sql NVARCHAR(MAX);
SET #sql = N'SELECT
...
WHERE 1 = 1';
SET #sql = #sql + CASE WHEN #allocatedStartDate IS NOT NULL THEN
' AND Allocated >= ''' + CONVERT(CHAR(8), #allocatedStartDate, 112) + '''';
-- repeat for other clauses
EXEC sp_executesql #sql;
No, it's not fun to maintain, but each variation should get its own plan in the cache. You'll want to test with different settings for "Optimize for ad hoc workloads" and database-level paramaterization settings. Oops, just noticed 2005. Keep those in mind for the future (and any readers who aren't still stuck on 2005).
Also make sure to use EXEC sp_executesql and not EXEC.
Instead of checking to see if the variable is null in your query, check them at the beginning of your stored procedure and change the value to your default
SELECT
#allocatedStartDate = ISNULL(#allocatedStartDate, '01/01/1900'),
#allocatedEndDate = ISNULL(#allocatedEndDate,'01/01/3000'),
#matterOpenedStartDate = ISNULL(#matterOpenedStartDate, '01/01/1900'),
#matterOpenedEndDate = ISNULL(#matterOpenedEndDate, '01/01/3000')
Maybe something like this:
DECLARE #allocatedStartDate DATETIME=GETDATE()
DECLARE #allocatedEndDate DATETIME=GETDATE()-2
;WITH CTE AS
(
SELECT
ISNULL(#allocatedStartDate, '01/01/1900') AS allocatedStartDate,
ISNULL(#allocatedEndDate,'01/01/3000') AS allocatedEndDate
)
SELECT
*
FROM
YourTable
CROSS JOIN CTE
WHERE (Allocated >= CTE.allocatedStartDate
AND Allocated <= CTE.allocatedEndDate)
AND
(MatterOpened >= CTE.allocatedStartDate
AND Allocated <= CTE.allocatedEndDate)
I have a table with 3.4 million rows. I want to copy this whole data into another table.
I am performing this task using the below query:
select *
into new_items
from productDB.dbo.items
I need to know the best possible way to do this task.
I had the same problem, except I have a table with 2 billion rows, so the log file would grow to no end if I did this, even with the recovery model set to Bulk-Logging:
insert into newtable select * from oldtable
So I operate on blocks of data. This way, if the transfer is interupted, you just restart it. Also, you don't need a log file as big as the table. You also seem to get less tempdb I/O, not sure why.
set identity_insert newtable on
DECLARE #StartID bigint, #LastID bigint, #EndID bigint
select #StartID = isNull(max(id),0) + 1
from newtable
select #LastID = max(ID)
from oldtable
while #StartID < #LastID
begin
set #EndID = #StartID + 1000000
insert into newtable (FIELDS,GO,HERE)
select FIELDS,GO,HERE from oldtable (NOLOCK)
where id BETWEEN #StartID AND #EndId
set #StartID = #EndID + 1
end
set identity_insert newtable off
go
You might need to change how you deal with IDs, this works best if your table is clustered by ID.
If you are copying into a new table, the quickest way is probably what you have in your question, unless your rows are very large.
If your rows are very large, you may want to use the bulk insert functions in SQL Server. I think you can call them from C#.
Or you can first download that data into a text file, then bulk-copy (bcp) it. This has the additional benefit of allowing you to ignore keys, indexes etc.
Also try the Import/Export utility that comes with the SQL Management Studio; not sure whether it will be as fast as a straight bulk-copy, but it should allow you to skip the intermediate step of writing out as a flat file, and just copy directly table-to-table, which might be a bit faster than your SELECT INTO statement.
I have been working with our DBA to copy an audit table with 240M rows to another database.
Using a simple select/insert created a huge tempdb file.
Using a the Import/Export wizard worked but copied 8M rows in 10min
Creating a custom SSIS package and adjusting settings copied 30M rows in 10Min
The SSIS package turned out to be the fastest and most efficent for our purposes
Earl
Here's another way of transferring large tables. I've just transferred 105 million rows between two servers using this. Quite quick too.
Right-click on the database and choose Tasks/Export Data.
A wizard will take you through the steps but you choosing your SQL server client as the data source and target will allow you to select the database and table(s) you wish to transfer.
For more information, see https://www.mssqltips.com/sqlservertutorial/202/simple-way-to-export-data-from-sql-server/
If it's a 1 time import, the Import/Export utility in SSMS will probably work the easiest and fastest. SSIS also seems to work better for importing large data sets than a straight INSERT.
BULK INSERT or BCP can also be used to import large record sets.
Another option would be to temporarily remove all indexes and constraints on the table you're importing into and add them back once the import process completes. A straight INSERT that previously failed might work in those cases.
If you're dealing with timeouts or locking/blocking issues when going directly from one database to another, you might consider going from one db into TEMPDB and then going from TEMPDB into the other database as it minimizes the effects of locking and blocking processes on either side. TempDB won't block or lock the source and it won't hold up the destination.
Those are a few options to try.
-Eric Isaacs
Simple Insert/Select sp's work great until the row count exceeds 1 mil. I've watched tempdb file explode trying to insert/select 20 mil + rows. The simplest solution is SSIS setting the batch row size buffer to 5000 and commit size buffer to 1000.
I know this is late, but if you are encountering semaphore timeouts then you can use row_number to set increments for your insert(s) using something like
INSERT INTO DestinationTable (column1, column2, etc)
FROM (
SELECT ROW_NUMBER() OVER (ORDER BY ID) AS RN , column1, column2, etc
FROM SourceTable ) AS A
WHERE A.RN >= 1 AND A.RN <= 10000 )
The size of the log file will grow, so there is that to contend with. You get better performance if you disable constraints and index when inserting into an existing table. Then enable the constraints and rebuild the index for the table you inserted into once the insertion is complete.
I like the solution from #Mathieu Longtin to copy in batches thereby minimising log file issues and created a version with OFFSET FETCH as suggested by #CervEd.
Others have suggested using the Import/Export Wizard or SSIS packages, but that's not always possible.
It's probably overkill for many but my solution includes some checks for record counts and outputs progress as well.
USE [MyDB]
GO
SET NOCOUNT ON;
DECLARE #intStart int = 1;
DECLARE #intCount int;
DECLARE #intFetch int = 10000;
DECLARE #strStatus VARCHAR(200);
DECLARE #intCopied int = 0;
SET #strStatus = CONVERT(VARCHAR(30), GETDATE()) + ' Getting count of HISTORY records currently in MyTable...';
RAISERROR (#strStatus, 10, 1) WITH NOWAIT;
SELECT #intCount = COUNT(*) FROM [dbo].MyTable WHERE IsHistory = 1;
SET #strStatus = CONVERT(VARCHAR(30), GETDATE()) + ' Count of HISTORY records currently in MyTable: ' + CONVERT(VARCHAR(20), #intCount);
RAISERROR (#strStatus, 10, 1) WITH NOWAIT; --(note: PRINT resets ##ROWCOUNT to 0 so using RAISERROR instead)
SET #strStatus = CONVERT(VARCHAR(30), GETDATE()) + ' Starting copy...';
RAISERROR (#strStatus, 10, 1) WITH NOWAIT;
WHILE #intStart < #intCount
BEGIN
INSERT INTO [dbo].[MyTable_History] (
[PK1], [PK2], [PK3], [Data1], [Data2])
SELECT
[PK1], [PK2], [PK3], [Data1], [Data2]
FROM [MyDB].[dbo].[MyTable]
WHERE IsHistory = 1
ORDER BY
[PK1], [PK2], [PK3]
OFFSET #intStart - 1 ROWS
FETCH NEXT #intFetch ROWS ONLY;
SET #intCopied = #intCopied + ##ROWCOUNT;
SET #strStatus = CONVERT(VARCHAR(30), GETDATE()) + ' Records copied so far: ' + CONVERT(VARCHAR(20), #intCopied);
RAISERROR (#strStatus, 10, 1) WITH NOWAIT;
SET #intStart = #intStart + #intFetch;
END
--Check the record count is correct.
IF #intCopied = #intCount
BEGIN
SET #strStatus = CONVERT(VARCHAR(30), GETDATE()) + ' Correct record count.';
RAISERROR (#strStatus, 10, 1) WITH NOWAIT;
END
ELSE
BEGIN
SET #strStatus = CONVERT(VARCHAR(30), GETDATE()) + ' Only ' + CONVERT(VARCHAR(20), #intCopied) + ' records were copied, expected: ' + CONVERT(VARCHAR(20), #intCount);
RAISERROR (#strStatus, 10, 1) WITH NOWAIT;
END
GO
If your focus is Archiving (DW) and are dealing with VLDB with 100+ partitioned tables and you want to isolate most of these resource intensive work on a non production server (OLTP) here is a suggestion (OLTP -> DW)
1) Use backup / Restore to get the data onto the archive server (so now, on Archive or DW you will have Stage and Target database)
2) Stage database: Use partition switch to move data to corresponding stage table
3) Use SSIS to transfer data from staged database to target database for each staged table on both sides
4) Target database: Use partition switch on target database to move data from stage to base table
Hope this helps.
select * into new_items from productDB.dbo.items
That pretty much is it. THis is the most efficient way to do it.