SSIS process to archive production DB

I am new to SSIS. I have been given a task to archive data from the production DB to an archive DB, and then delete the archived data from production, keeping 13 months of data in production. There are around 300+ tables, of which I have to archive around 50. Of those 50 tables, 6 are around 1 TB in size.
These are the two methods we are planning:
1. Using 50 data flow tasks in a sequence container.
2. Using SELECT * FROM ... INSERT INTO ..., where the table and column names are stored in a configuration table and the data is archived through a loop.
Which would be the better option?
If there is another, better method, please let me know.
What precautions (performance tips) should I take while doing the archive process so that it does not affect the production server?
Please give your suggestions.
Thanks
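For option 2, one common pattern is to archive and delete in small batches so that locks on the production tables stay short. A minimal T-SQL sketch for a single table, assuming hypothetical names (dbo.Orders, an OrderDate column, and an ArchiveDB on the same instance):

    DECLARE @cutoff DATETIME2 = DATEADD(MONTH, -13, SYSDATETIME());
    DECLARE @batch INT = 50000;

    WHILE 1 = 1
    BEGIN
        -- Copy and delete in one statement; keeps each transaction small.
        DELETE TOP (@batch) FROM dbo.Orders
        OUTPUT deleted.* INTO ArchiveDB.dbo.Orders
        WHERE OrderDate < @cutoff;

        IF @@ROWCOUNT = 0 BREAK;  -- nothing older than the cutoff remains
    END

Run it outside peak hours; note that OUTPUT ... INTO requires the archive table to live on the same instance, so for a separate archive server you would stage locally first and transfer from the staging table.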

Related

Merge rows from multiple tables SQL (incremental)

I'm consolidating the information from 7 SQL databases into one.
I've made an SSIS package, used a Lookup transformation, and managed to get the result as expected.
The problem: 30 million rows. I want to run a daily task that adds the new rows in the source tables to the destination table.
As it stands, it takes about 4 hours to execute the package...
Any suggestions?
Thanks!
I have only tried full cache mode...
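If the sources have an ascending key or an insert timestamp, one alternative to a full-cache Lookup over 30 million rows is a high-water-mark filter in each source query, so only new rows enter the data flow at all. A sketch with hypothetical names (dbo.DestTable, dbo.SrcTable, an ascending ID column):

    -- Read the highest key already loaded into the destination.
    DECLARE @LastID BIGINT;
    SELECT @LastID = ISNULL(MAX(ID), 0) FROM dbo.DestTable;

    -- Only rows newer than the watermark leave the source server.
    SELECT s.ID, s.Col1, s.Col2
    FROM dbo.SrcTable AS s
    WHERE s.ID > @LastID;

In SSIS you would read the watermark into a variable with an Execute SQL Task and build the source query from it, which avoids caching the whole destination table.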

SSIS Incremental Load-15 mins

I have 2 tables. The source table is on a linked server and the destination table is on another server.
I want my data load to happen in the following manner:
Every night I have a scheduled job that does a full dump, i.e. truncates the destination table and loads all the data from the source.
Every 15 minutes an incremental load runs, since data is ingested into the source every second and I need to replicate it to the destination.
For the incremental load I currently have scripts stored in a stored procedure, but going forward we would like to implement this in SSIS.
The scripts work in the following manner:
I have an Inserted_Date column. I take the max of that column in the destination, delete all rows greater than or equal to Max(Inserted_Date), and insert all matching rows from the source into the destination. This job runs every 15 minutes.
How do I implement a similar scenario in SSIS?
I have worked in SSIS with Lookup and Conditional Split on ID columns, but the tables I am working with have a lot of rows, so the Lookup takes a lot of time and is not the right solution for my scenario.
Is there any way to get the Max(Inserted_Date) logic into the SSIS solution too? My end goal is to remove the script-based approach and replicate it in SSIS.
Here is the general Control Flow:
There's plenty to go on here, but you may need to learn how to set variables from an Execute SQL Task and so on.
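The T-SQL that the Execute SQL Tasks in that control flow would run is essentially your existing procedure. A sketch with hypothetical names (dbo.Dest and LinkedSrv.SrcDb.dbo.Src):

    DECLARE @MaxInserted DATETIME2;
    SELECT @MaxInserted = ISNULL(MAX(Inserted_Date), '19000101') FROM dbo.Dest;

    -- Drop the possibly incomplete tail window...
    DELETE FROM dbo.Dest WHERE Inserted_Date >= @MaxInserted;

    -- ...and reload it from the linked-server source.
    INSERT INTO dbo.Dest (ID, Payload, Inserted_Date)
    SELECT s.ID, s.Payload, s.Inserted_Date
    FROM LinkedSrv.SrcDb.dbo.Src AS s
    WHERE s.Inserted_Date >= @MaxInserted;

Read Max(Inserted_Date) into an SSIS variable via an Execute SQL Task with a single-row result set, then pass it to the delete and insert steps as a parameter.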

Pentaho | Tools-> Wizard-> Copy Tables

I want to copy tables from one database to another database.
I searched Google and found that this can be done with the Wizard option in the Tools menu in Spoon.
Currently I am trying to copy just one table from one database to another.
My table has just 130,000 records and it took 10 minutes to copy.
Can this loading time be improved? Copying 100k records should not take more than 10 seconds.
Try the MySQL bulk loader (note: that is Linux only),
or fix the batch size:
http://julienhofstede.blogspot.co.uk/2014/02/increase-mysql-output-to-80k-rowssecond.html
You'll get massive improvements that way.

Verify multiple tables and copy data in ssis/BIML?

I have a package with about 6 to 7 data flow tasks. Within those data flow tasks, I have from 5 to 70 tasks that copy data from a source (Oracle database) to a destination (SQL database). I need to take a count of each source table and copy the data only if the source is not empty. I currently have an Execute SQL Task that truncates all the tables; I would like to truncate only if my parameter is > 0. But with my number of tables (177), I can't afford to use a variable for each one to hold the result of the count and then test it. Can I make something work with BIML? Can I use a stored procedure and loop through it? I need some advice.
EDIT:
I think I did not explain myself correctly. I have multiple data flow tasks with a lot of source-to-destination copies. In my control flow, I have an Execute SQL Task that truncates all 177 of my tables. I need to do a count on all the source tables and store the results so I can pass them to my Execute SQL Task. After that, I want to check whether my variable is > 0 to decide whether to run the task. Is there any easier way to do this than creating 177 variables?
Thanks.
I hope I'm not too late for you. You can use bimlonline.com to reverse engineer your package.
BimlOnline.com is free.
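To avoid 177 variables, you can also keep the counts set-based: store them in one table and let a single Execute SQL Task build the truncate statements from it. A sketch, assuming a hypothetical #TableCounts table that an upstream step fills with the Oracle source row counts:

    CREATE TABLE #TableCounts (TableName SYSNAME, SourceRowCount BIGINT);
    -- ...populated upstream with one row per source table...

    -- Truncate only the destinations whose source actually has rows.
    DECLARE @sql NVARCHAR(MAX) = N'';
    SELECT @sql += N'TRUNCATE TABLE ' + QUOTENAME(TableName) + N';' + CHAR(10)
    FROM #TableCounts
    WHERE SourceRowCount > 0;

    EXEC sys.sp_executesql @sql;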

Using SSIS to create new Database from two separate databases

I am new to SSIS. I was given a task according to the scenario explained below.
Scenario:
I have two databases, A and B, on different machines, with around 25 tables and 20 columns each, with relationships and dependencies. My task is to create a database C with a selected number of tables, and in each table I don't require all the columns, only selected ones. The condition to be met is that the relationships should stay intact and be created automatically in the new database.
What I have done:
I have created a package using the Transfer SQL Server Objects Task to transfer the tables and relationships,
then I manually removed the columns that are not required,
and then I transferred the data using a data source and destination.
My question is: can I achieve all of this in one package? Also, after I have transferred the data, how can I schedule the package to transfer only the recently inserted rows to the new database?
Please help me.
Thanks in advance
You can schedule the package by using a SQL Server Agent job - one of the options for a job step is to run an SSIS package.
With regard to transferring new rows, I would either:
Track your current "position" in another table; this assumes you have either an ascending key or a timestamp column. Load the current position into an SSIS variable and use this variable in the WHERE clause of your data source queries.
Transfer all data across into "dump" copies of each table (no relationships/keys etc. required, just the same schema) and use a T-SQL MERGE statement to load the new rows in, then truncate the "dump" tables, as sketched below.
Hope this makes sense - it's a bit difficult to get across in writing.
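A minimal sketch of the second option, with hypothetical names (dbo.Customer and its staging copy dbo.Customer_Dump):

    -- Pull rows from the staging copy that the target doesn't have yet.
    MERGE dbo.Customer AS tgt
    USING dbo.Customer_Dump AS src
        ON tgt.CustomerID = src.CustomerID
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (CustomerID, Name, CreatedDate)
        VALUES (src.CustomerID, src.Name, src.CreatedDate);

    -- Empty the staging copy for the next run.
    TRUNCATE TABLE dbo.Customer_Dump;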