My Source and Destination tables exist on different servers. I am using Execute SQL Task to write Merge Statements to synchronize them.
Could anyone explain how I can reference two different databases that exist on different servers inside my Execute SQL Task?
Possible approaches:
I would suggest the following approaches instead of trying to use a MERGE statement in an Execute SQL Task across two database servers.
Approach #1:
Create two OLE DB connection managers, one for each SQL Server instance. For example, if you have two databases SourceDB and DestinationDB, you could create two connection managers named OLEDB_SourceDB and OLEDB_DestinationDB. You could also use ADO.NET connection managers if you prefer. Based on what I have read in SSIS books, the OLE DB connection manager performs better than the ADO.NET one.
Drag and drop a Data Flow Task on the Control Flow tab.
Within the Data Flow Task, configure an OLE DB Source to read the data from source database table.
Use a Lookup Transformation to check whether the data already exists in the destination table, using the unique key shared by the source and destination tables.
If the source table row does not exist in the destination table, then insert the row into the destination table using an OLE DB Destination.
If the source table row exists in the destination table, then insert the rows into a staging table on the destination database using another OLE DB Destination.
Place an Execute SQL Task after the Data Flow Task on the Control Flow tab. Write a query that would update the data in destination table using the staging table data.
Check the answer to the below SO question for detailed steps.
How do I optimize Upsert (Update and Insert) operation within SSIS package?
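As a rough sketch of the update query in the last step of Approach #1, assuming hypothetical names that do not come from the question (tables dbo.Destination and dbo.Destination_Staging, key column Id, data columns Col1 and Col2):

```sql
-- Hypothetical names: dbo.Destination, dbo.Destination_Staging, Id, Col1, Col2.
-- Both tables live in the destination database, so the statement runs on one
-- server and needs no cross-server reference.
UPDATE d
SET    d.Col1 = s.Col1,
       d.Col2 = s.Col2
FROM   dbo.Destination AS d
       INNER JOIN dbo.Destination_Staging AS s
               ON s.Id = d.Id;
```

The Execute SQL Task would use the OLEDB_DestinationDB connection manager, because only rows that already exist in the destination were routed to the staging table.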
Approach #2:
Create two OLE DB connection managers, one for each SQL Server instance. For example, if you have two databases SourceDB and DestinationDB, you could create two connection managers named OLEDB_SourceDB and OLEDB_DestinationDB.
Drag and drop a Data Flow Task on the Control Flow tab.
Within the Data Flow Task, configure an OLE DB Source to read the data from source database table and insert into a staging table using OLE DB Destination.
Place an Execute SQL Task after the Data Flow Task on the Control Flow tab. Write a query that would use the MERGE statement between staging table and the destination table.
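The MERGE in the last step of Approach #2 could look roughly like this. It is a minimal sketch with hypothetical table and column names (dbo.Destination, dbo.Destination_Staging, Id, Col1, Col2), and it assumes SQL Server 2008 or later and that Id is not an identity column:

```sql
-- Hypothetical names: dbo.Destination, dbo.Destination_Staging, Id, Col1, Col2.
-- The staging table holds a full copy of the source rows loaded by the Data Flow Task.
MERGE dbo.Destination AS tgt
USING dbo.Destination_Staging AS src
      ON tgt.Id = src.Id
WHEN MATCHED THEN
    -- Row exists on both sides: overwrite the destination values.
    UPDATE SET tgt.Col1 = src.Col1,
               tgt.Col2 = src.Col2
WHEN NOT MATCHED BY TARGET THEN
    -- Row exists only in staging: add it to the destination.
    INSERT (Id, Col1, Col2)
    VALUES (src.Id, src.Col1, src.Col2);
```

Because both tables are in the destination database, the statement never has to reference the source server.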
See this link - http://technet.microsoft.com/en-us/library/cc280522%28v=sql.105%29.aspx
Basically, to do this, you would need to get the data from the different servers into the same place with Data Flow tasks, and then perform an Execute SQL task to do the merge.
The Merge and Merge Join transformations in the SSIS Data Flow don't look like they do what you want to do.
Related
I cannot use linked server.
Both databases on both servers have the same structure but different data.
I have 10k rows to transfer from the DB on one server to the same DB on the other. I cannot restore the DB on the other server as it will take a huge amount of space that I don't have on the other server.
So, I have 2 options that I don't know how to execute:
Backup and restoring only one table - the table is linked to many other tables but these other tables exist on the other server too. Can I somehow delete or remove the other tables or make a backup only over one table?
I need to transfer 10k rows. How is it possible to create 10k insert queries based on selected 10k rows?
Can I somehow delete or remove the other tables or make a backup only over one table?
No, unfortunately you cannot do this.
How is it possible to create 10k insert queries based on selected 10k rows?
Right-click on the database -> Tasks -> Generate Scripts -> (Introduction) Next
Choose "Select specific database objects" -> Tables, choose the table you need -> Next
Advanced -> find "Types of data to script" and change it from "Schema only" (the default) to "Data only" -> OK
Choose where to save -> Next -> Next. Wait for the operation to finish.
This will generate the file with 10k inserts.
Another way is to use the Import/Export Wizard (the simplest way for a one-time import or export) if you have a link between the databases.
There are many ways to choose from; here is one using BCP, a tool that ships with SQL Server to import and export bulk data.
The outlines of the process:
Export the data from the source server to a file using BCP - BCP OUT for a whole table, or BCP QUERYOUT with a query to select the 10k rows you want exported
Copy the file to the destination server
Import the data using BCP on the destination database - BCP IN.
My suggestion would be to export these rows to Excel (you can do this by copying and pasting your query output), transfer the file to the other server and import it there.
This is the official method:
https://learn.microsoft.com/en-us/sql/relational-databases/import-export/import-data-from-excel-to-sql
And this is the unofficial method:
http://therealdanvega.com/blog/2010/08/04/create-sql-insert-statements-from-a-spreadsheet.
Here I have assumed that you only need to transfer the transactional data and that your reference data is the same on both servers, so you will need to execute only one query to export your data.
I would definitely go down the SSIS route. Once you use SSIS for a task like this you will not use anything else; it is very simple to set up. You can use any version, and it will be a simple and very quick job.
Open a new SSIS project in whichever Visual Studio version you have available. There are many different versions, but even the 2008 version will do for this simple task. You may have to install Integration Services or something similar; in 2008 it was called BIDS (Business Intelligence Development Studio). Anything up to 2015 works, although support is nearly there in 2017.
Add a Data Flow Task.
Double-click the Data Flow Task.
At the bottom of the screen, add two connection managers (one to the source database and one to the destination database).
Add an OLE DB Source pointing to the source database table.
Add an OLE DB Destination pointing to the destination database table.
Drag the line between the source and the destination (it should auto-map all columns that have the same name).
Hit Start and the data will flow very quickly.
You can create a DbInstaller. Using a DbInstaller you can share the whole database. DbInstaller works with both ADO.NET and Entity Framework; I have used it with Entity Framework.
You can do it with a SQL Server query.
First, select the first database:
USE database1; -- this will be your first database
After selecting the first database, put the table rows into a temp table with this query:
SELECT * INTO #Temp FROM table1;
Now select the second database and insert the temp table data into the second database's table with this code:
USE secondDatabaseName;
INSERT INTO tableNameintoinsert (col1, col2)
SELECT col1, col2 FROM #Temp;
I have a database1 which has more than 500 tables, and a database2 which has the same number of tables; the table names are the same in both databases. Some of the tables have different table definitions; for example, the table reports in database1 has 9 columns and the table reports in database2 has 10.
I want to copy all the data from database1 to database2; it should overwrite the same data and append the columns if the structure does not match. I have tried the Import and Export Wizard in SQL Server 2008, but it gives an error at the last step, when copying rows. I don't have a screenshot of the error right now, as it is on my office PC. Sometimes it says "error inserting into the read-only column xyz", and sometimes it mentions VS_ISBROKEN. For the read-only column error I enabled identity insert, but it did not help.
Please help me. It is an opportunity in my office for me.
SSIS and SQL Server 2008 Wizards can be finicky tools.
If you get a "can't insert into column ABC", then it could be one of the following:
Inserting into a PK column -> when setting up the mappings, you need to indicate to overwrite the value
Inserting into a column with a smaller range -> for example from nvarchar(256) into nvarchar(50)
Inserting into a calculated column (pointed out by @Nick.McDermaid)
You could also get issues with referential integrity if your database uses this (most do).
If you're going to do this more often, then I suggest you build an SSIS package instead of using the wizard tooling. This way you will see warnings on all sorts of issues like the ones I've described above. You can then run your package on demand.
Another suggestion I would make, is that you insert DB1 into "stage" tables in DB2. These tables should have no relational integrity and will allow you to break the process into several steps as follows.
Stage the data from DB1 into DB2
Produce reports/queries on issues pertinent to your database/rules
Merge the data from stage tables into target tables using SQL
That last step is where you can use merge statements, or simple insert/updates depending on a key match. Using SQL here in the local database is then able to use set theory to manage the overlap of the two sets and figure out what is new or to be updated.
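The "reports/queries on issues" step could be sketched as follows. The stage schema, the key column ReportId and the two checks themselves are assumptions chosen for illustration; only the table name reports comes from the question:

```sql
-- Hypothetical names: stage.Reports (stage table), dbo.Reports (target), ReportId (key).

-- 1. Keys that occur more than once in the stage table. Duplicates like these
--    can make a later MERGE fail when they match an existing target row.
SELECT ReportId, COUNT(*) AS Copies
FROM   stage.Reports
GROUP  BY ReportId
HAVING COUNT(*) > 1;

-- 2. How many staged rows are new and how many already exist in the target.
SELECT SUM(CASE WHEN t.ReportId IS NULL THEN 1 ELSE 0 END)     AS NewRows,
       SUM(CASE WHEN t.ReportId IS NOT NULL THEN 1 ELSE 0 END) AS ExistingRows
FROM   stage.Reports AS s
       LEFT JOIN dbo.Reports AS t
              ON t.ReportId = s.ReportId;
```

Running checks like these before the merge gives you numbers to report on at each stage.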
SSIS "can" do this, but you will not be able to do a bulk update using SSIS, whereas with SQL you can. SSIS would do what is known as RBAR (row by agonizing row), something slow and to be avoided.
I suggest you inform your seniors that this will take a little longer to ensure it is reliable and the results reportable. Then work step by step, reporting on each stage's completion.
Another two small suggestions:
Create _Archive tables of each of the stage tables and add a Tstamp column to each. Merge into these after the stage step, which will allow you to quickly see when each row was introduced into DB2.
After the stage step and before the SQL merge step, create indexes on your stage tables. This will improve the merge performance.
Drop those indexes after each merge; this will improve the bulk insert performance of the next stage load.
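A minimal sketch of those suggestions, with hypothetical object names (stage.Reports, stage.Reports_Archive, ReportId, Title) and a plain INSERT standing in for the merge into the archive:

```sql
-- Hypothetical names: stage.Reports, stage.Reports_Archive, ReportId, Title.

-- Copy the freshly staged rows into the archive with a load timestamp.
INSERT INTO stage.Reports_Archive (ReportId, Title, Tstamp)
SELECT ReportId, Title, GETDATE()
FROM   stage.Reports;

-- Index the stage table on the join key before the merge step.
CREATE INDEX IX_stage_Reports_ReportId ON stage.Reports (ReportId);

-- ... run the merge from stage.Reports into the target table here ...

-- Drop the index again so the next bulk load into the stage table is not slowed down.
DROP INDEX IX_stage_Reports_ReportId ON stage.Reports;
```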
Basics of staging (response to question clarification):
Links:
http://www.codeproject.com/Articles/173918/How-to-Create-your-First-SQL-Server-Integration-Se
http://www.jasonstrate.com/tag/31daysssis/
http://blogs.msdn.com/b/andreasderuiter/archive/2012/12/05/designing-an-etl-process-with-ssis-two-approaches-to-extracting-and-transforming-data.aspx
Staging is the act of moving data from one place to another without any checks.
First you need to create the target tables, the schema should match the source tables.
Open up BIDS and create a new Project and in it a new SSIS package.
In the package, create a connection for the source server and another for the destination.
Then create a data flow step, in the step create a data source for each table you want to copy from.
Connect each source to a new data destination and set the appropriate connection and table.
When done, save and do a test run.
Before the data flow step, you might like to add a SQL step that will truncate all the target tables.
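The truncate step could be as simple as the following, assuming hypothetical stage table names and assuming the stage tables carry no foreign keys (TRUNCATE TABLE is refused on a table referenced by a foreign key):

```sql
-- Hypothetical stage table names; list every table the data flow loads.
TRUNCATE TABLE stage.Reports;
TRUNCATE TABLE stage.Customers;
```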
If you're open to using tools, then what about something like Red Gate SQL Compare and Red Gate SQL Data Compare?
First I would use SQL Compare to manage the schema differences and add the new columns you want to your destination database (database2) from the source (database1). Then with SQL Data Compare you match the contents of the tables; for any columns it can't match by name, you specify how to handle them. Then you can pick and choose what data you want to copy to your destination. You'll see what data is new and what's different (you can delete data in the destination that's not in the source, or ignore it). You can either have the tool do the work or have it create a script for you to run when you want.
There's a 15 day trial if you want to experiment.
Seems like maybe you are looking for Replication technology as is offered by SQL Server Replication.
Well, if I understood your requirement correctly, you need to make database2 a replica of database1. Why not take a full backup of database1 and restore it as database2? Your database2 will be exactly what database1 is at the time of backup.
I'm trying to merge 2 tables from 2 databases on 2 different servers.
For now, I create a linked server on one of the servers and I use a query like this:
MERGE INTO tablename1 as T1
using linkedservername.dbname.tablename2 as T2 ON
WHEN MATCHED THEN
UPDATE SET ...
WHEN NOT MATCHED THEN
INSERT ...
I would like to know if there is a way to do that without creating a linked server.
There are three general ways to do this in SSIS. But there is a lot more information if you check online.
Either way, you first need to create a connection manager in SSIS that points directly at the remote server (the one you currently reach through the linked server). Start with that.
Then create a Data Flow Task where you select from dbname.tablename2 in a data flow source.
Then you can do it a few ways:
A. Staging Table
Dump that result into a staging table then run your merge statement locally in a subsequent SQL Task. This is usually the quickest (and simplest) way unless you aren't allowed to create tables/data in the target.
B. Lookup
Use a Lookup in your data flow to identify whether the record exists or not, followed by an OLE DB Destination (for inserts) or an OLE DB Command (for updates).
This is generally slow because both the lookup and update are inefficient.
C. Row-level merge
Feed the result into an OLE DB Command, and put your MERGE directly in there.
This is probably the slowest.
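For option C, the statement inside the OLE DB Command might look roughly like this. The column names Id and Col1 and their data types are assumptions. Each ? is a parameter that the OLE DB Command maps to an input column, and the CASTs are there because the parameter types may otherwise not be derivable:

```sql
-- Hypothetical columns Id (int) and Col1 (nvarchar(50)) on tablename1.
-- The statement is executed once per incoming row, with the two ? markers
-- bound to that row's columns in order.
MERGE INTO tablename1 AS T1
USING (SELECT CAST(? AS int)          AS Id,
              CAST(? AS nvarchar(50)) AS Col1) AS T2
      ON T1.Id = T2.Id
WHEN MATCHED THEN
    UPDATE SET T1.Col1 = T2.Col1
WHEN NOT MATCHED THEN
    INSERT (Id, Col1)
    VALUES (T2.Id, T2.Col1);
```

Since this runs one MERGE per row, it shows why the staging-table approach in A is usually much faster.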
If you want more info, get your connection manager sorted and post back.
I am just trying to find out whether this is the right way to do this task.
Any other suggestions to improve this is greatly appreciated.
I have the following on my SSIS package.
Data Flow Task - I established an OLE DB connection to the source database where the view is.
Execute SQL Task - I am executing a query with an INSERT INTO Destination ... EXCEPT (all the records from the source that are already there).
Send Mail Task - sends out an email.
How do I know whether the data transfer was successful, so that I can use the Send Mail Task to indicate success or failure?
How do I schedule this package so that it runs automatically (every Tuesday)?
I have tried the suggestion below. Please refer to the new Data Flow task.
OLE DB Source - points to a view on database server 1.
Lookup - gets all the rows from the OLE DB Source (the row counts on the source and on the Lookup match).
On the Lookup, I have configured the error output to use 'Redirect row' on all the mapped columns.
OLE DB Destination - points to the destination table, which already has a subset of the records from the source, so I configured the error output to get the unmatched rows for insert.
When I execute the package, I get a primary key constraint error: Cannot insert duplicate key.
Any suggestions?
You will want to double-click the connector from the Execute SQL Task to the Send Mail Task. Currently it's green, which indicates it will only take that path on Success. You will want to update the constraint to be on Completion, as you don't care whether it's Success or Failure.
It sounds like you have your data flow pulling all of the data from your source and writing to a staging table. In your Execute SQL Task, you then use a query to add data into your target table where it doesn't exist.
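The "add data where it doesn't exist" query could be sketched as below, assuming hypothetical names (dbo.Staging for the table the data flow loads, dbo.Destination for the target, key column Id):

```sql
-- Hypothetical names: dbo.Staging, dbo.Destination, Id, Col1, Col2.
-- Insert only the staged rows whose key is not yet present in the target.
INSERT INTO dbo.Destination (Id, Col1, Col2)
SELECT s.Id, s.Col1, s.Col2
FROM   dbo.Staging AS s
WHERE  NOT EXISTS (SELECT 1
                   FROM   dbo.Destination AS d
                   WHERE  d.Id = s.Id);
```

Matching on the key alone differs from EXCEPT, which compares whole rows: with EXCEPT, a source row whose key already exists but whose other values have changed still comes through and can raise a duplicate key error.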
This can be consolidated into a single Data Flow. Between your OLE DB Source and OLE DB Destination, add a Lookup task. Since you are on 2005, the Lookup behaves a bit differently than 2008+. You will write a query that pulls back the business keys in your target table and then compares that to what is coming from your OLE DB Source. Map those keys in the interface.
You only want the rows that aren't matched so you will need to get the "unmatched records" from the lookup. In 2005, the option for Unmatched output didn't exist so you will need to route the Error output to your OLE DB Destination.
Andy Leonard has a nice little writeup on how to accomplish this: Configuration an SSIS 2005 Lookup Transformation for a Left Outer Join. The only difference for your case is that you don't care about the matched rows. Instead of Ignore Failure, you want to select Redirect Row. Then when you go to connect the Lookup to the OLE DB Destination, you will be presented with two options. The green connector is the matched rows; the red connector is the unmatched rows. Tie the red path to your destination.
I am currently using a database that is poorly designed and sits behind a slow pipeline, so I decided to copy a small portion of the database (15 tables) and only bring over part of those tables; for example, I want to bring only the rows that have a certain id.
But this is not a one-time move: everything that is added to the old database needs to be added to the new one on an hourly basis. My research has led me to SSIS and suggests it may have a way of accomplishing this, but I have found no clear examples of how it is done, if in fact it is possible. Thanks in advance.
Yes, it is possible. You can schedule your SSIS package through SQL Server Agent to run on an hourly basis.
For each table, you can drag a Data Flow Task onto the control flow. Inside the DFT, you need to place an OLE DB Source, a Lookup, a Data Conversion (if the types differ between the source and target tables) and an OLE DB Destination.
OLE DB Source: Create a variable of type String and, in its expression, write the SQL query that fetches the data based on the ID. Then use this variable in the source component.
Lookup: You need to select your source table and combine the primary key columns from the source and destination tables. It acts like an inner join query. After combining the primary keys from both tables, select the columns you need from the source.
OLE DB Destination: Simply select your target table and map the columns from the Lookup No Match Output. If you need to update values from the source, then connect the Lookup Match Output to an OLE DB Command (an Execute SQL Task cannot be placed inside a data flow) and write the update query there.
Please go through the link and the SO question below.
Scheduling of SSIS package
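As an illustration of the SQL Server Agent scheduling mentioned above, here is a sketch that creates an hourly job with the msdb stored procedures. The job name, schedule name and package path are hypothetical, and it assumes the package is stored as a file that the Agent service account can read:

```sql
-- Hypothetical names and path: job 'Hourly table copy', C:\SSIS\CopyTables.dtsx.
USE msdb;

EXEC dbo.sp_add_job
     @job_name = N'Hourly table copy';

-- One job step that runs the package through the SSIS subsystem.
EXEC dbo.sp_add_jobstep
     @job_name  = N'Hourly table copy',
     @step_name = N'Run SSIS package',
     @subsystem = N'SSIS',
     @command   = N'/FILE "C:\SSIS\CopyTables.dtsx"';

-- Schedule: every day, repeating every 1 hour.
EXEC dbo.sp_add_schedule
     @schedule_name        = N'Every hour',
     @freq_type            = 4,  -- daily
     @freq_interval        = 1,  -- every 1 day
     @freq_subday_type     = 8,  -- sub-day unit: hours
     @freq_subday_interval = 1;  -- every 1 hour

EXEC dbo.sp_attach_schedule
     @job_name      = N'Hourly table copy',
     @schedule_name = N'Every hour';

-- Register the job on the local server so the Agent picks it up.
EXEC dbo.sp_add_jobserver
     @job_name = N'Hourly table copy';
```

For a weekly run on Tuesdays instead, the schedule would use @freq_type = 8, @freq_interval = 4 and @freq_recurrence_factor = 1. The same job can also be created through the SQL Server Agent node in Management Studio.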