How to merge two databases on two different servers?

I have two different databases, the client one is attached from a .MDF file to a .\SQLEXPRESS server. The master one is running on a server on another computer called COMPUTER_NAME.
I want to merge these using C# to run a .SQL file. I'll paste my code below for reference, but basically my problem is that if I connect to the server using
string sqlConnectionString = @"Server=.\SQLEXPRESS; Trusted_Connection=True";
Then I can't find the database on COMPUTER_NAME. And if I use
string sqlConnectionString = @"Server=COMPUTER_NAME; Trusted_Connection=True";
It will look for my .MDF file on the C: drive of COMPUTER_NAME, not the local machine.
How can I connect to both of these databases on different servers?
Additional info:
The SQL script I'm using. This worked perfectly back when both the databases were on the same server, but I can't do that anymore.
CREATE DATABASE ClientDB
ON (Filename = 'C:\Clayton.mdf')
, (Filename = 'C:\Clayton_log.ldf')
FOR ATTACH;
-- update the client from the master
MERGE [ClientDB].[dbo].[table] trgt
USING [MasterDB].[dbo].[table] src
ON trgt.id = src.id
WHEN MATCHED AND trgt.lastmodified <= src.lastmodified THEN -- if the master row is newer
UPDATE SET trgt.[info] = src.[info], ... -- update the client
WHEN NOT MATCHED BY SOURCE -- delete rows added by the client
THEN DELETE
WHEN NOT MATCHED BY TARGET -- insert rows added by the master
THEN INSERT ( [info], ... ) VALUES (src.[info], ... );
-- close all connections to database
ALTER DATABASE ClientDB SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
-- detach database
EXEC sp_detach_db 'ClientDB', 'true';
And I run it using C# like so:
string sqlConnectionString = @"Server=.\SQLEXPRESS; Trusted_Connection=True";
string script = File.ReadAllText(Environment.CurrentDirectory + @"\MergeTotal.sql");
// Split on GO separators: GO is a batch separator for client tools,
// so SqlCommand cannot execute it directly.
IEnumerable<string> commandStrings = Regex.Split(script, @"^\s*GO\s*$",
    RegexOptions.Multiline | RegexOptions.IgnoreCase);
using (SqlConnection conn = new SqlConnection(sqlConnectionString))
{
    conn.Open();
    foreach (string commandString in commandStrings)
    {
        if (commandString.Trim() != "")
        {
            using (var command = new SqlCommand(commandString, conn))
            {
                command.ExecuteNonQuery();
            }
        }
    }
}
I don't care if the entire process happens in the .SQL or in C# so long as it has the desired effect.
Thanks in advance for any guidance or recommendations.

Linking the servers would let you access both sets of data simultaneously, if that's the requirement. If you're looking to merge data together, though, I'd suggest you check out sp_generate_merge to pull the data into a merge script for you (very handy for moving data). See also my question on generating merge data here.
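For the linked-server route, a minimal sketch (assuming the master is the default instance on COMPUTER_NAME, and using the MasterDB names from the question):
-- Register the remote machine as a linked server; with @srvproduct = 'SQL Server'
-- the linked server name must be the remote instance's network name.
EXEC sp_addlinkedserver @server = N'COMPUTER_NAME', @srvproduct = N'SQL Server';
-- Sanity check: the remote table is now reachable by its four-part name.
SELECT TOP (10) * FROM [COMPUTER_NAME].[MasterDB].[dbo].[table];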

Okay, I had to completely throw out the whole .MDF thing. Instead of attaching and reattaching the database from an .MDF, I just set up the database.
Here's my code to initialize the local database on the tablet:
CREATE DATABASE LocalClaytonDB
ON (Filename = 'C:\ProgramData\Clayton\Clayton.mdf')
, (Filename = 'C:\ProgramData\Clayton\Clayton_log.ldf')
FOR ATTACH;
GO
EXEC sp_addlinkedserver @server = 'Server'
Here's my code to synchronize the two databases:
-- update the client from the master
MERGE [LocalClaytonDB].[dbo].[tableName] trgt
USING [Server].[Clayton].[dbo].[tableName] src
ON trgt.id = src.id
WHEN MATCHED AND trgt.lastmodified <= src.lastmodified THEN
-- if the master has a row newer than the client,
-- update the client
UPDATE SET trgt.[allColumns] = src.[allColumns],
trgt.[id] = src.[id],
trgt.[lastmodified] = src.[lastmodified]
-- delete any rows added by a client
WHEN NOT MATCHED BY SOURCE
THEN
DELETE
-- insert any rows added by the master
WHEN NOT MATCHED BY TARGET
THEN
INSERT ( [allColumns],
[id],
[lastmodified])
VALUES (src.[allColumns],
src.[id],
src.[lastmodified]);
-- now we update the master from the client
-- Note:
-- because the serverDB is a linked server
-- we can't use another MERGE statement, otherwise
-- we get the error: "The target of a MERGE statement
-- cannot be a remote table, a remote view, or a view over remote tables."
UPDATE
serverDB
SET
[allColumns] = [localDB].[allColumns],
[id] = [localDB].[id],
[lastmodified] = [localDB].[lastmodified]
FROM
[Server].[Clayton].[dbo].[tableName] serverDB
INNER JOIN
[LocalClaytonDB].[dbo].[tableName] localDB
-- update where the id is the same but the client is newer than the master
ON serverDB.id = localDB.id
AND localDB.lastmodified >= serverDB.lastmodified

Related

SSIS package insert data from CSV file to a staging table and then move from a staging table to a main table with a Site Column

I have a CSV file that successfully moves from the source file to a staging table in a Data Flow Task inside a sequence container. There will be a few sequences like this. I then need to move this data from the staging table to the main table, which contains an extra column (Site). I am using an Execute SQL Task to move from staging to the main table. When I run this, the data reaches my staging table but never hits my main table.
Here is my code in my Execute SQL Task:
USE ENSTRDW
UPDATE
AxonOrders
SET
AxonOrders.OrderNbr = AxonOrdersExtractCleanCreated.OrderNbr
,AxonOrders.OrderStatus = AxonOrdersExtractCleanCreated.OrderStatus
,AxonOrders.OrderEndDate = AxonOrdersExtractCleanCreated.OrderEndDate
,AxonOrders.InvoiceDate = AxonOrdersExtractCleanCreated.InvoiceDate
,AxonOrders.OrderDate = AxonOrdersExtractCleanCreated.OrderDate
,AxonOrders.RevenuePerMile = AxonOrdersExtractCleanCreated.RevenuePerMile
,AxonOrders.ReadyToInvoice = AxonOrdersExtractCleanCreated.ReadyToInvoice
,AxonOrders.OrderCommodity = AxonOrdersExtractCleanCreated.OrderCommodity
,AxonOrders.OrderTractors = AxonOrdersExtractCleanCreated.OrderTractors
,AxonOrders.BillableMileage = AxonOrdersExtractCleanCreated.BillableMileage
,AxonOrders.Site = 'GT'
,AxonOrders.LastModified = AxonOrdersExtractCleanCreated.LastModified
,AxonOrders.VoidedOn = AxonOrdersExtractCleanCreated.VoidedOn
,AxonOrders.OrderDateTimeEntered = AxonOrdersExtractCleanCreated.OrderDateTimeEntered
FROM
AxonOrdersExtractCleanCreated
Why are you using an UPDATE command to INSERT data?!
You should use an INSERT INTO command rather than UPDATE:
USE ENSTRDW;
INSERT INTO [AxonOrders](OrderNbr,OrderStatus,OrderEndDate,InvoiceDate,OrderDate,RevenuePerMile,ReadyToInvoice,
OrderCommodity,OrderTractors,BillableMileage,Site,LastModified,VoidedOn,OrderDateTimeEntered)
SELECT
OrderNbr,OrderStatus,OrderEndDate,InvoiceDate,OrderDate,RevenuePerMile,ReadyToInvoice,
OrderCommodity,OrderTractors,BillableMileage,'GT',LastModified,VoidedOn,OrderDateTimeEntered
FROM
AxonOrdersExtractCleanCreated

SSIS is hanging during an Update with 3 million rows

I'm implementing a new method for a warehouse. The new method consists of performing incremental loading between source and destination tables (insert, update, or delete).
All the tables are working really well, except for one table whose source has more than 3 million rows; as you will see in the image below, it just starts running but never finishes.
Probably I'm not doing the update the correct way, or there is another way to do it.
Here are some pictures of my SSIS package:
The highlighted object is where it hangs.
This is the stored procedure I call to update the table:
ALTER PROCEDURE [dbo].[UpdateDim_A]
    @ID INT,
    @FileDataID INT,
    @CategoryID SMALLINT,
    @FirstName VARCHAR(50),
    @LastName VARCHAR(50),
    @Company VARCHAR(100),
    @Email VARCHAR(250)
AS
BEGIN
    SET NOCOUNT ON;
    BEGIN TRAN
    UPDATE DIM_A
    SET
        [FileDataID] = @FileDataID,
        [CategoryID] = @CategoryID,
        [FirstName] = @FirstName,
        [LastName] = @LastName,
        [Company] = @Company,
        [Email] = @Email
    WHERE PartyID = @ID
    COMMIT TRAN;
END
Note:
I already tried dropping the constraints and indexes and changing the recovery mode of the database to simple.
Any help will be appreciated.
After applying the solution provided by @Prabhat G, this is how my package looks, running in 39 seconds (avg)!!!
Inside Dim_A DataFlow
Follow these 2 performance enhancers and you'll avoid your bottleneck.
Remove the Sort transformation. In your source, fetch the data with an ORDER BY in the SQL instead. The reason: Sort pulls all the records into memory before sorting, and you don't want that, be it an incremental or a full load.
In the last step of the update, introduce another staging table, a replica of the Dim table, instead of the row-level OLE DB Command. Once all the matching records are inserted into this new staging table, exit the Data Flow Task and create an Execute SQL Task which will simply UPDATE the Dim table based on the join IDs/conditions.
The reason for this is that the OLE DB Command hits row by row. Always prefer updating via an Execute SQL Task, as it's a batch process.
Edit:
As per the comments, to update only the changed rows in the Execute SQL Task, add the conditions to the WHERE clause, e.g.:
UPDATE x
SET
x.attribute_A = y.attribute_A
,x.attribute_B = y.attribute_B
FROM
DimA x
inner join stg_DimA y
ON x.Id = y.Id
WHERE
(x.Attribute_A <> y.Attribute_A
OR x.Attribute_B <> y.Attribute_B)
So your problem is actually very simple: the method you are using executes that stored procedure for every row returned. If you have 9,961 rows to update (as in your picture), it will run that statement 9,961 separate times. Chances are, if you look at the active queries running on SQL Server, you'll see that procedure executing over and over.
What you should do to speed this up is dump that data into a staging table, then use an Execute SQL Task further along in your package to run a standard set-based SQL UPDATE. This will run much faster.
The problem is that you are trying to execute a stored procedure within the data flow. The correct SqlCommand is an explicit UPDATE query; you then map the columns from SSIS to the columns of the table you are updating.
UPDATE DIM_A
SET FileDataID = ?
,CategoryID = ?
,FirstName = ?
,LastName = ?
,Company = ?
,Email = ?
WHERE PartyID = ?
Note: The @Id needs to be included as a column in your data flow.
One final thing you should consider, as Zane correctly pointed out: you should only update rows that have changed. So, in your data flow you should add a Conditional Split transformation that checks whether any of the columns in the new source row differ from the existing table row. Only rows that are different should be sent to the OLE DB Command - the rest can be disregarded.
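If you take the staging-table route instead, the same changed-row check can be done NULL-safely with EXCEPT in the Execute SQL Task. A sketch reusing the DIM_A columns from the question (stg_DIM_A is a hypothetical staging replica; unlike <>, EXCEPT treats two NULLs as equal):
UPDATE d
SET d.FileDataID = s.FileDataID,
    d.CategoryID = s.CategoryID,
    d.FirstName  = s.FirstName,
    d.LastName   = s.LastName,
    d.Company    = s.Company,
    d.Email      = s.Email
FROM DIM_A d
INNER JOIN stg_DIM_A s ON s.PartyID = d.PartyID
-- only rows where at least one column actually differs:
WHERE EXISTS (SELECT s.FileDataID, s.CategoryID, s.FirstName, s.LastName, s.Company, s.Email
              EXCEPT
              SELECT d.FileDataID, d.CategoryID, d.FirstName, d.LastName, d.Company, d.Email);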

Export SQL Server data from one database to another

I have a SQL Server database for a web application. My requirement is to read data from 2-3 tables in one source database and insert that data into the destination database. My input will be a Name and an ID; based on that, I have to read data from the source database and validate whether a similar Name already exists in the destination database. I have to do this via a C# Windows application or a web application.
So far in my research, people have recommended using SqlBulkCopy or an SSIS package, and I tried to transfer one table's data using the following code.
using (SqlConnection connSource = new SqlConnection(csSource))
using (SqlCommand cmd = connSource.CreateCommand())
using (SqlBulkCopy bcp = new SqlBulkCopy(csDest))
{
bcp.DestinationTableName = "SomeTable";
cmd.CommandText = "myproc";
cmd.CommandType = CommandType.StoredProcedure;
connSource.Open();
using (SqlDataReader reader = cmd.ExecuteReader())
{
bcp.WriteToServer(reader);
}
}
The problem I'm facing is that I have to copy data for two tables. Table 2 depends on values from table 1, such as the ID (primary key), which changes on insert into the destination database. So I have to insert the first table's data, fetch the new ID, update that ID in my reader object, and then do another SqlBulkCopy. This doesn't look like the ideal way to do it; is there any other way?
On the source SQL instance I would create a linked server referencing the destination/target SQL instance, and then I would create a stored procedure within the source database like this:
USE SourceDatabase
GO
CREATE PROCEDURE dbo.ExportSomething
@param1 DataType1,
@param2 DataType2, ...
AS
BEGIN
INSERT LinkedServer.DestinationDatabase.dbo.TargetTable
SELECT ...
FROM dbo.Table1 INNER JOIN dbo.Table2 ....
WHERE Col1 = @param1 AND/OR ...
END
GO
Then, the final step is to call this stored procedure from the client application.
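The call itself is then just an EXEC with matching parameters (a sketch; the names mirror the template above, and the argument values are placeholders for whatever DataType1/DataType2 turn out to be):
-- From C#, this maps to a SqlCommand with CommandType.StoredProcedure
-- and one SqlParameter per @param.
EXEC SourceDatabase.dbo.ExportSomething @param1 = 'some name', @param2 = 'some id';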

How can I check if a SQL Server database engine is updating?

Context: I have a SQL Server database engine called DB4, and it updates all its databases from another database engine called DB5 through SQL Server Agent every 5 hours. I don't have access to DB5, but I have been told DB5 also updates itself from somewhere else.
Problem: The problem is that sometimes the two database engines update their databases simultaneously, so DB4 cannot update completely.
Question: Is there any way I can detect whether DB5 is updating? Then I could write the SQL Server Agent jobs so that if DB5 is not updating, DB4 updates; otherwise, they do nothing.
PS: DB4's update is processed by many Agent jobs. Somebody wrote many scripts in the jobs. Basically, the scripts follow this format:
TRUNCATE TABLE Table_Name
INSERT INTO Table_Name
SELECT field_name1, field_name2 ......
FROM DB5.database_name.dbo.table_Name
By DB4 and DB5 you mean servers, not databases, so the names here are confusing.
Nevertheless, if you have access to DB4 and DB4 is selecting from DB5, that means DB5 is a linked server registered in DB4 with a user who has query permission on DB5's databases and MAY have insert/update/delete/create-object permissions. If so, then you can create a table in DB5.database_name as follows:
-- Note: CREATE TABLE cannot go through a four-part name, so run the DDL at the
-- linked server (requires RPC Out enabled on the link), then seed the flag row:
EXEC ('CREATE TABLE database_name.dbo.Table_Flag (Uflag bit NOT NULL)') AT DB5
GO
INSERT INTO DB5.database_name.dbo.Table_Flag (Uflag) VALUES (0)
Then you can create a trigger on the table being updated in the DB5 database, which will set Uflag to 1 whenever rows are inserted/updated/deleted.
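A minimal sketch of such a trigger, created in the DB5 database itself (the trigger name is made up; dbo.table_Name is the watched table):
CREATE TRIGGER dbo.trg_table_Name_SetUflag
ON dbo.table_Name
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;
    -- Any change to the watched table raises the flag.
    UPDATE dbo.Table_Flag SET Uflag = 1;
END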
Then you can modify the job in DB4 as:
DECLARE @count int
SET @count = (SELECT COUNT(*) FROM DB5.database_name.dbo.Table_Flag WHERE Uflag = 1)
IF (@count > 0) -- the flag table holds a single row, so check for > 0, not > 1
BEGIN
TRUNCATE TABLE Table_Name
INSERT INTO Table_Name
SELECT field_name1, field_name2 ......
FROM DB5.database_name.dbo.table_Name
UPDATE DB5.database_name.dbo.Table_Flag SET Uflag = 0
END

SQL Server: Remote Table insert fails

Here is my setup: within my schema I have a stored procedure (shown below) which inserts a dummy row (for testing) into a remote table. The table MY_REMOTE_TABLE is a synonym pointing to the correct remote table.
I can successfully query it like
SELECT * FROM MY_REMOTE_TABLE;
on my local system. This does work.
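For context, such a synonym would have been created along these lines (a sketch; the linked server, database, and table names are placeholders):
CREATE SYNONYM dbo.MY_REMOTE_TABLE
FOR LINKED_SERVER.RemoteDB.dbo.Real_Remote_Table;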
So for testing I created the procedure with a dummy test value in it, which should be inserted into the remote table if the row is new to the remote table. Hint: the remote table is empty at the time of insertion, so it really should perform the insert.
But it always fails, giving me a return value of -6. I have no clue what -6 stands for or even what the error could be. All I have is -6, and I know that nothing will be inserted into the remote table. Interestingly, when I copy the insert statement to the remote server, replace the synonym with the real table name on the remote machine, and execute it, it all works fine.
So I'm really lost here and seeking your help!
CREATE PROCEDURE [dbo].[my_Procedure]
AS
BEGIN
SET NOCOUNT ON;
BEGIN TRY
BEGIN DISTRIBUTED TRANSACTION
-- LEFT JOIN AND WHERE NULL WILL ONLY FIND NEW RECORDS -> THEREFORE INSERT INTO TARGET REMOTE TABLE
INSERT INTO MY_REMOTE_TABLE
(--id is not needed because it's an IDENTITY column
user_id,
customer_id,
my_value, year,
Import_Date, Import_By, Change_Date, Change_By)
SELECT
Source.user_id,
Source.customer_id,
Source.my_value,
Source.year,
Source.Import_Date,
Source.Import_By,
Source.Change_Date,
Source.Change_By
FROM
(SELECT
null as id,
126616 as user_id,
17 as customer_id,
0 as my_value,
2012 as year,
GETDATE() AS Import_Date,
'test' AS Import_By,
GETDATE() AS Change_Date,
'test' AS Change_By) AS Source
LEFT JOIN
MY_REMOTE_TABLE AS Target ON Target.id = Source.id
AND Target.user_id = Source.user_id
AND Target.customer_id = Source.customer_id
AND Target.year = Source.year
WHERE
Target.id IS NULL; -- BECAUSE OF LEFT JOIN NEW RECORDS WILL BE NULL, SO WE ONLY SELECT THEM FOR THE INSERT !!!
IF (@@TRANCOUNT > 0 AND XACT_STATE() = 1)
BEGIN
COMMIT TRANSACTION
END
END TRY
BEGIN CATCH
IF (@@TRANCOUNT > 0 AND XACT_STATE() = -1)
ROLLBACK TRANSACTION
END CATCH;
END
Another question related to this one: if my insert violated an FK constraint on the remote table, how could I propagate the error message from the remote DB server to my local procedure so I can capture it?
Look here: http://msdn.microsoft.com/de-de/library/ms188792.aspx
Short version:
XACT_ABORT must be set ON for data modification statements in an
implicit or explicit transaction against most OLE DB providers,
including SQL Server. The only case where this option is not required
is if the provider supports nested transactions.
So insert a SET XACT_ABORT ON at the start of the stored procedure.
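Applied to the procedure above, a sketch (the CATCH block additionally re-raises the remote error text, which also addresses the FK-constraint follow-up; re-raising with RAISERROR is one option, not the only one):
CREATE PROCEDURE [dbo].[my_Procedure]
AS
BEGIN
    SET NOCOUNT ON;
    SET XACT_ABORT ON; -- required for data modifications against most OLE DB providers
    BEGIN TRY
        BEGIN DISTRIBUTED TRANSACTION
        -- ... the INSERT INTO MY_REMOTE_TABLE ... SELECT ... from above ...
        IF (@@TRANCOUNT > 0 AND XACT_STATE() = 1)
            COMMIT TRANSACTION
    END TRY
    BEGIN CATCH
        IF (@@TRANCOUNT > 0 AND XACT_STATE() = -1)
            ROLLBACK TRANSACTION;
        -- Surface the remote error (e.g. an FK violation) to the caller.
        DECLARE @msg nvarchar(2048) = ERROR_MESSAGE();
        RAISERROR(@msg, 16, 1);
    END CATCH;
END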