SSIS: Migrating data to Azure from multiple sources

The scenario is this: we have an application that is deployed to a number of locations. Each deployment uses a local instance of SQL Server (2016) with exactly the same DB schema.
The reason for local-instance DBs is that the servers on which the application is deployed will not have internet access most of the time.
We are now considering keeping the same solution but adding an SSIS package that can be executed at a later time, when the server is connected to the internet.
For now, let's assume that once the package is executed, no further DB changes will be made to the local instance.
All tables (except many-to-many intermediary tables) have an INT IDENTITY primary key.
What I need is for the table PKs to be auto-generated on the Azure DB, which I'm currently achieving by setting the destination mapping for the PK column to ignore. However, I would also need all FKs pointing to that PK to receive the newly generated ID instead of pointing to the original ID.
Since data would be coming from multiple deployments, I want to keep all data as new entries, without updating or deleting existing records.
Could someone kindly explain or link me to some resource that handles this situation?
[ For future reference, I'm considering using UNIQUEIDENTIFIER instead of INT, but this is what we have at the moment... ]
Edit: Added example
So, for instance, one of the tables would be Events. Each DB deployment will have at least one Event, with Ids starting from 1. When consolidating the data into the Azure DB, I'd like the original Id to be ignored and an auto-generated Id assigned by the Azure DB instead. That part is OK. But then I'd need all FKs pointing to EventId to point to the new Id, so instead of e.g. 1 they'd get the new Id assigned by the Azure DB (e.g. 3).
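A common pattern for this is to stage the source rows together with their original IDs, capture an old-to-new ID map while inserting into the target, and then join through that map when loading the child tables. Below is a minimal T-SQL sketch of the idea; the staging schema and the column names (Name, StartDate, Registrations, AttendeeName) are placeholders, not your actual schema. MERGE with a never-matching join condition is used because a plain INSERT ... OUTPUT cannot reference source columns:

-- Map of original (local) Event IDs to the new Azure-generated IDs
CREATE TABLE #EventIdMap (OldId INT NOT NULL, NewId INT NOT NULL);

-- Insert every staged row and capture old/new ID pairs in one pass
MERGE INTO dbo.Events AS tgt
USING staging.Events AS src
    ON 1 = 0                              -- never matches, so every source row is inserted
WHEN NOT MATCHED THEN
    INSERT (Name, StartDate)
    VALUES (src.Name, src.StartDate)
OUTPUT src.Id, inserted.Id INTO #EventIdMap (OldId, NewId);

-- Load a child table, remapping its FK through the ID map
INSERT INTO dbo.Registrations (EventId, AttendeeName)
SELECT m.NewId, s.AttendeeName
FROM staging.Registrations AS s
JOIN #EventIdMap AS m
    ON m.OldId = s.EventId;

In an SSIS package, the data flow would land the raw rows in the staging tables, and an Execute SQL Task would run statements along these lines per parent/child table pair.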

Related

Azure SQL data sync initial sync not working

I need to set up a continuous sync between two Azure databases on the same SQL server. The database is about 2 TB, with 2000 tables and about 20 million rows.
I cannot set up Azure Data Sync because each time it freezes on "refresh schema" in the Azure portal. I know there is a limit of 500 tables, but the schema takes so long to become visible that I never get the chance to select fewer than the 2000 tables that need to be synced.
Another thing we have tried is to initialize the second database with the tables we want from the first database. Those tables are empty, so we can "refresh schema" on them and set up a sync from member to hub. However, when doing this, the initial sync does not work: the second database remains empty, while in the portal the sync seems to run OK.
Is there another way to set up a sync with such a large database?
Would it help to create a data sync between empty tables, run the initial sync, and then insert all the rows into the empty tables? That way the initial sync would work (because there is no data), and the inserted rows would be synced like any other data appended in the future.
EDIT: According to the following blog (https://azure.microsoft.com/sv-se/blog/sync-sql-data-in-large-scale-using-azure-sql-data-sync/), you should explicitly deny the sync user permissions on the tables you want excluded. I have done this, but I now get an error while trying to retrieve the schema, because there are tables with '.', '[' or ']' in their names. Even though I deny permissions on these tables, Azure still gives an error (it probably runs some query to get the schema, and the tables the user has no access to still show up in the results).
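For reference, the deny step from the blog looks roughly like this; [SyncUser] and [dbo].[HugeTable] are placeholder names for the Data Sync user and a table to exclude:

-- Hide a table from Data Sync's schema refresh by denying the sync user access
DENY SELECT ON OBJECT::[dbo].[HugeTable] TO [SyncUser];
DENY VIEW DEFINITION ON OBJECT::[dbo].[HugeTable] TO [SyncUser];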

SQL Replication error: "The row was not found at the subscriber" but points to a table from another publication

I get the following error in Replication Monitor:
The row was not found at the Subscriber when applying the replicated UPDATE command for Table '[dgv].[POSCustomer]' with Primary Key =
The puzzling part is not the missing row itself, but that the table's schema is dgv.
The publication that generated the error is supposed to replicate only to [ppv].[POSCustomer] and should not even be aware of [dgv].[POSCustomer]. Only rows created AFTER the initial snapshot is delivered are affected.
The background:
I'm setting up transactional replication for 3 on-premises databases PPV, DGV, and PAC to a single Azure SQL database.
The three databases belong to different legal entities, on two separate servers (PPV on one, DGV and PAC on another), and have identical schemas.
Tables with the same names from each db are set up to be replicated.
To differentiate them in the target db, I put them in three different schemas named after their source dbs, i.e. ppv.POSCustomer, dgv.POSCustomer, and pac.POSCustomer.
This is done by changing the setting in Publication properties -> Articles -> Article properties -> Destination object owner.
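(For reference, the same change can be scripted with sp_changearticle; the publication and article names below are placeholders:)

-- Set the destination schema (owner) for a replicated article
EXEC sp_changearticle
    @publication = N'PPV_Publication',
    @article = N'POSCustomer',
    @property = N'destination_owner',
    @value = N'ppv',
    @force_invalidate_snapshot = 1;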
The initial snapshots are delivered without problems; however, after some time, the "row was not found" error started showing up in Replication Monitor.
I tried re-initializing the subscriptions several times, but the error keeps showing up after the snapshot is delivered.
All rows created after the snapshots are delivered are affected.
The databases are totally isolated from each other: there are no cross-database queries, and no stored procedures or triggers that say a record from PPV.dbo.POSCustomer should be updated in DGV.dbo.POSCustomer, so I'm at a loss as to why this error happened.
I used sp_browsereplcmd to trace the command that generated the error, which leads me to:
{CALL [sp_MSupd_dboPOSCustomer] (,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2019-05-14 00:00:00.000,,27280000.0000,10,,,,,,,,,,,,2019-05-14 18:30:04.000,,,,,,,,,,,,,,,,,,,,N'vinhn4-00001395',0x00000000d000080000)}
which I don't understand, and the sp is not part of our POS app.
How can I make this error go away? Manually inserting missing rows will not work, as all new rows are affected. Turning on -SkipErrors is not an option. Replicating to different target databases has been done successfully before, but setting up cross-database queries is such a pain with Azure SQL that I'd prefer to avoid it if possible.
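For anyone retracing the diagnosis above: sp_browsereplcmd is run in the distribution database, roughly like this (the transaction sequence numbers are placeholders taken from the Distribution Agent error details):

-- Inspect the replicated commands pending for a given transaction range
USE distribution;
EXEC sp_browsereplcmd
    @xact_seqno_start = '0x00000025000000B50008',
    @xact_seqno_end   = '0x00000025000000B50008';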

Getting data from a different database on a different server with one SQL Server query

Server1: Prod, hosting DB1
Server2: Dev, hosting DB2
Is there a way to query databases living on 2 different servers with a single SELECT query? I need to bring all the new rows from Prod to Dev, using a query
like the one below. I will be using SQL Server DTS (the import/export data utility) to do this.
-- Four-part names (server.database.schema.table) require linked servers
INSERT INTO Dev.db1.dbo.table1
SELECT p.*
FROM Prod.db1.dbo.table1 AS p
WHERE p.PK NOT IN (SELECT t.PK FROM Dev.db1.dbo.table1 AS t)
Creating a linked server is the only approach that I am aware of for this. If you are simply trying to add all new rows from prod to dev, then why not just create a backup of that one particular table, pull it into the dev environment, and then write the query from within the same server and database?
Granted, this is a one-time approach and a pain for recurring loads, but if it is a one-time thing then I would recommend doing that. Otherwise, create a linked server between the two.
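Creating the linked server can be done through SSMS or with T-SQL, roughly as follows. 'Prod' here is a placeholder; with @srvproduct = 'SQL Server', the @server value must be the remote server's actual network name, and @useself assumes your Windows login is valid on the remote server:

-- Register the remote server as a linked server
EXEC master.dbo.sp_addlinkedserver
    @server = N'Prod',
    @srvproduct = N'SQL Server';

-- Connect using the caller's own credentials
EXEC master.dbo.sp_addlinkedsrvlogin
    @rmtsrvname = N'Prod',
    @useself = N'True';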
To back up a single table, use the SQL Server Import and Export Wizard. Select the prod database as your data source, select only the prod table as your source table, and create a new table in the dev environment as your destination table.
This should get you what you are looking for.
You say you're using DTS; the modern equivalent would be SSIS.
Typically you'd use a data flow task in an SSIS package to pull all the information from the live system into a staging table on the target, then load it from there. This is a pretty standard operation in data warehousing.
There are plenty of approaches to save you copying all the data across each time: use a timestamp column, use rowversion, use Change Data Capture, or make use of the fact that your primary key only ever gets bigger (see the sketch after this answer). Or you could just do what you want with a Lookup transformation directly in SSIS...
The best approach will depend on many things: how much data you've got, what data transfer speed you have between the servers, your key types, etc.
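A minimal sketch of the ever-increasing key approach, assuming the linked server and the placeholder table names from above:

-- Highest key already present on the Dev side
DECLARE @maxPK INT = (SELECT ISNULL(MAX(PK), 0) FROM Dev.db1.dbo.table1);

-- Copy only the rows Prod has added since the last load
INSERT INTO Dev.db1.dbo.table1
SELECT p.*
FROM Prod.db1.dbo.table1 AS p
WHERE p.PK > @maxPK;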
When your servers are all in one Active Directory domain and you use Windows Authentication, all you need is an account with the proper rights on all the databases!
You can then simply reference any table as server.database.schema.table
For example:
insert into server1.db1.dbo.tblData1 (...)
select ... from server2.db2.dbo.tblData2;

Data Replication - SQL Server 2008

Good day
I needed to create data replication between two databases. I created the Local Publication with one table for testing purposes, then created the Local Subscription, and it worked 100%: I tested it and the data gets updated. I then started adding more tables to the Local Publication I had created, and noticed that the new tables did not come through to the target database via the Local Subscription. Do I need to create a new Subscription for the updates? Do I need to delete the current Subscription, or is there another way to simply update the current Subscription?
Thanks
Ruan
I got this description from this article: http://www.mssqltips.com/sqlservertip/2502/limit-snapshot-size-when-adding-new-article-to-sql-server-replication/
You must start the Snapshot Agent, but first check that the already-replicated tables are not marked for reinitialization; otherwise the data from the old tables will be transferred once more.
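The tip linked above boils down to something like the following T-SQL (publication, article, and table names are placeholders). With allow_anonymous and immediate_sync turned off, the new snapshot contains only the newly added article, so existing tables are not re-sent:

EXEC sp_changepublication @publication = N'MyPublication',
    @property = N'allow_anonymous', @value = 'false';
EXEC sp_changepublication @publication = N'MyPublication',
    @property = N'immediate_sync', @value = 'false';

-- Add the new article and generate a snapshot for it alone
EXEC sp_addarticle @publication = N'MyPublication',
    @article = N'NewTable', @source_owner = N'dbo', @source_object = N'NewTable';
EXEC sp_refreshsubscriptions @publication = N'MyPublication';
EXEC sp_startpublication_snapshot @publication = N'MyPublication';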

Flaws of mapping tables with auto-incremented IDs in SQL

I am currently developing an ASP.NET application using SQL Server as the backend.
I am having a very serious problem with the DB. I provide a DB backup to the user.
Suppose SQL Server crashes and my client wants to restore the DB from that backup.
My application relates its tables through auto-incremented ID columns. So if the user restores the old data from the backup, each table in that backup will have, say, 1 to 50 in its auto-incremented column, and the mappings between tables are based on those numbers.
But as the DB crashed, the auto-increment column will not start again from 1; it will continue its count from 51.
Then all the mappings will go wrong, and not a single table will give me the proper mapping information.
I have around 25 tables in my application.
Now, what would be a possible solution to get the DB back into a proper state?
What should I do about mapping my tables?
Please suggest the best possible way to resolve this.
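If the immediate concern is just the identity counter continuing from 51 after a restore, the seed can be checked and realigned per table. A minimal sketch, where dbo.MyTable is a placeholder and 50 stands for the highest ID actually present in the restored data:

-- Report the current identity value without changing it
DBCC CHECKIDENT ('dbo.MyTable', NORESEED);

-- Re-seed so the next inserted row gets ID 51
DBCC CHECKIDENT ('dbo.MyTable', RESEED, 50);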