How to transfer data from one SQL Server to another without transactional replication

I have a database connected to a website; data from the website is inserted into that database. I need to transfer data from that database to another primary database (SQL Server) on another server in real time (with minimum latency).
I cannot use transactional replication in this case. What are the other alternatives for achieving this? Can I integrate data-streaming platforms like Apache Kafka etc. with SQL Server?

Without more detail it's hard to give a full answer. There's what's technically possible, and there's what actually makes sense architecturally :)
Yes, you can stream from an RDBMS to Kafka, and from Kafka to an RDBMS. You can use the Kafka Connect JDBC source and sink. There are also CDC tools (e.g. Attunity, GoldenGate, etc.) that support integration with MS SQL and other RDBMSs.
BUT…it depends why you want the data in the second database. Do you need an exact replica of the first? If so DB-DB replication may be a better option. Kafka's a great option if you want to process the data elsewhere and/or persist it in another store. But if you just want MS SQL-MS SQL…Kafka itself may be overkill.
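If you do go the Kafka route, a minimal sketch of the Connect JDBC pair mentioned above might look like this (standalone .properties form; hosts, credentials, table and topic names are all placeholders, and the property names should be verified against the Confluent JDBC connector docs for your version):

# Source: poll the website database for new rows (sketch, hypothetical names)
name=website-db-source
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
connection.url=jdbc:sqlserver://website-db:1433;databaseName=WebDB
connection.user=kafka
connection.password=secret
mode=incrementing
incrementing.column.name=id
topic.prefix=webdb-

# Sink: write the resulting topics into the primary database (sketch)
name=primary-db-sink
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
connection.url=jdbc:sqlserver://primary-db:1433;databaseName=PrimaryDB
connection.user=kafka
connection.password=secret
topics=webdb-Orders
insert.mode=upsert
pk.mode=record_value
pk.fields=id
auto.create=true

Note that the JDBC source polls on an interval (poll.interval.ms), so "real time" here means the poll interval; a log-based CDC tool gives lower latency, which matters for your requirement.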

Related

Trigger Based Replication (Live Sync) OR Transactional Replication in MSSQL

Can someone give me a clear idea of which technique/method is more reliable, less memory-consuming, and faster for replicating data from one database to another in MSSQL (SQL Server 2012), and why? We are in the process of developing a live GPS-based tracking application and I am confused about which method to proceed with:
Trigger Based Replication (Live Sync)
(OR)
Transactional Replication
Thanks in Advance ☺
I would recommend using standardised solutions whenever possible. Within the choice given to you, transactional replication should be the obvious favourite, because:
It doesn't require any coding and can be deployed using standard tools. This makes it much faster to deploy and maintain: any proper DBA can do it, some of them even blindfolded.
Actual data transfer is done by replication agents which are separate applications external to the SQL Server process and client connections. Any network issues within the publisher-distributor-subscriber(s) chain will lead to delays in copying the data, but they will not affect the performance of the publisher database itself.
With triggers, you have neither of these advantages: you will have to add a lot of code, and a sluggish network will make data-changing queries slower, potentially leading to timeouts.
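To illustrate, even a minimal trigger-based sync for a single table already looks something like the sketch below (all names are hypothetical, and a real solution would also need UPDATE/DELETE triggers, error handling and retries):

-- Sketch only: push each new row to a linked server (hypothetical names)
create trigger trg_Positions_Sync
on dbo.Positions
after insert
as
begin
    set nocount on;
    -- The remote insert runs inside the user's own transaction (and promotes it
    -- to a distributed transaction), so a slow network slows every local insert.
    insert into RemoteServer.TrackingDB.dbo.Positions (Id, DeviceId, Lat, Lon, RecordedAt)
    select Id, DeviceId, Lat, Lon, RecordedAt
    from inserted;
end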
Of course, there are many more ways to move the data between the databases in SQL Server, such as (in no particular order):
AlwaysOn Availability Groups (Database mirroring);
Log shipping;
CDC (Change Data Capture), which is sketched below;
Service Broker.
However, given your needs, transactional replication still looks like your best bet overall.
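For completeness, here is what enabling CDC from the list above looks like (a sketch; database and table names are placeholders, and you would still need a job to read the change tables and ship the rows):

-- Sketch: enable CDC for the database, then per table (hypothetical names)
use TrackingDB;
exec sys.sp_cdc_enable_db;

exec sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'Positions',
    @role_name     = null;   -- null means no gating role is required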

Sync NoSQL with SQL

Is there a way to sync NoSQL and SQL databases?
My problem is this: we have software that uses MSSQL. We also have a mobile application that uses MongoDB. We want to sync data (on create/update) between those databases, mostly from MongoDB to MSSQL.
It is not a problem for us (if we have to) to use a different NoSQL DBMS, but we can't find clear instructions on how to sync those two the way I described.
Can anyone help? Thanks.
Have you looked at CDC (Change Data Capture) tools that allow you to capture events from your source database (in this case MSSQL) and handle each event to update/create/delete data in the NoSQL database?
I invite you to look at https://debezium.io/ and the MSSQL Connector & MongoDB connector:
MS SQL Server Connector
MongoDB Connector
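As a rough sketch, a Debezium SQL Server source connector configuration looks like this (property-file form; host, credentials and table names are placeholders, CDC must already be enabled on the source database, and newer Debezium versions rename table.whitelist to table.include.list):

# Debezium SQL Server source (sketch; all values are placeholders)
name=mssql-source
connector.class=io.debezium.connector.sqlserver.SqlServerConnector
database.hostname=mssql-host
database.port=1433
database.user=debezium
database.password=secret
database.dbname=AppDB
database.server.name=appdb
table.whitelist=dbo.Orders
database.history.kafka.bootstrap.servers=kafka:9092
database.history.kafka.topic=schema-changes.appdb

On the MongoDB side, Debezium's MongoDB connector reads the replica set oplog in a similar way, and a sink connector can then apply the events to MSSQL.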

Managing data in two relational databases in a single location

Background: We currently have our data split between two relational databases (Oracle and Postgres). There is a need to run ad-hoc queries that involve tables in both databases. Currently we are doing this in one of two ways:
ETL from one database to another. This requires a lot of developer time.
Oracle foreign data wrapper on our Postgres server. This is working, but the queries run extremely slowly (the setup is sketched just below).
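For reference, the wrapper setup is essentially the following (simplified; server, user and table names are placeholders). Every query against the foreign table is shipped to Oracle at run time, which is presumably where the slowness comes from:

-- Postgres side, using oracle_fdw (sketch; connection details are placeholders)
create extension oracle_fdw;

create server oradb foreign data wrapper oracle_fdw
    options (dbserver '//oracle-host:1521/ORCL');

create user mapping for current_user server oradb
    options (user 'scott', password 'tiger');

create foreign table ora_orders (
    id    integer,
    total numeric
) server oradb options (schema 'SCOTT', table 'ORDERS');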
We already use Google Cloud Platform (for the project that uses the Postgres server). We are familiar with Google BigQuery (BQ).
What we want to do:
We want most of our tables from both these databases (as-is) available at a single location, so querying them is easy and fast. We are thinking of copying over the data from both DB servers into BQ, without doing any transformations.
It looks like we need to take full dumps of our tables on a periodic basis (daily) and update BQ, since BQ is append-only. DML has recently become available in BQ, but it seems to be very limited.
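A minimal sketch of the daily refresh we are considering, assuming each dump is first loaded into a staging table (dataset and table names are made up; CREATE OR REPLACE TABLE needs BigQuery standard SQL, and with legacy SQL the same effect comes from loading with a write-truncate disposition):

-- BigQuery standard SQL: atomically replace the serving table with the fresh dump
create or replace table analytics.orders as
select * from analytics.orders_staging;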
We are aware that loading the tables as is to BQ is not an optimal solution and we need to denormalize for efficiency, but this is a problem we have to solve after analyzing the feasibility.
My question is whether BQ is a good solution for us, and if yes, how to efficiently keep BQ in sync with our DB data, or whether we should look at something else (like say, Redshift)?
WePay has been publishing a series of articles on how they solve these problems. Check out https://wecode.wepay.com/posts/streaming-databases-in-realtime-with-mysql-debezium-kafka.
To keep everything synchronized, they:
The flow of data starts with each microservice’s MySQL database. These databases run in Google Cloud as CloudSQL MySQL instances with GTIDs enabled. We’ve set up a downstream MySQL cluster specifically for Debezium. Each CloudSQL instance replicates its data into the Debezium cluster, which consists of two MySQL machines: a primary (active) server and secondary (passive) server. This single Debezium cluster is an operational trick to make it easier for us to operate Debezium. Rather than having Debezium connect to dozens of microservice databases directly, we can connect to just a single database. This also isolates Debezium from impacting the production OLTP workload that the master CloudSQL instances are handling.
And then:
The Debezium connectors feed the MySQL messages into Kafka (and add their schemas to the Confluent schema registry), where downstream systems can consume them. We use our Kafka connect BigQuery connector to load the MySQL data into BigQuery using BigQuery’s streaming API. This gives us a data warehouse in BigQuery that is usually less than 30 seconds behind the data that’s in production. Other microservices, stream processors, and data infrastructure consume the feeds as well.
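For a sense of the last hop, WePay's BigQuery sink connector is configured along these lines (a sketch only; every value is a placeholder, property names vary across versions of the kafka-connect-bigquery project, and newer releases use defaultDataset instead of the datasets regex mapping, so check its README):

# WePay BigQuery sink (sketch; project, dataset, topic and key file are placeholders)
name=bigquery-sink
connector.class=com.wepay.kafka.connect.bigquery.BigQuerySinkConnector
topics=webdb.orders
project=my-gcp-project
datasets=.*=warehouse
keyfile=/etc/kafka-connect/gcp-key.json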

wso2cep: Data storage in addition to display

I was wondering whether, in addition to processing and displaying data on a dashboard in wso2cep, I can store it somewhere for a long period of time to extract further information later. I have read that there are two types of tables used in wso2cep: in-memory and RDBMS tables.
Which one should I choose?
There is one more option, which is to switch to wso2das. Is that a good approach?
Is the default database fine for that purpose, or should I move to other supported databases like MySQL, Oracle, etc.?
In-memory or RDBMS?
In-memory tables internally use Java collection structures, so the data is destroyed once the JVM terminates (after a server restart, the data won't be available). On the other hand, RDBMS tables persist data permanently. For your scenario, I think you should proceed with RDBMS tables.
CEP or DAS?
CEP only provides real-time analytics, whereas DAS provides batch analytics (with Spark SQL) in addition to real-time analytics. If you have a scenario that requires batch processing, incremental processing, etc., you can go ahead with DAS. Note that migration from CEP to DAS is quite simple (since the artifacts are identical).
Default (H2) DB or other DB?
By default, WSO2 products use the embedded H2 DB as the data source. However, it's recommended to use MySQL or Oracle in production environments.

secure database distribution to external clients

We want to distribute/synchronize data from our data warehouse (MS SQL Server) to external customers (also MS SQL Server). The connection has to be secure, because we are dealing with trusted data. Transmission of data from our system to the external client systems must be via HTTP/HTTPS.
In addition, it is possible that clients still run their systems with an older database schema, so already-existing tables and columns should be transmitted and non-existing ones should be ignored.
It's most likely that we will have large database updates, and the updates have to arrive in almost real time.
And it is definitely necessary that the data is stored in a client-side data warehouse / SQL database.
The whole process should also include good monitoring possibilities in case something goes wrong.
We started to develop our own .NET solution, but I thought exchanging data between different systems must be a fairly common problem.
Does anybody know about an existing solution which we can adapt to our scenario?
Any help is appreciated!
The problem is so common that it has a dedicated component in SQL Server: Service Broker (SSB). Rather than starting your own .NET thing and taking care of the many problems yourself (how are you going to handle downtime? Retries? Duplicates? Out-of-order delivery? Authentication of non-domain-joined computers? Routing for machines that change names? Service upgrades? Transactional consistency and rollbacks? Are you going to use DTC?), have a look at the demo I gave at SQL Connections to see how you can easily scale SSB to a throughput of well over 1000 msgs/sec (1 KB payload) on commodity hardware.
The only requirement is that all participants must be at least SQL Server 2005 (there is no SSB in SQL Server 2000).
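To give a feel for the moving parts, the core SSB plumbing on each side is only a handful of DDL statements (a sketch with made-up names; the certificate and routing setup needed for cross-server dialogs is omitted):

-- Sketch with made-up names: the basic Service Broker objects on one side
create message type [//dw/TableSync] validation = well_formed_xml;
create contract [//dw/SyncContract] ([//dw/TableSync] sent by initiator);
create queue dbo.SyncQueue;
create service [//dw/SyncService] on queue dbo.SyncQueue ([//dw/SyncContract]);

-- Sending one change to the client side
declare @h uniqueidentifier;
begin dialog conversation @h
    from service [//dw/SyncService]
    to service '//client/SyncService'
    on contract [//dw/SyncContract]
    with encryption = on;
send on conversation @h
    message type [//dw/TableSync] (N'<row id="42"/>');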
Just use regular SQL connections over a secure VPN or an SSH tunnel. It should be very easy for your networking guys to set up.
For example, you can create a linked server. Then a SQL scheduled job could move the data:
-- DELETE is used here because TRUNCATE TABLE does not accept a
-- four-part (linked server) name
delete from targetserver.dbname.dbo.tablename;

insert into targetserver.dbname.dbo.tablename (a, b, c)
select a, b, c
from dbname.dbo.sourcetable;
Since the linked server talks to your server over a VPN or SSH tunnel, all data is sent encrypted over the internet.
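For completeness, the linked server itself can be created with a couple of system procedures (a sketch; the server name, host and credentials are placeholders):

-- Sketch: create the linked server and map a login (hypothetical names)
exec sp_addlinkedserver
    @server     = N'targetserver',
    @srvproduct = N'',
    @provider   = N'SQLNCLI',
    @datasrc    = N'target.example.com';

exec sp_addlinkedsrvlogin
    @rmtsrvname  = N'targetserver',
    @useself     = N'FALSE',
    @rmtuser     = N'syncuser',
    @rmtpassword = N'secret';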