Can Apache Ignite update when 3rd party SQL Server database changed something directly? - ignite

Can I get some advice on whether it is possible to proceed like the steps below?
SQL Server data is loaded in Ignite Cluster
The data in SQL Server has been changed.
-> Is there any other way to reflect this changed data without reloading the data from SQL Server?
When used as a cache in front of the database, when changes are made directly to the DB without going through the Ignite Cluster, can the already loaded cache data be directly reflected in the Ignite cache?
Is it possible to set only the value to change without loading the data again?
If possible, which part should I set? Please.

I suppose the real question is - how to propagate changes applied to SQL Server first to the Apache Ignite cluster. And the short answer is - you need to do it by yourself, i.e. you need to implement some synchronization logic between the two databases. This should not be a complex task if most of the data updates come through Ignite and SQL Server-first updates are rare.
As for the general approach, you can check for the Change Data Capture (CDC) pattern implementations. There are multiple articles on how you can achieve it using external tools, for sample, CDC Between MySQL and GridGain With Debezium or this video.
It's worth mentioning that Apache Ignite is currently working on its own native implementation of CDC.

Take a look at Ignite's external storage integration, and the read/write through features. See: https://ignite.apache.org/docs/latest/persistence/external-storage
and https://ignite.apache.org/docs/latest/persistence/custom-cache-store
examples here: https://github.com/apache/ignite/tree/master/examples/src/main/java/org/apache/ignite/examples/datagrid/store

Related

Does PostgreSQL provide Change Tracking feature similar to SQL Server change tracking?

Does PostgreSQL provide change tracking feature like that on SQL Server.
this is what I basically want. I want to move my data after few minutes intervals to other database. for this I just want to fetch changed data only in PGSQL through change tracking like that of SQL Server change tracking. What is the best way to achieve this?
It's not so easy with PostgreSQL. You can use WAL’s aka Write Ahead Logs or triggers. May be the best approach will be using a external library like https://debezium.io
I'm trying to achieve the same goal. This is what I found surfing the Internet. There are few possible approaches:
Streaming replication (ships a binary change log to a standby server)
Slony (uses triggers to accumulate DML changes into tables that are periodically shipped to standby servers)
Logical changeset logging
Audit trigger 91plus

Apache Ignite cache to SQL and vice versa

I'm working on a system which will fetch data from a service and put pieces of the response in to a cache and/or into a SQL table.
The cache is needed for consumption by other Java services directly. These services require a more direct connection than the SQL abstraction, so we need to connect directly to the cache.
The table is needed for a JDBC SQL connection to external SQL clients e.g. SQL Workbench, DBeaver, Tableau, 3rd party systems.
My question is how Ignite works regarding caches vs tables. I know it stores its caches as maps similar to other IMDGs. What I guess I don't understand is how that gets turned into a table, or what APIs are available to set/get between the two.
So the question is, how can I take an INSERT from the JDBC/SQL side and query it via the Cache? How can I add() into the Cache and SELECT it from the JDBC/SQL side? If I have a table named "foo", does that also create a cache named "foo"?
Or am I supposed to use one or the other and not bleed between the two? I haven't found many good examples of this, so it seems to be either you use caches or you use tables.
It would be extremely advantageous to have a bridge between the two. We're migrating to Ignite from an H2 implementation where we mushed a Hazelcast cache and H2's SQL together and are hoping Ignite, being built atop H2, has done something similar already.
In particular, I was hoping to use DataStreamers but I'm not finding much in the way of how it relates to the SQL/table side of things.
Ignite cache falls under key-value type of nosql database. You can fire SQL like query from java code to ignite caches as it supports it. For example,
SELECT _KEY, _VAL from "foo".val
Here, foo is your cache name and val is the value part of key-value pair. As this is all NOSQL, relating it to RDBMS SQL is not so much rational, still we can relate all non primary columns in SQL table to the fields of your value object and primary one to the key part.
So, in datastreamer, you can construct collection of key, value objects and stream it. This internally calls nothing but put operation on cache.
To select in SQL fashon, you can fire query like below-
SqlFieldsQuery query = new SqlFieldsQuery(queryString);
FieldsQueryCursor<List<?>> cursor = cache.query(query);
There are multiple ways to do this, SqlFieldsQuery is one of that.
This was answered couple times already, basically you need to refer to Query Entities, Indexed Types or key_type/value_type parameters of CREATE TABLE to make it work. I.e. every entry in cache of correct type will be a row of table and vice versa.

Filtered one-way synchronization of Azure SQL database

We have a multi-tenant, single db application where some customers have expressed the desire to get direct access to their own data.
I have been suggested looking into Azure Data Sync to achieve a setup where each of the customers get their own Azure SQL instance to which we setup a one-way synchronization of their data from the master database.
I managed to find some documentation on this, but one I got around to try it out in a lab setup, it looks like the ability to filter rows in the sync job has been removed in a later iteration of the Azure Data Sync service.
Am I wrong or is that feature really gone? If so, what would be your suggestions to achieve something similar on Azure?
You cannot filter rows using Azure SQL Data Sync. However, you can build a custom solution based on Sync Framework as explained here.

How to transfer Data from One SQL server to another with out transactional replication

I have a database connected with website, data from website is inserting in that Database, i need to transfer data from that database to another Primary Database (SQL) on another server in real time (minimum latency).
I can not use transactional replication in this case. What are the other alternates to achieve this? Can i integrate DataStreams like Apache kafka etc with SQL server?
Without more detail it's hard to give a full answer. There's what's technically possible, and there's architecturally what actually makes sense :)
Yes you can stream from RDBMS to Kafka, and from Kafka to RDBMS. You can use the Kafka Connect JDBC source and sink. There are also CDC tools (e.g. Attunity, GoldenGate, etc) that support integration with MS SQL and other RDBMS)
BUT…it depends why you want the data in the second database. Do you need an exact replica of the first? If so DB-DB replication may be a better option. Kafka's a great option if you want to process the data elsewhere and/or persist it in another store. But if you just want MS SQL-MS SQL…Kafka itself may be overkill.

What database replication model do I need?

I am on Rails 3 with a local Postgres database. What we want to do is replicate the entire database onto a second server in real time. We are thinking of using Octopus.
I'm confused about what model I'm looking for and how the master-slave model applies.
Postgres 9.1 and later comes with streaming replication built in (for master-slave configurations). Check out http://www.postgresql.org/docs/9.2/static/warm-standby.html#STREAMING-REPLICATION for more information on configuration and setup.
There are other third-party solutions for configuration, but I'd start there and see if that meets your needs.