I am studying Redis, and I am surprised by how it works. I found that Redis stores recent data in a cache in a NoSQL format and has its own query commands for it. But I am curious about the following:
How is data stored in the persistent database? Do we need to fire the same insert query against both databases?
If Redis is a NoSQL database, is it compulsory that the persistent database we use also follows a NoSQL structure?
How does data synchronisation work between Redis and the persistent database?
Redis offers persistence. There are several options depending on what exactly you need; see the official documentation: https://redis.io/topics/persistence
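For illustration, persistence can be configured in redis.conf or toggled at runtime. A minimal sketch with redis-py showing the two main options, RDB snapshots and the append-only file (host/port and thresholds below are just the usual defaults; treat this as illustrative):

```python
import redis  # pip install redis

r = redis.Redis(host="localhost", port=6379)

# RDB snapshotting: dump the dataset to disk if at least 1 key changed
# in 900s, or 10 keys changed in 300s (the classic redis.conf defaults).
r.config_set("save", "900 1 300 10")

# AOF: append every write to a log and fsync it once per second,
# so at most ~1s of writes can be lost on a crash.
r.config_set("appendonly", "yes")
r.config_set("appendfsync", "everysec")

# Force an RDB snapshot in the background right now.
r.bgsave()
```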
Is there a way to sync NoSQL and SQL databases?
My problem is: we have software that uses MSSQL. We also have a mobile application that uses MongoDB. We want to sync data (on create/update) between those databases, mostly from MongoDB to MSSQL.
It is not a problem for us (if we have to) to use a different NoSQL DBMS, but we can't find clear instructions on how to sync those two the way I described.
Can anyone help? Thanks.
Have you looked at CDC (Change Data Capture) tools? They let you capture change events from your source database (in this case MSSQL) and handle each event to create/update/delete the corresponding data in the NoSQL database.
I invite you to look at https://debezium.io/ and the MSSQL Connector & MongoDB connector:
MS SQL Server Connector
MongoDB Connector
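For a sense of what this involves: a connector is registered by POSTing its config to the Kafka Connect REST API. A minimal sketch in Python (hostnames, credentials, and table names are placeholders, and exact property names depend on your Debezium version):

```python
import requests  # pip install requests

# Register a Debezium SQL Server source connector with Kafka Connect.
# Every hostname, credential, and table name below is a placeholder.
connector = {
    "name": "mssql-source",
    "config": {
        "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
        "database.hostname": "mssql.example.com",
        "database.port": "1433",
        "database.user": "debezium",
        "database.password": "secret",
        "database.dbname": "AppDb",
        "database.server.name": "appdb",  # used as the topic prefix
        "table.include.list": "dbo.Orders,dbo.Customers",
        "database.history.kafka.bootstrap.servers": "kafka:9092",
        "database.history.kafka.topic": "schema-changes.appdb",
    },
}

resp = requests.post("http://connect.example.com:8083/connectors", json=connector)
resp.raise_for_status()
```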
I have a database connected to a website; data from the website is inserted into that database. I need to transfer data from that database to another primary database (SQL) on another server in real time (with minimum latency).
I cannot use transactional replication in this case. What are the other alternatives to achieve this? Can I integrate data streams like Apache Kafka with SQL Server?
Without more detail it's hard to give a full answer. There's what's technically possible, and there's architecturally what actually makes sense :)
Yes, you can stream from an RDBMS to Kafka, and from Kafka to an RDBMS. You can use the Kafka Connect JDBC source and sink. There are also CDC tools (e.g. Attunity, GoldenGate, etc.) that support integration with MS SQL and other RDBMSs.
BUT…it depends why you want the data in the second database. Do you need an exact replica of the first? If so DB-DB replication may be a better option. Kafka's a great option if you want to process the data elsewhere and/or persist it in another store. But if you just want MS SQL-MS SQL…Kafka itself may be overkill.
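If you do go the Kafka route, the sink side might look roughly like this, assuming Confluent's JDBC sink connector (connection details, credentials, and topic names are placeholders):

```python
import requests  # pip install requests

# A JDBC sink connector that upserts Kafka topic records into the
# second SQL Server. All names and credentials are placeholders.
sink = {
    "name": "mssql-sink",
    "config": {
        "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
        "connection.url": "jdbc:sqlserver://primary.example.com:1433;databaseName=Replica",
        "connection.user": "kafka_connect",
        "connection.password": "secret",
        "topics": "website.dbo.orders",
        "insert.mode": "upsert",   # update-or-insert by primary key
        "pk.mode": "record_key",   # take the PK from the Kafka record key
        "auto.create": "true",     # create the target table if missing
    },
}

resp = requests.post("http://connect.example.com:8083/connectors", json=sink)
resp.raise_for_status()
```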
Background: We currently have our data split between two relational databases (Oracle and Postgres). There is a need to run ad-hoc queries that involve tables in both databases. Currently we are doing this in one of two ways:
1. ETL from one database to another. This requires a lot of developer time.
2. An Oracle foreign data wrapper on our Postgres server. This is working, but the queries run extremely slowly.
We already use Google Cloud Platform (for the project that uses the Postgres server). We are familiar with Google BigQuery (BQ).
What we want to do:
We want most of our tables from both these databases (as-is) available at a single location, so querying them is easy and fast. We are thinking of copying over the data from both DB servers into BQ, without doing any transformations.
It looks like we need to take full dumps of our tables on a periodic (daily) basis and reload them into BQ, since BQ is append-only and its recently added DML support seems very limited.
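For concreteness, the kind of daily refresh we have in mind would be roughly the following (project, dataset, table, and bucket names are placeholders):

```python
from google.cloud import bigquery  # pip install google-cloud-bigquery

client = bigquery.Client()

# Load today's full dump and overwrite yesterday's copy; WRITE_TRUNCATE
# sidesteps BQ's append-only nature and limited DML.
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
    autodetect=True,
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
)

load_job = client.load_table_from_uri(
    "gs://our-dumps/orders/latest.csv",  # daily export from Oracle/Postgres
    "our-project.warehouse.orders",      # destination table in BQ
    job_config=job_config,
)
load_job.result()  # block until the load finishes
```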
We are aware that loading the tables as is to BQ is not an optimal solution and we need to denormalize for efficiency, but this is a problem we have to solve after analyzing the feasibility.
My question is whether BQ is a good solution for us, and if yes, how to efficiently keep BQ in sync with our DB data, or whether we should look at something else (like say, Redshift)?
WePay has been publishing a series of articles on how they solve these problems. Check out https://wecode.wepay.com/posts/streaming-databases-in-realtime-with-mysql-debezium-kafka.
To keep everything synchronized they:
The flow of data starts with each microservice's MySQL database. These databases run in Google Cloud as CloudSQL MySQL instances with GTIDs enabled. We've set up a downstream MySQL cluster specifically for Debezium. Each CloudSQL instance replicates its data into the Debezium cluster, which consists of two MySQL machines: a primary (active) server and secondary (passive) server. This single Debezium cluster is an operational trick to make it easier for us to operate Debezium. Rather than having Debezium connect to dozens of microservice databases directly, we can connect to just a single database. This also isolates Debezium from impacting the production OLTP workload that the master CloudSQL instances are handling.
And then:
The Debezium connectors feed the MySQL messages into Kafka (and add their schemas to the Confluent schema registry), where downstream systems can consume them. We use our Kafka connect BigQuery connector to load the MySQL data into BigQuery using BigQuery's streaming API. This gives us a data warehouse in BigQuery that is usually less than 30 seconds behind the data that's in production. Other microservices, stream processors, and data infrastructure consume the feeds as well.
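For a sense of what that last hop amounts to: BigQuery's streaming API boils down to row-level inserts, as in this minimal sketch (table name and row fields are invented for illustration):

```python
from google.cloud import bigquery  # pip install google-cloud-bigquery

client = bigquery.Client()

# Stream change events into BigQuery as they arrive, instead of
# waiting for a batch load. Table name and fields are invented.
rows = [{"id": 42, "status": "SHIPPED", "updated_at": "2018-06-01T12:00:00Z"}]
errors = client.insert_rows_json("our-project.warehouse.orders", rows)
if errors:
    raise RuntimeError("streaming insert failed: %s" % errors)
```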
I was wondering whether, in addition to processing and displaying data on a dashboard in WSO2 CEP, I can store it somewhere for a long period of time to extract further information later. I have read that there are two types of tables used in WSO2 CEP: in-memory and RDBMS tables.
Which one should I choose?
There is one more option: switching to WSO2 DAS. Is that a good approach?
Is the default database fine for that purpose, or should I move to other supported databases like MySQL, Oracle, etc.?
In-memory or RDBMS?
In-memory tables internally use Java collection structures, so their contents are destroyed once the JVM is terminated (after a server restart, the data won't be available). On the other hand, RDBMS tables will persist data permanently. For your scenario, I think you should proceed with RDBMS tables.
CEP or DAS?
CEP only provides real-time analytics, whereas DAS provides batch analytics (with Spark SQL) in addition to real-time analytics. If you have a scenario that requires batch processing, incremental processing, etc., you can go ahead with DAS. Note that migration from CEP to DAS is quite simple (since the artifacts are identical).
Default (H2) DB or other DB?
By default, WSO2 products use an embedded H2 DB as the data source. However, it's recommended to use MySQL or Oracle in production environments.
I have a project in which I need to replace the SQL DB with Redis. It's a job scheduling system. There are tables like JobInfo, TaskInfo, Result, BatchInfo, etc.
What is the best way to map DB tables to key-value pairs on a Redis server?
The project uses JOIN and GROUP BY style queries.
What is the best way to replace SQL Server with Redis? Also, does Redis provide a way to query data the way JOIN and GROUP BY queries do?
Redis is basically a key-value store (a bit more sophisticated than a simple one, but still a key-value DB). The value may be a document that follows some schema, but Redis isn't optimized to search those documents and query them the way document databases or a relational database such as SQL Server can.
I don't know why you're trying to migrate from SQL Server to Redis, but you need to re-check whether that's the right design choice. If you need a fixed schema and join operations, that suggests Redis isn't the right solution.
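To make that concrete, here is a minimal sketch (all key names invented for illustration) of how two of those tables might be modeled in Redis, with the "join" pushed into application code:

```python
import redis  # pip install redis

r = redis.Redis()

# Rows become hashes; a set acts as a hand-rolled index from a job
# to its tasks. All key names here are invented for illustration.
r.hset("job:1", mapping={"name": "nightly-etl", "status": "RUNNING"})
r.hset("task:7", mapping={"job_id": "1", "result": "OK"})
r.sadd("job:1:tasks", "task:7")

# What a SQL JOIN between JobInfo and TaskInfo did declaratively,
# you now do by hand, key by key:
job = r.hgetall("job:1")
tasks = [r.hgetall(key) for key in r.smembers("job:1:tasks")]
```

Aggregations like GROUP BY likewise have to be computed client-side, which is exactly why a schema-heavy, join-heavy workload may not fit Redis.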
If all you're looking for is caching, you can cache in the application layer, or use another solution to integrate Redis and SQL Server (I wrote a simple open-source project that does that: http://redisql.ishahar.net).
Hope this helps.
I guess it's not possible, though you can see the post below on implementing a JOIN-like feature in Redis:
Can we take join in Redis?
Please refer to the post below as well:
Redis database table design like SQL?