Only one Redis database can be read from at a time, and only one can be written to, yet our data is being written ("sunk") into two databases. How should this be set up?
I also want to understand the Redis Cluster concept: how it is useful and what problems it has.
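Roughly, the setup I have in mind looks like this (a minimal sketch with the redis-py client; host, port and database numbers are just placeholders):

```python
import redis

# Two connections to the same Redis server, pointing at different
# logical databases (db 0 and db 1); host/port are placeholders.
primary = redis.Redis(host="localhost", port=6379, db=0)
secondary = redis.Redis(host="localhost", port=6379, db=1)

# Write ("sink") the same data into both databases.
for r in (primary, secondary):
    r.set("user:42:name", "alice")

# Reads only ever go to one database per connection.
print(primary.get("user:42:name"))
```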
Background: We currently have our data split between two relational databases (Oracle and Postgres). There is a need to run ad-hoc queries that involve tables in both databases. Currently we are doing this in one of two ways:
ETL from one database to another. This requires a lot of developer time.
Oracle foreign data wrapper on our Postgres server. This is working, but the queries run extremely slowly.
We already use Google Cloud Platform (for the project that uses the Postgres server). We are familiar with Google BigQuery (BQ).
What we want to do:
We want most of our tables from both these databases (as-is) available at a single location, so querying them is easy and fast. We are thinking of copying over the data from both DB servers into BQ, without doing any transformations.
It looks like we need to take full dumps of our tables on a periodic (daily) basis and reload them into BQ, since BQ is append-only and its recently added DML support seems very limited.
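Concretely, we imagine a daily job roughly like the following (a minimal sketch using the google-cloud-bigquery Python client; the project, bucket and table names are placeholders, and it assumes we first export each table to CSV in Cloud Storage):

```python
from google.cloud import bigquery

client = bigquery.Client(project="our-project")  # placeholder project

# Overwrite the BigQuery table with today's full dump of the source table.
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
    autodetect=True,
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
)

load_job = client.load_table_from_uri(
    "gs://our-dumps/orders_latest.csv",  # placeholder dump exported from Oracle/Postgres
    "our-project.warehouse.orders",      # placeholder destination table
    job_config=job_config,
)
load_job.result()  # wait for the load to finish
```

WRITE_TRUNCATE makes each daily load replace the previous copy of the table, which side-steps the append-only limitation at the cost of reloading everything every day.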
We are aware that loading the tables as-is into BQ is not an optimal solution and that we will need to denormalize for efficiency, but that is a problem to solve after we have assessed feasibility.
My question is whether BQ is a good solution for us and, if yes, how to efficiently keep BQ in sync with our DB data, or whether we should look at something else (say, Redshift)?
WePay has been publishing a series of articles on how they solve these problems. Check out https://wecode.wepay.com/posts/streaming-databases-in-realtime-with-mysql-debezium-kafka.
To keep everything synchronized they:
The flow of data starts with each microservice's MySQL database. These databases run in Google Cloud as CloudSQL MySQL instances with GTIDs enabled. We've set up a downstream MySQL cluster specifically for Debezium. Each CloudSQL instance replicates its data into the Debezium cluster, which consists of two MySQL machines: a primary (active) server and secondary (passive) server. This single Debezium cluster is an operational trick to make it easier for us to operate Debezium. Rather than having Debezium connect to dozens of microservice databases directly, we can connect to just a single database. This also isolates Debezium from impacting the production OLTP workload that the master CloudSQL instances are handling.
And then:
The Debezium connectors feed the MySQL messages into Kafka (and add their schemas to the Confluent schema registry), where downstream systems can consume them. We use our Kafka connect BigQuery connector to load the MySQL data into BigQuery using BigQuery's streaming API. This gives us a data warehouse in BigQuery that is usually less than 30 seconds behind the data that's in production. Other microservices, stream processors, and data infrastructure consume the feeds as well.
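To give a feel for the last hop of that pipeline (Kafka into BigQuery via the streaming API), here is a minimal conceptual sketch using kafka-python and the google-cloud-bigquery client rather than WePay's actual connector; the topic, broker and table names are placeholders, and it assumes each event payload is already a flat JSON row:

```python
import json
from kafka import KafkaConsumer
from google.cloud import bigquery

client = bigquery.Client()
table_id = "our-project.warehouse.orders"   # placeholder destination table

# Consume change events that Debezium has written to a Kafka topic.
consumer = KafkaConsumer(
    "dbserver.shop.orders",                 # placeholder Debezium topic name
    bootstrap_servers=["kafka:9092"],       # placeholder broker
    value_deserializer=lambda m: json.loads(m.decode("utf-8")),
)

for message in consumer:
    row = message.value  # assumes the payload maps directly to a table row
    errors = client.insert_rows_json(table_id, [row])  # BigQuery streaming insert
    if errors:
        print("streaming insert failed:", errors)
```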
Currently I have 3 apps (same code base), each with its own database and its own unique data. We're moving toward multi-tenancy in Rails, and after testing a couple of prototypes we've decided to go with shared tenancy. My biggest problem is that each database has its own data with its own unique ids and so on. How could I merge them, either via an SQL command/dump or a Rails script, so that each gets its own account_id while keeping all data integrity?
Absolutely doable. It depends on a lot of details.
Basically, I would:
Make a full backup of all three.
Prep each database to hold compatible data (no duplicates).
Select one to be the new master.
Dump the other two (data only).
Hack the dumps as needed to make sure the data fits (for instance, so ids do not collide); the plain COPY statements in the dumps are easy to work with.
Restore the data from the two additional databases on top of the existing data in the master.
Make sure all sequences are set properly (see the sketch below).
Run vacuumdb -fz master.
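A rough sketch of the tagging and sequence-fixing parts, assuming Postgres with psycopg2 and a hypothetical users table that gains an account_id column (sketched in Python rather than as a Rails migration, and not a complete merge):

```python
import psycopg2

# Connect to the database chosen as the new master (placeholder DSN).
conn = psycopg2.connect("dbname=master_db user=postgres")
cur = conn.cursor()

# Tag rows with the tenant they came from; account_id 1 is the master's own data.
# Rows restored from the other two dumps would be tagged 2 and 3 the same way.
cur.execute("ALTER TABLE users ADD COLUMN IF NOT EXISTS account_id integer")
cur.execute("UPDATE users SET account_id = 1 WHERE account_id IS NULL")

# After restoring the other dumps, make sure the sequence is past the highest id.
cur.execute(
    "SELECT setval(pg_get_serial_sequence('users', 'id'), "
    "(SELECT COALESCE(MAX(id), 1) FROM users))"
)

conn.commit()
cur.close()
conn.close()
```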
If I have two databases, and views in one of them that JOIN or UNION tables from both databases, is this an issue for Google Cloud SQL? According to MySQL, this feature only requires that both databases remain within the same hardware cluster.
I am not totally clear on what constitutes a hardware cluster, but how does that relate to Google Cloud SQL instances, etc.?
Each Google Cloud SQL instance has a single MySQL instance at any one time. The data is replicated to multiple locations when that single MySQL instance writes it to persistent storage -- this means that the instance can fail over to a new location if there is a problem.
There isn't any hardware clustering in the sense used here.
I have an online database which is updated daily from various sources.
I need a local database containing some tables from the server database. At particular intervals of time, it has to check for any changes or new rows in the server's tables and update the local database. How can I achieve this?
You may want to look into SQL Server Replication.
Replication will manage the data synchronization between the two copies of your database. You can configure replication for any tables in the database, including all tables. Replication will take care of checking for updates, adds and deletes from the Server Database and transfer the changes to the local database.
You can set up replication to update the local database in near real time, or you can schedule periodic updates.
Replication is a high-maintenance solution. It's designed to maintain two copies of the same database with significant reliability. This makes replication a good solution when you must avoid data problems or recover from problems with little to no data loss.
If you don't require the high-maintenance solution, then SQL Server Integration Services (SSIS) may be a good alternative. With SSIS, you develop the data transfer and data management solution. Along with managing data problems, you design the solution to identify data adds, deletes and updates.
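If you end up hand-rolling something simpler than either of those, the general shape is a scheduled script that pulls changed rows and applies them locally. A very rough sketch with pyodbc, assuming both databases are SQL Server, the table has a last_modified column, and you persist the last sync time yourself (none of which the question states):

```python
import pyodbc
from datetime import datetime

# Placeholder connection strings for the remote (server) and local databases.
server_cn = pyodbc.connect("DRIVER={ODBC Driver 17 for SQL Server};SERVER=remote;DATABASE=src;UID=u;PWD=p")
local_cn = pyodbc.connect("DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;DATABASE=dst;UID=u;PWD=p")

last_sync = datetime(2000, 1, 1)  # in practice, load/persist this between runs

# Pull rows changed since the last run (assumes a last_modified column exists).
rows = server_cn.cursor().execute(
    "SELECT id, name, last_modified FROM items WHERE last_modified > ?", last_sync
).fetchall()

# Naive upsert into the local copy; deletes would need separate handling.
local = local_cn.cursor()
for r in rows:
    local.execute("DELETE FROM items WHERE id = ?", r.id)
    local.execute("INSERT INTO items (id, name, last_modified) VALUES (?, ?, ?)",
                  r.id, r.name, r.last_modified)
local_cn.commit()
```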
I have two applications using two nearly identical MySQL databases within the same cluster. Some tables must contain separate data, but others should hold identical contents (i.e. all writes and rows in db1.tbl should be accessible in db2.tbl and vice versa).
What's the proper way to go about this? Note that the applications use hardcoded table (but not database) names, so simply telling application 2 to access db1.tbl is not an option.
What you need to do is set up replication for the tables that you need. See http://dev.mysql.com/doc/refman/5.0/en/replication.html for the documentation on setting up replication in MySQL.
For databases on different mysqld processes
You should check the official manual for replicating individual tables:
http://dev.mysql.com/doc/refman/5.1/en/replication-options-slave.html#option_mysqld_replicate-do-table
You can set up a master-master relation between the two mysqld processes; just keep in mind to be careful and ensure uniqueness of your primary keys (one common approach is sketched below).
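One common way to keep primary keys from colliding between two masters is interleaved auto-increment values. A minimal sketch using mysql-connector-python, with placeholder hosts and credentials, where each server hands out every other id:

```python
import mysql.connector

# Placeholder connection details; the matching statements run on each master.
master_a = mysql.connector.connect(host="db-a", user="root", password="secret")
master_b = mysql.connector.connect(host="db-b", user="root", password="secret")

# Both masters step their auto-increment ids by 2; master A uses odd ids and
# master B uses even ids, so rows inserted on either side never collide.
for conn, offset in ((master_a, 1), (master_b, 2)):
    cur = conn.cursor()
    cur.execute("SET GLOBAL auto_increment_increment = 2")
    cur.execute(f"SET GLOBAL auto_increment_offset = {offset}")
    cur.close()
```

Note that SET GLOBAL only lasts until the server restarts; in practice you would also put auto_increment_increment and auto_increment_offset in each server's my.cnf.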
For databases residing on the same server & mysqld service
IMHO, design-wise you should consider moving all your shared tables into a separate database.
This way you avoid the overkill of using triggers to keep them updated.