Monolith to Microservice data migration

We are migrating a monolith to microservices.
The strategy is to first build baseline functionality that is sufficient for a few accounts, then add features incrementally to cover the remaining accounts.
I need to design a system that migrates data from the monolith database to the microservices (one database per microservice) with minimal or no manual effort per account.
The main challenge is that the monolith data is scattered across different tables and governed by different logic.
Example:
for one account, cost details are in the cost table;
for another account, they are in the conf table.
Basically, each account may use a different source table and column for the same piece of data.
I'm planning to keep transformation and loading as generic as possible, because we are offloading all the data-fetching logic to a separate layer.
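
One way to picture the separate data-fetching layer is a per-account mapping from a logical field to its source table and column, so that transformation and loading never see where the value came from. The sketch below is only an illustration under assumptions: the account IDs, the column names `amount` and `cost_value`, the `account_id` column, and the helper names are all hypothetical, and SQLite stands in for the monolith database.

```python
import sqlite3

# Hypothetical per-account mapping: logical field -> (source table, source column).
SOURCE_MAP = {
    "acct_a": {"cost": ("cost", "amount")},
    "acct_b": {"cost": ("conf", "cost_value")},
}

# Identifiers are interpolated into SQL (they cannot be bound as parameters),
# so every (table, column) pair is checked against an allow-list first.
ALLOWED_SOURCES = {("cost", "amount"), ("conf", "cost_value")}


def fetch_field(conn, account_id, field):
    """Return the values of a logical field for one account from its mapped source."""
    table, column = SOURCE_MAP[account_id][field]
    if (table, column) not in ALLOWED_SOURCES:
        raise ValueError(f"unmapped source {table}.{column}")
    sql = f"SELECT {column} FROM {table} WHERE account_id = ? ORDER BY rowid"
    return [row[0] for row in conn.execute(sql, (account_id,))]


def build_demo_monolith():
    """Create an in-memory stand-in for the monolith database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE cost (account_id TEXT, amount REAL);
        CREATE TABLE conf (account_id TEXT, cost_value REAL);
        INSERT INTO cost VALUES ('acct_a', 10.5), ('acct_a', 4.0);
        INSERT INTO conf VALUES ('acct_b', 7.25);
        """
    )
    return conn
```

With this shape, onboarding a new account would mean adding an entry to the mapping rather than writing new extraction code, which is the kind of "no manual effort per account" the question is after.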

Related

Financial information in same ERP database?

I am making a back-end that handles inventory, user data, documentation, & images (basically a custom ERP system) for an application I am developing.
I am deriving some accounting information (payroll, TVM calculations) based on price data and hours logged by users. Is it typical to use the same relational database (MySQL) to store EVERYTHING (but in separate tables of course)? Or do you want to split things into separate databases?

Integration questions when migrate monolithic to microservices using Quarkus

Currently I have a monolithic application with some modules like financial and accounting. This application uses a single database and the modules are divided into schemas, so when I need to display the data on user interface or in a report I just need to do a simple query with a couple joins.
My question is: in a microservices structure where each module has its own database, how do I retrieve this data and get the same result as if I were querying a single database?

When talking about splitting the database in the process of migrating a monolith to Microservices, there are some known patterns like:
The shared database
Database view
Database wrapping service
Database as a service
The database view or the database as a service pattern seems like a candidate in this case, but of course you are in the best position to decide which one is viable in your project.
I highly recommend having a look at chapter 4 of "Monolith to Microservices" by Sam Newman.
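
As a rough illustration of the database view pattern mentioned in this answer, the owning module can publish a view and let other modules query only that view, so its internal tables can change without breaking consumers. This is a minimal sketch under assumptions: the `invoices` table, the `invoice_totals_v` view, and the helper names are hypothetical, and SQLite stands in for the real database.

```python
import sqlite3


def build_financial_db():
    """Create a stand-in for the financial module's database with a published view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE invoices (
            id INTEGER PRIMARY KEY,
            customer TEXT,
            total REAL,
            internal_note TEXT
        );
        INSERT INTO invoices VALUES
            (1, 'acme', 120.0, 'private'),
            (2, 'acme', 30.0, 'private'),
            (3, 'globex', 50.0, 'private');

        -- The view is the only contract other modules are meant to depend on.
        CREATE VIEW invoice_totals_v AS
            SELECT customer, SUM(total) AS total
            FROM invoices
            GROUP BY customer;
        """
    )
    return conn


def customer_totals(conn):
    """Read through the published view instead of the underlying tables."""
    rows = conn.execute(
        "SELECT customer, total FROM invoice_totals_v ORDER BY customer"
    )
    return dict(rows)
```

In a real deployment the consumer would typically be granted read-only access to the view alone; that permission setup is not shown here.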

Creating a Datawarehouse

Currently our team is having a major database management/data management issue where hundreds of databases are being built and used for minor/one off applications where the app should really be pulling from an already existing database.
Since our security is so tight, the owners of these Systems of authority will not allow others to pull data from them at a consistent (App Necessary) rate, rather they allow a single app to do a weekly pull and that data is then given to the org.
I am being asked to compile all of those publicly available (weekly snapshots) into a single data warehouse for end users to go to. We realistically are talking 30-40 databases each with hundreds of thousands of records.
What is the best way to turn this into a data warehouse? Create a SQL server and treat each one as its own DB on the server? As far as the individual app connections I am less worried, I really want to know what is the best practice to house all of the data for consumption.

What you're describing is more of a simple data lake. If all you're being asked for is a single place for the existing data to live as-is, then sure, directly pulling all 30-40 databases to a new server will get that done. One thing to note is that if they're creating Database Snapshots, those wouldn't be helpful here. With actual database backups, it would be easy to build a process that would copy and restore those to your new server. This is assuming all of the sources are on SQL Server.
"Data warehouse" implies a certain level of organization beyond that, to facilitate reporting on an aggregate of the data across the multiple sources. Generally you'd identify any concepts that are shared between the databases and create a unified table for each concept, then create an ETL (extract, transform, load) process to standardize the data from each source and move it into those unified tables. This would be a large lift for one person to build. There are plenty of resources that you could read to get started; Ralph Kimball's The Data Warehouse Toolkit is a comprehensive guide.
In either case, a tool you might want to look into is SSIS. It's good for copying data across servers and has drivers for multiple different RDBMS platforms. You can schedule SSIS packages from SQL Agent. It has other features that could help for data warehousing as well.
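
The ETL idea described in this answer (one unified table per shared concept, with each source standardized on the way in) can be sketched as follows. Everything here is a hypothetical illustration: the two sources, their table and column names, and the `person` table are assumptions, and SQLite stands in for the real servers.

```python
import sqlite3
from itertools import chain


def extract_hr(conn):
    """Yield standardized person records from a hypothetical HR source."""
    for emp_no, full_name in conn.execute("SELECT emp_no, full_name FROM staff"):
        yield {"source": "hr", "source_key": str(emp_no),
               "name": full_name.strip().title()}


def extract_badge(conn):
    """Yield standardized person records from a hypothetical badge source."""
    query = "SELECT badge_id, last_name, first_name FROM holders"
    for badge_id, last_name, first_name in conn.execute(query):
        yield {"source": "badge", "source_key": badge_id,
               "name": f"{first_name} {last_name}".strip().title()}


def load(warehouse, records):
    """Upsert standardized records into the unified table."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS person ("
        "source TEXT, source_key TEXT, name TEXT, "
        "PRIMARY KEY (source, source_key))"
    )
    warehouse.executemany(
        "INSERT OR REPLACE INTO person VALUES (:source, :source_key, :name)",
        records,
    )
    warehouse.commit()


def run_demo():
    """Build two tiny sources, run the pipeline, and return the unified rows."""
    hr = sqlite3.connect(":memory:")
    hr.execute("CREATE TABLE staff (emp_no INTEGER, full_name TEXT)")
    hr.execute("INSERT INTO staff VALUES (101, '  grace hopper ')")

    badge = sqlite3.connect(":memory:")
    badge.execute(
        "CREATE TABLE holders (badge_id TEXT, last_name TEXT, first_name TEXT)"
    )
    badge.execute("INSERT INTO holders VALUES ('B7', 'LOVELACE', 'ada')")

    warehouse = sqlite3.connect(":memory:")
    load(warehouse, chain(extract_hr(hr), extract_badge(badge)))
    return warehouse.execute(
        "SELECT source, source_key, name FROM person ORDER BY source"
    ).fetchall()
```

Keying the unified table on (source, source_key) keeps weekly reloads idempotent, since a repeated pull replaces rows instead of duplicating them.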

How to isolate SQL Data from different customers?

I'm currently developing a service for an app with WCF. I want to host the data on Windows Azure, and it should hold data from different users. I'm searching for the right design for my database. In my opinion there are only two possibilities:
Create a new database for every customer
Add a customer ID to every table (or only to the main table when every table is connected via entities)
The first approach offers very good speed and isolation, but it's very expensive on Windows Azure (or am I misunderstanding the Azure pricing?). I also don't know how to configure a WCF service so that it uses a different database for each customer.
The second approach is slower and its isolation is poor, but it's easy to implement and cheaper.
Now to my question:
Is there any other way to get high data isolation and easy integration in a WCF service on Azure?
What design should I use and why?

You have two additional options: build multiple schema containers within a database (see my blog post about this technique), or even better use SQL Database Federations (you can use my open-source project called Enzo SQL Shard to access federations). The links I am providing give you access to other options as well.
In the end it's a rather complex decision that involves a tradeoff between performance, security and manageability. I usually recommend Federations, even though it has its own set of limitations, because it is a flexible multitenant option for the cloud with the option to filter data automatically. Check out the open source project: you will see how to implement good separation of customer data independently of the physical storage.

Database Client Specific Tables v/s Relational Tables

My application is a SaaS app serving multiple clients. Data integrity for each client is essential.
Is it better to keep my tables
Client specific
OR
Relational Tables
For example: I have a mapping table with fields MapField1, MapField2. I need this kind of data for each client.
Should I have tables like MappingData_
or a Single Table with mapping to the ClientId
MappingData with Fields MapField1,MapField2,ClientId

I would have a separate database for each customer. (Multiple databases in a single SQL Server instance.)
This would allow you to design it once, with a single schema.
No dynamically named tables compromising test & development
Upgrades and maintenance can be designed and tested in one DB, then rolled out to all
A single customer's data can be backed-up, restored or dropped exceedingly simply
Bugs discovered/exploited in one DB won't compromise the integrity of other DBs
Data access (read and write) can be managed using SQL Logins (No re-inventing the wheel)
If there is a need for globally shared data, that would go in another database, with its own set of permissions for the different SQL Logins.
The use of a single database with all users in it is my next best choice. You still have a single schema. But you don't get to partition the customers' data, you need to manage access rights and permissions yourself, and you take on a whole host of additional design and testing work.
I would never go near dynamically creating new tables for additional customers. A new table name means all your queries need to be updated with the new table name, along with a whole host of other maintenance headaches.
I'm pretty much of the opinion that if you want to create tables dynamically during the Business As Usual use of an application/service, you've designed it badly.
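
The database-per-customer layout recommended in this answer can be sketched as a small registry that provisions every customer from the same schema and routes each request to that customer's own database. This is only an illustration: the `TenantDatabases` class and its methods are hypothetical, the column names are adapted from the question, and in-memory SQLite databases stand in for separate databases on one SQL Server instance.

```python
import sqlite3

# One schema, designed once and applied to every customer database.
SCHEMA = "CREATE TABLE mapping_data (map_field1 TEXT, map_field2 TEXT)"


class TenantDatabases:
    """Hold one database per customer, all created from the same schema."""

    def __init__(self):
        self._conns = {}

    def provision(self, customer):
        """Create a fresh database for a new customer and return its connection."""
        conn = sqlite3.connect(":memory:")
        conn.execute(SCHEMA)
        self._conns[customer] = conn
        return conn

    def for_customer(self, customer):
        """Route to the customer's own database; unknown customers raise KeyError."""
        return self._conns[customer]

    def drop(self, customer):
        """Remove one customer's data without touching any other customer."""
        self._conns.pop(customer).close()
```

Queries never carry a table-name or customer-ID parameter here; the isolation comes entirely from which connection is handed out.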

SO has a tag for the thing you're describing: "multi-tenant".
Visualize the architecture for supporting a multi-tenant database application as a spectrum. At one extreme of the spectrum is "shared nothing", which means each tenant has its own database. At the other extreme of the spectrum is "shared everything", which means tenants share tables, and each row in each table belongs to one tenant. (Each row contains a tenant identifier.)
Terminology seems to overlap, so read carefully. What one writer means by shared schema might be identical to what another writer means by shared everything.
This SO answer, also written by me, describes the differences and the tradeoffs in terms of cost, data isolation and protection, maintenance, and disaster recovery. It also links to a fairly good introductory article.
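
For the "shared everything" end of the spectrum, a minimal sketch is a single table with a tenant identifier on each row, accessed only through a wrapper that always applies the tenant filter. The `TenantScope` class, the `tenant_id` column, and the snake_case field names are assumptions made for illustration, with SQLite as a stand-in.

```python
import sqlite3


def build_shared_db():
    """Create one database whose tables are shared by all tenants."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE mapping_data (
            tenant_id TEXT NOT NULL,
            map_field1 TEXT,
            map_field2 TEXT
        );
        CREATE INDEX idx_mapping_tenant ON mapping_data (tenant_id);
        INSERT INTO mapping_data VALUES
            ('t1', 'a', 'b'),
            ('t1', 'c', 'd'),
            ('t2', 'e', 'f');
        """
    )
    return conn


class TenantScope:
    """Bind a connection to one tenant so every query is filtered by tenant_id."""

    def __init__(self, conn, tenant_id):
        self._conn = conn
        self._tenant_id = tenant_id

    def mappings(self):
        """Return only the rows that belong to the bound tenant."""
        return self._conn.execute(
            "SELECT map_field1, map_field2 FROM mapping_data "
            "WHERE tenant_id = ? ORDER BY rowid",
            (self._tenant_id,),
        ).fetchall()
```

The tradeoff is visible in the code: isolation depends on every query going through the scoped wrapper, whereas in the "shared nothing" design it is enforced by the database boundary itself.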
