What is the best structure to separate all tables per client? (SQL)

For instance, I have these entities:
Client : table
TransactionA : table
TransactionB : table
..
TransactionZ : table
TransactionA through TransactionZ each reference Client.
For the database structure, I have been thinking of creating a new TransactionA table for every Client that registers, placed in a schema named after the Client.Code, so it looks like clientA.tbl_TransactionA.
With this structure, I think my database would end up with thousands of tables depending on how many clients register, which I think would make maintenance hard whenever there is a change to the core schema.
I would like to ask for your opinion on the best approach to this, along with its advantages and disadvantages.
PS:
I am using Entity Framework (code first), MSSQL
Thanks in advance.

Creating a table per client would not be a good idea on many levels. To pick one of the more obvious ones: using Entity Framework, you would have to alter and recompile your code each time you wanted to add a client, and you would probably have to use reflection to figure out which client's DbSet to reference when looking up a transaction.
It isn't clear what has driven you to this design consideration, but the more reasonable model would seem to be a single Transactions table with a foreign key / navigation property to the Client table. I assume there's some good but unstated reason why this would not suffice, though.
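For illustration, a minimal sketch of that shared-table shape in T-SQL, with purely illustrative table and column names (Code, Amount and so on are placeholders, not your actual schema):

CREATE TABLE dbo.Client
(
    ClientId INT IDENTITY(1,1) PRIMARY KEY,
    Code     NVARCHAR(20)  NOT NULL UNIQUE,
    Name     NVARCHAR(200) NOT NULL
);

CREATE TABLE dbo.TransactionA
(
    TransactionAId INT IDENTITY(1,1) PRIMARY KEY,
    ClientId       INT NOT NULL
        CONSTRAINT FK_TransactionA_Client REFERENCES dbo.Client (ClientId),
    Amount         DECIMAL(18,2) NOT NULL,
    CreatedAt      DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME()
);

-- One index per transaction table keeps per-client lookups cheap:
CREATE INDEX IX_TransactionA_ClientId ON dbo.TransactionA (ClientId);

-- All clients share the table; queries filter instead of switching tables:
-- SELECT * FROM dbo.TransactionA WHERE ClientId = @ClientId;

Adding a client then becomes a single row in Client rather than a schema change and a redeploy.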

Related

Best way to keep track of users and records in a .NET Core Web Application

I'm trying to build an Inventory web application with .NET Core. In this app, I want to keep track of every create and update operation, so almost every model in my application has CreatedBy and ModifiedBy fields, and each of those fields has a one-to-many relationship with the UserId field from the Users model.
So there are a lot of foreign keys in my models and lots of navigation properties in my Users model. It works, but it looks kind of messy, especially in the Users model, so it got me thinking that maybe there is something wrong with my approach. I have thought of some other ways, but I am just learning the ropes, so I can't really predict the possible downsides of those approaches; thus, I need help.
So what's the best way to deal with this kind of situation in a web application?
Should I keep defining foreign keys?
Should I store the UserId as a string in those columns?
Should I create another table which holds records for every create / update operation?
Is there a better way out there?
After some research I decided to go with the temporal tables feature of SQL Server directly. You only have to add a couple of lines to your DbContext's OnModelCreating method to set it up, and it looks like it's working very well for my needs.
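For reference, this is roughly what a system-versioned (temporal) table looks like on the SQL Server side; the table and column names here are illustrative, and EF Core (6 and later) can generate equivalent DDL when the entity is configured as temporal in OnModelCreating:

CREATE TABLE dbo.InventoryItem
(
    Id        INT IDENTITY(1,1) PRIMARY KEY,
    Name      NVARCHAR(200) NOT NULL,
    Quantity  INT NOT NULL,
    -- the period columns are maintained by SQL Server itself
    ValidFrom DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo   DATETIME2 GENERATED ALWAYS AS ROW END   NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.InventoryItemHistory));

-- Every UPDATE/DELETE is archived automatically; past states can be read with:
-- SELECT * FROM dbo.InventoryItem FOR SYSTEM_TIME AS OF '2024-01-01T00:00:00';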

Degenerate a single SQL table into multiple domain tables

In short: I have a client who wishes to be able to add domain tables without adding SQL tables.
I am working with an application in which data is organized and made available through a PostgreSQL catalogue. What I mean by catalogue is that the database holds the path to the actual data file(s) as well as some metadata.
Adding a new table means that the (Java class of the) client application has to be updated. This is a costly process for the client, who wants us to find a way to let him add new kinds of data to the catalogue without having to change the schema.
I don't have many more specifics about the database itself and its configuration, as I'm usually mostly a client of said database.
My idea to solve this was to have a generic table with the most often used columns (like date, comment, etc.) and a column containing a domain key. The domain key would be used by the client application to request the kind of generic data that is needed (and would have no meaning whatsoever to the database provider). Adding metadata could be done with a companion file within the catalogue, and further filtering would have to be done on the client side.
Question: as I am by no means an SQL expert, I would like to know whether this is an acceptable solution and what limitations I could be facing. I'm thinking of performance, data volume, etc. Or maybe a different approach is advisable?
Regarding expected volume: for a single domain data type, it could be around 30 new entries per day.
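To make the idea concrete, a minimal sketch of such a generic table in PostgreSQL, with purely illustrative table and column names (catalogue_entry, domain_key, data_path are assumptions, not part of any existing schema):

CREATE TABLE catalogue_entry (
    id          BIGSERIAL PRIMARY KEY,
    domain_key  VARCHAR(50) NOT NULL,   -- e.g. 'WEATHER_REPORT'; meaningful only to the client
    entry_date  TIMESTAMP   NOT NULL,
    data_path   TEXT        NOT NULL,   -- path to the actual data file(s)
    comment     TEXT        NULL
);

-- An index on the domain key keeps "all entries of kind X" queries cheap as the table grows:
CREATE INDEX idx_catalogue_entry_domain ON catalogue_entry (domain_key, entry_date);

-- Typical client-side query:
-- SELECT data_path, entry_date
-- FROM catalogue_entry
-- WHERE domain_key = 'WEATHER_REPORT' AND entry_date >= DATE '2024-01-01';

At around 30 rows per day per domain, data volume is unlikely to be a problem for a long time; the main limitation is that the domain-specific attributes live outside the database (in the companion files), so the database cannot index, validate or filter on them.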

Many-to-many database

I'm trying to create a database to analyze the configurations of my servers.
But I have many services that can run on many servers (for failover/load balancing). Also, the configuration of the same service can change from one server to another, which is why I can't just have a Service table.
I tried to link the different tables using a single table that ties them all together. I think I'm in 3NF, but I'm not 100% sure.
Is that a valid database design?
I fear that the queries to find things in the database are going to be a bit complicated.
Thank you.
It would help more if you showed the actual db design. But...
If you have two tables associated in a many-to-many relationship, you will need a table in between them that represents the relationship. Tables normally represent entities in the real world, and foreign keys represent the relationships, but in a many-to-many relationship you need a table to handle that complexity.
That table represents the relationship and could be called ServiceRunningOnServer, and its primary key should be the combination of a ServiceId (with a foreign key to Service.Id) and a ServerId (with a foreign key to Server.Id).
Any setting for a service that applies across the board (not server-specific) is an attribute of the service entity and so belongs in the Service table. But any setting that is specific to the server it is running on is an attribute of the relationship between that service and that server, and so it belongs in the ServiceRunningOnServer table.
Yes, this is a perfectly normalized db design, and it is actually the design with the most optimized complexity. Other designs might make some things easier, but they will also make other things harder; in the end, in the total sum of things, they will over-complicate matters. This design keeps the total complexity of adding, updating, reading and deleting data in your database to a minimum.
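A minimal sketch of that shape (here in T-SQL, with assumed column names for the per-service and per-server settings):

CREATE TABLE Service (
    Id          INT IDENTITY(1,1) PRIMARY KEY,
    Name        NVARCHAR(100) NOT NULL,
    DefaultPort INT NULL             -- settings that are the same everywhere live here
);

CREATE TABLE Server (
    Id       INT IDENTITY(1,1) PRIMARY KEY,
    Hostname NVARCHAR(100) NOT NULL
);

CREATE TABLE ServiceRunningOnServer (
    ServiceId  INT NOT NULL REFERENCES Service (Id),
    ServerId   INT NOT NULL REFERENCES Server (Id),
    ConfigPath NVARCHAR(400) NULL,   -- settings specific to this service/server pair live here
    PRIMARY KEY (ServiceId, ServerId)
);

-- "Which servers run service X, and with what configuration?"
-- SELECT sv.Hostname, sos.ConfigPath
-- FROM ServiceRunningOnServer sos
-- JOIN Server sv ON sv.Id = sos.ServerId
-- WHERE sos.ServiceId = @ServiceId;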

Refactoring a database while preserving existing data: best practices?

I have been working on a very data-intensive application that has around 250 tables. Recently some design changes have been required. Some of them involve adding new tables and linking them up with existing tables via foreign keys in a 1-N manner for parent-child relationships (in the ORM).
Take this example. The current design allows for one rental Vehicle per Contract. The new design requires multiple Vehicles on the same Contract with multiple Rates.
So the data from one table now needs to be put into 2 additional tables.
I have completed the changes to the schema but I can't deploy those changes to the test environment until I find a way to convert the existing data and put it in the new design format.
My current process:
Add 3 new tables: nContract, nContractedAsset, nContractRate.
Copy the information from the Contract table into the 3 new tables, preserving the primary key values on nContract so they match the Contract table.
Copy foreign key references / indexes / rights to nContract from the Contract table.
Drop the Contract table.
Rename nContract to Contract, and so on.
The only issue I have is that I am not comfortable doing step 2 in SQL. I want to use the power of the ORM and .NET to do more intelligent and complex work for scenarios more complex than this example.
Is there a way I can write the data migration using ADO.NET or the ORM for step 2?
What are best practices or the processes for this? Am I doing something wrong?
I ended up using FluentMigrator: https://github.com/schambers/fluentmigrator
It allowed me to do Entity Framework-like migrations (see Ruby on Rails Active Record migrations).
Most of the DDL can be written in .NET in a fluent format. It supports up and down migrations wrapped in transactions, and it even supports full SQL scripts for data migration.
The best thing about it is that all your migration scripts can be put in source control and even tested.
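For what it's worth, the data-migration part (step 2 of the process above) boils down to a handful of INSERT ... SELECT statements, whether you embed them in a migration as a SQL script or run them by hand. A hedged sketch, assuming illustrative column names on the old Contract table:

-- Preserve the original keys so existing foreign key references keep working
SET IDENTITY_INSERT dbo.nContract ON;

INSERT INTO dbo.nContract (ContractId, CustomerId, StartDate, EndDate)
SELECT ContractId, CustomerId, StartDate, EndDate
FROM dbo.Contract;

SET IDENTITY_INSERT dbo.nContract OFF;

-- The old design had exactly one vehicle per contract, so one row each here
INSERT INTO dbo.nContractedAsset (ContractId, VehicleId)
SELECT ContractId, VehicleId
FROM dbo.Contract;

-- The single rate on the old contract becomes one row in the rates table
INSERT INTO dbo.nContractRate (ContractId, VehicleId, DailyRate)
SELECT ContractId, VehicleId, DailyRate
FROM dbo.Contract;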

Ideas for Combining Thousand Databases into One Database

We have a SQL server that has a database for each client, and we have hundreds of clients. So imagine the following: database001, database002, database003, ..., database999. We want to combine all of these databases into one database.
Our thoughts are to add a siteId column, 001, 002, 003, ..., 999.
We are exploring options to make this transition as smoothly as possible. And we would LOVE to hear any ideas you have. It's proving to be a VERY challenging problem.
I've heard of a technique that would create a view to match the old structure and then filter.
Any ideas guys?
Create a client database id for each of the client databases. You will use this id to keep the data logically separated. This is the "site id" concept, but you can use a derived key (an identity field) instead of manually assigning those numbers. Create a table that holds the database name and its id, along with any other metadata you need.
The next step would be to create an SSIS package that looks up the id for the database in question and adds it to the tables whose data has to be separated out logically. You can then run that same package over each database, with the lookup supplying the id for each one.
Once the data carries its unique id and has been imported, you will have to alter your apps to fit the new schema (actually before, or you are pretty much screwed).
If you want to do this in steps, you can create views or functions in the different "databases" so the old client can still hit its data even though it has been moved. This step may not be necessary if you deploy with some downtime.
The method I propose is fairly flexible and can be applied one client at a time, depending on your client application deployment methodology.
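A sketch of what that could look like on the consolidated side, with assumed table and column names (ClientDatabase, Orders and so on are placeholders):

CREATE TABLE dbo.ClientDatabase (
    ClientDatabaseId INT IDENTITY(1,1) PRIMARY KEY,   -- the derived "site id"
    DatabaseName     SYSNAME NOT NULL UNIQUE
);

-- Every consolidated table carries the identifier, and the old per-database keys
-- stay unique because the identifier is part of the primary key:
CREATE TABLE dbo.Orders (
    ClientDatabaseId INT NOT NULL
        REFERENCES dbo.ClientDatabase (ClientDatabaseId),
    OrderId          INT NOT NULL,
    OrderDate        DATETIME2 NOT NULL,
    PRIMARY KEY (ClientDatabaseId, OrderId)
);

-- The SSIS package (or a plain INSERT ... SELECT per source database) stamps the id:
-- INSERT INTO Consolidated.dbo.Orders (ClientDatabaseId, OrderId, OrderDate)
-- SELECT @ClientDatabaseId, OrderId, OrderDate FROM database001.dbo.Orders;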
Why do you want to do that?
You can read about Multi-Tenant Data Architecture and also listen to SO #19 (around 40-50 min) about this design.
The "site-id" solution is what's done.
Another possibility that may not work out as well (but is still appealing) is multiple schemas within a single database. You can pull common tables into a "common" schema and leave the customer-specific stuff in customer-specific schemas. In some database products, however, each schema is effectively a separate database. In other products (Oracle and DB2, for example) you can easily write queries that work across multiple schemas.
Also note that -- as an optimization -- you may not need to add siteId column to EVERY table.
Sometimes you have a "contains" relationship: a master-detail FK, often defined with a cascade delete so that the detail cannot exist without the parent. In this case, the children don't need a siteId because they have no independent existence.
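A small sketch of that case, with hypothetical Order/OrderLine names: the detail rows are only ever reached through their parent, so only the parent carries the siteId:

CREATE TABLE dbo.[Order] (
    OrderId INT IDENTITY(1,1) PRIMARY KEY,
    SiteId  INT NOT NULL             -- the site identifier lives on the parent
);

CREATE TABLE dbo.OrderLine (
    OrderLineId INT IDENTITY(1,1) PRIMARY KEY,
    OrderId     INT NOT NULL
        REFERENCES dbo.[Order] (OrderId) ON DELETE CASCADE,
    LineNo      INT NOT NULL,
    Quantity    INT NOT NULL
    -- no SiteId here: a line cannot exist without its order, so the site is
    -- always known from the parent row
);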
Your first step will be to determine whether these databases even have the same structure. Even if you think they do, you need to compare them to make sure. Chances are there will be some that were customized or missed an upgrade cycle or two.
Now, depending on the number of clients and the number of records per client, your tables may get huge. Are you sure this will not create a performance problem? At any rate you may need to take a fresh look at indexing. You may need a much more powerful set of servers, and you may also need to partition by client anyway for performance.
Next, yes, each table will need a site id of some sort. Further, depending on your design, you may have primary keys that are no longer unique, so you may need to redefine all primary keys to include the siteId. Always index this field when you add it.
Now all your queries, stored procs, views and UDFs will need to be rewritten to ensure that the siteId is part of them. Pay particular attention to any dynamic SQL. Otherwise you could be showing client A's information to client B, and clients don't tend to like that. We brought a client from a separate database into the main application one time (when they decided they no longer wanted to pay for a separate server), and the developer missed just one place where the client_id had to be added. Unfortunately, that sent emails about this client's proprietary information to every client, and to make matters worse, it was a nightly process that ran in the middle of the night, so it wasn't noticed until the next day. (The developer was very lucky not to get fired.) The point is: be very, very careful when you do this, and test, test, test, and then test some more. Make sure to test all the automated behind-the-scenes processes as well as the UI.
What I was explaining in Florence towards the end of last year was what to do if you have to keep the database names and the logical layer of the database the same for the application. In that case you'd do the following:
Collapse all the data into consolidated tables in one master, consolidated database (hereafter referred to as the consolidated DB).
Those tables would have to have an identifier like SiteID.
Create the new databases with the existing names.
Create views with the old table names which use row-level security to query the tables in the consolidated DB, using the SiteID to filter.
Set up the databases for cross-database ownership chaining so that the service accounts can't "accidentally" query the base tables in the consolidated DB. Access must happen through the views or through stored procedures and other constructs that enforce the row-level security. If it's the same service account for all sites, you can skip the cross-database ownership chaining and assign the rights directly on the objects in the consolidated DB.
Rewrite the stored procedures to either handle the change (since they now refer to views and can't simply hit the base tables and include SiteID) or use INSTEAD OF triggers on the views to intercept update requests and put the appropriate site-specific information into the base tables.
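As a rough illustration of the view step, with hypothetical names: each recreated per-site database gets a view with the old table name that exposes only that site's rows from the consolidated DB (here with the SiteID fixed per site database; a row-level security policy is the other common way to enforce the filter):

-- Inside the recreated per-site database, e.g. database042:
CREATE VIEW dbo.Orders
AS
SELECT OrderId, OrderDate, Amount
FROM ConsolidatedDB.dbo.Orders
WHERE SiteID = 42;   -- this site's identifier

Reads keep working against the old names; writes need the rewritten stored procedures or the INSTEAD OF triggers mentioned in the last step.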
If the data is large, you could look at using a partitioned view. This would simplify your access code, since all you'd have to maintain is the view; however, if the data is not large, just add a column to identify the customer.
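A minimal sketch of a local partitioned view, assuming hypothetical member tables split by customer range, with mutually exclusive CHECK constraints so the optimizer only touches the relevant table:

CREATE TABLE dbo.Orders_1 (
    CustomerId INT NOT NULL CHECK (CustomerId BETWEEN 1 AND 499),
    OrderId    INT NOT NULL,
    PRIMARY KEY (CustomerId, OrderId)
);

CREATE TABLE dbo.Orders_2 (
    CustomerId INT NOT NULL CHECK (CustomerId BETWEEN 500 AND 999),
    OrderId    INT NOT NULL,
    PRIMARY KEY (CustomerId, OrderId)
);
GO

CREATE VIEW dbo.Orders
AS
SELECT CustomerId, OrderId FROM dbo.Orders_1
UNION ALL
SELECT CustomerId, OrderId FROM dbo.Orders_2;
-- Queries filtered on CustomerId are pruned to the matching member table.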
Depending on what the data is and your security requirements, the threat of cross-contamination may be a show stopper.
Assuming you have considered this and deem it "safe enough", you may need or want to create views or impose some other access control to prevent customers from seeing each other's data.
IIRC a product called "Trusted Oracle" had the ability to partition data based on such a key (around the time Oracle 7 or 8 was out). The idea was that any given query would automagically have "and sourceKey = #userSecurityKey" (or some such) appended. The feature may have been rolled into later versions of the popular commercial product.
To expand on Gregory's answer, you can also create a parent SSIS package that calls the package doing the actual moving inside a Foreach Loop container.
The parent package queries a config table and puts the result in an object variable. The Foreach Loop then uses this recordset to pass variables to the child package, such as the database name and any other details the package might need.
Your table could list all of your client databases and have a flag to mark when you are ready to move them. That way you are not sitting around running the SSIS package on 32,767 databases. I'm hooked on the Foreach Loop in SSIS.
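A sketch of the kind of config table the parent package could query, with assumed names:

CREATE TABLE dbo.MigrationQueue (
    DatabaseName SYSNAME   NOT NULL PRIMARY KEY,
    ReadyToMove  BIT       NOT NULL DEFAULT 0,
    MovedAt      DATETIME2 NULL
);

-- The parent package loads only the databases flagged as ready into its object
-- variable, and the Foreach Loop iterates over that result set:
-- SELECT DatabaseName
-- FROM dbo.MigrationQueue
-- WHERE ReadyToMove = 1 AND MovedAt IS NULL;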