Multitenancy data architecture [closed] - sql

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 9 years ago.
We are currently working on a booking management system. It is a multitenant application, and there will be around 50 tenants.
We are planning to host this ASP.NET MVC4 / SQL Server 2008 application with a hosting provider such as winasp.net (yet to be decided).
Business Model Diagram
There are several levels of users, such as Super Admin, Tenant Admin, Customer Service, and Doctors, as described in the diagram above.
To achieve this as a database model, we chose the Shared Database with Shared Schema approach described in MSDN Multitenant Data Architecture.
This means we added a TenantId column to each table.
Our shared database & shared schema decision was based on the following:
Number of tenants (50+)
Easy to share common metadata between tenants
Ability to move a big tenant (one or two) into a separate instance if it accumulates a large volume of data
We are now in progress, but we are still worried about the following issues:
Data security -> we need to pass/check the TenantId every time
Backup for a single tenant -> we need to write SQL queries for the backup, and handling foreign keys / auto-increment columns during backup is a headache
Data volume -> a single database stores all tenants' data, so querying can be slow
Indexing -> not sure whether we need to index the TenantId column in each table, since it appears in every WHERE clause
There are other options, such as:
One database per tenant
Shared database, separate schemas
This article also describes some further approaches.
We would like some advice on a better design than our current one, considering:
The new approach should match the business diagram above
A tenant admin / customer service user must be able to see sub-tenant records
Query performance
Common metadata shared between tenants
Tenant-specific metadata
Tenant-specific data fields (optional)
Easy backup

It seems to me that you should revisit your decision to use a shared database. If you have a requirement for strict data separation for confidentiality reasons, then you should use separate databases.
Indexing (not sure whether we need to index the TenantId column in each table, since it appears in every WHERE clause)
Yes, you will have to index TenantId in each table and include it in all the queries.
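To illustrate the shared-schema pattern, here is a minimal sketch using SQLite via Python for brevity (the table and column names are made up; in SQL Server you would declare the same composite index with CREATE INDEX). The point is that a composite index led by TenantId supports the `WHERE TenantId = ...` predicate that every query must carry.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Every tenant-owned table carries a TenantId column.
cur.execute("""
    CREATE TABLE Booking (
        BookingId INTEGER PRIMARY KEY,
        TenantId  INTEGER NOT NULL,
        Patient   TEXT    NOT NULL
    )
""")

# Lead the index with TenantId, since every query filters on it; adding other
# frequently-searched columns after it lets those lookups stay within one tenant.
cur.execute("CREATE INDEX IX_Booking_Tenant ON Booking (TenantId, Patient)")

cur.executemany("INSERT INTO Booking (TenantId, Patient) VALUES (?, ?)",
                [(1, "Alice"), (1, "Bob"), (2, "Carol")])

# Every query must include the TenantId predicate.
rows = cur.execute("SELECT Patient FROM Booking WHERE TenantId = ?", (1,)).fetchall()
print(rows)  # [('Alice',), ('Bob',)]
```

Leading the index with TenantId means each per-tenant lookup seeks straight into that tenant's slice of the table instead of scanning other tenants' rows.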
Also, it looks like you decided to use SQL Server before you analysed the requirements. There are probably more natural solutions for storing multitenant data, e.g. RavenDB, that would make sharding/backups much simpler. I don't want to start a discussion about NoSQL, etc.; I'm just suggesting that one should start with the requirements and choose the appropriate technology afterwards.

Related

How to store user for many application in same sql database? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
I have three applications (the number can grow), and they all use the same database (MySQL).
All of these applications must have a login.
So my question is: which approach follows database normalization?
One: create a user table for each application: user-app1, user-app2, user-app3. But that creates a lot of tables (is that okay?)
Two: create the user table once, with a column to indicate which apps the user is authorized for: app1, app3. (But that shares the same user across all the applications.)
Or should I do it a different way?
Creating a separate user table for each application would generally be the wrong thing to do. This approach would duplicate the same data (i.e. user information) across different tables.
In fact, what you would do is have two tables:
Users
UserApplications
The first would have one row per user. It would have all the information about the users -- name, date the user is created and so on.
The second would have one row per user and application the user has signed up for. It would have additional information, such as the date they signed up for that application.
This allows you both to extend the number of users quite easily and to extend the number of applications (or to remove applications).
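The two-table design above can be sketched as follows (column names are illustrative; shown with SQLite via Python for brevity, but the same DDL translates directly to MySQL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.executescript("""
    -- One row per user: everything that is true of the user everywhere.
    CREATE TABLE Users (
        UserId    INTEGER PRIMARY KEY,
        Name      TEXT NOT NULL,
        Email     TEXT UNIQUE,
        CreatedAt TEXT NOT NULL
    );
    -- One row per (user, application) signup.
    CREATE TABLE UserApplications (
        UserId   INTEGER NOT NULL REFERENCES Users (UserId),
        AppName  TEXT    NOT NULL,
        SignedUp TEXT    NOT NULL,
        PRIMARY KEY (UserId, AppName)
    );
""")

cur.execute("INSERT INTO Users VALUES (1, 'Ada', 'ada@example.com', '2024-01-01')")
cur.executemany("INSERT INTO UserApplications VALUES (?, ?, ?)",
                [(1, "app1", "2024-01-02"), (1, "app3", "2024-02-10")])

# Which applications has this user signed up for?
apps = [r[0] for r in cur.execute(
    "SELECT AppName FROM UserApplications WHERE UserId = ? ORDER BY AppName", (1,))]
print(apps)  # ['app1', 'app3']
```

Adding a fourth application is then a new row in UserApplications, not a new table.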
The best way to evaluate this is to consider whether:
the data you need to know about a user differs from one application to another
these users will use multiple applications together
If users have 80-90% similar properties, e.g. email, name, password hash, etc., and you expect that not to change, then approach 2 works best. If you also expect these users to use several of the applications, then it's definitely approach 2.
If users have very different properties, i.e. some users of app1 don't have emails but app2 requires them, and some users of app2 don't have names but app1 requires them, then you might want to keep them in separate tables for data cleanliness.

Creating a blog service or a persistent chat with Table Storage

I'm trying Azure Table Storage and can't come up with real-life scenarios in which I would use it. As far as I understand, the only index Table Storage has is the PartitionKey and RowKey. I can't sort or query on other columns without doing a full partition scan, right?
If I migrated my blog service from a traditional SQL Server or a richer NoSQL database like Mongo, I would probably be all right, considering users don't blog that much in one year (I would partition all blog posts per user per year, for example). Even if someone wrote around a thousand blog posts a year, I would be OK loading all their metadata into memory. I could do smarter partitioning if this didn't work well.
But if I migrated my persistent chat service to Table Storage, how would I do that? Users post thousands of messages a day and query the history fairly often from desktop clients, mobile devices, the web site, etc. I don't want to lose out here and only return one day of history with paging (which can be slow as well).
Any ideas or patterns, or what am I missing here?
By the way, I could always use a different database, but considering how cheap Table Storage is, I'd rather not.
PartitionKey and RowKey values are the only two indexed properties. To work around the lack of secondary indexes, you can store multiple copies of each entity, with each copy using a different RowKey value. For instance, one entity will have PartitionKey=DepartmentName and RowKey=EmployeeID, while the other entity will have PartitionKey=DepartmentName and RowKey=EmailAddress. That will allow you to look up an employee either by EmployeeID or by email address. The Azure Storage Table Design Guide (http://azure.microsoft.com/en-us/documentation/articles/storage-table-design-guide/) has more detailed examples and all the information you need to design scalable and performant tables.
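The duplicate-entity pattern can be sketched in plain Python (no Azure SDK involved; the table is simply modelled as a dict keyed by (PartitionKey, RowKey), and the property names are invented for illustration):

```python
# The "table" maps (PartitionKey, RowKey) -> entity; writing each employee
# twice gives two point-lookup paths within the same partition.
table = {}

def insert_employee(department, employee_id, email, name):
    entity = {"Name": name, "EmployeeID": employee_id, "Email": email}
    # Copy 1: look up by EmployeeID.
    table[(department, f"empid_{employee_id}")] = entity
    # Copy 2: look up by email address.
    table[(department, f"email_{email}")] = entity

insert_employee("Sales", "42", "jo@example.com", "Jo")

# Both keys resolve to the same logical employee, without any scan.
by_id    = table[("Sales", "empid_42")]
by_email = table[("Sales", "email_jo@example.com")]
assert by_id["Name"] == by_email["Name"] == "Jo"
```

The cost of the pattern is write amplification: every update must touch all copies, which in real Table Storage you would do in one entity-group transaction since the copies share a PartitionKey.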
We will need more information to answer your second question about how you would migrate contents of your chat service to table storage. We need to understand the format and structure of the data that you currently store in your chat service.

PET technology with Fluent NHibernate

For a web application (with some genuinely private data) we want to use privacy-enhancing technology to limit the damage if someone gains access to our database.
The application is built in several layers, and (as the topic title says) we use Fluent NHibernate to connect to our database; we've created our own wrapper class to create queries.
Security is a big issue for the kind of application we're building. I'll try to explain the setup with a simple example:
Our customers have clients in their application (each installation of the application uses its own database), for whom some sensitive data is stored; there is a client table and a person table, which are linked.
The base table, which links to the other tables (there will be hundreds of them soon) and probably contains sensitive data, is the client table.
At the moment, a client has a client_id and a table_id in the database; our customer only knows the client_id, and the system links the data via the table_id, which is unknown to the user.
What we want to ensure:
A hypothetical attacker who gained access to our database should not be able to see the link between the customer and the other tables just by opening the database. So there should be some kind of "hidden link" between the customer and the other tables. The personal data and all the other sensitive tables should not be obviously linked together.
Because of the sensitivity of the data, we're looking for a more robust solution than "statically hash the table_id and use it in the other tables", so that when one of the persons is linked to the corresponding client, all the other clients' data is not compromised too.
Ultimately, the customer table should not be linkable to the other tables at all from inside the database; the application code should be required to link the tables.
To accomplish this we've been looking into different methods, but because of the many tables linked to this client, and further development (and thus probably even more tables), we're looking for a centralised solution. That's why we concluded this should be handled in the database connector. Searching the internet and here on Stack Overflow did not point us in the right direction; perhaps we couldn't find it because of the wrong search terms (PET, privacy enhancing technology, combined with NHibernate did not give us any leads).
How can we accomplish our goals in this specific situation, or where should we search to help us fix this?
We had a similar requirement for our application, and what we ended up using was database schemas.
We have one database, and each customer has a separate schema in which all of that customer's data is stored. It is possible to link from the schema to the rest of the database, but not to other schemas.
Security can be set for each schema separately, so you can make a hacker's life harder.
That being said, I can also imagine a solution where you let NHibernate encrypt every piece of data it sends to the database and decrypt everything it gets back. The data would be stored safely, but it would be very difficult to query over it.
So there is probably not a single answer to this question; you have to decide which is better: not being able to query, or just making it harder for a hacker to get to the data.
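One concrete variant of the "more robust than a static hash" idea from the question is a keyed hash. This is a hypothetical sketch (the secret, function, and identifiers are invented for illustration): the link key is an HMAC whose secret lives only in the application tier, so a database-only attacker cannot recompute the join.

```python
import hashlib
import hmac

# Assumption: this secret is held in application configuration / a key vault,
# never stored in the database itself.
SECRET = b"app-tier-secret-never-stored-in-db"

def link_key(client_id: int) -> str:
    """Derive the opaque key that the sensitive tables store instead of client_id."""
    return hmac.new(SECRET, str(client_id).encode(), hashlib.sha256).hexdigest()

# The client table stores client_id; the sensitive tables store link_key(client_id).
# Unlike a static (unkeyed) hash, an attacker with only the database contents
# cannot enumerate client_ids and recompute the mapping, and exposing one
# mapping does not reveal the others.
k1, k2 = link_key(1), link_key(2)
assert k1 != k2 and link_key(1) == k1
```

In the NHibernate setting, such a derivation could live in a custom user type or in the wrapper class mentioned in the question, so the translation happens centrally in the connector layer.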

Working of Login System in Large Applications [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I am really curious how the login system works in large applications like Facebook, Gmail, YouTube, Yahoo, etc. After entering credentials, the server responds very quickly. How is that possible?
There must be multiple DB servers storing the user information. So my questions are:
How do they look up authentication information across multiple DB servers?
Do they search all the DB servers for a particular user, and if so, how do they respond so quickly?
Do they allocate a DB server based on the geographical location of the user?
And do they also have multiple application servers, and how are these interconnected?
RDBMSs have the functionality to link servers and issue distributed queries, updates, commands, and transactions across heterogeneous data sources.
The database system will use some form of cached information about the user; in SQL Server, for example, an execution plan is stored and reused when a query is executed. The database management system decides which execution plan to take in order to generate the fastest results, or uses a cached data set. Note: Google, Facebook, Amazon, etc. have a lot of server processing power behind the scenes, which makes it seem instantaneous. They also have dedicated teams to manage their databases, maintain indexes, perform tuning and optimization, and identify bottlenecks.
The geographical location of the server can be a factor. The closer the server is to the user, the faster they can get the information, but IMO this is a matter of nano-/milliseconds of difference depending on where the data center is located. If a server gets too busy, the load balancer will migrate you and other users to a server with more available resources.
Yes. Using more than one web server is necessary in scenarios like this, and this ties into part 3 of the question: which server you hit depends on how many resources the closest server has available and whether it will accept your connection. The servers are distributed, but the whole process is transparent to the user, i.e. they think they are using the same server as every other client. The servers can be interconnected using session management, web services, and other interoperability techniques and technologies.

Database Client-Specific Tables vs. Relational Tables

I have a scenario: my application is a SaaS app catering to multiple clients, and data integrity for clients is essential.
Is it better to keep my tables
client-specific
OR
relational?
For example: I have a mapping table with fields MapField1 and MapField2, and I need this kind of data for each client.
Should I have tables like MappingData_ per client,
or a single table with a mapping to the ClientId:
MappingData with fields MapField1, MapField2, ClientId
I would have a separate database for each customer. (Multiple databases in a single SQL Server instance.)
This would allow you to design it once, with a single schema.
No dynamically named tables compromising test & development
Upgrades and maintenance can be designed and tested in one DB, then rolled out to all
A single customer's data can be backed-up, restored or dropped exceedingly simply
Bugs discovered/exploited in one DB won't compromise the integrity of other DBs
Data access (read and write) can be managed using SQL Logins (No re-inventing the wheel)
If there is a need for globally shared data, that would go in another database, with its own set of permissions for the different SQL Logins.
The use of a single database with all users in it is my next-best choice. You still have a single schema, but you don't get to partition the customers' data, you need to manage access rights and permissions yourself, plus a whole host of other additional design and testing work.
I would never go near dynamically creating new tables for additional customers. A new table name means all your queries need to be updated with the new table name, along with a whole host of other maintenance headaches.
I'm pretty much of the opinion that if you want to create tables dynamically during the business-as-usual use of an application/service, you've designed it badly.
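A minimal sketch of the database-per-customer routing described above (hypothetical names; SQLite in-memory databases stand in for separate SQL Server databases): the application resolves a connection per customer at request time, while every database shares the one schema.

```python
import sqlite3

# One identical schema, deployed to every customer's database.
SCHEMA = "CREATE TABLE IF NOT EXISTS MappingData (MapField1 TEXT, MapField2 TEXT)"

_connections = {}

def db_for(customer: str) -> sqlite3.Connection:
    """Resolve (and lazily create) the database for one customer."""
    if customer not in _connections:
        conn = sqlite3.connect(":memory:")   # per-customer database
        conn.execute(SCHEMA)                 # design once, roll out everywhere
        _connections[customer] = conn
    return _connections[customer]

# Each customer's data lives in its own database; no TenantId column,
# no dynamically named tables, and per-customer backup/restore is trivial.
db_for("acme").execute("INSERT INTO MappingData VALUES ('a', 'b')")
rows_acme  = db_for("acme").execute("SELECT * FROM MappingData").fetchall()
rows_other = db_for("globex").execute("SELECT * FROM MappingData").fetchall()
print(rows_acme, rows_other)  # [('a', 'b')] []
```

In a real deployment the dict would be a connection-string lookup (per customer) against one SQL Server instance hosting the multiple databases.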
SO has a tag for the thing you're describing: "multi-tenant".
Visualize the architecture for supporting a multi-tenant database application as a spectrum. At one extreme of the spectrum is "shared nothing", which means each tenant has its own database. At the other extreme of the spectrum is "shared everything", which means tenants share tables, and each row in each table belongs to one tenant. (Each row contains a tenant identifier.)
Terminology seems to overlap, so read carefully. What one writer means by shared schema might be identical to what another writer means by shared everything.
This SO answer, also written by me, describes the differences and the tradeoffs in terms of cost, data isolation and protection, maintenance, and disaster recovery. It also links to a fairly good introductory article.