I'm building a Microsoft SQL Server database that initially served only one client, but I'm now looking to support many (up to several thousand if things go well). The entire structure will be the same for each client, with only the data within each table being client-specific.
I am thinking of adding a ClientID column to almost all tables and referencing it in all procedures (basically a WHERE ClientID = @ClientID on every statement), along with a Clients table that gains a new entry for every new client.
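Roughly, the first option would look like this (the table and column names below are just placeholders, and @ClientID would be a parameter on every procedure):

    CREATE TABLE Clients (
        ClientID   INT IDENTITY(1,1) PRIMARY KEY,
        ClientName NVARCHAR(100) NOT NULL
    );

    -- Every client-owned table carries the ClientID
    CREATE TABLE Orders (
        OrderID   INT IDENTITY(1,1) PRIMARY KEY,
        ClientID  INT NOT NULL REFERENCES Clients (ClientID),
        OrderDate DATETIME NOT NULL
    );

    -- ...and every statement filters on it
    SELECT OrderID, OrderDate
    FROM Orders
    WHERE ClientID = @ClientID;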
The alternative is a CREATE DATABASE [Client_Name] script that is fired whenever a new client joins the server, creating another client-specific database with all of its associated structure and procedures.
Is there any advantage, performance-wise, to either option?
The decision on how to structure such a database should not be made on performance grounds alone. In fact, performance is probably the least of the issues. Some things to consider:
How will you manage updates to your application? Multiple databases can make this easier or harder.
Will individual clients have customizations? This favors multiple databases.
What are the security requirements for the data? This can go either way.
What are the replication and recovery requirements for the data? This would tend to be easier with one database, but not in all scenarios.
Will concurrent usage by different clients interfere with each other?
Will clients be responsible for managing their own data or is this part of your offering?
Is any data shared among clients? How will you maintain common reference tables?
In general, performance is going to be better with a single database (with many small databases, think of all the half-filled data pages occupying memory). Maintenance and development will also be easier with a single database, since managing multiple client databases is cumbersome. But the actual requirements of the application should be driving this decision.
I'm currently developing a service for an app with WCF. I want to host the data on Windows Azure, and it should hold data from different users. I'm searching for the right design for my database. In my opinion, there are only two possibilities:
Create a new database for every customer
Store a customer ID in every table (or just in the main table, when every other table is connected to it via relationships)
The first approach gives very good speed and isolation, but it's very expensive on Windows Azure (or am I misunderstanding the Azure pricing?). Also, I don't know how to configure a WCF service so that it always uses a different database.
The second approach is slower and the isolation is poorer, but it's easy to implement and cheaper.
Now to my question:
Is there any other way to get strong isolation of the data and still have easy integration into a WCF service on Azure?
What design should I use and why?
You have two additional options: build multiple schema containers within a database (see my blog post about this technique), or better still, use SQL Database Federations (you can use my open-source project, Enzo SQL Shard, to access federations). The links I am providing give you access to other options as well.
In the end it's a rather complex decision involving a tradeoff of performance, security, and manageability. I usually recommend Federations, even though it has its own set of limitations, because it is a flexible multitenant option for the cloud with the ability to filter data automatically. Check out the open-source project; you will see how to implement good separation of customer data independently of the physical storage.
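To make the first of those options concrete, here is a minimal sketch of schema containers within one database (the tenant names are placeholders, and it assumes a login called Tenant001Login already exists):

    -- One schema per tenant, with the same table structure repeated in each
    CREATE SCHEMA Tenant001;
    GO
    CREATE TABLE Tenant001.Orders (
        OrderID   INT IDENTITY(1,1) PRIMARY KEY,
        OrderDate DATETIME NOT NULL
    );
    GO
    -- Confine each tenant's user to its own schema
    -- (assumes the login Tenant001Login was created beforehand)
    CREATE USER Tenant001User FOR LOGIN Tenant001Login
        WITH DEFAULT_SCHEMA = Tenant001;
    GRANT SELECT, INSERT, UPDATE, DELETE ON SCHEMA::Tenant001 TO Tenant001User;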
I have a scenario: my application is a SaaS app catering to multiple clients, and data integrity between clients is essential.
Is it better to keep my tables client-specific, or to use shared relational tables?
For example: I have a mapping table with the fields MapField1 and MapField2, and I need this kind of data for each client.
Should I have per-client tables like MappingData_<ClientId>, or a single table with a mapping to the ClientId:
MappingData, with fields MapField1, MapField2, ClientId
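For the single-table option, a minimal sketch might look like this (it assumes a Clients table exists, and @ClientId is a parameter supplied by the application):

    CREATE TABLE MappingData (
        ClientId  INT NOT NULL REFERENCES Clients (ClientId),
        MapField1 NVARCHAR(50) NOT NULL,
        MapField2 NVARCHAR(50) NOT NULL,
        -- composite key keeps each client's mappings unique to that client
        PRIMARY KEY (ClientId, MapField1)
    );

    -- Every read and write is scoped to one client
    SELECT MapField1, MapField2
    FROM MappingData
    WHERE ClientId = @ClientId;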
I would have a separate database for each customer. (Multiple databases in a single SQL Server instance.)
This would allow you to design it once, with a single schema.
No dynamically named tables complicating testing and development
Upgrades and maintenance can be designed and tested on one DB, then rolled out to all
A single customer's data can be backed up, restored, or dropped extremely simply
Bugs discovered/exploited in one DB won't compromise the integrity of other DBs
Data access (read and write) can be managed using SQL logins (no re-inventing the wheel)
If there is a need for globally shared data, that would go in another database, with its own set of permissions for the different SQL logins.
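Provisioning a new customer under this approach could be as simple as the following sketch (all names are hypothetical; on versions before SQL Server 2012, use sp_addrolemember instead of ALTER ROLE):

    -- Create the customer's database and a login scoped to it
    CREATE DATABASE [Client_Acme];
    GO
    CREATE LOGIN [Client_Acme_Login] WITH PASSWORD = 'ChangeMe-Str0ng!';
    GO
    USE [Client_Acme];
    GO
    CREATE USER [Client_Acme_User] FOR LOGIN [Client_Acme_Login];
    ALTER ROLE db_datareader ADD MEMBER [Client_Acme_User];
    ALTER ROLE db_datawriter ADD MEMBER [Client_Acme_User];
    GO
    -- One customer's data can then be backed up (or restored/dropped) independently
    BACKUP DATABASE [Client_Acme] TO DISK = N'C:\Backups\Client_Acme.bak';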
Using a single database with all customers in it is my next-best choice. You still have a single schema, but you don't get to partition the customers' data, you have to manage access rights and permissions yourself, and you take on a whole host of additional design and testing work.
I would never go near dynamically creating new tables for additional customers. A new table name means all your queries need to be updated with that new table name, plus a whole host of other maintenance headaches.
I'm pretty much of the opinion that if you want to create tables dynamically during the Business As Usual use of an application/service, you've designed it badly.
SO has a tag for the thing you're describing: "multi-tenant".
Visualize the architecture for supporting a multi-tenant database application as a spectrum. At one extreme of the spectrum is "shared nothing", which means each tenant has its own database. At the other extreme of the spectrum is "shared everything", which means tenants share tables, and each row in each table belongs to one tenant. (Each row contains a tenant identifier.)
Terminology seems to overlap, so read carefully. What one writer means by shared schema might be identical to what another writer means by shared everything.
This SO answer, also written by me, describes the differences and the tradeoffs in terms of cost, data isolation and protection, maintenance, and disaster recovery. It also links to a fairly good introductory article.
I am working on a database for a monitoring application, and I have all the business logic sorted out. That's all well and good, but one of the requirements is that the monitoring data be completely stand-alone.
I'm using a local database on my web server for event handling and caching notifications. Since there is one event row per system in my monitoring database, it's easy to just get the ID and query the monitoring data if needed, and since this is something only my web server uses, integrity can be enforced externally. Querying is not an issue either, as all the relationships are one-to-one, so it's very straightforward.
My problem comes with user administration. My original plan had it in yet another database (to meet the requirement of leaving the monitoring database alone), but I don't think I was thinking straight when I came up with that. I can get the IDs of the systems a user has access to easily enough, but how can I then efficiently pass those to a query against the other database? Is there a solution for this? Building a chain of ORs seems like an ugly and bug-prone solution.
I assume this kind of problem isn't that uncommon? What do most developers do when they have to integrate different database servers? In any case, I am leaning towards just talking my employer into putting user administration data in the same database, but I want to know if this kind of thing can be done.
There are a few ways to accomplish what you are after:
Use concepts like linked servers (SQL Server - http://msdn.microsoft.com/en-us/library/ms188279.aspx)
Individual connection strings within your front end, driving the database layer
Use things like replication to duplicate the data
Also, multiple databases on a single database server instance doesn't seem like it would violate your business requirements, and given the details you have provided, I would investigate that as a starting point.
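To sketch that last option (all database, table, and column names below are made up, and @UserId is a parameter from the application): on the same instance you can join across databases with three-part names, and across instances a linked server gives you four-part names. Either way, a join replaces the chain of ORs mentioned in the question.

    -- Same instance: join the admin database to the monitoring database directly
    SELECT s.SystemId, s.LastReading
    FROM MonitoringDb.dbo.Systems AS s
    JOIN AdminDb.dbo.UserSystems AS us
        ON us.SystemId = s.SystemId
    WHERE us.UserId = @UserId;

    -- Separate instance: four-part name through a linked server
    SELECT s.SystemId, s.LastReading
    FROM [MonitoringServer].MonitoringDb.dbo.Systems AS s
    JOIN AdminDb.dbo.UserSystems AS us
        ON us.SystemId = s.SystemId
    WHERE us.UserId = @UserId;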
SQL Server 2008 database design problem.
I'm defining the architecture for a service where site users would manage a large volume of data on multiple websites that they own (100MB average, 1GB maximum per site). I am considering whether to split things up so that the core site management tables (users, payments, contact details, login details, products, etc.) are held in one database, and the data relating to the customers' own websites is held in a separate database.
I see a possible gain in that I can distribute the hardware to give more muscle to the heavy lifting done in the websites database, leaving the site management database on more modest kit. But I'm also conscious of losing the ability to directly relate the sites to the customers through a foreign key (as far as I know, this can't be done across databases - is that right?).
So the question is twofold: in general terms, should data in this sort of scenario be split out into multiple databases, or should it all be held in a single database?
If it is split into multiple databases, is there a recommended way to protect the integrity and security of the system at the database layer, so that there is a strong relationship between the two?
Thanks for your help.
This question, and thus my answer, may be close to the gray line of subjective, but at the least I think it would be common practice to separate out the 'admin' tables into their own DB for what it sounds like you're doing. If you can tie a client to a specific server and DB instance, then having separate DB instances opens up some easy paths for adding servers as you add clients. A single DB would require you to monkey with various clustering approaches if you got too big.
[edit] Building in the idea early that each client gets its own DB also sets the tone for how you develop, while it is still easy to make structural and organizational changes. Discovering two years from now that you need to do it will be a lot more painful. I've worked with split DBs plenty of times in the past, and they really aren't hard to deal with as long as you can establish some idea of what the context is. Here it sounds like you already have that: the client is the context.
Just my two cents, like I said, you could be close to subjective on this one.
Single Database Pros
One database to maintain. One database to rule them all, and in the darkness - bind them...
One connection string
Can use Clustering
Separate Database per Customer Pros
Support for customization on per customer basis
Security: no chance of customers seeing each other's data
Conclusion
The separate database approach would be valid if you plan to support per-customer customization; otherwise I don't see the value.
You can use a linked server to connect the databases.
Your architecture is smart.
If you can't use a linked server, you can always replicate critical data from the users database to the website database in a read-only mode.
Concerning security: the best way is to have a service layer between ASP (or whatever web language you use) and the database, so your databases stay pretty much isolated.
If you expect to have to split the databases across different hardware in the future because of heavy load, I'd say split it now. You can use replication to push copies of some of the tables from the main database to the site management databases. For now, you can run both databases on the same instance of SQL Server and later on, when you need to, you can move some of the databases to a separate machine as your volume grows.
Imagine we had infinitely fast computers: would you split your databases? Of course not. The only reason we split them is to make it easy to scale out at some point. You don't really have any choice here; 100MB-1000MB per client is huge.
Most questions about tenancy center on multi-tenant database design issues. I want to know about single tenancy but multiple applications. The software I'm developing allows a single user to create, from a single code base, multiple applications (I call them "sections"):
A user could create a blog at domain.com/application-blog1 and another blog at domain.com/application-blog2.
I've already decided on a single database for everything, but I am undecided whether I should use multiple tables for the different application instances or a single table, perhaps with a "sectionId" field to distinguish between them.
I'm using MySQL with MyISAM tables. Could storing everything in the same table lead to locking issues when many application instances are running?
What's your experience on the subject?
I don't think people typically use multiple tables in the same database. If you have multiple instances of the same application, you often have separate databases - but usually only when new instances are created administratively rather than through end-user actions. In that case, you'd put the name of the database into a configuration file and have the software connect to the right database.
In your case, I would go for the single-schema single-database approach, using sectionIds. This really is the same as multi-tenancy, perhaps minus the need to do access control.
You will of course get locking across concurrent operations (note that MyISAM doesn't support transactions and locks at the table level). However, this shouldn't cause conflicts, since operations for different sections won't touch each other's records (except when new sections are created - you'll probably have another table telling you which sections exist).
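A minimal sketch of that single-database, sectionId layout (the table and column names are invented for illustration; note that MyISAM parses but ignores FOREIGN KEY clauses, so referential integrity stays in the application):

    -- One row per section (application instance)
    CREATE TABLE sections (
        section_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        name       VARCHAR(100) NOT NULL
    ) ENGINE=MyISAM;

    -- All sections share one table, distinguished by section_id
    CREATE TABLE posts (
        post_id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        section_id INT UNSIGNED NOT NULL,
        title      VARCHAR(200) NOT NULL,
        KEY idx_section (section_id)  -- keeps per-section queries fast
    ) ENGINE=MyISAM;

    -- Every query is scoped to a single section
    SELECT post_id, title FROM posts WHERE section_id = 42;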