Split Database Security - sql

I'm working on a .NET MVC SQL application that will contain sensitive data, for example HIV test results or income. I want to error-proof this privacy as much as possible so that no one except the user can access it (think Joe the Plumber having his information hacked by a state employee).
I read here that splitting the database in two doesn't seem reasonable:
Is splitting databases a legitimate security measure?
although I've heard of it being done. If we could just use two tables instead, so much the better.
But when I say error-proofing, I mean making it impossible for ANYONE in our company to access both databases/tables. I'm thinking about putting access to the application code (which would access both databases) and to both databases in the hands of a deep-pockets third party (like PWC or EY) for when the government comes calling or some other real need to see both data sources comes along.
Anyone have any thoughts on the cleanest way to do this? We'd want to design the tables such that most queries would not require access to both data sources so the relative cost in throughput wouldn't be that much.

You can encrypt a column of data in SQL Server. So for the columns that hold the sensitive data, e.g. HIV test results or income, you can encrypt the data while storing it in the DB.
Check the details here:
http://msdn.microsoft.com/en-us/library/ms179331.aspx
http://msdn.microsoft.com/en-us/library/bb964742.aspx
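For reference, here's a minimal sketch of the approach those articles describe, using a symmetric key protected by a certificate. The key, certificate, table, and column names below are placeholders I made up for the example, not anything from your schema:

    -- One-time setup: database master key, certificate, and an AES key for the sensitive columns
    CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<use a strong password here>';
    CREATE CERTIFICATE PatientDataCert WITH SUBJECT = 'Protects sensitive patient columns';
    CREATE SYMMETRIC KEY PatientDataKey
        WITH ALGORITHM = AES_256
        ENCRYPTION BY CERTIFICATE PatientDataCert;

    -- Writing: encrypt the value as it is stored (the target column is varbinary)
    OPEN SYMMETRIC KEY PatientDataKey DECRYPTION BY CERTIFICATE PatientDataCert;
    UPDATE dbo.Patient
    SET HivResultEncrypted = ENCRYPTBYKEY(KEY_GUID('PatientDataKey'), HivResultPlain);
    CLOSE SYMMETRIC KEY PatientDataKey;

    -- Reading: only principals allowed to open the key can decrypt
    OPEN SYMMETRIC KEY PatientDataKey DECRYPTION BY CERTIFICATE PatientDataCert;
    SELECT CONVERT(nvarchar(50), DECRYPTBYKEY(HivResultEncrypted)) AS HivResult
    FROM dbo.Patient;
    CLOSE SYMMETRIC KEY PatientDataKey;

Keep in mind that anyone who has permission to open the key (or who administers the keys) can still decrypt, so on its own this protects the data at rest more than it protects against a privileged insider.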
Let me know if it helps.

Related

PET technology Fluent NHibernate

For a web application (with some real private data) we want to use privacy enhancing technology to limit the damage if someone gains access to our database.
The application is built in different layers, and (as the title says) we use Fluent NHibernate to connect to our database; we've also created our own wrapper class to create queries.
Security is a big issue for the kind of application we're building. I'll try to explain the setting by a simple example:
Our customers have clients in their application (each installation of the application uses its own database), for whom some sensitive data is stored; there is a client table and a person table, which are linked.
The base table, which links to the other tables (there will be hundreds of them soon, probably containing sensitive data), is the client table.
At this moment, the client has a client_id and a table_id in the database; our customer only knows the client_id, and the system links the data by the table_id, which is unknown to the user.
What we want to ensure:
A hacker who gained access to our database should not be able to see the link between the customer and the other tables just by opening the database. So there should be some kind of "hidden link" between the customer and the other tables: the personal data and all the other sensitive tables should not be obviously linked together.
Because of the sensitivity of the data we're looking for something more robust than "statically hash the table_id and use that in the other tables", so that when one of the persons is linked to the corresponding client, the data of all the other clients is not compromised too.
Ultimately, the customer table should not be linkable to the other tables at all just by working inside the database; the application code is needed to link the tables.
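To make this concrete, here's a rough sketch of the kind of scheme we have in mind (the table and column names are invented purely to illustrate, and this is not a design we've settled on): the person rows carry only an opaque token derived from the client_id using a secret held by the application, so the database alone cannot reproduce the join.

    -- The person table stores an opaque link token instead of a client_id.
    CREATE TABLE dbo.Client (
        client_id int PRIMARY KEY,
        name      nvarchar(100) NOT NULL
    );
    CREATE TABLE dbo.Person (
        person_id      int IDENTITY PRIMARY KEY,
        link_token     varbinary(32) NOT NULL,   -- keyed hash of the owning client_id
        sensitive_data nvarchar(200) NULL
    );

    -- The application supplies the secret at query time; it is never stored in the database,
    -- so someone who only has the database cannot rebuild the join below.
    DECLARE @secret varbinary(64) = 0x0123456789ABCDEF;  -- in reality held only by the application
    SELECT p.person_id, p.sensitive_data
    FROM dbo.Client AS c
    JOIN dbo.Person AS p
      ON p.link_token = HASHBYTES('SHA2_256', @secret + CONVERT(varbinary(4), c.client_id))
    WHERE c.client_id = 42;

We realise that in a sketch like this the secret still travels to the server with every query (and could show up in traces), which is part of why we're looking for a more robust, centralised approach.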
To accomplish this we've been looking into different methods, but because of the many tables linked to this client, and further development (thus probably even more tables), we're looking for a centralised solution. That's why we concluded this should be handled in the database connector. Searching on the internet and here on Stack Overflow did not point us in the right direction; perhaps we couldn't find it because of the wrong search terms ("PET" and "privacy enhancing technology" combined with "NHibernate" did not give us any leads).
How can we accomplish our goals in this specific situation, or where should we search to help us fix this?
We have a similar requirement for our application, and what we ended up with is using database schemas.
We have one database and each customer has a separate schema, where all the data for that customer is stored. It is possible to link from the schema to the rest of the database, but not to other schemas.
Security can be set for each schema separately so you can make the life of a hacker harder.
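As a rough sketch of what that looks like (the customer, user, and table names are invented for the example), each customer gets its own schema and its own database principal, and that principal is only granted rights on its own schema:

    -- One principal and one schema per customer.
    CREATE USER CustomerA_App WITHOUT LOGIN;
    GO
    CREATE SCHEMA CustomerA;
    GO
    CREATE TABLE CustomerA.Client (
        client_id int PRIMARY KEY,
        name      nvarchar(100) NOT NULL
    );
    GO
    -- CustomerA_App can only touch objects inside its own schema;
    -- it gets no permissions on any other customer's schema.
    GRANT SELECT, INSERT, UPDATE, DELETE ON SCHEMA::CustomerA TO CustomerA_App;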
That being said, I can also imagine a solution where you let NHibernate encrypt every piece of data it sends to the database and decrypt everything it gets back. The data would be stored safely, but it would be very difficult to query over the data.
So there is probably not a single answer to this question, and you have to decide what is better: Not being able to query, or just making it more difficult for a hacker to get to the data.

Local SQL database interface to cloud database

Excuse me if the question is simple. We have multiple medical clinics, each running their own SQL database EHR.
Is there any way I can interface each local SQL database with a cloud system?
I essentially want to use the data of the patient one is consulting at that moment to generate a pathology request that links to a cloud (Google App Engine?) database.
As a medical student / software developer this project of yours interests me greatly!
If you don't mind me asking, where are you based? I'm from the UK and unfortunately there's just no way a system like this would get off the ground as most data is locked in proprietary databases.
What you're talking about is fairly complex anyway, whatever country you're in I assume there would have to be a lot of checks / security around any cloud system that dealt with patient data. Theoretically though, what you would want to do ideally is create an online database (cloud, hosted, intranet etc), and scrap the local databases entirely.
You then have one 'pool' of data each clinic can pull information from (i.e. ALL records for patient #3563). They could then edit that data and/or insert new records and SAVE them, exporting them back to the main database.
If there is a need to keep certain information private to one clinic only, this could still be achieved in one database in a number of ways, or you could retain parts of the local database and have them merge with the cloud data as they're requested by the clinic.
This might be a bit outdated, but you guys should check out https://www.firebase.com/. It would let you do what you want fairly easily. We just did this for a client in the exact same business you are in.
Basically, Firebase lets you work with a central database in the cloud that is automatically synchronised with all its front ends. It even handles losing the connection to the server automagically. It's the best solution I've found so far to keep several systems running against a single cloud database.
We used to have our own backend that would try its best to sync changes, but you need to be really careful with inter-system unique IDs for your tables (i.e. creating a new user at one of the branches must not yield the same id as one that already exists in any other branch or in the central database). It becomes cumbersome very quickly.
CakePHP can generate this kind of unique ID pretty easily and automatically, but you still have to work on syncing all the local databases with the central repository.
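If you do end up rolling your own sync against SQL Server instead, one common way to sidestep the colliding-ID problem is to use GUID keys, so a row created at any branch can be merged into the central database without renumbering. A minimal sketch, with invented table and column names:

    -- Rows created offline at any clinic get globally unique keys,
    -- so merging them into the central database never causes key collisions.
    CREATE TABLE dbo.PathologyRequest (
        request_id uniqueidentifier NOT NULL
            CONSTRAINT DF_PathologyRequest_Id DEFAULT NEWID()
            CONSTRAINT PK_PathologyRequest PRIMARY KEY,
        patient_id uniqueidentifier NOT NULL,
        created_at datetime2 NOT NULL
            CONSTRAINT DF_PathologyRequest_CreatedAt DEFAULT SYSUTCDATETIME()
    );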

Querying multiple database servers?

I am working on a database for a monitoring application, and I have all the business logic sorted out. It's all well and good, but one of the requirements is that the monitoring data is to be completely stand-alone.
I'm using a local database on my web server to do some event handling and notification caching. Since there is one event row per system in my monitor database, it's easy to just get the id and query the monitoring data if needed, and since this is something only my web server uses, integrity can be enforced externally. Querying is not an issue either, as all the relationships are one-to-one, so it's very straightforward.
My problem comes with user administration. My original plan had it in yet another database (to meet the requirement of leaving the monitoring database alone), but I don't think I was thinking straight when I came up with that. I can get all the ids of the systems a user has access to easily enough, but how can I then efficiently pass that to a query on the other database? Is there a solution for this? Making a chain of ORs seems like an ugly and buggy approach.
I assume this kind of problem isn't that uncommon? What do most developers do when they have to integrate different database servers? In any case, I am leaning towards just talking my employer into putting user administration data in the same database, but I want to know if this kind of thing can be done.
There are a few ways to accomplish what you are after:
Use concepts like linked servers (SQL Server - http://msdn.microsoft.com/en-us/library/ms188279.aspx)
Individual connection strings within your front end driving the database layer
Use things like replication to duplicate the data
Also, the concept of multiple databases on a single database server instance does not seem like it would violate your business requirements, and I would investigate that as a starting point, given the details you have provided.
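To make the linked-server option concrete, here's a hedged sketch of registering a remote monitoring instance and joining across it; every server, database, table, and column name below is invented for the example:

    -- Register the monitoring instance once on the web server's SQL Server instance.
    EXEC sp_addlinkedserver @server = N'MONITORSRV', @srvproduct = N'SQL Server';

    -- Filter the monitoring data by the systems this user may see,
    -- without building a long chain of ORs in application code.
    DECLARE @user_id int = 42;
    SELECT e.event_id, e.raised_at, s.system_name
    FROM dbo.UserSystemAccess AS ua                    -- local user-administration table
    JOIN MONITORSRV.MonitorDb.dbo.Systems AS s         -- four-part name via the linked server
      ON s.system_id = ua.system_id
    JOIN MONITORSRV.MonitorDb.dbo.Events AS e
      ON e.system_id = s.system_id
    WHERE ua.user_id = @user_id;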

How to divide responsibility between LDAP and RDBMS

I'm a lead developer on a project which is building web applications for my company's SaaS offering. We are currently using LDAP to store user data such as IDs, passwords, contact details, preferences and other user-specific data.
One of the applications we are building is a reporting service that will both collect and present management information to our end users. Obviously this service will require an RDBMS, but it will also need to access user data stored in LDAP.
As I see it we have two basic implementation options:
Duplicate user data in both LDAP and the RDBMS.
Have the reporting service access LDAP whenever it needs user data.
Although duplicating data (and implementing the mechanisms to make this happen) as suggested in option 1 seems the wrong way to go, my gut feeling is that option 2 would not perform well enough (how do you 'join' LDAP data to RDBMS data as efficiently as a pure RDBMS implementation?).
I did find a related question but I'm still unsure which approach to take. I'd be interested in seeing what people thought of either option or perhaps other options.
Why would you feel that duplicating data would be the wrong way to go? Reporting tools (web based and otherwise) are mostly built around RDBMS's, so any mix'n'match will introduce unnecessary complexities. Reports are likely to need to be changed fairly frequently (from experience), so you want them to be as simple as possible. The data you store about users is unlikely to change its format very often, so once you have your import function working, you won't need to touch it again.
The only obstacle I can see is latency: how do you ensure that your RDBMS copy is up to date? You might need to ensure that your updating code writes to both destinations. Personally, also, I wouldn't necessarily use LDAP for application specific personal preferences: LDAP can't handle transactions, so what happens when data is updated from several directions? (Transactionality is of course also a problem with letting updaters write to both stores...) I'd rather let the RDBMS be the master for most data, and let LDAP worry only about identity, credentials and entitlements, which are rarely changed and only for one set of purposes. For myself, LDAP's ability to deal with hierarchical data isn't all that great a selling point.
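If you do go the duplication route, the import function can stay very small. A hedged sketch, assuming some job first drops the LDAP export into a staging table (all table and column names here are invented):

    -- Upsert the LDAP export into the reporting copy of the user data.
    MERGE dbo.ReportUser AS target
    USING staging.LdapUser AS source
        ON target.user_id = source.user_id
    WHEN MATCHED THEN
        UPDATE SET target.display_name = source.display_name,
                   target.email        = source.email
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (user_id, display_name, email)
        VALUES (source.user_id, source.display_name, source.email)
    WHEN NOT MATCHED BY SOURCE THEN
        DELETE;   -- remove users that no longer exist in LDAP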
Data duplication is not always a bad thing, especially when the usage scenarios are different enough.

Single or multiple databases

SQL Server 2008 database design problem.
I'm defining the architecture for a service where site users would manage a large volume of data on multiple websites that they own (100MB average, 1GB maximum per site). I am considering whether to split the databases up such that the core site management tables (users, payments, contact details, login details, products etc) are held in one database, and the database relating to the customer's own websites is held in a separate database.
I am seeing a possible gain in that I can distribute the hardware architecture to give more muscle to the heavy lifting done in the websites database, leaving the site management database in a more appropriate area. But I'm also conscious of losing the ability to directly relate the sites to the customers through a foreign key (as far as I know this can't be done cross-database?).
So, the question is two fold - in general terms should data in this sort of scenario be split out into multiple databases, or should it all be held in a single database?
If it is split into multiple, is there a recommended way to protect the integrity and security of the system at the database layer to ensure that there is a strong relationship between the two?
Thanks for your help.
This question and thus my answer may be close to the gray line of subjective, but at the least I think it would be common practice to separate out the 'admin' tables into their own db for what it sounds like you're doing. If you can tie a client to a specific server and db instance, then having separate db instances opens up some easy paths for adding servers as you add clients. A single db would require you to monkey with various clustering approaches if you got too big.
[edit] Building in the idea early that each client gets its own DB also sets the tone for how you develop, while it is still easy to make structural and organizational changes. Discovering two years from now that you need to do it will be a lot more painful. I've worked with split dbs plenty of times in the past, and it really isn't hard to deal with as long as you can establish some idea of what the context is. Here it sounds like you already have the idea that the client is the context.
Just my two cents, like I said, you could be close to subjective on this one.
Single Database Pros
One database to maintain. One database to rule them all, and in the darkness - bind them...
One connection string
Can use Clustering
Separate Database per Customer Pros
Support for customization on per customer basis
Security: No chance of customers seeing each others data
Conclusion
The separate database approach would be valid if you plan to support per-customer customization. Otherwise I don't see the value.
You can use a link (e.g. a linked server) to connect the databases.
Your architecture is smart.
If you can't use a link, you can always replicate critical data from the users database to the website database in read-only mode.
Concerning security: the best way is to have a service layer between ASP (or another web language) and the databases, so your databases will be pretty much isolated.
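On the asker's point that a foreign key can't span databases: that's true in SQL Server, but if you really want some enforcement at the database layer you can approximate the check with a trigger. A rough sketch, with the database, table, and column names invented:

    USE SiteDb;   -- the database holding the customers' website data
    GO
    -- No cross-database foreign key is possible, so verify that the customer
    -- exists in the management database whenever a site row is written.
    CREATE TRIGGER dbo.trg_Site_CheckCustomer
    ON dbo.Site
    AFTER INSERT, UPDATE
    AS
    BEGIN
        IF EXISTS (SELECT 1
                   FROM inserted AS i
                   LEFT JOIN ManagementDb.dbo.Customer AS c
                     ON c.customer_id = i.customer_id
                   WHERE c.customer_id IS NULL)
        BEGIN
            RAISERROR('Unknown customer_id for site.', 16, 1);
            ROLLBACK TRANSACTION;
        END
    END;

Bear in mind a trigger like this won't stop someone deleting the customer row on the other side, so you'd need a matching check there too, or accept that the application/service layer is the real guardian.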
If you expect to have to split the databases across different hardware in the future because of heavy load, I'd say split it now. You can use replication to push copies of some of the tables from the main database to the site management databases. For now, you can run both databases on the same instance of SQL Server and later on, when you need to, you can move some of the databases to a separate machine as your volume grows.
Imagine we had infinitely fast computers: would you split your databases? Of course not. The only reason we split them is to make it easy to scale out at some point. You don't really have any choice here; 100MB-1000MB per client is huge.