PET technology Fluent NHibernate - fluent-nhibernate

For a web application (with some real private data) we want to use privacy enhancing technology (PET) to limit the damage if someone gains access to our database.
The application is built in layers. We use Fluent NHibernate (as the title says) to connect to our database, and we have created our own wrapper class to build queries.
Security is a big issue for the kind of application we're building. I'll try to explain the setting by a simple example:
Our customers have clients in their application (each installation of the application uses its own database), and sensitive data is stored for those clients. There is a client table and a person table, and the two are linked.
The client table is the base table. It links to the other tables (there will soon be hundreds of them), which probably contain sensitive data.
At the moment, a client has a client_id and a table_id in the database. Our customer only knows the client_id; the system links the data by the table_id, which is unknown to the user.
What we want to ensure:
A hacker who has gained access to our database should not be able to see the link between the customer and the other tables just by opening the database. So there should be some kind of "hidden link" between the customer and the other tables. The personal data and all other sensitive tables should not be obviously linked together.
Because of the data sensitivity we're looking for a more robust solution than "statically hash the table_id and use this in other tables": if one of the persons is linked to the corresponding client, the data of all other clients should not be compromised too.
Ultimately, it should be impossible to link the customer table to the other tables by working inside the database alone; the application code should be needed to link the tables.
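One possible reading of the "hidden link" requirement is a keyed hash: instead of storing the raw table_id in each linked table, the application stores an HMAC of it, computed with a secret that never lives in the database. The following is only a minimal sketch under that assumption, not something Fluent NHibernate provides. The function name link_token, the table names and the key are all hypothetical, and the sketch is in Python rather than C# purely for illustration.

```python
import hashlib
import hmac


def link_token(secret_key: bytes, table_name: str, table_id: int) -> str:
    """Derive the value stored in `table_name` in place of the raw table_id.

    The table name is mixed into the message, so the same client gets a
    different token in every table and the tables cannot be joined to each
    other without the key.
    """
    message = f"{table_name}:{table_id}".encode("utf-8")
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


# Hypothetical key: assumed to be held by the application (config, key vault),
# never stored in the database itself.
SECRET_KEY = b"application-side-secret"

person_ref = link_token(SECRET_KEY, "person", 42)
invoice_ref = link_token(SECRET_KEY, "invoice", 42)
```

Someone who only has the database sees unrelated-looking tokens in each table. Rows of one client inside the same table still share a token, so this limits cross-table and cross-client linking rather than hiding everything.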
To accomplish this we've looked into different methods, but because many tables are linked to the client, and further development will probably add even more, we're looking for a centralised solution. That's why we concluded this should be handled in the database connector. Searching the internet and Stack Overflow did not point us in the right direction; perhaps we used the wrong search terms (PET and privacy enhancing technology, combined with NHibernate, did not give us any leads).
How can we accomplish our goals in this specific situation, or where should we search for help with this?

We have a similar requirement for our application, and we ended up using database schemas.
We have one database, and each customer has a separate schema where all the data for that customer is stored. It is possible to link from a schema to the rest of the database, but not to other schemas.
Security can be set for each schema separately so you can make the life of a hacker harder.
That being said, I can also imagine a solution where you let NHibernate encrypt every piece of data it sends to the database and decrypt everything it gets back. The data will be stored safely, but it will be very difficult to query over it.
So there is probably not a single answer to this question, and you have to decide what is better: Not being able to query, or just making it more difficult for a hacker to get to the data.

Related

Database optimisation

I'm starting a web application that will be used by a lot of companies (over 20K), and most importantly a lot of information will be recorded daily. I would like your advice on the following idea: create a database for each company and run SQL queries like this:
select * from enterprisedb1.tablename;
select * from enterprisedb2.tablename2 where enterprisedb2.tablename2.col='foo'
Please, I need your advice; I couldn't find anything on Google.
If you are selling this to multiple clients then it might come down to separation of their data.
On the one hand everything for the app is in the one database for each client, and provided you get the connection string right you probably don't need to ever specify the company name again for the rest of the app. No more "where customer=123" on every single query.
Also means a client could be deleted, backed up, moved, audited, whatever in a completely independent manner.
And it also means there is no risk of a developer or a query accidentally doing cross-client things. So you can even open it up to generic query access that still can't accidentally cross a client-to-client border. And security set-up will be simpler.
But if you have a million clients you do end up with a lot of databases. How well this works will depend on all sorts of things, including your database of choice.
You also end up having multiple copies of reference data unless you create an additional database "common" or something like that.
It's going to be very much a "depends" answer, but those are a few things to consider.
I suggest using common tables for all companies. It will be easier to manage and understand.
Create one table for company data and reference its integer key from the other metadata tables. For good performance, the indexes and queries must be well designed.
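The shared-table suggestion above can be sketched roughly as follows. This is only an illustration using an in-memory SQLite database; the table names (company, record), the index name and the helper records_for are assumptions, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE company (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE record (
        id         INTEGER PRIMARY KEY,
        company_id INTEGER NOT NULL REFERENCES company(id),
        col        TEXT
    );
    -- Composite index so per-company lookups do not scan every company's rows.
    CREATE INDEX idx_record_company ON record (company_id, col);
    """
)
conn.executemany(
    "INSERT INTO company (id, name) VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)
conn.executemany(
    "INSERT INTO record (company_id, col) VALUES (?, ?)",
    [(1, "foo"), (1, "bar"), (2, "foo")],
)


def records_for(company_id, col):
    """Return (id, col) rows for one company; every query filters on company_id."""
    return conn.execute(
        "SELECT id, col FROM record WHERE company_id = ? AND col = ?",
        (company_id, col),
    ).fetchall()
```

The cost of this layout is that every query has to carry the company_id filter, which is exactly the "where customer=123" overhead the other answer mentions.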

Degenerate a single SQL table into multiple domain tables

In short: I have a client who wishes to be able to add domain tables without adding SQL tables.
I am working with an application in which data is organized and made available through a PostgreSQL catalogue. What I mean by catalogue is that the database holds the path to the actual data file(s) as well as some metadata.
Adding a new table means that the (Java class of the) client application has to be updated. This is a costly process for the client, who wants us to find a way to let him add new kinds of data to the catalogue without having to change the schema.
I don't have many more specifics about the db itself and its configuration, as I'm usually just a client of said db.
My idea to solve this was to have a generic table with the most often used columns (like date, comment etc.) and a column containing a domain key. The domain key would be used by the client application to request the kind of generic data it needs (and would have no meaning whatsoever to the db provider). Adding metadata could be done with a companion file within the catalogue, and further filtering would have to be done on the client side.
Question: as I am by no means an SQL expert, I would like to know whether this is an acceptable solution and what limitations I could be facing. I'm thinking of performance, data volume etc. Or is a different approach advisable?
Regarding expected volume, for a single domain data type, it could be around 30 new entries per day.
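The generic-table idea described in this question could look roughly like the sketch below. It uses an in-memory SQLite database instead of PostgreSQL so that it is self-contained, and the table name catalogue_entry, its columns, and the helpers add_entry and entries_for are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE catalogue_entry (
        id         INTEGER PRIMARY KEY,
        domain_key TEXT NOT NULL,   -- meaningful only to the client application
        entry_date TEXT NOT NULL,   -- ISO 8601 date, sorts correctly as text
        comment    TEXT,
        file_path  TEXT NOT NULL    -- path to the actual data file
    );
    CREATE INDEX idx_entry_domain ON catalogue_entry (domain_key, entry_date);
    """
)


def add_entry(domain_key, entry_date, file_path, comment=None):
    """Register one data file under a domain without touching the schema."""
    conn.execute(
        "INSERT INTO catalogue_entry (domain_key, entry_date, comment, file_path) "
        "VALUES (?, ?, ?, ?)",
        (domain_key, entry_date, comment, file_path),
    )


def entries_for(domain_key):
    """Return (entry_date, file_path) for one domain, oldest first."""
    return conn.execute(
        "SELECT entry_date, file_path FROM catalogue_entry "
        "WHERE domain_key = ? ORDER BY entry_date",
        (domain_key,),
    ).fetchall()


add_entry("soil_samples", "2024-01-02", "/data/soil/b.dat")
add_entry("soil_samples", "2024-01-01", "/data/soil/a.dat", comment="first run")
add_entry("weather", "2024-01-01", "/data/weather/a.dat")
```

A new kind of data is then just a new domain_key value. The trade-off is that anything domain-specific cannot be constrained or filtered by the database and has to be handled on the client side.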

Split Database Security

I'm working on a .NET MVC SQL application that will contain sensitive data, for example HIV test results or income. I want to error-proof this privacy as much as possible so no one except the user can access it (think Joe the Plumber having his information hacked by a state employee).
I read here that splitting the database in two doesn't seem reasonable:
Is splitting databases a legitimate security measure?
although I've heard of this being done. If we could just use two tables... better.
But when I say error-proofing, I mean impossible for ANYONE in our company to access both databases/tables. I'm thinking about putting access to the application code (which would access both databases) and to both databases in the hands of a deep-pockets third party (like PWC or EY) for when the government came calling or some other real need to see both data sources came along.
Anyone have any thoughts on the cleanest way to do this? We'd want to design the tables such that most queries would not require access to both data sources so the relative cost in throughput wouldn't be that much.
You can encrypt a column of data in SQL. So for the columns that hold the sensitive data (e.g. HIV test results or income), you can encrypt the data when storing it in the DB.
Check the details here:
http://msdn.microsoft.com/en-us/library/ms179331.aspx
http://msdn.microsoft.com/en-us/library/bb964742.aspx
Let me know if it helps.

Database Client Specific Tables v/s Relational Tables

I have a scenario: my application is a SaaS app catering to multiple clients. Data integrity for each client is essential.
Is it better to keep my Tables
Client specific
OR
Relational Tables
For Ex: I have a mapping table with fields MapField1,MapField2. I need this kind of data for each client.
Should I have tables like MappingData_
or a Single Table with mapping to the ClientId
MappingData with Fields MapField1,MapField2,ClientId
I would have a separate database for each customer. (Multiple databases in a single SQL Server instance.)
This would allow you to design it once, with a single schema.
No dynamically named tables compromising test & development
Upgrades and maintenance can be designed and tested in one DB, then rolled out to all
A single customer's data can be backed-up, restored or dropped exceedingly simply
Bugs discovered/exploited in one DB won't compromise the integrity of other DBs
Data access (read and write) can be managed using SQL Logins (No re-inventing the wheel)
If there is a need for globally shared data, that would go in another database, with its own set of permissions for the different SQL Logins.
The use of a single database, with all users in it, is my next best choice. You still have a single schema. But you don't get to partition the customers' data, you need to manage access rights and permissions yourself, and you take on a whole host of additional design and testing work.
I would never go near dynamically creating new tables for additional customers. A new table name means all your queries need to be updated with the new table name, and a whole host of other maintenance headaches.
I'm pretty much of the opinion that if you want to create tables dynamically during the Business As Usual use of an application/service, you've designed it badly.
SO has a tag for the thing you're describing: "multi-tenant".
Visualize the architecture for supporting a multi-tenant database application as a spectrum. At one extreme of the spectrum is "shared nothing", which means each tenant has its own database. At the other extreme of the spectrum is "shared everything", which means tenants share tables, and each row in each table belongs to one tenant. (Each row contains a tenant identifier.)
Terminology seems to overlap, so read carefully. What one writer means by shared schema might be identical to what another writer means by shared everything.
This SO answer, also written by me, describes the differences and the tradeoffs in terms of cost, data isolation and protection, maintenance, and disaster recovery. It also links to a fairly good introductory article.

Local SQL database interface to cloud database

Excuse me if the question is simple. We have multiple medical clinics, each running its own SQL database EHR.
Is there any way I can interface each local SQL database with a cloud system?
I essentially want to use the data of the patient currently being consulted to generate a pathology request that links to a cloud (possibly Google App Engine) database.
As a medical student / software developer this project of yours interests me greatly!
If you don't mind me asking, where are you based? I'm from the UK and unfortunately there's just no way a system like this would get off the ground as most data is locked in proprietary databases.
What you're talking about is fairly complex anyway, whatever country you're in I assume there would have to be a lot of checks / security around any cloud system that dealt with patient data. Theoretically though, what you would want to do ideally is create an online database (cloud, hosted, intranet etc), and scrap the local databases entirely.
You then have one 'pool' of data each clinic can pull information from (i.e. ALL records for patient #3563). They could then edit that data and/or insert new records and SAVE them, exporting them back to the main database.
If there is a need to keep certain information private to one clinic only, this could still be achieved on one database in a number of ways, or you could retain parts of the local database and have them merge with the cloud data as they're requested by the clinic.
This might be a bit outdated, but you guys should check out https://www.firebase.com/. It would let you do what you want fairly easily. We just did this for a client in the exact same business you are in.
Basically, Firebase lets you work with a central database in the cloud that is automatically synchronised with all its front-ends. It even handles losing the connection to the server automagically. It's the best solution I've found so far to keep several systems running against a single cloud database.
We used to have our own backend that would try its best to sync changes, but you need to be really careful with inter-system unique IDs for your tables (i.e. creating a new user at one of the branches must not yield an ID that already exists in any other branch or in the central database). It becomes cumbersome very quickly.
CakePHP can generate this kind of unique ID automatically and fairly easily, but you still have to work on syncing all the local databases with the central repository.
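To illustrate the unique-ID point in a framework-neutral way, here is a minimal sketch that assumes random (version 4) UUIDs are acceptable as primary keys. The helper name new_record_id is hypothetical, and this is not how CakePHP or Firebase do it internally.

```python
import uuid


def new_record_id() -> str:
    """Return a random version 4 UUID as a string.

    Each branch can generate IDs offline without coordinating with the
    central database; collisions are possible in theory but so unlikely
    that they are normally ignored in practice.
    """
    return str(uuid.uuid4())


# Two branches creating records independently of each other.
branch_a_ids = {new_record_id() for _ in range(1000)}
branch_b_ids = {new_record_id() for _ in range(1000)}
```

This only removes the ID-collision problem. Merging conflicting edits to the same record across branches still needs its own strategy.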