CouchDB read authentication

How can I handle read authentication in CouchDB? I know roles can be defined in separate databases, but I want to implement read authentication at the document level. I am thinking about using node.js, but that does not seem an elegant solution because CouchDB already has an HTTP server and I don't want to add another one (or another application server such as Ruby or Python). Is anyone working on this?
Thanks.

In the recent O'Reilly web cast on CouchDB, J. Chris Anderson mentioned that read authentication was best handled by a combination of partial replication and multiple databases per reader group. Each database would contain only the documents pertaining to that specific group.
It makes the most sense when you think of each reader's CouchDB as a filtered instance of an authority database.

That's basically the correct answer. What I'd add is that document-level read control is hard to get right, especially in the presence of views. Filtering map rows at read time is doable, but not very I/O efficient. Generating reduction values based on filtered map rows, however, is prohibitively expensive.
For those reasons we encourage you to operate something like a database per access group, and make the entire database readable by all users.
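The database-per-group approach described above could be sketched roughly as follows. This is only an illustration in Python that builds the JSON bodies involved, without sending them: the design document name `access`, the filter name `by_group`, the `group` field on each document, and the target naming scheme are all assumptions, not something stated in the answers.

```python
import json

# JavaScript filter function stored in a design document on the authority
# database. CouchDB calls it with each document and the replication request.
# Assumption: every document carries a "group" field.
GROUP_FILTER_JS = (
    "function (doc, req) {"
    " return doc.group === req.query.group;"
    " }"
)


def design_doc():
    """Design document holding the filter ('access' and 'by_group' are hypothetical names)."""
    return {"_id": "_design/access", "filters": {"by_group": GROUP_FILTER_JS}}


def replication_request(authority_db, group):
    """Body for POST /_replicate that copies one group's documents to its own database."""
    return {
        "source": authority_db,
        "target": f"{authority_db}_{group}",  # hypothetical naming scheme
        "filter": "access/by_group",
        "query_params": {"group": group},
        "create_target": True,
    }


if __name__ == "__main__":
    print(json.dumps(replication_request("records", "nurses"), indent=2))
```

Each reader group would then be given read access only to its own target database, never to the authority database.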

Related

Split Database Security

I'm working on a .NET MVC SQL application that will contain sensitive data, for example HIV test results or income. I want to error-proof this privacy as much as possible so no one except the user can access it (think Joe the Plumber having his information hacked by a state employee).
I read here that splitting the database in two doesn't seem reasonable:
Is splitting databases a legitimate security measure?
although I've heard of this being done. If we could just use two tables... better.
But when I say error-proofing, I mean impossible for ANYONE in our company to access both databases/tables. I'm thinking about putting access to the application code (which would access both databases) and to both databases in the hands of a deep-pockets third party (like PWC or EY) for when the government came calling or some other real need to see both data sources came along.
Anyone have any thoughts on the cleanest way to do this? We'd want to design the tables such that most queries would not require access to both data sources so the relative cost in throughput wouldn't be that much.
You can encrypt a column of data in SQL. For the columns that hold the sensitive data, e.g. HIV test results or income, you can encrypt the data when storing it in the DB.
Check the details here:
http://msdn.microsoft.com/en-us/library/ms179331.aspx
http://msdn.microsoft.com/en-us/library/bb964742.aspx
Let me know if it helps.

Querying multiple database servers?

I am working on a database for a monitoring application, and I got all the business logic sorted out. It's all well and good, but one of the requirements is that the monitoring data is to be completely stand-alone.
I'm using a local database on my web server to do some event handling and caching of notifications. Since there is one event row per system on my monitor database, it's easy to just get the id and query the monitoring data if needed, and since this is something only my web server uses, integrity can be enforced externally. Querying is not an issue either, as all the relationships are one-to-one, so it's very straightforward.
My problem comes with user administration. My original plan had it on yet another database (to meet the requirement of leaving the monitoring database alone), but I don't think I was thinking straight when I thought of that. I can get all the ids of the systems a user has access to easily enough, but how then can I efficiently pass that to a query on the other database? Is there a solution for this? Making a chain of ORs seems like an ugly and buggy solution.
I assume this kind of problem isn't that uncommon? What do most developers do when they have to integrate different database servers? In any case, I am leaning towards just talking my employer into putting user administration data in the same database, but I want to know if this kind of thing can be done.
There are a few ways to accomplish what you are after:
Use concepts like linked servers (SQL Server - http://msdn.microsoft.com/en-us/library/ms188279.aspx)
Individual connection strings within your front end driving the database layer
Use things like replication to duplicate the data
Also, the concept of multiple databases on a single database server instance seems like it would not violate your business requirements, and I would investigate that as a starting point, given the details you have provided.
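For the "individual connection strings" option, one common alternative to a chain of ORs is a parameterized IN clause built from the ids returned by the first database. The following is a minimal sketch using two in-memory SQLite databases as stand-ins for the two servers; the table and column names (`user_systems`, `events`, `system_id`, `status`) are invented for illustration.

```python
import sqlite3


def allowed_system_ids(users_db, user_id):
    """Look up the system ids a user may see (user-administration database)."""
    rows = users_db.execute(
        "SELECT system_id FROM user_systems WHERE user_id = ?", (user_id,)
    )
    return [row[0] for row in rows]


def events_for_systems(monitor_db, system_ids):
    """Fetch events from the monitoring database for the given system ids."""
    if not system_ids:
        return []
    # One "?" placeholder per id; the values themselves are never put into the SQL text.
    placeholders = ", ".join("?" for _ in system_ids)
    sql = (
        "SELECT system_id, status FROM events "
        f"WHERE system_id IN ({placeholders}) ORDER BY system_id"
    )
    return monitor_db.execute(sql, system_ids).fetchall()


def demo():
    # Two independent in-memory databases stand in for the two servers.
    users_db = sqlite3.connect(":memory:")
    users_db.execute("CREATE TABLE user_systems (user_id INTEGER, system_id INTEGER)")
    users_db.executemany(
        "INSERT INTO user_systems VALUES (?, ?)", [(1, 10), (1, 12), (2, 11)]
    )

    monitor_db = sqlite3.connect(":memory:")
    monitor_db.execute("CREATE TABLE events (system_id INTEGER, status TEXT)")
    monitor_db.executemany(
        "INSERT INTO events VALUES (?, ?)", [(10, "ok"), (11, "down"), (12, "warn")]
    )
    return events_for_systems(monitor_db, allowed_system_ids(users_db, 1))
```

Databases usually cap the number of bound parameters per statement, so for very large id lists a temporary table on the monitoring side may be the better fit.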

MongoDB autosharding vs. authentication

Long time lurker, first time poster, please bear with me.
I'm trying to set up a sharded, secure MongoDB environment. I would like to make use of MongoDB's autosharding capability, since I'm sort of new to databases and on a tight schedule.
It seems that autosharding only applies to individual collections (tables), but I don't want users to have access to the entire collection. Further, MongoDB only allows authentication into databases, so once authenticated, a user can see 1) every collection in the db and 2) all data within each collection. So, as far as I can tell, I can either have autosharding and no authentication, or manual sharding and authentication.
I would like the best of both worlds, that is: autosharding and authentication. Is this possible? If not, how should I go about manual sharding in MongoDB?
A simplified use case of this system: collection 'Users' has data on every user. I want to authenticate user X so that X can only see X's data in the Users collection. And Users is distributed across multiple servers, partitioned (sharded) by user_name.
MongoDB doesn't have authentication like traditional SQL databases. In fact, if you read the manual, it's recommended that you use a secured environment instead of using authentication. Any access control to your data would be implemented within your application.
Even with traditional SQL, access isn't controlled by row. That's usually something implemented at the application level based on some sort of key within the data.
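As a rough illustration of application-level access control keyed on a field in the data, the application can wrap every query so that it can only match the authenticated user's own documents. The helper name `scoped_filter` is hypothetical, and the sketch only builds the query document; it does not talk to a MongoDB server.

```python
def scoped_filter(current_user, requested_filter=None):
    """Return a MongoDB-style query that can only match the caller's own documents."""
    # user_name is the field from the use case above that identifies the owner.
    owner_clause = {"user_name": current_user}
    if not requested_filter:
        return owner_clause
    # $and keeps the owner clause in force even if the request names user_name too.
    return {"$and": [owner_clause, requested_filter]}
```

For example, `scoped_filter("X", {"age": {"$gt": 30}})` yields a query restricted to X's records, and a request that tries to name another user simply matches nothing. The important design point is that `current_user` comes from the application's own session handling, never from the request body.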

What is the fastest way for me to take a query and turn it into a refreshable graph of the results set?

I often find myself writing one-off queries to either answer someone's question or troubleshoot something, and I would like to be able to quickly expose the on-demand, refreshable results of the query graphically so that I can share them with others without having to go through the process of creating an SSRS report and publishing it to a Reporting Services server.
I have thought about using Excel to do this, or maybe running a local SSRS server, but both of these options are still labor intensive, and I cannot justify the time they would take since no one has officially requested that I turn this data into a report.
The way I see it, the business I work for has invested money in me creating these queries, which often return potentially useful data that other people in the organization might want. But since the data isn't exposed in any way, and I don't know whether it is something they want (they may not even realize they want it), the potential value of the query is not realized. I want to increase the company's return on investment on all these one-off queries that I and other developers write by exposing their results graphically, so that they can be browsed by others and then potentially turned into more formalized SSRS reports if they provide enough value to justify the development of the report.
What is the fastest way for me to take a query and turn it into a refreshable graph of the results set?
Why don't you simply use what you may already have: Excel. You can import data via an ODBC / Oracle / SQL connection. Use Get Data and you can run the query, format it right in the spreadsheet, and provide sorting etc. All you need to supply is the database name, user name, and password to connect to the DB.
JonH is right regarding Excel's built-in ODBC support, but I have had tons of trouble with this. In my case, the ODBC connection required the client software to be installed so that it could use the encryption methods, etc. Also, even if that were not the case, the user (I believe) would still have to manually install and set up an ODBC connection.
Now if you just want something on your machine to do the queries and refresh them, JonH's solution is great and my caveats are probably irrelevant. But if you want other users to have access, you should consider having a middle-man app (basically a PHP script, assuming a web server is an option for you) that does a query, transforms the results into XML, and outputs it as "report-xyz.xml". You can then point anybody running a newer version of Excel to that address and they can very easily import the data into Excel with no overhead (basically a kind of web service).
Keep in mind, I don't think you should have a web script that will allow users to make queries to your database server! You would have some admin page where you pass the query in and a new XML file with the results gets made. So my idea also assumes that you want to run the same queries over and over without any specifics passed in. (If you did need specifics passed in, I'd look into finding a pre-built web services bridge for your database that already has security features built in. Then you could have users make the limited changes allowed.)
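The middle-man idea could look roughly like this. The answer suggests PHP; this sketch uses Python instead, with an in-memory SQLite database standing in for the real server, and the `sales` table and the `report`/`row` element names are made up for illustration. The query is fixed in code, so nothing a report consumer sends ever reaches the SQL.

```python
import sqlite3
import xml.etree.ElementTree as ET

# The query is fixed in code; nothing from the requester is ever put into SQL.
REPORT_SQL = "SELECT region, total FROM sales ORDER BY region"


def report_xml(conn):
    """Run the fixed report query and return the rows as an XML string."""
    cursor = conn.execute(REPORT_SQL)
    columns = [col[0] for col in cursor.description]
    root = ET.Element("report")
    for row in cursor:
        row_el = ET.SubElement(root, "row")
        # One child element per column, named after the column.
        for name, value in zip(columns, row):
            ET.SubElement(row_el, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, total INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", [("north", 120), ("south", 95)])
    return report_xml(conn)
```

A scheduled job or a small web handler could write the returned string to a file such as "report-xyz.xml" for Excel to import.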

How to divide responsibility between LDAP and RDBMS

I'm a lead developer on a project which is building web applications for my company's SaaS offering. We are currently using LDAP to store user data such as IDs, passwords, contact details, preferences and other user-specific data.
One of the applications we are building is a reporting service that will both collect and present management information to our end users. Obviously this service will require an RDBMS, but it will also need to access user data stored in LDAP.
As I see it, we have two basic implementation options:
1. Duplicate user data in both LDAP and the RDBMS.
2. Have the reporting service access LDAP whenever it needs user data.
Although duplicating data (and implementing the mechanisms to make this happen) as suggested in option 1 seems the wrong way to go, my gut feeling is that option 2 would not perform well enough (how do you 'join' LDAP data to RDBMS data as efficiently as a pure RDBMS implementation?).
I did find a related question but I'm still unsure which approach to take. I'd be interested in seeing what people thought of either option or perhaps other options.
Why would you feel that duplicating data would be the wrong way to go? Reporting tools (web based and otherwise) are mostly built around RDBMSs, so any mix'n'match will introduce unnecessary complexities. Reports are likely to need to be changed fairly frequently (from experience), so you want them to be as simple as possible. The data you store about users is unlikely to change its format very often, so once you have your import function working, you won't need to touch it again.
The only obstacle I can see is latency: how do you ensure that your RDBMS copy is up to date? You might need to ensure that your updating code writes to both destinations. Personally, also, I wouldn't necessarily use LDAP for application specific personal preferences: LDAP can't handle transactions, so what happens when data is updated from several directions? (Transactionality is of course also a problem with letting updaters write to both stores...) I'd rather let the RDBMS be the master for most data, and let LDAP worry only about identity, credentials and entitlements, which are rarely changed and only for one set of purposes. For myself, LDAP's ability to deal with hierarchical data isn't all that great a selling point.
Data duplication is not always a bad thing, especially when the usage scenarios are different enough.
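The import function mentioned above could be as small as the following sketch. It assumes the directory entries have already been fetched and flattened into dictionaries (the LDAP query itself is left out), and it uses SQLite as a stand-in RDBMS; the `users` table and its `uid`, `name` and `email` columns are hypothetical.

```python
import sqlite3


def sync_users(conn, ldap_entries):
    """Copy directory entries into a local users table, inserting or overwriting by uid."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users (uid TEXT PRIMARY KEY, name TEXT, email TEXT)"
    )
    # INSERT OR REPLACE overwrites the existing row when the uid is already present.
    conn.executemany(
        "INSERT OR REPLACE INTO users (uid, name, email) VALUES (:uid, :name, :email)",
        ldap_entries,
    )
    conn.commit()


def demo():
    conn = sqlite3.connect(":memory:")
    sync_users(conn, [{"uid": "jdoe", "name": "J. Doe", "email": "jdoe@example.com"}])
    # A later run picks up the changed address without duplicating the row.
    sync_users(conn, [{"uid": "jdoe", "name": "J. Doe", "email": "j.doe@example.com"}])
    return conn.execute("SELECT uid, email FROM users").fetchall()
```

With the copy in place, reports can join against `users` like any other table; how often the sync runs determines how stale the copy can get.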