In my application a requester has permission to query only certain columns. The columns may differ between requesters. The where clause changes between requests so that the rows returned change with each query. What is the best way to handle this access control? Should I use an array to store permitted columns and then do the check in my application?
I'm on PostgreSQL 9.x
Example:
We have medical professionals who can access patient records, but not every medical professional should be able to access all of the information. They may request arbitrary information about any patient (each patient has a uid), and we should enforce access controls.
So say the info is name, date of birth, blood type and illness.
Doctor A has permission for all fields
Doctor B can see everything except blood type
Administrator can only see name and date of birth
Hematologist can only see blood type
To implement option 2, I would have a column permissions table something like the following:
CREATE TABLE ColumnPerms
(
user_or_role Varchar(50),
table_name Varchar(50),
column_name Varchar(50)
);
CREATE INDEX ix_Columnperms ON ColumnPerms (user_or_role, table_name);
The *table_name* column is to allow this functionality to be implemented on more than just a single table in your app: if it's unnecessary, don't use it. You could adopt the convention that role names start with a '#' character, to ensure that there is no collision with user names.
Now, when you build your dynamic query, you can do something like
SELECT column_name
FROM ColumnPerms
WHERE user_or_role = '#manager'
AND table_name = 'Payroll'
AND column_name IN ('first_name', 'last_name', 'hire_date', 'base_salary', 'bonus')
(the IN clause should include EVERY column potentially to be returned).
The result of this query is a list of the column names that user is allowed to see. Just iterate through it to build your column list when constructing the dynamic SQL.
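A minimal sketch of that iteration in Python, using the standard-library sqlite3 module as a stand-in for PostgreSQL so that it runs anywhere. The helper names (permitted_columns, build_select), the sample permission rows and the emp_id filter column are assumptions made for illustration; with a PostgreSQL driver the placeholder style would differ.

```python
import sqlite3

def permitted_columns(conn, user_or_role, table_name, requested):
    """Return the requested columns this user or role may see, in request order."""
    placeholders = ", ".join("?" for _ in requested)
    rows = conn.execute(
        "SELECT column_name FROM ColumnPerms "
        "WHERE user_or_role = ? AND table_name = ? "
        f"AND column_name IN ({placeholders})",
        [user_or_role, table_name, *requested],
    ).fetchall()
    allowed = {row[0] for row in rows}
    return [col for col in requested if col in allowed]

def build_select(conn, user_or_role, table_name, requested):
    """Build a SELECT over only the permitted columns; refuse if none are allowed."""
    columns = permitted_columns(conn, user_or_role, table_name, requested)
    if not columns:
        raise PermissionError(f"{user_or_role} may not read {table_name}")
    # Column names come from ColumnPerms, never from user input; the table name
    # is assumed to be validated by the caller before it is interpolated.
    return f"SELECT {', '.join(columns)} FROM {table_name} WHERE emp_id = ?"

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ColumnPerms (user_or_role TEXT, table_name TEXT, column_name TEXT)"
)
conn.executemany(
    "INSERT INTO ColumnPerms VALUES (?, ?, ?)",
    [("#manager", "Payroll", c) for c in ("first_name", "last_name", "hire_date")],
)
requested = ["first_name", "last_name", "hire_date", "base_salary", "bonus"]
query = build_select(conn, "#manager", "Payroll", requested)
print(query)
```

Only the column list is assembled as text; the row filter stays a bound parameter, so the changing where clause never has to be concatenated into the statement.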
There are two approaches you could take:
1. Use Postgres to enforce the security using column-level permissions for each user (or user role). Look at the syntax for GRANT here: http://www.postgresql.org/docs/current/static/sql-grant.html
2. Build dynamic SQL statements limiting which columns may be returned for each user. This could become pretty tedious if there are many users, or many different column combinations. You'll probably want to keep a table of user ids and "selectable" table and column names for building the query statement. If you want this generalized to many different queries, you could either build them on top of a table-returning function that does the column filtering, or revert to option 1.
For option 1, make sure that columns used in the join are selectable...
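For option 1, the grants for the roles in the question could be generated rather than typed by hand. A sketch in Python, assuming a patients table with columns name, date_of_birth, blood_type and illness, and roles named doctor_a, doctor_b, administrator and hematologist (all hypothetical names). It only produces statement text, using the column-list form of GRANT SELECT.

```python
# Hypothetical role -> visible columns mapping, taken from the example in the question.
COLUMN_ACCESS = {
    "doctor_a": ["name", "date_of_birth", "blood_type", "illness"],
    "doctor_b": ["name", "date_of_birth", "illness"],
    "administrator": ["name", "date_of_birth"],
    "hematologist": ["blood_type"],
}

def grant_statements(table, access):
    """Yield one column-level GRANT SELECT statement per role."""
    for role, columns in access.items():
        yield f"GRANT SELECT ({', '.join(columns)}) ON {table} TO {role};"

statements = list(grant_statements("patients", COLUMN_ACCESS))
for statement in statements:
    print(statement)
```

A role that filters patients by uid would also need uid in its column list, which is the point of the note above about join columns being selectable.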
I'm trying to obtain data from the data dictionary that Oracle provides for its databases. When I execute the query:
SELECT TABLE_NAME
FROM ALL_CONSTRAINTS;
I get the names from the tables I own as user and the tables other users have granted me access to (Around 130). Now, when I try to execute
SELECT TABLE_NAME
FROM ALL_CONS_OBJ_COLUMNS;
I get nothing; the result is an empty table. Does that make sense? As the docs say it "displays information about the types that object columns (or attributes) or collection elements have been constrained to, in the tables accessible to the current user.", I think it should display the same tables the previous query did.
I was wrong: I'm looking for ALL_CONS_COLUMNS. That view does not have all the names of the previous one, but it has the ones I need. I guess I don't fully understand the notion of objects yet.
In this question: How do I use row-level permissions in BigQuery? it describes how to use an authorized view to grant access to only a portion of a table. But I'd like to give different users access to different rows. Does this mean I need to create separate views for each user? Is there an easier way?
Happily, if you want to give different users access to different rows in your table, you don't need to create separate views for each one. You have a couple of options.
These options all make use of the SESSION_USER() function in BigQuery, which returns the e-mail address of the currently running user. For example, if I run:
SELECT SESSION_USER()
I get back tigani@google.com.
The simplest option, then, for displaying different rows to different users, is to add another column to your table that is the user who is allowed to see the row. For example, the schema: {customer:string, id:integer} would become {customer:string, id:integer, allowed_viewer: string}. Then you can define a view:
#standardSQL
SELECT customer, id
FROM private.customers
WHERE allowed_viewer = SESSION_USER()
(note, don't forget to authorize the view as described here).
Then I'd be able to see only the rows where tigani@google.com was the value in the allowed_viewer column.
This approach has its own drawbacks, however: you can only grant access to a single user at a time. One option would be to make the allowed_viewer column a repeated field; this would let you provide a list of users for each row.
However, this is still pretty restrictive, and requires a lot of bookkeeping about which users should have access to which row. Chances are, what you'd really like to do is specify a group. So your schema would look like: {customer:string, id:integer, allowed_group: string}, and anyone in the allowed_group would be able to see that row.
You can make this work by having another table that has your group mappings. That table would look like: {group:string, user_name:string}. The rows might look like:
{engineers, tigani@google.com}
{engineers, some_engineer@google.com}
{administrators, some_admin@google.com}
{sales, some_salesperson@google.com}
...
Let's call this table private.access_control. Then we can change our view definition:
#standardSQL
SELECT c.customer, c.id
FROM private.customers c
INNER JOIN (
SELECT `group`
FROM private.access_control
WHERE SESSION_USER() = user_name) g
ON c.allowed_group = g.`group`
(note you will want to make sure that there are no duplicates in private.access_control, otherwise it could cause records to repeat in the results).
In this way, you can manage the groups in the private.access_control separately from the data table (private.customers).
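The effect of the join, and of a duplicate mapping row, can be tried locally. This is a rough sketch using Python's sqlite3 module rather than BigQuery: a :user query parameter stands in for SESSION_USER(), the group column is renamed group_name to avoid the reserved word, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer TEXT, id INTEGER, allowed_group TEXT);
    CREATE TABLE access_control (group_name TEXT, user_name TEXT);
    INSERT INTO customers VALUES ('acme', 1, 'engineers'), ('globex', 2, 'sales');
    INSERT INTO access_control VALUES
        ('engineers', 'some_engineer'),
        ('sales', 'some_salesperson');
""")

# The :user parameter plays the role of SESSION_USER() in the view.
VISIBLE_ROWS = """
    SELECT c.customer, c.id
    FROM customers c
    INNER JOIN (
        SELECT group_name FROM access_control WHERE user_name = :user
    ) g ON c.allowed_group = g.group_name
"""

def visible_rows(user):
    """Return the customer rows the given user is allowed to see."""
    return conn.execute(VISIBLE_ROWS, {"user": user}).fetchall()

before = visible_rows("some_engineer")

# A duplicate mapping row makes the same customer row appear twice.
conn.execute("INSERT INTO access_control VALUES ('engineers', 'some_engineer')")
after = visible_rows("some_engineer")
print(before, after)
```

Selecting DISTINCT group names in the subquery would be one way to make the view tolerate such duplicates.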
There is still one piece missing that you might want: the ability for groups to contain other groups. You can get this by doing a more complex join to expand the groups in the access control table (you might want to consider doing this only once and saving the results, to save the work each time the main table is queried).
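One way to do that expansion once and save the result is to flatten the nesting outside the query. A sketch in Python, assuming the nesting is available as a mapping from each group to the groups it contains; expand_memberships and the sample group names are hypothetical.

```python
def expand_memberships(direct_members, subgroups):
    """Map each group to every user it contains, directly or via nested groups.

    direct_members: {group: set of users listed directly in the group}
    subgroups:      {group: set of groups nested inside it}
    """
    def collect(group, seen):
        if group in seen:  # guard against cyclic group definitions
            return set()
        seen = seen | {group}
        users = set(direct_members.get(group, ()))
        for child in subgroups.get(group, ()):
            users |= collect(child, seen)
        return users

    all_groups = set(direct_members) | set(subgroups)
    return {group: collect(group, frozenset()) for group in all_groups}

direct = {"engineers": {"some_engineer"}, "administrators": {"some_admin"}}
nested = {"staff": {"engineers", "administrators"}}
flat = expand_memberships(direct, nested)

# Each (group, user) pair could be written back as one row of the
# access-control table, so the view keeps its simple join.
rows = sorted((group, user) for group, users in flat.items() for user in users)
print(rows)
```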
In sqlite3, I can force two columns to alias to the same name, as in the following query:
SELECT field_one AS overloaded_name,
field_two AS overloaded_name
FROM my_table;
It returns the following:
overloaded_name overloaded_name
--------------- ---------------
1 2
3 4
... ...
... and so on.
However, if I create a named table using the same syntax, it appends :1 to one of the aliases:
sqlite> CREATE TABLE temp AS
SELECT field_one AS overloaded_name,
field_two AS overloaded_name
FROM my_table;
sqlite> .schema temp
CREATE TABLE temp(
overloaded_name TEXT,
"overloaded_name:1" TEXT
);
I ran the original query just to see if this was possible, and I was surprised that it was allowed. Is there any good reason to do this? Assuming there isn't, why is this allowed at all?
EDIT:
I should clarify: the question is twofold: why is the table creation allowed to succeed, and (more importantly) why is the original select allowed in the first place?
Also, see my clarification above with respect to table creation.
I can force two columns to alias to the same name...
why is [this] allowed in the first place?
This can be attributed to the shackles of compatibility. In the SQL Standards, nothing is ever deprecated. An early version of the Standard allowed the result of a table expression to include columns with duplicate names, probably because an influential vendor had allowed it (possibly through a bug or the omission of a design feature) and wasn't prepared to take the risk of breaking its customers' code (the shackles of compatibility again).
Is there any use to duplicate column names in a table?
In the relational model, every attribute of every relation has a name that is unique within the relevant relation. Just because SQL allows duplicate column names doesn't mean that as a SQL coder you should utilise such a feature; in fact I'd say you have to be vigilant not to invoke this feature in error. I can't think of any good reason to have duplicate column names in a table but I can think of many obvious bad ones. Such a table would not be a relation and that can't be a good thing!
why is the [base] table creation allowed to succeed
Undoubtedly an 'extension' to (a.k.a. purposeful violation of) the SQL Standards. I suppose it could be perceived as a reasonable feature: if I attempt to create columns with duplicate names the system automatically disambiguates them by suffixing an ordinal number. In fact, the SQL Standard specifies that there be an implementation-dependent way to ensure the result of a table expression does not implicitly have duplicate column names (but as you point out in the question this does not preclude the user from explicitly using duplicate AS clauses). However, I personally think the Standard behaviour of disallowing the duplicate name and raising an error is the correct one. Aside from the above reasons (i.e. that duplicate columns in the same table are of no good use), a SQL script that creates an object without knowing if the system has honoured that name will be error prone.
The table itself can't have duplicate column names because inserting and updating would be messed up. Which column gets the data?
During selects the "duplicates" are just column labels so do not hurt anything.
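Both behaviours can be checked from Python's sqlite3 module. The sketch below assumes a two-column my_table like the one in the question and uses a hypothetical copy_table as the target of CREATE TABLE ... AS SELECT; it also shows one way duplicate labels can bite client code that keys rows by column name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (field_one TEXT, field_two TEXT);
    INSERT INTO my_table VALUES ('1', '2'), ('3', '4');
""")

cur = conn.execute(
    "SELECT field_one AS overloaded_name, field_two AS overloaded_name FROM my_table"
)
labels = [d[0] for d in cur.description]  # result-set labels, duplicates included
first_row = cur.fetchone()

# Keying a row by label silently drops one of the two values.
as_dict = dict(zip(labels, first_row))

# CREATE TABLE ... AS SELECT has to end up with distinct column names.
conn.execute(
    "CREATE TABLE copy_table AS "
    "SELECT field_one AS overloaded_name, field_two AS overloaded_name FROM my_table"
)
stored = [row[1] for row in conn.execute("PRAGMA table_info(copy_table)")]
print(labels, as_dict, stored)
```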
I assume you're talking about the CREATE TABLE ... AS SELECT command. This looks like an SQL extension to me.
Standard SQL does not allow you to use the same column name for different columns, and SQLite appears to be allowing that in its extension, but working around it. While a simple, naked select statement simply uses as to set the column name, create table ... as select uses it to create a brand new table with those column names.
As an aside, it would be interesting to see what the naked select does when you try to use the duplicated column, such as in an order by clause.
If you were allowed to have multiple columns with the same name, it would be a little difficult for the execution engine to figure out what you meant with:
select overloaded_name from table;
The reason why you can do it in the select is to allow things like:
select id, surname as name from users where surname is not null
union all
select id, firstname as name from users where surname is null
so that you end up with a single name column.
As to whether there's a good reason, SQLite is probably assuming you know what you're doing when you specify the same column name for two different columns. Its essence seems to be to allow a great deal of latitude to the user (as evidenced by the fact that the columns are dynamically typed, for example).
The alternative would be to simply refuse your request, which is what I'd prefer, but the developers of SQLite are probably more liberal (or less anal-retentive) than I :-)
I am building a community site where logon will be by email and members will be able to change their name/nick name.
Do you think I should keep the member name/nickname in my members table with the member's other properties, or create another table, store the member name/nickname there and associate it with the member's id?
I am in favour of the second option because I think it would be faster to pull members' names from it.
Is that the right/better way?
Update: the reason for the other table is that I need to pull the username in different sections, for example the forums. Wouldn't it be faster to query a small table for each username for each post in a forum topic?
I would keep it one table and set a unique constraint on Email in that table.
I can't see a single advantage in adding another table.
Why do you think the second option would be faster?
If nickname is a required one-to-one relation to member ID, the appropriate place to store it is in the same table. This is still an indexed single-record search, so it should be more or less as fast as your other option.
In fact, this solution would probably be faster, since you could get the nickname in the same SELECT as you get the other information.
Update to answer the update to the question:
The second table isn't any smaller in terms of the number of rows. The main factors in a SQL search are 1) number of records in the table and 2) number of possible matches from the indexed part of the search.
In this case, the number of records in your smaller table would be exactly the same as the larger table. And the number of possible matching records returned by the index will always be 1 because the member ID is unique.
The number of columns in the table you're searching is generally irrelevant to the time taken to return the data (the number of columns you actually list in the SELECT statement can have an effect, but that's the same no matter which table you're searching).
SQL databases are very, very good at finding data. Structure your data correctly and let the database worry about getting it back to you. Premature optimization is, as they say, the root of all evil.
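To make the single-table layout concrete, here is a small sketch using Python's sqlite3 module. The members and posts schemas and the sample rows are assumptions; the point is only that one join returns each post with its author's nickname, and that a unique constraint on email protects the logon key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE members (
        id       INTEGER PRIMARY KEY,
        email    TEXT NOT NULL UNIQUE,   -- logon is by email
        nickname TEXT NOT NULL
    );
    CREATE TABLE posts (
        id        INTEGER PRIMARY KEY,
        member_id INTEGER NOT NULL REFERENCES members(id),
        body      TEXT NOT NULL
    );
    INSERT INTO members (id, email, nickname) VALUES (1, 'a@example.com', 'alice');
    INSERT INTO posts (member_id, body) VALUES (1, 'first post'), (1, 'second post');
""")

# One query returns every post in the topic together with its author's nickname.
topic = conn.execute(
    "SELECT p.body, m.nickname FROM posts p "
    "JOIN members m ON m.id = p.member_id ORDER BY p.id"
).fetchall()

# The UNIQUE constraint rejects a second account with the same email.
try:
    conn.execute("INSERT INTO members (email, nickname) VALUES ('a@example.com', 'bob')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(topic, duplicate_rejected)
```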
Go with the first option: keep the name/nick name in the members table. There's no need to introduce an additional table, and the overhead of a join that goes with it, in this case.
Yes, associating member's ID to the other properties is the right way to go.
You can simply create an index on name to speed up your queries.
I have to create a table and store Active Directory SIDs representing a User or a Group.
How would you name the category representing both a User and a Group?
Edit 1.
The table will contain four columns: ID (PK), SID's Name, SID's value and another column for SID's Type (0 for User, 1 for Group).
Please suggest the table name, not only the columns names.
Active Directory uses the term "principal" or "security principal" for both. That also includes computers.
Here's a graphic from the MSDN article Managing Directory Security Principals in the .NET Framework 3.5 that shows the hierarchy.
(source: microsoft.com)
So I would probably call my table Principals and have the three columns you mentioned:
PrincipalName (string)
SID (string or binary)
PrincipalType (0 for User, 1 for Group)
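As a rough sketch of that table, here it is created through Python's sqlite3 module. The column types, the UNIQUE constraint on SID, the CHECK on PrincipalType and the sample SID value are all assumptions added for illustration; a real schema would use the target database's own types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Principals (
        ID            INTEGER PRIMARY KEY,
        PrincipalName TEXT NOT NULL,
        SID           TEXT NOT NULL UNIQUE,
        -- 0 = User, 1 = Group
        PrincipalType INTEGER NOT NULL CHECK (PrincipalType IN (0, 1))
    )
""")
conn.execute(
    "INSERT INTO Principals (PrincipalName, SID, PrincipalType) VALUES (?, ?, ?)",
    ("Example Group", "S-1-5-21-0-0-0-1000", 1),  # made-up SID for illustration
)
groups = conn.execute(
    "SELECT PrincipalName FROM Principals WHERE PrincipalType = 1"
).fetchall()

# Any type other than 0 or 1 is rejected by the CHECK constraint.
try:
    conn.execute(
        "INSERT INTO Principals (PrincipalName, SID, PrincipalType) VALUES (?, ?, ?)",
        ("Bad Type", "S-1-5-21-0-0-0-1001", 2),
    )
    bad_type_rejected = False
except sqlite3.IntegrityError:
    bad_type_rejected = True
print(groups, bad_type_rejected)
```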
From most verbose to least:
ActiveDirectorySecurityIdentifiers
ActiveDirectorySIDs
ADSIDs
Good practices dictate that table names be plural and that the names should represent and describe the contents of the tables. Depending on your level of comfort any one of the above should do just fine.
When I recently had to do this (linking a DB user table to the AD accounts) I simply named the column ADSID.
I found this made good sense for us since we were querying using DirectorySearcher and the name for that property in the LDAP database is objectSid, so our queries looked like:
deSearch.Filter = "(&(objectSid=" + ADSID + "))";
Although, as I cut and paste that code from my project, I do wonder if maybe objectSid would have been a good column name too?
As far as naming the table, I hope you are storing additional information beyond the AD details here? Otherwise, why are you duplicating the AD database?
If you are storing additional information, then you should name the table according to whatever domain/business object is modelled by the table.
As I said, I was storing the data for users, so my table was simply called [Users].
Finally - perhaps you would benefit from normalising this out into a [Groups] and a [Users] table?