SQL Server: Granting permission for specific rows - sql

I am doing the BI reports for a group of 5 companies. Since the information is more or less the same for all the companies, I am consolidating the data of the 5 companies in one DB, restructuring the important data, indexing the tables (I cannot do that in the original DB because of ERP restrictions) and creating the views with all the information required.
In the group, I have some corporate roles that would benefit from having the data of the 5 companies in one view. Nevertheless, I don't want an employee of company 1 to see the information of company 2, nor the other way around. Is there any way to grant permissions that restrict the information to the rows that contain the employee's company name in a specific column?
I know that I could replicate the view and filter the information using the WHERE clause, but I really want to avoid this. Please help. Thanks!

What you are talking about is row-level security. There is little to no support for this out of the box in the product.
Here are a couple of articles on design patterns that can be used.
http://sqlserverlst.codeplex.com/
http://msdn.microsoft.com/en-us/library/bb669076(v=vs.110).aspx
What is the goal of consolidating all the companies into one database?
Here are some ideas.
1 - Separate databases make it easier to secure data; however, it is hard to aggregate data across them.
Also, every object is duplicated.
2 - Use schemas to separate the data. Security can be granted at the schema level (see the sketch below).
This still duplicates objects, minus the database container, but a super-user group can see all schemas and write aggregated reports.
I think schemas are underused by DBAs and developers.
3 - Code either stored procedures and/or duplicate views to enforce security. While tables are not duplicated, some code is.
Again, there is no silver bullet for this problem.
However, this is a greenfield project, so you can dictate which way you want to implement it.
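As a rough sketch of option 2 (the schema and role names here are made up for illustration), the security side could look something like this:
-- One schema per company, each holding that company's copy of the tables
CREATE SCHEMA Company1;
GO
CREATE SCHEMA Company2;
GO
-- Employees of company 1 only get rights on their own schema
CREATE ROLE Company1Readers;
GRANT SELECT ON SCHEMA::Company1 TO Company1Readers;
-- A corporate role can read every schema and build the aggregated reports
CREATE ROLE CorporateReaders;
GRANT SELECT ON SCHEMA::Company1 TO CorporateReaders;
GRANT SELECT ON SCHEMA::Company2 TO CorporateReaders;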

As of SQL Server 2016 there is support specifically for this problem. The MSDN link in the accepted answer already forwards to the right article, but I decided to post again as the relevant answer has changed.
You can now create security policies that implement row-level permissions like this (code adapted from MSDN; assuming per-user permissions and a column named UserName in your table):
CREATE SCHEMA Security;
GO
CREATE FUNCTION Security.userAccessPredicate(@UserName sysname)
    RETURNS TABLE
    WITH SCHEMABINDING
AS
    RETURN SELECT 1 AS accessResult
           WHERE @UserName = SUSER_SNAME();
GO
CREATE SECURITY POLICY Security.userAccessPolicy
    ADD FILTER PREDICATE Security.userAccessPredicate(UserName) ON dbo.MyTable,
    ADD BLOCK PREDICATE Security.userAccessPredicate(UserName) ON dbo.MyTable;
GO
Furthermore, it is advisable to create stored procedures that also check permissions when accessing the data, as a second layer of security; users might otherwise find out details about data they don't have access to, e.g. by trying to violate constraints. For details see the MSDN article, which covers exactly this topic.
It points out workarounds for older versions of SQL Server too.
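A quick way to sanity-check the policy (a sketch only; 'MyUser' and the table contents are hypothetical) is to impersonate a database user and confirm that only the rows the predicate allows come back:
-- Impersonate a database user and check what the policy lets through
EXECUTE AS USER = 'MyUser';
SELECT * FROM dbo.MyTable;  -- only rows permitted by the filter predicate for this user
REVERT;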

If you want to restrict the data a view returns using a WHERE clause, the easiest way is to create the view and then grant permission on it to the user.
Example:
CREATE VIEW emp AS SELECT Name, Bdate, Address FROM EMPLOYEE WHERE id = 5;
GRANT SELECT ON emp TO some_user;
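Applied to the original per-company question, that means one view per company, each granted to that company's role (hypothetical names below). Note that this is exactly the duplication the question wanted to avoid, which is why row-level security or per-company schemas are usually preferred:
CREATE VIEW dbo.SalesCompany1 AS
    SELECT * FROM dbo.Sales WHERE CompanyName = 'Company 1';
GO
GRANT SELECT ON dbo.SalesCompany1 TO Company1Role;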

Related

Best way to find the sources of SQL tables, joined in a view?

I have a view vw_Members which is a combination of tbl_Student and tbl_Staff.
Once they are presented together as Members, is there a way to tell which record came from which table? Or even from which database? (For the sake of this question, assume they're in different databases.)
If not a method that's already implemented in SQL, what would be your best suggestion?
EDIT:
HRSystem database has a table tbl_Staff and StudentSystem database has a table tbl_Student.
The client application wants to see all data.
A view, vw_Members is used.
SELECT name, date_of_birth FROM tbl_Student
UNION ALL
SELECT name, date_of_birth FROM tbl_Staff
The use-case for knowing the source is, for instance:
If a member (Staff or Student) calls up a user of the application and says "My name has changed, that is no longer my name", then without contacting IT to find out where the source of the data is, the application user knows instantly which system the record came from and can follow the proper procedure, changing it at the source/externally.
Note:
The databases are populated with data from external sources. These are not databases populated by our users; an automated process keeps them up to date, so having the appropriate people change the information externally will update the databases.
There will be instances where a Member is both a Student and a Staff member simultaneously. This is where knowing the source of the record would be beneficial.
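One common approach (a sketch, not taken from an answer in this thread; it assumes both databases live on the same instance) is to add a literal source column to each branch of the UNION ALL, so every Member row carries its origin. A person who is both Staff and Student then simply appears twice, once per source:
CREATE VIEW vw_Members AS
    SELECT name, date_of_birth, 'Student' AS source_table
    FROM StudentSystem.dbo.tbl_Student
    UNION ALL
    SELECT name, date_of_birth, 'Staff' AS source_table
    FROM HRSystem.dbo.tbl_Staff;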

Custom user defined database fields, what is the best solution?

To keep this as short as possible, I'm going to use an example.
So let's say I have a simple database that has the following tables:
company - ( "idcompany", "name", "createdOn" )
user - ( "iduser", "idcompany", "name", "dob", "createdOn" )
event - ( "idevent", "idcompany", "name", "description", "date", "createdOn" )
Many users can be linked to a single company, as well as to multiple events, and many events can be linked to a single company. All companies, users and events have the columns shown above in common. However, what if I wanted to give my customers the ability to add custom fields to both their users and their events for any unique extra information they wish to store? These extra fields would be on a company-wide basis, not on a per-record basis (so a company adding a custom field to their users would add it to all of their users, not just one specific user). The custom fields also need to be searchable and reportable, ideally automatically with some sort of report wizard. Considering the database is expected to have lots of traffic as well as lots of custom fields, what is the best solution for this?
My current research and findings in possible solutions:
To have generic placeholder columns such as "custom1", "custom2" etc.
** This is not viable as there will eventually be too many custom columns and there will be too many NULL values stored in the database
To have 3 tables per current table, e.g. user, user-custom-field, user-custom-field-value. The user table stays the same. The user-custom-field table contains the information about the new field, such as name, data type etc., and the user-custom-field-value table contains the value for the custom field.
** This one is more of a contender, if it were not for its complexity and table-size implications. I think it will be impossible to avoid a user-custom-field table if I want to automatically report on these fields, as I will have to store the information on how to report on them somewhere. However, in order to pull almost any data you would have to do a huge number of joins on the user-custom-field-value table, and you are now storing column data as rows, which in a database expected to have a lot of traffic as well as a lot of custom fields would soon cause a problem.
Create a new user and event table for each new company that is added to the system, removing the company id from within those tables and instead using it in the table name (e.g. user56, 56 being the company id). Then allow the user to trigger DB commands that add the new custom columns to the tables, giving them the power to decide whether it has a default value, auto-increments, etc.
** Every time I have seen this solution it has instantly been shut down by people saying it would be unmanageable, as you would eventually get thousands of tables. However, nobody really explains what they mean by unmanageable. Firstly, as far as my understanding goes, more tables is actually more efficient and produces faster search times as the tables are much smaller. Secondly, yes, I understand that making any common table changes would be difficult, but all you would have to do is run a script that changes all your tables for each company. Finally, I actually see benefits in this method, as it would separate company data, making it impossible for one company to accidentally access another's data via a potential bug, plus it would potentially give the ability to back up and restore company data individually. If someone could elaborate on why this is perceived as a bad idea, it would be appreciated.
Convert fully or partially to a NoSQL database.
** Honestly, I have no experience with schemaless databases and don't really know how dynamic user-defined fields on a per-record basis would work (although I know it's possible). If someone could explain the implications of the switch, or the differences in queries and the potential benefits, that would be appreciated.
Create a JSON column in each table that requires extra fields. Then add the extra fields into that JSON object.
** The issue I have with this solution is that it is nearly impossible to filter data via the custom columns. You would not be able to report on these columns, and until you have received and processed them you don't really know what is in them.
Finally, if anyone has a solution not mentioned above, or any thoughts or disagreements on any of my notes, please tell me, as this is all I have been able to find or figure out for myself.
A typical solution is to have a JSON (or XML) column that contains the user-defined fields. This would be an additional column in each table.
This is the most flexible. It allows:
New fields to be created at any time.
No modification to the existing table to do so.
Supports any reasonable type of field, including types not readily available in SQL (e.g. arrays).
On the downside,
There is no validation of the fields.
Some databases support JSON but do not support indexes on them.
JSON is not "known" to the database for things like foreign key constraints and table definitions.
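A minimal sketch of the JSON-column approach in SQL Server 2016+ (the table and field names are hypothetical); one frequently searched custom field can be made indexable through a computed column over JSON_VALUE:
-- Free-form JSON column for company-defined fields, validated with ISJSON
ALTER TABLE [user] ADD custom_fields NVARCHAR(MAX) NULL
    CONSTRAINT chk_user_custom_fields CHECK (custom_fields IS NULL OR ISJSON(custom_fields) = 1);

-- Filtering on a custom field
SELECT iduser, name
FROM [user]
WHERE JSON_VALUE(custom_fields, '$.employeeNumber') = 'E-1042';

-- Computed column + index to make that custom field efficiently searchable
ALTER TABLE [user] ADD employee_number AS JSON_VALUE(custom_fields, '$.employeeNumber');
CREATE INDEX ix_user_employee_number ON [user] (employee_number);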

Oracle Audit Trail to get the list of columns which got updated in last transaction

Consider a table (Student) under a schema, say Candidates (NOT DBA):
Student { RollNumber : VARCHAR2(10), Name : VARCHAR2(100), Class : VARCHAR2(5), ... }
Let us assume that the table already contains some valid data.
I executed an update query to modify the name and class in the Student table:
UPDATE STUDENT SET Name = 'ASHWIN', CLASS = 'XYZ'
WHERE ROLLNUMBER = 'AQ1212';
This was followed by another update query in which I am updating some other fields:
UPDATE STUDENT SET Math_marks = 100, PHY_marks = 95, CLASS = 'XYZ'
WHERE ROLLNUMBER = 'AQ1212';
Since I modified different columns in two different queries, I need to fetch the particular list of columns which got updated in the last transaction. I am pretty sure that Oracle must be maintaining this in some logs which could be accessed by a DBA, but I don't have DBA access.
All I need is the list of columns that got updated in the last transaction under the schema Candidates. I DO NOT have DBA rights.
Please suggest me some ways.
NOTE: Above I used a simple table, but in reality I have got 8-10 tables for which I need to do this auditing, where a key column, let's say ROLLNUMBER, acts as a foreign key for all the other tables. Writing triggers for all the tables would be complex, so please help me out if there exists some other way to fetch the same.
"I am pretty sure that oracle must be maintaining this in some table logs which could be accessed by DBA."
Actually, no, not by default. An audit trail is a pretty expensive thing to maintain, so Oracle does nothing out of the box. It leaves us to decide what we want to audit (actions, objects, granularity) and then to switch on auditing for those things.
Oracle requires DBA access to enable the built-in functionality, so that may rule it out for you anyway.
Auditing is a very broad topic, with lots of things to consider and configure. The Oracle documentation devotes a big chunk of the Security manual to auditing; see the Introduction to Auditing chapter. For monitoring updates to specific columns, what you're talking about is Fine-Grained Auditing.
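For reference, a fine-grained audit policy on the example table could look roughly like this (a sketch; executing DBMS_FGA and reading the audit trail views both require privileges that are normally handed out by the DBA, so you will likely still need their help to set it up):
BEGIN
  DBMS_FGA.ADD_POLICY(
    object_schema   => 'CANDIDATES',
    object_name     => 'STUDENT',
    policy_name     => 'STUDENT_UPD_AUDIT',
    audit_column    => 'NAME,CLASS',
    statement_types => 'UPDATE');
END;
/
-- Audited statements are then visible in DBA_FGA_AUDIT_TRAIL (DBA-readable)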
"I have got 8-10 tables ... Writing triggers would be a complex for all tables."
Not necessarily. The triggers will all resemble each other, so you could build a code generator that uses the data dictionary view USER_TAB_COLUMNS to customise some generic boilerplate text.
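A hedged sketch of that generator idea: query USER_TAB_COLUMNS and splice each column name into a boilerplate fragment, then paste the output into the trigger body (log_change below is a hypothetical procedure that writes to your own audit table):
-- One "was this column updated?" fragment per column of STUDENT
SELECT 'IF UPDATING(''' || column_name || ''') THEN '
       || 'log_change(''STUDENT'', ''' || column_name || '''); END IF;' AS trigger_fragment
FROM   user_tab_columns
WHERE  table_name = 'STUDENT'
ORDER  BY column_id;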

What is the most correct way to store a "list" in a SQL Database?

So, I've read a lot about how stashing multiple values into one column is a bad idea and violates the first rule of data normalisation (which, surprisingly, is not "Do Not Talk About Data Normalisation") so I need some help.
At the moment I'm designing an ASP.NET web page for the place I work for. I want to display data on the page depending on which Active Directory groups the person belongs to. The first way of doing this that comes to mind is to have a table with, essentially, a column containing the AD group and a second column containing the list of computers that belong to that group.
I've learnt that this shows great disregard for relational databases, so what is a better way to do it? I want to control this access via SQL tables, so I can add/remove rows from these tables and change end users' access accordingly.
Thanks for the help! :)
EDIT: To describe exactly what I want to do:
We have a certain group of computers that need to be checked up on; however, these computers are in physically difficult-to-reach locations. The organisation I belong to has remote control enabled for these computers, but they're not in the business of giving out the remote control password (understandably).
The added layer of complexity is that, depending on who you are, our clients should only be able to see a certain group of computers (that is, the group of computers that their area owns). So, if Group A has Thomas in it and Group B has Jones in it, and you belong to either group, then you would just see one entry. However, if you belong to both groups you should see both the Thomas and Jones computers.
The reason I think storing this data in a single SQL cell is the way to go is that storing it in tables would require (in my mind) a new table for each new "group" of computers. I don't want to crank out SQL tables for every new group; I'd much rather just have an added row in a SQL table somewhere.
Does this make any sense?
You basically have three options in SQL Server:
Storing the values in a single column.
Storing the values in a junction table.
Storing the values as XML (or as some other structured data format).
(Other databases have other options, such as arrays, nested tables, and JSON.)
In almost all cases, using a junction table is the correct approach. Why? Here are some reasons:
SQL Server has (relatively) lousy string manipulation, so doing something as simple as ensuring a unique list is really, really hard.
A junction table allows you to store lots of other information (When was a machine added? What is the full description of the machine? etc. etc.).
Most queries that you want are pretty easy with a junction table (with the one exception of getting a comma-delimited list, alas -- which is just counterintuitive rather than "hard"; see the sketch at the end of this answer).
All the types are stored natively.
A junction table allows you to enforce constraints (both check and foreign key) on the elements of the list.
Although a delimited list is almost never the right solution, it is possible to think of cases where it might be useful:
The list doesn't change and presentation of the list is very important.
Space usage is an issue (alas, denormalization often results in fewer pages).
Queries do not really access elements of the list, just the entire thing.
XML is also a reasonable choice under some circumstances. In the most recent versions of SQL Server, this can be made pretty efficient. However, it incurs the overhead of reading and parsing XML -- and things like duplicate elimination are still not obvious.
So, you do have options. In almost all cases, the junction table is the right approach.
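For that one awkward case mentioned above, producing a comma-delimited list from a junction table, SQL Server 2017+ has STRING_AGG (the table names here are illustrative; on older versions the usual workaround is STUFF with FOR XML PATH):
-- One row per group, with its computers as a comma-separated list
SELECT g.GroupName,
       STRING_AGG(c.ComputerName, ', ') AS Computers
FROM   GroupComputer gc
JOIN   [Group]  g ON g.GroupId    = gc.GroupId
JOIN   Computer c ON c.ComputerId = gc.ComputerId
GROUP BY g.GroupName;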
There is an "it depends" that you should consider. If the data is never going to be queried (or queried very rarely), storing it as XML or JSON would be perfectly acceptable. Many DBAs would freak out, but it is much faster to get the blob of data that you are going to send to the client than to recompose and decompose a set of columns from a secondary table. (There is a reason document and object databases are becoming so popular.)
... though I would ask why you are replicating Active Directory to your database and how you are planning on keeping the two in sync.
It is not really a bad idea to store multiple values in one column, but it depends on the kind of search you want.
If you only want to know which persons are part of a group, then you can store the persons in one column with a group id as the key. For an update you just update the entire list in the group.
But if you want to search for a specific person that belongs to a group, then it is not recommended to store multiple persons in one column. In that case it is better to use an intermediate table that stores person id and group id.
Sounds like you want a table that maps users to group IDs and a second table that maps group IDs to the computers in that group. I'm not sure; your description of the problem was a bit confusing to me.
A list has some columns like name, family name, phone number, etc., and rows like name=John, familyName=Lee, number=12321321.
An SQL database works the same way: every row in an SQL table is a record, so you just add the records of your list into your database using INSERT queries.
There is a complete explanation here:
http://www.w3schools.com/sql/sql_insert.asp
This sounds like a typical many-to-many problem. You have many groups and many computers, and they are related to each other. In this situation it is often recommended to use a mapping table, a.k.a. a "junction table" or "cross-reference" table. This table consists solely of the two foreign keys into your other tables.
If your tables look like this:
Computer
- computerId
- otherComputerColumns
Group
- groupId
- othergroupColumns
Then your mapping table would look like this:
GroupComputer
- groupId
- computerId
And you would insert a single record for every relationship between a group and a computer. This complies with the rules of third normal form with regard to database normalization.
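A concrete version of that mapping in T-SQL (a sketch; the column names are illustrative) could be:
CREATE TABLE [Group] (
    GroupId   INT IDENTITY(1,1) PRIMARY KEY,
    GroupName NVARCHAR(256) NOT NULL UNIQUE  -- the AD group name
);
CREATE TABLE Computer (
    ComputerId   INT IDENTITY(1,1) PRIMARY KEY,
    ComputerName NVARCHAR(256) NOT NULL UNIQUE
);
CREATE TABLE GroupComputer (
    GroupId    INT NOT NULL REFERENCES [Group] (GroupId),
    ComputerId INT NOT NULL REFERENCES Computer (ComputerId),
    PRIMARY KEY (GroupId, ComputerId)
);
-- Computers visible to a user who belongs to a given set of AD groups
SELECT DISTINCT c.ComputerName
FROM   GroupComputer gc
JOIN   Computer c ON c.ComputerId = gc.ComputerId
JOIN   [Group]  g ON g.GroupId    = gc.GroupId
WHERE  g.GroupName IN (N'Group A', N'Group B');  -- groups resolved from AD by the application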
You can have a table with the group and group id, another table with the computer and computer id, and a third table with the relation between group id and computer id.

Storing multiple logic databases in one physical database

I'd like to design a cloud business solution with 4 default tables; a user may add a custom field (column?) or a custom object (table?).
My first thought was to create a new database for each account, but there's a limit on the number of databases per SQL Server instance.
2nd solution: for each account, create a new schema, duplicating the 4 default tables in each schema.
3rd solution: create 4 shared tables with a discriminator column (ACCOUNT_ID); if a user wants a new field, add a join table dedicated to that ACCOUNT_ID, and if they want a new object, create a new table.
What are your thoughts? Does anybody know how existing cloud solutions store data (for instance Salesforce)?
BTW, I don't want to create a VM for each account.
Thanks all for your suggestions; they helped me a lot, especially the Microsoft article suggested by John.
Since few architectural points are shared between accounts (the 4 default tables are just a suggestion for the user; I expect full customization), I've opted for the schema-per-account design with no EAV pattern.
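For what it's worth, a bare-bones sketch of provisioning a new account under that design (all names are illustrative) might look like this:
-- Run once per new account: its own schema, its own copies of the default tables,
-- and a role that can only touch that schema
CREATE SCHEMA Account42 AUTHORIZATION dbo;
GO
CREATE TABLE Account42.Customer (
    CustomerId INT IDENTITY(1,1) PRIMARY KEY,
    Name       NVARCHAR(200) NOT NULL
);
-- ...repeat for the other default tables, plus any custom objects the account adds...
CREATE ROLE Account42Users;
GRANT SELECT, INSERT, UPDATE, DELETE ON SCHEMA::Account42 TO Account42Users;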