SQL database design suggestion: Naming a database table - sql

I have to create a table and store Active Directory SIDs representing a User or a Group.
How would you name the category representing both a User and a Group?
Edit 1.
The table will contain four columns: ID (PK), the SID's name, the SID's value, and another column for the SID's type (0 for User, 1 for Group).
Please suggest the table name, not only the column names.

Active Directory uses the term "principal" or "security principal" for both. That also includes computers.
Here's a graphic from the MSDN article Managing Directory Security Principals in the .NET Framework 3.5 that shows the hierarchy.
So I would probably call my table Principals and have the three columns you mentioned:
PrincipalName (string)
SID (string or binary)
PrincipalType (0 for User, 1 for Group)
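A minimal sketch of that table (SQL Server syntax, since no specific DBMS was named; the identity key, column sizes and the VARBINARY choice for the SID are assumptions):

-- Sketch: one row per AD security principal (user or group)
CREATE TABLE Principals (
    PrincipalID   INT IDENTITY(1,1) PRIMARY KEY,  -- the ID (PK) from the question
    PrincipalName NVARCHAR(256) NOT NULL,
    SID           VARBINARY(85) NOT NULL UNIQUE,  -- or an NVARCHAR column for the string form
    PrincipalType TINYINT       NOT NULL          -- 0 = User, 1 = Group
);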

From most verbose to least:
ActiveDirectorySecurityIdentifiers
ActiveDirectorySIDs
ADSIDs
Good practices dictate that table names be plural and that the names should represent and describe the contents of the tables. Depending on your level of comfort any one of the above should do just fine.

When I recently had to do this (linking a DB user table to the AD accounts) I simply named the column ADSID.
I found this made good sense for us since we were querying using DirectorySearcher and the name for that property in the LDAP database is objectSid, so our queries looked like:
deSearch.Filter = "(&(objectSid=" + ADSID + "))";
Although, as I cut and paste that code from my project, I do wonder if maybe objectSid would have been a good column name too?
As far as naming the table, I hope you are storing additional information beyond the AD details here? Otherwise, why are you duplicating the AD database?
If you are storing additional information, then you should name the table according to whatever domain/business object is modelled by the table.
As I said, I was storing the data for users, so my table was simply called [Users].
Finally - perhaps you would benefit from normalising this out into a [Groups] and a [Users] table?

Related

Is USER_NAME column unique in HANA DB USERS table?

Is the USER_NAME field/column unique in the HANA database USERS table? I am seeing just numbers in the USER_ID values.
Like BNAME in the SAP USR02 table, I want to know which field is unique (or equivalent to BNAME) in the HANA DB USERS table.
Amandeep Modgil's answer is not wrong but does not fully answer the question.
Of course, the documentation makes it clear that user names in SAP HANA need to be unique. However, it does not specifically explain whether or how this is enforced/guaranteed.
The "DB dev way" to find out something like this is to check the table structure used by HANA to store users.
Looking at the PUBLIC.USERS object, we realize that this is not a table but a view.
Views don't have any constraints assigned to them, so any primary key or unique constraint must be implemented with one of the tables referenced by the view.
The next step is to review the source code for the view. In SAP HANA Studio one can simply mark the name of the view in the SQL editor and choose "Show Definition" from the context menu.
For PUBLIC.USERS this opens two(!) new windows:
one for the public synonym (there really is no PUBLIC schema, just synonyms) for USERS
and another one for the view SYS.USERS
This SYS schema is where SAP HANA system objects are implemented, so it's not surprising to find the view for USERS here.
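If you'd rather use plain SQL than the Studio context menu, the same definition can be read from the documented VIEWS catalog view (a sketch; it assumes the standard SCHEMA_NAME/VIEW_NAME/DEFINITION columns):

-- Sketch: read the view source from the catalog instead of HANA Studio
SELECT definition
FROM views
WHERE schema_name = 'SYS'
  AND view_name   = 'USERS';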
In my HANA Express 2.00.045 system, the source code for the view surprisingly begins with
CREATE **ROW TABLE** "SYS"."USERS" ( "USER_NAME",
"USER_ID",
"USERGROUP_NAME" ...
That's weird at the very least, and I suspect it might be a bug as all other metadata entries for this object make it clear that this is in fact a view.
But I digress...
The question to answer was: where is the uniqueness of USER_NAME enforced?
Scrolling down, the last main FROM clause of the SYS.USERS view points to a table: SYS.P_USERS_.
The trailing underscore in the name indicates that this is an internal HANA object that should never be directly used by any user or application. But that does not stop us from looking at it. Appropriate privileges are required for that, though. The "normal" application user account probably won't be able to directly look at this table's definition. I'm just using the SYSTEM user in this case.
Anyhow, we use the same technique as before: mark the SYS.P_USERS_ table in the SQL Editor, choose "Show definition" and we get: the definition of the table that holds the user accounts in SAP HANA.
The first three columns are defined like this:
Name                    | SQL Data Type | Dimension | Column Store Data Type | Key | Not Null
OID                     | BIGINT        |           | FIXED                  |     | X
NAME                    | NVARCHAR      | 256       | STRING                 |     |
LAST_SUCCESSFUL_CONNECT | TIMESTAMP     |           | LONGDATE               |     |
...
Notice how there is no primary key defined on this table and how only OID has a NOT NULL constraint?
Clearly, the uniqueness of NAME is not guaranteed by table constraints.
So what else could it be?
Let's switch to the Indexes tab of the table definition and we find:
IDX_P_USERS_OID, indexed columns: "OID" ASC
IDX_P_USERS_NAME, indexed columns: "NAME" ASC
And for both of these indexes, the Unique flag is set.
And there we have it:
Both OID (exposed as USER_ID) and NAME (exposed as USER_NAME) are unique in SAP HANA, enforced by unique indexes on the internal table that holds these user account entries.
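For completeness, the same conclusion can be reached without the Studio UI by checking the INDEXES catalog view (a sketch; it assumes the documented column names and requires privileges to see the SYS objects):

-- Sketch: list the indexes on the internal user table and their types
SELECT index_name, index_type
FROM indexes
WHERE schema_name = 'SYS'
  AND table_name  = 'P_USERS_';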
You can look up the schema information on SAP portal link below:
https://help.sap.com/viewer/4fe29514fd584807ac9f2a04f6754767/2.0.01/en-US/21026099751910148e0cdbddc75652b8.html
Although it does not tell you whether a particular column is the primary key or needs to be unique, you can combine this information with the data from the following system view to get the information you are after.
https://help.sap.com/viewer/4fe29514fd584807ac9f2a04f6754767/2.0.05/en-US/210197377519101481cfb213f0b84848.html
I have highlighted the columns you need in the tables system view in the screenshot below.

How to restrict SQL columns based on id of requester?

In my application a requester has permission to query only certain columns. The columns may differ between requesters. The where clause changes between requests so that the rows returned change with each query. What is the best way to handle this access control? Should I use an array to store permitted columns and then do the check in my application?
I'm on PostgreSQL 9.x
Example:
We have medical professionals that can access records of patients, but not all medical professionals should be able to access all information. They try to request arbitrary information about any patient (each of which has a uid), but we should enforce access controls.
So say the info is name, date of birth, blood type and illness
Doctor A has permission for all fields
Doctor B can see everything except blood type
Administrator can only see name and date of birth
Hematologist can only see blood type
To implement option 2, I would have a column permissions table something like the following:
CREATE TABLE ColumnPerms
(
    user_or_role Varchar(50),
    table_name   Varchar(50),
    column_name  Varchar(50)
);

CREATE INDEX ix_Columnperms ON ColumnPerms (user_or_role, table_name);
The *table_name* column is to allow this functionality to be implemented on more than just a single table in your app: if it's unnecessary, don't use it. You could adopt the convention that role names start with a '#' character, to ensure that there is no collision with user names.
Now, when you build your dynamic query, you can do something like
SELECT column_name
FROM ColumnPerms
WHERE user_or_role = '#manager'
AND table_name = 'Payroll'
AND column_name IN ('first_name', 'last_name', 'hire_date', 'base_salary', 'bonus')
(the IN clause should include EVERY column potentially to be returned).
The result of this query is a list of the column names that user is allowed to see. Just iterate through it to build your column list when constructing the dynamic SQL.
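If you're on PostgreSQL 9.1 or later, that iteration can even be pushed into SQL itself; a sketch using string_agg, which gives you the permitted column list ready to splice into the dynamic statement:

-- Sketch: build the permitted select-list directly in SQL (PostgreSQL 9.1+)
SELECT string_agg(quote_ident(column_name), ', ') AS select_list
FROM ColumnPerms
WHERE user_or_role = '#manager'
  AND table_name   = 'Payroll'
  AND column_name IN ('first_name', 'last_name', 'hire_date', 'base_salary', 'bonus');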
There are two approaches you could take:
Use Postgres to enforce the security using column level permissions for each user (or user role). Look at the syntax for GRANT here: http://www.postgresql.org/docs/current/static/sql-grant.html
Build dynamic SQL statements limiting what columns may be returned for each user. This could become pretty tedious if there are many users, or many different column combinations. You'll probably want to keep a table of user ids and "selectable" table/column names for building the query statement. If you want this generalized to many different queries, you could either build them on top of a table-returning function that does the column filtering, or revert to option 1.
For option 1, make sure that columns used in the join are selectable...
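For option 1, column-level grants in PostgreSQL look like this (a sketch using the medical example from the question; the table, column and role names are invented):

-- Sketch: Doctor B may read everything except blood_type
GRANT SELECT (uid, name, date_of_birth, illness) ON patients TO doctor_b;
-- Sketch: the hematologist may read only the key and the blood type
GRANT SELECT (uid, blood_type) ON patients TO hematologist;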

Storing multiple logic databases in one physical database

I'd like to design a cloud business solution with 4 default tables; a user may add a custom field (column?) or add a custom object (table?).
My first thought was to create a new database for each account, but there's a limit to the number of databases on a SQL Server instance.
2nd solution: for each account, create a new schema and duplicate the 4 default tables in each schema.
3rd solution: create one shared set of the 4 tables with a discriminator column (ACCOUNT_ID); if a user wants a new field, add a join table dedicated to that ACCOUNT_ID; if he wants a new object, create a new table.
What are your thoughts? Does anybody know how existing cloud solutions store data? (For instance, Salesforce.)
BTW, I don't want to create a VM for each account.
Thanks all for your suggestions; they helped me a lot, especially the Microsoft article suggested by John.
Since few architectural points are shared between accounts (the 4 default tables are just a suggestion for the user; I expect full customization), I've opted for the schema-per-account design with no EAV pattern.
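For reference, a minimal sketch of that schema-per-account layout in SQL Server (the schema, table and column names are purely illustrative):

-- Sketch: each account gets its own schema with its own copy of the default tables
CREATE SCHEMA account_1001;
GO
CREATE TABLE account_1001.Contacts (
    ContactID INT IDENTITY(1,1) PRIMARY KEY,
    Name      NVARCHAR(200) NOT NULL
);
GO
-- A customer-specific field is just an extra column in that account's own table
ALTER TABLE account_1001.Contacts ADD InvoiceNumber NVARCHAR(50) NULL;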

SQL 2008: Using separate tables for each datatype to return single row

I thought I'd be flexible this time around and let the users decide what contact information they wish to store in their database. In theory it would look like a single row containing, for instance: name, address, zipcode, Category X, List items A.
Example
FieldType table defining the datatypes available to a user:
FieldTypeID, FieldTypeName, TableName
1,"Integer","tblContactInt"
2,"String50","tblContactStr50"
...
A user then defines his fields in the FieldDefinition table:
FieldDefinitionID, FieldTypeID, FieldDefinitionName
11,2,"Name"
12,2,"Address"
13,1,"Age"
Finally, we store the actual contact data in separate tables depending on its datatype.
The master table only contains the ContactID.
tblContact:
ContactID
21
22
tblContactStr50:
ContactStr50ID,ContactID,FieldDefinitionID,ContactStr50Value
31,21,11,"Person A"
32,21,12,"Address of person A"
33,22,11,"Person B"
tblContactInt:
ContactIntID,ContactID,FieldDefinitionID,ContactIntValue
41,22,13,27
Question: Is it possible to return the content of these tables in two rows like this:
ContactID,Name,Address,Age
21,"Person A","Address of person A",NULL
22,"Person B",NULL,27
I have looked into using COALESCE and temp tables, wondering if this is at all possible. Even if it is, maybe I'm only adding complexity whilst sacrificing performance for a benefit in data storage and user-definition options.
What do you think?
I don't think this is a good way to go because:
A simple insert of 1 record for a contact suddenly becomes n inserts. E.g. if you store varchar, nvarchar, int, bit, datetime, smallint and tinyint data for a contact, that's 7 inserts into datatype-specific tables, +1 for the main header record
Likewise, a query will automatically reference 7 tables, with 6 JOINs involved just to get the full details
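To illustrate that point with just the two example tables from the question, a conditional-aggregation query along these lines would produce the requested rows (a sketch only; every additional datatype table means yet another LEFT JOIN):

-- Sketch: pivot the per-datatype tables back into one row per contact
SELECT c.ContactID,
       MAX(CASE WHEN s.FieldDefinitionID = 11 THEN s.ContactStr50Value END) AS Name,
       MAX(CASE WHEN s.FieldDefinitionID = 12 THEN s.ContactStr50Value END) AS Address,
       MAX(CASE WHEN i.FieldDefinitionID = 13 THEN i.ContactIntValue   END) AS Age
FROM tblContact c
LEFT JOIN tblContactStr50 s ON s.ContactID = c.ContactID
LEFT JOIN tblContactInt   i ON i.ContactID = c.ContactID
GROUP BY c.ContactID;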
I personally think it's better to go for a less "generic" approach. Keep it simple.
Update:
The question is, do you really need a flexible solution like this? For contact data, you always expect to be able to store at least a core set of fields (address line 1-n, first name, surname etc). If you need a way for the user to store custom/user definable data on top of that standard data set, that's a common requirement. Various options include:
XML column in your main Contacts table to store all the user defined data
1 extra table containing key-value pair data, a bit like you originally talked about but to a much lesser degree! This would contain the key of the contact, the custom data item name and the value.
These have been discussed before here on SO so would be worth digging around for that question. Can't seem to find the question I'm remembering after a quick look though!
Found some that discuss the pros/cons of the key-value approach, to save repeating:
Key value pairs in relational database
Key/Value pairs in a database table
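A quick sketch of the first option on SQL Server 2008 (the table name, XML shape and field path are invented for illustration):

-- Sketch: keep the user-defined data as XML next to the fixed core columns
ALTER TABLE Contacts ADD CustomData XML NULL;

-- Reading one custom value back out of the XML
SELECT ContactID,
       CustomData.value('(/fields/field[@name="Age"]/text())[1]', 'int') AS Age
FROM Contacts;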

What are some methods for persisting customer configurable data in a database?

I'm looking for some ideas on methods for persisting customer configurable data in a relational database (SQL Server 2000 in my case).
For example, let's say you have a standard order entry application where your customer enters a product they want to buy. In addition to the fields that are important to both you and the customer (itemID, cost, etc.), you want to allow the client to enter information only relevant to them and persist it for them (for later retrieval on reports or invoices or whatever). You also want the labeling of these "customer fields" to be configurable by the customer. So one customer might have a field called "Invoice Number", while another customer might have 2 fields called "Invoice#" and "Invoice Date", etc.
I can think of a few ways to do this. You could have a customerfields table with some reasonable number of varchar fields related to each transaction, and then another table, clientcustomerfields, which contains the metadata about how many fields a customer uses, what the field names are, etc. Alternatively you could use XML to persist the customer data so you don't have to worry about filling up X # of fields; you'd still need some table to describe the customer's metadata (maybe through an XSD).
Are there any standard ways of doing this type of thing?
You should read Best Practices for Semantic Data Modeling for Performance and Scalability. This is exactly the question addressed by the white paper in the link.
One strategy I'd use:
customer_fields
- field_id
- customer_id
- field_name
customer_transaction_field_values
- transaction_id
- field_id
- field_value
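Spelled out as DDL, that strategy might look roughly like this (types, sizes and keys are assumptions):

-- Sketch: per-customer field definitions plus their per-transaction values
CREATE TABLE customer_fields (
    field_id    INT IDENTITY(1,1) PRIMARY KEY,
    customer_id INT          NOT NULL,
    field_name  VARCHAR(100) NOT NULL
);

CREATE TABLE customer_transaction_field_values (
    transaction_id INT          NOT NULL,
    field_id       INT          NOT NULL REFERENCES customer_fields (field_id),
    field_value    VARCHAR(255) NULL,
    PRIMARY KEY (transaction_id, field_id)
);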
As a generalisation I would recommend against using opaque XML blobs to store field-oriented data in a relational database.
The best solution is to have some 'user' fields and configuration within the application to set up how these fields are used and presented. If the fields are varchars the overhead for empty fields is fairly minimal, IIRC about 1 byte per field. Although this looks inelegant and has a finite number of fields, it is the simplest to query and populate which makes it the fastest. One option would be to make the configuration agnostic to the number of fields and simply run a script to add a few more fields if you need them.
Another option is to have a 'coding' table hanging off entities that have user-configurable fields. It has 'entity ID', 'field type' and 'field code' columns, where the 'field type' column denotes the actual content. The particular disadvantage is that it makes queries slower, as they potentially have to join against this table multiple times.
I've actually seen both (1) and (2) in use on the same system. The vendor originally started with (2) but then found it to be a pain in the arse and subsequent subsystems on the application went to using (1). This change in approach was borne out of bitter experience.
The principal strike against XML blobs is that they are not first class citizens in the database schema. The DBMS cannot enforce referential integrity on the blob by itself, it cannot index individual columns within the blob and querying the data from the blob is more complex and may not be supported by reporting tools. In addition, the content of the blob is completely opaque to the system data dictionary. Anyone trying to extract the data back out of the system is dependent on the application's documentation to get any insight into the contents.
In addition to your own suggestions, another way is to look at the Profile Provider system in ASP.net (Assuming a MS tech stack on this). You can extend the ProfileBase to include 2 arrays representing your user defined keys and another for the corresponding values. At that point, the SqlProfileProvider will handle the storage and retrieval of such and create your implementation of the ProfileBase object. Ultimately, this would be similar to if you were trying to use the ProfileProvider system in a Web Application project and not a Web Site project (which use different build managers).
I have done this in the past and used the concept of user defined fields. I would create four tables for the basic types:
UDFCharacter - id - int, order_id - int, value - varchar(4000)
UDFNumber - id - int, order_id - int, value - float
UDFDateTime - id - int, order_id - int, value - datetime
UDFText - id - int, order_id - int, value - text
I would then have a table that described the fields along with their type:
CustomField - id - int, customer_id - int (linked to customer table), fieldType - 'UDFCharacter, UDFNumber, etc', name - varchar, and other meta info
The responses to the fields go in the UDF tables. The fields get displayed on the page based on the CustomField table. Our system was way more complex and required more tables, but this seems like it would work.
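For illustration, reading one order's character-type responses back out might look like this (a sketch; it assumes the id column in the UDF tables refers to CustomField.id, a linkage the description above leaves implicit, and the order id is made up):

-- Sketch: the character-type custom values for one order,
-- labelled with the customer-defined field names
SELECT cf.name AS field_name,
       u.value AS field_value
FROM UDFCharacter u
JOIN CustomField cf ON cf.id = u.id
WHERE u.order_id = 12345;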