Many-to-many hierarchical relationships - sql

In my application I have three main tables:
User
Group
Role
Any combination has a many-to-many relationship. E.g. a user can be in different groups, having different roles in each of them.
The easiest part is to map user and group, done by the intermediate table User_Group. Then, when it came to design how to link the three together I had some doubts.
Q: Do I add another column in User_Group? Or create additional intermediate table?
Thinking about the second option, I tried this:
which (I think) would make retrieving the information I need in the front-end easier and tidier:
Groups available for the user (User_Group)
Roles available for a group (Group_Role)
Roles available for a user in a given group (user_Group_Role)

I don't see anything wrong with the schema you have proposed at the end of your post.
It ensures a user can be associated to a group without them being required to have a role within that group.
It can ensure that a user can only take a role applicable to the specific group.
It can allow the same role to be shared across multiple groups.
It can allow a group without any roles.
To shrink the schema you can come up with some nice tricks:
Create a role meaning "doesn't have a role in this group", to remove the need for User_Group
Create a dummy user for each group that has every role the group is eligible for, to remove the need for Group_Role
etc.
The downside here is that you end up needing bespoke code to deal with changes or to enforce constraints. The schema you propose at the end of your post can enforce all the required constraints with foreign key constraints and composite primary key constraints, and will in general be more flexible to future changes.
I see no reason not to use the schema you have proposed, it seems perfectly correct, maintainable, understandable, and resilient to me.
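To make the foreign key point concrete, here is a minimal sketch of that three-link-table schema. It assumes SQLite through Python's sqlite3 module (the question does not name a DBMS), and the plural table names Users, Groups and Roles plus all column names and sample rows are my own placeholders. The composite foreign keys on User_Group_Role are what stop a user from taking a role the group does not offer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE Users  (user_id  INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE Groups (group_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE Roles  (role_id  INTEGER PRIMARY KEY, name TEXT NOT NULL);

CREATE TABLE User_Group (
    user_id  INTEGER NOT NULL REFERENCES Users(user_id),
    group_id INTEGER NOT NULL REFERENCES Groups(group_id),
    PRIMARY KEY (user_id, group_id)
);
CREATE TABLE Group_Role (
    group_id INTEGER NOT NULL REFERENCES Groups(group_id),
    role_id  INTEGER NOT NULL REFERENCES Roles(role_id),
    PRIMARY KEY (group_id, role_id)
);
CREATE TABLE User_Group_Role (
    user_id  INTEGER NOT NULL,
    group_id INTEGER NOT NULL,
    role_id  INTEGER NOT NULL,
    PRIMARY KEY (user_id, group_id, role_id),
    -- the user must be a member of the group ...
    FOREIGN KEY (user_id, group_id) REFERENCES User_Group(user_id, group_id),
    -- ... and the role must be one the group offers
    FOREIGN KEY (group_id, role_id) REFERENCES Group_Role(group_id, role_id)
);
""")

# Hypothetical sample data
conn.execute("INSERT INTO Users VALUES (1, 'alice')")
conn.execute("INSERT INTO Groups VALUES (10, 'editors'), (20, 'billing')")
conn.execute("INSERT INTO Roles VALUES (100, 'admin'), (200, 'reviewer')")
conn.execute("INSERT INTO User_Group VALUES (1, 10)")
conn.execute("INSERT INTO Group_Role VALUES (10, 200)")
conn.execute("INSERT INTO User_Group_Role VALUES (1, 10, 200)")  # role offered by the group

try:
    # 'admin' (100) is not offered by group 10, so the database refuses it
    conn.execute("INSERT INTO User_Group_Role VALUES (1, 10, 100)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

assigned = conn.execute("SELECT COUNT(*) FROM User_Group_Role").fetchone()[0]
```

No trigger or application code is involved in the rejection; it falls out of the two composite keys.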

I would go with additional column in existing joining table. You can still fairly easily answer all your questions:
Groups available for the user
select distinct g.*
from Groups g
join JoiningTableName j on g.Group_Id = j.Group_Id
where j.User_Id = myUserId
Roles available for a group
select *
from Roles r
join JoiningTableName j on r.Role_Id = j.Role_Id
where j.Group_Id = myGroupId
Roles available for a user in a given group
select *
from Roles r
join JoiningTableName j on r.Role_Id = j.Role_Id
where j.User_Id = myUserId and j.Group_Id = myGroupId

Related

Relationship redundant?

I'm designing a database and I have a user table with users, and a group table with user's group.
These groups will have a owner (a user that has created it), and a set of users that are part of the group (like a Whatsapp group).
To represent this I have this design:
Do you think the owner column on the Group table is necessary? Maybe with an owner column on the Group table I can easily find the group's owner.
If you don't add the owner in the group then where are you going to add it? The only way I see apart from this is adding a boolean isowner to the usergroup. Anyway, this would not make sense if there will only be 1 owner. If there can be N owners then that would be the way to go.
You are on the right track, but you'll need one more step to ensure an owner must actually belong to the group she owns:
There is a FOREIGN KEY in Group {groupID, owner} that references UserGroup {groupID, userID}.
If your DBMS supports deferred foreign keys, you can make owner NOT NULL, to ensure a group cannot be owner-less. Otherwise you can leave it NULL-able (and you will still be able to break the "chicken-and-egg" problem with circular references if your DBMS supports MATCH SIMPLE FKs - almost all do).
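A rough sketch of the deferred variant, assuming SQLite via Python's sqlite3 module (it accepts DEFERRABLE INITIALLY DEFERRED). The table is called Groups here only because GROUP is a reserved word, and the Users table and sample rows are placeholders. The group and its owner's membership row go in within one transaction, and the circular keys are checked at COMMIT.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so BEGIN/COMMIT are explicit
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Users (userID INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE Groups (
    groupID INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    owner   INTEGER NOT NULL,
    -- the owner must be a member of the group it owns
    FOREIGN KEY (groupID, owner) REFERENCES UserGroup(groupID, userID)
        DEFERRABLE INITIALLY DEFERRED
);
CREATE TABLE UserGroup (
    groupID INTEGER NOT NULL REFERENCES Groups(groupID)
        DEFERRABLE INITIALLY DEFERRED,
    userID  INTEGER NOT NULL REFERENCES Users(userID),
    PRIMARY KEY (groupID, userID)
);
""")
conn.execute("INSERT INTO Users VALUES (1, 'ann'), (2, 'bob')")

# Group plus the owner's membership in one transaction: valid at COMMIT
conn.execute("BEGIN")
conn.execute("INSERT INTO Groups VALUES (10, 'family', 1)")
conn.execute("INSERT INTO UserGroup VALUES (10, 1)")
conn.execute("COMMIT")

# An owner without a membership row is rejected when the transaction commits
conn.execute("BEGIN")
conn.execute("INSERT INTO Groups VALUES (20, 'work', 2)")
try:
    conn.execute("COMMIT")
    owner_must_be_member = False
except sqlite3.IntegrityError:
    owner_must_be_member = True
    conn.execute("ROLLBACK")

group_count = conn.execute("SELECT COUNT(*) FROM Groups").fetchone()[0]
```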
You need 4 tables:
User
UserGroup
Group
UserRole (associated with UserGroup) - Shows the role of a user in a group (admin/owner, etc.) - If your roles are Admin and Ordinary user, you could use a Binary column on UserGroup instead.
I know a solution has already been proposed, but I am so convinced there is a better one ...
At the higher level, the concept of owner can be seen as a property of the relation existing between users and groups. It should then ideally be set as a field in the UserGroup table.
It could be either a boolean field or, even better, a generalized userGroupNatureOfRelation field, that could hold values such as 'owner', 'participant', 'user', or whatever could be the status.
Of course, such a solution allows you to implement any specific business rule, like 'there is only one owner per group'. It will let you implement any other more sophisticated business rule when needed, and it will even allow you to add a level of complexity by adding fields such as:
userGroupRelationStartDate
userGroupRelationEndDate
where you'll be able to follow the nature of the relation between a group and a person through time ...
Of course you could say 'I don't need it'. But implementing such an 'open' model does not cost anything more than what you are thinking of now. Then, if for any reason, in a near or far future, business rules have to be changed or improved, your model stays valid and efficient ...
This said, building views and manipulating data should be a lot easier with this model. So this is a good and immediate reason to adopt it!
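As an illustration of how such a relation field can still carry the "only one owner per group" rule, here is a sketch that assumes SQLite via Python's sqlite3 module and uses a partial unique index, which not every DBMS supports. The allowed values, the dates and the index name are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE UserGroup (
    userID  INTEGER NOT NULL,
    groupID INTEGER NOT NULL,
    userGroupNatureOfRelation TEXT NOT NULL
        CHECK (userGroupNatureOfRelation IN ('owner', 'participant')),
    userGroupRelationStartDate TEXT,
    userGroupRelationEndDate   TEXT,
    PRIMARY KEY (userID, groupID)
);
-- Business rule "only one owner per group", expressed as a partial unique index
CREATE UNIQUE INDEX one_owner_per_group
    ON UserGroup (groupID)
    WHERE userGroupNatureOfRelation = 'owner';
""")

conn.execute("INSERT INTO UserGroup VALUES (1, 10, 'owner', '2024-01-01', NULL)")
conn.execute("INSERT INTO UserGroup VALUES (2, 10, 'participant', '2024-01-05', NULL)")

try:
    # A second owner for group 10 violates the partial unique index
    conn.execute("INSERT INTO UserGroup VALUES (3, 10, 'owner', '2024-02-01', NULL)")
    second_owner_rejected = False
except sqlite3.IntegrityError:
    second_owner_rejected = True

owner = conn.execute(
    "SELECT userID FROM UserGroup "
    "WHERE groupID = 10 AND userGroupNatureOfRelation = 'owner'"
).fetchone()[0]
```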

How should I record in the database that an item/product is visible by 'all' groups?

A user can be in groups. And an item/product is assigned groups that can see the item. Users can see that item if they are in one of the assigned groups.
I want neither public (anonymous users in no groups) nor groupless users (logged in users not in any groups) to see the item. But I want the interface to allow assigning the item an 'all/any groups' attribute so that users that are in any group at all can see the item.
Where/How should I store this assignment?
p.s. I expect the technique to also be extended to other entities, for example I'd assign a file to a category, and groups are linked to categories. so when a file is marked as visible by the 'all/any category' then if the user (thru groups and group-categories) is linked to at least one category then the file is visible to them.
Decision:
It seemed the choice was whether to implement as a row in an entity-groups table or as fields in the entity table. The chosen answer used the former.
And either managing the group membership in a table or adding JOIN conditions. The chosen answer used the former, but I'm going to use the latter. I'm putting an indirection between the query and usage so if (when) performance is a problem I should be able to change to a managed table underneath (as suggested) without changing usage.
I've other special groups like 'admin', 'users', etc. which can also fit into the same concept (the basis simply being a list of groups) more easily than special and variable field handling for each entity.
thanks all.
I'd put it in the items table as a boolean/bit column IsVisibleToAllGroups.
It does make queries to get all items for a user a bit less straightforward. The alternative would be to expand out "all groups" by adding a permission row for each individual group, but this can lead to a huge expansion in the number of rows. You would also have to keep this up-to-date whenever a group is added later, and somehow distinguish between a permission that was granted explicitly to all (current and future) groups and one that just happened to be granted to all groups currently in existence.
Edit: You don't mention the RDBMS you are using. One other approach you could take would be to have a hierarchy of groups.
GroupId     ParentGroupId Name
----------- ------------- ----------
0           NULL          Base Group
1           0             Group 1
2           0             Group 2
You could then assign your "all" permissions to GroupId=0 and use (SQL Server approach below)
WITH GroupsForUser
     AS (SELECT G.GroupId,
                G.ParentGroupId
         FROM   UserGroups UG
                JOIN Groups G
                  ON G.GroupId = UG.GroupId
         WHERE  UserId = @UserId
         UNION ALL
         SELECT G.GroupId,
                G.ParentGroupId
         FROM   Groups G
                JOIN GroupsForUser GU
                  ON G.GroupId = GU.ParentGroupId)
SELECT IG.ItemId
FROM   GroupsForUser GU
       JOIN ItemGroups IG
         ON IG.GroupId = GU.GroupId
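For anyone who wants to try the hierarchy idea without a SQL Server instance, here is a small runnable adaptation. It assumes SQLite via Python's sqlite3 module, so the CTE is written WITH RECURSIVE, and the user ids, item ids and the items_for_user helper are invented for the example. Item 100 is attached to the base group (GroupId 0), so any user in at least one group reaches it by walking up the parent chain, while a user in no group gets nothing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Groups (GroupId INTEGER PRIMARY KEY, ParentGroupId INTEGER, Name TEXT);
CREATE TABLE UserGroups (UserId INTEGER, GroupId INTEGER);
CREATE TABLE ItemGroups (ItemId INTEGER, GroupId INTEGER);
INSERT INTO Groups VALUES (0, NULL, 'Base Group'), (1, 0, 'Group 1'), (2, 0, 'Group 2');
INSERT INTO UserGroups VALUES (7, 1);                         -- hypothetical user 7 is in Group 1
INSERT INTO ItemGroups VALUES (100, 0), (200, 2), (300, 1);   -- item 100 is for "all groups"
""")

def items_for_user(user_id):
    """Return the item ids visible to a user through their groups and all ancestor groups."""
    rows = conn.execute("""
        WITH RECURSIVE GroupsForUser(GroupId, ParentGroupId) AS (
            SELECT G.GroupId, G.ParentGroupId
            FROM UserGroups UG
            JOIN Groups G ON G.GroupId = UG.GroupId
            WHERE UG.UserId = :user_id
            UNION ALL
            SELECT G.GroupId, G.ParentGroupId
            FROM Groups G
            JOIN GroupsForUser GU ON G.GroupId = GU.ParentGroupId
        )
        SELECT DISTINCT IG.ItemId
        FROM GroupsForUser GU
        JOIN ItemGroups IG ON IG.GroupId = GU.GroupId
        ORDER BY IG.ItemId
    """, {"user_id": user_id}).fetchall()
    return [r[0] for r in rows]

member_items = items_for_user(7)     # user in Group 1
groupless_items = items_for_user(8)  # user in no group at all
```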
As mentioned by both Martin Smith and Mikael Eriksson, making this a property of the entity is a very tidy and straightforward approach. Purely in terms of data representation, this has a very nice feel to it.
I would, however, also consider the queries that you are likely to make against the data. For example, based on your description, you seem most likely to have queries that start with a single user, find the groups they are a member of, and then find the entities they are associated to. Possibly something like this...
SELECT DISTINCT  -- If both user and entity relate to multiple groups, de-dupe them
  entity.*
FROM
  user
INNER JOIN
  user_link_group
    ON user.id = user_link_group.user_id
INNER JOIN
  group_link_entity
    ON group_link_entity.group_id = user_link_group.group_id
INNER JOIN
  entity
    ON entity.id = group_link_entity.entity_id
WHERE
  user.id = @user_id
If you were to use this format, and the idea of a property in the entity table, you would need something much less elegant, and I think the following UNION approach is possibly the most efficient...
<ORIGINAL QUERY>
UNION -- Not UNION ALL, as the next query may duplicate results from above
SELECT
  entity.*
FROM
  entity
WHERE
  EXISTS (SELECT * FROM user_link_group WHERE user_id = @user_id)
  AND isVisibleToAllGroups != 0
-- NOTE: This also implies the need for an additional index on [isVisibleToAllGroups]
Rather than create the corner case in the "what entity can I see" query, it is instead an option to create the corner case in the maintenance of the link tables...
Create a GLOBAL group
If an entity is visible to all groups, map it to the GLOBAL group
If a user is added to a group, ensure they are also linked to the GLOBAL group
If a user is removed from all groups, ensure they are also removed from the GLOBAL group
In this way, the original simple query works without modification. This means that no UNION is needed, with its overhead of sorting and de-duplication, and neither is the INDEX on isVisibleToAllGroups needed. Instead, the overhead is moved to maintaining which groups a user is linked to; a one time overhead instead.
This assumes that the question "what entities can I see" is more common than changing groups. It also adds a behaviour that is defined by the DATA and not by the SCHEMA, which necessitates good documentation and understanding. As such, I do see this as a powerful type of optimisation, but I also see it as a trade-off that needs accounting for in the database design.
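A sketch of what that maintenance could look like, assuming SQLite via Python's sqlite3 module. The reserved id 0 for the GLOBAL group, the helper function names and the sample rows are all hypothetical, and the two helpers are just one way to keep the GLOBAL link in step with real memberships.

```python
import sqlite3

GLOBAL_GROUP = 0  # hypothetical id reserved for the GLOBAL group

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entity (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE user_link_group (
    user_id INTEGER NOT NULL, group_id INTEGER NOT NULL,
    PRIMARY KEY (user_id, group_id)
);
CREATE TABLE group_link_entity (
    group_id INTEGER NOT NULL, entity_id INTEGER NOT NULL REFERENCES entity(id),
    PRIMARY KEY (group_id, entity_id)
);
""")

def add_user_to_group(user_id, group_id):
    # Joining any group also links the user to GLOBAL (no-op if already linked)
    rows = [(user_id, group_id), (user_id, GLOBAL_GROUP)]
    conn.executemany("INSERT OR IGNORE INTO user_link_group VALUES (?, ?)", rows)

def remove_user_from_group(user_id, group_id):
    conn.execute("DELETE FROM user_link_group WHERE user_id = ? AND group_id = ?",
                 (user_id, group_id))
    remaining = conn.execute(
        "SELECT COUNT(*) FROM user_link_group WHERE user_id = ? AND group_id <> ?",
        (user_id, GLOBAL_GROUP)).fetchone()[0]
    if remaining == 0:
        # Last real group is gone, so drop the GLOBAL link as well
        conn.execute("DELETE FROM user_link_group WHERE user_id = ? AND group_id = ?",
                     (user_id, GLOBAL_GROUP))

def visible_entities(user_id):
    # The simple query from above, with no special case for "all groups"
    rows = conn.execute("""
        SELECT DISTINCT entity.id
        FROM user_link_group
        JOIN group_link_entity ON group_link_entity.group_id = user_link_group.group_id
        JOIN entity ON entity.id = group_link_entity.entity_id
        WHERE user_link_group.user_id = ?
        ORDER BY entity.id
    """, (user_id,)).fetchall()
    return [r[0] for r in rows]

conn.execute("INSERT INTO entity VALUES (1, 'visible to all groups'), (2, 'group 5 only')")
conn.execute("INSERT INTO group_link_entity VALUES (?, 1), (5, 2)", (GLOBAL_GROUP,))

add_user_to_group(42, 6)
in_group = visible_entities(42)   # member of group 6, so sees the GLOBAL entity
remove_user_from_group(42, 6)
groupless = visible_entities(42)  # no groups left, so sees nothing
```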
Instead of a boolean, which needs additional logic in every query, I'd add a column 'needs_group' which contains the name (or number) of the group that is required for the item. Whether a NULL field means 'nobody' or 'everybody' is only a (allow/deny) design-decision. Creating one 'public' group and putting everybody in it is also a design decision. YMMV
This concept should get you going:
The user can see the product if:
the corresponding row exists in USER_GROUP_PRODUCT
or PRODUCT.PUBLIC is TRUE (and user is in at least one group, if I understand your question correctly).
There are 2 key points to consider about this model:
Liberal usage of identifying relationships - primary keys of parents are "migrated" within primary keys of children, which enables "merging" of GROUP_ID at the bottom USER_GROUP_PRODUCT. This is what allows the DBMS to enforce the constraint that both user and product have to belong to the same group to be mutually visible. Usage of non-identifying relationships and surrogate keys would prevent the DBMS from being able to enforce that directly (you'd have to write custom triggers).
Usage of PRODUCT.PUBLIC - you'll have to treat this field as "magic" in your client code. The alternative is to simply fill the USER_GROUP_PRODUCT with all the possible combinations, but this approach is fragile in case a new user is added - it would not automatically see the product unless you update the USER_GROUP_PRODUCT as well, but how would you know you need to update it unless you have a field such as PRODUCT.PUBLIC? So if you can't avoid PRODUCT.PUBLIC anyway, why not treat it specially and save some storage space in the database?

SQL Server table with different user profiles

I am designing my db tables in SQL Server 2005 and have come across a small design/architecture issue... I have my main Users table (username, password, lastlogin, etc.), but I also need to store 2 different user profiles, i.e. the profile data stored will be different between the two. I've put all the common user data into the Users table.
Do I create separate tables for Consumers and Marketers? And if so, should the primary key in these tables be [table-name]_UserID with a 1:1 relationship on Users_UserID?
Basically, upon registering, the user will be given the choice to register as a Consumer or Marketer. When a user logs in, the Users table will be queried, and their accompanying profile will be queried from either table.
I know this approach is messy, which is why I've come here to ask how best this can be achieved.
Thanks!
EDIT: Additionally, in the Users table I have a Users_UserType flag that will allow me to distinguish between users when they log in, hence knowing which Profile Table to query.
Your gut feeling is correct. You want to normalize your data. Using separate tables reduces data duplication, or empty/null columns.
Unfortunately, with a reverse relationship like this, you won't have a nice clean foreign key from User to Consumer or Marketer because it could be one table or another.
You would want to map a User_Id from the Consumer/Marketers table back to User though.
You could query it in a single query using left joins:
Select
    u.*,
    c.*,
    m.*
From Users u
    left join Consumers c on c.User_Id = u.ID
    left join Marketers m on m.User_Id = u.ID
Where
    u.ID = @UserId
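A minimal sketch of the table layout this implies, where the profile tables use the user's id as both primary key and foreign key to get the 1:1 relationship. It assumes SQLite via Python's sqlite3 module rather than SQL Server 2005, and the profile columns (ShippingAddress, CompanyName) and sample rows are invented placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Users (
    ID       INTEGER PRIMARY KEY,
    Username TEXT NOT NULL,
    UserType TEXT NOT NULL CHECK (UserType IN ('consumer', 'marketer'))
);
-- User_Id is both primary key and foreign key, so each user has at most one row per profile table
CREATE TABLE Consumers (
    User_Id         INTEGER PRIMARY KEY REFERENCES Users(ID),
    ShippingAddress TEXT
);
CREATE TABLE Marketers (
    User_Id     INTEGER PRIMARY KEY REFERENCES Users(ID),
    CompanyName TEXT
);
""")
conn.execute("INSERT INTO Users VALUES (1, 'carol', 'consumer'), (2, 'mike', 'marketer')")
conn.execute("INSERT INTO Consumers VALUES (1, '12 Elm St')")
conn.execute("INSERT INTO Marketers VALUES (2, 'Acme')")

# One query fetches the user and whichever profile exists; the other side comes back as NULL
row = conn.execute("""
    SELECT u.Username, u.UserType, c.ShippingAddress, m.CompanyName
    FROM Users u
    LEFT JOIN Consumers c ON c.User_Id = u.ID
    LEFT JOIN Marketers m ON m.User_Id = u.ID
    WHERE u.ID = ?
""", (2,)).fetchone()
```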

Exclude ambiguous columns in SELECT statement (actually CREATE VIEW statement)

After reading around, I've realized that SQL (MySQL in my case) does not support column exclusion.
SELECT *, NOT excluded_column FROM table; /* shame it doesn't work */
Anyways, while I've come to accept that, I'm wondering if there's any decent workarounds to achieve this sort of behavior. Reason being, I'm creating a view to consolidate information across a few tables.
I've normalized some user data to tables user and user_profile among others; purpose being that user stores data critical to user operations, and user_profile stores non-critical data. Application requirements are still being realized, so columns are being added/removed from user_profile as necessary, and further tables may be supported down the line which would be included in the view.
Problem is, when I create the view, I get Error 1060: Duplicate Column Name because user_id is present in both tables.
Now, the solution I've come up with so far, is basically:
/* exclude user_id from user */
SELECT user.critical_field, user.other_critical_field,
user_profile.*
FROM user
LEFT JOIN user_profile
ON user.user_id = user_profile.user_id;
Since the user table is going to remain unchanged throughout the application lifecycle (hopefully) this could suffice, but I was just curious if a more dynamic approach exists.
(Table names were not copypasta'd, I know user is often a poor choice of naming convention on its own, I use prefixes.)
Typically, I'll define which fields I want to be in my view.
Using your example:
SELECT user.critical_field, user.other_critical_field,
user_profile.User_Id, user_profile.MyOtherOfield
FROM user
LEFT JOIN user_profile
ON user.user_id = user_profile.user_id;
Now, additionally, I'll make sure that I alias things properly:
SELECT u.critical_field, u.other_critical_field,
up.User_Id, up.MyOtherOfield, u.KeyField AS userKey, up.KeyField as ProfileKey
FROM user as u
LEFT JOIN user_profile as up
ON u.user_id = up.user_id;
This allows me to ensure I know what's in my view, and that the columns are named intelligently, but it does mean that I'll need to touch that view when I make changes to the underlying table structures.
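As a quick way to check the resulting column names, here is a sketch assuming SQLite via Python's sqlite3 module instead of MySQL. The view name user_full and the nickname column are placeholders. With every column listed and aliased, user_id appears exactly once in the view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (
    user_id INTEGER PRIMARY KEY,
    critical_field TEXT,
    other_critical_field TEXT
);
CREATE TABLE user_profile (
    user_id  INTEGER PRIMARY KEY REFERENCES user(user_id),
    nickname TEXT
);
-- Every column is listed explicitly, so user_id is taken from one table only
CREATE VIEW user_full AS
SELECT u.user_id,
       u.critical_field,
       u.other_critical_field,
       up.nickname AS profile_nickname
FROM user AS u
LEFT JOIN user_profile AS up ON u.user_id = up.user_id;
""")
conn.execute("INSERT INTO user VALUES (1, 'a', 'b')")

cur = conn.execute("SELECT * FROM user_full")
columns = [d[0] for d in cur.description]  # column names exposed by the view
rows = cur.fetchall()
```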

Is it better to have roles as a column on my users table, or do it through join tables (Roles & Assignments)? - Rails 3

You can see my models here:
https://gist.github.com/768947
Just to explain what is happening, I have a few models. At the core of my app, there are: projects, stages, uploads, comments, users. Then there are roles & assignments to manage user authorization.
I am doing this with the plugin declarative_authorization & devise for login.
So the first question is, is it better to just add a column of 'Roles' to my user model/table and store the roles for each user there? If I have a user that has multiple roles, then I can just store all the roles as an array and cycle through them as needed.
Or is it better to do it like I have it setup now, where I use two separate tables and a bunch of joins to setup the assignments? I only have 4 roles: designer, client, admin, superuser.
Better in the sense that it is 'less expensive' from a computing resources standpoint to do each query with the column, than with the joins or is the difference not that significant?
I guess, the root of my question is...right now if I want to get a project assigned to the current_user I simply do current_user.projects.each do |project| and cycle through them that way. This is after I have done: @projects = current_user.projects in the projects controller. The same applies for all my other models - except Users & Roles.
However, if I wanted to find a user with role "client", it becomes convoluted very quickly. Or am I overcomplicating it?
Any help would be appreciated.
I think it's better to have user and role tables that are separate. It's a many-to-many relationship, because a user can have many roles and many users can have the same role. You'll need a JOIN table (e.g. user_role) to do that. I'd recommend three tables. Of course you'll have primary keys for both user and role; the user_role table will have two columns, one for each primary key, and foreign key relationships to their respective tables.
So now if you want all users with the role "client", it's an easy JOIN between user and user_role. If you want a particular user's roles, you'll need to JOIN the three tables.
I would not recommend an array of roles in user. It goes against first normal form.
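A small sketch of the three tables and both lookups, assuming SQLite via Python's sqlite3 module rather than the Rails models from the gist. The user names and ids are made up. When filtering by role name rather than role id, the role table has to be joined in as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE role (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE user_role (
    user_id INTEGER NOT NULL REFERENCES user(id),
    role_id INTEGER NOT NULL REFERENCES role(id),
    PRIMARY KEY (user_id, role_id)
);
INSERT INTO role VALUES (1, 'designer'), (2, 'client'), (3, 'admin'), (4, 'superuser');
INSERT INTO user VALUES (1, 'dana'), (2, 'eli');
INSERT INTO user_role VALUES (1, 1), (1, 3), (2, 2);
""")

# All users holding the role "client"
clients = [r[0] for r in conn.execute("""
    SELECT u.name
    FROM user u
    JOIN user_role ur ON ur.user_id = u.id
    JOIN role r ON r.id = ur.role_id
    WHERE r.name = ?
    ORDER BY u.name
""", ("client",))]

# All roles of one particular user
dana_roles = [r[0] for r in conn.execute("""
    SELECT r.name
    FROM role r
    JOIN user_role ur ON ur.role_id = r.id
    WHERE ur.user_id = ?
    ORDER BY r.name
""", (1,))]
```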