I'm building a rather complex web application with Java / Spring and at least 2 different databases:
RDBMS for main data
MongoDB for files (via GridFS) and other data CLOBs/JSON/etc.
The next step is authorization. Simple role-based authorization isn't enough, because users should be allowed or disallowed to view or modify individual resources. So ACLs came to mind.
The most common simple ACL table probably looks like:
TABLE  | FIELDS
-------+--------------
class  | id, className
object | id, class_id, objectId
acl    | id, object_id, user_id, permissionBitMask (crud)
But unfortunately that's not enough for my needs :(.
I also need:
Roles:
each User can have several roles, and
an ACL entry can also belong to a role
More Permissions:
For example: each Project can have multiple tasks, but a User who can modify the project details isn't allowed to create new tasks for this project. So there must be a separate permission for that.
ObjectId of different types:
The RDBMS tables will use UUID surrogate keys (so at least I never have to deal with composite keys here)
But MongoDB of course uses its own ObjectId
Additionally I will have some static resources inside the code which must be access restricted as well.
Parent Objects to inherit permissions
If I combine all these aspects, I get the following table structure:
TABLE          | FIELDS
---------------+--------------
class          | id, className
object         | id, class_id, objectId, parent_object_id
acl            | id, object_id, user_id, role_id
permission     | id, permissionName
acl_permission | id, acl_id, permission_id, granted
Of course I could split the acl table into 2 tables (1. object + user, 2. object + role), but I don't think that really matters.
The "objectId" will be a simple VARCHAR and my application has to convert it from/to String. Else I'd have 5 additional tables for my different ObjectId types. And this would result in 5 additional JOIN operations...
Now the basic lookup query would be something like this:
SELECT p.granted
FROM acl a
JOIN acl_permission p ON p.acl_id = a.id
WHERE p.permission_id = ?
AND (
    (a.object_id = ? AND a.user_id = ?)
    OR (a.object_id = ? AND a.role_id IN (?))
)
(Permissions are cached, Roles for current user are also cached via session context. granted just indicates, if the user has the permission or not.)
Then I would also have to apply a recursive SELECT to get the parent object's ACL if there's no ACL entry for the current object.
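To make the recursive lookup concrete, here is a minimal sketch using Python's built-in sqlite3 module and a recursive common table expression. It assumes the table layout above with integer ids, checks only the user entry (the role branch is left out for brevity), and uses made-up sample data; the function name is_granted is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE object (
        id               INTEGER PRIMARY KEY,
        objectId         TEXT NOT NULL,
        parent_object_id INTEGER REFERENCES object(id)
    );
    CREATE TABLE acl (
        id        INTEGER PRIMARY KEY,
        object_id INTEGER NOT NULL REFERENCES object(id),
        user_id   INTEGER,
        role_id   INTEGER
    );
    CREATE TABLE acl_permission (
        id            INTEGER PRIMARY KEY,
        acl_id        INTEGER NOT NULL REFERENCES acl(id),
        permission_id INTEGER NOT NULL,
        granted       INTEGER NOT NULL
    );
    -- Sample data (assumed): object 1 -> object 2 -> object 3,
    -- and only object 1 carries an ACL entry for user 42.
    INSERT INTO object VALUES (1, 'project-a', NULL);
    INSERT INTO object VALUES (2, 'tasklist-a', 1);
    INSERT INTO object VALUES (3, 'task-a', 2);
    INSERT INTO acl VALUES (1, 1, 42, NULL);
    INSERT INTO acl_permission VALUES (1, 1, 7, 1);
""")

# Walk from the requested object up to the root, then take the
# nearest object (smallest depth) that has a matching ACL entry.
LOOKUP = """
WITH RECURSIVE chain(id, parent_object_id, depth) AS (
    SELECT id, parent_object_id, 0 FROM object WHERE id = :object_id
    UNION ALL
    SELECT o.id, o.parent_object_id, c.depth + 1
    FROM object o JOIN chain c ON o.id = c.parent_object_id
)
SELECT p.granted
FROM chain c
JOIN acl a ON a.object_id = c.id
JOIN acl_permission p ON p.acl_id = a.id
WHERE p.permission_id = :permission_id
  AND a.user_id = :user_id
ORDER BY c.depth
LIMIT 1
"""

def is_granted(object_id, user_id, permission_id):
    """Return True if the nearest ACL entry on the parent chain grants the permission."""
    row = conn.execute(LOOKUP, {
        "object_id": object_id,
        "user_id": user_id,
        "permission_id": permission_id,
    }).fetchone()
    return bool(row and row[0])
```

With this sample data, is_granted(3, 42, 7) resolves through two parents to the entry on object 1. Whether this is fast enough would depend on the depth of the hierarchy and on the indexes on object.parent_object_id and acl.object_id.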
This can't be very performant. So what are the alternatives? My ideas:
Different DB schema (any ideas!?)
Graph Database like Neo4j.
Neo4j advantages:
Finding the first parent with a permission entry is a simple task for this DB
Storing an array of permissions within the ACL entry is possible -> no JOIN
Basically I could store all information in a single Node:
{
class: ClassName,
object: ObjectId,
parent: RelationToParentNode,
user: UserId,
role: RoleId,
grantedPermissions: [Permission1, Permission2, ...]
}
(Every permission, that is not listed inside the array, is automatically not granted. It's not possible to store complex types in a Neo4j array, so there's no way to store something like permissions: [{Permission1: true}, {Permission2: false}])
Of course it's also possible to store Permissions and Classes as separate Nodes and just link them all together. But I don't know what's the better approach with Neo4j.
Any ideas on this? Is there any out-of-the-box solution? Maybe there's a reason to use MongoDB for ACL?
I read about XACML and OAuth(2), but both seem to need an additional ACL schema to do what I need. Or am I wrong?
First of all, the complex permission system you're looking for has a standard spec called RBAC (Role-Based Access Control). I've implemented various RBAC models in SQL, both simple and complex. They work fine, but a SQL implementation is not fast on commodity hardware once the number of relationships grows above a million. Reads are instant, but writes are slow because of the heavy work of duplicating records in order to provide fast reads.
Originally, when I designed the Permission System I literally "drew" it on a paper based on the RBAC spec. The output was, indeed, a graph. So, after two years of production usage, I'm thinking of switching to a native graph database.
Neo4j is a popular solution, but some important customers seem to be dissatisfied with it due to its weak clustering and replication system. So have a look at OrientDB (see OrientDB vs Neo4j).
You have mentioned above that "it's not possible to store complex types in a Neo4j array". OrientDB boasts of having addressed this issue with custom data types. I haven't personally tried it yet, but I'm planning to test it after migrating our production data.
Related
My DB has following tables:
Resource: Some resources can be uploaded on site
Groups: Groups on site
Users: Users on site (not necessarily be part of any group but could be if they like)
Now, when someone uploads a resource, ownership of that resource is currently given to its uploader by default. So the Resource table has a column OwnerID with a foreign key association to the User table.
But now, this has to be changed such that ownership of a resource could be given to either a user or entire group.
I'm trying to decide the migration scheme, to move this owner being user to an entity that could be either user or group. Intuition is that when someone uploads a material, he can choose its owner to be a user or entire group.
Currently my migration plan involves:
Add OwnerType (User, Group, Global) and UserOwner and GroupOwner within the Material table (probably worse normalized table).
OwnerType could be Global if owner is everyone --or-- Group if owner is group entity else user.
Then when I'm querying the Resource table, I can check the OwnerType to conditionally select its owner from either the user table or the group table.
I do not know if this is a good way. I'm using Entity Framework, and things have already started to look ugly, as User and Group hardly have any relationship that would require me to make a generalized entity.
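To make the plan above concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than Entity Framework. All table and column names (app_user, app_group, owner_type, user_owner, group_owner) and the sample rows are assumptions for illustration; the CHECK constraint is one possible way to keep the three columns consistent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_user  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE app_group (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE resource (
        id          INTEGER PRIMARY KEY,
        title       TEXT NOT NULL,
        owner_type  TEXT NOT NULL,
        user_owner  INTEGER REFERENCES app_user(id),
        group_owner INTEGER REFERENCES app_group(id),
        -- Exactly the owner column matching owner_type may be set.
        CHECK (
            (owner_type = 'User'   AND user_owner IS NOT NULL AND group_owner IS NULL) OR
            (owner_type = 'Group'  AND group_owner IS NOT NULL AND user_owner IS NULL) OR
            (owner_type = 'Global' AND user_owner IS NULL AND group_owner IS NULL)
        )
    );
    INSERT INTO app_user  VALUES (1, 'alice');
    INSERT INTO app_group VALUES (1, 'staff');
    INSERT INTO resource VALUES (1, 'notes.pdf',  'User',   1,    NULL);
    INSERT INTO resource VALUES (2, 'slides.pdf', 'Group',  NULL, 1);
    INSERT INTO resource VALUES (3, 'faq.txt',    'Global', NULL, NULL);
""")

def owners():
    """Return (title, owner name) pairs, picking the owner table by owner_type."""
    return conn.execute("""
        SELECT r.title,
               CASE r.owner_type
                    WHEN 'User'  THEN u.name
                    WHEN 'Group' THEN g.name
                    ELSE 'everyone'
               END AS owner_name
        FROM resource r
        LEFT JOIN app_user  u ON u.id = r.user_owner
        LEFT JOIN app_group g ON g.id = r.group_owner
        ORDER BY r.id
    """).fetchall()
```

The two LEFT JOINs are the "conditional select" from the plan; a row whose owner_type does not match its filled-in owner column is rejected by the CHECK constraint.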
Can some expert guide me on this? What is generally considered good migration plan in this case? Thanks for any help or suggestions.
The thing I'm trying to implement is an id table. Basically it has the structure (user_id, lecturer_id), where user_id refers to the primary key in my User table and lecturer_id refers to the primary key of my Lecturer table.
I'm trying to implement this in Redis, but if I set the key to the user's primary id, then a query like "get all the records with lecturer id = 5" can't be answered in O(1) time, because the lecturer id is the value, not the key.
How can I form a structure like the id table I mentioned above, or does Redis not support that?
One of the things you learn fast while working with Redis is that you get to design your data structure around your access needs, especially when it comes to relations (it's not a relational database after all).
There is no way to search by "value" with an O(1) time complexity, as you already noticed, but there are ways to approach what you describe using Redis. Here's what I would recommend:
Store your user data by user id (in e.g. a hash) as you are already doing.
Have an additional set for each lecturer id containing all user ids that correspond to the lecturer id in question.
This might seem like duplicating the data of the relation, since your user data would have to store the lecturer id, and your lecturer data would store user ids, but that's the (tiny) price to pay if one is to build relations in a non-relational data store like Redis. In practical terms this works well; memory is rarely a bottleneck for small-ish data sets (think thousands of ids).
To get a better picture of how people are using Redis to model applications with relations, I recommend reading Design and implementation of a simple Twitter clone and the source code of Lamernews, both of which are written by Redis author Salvatore Sanfilippo.
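The two structures recommended above can be sketched as follows. This is an in-memory stand-in written with plain Python dicts and sets so that it runs without a Redis server; the comments name the Redis commands each step would correspond to. The class name RelationStore and the key patterns user:<id> and lecturer:<id>:users are assumptions for illustration.

```python
from collections import defaultdict

class RelationStore:
    """In-memory stand-in for a user hash plus one set of user ids per lecturer."""

    def __init__(self):
        self.user_hashes = {}                   # stands in for hashes "user:<id>"
        self.lecturer_sets = defaultdict(set)   # stands in for sets "lecturer:<id>:users"

    def set_lecturer(self, user_id, lecturer_id):
        """Point a user at a lecturer and keep the reverse index in sync."""
        old = self.user_hashes.get(user_id, {}).get("lecturer_id")
        if old is not None:
            # SREM lecturer:<old>:users <user_id>
            self.lecturer_sets[old].discard(user_id)
        # HSET user:<user_id> lecturer_id <lecturer_id>
        self.user_hashes.setdefault(user_id, {})["lecturer_id"] = lecturer_id
        # SADD lecturer:<lecturer_id>:users <user_id>
        self.lecturer_sets[lecturer_id].add(user_id)

    def lecturer_of(self, user_id):
        # HGET user:<user_id> lecturer_id
        return self.user_hashes.get(user_id, {}).get("lecturer_id")

    def users_of(self, lecturer_id):
        # SMEMBERS lecturer:<lecturer_id>:users
        return set(self.lecturer_sets.get(lecturer_id, set()))


store = RelationStore()
store.set_lecturer(1, 5)
store.set_lecturer(2, 5)
store.set_lecturer(3, 7)
print(sorted(store.users_of(5)))  # user ids attached to lecturer 5
```

The important part is that every write touches both structures; against a real Redis server one would typically wrap the two or three commands in a MULTI/EXEC transaction so the hash and the set cannot drift apart.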
As already answered, in vanilla Redis there is no way to store the data only once and have Redis query them for you.
You have to maintain secondary indexes yourself.
However, with Redis modules this is not necessarily true. Modules like zeeSQL or RediSearch allow you to store data directly in Redis and retrieve it with a SQL query (zeeSQL) or a SQL-like query (RediSearch).
In your case, a small example with zeeSQL.
> ZEESQL.CREATE_DB DB
OK
> ZEESQL.EXEC DB COMMAND "CREATE TABLE user(user_id INT, lecture_id INT);"
OK
> ZEESQL.EXEC DB COMMAND "SELECT * FROM user WHERE lecture_id = 3;"
... your result ...
I have added a series of ASPNET database tables for roles, user and membership management to my existing SQL database using aspnet_regsql.exe.
There is already a user table in the existing database which contains information (ID, Name, Address, Postcode, etc.) for a number of users. What I want to achieve is to associate the new aspnet_Users table with the existing user table.
Are there any recommended options for doing this? Thanks
Cheers,
Alex
The UserKey, called UserId in the ASPnet membership tables, is the GUID which identifies a user. You can add a UserKey column to your Users table and then start doing dangerous things like:
select *
from Users as U inner join
aspnet_Users as aU on aU.UserId = U.UserKey inner join
aspnet_Membership as aM on aM.UserId = aU.UserId
where U.UserId = @UserId
No warranty, expressed or implied, is provided by Microsoft (or me) if you want to fiddle about directly in their tables.
We had a similar situation on a project I worked on a couple years ago. What we ended up doing was storing the primary key of the related user record from the external user table as a Profile Property of the ASPNET Membership model.
The benefit was that we didn't have to change anything about the schema of the external database to create the relationship and we could use the built in ASPNET Membership profile objects to easily obtain the related key from within the web code-behinds.
The initial population of this profile property was accomplished via a utility we wrote specifically for the task using ASPNET Membership Profile objects and was made easier by the fact that both our Membership setup and external table stored the email address of the user making it the key for the one time task.
The downside of this approach is that the ASPNET Membership Profile table is very much NOT denormalized (or really normalized for that matter). It stores the Profile Properties as either XML data or serialized binary. In older versions it was serialized with the property names stored as names and character positions into a single value string containing all values. This makes it hard (if not impractical) to write queries, joins, etc. from the side of your external table.
For us this wasn't a big deal because we were only working with the external user data on a case by case basis from the website. So, grabbing the key from the ASPNET profile using built objects and then looking it up in the external database was easy.
If your project is going to do a lot of relational queries or batch processes, then I would probably recommend instead storing the ASPNET UserId GUID as a foreign key in your external user table or, if emails are going to be unique, using those.
What would be an ideal structure for users > permissions of objects?
I've seen many related posts for general permissions, or what sections a user can access, which consists of a users, userGroups and userGroupRelations or something of that nature.
In my system there are many different objects that can get created, and each one has to be able to be turned on or off. For instance, take a password manager that has groups and sub groups.
Group 1
Group 2
Group 3
Group 4
Group 5
Group 6
Group 7
Group 8
Group 9
Group 10
Each group can contain a set of passwords. A user can be given read, write, edit and delete permissions to any group. More groups can get created at any point in time.
If someone has permission to a group, I should be able to make him have permissions to all sub groups OR restrict it to just that group.
My current thought is to have a users table, and then a permissions table with columns like:
permission_id (int) PRIMARY_KEY
user_id (int) INDEX
object_id (int) INDEX
type (varchar) INDEX
admin (bool)
read (bool)
write (bool)
edit (bool)
delete (bool)
This has worked in the past, but the new system I'm building needs to be able to scale rapidly, and I am unsure if this is the best structure. It also makes the idea of having someone with all subgroup permissions of a group more difficult.
There will be a separate table for roles of users/admins, which means they can change the permissions on users below groups they can control.
So, as a question, should I use the above structure? Or can someone point me in the direction of a better one?
EDIT
Alternative is to create a permission table for every type of object.
I suggest you add a "last_update" timestamp and a "last_updated_by_user" column so you have some hope of tracking changes to this table in your running system.
You could consider adding a permission -- grant. A user having the grant permission for an object would be able to grant access to other users to the object in question.
Be careful with "needs to scale rapidly." It's hard to guess without real-world production experience what a scaled-up system really needs.
Also, be careful not to over-complicate a permissions system, because an overly complex system will be hard to verify and therefore easier to crack. A simple system will be much easier to refactor for scaleup than a more complex one.
Your schema seems to relate users to objects. Do you want your primary key and your unique index to be (user_id, object_id)? That is, do you want each user to have either zero or one permission entry for each object? If so, use the primary key to enforce that, rather than using the surrogate permission_id key you propose.
For your objects that exist in hierarchies, you should make one of two choices systemwide:
a grant to an object with subobjects implicitly grants access to only the object, or...
it also grants access to all subobjects.
The second choice reduces the burden of explicit permission granting when new subobjects are created. The first choice is more secure.
The second choice makes it harder to determine whether a user has access to a particular object, because you have to walk the object hierarchy toward the root of the tree looking for access grants on parent objects when verifying whether a user has access. That performance issue should dominate your decision making. Will your users create a few objects and access them often? Or will they create many objects and subobjects and access them rarely? If access is more frequent than creation, you want the first choice. Take the permission-granting overhead hit at object creation time, rather than a permission-searching hit at object access time.
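The trade-off between the two choices can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Node class and all function names are hypothetical, and copying the parent's grants when a subobject is created is one possible reading of "take the permission-granting overhead hit at object creation time".

```python
class Node:
    """An object in the hierarchy with its explicit grants."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.grants = set()  # user ids with an explicit grant on this node

# First choice: a grant covers only the object itself.
# The cost is paid at creation time, by granting explicitly on the new subobject.
def create_child_with_copied_grants(name, parent):
    child = Node(name, parent)
    child.grants = set(parent.grants)  # one write per creation
    return child

def has_access_explicit(user_id, node):
    # Access check is a single lookup on the node itself.
    return user_id in node.grants

# Second choice: a grant also covers all subobjects.
# Creation is free, but every access check may walk up to the root.
def has_access_inherited(user_id, node):
    while node is not None:
        if user_id in node.grants:
            return True
        node = node.parent
    return False


root = Node("Group 1")
root.grants.add(1)
plain_child = Node("Group 2", root)
copied_child = create_child_with_copied_grants("Group 3", root)
print(has_access_explicit(1, plain_child))   # no explicit grant was made on this node
print(has_access_inherited(1, plain_child))  # found by walking up to the root
print(has_access_explicit(1, copied_child))  # grant was copied at creation
```

Note that with copied grants, revoking access on a parent does not automatically revoke it on existing subobjects; that bookkeeping is part of the write-side cost.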
I think the first choice is probably superior. I suggest this table layout:
user_id (int)
object_id (int)
type (varchar) (not sure what you have this column for)
admin (bool)
read (bool)
write (bool)
edit (bool)
grant (bool)
delete (bool)
last_update (timestamp)
last_updated_by_user_id (int)
primary key = user_id, object_id.
You could also use this table layout, and have a row in the table for each distinct permission granted to each user for each object. This one scales up more easily if you add more types of permissions.
user_id (int)
object_id (int)
permission_enum (admin/read/write/edit/grant/delete)
type (varchar) (not sure what you have this column for)
last_update (timestamp)
last_updated_by_user_id (int)
primary key = user_id, object_id, permission_enum
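As a rough illustration of this second layout, here is a minimal sketch using Python's built-in sqlite3 module. The table name permission and the helper functions grant, revoke and has_permission are assumptions; the type column is left out because its purpose is unclear, and the upsert syntax assumes SQLite 3.24 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE permission (
        user_id                 INTEGER NOT NULL,
        object_id               INTEGER NOT NULL,
        permission_enum         TEXT NOT NULL CHECK (permission_enum IN
                                    ('admin', 'read', 'write', 'edit', 'grant', 'delete')),
        last_update             TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        last_updated_by_user_id INTEGER NOT NULL,
        -- The composite key allows at most one row per user, object and permission.
        PRIMARY KEY (user_id, object_id, permission_enum)
    )
""")

def grant(user_id, object_id, permission, granted_by):
    """Insert one permission row; granting the same permission twice is a no-op."""
    conn.execute(
        """INSERT INTO permission
               (user_id, object_id, permission_enum, last_updated_by_user_id)
           VALUES (?, ?, ?, ?)
           ON CONFLICT (user_id, object_id, permission_enum) DO NOTHING""",
        (user_id, object_id, permission, granted_by),
    )

def revoke(user_id, object_id, permission):
    """Delete the row for this permission, if present."""
    conn.execute(
        "DELETE FROM permission WHERE user_id = ? AND object_id = ? AND permission_enum = ?",
        (user_id, object_id, permission),
    )

def has_permission(user_id, object_id, permission):
    """A permission is held exactly when its row exists."""
    row = conn.execute(
        "SELECT 1 FROM permission WHERE user_id = ? AND object_id = ? AND permission_enum = ?",
        (user_id, object_id, permission),
    ).fetchone()
    return row is not None
```

Adding a new kind of permission then only means extending the list of allowed values, not adding a column.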
Is there any tool, which will take a set of CRUD queries, and generate a 'good enough'
table schema for that set:
e.g. I can provide input like this:
insert username, password
insert username, realname
select password where username=?
update password where username=?
update realname where username=?
With this input, the tool should be able to make either 1, 2 or 3 tables, and take care of _id's
and indexing.
To put it another way, I'm looking for a tool with which I can design a set of queries assuming a single table with infinitely many columns; the tool then processes them and generates the actual databases/tables/columns, plus a high-level language module with a function call for each query.
oh yes , i'm trying to fire my db designer (-:
Have you considered using an ORM solution like Hibernate? This requires an initial set of mappings between the application class model (for example the User class) and the database schema representation (e.g. the USER table).
An ORM solution may support advanced mapping scenarios where an object maps to more than one table in the schema. Also, newer versions of Hibernate support generating the database schema from the mappings (search for the hbm2ddl tool).
You're asking for the impossible.
How would the tool know that username should have an index on it, much less a unique index?
How would it know the data types of the columns?
How would it know any domain constraints — for example, a hypothetical sex column must be either male or female, not crimson?
Wouldn't it be pretty vulnerable to typos, leaving you with a username and a user_name column?
Databases require design for a reason (well, many reasons). Questions of normalization, for example, are going to be very difficult for a tool, which can't understand your problem domain, to answer.
That said, what you're asking for is, as Aleris answered, an ORM, though it isn't automatic. You didn't specify which language you are using, but surely there is one (or more) for yours.