Typical normalization security issue in web applications - sql

I am currently having a problem that I guess a lot of people have run into before, and I would like to know how you handled it.
So, imagine you have 10,000 users on your app (each one has their own user/password login to administer their stuff).
Imagine further that you have a growing normalized SQL table structure in the backend, with tables like Users, Orders, OrderPositions, Invoices, etc.
So, to show/edit/delete rows of a table that isn't the user table itself, you'll probably have links like these to let your users interact with the application:
~/Orders/EditOrder?id=12
~/Orders/ShowOrderPosition?orderId=12&posId=443
OK, and now the problem:
How do I prevent, in a non-complex way, user A from accessing (show/edit/delete) the data of user B?
Example:
User B calls:
~/Orders/ShowOrderPosition?orderId=12&posId=443
which is an order of user A, so user B should have no access to it.
So, in my code I would need to have a user-identity check before or within every single SQL statement, like:
select op.* from OrderPosition op, Order o
where op.Id = :orderpositionId
and op.Fk_OrderId = o.Id
and o.Id = :orderId
and o.Fk_User = :userId
Only this way can I make sure that the data belongs to the requesting user.
Reaching the user table will of course get far more complex the deeper the connection to it is "buried" in the normalization (imagine tables like payments or invoices, connected to the order table...).
Question:
What is your approach to deal with this, considering low complexity, DRY and performance?
(Hope you understand what I mean ;) )

This is a bit like a multi-tenant application - I have gone down this route and denormalized an ID onto all those tables that require this kind of check (a tenant ID, in your case, sounds like the user id).
I then created an interface that contains this field only and applied it to all those classes in my model layer that required this access.
In my base data access (repository) class, where all the select/update/delete calls go through, I then check whether the class is of the type of that interface, and if so I check that the current user matches that ID.
Of course, this depends on how your code is structured, and how simple/complex making this global kind of change will be...
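A minimal sketch of that idea in Python, with an in-memory dictionary standing in for the database. The names OwnedByUser, Order, Country, Repository and AccessDenied are hypothetical and not taken from the answer; a runtime-checkable Protocol plays the role of the one-field interface.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


class AccessDenied(Exception):
    """Raised when a record belongs to a different user."""


@runtime_checkable
class OwnedByUser(Protocol):
    """The one-field 'interface': anything that carries an owner_id."""
    owner_id: int


@dataclass
class Order:
    id: int
    owner_id: int  # denormalized user (tenant) id
    total: float


@dataclass
class Country:
    id: int
    name: str  # shared reference data, deliberately has no owner_id


class Repository:
    """Base data-access class; every read goes through get()."""

    def __init__(self, rows, current_user_id):
        self._rows = {row.id: row for row in rows}
        self._current_user_id = current_user_id

    def get(self, record_id):
        row = self._rows.get(record_id)
        if row is None:
            return None
        # The ownership check applies only to rows implementing the interface.
        if isinstance(row, OwnedByUser) and row.owner_id != self._current_user_id:
            raise AccessDenied(f"record {record_id} is not owned by the caller")
        return row
```

Because the check sits in the one place all reads pass through, individual queries do not have to repeat it. Update and delete paths would need the same guard.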

Never expose ids.
And if you have to: encrypt them.

Performance
For ultimate performance you will have to denormalize to the point where reading the row and comparing it with an application-level variable tells you what rights the user has. This is fairly fast, and if your DAO/BAO level is well organized, plugging it in will keep things relatively DRY and at relatively low complexity. NOTE: complexity is also a function of your security model; once you start implementing inheritable, positive and negative, role-based access rules, it cannot really be simple.
DRY
Another route (very seldom taken these days) is to use your database roles to manage security. This might get complicated, but it offers very strong security, because access is enforced at the DB level and not at the application level. Complexity at the application code level should go down if you manage to encapsulate all of your access paths in VIEWS, which might require quite a bit of re-tailoring at the database level. However(!), it might be possible to implement the security model with very few changes to the application code, by renaming the existing tables and replacing them with secured views.
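The rename-and-replace-with-a-view part can be sketched with Python's sqlite3 module. This is only an illustration under loud assumptions: SQLite has no roles or grants, so the base table stays reachable here, whereas a server database would additionally revoke direct access to it. The names orders_base, app_session and install_secured_view are hypothetical.

```python
import sqlite3


def install_secured_view(db, user_id):
    """Expose only the given user's rows under the old table name 'orders'."""
    # Per-connection table holding the identity of the logged-in user.
    db.execute("CREATE TEMP TABLE app_session (user_id INTEGER NOT NULL)")
    db.execute("INSERT INTO app_session (user_id) VALUES (?)", (user_id,))
    # The view takes over the name the application already queries.
    db.execute(
        """
        CREATE TEMP VIEW orders AS
        SELECT id, fk_user, total
        FROM orders_base
        WHERE fk_user = (SELECT user_id FROM app_session)
        """
    )


db = sqlite3.connect(":memory:")
# The original 'orders' table, renamed to 'orders_base'.
db.execute(
    "CREATE TABLE orders_base (id INTEGER PRIMARY KEY, fk_user INTEGER, total REAL)"
)
db.executemany(
    "INSERT INTO orders_base VALUES (?, ?, ?)",
    [(12, 1, 9.5), (13, 2, 20.0), (14, 2, 5.0)],
)
install_secured_view(db, user_id=2)

# Application code keeps selecting from 'orders' and sees only user 2's rows.
visible_ids = [row[0] for row in db.execute("SELECT id FROM orders ORDER BY id")]
```

The application query itself carries no user filter; the filter lives entirely in the view definition.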

Don't use your internal ID column, encrypted or not, it'll come back to bite you one day.
Create a random, unique, string (GUID, whatever), which contains the link between the user and the data he's requesting. So, instead of having, for user 34567:
Edit order
Create a record {"5dsfwe8frf823jrf",34567,12} in a temporary table and show:
Edit order
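When the user clicks the link, fetch 34567,12 from your temporary table.
The string 5dsfwe8frf823jrf is practically impossible to guess as long as it comes from a cryptographically secure random source, which removes the risk of someone guessing neighbouring IDs.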
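A rough sketch of the token lookup, assuming a dictionary in place of the temporary table and Python's secrets module as the random source. The function names are hypothetical. Note that the lookup still compares the stored user id with the logged-in user, so a leaked link alone does not grant access.

```python
import secrets

# Stand-in for the temporary table: token -> (user_id, order_id).
_link_table = {}


def create_link_token(user_id, order_id):
    """Store the (user, order) pair under a fresh random token."""
    token = secrets.token_urlsafe(16)  # 16 random bytes, URL-safe text
    _link_table[token] = (user_id, order_id)
    return token


def resolve_link_token(token, current_user_id):
    """Return the order id for this token, or None if unknown or not the caller's."""
    entry = _link_table.get(token)
    if entry is None:
        return None
    user_id, order_id = entry
    if user_id != current_user_id:
        return None
    return order_id
```

For example, `create_link_token(34567, 12)` would produce the value placed in the "Edit order" link, and `resolve_link_token` would run when the link is clicked.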

Related

Is it secure to display/embed user_id that represents each user from the db in the html of a page?

If I display the user_id that represents each unique user in the db as an attribute in an HTML element, is that good practice? I need the reference to the user if I want to perform an action on that particular user, such as adding him as my friend.
Example in HTML:
<div data-user-id='12' onclick=addFriend(12)>
Click to add John as your friend
</div>
Where 12 is John's actual user id in the db. From a security perspective, is it secure to do this?
Displaying the user id is not a problem in itself; it is arguably safer than showing the username, which could be used for logins. A better solution, though, is to display an id that the user can set or change himself; look at Facebook's design for reference.
In that case, you let the user set his public id, use this public id to identify the user externally, and map it to the actual user id internally in the back end.
Anyway, none of this can be settled for an abstract case. To decide how secure it is, you need to consider the other security design elements of your application. The main question is always: what can a malicious user do by knowing the actual user id?
As usual....it's complicated.
It all depends on how attractive your site is to hackers (and therefore how much effort they're going to invest), and how secure the rest of your solution is.
The first step in an organized attack is to find out as much as possible about your website. Your current solution leaks information. Knowing that users are identified by an integer may be useful (some database engines are more likely to use integers rather than GUIDs). It may help attackers guess other keys. By guessing sequential user IDs, they can find out how many users you have.
Once the attacker has found out all this information, they will use it to try and penetrate your application. The more information they have, the easier it is to create a plan. An individual piece of information may not be useful, but when you put it together with other snippets, it may reveal something useful.
So, no, there's no obvious major risk in this design by itself. It may be part of a wider attack, though, and it could be the bit of information that exposes some other flaw.

Should an API assign and return a reference number for newly created resources?

I am building a RESTful API where users may create resources on my server using post requests, and later reference them via get requests, etc. One thing I've had trouble deciding on is what IDs the clients should have. I know that there are many ways to do what I'm trying to accomplish, but I'd like to go with a design which follows industry conventions and best design practices.
Should my API decide on the ID for each newly created resource (it would most likely be the primary key for the resource assigned by the database)? Or should I allow users to assign their own reference numbers to their resources?
If I do assign a reference number to each new resource, how should this be returned to the client? The API has some endpoints which allow for bulk item creation, so I would need to list out all of the newly created resources on every response?
I'm conflicted because allowing the user to specify their own IDs is obviously a can of worms - I'd need to verify each ID hasn't been taken, makes database queries a lot weirder as I'd be joining on reference# and userID rather than foreign key. On the other hand, if I assign IDs to each resource it requires clients to have to build some type of response parser and forces them to follow my imposed conventions.
Why not do both? Let the users create their own reference and create your own uid as well. If the users have to log in, you can use their reference together with the userid as a unique key. I would also return the generated uid; if it is not needed, the client can ignore it.
It wasn't practical (for me) to develop both of the above methods into my application, so I took a leap of faith and allowed the user to choose their own IDs. I quickly found that this complicated development so much that it would have added weeks to my development time, and resulted in much more complex and slow DB queries. So, early on in the project I went back and made it so that I just assign IDs for all created resources.
Life is simple now.
Other popular APIs that I looked at, such as the Instagram API, also assign IDs to certain created resources, which is especially important if you have millions of users who can interact with each other's resources.

should database verify if user is authorized to perform action

Should database be verifying if user is authorized to perform certain action?
Two examples:
1) A user is enrolled in at most 30 teams and can see the scoresheets of these teams only. I'm passing userid and teamid to the stored procedure and fetching the scoresheet only if the user is authorized to view it. Is it more appropriate to pass in only teamid and check beforehand which teams the user is enrolled in? Should I do both?
2) Currently I'm passing in the userid of the poster and the commentid of the comment to be deleted, and I'm deleting the comment only if both criteria are met (userid matches the poster id and commentid matches the comment's id), just to make sure the user is deleting his own comment and not somebody else's. Is that overkill?
Multiple layers of validation are best practice, and it doesn't seem like your methods would cause additional overhead. Just make sure you connect to the database only once; I've found that the most costly parts of running database queries are the connection and cursors.
http://msdn.microsoft.com/en-us/library/aa174437%28v=sql.80%29.aspx
Security experts will tell you that no amount of security is enough! But at the same time you have to find a balance between security and unnecessary layers of protection that are bound to affect your application's performance.
Answering your 2nd question first: it is a good idea to pass both the userid and the commentid and to match both, so that you don't accidentally delete all comments by a particular user.
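As a small illustration of matching both values in a single statement, here is a sketch using Python's sqlite3 module. The comments table and its columns are assumptions, since the question does not show its schema or stored procedure.

```python
import sqlite3


def delete_own_comment(db, user_id, comment_id):
    """Delete the comment only if it belongs to user_id; return True on success."""
    cursor = db.execute(
        "DELETE FROM comments WHERE id = ? AND user_id = ?",
        (comment_id, user_id),
    )
    # rowcount is 0 when the comment does not exist or has another owner.
    return cursor.rowcount == 1


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE comments (id INTEGER PRIMARY KEY, user_id INTEGER, body TEXT)")
db.executemany(
    "INSERT INTO comments VALUES (?, ?, ?)",
    [(1, 10, "first"), (2, 20, "second")],
)
```

The ownership test costs nothing extra here because it is part of the same WHERE clause that locates the row.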
Coming to your 1st question now: as I understand it, you want only users who are part of a team to be able to view that team's scoresheet, right? To do so, passing only the teamid of the teams the user is a part of will do. I am not sure what you mean by authorization here!
NOTE:
I have answered your question from a theoretical point of view, with no idea about your table structure or what's written in your stored procedures.
Your frontend is a much friendlier environment (libraries, frameworks, best practices) in which to implement whatever access restrictions or authorization you could possibly have in mind. Adding another layer inside the database just adds a lot of complexity and a duplicate implementation of your access restrictions.
I would only consider doing it if clients connect and execute commands directly against the database.
So, rely on the ids provided by the application and spend your energy on sanitizing user input and implementing a sane authentication model. You will need it.

The REST-way to check/uncheck like/unlike favorite/unfavorite a resource

Currently I am developing an API and within that API I want the signed in users to be able to like/unlike or favorite/unfavorite two resources.
My "Like" model (it's a Ruby on Rails 3 application) is polymorphic and belongs to two different resources:
/api/v1/resource-a/:id/likes
and
/api/v1/resource-a/:resource_a_id/resource-b/:id/likes
The thing is: I am in doubt about which way to choose to make my resources as RESTful as possible. I have already tried the following two ways of implementing the like/unlike structure in my URLs:
Case A: (like/unlike being the member of the "resource")
PUT /api/v1/resource/:id/like maps to Api::V1::ResourceController#like
PUT /api/v1/resource/:id/unlike maps to Api::V1::ResourceController#unlike
and case B: ("likes" is a resource on its own)
POST /api/v1/resource/:id/likes maps to Api::V1::LikesController#create
DELETE /api/v1/resource/:id/likes maps to Api::V1::LikesController#destroy
In both cases I already have a user session, so I don't have to mention the id of the corresponding "like"-record when deleting/"unliking".
I would like to know how you guys have implemented such cases!
Update April 15th, 2011: By "session" I mean an HTTP Basic Authentication header being sent with each request and providing the Base64-encoded username:password combination.
I think the fact that you're maintaining application state on the server (a user session that contains the user id) is one of the problems here. It's making this a lot more difficult than it needs to be, and it's breaking REST's statelessness constraint.
In Case A, you've given URIs to operations, which again is not RESTful. URIs identify resources and state transitions should be performed using a uniform interface that is common to all resources. I think Case B is a lot better in this respect.
So, with these two things in mind, I'd propose something like:
PUT /api/v1/resource/:id/likes/:userid
DELETE /api/v1/resource/:id/likes/:userid
We also have the added benefit that a user can only register one 'Like' (they can repeat that 'Like' as many times as they like, and since the PUT is idempotent it has the same result no matter how many times it's performed). DELETE is also idempotent, so if an 'Unlike' operation is repeated many times for some reason then the system remains in a consistent state. Of course you can implement POST in this way, but if we use PUT and DELETE we can see that the rules associated with these verbs seem to fit our use-case really well.
I can also imagine another useful request:
GET /api/v1/resource/:id/likes/:userid
That would return details of a 'Like', such as the date it was made or the ordinal (i.e. 'This was the 50th like!').
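The idempotency argument can be made concrete with a small in-memory sketch. The handler names and the dictionary store are hypothetical; the point is only that repeating either operation leaves the state unchanged.

```python
from datetime import datetime, timezone

# (resource_id, user_id) -> time the like was first made
_likes = {}


def put_like(resource_id, user_id):
    """PUT /api/v1/resource/:id/likes/:userid: create the like if it is missing."""
    key = (resource_id, user_id)
    # setdefault keeps the original timestamp when the PUT is repeated.
    return _likes.setdefault(key, datetime.now(timezone.utc))


def delete_like(resource_id, user_id):
    """DELETE /api/v1/resource/:id/likes/:userid: remove the like if present."""
    _likes.pop((resource_id, user_id), None)


def get_like(resource_id, user_id):
    """GET /api/v1/resource/:id/likes/:userid: details of the like, or None."""
    made_at = _likes.get((resource_id, user_id))
    return None if made_at is None else {"made_at": made_at}


def count_likes(resource_id):
    return sum(1 for rid, _ in _likes if rid == resource_id)
```

Calling `put_like(7, 1)` twice still yields a single like for resource 7, and calling `delete_like(7, 1)` twice is harmless.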
Case B is better, and the GitHub API provides a good example of it.
Star a repo
PUT /user/starred/:owner/:repo
Unstar a repo
DELETE /user/starred/:owner/:repo
You are in effect defining a "like" resource, a fact that a user resource likes some other resource in your system. So in REST, you'll need to pick a resource name scheme that uniquely identifies this fact. I'd suggest (using songs as the example):
/like/user/{user-id}/song/{song-id}
Then PUT establishes a liking, and DELETE removes it. GET of course finds out if someone likes a particular song. And you could define GET /like/user/{user-id} to see a list of the songs a particular user likes, and GET /like/song/{song-id} to see a list of the users who like a particular song.
If you assume the user name is established by the existing session, as @joelittlejohn points out, and is not part of the like resource name, then you're violating REST's statelessness constraint and you lose some very important advantages. For instance, a user can only get their own likes, not their friends' likes. It also breaks HTTP caching, because one user's likes are indistinguishable from another's.

Am I breaking my aggregate boundaries in this model?

I'm modeling a very basic ASP.NET MVC app using NHibernate and I seem to be stuck on my design. Here's a sketch of my model:
As you can see this is VERY basic but I have some concerns about it. The User root entity and the Organization root entity are accessing the same Organization_Users entity child via two one-to-many relationships. This doesn't seem right and I think I am breaking the aggregate boundaries. This model smells to me but I like the idea because I would like to have code like this:
var user = userRepository.Load(1);
var list = user.Organizations; // All the organizations the user is a part of.
and
var org = orgRepository.Load(1);
var list = org.Users; // All the users in an organization.
Also the extra data in the table like flagged and role would be used by the Organization entity. Is this a bad design? If you have any thoughts that would be great. I'm still trying to get my mind around the thinking of DDD. Thanks
This is a typical many-to-many relationship, and the Organization_Users table is the bridge table. In fact, NHibernate and the other ORM tools have built-in support for bridge tables.
This should be resolved at the data modelling level rather than at the application level. You should analyze your data model; it is recommended to avoid many-to-many relationships unless the domain model really requires them.
First of all, you need to be sure that the many-to-many relationship in the data model is necessary for mapping the domain entities. Once you have established that, the model represented in your diagram is OK for mapping those relationships at the application level.
I have used an approach similar to your first model on several occasions. The one catch with this approach is that you need to create an OrganizationUser class in your domain to handle the Role and Flagged fields. This would leave you with something like this in your code:
var user = userRepository.Load(1);
var list = user.OrganizationUsers; // All the organizations the user is a part of including their role and flagged values.
var organization = list[0].Organization;
*If you're going to be iterating through all of a user's organizations quite often, you'd likely want to eager load the Organization entity along with OrganizationUser.
With the second design you submitted it looks like you would be able to add a user to the OrgUserDetails without adding the user to OrganizationUser. That doesn't seem like something I would want to support from my Domain.
The first things to consider in DDD are:
forget your database schema (there's no database!)
what actions will you perform on those entities from a domain perspective?
I think your model is fine. I usually think of domain aggregate roots, when I think of them at all, in terms of what is publicly exposed, not internal implementation. With relationships I think of which entity "wears the pants" in the relationship. That is, is it more natural to add a User to an Organization or add an Organization to a User? In this case both may make sense, a User joins an Organization; an Organization accepts a User for membership.
If your domain sees the relationship from the User's perspective, you can put the methods to maintain (add, remove, etc.) the relationship on the User and expose a read-only collection on the Organization.
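A minimal sketch of that arrangement, written in Python rather than the C# of the question. The method names join and leave are assumptions; the point is that only User mutates the relationship while Organization exposes a read-only view of it.

```python
class Organization:
    def __init__(self, name):
        self.name = name
        self._users = []  # maintained by User.join / User.leave only

    @property
    def users(self):
        # A tuple copy: callers can read the membership but cannot change it.
        return tuple(self._users)


class User:
    def __init__(self, name):
        self.name = name
        self._organizations = []

    @property
    def organizations(self):
        return tuple(self._organizations)

    def join(self, organization):
        """Add the relationship on both sides; joining twice has no effect."""
        if organization not in self._organizations:
            self._organizations.append(organization)
            organization._users.append(self)

    def leave(self, organization):
        """Remove the relationship on both sides if it exists."""
        if organization in self._organizations:
            self._organizations.remove(organization)
            organization._users.remove(self)
```

With this shape, `user.organizations` and `org.users` both work for reading, as in the question, while all changes go through the User side.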
In response to your second design (it would have been better if you had edited the original question): I don't like it at all. Your original design is fine. I wouldn't necessarily ignore the database while designing your classes; a good design should accurately model the domain and be straightforward to implement in a relational database. Sometimes you have to compromise in both directions to hit the sweet spot. There's no jail term for breaking aggregate boundaries. :-)
My understanding is:
A User can belong to 0-to-many Organizations.
AND
An Organization consists of 0-to-many Users.
Are both of those correct? If so, that does sound like a many-to-many to me.
In a many-to-many, you pretty much need a relationship-like object of some sort to bridge that gap. The problem is, there is no user_organization in the domain.
This makes me think you shouldn't have user_organization as a part of your domain, per se. It feels like an implementation detail.
On the other hand, maybe it makes sense in your domain to have a Roster which holds the Users in an Organization and stores their role and other information specific to that relationship.
Thanks everyone for your answers. They have been very helpful.
While I was thinking about my model a little bit more, I sketched something new that I think would be better.
My thinking was this:
When a user logs into the site, the system finds their account and then returns a list of the organizations they are a part of; it gets this info from the user_organizations object.
When a user clicks on one of the organizations they are a part of, it directs them to the organization's control panel.
The selected organization then looks up that user's role in its org_user_details to know what access the user should have to that organization's control panel.
Does that make sense? :)
I feel like that would be good in a model but I'm having some doubts about the DB implementation. I know I shouldn't even worry about it but I can't break my bad habit yet! You can see that there is kind of duplicate data in the user_organizations object and the org_user_details object. I'm not a DB pro but is that a bad DB design? Should I instead combine the data from user_organizations and org_user_details into a table like the one in my first post and just tell NHibernate that User looks at it as a Many-to-Many relationship and Organization looks at it as a One-to-Many relationship? That sounds like I'm tricking the system. Sorry if I seemed really confused about this.
What are your thoughts on this? Am I over thinking this? :P