CakePHP: model-based permissions? - authentication

Struggling with a decision on how best to handle Client-level authentication with the following model hierarchy:
Client -> Store -> Product (Staff, EquipmentItem, etc.)
...where Client hasMany Stores, Store hasMany Products(hasMany Staff, hasMany EquipmentItem, etc.)
I've set up a HABTM relationship between User and Client, which is straightforward and accessible through the Auth session or a static method on the User model if necessary (see afterFind description below).
Right now, I'm waffling between evaluating the results in each model's afterFind callback, checking for relationship to Client based on the model I'm querying against the Clients that the current User is a member of. i.e. if the current model is Client, check the id; if the current model is a Store, check Store.clientid, and finally if Product, get parent Store from Item.storeid and check Store.clientid accordingly.
However, to keep in line with proper MVC, I return true or false from the afterFind, and then have to check the return from the calling action -- this is ok, but I have no way I can think of to determine if the Model->find (or Model->read, etc.) is returning false because of invalid id in the find or because of Client permissions in the afterFind; it also means I'd have to modify every action as well.
The other method I've been playing with is to evaluate the request in app_controller.beforeFilter and by breaking down the request into controller/action/id, I can then query the appropriate model(s) and eval the fields against the Auth.User.clients array to determine whether User has access to the requested Client. This seems ok, but doesn't leave me any way (afaik) to handle /controller/index -- it seems logical that the index results would reflect Client membership.
Flaws in both include a lengthy list of conditional "rules" I need to break down to determine where the current model/action/id is in the context of the client. All in all, both feel a little brittle and convoluted to me.
Is there a 3rd option I'm not looking at?

This sounds like a job for Cake ACL. It is a bit of a learning curve, but once you figure it out, this method is very powerful and flexible.
Cake's ACLs (Access Control Lists) allow you to match users to controllers down to the CRUD (Create Read Update Delete) level. Why use it?
1) The code is already there for you to use. The AuthComponent already has it built in.
2) It is powerful and integrated to allow you to control permissions every action in your site.
3) You will be able to find help from other cake developers who have already used it.
4) Once you get it setup the first time, it will be much easier and faster to implement full site permissions on any other application.
Here are a few links:
http://bakery.cakephp.org/articles/view/how-to-use-acl-in-1-2-x
http://book.cakephp.org/view/171/Access-Control-Lists
http://blog.jails.fr/cakephp/index.php?post/2007/08/15/AuthComponent-and-ACL
Or you could just google for CakePHP ACL

Related

Optimizing GraphQL resolvers for SQL databases and in service-oriented architectures

My company has a service-oriented architecture. My app's GraphQL server therefore has to call out to other services to fullfill the data requests from the frontend.
Let's imagine my GraphQL schema defines the type User. The data for this type comes from two sources:
A user account service that exposes a REST endpoint for fetching a user's username, age, and friends.
A SQL database used just by my app to store User-related data that is only relevant to my app: favoriteFood, favoriteSport.
Let's assume that the user account service's endpoint automatically returns the username and age, but you have to pass the query parameter friends=true in order to retrieve the friends data because that is an expensive operation.
Given that background, the following query presents a couple optimization challenges in the getUser resolver:
query GetUser {
getUser {
username
favoriteFood
}
}
Challenge #1
When the getUser resolver makes the request to the user account service, how does it know whether or not it needs to ask for the friends data as well?
Challenge #2
When the resolver queries my app's database for additional user data, how does it know which fields to retrieve from the database?
The only solution I can find to both challenges is to inspect the query in the resolver via the fourth info argument that the resolver receives. This will allow it to find out whether friends should be requested in the REST call to the user account service, and it will be able to build the correct SELECT query to retrieve the needed data from my app's database.
Is this the correct approach? It seems like a use-case that GraphQL implementations must be running into all the time and therefore I'd expect to encounter a widely accepted solution. However, I haven't found many articles that address this, nor does a widely used NPM module appear to exist (graphql-parse-resolve-info is part of PostGraphile but only has ~12k weekly downloads, while graphql-fields has ~18.5k weekly downloads).
I'm therefore concerned that I'm missing something fundamental about how this should be done. Am I? Or is inspecting the info argument the correct way to solve these optimization challenges? In case it matters, I am using Apollo Server.
If you want to modify your resolver based on the requested selection set, there's really only one way to do that and that's to parse the AST of the requested query. In my experience, graphql-parse-resolve-info is the most complete solution for making that parsing less painful.
I imagine this isn't as common of an issue as you'd think because I imagine most folks fall into one of two groups:
Users of frameworks or libraries like Postgraphile, Hasaura, Prisma, Join Monster, etc. which take care of optimizations like these for you (at least on the database side).
Users who are not concerned about overfetching on the server-side and just request all columns regardless of the selection set.
In the latter case, fields that represent associations are given their own resolvers, so those subsequent calls to the database won't be fired unless they are actually requested. Data Loader is then used to help batch all these extra calls to the database. Ditto for fields that end up calling some other data source, like a REST API.
In this particular case, Data Loader would not be much help to you. The best approach is to have a single resolver for getUser that fetches the user details from the database and the REST endpoint. You can then, as you're already planning, adjust those calls (or skip them altogether) based on the requested fields. This can be cumbersome, but will work as expected.
The alternative to this approach is to simple fetch everything, but use caching to reduce the number of calls to your database and REST API. This way, you'll fetch the complete user each time, but you'll do so from memory unless the cache is invalidated or expires. This is more memory-intensive, and cache invalidation is always tricky, but it does simply your resolver logic significantly.

Should an API assign and return a reference number for newly created resources?

I am building a RESTful API where users may create resources on my server using post requests, and later reference them via get requests, etc. One thing I've had trouble deciding on is what IDs the clients should have. I know that there are many ways to do what I'm trying to accomplish, but I'd like to go with a design which follows industry conventions and best design practices.
Should my API decide on the ID for each newly created resource (it would most likely be the primary key for the resource assigned by the database)? Or should I allow users to assign their own reference numbers to their resources?
If I do assign a reference number to each new resource, how should this be returned to the client? The API has some endpoints which allow for bulk item creation, so I would need to list out all of the newly created resources on every response?
I'm conflicted because allowing the user to specify their own IDs is obviously a can of worms - I'd need to verify each ID hasn't been taken, makes database queries a lot weirder as I'd be joining on reference# and userID rather than foreign key. On the other hand, if I assign IDs to each resource it requires clients to have to build some type of response parser and forces them to follow my imposed conventions.
Why not do both? Let the user create there reference and you create your own uid. If the users have to login then you can use there reference and userid unique key. I would also give the uid created back if not needed the client could ignore it.
It wasn't practical (for me) to develop both of the above methods into my application, so I took a leap of faith and allowed the user to choose their own IDs. I quickly found that this complicated development so much that it would have added weeks to my development time, and resulted in much more complex and slow DB queries. So, early on in the project I went back and made it so that I just assign IDs for all created resources.
Life is simple now.
Other popular APIs that I looked at, such as the Instagram API, also assign IDs to certain created resources, which is especially important if you have millions of users who can interact with each-other's resources.

How to associate calculated values to an object

I have some basic objects like Customer, Portfolio and ... with some association to other objects. I can easily display the required information in the web page by reading object values. The problem is that what do I do when the value associated with the object is calculated and returned by a method, a value that makes sense only in certain context and cannot be attached to the object as an instance variable? In this case if I have a list of say Users I have to pass the username of each user to the method to get the calculated value. This causes problem to keep the association while displaying the values in the page.
An example to make this clear:
An application provides the functionality for users to keep track of each others activities by letting them add whoever they want to a list. If this user performs a search on users there's the option to follow each returned user. I want to make sure this option is disabled for those user's that are already being followed. This functionality is provided by a method like isFollowed(String follower, String followee) which returnes a boolean. How can I associate this boolean values to each user in search result?
Solutions:
One thing I can think of is to add a followed instance variable to User class. But I don't think it's a good approach because this variable only makes sense in a certain context. It's not a part of User class in the domain.
The other way I can think of is to use Decoration or Wrappers in a way to extend the User class and add the attribute in the child class. But again what if I have several objects that need to be in the same context. In that case I have to extend all of them with the same boolean attribute in all classes.
I hope I could make it clear.
In principle, I don't see anything wrong with instance method on User: bool IsFollowedBy(User user).
Of course, this could lead to performance issues. If that is the case, you can create separate object for presentation purposes which bundles data from User and whether he is being followed by the user performing search. Then you can build query which retrieves all necessary data for such object in a single roundtrip to DB.
One solution is to avoid querying Entities (as in DDD/ORM) and query directly using subquery/join or even using some denormalized database. This is something CQRS pattern suggests.
Other solution is to do computations on application layer (how many Users can you show on the same page anyway), which is expensive but you can implement some caching techniques to make things easier.

The REST-way to check/uncheck like/unlike favorite/unfavorite a resource

Currently I am developing an API and within that API I want the signed in users to be able to like/unlike or favorite/unfavorite two resources.
My "Like" model (it's a Ruby on Rails 3 application) is polymorphic and belongs to two different resources:
/api/v1/resource-a/:id/likes
and
/api/v1/resource-a/:resource_a_id/resource-b/:id/likes
The thing is: I am in doubt what way to choose to make my resources as RESTful as possible. I already tried the next two ways to implement like/unlike structure in my URL's:
Case A: (like/unlike being the member of the "resource")
PUT /api/v1/resource/:id/like maps to Api::V1::ResourceController#like
PUT /api/v1/resource/:id/unlike maps to Api::V1::ResourceController#unlike
and case B: ("likes" is a resource on it's own)
POST /api/v1/resource/:id/likes maps to Api::V1::LikesController#create
DELETE /api/v1/resource/:id/likes maps to Api::V1::LikesController#destroy
In both cases I already have a user session, so I don't have to mention the id of the corresponding "like"-record when deleting/"unliking".
I would like to know how you guys have implemented such cases!
Update April 15th, 2011: With "session" I mean HTTP Basic Authentication header being sent with each request and providing encrypted username:password combination.
I think the fact that you're maintaining application state on the server (user session that contains the user id) is one of the problems here. It's making this a lot more difficult than it needs to be and it's breaking a REST's statelessness constraint.
In Case A, you've given URIs to operations, which again is not RESTful. URIs identify resources and state transitions should be performed using a uniform interface that is common to all resources. I think Case B is a lot better in this respect.
So, with these two things in mind, I'd propose something like:
PUT /api/v1/resource/:id/likes/:userid
DELETE /api/v1/resource/:id/likes/:userid
We also have the added benefit that a user can only register one 'Like' (they can repeat that 'Like' as many times as they like, and since the PUT is idempotent it has the same result no matter how many times it's performed). DELETE is also idempotent, so if an 'Unlike' operation is repeated many times for some reason then the system remains in a consistent state. Of course you can implement POST in this way, but if we use PUT and DELETE we can see that the rules associated with these verbs seem to fit our use-case really well.
I can also imagine another useful request:
GET /api/v1/resource/:id/likes/:userid
That would return details of a 'Like', such as the date it was made or the ordinal (i.e. 'This was the 50th like!').
case B is better, and here have a good sample from GitHub API.
Star a repo
PUT /user/starred/:owner/:repo
Unstar a repo
DELETE /user/starred/:owner/:repo
You are in effect defining a "like" resource, a fact that a user resource likes some other resource in your system. So in REST, you'll need to pick a resource name scheme that uniquely identifies this fact. I'd suggest (using songs as the example):
/like/user/{user-id}/song/{song-id}
Then PUT establishes a liking, and DELETE removes it. GET of course finds out if someone likes a particular song. And you could define GET /like/user/{user-id} to see a list of the songs a particular user likes, and GET /like/song/{song-id} to see a list of the users who like a particular song.
If you assume the user name is established by the existing session, as #joelittlejohn points out, and is not part of the like resource name, then you're violating REST's statelessness constraint and you lose some very important advantages. For instance, a user can only get their own likes, not their friends' likes. Also, it breaks HTTP caching, because one user's likes are indistinguishable from another's.

OOP design - validation when validation means hitting the database to check

Let's pretend we're talking about an HTML complaint form, one field of which is a product list from the company catalog.
I gather that validation usually (always?) goes in its own class.
I also gather that it's good practice to have gateway classes which can handle all the database queries internally, so when I save my complaint from my complaint form I don't have to worry about the database details.
But what about validation that requires accessing the database - for example checking that the product is actually a product we have (and not someone tampering with the form). This can only be done by finding a match in the database... but my database is abstracted behind my gateway.
Do I add validation logic to my gateway? Do I create validator gateway classes? Do I empower my validator with database logic?
EDIT - attempt to clarify...
customer clicks link to complaint form
HTML complaint form is built with a <select><item></item></select> dropdown with X products from our fictional company catalog
Customer completes form and Submits // Wiseguy alters HTML so product is "Schweddy Balls" and submits
Form class validates simple things like date, all required fields have data, email address, etc.
at step 4, in order to validate that the product being complained about is legit, you'd have to hit the database's product table to see if it's still a valid datapoint. Should that logic go in the form class, the gateway class, or somewhere else? putting it in the form class breeds dependencies, does it not?
Validation should, oddly enough, not be done directly through a Validation class. A model object's class (in this case probably Complaint) should include the Validation class and the validations should be done in there. Since the Complaint has access to all Complaints, validation methods can use the methods of the Complaint class, or call another model class if needed.
When you say "gateway" to the database, I believe you're talking about on Object Relational Mapping (ORM), which allows model objects, like a Complaint, to abstractly talk to the database. The ORM should have no knowledge of the structure or specifics of the application, it should only be an abstract API for objects to communicate to the database or other backend.
Surely my response does not cover everything, so further questions/clarifications welcome.
It's usually good practice to store valid data, only. So, one necessarily writes a validator on the server-side to ensure that requests are valid and only valid data gets put into the database, and as an optimization one typically validates on the client side (e.g. in JavaScript) so that the user is informed of invalid input without needing to submit the form and without needing to make a roundtrip. Doing the validation as part of the database write, allows subsequent reads to be done trivially with no checks.
I don't know the specifics of what counts as "valid" for you, but you can probably save a lot of database lookups with caching. Perhaps you could give some more information about your requirements?