Exposing Sequelize filtering mechanics on API

I've been looking into some Node ORMs for use with PostgreSQL lately, and would like to expose some type of flexible filtering on the front end.
I'm quite enjoying the flexibility provided by Sequelize's where/include filtering (e.g. filtering a model based on some relation N levels deep).
Is the filtering mechanism safe at all to expose to any front end API? I haven't had much experience with it, so I'm not sure what types of fields can be passed through to the filter query.
Otherwise, for more complex querying I may go with something like Knex instead.

Generally, if you have to ask whether it is safe to pass where/include parameters straight through from your frontend, I would suggest you shouldn't do it. If you don't know how it will behave, you'll quickly end up leaking all your users and password hashes to the world.
You'll be better covered by validating the incoming filtering parameters and only then passing them to the query. For example, you can use JSON Schema to validate the incoming parameters.
The objection.js ORM, for example, handles this by letting you declare in your code a pattern for the data a user may include in the response; incoming user input is then restricted to that subset (this works only when one requests additional relations to the requested row).
var houseWithPossiblePetsAndOwner = await House.query()
  .allowEager('[pets, owner]')
  .eager(eagerParamDirectlyFromEndUser)
  .where('id', id);
You could extend your preferred ORM to support that kind of extra method, which allows you to declare for the query which parameters can be passed to it from the user input.
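As a rough illustration of validating input before it reaches the query, here is a minimal sketch of an allowlist check. It is not part of Sequelize or objection.js: `sanitizeQuery`, `QueryRules` and `SafeQuery` are hypothetical names, and the sketch assumes the frontend sends a JSON object with optional `where` and `include` keys and that only equality filters on primitive values are allowed.

```typescript
// Hypothetical helper: none of these names are part of Sequelize or objection.js.
type Primitive = string | number | boolean;

interface QueryRules {
  fields: readonly string[]; // columns a client may filter on
  relations: readonly string[]; // relations a client may ask to include
}

interface SafeQuery {
  where: Record<string, Primitive>;
  include: string[];
}

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function isPrimitive(value: unknown): value is Primitive {
  return ["string", "number", "boolean"].includes(typeof value);
}

// Reject anything outside the allowlist instead of passing it to the ORM.
function sanitizeQuery(input: unknown, rules: QueryRules): SafeQuery {
  if (!isPlainObject(input)) throw new Error("query must be an object");
  const where = input.where ?? {};
  const include = input.include ?? [];
  if (!isPlainObject(where)) throw new Error("where must be an object");
  if (!Array.isArray(include)) throw new Error("include must be an array");

  const safeWhere: Record<string, Primitive> = {};
  for (const [field, value] of Object.entries(where)) {
    if (!rules.fields.includes(field)) {
      throw new Error(`filtering on "${field}" is not allowed`);
    }
    // Equality on primitives only: nested objects could smuggle in operators.
    if (!isPrimitive(value)) {
      throw new Error(`unsupported filter value for "${field}"`);
    }
    safeWhere[field] = value;
  }

  const safeInclude: string[] = [];
  for (const relation of include) {
    if (typeof relation !== "string" || !rules.relations.includes(relation)) {
      throw new Error(`including "${String(relation)}" is not allowed`);
    }
    safeInclude.push(relation);
  }
  return { where: safeWhere, include: safeInclude };
}
```

The sanitized result would then be mapped onto the ORM's own query options in your code. Richer filters (ranges, nested relations) would need their own explicit rules rather than a pass-through.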

Related

Patterns when designing REST POST endpoint when resource has a computed property

I have a resource, as an example a 'book'.
I want to create a REST POST endpoint to allow consumers to create a new book.
However, some of the properties are required and computed by API, and others are taken as they are:
Book
{
name,
color,
author # computed
}
Let's say the author is somehow calculated in API based on the book name.
I can think of these solutions, each of which has its drawbacks:
force the consumer to provide the author and just filter it out (do not take it into account as input) # bad because it is very unpredictable why the author was changed
allow the user to provide author # same problem
do not allow the user to provide an author and show an exception if the user provides it
The last solution seems to be the most obvious one. The main problem I can see is that it is inconsistent, and it can be bizarre for consumers to see the author later on a GET request.
I want my POST endpoint to be as expressive as possible. So the POST and GET data transfer objects will look almost the same.
Are there any simple, expressive, and predictable patterns to consider?
Personally I'm a big fan of using the same format for a GET request as well as a PUT.
This makes it possible for a client to do a GET request, add a property to the object they received and immediately PUT it again. If your API and clients follow this pattern, it also means the API can easily add new properties to GET responses without breaking clients.
However, while this is a nice pattern, I don't really think the same expectation exists as much for 'creation'. There are usually many things that make less sense to require as a property when creating new items (think 'id' for example), so I usually:
Define a schema for PUT and GET.
Define a separate schema for POST that only contains the relevant properties for creation.
If users supply properties not in the schema, always error with a 422.
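A minimal sketch of the creation-schema idea, assuming the book from the question (`name`, `color`, computed `author`). The names `CREATE_BOOK_FIELDS` and `validateCreateBook` are made up for illustration, and a real API would more likely use a schema library than hand-written checks.

```typescript
// Properties a client may send when creating a book. `author` is computed by
// the API and `id` is assigned by it, so neither belongs to the POST schema.
const CREATE_BOOK_FIELDS = ["name", "color"] as const;

interface ValidationResult {
  status: 201 | 422;
  errors: string[];
}

function validateCreateBook(body: Record<string, unknown>): ValidationResult {
  const errors: string[] = [];
  const allowed: readonly string[] = CREATE_BOOK_FIELDS;

  // Any property outside the creation schema is an error, not silently dropped.
  for (const key of Object.keys(body)) {
    if (!allowed.includes(key)) errors.push(`unknown property "${key}"`);
  }
  // Every property in the creation schema is required here (an assumption).
  for (const field of CREATE_BOOK_FIELDS) {
    if (typeof body[field] !== "string") errors.push(`"${field}" must be a string`);
  }
  return { status: errors.length === 0 ? 201 : 422, errors };
}
```

With this shape, a body that includes `author` is answered with a 422 and an explicit message, which tells the consumer why the value was not used.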
"some of the properties are required and computed by API"
Computed properties are neither required nor optional, by definition. There is no reason to ask consumers to pass such properties.
"do not allow the user to provide an author and show an exception if the user provides it"
Indeed, the DTO should not contain an author property. Consumers can send whatever they want over the network; however, it is the responsibility of the API provider to publish a contract (DTO) for consumers to use properly. The API provider controls which properties are considered, and no exception should be thrown, as the number of "bad" properties that consumers can send is endless.
"So the POST and GET data transfer objects will look almost the same"
Making DTOs of the same resource look the same is not a goal. In many cases, the GET operation exposes a lot more properties than the POST operation for the same resource, especially when designing domain-driven APIs.
"Are there any simple, expressive, and predictable patterns to consider?"
If you want your API to express the fact that author is computed, you can have the following endpoints:
POST http://.../author-computed-books
GET http://.../books/1
Personally, I wouldn't implement it that way since it does not look natural, but you get the idea.
"I want my POST endpoint to be as expressive as possible. So the POST and GET data transfer objects will look almost the same."
Maybe just document it instead of relying on rules like "it must be almost the same as the GET endpoint".
E.g. my POST endpoint is POST /number "1011" and my GET endpoint is GET /number -> 11. If I don't document that I expect binary and serve decimal, then nobody will know, and they would guess, for example, decimal for both. Beyond documentation, another way to be more explicit is to change the GET response to include the base, {"base": 10, "value": "11"}, or to change the GET endpoint to GET /number/decimal -> 11.
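The binary/decimal example can be sketched as two small functions. The names `storeBinary` and `describeDecimal` and the `NumberRepresentation` shape are illustrative assumptions, not part of any real API.

```typescript
interface NumberRepresentation {
  base: number;
  value: string;
}

// POST /number accepts a binary string and stores the numeric value.
function storeBinary(input: string): number {
  if (!/^[01]+$/.test(input)) throw new Error("expected a binary string");
  return parseInt(input, 2);
}

// GET /number states the base explicitly instead of leaving it to guesswork.
function describeDecimal(stored: number): NumberRepresentation {
  return { base: 10, value: stored.toString(10) };
}

const stored = storeBinary("1011"); // 11
const response = describeDecimal(stored); // { base: 10, value: "11" }
```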
As for the computed author, I don't understand how you would compute it. Either the book is already registered and the consumer shouldn't register it again, or you don't know much about its author. In the latter case you can guess, e.g. based on Google results for the title, but it will be a guess, not necessarily true. The same goes for consumer data, but at least that is what the consumers provided. There is no certainty. So for me it would be a complex property rather than a primitive one if the source of the information matters, something like "author": {"name": "John Wayne", "source": "consumer/service"}. Normally it is complex anyway, because authors tend to have ids, names, other books, etc.
Another thought: if this is weird for the consumers rather than expected, then I have no idea why it is a feature at all. If author guessing is a service, then a possible solution is to make the property mandatory and add a guessing service, GET /author?by-book-name={book-name}, so they can use the service if they want to. The same works with a completely optional property. This way you give control back to the consumers over whether they want to use this service or not.

Where should validation occur: endpoint or object?

I know this has been asked more generally before, but here is my specific situation:
I have an endpoint (API exposed to clients/users) that ends up calling public member functions of some objects. Should I validate at the endpoint or at the member function?
It seems that validating at the endpoint is a little easier in this case because then all of my validation is done around my API functions.
But somehow it feels like the objects should maintain themselves and prevent invalid data from being used on their own functions.
Thanks!
Validation can be, and usually is, quite a complex process that involves lots of heavy, business-related logic and has plenty of dependencies on outside resources.
I suppose it's better to let the client create an invalid object and validate it at the very end, just before its use in the business service.

Querying multiple OData entities for the same search term

I have a client who has a web service providing several different top-level entities. Let's say there are three which are of particular interest: Organisations, Sectors and Activities.
The client wants to be able to search for a term across all three of these entities simultaneously without having to make three separate calls. For example, "return all records whose name contains bread".
While the expand keyword would seem to be the solution at first glance, this only provides a view into the parent entity.
My suspicion is that this cannot be done by virtue of the way in which OData is designed to work, but I need to have a conclusive answer before going back to the client.
Unless the server provides a service operation for this exact purpose (and that would be pretty tricky to design anyway, what type should it return?), then it's not possible in one query.
On the other hand, the client can send three queries inside one batch request, so that it's just a single round trip to the server. That might be good enough.
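A rough sketch of what such a batch could look like on the client side, building the multipart body by hand. Several things here are assumptions: the property is called `Name`, the boundary string is arbitrary, and the filter uses `substringof`, which matches OData v2/v3 (OData v4 uses `contains(Name,'bread')` instead). The body would be sent as a POST to the service's `$batch` endpoint with a `Content-Type: multipart/mixed; boundary=...` header.

```typescript
// Build one $filter query per entity set. `Name` is an assumed property name.
function nameContainsQuery(entitySet: string, term: string): string {
  const literal = term.replace(/'/g, "''"); // quotes are doubled inside OData string literals
  const filter = `substringof('${literal}',Name)`;
  return `${entitySet}?$filter=${encodeURIComponent(filter)}`;
}

// Wrap each GET in its own part of a multipart/mixed batch body.
function buildBatchBody(queries: readonly string[], boundary: string): string {
  const parts = queries.map((query) =>
    [
      `--${boundary}`,
      "Content-Type: application/http",
      "Content-Transfer-Encoding: binary",
      "",
      `GET ${query} HTTP/1.1`,
      "",
      "",
    ].join("\r\n"),
  );
  return parts.join("") + `--${boundary}--\r\n`;
}

const entitySets = ["Organisations", "Sectors", "Activities"];
const batchBody = buildBatchBody(
  entitySets.map((set) => nameContainsQuery(set, "bread")),
  "batch_search", // arbitrary boundary chosen for this sketch
);
```

The response is itself a multipart document with one part per query, so the client still has to split and merge the three result sets.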
You could add a WebGet service operation to the service to perform this function. You would have to wrap the response objects, though.

Where to put NHibernate query logic?

I am trying to set up proper domain architecture using Fluent NHibernate and Linq to NHibernate. I have my controllers calling my Repository classes, which do the NHibernate thang under the hood and pass back ICollections of data. This seems to work well because it abstracts the data access and keeps the NHibernate functionality in the "fine print".
However, now I'm finding situations where my controllers need to use the same data calls in a different context. For example, my repo returns a list of Users. That's great when I want to display a list of users, but when I want to start utilizing the child classes to show roles, etc., I run into SELECT N+1 issues. I know how to change that in NHibernate so it uses joins instead, but my specific question is WHERE do I put this logic? I don't want every GetAllUsers() call to return the roles also, but I do want some of them to.
So here are my three options that I see:
Change the setting in my mapping so the roles are joined to my query.
Create two Repository calls - GetAllUsers() and GetUsersAndRoles().
Return my IQueryable object from the Repository to the Controller and use the NHibernate Expand method.
Sorry if I didn't explain this very well. I'm just jumping into DDD and a lot of this terminology is still new to me. Thanks!
As lomaxx points out, you need query.Expand.
To prevent your repository from becoming obscured with all kinds of methods for every possible situation, you could create Query Objects which make configurable queries.
I posted some examples using the ICriteria API on my blog. The ICriteria API has FetchMode instead of Expand, but the idea is the same.
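To make the Query Object idea concrete, here is a language-neutral sketch written in TypeScript rather than C#. It does not use NHibernate at all: `UserQuery` and `InMemoryUserRepository` are hypothetical, and the in-memory lookup stands in for whatever eager-fetch option (Expand, FetchMode) the real repository would apply.

```typescript
interface Role { name: string; }
interface User { id: number; name: string; roles?: Role[]; }

// Query object: describes what to load; the repository decides how.
class UserQuery {
  loadRoles = false;
  nameFragment: string | undefined = undefined;

  withRoles(): this { this.loadRoles = true; return this; }
  nameContains(fragment: string): this { this.nameFragment = fragment; return this; }
}

// In-memory stand-in for the data access layer, for illustration only.
class InMemoryUserRepository {
  private readonly users: readonly User[];
  private readonly rolesByUserId: ReadonlyMap<number, Role[]>;

  constructor(users: readonly User[], rolesByUserId: ReadonlyMap<number, Role[]>) {
    this.users = users;
    this.rolesByUserId = rolesByUserId;
  }

  // One method serves every caller; the query object carries the variation.
  find(query: UserQuery): User[] {
    const fragment = query.nameFragment;
    return this.users
      .filter((user) => fragment === undefined || user.name.includes(fragment))
      .map((user) =>
        query.loadRoles
          ? { ...user, roles: this.rolesByUserId.get(user.id) ?? [] }
          : { ...user },
      );
  }
}
```

A controller that only lists users would pass `new UserQuery()`, while one that shows roles would pass `new UserQuery().withRoles()`, so the repository does not grow a new method per use case.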
I try and keep all the query logic in my repositories and try to only pass back the ICollection from them.
In your situation, I'd pass in some parameters to determine if you want to eager load roles or not and construct the IQueryable that way. For example:
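public ICollection<Users> GetAllUsers(bool loadRoles)
{
    // Linq to NHibernate query over the Users entity
    var query = session.Linq<Users>();
    if (loadRoles)
        query.Expand("Roles"); // eager-load roles to avoid SELECT N+1
    return query.ToList();
}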
I would choose option 2, creating two repository calls. I would perhaps also consider creating another repository call, GetRoleByUser(User user). That way you could load a user's roles when the user selection changes, on a separate thread if required. This would improve performance, because you wouldn't load the roles of every user, which is what would require the most resources.
It sounds like you are asking if it is possible to make GetAllUsers() sometimes return just the Users entities and sometimes return the Users and the roles.
I would either make a separate repository method called GetRolesForUser(User user), use lazy loading for Roles, or use the GetAllUsers(bool loadRoles) mentioned in lomaxx's answer.
I would lean toward lazy loading roles or a separate method in your repository.

web service data type (contract)

I have a general design question.
We have a fairly big data model that represents a clinical object; the object itself has 200+ child attributes in its hierarchy.
We also have a SetObject operation and a GetObject operation. My question is: best-practice-wise, would it make sense to use that single data model in both operations, or a different data model for each? The Get operation will return many more details than are needed for Set.
An example of what I mean: the data model has, say, ProviderId and ProviderName attributes. In the Get operation, both ProviderId and ProviderName need to be returned. However, in the Set operation only ProviderId is needed, and ProviderName is ignored by the service since the system already has that information. In this case, if the Get and Set operations use the same data model, ProviderName is exposed even for the Set operation; does that confuse the consuming developer?
I would say: it depends :-)
No seriously. How do you edit / work on the object? I assume your software is calling the WCF service to retrieve an object, using an ID or a search term or something.
So you get back the object with 200+ attributes. How do you work on it, how much of it do you typically change?
If you typically only change a handful of attributes - then maybe having a generic SetProperty method on the service that would take the object ID, a property name, and a new value, might make sense. But think about how this is going to work:
the server side code will get the ID for the object
it will load the object from the database
it will then set a single property to a new value
it will save the object back to the database
What if you update four properties? You'd go through 4 of those cycles. Or: you could extend the SetProperty method to include a dictionary of (property name, value) pairs.
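The dictionary variant could look roughly like the sketch below: one load, all changes applied, one save. Everything here is hypothetical (`ClinicalObjectService`, `setProperties`, the in-memory `Map` standing in for the database), and the list of writable properties is an assumption borrowed from the ProviderId/ProviderName example in the question.

```typescript
type PropertyValue = string | number | boolean | null;
type ClinicalObject = Record<string, PropertyValue>;

class ClinicalObjectService {
  saves = 0; // counts write cycles, only to make the single round trip visible
  private readonly store = new Map<string, ClinicalObject>(); // stand-in for the database
  private readonly writable: ReadonlySet<string>;

  constructor(writable: readonly string[]) {
    this.writable = new Set(writable);
  }

  seed(id: string, value: ClinicalObject): void {
    this.store.set(id, { ...value });
  }

  get(id: string): ClinicalObject | undefined {
    const found = this.store.get(id);
    return found === undefined ? undefined : { ...found };
  }

  // Apply any number of (property name, value) pairs in one load/save cycle.
  setProperties(id: string, changes: Record<string, PropertyValue>): void {
    const current = this.store.get(id); // load once
    if (current === undefined) throw new Error(`no object with id ${id}`);
    const updated: ClinicalObject = { ...current };
    for (const [name, value] of Object.entries(changes)) {
      if (!this.writable.has(name)) throw new Error(`property "${name}" cannot be set`);
      updated[name] = value;
    }
    this.store.set(id, updated); // save once
    this.saves += 1;
  }
}
```

Checking names against a writable set also means a read-only attribute such as ProviderName is rejected explicitly instead of being silently ignored.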
So I guess it depends on how many of those 200 properties you are changing at any given time. If you change 10% or 20% of those properties, wouldn't it be easier to just pass back the whole, modified object?
This looks like a good candidate for using your clinical object as a canonical model and providing a RESTful-style service interface. You can then provide different views, or representations, of your data object with only the fields required by each usage model. Your verbs (get, set) will become the HTTP standard GET and PUT.
There are a number of open source REST frameworks that you can use to make it easier to get started. Restlet is one that I have used successfully.