How do I filter a RESTful collection resource? Query parameters or query strings? - api

I need to filter a list of employees and support both simple and complex queries.
RESTful APIs have query parameters, which are key-value pairs provided after the ?
/employees?location=london
What would be used if I wanted to reduce the list to Employees with a start date between 01/01/2020 and 01/05/2020 that are also male and work at the Birmingham office?
Is this where a query string ?q=.... should be used? Is there any best practice to follow for this?

Is there any best practice to follow for this?
Anything that is consistent with the other identifiers in your API is fine.
REST doesn't care what spellings you use for your resource identifiers, so long as they are consistent with the production rules defined by RFC 3986.
A query part that is an application/x-www-form-urlencoded representation of key value pairs is a popular choice because HTML form support means those resource identifiers are easy to test with a web browser.
?q= is just another key value pair -- your values can be pretty much anything so long as they are encoded correctly. For prior art, see the text area input control in html.
Key value pairs are a way to encode information into the query part, but you aren't required to do that. http://example.org/?select%20%2A%20from%20students%3b is a perfectly satisfactory resource identifier from a REST client perspective.
(Of course, you probably wouldn't want to take an unsanitized input and run it in your production relational database using a role authorized to do arbitrary things.)
You aren't restricted to encoding useful information in the query part; if you prefer to encode information into the path segments, that's OK too. HTML doesn't support that out of the box, but a generalization of the HTML form is a URI Template, which gives you more options for communicating to the client how the URI is to be constructed.
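As a rough illustration of the key-value approach applied to the question's example, here is a minimal Python sketch. The parameter names (startDateFrom, startDateTo, gender, location) are hypothetical, and 01/05/2020 is read as 1 May 2020; any spelling works as long as it is consistent across your API.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Hypothetical parameter names; the API is free to pick any consistent spelling.
filters = {
    "startDateFrom": "2020-01-01",
    "startDateTo": "2020-05-01",  # assumes 01/05/2020 means 1 May 2020
    "gender": "male",
    "location": "birmingham",
}

# urlencode produces an application/x-www-form-urlencoded query part.
url = "/employees?" + urlencode(filters)
print(url)

# Round trip: the server can recover the same pairs from the query part.
parsed = {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}
```

Every filter is just another pair, so adding a criterion does not change the shape of the identifier.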

Related

How to pass an arbitrary number of parameters while adhering to REST principles

I have a database with 3 tables: product, category, and xref_product_category. My business logic permits a product to be associated with an arbitrary number of categories (bed, bath, kitchen, etc.). In terms of designing a REST API, what's the best way to establish these relationships?
For some reason I'm hesitant to pass a JSON array of category IDs as a parameter, but I don't really have a good reason not to. I suppose the other option would be to make a series of PUT calls, passing a single parameter each time. What's the most RESTful way to establish multiple relationships like this? Should this be done in a single API call, or in multiple calls?
In REST, the phrase "arbitrary number of parameters" usually means "representation".
The parameters as a whole could most likely be combined into payload content to represent a resource.
So firstly define the schema for the payload, and then you'll have a media type that can be used to represent it. Document the schema and tell the people who will use the API that they can POST or PUT with that content to define your product and its arbitrary relationships to other resources.
Then define the URI for your product resource and how clients will navigate to it from a Cool (entry-point) URI.
I would allow the parameters to be passed in any order.
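To make the "parameters as a representation" idea concrete, here is a small Python sketch. The field names, the category links and the http://example.org/products/42 URI are all made up for illustration; the request is only constructed, not sent.

```python
import json
import urllib.request

# Hypothetical schema: field names and category URIs are illustrative only.
product = {
    "name": "Bath towel",
    "categories": [
        {"href": "/categories/bath"},
        {"href": "/categories/bed"},
    ],
}

# One PUT carries the whole representation, including every category link.
request = urllib.request.Request(
    "http://example.org/products/42",  # hypothetical product URI
    data=json.dumps(product).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="PUT",
)
```

The list of categories can grow or shrink without changing the URI or the number of calls.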

Providing complex filtering REST API [duplicate]

This question already has answers here:
REST and complex search queries
(5 answers)
Closed 10 years ago.
So I am building a RESTful (as RESTful as I can) API with the Laravel 4 PHP Framework. Right now I have dozens of API calls working and I have a process for doing limiting, ordering, and simple filtering. Here is an example of one of the calls:
/api/v1/users?limit=10&offset=10&firstName=John&order[]=createdTimestamp desc
This would return the 11th through 20th users that have a first name of John, ordered by createdTimestamp in descending order. The simple filtering here can only do exact matches (=). Now I also want to be able to provide a more complex filtering system through the REST API that supports the ability to specify the equality match type, so that they could do a != or > or LIKE, etc. The issue is that I don't know if I am going to be able to provide this type of filtering through a normal query string.
What is the best way to provide this complex filtering through a REST API? Is doing this through a POST still considered the best way even though it is not "truly" RESTful (even though this would prevent issues of the user trying to run a long query that exceeds the URI character length limit that some browsers have)?
@ryanzec
Now I also want to be able to provide a more complex filtering system
through the REST API that supports the ability to specify the
equality match type, so that they could do a != or > or LIKE, etc.
The issue is that I don't know if I am going to be able to provide
this type of filtering through a normal query string.
It's not possible with a simple query string (well, maybe it's possible, but it is very hard to encode such logic properly in a query string). You need to define a custom query format and use POST to submit such a query. The server may respond with:
"201 Created" status and "Location" header field indicating query resource if there was no such query before; or
"303 See Other" and "Location" header field indicating already existing query resource.
Is doing this through a POST still considered the best way even though it
is not "truly" RESTful
I do not know who said this, but it's wrong. There is nothing wrong with using POST for such purposes.
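The 201/303 exchange described above could be sketched roughly like this in Python. Everything here is an assumption for illustration: the in-memory store, the /api/v1/users/queries/ path and the hash-based identifier are not from the question, and a real server would persist the queries.

```python
import hashlib
import json

# In-memory store standing in for whatever persistence the server uses.
saved_queries = {}

def submit_query(query):
    """Return (status, headers) for a POST that submits a query document."""
    # Canonical JSON so that equivalent queries map to the same resource.
    canonical = json.dumps(query, sort_keys=True, separators=(",", ":"))
    query_id = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    location = "/api/v1/users/queries/" + query_id  # hypothetical URI layout

    if query_id in saved_queries:
        # The query resource already exists: point the client at it.
        return 303, {"Location": location}

    saved_queries[query_id] = query
    return 201, {"Location": location}

# Hypothetical query format: field, operator, value.
example = {"filters": [{"field": "firstName", "op": "LIKE", "value": "Jo%"}]}
first = submit_query(example)
second = submit_query(example)
```

The client then issues a plain GET on the Location URI to fetch the results, so the results themselves stay cacheable and bookmarkable.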
Use forms in your collection resource responses to tell the client how to search the collections. See my answer to REST and complex search queries for examples.

how to create a REST api for filtered data

I've been reading a lot and I understand that a REST API maps resources with HTTP verbs. That's very easy to understand when, for example, a tutorial shows an example like Employee.
A PUT will be a new record (if it doesn't exist) or an update; a GET will extract a list with all employees, and a GET api.example.com/employee/12 will extract the record for the Employee with ID = 12.
But, for example, how could I map more useful queries like "get me all the employees with a salary under 50.000, with less than 2 years at the company and with the marital status as single"?
In other words, how could I parametrize the query? Is it correct to add parameters like api.example.com/Employee?salary<50000&years<2&marital-status=single ?
The theory:
If you add parameters to your query, they are just part of the URL. The form of the URL does not tell you anything about whether your API is RESTful. Your API with query strings is RESTful if it obeys the constraints described here: http://en.wikipedia.org/wiki/Representational_state_transfer and (optionally) follows the guiding principles.
So as long as your query parameters don't do anything crazy like randomly change the state of some of the resources, then your API is still RESTful.
The practice:
Any sensible REST API will need query parameters for the 'index' route. In practice, LinkedIn's REST API has query parameters that just select fields from someone's profile. In this case, the URLs look completely different from yours, but still obey the principles of REST.
Your situation:
Your query strings can't contain inequalities, only key+value pairs. You need to express it more like ?max-salary=50000&max-years=2&marital-status=single. You might also name your 'index' route differently: api.example.com/employees (plural)
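On the server side, those key+value pairs can be turned back into comparisons. Here is a minimal Python sketch; the sample rows and the decision to treat max-salary and max-years as inclusive upper bounds are assumptions, so adjust them if you need strict "under" semantics.

```python
from urllib.parse import parse_qs

# Made-up sample data for illustration.
employees = [
    {"name": "Ann", "salary": 42000, "years": 1, "marital_status": "single"},
    {"name": "Bob", "salary": 61000, "years": 1, "marital_status": "single"},
    {"name": "Cy", "salary": 39000, "years": 5, "marital_status": "married"},
]

def filter_employees(query_string, rows):
    """Apply max-salary, max-years and marital-status filters to rows."""
    params = {key: values[0] for key, values in parse_qs(query_string).items()}
    result = list(rows)
    if "max-salary" in params:
        limit = int(params["max-salary"])
        result = [row for row in result if row["salary"] <= limit]  # inclusive bound (assumption)
    if "max-years" in params:
        limit = int(params["max-years"])
        result = [row for row in result if row["years"] <= limit]  # inclusive bound (assumption)
    if "marital-status" in params:
        wanted = params["marital-status"]
        result = [row for row in result if row["marital_status"] == wanted]
    return result

matches = filter_employees("max-salary=50000&max-years=2&marital-status=single", employees)
names = [row["name"] for row in matches]
```

Each parameter is optional, so the same 'index' route serves both the unfiltered and the filtered list.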

What validation must a form include? Best practices

I am trying to put together a checklist of things I need to keep in mind when creating forms. I know I need to filter input content. I am already filtering for errant HTML and scripts, escaping for MySQL, limiting to data types (phone numbers are 10+ digits with trailing extension digits, email has to be email, strings cannot contain HTML or code, etc.), and enforcing word or character limits (names max out at 4 words separated by whitespace, etc.). But what else should I be doing and what are good ways of doing them?
This validation will be taking place on the server, but I am looking for best practices across platforms. The data will be coming in using POST, so I don't have to worry too much about mucking about with the URL. Also the form presentation, with hinting and JS input masking, is handled, and pretty much all the client side stuff is in place.
Validation down to its simplest term: only accepting what you want.
For example, if your telephone field should only include numbers (in no particular phone number format) and be no longer than 20 digits, you can check it against a regular expression to make sure that it is what you want to accept, i.e. ([0-9]{7,20})
Another example is Twitter. It only accepts usernames of up to 15 characters, made up of alphanumeric characters and underscores. So the validation regex might be something like: ([a-zA-Z0-9]{1})([a-zA-Z0-9\_]{0,14})
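For what it's worth, here is how those two patterns might be applied in Python. This is only a sketch: the function names are made up, and fullmatch is used so that the whole value has to match rather than just a substring of it.

```python
import re

# Same rules as the patterns above: 7 to 20 digits, and a 1 to 15 character
# username that starts with an alphanumeric character.
PHONE_RE = re.compile(r"[0-9]{7,20}")
USERNAME_RE = re.compile(r"[a-zA-Z0-9][a-zA-Z0-9_]{0,14}")

def is_valid_phone(value):
    # fullmatch rejects values that merely contain a valid run of digits.
    return PHONE_RE.fullmatch(value) is not None

def is_valid_username(value):
    return USERNAME_RE.fullmatch(value) is not None
```

Using search or match instead of fullmatch would let input like "1234567<script>" through, because only the beginning of the value would be checked.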
Form validation can also take the form of security checks, such as honeypotting, form validity checks and so on.
Form honeypotting: preventing automated or spam submissions of your form
Form Validity: Check between the time the form has loaded and the time of form submission. If it is too short, the form might be submitted by a bot. If it took too long, the data might be old and expired.
CAPTCHA: another level of bot prevention / human only form validation.
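A rough Python sketch of the honeypot and form validity checks follows. The hidden field name ("website") and both time thresholds are assumptions chosen for illustration, and in a real application the render timestamp would need to be signed or kept server-side so a bot cannot forge it.

```python
MIN_SECONDS = 3        # assumed lower bound: anything faster looks automated
MAX_SECONDS = 60 * 60  # assumed upper bound: anything older is treated as expired

def check_submission(form, rendered_at, submitted_at):
    """Classify a submission as 'bot', 'expired' or 'ok'."""
    # Honeypot: a field hidden from humans with CSS; bots tend to fill it in.
    if form.get("website", "").strip():
        return "bot"

    # Form validity: time between rendering the form and submitting it.
    elapsed = submitted_at - rendered_at
    if elapsed < MIN_SECONDS:
        return "bot"
    if elapsed > MAX_SECONDS:
        return "expired"
    return "ok"
```

For example, check_submission({"name": "Ann", "website": ""}, 1000.0, 1030.0) is classified as "ok", while the same form submitted one second after rendering is classified as "bot".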
The always great Smashing Magazine has some great tips:
http://www.smashingmagazine.com/2009/07/07/web-form-validation-best-practices-and-tutorials/
But if I could offer my own:
Make it secure but usable.
Use client side validation along with server side validation.
If you post back with errors, make sure the users' information is still populated in the form.
Limit the field size in HTML forms.
Of course, all this is assuming you're using web forms.
Commenter S. Lott is correct: Escaping should be taken care of automatically by the framework. If you're not working with an explicit framework, then at the very least, the utility functions you use to access the database and display data on the page should escape for SQL and HTML, respectively. If you have to worry about escaping in your validation code, sooner or later you'll make a mistake, and some twelve-year-old script kiddy will replace the contents of your web site with horse porn.
Stuff that makes sense in the context is good, stuff that doesn't make sense is bad.
If this site filtered for HTML, then we couldn't give HTML examples. Instead it processes the HTML so that they are output escaped, rather than as HTML.
Beware of over-validating. < is not necessarily bad, there are all sorts of reasons people will use <, > and especially &.
Likewise, while Robert '); DROP TABLE Students;-- isn't someone you want signing up at your school, if your preventing that means that O'Brien, O'Tierney, O'Donovan and O'Flanagan can't sign up, by the time O'Donnell is refused he's going to think it's anti-Irish racism and sue you! (More realistically, I do know people here in Ireland who go off to find a competitor when a SQL-injection prevention script blocks or mangles their surname - though more often they've just found yet another site that isn't preventing injection, as either will fail on their name in some way).
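To illustrate why rejecting apostrophes is unnecessary, here is a small sketch using Python's built-in sqlite3 module with an in-memory database; the table is made up. With a parameterized query, both names are stored as plain data and nothing is executed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT NOT NULL)")

names = ["O'Brien", "Robert'); DROP TABLE Students;--"]
for name in names:
    # The ? placeholder passes the value as data, so no manual escaping is needed.
    conn.execute("INSERT INTO students (name) VALUES (?)", (name,))

stored = [row[0] for row in conn.execute("SELECT name FROM students ORDER BY rowid")]
print(stored)
```

The table survives and O'Brien gets to sign up, with no validation rule about quote characters at all.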
Validation, as opposed to security-checking, is about making sure something plausibly reflects reality. In reality, personal names have ' in them and company and town names have & in them all the time, and "validation" that blocks that has turned valid data into invalid. In reality, credit card numbers are 16 digits long (some debit cards 19 digits) and pass a Luhn check; email addresses have a user info part, an @ and a host name with an MX record. People's names are never zero characters long. That's validation. Only reject (rather than escape) if it genuinely is invalid.
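For reference, a Luhn check is only a few lines. Here is a minimal Python sketch; tolerating spaces and hyphens in the input is my own assumption about what users type, and the function does not check the length or the issuer prefix.

```python
def luhn_ok(number):
    """Return True if the digits in number pass the Luhn checksum."""
    # Accept the separators people commonly type in card numbers (assumption).
    cleaned = number.replace(" ", "").replace("-", "")
    if not (cleaned.isascii() and cleaned.isdigit()):
        return False

    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for index, char in enumerate(reversed(cleaned)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0
```

The widely used test number 4111 1111 1111 1111 passes, while changing its last digit to 2 makes it fail.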
You may want to check out OWASP http://www.owasp.org/index.php/OWASP:About. Especially if you're planning on handling credit cards.

Howto restfully expose related-parent-child resources?

I'm designing an API and I'd like to allow users and groups to have saved searches, but am uncertain how to best expose this information. I've come up with a few URIs to expose them:
# These are for CRUD the search definitions, not running the searches
1. /users/{username}/searches # lists the searches for a user
2. /users/{username}/searches/{search-name} # CRUD a specific user search
3. /groups/{groupname}/searches # lists the searches for a group
4. /groups/{groupname}/searches/{search-name} # CRUD a specific group search
5. /searches/{search-id|search-name}
6. /searches/group/{groupname}/{search-name}
7. /searches/user/{username}/{search-name}
I don't feel it's right to expose all those URIs. That means there are 2 ways to update or list searches for a user and a group: through the /groups/search, or through /search/group. It also means more to support and I'm afraid that subtle differences would develop.
Searches can be independent records in the database and not tied to a specific user or group (e.g, default system searches, or context-dependent searches).
Because searches can be independent, it feels wrong to expose them as /users/searches and /groups/searches. At the same time, if I'm thinking, "What are bob's searches?" I would first think of /users/bob/searches because, logically, it's bob's search. Similarly, if I want to, say, back up bob's account, all his personal information should be under /users/bob.
So, does anyone have advice on which way is preferred, and/or worked well (or poorly) for them?
I would tend to stick with
5. /searches/{search-id|search-name}
6. /searches/group/{groupname}/{search-name}
7. /searches/user/{username}/{search-name}
The backup problem can be solved by creating a new resource that contains links to Bob's info throughout the system e.g.
GET /AccountData/Bob
<div class="AccountData">
<link rel="searches" href="/Searches/User/Bob"/>
<link rel="options" href="/Options/User/Bob"/>
<link rel="usagehistory" href="/History/User/Bob"/>
</div>
My experience is that you will drive yourself nuts if you try and create a single hierarchy that meets all of your usage scenarios. You just can't do it. That's why Wikis work so well, they use links instead of hierarchy to provide access to the information.
I would suggest you focus more on what links will be returned in the representations.
e.g.
GET /Groups/{GroupName}
<div class="group">
<div class="name">AGroup</div>
<link rel="searches" href="/Searches/Group/AGroup"/>
</div>
With this approach, you care much less about what the URL structure looks like. As Roy states here
A REST API must not define fixed
resource names or hierarchies (an
obvious coupling of client and server)
I realize this may seem like an extreme position, considering how everyone on SO seems fixated on what your URLs need to look like to have a RESTful API, but the more you think about it, the more sense it makes.
P.S. Please don't get hung up on my choice of HTML as a media type for the representations, I'm just drawing attention to the fact that you don't always need to use a custom XML vocabulary.
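To show what "following links instead of building URLs" might look like on the client, here is a small Python sketch that pulls the rel/href pairs out of the group representation above. The LinkCollector class is made up for illustration and only looks at link elements.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect rel -> href pairs from <link> elements in a representation."""

    def __init__(self):
        super().__init__()
        self.links = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> are also routed here by default.
        if tag == "link":
            attributes = dict(attrs)
            if "rel" in attributes and "href" in attributes:
                self.links[attributes["rel"]] = attributes["href"]

representation = """
<div class="group">
<div class="name">AGroup</div>
<link rel="searches" href="/Searches/Group/AGroup"/>
</div>
"""

collector = LinkCollector()
collector.feed(representation)
searches_url = collector.links["searches"]
```

The client only knows the rel name "searches"; if the server later moves the searches somewhere else, the client keeps working as long as the link is still in the representation.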
My tendency would be to divide things up by type of query (e.g. one type of query is a "list searches" query). Then I'd have arguments, e.g. /searchlist?user=john. Unless you're trying to make this stuff get indexed by bots and search engines, that should work fine.
Search is functionality, not a resource, so you should not put it in the resource URI; it could go in a parameter instead, like
/users/{username}?q={query string}
/groups/{groupname}?q={query string}
then for this query string you have 2 options
"RQL" (Resource Query Language)
or
"FIQL" (Feed Item Query Language)