Validating API Keys in CouchDB - api

I became interested in CouchDB recently and wanted to try and form a small application around it.
The way I currently envision my system is that each request provides three things: an ID, an API key, and a format. The ID is the _id of a document in the database, the API key is the _id of another document that has a property of {"valid" : true/false}, and the format is the format they want back. If the API key is valid, the system would generate the show page for the given ID, in the format requested. Otherwise it would return a 403 status code.
Unfortunately I can't find a way to pull up another document from a show page. I am just beginning CouchDB, so maybe there is something simple here I'm missing.

With a _show function, there are three parts involved:
The design document
The show function inside the design document
The additional document to be shown
For the URL format /db/_design/ddoc/_show/my_show_func/otherdoc:
The design document is _design/ddoc
The show function is shows.my_show_func within that design document
The document to be shown has an _id of otherdoc
Those are the only two documents involved. The only way I can think of to do what you describe is to have a design doc per API key. The user would query /db/_design/API_KEY/_show/my_show_func/other_doc_id. CouchDB is relaxed. There is nothing wrong with thousands of design docs with identical or similar _show functions. You could use the HTTP COPY method to clone a base design doc to a new API key as needed. Then you could revoke an API key by deleting the design doc. However, that is obviously an unusual approach, worth a second thought.
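A minimal sketch of the clone-and-revoke idea, assuming CouchDB's COPY method with a Destination header and a DELETE that carries the current revision. The host, database name, base design doc name, and helper names are hypothetical, and the requests are only built here, not sent.

```python
from urllib.parse import quote
from urllib.request import Request


def build_copy_request(base_url, db, base_ddoc, api_key):
    """Build a COPY request that clones a base design doc to _design/<api_key>."""
    source = f"{base_url}/{db}/_design/{quote(base_ddoc, safe='')}"
    # Assumption: the Destination header names the target document ID.
    destination = f"_design/{quote(api_key, safe='')}"
    return Request(source, method="COPY", headers={"Destination": destination})


def build_revoke_request(base_url, db, api_key, rev):
    """Build a DELETE request that removes the design doc for an API key."""
    url = f"{base_url}/{db}/_design/{quote(api_key, safe='')}?rev={quote(rev, safe='')}"
    return Request(url, method="DELETE")


copy_req = build_copy_request("http://localhost:5984", "db", "base", "KEY123")
print(copy_req.get_method(), copy_req.full_url, copy_req.get_header("Destination"))
```

Passing either request to urllib.request.urlopen would actually perform it against a running server.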
A final consideration is (with the default CouchDB, no reverse proxies, mod_security, etc.) if a user can read one document, they can read the entire database (e.g. from the _all_docs query.) Therefore show functions are a convenience for the software but not a security gateway.

Related

Designing a REST API

In my application, I will have clients trying to book some slots. For this, the client will give some input and based on that input, I have to first create these slots and then return them to the client. Then after getting the slots, the user will book one of these slots. This "book" action is not creating any new resource but simply modifying 2 existing resources.
How do I design my URIs and what methods should I use?
EDIT:
I have 1 existing resource whose URI is: /api/v1/vehicle/id
Using the application front-end a user will fill some form data, with fields date and booking-type and submit it. Then this data will be used by the backend to "calculate" (no resource called slots exists currently) booking slots available to the user. These calculated slots will then be saved in the DB and returned as a response to the user. Out of these slots, the user will book a slot. However, this book action will not create any new resource, instead it will simply modify an existing vehicle resource (add booking related data to it) and the slots object returned by the previous request. I want to create a REST API for this.
I was thinking of doing it like this:
POST /api/v1/slot (1)
PUT/PATCH /api/v1/vehicle/id (2)
PUT/PATCH /api/v1/slot/id (3)
First, I am not sure if I should use PUT or PATCH here, in both (2) and (3). I will only be supplying partial updates to the request. Second, when the user selects a slot and clicks on book button, the front end can only send 1 request to the server. But here, I need to modify 2 resources. How do I do this? I guess I should create 1 URI like /api/v1/createbooking and use the POST method. Then in my backend call 2 different methods to update vehicle and slot objects. Is this URI structure and naming good?
How do I design my URIs and what methods should I use?
How would you do it with web pages?
It sounds like you would have the user navigate to a form which collects the date, booking type, etc. The user submits the form, and the information is sent to the server; because this is not an essentially read-only operation, we'd expect the form to indicate that the POST method should be used.
The server would make its local changes, and present to the user a new form, with the input controls presenting the available options. Once again, choosing a slot doesn't seem to have read-only semantics (we're passing information to the server), so we would see POST again.
What are the URI targets? One way to choose which resources should handle the POST requests is to consider the implications of cache invalidation; because caches know to invalidate the target-uri of a successful POST request, it can be useful to target a resource that is likely to change when the request is successful.
My guess would be that first post would go to the resource that presents the slot choices (since what we are doing is generating new choices for the customer).
For the second submission, since the representation of the vehicle is what is going to be changed by selecting a slot, it makes sense to send that POST request to the vehicle uri.
In general, think about how you read (GET) the changing information from the server; you change that information by sending some request to the same URI.
I am not sure if I should use PUT or PATCH here
PUT and PATCH are typically available together, not as distinct things. They are two different methods for sending a replacement representation of a resource.
A simple example: if you wanted to change the title of /home.html, you could either PUT the entire HTML document with the new title, or you could PATCH the HTML document by sending a patch-document in some format that the server understands.
In other words, both of these methods have remote authoring semantics; we normally would choose between them based on considerations unrelated to your domain model (if the document is small relative to the size of the HTTP headers, a client would usually choose PUT for the additional idempotent semantic guarantees. If the document is really big, but the change is small, we might prefer PATCH).
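To make the PUT versus PATCH distinction concrete, here is a small sketch that uses JSON Merge Patch as the patch-document format. That format is my assumption for illustration; the answer above does not name one, and the resource fields are made up. Both routes end with the same stored representation.

```python
def merge_patch(target, patch):
    """Apply a JSON Merge Patch style document to a JSON-like value."""
    if not isinstance(patch, dict):
        return patch  # a non-object patch replaces the target outright
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)  # null means "remove this member"
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


stored = {"title": "Old title", "body": "A long document body"}

# PUT: the client sends the complete replacement representation.
after_put = {"title": "New title", "body": "A long document body"}

# PATCH: the client sends only a description of the change.
after_patch = merge_patch(stored, {"title": "New title"})

print(after_put == after_patch)
```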
I need to modify 2 resources.
There's no rule that says a request may only modify one resource. The side effects of the changes may impact the representations of several resources (see 4.3.3 and 4.3.4 of RFC 7231).
What's tricky is telling general purpose clients (and intermediate components) which cached representations are invalidated by the change. Out of the box, we only have semantics for the effective request uri, the Location and the Content-Location. (Location and Content-Location already mean something, so you can't just hijack them without the potential of introducing a big mess.)
You could do it with Linked Cache Invalidation, using "well known" link relations to identify other documents that have been changed by the request. Unfortunately, LCI doesn't seem to have achieved the status of a "standard", so we may be out of luck for the time being.
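As a rough illustration of the point that one request may change more than one resource, here is a sketch of a handler for a hypothetical POST to the vehicle URI that updates both the vehicle and the slot. The in-memory dictionaries, the field names, and the 409 response for an already booked slot are all assumptions.

```python
def book_slot(vehicles, slots, vehicle_id, slot_id):
    """Handle a hypothetical POST /api/v1/vehicle/{id} that books a slot."""
    slot = slots[slot_id]
    if slot["booked"]:
        return 409  # the slot was already taken
    # One request, two resources changed on the server side.
    slot["booked"] = True
    vehicles[vehicle_id].setdefault("bookings", []).append(slot_id)
    return 200


vehicles = {"v1": {}}
slots = {"s1": {"booked": False}}
print(book_slot(vehicles, slots, "v1", "s1"), vehicles, slots)
```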

REST API responses based on authentication, best practices?

I have an API with endpoint GET /users/{id} which returns a User object. The User object can contain sensitive fields such as cardLast4, cardBrand, etc.
{
firstName: ...,
lastName: ...,
cardLast4: ...,
cardBrand: ...
}
If the user calls that endpoint with their own ID, all fields should be visible. However, if it is someone else's ID then cardLast4 and cardBrand should be hidden.
I want to know what are the best practices here for designing my response. I see three options:
Option 1. Two DTOs, one with all fields and one without the hidden fields:
// OtherUserDTO
{
firstName: ...,
lastName: ..., // cardLast4 and cardBrand hidden
}
I can see this becoming out of hand with DTOs based on role, what if now I have UserDTOForAdminRole, UserDTOForAccountingRole, etc... It looks like it quickly gets out of hand with the number of potential DTOs.
Option 2. One response object being the User, but null out the values that the user should not be able to see.
{
firstName: ...,
lastName: ...,
cardLast4: null, // hidden
cardBrand: null // hidden
}
Option 3. Create another endpoint such as /payment-methods?userId={userId} even though PaymentMethod is not an entity in my database. This will now require 2 api calls to get all the data. If the userId is not their own, it will return 403 forbidden.
{
cardLast4: ...,
cardBrand: ...
}
What are the best practices here?
You're gonna get different opinions about this, but I feel that doing a GET request on some endpoint, and getting a different shape of data depending on the authorization status can be confusing.
So I would be tempted, if it's reasonable to do this, to expose the privileged data via a secondary endpoint. Either by just exposing the private properties there, or by having 2 distinct endpoints, one with the unprivileged data and a second that repeats the data + the new private properties.
I tend to go for option 1 here, because an API endpoint is not just a means to get data. The URI is an identity, so I would want /users/123 to mean the same thing everywhere, and have a second /users/123/secret-properties
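A minimal sketch of the two-endpoint idea, assuming an in-memory user store and a requester ID taken from whatever authentication is in place. The function names are hypothetical; the second function stands in for a handler behind /users/123/secret-properties.

```python
SENSITIVE_FIELDS = ("cardLast4", "cardBrand")


def get_public_user(users, user_id):
    """GET /users/{id}: the same shape for every caller."""
    user = users[user_id]
    return 200, {k: v for k, v in user.items() if k not in SENSITIVE_FIELDS}


def get_secret_properties(users, user_id, requester_id):
    """GET /users/{id}/secret-properties: only for the user themselves."""
    if requester_id != user_id:
        return 403, None
    user = users[user_id]
    return 200, {k: user[k] for k in SENSITIVE_FIELDS if k in user}


users = {"123": {"firstName": "A", "lastName": "B",
                 "cardLast4": "4242", "cardBrand": "visa"}}
print(get_public_user(users, "123"))
print(get_secret_properties(users, "123", "999"))
```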
I have an API with endpoint GET /users/{id} which returns a User object.
In general, it may help to reframe your thinking -- resources in REST are generalizations of documents (think "web pages"), not generalizations of objects. "HTTP is an application protocol whose application domain is the transfer of documents over a network" -- Jim Webber, 2011
If the user calls that endpoint with their own ID, all fields should be visible. However, if it is someone else's ID then cardLast4 and cardBrand should be hidden.
Big picture view: in HTTP, you've got a bit of tension between privacy (only show documents with sensitive information to people allowed access) and caching (save bandwidth and server pressure by using copies of documents to satisfy more than one request).
Cache is an important architectural constraint in the REST architectural style; that's the bit that puts the "web scale" in the world wide web.
OK, good news first -- HTTP has special rules for caching web requests with Authorization headers. Unless you deliberately opt-in to allowing the responses to be re-used, you don't have to worry about the caching.
Treating the two different views as two different documents, with different identifiers, makes almost everything easier -- the public documents are available to the public, the sensitive documents are locked down, operators looking at traffic in the log can distinguish the two different views because the logged identifier is different, and so on.
The thing that isn't easier: the case where someone is editing (POST/PUT/PATCH) one document and expecting to see the changes appear in the other. Cache-invalidation is one of the two hard problems in computer science. HTTP doesn't have a general purpose mechanism that allows the origin server to mark arbitrary documents for invalidation - successful unsafe requests will invalidate the effective-target-uri, the Location, the Content-Location, and that's it... and all three of those values have other important uses, making them more challenging to game.
Documents with different absolute-uri are different documents, and those documents, once copied from the origin server, can get out of sync.
This is the option I would normally choose - a client looking at cached copies of a document isn't seeing changes made by the server
OK, you decide that you don't like those trade offs. Can we do it with just one resource identifier? You immediately lose some clarity in your general purpose logs, but perhaps a bespoke logging system will get you past that.
You probably also have to dump public caching at this point. The only general purpose header that changes between the user allowed to look at the sensitive information and the user who isn't? That's the authorization header, and there's no "Vary" mechanism on authorization.
You've also got something of a challenge for the user who is making changes to the sensitive copy, but wants to now review the public copy (to make sure nothing leaked? or to make sure that the publicly visible changes took hold?)
There's no general purpose header for "show me the public version", so either you need to use a non standard header (which general purpose components will ignore), or you need to try standardizing something and then driving adoption by the implementors of general purpose components. It's doable (PATCH happened, after all) but it's a lot of work.
The other trick you can try is to play games with Content-Type and the Accept header -- perhaps clients use something normal for the public version (ex application/json), and a specialized type for the sensitive version (application/prs.example-sensitive+json).
That would allow the origin server to use the Vary header to indicate that the response is only suitable if the same accept headers are used.
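The Accept-based variant might look roughly like this. It is a sketch under assumptions: the Accept parsing is deliberately naive (no q-values or wildcards), the function name and the 403 for unauthorized callers are mine, and only the media type names come from the text above.

```python
PUBLIC_TYPE = "application/json"
SENSITIVE_TYPE = "application/prs.example-sensitive+json"
SENSITIVE_FIELDS = ("cardLast4", "cardBrand")


def render_user(user, accept, may_see_sensitive):
    """Return (status, headers, body) for a hypothetical GET /users/{id}."""
    requested = [part.split(";")[0].strip() for part in accept.split(",")]
    # Vary tells caches the response depends on the request's Accept header.
    headers = {"Vary": "Accept"}
    if SENSITIVE_TYPE in requested:
        if not may_see_sensitive:
            return 403, headers, None
        headers["Content-Type"] = SENSITIVE_TYPE
        return 200, headers, dict(user)
    headers["Content-Type"] = PUBLIC_TYPE
    public = {k: v for k, v in user.items() if k not in SENSITIVE_FIELDS}
    return 200, headers, public


user = {"firstName": "A", "lastName": "B", "cardLast4": "4242", "cardBrand": "visa"}
print(render_user(user, "application/json", True))
```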
Once again, general purpose components aren't going to know about your bespoke content-type, and are never going to ask for it.
The standardization route really isn't going to help you here, because the thing you really need is that clients discriminate between the two modes, where general purpose components today are trying to use that channel to advertise all of the standardized representations that they can handle.
I don't think this actually gets you anywhere that you can't fake more easily with a bespoke header.
REST leans heavily into the idea of using readily standardizable forms; if you think this is a general problem that could potentially apply to all resources in the world, then a header is the right way to go. So a reasonable approach would be to try a custom header, and get a bunch of experience with it, then try writing something up and getting everybody to buy in.
If you want something that just works with the out of the box web that we have today, use two different URI and move on to solving important problems.

Good practices for designing REST api relations

We are currently trying to design a REST API for our web services. We have two resources, 'record' and 'expedition'. As far as we know, a record can be associated with multiple expeditions, and an expedition can be associated with one record (but not necessarily).
When we create an expedition, and we want to "attach" it to a record, we have come to two solutions :
POST /expeditions?recordId=xxx
POST /records/xxx/expeditions
and a POST /expeditions WS to create expeditions independently.
My colleague suggested the first approach, but I found the second the most usual way to do so. I have not found articles on the web presenting the first approach as a good or bad design.
So, which solution is the good one for you? Which kind of consideration can help us to choose?
Thank you.
Which kind of consideration can help us to choose?
Think about cache-invalidation.
HTTP is about document transfer. We obtain information from the server by asking for a copy of a document; that request might be handled by the server itself, or it might be handled by a cache that has a valid copy of the document.
We send information to a server by proposing edits to documents - POST being the most common method used to do that (via HTML forms).
When an edit is successful, it follows that the previously cached copies of the document are out of date, and we would really prefer that they be replaced by the updated copy.
General purpose cache invalidation is kind of limited; in particular, it doesn't support arbitrary invalidation of documents. Only the target-uri, Location, and Content-Location are invalidated.
Therefore, when we are designing our resource interactions, we want to consider this limitation.
That usually means that the request that we use to change a document should have the same target-uri as the request to read that same document.
(Yes, that means that if we are going to have different kinds of edits to the document, all of the different edits share the same target-uri, and we disambiguate the edit by looking at other parts of the request -- for instance by parsing the body.)
POST /records/xxx/expeditions and a POST /expeditions WS to create expeditions independently.
That's not required - the server is permitted to apply changes to more than one document; HTTP constrains the meaning of the request, but does not constrain the effects.
That said, general purpose caches won't magically know that both documents have been edited. To some degree, part of what you are choosing in your design is which document needs to be refreshed now, and which ones can be out of date for a time (typically until the cached representation reaches its max age).
For the special case where your response to the successful edit is going to be a copy of the updated representation of the resource, you have a little bit more freedom, because we can use the Content-Location header to identify which document we are returning in the response, and that header is automatically invalidated.
POST /foo/bar
...
200 OK
Content-Location: /foo
In this sequence, general purpose caches will invalidate their cached copies of both /foo and /foo/bar.
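The invalidation rule can be sketched as a small function. This is a simplified reading of the HTTP caching rules, not a full cache implementation: it assumes invalidation applies only to non-error (2xx or 3xx) responses to unsafe methods, and that Location and Content-Location count only when they point at the same host as the target. The function name and the example.com host are hypothetical.

```python
from urllib.parse import urljoin, urlsplit

SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}


def invalidated_uris(method, target_uri, status, headers):
    """Return the URIs a general purpose cache would invalidate."""
    if method.upper() in SAFE_METHODS or not (200 <= status < 400):
        return set()
    invalidated = {target_uri}
    target_host = urlsplit(target_uri).netloc
    for name in ("Location", "Content-Location"):
        value = headers.get(name)
        if value is None:
            continue
        absolute = urljoin(target_uri, value)  # resolve relative references
        if urlsplit(absolute).netloc == target_host:
            invalidated.add(absolute)
    return invalidated


print(sorted(invalidated_uris("POST", "http://example.com/foo/bar", 200,
                              {"Content-Location": "/foo"})))
```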
(of course, there are still issues, in so far as we don't have a mechanism to return both the updated copy of /foo and the updated copy of /bar in a single response. So instead we need to look into other ideas, like server push).
Design the URL paths in a way that the resources can easily be retrieved.
Query string/parameter present in the URL mentioned in the first approach is typically used to locate a resource and perhaps a little counter intuitive to me.
The second approach, perhaps this would work as you are creating an expedition under an associated record xxx i.e. /records/xxx/expeditions. But it could get challenging in a scenario where an expedition is not related to any record.
Another alternative here is to link the expedition and record through the payload, i.e. have the record id XXX within the POST payload during the "expedition" resource creation. POST /expedition => This operation would return an expedition id in the response, as the resource is newly created. To retrieve the data, you could then use GET /expedition/XXX/record where XXX is the expedition id and you retrieve the record corresponding to XXX. You don't need to mention a record id in this case. You either get an associated record or you don't (in case there is no record tied to the expedition). To retrieve the expedition itself, the URL could be GET /expedition/XXX.

Should I store basic information (like table sort or page number) in query string or session storage?

I am developing a web application (a dashboard of some sort), and on some pages I have a couple of data tables. Each data table has its own page number, filter and sort. Although the data is fetched asynchronously in the background (so the page does not need to reload), I need to store this information (page, filter, sort) so that it persists, for example, across a page refresh. As far as I know, there are two ways to store this information:
In the session storage (local storage is not an option, because I don't want this information to persist between sessions)
In the query string
Up until now I have stored this information in the query string because I thought it had some advantages: e.g. the user can copy, bookmark, or share the URL with a colleague to discuss some data which they find interesting (for example on the 10th page of a table after performing some filtering and sorting). With the session storage I cannot do this.
But now I am extending the abilities of the filter function, and the information is becoming too long to use in the query string, and may even exceed the limit (I think 2048 characters, right?). Also, there may be more than a couple of tables on each page, so the query string would become even longer.
So first I wanted to know what is the best practice in this situation
And second, is that feature (being able to copy/bookmark/share the page as is) really that important, or not?
Note: the information that I'm talking about is nothing secret or sensitive. It's just table page number, table filter and table sort.
The 2048 limit doesn't seem to be an actual limit; see "What is the maximum possible length of a query string?"
About the best approach - I've personally always hated that most "new websites" do not support the feature you care about, so I'd personally encourage you to support it!
And finally, about the exact mechanism: firstly, the query string approach will keep working for a lot longer than 2048 characters, but I can see that copying and pasting it might be unwieldy and, depending on the medium used to share the URL, it can introduce mistakes.
So, from the user's perspective, I think the best experience would be given by storing those searches on the backend side and enabling a shorter URL for permalinking/sharing/bookmarking.
This new URL could be obtained by the user via a specific UI button (Share/Permalink) so you save the search in that moment and return the URL, or (best experience but harder and costlier to implement) you can be saving it continuously and sending back to the UI the generated URL and use Javascript to replace the URL for the nicer version (either always or just when people copy it).
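One way the backend side of such a Share/Permalink button could work, as a sketch only: the class name, the in-memory dictionary standing in for a database table, and the choice of a truncated SHA-256 digest as the short token are all assumptions.

```python
import hashlib
import json


class SavedSearches:
    """Stores table state server-side and hands back a short token."""

    def __init__(self):
        self._store = {}  # stand-in for a real database table

    def save(self, state):
        # Canonical JSON so that identical states map to the same token.
        canonical = json.dumps(state, sort_keys=True, separators=(",", ":"))
        token = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]
        self._store[token] = canonical
        return token

    def load(self, token):
        canonical = self._store.get(token)
        return None if canonical is None else json.loads(canonical)


searches = SavedSearches()
token = searches.save({"orders": {"page": 10, "sort": "-date"}})
print(f"/dashboard?s={token}", searches.load(token))
```

A truncated digest keeps the URL short at the cost of a small collision risk, so a real implementation would probably check for an existing, different entry before overwriting.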
Also consider: it may be good enough to just keep using the query string :-)
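If the query string stays, per-table state can be namespaced so that several tables share one URL. A small sketch, assuming a "table.key" naming scheme and hypothetical table names; note that values come back as strings after decoding.

```python
from urllib.parse import parse_qs, urlencode


def encode_table_state(tables):
    """Flatten {table: {page, sort, filter}} into a query string."""
    params = []
    for name, state in sorted(tables.items()):
        for key in ("page", "sort", "filter"):
            if key in state:
                params.append((f"{name}.{key}", str(state[key])))
    return urlencode(params)


def decode_table_state(query):
    """Rebuild the per-table state from a query string."""
    tables = {}
    for full_key, values in parse_qs(query).items():
        name, _, key = full_key.partition(".")
        tables.setdefault(name, {})[key] = values[-1]
    return tables


query = encode_table_state({"orders": {"page": 10, "sort": "-date",
                                       "filter": "status=open"}})
print(query)
print(decode_table_state(query))
```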

Authors fields in domino DDS REST api

I am building a Javascript Web application with a Domino back end, using the Domino DDS REST api to do POST, PUT, and GET operations against the database. I want to use Authors and Readers fields in documents to control which users can see which documents and to give users with Author access in the ACL the ability to edit documents they have created. When doing a POST of a new document (implemented by the save() method of a new Backbone model) is there a way to designate one or more fields as Readers or Authors?
Doing a GET on an existing document returns a JSON object with an attribute named '#authors' containing the names and roles in the Authors fields. Is this attribute read/write?
Can I populate #authors with the desired values before doing a POST to have these values control author access?
My colleague says the Domino REST api makes no provision for setting Authors and Readers fields, and that this functionality can only be done through Java servlets. Is this right?
I'm not familiar with the Domino DDS REST API, but from what I gather it is doubtful that, when POSTing a document, you get to choose the type of the fields. I suspect they all end up as text.
What you could do however is to link the action of your form to a Domino agent which, using the backend Java or LotusScript API, will be able to control precisely the final shape of your document, thereby allowing you to fully utilize the powerful security model of Domino.
Nevertheless, keep in mind that at some point, your users will have to authenticate against the Domino Directory. Depending on where your users originally log in, you may need to talk to your Domino administrator to sort out a Single Sign-On scheme linked to your other directory.
Alternatively, you could take advantage of the fact that Domino is also a web server and an application server: you can build your HTML form in there, starting with a Domino form (simple) or an XPage (a bit more complex).
You may want to have a look here.
Some would say that you could even build your whole application in Domino, as using it as a mere back-end data repository is akin to using a Rolls-Royce to ferry potatoes, but I suppose that you and your organization have good reasons to do so.
Finally you could also completely ditch Domino and use another nosql database like MongoDB, but that would only displace your access control problem.
You can post data back to Domino and nominate a form to use. If you use the 'computewithform=true' parameter and the form design includes the authors/reader fields you need, this will set the field flags correctly and automatically.
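A sketch of what such a POST might look like when built from Python. The /api/data/documents path and the form parameter are my assumptions about how the data service is addressed; the host, database file, form name, and field names are hypothetical. The request is only constructed here, not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_create_request(host, database, form, fields):
    """Build a POST that creates a document and asks Domino to compute with the form."""
    query = urlencode({"form": form, "computewithform": "true"})
    url = f"{host}/{database}/api/data/documents?{query}"
    body = json.dumps(fields).encode("utf-8")
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/json"})


req = build_create_request("https://example.com", "app.nsf", "Report",
                           {"Subject": "Quarterly numbers"})
print(req.get_method(), req.full_url)
```

The Authors and Readers fields themselves would be defined on the form design, so they do not appear in the JSON body.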