Why is there a "Get Status" step when implementing Async APIs w/ Polling? - api

Often times I see the following for polling:
1. Send a request and get back a unique ID.
2. Poll a "Status" endpoint, which tells the client when the request has been completed.
3. Send a request to fetch the response.
Why can't steps (2) and (3) be combined?
If the response isn't ready yet, the combined endpoint returns no result, only a status saying so.
If it is ready, it returns the response.
Why are (2) and (3) often separate steps?

"Is it ready" is a boolean true/false, while a response can be anything. In general it's easier to call "is it ready" and then write logic to handle true and false than to write logic that fetches a response and has to work out whether it means "not ready" or is the data type you need.
In this way, the logic is all client side, but if you combined them you'd need logic on both client and server (both to say it's not ready and to handle the actual response). You could do it, but keeping them separate just keeps things neater.
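A minimal sketch of the three-step client flow described above, using an in-memory stand-in instead of real HTTP calls. The class `FakeJobServer`, its methods, and the `run_job` helper are all hypothetical names chosen for illustration; a real client would also sleep between polls.

```python
import itertools


class FakeJobServer:
    """In-memory stand-in for the three endpoints (hypothetical, no HTTP)."""

    def __init__(self, polls_until_done=3):
        self._ids = itertools.count(1)
        self._jobs = {}
        self._polls_until_done = polls_until_done

    def submit(self, payload):
        # Step 1: accept the request and hand back a unique ID.
        job_id = next(self._ids)
        self._jobs[job_id] = {"payload": payload, "polls": 0}
        return job_id

    def is_ready(self, job_id):
        # Step 2: the status check answers only true or false.
        job = self._jobs[job_id]
        job["polls"] += 1
        return job["polls"] >= self._polls_until_done

    def fetch_result(self, job_id):
        # Step 3: only meaningful once is_ready() has returned True.
        return self._jobs[job_id]["payload"].upper()


def run_job(server, payload, max_polls=10):
    job_id = server.submit(payload)
    for _ in range(max_polls):
        if server.is_ready(job_id):
            return server.fetch_result(job_id)
    raise TimeoutError(f"job {job_id} not ready after {max_polls} polls")
```

The client only ever branches on a boolean in the loop, and the result handling in `fetch_result` never has to consider a "not ready" case.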

This pattern is generally defined by the HTTP 202 status code, which is the HTTP protocol's mechanism of initiating asynchronous requests.
We can think of a 202 response as indicating that a job has been created. If and when that job executes, it may (or may not) produce some business entity. Presumably the client receiving a 202 is ultimately interested in that business entity, which may (or may not) exist in the future, but certainly does not exist now, hence the 202 response.
So one simple reason for returning a pointer to a job status is because the job status exists now and we prefer to identify things that exist now rather than things that may (or may not) exist in the future. The endpoint receiving the request may not even be capable of generating an ID for the (future) business entity.
Another reason is status codes. A status endpoint returns a custom job status capable of describing unlimited potential states in which a job can exist. These job states are beyond the scope of HTTP status codes. The standard codes, defined in the IETF's HTTP specifications (currently RFC 9110), already have precise definitions, and there simply is no standard HTTP status code that means "keep polling".
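As a rough illustration of the two points above, the sketch below models a job whose states live in the response body rather than in the HTTP status code. The state names, the `/jobs/{id}` path, and the use of a `Location` header on the 202 response are assumptions made for the example, not something the HTTP specification mandates.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class JobState(Enum):
    # Application-level states; none of these map onto an HTTP status code.
    QUEUED = "queued"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    CANCELLED = "cancelled"


@dataclass
class Job:
    job_id: str
    state: JobState = JobState.QUEUED
    result_url: Optional[str] = None  # unknown until the job has succeeded


def accept(job):
    """Reply to the initial request: 202 plus a pointer to the job status."""
    return 202, {"Location": f"/jobs/{job.job_id}"}, {"state": job.state.value}


def status(job):
    """The job status resource exists from the start, so it answers 200."""
    body = {"state": job.state.value}
    if job.state is JobState.SUCCEEDED:
        body["result"] = job.result_url
    return 200, {}, body
```

The status resource can be identified immediately, while the URL of the business entity only appears once the job has actually produced it.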

The reason is that they are different resources from REST perspective.
Let's examine this a bit through an example:
If you want to place an order, you first have to submit an order request.
Then there is a lengthy, asynchronous process in the background which checks the payment's validity, the items' availability in inventory, etc.
If everything goes fine, there will be an order aggregate with some subelements (like order items, shipping address, etc.).
From REST perspective:
There is a POST /orders endpoint to place an order
There is a GET /order_requests/{id} endpoint to retrieve the order request
There is a GET /orders/{id} endpoint to retrieve the order details
Once the order and all related sub-resources have been created, the second endpoint (GET /order_requests/{id}) usually responds with a 303 See Other status code to redirect the consumer to GET /orders/{id}.
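The three endpoints could be sketched as plain functions that return a (status, headers, body) tuple. This is only an illustration: the in-memory dictionaries, the `finish_processing` helper standing in for the background process, and the choice of 202 for the POST reply are assumptions, not part of the description above.

```python
import itertools

_ids = itertools.count(1)
order_requests = {}  # request id -> {"items": [...], "order_id": None or int}
orders = {}          # order id -> order details


def post_orders(items):
    # POST /orders: record the request and point the client at it.
    request_id = next(_ids)
    order_requests[request_id] = {"items": items, "order_id": None}
    headers = {"Location": f"/order_requests/{request_id}"}
    return 202, headers, {"request_id": request_id}


def finish_processing(request_id):
    # Stands in for the background process completing successfully.
    request = order_requests[request_id]
    order_id = next(_ids)
    orders[order_id] = {"items": request["items"]}
    request["order_id"] = order_id
    return order_id


def get_order_request(request_id):
    # GET /order_requests/{id}: a different resource from the order itself.
    request = order_requests.get(request_id)
    if request is None:
        return 404, {}, None
    if request["order_id"] is None:
        return 200, {}, {"status": "processing"}
    return 303, {"Location": f"/orders/{request['order_id']}"}, None


def get_order(order_id):
    # GET /orders/{id}: exists only after processing has finished.
    if order_id not in orders:
        return 404, {}, None
    return 200, {}, orders[order_id]
```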

Related

Can I send an API response before successful persistence of data?

I am currently developing a Microservice that is interacting with other microservices.
The problem now is that those interactions are really time-consuming. I already implemented concurrent calls via Uni and use caching where useful. I still have some calls that need several seconds to respond, so I thought of another thing I could do to improve the performance:
Is it possible to send a response before the successful persistence of data? I send requests to the other microservices, where they have to persist the results of my methods. Can I already send the user the result in a first response and make a second response if the persistence process was successful?
With that, the front-end could already begin working even though my API is not 100% finished.
I saw that there is a possible status-code 207 but it's rather used with streams where someone wants to split large files. Is there another possibility? Thanks in advance.
"Is it possible to send a response before the successful persistence of data? Can I already send the user the result in a first response and make a second response if the persistence process was successful? With that, the front-end could already begin working even though my API is not 100% finished."
You can and should, but it is a philosophy change in your API and possibly you have to consider some edge cases and techniques to deal with them.
For a long-running API call, you can issue an "ack" response, a traditional 200 one (or 202 Accepted, which is defined for this case), where the answer only means that the operation is asynchronous and will complete in the future, something like { id:49584958, apicall:"create", status:"queued", result:true }
Then you can
poll your API with the returned ID to see whether the ongoing operation has succeeded or failed.
have an SSE channel (server-sent events) where your server can issue status messages as pending operations finish
maybe, using persistent connections and keepalives, or flushing the response in the middle, you can achieve what you describe, i.e. something like a segmented response. I am not familiar with that approach, as I normally go for the suggestions above.
But in any case, the same edge cases apply: for example, what happens if a user then issues calls through your API that depend on the success of a previous command that is still ongoing or has not even started? For example, getting information about something that is still being persisted?
You will have to deal with these situations with mechanisms like:
Reject related operations until the pending call is resolved server side: the API could return e.g. a BUSY error informing the client that operations are still ongoing when it tries, for example, to delete something that is still being created.
Queue all operations so the server executes them sequentially.
Allow some simultaneous operations if you find they will not collide (e.g. creating 2 unrelated items)
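A small sketch of the "ack now, persist later" idea combined with the first mechanism (rejecting dependent operations while a call is pending). `ItemService` and its method names are hypothetical, and `finish_persist` merely stands in for the slow downstream persistence; the response shape follows the example ack above.

```python
class ItemService:
    """Hypothetical service that acknowledges creates before persisting them."""

    def __init__(self):
        self._next_id = 1
        self._staged = {}  # acknowledged but not yet persisted
        self._items = {}   # persisted

    def create(self, data):
        item_id = self._next_id
        self._next_id += 1
        self._staged[item_id] = data
        # The "ack": the work is queued, not done.
        return {"id": item_id, "apicall": "create", "status": "queued", "result": True}

    def finish_persist(self, item_id):
        # Stands in for the downstream persistence completing.
        self._items[item_id] = self._staged.pop(item_id)

    def status(self, item_id):
        # What a polling client would see for this ID.
        if item_id in self._staged:
            return "queued"
        return "done" if item_id in self._items else "unknown"

    def delete(self, item_id):
        if item_id in self._staged:
            # Dependent operation on an item still being created: reject it.
            return {"id": item_id, "apicall": "delete", "status": "busy", "result": False}
        if item_id not in self._items:
            return {"id": item_id, "apicall": "delete", "status": "unknown", "result": False}
        del self._items[item_id]
        return {"id": item_id, "apicall": "delete", "status": "done", "result": True}
```

A client that receives "busy" can retry the delete after polling `status` until it reports "done".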

Why should we not return a response in an HTTP DELETE request?

At my company we have a DELETE endpoint which (obviously) deletes some data that the user selects on the client side. Currently we make another GET request to refresh the data on this page after the deletion request completes successfully (so we don't have two conflicting states on the client and server side).
A developer at our company wants to now change the DELETE endpoint to return the updated state after the data is successfully deleted, so we can just use this response value to update our client side.
This makes sense to me (As we can avoid another GET call) but from other threads the general consensus is that we should return an empty response.
Can someone explain to me why this is the case? I've looked around, and it seems people are mostly saying 'because that is how REST is supposed to be', which doesn't really seem like a good reason.

What happens to a transaction if an HTTP POST fails during the response?

I'm trying to communicate with cryptocurrency exchanges through their HTTP APIs.
When I make a buy/sell order, I would like to handle IO exceptions during the HTTP response and be sure not to duplicate the order.
Some exchanges allow a Client Order ID, so if an exception occurs, I can just check whether the order exists using my own generated ID, and try again if it doesn't.
But for exchanges which don't have this feature, I can't check whether the order exists because I don't receive any result (just a connection error).
So is there any rule in the HTTP world that says: if a POST response is not correctly sent back to the user, don't make the transaction?
(In that case I could just try again and be sure not to duplicate the order.)
Thanks in advance :)
Refer to this: https://www.rfc-editor.org/rfc/rfc7231#section-4.2
If the POST response times out, or you get an IO exception, it is hard to tell whether the execution failed or completed successfully on the service you were calling.
POST is not idempotent (multiple calls are not equivalent to a single call), but GET is. I would suggest doing a GET for the data (to check whether the exchange created and executed the order) and, based on the result, deciding whether to make another POST call.
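A sketch of the "check with a GET before retrying the POST" suggestion, for the case where the exchange accepts a client-generated order ID. `FlakyExchange` is a made-up in-memory stand-in that executes the first order but loses the response; its methods do not correspond to any real exchange API.

```python
import uuid


class FlakyExchange:
    """Hypothetical exchange: executes the first order, then drops the response."""

    def __init__(self):
        self.orders = []
        self._drop_next_response = True

    def post_order(self, client_order_id, side, qty):
        # Not idempotent: every call that reaches here records an order.
        record = {"client_order_id": client_order_id, "side": side, "qty": qty}
        self.orders.append(record)
        if self._drop_next_response:
            self._drop_next_response = False
            raise ConnectionError("response lost after the order was executed")
        return record

    def get_order(self, client_order_id):
        # Safe to repeat: a lookup changes nothing on the exchange.
        for record in self.orders:
            if record["client_order_id"] == client_order_id:
                return record
        return None


def place_order_once(exchange, side, qty, max_attempts=3):
    client_order_id = str(uuid.uuid4())
    for _ in range(max_attempts):
        try:
            return exchange.post_order(client_order_id, side, qty)
        except ConnectionError:
            # Do not retry blindly; first check whether the order went through.
            existing = exchange.get_order(client_order_id)
            if existing is not None:
                return existing
    raise RuntimeError("order could not be confirmed")
```

Without a client order ID, the lookup would have to match on other fields such as side, quantity, and time, which is presumably less reliable.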

JSON:API HTTP status code for duplicate content creation avoidance

Suppose I have an endpoint that supports creating new messages. In the backend I avoid creating the same message twice, in case the user pushes the button twice (or the frontend app behaves strangely).
Currently, for the duplicate action, my server responds with a 303 See Other pointing to the URL of the previously created resource. But I see I could also use a 302 Found. Which one seems more appropriate?
Note that the duplicate avoidance strategy can be more complex (eg for an appointment we would check whether the POSTed appointment is within one hour of an existing one)
I recommend using HTTP Status Code 409: Conflict.
The 3XX family of status codes are generally used when the client needs to take additional action, such as redirection, to complete the request. More generally, status codes communicate back to the client what actions they need to take or provide them with necessary information about the request.
Generally, for these kinds of "bad" requests (such as repeated requests failing due to duplication), you would respond with a 4xx status code to indicate to the client that there was an issue with their request and that it was not processed. You could use the response body to communicate the issue more precisely.
Also to consider, if the request is just "fire and forget" from the client then as long as you've handled the case for duplication and no more behavior is needed from the client it might be acceptable to send a 200 response. This tells the client "the request was received and handled appropriately, nothing more you need to do." However this is a bit deceptive as it does not indicate the error to the client or allow for any modified behavior.
The JSON:API specification defines:
A server MUST return 409 Conflict when processing a POST request to create a resource with a client-generated ID that already exists.
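A minimal sketch of that rule for a create handler that accepts a client-generated ID. The `messages` store and the `create_message` function are hypothetical; the response bodies only loosely follow the JSON:API document shape.

```python
messages = {}  # hypothetical in-memory store: message id -> text


def create_message(message_id, text):
    """Handle POST /messages with a client-generated ID."""
    if message_id in messages:
        # Duplicate creation attempt: report a conflict, change nothing.
        error = {
            "status": "409",
            "title": "Conflict",
            "detail": f"message {message_id} already exists",
        }
        return 409, {"errors": [error]}
    messages[message_id] = text
    resource = {"type": "messages", "id": message_id, "attributes": {"text": text}}
    return 201, {"data": resource}
```

Pressing the button twice with the same ID then yields 201 followed by 409, and the stored message is left untouched.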

Is a status method necessary for an API?

I am building an API and I was wondering: is it worth having a method in the API that returns the status of the API, i.e. whether it's alive or not?
Or is this pointless, and it's the API user's job to just make a call to the method they need and, if it doesn't return anything due to network issues, handle that as needed?
I think it's quite useful to have a status returned. On the one hand, you can provide more statuses than 'alive' or not and make your API more powerful, and on the other hand, it's more useful for the users, since you can tell them exactly what's going on (e.g. 'maintenance').
But if your WebService isn't available at all due to network issues, then, of course, it's up to the user to catch that exception. But that's not the point, I guess, and it's not something you could control with your API.
It's useless.
The information it returns is completely out of date the moment it is returned to you because the service may fail right after the status return call is dispatched.
Also, if you are load balancing the incoming requests and your status request gets routed to a failing node, the reply (or lack thereof) would look to the client like a problem with the whole API service. In the meantime, all the other nodes could be happily servicing requests. Now your client will think that the whole API service is down but subsequent requests would work just fine (assuming your load balancer would remove the failed node or restart it).
HTTP status codes returned from your application's requests are the correct way of indicating availability. Your clients of course have to be coded to tolerate and handle them.
What is wrong with standard HTTP response status codes? 503 Service Unavailable comes to mind. HTTP clients should already be able to handle that without writing any code special to your API.
Now, if the service is likely to be unavailable frequently and it is expensive for the client to discover that but cheap for the server, then it might be appropriate to have a separate 'health check' URL that can quickly let the client know that the service is available (at the time of the GET on the health check URL).
It is not necessary most of the time, at least when it returns a simple true or false. It just makes client code more complicated because it has to call one more method. Even if your client received active=true from the service, the next useful call may still fail. Let your clients make the calls they need during normal execution and have them handle network, timeout and HTTP errors correctly. A very useful pattern for such cases is called Circuit Breaker.
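A rough sketch of the Circuit Breaker idea: after a number of consecutive failures the client stops calling the service for a cool-down period instead of asking a status endpoint first. The class name, parameter names, and thresholds are illustrative assumptions; production libraries usually model the half-open state more carefully.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self._clock = clock        # injectable so tests can control time
        self._failures = 0
        self._opened_at = None     # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; call skipped")
            # Cool-down elapsed: let one trial call through (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Normal calls double as the availability check here, so no separate status request is needed.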
The reasons where status check may be useful:
If all the normal calls are considered expensive, there may be an advantage in first calling a lightweight status-check method (just to avoid an expensive call).
Service can have different statuses and client can change its behavior depending on these statuses.
It might also be worth looking into stateful protocols like XMPP.