Can I send an API response before successful persistence of data? - api

I am currently developing a Microservice that is interacting with other microservices.
The problem now is that those interactions are really time-consuming. I already implemented concurrent calls via Uni and uses caching where useful. Now I still have some calls that still need some seconds in order to respond and now I thought of another thing, which I could do, in order to improve the performance:
Is it possible to send a response before the sucessfull persistence of data? I send requests to the other microservices where they have to persist the results of my methods. Can I already send the user the result in a first response and make a second response if the persistence process was sucessfull?
With that, the front-end could already begin working even though my API is not 100% finished.
I saw that there is a possible status-code 207 but it's rather used with streams where someone wants to split large files. Is there another possibility? Thanks in advance.

"Is it possible to send a response before the sucessfull persistence of data? Can I already send the user the result in a first response and make a second response if the persistence process was sucessfull? With that, the front-end could already begin working even though my API is not 100% finished."
You can and should, but it is a philosophy change in your API and possibly you have to consider some edge cases and techniques to deal with them.
In case of a long running API call, you can issue an "ack" response, a traditional 200 one, only the answer would just mean the operation is asynchronous and will complete in the future, something like { id:49584958, apicall:"create", status:"queued", result:true }
Then you can
poll your API with the returned ID to see if the operation that is still ongoing, has succeeded or failed.
have a SSE channel (realtime server side events) where your server can issue status messages as pending operations finish
maybe using persistent connections and keepalives, or flushing the response in the middle, you can achieve what you point out, ie. like a segmented response. I am not familiar with that approach as I normally go for the suggesions above.
But in any case, edge cases apply exactly the same: For example, what happens if then through your API a user issues calls dependent on the success of an ongoing or not even started previous command? like for example, get information about something still being persisted?
You will have to deal with these situations with mechanisms like:
Reject related operations until pending call is resolved "server side": Api could return ie. a BUSY error informing that operations are still ongoing when you want to, for example, delete something that still is being created.
Queue all operations so the server executes all them sequentially.
Allow some simulatenous operations if you find they will not collide (ie. create 2 unrelated items)

Related

Need suggestions: Send multiple images to backend, perform upload operation in backend, send response

I need some best practice guidelines for a backend service in a scenario like this one:
UI sends multiple images for uploading to the backend service
Backend service receives all of the images and processes upload to storage one by one
There can be failure in 1 or multiple image upload
My question is how do I send the response towards UI if my backend service is unable to upload 1 or more file(s).
One way can be to send failed and successful image link together in a JSON response body. So the UI knows about the failure and handles it in its own way.
Another way can be to send only the successfully uploaded images' link which is the best case scenario.
Any suggestions will be welcomed with some reference links.
Use an Orchestrator - something specific that can coordinate multiple actions and provide a meaningful result back to the caller.
This might be as simple as a component sitting in the UI that orchestrates calls to the backend. The UI component and the backend service might be designed as parts of a cohesive solution, or the UI component might simply act as a type of client/proxy/facade to some random backend service.
UI calls the orchestrator with references to all the images it needs uploading.
The orchestrator works through the items, uploading each as you prefer (sequentially or in parallel, etc). For each file, handle errors however you prefer - e.g. try once and die gracefully on failure; put errors into a queue or some other mechanism for retry (how many times is up to you); etc.
Based on rules internal to the orchestrator, return status to the caller.
For potentially long-running processes (like file uploads) make sure the call to the orchestrator is asynchronous.
Rather than only returning "complete" result at the end, the orchestrator might provide a simple status back, allowing callers to get some idea of where processing is at. For example, you might have a call-back (from the orchestrator to it's caller) that simply emits very simple statuses like: processing, failed and complete. A more complex solution would be for the orchestrator to return more specific info like %complete and detailed error info.
Have a look at how the big cloud providers do complex file uploads by reading their documentation and studying their API's.
I need some best practice guidelines for a backend service
In no particular order:
Keep it as simple as possible - generally, the fewer moving parts the better. E.g. pay attention to the Single Responsibility Principle (SRP).
Clean up after yourself. If the upload service generates any data - make sure you have a clean-up process so you don't end up with mountains of un-needed data lying around, especially stuff like image files. If you design an upload solution that maintains state (which is independent of what happens to the images once they are uploaded) then you'll be storing data which probably won't be needed once the images are all processed.
Think about support - not just developer debugging but also operational support. Getting your solution into production is not the end result, it's just the beginning.
If designing this solution across teams (e.g. frontend and backend teams) make sure both teams are involved in the design. If the backend team can't provide a solution that works for the frontend team then it's not going to end well.
Think about the likely error scenarios and how can you handle them.
This isn't really just a question of best practice, as there are multiple ways you could implement it, more than one of which could be valid. This is actually an architecture and design question, with more than one valid answer, hence I don't think it fits as a Stack Overflow question and you will not get references to any one correct approach.
That said, by way of an answer I will outline what I think you need. At a very high level, and not necessarily in this order but taking these factors into account, I would:
Design the UI process flow. For example, you may decide that the user process will have several stages:
User selects first image for upload;
User selects each subsequent image for upload;
User presses some kind of "Go" button after selecting all images;
System now uploads the batch, and user receives a response confirming success or otherwise;
User has option to click through to detailed success/error details.
Design the required success/error reports
Design the data needed to support the overall functionality
Provide one or more APIs giving the upload function and the report function(s) the CRUD access they need to this data
If you hit any specific technical issues at any stage, then please post a new questions accordingly as you go.
As to the point you mentioned, how to send the UI response, there is more than one valid way but I would return a basic success/falure response initially, containing only minimal details such as number of successes, and return more details in further messages in response to user actions (such as clicking through to detailed success/error details), at which point I would retrieve the requested error details from the database.
As I said at the start of my answer, I don't think your question can be answered just in terms of best practices, as it's a whole architecture and design question, but I hope my answer helps you along this path.

how to design REST API to ask server to wait for resource version to arrive on GET requests?

I work on splitting monoliths into microservices. With the monolith, I had a single source of truth and can just GET /resources/123 right after the PATCH /resources/123 and be sure that the database has the up-to-date data I need.
With microservices and CQRS in place, there is a risk that the query service has not seen yet the latest update to the record when I perform a GET request.
What is the best or standard approach to making sure that the client receives back the up-to-date value? I know that the client may compare resource versions that he receives after PATCH and after GET and retry requests, but is there a known API design to tell the server something like GET /resources/123 and wait up to 5 sec for the resource version 45 or bigger to arrive?
Since a PATCH request allows a response body, to my mind there's nothing wrong with the response including the object after patching. The requestor who sent the PATCH can use the response in lieu of a GET; for others, the eventual consistency delay for the GET isn't observable (since they don't know when the PATCH was issued).
CQRS means to not contort your write model for the sake of reads. If there's a read that is easily performed based on the write model, that read can be done against the write model.
Generally a better design might be for the PATCH request to delay its own response, if that's an option.
However, your GET request can also just 'hang' until it's ready. This generally feels like a better design than polling.
A client could indicate to the server how long it's willing to wait using a Prefer: wait= header: https://datatracker.ietf.org/doc/html/rfc7240#section-4.3
This could be used both for the GET or the PATCH request.
I don't think there's a standard HTTP way to say: this resource is not available right now, but will be in the future. However, there is a standard HTTP header to tell clients when to retry the request:
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Retry-After
This is mainly used for 429 and 503 errors but neither seem appropriate here.
Frankly this is one of the first thing I've heard in a while that could be a good new HTTP status code. 425 Too Early exists but its a different use-case.

How to keep an API idempotent while receiving multiple requests with the same id at the same time?

From a lot of articles and commercial API I saw, most people make their APIs idempotent by asking the client to provide a requestId or idempotent-key (e.g. https://www.masteringmodernpayments.com/blog/idempotent-stripe-requests) and basically store the requestId <-> response map in the storage. So if there's a request coming in which already is in this map, the application would just return the stored response.
This is all good to me but my problem is how do I handle the case where the second call coming in while the first call is still in progress?
So here is my questions
I guess the ideal behaviour would be the second call keep waiting until the first call finishes and returns the first call's response? Is this how people doing it?
if yes, how long should the second call wait for the first call to be finished?
if the second call has a wait time limit and the first call still hasn't finished, what should it tell the client? Should it just not return any responses so the client will timeout and retry again?
For wunderlist we use database constraints to make sure that no request id (which is a column in every one of our tables) is ever used twice. Since our database technology (postgres) guarantees that it would be impossible for two records to be inserted that violate this constraint, we only need to react to the potential insertion error properly. Basically, we outsource this detail to our datastore.
I would recommend, no matter how you go about this, to try not to need to coordinate in your application. If you try to know if two things are happening at once then there is a high likelihood that there would be bugs. Instead, there might be a system you already use which can make the guarantees you need.
Now, to specifically address your three questions:
For us, since we use database constraints, the database handles making things queue up and wait. This is why I personally prefer the old SQL databases - not for the SQL or relations, but because they are really good at locking and queuing. We use SQL databases as dumb disconnected tables.
This depends a lot on your system. We try to tune all of our timeouts to around 1s in each system and subsystem. We'd rather fail fast than queue up. You can measure and then look at your 99th percentile for timings and just set that as your timeout if you don't know ahead of time.
We would return a 504 http status (and appropriate response body) to the client. The reason for having a idempotent-key is so the client can retry a request - so we are never worried about timing out and letting them do just that. Again, we'd rather timeout fast and fix the problems than to let things queue up. If things queue up then even after something is fixed one has to wait a while for things to get better.
It's a bit hard to understand if the second call is from the same client with the same request token, or a different client.
Normally in the case of concurrent requests from different clients operating on the same resource, you would also want to implementing a versioning strategy alongside a request token for idempotency.
A typical version strategy in a relational database might be a version column with a trigger that auto increments the number each time a record is updated.
With this in place, all clients must specify their request token as well as the version they are updating (typical the IfMatch header is used for this and the version number is used as the value of the ETag).
On the server side, when it comes time to update the state of the resource, you first check that the version number in the database matches the supplied version in the ETag. If they do, you write the changes and the version increments. Assuming the second request was operating on the same version number as the first, it would then fail with a 412 (or 409 depending on how you interpret HTTP specifications) and the client should not retry.
If you really want to stop the second request immediately while the first request is in progress, you are going down the route of pessimistic locking, which doesn't suit REST API's that well.
In the case where you are actually talking about the client retrying with the same request token because it received a transient network error, it's almost the same case.
Both requests will be running at the same time, the second request will start because the first request still has not finished and has not recorded the request token to the database yet, but whichever one ends up finishing first will succeed and record the request token.
For the other request, it will receive a version conflict (since the first request has incremented the version) at which point it should recheck the request token database table, find it's own token in there and assume that it was a concurrent request that finished before it did and return 200.
It's seems like a lot, but if you want to cover all the weird and wonderful failure modes when your dealing with REST, idempotency and concurrency this is way to deal with it.

How to write a middle-tier http API endpoint that can stream results as they arrive to the client?

The scenario is this - I have a frontend web-server that I'm writing in node.js. I have an as-yet-unwritten middle-tier internal-API layer written in, well, anything. The internal-API is the only thing allowed to talk to the data-store (which happens to be a relational database).
Disclaimer: I'm a node.js beginner.
node.js wants to do data-access asynchronously - that makes calls like Database.query.all inefficient, since the response callback wouldn't start until the whole list has been assembled. Documentation I've read suggests that instead, it'd be better to stream results one at a time to the client.
I would like to know how to write the frontend and middle-tier http internal-API such that I can take advantage of node.js' asynchronicity, here.
I guess the question is "how do I stream structured data over http"? I guess that's the feature of the internal API that I'm asking for support for.
Should I:
Get the frontend to ask for a list of IDs, then issue one request each to the backend? Sounds crude and chatty, plus I don't see a guarantee that the requests will return in the order that I want, so I'd have to wait 'til I had everything back at the frontend anyway..?
Get the frontend to make a series of requests against the internal API for pages of data, and treat each chunk as a stream-segment...?
Fetch only enough data for the first screen's worth, then request for subsequent chunks, writing each one to the end of the list as it arrives?
something cleverer!?
(Note: please don't say "get rid of the middle-tier so you can talk to the database directly" - that's not an option)
I am not sure what exactly you mean by "streaming"; from the ideas you give, it could be either interpreted as some HTTP server push or long polling technique, or simply making subsequent XHR requests.
Since you're using node, I recommend Socket.io, which allows you to really push data to the browser whenever you want.
If you chose to go with XHRs, simply tell the browser what to request next.
If that doesn't fit you, and you want to use server push or long polling, response.write() seems the way to go. But you will probably run into problems with request timeouts and such.

Is a status method necessary for an API?

I am building an API and I was wondering is it worth having a method in an API that returns the status of the API whether its alive or not?
Or is this pointless, and its the API users job to be able to just make a call to the method that they need and if it doesn't return anything due to network issues they handle it as needed?
I think it's quite useful to have a status returned. On the one hand, you can provide more statuses than 'alive' or not and make your API more poweful, and on the other hand, it's more useful for the user, since you can tell him exactly what's going on (e.g. 'maintainance').
But if your WebService isn't available at all due to network issues, then, of course, it's up to the user to catch that exception. But that's not the point, I guess, and it's not something you could control with your API.
It's useless.
The information it returns is completely out of date the moment it is returned to you because the service may fail right after the status return call is dispatched.
Also, if you are load balancing the incoming requests and your status request gets routed to a failing node, the reply (or lack thereof) would look to the client like a problem with the whole API service. In the meantime, all the other nodes could be happily servicing requests. Now your client will think that the whole API service is down but subsequent requests would work just fine (assuming your load balancer would remove the failed node or restart it).
HTTP status codes returned from your application's requests are the correct way of indicating availability. Your clients of course have to be coded to tolerate and handle them.
What is wrong with standard HTTP response status codes? 503 Service Unavailable comes to mind. HTTP clients should already be able to handle that without writing any code special to your API.
Now, if the service is likely to be unavailable frequently and it is expensive for the client to discover that but cheap for the server, then it might be appropriate to have a separate 'health check' URL that can quickly let the client know that the service is available (at the time of the GET on the health check URL).
It is not necessary most of the time. At least when it returns simple true or false. It just makes client code more complicated because it has to call one more method. Even if your client received active=true from service, next useful call may still fail. Let you client make the calls that they need during normal execution and have them handle network, timeout and HTTP errors correctly. Very useful pattern for such cases is called Circuit Breaker.
The reasons where status check may be useful:
If all the normal calls are considered to be expensive there may be an advantage in first calling lightweight status-check method (just to avoid expensive call).
Service can have different statuses and client can change its behavior depending on these statuses.
It might also be worth looking into stateful protocols like XMPP.