Should Internal Server Error be documented in swagger? - api

I am writing a new API and documenting it using Swagger/OpenAPI. It seems to be a good standard to document error responses, that the developer can expect to encounter.
But I cannot find any guide lines or best practices about Internal Server Error. Every path could in theory throw an unhandled exception. I do not expect it to happen, but it might. Should all paths have a response with status code 500 "Internal Server Error" or should I only document responses the developer can do anything about, i.e. 2xx, 3xx and 4xx?

The offical documentation shows an example for specifying all 5xx status codes in the responses section, but it does not go into details about the specific status code, or the message returned. It also mentions that the API specification should only contain known errors:
Note that an API specification does not necessarily need to cover all possible HTTP response codes, since they may not be known in advance. However, it is expected to cover successful responses and any known errors. By “known errors” we mean, for example, a 404 Not Found response for an operation that returns a resource by ID, or a 400 Bad Request response in case of invalid operation parameters.
You could follow the same approach and specify it like in the example. I think it's not important or even recommended to try to describe it more specifically, since you might not be able to cover all cases anyway and the client is not expected to act on the message returned for internal server errors (possibly other than retrying later). So for example, I would not recommend specifying a message format for it.
Omitting any responses with 5xx HTTP error codes makes sense as well.

Related

Rest API: Response HTTP status code for applicative semantic error

Suppose to have a REST API which updates the stock of some products in an e-commerce portal:
URL : /products/stock
METHOD : PUT
BODY :
{
"PRD001": 3,
"PRD002": 2
}
Where the request body is a map made of <<PRODUCT_CODE>> : <<USER_REQUIRED_QUANTITY>> entries
At some point, the server receives a well-formed syntactic request, but the logic behind the API fails because:
One or more of the sent PRODUCT CODES do not exist.
The USER_REQUIRED_QUANTITY requested for one of the products having PRODUCT_CODE is unavailable because of insufficient stock.
Which HTTP CODE should the REST API return for these "semantic applicative errors"?
In my opinion:
It shouldn't return 400 - BAD REQUEST because the REQUEST is well-formed from a syntactic perspective.
In the case of an inexistent product, it shouldn't return 404 -NOT FOUND because the resource is related to a stock and not to a specific product. Returning 404 - NOT FOUND could lead the client intto an error.
It could return a 409 - CONFLICT (The request could not be completed due to a conflict with the current state of the resource)
It could return a 422 Unprocessable Entity (The server understands the content type and syntax of the request entity, but still server is unable to process the request for some reason). Anyway, this status code is part of WebDAV specific and not part of the HTTP)
What do you think about this specific use case?
And, in a more general way, in which way do you handle HTTP Status codes according to applicative semantic errors?
Thank you
Important idea - status codes are metadata of the transfer of documents over a a network domain; they describe the semantics of the HTTP response so that general purpose HTTP components can do intelligent things (e.g. invalidate cached responses).
Which HTTP CODE should the REST API return for these "semantic applicative errors"?
The fact that the values in the request body are the problem strongly suggests that we want some flavor of 4xx Client Error semantics.
I'm inclined to guess that the simplest approach would be to use 403 Forbidden
The 403 (Forbidden) status code indicates that
the server understood the request but refuses
to fulfill it.
Any nuance that you need to share with the user/bespoke client is described in the body of the 403 response.
From what I can tell, 409 Conflict isn't right, but also isn't going to get you into a lot of trouble.
For a general purpose component, 403 and 409 are handled essentially the same way -- in theory a general purpose component could try to "resolve the conflict" on its own and resubmit the request, but in practice we don't have a standard for describing the nature of the conflict, which means that the component isn't going to know a way to modify the request.
So while I would decline a pull request (PR) that used a 409 here, I would also accept a PR that used a 409 and also included a decision record documenting the trade offs that the implementer had considered in this specific context (for example - it might be important that your human operators easily be able to distinguish this case from authentication issues when scanning access logs).
In other words, make the boring choice unless you have really good reasons to do something else. If you have really good reasons to do something else, write them down.
422
My doubt about this one is that it is not part to the HTTP specs.
Don't be worried at that. HTTP status codes are intended to be extensible. Anything you find in the IANA status code registry should be considered safe to use.
Also, today, the registered reference for 422 is the current HTTP Semantics specification (RFC 9110).
That said, I wouldn't use it here, because I don't think the semantics are as good a fit for your circumstance as 403.
The 422 (Unprocessable Content) status code indicates that the
server understands the content type of the request content
(hence a 415 (Unsupported Media Type) status code is inappropriate),
and the syntax of the request content is correct, but it was unable
to process the contained instructions. For example, this status
code can be sent if an XML request content contains well-formed
(i.e., syntactically correct), but semantically erroneous XML
instructions.
My interpretation of this is that we're trying to indicate that there's a problem specific to the semantics of the request body (ex: a required field is missing). It announces that we're unable to process the request, rather than announcing that the processing failed.
In other words, 422 is "I don't know what this means", where 403 is "I know what this means, but I won't do it."
We also have considered using the 403 - FORBIDDEN code, but it seems to be more related to authorization issues.
The specification gives wider latitude than the most common usage.
a request might be forbidden for reasons unrelated
to the credentials.
That said, choosing a different status code can be the right engineering trade off. If the benefits of doing the right thing are small, and the costs (in particular, the support and operational costs) are large, well... maybe being successful is more important than being right.

HTTP code for an error in processing a request

Let's say we have an HTTP request made by the client. The endpoint exists and is accessible by the client (this rules out 401, 403, 404 and 405). The request payload is valid (this rules out 400). The server is alive and well, and is able to handle the request and return a response (this rules out 5xx).
The error arises within the processing of the request. Examples of such errors may include:
Business validation error.
Querying for an entity in a database that does not exist. Assume that the database lookup is only one part of the request processing pipeline (e.g. not the client request itself).
The server that handles the original client request makes an internal HTTP request that fails. In this case, the handling server is alive and well, while the internal HTTP request may return a 5xx. Assume that the internal HTTP request is only one part of the request processing pipeline (e.g. not the client request itself).
What is the appropriate HTTP code to assign for these responses?
I've seen API docs use 402 (Stripe) and 422 (PayPal), though I haven't come across anything definitive.
Thoughts from the community welcome! Thanks.
What is the appropriate HTTP code to assign for these responses?
Two important ideas
First - your API is a facade, designed to make it look your service/business logic/etc is just another HTTP compliant document store (aka the "uniform interface" constraint). For the purposes of designing your responses, the specific nature of your resources and the implementation details are not significant.
Second - the important point of a status code is how that status code will be understood by general purpose components (think browsers, web caches, reverse proxies, spiders...). We're trying to help these components broadly categorize the nature of the response. (This is one reason why there are relatively few codes in the 5xx class; there just isn't much that a general purpose component can do differently if the servers handling of the request fails).
And here's the thing: if the general purpose handling of two status codes isn't significantly different. 403 Forbidden and 409 Conflict have different semantics associated with them, but the differences in the standardized handling of those codes, if any, are pretty subtle.
You should make an effort to get 4xx vs 5xx right. It's often less important to precisely identify which 4xx code to use.
Business validation error
Common choices here would be 409 Conflict (your request is not consistent with my copy of the data), or 403 Forbidden (I understood your request, but I'm not going to fulfill it).
If the problem is the data within the request itself (ie: somebody submitted the wrong form) you are more likely to see a 422 Unprocessable Entity (yes, I accept application/json, but not this application/json).
Querying for an entity in a database that does not exist.
The implementation details don't matter; can you trace the problem back to the HTTP request?
If the problem traces back to the URI (we parse the target uri for some information, and use that information to lookup information in our data store), then 404 Not Found is often a good choice. If the problem traces back to the body of the request (we expected some option in the form to match an entry in our enumerated list, but it doesn't), then 409 Conflict is reasonable.
If the server's data is flat out issing, then you are probably looking at a 500 Internal Server Error.
The server that handles the original client request makes an internal HTTP request that fails.
A failure of the server to connect to some other HTTP server is purely an implementation detail, like not being able to connect to a database or a file system.
Unless that failure is a consequence of information in the request, you are going to end up with the 500 Internal Server Error.
This may be where the use of custom defined error response codes may come in, As long as you respect the already defined response codes. For example you could define 600 as your response code and in your API Docs specify what these custom codes mean in detail. For more information of all existing codes I would reference Iana: http://www.iana.org/assignments/http-status-codes/http-status-codes.xhtml
Now if your goal is to stay within existing http response boundaries I would recommend something along the lines of:
Unprocessable failure: Status 422
Authorization failure: Status 403
Unable to process could mean many things such as the aforementioned business validation error.
Business validation error.
This could be 400, 422, 403, 409 depending on what business validation means.
Querying for an entity in a database that does not exist. Assume that the database lookup is only one part of the request processing pipeline (e.g. not the client request itself).
Sounds like 400, 409 or 422.
The server that handles the original client request makes an internal HTTP request that fails. In this case, the handling server is alive and well, while the internal HTTP request may return a 5xx. Assume that the internal HTTP request is only one part of the request processing pipeline (e.g. not the client request itself).
The client doesn't know/care about internal http requests. The point is that it's failed, and it's a bug/system failure so this is a 5xx error.
The most important thing to remember when choosing a HTTP status code is:
Make sure you have the general class correct, so 4xx and 5xx depending on this is a client/server error.
If you need something more specific, ask yourself why. Is your client going to be able to make a better decision if it received a 400 or 409? If not, maybe it's not that important.
I wrote a ton about error codes here, and would suggest you read a bunch of the 4xx entries.
Also a great blog post from one of the authors of the HTTP standards, which goes more into the idea that finding the perfect status code for a case is not that important.

The use of HTTP status to communicate application circumstances

Suppose I ask a question to an criminal register: http://server/demographics/party/{partyId} and that person is not known on that system.
Is that an error? Isn't it a good thing?
When returning 404, it is an error code, Restlet has implemented it as a specific URL cannot be resolved to a route to a server-resource (a handling class), an erroneous situation.
In my opinion, if a system understands a call, and is able to process it without errors, it should return 200 (HTTP-ok), and it should return the information: "We don't know this person".
What is the best thing to do?
There's nothing wrong in returning a 404 status code. It simply means "you asked for a resource, and it doesn't exist". If it did return 200, it would have to somehow return an additional status code telling you that everything went fine, but the resource couldn't be found. That's unnecessary, because 404 already means exactly that.
A status 500 is normally returned to mean "something went wrong, I can't tell you if the resource exists or not". Now if you returned a 404 to mean that, this would be a mistake.
Whether it's a good thing or not is not doesn't have anything to do with HTTP or REST. And BTW, if the register was a file containing the survivors of a disaster, you would probably find it bad to not find the person you looked for, unless maybe if the person is your mother in law, unless you actually love your mother in law. (this is meant as a joke, for those who don't find it obvious).
A web service could be implemented in a way that produces either kind of response.
If it is being implemented in a way that is truer to HTTP/Rest, then '404 not found' would be the appropriate response for not finding something. This is what the 404 status is for.
It may help to think of your request as saying 'I assume this record exists, and I want to see it', and the 404 response as saying 'You are in error that record does not exist'.
If this URL is going to be used by code (rather than be displayed in a browser), then if you don't make a distinction in the response status then you will have to add extra information to the response body to make the difference obvious.
If this URL is going to be displayed to a user in a browser, then the server is still able to display content in the response body for a 404.
Please read: http://archive.oreilly.com/pub/post/restful_error_handling.html#__federated=1
Quoting:
Conclusion:
Human Readable Error Messages: Part of the major appeal of REST based web services is that you can open any browser, type in the right URL, and see an immediate response -- no special tools needed. However, HTTP error codes do not always provide enough information. For example, if we take option 1 above, and request and invalid book ID, we get back a 404 Error Code. From the developer perspective, have we actually typed in the wrong host name, or an invalid book ID? It's not immediately clear. In Option 3 (DAS), we get back a blank page with no information. To view the actual error code, you need to run a network sniffer, or point your browser through a proxy. For all these reasons, I think Option 4 has a lot to offer. It significantly lowers the barrier for new developers, and enables all information related to a web service to be directly viewable within a web browser.
Application Specific Errors: Option 1 has the disadvantage of not being directly viewable within a browser. It also has the additional disadvantage of mapping all HTTP error codes to application specific error codes. HTTP status codes are specific to document retrieval and posting, and these may not map directly to your application domain. For example, one of the DAS error codes relates to invalid genomic coordinates (sequence coordinate is out of bounds/invalid). What HTTP error code would we map to in this case?
Machine Readable Error Codes: As a third criteria, error codes should be easily readable by other applications. For example, the XooMLe application returns back only human readable error messages, e.g. "Invalid Google API key supplied". An application parsing a XooMLe response would have to search for this specific error message, and this can be notoriously brittle -- for example, the XooMLe server might simply change the message to "Invalid Key Supplied". Error codes, such as those provided by DAS are important for programmatic control, and easy creation of exceptions. For example, if XooMLe returned a 1001 error code, a client application could do a quick lookup and immediately throw an InvalidKeyException.
Based on these three criteria, here's my vote for best error handling option:
Use HTTP Status Codes for problems specifically related to HTTP, and not specifically related to your web service.
When an error occurs, always return an XML document detailing the error.
Make sure the XML error document contains both an error code, and a human readable error message. For example:
1001
Invalid Google API key supplied
By following these three simple practices, you can make it significantly easier for others to interface with your service, and react when things go wrong. New developers can easily see valid and invalid requests via a simple web browser, and programs can easily (and more robustly) extract error codes and act appropriately.
The Amazon.com web services API follows the approach of returned XML document can specify an ErrorMsg element.
XooMLe also follows this approach. (XooMLe provides a RESTful API wrapper to the existing SOAP based Google API).
Another approach is by DAS ( Distributed Annotation System) which always returns 200 if there was no HTTP-error and has error information in the HTTP-header, which is less favorable, because it is not human readable, as a browser does not display the HTTP-header.

Choose appropriate HTTP status codes in controversial situations or introduce subcodes?

I am developing iOS application running against a remote server, having another developer behind it. The project and an API draft we are writing are in initial phase.
The problem we are faced with is that we are not satisfied with existing amount of conventional status codes described by HTTP/REST specifications: there are cases where we are uncertain about what code to pick up.
Examples to provide minimal context:
Server-side validation errors. Fx. Client-side validations are ok, but server API has recently been changed slightly, so a server should return something indicating that it is exactly the validation problem.
An attempt to register user that already exists. SO topics do not provide any precise point on that.
A user is registered, and tries to log in without having the password confirmation procedure accomplished.
Two obvious approaches we see here:
Use fx 400 error for the cases when an appropriate conventional status code could not be found. This will lead us to parsing error text messages from JSON responses. Obviously, this approach will introduce superfluous complication in a client-side code.
Create our own sub-codes system and rely on it in our code. This one involves too much artificial conventions, which will lead us towards becoming too opinionated and arbitrary.
Feeling that the number of such cases is going to grow, we are thinking about an introduction of custom sub-codes in JSON responses our server should give (i.e. choose the second approach).
What I'm asking here:
What are the recommended approaches, strategies, even glues or hacks for these kinds of situations?
What are pros-cons of moving away from strictly following REST/HTTP conventions for status codes?
Thanks!
For validation problems, I use 422 Unprocessable Entity (WebDAV; RFC 4918)
The request was well-formed but was unable to be followed due to semantic errors. This is because the request did not fail because of malformed syntax, but because of semantics.
Then in order to communicate you just need to decide on your errors format, so for situation 1 if there is a required field you might return a 422 with the following.
{
"field": ["required"]
}
I would treat number two as a validation problem, since really it is a validation problem on username, so a 422 with the following.
{
"username": ["conflict"]
}
Number three I would treat as a 403 Forbidden because passing an authorization header will not help and will be forbidden until they do something other than pass credentials.
You could do something like oauth2 does and return a human readable description, a constant that people can code against that further clarifies the error and a uri for more information.
{
"error": "unfinished_registration",
"error_description": "Must finish the user registration process",
"error_uri": "http://yourdocumentation.com"
}
Note: you will find that people disagree on what http codes map to what situation and if 422 should be used since is part of the WebDAV extensions, and that is fine, the most important thing you can do is document what the codes mean and be consistent rather than being perfect with what they mean.
There's no such thing as "sub-codes" in HTTP (Microsoft IIS is clearly violating the spec, and should be flogged).
If there's an appropriate status code, use it; don't say "this status code means that in my application" because that's losing the value of generic status codes; you might as well design your own protocol.
After that, if you need to refine the semantics of the status code, use headers and/or the body.
For the use cases you have described, you could use these error codes:
1) 400 Bad Request
The request could not be understood by the server due to malformed syntax. The client SHOULD NOT repeat the request without modifications.
2) 409 Conflict
The request could not be completed due to a conflict with the current state of the resource. This code is only allowed in situations where it is expected that the user might be able to resolve the conflict and resubmit the request. The response body SHOULD include enough
information for the user to recognize the source of the conflict. Ideally, the response entity would include enough information for the user or user agent to fix the problem; however, that might not be possible and is not required.
Conflicts are most likely to occur in response to a PUT request. For example, if versioning were being used and the entity being PUT included changes to a resource which conflict with those made by an earlier (third-party) request, the server might use the 409 response to indicate that it can't complete the request. In this case, the response entity would likely contain a list of the differences between the two versions in a format defined by the response Content-Type.
3) 401 Not Authorized
The request requires user authentication. The response MUST include a WWW-Authenticate header field (section 14.47) containing a challenge applicable to the requested resource. The client MAY repeat the request with a suitable Authorization header field (section 14.8). If the request already included Authorization credentials, then the 401 response indicates that authorization has been refused for those credentials. If the 401 response contains the same challenge as the prior response, and the user agent has already attempted authentication at least once, then the user SHOULD be presented the entity that was given in the response, since that entity might include relevant diagnostic information. HTTP access authentication is explained in "HTTP Authentication: Basic and Digest Access Authentication" [43].
For any other use case that you have, it varies. I would probably go with number 2 if there is truly no standard way of encoding specific errors.

REST API Framework. Recommended behavior for invalid querystring parameter

I am implementing a REST API Framework, and I wonder what the recommendedbehavior is, when a client submits an invalid querystring parameter.
I will illustrate what I mean with a specific example:
Say, I have an API handler on the /api/contacts/ endpoint, and the handler provides a querystring filter named id, which enables clients to select certain contacts with the provided IDs.
So, a GET or DELETE request could be /api/contacts/?id=2&id=4&id=lalalala.
Clearly, there is no such thing as a Contact with id=lalalala. In this case, what should the server behave like?
Ignore the invalid Contact with id=lalalala, and only filter the contacts on the valid ids, 2 and 4.
Respond with an error code that indicates this error. If yes, which error code should be provided?
Thanks in advance.
Edit: To clarify; The main focus of the framework I develop, is having a predictable behavior and hence response codes. For this reason, I want the clients consuming an API built on this framework, to expect the least possible surprises.
So, the question basically is: Should the API return an error in this case(and if yes, which)? Or ignore invalid filter entries, and only filter on the correct querystring parameters?
Since this is a REST call, we are talking about resources. And whenever we have a wrong filter, we should return a proper error code.
In this case i would go for 400 - bad request as the resource was found and correctly mapped (/api/contacts), but there was a problem with the query string part. Therefore a 400 and not a 404.
Would return a 404 if someone requested /api/contacts-all or some non-existant resource.
EDIT based on comments below
Agree to your comment. Ideally a 400 is a problem with the request. Going by that, you could use a 422 Unprocessable Entity. Please look at the stackoverflow link below and it talks about the same thing.
I would guess that developers around the world would be more comfortable seeing a 400 than 422 for such logical errors due to the fact that bigger companies are using 400 and not 422.
References:
Http status codes and
400 for logical error vs malformed request
Following the letter of the law, the response should be a 404 Not found. However, nobody is going to get too upset with you if you prefer to return 400 - bad request.
I would definitely return a 4XX status code though. You want the client to know that they made an error.