References to IDs in APIs responses, null or 0? - api

I consider myself that 0 is not a good thing to do when returning information from an API
e.g.
{
userId: int|null
}
I have a colleague that insists in that userId should be 0 or -1, but that forces a system to know that the 0 means "not set", instead of null which is universally known as not set.
The same happens with string params, like logoUrl. However, in this case I think it is acceptable to have an empty string instead of null if the variable is not set or was unset.
Is there bibliography, standards, etc, that I can refer to?

I'm not aware of any standard around that, but the way I take those kind of decisions is by thinking about the way consumer services would read this response, the goal being to provide a very smooth and clean consuming workflow. This exercise can even be transformed into a documentation for your API consumers.
In your case, instead of returning a null/0 field, I would simply remove that field altogether when it's empty and let the consumers explicitly mark that field as optional in the model they use to deserialize this response.
In that way, they'll explicitly have to deal with those optional fields, than relying on the API to always provide a value for them.

Related

Where to validate parameters sent to an API?

In which place is best practice to validate parameters sent by the user in an API design? By parameter validation I refer to: checking required params are sent, ensure they have correct format and so... Here are a couple of simple examples that validate an id has been sent. It is Python using Flask to illustrate:
A) Add validation logic in the route definition, within the controller.
#api.route('/job', methods=['GET'])
def get_jobs():
try:
if params["id"] is None:
raise Exception("Invalid param ID parameters.")
job = job_manager.get_job(params["id"])
return jsonify(job)
B) In the core of the app. This is the business layer, where logic is applied to transform data.
class JobManager:
def get_job(self, job_id) -> None:
if job_id is None:
raise Exception("Invalid param ID parameters.")
In more complex scenarios a validator service or decorators could be used, but the question would be the same: At which point of the code is best practice to validate a user's input.
If the answer is none of the scenarios above (or both), please provide more details on your answer. If possible, try to be language agnostic as I'm looking for a best practice that can be applied anywhere.
Parsing, as a rule, should happen at the point where information enters your system, or as close to that point as is practical.
Therefore certainly "application layer" rather than "domain layer/business layer": either invoked by the controller itself, or very close to it. (Not typically "in" the controller, because you should be able to test the parser without being coupled to a bunch of HTTP ceremony.)
#api.route('/job', methods=['GET'])
def get_jobs():
try:
job_id = parse_job_id(params["id"])
job = job_manager.get_job(job_id)
return jsonify(job)
In typed languages, this can make your life a lot easier, because you greatly reduce the number of places you have to ask "does this general purpose data structure have the information I expect?"
Checks against business policy, on the other hand, normally belong in the domain layer.
For example: if your API requires a date, the checks that the date is actually present, and that the date is represented in the appropriate ISO-8601 format, and so on... these kinds of checks all happen as part of the parsing of the input by the controller.
On the other hand, checking that the date is "in the future", or that the date is within warranty, or ... these are checks that belong in your domain code.
Generally, I split validation into phases:
Immediate syntax validation of input data in the REST controller
Business-logic validation in the services
The first validation should only flag things that are definitely wrong; e.g. missing required fields, type mismatch, unparsable strings, any attempts of code injection and the presence (or lack of) of security tokens.
When this validation passes, the input data is at least syntactically correct and may be passed on to the services, where a more strict validation occurs; i.e. does the input data make sense business-wise, does the resource with that ID exist - and so on.
Short version: The first validation looks for things that are obviously wrong, while the following validation ensures input data is correct and meaningful business-wise.

Mono.Defer() vs Mono.create() vs Mono.just()?

Could someone help me to understand the difference between:
Mono.defer()
Mono.create()
Mono.just()
How to use it properly?
Mono.just(value) is the most primitive - once you have a value you can wrap it into a Mono and subscribers down the line will get it.
Mono.defer(monoSupplier) lets you provide the whole expression that supplies the resulting Mono instance. The evaluation of this expression is deferred until somebody subscribes. Inside of this expression you can additionally use control structures like Mono.error(throwable) to signal an error condition (you cannot do this with Mono.just).
Mono.create(monoSinkConsumer) is the most advanced method that gives you the full control over the emitted values. Instead of the need to return Mono instance from the callback (as in Mono.defer), you get control over the MonoSink<T> that lets you emit values through MonoSink.success(), MonoSink.success(value), MonoSink.error(throwable) methods.
Reactor documentation contains a few good examples of possible Mono.create use cases: link to doc.
The general advice is to use the least powerful abstraction to do the job: Mono.just -> Mono.defer -> Mono.create.
Although in general I agree with (and praise) #IlyaZinkovich's answer, I would be careful with the advice
The general advice is to use the least powerful abstraction to do the job: Mono.just -> Mono.defer -> Mono.create.
In the reactive approach, especially if we are beginners, it's very easy to overlook which the "least powerful abstraction" actually is. I am not saying anything else than #IlyaZinkovich, just depicting one detailed aspect.
Here is one specific use case where the more powerful abstraction Mono.defer() is preferable over Mono.just() but which might not be visible at the first glance.
See also:
https://stackoverflow.com/a/54412779/2886891
https://stackoverflow.com/a/57877616/2886891
We use switchIfEmpty() as a subscription-time branching:
// First ask provider1
provider1.provide1(someData)
// If provider1 did not provide the result, ask the fallback provider provider2
.switchIfEmpty(provider2.provide2(someData))
public Mono<MyResponse> provide2(MyRequest someData) {
// The Mono assembly is needed only in some corner cases
// but in fact it is always happening
return Mono.just(someData)
// expensive data processing which might even fail in the assemble time
.map(...)
.map(...)
...
}
provider2.provide2() accepts someData only when provider1.provide1() does not return any result, and/or the method assembly of the Mono returned by provider2.provide2() is expensive and even fails when called on wrong data.
It this case defer() is preferable, even if it might not be obvious at the first glance:
provider1.provide1(someData)
// ONLY IF provider1 did not provide the result, assemble another Mono with provider2.provide()
.switchIfEmpty(Mono.defer(() -> provider2.provide2(someData)))

What is the difference between an Idempotent and a Deterministic function?

Are idempotent and deterministic functions both just functions that return the same result given the same inputs?
Or is there a distinction that I'm missing?
(And if there is a distinction, could you please help me understand what it is)
In more simple terms:
Pure deterministic function: The output is based entirely, and only, on the input values and nothing else: there is no other (hidden) input or state that it relies on to generate its output. There are no side-effects or other output.
Impure deterministic function: As with a deterministic function that is a pure function: the output is based entirely, and only, on the input values and nothing else: there is no other (hidden) input or state that it relies on to generate its output - however there is other output (side-effects).
Idempotency: The practical definition is that you can safely call the same function multiple times without fear of negative side-effects. More formally: there are no changes of state between subsequent identical calls.
Idempotency does not imply determinacy (as a function can alter state on the first call while being idempotent on subsequent calls), but all pure deterministic functions are inherently idempotent (as there is no internal state to persist between calls). Impure deterministic functions are not necessarily idempotent.
Pure deterministic
Impure deterministic
Pure Nondeterministic
Impure Nondeterministic
Idempotent
Input
Only parameter arguments (incl. this)
Only parameter arguments (incl. this)
Parameter arguments and hidden state
Parameter arguments and hidden state
Any
Output
Only return value
Return value or side-effects
Only return value
Return value or side-effects
Any
Side-effects
None
Yes
None
Yes
After 1st call: Maybe.After 2nd call: None
SQL Example
UCASE
CREATE TABLE
GETDATE
DROP TABLE
C# Example
String.IndexOf
DateTime.Now
Directory.Create(String)Footnote1
Footnote1 - Directory.Create(String) is idempotent because if the directory already exists it doesn't raise an error, instead it returns a new DirectoryInfo instance pointing to the specified extant filesystem directory (instead of creating the filesystem directory first and then returning a new DirectoryInfo instance pointing to it) - this is just like how Win32's CreateFile can be used to open an existing file.
A temporary note on non-scalar parameters, this, and mutating input arguments:
(I'm currently unsure how instance methods in OOP languages (with their hidden this parameter) can be categorized as pure/impure or deterministic or not - especially when it comes to mutating the the target of this - so I've asked the experts in CS.SE to help me come to an answer - once I've got a satisfactory answer there I'll update this answer).
A note on Exceptions
Many (most?) programming languages today treat thrown exceptions as either a separate "kind" of return (i.e. "return to nearest catch") or as an explicit side-effect (often due to how that language's runtime works). However, as far as this answer is concerned, a given function's ability to throw an exception does not alter its pure/impure/deterministic/non-deterministic label - ditto idempotency (in fact: throwing is often how idempotency is implemented in the first place e.g. a function can avoid causing any side-effects simply by throwing right-before it makes those state changes - but alternatively it could simply return too.).
So, for our CS-theoretical purposes, if a given function can throw an exception then you can consider the exception as simply part of that function's output. What does matter is if the exception is thrown deterministically or not, and if (e.g. List<T>.get(int index) deterministically throws if index < 0).
Note that things are very different for functions that catch exceptions, however.
Determinacy of Pure Functions
For example, in SQL UCASE(val), or in C#/.NET String.IndexOf are both deterministic because the output depends only on the input. Note that in instance methods (such as IndexOf) the instance object (i.e. the hidden this parameter) counts as input, even though it's "hidden":
"foo".IndexOf("o") == 1 // first cal
"foo".IndexOf("o") == 1 // second call
// the third call will also be == 1
Whereas in SQL NOW() or in C#/.NET DateTime.UtcNow is not deterministic because the output changes even though the input remains the same (note that property getters in .NET are equivalent to a method that accepts no parameters besides the implicit this parameter):
DateTime.UtcNow == 2016-10-27 18:10:01 // first call
DateTime.UtcNow == 2016-10-27 18:10:02 // second call
Idempotency
A good example in .NET is the Dispose() method: See Should IDisposable.Dispose() implementations be idempotent?
a Dispose method should be callable multiple times without throwing an exception.
So if a parent component X makes an initial call to foo.Dispose() then it will invoke the disposal operation and X can now consider foo to be disposed. Execution/control then passes to another component Y which also then tries to dispose of foo, after Y calls foo.Dispose() it too can expect foo to be disposed (which it is), even though X already disposed it. This means Y does not need to check to see if foo is already disposed, saving the developer time - and also eliminating bugs where calling Dispose a second time might throw an exception, for example.
Another (general) example is in REST: the RFC for HTTP1.1 states that GET, HEAD, PUT, and DELETE are idempotent, but POST is not ( https://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html )
Methods can also have the property of "idempotence" in that (aside from error or expiration issues) the side-effects of N > 0 identical requests is the same as for a single request. The methods GET, HEAD, PUT and DELETE share this property. Also, the methods OPTIONS and TRACE SHOULD NOT have side effects, and so are inherently idempotent.
So if you use DELETE then:
Client->Server: DELETE /foo/bar
// `foo/bar` is now deleted
Server->Client: 200 OK
Client->Server DELETE /foo/bar
// foo/bar` is already deleted, so there's nothing to do, but inform the client that foo/bar doesn't exist
Server->Client: 404 Not Found
// the client asks again:
Client->Server: DELETE /foo/bar
// foo/bar` is already deleted, so there's nothing to do, but inform the client that foo/bar doesn't exist
Server->Client: 404 Not Found
So you see in the above example that DELETE is idempotent in that the state of the server did not change between the last two DELETE requests, but it is not deterministic because the server returned 200 for the first request but 404 for the second request.
A deterministic function is just a function in the mathematical sense. Given the same input, you always get the same output. On the other hand, an idempotent function is a function which satisfies the identity
f(f(x)) = f(x)
As a simple example. If UCase() is a function that converts a string to an upper case string, then clearly UCase(Ucase(s)) = UCase(s).
Idempotent functions are a subset of all functions.
A deterministic function will return the same result for the same inputs, regardless of how many times you call it.
An idempotent function may NOT return the same result (it will return the result in the same form but the value could be different, see http example below). It only guarantees that it will have no side effects. In other words it will not change anything.
For example, the GET verb is meant to be idempotent in HTTP protocol. If you call "~/employees/1" it will return the info for employee with ID of 1 in a specific format. It should never change anything but simply return the employee information. If you call it 10, 100 or so times, the returned format will always be the same. However, by no means can it be deterministic. Maybe if you call it the second time, the employee info has changed or perhaps the employee no longer even exists. But never should it have side effects or return the result in a different format.
My Opinion
Idempotent is a weird word but knowing the origin can be very helpful, idem meaning same and potent meaning power. In other words it means having the same power which clearly doesn't mean no side effects so not sure where that comes from. A classic example of There are only two hard things in computer science, cache invalidation and naming things. Why couldn't they just use read-only? Oh wait, they wanted to sound extra smart, perhaps? Perhaps like cyclomatic complexity?

Use encoded object query param for GET request

While designing the API with with the team a suggestion was forwarded in regards to some complex query parameters that we sent which need to be encoded as objects, arrays of objects, etc. Suppose I have a route GET /resource/ and I want to apply a set of filters directly in the query params. The object literal structure of this filter would be something like
filter: {
field1: {
contains: 'value',
notin: ['value2', 'value3']
},
field2: {
greaterThan: 10
}
}
Encoding this in the url, via a query string parser such as the qs node module that express.js uses internally, would be cheap on the backend. However 1) The generated url is very hard to read, if a client wants to connect with the API he would need to use an encoding library and 2) I don't think I ever encountered the use of query params like this, it looks a little bit of overengineered and I'm not sure how used it is and if it is really safe.
The example above would yield query params such as:
GET /resource/?field1%5Bcontains%5D=value&field1%5Bnotin%5D%5B0%5D=value2&field1%5Bnotin%5D%5B1%5D=value3&field2%5BgreaterThan%5D=10
Does this practice of sending url query parameters that happen to be complex objects have some standards or best practices?
We implemented a different solution for filtering, when the list of possible parameters was very long. We ended up doing it in two steps, posting the filter and returning a filter ID. The filter ID could then be used in the GET query.
We had trouble finding any best practices for this.

Is returning null bad design? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 4 years ago.
Improve this question
I've heard some voices saying that checking for a returned null value from methods is bad design. I would like to hear some reasons for this.
pseudocode:
variable x = object.method()
if (x is null) do something
The rationale behind not returning null is that you do not have to check for it and hence your code does not need to follow a different path based on the return value. You might want to check out the Null Object Pattern which provides more information on this.
For example, if I were to define a method in Java that returned a Collection I would typically prefer to return an empty collection (i.e. Collections.emptyList()) rather than null as it means my client code is cleaner; e.g.
Collection<? extends Item> c = getItems(); // Will never return null.
for (Item item : c) { // Will not enter the loop if c is empty.
// Process item.
}
... which is cleaner than:
Collection<? extends Item> c = getItems(); // Could potentially return null.
// Two possible code paths now so harder to test.
if (c != null) {
for (Item item : c) {
// Process item.
}
}
Here's the reason.
In Clean Code by Robert Martin he writes that returning null is bad design when you can instead return, say, empty array. Since expected result is an array, why not? It'll enable you to iterate over result without any extra conditions. If it's an integer, maybe 0 will suffice, if it's a hash, empty hash. etc.
The premise is to not force calling code to immediately handle issues. Calling code may not want to concern itself with them. That's also why in many cases exceptions is better than nil.
Good uses of returning null:
If null is a valid functional result, for example: FindFirstObjectThatNeedsProcessing() can return null if not found and the caller should check accordingly.
Bad uses: Trying to replace or hide exceptional situations such as:
catch(...) and return null
API dependency initialization failed
Out of disk space
Invalid input parameters (programming error, inputs must be sanitized by the caller)
etc
In those cases throwing an exception is more adequate since:
A null return value provides no meaningful error info
The immediate caller most likely cannot handle the error condition
There is no guarantee that the caller is checking for null results
However, Exceptions should not be used to handle normal program operation conditions such as:
Invalid username/password (or any user-provided inputs)
Breaking loops or as non-local gotos
Yes, returning NULL is a terrible design, in object-oriented world. In a nutshell, NULL usage leads to:
ad-hoc error handling (instead of exceptions)
ambiguous semantic
slow instead of fast failing
computer thinking instead of object thinking
mutable and incomplete objects
Check this blog post for a detailed explanation: http://www.yegor256.com/2014/05/13/why-null-is-bad.html. More in my book Elegant Objects, Section 4.1.
Who says this is bad design?
Checking for nulls is a common practice, even encouraged, otherwise you run the risk of NullReferenceExceptions everywhere. Its better to handle the error gracefully than throw exceptions when you don't need to.
Based on what you've said so far, I think there's not enough information.
Returning null from a CreateWidget()method seems bad.
Returning null from a FindFooInBar() method seems fine.
Its inventor says it is a billion dollar mistake!
It depends on the language you're using. If you're in a language like C# where the idiomatic way of indicating the lack of a value is to return null, then returning null is a good design if you don't have a value. Alternatively, in languages such as Haskell which idiomatically use the Maybe monad for this case, then returning null would be a bad design (if it were even possible).
If you read all the answers it becomes clear the answer to this question depends on the kind of method.
Firstly, when something exceptional happens (IOproblem etc), logically exceptions are thrown. When exactly something is exceptional is probably something for a different topic..
Whenever a method is expected to possibly have no results there are two categories:
If it is possible to return a neutral value, do so.
Empty enumrables, strings etc are good examples
If such a neutral value does not exist, null should be returned.
As mentioned, the method is assumed to possibly have no result, so it is not exceptional, hence should not throw an exception. A neutral value is not possible (for example: 0 is not especially a neutral result, depending on the program)
Untill we have an official way to denote that a function can or cannot return null, I try to have a naming convention to denote so.
Just like you have the TrySomething() convention for methods that are expected to fail, I often name my methods SafeSomething() when the method returns a neutral result instead of null.
I'm not fully ok with the name yet, but couldn't come up with anything better. So I'm running with that for now.
I have a convention in this area that served me well
For single item queries:
Create... returns a new instance, or throws
Get... returns an expected existing instance, or throws
GetOrCreate... returns an existing instance, or new instance if none exists, or throws
Find... returns an existing instance, if it exists, or null
For collection queries:
Get... always returns a collection, which is empty if no matching[1] items are found
[1] given some criteria, explicit or implicit, given in the function name or as parameters.
Exceptions are for exceptional circumstances.
If your function is intended to find an attribute associated with a given object, and that object does has no such attribute, it may be appropriate to return null. If the object does not exist, throwing an exception may be more appropriate. If the function is meant to return a list of attributes, and there are none to return, returning an empty list makes sense - you're returning all zero attributes.
It's not necessarily a bad design - as with so many design decisions, it depends.
If the result of the method is something that would not have a good result in normal use, returning null is fine:
object x = GetObjectFromCache(); // return null if it's not in the cache
If there really should always be a non-null result, then it might be better to throw an exception:
try {
Controller c = GetController(); // the controller object is central to
// the application. If we don't get one,
// we're fubar
// it's likely that it's OK to not have the try/catch since you won't
// be able to really handle the problem here
}
catch /* ... */ {
}
It's fine to return null if doing so is meaningful in some way:
public String getEmployeeName(int id){ ..}
In a case like this it's meaningful to return null if the id doesn't correspond to an existing entity, as it allows you to distinguish the case where no match was found from a legitimate error.
People may think this is bad because it can be abused as a "special" return value that indicates an error condition, which is not so good, a bit like returning error codes from a function but confusing because the user has to check the return for null, instead of catching the appropriate exceptions, e.g.
public Integer getId(...){
try{ ... ; return id; }
catch(Exception e){ return null;}
}
For certain scenarios, you want to notice a failure as soon as it happens.
Checking against NULL and not asserting (for programmer errors) or throwing (for user or caller errors) in the failure case can mean that later crashes are harder to track down, because the original odd case wasn't found.
Moreover, ignoring errors can lead to security exploits. Perhaps the null-ness came from the fact that a buffer was overwritten or the like. Now, you are not crashing, which means the exploiter has a chance to execute in your code.
What alternatives do you see to returning null?
I see two cases:
findAnItem( id ). What should this do if the item is not found
In this case we could: Return Null or throw a (checked) exception (or maybe create an item and return it)
listItemsMatching (criteria) what should this return if nothing is found?
In this case we could return Null, return an empty list or throw an Exception.
I believe that return null may be less good than the alternatives becasue it requires the client to remember to check for null, programmers forget and code
x = find();
x.getField(); // bang null pointer exception
In Java, throwing a checked exception, RecordNotFoundException, allows the compiler to remind the client to deal with case.
I find that searches returning empty lists can be quite convenient - just populate the display with all the contents of the list, oh it's empty, the code "just works".
Make them call another method after the fact to figure out if the previous call was null. ;-) Hey, it was good enough for JDBC
Well, it sure depends of the purpose of the method ... Sometimes, a better choice would be to throw an exception. It all depends from case to case.
Sometimes, returning NULL is the right thing to do, but specifically when you're dealing with sequences of different sorts (arrays, lists, strings, what-have-you) it is probably better to return a zero-length sequence, as it leads to shorter and hopefully more understandable code, while not taking much more writing on API implementer's part.
The base idea behind this thread is to program defensively. That is, code against the unexpected.
There is an array of different replies:
Adamski suggests looking at Null Object Pattern, with that reply being up voted for that suggestion.
Michael Valenty also suggests a naming convention to tell the developer what may be expected.
ZeroConcept suggests a proper use of Exception, if that is the reason for the NULL.
And others.
If we make the "rule" that we always want to do defensive programming then we can see that these suggestions are valid.
But we have 2 development scenarios.
Classes "authored" by a developer: The Author
Classes "consumed" by another(maybe) developer: the Developer
Regardless of whether a class returns NULL for methods with a return value or not,
the Developer will need to test if the object is valid.
If the developer cannot do this, then that Class/method is not deterministic.
That is, if the "method call" to get the object does not do what it "advertises" (eg getEmployee) it has broken the contract.
As an author of a class, I always want to be as kind and defensive ( and deterministic) when creating a method.
So given that either NULL or the NULL OBJECT (eg if(employee as NullEmployee.ISVALID)) needs to be checked
and that may need to happen with a collection of Employees, then the null object approach is the better approach.
But I also like Michael Valenty's suggestion of naming the method that MUST return null eg getEmployeeOrNull.
An Author who throws an exception is removing the choice for the developer to test the object's validity,
which is very bad on a collection of objects, and forces the developer into exception handling
when branching their consuming code.
As a developer consuming the class, I hope the author gives me the ability to avoid or program for the null situation
that their class/methods may be faced with.
So as a developer I would program defensively against NULL from a method.
If the author has given me a contract that always returns a object (NULL OBJECT always does)
and that object has a method/property by which to test the validity of the object,
then I would use that method/property to continue using the object, else the object is not valid
and I cannot use it.
Bottom line is that the Author of the Class/Methods must provide mechanisms
that a Developer can use in their defensive programming. That is, a clearer intention of the method.
The Developer should always use defensive programming to test the validity of the objects returned
from another class/method.
regards
GregJF
Other options to this, are:
returning some value that indicates success or not (or type of an error), but if you just need boolean value that will indicate success / fail, returning null for failure, and an object for success wouldn't be less correct, then returning true/false and getting the object through parameter.
Other approach would to to use exception to indicates failures, but here - there are actually many more voices, that say this is a BAD practice (as using exceptions may be convenient but has many disadvantages).
So I personally don't see anything bad in returning null as indication that something went wrong, and checking it later (to actually know if you have succeeded or not). Also, blindly thinking that your method will not return NULL, and then base your code on it, may lead to other, sometimes hard to find, errors (although in most cases it will just crash your system :), as you will reference to 0x00000000 sooner or later).
Unintended null functions can arise during the development of a complex programs, and like dead code, such occurrences indicate serious flaws in program structures.
A null function or method is often used as the default behavior of a revectorable function or overrideable method in an object framework.
Null_function #wikipedia
If the code is something like:
command = get_something_to_do()
if command: # if not Null
command.execute()
If you have a dummy object whose execute() method does nothing, and you return that instead of Null in the appropriate cases, you don't have to check for the Null case and can instead just do:
get_something_to_do().execute()
So, here the issue is not between checking for NULL vs. an exception, but is instead between the caller having to handle special non-cases differently (in whatever way) or not.
For my use case I needed to return a Map from method and then looking for a specific key. But if I return an empty Map, then it will lead to NullPointerException and then it wont be much different returning null instead of an empty Map.
But from Java8 onward we could use Optional. The above is the very reason Optional concept was introduced.
G'day,
Returning NULL when you are unable to create a new object is standard practise for many APIs.
Why the hell it's bad design I have no idea.
Edit: This is true of languages where you don't have exceptions such as C where it has been the convention for many years.
HTH
'Avahappy,