Null safety issues with Gson in Kotlin

I have a data class:
data class Temp(
val value: Int
)
When I run the following code:
val temp1 = Gson().fromJson("", Temp::class.java)
val temp2 = temp1.value
println(temp2)
I get null pointer exception.
Doesn't this violate the null safety of kotlin?

As far as I can remember, this has to do with the way Gson uses reflection, which can bypass Kotlin's null safety entirely. The best thing you can do is mark all fields as potentially nullable and be safe; alternatively, you could write custom type adapters, but that could be a bit much. Keep in mind that Kotlin's null safety is only enforced at compile time and isn't guaranteed at runtime.
Generally, if you're using a third-party API which can change at any time, it's probably a good idea to assume that anything could change at any given moment: variable names changing, variables being removed, etc. If you're using your own APIs this is probably a bit different, as you'd (hopefully) have more control over the changes. Still, it doesn't hurt to manually implement null safety: assume everything can be null.
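A minimal sketch of the nullable-field approach (assuming the JSON may be empty or missing fields):
import com.google.gson.Gson

data class Temp(
    val value: Int? = null
)

fun main() {
    // Gson instantiates Temp reflectively, so a missing field simply stays null.
    val temp = Gson().fromJson("{}", Temp::class.java)
    // Because value is declared Int?, the compiler now forces an explicit null check.
    println(temp.value ?: "no value present")
}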


Kotlin: Assert Immutability

I have class that internally maintains a mutable list, and I want to provide an immutable view on this list. Currently I'm using the following:
/**The list that actually stores which element is at which position*/
private val list: MutableList<T> = ArrayList()
/**Immutable view of [list] to the outside.*/
val listView: List<T> get() = list.toList()
First question: Can this be done more easily?
Second question: How can I test that listView is actually immutable? I guess reflection is necessary?
If you only needed the compile-time type to be immutable, you could simply upcast your list:
val listView: List<T> get() = list
(Though if you checked and downcast that to MutableList, you could make changes — and those would affect the original list.)
However, if you want full immutability, that's tricky.  I don't think there are any general-purpose truly immutable lists in the Kotlin stdlib.
Although List and MutableList look like two different types, in Kotlin/JVM they both compile down to the same type in the bytecode, which is mutable.  And even in Kotlin, while Iterable.toList() does return a new list, the current implementation* actually gives a MutableList that's been upcast to List.  (Though mutating it wouldn't change the original list.)
Some third-party libraries provide truly immutable collections, though; see this question.
And to check whether a List is mutable, you could use a simple type check:
if (listView is MutableList)
// …
No need to use reflection explicitly.  (A type check like that is considered to be implicit reflection.)
(* That can change, of course.  It's usually a mistake to read too much into the current code, if it's not backed up by the documentation.)
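A quick sketch illustrating both points above; note that it relies on the current stdlib behaviour, which, as said, is an implementation detail:
fun main() {
    val original = mutableListOf(1, 2, 3)
    val view: List<Int> = original.toList()

    // The static type is List<Int>, but the runtime object is currently mutable.
    println(view is MutableList<*>)  // true with the current stdlib

    // The downcast compiles and the mutation succeeds, but since toList() made
    // a copy, the original list is unaffected.
    (view as MutableList<Int>).add(4)
    println(original)  // [1, 2, 3]
    println(view)      // [1, 2, 3, 4]
}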

Unit testing value objects in isolation from its dependencies

TL;DR
How do you test a value object in isolation from its dependencies without stubbing or injecting them?
In Misko Hevery's blog post To “new” or not to “new”… he advocates the following (quoted from the blog post):
An Injectable class can ask for other Injectables in its constructor. (Sometimes I refer to Injectables as Service Objects, but that term is overloaded.) An Injectable can never ask for a non-Injectable (Newable) in its constructor.
Newables can ask for other Newables in their constructor, but not for Injectables. (Sometimes I refer to Newables as Value Objects, but again, the term is overloaded.)
Now if I have a Quantity value object like this:
class Quantity {
    private $quantity = 0;

    public function __construct($quantity) {
        $intValidator = new Zend_Validate_Int();
        if (!$intValidator->isValid($quantity)) {
            throw new Exception("Quantity must be an integer.");
        }

        $gtValidator = new Zend_Validate_GreaterThan(0);
        if (!$gtValidator->isValid($quantity)) {
            throw new Exception("Quantity must be greater than zero.");
        }

        $this->quantity = $quantity;
    }
}
My Quantity value object depends on at least 2 validators for its proper construction. Normally I would have injected those validators through the constructor, so that I can stub them during testing.
However, according to Misko a newable shouldn't ask for injectables in its constructor. Frankly a Quantity object that looks like this
$quantity = new Quantity(1, $intValidator, $gtValidator); looks really awkward.
Using a dependency injection framework to create a value object is even more awkward. However, now my dependencies are hard-coded in the Quantity constructor, and I have no way to alter them if the business logic changes.
How do you design the value object properly for testing and adherence to the separation between injectables and newables?
Notes:
This is just a very, very simplified example. My real object may have serious logic in it that may use other dependencies as well.
I used a PHP example just for illustration. Answers in other languages are appreciated.
A Value Object should only contain primitive values (integers, strings, boolean flags, other Value Objects, etc.).
Often, it would be best to let the Value Object itself protect its invariants. In the Quantity example you supply, it could easily do that by checking the incoming value without relying on external dependencies. However, I realize that you write
This is just a very, very simplified example. My real object may have serious logic in it that may use other dependencies as well.
So, while I'm going to outline a solution based on the Quantity example, keep in mind that it looks overly complex because the validation logic is so simple here.
Since you also write
I used a PHP example just for illustration. Answers in other languages are appreciated.
I'm going to answer in F#.
If you have external validation dependencies, but still want to retain Quantity as a Value Object, you'll need to decouple the validation logic from the Value Object.
One way to do that is to define an interface for validation:
type IQuantityValidator =
    abstract Validate : decimal -> unit
In this case, I patterned the Validate method on the OP example, which throws exceptions upon validation failures. This means that if the Validate method doesn't throw an exception, all is good. This is the reason the method returns unit.
(If I hadn't decided to pattern this interface on the OP, I'd have preferred using the Specification pattern instead; if so, I'd instead have declared the Validate method as decimal -> bool.)
The IQuantityValidator interface enables you to introduce a Composite:
type CompositeQuantityValidator(validators : IQuantityValidator list) =
    interface IQuantityValidator with
        member this.Validate value =
            validators
            |> List.iter (fun validator -> validator.Validate value)
This Composite simply iterates through other IQuantityValidator instances and invokes their Validate method. This enables you to compose arbitrarily complex validator graphs.
One leaf validator could be:
type IntegerValidator() =
    interface IQuantityValidator with
        member this.Validate value =
            if value % 1m <> 0m
            then
                raise(
                    ArgumentOutOfRangeException(
                        "value",
                        "Quantity must be an integer."))
Another one could be:
type GreaterThanValidator(boundary) =
    interface IQuantityValidator with
        member this.Validate value =
            if value <= boundary
            then
                raise(
                    ArgumentOutOfRangeException(
                        "value",
                        "Quantity must be greater than zero."))
Notice that the GreaterThanValidator class takes a dependency via its constructor. In this case, boundary is just a decimal, so it's a Primitive Dependency, but it could just as well have been a polymorphic dependency (a.k.a. a Service).
You can now compose your own validator from these building blocks:
let myValidator =
    CompositeQuantityValidator([IntegerValidator(); GreaterThanValidator(0m)])
When you invoke myValidator with e.g. 9m or 42m, it returns without errors, but if you invoke it with e.g. 9.8m, 0m or -1m it throws the appropriate exception.
If you want to build something a bit more complicated than a decimal, you can introduce a Factory, and compose the Factory with the appropriate validator.
Since Quantity is very simple here, we can just define it as a type alias on decimal:
type Quantity = decimal
A Factory might look like this:
type QuantityFactory(validator : IQuantityValidator) =
    member this.Create value : Quantity =
        validator.Validate value
        value
You can now compose a QuantityFactory instance with your validator of choice:
let factory = QuantityFactory(myValidator)
which will let you supply decimal values as input, and get (validated) Quantity values as output.
These calls succeed:
let x = factory.Create 9m
let y = factory.Create 42m
while these throw appropriate exceptions:
let a = factory.Create 9.8m
let b = factory.Create 0m
let c = factory.Create (-1m)
Now, all of this is very complex given the simple nature of the example domain, but as the problem domain grows more complex, complex is better than complicated.
Avoid value types with dependencies on non-value types. Also avoid constructors that perform validations and throw exceptions. In your example I'd have a factory type that validates and creates quantities.
Your scenario can also be applied to entities. There are cases where an entity requires some dependency in order to perform some behaviour. As far as I can tell the most popular mechanism to use is double-dispatch.
I'll use C# for my examples.
In your case you could have something like this:
public void Validate(IQuantityValidator validator)
As other answers have noted a value object is typically simple enough to perform its invariant checking in the constructor. An e-mail value object would be a good example as an e-mail has a very specific structure.
Something a bit more complex could be an OrderLine where we need to determine, totally hypothetical, whether it is, say, taxable:
public bool IsTaxable(ITaxableService service)
In the article you reference I would assert that the 'newable' relates quite a bit to the 'transient' type of life cycle that we find in DI containers as we are interested in specific instances. However, when we need to inject specific values the transient business does not really help. This is the case for entities where each is a new instance but has very different state. A repository would hydrate the object but it could just as well use a factory.
The 'true' dependencies typically have a 'singleton' life-cycle.
So for the 'newable' instances a factory could be used if you would like to perform validation upon construction by having the factory call the relevant validation method on your value object using the injected validator dependency as Mark Seemann has mentioned.
This gives you the freedom to still test in isolation without coupling to a specific implementation in your constructor.
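To make that concrete, here's a rough Kotlin sketch of the factory/double-dispatch combination (all names hypothetical): the factory is the injectable with the singleton-style life cycle, while Quantity stays a newable.
interface QuantityValidator {
    fun validate(value: Int)
}

class Quantity internal constructor(val value: Int) {
    // Double dispatch: the dependency arrives as a method parameter,
    // so a test can pass a stub without the value object storing any service.
    fun validate(validator: QuantityValidator) = validator.validate(value)
}

class QuantityFactory(private val validator: QuantityValidator) {
    // The factory owns the dependency and validates upon construction.
    fun create(value: Int): Quantity =
        Quantity(value).also { it.validate(validator) }
}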
Just a slightly different angle on what has already been answered. Hope it helps :)

Stateful objects, properties and parameter-less methods in favour of stateless objects, parameters and return values

I find this class definition a bit odd:
http://www.extremeoptimization.com/Documentation/Reference/Extreme.Mathematics.LinearAlgebra.SingleLeastSquaresSolver_Members.aspx
The Solve method does have a return value, but it would not need one, because the result is also available in the Solution property.
This is what I see as traditional code:
var sqrt2 = Math.Sqrt(2);
This would be an alternative in the same spirit as the solver in the link:
var sqrtCalculator = new SqrtCalculator();
sqrtCalculator.Parameter = 2;
sqrtCalculator.Run();
var sqrt2 = sqrtCalculator.Result;
What are the pros and cons besides the second version being a bit "untraditional"?
Yes, the compiler won't help the user who forgot to assign some property (parameter) BUT this is the case with all components that contain writeable properties and don't have mandatory values in the constructor.
Yes, threading will not work, BUT each thread can create its own solver.
Yes, the garbage collector won't be able to dispose the solver's result, BUT if the entire solver is disposed it will.
Yes, compilers and processors have special treatment of parameters and return values which makes them fast, BUT the time for parameter handling is mostly negligible.
And so on. Other ideas?
Well, after a year I found a clear flaw in this "introvert" approach. I am using an existing filter object which should operate on a measurement object, but instead operates on itself in the "it's all me and nothing else" fashion described above. Now the customer wants a recalculation of a measurement object a few minutes after the first calculation, and meanwhile the filter has processed other measurement objects. If the filter had been stateless and stored its data in the measurement object, implementing a Recalculate method would have been easy. The only way to solve the problem with an introvert filter is to make a filter instance part of the measurement object. Then filters need to be instantiated for every new measurement object, and since filters are part of a chain, the entire chain needs to be recreated. Well, there is some merit to being stateless.
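For illustration, a minimal Kotlin sketch (hypothetical names, trivial "filtering") of the stateless alternative, where results live with the measurement instead of inside the filter:
class Measurement(val samples: DoubleArray) {
    var filtered: DoubleArray? = null  // results stay with the data they describe
}

class Filter {
    fun process(m: Measurement) {
        // A pure function of the input; the filter itself keeps no state.
        m.filtered = m.samples.map { it * 0.5 }.toDoubleArray()
    }
}

fun main() {
    val m = Measurement(doubleArrayOf(1.0, 2.0, 3.0))
    val filter = Filter()
    filter.process(m)
    // Minutes (or days) later, recalculation is just another call,
    // regardless of what the filter has processed in between.
    filter.process(m)
    println(m.filtered?.toList())
}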

Language without type-casting

My question is pretty much what the title says: Is it possible to have a programming language which does not allow explicit type casting?
To clarify what I mean, assume we're working in some C#-like language with a parent Base class and a child Derived class. Clearly, such code would be safe:
Base a = new Derived();
Since going up the inheritance hierarchy is safe, but
Derived b = (Derived)a;
is not guaranteed safe, since going down is not safe.
But, regardless of the safety, such downcasts are valid in many languages (like Java or C#) - the code will compile, and will simply fail at runtime if the types aren't right. So technically, the code is still safe, but via runtime checks and not compile-time checks (btw, I'm not a fan of runtime checks).
I would personally find complete compile-time type safety to be very important, at least from a theoretical perspective, and at most from the perspective of reliable code. A consequence of compile-time type safety is that casts are no longer needed (which I think is great, 'cause they're ugly anyways). Any cast-like behaviour can be implemented by an implicit conversion operator or by a constructor.
So I'm wondering: are there currently any OO languages which provide such rigorous type safety at compile time that casts are obsolete? I.e., languages that don't allow any unsafe conversion operations whatsoever? Or is there a reason this wouldn't work?
Thanks for any input.
Edit
If I can clarify by example, here's the big reason I hate downcasts so much.
Let's say I have the following (loosely based on C#'s collections):
public interface IEnumerable<T>
{
IEnumerator<T> GetEnumerator();
IEnumerable<T> Filter( Func<T, bool> predicate );
}
public class List<T> : IEnumerable<T>
{
// All of list's implementation here
}
Now suppose someone decides to write code like this:
List<int> list = new List<int>( new int[]{1, 2, 3, 4, 5, 6} );
// Let's filter out the odd numbers
List<int> result = (List<int>)list.Filter( x => x % 2 != 0 );
Notice how the cast is necessary on that last line. But is it valid? Not in general. Sure, it makes sense that the implementation of List<T>.Filter will return another List<T>, but this is not guaranteed (it could be any subtype of IEnumerable<T>). Even if this runs at one point in time, a later version may change this, exposing how brittle the code is.
Pretty much all of the situations I can think that require downcasts would boil down to something like this example - a method has a return type of some class or interface, but since we know some implementation details, we're confident in downcasting the result. But this is anti-OOP, since OOP actually encourages abstracting from implementation details. So why do we do it anyways, even in purely OOP languages?
Downcasts can be gradually eliminated by improving the power of the type system.
One proposed solution to the example you gave is to add the ability to declare the return type of a method as "the same as this". This allows a subclass to return a subclass without requiring a cast. Thus you get something like this:
public interface IEnumerable<T>
{
IEnumerator<T> GetEnumerator();
This<T> Filter( Func<T, bool> predicate );
}
public class List<T> : IEnumerable<T>
{
// All of list's implementation here
}
Now the cast is unnecessary:
List<int> list = new List<int>( new int[]{1, 2, 3, 4, 5, 6} );
// Compiler "knows" that Filter returns the same type as its receiver
List<int> result = list.Filter( x => x % 2 != 0 );
Other cases of downcasting also have proposed solutions by improving the type system, but these improvements have not yet been made to C#, Java, or C++.
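Some languages do already offer covariant return types in overrides, which covers this particular example whenever the receiver's static type is the subclass. A minimal Kotlin sketch (hypothetical Seq/ListSeq names standing in for IEnumerable/List):
interface Seq<T> {
    fun filter(predicate: (T) -> Boolean): Seq<T>
}

class ListSeq<T>(val items: List<T>) : Seq<T> {
    // The override may declare the narrower return type ListSeq<T>.
    override fun filter(predicate: (T) -> Boolean): ListSeq<T> =
        ListSeq(items.filter(predicate))
}

fun main() {
    val list = ListSeq(listOf(1, 2, 3, 4, 5, 6))
    val evens: ListSeq<Int> = list.filter { it % 2 == 0 }  // no downcast needed
    println(evens.items)  // [2, 4, 6]
}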
Well, it's certainly possible to have programming languages that don't have subtyping at all, and then naturally there's no need for downcasts there. Most non-OO languages fall into that class.
Even in a class-based OO language like Java, most downcasts could formally be replaced simply by letting the base class have a method
Foo meAsFoo() {
    return null;
}
which the subclass would then override to return itself. However, that would still just be another way to express a run-time test, with the added downside of being more complicated to use. And it would be hard to forbid the pattern without losing all other advantages of inheritance-based subtyping.
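For what it's worth, here is the same pattern sketched in Kotlin; covariant return types let the subclass override narrow Foo? down to Foo:
open class Base {
    open fun meAsFoo(): Foo? = null
}

class Foo : Base() {
    // The override narrows the return type; a caller holding a Base-typed
    // reference still gets Foo? and must perform the run-time null test.
    override fun meAsFoo(): Foo = this
}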
Of course, this is only possible if you're able to modify the parent class. I suspect you might consider that a plus, but given how often one can modify the parent class and so use the workaround, I'm not sure how much that would be worth in terms of encouraging "good" design (for some more or less arbitrary value of "good").
A case could be made that it would encourage safe programming more if the language offered a case-matching construct instead of a downcast expression:
Shape x = .... ;
switch( x ) {
    case Rectangle r:
        return 5*r.diagonal();
    case Circle c:
        return c.radius();
    case Point:
        return 0;
    default:
        throw new RuntimeException("This can't happen, and I, "+
            "the programmer, take full responsibility");
}
However, it might then be a problem in practice that, without a closed-world assumption (which modern programming languages seem to be reluctant to make), many of those switches would need default: cases that the programmer knows can never happen, which might well desensitize the programmer to the resulting throws.
There are many languages with duck typing and/or implicit type conversion. Perl certainly comes to mind; the intricacies of how subtypes of the scalar type are converted internally are a frequent source of criticism, but also receive praise because when they do work like you expect, they contribute to the DWIM feel of the language.
Traditional Lisp is another good example - all you have is atoms and lists, and nil which is both at the same time. Otherwise, the twain never meet ...
(You seem to come from a universe where programming languages are necessarily object-oriented, strongly typed, and compiled, though.)

Non-nullable reference types

I'm designing a language, and I'm wondering if it's reasonable to make reference types non-nullable by default, and use "?" for nullable value and reference types. Are there any problems with this? What would you do about this:
class Foo {
    Bar? b;
    Bar b2;
    Foo() {
        b.DoSomething(); // valid, but will cause exception
        b2.DoSomething(); // ?
    }
}
My current language design philosophy is that nullability should be something a programmer is forced to ask for, not given by default on reference types (in this, I agree with Tony Hoare - Google for his recent QCon talk).
On this specific example, with the non-nullable b2, it wouldn't even pass static checks: conservative analysis cannot guarantee that b2 isn't null, so the program is not semantically meaningful.
My ethos is simple enough. References are an indirection handle to some resource, which we can traverse to obtain access to that resource. Nullable references are either an indirection handle to a resource, or a notification that the resource is not available, and one is never sure up front which semantics are being used. This gives either a multitude of checks up front (Is it null? No? Yay!), or the inevitable NPE (or equivalent). Most programming resources are, these days, not massively resource constrained or bound to some finite underlying model - null references are, simplistically, one of...
Laziness: "I'll just bung a null in here". Which frankly, I don't have too much sympathy with
Confusion: "I don't know what to put in here yet". Typically also a legacy of older languages, where you had to declare your resource names before you knew what your resources were.
Errors: "It went wrong, here's a NULL". Better error reporting mechanisms are thus essential in a language
A hole: "I know I'll have something soon, give me a placeholder". This has more merit, and we can think of ways to combat this.
Of course, solving each of the cases that NULL currently caters for with a better linguistic choice is no small feat, and may add more confusion than it helps. We can always go to immutable resources, so NULL in its only useful states (error, and hole) isn't much real use. Imperative techniques are here to stay, though, and I'm frankly glad; this makes the search for better solutions in this space worthwhile.
Having reference types be non-nullable by default is the only reasonable choice. We are plagued by languages and runtimes that have screwed this up; you should do the Right Thing.
This feature was in Spec#. They defaulted to nullable references and used ! to indicate non-nullables. This was because they wanted backward compatibility.
In my dream language (of which I'd probably be the only user!) I'd make the same choice as you, non-nullable by default.
I would also make it illegal to use the . operator on a nullable reference (or anything else that would dereference it). How would you use them? You'd have to convert them to non-nullables first. How would you do this? By testing them for null.
In Java and C#, the if statement can only accept a bool test expression. I'd extend it to accept the name of a nullable reference variable:
if (myObj)
{
    // in this scope, myObj is non-nullable, so can be used
}
This special syntax would be unsurprising to C/C++ programmers. I'd prefer a special syntax like this to make it clear that we are doing a check that modifies the type of the name myObj within the truth-branch.
I'd add a further bit of sugar:
if (SomeMethodReturningANullable() into anotherObj)
{
    // anotherObj is non-nullable, so can be used
}
This just gives the name anotherObj to the result of the expression on the left of the into, so it can be used in the scope where it is valid.
I'd do the same kind of thing for the ?: operator.
string message = GetMessage() into m ? m : "No message available";
Note that string message is non-nullable, but so are the two possible results of the test above, so the assignment is valid.
And then maybe a bit of sugar for the presumably common case of substituting a value for null:
string message = GetMessage() or "No message available";
Obviously or would only be validly applied to a nullable type on the left side, and a non-nullable on the right side.
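Kotlin, as it happens, later shipped essentially this combination of features. A brief sketch (getMessage is a stand-in for any nullable-returning function):
fun getMessage(): String? = null  // stand-in; could return a value or null

fun demo(myObj: Any?) {
    if (myObj != null) {
        // Smart cast: within this branch myObj has the non-nullable type Any,
        // much like the special if syntax proposed above.
        println(myObj.hashCode())
    }

    // The elvis operator ?: substitutes a value for null, like the `or` sugar.
    val message: String = getMessage() ?: "No message available"
    println(message)
}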
(I'd also have a built-in notion of ownership for instance fields; the compiler would generate the IDisposable.Dispose method automatically, and the ~Destructor syntax would be used to augment Dispose, exactly as in C++/CLI.)
Spec# had another syntactic extension related to non-nullables, due to the problem of ensuring that non-nullables had been initialized correctly during construction:
class SpecSharpExampleClass
{
    private string! _nonNullableExampleField;

    public SpecSharpExampleClass(string s)
        : _nonNullableExampleField(s)
    {
    }
}
In other words, you have to initialize fields in the same way as you'd call other constructors with base or this - unless of course you initialize them directly next to the field declaration.
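Kotlin sidesteps the same problem with definite-assignment checking rather than special syntax; a small sketch:
class Example(s: String) {
    // Omitting this initializer is a compile error: a non-nullable property
    // must be initialized at its declaration, in an init block,
    // or in every constructor.
    private val nonNullableExampleField: String = s
}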
Have a look at the Elvis operator proposal for Java 7. This does something similar, in that it encapsulates a null check and method dispatch in one operator, with a specified return value if the object is null. Hence:
String s = mayBeNull?.toString() ?: "null";
checks whether mayBeNull is null, and evaluates to the string "null" if so, and to mayBeNull.toString() if not. Food for thought, perhaps.
A couple of examples of similar features in other languages:
boost::optional (C++)
Maybe (Haskell)
There's also Nullable<T> (from C#) but that is not such a good example because of the different treatment of reference vs. value types.
In your example you could add a conditional message send operator, e.g.
b?->DoSomething();
To send a message to b only if it is non-null.
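Kotlin's safe-call operator is exactly such a conditional message send; a tiny sketch:
class Bar {
    fun doSomething() = println("doing something")
}

fun main() {
    val b: Bar? = null
    b?.doSomething()  // no-op when b is null; dispatches when b is non-null
}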
Have nullability be a configuration setting, enforceable in the author's source code. That way, you allow people who like nullable objects by default to enjoy them in their source code, while allowing those who would like all their objects to be non-nullable by default to have exactly that. Additionally, provide keywords or some other facility to explicitly mark which declarations of objects and types can be nullable and which cannot, with something like nullable and non-nullable, to override the global defaults.
For instance
/// "translation unit 1"
#set nullable
{ /// Scope of default override, making all declarations within the scope nullable implicitly
    Bar bar; /// Can be null
    non-null Foo foo; /// Overridden, cannot be null
    nullable FooBar foobar; /// Overridden, can be null, even without the scope definition above
}
/// Same style for opposite
/// ...
/// Top-bottom, until reset by scoped-setting or simply reset to another value
#set nullable;
/// Nullable types implicitly
#clear nullable;
/// Can also use '#set nullable = false' or '#set not-nullable = true'. Ugly, but human mind is a very original, mhm, thing.
Many people argue that giving everyone what they want is impossible, but if you are designing a new language, try new things. Tony Hoare introduced the concept of null in 1965 because he could not resist (his own words), and we have been paying for it ever since (also his own words; the man regrets it). Point is, smart, experienced people make mistakes that cost the rest of us, so don't take anyone's advice on this page as if it were the only truth, including mine. Evaluate and think about it.
I've read many, many rants on how it's us poor, inexperienced programmers who really don't understand where to use null and where not, showing us patterns and antipatterns meant to prevent us from shooting ourselves in the foot. All the while, millions of still-inexperienced programmers produce more code in languages that allow null. I may be inexperienced, but I know which of my objects don't benefit from being nullable.
Here we are, 13 years later, and C# did it.
And, yes, this is the biggest improvement in languages since Barbara and Stephen invented types in 1974:
Programming With Abstract Data Types
Barbara Liskov (Massachusetts Institute of Technology, Project MAC, Cambridge, Massachusetts)
Stephen Zilles (Cambridge Systems Group, IBM Systems Development Division, Cambridge, Massachusetts)
Abstract
The motivation behind the work in very-high-level languages is to ease the programming task by providing the programmer with a language containing primitives or abstractions suitable to his problem area. The programmer is then able to spend his effort in the right place; he concentrates on solving his problem, and the resulting program will be more reliable as a result. Clearly, this is a worthwhile goal. Unfortunately, it is very difficult for a designer to select in advance all the abstractions which the users of his language might need. If a language is to be used at all, it is likely to be used to solve problems which its designer did not envision, and for which the abstractions embedded in the language are not sufficient. This paper presents an approach which allows the set of built-in abstractions to be augmented when the need for a new data abstraction is discovered. This approach to the handling of abstraction is an outgrowth of work on designing a language for structured programming. Relevant aspects of this language are described, and examples of the use and definitions of abstractions are given.
I think null values are good: They are a clear indication that you did something wrong. If you fail to initialize a reference somewhere, you'll get an immediate notice.
The alternative would be that values are sometimes initialized to a default value. Logical errors are then a lot more difficult to detect, unless you put detection logic in those default values. This would be the same as just getting a null pointer exception.