Is it OK to serialize value-based objects if the application never relies on their object identity?

Sonar shows
Make this value-based field transient so it is not included in the
serialization of this class.
This is flagged as a future bug for when value-based classes are released.
So, if the application never relies on object identity, can I make value-based fields non-transient?

To make a field of a value-based class non-transient, the value-based class must be serializable. So it's actually a design decision not made by you.
If the designers declare a class to be value-based and have it implement Serializable, they assume that value-based classes and Serialization are compatible and will stay so.
We don't know what the final value type implementation will look like, but the migration path offered by the JRE developers, e.g. with the immutable lists, which are both value-based and serializable, should be taken, rather than assuming that there are additional rules and constraints beyond the specification.
After all, there is no reason to assume that Serialization won't work with value types. It supports primitive values as well and has been adapted in the past, e.g. when enum support was added. It's not clear whether it will then always store the values, still support back references like with ordinary objects, or perform an entirely different canonicalization, but as long as you don't rely on the object identity, as was your premise, you're on the safe side: either strategy would work with your code.
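To make that premise concrete, here is a minimal sketch using java.time.Duration, which is documented as value-based and implements Serializable; the Timeout class is invented for illustration:

import java.io.*;
import java.time.Duration;

class Timeout implements Serializable {
    private static final long serialVersionUID = 1L;
    final Duration limit; // value-based, deliberately non-transient
    Timeout(Duration limit) { this.limit = limit; }
}

public class ValueBasedRoundTrip {
    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(new Timeout(Duration.ofSeconds(30)));
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buffer.toByteArray()))) {
            Timeout copy = (Timeout) in.readObject();
            // Compare with equals(), never ==; the identity of value-based
            // instances is unspecified and must not be relied upon.
            System.out.println(copy.limit.equals(Duration.ofSeconds(30))); // true
        }
    }
}

The only rule the code has to respect is the one stated above: compare the deserialized value with equals(), not ==.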


Why wouldn't I make every eligible Kotlin class a data class?

I'm of course excluding any reasons that involve violating the rules for what can be a data class. So if you know you won't need to inherit from it for example (although it's my understanding that rule is going away in Kotlin 1.1).
Are there any disadvantages to making a class a data class?
Why don't all eligible classes provide the functionality of a data class as long as they remain eligible? This should all be detectable by the compiler without needing a special keyword. Of course the answer to this might be obvious depending on the answer to question 1.
Is there any reason for me not to mark all of my eligible classes as data classes?
The data modifier makes Kotlin generate common methods like toString, hashCode and equals for the most common (80%) scenarios, based on the primary constructor.
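As a minimal sketch of what that generation buys you (the User class is invented for illustration):

data class User(val name: String, val age: Int)

fun main() {
    val a = User("Ada", 36)
    println(a)                     // User(name=Ada, age=36) - generated toString()
    println(a == User("Ada", 36))  // true - generated structural equals()
    val b = a.copy(age = 37)       // generated copy()
    val (name, age) = b            // generated component1()/component2()
    println("$name is now $age")
}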
There are three reasons why only a few classes should be data classes:
Most non-data classes have a mix of properties defined in the primary constructor and in the body of the class. Also, the primary constructor often has parameters that are not properties (but help initialise more complex properties in the body). In other words, data has very restrictive requirements which are rarely met by regular classes.
In addition to point 1, making a class data may hurt its extensibility. Even if the layout of the class in question conforms to the rules of data classes, someone may later want to add another property in the body of the class. In that case they will have to manually override hashCode, because the generated one only considers primary-constructor properties and it may already be used somewhere.
Marking a class data sends a message to whoever reads the code that you intend to use this class as a data carrier. Marking other classes data would be misleading.
Because of one of the fundamental principles of OO programming: encapsulation.
By design we deliberately limit the ways in which other code can interact with our modules. This gives us maintainability (more powerful refactoring) and readability.

Is it acceptable to provide an API that is undefined a large part of the time?

Given some type as follows:
// Return types are illustrative; the original sketch omitted them.
abstract class Thing {
    abstract String getInfo();
    abstract boolean isRemoteThing();
    // Defined only when isRemoteThing() returns true.
    abstract String getRemoteLocation();
}
The getRemoteLocation() method only has a defined result if isRemoteThing() returns true. Given that most Things are not remote, is this an acceptable API? The other option I see is to provide a RemoteThing subclass, but then the user needs a way to cast a Thing to a RemoteThing if necessary, which just seems to add a level of indirection to the problem.
Having an interface include members which are usable on some objects that implement the interface but not all of them, along with a query method to say which members will be useful, is a good pattern in cases where something is gained by it.
Examples of reasons where it can be useful:
If it's likely that an interface member will be useful on some instances of a type but not on others, this pattern may be the only one that makes sense.
If it's likely that a consumer may hold references to a variety of objects implementing the interface, some of which support a particular member and some of which do not, and it's likely that someone with such a collection would want to use the member on those instances which support it, such usage will be more convenient if all objects implement an interface including the member than if some do and some don't. This is especially true for interface members like IDisposable.Dispose, whose purpose is to notify the implementation of something it may or may not care about (e.g. that nobody needs it anymore and it may be abandoned without further notice) and ask it to do whatever it needs to as a consequence (in many cases nothing). Blindly calling Dispose on an IEnumerator<T> is faster than checking whether the implementation also implements IDisposable. The unconditional call is not only faster than checking for IDisposable and then calling it; it's faster than checking whether an object implements IDisposable and finding out that it doesn't.
In some cases, a consumer may use a field to hold different kinds of things at different times. As an example, it may be useful to have a field which at some times will hold the only extant reference to a mutable object, and at other times will hold a possibly-shared reference to an immutable object. If the type of the field includes mutating methods (which may or may not work) as well as a means of creating a new mutable instance with data copied from an immutable one, code which receives an object and might want to mutate the data can store a reference to the passed-in object. If and when it wants to mutate the data, it can overwrite the field with a reference to a mutable copy; if it never ends up having to mutate the data, however, it can simply use the passed-in immutable object and never bother copying it.
The biggest disadvantage of having interfaces include members that aren't always useful is that it imposes more work on the implementers. Thus, people writing interfaces should only include members whose existence could significantly benefit at least some consumers of almost every class implementing the interface.
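As a hedged sketch of the query-then-use pattern this answer describes, in Java and with an invented Channel interface (none of these names come from a real library):

interface Channel {
    void send(String message);
    boolean supportsCompression();   // the capability query
    void enableCompression();        // only meaningful if the query returns true
}

class ChannelSetup {
    static void configure(Channel channel) {
        if (channel.supportsCompression()) {
            channel.enableCompression(); // safe: capability confirmed first
        }
        channel.send("hello");
    }
}

A consumer holding a mixed collection of Channels can configure all of them in one loop, which is the convenience described above.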
Why should this not be acceptable? It should, however, be clearly documented. If you look at the .NET class libraries or the JDK, there are collection interfaces defining methods to add or delete items, yet there are unmodifiable classes implementing these interfaces. It is a good idea in this case, as you did, to provide a method to query the object for its capabilities, as this helps you avoid exceptions when a method is not appropriate.
OTOH, if this is an API, it might be more appropriate to use an interface than a class.
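For the JDK collections case mentioned above, a small demo; note that the JDK offers no built-in capability query here, which is exactly why providing one, as the asker did, avoids surprises:

import java.util.*;

public class UnmodifiableDemo {
    public static void main(String[] args) {
        List<String> names = Collections.unmodifiableList(
                new ArrayList<>(Arrays.asList("alice", "bob")));
        try {
            names.add("carol"); // declared by the List interface...
        } catch (UnsupportedOperationException e) {
            System.out.println("...but not supported by this instance");
        }
    }
}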

Flexjson and versioning: how accommodating of change is Flexjson?

I'm considering using flexjson to serialise my business objects to a file in an Android application, simply using new JSONSerializer().deepSerialize(myObject) and new JSONDeserializer().deserialize(jsonString) with all the default transformers and object factories.
I'm hoping that once the application is released, any changes to the business model can be accommodated by writing flexjson transformers and object factories in the new release to maintain compatibility with previous versions.
What I'm not sure about is what changes the default transformers and object factories can cope with.
I.e. if I add a field to a class and deserialise an old version (without the field) into the new class, will it fail, or will the new field be null or 0 (if a number)? Same question if I remove a field: what happens?
In standard Java serialisation this is all documented here:
http://docs.oracle.com/javase/7/docs/platform/serialization/spec/version.html
But I can't find the equivalent information for flexjson that deals explicitly with the issues surrounding versioning of objects. Is there any?
Cheers,
Phil.
Flexjson will look at the JSON first to find any fields it contains, and then look for those fields on the Object you are deserializing into. So adding new fields to an object will not cause the deserialization process to fail. The new field will just not be populated from the JSON object (i.e. it will retain the value(s) set in the constructor or the field initializers).
If you remove a field from an object in the future Flexjson will simply not deserialize that value into the object because it won't find a setter for it.
So you can think about the getter/setter functions as a declaration on the JSON of what you want out of it. You aren't required to serialize/deserialize all values from the JSON object.
The only part that gets really tricky is if you rename fields, or change types on a field. Renaming a field can be handled by keeping the older setter around and internally setting the new field in that older setter. You can mark it private or protected to hide it from the outside and Flexjson will still use it. If you change the type, it is much more tricky. One option is to keep the older setter with the prior type around (like setFoo(String) and setFoo(List)) and adapt to the new type. Another option is to write your own ObjectFactory to translate between the two potential types; this of course is the hardest to do. The last option is: don't change a field's type without also changing the name of the field, and use one of the other methods to translate.
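A minimal sketch of the renaming trick, assuming an invented Customer class whose name field was renamed to fullName between releases (Flexjson using private setters is per the description above):

import flexjson.JSONDeserializer;

public class Customer {
    private String fullName; // the field's new name

    public String getFullName() { return fullName; }
    public void setFullName(String fullName) { this.fullName = fullName; }

    // Older releases wrote {"name": "..."}; this routes it into the new field.
    private void setName(String name) { this.fullName = name; }

    public static Customer fromJson(String json) {
        return new JSONDeserializer<Customer>().deserialize(json, Customer.class);
    }
}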

In what cases should public fields be used instead of properties? [duplicate]

Possible Duplicate:
Public Data members vs Getters, Setters
In what cases should public fields be used, instead of properties or getter and setter methods (where there is no support for properties)? Where exactly is their use recommended, and why, or, if it is not, why are they still allowed as a language feature? After all, they break the Object-Oriented principle of encapsulation where getters and setters are allowed and encouraged.
If you have a constant that needs to be public, you might as well make it a public field instead of creating a getter property for it.
Apart from that, I don't see a need, as far as good OOP principles are concerned.
They are there and allowed because sometimes you need the flexibility.
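For the constant case above, a minimal sketch (the names and value are illustrative):

public final class Limits {
    public static final int MAX_RETRIES = 5; // a getter would add nothing here

    private Limits() {} // no instances needed
}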
That's hard to tell, but in my opinion public fields are only valid when using structs.
struct Simple
{
public int Position;
public bool Exists;
public double LastValue;
};
But different people have different thoughts about it:
http://kristofverbiest.blogspot.com/2007/02/public-fields-and-properties-are-not.html
http://blogs.msdn.com/b/ericgu/archive/2007/02/01/properties-vs-public-fields-redux.aspx
http://www.markhneedham.com/blog/2009/02/04/c-public-fields-vs-automatic-properties/
If your compiler does not optimize getter and setter invocations, the access to your properties might be more expensive than reading and writing fields (call stack). That might be relevant if you perform many, many invocations.
But, to be honest, I know no language where this is true. At least in both .NET and Java this is optimized well.
From a design point of view I know no case where using fields is recommended...
Cheers
Matthias
Let's first look at the question of why we need accessors (getters/setters). You need them to be able to override the behaviour when assigning or reading a value: you might want to add caching, or return a calculated value instead of a stored field.
Your question can now be rephrased as: do I always want this behaviour? I can think of cases where it is not useful at all: structures (what structs were in C). Passing a parameter object, or a class wrapping multiple values to be inserted into a Collection, are cases where one actually does not need accessors: the object is merely a container for variables.
There is one single reason(*) to use get instead of a public field: lazy evaluation. I.e. the value you want may be stored in a database, or may take long to compute, and you don't want your program to initialize it at startup, but only when needed.
There is one single reason(*) to use set instead of a public field: modifications to other fields. I.e. you change the value of other fields when the value of the target field changes.
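A sketch covering both reasons, with an invented Report class:

class Report {
    private String body; // expensive, so not built eagerly

    // Reason 1, lazy evaluation: computed on first access, not at startup.
    public String getBody() {
        if (body == null) {
            body = loadFromDatabase(); // hypothetical expensive call
        }
        return body;
    }

    private int pageCount;
    private boolean paginated;

    // Reason 2, a setter that keeps a second field consistent with the first.
    public void setPageCount(int pageCount) {
        this.pageCount = pageCount;
        this.paginated = pageCount > 1;
    }

    private String loadFromDatabase() { return "..."; }
}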
Forcing the use of get and set on every field contradicts the YAGNI principle.
If you want to expose the value of a field from an object, then expose it! It is completely pointless to create an object with four independent fields and mandate that all of them use get/set or property access.
*: Other reasons, such as a possible data type change, are pointless. In fact, wherever you use a = o.get_value() instead of a = o.value, if you change the type returned by get_value() you have to change every use, just as if you had changed the type of value.
The main reason is nothing to do with OOP encapsulation (though people often say it is), and everything to do with versioning.
Indeed, from the OOP position one could argue that fields are better than "blind" properties, as a lack of encapsulation is clearer than something that pretends to encapsulate and then blows it away. If encapsulation is important, then it should be good to see when it isn't there.
A property called Foo will not be treated the same from the outside as a public field called Foo. In some languages this is explicit (the language doesn't directly support properties, so you've got a getFoo and a setFoo) and in some it is implicit (C# and VB.NET directly support properties, but they are not binary-compatible with fields and code compiled to use a field will break if it's changed to a property, and vice-versa).
If your Foo just does a "blind" get and set of an underlying field, then there is currently no encapsulation advantage to this over exposing the field.
However, if there is a later requirement to take advantage of encapsulation to prevent invalid values (you should always prevent invalid values, but maybe you didn't realise some where invalid when you first wrote the class, or maybe "valid" has changed with a scope change), to wrap memoised evaluation, to trigger other changes in the object, to trigger an on-change event, to prevent expensive needless equivalent sets, and so on, then you can't make that change without breaking running code.
If the class is internal to the component in question, this isn't a concern, and I'd say use fields if fields read sensibly, under the general YAGNI principle. However, YAGNI doesn't play quite so well across component boundaries (if I need my component to work today, I'll probably also need it to work tomorrow, after you've changed the component mine depends on), so it can make sense to pre-emptively use properties.

Serialization of Objects

How does serialization of objects work? How does an object get deserialized, and how is an instance created from serialized data without a call to any constructor?
I've kept this answer language agnostic since a language wasn't given.
When the object is serialized, all the required information to rebuild it is encoded in a way that can be retrieved. This typically includes the type of the object, as well as the values of all the instance variables.
When the object is deserialized, an area in memory of the correct size is allocated and is populated using the serialized information such that the new object is identical to the serialized one.
The running program can then refer to this new object in memory without having to actually call the constructor.
There are lots of little details which this doesn't explain, but this is the general idea of serialization/deserialization.
Are you talking about Java? If so, serialization is an extralingual object creation mechanism. It's a backdoor that uses native code to create the object without calling any constructors. Therefore, when designing a class for serializability, you need to make sure that a class created through deserialization maintains the same invariants (key fields being initialized) as you would through the constructor path. A third way to create objects in Java is through cloning, and similar issues apply.
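A small demo of the "no constructor call" point, using plain Java serialization (the Probe class is invented):

import java.io.*;

class Probe implements Serializable {
    private static final long serialVersionUID = 1L;
    Probe() { System.out.println("constructor ran"); }
}

public class DeserializationDemo {
    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(new Probe()); // prints "constructor ran"
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buffer.toByteArray()))) {
            in.readObject(); // prints nothing: Probe's constructor is skipped
        }
    }
}

(Strictly, deserialization runs the no-argument constructor of the first non-serializable superclass, here Object, which is why Probe's own constructor never fires.)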
Cloning and serialization don't interact well with the use of final fields if you need to set the value of that field to something different than what is returned by clone or the deserialization process.
Josh Bloch's "Effective Java" has some chapters that explain these issues in more depth.
(this answer may apply to other languages too, but I've only used serialization in Java)
Regarding .NET: this isn't a definitive or textbook answer, and I might be all-out wrong...
.NET serialization needs to be separated out into binary vs. others (XML or an XML derivative, typically). Binary serialization is mostly a black box to me, but it allows objects to be serialized and restored in their current state. XML serialization typically only serializes the public fields/properties of an object, unless overridden by adding a custom IXmlSerializable implementation.
In the case of XML serialization I believe .NET uses Reflection to determine which fields and properties get converted to their equivalent elements. The default behaviour can be adjusted by applying attributes at the member level (such as [XmlElement] or [XmlAttribute]).
The metadata (which Reflection depends on) stores all the object members as well as their attributes and addresses, which allows the serializer to determine how it should build the output.