Flexjson and versioning: how accommodating of change is Flexjson?

I'm considering using Flexjson to serialise my business objects to a file in an Android application, simply using new JSONSerializer().deepSerialize(myObject) and new JSONDeserializer().deserialize(jsonString) with all the default transformers and object factories.
I'm hoping that once the application is released, any changes to the business model can be accommodated by writing Flexjson transformers and object factories in the new release to maintain compatibility with previous versions.
What I'm not sure about is what changes the default transformers and object factories can cope with.
I.e. if I add a field to a class and deserialise JSON from an old version (without the field) into the new class, will it fail, or will the new field just be null, or 0 if it's a number? Same question if I remove a field: what happens?
In standard Java serialisation this is all documented here:
http://docs.oracle.com/javase/7/docs/platform/serialization/spec/version.html
But I can't find the equivalent information for Flexjson that deals explicitly with the issues surrounding versioning of objects. Is there any?
Cheers,
Phil.

Flexjson looks at the JSON first to find any fields it contains, and then looks for those fields on the object you are deserializing into. So adding new fields to an object will not cause the deserialization process to fail; the new field will just not be populated from the JSON (i.e. it will retain the value(s) set in the constructor or the field initializers).
If you remove a field from an object in the future Flexjson will simply not deserialize that value into the object because it won't find a setter for it.
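To make the add-a-field case concrete, here's a minimal sketch (the Person class, its fields, and the version-1 JSON are hypothetical names for illustration): JSON written before the email field existed deserializes cleanly, and the new field keeps its initializer value.

import flexjson.JSONDeserializer;

public class Person {
    private String name;
    private String email = "unknown";   // added in version 2; keeps this default

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public String getEmail() { return email; }
    public void setEmail(String email) { this.email = email; }

    public static void main(String[] args) {
        // JSON produced by version 1 of the app, before "email" existed
        String v1Json = "{\"name\":\"Phil\"}";
        Person p = new JSONDeserializer<Person>().deserialize(v1Json, Person.class);
        System.out.println(p.getName() + " / " + p.getEmail());   // Phil / unknown
    }
}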
So you can think of the getter/setter methods as a declaration of what you want out of the JSON. You aren't required to serialize/deserialize all values from the JSON object.
The only part that gets really tricky is if you rename fields or change the type of a field. Renaming a field can be handled by keeping the older setter around and having it internally set the new field. You can mark it private or protected to hide it from the outside, and Flexjson will still use it. Changing the type is much trickier. One option is to keep an older setter with the prior type around (like setFoo(String) alongside setFoo(List)) and adapt to the new type. Another option is to write your own ObjectFactory to translate between the two potential types; this is of course the hardest to do. The last option is simply not to change a field's type without also renaming the field, and then use one of the other methods to translate.
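For the rename case, a hedged sketch (Customer, fullName, and the old name key are hypothetical): version 2 renamed name to fullName but keeps a private setter for the old JSON key, which, per the answer above, Flexjson will still find and invoke.

public class Customer {
    private String fullName;   // was called "name" in version 1

    public String getFullName() { return fullName; }
    public void setFullName(String fullName) { this.fullName = fullName; }

    // Kept only so JSON from version 1 ({"name":"..."}) still deserializes;
    // private keeps it out of the public API, but Flexjson can still use it.
    private void setName(String name) {
        this.fullName = name;
    }
}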

Related

Is it OK to serialize value-based objects if the application never relies on their object identity?

Sonar shows:
Make this value-based field transient so it is not included in the
serialization of this class.
This is flagged as a future bug, for when value-based classes become real value types.
So, if the application never relies on their object identity, can I leave value-based fields non-transient?
For a field of a value-based class to be non-transient, the value-based class must be serializable in the first place, so it's actually a design decision that was not made by you.
If the designers declare a class to be value-based and have it implement Serializable, they are assuming that value-based classes and serialization are compatible and will stay so.
We don't know what the final value type implementation will look like, but the migration path offered by the JRE developers (e.g. when introducing the immutable lists, which are value-based and serializable) should be taken, rather than assuming that there are additional rules and constraints beyond the specification.
After all, there is no reason to assume that Serialization won’t work with value types. It supports primitive values as well and has been adapted in the past too, e.g. when enum support was added. It’s not clear whether it will always store the values then or still support back references like with ordinary objects or perform an entirely different canonicalization, but as long as you don’t rely on the object identity, as was your premise, you’re on the safe side, as either strategy would work with your code.
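As a small concrete illustration (the Invoice class is hypothetical): java.time.LocalDate is documented as a value-based class and also implements Serializable, so the JDK itself already combines the two; the field below is safe exactly as long as the code never relies on the LocalDate instance's identity.

import java.io.Serializable;
import java.time.LocalDate;

public class Invoice implements Serializable {
    private static final long serialVersionUID = 1L;

    // Sonar may suggest making this transient; it serializes fine today,
    // and remains safe as long as nothing compares it with == or locks on it.
    private LocalDate issuedOn = LocalDate.now();
}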

In what cases should public fields be used instead of properties? [duplicate]

Possible Duplicate:
Public Data members vs Getters, Setters
In what cases should public fields be used instead of properties or getter and setter methods (where there is no support for properties)? Where exactly is their use recommended, and why? Or, if it is not recommended, why are they still allowed as a language feature? After all, they break the object-oriented principle of encapsulation that getters and setters uphold.
If you have a constant that needs to be public, you might as well make it a public field instead of creating a getter property for it.
Apart from that, I don't see a need, as far as good OOP principles are concerned.
They are there and allowed because sometimes you need the flexibility.
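For instance, the constant case looks like this in Java (Limits and MAX_RETRIES are hypothetical names):

public class Limits {
    // a public constant; a getter would add nothing here
    public static final int MAX_RETRIES = 3;
}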
That's hard to tell, but in my opinion public fields are only valid when using structs.
struct Simple
{
public int Position;
public bool Exists;
public double LastValue;
};
But different people have different thoughts about it:
http://kristofverbiest.blogspot.com/2007/02/public-fields-and-properties-are-not.html
http://blogs.msdn.com/b/ericgu/archive/2007/02/01/properties-vs-public-fields-redux.aspx
http://www.markhneedham.com/blog/2009/02/04/c-public-fields-vs-automatic-properties/
If your compiler does not optimize getter and setter invocations, access to your properties might be more expensive than reading and writing fields (an extra call on the stack). That might be relevant if you perform many, many invocations.
But, to be honest, I know of no platform where this is true; at least on both .NET and Java this is optimized well.
From a design point of view I know of no case where using fields is recommended...
Cheers
Matthias
Let's first look at the question of why we need accessors (getters/setters) at all: you need them to be able to override the behaviour when assigning or reading a value. You might want to add caching, or return a calculated value instead of a stored one.
Your question can now be reframed as: do I always want this behaviour? I can think of cases where it is not useful at all: structures (what structs were in C). Passing a parameter object, or a class wrapping multiple values to be inserted into a collection, are cases where one actually does not need accessors: the object is merely a container for variables.
There is one single reason(*) to use a getter instead of a public field: lazy evaluation. That is, the value you want may be stored in a database, or may take long to compute, and you don't want your program to initialize it at startup, but only when it is needed.
There is one single reason(*) to use a setter instead of a public field: modifying other fields. That is, you change the value of other fields when the value of the target field changes (both cases are sketched below).
Forcing get and set on every field contradicts the YAGNI principle.
If you want to expose the value of a field from an object, then expose it! It is completely pointless to create an object with four independent fields and mandate that all of them use get/set or property access.
*: Other reasons, such as a possible data type change, are pointless. In fact, wherever you use a = o.get_value() instead of a = o.value, if you change the type returned by get_value() you have to change every use anyway, just as if you had changed the type of value.
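A sketch of both cases (the Report class and its fields are hypothetical): the getter defers an expensive computation until first use, and the setter keeps a dependent field consistent.

public class Report {
    private String body;        // expensive to build, so built lazily
    private int pageCount;
    private int lastPageIndex;

    public String getBody() {
        if (body == null) {
            body = buildBody();   // computed on first access, not at startup
        }
        return body;
    }

    public void setPageCount(int pageCount) {
        this.pageCount = pageCount;
        this.lastPageIndex = pageCount - 1;   // dependent field kept in sync
    }

    private String buildBody() {
        return "...";             // stand-in for the slow computation
    }
}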
The main reason has nothing to do with OOP encapsulation (though people often say it does), and everything to do with versioning.
Indeed, from the OOP position one could argue that fields are better than "blind" properties, as a lack of encapsulation is clearer than something that pretends to offer encapsulation and then throws it away. If encapsulation is important, then it should be good to see when it isn't there.
A property called Foo will not be treated the same from the outside as a public field called Foo. In some languages this is explicit (the language doesn't directly support properties, so you've got a getFoo and a setFoo) and in some it is implicit (C# and VB.NET directly support properties, but they are not binary-compatible with fields and code compiled to use a field will break if it's changed to a property, and vice-versa).
If your Foo just does a "blind" read and write of an underlying field, then there is currently no encapsulation advantage to this over exposing the field.
However, if there is a later requirement to take advantage of encapsulation, for example to prevent invalid values (you should always prevent invalid values, but maybe you didn't realise some were invalid when you first wrote the class, or maybe "valid" has changed with a scope change), to wrap memoised evaluation, to trigger other changes in the object, to trigger an on-change event, or to prevent expensive needless equivalent sets, then you can't make that change without breaking running code.
If the class is internal to the component in question, this isn't a concern, and I'd say use fields if fields read sensibly, under the general YAGNI principle. However, YAGNI doesn't play quite so well across component boundaries (if I need my component to work today, I'll almost certainly need it to keep working tomorrow after you've changed the component mine depends on), so it can make sense to pre-emptively use properties.

Redefining instance variables of a Smalltalk class

I've never used Smalltalk, but I've read a lot about it and it has always intrigued me. I've seen the cool demos where a program is running and, simply by changing the methods of the classes its objects use, you alter the running program's behavior. It's clearly powerful stuff, and I understand how that can work the way it does. What I can't seem to nail down for certain is what happens to the existing instances of a class when you want to add, remove, or rename instance variables of that class.
I can't imagine how one can alter the instance variables that all the classes are using in a running program and still expect the existing instances of that class to function correctly afterward. Perhaps I'm adding a new instance variable that I need to have initialized and where previously existing methods have been altered to depend on this variable. Couldn't I end up with a horrible malfunction of any running code that has live instances of that class? Or what if the meaning of an instance variable has changed and I now expect a different kind of object to be stored there than was previously? Is there some kind of "upgrade" mechanism? Or is the usual practice to just let the previous instances crash and burn? Or is this simply a case of "we don't do that sort of thing on running programs and expect them to survive?"
The only reasonably clean approach I can think of is that when you alter the instance variable definitions, perhaps it actually creates an entirely new class, and the old instances, from before the change, continue to function just fine with the old class definition (which is now inaccessible by name, since the name was rebound to the new class definition). Perhaps that is the most logical explanation - but since I haven't found anything that directly explains this process, I figured I'd ask here and see what kind of fun information it gets me. :)
According to this paper, it is like you said:
It also automatically manages class redefinition, guaranteeing system consistency in terms of object structures and preventing name conflicts, especially instance variable name conflicts. When a class definition changes, existing instances must be structurally modified in order to match the definition of their new class. Instead of modifying an existing object, the ClassBuilder creates a new one with the correct structure (i.e., from the new class that replaces the old one). It then fills this new object with the values of the old one. The ClassBuilder uses the become: primitive (cf 2.1.1) to proceed with the structural modifications, by replacing the old objects with the new ones throughout the entire system.

Serialization of Objects

How does serialization of objects work? How does an object get deserialized, with an instance created from the serialized data without a call to any constructor?
I've kept this answer language agnostic since a language wasn't given.
When the object is serialized, all the required information to rebuild it is encoded in a way that can be retrieved later. This typically includes the type of the object, as well as the values of all the instance variables.
When the object is deserialized, an area in memory of the correct size is allocated and is populated using the serialized information such that the new object is identical to the serialized one.
The running program can then refer to this new object in memory without having to actually call the constructor.
There are lots of little details which this doesn't explain, but this is the general idea of serialization/deserialization.
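Since the question doesn't name a language, here is a minimal Java sketch of that general idea (Point and RoundTrip are hypothetical names): the built-in object streams encode the type and field values, and rebuild the instance without invoking Point's constructor.

import java.io.*;

class Point implements Serializable {
    private static final long serialVersionUID = 1L;
    int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }
}

public class RoundTrip {
    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new Point(3, 4));   // type + field values get encoded
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            // rebuilt from the serialized data; Point's constructor is not called
            Point copy = (Point) in.readObject();
            System.out.println(copy.x + "," + copy.y);   // prints 3,4
        }
    }
}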
Are you talking about Java? If so, serialization is an extralingual object creation mechanism. It's a backdoor that uses native code to create the object without calling any constructors. Therefore, when designing a class for serializability, you need to make sure that a class created through deserialization maintains the same invariants (key fields being initialized) as you would through the constructor path. A third way to create objects in Java is through cloning, and similar issues apply.
Cloning and serialization don't interact well with the use of final fields if you need to set the value of that field to something different than what is returned by clone or the deserialization process.
Josh Bloch's "Effective Java" has some chapters that explain these issues in more depth.
(this answer may apply to other languages too, but I've only used serialization in Java)
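To make the final-field caveat concrete, here is a hedged Java sketch (Connection and its members are hypothetical): defining readResolve() lets the class swap the deserialized instance for one built through the constructor, so final fields end up with valid values.

import java.io.Serializable;

public class Connection implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String url;
    private final transient Object socket;   // cannot be restored by deserialization

    public Connection(String url) {
        this.url = url;
        this.socket = openSocket(url);        // final field assigned exactly once, here
    }

    // Called by the serialization machinery after deserialization; the returned
    // object replaces the half-initialized one, so both final fields are valid.
    private Object readResolve() {
        return new Connection(url);
    }

    private static Object openSocket(String url) {
        return new Object();                  // stand-in for a real resource
    }
}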
Regarding .NET: this isn't a definitive or textbook answer, and I might be all-out wrong...
.NET serialization needs to be separated out into binary vs. others (XML, or an XML derivative, typically). Binary serialization is mostly a black box to me, but it allows the object to be serialized and restored in its current state. XML serialization typically only serializes the public fields/properties of an object, unless overridden by adding a custom ISerializable implementation.
In the case of XML serialization I believe .NET uses reflection to determine which fields and properties get converted to their equivalent elements. Adding an [XmlSerializable] attribute implements a default behavior which can be adjusted by applying other attributes at the field level (such as [XmlAttribute]).
The metadata (which Reflection depends on) stores all the object members as well as their attributes and addresses, which allows the serializer to determine how it should build the output.

Will the VB.Net Serializers execute code in public members?

We wish to use the Binary Formatter. In debugging, thus far, it seems that it does not execute the getters for public properties. Does the XML Serializer behave the same way? Also, during deserialization, will the deserializers use the setters to apply the values during deserialization?
Thus far, our testing with BinaryFormatter shows that it simply writes directly to and from member variables. It does not step through any of the getters or setters. Is the XML Serializer the same way?
What if a public property did something silly like Random().Next? Will this be serialized by the Binary Formatter? It seems that with the XML Serializer, you would need to decorate this member appropriately to get it to participate. The Binary Formatter seems to only work, again, on member variables.
Thanks.
You need both a getter and a setter, or the property will not be serialized by the XML Serializer. The reason for this is that the serializer assumes it won't be able to set the value on deserialization, so transporting it would be wasteful.
You can even have an empty setter and it will work.
I just ran a quick test using the XML Serializer. To answer your question: yes, it does use the getters during serialization, and it does use the setters during deserialization.
EDIT
Found this in the docs:
This example uses a binary formatter to do the serialization. All you need to do is create an instance of the stream and the formatter you intend to use, and then call the Serialize method on the formatter. The stream and the object to serialize are provided as parameters to this call. Although it is not explicitly demonstrated in this example, all member variables of a class will be serialized—even variables marked as private. In this aspect, binary serialization differs from the XMLSerializer Class, which only serializes public fields. For information on excluding member variables from binary serialization, see Selective Serialization.
Think about it this way:
Let's say you're deserializing a class with several serialized properties, and the setter for the last property has side effects that can alter the values of another property. The altered property no longer reflects your serialized data. Do you really want it to use that setter?
On the other hand, what if there is no backing store for a property? Perhaps it's a composite property allowing you to get and set values of all the others at once. Arguably this property shouldn't be serialized (or only this property should be serialized, depending on how things work), but there could be other examples. How does the formatter know where to assign the value for such a property?
So which is it? I had to look it up and couldn't quickly find an authoritative source, but it looks like the XmlSerializer does use getters and setters, while the BinaryFormatter does not.
And that kind of makes sense. My first point showed that you don't really want to use getters/setters; my second point showed that you may have to. The binary formatter can just take the exact in-memory representation of the object, so it skips the getters/setters. The XmlSerializer, which doesn't have this ability, has to use the other method.
You should probably set up a quick test project for yourself so you can see it in action.