Serialization of Objects - serialization

How does serialization of objects work? How does an object get deserialized, and an instance created from the serialized data, without a call to any constructor?

I've kept this answer language agnostic since a language wasn't given.
When the object is serialized, all the information required to rebuild it is encoded in a way that can be retrieved later. This typically includes the type of the object, as well as the values of all the instance variables.
When the object is deserialized, an area in memory of the correct size is allocated and is populated using the serialized information such that the new object is identical to the serialized one.
The running program can then refer to this new object in memory without having to actually call the constructor.
There are lots of little details which this doesn't explain, but this is the general idea of serialization/deserialization.
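As one concrete illustration of the idea above, here is a minimal sketch in Java (chosen only as an example language). The Point class, the RoundTripDemo helper and the constructorCalls counter are hypothetical names made up for this demo. It writes an object to a byte array, reads it back, and checks that the constructor ran only once, for the original object.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical example class; the static counter is not part of the serialized state.
class Point implements Serializable {
    private static final long serialVersionUID = 1L;
    static int constructorCalls = 0;
    final int x;
    final int y;

    Point(int x, int y) {
        constructorCalls++;
        this.x = x;
        this.y = y;
    }
}

class RoundTripDemo {
    // Encodes the object's type and field values into a byte array.
    static byte[] serialize(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Rebuilds an object from the bytes without running Point's constructor.
    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Point original = new Point(3, 4);
        Point copy = (Point) deserialize(serialize(original));
        System.out.println(copy.x + "," + copy.y);   // prints 3,4
        System.out.println(copy != original);        // prints true: a new object in memory
        System.out.println(Point.constructorCalls);  // prints 1: only the original was constructed
    }
}
```

Other languages and libraries do this differently, so treat the sketch as one possible realization rather than the general rule.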

Are you talking about Java? If so, serialization is an extralinguistic object creation mechanism. It's a backdoor that uses native code to create the object without calling any of the serializable class's constructors (only the no-arg constructor of the nearest non-serializable superclass runs). Therefore, when designing a class for serializability, you need to make sure that an object created through deserialization maintains the same invariants (key fields being initialized) as one created through the constructor path. A third way to create objects in Java is through cloning, and similar issues apply.
Cloning and serialization don't interact well with the use of final fields if you need to set the value of that field to something different than what is returned by clone or the deserialization process.
Josh Bloch's "Effective Java" has some chapters that explain these issues in more depth.
(this answer may apply to other languages too, but I've only used serialization in Java)
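One common way to keep the invariants on the deserialization path is a private readObject method that re-checks them. The sketch below assumes a hypothetical Quantity class whose only invariant is count >= 0. To simulate a hostile or corrupted stream, the demo overwrites one byte of the serialized form; that trick relies on the int field being the last four bytes of the stream, which holds for a simple class like this one but is an assumption of the demo, not something to depend on in general.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical class with one invariant: count is never negative.
class Quantity implements Serializable {
    private static final long serialVersionUID = 1L;
    private final int count;

    Quantity(int count) {
        if (count < 0) throw new IllegalArgumentException("count must be >= 0");
        this.count = count;
    }

    int count() { return count; }

    // Called by ObjectInputStream instead of a constructor; repeat the check here.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        if (count < 0) throw new InvalidObjectException("count must be >= 0");
    }
}

class InvariantDemo {
    static byte[] serialize(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns true if the bytes deserialize, false if readObject rejects them.
    static boolean accepts(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            in.readObject();
            return true;
        } catch (InvalidObjectException e) {
            return false;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] good = serialize(new Quantity(7));
        byte[] bad = good.clone();
        bad[bad.length - 4] = (byte) 0x80; // sets the sign bit of the stored int (demo assumption)
        System.out.println(accepts(good)); // prints true
        System.out.println(accepts(bad));  // prints false
    }
}
```

Without the readObject check, the tampered stream would produce a Quantity with a negative count even though the constructor forbids it.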

Regarding .NET: this isn't a definitive or textbook answer, and I might be all-out wrong...
.NET serialization needs to be separated into binary vs. others (typically XML or an XML derivative). Binary serialization is mostly a black box to me, but it allows objects to be serialized and restored in their current state. XML serialization typically serializes only the public fields/properties of an object, unless overridden by adding a custom IXmlSerializable implementation.
In the case of XML serialization I believe .NET uses Reflection to determine which fields and properties get converted to their equivalent elements. XmlSerializer applies a default behavior that can be adjusted by applying attributes at the class level (such as [XmlRoot]) and at the field level (such as [XmlElement] or [XmlAttribute]).
The metadata (which Reflection depends on) stores all the object members as well as their attributes and addresses, which allows the serializer to determine how it should build the output.
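The reflection idea can be sketched in a few lines. The example below is in Java rather than .NET, and it is not how XmlSerializer is implemented; Person and ReflectiveXml are hypothetical names. It only shows the general technique of walking an object's public fields through reflection metadata and emitting one element per field, skipping private members.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Hypothetical data class: two public fields and one private field.
class Person {
    public String name = "Ada";
    public int age = 36;
    private String secret = "hidden";
}

class ReflectiveXml {
    // Minimal escaping of characters that are special in XML text.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Emits <ClassName><field>value</field>...</ClassName> for public instance fields.
    static String toXml(Object o) {
        String root = o.getClass().getSimpleName();
        StringBuilder sb = new StringBuilder();
        sb.append('<').append(root).append('>');
        for (Field f : o.getClass().getFields()) { // getFields() returns public fields only
            if (Modifier.isStatic(f.getModifiers())) continue;
            try {
                sb.append('<').append(f.getName()).append('>')
                  .append(escape(String.valueOf(f.get(o))))
                  .append("</").append(f.getName()).append('>');
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        sb.append("</").append(root).append('>');
        return sb.toString();
    }

    public static void main(String[] args) {
        // The private field "secret" does not appear in the output.
        System.out.println(toXml(new Person()));
    }
}
```

A real serializer would also handle nested objects, collections, and attribute-driven naming, all of which this sketch leaves out.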

Related

Is it ok to Serialize Value based objects if the application never relies on its object identity?

Sonar shows
Make this value-based field transient so it is not included in the
serialization of this class.
This is a future-proof bug when value-based class will be released.
So, if the application never relies on its object identity can I make value-based objects non-transient?
To make a field of a value-based class non-transient, the value-based class must be serializable. So it’s actually a design decision not made by you.
If the designer declares a class to be value-based and makes it implement Serializable, they assume that value-based classes and Serialization are compatible and will stay so.
We don’t know what the final value type implementation will look like, but the migration path offered by the JRE developers (e.g. the immutable lists they introduced, which are both value-based and serializable) should be followed, rather than assuming that there are additional rules and constraints beyond the specification.
After all, there is no reason to assume that Serialization won’t work with value types. It supports primitive values as well and has been adapted in the past too, e.g. when enum support was added. It’s not clear whether it will always store the values then or still support back references like with ordinary objects or perform an entirely different canonicalization, but as long as you don’t rely on the object identity, as was your premise, you’re on the safe side, as either strategy would work with your code.
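A small sketch of what "not relying on object identity" looks like in practice, assuming a hypothetical Booking class with a non-transient java.time.LocalDate field (LocalDate is documented as value-based and implements Serializable). The code compares the restored value with equals only and never uses == on it, so it should keep working whichever strategy a future serialization mechanism picks.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.time.LocalDate;

// Hypothetical class holding a value-based, serializable field.
class Booking implements Serializable {
    private static final long serialVersionUID = 1L;
    private final LocalDate day; // deliberately not transient

    Booking(LocalDate day) { this.day = day; }

    LocalDate day() { return day; }
}

class ValueBasedDemo {
    // Serializes and immediately deserializes a Booking in memory.
    static Booking roundTrip(Booking b) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(b);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Booking) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Booking original = new Booking(LocalDate.of(2024, 2, 29));
        Booking copy = roundTrip(original);
        // Compare by value; identity of the LocalDate instance is never inspected.
        System.out.println(copy.day().equals(original.day())); // prints true
    }
}
```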

Is the format of the data held in kotlin.Metadata documented anywhere?

I'm interested to know what data is held in the Metadata annotation added to each Kotlin class.
But most fields give no more detail than
"Metadata in a custom format. The format may be different (or even absent) for different kinds."
https://github.com/JetBrains/kotlin/blob/master/libraries/stdlib/jvm/runtime/kotlin/Metadata.kt
Is there a reference somewhere that explains how to interpret this data?
kotlin.Metadata contains information about Kotlin symbols, such as their names, signatures, relations between types, etc. Some of this information is already present in the JVM signatures in the class files, but a lot is not, since there's quite a few Kotlin-specific things which JVM class files cannot represent properly: type nullability, mutable/read-only collection interfaces, declaration-site variance, and others.
No specific actions were taken to make the schema of the data encoded in this annotation public, because for most users such data is needed to introspect a program at runtime, and the Kotlin reflection library provides a nice API for that.
If you need to inspect Kotlin-specific stuff which is not exposed via the reflection API, or you're just generally curious what else is stored in that annotation, you can take a look at the implementation of kotlinx.reflect.lite. It's a light-weight library, the core of which is the protobuf-generated schema parser. There's not much supported there at the moment, but there are schemas available which you can use to read any other data you need.
UPD (August 2018): since this was answered, we've published a new (experimental and unstable) library, which is designed to be the intended way for reading and modifying the metadata: https://discuss.kotlinlang.org/t/announcing-kotlinx-metadata-jvm-library-for-reading-modifying-metadata-of-kotlin-jvm-class-files/7980

Write/read class objects to/from file, D-Lang

I'm trying to write/read a class object from/to a file.
I'm new to D and I just want to play a little bit around with it.
Is there a Class/Function to write/read an object to/from a file?
I'm looking for something similar to the ObjectOutputStream class in Java.
Or do I have to serialize (concatenate) the object's variables as strings in the file?
I have a Movie class and a MovieManager class, which contains a dynamic movie-array.
A Movie object contains just a few strings and integer values.
Extending the answer provided in the comment: it is worth stating explicitly that D does not provide "one true way" of reading/writing objects to/from files, as there can't be a single optimal one. Different considerations about speed, the resulting file format, the handling of references and similar corner cases may result in different serialization strategies.
That being said, most likely a proper serialization library is needed and, by lucky chance, one of the most mature D solutions ("Orange" by Jacob Carlborg, https://github.com/jacob-carlborg/orange) is being reviewed right now as a candidate for inclusion into the standard library as std.serialization: newsgroup thread. It may be your best bet.
The library Unmanaged provides a serialization system. You also have Orange, which is less restrictive, as Unmanaged serialization only works if the object to serialize is a descendant of one of the framework's base classes. But Unmanaged works on the "accessor" principle: serialized data is read via a getter method and deserialized data is set via a setter method, which allows some state to be updated when the deserializer calls back, for example...

Linq to Xml VS XmlSerializer VS DataContractSerializer

In my web method, I get an object of some third party C# entity class. The entity class is nothing but the DataContract. This entity class is quite complex and has properties of various types, some properties are collections too. Of course, those linked types are also DataContracts.
I want to serialize that DataContract entity into XML as part of business logic of my web service. I cannot use DataContractSerializer directly (on the object I receive in the web method) simply because the XML schema is altogether different. So the XML generated by DataContractSerializer will not get validated against the schema.
I cannot decide which approach to follow for the implementation. I can think of the following approaches:
LINQ to XML - This looks OK, but I need to create the XML tree (i.e. the elements or XML representation of the class instance) manually for each type of object. Since there are many entity classes and they are linked to each other, I think it is too much work to write the XML elements manually. Besides, I'll have to keep modifying the XML tree whenever the entity class introduces a new property. Not only that, the code where I generate the XML tree would look a little clumsy (at least in appearance) and would be harder for another developer to maintain or change in the future; they would have to read it very closely to understand how the XML is generated.
XmlSerializer - I can write my own entity classes that represent the XML structure I want. Then I need to copy the details from the incoming object to objects of my own classes, so this is additional work (for .NET too, when the code executes!). Then I can use XmlSerializer on my object to generate the XML. In this case, I'll have to create entity classes, and whenever the third party entity gets modified, I'll just have to add a new property to my class (with XmlElement or XmlAttribute attributes). But people recommend DataContractSerializer over this one, so I don't want to settle on this until all aspects are clear to me.
DataContractSerializer - Again here, I'll have to write my own entity class since I have no control over the third party DataContracts. And I need to copy details from incoming object to the object of my own classes. So this is additional work. However, since DataContractSerializer does not support Xml attributes, I'll have to implement IXmlSerializable and generate required Xml in WriteXml method. DataContractSerializer is faster than XmlSerializer, but again I'll have to handle the changes (in WriteXml) if third party entity changes.
Questions:
Which approach is best in this scenario considering performance too?
Can you suggest some better approach?
Is DataContractSerializer worth considering (because it has better performance than XmlSerializer) when the incoming entity class is subject to change?
Should LINQ be really used for serialization? Or is it really good for things other than querying?
Can XmlSerializer be preferred over LINQ in such cases? If yes, why?
I agree with @Werner Strydom's answer.
I decided to use the XmlSerializer because code becomes maintainable and it offers performance I expect. Most important is that it gives me full control over the XML structure.
This is how I solved my problem:
I created entity classes (representing various types of Xml elements) as per my requirement and passed an instance of the root class (class representing root element) through XmlSerializer.
Small use of LINQ in case of 1:M relationship:
Wherever I wanted same element (say Employee) many times under specific node (say Department) , I declared the property of type List<T>. e.g. public List<Employee> Employees in the Department class. In such cases XmlSerializer obviously added an element called Employees (which is grouping of all Employee elements) under the Department node. In such cases, I used LINQ (after XmlSerializer serialized the .NET object) to manipulate the XElement (i.e. XML) generated by XmlSerializer. Using LINQ, I simply put all Employee nodes directly under Department node and removed the Employees node.
Even with this extra step, the combination of XmlSerializer and LINQ gave me the performance I expected.
The downside is that all the classes I created had to be public when they could very well have been internal!
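The unwrapping step described above (moving every Employee node directly under Department and dropping the Employees grouping node) was done with LINQ to XML in the original solution. As a rough analogue only, here is the same tree manipulation sketched with the standard Java DOM API; UnwrapDemo and its method names are hypothetical, and the element names are taken from the example in the text.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class UnwrapDemo {
    // Parses an XML string into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Moves all children of the wrapper up to its parent, then removes the wrapper.
    static void unwrap(Element wrapper) {
        Node parent = wrapper.getParentNode();
        while (wrapper.getFirstChild() != null) {
            // insertBefore detaches the child from the wrapper before re-inserting it
            parent.insertBefore(wrapper.getFirstChild(), wrapper);
        }
        parent.removeChild(wrapper);
    }

    public static void main(String[] args) {
        Document doc = parse("<Department><Employees>"
                + "<Employee>Ann</Employee><Employee>Bob</Employee>"
                + "</Employees></Department>");
        unwrap((Element) doc.getElementsByTagName("Employees").item(0));
        Element department = doc.getDocumentElement();
        System.out.println(department.getChildNodes().getLength());       // prints 2
        System.out.println(department.getFirstChild().getNodeName());     // prints Employee
    }
}
```

In .NET itself, it may be worth checking whether applying an element attribute to the list property gives the flat layout directly, which would make the post-processing pass unnecessary.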
Why not DataContractSerializer and LINQ-to-XML?
DataContractSerializer does not support XML attributes (unless I implement IXmlSerializable). See the types supported by DataContractSerializer.
LINQ-to-XML (and IXmlSerializable too) makes code clumsy while creating complex XML structure and that code would definitely make other developers scratch their heads while maintaining/changing it.
Is there any other way?
Yes. As mentioned by @Werner Strydom, you can very well generate classes using XSD.exe or a tool like Xsd2Code and work directly with them if you are happy with the resulting classes.
I'll pick XmlSerializer because it's the most maintainable for a custom schema (assuming you have the XSD). When you are done developing the system, test its performance in its entirety and determine whether XML serialization is causing problems. If it is, you can then replace it with something that requires more work and test it again to see if there are any gains. But if XML serialization isn't an issue, then you have maintainable code.
The time it takes to parse a small snippet of XML data may be negligible compared to communicating with the database or external systems. On systems with large memory (16GB+) you may find the GC being a bottleneck in .NET 4 and earlier (.NET 4.5 tries to solve this), especially when you work with very large data sets and streams.
Use AutoMapper to map objects created by XSD.EXE to your entities. This will allow the database design to change without impacting the web service.
One thing that is great about LINQ to XML is XSD validation. However, that impacts performance.
Another option is to utilize LINQ and Reflection to create a generic class to serialize your object to XML. A good example of this can be found at http://primecoder.blogspot.com/2010/09/how-to-serialize-objects-to-xml-using.html . I am not sure what your XML needs to look like at the end of the day, but if it is pretty basic this could do the trick. You would not need to make changes as your entity classes add/remove/change properties, and you could use this across all of your objects (and other projects if stored in a utility DLL).

ImmutableCollection declarations for GWT-RPC serialization

My understanding is that DTOs to be serialized for GWT RPC ought to declare their fields of the lowest possible implementation type for performance reasons. For example, one should favor ArrayList over List or Collection, in defiance of the advice we normally receive to the contrary (e.g., Effective Java, Item 52).
With the JDK collections, this is no problem—most of the time, a Map is a HashMap, a Set is a HashSet and a List is an ArrayList. However, I am using Guava's Immutable* collections (e.g., ImmutableList), where I really don't know which implementation I'll end up getting. Do I need to just suck it up and let GWT emulate all of them, or is there any way to do damage control here?
Right. Just use the most specific type that is part of the API.
Subtypes that are annotated with @GwtCompatible(serializable = true) are serializable over GWT RPC unless otherwise specified (by another @GwtCompatible(serializable = false)). You can safely use Immutable* types as GWT RPC interfaces.