Does datacontract serialization use reflection? - serialization

XmlSerialization creates a serializer proxy for each class. The proxy resides in a different assembly so it can serialize only public fields.
DataContract serialization can serialize private fields too. Does it mean it uses reflection? Isn't it slower than using a proxy (except for the first time)?

This protobuf-net page shows the performance of DataContractSerializer to be significantly better than XmlSerializer. Of course, you should always test with your own data, but if you are looking to replace XmlSerializer, you will most likely find DataContractSerializer to be a performance improvement.
I'm not sure how DataContractSerializer is implemented internally, but generally serializers are highly optimized. This is especially true for DataContractSerializer since it is a big part of the performance picture for WCF. It is not uncommon for a serializer to generate MSIL code on the fly. When this is done, DynamicMethod allows you to (surprisingly!) bypass visibility checks (see MSDN), so it is possible to access private fields without reflection.
From MSDN:
Given sufficient security permissions,
a serialization engine implemented
using dynamic methods can access
private and protected data to enable
serialization of objects not authored
by the engine creator.

Related

Linq to Xml VS XmlSerializer VS DataContractSerializer

In my web method, I get an object of some third party C# entity class. The entity class is nothing but the DataContract. This entity class is quite complex and has properties of various types, some properties are collections too. Of course, those linked types are also DataContracts.
I want to serialize that DataContract entity into XML as part of business logic of my web service. I cannot use DataContractSerializer directly (on the object I receive in the web method) simply because the XML schema is altogether different. So the XML generated by DataContractSerializer will not get validated against the schema.
I am not able to conclude the approach I should follow for implementation. I could think of following implementation approaches:
LINQ to XML - This looks ok but I need to create XML tree (i.e. elements or XML representation of the class instance) manually for each type of object. Since there are many entity classes and they are linked to each other, I think this is too much of work to write XML elements manually. Besides, i'll have to keep modifying the XML Tree as and when the entity class introduces some new property. Not only this, the code where I generate XML tree would look little clumsy (at least in appearance) and would be harder to maintain/change by some other developer in future; he/she will have to look at it so closely to understand how that XML is generated.
XmlSerializer - I can write my own entity classes that represent the XML structure I want. Now, I need to copy details from incoming object to the object of my own classes. So this is additional work (for .NET too when code executes!). Then I can use XmlSerializer on my object to generate XML. In this case, I'll have to create entity classes and whenever third party entity gets modified, I'll have to just add new property in my class. (with XmlElement or XmlAttibute attributes). But people recommend DataContractSerializer over this one and so I don't want to finalize this unless all aspects are clear to me.
DataContractSerializer - Again here, I'll have to write my own entity class since I have no control over the third party DataContracts. And I need to copy details from incoming object to the object of my own classes. So this is additional work. However, since DataContractSerializer does not support Xml attributes, I'll have to implement IXmlSerializable and generate required Xml in WriteXml method. DataContractSerializer is faster than XmlSerializer, but again I'll have to handle the changes (in WriteXml) if third party entity changes.
Questions:
Which approach is best in this scenario considering performance too?
Can you suggest some better approach?
Is DataContractSerializer worth considering (because it has better performance over XmlSerilaizer) when incoming entity class is subject to change?
Should LINQ be really used for serialization? Or is it really good for things other than querying?
Can XmlSerializer be preferred over LINQ in such cases? If yes, why?
I agree with #Werner Strydom's answer.
I decided to use the XmlSerializer because code becomes maintainable and it offers performance I expect. Most important is that it gives me full control over the XML structure.
This is how I solved my problem:
I created entity classes (representing various types of Xml elements) as per my requirement and passed an instance of the root class (class representing root element) through XmlSerializer.
Small use of LINQ in case of 1:M relationship:
Wherever I wanted same element (say Employee) many times under specific node (say Department) , I declared the property of type List<T>. e.g. public List<Employee> Employees in the Department class. In such cases XmlSerializer obviously added an element called Employees (which is grouping of all Employee elements) under the Department node. In such cases, I used LINQ (after XmlSerializer serialized the .NET object) to manipulate the XElement (i.e. XML) generated by XmlSerializer. Using LINQ, I simply put all Employee nodes directly under Department node and removed the Employees node.
However, I got the expected performance by combination of xmlSerializer and LINQ.
Downside is that, all classes I created had to be public when they could very well be internal!
Why not DataContractSerializer and LINQ-to-XML?
DataContractSerializer does not allow to use Xml attributes (unless I implement IXmlSerializable). See the types supported by DataContractSerializer.
LINQ-to-XML (and IXmlSerializable too) makes code clumsy while creating complex XML structure and that code would definitely make other developers scratch their heads while maintaining/changing it.
Is there any other way?
Yes. As mentioned by #Werner Strydom, you can very well generate classes using XSD.exe or tool like Xsd2Code and work directly with them if you are happy with the resulting classes.
I'll pick XmlSerializer because its the most maintainable for a custom schema (assuming you have the XSD). When you are done developing the system, test its performance in its entirety and determine whether XML serialization is causing problems. If it is, you can then replace it with something that requires more work and test it again to see if there is any gains. But if XML serialization isn't an issue, then you have maintainable code.
The time it takes to parse a small snippet of XML data may be negligible compared to communicating with the database or external systems. On systems with large memory (16GB+) you may find the GC being a bottleneck in .NET 4 and earlier (.NET 4.5 tries to solve this), especially when you work with very large data sets and streams.
Use AutoMapper to map objects created by XSD.EXE to your entities. This will allow the database design to change without impacting the web service.
One thing that is great about LINQ to XML is XSD validation. However, that impacts performance.
Another option is to utilize LINQ and Reflection to create a generic class to serialize your object to XML. A good example of this can be found at http://primecoder.blogspot.com/2010/09/how-to-serialize-objects-to-xml-using.html . I am not sure what your XML needs to look like at the end of the day, but if it is pretty basic this could do the trick. You would not need to make changes as your entity classes add/remove/change properties, and you could use this across all of your objects (and other projects if stored in a utility DLL).

Using my ISerializable implementation instead of DataContractSerializer

My WCF client and server exchange objects whose types are defined on a class library shared by both, the server and the client. These objects implement custom serialization via ISerializable. However, my custom serialization is not being used. DataContractSerializer is being used, instead. How do I force my custom serialization to be used?
The reason why I need to use my own serialization is because I do trasfer some big object graphs and my serialization algorithm does a good job in compressing/speeding up things.
This is possible by adding the correct attributes to your wcf contact.
See: http://www.danrigsby.com/blog/index.php/2008/03/07/xmlserializer-vs-datacontractserializer-serialization-in-wcf/ for details.
If this does not help, show some code of how you have told the wcf contract that it should use the custom serializer.

Why should I use DataContract Serializer while I have XML/Binary serializer?

Can anyone please elabortae me the reasons why should I use Data Contract Serializer
while we have XML/Binary serializer already there in .Net ?
This is a site i found when i was looking into the same issue.
You should check this out
Quoting from the same link mentioned above:
Advantages of DataContractSerializer over XMLSerializer
Opt-in rather than opt-out properties to serialize.
Because it is opt in you can serialize not only properties, but also fields. You can even serialize non-public members such as private or protected members. And you dont need a set on a property either (however without a setter you can serialize, but not deserialize)
Is about 10% faster than XmlSerializer to serialize the data because since you don’t have full control over how it is serialize, there is a lot that can be done to optimize the serialization/deserialization process.
Can understand the SerializableAttribute and know that it needs to be serialized
More options and control over KnownTypes
Hope it helps!
One more HUGE advantage; DataContract serialization allows for interop between any two classes that implement the same DataContract. This is what allows WCF to generate the data classes for you automatically when you reference a WCF service. You can also "hack" this process by referencing a published DataContract in a new user-developed class (or two or three); You can then transfer data between instances of these classes, and any other new ones you create, via serialization. This is also possible but much more difficult with XML serialization, and impossible with binary serialization.

WCF Serialization -More Information

I read some microsoft articles.They explained that WCF uses DataContractSerializer for serialization.But the articles did not explain why DataContractSerializer preferred over
XmlSerialization.Can anyone give me the additional information?
Here is an article with a comparison.
Key section:
XmlSerializer
Advantages:
Opt-out rather than opt-in properties to serialize. This mean you don’t have to specify each and every property to serialize, only those you don’t wan to serialize2. Full control over how a property is serialized including it it should be a node or an attribute
Supports more of the XSD standard
Disadvantages:
Can only serialize properties
Properties must be public
Properties must have a get and a set which can result in some awkward design
Supports a narrower set of types
Cannot understand the DataContractAttribute and will not serialize it unless there is a SerializableAttribute too
DataContractSerializer
Advantages:
Opt-in rather than opt-out properties to serialize. This mean you specify what you want serialize
Because it is opt in you can serialize not only properties, but also fields. You can even serialize non-public members such as private or protected members. And you dont need a set on a property either (however without a setter you can serialize, but not deserialize)
Is about 10% faster than XmlSerializer to serialize the data because since you don’t have full control over how it is serialize, there is a lot that can be done to optimize the serialization/deserialization process.
Can understand the SerializableAttribute and know that it needs to be serialized
More options and control over KnownTypes
Disadvantages:
No control over how the object is serialized outside of setting the name and the order

Serialization of Objects

how does Serialization of objects works? How object got deserialized and a instance is created from serialized date without a call to any constructor?
I've kept this answer language agnostic since a language wasn't given.
When the object is serialized, all the require information to rebuild it is encoded in way which can be retrieved. This typically includes the type of the object, as well as the value of all the instance variables.
When the object is deserialized, an area in memory of the correct size is allocated and is populated using the serialized information such that the new object is identical to the serialized one.
The running program can then refer to this new object in memory without having to actually call the constructor.
There are lots of little details which this doesn't explain, but this is the general idea of serialization/deserialization.
Are you talking about Java? If so, serialization is an extralingual object creation mechanism. It's a backdoor that uses native code to create the object without calling any constructors. Therefore, when designing a class for serializability, you need to make sure that a class created through deserialization maintains the same invariants (key fields being initialized) as you would through the constructor path. A third way to create objects in Java is through cloning, and similar issues apply.
Cloning and serialization don't interact well with the use of final fields if you need to set the value of that field to something different than what is returned by clone or the deserialization process.
Josh Bloch's "Effective Java" has some chapters that explain these issues in more depth.
(this answer may apply to other languages too, but I've only used serialization in Java)
Regarding .NET: this isn't a definitive or textbook answer, and I might be all-out wrong...
.NET Serialization needs to be seperated out into Binary vs. others (XML or an XML derivitave typically). Binary serialization is mostly a black-box to me, but it allows the object to be serialized and restored in their current state. XML serialization typically only serialized the public fields/properties of an object, unless overriden by adding a custom ISerializable implementation.
In the case of XML serialization I believe .NET uses Reflection to determine which fields and properties get converted to their equivalent Elements. Adding an [XMLSerializable] attribute will implement a default behavior which can be adjusted by applying other attributes at the field level (such as [XMLAttribute]).
The metadata (which Reflection depends on) stores all the object members as well as their attributes and addresses, which allows the serializer to determine how it should build the output.