We currently use XStream to encode our web service inputs/outputs as XML. However, we are considering switching to a binary format with code generators for multiple languages (protobuf, Thrift, Hessian, etc.) to make supporting new clients easier and less reliant on hand-coding, and to better support our message formats, which include binary data.
The catch is that most of our objects on the server are POJOs, with XStream handling serialization via reflection and annotations, while most of these libraries assume they will be generating the classes themselves. I can think of a few ways to interface with an alternative library:
1) Write an XStream marshaler for the target format.
2) Write custom code to marshal the POJOs to/from the classes generated by the alternative library.
3) Subclass the generated classes to implement the POJO logic. This may require some rewriting. (Also, did I mention we want to use Terracotta?)
4) Use another library that supports both reflection (like XStream) and code generation.
However, I'm not sure which serialization library would be best suited to the techniques above.
(1) might not be that much work since many serialization libraries include a helper API that knows how to read/write primitive values and delimiters.
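To make (1) concrete, here is a minimal sketch of a custom XStream writer that emits a homegrown length-prefixed binary encoding instead of XML. The byte markers and the encoding itself are invented for illustration; a real implementation would target the wire format of whichever library you pick, and a matching HierarchicalStreamReader is needed for the reverse direction.

    // Sketch only: XStream's HierarchicalStreamWriter implemented over a
    // DataOutputStream. The START/END/VALUE/ATTRIBUTE markers are made up
    // here, not part of any standard binary format.
    import com.thoughtworks.xstream.io.HierarchicalStreamWriter;
    import java.io.DataOutputStream;
    import java.io.IOException;

    public class BinaryStreamWriter implements HierarchicalStreamWriter {
        private static final byte START_NODE = 1, END_NODE = 2, VALUE = 3, ATTRIBUTE = 4;
        private final DataOutputStream out;

        public BinaryStreamWriter(DataOutputStream out) { this.out = out; }

        public void startNode(String name)                  { write(START_NODE, name); }
        public void addAttribute(String name, String value) { write(ATTRIBUTE, name + "=" + value); }
        public void setValue(String text)                   { write(VALUE, text); }
        public void endNode()                               { write(END_NODE, ""); }

        private void write(byte marker, String payload) {
            try {
                out.writeByte(marker);
                out.writeUTF(payload); // length-prefixed, so no explicit delimiters needed
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        }

        public void flush() { try { out.flush(); } catch (IOException e) { throw new RuntimeException(e); } }
        public void close() { try { out.close(); } catch (IOException e) { throw new RuntimeException(e); } }
        public HierarchicalStreamWriter underlyingWriter() { return this; }
    }

You would then serialize with xstream.marshal(pojo, new BinaryStreamWriter(out)).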
(2) probably gives you the widest choice of tools: https://github.com/eishay/jvm-serializers/wiki/ToolBehavior (some are language-neutral). Flawed but hopefully not totally useless benchmarks: https://github.com/eishay/jvm-serializers/wiki
Many of these tools generate classes, which would require writing code to convert to/from your POJOs. Tools that work with POJOs directly typically aren't language-neutral.
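To make (2) concrete, the conversion code is usually mechanical, if tedious. A sketch, assuming a hypothetical hand-written Person POJO and a PersonProto.Person class generated by protoc from a matching .proto definition:

    // Hand-written bridge between a POJO and a protoc-generated message.
    // Both Person and PersonProto.Person are assumed, not from your codebase.
    public final class PersonMapper {
        private PersonMapper() {}

        public static PersonProto.Person toMessage(Person pojo) {
            return PersonProto.Person.newBuilder()
                    .setName(pojo.getName())
                    .setId(pojo.getId())
                    .build();
        }

        public static Person fromMessage(PersonProto.Person msg) {
            Person pojo = new Person();
            pojo.setName(msg.getName());
            pojo.setId(msg.getId());
            return pojo;
        }
    }

The downside is that this layer has to be kept in sync with both the POJO and the schema as fields are added.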
(3) seems like a bad idea (not knowing anything about your specific project). I normally keep my message classes free of any other logic.
(4) The Protostuff library (which supports the Protocol Buffer format) lets you write a "schema" to describe how you want your POJOs serialized. But writing this schema might end up being more work and more error-prone than just writing code to convert between your POJOs and some tool's generated classes.
Protostuff can also automatically generate a schema via reflection, but this might yield a message format that feels a bit Java-centric.
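For reference, the reflection-based route looks roughly like this (Person is again a hypothetical POJO with a no-arg constructor; the package names follow the io.protostuff artifacts):

    // Sketch of Protostuff runtime schemas: serialize a POJO via reflection,
    // with no generated classes and no hand-written schema.
    import io.protostuff.LinkedBuffer;
    import io.protostuff.ProtostuffIOUtil;
    import io.protostuff.Schema;
    import io.protostuff.runtime.RuntimeSchema;

    public class ProtostuffDemo {
        public static void main(String[] args) {
            Schema<Person> schema = RuntimeSchema.getSchema(Person.class);
            LinkedBuffer buffer = LinkedBuffer.allocate(LinkedBuffer.DEFAULT_BUFFER_SIZE);

            Person original = new Person();
            original.setName("Alice");

            byte[] bytes;
            try {
                bytes = ProtostuffIOUtil.toByteArray(original, schema, buffer);
            } finally {
                buffer.clear(); // buffers are reusable but must be cleared
            }

            Person copy = schema.newMessage();
            ProtostuffIOUtil.mergeFrom(bytes, copy, schema);
        }
    }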
Related
I'm a beginner in software testing. I'm working with Selenium using the Page Object design pattern. I want to keep the test data separate, but I'm confused about how to do it.
I want to know the difference between using a properties file, YAML, and JSON. Which is most useful in software testing?
Which should I choose: YAML, a properties file, or JSON? I need to keep the test data separate in one of them. Which do more people use nowadays? Is it better for a tester to know YAML, JSON, and properties files well, or to follow one particular pattern that is easier? What's your suggestion?
XML (Extensible Markup Language) offers flexible and powerful markup capabilities. It is often used in configuration and preference files like those of the Eclipse IDE. Most Web browsers include XML viewers, although XML is designed for structured data, so reading it raw is a bit like looking at the internals of a database.
JavaScript Object Notation (JSON) is used with JavaScript, of course. It will be familiar to Web developers who use it for client/server communication.
YAML stands for YAML Ain't Markup Language. It uses line breaks and indentation as delimiters instead of the explicitly marked blocks, possibly spanning multiple lines, used by XML and JSON. This approach is also used by programming languages such as Python.
So it comes down to YAML or JSON.
Technically YAML is a superset of JSON. That is, in theory at least, a YAML parser can understand JSON, but not necessarily the other way around.
In general, there are certain things I like about YAML that are not available in JSON, along with a few points that cut the other way.
1) YAML is visually easier to look at. In fact, the YAML homepage is itself valid YAML, yet it is easy for a human to read.
2) YAML can reference other items within the same file using "anchors," so it can handle relational information such as you might find in a MySQL database (see the sketch after this list).
3) YAML is more robust about embedding other serialization formats such as JSON or XML within a YAML file.
4) YAML, depending on how you use it, can be more readable than JSON.
5) JSON is often faster and is probably still interoperable with more systems.
6) Duplicate keys, which are potentially valid JSON, are definitely invalid YAML.
7) YAML has a ton of features, including comments and relational anchors. YAML syntax is accordingly quite complex and can be hard to understand.
8) YAML can be used directly for complex tasks like grammar definitions, and is often a better choice than inventing a new language.
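To illustrate point 2, here is a small sketch of anchors and merge keys (the keys and values are just an example; the merge key "<<" is a YAML 1.1 feature that most parsers support):

    # &defaults defines an anchor; *defaults references it; <<: merges it in.
    defaults: &defaults
      adapter: postgres
      host: localhost

    development:
      <<: *defaults
      database: dev_db

    test:
      <<: *defaults
      database: test_db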
If you don't need any features that YAML has and JSON doesn't, I would prefer JSON: it is very simple and widely supported (there are a lot of libraries in many languages). YAML is more complex and has less support. I don't think parsing speed or memory use will differ much between them, or be a big part of your program's performance. But JSON is the winner for performance (if relevant) and interoperability, while YAML is better for human-maintained files. So basically, choose based on your requirements, not on what most people are using.
I've used Protobuf before, and I was looking into Thrift, but I was wondering what the options are for IDLs that compile to (at least) C#, JS, Objective-C, and Java, but also serialize/deserialize JSON in all of those languages. Thrift mostly does that, but it doesn't support JSON in Objective-C, and I was concerned (perhaps unwarranted) about the maturity of its JSON interfaces. Are there any IDLs that use JSON as their primary serialization, but also compile to strongly typed bindings in all of the languages listed above?
Thanks!
Regarding Thrift: if any serialization protocol could be considered "primary", it would certainly be the binary format. However, we strive to offer a common minimum set of protocols and transports for each language, one of which is JSON.
Next, please keep in mind that Thrift's JSON format might not be what you expect. The JSON format is designed specifically for Thrift, and its main goal is a compact representation of the data. The SimpleJSON protocol, also available for some languages, is more literal, but it was initially designed to be write-only (although that view is slowly changing).
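For example, in Java the protocol is just a pluggable factory; this hedged sketch round-trips a Thrift-generated object through the Thrift-specific JSON format (MyStruct stands in for a class generated by the Thrift compiler):

    import org.apache.thrift.TDeserializer;
    import org.apache.thrift.TSerializer;
    import org.apache.thrift.protocol.TJSONProtocol;

    public class ThriftJsonDemo {
        public static void main(String[] args) throws Exception {
            MyStruct original = new MyStruct(); // hypothetical generated struct
            original.setMessage("hello");       // assumed field "message"

            // Serialize using the compact, Thrift-specific JSON protocol.
            TSerializer serializer = new TSerializer(new TJSONProtocol.Factory());
            byte[] json = serializer.serialize(original);

            // Deserialize back into a fresh instance.
            MyStruct copy = new MyStruct();
            new TDeserializer(new TJSONProtocol.Factory()).deserialize(copy, json);
        }
    }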
"I was concerned (perhaps unwarranted) about the maturity of its JSON interfaces"
There is nothing to be concerned about, honestly. There are a few PHP-related issues with regard to proper string encoding, but otherwise it works just fine - when it is available for the language of choice. If you don't mind, it is not that hard to write a JSON transport, and we always welcome quality contributions. If you need help during that process, ask on the mailing lists.
I have a series of F# data structures which I cannot control (computation expressions and lambdas, which are compiler-generated) and which I must serialize.
The binary serializer works, but unfortunately it is not available on the Xbox or the .NET Compact Framework. Is there some alternative that does not require me to redesign a year's worth of pure and immutable data structures?
Thanks
I've used the nserializer open source library successfully in a similar situation - namely, serializing arbitrary .NET objects used to implement game AI in F#, including Unity-style "coroutines" implemented via yields in sequences (which are internally compiled to a number of classes roughly representing the possible continuations).
It should do what you want, although it uses XML rather than a binary format; consider compressing and decompressing if size turns out to be an issue.
Hello, and thank you for reading.
I am implementing a WCF Service based on a predefined specification (HR-XML 3.0). As such, I am starting with the schema, and working my way back to code. There are a number of large Schema documents (which import yet more Schema documents) related to my implementation, provided by this specification.
I am able to generate code using xsd.exe, by supplying the "main" and "supporting" xsd files as arguments. But there are several issues, and I am wondering if this is the right approach.
there are literally hundreds of classes - the code file is half a meg in size
duplicate classes (e.g. Type and Type1, which both represent the same type)
there are classes declared as inheriting from a base class, but that base class is not generated/defined
I understand that there are limitations to the types of Schema supported by svcutil.exe/xsd.exe when targeting the DataContractSerializer and even XmlSerializer. My question is two-fold:
Are code generation "issues" fairly common when dealing with larger, modular xsd files? Has anyone had success with generating data contracts from OAGIS or HR-XML schema?
Given the above issues, are there better approaches to this task that avoid generating code and working with concrete objects? Does it make better sense to read and compose a SOAP message directly, while still taking advantage of the rest of the WCF framework? I understand that I would be losing the convenience of working with .NET objects and the framework-provided (de)serialization; given these losses, would it still be advantageous to base my service on WCF? Is there some "middle ground" between working with .NET types and pure XML?
Thank you very much!
-Sasha Borodin
DFWHC.org
Sasha, if you are going to use code generation, you likely should never start with the modular schemas. When you point a code generator at the modular schemas, you'll generate a class for all the common components in the HR-XML library and a good bit of the common components in OAGIS. You don't want this. HR-XML is distributed with standalone schemas, which are a better starting point. An even better starting point would be to create a flattened package xsd containing only the types brought in by the WSDL. If you use a couple of standalone schemas, you are going to have at least some duplication in your generated code.
Well, you could try and do something like this:
1) Convert your XSD to C# code separately, using something like Microsoft's xsd.exe tool, or something like Xsd2Code as a Visual Studio plugin (a sample xsd.exe command is sketched after this list).
(Image: Xsd2Code in Visual Studio - http://i3.codeplex.com/Project/Download/FileDownload.aspx?ProjectName=Xsd2Code&DownloadId=41336)
2) Once you have your C# classes, weed out any inconsistencies, duplications, and so forth.
3) Package everything up into a separate class library assembly.
4) Now, when generating your WCF service from the WSDL, either using Add Service Reference in Visual Studio or the svcutil.exe tool, reference that assembly with all the data classes. That way, WCF should skip re-creating the whole set of classes and use whatever is available in that data assembly.
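For step 1, a typical xsd.exe invocation might look like the following (the schema file names and namespace are placeholders for your own):

    xsd.exe HR-XML-Main.xsd Supporting1.xsd Supporting2.xsd /classes /language:CS /namespace:HrXml.Generated

xsd.exe merges the supplied schemas and emits the generated types into a single .cs file, which you can then clean up in step 2.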
With this, you might be able to get this mess under control.
We run multiple websites which use the same rich functional backend running as a library. The backend comprises multiple components with a lot of objects shared between them. We now need to separate a stateless rule execution component into a different container for security reasons. It would be great if I could have seamless access to all the backend objects in the rules component (rather than defining a new interface and objects/adapters).
I would like to use an RPC mechanism that will seamlessly support passing our Java POJOs (some of them are Hibernate beans) over the wire. Web service stacks like JAXB, Axis, etc. need quite a bit of boilerplate and configuration for each object, whereas those using Java serialization seem straightforward, but I am concerned about backward/forward compatibility issues.
We are using XStream for serializing our objects into the persistence store and have been happy so far. But none of the popular RPC/web service frameworks seem to use XStream for serialization. Is it OK to use XStream and send my objects over HTTP using my custom implementation (something along the lines sketched below)? Or will Java serialization just work? Or are there better alternatives?
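For context, the custom implementation I have in mind is roughly along these lines (the endpoint URL and request/response handling are simplified placeholders):

    // Client-side sketch: POST an XStream-serialized object over HTTP and
    // deserialize the XML response. Error handling is omitted.
    import com.thoughtworks.xstream.XStream;
    import java.io.InputStream;
    import java.io.OutputStreamWriter;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class XStreamHttpClient {
        private final XStream xstream = new XStream();

        public Object call(Object request) throws Exception {
            HttpURLConnection conn =
                    (HttpURLConnection) new URL("http://backend/rules").openConnection();
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "application/xml");

            try (OutputStreamWriter writer =
                    new OutputStreamWriter(conn.getOutputStream(), "UTF-8")) {
                xstream.toXML(request, writer);
            }
            try (InputStream in = conn.getInputStream()) {
                return xstream.fromXML(in);
            }
        }
    }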
Thanks in advance for your advice.
The good thing about standard Java serialization is that it produces a binary stream, which is quite a bit more space- and bandwidth-efficient than any of these XML serialization mechanisms. But as you wrote, XML can be friendlier for backward/forward compatibility, and it's easier to parse and modify by hand and/or by scripts if the need arises. It's a trade-off; if you need long-term storage, it's advisable to avoid plain serialization.
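For comparison, plain Java serialization is a one-liner each way, and compatibility is managed by hand via serialVersionUID; a minimal round trip:

    import java.io.*;

    public class JavaSerializationDemo {
        static class Message implements Serializable {
            // Bump this deliberately on incompatible changes; otherwise the JVM
            // derives one from the class shape and any edit breaks old streams.
            private static final long serialVersionUID = 1L;
            String text;
        }

        public static void main(String[] args) throws Exception {
            Message m = new Message();
            m.text = "hello";

            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(m);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                Message copy = (Message) in.readObject();
                System.out.println(copy.text);
            }
        }
    }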
I'm a happy XStream user. Zero problems so far.