Compilable IDLs that serialize to JSON

Compilable IDLs that serialize to JSON - objective-c

I've used Protobuf before, and I was looking into Thrift, but I was wondering what the options were for IDLs that compile to (at least) C#, JS, Objective C and Java, but also serialize/deserialize JSON in all of those languages. Thrift mostly does that, but doesn't support JSON in OC, and I was concerned (perhaps unwarranted) about the maturity of its JSON interfaces. Are there any IDLs that use JSON as their primary serialization, but also compile to strongly typed bindings in all of the languages listed above?
Thanks!

Regarding Thrift: If there are any serialization protocols could be considered "primary", it would certainly be the binary format. However, we strive to introduce a common minimum set of protocols and transports for each language, one of which is JSON.
Next, please keep in mind that Thrift's JSON format might not be what you expect. The JSON format is especially designed for Thrift, the main goal is a compact representation of the data. The SimpleJSON protocol also available for some languages is more verbatim, but initially designed to be write only (although that viewpoint right now changes slightly).
I was concerned (perhaps unwarranted) about the maturity of its JSON interfaces
There is nothing to be concerned of, honestly. There are a few PHP-related issues with regard to proper string encoding but otherwise it works just fine - when available for the language of choice. If you don't mind, it is not that hard to write a JSON transport and we always welcome quality contributions. If you need help during that process, ask the mailing lists.

Related

Difference between properties File, Yaml & Json?

I'm a beginner in software testing. I'm working with selenium with page object design patterns. I want to keep the test data separately, but i'm confusing how to do it.
I want to know the difference between the usage of properties file, yaml, json.Which is most useful in software testing?
Which should I choose yaml, properties file, or json. So I need to keep the test data separately in json or properties file or yaml. Which is more people using nowadays ? As a tester using yaml, json and properties file is knowiing well. or following as particular pattern which is more easier. whats your suggestion ?

XML (Extensible Markup Language) is flexible and powerful markup capabilities. It is often used in configuration and preference files like those used for the Eclipse IDE. Most Web browsers have XML viewers, although XML is designed for structured data, making it a bit like looking at the internals of a database.
JavaScript Object Notation (JSON) is used with JavaScript, of course. It will be familiar to Web developers that use it for client/server communication.
YAML stands for YAML Ain’t Markup Language. It uses line and whitespace delimiters instead of explicitly marked blocks that could span one or more lines like XML and JSON. This approach is used in many programming languages, such as Python.
So it comes down to YAML or JSON-
Technically YAML is a superset of JSON. That is, in theory at least, a YAML parser can understand JSON, but not necessarily the other way around.
In general, there are certain things I like about YAML that are not available in JSON.
1) YAML is visually easier to look at. In fact the YAML homepage is itself valid YAML, yet it is easy for a human to read.
2)YAML has the ability to reference other items within a YAML file using "anchors." Thus it can handle relational information as one might find in a MySQL database.
3)YAML is more robust about embedding other serialization formats such as JSON or XML within a YAML file.
4)YAML, depending on how you use it, can be more readable than JSON
5)JSON is often faster and is probably still interoperable with more systems
6)Duplicate keys, which are potentially valid JSON, are definitely invalid YAML.
7)YAML has a ton of features, including comments and relational anchors. YAML syntax is accordingly quite complex, and can be hard to understand.
8)YAML can be used, directly, for complex tasks like grammar definitions, and is often a better choice than inventing a new language.
If you don't need any features which YAML has and JSON doesn't, I would prefer JSON because it is very simple and is widely supported (has a lot of libraries in many languages). YAML is more complex and has less support. I don't think the parsing speed or memory use will be very much different, and maybe not a big part of your program's performance.But JSON is the winner for performance (if relevant) and interoperability. YAML is better for human-maintained files.So basically use as per your requirements not what most people are using.

How to describe messagepack data structure is used in internal bin protocol? is ASN.1 or BNF suited for it?

My goal is to write specification of simple client-server application protocol for our project where will be few kinds of client: IOS(swift), Android(java) and Web(http/websocket) probably. Server is the python. Our team decided to use MessagePack as a data structure serializer for different requests/responses.
So now i think how to describe such data structures. I don't wanna write the whole description of specification manually and spent time for thinking over different rules and agreements. I would want to point to a notation system description for my colleagues of client development.
My question is a common.
How do you behave with such task? Do you write pure text in your native speaking language or use some notation system? Is it right to use notation system and existing serializer together? I meant ASN.1. It is seemed clear.

how to choose the right markup language or serialization format?

So far my use of markup languages and/or data serialization formats have been limited to using json and a little bit of XML to store data in my early attempts at leaning video game development. Recently though I have been trying writing different types of applications, and in the cases where I need to store data I have been mainly using the json format because that is what I am most comfortable using. I am starting to feel that I have been blindly throwing json at my problems, which I doubt is any better than blindly throwing XML at my problems.
Mainly my question is what things do you have to consider when choosing the right language or format and what are some common abuses or misuse of these tools that I should watch out for?

XStream <-> Alternative binary formats (e.g. protocol buffers)

We currently use XStream for encoding our web service inputs/outputs in XML. However we are considering switching to a binary format with code generator for multiple languages (protobuf, Thrift, Hessian, etc) to make supporting new clients easier and less reliant on hand-coding (also to better support our message formats which include binary data).
However most of our objects on the server are POJOs with XStream handling the serialization via reflection and annotations, and most of these libraries assume they will be generating the POJOs themselves. I can think of a few ways to interface an alternative library:
Write an XStream marshaler for the target format.
Write custom code to marshal the POJOs to/from the classes generated by the alternative library.
Subclass the generated classes to implement the POJO logic. May require some rewriting. (Also did I mention we want to use Terracotta?)
Use another library that supports both reflection (like XStream) and code generation.
However I'm not sure which serialization library would be best suited to the above techniques.

(1) might not be that much work since many serialization libraries include a helper API that knows how to read/write primitive values and delimiters.
(2) probably gives you the widest choice of tools: https://github.com/eishay/jvm-serializers/wiki/ToolBehavior (some are language-neutral). Flawed but hopefully not totally useless benchmarks: https://github.com/eishay/jvm-serializers/wiki
Many of these tools generate classes, which would require writing code to convert to/from your POJOs. Tools that work with POJOs directly typically aren't language-neutral.
(3) seems like a bad idea (not knowing anything about your specific project). I normally keep my message classes free of any other logic.
(4) The Protostuff library (which supports the Protocol Buffer format) lets you write a "schema" to describe how you want your POJOs serialized. But writing this schema might end up being more work and more error-prone than just writing code to convert between your POJOs and some tool's generated classes.
Protostuff can also automatically generate a schema via reflection, but this might yield a message format that feels a bit Java-centric.

Biggest differences of Thrift vs Protocol Buffers? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 1 year ago.
Improve this question
What are the biggest pros and cons of Apache Thrift vs Google's Protocol Buffers?

They both offer many of the same features; however, there are some differences:
Thrift supports 'exceptions'
Protocol Buffers have much better documentation/examples
Thrift has a builtin Set type
Protocol Buffers allow "extensions" - you can extend an external proto to add extra fields, while still allowing external code to operate on the values. There is no way to do this in Thrift
I find Protocol Buffers much easier to read
Basically, they are fairly equivalent (with Protocol Buffers slightly more efficient from what I have read).

Another important difference are the languages supported by default.
Protocol Buffers: Java, Android Java, C++, Python, Ruby, C#, Go, Objective-C, Node.js
Thrift: Java, C++, Python, Ruby, C#, Go, Objective-C, JavaScript, Node.js, Erlang, PHP, Perl, Haskell, Smalltalk, OCaml, Delphi, D, Haxe
Both could be extended to other platforms, but these are the languages bindings available out-of-the-box.

RPC is another key difference. Thrift generates code to implement RPC clients and servers wheres Protocol Buffers seems mostly designed as a data-interchange format alone.

Protobuf serialized objects are about 30% smaller than Thrift.
Most actions you may want to do with protobuf objects (create, serialize, deserialize) are much slower than thrift unless you turn on option optimize_for = SPEED.
Thrift has richer data structures (Map, Set)
Protobuf API looks cleaner, though the generated classes are all packed as inner classes which is not so nice.
Thrift enums are not real Java Enums, i.e. they are just ints. Protobuf has real Java enums.
For a closer look at the differences, check out the source code diffs at this open source project.

As I've said as "Thrift vs Protocol buffers" topic :
Referring to Thrift vs Protobuf vs JSON comparison :
Thrift supports out of the box AS3, C++, C#, D, Delphi, Go, Graphviz, Haxe, Haskell, Java, Javascript, Node.js, OCaml, Smalltalk, Typescript, Perl, PHP, Python, Ruby, ...
C++, Python, Java - in-box support in Protobuf
Protobuf support for other languages (including Lua, Matlab, Ruby, Perl, R, Php, OCaml, Mercury, Erlang, Go, D, Lisp) is available as Third Party Addons (btw. Here is SWI-Prolog support).
Protobuf has much better documentation and plenty of examples.
Thrift comes with a good tutorial
Protobuf objects are smaller
Protobuf is faster when using "optimize_for = SPEED" configuration
Thrift has integrated RPC implementation, while for Protobuf RPC solutions are separated, but available (like Zeroc ICE ).
Protobuf is released under BSD-style license
Thrift is released under Apache 2 license
Additionally, there are plenty of interesting additional tools available for those solutions, which might decide. Here are examples for Protobuf: Protobuf-wireshark , protobufeditor.

Protocol Buffers seems to have a more compact representation, but that's only an impression I get from reading the Thrift whitepaper. In their own words:
We decided against some extreme storage optimizations (i.e. packing
small integers into ASCII or using a 7-bit continuation format)
for the sake of simplicity and clarity in the code. These alterations
can easily be made if and when we encounter a performance-critical
use case that demands them.
Also, it may just be my impression, but Protocol Buffers seems to have some thicker abstractions around struct versioning. Thrift does have some versioning support, but it takes a bit of effort to make it happen.

I was able to get better performance with a text based protocol as compared to protobuff on python. However, no type checking or other fancy utf8 conversion, etc... which protobuff offers.
So, if serialization/deserialization is all you need, then you can probably use something else.
http://dhruvbird.blogspot.com/2010/05/protocol-buffers-vs-http.html

One obvious thing not yet mentioned is that can be both a pro or con (and is same for both) is that they are binary protocols. This allows for more compact representation and possibly more performance (pros), but with reduced readability (or rather, debuggability), a con.
Also, both have bit less tool support than standard formats like xml (and maybe even json).
(EDIT) Here's an Interesting comparison that tackles both size & performance differences, and includes numbers for some other formats (xml, json) as well.

I think most of these points have missed the basic fact that Thrift is an RPC framework, which happens to have the ability to serialize data using a variety of methods (binary, XML, etc).
Protocol Buffers are designed purely for serialization, it's not a framework like Thrift.

ProtocolBuffers is FASTER.
There is a nice benchmark here:
https://github.com/eishay/jvm-serializers/wiki (last updated 2016, but there are forks that contain faster serializers as of 2020, e.g. ActiveJ created a fork to demonstrate their speed on the JVM: https://github.com/activej/jvm-serializers).
You might also want to look into Avro, which can be faster. There are two libraries for Avro in .NET:
Apache.Avro
Chr.Avro - written by engineers at C.H. Robinson, a supply chain logistics company
By the way, the fastest I've ever seen is Cap'nProto;
A C# implementation can be found at the Github-repository of Marc Gravell.

And according to the wiki the Thrift runtime doesn't run on Windows.

For one, protobuf isn't a full RPC implementation. It requires something like gRPC to go with it.
gPRC is very slow compared to Thrift:
http://szelei.me/rpc-benchmark-part1/

I think the basic data structure is different
Protocol Buffer use variable-length integee which refers to variable-length digital encoding, turning a fixed-length number into a variable-length number to save space.
Thrift proposed different types of serialization formats (called "protocols").
In fact, Thrift has two different JSON encodings, and no less than three different binary encoding methods.
In conclusion，these two libraries are completely different. Thrift likes a one-stop shop, giving you the entire integrated RPC framework and many options (supporting cross-language), while Protocol Buffers is more inclined to "just do one thing and do it well".

There are some excellent points here and I'm going to add another one in case someones' path crosses here.
Thrift gives you an option to choose between thrift-binary and thrift-compact (de)serializer, thrift-binary will have an excellent performance but bigger packet size, while thrift-compact will give you good compression but needs more processing power. This is handy because you can always switch between these two modes as easily as changing a line of code (heck, even make it configurable). So if you are not sure how much your application should be optimized for packet size or in processing power, thrift can be an interesting choice.
PS: See this excellent benchmark project by thekvs which compares many serializers including thrift-binary, thrift-compact, and protobuf: https://github.com/thekvs/cpp-serializers
PS: There is another serializer named YAS which gives this option too but it is schema-less see the link above.

It's also important to note that not all supported languages compair consistently with thrift or protobuf. At this point it's a matter of the modules implementation in addition to the underlying serialization. Take care to check benchmarks for whatever language you plan to use.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas