I'm trying to write/read a class object from/to a file.
I'm new to D and I just want to play a little bit around with it.
Is there a Class/Function to write/read an object to/from a file?
I'm looking for something similar to the ObjectOutputStream сlass in Java.
Or do I have to serialize (concatenate) the object's variables as strings in the file?
I have a Movie class and a MovieManager class, which contains a dynamic movie-array.
A Movie object contains just a few strings and integer values.
Extending answer, provided in comment, it is worth explicitly stating, that D does not provide "one true way" of reading/writing objects to/from files, as there can't be a single optimal one. Different considerations about speed, resulting file format, handling references and similar corner cases may results in different serialization strategies.
That being said, most likely proper serialization library is needed, and, by lucky chance, one of most mature D solutions ("Orange" by Jacob Carlborg https://github.com/jacob-carlborg/orange) is being reviewed right now as a candidate for inclusion into standard library as a std.serialization: newsgroup thread. It may be your best bet.
The library Unmanaged provides a serialization system. You also have Orange
which is less restrictive, as Unmanaged serialization only works if the object to serialize is an ancestor of one of the framework base class.But...Unmanaged works on the "accessor" principle. The data serialized are get via a method and the data deserialized are set via a method, which allows to update some stuffs when the deserializer recall for example...
Related
I'm currently struggling with the experimental KXS-properties serialization backend, mainly because of two reasons:
I can't find any documentation for it (I think there is none)
KXS-properties only includes a serializer / deserializer, but no encoder / decoder
The endpoint provided by the framework is essentially Map<String, Any>, but the map is flat and the keys already have the usual dot-separated properties syntax. So the step that I have to take is to encode the map to a single string that is printable to a .properties file AND decode a single string from a .properties file into the map. I'm generally following the Properties Format Spec from https://docs.oracle.com/javase/10/docs/api/java/util/Properties.html#load(java.io.Reader), it's not as easy as one might think.
The problem is that I can't use java.util.Properties right away because KXS is multiplatform and it would kinda kill the purpose of it when I'd restrict it to JVM because I use java.util.Properties. If I were to use it, the solution would be pretty simple, like this: https://gist.github.com/RaphaelTarita/748e02c06574b20c25ab96c87235096d
So I'm trying to implement my own encoder / decoder, following the rough structure of kotlinx.serialization.json.Json.kt. Although it's pretty tedious, it went well so far, but now I've stumbled upon a new problem:
As far as I know (I am not sure because there is no documentation), the map only contains primitives (or primitive-equivalents, as Kotlin does not really have primitives). I suspect this because when you write your own KSerializers for the KXS frontend, you can specify to encode to any primitive by invoking the encodeXXX() functions of the Encoder interface. Now the problem is: When I try to decode to the map that should contain primitives, how do I even know which primitives are expected by the model class?
I've once written my own serializer / deserializer in Java to learn about the topic, but in that implementation, the backend was a lot more tightly coupled to the frontend, so that I could query the expected primitive type from the model class in the backend. But in my situation, I don't have access to the model class and I have no clue how to retrieve the expected types.
As you can see, I've tried multiple approaches, but none of them worked right away. If you can help me to get any of these to work, that would be very much appreciated
Thank you!
The way it works in kotlinx.serialization is that there are serializers that describe classes and structures etc. as well as code that writes/read properties as well as the struct. It is then the job of the format to map those operations to/from a data format.
The intended purpose of kotlinx.serialization.Properties is to support serializing a Kotlin class to/from a java.util.Properties like structure. It is fairly simple in setup in that every nested property is serialized by prepending the property name to the name (the dotted properties syntax).
Unfortunately it is indeed the case that this deserializing from this format requires knowing the types expected. It doesn't just read from string. However, it is possible to determine the structure. You can use the descriptor property of the serializer to introspect the expectations.
From my perspective this format is a bit more simple than it should be. It is a good example of a custom format though. A key distinction between formats is whether they are intended to just provide a storage format, or whether the output is intended (be able to) to represent a well designed api. The latter ones need to be more complex.
Why do we need to use serialization?
If we want to send an object or piece of data through a network we can use streams of bytes. If we want to save some data to the disk, we can again use the binary mode along with the byte streams and save it.
So what's the advantage of using serialization?
Technically on the low-level, your serialized object will also end up as a stream of bytes on your cable or your filesystem...
So you can also think of it as a standardized and already available way of converting your objects to a stream of bytes. Storing/transferring object is a very common requirement, and it has less or little meaning to reinvent this wheel in every application.
As other have mentioned, you also know that this object->stream_of_bytes implementation is quite robust, tested, and generally architecture-independent.
This does not mean it is the only acceptable way to save or transfer an object: in some cases, you'll have to implement your own methods, for example to avoid saving unnecessary/private members (for example for security or performance reasons). But if you are in a simple case, you can make your life easier by using the serialization/deserialization of your framework, language or VM instead of having to implement it by yourself.
Hope this helps.
Quoting from Designing Data Intensive Applications book:
Programs usually work with data in (at least) two different
representations:
In memory, data is kept in objects, structs, lists, arrays, hash tables, trees, and so on. These data structures are optimized for
efficient access and manipulation by the CPU (typically using
pointers).
When you want to write data to a file or send it over the network, you have to encode it as some kind of self-contained sequence of bytes
(for example, a JSON document). Since a pointer wouldn’t make sense to
any other process, this sequence-of-bytes representation looks quite
different from the data structures that are normally used in memory.
Thus, we need some kind of translation between the two
representations. The translation from the in-memory representation to
a byte sequence is called encoding (also known as serialization or
marshalling), and the reverse is called decoding (parsing,
deserialization, unmarshalling).
Among other reasons to be compatible between architecture. An integer doesn't have the same number of bytes from one architecture to another, and sometimes from one compiler to another.
Plus what you're talking about is still serialization. Binary Serialization. You're putting all the bytes of your object together in order to store them and be able to reconvert them as an object later. This is serializing.
More info on wikipedia
Serialization is the process of converting an object into a stream so that it can be saved in any physical file like (XML) or can be saved in Database. The main purpose of Serialization in C# is to persist an object and save it in any specified storage medium like stream, physical file or DataBase.
In General, serialization is a method to persist an object's state, but i suggest you to read this wiki page, it is pretty detailed and correct in my opinion:
http://en.wikipedia.org/wiki/Serialization
In serialization, the point is not turning an object into bits and bytes, objects ARE bits and bytes already. Serialization is the process of making the object's "state" persistent. Notice the word "state", which means the values of the instance variables of the entire object graph (the target object and all the objects it references either directly or indirectly) WITHOUT methods and other extra runtime stuff stuck to them (and of course plus a little more info that JVM needs for restoration of these objects, such as their class types).
So this is the main reason of its necessity: Storing the whole bytes of objects would be expensive, and for all intents and purposes, unnecessary.
I'm curious what the advantages of encoding objects in objective c with NSCoding and writing them to disk may be over simply writing a persistence object to disk. Is there a performance increase in terms of I/O or disk space usage?
Well, most NSCoding implementations will handle object graphs correctly; i.e. if you code a member object that's already been coded to the coder, it won't code it again. Decoding will restore the object graph correctly (so the decoded target object has multiple inbound references). You also get all the built in helper coding functions (for primitive types, and objects).
Other than that, NSCoders are just persistence object generators, so you end up doing similar work, only without the annoyances and common cases handled by Apple. What persistence generator could you write that wouldn't duplicate tonnes of NSCoder functionality?
The iOS docs differentiate between "serializing" and "archiving." Is this a general distinction (i.e., holds in other languages) or is it specific to Objective-C? Also, what is the difference between these two?
This is a case of one being the other some (but not all) of the time.
Wikipedia has this to say about serialization:
"Serialization is the process of converting a data structure or object into a sequence of bits so that it can be stored in a file or memory buffer, or transmitted across a network connection link to be "resurrected" later in the same or another computer environment"
So, archiving may only be serialization, but it could also be the combination of serialization and compresssion, for example. Or perhaps it adds some kind of header info. So serialization is a form of archive, but an archive is not necessarily a serialization.
This isn't really specific to iOS - these terms are thrown around all over. Their specific meaning in the context of iOS could be quite specific, though.
I was actually trying to look for their difference from IOS perspective. Adding the following for people interested :
Purpose:
Archiving is used to store object graphs. complete data model can be archived and restored easily. The way Nib files work can be considered as example for archiving.
Serialization is used for storing arbitrary hierarchy of objects.
The wat plist files work can be considered as example fo serializations.
Differences(excerpts from Archives programing guide):
"The archive preserves the identity of every object in the graph and all the relationships it has with all the other objects in the graph."
Every object encoded within the context of rootObject invocation is tracked. If the coder is asked to encode an object more than once, the coder encodes a reference to the first encoding instead of encoding the object again.
"The serialization only preserves the values of the objects and their position in the hierarchy. Multiple references to the same value object might result in multiple objects when deserialized. The mutability of the objects is not maintained."
Implementation differences:
Any object that implements NSCoding protocol can be archived where as Only instances of NSArray, NSDictionary, NSString, NSDate, NSNumber, and NSData (and some of their subclasses) can be serialized. The contents of array and dictionary objects must also contain only objects of these few classes.
When to Use:
property lists(serialization) should be used for data that consists primarily of strings and numbers. They are very inefficient when used with large blocks of binary data.
It is worthy to Archive objects other than plist objects or storing large blocks of data.
Generally speaking, Serialization is concerned with converting your program data types into architecture independent byte streams. Archiving is specialized serialization in that you could store type and other relationship based information that allow you to unserialize/unmarshall easily. So archival can be thought of as a specialization and subset of Serialization. For Objective-C
Serialization converts Objective-C
types to and from an
architecture-independent byte stream.
In contrast to archiving, basic
serialization does not record the data
type of the values nor the
relationships between them; only the
values themselves are recorded. It is
your responsibility to deserialize the
data in the proper order. Several
convenience classes, however, do
provide the ability to serialize
property lists, recording their
structure along with their values.
With C++ boost serialization --
http://www.boost.org/doc/libs/1_45_0/libs/serialization/doc/index.html
Here, we use the term "serialization"
to mean the reversible deconstruction
of an arbitrary set of C++ data
structures to a sequence of bytes.
Such a system can be used to
reconstitute an equivalent structure
in another program context. Depending
on the context, this might used
implement object persistence, remote
parameter passing or other facility.
In this system we use the term
"archive" to refer to a specific
rendering of this stream of bytes.
This could be a file of binary data,
text data, XML, or some other created
by the user of this library.
how does Serialization of objects works? How object got deserialized and a instance is created from serialized date without a call to any constructor?
I've kept this answer language agnostic since a language wasn't given.
When the object is serialized, all the require information to rebuild it is encoded in way which can be retrieved. This typically includes the type of the object, as well as the value of all the instance variables.
When the object is deserialized, an area in memory of the correct size is allocated and is populated using the serialized information such that the new object is identical to the serialized one.
The running program can then refer to this new object in memory without having to actually call the constructor.
There are lots of little details which this doesn't explain, but this is the general idea of serialization/deserialization.
Are you talking about Java? If so, serialization is an extralingual object creation mechanism. It's a backdoor that uses native code to create the object without calling any constructors. Therefore, when designing a class for serializability, you need to make sure that a class created through deserialization maintains the same invariants (key fields being initialized) as you would through the constructor path. A third way to create objects in Java is through cloning, and similar issues apply.
Cloning and serialization don't interact well with the use of final fields if you need to set the value of that field to something different than what is returned by clone or the deserialization process.
Josh Bloch's "Effective Java" has some chapters that explain these issues in more depth.
(this answer may apply to other languages too, but I've only used serialization in Java)
Regarding .NET: this isn't a definitive or textbook answer, and I might be all-out wrong...
.NET Serialization needs to be seperated out into Binary vs. others (XML or an XML derivitave typically). Binary serialization is mostly a black-box to me, but it allows the object to be serialized and restored in their current state. XML serialization typically only serialized the public fields/properties of an object, unless overriden by adding a custom ISerializable implementation.
In the case of XML serialization I believe .NET uses Reflection to determine which fields and properties get converted to their equivalent Elements. Adding an [XMLSerializable] attribute will implement a default behavior which can be adjusted by applying other attributes at the field level (such as [XMLAttribute]).
The metadata (which Reflection depends on) stores all the object members as well as their attributes and addresses, which allows the serializer to determine how it should build the output.