ArrayBuffer vs Blob and XHR2

The XHR2 differences page states:
The ability to transfer ArrayBuffer, Blob, File and FormData objects.
What are the differences between ArrayBuffer and Blob?
Why should I care about being able to send them over XHR2? (I can understand the value of File and FormData.)

This is an effort to replace the old method, which would take a "string" and cut sections out of it.
You would use an ArrayBuffer when you need a typed array because you intend to work with the data, and a Blob when you just need the data of the file.
Blobs (according to the spec, anyway) have space for a MIME type and are easier to feed into the HTML5 File API than other formats (they're more native to it).
The ArrayBuffer lets us work with typed arrays, which is much faster than string manipulation for working with specific bytes, and lets us define what type the array segments actually are. Since JavaScript is not strictly typed, it's hard to handle a file that might be broken into an array of 32-bit ints or 64-bit floats (and just imagine 8-bit ints -- that would be a nightmare in terms of performance with string manipulation and bitwise calculations, especially with Unicode).
As far as I can tell, you can always convert a Blob to an ArrayBuffer or to a string representation, but having these types native to XHR allows scripts to be faster, which is the main advantage.
I'd use a Blob for working with the File API, but I'd use the ArrayBuffer for performing computation on the data.
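To make the typed-array point concrete, here is a minimal sketch of byte-level work on an ArrayBuffer; the buffer contents are invented for illustration, not tied to any particular XHR response:

```javascript
// Sketch: reading typed values out of an ArrayBuffer, the kind of work
// you'd do after receiving one via xhr.responseType = 'arraybuffer'.
const buf = new ArrayBuffer(8);
const view = new DataView(buf);

view.setUint32(0, 0x12345678);  // write a 32-bit int at byte offset 0
view.setFloat32(4, 1.5);        // write a 32-bit float at byte offset 4

console.log(view.getUint32(0).toString(16)); // "12345678"
console.log(view.getFloat32(4));             // 1.5

// The same bytes can also be viewed as 8-bit ints -- no string slicing needed.
// DataView defaults to big-endian, so the first byte is 0x12.
const bytes = new Uint8Array(buf);
console.log(bytes[0].toString(16)); // "12"
```

No string manipulation or bitwise reassembly is involved: the typed views interpret the raw bytes directly.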

Related

Serialize unity3d C# objects to string and back

Which of the two is the recommended approach, given that my server API expects a C# string? Which one will result in the lowest string length?
1) Protobuf-net
Use protobuf-net to convert object <-> byte array
Use Convert.ToBase64String / Convert.FromBase64String to convert byte array <-> string
2) Use Json.NET directly to convert object <-> string
We have protobuf-net working in our project with byte[] server APIs. Now our server is migrating to string APIs instead of byte[]. We are not sure whether we should move to Json.NET, or stay with protobuf-net and add a Base64 step for the extra string <-> byte[] conversion.
What do you suggest?
Okay, so this is my thought process which I'm hoping can help you decide between the two:
Before deciding which one is better we need to have a better grasp of the context of the problem. Optimization is always something that has to be done under well defined "fitness" parameters.
What I mean by this is:
If you're most constrained by CPU usage, I would test to see which code uses more CPU to execute.
If bandwidth is an issue, you'd want to look at the method that sends the smallest packets. (In which case base64 of binary serialization should be the answer.)
If code readability is a factor, you should probably look at which code is easier to read and understand while taking less text to write. (In which case I suspect the JSON route will have better readability.)
In general, I would caution against over-optimization, mainly because you might spend more time thinking and comparing than would ever be lost to your "unoptimized" code :)
That is to say, only optimize when you can clearly define your bottleneck.
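To make the bandwidth trade-off concrete: Base64 inflates a binary payload by roughly a third, which is the overhead you'd pay on top of protobuf's compactness. A quick Node.js sketch (the payload here is dummy bytes, not real protobuf output):

```javascript
// Sketch: base64 encodes every 3 bytes as 4 characters, so a binary
// payload grows by about 4/3 once it has to travel as a string.
const binary = Buffer.alloc(300, 0xab);  // stand-in for serialized bytes
const asBase64 = binary.toString('base64');

console.log(binary.length);   // 300 bytes as raw binary
console.log(asBase64.length); // 400 characters once base64-encoded
```

Whether protobuf + Base64 still beats JSON depends on how compact the protobuf encoding is for your particular objects, so measuring with real payloads is the safer call.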
Hope this helped :)

Selection of datatype for storing images

I want to store and retrieve images to/from a database using Java beans. Which data type should be used for this purpose?
The most conventional approach would be to store the data in a BLOB. That would allow you to read and write the byte stream from Java but wouldn't allow you to do any transformation inside the database.
The other alternative would be to use the interMedia ORDImage type. This provides a great deal more flexibility by allowing the database to do things like generate and return a thumbnail image, to adjust compression quality, etc.
As Grant indicates, LONG RAW (and the character-based LONG) data types have been deprecated so you should not be using them.
I think BLOB is what you are looking for. RAW types are there for legacy support, so unless you absolutely have to deal with them, I wouldn't.

Identify compression method used on blob/binary data

I have some binary data (blobs) from a database, and I need to know what compression method was used on them to decompress them.
How do I determine what method of compression that has been used?
Actually, it's easier than that. Assuming one of the standard methods was used, there are probably some magic bytes at the beginning. I suggest taking the hex values of the first 3-4 bytes and searching for them.
It makes no sense to develop your own compression, so unless the case was special (or the programmer foolish), one of the well-known compression methods will have been used. You could also take libraries for the most popular ones and simply try each to see which succeeds.
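A sketch of that magic-byte check; the signatures below are the well-known ones for gzip (1f 8b), zlib (78 followed by 01, 9c, or da), and zip ("PK"):

```javascript
// Sketch: sniff a few common compression formats by their leading bytes.
function sniff(bytes) {
  if (bytes[0] === 0x1f && bytes[1] === 0x8b) return 'gzip';
  if (bytes[0] === 0x78 && [0x01, 0x9c, 0xda].includes(bytes[1])) return 'zlib';
  if (bytes[0] === 0x50 && bytes[1] === 0x4b) return 'zip'; // "PK"
  return 'unknown';
}

console.log(sniff(new Uint8Array([0x1f, 0x8b, 0x08, 0x00]))); // "gzip"
console.log(sniff(new Uint8Array([0x78, 0x9c, 0x01, 0x02]))); // "zlib"
```

This only covers container formats with signatures; raw deflate streams, for instance, have no magic bytes, which is where trial decompression with each library comes in.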
The only way to do this, in general, would be to store which compression method was used when you store the BLOB.
Starting from the blob in the DB you can do the following:
Store it in a file. For my use case I used DBeaver to export multiple blobs to separate files.
Then find out more about the file's magic numbers by running:
file -i filename
In my case the files are application/zlib; charset=binary.

Making a file format extensible

I'm writing a particular serialisation system. The first version works well. It's a hierarchical string-key, data-value system, so to get a particular value you navigate to a particular node and call getInt("some key"), etc.
My issue with the current system is that the file size gets quite large very quickly.
I'm going to combat this by adding a string table. The issue is that I can't think of a way to support the old format; all I have is a file identifier, which is 32 bits long.
I can change the file identifier, but every time I make another change to the format, I'll need to change the identifier again.
What's an elegant way to implement new features while still supporting the old features?
I've studied the PNG format and creating chunks seems like a good way to go.
Is there any other advice you can give me on chunk dependencies and so forth?
If you need a binary format, look at Protocol Buffers, which Google uses internally for RPCs as well as long-term serialization of records. Each field of a protocol buffer is identified by an integer ID. Old applications ignore (and pass through) the fields that they don't understand, so you can safely add new fields. You never reuse deprecated field IDs or change the type of a field.
Protocol buffers support primitive types (bool, int32, int64, string, byte arrays) as well as repeated and even recursively nested messages. Unfortunately they don't support maps, so you have to turn a map into a list of (key, value).
Don't spend all your time fretting about serialization and deserialization code; it's not as much fun as designing your protobufs and letting the library do the work.
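The ID-plus-wire-type scheme described above can be sketched by hand. This is a toy illustration of the encoding, not the real protobuf library:

```javascript
// Toy sketch of protobuf's field framing: each field starts with a
// varint tag = (fieldNumber << 3) | wireType. A reader that doesn't
// recognize a tag can skip the field, which is what makes adding new
// fields backward-compatible.
function encodeVarint(n) {
  const out = [];
  while (n > 0x7f) {
    out.push((n & 0x7f) | 0x80); // low 7 bits, continuation bit set
    n >>>= 7;
  }
  out.push(n);
  return out;
}

function fieldTag(fieldNumber, wireType) {
  return encodeVarint((fieldNumber << 3) | wireType);
}

console.log(encodeVarint(300)); // [ 172, 2 ]
console.log(fieldTag(1, 0));    // [ 8 ]  -- field 1, varint wire type
```

Because the field number travels with every value, fields can appear in any order and unknown ones can be passed through untouched, exactly the extensibility property the question is after.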

Binary file & saved game formatting

I am working on a small roguelike game, and need some help with creating save games. I have tried several ways of saving games, but the load always fails, because I am not exactly sure what is a good way to mark the beginning of different sections for the player, entities, and the map.
What would be a good way of marking the beginning of each section, so that the data can read back reliably without knowing the length of each section?
Edit: The language is C++. It looks like a readable format would be a better shot. Thanks for all the quick replies.
The easiest solution is usually to use a library to write the data as XML or INI, then compress it. This will be easier for you to parse, and will result in smaller files than a custom binary format.
Of course, it will take slightly longer to load (though not much, unless your data files are hundreds of MBs).
If you're determined to use a binary format, take a look at BER.
Are you really sure you need binary format?
Why not store in some text format so that it can be easily parseable, be it plain text, XML or YAML.
Since you're saving binary data, you can't reliably use markers without lengths.
Simply write the number of records of each type followed by the structured data; then it will be easy to read back. If you have variable-length elements such as strings, they also need length information. For example:
2
player record
player record
3
entities record
entities record
entities record
1
map
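A count-prefixed layout like the one above might be written and read like this. This is a minimal sketch (in JavaScript rather than the asker's C++, with invented record contents) just to show the pattern:

```javascript
// Sketch: count-prefixed records. The writer emits a count, then the
// records; each string carries its own length prefix, so the reader
// never needs section markers. Counts/lengths are a single byte here
// purely for brevity.
function writeRecords(names) {
  const out = [names.length];            // record count first
  for (const name of names) {
    out.push(name.length);               // string length prefix
    for (const ch of name) out.push(ch.charCodeAt(0));
  }
  return out;
}

function readRecords(bytes) {
  let i = 0;
  const count = bytes[i++];
  const names = [];
  for (let r = 0; r < count; r++) {
    const len = bytes[i++];
    names.push(String.fromCharCode(...bytes.slice(i, i + len)));
    i += len;
  }
  return names;
}

console.log(readRecords(writeRecords(['hero', 'orc']))); // [ 'hero', 'orc' ]
```

The reader always knows exactly how many bytes to consume next, so no sentinel values can ever collide with the data.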
If you have a marker, you have to guarantee that the pattern doesn't exist elsewhere in your binary stream. If it does exist, you must use a special escape sequence to differentiate it. The Telnet protocol uses 0xFF to mark special commands that aren't part of the data stream. Whenever the data stream contains a naturally occurring 0xFF, it must be replaced by the two-byte sequence 0xFF 0xFF.
So you'd use a 2-byte marker to start a new section, like 0xFF01. If your reader sees 0xFF01, it's a new section. If it sees 0xFFFF, you'd collapse it into a single 0xFF. Naturally you can expand this approach to use any length marker you want.
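The escaping described above might look like this. A minimal sketch of the 0xFF-doubling scheme, as in Telnet:

```javascript
// Sketch: any literal 0xFF in the data is doubled on the way out,
// so a reader can tell it apart from marker sequences like 0xFF 0x01.
const MARK = 0xff;

function escapeBytes(bytes) {
  const out = [];
  for (const b of bytes) {
    out.push(b);
    if (b === MARK) out.push(MARK); // double any literal 0xFF
  }
  return out;
}

function unescapeBytes(bytes) {
  const out = [];
  for (let i = 0; i < bytes.length; i++) {
    out.push(bytes[i]);
    if (bytes[i] === MARK && bytes[i + 1] === MARK) i++; // collapse 0xFF 0xFF
  }
  return out;
}

console.log(escapeBytes([1, 0xff, 2]));       // [ 1, 255, 255, 2 ]
console.log(unescapeBytes([1, 255, 255, 2])); // [ 1, 255, 2 ]
```

Note the cost: in the worst case (data that is all 0xFF bytes) escaping doubles the stream size, which is one reason length prefixes are often preferred.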
(Although my personal preference is a text format (optionally compressed) or a binary format with length bytes instead of markers. I don't understand how you're serializing it without knowing when you're done reading a data structure.)