Why are numerous HTTP-header content types classified as "application"? - http-headers

What is the meaning of "application" in these content types:
application/java-archive
application/EDI-X12
application/EDIFACT
application/javascript
application/octet-stream
application/ogg
application/pdf
application/xhtml+xml
application/x-shockwave-flash
application/json
application/ld+json
application/xml
application/zip
application/x-www-form-urlencoded
It seems inconsistent to have text formats like JSON and XML in there, when we have this list:
text/css
text/csv
text/html
text/javascript (obsolete)
text/plain
text/xml
Which lacks JSON but repeats XML. And we have OGG in the "application" category, while an audio category already exists:
audio/mpeg
audio/x-ms-wma
audio/vnd.rn-realaudio
audio/x-wav
Thanks for any insight.

This is largely addressed by RFC 2046.
In particular, text/*:
The "text" media type is intended for sending material which is principally textual in form … there are many formats for representing what might
be known as "rich text". An interesting characteristic of many such
representations is that they are to some extent readable even without
the software that interprets them. It is useful, then, to
distinguish them, at the highest level, from such unreadable data as
images, audio, or text represented in an unreadable form.
and
The "application" media type is to be used for discrete data which do
not fit in any of the other categories, and particularly for data to
be processed by some type of application program.
Some document types could fall under text/* or application/*. You raised the example of XML. Compare an XHTML document (mostly plain text with some semantic markup around it) with an SVG document (mostly descriptions of lines and points, often expressed as long strings of numbers or single letters).
And we have OGG in the "application" category, while an audio category already exists:
Ogg is a container format, not an audio format; it can hold video, audio, or other streams.
And sometimes there are just mistakes (or things that are matters of opinion).
text/javascript (obsolete)
JavaScript is meant to be executed by software rather than read by humans, so arguably it belongs under application/*, but it was initially registered under text/*. The RFCs were updated to move it to application/javascript, but almost nobody paid attention to that, and it was eventually moved back under the weight of established practice.
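As a quick illustration of how these registrations play out in practice, Python's standard mimetypes module encodes the same text/application split (a sketch; exact mappings can vary slightly between Python versions, but these five are stable):

```python
import mimetypes

# Human-readable formats land under text/*; everything else under application/*.
for name in ("page.html", "notes.txt", "data.json", "report.pdf", "bundle.zip"):
    print(name, "->", mimetypes.guess_type(name)[0])
# page.html -> text/html
# notes.txt -> text/plain
# data.json -> application/json
# report.pdf -> application/pdf
# bundle.zip -> application/zip
```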

Related

How to extract the encoding dictionary from gzip archives

I am looking for a method whereby I can extract the encoding dictionary produced by the DEFLATE algorithm from a gzip archive.
I need the LZ77 pointers from the whole archive that refer back to earlier patterns in the file, as well as the Huffman tree used with those pointers.
Is there any solution in Python?
Does anyone know whether https://github.com/madler/infgen/blob/master/infgen.c might provide the dictionary?
The "dictionary" used for compression at any point in the input is nothing more than the 32K bytes of uncompressed data that precede that point.
Yes, infgen will disassemble a deflate stream, showing all of the LZ77 references and the derived Huffman codes in a readable form. You could run infgen from Python and interpret the output in Python.
infgen also has a -b option for a non-human-readable binary format that might be faster to process for what you want to do.
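Before infgen (or any deflate disassembler) can work on a gzip member, you need the raw deflate stream with the gzip wrapping removed. A minimal Python sketch of that unwrapping, per RFC 1952 (the round-trip check here uses zlib's raw mode in place of infgen):

```python
import gzip
import struct
import zlib

def raw_deflate_from_gzip(data: bytes) -> bytes:
    """Strip the gzip header and trailer (RFC 1952), returning the raw
    deflate stream -- the stream a tool like infgen disassembles."""
    assert data[:2] == b"\x1f\x8b" and data[2] == 8   # magic + deflate method
    flg = data[3]
    pos = 10                                   # fixed 10-byte header
    if flg & 4:                                # FEXTRA: 2-byte length + data
        xlen = struct.unpack_from("<H", data, pos)[0]
        pos += 2 + xlen
    if flg & 8:                                # FNAME: zero-terminated string
        pos = data.index(b"\x00", pos) + 1
    if flg & 16:                               # FCOMMENT: zero-terminated
        pos = data.index(b"\x00", pos) + 1
    if flg & 2:                                # FHCRC: 2-byte header CRC
        pos += 2
    return data[pos:-8]                        # drop 8-byte CRC32+ISIZE trailer

payload = b"banana banana banana"
raw = raw_deflate_from_gzip(gzip.compress(payload))
# wbits=-15 tells zlib to expect a headerless (raw) deflate stream
assert zlib.decompress(raw, -15) == payload
```

The `raw` bytes are what you would write to a file and hand to infgen (or its `-b` mode) for disassembly.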

puff.c: how do I create a deflate stream that will work?

I'm using zlib to deflate a series of arrays using compress(). My test code uses uncompress() and works correctly. Here's my question:
Can I use zlib to compress my array so that it can be decompressed using puff.c? puff.c is available in a much larger application, and I do not have the option of installing zlib as a library.
I ran pufftest.c with zeros.raw successfully, but how do I create zeros.raw?
"raw" means no zlib header or trailer. You can simply strip the two-byte header and four-byte trailer from the output of compress to feed to puff. Better would be to process the zlib header and trailer (documented in RFC 1950), and feed the deflate innards to puff. Then the trailer provides an integrity check on the uncompressed data, as was intended.
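A Python sketch of the stripping described above, assuming zlib.compress is used with its defaults (no preset dictionary, so the zlib header is exactly two bytes); zlib's raw mode stands in for puff() on the decompression side:

```python
import zlib

data = b"hello hello hello hello"

# zlib.compress produces a zlib stream (RFC 1950):
# 2-byte header + raw deflate data + 4-byte big-endian Adler-32 trailer.
zstream = zlib.compress(data)

raw = zstream[2:-4]                           # headerless deflate for puff()
adler = int.from_bytes(zstream[-4:], "big")   # integrity trailer

# Process the trailer ourselves, as the answer suggests:
assert adler == zlib.adler32(data)

# wbits=-15 means raw deflate, which is what puff() consumes.
assert zlib.decompress(raw, -15) == data
```

Writing `raw` to a file gives you the equivalent of zeros.raw for pufftest.c.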

Differences and meanings between gzip header "1f8b0800000000000000" and "1f8b0800000000000003"

As stated in the title, I am deflating data in my iOS and Android apps respectively. The results are mostly identical, except that the headers differ.
In iOS, the header is
1f8b0800000000000003
while on Android, the header is
1f8b0800000000000000
Other than this, the remaining parts are identical. I searched for both header strings but only found results stating that both are gzip headers. What is the difference between them, and what could be causing it?
Thanks!
The GZIP format spec (RFC 1952) says that the gzip member header records the OS on which the conversion took place, with the intent of identifying the filesystem. That is the OS field here:
http://www.zlib.org/rfc-gzip.html#header-trailer
+---+---+---+---+---+---+---+---+---+---+
|ID1|ID2|CM |FLG| MTIME |XFL|OS | (more-->)
+---+---+---+---+---+---+---+---+---+---+
which matches the position in which you observe the difference.
0 stands for an OS with a FAT filesystem, 3 stands for Unix.
Granted, trying to identify filesystem through identifying OS does not sound like a good idea today, but that's the way it was originally designed.
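A small Python sketch parsing the two headers from the question per RFC 1952 makes the difference visible:

```python
# The two headers from the question, parsed per RFC 1952.
ios_hdr = bytes.fromhex("1f8b0800000000000003")
android_hdr = bytes.fromhex("1f8b0800000000000000")

def parse_gzip_header(h: bytes) -> dict:
    """Decode the fixed 10-byte gzip member header."""
    return {
        "magic": h[0:2].hex(),               # 1f8b
        "method": h[2],                      # 8 = deflate
        "flags": h[3],
        "mtime": int.from_bytes(h[4:8], "little"),
        "xfl": h[8],
        "os": h[9],                          # the byte that differs
    }

print(parse_gzip_header(ios_hdr)["os"])      # 3 = Unix
print(parse_gzip_header(android_hdr)["os"])  # 0 = FAT
```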

create .ttf (true type font) file programmatically

I'm interested in creating my own .ttf file using my own code. I did some research and found Apple's specification for .ttf files.
I'm having trouble understanding it, though. Here is an excerpt:
"A TrueType font file consists of a sequence of concatenated tables. A table is a sequence of words. Each table must be long aligned and padded with zeroes if necessary." https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6.html
I opened a .ttf file with Notepad++, expecting to see the tables described above, but just got a bunch of incomprehensible bytes. See attached screenshot.
My question: What are these tables?
Can anybody expand on what I need to do to create these tables? I'm newer to writing code, so maybe the problem is my lack of coding knowledge. If that's the case, could someone point me to a reference where I can educate myself on these tables?
Take a look at the OpenType Cookbook for how to program fonts. If you simply want to look at the tables you mentioned, you'll need a tool like TTX/FontTools to convert the binary tables into something more readable (an XML file in this case).
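If you'd rather poke at the table directory directly: it is just a 12-byte header followed by one 16-byte record per table (tag, checksum, offset, length), all big-endian. A Python sketch using a hand-built toy directory rather than a real font (the searchRange/entrySelector/rangeShift fields are left zero here; real fonts must fill them in):

```python
import struct

def parse_table_directory(data: bytes) -> dict:
    """Parse the sfnt table directory at the start of a TrueType file.
    Returns {tag: (offset, length)} per Apple's TrueType reference."""
    sfnt_version, num_tables = struct.unpack_from(">IH", data, 0)
    tables = {}
    for i in range(num_tables):
        tag, checksum, offset, length = struct.unpack_from(
            ">4sIII", data, 12 + 16 * i)    # 12-byte header, 16-byte records
        tables[tag.decode("latin-1")] = (offset, length)
    return tables

# A tiny hand-built directory (not a usable font, just the bookkeeping):
fake = struct.pack(">IHHHH", 0x00010000, 2, 0, 0, 0)   # header: 2 tables
fake += struct.pack(">4sIII", b"cmap", 0, 44, 10)
fake += struct.pack(">4sIII", b"glyf", 0, 54, 20)

print(parse_table_directory(fake))
# {'cmap': (44, 10), 'glyf': (54, 20)}
```

Running the parser over the first bytes of a real .ttf file will list the actual tables ('cmap', 'glyf', 'head', and so on) that the Apple manual describes.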
I found the answer to my question:
http://www.fileformat.info/tool/hexdump.htm
I uploaded a .ttf file there and the site converted it to hexadecimal form for display. Now I can read the .ttf specification in one window and have an example of the spec being implemented open in another.
Originally I was looking for a binary display, but this hex display is much better for viewing.
Using this hex dump along with the .ttx file makes the .ttf file format a LOT more understandable.
Update:
I found another answer. There's a Python package called 'ufo-extractor' that converts .otf or .ttf files into .ufo files. A .ufo file is a human-readable font file. See:
http://unifiedfontobject.org/

REST API having same object, but light

We are building a REST API and we want to return the same object, but one call returns a 'light' version (without all the fields).
What is the best practice?
1st case
full version: http://api.domain.com/myobject/{objectId}
light version: http://api.domain.com/myobject/{objectId}?filter=light
2nd case
full version: http://api.domain.com/myobject/{objectId}/details
light version: http://api.domain.com/myobject/{objectId}
3rd case
full version: http://api.domain.com/myobject/{objectId}?full=true
light version: http://api.domain.com/myobject/{objectId}
4th case ?
Any link to a documented resource of a REST API is welcome !
Thanks.
This should be handled through content negotiation; that's what it's for. Content negotiation is how a client requests which representation of a resource it wants to see. Consider the case of a picture: image/x-canon-cr2, image/jpeg, image/png.
Ostensibly these are all the same image, but in different formats.
So, this is the mechanism you really want to use for a "lite" version of your resource. For example you could use:
"application/xhtml+xml" for the main version
"application/xhtml+xml; lite" for the lightweight version
So, for a full resource:
GET /resource
Accept: application/xhtml+xml
For a light version
GET /resource
Accept: application/xhtml+xml; lite
For either, but preferring the lite version:
GET /resource
Accept: application/xhtml+xml;lite, application/xhtml+xml
(The more specific specifier, i.e. the one with ;lite, has higher priority than the plain application/xhtml+xml.)
If you will take either, but prefer the full version:
GET /resource
Accept: application/xhtml+xml;lite;q=0.1, application/xhtml+xml
Those without a quality factor default to 1.0, so 0.1 is less than 1.0 and you will get the full version if available over the lite version.
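A simplified sketch of how a server might rank those media ranges by q factor (not a full RFC 7231 Accept parser, just the prioritization idea):

```python
def parse_accept(header: str) -> list:
    """Rank an Accept header's media ranges by quality factor (q),
    defaulting to 1.0 when no q parameter is given."""
    ranked = []
    for part in header.split(","):
        params = [p.strip() for p in part.split(";")]
        q = 1.0
        kept = [params[0]]                  # the media type itself
        for p in params[1:]:
            if p.startswith("q="):
                q = float(p[2:])            # quality factor parameter
            else:
                kept.append(p)              # media-type parameters like "lite"
        ranked.append((q, ";".join(kept)))
    ranked.sort(key=lambda t: -t[0])        # highest preference first
    return [media for _, media in ranked]

print(parse_accept("application/xhtml+xml;lite;q=0.1, application/xhtml+xml"))
# ['application/xhtml+xml', 'application/xhtml+xml;lite']
```

With the header from the example above, the plain type (implicit q=1.0) outranks the ;lite variant at q=0.1, so the server should serve the full version when it can.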
Addenda:
The q factor on Accept is effectively used to show the preferences of the client. It is used to prioritize the list of media types that the client accepts. It says "I can handle these media types, but I prefer a over b and b over c".
A JPEG vs. a PNG is no different from the lite vs. full version. The fact that a JPEG looks anything like the original PNG is an optical illusion; the data is far different, and they have different uses. A JPEG is not "lower quality", it's different data. It's "missing fields". If I want, say, the image size, the JPEG will give me that information just as well as a PNG would. In that case, its quality is adequate for the task. If it wasn't adequate, then I shouldn't be requesting it.
I can guarantee that if I have a client that can only process PNG and ask for a JPEG, then that program will not "work equally well" with it. If my son wants Chicken Fingers and I give him Cream of Spinach, there are going to be issues, even though both of those are representations of the resource /dinner.
The "application/xhtml+xml;lite" representation is just that -- a representation, it is NOT the resource itself. That's why the word representation is used. The representations are all simply projections from the actual resource, which is some virtual entity on the server realized internally in some undefined way.
Some representations are normative, some are not.
Representations are manifested through media types, and media types are handled via content negotiation and the Accept header. If you can't handle a representation, then don't ask for it.
This is a con-neg problem.
I don't know what a "media player" has to do with this discussion.
The 1st and 3rd cases have the advantage that one URL identifies a single resource, and the query string requests a particular view of that resource. Choosing between them is a matter of taste, but I tend to prefer defaulting to all the data and saving the options for viewing subsets.