How to determine the Content-Type in an HTTP Response - apache

I'm building a web server as an exercise. When I receive a raw request, it gets parsed into a simple syntax tree, and a response is built by evaluating this tree. My question is this: when sending an HTTP response, does the Content-Type field get set by taking the file extension of the requested resource and looking it up in a dictionary of MIME types? A good example would be the anatomy of how the response for a favicon.ico is built. Any insight into this would be most helpful. Thanks.

By default, the web server looks at the file extension to select the Content-Type it should serve the file as. However, a server-side script can send a custom header (e.g. with PHP's header() function) to override that setting. For example, a JPEG will be interpreted as a PNG if you send a Content-Type of image/png with the following code:
header('Content-Type: image/png');
For requests that do not map to a file, the web server uses the Content-Type header set by the script directly.
The web server maps extensions to MIME types. Since you tagged the question apache: Apache uses the AddType directive to assign a file's MIME type, while IIS and other web servers have similar settings.
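For a hand-written server, the extension lookup described above could be sketched as follows. This is a minimal illustration, not how Apache implements it: the function name content_type_for, the EXTRA_TYPES table and the fallback value are assumptions made for the example. Only mimetypes.guess_type is a real standard-library call.

```python
import mimetypes
from pathlib import PurePosixPath

# Hypothetical fallback for resources whose extension is unknown.
DEFAULT_TYPE = "application/octet-stream"

# Hypothetical explicit table, playing the role of Apache's AddType entries.
EXTRA_TYPES = {
    ".ico": "image/x-icon",
    ".txt": "text/plain",
    ".html": "text/html",
}


def content_type_for(path: str) -> str:
    """Pick a Content-Type for a requested path from its file extension."""
    ext = PurePosixPath(path).suffix.lower()
    if ext in EXTRA_TYPES:
        return EXTRA_TYPES[ext]
    # Fall back to the standard library's extension database.
    guessed, _encoding = mimetypes.guess_type(path)
    return guessed or DEFAULT_TYPE


print(content_type_for("/favicon.ico"))
```

A script-generated response would skip this lookup and use whatever Content-Type the script set.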

Related

Azure Web Application Firewall (WAF) not differentiating file uploads from normal posts and returning 413

The Azure WAF can be configured to check the maximum size of a request like this:
Despite this configuration, whenever we upload a file the WAF treats it as a "not file upload operation" and returns 413 "Request entity too large" if the file exceeds 128 KB.
We are sending the POST request with what we think are the right headers:
Content-disposition: attachment; filename="testImage.jpg"
Content-Length: 2456088
Content-Type: image/jpeg
But it makes no difference. Any idea why the WAF does not recognize this as a file upload and applies the Max request body size limit instead of the Max file upload check?
After several conversations with Microsoft, we found that the WAF treats content as a file attachment only if it is sent using multipart/form-data.
If you send it this way, the WAF will understand that it is a file and will apply the limits configured for files rather than those for request bodies.
There is no other way to send files supported by the WAF for now.
From documentation:
Only requests with Content-Type of multipart/form-data are considered
for file uploads. For content to be considered as a file upload, it
has to be a part of a multipart form with a filename header. For all
other content types, the request body size limit applies.
Please note that the filename header also needs to be present in the request for the WAF to consider it a file upload.
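As an illustration of the request shape the documentation describes, here is a rough sketch that wraps a file in a multipart/form-data body with a filename. The helper name build_multipart_upload and the form field name "file" are assumptions for the example; the WAF does not prescribe them.

```python
import uuid


def build_multipart_upload(field_name: str, filename: str,
                           payload: bytes, file_type: str):
    """Wrap a single file in a multipart/form-data body.

    Returns the request headers and the encoded body.
    """
    boundary = uuid.uuid4().hex
    # Part headers: the filename parameter marks this part as a file.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {file_type}\r\n"
        "\r\n"
    ).encode("utf-8")
    # Closing delimiter that ends the multipart body.
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    body = head + payload + tail
    headers = {
        "Content-Type": f"multipart/form-data; boundary={boundary}",
        "Content-Length": str(len(body)),
    }
    return headers, body


headers, body = build_multipart_upload(
    "file", "testImage.jpg", b"\xff\xd8\xff", "image/jpeg"
)
print(headers["Content-Type"])
```

Compared with the request in the question, the image/jpeg type moves into the part headers, and the request's own Content-Type becomes multipart/form-data.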

How to enable CORS for Catalyst

Having a Perl Catalyst application, which produces JSON, I need to read that JSON content using jQuery within an HTML page, served by an Apache server. Both applications, Catalyst and Apache are running on the same host.
When I access the Catalyst URL from the page served by Apache, I get the error:
Access to XMLHttpRequest at 'http://localhost:3000/abc/json_list' from origin 'http://localhost:8888' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource.
As I have read in many topics, a header (or more) must be set. In this case it must be set in Catalyst, but I don't know how.
Any hint?
Catalyst allows you to set response headers using the header method on the response object.
$c->res->header( "Access-Control-Allow-Origin" => "http://localhost:8888" );
Consider using a controller's sub auto or using existing middleware if you have multiple endpoints that need to provide permission via CORS.
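When several endpoints or origins are involved, the decision behind that header is usually an allowlist check. The sketch below shows the idea in Python rather than Perl, purely as a language-neutral illustration. The names ALLOWED_ORIGINS and cors_headers are hypothetical and are not part of Catalyst.

```python
# Hypothetical allowlist; the origin is the one from the error message above.
ALLOWED_ORIGINS = frozenset({"http://localhost:8888"})


def cors_headers(request_origin, allowed=ALLOWED_ORIGINS):
    """Return the CORS response headers for a request's Origin header."""
    if request_origin in allowed:
        return {
            # Echo back only origins that are explicitly allowed.
            "Access-Control-Allow-Origin": request_origin,
            # Tell caches that the response varies by Origin.
            "Vary": "Origin",
        }
    # Unknown origin: send no CORS headers, so the browser blocks the read.
    return {}


print(cors_headers("http://localhost:8888"))
```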

How can a CGI script decode a multipart/form-data

Let's say an HTTP POST request is made with this header
Content-Type: multipart/form-data; boundary=...
and then, the body is built accordingly.
If I understand correctly, when the web server passes the request to a CGI application, it sets some environment variables and the body is sent on stdin. So, the CGI app does not have access to the headers (except through some environment variables).
Then, how can a CGI application decode the body (stdin) if it does not have access to the header (Content-Type)?
The Content-Type header is passed to the CGI application in the CONTENT_TYPE environment variable, so the boundary is available there. This link explains in detail how a CGI application can read a multipart form.
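A rough sketch of that flow is shown below: the CGI program reads CONTENT_TYPE and CONTENT_LENGTH from the environment, reads the body from stdin, and splits it at the boundary. The function names are made up for the example, and the standard email parser is used here only as a convenient MIME splitter. It is adequate for simple text payloads, while large or binary uploads are better served by a dedicated multipart parser.

```python
import os
import sys
from email import message_from_bytes
from email.policy import default


def parse_multipart(content_type: str, body: bytes) -> dict:
    """Split a multipart/form-data body into {field name: (filename, data)}."""
    # Re-attach the Content-Type header so the MIME parser knows the boundary.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    message = message_from_bytes(raw, policy=default)
    fields = {}
    for part in message.iter_parts():
        name = part.get_param("name", header="content-disposition")
        # get_filename() returns None for ordinary (non-file) fields.
        fields[name] = (part.get_filename(), part.get_payload(decode=True))
    return fields


def read_cgi_form() -> dict:
    """Read the form the way a CGI program would: environment plus stdin."""
    content_type = os.environ.get("CONTENT_TYPE", "")
    length = int(os.environ.get("CONTENT_LENGTH") or 0)
    return parse_multipart(content_type, sys.stdin.buffer.read(length))
```

Only parse_multipart does the decoding. read_cgi_form just gathers its inputs from where the CGI convention puts them.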

robots.txt error : Content Type should be text/plain

I'm testing my site with software called Search Engine Optimization (SEO) Toolkit 1.0, and it displays this error:
The content type for the response from "http://mysite.com/robots.txt"
is "text/html". The Web server should return "text/plain" as the
content type for a Robots.txt file.
My robots.txt file is simply this :
User-agent: *
Allow: /
Saved with UTF-8 Without BOM Encoding.
Is this wrong?
What should be a default, harmless robots.txt file?
Thanks !
This is a MIME type issue and needs to be configured in your server.
Here is a link: http://www.nextthing.org/archives/2007/03/12/robotstxt-adventure
Your specific hosting provider inserts a small snippet of tracking JavaScript into responses. Disable that feature, following the customer service instructions in the comments there, and the correct MIME type should be returned.

Specify content-type for documents uploaded in Magnolia

We have uploaded an mp4 video file into our Magnolia DMS, which fails to play on Safari (Mac/iPad). Investigation shows that the Content-Type returned by Magnolia is "application/octet-stream" for the request. When serving the file through Tomcat directly, the correct Content-Type "video/mp4" is returned and video playback works.
How can we configure the content-type to be returned in Magnolia?
We know the content-type is a function of the request (e.g. if we add ".jpg" to the URL the type returned is "image/jpeg"), but couldn't use this knowledge to come up with a solution.
Update:
We found the MIME configuration and could change the Content-Type for "mp4" to "video/mp4". However, the Content-Type returned by Magnolia is now
Content-Type: video/mp4;charset=UTF-8
while the correct, working Content-Type returned for files hosted by Tomcat is
Content-Type: video/mp4
Is it possible to make Magnolia not append any charset info to the Content-Type?
Glad you found the MIME configuration OK.
Both the MIME type and the character encoding are set in ContentTypeFilter.java and MIMEMapping.java. You can specify a charset for a MIME type yourself by including it in the mime-type definition. (E.g. "video/mp4;charset=UTF-8".)
If you don't include one, however, Magnolia automatically assigns the default (in this case, UTF-8). If you want to change this behavior, you'd need to tweak the source code.
Out of curiosity, is the charset causing you any trouble, or are you just trying to get Magnolia to match what Tomcat does by default?
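To compare the two header values from the question, it can help to separate the media type from its parameters. The sketch below does that with the standard library's email header parser. The function name split_content_type is an assumption for the example and has nothing to do with Magnolia's own code.

```python
from email.message import EmailMessage


def split_content_type(value: str):
    """Split a Content-Type header value into (media type, parameters)."""
    message = EmailMessage()
    message["Content-Type"] = value
    header = message["Content-Type"]
    # content_type is the bare "type/subtype"; params holds e.g. charset.
    return header.content_type, dict(header.params)


print(split_content_type("video/mp4;charset=UTF-8"))
print(split_content_type("video/mp4"))
```

Both values carry the same media type, video/mp4, and differ only in the charset parameter.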