header charset. where does it come from? - apache

I have a problem with the charset in the HTTP header. It shows "utf-8":
Content-Type: text/html; charset=utf-8
Where can I change this to iso-8859-1? I changed the default charset in the Apache and nginx configs, but that did not fix the issue.

It's coming from your application rather than from Apache or nginx. How do you call your application: via CGI, or via FastCGI to PHP?

Related

HTTP headers automatically set

I am starting to learn HTTP properly.
I am working with a LAMP stack.
On the command line I am requesting a local page, served by Apache, to see the headers that are returned.
curl -i local.testsite
The page I am requesting has no content and I am not setting any headers, but the response already contains a lot of headers, such as:
HTTP/1.1 200 OK
Date: Thu, 17 Jan 2013 20:28:52 GMT
Server: Apache/2.2.22 (Ubuntu)
X-Powered-By: PHP/5.3.10-1ubuntu3.4
Vary: Accept-Encoding
Content-Length: 0
Content-Type: text/html
So if I am not setting these, does Apache set them automatically?
Yes, Apache is setting those by default. By the way, if you only care about the headers, you should use
curl -I local.testsite
-I sends an HTTP HEAD request and returns the headers only, so even if the page had content you would get just the headers.
Some are set by PHP:
The X-Powered-By header is set by the expose_php INI setting.
The Content-Type header is set by the default_mimetype INI setting.
The others are set by Apache:
The Server header is controlled by the ServerTokens directive (ServerSignature only affects the footer of server-generated pages).
The Vary: Accept-Encoding header is usually sent when mod_deflate is enabled.
Date and Content-Length are not configurable as they are part of the HTTP spec. Date is included as a MUST (except under some conditions) and Content-Length as a SHOULD.
See also How to remove date header from apache? and How to disable the Content-Length response header with Apache?.
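As a rough illustration of the attribution given in the answers above, the following Python sketch parses the response from the question and labels each header with the component said to set it. The mapping table is an assumption copied from those answers; the script does not detect anything on a real server, and the names attribute_headers and HEADER_SOURCES are made up for this example.

```python
# Attribution copied from the answers above; this is a lookup table,
# not something the script detects on a live server.
HEADER_SOURCES = {
    "x-powered-by": "PHP (expose_php)",
    "content-type": "PHP (default_mimetype)",
    "server": "Apache",
    "vary": "Apache (mod_deflate)",
    "date": "Apache (required by the HTTP spec)",
    "content-length": "Apache",
}

# The response shown in the question.
RAW_RESPONSE = """\
HTTP/1.1 200 OK
Date: Thu, 17 Jan 2013 20:28:52 GMT
Server: Apache/2.2.22 (Ubuntu)
X-Powered-By: PHP/5.3.10-1ubuntu3.4
Vary: Accept-Encoding
Content-Length: 0
Content-Type: text/html
"""


def attribute_headers(raw):
    """Return (name, value, source) for every header line in a raw response."""
    rows = []
    for line in raw.splitlines()[1:]:  # skip the status line
        if not line.strip():
            continue
        # Split on the first colon only; the Date value contains colons too.
        name, _, value = line.partition(":")
        source = HEADER_SOURCES.get(name.strip().lower(), "unknown")
        rows.append((name.strip(), value.strip(), source))
    return rows


if __name__ == "__main__":
    for name, _value, source in attribute_headers(RAW_RESPONSE):
        print(f"{name:<16} {source}")
```

Header names are compared case-insensitively because HTTP header names are case-insensitive.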

How to specify the character set in the HTTP Content-Type response header?

I had my site tested with the Page Speed app from Google and one of the suggestions was to specify the character set in the HTTP Content-Type response header claiming it was better than just in a meta tag.
Here's what I understand I need to write:
Content-Type: text/html; charset=UTF-8
..but where exactly should I put this? I'm on a shared server.
Thank you!
Apache: add this to the .htaccess file in your root directory:
AddDefaultCharset UTF-8
It will change the header from this:
Content-Type: text/html
...to this:
Content-Type: text/html; charset=UTF-8
nginx:
server {
# other server config...
charset utf-8;
}
Add charset utf-8; to the server block (and reload the nginx config).
When I added this, my response header looked like this:
HTTP/1.1 200 OK
Content-Type: text/html,text/html;charset='UTF-8'
Vary: Accept-Encoding
Server: Microsoft-IIS/7.5
With Apache, you use http://httpd.apache.org/docs/2.2/mod/core.html#adddefaultcharset
With IIS you edit the MIME type for the filetype in the list of files.
With most server-side technologies like PHP or ASP.NET there's a method or property provided by that technology. For example in ASP.NET you can set it in config, page, or page code-behind.
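To check what a server actually sends after one of the changes above, it can help to split a Content-Type value into its media type and charset parameter. The sketch below uses Python's standard email.message.Message for the parsing; the function name split_content_type is an assumption for this example.

```python
from email.message import Message


def split_content_type(header_value):
    """Return (media_type, charset) for a Content-Type header value.

    charset is None when the header carries no charset parameter.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    # Both accessors normalise their result to lower case.
    return msg.get_content_type(), msg.get_content_charset()


if __name__ == "__main__":
    print(split_content_type("text/html"))
    print(split_content_type("text/html; charset=UTF-8"))
```

A value without a charset parameter, such as the "before" header in the Apache answer, yields None for the charset, which is the case the Page Speed suggestion is about.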

mod_proxy_ajp error: renders html as text/plain, prompts user to "save as..."

We have an odd, intermittent error that occurs with mod_proxy_ajp, i.e. using apache as a front end to a tomcat server.
The error
User clicks on a link
Browser prompts user to "save as..." (e.g. in Firefox: "You have chosen to open thread.jsp which is a: application/octet-stream... What should Firefox do with this file?")
User says "Huh?" and presses "Cancel"
User clicks again on the same link
Browser displays the page correctly
This error occurs intermittently, but unfortunately rarely on our test server and frequently on production.
In firefox's LiveHttpHeaders I see the following in the above usecase:
first page download (i.e. click on link) is "text/plain"
second download is "text/html"
I thought the problem may stem from ProxyPassReverse (i.e. muddling up whether to use http or ajp), but all these proxypassreverse settings resulted in the same error:
ProxyPassReverse /ajp://localhost:8080/
ProxyPassReverse /pe http://localhost/pe
ProxyPassReverse /pe http://forumstest.company.com/pe
Additionally, I've checked the apache error logs (set to debug) and see no warnings or errors...
But it works with mod_proxy_http?
Switching to mod_proxy_http appears to "solve" the problem. In limited testing, I have not been able to recreate the problem in the test environment.
Because the problem is intermittent, I'm not 100% sure that mod_proxy_http really solves it.
Environment
Apache 2.2 Windows
Jboss 4.2.2 back end (tomcat 6)
One other data point
For better or worse, a servlet filter in Tomcat gzips the HTML before sending it to Apache (which means extra work, as Apache must unzip it before it performs ProxyPassReverse's "find and replace"). I don't know whether the gzipping is what breaks things.
Questions
anyone seen this before?
what tools help analyze the cause?
thanks
Addendum 1: Here is the LiveHttpHeaders output
Browser Incorrectly sees html as "text/plain"
http://forums.customer.com/pe/action/forums/displaythread?rootPostID=10842016&channelID=1&portalPageId=1002
GET http://forums.customer.com/pe/action/forums/displaythread?rootPostID=10842016&channelID=1&portalPageId=1002 HTTP/1.1
Host: forums.customer.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13 ( .NET CLR 3.5.30729)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Proxy-Connection: keep-alive
Cookie: __utma=156962862.829309431.1260304144.1297956514.1297958674.234; __utmz=156962862.1296760237.232.50.utmcsr=forumstest.customer.com|utmccn=(referral)|utmcmd=referral|utmcct=/pe/action/forums/displaythread; s_vi=[CS]v1|258F5B88051D3FC3-40000105C056085F[CE]; inqVital=xd|0^sesMgr|{"sID":4,"lsts":1292598007}^incMgr|{"id":"755563420055418864","group":"CHAT","ltt":1292598006741,"sid":"755563549194447187","igds":"1290627502757","exempt":false}^inq|{"customerID":"755562378269271622"}^saleMgr|{"state":"UNSOLD","qDat":{},"sDat":{}}; inqState=sLnd|1^Lnd|{"c":4,"flt":1274728016,"lldt":17869990,"pgs":{"201198":{"c":1,"flt":1274728016,"lldt":0},"0":{"c":3,"flt":1274845009,"lldt":17752997}},"pq":["0","0","0","201198"],"fsld":1274728016697}; adv_search_results_page=10; ep_beta=1; visitorID=57307059; JSESSIONID=6jXLNdHRDjR9Th3B5gvTVkw1dZLn1zvhvKLR2r4GTLjylHJgjY3Q!683274050; __utmc=156962862; JSESSIONID=6jXLNdHRDjR9Th3B5gvTVkw1dZLn1zvhvKLR2r4GTLjylHJgjY3Q!683274050; TLTHID=5CCA50304DE99E28DB79A7B3267D4231; TLTSID=9DFCDE8045B374AAB752CC98A30E8311; AreCookiesEnabled=1; s_cc=true; SC_LINKS=%5B%5BB%5D%5D; s_sq=%5B%5BB%5D%5D; __utmb=156962862.64.10.1297958674; memberexists=T; ev1=greywolf%20hdtv%20whmx
Cache-Control: max-age=0
HTTP/1.0 200 OK
Date: Thu, 17 Feb 2011 17:38:42 GMT
Content-Type: text/plain
X-Cache: MISS from samus.company.com
X-Cache-Lookup: MISS from samus.company.com:3128
Via: 1.0 samus.company.com:3128 (squid/2.6.STABLE20)
Proxy-Connection: close
----------------------------------------------------------
Browser Correctly sees html as "text/html"
http://forums.customer.com/pe/action/forums/displaythread?rootPostID=10842016&channelID=1&portalPageId=1002
GET http://forums.customer.com/pe/action/forums/displaythread?rootPostID=10842016&channelID=1&portalPageId=1002 HTTP/1.1
Host: forums.customer.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13 ( .NET CLR 3.5.30729)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Proxy-Connection: keep-alive
Cookie: __utma=156962862.829309431.1260304144.1297956514.1297958674.234; __utmz=156962862.1296760237.232.50.utmcsr=forumstest.customer.com|utmccn=(referral)|utmcmd=referral|utmcct=/pe/action/forums/displaythread; s_vi=[CS]v1|258F5B88051D3FC3-40000105C056085F[CE]; inqVital=xd|0^sesMgr|{"sID":4,"lsts":1292598007}^incMgr|{"id":"755563420055418864","group":"CHAT","ltt":1292598006741,"sid":"755563549194447187","igds":"1290627502757","exempt":false}^inq|{"customerID":"755562378269271622"}^saleMgr|{"state":"UNSOLD","qDat":{},"sDat":{}}; inqState=sLnd|1^Lnd|{"c":4,"flt":1274728016,"lldt":17869990,"pgs":{"201198":{"c":1,"flt":1274728016,"lldt":0},"0":{"c":3,"flt":1274845009,"lldt":17752997}},"pq":["0","0","0","201198"],"fsld":1274728016697}; adv_search_results_page=10; ep_beta=1; visitorID=57307059; JSESSIONID=6jXLNdHRDjR9Th3B5gvTVkw1dZLn1zvhvKLR2r4GTLjylHJgjY3Q!683274050; __utmc=156962862; JSESSIONID=6jXLNdHRDjR9Th3B5gvTVkw1dZLn1zvhvKLR2r4GTLjylHJgjY3Q!683274050; TLTHID=5CCA50304DE99E28DB79A7B3267D4231; TLTSID=9DFCDE8045B374AAB752CC98A30E8311; AreCookiesEnabled=1; s_cc=true; SC_LINKS=%5B%5BB%5D%5D; s_sq=%5B%5BB%5D%5D; __utmb=156962862.64.10.1297958674; memberexists=T; ev1=greywolf%20hdtv%20whmx
Cache-Control: max-age=0
HTTP/1.0 200 OK
Date: Thu, 17 Feb 2011 17:38:44 GMT
X-Powered-By: Servlet 2.4; JBoss-4.2.1.GA (build: SVNTag=JBoss_4_2_1_GA date=200707131605)/Tomcat-5.5
Content-Encoding: gzip
Content-Type: text/html;charset=UTF-8
Content-Length: 24739
X-Cache: MISS from samus.company.com
X-Cache-Lookup: MISS from samus.company.com:3128
Via: 1.0 samus.company.com:3128 (squid/2.6.STABLE20)
Proxy-Connection: keep-alive
----------------------------------------------------------
Addendum 2: Additional Information
The browser did receive the "gzipped" file. I had earlier clicked "save as..." when a few of these errors occurred. Gunzip successfully processed the files and converted them to html.
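The manual check described in Addendum 2 (saving the file and running gunzip on it) can be sketched in Python. The example assumes the saved download is available as bytes; looks_gzipped and recover_html are hypothetical helper names. Gzip streams start with the two bytes 0x1f 0x8b, which is what the check relies on.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def looks_gzipped(data):
    """Return True if the payload starts with the gzip magic number."""
    return data[:2] == GZIP_MAGIC


def recover_html(data, encoding="utf-8"):
    """Decompress the payload if it is gzipped, then decode it as text."""
    if looks_gzipped(data):
        data = gzip.decompress(data)
    return data.decode(encoding)


if __name__ == "__main__":
    saved = gzip.compress(b"<html><body>thread</body></html>")
    print(looks_gzipped(saved))
    print(recover_html(saved))
```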
The answer is here: How to preserve the Content-Type header of a Tomcat HTTP response sent through an AJP connector to Apache using mod_proxy
Set DefaultType to None in the Apache configuration.
# DefaultType: the default MIME type the server will use for a document
# if it cannot otherwise determine one, such as from filename extensions.
# If your server contains mostly text or HTML documents, "text/plain" is
# a good value. If most of your content is binary, such as applications
# or images, you may want to use "application/octet-stream" instead to
# keep browsers from trying to display binary files as though they are
# text.
#
DefaultType None
The first page download is of MIME type "application/octet-stream". Maybe the gzipped stream is being sent back to the browser? (You can confirm by saving the file and looking inside it.)
It would be really helpful to post the Live HTTP Headers request/response traces for the problematic and normal cases. If the content is the same in both cases (the response HTML), then forcing the MIME type to text/html (using the ForceType directive) for that particular context may fix the issue.
If it turns out that gzipped content is being sent to the browser in the problematic case, then digging deeper will be necessary. Is this browser-specific: does it happen only with Firefox, or with all browsers?
OK, based on the new information provided, it looks like Squid is caching the problematic response and not sending the right headers back to the client. So the browsers are doing the right thing here: without Content-Encoding and the right Content-Type they cannot do much else.
Response headers missing from the problematic response:
X-Powered-By: Servlet 2.4; JBoss-4.2.1.GA (build: SVNTag=JBoss_4_2_1_GA date=200707131605)/Tomcat-5.5
Content-Encoding: gzip
Content-Type: text/html;charset=UTF-8
X-Powered-By: its absence suggests that the problematic requests don't actually hit your JBoss server. Can you verify this by checking the access logs? Is Squid caching the response and then sending it back as text/plain?
Reconfiguring Squid to not cache that particular URL should help; see http://www.lirmm.fr/doc/Doc/FAQ/FAQ-7.html section 7.8 for an example (which is specific to Squid 2, but later versions should have similar capabilities).
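The comparison of the two traces can be made mechanical. The Python sketch below takes the two response header blocks from the LiveHttpHeaders output and reports which headers are missing from the bad response and which are present in both but differ. The function names are assumptions for this example.

```python
BAD_RESPONSE = """\
HTTP/1.0 200 OK
Date: Thu, 17 Feb 2011 17:38:42 GMT
Content-Type: text/plain
X-Cache: MISS from samus.company.com
X-Cache-Lookup: MISS from samus.company.com:3128
Via: 1.0 samus.company.com:3128 (squid/2.6.STABLE20)
Proxy-Connection: close
"""

GOOD_RESPONSE = """\
HTTP/1.0 200 OK
Date: Thu, 17 Feb 2011 17:38:44 GMT
X-Powered-By: Servlet 2.4; JBoss-4.2.1.GA (build: SVNTag=JBoss_4_2_1_GA date=200707131605)/Tomcat-5.5
Content-Encoding: gzip
Content-Type: text/html;charset=UTF-8
Content-Length: 24739
X-Cache: MISS from samus.company.com
X-Cache-Lookup: MISS from samus.company.com:3128
Via: 1.0 samus.company.com:3128 (squid/2.6.STABLE20)
Proxy-Connection: keep-alive
"""


def parse_headers(block):
    """Map lower-cased header names to values, skipping the status line."""
    headers = {}
    for line in block.strip().splitlines()[1:]:
        name, _, value = line.partition(":")  # first colon only
        headers[name.strip().lower()] = value.strip()
    return headers


def compare_responses(good_block, bad_block):
    """Return (missing, changed) header names of the bad response."""
    good, bad = parse_headers(good_block), parse_headers(bad_block)
    missing = sorted(name for name in good if name not in bad)
    changed = sorted(n for n in good if n in bad and good[n] != bad[n])
    return missing, changed


if __name__ == "__main__":
    missing, changed = compare_responses(GOOD_RESPONSE, BAD_RESPONSE)
    print("missing:", missing)
    print("changed:", changed)
```

On these traces Content-Type shows up as changed rather than missing: the bad response does carry the header, but with the value text/plain.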
Seems like a content-negotiation problem. Apache is guessing the content type using the "magic" byte and setting the content type incorrectly. That explains why it happens intermittently. Try disabling mod_negotiation and see what happens. See http://httpd.apache.org/docs/2.0/content-negotiation.html for more info.
I saw the following in your settings
ProxyPassReverse /ajp://localhost:8080/
But port 8080 is not the AJP port. The default AJP port is 8009. Could this be your problem?
There is most likely something wrong with your web application, not Apache. If your web app sends back the correct Content-Type, Apache will gladly forward it to the client. No content negotiation will be done in that case. If you do not return any Content-Type, Apache will almost surely substitute text/plain, which is not what you want.
Test your web app without Apache in the middle, make sure that it sends back the correct Content-Type.
This usually happens when Apache serves secure content over a non-secure channel.

Prevent Apache from chunking gzipped content

When using mod_deflate in Apache 2, Apache will chunk gzipped content, setting the Transfer-Encoding: chunked header. While this results in a faster download time, it means I cannot display a progress bar.
If I handle the compression myself in PHP, I can gzip the content completely first and set the Content-Length header, so that I can display a progress bar to the user.
Is there any setting that would change Apache's default behavior and have Apache set a Content-Length header instead of chunking the response, so that I don't have to handle the compression myself?
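The "compress everything first, then set the length" approach from the question is language-independent. A minimal Python sketch of the idea follows; build_gzip_response is a hypothetical helper, and how the headers and body are handed to the web server is left out.

```python
import gzip


def build_gzip_response(body):
    """Compress the whole body up front so its final size is known.

    Returns (headers, compressed_body). Because compression finishes
    before anything is sent, Content-Length can be stated exactly and
    the client can compute download progress from it.
    """
    compressed = gzip.compress(body)
    headers = {
        "Content-Encoding": "gzip",
        "Content-Length": str(len(compressed)),
        "Vary": "Accept-Encoding",
    }
    return headers, compressed


if __name__ == "__main__":
    headers, payload = build_gzip_response(b"<p>hello</p>" * 1000)
    print(headers["Content-Length"], len(payload))
```

The trade-off is the one the question hints at: the server must buffer and compress the full response before the first byte goes out, whereas chunking lets it stream compressed data as it is produced.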
You could try playing with SendBufferSize to get a value big enough to contain your response in one chunk.
Also, chunked content is part of the HTTP/1.1 protocol, so you could force an HTTP/1.0 response (which cannot be chunked: “A server MUST NOT send transfer-codings to an HTTP/1.0 client.”) by setting force-response-1.0 in your Apache configuration. But PHP breaks this setting; it's a long-known bug in PHP, and there's a workaround.
You could try to modify the request on the client side with a header that prevents chunked content, but the HTTP/1.1 spec (RFC 2616) says: "All HTTP/1.1 applications MUST be able to receive and decode the "chunked" transfer-coding", so I don't think there's any header like Accept that can prevent the server from chunking content. You could, however, try to send your request as HTTP/1.0; that is not really a request header but part of the first line of the request, and it may be possible with jQuery.
One last thing: HTTP/1.0 lacks one big thing, namely that the Host header is not mandatory. Verify that your HTTP/1.0 requests still send the Host header if you work with name-based virtual hosts.
Edit: the technique cited in the workaround shows that you can tweak the Apache environment from PHP code. This can be used to force 1.0 mode only for your special gzipped content, and you should use it to avoid having your complete application served in HTTP/1.0 (or use the request mode to set HTTP/1.0 for your gzip requests).
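To see why a chunked response cannot drive a progress bar, it helps to look at the framing itself: each chunk is prefixed with its own size in hexadecimal, and a zero-sized chunk ends the body, so the total length is never announced up front. The Python sketch below encodes and decodes that framing in a simplified form (no trailers; chunk extensions are ignored). The function names are assumptions for this example.

```python
def encode_chunked(parts):
    """Frame byte strings as an HTTP/1.1 chunked body."""
    out = b""
    for part in parts:
        if part:  # a zero-length chunk would terminate the body early
            out += f"{len(part):x}".encode("ascii") + b"\r\n" + part + b"\r\n"
    return out + b"0\r\n\r\n"  # last-chunk marker plus final CRLF


def decode_chunked(data):
    """Reassemble the body from a chunked payload."""
    body = b""
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # The size line may carry ";extension" data, which is ignored here.
        size = int(data[pos:line_end].split(b";")[0], 16)
        pos = line_end + 2
        if size == 0:
            return body
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF


if __name__ == "__main__":
    wire = encode_chunked([b"Hello, ", b"world"])
    print(wire)
    print(decode_chunked(wire))
```

A receiver only learns the size of the chunk it is currently reading, which is why the answers above look for ways to get a Content-Length header instead.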

YSlow doesn't recognize my gzip

My site is all happily gzipped according to:
http://www.gidnetwork.com/tools/gzip-test.php
However, when I run it through YSlow I get an F for gzip, and it lists all of my scripts as components that are not gzipped.
Any ideas?
Have a look at the headers in Firebug and check that the browser is sending
Accept-Encoding: gzip,deflate
in the request headers and that
Content-Encoding: gzip
is being sent by the server in the response headers (indicating that gzipping has been applied).
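The two-header check described above can be written down as a small Python sketch. It assumes the request and response headers have already been captured into plain dicts with the exact key spellings shown (a real capture may use different capitalisation); gzip_status and its helpers are hypothetical names.

```python
def accepts_gzip(accept_encoding):
    """Return True if an Accept-Encoding value lists gzip with q > 0."""
    for item in accept_encoding.split(","):
        coding, _, params = item.partition(";")
        if coding.strip().lower() != "gzip":
            continue
        params = params.strip().lower().replace(" ", "")
        if params.startswith("q="):
            try:
                return float(params[2:]) > 0
            except ValueError:
                return False
        return True
    return False


def is_gzipped(content_encoding):
    """Return True if a Content-Encoding value includes gzip."""
    return "gzip" in [c.strip().lower() for c in content_encoding.split(",")]


def gzip_status(request_headers, response_headers):
    """Summarise whether gzip was requested and whether it was applied."""
    asked = accepts_gzip(request_headers.get("Accept-Encoding", ""))
    applied = is_gzipped(response_headers.get("Content-Encoding", ""))
    if applied:
        return "gzipped"
    if asked:
        return "not gzipped although the client accepts gzip"
    return "not gzipped; the client did not ask for gzip"


if __name__ == "__main__":
    request = {"Accept-Encoding": "gzip,deflate"}
    print(gzip_status(request, {"Content-Encoding": "gzip"}))
    print(gzip_status(request, {"Content-Type": "application/javascript"}))
```

Running the check per component (each script, stylesheet, and page) is what matters here, since a site can gzip its HTML and still serve static files uncompressed.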
If you used the method in the linked pages to gzip your site, it won't have any effect on the scripts, as they are not run through PHP. You'll need to either:
1) configure your web server of choice (Apache 2 uses mod_deflate)
2) serve your .js files through PHP, sending a JavaScript Content-Type so that PHP's default text/html is not used:
<?php header('Content-Type: application/javascript'); ob_start('ob_gzhandler'); echo file_get_contents('whatever.js'); ?>