mod_proxy_ajp error: renders html as text/plain, prompts user to "save as..." - apache

We have an odd, intermittent error that occurs with mod_proxy_ajp, i.e. using apache as a front end to a tomcat server.
The error
User clicks on a link browser prompts
user to "save as...." (e.g. in
Firefox "You have chosen top open
thread.jsp which is a
application/octet-stream"...What
should firefox do with this file)
User says "Huh?" and presses "Cancel"
User clicks again on the same link
Browser displays the page correctly
This error occurs intermittently, but unfortunately rarely on our test server and frequently on production.
In firefox's LiveHttpHeaders I see the following in the above usecase:
first page download (i.e. click on link) is "text/plain"
second download is "text/html"
I thought the problem may stem from ProxyPassReverse (i.e. muddling up whether to use http or ajp), but all these proxypassreverse settings resulted in the same error:
ProxyPassReverse /ajp://localhost:8080/
ProxyPassReverse /pe http://localhost/pe
ProxyPassReverse /pe http://forumstest.company.com/pe
Additionally, I've checked the apache error logs (set to debug) and see no warnings or errors...
** But it works with mod_proxy_http ?? **
It appears that switching to mod_proxy_http 'solves' the problem. Limited testing, I have not been able to recreate the problem in the test environment.
Because the problem is intermittent, I'm not 100% sure that mod_proxy_http "solves" the problem
Environment
Apache 2.2 Windows
Jboss 4.2.2 back end (tomcat 6)
One other data point
For better or worse, a servlet filter in tomcat gzips the html before sending it to apache. (which means extra work as apache must unzip before it performs ProxyPassReverse's "find and replace"). I don't know if "gzip" messes up.
Questions
anyone seen this before?
what tools help analyze the cause?
thanks
Addendum 1: Here is the LiveHttpHeaders output
Browser Incorrectly sees html as "text/plain"
http://forums.customer.com/pe/action/forums/displaythread?rootPostID=10842016&channelID=1&portalPageId=1002
GET http://forums.customer.com/pe/action/forums/displaythread?rootPostID=10842016&channelID=1&portalPageId=1002 HTTP/1.1
Host: forums.customer.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13 ( .NET CLR 3.5.30729)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Proxy-Connection: keep-alive
Cookie: __utma=156962862.829309431.1260304144.1297956514.1297958674.234; __utmz=156962862.1296760237.232.50.utmcsr=forumstest.customer.com|utmccn=(referral)|utmcmd=referral|utmcct=/pe/action/forums/displaythread; s_vi=[CS]v1|258F5B88051D3FC3-40000105C056085F[CE]; inqVital=xd|0^sesMgr|{"sID":4,"lsts":1292598007}^incMgr|{"id":"755563420055418864","group":"CHAT","ltt":1292598006741,"sid":"755563549194447187","igds":"1290627502757","exempt":false}^inq|{"customerID":"755562378269271622"}^saleMgr|{"state":"UNSOLD","qDat":{},"sDat":{}}; inqState=sLnd|1^Lnd|{"c":4,"flt":1274728016,"lldt":17869990,"pgs":{"201198":{"c":1,"flt":1274728016,"lldt":0},"0":{"c":3,"flt":1274845009,"lldt":17752997}},"pq":["0","0","0","201198"],"fsld":1274728016697}; adv_search_results_page=10; ep_beta=1; visitorID=57307059; JSESSIONID=6jXLNdHRDjR9Th3B5gvTVkw1dZLn1zvhvKLR2r4GTLjylHJgjY3Q!683274050; __utmc=156962862; JSESSIONID=6jXLNdHRDjR9Th3B5gvTVkw1dZLn1zvhvKLR2r4GTLjylHJgjY3Q!683274050; TLTHID=5CCA50304DE99E28DB79A7B3267D4231; TLTSID=9DFCDE8045B374AAB752CC98A30E8311; AreCookiesEnabled=1; s_cc=true; SC_LINKS=%5B%5BB%5D%5D; s_sq=%5B%5BB%5D%5D; __utmb=156962862.64.10.1297958674; memberexists=T; ev1=greywolf%20hdtv%20whmx
Cache-Control: max-age=0
HTTP/1.0 200 OK
Date: Thu, 17 Feb 2011 17:38:42 GMT
Content-Type: text/plain
X-Cache: MISS from samus.company.com
X-Cache-Lookup: MISS from samus.company.com:3128
Via: 1.0 samus.company.com:3128 (squid/2.6.STABLE20)
Proxy-Connection: close
----------------------------------------------------------
Browser Correctly sees html as "text/html"
http://forums.customer.com/pe/action/forums/displaythread?rootPostID=10842016&channelID=1&portalPageId=1002
GET http://forums.customer.com/pe/action/forums/displaythread?rootPostID=10842016&channelID=1&portalPageId=1002 HTTP/1.1
Host: forums.customer.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13 ( .NET CLR 3.5.30729)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Proxy-Connection: keep-alive
Cookie: __utma=156962862.829309431.1260304144.1297956514.1297958674.234; __utmz=156962862.1296760237.232.50.utmcsr=forumstest.customer.com|utmccn=(referral)|utmcmd=referral|utmcct=/pe/action/forums/displaythread; s_vi=[CS]v1|258F5B88051D3FC3-40000105C056085F[CE]; inqVital=xd|0^sesMgr|{"sID":4,"lsts":1292598007}^incMgr|{"id":"755563420055418864","group":"CHAT","ltt":1292598006741,"sid":"755563549194447187","igds":"1290627502757","exempt":false}^inq|{"customerID":"755562378269271622"}^saleMgr|{"state":"UNSOLD","qDat":{},"sDat":{}}; inqState=sLnd|1^Lnd|{"c":4,"flt":1274728016,"lldt":17869990,"pgs":{"201198":{"c":1,"flt":1274728016,"lldt":0},"0":{"c":3,"flt":1274845009,"lldt":17752997}},"pq":["0","0","0","201198"],"fsld":1274728016697}; adv_search_results_page=10; ep_beta=1; visitorID=57307059; JSESSIONID=6jXLNdHRDjR9Th3B5gvTVkw1dZLn1zvhvKLR2r4GTLjylHJgjY3Q!683274050; __utmc=156962862; JSESSIONID=6jXLNdHRDjR9Th3B5gvTVkw1dZLn1zvhvKLR2r4GTLjylHJgjY3Q!683274050; TLTHID=5CCA50304DE99E28DB79A7B3267D4231; TLTSID=9DFCDE8045B374AAB752CC98A30E8311; AreCookiesEnabled=1; s_cc=true; SC_LINKS=%5B%5BB%5D%5D; s_sq=%5B%5BB%5D%5D; __utmb=156962862.64.10.1297958674; memberexists=T; ev1=greywolf%20hdtv%20whmx
Cache-Control: max-age=0
HTTP/1.0 200 OK
Date: Thu, 17 Feb 2011 17:38:44 GMT
X-Powered-By: Servlet 2.4; JBoss-4.2.1.GA (build: SVNTag=JBoss_4_2_1_GA date=200707131605)/Tomcat-5.5
Content-Encoding: gzip
Content-Type: text/html;charset=UTF-8
Content-Length: 24739
X-Cache: MISS from samus.company.com
X-Cache-Lookup: MISS from samus.company.com:3128
Via: 1.0 samus.company.com:3128 (squid/2.6.STABLE20)
Proxy-Connection: keep-alive
----------------------------------------------------------
Addendum 2: Additional Information
The browser did receive the "gzipped" file. I had earlier clicked "save as..." when a few of these errors occurred. Gunzip successfully processed the files and converted them to html.

Answer is here: How to preserve the Content-Type header of a Tomcat HTTP response sent through an AJP connector to Apache using mod_proxy
set DefaultType to None in apache configuration.
# DefaultType: the default MIME type the server will use for a document
# if it cannot otherwise determine one, such as from filename extensions.
# If your server contains mostly text or HTML documents, "text/plain" is
# a good value. If most of your content is binary, such as applications
# or images, you may want to use "application/octet-stream" instead to
# keep browsers from trying to display binary files as though they are
# text.
#
DefaultType None

The first page download is of mime type "application/octet-stream". May be the zipped stream is being sent back to the browser? (You can confirm by saving the file and looking inside it.)
It would be really helpful to post the Live HTTP Header request / response traces for problematic and normal cases. If the content is same in both cases - the response HTML - may be forcing the mime type to be text/html (using ForceType directive) for that particular context can fix the issue.
If it turns out to be the case that gzipped content is being sent to the browser in the problematic case - then digging deeper would be necessary. Is this browser specific - only happens with Firefox or happens with all browsers?
Ok, based on the new information provided - looks like Squid is caching the problematic response and not sending the right headers back to the client. So the browsers are doing the right thing here - without the Content-Encoding and right Content-Type they cannot do much else.
Missing response headers for the problematic request :
X-Powered-By: Servlet 2.4; JBoss-4.2.1.GA (build: SVNTag=JBoss_4_2_1_GA date=200707131605)/Tomcat-5.5
Content-Encoding: gzip
Content-Type: text/html;charset=UTF-8
X-Powered-By : It sounds like the problematic requests don't actually hit your JBoss server. Can you verify this by checking access logs? Is Squid caching the response and then sending it back as text/plain?
Reconfiguring Squid to not cache that particular URL should help - see http://www.lirmm.fr/doc/Doc/FAQ/FAQ-7.html section 7.8 for example (which is specific to Squid 2 but later versions should have similar capabilities.)

Seems like a content-negotiation problem. Apache is guessing the content type using the "magic" byte and setting the content type incorrectly. That explains why it happens intermittently. Try disabling mod_negotiation and see what happens. See http://httpd.apache.org/docs/2.0/content-negotiation.html for more info.

I saw the following in your settings
ProxyPassReverse /ajp://localhost:8080/
But port 8080 is not ajp port. The default ajp port is 8009. Could this be your problem?

There is most likely something wrong with your web application, not Apache. If your web app sends back the correct Content-Type, Apache will gladly forward it to the client. No content negotiation will be done in that case. If you do not return any Content-Type, Apache will almost surely substitute text/plain, which is not what you want.
Test your web app without Apache in the middle, make sure that it sends back the correct Content-Type.

It uses to be when apache serves secure content in a non secure channel.

Related

UI not updating when using tomcat with apache & mod_jk

Well I have a weired issue, I was trying fronting tomcat with apache mod_jk using worker.
If I update my form with something like http://server.internal:8080 i.e tomcat then it works fine i.e updates are shown on screen and onrefresh the updates retain.
But if I update form with apache i.e http://server.internal/ then updates are seen in database but on refresh UI shows old values only, after refreshing 5-10 times then UI shows new values.
Also during refresh sometimes it shows old values while sometimes new values in the form.
I am using tomcat 7 + apache 2.2 + mod_jk on windows server.
I have disabled caching modules but still getting error.
Not sure where and how to debug such an issue.
Edited ---------
Request headers with apache
Cache-Control no-cache,no-store,private,pre-check=0,post-check=0,max-age=0
Connection close
Content-Encoding gzip
Content-Length 10174
Content-Type text/html;charset=utf-8
Date Thu, 11 Dec 2014 19:39:36 GMT
Expires -1
Pragma no-cache
Server Apache/2.2.25 (Win32) mod_jk/1.2.40
Vary Accept-Encoding
Request headers with tomcat
Content-Type text/html;charset=utf-8
Date Thu, 11 Dec 2014 19:43:43 GMT
Server Apache-Coyote/1.1
Transfer-Encoding chunked
Response Header with apache
Accept text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding gzip, deflate
Accept-Language en-US,en;q=0.5
Cache-Control max-age=0
Connection keep-alive
Cookie JSESSIONID=7D3ACA49B478E8B3A126B37252B62481
Host server
User-Agent Mozilla/5.0 (Windows NT 6.0; WOW64; rv:34.0) Gecko/20100101 Firefox/34.0
Response header with tomcat
Accept text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding gzip, deflate
Accept-Language en-US,en;q=0.5
Connection keep-alive
Cookie JSESSIONID=7D3ACA49B478E8B3A126B37252B62481
Host server:8080
User-Agent Mozilla/5.0 (Windows NT 6.0; WOW64; rv:34.0) Gecko/20100101 Firefox/34.0
Does not look a caching issue, tried with KeepAlive off also
I think its problem with the browser local history, delete the history in your browser and try again.mostly it will works fine.
This definitely smells like a caching issue. To be sure, explicitly purge your browser's cache (or disable it) and see if that helps. If it does, you may
want to add this in your apache's httpd.conf (replace the pattern */yourform.jsp by something that matches your form page(s)) in order to mark these pages as "uncacheable":
<Proxy */yourform.jsp>
Header unset Pragma
Header Always set Cache-Control: "no-cache,no-store,private,pre-check=0,post-check=0,max-age=0"
Header Always set Pragma: "no-cache"
Header Always set Expires "-1"
</Proxy>

header charset. where does it come from?

I have a problem with charset in http header. it shows "utf-8":
Content-Type: text/html; charset=utf-8
Where I can change this to iso-8859-1? I changed default charset in apache and nginx config, but it did not fix the issue.
it's coming from your application instead of apache and nginx. how you call your application, via CGI or fast cgi to php?

Why browser does not send "If-None-Match" header?

I'm trying to download (and hopefully cache) a dynamically loaded image in PHP. Here are the headers sent and received:
Request:
GET /url:resource/Pomegranate/resources/images/logo.png HTTP/1.1
Host: pome.local
Connection: keep-alive
Cache-Control: max-age=0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.22 (KHTML, like Gecko) Ubuntu Chromium/25.0.1364.160 Chrome/25.0.1364.160 Safari/537.22
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
Cookie: PHPSESSID=fb8ghv9ti6v5s3ekkmvtacr9u5
Response:
HTTP/1.1 200 OK
Date: Tue, 09 Apr 2013 11:00:36 GMT
Server: Apache/2.2.22 (Ubuntu)
X-Powered-By: PHP/5.3.14 ZendServer/5.0
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Content-Disposition: inline; filename="logo"
ETag: "1355829295"
Last-Modified: Tue, 18 Dec 2012 14:44:55 Asia/Tehran
Keep-Alive: timeout=5, max=98
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: image/png
When I reload the URL, the exact same headers are sent and received. My question is what should I send in my response to see the If-None-Match header in the consequent request?
NOTE: I believe these headers were doing just fine not long ago, even though I can not be sure but I think browsers are changed not to sent the If-None-Match header anymore (I used to see that header). I'm testing with Chrome and Firefox and both fail to send the header.
Same Problem, Similar Solution
I've been trying to determine why Google Chrome won't send If-None-Match headers when visiting a site that I am developing. (Chrome 46.0.2490.71 m, though I'm not sure how relevant the version is.)
This is a different — though very similar — answer than the OP ultimately cited (in a comment regarding the Accepted Answer), but it addresses the same problem:
The browser does not send the If-None-Match header in subsequent requests "when it should" (i.e., server-side logic, via PHP or similar, has been used to send an ETag or Last-Modified header in the first response).
Prerequisites
Using a self-signed TLS certificate, which turns the lock red in Chrome, changes Chrome's caching behavior. Before attempting to troubleshoot an issue of this nature, install the self-signed certificate in the effective Trusted Root Store, and completely restart the browser, as explained at https://stackoverflow.com/a/19102293 .
1st Epiphany: If-None-Match requires an ETag from the server, first
I came to realize rather quickly that Chrome (and probably most or all other browsers) won't send an If-None-Match header until the server has already sent an ETag header in response to a previous request. Logically, this makes perfect sense; after all, how could Chrome send If-None-Match when it's never been given the value?
This lead me to look at my server-side logic — particularly, how headers are sent when I want the user-agent to cache the response — in an effort to determine for what reason the ETag header is not being sent in response to Chrome's very first request for the resource. I had made a calculated effort to include the ETag header in my application logic.
I happen to be using PHP, so #Mehran's (the OP's) comment jumped-out at me (he/she says that calling header_remove() before sending the desired cache-related headers solves the problem).
Candidly, I was skeptical about this solution, because a) I was pretty sure that PHP wouldn't send any headers of its own by default (and it doesn't, given my configuration); and b) when I called var_dump(headers_list()); just before setting my custom caching headers in PHP, the only header set was one that I was setting intentionally just above:
header('Content-type: application/javascript; charset=utf-8');
So, having nothing to lose, I tried calling header_remove(); just before sending my custom headers. And much to my surprise, PHP began sending the ETag header all of a sudden!
2nd Epiphany: gzipping the response changes its hash
It then me hit me like a bag of bricks: by specifying the Content-type header in PHP, I was telling NGINX (the webserver I'm using) to GZIP the response once PHP hands it back to NGINX! To be clear, the Content-type that I was specifying was on NGINX's list of types to gzip.
For thoroughness, my NGINX GZIP settings are as follows, and PHP is wired-up to NGINX via php-fpm:
gzip on;
gzip_min_length 1;
gzip_proxied expired no-cache no-store private auth;
gzip_types text/plain text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript application/javascript image/svg+xml;
gzip_vary on;
I pondered why NGINX might remove the ETag that I had sent in PHP when a "gzippable" content-type is specified, and came up with a now-obvious answer: because NGINX modifies the response body that PHP passes back when NGINX gzips it! This makes perfect sense; there is no point in sending the ETag when it's not going to match the response used to generate it. It's pretty slick that NGINX handles this scenario so intelligently.
I don't know if NGINX has always been smart enough not to compress response bodies that are uncompressed but contain ETag headers, but that seems to be what's happening here.
UPDATE: I found commentary that explains NGINX's behavior in this regard, which in turn cites two valuable discussions regarding this subject:
NGINX forum thread discussing the behavior.
Tangentially-related discussion in a project repository; see the comment Posted on Jun 15, 2013 by Massive Bird.
In the interest of preserving this valuable explanation, should it happen to disappear, I quote from Massive Bird's contribution to the discussion:
Nginx strips the Etag when gzipping a response on the fly. This is
according to spec, as the non-gzipped response is not byte-for-byte
comparable to the gzipped response.
However, NGINX's behavior in this regard might be considered slightly flawed in that the same spec
... also says there is a thing called weak Etags (an Etag value
prefixed with W/), and tells us it can be used to check if a response
is semantically equivalent. In that case, Nginx should not mess with
it. Unfortunately, that check never made it into the source tree [citation is now filled with spam, sadly]."
I'm unsure as to NGINX's current disposition in this regard, and specifically, whether or not it has added support for "weak" Etags.
So, What's the Solution?
So, what's the solution to getting ETag back into the response? Do the gzipping in PHP, so that NGINX sees that the response is already compressed, and simply passes it along while leaving the ETag header intact:
ob_start('ob_gzhandler');
Once I added this call prior to sending the headers and the response body, PHP began sending the ETag value with every response. Yes!
Other Lessons Learned
Here are some interesting tidbits gleaned from my research. This information is rather handy when attempting to test a server-side caching implementation, whether in PHP or another language.
Chrome, and its Developer Tools "Net" panel, behave differently depending on how the request is initiated.
If the request is "made fresh", e.g., by pressing Ctrl+F5, Chrome sends these headers:
Cache-Control: no-cache
Pragma: no-cache
and the server responds 200 OK.
If the request is made with only F5, Chrome sends these headers:
Pragma: no-cache
and the server responds 304 Not Modified.
Lastly, if the request is made by clicking on a link to the page you're already viewing, or placing focus into Chrome's address bar and pressing Enter, Chrome sends these headers:
Cache-Control: no-cache
Pragma: no-cache
and the server responds 200 OK (from cache).
While this behavior a bit confusing at first, if you don't know how it works, it's ideal behavior, because it allows one to test every possible request/response scenario very thoroughly.
Perhaps most confusing is that Chrome automatically inserts the Cache-Control: no-cache and Pragma: no-cache headers in the outgoing request when in fact Chrome is acquiring the responses from its cache (as evidenced in the 200 OK (from cache) response).
This experience was rather informative for me, and I hope others find this analysis of value in the future.
Your response headers include Cache-Control: no-store, no-cache; these prevent caching.
Remove those values (I think must-revalidate, post-check=0, pre-check=0 could/should be kept – they tell the browser to check with the server if there was a change).
And I would stick with Last-Modified alone (if the changes to your resources can be detected using this criterion alone) – ETag is a more complicated thing to handle (especially if you want to deal with it in your PHP script yourself), and Google PageSpeed/YSlow advise against this one too.
Posting this for future me...
I was having a similar problem, I was sending ETag in the response, but the HTTP client wasn't sending a If-None-Match header in subsequent requests (which was strange because it was the day before).
Turns out I was using http://localhost:9000 for development (which didn't use If-None-Match) - by switching to http://127.0.0.1:9000 Chrome1 automatically started sending the If-None-Match header in requests again.
Additionally - ensure Devtools > Network > Disable Cache [ ] is unchecked.
Chrome: Version 71.0.3578.98 (Official Build) (64-bit)
1 I can't find anywhere this is documented - I'm assuming Chrome was responsible for this logic.
Similar problem
I was trying to obtain Conditional GET request with If-None-Match header, having supplied proper Etag header, but to no avail in any browser I tried.
After a lot of trial I realize that browsers treat both GET and POST to the same path as a same cache candidate. Thus having GET with proper Etag was effectively canceled with immediate "POST" to the same path with Cache-Control:"no-cache, private", even though it was supplied by X-Requested-With:"XMLHttpRequest".
Hope this might be helpful to someone.
This was happening to me because I had set the cache size to be too small (via Group Policy).
It didn't happen in Incognito, which is what made me realize this might be the case.
Fixing that resolved the issue.
This happened to me due to 2 reasons:
My server didn't send etag response header. I updated my jetty web.xml to return etag by adding the following:
<init-param>
<param-name>etags</param-name>
<param-value>true</param-value>
</init-param>
The URL I called was for xml file. When I changed it to html file, chrome started sending "if-none-match" header!
I hope it helps someone

How to get mod_python site to allow clients to cache selected image content?

I have a small dynamic site implemented in mod_python. I inherited this, and while I have successfully made relatively minor changes to its content and logic, with HTTP caching I'm out of my depth. The site works fine already, so this isn't "the usual question" about how to disable caching for a dynamic site.
My problem is there is one large banner image on each page (the same image from same URL on each page) which accounts for ~90% of site bandwidth but which so far as I can tell isn't being cached; as I browse the site every time I click to a new page (or back to a previously visited one) there it is downloading it yet again.
If I wget the banner's image URL (to see the headers) I see:
$ wget -S http://example.com/site/gif?name=banner.gif
--2012-04-04 23:02:38-- http://example.com/site/gif?name=banner.gif
Resolving example.com... 127.0.0.1
Connecting to example.com|127.0.0.1|:80... connected.
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Date: Wed, 04 Apr 2012 22:02:38 GMT
Server: Apache/2.2.14 (Ubuntu)
Content-Location: gif.py
Vary: negotiate
TCN: choice
Set-Cookie: <blah blah blah>
Connection: close
Content-Type: image/gif
Length: unspecified [image/gif]
Saving to: `gif?name=banner.gif'
and the code which is serving it up isn't much more than
req.content_type = 'image/gif'
req.sendfile(fullname)
where fullname is a file-path munged from the request's name parameter.
My question is: is there some quick fix along the lines of setting an Expires: or Vary: field in the image's request response which will result in clients being less keen to repeatedly download it ?
The site is hosted on Ubuntu 10.04 and doesn't have any non-default apache mods enabled other than rewrite.
I note that most (not all) of the site pages' headers themselves do contain
Pragma: no-cache
Cache-Control: no-cache
Expires: -1
Vary: Accept-Encoding
(and the original site author has clearly thought about this as no-cache is applied selectively to non-static content pages). I don't know enough about caching to know whether this somehow poisons the included .gif IMG into being reloaded every time too though.
I don't know if my answer can help you or not, but I post it anyway.
Instead of serving image files from within python application, you can create another virtualhost within apache (on same server) just to serve static and image file. In your python application, you can embed image likes this
<img src="http://img.yoursite.com/banner.gif" alt="banner" />
With separate virtualhost, you can add various header to various content type using mode header, or add another caching layer for your static file.
Hope this help.

How to specify the character set in the HTTP Content-Type response header?

I had my site tested with the Page Speed app from Google and one of the suggestions was to specify the character set in the HTTP Content-Type response header claiming it was better than just in a meta tag.
Here's what I understand I need to write:
Content-Type: text/html; charset=UTF-8
..but where exactly should I put this? I'm on a shared server.
Thank you!
Apache: add to your .htaccess file in root directory:
AddDefaultCharset UTF-8
It will modify the header from this:
Content-Type text/html
...to this:
Content-Type text/html; charset=UTF-8
nginx [doc] [serverfault Q]
server {
# other server config...
charset utf-8;
}
add charset utf-8; to server block (and reload nginx config)
When i added this, my response header looked like this:
HTTP/1.1 200 OK
Content-Type: text/html,text/html;charset='UTF-8'
Vary: Accept-Encoding
Server: Microsoft-IIS/7.5
With Apache, you use http://httpd.apache.org/docs/2.2/mod/core.html#adddefaultcharset
With IIS you edit the MIME type for the filetype in the list of files.
With most server-side technologies like PHP or ASP.NET there's a method or property provided by that technology. For example in ASP.NET you can set it in config, page, or page code-behind.