I'm running an Amazon EC2 instance with Ubuntu 14.04 and Apache2 with no PHP or other server-side scripts, just a static content site. I used this tutorial to get to the point I am at now with the Apache file (see screenshot at link below):
https://www.digitalocean.com/community/tutorials/how-to-configure-apache-content-caching-on-ubuntu-14-04
I want to have a directive (if that is the nomenclature) that tells Apache to not cache a single, specific file only, but still handle everything else as it is already configured. I'm no computer whizz here--just learning. Is there a way to do this? Currently I have made a new directory inside my images folder called "no-cache" where the image I do not want cached lives.
I tried adding a second location tag below the first one with "CacheDisable on" inside it, however this is not supported. Also tried using a Directory tag, but this also does not work with the current configuration.
Thanks in advance!
/etc/apache2/sites-enabled/000.default.conf
The link you provided is a bit confusing since it mentions so many different types of caching.
When dealing with web servers and caching, what you usually mean is sending caching instructions (using HTTP headers) that tell the browser how to handle caching, which improves performance for visitors. This is the last item discussed in that link of yours, despite being the most common. The first section talks about Apache caching files itself to improve its own performance, which is much less common.
If client-side caching using mod_expires is what you mean, then you can control this with Location sections:
#Allow long term assets to be cached for 6 months as they are versioned and should never change
<Location /assets/libraries/ >
ExpiresDefault A15724800
Header set Cache-Control "public"
</Location>
#Do not cache these files
<Location /login >
Header set Cache-Control "max-age=0, no-cache, no-store, must-revalidate"
Header set Pragma "no-cache"
</Location>
I've a more detailed blog on this here: https://www.tunetheweb.com/performance/http-performance-headers/caching/.
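Applied to the question above, a minimal sketch might look like the following. It assumes the "no-cache" directory is served at the URL path /images/no-cache/ (that path is a guess based on the question, so adjust it), and that mod_headers, mod_expires and, for the last part, mod_cache are the modules enabled by the tutorial:

```apache
# Assumption: the "no-cache" directory is reachable at /images/no-cache/
<Location "/images/no-cache/">
    <IfModule mod_headers.c>
        # Ask browsers and proxies not to store or reuse this content
        Header set Cache-Control "max-age=0, no-cache, no-store, must-revalidate"
        Header set Pragma "no-cache"
    </IfModule>
    <IfModule mod_expires.c>
        # Stop mod_expires from adding its own Expires/Cache-Control here
        ExpiresActive Off
    </IfModule>
</Location>

# If the server-side cache (mod_cache) should also skip this path,
# CacheDisable accepts a URL prefix at virtual-host level.
<IfModule mod_cache.c>
    CacheDisable "/images/no-cache"
</IfModule>
```

Everything outside that path keeps the caching behaviour that is already configured.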
I am trying to make sure that visitors to my website see the latest version. To this end I have written a script to rename appropriate files and requests so that they append a fresh version number at build time. This includes the index file, let's call it index-v123.html.
I have uploaded this built source and pointed my apache2 server to the new index file by including
DirectoryIndex index-v123.html
in my apache2.conf. I have restarted it, and when viewing the website in chrome incognito mode or on hard refresh I can see that all the new files are loaded and the website works as expected.
My issue is that in my normal browser, when I visit the URL, I still load up a cached version of index.html. Clearly changing the DirectoryIndex didn't convince the client to go to the new index file like I'd hoped...
So can I do anything to make this happen?
(Also may be relevant: I am running a progressive web app using Polymer 2.0, with a service-worker.js that is built automatically by polymer build.)
This turned out to be a service worker issue: service-worker.js was being cached on the client side, and hence was serving outdated content as if the client were offline. It could only be updated by unregistering the worker. The solution was to set max-age=0 on the service worker script on the Apache side:
# <Files> does not treat "|" as alternation, so use the regex form
<FilesMatch "^(index\.html|service-worker\.js)$">
FileETag None
Header unset ETag
Header set Cache-Control "max-age=0, no-cache, no-store, must-revalidate"
Header set Pragma "no-cache"
Header set Expires "Wed, 11 Jan 1984 05:00:00 GMT"
</FilesMatch>
I was surprised this wasn't better highlighted in the Polymer build/production docs. For reference, the Google primer on service workers says:
The caching headers on the service worker script are respected (up to
24 hours) when fetching updates. We're going to make this opt-in
behaviour, as it catches people out. You probably want a max-age of 0
on your service worker script.
I have a specific question regarding satisfying the Pingdom Tools criteria of "Serve the following static resources from a domain that doesn't set cookies".
My situation is quite specific and I've googled extensively, followed multiple guides and even went as far as asking my hosting provider, but didn't get help there (as I expected).
This is my scenario:
I have a website called "www.wheretofarminwow.com" and I've run the speed test at tools.pingdom.com and I get the "Serve the following static resources from a domain that doesn't set cookies" message.
I've moved the static files from wheretofarminwow.com to another site wtfiwstatic.com
I assumed that the domain "www.wtfiwstatic.com" would be cookieless because it doesn't do anything but host those static files, but I keep getting that "Serve the following static resources from a domain that doesn't set cookies" message from Pingdom.
I've even added an .htaccess rule on that domain to try to not set any cookies:
<IfModule mod_headers.c>
<FilesMatch "\.(ttf|ttc|otf|eot|woff|woff2|font.css|css|js|jpg|jpeg|png|gif)$">
Header set Access-Control-Allow-Origin "*"
Header append Vary: Accept-Encoding
RequestHeader unset Cookie
Header unset Cookie
Header unset Set-Cookie
</FilesMatch>
</IfModule>
I'm hoping the bright community at Stack Overflow can give me some insight into my situation and perhaps guide me toward my goal.
Thanks in advance,
Johnny
Ok, I think I figured it out.
It turns out that, because I was using Cloudflare on wtfiwstatic.com, Cloudflare sets a cookie on every request for security reasons, and this can't be disabled. Hope this helps someone in a similar situation.
Sources:
https://www.keycdn.com/support/how-to-use-cookie-free-domains/
https://support.cloudflare.com/hc/en-us/articles/200169816-Can-I-serve-a-cookieless-domain-or-subdomain-through-Cloudflare-
I am working on a website with PHP on the backend and AngularJS on the frontend, served via Apache 2.4.
My problem is that when I update my website to a new version, some users cannot see the latest modifications. I added this .htaccess to force the cache to be cleared every hour, but it doesn't work as I expected.
FileETag None
<ifModule mod_headers.c>
Header unset ETag
Header set Cache-Control "max-age=3600, must-revalidate, private"
</ifModule>
Could you give me the right cache configuration to force the browsers to get the last update whenever a new version is available?
Within your build process, you could append a query parameter to the URLs of your static files such as JS / CSS, for example app.js?1476109496 (where the number is a unique reference such as the deployment epoch, a commit hash or similar). The changed URL causes browsers to request a new version without you needing to mess with your .htaccess.
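If you do still want to tune the headers, versioned URLs pair naturally with a split policy. The fragment below is only a sketch under assumptions that are not in the answer: that the versioned assets are .js and .css files, that the page referencing them is a static .html file, and that mod_headers is enabled. The one-year max-age is an illustrative value.

```apache
<IfModule mod_headers.c>
    # Versioned assets (e.g. app.js?1476109496): a new deployment changes
    # the URL, so the old copy can safely be cached for a long time.
    <FilesMatch "\.(js|css)$">
        Header set Cache-Control "public, max-age=31536000"
    </FilesMatch>
    # The HTML that references those assets: revalidate on every visit
    # so that the new asset URLs are picked up promptly.
    <FilesMatch "\.html$">
        Header set Cache-Control "no-cache"
    </FilesMatch>
</IfModule>
```

The design choice here is that only the small HTML document is re-checked on each visit, while the heavier assets are fetched again only when their URL changes.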
We have been attempting to configure our server not to cache our .htm files as it is causing a few issues with our analytics package as well as not displaying the pages correctly if the visitor hits the back button in their browser.
We have attempted to tackle it by adding:
<FilesMatch "\.(htm)$">
Header set Cache-Control "max-age=0, no-cache, no-store, must-revalidate"
Header set Pragma "no-cache"
Header set Expires "Wed, 11 Jan 1984 05:00:00 GMT"
Header set Warning "Testing"
</FilesMatch>
to our httpd file, but it does not appear to execute. However, when we move the Header set lines outside of the FilesMatch, they appear to execute fine.
Anyone have any ideas where we are going wrong?
I recently needed to figure out the same kind of problem and, although this post pointed me in the right direction, I wanted to share some clarifying information for the edification of those who search on this topic in the future.
David, your initial FilesMatch was not working because FilesMatch only works on real, physical files that exist on your filesystem. http://httpd.apache.org/docs/current/sections.html states it as:
The Directory and Files directives, along with their regex counterparts, apply directives to parts of the filesystem.
This is also why your second post using LocationMatch resolved the issue. Also from http://httpd.apache.org/docs/current/sections.html, it states:
The Location directive and its regex counterpart, on the other hand, change the configuration for content in the webspace. < SNIP > The directive need not have anything to do with the filesystem. For example, the following example shows how to map a particular URL to an internal Apache HTTP Server handler provided by mod_status. No file called server-status needs to exist in the filesystem.
<Location /server-status>
SetHandler server-status
</Location>
The Apache docs summarize this behavior with the following statement:
Use Location to apply directives to content that lives outside the filesystem. For content that lives in the filesystem, use Directory and Files. An exception is <Location />, which is an easy way to apply a configuration to the entire server.
For those that want to understand more of the mechanics, this is how I understand the internals:
Location directives match based on the HTTP request URI (e.g. example.com/this/is/a/uri.htm without the example.com part).
Directory and Files directives, on the other hand, match based on whether there is a directory path or file in the filesystem under the DocumentRoot that corresponds to the respective part of the HTTP request URI.
The Apache docs summarize this as:
What to use When
Choosing between filesystem containers and webspace containers is actually quite easy. When applying directives to objects that reside in the filesystem always use Directory or Files. When applying directives to objects that do not reside in the filesystem (such as a webpage generated from a database), use Location.
[IMPORTANT!] It is important to never use Location when trying to restrict access to objects in the filesystem. This is because many different webspace locations (URLs) could map to the same filesystem location, allowing your restrictions to be circumvented.
This issue has now been resolved.
In order to get it to work we have changed from using FilesMatch to LocationMatch and now the headers are being set perfectly.
We believe this is because the page is being redirected from a JSP page to an HTML page.
<LocationMatch "\.(htm|html)$">
Header set Cache-Control "max-age=0, no-cache, no-store, must-revalidate"
Header set Pragma "no-cache"
Header set Expires "Wed, 11 Jan 1984 05:00:00 GMT"
Header set Warning "Testing"
</LocationMatch>
Hopefully others will find this helpful.
For testing purposes, I have this in my Apache configuration:
<Directory "/home/http">
...
<FilesMatch "\.(html|htm)$">
Header unset Etag
Header set Cache-control "max-age=0, no-cache"
</FilesMatch>
<FilesMatch "\.(jpg|jpeg|gif|png|js|css)$">
Header unset Etag
Header set Cache-control "public, max-age=10"
</FilesMatch>
</Directory>
This basically says to set static assets to have a cache that lasts 10 seconds. Again this is for testing and demonstration purposes.
I test it out by navigating directly to the file
$ wget -O - --save-headers localhost/mod_pagespeed_example/images/Puzzle.jpg
Cache-control: public, max-age=10
which works fine. But then I try to load the page with mod_pagespeed and extend_cache enabled
$ wget -O - --save-headers 'localhost/mod_pagespeed_example/extend_cache.html?ModPagespeed=on&ModPagespeedFilters=extend_cache'
<img src="images/Puzzle.jpg"/>
$ wget -O - --save-headers 'localhost/mod_pagespeed_example/extend_cache.html?ModPagespeed=on&ModPagespeedFilters=extend_cache'
<img src="http://localhost/mod_pagespeed_example/images/xPuzzle.jpg.pagespeed.ic.hgbHsZe0IN.jpg"/>
This is all fine and dandy. The initial request doesn't work because it needs to load the info into the cache, but from there it correctly replaces the src of the img tag with the cached, hashed version.
However, this only persists UNTIL max-age. So, if I have it set to 10 seconds, it will continue to point to http://localhost/mod_pagespeed_example/images/xPuzzle.jpg.pagespeed.ic.hgbHsZe0IN.jpg, but then it will revert to images/Puzzle.jpg again after 10 seconds, at which time it will go back to the cached version.
Is this expected behavior? I would think that pagespeed would check the hash after max-age, and if it's the same it wouldn't bother changing it back to the original value, but instead continue serving the cached file.
This is somewhat concerning. If I set max-age to something more useful, say 60 minutes, that will allow me to continue to update these asset files and assure that my updates are seen in a timely manner. However, if the site is visited once per day by users, then that is more than the max-age and they will always be served the original file rather than the cached version.
This is expected behavior. As you mentioned, the reason is that the resource has expired in cache and so we need to re-check it to make sure it is still the same. We do not want to block the user request while we check all the sub-resources.
Note, one solution to this would be to use ModPagespeedLoadFromFile. This will check the file's last modified time on disk and so can check even if the resource expired in cache.
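As a rough sketch, the directive maps a URL prefix to a directory on disk. The filesystem path below is an assumption for illustration, since the question only shows the URL; substitute the directory that actually backs that URL prefix:

```apache
# Assumption: the example files live under /var/www/mod_pagespeed_example/
# Arguments: URL prefix, then the matching directory on disk
ModPagespeedLoadFromFile "http://localhost/mod_pagespeed_example/" "/var/www/mod_pagespeed_example/"
```

With a mapping like this, resources under that prefix are read straight from disk rather than fetched over HTTP, so the short max-age on the original image should no longer cause the rewritten URL to revert.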