how to invalidate server side caching on apache - apache

I've used apache's content expire times for caching of static assets. The only way I know of to force it to update the content on command is to restart apache. is there a better way?

Caching on our server, as it turns out, was data stored in a file, so we had to clear the cache file's contents, and voila, the cache was cleared.

Related

Apache httpd: cache miss: cache unwilling to store response

I've been trying to configure my Apache Httpd 2.4 server to use mod_cache and mod_cache_disk to do caching for a Wordpress site. It should have a big improvement of performance. I have followed all the guides and done everything right, and yet... no files get saved in the cache.
I enabled logging of cache errors and I see this error coming up: "cache miss: cache unwilling to store response". Ok... I looked in the source code of mod_cache.c and it looks like that happens when cache_create_entity() returns anything other than OK. So I looked at the souce of mod_ccache_disk.c and I can see that in create_entity(), it silently fails if conf->cache_root == NULL. In all other cases, it would log an error or return OK.
I can only assume that it's failing because conf->cache_root is null. Where does that come from? It would come from ap_get_module_config to get the config of the cache_disk_module.
How can that possibly be returning null? And more annoyingly, why does the Apache server give some log message when there's a major error condition, such as loading a module but not having a config object for it?
Really stumped on this one... Thank you.
Problem solved: SELinux was blocking files from being created in the CacheRoot I specified. I had to put it in the right directory that had permissions, or I could have added permission to the directory.

Does an apache restart reliably clear pagespeed cache?

I'm currently developing a website that's getting fairly frequent javascript updates and have just started using mod_pagespeed in an effort to ensure that customers will always have the latest code.
The docs tell me doing this will clear my pagespeed cache and force clients to get my new javascript/css:
sudo touch /var/cache/pagespeed/cache.flush
I did a test by changing some javascript code, hitting refresh on my browser to verify that I was still seeing the old code (my cache expiration is set to one day), then restarting apache, and I can indeed see my new changes.
Can I trust that a restart will always be sufficient, and that a cache.flush is not needed, or do I need to run the flush command as well? I'm reading that a restart of apache is required to clear the memory cache, but not how the file cache and/or cache.flush fits in with that.
Update:
I pulled the pagespeed code, and if I'm understanding correctly, the cache.flush process updates a timestamp.
It looks like that's happening in RewriteOptions::UpdateCacheInvalidationTimestampMs here:
http://modpagespeed.googlecode.com/svn/trunk/src/net/instaweb/rewriter/rewrite_options.cc
If I could figure out which timestamp this was updating, it seems like I could either check it/restart apache/check it again (to see if the timestamp changed) or deduce from the filename/location/who owns it somehow whether or not that's likely to happen.
Any more thoughts on this? Advice on how to figure out which timestamp is being updated? Other reasoning to make me feel better about either manually doing the extra flush command every time I update (when I'm already restarting apache for other reasons) or leaving it out?
touch the cache.flush file:
sudo touch /var/cache/mod_pagespeed/cache.flush
Reference: https://developers.google.com/speed/pagespeed/module/system#flush_cache
No restart of Apache doesnt clear the pagespeed cache. You have to do it manually by using cache.flush.
What I like to do to ensure that the whole cache on the entire web portion of the server
Apache2, this is a dry run, remove "-D" if you are sure you want to go through with it -l is for size of memory -p for path:
htcacheclean -D -p/var/cache/apache2 -l100M
mod_pagespeed:
sudo touch /var/cache/mod_pagespeed/cache.flush
A restart of Apache should flush the cache.

Varnish Cache - Initial cache of web pages

I have installed the Varnish cache with my Apache web server and configured them correctly. It works OK and I can now access my web pages though Varnish Cache.
The default behavior of varnish is to store copies of the pages served by the web server. The next time the same page is requested, Varnish will serve the copy instead of requesting the page from the Apache server.
And now comes my question: Is it possible to cache my entire website initially after setting up the Varnish cache, without the need to have a page to be accessed then store it on the cache? This is because, after varnish has been setup, the cache is initially empty, and it will require a page to be accessed in order to be available on the cache. Can this be done without having to access each page manually?
What you are looking for is a way of warming up the cache. You could use varnishreplay or a Web crawler, such as Wget or HTTrack to go through your site. Alternatively if you have a sitemap of your pages you could use that as a starting point and warm up the cache by looping over it and issuing requests on the pages using e.g. curl or wget.
Using varnishreplay requires you to first run varnishlog and gather a log of traffic before you can use it later for playing back the traffic and warming up the cache.
Wget, HTTrack etc. can be pointed to your home page and they will crawl their way through your site. Depending on the size and nature of your site this might not be practical though (for example if you use Ajax extensively).
Unless your pages take a very long time to load from the backend server (i.e. Apache), I wouldn't worry too much about warming up the cache. If the TTL for the cached content is high enough most of the visitors will only ever receive cached content anyway.
There is a much better way to do this which employs req.hash_always_miss and works with Varnish 3 and 4 (employs sitemap too). It warms up your cache and refreshes old pages without having to purge the cache. Full diagram, outline of how to configure it and 3 scripts for various use cases are outlined here http://www.htpcguides.com/smart-warm-up-your-wordpress-varnish-cache-scripts/ and are easily adapted for non-Wordpress sites.

Caching dynamic content in Apache (mod_cache)

I'm trying to understand how dynamic caching is done in Apache.
I read the Caching Guide of Apache and an article about dynamic caching, but still don't understand exactly the internals of how dynamic caching works.
Say for example I have a PHP page that serves content through reading from a database according to parameters in the user's URL query-string (or parameters specified in POST).
e.g. www.mySite.com?articleID=31
How is that cached then?
Does mod_cache keeps the content retrieved from the database for this specific article?
Any sources or suggestions are welcomed.
It caches the output of your script, the HTML. What I'm not sure if CacheStorePrivate On will cache PHP scripts with no cache headers. I'm using Apache 2.4.17 and look like it doesn't.

Is mod_rewrite a valid option for caching dynamic pages with Apache?

I have read about a technique involving writing to disk a rendered dynamic page and using that when it exists using mod_rewrite. I was thinking about cleaning out the cached version every X minutes using a cron job.
I was wondering if this was a viable option or if there were better alternatives that I am not aware of.
(Note that I'm on a shared machine and mod_cache is not an option.)
You could use your cron job to run the scripts and redirect the output to a file.
If you had a php file index.php, all you would have to do is run
php index.php > (location of static file)
You just have to make sure that your script runs the same on command line as it does served by apache.
I would use a cache on application level. Because the application knows best when the cached version is out of date and is more flexible and powerful in the matter of cache negotiation.
Does the page need to be junked every so often because it just has to? Or should it be paralleled with a static version after an update to the page?
If the latter, you could try and write a script that would make a copy of the just edited page and save it to its static filename version. That should lighten the write load since in that scenario you wouldn't need to have a fresh static copy unless there was a change made that needed some show time.