How to stop PageSpeed from setting no-cache on my HTML page? - apache

I am using Apache with PageSpeed. On my index page I want to set a cache time manually from PHP, but the headers get overwritten by PageSpeed because it sees the page as HTML and adds no-cache:
header("Cache-Control: public, max-age=60");
PageSpeed modifies it to:
Cache-Control: public, max-age=60
Cache-Control: max-age=0, no-cache, s-maxage=10
From the downstream caching documentation:
By default PageSpeed serves HTML files with Cache-Control: no-cache,
max-age=0 so that changes to the HTML and its resources are sent fresh
on each request
OK, but is there an easy way to get rid of that no-cache? The method shown in the documentation seems insanely complicated for such a simple issue, and with reverse proxies and such already in place, the infrastructure is complicated enough.
Would Cache-Control: private help?

It looks like ModPagespeedModifyCachingHeaders off does just that; I am not sure why this is not mentioned in the downstream caching documentation.
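For reference, a minimal sketch of the Apache side, assuming mod_pagespeed is loaded through the usual pagespeed.conf:
# pagespeed.conf (assumes mod_pagespeed is already enabled)
<IfModule pagespeed_module>
    ModPagespeedModifyCachingHeaders off
</IfModule>
With the directive off, the Cache-Control header set from PHP passes through to the client unmodified.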

Related

Since adding HTTP security headers, Ahrefs.com produces JS 404s

We're using WP Engine as our website host. I added some Web Rules to produce the following HTTP headers:
X-XSS-Protection: 1; mode=block
X-Content-Type-Options: nosniff
Referrer-Policy: origin-when-cross-origin
Feature-Policy: geolocation 'self'
X-Frame-Options: SAMEORIGIN
Content-Security-Policy: upgrade-insecure-requests
Since making the change above, the next Ahrefs.com Site Audit produced a lot of 404s for some Beaver Builder JS files.
If you load the webpage where that 404 is detected by Ahrefs, then there are no broken resources.
It is only the Ahrefs user agents which are generating the 404 for some Beaver Builder JS files.
Could the HTTP headers added above be producing these 404s by the Ahrefs user agent?
WP Engine say the Ahrefs user agents are not blocked.
Help appreciated.
Update: in the Ahrefs crawl settings, turning on "Execute JavaScript" resolved this issue for us.
Some time ago Ahrefs announced that its crawler and Site Audit tool render web pages and execute JavaScript.
To do this, Ahrefs had to partially implement browser rendering. It did not implement a full-fledged browser; instead, it reused part of some existing "ready-to-use" browser code, and the renderer inherited the partial HTTP header support built into that code.
But the main task was to find links inserted via JavaScript, not to fully support HTTP headers.
Therefore, the implementation handles some headers incorrectly. Which ones can only be found out by experimenting: disable them one by one and watch the results.
These are hardly Referrer-Policy, Feature-Policy, or X-Frame-Options, though; those headers cannot result in a 404 Not Found error.
Most likely these are just errors in the Ahrefs renderer when loading external scripts, and they are unlikely to affect Ahrefs' analysis capabilities or search engine rankings.

Does the must-revalidate cache-control header tell the browser to only download a cached file if it has changed?

If I want browsers to load PDF files from cache until they change on the server, do I have to use max-age=0 and must-revalidate as Cache-Control headers?
If I used a value larger than 0 for max-age, would that mean revalidation only happens once max-age is exceeded?
What would happen if I set only must-revalidate, without max-age?
I was reading through this question and I am not 100% sure.
Also, what exactly does revalidate mean? Does it mean the client asks the server if the file has changed?
On the other hand, I've read that Cache-Control: no-cache pretty much does what I want to achieve: cache, and check with the server if there is a new version... so what's the correct way?
I assume you are asking which headers you should configure your server to send, and that by "client" you mean "modern web browser"? Then the quoted question/answer is correct, so:
Yes, you should set both, but max-age=0 is enough (must-revalidate is the default behavior).
Yes, correct: the response is served from local cache until max-age expires; after that it is revalidated (once), then served from local cache again, and so on.
It is kind of undefined, and differs between browsers and the way you send the request (clicking a link, hitting the reload button, typing the URL into the address bar and hitting enter). Generally, the response should not be served directly from cache; it is either just revalidated, or the full response is requested from the server.
Revalidate means that the client asks the server to send the content only if it has changed since it was last retrieved. For this to work, the server sends one or both of the following in the response to the initial request (see the sketch after this list):
ETag header (typically a hash or version identifier of the content), which the client caches and sends back in the revalidation request as an If-None-Match header, so the server can compare the client's cached ETag value with the current ETag on the server side. If the value did not change, the server responds with 304 Not Modified (and an empty body); if it changed, the server responds with 200 and the full (new) content.
Last-Modified header (the timestamp of the last content modification), which the client sends back in the revalidation request as an If-Modified-Since header; the server uses it to determine the response (304 or 200).
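To make the exchange concrete, here is a minimal sketch of a revalidating handler. It assumes a PHP backend and a made-up file path, so treat it as an illustration of the mechanism rather than anyone's actual setup:
<?php
// Hypothetical file; the path is made up for illustration.
$path = '/var/www/files/report.pdf';
$etag = '"' . md5_file($path) . '"';
$lastModified = gmdate('D, d M Y H:i:s', filemtime($path)) . ' GMT';

// Validators sent with every response, plus the caching policy discussed above.
header('Cache-Control: max-age=0, must-revalidate');
header('ETag: ' . $etag);
header('Last-Modified: ' . $lastModified);

// On revalidation the browser echoes the validators back.
$ifNoneMatch     = $_SERVER['HTTP_IF_NONE_MATCH'] ?? null;
$ifModifiedSince = $_SERVER['HTTP_IF_MODIFIED_SINCE'] ?? null;

if ($ifNoneMatch === $etag || $ifModifiedSince === $lastModified) {
    http_response_code(304);  // Not Modified: empty body, cached copy is reused
    exit;
}

http_response_code(200);      // changed (or first request): send the full content
header('Content-Type: application/pdf');
header('Content-Length: ' . filesize($path));
readfile($path);
The sketch accepts either validator; per RFC 7232, a server that receives If-None-Match should ignore If-Modified-Since.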
Cache-Control: no-cache might achieve the same effect in most (simple) cases. Things get complicated when there are intermediate caches between the client and the server, or when you want to tweak client behavior (for example when sending AJAX requests); that is when most of the other caching directives come into use.

response not getting cached because of faulty headers

One of the requests on my web page retrieves HTML panel data. The request is triggered by JavaScript in the browser. The response received from the server includes caching headers that are confusing. The headers received look like this: Cache-Control: "private, max-age=120no-store"
The expected header here is max-age=120, but we are also getting a no-store directive, which prevents the response from being cached.
Can anyone let me know what we can do to avoid the no-store in the Cache-Control header?
Since max-age and no-store are not separated by any separator, how can we find out what is causing them to appear in this format and fix the issue?
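Since the value reads max-age=120no-store with no comma, the most likely cause is that two directive strings are concatenated somewhere in the backend before the header is sent. A minimal sketch, assuming a PHP backend (the question doesn't say what the server runs), of both the failure mode and a fix:
<?php
// Naive string concatenation of directive fragments produces exactly the
// malformed value from the question:
$existing = 'private, max-age=120';
$broken   = $existing . 'no-store';    // "private, max-age=120no-store"
$fixed    = $existing . ', no-store';  // "private, max-age=120, no-store"

// To drop no-store entirely, send one well-formed header; the second
// argument (true, the default) replaces any Cache-Control set earlier.
header('Cache-Control: private, max-age=120', true);
Grepping the backend and any response filters for the string no-store should locate the code doing the appending.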

phantomjs/casperjs force page caching

I am trying to force PhantomJS to cache in memory a web page (GET) that is sent to us with a "Cache-Control: no-cache, must-revalidate" header.
I've tried to do this by modifying the Cache-Control header in casper.options.onResourceReceived, but it seems the headers are read-only in this callback.
I would appreciate some directions for investigating this problem.
If the server doesn't want a response cached, then there is nothing you can do. PhantomJS is just another browser, so it will follow those instructions.

WebKit not caching when both max-age and Last-Modified headers are provided

I am writing a web-based application with a custom web server, and I'm having problems with the WebKit browsers not caching images, stylesheets, and JavaScript.
I have tracked it down to some relationship between Cache-Control: max-age and Last-Modified. If both are specified, WebKit seems to ignore the max-age header and checks whether the file has been modified EVERY time the resource is used. The site has an iframe on the first page, and this results in stylesheets, etc. being requested twice within a second!
If you remove the Last-Modified header, the files are not re-requested until the next day; however, the requests the next day are no longer If-Modified-Since requests, so the server has to resend everything instead of just a 304 response.
On IE9, Firefox 10.0, and Opera 11.61, the browsers cache correctly and don't re-request the images, only the HTML, which has a Cache-Control: no-cache header.
On Chrome 16.0.912.77 m and Safari 5.1.2 (7534.52.7), a conditional request is made for every image on every page, every time. The server responds with a 304, again containing the max-age attribute, but both browsers keep requesting.
An example HTTP header I'm sending with a response is:
HTTP/1.1 200 OK
Date: Mon, 06 Feb 2012 15:12:12 GMT
Cache-Control: max-age=86400
Content-length: 708
Content-type: image/gif
Last-Modified: Fri, 6 Jan 2012 14:39:07 GMT
Server: Webspring
Does anyone have any suggestions of how I can get these browsers to all respect my cache headers?
All the browsers are running on Win7 Pro x64 and the HTTP header above is the raw output of Fiddler, so that is exactly what the browser is receiving.
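For anyone trying to reproduce this, a rough sketch of a handler that emits the same header combination; it assumes PHP standing in for the custom server, with a made-up image path:
<?php
// Emits Cache-Control: max-age plus a Last-Modified validator, the
// combination that triggers the revalidate-on-every-use behaviour above.
$gif = '/var/www/img/spacer.gif';  // hypothetical path
header('Cache-Control: max-age=86400');
header('Last-Modified: ' . gmdate('D, d M Y H:i:s', filemtime($gif)) . ' GMT');
header('Content-Type: image/gif');
header('Content-Length: ' . filesize($gif));
readfile($gif);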
Note: I had asked a previous question before I discovered this was an interaction between header fields. I have deleted the previous question as it was no longer accurate.
I was having the same problem, but it wasn't WebKit-wide; Safari, and Chrome in incognito, worked fine. I tried disabling all extensions, which made no difference, but clearing the entire cache seemed to fix it.
My guess is that if you add the Cache-Control header to your site for the first time, Chrome doesn't correctly overwrite the old headers in its cache. This poses some problems with getting existing users over to your new cache settings properly.