Cache control: max-age settings - http-headers

Below are the HTTP response headers of my site. I need to know:
what does Cache-Control: max-age=259200 mean?
Could such a high value (259200) prevent Googlebot from indexing my pages? Should I lower it?
The site is an informational blog that publishes articles every day.
HTTP/1.1 200 OK
Server: nginx
Date: Sat, 25 Feb 2017 15:07:53 GMT
Content-Type: text/html; charset=UTF-8
Content-Length: 123783
Connection: keep-alive
X-Powered-By: PHP/7.0.14
X-Pingback: http://www.example.com/xmlrpc.php
Link: <http://www.example.com/wp-json/>; rel="https://api.w.org/", <http://www.example.com/?p=1427>; rel=shortlink
Vary: Accept-Encoding
X-Powered-By: PleskLin
Cache-Control: max-age=259200
Expires: Tue, 28 Feb 2017 15:07:52 GMT

According to https://developer.mozilla.org/ru/docs/Web/HTTP/Headers/Cache-Control
max-age=<seconds>
Specifies the maximum amount of time a resource will be considered fresh. Contrary to Expires, this directive is relative to the time of the request.
In other words, this is the time interval during which any client, such as a browser or a proxy server, may use its cached copy without contacting the server.
I'm not sure exactly how it affects Google. Googlebot might take it into account in some way (but I doubt they blindly trust you). It could be an issue on your main page, because the bot might not come back for 3 days (259200 seconds = 3 days) to see new articles/posts. The same goes for new comments. Still, if Google ignores your site for much longer than that, the problem is not caching but something else.
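To make the arithmetic concrete, here is a minimal Python sketch that reads the freshness lifetime from the headers in the question. The function name freshness_lifetime is made up for illustration, and the parsing is deliberately naive (it assumes no quoted strings containing commas); it is not how any particular browser or crawler implements this.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers):
    """Return how long a response may be reused, or None if nothing is stated."""
    for directive in headers.get("Cache-Control", "").split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            # max-age takes priority over Expires when both are present.
            return timedelta(seconds=int(value))
    # Fall back to Expires minus Date when there is no max-age.
    if "Expires" in headers and "Date" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        date = parsedate_to_datetime(headers["Date"])
        return expires - date
    return None


# Header values copied from the question.
headers = {
    "Date": "Sat, 25 Feb 2017 15:07:53 GMT",
    "Cache-Control": "max-age=259200",
    "Expires": "Tue, 28 Feb 2017 15:07:52 GMT",
}
print(freshness_lifetime(headers))  # 3 days, 0:00:00
```

Both headers in the question describe practically the same 3-day window, so lowering max-age alone would leave the Expires value out of step unless the server recomputes it too.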
You might also consider looking into Google Webmaster Tools. Start at https://support.google.com/webmasters/answer/34397/?hl=en and https://support.google.com/webmasters/answer/6065812/?hl=en

Related

Can browser caching be controlled by HTTP headers alone w/o using hash names for asset files?

I'm reading it in Webpack docs:
The way it works has a pitfall: if we don’t change filenames of our resources when deploying a new version, browser might think it hasn’t been updated and client will get a cached version of it.
I'm curious, is it mandatory to use this mechanism with ugly file names main.55e783391098c2496a8f.js for assets in order to inform a browser that an asset file has changed?
Can it be controlled by HTTP headers only? There are multiple HTTP headers in the standard to control how browser caches assets, like:
Cache-Control: no-cache, no-store, must-revalidate
Pragma: no-cache
Expires: 0
Date: Wed, 24 Aug 2020 18:32:02 GMT
Last-Modified: Tue, 15 Nov 2024 12:45:26 GMT
ETag: x234dff
max-age: 12345
So can I use those headers alone? Or do I still have to bother about hash parts in file names main.55e783391098c2496a8f.js?
When a user agent opens a page, it must always get the correct version of the source code. You have two options to achieve this:
Set the Cache-Control and Expires response headers together with a strong validator (ETag). This instructs the user agent to perform a relatively lightweight conditional request on each page load.
Embed a version in the source file URL and set the Cache-Control and Expires response headers. This instructs the user agent to cache the source code for that particular version forever.
For more information check HTTP Caching article by Ilya Grigorik, HTTP conditional requests MDN page and this StackOverflow answer about resource revalidation.
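A rough Python sketch of both options, assuming the version is derived from a SHA-256 hash of the file content. The helper names hashed_name and respond, and the 20-character digest length, are made up for illustration; Webpack and real servers have their own implementations.

```python
import hashlib


def hashed_name(filename, content):
    """Option 2: embed a content hash in the file name, e.g. main.<hash>.js."""
    digest = hashlib.sha256(content).hexdigest()[:20]
    stem, dot, ext = filename.rpartition(".")
    return f"{stem}.{digest}.{ext}" if dot else f"{filename}.{digest}"


def respond(content, if_none_match=None):
    """Option 1: answer a (possibly conditional) request using a strong ETag."""
    etag = '"' + hashlib.sha256(content).hexdigest()[:20] + '"'
    # no-cache lets the browser store the file but makes it revalidate each time.
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if if_none_match == etag:
        return 304, headers, b""  # unchanged: headers only, no body
    return 200, headers, content


old, new = b"console.log(1);", b"console.log(2);"
print(hashed_name("main.js", old) != hashed_name("main.js", new))  # True

status, headers, _ = respond(old)
print(status, respond(old, headers["ETag"])[0], respond(new, headers["ETag"])[0])
# 200 304 200
```

With option 2 the hashed URL can be served with a very long max-age, because any change to the file produces a new URL. With option 1 the URL stays pretty, at the cost of one small round trip per asset.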

How to remove HTTP Server "Apache"?

For security reasons, I need to remove (yes, really: remove, delete or hide) the Apache signature.
I use the ServerSignature and ServerTokens directives, but they only hide the version...
ServerSignature Off
ServerTokens Prod
The result is:
Name :Value
Date :Mon, 15 Jun 2015 11:47:28 GMT
Content-Encoding :gzip
Last-Modified :Sun, 14 Jun 2015 00:01:37 GMT
Server :Apache
ETag :"6176c-28f4-5186f0b8c3bb0"
Vary :Accept-Encoding,User-Agent
Content-Type :text/xml; charset=utf-8
Cache-Control :max-age=1
Accept-Ranges :bytes
Content-Length :1531
Expires :Mon, 15 Jun 2015 11:47:29 GMT
Look at this line:
Server Apache
What I need (without the HTTP header "Server: Apache"):
Name Value
Date Mon, 15 Jun 2015 11:47:28 GMT
Content-Encoding gzip
Last-Modified Sun, 14 Jun 2015 00:01:37 GMT
ETag "6176c-28f4-5186f0b8c3bb0"
Vary Accept-Encoding,User-Agent
Content-Type text/xml; charset=utf-8
Cache-Control max-age=1
Accept-Ranges bytes
Content-Length 1531
Expires Mon, 15 Jun 2015 11:47:29 GMT
Thanks!
I am very sorry, Apache team, but this time I can't show your signature.
The core distribution doesn't allow it to be removed. It's trivial to do in a plugin. mod_security allows you to configure it to be stripped.
This should be a comment, but it's a bit long....
For security reason, i need remove...Apache signature - even from the data other than the Server header it is blatantly obvious that this is an Apache server (or something doing a very good impression of one).
As per discussion on security.stackexchange I do not believe that there is any security benefit in removing banners from your software. In addition to the information in your headers I could also determine this from the default error messages, how the server handles content negotiation, conditional requests, .... every time I look at a related issue, the list gets longer.
I've yet to see any evidence that disabling banners has any impact on a site's security (as opposed to allowing an auditor to tick a box in a checklist). But if anyone can provide a reference, I would be very interested to hear about it.

Cache-control max-age meta tag not registering

I've put this in my head section. It appears in the page source in the browser.
<meta http-equiv="Cache-Control" content="max-age=1209600">
However, when I look in the Chrome extension Live HTTP Headers, it says the following.
Cache-Control: max-age=0
Content-Encoding: gzip
Content-Length: 5849
Content-Type: text/html; charset=utf-8
Date: Sat, 05 Apr 2014 04:29:16 GMT
Expires: Sat, 05 Apr 2014 04:29:16 GMT
Last-Modified: Sat, 05 Apr 2014 03:33:19 GMT
The max-age isn't registering. I've emptied the browser cache but it makes no difference.
Any explanations? This is the site, incidentally.
UPDATES:
Firebug similarly records Cache-Control: max-age=0.
Google also makes clear here that max-age overrides the Expires header (which I don't set) and that you don't need both.
When you use tools like Live HTTP Headers, they show you the actual HTTP headers sent by the server. What browsers do with meta tags that are meant to simulate HTTP headers is a different issue. We can expect any conflict to be resolved in favor of the actual headers. (This has been normatively specified in HTML specs for Content-Type headers.)
To control caching, you should (at least primarily) use server configuration. See Caching Tutorial for Web Authors and Webmasters.
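As an illustration of setting the value on the server side, here is a minimal sketch using Python's standard WSGI support. It assumes the page is generated by an application you control; the names app and serve and port 8000 are placeholders, and on a typical host you would set the header in the web server configuration instead.

```python
from wsgiref.simple_server import make_server


def app(environ, start_response):
    """Serve one page with a real Cache-Control response header."""
    body = b"<!doctype html><title>demo</title><p>Hello</p>"
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
        # Same value as the meta tag in the question, but sent as an HTTP header.
        ("Cache-Control", "max-age=1209600"),
    ]
    start_response("200 OK", headers)
    return [body]


def serve(port=8000):
    """Run the demo locally; call serve() and inspect the headers in the browser."""
    with make_server("127.0.0.1", port, app) as server:
        server.serve_forever()
```

A header tool pointed at this server would report Cache-Control: max-age=1209600, because the value now travels in the response itself rather than in the markup.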

Analysis of HTTP header

Hello. I want to first analyze and understand, and then optimize, the HTTP response headers of my site. This is what I get when I use Fetch as Google in Webmaster Tools:
HTTP/1.1 200 OK
Date: Fri, 26 Oct 2012 17:34:36 GMT // The date and time that the message was sent
Server: Apache // A name for the server
P3P: CP="NOI ADM DEV PSAi COM NAV OUR OTRo STP IND DEM" // P3P: does an e-commerce store need this?
ETag: c4241ffd9627342f5f6f8a4af8cc22ed // Identifies a specific version of a resource
Content-Encoding: gzip // The type of encoding used on the data
X-Content-Encoded-By: Joomla! 1.5 // This is obviously generated by Joomla; there won't be any issue if I just remove it, right?
Expires: Mon, 1 Jan 2001 00:00:00 GMT // Gives the date/time after which the response is considered stale. Since this date is already in the past, does it create any conflicts?
Cache-Control: post-check=0, pre-check=0 // Does this mean the site is not cached?
Pragma: no-cache // Any idea?
Set-Cookie: 5d962cb89e7c3329f024e48072fcb9fe=9qdp2q2fk3hdddqev02a9vpqt0; path=/ // Why do I need to set cookie for any page?
Last-Modified: Fri, 26 Oct 2012 17:34:37 GMT
X-Powered-By: PleskLin // Can this be removed?
Cache-Control: max-age=0, must-revalidate // There are two Cache-Control headers; this needs to be fixed, right? Which one takes precedence: max-age=0, must-revalidate or post-check=0, pre-check=0?
Keep-Alive: timeout=3, max=100 // What's that?
Connection: Keep-Alive
Transfer-Encoding: chunked // Shouldn't this be deflate or gzip?
Content-Type: text/html
post-check
Defines an interval in seconds after which an entity must be checked for freshness. The check may happen after the user is shown the resource but ensures that on the next roundtrip the cached copy will be up-to-date.
http://www.rdlt.com/cache-control-post-check-pre-check.html
pre-check
Defines an interval in seconds after which an entity must be checked for freshness prior to showing the user the resource.
Pragma: The Pragma: no-cache header field is an HTTP/1.0 header intended for use in requests. It is a means for the browser to tell the server and any intermediate caches that it wants a fresh version of the resource, not for the server to tell the browser not to cache the resource. Some user agents do pay attention to this header in responses, but the HTTP/1.1 RFC specifically warns against relying on this behavior.
Set-Cookie: When the user browses the same website in the future, the data stored in the cookie can be retrieved by the website to notify it of the user's previous activity. Cookies were designed to be a reliable mechanism for websites to remember state or the activity the user had taken in the past. This can include clicking particular buttons, logging in, or a record of which pages were visited by the user even months or years ago.
X-Powered-By: Specifies the technology (e.g. ASP.NET, PHP, JBoss) supporting the web application. This is one of the common non-standard response headers and can be removed.
Keep-Alive: Meant to reduce the number of connections for a website. Instead of creating a new connection for each image/CSS/JavaScript file in a web page, many requests are made over the same connection. Here timeout=3 and max=100 indicate that an idle connection is kept open for about 3 seconds and may serve up to 100 requests.
Transfer-Encoding: The form of encoding used to safely transfer the entity to the user. Currently defined methods are: chunked, compress, deflate, gzip, identity. Compression of the page itself is already declared by Content-Encoding: gzip; chunked only describes how the body is framed on the wire, so it does not need to be changed.
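On the question about the two Cache-Control lines: HTTP treats repeated header fields of this kind as one comma-separated list, so neither line overrides the other and all four directives apply. A cache that does not recognise post-check and pre-check simply ignores them. A small Python sketch of that merging (the function name merge_cache_control is made up, and the comma split is naive, assuming no quoted values containing commas):

```python
def merge_cache_control(header_lines):
    """Combine every Cache-Control line into one dict of directives."""
    directives = {}
    for name, value in header_lines:
        if name.lower() != "cache-control":
            continue
        for part in value.split(","):
            key, _, arg = part.strip().partition("=")
            if key:
                # Directives without an argument (e.g. must-revalidate) map to None.
                directives[key.lower()] = arg.strip('"') or None
    return directives


# The relevant lines from the response in the question.
lines = [
    ("Cache-Control", "post-check=0, pre-check=0"),
    ("Pragma", "no-cache"),
    ("Cache-Control", "max-age=0, must-revalidate"),
]
print(merge_cache_control(lines))
# {'post-check': '0', 'pre-check': '0', 'max-age': '0', 'must-revalidate': None}
```

The merged result still contains max-age=0 and must-revalidate, so the page is revalidated on every use regardless of the first line.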

Keep the assets fresh in browser and cancel the freshness check request of the cache [for rails 3.1 app on heroku]

I have a lot of small images (~3 KB each) and a lot of CSS and JS files. After the first request they are cached in the browser, but when I reload the page the browser checks the freshness of the cached content (by sending If-Modified-Since etc.) and gets a 304 Not Modified response. Each of these validation requests seriously increases the page load time (say 20 times 300ms).
How can I stop the browser from making this freshness check against the server? How can I instruct the browser to use locally cached files/images for a certain time (say 1 hour) without revalidating them against the remote server on every reload within that period?
Header details for a sample small image fetch are below [using Rails 3.1, on Heroku]:
Response Headers
HTTP/1.1 304 Not Modified
Server: nginx/0.7.67
Date: Thu, 10 Nov 2011 17:53:33 GMT
Connection: keep-alive
Via: 1.1 varnish
X-Varnish: 1968827848
Last-Modified: Tue, 08 Nov 2011 07:36:04 GMT
Cache-Control: public, max-age=31536000
Etag: "5bda917d22f8a144c293f3f19723dbc6"
Request Headers
GET /assets/icons/flash_close_button-5bda917d22f8a144c293f3f19723dbc6.png HTTP/1.1
Host: ???.heroku.com
User-Agent: Mozilla/5.0 (X11; Linux i686; rv:6.0.1) Gecko/20100101 Firefox/6.0.1
Accept: image/png,image/*;q=0.8,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip, deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Connection: keep-alive
Referer: http://???.heroku.com/
Cookie: ???
If-Modified-Since: Tue, 08 Nov 2011 07:36:04 GMT
If-None-Match: "5bda917d22f8a144c293f3f19723dbc6"
Cache-Control: max-age=0
This line:
Cache-Control: public, max-age=31536000
is telling the browser that it may reuse the file for a year (31536000 seconds) without asking for updates, and public allows any cache, including shared ones such as proxies, to store it. Your browser should therefore not normally be re-checking those files. Note, however, that the request above itself carries Cache-Control: max-age=0, which browsers typically send when you explicitly reload a page, and that forces revalidation. Have you tried another browser to verify this behaviour exists elsewhere?
That said, considering that your files are coming from the Varnish cache and not your dyno, and are being returned as HTTP 304, 300ms for 20 files sounds like a very long time. A 304 response should normally be barely perceptible to the user.
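The decision the browser makes can be sketched roughly as follows in Python. This is a simplification with made-up function names, not the algorithm of any real browser: it ignores heuristics, Expires and most directives, and treats a response without max-age as always needing a request.

```python
def parse_directives(value):
    """Split a Cache-Control value into a dict of directive -> argument."""
    result = {}
    for part in value.split(","):
        key, _, arg = part.strip().partition("=")
        if key:
            result[key.lower()] = arg
    return result


def can_reuse_without_request(age_seconds, response_cc, request_cc=""):
    """True if the cached copy may be used without contacting the server."""
    response = parse_directives(response_cc)
    request = parse_directives(request_cc)
    if "no-cache" in response or "no-store" in response or "no-cache" in request:
        return False
    if not response.get("max-age", "").isdigit():
        return False  # simplification: no explicit lifetime, so ask the server
    limit = int(response["max-age"])
    if request.get("max-age", "").isdigit():
        # A request max-age caps how old a copy the client will accept.
        limit = min(limit, int(request["max-age"]))
    return age_seconds < limit


one_hour = 3600
print(can_reuse_without_request(one_hour, "public, max-age=31536000"))  # True
print(can_reuse_without_request(one_hour, "public, max-age=31536000", "max-age=0"))  # False
```

The second call mirrors the request in the question: the response allows a year of reuse, but the max-age=0 that the browser adds on reload caps the acceptable age at zero, so a conditional request goes out and comes back as 304.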