Why is the referer from my server always null? - apache

I am trying to work out why the referrer from my server always seems to be blank. I have knocked together the following page to test it:
<html>
<head>
<meta http-equiv="Refresh" content="0; url='https://www.whatismyreferer.com/'" />
<meta name="referrer" content="origin" />
</head>
<body>
</body>
</html>
When I go to this page, the referrer is reported as blank/unknown.
Is this something that is being set at the server level in Apache? I have a case where I need to pass the referrer, so finding out what controls this would be good.

The referrer header (with the famous Referer spelling) is sent by the browser, not the server. If the browser decides not to send it (e.g. for privacy reasons), it simply won't be there. You should never rely on this header being present. Even if you find configurations that currently work, the request is valid with or without this header, and browsers may change their behaviour at any time (they have: the header used to be omnipresent; now it is sent less often).
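If you want to influence (not force) what browsers send, a referrer policy can also be requested from the server side, mirroring the meta tag in the question. A minimal Apache sketch, assuming mod_headers is enabled; browsers may still withhold the header:
<IfModule mod_headers.c>
    # Ask browsers to send (at most) the origin as the referrer; they may still decline.
    Header set Referrer-Policy "origin"
</IfModule>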

Related

How can I force a hard refresh if the page has been visited before

Is it possible to check if the client has a cached version of a website, and if so, force his browser to apply a hard refresh once?
You can't force a browser to do anything, because you don't know how rigidly a remote client is observing the rules of HTTP.
However you can set HTTP headers which the browser is supposed to obey.
One such header is Cache-Control. Several of its values may meet your needs, including no-cache and max-age. There is also the Expires header, which specifies a wall-clock expiration time.
It is not readily apparent whether the client has a cached version. To tell the client not to use the cache, you can use meta tags such as these:
<HEAD>
<TITLE>---</TITLE>
<META HTTP-EQUIV="Pragma" CONTENT="no-cache">
<META HTTP-EQUIV="Expires" CONTENT="-1">
</HEAD>
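The Cache-Control and Expires headers mentioned above can also be sent as real HTTP response headers rather than meta tags. A minimal Apache sketch, assuming mod_headers is enabled:
<IfModule mod_headers.c>
    # Tell clients and intermediaries to revalidate on every request
    Header set Cache-Control "no-cache, no-store, must-revalidate"
    Header set Pragma "no-cache"
    Header set Expires "0"
</IfModule>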

Browser doesn't cache script tag requests upon page reload even if the URL is the same

This might sound like a very basic question, but I couldn't find much help from Google.
So, I have an HTML file:
<!doctype html>
<html>
<head>
<title>New Form Title</title>
<script type='text/javascript' src='http://localhost/whatever.js'></script>
</head>
<body>
</body>
</html>
When I hit F5 (after loading the page for the first time), I can see the server returned a 304 status, but I was under the assumption that a request would not even be sent in the first place (i.e. the browser would not send a request because the URL is the same, and would use the cached item instead).
What am I missing? Is this the actual behaviour?
Thank you.
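For context, a plain reload (F5) typically makes the browser revalidate cached resources with a conditional request, and the 304 is the server saying the cached copy is still good; on normal navigation, a resource with an explicit freshness lifetime can usually be reused without any request at all. A hedged Apache sketch of setting such a lifetime, assuming mod_headers is enabled; the <Files> match is illustrative, and an explicit reload will typically still revalidate:
<IfModule mod_headers.c>
    <Files "whatever.js">
        # Let the browser treat the script as fresh for a day on normal navigation
        Header set Cache-Control "max-age=86400"
    </Files>
</IfModule>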

Precedence of X-Robots-Tag header vs robots meta tag

I've placed the following Header in my vhost config:
Header set X-Robots-Tag "noindex, nofollow"
The goal here is just to prevent search engines from indexing my testing environment. The site is WordPress and there is a plugin installed to manage the meta robots settings per page. For example:
<meta name="robots" content="index, follow" />
So my question is, which directive will take precedence over the other since both are being set on every page?
I am not sure if a definitive answer can be given to the question, as the behavior may be implementation-dependent (on the robot side).
However, I think there is reasonable evidence that X-Robots-Tag will take precedence over <meta name="robots" .... See:
One significant difference between the X-Robots-Tag and the robots meta directive is:
X-Robots-Tag is part of the HTTP protocol header.
<meta name="robots" ... is part of the HTML document header.
Therefore X-Robots-Tag belongs to the HTTP protocol layer, while <meta name="robots" ... belongs to the HTML document layer.
As they belong to a different protocol layer, they will not be parsed simultaneously by the (robot) client getting the page: The HTTP layer will be parsed first, and the HTML in a later step.
(Also, it should be noted that X-Robots-Tag and <meta name="robots" ... are not supported by all robots. Google and Yahoo/Bing support both, but according to this some support only <meta name="robots" ..., while others support neither.)
Summary:
If supported by the robot, X-Robots-Tag will be processed first; its restrictions (noindex, nofollow) apply and <meta name="robots" ... is ignored.
Otherwise, the <meta name="robots" ... directive applies.
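If the goal is only to keep the testing environment out of the index, one option (a sketch, assuming mod_headers is enabled and the testing site has its own VirtualHost; the hostname and path are illustrative) is to scope the header to that vhost so production pages keep their per-page meta robots settings:
<VirtualHost *:80>
    ServerName test.example.com
    DocumentRoot /var/www/test
    # Blanket noindex for the testing environment only
    Header set X-Robots-Tag "noindex, nofollow"
</VirtualHost>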
Just an update to Dan's experience, I also have both the
Header set X-Robots-Tag "noindex, nofollow"
and
<meta name="robots" content="index, follow" />
on one of my WordPress sites, and a check in Google Search Console confirmed that the noindex in X-Robots-Tag is taking precedence, as the pages have been crawled but aren't indexed. So the logic in the correct answer is, indeed, correct.
In my recent experience, when Google sees mixed messages it prefers positive action by default, i.e. it favours indexing, and meanwhile it will flag the issue as a critical error/warning in your webmaster tools console if you have one.
See your site's status in Google here: https://www.google.com/webmasters/
See your site's status in Bing here: http://www.bing.com/toolbox/webmaster (note that Yahoo search is now powered by Bing)
Google takes this positive-by-default action because lots of site owners unwittingly have a dodgy CMS semi-blocking robots, and we know how Google loves to accumulate as much data as it can: any excuse!
If the technical settings are erroneous they're liable to be totally disregarded, and we know how search engines index and follow by default when no settings are specified.

Internet Explorer 9 ignoring no cache headers

I'm tearing my hair out over Internet Explorer 9's caching.
I set a series of cookies from a perl script depending on a query string value. These cookies hold information about various things on the page like banners and colours.
The problem I'm having is that in IE9 it will always, ALWAYS, use the cache instead of using the new values. The sequence of events runs like this:
Visit www.example.com/?color=blue
Perl script sets cookies, I am redirected back to www.example.com
Colours are blue, everything is as expected.
Visit www.example.com/?color=red
Cookies set, redirected, colours set to red, all is normal
Re-visit www.example.com/?color=blue
Perl script runs, cookies are re-set (I have confirmed this), but IE9 retrieves all resources from the cache, so on redirect all my colours stay red.
So, every time I visit a new URL it gets the resources fresh, but each time I visit a previously visited URL it retrieves them from the cache.
The following meta tags are in the <head> of example.com, which I thought would prevent the cache from being used:
<META HTTP-EQUIV="CACHE-CONTROL" CONTENT="NO-CACHE">
<META HTTP-EQUIV="PRAGMA" CONTENT="NO-CACHE">
<META HTTP-EQUIV="EXPIRES" CONTENT="0">
For what it's worth, I've also tried <META HTTP-EQUIV="EXPIRES" CONTENT="-1">.
IE9 seems to ignore ALL these directives. The only time I've had success so far in that browser is by using developer tools and ensuring that it is manually set to "Always refresh from server"
Why is IE ignoring my headers, and how can I force it to check the server each time?
Those are not headers. They are <meta> elements, which are an extremely poor substitute for HTTP headers. I suggest you read Mark Nottingham's caching tutorial; it goes into detail about this and about which caching directives are appropriate to use.
Also, ignore anybody telling you to set the caching to private. That enables caching in the browser; it says "this is okay to cache as long as you don't forward it on to another client".
Try sending the following as HTTP Headers (not meta tags):
Cache-Control: private, must-revalidate, max-age=0
Expires: Thu, 01 Jan 1970 00:00:00 GMT
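If the site is served by Apache, one way to emit those headers from the server (a sketch, assuming mod_headers is enabled) is:
<IfModule mod_headers.c>
    Header set Cache-Control "private, must-revalidate, max-age=0"
    Header set Expires "Thu, 01 Jan 1970 00:00:00 GMT"
</IfModule>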
I don't know if this will be useful to anybody, but I had a similar problem on my movies website (crosstastemovies.com). Whenever I clicked on the button "get more movies" (which retrieves a new random batch of movies to rate) IE9 would return the exact same page and ignore the server's response... :P
I had to append a random query parameter to keep IE9 from doing this. So instead of calling "index.php?location=rate_movies" I changed it to "index.php?location=rate_movies&rand=RANDOMSTRING".
Everything is ok now.
Cheers
Will just mention that I had a problem looking very like this. But I tried IE9 on a different computer and there was no issue. Then going to Internet Options -> General -> Delete and deleting everything restored correct behaviour. Deleting the cache was not sufficient.
The only http-equiv values that HTML5 specifies are content-type, default-style and refresh. See the spec.
Anything else that seems to work is only by the grace of the browser and you can't depend on it.
johnstok is correct. Using that code will allow content to update from the server and not just refresh the page.
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta http-equiv="Cache-Control" content="no-cache" />
Put these lines into your <head> section if you need to have it in your ASP code and it should work.

SSL and W3 XHTML Validator

This may be a dumb newbie question, so apologies for that.
My website is using a SSL certificate. I also include the W3 validator link in each of my webpages as follows:
<img src="valid-xhtml1.png" alt="Valid XHTML 1.0 Strict" height="31" width="88" />
(Note: I copied over the W3 validator image locally so SSL wouldn't complain about insecure resources.)
When I do this, and click on the image to validate the page, I get this message from the validator:
The error mentions requesting the validator insecurely. So I tried changing the href of the <a> tag to use https for the validator, but then the page simply doesn't load (I guess because the validator doesn't use SSL).
Does anyone know a way around this? I am guessing there is not a way to use the code as is, but maybe there is a way to update uri=referer to be uri=https://mysite.com/...? Is there a way to dynamically grab the URL of the current page?
Also, just for further reference, does SSL simply prevent the referer request header from being accessed?
Oh, and I know I can just go to my website using http instead of https, and the validator works. But I'd rather get it configured to work with https too.
As for the "validate icon" question:
This would usually lead to displaying a message about "insecure items" (i.e. mixed http+https content)... the validate icon is not officially supported in such a setup... a partial workaround is described here.
If you want to grab the URI dynamically, I suspect you will have to use JavaScript for that and then create/add the <a> in the DOM...
As for the SSL/Referer question:
The standard says that a client (i.e. the browser) should only send the referer if the destination is also secure; so yes, in mixed cases the referer won't get sent to the non-secure URL.
Ok, so it's not looking like there is a way to do this with just HTML. So instead, I decided to use JavaScript to handle the issue.
I removed the <a> tag from around the W3 logo and added an onclick JavaScript function validatePage(). So here is basically a template for an XHTML Strict page that still allows you to include the validation icon.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
<title>Title of document</title>
<script type="text/javascript">
function validatePage() {
    // Build an explicit uri= parameter from the current page URL, swapping the
    // "https" scheme (first 5 characters) for "http", instead of relying on uri=referer.
    var validatorUrl = "http://validator.w3.org/check?uri=http" + (document.URL).substring(5);
    window.open(validatorUrl);
}
</script>
</head>
<body>
<h1>Test Template Page</h1>
<p><img src="valid-xhtml1.png" alt="Valid XHTML 1.0 Strict" height="31" width="88" onclick="validatePage()" /></p>
</body>
</html>
Notice how the validatorUrl variable trims off the "https" from the URL and instead uses "http". So I just circumvented using the HTTP referer header.
Hope this helps someone else.