Should I send Link rel=preload headers when there is a redirect before the final HTML page? - http-headers

I'm trying to optimise the speed of a web page by adding the Link HTTP header to instruct the browser to preload some assets and data that are used on the page. However, most visitors will be redirected first, before they reach the actual HTML page.
Does it make sense to include the Link header already in the response that contains the redirect? Or will the browser not use those assets for the redirected page anyway, so that it's better to leave them out and avoid unnecessary requests?
Example
The user first opens /initial, and gets redirected from there:
GET /initial HTTP/1.1
HTTP/1.1 302 Found
Link: </static/css/bundle.css>; rel=preload; as=style
Location: /final
The browser then follows the redirect to /final:
GET /final HTTP/1.1
HTTP/1.1 200 OK
Link: </static/css/bundle.css>; rel=preload; as=style
<html>
<link rel="stylesheet" href="/static/css/bundle.css">
...
So, does it make sense to include the Link header already in the first 302 response, or is it better to leave it out there?
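For reference, a minimal sketch of how the two responses above might be produced server-side, assuming an Express-style Node.js setup (Express itself, the port, and the route names are assumptions, not part of the question):
// Sketch only: assumes Express; routes, port and CSS path mirror the example above.
const express = require('express');
const app = express();

const preload = '</static/css/bundle.css>; rel=preload; as=style';

// The redirecting response: the question is whether this Link header helps at all.
app.get('/initial', (req, res) => {
  res.set('Link', preload);
  res.redirect(302, '/final');
});

// The final HTML page, which clearly benefits from the preload hint.
app.get('/final', (req, res) => {
  res.set('Link', preload);
  res.send('<html><link rel="stylesheet" href="/static/css/bundle.css"></html>');
});

app.listen(3000);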

Related

301 http response without Location header

If I create an application that sends a 301 response to the browser without a Location header, how would the browser respond?
When I tried a POC using nodejs, it looks like the browser redirects the request to /.
Is this browser-dependent, or is it documented in a spec?
Browsers should just render the HTML body. Location is optional.
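A minimal Node.js POC to observe this yourself (the port and body text are arbitrary):
// Serve a 301 with no Location header and a small HTML body,
// then open http://localhost:3000 and watch what the browser does.
const http = require('http');

http.createServer((req, res) => {
  res.writeHead(301, { 'Content-Type': 'text/html' });
  res.end('<html><body>301 without a Location header</body></html>');
}).listen(3000);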

Dynamic AJAX Meteor website - how to make it crawlable?

I have a Meteor project that has the spiderable package added to it. If I load the page normally and then view the page source, I don't get anything in the <body> tag. If I enter the URL with the ugly ?_escaped_fragment_= added at the end and look at the page source again, everything shows up as it should. I think this means that the spiderable package is working and is correctly rendering the HTML with PhantomJS. So the question now is: how do I make the regular URL, without the ugly part, crawlable? I want to submit the site to Google AdSense and the ugly URL is not accepted, and trying to see what Google sees with the http://www.feedthebot.com/tools/spider/ tool gives an empty result. Any suggestions/help?
Edit 1: Adding the google crawl result from Google Webmaster
Date: Saturday, April 5, 2014 at 8:13:45 PM PDT
Googlebot Type: Web
Download Time (in milliseconds): 304
HTTP/1.1 200 OK
Vary: Accept-Encoding
Content-Type: text/html; charset=utf-8
Date: Sun, 06 Apr 2014 03:13:58 GMT
Connection: keep-alive
Transfer-Encoding: chunked
<!DOCTYPE html>
<html>
<head>
<link rel="stylesheet" href="/7a2b57749c356bfba1728bdae88febf653d0c6ca.css?meteor_css_resource=true">
<script type='text/javascript'>__meteor_runtime_config__ = {"meteorRelease":"0.7.2","PUBLIC_SETTINGS":{"ga":{"account":"UA-********-1"}},"ROOT_URL":"http://****.***","ROOT_URL_PATH_PREFIX":"","autoupdateVersion":"8213872485a2cc1cff2745d78330d7c8db8d8899"};</script>
<script type="text/javascript" src="/caefe2b2510e562c5e310f649c4ff564ddb6b519.js"></script>
<script type='text/javascript'>
if (typeof Package === 'undefined' ||
! Package.webapp ||
! Package.webapp.WebApp ||
! Package.webapp.WebApp._isCssLoaded())
document.location.reload();
</script>
<meta name="fragment" content="!">
</head>
<body>
</body>
</html>
Edit 2:
For now it seems that Google indexes the site correctly, but AdSense doesn't use the same policies, which is the core of this issue for me. Meteor + spiderable + PhantomJS = incompatible for AdSense, but compatible for indexing by Google.
The issue appears to be simply how Google is reporting the crawling in the Webmaster Tools. After some testing with a dummy app, it appears that even though the Google Webmaster Tools reports that it fetched the empty page, the site still gets crawled, indexed, and cached properly on Google.
So for some reason, it shows the result for the pretty URL, even though the ugly URL is the actual page getting crawled, as expected. This doesn't seem like it would be a problem that is specific to Meteor, but rather with the Webmaster Tools. The spiderable package appears to be working as expected.
After all, http://meteor.com, http://docs.meteor.com, and http://atmosphere.meteor.com are all running Meteor and they are indexed/cached fine on Google.
One way you can verify that your site is being crawled without submitting it to be indexed is to look at the thumbnail of the site on your Webmaster Tools homepage:
https://www.google.com/webmasters/tools/home?hl=en
If you're running Apache you could set up a mod_rewrite rule that pushes every 404 to a script. The script would check whether the request was pointing to a special folder (like the 'content' folder below) and try to pull the content for the corresponding ugly URL.
The change to the .htaccess file would look something like this:
RewriteEngine on
# Only rewrite requests that don't match an existing directory or file
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
# Send everything else to the director script, keeping the query string (QSA)
RewriteRule (.*) /director.php?q=$1 [L,QSA]
The director.php script would work something like this:
Check if the 404 request is targeting a specific folder like 'content'
Example: http://myplace.com/content/f-re=feedddffv
Convert the unknown URL into an ugly URL and use cURL to get and serve the content
http://myplace.com/content/f-re=feedddffv becomes http://myplace.com/?f-re=feedddffv
Then the script uses cURL to pull the ugly URL's content into a variable
Echo the content to the viewer
You also need to create a sitemap for the search engine with the new pretty links. You can do something similar in IIS with URL Rewrite. Using something like cURL can be slow, so try to keep your sitemap away from human eyes if possible.
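The answer above describes the director script in PHP terms; as a rough sketch, the same flow in Node.js (the 'content' folder, the f-re parameter and the myplace.com host come from the example and are purely illustrative; fetch assumes Node 18+):
// Rough sketch of the "director" idea: catch pretty URLs, fetch the
// corresponding ugly URL, and echo its content back to the viewer.
const http = require('http');

http.createServer(async (req, res) => {
  // 1. Only handle requests targeting the special 'content' folder.
  const match = req.url.match(/^\/content\/(.+)$/);
  if (!match) {
    res.writeHead(404);
    return res.end('Not found');
  }
  // 2. Convert the pretty URL into the corresponding ugly URL.
  const uglyUrl = 'http://myplace.com/?' + match[1];
  // 3. Pull the ugly URL's content (cURL in the original description).
  const body = await (await fetch(uglyUrl)).text();
  // 4. Echo the content to the viewer.
  res.writeHead(200, { 'Content-Type': 'text/html' });
  res.end(body);
}).listen(8080);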

google translate not showing up when https is used in url

For some reason, when you go to the URL https://www.improvementskills.org/index.cfm, Google Translate does not show up, but when you go to http://www.improvementskills.org/index.cfm it works fine. So I know the issue is with SSL and having https. Does anyone know what the problem is and how to fix it? Thanks!
You are loading Google's JavaScript with an http URL, even when your page is served with https. The browser rejects that, because it's insecure to include non-https content in an https page.
You need to do this:
<script type="text/javascript" src="//translate.google.com/...
rather than specifying the URL as http://translate.google.com/... By starting the URL at the double-slash, the browser will use whichever of http or https the page itself is using.

Is it possible to send a 401 Unauthorized AND redirect (with a Location)?

I'd like to send a 401 Unauthorized AND redirect the client somewhere. However:
if I do it like this:
header('HTTP/1.1 401 Unauthorized');
header('Location: /');
the server sends a 302 Found with Location, so not a 401 Unauthorized.
If I do it like this:
header('Location: /');
header('HTTP/1.1 401 Unauthorized');
the browser receives both a 401 Unauthorized and a Location, but does not redirect.
(IE 9 and Chrome 16 behave the same, so I'm guessing it's correct)
Maybe I'm misusing HTTP? I'd like my app interface to be exactly the same for all clients: text browser, modern browser, API calls etc. The 401 + response text would tell an API user what's what. The redirect is useful for a browser.
Is there a (good) way?
By definition (see RFC 2616), the HTTP 302 response code is the redirect code. Without it, the location header may be ignored.
However, you can send an HTTP 401 response and still display output. Instead of redirecting the user to an error page, you can simply write the content you want to send in the HTTP body of that same response.
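For instance, a minimal sketch of that idea in Node.js (the question uses PHP, but the approach is the same; the login markup is just a placeholder):
// Reply with 401 so API clients see the real status, but still ship a
// usable HTML body (e.g. a login form) for human visitors.
const http = require('http');

http.createServer((req, res) => {
  res.writeHead(401, { 'Content-Type': 'text/html' });
  res.end('<html><body><form action="/signin" method="post">(login form here)</form></body></html>');
}).listen(3000);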
I'm coming in very late here but I thought I'd add my two cents. As I understand it, the desire is to indicate that the user doesn't have the correct authorization and to prompt them to log in. Rudie understandably would like to return 401 Unauthorized (because the user needs to authorize by some mechanism, eg. logging in), and also forward them to the login page - but this is not very easy to accomplish and isn't supported out-of-the-box by most libraries. One solution is to display the login page in the body of the 401 response, as was suggested in another answer. However, let me take a look at this from the perspective of established/best practice.
Test case 1: Facebook
Navigating to a protected Facebook page (my user profile) while logged out results in a 404 Not Found response. Facebook serves up a general purpose "this page is not available" page, which also includes a login form. Interesting. Even more interesting: when I navigate to the "events" page, I'm served a 302 response which forwards to a login page (which returns a 200 response). So I guess their idea is to return 302 for pages that we know exist, but serve 404 for pages which may or may not exist (eg. to protect a user's privacy).
Test case 2: Google Inbox
Navigating to my inbox when I am logged out returns 302 and forwards me to a login page, similar to Facebook. I wasn't able to figure out how to make my Google+ profile private so no test data there...
Test case 3: Amazon.com
Navigating to my order history when I am logged out returns 302 and forwards me to a login page as before. Amazon has no concept of a "profile" page so I can't test that here either.
To summarize the test cases here, it seems to be best practice to return a 302 Found and forward to a login page if the user needs to log in (although I would argue 303 See Other is actually more appropriate). This is of course just in the case where a real human user needs to input a username and password in an html form. For other types of authentication (eg. basic, api key, etc), 401 Unauthorized is obviously the appropriate response. In this case there is no need to forward to a login page.
3xx means Redirect
4xx means the client did something wrong.
There's a reason why the codes are split up the way they are - they don't mix ;)
In addition to the fine answers from Kolink and David (+1's), I would point out that you are attempting to change the semantics of the HTTP protocol by both returning a 401 AND telling the browser to redirect. This is not how the HTTP protocol is intended to work, and if you do find a way to get that result, HTTP clients will find the behavior of your service to be non-standard.
Either you send a 401 and allow the browser to deal with it, or you handle the situation differently (e.g. as one commenter suggested, redirect to a login page or perhaps to a page explaining why the user had no access).
You can send a 401 and then, in the response body, send window.location='domain.com'. However, the user will be immediately redirected without knowing that a 401 occurred.
Here is a clean way:
On the 401 page, you can choose the "view" to send based on the "accept" header in the request.
If the accept is application/json, then you can include the body:
{"status":401;"message":"Authentication required"}
If the "accept" is text/html, then you can include the body:
<form action="/signin" method="post">
<!-- bla bla -->
<input type="hidden" name="redirect" value="[URL ENCODE REQUEST URI]">
</form>
Then you run into the same question... do you issue a 200 OK or a 302 Found on a successful login? (See what I did there?)
If you can handle authentication on any page, you can just make the form action the same page URL, but watch out for XSS if you are putting the user-supplied request_uri in the form action attribute.
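A rough sketch of that Accept-based branching in Node.js, framework-free (the header matching is simplified and the bodies just mirror the snippets above):
const http = require('http');

http.createServer((req, res) => {
  const accept = req.headers['accept'] || '';
  if (accept.includes('application/json')) {
    // API clients: machine-readable 401 body.
    res.writeHead(401, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ status: 401, message: 'Authentication required' }));
  } else {
    // Browsers: a 401 whose body is the login form itself.
    const redirect = encodeURIComponent(req.url);
    res.writeHead(401, { 'Content-Type': 'text/html' });
    res.end('<form action="/signin" method="post">' +
            '<input type="hidden" name="redirect" value="' + redirect + '">' +
            '</form>');
  }
}).listen(3000);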
Web browsers are not REST clients. Stick to sending status 302 with a Location header and no body content. The 30x redirects are for pages that have moved. No other status code/Location header combination should be expected to redirect in a web browser.
Alternatively, your web server may have configurable error pages. You can add javascript to the error page to redirect.

iweb pages are downloading rather than viewing

I'm having an issue with a friend's iWeb website - http://www.africanhopecrafts.org. Rather than the pages displaying, they download instead, even though they're all HTML files. I've tried messing with my .htaccess file to see if that was affecting it, but nothing's working.
Thanks so much
Most likely your friend's web site is dishing up the wrong MIME type. The web server might be misconfigured, but the page can override the Content-Type response header by adding a <meta> tag to the page's <head> like this:
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1" />
(where the charset in use reflects that of the actual web page.)
If the page is being served up with the correct content-type, the browser might be misconfigured to not handle that content type. Does the problem occur for everybody, or just for you? Is the problem dependent on the browser in use?
You can sniff the content-type by installing Firefox's Tamper Data plug-in. Fire up Firefox, start Tamper Data, and fetch the errant web page via Firefox. Examining the response headers for the request should tell you what content-type the page is being served with.