Are there any specific spec'd processes that a browser client can use to dynamically encourage a server to push additional items into the browser cache using HTTP/2 server push, before the client actually needs to use them (not talking about Server-Sent Events or WebSockets here, btw, but rather HTTP/2 server push)?
There is nothing (yet) specified formally for browsers to ask a server to push resources.
A browser could figure out which secondary resources it needs to render a primary resource, and could send this information to the server opportunistically on a subsequent request via an HTTP header, but as I said, this is not specified yet.
[Disclaimer, I am the Jetty HTTP/2 maintainer]
Servers, on the other hand, may learn about the resources that browsers ask for, and may build a cache of correlated resources that they can push to clients.
Jetty provides a configurable PushCacheFilter that implements the strategy above, and we have implemented an HTTP/2 Push Demo.
The objective of server push is for the server to send additional files (e.g. JavaScript, CSS) along with the requested URL (e.g. an HTML page) to the browser before the browser knows which related files are required, thus saving a round trip and improving page load speed. If the browser already knows which resources are needed, it can request them with normal HTTP calls.
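For what it's worth, while there is no way for the browser to request a push, some HTTP/2 servers and CDNs treat a Link: rel=preload response header set by the application as a hint to push that resource. This behaviour is server-specific, so check your server's documentation; the paths below are just placeholders:

HTTP/2 200
content-type: text/html; charset=utf-8
link: </css/site.css>; rel=preload; as=style
link: </js/app.js>; rel=preload; as=script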
Problem: Safari is doing a request with the pushed path but to the site host, resulting in 404s.
Scenario: Cross origin asset that is server pushed. Asset's host and site's host are different domains.
Browser: Safari v12+ (also v13) in both MacOS and iOS.
It is worth noting that the server push feature itself works, but Safari makes this extra request to the site host. Also, this doesn't happen on Safari v10 or v11.
I ran into this too, and confirmed (by rewriting responses with Charles Proxy) that Safari does load resources from a link header on the cross-origin domain if the link header uses an absolute URL that includes the domain.
This type of HTTP response will not work in Safari:
HTTP/2 200
content-type: application/javascript; charset=utf-8
... other headers
link: </script.js>; rel=preload; as=script; crossorigin
Instead, you need to include the full domain and protocol, like so:
HTTP/2 200
content-type: application/javascript; charset=utf-8
... other headers
link: <https://www.example.com/script.js>; rel=preload; as=script; crossorigin
This is different from most server push tutorials, which use a path that's absolute from the root of the domain (e.g. /script.js), but I've confirmed that the full-URL form works correctly in Safari even when the server push response is for a JavaScript resource on a different domain than the one the HTML page lives on.
Scenario: Cross origin asset that is server pushed. Asset's host and site's host are different domains.
You cannot push a resource for another domain except in very limited circumstances. The server has to be authoritative for that domain: basically, that means the domain resolves to the same IP address and is covered by the same certificate. So if you are on www.example.com and have a separate sharded domain on static.example.com on the same server, you can in theory push from that. However, browser support for this is really poor and I really wouldn't recommend it. You can use the preload resource hint for that instead, which is much better understood and supported (see the example below).
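For example, a preload hint for the sharded domain could be sent like this (hostnames are placeholders, and the crossorigin attribute matters for scripts and fonts fetched in CORS mode):

HTTP/2 200
content-type: text/html; charset=utf-8
link: <https://static.example.com/script.js>; rel=preload; as=script; crossorigin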
Problem: Safari is doing a request with the pushed path but to the site host
As per the link above, Safari does not support cross-domain pushing. And neither do lots of other browsers.
resulting in 404s.
That would make sense, since the resource you are asking it to push does not exist on that domain.
It is worth noting that the server push feature itself works, but Safari makes this extra request to the host.
Then why do you think it is working?
Also this doesn't happen on Safari v10 or v11.
What doesn’t happen? The push? The double download? Both?
We have recently fixed a nagging error on our website similar to the one described in How to stop javascript injection from vodafone proxy? - basically, the Vodafone mobile network was vandalizing our pages in transit, making edits to the JavaScript which broke viewmodels.
Adding a "Cache-Control: no-transform" header to the page that was experiencing the problem fixed it, which is great.
However, we are concerned that as we do more client-side development using JavaScript MVP techniques, we may see it again.
Is there any reason not to add this header to every page served up by our site?
Are there any useful transformations that this will prevent? Or is it basically just similar examples of carriers making ham-fisted attempts to minify things and potentially breaking them in the process?
The reasons not to add this header are speed and data transfer.
Some proxy / CDN services transcode or compress media, so if your client is behind such a proxy or you are using a CDN, the client may get faster loads and use less data. This header tells the proxy / CDN not to transform the media and to leave the data as is.
So if you don't care about this, or your app doesn't use many media files such as images or audio, or you don't want any transformation applied to your traffic, there is no reason not to add it (on the contrary, it's recommended).
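For example, no-transform can simply be appended to whatever caching directives you already send (the values below are illustrative only):

HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Cache-Control: public, max-age=3600, no-transform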
See the RFC here: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.5
Google has recently introduced the googleweblight service, so if your pages carry the "Cache-Control: no-transform" directive you'll be opting out of having your pages transcoded when the connection comes from a mobile device on a slow internet connection.
More info here:
https://support.google.com/webmasters/answer/6211428?hl=en
I'm trying to optimize my web application using Google's Page Speed API which has highlighted the absence of "Keep-alive" in my HTTP response headers as a major page speed weakness.
In talking with my back-end devs and sys admins, they've told me that using Keep-alive on the site is impossible because we use a load balancer.
I'm wondering, is this accurate? Are there load balancers that support Keep-alive?
It seems strange to me that the Page Speed API would complain about Keep-alive if it were impossible to use with load balancers because I would imagine a fair amount of applications and large sites use load balancers.
Thanks!
I don't know what type of load balancer you have, but I don't think it would prevent the use of keep-alive connections.
The load balancer will hand each incoming connection off to one of the backend servers. Without keep-alive, the browser needs to open a new connection for every object it fetches (for example, all the small images). Establishing and closing TCP connections takes time, which is why Google Page Speed suggests turning keep-alive on. Another option is to put all your small images into one big image and use CSS sprites to display parts of it in different places on your page.
But back to the load balancer. If you have a network load balancer, it should work without any issues: it will just redirect the incoming TCP connection to one of the backend servers. If you have an HTTP load balancer, it will accept the connection, read the request, send the request to a backend server, wait for the answer, and send the answer back to the browser. If you enable keep-alive, the load balancer should forward the next request it receives over the same connection.
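To illustrate (header values are examples only): with HTTP/1.1 the connection is persistent by default, the server can advertise its limits via a Keep-Alive header, and either side can opt out with Connection: close.

GET /images/logo.png HTTP/1.1
Host: www.example.com
Connection: keep-alive

HTTP/1.1 200 OK
Content-Type: image/png
Content-Length: 4096
Connection: keep-alive
Keep-Alive: timeout=5, max=100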
For dynamic pages you don't need keep-alive. Keep-alive is mainly useful for static content (JS, images, CSS), as for each HTML page you usually have more than 10 static objects. So I would suggest continuing to serve HTML through that load balancer and serving static content from a different hostname (static.example.com).
I have a problem with my site after implementing SSL: images do not appear. The scenario is that images come from images.domain.com (hosted on Amazon S3) and my certificate is for www.domain.com.
This problem only seems to happen in IE and not in any other browsers.
The issue is related to "mixed content" - HTTPS pages which have HTTP resources (images, scripts, etc) embedded.
The point of using HTTPS is to ensure that only the originating server and the client have access to the secured page. However, in theory this security could be compromised if HTTP resources are embedded: an attacker could intercept an unsecured JavaScript file in transit and inject code that alters the secured page when it loads.
Most browsers will indicate that a secure page has mixed content by altering the "secure lock" icon, either by showing the lock as open or broken, or by making the icon red (Chrome displayed a skull and crossbones for a short time, but they realised that this was a bit serious for the potential threat level).
Internet Explorer (depending on the version) will display a message either asking whether the insecure content should be shown (IE<=7), or whether only the secure content should be shown (IE>=8). It sounds like you have somehow disabled this message so that the insecure content is always hidden; however, that's not the default behaviour.
I think the best solution for you is to replace your S3 links with HTTPS versions.
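In practice that just means every embedded reference must use an https:// URL (using the hostnames from the question as an example):

<img src="http://images.domain.com/photo.jpg">   <!-- mixed content: warned about or blocked -->
<img src="https://images.domain.com/photo.jpg">  <!-- loads cleanly on an HTTPS page -->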
I am not a web developer, but someone who often deals with the crap experience that is IE. I am not sure what version you are using, but you do not have a wildcard SSL cert (i.e. *.domain.com), so does it have something to do with an old-school limitation in 3rd party images?
See here for what I allude to above and a very good explanation of how IE caches cross-domain HTTPS content, specifically images. I am not sure what the solution is, but I was curious so I researched a little myself and this might help.
We are reviewing the design of a system and need to verify what we think may be a security issue.
In this system, some sensitive information is sent in the query string. The questions are:
Can the query string parameters be read as the request goes over the internet, even if the request is sent over https?
Can the query string parameters be read from the browsing history on the client machines?
When you use HTTPS, the SSL/TLS connection is established before any HTTP traffic is sent, so the whole request (including the URL and its parameters) is encrypted and won't be readable. The only thing possibly visible to a third party is the server certificate (so they could see the host name, but that's it).
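To illustrate with a purely hypothetical request: everything below travels inside the encrypted TLS tunnel, so the path and query string are hidden from anyone watching the network; only the host name can leak, via the certificate (or the TLS SNI extension).

GET /account?ssn=123-45-6789&token=abc123 HTTP/1.1
Host: www.example.com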
The browser's history isn't protected in any way by HTTPS as such, although some browsers may have "safe browsing" options which might automatically delete some HTTPS URLs. Ultimately this really depends on the browser and its configuration.
This is certainly a security issue if sensitive details are being passed in a GET request.
Sensitive data will not only get cached in the user's browser but can also end up in any proxy along the way, as well as in web server logs.
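For instance, a typical access-log entry (format and values here are purely illustrative) records the full request line, query string included:

192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET /account?ssn=123-45-6789 HTTP/1.1" 200 512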
Yes for the first. Not sure about the second - it depends on the browser, I guess - but I suspect the answer is yes there as well.