mod_pagespeed: Combine Javascript, CSS not working with SSL - apache

I am using mod_pagespeed. Over HTTP, combining and rewriting both JavaScript and CSS work fine. When I switch to HTTPS, however, none of these four filters works. The Apache error log shows nothing about this.
This is the relevant configuration line:
ModPagespeedEnableFilters rewrite_javascript,rewrite_css,combine_css,combine_javascript,insert_dns_prefetch

mod_pagespeed cannot rewrite HTTPS resources by default; you have to enable it explicitly with one of these options:
ModPagespeedMapOriginDomain - to tell mod_pagespeed to fetch HTTPS resources using HTTP
ModPagespeedLoadFromFile - to tell mod_pagespeed to load HTTPS resources directly from the filesystem.
ModPagespeedFetchHttps - to tell mod_pagespeed to fetch HTTPS resources directly.
ModPagespeedFetchFromModSpdy - if you have mod_spdy installed, to fetch resources using it.
The documentation has more details: https://developers.google.com/speed/pagespeed/module/https_support
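As a rough sketch, the first three options might look like this in the Apache configuration. The domain www.example.com and the directory /var/www/static/ are placeholders, not values from the question, and you would normally pick one option, not all three:

```apache
# Option 1: fetch resources that the HTML references over HTTPS from a
# plain-HTTP origin instead.
# Syntax: ModPagespeedMapOriginDomain origin_to_fetch_from origin_specified_in_html
ModPagespeedMapOriginDomain http://localhost https://www.example.com

# Option 2: read the files straight from disk instead of fetching them.
# The directory is an assumed location for the site's static files.
ModPagespeedLoadFromFile "https://www.example.com/static/" "/var/www/static/"

# Option 3: let mod_pagespeed fetch over HTTPS itself.
ModPagespeedFetchHttps enable
```

Restart or reload Apache after changing these directives in the server configuration.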

Related

Scrapyd links do not work with HTTPS, just keeps loading and loading

I have scrapyd installed on Ubuntu.
I also have a website with SSL enabled, and from that website I need to make requests to links like https://IP_HERE:6800/listjobs.json?project=default.
But it looks like scrapyd does not work with HTTPS.
Even if I open the link in a browser, it just keeps loading and loading.
If I make the request using http:// instead of https://, it works, but I want it to work with HTTPS.
I thought I needed to edit my SSL conf file to cover port 6800. I did, but it's still not working.
Here is what my SSL config file looks like:
<IfModule mod_ssl.c>
<VirtualHost *:443 *:6800>
.... and rest of configuration...
Looking at the scrapyd source code, it listens through a plain TCPServer (from Twisted's twisted.application.internet module) that speaks only HTTP. Apache's config file has no effect on it: scrapyd is a separate Python process, so adding port 6800 to an Apache VirtualHost does not enable SSL in scrapyd.
What you want is an HTTPS-to-HTTP proxy, which accepts HTTPS from clients and forwards the requests to scrapyd over plain HTTP. You can use Apache for that, see this tutorial from Digital Ocean or this blog post.
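A minimal sketch of such a proxy in Apache, assuming mod_ssl, mod_proxy and mod_proxy_http are loaded and scrapyd runs on the same machine. The host name, certificate paths and the /scrapyd/ URL prefix are placeholders chosen for illustration:

```apache
<IfModule mod_ssl.c>
<VirtualHost *:443>
    # Placeholder host name
    ServerName www.example.com

    SSLEngine on
    # Placeholder certificate paths
    SSLCertificateFile /path/to/cert.pem
    SSLCertificateKeyFile /path/to/key.pem

    # Forward https://www.example.com/scrapyd/... to scrapyd's HTTP port.
    ProxyPass /scrapyd/ http://127.0.0.1:6800/
    ProxyPassReverse /scrapyd/ http://127.0.0.1:6800/
</VirtualHost>
</IfModule>
```

With this in place, the website would request https://www.example.com/scrapyd/listjobs.json?project=default instead of talking to port 6800 directly.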

mod_pagespeed with amazon S3

I have an EC2 server running apache (www.example.com) and mod_pagespeed is installed and working.
I have static content hosted on an Amazon S3 bucket (examplecdn.com)
When the html is served up from https://www.example.com, there are a couple of style references which are served from https://examplecdn.com.
Here's some sample html sent from https://www.example.com
<link rel="stylesheet" type="text/css" href="//examplecdn.com/assets/css/file_one.css"/>
<link rel="stylesheet" type="text/css" href="//examplecdn.com/assets/css/file_two.css"/>
I have read the documentation on mod_pagespeed, but I'm having trouble understanding it. I would expect the two requests to be rewritten into one http request.
I have confirmed using wget that https://examplecdn.com/assets/css/file_one.css is accessible from the www.example.com server
I have simplified my setup to use .htaccess for testing purposes. I can turn simple filters on and off easily without needing to restart the apache server. I'm trying to use the combine_css filter just to attempt to get a basic setup up and running. Here's my .htaccess file:
ModPagespeed on
ModPagespeedEnableFilters remove_comments
ModPagespeedEnableFilters collapse_whitespace
ModPagespeedEnableFilters combine_css
I know the documentation mentions lots of "Domain" settings, but I don't know which ones will do the trick. Can someone please tell me what changes I need to make to my .htaccess file in order to get this working?
Thanks!
From the combine_css docs:
The filter will not merge together resources from multiple distinct domains, even if those domains are each authorized by Domain. It will merge together resources from multiple distinct domains that have been mapped together via MapRewriteDomain.
And from here:
This directive lets the server accept https requests for www.example.com without requiring a SSL certificate to fetch resources - in fact, this is the only way PageSpeed can service https requests as currently it cannot use https to fetch resources.
ModPagespeedMapOriginDomain http://examplecdn.com/ https://examplecdn.com/
Maybe this will work for you, but why not have those files local? They will be served by your apache server anyway.
[EDIT]
Tested it, this way worked for me:
pagespeed on;
pagespeed RewriteLevel CoreFilters;
pagespeed Domain *.example.com;
pagespeed Domain https://s3.amazonaws.com/mybucket;
pagespeed MapOriginDomain http://localhost https://s3.amazonaws.com;
pagespeed EnableFilters combine_css;
I tested this with nginx, but it should work the same way with Apache. It should also make no difference if the mapped domain is on CloudFront.
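For Apache, the same settings would use the ModPagespeed-prefixed directive names. This is a direct, untested translation of the nginx lines above, with the same placeholder domains and bucket name; the domain-mapping directives may need to go in the server or virtual host configuration instead of .htaccess:

```apache
ModPagespeed on
ModPagespeedRewriteLevel CoreFilters
# Authorize the site's own domain and the S3 bucket for rewriting.
ModPagespeedDomain *.example.com
ModPagespeedDomain https://s3.amazonaws.com/mybucket
# Same origin mapping as in the nginx configuration above.
ModPagespeedMapOriginDomain http://localhost https://s3.amazonaws.com
ModPagespeedEnableFilters combine_css
```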

Use https instead of http in urls in templates for static files

Currently we are using the default WireCloud template. But since we enabled SSL and redirect every request to the SSL port, I would love to change the URLs of static resources to start with https to avoid mixed content warnings.
Is there a simple way to change the URLs to always start with https instead of http?
That's done automatically, except if WireCloud is behind a proxy (so requests arrive over HTTP instead of HTTPS). In those cases you can force WireCloud to use https links by adding this line to the settings.py file:
FORCE_PROTO = "https"
See this link for more info.

mod_pagespeed with SSL: from // to https://

Apache 2.2.15 on RHELS 6.1
Using mod_pagespeed on a server behind https (implemented by the network's Reverse Proxy).
All html urls are written as "//server.example.com/path/to/file.css" (so, without the protocol specified).
Problem: with the default configuration, pagespeed rewrites the urls as "http://server.example.com/path/to/file.css"
I'm trying to figure out how to have it rewrite the urls as https (or leave it unspecified as //).
After reading the documentation, I tried using ModPagespeedMapOriginDomain like this
ModPagespeedMapOriginDomain http://localhost https://server.example.com
Also tried
ModPagespeedMapOriginDomain http://localhost //server.example.com
ModPagespeedMapOriginDomain localhost server.example.com
... To no avail. Urls keep being rewritten with "http://".
Question: how can I have pagespeed use https instead of http in its urls?
Full pagespeed config here, if needed
It turns out mod_pagespeed does not work with "protocol-relative" urls.
Still, the issue is bypassed if you enable trim_urls
ModPagespeedEnableFilters trim_urls
Be mindful of the potential risks (depending on your javascript codebase, ajax calls could break or produce unexpected html).
Adding this to your configuration might work:
ModPagespeedRespectXForwardedProto on
That works if your reverse proxy sends the X-Forwarded-Proto header with its requests.
That request header tells PageSpeed which protocol the client originally used at the load balancer, which is all it needs to know to rewrite urls correctly.
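A short sketch of both sides of that setup. The second part assumes the proxy happens to be an Apache server with mod_headers, which the question does not state; other proxies and hardware load balancers have their own way to add the header:

```apache
# On the backend Apache that runs mod_pagespeed:
ModPagespeedRespectXForwardedProto on

# On the proxy, inside its HTTPS virtual host (assumption: the proxy is
# Apache with mod_headers loaded):
RequestHeader set X-Forwarded-Proto "https"
```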

How to configure mod_pagespeed for SSL pages

We have a website, e.g. http://www.abc.com, which points to a hardware load balancer that is supposed to balance load across two dedicated servers. Each server runs Apache as a front end and uses mod_proxy to forward requests to Tomcat.
Some pages of our website require SSL, such as https://www.abc.com/login or https://www.abc.com/checkout
SSL is terminated at the hardware load balancer.
When I configured mod_pagespeed, it compressed, minified and merged the CSS files and rewrote them with an absolute url http://www.abc.com/css/merged.pagespeedxxx.css instead of the relative url /css/merged.pagespeedxxx.css.
That works fine for non-SSL pages, but when I navigate to an SSL page such as https://www.abc.com/login, all the CSS and JS files are blocked by browsers like Chrome because their absolute urls do not use SSL.
How can I resolve this issue?
Search for the string "https" in this documentation and this one.
You should show us your current ModPagespeedMapOriginDomain and ModPagespeedDomain settings in your question.
From what I understand from these lines:
The origin_specified_in_html can specify https but the origin_to_fetch_from can only specify http, e.g.
ModPagespeedMapOriginDomain http://localhost https://www.example.com
This directive lets the server accept https requests for www.example.com without requiring a SSL certificate to fetch resources - in fact, this is the only way mod_pagespeed can service https requests as currently it cannot use https to fetch resources. For example, given the above mapping, and assuming Apache is configured for https support, mod_pagespeed will fetch and optimize resources accessed using https://www.example.com, fetching the resources from http://localhost, which can be the same Apache process or a different server process.
And these ones:
mod_pagespeed offers limited support for sites that serve content through https. There are two mechanisms through which mod_pagespeed can be configured to serve https requests:
Use ModPagespeedMapOriginDomain to map the https domain to an http domain.
Use ModPagespeedLoadFromFile to map a locally available directory to the https domain.
The solution would be something like this (or the variant with ModPagespeedLoadFromFile):
ModPagespeedMapOriginDomain http://localhost https://www.example.com
But the real problem for you is that Apache does not receive the HTTPS requests directly, because the hardware load balancer handles them on its own. The mod_pagespeed output filter therefore does not even know that the page was requested on an SSL domain, and when it modifies the HTML content (applying a domain rewrite, for instance) it cannot handle the https case.
So one solution (untested) would be to use another virtual host on the Apache server, still HTTP if you want, dedicated to https handling. The hardware load balancer would then send all https-related URLs (/login, /checkout, ...) to this specific domain name, say http://secure.abc.com. This name is used only between the load balancer and the front-end Apache servers (and Apache should almost certainly restrict access to this virtual host to the load balancer only).
Then, in these http://secure.abc.com virtual hosts, mod_pagespeed would be configured to rewrite domains externally to https://www.example.com. Something like:
ModPagespeedMapOriginDomain http://secure.example.com https://www.example.com
In the end, the user requests https://www.example.com/login, the load balancer handles HTTPS and talks to Apache as http://secure.example.com, and the resulting page contains only references to https://www.example.com/* assets. When these assets are then requested on the https domain, you still have the problem of serving them, so the hardware load balancer should accept all these asset URLs on the https domain and send them to the http://secure.abc.com virtual hosts (or any other static virtual host).
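The dedicated virtual host described above could be sketched roughly as follows. Like the idea itself, this is untested; the host names follow the example.com naming of the directive above, the IP address is a placeholder for the load balancer, and the access rule uses Apache 2.4 syntax, which is an assumption about the server version:

```apache
# Internal-only virtual host that receives the traffic the load balancer
# has already decrypted.
<VirtualHost *:80>
    ServerName secure.example.com

    # Only the load balancer may talk to this virtual host.
    <Location "/">
        Require ip 192.0.2.10
    </Location>

    ModPagespeed on
    ModPagespeedMapOriginDomain http://secure.example.com https://www.example.com
</VirtualHost>
```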
This sounds like you configured the rewritten URL as http://www.abc.com/css/merged.pagespeedxxx.css yourself. If so, try a protocol-relative URL: drop the http: and write just //www.abc.com/css/merged.pagespeedxxx.css. The browser will then load it over the same protocol the embedding page was requested with.
One of the well standardized but relatively unknown features of URLs