Making two sites appear as single domain - apache

I have two websites set up as follows
www.site1.com - IIS 6.0, Url rewriting with Helicon ISAPI Rewrite 2
www.site2.com - Apache
I've managed to make requests to www.site1.com/site2/ reverse proxy to www.site2.com/, so site2 appears to be part of site1. This was achieved by a RewriteProxy rule in the Helicon httpd.ini
However, to preserve SEO, I also need requests for www.site2.com itself to 301-redirect to www.site1.com/site2 - except of course when the request comes via the above reverse proxy. I believe this is possible via a conditional rewrite rule in the .htaccess file of site2, the condition being that if something unique about a request from site1 is detected, it serves the content, otherwise it 301 redirects.
The condition I'd wanted to use was the www.site1.com IP address, but in our scenario it's likely to change, so I need to use something different to identify such a request.
How else could this be achieved? Is the 301/reverse proxy combination a typical solution to this type of problem?

A slightly odd set-up perhaps but I think doable.
Just off the top of my head, I think you would need to check the referrer (the HTTP Referer header) as part of the rewrite rules in .htaccess. So you would check that the referrer is NOT site1 before hitting the rule that issues the 301 redirect.
If you need further help on setting the rules, let me know and I'll see if I can give you a more specific example - I'm a bit pressed for time just now.
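For what it's worth, a rough sketch of the kind of rule described above might look like this in the .htaccess of site2. The host names come from the question. The X-From-Site1 header in the commented alternative is a made-up name, and whether the Helicon proxy can add such a header to proxied requests is an assumption that would need checking.

```apache
RewriteEngine On

# Redirect unless the Referer header points at site1.
# Caveat: Referer is supplied by the browser, not the proxy, so it may be
# missing or point elsewhere even for requests that arrive via site1.
RewriteCond %{HTTP_REFERER} !^https?://www\.site1\.com/ [NC]
RewriteRule ^(.*)$ http://www.site1.com/site2/$1 [R=301,L]

# Alternative condition (hypothetical header name): have the proxy send a
# header that only it sets, and test that instead of the Referer.
# RewriteCond %{HTTP:X-From-Site1} !^yes$
```

Because the Referer depends on how the visitor arrived, this is probably only reliable if the proxy itself marks its requests in some way.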

Related

Apache 2.4 rewriting directory URLs without trailing slash to https://default_site/dir/ instead of preserving domain

This is a relatively recent behavioral change and appears to affect only requests which include an "Upgrade-Insecure-Requests: 1" request header.
Apache has started rewriting such requests for sites which are HTTP-only to an HTTPS URL using the default site name instead of just adding the / at the end of the requested URL.
Example: URL submitted in browser: http://www.example.com/blah
Intended redirect: 301 to http://www.example.com/blah/
Instead redirects: 301 to https://default.site.configured/blah/
This happens whether it's a named virtual on the same address as the default server or a virtual using a separate address with separate Listen directives.
I understand all the arguments in favor of the idea that everything should always be encrypted and I don't want to get into a debate about that. This site doesn't consider the tradeoffs desirable at this time.
The default site does have SSL and is configured to redirect HTTP->HTTPS, but the www.example.com site is not configured that way and does not wish to implement SSL at this time.
Is there any way to get Apache 2.4 to disregard that "Upgrade" header and simply rewrite the URL as desired rather than altering the domain name?
After banging on this some more, I finally found the source of my woes.
This happens when you have IP based virtual hosts and did not configure a name for them using the "ServerName" directive.
tl;dr: If you are having this problem, try adding a "ServerName www.example.com" directive within the VirtualHost definition for the site and that should resolve it.
Details:
It does not happen until a request needs a rewrite beyond adding the trailing /. If a request doesn't contain the "Upgrade-Insecure-Requests: 1" header, it only gets the trailing / added; if it does contain that header, Apache also tries to rewrite the protocol to https, which triggers the full URL rewrite.
In my case, the default host name had an SSL configuration, so it didn't fall back to HTTP after the rewrite or reject the rewrite as invalid.
YMMV, I did not continue to do an exhaustive test of all permutations once I found the solution.
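To show where the directive goes, here is a minimal sketch of an IP-based virtual host with an explicit ServerName. The address 192.0.2.10 and the DocumentRoot path are placeholders, not values from my actual configuration.

```apache
# IP-based virtual host; the address and path below are placeholders
<VirtualHost 192.0.2.10:80>
    # Without this line, Apache has no configured name for this host
    # when it builds self-referential redirect URLs.
    ServerName www.example.com
    DocumentRoot "/var/www/example"
</VirtualHost>
```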

Apache reverse proxy - rewrite and Substitute returning answer

We have a reverse proxy server which uses a rewrite rule to redirect one address to another.
When the redirection works, we get back an answer from that site (google) as a txt page.
Now we wish to substitute a few words in that page and return it to the source server that asked for it.
Our configuration looks like this:
ProxyRequests Off
RewriteEngine on
RewriteRule ^/books\.google\.com(.*) https://books.google.com/$1
Substitute "s/thumbnail_url/test/ni"
We do get the page back from google, but Substitute of words in the page is not working.
Hoping someone can answer it.
Thanks
Found the way to do so, by adding the following lines:
SSLProxyEngine On
RequestHeader set Front-End-Https "On"
Substitute "s/thumbnail_url/test/ni" [P]
The [P] flag makes the whole request go through the proxy over https (as the rewrite rule defines), and the first two lines enable SSL support in the Apache proxy.
Lavi
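For later readers, here is a sketch of how the pieces above might fit together in one place. It is not the exact configuration from this thread: it assumes [P] is attached to the RewriteRule (where mod_rewrite flags normally go), that mod_proxy, mod_proxy_http, mod_ssl, mod_headers, mod_filter and mod_substitute are loaded, and that the response is text/plain or text/html.

```apache
ProxyRequests Off
# Allow the proxy to talk to an https:// backend
SSLProxyEngine On

RewriteEngine On
# [P] passes the rewritten URL to mod_proxy instead of sending a redirect
RewriteRule ^/books\.google\.com(.*) https://books.google.com/$1 [P]

# Assumption: the backend compresses responses unless asked not to, and
# the substitution filter can only match text in an uncompressed body
RequestHeader unset Accept-Encoding

# Run mod_substitute over text responses
AddOutputFilterByType SUBSTITUTE text/plain text/html
Substitute "s/thumbnail_url/test/ni"
```

If the substitution still does nothing, checking the Content-Type and Content-Encoding of the proxied response would be a reasonable first step.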

How can I transparently rewrite an old host url to a new host url?

I have two apache virtual hosts within the same domain (and on same physical system):
old.example.com
new.example.com
I'd like to be able to transparently rewrite or map certain old URLs to new ones. Example:
A request for http://old.example.com/foo would actually result in a request for http://new.example.com/foo
I want the HTTP client (browser) to be unaware of the rewrite. In other words, I'm not looking to redirect. And I only want to rewrite specific URLs.
What can I add to either the virtual host or htaccess file(s) to accomplish this?
I guess you could set up your virtual hosts via mod_rewrite and then simply add those rewriting steps to the configuration.
If, however, all you are trying to do is to re-use some things you have in the file system, without any magic in your config files, I would use symbolic links instead. (I have no idea if there are any equivalents for windows servers, though.)
I found the answer here: http://httpd.apache.org/docs/2.0/misc/rewriteguide.html in the section titled Dynamic Mirror. I added this to my htaccess on http://old.example.com :
RewriteEngine on
RewriteBase /
RewriteRule ^foo http://new.example.com/foo [P]
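The P flag tells the rule to hand the request to mod_proxy (the guide calls this Proxy Throughput), so the content is fetched from new.example.com and the browser never sees a redirect. mod_proxy needs to be enabled for this to work.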
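If you can edit the virtual host configuration rather than .htaccess, a similar effect can probably be had with mod_proxy directly. This is a different technique from the rewrite rule above, shown only as a sketch; the *:80 address is an assumption, and mod_proxy and mod_proxy_http are assumed to be loaded.

```apache
<VirtualHost *:80>
    ServerName old.example.com

    # Do not act as an open forward proxy
    ProxyRequests Off

    # Serve /foo on the old host from the new host, without a redirect
    ProxyPass        "/foo" "http://new.example.com/foo"
    # Rewrite Location headers from the backend so redirects it issues
    # still point at old.example.com
    ProxyPassReverse "/foo" "http://new.example.com/foo"
</VirtualHost>
```

Additional ProxyPass lines can be added for each specific URL that should be mapped.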

how to set mod_pagespeed to work on all pages

I've been experimenting with mod_pagespeed and would like to know if anyone knows how I can add a rule to my httpd.conf that would automatically add all current virtual hosts to the list of running domains:
ModPagespeedDomain http://vhost1.com
ModPagespeedDomain http://vhost2.com
ModPagespeedDomain http://vhost3.com
Thank you.
ModPagespeedDomain seems to accept wildcards. From here:
# Wildcards (* and ?) are allowed in the domain specification. Be
# careful when using them as if you rewrite domains that do not
# send you traffic, then the site receiving the traffic will not
# know how to serve the rewritten content.
ModPagespeedDomain *
Place this in the conf file proper, outside any vhosts.
Yes, you can use wildcards, but please don't use ModPagespeedDomain * unless you can actually control the whole web!
This declaration decides which resources to rewrite and which not to. It is a contract saying that all servers matching the pattern will have mod_pagespeed installed!
Please use something like:
ModPagespeedDomain vhost?.com
Unless you are actually behind a rewriting proxy that can rewrite from any domain.
Also, you can contact us at mod-pagespeed-discuss@googlegroups.com and list issues at http://code.google.com/p/modpagespeed/issues/list

Where is the proper place for mod_rewrite entries?

For the love of God, I can't seem to get this mod_rewrite working properly. Instead of doing brute force trial-and-error, let me ask here.
I want mod_rewrite rules to apply to ALL domains.
I want mod_rewrite entries in httpd.conf
I want to get rid of this WWW virus (for SEO purposes):
http://www.example.com > http://example.com
I want to get rid of index.html (for SEO, google indexes it instead of just domain):
http://www.example.com/index.html > http://example.com
http://www.example.com/some/index.html > http://example.com/some/index.html
Domains are inside <virtualhost> entries. I couldn't figure out where to put what, or which one should take priority. As I mentioned, I would like to apply these 2 rules to ALL DOMAINS on the server.
The situation is exacerbated by ssl.conf. Will all these need to be entered into ssl.conf too? What will happen when there are 2 redirects like:
http://www.example.com/index.html > http://example.com/index.html > http://example.com
Thank you so much. This has quickly become all so confusing.
Maria
This solves it for me. As I suspected, it makes a whole lot of difference where a RewriteRule is applied. Many people, including me, seem to be unaware of this.
http://wiki.apache.org/httpd/RewriteContext
The Apache HTTPD Server deals with requests in discrete phases. While this is usually transparent to the user and administrator, it does have an effect on the behaviour of mod_rewrite when rulesets are placed in different contexts. To oversimplify a little, when rules are placed in VirtualHost blocks (or in the main server context) they get evaluated before the server has mapped the requested URI to a filesystem path. Conversely, when rules are placed in .htaccess files, or in Directory blocks in the main server config, they are evaluated after this phase has occurred.
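To make that concrete, here is a sketch of how the two rules from my question might look inside a VirtualHost block. In this context the pattern is matched against the full URL path, including the leading slash. The *:80 address and the ServerAlias line are assumptions, and I have not tested every combination.

```apache
<VirtualHost *:80>
    ServerName example.com
    ServerAlias www.example.com

    RewriteEngine On

    # /index.html at the site root goes to the bare domain in one step;
    # %1 is the host name with any leading "www." removed
    RewriteCond %{HTTP_HOST} ^(?:www\.)?(.+)$ [NC]
    RewriteRule ^/index\.html$ http://%1/ [R=301,L]

    # Any other request on a www. host goes to the same path without www
    RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
    RewriteRule ^ http://%1%{REQUEST_URI} [R=301,L]
</VirtualHost>
```

Putting the index.html rule first avoids the chain of two redirects I was worried about. As far as I can tell, rules in the main server config are not inherited by virtual hosts by default, so each vhost needs either its own copy or "RewriteOptions Inherit", and the SSL virtual hosts would need the same rules with https:// in the targets.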