Robots.txt for multiple domains - seo

We have different domains for each language
www.abc.com
www.abc.se
www.abc.de
Each site has its own sitemap.xml. In robots.txt, I want to add a sitemap reference for each domain.
Is it possible to have sitemap references for multiple domains in a single robots.txt?
If there are multiple, which one does a crawler pick?

I'm using the following solution in .htaccess after all domain redirects and www to non-www redirection.
# Rewrite URL for robots.txt
RewriteRule ^robots\.txt$ robots/%{HTTP_HOST}.txt [L]
Create a new directory in your root called robots.
Create a text file filled with the specific robots information for every domain.
/robots/abc.com.txt
/robots/abc.se.txt
/robots/abc.de.txt
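If some requests can still arrive with a www. prefix or an explicit port, %{HTTP_HOST} will not match the file names above. The following is a minimal, untested sketch that strips both before building the file name. It assumes Apache 2.x (PCRE syntax), the robots/ directory layout described above, and host names that arrive in lower case:

```apache
RewriteEngine On
# Capture the host name without a leading "www." and without a ":port" suffix
RewriteCond %{HTTP_HOST} ^(?:www\.)?([^:]+)(?::\d+)?$ [NC]
# %1 is the captured bare host, e.g. abc.se is served from robots/abc.se.txt
RewriteRule ^robots\.txt$ robots/%1.txt [L]
```

With this variant a request for www.abc.se/robots.txt and one for abc.se/robots.txt would both be answered from robots/abc.se.txt.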

The robots.txt can only inform the search engines of sitemaps for its own domain. So that one will be the only one it honors when it crawls that domain's robots.txt. If all three domains map to the same website and share a robots.txt then the search engines will effectively find each sitemap.

Based on Hans2103's answer, I wrote this one that should be safe to be included in just about every web project:
# URL Rewrite solution for robots.txt for multidomains on single docroot
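# Skip the rewrite when robots.txt exists as a real directory or file
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
# Only rewrite when the host-specific robots file exists
RewriteCond %{DOCUMENT_ROOT}/robots/%{HTTP_HOST}.txt -f
RewriteRule ^robots\.txt$ robots/%{HTTP_HOST}.txt [L]
These conditions serve the normal robots.txt if it is present, and otherwise fall back to robots/<domain.tld>.txt only when that file exists. Note that Apache does not allow a comment at the end of a directive line, so each comment sits on its own line, and the -f test needs a full filesystem path, hence %{DOCUMENT_ROOT}.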

Related

mod_rewrite to remove index.php from Codeigniter in subdirectory

Codeigniter applications commonly use mod_rewrite to exclude the string index.php from the url. I have two Codeigniter applications within the same domain. One Codeigniter application is in the web root folder, another Codeigniter application is in a subfolder of the web root folder.
Codeigniter application 1:
http://domain.com/index.php
Codeigniter application 2 (the landing page application):
http://domain.com/land/index.php
The two Codeigniter applications are each atomic and do not share any files between them. Every file in the Codeigniter framework is in public_html/ and again in public_html/land/. So I need to exclude the string index.php in urls addressing the root / folder and also exclude the string index.php in the /land/ subfolder.
The .htaccess file in the root folder uses the widely recommended mod_rewrite rules (code below) from the Codeigniter wiki, and this set of rules works well for the root Codeigniter application (application 1). These rules reside in web root folder.
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
#Removes access to the system folder by users.
#Additionally this will allow you to create a System.php controller,
#previously this would not have been possible.
#'system' can be replaced if you have renamed your system folder.
RewriteCond %{REQUEST_URI} ^system.*
RewriteRule ^(.*)$ /index.php?/$1 [L]
#When your application folder isn't in the system folder
#This snippet prevents user access to the application folder
#Rename 'application' to your applications folder name.
RewriteCond %{REQUEST_URI} ^application.*
RewriteRule ^(.*)$ /index.php?/$1 [L]
#Checks to see if the user is attempting to access a valid file,
#such as an image or css document, if this isn't true it sends the
#request to index.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?/$1 [L]
</IfModule>
<IfModule !mod_rewrite.c>
# If we don't have mod_rewrite installed, all 404's
# can be sent to index.php, and everything works as normal.
ErrorDocument 404 /index.php
</IfModule>
The above set of rules has no problem removing index.php from the urls in the root Codeigniter application. But this set of rules does not seem to allow the mod_rewrite rules in public_html/land/.htaccess to execute.
When I remove the mod_rewrite rules in public_html/.htaccess, then the mod_rewrite rules in public_html/land/.htaccess start being evaluated.
Is there a way to change the mod_rewrite rules in public_html/.htaccess to handle the special case of a url intended to access the /land/ subfolder?
I think the best solution might be to change the mod_rewrite rules in public_html/.htaccess to allow the mod_rewrite rules in public_html/land/.htaccess to execute when the subfolder is addressed in the url. I am open to any suggestions.
Pre-emptive answer to the question "why don't you just use a subdomain?": 1) Saving money on the SSL certificate. 2) Non-technical users are sometimes confused by subdomains when marketing the base domain name.
Pre-emptive answer to "why don't you combine the Codeigniter applications to use the same files in the framework?" Duplicating the framework files is an easy way to keep the versioning repositories separated.
The problem is that the rules in public_html/.htaccess are rewriting the URLs going to /land/. You need a pass-through rule so that nothing happens when /land/ is requested. Add:
RewriteRule ^land/ - [L]
before the rest of your rules.
Add a rule at the top to just go to the land subfolder if it's part of the request string. That way, the rules in /land/.htaccess will be executed instead of the subsequent rules in /.htaccess. So put this at the top:
RewriteRule ^land.*$ - [NC,L]
This checks whether the request path begins with 'land' and, if so, leaves the URL unchanged (the - substitution) and stops processing the remaining rules in this file, so the .htaccess rules in that subdirectory are applied instead. Note that this pattern also matches other paths that start with 'land', such as 'landing'; ^land/ is stricter.
The existing conditions that skip the rewrite for real files and folders do not help here, because whatever follows 'land' in the request is probably not a real file, so the rewrite rule fires.
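For the pass-through to be useful, public_html/land/.htaccess needs rewrite rules of its own. The question does not show that file, so the following is only a sketch of what it might contain, assuming the same wiki rules with the base path changed to /land/:

```apache
# public_html/land/.htaccess (hypothetical contents, not taken from the question)
<IfModule mod_rewrite.c>
RewriteEngine On
# Relative substitutions below are resolved against /land/ instead of /
RewriteBase /land/
# Send everything that is not a real file or directory to /land/index.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?/$1 [L]
</IfModule>
```

The RewriteBase line is the part that matters: without it, the relative target index.php could end up resolved against the wrong URL path.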

Apache Rewrite: secondary htaccess for domain specific RedirectMatch

On shared web-hosting my software supports multiple domains (all domains point to the same public_html root directory).
What I want to do is keep redirects (and any RedirectMatch) in their own host specific/dedicated .htaccess file.
Visually the directory structure looks like this...
/public_html/ (all domains are pointed internally to this directory)
/public_html/.htaccess
/public_html/www.example1.com/
/public_html/www.example2.com/
/public_html/www.example3.com/
There are two approaches I'm considering though would appreciate input from others:
The first would be to keep domain specific redirects out of the main .htaccess file as defined above. So I'd like to have redirects handled by the .htaccess files as defined by below if possible...
/public_html/www.example1.com/.htaccess
/public_html/www.example2.com/.htaccess
/public_html/www.example3.com/.htaccess
...if this is not feasible I'll settle for a rewrite to a PHP file to hand off redirects to PHP instead. I imagine this isn't as performance oriented though on the other hand it would give me the opportunity to log redirects and see how long it takes them to level off.
Some clarifications:
I'm using shared web hosting so anything Apache related needs to be done through .htaccess files only.
There are no redirects/matches in the master .htaccess file nor will there ever be since two domains may eventually attempt to use the same redirect.
Since you are on shared hosting, solutions that involve the server conf files (which, by the way, are better) are not available to you, so I won't list them. The best way to do the above is like this:
The code was written on the assumption that none of the domains share any files or data on the server. Every file belonging to a domain is kept under a folder named after that domain.
The code below is tested (both the static and the non-static variants):
RewriteEngine on
RewriteBase /
RewriteCond %{ENV:REDIRECT_STATUS} 200
RewriteRule ^ - [L]
Then add one of the following to the above.
To do it statically:
RewriteCond %{HTTP_HOST} ^www\.(example1|example2|example3)(\.com)$ [NC]
RewriteRule ^(.*)$ /www.%1%2/$1 [L]
To do it statically and also allow access to the site without www:
RewriteCond %{HTTP_HOST} ^(www\.)?(example1|example2|example3)(\.com)$ [NC]
RewriteRule ^(.*)$ /%1%2%3/$1 [L]
To do it non-statically, which is the better solution:
RewriteRule ^(.*)$ /%{HTTP_HOST}/$1 [L]
All the above does is internally rewrite each URI to its domain's folder. All other domain-specific rewrites can be handled in the respective folders.
If you have URIs without the www, i.e. example1.com change ^www\.(example1|example2|example3)(\.com)$ to ^(www\.)?(example1|example2|example3)(\.com)$
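Once requests are mapped into the per-domain folders, each folder can carry its own redirects. The following is an untested sketch of what /public_html/www.example1.com/.htaccess might look like. The paths old-page and new-page are hypothetical, and mod_rewrite is used instead of RedirectMatch on the assumption that the pattern should be matched relative to the domain folder:

```apache
# /public_html/www.example1.com/.htaccess (hypothetical example)
RewriteEngine On
# The pattern is relative to this folder, so it matches http://www.example1.com/old-page
# The 301 sends the browser to /new-page on the same host
RewriteRule ^old-page/?$ /new-page [R=301,L]
```

Because this file lives in the domain's own folder, the same old-page path can redirect somewhere else for another domain without any conflict in the main .htaccess.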

Rename webstore products directory and redirect all its contents

I want to rename our webstore products directory, and redirect all its contents, from /products/ to /Product/ using htaccess.
The problem is that we have several individual domain names (mysite.com, mysite.co.uk) accessing this directory so the resultant htaccess code cannot specify a single destination domain such as .com
As long as all domains/vhosts use the same namespace, you can use the same RewriteRule in all vhosts/.htaccess files:
RewriteEngine On
RewriteRule ^products/([^/\.]+.*) /Product/$1 [L]
Good luck!
Alex.

Setting up Drupal and Wordpress under a single document root

I have a hosting account which provides me a folder to publish my files for my domain (say www.example.com). I have set up Drupal for www.example.com with .htaccess at the top folder to enable clean-urls for the Drupal installation. Now I want to have a Wordpress installation under www.example.com/blog/ and have clean URLs for that blog. But while using .htaccess it is not working ok as the .htaccess at the top folder will override the sub-folder one. How to achieve what I intend to?
This really depends on the exact content of your respective .htaccess files.
One workaround is to add a RewriteCond to the head of the main .htaccess file that, if the request URI matches the sub-directory, stops parsing:
RewriteCond %{REQUEST_URI} ^/blog
RewriteRule .* - [L]
This should lead to the blog URLs being handled properly by the rules in the blog sub-directory's own .htaccess.
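To show where the exclusion goes, here is a sketch of the main .htaccess with the pass-through placed before the clean-URL rules. The Drupal rules shown are an assumption (they vary by Drupal version), and the pattern is tightened to ^/blog(/|$) so that a path such as /blogroll is not excluded by accident:

```apache
RewriteEngine On
# Leave anything under /blog untouched so that /blog/.htaccess (WordPress) handles it
RewriteCond %{REQUEST_URI} ^/blog(/|$)
RewriteRule .* - [L]

# Assumed Drupal clean-URL rules; check the .htaccess shipped with your Drupal version
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
```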

serve with apache all paths under a domain through one script

i'm hosting a website through a hosting company [1] on a linux/apache server. until now i serve the different content through one script with parameters. an example url is
www.mydomain.com/pages.php?date=1-10-2008
now i want to change the scheme the url is composed of to something which looks completely like a path url. eg.:
www.mydomain.com/pages/date/2008/20/1
for this i need to switch off the normal mapping of url paths to directory folders in apache: all requests to all paths should go to one central script (pages.php), which then analyzes the path component of the url.
how do i tweak apache for this? i hope some .htaccess rules could do the trick.
[1] btw, the hosting company is godaddy.com.
Something like:
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . pages.php
should rewrite every request for a file or directory that doesn't exist to pages.php. This will allow you to keep static files (images, stylesheets, etc) in the same document root.
(Shamelessly stolen from WordPress :) )
You are looking for mod_rewrite. Example htaccess file:
RewriteEngine On
RewriteBase /
RewriteRule ^pages/([^/]*)/(.*)$ pages.php?$1=$2
Rather than parse the url in your script, you should be able to handle the specific example above with Apache's ModRewrite module.
http://httpd.apache.org/docs/1.3/mod/mod_rewrite.html
You can use these in the .htaccess file, assuming your host allows this.
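The two answers can be combined so that the whole remaining path reaches the script in one parameter. This is a minimal sketch; the parameter name path is hypothetical, and pages.php would have to split its value on "/" itself:

```apache
RewriteEngine On
RewriteBase /
# Do not touch real files or directories (images, stylesheets, ...)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# "path" is a hypothetical parameter name; QSA keeps any existing query string
RewriteRule ^pages/(.*)$ pages.php?path=$1 [L,QSA]
```

With these rules a request for /pages/date/2008/20/1 would be handled internally as pages.php?path=date/2008/20/1.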