SEO optimization - front page appears twice? - apache

The web server's initial page lives in a file, say somefile.html. The .htaccess file contains the following directive:
DirectoryIndex somefile.html
which makes the web server serve the contents of somefile.html when the server's root is requested.
However, some pages of the website may link to somefile.html directly, and it has exactly the same content as the server's root. The same content is therefore reachable at two URLs: /somefile.html and /.
What is the best way to fix this with the least effort while keeping the website structure?

Add a 301 redirect to the htaccess file in your document root (preferably before any rules you may already have):
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^\ ]*)somefile\.html
RewriteRule ^ /%1 [L,R=301]
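This redirects any direct request for /somefile.html to just /, and does the same in subdirectories: /foo/bar/somefile.html goes to /foo/bar/. Because the condition tests THE_REQUEST (the request line as the client sent it), the internal DirectoryIndex lookup of somefile.html is not redirected, so there is no loop.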
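To see what the condition captures, here is a small Python sketch that applies the same regular expression to a few raw request lines. It only mimics the pattern matching and is not how Apache evaluates the rule; the function name redirect_target is made up for illustration.

```python
import re

# Same pattern as the RewriteCond, written as a Python regex.
REQUEST_LINE = re.compile(r"^[A-Z]{3,9} /([^ ]*)somefile\.html")

def redirect_target(request_line):
    """Return the 301 target for a raw request line, or None if no redirect applies."""
    match = REQUEST_LINE.match(request_line)
    if match is None:
        return None
    # %1 in the RewriteRule is the directory prefix captured before somefile.html.
    return "/" + match.group(1)

for line in ("GET /somefile.html HTTP/1.1",
             "GET /foo/bar/somefile.html HTTP/1.1",
             "GET / HTTP/1.1"):
    print(line, "->", redirect_target(line))
```

A request for / itself yields None, so only URLs that name somefile.html explicitly are redirected.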

Related

htaccess Remove directory from end of URL in apache

Ok, so I know this is a question that has been asked many times, however, I have not been able to find an answer to my particular case, so please do not shoot me down.
I have a website: http://gmcomputers.co.za.
I am redirecting this URL, using .htaccess file, to a subfolder to load the content:
RewriteEngine on
RewriteCond %{REQUEST_URI} ^/$
RewriteRule (.*) /gmcomputers/ [L,DPI,R=301]
Which works perfectly, except that when I go to http://gmcomputers.co.za I get http://gmcomputers.co.za/gmcomputers/.
So my question is, how do I modify the above code to remove the /gmcomputers/ from being appended?
Please note I copied the code above from a website as I am not at all experienced in redirect, etc and am still learning. Also, the reason I am using .htaccess to redirect is due to there being other websites in the root directory and I therefore cannot edit any config files for Apache.
Thanking you.
You contradict yourself in your question. On the one hand you write that you want to redirect and that this "works perfectly", but then you write that you do not want that result.
My guess is that you actually do not want to redirect at all, but instead want to internally rewrite requests so that they point to that server-side folder, while the URL visible in the browser's address bar does not show the folder. Is that what you are trying to ask?
If so take a look at this example:
RewriteEngine on
RewriteCond %{REQUEST_URI} !^/gmcomputers
RewriteRule ^ /gmcomputers%{REQUEST_URI} [END]
You might want to add an actual redirection to direct clients actually using the folder name in their requests:
RewriteEngine on
RewriteRule ^/?gmcomputers/(.*)$ /$1 [R=301,END]
RewriteCond %{REQUEST_URI} !^/gmcomputers
RewriteRule ^ /gmcomputers%{REQUEST_URI} [END]
It is best to implement such rules in the http server's central host configuration. If you do not have access to that, you can instead use a distributed configuration file (typically called ".htaccess") located in the DOCUMENT_ROOT folder configured for the http host, provided the host configuration enables such files (see the AllowOverride directive). That approach comes with a number of disadvantages, though. The implementation above works the same way in both places.
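The difference between the external redirect and the internal rewrite can be illustrated with a rough Python model of the two rules. This is only a sketch of the decision logic, assuming the request URI always starts with a slash; the function name handle and the action labels are invented for the example.

```python
import re

def handle(request_uri):
    """Return (action, path) for a request URI under the two rules above."""
    # Rule 1: the client asked for the folder explicitly, so send it to the short URL.
    explicit = re.match(r"^/?gmcomputers/(.*)$", request_uri)
    if explicit:
        return ("redirect 301", "/" + explicit.group(1))
    # Rule 2: anything else is served from the folder without changing the visible URL.
    if not re.match(r"^/gmcomputers", request_uri):
        return ("internal rewrite", "/gmcomputers" + request_uri)
    return ("unchanged", request_uri)

print(handle("/"))
print(handle("/gmcomputers/about.html"))
```

Only the first case changes what the browser shows; the second one changes which file is read on the server.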

.htaccess multiple subdomain to folder redirection using END flag

I'm setting up a website for a user and it's become apparent that they have lots of subdomains which were previously directed to specific folders. I'd rather find some way that they can manage this themselves by creating the relevant directories rather than me keep adding virtual hosts or altering .htaccess rules each time they want to add/change them.
As such, I came up with the idea of using a catch-all vhost, and using .htaccess to direct the subdomain to the correct folder.
Now I know similar questions have been asked, but I'm trying to achieve this with a single ruleset, and without performing a full HTTP redirect.
Currently I have the below rules, but I'm getting a strange problem
RewriteCond %{REQUEST_URI} !^/\.well-known
RewriteCond %{HTTP_HOST} !=www.example.co.uk [NC]
RewriteCond %{HTTP_HOST} ^(.+)\.example\.co\.uk [NC]
RewriteRule ^(.*)$ /%1/$1 [L,END,QSA]
Basically the idea is to avoid .well-known so that LetsEncrypt can use the document root to get certs for any sub domain, avoid www. which should use the standard path, but then match and redirect any other subdomain.
Without END this predictably ends with a server error and an exceeded the limit of 10 internal redirects message in the log. This at least seems to confirm it's matching and redirecting though.
However, when using the END keyword, as far as I understand it, the rewrite should only happen once; I'm seeing strange behaviour though.
For a specific path, it seems to work fine
GET /index.html HTTP/1.1
Host: journey.example.co.uk
HTTP/1.1 200 OK
... snip content from /journey/index.html ...
But if I don't give a path, it seems like it's processing the redirect twice.
GET / HTTP/1.1
Host: journey.example.co.uk
HTTP/1.1 404 Not Found
... snip ...
<p>The requested URL /journey/journey/index.html was not found on this server
... snip ...
Given the use of %1, which should be the first part of the hostname, $1, which should be either just a / or empty in this case, I don't see how it's ending up with journey twice in the rewritten path.
I think I might have managed to get this working myself by looking through the rewrite flags documentation for the 10th time and finding this:
nosubreq|NS Causes a rule to be skipped if the current request is an internal sub-request.
The further documentation talks about SSI which isn't relevant to my issue, but it does go on to mention the following:
Also, when mod_dir tries to find out information about possible directory default files (such as index.html files), this is an internal subrequest, and you often want to avoid rewrites on such subrequests
My understanding is that the request for / causes mod_dir to make a subrequest for index.html, which results in two requests, and two rewrites.
Adding the above flag to the rule seems to be working, at least in a few quick tests. As such the following rules seem to allow for redirecting any subdomain to the same-named directory under document root.
RewriteCond %{REQUEST_URI} !^/\.well-known
RewriteCond %{HTTP_HOST} !=www.example.co.uk [NC]
RewriteCond %{HTTP_HOST} ^(.+)\.example\.co\.uk [NC]
RewriteRule ^(.*)$ /%1/$1 [END,NS]
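The explanation above can be played through in a toy Python model. It is a sketch of my understanding only, not of Apache internals: the rewrite function, its is_subrequest and ns parameters, and the way the mod_dir subrequest is simulated are all assumptions made for illustration.

```python
import re

HOST_RE = re.compile(r"^(.+)\.example\.co\.uk", re.IGNORECASE)

def rewrite(host, uri, is_subrequest=False, ns=False):
    """Apply the subdomain rule once; uri is assumed to start with a slash."""
    if ns and is_subrequest:
        return uri                      # NS: the rule is skipped for subrequests
    if uri.startswith("/.well-known"):
        return uri                      # left alone for certificate validation
    if host.lower() == "www.example.co.uk":
        return uri                      # www uses the standard path
    match = HOST_RE.match(host)
    if match is None:
        return uri
    # /%1/$1 : subdomain name, then the path without its leading slash
    return "/" + match.group(1) + "/" + uri[1:]

host = "journey.example.co.uk"
first = rewrite(host, "/")                              # main request for /
index = first + "index.html"                            # mod_dir looks for the index file
print(rewrite(host, index, is_subrequest=True))          # without NS
print(rewrite(host, index, is_subrequest=True, ns=True)) # with NS
```

Without NS the simulated subrequest is rewritten a second time, which reproduces the doubled journey segment from the 404 message.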

How to serve one page website via Apache .htaccess?

Let's say I have developed a one-page application at example.com/index.html. How can I serve this file no matter which URI, such as example.com/blabla, is requested? The requested URI should be preserved, hence redirection is not an option.
RewriteEngine On
RewriteCond %{REQUEST_URI} !=/index.html
RewriteRule ^ /index.html [R=302]
Setting it as the 404 document can apparently do the job too, but it does not feel like the proper way.
ErrorDocument 404 /index.html
In my case it is a custom one-page application made with window.history.pushState, but I am looking for the proper way that is used for Angular.JS and Backbone.JS applications.
A single directive in the site root .htaccess is enough; use FallbackResource:
FallbackResource /index.html
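As a rough picture of what this does, the following Python sketch models the decision: a request that maps to an existing file is served as is, and everything else gets the fallback file while the requested URL stays unchanged. The resolve function and the set of files are hypothetical, and directories and other handlers are ignored.

```python
def resolve(request_path, existing_files, fallback="/index.html"):
    """Return the file that would be served for request_path in this toy model."""
    if request_path in existing_files:
        return request_path
    # No such file: serve the fallback, without any redirect.
    return fallback

files = {"/index.html", "/app.js"}  # hypothetical document root contents

print(resolve("/app.js", files))
print(resolve("/blabla", files))
```

Since no redirect is sent, the application can still read the original path from the address bar and route on it.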

Apache Rewrite: secondary htaccess for domain specific RedirectMatch

On shared web-hosting my software supports multiple domains (all domains point to the same public_html root directory).
What I want to do is keep redirects (and any RedirectMatch) in their own host specific/dedicated .htaccess file.
Visually the directory structure looks like this...
/public_html/ (all domains are pointed internally to this directory)
/public_html/.htaccess
/public_html/www.example1.com/
/public_html/www.example2.com/
/public_html/www.example3.com/
There are two approaches I'm considering though would appreciate input from others:
The first would be to keep domain specific redirects out of the main .htaccess file as defined above. So I'd like to have redirects handled by the .htaccess files as defined by below if possible...
/public_html/www.example1.com/.htaccess
/public_html/www.example2.com/.htaccess
/public_html/www.example3.com/.htaccess
...if this is not feasible I'll settle for a rewrite to a PHP file to hand off redirects to PHP instead. I imagine this isn't as performance oriented though on the other hand it would give me the opportunity to log redirects and see how long it takes them to level off.
Some clarifications:
I'm using shared web hosting so anything Apache related needs to be done through .htaccess files only.
There are no redirects/matches in the master .htaccess file nor will there ever be since two domains may eventually attempt to use the same redirect.
Since you are on a shared host, solutions that involve the server configuration files (which, by the way, are the better option) are not available to you, so I won't bother listing them. The best way to do the above is like this:
The code is written on the assumption that none of the domains share any files or data on the server. Every file belonging to a domain is kept under a folder named after that domain.
The code below is tested (both the static and the non-static variant):
RewriteEngine on
RewriteBase /
RewriteCond %{ENV:REDIRECT_STATUS} 200
RewriteRule ^ - [L]
Then add one of the following to the above.
To do it statically:
RewriteCond %{HTTP_HOST} ^www\.(example1|example2|example3)(\.com)$ [NC]
RewriteRule ^(.*)$ /www.%1%2/$1 [L]
To do it statically and also allow access to the site without www (the target always uses the www. folder name, because %1 is empty when www is omitted):
RewriteCond %{HTTP_HOST} ^(www\.)?(example1|example2|example3)(\.com)$ [NC]
RewriteRule ^(.*)$ /www.%2%3/$1 [L]
To do it non-statically (this is the better solution):
RewriteRule ^(.*)$ /%{HTTP_HOST}/$1 [L]
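All the above does is internally rewrite each URI to its domain's folder. Any other domain-specific rewrites can then be handled in the .htaccess file of the respective folder.
If you have URIs without the www, i.e. example1.com, use the variant whose pattern is ^(www\.)?(example1|example2|example3)(\.com)$ instead of ^www\.(example1|example2|example3)(\.com)$. Note that the optional group shifts the back-reference numbers, so the substitution has to change along with the pattern.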
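The effect of the optional www group on the back-references is easy to check with a short Python sketch. It assumes, as in the directory listing from the question, that every folder is named with the www. prefix; the function name domain_folder is made up for the example.

```python
import re

HOST = re.compile(r"^(www\.)?(example1|example2|example3)(\.com)$", re.IGNORECASE)

def domain_folder(host, path):
    """Map a host and a per-directory path (no leading slash) to the domain's folder."""
    match = HOST.match(host)
    if match is None:
        return None
    # group(1) is "www." or missing, so the folder name is built from groups 2 and 3.
    return "/www." + match.group(2) + match.group(3) + "/" + path

print(domain_folder("www.example1.com", "about.html"))
print(domain_folder("example1.com", "about.html"))
```

Both hosts end up in the same folder, which is what the question's layout requires.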

Apache Rewrite: directory tree to subdomain directory

I have a web application that has one set of files used by 50+ clients, and all the configuration for each site comes from a config.php file in the client's directory. This is accomplished with PHP parsing the URL. All this works fine; I am just having an issue with custom documents that clients can upload, which are located in
/var/www/sites/user1/cache
There can be multiple subdirs. So when requesting
http://user1.site.com/cache/subdir1/image.jpg
it needs to be read from
/var/www/sites/user1/cache/subdir1/image.jpg
The client is allowed to upload any file type, so I just need the rewrite to take any /cache requests, then grab the subdomain and point to proper directory.
Came up with this, but am still getting an invalid page
RewriteEngine On
RewriteCond %{HTTP_HOST} ^([^\.]+)\.site\.com$
RewriteRule ^cache/(.*)$ /sites/%1/cache/$1 [L]
Any help is appreciated.
If I read the RewriteRule documentation correctly, the L flag on its own would generate an internal redirection, meaning that the substitution would be interpreted as a local file system path.
Try using the complete path:
RewriteRule ^cache/(.*)$ /var/www/sites/%1/cache/$1 [L]
or do an external redirection (using HTTP return status "302 MOVED TEMPORARILY"), to let the user's browser re-send the request with the new path:
RewriteRule ^cache/(.*)$ /sites/%1/cache/$1 [L,R]
The /var/www/ is where the files are on the filesystem. I was routing based on the document root so I didn't need to put that there. But I realized I was missing the leading forward slash on the /cache/. Though your answer wasn't really what I was looking for, it made me see what I was missing. Thanks.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^([^\.]+)\.site\.com$
RewriteRule ^/cache/(.*)$ /sites/%1/cache/$1 [L]
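For reference, this is the mapping the final rule is meant to produce, written as a Python sketch that reuses the two patterns. It only imitates the regex matching; the function name map_cache is invented, and the result is a URL path relative to the document root, not a filesystem path.

```python
import re

HOST = re.compile(r"^([^.]+)\.site\.com$")
PATH = re.compile(r"^/cache/(.*)$")

def map_cache(host, uri):
    """Return the rewritten path for /cache requests, or the URI unchanged."""
    host_match = HOST.match(host)
    path_match = PATH.match(uri)
    if host_match is None or path_match is None:
        return uri
    # %1 is the subdomain, $1 is everything after /cache/
    return "/sites/" + host_match.group(1) + "/cache/" + path_match.group(1)

print(map_cache("user1.site.com", "/cache/subdir1/image.jpg"))
print(map_cache("user1.site.com", "/index.php"))
```

Requests outside /cache, and hosts with more than one label before site.com, are left untouched by this rule.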