.htaccess to show a directory index.html without a trailing slash - apache

I've got a Jekyll generated site running on an Apache server and I'm having some trouble getting my .htaccess file set up correctly. Jekyll places index.html files into folders which represent each page so my URLs currently look like domain.com/foo/
I'd like to remove that trailing slash from the URL so that it exactly matches what I had set up previously (and also because I think it looks better).
Currently the section of my .htaccess file dealing with rewrites looks like:
<IfModule mod_rewrite.c>
RewriteCond %{HTTPS} !=on
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^(.*)$ http://%1/$1 [R=301,L]
</IfModule>
Options -Indexes
DirectoryIndex index.xml index.html
I have tried following the advice here but that puts me into a redirect loop.
Can anybody help me out? In brief, what I want is for a domain.com/foo URL to show the index.html file from the /foo directory and for domain.com/foo/ and domain.com/foo/index.html to redirect to domain.com/foo.

You should be able to use this to turn off the addition of slashes.
DirectorySlash Off
Note that the trailing slash is added for a good reason. Having the trailing slash in the directory name will make relative URLs point at the same thing regardless of whether the URL ends with "foo/bar/index.html" or just "foo/bar/". Without the trailing slash, relative URLs would reference something up one level from what they normally point at. (eg: "baz.jpg" would give the user "/foo/baz.jpg" instead of "/foo/bar/baz.jpg", as the trailing "bar" will get removed if it isn't protected by a trailing slash.) So if you do this, you probably want to avoid relative URLs.
To then rewrite the directory name to return the index.html you could probably do something like this:
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI}/index.html -f
RewriteRule ^(.*)$ /$1/index.html [L]
This checks if REQUEST_URI/index.html exists, and if it does performs an internal redirect.
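Putting the pieces together, a minimal untested sketch of the whole setup might look like the following. It assumes the .htaccess file sits in the document root and that index.html is the only index document that matters; the THE_REQUEST guard on the external redirect is my own addition (not from the question) and is there to avoid the redirect loop, since THE_REQUEST holds the original request line and is not changed by the internal rewrite.

```apache
# Assumption: this file lives in the document root.
Options -Indexes
DirectorySlash Off
RewriteEngine On

# Externally redirect /foo/ and /foo/index.html to /foo.
# Only the original request line is tested, so the internal
# rewrite below cannot retrigger this rule.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/([^?\s]+?)/(index\.html)?[?\s]
RewriteRule ^ /%1 [R=301,L]

# Internally serve /foo/index.html when /foo is requested.
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI}/index.html -f
RewriteRule ^(.+)$ /$1/index.html [L]
```

The site root (domain.com/) is deliberately left alone, because the first pattern requires at least one path segment before the trailing slash.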

Related

How can I create a redirect with .htaccess to correct path instead of page access

I am making a multilingual dynamic site that creates a virtual path per language.
So French pages live under domain.com/fr/ and English pages under domain.com/en/ (for example domain.com/en/page or domain.com/fr/some/page), but in reality these pages are in the base folder and the /fr/ part is converted to a query string.
This is all working with the following .htaccess:
RewriteEngine on
# Fixes the issue where a page and folder can have the same name. See https://stackoverflow.com/questions/2017748
DirectorySlash Off
# Return 404 if original request is /foo/bar.php
RewriteCond %{THE_REQUEST} "^[^ ]* .*?\.php[? ].*$"
RewriteRule .* - [L,R=404]
# Remove virtual language/locale component
RewriteRule ^(en|fr)/(.*)$ $2?lang=$1 [L,QSA]
RewriteRule ^(en|fr)/$ index.php?lang=$1 [L,QSA]
# Rewrite /foo/bar to /foo/bar.php
RewriteRule ^([^.?]+)$ %{REQUEST_URI}.php [L]
My problem is that some sites (like a LinkedIn post) automatically strip the trailing / from a link to the index page. So if I put domain.com/fr/ in my post, the link actually points to domain.com/fr even though it is displayed as domain.com/fr/, and that returns a 404 because domain.com/fr doesn't exist.
So how can I redirect domain.com/fr to domain.com/fr/, or localhost/mypath/fr to localhost/mypath/fr/ (there are many sites on my local workstation)?
I tried something like:
RewriteRule ^(.*)/(en|fr)$ $1/$2/ [L,QSA,R=301]
RewriteRule ^(en|fr)$ $1/ [L,QSA,R=301]
But that ended up somehow adding the full real computer path in the url:
localhost/mypath/fr becomes localhost/thepathofthewebserverinmypc/mypath/fr/
I would very much appreciate some help as I have yet to find the right rule.
Thank you
RewriteRule ^(en|fr)$ $1/ [L,QSA,R=301]
You are just missing the slash prefix on the substitution string. Consequently, Apache applies the directory-prefix to the relative URL, which results in the malformed redirect.
For example:
RewriteRule ^(en|fr)$ /$1/ [L,R=301]
The substitution is now a root-relative URL path and Apache just prefixes the scheme + hostname to the external redirect. (The QSA flag is unnecessary here, since any query string is appended by default.)
This needs to go before the existing rewrites (and after the blocking rule for .php requests).
Note that the "internal rewrite" directives are correct to not have the slash prefix.
Aside:
DirectorySlash Off
Note that if you disable the directory slash, you must ensure that auto-generated directory listings (mod_autoindex) are also disabled, otherwise if a directory without a trailing slash is requested then a directory listing will be generated (exposing your file structure), even though there might be a DirectoryIndex document in that directory.
For example, include the following at the top of the .htaccess file:
# Disable auto-generated directory listings (mod_autoindex)
Options -Indexes
UPDATE:
This worked on the production server, as the site is in the server root. Would you know how I can also "catch" this on my localhost? RewriteRule ^(.*)/(en|fr)$ /$1/$2/ [L,R=301] doesn't catch it, but with only RewriteRule ^(en|fr)$ /$1/ [L,R=301], localhost/mypath/fr becomes localhost/fr/
From that I assume the .htaccess file is inside the /mypath subdirectory on your local development server.
The RewriteRule pattern (first argument) matches the URL-path relative to the location of the .htaccess file (so it does not match /mypath). You can then make use of the REQUEST_URI server variable in the substitution that contains the entire (root-relative) URL-path.
For example:
RewriteRule ^(en|fr)$ %{REQUEST_URI}/ [L,R=301]
The REQUEST_URI server variable already includes the slash prefix.
This rule would work OK on both development (in a subdirectory) and in production (root directory), so it should replace the rule above if you need to support both environments with a single .htaccess file.
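For reference, here is how the pieces from the question and this answer might be assembled into one file. This is only a sketch of the ordering described above: the existing directives are copied from the question as they are, and only the position of the new redirect and the Options line come from this answer.

```apache
# Disable auto-generated directory listings (mod_autoindex)
Options -Indexes

RewriteEngine on
DirectorySlash Off

# Return 404 if original request is /foo/bar.php
RewriteCond %{THE_REQUEST} "^[^ ]* .*?\.php[? ].*$"
RewriteRule .* - [L,R=404]

# Redirect /fr to /fr/ (and /en to /en/); works in the root or in a subdirectory
RewriteRule ^(en|fr)$ %{REQUEST_URI}/ [L,R=301]

# Remove virtual language/locale component
RewriteRule ^(en|fr)/(.*)$ $2?lang=$1 [L,QSA]
RewriteRule ^(en|fr)/$ index.php?lang=$1 [L,QSA]

# Rewrite /foo/bar to /foo/bar.php
RewriteRule ^([^.?]+)$ %{REQUEST_URI}.php [L]
```

Test with a 302 first (R=302) if you can, since browsers cache 301 responses aggressively.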

Force subfolders to follow parent htaccess redirect rules?

I am currently using the following .htaccess code on my server to enable me to host the primary domain files from a subfolder in the public_html folder:
RewriteEngine on
RewriteCond %{HTTP_HOST} ^(www.)?example.com$
RewriteCond %{REQUEST_URI} !^/subdirectory/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /subdirectory/$1
RewriteCond %{HTTP_HOST} ^(www.)?example.com$
RewriteRule ^(/)?$ subdirectory/index.php [L]
This doesn't solve the problem of subfolders of this new root, however. For example, I have two folders:
public_html\example\ - which should correspond to www.example.com
and
public_html\example\subfolder - which should correspond to www.example.com/subfolder
My problem is that navigating to www.example.com/subfolder in the browser redirects me to www.example.com/example/subfolder.
EDIT: Further to the response from @Jon, this is only occurring when navigating to the URL without a trailing slash. Navigating to www.example.com/subfolder/ is working as expected.
How do I prevent the redirect to www.example.com/example/subfolder?
When you go to a URL that's a folder and the trailing slash is missing, Apache will redirect the browser to the same URL with a trailing slash at the end. This can sometimes mess with rewrite rules that are internally rewriting requests.
You either must go to example.com/subfolder/ or turn off the directory slash function. However, turning this off can be very dangerous:
Security Warning
Turning off the trailing slash redirect may result in an information disclosure. Consider a situation where mod_autoindex is active (Options +Indexes) and DirectoryIndex is set to a valid resource (say, index.html) and there's no other special handler defined for that URL. In this case a request with a trailing slash would show the index.html file. But a request without trailing slash would list the directory contents.
This means that if someone goes to example.com/subfolder, they'll see your directory contents, even though there's an index.php file there. You can turn off indexes, but then they'll just see a 403 and still won't see your index.php.
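If you do decide to turn the directory slash off, one possible workaround for the 403 is to map slash-less directory requests to the index file yourself. The following is an untested sketch, not something from the question: it assumes index.php is the index document and that these lines are placed before the existing rules in the same .htaccess file.

```apache
# Never expose directory listings once DirectorySlash is off
Options -Indexes
DirectorySlash Off

RewriteEngine on

# Request maps to a real directory, has no trailing slash,
# and that directory contains index.php: serve it internally.
RewriteCond %{REQUEST_FILENAME} -d
RewriteCond %{REQUEST_FILENAME}/index.php -f
RewriteRule ^(.+[^/])$ $1/index.php [L]
```

Keep in mind that relative URLs inside the served page will then resolve one level up, because the browser still sees a URL without the trailing slash.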

htaccess Silent Redirect to Subdirectory: Subdirectory showing when no trailing '/'

I have dug high and low around Google and StackOverflow to try and figure out my problem, trying countless solutions but nothing has completely worked.
I'm looking to move the web root of the main domain on my server to a sub-directory. What I have currently for a server path to my web root:
/home/user/public_html/MyWebFilesHere
What I'm looking to have:
/home/user/public_html/subdir/MyWebfilesHere
When I browse to mydomain.com, there should be no visible difference though (i.e. "subdir" not visible after redirect).
Unfortunately, I am restricted to doing this purely with a .htaccess file since I'm on shared hosting and don't have access to Apache config files and such. :(
What I currently have in my .htaccess in public_html is:
RewriteEngine on
RewriteCond %{HTTP_HOST} ^(www\.)?mydomain\.com$
RewriteCond %{REQUEST_URI} !^/subdir/
RewriteRule ^(.*)$ /subdir/$1 [L]
This successfully redirects all queries to the sub-directory, however there's a really weird issue. If I go to
mydomain.com/Contact/
it works great, redirecting the query to the path /subdir/Contact/ but leaving the address bar alone. If I go to
mydomain.com/Contact
(Note the lack of a trailing '/') though, what shows in the address bar is
mydomain.com/subdir/Contact/
which isn't what I want since "subdir" is showing.
For a working example on my actual site, try browsing to
colincwilliams.com/Contact/
compared with
colincwilliams.com/Contact
Do you guys have any ideas on how to make this work silently both with and without a trailing slash?
This is probably happening because of mod_dir, the module that automatically redirects the browser to the same URL with a trailing slash when a request for a directory is missing one. See the DirectorySlash directive in mod_dir.
What's happening is:
You request: mydomain.com/Contact
mod_dir doesn't touch this since /Contact isn't a directory
/Contact gets rewritten to /subdir/Contact and internally redirected
mod_dir sees that /subdir/Contact is a directory and missing the trailing slash so it redirects the browser to mydomain.com/subdir/Contact/
So now, your browser's location bar has the /subdir/ in it.
You can add DirectorySlash off in your .htaccess to turn off mod_dir from redirecting. But if you want directories to have trailing slashes, you can add a separate condition for it. Based on what you already have, we can expand it to this:
RewriteEngine on
# Has a trailing slash, don't append one when rewriting
RewriteCond %{HTTP_HOST} ^(www\.)?mydomain\.com$
RewriteCond %{REQUEST_URI} !^/subdir/
RewriteCond %{THE_REQUEST} ./\ HTTP/1\.[01]$ [OR]
# OR if it's a file that ends with one of these extensions
RewriteCond %{REQUEST_URI} \.(php|html?|jpg|gif|css)$
RewriteRule ^(.*)$ /subdir/$1 [L]
# Missing trailing slash, append one
RewriteCond %{HTTP_HOST} ^(www\.)?mydomain\.com$
RewriteCond %{REQUEST_URI} !^/subdir/
RewriteCond %{THE_REQUEST} [^/]\ HTTP/1\.[01]$
# But only if it's not a file that ends with one of these extensions
RewriteCond %{REQUEST_URI} !\.(php|html?|jpg|gif|css)$
RewriteRule ^(.*)$ /subdir/$1/ [L]
Note: I changed !^/mydomain/ to !^/subdir/, figured it was a typo because without it, mod_rewrite would loop internally indefinitely (foo -> /subdir/foo -> /subdir/subdir/foo -> /subdir/subdir/subdir/foo, etc). If I got that wrong, you can change it back.
Edit: See my additions of RewriteCond's matching against \.(php|html?|jpg|gif|css). These are the file extensions that get passed through without getting trailing slashes added. You can add/remove to suit your needs.
Jon Lin's answer was very helpful in determining what was causing the problem in my very similar setup. For completeness I will include the relevant information from his answer:
This is probably happening because of mod_dir, the module that automatically redirects the browser to the same URL with a trailing slash when a request for a directory is missing one. See the DirectorySlash directive in mod_dir.
What's happening is:
You request: mydomain.com/Contact
mod_dir doesn't touch this since /Contact isn't a directory
/Contact gets rewritten to /subdir/Contact and internally redirected
mod_dir sees that /subdir/Contact is a directory and missing the trailing slash so it redirects the browser to mydomain.com/subdir/Contact/
So now, your browser's location bar has the /subdir/ in it.
In my case, I had requests being redirected to /subdir with a few exceptions and didn't want to have to re-enable DirectorySlash for each of those exceptions.
By allowing RewriteEngine to continue after the initial redirect to /subdir, it's possible to mimic what mod_dir would be doing while also taking /subdir into account, before mod_dir gets to see it.
RewriteEngine on
RewriteCond %{REQUEST_URI} !^/(exception1|exception2|...)
RewriteRule ^(.*)$ /subdir/$1
RewriteCond %{REQUEST_FILENAME} -d
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^subdir/(.*) $1/ [R,L]
Note: You may need to be careful about allowing RewriteEngine to continue if there are further rules. Not matching the second rule will continue on to any further rules which may produce a different result.
This can be avoided by using a third rule to stop RewriteEngine processing if the redirect into /subdir has happened:
RewriteCond %{REQUEST_URI} ^/subdir
RewriteRule .* - [L]

How to do a mod_rewrite redirection to relative URL

I am trying to achieve a basic URL redirection for pretty-URLs, and due to images, CSS etc. also residing in the same path I need to make sure that if the URL is accessed without a trailing slash, it is added automatically.
This works fine if I put the absolute URL like this:
RewriteRule ^myParentDir/([A-Z0-9_-]+)$ http://www.mydomain.com/myParentDir/$1/ [R,nc,L]
But if I change this to a relative URL, so that I don't have to change it each time I move things in folders, this simply doesn't work.
These are what I tried and all do not work, or redirect me to the actual internal directory path of the server like /public_html/... :
RewriteRule ^myParentDir/([A-Z0-9_-]+)$ ./myParentDir/$1/ [R,nc,L]
RewriteRule ^myParentDir/([A-Z0-9_-]+)$ myParentDir/$1/ [R,nc,L]
What is the right way to do a URL redirection so that if the user enters something like:
http://www.mydomain.com/somedir/myVirtualParentDir/myVirtualSubdir
he gets redirected to (via HTTP 301 or 302):
http://www.mydomain.com/somedir/myVirtualParentDir/myVirtualSubdir/
Thanks.
EDIT: Adding some more details because it does not seem to be clear.
Lets say I am implementing a gallery, and I want to have pretty URLs using mod_rewrite.
So, I would like to have URLs as follows:
http://www.mydomain.com/somedir/galleries/cats
which shows thumbnails of cats, while:
http://www.mydomain.com/somedir/galleries/cats/persian
which shows the single image named persian from the cats gallery.
So in actual fact the physical directory structure and rewriting would be as follows:
http://www.domain.com/somedir/gallery.php?category=cats&image=persian
So what I want to do is put a .htaccess file in /somedir which catches all requests made to /galleries and depending on the virtual subdirectories following it, use them as placeholders in the rewriting, with 2 rewrite rules:
RewriteRule ^galleries/([A-Z0-9_-]+)/$ ./gallery.php?category=$1 [nc]
RewriteRule ^galleries/([A-Z0-9_-]+)/+([A-Z0-9_-]+)$ ./gallery.php?category=$1&image=$2 [nc]
Now the problem is that the gallery script in fact needs some CSS, Javascript and Images, located at http://www.domain.com/somedir/css, http://www.domain.com/somedir/js, and http://www.domain.com/somedir/images respectively.
I don't want to hardcode any absolute URLs, so the CSS, JS and Images will be referred to using relative URLs, (./css, ./js, ./images etc.). So I can do rewriting URLs as follows:
RewriteRule ^galleries/[A-Z0-9_-]+/css/(.*)$ ./css/$1 [nc]
The problem is that since http://www.domain.com/somedir/galleries/cats is a virtual directory, the above only works if the user types:
http://www.domain.com/somedir/galleries/cats/
If the user omits the trailing slash, mod_dir will not add it because this directory does not actually exist.
If I put a redirect rewrite with the absolute URL it works:
RewriteRule ^galleries/([A-Z0-9_-]+)$ http://www.mydomain.com/subdir/galleries/$1/ [R,nc,L]
But I don't want to have the URL prefix hardcoded because I want to be able to put this on whatever domain I want in whatever subdir I want, so I tried this:
RewriteRule ^galleries/([A-Z0-9_-]+)$ galleries/$1/ [R,nc,L]
But instead it redirects to:
http://www.mydomain.com/home/myaccount/public_html/subdir/galleries/theRest
which obviously is not what I want.
EDIT: Further clarifications
The solution I am looking for is to avoid hardcoding the domain name or folder paths in .htaccess. I am looking for a solution where if I package the .htaccess with the rest of the scripts and resources, wherever the user unzips it on his web server it works out of the box. All works like that apart from this trailing slash issue.
So any solution which involves hardcoding the parent directory or the webserver's path in .htaccess in any way is not what I am looking for.
Here's a solution straight from the Apache Documentation (under "Trailing Slash Problem"):
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^(.+[^/])$ $1/ [R]
Here's a solution that tests the REQUEST_URI for a trailing slash, then adds it:
RewriteCond %{REQUEST_URI} !(/$|\.)
RewriteRule (.+) http://www.example.com/$1/ [R=301,L]
Here's another solution that allows you to exempt certain REQUEST_URI patterns:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !example.php
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ http://domain.com/$1/ [L,R=301]
Hope these help. :)
This rule should add a trailing slash to any URL which is not a real file/directory (which is, I believe, what you need since Apache usually does the redirect automatically for existing directories).
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+[^/])$ $1/ [L,R=301]
Edit:
In order to prevent Apache from appending the path relative to the document root, you have to use RewriteBase. So, for instance, in the folder meant to be your application's root, you add the following, which overrides the physical path:
RewriteBase /
This might work:
RewriteRule ^myParentDir/[A-Z0-9_-]+$ %{REQUEST_URI}/ [NS,L,R=301]
However, I'm not sure why you think you need this at all. Just make your CSS / JS / image file rewrite rule look something like this:
RewriteRule ^galleries/([A-Za-z0-9_-]+/)*(css|js|images)/(.*)$ ./$2/$3
and everything should work just fine regardless of whether the browser requests /somedir/galleries/css/whatever.css or /somedir/galleries/cats/css/whatever.css or even /somedir/galleries/cats/persian/calico/css/whatever.css.
Ps. One problem with this rule is that it prevents you from having any galleries named "css", "js" or "images". You might want to fix that by naming those virtual directories something like ".css", ".js" and ".images", or using some other naming scheme that doesn't conflict with valid gallery names.
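To show how these suggestions could fit together, here is a rough, untested sketch of a complete .htaccess for the /somedir folder. It assumes the file is shipped in the same folder as gallery.php, css, js and images, as described in the question; the rule order and the added L flags are my own choices rather than anything the question specifies.

```apache
RewriteEngine On

# Add the missing trailing slash to a virtual gallery URL.
# REQUEST_URI is the full root-relative URL-path, so neither the
# host name nor the parent folder has to be hardcoded.
RewriteRule ^galleries/[A-Z0-9_-]+$ %{REQUEST_URI}/ [NC,L,R=301]

# Serve shared assets from any depth below galleries/
RewriteRule ^galleries/([A-Za-z0-9_-]+/)*(css|js|images)/(.*)$ $2/$3 [L]

# Category listing, then single image
RewriteRule ^galleries/([A-Z0-9_-]+)/$ gallery.php?category=$1 [NC,L]
RewriteRule ^galleries/([A-Z0-9_-]+)/+([A-Z0-9_-]+)$ gallery.php?category=$1&image=$2 [NC,L]
```

Only the external redirect uses REQUEST_URI. The internal rewrites can stay relative because Apache resolves them against the directory that contains the .htaccess file.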
I'm not sure I completely understand your problem.
The trailing slash redirection is done automatically on most Apache installations because of the mod_dir module (there is a 99% chance that you have mod_dir).
You may need to add:
DirectorySlash On
But it's the default value.
So, if you access foo/bar and bar is not a file in the foo directory but a subdirectory, then mod_dir performs the redirection to foo/bar/.
The only thing I know of that could break this is the MultiViews option, which may be trying to find a bar.php or bar.a-mime-extension-known-by-apache in the directory. So you could try to add:
Options -MultiViews
And remove all RewriteRules. If you do not get this default Apache behavior you'll maybe have to look at mod_rewrite, but it's like using a nuclear bomb to kill a spider. Nuclear bombs can be quite touchy to use well.
EDIT:
For the trailing slash problem with mod-rewrite you can check this documentation howto, stating this should work:
RewriteEngine on
RewriteBase /myParentDir/
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^(.+[^/])$ $1/ [R]

Apache mod_rewrite redirect, keep sub-directory name

I'm wondering if this is possible.
I have a single page site in which I'd like to incorporate a trailing slash with a file name that anchors to a section on that site. I'm trying to avoid using hash or hash-bangs.
For example; www.example.com/recent
Right now, I'm removing any trailing slash, but I get a 404 with /recent because it's expecting a file.
RewriteRule ^(.*)/$ /$1 [R=301,L]
Is it possible to redirect to www.example.com, but still maintain the /recent without the server thinking it's a file so I can read it client-side (php/js)? More so that I can keep using the back and forward buttons.
Thanks for any help!
TBH it is not 100% clear to me what you want. As I understand it, you want the URL www.example.com/recent to be rewritten (an internal redirect, where the URL remains unchanged in the browser) to www.example.com/index.php?page=recent (or something like that).
DirectorySlash Off
Options +FollowSymLinks -MultiViews
RewriteEngine On
RewriteBase /
# remove trailing slash if present
RewriteRule ^(.*)/$ /$1 [R=301,L]
# do not do anything for already existing files
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule .+ - [L]
# rewrite all non-existing resources to index.php
RewriteRule ^(.+)$ /index.php?page=$1 [L,QSA]
With the above rules (that need to be placed in .htaccess in website root folder) this can be achieved. Request for www.example.com/recent will be rewritten to www.example.com/index.php?page=recent so your single-page server side script knows which URL was requested. The same will be with any other non-existing resource e.g. www.example.com/hello/pink/kitten => www.example.com/index.php?page=hello/pink/kitten.
It may not be necessary to pass originally requested URI as a page parameter as you should be able to access it in PHP via $_SERVER['REQUEST_URI'] anyway.
If I misunderstood you and this is not what you want then you have to clarify your question (update it with more details, make it sound clear).