Mod-Rewrite rules are breaking 404 routing - apache

I am using the following mod-rewrite in my .htaccess file:
RewriteRule ^$ pages/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ pages/$1 [L]
The intention is to hide the subdirectory called /pages/ from displaying in the URL.
So this: http://mysite.com/pages/home.html
Will look like this: http://mysite.com/home.html
It works but there are some unintended consequences.
As a direct result of the .htaccess code I posted above, my 404 routing is no longer working at all. Anything that should trigger a 404 error page is instead generating a 500 Server Error.
How to fix?
EDIT:
As implied above, it does not matter if a custom 404 page is defined in the .htaccess or not. Without it, or a bad path to the error page, the server should still route to its default 404 page, and not give a 500 Server Error.
Surely, there must be a standard way to suppress sections of a URL without breaking the normal routing of 404 errors. From my online research it seems that my method above commonly breaks the 404 routing, and yet so far, I've seen no applicable solution. (This is not a Wordpress installation; just static HTML content)
EDIT 2:
Since I'm only wanting to suppress the one directory from the URL, I never mentioned that I also have other files & directories which are siblings to /pages/ that cannot be pointed at /pages/, such as /graphics/, /includes/, /css/, /cgi-bin/, robots.txt, favicon.ico, etc.
Maybe this is all an exercise in futility or more trouble than it's worth?
Looking for a definitive answer either way.

The following config looks for your static pages inside pages/ and, if they are found, serves them. This shouldn't break 404 handling.
Put it in the .htaccess file in the root folder of your site:
RewriteEngine On
RewriteCond %{DOCUMENT_ROOT}/pages/%{REQUEST_URI} -f [OR]
RewriteCond %{DOCUMENT_ROOT}/pages/%{REQUEST_URI} -d
RewriteRule ^(.*)$ /pages/$1
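For context, here is a likely explanation of the original 500. It is an assumption based on how per-directory rewrites are re-processed, not something confirmed in the question. After each rewrite in .htaccess, Apache runs the rules again on the new path. A missing file never satisfies -f or -d, so the old rule keeps prepending pages/ until the internal redirect limit is reached. The trace uses a hypothetical request for /missing.html:

```apache
# Original rule, shown only to trace the loop (do not use as-is)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ pages/$1 [L]
# pass 1: missing.html             -> pages/missing.html
# pass 2: pages/missing.html       -> pages/pages/missing.html
# pass 3: pages/pages/missing.html -> pages/pages/pages/missing.html
# ... repeats until Apache gives up and returns a 500
```

The config above avoids this because it only rewrites when the target actually exists under pages/; otherwise the request falls through to the normal 404.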

This should achieve what you are trying to do.
RewriteEngine On
RewriteCond %{REQUEST_URI} ^/([a-zA-Z0-9\_\-]+)\.html$
RewriteRule (.*) /pages/$1 [L]

Thank you to @Kamil Šrot for getting the closest working solution. However, I needed to add another test ( -d ) to see if the requesting URI is a directory.
This is working great and the 404 error page is again routing properly.
RewriteEngine On
RewriteBase /
RewriteCond %{DOCUMENT_ROOT}/pages/%{REQUEST_URI} -f [OR]
RewriteCond %{DOCUMENT_ROOT}/pages/%{REQUEST_URI} -d
RewriteRule ^(.*)$ /pages/$1
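A slightly tightened variant, offered as a sketch rather than a tested drop-in. It assumes the .htaccess file sits in the document root. The guard rule for pages/, the [L] flag, and the use of $1 instead of %{REQUEST_URI} (which avoids the double slash in the tested path) are additions here, not part of the rules above:

```apache
RewriteEngine On
RewriteBase /

# Requests already under pages/ are left alone
RewriteRule ^pages/ - [L]

# Rewrite only when the requested path exists inside pages/
RewriteCond %{DOCUMENT_ROOT}/pages/$1 -f [OR]
RewriteCond %{DOCUMENT_ROOT}/pages/$1 -d
RewriteRule ^(.*)$ pages/$1 [L]
```

Siblings such as /css/ or robots.txt are untouched because they do not exist under pages/, and unknown URLs still reach the normal 404.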

How about adding an error page direction to your htaccess file to handle the 404 page:
ErrorDocument 404 /path/to/your/404.html

Related

Apache mod_rewrite - unwanted redirect instead of rewrite

I have an issue with mod_rewrite and I can't seem to solve it. I stripped the example down to the bare bones and I don't understand why a specific rule forces my browser to redirect instead of rewrite:
RewriteEngine on
#if request is for a physical-file OR for one of the language paths - skip (return as-is)
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -f [OR]
RewriteCond %{REQUEST_URI} ^/de [OR]
RewriteCond %{REQUEST_URI} ^/en-US
RewriteRule ^ - [L]
#otherwise: rewrite to en-US folder
RewriteRule ^(.*)$ /en-US/$1 [NC,L,QSA]
I read the documentation very carefully and it seems like this should actually rewrite every call, so https://example.com/fuBar.html should actually retrieve the file /en-US/fuBar.html from my server - the users browser shouldn't know about it.
What's really happening is that for some reason the browser is redirected to https://example.com/en-US/fuBar.html. While this does display the correct content, it's just not what I want or what I thought this RewriteRule should do. What am I doing wrong?
*add - the .htaccess of the subfolders de and en-US:
RewriteEngine On
# If an existing asset or directory is requested go to it as it is
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -f [OR]
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -d
RewriteRule ^ - [L]
# If the requested resource doesn't exist, use index.html
RewriteRule ^ /index.html
There's nothing in the code you've posted that would trigger an external "redirect".
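For comparison, here is a sketch of what would produce an external redirect. Neither commented-out form appears in the question; the R flag and the absolute URL (with the hypothetical host other.example.com) are shown only as illustrations, and only one rule would be active at a time:

```apache
# Internal rewrite: the browser keeps showing /fuBar.html
RewriteRule ^(.*)$ /en-US/$1 [L,QSA]

# External redirect: the R flag sends a 302 and the address bar changes
# RewriteRule ^(.*)$ /en-US/$1 [R=302,L,QSA]

# An absolute URL pointing at another host also results in a redirect
# RewriteRule ^(.*)$ https://other.example.com/en-US/$1 [L]
```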
Make sure you have cleared your browser (and any intermediary) cache(s) to ensure you are not seeing an earlier/erroneous 301 (permanent) redirect. (301 redirects are cached persistently by the browser.)
Check the "network traffic" in the browser's developer tools to see the precise nature of this redirect: what it redirects from/to, as well as the 3xx HTTP status code of the redirect (if indeed this is an external redirect).
It would seem the front-end (JavaScript/Angular) is manipulating the URL in the address bar (there is no redirect). From comments:
Actually there was no redirect happening at all! Rather since I set <base href="/en-US"> somehow my frontend (Angular) seems to have outsmarted me, manipulating the address without me realizing it. Turns out I don't even need to change the base href, I just need the rewrites.

1and1 .htaccess directives: explanation needed

I have no experience with modifying .htaccess file.
I'm trying to add custom error pages to my website, and I got the following template from the hosting provider (1and1). I know how to add the pages, but I would like to understand line by line what the code is doing.
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule (.*) /errordocument.html
ErrorDocument 400 /errordocument.html
RemoveType x-mapp-php4 .html
Thank you in advance for your help!
Let's start with line one:
RewriteEngine On - quite a simple one, it says it in the name: it enables the rewrite engine to allow us to do many things. (I won't go into detail on what all these things are.)
RewriteCond %{REQUEST_FILENAME} !-f and RewriteCond %{REQUEST_FILENAME} !-d check that the request does not map to an existing file or an existing directory. If both conditions are met, the RewriteRule that follows is applied; if not, that rule is skipped.
RewriteRule (.*) /errordocument.html - This tells the server that, if the above conditions are met, it should serve the error page named errordocument.html. This is an internal rewrite rather than an external redirect, so the address in the browser doesn't change.
ErrorDocument 400 /errordocument.html - Simply put, this tells the server to display the errordocument.html page when a 400 (Bad Request) error occurs.
Finally RemoveType x-mapp-php4 .html - This does not change your URLs. RemoveType removes MIME-type associations for the listed extensions; here it is presumably meant to stop .html files from being handled as PHP 4 (x-mapp-php4) on 1and1, so they are served as plain HTML.
For more in-depth information on each of these and how extensively they can be used, take a look at the Apache documentation.
I hope this helps you to understand what is going on a bit better.
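One thing worth knowing about the template: because the RewriteRule serves an existing page, missing URLs will most likely be answered with a 200 status rather than a 404. If a real 404 status matters, one alternative is to let Apache raise the error and map it with ErrorDocument. This is only a sketch, and the 404 line is an addition that is not present in the 1and1 template:

```apache
# Let Apache generate the 404 itself and show the custom page for it
ErrorDocument 404 /errordocument.html
# Same page for malformed requests, as in the original template
ErrorDocument 400 /errordocument.html
```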

problems getting mod_rewrite working

I've not done much with mod_rewrite, but I can't seem to get anywhere with this. I'm wondering if perhaps it is not enabled on my server (even though my host says it is).
I have the following url: http://dev.website.com/folder1/translate/horse and I want that to redirect to: http://dev.website.com/folder1/translate.php?word=horse
My .htaccess starts with RewriteEngine on and I've tried various attempts to get it working, but no matter what, it just shows my home page (the default 404 redirect).
Things I've tried:
RewriteRule ^translate/.*$ translate.php?word=$1
RewriteRule ^translate translate.php
and some other things I don't remember, but I can't get anything to work.
The .htaccess file I am using is located in folder1. I have also tried putting random characters in the file to make it throw an error, and it does.
Anything I'm missing? How would I properly create this redirect?
As per request, this is my file structure.
I have the domain www.website.com, and a subdomain dev.website.com. The subdomain is set so that it redirects to www.website.com/dev. So, in this case, dev.website.com/folder1/translate.php = www.website.com/dev/folder1/translate.php. I am not sure how that masking is done, as it is accomplished via my web host's cpanel.
Your pattern has no capture group (parentheses), so $1 is empty. This should work:
In DOCUMENT_ROOT/.htaccess:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^folder1/translate/(.*)$ /folder1/translate.php?word=$1 [L,QSA]
In DOCUMENT_ROOT/folder1/.htaccess:
RewriteEngine On
RewriteBase /folder1/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^translate/(.*)$ translate.php?word=$1 [L,QSA]
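To make the capture-group point concrete, here is a hedged variant for folder1/.htaccess. The [^/]+ pattern assumes the word never contains a slash; that restriction is not from the question:

```apache
RewriteEngine On
RewriteBase /folder1/

# No parentheses: $1 is empty, so word= receives nothing
# RewriteRule ^translate/.*$ translate.php?word=$1

# Parentheses capture the word: /folder1/translate/horse -> translate.php?word=horse
RewriteRule ^translate/([^/]+)/?$ translate.php?word=$1 [L,QSA]
```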

Zend Framework setting up the htaccess file

I have been using the Zend Framework for years but have realised some crucial problems with our error handling that we are now fixing.
(I posted a different question here: Why my site is always using the ErrorController for all types of errors irrespective of HTTP Status code? explaining the story there).
My question here is a quick one. What does a common .htaccess file of Zend Framework look like?
According to the latest ZF documentation,
SetEnv APPLICATION_ENV development
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} -s [OR]
RewriteCond %{REQUEST_FILENAME} -l [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^.*$ - [NC,L]
RewriteRule ^.*$ index.php [NC,L]
However, the above is new to me - can someone explain what it does exactly?
My current .htaccess file has a lot of 301 redirect code but for the purpose of this post I'll only paste the relevant information here:
ErrorDocument 404 http://www.mydomain.com/pagenotfound/
ErrorDocument 503 http://www.mydomain.com/service-unavailable/
RewriteCond %{REQUEST_URI} !^/liveagent
RewriteCond %{REQUEST_URI} !^/blog
RewriteRule !\.(js|ico|gif|GIF|jpg|JPG|jpeg|png|PNG|pdf|css|html|xml|swf|php|mp3|mp4|webm|ogv|f4v|flv|txt|wsdl|css3|ttf|eot|svg|woff)$ index.php
The above has been working fine for us, and basically disallows the "liveagent" and "blog" (Wordpress) directories from running with Zend, but I realise I now need to make the following change:
ErrorDocument 404 definitely has to be removed from the code, as Zend Framework should handle all errors. However, when I remove this, going to a URL like www.mydomain.com/this-does-not-exist.php results in a 404 error standard Apache page - it does not load the ZF or the ErrorController. This is because of the "php" exclusion in the above RewriteRule. I do not simply want to remove this since we sometimes want to be able to access php files on the root, such as a separate "holding.php" file which we use for putting the site on maintenance mode.
What is the standard practice? Should I remove the php extension? However this will not solve other 404's like
www.mydomain.com/this-does-not-exist.css
which is also an exclusion (i.e. CSS) in the above RewriteRule.
Therefore, should I completely change the above to Zend's new code for .htaccess as I mentioned above?
If so, I'm a sort of beginner at htaccess - how can I modify that .htaccess code to allow CSS, JS, video files etc. and the blog and liveagent directories to be excluded from the Zend Framework?
I'd switch to the standard ZF rewrite rules instead of the one you have which uses a long regex to redirect to index.php.
Here is an explanation of what the standard .htaccess rules do:
# The request is for a regular file with size > 0
RewriteCond %{REQUEST_FILENAME} -s [OR]
# The request is for a file that is a symlink
RewriteCond %{REQUEST_FILENAME} -l [OR]
# The request is for a directory that exists
RewriteCond %{REQUEST_FILENAME} -d
# If any of the above conditions is true, simply handle the request as-is
RewriteRule ^.*$ - [NC,L]
# If none of the above match, rewrite to index.php
RewriteRule ^.*$ index.php [NC,L]
These default ZF rules don't prevent you from accessing existing php files or any other files that are accessible from your document root. If the file requested exists, then the request for that file is served as is. If the file requested does not exist, then the request is forwarded to index.php
Once the request is forwarded to ZF, if there is no matching route, then the ZF ErrorHandler is called and a 404 page (from ZF) is served.
Using the stock ZF rules won't prevent you from having the desired behavior in your application and server settings, and should be a bit more efficient than the regex you currently have. The only thing that will really change is that requests for files that don't exist will now be handled by ZF's error handler and no longer by Apache.
Hopefully that answered your question, if not feel free to comment for clarification.
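The question also asked how to keep the blog and liveagent directories out of ZF. One possible sketch, assuming this .htaccess is in the document root and those directories sit directly beneath it, is a pass-through rule ahead of the stock rules:

```apache
SetEnv APPLICATION_ENV development
RewriteEngine On

# Leave /blog and /liveagent to their own handling
RewriteRule ^(blog|liveagent)(/|$) - [L]

# Existing files, symlinks and directories are served as-is
RewriteCond %{REQUEST_FILENAME} -s [OR]
RewriteCond %{REQUEST_FILENAME} -l [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^.*$ - [NC,L]

# Everything else goes to the ZF front controller
RewriteRule ^.*$ index.php [NC,L]
```

CSS, JS and video files need no special rule here: if they exist they are served directly, and if not, the request reaches index.php.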

How do I ignore a directory in mod_rewrite?

I'm trying to have the modrewrite rules skip the directory vip. I've tried a number of things as you can see below, but to no avail.
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
#RewriteRule ^vip$ - [PT]
RewriteRule ^vip/.$ - [PT]
#RewriteCond %{REQUEST_URI} !/vip
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
How do I get modrewrite to entirely ignore the /vip/ directory so that all requests pass directly to the folder?
Update:
As points of clarity:
It's hosted on Dreamhost
The folders are within a wordpress directory
the /vip/ folder contains a WebDAV .htaccess etc. (though I don't think this is important)
Try putting this before any other rules.
RewriteRule ^vip - [L,NC]
It will match any URI beginning vip.
The - means do nothing.
The L means this should be last rule; ignore everything following.
The NC means no-case (so "VIP" is also matched).
Note that it matches anything beginning vip. The expression ^vip$ would match vip but not vip/ or vip/index.html. The $ may have been your downfall. If you really want to do it right, you might want to go with ^vip(/|$) so you don't match vip-page.html
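Put together with the stock WordPress block from the question, the stricter pattern might look like the sketch below. It assumes the .htaccess file is in the site root:

```apache
RewriteEngine On
RewriteBase /

# Matches vip, vip/ and vip/index.html, but not vip-page.html
RewriteRule ^vip(/|$) - [L,NC]

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
```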
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
This says if it's an existing file or a directory don't touch it. You should be able to access site.com/vip and no rewrite rule should take place.
The code you are adding, and all answers that are providing Rewrite rules/conditions are useless! The default WordPress code already does everything that you should need it to:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
Those lines say "if it's NOT an existing file (-f) or directory (-d), pass it along to WordPress." Adding additional rules, no matter how specific or good they are, is redundant--you should already be covered by the WordPress rules!
So why aren't they working???
The .htaccess in the vip directory is throwing an error. The exact same thing happens if you password protect a directory.
Here is the solution:
ErrorDocument 401 /err.txt
ErrorDocument 403 /err.txt
Insert those lines before the WordPress code, and then create /err.txt. This way, when it comes upon your WebDAV (or password protected directory) and fails, it will go to that file, and get caught by the existing default WordPress condition (RewriteCond %{REQUEST_FILENAME} !-f).
In summary, the final solution is:
ErrorDocument 401 /misc/myerror.html
ErrorDocument 403 /misc/myerror.html
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
I posted more on my site about the cause of this problem in my specific situation, involving WordPress and WebDAV on Dreamhost, which I expect many others to be having.
You mentioned you already have a .htaccess file in the directory you want to ignore - you can use
RewriteEngine off
In that .htaccess to stop use of mod_rewrite (not sure if you're using mod_rewrite in that folder, if you are then that won't help since you can't turn it off).
Try replacing this part of your code:
RewriteRule ^vip/.$ - [PT]
...with the following:
RewriteCond %{REQUEST_URI} !(vip) [NC]
That should fix things up.
RewriteCond %{REQUEST_URI} !^/pilot/
is the way to do that.
In my case, the answer by brentonstrine (and I see matdumsa also had the same idea) was the right one... I wanted to up-vote their answers, but being new here, I have no "reputation", so I have to write a full answer, in order to emphasize what I think is the real key here.
Several of these answers would successfully stop the WordPress index.php from being used ... but in many cases, the reason for doing this is that there is a real directory with real pages in it that you want to display directly, and the
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
lines already take care of that, so most of those solutions are a distraction in a case like mine.
The key was brentonstrine's insight that the error was a secondary effect, caused by the password-protection inside the directory I was trying to display directly. By putting in the
ErrorDocument 401 /err.txt
ErrorDocument 403 /err.txt
lines and creating error pages (I actually created err401.html and err403.html and made more informative error messages) I stopped the 404 response being generated when it couldn't find any page to display for 401 Authentication Required, and then the folder worked as expected... showing an apache login dialog, then the contents of the folder, or on failure, my error 401 page.
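With separate pages as described, the two lines might look like this. The sketch assumes both files live in the document root and that the lines go above the WordPress block:

```apache
# File names taken from the description above; both files must exist
ErrorDocument 401 /err401.html
ErrorDocument 403 /err403.html
```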
I’ve had the same issue using WordPress and found that it is linked to not having a proper handler for 401 and 403 errors.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
These conditions are already supposed not to rewrite the url of existing folders but they don’t do their job for password protected folders. In my case, adding the following two lines to my root .htaccess fixed the problem:
ErrorDocument 401 /misc/myerror.html
ErrorDocument 403 /misc/myerror.html
Of course, you need to create the /misc/myerror.html file.
This works ...
RewriteRule ^vip - [L,NC]
But ensure it is the first rule after
RewriteEngine on
i.e.
ErrorDocument 404 /page-not-found.html
RewriteEngine on
RewriteRule ^vip - [L,NC]
AddType application/x-httpd-php .html .htm
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
etc
I'm not sure if I understand your objective, but the following might do what you're after?
RewriteRule ^/vip/(.*)$ /$1?%{QUERY_STRING} [L]
This will cause a URL such as http://www.example.com/vip/fred.html to be rewritten without the /vip.
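If this rule goes in a root .htaccess file rather than the server config, a variant without the leading slash is probably needed, since per-directory patterns are matched against the path with the leading slash removed. This is a sketch under that assumption. The query string is passed along by default, so it is not appended explicitly here:

```apache
RewriteEngine On
# /vip/fred.html?x=1 is served from /fred.html?x=1
RewriteRule ^vip/(.*)$ /$1 [L]
```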