Apache .htaccess mod_rewrite and clean urls - apache

OK. I'm building a site that is accessible through two different domains, so I can't use RewriteBase in my .htaccess. The rules below, which I use to work around this problem, work fine on my local box with clean URLs (WAMP). But the moment I use them on the live server (shared hosting, LAMP), every page I navigate to displays the home page (the one under index, I guess), even though the URL in the browser is clearly being updated.
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} ^/domain1.com/(.*)$
RewriteRule ^(.*)$ /domain1.com/index.php?q=$1 [L,QSA]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} ^/domain2.com/(.*)$
RewriteRule ^(.*)$ /domain2.com/index.php?q=$1 [L,QSA]
</IfModule>
Any help or ideas are very much appreciated.
Luke

Probably the best thing to do is to reproduce the problem on your local box and turn up RewriteLogLevel so you can see what's going on, since you usually can't change the log level on shared hosting.
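For reference, a rough sketch of what turning up rewrite logging looks like in the main server or virtual host configuration (it is not allowed in .htaccess). The log path is a placeholder, and which form applies depends on the Apache version, which the question does not state:

```apache
# Apache 2.2 form (server or virtual-host context only); placeholder log path
# RewriteLog "/var/log/apache2/rewrite.log"
# RewriteLogLevel 3

# Apache 2.4 form: RewriteLog and RewriteLogLevel no longer exist there;
# rewrite tracing goes to the regular error log via the per-module LogLevel
LogLevel alert rewrite:trace3
```

Only one of the two forms should be active, and the trace level should be lowered again afterwards because it slows the server down.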
You may be able to "simulate" the problem by doing a directory rewrite in your Apache main configuration. (The shared hosting obviously does its own rewriting before it gets to the .htaccess!) If you can't reproduce the problem, you may have to start trial-and-error debugging on the remote server. This is ugly but if it's your only option:
Use the R (redirect) flag in substitutions to send any rewritten URL back to your browser. Use TELNET (or an appropriate browser add-on) to inspect the HTTP responses.
Don't forget to escape dots in regexes!
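As a sketch of the redirect trick, assuming the rule set from the question: adding R=302 to a rule makes the server send the rewritten URL back in a Location header, so you can read what the substitution produced. This is a temporary debugging variant only, not a fix:

```apache
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# dot escaped so that it only matches a literal "."
RewriteCond %{REQUEST_URI} ^/domain1\.com/(.*)$
# R=302 turns the internal rewrite into a visible, temporary redirect
RewriteRule ^(.*)$ /domain1.com/index.php?q=$1 [R=302,L,QSA]
```

Using 302 rather than 301 keeps browsers from caching the test redirect permanently. Remove the R flag again once you have seen the value.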
As a side note, the RewriteRule pattern is matched before the RewriteConds above it. This kind of setup is probably better for performance:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
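RewriteRule ^/?domain1\.com/(.*)$ /domain1.com/index.php?q=$1 [L,QSA]
# The dot in the pattern is escaped. The optional leading slash (/?) is an addition:
# in .htaccess the pattern is matched against a path without a leading slash.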
Note that I haven't tested this.

Related

URL Rewriting in .htaccess (Apache) displaying 404-Error

This is my first question here on Stack Overflow, because in the past I always found a question that described my problem perfectly.
This time I could not find one, so I decided to ask for help myself.
My goal is to display profiles, but the URL shouldn't look like "/profile/show-profile.php?user=admin", just "/profile/admin".
So I looked it up on Google and found that URL rewriting, done by editing the .htaccess file, could be useful.
The problem is, it doesn't work. I already have some things in my .htaccess (redirecting to https and the 404 page "/pagenotfound.php"), and it seems they don't work in combination.
# https redirecting
RewriteEngine On
RewriteCond %{SERVER_PORT} !=443
RewriteRule ^(.*)$ https://int-politics.com/$1 [R=301]
# 404 page
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule (.*) /pagenotfound.php
ErrorDocument 404 /pagenotfound.php
# URL REWRITING
RewriteEngine On
RewriteBase /profile/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ show-profile.php?user=$1
When I add the URL rewriting part, the 404 page doesn't work anymore. Every page that doesn't exist just outputs the text "/pagenotfound.php" instead of showing that page.
And the URL rewriting doesn't work either.
It would be wonderful if you could help me with this problem and tell me what's wrong. Thank you very much!
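Your usage of RewriteBase is wrong. It should appear only once in such a distributed configuration file. Actually, it is not required in this example at all ... Please take a look at the documentation for details: https://httpd.apache.org/docs/current/mod/mod_rewrite.html#rewritebase
It is also vital to understand that, in a .htaccess file, the rewriting engine starts over with the rewritten URL whenever a rule has been applied, and keeps looping until no rule changes the URL any more. That is why the L and END flags are so important: L only ends the current pass through the rules, while END (Apache 2.4 and later) stops rewrite processing for the request altogether.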
That probably is what you are looking for:
RewriteEngine On
RewriteBase /
# https redirecting
RewriteCond %{SERVER_PORT} !=443
RewriteRule ^ https://int-politics.com%{REQUEST_URI} [R=301,END]
# profile rewriting
RewriteRule ^/?profile/(\w+)$ /profile/show-profile.php?user=$1 [END]
# 404 page
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^ /pagenotfound.php [END]
ErrorDocument 404 /pagenotfound.php
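The difference between L and END can be sketched with a catch-all rule. This is an illustration only; index.php and the q parameter are hypothetical and not part of the question:

```apache
RewriteEngine On
# With [L], the rule set would run again on the rewritten path "index.php",
# and this pattern would match a second time. [END] (Apache 2.4+) stops all
# further rewrite processing for the request, so the rule is applied once.
RewriteRule ^(.*)$ index.php?q=$1 [END,QSA]
```

On Apache 2.2, where END is not available, the usual guard is a preceding RewriteCond %{REQUEST_FILENAME} !-f combined with [L].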
Best is to implement such rules in the central http server's host configuration. If you do not have access to that (read: if you are using a cheap hosting provider) then you can use a distributed configuration file instead, if the consideration of such files has been enabled (see the documentation for the AllowOverride directive on that).
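A minimal sketch of what enabling .htaccess processing for rewriting looks like in the main server configuration, assuming a hypothetical document root of /var/www/example:

```apache
# Main server or virtual-host configuration; the directory path is hypothetical
<Directory "/var/www/example">
    # mod_rewrite directives in .htaccess require the FileInfo override
    AllowOverride FileInfo
    # per-directory rewriting also needs FollowSymLinks (or SymLinksIfOwnerMatch)
    Options +FollowSymLinks
</Directory>
```

If AllowOverride is set to None for the directory, the .htaccess file is ignored entirely.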

Apache htaccess cache

Using XAMPP and Windows 10 I'm experimenting with some htaccess rules.
I have a folder C:\xampp\htdocs\website and in this directory there is a .htaccess file, the full contents are;
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php/$1 [L]
To test the .htaccess was working I added the following redirect rule to the end;
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php/$1 [L]
RedirectMatch 301 ^(.*)$ http://www.example.com
This successfully redirected my whole domain, great.
I have since deleted this line from my htaccess file, however my domain is still being redirected.
I have restarted Apache numerous times, cleared my browser history, cache and tried different browsers. I have also tried viewing the site in incognito mode.
Eventually the site returns to normal, this is after numerous service restarts and history deletions.
Is this the expected behavior? Is there a more efficient way of testing the .htaccess?
Any advice is appreciated.
No need to restart Apache. Simply make the change in the .htaccess file, clear your history/cache and hard reload the page (Ctrl+F5).
You can also close and re-open the browser, see if that helps.
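A likely reason the redirect lingers is that browsers cache 301 (permanent) responses. While experimenting, a temporary status avoids that. A sketch using the same test line from the question:

```apache
# 302 is a temporary redirect, so browsers do not store it the way they store a 301
RedirectMatch 302 ^(.*)$ http://www.example.com
```

Once the rule behaves as intended, the status can be switched back to 301 if a permanent redirect is really wanted.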

htaccess rewrite .html not required / is optional

I have a working website with at least 500 pages ranked in Google.
All pages have .html at the end of the URL.
Now I want to remove .html from all pages, but let the pages in Google (with .html) keep their index.
After searching I can't find the correct answer.
I know the ? is for optional. I tried two rules one after the other, but that didn't work either.
Here is what my htaccess now is:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*).html$ find_page.php?redirect=$1&%{QUERY_STRING} [L,QSA]
I tried with adding:
RewriteRule ^(.*)$ find_page.php?redirect=$1&%{QUERY_STRING}
So if the URL contains no extension, use this rule; otherwise use the normal rule (with htaccess)
I would expect my rule to be something like this: ^(.*)(?\.html)$
So my goal is: with or without .html should work, but .php shouldn't work :-)
Why look for a complex solution?
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)\.html$ find_page.php?redirect=$1 [L,QSA]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)?$ find_page.php?redirect=$1 [L,QSA]
This rewrites all requests to that PHP script, adding the original "file name" as the parameter "redirect" and preserving all query parameters. That is what you asked for in your question.
But a warning: you can do this, and it will let you request, for example, the page "redirection" as .../redirection?somearg or as .../redirection.html?somearg. But for Google the two requests are completely different pages. This will not help you preserve any rankings when shifting to the new request scheme.
And a general side note: if you have control over the HTTP server configuration, you should always prefer to place such rules in the host configuration instead of using .htaccess style files. Such files are notoriously error-prone, make things complex, are hard to debug and really slow the server down. They should only be used in two cases: if you do not have control over the HTTP server configuration, or if you require your scripts to make dynamic changes to your rule set (which is always a very insecure thing).
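The two rules from this answer can probably be folded into one by making the extension optional. This is a sketch, not tested against the asker's site; it relies on mod_rewrite using PCRE, so the lazy quantifier is available:

```apache
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# (.+?) is lazy, so a trailing ".html" is left to the optional group;
# $1 is "page" for both "page" and "page.html"
RewriteRule ^(.+?)(?:\.html)?$ find_page.php?redirect=$1 [L,QSA]
```

The question's attempt, (?\.html), is not valid syntax; the ? has to follow the group it makes optional.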
Ok solved my problem.
RewriteEngine On
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*[^.#?\ ]+\.html([#?][^\ ]*)?\ HTTP/
RewriteRule ^(([^/]+/)*[^.]+)\.html find_page.php?redirect=$1&%{QUERY_STRING} [L,QSA]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ find_page.php?redirect=$1&%{QUERY_STRING} [L,QSA]
With this, it is checked whether the URL has the optional .html at the end. If it has, the first rule matches; otherwise processing continues to the second rule, which has no .html at the end.
Try
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ $1.html
You don't need find_page.php for redirection. As mentioned in the other answer, http://server/folder/file and http://server/folder/file.html become the same for the user but different for Google.
This does not affect PHP files, folders and other content. It just tries to add «.html» to the requested URL if it does not point to a file or folder.
I've checked: it works fine even when the user requests a URI with an anchor, like 1.html#bookmark1
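Since several answers point out that Google treats the two URL forms as different pages, one common approach is to externally redirect the old .html URLs to the extensionless ones and keep an internal rewrite for serving them. A sketch of the redirect half, under the assumption that the extensionless form should become the canonical one:

```apache
RewriteEngine On
# THE_REQUEST holds the original request line (e.g. "GET /folder/file.html HTTP/1.1"),
# so this only fires for URLs the client actually asked for, not for internal rewrites
RewriteCond %{THE_REQUEST} \s/+([^?\s]+?)\.html[\s?]
# %1 is the path without ".html"; the original query string is carried over
RewriteRule ^ /%1 [R=301,L]
```

This block would go before the internal rewrite rules. It is worth testing with R=302 first, because browsers cache 301 responses.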

RewriteCond Being Ignored?

I am trying to use mod_rewrite on a Ubuntu 12.04 server to make my URLs more readable, however I want to add an exception for images and css files.
My input URLs are in the format /controller/action, which is then rewritten to index.php?controller=controller&action=action. I want to add an exception so that if an image or CSS file is specified, the URL is not rewritten, e.g. /images/image.jpg would not be rewritten.
My .htaccess code is as follows:
RewriteEngine on
RewriteCond %{REQUEST_URI} !(\.gif|\.jpg|\.png|\.css)$ [NC]
RewriteRule ^([a-zA-z]+)/([a-zA-z]+)$ test.php?controller=$1&action=$2 [L]
RewriteRule ^([a-zA-z]+)/([a-zA-z]+)/([^/]*)$ test.php?controller=$1&action=$2&$3 [L]
My re-write code is working fine and the URLs are coming out as intended, however even if I request an image, the URL is still being re-written. It appears that my RewriteCond is being ignored, anyone any suggestions as to why this might be?
The RewriteCond only applies to your first RewriteRule; it would have to be repeated for the second rule. However, I think it is better to add a non-rewriting rule before them, to exclude anything that already exists.
# Do nothing for files which physically exist
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule .* - [L]
# your MVC rules
RewriteRule ^([a-zA-Z]+)/([a-zA-Z]+)$ test.php?controller=$1&action=$2 [L]
RewriteRule ^([a-zA-Z]+)/([a-zA-Z]+)/([^/]*)$ test.php?controller=$1&action=$2&$3 [L]
A RewriteCond is only applied to the next RewriteRule.
So you need to at least repeat the RewriteCond for your second RewriteRule.
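Repeating the condition would look roughly like this, using the rules from the question (with the character class written as [a-zA-Z], since A-z also covers a few punctuation characters):

```apache
RewriteEngine on
RewriteCond %{REQUEST_URI} !\.(gif|jpg|png|css)$ [NC]
RewriteRule ^([a-zA-Z]+)/([a-zA-Z]+)$ test.php?controller=$1&action=$2 [L]
# conditions attach only to the rule that directly follows them, so repeat it
RewriteCond %{REQUEST_URI} !\.(gif|jpg|png|css)$ [NC]
RewriteRule ^([a-zA-Z]+)/([a-zA-Z]+)/([^/]*)$ test.php?controller=$1&action=$2&$3 [L]
```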
Now, there are certainly better things to do.
For example, a usual way of doing it is to test whether the URL matches a real static resource. If all your PHP code is outside the web directory (in a libraries directory, except for index.php), then all static resources available directly under the document root can only be JS files, CSS files, or image files.
So this is the usual way of doing it:
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([a-zA-Z]+)/([a-zA-Z]+)$ test.php?controller=$1&action=$2 [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([a-zA-Z]+)/([a-zA-Z]+)/([^/]*)$ test.php?controller=$1&action=$2&$3 [L]
But this is a starting point. We could certainly find something to avoid doing 2 rules for this (maybe I'll have a look later)
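One possible way to avoid the second rule is to make the third path segment optional. This is an untested sketch, not the answerer's solution:

```apache
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
# the third segment is optional; when it is absent, $3 expands to an empty string
RewriteRule ^([a-zA-Z]+)/([a-zA-Z]+)(?:/([^/]*))?$ test.php?controller=$1&action=$2&$3 [L]
```

For a two-segment URL the query string ends with a trailing "&", which PHP ignores when parsing parameters.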

HTACCESS Mod_Rewrite Troubles

I'm having a bit of trouble with my mod_rewrite. I had this running perfectly fine on my previous LiteSpeed VPS. Now I'm using my own dedicated server running CentOS 6, so I don't know whether I have configured it incorrectly.
This is how it currently looks:
http://***.com/?pageName=FourthPage
This is how I want it to look:
http://***.com/FourthPage
This is my current .htaccess file:
RewriteEngine On
RewriteCond %{REQUEST_fileNAME} !-d
RewriteCond %{REQUEST_fileNAME} !-f
RewriteRule ^([^/\.]+)/?$ index.php?pageName=$1
I am not good with htaccess syntax. Such rules should be configured directly in the server configuration if you have access; using htaccess for such purposes is only a workaround. However:
The pattern inside the RewriteRule is wrong, especially the trailing slash ('/'). What is it meant to match? You might want to read through the documentation of that directive again. In htaccess, the pattern is not matched against the full URL but only against part of the path.
Have a try with the following:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^/]*)$ index.php?pageName=$1
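To illustrate the point about the pattern only seeing part of the path, here is a sketch contrasting the two contexts. It assumes the .htaccess file sits in the document root:

```apache
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
# In .htaccess, a request for /FourthPage is matched as "FourthPage":
# the per-directory prefix, including the leading slash, is stripped first
RewriteRule ^([^/]+)$ index.php?pageName=$1 [L]

# In server or virtual-host context the pattern sees the full path "/FourthPage",
# so the equivalent pattern there would start with a slash:
# RewriteRule ^/([^/]+)$ /index.php?pageName=$1 [L]
```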
You should also enable rewrite logging in the server. The relevant configuration options are RewriteLog (for the file) and RewriteLogLevel (well, for the level...); Apache 2.4 replaced both with LogLevel rewrite:traceN. Logging helps to understand what is going on inside the rewrite module. However, that cannot be done in htaccess either; you need access to the server's general configuration.