htaccess rewrite .html not required / is optional - apache

I have a working website with at least 500 pages ranked in Google.
All page URLs end in .html.
Now I want to remove the .html from all pages, but let the pages already in Google (with .html) keep their index.
After searching I can't find the correct answer.
I know the ? means optional. I tried two rules one after the other, but that didn't work either.
Here is my current htaccess:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*).html$ find_page.php?redirect=$1&%{QUERY_STRING} [L,QSA]
I tried adding:
RewriteRule ^(.*)$ find_page.php?redirect=$1&%{QUERY_STRING}
So if the URL contains no extension, use this rule; otherwise use the normal rule (with .html).
I would expect my rule to be something like this: ^(.*)(?\.html)$
So my goal is: with or without .html should work, but .php shouldn't work :-)

Why look for a complex solution?
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)\.html$ find_page.php?redirect=$1 [L,QSA]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)?$ find_page.php?redirect=$1 [L,QSA]
This rewrites all requests to that PHP script, passing the original "file name" as the parameter "redirect" and preserving all query parameters (QSA already appends the original query string, so &%{QUERY_STRING} is not needed). That is what you asked for in your question.
But a warning: this lets you serve a page such as "redirection" as .../redirection?somearg or .../redirection.html?somearg, but for Google the two requests are completely different pages. It will not help you preserve any rankings when shifting to the new request scheme.
And a general side note: if you have control over the HTTP server configuration, you should prefer to place such rules in the host configuration instead of .htaccess-style files. Such files are notoriously error-prone, make things complex, are hard to debug and slow the server down, since they are read on every request. They should only be used in two cases: if you do not have control over the HTTP server configuration, or if you need your scripts to change the ruleset dynamically (which is always a very insecure thing).
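For reference, here is roughly how the same two rules might look inside the host configuration. This is only a sketch: the ServerName and DocumentRoot values are placeholders, and it assumes find_page.php sits directly in the document root. In this context the pattern sees the URL path with its leading slash, and the request has not been mapped to a file yet, so the existence checks are built from DOCUMENT_ROOT and REQUEST_URI.

```apache
# Hypothetical virtual host; ServerName and DocumentRoot are placeholders.
<VirtualHost *:80>
    ServerName www.example.com
    DocumentRoot "/var/www/example"

    RewriteEngine On

    # Requests ending in .html that do not map to a real file or directory
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-f
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-d
    RewriteRule ^/(.+)\.html$ /find_page.php?redirect=$1 [L,QSA]

    # Everything else that does not map to a real file or directory
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-f
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-d
    RewriteRule ^/(.+)$ /find_page.php?redirect=$1 [L,QSA]
</VirtualHost>
```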

Ok solved my problem.
RewriteEngine On
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*[^.#?\ ]+\.html([#?][^\ ]*)?\ HTTP/
RewriteRule ^(([^/]+/)*[^.]+)\.html find_page.php?redirect=$1&%{QUERY_STRING} [L,QSA]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ find_page.php?redirect=$1&%{QUERY_STRING} [L,QSA]
This checks whether the requested page has the optional .html at the end. If it has, the first rule matches; otherwise processing continues to the second rule, which has no .html at the end.
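The "optional .html" the question was looking for can also be written as a single pattern. A minimal sketch, assuming the .htaccess file sits next to find_page.php; the extra condition that refuses .php requests is my reading of the stated goal, not something taken from the rules above.

```apache
RewriteEngine On

# Only handle requests that are not real files or directories
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Assumption: requests ending in .php should not be routed at all
RewriteCond %{REQUEST_URI} !\.php$ [NC]
# (.+?) is non-greedy, so a trailing ".html" is left to the optional group;
# $1 is "about" for both "about" and "about.html"
RewriteRule ^(.+?)(?:\.html)?$ find_page.php?redirect=$1 [L,QSA]
```

QSA carries the original query string over, so there is no need to append %{QUERY_STRING} by hand.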

Try
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ $1.html
You don't need find_page.php for this. As mentioned in the other answer, http://server/folder/file and http://server/folder/file.html become the same for the user but remain different for Google.
This does not affect PHP files, folders or other content. It just tries to add ".html" to the requested URL if it does not point to a file or folder.
I've checked: it works fine even when the user requests a URI with an anchor such as 1.html#bookmark1 (the browser does not send the #fragment to the server, so it never reaches the rule).
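One thing to watch with this approach: if the requested name has no matching .html file, the rewritten URL still fails the file check, so the rule may fire again and again until Apache gives up with an internal error. A hedged variant that only rewrites when the target exists, assuming the .htaccess file lives in the document root:

```apache
RewriteEngine On

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Only rewrite when "<requested path>.html" is an existing file.
# Assumes this .htaccess is in the document root, so $1 is the full path.
RewriteCond %{DOCUMENT_ROOT}/$1.html -f
RewriteRule ^(.+)$ $1.html [L]
```

Requests without a matching .html file then fall through to the normal 404 handling.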

Related

Adding exceptions to mod_rewrite

I got my basic redirects working with the mod_rewrite module. When requesting pages, e.g. localhost/home, it correctly redirects to localhost/index.php?page=home, but I have a problem with exceptions.
I created a folder api where I store files by category, e.g. api/auth/register.php and api/customer/create.php. I tried to make a rewrite rule that takes two parameters (in this example auth and customer), so basically it just drops the .php from the URL.
The rule that I made is the following:
RewriteRule ^api/(.*)/(.*)/?$ api/$1/$2.php [L]
After adding that line to my .htaccess, problems started to occur. For example, my .css and .js files started to be redirected. So maybe I need to make some exception for the APIs? Do you have any other ideas to improve my rewrite rules?
.htaccess
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# problems started to occur after adding the next line
RewriteRule ^api/(.*)/(.*)/?$ api/$1/$2.php [L]
RewriteRule (.*) index.php?page=$1 [L,QSA]
Thanks in advance.
A RewriteCond affects only the first RewriteRule that follows it, so you need to keep the conditions next to your initial rule and move the added rule above them (with its own conditions).
Also, your /api rule is not strict enough ((.*) will match anything, including slashes), which might not matter in your case, but still. I suggest you try this:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^api/([^/]*)/([^/]*)/?$ api/$1/$2.php [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule (.*) index.php?page=$1 [L,QSA]
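A possible refinement, sketched under the assumption that the .htaccess file is in the document root and the api folder sits directly below it: test for the target script instead of the requested path, so only URLs that map to an existing .php file are rewritten.

```apache
RewriteEngine On

# Rewrite /api/<category>/<name> only when api/<category>/<name>.php exists.
# $1 and $2 refer to the groups of the RewriteRule pattern below.
RewriteCond %{DOCUMENT_ROOT}/api/$1/$2.php -f
RewriteRule ^api/([^/]+)/([^/]+)/?$ api/$1/$2.php [L]

# Everything else that is not a real file or directory goes to index.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule (.*) index.php?page=$1 [L,QSA]
```

With this variant an unknown API name is not turned into a missing .php path; it falls through to the index.php rule instead.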

.htaccess affect other file url

I am trying to make my .htaccess rule not affect other file URLs.
Example:
My .htaccess rule is:
RewriteEngine On
RewriteRule ^([^/]*)$ /tr/hp.php?q=$1 [L]
My site URL is
mydomain.com/keywords
Everything works fine for keywords, but when I try to open robots.txt
mydomain.com/robots.txt
or
mydomain.com/images.jpg
or any other file URL,
the request is rewritten to /tr/hp.php?q=filename.
Which .htaccess RewriteRule works for both?
Try :
RewriteEngine On
#--exclude real directories
RewriteCond %{REQUEST_FILENAME} !-d
#--and files
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^/]+)/?$ /tr/hp.php?q=$1 [L]
You also have to prevent matching any request pattern that carries a dot in it:
RewriteEngine On
RewriteRule ^([^/.]*)$ /tr/hp.php?q=$1 [L]
Certainly it is possible to further refine this pattern.
Alternatively some people like to prevent rewriting requests to files or folders that physically exist in the file system using RewriteCond:
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
But that is something you have to decide upon. This does not help for example if resources like robots.txt are delivered in a dynamic manner...
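Both ideas can be combined, and names that are generated dynamically can be exempted explicitly. A sketch only; the list of exempted names is a made-up example and would need to match what the site actually serves.

```apache
RewriteEngine On

# Hypothetical list: leave these names alone even if no such file exists
# on disk, so that other rules in this file cannot capture them.
RewriteRule ^(robots\.txt|favicon\.ico|sitemap\.xml)$ - [L]

# Skip real directories and files
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
# Keywords: a single path segment without a dot, optional trailing slash
RewriteRule ^([^/.]+)/?$ /tr/hp.php?q=$1 [L]
```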

RewriteRule for several parameters

I'm currently working on a project powered by a home-made CMS and I'm having some issues with URL rewriting.
Here's the thing: the whole website is centralized around the index.php located in the main directory. Depending on what it gets through the URL, index.php displays the right page (the pages are included from an inc/pages/ folder).
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} (.+)/$
RewriteRule ^ %1 [L,R=301]
RewriteRule ^([A-Za-z0-9-]+)/?$ index.php?page=$1 [NC]
For a single parameter, it works great. http://demo.com/subscribe/ or demo.com/subscribe does pass a $_GET['page'] to the index.
For some pages, I need a second parameter, so it is not required for every page. For example, http://demo.com/edit/I-love-Stackoverflow should pass a $_GET['snd_param'].
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} (.+)/$
RewriteRule ^ %1 [L,R=301]
RewriteRule ^([A-Za-z0-9-]+)/([A-Za-z0-9-]+)?$ index.php?page=$1&snd_param=$2 [NC]
I tried this, but it isn't working well. First, if the second parameter is omitted (demo.com/edit), it doesn't work: the index doesn't receive the right $_GET['page']. Second, when the second parameter is given, it "works", but Apache believes this is a directory. My index page is then located in the fictive "I-love-Stackoverflow" folder, and loading the CSS, images and JavaScript fails.
I hope I explained my issue clearly! Thanks in advance for helping me.
You should treat the rules separately. Conditions apply only to the single RewriteRule that immediately follows them, so the second RewriteRule is not covered by them at all.
You can use something like this:
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+)$ index.php?page=$1 [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+)/([^/]+)$ index.php?page=$1&snd_param=$2 [L]
Regarding "My index page is then located in the fictive 'I-love-Stackoverflow' folder and loading the CSS, images and javascript fails":
You are probably loading your assets using relative paths. The browser only knows the unmodified URL (http://demo.com/edit/I-love-Stackoverflow in your case), so it resolves them against /edit/ and requests the wrong URLs. If you load resources with absolute paths instead of relative ones, you will be okay.
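The one-parameter and two-parameter rules can also be folded into a single rule with an optional second segment. A minimal sketch, assuming index.php treats an empty snd_param the same as a missing one:

```apache
RewriteEngine on

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# The second segment is optional; $2 expands to an empty string when only
# one segment is given (e.g. "edit"), and is "I-love-Stackoverflow" for
# "edit/I-love-Stackoverflow".
RewriteRule ^([^/]+)(?:/([^/]+))?/?$ index.php?page=$1&snd_param=$2 [L,QSA]
```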

RewriteCond Being Ignored?

I am trying to use mod_rewrite on an Ubuntu 12.04 server to make my URLs more readable; however, I want to add an exception for images and CSS files.
My input URLs are in the format /controller/action, which is then rewritten to index.php?controller=controller&action=action. I want to add an exception so that if an image or CSS file is specified, the URL is not rewritten, e.g. /images/image.jpg would not be rewritten.
My .htaccess code is as follows:
RewriteEngine on
RewriteCond %{REQUEST_URI} !(\.gif|\.jpg|\.png|\.css)$ [NC]
RewriteRule ^([a-zA-z]+)/([a-zA-z]+)$ test.php?controller=$1&action=$2 [L]
RewriteRule ^([a-zA-z]+)/([a-zA-z]+)/([^/]*)$ test.php?controller=$1&action=$2&$3 [L]
My rewrite code is working fine and the URLs are coming out as intended; however, even if I request an image, the URL is still being rewritten. It appears that my RewriteCond is being ignored. Does anyone have any suggestions as to why this might be?
The RewriteCond only applies to your first RewriteRule; it would have to be repeated for the second rule. However, I think it is better to add a non-rewriting rule before them, to exclude anything that already exists.
# Do nothing for files which physically exist
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule .* - [L]
# your MVC rules
RewriteRule ^([a-zA-Z]+)/([a-zA-Z]+)$ test.php?controller=$1&action=$2 [L]
RewriteRule ^([a-zA-Z]+)/([a-zA-Z]+)/([^/]*)$ test.php?controller=$1&action=$2&$3 [L]
A RewriteCond is only applied to the next RewriteRule.
So you need to at least repeat the RewriteCond for your second RewriteRule.
Now, there are certainly better things to do.
For example, a usual way of doing it is to test whether the URL matches a real static resource. If all your PHP code is outside the web directory (in a libraries directory, except for index.php), then all static resources available directly in the document root can only be JS files, CSS files or image files.
So this is the usual way of doing it:
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([a-zA-Z]+)/([a-zA-Z]+)$ test.php?controller=$1&action=$2 [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([a-zA-Z]+)/([a-zA-Z]+)/([^/]*)$ test.php?controller=$1&action=$2&$3 [L]
But this is a starting point. We could certainly find something to avoid doing 2 rules for this (maybe I'll have a look later)
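One way the two rules might be merged is to make the third segment optional. This is a sketch rather than a tested drop-in: when the third segment is missing, the query string simply ends in a bare "&", which PHP ignores. The character class is written [a-zA-Z] here, since [a-zA-z] also matches a few punctuation characters that sit between "Z" and "a".

```apache
RewriteEngine on

# Leave real files (images, CSS, JS) alone
RewriteCond %{REQUEST_FILENAME} !-f
# controller/action with an optional third segment passed through as $3
RewriteRule ^([a-zA-Z]+)/([a-zA-Z]+)(?:/([^/]*))?$ test.php?controller=$1&action=$2&$3 [L]
```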

howto mod_rewrite every request to index.php except real files but exclude one real directory?

Let me explain my situation:
I'm using an MVC framework (CodeIgniter), so every request gets rewritten to my index.php file, which in turn routes it to my classes and functions.
Of course, requests for real files should not be processed by scripts but sent directly from the web server to the browser.
OK, no problem, the following rewrite rules will do just that:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php/$1 [L]
I would like requests for a certain folder (let's call it 'private') to still be processed by PHP. The reason for doing this is that I would then verify whether the user is authenticated and, if so, send the contents to the browser.
Any Apache gurus in the house who can assist?
Is this an acceptable solution to the problem?
Try this rule:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d [OR]
RewriteCond $1 ^private($|/)
RewriteRule ^(.*)$ index.php/$1 [L]
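This is meant to also rewrite any URL path that is private or starts with private/, even when it is an existing folder or file. Be aware that [OR] joins a condition only with the one directly after it, so as written the !-f test still has to pass by itself, and an existing file under private/ is served directly.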
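If existing files under private/ should go through index.php as well, the private test probably has to be paired with each existence check. A sketch, assuming [OR] groups a condition only with the one right after it and that the resulting groups are then ANDed:

```apache
# (not a file OR under private) AND (not a directory OR under private)
RewriteCond %{REQUEST_FILENAME} !-f [OR]
RewriteCond $1 ^private($|/)
RewriteCond %{REQUEST_FILENAME} !-d [OR]
RewriteCond $1 ^private($|/)
RewriteRule ^(.*)$ index.php/$1 [L]
```

After the rewrite the path starts with index.php/, which no longer matches ^private and maps to an existing file, so the rule should not fire a second time.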