.htaccess pretty url problem (mod_rewrite) - apache

I have a directory that lists products by category. If a $_GET variable exists, it is used in a query. I would like to use "pretty URLs", so that example/a/1/b/2/c/3/d/4 becomes example/index.html?a=1&b=2&c=3&d=4
Most .htaccess examples I see only use variables to replace the $_GET values, but I can use rules like this:
Options +FollowSymlinks
RewriteEngine on
RewriteRule ([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)$ index.html?$1=$2&$3=$4&$5=$6 [L]
RewriteRule ([^/]+)/([^/]+)/([^/]+)/([^/]+)$ index.html?$1=$2&$3=$4 [L]
RewriteRule ([^/]+)/([^/]+)$ index.html?$1=$2 [L]
And it works... However, when I add longer and longer RewriteRules (like out to &$17=$18), it stops working. The last variables in the chain turn into some sort of array based on earlier values (in the example above it would build index.html?a0=a1&a3=a4)...
Is there a better way to do this?
It seems inefficient?
Is there a limit to the number of variables in .htaccess?
How long can a rule be?
Thanks!

mod_rewrite only supports up to $9 and %9.
I recommend you either modify your script to use $_SERVER['PATH_INFO'], or you use RewriteMap to invoke a script to transform the path into a querystring.
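To see why the long rules misbehave, here is a rough Python model (not mod_rewrite itself) of a substitution scheme in which only the single-digit backreferences $0 to $9 exist, so $10 is read as $1 followed by a literal 0. The function name expand_single_digit and the sample path are made up for illustration.

```python
import re

def expand_single_digit(template, groups):
    """Expand $0..$9 only; any further digit is left as literal text."""
    def repl(match):
        index = int(match.group(1))
        return groups[index] if index < len(groups) else ""
    # \d matches exactly one digit, so "$10" becomes groups[1] + "0".
    return re.sub(r"\$(\d)", repl, template)

# Ten path segments, captured by ten groups (one more than $1..$9 can address).
path = "a/1/b/2/c/3/d/4/e/5"
pattern = "/".join(["([^/]+)"] * 10) + "$"
match = re.search(pattern, path)
groups = [match.group(0)] + list(match.groups())

result = expand_single_digit("index.html?$1=$2&$9=$10", groups)
print(result)  # index.html?a=1&e=a0
```

The last value comes out as a0 rather than 5, which is the same kind of garbling described in the question.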

mod_rewrite only allows for you to have ten back-references, one of which is the whole matchable part (which ends up leaving you with only nine definable capture groups), so you're definitely limited by that.
However, to me it would make much more sense to examine the server's REQUEST_URI/SCRIPT_NAME/PATH_INFO variable in your script file, and parse that to get the key-value pairs from the URL. Then, you'd simply have this in your .htaccess:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule . index.html [L]
And then your script would take care of the rest. I do have to wonder though, if you have that many GET variables, is it actually more readable if they're all made into a "pretty" URL? After all, if you have twenty-some forward slashes in the URL, you may be equally well off just passing a normal query string at that point. It depends on your application though and how users interface with these URLs, so you may have good reason for wanting to do it this way.
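As a sketch of the "script takes care of the rest" part: the thread's script would presumably be PHP, but the pairing logic is the same in any language, so it is shown here in Python. The function name path_to_params is hypothetical, and the choice to drop a trailing key that has no value is an assumption.

```python
from urllib.parse import unquote

def path_to_params(path):
    """Turn '/a/1/b/2' into {'a': '1', 'b': '2'}."""
    # Split on slashes, skip empty segments, and decode %xx escapes.
    parts = [unquote(p) for p in path.strip("/").split("/") if p]
    # Even positions are keys, odd positions are values.
    # zip() stops at the shorter list, so a dangling key is ignored.
    return dict(zip(parts[0::2], parts[1::2]))

params = path_to_params("/a/1/b/2/c/3/d/4")
print(params)
```

Because the pairs are read in a loop rather than captured by numbered groups, the number of key/value pairs is not limited by $9.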

htaccess URL rewriting PHP for posts and offers

I've searched for a long time for an answer to my issue. I found a lot of ideas, but I can't figure it out and make it work as I expect.
So, I have a website with
"/index.php"
"/posts.php"
What I want is to rewrite the url in order to :
redirect "/index.php", "/index.php/" and "/index/" to "/"
and also :
redirect "/posts/slug-post-1/" to "offers.php" but still display "/posts/slug-post-1/" in which I would split the url to get the slug of the post.
Thanks
What I've tried is :
## To internally redirect /dir/foo to /dir/foo.php
RewriteCond %{REQUEST_FILENAME}.php -f [NC]
RewriteRule ^ %{REQUEST_URI}.php [L]
It actually displays "/offers/" and redirects to "/offers.php",
but when I add a post slug, it doesn't work.
So I've also tried :
RewriteRule ^posts/([^/]*)$ /posts.php?slug=$1 [L]
It works only with Ctrl+F5 and not with a simple refresh. I don't understand why. It is the same with a different browser and computer.
I'll give it a shot :-)
To redirect "/index.php", "/index.php/" and "/index/" to "/".
RewriteRule index(.*)$ / [NC,L]
Edit: I really urge you not to change the default behaviour of a crucial file like index.php in main folder. There will be unforeseen consequences.
And the other one:
RewriteRule posts/slug-post-1(.*)$ offers.php [QSA,NC,L]
In your example:
Your RewriteCond %{REQUEST_FILENAME}.php -f [NC] checks whether the requested filename with .php appended is an existing file.
Use RewriteCond %{REQUEST_FILENAME} !-f so that the rule only applies when the request does not map to an existing file; !-d does the same check for directories.
If you want the original query string to be carried over to the new target, use the QSA flag, so [QSA,NC,L] instead of [NC,L], for example. Note that flags are separated by commas without spaces. Put the condition before the rule.
This is really a comment - but space is limited.
It looks as if you haven't thought through what you are trying to achieve before trying to implement it.
What I want is to rewrite the url in order to : redirect "/index.php", "/index.php/" and "/index/" to "/"
I think you need to do a lot more searching. Web servers don't serve up directories; when a request path maps to a directory, they typically have a lot of machinery in place to turn that into a file, a script, or a special handler.
I suspect you want to rewrite /, /index.php/ and /index/ to /index.php
But I suspect there's more to what you are trying to achieve here - that you also want to deal with any string after the pattern you are seeking to match which is implied in your attempts to deal with /posts.
So it looks as if you are trying to implement 2 front controller patterns. Implementing a single front controller pattern appears to be a bit of a stretch for you. Implementing 2 at the same time is unlikely to turn out well and whatever you do finally implement will likely be very fragile. You're going to need a router in /index.php so that is the right place to handle the /posts/ requests.
But this is only PART of the problem you need to solve. Having your PHP code intercepting all requests is rather expensive in terms of CPU and memory (unless you have a really good caching policy implemented on your server and it is sitting behind a caching reverse proxy).
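For what the router mentioned above might do, here is a minimal sketch. It is written in Python purely to show the dispatch logic; the real one would live in /index.php. The function name route and the returned labels ("home", "post", "not_found") are hypothetical.

```python
def route(request_uri):
    """Map a request URI to a (handler, argument) pair."""
    # Ignore the query string and split the path into non-empty segments.
    path = request_uri.split("?", 1)[0]
    segments = [s for s in path.split("/") if s]

    # "/", "/index/", "/index.php" and "/index.php/" all mean the home page.
    if not segments or segments == ["index"] or segments == ["index.php"]:
        return ("home", None)

    # "/posts/<slug>/" hands the slug to the post handler.
    if segments[0] == "posts" and len(segments) == 2:
        return ("post", segments[1])

    return ("not_found", None)

print(route("/posts/slug-post-1/"))  # ('post', 'slug-post-1')
```

With something like this in place, .htaccess only needs one catch-all rule pointing at the front controller.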

Shorten URLs with mod_rewrite

I am currently trying to make a URL shortener feature for one of my projects. What I want is this: if a user visits the site with a URL that does not contain any slashes (for directories) or file extensions, the request should be redirected to a PHP script that will serve up the correct file. For example:
http://example.com/A123 would be rewritten as http://example.com/view.php?id=A123
but
http://example.com/A123/ would not be rewritten, and
http://example.com/A123.png would not be rewritten either. I have been messing with mod_rewrite for a few hours now and for the life of me I cannot get this to work...
With no way to identify the URI that needs to be shortened you need to exclude all other possibilities. This will likely require you to build a lengthy list of exclusions. Below is a starting point. Each of these conditions verifies the requesting URI does NOT match (signified by the !). When it doesn't match all conditions the rule is run.
RewriteCond %{REQUEST_URI} !^/view\.php
RewriteCond %{REQUEST_URI} !\.html$
RewriteCond %{REQUEST_URI} !/$
RewriteRule ^/?(.*)$ http://example.com/view.php?id=$1 [QSA]
The above also requires you (as you have requested) to break a standard practice rule, which is to handle directory requests without a trailing slash. You are likely to come across other issues, as the rules above break your Apache server side directory rules.
Rethinking the logic. If you had some way to identify the URL that is to be shortened it would be much easier. For example 's', http://example.com/s/A123.
RewriteCond %{REQUEST_URI} ^/s/
RewriteRule ^/?s/(.*)$ http://example.com/view.php?id=$1 [QSA]
I'm definitely no guru at this, but it's similar to what I'm trying to accomplish (see my yet unanswered question)
However, if I understand correctly, this (untested) RewriteRule may work:
RewriteRule ^([^\.\/]*)$ view.php?id=$1 [L]
The magic part is the [^\.\/]* which says: zero or more (*) instances of a character ([]) which is not ([^ ]) a period or a slash (\ escapes these characters). Use + instead of * if an empty path should not match.
Like I said, I haven't tested this, nor am I an expert, but perhaps this will help.
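Since that rule is untested, here is a quick way to check the pattern against the three URLs from the question. This only exercises the regular expression in Python, not Apache; it assumes the .htaccess case where the leading slash is already stripped before matching, and it uses + instead of * so an empty path is rejected. The helper name short_id is made up.

```python
import re

# One or more characters that are neither a period nor a slash.
SHORT_ID = re.compile(r"^([^./]+)$")

def short_id(url_path):
    """Return the short ID if the path qualifies, else None."""
    # Drop a single leading slash, as mod_rewrite does in .htaccess context.
    relative = url_path[1:] if url_path.startswith("/") else url_path
    match = SHORT_ID.match(relative)
    return match.group(1) if match else None

for candidate in ("/A123", "/A123/", "/A123.png", "/"):
    print(candidate, "->", short_id(candidate))
```

Only /A123 yields an ID; the trailing-slash, extension, and bare root requests are left alone.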

htaccess RewriteRule: filter certain words

I have
RewriteRule ^post/([^/]+)/([^/]+)$ /post/index.php?$1=$2 [NC]
which does what it's supposed to do: take a URL like post/color/black and turn that into post/index.php?color=black.
The problem is that this also affects things like the stylesheet (located at post/styles/style.css), and other files that really exist.
So the question is: if I know the exact $_GET keys that need to be translated, how can I limit the above RewriteRule to only do its magic for those specific keys, but leave everything else untouched?
Thanks.
You can use:
^post/([^/]+)/(black|white|...)$
or
^post/([^/]+)/((?!bad keywords)[^/]+)$
You can either:
Use the -f test to exclude existing resources from the rewrite process by adding the following RewriteConds before the rule:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
Or store your static resources outside /post - this would be best because you can exclude the possibility of collisions with 100% certainty, and you save Apache from looking the file up (which is only relevant with a lot of traffic though).
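If the exact keys are known, another option is to list the keys (rather than the values) in the pattern. Below is a small Python check of that idea, not an Apache config; the key names color and size are placeholders, and rewrite_post is a hypothetical helper that mimics what the rule would produce.

```python
import re

# Hypothetical whitelist of $_GET keys that may appear in the URL.
ALLOWED_KEYS = ("color", "size")

PATTERN = re.compile(
    r"^post/(%s)/([^/]+)$" % "|".join(re.escape(k) for k in ALLOWED_KEYS),
    re.IGNORECASE,  # mirrors the [NC] flag
)

def rewrite_post(path):
    """Rewrite post/<key>/<value> only when <key> is whitelisted."""
    match = PATTERN.match(path)
    if match is None:
        return path  # leave stylesheets and other real files untouched
    return "post/index.php?%s=%s" % (match.group(1), match.group(2))

print(rewrite_post("post/color/black"))
print(rewrite_post("post/styles/style.css"))
```

The equivalent RewriteRule pattern would be the same regular expression with the real key names in the alternation.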

mod_rewrite for all pages (including subfolders) on a site to a single php page

I am in the process of converting my site with many static html pages to a site driven by a database. My problem is that I don't want to lose what Google has already indexed, so I would like to rewrite requests to be sent through a php script that looks up the filepath for content in the database. My understanding is that mod_rewrite would best serve this purpose; unfortunately I have never used it, so I am a bit lost.
What I have:
www.domain.com/index.html
www.domain.com/page.html?var=123&flag=true
www.domain.com/folder/subfolder/
www.domain.com/folder/subfolder/index.html
www.domain.com/folder/subfolder/new/test.html
www.domain.com/folder/subfolder/new/test.html?var=123&flag=true
What I want (I also probably need to urlencode the path)(passing the full uri is also ok):
www.domain.com/index.php?page=/index.html OR www.domain.com/index.php?page=www.domain.com/index.html
www.domain.com/index.php?page=/page.html?var=123&flag=true
www.domain.com/index.php?page=/folder/subfolder/
www.domain.com/index.php?page=/folder/subfolder/index.html
www.domain.com/index.php?page=/folder/subfolder/new/test.html
www.domain.com/index.php?page=/folder/subfolder/new/test.html?var=123&flag=true
Here's my first go at it:
RewriteEngine On # Turn on rewriting
RewriteCond %{REQUEST_URI} .* # Do I even need this?
^(.*)$ /index.php?page=$1
Ideas? Thanks in advance :)
Edit:
So I tried implementing Ragnar's solution, but I kept getting 500 errors when I use 'RewriteCond $1' or include the '/' on the last line. I have set up a test.php file which will echo $_GET["page"] so I know that the rewrite is working correctly. So far I can get some of the correct output (but only when I am not in root), for example:
RewriteEngine on
RewriteRule ^page/(.*)$ test.php?page=$1 [L]
If I visit the page http://www.domain.com/page/test/subdirectory/page.html?var=123 it will output 'test/subdirectory/page.html' (missing the querystring, which I need). However, if I use this example:
RewriteEngine on
RewriteRule ^(.*)$ test.php?page=$1 [L]
If I visit http://www.domain.com/page/test/subdirectory/page.html?var=123 it will only output 'test.php' which is thoroughly confusing. Thoughts?
Edit #2:
It seems I've been going about this all wrong. I just wanted the ability to use full uri in my php script page. The final working solution to do what I want is the following:
Options +FollowSymlinks
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /test.php
Then in my php script, I can use $_SERVER['REQUEST_URI'] to get what I need. I knew this should have been easier than what I was trying...
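For completeness, here is roughly what "get what I need" from the request URI involves: separating the path from the query string. The sketch is in Python with the standard urllib.parse module rather than PHP, and split_request_uri is a made-up name; in PHP the input would be $_SERVER['REQUEST_URI'].

```python
from urllib.parse import urlsplit, parse_qs

def split_request_uri(request_uri):
    """Return (path, query parameters) for a URI such as '/a/b.html?x=1'."""
    parts = urlsplit(request_uri)
    # parse_qs maps each key to a list, since a key may repeat.
    return parts.path, parse_qs(parts.query)

path, query = split_request_uri(
    "/folder/subfolder/new/test.html?var=123&flag=true"
)
print(path)
print(query)
```

The path can then be used as the lookup key in the database, with the query parameters passed along unchanged.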
I would recommend you to look into the Apache URL Rewriting Guide, it contains extensive information about rewriting with examples.
If I understand you correctly, you should be able to use something like this
RewriteEngine on
RewriteCond $1
RewriteRule ^(.*)$ index.php/?page=$1 [L]
Which is very similar code to the one you posted. If you want better information, be specific about your problem.
There's no need for so many lines, it only complicates things.
All you need is 2 lines in .htaccess:
rewriteengine on
#rewriterule-: ar1 path =relative. ar2 if relative, that to rewritebase.
rewriterule !^foo/bar/index\.php$ /foo/bar/index.php
#..assert ar1 dismatches correct url
In PHP
You can output the first input of rewriterule in PHP using:
<?=$_SERVER['REQUEST_URI'];
That will give you all the power and allow you to do all things. Simply parse $_SERVER["REQUEST_URI"] manually and you can echo totally different pages depending on the value of $_SERVER["REQUEST_URI"].
Sec bugs
Note that your server may normalize the path (sometimes buggily) before rewriterule sees it. (You can't override this behavior without server privileges.) For example, if the user visits /foo//// you may only see /foo/ or /foo; if the user visits ///foo you may only see /foo; if the user visits /a/../foo you may only see /foo; and if the user visits /a//b/../../foo you may only see /foo or /a/foo [because buggy production servers treat multiple / as distinct in the context of .., no kidding].
With circuit break
Rewrite circuit breaks on cin identical to htaccess∙parentfolder∙relative interpreted rewriterule∙arg2. (First off, personally I'd disable circuit breaks to reduce rule complexity but there seems to be no way to do so.)
Circuit-break solution:
rewriteengine on
#rewriterule-: ar1 path =relative. ar2 if relative, that to rewritebase.
rewriterule ^ /foo/bar/index.php
#..circuit-breaking, so assert ar2 if absolute, htaccess parentfolder =root, else, htaccess parentfolder not in interpreted ar2.
Circuit break and rewritebase undesigned use
Circuit break needs either of:
arg2 [of rewriterule] ⇌ absolute. and htaccess parentfolder ⇌ root.
arg2 ⇌ relative. and that folder not in interpreted arg2.
So when that folder ≠ root, circuit break needs arg2 ⇌ relative. when arg2 ⇌ relative, circuit break needs⎾that folder ⇌ not in interpreted arg2⏋.
Say we need circuit break and a htaccess parentfolder that's in interpreted arg2, so we edit arg2 via rewritebase:
rewriteengine on
#rewriterule-: ar1 path =relative. ar2 if relative, that to rewritebase.
rewriterule ^ bar/index.php
#..circuit-breaking, so assert ar2 if absolute, htaccess parentfolder =root, else, htaccess parentfolder not in interpreted ar2.
rewritebase /foo
#..needs to be absolute [</> starting]. pathing [eg </../a/..> and </../..///ran/dom-f/oobar/./././../../..////////.//.>] allowed

Dealing with multiple, optional parameters with mod_rewrite

I'm using apache's mod_rewrite to make my application's URL's pretty. I have the basics of mod_rewrite down pat - several parts of my application use simple and predictable rewrites.
However, I've written a blog function, which use several different parameters.
http://www.somedomain.com/blog/
http://www.somedomain.com/blog/tag/
http://www.somedomain.com/blog/page/2/
I have the following rules in my .htaccess:
RewriteRule ^blog/ index.php?action=blog [NC]
RewriteRule ^blog/(.*) index.php?action=blog&tag=$1 [NC]
RewriteRule ^blog/page/(.*) index.php?action=blog&page=$1 [NC]
However, the rules do not work together. The computer matches the first rule, and then stops processing - even though to my way of thinking, it should not match. I'm telling the machine to match ^blog/ and it goes ahead and matches ^blog/tag/ and ^blog/page/2/ which seems wrong to me.
What's going wrong with my rules? Why are they not being evaluated in the way I'm intending?
Edit: The answer was to terminate the input using $, and re-order the rules, ever so slightly:
RewriteRule ^blog/$ index.php?action=blog [NC,L]
RewriteRule ^blog/page/(.*)$ index.php?action=blog&page=$1 [NC,L]
RewriteRule ^blog/(.*)$ index.php?action=blog&tag=$1 [NC,L]
These rules produced the desired effect.
If you don't want ^blog/ to match anything more than that, specify the end of the input in the match as well:
^blog/$
However, the way many apps do it is to just have a single page that all URLs redirect to, that then processes the rest of the URL internally in the page code. Usually most web languages have a way to get the URI of the original request, which can be parsed out to determine what "variables" were specified, even though Apache points all of them to the same page. Then via includes or some other framework/templating engine you can load the proper logic.
As another note - usually the "more general" rewrite rules are put last, so that things which match a more specific redirect will be processed first. This, coupled with the [L] option after the rule, will ensure that if a more specific rule matches, more general ones won't be evaluated.
I think you need to add an [L] flag alongside [NC] (that is, [NC,L]); otherwise processing will carry on even if a rule has already matched.
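The effect of anchoring with $ and ordering the rules from specific to general can be checked with a simplified model. This Python sketch is not how mod_rewrite is implemented; it only models "first matching rule wins", which is what [L] gives you for a single pass. The RULES list mirrors the corrected rules from the question's edit, and rewrite is a hypothetical helper.

```python
import re

# (pattern, substitution) pairs, most specific first, most general last.
RULES = [
    (r"^blog/$", r"index.php?action=blog"),
    (r"^blog/page/(.*)$", r"index.php?action=blog&page=\1"),
    (r"^blog/(.*)$", r"index.php?action=blog&tag=\1"),
]

def rewrite(path, rules=RULES):
    """Apply the first rule whose pattern matches, then stop (like [L])."""
    for pattern, substitution in rules:
        match = re.search(pattern, path, flags=re.IGNORECASE)  # like [NC]
        if match:
            return match.expand(substitution)
    return path  # no rule matched

# Without the trailing $, the first pattern also matches longer paths:
print(bool(re.search(r"^blog/", "blog/page/2/")))  # True

for url in ("blog/", "blog/page/2/", "blog/python/"):
    print(url, "->", rewrite(url))
```

Note that (.*) also captures the trailing slash, so the page value arrives as "2/" unless the pattern or the script strips it.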