Generic htaccess URL rewriting - apache

Even simple .htaccess gives me headaches and I need to do the following generic mapping:
http://example.com/project/controllername/key1/val1/key2/val2/.../keyN/valN
-->
http://example.com/project/controllername.xyz?key1=val1&key2=val2...&keyN=valN
example:
http://example.com/so/pagecontroller/id/1/time/12345/title/helloworld
-->
http://example.com/so/pagecontroller.xyz?id=1&time=12345&title=helloworld
Any guidance will help! Especially with handling special chars like '/', '?' and '&' (more?) in keys and values.
EDIT: To clarify, the 'project' and 'controllername' paths are dynamic, not static. Also, the number of keys and values is not predetermined! I need help writing the .htaccess code, knowing where to place the file in the web tree, and knowing whether Apache needs restarting every time the .htaccess file is modified. Thanks!

Try these rules:
RewriteRule ^([^/]+/[^/]+)/([^/]+)/([^/]+)(/.+)?$ $1$4?$2=$3 [QSA,N]
RewriteCond $1 !.+\.xyz$
RewriteRule ^([^/]+/[^/]+)$ $1.xyz [L]
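To see what the N (next) flag does here, the loop can be traced outside Apache. The following is a rough Python sketch, not mod_rewrite itself: `simulate` is a made-up helper, it assumes the path is already relative to the directory holding the .htaccess file, and it models QSA as "new pair first, then the existing query string".

```python
import re

# Pattern of the first rule: controller path, one key, one value, optional rest.
PAIR = re.compile(r"^([^/]+/[^/]+)/([^/]+)/([^/]+)(/.+)?$")


def simulate(path: str, query: str = "") -> tuple[str, str]:
    """Approximate the three directives above on a directory-relative path."""
    while True:
        m = PAIR.match(path)
        if m is None:
            break  # no key/value pair left, so the [N] loop ends
        base, key, value = m.group(1), m.group(2), m.group(3)
        rest = m.group(4) or ""  # the optional group is None when absent
        path = base + rest  # $1$4
        pair = f"{key}={value}"  # ?$2=$3
        # QSA: the old query string is appended after the new pair.
        query = f"{pair}&{query}" if query else pair
    # Second rule: add .xyz unless the path already ends in it.
    if re.fullmatch(r"[^/]+/[^/]+", path) and not re.search(r".+\.xyz$", path):
        path += ".xyz"
    return path, query


print(simulate("so/pagecontroller/id/1/time/12345/title/helloworld"))
```

Each pass peels off the first key/value pair, so the parameters end up in reverse order in the query string. That normally does not matter to the receiving script.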

As long as you don't want to select those special characters, they should be no problem. A rule for your example might be:
RewriteRule ^so/pagecontroller/id/([0-9]+)/time/([0-9]+)/title/(.*)$ /so/pagecontroller.xyz?id=$1&time=$2&title=$3 [L]
I don't think you can have a dynamic number of variables in a rewrite rule. But what you can do is the following:
RewriteRule ^so/pagecontroller/(.*)$ /so/pagecontroller.xyz?vars=$1 [L]
Then you have a GET parameter named "vars" whose value is the rest of the path. You can then split the keys and values on the server side, e.g. with PHP's explode() function.
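The server-side split could look roughly like this. It is a sketch in Python rather than PHP, and `split_pairs` is a hypothetical helper name; it assumes the script receives the remaining path as one string with "/" as the only separator.

```python
def split_pairs(vars_value: str) -> dict[str, str]:
    """Turn 'key1/val1/key2/val2' into {'key1': 'val1', 'key2': 'val2'}."""
    # Drop empty segments caused by leading, trailing, or doubled slashes.
    segments = [s for s in vars_value.split("/") if s]
    if len(segments) % 2 != 0:
        raise ValueError("expected an even number of path segments")
    # Even positions are keys, odd positions are the matching values.
    return dict(zip(segments[0::2], segments[1::2]))


print(split_pairs("id/1/time/12345/title/helloworld"))
```

Note the limitation this makes visible: a literal "/" inside a key or value cannot be told apart from a separator, so such characters would need some agreed encoding.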

Related

Need .htaccess recipe to display rss feed dynamically

I currently use the following recipe to route .rss files to a script that produces a rss feed dynamically:
RewriteRule ^(.*).rss$ /get-feed.pl?item=$1
It works perfectly for URLs like this:
www.example.com/articles.rss
What I would like to do is change the URL to this:
www.example.com/rss/articles/
Everything I have tried doesn't work.
I just tried to put some slashes in the recipe, but I'm not an expert in these recipes, so they didn't work. Something like this didn't work: RewriteRule ^/rss/(.*)/$ /get-feed.pl?item=$1
("recipe" = regular expression / "regex" for short OR RewriteRule "pattern" from the Apache docs - At least I think that is what you are referring to? We are not baking a cake here! ;) )
That is very close, except that the URL-path that the RewriteRule pattern matches against does not start with a slash when used in a .htaccess (directory) context. So, it would need to be like this: ^rss/(.*)/$. If you had looked to see what your first rule was returning you would have seen that there was no slash prefix in the backreference that was captured (ie. the value of the item URL parameter).
However, there are other (minor) issues here...
The 2nd path segment cannot be empty, so it would be preferable to match something, rather than anything, e.g. (.+) instead of (.*). However, this should be made more restrictive, so as to match just a single path segment instead of any URL-path (which is likely to fail anyway, I suspect). Presumably /rss/foo/bar/baz/ should not match?
Again, if you only want to match a string of the form articles then make the regex more restrictive so that it only matches letters (or perhaps letters + numbers + hyphens)?
You are missing the L (last) flag on this rule, which is a problem if you have other directives that follow.
So, if you are wanting to rewrite URLs of the form www.example.com/rss/articles/ (note the trailing slash) then try the following instead:
RewriteRule ^rss/([\w-]+)/$ /get-feed.pl?item=$1 [L]
Make sure the browser cache is cleared before testing.
And this would need to go near the top of the .htaccess file, before any existing rewrites.
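To illustrate the leading-slash point, here is a small Python sketch. It only approximates the .htaccess behaviour: `htaccess_path` and `match_item` are made-up helpers, the .htaccess file is assumed to sit in the document root, and Python's re module stands in for Apache's regex engine.

```python
import re


def htaccess_path(url_path: str) -> str:
    """Approximate the string a RewriteRule sees in a root .htaccess file."""
    return url_path.removeprefix("/")  # the leading slash is stripped


def match_item(pattern: str, url_path: str):
    """Return the captured item name, or None when the pattern fails."""
    m = re.search(pattern, htaccess_path(url_path), re.ASCII)
    return m.group(1) if m else None


# The attempted pattern expects a slash that is no longer there.
print(match_item(r"^/rss/(.*)/$", "/rss/articles/"))
# The corrected, more restrictive pattern matches a single segment.
print(match_item(r"^rss/([\w-]+)/$", "/rss/articles/"))
# Deeper paths are rejected by the restrictive pattern.
print(match_item(r"^rss/([\w-]+)/$", "/rss/foo/bar/baz/"))
```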
Aside: A quick look at your original directive:
RewriteRule ^(.*).rss$ /get-feed.pl?item=$1
This is not strictly correct, as it potentially matches too much. The unescaped dot before rss matches any character. And the .* subpattern matches 0 or more characters of anything, whereas the item name must be at least one character. So, this should really be something like:
RewriteRule ^([\w-]+)\.rss$ /get-feed.pl?item=$1 [L]
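The difference between the two patterns is easy to check in any regex engine. A quick sketch using Python's re module (an assumption: it behaves like Apache's engine for these simple patterns; `item` is a made-up helper):

```python
import re

LOOSE = re.compile(r"^(.*).rss$")  # original pattern, dot not escaped
STRICT = re.compile(r"^([\w-]+)\.rss$", re.ASCII)  # corrected pattern


def item(pattern: re.Pattern, path: str):
    """Return what would end up in the item parameter, or None."""
    m = pattern.search(path)
    return m.group(1) if m else None


for path in ("articles.rss", "articlesXrss", ".rss"):
    print(path, repr(item(LOOSE, path)), repr(item(STRICT, path)))
```

The loose pattern accepts "articlesXrss" and even ".rss" (with an empty item), while the strict one only accepts "articles.rss".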

Checking if query_string has a value or else redirect it

I am learning .htaccess
My URL string is
http://abc.bcd.com/company/abc
I want to rewrite the request when the company name is abc, xyz, etc., and my rewrite rule is
RewriteRule ^/company/(.*?)$ /hhhhh/ll/test_page.html?company_letter=$1 [L,PT]
Sometimes my URL changes to
http://abc.bcd.com/company/abc?locale=en
What query string condition will accommodate both URLs and work properly?
I have tried this, but it does not help:
RewriteCond %{QUERY_STRING} ^locale=(.*)$
The rewrite should behave like this:
if(locale="something")
/hhhhh/ll/test_page.html?company_letter=abc&locale=something
else
/hhhhh/ll/test_page.html?company_letter=abc
You just need to add the QSA flag to your rule:
RewriteRule ^/?company/(.*)$ /hhhhh/ll/test_page.html?company_letter=$1 [L,QSA]
QSA (Query String Append) flag preserves existing query parameters while adding a new one.
The query string part of the incoming URL is a very specific thing. First, you should know that the pattern of a plain RewriteRule is never matched against the query string.
So, for example, you cannot write a RewriteRule pattern that checks a query string parameter value. Query string parameters can be repeated several times, can appear in any order, and are not URL-decoded (the path part of the URL is URL-decoded before mod_rewrite works on it).
This explains why RewriteCond is sometimes used on %{QUERY_STRING}: the test cannot be done in the RewriteRule pattern, but it can be done in a RewriteCond, with all the previous problems (repetition, order, URL-encoding, etc.).
But some RewriteRule flags do apply to query string management. Currently your flags are [L,PT], which can also be written [last,passthrough].
You can add the qsappend or QSA flag, which explicitly tells mod_rewrite to combine the original query string and the generated one.
So with
RewriteRule ^/company/(.*?)$ /hhhhh/ll/test_page.html?company_letter=$1 [last,passthrough,qsappend]
This:
http://abc.bcd.com/company/abc
Will go to
/hhhhh/ll/test_page.html?company_letter=abc
And this:
http://abc.bcd.com/company/abc?locale=en
Will go to
/hhhhh/ll/test_page.html?company_letter=abc&locale=en
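The merging that QSA performs can be sketched in a few lines of Python. This is only a model of the documented behaviour, with a made-up helper name `rewrite_query`; it also shows why an anchored test such as ^locale=(.*)$ is fragile when parameters can arrive in any order.

```python
import re


def rewrite_query(new_query: str, original_query: str, qsa: bool) -> str:
    """Model the resulting query string when the substitution contains '?'."""
    if qsa and original_query:
        # QSA: keep the original parameters after the generated ones.
        return f"{new_query}&{original_query}"
    # Without QSA the original query string is replaced.
    return new_query


print(rewrite_query("company_letter=abc", "", qsa=True))
print(rewrite_query("company_letter=abc", "locale=en", qsa=True))
print(rewrite_query("company_letter=abc", "locale=en", qsa=False))

# An anchored condition only matches when locale is the first parameter.
print(re.search(r"^locale=(.*)$", "locale=en") is not None)
print(re.search(r"^locale=(.*)$", "page=2&locale=en") is not None)
```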

Apache rewrite backreference variable not accessible after first use

I have come across a situation that seems odd to me. It seems that backreference variables when building apache rewrite rules get lost after the first use.
My requirement is changing an old URL pattern to conform to a new path pattern, e.g:
www.example.com/documents/newsletter/newsletter-issue-50.htm
to become
www.example.com/sites/default/newsletter/50/English/newsletter-issue-50.htm
As you can see, the new URL pattern needs to have the issue number specified in 2 places.
My rewrite rule is as follows:
RewriteRule ^documents/newsletter/newsletter-issue-(.*).htm$ http://www.example.com/sites/default/newsletter/$1/English/newsletter-issue-$1.htm [R=301,L]
When I use this rule, I still get a 404 because the resulting URL fails to replace the second "$1" with the issue number, in this case "50". What I get is
http://www.example.com/sites/default/newsletter/50/English/newsletter-issue-.htm
I have used this test site and it confirms that the second backreference variable is not being evaluated at all. I'm sure I'm missing something here, since it should be a simple rule to put in place.
Any help on this would be greatly appreciated.
Thanks.
Strangely enough, it works in the rewrite tester if you surround the group with two sets of parentheses:
RewriteRule ^documents/newsletter/newsletter-issue-((.*))[.]htm$ http://www.example.com/sites/default/newsletter/$1/English/newsletter-issue-$1.htm [R=301,L]
I have also escaped the dot before the file extension.
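For comparison, in an ordinary regex engine a captured group can be referenced as often as needed in the replacement. A minimal Python sketch (Python's re module, not mod_rewrite, so it only illustrates the regex side; `redirect_target` is a made-up helper):

```python
import re

OLD = re.compile(r"^documents/newsletter/newsletter-issue-(.*)[.]htm$")
# \g<1> is Python's spelling of the $1 backreference; it is used twice.
NEW = (
    "http://www.example.com/sites/default/newsletter/"
    r"\g<1>/English/newsletter-issue-\g<1>.htm"
)


def redirect_target(path: str):
    """Return the new URL for an old newsletter path, or None if no match."""
    if OLD.search(path) is None:
        return None
    return OLD.sub(NEW, path)


print(redirect_target("documents/newsletter/newsletter-issue-50.htm"))
```

If the live server also produces both occurrences, the missing second value would point at the test tool rather than at the rule.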

Mod_rewrite. Redirect url with Special Characters (question marks)

I have a website with joomla and I need to redirect (301) some links
They are in this form (index.php?Itemid= identifies them; links that don't have this part shouldn't be redirected)
/index.php?Itemid=544&catid=331:savona&id=82356:smembramento-dei-cantieri-baglietto-di-varazze-lopposizione-delle-maestranze&option=com_content&view=article
This should work
RewriteRule ^index.php?Itemid(.*)$ http://www.ligurianotizie.it/archive/index.php?Itemid$1 [L,R=301]
But the first ? (question mark) seems to cause problems.
In fact, if we suppose that the links are without the question mark
/index.phpItemid=544&catid=331:savona&id=82356:smembramento-dei-cantieri-baglietto-di-varazze-lopposizione-delle-maestranze&option=com_content&view=article
I would use
RewriteRule ^index.phpItemid(.*)$ http://www.ligurianotizie.it/archive/index.php?Itemid$1 [L,R=301]
and everything is perfect. But unfortunately the real links have that question mark, and I have to find a solution.
What do I have to do with that question mark?
Is the ? character escaped? Try adding the NE (noescape) flag like this:
RewriteRule ^index.php?Itemid(.*)$ http://www.ligurianotizie.it/archive/index.php?Itemid$1 [L,R=301,NE]
The part after the question mark is the query string. You can use RewriteCond to check whether it contains what you expect and, based on that, decide whether to redirect.
Note: Query String
The Pattern will not be matched against the query string. Instead, you must use a RewriteCond with the %{QUERY_STRING} variable. You can, however, create URLs in the substitution string, containing a query string part. Simply use a question mark inside the substitution string, to indicate that the following text should be re-injected into the query string. When you want to erase an existing query string, end the substitution string with just a question mark. To combine a new query string with an old one, use the [QSA] flag.
Source: http://httpd.apache.org/docs/2.0/mod/mod_rewrite.html
This should help you:
RewriteEngine On
RewriteBase /
RewriteCond %{QUERY_STRING} Itemid
RewriteRule ^index.php(.*)$ http://www.ligurianotizie.it/archive/index.php$1 [L,R=301]
Every link containing "Itemid" will be redirected, the others not.
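A rough model of what these directives decide, written in Python for illustration. `redirect_target` is a made-up helper; it assumes the path is seen without its leading slash and that, because the substitution has no query string of its own, the original one is carried over unchanged.

```python
import re

TARGET = "http://www.ligurianotizie.it/archive/index.php"


def redirect_target(path: str, query: str):
    """Return the redirect URL, or None when the request is left alone."""
    m = re.search(r"^index.php(.*)$", path)  # same pattern as the rule
    if m is None:
        return None
    # RewriteCond %{QUERY_STRING} Itemid is an unanchored substring test.
    if re.search(r"Itemid", query) is None:
        return None
    url = TARGET + m.group(1)
    return f"{url}?{query}" if query else url


print(redirect_target("index.php", "Itemid=544&view=article"))
print(redirect_target("index.php", "option=com_content"))
```

Because the condition is unanchored, a query string such as fooItemidbar=1 would also trigger the redirect; a stricter test like (^|&)Itemid= avoids that.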

question regarding specific mod_rewrite syntax

I know there are other questions that are similar to this.. but I'm really struggling with mod_rewrite syntax so I could use some help.
Basically, what I am trying to do is have the following redirect occur:
domain.com/1/ redirect to domain.com/?id=$1 (also should work for www.domain.com)
What I have so far (that isn't working):
RewriteEngine On
ReRewriteRule ^/([0-9])$ /?id=$1
A few issues.
First is terminology: if you want a request for domain.com/1/ to be served by index.php?id=1, then you are rewriting /1/ to index.php?id=1, not the other way around as you said.
Second, a simple typo: RewriteRule, not ReRewriteRule.
Third, [0-9] is the right way to match a digit, but it'll only match a single digit. If you want to handle /13 then you should match one or more instances of [0-9] by writing [0-9]+.
Fourth, the target of your rule should be the file you want to serve. / is not a file or an absolute URL; write out index.php if that's what you mean.
Fifth, you say you want to handle /1/, but your rule says that the matched request must end in a number, not a slash. If you want to accept the slash whether it's there or not, put that in the rule.
RewriteRule ^/?([0-9]+)/?$ index.php?id=$1 [L]
Does that work?
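The effect of the pattern changes can be checked quickly with Python's re module (an illustration only; `captured_id` is a made-up helper, and Apache's engine is assumed to treat these simple patterns the same way):

```python
import re

SINGLE = re.compile(r"^/([0-9])$")  # pattern from the question
FIXED = re.compile(r"^/?([0-9]+)/?$")  # pattern suggested above


def captured_id(pattern: re.Pattern, path: str):
    """Return the value that would be passed as id, or None."""
    m = pattern.search(path)
    return m.group(1) if m else None


for path in ("/1", "/1/", "/13", "13/"):
    print(path, captured_id(SINGLE, path), captured_id(FIXED, path))
```

Only "/1" gets through the original pattern, while the fixed one accepts all four forms, with or without the leading and trailing slash.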
You've three issues:
RewriteRule is misspelt, as pointed out by Michael; you need to worry about the trailing slash; and you need to stop processing rules when you've found the match:
RewriteRule ^/(\d+)/?$ /?id=$1 [L]
You have misspelled RewriteRule. Otherwise, I think your syntax looks correct.
RewriteEngine On
ReRewriteRule ^/([A-Za-z0-9]+)$ /?id=$1
--^^^---------
Actually, you should probably remove the /:
RewriteEngine On
RewriteRule ^([A-Za-z0-9$_.+!*'(),-]+)$ /?id=$1
------^^^---------
EDIT Added the +. Look at all the answers here. You need a composite of them, including the + and the [L], in addition to what I have here.
EDIT2 Also edited to include alpha characters in the id.
EDIT3 Added special characters to regex. These should be valid in a URL, but it's unusual to find them there.