.htaccess modrewrite problem - apache

I have a WordPress website and have written a plugin that requires SEO-friendly URLs. I am now stuck on the following:
http://domain.com/catalogue-category/1/ works fine, but when I replace the /1/ with the name of the category, as in http://domain.com/catalogue-category/Seating/, it does not work at all and gives me a 404 error.
It also works at /catalogue-category/?cat=Seating
My Apache rewrite rule is:
RewriteRule ^catalogue-category/^([^/]+)/$ /catalogue-category/?cat=$1 [L]
I am not as good at mod_rewrite as I am at PHP, so please bear with my ignorance and treat me as a rookie.
Looking forward to hearing from the gurus :)

Try:
RewriteRule ^catalogue-category/([^/]+)/$ /catalogue-category/?cat=$1 [L]
You were using the caret ^ twice in your regex pattern (^catalogue-category/^([^/]+)/$). The caret asserts that the match must be at the start of the string, so a second caret in the middle of the pattern can never succeed and the rule never matches.
Pattern explanations
Previous pattern:
Assert position at the beginning of the string «^»
Match the characters “catalogue-category/” literally «catalogue-category/»
Assert position at the beginning of the string «^»
Match the regular expression below and capture its match into backreference number 1 «([^/]+)»
Match any character that is NOT a “/” «[^/]+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the character “/” literally «/»
Assert position at the end of the string (or before the line break at the end of the string, if any) «$»
Suggested pattern:
Assert position at the beginning of the string «^»
Match the characters “catalogue-category/” literally «catalogue-category/»
Match the regular expression below and capture its match into backreference number 1 «([^/]+)»
Match any character that is NOT a “/” «[^/]+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the character “/” literally «/»
Assert position at the end of the string (or before the line break at the end of the string, if any) «$»
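To see the difference between the two patterns, here is a minimal sketch that runs them through Python's re module instead of Apache. The helper name rewrite and the sample path are assumptions for illustration only; mod_rewrite uses PCRE, but both engines treat these two patterns the same way.

```python
import re

# The pattern from the question: the second ^ can never match mid-string.
BROKEN = re.compile(r'^catalogue-category/^([^/]+)/$')
# The suggested pattern with the stray caret removed.
FIXED = re.compile(r'^catalogue-category/([^/]+)/$')


def rewrite(pattern, path):
    """Return the substituted URL, or None when the pattern does not match."""
    m = pattern.search(path)
    if m is None:
        return None
    return '/catalogue-category/?cat=' + m.group(1)


print(rewrite(BROKEN, 'catalogue-category/Seating/'))  # no match
print(rewrite(FIXED, 'catalogue-category/Seating/'))   # substituted URL
```

The path is written without a leading slash because that is how a RewriteRule in a .htaccess file sees it.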

Related

How to mod_rewrite query string which includes path and parameters?

My website uses a rather complicated query string parameter: Its value is a path including parameters.
For SEO (Search Engine Optimization) etc. I'm now attempting to mod_rewrite shortened versions...
example.com/path/c1/d1/e1.html?x=x1&y=y1
example.com/path/c2/d2/e2.html?x=x2&y=y2
example.com/path/c2/d3/e4.html?x=x5&y=y6
...to the currently required...
example.com/path/?param=a/b/c1/d1/e1?x=x1&y=y1
example.com/path/?param=a/b/c2/d2/e2?x=x2&y=y2
example.com/path/?param=a/b/c2/d3/e4?x=x5&y=y6
So the goal is to...
get rid of the fixed part (?param=a/b/) to shorten the address and
don't have two ? in the visible address
preserve the query string value's necessary variable path components (like c1/d1/e1 or c2/d2/e2 or c2/d3/e4)
add .html to the final part before the query string value's ? to make the folder structure appear 1 level less deep
preserve the query string value's necessary variable parameters (like ?x=x1&y=y1 or ?x=x2&y=y2 or ?x=x5&y=y6)
After hours of research and attempting lots of things that did not work, I signed up here to request your advice on how to solve this mess. Would you please be so kind to assist?
Edit / additional infos:
After the fixed string /path/?param=a/b/ it is always 3 variable path segments like c1/d1/e1.
These variable segments can contain alphanumerical characters a-z A-Z 0-9, dash symbol - and bracket symbols ( and ).
Same applies to the parameter values (x1, y1). Additionally, y1 can contain percent symbol % due to URL-encoding.
Using two question marks (one to start the query string and the other as part of the parameter value) looks invalid but works.
The actual file that handles the request is /path/index.php.
Try the following at the top of your .htaccess file, using mod_rewrite:
RewriteEngine on
# REDIRECT: /path/?param=a/b/c1/d1/e1?x=x1&y=y1
RewriteCond %{THE_REQUEST} ^[A-Z]{3,7}\s/path/(?:index\.php)?\?param=a/b/([^/]+/[^/]+/[^/]+)\?(x=[^&]+&y=[^&]+)\s
RewriteRule ^(path)/(?:index\.php)?$ /$1/%1.html?%2 [R=302,L]
# REWRITE: /path/c1/d1/e1.html?x=x1&y=y1
RewriteCond %{QUERY_STRING} ^(x=[^&]+&y=[^&]+)$
RewriteRule ^(path)/([^/]+/[^/]+/[^/]+)\.html$ $1/index.php?param=a/b/$2?%1 [L]
The first rule redirects any direct requests for the "old" URL of the form /path/?param=a/b/c1/d1/e1?x=x1&y=y1 (index.php is optional) to the "new" canonical URL of the form /path/c1/d1/e1.html?x=x1&y=y1. This is for the benefit of search engines and any third-party inbound links that cannot be updated. You must, however, have already changed all your internal links to the "new" canonical URL.
By matching against THE_REQUEST (as opposed to QUERY_STRING) we avoid a redirect loop, because the rewritten URL is never redirected. THE_REQUEST contains the first line of the HTTP request and is not changed by other rewrites. For example, THE_REQUEST would contain a string of the form:
GET /path/?param=a/b/c1/d1/e1?x=x1&y=y1 HTTP/1.1
This is currently a 302 (temporary) redirect. Only change this to a 301 (permanent) redirect once you have tested that this works OK, in order to avoid potential caching issues.
The second rule internally rewrites requests for the "new" canonical URL, e.g. /path/c1/d1/e1.html?x=x1&y=y1, back to the original/underlying URL-path, e.g. /path/index.php?param=a/b/c1/d1/e1?x=x1&y=y1. The & before the last URL parameter is intentionally unescaped (i.e. URL decoded), as discussed in comments.
The $1 and $2 backreferences refer back to the captured groups in the RewriteRule pattern, whereas the %1 and %2 backreferences refer to the captured groups in the preceding CondPattern.
Regarding "These variable segments can contain alphanumerical characters a-z A-Z 0-9, dash symbol - and bracket symbols ( and )":
I've used a more general (and shorter) subpattern in the regex above, which matches more characters but is arguably easier to read: [^/]+ matches anything except a slash, and [^&]+ matches anything except a &.
If you specifically want to match only the allowed characters, you could change the above subpatterns to [a-zA-Z0-9()%-]+, or to [\w()%-]+, which also matches underscores (_).
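The interplay of the two rules can be checked outside Apache. The following is a rough sketch in Python, not the server's actual processing: redirect_target and rewrite_target are hypothetical helper names, and the request line and path are passed in by hand where Apache would supply THE_REQUEST, the per-directory path, and QUERY_STRING.

```python
import re

# Rule 1: matched against the raw request line (THE_REQUEST).
OLD_URL = re.compile(
    r'^[A-Z]{3,7}\s/path/(?:index\.php)?\?param=a/b/'
    r'([^/]+/[^/]+/[^/]+)\?(x=[^&]+&y=[^&]+)\s'
)

# Rule 2: path pattern (RewriteRule) plus query string condition (RewriteCond).
NEW_PATH = re.compile(r'^(path)/([^/]+/[^/]+/[^/]+)\.html$')
NEW_QUERY = re.compile(r'^(x=[^&]+&y=[^&]+)$')


def redirect_target(request_line):
    """External redirect: old URL -> new canonical URL, else None."""
    m = OLD_URL.search(request_line)
    if m is None:
        return None
    return '/path/{}.html?{}'.format(m.group(1), m.group(2))


def rewrite_target(path, query):
    """Internal rewrite: new canonical URL -> index.php, else None."""
    p = NEW_PATH.search(path)
    q = NEW_QUERY.search(query)
    if p is None or q is None:
        return None
    return '{}/index.php?param=a/b/{}?{}'.format(p.group(1), p.group(2), q.group(1))


old = 'GET /path/?param=a/b/c1/d1/e1?x=x1&y=y1 HTTP/1.1'
new = 'GET /path/c1/d1/e1.html?x=x1&y=y1 HTTP/1.1'
print(redirect_target(old))   # the new canonical URL
print(redirect_target(new))   # None: the new URL is not redirected again
print(rewrite_target('path/c1/d1/e1.html', 'x=x1&y=y1'))
```

Because the redirect only looks at the original request line, the internally rewritten index.php URL never triggers it, which is what prevents the loop.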
UPDATE: x and y are just examples for parameter names, but in reality there can be lots of different parameter names.
The parameters have more than a single character. They consist of letters a-z, A-Z and in the future maybe digits 0-9. There can be more than the two parameters x and y.
In that case, maybe just match any query string (provided there is a query string).
Try the following instead:
# REDIRECT: /path/?param=a/b/c1/d1/e1?x=x1&y=y1
RewriteCond %{THE_REQUEST} ^[A-Z]{3,7}\s/path/(?:index\.php)?\?param=a/b/([^/]+/[^/]+/[^/]+)\?([^\s]+)
RewriteRule ^(path)/(?:index\.php)?$ /$1/%1.html?%2 [R=302,L]
# REWRITE: /path/c1/d1/e1.html?x=x1&y=y1
RewriteCond %{QUERY_STRING} ^(.+)$
RewriteRule ^(path)/([^/]+/[^/]+/[^/]+)\.html$ $1/index.php?param=a/b/$2?%1 [L]
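As a quick check of the generalised patterns, here is a small Python sketch. The parameter names foo, lang and page are made up for illustration, and the helper names are hypothetical; the regexes themselves are the ones from the rules above.

```python
import re

# Redirect condition: capture everything after the second "?" up to whitespace.
OLD_URL_ANY_QUERY = re.compile(
    r'^[A-Z]{3,7}\s/path/(?:index\.php)?\?param=a/b/'
    r'([^/]+/[^/]+/[^/]+)\?([^\s]+)'
)
# Rewrite condition: any non-empty query string.
ANY_QUERY = re.compile(r'^(.+)$')
NEW_PATH = re.compile(r'^(path)/([^/]+/[^/]+/[^/]+)\.html$')


def redirect_target(request_line):
    """Old URL with arbitrary parameters -> new canonical URL, else None."""
    m = OLD_URL_ANY_QUERY.search(request_line)
    if m is None:
        return None
    return '/path/{}.html?{}'.format(m.group(1), m.group(2))


def rewrite_target(path, query):
    """New canonical URL -> index.php; requires a non-empty query string."""
    p = NEW_PATH.search(path)
    q = ANY_QUERY.search(query)
    if p is None or q is None:
        return None
    return '{}/index.php?param=a/b/{}?{}'.format(p.group(1), p.group(2), q.group(1))


line = 'GET /path/?param=a/b/c2/d3/e4?foo=bar&lang=en&page=2 HTTP/1.1'
print(redirect_target(line))
print(rewrite_target('path/c2/d3/e4.html', 'foo=bar&lang=en&page=2'))
print(rewrite_target('path/c2/d3/e4.html', ''))  # None: no query string
```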

How to rewrite URLs in htaccess that end with recurring characters

I have changed web platforms and have old URLs that I cannot and do not want to match on the new platform where the old content is now living.
I have an array of old product URLs that all have '-p-' in the URL, followed by a string of numbers and ending in .html (osCommerce platform URLs).
I would like to know how to rewrite:
/x/[rest-of-url]-p-[random numbers].html
to
/x/[rest-of-url]
I would like the end result to look something like this:
http://www.shop.com/shop/versace-black-snakeskin-pony-hair-hobo-p-2214.html
redirects to:
http://www.shop.com/shop/versace-black-snakeskin-pony-hair-hobo
Does anyone know if this is doable in the htaccess file as a rewrite rule?
My managed hosting provider, BeepWeb, answered my question.
RewriteRule ^/?shop/(.*)-p-(.*)\.html$ http://www.shop.com/product/$1/ [R=302]
The first argument is the pattern matched against the requested URL-path. Each (.*) matches any sequence of characters. The second argument is the destination URL. The $1 corresponds to the first (.*), $2 would be the second (.*), and so on. The [R=302] flag makes the rule a 302 redirect (use [R=301] for a 301 redirect).
Using (.*) is essentially like using a wildcard. You can instead narrow this down by specifying which characters you want to match, as opposed to all characters (instead of (.*) you could use ([abc]*), which would match only a, b and c characters).
Also, be careful that you do not match other URLs unintentionally (i.e. you need to make sure that the pattern matches are unique to the URLs being rewritten).
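The risk of unintended matches can be illustrated with a short Python sketch. Everything here is an assumption for illustration: the helper redirect_target, the sample URL top-p-line-dress.html, the digits-only product id in the narrower pattern, and the /shop/ target taken from the question rather than from the rule above. The leading /? makes the slash optional, since a RewriteRule in a .htaccess file sees the path without its leading slash, and \. matches a literal dot.

```python
import re

# Broad pattern in the spirit of the rule above.
BROAD = re.compile(r'^/?shop/(.*)-p-(.*)\.html$')
# Narrower variant (an assumption): the product id consists of digits only.
NARROW = re.compile(r'^/?shop/(.+)-p-[0-9]+\.html$')


def redirect_target(pattern, path):
    """Return the redirect target for an old product URL, or None."""
    m = pattern.search(path)
    if m is None:
        return None
    return '/shop/' + m.group(1)


old_product = 'shop/versace-black-snakeskin-pony-hair-hobo-p-2214.html'
other_page = 'shop/top-p-line-dress.html'  # hypothetical page that is not an old product URL

print(redirect_target(NARROW, old_product))
print(redirect_target(BROAD, other_page))   # matches although it should not
print(redirect_target(NARROW, other_page))  # None
```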
If you need the source reference, see the following:
https://httpd.apache.org/docs/current/rewrite/intro.html
Thanks again to http://www.beepweb.com for their detailed response.
Hope it helps others.

condensing rewrite rules in apache

These are my rewrite rules. The first rule is problematic, and it's sort of what I'm trying to achieve. The last four work, but that's overkill.
RewriteRule ^([^/^.]+)/([0-9]{4})([a-z]{3})([0-9]{2})/p(([0-9]{1,5})-([0-9]{1,3}))?$ /test2-$1-$2-$3-$4-$5-$6
RewriteRule ^([^/^.]+)/([0-9]{4})([a-z]{3})([0-9]{2})/p([0-9]{1,5})-([0-9]{1,3})$ /test-$1-$2-$3-$4-$5-$6
RewriteRule ^([^/^.]+)/([0-9]{4})([a-z]{3})([0-9]{2})/p([0-9]{1,5})-$ /test-$1-$2-$3-$4-$5
RewriteRule ^([^/^.]+)/([0-9]{4})([a-z]{3})([0-9]{2})/p([0-9]{1,5})$ /test-$1-$2-$3-$4-$5
RewriteRule ^([^/^.]+)/([0-9]{4})([a-z]{3})([0-9]{2})/p$ /test-$1-$2-$3-$4
What I'm trying to do is use one RewriteRule instead of four to handle URLs in the following formats:
http://example.com/word/####wor##/p#-###
http://example.com/word/####wor##/p#-
http://example.com/word/####wor##/p#
http://example.com/word/####wor##/p
Is there any way I can condense my RewriteRules to just one instead of four?
I'll add more info later if necessary.
There might be more elegant ways to solve this, but the following should work:
RewriteRule ^([^/^.]+)/([0-9]{4})([a-z]{3})([0-9]{2})/p(?:([0-9]{1})|([0-9]{1})-|([0-9]{1})-([0-9]{3})){0,1}$ /test-$1-$2-$3-$4-$5-$6-$7-$8 [L,NC]
To break it down:
The first part is what you've been doing so far. It covers capture results $1 to $4.
Then comes /p, which is followed by one big group that's split into several alternations separated with the pipe character |. One of the options contained in this group should match either zero or one time(s): /p(?:...|...|...){0,1}.
The ?: at the beginning of this group signifies that the group as a whole should not be used for capturing, as you don't necessarily want to capture the end result (like when it contains a -). So we stick to capturing only the sub-groups that are relevant. This also means that when counting your capture groups to know which translates to which of your variables ($1, $2 etc.), you do not count this group.
The alternations match the 3 variations you provided for your URLs where the /p is followed by something else:
([0-9]{1})
([0-9]{1})-
([0-9]{1})-([0-9]{3})
Note that in your end result /test-$1-$2-$3-$4-$5-$6-$7-$8 not all variables might match something, depending on the pattern of your URL. Some might remain empty.
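Which variables end up empty can be checked with a small Python sketch. The sample path word/2010jan05/... and the helper substitute are made up for illustration. The second pattern, NESTED, is not part of the answer above: it is a hypothetical variant with nested optional groups and the {1,5} and {1,3} quantifiers from the question, so that the two numbers always land in $5 and $6.

```python
import re

PREFIX = r'^([^/^.]+)/([0-9]{4})([a-z]{3})([0-9]{2})/p'

# The alternation from the rule above: capture groups 5 to 8.
ALTERNATION = re.compile(
    PREFIX + r'(?:([0-9]{1})|([0-9]{1})-|([0-9]{1})-([0-9]{3})){0,1}$',
    re.IGNORECASE,  # stands in for the [NC] flag
)

# Hypothetical variant with nested optional groups: capture groups 5 and 6 only.
NESTED = re.compile(
    PREFIX + r'(?:([0-9]{1,5})(?:-([0-9]{1,3})?)?)?$',
    re.IGNORECASE,
)


def substitute(pattern, path):
    """Build /test-$1-$2-... with unmatched groups expanded to empty strings."""
    m = pattern.search(path)
    if m is None:
        return None
    groups = [g or '' for g in m.groups()]
    return '/test-' + '-'.join(groups)


for tail in ('p1-123', 'p1-', 'p1', 'p'):
    print(tail, substitute(ALTERNATION, 'word/2010jan05/' + tail))
print(substitute(NESTED, 'word/2010jan05/p12-345'))
```

With the alternation, the digits appear in $5, $6 or $7 and $8 depending on which branch matched, so the script behind /test-... has to cope with several empty slots.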

How to change URL argument from arg-1 to arg_1 before apache processes it? use .htaccess?

I have a CMS that takes url arguments to return a list of results with this structure:
website.com/argument_1/argument_2.
In order for the site to return the results the args have to have underscores.
However, the code that is generated for the url structure is
website.com/argument-1/argument-2. I need to keep this url structure, but, when someone clicks the link, I need it passed to PHP via apache with the underscores.
I hope that makes sense. Is this done with .htaccess and rewrite rules? I have never written anything like that before, so any help is appreciated. Thanks
You should definitely do this in PHP. Here is a solution that will work, but requires one pass through RewriteRule for each dash in the URL, so I don't recommend it. But regardless:
RewriteEngine On
RewriteRule ^(.*)-(.*)$ $1_$2 [QSA]
To clarify what will happen here, if the request is for http://www.example.com/argument-1/argument-2/argument-3, the RewriteRule will be run multiple times because it can only replace a single dash per pass. So the URL will be transformed something like this:
Pass 1: http://www.example.com/argument-1/argument-2/argument_3
Pass 2: http://www.example.com/argument-1/argument_2/argument_3
Pass 3: http://www.example.com/argument_1/argument_2/argument_3
As for the $1 and $2, these refer back to the parenthesized components from the regular expression. The regular expression, ^(.*)-(.*)$, breaks down like this:
^ - Match the beginning of the URI
(.*) - Match (and capture) any number of characters, this will be $1 in the replacement
- - Match a dash
(.*) - Match (and capture) any number of characters, this will be $2 in the replacement
$ - Match the end of the URI
So the first pass through, $1 will be /argument-1/argument-2/argument and $2 will be 3. The replacement then puts an underscore between the 2 of them and creates a new URI:
/argument-1/argument-2/argument_3
Then it runs again because the regular expression still matches the new URI (because it has a - in it) and $1 is /argument-1/argument and $2 is 2/argument_3. The replacement then puts an underscore between the 2 of them and creates a new URI:
/argument-1/argument_2/argument_3
Then it runs again because the regular expression still matches the new URI (because it has a - in it) and $1 is /argument and $2 is 1/argument_2/argument_3. The replacement then puts an underscore between the 2 of them and creates a new URI:
/argument_1/argument_2/argument_3
Then Apache continues with this URI since the regular expression no longer matches (because there are no more dashes).
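The passes described above can be reproduced with a short Python sketch. The function name rewrite_passes is made up, and the loop only imitates Apache re-running the rule until it no longer matches.

```python
import re

# Greedy first group: the dash that gets replaced is always the last one.
LAST_DASH = re.compile(r'^(.*)-(.*)$')


def rewrite_passes(uri):
    """Return the URI as it looks after each pass, one dash replaced per pass."""
    passes = []
    while True:
        m = LAST_DASH.search(uri)
        if m is None:
            return passes  # no dash left, the rule stops matching
        uri = m.group(1) + '_' + m.group(2)
        passes.append(uri)


for step in rewrite_passes('argument-1/argument-2/argument-3'):
    print(step)

# Doing the same in application code takes a single call:
print('argument-1/argument-2/argument-3'.replace('-', '_'))
```

The number of passes equals the number of dashes, which is why handling the replacement in the application is the cheaper option.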

How to make a simple Mod_Rewrite in .htaccess

I have problems with my shared hosting account. The Apache server I'm using scrambles UTF-8, so I can't use Hebrew/Arabic in the URL, such as www.mydomain.com/אבא.php
So I want to know how I can make it so that if someone requests the page:
www.mydomain.com/%D7%90%D7%91%D7%90.php
they get the page www.mydomain.com/D790D791D790.php (without the percent signs),
but the browser URL still shows the page they originally requested (with no redirection).
I guess this can be done using mod_rewrite in .htaccess, but I have no clue how to approach it.
Please help, guys, this is a 911 for me.
I've been thinking about this for a little while now, and unless you know how many sets there are (6 in your example) I don't think there will be a terribly elegant way to do this. One solution may be to use a rather vague rewrite:
RewriteRule ^(.*?%.*?)\.php$ foo.php?bar=$1
and then process the data in PHP, where you have a few more options and quite a bit more flexibility.
Details of the Regex:
^.*?%.*?\.php$
Options: case insensitive
Assert position at the beginning of the string «^»
Match any single character that is not a line break character «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the character “%” literally «%»
Match any single character that is not a line break character «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the character “.” literally «\.»
Match the characters “php” literally «php»
Assert position at the end of the string (or before the line break at the end of the string, if any) «$»
Created with RegexBuddy
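The processing step that the answer leaves to the application could look roughly like the following. This is a sketch in Python rather than PHP, with a made-up function name target_file. It assumes the percent signs actually reach the pattern, which is not guaranteed: mod_rewrite normally matches against the already decoded URL-path.

```python
import re

# Same pattern as the rewrite above, with a capture group around the file name.
PERCENT_NAME = re.compile(r'^(.*?%.*?)\.php$', re.IGNORECASE)


def target_file(requested):
    """Map a percent-encoded file name to the name without percent signs."""
    m = PERCENT_NAME.search(requested)
    if m is None:
        return None  # no percent sign in the name, nothing to do
    return m.group(1).replace('%', '') + '.php'


print(target_file('%D7%90%D7%91%D7%90.php'))
print(target_file('index.php'))
```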