Condensing rewrite rules in Apache

These are my rewrite rules. The first rule is problematic, and it's sort of what I'm trying to achieve. The last four work, but that's overkill.
RewriteRule ^([^/^.]+)/([0-9]{4})([a-z]{3})([0-9]{2})/p(([0-9]{1,5})-([0-9]{1,3}))?$ /test2-$1-$2-$3-$4-$5-$6
RewriteRule ^([^/^.]+)/([0-9]{4})([a-z]{3})([0-9]{2})/p([0-9]{1,5})-([0-9]{1,3})$ /test-$1-$2-$3-$4-$5-$6
RewriteRule ^([^/^.]+)/([0-9]{4})([a-z]{3})([0-9]{2})/p([0-9]{1,5})-$ /test-$1-$2-$3-$4-$5
RewriteRule ^([^/^.]+)/([0-9]{4})([a-z]{3})([0-9]{2})/p([0-9]{1,5})$ /test-$1-$2-$3-$4-$5
RewriteRule ^([^/^.]+)/([0-9]{4})([a-z]{3})([0-9]{2})/p$ /test-$1-$2-$3-$4
What I'm trying to do is use one RewriteRule instead of four to handle URLs in the following formats:
http://example.com/word/####wor##/p#-###
http://example.com/word/####wor##/p#-
http://example.com/word/####wor##/p#
http://example.com/word/####wor##/p
Is there any way I can condense my RewriteRules to just one instead of four?
I'll add more info later if necessary.

There might be more elegant ways to solve this, but the following should work:
RewriteRule ^([^/^.]+)/([0-9]{4})([a-z]{3})([0-9]{2})/p(?:([0-9]{1})|([0-9]{1})-|([0-9]{1})-([0-9]{3})){0,1}$ /test-$1-$2-$3-$4-$5-$6-$7-$8 [L,NC]
To break it down:
The first part is what you've been doing so far. It covers capture results $1 to $4.
Then comes /p, which is followed by one big group that's split into several alternatives separated by the pipe character |. One of the alternatives in this group should match either zero times or once: /p(?:...|...|...){0,1}.
The ?: at the beginning of this group makes it non-capturing: you don't necessarily want to capture the group's overall match (for example when it contains a -), so only the relevant sub-groups capture. This also means that you do not count this group when working out which capture group maps to which of your variables ($1, $2, etc.).
The alternatives match the three variations you provided for your URLs where the /p is followed by something else:
([0-9]{1})
([0-9]{1})-
([0-9]{1})-([0-9]{3})
Note that in your end result /test-$1-$2-$3-$4-$5-$6-$7-$8 not all variables might match something, depending on the pattern of your URL. Some might remain empty.
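As a rough way to see which capture groups end up filled for each URL shape, the pattern can be tried outside Apache. The sketch below uses Python's re module as a stand-in for mod_rewrite's regex engine (for this particular pattern the two should behave the same). The helper name rewrite and the sample paths are made up for illustration, and unmatched groups are rendered as empty strings, mimicking how Apache expands an unset $N.

```python
import re

# The pattern from the answer above; re.IGNORECASE stands in for the NC flag.
PATTERN = re.compile(
    r"^([^/^.]+)/([0-9]{4})([a-z]{3})([0-9]{2})"
    r"/p(?:([0-9]{1})|([0-9]{1})-|([0-9]{1})-([0-9]{3})){0,1}$",
    re.IGNORECASE,
)

def rewrite(path):
    """Return the substituted path, or None if the rule does not match.

    `path` has no leading slash, as in a .htaccess context.
    """
    m = PATTERN.match(path)
    if m is None:
        return None
    # Unmatched groups become empty strings, like an unset $N in Apache.
    return "/test-" + "-".join(g or "" for g in m.groups())

# Hypothetical sample paths, one per URL shape from the question.
for p in ("word/2024jan05/p", "word/2024jan05/p3",
          "word/2024jan05/p3-", "word/2024jan05/p3-123"):
    print(p, "->", rewrite(p))
```

Because [0-9]{1} accepts exactly one digit, a path such as word/2024jan05/p12 is not rewritten by this pattern. Widening the quantifiers to {1,5} and {1,3}, as in the question's own rules, would presumably be needed for longer numbers.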

Related

htaccess rewrite rule containing certain word

I have a Magento web shop and I use a plugin to import stock, prices and products. Annoyingly, this plugin doesn't save old URLs if I update the product name etc.
Is there a way I can do this with htaccess? For example, I removed the SKU from the end of a product URL, but Google has indexed some of these old URLs.
Is it possible to rewrite https://www.example.com/xerox-everyday-toner-for-tn242y-yellow-toner-cartridge-006r04226 to https://www.example.com/xerox-everyday-toner-for-tn242y-yellow-toner-cartridge using some wildcards? Obviously everything before the word "cartridge" changes per product, so I want a redirect that, if a URL contains "-cartridge-", removes everything after that pattern. SKU lengths can vary, but SKUs only contain alphanumeric characters. If a URL does not contain "-cartridge-", nothing should happen.
I've tried a few regex patterns using an online htaccess builder, but I can't seem to get this right (unless these sites don't process the regex and that's why I think they don't work).
RewriteRule (.+-cartridge)-.+$ $1 [R=301,L]
This should do the job. Everything up to and including -cartridge gets captured. Capturing the dynamic part and this static suffix in one go means we don't have to assemble the substitution URL out of multiple parts, but can just use $1. After it, a - plus some arbitrary characters must follow.
Is there any way you can add a rule so it excludes URLs where "multipack" comes after the "-cartridge-"?
Often the easiest way to do this is to place a "do nothing" rule before the one that does the rewriting. Then you can work with a positive match ("if URL ends in -cartridge-multipack, do nothing"), instead of trying to find a negated pattern.
RewriteRule -cartridge-multipack$ - [L]
RewriteRule (.+-cartridge)-.+$ $1 [R=301,L]
The pattern is anchored at the end with $ (meaning nothing is allowed to come after this), the - stands for "no substitution", and the L flag makes the rewrite engine stop the current round of processing.
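To illustrate how the "do nothing" rule shields the multipack URLs, here is a small sketch in Python that walks a path through both patterns in the same order. It only models the matching logic, not Apache itself; the function name redirect_target and the multipack sample path are hypothetical.

```python
import re

SKIP = re.compile(r"-cartridge-multipack$")   # the "do nothing" rule
STRIP = re.compile(r"(.+-cartridge)-.+$")     # the redirect rule

def redirect_target(path):
    """Return the path to redirect to, or None when no redirect applies."""
    if SKIP.search(path):
        # First rule matched: with the L flag, the redirect rule is never tried.
        return None
    m = STRIP.search(path)
    return m.group(1) if m else None

print(redirect_target(
    "xerox-everyday-toner-for-tn242y-yellow-toner-cartridge-006r04226"))
print(redirect_target("some-toner-cartridge-multipack"))  # hypothetical URL
```

Note that RewriteRule patterns are not implicitly anchored, which is why the sketch uses search rather than a full match.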

Need .htaccess recipe to display rss feed dynamically

I currently use the following recipe to route .rss files to a script that produces a rss feed dynamically:
RewriteRule ^(.*).rss$ /get-feed.pl?item=$1
It works perfectly for URLs like this:
www.example.com/articles.rss
What I would like to do is change the URL to this:
www.example.com/rss/articles/
Everything I have tried doesn't work.
I just tried to put some slashes in the recipe, but I'm not an expert in these recipes, so they didn't work. Something like this didn't work: RewriteRule ^/rss/(.*)/$ /get-feed.pl?item=$1
("recipe" = regular expression / "regex" for short OR RewriteRule "pattern" from the Apache docs - At least I think that is what you are referring to? We are not baking a cake here! ;) )
That is very close, except that the URL-path that the RewriteRule pattern matches against does not start with a slash when used in a .htaccess (directory) context. So, it would need to be like this: ^rss/(.*)/$. If you had looked to see what your first rule was returning you would have seen that there was no slash prefix in the backreference that was captured (ie. the value of the item URL parameter).
However, there are other (minor) issues here...
The 2nd path segment cannot be empty, so it would be preferable to match something rather than anything, e.g. (.+) instead of (.*). However, this should be made more restrictive, so that it matches just a single path segment instead of any URL-path (which is likely to fail anyway, I suspect). Presumably /rss/foo/bar/baz/ should not match?
Again, if you only want to match a string of the form articles, then make the regex more restrictive so that it only matches letters (or perhaps letters + numbers + hyphens).
You are missing the L (last) flag on this rule, which is a problem if you have other directives that follow.
So, if you are wanting to rewrite URLs of the form www.example.com/rss/articles/ (note the trailing slash) then try the following instead:
RewriteRule ^rss/([\w-]+)/$ /get-feed.pl?item=$1 [L]
Make sure the browser cache is cleared before testing.
And this would need to go near the top of the .htaccess file, before any existing rewrites.
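The effect of the leading slash can be checked with a short Python sketch. It assumes, as described above, that in a .htaccess context the path is matched without its leading slash; the helper item_for is a made-up name, and re.ASCII is used so that \w behaves like the ASCII-only class Apache's regex engine uses by default.

```python
import re

# In a .htaccess context a request for /rss/articles/ is matched
# as "rss/articles/" (no leading slash).
url_path = "rss/articles/"

with_slash = re.compile(r"^/rss/(.*)/$")                  # the attempted pattern
without_slash = re.compile(r"^rss/([\w-]+)/$", re.ASCII)  # the suggested pattern

def item_for(pattern, path):
    """Return the captured item name, or None if the pattern does not match."""
    m = pattern.match(path)
    return m.group(1) if m else None

print(item_for(with_slash, url_path))      # no match
print(item_for(without_slash, url_path))   # captures the item name
```

The stricter character class also rejects multi-segment paths such as rss/foo/bar/baz/ and an empty segment such as rss//.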
Aside: A quick look at your original directive:
RewriteRule ^(.*).rss$ /get-feed.pl?item=$1
This is not strictly correct, as it potentially matches too much. The unescaped dot before rss matches any character, not just a literal dot. And the .* subpattern matches 0 or more characters of anything, so it also matches an empty name, whereas the item must be something. So, this should really be something like:
RewriteRule ^([\w-]+)\.rss$ /get-feed.pl?item=$1 [L]
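A quick comparison of the two patterns, sketched in Python (the sample paths and the helper name item are invented for the demonstration):

```python
import re

loose = re.compile(r"^(.*).rss$")                    # original pattern
strict = re.compile(r"^([\w-]+)\.rss$", re.ASCII)    # tightened pattern

def item(pattern, path):
    """Return the captured item, or None if the pattern does not match."""
    m = pattern.match(path)
    return m.group(1) if m else None

# "articlesXrss" has no dot at all, and ".rss" has an empty item name.
for path in ("articles.rss", "articlesXrss", ".rss"):
    print(repr(path), "loose:", repr(item(loose, path)),
          "strict:", repr(item(strict, path)))
```

The loose pattern accepts all three paths (capturing an empty item for .rss), while the strict one accepts only articles.rss.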

confusion between two rewrite urls rules

These two rules conflict:
RewriteRule ^health-institute-([a-zA-Z\-]+)-([a-zA-Z\-]+)$ search.php?city=$1&speciality=$2 [L]
RewriteRule ^health-institute-app-([a-zA-Z\-]+)$ search.php?city=$1 [L]
When I request health-institute-app-mycity (2nd rule), the server treats app as a value and tries to serve search.php?city=app&speciality=mycity (1st rule).
How can I make these two separate rules?
Yes, because the regex ^health-institute-([a-zA-Z\-]+)-([a-zA-Z\-]+)$ in the first rule also matches health-institute-app-mycity.
You need to reverse these two directives so the more specific rule is first.
For example:
RewriteRule ^health-institute-app-([a-zA-Z-]+)$ search.php?city=$1 [L]
RewriteRule ^health-institute-([a-zA-Z-]+)-([a-zA-Z-]+)$ search.php?city=$1&speciality=$2 [L]
(No need to backslash-escape the hyphen when at the start or end of the character class.)
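The ordering effect can be sketched in Python by treating the rules as a list that is scanned top to bottom, with the first matching pattern winning (which is what the L flag on each rule amounts to here). The function first_match and the sample request health-institute-paris-cardiology are hypothetical.

```python
import re

GENERIC = (r"^health-institute-([a-zA-Z-]+)-([a-zA-Z-]+)$",
           r"search.php?city=\1&speciality=\2")
SPECIFIC = (r"^health-institute-app-([a-zA-Z-]+)$",
            r"search.php?city=\1")

def first_match(rules, path):
    """Return the substitution of the first rule whose pattern matches."""
    for pattern, template in rules:
        m = re.match(pattern, path)
        if m:
            return m.expand(template)
    return None

# Original order: the generic rule swallows the "app" URL.
print(first_match([GENERIC, SPECIFIC], "health-institute-app-mycity"))
# Reversed order: the specific rule gets the first chance.
print(first_match([SPECIFIC, GENERIC], "health-institute-app-mycity"))
```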
HOWEVER, the regex in the (now) second rule is potentially ambiguous: the hyphen (-) delimits the two values (city and speciality), but it is also included in both character classes, so it can presumably be part of the values themselves. In practice, city and speciality cannot both contain hyphens, despite the regex seemingly allowing this.
For example, how should a request for health-institute-foo-bar-baz-qux be resolved? Since the + quantifier is greedy, this currently results in search.php?city=foo-bar-baz&speciality=qux. If the speciality ever contains a hyphen (as the regex suggests it could), everything before its last hyphen ends up in city instead.
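The greedy split is easy to confirm with a few lines of Python; the function name split_request is made up for this sketch.

```python
import re

PATTERN = re.compile(r"^health-institute-([a-zA-Z-]+)-([a-zA-Z-]+)$")

def split_request(path):
    """Return (city, speciality) as captured by the rule, or None."""
    m = PATTERN.match(path)
    return m.groups() if m else None

# The first group is greedy, so it takes everything up to the LAST hyphen.
print(split_request("health-institute-foo-bar-baz-qux"))
```

One possible way out, if the data allows it, would be to drop the hyphen from one of the two character classes or to use a different delimiter between the values; which of these fits depends on the actual city and speciality names.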

Multiple rewrite rule with different parameters in the same position

I would like to know if the following would be possible
RewriteRule ^([^/]*)/([^/]*)$ /search.php?type=$1&query=$2 [L]
RewriteRule ^([^/]*)/([^/]*)/([^/]*)$ /search.php?type=$1&query=$2&condition=$3 [L]
RewriteRule ^([^/]*)/([^/]*)/([^/]*)/([^/]*)$ /search.php?type=$1&query=$2&page=$3 [L]
As you can see, the second and third rules are similar, with the only difference being the name of the third parameter. The second rule would be used for pages such as
/isbn/1203910293/new
whilst the third rule would be used for pages such as the following, where page is the page number:
/title/harry-potter/2
I know this seems quite silly considering I can just reuse the condition parameter, but it would make things clearer in the future if I used the parameter page.
The third rule pattern ^([^/]*)/([^/]*)/([^/]*)/([^/]*)$ will not match
/title/harry-potter/2
because the rule requires four parts, e.g.
/title/harry-potter/2/xyz
or at least a trailing slash
/title/harry-potter/2/
Instead it will be matched by the second rule pattern, because it has three parts too, just like
/isbn/1203910293/new
If you want to match page numbers, you need a rule similar to the second one, but more specific, and placed before it so that it gets the first chance to match, e.g.
RewriteRule ^([^/]*)/([^/]*)/(\d+)$ /search.php?type=$1&query=$2&page=$3 [L]
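A minimal Python sketch of how such a rule set would dispatch the two example URLs, assuming the digit-only rule is listed before the generic three-part rule. The list RULES and the function route are illustrative names, and paths are given without the leading slash, as mod_rewrite sees them in .htaccess.

```python
import re

# Order matters: the more specific (\d+) rule comes before the generic one.
RULES = [
    (r"^([^/]*)/([^/]*)$",
     r"/search.php?type=\1&query=\2"),
    (r"^([^/]*)/([^/]*)/(\d+)$",
     r"/search.php?type=\1&query=\2&page=\3"),
    (r"^([^/]*)/([^/]*)/([^/]*)$",
     r"/search.php?type=\1&query=\2&condition=\3"),
]

def route(path):
    """Return the substitution of the first matching rule, or None."""
    for pattern, template in RULES:
        m = re.match(pattern, path, re.ASCII)
        if m:
            return m.expand(template)
    return None

print(route("title/harry-potter/2"))   # numeric third part -> page
print(route("isbn/1203910293/new"))    # anything else -> condition
```

One consequence of this design is that a purely numeric condition value would be treated as a page number, so it only works if conditions are never all digits.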

mod rewrite help

I am trying to use mod rewrite to remove and replace part of my url. I am looking to get my urls looking like this.
http://domain.com/e813c697e8dd8dc2bbfecb1d20b15783.html
instead of
http://domain.com/lookup.php?md5=e813c697e8dd8dc2bbfecb1d20b15783
lookup.php matches the md5 against the database, fetches the correct URL and forwards you to it.
All I need to do now is rewrite it so that it rewrites from this
http://domain.com/lookup.php?md5=e813c697e8dd8dc2bbfecb1d20b15783
to this
http://domain.com/e813c697e8dd8dc2bbfecb1d20b15783.html
I have tried this, which works, but it also rewrites every other .html page at root level, so those pages display nothing (blank).
RewriteRule ^([a-z0-9]+)\.html$ /lookup.php?md5=$1
Can anyone tell me a way to do this so that my regular html pages are not messed up and be able to display these links how I want to? Thanks.
Your current rule is way too broad. You need to make it more specific so that it only matches an md5 hash value, which is easy:
RewriteRule ^([a-f0-9]{32})\.html$ /lookup.php?md5=$1 [QSA,L]
Your pattern for the file name is too broad: it matches any name made of letters and digits. An md5 hash, on the other hand, uses a very limited set of characters (the letters a-f and digits) and is always 32 characters long. The pattern ([a-f0-9]{32}) does the job perfectly.
I have also added the L and QSA flags: QSA preserves any existing query string (tracking info, for example) and L stops any further rules from being processed.
To further ensure that it does not match any real files that happen to have a name in this format (who knows), add a RewriteCond %{REQUEST_FILENAME} !-f line before the rule.
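The difference between the broad and the md5-specific pattern can be seen in a short Python sketch; index.html is an assumed example of a regular page, and is_rewritten is a made-up helper name.

```python
import re

BROAD = re.compile(r"^([a-z0-9]+)\.html$")     # pattern from the question
MD5 = re.compile(r"^([a-f0-9]{32})\.html$")    # exactly 32 hex characters

def is_rewritten(pattern, name):
    """True if the rule with this pattern would rewrite the request."""
    return pattern.match(name) is not None

for name in ("index.html", "e813c697e8dd8dc2bbfecb1d20b15783.html"):
    print(name, "broad:", is_rewritten(BROAD, name),
          "md5:", is_rewritten(MD5, name))
```

The broad pattern catches both names, whereas the 32-character hex pattern leaves index.html alone.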
One thing you can do is quantify the number of hex digits:
RewriteRule ^([0-9a-f]{32})\.html$ /lookup.php?md5=$1
as md5 will always have 32 hex digits.
That depends on the naming scheme of your regular HTML pages. For starters, though, you could change it to: RewriteRule ^([a-f0-9]+)\.html$ /lookup.php?md5=$1
That would leave alone any HTML pages whose names contain a letter outside a-f.
If none of your HTML pages have numbers in their names, and if you can accept the small chance of an md5 hash containing no digits (such a hash would not be redirected), you could check that there is at least one digit in the filename: RewriteRule ^([a-f]*\d[a-f\d]*)\.html$ /lookup.php?md5=$1
Lastly, if it is acceptable for you to have URLs like http://domain.com/md5/e813c697e8dd8dc2bbfecb1d20b15783.html instead, you could change it to: RewriteRule ^md5/([a-f0-9]+)\.html$ /lookup.php?md5=$1 and not have to worry about the fragility of the other methods.
I'm not sure, but judging by your example rule, I think you have a RewriteBase / somewhere in your .htaccess. If not, you might need to add a / right after the ^ in whichever RewriteRule you choose to go with.