Need .htaccess recipe to display rss feed dynamically - apache

I currently use the following recipe to route .rss files to a script that produces an RSS feed dynamically:
RewriteRule ^(.*).rss$ /get-feed.pl?item=$1
It works perfectly for URLs like this:
www.example.com/articles.rss
What I would like to do is change the URL to this:
www.example.com/rss/articles/
Everything I have tried doesn't work.

I just tried to put some slashes in the recipe, but I'm not an expert in these recipes so they didn't work. Something like this didn't work: RewriteRule ^/rss/(.*)/$ /get-feed.pl?item=$1
("recipe" = regular expression / "regex" for short OR RewriteRule "pattern" from the Apache docs - At least I think that is what you are referring to? We are not baking a cake here! ;) )
That is very close, except that the URL-path that the RewriteRule pattern matches against does not start with a slash when used in a .htaccess (directory) context. So, it would need to be like this: ^rss/(.*)/$. If you had looked to see what your first rule was returning you would have seen that there was no slash prefix in the backreference that was captured (ie. the value of the item URL parameter).
However, there are other (minor) issues here...
The 2nd path segment cannot be empty, so it would be preferable to match something, rather than anything, eg. (.+) instead of (.*). However, this should be made more restrictive, so that it matches just a single path segment instead of any URL-path (which is likely to fail anyway, I suspect). eg. Presumably /rss/foo/bar/baz/ should not match?
Again, if you only want to match a string of the form articles then make the regex more restrictive so that it only matches letters (or perhaps letters + numbers + hyphens)?
You are missing the L (last) flag on this rule, which is a problem if you have other directives that follow.
So, if you are wanting to rewrite URLs of the form www.example.com/rss/articles/ (note the trailing slash) then try the following instead:
RewriteRule ^rss/([\w-]+)/$ /get-feed.pl?item=$1 [L]
Make sure the browser cache is cleared before testing.
And this would need to go near the top of the .htaccess file, before any existing rewrites.
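To see why the slash prefix matters, here is a minimal Python sketch that mimics how a pattern is tested in a .htaccess context. It is only an illustration: Python's re module stands in for Apache's regex engine (for patterns this simple the two should behave the same), the helper name htaccess_match is hypothetical, and the .htaccess file is assumed to sit in the document root.

```python
import re

def htaccess_match(pattern, url_path):
    """Return the first captured group, or None if the pattern does not match.

    Assumption: the .htaccess file sits in the document root, so the
    directory prefix removed before matching is just the leading "/".
    """
    relative = url_path[1:] if url_path.startswith("/") else url_path
    # re.ASCII keeps \w limited to [A-Za-z0-9_].
    match = re.search(pattern, relative, flags=re.ASCII)
    return match.group(1) if match else None

# The attempted rule expects a slash prefix that is never there.
print(htaccess_match(r"^/rss/(.*)/$", "/rss/articles/"))        # None
# The corrected rule matches and captures the item name.
print(htaccess_match(r"^rss/([\w-]+)/$", "/rss/articles/"))     # articles
# Extra path segments are rejected by the stricter character class.
print(htaccess_match(r"^rss/([\w-]+)/$", "/rss/foo/bar/baz/"))  # None
```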
Aside: A quick look at your original directive:
RewriteRule ^(.*).rss$ /get-feed.pl?item=$1
This is not strictly correct, as it potentially matches too much. The unescaped dot before rss matches any character, not just a literal dot. And the .* subpattern matches 0 or more characters of anything, whereas the item name must be something (at least one character). So, this should really be something like:
RewriteRule ^([\w-]+)\.rss$ /get-feed.pl?item=$1 [L]
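The difference between the two patterns can be checked quickly outside Apache. The following is a rough sketch using Python's re module as a stand-in for mod_rewrite's regex matching; the sample paths are made up for illustration.

```python
import re

loose = re.compile(r"^(.*).rss$")                    # original pattern
strict = re.compile(r"^([\w-]+)\.rss$", re.ASCII)    # suggested replacement

# Hypothetical request paths (already without the slash prefix).
samples = ["articles.rss", "articlesXrss", ".rss", "a/b.rss"]

for path in samples:
    print(f"{path!r:16} loose={bool(loose.search(path))!s:5} "
          f"strict={bool(strict.search(path))}")
```

In this sketch only articles.rss passes the strict pattern, whereas the loose pattern accepts all four samples, including .rss, for which the captured item is empty.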

Related

.htaccess rule match part of new url and use as query string for old url

I am trying to match part of a URL and then use the matched expression to append onto the end of a query string from the old URL.
I have the following line in .htaccess, once I've worked out what I'm doing wrong I'll be able to fix the rest so for now I will just focus on the following line:
RewriteRule ^league/([^/]*)$/matches/? index.php?page_id=1074&league=$1
I would like ([^/]*)$ to appear where $1 is
So essentially: /league/29/matches/ would point to index.php?page_id=1074&league=29
Can anyone please tell me what I am doing wrong? :)
RewriteRule ^league/([^/]*)$/matches/? index.php?page_id=1074&league=$1
The $ (end-of-string anchor) after the subpattern ([^/]*)$, in the middle of the regex, does not make sense here and will cause the regex to fail. You should also be using the + quantifier here (1 or more). You are also missing the end-of-string anchor ($) at the end of the regex (otherwise the trailing /? is superfluous). Although you shouldn't really make the trailing slash optional on the rewrite as it potentially opens you up for duplicate content. (You should redirect to append/remove the trailing slash to canonicalise the URL instead.) You are also missing the L flag on the RewriteRule.
Try the following instead:
RewriteRule ^league/([^/]+)/matches/?$ index.php?page_id=1074&league=$1 [L]
Although, if you are expecting digits only (as in your example) then you should be matching digits, not anything. So, the following is perhaps "more" correct:
RewriteRule ^league/(\d+)/matches/$ index.php?page_id=1074&league=$1 [L]
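As a quick sanity check, the three patterns can be compared in a small Python sketch. Python's re module is used here as an approximation of the regex engine behind RewriteRule, and the helper name league_id is hypothetical.

```python
import re

BROKEN = r"^league/([^/]*)$/matches/?"    # "$" in the middle can never be followed by more text
LENIENT = r"^league/([^/]+)/matches/?$"   # any non-empty segment, optional trailing slash
STRICT = r"^league/(\d+)/matches/$"       # digits only, trailing slash required

def league_id(pattern, path):
    """Return the captured league value, or None when the pattern fails."""
    match = re.search(pattern, path, flags=re.ASCII)
    return match.group(1) if match else None

print(league_id(BROKEN, "league/29/matches/"))    # None
print(league_id(LENIENT, "league/29/matches/"))   # 29
print(league_id(LENIENT, "league/29/matches"))    # 29
print(league_id(STRICT, "league/29/matches"))     # None
print(league_id(STRICT, "league/abc/matches/"))   # None
```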

Rewrite encoded URLs with RewriteRules

I was rewriting "domain.com/lolmeter/platformValue/usernameValue" (platformValue and usernameValue are values requested by the user with text inputs) with the following rewrite rule:
RewriteRule ^lolmeter/([a-zA-Z0-9]*)/([a-zA-Z0-9]*)$ /lolmeter.html?platform=$1&username=$2 [L]
button.href = lolmeter/platformValue/usernameValue
I noticed that when the user inputs a whitespace or another non alphanumeric value, it is encoded with "%" symbols automatically, so I tried to rewrite the rule to accept them, like:
RewriteRule ^lolmeter/(([a-zA-Z0-9]|%)*)/(([a-zA-Z0-9]|%)*)$ /lolmeter.html?platform=$1&username=$2 [L]
But it doesn't work, I assume because of the parentheses. Which symbol should I use then for an inner "|" ?
P.S: Is there a more popular or modern way for changing URLs?
RewriteRule ^lolmeter/(([a-zA-Z0-9]|%)*)/(([a-zA-Z0-9]|%)*)$ /lolmeter.html?platform=$1&username=$2 [L]
The RewriteRule pattern matches against the %-decoded URL-path. So, if an encoded space (ie. %20) is present in the URL-path of the request then the rule matches against a literal space, not %20.
You can use the \s shorthand character class inside the character class in your regex to match any whitespace character.
For example:
RewriteRule ^lolmeter/([a-zA-Z0-9\s]+)/([a-zA-Z0-9\s]*)$ /lolmeter.html?platform=$1&username=$2 [L]
Note that I made the quantifier on the second/middle path segment + instead of * since I assume the middle path segment is not optional. Note that multiple contiguous slashes in the URL-path are also reduced before the regex is matched so if the middle path segment was omitted then the passed username would be seen as the platform, which I'm sure is not the intention.
Note also that in the above the space is not re-encoded in the resulting rewrite. Use the B flag to re-encode the space as a + in the query string. (If you specifically needed the space to be re-encoded as %20 then use the BNP flag as well - requires Apache 2.4.26)
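The decode-then-match behaviour described above can be imitated in Python. This is only a sketch under assumptions: unquote approximates the %-decoding Apache performs before matching, quote_plus approximates the re-encoding done by the B flag (space becomes +), and the function name rewrite and the sample values are hypothetical.

```python
import re
from urllib.parse import quote_plus, unquote

RULE = re.compile(r"^lolmeter/([a-zA-Z0-9\s]+)/([a-zA-Z0-9\s]*)$", re.ASCII)
ASKER_RULE = re.compile(r"^lolmeter/(([a-zA-Z0-9]|%)*)/(([a-zA-Z0-9]|%)*)$")

def rewrite(raw_path):
    """Decode the requested path, match it, and build the rewritten URL."""
    decoded = unquote(raw_path)          # "%20" becomes a literal space
    match = RULE.search(decoded)
    if match is None:
        return None
    platform, username = match.groups()
    # Re-encode the captured values for use in a query string.
    return ("/lolmeter.html?platform=" + quote_plus(platform)
            + "&username=" + quote_plus(username))

print(rewrite("lolmeter/euw/some%20name"))
# The asker's pattern looks for "%" but is tested against the decoded path.
print(ASKER_RULE.search(unquote("lolmeter/euw/some%20name")))
```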
P.S: Is there a more popular or modern way for changing URLs?
Not sure exactly what you mean by this, but mod_rewrite on Apache is the URL rewriting module. Always has been and probably always will be.
However, you don't necessarily need to rewrite the request the way you have done, although you may still want to match the URL in a similar way (depending on what else you are doing). You could perhaps just rewrite the request to lolmeter.html and have your script parse the URL-path directly, rather than the query string.
Or, I suppose the "modern way" would be to rewrite everything to a "front-controller" - an entry script that parses the URL and "routes" the request appropriately. This avoids having to have a multitude of rewrites in .htaccess. Although this isn't anything "new", it has perhaps become more common. Many CMS/frameworks use this pattern.
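For illustration only, a front-controller boils down to a routing table that the entry script consults. The sketch below is in Python rather than any particular framework, and the route table, handler names, and function name route are all hypothetical.

```python
import re

# Hypothetical routing table: (pattern, handler name).
ROUTES = [
    (re.compile(r"^lolmeter/([^/]+)/([^/]*)$"), "lolmeter"),
    (re.compile(r"^$"), "home"),
]

def route(url_path):
    """Return (handler name, captured parameters) for a requested URL-path."""
    path = url_path.lstrip("/")
    for pattern, handler in ROUTES:
        match = pattern.search(path)
        if match:
            return handler, match.groups()
    return "not_found", ()

print(route("/lolmeter/euw/some name"))   # ('lolmeter', ('euw', 'some name'))
print(route("/unknown/page"))             # ('not_found', ())
```

With this approach .htaccess only needs a single rule that sends every request to the entry script, and new URLs are added in the routing table instead.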

.htaccess: Add optional language parameter to all pages

I have a .htaccess file rewriting all URLs. For example:
# Special urls
RewriteRule ^(article)/([^/]*)(?:/[^/]*)?$ /index.php?page=$1&keyword=$2 [L,QSA,NC]
RewriteRule ^(privacy)$ /index.php?page=legal&type=$1 [L,QSA,NC]
RewriteRule ^(imprint)$ /index.php?page=legal&type=$1 [L,QSA,NC]
# All urls
RewriteRule ^([0-9a-zA-Z\s]+)$ /index.php?page=$1 [QSA]
Now I want to integrate a language parameter in my URL. For example
example.com/test ➡ /index.php?page=test
example.com/en/test ➡ /index.php?lang=en&page=test
How would I accomplish this without having to edit all RewriteRules? Is there a way to check if the part after example.com matches a regex, append a query parameter and handle all future rules normally?
Yes, this is possible. Add the following rule before your existing rules:
# Add optional language URL param if present in the first path segment
RewriteRule ^(\w\w)/(.*) $2?lang=$1 [QSA,DPI]
UPDATE: The DPI flag discards the original path-info that would otherwise be appended to the URL-path after the rewrite. Without it, the directives that immediately follow would fail to match until the next round of processing. (See the update below that goes into more detail.)
This assumes that the language code is always 2 characters.
The URL is rewritten to remove the language path segment from the URL-path and appends this as a lang URL parameter. The L flag is specifically omitted so the following rules are left to match the URL-path without the language code.
Since the following rules already have the QSA flag then the lang= URL parameter is appended.
Note, however, that the lang=en parameter is appended at the end of the query string, not prefixed to the beginning, as in your example.
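The two-step flow (strip the language, then let the existing page rule run with QSA) can be sketched in Python. This is a simplified model, not how Apache is implemented: it ignores path-info and only models the language rule plus the final "All urls" rule from the question. The function names are hypothetical.

```python
import re

LANG_RULE = re.compile(r"^(\w\w)/(.*)", re.ASCII)
PAGE_RULE = re.compile(r"^([0-9a-zA-Z\s]+)$", re.ASCII)

def merge_qsa(new_query, old_query):
    """Model of QSA: the original query string is appended after the new one."""
    return new_query + "&" + old_query if old_query else new_query

def rewrite(path, query=""):
    """Apply the language rule (no L flag), then the page rule."""
    match = LANG_RULE.search(path)
    if match:
        lang, path = match.groups()
        query = merge_qsa("lang=" + lang, query)
    match = PAGE_RULE.search(path)
    if match is None:
        return None
    return "/index.php?" + merge_qsa("page=" + match.group(1), query)

print(rewrite("test"))      # /index.php?page=test
print(rewrite("en/test"))   # /index.php?page=test&lang=en
```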
UPDATE: It seems like not only the language gets appended, but also the page name. For example: example.com/de/index results in index?lang=de/index
Solved by adding DPI to [QSA] ➡ [QSA,DPI]
Hhhmmm, yes and no... given only the directives as stated in the question this should still "work" without the DPI flag (although not as efficiently). In fact, index?lang=de/index does not seem to be possible as a resulting URL (there's no page URL parameter)? It's possible that other directives are perhaps resulting in the path-info being appended to the query string? One thing of note is that the last rule stated in the question is missing the L flag, so any directives that follow are perhaps being unnecessarily processed (and even conflicting). However, the DPI flag is certainly an improvement here and should be added.
In detail...
Given a request of the form /de/index, and /de does not exist as a physical directory, then the /index part on the end of the requested URL-path is additional pathname information (path-info) and this is appended to the URL-path after the rewrite above (which is indeed undesirable). So, a request of the form /de/index is rewritten to index?lang=de, which becomes index/index?lang=de after the path-info is re-appended (note that it's not appended to the query string).
The resulting URL-path index/index (with appended path-info) fails to match the RewriteRule directives that follow. However, the rewrite engine then starts over, at which point the path-info that was (unnecessarily) added is then naturally discarded before the next round of processing. This results in the "correct" index?lang=de URL being used as the input for the second round of processing. This matches the last rule stated in the question and the request is finally rewritten to /index.php?page=index&lang=de.
So, the DPI flag shouldn't strictly be necessary here for it to "work". However, it is certainly recommended as it avoids the unnecessary 2nd pass through the rewrite engine. With the DPI flag, the path-info is not appended after the first rewrite, so the URL-path would match the appropriate rule on the first pass.

mod_rewrite rule gives 404 errors

I am having problems with mod_rewrite; the following rule keeps giving me a 404 error:
RewriteRule ^music.mp3?id=(.*)$ music.php?id= [L]
I need the URL /music.mp3?id=1 to be served by the real URL /music.php?id=1.
Any ideas?
I think I have misunderstood something about the RewriteRule pattern.
You can't match against the query string (everything after the ? in the URL) in a RewriteRule. But you're not really matching against it anyway; it looks like you just need it appended to your target URI, so:
RewriteRule ^music\.mp3$ music.php [L]
Should be good enough. Any query string parameters (like ?id=1) will automatically get appended at the end.
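A small Python sketch may make the split between path and query string clearer. It uses urlsplit to separate the two parts, tests the pattern against the path only, and carries the query string over unchanged, which approximates the default behaviour described above. The function name rewrite_music is hypothetical and a .htaccess file in the document root is assumed.

```python
import re
from urllib.parse import urlsplit

def rewrite_music(url, pattern=r"^music\.mp3$"):
    """Match the pattern against the path only and keep the query string."""
    parts = urlsplit(url)
    path = parts.path[1:] if parts.path.startswith("/") else parts.path
    if re.search(pattern, path) is None:
        return None
    # The original query string is passed through to the target.
    return "/music.php" + ("?" + parts.query if parts.query else "")

print(rewrite_music("/music.mp3?id=1"))                          # /music.php?id=1
# The original pattern never sees "?id=1", so it cannot match.
print(rewrite_music("/music.mp3?id=1", r"^music.mp3?id=(.*)$"))  # None
```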
In your expression (the engine rule), what you want to say is:
"When the url starts with music.mp3?id=, take whatever is after the = and change the URL to music.php?id= and put the part after id="
In regular expressions the . character has special meaning. If you want to say "the dot character", and not give it special meaning, you need to escape it by putting a \ in front of it, like this:
^music\.mp3?id=(.*)$
@Jon Lin already gave you the other part, which is about query strings.

question regarding specific mod_rewrite syntax

I know there are other questions that are similar to this.. but I'm really struggling with mod_rewrite syntax so I could use some help.
Basically, what I am trying to do is have the following redirect occur:
domain.com/1/ redirect to domain.com/?id=$1 (also should work for www.domain.com)
What I have so far (that isn't working):
RewriteEngine On
ReRewriteRule ^/([0-9])$ /?id=$1
A few issues.
First is terminology: if you want a request for domain.com/1/ to be served by index.php?id=1, then you are rewriting /1/ to index.php?id=1, not the other way around as you said.
Second, a simple typo: RewriteRule, not ReRewriteRule.
Third, [0-9] is the right way to match a digit, but it'll only match a single digit. If you want to handle /13 then you should match one or more instances of [0-9] by writing [0-9]+.
Fourth, the target of your rule should be the file you want to serve. / is not a file or an absolute URL; write out index.php if that's what you mean.
Fifth, you say you want to handle /1/, but your rule says that the matched request must end in a number, not a slash. If you want to accept the slash whether it's there or not, put that in the rule.
RewriteRule ^/?([0-9]+)/?$ index.php?id=$1 [L]
Does that work?
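A rough way to compare the original pattern with the suggested one is to test both against a few sample paths. The sketch below uses Python's re module as a stand-in for mod_rewrite's matching; the function name rewrite and the sample paths are made up.

```python
import re

ORIGINAL = r"^/([0-9])$"        # from the question (with the typo fixed)
SUGGESTED = r"^/?([0-9]+)/?$"   # optional slashes, one or more digits

def rewrite(pattern, path):
    """Return the rewritten target, or None if the pattern does not match."""
    match = re.search(pattern, path)
    return "index.php?id=" + match.group(1) if match else None

for path in ["1/", "13", "/1", "abc/"]:
    print(path, rewrite(ORIGINAL, path), rewrite(SUGGESTED, path))
```

The sample paths without a slash prefix correspond to what a .htaccess file would see, and the one with a prefix to a server or virtual host context; the optional /? at the start of the suggested pattern is what lets it cover both.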
You've three issues:
RewriteRule is misspelt, as pointed out by Michael; you need to worry about the trailing slash; and you need to stop processing rules when you've found the match:
RewriteRule ^/(\d+)/?$ /?id=$1 [L]
You have misspelled RewriteRule. Otherwise, I think your syntax looks correct.
RewriteEngine On
ReRewriteRule ^/([A-Za-z0-9]+)$ /?id=$1
--^^^---------
Actually, you should probably remove the /:
RewriteEngine On
RewriteRule ^([A-Za-z0-9$_.+!*'(),-]+)$ /?id=$1
------^^^---------
EDIT Added the +. Look at all the answers here. You need a composite of them, including the + and the [L], in addition to what I have here.
EDIT2 Also edited to include alpha characters in the id.
EDIT3 Added special characters to regex. These should be valid in a URL, but it's unusual to find them there.