I want to remove a string with a question mark at the end of my URL with .htaccess - apache

I want to remove the string
?mobile=1
from various URLs with .htaccess. So:
https://www.example.com/?mobile=1 should become https://www.example.com/
and
https://www.example.com/something/?mobile=1 should become https://www.example.com/something/
I tried the following
RewriteEngine On
RewriteRule ^(.+)?mobile=1 /$1 [R=301,L,NC]
But that does not seem to work. Any ideas?

RewriteRule ^(.+)?mobile=1 /$1 [R=301,L,NC]
The RewriteRule pattern matches against the URL-path only, which notably excludes the query string. So the above would never match. (Unless there was a %-encoded ? in the URL-path, eg. %3F)
To match the query string you need an additional condition (RewriteCond directive) and match against the QUERY_STRING server variable.
The regex .+ (1 or more) will not match the document root (ie. your first example: https://www.example.com/?mobile=1). You need to allow for an empty URL-path in this case. eg. .* (0 or more).
For example, try the following near the top of your root .htaccess file:
RewriteCond %{QUERY_STRING} =mobile=1
RewriteRule (.*) /$1 [QSD,R=301,L]
This matches the query string mobile=1 exactly and case-sensitively (as in your examples), so no other URL parameters can be present. The = prefix on the CondPattern makes this an exact string comparison, rather than the regex match it would otherwise be.
The rule then redirects to the same URL-path, represented by the $1 backreference in the substitution string, which holds the URL-path captured by the group in the RewriteRule pattern.
The QSD (Query String Discard) flag removes the query string from the redirect response.
Test first with a 302 (temporary) redirect and only change to a 301 (permanent), if that is the intention, once you have confirmed this works as intended. 301s are cached persistently by the browser, which can make testing problematic.
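To see why the original rule never fired, here is a rough Python sketch of how the request is split before the directives run. It is only a model: Python's re module stands in for Apache's regex engine, and strip_mobile is a made-up helper name, not anything Apache provides.

```python
import re
from urllib.parse import urlsplit


def strip_mobile(url):
    """Return the redirect target, or None if the rule pair would not fire.

    Hypothetical helper that loosely mimics:
        RewriteCond %{QUERY_STRING} =mobile=1
        RewriteRule (.*) /$1 [QSD,R=301,L]
    """
    parts = urlsplit(url)
    # In .htaccess the pattern sees the URL-path without its leading slash
    # and without the query string.
    url_path = parts.path[1:] if parts.path.startswith("/") else parts.path
    # "=mobile=1" is an exact string comparison, not a regex.
    if parts.query != "mobile=1":
        return None
    match = re.search(r"(.*)", url_path)  # (.*) also matches an empty path
    return f"{parts.scheme}://{parts.netloc}/{match.group(1)}"  # query dropped


# The pattern from the question is tested against the path only, so the
# literal "mobile=1" is never there to be found.
original_rule = re.compile(r"^(.+)?mobile=1", re.IGNORECASE)
print(original_rule.search("something/"))
print(strip_mobile("https://www.example.com/something/?mobile=1"))
```

The exact comparison is deliberate here: a URL such as /something/?mobile=1&x=2 is left alone by this sketch, just as it would be by the condition above.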

Related

Rewrite that appends a query string is not firing?

I'd like to perform internal redirect
From /search/lalafa to /search/lalafa?post_type=review
The rule is
RewriteRule ^/search/([^/]+)$ /search/$1?post_type=review [L]
Seems correct but it doesn't seem to be matching:
https://htaccess.madewithlove.com/?share=c2ce1c7f-60be-40c8-9f0b-58ffc9eeba40
RewriteRule ^/search/([^/]+)$ /search/$1?post_type=review [L]
This will fail for two reasons:
The URL-path matched by the RewriteRule pattern in .htaccess does not include the slash prefix, so this rule never matches. It should be ^search/([^/]+)$.
If it did match it would create a rewrite-loop (500 Internal Server Error) since you are rewriting to the same URL-path and not checking the query string.
Try the following instead:
RewriteCond %{QUERY_STRING} !=post_type=review
RewriteRule ^search/[^/]+$ $0?post_type=review [L]
The $0 backreference contains the full URL-path, as matched by the RewriteRule pattern (NB: no slash prefix). The preceding condition checks that the query string is not already equal to post_type=review, thus preventing a rewrite-loop.
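The loop-prevention argument can be illustrated with a toy Python model of repeated rewrite passes. This is only a sketch of the reasoning above, not of Apache's actual internals: rewrite_once and count_passes are hypothetical names, and the limit of 10 passes is an arbitrary cap chosen for the illustration.

```python
import re

# Per-directory (.htaccess) pattern: no leading slash on the URL-path.
PATTERN = re.compile(r"^search/[^/]+$")


def rewrite_once(url_path, query, check_query=True):
    """Apply the rule once; return (path, query) or None if it does not fire."""
    # RewriteCond %{QUERY_STRING} !=post_type=review
    if check_query and query == "post_type=review":
        return None
    match = PATTERN.search(url_path)
    if match is None:
        return None
    # $0?post_type=review : same path, query string replaced (no QSA flag)
    return match.group(0), "post_type=review"


def count_passes(url_path, query="", check_query=True, limit=10):
    """Count how many times the rule fires before it stops (or hits the cap)."""
    passes = 0
    while passes < limit:
        result = rewrite_once(url_path, query, check_query)
        if result is None:
            break
        url_path, query = result
        passes += 1
    return passes


print(count_passes("search/lalafa"))                     # with the condition
print(count_passes("search/lalafa", check_query=False))  # without it
print(count_passes("/search/lalafa"))                    # slash-prefixed pattern input
```

With the condition the rule fires once and then stops; without it the model keeps rewriting until the cap is reached, and a slash-prefixed path never matches at all.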

remove //xx after directory in URL

I need to redirect all URLs like this:
example.com/podcasts//rebt
to
example.com/podcasts
I am trying to adjust this code to do both but I can't get it to work:
RewriteCond %{REQUEST_METHOD} !=POST
RewriteCond %{REQUEST_URI} ^(.*?)(/{2,})(.*)$
RewriteRule . %1/%3 [R=301,L]
To remove //<something> at the end of the URL-path (eg. /podcasts//rebt to /podcasts), try the following instead at the top of the root .htaccess file:
RewriteEngine On
RewriteCond %{THE_REQUEST} \s([^?]+?)//
RewriteRule . %1 [R=301,L]
THE_REQUEST server variable contains the first line of the initial request headers (eg. GET /podcasts/rebt HTTP/1.1) and does not change when the request is internally rewritten (unlike REQUEST_URI).
The regex \s([^?]+?)// captures the part of the URL-path before the first instance of a double slash. Everything from the double slash onwards is discarded. This regex also ensures we do not inadvertently match against the query string (if any).
The %1 backreference contains the captured subpattern (ie. everything before the first double slash in the URL-path) from the preceding CondPattern.
Aside: Note that this will not work properly if the preceding URL-path maps to a physical directory, since it will result in two redirects. eg. /directory//something to /directory to /directory/ (by mod_dir). In this case, you should avoid removing the first trailing slash.
You should test first with a 302 (temporary) redirect to avoid any potential caching issues and only change to a 301 (permanent) redirect when you are sure it's working as intended. You should clear your browser cache before testing.
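If you want to check what the CondPattern captures before touching the server, the same regex can be run in Python against a few sample request lines. This is a sketch only: the request lines are made-up examples, redirect_path is a hypothetical helper name, and Python's re module is assumed to behave like Apache's regex engine for this pattern.

```python
import re

# Same regex as the CondPattern tested against THE_REQUEST.
COND_PATTERN = re.compile(r"\s([^?]+?)//")


def redirect_path(request_line):
    """Return what %1 would hold, or None if the condition fails."""
    match = COND_PATTERN.search(request_line)
    return match.group(1) if match else None


for line in (
    "GET /podcasts//rebt HTTP/1.1",       # double slash in the URL-path
    "GET /podcasts/rebt HTTP/1.1",        # no double slash
    "GET /podcasts?ref=a//b HTTP/1.1",    # double slash only in the query string
):
    print(line, "->", redirect_path(line))
```

Only the first request line produces a capture (/podcasts); the double slash inside the query string is ignored because [^?] cannot cross the question mark.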
A look at your existing rule...
RewriteCond %{REQUEST_METHOD} !=POST
RewriteCond %{REQUEST_URI} ^(.*?)(/{2,})(.*)$
RewriteRule . %1/%3 [R=301,L]
This code is intended to reduce multiple slashes to single slashes in the URL-path, not remove the double slash and remaining path entirely. eg. /podcasts//rebt to /podcasts/rebt. However, since it checks against the REQUEST_URI server variable (which can change throughout the request) it may not work as intended.
Also, the condition that checks against the REQUEST_METHOD would seem to be redundant, unless you are erroneously POSTing to double-slashed URLs internally? A 301 redirect discards any POST data (since the browser converts the request to a GET), which is why the check may be necessary in certain cases.
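For comparison, this small Python sketch shows what the existing condition captures and what a single redirect from that rule would produce. It assumes REQUEST_URI still holds the originally requested path, and collapse_once is a hypothetical name used only for this illustration.

```python
import re

# Same regex as the existing condition on REQUEST_URI.
COLLAPSE = re.compile(r"^(.*?)(/{2,})(.*)$")


def collapse_once(request_uri):
    """Return the %1/%3 target of one redirect, or None if nothing matches."""
    match = COLLAPSE.search(request_uri)
    if match is None:
        return None
    return f"{match.group(1)}/{match.group(3)}"


print(collapse_once("/podcasts//rebt"))  # the double slash becomes a single slash
print(collapse_once("/a///b//c"))        # only the first run is collapsed per redirect
```

So the rest of the path is kept, which is the opposite of what the question asks for, and a path with several runs of slashes needs one redirect per run.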

htaccess send 404 if query string contains keyword

I'm seeing a lot of traffic which I suspect is probing for a flaw or exploit with the request format of
https://example.com/?testword
I figured while I look into this more I could save resources and disrupt or discourage these requests with a 404 or 500 response
I have tried
RewriteEngine On
RewriteCond %{QUERY_STRING} !(^|&)testword($|&) [NC]
RewriteRule https://example.com/ [L,R=404]
And some other variations on the query string match, but none seem to return 404 when testing. Other questions I have found look for query string values/pairs and rewrite them, but no examples seem to exist for just a single value.
RewriteCond %{QUERY_STRING} !(^|&)testword($|&) [NC]
RewriteRule https://example.com/ [L,R=404]
There are a few issues here:
The CondPattern in your condition is negated (! prefix), so it's only successful when the testword is not present in the query string.
The RewriteRule directive is missing the pattern (first) argument (or substitution (second) argument depending on how you look at it). The RewriteRule directive matches against the URL-path only.
When you specify a non-3xx status code for the R flag, the substitution is ignored. You should specify a single hyphen (-) to indicate no substitution.
To test that the whole word "testword" exists anywhere in the query string, you can use the regex \btestword\b, where \b is a word boundary. Or maybe you simply want the regex testword, to match "testword" literally anywhere, including when it appears as part of another word? In comparison, the regex (^|&)testword($|&) only matches "testword" as a complete, valueless parameter, so it would miss instances where "testword" appears as a parameter name with a value (eg. testword=1) or as a parameter value (eg. q=testword).
Try the following instead:
RewriteCond %{QUERY_STRING} \btestword\b [NC]
RewriteRule ^$ - [R=404]
This matches the homepage only (ie. empty URL-path). The L flag is not required when specifying a non-3xx return status, since it is implied.
The - (second argument) indicates no substitution. As mentioned above, when specifying a non-3xx HTTP status, the substitution string is ignored anyway.
To test any URL-path then simply remove the $ (end-of-string anchor) on the RewriteRule pattern. For example:
RewriteCond %{QUERY_STRING} \btestword\b [NC]
RewriteRule ^ - [R=404]
If your homepage doesn't accept any query string parameters then you could simply reject the request (ie. 404 Not Found) when a query string is present. For example:
RewriteCond %{QUERY_STRING} .
RewriteRule ^$ - [R=404]
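The difference between the three candidate regexes discussed above is easier to see side by side. The following Python sketch runs each one, case-insensitively as with the NC flag, against a handful of made-up query strings. The labels and the matching_patterns helper are hypothetical, and Python's re module is assumed to treat \b the same way Apache's regex engine does.

```python
import re

# The three regexes discussed above, with illustrative labels.
PATTERNS = {
    "word boundary": r"\btestword\b",
    "parameter delimited": r"(^|&)testword($|&)",
    "substring": r"testword",
}


def matching_patterns(query_string):
    """Return the labels of the regexes that match the given query string."""
    return [
        label
        for label, regex in PATTERNS.items()
        if re.search(regex, query_string, re.IGNORECASE)  # like the NC flag
    ]


for qs in ("testword", "testword=1", "q=testword", "id=5&TestWord", "mytestwords"):
    print(f"{qs!r:18} {matching_patterns(qs)}")
```

In this sketch the parameter-delimited regex misses testword=1 and q=testword, while the plain substring regex is the only one that also flags mytestwords.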

Rewrite URL containing directory path and parameters to parameter based URL using both path and query string

I use htaccess to rewrite this path:
/inventory/products/tools/
to this url with query string:
/inventory.php?cat=products&type=tools
using the following rule:
RewriteRule ^inventory/(.*)/(.*)/? /inventory.php?cat=$1&type=$2 [L,R=301]
When I add a query string to my url path
/inventory/products/tools/?sort=pricehigh
and use this rule
RewriteCond %{QUERY_STRING} ^(.*)$ [NC]
RewriteRule ^inventory/(.*)/(.*)/? /inventory.php?cat=$1&type=$2&%1 [L,R=301]
I am getting a redirect loop and the urlstring is rewritten over and over
I am trying to end up with the following destination url
/inventory.php?cat=products&type=tools&sort=pricehigh
In the example rule above I am using R=301 in order to visualize the url.
In production I would use [L] only
Because the trailing slash in the pattern is optional and the pattern is not anchored at the end, the second (.*) can also match zero characters. So, due to the greediness of regular expressions, the first (.*) already matches products/tools when the URL ends in a slash.
The following should work:
RewriteRule ^inventory/([^/]+)/([^/]+)/?$ /inventory.php?cat=$1&type=$2 [QSA,L,R=302,NE]
([^/]+) demands one or more characters, out of the class of characters that contains everything but the /.
The NE/noescape flag seems necessary here for some reason, otherwise the resulting query string will contain ?cat=products%26type=..., with the & URL-encoded.
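The greediness problem is easy to reproduce outside Apache. This Python sketch compares the capture groups of the original pattern and the stricter one on the example URL-path; it assumes Python's re module backtracks the same way as Apache's regex engine for these patterns, and the captures helper is a name made up for the example.

```python
import re

ORIGINAL = re.compile(r"^inventory/(.*)/(.*)/?")           # pattern from the question
STRICT = re.compile(r"^inventory/([^/]+)/([^/]+)/?$")      # pattern from the answer


def captures(pattern, url_path):
    """Return the ($1, $2) pair for a URL-path, or None if there is no match."""
    match = pattern.search(url_path)
    return match.groups() if match else None


path = "inventory/products/tools/"   # URL-path as seen in .htaccess
print("original:", captures(ORIGINAL, path))
print("strict:  ", captures(STRICT, path))
```

With the original pattern the first group swallows products/tools and the second group is empty, whereas the stricter pattern yields products and tools as separate captures.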

htaccess RewriteRule : Problem to omit all after 1st argument

Goal: Want to rewrite all URLs of type
https://www.example.com/page/1234/?/blog/foo/bar/
to
https://www.example.com/page/1234/
In .htaccess I tried many variations along the line
RewriteEngine On
RewriteBase /
RewriteRule ^page/(\d+)/(.*)$ /page/$1 [R=301,L]
Using an .htaccess tester I see that at least the matching pattern is valid.
I would expect that the rewrite would not include anything after $1, but it does, and show the complete original URL.
What am I missing?
https://www.example.com/page/1234/?/blog/foo/bar/
Everything after the first ? is the query string part of the URL. By default, Apache passes the query string unaltered from the request to the target URL (unless you create a new query string yourself on the RewriteRule substitution). This explains why you are seeing the same query string on the target URL, without seemingly doing anything with it.
Incidentally, the RewriteRule pattern matches against the URL-path only, which notably excludes the query string. To match the query string in mod_rewrite you need an additional condition that checks the QUERY_STRING server variable.
On Apache 2.4+ you can use the QSD (Query String Discard) flag to remove the query string from the target URL. Or, specify an empty query string on the substitution - by including a trailing ? (the ? itself does not appear on the resulting URL).
For example (on Apache 2.4):
RewriteCond %{QUERY_STRING} .
RewriteRule ^page/(\d+)/ /page/$1/ [QSD,R=301,L]
The RewriteCond directive checks for the presence of a query string, which is necessary to prevent a redirect loop.
The trailing (.*)$ on the RewriteRule pattern was superfluous.
You had omitted the trailing slash on the end of the substitution (that is present on the example URL). This would have also prevented a redirect loop, but as mentioned, this is not as per your example. (Alternatively, you could include the slash in the captured backreference.)
If you are still on Apache 2.2 then you would need to include a trailing ? instead of the QSD flag. For example:
RewriteRule ^page/(\d+)/ /page/$1/? [R=301,L]
You will need to clear your browser cache before testing, as 301 (permanent) redirects are cached persistently by the browser. For this reason, it is often easier to first test with 302 (temporary) redirects.
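As a quick sanity check of the reasoning above, this Python sketch splits the example URL the way the server would and applies the condition and pattern from the Apache 2.4 example. It is a simplified model under the assumption that urlsplit and Python's re module are close enough to Apache's behaviour here; redirect_target is a hypothetical helper name.

```python
import re
from urllib.parse import urlsplit


def redirect_target(url):
    """Return the redirect target, or None if the rule pair would not fire.

    Loosely mimics:
        RewriteCond %{QUERY_STRING} .
        RewriteRule ^page/(\\d+)/ /page/$1/ [QSD,R=301,L]
    """
    parts = urlsplit(url)
    # The condition requires at least one character in the query string.
    if not re.search(r".", parts.query):
        return None
    # .htaccess patterns see the URL-path without the leading slash.
    match = re.search(r"^page/(\d+)/", parts.path[1:])
    if match is None:
        return None
    return f"{parts.scheme}://{parts.netloc}/page/{match.group(1)}/"  # query discarded


url = "https://www.example.com/page/1234/?/blog/foo/bar/"
print(urlsplit(url).query)            # everything after the first "?"
target = redirect_target(url)
print(target)
print(redirect_target(target))        # second request: no query string, no redirect
```

The last call returns None, which is the point of the condition: once the query string is gone the rule no longer fires, so there is no redirect loop.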