mod redirect rule - apache

I just want to redirect a URL using mod_rewrite rules. I have applied the rules below, excluding the R=301 flag.
Example:
from http:///webapp/wcs/stores/servlet/en/marksandspencer to http:///en/marksandspencer
I am using these rules for the redirect:
RewriteEngine on
RewriteCond %{REQUEST_URI} !^(/)?$
RewriteCond %{REQUEST_URI} !^/webapp.*$
RewriteCond %{REQUEST_URI} !^/wcsstore.*$
RewriteRule ^/(.*)$ /webapp/wcs/stores/servlet/$1 [PT,NC,L,QSA]
RewriteRule ^/webapp/wcs/stores/servlet/(.*) /$1 [NE,L,QSA]
RewriteRule ^(/)?$ /webapp/wcs/stores/servlet/en/marksandspencer [PT,NC,L]

It's not clear what you're trying to do, but if these rules are in an .htaccess file (per-directory context), the leading slash is stripped from the URI before it is matched against a RewriteRule pattern. Also, one of your rules adds /webapp/wcs/stores/servlet/ to the beginning of the URI, and the very next rule removes it again. This will probably cause a loop.
Taking a wild guess at what you are trying to do, I think you need to add a condition to the 2nd rule, and remove the leading slashes:
# internally rewrite URI by appending "/webapp/wcs/stores/servlet/" to the front
RewriteCond %{REQUEST_URI} !^(/)?$
RewriteCond %{REQUEST_URI} !^/webapp.*$
RewriteCond %{REQUEST_URI} !^/wcsstore.*$
RewriteRule ^(.*)$ /webapp/wcs/stores/servlet/$1 [PT,NC,L,QSA]
# if a request is made with "/webapp/wcs/stores/servlet/" in it, redirect to a URI with it stripped
RewriteCond %{THE_REQUEST} ^(GET|POST)\ /webapp/wcs/stores/servlet/
RewriteRule ^webapp/wcs/stores/servlet/(.*) /$1 [R=301,L,QSA]
RewriteRule ^$ /webapp/wcs/stores/servlet/en/marksandspencer [PT,NC,L]
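To see why the leading slashes matter, here is a rough Python sketch (not Apache itself) that applies the old and new patterns to the path as a RewriteRule would see it. It assumes per-directory (.htaccess) context, where the path arrives without its leading slash; the helper name rule_matches is made up for illustration.

```python
import re

def rule_matches(pattern, path):
    """Return the first captured group if the rule pattern matches, else None."""
    m = re.search(pattern, path)
    return m.group(1) if m else None

# Assumption: per-directory (.htaccess) context, so no leading slash on the path.
path = "webapp/wcs/stores/servlet/en/marksandspencer"

# Pattern from the question: anchored on a leading slash that is not there.
print(rule_matches(r"^/webapp/wcs/stores/servlet/(.*)", path))

# Pattern from the answer: leading slash removed, so the group captures the rest.
print(rule_matches(r"^webapp/wcs/stores/servlet/(.*)", path))
```

Under that assumption the first pattern never matches, while the second captures the remainder of the path for use as $1.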

Mod_rewrite is adding /var/www to the resulting URL

I have some problems with the Apache mod_rewrite rules. Whenever I try to go to https://example.com// (see double slashes at the end) it redirects to a 301 page but it's adding the location of the directory, i.e. https://example.com/var/www/my-domain.com/html which is not desirable.
Here is my .htaccess file:
ErrorDocument 404 /views/pages/404.php
RewriteEngine on
RewriteCond %{HTTPS} !=on
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
RewriteCond %{HTTP_HOST} ^www\.(.*)$ [NC]
RewriteRule ^(.*)$ http://%1/$1 [R=301,L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)/$ /$1 [L,R=301]
RewriteCond %{THE_REQUEST} \s/+(.*?)/+(/\S+) [NC]
RewriteRule ^(.*) [L,R=404]
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s/{2,} [NC]
RewriteRule ^(.*) $1 [R=301,L]
RewriteRule ^contact-us/?$ views/pages/contact.php [NC,L]
Same happens when I go to https://example.com//contact-us.
https://example.com/contact-us// is redirecting well to https://example.com/contact-us and https://example.com//contact-uss is redirecting well to the 404 page.
If one needs further information let me know.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s/{2,} [NC]
RewriteRule ^(.*) $1 [R=301,L]
You are missing the slash prefix on the substitution. This results in a relative-path substitution (since the $1 backreference does not contain the slash prefix), to which mod_rewrite prefixes the directory-prefix (ie. /var/www/example.com/html). This would result in the malformed redirect you are seeing. The RewriteRule should be written as:
RewriteRule (.*) /$1 [R=301,L]
(The ^ anchor on the RewriteRule pattern is unnecessary here.)
However, the following redirect is also invalid:
RewriteCond %{THE_REQUEST} \s/+(.*?)/+(/\S+) [NC]
RewriteRule ^(.*) [L,R=404]
You are missing the substitution argument altogether. [L,R=404] will be seen as the substitution string (not the flags, as intended). This would also result in a malformed rewrite/redirect. The RewriteRule should be written as:
RewriteRule (.*) - [R=404]
Note the - (single hyphen) is used as the substitution argument (which is later ignored). When specifying a non-3xx response code, the L flag is implied.
However, I'm curious what it is you are trying to do here, as you appear to be "accepting" multiple slashes in one directive (by reducing them), but then rejecting multiple slashes in another directive (with a 404)? Why not reduce all sequences of multiple slashes wherever they occur in the URL-path?
For example, replace the following (modified code):
# Remove trailing slash from URL (except files and directories)
# >>> Why files? Files don't normally have trailing slashes
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)/$ /$1 [L,R=301]
# Reject multiple slashes later in the URL or 3+ slashes at the start of the URL
RewriteCond %{THE_REQUEST} \s/+(.*?)/+(/\S+) [NC]
RewriteRule (.*) - [R=404]
# Reduce multiple slashes at the start of the URL
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s/{2,} [NC]
RewriteRule (.*) /$1 [R=301,L]
With something like the following (depending on requirements):
# Reduce sequences of multiple slashes to a single slash in the URL-path
# NB: This won't work to reduce slashes in the query string (if that is an issue)
RewriteCond %{THE_REQUEST} //+
RewriteRule (.*) /$1 [R=302,L]
# Remove trailing slash from URL (except directories)
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)/$ /$1 [R=302,L]
Note that I've reversed the directives so that slashes are reduced before the final trailing slash is removed.
Test with 302s to avoid caching issues. And clear your browser cache before testing.
UPDATE: If double slashes could ever (legitimately) occur in the query string portion of the URL then the above will result in a redirect loop since the condition checks for multiple slashes anywhere in the URL (including the query string), whereas the RewriteRule only reduces multiple slashes in the URL-path. If you need to allow multiple slashes in the query string then change the CondPattern from //+ to \s[^?]*//+ to specifically check the URL-path only, not the entire URL. In other words:
RewriteCond %{THE_REQUEST} \s[^?]*//+
RewriteRule (.*) /$1 [R=302,L]
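The difference between the two CondPatterns can be checked outside Apache. The following Python sketch runs both regular expressions against a few made-up request lines (the sample URLs and the helper name check are assumptions for illustration). It only models the condition, not the redirect itself.

```python
import re

ANYWHERE = re.compile(r"//+")           # first CondPattern: whole request line
PATH_ONLY = re.compile(r"\s[^?]*//+")   # second CondPattern: URL-path only

def check(request_line):
    """Return (anywhere_matches, path_only_matches) for a THE_REQUEST value."""
    return (ANYWHERE.search(request_line) is not None,
            PATH_ONLY.search(request_line) is not None)

samples = [
    "GET //contact-us HTTP/1.1",                        # slashes in the path
    "GET /search?next=http://example.com/ HTTP/1.1",    # slashes only in the query
    "GET /contact-us HTTP/1.1",                         # no repeated slashes
]
for line in samples:
    print(line, check(line))
```

The second sample is the looping case: the broad pattern keeps matching because of the query string, while the path-only pattern does not.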

Apache manipulating query string on https

The following rules work on http but not on https.
RewriteCond %{QUERY_STRING} "page=" [NC]
RewriteRule (.*) /$1? [L]
RewriteRule ^/path/file.html$ https://www.domain.tld/path/file/ [R=301,L]
Why doesn't the query string part work on https?
Based on the comments of your question, it appears that you just want to remove the query string anytime there is a page parameter. This type of rewrite rule doesn't strip off or change the URL unless there is a redirect. So if you don't add R=301 or R to the rewrite rule, the query string will not be removed. All of the following worked on my server to remove the query string, and my server is 100% HTTPS:
RewriteCond %{QUERY_STRING} "page=" [NC]
RewriteRule (.*) /$1? [R=301,L]
Or you may be able to use the QSD flag instead of question mark:
RewriteCond %{QUERY_STRING} "page=" [NC]
RewriteRule (.*) /$1 [R=301,L,QSD]
Or you may be able to use something like this:
RewriteCond %{QUERY_STRING} "page=" [NC]
RewriteRule .* /? [R=301,L]
Or just R (for 302) instead of R=301:
RewriteCond %{QUERY_STRING} "page=" [NC]
RewriteRule (.*) /$1? [R,L]
But in no case was the query string removed unless I used a redirect.
Really all you need is QSD http://httpd.apache.org/docs/current/rewrite/flags.htm
RewriteCond %{QUERY_STRING} "page=" [NC]
RewriteRule (.*) /$1 [L,QSD]
I'm not sure why it would work on http but not https, unless you're using separate vhosts for http and https and the settings are slightly different.
Remember that Rewrite rules are internal unless using the R flag. When you use the R flag it tells the browser to go to a different page causing a full server/client round-trip. Otherwise, it just changes the request and proceeds as normal.
I'm not really strong with .htaccess files, and I can't explain what's going on behind the scenes, but in my opinion your .htaccess file should look like this:
RewriteCond %{QUERY_STRING} page= [NC]
RewriteRule (.*) /$1? [L]
RewriteRule ^path/file.html$ https://www.domain.tld/path/file/ [R=301,L]
It works perfectly for:
http://example.com?page=1
https://example.com/?page=1
http://example.com/path/file.html
https://example.com/path/file.html
Tested (with love) here

RewriteRule with greedy matches trying to redirect root to a page, and rewrite all other requests to https

I am trying to use apache2 rewrite rules to do two things:
rewrite (or redirect) / (root) requests to a specific page: /index
all requests (including / and /index) on port 80, are rewritten to HTTPS
Here is my failed attempt
RewriteEngine On
RewriteRule ^$ https://%{HTTP_HOST}%/index [R]
RewriteCond %{REQUEST_URI} !^$
RewriteCond %{HTTPS} off
RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI}
My guess is that the last rule was being too aggressive/greedy (everything is rewriting to https) - so I tried to prevent the root match from getting there by adding a RewriteCond that matches on not root, but it is not working.
Edit: is my approach wrong? Is there an obvious mistake in Rules/Conditions?
RewriteCond %{REQUEST_URI} !^$
This condition will always be true: %{REQUEST_URI} always begins with a slash, because it does not have the per-directory prefix removed the way the path matched by a per-directory RewriteRule does. In other words, add the slash to the pattern.
For completeness here is the final configuration:
RewriteEngine On
RewriteRule ^$ https://%{HTTP_HOST}/index
RewriteCond %{REQUEST_URI} !^/$
RewriteCond %{HTTPS} off
RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI}
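The point about the missing slash can be checked with a small Python sketch. It models a negated RewriteCond as "the pattern does not match" and assumes that %{REQUEST_URI} holds "/" for a request to the site root; the helper name negated_cond is invented for this example.

```python
import re

def negated_cond(pattern, test_string):
    """Model of 'RewriteCond ... !pattern': true when the pattern does NOT match."""
    return re.search(pattern, test_string) is None

root = "/"          # assumed value of %{REQUEST_URI} for the site root
other = "/index"    # any other request

# Original condition: !^$ is true even for the root, so it filters nothing.
print(negated_cond(r"^$", root), negated_cond(r"^$", other))

# Corrected condition: !^/$ is false for the root and true for everything else.
print(negated_cond(r"^/$", root), negated_cond(r"^/$", other))
```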

.htaccess redirect loop issue

If I go to domain.com, site redirects to domain.com/en, which is expected. But then the last rule kicks in and throws it in a loop making my url look something like this:
http://domain.com/en/?lang=en&request=&site=basecamp&lang=en&request=&site=basecamp&lang=en&request=&site=basecamp&lang=en&request=&site=basecamp
.htaccess file
Options +FollowSymlinks
RewriteEngine on
RewriteBase /
RewriteCond %{REQUEST_URI} !/(content|images|css|js|fonts|pdfs)/
RewriteRule /([^.]+\.(jpe?g|gif|bmp|png|css|js|eot|svg|ttf|ico|pdf))$ /$1 [NC,R,L]
RewriteCond %{REQUEST_URI} !(content|images|css|js|fonts|pdfs)/.*
RewriteRule !^[a-z]{2}/ /en/ [NC,L,R]
RewriteCond %{REQUEST_URI} !/(init\.php|content|images|css|js|fonts|pdfs)/
RewriteRule ^([a-z]{2})/(.*)$ init.php?lang=$1&request=$2&site=basecamp[L,QSA]
Why is the htaccess file redirecting to /?GET_VARS instead of /init.php?GET_VARS ?
And how can I avoid this loop?
Try changing your last condition by removing the leading and trailing slashes. You've rewritten your URI to:
/init.php
But there's no trailing slash like there is in the condition that you've provided, so the rule gets applied again. It should look like:
RewriteCond %{REQUEST_URI} !(init\.php|content|images|css|js|fonts|pdfs)
Not sure why you insist on having the trailing slash at the end of your condition here:
# with this / here, it will never match init.php --------v
RewriteCond %{REQUEST_URI} !/(init\.php|content|images|css|js|fonts|pdfs)/
But you can solve it by just excluding init.php directly:
RewriteCond %{REQUEST_URI} !init.php
So, just so it's clear, your last set of conditions/rule will look like:
RewriteCond %{REQUEST_URI} !init.php
RewriteCond %{REQUEST_URI} !/(content|images|css|js|fonts|pdfs)/
RewriteRule ^([a-z]{2})/(.*)$ init.php?lang=$1&request=$2&site=basecamp [L,QSA]
Or worst case, just add this in the very beginning of your rules:
RewriteCond %{ENV:REDIRECT_STATUS} 200
RewriteRule ^ - [L]
Which prevents any kind of rewrite engine looping altogether.
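The claim that the original condition never excludes init.php can be checked in isolation. This Python sketch tests only the regular expressions, not how Apache processes the whole rule set; the sample URIs and the helper name negated_cond_passes are assumptions for illustration.

```python
import re

# CondPatterns copied from the question and from the suggested fix.
ORIGINAL = r"/(init\.php|content|images|css|js|fonts|pdfs)/"
FIXED = r"init.php"

def negated_cond_passes(pattern, request_uri):
    """A '!pattern' RewriteCond passes when the pattern is NOT found."""
    return re.search(pattern, request_uri) is None

for uri in ("/init.php", "/images/logo.png", "/en/about"):
    print(uri, negated_cond_passes(ORIGINAL, uri), negated_cond_passes(FIXED, uri))
```

For /init.php the original pattern still passes (the trailing slash is never present), whereas the simpler pattern blocks it.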

Apache Rewrite Rules - Path Style

I am coming from the IIS world to Apache and would appreciate some help on the rewrite rules.
I want this relative path:
/index.php?go=order&id=12345
to be rewritten as:
/go/order/id/12345
Also, if there are more parameters, they should be converted to path format:
/index.php?go=order&id=12345&action=process
becomes
/go/order/id/12345/action/process
How do I achieve this please? Thanks for any input.
Try putting this in your vhost config:
RewriteEngine On
# Start converting query parameters to path
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\?[^\ ]+
RewriteCond %{QUERY_STRING} ^([^=]+)=([^&]+)&?(.*)$
RewriteRule ^(.*)$ $1/%1/%2?%3 [L]
# done converting, remove index.php and redirect browser
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\?[^\ ]+
RewriteCond %{QUERY_STRING} ^$
RewriteRule ^/index.php/(.*)$ /$1 [R=301,L]
# internally rewrite paths to query strings
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !^/index\.php
RewriteRule ^/([^/]+)/([^/]+)/?(.*) /$3?$1=$2 [L,QSA]
# No more path, rewrite to index.php
RewriteRule ^/$ /index.php [L]
With these rules, if you type a URL like http://yourdomain.com/index.php?a=b&1=2&z=x into your browser, the browser gets redirected to http://yourdomain.com/a/b/1/2/z/x. When the clean-looking URL is requested, the 2nd set of rules internally rewrites it back to /index.php?a=b&1=2&z=x. If you want to put these rules in an htaccess file (in your document root), you need to remove all the leading slashes in the RewriteRule patterns. So ^/ needs to be ^ in the last 3 rules.
Note that if you simply go to http://yourdomain.com/index.php, without a query string, nothing gets rewritten.
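The pair-by-pair conversion these rules perform can be modelled roughly in Python. This is a simplified sketch, not Apache: it reuses the two regular expressions from the rules above and loops until they stop matching. The function names are made up, and putting each new pair in front of the existing query string is an assumption about how QSA combines them.

```python
import re

QS_PAIR = re.compile(r"^([^=]+)=([^&]+)&?(.*)$")     # from the QUERY_STRING condition
PATH_PAIR = re.compile(r"^/([^/]+)/([^/]+)/?(.*)")   # from the internal rewrite rule

def query_to_path(path, query):
    """Move one name=value pair per pass from the query string onto the path."""
    while True:
        m = QS_PAIR.match(query)
        if not m:
            break
        path = f"{path}/{m.group(1)}/{m.group(2)}"
        query = m.group(3)
    # Final step: drop the "/index.php" prefix, as the redirect rule does.
    return re.sub(r"^/index\.php/(.*)$", r"/\1", path)

def path_to_query(path):
    """Move one /name/value pair per pass from the path into the query string."""
    query = ""
    while True:
        m = PATH_PAIR.match(path)
        if not m:
            break
        pair = f"{m.group(1)}={m.group(2)}"
        query = pair if not query else f"{pair}&{query}"  # assumed QSA ordering
        path = "/" + m.group(3)
    return ("/index.php" if path == "/" else path), query

print(query_to_path("/index.php", "go=order&id=12345&action=process"))
print(path_to_query("/go/order/id/12345"))
```

In this model the parameters come back in reverse order on the return trip, which usually does not matter to the script reading them.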