I have an AEM 6.3 instance running behind Apache 2.4.6 with the Dispatcher module installed. Everything works, but now I need to strip all query params from every URL that ends with ".html".
This may sound simple to accomplish, but I came across an issue I can't resolve. This is the rewrite rule I'm using to remove all the query params from URLs ending in .html:
RewriteRule ^/(.*)\.html$ /$1.html [QSD]
Technically, one could see this rewrite as not a rewrite actually, because it is sending the original request to the same URL, but the flag QSD is for dropping all query params.
The problem is that if I reload my Apache instance with this rule included, I start getting errors like this:
[Wed Jun 10 14:53:35.698908 2020] [authz_core:error] [pid 31733] [client 54.209.162.6:61649] AH01630: client denied by server configuration: /etc/clientlibs, referer: https://my.domain.com/etc/clientlibs/mygroup/some/simple/page.html
I know some people had issues like this when migrating from Apache 2.2 to 2.4. This is not my case, and I have also checked my vhost configuration. I don't have directives from Apache 2.2 like "Order deny,allow" or "Allow from all". I'm using "Require all granted".
One weird thing in the AEM logs is that when my rewrite rule is not in place, I can see error.log reporting that "/etc/clientlibs/mygroup/some/simple/page.html" is found. But if I add the rule and reload Apache, I see this in the logs:
10.06.2020 10:16:40.085 *INFO* [54.209.162.6 [1591798600081] GET /etc/clientlibs/mygroup/some/simple/page/jcr:content.json HTTP/1.1] org.apache.sling.engine.impl.SlingRequestProcessorImpl service: Resource /etc/clientlibs/mygroup/some/simple/page/jcr:content.json not found
It is as if the .html extension were stripped from the URL, and since there is no extension, AEM (or rather Sling) tries to use the default content resolver, which is JSON.
Why don't you just use
RewriteRule ^ %{REQUEST_URI} [L,R,QSD]
(maybe the redirect is not needed in your case... but it makes things clear to the browser).
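One caveat with the rule above, offered as an untested assumption about how it would behave on the asker's setup: the redirect target is the same path, so a request that arrives without a query string would still match and be redirected to itself. A guarded variant, limited to .html paths as the question asks, might look like the sketch below (R=302 is chosen only for illustration):

```apache
RewriteEngine On
# Act only when the request actually carries a query string,
# so the redirected (query-free) request does not match again.
RewriteCond %{QUERY_STRING} .
# Limit the rule to paths ending in .html.
RewriteCond %{REQUEST_URI} \.html$
# Redirect to the same path; QSD (Apache 2.4+) discards the query string.
RewriteRule ^ %{REQUEST_URI} [R=302,L,QSD]
```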
Or, if you just want to make sure that your request is cached in the dispatcher and not passed through to AEM each time, use:
/filter {
/0001 { /type "deny" /method "POST" /url "/etc/*" }
/0002 { /type "allow" /method "GET" /url "/etc/*" /query "a=*" }
}
in your dispatcher config (see https://docs.adobe.com/content/help/en/experience-manager-dispatcher/using/configuring/dispatcher-configuration.html for details).
I was finally able to fix my issue, even though I still don't understand the full picture. This is my final condition and rule:
RewriteCond %{QUERY_STRING} ^.
RewriteRule ^/(.*)\.html$ /$1.html [QSD,PT]
Adding "PT" along with "QSD" stops Apache from returning the "client denied" error. The condition on QUERY_STRING just makes sure Apache only touches requests that really have query params in the URL, or, technically, a query string of at least one character.
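A possible explanation, stated as an assumption and not verified on this setup: in server or virtual-host context, mod_rewrite may treat a substitution that starts with "/" as a filesystem path when its first path segment exists on disk. On a Linux host /etc exists, which would fit the "client denied by server configuration: /etc/clientlibs" message. PT asks mod_rewrite to hand the result back as a URI so that later URL mapping (and the Dispatcher handler) still sees it. The rule with that reasoning spelled out as comments:

```apache
# Virtual-host context (not .htaccess), as in the question.
RewriteEngine On

# Only touch requests that carry a non-empty query string.
RewriteCond %{QUERY_STRING} .

# QSD drops the query string.
# PT passes the rewritten value on as a URI instead of letting it be
# interpreted as a file path such as /etc/... on the local disk.
RewriteRule ^/(.*)\.html$ /$1.html [QSD,PT]
```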
What about adding a condition just before the rewrite rule so that it is skipped for /etc/clientlibs? RewriteCond %{REQUEST_URI} !^/etc/clientlibs.*
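A minimal sketch of where such a condition would sit, assuming the rule from the question is otherwise kept as posted. RewriteCond lines apply only to the RewriteRule that immediately follows them, and several conditions are ANDed by default. Note that this only sidesteps the one path seen in the error log; other paths under /etc would presumably still hit the same problem.

```apache
RewriteEngine On
# Leave client library paths untouched.
RewriteCond %{REQUEST_URI} !^/etc/clientlibs
# Only requests that actually carry a query string.
RewriteCond %{QUERY_STRING} .
# Both conditions above must hold for this rule to run.
RewriteRule ^/(.*)\.html$ /$1.html [QSD]
```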
Related
I'm using an Apache RewriteMap to permanently redirect 60 or so urls. In one development environment, the below configuration works flawlessly while in another development environment it doesn't work at all. Most notably, the last RewriteCond never passes and unfortunately the logging options I've attempted are no help. With LogLevel debug rewrite:trace8, I can see that the RewriteCond just before the map is receiving the expected input, but the map nonetheless returns no match:
[Thu Apr 19 19:35:19.109789 2018] [rewrite:trace4] [pid 11188] mod_rewrite.c(470): [client 127.0.0.1:62369] 127.0.0.1 - - [server.dev/sid#7f7719da0d50][rid#7f7719fc8000/initial] [perdir /html/path/] RewriteCond: input='/help_center/help_center.php?' pattern='^/?(.*[^\\?])\\??/?$' => matched
[Thu Apr 19 19:35:19.109793 2018] [rewrite:trace4] [pid 11188] mod_rewrite.c(470): [client 127.0.0.1:62369] 127.0.0.1 - - [server.dev/sid#7f7719da0d50][rid#7f7719fc8000/initial] [perdir /html/path/] RewriteCond: input='NOTFOUND' pattern='!NOTFOUND' [NC] => not-matched
For debugging purposes I've simplified things to pass a constant key into the map, but the map nonetheless returns no substitute value. I've also tried simplifying the map file, adding all the possible variations of a key it could receive (with and without leading/trailing slashes and ?). I've tried renaming the map file and extension, renaming the map itself, moving the map file outside of the public directory, all with no change in results. The map file is readable as is the directory it's in, Apache starts up error-free with the config and yes I've been restarting it when testing config changes.
What's left to try? Both systems run CentOS 7, Apache 2.4, one works and one doesn't. Configs below
Apache server-level config declaring map
RewriteMap help_center txt:/path/to/rewritemap/help_center.map
.htaccess
RewriteEngine on
RewriteCond %{REQUEST_URI} !(\.(js|css|less|png|swf|flv|jpg|svg|ico))$
RewriteCond %{REQUEST_URI}?%{QUERY_STRING} ^/?(.*[^\?])\??/?$
RewriteCond ${help_center:%1|NOTFOUND} !NOTFOUND [NC]
RewriteRule ^.*$ /${help_center:%1}/ [QSD,L,NC,R=301]
Abbreviated contents of help_center.map
help_center/help_center.php help-center
help_center/help_center.php?1_7_q-1 help-center/article/authorized-dealer
pages/appliance_installation help-center/article/installation-services
UPDATE
After 6 hours of debugging, I've finally been able to get the RewriteMap to match. By moving the RewriteMap directive to the <VirtualHost _default_:443> scope of /etc/conf.d/ssl.conf, urls are rewriting as expected.
Why is this the case?
The config between these two environments is very similar with the exception of virtual hosting. Both environments run SSL and redirect all requests to the SSL site. The environment that works is configured without a name based virtual host config (i.e. one site for the server) while the environment that didn't work is running name based virtual hosting. I have a single hypothesis related to this:
While I'd expect the root directive to apply to the default site and all virtual sites (including SSL), perhaps the RewriteMap directive must be virtual host scoped to be referenced by a virtual host. Not sure this makes sense, but I couldn't find any documentation to clarify. After testing, turning off NameVirtualHost changes nothing :/
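The mod_rewrite documentation notes that rewrite configuration from the main server context is not inherited by virtual hosts by default, which would be consistent with the hypothesis above. A sketch of the two usual ways around that, reusing the map name and path from the question (whether option B suits this particular setup is an assumption):

```apache
# Main server config: the map as declared in the question.
RewriteMap help_center "txt:/path/to/rewritemap/help_center.map"

<VirtualHost _default_:443>
    RewriteEngine On

    # Option A: declare the map inside the virtual host that uses it.
    RewriteMap help_center "txt:/path/to/rewritemap/help_center.map"

    # Option B (instead of A): inherit maps, conditions and rules
    # from the main server config.
    # RewriteOptions Inherit
</VirtualHost>
```

With either option in place, the .htaccess rules can keep referring to ${help_center:...} unchanged.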
Related: It seems that referencing a non-existent RewriteMap logs no error. Perhaps it was buried while I had debug logging turned on, but is there a config for logging RewriteMap reference errors? That would have helped narrow my debugging much sooner.
Finally, for examining parsed Apache config run this from command line: httpd -DDUMP_CONFIG -k start or to view in vi httpd -DDUMP_CONFIG -k start | vi -. I was able to use this to confirm that my RewriteMap directive was being loaded despite it still not matching.
I'm not entirely sure this should qualify as an answer, but I THINK this is what has made it work: Remove the QSD argument and append a ? to manually discard the query string. This change was made because we have a server running 2.2 that would error out entirely (rather than not match), and it seems that a side effect is that our 2.4 servers are now matching. I'm sure it's not voodoo, but unfortunately I can't explain it with confidence.
## QSD not available in apache 2.2, add a ? to the end of the rewrite to discard
RewriteRule ^.*$ /${help_center:%1}/? [L,NC,R=301]
I am not using Virtual Hosts or anything fancy, though I have some .htaccess files set up. Following is my rewrite rule in httpd.conf:
RewriteEngine On
RewriteCond %{REQUEST_URI} !^/app/smsapi [NC]
RewriteRule (.*) https://www.example.com/uri=%{REQUEST_URI} [R,L]
This rule basically says that if the uri does not begin with /app/smsapi then fire the rewrite. But when I restart the server and try it I get some weird results.
When I request the URL https://www.example.com/app/smsapi/index.php, I get a 200 Success code which is as expected. But, when I request the URL http://www.example.com/app/smsapi/index.php, it redirects to https://www.example.com/uri=/app/smsapi/index.php. So it actually fires the rule even though the request URI does not satisfy the condition.
So I decided to turn off the rewrite rules and give it a go. Now both of those URLs give me a 200 Success code.
Now, I know this problem cannot be solved easily by other people who do not have access to the server, but am I right in saying that this is certainly a problem with REQUEST_URI not firing correctly? I have shown that without the rewrite rule, everything works normally, but with the rewrite rule, the second URL is redirected. Therefore, the redirection must be caused by the rewrite rule? Additionally, the condition for redirect rule is not satisfied. Doesn't this prove that there is something wrong with the functioning of the rewrite rule?
Is there any other possibility?
UPDATE
Something very weird is happening here. I set up a local server and tried the same rule, and what I got for the URL http://192.168.0.112/app/ is
http://192.168.0.112/uri=/uri=/uri=/uri=/uri=/uri=/uri=/uri=/uri=/uri=/uri=/uri=/uri=/uri=/uri=/uri=/uri=/uri=/uri=/uri=/uri=/app/
which is correct because as long as the URL is not like /app/smsapi, it should redirect it. Wonder why this is not happening on the real server. Also, where you insert these rules seems to make a difference. (I am only including these rules after the LoadModule command).
On localhost, if I put these rules either above or below the Directory section, it won't work. But, if I include it inside the Directory section it will.
On server, if I include the rules inside the Directory section, they won't work. But, if I include them either above or below the Directory section, they start working.
This seems to me to be due to a difference in the versions. My localhost is an Ubuntu Desktop 16.04 running Apache 2.4.18. While the server is CentOS 6.8 running Apache 2.2.15.
But I think the mystery of why the redirect happens only once on the server (though it is configured to go up to 20 times) has something to do with https. That is also related to the original problem, in which the https redirect fires even when the condition does not match.
Clues anyone?
UPDATE
I updated the httpd.conf file with the same rules but I used http:// instead of https:// and it gave me the correct result with 20 redirects. That means I have isolated the problem to https.
You are reporting the exact issue in the first phrase: "I am not using Virtual Hosts or anything fancy though I have some .htaccess files setup"
.htaccess is "fancy" and overcomplicated, not virtualhosts.
If you had defined that RewriteCond in virtualhost in the first place it would work, but .htaccess is per-dir context (aka a nightmare) and the regex ^/ will never match in that context.
If you want to match REQUEST_URI in per-dir context (directory or .htaccess) you need to drop the initial slash, that is:
RewriteCond %{REQUEST_URI} !^app/smsapi [NC]
Also, consider that you may not need a RewriteCond at all for this:
RewriteRule ^(?!app/smsapi)(.*) https://www.example.com/uri=$1 [R,L]
I'm developing a webapp and for the static files I'm simply using apache at localhost while the backend is on a couchdb instance running at localhost:5984.
The webapp interacts with files from the backend all the time. When I test on Apache, all file requests to localhost:5984 get blocked due to the cross-domain policy, so the only way to get it working is to start the browser with flags that ignore that policy.
But I get stuck again when trying to test the app on mobile devices such as an iPad or iPhone.
Currently I have this on my .htaccess file.
RewriteEngine on
# these are 302 http redirections instead of serving as a proxy
RewriteRule auth http://localhost:5984/auth [L]
RewriteRule db/([\s\S]+) http://localhost:5984/db/$1 [L]
RewriteRule send/([\s\S]+) http://localhost:5984/send/$1 [L]
# these are just redirections to static files and work great
RewriteRule ^([a-z/.]+) _attachments/$1 [L]
RewriteRule ^$ _attachments/ [L]
As you can see I have really no idea on how to deal with apache configuration unfortunately.
But what is happening right now is that for some of these rules Apache simply redirects the page instead of serving it as a proxy, which causes the cross-domain issue.
Also, with the first auth rule I send POST and DELETE requests, and because it is a redirect instead of a proxy, the data being POSTed is not passed through.
So what I would like to achieve is to activate some kind of feature (if it exists) that makes Apache serve the page as if it were on the localhost domain instead of redirecting it. (I called this a proxy, but perhaps that's not even the right term; sorry for any mistakes in the nomenclature.)
Is it possible to achieve this?
Thanks in advance
Have a look at these links / options:
1. [P] flag:
http://httpd.apache.org/docs/current/rewrite/flags.html#flag_p
http://httpd.apache.org/docs/current/rewrite/proxy.html
2. mod_proxy (possibly -- but I think #1 should be enough if it's on the same server):
http://httpd.apache.org/docs/current/mod/mod_proxy.html
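As a rough sketch of the [P] approach applied to the rules from the question: this assumes mod_proxy and mod_proxy_http are loaded, and the anchored patterns (^auth$, ^db/, ^send/) are my tightening of the original unanchored ones, not something the asker wrote. With [P], Apache fetches the backend response itself, so the browser only ever talks to the Apache origin and POST/DELETE bodies are forwarded.

```apache
RewriteEngine on

# Proxy these paths to CouchDB instead of redirecting the browser.
# [P] implies [L], so no further rules run for a matching request.
RewriteRule ^auth$      http://localhost:5984/auth     [P]
RewriteRule ^db/(.+)$   http://localhost:5984/db/$1    [P]
RewriteRule ^send/(.+)$ http://localhost:5984/send/$1  [P]

# Static files keep being served locally, as before.
RewriteRule ^([a-z/.]+) _attachments/$1 [L]
RewriteRule ^$ _attachments/ [L]
```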
I am a newbie to Ubuntu and Apache. Can someone tell me how I could redirect to
www.mysite.com/drupal6
when a user visits www.mysite.com?
Thanks a lot.
Cheers.
If you are running Apache and Ubuntu, there is actually a really easy way to force this redirect using a simple php script.
Create an index.php file in the root of your server and paste the following code into it
<?php header("location: drupal6/") ?>
This will cause the site to auto-redirect to the drupal6 folder whenever it is visited.
This should work. Create a file in the root folder of your server called .htaccess - the dot at the beginning is very important as this helps the server identify the file as a hidden / system config file.
Open the file and paste the following lines of code in :
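Options +FollowSymlinks
RewriteEngine on
RewriteCond %{SERVER_PORT} 80
# Skip requests already under /drupal6/ so the redirect does not loop
RewriteCond %{REQUEST_URI} !^/drupal6/
# The target needs a scheme; without it Apache treats it as a relative path
RewriteRule ^(.*)$ http://www.mysite.com/drupal6/$1 [R,L]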
This should force all traffic to the server to redirect to your custom folder.
A brief explanation of the .htaccess code
If you want rewrites to work, you have to enable the Rewrite Engine and tell the server to follow symlinks.
The second section establishes the rule - specifically applying it to all traffic on the standard web port of 80.
The final line tells the server to grab everything after the URL and append it to the new address (mysite.com/drupal6).
There's a lot more you can do with .htaccess files but you really need to Google for good examples to test out.
Look at Apache's mod_rewrite documentation. You will need a RewriteRule in your apache configuration at the minimum, you may also need RewriteCond's to define when the RewriteRule is used.
Your rewrite pattern will be rewriting the REQUEST_URI with something from: ^/$ to: /drupal6. The ^ and $ are essential to prevent Apache getting into an infinite loop while rewriting the base URI by only matching "/" and not "/anything-else".
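A minimal sketch of the rule described above, assuming it goes into the server or virtual-host configuration (where the pattern is matched against a path with its leading slash) and that a temporary redirect is acceptable:

```apache
RewriteEngine On
# ^/$ matches only the bare root URL, never /drupal6 or anything else,
# so the redirected request cannot match again.
RewriteRule ^/$ /drupal6/ [R=302,L]
```

In a .htaccess file in the document root the leading slash is stripped before matching, so the pattern there would be ^$ instead.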
I assume you're on a recent version of Ubuntu and Apache? If so, see the Apache 2.2 documentation on mod_rewrite.
On my site I have mod_rewrite rules to make the URLs more search engine friendly, and it all works fine on the frontend, but I'm getting errors in the error log like this
[Thu Jan 22 22:51:36 2009] [error] [client {IP ADDRESS HERE}] File does not exist: /{some rewritten directory}
The rules I'm using are rather simple, along the lines of
RewriteRule ^pages/(.*)_(.*).html$ page.php?id=$2
Is there a way to avoid these errors?
MultiViews could cause this. If it is enabled, Apache tries to find a file similar to the requested URI before passing the request along to mod_rewrite. So try to disable it:
Options -MultiViews
I don't think those errors have anything to do with mod_rewrite, they're just saying that a file doesn't exist. Plain old 404 errors.
Incidentally, shouldn't rewrite patterns normally start with a slash? Like so:
RewriteRule ^/pages/(.*)_(.*).html$ /page.php?id=$2
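Whether the pattern starts with a slash depends on where the rule lives, so both forms can be right. A side-by-side sketch, with the dot before "html" escaped and an [L] flag added (both are my additions, not part of the original rules):

```apache
# In .htaccess or a <Directory> block (per-directory context):
# the directory prefix is stripped, so there is no leading slash.
RewriteRule ^pages/(.*)_(.*)\.html$ page.php?id=$2 [L]

# In the main server config or a <VirtualHost> block:
# the pattern sees the full URL path, including the leading slash.
RewriteRule ^/pages/(.*)_(.*)\.html$ /page.php?id=$2 [L]
```

Only one of the two would be used, depending on the file it is placed in.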