Escaping spaces in mod_rewrite - apache

I have the following Apache mod_rewrite rule:
RewriteRule ^(.*) http://127.0.0.1:4321/$1 [proxy]
This works great; Apache forwards all requests to the CherryPy server I have running on the same machine.
Unfortunately, paths that contain a space cause problems. If I request /Sites/some%20site/image.png, Apache decodes the path and asks CherryPy for /Sites/some site/image.png, which confuses CherryPy.
Is there a way to specify in my RewriteRule that I'd like to re-escape spaces in the URL before forwarding the request to CherryPy?
EDIT: I found a reference to something that might help, but I went ahead and ducked the problem by replacing the spaces with underscores and having CherryPy do a conversion before serving the files.
I'd still like to know a better solution if anyone has one; unfortunately I'm on a deadline and don't have time to muck around with this myself at the moment. I may return to this later and post further updates when I find the time.

Please see http://tools.cherrypy.org/wiki/ModRewrite#Bewaretheencodingbug for the best known solution.
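For reference, one widely used way to re-escape the path before proxying is mod_rewrite's built-in int:escape map. This is only a sketch and not necessarily what the linked page recommends. It assumes the rule lives in the server or virtual host configuration, because RewriteMap is not allowed in .htaccess; the map name "escape" is arbitrary.

```apache
# Declare a map backed by the internal "escape" function, which
# hex-encodes special characters (a space becomes %20).
RewriteMap escape int:escape

RewriteEngine On
# Re-encode the already-decoded path before handing it to the backend.
RewriteRule ^/(.*) http://127.0.0.1:4321/${escape:$1} [P]
```

The pattern strips the leading slash because, in server context, the matched path starts with one and the substitution already supplies it.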

Related

Apache: .htaccess vs vhost conf file for blocking URLs

I need to block some old URLs that are generating a lot of traffic on my web server (Apache). For example, I want to block all requests like https://example.com/xxxxxx/
I can't do that with iptables, so I am using mod_rewrite with a rule in my .htaccess.
That still consumes a lot of resources, and I am wondering whether there is a better way to block the requests before they reach Apache, or a more efficient way to do it within Apache. For example, I heard that parsing .htaccess files consumes resources, so I am not sure whether using the vhost .conf file would help or whether it is really the same.
Any advice on how I can block requests by URL?
Thank you experts!
Distributed configuration files certainly cause more load than a single, central, static configuration, but the difference is not night and day. The bigger issue with a distributed configuration is the effort it takes to keep an overview of it and maintain it.
If you can keep those requests away from the http server altogether, you will certainly see a bigger difference. You could consider using a frontend server, something like nginx or HAProxy, that acts as a gatekeeper and only forwards the requests you actually want to respond to. This makes little sense on a single system, though; you would need two separate cloud services or even separate systems for that.
The best approach would be to add something like this to your httpd / vhost.conf file:
RewriteEngine on
RewriteCond %{REQUEST_URI} ^/xxxx/?$
RewriteRule ^ - [F]
Every request for /xxxx then makes mod_rewrite return a 403 (Forbidden) response.
Make sure to place those rules inside the appropriate <VirtualHost> block.
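If mod_rewrite is not otherwise needed, the same block can probably be expressed with core directives alone. A minimal sketch, assuming Apache 2.4 (for the Require directive) and the placeholder path /xxxx from the question:

```apache
# Place inside the relevant <VirtualHost> block.
# Denies /xxxx, with or without a trailing slash, with a 403 response.
<LocationMatch "^/xxxx/?$">
    Require all denied
</LocationMatch>
```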

Rewrite dynamic url htaccess apache

I need to perform a redirect by extracting part of the URL and using it to build the new one.
Specifically, I have to redirect:
https://(part to be extracted).montecoasp.it
to:
https://(extracted part).montecosrl.it
PLEASE NOTE: The part to be extracted may not even be there.
Can anyone tell me what to write in the .htaccess file? Should I use RewriteRule, RedirectMatch, or something else? Thank you.
I assume this is what you are looking for:
RewriteEngine on
RewriteCond %{HTTP_HOST} ^(\w+\.)?montecoasp\.it$
RewriteRule ^ https://%1montecosrl.it%{REQUEST_URI} [R=301,END]
You can implement such a rule in a distributed configuration file, but you should prefer the http server's static host configuration.
Obviously the rewriting module needs to be loaded into the http server for this. And if you want to use a distributed configuration file (".htaccess"), then you need to enable those too...
In general it is a good idea to start out with a R=302 temporary redirection and only to change that to a R=301 permanent redirection once everything is sorted out. That prevents nasty caching issues...
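A testing variant along those lines might look like the following. It is a sketch under two assumptions: the redirect target is the montecosrl.it domain named in the question, and the optional prefix may contain hyphens or several labels, hence the broader pattern.

```apache
RewriteEngine on
# Capture an optional subdomain prefix (including its trailing dot) into %1.
RewriteCond %{HTTP_HOST} ^(.+\.)?montecoasp\.it$ [NC]
# Temporary redirect while testing; switch to R=301 once verified.
RewriteRule ^ https://%1montecosrl.it%{REQUEST_URI} [R=302,L]
```

When no prefix is present, %1 is empty and the request is redirected to the bare target domain.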
You definitely should start reading the documentation of the tools you are using. You want to learn how things work, you do not just want to blindly copy things. As typical for OpenSource software the apache documentation is of excellent quality and comes with great examples:
https://httpd.apache.org/docs/current/howto/htaccess.html
https://httpd.apache.org/docs/current/mod/mod_rewrite.html

HTACCESS ignores images

I have the following very simple htaccess file:
RewriteEngine On
RewriteRule a.jpg b.jpg
RewriteRule c.php d.php
All four resources are in the root folder.
The PHP rule works as expected, however, the JPG rule is just ignored as if it were not there. The image a.jpg continues to display.
I am completely clueless on why that would happen.
The only explanation I could think of is that Apache is somehow configured not to INVOKE htaccess at all if the requested resource is an image. Is that even possible?
I found out the reason and I am posting my answer in case anyone faces the same issue.
It appears that both Nginx and Apache are configured on the server. Nginx is internet facing and Apache is internal.
The web hosting company has apparently done this to benefit from Nginx's better performance while staying compatible for anyone coming from an Apache environment.
When Nginx receives a PHP request from the internet it allows the request to pass through and reach Apache but when the resource is a static resource (image, css, js) Nginx delivers the resource itself for optimum performance.
The htaccess image rule above is not processed because the request is not even reaching Apache.
I temporarily solved the problem by not allowing Nginx to handle the images itself and allowing them to proceed to Apache.
The better solution of course is to remove htaccess dependency and handle everything within Nginx configuration file, which I will be doing soon.
The best solution of course is to remove Apache completely but it is a shared server and I don't have full control.
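A quick way to confirm this kind of setup is to have Apache tag its responses and then see which requests come back without the tag. A sketch, assuming mod_headers is available and allowed in .htaccess; the header name X-Handled-By is made up for illustration:

```apache
<IfModule mod_headers.c>
    # Added only to responses that Apache itself generates.
    Header set X-Handled-By "apache"
</IfModule>
```

Requesting c.php and a.jpg with a tool such as curl -I and comparing the response headers should then show whether the image request ever reached Apache.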

Drupal Clean Urls break randomly for arbitrary paths

I've done everything right. My server has mod_rewrite enabled, my virtual host path has AllowOverride set to All, and I have the .htaccess file in place with the same rewrite rules as everyone else. But I have trouble accessing some pages using their clean URL paths. For 90% of the pages, clean URLs work fine. For the other 10%, they don't.
I have checked whether those pages exist -- they do. Checked whether they are accessible using index.php?q=[path] -- and they are. They are only inaccessible through clean url paths.
Can anyone help me with this mystery?
Since you can access your pages through q=path/to/menu/item, it's clear that mod_rewrite is at fault and not Drupal.
To debug what is going on with your rewrite, either turn on the rewrite log and tail -f it while you request the troubled pages, or alternatively print_r($_GET) at the top of index.php or page.tpl.php to see what is actually being requested.
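Which directives turn on the rewrite log depends on the Apache version. A sketch for the server or virtual host configuration (not .htaccess); the log path is a placeholder:

```apache
# Apache 2.2: dedicated rewrite log.
RewriteLog "/var/log/apache2/rewrite.log"
RewriteLogLevel 3

# Apache 2.4: RewriteLog was removed; raise mod_rewrite's log level
# instead and read the entries from the regular error log.
# LogLevel alert rewrite:trace3
```

Higher levels are very verbose and slow the server down, so they are best switched off again after debugging.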
If you are comfortable posting your potentially sensitive .htaccess here, do so and we can have a look at it for you to see if there are any misconfigurations.
mod_rewrite has a few long-standing bugs that mangle URLs on the way through (do your problem urls have any escape characters?). I don't know if Drupal does this, but in other PHP apps I have had to add code to re-do the rewrite once the correct entrypoint has been reached.
Unfortunately, Drupal can't take its search path in PATH_INFO (as a lot of other apps do), otherwise you could use mod_alias which is much simpler and much more reliable.

Redirecting a Directory to a Script on Apache

So I'm playing with a script that makes it super easy to mirror images off of the web. The script works great (it's based on the old imgred.com source, if you've seen that). The problem is that it's a little clunky to use.
Currently, in order to use the script, you go to a url like:
http://mydomain.com/mirror/imgred.php?Image=http://otherdomain.com/image.jpg
What I'd like to do is to be able to go to:
http://mydomain.com/mirror/http://otherdomain.com/image.jpg
and have it redirect to the former URL, preferably transparent to the user.
I'm reasonably certain that this can be done via .htaccess with a mod_rewrite rule of some kind, but I'm getting frustrated trying to get it to work.
After messing with this myself, I found out that apache collapses any double slash in the URL before the query part into a single slash, and passes the result to mod_rewrite. Maybe that was giving you problems?
This might work for you (.htaccess in the mirror directory):
RewriteEngine On
RewriteBase /mirror
RewriteRule ^http(s?):/(.*) imgred.php?Image=http$1://$2 [L]
I don't know whether your script accepts https addresses as well, so I included that just to be sure.