How can I redirect people accessing my files as directories? - apache

I have the following situation:
On my webserver I have an instance of websvn running, where specific repositories and revisions can be accessed by a URL like
http://www.myhost.com/listing.php?repname=repository1&path=%2Ftrunk%2Fbackend
Somehow, out there in the wild, a wrong URL is being used to access this
http://www.myhost.com/listing.php/?repname=repository1&path=%2Ftrunk%2Fbackend
(Notice the slash after listing.php)
Now, although the URL works and websvn still shows the webpage, images and stylesheets do not load correctly, since they are referenced with relative paths.
I tried to add an .htaccess file to the webroot to redirect people accessing the file as directory to the correct URL.
I have tried multiple variations and ended up with this file:
RewriteEngine on
RewriteRule ^/listing.php/ listing.php [R=301,QSA]
But, since I am writing here, you already guessed it: It doesn't work.
I also tried
RewriteEngine on
RewriteRule ^/listing.php(.*) listing.php$1 [R=301,QSA]
What am I doing wrong?

Perhaps among other things, a RewriteRule within .htaccess that starts with “^/” will never match anything at all. (Examples that include a leading slash are for the global configuration file.) Remove the leading forward slash and see if that helps.
Also, I recommend changing the 301 to a 307 until you get it working. Otherwise, your browser will cache the 301 result, redirecting on subsequent references without consulting your server at all and likely giving you very confusing results.
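Putting both suggestions together, a minimal sketch of the .htaccess might look like the following. It assumes the file sits in the web root next to listing.php and that the stray trailing slash reaches the rule as part of the matched path; it has not been tested against websvn.

```apache
# .htaccess in the web root (sketch only)
RewriteEngine On

# No leading slash: in .htaccess the pattern sees "listing.php/", not "/listing.php/".
# The query string is carried over automatically, so QSA is not needed here.
# 307 while testing; switch to 301 once the redirect behaves as expected.
RewriteRule ^listing\.php/$ /listing.php [R=307,L]
```

The substitution starts with a slash so the redirect target does not depend on how the per-directory prefix is added back.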

Related

htaccess RewriteRule problems

I have a web page which works fine on live server. However some links to files (jpg, pdf and others) which are created with cms editor contain relative paths.
When I run that page on my local test server which serves the pages out of a sub folder of localhost the relative paths to the files are wrong since they are missing the subfolder at the beginning. The html page loads fine. It's just some files in it that have wrong path and won't load.
page loads from http://localhost/level1/
files are trying to load from http://localhost/level2/ and I get 404s.
They should be loading from http://localhost/level1/level2/
So I set up a RewriteRule to correct the path, but no matter what I have tried I can't get it to work. I have tried various flags including [R,L], but nothing changes the URI in the HTML.
currently I have:
RewriteRule ^/level2/(.*)$ /level1/level2/$1 [R]
Any suggestions?
Thanks
Sounds like those links are not relative paths but absolute ones (starting with a leading slash, /). That is why the issue occurs at all. Relative paths make much more sense.
This would be the version to be used inside your HTTP server's host configuration:
RewriteEngine on
RewriteRule ^/level2/(.*)$ /level1/level2/$1 [L,QSA]
Here the version for .htaccess style files (note the missing leading slash):
RewriteEngine on
RewriteRule ^level2/(.*)$ /level1/level2/$1 [L,QSA]
You could use a version that can be used in both situations:
RewriteEngine on
RewriteRule ^/?level2/(.*)$ /level1/level2/$1 [L,QSA]
Note, however, that in general one should prefer to place such rules inside the HTTP server's host configuration. .htaccess style files are notoriously error prone and hard to debug, and they slow down the server because they are re-read on every request, often for nothing. They only offer a last option for those who are using a really cheap web hosting provider, or for situations where a web application has to write its own rewrite rules, which obviously is a security nightmare on its own...

htaccess 301 redirect "works" but I get a 404 error at new site

Sorry for such a long title, but it pretty well describes what is happening.
Details: I have two sites on different domains. Previously, I had a temporary site in a not-visible, but published, directory in the older domain. Only those who had the extra directory (or the extra path) would normally see the temporary site.
Now that I have a new domain and a permanent new site, I simply want to redirect any attempts to access the old directory/pages/site. Here is the line I added to the old site's htaccess file (last line, BTW):
redirect 301 /mailscamalert.com/weather2/ http://www.mid-southweather.com/
That "works" at least in the sense that the user ends up at the new site. But that site throws up a 404 'flag' and the user ends up at my "error" page. All the new site's navigation is on that page, of course, but it is probably very confusing!
I've tried removing the trailing "/" on 'weather2/' and/or "...com/", adding
"index.html" to the new site's url. No change in ending up at the error page. Also have tried "meta" redirects and even a bit of php:
header("HTTP/1.1 301 Moved Permanently");
header("Location: http://www.mid-southweather.com/index.html");
Any helpful suggestions or links, greatly appreciated!
Thanks!
Please replace your Redirect line with the following rewrite rule:
RewriteEngine On
RewriteRule ^weather2/ http://www.mid-southweather.com/ [R=301,L]
If you had multiple pages in that directory and want them all to redirect to your new target domain, do this:
RewriteRule ^weather2/.* http://www.mid-southweather.com/ [R=301,L]
Your current redirect appears to append the directory you are trying to redirect to the target URL, as I discovered using the Live HTTP Headers extension:
Location: http://mid-southweather.com/weather2/
That weather2 directory of course doesn't exist on your new site, thus the 404.
Just dump your lines and replace them with mine and it should all work nicely. And undo any other changes you may have done in the process of trying to get it working.
Make sure you don't have other rewrite rules going on, it looks to me like you might have one more running somewhere.
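If you would rather stay with mod_alias, a hedged alternative is RedirectMatch. The plain Redirect directive appends whatever follows the matched prefix to the target URL, whereas RedirectMatch only emits what the target spells out. The sketch below assumes the old pages are served under the URL path /weather2/ on the old domain; adjust the pattern if the public path differs.

```apache
# Old site's .htaccess (sketch only). Remove the earlier "redirect 301 ..." line first.
# Anything at or below /weather2 goes to the new home page; nothing is appended.
RedirectMatch 301 ^/weather2(/.*)?$ http://www.mid-southweather.com/
```

Use this instead of the RewriteRule version, not alongside it, so that only one redirect mechanism is active.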

Does REQUEST_URI hide or ignore some filenames in .htaccess?

I'm having some difficulty with a super simple htaccess redirect.
All I want to do is rewrite absolutely everything, except a couple files.
htaccess looks like this:
RewriteEngine On
RewriteCond %{REQUEST_URI} !sitemap
RewriteCond %{REQUEST_URI} !robots
RewriteRule ^(.*)$ http://example.com/$1 [L,R=301]
The part that works is that everything gets redirected to new domain as it should be. And I can also access robots.txt without being forwarded, but not with sitemap.xml. If I try to go to sitemap.xml, the domain forwards along anyway and opens the sitemap file on the new domain.
I have this exact same issue when trying to "ignore" index.html. I can ignore robots, I can ignore alternate html or php files, but if I want to ignore index.html, the regex fails.
Since I can't actually SEE what is in the REQUEST_URI variable, my guess is that somehow index.html and sitemap.xml are some kind of "special" files that don't end up in REQUEST_URI? I know this because of a stupid test. If I choose to ignore index.html like this:
RewriteCond %{REQUEST_URI} !index.html
Then if I type example.com/index.html I will be forwarded. But if I just type example.com/ the ignore actually works and it shows the content of index.html without forwarding!
How is it that when I choose to ignore the regex "index.html", it only works when "index.html" is not actually typed in the address bar!?!
And it gets even weirder! Should I type something like example.com/index.html?option=value, then the ignore rule works and I do NOT get forwarded when there are attributes like this. But index.html by itself doesn't work, and then just having the slash root, the rule works again.
I'm completely confused! Why does it seem like REQUEST_URI is not able to see some filenames like index.html and sitemap.xml? I've been Googling for 2 days and not only can I not find out if this is true, but I can't seem to find any websites which actually give examples of what these htaccess server variables actually contain!
Thanks!
my guess is that somehow index.html and sitemap.xml are some kind of "special" files that don't end up in REQUEST_URI?
This is not true. There is no such special treatment of any requested URL. The REQUEST_URI server variable contains the URL-path (only) of the request. This notably excludes the scheme + hostname and any query string (which are available in their own variables).
However, if there are any other mod_rewrite directives that precede this (including the server config) that rewrite the URL then the REQUEST_URI server variable is also updated to reflect the rewritten URL.
index.html (Directory Index)
index.html is possibly a special case. Although, if you are explicitly requesting index.html as part of the URL itself (as you appear to be doing) then this does not apply.
If, on the other hand, you are requesting a directory, eg. http://example.com/subdir/ and relying on mod_dir issuing an internal subrequest for the directory index (ie. index.html), then the REQUEST_URI variable may or may not contain index.html - depending on the version of Apache (2.2 vs 2.4) you are on. On Apache 2.2 mod_dir executes first, so you would need to check for /subdir/index.html. However, on Apache 2.4, mod_rewrite executes first, so you simply check for the requested URL: /subdir/. It's safer to check for both, particularly if you have other rewrites and there is possibility of a second pass through the rewrite engine.
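A minimal sketch of "checking for both", under the assumption that the exceptions are the document root, index.html, robots.txt and sitemap.xml, and that everything else should go to the new domain:

```apache
RewriteEngine On

# Exempt both the bare directory request (/) and the explicit /index.html,
# so the rule behaves the same on Apache 2.2 and 2.4.
RewriteCond %{REQUEST_URI} !^/(index\.html)?$
# Exempt the other two files, anchored so unrelated URLs are not caught.
RewriteCond %{REQUEST_URI} !^/(robots\.txt|sitemap\.xml)$
# 302 while testing, to keep the browser from caching the result.
RewriteRule ^ http://example.com%{REQUEST_URI} [R=302,L]
```

Anchoring the conditions with ^ and $ also avoids accidentally exempting any URL that merely contains "robots" or "sitemap" somewhere in its path.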
Caching problems
However, the most probable cause in this scenario is simply a caching issue. If the 301 redirect has previously been in place without these exceptions then it's possible these redirections have been cached by the browser. 301 (permanent) redirects are cached persistently by the browser and can cause issues with testing (as well as your users that also have these redirects cached - there is little you can do about that unfortunately).
RewriteCond %{REQUEST_URI} !(sitemap|index|alternate|alt) [NC]
RewriteRule .* alternate.html [R,L]
The example you presented in comments further suggests a caching issue, since you are now getting different results for sitemap than those posted in your question. (It appears to be working as intended in your second example).
Examining Apache server variables
@zzzaaabbb mentioned one method to examine the value of the Apache server variable. (Note that the Apache server variable REQUEST_URI is different from the PHP variable of the same name.) You can also assign the value of an Apache server variable to an environment variable, which is then readable in your application code.
For example:
RewriteRule ^ - [E=APACHE_REQUEST_URI:%{REQUEST_URI}]
You can then examine the value of the APACHE_REQUEST_URI environment variable in your server-side code. Note that if you have any other rewrites that cause the rewriting process to start over, then you could get multiple env vars, each prefixed with REDIRECT_.
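If there is no application code to hand, one possible way to look at the value is to echo the environment variable back as a response header. This is only a sketch: it assumes mod_headers is enabled and that Header is permitted in .htaccess, and X-Debug-Request-Uri is a made-up header name.

```apache
RewriteEngine On

# Copy Apache's REQUEST_URI into an environment variable (the name is arbitrary).
RewriteRule ^ - [E=APACHE_REQUEST_URI:%{REQUEST_URI}]

# Echo it back in the response headers. "always" is used so the header is
# also attached to redirects and error responses, not only to 2xx responses.
Header always set X-Debug-Request-Uri "%{APACHE_REQUEST_URI}e"
```

The header can then be read in the browser's network panel or with any HTTP client that prints response headers. Remove it again once the debugging is done.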
With the index.html problem, it is good practice to escape the dot (index\.html), since you are in the regex pattern-matching area on the right-hand side of RewriteCond. Note, though, that an unescaped dot matches any single character, including a literal dot, so the missing backslash only makes the pattern looser than intended; on its own it would not explain the unwanted forward.
For the sitemap not matching problem, you could check to see what REQUEST_URI actually contains, by just creating an empty dummy file (to avoid 404 throwing) and then do a redirect at top of .htaccess. Then, in browser URL, type in anything you want to see the REQUEST_URI for -- it will show in address bar.
RewriteCond %{QUERY_STRING} ^$
RewriteRule ^ /test.php?var=%{REQUEST_URI} [NE,R,L]
Credit MrWhite with that easy test method.
Hopefully that will show that sitemap in URL ends up as something else, so will at least partially explain why it's not pattern-matching and preventing redirect, when it should be pattern-matching and preventing redirect.
I would also test by being sure that the server isn't stepping in front of things with custom 301 directive that for whatever reason makes sitemap behave unexpectedly. Put this at the top of your .htaccess for that test.
ErrorDocument 301 default

Apache - mod_rewrite how to prefer files instead of directories (if both have same name)?

I have a problem very similar to this one: Apache mod_rewrite - prefer files over directories with pretty URLs
However, it is not the same, and the solutions mentioned in the above link don't work for me.
My directory structure looks like this:
/pages/articles/january.php
/pages/articles.php
/pages/home.php
/articles/
/index.php
Now, index.php includes pages depending on the URL.
For example, when the user types the address www.domain.com (or www.domain.com/home), index.php will include /pages/home.php
But if I enter this URL: www.domain.com/articles, it ends up at something like this: www.domain.com/articles/?page[]=articles (in other words, index.php won't include the /pages/articles.php file)
On the other hand, this works perfectly: www.domain.com/articles/january
This is my htaccess file:
RewriteEngine On
RewriteBase /
RewriteRule ^([a-zA-Z0-9+]*)/([a-zA-Z0-9+]*)$ %{DOCUMENT_ROOT}/index.php?page[]=$1&page[]=$2 [QSA,L]
RewriteRule ^([a-zA-Z0-9+]*)$ %{DOCUMENT_ROOT}/index.php?page[]=$1 [QSA,L]
I use array for page because I can have subpages (and subpages work fine!).
EDIT: I have found this, however it doesn't solve my problem :(
Can someone tell me why it does this and how to fix it?
Or, how can I give priority to files instead of directories?
EDIT 2: I solved it by removing the "/articles/" directory, however I am still interested in how to make it work through htaccess file rules.
Per the comments above, your original issue was worked around by removing the articles directory, but you still wanted to be able to deal with directories in that sort of situation.
You'd probably want to split into two sets of rules. Have an earlier rule that uses the -d flag in a RewriteCond to catch directories so that it could treat them differently as needed. Alternately, ignore directories in the first rule by negating the flag (!-d) then catch them in the later rule.

Apache mod_rewrite not doing anything (?)

I'm having some trouble with Apache's mod_rewrite. One of the things I'm trying to get it to do is hide some of my implementation details, so that, for example, the user sees the URL http://www.mysite.com/login but Apache responds with the page at http://www.mysite.com/doc_root/login.php instead (preferably without showing the user that it's a PHP file or the directory structure). Here's what I have in my .htaccess file:
RewriteEngine on
RewriteCond %{HTTP_HOST} ^(www.)?mysite.com*
RewriteRule ^/(\w+) /doc_root/$1.php [L]
#Redirect http://www.mysite.com to the login page
RewriteRule ^/?$ https://www.mysite.com/doc_root/login.php
But when I go to http://www.mysite.com/login, I get a 404 error even though the page exists. I clearly don't have a great understanding of how the mod_rewrite conditionals and rules work, so can anyone please tell me what I'm doing wrong? Thanks.
Take doc_root out of all the stuff you have it in. That will give you the result you're asking for. However I'm not sure if it's desired or not. How are you going to force someone to login if they manually type http://www.mysite.com/index.php?
Also, if you're trying to force all traffic to SSL, it's better to use a second VirtualHost and Redirect instead of mod_rewrite. Those are all questions probably better suited for ServerFault.
Unless your site has a bunch of different domain names, and you only want mysite.com to do the rewriting, you don't need the RewriteCond. (Potential problem. Apache likes to dick around with the domain name unless you set UseCanonicalName off. If the name isn't what it's expecting, the rewrite won't happen.)
In RewriteCond (and RewriteRule) patterns, . matches any character. Add a backslash before them. (Minor bug. Shouldn't cause rewrites to fail, but they would match stuff like "mysite-com" as well.)
mod_rewrite is actually a URL-to-filename filter. Though it is often used to rewrite URLs to other URLs, sometimes it will misbehave if what you're rewriting to is a URL and it can't tell. (Especially if what it's rewriting to would be an alias, or would otherwise not translate directly to a real filename.) If you add a [PT] flag onto your rule, though, it will consider the rewritten thing a URL and pass it along to the other filters (including the ones that turn URLs into filenames).
Do you really need "/doc_root"? The document root should already be set up in Apache using the DocumentRoot directive, and shouldn't need to be part of the URL unless you have multiple apps on the same domain (in which case it's the app root; the document root doesn't change).
UPDATE:
Another thing I just thought about: Rewrite rules work differently in .htaccess files. Apache strips off the leading slash there. So you will probably want to get rid of the first slash in your patterns, or at least make it optional (^/?login instead of ^/login).
^/?(\w+) will match /doc_root/login.php, and cause a rewrite to /doc_root/doc_root.php. You should probably have a $ at the end of your pattern.
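Pulling the points above together, a rough sketch of the .htaccess could look like this. It keeps the asker's /doc_root layout, and redirecting the bare root to /login (rather than to the .php file) is an assumption about what is wanted; none of it has been tested on the asker's server.

```apache
RewriteEngine On

# Only act on this site; dots are escaped and the host is anchored at both ends.
RewriteCond %{HTTP_HOST} ^(www\.)?mysite\.com$ [NC]
# Bare site root: send the visitor to the login URL (302 while testing).
RewriteRule ^$ https://www.mysite.com/login [R=302,L]

# /login -> /doc_root/login.php, internally. No leading slash in .htaccess,
# and the trailing $ stops the pattern from matching doc_root/login.php again.
# The condition only lets the rule fire when the target script really exists.
RewriteCond %{DOCUMENT_ROOT}/doc_root/$1.php -f
RewriteRule ^(\w+)$ /doc_root/$1.php [L]
```

The -f check is an extra safeguard not mentioned above: without it, any single-word URL would be rewritten to a .php path whether or not that file exists.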