mod_rewrite makes error logs fill up - apache

I am using Apache's mod_rewrite to make "pretty URLs"; here is an example of a rule:
RewriteRule ^articles/(.+?)\.([0-9]+)/page\=([0-9]+)?$ /index.php?module=articles_full&aid=$2&title=$1&page=$3 [N,NC,QSA]
Problem is my error log fills up with stuff like this:
[Fri Feb 01 20:36:11 2013] [error] [client 94.246.88.189] File does not exist: /home/gamingon/public_html/articles
[Fri Feb 01 20:35:55 2013] [error] [client 66.249.73.195] File does not exist: /home/gamingon/public_html/articles
[Fri Feb 01 20:34:39 2013] [error] [client 66.249.73.195] File does not exist: /home/gamingon/public_html/articles
What would be the best way to stop the errors? The rewriting seems to work okay, but I don't think the error log is supposed to fill up like that.

Add a RewriteCond that will make the RewriteRule kick in only when the file doesn't exist:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^articles/(.+?)\.([0-9]+)/page\=([0-9]+)?$ /index.php?module=articles_full&aid=$2&title=$1&page=$3 [N,NC,QSA]
Edit:
Looking at your rewrite rule again, it's a very specific regex that will only match when the URI is in the format
the text "articles/"
one or more of any character
a dot
one or more digits
the text "/page="
an optional run of digits (the last group may be empty)
So, it will match on a URL like:
http://example.com/articles/awesome-article.3/page=2
but not on a URL like:
http://example.com/articles/awesome-article/page=2.
If you're seeing 404s, it's likely people are trying to visit URLs of the second form. It could be that your site has some links that are not being created correctly; check the logs to see if you can find any referers to help track down the problem.
I tested your rule and it works fine for me when I give it a correctly formatted URL.
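To see which request paths the pattern accepts, the same regular expression can be tried outside Apache. This is only a sketch in Python: its re module is close to, but not identical to, the PCRE engine mod_rewrite uses, and the helper name match_article is made up for illustration. Paths are given without the leading slash, as a rule in a .htaccess file would see them.

```python
import re

# The pattern from the RewriteRule; re.IGNORECASE stands in for the [NC] flag.
ARTICLE_RULE = re.compile(r"^articles/(.+?)\.([0-9]+)/page\=([0-9]+)?$", re.IGNORECASE)

def match_article(path):
    """Return (title, aid, page) if the path matches the rule, else None."""
    m = ARTICLE_RULE.match(path)
    if m is None:
        return None
    return m.group(1), m.group(2), m.group(3)

if __name__ == "__main__":
    for path in ("articles/awesome-article.3/page=2",
                 "articles/awesome-article/page=2",
                 "articles"):
        print(path, "->", match_article(path))
```

A bare request for /articles, like the ones in the error log, does not match the rule either, so Apache goes on to look for a file of that name.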

Related

Apache .htaccess redirect file

I have written this rule in the .htaccess file:
# Rewrite super sized ciim5 images to large
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^ciim5/([0-9]+)/([0-9]+)/super_000000\.jpg$ /ciim5/$1/$2/large_000000.jpg [NC,L,R=301]
What it should do is this: when the path /ciim5/123/456/super_000000.jpg is requested, it should 301 redirect to the same path, but serve large_000000.jpg instead. Both super_000000.jpg and large_000000.jpg are actual physical files on that server (effectively we want to hide super sized images from the public and redirect them to large).
I've tried this rule on my localhost (Apache/2.4.7 (Ubuntu)) and it worked (the rest of the .htaccess file is the same as the production one). But when I upload it onto the production server (Apache/2.4.18 (Ubuntu)), it's not doing anything at all. The previously existing rewrites do work, so mod_rewrite is enabled. I have made sure the .htaccess file is updated on the production site (and it is), and I even made a test.html file and opened it in the browser just to make sure I'm uploading files to the correct directory.
I'm now sitting here scratching my head, can someone tell me why the above doesn't work on the production server and how to get it to do the redirect?
Edit:
The path /ciim5 is actually setup as an alias to a directory that's not in the web root. So could this be the case of the .htaccess file in the web root not being read? Should I put the .htaccess file in the /ciim5 directory instead?
Edit 2:
Enabling logs shows this:
[Thu Dec 01 10:05:48.444800 2016] [rewrite:trace1] [pid 4572] mod_rewrite.c(476): [client *:33116] * - - [*.org.uk/sid#7f767fb185c8][rid#7f767fa9c0a0/initial] pass through /ciim5/295/491/super_000000.jpg
[Thu Dec 01 10:05:48.445015 2016] [rewrite:trace3] [pid 4572] mod_rewrite.c(476): [client *:33116] * - - [*.org.uk/sid#7f767fb185c8][rid#7f767fa9c0a0/initial] [perdir /static-media/] strip per-dir prefix: /static-media/295/491/super_000000.jpg -> 295/491/super_000000.jpg
[Thu Dec 01 10:05:48.445113 2016] [rewrite:trace3] [pid 4572] mod_rewrite.c(476): [client *:33116] * - - [*.org.uk/sid#7f767fb185c8][rid#7f767fa9c0a0/initial] [perdir /static-media/] applying pattern '^(ciim5/[0-9]+/[0-9]+/)super_000000\\.jpg$' to uri '295/491/super_000000.jpg'
[Thu Dec 01 10:05:48.445204 2016] [rewrite:trace1] [pid 4572] mod_rewrite.c(476): [client *:33116] * - - [*.org.uk/sid#7f767fb185c8][rid#7f767fa9c0a0/initial] [perdir /static-media/] pass through /static-media/295/491/super_000000.jpg
What's this "strip per-dir prefix" thing? Seems like it removes ciim5 or static-media (the actual folder which ciim5 is aliased to). So I should actually remove the ciim5 part from my rewriterule?
So here's what I had to do:
My webroot is in /var/www/*.org.uk. But the media files were in /static-media in the root directory. I had to put an .htaccess file in /static-media and put the following directive:
RewriteRule ^([0-9]+/[0-9]+/)super_000000\.jpg$ /ciim5/$1large_000000.jpg [NC,L,R=301]
It seems that when the request reaches Apache, the "strip per-dir prefix" step (I don't know what this is!) removes the directory prefix, so for an incoming request such as /ciim5/123/456/super_000000.jpg, the path that the .htaccess file in /static-media gets to match against is 123/456/super_000000.jpg. So I need to redirect that to the same path, but with the /ciim5 prefix and with large_000000.jpg as the file name. This was done by trial and error really, and I don't fully understand why it works the way it does. So please expand on this answer.
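For what it's worth, "strip per-dir prefix" is how mod_rewrite behaves in per-directory context (.htaccess or <Directory>): before a RewriteRule pattern is tested, the path of the directory that holds the rules is removed from the front of the request's file path, so the pattern only sees the remainder, with no leading slash. Here is a rough Python sketch of that step, using the paths from the trace log above. The function names are invented for illustration, and this is a simplification of what Apache actually does.

```python
import re

def strip_per_dir_prefix(filename, per_dir):
    """Mimic mod_rewrite's per-directory prefix stripping (simplified)."""
    if filename.startswith(per_dir):
        return filename[len(per_dir):]
    return filename

# Pattern from the working rule in /static-media/.htaccess ([NC] -> IGNORECASE).
RULE = re.compile(r"^([0-9]+/[0-9]+/)super_000000\.jpg$", re.IGNORECASE)

def redirect_target(filename, per_dir="/static-media/"):
    """Return the redirect target, or None when the rule does not match."""
    local = strip_per_dir_prefix(filename, per_dir)
    m = RULE.match(local)
    if m is None:
        return None
    return "/ciim5/" + m.group(1) + "large_000000.jpg"

if __name__ == "__main__":
    print(redirect_target("/static-media/295/491/super_000000.jpg"))
```

This is also why a pattern that begins with ciim5/ can never match here: by the time the pattern is applied, neither the alias name nor the directory name is part of the string.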

RewriteRule not working, 404 error obtained

Using Apache/2.2.31, I created the following rule on my '.htaccess' file:
RewriteEngine on
RewriteRule ^([^/]+)/$ /do.php?label=$1 [L]
However, when accessing 'http://foo.com/whatever/', I get a 404 error message. I've checked my error log and:
[Mon Jun 27 11:12:15 2016] [error] [client 192.168.1.132] File does not exist: /path-to-web/whatever, referer: http://foo.com/
'http://foo.com' works ok, and 'http://foo.com/do.php?label=whatever' works ok as well.
I've checked phpinfo, and 'mod_rewrite' is loaded in Apache. Additionally, I tried to enable it via 'httpd.conf', but Apache tells me that "module rewrite_module is built-in and can't be loaded".
What am I doing wrong? This same '.htaccess' works ok in two other Apache servers.
I tried with "Options +FollowSymLinks -MultiViews" as well
Thank you very much

Why are requests being made to /eyeblaster on my website?

My AdSense-supported website's error.log file has lots of entries like this, which I'm fairly sure are advert related:
[Fri Apr 12 07:19:57 2013] [error] [client IP] File does not exist: /var/www/[mywebsite]/htdocs/eyeblaster, referer: http://apac-bidder.mathtag.com/notify/iframe? [snip lots of junk]
What is it and is it harmful?
Should I create an empty eyeblaster.html file to shut it up?
I also have another two that are like it - are they related?:
[Fri Apr 12 07:08:52 2013] [error] [client IP] File does not exist: /var/www/[mywebsite]/htdocs/7196176924447058959
[Fri Apr 12 07:13:58 2013] [error] [client IP] File does not exist: /var/www/[mywebsite]/htdocs/_sans
Take a look at the Stack Overflow question "Determine which advertisement made a request to /eyeblaster/addineyev2.html". I think all will be revealed :).
Eyeblaster is an online advertising company now called Sizmek (formerly Mediamind).
As John mentioned, this file is used as an iframe buster, just like DoubleClick's iframe buster, which should be placed by a publisher on [www.example.com]/doubleclick/DARTIframe.html
Another solution is to disallow eyeblaster.
Just add these to your robots.txt:
Disallow: /eyeblaster
Disallow: /addineyeV2.html
Or I prefer to redirect eyeblaster to index.html inside .htaccess file
RewriteCond %{REQUEST_URI} (eyeblaster|addineyeV2) [NC]
RewriteRule ^(.*)$ /index.html? [R=301,L]
Eyeblaster is just another piece of malware which most machines end up getting at one point. From what I can tell it hasn't caused any serious issues, but it is best to remove it from your machine whenever your anti-malware software picks it up.

htaccess: remove cgit.cgi from path

So I just installed cgit on a shared host. I custom compiled it and use symlinks from $HOME/mydomain-and-public-www-folder.tld/g to $HOME/local/lib/cgit/prod to link $HOME/local/lib/cgit/${git describe} to a build from my updated builds to test the latest version. What does this include?
[sharedhost]$ pwd
$HOME/local/lib/cgit/prod
[sharedhost]$ ls
.htaccess cgit.cgi cgit.css cgit.png lib/
Now, that looks all nice and good on the landing page: when I put https://mydomain.tld/g/ in the browser, I see the pretty cgit interface. When I click any link, I get the proper repos, but all the formatting is gone (the CSS, PNG, and JS files go bye-bye), and all the links are in the form https://mydomain.tld/g/cgit.cgi/randomrepo.git. Of course the files in the prod path are not working, because the browser looks for things like https://mydomain.tld/g/cgit.cgi/cgit.css instead of the needed https://mydomain.tld/g/cgit.css. Now, this was my basic .htaccess file to get it working.
[sharedhost]$ more .htaccess
# GIT BEGIN ###########################################################
Options +Indexes +FollowSymLinks +ExecCGI
Action fastcgi-script cgit.cgi
SetEnv HTTP_CGIT_CONFIG /home/username/local/lib/cgit/cgit-conf/cgitrc
RewriteEngine On
DirectoryIndex cgit.cgi
# GIT END ############################################################
# AUTHENTICATION BEGIN ###############################################
AuthType Digest
AuthName "cgitdigestdomain"
AuthDigestDomain /cgitdigestdomain/
AuthUserFile /home/username/local/lib/cgit/cgit-conf/.htpasswd
Require valid-user
# AUTHENTICATION END ################################################
I have tried a whole bunch of rewrite patterns recommended in answers like this [1] or that [2]. I am relatively new to more advanced .htaccess rules, so can someone point out how to remove cgit.cgi from the URLs with mod_rewrite while ensuring that the CSS and PNG files in the same directory stay accessible and that query strings (QSA) are handled properly? Sorry for the long post. I thought more detail would show why the obvious how-tos on this were not working for me.
[1] Remove 'index.php' from URL with .htaccess
[2] htaccess remove subdirectory from url
UPDATE:
Both answers seem to be good, and they get farther than anything I tried before, but I do not think I understand the virtual-root and scan-path features, and the critical part is still missing. Any URL to a repo (which is in a far removed path, $HOME/data/scm/priv) is not getting redirected properly, and I get a 404.
[Fri Jul 13 01:16:18 2012] [error] [client 127.0.0.1] File does not exist: $HOME/mydomain-and-public-www-folder.tld/missing.html, referer: https://domain.tld/g/
[Fri Jul 13 01:24:20 2012] [error] [client 127.0.0.1] File does not exist: $HOME/mydomain-and-public-www-folder.tld/failed_auth.html, referer: https://domain.tld/g/
[Fri Jul 13 01:24:21 2012] [error] [client 127.0.0.1] File does not exist: $HOME/mydomain-and-public-www-folder.tld/g/bicon.git, referer: https://domain.tld/g/
[Fri Jul 13 01:24:21 2012] [error] [client 127.0.0.1] File does not exist: $HOME/mydomain-and-public-www-folder.tld/missing.html, referer: https://domain.tld/g/
[Fri Jul 13 01:26:11 2012] [error] [client 127.0.0.1] File does not exist: $HOME/mydomain-and-public-www-folder.tld/g/admin-scripts.git, referer: https://domain.tld/g/
[Fri Jul 13 01:26:11 2012] [error] [client 127.0.0.1] File does not exist: $HOME/mydomain-and-public-www-folder.tld/missing.html, referer: https://domain.tld/g/
Names/IPs changed to protect the innocent, but you get the point. So it appears the repo requests are not going back to the cgit CGI, so where are they going? I am working on getting better redirect logs. We shall see.
UPDATE 2:
And of course, RewriteLog is a directive you cannot put in .htaccess as I forgot, silly me. Not sure what to do now on a shared hosting. How obnoxious. This is one of several issues that pushed me to look into VPS hosting and bite the money bullet and use this shared host for only really stupid stuff.
P.S.: I did email the original issue as is to the cgit dev list, and never got a response on this either. So they do not care and feel it is a moronic question, not sure. Haha.
Usually this kind of problem is much easier to solve if you have some help from your tool. And you do.
According to this page, you can set the virtual-root setting (in your case to "/g") and use these rules:
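# Patterns with a leading "/g/" only match in server or virtual-host context.
# In a .htaccess file inside /g, mod_rewrite strips that prefix before
# matching, so the patterns would start right after ^ without "/g/".
RewriteRule ^/g/$ /g/cgit.cgi [L,QSA]
RewriteRule ^/g/([^/]+)/$ /g/cgit.cgi?r=$1 [L,QSA]
RewriteRule ^/g/([^/]+)/([^/]+)/$ /g/cgit.cgi?r=$1&p=$2 [L,QSA]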
I adjusted the paths to what I think would be right for you, but I'm not sure about the recursiveness here. It should work (because of the L), but if it does not, I suggest you rename the folder /g to something like /git and use that in the paths (e.g. /git/cgit.cgi).
As for the css and js files, you will need another rule, something like this:
RewriteCond %{DOCUMENT_ROOT}/g/$1 -f
RewriteRule ^/g/(.+)$ - [L]
This should leave the url untouched if the file exists. Put this rule before the rules above.
Update:
I was hoping adding virtual-root would change the links on the page to the desired form; I suggest you double check that cgit.cgi has picked up the configuration.
However, if you want to work around that, try this rule:
RewriteCond %{DOCUMENT_ROOT}/g/$1 -f
RewriteRule ^/g/cgit\.cgi/(.+)$ /g/$1 [L]
Fozi's answer didn't quite work in my case. These rewrite rules work well with virtual-root=/ in /etc/cgitrc.
<Directory /var/www/cgit>
RewriteEngine on
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule (.*) /cgit.cgi/$1
</Directory>
Make sure to restart your web server instead of reloading it when messing with cgit, because it seems to cache some things.
Does this work?
# Redirect the browser to remove the "cgit.cgi" from the URL's address bar
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /g/cgit\.cgi/
RewriteRule ^g/cgit\.cgi/(.+)\.git /g/$1.git [R=301,L]
# Internally add the "cgit.cgi" so that it gets served properly
RewriteCond %{REQUEST_URI} !^/g/cgit\.cgi/
RewriteRule ^g/(.+)\.git /g/cgit.cgi/$1.git [L]
You'd probably want to insert that somewhere after the RewriteEngine On. The first rule checks whether the requested URI starts with /g/cgit.cgi/ and contains a .git, and if so, redirects with the cgit.cgi removed. This makes relative links resolve against a base of /g/ instead of /g/cgit.cgi/. Then we need to change the URLs back internally, which is where the second rule comes in: if the request doesn't include cgit.cgi, it is added back internally, as long as the URI contains .git.
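The round trip those two rules aim for can be sketched in Python. This is an illustration only: the function names are made up, the THE_REQUEST condition on the first rule is not modelled, and the paths have no leading slash because the rules assume a .htaccess file in the document root.

```python
import re

# First rule: external redirect that drops "cgit.cgi" from the URL.
REDIRECT = re.compile(r"^g/cgit\.cgi/(.+)\.git")
# Second rule: internal rewrite that puts "cgit.cgi" back.
INTERNAL = re.compile(r"^g/(.+)\.git")

def external_redirect(path):
    """Return the Location the browser would be sent to, or None."""
    m = REDIRECT.match(path)
    return "/g/%s.git" % m.group(1) if m else None

def internal_rewrite(path):
    """Return the path that is served internally, or None if the rule is skipped."""
    # Mirrors the condition: RewriteCond %{REQUEST_URI} !^/g/cgit\.cgi/
    if re.match(r"^/g/cgit\.cgi/", "/" + path):
        return None
    m = INTERNAL.match(path)
    return "/g/cgit.cgi/%s.git" % m.group(1) if m else None

if __name__ == "__main__":
    print(external_redirect("g/cgit.cgi/randomrepo.git"))
    print(internal_rewrite("g/randomrepo.git"))
```

Note that the substitution replaces the whole path, so with these patterns anything after ".git" in the requested URL is not carried over.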
EDIT: I may have misunderstood what you meant by "how to remove cgit.cgi from the URLs with mod-rewrite"
See if these rules work for you:
RewriteCond %{REQUEST_URI} ^/g/cgit\.cgi/(.+)$
RewriteCond %{DOCUMENT_ROOT}/g/%1 -f
RewriteRule ^g/cgit\.cgi/(.+)$ /g/$1 [L]
The first condition checks if the request URI starts with /g/cgit.cgi/ and makes a backreference to whatever is after it. Then it checks if there is physically a file that exists if the cgit.cgi part is removed, and if so, rewrite the URI so that it's removed.
This way, requests for your css and js and images that sit in the /g/ directory, but get relative linked to from URL's like /g/cgit.cgi/something.git will internally resolve as long as the referenced resource is actually in /g/.
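As a rough model of that last rule set, the following Python sketch applies the same two checks: does the URI start with /g/cgit.cgi/, and does a real file exist once that part is removed? The function name resolve_static and the temporary directory used as a stand-in document root are assumptions for illustration, not anything Apache or cgit provides.

```python
import os
import re
import tempfile

STRIP_CGI = re.compile(r"^/g/cgit\.cgi/(.+)$")

def resolve_static(request_uri, document_root):
    """Drop "cgit.cgi/" from the URI when the remainder is a real file in /g/."""
    m = STRIP_CGI.match(request_uri)                  # first RewriteCond
    if m is None:
        return request_uri
    candidate = document_root + "/g/" + m.group(1)    # %{DOCUMENT_ROOT}/g/%1
    if os.path.isfile(candidate):                     # second RewriteCond (-f)
        return "/g/" + m.group(1)                     # RewriteRule target
    return request_uri                                # left for cgit.cgi to handle

if __name__ == "__main__":
    root = tempfile.mkdtemp()                         # stand-in document root
    os.makedirs(os.path.join(root, "g"))
    open(os.path.join(root, "g", "cgit.css"), "w").close()
    print(resolve_static("/g/cgit.cgi/cgit.css", root))
    print(resolve_static("/g/cgit.cgi/randomrepo.git", root))
```

Requests for repositories fall through unchanged because no file of that name exists under /g/, which is what lets cgit.cgi keep handling them.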

Help Needed With (I thought) Basic mod_rewrite Setup

I run my own server at home. I am attempting to do a mod_rewrite but I cannot seem to get it to work correctly.
Say my URL is "http://www.site.com/". I have a user set up, whose public_html directory is serving pages; let's say that URL is "http://www.site.com/~myUser/theDirectory/". What I am trying to do is set it up so that when you type in "http://www.site.com/theDirectory/" it will actually serve the pages out of "http://www.site.com/~myUser/theDirectory/", but look like it's coming from "http://www.site.com/theDirectory/". I edited my /etc/apache2/sites-enabled/default file and added these lines:
RewriteEngine On
RewriteRule ^theDirectory/$ /home/myUser/public_html/~myUser/theDirectory/
I also tried various versions of the rule:
RewriteRule ^/theDirectory/$ /home/myUser/public_html/theDirectory/
RewriteRule ^/var/www/html/theDirectory/$ /home/myUser/public_html/theDirectory/
I also made sure that the rewrite module was enabled. At first, I was getting this error:
[Fri Jun 17 18:11:35 2011] [error] [client xxx.xxx.xxx.xxx] File does not exist: /var/www/theDirectory
So I created that file, and now I am getting this error:
[Fri Jun 17 23:15:45 2011] [error] [client xxx.xxx.xxx.xxx] Directory index forbidden by Options directive: /var/www/theDirectory/
So, I'm not really sure where to go from here. Any and all advice will be appreciated. Thanks for taking the time to read.
Have a great day :-)
Add the following to your virtual host configuration:
RewriteEngine on
RewriteRule ^/theDirectory(/.*)$ /home/myUser/public_html/theDirectory$1 [L]
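In virtual-host context the pattern is matched against the full URL path, leading slash included, which is why the first attempt in the question (^theDirectory/$) never matched. A small Python sketch of the mapping this rule performs follows; the function name is invented for illustration, and the sketch only models the pattern and substitution, not how Apache then serves the resulting path.

```python
import re

# Pattern from the answer's rule; in a virtual host the leading "/" is present.
RULE = re.compile(r"^/theDirectory(/.*)$")

def map_to_filesystem(url_path):
    """Return the substituted path, or None when the rule does not match."""
    m = RULE.match(url_path)
    if m is None:
        return None
    return "/home/myUser/public_html/theDirectory" + m.group(1)

if __name__ == "__main__":
    for url_path in ("/theDirectory/", "/theDirectory/index.html", "/theDirectory"):
        print(url_path, "->", map_to_filesystem(url_path))
```

A request for /theDirectory without the trailing slash is not covered by this pattern, so it would still be looked up under the normal document root.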