How to get analytics stats for mod_rewrite redirects? - apache

I have a website with about 200 apache mod_rewrite rules in httpd.conf, running on apache webserver in redhat.
Here's an example of one of the rules, most of them are short URLs that redirect to really long URLs:
RewriteRule ^grad2014/?$ /registration-and-records/graduation/live/index.html [R=301,L]
I've been asked to get some web analytics for these redirects.
"How many people used the URL mysite.com/grad2014?" - Well, I don't really know, because /grad2014 doesn't exist on the webserver, and google analytics are set up on the index.html page.
I don't see any of the shortcuts in the access.log. Is there another way to see which URL redirects are the most popular? Is there a way to start logging this?

One way you could do it:
Add the rewrite match as a query param on the target url:
RewriteRule ^grad2014/?$ /registration-and-records/graduation/live/index.html?rr=$0 [R=301,L]
Then in your GA code, capture that URL param and put it in a custom variable. I don't know if you know anything about custom variables, but here is an example of one way to set it:
// example function get query param. use your own if you already have one
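function getParam(name) {
  // escape square brackets so the name can be used inside a regular expression
  name = (name || '').replace(/[\[\]]/g, '\\$&');
  var match = new RegExp('[?&]' + name + '=([^&#]*)').exec(window.location.href);
  // returns the raw (still URL-encoded) value, or '' if the param is absent
  return match === null ? '' : match[1];
}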
var rr = getParam('rr');
if (rr) {
_gaq.push(["_setCustomVar", 1, "Mod Rewrite Redirect URL", rr, 3]);
}
// your on-page trigger
_gaq.push(["_trackPageview"]);
NOTE: By default GA records the page name as location.pathname+location.search, so adding the rr param to the URL will affect your Pages report. The easiest fix is a filter within GA that strips it from the incoming page name (request URI). Alternatively, you can write some code that takes location.pathname+location.search, removes the rr query param, and passes the result as the 2nd element in your _trackPageview:
_gaq.push(["_trackPageview","custom page name here"]);
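As a rough illustration of the "pathname plus search minus rr" idea above, here is a minimal sketch. The helper name pageNameWithoutParam is made up for this example, and it assumes a browser (or Node) that provides URLSearchParams. Note that URLSearchParams re-serializes the remaining parameters, so their encoding may differ slightly from the original query string.

```javascript
// Hypothetical helper: build a page name from pathname + search
// with one query parameter removed.
function pageNameWithoutParam(pathname, search, paramName) {
  var params = new URLSearchParams(search); // a leading "?" is ignored
  params.delete(paramName);                 // drops every occurrence of the param
  var rest = params.toString();
  return rest ? pathname + "?" + rest : pathname;
}

// In the page, this would be used roughly like so:
// _gaq.push(["_trackPageview",
//   pageNameWithoutParam(location.pathname, location.search, "rr")]);
```

With this in place the rr value still reaches the custom variable, while the Pages report sees the URL without it.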

The access.log should give you what you want e.g.
cd /var/log/apache2 || cd /var/log/httpd
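# access?log is a shell glob that matches both access.log and access_log.
# Matching '" 301 ' targets the status field (common/combined log format)
# instead of any "301" that happens to appear in an IP, size or URL.
# Get a list of the 301 redirects issued.
grep '" 301 ' access?log
# Count the 301 redirects issued.
grep '" 301 ' access?log | wc -l
# Count a specific redirect
grep '" 301 ' access?log | grep grad2014 | wc -l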
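To rank all shortcuts at once rather than grepping for them one by one, the same idea can be sketched as a small script. This is only an illustration: it assumes the common/combined log format, the function name countRedirects is made up, and the sample lines are invented rather than taken from a real log.

```javascript
// Count 301 responses per requested path.
// Assumes lines like: IP - - [date] "GET /path HTTP/1.1" 301 245 ...
function countRedirects(lines) {
  // capture the request path and the 3-digit status that follows the request
  var requestAndStatus = /"[A-Z]+ (\S+) [^"]*" (\d{3})\b/;
  var counts = new Map();
  lines.forEach(function (line) {
    var m = requestAndStatus.exec(line);
    if (m && m[2] === "301") {
      counts.set(m[1], (counts.get(m[1]) || 0) + 1);
    }
  });
  return counts;
}

// Invented sample lines, for illustration only.
var sampleLines = [
  '1.2.3.4 - - [10/May/2014:10:00:00 +0000] "GET /grad2014 HTTP/1.1" 301 245',
  '1.2.3.5 - - [10/May/2014:10:00:05 +0000] "GET /index.html HTTP/1.1" 200 301'
];
console.log(countRedirects(sampleLines));

// With a real log under Node, the lines could be read with e.g.:
// require("fs").readFileSync("access.log", "utf8").split("\n")
```

The second sample line has a response size of 301 but a status of 200, so it is not counted. Paths are counted exactly as requested, so /grad2014 and /grad2014/ show up as separate entries.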

Related

.htaccess RewriteRule based on anchors

I am trying to update a RewriteRule. Previously, the redirect looked like this:
https://mywebsite.com/docs/#/en/introduction → https://manual.mywebsite.com/#/en/introduction
I would like to use /docs/ for something else now but I would like to keep redirecting requests containing a forward slash after the # to the manual subdomain. This is what I would like to achieve:
The old redirects continue working as usual:
https://mywebsite.com/docs/#/en/introduction → https://manual.mywebsite.com/#/en/introduction
This would not get redirected since there is no forward slash following the #:
https://mywebsite.com/docs/#overview
Here is what I have:
The .htaccess file containing the following existing rule which redirects everything:
RewriteRule ^docs/(.*) https://manual.mywebsite.com/$1 [L,R=301]
I tried this but it did not work (I tried with https://htaccess.madewithlove.com/ which says that my rule does not match the URL I entered):
RewriteRule ^docs/#/(.*) https://manual.mywebsite.com/#/$1 [L,R=301]
I also read about the NE (no escape) flag (https://httpd.apache.org/docs/2.2/rewrite/advanced.html#redirectanchors) which did not help either.
I am also sure that the server is actually using the file.
To summarize, my problem is that I want to match a URL containing /docs/#/ and redirect it to a subdomain, keeping the /#/ and everything that follows it.
Anchors are not part of the URL that's transmitted to the server: they stay within the browser, so you can't build a rewrite rule that takes anchors into account.
Inspect your access logs to see what the server actually receives.
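Since the fragment only exists in the browser, one possible workaround is to make the decision client-side, in a script on the page served at /docs/. The sketch below is only an assumption about how that could look: the function name manualRedirectTarget is made up, and it presumes the catch-all RewriteRule for docs/ has been removed so that the page is actually served.

```javascript
// Hypothetical helper: decide where a /docs/ visitor should go, based on the fragment.
function manualRedirectTarget(hash) {
  // Only fragments of the form "#/..." belong to the old manual.
  if (hash.startsWith("#/")) {
    return "https://manual.mywebsite.com/" + hash;
  }
  return null; // e.g. "#overview" or no fragment: stay on /docs/
}

// In the page served at /docs/ this would run roughly as:
// var target = manualRedirectTarget(window.location.hash);
// if (target) { window.location.replace(target); }
```

This is a script-driven redirect rather than an HTTP 301, so the server never issues a redirect response for these URLs.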

Rewriting and Redirection using mod_rewrite

I have an Apache server and I am trying to access an external Google doc. My main issue is capturing the key of a key-value pair in a URL and then redirecting to that key. I have the regex required to put the key into its own group, but unfortunately I am a bit green when it comes to redirecting and rewriting URLs.
It may be worth noting that by default my Apache server looks for a directory called 'redirect' every time I try to leave the site.
I have tried using "%{QUERY_STRING}" as the rewrite condition, and I feel this is the correct approach because I have tested my regex and it works according to multiple online regex testers.
//This is my URL:
http://example.com/redirect?url=https%3A%2F%2Fdocs.google.com%myForm%2Fd%2Fe%2F1FAaBUNCHOFSTUFFBELONGING TO GOOGLE%2FviewmyForm
//Regex to capture the value in its own group:
redirect(.*)url?=(.*)
//This is my code for the capturing and redirecting:
RewriteCond %{QUERY_STRING}" redirect(.*)url?=(.*)
RewriteRule ^redirect(.*) %2 [R=301]
Expected Results: Replace the URL with the key that has been captured using regex
Actual results:
Error 404 redirect/ not found on this Apache server

301 Redirect A URL Pattern Using .htaccess

I want to redirect my URLs to a new pattern. So far I have used a separate 301 redirect for every single URL, but that takes a huge amount of time and my .htaccess file keeps growing, as I have thousands of URLs.
Someone suggested using pattern-based 301 redirects or the rewrite engine in .htaccess instead. I am new to pattern redirects in .htaccess. First of all, is it possible to use a 301 redirect with patterns? If so, can the patterns below be redirected this way?
/search/label/XXXXXXXXXX to /category/XXXXXXXXXX
/year/month/XXXXXXXXXX.html/?m=0 to /year/month/XXXXXXXXXX.html
/year/month/XXXXXXXXXX.html/?m=1 to /year/month/XXXXXXXXXX.html
/search to /
/feed to /
XXXXXXXXXX stands for text or numbers that are dynamic. year and month are numbers that are also dynamic. / means the site homepage. The rest is fixed text.
Please keep in mind that a URL sometimes carries several query variables, so we also want to drop everything from ?variable=value&variable=value onward at the end of a URL.
After asking here I kept trying myself and got it working. I added the lines below to my .htaccess file, and now all of the URLs above redirect without any 404 error.
Redirect 301 /search/label http://www.example.com/category
Redirect 301 /search http://www.example.com
Redirect 301 /feed http://www.example.com
For the 2nd and 3rd URL patterns I did nothing: on checking, they do not produce any 404 error, because they differ only by a query variable at the end of the URL, so there is no need to redirect them.

multiple folder redirect

I have been trying variations of the following without success:
Redirect permanent /([0-9]+)/([0-9]+)/(.?).html http://example.com/($3)
It seems to have no effect. I have also tried rewrite with similar lack of results.
I want all links similar to: http://example.com/2002/10/some-long-title.html
to redirect the browser and spiders to: http://example.com/some-long-title
When I save this to my server, and visit a link with the nested folders, it just returns a 404 with the original URL unchanged in the address bar. What I want is the new location in the address bar (and the content of course).
I guess this is more or less what you are looking for:
RewriteEngine On
RewriteRule ^/([0-9]+)/([0-9]+)/(.+)\.html$ http://example.com/$3 [L,R=301]
This can be used inside the central Apache configuration. If you have to use .htaccess files because you don't have access to the Apache configuration, the syntax is slightly different: there the pattern is matched against the path without its leading slash.
Using mod_alias, you want the RedirectMatch, not the regular Redirect directive:
RedirectMatch permanent ^/([0-9]+)/([0-9]+)/(.+)\.html$ http://example.com/$3
Your last grouping needs to be (.+), which matches one or more characters. What you had before, (.?), matches only zero or one character. Also, the last backreference doesn't need the parentheses.
Using mod_rewrite, it looks similar:
RewriteEngine On
RewriteRule ^/([0-9]+)/([0-9]+)/(.+)\.html$ http://example.com/$3 [L,R=301]
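The difference between (.?) and (.+) can be checked quickly outside Apache. The sketch below uses JavaScript regular expressions, which behave the same as Apache's for these particular patterns. The helper name redirectTarget is made up for the example.

```javascript
// The pattern from the question, with (.?), and the corrected one, with (.+).
var zeroOrOne = /^\/([0-9]+)\/([0-9]+)\/(.?)\.html$/;
var oneOrMore = /^\/([0-9]+)\/([0-9]+)\/(.+)\.html$/;

// Hypothetical helper: return the redirect target, or null if the pattern does not match.
function redirectTarget(pattern, path) {
  var m = pattern.exec(path);
  return m ? "http://example.com/" + m[3] : null;
}

var path = "/2002/10/some-long-title.html";
console.log(redirectTarget(zeroOrOne, path)); // no match: the title is longer than one character
console.log(redirectTarget(oneOrMore, path)); // matches and captures the title
```

The (.?) version can only match file names with at most one character before .html, which is why the original rule had no effect on real article URLs.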

Isapi Rewrite 301 redirect resolves as 404 - circular reference?

I'm trying to use IIS Isapi Rewrite to do the following...
I need seo-friendly URLs to be (silently) converted back to application friendly URLs like so:
RewriteRule ^/seo-friendly-url/ /test/index.cfm [I,L]
Simple enough.
But I also need URLs already indexed in search engines (for example) to be 301 redirected to the seo-friendly version. Like so:
RewriteRule ^/test/index.cfm /seo-friendly-url/ [I,R=301]
Each of these works fine in isolation. But when I have both in my .ini file I end up with /seo-friendly-url/ showing in my browser address bar but I'm being served a 404. (Yes, /test/index.cfm definitely exists!)
I know it looks like a circular reference, but the first rule only rewrites the URL between IIS and the application - there's no redirect, so I'm not hitting Isapi Rewrite a second time. Or am I wrong about that?
I've enabled logging on Isapi Rewrite and I'm seeing the following:
HttpFilterProc SF_NOTIFY_PREPROC_HEADERS
DoRewrites
New Url: '/seo-friendly-url/'
ApplyRules (depth=0)
Rule 1 : 1
Result (length 15): /test/index.cfm
ApplyRules (depth=1)
Rule 1 : -1
Rule 2 : 1
Result (length 18): /seo-friendly-url/
ApplyRules: returning 301
ApplyRules: returning 1
Rewrite Url to: '/seo-friendly-url/'
Anyone got any ideas?
You have two different rules here, and it should work if you set them up correctly.
The first one is never seen by the client user agent. It requests /seo-friendly-url/, and you rewrite it internally and respond.
The second one is not really a rewrite but a redirect. You send it back to the client, which then re-requests /seo-friendly-url/. I think you need to use [R=301,L] to say that this is the end of the line and the response should just be returned (that is what L does).
Through some trial and error I've come up with a solution for this.
Specify that the redirect match is at the end of the string using the $ symbol:
RewriteRule ^/test/index.cfm$ /seo-friendly-url/ [I,R=301]
Make the rewritten URL trivially different from the redirect match string - in this case adding an unnecessary "?":
RewriteRule ^/seo-friendly-url/ /test/index.cfm? [I,L]