Manipulate user's address bar with mod_rewrite - apache

I have a page at example.com/themizer.php, but I want it to appear that it's actually located at example.com/themizer/ (or example.com/themizer/index.php) for all practical purposes. I know how to basically make an alias for it with mod_rewrite, but how do I make it appear that users are being redirected to that alias? Example: a user requests example.com/themizer.php and the address in their browser turns into example.com/themizer/ without actually redirecting. Is this possible?

With server-side configuration, you can only accomplish this with a redirect. This does not need to be a problem. Just make sure that the URLs on your site point to the fancy URL and not to the internal URL. Otherwise you generate a lot of requests that have to be redirected, instead of just redirecting the odd request that came in another way (e.g. through an old external URL or an old bookmark). You do it like this:
#External redirect
RewriteCond %{THE_REQUEST} ^GET\ /themizer\.php\ HTTP
RewriteRule ^themizer\.php$ /themizer/ [R,L]
#Internal rewrite
RewriteRule ^themizer/?$ themizer.php [L]
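As a hedged variant of the rules above, the following sketch makes the redirect permanent and matches any HTTP method, with or without a query string. It assumes the rules live in the .htaccess file of the document root and that mod_rewrite is enabled; the R=301 status and the looser THE_REQUEST pattern are illustrative choices, not part of the original rules.

```apache
RewriteEngine On

# External redirect: fires only when the client literally asked for
# /themizer.php. THE_REQUEST holds the original request line, so the
# internal rewrite below cannot re-trigger this rule.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/themizer\.php[\s?]
RewriteRule ^themizer\.php$ /themizer/ [R=301,L]

# Internal rewrite: serve themizer.php for the pretty URL
RewriteRule ^themizer/?$ themizer.php [L]
```

Browsers cache 301 responses aggressively, so it may be worth testing with a plain [R] (302) first and switching to R=301 once the rules behave as intended.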
If you really must, you can use JavaScript to 'push' a new state into the history, which updates the address bar. This leaves a bogus entry behind the "go to previous page" button in your browser, though. In other words, going to the previous page does not work as expected, so I would not recommend it when a better option is available. You can do it with the following JavaScript statement in browsers that support it:
window.history.pushState( null, document.title, "/themizer" );

Related

.htaccess Redirect based on HTTP_REFERER being empty

I'm trying to set up a redirect on a WP blog installation that will detect anyone coming in from nowhere (i.e. not from another site). The idea is to trap some of the spambots that plug pre-constructed URLs into the system to create comments/posts. I figure if they don't have a referrer site, I can pop them back to the homepage (www.domain.com/index.php or just www.domain.com), which should mess with the bots but not with real people.
I understand that the referrers can be forged but hopefully it'll stop the stupids, at least.
I have very little clue about .htaccess rewrite rules (I apologise for being a noob), but I couldn't find one that did this in existing answers or anywhere else online, despite several searches. Either no one's done it or I'm not phrasing correctly.
Any help appreciated. :)
I'd advise against this. By doing it, you may annoy and alienate a portion of your potential users: for example, my browser is set not to report referer information, and others use anonymity networks. You can catch the dumb bots by matching their reported user agent string (as seen here).
Otherwise it's simple: match against the HTTP_REFERER variable in a RewriteCond:
RewriteCond %{HTTP_REFERER} ^$
RewriteRule .* http://example.com/
The RewriteCond checks to see if the referer is an empty string; the RewriteRule redirects everything to the http://example.com/ root. This is a hard redirect, meaning that the server tells the browser to request a different URL. Without an explicit flag, mod_rewrite sends a 302 (temporary) status for an absolute URL on another host; add [R=301,L] if you want a 301 Moved Permanently header. If you just want to sneakily serve another resource, use a soft redirect (an internal rewrite) by specifying a relative URL, like RewriteRule .* index.php. However, it may be kinder to people not reporting referrer information to redirect them to a page saying something like "You should enable referrer reporting if you want to read this page".
For more examples on such things, see the manual. There's a very similar prevent-hotlinking method there.
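One thing to watch with an empty-referer redirect is that the redirected request for the homepage usually arrives without a referer as well, which can loop. The sketch below is one way to guard against that; it assumes the homepage is reachable as / or /index.php and that a temporary (302) redirect is wanted, neither of which is stated in the question.

```apache
RewriteEngine On

# Condition 1: the Referer header is missing or empty
RewriteCond %{HTTP_REFERER} ^$
# Condition 2: the request is not already for the homepage
# (assumption: the homepage is / or /index.php)
RewriteCond %{REQUEST_URI} !^/(index\.php)?$
# Send everything else to the site root with a temporary redirect
RewriteRule ^ http://example.com/ [R=302,L]
```

Multiple RewriteCond lines are ANDed by default, so the redirect only happens when both conditions hold.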

301 Redirect in .htaccess for re-submitting URL-s

I want to ask how do I redirect Search Engines to take a second look on my new, fresh, rewritten URL-s?
So, my former URL-s were structured like this :
http://www.sample.com/tutorials.php?name=something
and now they look much cleaner and better:
http://www.sample.com/tutorials/programming/something.php
So, as I said, I want Google (and other engines) to take a look at my new links, which are much more SEO-friendly, so that I will be indexed better.
I was told the 301 redirect method was the best, but I don't have a clue what it is, how it works, or where to learn how to use it. So, I am asking you.
Side note : Would updating my sitemap.xml file and re-submitting it to Google Webmaster Tools help in this process?
Thanks in advance!
There are two kinds of redirects (in this context). When a client, be it a browser, search engine indexing bot, or whatever, requests a URI, the server can tell the client "What you are looking for exists, but it's somewhere else". In the case of a 302 or temporary redirect, it's essentially telling the client "What you are looking for exists, but it's temporarily over here at this URL". In the case of a 301 or permanent redirect, it's essentially telling the client "What you are looking for exists, but it has permanently moved over to this URL".
In the case of the latter, browsers, proxy servers, and search engine indexes know that the old URL is no longer valid, that they should stop using it, and that from now on they should use the new URL returned by the server via the 301 redirect. A search engine like Google has an index entry for the old URL and all the data that it has accumulated over the lifetime of that URL associated with it. When one of its bots sees a 301, it knows that the old URL and its content aren't gone, but have just permanently moved to another URL. All of the associated data Google has collected for the old URL gets transferred to the new URL. Google can probably figure most of this stuff out without a 301 redirect, but it's a sure way to make sure Google gets it right.
You can do such a redirect via mod_rewrite:
RewriteEngine On
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /tutorials\.php\?name=([^&\ ]+)
# The trailing ? stops the old query string (name=...) from being appended to the new URL
RewriteRule ^ /tutorials/programming/%1.php? [L,R=301]
You should put this near the top of the htaccess file in your document root. The condition checks that an actual request has been made for /tutorials.php with a query string name="something". The "something" part gets grouped by the match and is accessed via the %1 backreference.
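By default, mod_rewrite carries the original query string over to the redirect target when the target has none of its own, so the client could land on /tutorials/programming/something.php?name=something. On Apache 2.4 or later, the QSD flag is one way to drop it. The following is a sketch under that version assumption, reusing the same condition as above:

```apache
RewriteEngine On

# Match the original request line so the rule only fires for real client
# requests for /tutorials.php?name=<something>.
# QSD ("query string discard", Apache 2.4+) removes the incoming
# ?name=... part, so the client lands on the bare new URL.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /tutorials\.php\?name=([^&\ ]+)
RewriteRule ^ /tutorials/programming/%1.php [L,R=301,QSD]
```

On older Apache versions, ending the target with a bare ? has the same effect.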
The 301 redirect is a response that the server can make which signals to the user (or search engine) that the page they are looking for has been permanently moved to a specified other page. It is possible to configure Apache to give a 301 for certain URLs, but it is probably easier to have whatever server-side language you are using take the request and then issue a 301.
The chances are that Google will work out what is going on fairly quickly without 301s or anything else, but submitting a sitemap to them or using the URL Parameters functionality in Google's Webmaster Tools might help.

Understanding difference between redirect and rewrite .htaccess

I'd like to understand the difference between redirecting and rewriting a URL using .htaccess.
So here's an example: Say I have a link like www.abc.com/index.php?page=product_types&cat=88 (call this the "original" url)
But when the user types in abc.com/shoes (let's call this the "desired" url), they need to see the contents of the above link. To accomplish this, I would do this:
Options +FollowSymLinks
RewriteEngine on
RewriteBase /
RewriteRule ^(.*)shoes(.*)$ index.php?page=product_types&cat=88
Nothing wrong with this code and it does the trick. However, if I type in the original url in the address bar, the content comes up, but the url does not change. So it remains as www.abc.com/index.php?page=product_types&cat=88
But what if I wanted the desired url (/shoes) to show up in the address bar if I typed in www.abc.com/index.php?page=product_types&cat=88? How would this be accomplished using .htaccess? Am I running into a potential loop?
Some of the explanation can be found here: https://stackoverflow.com/a/11711948/851273
The gist is that a rewrite happens solely on the server, the client (browser) is blind to it. The browser sends a request and gets content, it is none the wiser to what happened on the server in order to serve the request.
A redirect is a server response to a request, that tells the client (browser) to submit a new request. The browser asks for a url, this url is what's in the location bar, the server gets that request and responds with a redirect, the browser gets the response and loads the URL in the server's response. The URL in the location bar is now the new URL and the browser sends a request for the new URL.
Simply rewriting internally on the server does absolutely nothing to URLs in the wild. If google or reddit or whatever site has a link to www.abc.com/index.php?page=product_types&cat=88, your internal server rewrite rule does absolutely nothing to that, nor to anyone who clicks on that link, or any client that happens to request that URL for any reason whatsoever. All the rewrite rule does is internally change something that contains shoes to /index.php?page=product_types&cat=88 within the server.
If you want requests for the index.php page with the full query string to end up at the nicer-looking URL, you can tell the client (browser) to redirect to it. You need to be careful because rewrite rules loop: your redirect will be internally rewritten, which will cause a redirect, which will be internally rewritten, and so on, causing a loop that ends in an error such as a 500 Server Error. So you can match specifically against the request itself:
# The trailing ? on the target drops the old query string from the redirect
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\?page=product_types&cat=88
RewriteRule ^/?index\.php$ /shoes? [L,R=301]
This should only be used to make it so links in the wild get pointed to the right place. You must ensure that your content is generating the correct links. That means everything on your site is using the /shoes link instead of the /index.php?page=product_types&cat=88 link.
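Putting the two halves together, a complete .htaccess for this case might look like the sketch below. It assumes the file sits in the document root and narrows the internal rewrite to /shoes (with an optional trailing slash) rather than anything containing "shoes"; that narrower pattern is an illustrative choice, not the asker's rule.

```apache
RewriteEngine On
RewriteBase /

# 1. External redirect: the old URL as typed by the client -> /shoes.
#    THE_REQUEST is the raw request line, so the internal rewrite in
#    step 2 never matches it and no loop occurs.
RewriteCond %{THE_REQUEST} \s/index\.php\?page=product_types&cat=88\s
RewriteRule ^index\.php$ /shoes? [R=301,L]

# 2. Internal rewrite: /shoes -> the real script, invisible to the browser
RewriteRule ^shoes/?$ index.php?page=product_types&cat=88 [L]
```

A request for the old URL gets a 301 to /shoes, and the follow-up request for /shoes is served by index.php without the address bar changing.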

Prevent users from accessing files using non apache-rewritten urls

May be a noob question but I'm just starting playing around with apache and have not found a precise answer yet.
I am setting up a web app using url-rewriting massively, to show nice urls like [mywebsite.com/product/x] instead of [mywebsite.com/app/controllers/product.php?id=x].
However, I can still access the required page by typing the url [mywebsite.com/app/controllers/product.php?id=x]. I'd like to make it not possible, ie. redirect people to an error page if they do so, and allow them to access this page with the "rewritten" syntax only.
What would be the easiest way to do that? And do you think it is a necessary measure to secure an app?
In your PHP file, examine the $_SERVER['REQUEST_URI'] and ensure it is being accessed the way you want it to be.
There is no reason why this should be a security issue.
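# Redirect direct requests (no internal rewrite has happened yet) to the clean URL
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{QUERY_STRING} (?:^|&)id=([^&]+)
RewriteRule ^app/controllers/product\.php$ /product/%1? [R,L]
# Internally map the clean URL to the real script
RewriteRule ^product/(.*)$ /app/controllers/product.php?id=$1 [L]
The first rule redirects any direct request for /app/controllers/product.php, that is, one where the REDIRECT_STATUS environment variable is still empty, to the clean URL built from the id parameter. The internal rewrite in the last rule causes Apache to set this variable when calling the real page, so the rewritten request is not redirected again.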
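If the goal is an error page rather than a redirect, as the question asks, a variant along these lines may fit. It is a sketch that assumes every script under /app/controllers/ should be unreachable by direct request and that a 404 is the desired response; both are assumptions.

```apache
RewriteEngine On

# Direct hits on anything under /app/controllers/ get a 404.
# THE_REQUEST is the raw request line sent by the client, so requests
# that arrived as /product/x and were rewritten internally do not match.
RewriteCond %{THE_REQUEST} \s/app/controllers/
RewriteRule ^app/controllers/ - [R=404,L]

# Clean URL -> real script (internal rewrite)
RewriteRule ^product/(.*)$ /app/controllers/product.php?id=$1 [L]
```

The "-" target means "no substitution"; with a status outside the 3xx range, mod_rewrite simply stops and returns that status.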

Apache RewriteMap and hiding the URL

I'm trying to implement persistent URLs under Apache and I'm having trouble getting the URL passed back from the RewriteMap to remain hidden. That is, if I have the PURL:
http://www.mysite.com/psearch?purl=12345
and the mapped value for it is:
http://www.mysite.com/search?name=test&type=test2
I want the PURL to be the URL displayed in the browser address bar. Unfortunately, it keeps displaying the site that the PURL maps to instead. My rule is the following:
RewriteCond %{REQUEST_URI} /psearch(/)*$
RewriteMap mapper prg:/scripts/rewritetest.pl
RewriteRule ^/(.*)$ ${mapper:$1} [L]
All the mapper does right now is return a URL for a test page on the system, since I'm trying to get the address hiding working. And I know I'm not grabbing the parameters right now, I'm just trying to get the test running using the psearch keyword, and will add the rest later if it's possible to hide the address.
Any help is appreciated, Thanks!
Turns out the problem was that I was returning the full URL, which forced a full redirect. Passing back just the REQUEST_URI portion made things work.
Forcing the headers to expire also helped, since things were getting cached that were obscuring when something was working properly.
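The difference can be sketched as follows. This is an illustration only: the map name and script path are taken from the question, and the values the script prints are assumed for the example.

```apache
# Server or virtual-host config (RewriteMap is not allowed in .htaccess)
RewriteEngine On
RewriteMap mapper prg:/scripts/rewritetest.pl

# If the map program prints a URL-path such as
#   /search?name=test&type=test2
# the rule is an internal rewrite and the address bar keeps the PURL.
# If it prints a full URL such as
#   http://www.mysite.com/search?name=test&type=test2
# mod_rewrite may treat the result as an external redirect, which
# changes the address bar.
RewriteCond %{REQUEST_URI} ^/psearch/?$
RewriteRule ^/(.*)$ ${mapper:$1} [L]
```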