How does .htaccess work? - apache

I'm trying to make my website display its other pages as www.example.com/pageone/ instead of www.example.com/pageone.html.
The problem is that I'm reading up on ways to do that using .htaccess, and it's confusing me because I don't understand the commands.
I understand that the first step, however, is to write this in the .htaccess file:
RewriteEngine On
After this step, I have absolutely no idea what !-d, {REQUEST_FILENAME}, ^(.*) and all the rest of that syntax mean. Is there documentation that I can refer to?
Or can anyone provide a simple explanation of how to configure the .htaccess file so that, to reach
www.example.com/pageone.html
all I need to type into the address bar is
www.example.com/pageone/
and likewise for PHP files, etc.?

First of all, there's the Official Documentation. To solve your specific problem, I would go about it this way:
# Turn on the rewrite engine
RewriteEngine on
# If the request does not map to an existing file...
RewriteCond %{REQUEST_FILENAME} !-f
# ...and does not map to an existing directory
RewriteCond %{REQUEST_FILENAME} !-d
# ...perform this internal rewrite
RewriteRule ^([^/]+)/?$ /$1.html [L]
The RewriteConds apply only to the RewriteRule that immediately follows them. If you have multiple rules, you need to repeat the conditions for each one.
Now, the Apache server takes the requested path (everything after www.example.com/) and checks whether it matches any of the rules you've specified. In this case, there is only one:
^([^/]+)/?$
This regular expression matches one or more characters that are not a slash (/), followed by an optional trailing slash. If a match is found, the request is rewritten to the second parameter, /$1.html. $1 means "whatever was matched between the parentheses", which in our case is all of the non-slash characters.
The [L] flag tells the rewriting engine to stop processing further rules in this pass once this rule has matched.
So, to conclude, www.example.com/whatever/ will be rewritten silently at the server to www.example.com/whatever.html
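To see what the pattern captures, the rule can be mimicked outside Apache. This is only a minimal sketch in Python, assuming Python's re module treats this simple pattern the same way Apache's regex engine does. The function name rewrite is made up for illustration, and the !-f / !-d file checks are not modelled.

```python
import re

# Same pattern as the RewriteRule; in .htaccess the leading slash is already stripped
PATTERN = re.compile(r"^([^/]+)/?$")

def rewrite(path):
    """Return the rewritten path, or None if the rule does not match."""
    match = PATTERN.search(path)
    if match is None:
        return None
    # $1 in the RewriteRule corresponds to group 1 here
    return "/" + match.group(1) + ".html"

print(rewrite("whatever/"))  # /whatever.html
print(rewrite("whatever"))   # /whatever.html
print(rewrite("a/b/"))       # None: the path contains an inner slash
```

Both the form with and without the trailing slash end up at the same .html file, while nested paths are left alone.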

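RewriteEngine on
RewriteBase /
# Skip requests that already map to a real file (such as foo.html itself);
# without this, the rewritten URL would match the rule again
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^/]+)$ /$1.html
That should be all you need for this rewrite. It basically says "anything that is not a forward slash will be assigned to the variable $1". So /foo would point to /foo.html.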

For the official documentation, see Apache httpd mod_rewrite.
On Google you can search with keywords such as "url rewriting tutorial".
The weird characters are called regular expressions. They're not easy to learn, but there are plenty of tutorials about them.
PS: This is not a direct answer, but it should help you go further and understand how URL rewriting works.

Related

Editing .htaccess file to modify URL

I'm trying to modify my .htaccess file to modify my URL and have tried many methods but cannot achieve exactly what I want. For example I have this URL:
http://mywebsite.com/FOLDER/index.php?id=5
Now I want it to look like:
http://mywebsite.com/FOLDER/5
or
http://mywebsite.com/FOLDER/ID/5
My .htaccess contains the following code:
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^index/([0-9]+)/([0-9a-zA-Z_-]+) index.php?id=$1 [NC]
I cannot figure out what's wrong. Thanks.
You can use:
RewriteEngine on
# external redirect from actual URL to pretty one
RewriteCond %{THE_REQUEST} \s/+FOLDER/index\.php\?id=(\d+) [NC]
RewriteRule ^ /FOLDER/%1? [R=301,L,NE]
# internal forward from pretty URL to actual one
RewriteRule ^FOLDER/(\d+)/?$ FOLDER/index.php?id=$1 [L,QSA,NC]
The first argument of RewriteRule is the pattern that the incoming URL is matched against, with the domain and any preceding directory path stripped off. In your case the incoming URL is http://mywebsite.com/FOLDER/5. Assuming that your .htaccess file is in your DocumentRoot, the regex will be matched against FOLDER/5.
You are currently trying to match FOLDER/5 against ^index/([0-9]+)/([0-9a-zA-Z_-]+), which is not going to work. A better regex would be ^(.*)/([0-9]+)$ or ^(.*)/ID/([0-9]+)$. You can then rewrite to $1/index.php?id=$2. I would recommend using the [L] flag to stop rewriting for this round; it avoids common problems where several rules match when you do not expect them to.
Besides this, make sure that your .htaccess file is actually being read (e.g. put garbage in it and check that you get a 500 Internal Server Error), that mod_rewrite is enabled, and that you are allowed to override FileInfo. You may also need to turn AcceptPathInfo off.
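The two suggested patterns can be tried out quickly outside Apache. The following is a rough sketch in Python, assuming its re module handles these patterns the same way Apache does; apply_rule is a hypothetical helper that stands in for the $1/index.php?id=$2 substitution.

```python
import re

# Hypothetical helper: apply one pattern to a request path and build
# the substitution $1/index.php?id=$2 from the two captured groups
def apply_rule(pattern, path):
    match = re.search(pattern, path)
    if match is None:
        return None
    return f"{match.group(1)}/index.php?id={match.group(2)}"

print(apply_rule(r"^(.*)/([0-9]+)$", "FOLDER/5"))        # FOLDER/index.php?id=5
print(apply_rule(r"^(.*)/ID/([0-9]+)$", "FOLDER/ID/5"))  # FOLDER/index.php?id=5
# The pattern from the question expects the path to start with "index/"
print(re.search(r"^index/([0-9]+)/([0-9a-zA-Z_-]+)", "FOLDER/5"))  # None
```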

Apache rewrite Subdirectories URL Internally Redirect to Query String

I want to make Apache rewrite all links of the form:
host.com/links/<a>/<b>/<c>
such as:
host.com/links/1/2/3
to the form:
host.com/links/?a=1&b=2&c=3
I understand I need to add an .htaccess file with rewriting rules to the links folder, but I don't really understand the syntax of the rewriting rules.
Can anyone help?
According to this link
The URL in the browser would be:
host.com/links/1/2/3
The actual page rendered by the server would be:
host.com/links/?a=1&b=2&c=3
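Add these lines to the .htaccess file in the document root:
RewriteEngine On
# In .htaccess the pattern is matched without the leading slash
RewriteRule ^links/([^/]+)/([^/]+)/([^/]+)/?$ /links/?a=$1&b=$2&c=$3 [PT]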
When you want to rewrite more complex URLs, you need to create more
complex regular expressions. Just about any pattern can be expressed
as a regular expression, if you break it down into small chunks. This
regular expression breaks down into just a few component parts, once
you get past staring at the seemingly random characters:
[^/]
The above component is a character class containing a "not slash".
So, if we do ...
[^/]*
that means "zero or more not-slash characters". In other words, we're
looking for everything between the slashes. There are two sets of
these, because we're looking for two blocks of things between slashes.
Armed with that little nugget of information, go look at the regular
expression again and see if it makes a little more sense. As with the
earlier example, I used the [PT] flag to indicate that the target URL was not
merely a file to be served, but was something that needed to be
handled. In this case, it's going to be a cgi-script handler. So
Apache passes the resulting URL through to that handler.
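The "not slash" groups can be checked outside Apache as well. Here is a small sketch in Python, assuming the pattern is matched against links/1/2/3 without a leading slash (as happens in a document-root .htaccess) and that Python's re module behaves like Apache's engine for this pattern. The name to_query is made up for the example.

```python
import re

# Three "not a slash" groups, one per path segment after links/
PATTERN = re.compile(r"^links/([^/]+)/([^/]+)/([^/]+)")

def to_query(path):
    """Map links/<a>/<b>/<c> to the query-string form, or None if it does not match."""
    match = PATTERN.search(path)
    if match is None:
        return None
    a, b, c = match.groups()
    return f"/links/?a={a}&b={b}&c={c}"

print(to_query("links/1/2/3"))  # /links/?a=1&b=2&c=3
print(to_query("links/1/2"))    # None: only two segments
```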
The basic syntax of a rewriting rule looks like the example below.
Enable mod_rewrite and .htaccess through httpd.conf and then put this code in your .htaccess under the DOCUMENT_ROOT directory:
Options +FollowSymLinks -MultiViews
# Turn mod_rewrite on
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_HOST} ^(demo)\.(.+)$ [NC]
RewriteRule ^(setup)/?$ http://www.%2/%1/$1 [L,R=301,NC]
RewriteCond %{HTTP_HOST} ^(demo)\.(.+)$ [NC]
RewriteRule ^(xyz)/?$ http://www.%2/%1/web/$1 [L,R=301,NC]
RewriteCond %{HTTP_HOST} !^demo\. [NC]
RewriteRule ^(xyz)/?$ /web/$1 [L,R=301,NC]
For more information about rewrite flags, see:
https://httpd.apache.org/docs/2.4/rewrite/flags.html
https://httpd.apache.org/docs/2.4/rewrite/intro.html

Shorten URLs with mod_rewrite

I am currently trying to make a URL shortener feature for one of my projects. What I want is this: if a user visits the site with a URL that does not contain any slashes (for directories) or file extensions, the request should be handed to a PHP script that will serve up the correct file. For example:
http://example.com/A123 would be rewritten as http://example.com/view.php?id=A123
but
http://example.com/A123/ would not be rewritten, and
http://example.com/A123.png would not be rewritten either. I have been messing with mod_rewrite for a few hours now and for the life of me I cannot get this to work...
With no way to identify the URI that needs to be shortened, you need to exclude all other possibilities. This will likely require you to build a lengthy list of exclusions. Below is a starting point. Each of these conditions verifies that the requested URI does NOT match a pattern (signified by the !). Only when every condition holds is the rule run.
RewriteCond %{REQUEST_URI} !^/view\.php
RewriteCond %{REQUEST_URI} !\.html$
RewriteCond %{REQUEST_URI} !/$
RewriteRule ^/?(.*)$ http://example.com/view.php?id=$1 [QSA]
The above also requires you (as you have requested) to break with standard practice, namely that directory requests without a trailing slash are still handled as directories. You are likely to come across other issues, as the rules above interfere with Apache's own directory handling.
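To get a feel for how far this starting list goes, the three conditions can be evaluated by hand. This is a rough sketch in Python, assuming the dots in the patterns are escaped and that Python's re module matches them as Apache would; would_rewrite is a made-up name, and only the conditions are modelled, not the rule itself.

```python
import re

# Each entry mirrors one negated RewriteCond: the rule runs only if none of them match
EXCLUSIONS = [r"^/view\.php", r"\.html$", r"/$"]

def would_rewrite(request_uri):
    """True if every negated condition holds, i.e. no exclusion pattern matches."""
    return not any(re.search(pattern, request_uri) for pattern in EXCLUSIONS)

for uri in ["/A123", "/A123/", "/page.html", "/view.php", "/A123.png"]:
    print(uri, would_rewrite(uri))
```

Note that /A123.png still passes all three conditions, so an exclusion for image extensions would have to be added to the list.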
Rethinking the logic: if you had some way to identify the URLs that are to be shortened, it would be much easier. For example, use a prefix such as 's': http://example.com/s/A123.
RewriteCond %{REQUEST_URI} ^/s/
RewriteRule ^/?s/(.*)$ http://example.com/view.php?id=$1 [QSA]
I'm definitely no guru at this, but it's similar to what I'm trying to accomplish (see my as yet unanswered question).
However, if I understand correctly, this (untested) RewriteRule may work:
RewriteRule ^([^\.\/]*)$ view.php?id=$1 [L]
The magic part is the [^\.\/]*, which says: zero or more (*) instances of a character ([]) which is not ([^ ]) a period or a slash (\ escapes these characters). Use + instead of * if the empty path (the site root) should not match.
Like I said, I haven't tested this, nor am I an expert, but perhaps this will help.
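Since the rule is untested, here is a quick way to check the pattern itself against the three URLs from the question. It is a sketch in Python, assuming Python's re module treats the pattern like Apache does and that the path is matched without its leading slash; short_id is a made-up helper name.

```python
import re

# The pattern from the RewriteRule above
PATTERN = re.compile(r"^([^\.\/]*)$")

def short_id(path):
    """Return the captured id, or None if the path would not be rewritten."""
    match = PATTERN.search(path)
    return match.group(1) if match else None

print(short_id("A123"))      # A123
print(short_id("A123/"))     # None
print(short_id("A123.png"))  # None
print(repr(short_id("")))    # '': the * also accepts an empty path
```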

How can you ignore the end of a URL using mod_rewrite?

I'd like to structure my website like this:
domain.com/person/edit/1
domain.com/person/edit/2
domain.com/person/edit/3
etc.
I have a page to which all these requests should go:
domain.com/person/edit.html
The JavaScript will look at the trailing part of the url when the page is loaded so I want the server to internally ignore it.
I've got this rewrite rule:
RewriteRule ^person/view/(.*)$ person/view.html [L]
I'm sure that I'm missing something obvious but when I visit one of the pages above I get this 404 message:
The requested URL /person/view.html/1 was not found on this server.
As far as I understood it the [L] means that if this rule applies Apache should stop rewriting and serve up the alternate page. Instead it seems to be applying the rule at the earliest possible moment and then appending the rest of the unmatched url to the re-written one.
How do I get these re-writes to work properly?
"As far as I understood it the [L] means that if this rule applies Apache should stop rewriting and serve up the alternate page."
Well, the [L] flag tells Apache to stop checking the remaining rules in the current pass. The rewritten URL then goes into the next iteration, where it is checked against all the rules again (that is how it works).
Try this "recipe" (put it somewhere near the top of your .htaccess):
Options +FollowSymLinks -MultiViews
# activate rewrite engine
RewriteEngine On
# Do not do anything for already existing files
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule .+ - [L]
Another idea to try: add the DPI flag to your [L], giving [L,DPI].
If the Options line does not help, then the rewrite rule should. But it all depends on your Apache configuration. If the above does not work, please post your whole .htaccess (update your question).

Why would mod_rewrite rewrite twice?

I only recently found out about URL rewriting, so I've still got a lot to learn.
While following the Easy Mod Rewrite tutorial, the results of one of their examples is really confusing me.
RewriteBase /
RewriteRule (.*) index.php?page=$1 [QSA,L]
Rewrites /home as /index.php?page=index.php&page=home.
I thought the duplicates might have been caused by something in my host's configs, but a clean install of XAMPP does the same.
So, does anyone know why this seems to parse twice?
And, to me this seems like, if it's going to do this, it would be an infinite loop -- why does it stop at 2 cycles?
From Example 1 on this page, which is part of the tutorial linked in your question:
Assume you are using a CMS system that rewrites requests for everything to a single index.php script.
RewriteRule ^(.*)$ index.php?PAGE=$1 [L,QSA]
Yet every time you run that, regardless of which file you request, the PAGE variable always contains "index.php".
Why? You will end up doing two rewrites. Firstly, you request test.php. This gets rewritten to index.php?PAGE=test.php. A second request is now made for index.php?PAGE=test.php. This still matches your rewrite pattern, and in turn gets rewritten to index.php?PAGE=index.php.
One solution would be to add a RewriteCond that checks if the file is already "index.php". A better solution that also allows you to keep images and CSS files in the same directory is to use a RewriteCond that checks if the file exists, using -f.
Note: the link above is to the Internet Archive, since the tutorial website appears to be offline.
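The two passes can be played through by hand. The following is a simplified model in Python, not Apache's actual implementation. It assumes the rules are re-run after each internal rewrite and that processing stops once the rewritten path equals the path the pass started with. That stopping rule is my reading of the observed behaviour, and the function names are made up.

```python
import re

# RewriteRule (.*) index.php?page=$1 [QSA,L]
RULE = re.compile(r"(.*)")

def rewrite_once(path, query):
    """Apply the rule once; QSA appends the existing query string."""
    captured = RULE.search(path).group(1)
    new_query = "page=" + captured
    if query:
        new_query += "&" + query
    return "index.php", new_query

def simulate(path, query=""):
    """Repeat passes until the path no longer changes (assumed stopping rule)."""
    passes = 0
    while True:
        new_path, new_query = rewrite_once(path, query)
        passes += 1
        if new_path == path:
            return new_path, new_query, passes
        path, query = new_path, new_query

print(simulate("home"))  # ('index.php', 'page=index.php&page=home', 2)
```

Under these assumptions the second pass adds page=index.php in front of the first result and then stops because the path is already index.php. That matches the /index.php?page=index.php&page=home reported in the question.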
From the Apache Module mod_rewrite documentation:
'last|L' (last rule)
[…] if the RewriteRule generates an internal redirect […] this will reinject the request and will cause processing to be repeated starting from the first RewriteRule.
To prevent this you could either use an additional RewriteCond directive:
RewriteCond %{REQUEST_URI} !^/index\.php$
RewriteRule (.*) index.php?page=$1 [QSA,L]
Or you can alter the pattern so that it does not match index.php and use the REQUEST_URI variable, either in the substitution or later in PHP ($_SERVER['REQUEST_URI']).
RewriteRule !^index\.php$ index.php?page=%{REQUEST_URI} [QSA,L]
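The effect of the extra RewriteCond can be sketched the same way. This is a simplified Python model of the guarded rule, assuming REQUEST_URI carries the leading slash while the rule pattern sees the path without it; rewrite_with_guard is a hypothetical name.

```python
import re

def rewrite_with_guard(request_uri, query=""):
    """One pass of the guarded rule; returns the input unchanged if the guard blocks it."""
    # RewriteCond %{REQUEST_URI} !^/index\.php$
    if re.search(r"^/index\.php$", request_uri):
        return request_uri, query
    # RewriteRule (.*) index.php?page=$1 [QSA,L]
    page = re.search(r"(.*)", request_uri.lstrip("/")).group(1)
    new_query = "page=" + page + ("&" + query if query else "")
    return "/index.php", new_query

first = rewrite_with_guard("/home")
second = rewrite_with_guard(*first)
print(first)   # ('/index.php', 'page=home')
print(second)  # ('/index.php', 'page=home')
```

The second pass leaves the request untouched, so page keeps the value home.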