Htaccess /username rewrite conflict with existing mapped files - apache

I have made a .htaccess file to rewrite /username to /profile.php?username=username.
This is my .htaccess file:
Options All -Indexes
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([a-zA-Z0-9_-]+)$ /profile.php?username=$1 [QSA,L]
The issue is that the other files at the same level are skipped unless I add their .php extension, which is awful.
Can I prevent the /username rewrite if that file exists, and also add another rewrite so that URLs work without the file extension?

You can keep these 2 rules in the given order:
Options All -Indexes -MultiViews
RewriteEngine On
## To internally rewrite /dir/file to /dir/file.php
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^(.+?)/?$ $1.php [L]
## for user profile
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([\w-]+)/?$ profile.php?username=$1 [QSA,L]
It is important to turn off MultiViews (Apache's content negotiation service) for this, which is what -MultiViews in the Options line does.

#MrWhite nothing for now, i just want to stop rewrite when file exist
The rules would work together. If you check that the target .php file exists before rewriting the request (on your extensionless URLs) - as you should be - then you don't need to apply the same filesystem check on your existing rule that rewrites the request to profile.php.
For example:
# Append ".php" if request file without extension and target file exists
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^([^.]+[^/])$ $1.php [L]
# Rewrite user profiles (directory check is optional)
#RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([a-zA-Z0-9_-]+)$ profile.php?username=$1 [QSA,L]
I'm assuming your URLs do not contain dots (this naturally avoids having to make an exception for requests that end in a file extension). The regex ^([^.]+[^/])$ matches URLs that do not contain a dot and do not end in a slash (ie. directories end in a slash).
Filesystem checks are relatively expensive, so are best kept to a minimum (or avoided altogether if possible). In the rule that appends the ".php" extension, there is no need to check that the request does not map to a directory before checking that the request does map to a file when .php is appended; these checks are mutually inclusive. (But if a directory did exist then the file wouldn't be accessible - catch 22.)
Likewise, there is no need to check that the user-profile URL does not map to a file, unless you also have files that don't have a file extension (very unlikely and best avoided anyway if you do).
Even the directory check on the user-profile URL is debatable. This is only required if you need to be able to access subdirectories off the root directory directly.
With this limited set of rules, it doesn't really matter whether MultiViews is enabled or not. (Although best practice would dictate that MultiViews should be disabled here, to avoid future conflicts.) The effect of having MultiViews enabled will just mean the first rule that appends the .php file extension is bypassed (not required). But having MultiViews enabled essentially enables extensionless URLs on everything.
Consider restructuring your user-profile URLs
HOWEVER, there is a fundamental "problem" with your user-profile URL structure - namely that they do "conflict" with actual file requests. The actual file requests will naturally take priority, but this means that you can't have usernames that happen to match files in the root directory - since the user profile will not be accessible. This check would need to be enforced when creating/updating user accounts.
It would be better to avoid this ambiguity to begin with and allow all usernames (that could also match root files) by creating a "unique" URL. eg. /user/<username>. This also completely avoids having to perform the directory check. For example:
# Rewrite user profiles (directory check is not required)
RewriteRule ^user/([a-zA-Z0-9_-]+)$ profile.php?username=$1 [QSA,L]

Related

.htaccess rewrite - remove all extensions

I would like to write a rewrite rule that removes all extensions, regardless of filename:
https://example.com/filename.extension -> https://example.com/filename
for example:
https://example.com/horses.txt -> https://example.com/horses
https://example.com/icecream.json -> https://example.com/icecream
I tried:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteRule ^(.*)\.*$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)$ *? [QSA,L]
</IfModule>
not working as it should
You can only reasonably do what you are asking with MultiViews.
For example, as simple as:
Options +MultiViews
You need to remove your existing mod_rewrite directives.
Now, a request for example.com/horses will be correctly routed to /horses.txt, or whatever file extensions you are using. MultiViews uses mod_negotiation.
This isn't so easy to do with mod_rewrite, since you need to test each file extension in turn in order to work out what file you need to rewrite back to in order to route the request correctly. eg. Should a request for example.com/horses route to /horses.txt or /horses.jpg? MultiViews does this comparison for you.
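To illustrate why this gets unwieldy, here is a rough sketch of the mod_rewrite approach. It assumes the .htaccess file is in the document root and that only .txt and .json files need handling; the extension list and its priority order are made up for this example, and every further extension would need its own pair of directives.

```apache
RewriteEngine On

# Try each candidate extension in turn; the first existing file wins.
# $1 is the extensionless URL-path captured by the RewriteRule pattern.
RewriteCond %{DOCUMENT_ROOT}/$1.txt -f
RewriteRule ^([^.]+)$ $1.txt [L]

RewriteCond %{DOCUMENT_ROOT}/$1.json -f
RewriteRule ^([^.]+)$ $1.json [L]
```

With this sketch, a request for /horses would be served from /horses.txt if that file exists, otherwise from /horses.json if that exists, and would otherwise fall through untouched.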
I would like to write a rewrite rule that removes all extensions
Although, you need to actually remove the file extension in the HTML source. This isn't something you do in .htaccess, unless you need to preserve SEO or backlinks that have already linked back to the old URLs.
UPDATE: Perhaps I wasn't clear enough, I would like the url to display without the extension even if it is linked to it, or to go to that file if linked without the extension
Well, you need to actually remove the file extension on all your internal links. You can issue a "redirect" in .htaccess to remove the extension for the benefit of search engines and 3rd party links - but if you rely on this for your internal links then it will potentially slow users and your site as you are doubling the number of requests hitting your server.
To remove the file extension for direct requests (SEO / 3rd party links), you could do something like this:
RewriteEngine On
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^([^.]+)\.[\w]{2,4}$ /$1 [R=302,L]
This does assume that the only dot in the URL-path is the one that delimits the file extension.
The difficult part is then internally rewriting the request back to the underlying file with an extension - that's where MultiViews comes in (first part of my answer).

Removing .php extensions from output

I'm developing a small CMS solution with Perch. It's currently running on WampServer on my local development machine.
As Perch doesn't provide friendly URLs out of the box, I wanted to implement this, whilst ensuring the /perch directory remains untouched.
So far, I have the rewriting part working i.e. a request for /blog.php will 301 to /blog, and, /blog will rewrite to /blog.php, using the rules below:
Options +FollowSymLinks -MultiViews
RewriteEngine On
# Rewrites domain.com/file to domain.com/file.php
RewriteCond %{REQUEST_URI} !^/perch
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^(.*)$ $1.php
# Redirects domain.com/file.php to domain.com/file
RewriteCond %{REQUEST_URI} !^/perch
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteCond %{REQUEST_URI} ^(.+)\.php$
RewriteRule (.*)\.php$ /$1 [R=301,L]
However, I'm still left with .php extensions in the HTML output. I tried adding the following to my .htaccess file:
AddOutputFilterByType SUBSTITUTE text/html
#Replace all .php extensions
Substitute s|.php||ni
#Original blog pattern /blog/post.php?s=2014-11-18-my-first-blog-post
Substitute s|blog/post\?s=(\w+)|blog/$1|i
However, this is applied globally, i.e. even to links within the /perch folder. I couldn't find any way of adding a condition to apply it to everything except the /perch folder - is there such a way?
I also looked at the ProxyPass/ProxyPassReverse documentation, but this seems like overkill to just replace some HTML on a page.
Any help would be greatly appreciated.
Kind regards,
dotdev
Are you talking about the Perch CMS from www.grabaperch.com?
Everything is here: http://docs.grabaperch.com/video/v/simple-url-rewriting/
However, I'm still left with .php extensions in the HTML output
.htaccess / mod_rewrite does nothing to your HTML output.
Think of the RewriteRules as a postman who delivers mail (URLs) to target mailboxes (actual files).
What you do is you "manually" omit the .php extension in your markup (HTML output):
In perch_pages_navigation(), you need to set hide-extensions to true
URLs you add manually: just write them without .php
Now you need to instruct the postman to route those addresses to the .php file anyway. That's what these RewriteRules are for. So .htaccess doesn't remove the .php suffix - on the contrary, it adds it.
Here's the basic .htaccess (goes into your public_html directory) for Perch (or any "remove .php" use case) + Perch Blog. I've added some explanations:
RewriteEngine On
# RewriteCond directives apply only to the single RewriteRule that follows them,
# so each rule below carries its own conditions.
# Blog posts: make sure the address we received is not an existing file
RewriteCond %{REQUEST_FILENAME} !-f
# make sure it's not an existing directory either
RewriteCond %{REQUEST_FILENAME} !-d
# if the address starts with "blog/", pick what comes afterwards, put it into the GET parameter and quit (that's the [L])
RewriteRule ^blog/([a-zA-Z0-9/-]+)$ /blog/post.php?s=$1 [L]
# Everything else: make sure the address we received (e.g. /mypage) is not an existing file
RewriteCond %{REQUEST_FILENAME} !-f
# make sure it's not an existing directory either
RewriteCond %{REQUEST_FILENAME} !-d
# make sure there IS an existing .php file corresponding to it
RewriteCond %{REQUEST_FILENAME}.php -f
# if it wasn't a blog post (else we would have quit), just append .php to it. Ah, and keep other GET params (that's the QSA = Query String Append).
RewriteRule ^(.+)$ $1.php [L,QSA]
For more refined possibilities, you can e.g. start here: https://github.com/PerchCMS/perchdemo-swift/blob/master/public_html/.htaccess
This will have no impact at all on the functionality of the CMS in /perch/.
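If you would rather make that exclusion explicit instead of relying on the conditions, one option is a pass-through rule placed before the other rules. This is only a sketch, assuming the .htaccess file sits in the document root and the CMS lives in /perch/; it is not taken from the Perch documentation.

```apache
RewriteEngine On

# Leave anything under /perch/ alone: "-" means "no substitution",
# and [L] stops the remaining rules from being applied to this request.
RewriteRule ^perch(/|$) - [L]

# ... the extension-handling rules follow here ...
```

The pattern is matched against the URL-path relative to the directory containing the .htaccess file, which is why there is no leading slash.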

How do I make a custom URL parser with Apache?

I heard this can be done with the web.config file. I want to make it so, for instance, my URL http://help.BHStudios.org/site might go to http://BHStudios.org/help.php?section=site, or http://i.BHStudios.org/u3Hiu might redirect to some other URL stored in a database with the hash u3Hiu as the key, or if something goes wrong and the internal file structure is exposed like http://Kyli.BHStudios.org/http/bhstudios/v2/self/index.php (something that happens with GoDaddy's servers for whatever reason) it'll change it to its intended URL http://Kyli.BHStudios.org before that's exposed to the user.
Since I've never done this before, could you please also explain why you gave the answer you did?
A few Apache mod_rewrite rules, in either your server's httpd.conf or in a .htaccess file in your htdocs directory, will do the majority of what you want, e.g.
RewriteEngine On
RewriteBase /
# If the browser request contains a .php, instruct the browser to remove it.
RewriteCond %{THE_REQUEST} \.php [NC]
RewriteRule ^/?(.*)\.php$ http://%{HTTP_HOST}/$1 [R=301,NC,L]
# Specific rule - must come before the default rule, which would otherwise catch /site first
RewriteRule ^/?site /help.php?section=site [L]
# Default Rule - for non physical objects (not a file or directory):
# Internally rewrite (user won't see the URL) to /index.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^ /index.php [L]
The masking of real file system objects will not be perfect, and is slightly pointless, as a user just needs to right click and view source on any served page to obtain the actual URLs.

\. in RewriteCond matches a file requested from the browser, but not relative references in the file itself

I am trying to redirect files in a directory to a CGI if the URL does not exist, where the URL is further processed. Since that also has consequences for non-existing files referenced inside of HTML documents, an HTML document can trigger an entire cascade of redirects if it contains a lot of references (imgs, css files, js files) that do not exist. Of course, in an ideal world, all of those should exist but, well...
Anyway, since I won't be using the character "." in any of the URLs I want to redirect, I thought it a pretty nifty idea to exclude file names with a "." in the RewriteCond, since that should take care of .css, .js and .gif/.jpg.
Not so lucky. If I enter a URL with a "." in the browser location, I get the (correct) message "file not found", but when I check in the server logs, every non-existent file referenced by the HTML template is passed on to the CGI, regardless of whether it contains "." or not. css/doc.css is processed as will be images/bg.png and all other files containing ".". My .htaccess file contains the following rules:
RewriteEngine on
RewriteBase /bla
RewriteCond %{REQUEST_FILENAME} !\.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule (.*) /cgi-bin/env.pl?.template=Main.html&.query=$1 [PT]
However, after changing one of the references in the HTML-file for an external stylesheet (css/doc.css) to an absolute URL (aka /css/doc.css), it only provokes the "File not found" error, as it should according to the above rule. Is Apache not applying those regexes to relative URLs?
First of all, there is nothing relative when Apache receives an HTTP/HTTPS request. Relative paths are resolved by your browser itself before it sends the request to the web server.
Now try changing your code to this:
Options +FollowSymLinks -MultiViews
# Turn mod_rewrite on
RewriteEngine On
RewriteBase /bla/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !\.
RewriteRule ^(.+)$ /cgi-bin/env.pl?.template=Main.html&.query=$1 [L,QSA]

Trying to password protect a URL with htaccess

I am using Expression Engine 2 Freelancer edition, which doesn't have an authentication module.
I am trying to password protect a template group that has a virtual directory www.domain.com/template
What I am trying to do is use the htaccess in the root to force people to enter a username and password when they try to navigate to the "template" section and the two files under it.
The way that Expression Engine works the templates are routed to and not physical directories.
My question is how can I password protect this url, I tried using LocationMatch but it didn't work?
Thanks
You can't efficiently protect a mod_rewritten URL (if it's possible at all). An attacker would just have to access the physical location that the protected URL gets rewritten to - which you would be leaving unprotected in this scenario.
You will still have to do this on PHP side, I think. If your PHP is running as an Apache module, it should be possible to check whether the requested resource belongs to the protected directory (either through QUERY_STRING or some other indicator), and then send the proper headers requesting authentication as described here in the PHP manual.
Which method of removing index.php from the URL are you using?
If you're using the "File and Directory Check" Method, you can modify the stock Apache mod_rewrite rule to exclude a certain directory while still allowing all other requests to be run thru index.php.
For example, using the base "File and Directory Check" rewrite rule:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond $1 !\.(gif|jpe?g|png)$ [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php/$1 [L]
</IfModule>
With this method, Apache checks to see if the file or directory exists -- if it does the file is served to the browser; if it doesn't exist then it's sent thru index.php and parsed as an ExpressionEngine URI.
To exclude your directory, modify the rewrite rule by adding your .htaccess Basic Authenticated password-protected directory:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_URI} !^/(secret-directory|secret-directory/.*)$
RewriteCond $1 !\.(gif|jpe?g|png)$ [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php/$1 [L]
</IfModule>
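For completeness, the Basic Authentication part could look roughly like the sketch below. It assumes secret-directory exists as a physical directory on disk with its own .htaccess file; the realm name and the password file path are placeholders, not values from the question.

```apache
# .htaccess inside the physical /secret-directory/ folder
AuthType Basic
AuthName "Restricted area"
# Placeholder path: keep the password file outside the document root
AuthUserFile /path/to/.htpasswd
Require valid-user
```

The password file itself can be created with Apache's htpasswd utility, for example `htpasswd -c /path/to/.htpasswd someuser`.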
I'm not fully aware of what all the limitations are with the Freelancer License, but I answered a similar question about password-protecting pages in ExpressionEngine that may prove helpful in your situation.