Mod Rewrite and Redirecting directories - apache

A little explanation and then 2 questions....
Essentially I am building a single page app to display media (by tags, by type, etc.). All the media is uploaded and tagged by me, so I am not scraping for content or relying on 3rd party services (twitter, facebook, flickr, imgur, etc.). I am doing most of the work with JS (RequireJS modules...) and am leveraging ToroPHP for a simple, lightweight and RESTful API.
My end goal is this:
Allow returning users to type in URLs like: / OR //, and always load my root index.php (maintaining the URL). At the same time I need several subdirectories available for the API to fetch data:
/assets/ (CSS, Font Files, Sprites or SVG Icons)
/components/ (for RequireJS scripts)
/api/ (this is just a sub directory that has a ToroPHP instance for the API)
I believe the below snippet solves this issue (I was wondering if I could get a good explanation of what this is doing though? I have pieced it together from snippets on the internet):
RewriteEngine on
RewriteRule ^/?assets/.+$ - [L]
RewriteRule ^/?components/.+$ - [L]
RewriteRule ^/?api/.+$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond $1 !^(index\.php)
RewriteRule ^(.*)$ /index.php [L]
Additionally, I was hoping you could help me find a way to allow only my "app" (that is, internal calls to the API initiated by RequireJS modules; is there a server user on Apache?) to access /assets/, /components/ and /api/, so that if a user types in /api/test?subject=123 they are routed to a page that isn't index.php but isn't the actual API either. I would like the same for /components/ and /assets/.
Summary questions:
1). Help explain the code snippet above.
2). Can I allow my server access to /assets/ & /components/ & /api/ but not allow a user to type into them?
Obviously, Apache isn't my specialty, but I am fairly confident in learning.
Thanks!

Help explain the code snippet above.
RewriteEngine on
Turns on the rewrite engine. None of the rules do anything unless the rewrite engine is turned on.
RewriteRule ^/?assets/.+$ - [L]
RewriteRule ^/?components/.+$ - [L]
RewriteRule ^/?api/.+$ - [L]
These rules are called "pass-through" rules. The - target means "do nothing" and the L flag stops the rewriting for the current pass. These essentially just mean: if the URI starts with /assets/, do nothing and stop rewriting. If the URI starts with /components/, do nothing and stop rewriting. If the URI starts with /api/, do nothing and stop rewriting. Because the patterns end in .+, at least one character must follow the slash, so a request for the bare /assets/ directory is not matched by these rules.
The next rule has a few conditions associated with it. The rule won't get applied unless all conditions are met:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
These check whether the request maps to an existing file or directory. !-f is true when the request isn't for an existing file, and !-d is true when it isn't for an existing directory.
RewriteCond $1 !^(index\.php)
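$1 is the first capture group of the RewriteRule pattern that follows (conditions are evaluated only after the rule's pattern has matched), so this checks that the requested path doesn't start with index.php.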
Finally, if all 3 conditions are met, then:
RewriteRule ^(.*)$ /index.php [L]
rewrite whatever the request is to /index.php, and stop rewriting.
Can I allow my server access to /assets/ & /components/ & /api/ but not allow a user to type into them?
No. If someone goes to your page and your page links to something in one of these directories, the browser loads it just like it would if someone typed it into the URL address bar. The only difference is that (sometimes) the browser will include a "Referer" request header telling the server which page caused the browser to load the file. Not all browsers send it, and it can easily be forged, so checking the referer doesn't guarantee that people can't load your files directly.
In order to check the referer, add this right below the RewriteEngine On line:
RewriteCond %{HTTP_REFERER} !^https?://example.com/
RewriteRule ^(assets|components|api)/ - [L,F]
This condition checks the referer: if it doesn't start with "http://example.com/" or "https://example.com/" (assuming "example.com" is your site) and the request starts with /assets/, /components/ or /api/, the URL is left unchanged (the - target) and the F flag makes the server return a 403 Forbidden. Requests that carry no Referer header at all are refused as well, since an empty value doesn't match the pattern.
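Putting the pieces of this answer together, one possible ordering for the whole .htaccess file is sketched below. It is only a sketch: example.com is a placeholder for your own host, the three pass-through rules are merged into one alternation, and the `$1` check against index.php is left out on the assumption that index.php exists as a file and is therefore already excluded by the !-f condition.

```apache
RewriteEngine On

# Refuse requests to the protected directories unless the Referer
# header points back at this site (example.com is a placeholder).
# The F flag returns 403 and ends processing for the request.
RewriteCond %{HTTP_REFERER} !^https?://example\.com/
RewriteRule ^(assets|components|api)/ - [F]

# Requests under these directories that got past the check above
# are left untouched.
RewriteRule ^(assets|components|api)/.+$ - [L]

# Everything else that is not an existing file or directory is
# handled by the front controller; the browser's URL is unchanged.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^ /index.php [L]
```

The referer check has to come before the pass-through rule, because the L flag on the pass-through would otherwise end the pass before the check is reached.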

Related

Unable to find the source of a rewrite

I'm trying to setup an apache server with a somewhat complex htaccess rules. I have found various solutions for individual rules, however, when I try to put the whole picture together I lack experience with htaccess to make it work and I experience some weird rewrites.
Rules:
1. The root contains index.php which launches a single page UI app. That means that URLs should be rewritten to /index.php
2. There are a few folders which should be exempt from Rule #1, e.g. if /admin is accessed, /admin/index.php should be shown, not /index.php.
3. Rule #2 also applies to all files on the server (so that the UI app can load resources like js/css/etc.)
4. All http://* requests should be redirected to https://*
Some possibly relevant details:
The server has a LetsEncrypt SSL certificate installed. The "redirect from http" was enabled during installation, I figured it might've created some weird rewrite rules, so I navigated to etc/apache2/sites-available/lamp-server.conf and commented out these three lines
#RewriteEngine on
#RewriteCond %{SERVER_NAME} =[DOMAIN]
#RewriteRule ^ https://%{SERVER_NAME}%{REQUEST_URI} [END,NE,R=permanent]
Here's the relevant part of /.htaccess. There are no other rewrites/redirects in this file.
RewriteEngine On
RewriteCond %{REQUEST_URI} !^/(wp|admin|phpmyadmin)(/.*)?$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule .* /index.php [QSA]
RewriteCond %{HTTPS} !on
RewriteRule .* https://%{HTTP_HOST}%{REQUEST_URI} [L,R]
When I clear the cache in browsers, it seems to work correctly for a while. But when I open the SPA part of the server (i. e., the first RewriteRule is triggered), the browser caches some sort of rewrite and then /admin also starts pointing to the SPA, unless I do a Ctrl+F5 each time.
I tried executing GET requests via Fiddler 4, everything seems to be working correctly there. Tried these scenarios:
HTTP root - got a 302 redirect.
HTTPS root - 200 OK.
/search SPA page - got 200 OK with rendered /index.php as response.
/admin page outside of SPA - got 200 OK with rendered /admin/index.php as response.
During the development of the htaccess rules there might've been a time when a 301 redirect was executed, which is now messing with my browser cache. However, then it's weird that even if I install a new fresh browser, it gets affected by the issue. Even if I try to access it from my mobile (which I never did before, so there's nothing cached), the same issue happens.

.htaccess rewrite returning Error 404

RewriteEngine on
RewriteCond %{QUERY_STRING} (^|&)public_url=([^&]+)($|&)
RewriteRule ^process\.php$ /api/%2/? [L,R=301]
Where domain.tld/app/process.php?public_url=abcd1234 is the actual location of the script.
But I am trying to get .htaccess to make the URL like this: domain.tld/app/api/acbd1234.
Essentially hides the process.php script and the get query ?public_url.
However the script above is returning error 404 not found.
I think this is what you are actually looking for:
RewriteEngine on
RewriteCond %{QUERY_STRING} (?:^|&)public_url=([^&]+)(?:$|&)
RewriteRule ^/?app/process\.php$ /app/api/%1 [R=301,QSD]
RewriteRule ^/?app/api/([^/]+)/?$ /app/process.php?public_url=$1 [END]
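If you receive an internal server error (HTTP status 500), check your HTTP server's error log file. Chances are you are running an old version of the Apache HTTP server: the [END] flag needs version 2.3.9 or later. On older versions you have to replace [END] with [L], but be aware that in a .htaccess file [L] only ends the current pass, so the rewritten request to process.php can match the redirect rule again and loop. In that case the redirect rule needs an additional guard, for example a condition on %{THE_REQUEST}.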
And a general hint: you should always prefer to place such rules inside the http servers (virtual) host configuration instead of using dynamic configuration files (.htaccess style files). Those files are notoriously error prone, hard to debug and they really slow down the server. They are only supported as a last option for situations where you do not have control over the host configuration (read: really cheap hosting service providers) or if you have an application that relies on writing its own rewrite rules (which is an obvious security nightmare).
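As a rough illustration of that hint, the same two rules could live in the virtual host definition instead of a .htaccess file. The ServerName and DocumentRoot values below are placeholders, not taken from the question; the rules themselves are unchanged, since the optional leading slash in `^/?` lets the patterns match in both contexts.

```apache
<VirtualHost *:80>
    # Placeholder values; substitute your own host name and path.
    ServerName domain.tld
    DocumentRoot /var/www/domain.tld

    RewriteEngine on

    # Old URL with a query string: redirect to the clean URL and
    # discard the query string (QSD).
    RewriteCond %{QUERY_STRING} (?:^|&)public_url=([^&]+)(?:$|&)
    RewriteRule ^/?app/process\.php$ /app/api/%1 [R=301,QSD]

    # Clean URL: map it internally back onto the real script.
    RewriteRule ^/?app/api/([^/]+)/?$ /app/process.php?public_url=$1 [END]
</VirtualHost>
```

Rules in the virtual host are read once at startup rather than on every request, and the server has to be reloaded after changing them.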
UPDATE:
Based on your many questions in the comments below (we see again how important it is to be precise in the question itself ;-) ) I add this variant implementing a different handling of path components:
RewriteEngine on
RewriteCond %{QUERY_STRING} (?:^|&)public_url=([^&]+)(?:$|&)
RewriteRule ^/?app/process\.php$ /api/%1 [R=301,QSD]
RewriteRule ^/?api/([^/]+)/?$ /app/process.php?public_url=$1 [END]
I am trying to get .htaccess to make the URL like this: example.com/app/api/acbd1234.
You don't do this in .htaccess. You change the URL in your application and then rewrite the new URL to the actual/old URL. (You only need to redirect this, if the old URLs have been indexed by search engines - but you need to watch for redirect loops.)
So, change the URL in your application to /app/api/acbd1234 and then rewrite this in .htaccess (which I assume is in your /app subdirectory). For example:
RewriteEngine On
# Rewrite new URL back to old
RewriteRule ^api/([^/]+)$ process.php?public_url=$1 [L]
You included a trailing slash in your earlier directive, but you omitted this in your example URL, so I've omitted it here also.
If you then need to also redirect the old URL for the sake of SEO, then you can implement a redirect before the internal rewrite:
RewriteEngine On
# Redirect old URL to new (if request by search engines or external links)
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{QUERY_STRING} (?:^|&)public_url=([^&]+)(?:$|&)
RewriteRule ^process\.php$ /app/api/%1? [R=302,L]
# Rewrite new URL back to old
RewriteRule ^api/([^/]+)$ process.php?public_url=$1 [L]
The check against REDIRECT_STATUS is to avoid a rewrite loop. ?: inside the parenthesised subpattern avoids the group being captured as a backreference.
Change the 302 (temporary) to 301 (permanent) only when you are sure it's working OK, to avoid erroneous redirects being cached by the browser.

How to setup request proxy using URL rewriting

I have an e-commerce site that resides in:
http://dev.gworks.mobi/
When a customer clicks on the signin link, the browser gets redirected to another domain, in order for authentication:
http://frock.gworks.mobi:8080/openam/XUI/#login/&goto=http%3A%2F%2Fdev.gworks.mobi%3A80%2Fcustomer%2Faccount%2Flogin%2Freferer%2FaHR0cDovL2Rldi5nd29ya3MubW9iaS8%2C%2F
I'm trying to rewrite http://dev.gworks.mobi/* to http://frock.gworks.mobi:8080/openam/*, without redirection.
I've tried this in the .htaccess of the dev.gworks.mobi site:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_URI} ^/openam(.*)$ [NC]
RewriteRule ^(.*)$ http://frock.gworks.mobi:8080/$1 [P,L]
</IfModule>
But when I access http://dev.gworks.mobi/openam, it shows a 404 page not found page.
Can anyone help me to achieve my use case?
Try this:
RewriteEngine on
RewriteBase /
# Make sure it's not an actual file being accessed
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Match the host
RewriteCond %{HTTP_HOST} ^dev\.gworks\.mobi
# Rewrite the request if it starts with "openam"
RewriteRule ^openam(.*)$ http://frock.gworks.mobi:8080/$1 [L,QSA]
Because the target is an absolute URL on a different host, this rule sends the browser an external redirect from dev.gworks.mobi/openam to frock.gworks.mobi:8080, so the address bar changes.
If you want to hide from the visitor that they are talking to the authentication app, you need to add the P flag, which proxies the request instead. Note that this requires Apache's mod_proxy (and mod_proxy_http) to be enabled:
RewriteRule ^openam(.*)$ http://frock.gworks.mobi:8080/$1 [P,L,QSA]
With P, the L flag is implied, so later rules are never applied to a proxied request whether or not you write L. See RewriteRule Flags for more information.
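If you have access to the server or virtual host configuration, the same proxying can be expressed with mod_proxy directives directly, without mod_rewrite. The fragment below is a sketch under that assumption; ProxyPass is not allowed in .htaccess files. Mapping /openam onto /openam on the backend is also an assumption, based on the login URL shown in the question.

```apache
# Requires mod_proxy and mod_proxy_http to be loaded.
# Goes in the server or <VirtualHost> configuration for dev.gworks.mobi.

# Forward everything under /openam to the authentication server.
ProxyPass        "/openam" "http://frock.gworks.mobi:8080/openam"

# Rewrite Location headers in the backend's responses so that
# redirects issued by the backend stay on dev.gworks.mobi.
ProxyPassReverse "/openam" "http://frock.gworks.mobi:8080/openam"
```

ProxyPassReverse only adjusts response headers; absolute links inside the returned HTML are not rewritten, which is one reason a hard-coded host or port in the application can still break this setup.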
The 404
If it's all in place and you're still getting a 404 error, make sure that the target URL is not throwing 404 errors in the first place.
Second, check if you're still getting the error with the correct referrer URI set. It might be designed in a way to throw a 404, if the referrer is not correctly set. If that's the case, which I suspect, you need to use the R flag and redirect instead of proxying the request.
Last thing that comes to my mind, some webapps are not built in a way to figure out the URI address. The host, as well as the port number, might be hard-coded somewhere in the config files. Make sure that the authentication app is able to be run from another URL without the need to edit the configs.
Test
You can test the RewriteRule with an online .htaccess tester.

How do I make a custom URL parser with Apache?

I heard this can be done with the web.config file. I want to make it so, for instance, my URL http://help.BHStudios.org/site might go to http://BHStudios.org/help.php?section=site, or http://i.BHStudios.org/u3Hiu might redirect to some other URL stored in a database with the hash u3Hiu as the key, or if something goes wrong and the internal file structure is exposed like http://Kyli.BHStudios.org/http/bhstudios/v2/self/index.php (something that happens with GoDaddy's servers for whatever reason) it'll change it to its intended URL http://Kyli.BHStudios.org before that's exposed to the user.
Since I've never done this before, could you please also explain why you gave the answer you did?
A few Apache mod_rewrite rules in either your server's httpd.conf or in a .htaccess file in your htdocs directory will do the majority of what you want, e.g.
RewriteEngine On
RewriteBase /
# If the browser request contains a .php, instruct the browser to remove it
# (external redirect, so the user sees the new URL).
RewriteCond %{THE_REQUEST} \.php [NC]
RewriteRule ^/?(.*)\.php$ http://%{HTTP_HOST}/$1 [R=301,NC,L]
# Specific rule: must come before the default rule, or it is never reached.
RewriteRule ^/?site/?$ /help.php?section=site [L]
# Default rule - for non physical objects (not a file or directory):
# Internally rewrite (user won't see the URL) to /index.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^ /index.php [L]
The masking of real file system objects will not be perfect, and is slightly pointless, as a user just needs to right click and view source on any served page to obtain the actual URLs.
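The question's first example depends on the host name as well as the path. A minimal sketch of that case follows, assuming help.BHStudios.org and BHStudios.org are served from the same document root so that help.php can be reached by an internal rewrite; the generic section capture is an illustration, not something stated in the question.

```apache
RewriteEngine On

# http://help.BHStudios.org/<section>  ->  /help.php?section=<section>
# Only applies when the request arrived on the help subdomain.
RewriteCond %{HTTP_HOST} ^help\.bhstudios\.org$ [NC]
# [^/.]+ excludes dots, so help.php itself never matches and the
# rule cannot loop on its own output.
RewriteRule ^/?([^/.]+)/?$ /help.php?section=$1 [L,QSA]
```

A rule like this needs to sit above any catch-all rule that sends unknown paths to /index.php. The hash lookup for i.BHStudios.org is different: it needs a database query, so it is better handled by rewriting to a PHP script that performs the lookup and issues the redirect.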

Codeigniter 2 and .htaccess - how to implement "down for maintenance" mode?

I know this question might have been asked a few times already, but I need a specific solution for CodeIgniter, using an .htaccess file that would send every request to an index_failsafe.php file instead of the normal index.php but ONLY if the url doesn't start with 'admin'. Example:
www.myhost.com/admin -> work as usual
www.myhost.com/welcome -> sent to failsafe page
in case 1:
RewriteRule ^(.*)$ index.php/$1 [L]
in case 2:
RewriteRule ^(.*)$ index_failsafe.php/$1 [L]
My rewrite conditions are:
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
Is it possible to do this?
Personally I do it by IP - so I can take my site offline, but I still have full access to it (to test new functions and make sure it is working before bringing back up)
RewriteEngine on
# For maintenance:
# If your IP address is 1.1.1.1, don't rewrite
RewriteCond %{REMOTE_ADDR} !^1\.1\.1\.1$
# If the person is requesting the maintenance page, also don't rewrite (prevents loops)
RewriteCond %{REQUEST_URI} !/maintenance\.html$
# Otherwise redirect all requests to the maintenance page
RewriteRule ^ /maintenance.html [R=302,L]
# do not rewrite links to the documentation, assets and public files
RewriteCond $1 !^(assets)
# do not rewrite for php files in the document root, robots.txt or the maintenance page
RewriteCond $1 !^([^\..]+\.php|robots\.txt|maintenance\.html)
# but rewrite everything else
RewriteRule ^(.*)$ index.php/$1 [L]
Just change the 1\.1\.1\.1 in the pattern to your current IP, e.g. 121\.65\.56\.65 (keep the dots escaped).
If you're not sure what your IP is, just google "what is my IP" and it will show up as the first hit.
But in terms of your specific question - this should work:
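# Skip anything under /admin, and the failsafe page itself (prevents a redirect loop)
RewriteCond %{REQUEST_URI} !^/admin
RewriteCond %{REQUEST_URI} !^/index_failsafe\.php
RewriteRule ^ /index_failsafe.php [R=302,L]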
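The question asked for an internal rewrite to index_failsafe.php rather than a redirect. A minimal sketch of that variant follows, assuming the .htaccess file sits in the document root next to index.php and index_failsafe.php, and that "starts with admin" means the first path segment is exactly admin.

```apache
RewriteEngine On
RewriteBase /

# Maintenance mode: every request that is not an existing file or
# directory and is not under /admin goes to the failsafe front controller.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !^/admin(/|$)
RewriteRule ^(.*)$ index_failsafe.php/$1 [L]

# Normal CodeIgniter routing for whatever is left (the admin URLs).
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php/$1 [L]
```

To leave maintenance mode, comment out the first block. Because the rewrite is internal, visitors keep the URL they asked for and no redirect gets cached by their browsers.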
edit:
If you use cookies to store session data for users then it might be simpler to change the cookie name to force everyone to log out, then change the login page controller to load a view that says "down for maintenance" or whatever.
When you're done just change the cookie name back to what it was and everyone will still be logged in, and make sure to change back the view that the login page controller loads so users can log in normally.
To change the session cookie for CI, open up config.php and change the value for:
$config['sess_cookie_name']
You can take it a step further by creating an alternate login controller and view titled "maintenance login" or something like that, and then you can still log in for testing.
This is the method that I use when I need to take my saas down for maintenance, and it works great. Our public facing sales page is unaffected, and I don't have to mess with htaccess.