URL Rewriting is Sending Requests for Non-Existent Pages to Main Page, Instead of 404 Page - apache

I'm trying to set up some URL rewriting using my .htaccess file. I'm a beginner, so bear with me! What I'm trying to do is reroute all requests from the root to the /main/ folder, since that's where I've got my CMS's files. For example, www.example.com/page should request www.example.com/main/page, but just show up in the browser as www.example.com/page. There are times, though, when I want to be able to access other folders, like when I post a dev or stage site at www.example.com/dev or www.example.com/stage. In these cases, we just can't have the request rerouted. Here's what I have so far:
RewriteRule ^(/)?$ /main/ [L]
to rewrite root requests to /main, and:
RewriteCond %{REQUEST_URI} !-f
RewriteCond %{REQUEST_URI} !-d
RewriteRule ^(.*)$ /main/$1 [L,QSA]
for all the other requests. It works fine, most of the time, but the problem is that if I type in www.example.com/something_that_doesnt_exist, it shows the main page (i.e. /main/), and keeps the URL in the address bar. If I take that same URI and add /main/, so in this case www.example.com/main/something_that_doesnt_exist it goes to the 404 page, which is the desirable result. How can I fix this so that www.example.com/something_that_doesnt_exist goes to the 404 page?
And, for bonus points, how do I get www.example.com/main to actually forward to www.example.com, so that if someone types it in, it either goes to www.example.com (changing the URL in the address bar) or returns a 404?
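One way this could be approached, as a rough sketch rather than a tested fix: leave the folders that must stay reachable alone, send everything else to /main/ unconditionally so the CMS there decides what is a 404, and redirect direct requests for /main/... back to the root. It assumes the .htaccess sits in the document root, that dev and stage are the only exempt folders, and that whatever currently produces the 404 for www.example.com/main/something_that_doesnt_exist keeps doing so.

```apache
RewriteEngine On

# Bonus: if the browser itself asked for /main or /main/..., redirect to the
# same path without the prefix. THE_REQUEST holds the original request line,
# so the internal rewrite further down does not re-trigger this rule.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/main(/|\?|\s)
RewriteRule ^main(?:/(.*))?$ /$1 [R=302,L]

# Leave /dev/, /stage/ and already-rewritten /main/ requests untouched.
RewriteRule ^(main|dev|stage)(/|$) - [L]

# Everything else, including the bare root, is served from /main/.
RewriteRule ^(.*)$ /main/$1 [L]
```

Note that -f and -d test filesystem paths, so they are normally paired with %{REQUEST_FILENAME} rather than %{REQUEST_URI}. Also, if /main/ has its own .htaccess with rewrite rules, those replace the root rules for requests under /main/, so the redirect block may need to live there instead.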

Related

Unable to find the source of a rewrite

I'm trying to set up an Apache server with somewhat complex htaccess rules. I have found various solutions for individual rules, but when I try to put the whole picture together, I lack the htaccess experience to make it work, and I get some weird rewrites.
Rules:
1. The root contains index.php, which launches a single-page UI app. That means that URLs should be rewritten to /index.php.
2. There are a few folders which should be exempt from Rule #1, e.g. if /admin is accessed, /admin/index.php should be shown, not /index.php.
3. Rule #2 also applies to all files on the server (so that the UI app can load resources like js/css/etc.).
4. All http://* requests should be redirected to https://*.
Some possibly relevant details:
The server has a LetsEncrypt SSL certificate installed. The "redirect from http" was enabled during installation, I figured it might've created some weird rewrite rules, so I navigated to etc/apache2/sites-available/lamp-server.conf and commented out these three lines
#RewriteEngine on
#RewriteCond %{SERVER_NAME} =[DOMAIN]
#RewriteRule ^ https://%{SERVER_NAME}%{REQUEST_URI} [END,NE,R=permanent]
Here's the relevant part of /.htaccess. There are no other rewrites/redirects in this file.
RewriteEngine On
RewriteCond %{REQUEST_URI} !^/(wp|admin|phpmyadmin)(/.*)?$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule .* /index.php [QSA]
RewriteCond %{HTTPS} !on
RewriteRule .* https://%{HTTP_HOST}%{REQUEST_URI} [L,R]
When I clear the cache in browsers, it seems to work correctly for a while. But when I open the SPA part of the server (i. e., the first RewriteRule is triggered), the browser caches some sort of rewrite and then /admin also starts pointing to the SPA, unless I do a Ctrl+F5 each time.
I tried executing GET requests via Fiddler 4, everything seems to be working correctly there. Tried these scenarios:
HTTP root - got a 302 redirect.
HTTPS root - 200 OK.
/search SPA page - got 200 OK with rendered /index.php as response.
/admin page outside of SPA - got 200 OK with rendered /admin/index.php as response.
During the development of the htaccess rules there might've been a time when a 301 redirect was executed, which is now messing with my browser cache. However, then it's weird that even if I install a new fresh browser, it gets affected by the issue. Even if I try to access it from my mobile (which I never did before, so there's nothing cached), the same issue happens.
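For comparison, here is a sketch of how the four rules above are often ordered. It is not a diagnosis of the caching behaviour described, only an arrangement under the assumption that this .htaccess sits in the document root and that wp, admin and phpmyadmin are the only exempt folders. The HTTPS redirect runs first, and both rules carry L so that a matched request does not fall through to later rules.

```apache
RewriteEngine On

# 1. Redirect plain HTTP to HTTPS before any internal rewrite runs.
#    R=302 while testing; browsers cache 301 responses aggressively.
RewriteCond %{HTTPS} !=on
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=302,L]

# 2. Skip the exempt folders and any real file or directory.
RewriteCond %{REQUEST_URI} !^/(wp|admin|phpmyadmin)(/|$)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# 3. Everything else is handled by the SPA entry point.
RewriteRule ^ /index.php [L]
```

The existing query string is passed along automatically because the substitution does not define a new one, so QSA is not needed here.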

How to setup request proxy using URL rewriting

I have an e-commerce site that resides in:
http://dev.gworks.mobi/
When a customer clicks on the signin link, the browser gets redirected to another domain, in order for authentication:
http://frock.gworks.mobi:8080/openam/XUI/#login/&goto=http%3A%2F%2Fdev.gworks.mobi%3A80%2Fcustomer%2Faccount%2Flogin%2Freferer%2FaHR0cDovL2Rldi5nd29ya3MubW9iaS8%2C%2F
I'm trying to rewrite http://dev.gworks.mobi/* to http://frock.gworks.mobi:8080/openam/*, without redirection.
I've tried this in the .htaccess of the dev.gworks.mobi site:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_URI} ^/openam(.*)$ [NC]
RewriteRule ^(.*)$ http://frock.gworks.mobi:8080/$1 [P,L]
</IfModule>
But when I access http://dev.gworks.mobi/openam, I get a 404 Not Found page.
Can anyone help me to achieve my use case?
Try this:
RewriteEngine on
RewriteBase /
# Make sure it's not an actual file being accessed
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Match the host
RewriteCond %{HTTP_HOST} ^dev\.gworks\.mobi
# Rewrite the request if it starts with "openam"
RewriteRule ^openam(.*)$ http://frock.gworks.mobi:8080/$1 [L,QSA]
This will send all requests for dev.gworks.mobi/openam to frock.gworks.mobi:8080. Because the target is an absolute URL on another host, Apache issues an external redirect, so the visitor sees the new address.
If you want to mask the URL so that the visitor can't see that she's on the authentication app, you need to add the P flag. Please note that this needs Apache's mod_proxy module (and mod_proxy_http) to be enabled:
RewriteRule ^openam(.*)$ http://frock.gworks.mobi:8080/$1 [P,L,QSA]
Feel free to drop the L flag, if it's not the last rewrite rule. See RewriteRule Flags for more information.
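Note that ^openam(.*)$ drops the openam prefix from the target. If the backend expects that prefix to be kept, as the target URL in the question (http://frock.gworks.mobi:8080/openam/*) suggests, a variant along these lines may be closer. This is a sketch under that assumption, not a tested configuration:

```apache
RewriteEngine On

# Assumption: the authentication app is served under /openam on the backend,
# so the prefix is kept in the proxied URL.
# $1 is either empty or the rest of the path starting with "/".
RewriteRule ^openam(/.*)?$ http://frock.gworks.mobi:8080/openam$1 [P,L]
```

The query string is forwarded unchanged because the substitution does not define one of its own.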
The 404
If it's all in place and you're still getting a 404 error, make sure that the target URL is not throwing 404 errors in the first place.
Second, check if you're still getting the error with the correct referrer URI set. It might be designed in a way to throw a 404, if the referrer is not correctly set. If that's the case, which I suspect, you need to use the R flag and redirect instead of proxying the request.
Lastly, some web apps cannot work out their own public address: the host, as well as the port number, might be hard-coded somewhere in the config files. Make sure that the authentication app can run behind another URL without the need to edit the configs.

If specific URL not found - 301 to root

Basically I'm trying to edit my .htaccess file to do the following:
I want to 301 a URL to another URL but ONLY if that URL is not found (404).
Thus the following will not suffice, seeing as it will redirect the URL regardless of whether or not the URL was found.
Redirect 301 /oldpage.html http://www.example.com/newpage.html
Is this possible to do through .htaccess?
And yes, I know this might be an odd request but I have my reasons for needing this.
Try this mod_rewrite rule in your root .htaccess:
RewriteEngine On
# If the request is not for a valid directory
RewriteCond %{REQUEST_FILENAME} !-d
# If the request is not for a valid file
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^oldpage\.html$ http://www.example.com/newpage.html [L,NC,R=302]
You really don't want to 301 one URL to another. Browsers cache permanent redirects, so a mistake is very hard to undo once visitors have seen it.
"I want to 301...if that URL is not found (404)."
Again, that's a really bad idea. Even if your deployment and testing processes are perfect, you need to be able to see and respond to the requests coming in.
Use an error document that displays a message and, after a delay, uses a meta redirect to bounce the user back to the home page.
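A minimal sketch of the error-document approach, assuming a hypothetical static page at /not-found.html in the document root. The page itself would carry the message and the delayed meta refresh:

```apache
# /not-found.html is a hypothetical page name, not taken from the question.
# A local path keeps the 404 status code; a full URL here would make Apache
# send a redirect to the client instead.
ErrorDocument 404 /not-found.html
```

Because the response still carries a 404 status, the missing URLs keep showing up as errors in the access log, which is the visibility this answer is arguing for.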

How to rewrite all 404's to index.php using cPanel URLs?

I'm trying to rewrite all 404's to index.php where I use PHP's parse_url() to determine which file to include (e.g. about-us.php, contact-us.php) and I'm getting some really weird results.
I'm working on a 'dev' URL automatically created by cPanel:
http://xxx.xxx.xxx.xxx/~mySite/
Current Method
My .htaccess file contains the following:
RewriteEngine on
RewriteBase /~mySite
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([a-zA-Z0-9-_.]+)$ index.php
And the results are a mixed bag:
xxx.xxx.xxx.xxx/~mySite/contact-us renders just fine.
xxx.xxx.xxx.xxx/~mySite/contact-us/phone throws a 404 that isn't caught by mod_rewrite.
xxx.xxx.xxx.xxx/~mySite/about-us rewrites to the server root (xxx.xxx.xxx.xxx/index.php).
Previously Tried
ErrorDocument 404 /~mySite/index.php
And I get similar results, except for the following:
xxx.xxx.xxx.xxx/~mySite/contact-us/phone rewrites to index.php but all my CSS and JS includes are off because they're trying to load relative to /~mySite/contact-us instead of /~mySite.
Any help? I'm going out of my mind. Especially the fact that contact-us works fine, but about-us doesn't?
First, don't use the cPanel preview. It is not a good way to view your dev site: there is no telling how it will affect the rules, and control panels do weird things anyway.
Preview your site using your real domain name. You can do that by modifying the hosts file on your computer so that only you can reach the site by its domain name. This little guide will show you how to edit it. It takes about 2 minutes.
That should make it easier to check things.
The error document most likely doesn't work because ~mySite is probably not your document root; that is typically how cPanel builds its preview links. So your real error document should probably be as Marc B stated:
ErrorDocument 404 /index.php
If you want a mod_rewrite solution, this should also work. But I wouldn't use both.
RewriteEngine on
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ / [R=301,L]
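If the goal is for index.php to see the original path (so that parse_url() can pick the include) rather than to send the visitor to the home page, an internal rewrite is an alternative. A sketch, assuming the site is served from the document root of its real domain:

```apache
RewriteEngine On
RewriteBase /

# Skip real files and directories (CSS, JS, images).
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# "^" matches any path, including ones with slashes such as contact-us/phone.
# There is no R flag, so the address bar keeps the requested URL.
RewriteRule ^ index.php [L]
```

The pattern in the question, ^([a-zA-Z0-9-_.]+)$, cannot match a slash, which would explain why contact-us/phone was never caught. For nested paths like that one, CSS and JS links in the page need to be root-relative (starting with /) so they do not resolve against /contact-us/.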
On a side note, I think it's usually good to use a 404 page so that users know the requested page does not exist; otherwise some people will think something is wrong with your site because it keeps redirecting to the home page. Facebook, for example, shows a big thumbs up with a bandage on it and says the page is not available. I've seen many custom 404 pages that were pretty clever, so maybe it's time to get creative.

Simple mod_rewrite for search querystring

I am trying to rewrite
/search?keyword=foobar
to
/search/foobar
without much success.
I currently have the following which seem to produce a 404:
RewriteCond %{QUERY_STRING} ^keyword=(.*)$ [NC]
RewriteRule .* /search/%1? [L,R=301]
Unless you have a resource at /search/foobar, of course you're going to get a 404. Two entirely different things are happening here. The server has a physical resource that gets served (or a script that runs) that Apache knows about. If Apache sees /search/foobar, it is going to look for a directory called "search" and either a directory or a file called "foobar". If it sees neither, it's going to return a 404. The other part of what's happening is that the browser, completely separate from Apache, sees a URL (e.g. /search/foobar) and does what it needs to do to request the resource: it talks to the web server and asks for /search/foobar.
When the request comes in, it's up to the URL-to-file processing pipeline to turn the URL into a path that points to where the resource is. If mod_rewrite takes the URL and rewrites it to /blah/blah/blah, there had better be a directory called /blah/blah and a file in there called blah, or else it's going to 404.
Your rules say: if an incoming request is for anything with the query string ?keyword=(something), redirect the browser to /search/(something). The browser sees this and does what it's supposed to do; it sends another request for /search/(something). Apache sees this new request, has nothing that maps to it, and returns a 404.
What you probably want is to first handle the /search/(something) URIs:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^/?search/(.*)$ /search?keyword=$1 [L,QSA]
So when a request comes in as /search/foobar, the rewrite engine internally rewrites it to something apache can understand, /search?keyword=foobar. This internal rewrite happens entirely on the server, the browser is ignorant of it.
Now, when a form is submitted as a GET method, you'll end up with a ?keyword=(something) in the URL, and it looks like you're trying to get rid of that. So apache gets the query string, and there must be something to redirect the browser to the nicer looking URL, at which point the browser does its thing, submits a brand new request, which gets internally rewritten by the above rule back to what it should be.
RewriteCond %{THE_REQUEST} \?keyword=([^\ &]+)&?([^\ ]*)
RewriteRule ^ /search/%1?%2 [L,R=301]
I sorted it out with the following:
RewriteRule search/(.*)$ /search?keyword=$1 [L]
RewriteCond %{THE_REQUEST} \?keyword=([^\ &]+)&?([^\ ]*)
RewriteRule ^ /search/%1?%2 [L,R=301]
but not quite: I'm still having issues when there are multiple query-string parameters, or with other URLs that contain search/, e.g. /search/css/foobar.css?version=152
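A tighter variant might avoid those collisions. This is only a sketch, assuming that something at /search already handles ?keyword=..., that keywords never contain a slash, and that keyword is the first query parameter (the same limitation as the rules above):

```apache
RewriteEngine On

# External redirect, only when the browser itself asked for /search?keyword=...
# THE_REQUEST is the original request line, so the internal rewrite below
# cannot re-trigger this rule.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/search\?keyword=([^&\s]+)&?(\S*)
RewriteRule ^search/?$ /search/%1?%2 [R=301,L]

# Internal rewrite for exactly one path segment after /search/, and only when
# the request does not map to a real file or directory.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^search/([^/]+)/?$ /search?keyword=$1 [L,QSA]
```

With the pattern anchored to a single segment, /search/css/foobar.css?version=152 no longer matches, and QSA keeps any extra parameters that came along with the pretty URL.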