Apache mod_rewrite for SEO urls, but rewrite to 404 if file doesn't exist - apache

I'm unable to find the answer for this, so please let me know if it's been resolved before.
I'm using mod_rewrite to do "pretty" URLs, but if you request a file that doesn't exist (a typo, for example), it redirects, appends .php a bunch of times, and then fails. The code I have is below:
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ http://inquisito.rs/$1/ [R=301,l]
RewriteRule ^(.*)/$ /$1.php [L]
So if you go to http://inquisito.rs/aion/ it will show you the aion page, but if you go to, let's say, inquisito.rs/aio/ by accident, it gives this:
http://inquisito.rs/aio.php.php.php.php.php.php.php.php.php.php.php.php.php.php.php.php.php.php.php.php/
Thanks in advance, I can't tell you how many times I've used information from here to resolve issues at work and at home.

Using the example you've given, this is how the rules are applied:
RewriteEngine On
RewriteBase /
# /aio/ is not a file, so this condition matches
RewriteCond %{REQUEST_FILENAME} !-f
# This condition does NOT match, because the request has a trailing slash
RewriteCond %{REQUEST_URI} !(.*)/$
# So this rule doesn't run: the condition above wasn't met
RewriteRule ^(.*)$ http://inquisito.rs/$1/ [R=301,L]
# This rule is separate from the RewriteConds above, so it has no conditions;
# it matches because the request has a trailing slash
RewriteRule ^(.*)/$ /$1.php [L]
Try this (untested) set of rules instead:
RewriteEngine On
RewriteBase /
# Make sure no matching file exists
RewriteCond %{REQUEST_FILENAME} !-f
# Don't match requests that already end in .php
RewriteCond %{REQUEST_URI} !\.php$
# Check for a missing trailing slash
RewriteCond %{REQUEST_URI} !(.*)/$
# Redirect to the same URL with a trailing slash
RewriteRule ^(.*)$ http://inquisito.rs/$1/ [R=301,L]
# Separate rule: don't match requests that already end in .php
RewriteCond %{REQUEST_URI} !\.php$
# Internal rewrite to the matching PHP file
RewriteRule ^(.*)/$ /$1.php [L]
It's important to note that in a .htaccess file, whenever a RewriteRule rewrites the URL, the rewritten URL is run through the rules again from the top; the L flag only ends the current pass.
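Because a rewritten URL is evaluated against the rules again, another way to avoid the loop is to rewrite only when the target PHP file actually exists. The following is a minimal, untested sketch. It assumes the .htaccess file sits in the document root, and it redirects to a path on the same host rather than to the absolute URL used in the question.

```apache
RewriteEngine On
RewriteBase /

# Add a trailing slash to anything that is not an existing file
# and does not already end in a slash or in .php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !\.php$
RewriteCond %{REQUEST_URI} !/$
RewriteRule ^(.*)$ /$1/ [R=301,L]

# Map /name/ to /name.php only if name.php exists (assumes the
# .htaccess file is in the document root); otherwise no rule
# matches and Apache answers with its normal 404
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^(.*)/$ /$1.php [L]
```

With these rules a request for /aio/ matches neither rule, because aio.php does not exist, so it ends in a plain 404. On Apache 2.4 or later, using [END] instead of [L] on the last rule should also stop the rewritten URL from being processed again.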

Related

.htaccess language based RewriteRule like example.com/en/

I'm trying to write redirect directives in the .htaccess to forward internally all user requests like this:
Every request in a language folder should redirect to the requested file with the language query string:
example.com/en/contact.php -> example.com/contact.php?lang=en
Redirect any request without language path to a default language folder like this:
example.com -> example.com/en
Remove trailing slash if the address is entered with it:
example.com/en/ -> example.com/en
For the folder projects, every request should lead to the view-project.php file with the respective query strings:
example.com/en/projects/test -> example.com/view-project.php?lang=en&path=test
Here is my attempt, but it's not working without trailing slash on a request like: http://www.example.com/de and is not redirecting http://www.example.com to a default language folder.
RewriteEngine On
RewriteRule ^(en|de)/(.*)$ $2?lang=$1 [L,QSA,NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^projects/([^/\.]+)/?$ view-project.php?path=$1 [QSA,L]
How can I achieve this?
This is possibly a duplicate, and I apologize for that. I searched everywhere and read about 100 posts, but I didn't find what I'm looking for.
After struggling for a while, and with someone else's help, here is the .htaccess file that works for me:
RewriteBase /example.com/
DirectorySlash Off
RewriteEngine On
RewriteOptions AllowNoSlash
RewriteRule ^$ de [R=301,L]
RewriteRule ^(de|en)/$ $1 [R=301,L]
RewriteRule ^(de|en)$ index.php?lang=$1 [L]
RewriteRule ^(de|en)/projects/(.+) view-project.php?lang=$1&path=$2 [L,QSA]
RewriteRule ^(de|en)/(.+) $2?lang=$1 [L,QSA]
Try:
RewriteEngine On
RewriteBase /mysite.com/
RewriteCond %{HTTP:Accept-Language} (en|de) [NC]
RewriteRule ^ /%1/index.php [L]
RewriteCond %{HTTP:Accept-Language} (en|de)
RewriteCond %{DOCUMENT_ROOT}%1%{REQUEST_URI} !-f
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-f
RewriteRule ^(en|de)/(.+)$ $2?lang=$1 [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^projects/([^/\.]+)/?$ view-project.php?path=$1 [QSA,L]
Edit:
The above was written before the question was edited to add the detailed explanation. With that explanation in mind, I came up with the following, which is very similar to the solution the author arrived at independently:
# Disable mod_dir's trailing-slash redirect
DirectorySlash Off
RewriteEngine On
# Rules are relative to /mysite.com/
RewriteBase /mysite.com/
# With DirectorySlash off, let the rules apply to directory URLs that lack a
# trailing slash (mod_rewrite ignores them by default)
RewriteOptions AllowNoSlash
# rule 4
RewriteRule ^(en|de)\/projects\/(.+)$ view-project.php?lang=$1&path=$2 [QSA,L]
# rule 1
RewriteRule ^(en|de)\/(.*)$ $2?lang=$1 [QSA,L]
# rule 2
RewriteRule ^$ en [R=302,L]
# rule 3
RewriteRule ^(en|de)\/$ $1 [R=302,L]

.htaccess - remove everything after third slash in path

On my website, I only use 3 slashes in my URL path:
https://example.com/this/isatest/
Right now I use a .htaccess file which, as a side effect, makes it possible to append as much extra text to the URL as you like:
https://example.com/this/isatest/hipperdihopperdus/pizza/bacon/with/cheese
I'd like to automatically remove everything after "isatest" while keeping the trailing slash using .htaccess.
This is what my .htaccess currently looks like:
Options -Indexes
Options +FollowSymLinks
RewriteEngine on
# 301 Redirect all requests that don't contain a dot or trailing slash to
# include a trailing slash
RewriteCond %{REQUEST_URI} !/$
RewriteCond %{REQUEST_URI} !\.
RewriteRule ^(.*) %{REQUEST_URI}/ [R=301,L]
RewriteCond %{THE_REQUEST} /index\.html [NC]
RewriteRule ^index\.html$ /? [R=301,L,NC]
RewriteRule ^listen/$ /console/ [NC,L]
# Rewrites urls in the form of /parent/child/
# but only rewrites if the requested URL is not a file or directory
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)$ index.php?page=$1 [L,QSA]
How can I achieve this?
As your first rule, after the RewriteEngine directive, you can do something like the following:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^/]+/[^/]+/). /$1 [R=302,L]
This checks whether anything else (the dot) follows two path segments and a slash, and if so redirects to the URL with that "anything else" removed.
Note that this is a 302 (temporary) redirect. Only change this to a 301 (permanent) redirect - if that is the intention - once you have confirmed that it works OK. This is to avoid the browser caching erroneous redirects whilst testing.
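As a rough sketch of the placement described above, assuming the .htaccess file from the question sits in the document root: the new rule goes immediately after RewriteEngine, ahead of the trailing-slash redirect. The comments trace two of the question's example URLs through the pattern.

```apache
Options -Indexes
Options +FollowSymLinks
RewriteEngine on

# New rule, placed first. In a .htaccess file the pattern is matched
# against the path without its leading slash:
#   this/isatest/                         -> no match, nothing follows the second slash
#   this/isatest/hipperdihopperdus/pizza  -> match, $1 is "this/isatest/"
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^/]+/[^/]+/). /$1 [R=302,L]

# ...the existing rules from the question follow unchanged...
```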
UPDATE: It may be more efficient to simply avoid redirecting files that end in a recognised file extension. Or perhaps exclude known directory location(s) of your static resources. For example:
RewriteCond %{REQUEST_URI} !\.(css|js|jpg|png|gif)$ [NC]
RewriteRule ^([^/]+/[^/]+/). /$1 [R=302,L]
OR,
RewriteCond %{REQUEST_URI} !^/static-resources/
RewriteRule ^([^/]+/[^/]+/). /$1 [R=302,L]
You can add this rule just below the RewriteEngine On line:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+/[^/]+/).+$ /$1 [R=301,L,NE]

Forcing trailing slash in .htaccess causing 404 error

I'm trying to find the correct .htaccess config to force a trailing slash after every URL, but it's causing 404's in many instances.
I have the following directory structure:
articles
    post.html
portfolio
    lorum1.html
    lorum2.html
    lorum3.html
contact.html
Example 1
So if I go to:-
myurl.com/articles/post.html or
myurl.com/articles/post or
myurl.com/articles/post/
I need these to all redirect to the .html, but with the url showing:
myurl.com/articles/post/
Example 2
So if I go to myurl.com/contact/, it needs to display the content of myurl.com/contact.html, whilst still maintaining the myurl.com/contact/ url.
What's currently happening
Here's an example using the contact path. I get a 200 response if I go to myurl.com/contact and myurl.com/contact.html, but a 404 if I go to myurl.com/contact/.
Here's what I have so far.
<IfModule mod_rewrite.c>
RewriteEngine on
# Remove .html
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.html -f
RewriteRule ^(.*)$ $1.html
# Force trailing slash
RewriteCond %{REQUEST_URI} /+[^\.]+$
RewriteRule ^(.+[^/])$ %{REQUEST_URI}/ [R=301,L]
</IfModule>
I'm struggling to make sense of this problem, so thanks in advance to anybody who can help!
Your example is very close to meeting your requirements. The problem is that the first RewriteRule matches ^(.*)$, which in regex terms matches literally anything.
The fix is to add the missing forward slash to the regex before the end-of-string anchor $, so that the pattern only matches paths that end with a forward slash: ^(.*)/$
Fixed:
<IfModule mod_rewrite.c>
RewriteEngine on
# Remove .html
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.html -f
RewriteRule ^(.*)/$ $1.html
# Force trailing slash
RewriteCond %{REQUEST_URI} /+[^\.]+$
RewriteRule ^(.+[^/])$ %{REQUEST_URI}/ [R=301,L]
</IfModule>
However, while trying to understand Apache's syntax, I think I found a much simpler way to achieve the same result:
RewriteEngine on
# Rewrite requests ending in a slash to show the html file;
# [L] ends this pass so the redirect rule below is not applied to the rewritten path
RewriteRule ^(.+)/$ $1.html [L]
# If the request uri doesn't end in .html...
RewriteCond %{REQUEST_URI} !\.(html?)$
# ...rewrite requests that don't end in a slash so that they do end in a slash
RewriteRule ^(.*)([^/])$ /$1$2/ [R=303]
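Neither variant above redirects a direct request for post.html to post/, which the question's first example asks for. A hedged sketch of one way to cover all three cases follows. It is untested, assumes the .htaccess file is in the document root, and uses 302 so that a mistake is not cached by the browser while testing.

```apache
RewriteEngine On

# 1. A direct request for /path/name.html is redirected to /path/name/.
#    THE_REQUEST holds the original request line, so this does not fire
#    again after rule 3 has rewritten the URL internally.
RewriteCond %{THE_REQUEST} \s/+(.+)\.html[?\s] [NC]
RewriteRule ^ /%1/ [R=302,L]

# 2. /path/name (no trailing slash) is redirected to /path/name/,
#    but only when name.html exists
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{DOCUMENT_ROOT}/$1.html -f
RewriteRule ^(.+[^/])$ /$1/ [R=302,L]

# 3. /path/name/ is served from name.html without changing the visible URL
RewriteCond %{DOCUMENT_ROOT}/$1.html -f
RewriteRule ^(.+)/$ /$1.html [L]
```

With the directory layout from the question, myurl.com/articles/post.html and myurl.com/articles/post should both end up at myurl.com/articles/post/, which then displays post.html.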

Redirect to index.php in root

I've written the following code in my htaccess file:
RewriteRule ^(.*)/$ $1
RewriteCond %{REQUEST_URI} !^index.php.*$
RewriteRule ^(.*)$ /index.php?route=$1 [END]
It works perfectly for every path except directories that exist. For example, if I enter http://localhost/profilepic and such a directory actually exists, it redirects to http://localhost/profilepic/?route=profilepic, but I want it to be implicitly converted to http://localhost/index.php?route=profilepic.
Thanks in advance.
The reason this is happening is mod_dir and its DirectorySlash directive. Essentially, if it sees a URI without a trailing slash that maps to an existing directory, it redirects the request so that it has the trailing slash. Since mod_dir and mod_rewrite run at different points in the URL-to-file processing pipeline, both get applied to the same URL. That's why you end up with a redirect and a weird URL with the query string.
If you absolutely must have directories without trailing slashes, then you need to turn off DirectorySlash. The problem with turning it off is an information disclosure concern: people can then view the contents of a directory even if it has an index file. That means you have to make up for mod_dir using mod_rewrite.
So get rid of the rule:
RewriteRule ^(.*)/$ $1
and replace it with these rules:
DirectorySlash Off
# redirect direct requests that end with a slash to remove the slash.
RewriteCond %{THE_REQUEST} \ /+[^\?\ ]+/($|\ |\?)
RewriteRule ^(.*)/$ /$1 [L,R]
# internally add the trailing slash for directories
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^(.*[^/])$ /$1/ [L]
Here is another way you can have your rules without turning off DirectorySlash (turning it off is considered a security hole):
RewriteEngine On
# remove trailing slash for non-directories
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{THE_REQUEST} \s(.+?)/+[?\s]
RewriteRule ^(.+?)/$ /$1 [R=301,L]
# routing for directories
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^(.+?)/$ /index.php?route=$1 [L]
# routing for non directories
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+?)/?$ /index.php?route=$1 [L]

How do I force the www subdomain on both https and http

For whatever reason I can't seem to get this right, even though I've looked at many examples on here and on Apache's website. I'm trying to force www.domain.com instead of domain.com on EITHER http or https, but I am not trying to force https over http.
The following code seems to work for all https connections, but http will not redirect to www.
RewriteEngine On
RewriteCond %{HTTPS} on
RewriteCond %{HTTP_HOST} !^www\.domain\.com$ [NC]
RewriteRule ^ https://www.domain.com%{REQUEST_URI} [R=301]
RewriteEngine On
RewriteCond %{HTTPS} off
RewriteCond %{HTTP_HOST} !^www\.domain\.com$ [NC]
RewriteRule ^ http://www.domain.com%{REQUEST_URI} [R=301]
You don't need the second RewriteEngine directive. That may or may not be causing a parse issue making the second set of rules not work. To test whether this is the case, try switching the order of the two blocks you have.
It's good practice to add the L flag to any rule that should be the last one applied to a request. So, change [R=301] to [R=301,L] both times it appears.
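Applied to the rules from the question, those two suggestions would look roughly like this. It is only a sketch of the same logic with a single RewriteEngine directive and the L flag added, and it keeps the placeholder host name www.domain.com from the question.

```apache
RewriteEngine On

# HTTPS requests for any host other than www.domain.com
RewriteCond %{HTTPS} on
RewriteCond %{HTTP_HOST} !^www\.domain\.com$ [NC]
RewriteRule ^ https://www.domain.com%{REQUEST_URI} [R=301,L]

# Plain HTTP requests for any host other than www.domain.com
RewriteCond %{HTTPS} off
RewriteCond %{HTTP_HOST} !^www\.domain\.com$ [NC]
RewriteRule ^ http://www.domain.com%{REQUEST_URI} [R=301,L]
```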
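Largely as a matter of style, I would consider changing the RewriteRule directives to something like the following (using http or https as appropriate; in a .htaccess file the captured path has no leading slash, so the slash is written explicitly):
RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L,QSA]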
Your rules seem to be fine. You can combine them as follows:
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteCond %{HTTPS}s on(s)|
RewriteRule ^ http%1://www.example.com%{REQUEST_URI} [L,R=301]
Also note the additional L flag to stop the rewriting process after this rule has been applied.
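The second condition is compact enough to be easy to misread, so here is the same rule set with the matching spelled out in comments. This is an annotated sketch rather than a different technique. The only change is the [NC] flag on the host check, an addition that makes the comparison case-insensitive, as in the question.

```apache
# %{HTTPS} expands to "on" or "off", so %{HTTPS}s is "ons" or "offs".
#   "ons"  matches on(s)                       -> %1 is "s"
#   "offs" matches the empty alternative after | -> %1 is empty
# %1 refers to the capture group of the last matched RewriteCond.
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
RewriteCond %{HTTPS}s on(s)|
RewriteRule ^ http%1://www.example.com%{REQUEST_URI} [L,R=301]
```

The redirect target therefore becomes https://www.example.com/... for HTTPS requests and http://www.example.com/... otherwise, so the scheme is preserved.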
In case anyone still needs an answer to this: use a different .htaccess. I found this guide, and it looks good: http://www.farinspace.com/codeigniter-htaccess-file/
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
### Canonicalize codeigniter URLs
# If your default controller is something other than
# "welcome" you should probably change this
RewriteRule ^(welcome(/index)?|index(\.php)?)/?$ / [L,R=301]
RewriteRule ^(.*)/index/?$ $1 [L,R=301]
# Removes trailing slashes (prevents SEO duplicate content issues)
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)/$ $1 [L,R=301]
# Enforce www
# If you have subdomains, you can add them to
# the list using the "|" (OR) regex operator
RewriteCond %{HTTP_HOST} !^(www|subdomain) [NC]
RewriteRule ^(.*)$ http://www.domain.tld/$1 [L,R=301]
# Enforce NO www
#RewriteCond %{HTTP_HOST} ^www [NC]
#RewriteRule ^(.*)$ http://domain.tld/$1 [L,R=301]
###
# Removes access to the system folder by users.
# Additionally this will allow you to create a System.php controller,
# previously this would not have been possible.
# 'system' can be replaced if you have renamed your system folder.
RewriteCond %{REQUEST_URI} ^/system.*
RewriteRule ^(.*)$ /index.php/$1 [L]
# Checks to see if the user is attempting to access a valid file,
# such as an image or css document, if this isn't true it sends the
# request to index.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php/$1 [L]
</IfModule>
<IfModule !mod_rewrite.c>
# Without mod_rewrite, route 404's to the front controller
ErrorDocument 404 /index.php
</IfModule>
Remember, once you have your CodeIgniter .htaccess file set up, you will want to go into your “/system/application/config/config.php” and find the following:
$config['index_page'] = "index.php";
and change it to:
$config['index_page'] = "";