Apache2 mod_rewrite difficulty with GET variables - apache

On the website.conf file I have:
<VirtualHost *:80>
DocumentRoot /srv/http/website/cgi-bin
ServerName website
ServerAlias www.website
RewriteEngine on
RewriteRule ^$ ""
RewriteRule ^([a-z]+)$ /?tab=repo
...
My goal is to have http://localhost/ redirect to localhost and http://localhost/word redirect to http://localhost/?tab=word.
With the current directives I get a 404 error, because it's trying to open the file repo in the DocumentRoot. All I need is to rewrite the URL so that the word becomes a GET variable.
A directive like the following works:
RewriteRule /word$ http://localhost/?tab=word
This is obviously somewhat simplistic because I would then have to do it for every possibility.
I experimented with those directives on https://htaccess.madewithlove.com/, which I found in another thread on SO. The results are what I expect them to be, i.e. http://localhost/word is transformed to http://localhost/?tab=word.
Extra info: The website does not have any PHP.

# Virtual Host
RewriteRule ^$ ""
RewriteRule ^([a-z]+)$ /?tab=repo
A directive like the following works:
RewriteRule /word$ http://localhost/?tab=word
The difference with the "working directive" is that you've included a slash prefix. The regex ^([a-z]+)$ does not allow for a slash prefix, so never matches.
You are also failing to use the captured backreference (i.e. $1) in the substitution string, so it would always rewrite to /?tab=repo regardless of the URL requested.
For the same reason, the first rule, which matches against ^$, will never match either, but this rule is not required. You are not performing a redirect when requesting the root - you just don't want to do anything and instead allow mod_dir to serve the directory index.
In a virtualhost context the URL-path matched by the RewriteRule pattern is a root-relative URL-path, starting with a slash.
So, your rule(s) should be like this instead:
RewriteEngine On
RewriteRule ^/([a-z]+)$ /?tab=$1 [L]
(Or, make the slash prefix optional, ie. ^/?([a-z]+)$)
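To see why the slash prefix and the backreference matter, here is a minimal sketch that uses Python's re module as a stand-in for Apache's regex engine. The rewrite() helper is hypothetical and only models a single pattern/substitution pair; it is not how mod_rewrite is implemented.

```python
import re

# Hypothetical helper: applies one pattern/substitution pair roughly the way
# a single RewriteRule would. Python's re stands in for Apache's PCRE.
def rewrite(pattern, substitution, url_path):
    match = re.search(pattern, url_path)
    if match is None:
        return None  # the rule does not apply
    # Expand an Apache-style $1 backreference, if the substitution uses one.
    if "$1" in substitution:
        substitution = substitution.replace("$1", match.group(1))
    return substitution

# In a virtual host the pattern is matched against "/word" (leading slash).
print(rewrite(r"^([a-z]+)$", "/?tab=$1", "/word"))     # None: no slash allowed
print(rewrite(r"^/([a-z]+)$", "/?tab=repo", "/word"))  # /?tab=repo (no $1 used)
print(rewrite(r"^/([a-z]+)$", "/?tab=$1", "/word"))    # /?tab=word
```

The first call mirrors the original rule, the second shows what happens when $1 is not used, and the third mirrors the corrected rule.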
However, /?tab=<word> is not strictly a valid end-point. What is the actual file that is handling the request? This should be included in the rewrite (and not rely on the DirectoryIndex). You state you are not using PHP, so how are you reading the URL parameter?
I experimented with those directives on this website https://htaccess.madewithlove.com/,
You are not using .htaccess in your example. mod_rewrite behaves slightly differently depending on context (.htaccess, directory, virtualhost and server).
Reference:
https://httpd.apache.org/docs/current/mod/mod_rewrite.html#rewriterule
What is matched?
In VirtualHost context, The Pattern will initially be matched against
the part of the URL after the hostname and port, and before the query
string (e.g. "/app1/index.html"). This is the (%-decoded) URL-path.
In per-directory context (Directory and .htaccess), the Pattern is
matched against only a partial path, for example a request of
"/app1/index.html" may result in comparison against "app1/index.html"
or "index.html" depending on where the RewriteRule is defined.

After tinkering a bit more I got down to this:
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} -f
RewriteRule ^(.*)$ %{REQUEST_FILENAME} [PT,L]
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} -d
RewriteRule ^(.*)$ %{REQUEST_FILENAME} [PT,L]
RewriteRule ^/([^index.cgi]{1}.+)$ /index.cgi?tab=$1 [L]
It opens files if they exist and sends requests that don't exist to my C++ cgicc program. The only thing I don't understand is why the -d condition isn't opening directories the same way the -f one opens files.
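One thing worth checking in the last rule: [^index.cgi] is a character class, so it excludes any path whose first character is one of i, n, d, e, x, ., c or g, not just the literal name index.cgi. The sketch below uses Python's re module to illustrate this. The negative-lookahead pattern is only one possible alternative and is an assumption, not something taken from this thread.

```python
import re

# The pattern from the rule above: the class rejects a single character
# from the set {i, n, d, e, x, ., c, g} at the start of the path.
char_class = re.compile(r"^/([^index.cgi]{1}.+)$")

# Assumed alternative: a negative lookahead that rejects only "/index.cgi".
lookahead = re.compile(r"^/(?!index\.cgi$)(.+)$")

for path in ["/about", "/news", "/contact", "/index.cgi"]:
    print(path, bool(char_class.match(path)), bool(lookahead.match(path)))
# /about     True  True
# /news      False True
# /contact   False True
# /index.cgi False False
```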

Related

How can I create a redirect with .htaccess to correct path instead of page access

I am making a multilingual dynamic site that creates a virtual path per language.
So French pages go to domain.com/fr/ and English pages to domain.com/en/ (for example domain.com/en/page or domain.com/fr/some/page), but in reality these pages are in the base folder and /fr/ is converted to a query string.
This is all working with the following .htaccess:
RewriteEngine on
DirectorySlash Off # Fixes the issue where a page and folder can have the same name. See https://stackoverflow.com/questions/2017748
# Return 404 if original request is /foo/bar.php
RewriteCond %{THE_REQUEST} "^[^ ]* .*?\.php[? ].*$"
RewriteRule .* - [L,R=404]
# Remove virtual language/locale component
RewriteRule ^(en|fr)/(.*)$ $2?lang=$1 [L,QSA]
RewriteRule ^(en|fr)/$ index.php?lang=$1 [L,QSA]
# Rewrite /foo/bar to /foo/bar.php
RewriteRule ^([^.?]+)$ %{REQUEST_URI}.php [L]
My problem is that some sites (like a LinkedIn post) somehow remove the trailing / from the index page URL automatically. So if I put a link to domain.com/fr/ in my post, they make the link domain.com/fr even though it displays domain.com/fr/, and that 404s because domain.com/fr doesn't exist.
So how can I redirect domain.com/fr to domain.com/fr/ or localhost/mypath/fr (There's many sites in my local workstation) to localhost/mypath/fr/.
I tried something like:
RewriteRule ^(.*)/(en|fr)$ $1/$2/ [L,QSA,R=301]
RewriteRule ^(en|fr)$ $1/ [L,QSA,R=301]
But that ended up somehow adding the full real computer path in the url:
localhost/mypath/fr becomes localhost/thepathofthewebserverinmypc/mypath/fr/
I would very much appreciate some help as I have yet to find the right rule.
Thank you
RewriteRule ^(en|fr)$ $1/ [L,QSA,R=301]
You are just missing the slash prefix on the substitution string. Consequently, Apache applies the directory-prefix to the relative URL, which results in the malformed redirect.
For example:
RewriteRule ^(en|fr)$ /$1/ [L,R=301]
The substitution is now a root-relative URL path and Apache just prefixes the scheme + hostname to the external redirect. (The QSA flag is unnecessary here, since any query string is appended by default.)
This needs to go before the existing rewrites (and after the blocking rule for .php requests).
Note that the "internal rewrite" directives are correct to not have the slash prefix.
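The effect of the missing slash can be sketched with a simplified model of how the target of an external redirect is built in .htaccess context. The function name, the example directory-prefix /var/www/mypath/ and the host are all hypothetical; this only illustrates the behaviour described above and is not Apache's actual code.

```python
# Simplified, hypothetical model of building an external redirect target
# in per-directory (.htaccess) context.
def redirect_location(substitution, directory_prefix, host="localhost"):
    # An absolute URL is used as-is.
    if substitution.startswith(("http://", "https://")):
        return substitution
    # A relative substitution gets the filesystem directory-prefix added back.
    if not substitution.startswith("/"):
        substitution = directory_prefix + substitution
    # A root-relative path only gets scheme + hostname prefixed.
    return "http://" + host + substitution

prefix = "/var/www/mypath/"  # assumed location of the .htaccess file
print(redirect_location("fr/", prefix))   # http://localhost/var/www/mypath/fr/
print(redirect_location("/fr/", prefix))  # http://localhost/fr/
```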
Aside:
DirectorySlash Off
Note that if you disable the directory slash, you must ensure that auto-generated directory listings (mod_autoindex) are also disabled, otherwise if a directory without a trailing slash is requested then a directory listing will be generated (exposing your file structure), even though there might be a DirectoryIndex document in that directory.
For example, include the following at the top of the .htaccess file:
# Disable auto-generated directory listings (mod_autoindex)
Options -Indexes
UPDATE:
This worked on the production server, as the site is in the server root. Would you know how I can also "catch" this on my localhost? RewriteRule ^(.*)/(en|fr)$ /$1/$2/ [L,R=301] doesn't catch it, but with only RewriteRule ^(en|fr)$ /$1/ [L,R=301], localhost/mypath/fr becomes localhost/fr/
From that I assume the .htaccess file is inside the /mypath subdirectory on your local development server.
The RewriteRule pattern (first argument) matches the URL-path relative to the location of the .htaccess file (so it does not match /mypath). You can then make use of the REQUEST_URI server variable in the substitution that contains the entire (root-relative) URL-path.
For example:
RewriteRule ^(en|fr)$ %{REQUEST_URI}/ [L,R=301]
The REQUEST_URI server variable already includes the slash prefix.
This rule would work OK on both development (in a subdirectory) and in production (root directory), so it should replace the rule above if you need to support both environments with a single .htaccess file.
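As a rough illustration of why the REQUEST_URI version works in both places, the sketch below models the per-directory matching in Python. The function and the two URL prefixes ("/" for production, "/mypath/" for development) are assumptions made for the example.

```python
import re

# Hypothetical model: the pattern sees the URL-path relative to the
# directory holding the .htaccess file, while REQUEST_URI stays complete.
def slash_redirect(request_uri, htaccess_url_prefix):
    relative = request_uri[len(htaccess_url_prefix):]
    if re.search(r"^(en|fr)$", relative):
        return request_uri + "/"  # %{REQUEST_URI}/ keeps the full path
    return None  # rule does not apply

print(slash_redirect("/fr", "/"))                # /fr/
print(slash_redirect("/mypath/fr", "/mypath/"))  # /mypath/fr/
print(slash_redirect("/mypath/fr/", "/mypath/")) # None
```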

In Apache how to do an external redirect to the slashless version of a URL with a subfolder .htaccess file

On Apache 2.4 I have an .htaccess (in a subfolder) which rewrites slashless requests inside that folder to appropriate index files:
DirectorySlash Off
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} -d
RewriteCond %{REQUEST_URI} !/$
RewriteCond %{REQUEST_FILENAME}/index.html -f
RewriteRule (.*) $1/index.html [L]
This works for the slashless version exactly as expected. Now I want to redirect the slashed version externally to the slashless version. I tried adding the lines:
RewriteCond %{REQUEST_FILENAME} -d
RewriteCond %{REQUEST_URI} /$
RewriteRule ^(.*)/ $1 [R=302,L]
However this does not work: The redirect is issued, however it does not go to the slashless URL, but to a URL with a system specific part injected.
So, for a sample URL http://example.com/path/to/dir/ the redirected URL looks like this http://example.com/fs9e/username/sub/public/path/to/dir instead of just http://example.com/path/to/dir.
How can I fix this? Many thanks for any pointers!
PS: The real case is a little bit more complicated because I do a subdomain-to-folder rewrite in the root .htaccess, but I assume this is not relevant here.
RewriteRule ^(.*)/ $1 [R=302,L]
You are missing the slash prefix (/) on the substitution string (2nd argument) - to make the substitution root-relative. Or rather, /subfolder/ (since this .htaccess file is located in a subfolder). Since this is a relative substitution string (not starting with a slash or scheme+hostname), the directory-prefix*1 (which I assume is /fs9e/username/sub/public/path/) is added back (by default*2), resulting in a malformed redirect. (This is correct for internal rewrites, but not external redirects.)
It should be like this:
RewriteRule ^(.*)/$ /subfolder/$1 [R=302,L]
Note you were also missing the end-of-string anchor ($) on the RewriteRule pattern. (This also negates the need for the preceding condition that checks that REQUEST_URI ends in a slash.)
Note also that this "redirect" should go before the earlier "rewrite".
*1 The directory-prefix is the absolute filesystem path of the location of the .htaccess file.
*2 The alternative is to set a RewriteBase /subfolder - but that then affects all relative substitutions. You could also use an environment variable to apply a specific prefix only to some rules.
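The effect of the missing end-of-string anchor can be checked quickly with Python's re module (used here only as a stand-in for Apache's regex engine; the sample paths are made up):

```python
import re

unanchored = re.compile(r"^(.*)/")   # pattern from the question
anchored = re.compile(r"^(.*)/$")    # pattern with the $ anchor

for path in ["to/dir/", "to/dir/page.html"]:
    u = unanchored.match(path)
    a = anchored.match(path)
    print(path, u.group(1) if u else None, a.group(1) if a else None)
# to/dir/           to/dir  to/dir
# to/dir/page.html  to/dir  None
```

Without the anchor, a path that merely contains a slash also matches, which is why the original rule needed the extra condition on REQUEST_URI.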

Apache htaccess rewrite root and all root folders to subfolder without redirecting

Options +FollowSymLinks -MultiViews
# Turn mod_rewrite on
RewriteEngine On
RewriteBase /
RewriteRule ^$ /subdir/ [L,NC]
I want to rewrite the root domain to subfolder without changing the URL in the browser. The above code works just for the root domain but not any folders and files.
For example, I have https://example.com/ and https://example.com/subdir/.
With the above code in .htaccess file, when I go to https://example.com/ I see the contents of https://example.com/subdir/ which is good.
But when I go to https://example.com/test.txt I should see https://example.com/subdir/test.txt but I get "The requested URL was not found on this server."
Same happens when I go to https://example.com/abc expecting to see contents of https://example.com/subdir/abc
Any idea?
RewriteRule ^$ /subdir/ [L,NC]
Change this to read:
RewriteRule !^subdir/ subdir%{REQUEST_URI} [L]
Any request that does not start /subdir/ is internally rewritten to /subdir/<url>. The REQUEST_URI server variable contains the full URL-path (including the slash prefix).
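A small Python sketch of that rule, assuming the .htaccess file sits in the document root (so the pattern sees the URL-path without its leading slash). The function name is hypothetical.

```python
import re

# Hypothetical model of: RewriteRule !^subdir/ subdir%{REQUEST_URI} [L]
def rewrite_to_subdir(request_uri):
    relative = request_uri[1:]  # per-directory path, leading slash removed
    if not re.search(r"^subdir/", relative):  # the "!" negates the match
        return "/subdir" + request_uri        # RewriteBase "/" + substitution
    return request_uri  # already under /subdir/, left unchanged

print(rewrite_to_subdir("/"))                 # /subdir/
print(rewrite_to_subdir("/test.txt"))         # /subdir/test.txt
print(rewrite_to_subdir("/subdir/test.txt"))  # /subdir/test.txt
```

The last call shows why the negated pattern is needed: an already-rewritten URL is not rewritten again.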
I removed the slash prefix from the substitution string since you have defined a RewriteBase /. (Although neither are strictly necessary here.)
UPDATE:
...when I go to example.com/s I am being redirected to example.com/subdir/s/
s is a subfolder within subdir, does that make any difference?
Ah yes, if /s is a subdirectory then mod_dir will append the trailing slash (to "fix" the URL) with an external 301 redirect. This redirect occurs after the URL has been rewritten to /subdir/s - thus exposing the /subdir subdirectory.
To handle this situation we can add another rule (a redirect) before the existing rewrite that first checks whether the request would map to a directory within the /subdir subdirectory and append a slash if it is omitted (before mod_dir would append the slash to the rewritten URL).
For example:
RewriteCond %{REQUEST_URI} !/$
RewriteCond %{DOCUMENT_ROOT}/subdir%{REQUEST_URI} -d
RewriteRule !\.\w{2,4}$ %{REQUEST_URI}/ [R=301,L]
This states... for any request that:
!\.\w{2,4}$ - does not contain (what looks like) a file extension of between 2 and 4 characters (assuming your directories aren't named this way)
!/$ - and does not currently end in a slash.
-d - and exists as a physical directory in the /subdir subdirectory.
THEN redirect to append the trailing slash on the original request
Whilst this probably should be a 301 (permanent) redirect, you should first test with a 302 (temporary) redirect to avoid potential caching issues.
You will need to clear your browser cache before testing, since the erroneous 301 redirect from /s to /subdir/s/ will have been cached by the browser.
A potential optimisation is to remove the filesystem check and simply assume that any request that does not contain a file extension should map to a directory. (But this depends on whether you are handling these URLs in any other way.)
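The three checks can be sketched as a single boolean function. The set of directories is a made-up stand-in for the filesystem test (-d), and the function name is hypothetical.

```python
import re

# Hypothetical model of the redirect's conditions. subdir_directories lists
# URL-paths that exist as directories inside /subdir (stand-in for "-d").
def needs_trailing_slash(request_uri, subdir_directories):
    has_extension = re.search(r"\.\w{2,4}$", request_uri) is not None
    ends_in_slash = request_uri.endswith("/")
    is_directory = request_uri in subdir_directories
    return (not has_extension) and (not ends_in_slash) and is_directory

dirs = {"/s", "/abc"}  # assumed directories under /subdir
print(needs_trailing_slash("/s", dirs))         # True
print(needs_trailing_slash("/s/", dirs))        # False
print(needs_trailing_slash("/test.txt", dirs))  # False
print(needs_trailing_slash("/missing", dirs))   # False
```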
Summary
Options +FollowSymLinks -MultiViews
# Turn mod_rewrite on
RewriteEngine On
# If the requested URL exists as a directory in "/subdir" then append a slash
RewriteCond %{REQUEST_URI} !/$
RewriteCond %{DOCUMENT_ROOT}/subdir%{REQUEST_URI} -d
RewriteRule !\.\w{2,4}$ %{REQUEST_URI}/ [R=301,L]
# Rewrite everything to "/subdir"
RewriteRule !^subdir/ subdir%{REQUEST_URI} [L]

I cannot get RewriteRule to work

I'm using VertrigoServ 2.27 on my laptop (localhost:8080).
The rewrite module is enabled, and I tested it with the alice.html and bob.html example
(http://stackoverflow.com/questions/6944521/url-rewriting-on-localhost);
it works with an .htaccess file inside a www subfolder. I also put rubbish text inside .htaccess and
got an error from the Apache server, so mod_rewrite is running and I can use rules.
Here is the rule (inside /www/folder1/.htaccess):
Options +FollowSymLinks
RewriteEngine On
RewriteRule /(.*) /index.php?disp=$1 [QSA,L]
So when I put this URL into the browser my index page loads OK:
http://localhost:8080/folder1/index.php
The problem is this: when I log in via the index page (the login page) and submit the form by clicking the send button,
the URL changes to localhost:8080/login, but it should be localhost:8080/folder1/login.
How can I keep the subfolder name in the URL?
I also want to convert URLs like this: www.best-food-of-the-usa.com/index.php?operation=top&state=arizona&city=mesa&limit=10
to this: www.best-food-of-the-usa.com/arizona/mesa/top10.html
Any help is appreciated. Thanks
\Jose
RewriteRule is behaving as the docs say.
From RewriteRule Directive Apache Docs
What is matched?
In VirtualHost context, The Pattern will initially be matched against the part of the URL after the hostname and port, and before the query string (e.g. "/app1/index.html").
In Directory and htaccess context, the Pattern will initially be matched against the filesystem path, after removing the prefix that lead the server to the current RewriteRule (e.g. "app1/index.html" or "index.html" depending on where the directives are defined).
If you wish to match against the hostname, port, or query string, use a RewriteCond with the %{HTTP_HOST}, %{SERVER_PORT}, or %{QUERY_STRING} variables respectively.
So, in Directory and htaccess context, the prefix /www/folder1/ will be removed. Also remember that when matching with RewriteRule in Directory and htaccess context, the path being matched will never begin with /.
So, your RewriteRule should be:
Options +FollowSymLinks
RewriteEngine On
RewriteRule (.*) /index.php?disp=$1 [QSA,L]
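To make the prefix removal concrete, here is a short Python sketch. The helper is hypothetical and works on URL prefixes for simplicity, whereas Apache strips the filesystem prefix of the directory.

```python
import re

# Hypothetical helper: the path a RewriteRule pattern sees in .htaccess
# context, after the directory's prefix has been removed.
def per_directory_path(request_uri, directory_url_prefix):
    if not request_uri.startswith(directory_url_prefix):
        raise ValueError("request is outside this directory")
    return request_uri[len(directory_url_prefix):]

path = per_directory_path("/folder1/login", "/folder1/")
print(path)                                # login
print(re.search(r"/(.*)", path))           # None: there is no slash to match
print(re.search(r"(.*)", path).group(1))   # login
```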

How to redirect "/" to "/home.html" only if the file "/index.html" does not physically exists?

I found a way to redirect (not load, but change the URL) "/" to "/home.html". And now I want to add a RewriteCond to avoid the redirection if the file "/index.html" exists.
I tried the following (without the comments), but it didn't work:
# We check that we come from "domain.tld/"
RewriteCond %{REQUEST_URI} =/
# We check that there is no index.html file at the site's root
RewriteCond %{REQUEST_URI}index\.html !-f
# We redirect to home.html
RewriteRule ^(.*)$ %{REQUEST_URI}home\.html [R=301,L]
Help me Obi-wan Kenobi... You're my only hope!
@Gumbo
It's a little bit more complicated than the above example. In fact, I manage both localhost and production development with the same .htaccess, so I tried something like this (following your answer) :
# Redirect domain.tld/ to domain.tld/home.html (only if domain.tld/index.html does not exist)
RewriteCond %{DOCUMENT_ROOT}index\.html !-f [OR]
RewriteCond %{DOCUMENT_ROOT}domain.tld/www/index\.html !-f
RewriteRule ^$ %{REQUEST_URI}home\.html [R=301,L]
I looked at the path returned by "%{DOCUMENT_ROOT}domain.tld/www/index.html" and it's exactly the path of my index.html file... nevertheless, it didn't work either. :(
By the way, thanks for the "^$" trick to avoid "%{REQUEST_URI} =/"! \o/
Any idea why?
The file check -f requires a valid file system path, but %{REQUEST_URI}index\.html is a URI path, not a file system path. You can either use -F instead to check for existence via a subrequest, or use DOCUMENT_ROOT to build a valid file system path:
RewriteCond %{DOCUMENT_ROOT}/index.html !-f
RewriteRule ^$ %{REQUEST_URI}home.html [R=301,L]
Furthermore, the other condition can be accomplished with the pattern of RewriteRule. As you’re using mod_rewrite in a .htaccess file, the corresponding path prefix is stripped (in case of the document root directory: /) so that the remaining path is an empty string (matched by ^$).
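The distinction between a URL-path and a filesystem path can be illustrated in Python. This sketch uses a temporary directory as a stand-in for the document root; index_missing() is a hypothetical helper that plays the role of the "!-f" check.

```python
import os
import tempfile

# Hypothetical helper: true when <document_root>/index.html is not a file,
# i.e. the situation in which the redirect to home.html should happen.
def index_missing(document_root):
    return not os.path.isfile(os.path.join(document_root, "index.html"))

with tempfile.TemporaryDirectory() as document_root:
    print(index_missing(document_root))  # True: no index.html yet
    with open(os.path.join(document_root, "index.html"), "w") as f:
        f.write("<html></html>")
    print(index_missing(document_root))  # False: the file now exists
```

The check only gives a meaningful answer because the path starts at the document root on disk rather than at the URL root.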
If you have access to httpd.conf (Apache's config file) you could set the default page in there.
Something like this:
<IfModule dir_module>
DirectoryIndex index.html home.html
</IfModule>
Based on the rule set that you posted in your update, you have a bit of a logical error going on. Right now, one of your RewriteCond conditions will always be true, since it seems likely that both index files will never exist in the same environment (one exists in development, the other in production). Since you've OR'ed them together, this means that your RewriteRule will never be ignored due to the condition block.
It's simple enough to fix (I've also added additional forward slashes, since DOCUMENT_ROOT typically doesn't have a trailing slash):
RewriteCond %{DOCUMENT_ROOT}/index.html !-f
RewriteCond %{DOCUMENT_ROOT}/domain.tld/www/index.html !-f
RewriteRule ^$ %{REQUEST_URI}home.html [R=301,L]
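The logical difference between the OR'ed and the AND'ed conditions can be spelled out as a truth check. The function and parameter names below are made up for the illustration.

```python
# Each RewriteCond is "index.html is NOT a file at this location".
def redirects_with_or(dev_index_exists, prod_index_exists):
    return (not dev_index_exists) or (not prod_index_exists)

def redirects_with_and(dev_index_exists, prod_index_exists):
    return (not dev_index_exists) and (not prod_index_exists)

# index.html exists in exactly one environment's location:
print(redirects_with_or(True, False))    # True: redirects despite index.html
print(redirects_with_and(True, False))   # False: index.html is respected
# index.html exists in neither location:
print(redirects_with_and(False, False))  # True: redirect to home.html
```

With [OR], the redirect is only skipped if both files exist at once, which is unlikely when each path belongs to a different environment.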
Note too that you could set up a virtual host with a local host name so that your development and production environments would be similar in terms of relative paths.