Page-alias without .html-suffix - apache

One TYPO3 installation I have here uses the alias field in the page settings. It does not make use of simulatestatic or realurl. If the alias of a page is set to foo, the page is reachable under the following URLs:
/index.php?id=foo
/foo.html
I now want the page to be reachable under an additional URL: /foo, without the .html.
My approach was to simply use mod_rewrite and add some rules like this:
RewriteCond %{REQUEST_FILENAME} !index.php$
RewriteCond %{REQUEST_FILENAME} !\.html$
RewriteRule ^([^/]+)$ $1.html [QSA]
The rules themselves fire: they rewrite the URI /foo first to /foo.html and later to /index.php. The result still does not work, though: I get a 404 when requesting /foo.
I assume this happens since TYPO3 still gets the info that the original URI was /foo instead of /foo.html, which it doesn't recognize.
How could this be solved without using realurl or simulatestatic (the side effects are unwanted) and without using an HTTP redirect (the URL in the browser should stay /foo)? Is there something like a server-internal redirect in Apache?

If mod_proxy is activated on your server, you could use the proxy flag [P] and write:
RewriteCond %{REQUEST_FILENAME} !\.html$
RewriteRule ^([^/]+)$ /$1.html [QSA,P]
If not, you could write a PHP file that acts as a proxy (I've done that before for TYPO3). cURL or even a simple file_get_contents() is very handy in this case. Make sure to only load pages from your own domain. Rewrite your non-.html requests to the proxy file, which fetches the corresponding .html URL, which in turn is processed by TYPO3.
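For illustration, a slightly fuller .htaccess for the [P] approach might look like the sketch below. It assumes mod_rewrite, mod_proxy and mod_proxy_http are all loaded. The conditions that skip existing files and directories, and the character class that excludes dots, are additions for safety and not part of the rule above.

```apache
RewriteEngine On

# Assumption: leave requests for real files and directories alone
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Match only single-segment paths without a dot (/foo, but not /foo.html or /index.php)
RewriteRule ^([^/.]+)$ /$1.html [P,QSA]
```

Because the proxied request asks for /foo.html, TYPO3 sees the .html suffix in the request URI, and the pattern no longer matches that second request, so there should be no loop.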

Related

How to redirect the page after rewrite the URL

I have to rewrite the URL, so I have used the code below in my .htaccess, which is working:
RewriteRule ^products/part/pumps/?$ pumps.php [L,NC]
If someone accesses example.com/products/part/pumps/, it shows the pumps.php data.
Now my issue is: if someone tries to access example.com/pumps.php, how can I redirect them to example.com/products/part/pumps/?
I have tried the following, but I get a redirect error:
Redirect 301 /pumps.php /products/part/pumps/
I have many pages and I have to redirect them as well. Here are two examples:
RewriteRule ^power\.php$ /products/engine/power/ [R=301,L]
RewriteRule ^connecting\.php$ /products/connection/power/connecting/ [R=301,L]
Use the following before your existing rewrite:
# External redirect
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^pumps\.php$ /products/part/pumps/ [R=301,L]
(But test first with a 302 redirect to avoid potential caching issues.)
By checking that the REDIRECT_STATUS environment variable is "empty" we can redirect only direct requests from the user and not the rewritten request by the later rewrite. After the request is successfully rewritten, REDIRECT_STATUS has the value 200 (as in 200 OK HTTP status).
The RewriteCond (condition) directive must precede every RewriteRule directive that triggers an external redirect, because a condition applies only to the single RewriteRule that follows it.
The Redirect directive (part of mod_alias, not mod_rewrite) is processed unconditionally and will end up redirecting the rewritten request, resulting in a redirect loop. You need to use mod_rewrite throughout.
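Applied to the other pages mentioned in the question, the pattern could be sketched as follows. The internal rewrites for power.php and connecting.php are assumptions that mirror the pumps rule (the question only shows their redirects), and 302 is used while testing.

```apache
RewriteEngine On

# External redirects: each RewriteRule gets its own RewriteCond
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^pumps\.php$ /products/part/pumps/ [R=302,L]

RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^power\.php$ /products/engine/power/ [R=302,L]

RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^connecting\.php$ /products/connection/power/connecting/ [R=302,L]

# Internal rewrites (the last two are assumed, modelled on the pumps rule)
RewriteRule ^products/part/pumps/?$ pumps.php [L,NC]
RewriteRule ^products/engine/power/?$ power.php [L,NC]
RewriteRule ^products/connection/power/connecting/?$ connecting.php [L,NC]
```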
Use the END flag instead of RewriteCond (requires Apache 2.4)
Alternatively, you can modify your existing rewrite to use the END flag (instead of L) to prevent a loop by the rewrite engine. The RewriteCond directive as mentioned above can then be omitted. But note that the END flag is only available on Apache 2.4+.
For example:
# External redirects
RewriteRule ^pumps\.php$ /products/part/pumps/ [R=301,L]
# Internal rewrites
RewriteRule ^products/part/pumps/?$ pumps.php [END,NC]
It is advisable to group all the external redirects together, before the internal rewrites.
Unfortunately, due to the varying nature of these rewrites it doesn't look like the rules can be reduced further.

Apache mod_rewrite - unwanted redirect instead of rewrite

I have an issue with mod_rewrite and I can't seem to solve it. I stripped the example down to the bare bones and I don't understand why a specific rule forces my browser to redirect instead of rewrite:
RewriteEngine on
#if request is for a physical-file OR for one of the language paths - skip (return as-is)
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -f [OR]
RewriteCond %{REQUEST_URI} ^/de [OR]
RewriteCond %{REQUEST_URI} ^/en-US
RewriteRule ^ - [L]
#otherwise: rewrite to en-US folder
RewriteRule ^(.*)$ /en-US/$1 [NC,L,QSA]
I read the documentation very carefully, and it seems like this should rewrite every call internally, so https://example.com/fuBar.html should retrieve the file /en-US/fuBar.html from my server without the user's browser knowing about it.
What's really happening is that for some reason the browser is redirected to https://example.com/en-US/fuBar.html. While this does display the correct content, it's just not what I want or what I thought this RewriteRule should do. What am I doing wrong?
Addendum: the .htaccess of the subfolders de and en-US:
RewriteEngine On
# If an existing asset or directory is requested go to it as it is
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -f [OR]
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -d
RewriteRule ^ - [L]
# If the requested resource doesn't exist, use index.html
RewriteRule ^ /index.html
There's nothing in the code you've posted that would trigger an external "redirect".
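To make the distinction concrete: an internal rewrite and an external redirect to a local path differ only in the R flag (a substitution that is an absolute URL on another host also redirects). A minimal sketch, simplified from the rules in the question, with the two variants meant as alternatives:

```apache
RewriteEngine On

# Variant A: internal rewrite, the address bar keeps the original URL
RewriteCond %{REQUEST_URI} !^/en-US/
RewriteRule ^(.*)$ /en-US/$1 [L]

# Variant B: external redirect, the browser is sent to /en-US/...
# (commented out; only this form produces a 3xx response)
# RewriteCond %{REQUEST_URI} !^/en-US/
# RewriteRule ^(.*)$ /en-US/$1 [R=302,L]
```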
Make sure you have cleared your browser (and any intermediary) cache(s) to ensure you are not seeing an earlier/erroneous 301 (permanent) redirect. (301 redirects are cached persistently by the browser.)
Check the network traffic in the browser's developer tools to see the precise nature of this redirect: what it redirects from and to, as well as the 3xx HTTP status code (if this is indeed an external redirect).
It would seem the front-end (JavaScript/Angular) is manipulating the URL in the address bar (there is no redirect). From comments:
Actually there was no redirect happening at all! Rather since I set <base href="/en-US"> somehow my frontend (Angular) seems to have outsmarted me, manipulating the address without me realizing it. Turns out I don't even need to change the base href, I just need the rewrites.

.htaccess rewrite returning Error 404

RewriteEngine on
RewriteCond %{QUERY_STRING} (^|&)public_url=([^&]+)($|&)
RewriteRule ^process\.php$ /api/%2/? [L,R=301]
Where domain.tld/app/process.php?public_url=abcd1234 is the actual location of the script.
But I am trying to get .htaccess to make the URL like this: domain.tld/app/api/acbd1234.
Essentially, this should hide the process.php script and the GET query parameter ?public_url.
However the script above is returning error 404 not found.
I think this is what you are actually looking for:
RewriteEngine on
RewriteCond %{QUERY_STRING} (?:^|&)public_url=([^&]+)(?:$|&)
RewriteRule ^/?app/process\.php$ /app/api/%1 [R=301,QSD]
RewriteRule ^/?app/api/([^/]+)/?$ /app/process.php?public_url=$1 [END]
If you receive an internal server error (HTTP status 500) for that, check your HTTP server's error log file. Chances are that you operate a very old version of the Apache HTTP server; in that case you may have to replace the [END] flag with the [L] flag, which will probably work just fine in this scenario.
And a general hint: you should always prefer to place such rules inside the HTTP server's (virtual) host configuration instead of using dynamic configuration files (.htaccess style files). Those files are notoriously error-prone, hard to debug and they really slow down the server. They are only supported as a last option for situations where you do not have control over the host configuration (read: really cheap hosting service providers) or if you have an application that relies on writing its own rewrite rules (which is an obvious security nightmare).
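As a rough sketch of what that could look like, here are the same rules inside a virtual host. The host name, port and document root are placeholders, and Apache 2.4 is assumed (for END, QSD and Require).

```apache
<VirtualHost *:80>
    # Hypothetical host name and document root
    ServerName example.com
    DocumentRoot "/var/www/example"

    <Directory "/var/www/example">
        # With the rules kept here, .htaccess lookups can be switched off
        AllowOverride None
        Require all granted
    </Directory>

    RewriteEngine On
    RewriteCond %{QUERY_STRING} (?:^|&)public_url=([^&]+)(?:$|&)
    RewriteRule ^/?app/process\.php$ /app/api/%1 [R=301,QSD]
    RewriteRule ^/?app/api/([^/]+)/?$ /app/process.php?public_url=$1 [END]
</VirtualHost>
```

The optional leading slash in the patterns (^/?) is what lets the same rules match in both contexts: in server context the URL path starts with a slash, in .htaccess context it does not.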
UPDATE:
Based on your many questions in the comments below (we see again how important it is to be precise in the question itself ;-) ) I add this variant implementing a different handling of path components:
RewriteEngine on
RewriteCond %{QUERY_STRING} (?:^|&)public_url=([^&]+)(?:$|&)
RewriteRule ^/?app/process\.php$ /api/%1 [R=301,QSD]
RewriteRule ^/?api/([^/]+)/?$ /app/process.php?public_url=$1 [END]
I am trying to get .htaccess to make the URL like this: example.com/app/api/acbd1234.
You don't do this in .htaccess. You change the URL in your application and then rewrite the new URL to the actual/old URL. (You only need to redirect this, if the old URLs have been indexed by search engines - but you need to watch for redirect loops.)
So, change the URL in your application to /app/api/acbd1234 and then rewrite this in .htaccess (which I assume is in your /app subdirectory). For example:
RewriteEngine On
# Rewrite new URL back to old
RewriteRule ^api/([^/]+)$ process.php?public_url=$1 [L]
You included a trailing slash in your earlier directive, but you omitted this in your example URL, so I've omitted it here also.
If you then need to also redirect the old URL for the sake of SEO, then you can implement a redirect before the internal rewrite:
RewriteEngine On
# Redirect old URL to new (if request by search engines or external links)
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{QUERY_STRING} (?:^|&)public_url=([^&]+)(?:$|&)
RewriteRule ^process\.php$ /app/api/%1? [R=302,L]
# Rewrite new URL back to old
RewriteRule ^api/([^/]+)$ process.php?public_url=$1 [L]
The check against REDIRECT_STATUS is to avoid a rewrite loop. ?: inside the parenthesised subpattern avoids the group being captured as a backreference.
Change the 302 (temporary) to 301 (permanent) only when you are sure it's working OK, to avoid erroneous redirects being cached by the browser.

Why is this RewriteRule altering QUERY_STRING, but leaving REQUEST_URI untouched?

I have a copy of Concrete5, a PHP-based CMS, running on example.com.
Concrete5 comes with the following basic instructions for pretty URLs (redirecting all URLs to a central index.php)
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_URI} !^/c5.7
# Concrete5 is running in the c5.7/ subdirectory
RewriteRule ^.*$ c5.7/$0 [L]
</IfModule>
Pretty straightforward.
Now I have a certain set of URLs that take the form
/product/{productname}
that I need to forward to the Concrete5 (virtual) URL
/products/details?name={productname}
That URL is set up and works as expected when I enter it manually in the browser.
So I added a line to the htaccess file and it now looks like this:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
# New rule for products
RewriteCond %{REQUEST_URI} ^/product/
RewriteRule ^product/(.+)$ /products/details?name=$1 [QSA]
RewriteCond %{REQUEST_URI} !^/c5.7
RewriteRule ^.*$ c5.7/$0 [L]
</IfModule>
I can confirm the RewriteRule gets triggered when I choose a random, external URL as the redirection target.
But whenever it is an internal redirect like above, what happens is, I get a 404 inside Concrete5. When I inspect what was passed to it, I see:
REQUEST_URI: /product/my-random-product
QUERY_STRING: name=my-random-product
So it appears that the rule is triggered and does some rewriting, but REQUEST_URI remains unchanged!
Why?
Is it because PHP 7.1 is running via CGI?
I have tried a zillion variations and all the flags in the book, with little success.
The REQUEST_URI in PHP is not the same as the REQUEST_URI within mod_rewrite. In PHP it always contains the original URL, so you can't change it with a RewriteRule if your CMS is working off that.
You should set up your CMS to use the URLs you want, rather than trying to augment your CMS's URL rewriting like this.
If you inspect REDIRECT_URL in PHP you will see the last rewritten URI.
REQUEST_URI in PHP will always be the original request URI.
Because this is already explained by LSerni and SuperDuperApps, I won't elaborate.
Instead, I'm offering a quick solution: modify the REQUEST_URI and add a name parameter in PHP instead of in .htaccess.
Add the following code to the start of your Concrete5 index.php to make sure that REQUEST_URI is modified before any Concrete5 code runs:
// Map /product/{name} onto the existing /products/details?name={name} page
if (preg_match('-^/product/([^?]*)-', $_SERVER['REQUEST_URI'], $matches)) {
    $_SERVER['REQUEST_URI'] = '/products/details';
    $_GET['name'] = $matches[1];
}
Your setup works on a PHP 7.1 machine (without Concrete5). It does call a script I just put in, which is in /c5.7/products/details. So the Apache part is working.
Inside the script, I see that REQUEST_URI is the old value prior to the rewrite.
So its value is normal and it not being rewritten is a red herring - it isn't supposed to be rewritten. The 404 error must be due to something else.
Your Concrete5 routing should support the real URL, not just the virtual one, because C5's routing itself relies on REQUEST_URI. If this is so, you need to create a route for your short URLs:
Route::register('/product/{productname}' ...)
and an appropriate controller to get the parameters and invoke the "old" controller.
One possibility using .htaccess could be this, but I'm not too sure it will work since REQUEST_URI is still left unchanged:
# New rule for products
RewriteCond %{REQUEST_URI} ^/product/
RewriteRule ^product/(.+)$ c5.7/products/details?name=$1 [L,QSA]
Otherwise you need to do an external redirect, which will disclose the URL in the browser:
RewriteRule product/(.*)$ http://.../products/details?name=$1 [QSA]

How to setup request proxy using URL rewriting

I have an e-commerce site that resides in:
http://dev.gworks.mobi/
When a customer clicks on the sign-in link, the browser gets redirected to another domain for authentication:
http://frock.gworks.mobi:8080/openam/XUI/#login/&goto=http%3A%2F%2Fdev.gworks.mobi%3A80%2Fcustomer%2Faccount%2Flogin%2Freferer%2FaHR0cDovL2Rldi5nd29ya3MubW9iaS8%2C%2F
I'm trying to rewrite http://dev.gworks.mobi/* to http://frock.gworks.mobi:8080/openam/*, without redirection.
I've tried this in the .htaccess of the dev.gworks.mobi site:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_URI} ^/openam(.*)$ [NC]
RewriteRule ^(.*)$ http://frock.gworks.mobi:8080/$1 [P,L]
</IfModule>
But when I access http://dev.gworks.mobi/openam, I get a 404 Not Found page.
Can anyone help me to achieve my use case?
Try this:
RewriteEngine on
RewriteBase /
# Make sure it's not an actual file being accessed
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Match the host
RewriteCond %{HTTP_HOST} ^dev\.gworks\.mobi
# Rewrite the request if it starts with "openam"
RewriteRule ^openam(.*)$ http://frock.gworks.mobi:8080/$1 [L,QSA]
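This will send all requests for dev.gworks.mobi/openam to frock.gworks.mobi:8080. Because the target is an absolute URL on a different host, mod_rewrite issues an external redirect here, so the visitor sees the new address.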
If you want to mask the URI in a way that it's not visible to the visitor that she's visiting the authentication app, you need to add a P flag. Please note that it needs Apache's mod_proxy module in place:
RewriteRule ^openam(.*)$ http://frock.gworks.mobi:8080/$1 [P,L,QSA]
Feel free to drop the L flag, if it's not the last rewrite rule. See RewriteRule Flags for more information.
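If the virtual host configuration is editable, the same masking can also be expressed with mod_proxy's own directives instead of a RewriteRule. The sketch below assumes mod_proxy and mod_proxy_http are loaded and that the backend keeps the /openam prefix, as described in the question. Note that ProxyPass cannot be used in .htaccess.

```apache
<VirtualHost *:80>
    ServerName dev.gworks.mobi

    # Forward /openam/... to the same path on the authentication server
    ProxyPass        "/openam" "http://frock.gworks.mobi:8080/openam"
    # Rewrite Location headers in backend redirects back to this host
    ProxyPassReverse "/openam" "http://frock.gworks.mobi:8080/openam"
</VirtualHost>
```

ProxyPassReverse only adjusts response headers such as Location. Links or host names that the application embeds in its HTML or configuration are not touched.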
The 404
If it's all in place and you're still getting a 404 error, make sure that the target URL is not throwing 404 errors in the first place.
Second, check if you're still getting the error with the correct referrer URI set. It might be designed in a way to throw a 404, if the referrer is not correctly set. If that's the case, which I suspect, you need to use the R flag and redirect instead of proxying the request.
Last thing that comes to my mind, some webapps are not built in a way to figure out the URI address. The host, as well as the port number, might be hard-coded somewhere in the config files. Make sure that the authentication app is able to be run from another URL without the need to edit the configs.
Test
You can test the rewriterule online: