Htaccess add trailing slash to root - apache

Is it possible to add trailing slash to root URL?
For example, I need https://domain.example -> https://domain.example/

https://domain.example and https://domain.example/ are the same URL - there is always a slash immediately after the hostname (at the start of the URL-path), even if you do not see it in the browser's address bar (it is present in the HTTP request).
The browser "prettifies" the URL you see in the address bar - this is not necessarily the exact URL that the browser uses in the request.
See the following question on the Webmasters stack:
https://webmasters.stackexchange.com/questions/35643/is-trailing-slash-automagically-added-on-click-of-home-page-url-in-browser
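As a small illustration of the point above, here is a Python sketch that treats an empty URL-path as "/" before comparing two URLs. The helper names `normalized` and `same_resource` are hypothetical, and the normalization covers only the empty-path case, not every form of URL equivalence.

```python
from urllib.parse import urlsplit


def normalized(url: str) -> tuple:
    """Return (scheme, host, port, path, query) with an empty path replaced by '/'."""
    parts = urlsplit(url)
    # An empty path and "/" name the same resource on the root of the host.
    path = parts.path or "/"
    host = (parts.hostname or "").lower()
    return (parts.scheme.lower(), host, parts.port, path, parts.query)


def same_resource(a: str, b: str) -> bool:
    """Compare two URLs after the empty-path normalization above."""
    return normalized(a) == normalized(b)


if __name__ == "__main__":
    print(same_resource("https://domain.example", "https://domain.example/"))
```

Note that the equivalence holds only for the root: /a and /a/ remain different URL-paths.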

Related

Does 301 redirect in htaccess have to use full path?

When changing file or directory names or "prettifying" URLs via .htaccess, I have always previously used this format:
Redirect 301 /oldfile.htm /newfile
However, according to this article, I have been doing it incorrectly all these years:
The last section is the full path to the new file. This is a
fully-qualified URL, meaning you need the http://
(http://www.domain.com/new-file.html).
Are they correct? I always use a redirect check script after writing my rules, and they always check out, even with relative paths.
The official documentation settles this. It says:
The new URL should be an absolute URL beginning with a scheme and hostname. In Apache HTTP Server 2.2.6 and later, a URL-path beginning with a slash may also be used, in which case the scheme and hostname of the current server will be added.
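To make the quoted rule concrete, here is a rough Python sketch of what "the scheme and hostname of the current server will be added" means in practice. It uses `urljoin` from the standard library as an approximation; Apache itself builds the value from its own server configuration, and the function name `resolve_redirect_target` is made up for this example.

```python
from urllib.parse import urljoin


def resolve_redirect_target(request_url: str, target: str) -> str:
    """Approximate the absolute URL a redirect target ends up as.

    A target starting with "/" inherits the scheme and hostname of the
    request; a fully-qualified target is returned unchanged.
    """
    return urljoin(request_url, target)


if __name__ == "__main__":
    print(resolve_redirect_target("http://www.domain.com/oldfile.htm", "/newfile"))
    print(resolve_redirect_target("http://www.domain.com/oldfile.htm",
                                  "http://www.domain.com/new-file.html"))
```

Both forms end up as an absolute URL in the response, which is why a redirect checker reports the same result for either.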

How to prevent AEM/Sling from adding trailing slash to the extensionless URLs?

All extensionless URLs on the site that resolve to actual nodes are being redirected (with a 301 code) to the same URL with a trailing slash added. This doubles the number of requests to the frontend web server, so we would like to fix it.
We do use Apache mod_rewrite to rewrite all incoming URLs (with or without slash) to their .html equivalents in order to make dispatcher caching consistent, but the actual processing is a bit weird.
In general, we have three cases:
1. URL has an extension ( i.e. /content/xxx/yyy.html ) - it is processed right away, no redirects
2. URL has a trailing slash ( /content/xxx/yyy/ ) - it is processed by mod_rewrite and rewritten to /content/xxx/yyy.html successfully, no redirects
3. Extensionless URL ( /content/xxx/yyy ) - processed by mod_rewrite, rewritten to /content/xxx/yyy.html and immediately redirected to /content/xxx/yyy/, which then goes through the routine from point 2 above.
To exclude Apache originated redirects we disabled almost all modules, such as mod_dir, mod_negotiation, mod_autoindex, etc to avoid redirects due to the content negotiation or directory indexing but requests are still being redirected.
Our app doesn't contain any redirects based on the URL so I'm wondering if there is any OSGI service or hidden configuration setting which triggers such redirects?
We also have a set of shortcuts on the site, Apache rewrites them to actual URLs and they are NOT being redirected.
For example, if the requested URL is /aboutus, it is successfully mapped to /content/xxx/yyy/operations/aboutus.html and processed in one pass without any additional redirects. The problem described above occurs only when there is an actual corresponding node in the JCR and the request is extensionless.

Unwanted redirect in Chrome when appending "#anydomain.com"

I just noticed that if we append "#anydomain.com" to any URL, Chrome (and also FF) redirects the user to the appended domain.
For example:
http://www.google.com#facebook.com/ - Will redirect to facebook.com
http://www.facebook.com#google.com/ - Will redirect to google.com
I would like to prevent it from my website, does anyone know anything about it?
Thanks in advance!
-B.J.
Adding more info:
If I try to add a '/' before the '#', like this:
http://www.google.com/#facebook.com/
Then Google gives me 404 page not found... But my website still redirects with the '/'
The # symbol is used as part of the URI scheme to login users to a site.
If you notice, as soon as you click it says "You are about to log in Facebook.com with the username..."
It's part of the HTTP protocol. You can't really do anything about it.
Read : http://en.wikipedia.org/wiki/URI_scheme
I fixed it.
The issue was in the Apache configuration: the permanent redirect I use to send HTTP traffic to HTTPS had the wrong target. It was missing a final slash '/'.
So when accessing:
http://mydomain.com/#anotherdomain.com
it was redirecting to https://mydomain.com#anotherdomain.com without the final slash, and the browser default behaviour was to redirect to anotherdomain.com without even hitting my server.
The fix was simply to add the trailing slash to the redirect directive I have:
Redirect permanent https://mydomain.com/
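The trailing slash matters because a prefix redirect appends whatever follows the matched prefix to the target. The Python sketch below is a simplified model of that substitution, not Apache's actual code. It assumes the rule redirected from the prefix "/" (the directive above is shown without its URL-path argument), and `redirect_location` is a hypothetical name.

```python
from typing import Optional


def redirect_location(prefix: str, target: str, request_path: str) -> Optional[str]:
    """Model a prefix redirect: target + the part of the path after the prefix.

    Returns None when the request path does not match the prefix.
    """
    if prefix.endswith("/"):
        matched = request_path.startswith(prefix)
    else:
        # A prefix such as /foo matches /foo and /foo/bar, but not /foobar.
        matched = request_path == prefix or request_path.startswith(prefix + "/")
    if not matched:
        return None
    return target + request_path[len(prefix):]


if __name__ == "__main__":
    # Assumed prefix "/": without the final slash the host and path run together.
    print(redirect_location("/", "https://mydomain.com", "/page"))
    print(redirect_location("/", "https://mydomain.com/", "/page"))
```

With the slash in place, the target always ends the hostname before anything else is appended.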

Level of obscurity of destination URLs via mod_rewrite

To achieve a single layer of content delivery security, I'm looking into the possibility of obscuring a resource URL via an .htaccess RewriteRule:
RewriteEngine on
RewriteBase /js/
RewriteRule obscure-alias\.js http://example.com/sensitive.js
It would of course be implemented as:
<script type="text/javascript" src="obscure-alias.js"></script>
Because this is not a 301 redirect, but rather a routing scenario similar to that of many of the frameworks we use today, would it be safe to say that this RewriteRule adequately obfuscates the actual URL where this resource is located, or:
Can the destination URL still be found out via some HTTP header sniffing utility
Might a web browser be able to reveal the "Download URL"
I'm going to pre-answer my own questions by saying no to both since the "internal proxy" is taking place on the server-side and not on the client side if I understand it correctly: http://httpd.apache.org/docs/current/mod/mod_rewrite.html. I just wanted to confirm that when Apache goes to serve the destination URL, that it also isn't passing along information to the user agent what the URL was that it rewrote the original request as.
It depends on how you specify the redirect target.
If your http://example.com/ is running on the same server, there will be an internal redirect that is invisible to the client. From the manual:
Absolute URL
If an absolute URL is specified, mod_rewrite checks to see whether the hostname matches the current host. If it does, the scheme and hostname are stripped out and the resulting path is treated as a URL-path. Otherwise, an external redirect is performed for the given URL. To force an external redirect back to the current host, see the [R] flag below.
If the absolute URL points to a remote domain, a header redirect will be performed. A header redirect is visible to the client and will reveal the sensitive location.
To make sure no external redirect takes place, specify a relative URL like
RewriteRule obscure-alias\.js sensitive.js
Note that the sensitive JS file's URL can still be guessed.
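The decision described in the manual excerpt can be sketched roughly as follows. This is a simplified model in Python, not mod_rewrite's implementation: it ignores flags such as [R] and [P] as well as ports, and the function name `substitution_kind` is invented for the example.

```python
from urllib.parse import urlsplit


def substitution_kind(substitution: str, current_host: str) -> str:
    """Classify a RewriteRule substitution as 'internal' or 'external'.

    Relative substitutions and absolute URLs on the current host are handled
    internally; absolute URLs on another host cause an external redirect.
    """
    parts = urlsplit(substitution)
    if not parts.scheme:
        return "internal"  # relative path, e.g. sensitive.js
    if (parts.hostname or "").lower() == current_host.lower():
        return "internal"  # scheme and hostname are stripped, path is reused
    return "external"      # the client sees the real location


if __name__ == "__main__":
    print(substitution_kind("sensitive.js", "example.com"))
    print(substitution_kind("http://example.com/sensitive.js", "example.com"))
    print(substitution_kind("http://example.com/sensitive.js", "www.other.example"))
```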
To find out whether a request results in a header redirect, log in to a terminal (e.g. on a Linux server) and run
wget --server-response http://www.example.com
If the first HTTP/.... line (there may be more than one) carries a status code in the 3xx range, like
HTTP request sent, awaiting response...
HTTP/1.1 302 Moved Temporarily
you are looking at a header redirect.
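If you want to script that check over saved wget output, a minimal Python sketch could look like this. The function name `is_header_redirect` is hypothetical, and it only inspects a single status line passed in as text; it does not make any network request itself.

```python
def is_header_redirect(status_line: str) -> bool:
    """Return True if an HTTP status line carries a 3xx status code."""
    parts = status_line.split()
    # Expect something like: HTTP/1.1 302 Moved Temporarily
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        return False
    code = parts[1]
    return len(code) == 3 and code.isdigit() and code.startswith("3")


if __name__ == "__main__":
    print(is_header_redirect("HTTP/1.1 302 Moved Temporarily"))
    print(is_header_redirect("HTTP/1.1 200 OK"))
```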
This is possible using proxy throughput (the [P] flag of RewriteRule).
See http://httpd.apache.org/docs/2.4/rewrite/proxy.html
Also alluded to here as well: mod_rewrite not working as internal proxy

Preventing trailing slash on domain name

I want my site to show up as www.mysite.com, not www.mysite.com/
Does Apache add a trailing slash after a domain name by default, or does the browser append it? If I want to prevent this using an .htaccess, what would the url rewrite rule be?
If you request:
http://myhost.com
The request needs to look like this in HTTP:
GET / HTTP/1.0
Host: myhost.com
For historical reasons, some browsers did append the slash because otherwise it translates to
GET <nothing> HTTP/1.0
Host: myhost.com
Which would be an illegal request.
Note that:
http://myhost.com/page
is legal, because it translates to:
GET /page HTTP/1.0
Host: myhost.com
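The translation above can be reproduced with a short Python sketch. It only builds the request text and sends nothing; the function name `build_request` is made up for the example, and HTTP/1.0 is used simply to match the requests shown above.

```python
from urllib.parse import urlsplit


def build_request(url: str) -> str:
    """Build the raw GET request a client would send for the given URL."""
    parts = urlsplit(url)
    # An empty path is not a legal request target, so it becomes "/".
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return f"GET {target} HTTP/1.0\r\nHost: {parts.netloc}\r\n\r\n"


if __name__ == "__main__":
    print(build_request("http://myhost.com"))
    print(build_request("http://myhost.com/page"))
```

Both http://myhost.com and http://myhost.com/ produce the same request, so the server cannot tell them apart.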
http://www.searchenginejournal.com/linking-issues-why-a-trailing-slash-in-the-url-does-matter/13021/
http://www.alistapart.com/articles/slashforward/
URLs were initially used to model directories, so the trailing slash was required. I think if you don't have the trailing slash some webservers will not be able to find the content correctly.
The browser adds this slash automatically when requesting the URL. How it is displayed in the address bar is a different story.
For example: www.adobe.com -- type it in different browsers and see how they will display it:
Firefox (Windows, 6.0.2) = http://www.adobe.com/
Google Chrome (Windows, 13.0.782.220 m) = www.adobe.com
Opera (Windows 11.51) = www.adobe.com
Internet Explorer 9 = http://www.adobe.com/
As explained by Anthony's first link, the slash is part of the address. Every domain (and not just "the vast majority") has a name resembling www.mysite.com, but this is just a domain name, not a URL. A URL is the address of a file, i.e. protocol + domain name + file path, so http://www.mysite.com/ is a URL (DirectoryIndex supplies the missing filename), whereas http://www.mysite.com on its own doesn't mean anything, since in this case the file path would be empty.
The fact that your browser doesn't display the boring parts of your URL is not related to your website's configuration.
If the same browser really does behave differently on different websites, I would be curious to know which browser and which websites you used.