htaccess: remove cgit.cgi from path - apache

So I just installed cgit on a shared host. I custom-compiled it and symlink $HOME/mydomain-and-public-www-folder.tld/g to $HOME/local/lib/cgit/prod, which in turn links to $HOME/local/lib/cgit/${git describe}, so that I can point prod at one of my updated builds to test the latest version. What does this include?
[sharedhost]$ pwd
$HOME/local/lib/cgit/prod
[sharedhost]$ ls
.htaccess cgit.cgi cgit.css cgit.png lib/
Now, that looks all nice and good on the landing page: when I put https://mydomain.tld/g/ in the browser, I see the pretty cgit interface. When I click any link, I get the proper repos, but all the formatting is gone (the CSS, PNG, and JS files go bye-bye), and all the links are in the form https://mydomain.tld/g/cgit.cgi/randomrepo.git. Of course the files in the prod path stop working, because the browser looks for things like https://mydomain.tld/g/cgit.cgi/cgit.css instead of the needed https://mydomain.tld/g/cgit.css. This was my basic .htaccess file to get it working:
[sharedhost]$ more .htaccess
# GIT BEGIN ###########################################################
Options +Indexes +FollowSymLinks +ExecCGI
Action fastcgi-script cgit.cgi
SetEnv HTTP_CGIT_CONFIG /home/username/local/lib/cgit/cgit-conf/cgitrc
RewriteEngine On
DirectoryIndex cgit.cgi
# GIT END ############################################################
# AUTHENTICATION BEGIN ###############################################
AuthType Digest
AuthName "cgitdigestdomain"
AuthDigestDomain /cgitdigestdomain/
AuthUserFile /home/username/local/lib/cgit/cgit-conf/.htpasswd
Require valid-user
# AUTHENTICATION END ################################################
I have tried a whole bunch of recommended rewrite patterns like this [1] or that [2]. I am relatively new to more advanced .htaccess rules, so can someone point out how to remove cgit.cgi from the URLs with mod_rewrite, while ensuring that the CSS and PNG files in the same directory stay accessible and that query strings (QSA) are handled properly? Sorry for the long post. I thought more detail would show why the obvious how-tos on this were not working for me.
[1] Remove 'index.php' from URL with .htaccess
[2] htaccess remove subdirectory from url
UPDATE:
Both answers seem good and get farther than before, but I do not think I understand the virtual-root and scan-path features, and the critical part is still missing. Any URL to a repo (the repos live in a far removed path, $HOME/data/scm/priv) is not getting rewritten properly, and I get a 404.
[Fri Jul 13 01:16:18 2012] [error] [client 127.0.0.1] File does not exist: $HOME/mydomain-and-public-www-folder.tld/missing.html, referer: https://domain.tld/g/
[Fri Jul 13 01:24:20 2012] [error] [client 127.0.0.1] File does not exist: $HOME/mydomain-and-public-www-folder.tld/failed_auth.html, referer: https://domain.tld/g/
[Fri Jul 13 01:24:21 2012] [error] [client 127.0.0.1] File does not exist: $HOME/mydomain-and-public-www-folder.tld/g/bicon.git, referer: https://domain.tld/g/
[Fri Jul 13 01:24:21 2012] [error] [client 127.0.0.1] File does not exist: $HOME/mydomain-and-public-www-folder.tld/missing.html, referer: https://domain.tld/g/
[Fri Jul 13 01:26:11 2012] [error] [client 127.0.0.1] File does not exist: $HOME/mydomain-and-public-www-folder.tld/g/admin-scripts.git, referer: https://domain.tld/g/
[Fri Jul 13 01:26:11 2012] [error] [client 127.0.0.1] File does not exist: $HOME/mydomain-and-public-www-folder.tld/missing.html, referer: https://domain.tld/g/
Names/IPs changed to protect the innocent, but you get the point. So it appears the repo requests are not going back to the cgit CGI, so where are they going? I am working on getting better rewrite logs. We shall see.
UPDATE 2:
And of course, RewriteLog is a directive you cannot put in .htaccess, as I forgot, silly me. Not sure what to do now on shared hosting. How obnoxious. This is one of several issues that pushed me to look into VPS hosting, bite the money bullet, and use this shared host only for really stupid stuff.
P.S.: I did email the original issue as is to the cgit dev list and never got a response on this either. So either they do not care or they feel it is a moronic question; not sure. Haha.

Usually this kind of problem is much easier to solve if you have some help from your tool. And you do.
According to this page, you can set the virtual-root setting (in your case to "/g") and use these rules:
RewriteRule ^/g/$ /g/cgit.cgi [L,QSA]
RewriteRule ^/g/([^/]+)/$ /g/cgit.cgi?r=$1 [L,QSA]
RewriteRule ^/g/([^/]+)/([^/]+)/$ /g/cgit.cgi?r=$1&p=$2 [L,QSA]
I adjusted the paths to what I think would be right for you, but I'm not sure about the recursiveness here. It should work (because of the L), but if it does not, I suggest you rename the folder /g to something like /git and use that in the paths (e.g. /git/cgit.cgi).
As for the css and js files, you will need another rule, something like this:
RewriteCond %{DOCUMENT_ROOT}/g/$1 -f
RewriteRule ^/g/(.+)$ - [L]
This should leave the url untouched if the file exists. Put this rule before the rules above.
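If these rules go into the .htaccess file inside the /g/ directory rather than into the server config, mod_rewrite matches each pattern against the path relative to that directory, without the leading slash, so ^/g/ would never match there. A minimal sketch of the same idea in per-directory form, assuming virtual-root=/g/ is set in cgitrc and that cgit.css and cgit.png sit next to cgit.cgi (the r= and p= parameters are taken from the rules above and are not verified against this cgit build):

```apache
# Sketch for the .htaccess file in the /g/ directory (per-directory context).
RewriteEngine On

# Serve real files (cgit.css, cgit.png, cgit.cgi itself) untouched.
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^ - [L]

# Patterns here see the path relative to /g/, with no leading slash.
RewriteRule ^$ /g/cgit.cgi [L,QSA]
RewriteRule ^([^/]+)/$ /g/cgit.cgi?r=$1 [L,QSA]
RewriteRule ^([^/]+)/([^/]+)/$ /g/cgit.cgi?r=$1&p=$2 [L,QSA]
```

Because the file check comes first, the rewritten /g/cgit.cgi request is passed through on the next pass instead of being rewritten again.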
Update:
I was hoping adding virtual-root would change the links on the page to the desired form; I suggest you double check that cgit.cgi has picked up the configuration.
However, if you would want to work around that, try this rule:
RewriteCond %{DOCUMENT_ROOT}/g/$1 -f
RewriteRule ^/g/cgit.cgi/(.+)$ /g/$1 [L]

Fozi's answer didn't quite work in my case. These rewrite rules work well with virtual-root=/ in /etc/cgitrc.
<Directory /var/www/cgit>
RewriteEngine on
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule (.*) /cgit.cgi/$1
</Directory>
Make sure to restart your web server instead of reloading it when messing with cgit, because it seems to cache some things.
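For the layout in the question, where cgit lives under /g/ and only .htaccess is available, the same idea could look roughly like this. It is a sketch, assuming virtual-root=/g/ in cgitrc; the RewriteBase value and the [L] flag are additions that are not part of the answer above:

```apache
# Sketch for the .htaccess file in the /g/ directory.
RewriteEngine On
# Relative substitutions below are resolved against /g/.
RewriteBase /g/
# Leave real files alone: cgit.cgi itself, cgit.css, cgit.png.
RewriteCond %{REQUEST_FILENAME} !-f
# Everything else goes to cgit as PATH_INFO; the query string is kept automatically.
RewriteRule ^(.*)$ cgit.cgi/$1 [L]
```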

Does this work?
# Redirect the browser to remove the "cgit.cgi" from the URL's address bar
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /g/cgit\.cgi/
RewriteRule ^g/cgit\.cgi/(.+)\.git /g/$1.git [R=301,L]
# Internally add the "cgit.cgi" so that it gets served properly
RewriteCond %{REQUEST_URI} !^/g/cgit\.cgi/
RewriteRule ^g/(.+)\.git /g/cgit.cgi/$1.git [L]
You'd probably want to insert that somewhere after the RewriteEngine On. The first rule checks whether the original request starts with /g/cgit.cgi/ and the path contains .git, and if so, redirects with the cgit.cgi removed. This makes relative links resolve against a base of /g/ instead of /g/cgit.cgi/. Then we need to change the URLs back internally, which is where the second rule comes in: if the request doesn't include cgit.cgi, it is added back internally, as long as the URI contains .git.
EDIT: I may have misunderstood what you meant by "how to remove cgit.cgi from the URLs with mod-rewrite"
See if these rules work for you:
RewriteCond %{REQUEST_URI} ^/g/cgit\.cgi/(.+)$
RewriteCond %{DOCUMENT_ROOT}/g/%1 -f
RewriteRule ^g/cgit\.cgi/(.+)$ /g/$1 [L]
The first condition checks if the request URI starts with /g/cgit.cgi/ and makes a backreference to whatever is after it. Then it checks if there is physically a file that exists if the cgit.cgi part is removed, and if so, rewrite the URI so that it's removed.
This way, requests for your CSS, JS, and images that sit in the /g/ directory, but are linked relatively from URLs like /g/cgit.cgi/something.git, will resolve internally as long as the referenced resource is actually in /g/.

Related

AEM Apache Dispatcher 2.4.6 client denied by server configuration

I have an AEM 6.3 instance running behind an Apache 2.4.6 instance with the Dispatcher module. All is good, but now I need to wipe out all query params for all URLs that end with ".html".
This may sound simple to accomplish, but I came across an issue I can't resolve. This is the rewrite rule I'm using to remove all the query params from URLs ending in .html:
RewriteRule ^/(.*)\.html$ /$1.html [QSD]
Technically, one could see this rewrite as not a rewrite actually, because it is sending the original request to the same URL, but the flag QSD is for dropping all query params.
The problem is, if I reload my Apache instance with this rule included, I start getting errors like this:
[Wed Jun 10 14:53:35.698908 2020] [authz_core:error] [pid 31733] [client 54.209.162.6:61649] AH01630: client denied by server configuration: /etc/clientlibs, referer: https://my.domain.com/etc/clientlibs/mygroup/some/simple/page.html
I know some people had issues like this when migrating from Apache 2.2 to 2.4. This is not my case, and I have also checked my vhost configuration. I don't have directives from Apache 2.2 like "Order deny,allow" or "Allow from all". I'm using "Require all granted".
One weird thing in AEM logs, is that when my Rewrite rule is not in place, I can see error.log logging that "/etc/clientlibs/mygroup/some/simple/page.html" is found. But if I put the rule and reload Apache, I see this from logs:
10.06.2020 10:16:40.085 *INFO* [54.209.162.6 [1591798600081] GET /etc/clientlibs/mygroup/some/simple/page/jcr:content.json HTTP/1.1] org.apache.sling.engine.impl.SlingRequestProcessorImpl service: Resource /etc/clientlibs/mygroup/some/simple/page/jcr:content.json not found
It is as if the .html extension were stripped from the URL, and since there is no extension, AEM (or rather Sling) falls back to the default content resolver, which is JSON.
Why don't you just use
RewriteRule ^ %{REQUEST_URI} [L,R,QSD]
(maybe the redirect is not needed in your case... but it makes things clear to the browser).
Or, if you just want to make sure that your request is cached in the dispatcher and not passed through to AEM each time, use:
/filter {
/0001 { /type "deny" /method "POST" /url "/etc/*" }
/0002 { /type "allow" /method "GET" /url "/etc/*" /query "a=*" }
}
in your dispatcher config (see https://docs.adobe.com/content/help/en/experience-manager-dispatcher/using/configuring/dispatcher-configuration.html for details).
I finally was able to fix my issue. Even though I still don't understand the full picture. This is my final condition and rule:
RewriteCond %{QUERY_STRING} ^.
RewriteRule ^/(.*)\.html$ /$1.html [QSD,PT]
Adding "PT" along with "QSD" makes Apache stop returning the "client denied" error. The condition on QUERY_STRING is just there to make sure Apache only manipulates requests that really have query params in the URL, or technically at least one character.
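A likely explanation, going by how the mod_rewrite documentation describes substitutions in server context: when the substitution starts with a slash, mod_rewrite checks whether its first path segment exists at the root of the file system, and if it does, the result is treated as a file-system path. /etc exists on most servers, so /etc/clientlibs/... would be read as the real /etc directory, which the access rules deny. PT forces the result to be handled as a URL-path again. The same rule with that reasoning as comments (placement in the virtual host is an assumption):

```apache
# Sketch for server or <VirtualHost> context.
RewriteEngine On
# Only touch requests that actually carry a query string.
RewriteCond %{QUERY_STRING} .
# QSD drops the query string. PT passes the result back to URL mapping as a
# URL-path, so "/etc/..." is not mistaken for the server's own /etc directory.
RewriteRule ^/(.*)\.html$ /$1.html [QSD,PT]
```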
What about adding a condition just before the rewrite rule so that the rule is skipped for /etc/clientlibs?
RewriteCond %{REQUEST_URI} !^/etc/clientlibs.*

Apache 2.4 RewriteMap never matches

I'm using an Apache RewriteMap to permanently redirect 60 or so urls. In one development environment, the below configuration works flawlessly while in another development environment it doesn't work at all. Most notably, the last RewriteCond never passes and unfortunately the logging options I've attempted are no help. With LogLevel debug rewrite:trace8, I can see that the RewriteCond just before the map is receiving the expected input, but the map nonetheless returns no match:
[Thu Apr 19 19:35:19.109789 2018] [rewrite:trace4] [pid 11188] mod_rewrite.c(470): [client 127.0.0.1:62369] 127.0.0.1 - - [server.dev/sid#7f7719da0d50][rid#7f7719fc8000/initial] [perdir /html/path/] RewriteCond: input='/help_center/help_center.php?' pattern='^/?(.*[^\\?])\\??/?$' => matched
[Thu Apr 19 19:35:19.109793 2018] [rewrite:trace4] [pid 11188] mod_rewrite.c(470): [client 127.0.0.1:62369] 127.0.0.1 - - [server.dev/sid#7f7719da0d50][rid#7f7719fc8000/initial] [perdir /html/path/] RewriteCond: input='NOTFOUND' pattern='!NOTFOUND' [NC] => not-matched
For debugging purposes I've simplified things to pass a constant key into the map, but the map nonetheless returns no substitute value. I've also tried simplifying the map file, adding all the possible variations of a key it could receive (with and without leading/trailing slashes and ?). I've tried renaming the map file and extension, renaming the map itself, moving the map file outside of the public directory, all with no change in results. The map file is readable as is the directory it's in, Apache starts up error-free with the config and yes I've been restarting it when testing config changes.
What's left to try? Both systems run CentOS 7, Apache 2.4, one works and one doesn't. Configs below
Apache server-level config declaring map
RewriteMap help_center txt:/path/to/rewritemap/help_center.map
.htaccess
RewriteEngine on
RewriteCond %{REQUEST_URI} !(\.(js|css|less|png|swf|flv|jpg|svg|ico))$
RewriteCond %{REQUEST_URI}?%{QUERY_STRING} ^/?(.*[^\?])\??/?$
RewriteCond ${help_center:%1|NOTFOUND} !NOTFOUND [NC]
RewriteRule ^.*$ /${help_center:%1}/ [QSD,L,NC,R=301]
Abbreviated contents of help_center.map
help_center/help_center.php help-center
help_center/help_center.php?1_7_q-1 help-center/article/authorized-dealer
pages/appliance_installation help-center/article/installation-services
UPDATE
After 6 hours of debugging, I've finally been able to get the RewriteMap to match. By moving the RewriteMap directive to the <VirtualHost _default_:443> scope of /etc/conf.d/ssl.conf, urls are rewriting as expected.
Why is this the case?
The config between these two environments is very similar with the exception of virtual hosting. Both environments run SSL and redirect all requests to the SSL site. The environment that works is configured without a name based virtual host config (i.e. one site for the server) while the environment that didn't work is running name based virtual hosting. I have a single hypothesis related to this:
While I'd expect the root directive to apply to the default site and all virtual sites (including SSL), perhaps the RewriteMap directive must be virtual host scoped to be referenced by a virtual host. Not sure this makes sense, but I couldn't find any documentation to clarify. After testing, turning off NameVirtualHost changes nothing :/
Related: It seems that referencing a non-existent RewriteMap logs no error. Perhaps it was buried while I had debug logging turned on, but is there a config for logging RewriteMap reference errors? That would have helped narrow my debugging much sooner.
Finally, for examining parsed Apache config run this from command line: httpd -DDUMP_CONFIG -k start or to view in vi httpd -DDUMP_CONFIG -k start | vi -. I was able to use this to confirm that my RewriteMap directive was being loaded despite it still not matching.
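The hypothesis matches the documented behaviour of mod_rewrite: a virtual host does not inherit the main server's rewrite configuration, maps included, unless it asks for it. A sketch of the two options, reusing the map name and path from the question (only one of the two is needed):

```apache
<VirtualHost _default_:443>
    RewriteEngine On

    # Option 1: declare the map inside the virtual host that uses it.
    RewriteMap help_center txt:/path/to/rewritemap/help_center.map

    # Option 2: keep the map at server level and inherit the main server's
    # maps, conditions and rules instead (commented out here).
    # RewriteOptions Inherit
</VirtualHost>
```

Either way the .htaccess rules stay unchanged, since RewriteMap itself cannot be declared in .htaccess.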
I'm not entirely sure this should qualify as an answer, but I THINK this is what made it work: remove the QSD flag and append a ? to discard the query string manually. This change was made because we have a server running 2.2 that would error out entirely (rather than not match), and it seems that a side effect is that our 2.4 servers are now matching. I'm sure it's not voodoo, but unfortunately I can't explain it with confidence.
## QSD not available in apache 2.2, add a ? to the end of the rewrite to discard
RewriteRule ^.*$ /${help_center:%1}/? [L,NC,R=301]

RewriteRule not working, 404 error obtained

Using Apache/2.2.31, I created the following rule on my '.htaccess' file:
RewriteEngine on
RewriteRule ^([^/]+)/$ /do.php?label=$1 [L]
However, when accessing 'http://foo.com/whatever/', I get a 404 error message. I've checked my error log and:
[Mon Jun 27 11:12:15 2016] [error] [client 192.168.1.132] File does not exist: /path-to-web/whatever, referer: http://foo.com/
'http://foo.com' works ok, and 'http://foo.com/do.php?label=whatever' works ok as well.
I've checked phpinfo, and 'mod_rewrite' is loaded in Apache. Additionally, I tried to enable it via 'httpd.conf', but Apache tells me that "module rewrite_module is built-in and can't be loaded".
What am I doing wrong? This same '.htaccess' works ok in two other Apache servers.
I tried with "Options +FollowSymLinks -MultiViews" as well
Thank you very much

Help Needed With (I thought) Basic mod_rewrite Setup

I run my own server at home. I am attempting to do a mod_rewrite but I cannot seem to get it to work correctly.
Say my url is "http://www.site.com/". I have a user set up, whose public_html directory is serving pages; let's say that url is "http://www.site.com/~myUser/theDirectory/". What I am trying to do is set it up so that when you type in "http://www.site.com/theDirectory/" it will actually serve the pages out of "http://www.site.com/~myUser/theDirectory/", but look like it's coming from "http://www.site.com/theDirectory/". I edited my /etc/apache2/sites-enabled/default file and added these lines:
RewriteEngine On
RewriteRule ^theDirectory/$ /home/myUser/public_html/~myUser/theDirectory/
I also tried various versions of the rule:
RewriteRule ^/theDirectory/$ /home/myUser/public_html/theDirectory/
RewriteRule ^/var/www/html/theDirectory/$ /home/myUser/public_html/theDirectory/
I also made sure that the rewrite module was enabled. At first, I was getting this error:
[Fri Jun 17 18:11:35 2011] [error] [client xxx.xxx.xxx.xxx] File does not exist: /var/www/theDirectory
So I created that file, and now I am getting this error:
[Fri Jun 17 23:15:45 2011] [error] [client xxx.xxx.xxx.xxx] Directory index forbidden by Options directive: /var/www/theDirectory/
So, I'm not really sure where to go from here. Any and all advice will be appreciated. Thanks for taking the time to read.
Have a great day :-)
Add the following to your virtual host configuration:
RewriteEngine on
RewriteRule ^/theDirectory(/.*)$ /home/myUser/public_html/theDirectory$1 [L]

mod_rewrite generating errors in log

On my site I have mod_rewrite rules to make the URLs more search engine friendly, and it all works fine on the frontend, but I'm getting errors in the error log like this
[Thu Jan 22 22:51:36 2009] [error] [client {IP ADDRESS HERE}] File does not exist: /{some rewritten directory}
The rules I'm using are rather simple, along the lines of
RewriteRule ^pages/(.*)_(.*).html$ page.php?id=$2
Is there a way to avoid these errors?
MultiViews could cause this. If it is enabled, Apache tries to find a file similar to the requested URI before passing the request along to mod_rewrite. So try to disable it:
Options -MultiViews
I don't think those errors have anything to do with mod_rewrite, they're just saying that a file doesn't exist. Plain old 404 errors.
Incidentally, shouldn't rewrite patterns normally start with a slash? Like so:
RewriteRule ^/pages/(.*)_(.*).html$ /page.php?id=$2
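Whether the pattern starts with a slash depends on where the rule lives. In .htaccess or <Directory> context the directory prefix, including the leading slash, is stripped before matching; in server or virtual-host context the pattern sees the full URL-path. A sketch of the rule from the question in both forms (the escaped dot and the [L] and [PT] flags are additions, and the .htaccess is assumed to sit in the document root):

```apache
# Form 1: in .htaccess at the document root (per-directory context).
RewriteRule ^pages/(.*)_(.*)\.html$ page.php?id=$2 [L]

# Form 2: in the main server or <VirtualHost> config instead.
# RewriteRule ^/pages/(.*)_(.*)\.html$ /page.php?id=$2 [L,PT]
```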