Apache LocationMatch named regular expression strange behavior - apache

I'm trying to create a dynamic Apache config for lots of directories with OpenID auth, but I never got it to work. I think there's something wrong with the named regexp, but I'm not sure.
Here's my virtualhost config:
AliasMatch ^/backup/(.*)$ /user_server_backups/$1
<LocationMatch "^/backup/(?<sitename>[^/]+)">
Require claim "roles:%{env:MATCH_SITENAME}"
AuthType openid-connect
</LocationMatch>
Whenever I try to access it, I get a 401.
I also tried using a numbered regexp, but as described in the docs, numbered regexps are ignored.

You'll need to use at least version 2.4.2.1 of mod_auth_openidc, see: https://github.com/zmartzone/mod_auth_openidc/pull/469

Related

named group backreference in LocationMatch is not recognized in ProxyPassMatch

My English is rather poor and this is my first question, so hopefully I am doing it the right way ;-)
I use Apache HTTPD 2.4.41 (Win64) and I wanted to use the following LocationMatch rule:
<LocationMatch "^/es/(?<ind>.*)/_search$">
AllowMethods GET POST
ProxyPassMatch http://localhost:9200/%{MATCH_IND}/_search
ProxyPassReverse http://localhost:9200
</LocationMatch>
The rule seems to match, as I receive a response from the backend server (ElasticSearch). The response body shows that something did not work with the named group backreference:
GET /es/archives/_search
{
"error": "no handler found for uri [/%25%7BMATCH_IND%7D/_search/es/archives/_search] and method [POST]"
}
It seems that the named group backreference has not been recognized and has been passed right through to the backend server without being interpreted.
At least, the original URL has been appended (as stated in the doc). As a workaround I could even leave it like this, but in my opinion, it is not the right way to achieve this.
Any ideas why neither the named group backreference nor the variable seems to be recognized by Apache? My Apache version (2.4.41) should be fine, as named group backreferences were introduced in version 2.4.8.
I literally spent hours on Stack Overflow and Google searching for a similar situation, but nothing helped so far.
Hope someone can help!
It seems that the <LocationMatch> documentation is a bit vague when it comes to using the matched expression with ProxyPass and ProxyPassMatch. The %{MATCH_*} expressions do not seem to work with them. However, it seems that back-references (i.e. $1) do work. So you probably want something like:
<LocationMatch "^/es/(?<ind>.*)/_search$">
AllowMethods GET POST
ProxyPassMatch http://localhost:9200/$1/_search
ProxyPassReverse http://localhost:9200
</LocationMatch>
Note that named groups need to be used in the regex, otherwise the back-reference won't be populated.
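To check that the named group is really being captured, even though ProxyPassMatch does not expand %{MATCH_IND}, one option is to echo the variable back in a response header. This is only a debugging sketch: it assumes mod_headers is loaded, and the header name X-Matched-Index is made up for illustration.

```apache
<LocationMatch "^/es/(?<ind>.*)/_search$">
    # Since 2.4.8, LocationMatch exports the named group "ind" as the
    # environment variable MATCH_IND.
    # %{MATCH_IND}e is the mod_headers format specifier for an
    # environment variable. X-Matched-Index is a hypothetical header name.
    Header set X-Matched-Index "%{MATCH_IND}e"
</LocationMatch>
```

If the header comes back with the index name, the capture itself works and the problem is limited to how ProxyPassMatch interprets its URL argument.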

Apache 2.2 Allow from env=_variable_

I have an Apache 2.2 set up with LDAP Authorization, which is working fantastically as expected, and have also made it so that I can bypass Authentication when accessing it locally.
Allow from localIP hostnameA hostnameB, etc...
If I curl from the server, I don't get any Auth Required. So all good and working as expected.
What I need now is to make one particular URL to also bypass authorisation.
I have tried the usual solution of using SetEnvIf:
SetEnvIf Request_URI "^/calendar/export" bypassauth=true
Allow from env=bypassauth IP_ADDRESS HOSTNAME_A HOSTNAME_B
But this is just not working!!
Local access is still unrestricted, but remotely it is not (no change there)
If I dump out my server environment variables on that URL's script, I can see my bypassauth variable is being passed.
I just cannot for the life of me figure out why the Allow from env=bypassauth part is not working, while it still obeys the additional directive parameters.
I also tried another suggestion, using the Location directive;
<Location /calendar/export>
Satisfy Any
Allow from all
AuthType None
SetEnv WTF 123
</Location>
Again, I can see my new environmental variable (WTF) appear on this URL (when I dumped the server envs in the script), so I know that the SetEnv and SetEnvIf directives are working.
Is there anything I'm missing (any Apache2.2 quirks?), as all the solutions I've seen so far just are not working. It's as if my Allow from changes are having no effect after restarting Apache. I'm starting to feel my sanity slip.
Is there also a particular order for writing the Satisfy Any, Order allow,deny and Auth* directives, which might be affecting this?
Finally managed to figure it out!! :)
It seems my URL was being processed by mod_rewrite (my environment variable being prefixed with REDIRECT_ should have rung alarm bells). According to this post, https://stackoverflow.com/a/23094842/4800587, mod_rewrite is performed AFTER our SetEnvIf and Allow directives.
Anyway, long story short: I used the rewritten (final) URL in the Location section to bypass authentication with the Allow from all directive. So I changed...
<Location "/calendar/export">
Allow from all
</Location>
to:
<Location "/calendar/index.php/export">
Allow from all
</Location>
which is the final URL (after rewrite), and now works.
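For anyone asking about the full set of directives, a minimal Apache 2.2 sketch of the bypass might look like the following. It assumes the LDAP Auth* and Require directives are configured in an enclosing scope, and that /calendar/index.php/export is the URL after rewriting, as described above.

```apache
# Apache 2.2 syntax (Order/Allow/Satisfy), not the 2.4 "Require" syntax.
<Location "/calendar/index.php/export">
    Order allow,deny
    Allow from all
    # Satisfy Any: passing EITHER the host-based check OR authentication
    # is enough, so "Allow from all" lets every client in without a login.
    Satisfy Any
</Location>
```

The key point is that the Location path has to be the one Apache sees after mod_rewrite has run, not the one the client typed.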

Using RedirectMatch with HTTP_HOST in the destination

I keep reading that, where possible, I should not be using mod_rewrite. As such, I am trying to do an HTTP to HTTPS redirect with RedirectMatch.
Question: How can I use RedirectMatch and use Apache server variables (such as %{HTTP_HOST}) in the URL parameter?
This code fails to return a response to the client (Chrome):
RedirectMatch ^(.*) https://%{HTTP_HOST}/$1
I recently asked a similar question to this, but it may have been too wordy and lacks direction for an answer: Redirecting http traffic to https in Apache without using mod_rewrite
If you're using 2.4.19 or later, the Redirect directive has a somewhat obscure feature: putting it inside a Location or LocationMatch will enable expression syntax.
So your example can be written as
<LocationMatch ^(?<PATH>.*)>
Redirect "https://%{HTTP_HOST}%{env:MATCH_PATH}"
</LocationMatch>
(Here, the ?<PATH> notation means that the match capture will be saved to an environment variable with the name MATCH_PATH. That's how we can use it later in the Redirect.)
It's even easier if you always redirect using the entire request path, because you can replace the capture group entirely with the REQUEST_URI variable:
<Location "/">
Redirect "https://%{HTTP_HOST}%{REQUEST_URI}"
</Location>
Now, is this easier to maintain/understand than just using mod_rewrite for this one case? Maybe not. But it's an option.
No, you can't use variables of that type with Redirect/RedirectMatch. If you need variables, such as %{HTTP_HOST}, use mod_rewrite.
Note: I commend you for not trying to use mod_rewrite right away, because most people will go for mod_rewrite even for the simplest of redirections, which is clearly overkill and most times it is just looking to complicate things unnecessarily.
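For comparison, the mod_rewrite route mentioned here is usually written along these lines. This is a sketch that assumes mod_rewrite is loaded and that the rules sit in the plain-HTTP virtual host (or server config), not in an .htaccess file.

```apache
RewriteEngine On
# Only redirect requests that did not arrive over TLS.
RewriteCond %{HTTPS} off
# Keep the requested host and path; R=301 marks the redirect as permanent.
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
```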
Writing this for users who might face the same issue in the future.
I'm not sure how you are adding vhost entries.
I guess these vhost entries are added automatically by some script.
Do you use the VirtualHost directive with ServerName?
<VirtualHost *:8080>
ServerName example.domain.com
</VirtualHost>
If so, then you can use the same domain value to populate the RedirectMatch target.
If you are adding vhost entries manually, just write that domain URL explicitly instead of HTTP_HOST.
Or let me know if it's a different scenario.
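As a sketch of that idea, with the host name written out instead of %{HTTP_HOST}: the port 80 listener is an assumption for a typical HTTP to HTTPS setup, and example.domain.com is the placeholder from the snippet above.

```apache
<VirtualHost *:80>
    ServerName example.domain.com
    # Redirect matches by path prefix; whatever follows "/" in the request
    # is appended to the target URL, so no variables are needed.
    Redirect permanent "/" "https://example.domain.com/"
</VirtualHost>
```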

How can LocationMatch and ProxyPassMatch be Combined?

I am setting up an Apache 2.4.6 server on an internal machine for testing purposes. One of the things that Apache server is supposed to do is act as a reverse-proxy for another server found on localhost:3030.
The server on localhost:3030 expects one of a few dataset names at its first path level (for now, the set consists only of the dataset experimental, but more will be added later), so I am trying to pass that through from the requested path.
In my vhost, this works:
<Location /experimental/>
ProxyPass http://localhost:3030/experimental/
ProxyPassReverse /
</Location>
For additional datasets, I could copy that and replace experimental with the other dataset names. Obviously, that leads to a lot of code duplication/redundancy, which is both a source of errors and a maintenance horror.
Therefore, I would like to become somewhat more flexible and treat several datasets in a single such block. This should be possible with the LocationMatch directive.
As indicated by this comment and this page, I need to replace ProxyPass with ProxyPassMatch when using it inside a LocationMatch block. Essentially, the docs state the same:
The same will occur inside a LocationMatch section, however ProxyPass does not interpret the regexp as such, so it is necessary to use ProxyPassMatch in this situation instead.
The LocationMatch docs explain:
From 2.4.8 onwards, named groups and backreferences are captured and written to the environment with the corresponding name prefixed with "MATCH_" and in upper case. This allows elements of URLs to be referenced from within expressions and modules like mod_rewrite. In order to prevent confusion, numbered (unnamed) backreferences are ignored. Use named groups instead.
That information is only valid as of Apache 2.4.8, which is presumably why the following does not work on my 2.4.6 installation:
<LocationMatch /(?<dataset>experimental)/>
ProxyPassMatch http://localhost:3030/%{env:MATCH_DATASET}/
ProxyPassReverse /
</LocationMatch>
On the other hand, this page and that posting imply that the numerical group index ($1) can be used. As the help text is valid only as of httpd 2.4.8, my suspicion (or hope) is that the numerical reference works before 2.4.8.
In any case, I have tried this:
<LocationMatch "/(experimental)/">
ProxyPassMatch http://localhost:3030/$1/
ProxyPassReverse /
</LocationMatch>
yet according to the logs, the internal call invokes http://localhost:3030/$1/ instead of http://localhost:3030/experimental/ when requesting the experimental path on the vhost URL.
The ProxyPassMatch docs only say:
When used inside a LocationMatch section, the first argument is omitted and the regexp is obtained from the LocationMatch.
However, the text does not bother to provide an example for how to combine LocationMatch and ProxyPassMatch. What am I doing wrong?
The doc also states: "When the URL parameter doesn't use any backreferences into the regular expression, the original URL will be appended to the URL parameter." This seems to be your case.
Furthermore, you are missing the host in your ProxyPassReverse directive.
This should work just fine:
<LocationMatch "^/experimental/.*$">
ProxyPassMatch http://localhost:3030
ProxyPassReverse http://localhost:3030
</LocationMatch>
Got this working on Apache 2.4.29:
<LocationMatch "/fruit/(?:apple|banana|pear)">
ProxyPass http://localhost:8080
ProxyPassReverse http://localhost:8080
</LocationMatch>
The URL called by Apache is for example
http://localhost:8080/fruit/apple
The (?: is crucial when you are using parentheses in this example.
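Another option that avoids the LocationMatch question altogether is to give ProxyPassMatch its own regular expression at vhost level, where numbered backreferences such as $1 are expanded. A minimal sketch, assuming "staging" as a hypothetical second dataset name:

```apache
# $1 is the dataset name, $2 is the rest of the path.
# "staging" is a made-up example of an additional dataset.
ProxyPassMatch "^/(experimental|staging)/(.*)$" "http://localhost:3030/$1/$2"
# Rewrite Location headers from the backend so they point at the proxy again.
ProxyPassReverse "/" "http://localhost:3030/"
```

Adding a dataset then means extending the alternation rather than copying a whole block.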

LocationMatch and DAV svn

I am trying to make our Subversion repository accessible via multiple URLs. To do so, I was thinking of using the LocationMatch directive. My configuration is:
<Location ~ "/(svn|repository)">
DAV svn
SVNPath /opt/svn
AuthzSVNAccessFile /etc/subversion/access
</Location>
The above configuration does NOT work. The strange thing is that if I use, for example, this configuration, it works well for both URLs:
<Location ~ "/(svn|repository)">
SetHandler server-status
</Location>
For me, it looks like the combination of DAV svn and LocationMatch does not really work, or am I doing something wrong here?
I too am having problems with this, as I wanted to use regexes to avoid other subpaths getting caught by my match.
e.g.
<LocationMatch "^/test/.*$">
is not the same as
<Location "/test">
as in the latter, http://site.com/newproduct/test would get caught by the last one, but not the first one. So would http://site.com/test/scripts . This is why LocationMatch exists, but it fails whenever I put in regexs. It appears to work if I use LocationMatch w/o any regular expressions though.
The problem seems to be that when you use regular expressions in your Location or LocationMatch section, the Apache server rewrites some metadata on the request with the content of the regular expression (possibly to let the handler that takes this request know that it was targeted by a regular expression).
When the dav_svn handler gets the request, it consults this metadata to resolve the path it needs to take to get the resource being asked. Because the regular expression is not a real path, you get errors like this:
svn: PROPFIND of '%5E/(svn%7Crepository)/!svn/vcc/default': Could not parse response status line
I don't have any fix for that, except not using regular expressions with dav_svn: in my case I wanted to use an XSLT formatter to show a nice UI for the subversion repository when accessing it using a web browser, and the XSL resources were supposed to be accessed on a different path on the same host name that hosts the subversion repo, so I wanted to use a regular expression Location to have the path to the XSL resources not hit the dav_svn handler. This was a bust, so instead I just deployed websvn on a different host name and that was that.
Does the client get an error, and is there an error in the HTTP error logs?
SVN may get confused if you map multiple locations to a single SVN repo. See http://subversion.apache.org/faq.html#http-301-error . I'm troubleshooting this for another problem right now.
Does it work if you remove the regular expression? I'll assume yes, but I wanted to verify.
<Location "/svn">
A fast solution that works: just add your reference to each vhost in which you want to make the SVN repo accessible.
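Following the "no regular expression" suggestion, the two URLs from the question could be written out as plain Location blocks, roughly as below. This is a sketch that reuses the paths from the question; whether Subversion clients are happy with two URLs for one repository is not confirmed here.

```apache
# One plain (non-regex) Location per URL, so dav_svn sees a real path.
<Location "/svn">
    DAV svn
    SVNPath /opt/svn
    AuthzSVNAccessFile /etc/subversion/access
</Location>

<Location "/repository">
    DAV svn
    SVNPath /opt/svn
    AuthzSVNAccessFile /etc/subversion/access
</Location>
```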