Named rules and local rules - snakemake

Version 7.3.8
I commonly structure my config.yaml such that public datasets can be automatically downloaded if not present using:
# config.yaml
paths:
  vcf: 'path/to/vcf_{chrom}.vcf'
  ...
urls:
  vcf: 'ftp://path_to_vcf_{chrom}.vcf'
  ...
# Snakefile
for key, url in config['urls'].items():
    rule:
        name: f'download_{key}'
        output: config['paths'][key]
        params: url=url
        shell: # do download
Since the addition of the name directive, I can give the rules meaningful names (which is great!). The issue is that I'd like to declare all of these download rules as localrules. I can use
# Snakefile
localrules:
    download_vcf,
    ...
where download_vcf is a token, not a string, so I have to manually keep that list up to date with my config. I'd like to programmatically add each url to local rules. I can do something like:
workflow._localrules.update(f'download_{key}' for key in config['urls'])
but I'd like to avoid using the private variable.
Any other recommendations? Is this something worth a feature request? Either a method to update localrules or a new directive localrule to replace rule (similar to checkpoint)? The more I think about it, the more it makes sense to label a rule as local instead of a separate localrules directive.

Related

OpenBSD's httpd daemon {block} directives not working

I'm trying to restrict access to some subfolders of a simple website hosted on OpenBSD's native httpd server. The config is rather simple, as it is for testing purposes:
server "10.0.1.222" {
    listen on 10.0.1.222 port 80
    log style combined
    location "/*php*" {
        root "/FOLDER"
        fastcgi socket "/run/php-fpm.sock"
    }
    directory {
        index "index.php"
    }
    location "/*" {
        root "/FOLDER"
    }
    location "/SUBFOLDER/*" {block}
}
Inside SUBFOLDER I placed some HTML files that are not intended for direct viewing.
With the last location directive I expect requests like http://10.0.1.222/SUBFOLDER/01.html to be blocked with a 403 code, but I can't achieve this.
While http://10.0.1.222/SUBFOLDER/ returns access denied, any request for an existing HTML document within SUBFOLDER is served without complaint.
If the string /SUBFOLDER/* is (as I suppose) a proper shell glob that matches /SUBFOLDER/ itself plus any string after it, then requests like http://10.0.1.222/SUBFOLDER/01.html should return code 403. But that isn't happening.
I tried many combinations: "/SUBFOLDER/*", "/SUBFOLDER/*.html" and so on, with or without the leading /. No effect.
There is probably something I do not understand, but I can't debug my mistake.
What am I missing?
Quick answer to my own question, obtained from misc@openbsd.org: according to the manual (man httpd.conf), for location statements the first match wins. To keep more specific rules from being ignored, they must be placed before more general ones.
In my case, putting the blocking directive right after log style combined solved the problem.
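The first-match-wins behaviour described above can be illustrated with a small Python sketch. This is only a model and not httpd's actual code: it assumes the location patterns behave like plain shell globs in which * may also match /, and the action labels ("fastcgi", "serve", "block") as well as the function name first_match are made up for the illustration.

```python
from fnmatch import fnmatchcase


def first_match(locations, path):
    """Return the action of the first location whose glob matches path."""
    for pattern, action in locations:
        if fnmatchcase(path, pattern):
            return action
    return "default"


# Order from the question: the catch-all "/*" comes before the block rule.
ORIGINAL = [
    ("/*php*", "fastcgi"),
    ("/*", "serve"),
    ("/SUBFOLDER/*", "block"),
]

# Order from the answer: the more specific block rule comes first.
REORDERED = [
    ("/SUBFOLDER/*", "block"),
    ("/*php*", "fastcgi"),
    ("/*", "serve"),
]

if __name__ == "__main__":
    for name, locations in (("original", ORIGINAL), ("reordered", REORDERED)):
        print(name, first_match(locations, "/SUBFOLDER/01.html"))
```

In this model the request for /SUBFOLDER/01.html is claimed by "/*" in the original order, so the block rule is never reached. Moving the block rule to the top lets it win.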

Trailing slash and routes

I recently started experimenting with traefik and I'm switching over from nginx.
I'm a bit confused by how the paths in Path, PathStrip, PathPrefix and PathPrefixStrip work regarding trailing slashes.
In nginx for proxied requests this is the documentation:
If a location is defined by a prefix string that ends with the slash
character, and requests are processed by one of proxy_pass,
fastcgi_pass, uwsgi_pass, scgi_pass, or memcached_pass, then the
special processing is performed. In response to a request with URI
equal to this string, but without the trailing slash, a permanent
redirect with the code 301 will be returned to the requested URI with
the slash appended. If this is not desired, an exact match of the URI
and location could be defined like this:
location /user/ {
    proxy_pass http://user.example.com;
}
location = /user {
    proxy_pass http://login.example.com;
}
How would it be possible to replicate this behaviour?
Essentially I'd like traefik to append the trailing slash when it is not present, so that PathPrefixStrip:/mylocation/ will also match /mylocation and issue a 301 to /mylocation/.
In addition, I'm a bit confused by the difference between Path and PathPrefix when used as Modifiers. Is there some documentation that explains the difference in their respective behaviour?
Thank you.
This question is old, but it is still helpful for novice traefik users.
It was probably about traefik 1.x; here is the official matchers documentation for 1.7.
It states:
Path: /products/, /articles/{category}/{id:[0-9]+} Match exact request path. It accepts a sequence of literal and regular expression paths.
...
PathPrefix: /products/, /articles/{category}/{id:[0-9]+} Match request prefix path. It accepts a sequence of literal and regular expression prefix paths.
They are also very clear in the Path Matcher Usage Guidelines.
But this is old and deprecated in favor of the current version, so check the latest documentation.
Traefik is very different in version 2.x. You can check the migration guide here.
Now you need to set up:
entrypoints
routers
middlewares
services
In your router, the rule property is where you set the Path or PathPrefix matcher. The rule reference is here:
Path(`/path`, `/articles/{cat:[a-z]+}/{id:[0-9]+}`, ...) Match exact request path. It accepts a sequence of literal and regular expression paths.
PathPrefix(`/products/`, `/articles/{cat:[a-z]+}/{id:[0-9]+}`) Match request prefix path. It accepts a sequence of literal and regular expression prefix paths.
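To make the difference between exact and prefix matching concrete, here is a minimal Python sketch. It is not traefik's implementation: it only handles literal paths (no {name:regex} placeholders), and the function names path_matches and path_prefix_matches are made up for the illustration.

```python
def path_matches(rule_path: str, request_path: str) -> bool:
    """Exact match: the request path must equal the rule path."""
    return request_path == rule_path


def path_prefix_matches(rule_prefix: str, request_path: str) -> bool:
    """Prefix match: the request path must start with the rule prefix."""
    return request_path.startswith(rule_prefix)


if __name__ == "__main__":
    for request in ("/mylocation", "/mylocation/", "/mylocation/page"):
        print(
            request,
            "Path:", path_matches("/mylocation/", request),
            "PathPrefix:", path_prefix_matches("/mylocation/", request),
        )
```

Under this model a prefix rule written with a trailing slash does not cover the same path without the slash, which is the case the question asks about. A prefix rule written without the slash would cover both forms.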

How to obtain values of all Apache's mod_rewrite environment variables, without PHP?

I have an Apache instance but without PHP (almost all the SO answers I found for this are PHP-specific). In fact it's mostly serving static content.
I'm working on some mod_rewrite redirections and I'd like to know the exact values of all environment variables.
These pages list Apache's available env variables and example values:
http://www.zytrax.com/tech/web/env_var.htm
https://www.cheatography.com/davechild/cheat-sheets/mod-rewrite/
however I'd like to see the exact values coming from my requests, to facilitate working on my rewrite rules.
What would be the easiest way to get all the Apache environment values? (without installing PHP on it).
As a poor man's debugging, I know I can get values one by one by defining some example rewrites like this
RewriteRule ^/test.htm http://localhost/test2.htm?SERVER_NAME=%{SERVER_NAME} [R,L,NC]
and then hitting http://localhost/test.htm and observing the redirect, but this is not a really good solution.
Is there a better way to learn about all the environment, not specific to any particular language like PHP?
You can obtain the values of all Apache environment variables with Perl. The standard Apache distribution still bundles the good old printenv.pl CGI script. Here's what mine (Apache/2.4 on Windows) looks like:
#!D:/programs/perl/bin/perl.exe
#
# To permit this cgi, replace # on the first line above with the
# appropriate #!/path/to/perl shebang, and on Unix / Linux also
# set this script executable with chmod 755.
#
# ***** !!! WARNING !!! *****
# This script echoes the server environment variables and therefore
# leaks information - so NEVER use it in a live server environment!
# It is provided only for testing purpose.
# Also note that it is subject to cross site scripting attacks on
# MS IE and any other browser which fails to honor RFC2616.
##
## printenv -- demo CGI program which just prints its environment
##
use strict;
use warnings;
print "Content-type: text/plain; charset=iso-8859-1\n\n";
foreach my $var (sort(keys(%ENV))) {
    my $val = $ENV{$var};
    $val =~ s|\n|\\n|g;
    $val =~ s|"|\\"|g;
    print "${var}=\"${val}\"\n";
}
Of course:
You need Perl installed
The Apache administrator will typically not enable the default /cgi-bin directory
Other than using a program, you're out of luck. I'm not aware of any built-in Apache module that reports ENV variables (not even mod_info).
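If Perl is not available but another interpreter is, the same idea carries over. Below is a rough Python equivalent of the printenv script, offered as a sketch only: it assumes Python 3 is installed and that CGI execution is enabled for the script's directory, and the helper name format_env is made up here. The same warning applies as for the Perl version: it leaks server information and should never be left on a live server.

```python
#!/usr/bin/env python3
"""Demo CGI program that prints its environment (testing only)."""
import os
import sys


def format_env(env):
    """Return sorted NAME="value" lines, escaping newlines and double quotes."""
    lines = []
    for name in sorted(env):
        value = env[name].replace("\n", "\\n").replace('"', '\\"')
        lines.append(f'{name}="{value}"')
    return lines


def main():
    # A CGI response needs a header block followed by a blank line.
    sys.stdout.write("Content-Type: text/plain; charset=utf-8\n\n")
    for line in format_env(os.environ):
        sys.stdout.write(line + "\n")


if __name__ == "__main__":
    main()
```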

Browse zipfiles on apache webserver

I already have an awk script called viewzip.cgi which works as follows:
...viewzip.cgi/path_to_zipfile/zipfile.zip/
will show the root directory of that file,
...viewzip.cgi/path_to_zipfile/zipfile.zip/subdir/
shows a subdirectory (if present)
...viewzip.cgi/path_to_zipfile/zipfile.zip/path_to_file/file
will download one particular file.
Now what I want is omitting the "viewzip.cgi" part in the URL and an automatic redirect working as follows:
...path_to_zipfile/zipfile.zip
should download the zipfile as it would be standard behaviour, but
...path_to_zipfile/zipfile.zip/
with the trailing slash should redirect to a path like the first example, and also when trailing subdirs or files are appended.
How can I do that, if at all? I have access to the file system (i.e. ".htaccess") but not to Apache's root configuration files. Or is there a (possibly well-known) better solution? A similar problem applies to .chm files, which would be easier to browse if they were unpacked on the server on request. It would be nice if I didn't need to repeat a redirection line for every single zipfile I have.
henni
The RedirectMatch keyword does the job.
RedirectMatch .../((?!viewzip\.cgi/).*)\.zip/(.*) http://www.../.../viewzip.cgi/$1.zip/$2
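The key part of that pattern is the negative lookahead (?!viewzip\.cgi/), which keeps URLs that already go through the script from being redirected again. Here is a small Python sketch of the same idea. The host and directory parts that are elided in the line above are not reconstructed: the sketch matches from the URL root, and the /files/ paths and the function name redirect_target are hypothetical. Python's re module is used as a stand-in for Apache's regex engine, which should behave the same for this simple pattern.

```python
import re

# Simplified stand-in for the RedirectMatch pattern, anchored at the URL root.
ZIP_BROWSE = re.compile(r"^/((?!viewzip\.cgi/).*)\.zip/(.*)$")


def redirect_target(url_path):
    """Return the viewzip.cgi path for url_path, or None if no redirect applies."""
    match = ZIP_BROWSE.match(url_path)
    if match is None:
        return None
    return f"/viewzip.cgi/{match.group(1)}.zip/{match.group(2)}"


if __name__ == "__main__":
    for path in (
        "/files/a.zip",
        "/files/a.zip/",
        "/files/a.zip/subdir/",
        "/viewzip.cgi/files/a.zip/subdir/",
    ):
        print(path, "->", redirect_target(path))
```

A plain request for the .zip file has no trailing slash, so it does not match and is downloaded as usual.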

How do I make the browser display the clean URL on a re-write?

I have a hyperlink that looks like this:
http://domain.com/sample/comments/65
And when I click on it, it goes to this:
http://domain.com/sample/comments/index.php?submissionid=65
I'm using a rewrite rule to make it do this. This is what I want, except I also want the URL displayed in the browser to still look like "http://domain.com/sample/comments/65."
How can I do this? The .htaccess file is displayed below.
RewriteEngine on
RewriteRule ^comments/([0-9]+)?$ http://domain.com/sample/comments/index.php?submissionid=$1 [NC,L]
Thanks in advance,
John
You must remove the part http://domain.com/sample/, otherwise it will force a redirect:
RewriteEngine on
RewriteRule ^comments/([0-9]+)?$ comments/index.php?submissionid=$1 [NC,L,B]
The B flag is also necessary because you're using the backreference inside a query string, which requires escaping.
The manual says (emphasis mine):
When using the rewrite engine in .htaccess files the per-directory prefix (which always is the same for a specific directory) is automatically removed for the pattern matching and automatically added after the substitution has been done. This feature is essential for many sorts of rewriting; without this, you would always have to match the parent directory, which is not always possible. There is one exception: If a substitution string starts with http://, then the directory prefix will not be added, and an external redirect (or proxy throughput, if using flag P) is forced. See the RewriteBase directive for more information.
This would not be the case if you put the rewrite rule in the virtual host or main configuration, as long as the request host and the host in the rewrite rule matched.
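The per-directory behaviour quoted from the manual can be modelled roughly as follows. This is a simplified Python sketch, not how mod_rewrite is implemented: it handles a single rule, checks only for the http:// scheme named in the quote, ignores flags other than NC, and writes the backreference in Python's \1 syntax instead of $1. The function name apply_htaccess_rule is made up for the illustration.

```python
import re


def apply_htaccess_rule(url_path, dir_prefix, pattern, substitution):
    """Model one per-directory RewriteRule; return (kind, target) or None."""
    if not url_path.startswith(dir_prefix):
        return None
    # The per-directory prefix is removed before pattern matching.
    local_path = url_path[len(dir_prefix):]
    match = re.search(pattern, local_path, flags=re.IGNORECASE)
    if match is None:
        return None
    target = match.expand(substitution)
    if target.startswith("http://"):
        # An absolute URL forces an external redirect; no prefix is added.
        return ("external redirect", target)
    # Otherwise the prefix is added back and the rewrite stays internal.
    return ("internal rewrite", dir_prefix + target)


if __name__ == "__main__":
    pattern = r"^comments/([0-9]+)?$"
    for substitution in (
        r"http://domain.com/sample/comments/index.php?submissionid=\1",
        r"comments/index.php?submissionid=\1",
    ):
        print(apply_htaccess_rule("/sample/comments/65", "/sample/", pattern, substitution))
```

With the absolute substitution the browser is told to go to the index.php URL, so the address bar changes. With the relative one the rewrite happens inside the server and the clean URL stays visible.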