Change word to path - rebol

Yet another question related to Change path or refinement
This time, I want to change the a inside a block to a/b
Using change:
test: [a]
change test 'a/b
Splits the values into two:
>> test
== [a b]
That isn't what I want; I want it inserted as a single path: [a/b]

The solution is to use change/only:
test: [a]
change/only test 'a/b
Gives:
>> test
== [a/b]

change/only works, though there is a simpler way in that case:
>> test: [a]
== [a]
>> test/1: 'a/b
== a/b
>> test
== [a/b]


How does snakemake --show-failed-logs work

I want to use snakemake with the --show-failed-logs parameter, but I am not sure how it works or what to expect.
For example (this is not a working example; I just typed and copied some code parts, and I can create a working example if necessary):
My command is:
snakemake -s Snakefile_test.smk --configfile test.yml -j 4 --restart-times 2 --show-failed-logs
In Snakefile_test.smk I have a rule:
rule testrule:
    input:
        R1_trimmed = rules.trim.output.R1_trimmed,
        R2_trimmed = rules.trim.output.R2_trimmed
    output:
        plot_R1 = output+"/{sample}/figures/{sample}_R1_plot.png",
        plot_R2 = output+"/{sample}/figures/{sample}_R2_plot.png",
    log:
        testrule_log = output+"/{sample}/logs/plot_log.txt"
    run:
        shell("python scripts/createplot.py -r1_trimmed {input.R1_trimmed} -r2_trimmed {input.R2_trimmed} -plot_r1 {output.plot_R1} -plot_r2 {output.plot_R2} -log {log.testrule_log}")
In createplot.py to test I have something like:
if __name__ == "__main__":
    logging.basicConfig(filename=args.log, level=logging.DEBUG, format='%(asctime)s %(levelname)s %(name)s %(message)s')
    logger=logging.getLogger(__name__)
    #to add some content to the log file
    try:
        1/0
    except ZeroDivisionError as err:
        logger.error(err)
    main()
    #to let the script crash
    a = 1/0
Because createplot.py will obviously crash the pipeline, I expected snakemake to print the contents of testrule_log to the screen, but that is not the case. I am using version 5.25.0.
EDIT:
I am familiar with
script:
    "scripts/createplot.py"
But I need to use shell in this case. If that causes the problem, let me know.

Aggregate undetermined number of files for all wildcards in one rule

I have a set of files which will be individually processed to produce multiple files. Exactly how many files is unknown before runtime. (If it matters, this is demultiplexing DNA sequencing results.) I then have a script which takes all of these files at once.
Right now I have something like this:
import glob
checkpoint demultiplex:
    input: "{sample}.fastq"
    output: directory("{sample}")
    shell:
        # in reality the number of output files is not known
        "mkdir -p {output} &&"
        "touch {output}/{wildcards.sample}-1.fastq &&"
        "touch {output}/{wildcards.sample}-2.fastq &&"
        "touch {output}/{wildcards.sample}-3.fastq"
def find_outputs(wildcards):
    outdir = checkpoints.demultiplex.get(**wildcards)
    return glob.glob("{sample}/{sample}-*.fastq".format_map(wildcards))
rule analysis:
    input: find_outputs
    output: "results.txt"
    script: "scripts/do_analysis.R"
This obviously doesn't work, because the values of {sample} (Assume they should be A, B, C, D) are never defined.
As I was writing the question, I came up with this answer, which seems to work. However, if you have something cleaner, I would be happy to accept it!
For checkpoints.<rule>.get() to work its magic, it has to be in the body of a function which is given as a reference, not called. Also, this function needs to take one argument, wildcards.
So we make a function that returns closures having the behavior we need. The value of wildcards (which will be empty in this case) is ignored, allowing us to specify the values manually.
def find_outputs(sample):
    def f(wildcards):
        checkpoints.demultiplex.get(sample = sample)
        return glob.glob("{sample}/{sample}-*.fastq".format(sample = sample))
    return f
rule analysis:
    input:
        find_outputs("A"),
        find_outputs("B"),
        find_outputs("C"),
        find_outputs("D")
    output: "results.txt"
    script: "scripts/do_analysis.R"
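The closure pattern itself can be tried outside Snakemake. Below is a minimal Python sketch, assuming a hypothetical make_finder helper and a temporary directory standing in for the checkpoint output; the checkpoints.demultiplex.get(...) call is left out because it only exists inside a Snakefile.

```python
import glob
import os
import tempfile


def make_finder(sample, root="."):
    """Return an input function bound to one sample (hypothetical helper)."""
    def finder(wildcards):
        # Inside a Snakefile, checkpoints.demultiplex.get(sample=sample)
        # would go here; `wildcards` is accepted but deliberately ignored.
        pattern = os.path.join(root, sample, f"{sample}-*.fastq")
        return sorted(glob.glob(pattern))
    return finder


# Stand-in for the checkpoint output: samples with different file counts.
root = tempfile.mkdtemp()
for sample, count in (("A", 3), ("B", 1)):
    os.makedirs(os.path.join(root, sample))
    for i in range(1, count + 1):
        open(os.path.join(root, sample, f"{sample}-{i}.fastq"), "w").close()

finders = {s: make_finder(s, root) for s in ("A", "B")}
found = {s: [os.path.basename(p) for p in f(None)] for s, f in finders.items()}
print(found)
```

Each closure remembers its own sample, so the function handed over as an input still takes the single wildcards argument while looking up files for a fixed sample.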

How do I store the value returned by either run or shell?

Let's say I have this script:
# prog.p6
my $info = run "uname";
When I run prog.p6, I get:
$ perl6 prog.p6
Linux
Is there a way to store a stringified version of the returned value and prevent it from being output to the terminal?
There's already a similar question but it doesn't provide a specific answer.
You need to enable the stdout pipe, which otherwise defaults to $*OUT, by setting :out. So:
my $proc = run("uname", :out);
my $stdout = $proc.out;
say $stdout.slurp;
$stdout.close;
which can be shortened to:
my $proc = run("uname", :out);
say $proc.out.slurp(:close);
If you want to capture output on stderr separately from stdout you can do:
my $proc = run("uname", :out, :err);
say "[stdout] " ~ $proc.out.slurp(:close);
say "[stderr] " ~ $proc.err.slurp(:close);
or if you want to capture stdout and stderr to one pipe, then:
my $proc = run("uname", :merge);
say "[stdout and stderr] " ~ $proc.out.slurp(:close);
Finally, if you don't want to capture the output and don't want it output to the terminal:
my $proc = run("uname", :!out, :!err);
exit( $proc.exitcode );
The solution covered in this answer is concise.
This sometimes outweighs its disadvantages:
Doesn't store the result code. If you need that, use ugexe's solution instead.
Doesn't store output to stderr. If you need that, use ugexe's solution instead.
Potential vulnerability. This is explained below. Consider ugexe's solution instead.
Documentation of the features explained below starts with the quote adverb :exec.
Safest unsafe variant: q
The safest variant uses a single q:
say qx[ echo 42 ] # 42
If there's an error then the construct returns an empty string and any error message will appear on stderr.
This safest variant is analogous to a single quoted string like 'foo' passed to the shell. Single quoted strings don't interpolate so there's no vulnerability to a code injection attack.
That said, you're passing a single string to the shell which may not be the shell you're expecting so it may not parse the string as you're expecting.
Least safe unsafe variant: qq
The following line produces the same result as the q line but uses the least safe variant:
say qqx[ echo 42 ]
This double q variant is analogous to a double quoted string ("foo"). This form of string quoting does interpolate which means it is subject to a code injection attack if you include a variable in the string passed to the shell.
By default run just passes the STDOUT and STDERR to the parent process's STDOUT and STDERR.
You have to tell it to do something else.
The simplest is to just give it :out to tell it to keep STDOUT. (Short for :out(True))
my $proc = run 'uname', :out;
my $result = $proc.out.slurp(:close);
my $proc = run 'uname', :out;
for $proc.out.lines(:close) {
    .say;
}
You can also effectively tell it to just send STDOUT to /dev/null with :!out. (Short for :out(False))
There are more things you can do with :out
{
    my $file will leave {.close} = open :w, 'test.out';
    run 'uname', :out($file); # write directly to a file
}
print slurp 'test.out'; # Linux
my $proc = run 'uname', :out;
react {
    whenever $proc.out.Supply {
        .print;
        LAST {
            $proc.out.close;
            done; # in case there are other whenevers
        }
    }
}
If you are going to do that last one, it is probably better to use Proc::Async.

TCL/Expect variable vs $variable

Which is the proper way of using a variable: with or without the dollar sign? I thought that variable (without $) is used only during variable declaration (similar to Bash):
set var 10
In all other cases, when the variable is referred to or used (but not declared), the proper syntax is $variable (with $):
set newVar $var
puts $var
puts $newVar
But then I found code where the two are interchanged, and this code seems to work:
# using argv
if {[array exists argv]} {
    puts "argv IS ARRAY"
} else {
    puts "argv IS NOT AN ARRAY"
}
# using $argv
if {[array exists $argv]} {
    puts "\$argv IS ARRAY"
} else {
    puts "\$argv IS NOT AN ARRAY"
}
# using argv
if {[string is list argv]} {
    puts "argv IS LIST"
} else {
    puts "argv IS NOT LIST"
}
# using $argv
if {[string is list $argv]} {
    puts "\$argv IS LIST"
} else {
    puts "\$argv IS NOT LIST"
}
Output:
argv IS NOT AN ARRAY
$argv IS NOT AN ARRAY
argv IS LIST
$argv IS LIST
Edit in reply to @glenn jackman:
Your reply pointed me to further research, and I've found that TCL is capable of some sort of "self-modifying code" (or whatever the correct name is), e.g.:
% set variableName "x"
x
% puts $x
can't read "x": no such variable
% set $variableName "abc"
abc
% puts $x
abc
% puts [set $variableName]
abc
% set x "def"
def
% puts $x
def
% puts [set $variableName]
def
%
Now your answer sheds some light on the problem, but one question remains. This is an excerpt from the documentation:
set varName ?value?
array exists arrayName
The documentation says that both commands expect a variable name (not a value); in other words, they expect variable instead of $variable. So I assume (based on the self-modifying code above) that when I pass $variable instead of variable, variable substitution takes place (exactly as in the code above). But what if $variable contains something that is neither a list nor an array (my arguments during testing were: param0 param1 "param 2" param3)? From this point of view, the output saying $argv IS LIST is wrong. What am I missing here?
Edit in reply to @schlenk:
Finally I (hope I) understand the problem. I've found a great article about TCL which explains (not just) this problem. Let me pinpoint a few wise statements from this article:
In Tcl what a string represents is up to the command that's
manipulating it.
Everything is a command in Tcl - as you can see there is no
assignment operator.
if is a command, with two arguments.
The command name is not a special type but just a string.
The following SO answer also confirms this statement:
"In Tcl, values don't have a type... the question is whether they can be used as a given type."
The command string is integer $a means:
"Can I use the value in $a as an integer"
NOT
"Is the value in $a an integer"
"Every integer is also a valid list (of one element)... so it can be
used as either and both string is commands will return true (as will
several others for an integer)."
I believe the same also applies to the string is list command:
% set abc "asdasd"
asdasd
% string is list $abc
1
% string is alnum $abc
1
string is list returns 1 because $abc is a string and also a one-element list, etc. Most tutorials say that the following snippet is the proper way of declaring and working with lists:
% set list1 { 1 2 3 }
% lindex $list1 end-1
2
But since everything in TCL is a string, the following also works in my experience (please correct me if I am wrong).
% set list2 "1 2 3"
1 2 3
% lindex $list2 end-1
2
It depends on the command. Some Tcl commands require a variable name as a parameter, if they need to modify the contents of the variable. Some are:
set
foreach
lappend
incr
Most but certainly not all commands want to take a variable's value.
You'll need to check the documentation for the relevant commands to see whether the parameters include "varName" (or "dictionaryVariable"), or whether they are named "string", "list", etc.
An example using info exists which takes a varName argument:
% set argv {foo bar baz}
foo bar baz
% info exists argv ;# yes the variable "argv" exists
1
% info exists $argv ;# no variable named "foo bar baz"
0
% set {foo bar baz} "a value" ;# create a variable named "foo bar baz"
a value
% info exists $argv ;# so now that variable exists
1
The important thing to know is that $x in Tcl is just syntactical sugar for the command set x. So you can translate any $x in Tcl code into [set x] in the same place to see what really happens.
The other important thing to consider is immutable values. Tcl values are immutable, so you cannot change them. You can just create a new changed value. But you can change the value stored inside a variable.
This is where the difference between commands taking a variable name and those that take a value comes in. If a command wants to change the value stored in a variable, it takes a variable name. Examples are lappend, lset, append and so on. Other commands return a new value and take a value as argument, examples include lsort, lsearch, lindex.
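The name-versus-value distinction can be modelled with a plain dictionary. The sketch below is in Python rather than Tcl, and it is only a toy model of a variable table (the helper names tcl_set and info_exists are made up for illustration); it replays the info exists session from the earlier answer, writing each $argv as an explicit [set argv] lookup.

```python
# Toy model of a Tcl-like variable table; not how Tcl is implemented.
variables = {}


def tcl_set(name, *value):
    """Model `set varName ?value?`: store if a value is given, then return it."""
    if value:
        variables[name] = value[0]
    return variables[name]


def info_exists(name):
    """Model `info exists varName`: the argument is a NAME, not a value."""
    return int(name in variables)


tcl_set("argv", "foo bar baz")                 # set argv {foo bar baz}
by_name = info_exists("argv")                  # info exists argv
by_value = info_exists(tcl_set("argv"))        # info exists $argv
tcl_set("foo bar baz", "a value")              # set {foo bar baz} "a value"
by_value_after = info_exists(tcl_set("argv"))  # info exists $argv, again
print(by_name, by_value, by_value_after)
```

In this model, passing $argv to a command that expects a name simply looks up whatever variable happens to be named by the current value of argv, which is why the second check fails until a variable called "foo bar baz" is created.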
Another important point is the fact that you do not really have a list type. You have strings that look like lists. So that is what Tcl's string is list tests. This has some consequences, e.g. you cannot always decide if you have a string literal or a one item list, as it is often the same. Example given:
% set maybe_list a
% string is list $maybe_list
1
Combine that with Tcl's nearly unrestricted names for variables, as already demonstrated by Glenn, and you can get really confused. For example, these are all valid Tcl variable names; you just cannot use all of them with the $ shortcut:
% set "" 1 ;# variable name is the empty string
1
% puts [set ""]
1
% set " " 1 ;# variable name is just whitespace
1
% puts [set " "]
1
% set {this is a list as variable name} 1 ;# a variable with a list name
1
% puts [set {this is a list as variable name}]
1
% set Δx 1
1
% incr Δx
2
% puts [set Δx]
2

Bash $PATH is caching every modification

How do I clear the cache of $PATH in Bash? Every time I modify $PATH, the former modifications are kept too! My $PATH is already a page long :-), and it gets in the way of my work, because it points to some wrong places (every modification is appended to the end of the $PATH variable). Please help me solve this problem.
"because every modification is being appended in the end of the $PATH variable"
Take a close look at where you are setting $PATH, I bet it looks something like this:
PATH="$PATH:/some/new/dir:/another/newdir:"
Having $PATH in the new assignment gives you the appending behavior you don't want.
Instead do this:
PATH="/some/new/dir:/another/newdir:"
Update
If you want to strip $PATH of all duplicate entries but still maintain the original order then you can do this:
PATH=$(awk 'BEGIN{ORS=":";RS="[:\n]"}!a[$0]++' <<<"${PATH%:}")
PATH=$(echo $PATH | tr ':' '\n' | sort | uniq | tr '\n' ':')
Once in a while execute the above command. It will tidy up your PATH variable by removing any duplication.
-Cheers
PS: Warning: This will reorder the Paths in PATH variable. And can have undesired effects !!
When I'm setting my PATH, I usually use this script, which I last modified in 1999, it seems (but use daily on all my Unix-based computers). It allows me to add to my PATH (or LD_LIBRARY_PATH, or CDPATH, or any other path-like variable), eliminate duplicates, and trim out values that are no longer wanted.
Usage
export PATH=$(clnpath /important/bin:$PATH:/new/bin /old/bin:/debris/bin)
The first argument is the new path, built by any technique you like. The second argument is a list of names to remove from the path (if they appear - no error if they don't). For example, I have up to about five versions of the software I work on installed at any given time. To switch between versions, I use this script to adjust both PATH and LD_LIBRARY_PATH to pick up the correct values for the version I'm about to start using, and remove the values of the version I'm no longer using.
Code
:   "@(#)$Id: clnpath.sh,v 1.6 1999/06/08 23:34:07 jleffler Exp $"
#
#   Print minimal version of $PATH, possibly removing some items
case $# in
0)  chop=""; path=${PATH:?};;
1)  chop=""; path=$1;;
2)  chop=$2; path=$1;;
*)  echo "Usage: `basename $0 .sh` [$PATH [remove:list]]" >&2
    exit 1;;
esac
# Beware of the quotes in the assignment to chop!
echo "$path" |
${AWK:-awk} -F: '#
BEGIN   {   # Sort out which path components to omit
            chop="'"$chop"'";
            if (chop != "") nr = split(chop, remove); else nr = 0;
            for (i = 1; i <= nr; i++)
                omit[remove[i]] = 1;
        }
{
    for (i = 1; i <= NF; i++)
    {
        x=$i;
        if (x == "") x = ".";
        if (omit[x] == 0 && path[x]++ == 0)
        {
            output = output pad x;
            pad = ":";
        }
    }
    print output;
}'
Commentary
The ':' is an ancient way of using /bin/sh (originally the Bourne shell - now as often Bash) to run the script. If I updated it, the first line would become a shebang. I'd also not use tabs in the code. And there are ways to get the 'chop' value set that do not involve as many quotes:
awk -F: '...script...' chop="$chop"
But it isn't broken, so I haven't fixed it.
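For readers who find the quoting hard to follow, here is a rough Python rendering of what the awk program does. It is a sketch of the same logic, not a replacement for the script; the function name and the sample paths are chosen for illustration.

```python
def clnpath(path, chop=""):
    """Deduplicate a colon-separated path, dropping entries listed in `chop`."""
    omit = set(chop.split(":")) if chop else set()
    seen = set()
    kept = []
    for entry in path.split(":"):
        entry = entry or "."   # the script treats an empty entry as "."
        if entry in omit or entry in seen:
            continue           # skip removed entries and later duplicates
        seen.add(entry)
        kept.append(entry)
    return ":".join(kept)


result = clnpath(
    "/important/bin:/usr/bin:/old/bin:/usr/bin::/new/bin",
    "/old/bin:/debris/bin",
)
print(result)
```

The first occurrence of each directory wins, so putting the important directories at the front of the first argument is what controls the final search order.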
When adding entries to PATH, you should check to see if they're already there. Here's what I use in my .bashrc:
pathadd() {
    if [ -d "$1" ] && [[ ":$PATH:" != *":$1:"* ]]; then
        PATH="$PATH:$1"
    fi
}
pathadd /usr/local/bin
pathadd /usr/local/sbin
pathadd ~/bin
This only adds directories to PATH if they exist (i.e. no bogus entries) and aren't already there. Note: the pattern matching feature I use to see if the entry is already in PATH is only available in bash, not the original Bourne shell; if you want to use this with /bin/sh, that part'd need to be rewritten.
I have a nice set of scripts that add path variables to the beginning or end of PATH depending on the ordering I want. The problem is OSX starts with /usr/local/bin after /usr/bin, which is exactly NOT what I want (being a brew user and all). So what I do is put a new copy of /usr/local/bin in front of everything else and use the following to remove all duplicates (and leave ordering in place).
MYPATH=$(echo $MYPATH|perl -F: -lape'$_=join":",grep!$s{$_}++,@F')
I found this on perlmonks. Like most perl, it looks like line noise to me so I have no idea how it works, but work it does!
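As far as I can tell, the one-liner splits the line on ":" into an array of fields, keeps only the first occurrence of each field (the hash counts how often a field has been seen), and joins the survivors back together with ":". A small Python sketch of the same idea, with a made-up function name and sample value:

```python
def dedupe_path(path):
    """Drop repeated entries from a colon-separated path, keeping first-seen order."""
    # dict.fromkeys keeps one key per distinct entry, in insertion order,
    # which plays the role of the "seen" hash in the Perl version.
    return ":".join(dict.fromkeys(path.split(":")))


cleaned = dedupe_path("/usr/local/bin:/usr/bin:/bin:/usr/local/bin:/usr/bin")
print(cleaned)
```

Because the first occurrence is the one that survives, prepending a new copy of /usr/local/bin and then deduplicating moves it to the front, as described above.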