Difference between .absolute and .abspath - raku

Is there one? They both yield the same string
given 'file.txt'.IO -> $io {
say $io."$_" for <path abspath absolute>
}
# file.txt
# /Users/Me/file.txt
# /Users/Me/file.txt

The method IO::Path.absolute has a multi candidate that accepts a prefix, which is inserted between the current working directory and the file name or path fragment held by the IO::Path instance.
dd 'file.txt'.IO.absolute('foo');
OUTPUT«"/home/camelia/foo/file.txt"␤»
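As a rough model of what the prefix argument does, here is a small Python sketch. It is not Raku's implementation; the helper name `absolute_with_prefix` and the `/home/camelia` working directory are assumptions made purely for illustration.

```python
import posixpath


def absolute_with_prefix(path, prefix="", cwd="/home/camelia"):
    """Model of the behaviour described above: cwd + prefix + path.

    Hypothetical helper for illustration only. An already-absolute path is
    returned unchanged because posixpath.join restarts at an absolute part.
    """
    return posixpath.normpath(posixpath.join(cwd, prefix, path))


print(absolute_with_prefix("file.txt"))         # /home/camelia/file.txt
print(absolute_with_prefix("file.txt", "foo"))  # /home/camelia/foo/file.txt
```

Without a prefix the result matches what .abspath showed in the question, which is why the two methods look identical when called with no arguments.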


Merge multiple output chunks to one file in nextflow

I have a nextflow process that outputs multiple files, like below:
[chr1,/path/to/chr1_chunk1.TC.linear]
[chr1,/path/to/chr1_chunk1.HDL.linear]
[chr1,/path/to/chr1_chunk2.TC.linear]
[chr1,/path/to/chr1_chunk2.HDL.linear]
.....
I got the above output after using the transpose() operator.
Now I want to concatenate all chunks from all chromosomes, ordered by chromosome and chunk number, so that I get one file for TC and another for HDL. I have multiple traits spread over many chunks, so the related question "output files (chromosomal chunks) merging in nextflow" doesn't help.
Any help?
You can use a combination of the branch and collectFile operators. Consider the following directory structure (where each .linear file contains its own name):
➜ sandbox tree .
.
├── ex1.HDL.linear
├── ex1.TC.linear
├── ex2.HDL.linear
├── ex2.TC.linear
├── ex3.HDL.linear
├── ex3.TC.linear
└── example.nf
I wrote the following minimal reproducible example:
workflow {
files = Channel.fromPath('**.linear', checkIfExists: true)
files
.branch {
TC: it.name.contains('TC')
HDL: it.name.contains('HDL')
}
.set { result }
result
.TC
.collectFile(name: 'TC.txt', storeDir: '/Users/mribeirodantas/sandbox')
result
.HDL
.collectFile(name: 'HDL.txt', storeDir: '/Users/mribeirodantas/sandbox')
}
After running this pipeline with nextflow run example.nf, I will get in the /Users/mribeirodantas/sandbox folder two new files: TC.txt and HDL.txt. The content of TC.txt, for example, is:
ex2.TC.linear
ex3.TC.linear
ex1.TC.linear
If your chunk files are sufficiently small, you can use the collectFile operator to concatenate them into files whose names are defined by a dynamic grouping criterion:
The grouping criteria is specified by a closure that must return a
pair in which the first element defines the file name for the group
and the second element the actual value to be appended to that file.
To sort by chromosome number and then by chunk number, you can use the toSortedList and flatMap operators to feed the sorted collection into the collectFile operator:
input_ch
.map { key, chunk_file ->
def matcher = chunk_file.name =~ /^chr(\d+)_chunk(\d+)\.(\w+)\.linear$/
def (_, chrom, chunk, trait) = matcher[0]
tuple( (chrom as int), (chunk as int), trait, chunk_file )
}
.toSortedList( { a, b -> (a[0] <=> b[0]) ?: (a[1] <=> b[1]) } )
.flatMap()
.collectFile( sort: false ) { chrom, chunk, trait, chunk_file ->
[ "${trait}.linear", chunk_file.text ]
}
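The parse, sort and group logic can be checked outside Nextflow. The following Python sketch mirrors the steps above under the assumption that file names follow the `chrN_chunkM.TRAIT.linear` scheme; the function name `group_sorted_chunks` and the sample names are hypothetical.

```python
import re
from collections import defaultdict

# Assumed naming scheme, mirroring the regex in the Nextflow snippet above.
CHUNK_RE = re.compile(r"^chr(\d+)_chunk(\d+)\.(\w+)\.linear$")


def group_sorted_chunks(names):
    """Return {trait: [names ordered by chromosome, then chunk]}."""
    parsed = []
    for name in names:
        match = CHUNK_RE.match(name)
        if match is None:
            raise ValueError(f"unexpected file name: {name}")
        chrom, chunk, trait = match.groups()
        parsed.append((int(chrom), int(chunk), trait, name))

    # Numeric sort, so chr10 comes after chr2 (a plain string sort would not).
    parsed.sort(key=lambda row: (row[0], row[1]))

    grouped = defaultdict(list)
    for _chrom, _chunk, trait, name in parsed:
        grouped[trait].append(name)
    return dict(grouped)


names = [
    "chr10_chunk1.TC.linear",
    "chr2_chunk2.TC.linear",
    "chr2_chunk1.HDL.linear",
    "chr2_chunk1.TC.linear",
]
result = group_sorted_chunks(names)
print(result["TC"])
# ['chr2_chunk1.TC.linear', 'chr2_chunk2.TC.linear', 'chr10_chunk1.TC.linear']
```

Converting the captured groups to integers is the important step: it is what keeps chromosome 10 from sorting ahead of chromosome 2.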

Why do I get a `java.nio.file.ProviderMismatchException` when I access `isEmpty()` on a staged file

I am getting a java.nio.file.ProviderMismatchException when I run the following script:
process a {
output:
file _biosample_id optional true into biosample_id
script:
"""
touch _biosample_id
"""
}
process b {
input:
file _biosample_id from biosample_id.ifEmpty{file("_biosample_id")}
script:
def biosample_id_option = _biosample_id.isEmpty() ? '' : "--biosample_id \$(cat _biosample_id)"
"""
echo \$(cat ${_biosample_id})
"""
}
I'm using a slightly modified version of the Optional Input pattern.
Any ideas on why I'm getting the java.nio.file.ProviderMismatchException?
In your script block, _biosample_id is actually an instance of the nextflow.processor.TaskPath class. So to check whether the file (or directory) is empty, you can just call its .empty() method. For example:
script:
def biosample_id_option = _biosample_id.empty() ? '' : "--biosample_id \$(< _biosample_id)"
I like your solution - I think it's neat. And I think it should be robust (but I haven't tested it). The optional input pattern that is recommended will fail when attempting to stage missing input files to a remote filesystem/object store. There is a solution however, which is to keep an empty file in your $baseDir and point to it in your scripts. For example:
params.inputs = 'prots/*{1,2,3}.fa'
params.filter = "${baseDir}/assets/null/NO_FILE"
prots_ch = Channel.fromPath(params.inputs)
opt_file = file(params.filter)
process foo {
input:
file seq from prots_ch
file opt from opt_file
script:
def filter = opt.name != 'NO_FILE' ? "--filter $opt" : ''
"""
your_command --input $seq $filter
"""
}
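The decision made in the script block depends only on the staged file's name. A minimal Python sketch of that check follows; the function name `filter_option` and the example paths are assumptions for illustration.

```python
from pathlib import PurePosixPath

SENTINEL_NAME = "NO_FILE"  # same placeholder name as in the Nextflow example


def filter_option(opt_path):
    """Return the extra CLI argument, or '' when the placeholder was staged."""
    path = PurePosixPath(opt_path)
    return "" if path.name == SENTINEL_NAME else f"--filter {path}"


# Both paths below are hypothetical.
print(repr(filter_option("/pipeline/assets/null/NO_FILE")))  # ''
print(repr(filter_option("filters/keep.txt")))               # '--filter filters/keep.txt'
```

Because the placeholder is a real file, it can always be staged, and only its name signals that no optional input was supplied.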

channel checks as empty even if it has content

I am trying to have a process that is launched only if a combination of conditions is met, but when I check whether a channel holds a path to a file, it always reports the channel as empty. I am probably doing something wrong; if so, please correct my code. I tried to follow some of the suggestions in this issue, but without success.
Consider the following minimal example:
process one {
output:
file("test.txt") into _chProcessTwo
script:
"""
echo "Hello world" > "test.txt"
"""
}
// making a copy so I check first if something in the channel or not
// avoids raising exception of MultipleInputChannel
_chProcessTwo.into{
_chProcessTwoView;
_chProcessTwoCheck;
_chProcessTwoUse
}
//print contents of channel
println "Channel contents: " + _chProcessTwoView.toList().view()
process two {
input:
file(myInput) from _chProcessTwoUse
when:
(!_chProcessTwoCheck.toList().isEmpty())
script:
def test = _chProcessTwoUse.toList().isEmpty() ? "I'm empty" : "I'm NOT empty"
println "The outcome is: " + test
}
I want to have process two run if and only if there is a file in the _chProcessTwo channel.
If I run the above code I obtain:
marius@dev:~/pipeline$ ./bin/nextflow run test.nf
N E X T F L O W ~ version 19.09.0-edge
Launching `test.nf` [infallible_gutenberg] - revision: 9f57464dc1
[c8/bf38f5] process > one [100%] 1 of 1 ✔
[- ] process > two -
[/home/marius/pipeline/work/c8/bf38f595d759686a497bb4a49e9778/test.txt]
where the last line is actually the contents of _chProcessTwoView
If I remove the when directive from the second process I get:
marius@mg-dev:~/pipeline$ ./bin/nextflow run test.nf
N E X T F L O W ~ version 19.09.0-edge
Launching `test.nf` [modest_descartes] - revision: 5b2bbfea6a
[57/1b7b97] process > one [100%] 1 of 1 ✔
[a9/e4b82d] process > two [100%] 1 of 1 ✔
[/home/marius/pipeline/work/57/1b7b979933ca9e936a3c0bb640c37e/test.txt]
with the contents of the second worker .command.log file being: The outcome is: I'm empty
I tried also without toList()
What am I doing wrong? Thank you in advance
Update: a workaround would be to check _chProcessTwoUse.view() != "" but that is pretty dirty
Update 2: as requested by @Steve, I've updated the code to better reflect the actual conditions I have in my own pipeline:
def runProcessOne = true
process one {
when:
runProcessOne
output:
file("inputProcessTwo.txt") into _chProcessTwo optional true
file("inputProcessThree.txt") into _chProcessThree optional true
script:
// this would replace the probability that output is not created
def outputSomething = false
"""
if ${outputSomething}; then
echo "Hello world" > "inputProcessTwo.txt"
echo "Goodbye world" > "inputProcessThree.txt"
else
echo "Sorry. Process one did not write to file."
fi
"""
}
// making a copy so I check first if something in the channel or not
// avoids raising exception of MultipleInputChannel
_chProcessTwo.into{
_chProcessTwoView;
_chProcessTwoCheck;
_chProcessTwoUse
}
//print contents of channel
println "Channel contents: " + _chProcessTwoView.view()
println _chProcessTwoView.view() ? "Me empty" : "NOT empty"
process two {
input:
file(myInput) from _chProcessTwoUse
when:
(runProcessOne)
script:
"""
echo "The outcome is: ${myInput}"
"""
}
process three {
input:
file(defaultInput) from _chUpstreamProcesses
file(inputFromProcessTwo) from _chProcessThree
script:
def extra_parameters = _chProcessThree.isEmpty() ? "" : "--extra-input " + inputFromProcessTwo
"""
echo "Hooray! We got: ${extra_parameters}"
"""
}
As @Steve mentioned, I should not even need to check whether a channel is empty; Nextflow should know not to start the process. But I think in this construct I will have to.
Marius
I think part of the problem here is that process 'one' creates only optional outputs. This makes dealing with the optional inputs in process 'three' a bit tricky. I would try to reconcile this if possible. If this can't be reconciled, then you'll need to deal with the optional inputs in process 'three'. To do this, you'll basically need to create a dummy file, pass it into the channel using the ifEmpty operator, then use the name of the dummy file to check whether or not to prepend the argument's prefix. It's a bit of a hack, but it works pretty well.
The first step is to actually create the dummy file. I like shareable pipelines, so I would just create this in your baseDir, perhaps under a folder called 'assets':
mkdir assets
touch assets/NO_FILE
Then pass in your dummy file if your '_chProcessThree' channel is empty:
params.dummy_file = "${baseDir}/assets/NO_FILE"
dummy_file = file(params.dummy_file)
process three {
input:
file(defaultInput) from _chUpstreamProcesses
file(optfile) from _chProcessThree.ifEmpty(dummy_file)
script:
def extra_parameters = optfile.name != 'NO_FILE' ? "--extra-input ${optfile}" : ''
"""
echo "Hooray! We got: ${extra_parameters}"
"""
}
Also, these lines are problematic:
//print contents of channel
println "Channel contents: " + _chProcessTwoView.view()
println _chProcessTwoView.view() ? "Me empty" : "NOT empty"
Calling view() will emit all values from the channel to stdout. You can ignore whatever value it returns. Unless you enable DSL2, the channel will then be empty. I think what you're looking for here is a closure:
_chProcessTwoView.view { "Found: $it" }
Be sure to append -ansi-log false to your nextflow run command so the output doesn't get clobbered. HTH.

Terraform template_file: pass all received variables through to a function

Is there a way in Terraform's template_file to pass all the received variables through to another place?
I mean something similar to $@ in bash.
For example:
resource "template_file" "some_template" {
template = "${file("my_template.tpl")}"
vars {
var1 = "value1"
var2 = "value2"
}
}
and then from the rendered file:
#!/bin/bash
echo "Var1: ${var1}"
echo "Var2: ${var2}"
echo "But I want it in someway similar to this:"
for v in $@; do
echo "$v";
done
According to the documentation, no.
From https://www.terraform.io/docs/providers/template/d/file.html
Variables for interpolation within the template. Note that variables
must all be primitives. Direct references to lists or maps will cause
a validation error.
Primitives in Terraform are string, number and boolean.
This means you cannot pass a map or a list to group all the variables into one.
Instead, use join to pass all the variables as a single string, then split it inside the script (with tr or IFS tricks):
join(";", [var.myvar1, var.myvar2, var.myvar3])
and then
IN="${allvars}"
IFS=';' read -ra ADDR <<< "$IN"
for i in "${ADDR[@]}"; do
echo "$i"
done
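The round trip is easy to get subtly wrong, so here is a minimal Python sketch of the same pack-and-split idea. The names `pack`, `unpack` and `DELIMITER` are hypothetical, and the sketch assumes the delimiter never occurs inside a value.

```python
DELIMITER = ";"  # assumed separator; it must never occur inside a value


def pack(values):
    """Join primitive values into one string, as join() does in Terraform."""
    for value in values:
        if DELIMITER in value:
            raise ValueError(f"value contains the delimiter: {value!r}")
    return DELIMITER.join(values)


def unpack(packed):
    """Split the packed string back into its parts, as the IFS loop does."""
    return packed.split(DELIMITER) if packed else []


packed = pack(["value1", "value2", "value 3"])
print(packed)          # value1;value2;value 3
print(unpack(packed))  # ['value1', 'value2', 'value 3']
```

Any whitespace placed next to the delimiter when joining becomes part of the neighbouring value after splitting, so the separator used on both sides should be exactly the same string.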

How do I pick elements from a block using a string in Rebol?

Given this block
fs: [
usr [
local [
bin []
]
share []
]
bin []
]
I could retrieve an item using a path notation like so:
fs/usr/local
How do I do the same when the path is a string?
path: "/usr/local"
find fs path ;does not work!
find fs to-path path ;does not work!
You need to complete the input string path with the right root, then LOAD it and evaluate it.
>> path: "/usr/local"
>> insert path "fs"
>> do load path
== [
bin []
]
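For comparison, the same lookup can be done without evaluating generated source text by walking the structure one segment at a time. This Python sketch is a different technique from the LOAD and DO approach above; the nested dict stands in for the Rebol block, and `pick` is a hypothetical helper name.

```python
from functools import reduce

# Python stand-in for the Rebol block in the question.
fs = {
    "usr": {
        "local": {"bin": {}},
        "share": {},
    },
    "bin": {},
}


def pick(tree, path):
    """Follow a '/'-separated string path through nested dicts."""
    keys = [key for key in path.split("/") if key]  # drop the empty lead segment
    return reduce(lambda node, key: node[key], keys, tree)


print(pick(fs, "/usr/local"))  # {'bin': {}}
```

Walking the segments means an unknown segment fails with a lookup error instead of running whatever the string happens to contain.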
Did you know Rebol has a native path type?
Although this doesn't exactly answer your question, I thought I'd add a reference on how to use paths directly in Rebol. Rebol has a lot of datatypes, and when you can, you should leverage that rich language feature. Especially when you start to use and build dialects, knowing what types exist and how to use them becomes even more potent.
Here is an example of how you can build and run a path directly, without using strings. To represent a path within source code, you use the lit-path! datatype.
Example:
>> p: 'fs/usr/local
== fs/usr/local
>> do p
== [
bin []
]
You can even append to a path to manipulate it:
>> append p 'bin
== fs/usr/local/bin
>> do p
== []
If it were stored within a block, you would use a path! type directly (not a lit-path!):
>> p: [fs/usr/local]
== [fs/usr/local]
>> do first p
== [
bin []
]
Also note that using paths directly has advantages over using strings: a path is a series of words, which is easier to manipulate than a string. Example:
>> head change next p 'bin
== fs/bin/local
>> p: 'fs/path/issue/is
== fs/path/issue/is
>> head replace p 'is 'was
== fs/path/issue/was
as opposed to using a string:
>> p: "fs/path/issue/is"
== "fs/path/issue/is"
>> head replace p "is" "was"
== "fs/path/wassue/is"
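The same contrast between segment-wise and character-wise replacement can be reproduced in Python. This is only an illustrative sketch; the variable names are made up for the example.

```python
path_text = "fs/path/issue/is"

# Text replacement matches characters anywhere, so the first hit is inside "issue".
print(path_text.replace("is", "was", 1))  # fs/path/wassue/is

# Treating the path as a list of whole segments only touches exact matches.
segments = path_text.split("/")
renamed = ["was" if segment == "is" else segment for segment in segments]
print("/".join(renamed))  # fs/path/issue/was
```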
If you want to browse the disk instead of Rebol datasets, simply give 'FS a file! value, and the rest of the path will browse from there (this is how paths work on file! types):
fs: %/c/
read dirize fs/windows