Overriding Nextflow Parameters with Commandline Arguments - nextflow

Given the following nextflow.config:
google {
project = "cool-project"
region = "europe-west4"
lifeSciences {
bootDiskSize = "200 GB"
debug = true
preemptible = true
}
}
Is it possible to override one or more of those settings using command line arguments. For example, if I wanted to specify that no preemptible machines should be used, can I do the following:
nextflow run main.nf -c nextflow.config --google.lifeSciences.preemptible false
?

Overriding pipeline parameters can be done using Nextflow's command line interface by prefixing the parameter name with a double dash. For example, put the following in a file called 'test.nf':
#!/usr/bin/env nextflow
params.greeting = 'Hello'
names = Channel.of( "foo", "bar", "baz" )
process greet {
input:
val name from names
output:
stdout result
"""
echo "${params.greeting} ${name}"
"""
}
result.view { it.trim() }
And run it using:
nextflow run -ansi-log false test.nf --greeting 'Bonjour'
Results:
N E X T F L O W ~ version 20.10.0
Launching `test.nf` [backstabbing_cajal] - revision: 431ef92cef
[46/22b4f0] Submitted process > greet (1)
[ca/32992c] Submitted process > greet (3)
[6e/5880b0] Submitted process > greet (2)
Bonjour bar
Bonjour foo
Bonjour baz
This works fine for pipeline params, but AFAIK there's no way to directly override executor config like you describe on the command line. You can however, just parameterize these values and set them on the command line like described above. For example, in your nextflow.config:
params {
gc_region = false
gc_preemptible = true
...
}
profiles {
'test' {
includeConfig 'conf/test.config'
}
'google' {
includeConfig 'conf/google.config'
}
...
}
And in a file called 'conf/google.config':
google {
project = "cool-project"
region = params.gc_region
lifeSciences {
bootDiskSize = "200 GB"
debug = true
preemptible = params.gc_preemptible
}
}
Then you should be able to override these in the usual way:
nextflow run main.nf -profile google --gc_region "europe-west4" --gc_preemptible false
Note that you can also specify multiple configuration profiles by separating the profile names with a comma:
nextflow run main.nf -profile google,test ...

Related

Nextflow DSL2: how to combine outputs (channels) from multiple processes into input of another process by part of filename?

I'm trying to combine outputs from two separate processes A and B, where each of them outputs multiple files, into input of process C. All file names have in common a chromosome number(for example "chr1"). The process A outputs files: /path/chr1_qc.vcf.gz, /path/chr2_qc.vcf.gz and etc (genotype files).
Process B outputs files: /path/chr1.a.bcf, /path/chr1.b.bcf, /path/chr1.c.bcf.../path/chr2.a.bcf, /path/chr2.b.bcf and etc (region files). And the number of both file-sets could vary each time.
Part of the code:
process A {
module "bcftools/1.16"
publishDir "${params.out_dir}", mode: 'copy', overwrite: true
input:
path vcf
path tbi
output:
path ("${(vcf =~ /chr\d{1,2}/)[0]}_qc.vcf.gz")
script:
"""
bcftools view -R ${params.sites_list} -Oz -o ${(vcf =~ /chr\d{1,2}/)[0]}_qc.vcf.gz ${vcf} //generates QC-ed genome files
tabix -f ${(vcf =~ /chr\d{1,2}/)[0]}_qc.vcf.gz //indexing QC-ed genomes
"""
}
process B {
publishDir "${params.out_dir}", mode: 'copy', overwrite: true
input:
path(vcf)
output:
tuple path("${(vcf =~ /chr\d{1,2}/)[0]}.*.bed")
script:
"""
python split_chr.py ${params.chr_lims} ${vcf} //generates region files
"""
}
process C {
publishDir "${params.out_dir}", mode: 'copy', overwrite: true
input:
tuple path(vcf), path(bed)
output:
path "${bed.SimpleName}.vcf.gz"
script:
"""
bcftools view -R ${bed} -Oz -o ${bed.SimpleName}.vcf.gz ${vcf}
"""
}
workflow {
A(someprocess.out)
B(A.out)
C(combined_AB_files)
}
Process B output.view() output:
[/path/chr1.a.bed, /path/chr1.b.bed]
[/path/chr2.a.bed, /path/chr2.b.bed]
How can I get the process C to receive an input as a channel of tuples (A and B outputs combined by chromosome name) like this:
[ /path/chr1_qc.vcf.gz, /path/chr1.a.bcf ]
[ /path/chr1_qc.vcf.gz, /path/chr1.b.bcf ]
...
[ /path/chr2_qc.vcf.gz, /path/chr2.a.bcf ]
...
I think what you want is the second form of the combine operator, which allows you to combine items that share a matching key using the by parameter. If one or more of your channels are missing a shared key in the first element, you can just use the map operator to produce such a key. To get the desired output, use the transpose operator and specify the index (zero based) of the element to be transposed, again using the by parameter. For example:
workflow {
Channel
.fromPath( './data/*.bed' )
.map { tuple( it.simpleName, it ) }
.groupTuple()
.set { bed_files }
Channel
.fromPath( './data/*_qc.vcf.gz' )
.map { tuple( it.simpleName - ~/_qc$/, it ) }
.combine( bed_files, by: 0 )
.transpose( by: 2 )
.map { chrom, vcf, bed -> tuple( vcf, bed ) }
.view()
}
Results:
$ touch ./data/chr{1..3}.{a..c}.bed
$ touch ./data/chr{1..3}_qc.vcf.gz
$ nextflow run main.nf
N E X T F L O W ~ version 22.10.0
Launching `main.nf` [pedantic_woese] DSL2 - revision: 9c5abfca90
[/data/chr1_qc.vcf.gz, /data/chr1.c.bed]
[/data/chr1_qc.vcf.gz, /data/chr1.a.bed]
[/data/chr1_qc.vcf.gz, /data/chr1.b.bed]
[/data/chr2_qc.vcf.gz, /data/chr2.b.bed]
[/data/chr2_qc.vcf.gz, /data/chr2.a.bed]
[/data/chr2_qc.vcf.gz, /data/chr2.c.bed]
[/data/chr3_qc.vcf.gz, /data/chr3.c.bed]
[/data/chr3_qc.vcf.gz, /data/chr3.b.bed]
[/data/chr3_qc.vcf.gz, /data/chr3.a.bed]
Note that when two or more queue channels are declared as process inputs (like in your process A), the process will block until it receives a value from each input channel. As these are run in parallel and asynchronously, there's no guarantee that items will be emitted in the order that they were received. This can result in mix-ups, where for example, you unexpectedly end up with an index file that belongs to another VCF. Most of the time, what you want is one queue channel and one or more value channels. The section in the docs on multiple input channels explains this quite well in my opinion, and is well worth the time reading if you haven't already. Also, joining and combining channels becomes a lot easier when your processes define tuples in their input and output declarations, where the first element is a key, like a sample name/id. I think you want something something like the following:
params.vcf_files = './data/*.vcf.gz{,.tbi}'
params.sites_list = './data/sites.tsv'
params.chr_lims = './data/file.txt'
params.outdir = './results'
process proc_A {
tag "${sample}: ${indexed_vcf.first()}"
publishDir "${params.outdir}/proc_A", mode: 'copy', overwrite: true
module "bcftools/1.16"
input:
tuple val(sample), path(indexed_vcf)
path sites_list
output:
tuple val(sample), path("${sample}_qc.vcf.gz{,.tbi}")
script:
def vcf = indexed_vcf.first()
"""
bcftools view \\
-R "${sites_list}" \\
-Oz \\
-o "${sample}_qc.vcf.gz" \\
"${vcf}"
bcftools index \\
-t \\
"${sample}_qc.vcf.gz"
"""
}
process proc_B {
tag "${sample}: ${indexed_vcf.first()}"
publishDir "${params.outdir}/proc_B", mode: 'copy', overwrite: true
input:
tuple val(sample), path(indexed_vcf)
path chr_lims
output:
tuple val(sample), path("*.bed")
script:
def vcf = indexed_vcf.first()
"""
split_chr.py "${chr_lims}" "${vcf}"
"""
}
process proc_C {
tag "${sample}: ${indexed_vcf.first()}: ${bed.name}"
publishDir "${params.outdir}/proc_C", mode: 'copy', overwrite: true
input:
tuple val(sample), path(indexed_vcf), path(bed)
output:
tuple val(sample), path("${bed.simpleName}.vcf.gz")
script:
def vcf = indexed_vcf.first()
"""
bcftools view \\
-R "${bed}" \\
-Oz \\
-o "${bed.simpleName}.vcf.gz" \\
"${vcf}"
"""
}
workflow {
vcf_files = Channel.fromFilePairs( params.vcf_files )
sites_list = file( params.sites_list )
chr_lims = file( params.chr_lims )
proc_A( vcf_files, sites_list )
proc_B( proc_A.out, chr_lims )
proc_A.out \
| combine( proc_B.out, by: 0 ) \
| map { sample, indexed_vcf, bed_files ->
bed_list = bed_files instanceof Path ? [bed_files] : bed_files
tuple( sample, indexed_vcf, bed_list )
} \
| transpose( by: 2 ) \
| proc_C \
| view()
}
The above should produce results like:
$ nextflow run main.nf
N E X T F L O W ~ version 22.10.0
Launching `main.nf` [mighty_elion] DSL2 - revision: 5ea25ae72c
executor > local (15)
[b4/08df9d] process > proc_A (foo: foo.vcf.gz) [100%] 3 of 3 ✔
[93/55e467] process > proc_B (foo: foo_qc.vcf.gz) [100%] 3 of 3 ✔
[8b/cd7193] process > proc_C (foo: foo_qc.vcf.gz: b.bed) [100%] 9 of 9 ✔
[bar, ./work/90/53b9c6468ca54bb0f4eeb99ca82eda/a.vcf.gz]
[bar, ./work/24/cca839d5f63ee6988ead96dc9fbe1d/b.vcf.gz]
[bar, ./work/6f/61e1587134e68d2e358998f61f6459/c.vcf.gz]
[baz, ./work/f8/1484e94b9187ba6aae81d68f0a18cf/b.vcf.gz]
[baz, ./work/9c/20578262f5a2c13c6c3b566dc7b7d8/c.vcf.gz]
[baz, ./work/f5/3405b54f81f6f500a3ee4a78f5e6df/a.vcf.gz]
[foo, ./work/39/945fb0d3f375260e75afbc9caebc5d/a.vcf.gz]
[foo, ./work/de/cecd94ff39f204e799cb8e4c4ad46f/c.vcf.gz]
[foo, ./work/8b/cd7193107f6be5472d2e29982e3319/b.vcf.gz]
Also note that third party scripts, like your Python script, can be moved to a folder called bin in the root of your project repository (i.e. the same directory as your main.nf). And if you make your script executable, you will be able to invoke "as-is", i.e. without the need for an absolute path to it.
This can be done with channel operators. Check the code below, with some comments:
workflow {
// Let's start by building channels similar to the ones you described
Channel
.of(file('/path/chr1_qc.vcf.gz'), file('/path/chr2_qc.vcf.gz'))
.set { pAoutput}
Channel
.of(file('/path/chr1.a.bcf'), file('/path/chr1.b.bcf'), file('/path/chr1.c.bcf'),
file('/path/chr2.a.bcf'), file('/path/chr2.b.bcf'), file('/path/chr2.c.bcf'))
.set { pBoutput }
// Now, let's create keys to relate the elements in the two channels
pAoutput
.map { filepath -> [filepath.name.tokenize('_')[0], filepath ] }
.set { pAoutput_tuple }
// The channel now looks like this:
// [chr1, /path/chr1_qc.vcf.gz]
// [chr2, /path/chr2_qc.vcf.gz]
pBoutput
.map { filepath -> [filepath.name.tokenize('.')[0], filepath ] }
.set { pBoutput_tuple }
// And:
// [chr1, /path/chr1.a.bcf]
// [chr1, /path/chr1.b.bcf]
// [chr1, /path/chr1.c.bcf]
// [chr2, /path/chr2.a.bcf]
// [chr2, /path/chr2.b.bcf]
// [chr2, /path/chr2.c.bcf]
// Combine the two channels and group by key
pAoutput_tuple
.mix(pBoutput_tuple)
.groupTuple()
.flatMap { chrom, path_list ->
path_list.split {
it.name.endsWith('.vcf.gz')
}.combinations()
}
.view()
}
You can check the output below:
N E X T F L O W ~ version 22.10.4
Launching `ex.nf` [maniac_pike] DSL2 - revision: f87873ef13
[/path/chr1_qc.vcf.gz, /path/chr1.a.bcf]
[/path/chr1_qc.vcf.gz, /path/chr1.b.bcf]
[/path/chr1_qc.vcf.gz, /path/chr1.c.bcf]
[/path/chr2_qc.vcf.gz, /path/chr2.a.bcf]
[/path/chr2_qc.vcf.gz, /path/chr2.b.bcf]
[/path/chr2_qc.vcf.gz, /path/chr2.c.bcf]

Nextflow: Not all items in channel used by process

I've been struggling to identify why a nextflow (v20.10.00) process is not using all the items in a channel. I want the process to run for each sample bam file (10 in total) and for each chromosome (3 in total).
Here is the creation of the channels and the process:
ref_genome = file( params.RefGen, checkIfExists: true )
ref_dir = ref_genome.getParent()
ref_name = ref_genome.getBaseName()
ref_dict = file( "${ref_dir}/${ref_name}.dict", checkIfExists: true )
ref_index = file( "${ref_dir}/${ref_name}.*.fai", checkIfExists: true )
// Handles reading in data if the previous step is skipped
if( params.Skip_BP ){
Channel
.fromFilePairs("${params.ProcBamDir}/*{bam,bai}") { file -> file.name.replaceAll(/.bam|.bai$/,'') }
.ifEmpty { error "No bams found in ${params.ProcBamDir}" }
.map { ID, files -> tuple(ID, files[0], files[1]) }
.set { processed_bams }
}
// Setting up the chromosome channel
if( params.Chroms == "" ){
// Defaulting to using all chromosomes
chromosomes_ch = Channel
.from("AgamP4_2L", "AgamP4_2R", "AgamP4_3L", "AgamP4_3R", "AgamP4_X", "AgamP4_Y_unplaced", "AgamP4_UNKN")
println "No chromosomes specified, using all major chromosomes: AgamP4_2L, AgamP4_2R, AgamP4_3L, AgamP4_3R, AgamP4_X, AgamP4_Y_unplaced, AgamP4_UNKN"
} else {
// User option to choose which chromosome will be used
// This worked with the following syntax nextflow run testing.nf --profile imperial --Chroms "AgamP4_3R,AgamP4_2L"
chrs = params.Chroms.split(",")
chromosomes_ch = Channel
.from( chrs )
println "User defined chromosomes set: ${params.Chroms}"
}
process DNA_HCG {
errorStrategy { sleep(Math.pow(2, task.attempt) * 600 as long); return 'retry' }
maxRetries 3
maxForks params.HCG_Forks
tag { SampleID+"-"+chrom }
executor = 'pbspro'
clusterOptions = "-lselect=1:ncpus=${params.HCG_threads}:mem=${params.HCG_memory}gb:mpiprocs=1:ompthreads=${params.HCG_threads} -lwalltime=${params.HCG_walltime}:00:00"
publishDir(
path: "${params.HCDir}",
mode: 'copy',
)
input:
each chrom from chromosomes_ch
set SampleID, path(bam), path(bai) from processed_bams
path ref_genome
path ref_dict
path ref_index
output:
tuple chrom, path("${SampleID}-${chrom}.vcf") into HCG_ch
path("${SampleID}-${chrom}.vcf.idx") into idx_ch
beforeScript 'module load anaconda3/personal; source activate NF_GATK'
script:
"""
if [ ! -d tmp ]; then mkdir tmp; fi
taskset -c 0-${params.HCG_threads} gatk --java-options \"-Xmx${params.HCG_memory}G -XX:+UseParallelGC -XX:ParallelGCThreads=${params.HCG_threads}\" HaplotypeCaller \\
--tmp-dir tmp/ \\
--pair-hmm-implementation AVX_LOGLESS_CACHING_OMP \\
--native-pair-hmm-threads ${params.HCG_threads} \\
-ERC GVCF \\
-L ${chrom} \\
-R ${ref_genome} \\
-I ${bam} \\
-O ${SampleID}-${chrom}.vcf ${params.GVCF_args}
"""
}
But for reasons I cannot figure out, nextflow only creates 3 jobs: [d8/45499b] process > DNA_HCG (0_wt5_BP-CM029350.1) [ 0%] 0 of 3
I thought maybe it was because it only took the first sample and then one process for each chromosome. Though I doubted this since the code works for a different reference genome correctly. Regardless, I adjusted the input channels:
processed_bams
.combine(chromosomes_ch)
.set { HCG_in }
and
input:
set SampleID, path(bam), path(bai), chrom from HCG_in
But this resulted in only a single job being created: [6e/78b070] process > DNA_HCG (0_wt10_BP-CM029350.1) [ 0%] 0 of 1
Confusingly, when i use HCG_in.view() there are 30 items. And to further confuse me the correct number of jobs comes from the following code:
chrs = params.Chroms.split(",")
chromosomes_ch = Channel
.from(chrs)
Channel
.fromFilePairs("${params.ProcBamDir}/*{bam,bai}") { file -> file.name.replaceAll(/.bam|.bai$/,'') }
.ifEmpty { error "No bams found in ${params.ProcBamDir}" }
.map { ID, files -> tuple(ID, files[0], files[1]) }
.set { processed_bams }
process HCG {
executor 'local'
input:
each chrom from chromosomes_ch
set SampleID, path(bam), path(bai) from processed_bams
//set SampleID, path(bam), path(bai), chrom from HCG_in
script:
"""
echo "${SampleID} - ${chrom}"
"""
}
Output: [75/c1c25a] process > HCG (27) [100%] 30 of 30 ✔
I'm hoping I've just missed something obvious, but I cannot see it at the moment. Thanks in advance for the help.
Issues like this almost always involve the use of multiple input channels:
When two or more channels are declared as process inputs, the process
stops until there’s a complete input configuration ie. it receives an
input value from all the channels declared as input.
Your initial assessment was correct. However, the reason only three processes were run (i.e. one sample for each of the three chromosomes), is because this line (probably) returned a list (i.e. a java LinkedList) containing a single element, and lists behave like queue channels:
ref_index = file( "${ref_dir}/${ref_name}.*.fai", checkIfExists: true )
You might have expected this to return a UnixPath. Ultimately, the solution is to ensure ref_index is value channel.

Why do I get a `java.nio.file.ProviderMismatchException` when I access `isEmpty()` on a staged file

I am getting a java.nio.file.ProviderMismatchException when I run the following script:
process a {
output:
file _biosample_id optional true into biosample_id
script:
"""
touch _biosample_id
"""
}
process b {
input:
file _biosample_id from biosample_id.ifEmpty{file("_biosample_id")}
script:
def biosample_id_option = _biosample_id.isEmpty() ? '' : "--biosample_id \$(cat _biosample_id)"
"""
echo \$(cat ${_biosample_id})
"""
}
i'm using a slightly modified version of Optional Input pattern.
Any ideas on why I'm getting the java.nio.file.ProviderMismatchException?
In your script block, _biosample_id is actually an instance of the nextflow.processor.TaskPath class. So to check if the file (or directory) is empty you can just call it's .empty() method. For example:
script:
def biosample_id_option = _biosample_id.empty() ? '' : "--biosample_id \$(< _biosample_id)"
I like your solution - I think it's neat. And I think it should be robust (but I haven't tested it). The optional input pattern that is recommended will fail when attempting to stage missing input files to a remote filesystem/object store. There is a solution however, which is to keep an empty file in your $baseDir and point to it in your scripts. For example:
params.inputs = 'prots/*{1,2,3}.fa'
params.filter = "${baseDir}/assets/null/NO_FILE"
prots_ch = Channel.fromPath(params.inputs)
opt_file = file(params.filter)
process foo {
input:
file seq from prots_ch
file opt from opt_file
script:
def filter = opt.name != 'NO_FILE' ? "--filter $opt" : ''
"""
your_commad --input $seq $filter
"""
}

channel checks as empty even if it has content

I am trying to have a process that is launched only if a combination of conditions is met, but when checking if a channel has a path to a file, it always returns it as empty. Probably I am doing something wrong, in that case please correct my code. I tried to follow some of the suggestions in this issue but no success.
Consider the following minimal example:
process one {
output:
file("test.txt") into _chProcessTwo
script:
"""
echo "Hello world" > "test.txt"
"""
}
// making a copy so I check first if something in the channel or not
// avoids raising exception of MultipleInputChannel
_chProcessTwo.into{
_chProcessTwoView;
_chProcessTwoCheck;
_chProcessTwoUse
}
//print contents of channel
println "Channel contents: " + _chProcessTwoView.toList().view()
process two {
input:
file(myInput) from _chProcessTwoUse
when:
(!_chProcessTwoCheck.toList().isEmpty())
script:
def test = _chProcessTwoUse.toList().isEmpty() ? "I'm empty" : "I'm NOT empty"
println "The outcome is: " + test
}
I want to have process two run if and only if there is a file in the _chProcessTwo channel.
If I run the above code I obtain:
marius#dev:~/pipeline$ ./bin/nextflow run test.nf
N E X T F L O W ~ version 19.09.0-edge
Launching `test.nf` [infallible_gutenberg] - revision: 9f57464dc1
[c8/bf38f5] process > one [100%] 1 of 1 ✔
[- ] process > two -
[/home/marius/pipeline/work/c8/bf38f595d759686a497bb4a49e9778/test.txt]
where the last line are actually the contents of _chProcessTwoView
If I remove the when directive from the second process I get:
marius#mg-dev:~/pipeline$ ./bin/nextflow run test.nf
N E X T F L O W ~ version 19.09.0-edge
Launching `test.nf` [modest_descartes] - revision: 5b2bbfea6a
[57/1b7b97] process > one [100%] 1 of 1 ✔
[a9/e4b82d] process > two [100%] 1 of 1 ✔
[/home/marius/pipeline/work/57/1b7b979933ca9e936a3c0bb640c37e/test.txt]
with the contents of the second worker .command.log file being: The outcome is: I'm empty
I tried also without toList()
What am I doing wrong? Thank you in advance
Update: a workaround would be to check _chProcessTwoUse.view() != "" but that is pretty dirty
Update 2 as required by #Steve, I've updated the code to reflect a bit more the actual conditions i have in my own pipeline:
def runProcessOne = true
process one {
when:
runProcessOne
output:
file("inputProcessTwo.txt") into _chProcessTwo optional true
file("inputProcessThree.txt") into _chProcessThree optional true
script:
// this would replace the probability that output is not created
def outputSomething = false
"""
if ${outputSomething}; then
echo "Hello world" > "inputProcessTwo.txt"
echo "Goodbye world" > "inputProcessThree.txt"
else
echo "Sorry. Process one did not write to file."
fi
"""
}
// making a copy so I check first if something in the channel or not
// avoids raising exception of MultipleInputChannel
_chProcessTwo.into{
_chProcessTwoView;
_chProcessTwoCheck;
_chProcessTwoUse
}
//print contents of channel
println "Channel contents: " + _chProcessTwoView.view()
println _chProcessTwoView.view() ? "Me empty" : "NOT empty"
process two {
input:
file(myInput) from _chProcessTwoUse
when:
(runProcessOne)
script:
"""
echo "The outcome is: ${myInput}"
"""
}
process three {
input:
file(defaultInput) from _chUpstreamProcesses
file(inputFromProcessTwo) from _chProcessThree
script:
def extra_parameters = _chProcessThree.isEmpty() ? "" : "--extra-input " + inputFromProcessTwo
"""
echo "Hooray! We got: ${extra_parameters}"
"""
}
As #Steve mentioned, I should not even check if a channel is empty, NextFlow should know better to not initiate the process. But I think in this construct I will have to.
Marius
I think part of the problem here is that process 'one' creates only optional outputs. This makes dealing with the optional inputs in process 'three' a bit tricky. I would try to reconcile this if possible. If this can't be reconciled, then you'll need to deal with the optional inputs in process 'three'. To do this, you'll basically need to create a dummy file, pass it into the channel using the ifEmpty operator, then use the name of the dummy file to check whether or not to prepend the argument's prefix. It's a bit of a hack, but it works pretty well.
The first step is to actually create the dummy file. I like shareable pipelines, so I would just create this in your baseDir, perhaps under a folder called 'assets':
mkdir assets
touch assets/NO_FILE
Then pass in your dummy file if your '_chProcessThree' channel is empty:
params.dummy_file = "${baseDir}/assets/NO_FILE"
dummy_file = file(params.dummy_file)
process three {
input:
file(defaultInput) from _chUpstreamProcesses
file(optfile) from _chProcessThree.ifEmpty(dummy_file)
script:
def extra_parameters = optfile.name != 'NO_FILE' ? "--extra-input ${optfile}" : ''
"""
echo "Hooray! We got: ${extra_parameters}"
"""
}
Also, these lines are problematic:
//print contents of channel
println "Channel contents: " + _chProcessTwoView.view()
println _chProcessTwoView.view() ? "Me empty" : "NOT empty"
Calling view() will emit all values from the channel to stdout. You can ignore whatever value it returns. Unless you enable DSL2, the channel will then be empty. I think what you're looking for here is a closure:
_chProcessTwoView.view { "Found: $it" }
Be sure to append -ansi-log false to your nextflow run command so the output doesn't get clobbered. HTH.

Gradle: How to Display Test Results in the Console in Real Time?

I would like to see test results ( system.out/err, log messages from components being tested ) as they run in the same console I run:
gradle test
And not wait until tests are done to look at the test reports ( that are only generated when tests are completed, so I can't "tail -f" anything while tests are running )
Here is my fancy version:
import org.gradle.api.tasks.testing.logging.TestExceptionFormat
import org.gradle.api.tasks.testing.logging.TestLogEvent
tasks.withType(Test) {
testLogging {
// set options for log level LIFECYCLE
events TestLogEvent.FAILED,
TestLogEvent.PASSED,
TestLogEvent.SKIPPED,
TestLogEvent.STANDARD_OUT
exceptionFormat TestExceptionFormat.FULL
showExceptions true
showCauses true
showStackTraces true
// set options for log level DEBUG and INFO
debug {
events TestLogEvent.STARTED,
TestLogEvent.FAILED,
TestLogEvent.PASSED,
TestLogEvent.SKIPPED,
TestLogEvent.STANDARD_ERROR,
TestLogEvent.STANDARD_OUT
exceptionFormat TestExceptionFormat.FULL
}
info.events = debug.events
info.exceptionFormat = debug.exceptionFormat
afterSuite { desc, result ->
if (!desc.parent) { // will match the outermost suite
def output = "Results: ${result.resultType} (${result.testCount} tests, ${result.successfulTestCount} passed, ${result.failedTestCount} failed, ${result.skippedTestCount} skipped)"
def startItem = '| ', endItem = ' |'
def repeatLength = startItem.length() + output.length() + endItem.length()
println('\n' + ('-' * repeatLength) + '\n' + startItem + output + endItem + '\n' + ('-' * repeatLength))
}
}
}
}
You could run Gradle with INFO logging level on the command line. It'll show you the result of each test while they are running. Downside is that you will get far more output for other tasks also.
gradle test -i
Disclaimer: I am the developer of the Gradle Test Logger Plugin.
You can simply use the Gradle Test Logger Plugin to print beautiful logs on the console. The plugin uses sensible defaults to satisfy most users with little or no configuration but also offers a number of themes and configuration options to suit everyone.
Examples
Standard theme
Mocha theme
Usage
plugins {
id 'com.adarshr.test-logger' version '<version>'
}
Make sure you always get the latest version from Gradle Central.
Configuration
You don't need any configuration at all. However, the plugin offers a few options. This can be done as follows (default values shown):
testlogger {
// pick a theme - mocha, standard, plain, mocha-parallel, standard-parallel or plain-parallel
theme 'standard'
// set to false to disable detailed failure logs
showExceptions true
// set to false to hide stack traces
showStackTraces true
// set to true to remove any filtering applied to stack traces
showFullStackTraces false
// set to false to hide exception causes
showCauses true
// set threshold in milliseconds to highlight slow tests
slowThreshold 2000
// displays a breakdown of passes, failures and skips along with total duration
showSummary true
// set to true to see simple class names
showSimpleNames false
// set to false to hide passed tests
showPassed true
// set to false to hide skipped tests
showSkipped true
// set to false to hide failed tests
showFailed true
// enable to see standard out and error streams inline with the test results
showStandardStreams false
// set to false to hide passed standard out and error streams
showPassedStandardStreams true
// set to false to hide skipped standard out and error streams
showSkippedStandardStreams true
// set to false to hide failed standard out and error streams
showFailedStandardStreams true
}
I hope you will enjoy using it.
You can add a Groovy closure inside your build.gradle file that does the logging for you:
test {
afterTest { desc, result ->
logger.quiet "Executing test ${desc.name} [${desc.className}] with result: ${result.resultType}"
}
}
On your console it then reads like this:
:compileJava UP-TO-DATE
:compileGroovy
:processResources
:classes
:jar
:assemble
:compileTestJava
:compileTestGroovy
:processTestResources
:testClasses
:test
Executing test maturesShouldBeCharged11DollarsForDefaultMovie [movietickets.MovieTicketsTests] with result: SUCCESS
Executing test studentsShouldBeCharged8DollarsForDefaultMovie [movietickets.MovieTicketsTests] with result: SUCCESS
Executing test seniorsShouldBeCharged6DollarsForDefaultMovie [movietickets.MovieTicketsTests] with result: SUCCESS
Executing test childrenShouldBeCharged5DollarsAnd50CentForDefaultMovie [movietickets.MovieTicketsTests] with result: SUCCESS
:check
:build
Since version 1.1 Gradle supports much more options to log test output. With those options at hand you can achieve a similar output with the following configuration:
test {
testLogging {
events "passed", "skipped", "failed"
}
}
As stefanglase answered:
adding the following code to your build.gradle (since version 1.1) works fine for output on passed, skipped and failed tests.
test {
testLogging {
events "passed", "skipped", "failed", "standardOut", "standardError"
}
}
What I want to say additionally (I found out this is a problem for starters) is that the gradle test command executes the test only one time per change.
So if you are running it the second time there will be no output on test results. You can also see this in the building output: gradle then says UP-TO-DATE on tests. So its not executed a n-th time.
Smart gradle!
If you want to force the test cases to run, use gradle cleanTest test.
This is slightly off topic but I hope it will help some newbies.
edit
As sparc_spread stated in the comments:
If you want to force gradle to always run fresh tests (which might not always be a good idea) you can add outputs.upToDateWhen {false} to testLogging { [...] }. Continue reading here.
Peace.
Add this to build.gradle to stop gradle from swallowing stdout and stderr.
test {
testLogging.showStandardStreams = true
}
It's documented here.
'test' task does not work for Android plugin, for Android plugin use the following:
// Test Logging
tasks.withType(Test) {
testLogging {
events "started", "passed", "skipped", "failed"
}
}
See the following: https://stackoverflow.com/a/31665341/3521637
As a follow up to Shubham's great answer I like to suggest using enum values instead of strings. Please take a look at the documentation of the TestLogging class.
import org.gradle.api.tasks.testing.logging.TestExceptionFormat
import org.gradle.api.tasks.testing.logging.TestLogEvent
tasks.withType(Test) {
testLogging {
events TestLogEvent.FAILED,
TestLogEvent.PASSED,
TestLogEvent.SKIPPED,
TestLogEvent.STANDARD_ERROR,
TestLogEvent.STANDARD_OUT
exceptionFormat TestExceptionFormat.FULL
showCauses true
showExceptions true
showStackTraces true
}
}
My favourite minimalistic version based on Shubham Chaudhary answer.
Put this in build.gradle file:
test {
afterSuite { desc, result ->
if (!desc.parent)
println("${result.resultType} " +
"(${result.testCount} tests, " +
"${result.successfulTestCount} successes, " +
"${result.failedTestCount} failures, " +
"${result.skippedTestCount} skipped)")
}
}
In Gradle using Android plugin:
gradle.projectsEvaluated {
tasks.withType(Test) { task ->
task.afterTest { desc, result ->
println "Executing test ${desc.name} [${desc.className}] with result: ${result.resultType}"
}
}
}
Then the output will be:
Executing test testConversionMinutes [org.example.app.test.DurationTest] with result: SUCCESS
If you have a build.gradle.kts written in Kotlin DSL you can print test results with (I was developing a kotlin multi-platform project, with no "java" plugin applied):
tasks.withType<AbstractTestTask> {
afterSuite(KotlinClosure2({ desc: TestDescriptor, result: TestResult ->
if (desc.parent == null) { // will match the outermost suite
println("Results: ${result.resultType} (${result.testCount} tests, ${result.successfulTestCount} successes, ${result.failedTestCount} failures, ${result.skippedTestCount} skipped)")
}
}))
}
Just add the following closure to your build.gradle. the output will be printed after the execution of every test.
test{
useJUnitPlatform()
afterTest { desc, result ->
def output = "Class name: ${desc.className}, Test name: ${desc.name}, (Test status: ${result.resultType})"
println( '\n' + output)
}
}
Merge of Shubham's great answer and JJD use enum instead of string
tasks.withType(Test) {
testLogging {
// set options for log level LIFECYCLE
events TestLogEvent.PASSED,
TestLogEvent.SKIPPED, TestLogEvent.FAILED, TestLogEvent.STANDARD_OUT
showExceptions true
exceptionFormat TestExceptionFormat.FULL
showCauses true
showStackTraces true
// set options for log level DEBUG and INFO
debug {
events TestLogEvent.STARTED, TestLogEvent.PASSED, TestLogEvent.SKIPPED, TestLogEvent.FAILED, TestLogEvent.STANDARD_OUT, TestLogEvent.STANDARD_ERROR
exceptionFormat TestExceptionFormat.FULL
}
info.events = debug.events
info.exceptionFormat = debug.exceptionFormat
afterSuite { desc, result ->
if (!desc.parent) { // will match the outermost suite
def output = "Results: ${result.resultType} (${result.testCount} tests, ${result.successfulTestCount} successes, ${result.failedTestCount} failures, ${result.skippedTestCount} skipped)"
def startItem = '| ', endItem = ' |'
def repeatLength = startItem.length() + output.length() + endItem.length()
println('\n' + ('-' * repeatLength) + '\n' + startItem + output + endItem + '\n' + ('-' * repeatLength))
}
}
}
}
Following on from Benjamin Muschko's answer (19 March 2011), you can use the -i flag along with grep, to filter out 1000s of unwanted lines. Examples:
Strong filter - Only display each unit test name and test result and the overall build status. Setup errors or exceptions are not displayed.
./gradlew test -i | grep -E " > |BUILD"
Soft filter - Display each unit test name and test result, as well as setup errors/exceptions. But it will also include some irrelevant info:
./gradlew test -i | grep -E -v "^Executing |^Creating |^Parsing |^Using |^Merging |^Download |^title=Compiling|^AAPT|^future=|^task=|:app:|V/InstrumentationResultParser:"
Soft filter, Alternative syntax: (search tokens are split into individual strings)
./gradlew test -i | grep -v -e "^Executing " -e "^Creating " -e "^Parsing " -e "^Using " -e "^Merging " -e "^Download " -e "^title=Compiling" -e "^AAPT" -e "^future=" -e "^task=" -e ":app:" -e "V/InstrumentationResultParser:"
Explanation of how it works:
The first command is ./gradlew test -i and "-i" means "Info/Verbose" mode, which prints the result of each test in real-time, but also displays large amounts of unwanted debug lines.
So the output of the first command, ./gradlew test -i, is piped to a second command grep, which will filter out many unwanted lines, based on a regular expression. "-E" enables the regular expression mode for a single string; "-e" enables regular expressions for multiple strings; and "|" in the regex string means "or".
In the strong filter, a unit test name and test result is allowed to display using " > ", and the overall status is allowed with "BUILD".
In the soft filter, the "-v" flag means "not containing" and "^" means "start of line". So it strips out all lines that start with "Executing " or start with "Creating ", etc.
Example for Android instrumentation unit tests, with gradle 5.1:
./gradlew connectedDebugAndroidTest --continue -i | grep -v -e \
"^Transforming " -e "^Skipping " -e "^Cache " -e "^Performance " -e "^Creating " -e \
"^Parsing " -e "^file " -e "ddms: " -e ":app:" -e "V/InstrumentationResultParser:"
Example for Jacoco unit test coverage, with gradle 4.10:
./gradlew createDebugCoverageReport --continue -i | grep -E -v "^Executing |^Creating |^Parsing |^Using |^Merging |^Download |^title=Compiling|^AAPT|^future=|^task=|:app:|V/InstrumentationResultParser:"
For Android, this works nicely:
android {
...
testOptions {
unitTests.all {
testLogging {
outputs.upToDateWhen { false }
events "passed", "failed", "skipped", "standardError"
showCauses true
showExceptions true
}
}
}
}
See Running Android unit / instrumentation tests from the console
For those using Kotlin DSL, you can do:
tasks {
named<Test>("test") {
testLogging.showStandardStreams = true
}
}
A more comprehensive response for those using th Kotlin DSL:
subprojects {
// all the other stuff
// ...
tasks.named<Test>("test") {
useJUnitPlatform()
setupTestLogging()
}
}
fun Test.setupTestLogging() {
testLogging {
events(
org.gradle.api.tasks.testing.logging.TestLogEvent.FAILED,
org.gradle.api.tasks.testing.logging.TestLogEvent.PASSED,
org.gradle.api.tasks.testing.logging.TestLogEvent.SKIPPED,
org.gradle.api.tasks.testing.logging.TestLogEvent.STANDARD_OUT,
)
exceptionFormat = org.gradle.api.tasks.testing.logging.TestExceptionFormat.FULL
showExceptions = true
showCauses = true
showStackTraces = true
addTestListener(object : TestListener {
override fun beforeSuite(suite: TestDescriptor) {}
override fun beforeTest(testDescriptor: TestDescriptor) {}
override fun afterTest(testDescriptor: TestDescriptor, result: TestResult) {}
override fun afterSuite(suite: TestDescriptor, result: TestResult) {
if (suite.parent != null) { // will match the outermost suite
val output = "Results: ${result.resultType} (${result.testCount} tests, ${result.successfulTestCount} passed, ${result.failedTestCount} failed, ${result.skippedTestCount} skipped)"
val startItem = "| "
val endItem = " |"
val repeatLength = startItem.length + output.length + endItem.length
val messages = """
${(1..repeatLength).joinToString("") { "-" }}
$startItem$output$endItem
${(1..repeatLength).joinToString("") { "-" }}
""".trimIndent()
println(messages)
}
}
})
}
}
This should produce an output close to #odemolliens answers.
If you are using jupiter and none of the answers work, consider verifying it is setup correctly:
test {
useJUnitPlatform()
outputs.upToDateWhen { false }
}
dependencies {
testImplementation 'org.junit.jupiter:junit-jupiter-api:5.7.0'
testRuntimeOnly 'org.junit.jupiter:junit-jupiter-engine:5.7.0'
}
And then try the accepted answers
I've written a test logger for Kotlin DSL. You can put this block on your project scope build.gradle.kts file.
subprojects {
tasks.withType(Test::class.java) {
testLogging {
showCauses = false
showExceptions = false
showStackTraces = false
showStandardStreams = false
val ansiReset = "\u001B[0m"
val ansiGreen = "\u001B[32m"
val ansiRed = "\u001B[31m"
val ansiYellow = "\u001B[33m"
fun getColoredResultType(resultType: ResultType): String {
return when (resultType) {
ResultType.SUCCESS -> "$ansiGreen $resultType $ansiReset"
ResultType.FAILURE -> "$ansiRed $resultType $ansiReset"
ResultType.SKIPPED -> "$ansiYellow $resultType $ansiReset"
}
}
afterTest(
KotlinClosure2({ desc: TestDescriptor, result: TestResult ->
println("${desc.className} | ${desc.displayName} = ${getColoredResultType(result.resultType)}")
})
)
afterSuite(
KotlinClosure2({ desc: TestDescriptor, result: TestResult ->
if (desc.parent == null) {
println("Result: ${result.resultType} (${result.testCount} tests, ${result.successfulTestCount} passed, ${result.failedTestCount} failed, ${result.skippedTestCount} skipped)")
}
})
)
}
}
}