How to call a process in workflow.onError - nextflow

I have this small pipeline:
process test {
    """
    echo 'hello'
    exit 1
    """
}
workflow.onError {
    process finish_error {
        script:
        """
        echo 'blablabla'
        """
    }
}
I want to trigger a Python script if the pipeline fails, using the finish_error process, but this process never seems to be triggered, even with a simple echo example:
nextflow run test.nf
N E X T F L O W ~ version 20.10.0
Launching `test.nf` [cheesy_banach] - revision: 9020d641ca
executor > local (1)
[56/994298] process > test [100%] 1 of 1, failed: 1 ✘
[- ] process > finish_error [ 0%] 0 of 1
Error executing process > 'test'
Caused by:
Process `test` terminated with an error exit status (1)
Command executed:
echo 'hello'
exit 1
Command exit status:
1
Command output:
hello
Command wrapper:
hello
Work dir:
/home/joost/nextflow/work/56/9942985fc9948fd9bf7797d39c1785
Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line
How can I trigger this finish_error process, and how can I view its output?

The onError handler is invoked when a process causes pipeline execution to terminate prematurely. Since a Nextflow pipeline is really just a series of processes joined together, launching another pipeline process from within an event handler doesn't make much sense to me. If your python script should be run using the local executor, you can just execute it in the usual way. This example assumes your script is executable and has an appropriate shebang:
process test {
    """
    echo 'hello'
    exit 1
    """
}

workflow.onError {
    def proc = "${baseDir}/test.py".execute()
    proc.waitFor()
    println proc.text
}
Run using:
nextflow run -ansi-log false test.nf
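The two assumptions above can be checked from the shell; a minimal sketch (test.py is the hypothetical handler script from the example):

head -n 1 test.py   # expect a shebang such as: #!/usr/bin/env python3
chmod +x test.py    # make the script executable so .execute() can launch it directly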

Related

How do I prevent script from crashing as a result of failed proc?

I've got this:
try { run 'tar', '-zxvf', $path.Str, "$dir/META6.json", :err }
Despite being in a try{} block, this line is still causing my script to crash:
The spawned command 'tar' exited unsuccessfully (exit code: 1, signal: 0)
in block at ./all-versions.raku line 27
in block at ./all-versions.raku line 16
in block <unit> at ./all-versions.raku line 13
Why isn't the try{} block allowing the script to continue and how can I get it to continue?
That's because the run didn't fail (yet). run returns a Proc object. And that by itself doesn't throw (yet).
try just returns that Proc object. As soon as the returned value is used, however (for instance, by having it sunk), it will throw.
Compare (with immediate sinking):
$ raku -e 'run "foo"'
The spawned command 'foo' exited unsuccessfully (exit code: 1, signal: 0)
with:
$ raku -e 'my $a = run "foo"; say "ran, going to sink"; $a.sink'
ran, going to sink
The spawned command 'foo' exited unsuccessfully (exit code: 1, signal: 0)
Now, what causes the usage of the Proc object in your code is unclear. You'd have to show more code.
A way to check for success is to check the exit code:
$ raku -e 'my $a = run "foo"; say "failed" if $a.exitcode > 0'
failed
$ raku -e 'my $a = run "echo"; say "failed" if $a.exitcode > 0'
Or alternatively, use Jonathan's solution:
$ raku -e 'try sink run "foo"'

`errorStrategy` setting to stop current process but continue pipeline

I have a lot of samples that go through a process which sometimes fail (deterministically). In such a case, I would want the failing process to stop, but all other samples to still get submitted and processed independently.
If I understand correctly, setting errorStrategy 'ignore' will continue the script within the failing process, which is not what I want. And errorStrategy 'finish' would stop submitting new samples, even though there is no reason for the other samples to fail too. And while errorStrategy 'retry' could technically work (by repeating the failing processes while the good ones get through), that doesn't seem like a good solution.
Am I missing something?
If a process can fail deterministically, it might be better to handle this situation somehow. Setting the errorStrategy directive to 'ignore' means any process execution errors are ignored, allowing your workflow to continue. For example, you might get a process execution error if a process exits with a non-zero exit status or if one or more expected output files are missing. The pipeline will continue; however, downstream processes will not be attempted for the failed tasks.
Contents of test.nf:
nextflow.enable.dsl=2
process foo {
    tag { sample }

    input:
    val sample

    output:
    path "${sample}.txt"

    """
    if [ "${sample}" == "s1" ] ; then
        (exit 1)
    fi
    if [ "${sample}" == "s2" ] ; then
        echo "Hello" > "${sample}.txt"
    fi
    """
}

process bar {
    tag { txt }

    input:
    path txt

    output:
    path "${txt}.gz"

    """
    gzip -c "${txt}" > "${txt}.gz"
    """
}

workflow {
    Channel.of('s1', 's2', 's3') | foo | bar
}
Contents of nextflow.config:
process {
    // this is the default task.shell:
    shell = [ '/bin/bash', '-ue' ]
    errorStrategy = 'ignore'
}
Run with:
nextflow run -ansi-log false test.nf
Results:
N E X T F L O W ~ version 20.10.0
Launching `test.nf` [drunk_bartik] - revision: e2103ea23b
[9b/56ce2d] Submitted process > foo (s2)
[43/0d5c9d] Submitted process > foo (s1)
[51/7b6752] Submitted process > foo (s3)
[43/0d5c9d] NOTE: Process `foo (s1)` terminated with an error exit status (1) -- Error is ignored
[51/7b6752] NOTE: Missing output file(s) `s3.txt` expected by process `foo (s3)` -- Error is ignored
[51/267685] Submitted process > bar (s2.txt)

Gitlab-CI stop stage if multiline script returns an exit code not equal to zero

Is a multiline script block in a gitlab-ci pipeline aborted immediately if a call within it returns a non-zero exit code?
[gitlab-ci.yml]
stage: test
script:
  - export ENVIRONMENTS=$(cat $STAGE-environments.txt)
  - |
    for ENVIRONMENT in ${ENVIRONMENTS}; do
      echo "create tests for stage '$STAGE' and environment '$ENVIRONMENT'"
      create_test $STAGE $ENVIRONMENT
      fill_test $STAGE $ENVIRONMENT
    done
The two calls "create_test" and "fill_test" are bash functions that return a non-zero exit code in the event of an error. If this happens, I want the gitlab-ci stage "test" to stop immediately and mark the job as failed.
Do I have to adapt the multiline script for this?
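One way to get that behaviour, assuming the job runs under bash, is to make the multiline block itself fail fast, for example by enabling set -e at the top of the block (appending || exit 1 to each call works too). A sketch of the block's contents, reusing create_test and fill_test from the question:

set -e  # abort the block as soon as any command returns a non-zero exit status
for ENVIRONMENT in ${ENVIRONMENTS}; do
  echo "create tests for stage '$STAGE' and environment '$ENVIRONMENT'"
  create_test "$STAGE" "$ENVIRONMENT"
  fill_test "$STAGE" "$ENVIRONMENT"
done

Once the block exits with a non-zero status, the script entry fails and GitLab marks the job as failed.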

How to get exit status or completion message from an asynchronous gcloud sql export job?

I have a daily export task that pulls data from Cloud SQL to a bucket in Cloud Storage, but the job winds up timing out and gcloud sends back an error saying as much. Nonetheless, the SQL instance continues running the export, and the files make it to their destination without a problem.
In order to get around the timeout error, which is fouling up our logs, I have tried adding the --async flag, which gets around the error as it should, but there is no exit or completion message.
gcloud --project=$PROJECT sql export csv cloud-sql --database=$DB $BUCKET/$(date +%Y%m%d)_$NAME.csv --async --query="$(cat $SQLPATH/$table.sql)" >> $LOG 2>&1
Is there some bash code or modification I can make to receive a status update or exit response, so that I can accurately log that the job has completed?
You can run this bash snippet in another process at the same time you issue the export command. It defines a function that fetches the status of the operation (whose id is taken from the log), records the first status read, then polls every few seconds, printing the status each time, until it changes:
# extract the operation id (the last path component on the last line of the log)
OPERATION=$(cat $LOG | tr / ' ' | awk '{print $NF}' | tail -n 1)

get_status(){
    CURRENT_STATUS=$(gcloud sql operations describe $OPERATION | grep status: | awk '{print $NF}')
}

get_status
FIRST_STATUS=$CURRENT_STATUS
echo FIRST STATUS: $FIRST_STATUS
while [ "$FIRST_STATUS" == "$CURRENT_STATUS" ]
do
    get_status
    echo CURRENT STATUS: $CURRENT_STATUS
    sleep 5
done
echo CURRENT STATUS: $CURRENT_STATUS
echo DONE!
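Alternatively, if your SDK version supports it, you can let gcloud itself block until the operation finishes using gcloud sql operations wait. A sketch, where the --timeout value is an assumption (check gcloud sql operations wait --help for the exact flags in your SDK):

OPERATION=$(cat $LOG | tr / ' ' | awk '{print $NF}' | tail -n 1)
# block until the operation completes (or the timeout expires), then log its final status
gcloud --project=$PROJECT sql operations wait $OPERATION --timeout=3600 >> $LOG 2>&1
gcloud --project=$PROJECT sql operations describe $OPERATION | grep status: >> $LOG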

Handling SIGPIPE error in snakemake

The following snakemake script:
rule all:
    input:
        'test.done'

rule pipe:
    output:
        'test.done'
    shell:
        """
        seq 1 10000 | head > test.done
        """
fails with the following error:
snakemake -s test.snake
Provided cores: 1
Rules claiming more threads will be scaled down.
Job counts:
count jobs
1 all
1 pipe
2
rule pipe:
output: test.done
jobid: 1
Error in job pipe while creating output file test.done.
RuleException:
CalledProcessError in line 9 of /Users/db291g/Tritume/test.snake:
Command '
seq 1 10000 | head > test.done
' returned non-zero exit status 141.
File "/Users/db291g/Tritume/test.snake", line 9, in __rule_pipe
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/concurrent/futures/thread.py", line 55, in run
Removing output files of failed job pipe since they might be corrupted:
test.done
Will exit after finishing currently running jobs.
Exiting because a job execution failed. Look above for error message
The explanation returned non-zero exit status 141 seems to say that snakemake has caught the SIGPIPE failure triggered by head. I guess strictly speaking snakemake is doing the right thing in catching the failure, but I wonder if it would be possible to ignore some types of errors like this one. I have a snakemake script using the head command and I'm trying to find a workaround for this error.
Yes, Snakemake sets pipefail by default, because in most cases this is what people implicitly expect. You can always deactivate it for specific commands by prepending set +o pipefail; to the shell command.
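Applied to the pipe rule above, the shell command would simply become (a minimal sketch; only the set +o pipefail; prefix is new):

set +o pipefail; seq 1 10000 | head > test.done

With pipefail disabled, the pipeline's exit status is that of the last command (head), so the SIGPIPE received by seq no longer fails the rule.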
A somewhat clunky alternative is to append || true to the command. This makes the command always exit cleanly, which on its own is not acceptable, so to check whether the pipeline actually succeeded you can query the array variable ${PIPESTATUS[@]} and make sure it contains the expected exit codes:
This script is ok:
seq 1 10000 | head | grep 1 > test.done || true
echo ${PIPESTATUS[@]}
141 0 0
This is not ok:
seq 1 10000 | head | FOOBAR > test.done || true
echo ${PIPESTATUS[@]}
0