How to add sbatch options such as --wait in a snakemake file - snakemake

I am unsure where to add the --wait sbatch option when using snakemake. I tried adding it to the snakemake command itself, but I get the following error:
Error submitting jobscript (exit code 1):
Submitted batch job 5389577
Any help would be appreciated.
My snakemake command is as follows:
snakemake --latency-wait 60 --rerun-incomplete --keep-going --jobs 99 --cluster-status 'python /home/lamma/faststorage/scripts/slurm-status.py' --cluster 'sbatch -t {cluster.time} --mem={cluster.mem} --cpus-per-task={cluster.c} --error={cluster.error} --job-name={cluster.name} --output={cluster.output} --wait' --cluster-config bacterial-hybrid-assembly-config.json --configfile yaml-config-files/test_experiment3.yaml --snakefile bacterial-hybrid-assembly.smk

Adding the --parsable flag to the sbatch command will allow the --cluster-status script to properly parse job IDs:
'sbatch -t {cluster.time} --mem={cluster.mem} --cpus-per-task={cluster.c} --error={cluster.error} --job-name={cluster.name} --output={cluster.output} --wait --parsable'

Related

what is the snakemake variable name for slurm job id?

When I run jobs on snakemake using --profile ./slurm, I see in standard output:
Submitted job 406 with external jobid '1956125'
in slurm/config.yaml I have:
cores: "all"
cluster: "sbatch --partition=mypartition -A myaccount -t {resources.time_min} --mem={resources.mem_mb} -c {resources.cpus} -o slurm/logs/{jobid}.out -e slurm/logs/{jobid}.err --mail-type=FAIL --mail-user=mymail.edu --parsable"
default-resources: [cpus=1, mem_mb=2000, time_min=10080, partition=mypartition]
use-conda: true
This writes log files like 406.err and what I want is 1956125.err
How do I do this?
406 is the internal jobid from snakemake. You want the external jobid from slurm.
IIRC that should be possible by using %j instead of {jobid}:
cluster: "sbatch --partition=mypartition -A myaccount -t {resources.time_min} --mem={resources.mem_mb} -c {resources.cpus} -o slurm/logs/%j.out -e slurm/logs/%j.err --mail-type=FAIL --mail-user=mymail.edu --parsable"
Let us know if it works.

passing parameters to cwl from snakemake

I'm trying to execute some cwl pipelines using snakemake. I need to pass parameters to cwltool, which snakemake is using to execute the pipeline.
cwltool has a number of options. Currently, when snakemake calls it, the only option that I can figure out how to pass is --singularity, and that is easy, because if you call snakemake with the --use-singularity flag, it automatically inserts it into the call to cwltool.
snakemake --jobs 999 --printshellcmds --rerun-incomplete --use-singularity
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 999
Rules claiming more threads will be scaled down.
Job counts:
count jobs
1 align
1
[Tue Aug 11 06:55:18 2020]
rule align:
input: /project/uoa00571/projects/rapidAutopsy/data/raw/dna_batch2/BA1_1.fastq.gz, /nesi/project/uoa00571/projects/rapidAutopsy/data/raw/dna_batch2/BA1_2.fastq.gz
output: /nobackup/uoa00571/data/intermediate/rapidAutopsy/BA1.unaligned.bam
jobid: 0
threads: 8
cwltool --singularity file:/project/uoa00571/projects/rapidAutopsy/src/gridss/gridssAlignment.cwl /tmp/tmpfgsz5uvr
Unfortunately, I can't figure out how to add additional arguments to the cwltool call. When snakemake calls cwltool, it doesn't appear to pass on the working directory, so if I do:
snakemake --jobs 999 --printshellcmds --rerun-incomplete --use-singularity --directory "/nesi/nobackup/uoa00571/data/intermediate/rapidAutopsy/"
instead of landing intermediate files in the specified directory, singularity appears to bind a directory under /tmp for the intermediate files, which on the system I am working on is not big enough, resulting in a disk quota exceeded error.
singularity \
--quiet \
exec \
--contain \
--pid \
--ipc \
--home \
/tmp/xq5co13r:/sSyVnR \
--bind \
/tmp/vbjabhqx:/tmp:rw \
Or at least, that's what I think is happening. If I run the cwl pipeline with cwltool directly, I can add the --cacheDir option and the pipeline runs to completion, so I'd like to be able to pass that from snakemake to the cwltool call, if that's possible.

How can I run multiple runs of pipeline with different config files - issue with lock on .snakemake directory

I am running a snakemake pipeline from the same working directory but with different config files, and the inputs/outputs are in different directories too. The issue seems to be that although both runs use data in different folders, snakemake creates a lock on the pipeline folder via the .snakemake folder and the lock folder within it. Is there a way to force separate .snakemake folders? Code example below:
Both runs are run from within /home/pipelines/qc_pipeline:
run 1:
/home/apps/miniconda3/bin/snakemake -p -k -j 999 --latency-wait 10 --restart-times 3 --use-singularity --singularity-args "-B /pipelines_test/QC_pipeline/PE_trimming/,/clusterTMP/testingQC/,/home/www/codebase/references" --configfile /clusterTMP/testingQC/config.yaml --cluster-config QC_slurm_roadsheet.json --cluster "sbatch --job-name {cluster.name} --mem-per-cpu {cluster.mem-per-cpu} -t {cluster.time} --output {cluster.output}"
run 2:
/home/apps/miniconda3/bin/snakemake -p -k -j 999 --latency-wait 10 --restart-times 3 --use-singularity --singularity-args "-B /pipelines_test/QC_pipeline/SE_trimming/,/clusterTMP/testingQC2/,/home/www/codebase/references" --configfile /clusterTMP/testingQC2/config.yaml --cluster-config QC_slurm_roadsheet.json --cluster "sbatch --job-name {cluster.name} --mem-per-cpu {cluster.mem-per-cpu} -t {cluster.time} --output {cluster.output}"
error:
Directory cannot be locked. Please make sure that no other Snakemake process is trying to create the same files in the following directory:
/home/pipelines/qc_pipeline
If you are sure that no other instances of snakemake are running on this directory, the remaining lock was likely caused by a kill signal or a power loss. It can be removed with the --unlock argument.
Maarten-vd-Sande correctly points to the --nolock option (+1), but in my opinion it's a very bad idea to use --nolock routinely.
As the error says, two snakemake processes are trying to create the same file. Unless the error is a bug in snakemake, I wouldn't blindly proceed and overwrite files.
I think it would be safer to assign to each snakemake execution its own execution directory and working directory, like:
topdir=`pwd`
mkdir -p run1
cd run1
snakemake --configfile /path/to/config1.yaml ...
cd $topdir
mkdir -p run2
cd run2
snakemake --configfile /path/to/config2.yaml ...
cd $topdir
mkdir -p run3
etc...
EDIT
Actually, it should be less clunky and probably better to use the --directory/-d option:
snakemake -d run1 --configfile /path/to/config1.yaml ...
snakemake -d run2 --configfile /path/to/config2.yaml ...
...
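To see why per-run working directories fix this, here is a toy illustration (not Snakemake's actual locking code) of a lock file kept under .snakemake/ in the working directory; runs sharing a working directory contend for the lock, while runs with distinct -d directories do not:

```python
import tempfile
from pathlib import Path

def acquire_lock(workdir: Path) -> bool:
    """Try to take the per-working-directory lock; False if already held."""
    lockdir = workdir / ".snakemake" / "locks"
    lockdir.mkdir(parents=True, exist_ok=True)
    lockfile = lockdir / "0.input.lock"
    if lockfile.exists():
        return False          # another run owns this working directory
    lockfile.touch()
    return True

base = Path(tempfile.mkdtemp())
run1, run2 = base / "run1", base / "run2"
run1.mkdir(); run2.mkdir()
assert acquire_lock(run1) is True    # first run in run1 succeeds
assert acquire_lock(run1) is False   # second run in run1 is blocked
assert acquire_lock(run2) is True    # run2 has its own lock, unaffected
```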
As long as the different pipelines do not generate the same output files you can do it with the --nolock option:
snakemake --nolock [rest of the command]
Take a look at the Snakemake documentation for a short note about --nolock.

I am trying to rebuild my old working project and this error is showing. What should I do?

Error:In FontFamilyFont, unable to find attribute android:fontVariationSettings
Error:java.util.concurrent.ExecutionException: com.android.ide.common.process.ProcessException: Error while executing process /sdk/build-tools/27.0.2/aapt with arguments {package -f --no-crunch -I /sdk/platforms/android-26/android.jar -M /project/kitchen33app/milla/build/intermediates/manifest/androidTest/debug/AndroidManifest.xml -S /project/kitchen33app/milla/build/intermediates/res/merged/androidTest/debug -m -J /project/kitchen33app/milla/build/generated/source/r/androidTest/debug -F /project/kitchen33app/milla/build/intermediates/res/androidTest/debug/resources-debugAndroidTest.ap_ -0 apk --output-text-symbols /project/kitchen33app/milla/build/intermediates/symbols/androidTest/debug --no-version-vectors}
Error:com.android.ide.common.process.ProcessException: Error while executing process /sdk/build-tools/27.0.2/aapt with arguments {package -f --no-crunch -I /sdk/platforms/android-26/android.jar -M /project/kitchen33app/milla/build/intermediates/manifest/androidTest/debug/AndroidManifest.xml -S /project/kitchen33app/milla/build/intermediates/res/merged/androidTest/debug -m -J /project/kitchen33app/milla/build/generated/source/r/androidTest/debug -F /project/kitchen33app/milla/build/intermediates/res/androidTest/debug/resources-debugAndroidTest.ap_ -0 apk --output-text-symbols /project/kitchen33app/milla/build/intermediates/symbols/androidTest/debug --no-version-vectors}
Error:org.gradle.process.internal.ExecException: Process 'command '/sdk/build-tools/27.0.2/aapt'' finished with non-zero exit value 1
Error:Execution failed for task ':milla:processDebugAndroidTestResources'.
Failed to execute aapt
Just replace references to a color resource with the actual color value in your vector drawable files and it should work.
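For illustration (a hypothetical drawable snippet; the resource name, color value, and path data are made up):

```xml
<!-- Before: fillColor references a color resource, which older
     aapt/support-library combinations can choke on in vector drawables. -->
<path android:fillColor="@color/my_icon_color" android:pathData="..." />

<!-- After: inline the literal color value. -->
<path android:fillColor="#FF2196F3" android:pathData="..." />
```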
In your app build.gradle add the following line:
defaultConfig {
    vectorDrawables.useSupportLibrary = true
}

Is it possible to print commands instead of rules in snakemake dry run?

Dry runs are a super important functionality of workflow languages. What I am mostly looking for is what would be executed if I ran the command, which is exactly what one sees when running make -n.
However, the analogous functionality, snakemake -n, prints something like
Building DAG of jobs...
rule produce_output:
output: my_output
jobid: 0
wildcards: var=something
Job counts:
count jobs
1 produce_output
1
The log contains pretty much everything except the commands that get executed. Is there a way to get the commands from snakemake?
snakemake -p --quiet -n
-p to print shell commands
-n for a dry run
--quiet to suppress the rest
EDIT 2019-Jan
This solution seems broken for recent versions of snakemake
snakemake -p -n
Avoid the --quiet flag reported in eric-c's answer: at least in some situations the combination of -p -n -q does not print the commands to be executed.
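As a concrete illustration, with a toy Snakefile like this (hypothetical, echoing the rule name from the question):

```
rule produce_output:
    output:
        "my_output"
    shell:
        "touch my_output"
```

snakemake -p -n should then include the shell line touch my_output in its dry-run output, which is the make -n-style information being asked for.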