shell() function in run does not use singularity - snakemake

EDIT
I have now posted this question as an issue on the Snakemake bitbucket given this seems to be an unknown behavior.
I am using snakemake with the --use-singularity option.
When I use a classic rule of the form:
singularity: mycontainer
rule myrule:
    input:
    output:
    shell:
        "somecommand"
with the somecommand only present in the singularity container, everything goes fine.
However, when I need to use some python code in the run part of the rule, the command is not found.
rule myrule:
    input:
    output:
    run:
        some python code here
        shell("somecommand")
The only workaround I found is to use
shell("singularity exec mycontainer somecommand")
but this is not optimal.
I am either missing something, such as an option, or this is a missing feature in snakemake.
What I would like to obtain is to use the shell() function with the --use-singularity option.

Snakemake doesn't allow using --use-conda with run block and this is why:
The run block of a rule (see Rules) has access to anything defined in the Snakefile, outside of the rule. Hence, it has to share the conda environment with the main Snakemake process. To avoid confusion we therefore disallow the conda directive together with the run block. It is recommended to use the script directive instead (see External scripts).
I bet --use-singularity is not allowed with run block for the same reason.
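The workaround from the question can at least be kept in one place with a small helper that prefixes a command with singularity exec before it is handed to shell(). This is only a sketch: the name singularity_exec is hypothetical and not part of Snakemake, and mycontainer stands for whatever image path or URI is actually in use.

```python
import shlex


def singularity_exec(image, command):
    """Prefix a shell command so that it runs inside the given container image.

    Hypothetical helper, not part of Snakemake. The image is quoted so that
    paths containing spaces survive; the command is passed through unchanged.
    """
    return f"singularity exec {shlex.quote(image)} {command}"


# Inside a run block this would be used as:
#     shell(singularity_exec("mycontainer", "somecommand"))
print(singularity_exec("mycontainer", "somecommand"))
```

This does not make the run block itself container-aware; it only shortens the explicit call.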

Related

snakemake rule won't run the complete command

I am working on this snakemake pipeline where the last rule looks like this:
rule RunCodeml:
    input:
        '{ProjectFolder}/{Fastas}/codeml.ctl'
    output:
        '{ProjectFolder}/{Fastas}/codeml.out'
    shell:
        'codeml {input}'
This rule does not run and the error seems to be that the program codeml can't find the .ctl file because it looks for an incomplete path: '/work_beegfs/sunam133/Anas_plasmids/Phylo_chromosome/Acinet_only/Klebs_Esc_SCUG/cluster_536/co'
although the command seems correct:
shell:
codeml /work_beegfs/sunam133/Anas_plasmids/Phylo_chromosome/Acinet_only/Klebs_Esc_SCUG/cluster_536/codeml.ctl
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)'
And here is the output from running with the -p option:
error when opening file /work_beegfs/sunam133/Anas_plasmids/Phylo_chromosome/Acinet_only/Klebs_Esc_SCUG/cluster_1083/co
tell me the full path-name of the file? Can't find the file. I give up.
I find this behavior very strange and I can't figure out what is going on. Any help would be appreciated.
Thanks!
D.
Ok, so the problem was not snakemake but the program I am calling (codeml), which is restricted in the length of the string given as path to the control file.
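One possible way around such a limit is to run the program from the directory that holds the control file, so that only the short file name appears on the command line. The sketch below assumes that codeml accepts a control file name relative to the current working directory, which is not confirmed in this thread; the function names are hypothetical.

```python
import subprocess
from pathlib import Path


def short_path_invocation(ctl_path, program="codeml"):
    """Return (argv, cwd) so that only the file name is passed to the program."""
    ctl = Path(ctl_path)
    return [program, ctl.name], str(ctl.parent)


def run_with_short_path(ctl_path, program="codeml"):
    """Run the program inside the control file's directory.

    Assumption: the program resolves the control file relative to its
    working directory. check=True raises if the exit code is non-zero.
    """
    argv, cwd = short_path_invocation(ctl_path, program)
    return subprocess.run(argv, cwd=cwd, check=True)
```

Any relative paths the program reads or writes would presumably also be resolved against that directory, so the expected output location should be checked.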

Execute a shell command outside of a sandbox while in a sandbox

I'm using singularity to run python in an environment deprived of python. I'm also running a mysql instance as explained by Iowa State University (running an instance of mysql, and closing it when done).
For clarity, I'm using a bash script to open mysql, then do what I have to do (a python script) and close mysql, and it works fine. But Python's only way to stop if an error occurred is sys.exit([value]), and this not only stops the python script, but also the bash script that ran it. This makes it impossible for me to manage the errors and close the instance of mysql if the python script exits.
My question is: is there a way for me to execute 'singularity instance stop mysql' while being in the python sandbox? Something to tell singularity "hey, this command here must be used on the host!"?
I keep searching but can't find anything.
I only tried to execute it with subprocess like any other command, but it returned an error message because I don't have this instance inside the python sandbox. I don't even have singularity in this sandbox.
For any clarifications, just ask me, I'm trying to be clear but I'm pretty sure it's not very clear.
Thanks a lot !
Generally speaking, it would be a big security issue if a process could be initiated from inside a container (docker or singularity) but run in the host OS's namespace.
If the bash script is exiting on the python failure, it sounds like you're using set -e or #!/bin/bash -e. This causes the script to abort if any command returns non-zero. It's commonly recommended for safer processing, but can cause problems like this at times. To bypass that for the python step you can modify your script:
# start mysql, do some stuff
set +e # disable abort on non-zero return
python my_script.py
set -e # re-enable abort on non-zero
# shut down mysql, do other stuff
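If the wrapper is easier to maintain in Python than in bash, the same idea can be sketched there: run the step, keep its exit code instead of aborting, and always run the shutdown command afterwards. This assumes both commands are started on the host, outside the container; run_with_cleanup is a hypothetical name, and the commands in the usage comment are the ones mentioned in the question.

```python
import subprocess


def run_with_cleanup(main_cmd, cleanup_cmd):
    """Run main_cmd, then always run cleanup_cmd; return main_cmd's exit code."""
    try:
        # No check=True: a non-zero exit code is returned, not raised.
        return subprocess.run(main_cmd).returncode
    finally:
        # Runs whether main_cmd succeeded, failed, or could not be started.
        subprocess.run(cleanup_cmd)


# Usage on the host (commands from the question; not executed here):
#     code = run_with_cleanup(
#         ["python", "my_script.py"],
#         ["singularity", "instance", "stop", "mysql"],
#     )
```

The returned exit code can then be inspected to decide how to handle the failure after mysql has been stopped.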

Snakemake - Tibanna config support

I am trying to run snakemake --tibanna to deploy Snakemake on AWS using the "Unicorn" Step Functions Tibanna creates.
I can't seem to find a way to change the different arguments Tibanna accepts like which subnet, AZ or Security Group will be used for the actual EC2 instance deployed.
Argument example (when running Tibanna without Snakemake):
https://github.com/4dn-dcic/tibanna/blob/master/test_json/unicorn/shelltest4.json#L32
Thanks!
Did you notice this option?
snakemake --help
--tibanna-config TIBANNA_CONFIG [TIBANNA_CONFIG ...]
Additional tibanan config e.g. --tibanna-config
spot_instance=true subnet= security
group=
I think it was added recently.
-jk
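As a rough illustration of how the key=value pairs from the help text end up on the command line, the sketch below builds the argument list from a dictionary. The function name tibanna_args is hypothetical, the subnet value is a made-up placeholder, and any other flags a real Tibanna run requires are left out.

```python
def tibanna_args(config):
    """Build a snakemake argument list with --tibanna-config key=value pairs."""
    args = ["snakemake", "--tibanna"]
    if config:
        args.append("--tibanna-config")
        args.extend(f"{key}={value}" for key, value in config.items())
    return args


# "subnet-PLACEHOLDER" is not a real ID; substitute the subnet you need.
example = tibanna_args({"spot_instance": "true", "subnet": "subnet-PLACEHOLDER"})
print(" ".join(example))
```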

Multiple Job (j3)

I am trying to run a GNU make file with multiple jobs.
When I try executing 'make.exe -r -j3', I receive the following two errors:
make.exe: Do not specify -j or --jobs if sh.exe is not available.
make.exe: Resetting make for single job mode.
Do I have to add ' $(SH) -c' somewhere in the makefile? If so, where?
The error message suggests that make cannot find sh.exe. The file names indicate you are probably on CygWin. I would investigate setting the PATH to include the location of sh.exe, or defining the value of SHELL to the name (or, even, full path) of your shell.
Are you running this on Windows (more specifically, in the "Windows" shell)? If you are, you might want to read this:
http://www.gnu.org/software/make/manual/make.html#Parallel
more specifically:
On MS-DOS, the ‘-j’ option has no effect, since that system doesn't support multi-processing.
Once again, assuming you're running on Windows, you should get MinGW or Cygwin.

Apache2 PassEnv on Ubuntu

I want to pass a system-wide variable to Apache so I can pass it to executed scripts using PassEnv. Basically, a script executed by Apache runs a shell script, and that shell script won't run without the variable being set.
But Ubuntu devs did this in the startup script:
ENV="env -i LANG=C PATH=/usr/local/bin:/usr/bin:/bin"
This results in variables from /etc/environment being discarded. Can I fix this without modifying the startup script?
Turns out you can pass along vars in /etc/apache2/envvars. Still sucks though.
Nope. The value stays empty.