snakemake - configure rules to run with local container (Singularity, Docker)

For snakemake v5.27+
Is there a way to run snakemake with the container directive pointing to a local image? E.g. if I store the Docker containers on DockerHub and also keep a copy locally, I don't want the rule to pull a Singularity image copy from DockerHub when an exact copy already exists locally. That makes for faster runs.

Sure, just pass a relative or absolute file path to the directive.

Even though the snakemake manual doesn't explicitly state it, it is possible to use a local Singularity image with the containerized directive.
So instead of the example in the link above:
containerized: "docker://username/myworkflow:1.0.0"
You can point to the path of a Singularity SIF file (which contains the image):
containerized: "/path/to/myimage.sif"
Make sure you use --use-singularity when running snakemake.
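As a minimal sketch (the image path, rule name, and shell command here are placeholders, not from the original post), a Snakefile could look like this:
containerized: "/path/to/myimage.sif"

rule example:
    output: "out.txt"
    shell: "echo hello > {output}"
and would be run with:
snakemake --cores 1 --use-singularity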
How to build the singularity (sif) image:
You can build the SIF image in various ways as described here, but as for your question, you can build it from a local Docker image.
I.e. you can list your local images with docker images and pick one to build the local SIF file like so:
SINGULARITY_NOHTTPS=1 singularity build /path/to/myimage.sif docker-daemon://mydockerimage:latest
Note, it doesn't seem to work straight from a local Docker image, i.e. I would have expected this to work:
containerized: "docker-daemon://scpipe_docker:latest"
... but it didn't as of snakemake version 6.10.0

Related

Singularity definition file with paths relative to it

Question
When building Singularity images using definition files, is there a way to specify the path to a file on the host system relative to the definition file (i.e. independent of where the build command is called)?
Example to Illustrate the Problem
I have the following files in the same directory (e.g. a git repository):
foobar.def
some_file.txt
foobar.def looks as follows:
Bootstrap: library
From: ubuntu:20.04
Stage: build
%files
# Add some_file.txt at the root of the image
some_file.txt /some_file.txt
This works fine when I build with the following command in the directory which contains the files:
singularity build --fakeroot foobar.sif foobar.def
However, it fails if I call the build command from anywhere else (e.g. from a dedicated "build" directory) because it searches for some_file.txt relative to the current working directory of the build command, not relative to the definition file.
Is there a way to implement the definition file such that the build works independently of where the command is called? I know that I could use absolute paths but this is not a viable solution in my case.
To make it even more complicated: My actual definition file is bootstrapping from another local image, which is located in the build directory. So ideally I would need a solution where some files are found relative to the working directory while others are found relative to the location of the definition file.
Short answer: Not really
Longer answer: Not really, but there's a reason why, and it shouldn't really matter for most use cases. While Docker went the route of letting you specify your directory context, Singularity decided to base all of its commands on the current directory where it is executed. This also fits with $PWD being auto-mounted into the container, so it makes sense for the two to be consistent.
That said, is there a reason you can't run singularity build --fakeroot $build_dir/foobar.sif foobar.def from the repo directory? There isn't any other output written besides the final image and it makes more sense for the directory with the data being used to be the context to work from.
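A sketch of that workflow (the build directory path is hypothetical):
cd /path/to/repo   # directory containing foobar.def and some_file.txt
singularity build --fakeroot ../build/foobar.sif foobar.def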

Having host filesystem visible to singularity container

I use singularity images that do not require any binding of the desired host path, i.e.
singularity exec image.simg IMAGE_COMMAND -i $PWD/input_file -o ./output_dir
simply works like any other command on input_file in my host system, including relative paths such as the one passed to -o.
I'm not comfortable enough with Singularity and its jargon to understand how this is made.
Is a configuration done in singularity.conf?
What is this feature called? (Is it "MOUNT HOSTFS"?)
By default, both your home and current directories are mounted/bound into the image for you. You can modify this in singularity.conf. Details on the settings are available in the admin documentation.
The MOUNT HOSTFS in the config is a toggle to automatically mount all host filesystems into the image. MOUNT HOME is the corresponding setting for auto-mounting the user's HOME directory.
You can see which files/directories are currently being mounted by using the --verbose option with your singularity command.
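For reference, the relevant toggles in singularity.conf look roughly like this (a sketch; check your installation's actual file for the values in effect):
# in singularity.conf
mount home = yes      # auto-mount the user's home directory
mount hostfs = no     # auto-mount all host filesystems
and you can inspect the mounts of a specific run, for example:
singularity --verbose exec image.simg true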

How to export a container in Singularity

I would like to move an already built container from one machine to another. What is the proper way to migrate the container from one environment to another?
I can find here the image.export command, but this is for an older version of the software. I am using version 3.5.2.
The container I wish to export is a --sandbox container. Is something like that possible?
Singularity allows you to easily convert between a sandbox and a production build.
For example:
singularity build lolcow.sif docker://godlovedc/lolcow # pulls and builds a container
singularity build --sandbox lolcow_sandbox/ lolcow.sif # converts from container to a writable sandbox
singularity build lolcow2 lolcow_sandbox/ # converts from sandbox to container
Once you have a production SIF or SIMG, you can easily transfer the file and convert as necessary.
singularity build generates a file that you can copy between computers just like any other file. The only thing it needs is the singularity binary installed on the new host server.
The difference when using --sandbox is that you get a modifiable directory instead of single file. It can still be run elsewhere, but you may want to tar it up first so you're only moving a single file. Then you can untar it and run as normal on the new host.
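A sketch of that workflow (host name and paths are placeholders):
tar czf lolcow_sandbox.tar.gz lolcow_sandbox/   # pack the sandbox directory into one file
scp lolcow_sandbox.tar.gz user@newhost:         # copy it like any other file
ssh user@newhost
tar xzf lolcow_sandbox.tar.gz                   # unpack on the new host
singularity run lolcow_sandbox/                 # runs as before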

What space is required to build a Singularity container?

I've got a def file to build a container (within a Vagrant VM). If I build as a sandbox:
sudo singularity build --sandbox mytest/ mytest.def
then the build completes. However, if I build straight to a container:
sudo singularity build mytest.sif mytest.def
then I get an error:
FATAL: While performing build: While creating SIF: while creating container: writing data object for SIF file: copying data object file to SIF file: write mytest.sif: no space left on device
If I try and convert the sandbox to a container:
sudo singularity build mytest.sif mytest/
then I get the same error.
The docs don't indicate how much space is needed for a build vs. a sandbox. I could increase the size of the Vagrant VM, but it would be good to have an idea of how much to increase it by to ensure that the build succeeds.
The size is dependent on the image. If you're building from a docker image, you can look at that to get a general idea based on its size. It's important to know where to put the extra drive space, however.
Singularity uses a tmp dir (default: /tmp) and a cache dir (default: $HOME/.singularity/cache) in addition to the directory you're building in. Note that with sudo singularity build, the cache dir is /root/.singularity/cache rather than your user's home, because of sudo. VMs often have small /, /root, and/or /tmp partitions by default. This has been a gotcha for me in the past and may also be affecting you.
You can use the --tmpdir flag on build to change that to somewhere that has more space if desired (see documentation here).
To change the default cache dir, set the environment variable SINGULARITY_CACHEDIR; the specifics are covered in the documentation here. You can also set SINGULARITY_TMPDIR in the same manner instead of using the --tmpdir flag; it is sometimes nice to keep all the environment modifications in one place.
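For example (the paths are placeholders; note that sudo drops most environment variables unless you preserve them with -E):
export SINGULARITY_TMPDIR=/data/singularity/tmp
export SINGULARITY_CACHEDIR=/data/singularity/cache
sudo -E singularity build mytest.sif mytest.def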

Singularity: What is the difference between an image, a container, and an instance?

I am starting to learn Singularity for reproducible analysis of scientific pipelines. A colleague explained that an image was used to instantiate a container. However, in reading through the documentation and tutorials, the term instance is also used and the usage of image and container seems somewhat interchangeable. So, I am not sure I precisely understand the difference between an image, container, and instance. I do get that a recipe is a text file for building one of these (I think an image?).
For example, on this page it explains:
Now we can build the definition file into an image! Simply run build
and the image will be ready to go:
$ sudo singularity build url-to-pdf-api.img Singularity
Okay, so this uses the recipe Singularity to build an image, with the intuitive extension of .img. However, the help description of the build command states:
$ singularity help build
USAGE: singularity [...] build [build options...]
The build command compiles a container per a recipe (definition file) or based on a URI, location, or archive.
So this seems to indicate we are building a container?
Then, there are image and instance sub-commands.
Are all these terms used interchangeably? It seems sometimes they are and sometimes there is a difference between them.
A container is the general concept of a sandboxed run environment and can be used as a general term to refer to either Docker or Singularity images. However, it is sometimes also used to refer to the specific files being generated. This is probably not ideal, as it can clearly cause confusion for new users.
image is generally used to refer to the actual files created by singularity build ...
instance refers to a specific way of running singularity images. Normally, if you singularity run some_image.sif or singularity exec some_image.sif some_command, you can't easily access its environment while it's running. However, if you instead run singularity instance start some_image.sif some_instance1, it creates a persistent service that you can access like a Docker container. The singularity service/instance documentation has some good examples of how instances are used differently than the basic exec and run commands.
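To illustrate the difference (the instance name is arbitrary):
singularity instance start some_image.sif myinstance1   # start a persistent, named instance
singularity instance list                               # show running instances
singularity exec instance://myinstance1 hostname        # run a command inside the running instance
singularity instance stop myinstance1                   # shut it down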