Singularity and interior dynamic libraries - singularity-container

I am currently working on getting a bigger (C++) project inside a Singularity container. So far, everything works well, until I try to execute the container image, at which point it fails to find a dynamic library file that I previously built inside the container:
./MyProject.img
/<some path>/MyExecutable: error while loading shared libraries: libmongocxx.so._noabi: cannot open shared object file: No such file or directory
My first thought was that maybe building this dependency inside the container had somehow failed, so I added ls /usr/local/lib/ at the end of the %post section of my recipe to check, but everything there looks fine:
+ ls /usr/local/lib/
[...]
libmongocxx.so
libmongocxx.so.3.6.0
libmongocxx.so._noabi
[...]
So my next thought was that maybe the standard library folder is for some reason not part of my container's environment variables, so I extended the %post section with
export PATH=$PATH:/usr/local/lib/
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib/
still to no avail.
Is there some property of Singularity containers I am missing here? Do I need to somehow extract the dynamic library file to outside of the container? Or did I make some stupid mistake I just can't see?
(I tagged the question only with singularity-container for now, as I don't think anything here is specific to C++, but if somebody thinks otherwise feel free to retag. My container uses Bootstrap: docker and From: ubuntu:18.04, should that be relevant.)
Edit: I also explicitly gave the dynamic libraries execution rights, just in case, and printed their permissions:
lrwxrwxrwx 1 root root 20 Sep 10 10:51 libmongocxx.so._noabi -> libmongocxx.so.3.6.0
Didn't work either.

My first guess is that your local environment is overwriting the variables in the image. You can use singularity run --cleanenv MyProject.img to prevent your current environment from persisting into the container. If there are variables you do want to pass in there, you can export SINGULARITYENV_SOMEVAR=foo to have SOMEVAR=foo set in the container environment.
If that doesn't do it, modify the %runscript to include an env | sort so you can see exactly what's set when it's attempting to run your code.
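For example, a hedged sketch of both suggestions (SOMEVAR is just a stand-in for whatever variable you actually need):

%runscript
echo "environment inside the container:"
env | sort
/<some path>/MyExecutable "$@"

and on the host:

export SINGULARITYENV_SOMEVAR=foo
singularity run --cleanenv MyProject.img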

Related

Why does `singularity run/exec` automatically bind some specific directories? What is the use case?

I'm familiar with containers, but new to Singularity and I found myself fighting a broken Python installation in a Singularity container tonight. It turns out that this was because $HOME was being mounted into my container without my knowledge.
I guess that I've developed a liking for the idiom "Explicit is better than implicit" from Python. To me, automatically mounting specific directories is unexpected behavior.
Three questions:
Why does Singularity default to mounting $HOME, /tmp, /proc, etc?
So that I can become more comfortable with Singularity, what are some use cases for this behavior?
I see the --no-home flag, but is there a flag to disable all of the default mounts without needing to change the default Singularity configuration?
It's a mixture of design, convenience and technical necessity.
The biggest reason is that, unless you use certain params that say otherwise, Singularity images are read-only filesystems. You need somewhere to write output and any temporary files that get created along the way. Maybe you know to mount in your output dir, but there are all sorts of files that get created / modified / deleted in the background that we don't ever think about. Implicit automounts give reasonable defaults that work in most situations.
Simplistic example: you're doing a big sort and filter operation on some data, but you're printing the results to the console, so you don't bother to mount in anything but the raw data. Even after some manipulation and filtering, the size of the data exceeds available memory, so sort falls back to using small files in /tmp that are deleted when the process finishes. And then it crashes because you can't write to /tmp.
You can require a user to manually specify what to mount to /tmp on each run, or you can use a sane default like /tmp and also allow the user to override it (SINGULARITY_TMPDIR, -B $PWD/fake_tmp:/tmp, --contain/--containall). These are all configurable as well, so admins can set sane defaults specific to the running environment.
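For example, a hedged sketch of overriding /tmp at run time with the bind flag mentioned above (my_image.sif and big_input.txt are placeholders):

# give the container a writable /tmp backed by a directory on the host
mkdir -p $PWD/fake_tmp
singularity exec -B $PWD/fake_tmp:/tmp my_image.sif sort big_input.txt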
There are also technical reasons for some of the mounts. e.g., /etc/passwd and /etc/group are needed to match permissions on the host OS. The docs on bind paths and mounts are actually pretty good and have more specifics on the whats and whys, and even the answer to your third question: --no-mount. The --contain/--containall flags will probably also be of interest. If you really want to deep dive, there are also the admin docs and the source code on github.
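As a quick hedged example of those isolation flags (my_image.sif is a placeholder; the exact names accepted by --no-mount are listed in the bind paths docs):

# skip the default /tmp and $HOME binds for this run
singularity exec --no-mount tmp,home my_image.sif ls /
# or start from minimal, contained defaults
singularity run --containall my_image.sif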
A simple but real singularity use case, with explanation:
singularity exec \
--cleanenv \
-H $PWD:/home \
-B /some/local/data:/data \
multiqc.sif \
multiqc -i $SAMPLE_ID /data
--cleanenv / -e: You've already experienced the fun of unexpected mounts; there are also unexpected environment variables! --cleanenv/-e tells Singularity not to persist the host execution environment in the container. You can still use, e.g., SINGULARITYENV_SOMEVAR=23 to have SOMEVAR=23 inside the container, though, as that is explicitly set.
-H $PWD:/home: This mounts the current directory into the container to /home and sets HOME=/home. While using --contain/--containall and explicit mounts is probably a better solution, I am lazy and this ensures several things:
the current directory is mounted into the container. The implicit mounting of the working directory is allowed to fail, and will do so quietly, if the base directory does not exist in the image. e.g., if you're running from /cluster/my-lab/some-project and there is no /cluster inside your image, it will not be mounted in. This is not an issue if you bind the directory explicitly (-B /cluster/my-lab/some-project) or if an explicit bind shares a base path with the current directory (-B /cluster/data/experiment-123).
the command is executed from the context of the current directory. If $PWD fails to be mounted as described above, singularity uses $HOME as the working directory instead. If both $PWD and $HOME fail to mount, / is used. This can cause problems if you're using relative paths and you aren't where you expected to be. Since it is specific to the path on the host, it can be really annoying when trying to duplicate a problem locally.
the base path inside the container is always the same regardless of the host OS file structure. Consistency is good.
The rest is just the command that's being run, which in this case summarizes the logs from other programs that work with genetic data.

Singularity Recipe: How to access executable within container?

I am a beginner with Singularity.
What I want to achieve in the long run: I have a programming project with a long list of dependencies, and I want to be able to give the program to other people in my company without there being bugs caused by missing dependencies, or wrong versions of dependencies.
The idea was now to use Singularity in order to easily provide a working environment.
In order to test this, I wrote a Hello World application which I now want to run in a container. I have a folder HelloWorld/ which contains the source code for a C++ Qt project. Then I wrote the following recipe file:
project.recipe
Bootstrap: docker
From: ubuntu:18.04
%setup
cp -R <some_folder>/HelloWorld ${SINGULARITY_ROOTFS}/HelloWorld
%post
apt update
apt-get install -y qt5-default
apt install -y g++
apt-get install -y build-essential
cd HelloWorld
qmake
make
echo "after build:"
ls
%runscript
echo "before execution:"
ls HelloWorld/
./HelloWorld/HelloWorld
where the echoes and directory listings are for my current debugging process.
I can successfully build an image file using sudo singularity build --writable project.img project.recipe. (My debugging output shows me that the executable was built successfully.)
The problem is now that if I try to run it using ./project.img, or singularity run project.img, it won't find the executable.
Using my debugging output, I found out that the commands in %runscript operate on the folders outside of the container.
Tutorials like https://sylabs.io/guides/3.1/user-guide/build_a_container.html made it seem to me as if my recipe was the way to go, but apparently it isn't?
My questions:
Is there some way for me to access my executable? Am I calling it wrong?
Is the way I do it the way it is supposed to be done? Or would one normally do something like getting the executable outside of the container and then use the container to call that outside file? Or is there a different best practice?
If the executable is to be copied outside of the container after compilation, how do I do that? How do I access outside folders when I'm within %post?
Is this the best work process for what I want to achieve? Later on, my idea is that the big project is likewise copied into the container, dependencies are either installed or copied, the project is compiled, and finally its source is deleted. I also considered using a repository, but I can't have the project in an open repository, and I don't want to store any passwords.
Firstly, use %files, don't use %setup. %setup is run as root and can directly modify the host server. You can very easily and accidentally break things without realizing it. You can get the same effect this way:
%files
some_folder/HelloWorld /HelloWorld
You are calling it wrong. In your %setup (and hopefully now in your %files) steps, you are copying the data to /HelloWorld. In your %runscript you are calling ./HelloWorld/HelloWorld, which is the equivalent of $PWD/HelloWorld/HelloWorld. Since Singularity automatically mounts in $PWD (as well as $HOME and some other directories), you are not calling what you're trying to call.
You don't copy the executable outside of the container, you just need to make sure what you're executing is where you think it is.
There is no access to the host filesystem in %post, you should have everything you need copied in via %files first.
That's a reasonable workflow. Having a local private repo for the code is probably a good idea for tracking your changes, but that's your call.
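Putting those points together, a hedged sketch of how the recipe could look (same HelloWorld layout and packages as in your question):

%files
some_folder/HelloWorld /HelloWorld

%post
apt-get update
apt-get install -y qt5-default g++ build-essential
cd /HelloWorld
qmake
make

%runscript
exec /HelloWorld/HelloWorld "$@"

The key change is that %runscript refers to /HelloWorld/HelloWorld by its absolute path inside the image, rather than a path relative to wherever you happen to run the container from.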

Singularity definition file with paths relative to it

Question
When building Singularity images using definition files, is there a way to specify the path to a file on the host system relative to the definition file (i.e. independent of where the build command is called)?
Example to Illustrate the Problem
I have the following files in the same directory (e.g. a git repository):
foobar.def
some_file.txt
foobar.def looks as follows:
Bootstrap: library
From: ubuntu:20.04
Stage: build
%files
# Add some_file.txt at the root of the image
some_file.txt /some_file.txt
This works fine when I build with the following command in the directory which contains the files:
singularity build --fakeroot foobar.sif foobar.def
However, it fails if I call the build command from anywhere else (e.g. from a dedicated "build" directory) because it searches some_file.txt relative to the current working directory of the build command, not relative to the definition file.
Is there a way to implement the definition file such that the build works independently of where the command is called? I know that I could use absolute paths but this is not a viable solution in my case.
To make it even more complicated: my actual definition file is bootstrapping from another local image, which is located in the build directory. So ideally I would need a solution where some files are found relative to the working directory while others are found relative to the location of the definition file.
Short answer: Not really
Longer answer: Not really, but there's a reason why and it shouldn't really matter for most use cases. While Docker went the route of letting you specify what your directory context is, Singularity decided to base all of its commands off the current directory where it is being executed. This also follows with $PWD being auto-mounted into the container, so it makes sense for it to be consistent.
That said, is there a reason you can't run singularity build --fakeroot $build_dir/foobar.sif foobar.def from the repo directory? There isn't any other output written besides the final image and it makes more sense for the directory with the data being used to be the context to work from.
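In other words, something along these lines (hedged; the repo and build paths are placeholders):

cd /path/to/repo            # contains foobar.def and some_file.txt
singularity build --fakeroot /path/to/build/foobar.sif foobar.def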

What space is required to build a Singularity container?

I've got a def file to build a container (within a Vagrant VM). If I build as a sandbox:
sudo singularity build --sandbox mytest/ mytest.def
then the build completes. However, if I build straight to a container:
sudo singularity build mytest.sif mytest.def
then I get an error:
FATAL: While performing build: While creating SIF: while creating container: writing data object for SIF file: copying data object file to SIF file: write mytest.sif: no space left on device
If I try and convert the sandbox to a container:
sudo singularity build mytest.sif mytest/
then I get the same error.
The docs don't give an indication of the amount of space needed for a build vs a sandbox. I could increase the size of the Vagrant VM, but it would be good to have an idea of how much I should increase it by to ensure that the build is successful.
The size is dependent on the image. If you're building from a docker image, you can look at that to get a general idea based on its size. It's important to know where to put the extra drive space, however.
Singularity uses a tmp dir (default: /tmp) and a cache dir (default: $HOME/.singularity/cache) in addition to the directory you're building in. Note that with sudo singularity build the cache dir is /root/.singularity/cache rather than your user's home, because of sudo. VMs often have small /, /root, and/or /tmp partitions by default. This has been a gotcha for me in the past and may also be affecting you.
You can use the --tmpdir flag on build to change that to somewhere that has more space if desired (see documentation here).
To change the default cache dir you have to set the environment variable SINGULARITY_CACHEDIR, with details on specifics in the documentation here. You can also set the SINGULARITY_TMPDIR in the same manner instead of using the --tmpdir flag. It is sometimes nice to keep all the environment modifications in one place.
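A hedged sketch of pointing both at a larger partition (the /scratch paths are placeholders; with sudo you may need -E, or to set the variables for root, so they are picked up):

export SINGULARITY_CACHEDIR=/scratch/singularity/cache
export SINGULARITY_TMPDIR=/scratch/singularity/tmp
sudo -E singularity build mytest.sif mytest.def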

Singularity: What is the difference between an image, a container, and an instance?

I am starting to learn Singularity for reproducible analysis of scientific pipelines. A colleague explained that an image was used to instantiate a container. However, in reading through the documentation and tutorials, the term instance is also used and the usage of image and container seems somewhat interchangeable. So, I am not sure I precisely understand the difference between an image, container, and instance. I do get that a recipe is a text file for building one of these (I think an image?).
For example, on this page it explains:
Now we can build the definition file into an image! Simply run build and the image will be ready to go:
$ sudo singularity build url-to-pdf-api.img Singularity
Okay, so this uses the recipe Singularity to build an image, with the intuitive extension of .img. However, the help description of the build command states:
$ singularity help build
USAGE: singularity [...] build [build options...]
The build command compiles a container per a recipe (definition file) or based on a URI, location, or archive.
So this seems to indicate we are building a container?
Then, there are image and instance sub-commands.
Are all these terms used interchangeably? It seems sometimes they are and sometimes there is a difference between them.
A container is the general concept of creating a sandboxed run environment and can be used as a general term to refer to either Docker or Singularity images. However, it is sometimes also used to refer to the specific files being generated. This is probably not ideal, as it can clearly cause confusion for new users.
image is generally used to refer to the actual files created by singularity build ...
instance refers to a specific way of running singularity images. Normally, if you singularity run some_image.sif or singularity exec some_image.sif some_command, you can't easily access its environment while it's running. However, if you instead run singularity instance start some_image.sif some_instance1, it creates a persistent service that you can access like a Docker container. The singularity service/instance documentation has some good examples of how instances are used differently than the basic exec and run commands.
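For example, a hedged sketch of the instance workflow (the instance name is arbitrary):

singularity instance start some_image.sif instance1   # start a persistent, named instance
singularity instance list                             # see what's running
singularity exec instance://instance1 env | sort      # run commands against it
singularity instance stop instance1                   # shut it down

The image file on disk never changes; the instance is just a named, long-running use of it that exec and shell can attach to via the instance:// prefix.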