I am looking for a way to check whether a container from a hub URL is already cached. For example, I would want a command that does something like:
singularity iscached docker://username/container:tag
True
Any ideas?
Thanks!
You can use the command
singularity cache list --verbose | grep 'IMAGE_NAME.sif'
If the return code is 0, the image exists; you can get the return code with $?. You will have to convert names like docker://user/repo:tag into the SIF filenames that Singularity creates.
For example:
singularity cache list --verbose | grep 'alpine_latest.sif'
echo $? # prints 1
singularity pull docker://alpine:latest
singularity cache list --verbose | grep 'alpine_latest.sif'
echo $? # prints 0
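If you want this as a one-shot check, you can wrap the name conversion and the grep into a small helper. This is only a sketch, assuming the usual singularity pull naming convention (last path component with the ':' replaced by '_', plus .sif); the iscached function name is made up, not a real Singularity subcommand:

#!/bin/bash
# Hypothetical helper: exit status 0 if the image for a docker:// URI is in the cache
iscached() {
    uri="$1"                      # e.g. docker://username/container:tag
    name="${uri##*/}"             # keep the part after the last '/'  -> container:tag
    sif="${name/:/_}.sif"         # swap ':' for '_' and append .sif  -> container_tag.sif
    singularity cache list --verbose | grep -q "$sif"
}

iscached docker://alpine:latest && echo True || echo False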
I have a Nextflow script that sources scripts located in the ./bin directory where I invoke the Nextflow script. When I run the workflow without a container, Nextflow can find these scripts and execute them. However, when I run Nextflow with a container, the scripts cannot be found, even though I attempted to add them to the container image.
I guess that I am either not adding the executables to the container properly or that I am not referencing them properly in the Nextflow script or config file. Any help is appreciated.
Here is an example process:
process clean {
    // Remove common contaminants from fastq(s) using tapioca script
    // see: https://github.com/ncgr/tapioca

    input:
    path fastq_file from fastq_raw
    val x from machine_name

    output:
    path 'out.fastq' into clean_out
    stdout ch1

    script:
    """
    echo "Number of reads in $fastq_file"
    grep -c "^#" $fastq_file
    tap_contam_analysis --db ${dbdir}/phix174 --pct 80 ${fastq_file} > hits.txt
    echo "PhiX filtering completed for ${fastq_file}"
    """
}
Note that the "tap_contam_analysis" script is a Perl script located in ./bin, relative to where I invoke the Nextflow script.
Here are the relevant parts of my Dockerfile. Note that I attempted to modify $PATH in hopes that would fix the issue... no luck:
FROM ubuntu:18.04
ARG DEBIAN_FRONTEND=noninteractive
WORKDIR /usr/src
#Copy all the stuff for this Nextflow workflow (python and perl scripts)
COPY . .
ENV PATH=${PATH}:/usr/src/bin
Finally, here is my nextflow.config file:
process {
    container = 'nf_se_demux_to_bam_bai_denovo'
}
The executables just need to be added somewhere on your container's $PATH. Folks often like to use /usr/local/bin for this, but you can check other locations with:
docker run --rm ubuntu:18.04 bash -c 'echo $PATH'
Your Dockerfile, therefore, might look like this:
FROM ubuntu:18.04
ARG DEBIAN_FRONTEND=noninteractive
WORKDIR /usr/src
COPY ./bin /usr/local/bin
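Also make sure the scripts in ./bin are executable (chmod +x) before the image is built, otherwise they still won't run as commands. Then rebuild the image and run the workflow with the container engine enabled, either by setting docker.enabled = true in nextflow.config or on the command line. As a sketch, assuming your entry script is named main.nf (adjust to your actual script) and the image tag from your config:

docker build -t nf_se_demux_to_bam_bai_denovo .
nextflow run main.nf -with-docker nf_se_demux_to_bam_bai_denovo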
I declared the variable logfile in the environment section and was trying to assign its value after executing the ls -t -c1 *.log | head -1 command on the remote system.
I know I am doing it the wrong way. Any ideas how to assign the variable's value after executing a command on the remote system?
You can capture the returned value in a variable like this:
# Declare variable init
def logResult
logResult = sshCommand remote: remote, command: "ls -t -c1 *.log | head -1"
You can then use it in a script block, for example:
script {
    // returnStdout makes sh() return the command's output instead of its status
    test = sh(script: "echo ${logResult}", returnStdout: true).trim()
}
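For context, here is a minimal sketch of how this could sit in a declarative pipeline; it assumes the SSH Pipeline Steps plugin is installed, and the remote map values (host, user, key path) are placeholders for your own setup:

pipeline {
    agent any
    stages {
        stage('Find newest log') {
            steps {
                script {
                    // Placeholder connection details for the SSH Pipeline Steps plugin
                    def remote = [name: 'myserver', host: 'example.com', user: 'deploy', allowAnyHosts: true]
                    remote.identityFile = '/path/to/ssh/key'
                    def logResult = sshCommand remote: remote, command: "ls -t -c1 *.log | head -1"
                    echo "Newest remote log file: ${logResult}"
                }
            }
        }
    }
}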
I was trying to build Apache Impala from source (newest version on GitHub).
I followed these instructions to build Impala:
(1) clone Impala
> git clone https://git-wip-us.apache.org/repos/asf/incubator-impala.git
> cd Impala
(2) configure environment variables
> export JAVA_HOME=/usr/lib/jvm/java-7-oracle-amd64
> export IMPALA_HOME=<path to Impala>
> export BOOST_LIBRARYDIR=/usr/lib/x86_64-linux-gnu
> export LC_ALL="en_US.UTF-8"
(3) build
${IMPALA_HOME}/buildall.sh -noclean -skiptests -build_shared_libs -format
(4) The build fails with errors.
Help is needed to find the cause. It looks like the compiler does not support GLIBCXX_3.4.21, but GCC is automatically downloaded by the build script.
Appreciate your help!
Starting from commit https://github.com/apache/impala/commit/d5cefe07c931a0d3bf02bca97bbba05400d91a48, Impala has shipped with a development bootstrap script.
I tried the master branch in a fresh Ubuntu 16.04 Docker image and it works fine. Here is what I did.
Check out the latest Impala code base and run:
docker run --rm -it --privileged -v /home/amos/git/impala/:/root/Impala ubuntu:16.04
Inside the container, run:
apt-get update
apt-get install sudo
cd /root/Impala
Comment this out in bin/bootstrap_system.sh if you don't need the test data:
# if ! [[ -d ~/Impala-lzo ]]
# then
# git clone https://github.com/cloudera/impala-lzo.git ~/Impala-lzo
# fi
# if ! [[ -d ~/hadoop-lzo ]]
# then
# git clone https://github.com/cloudera/hadoop-lzo.git ~/hadoop-lzo
# fi
# cd ~/hadoop-lzo/
# time -p ant package
Also add this line before the ssh localhost whoami step:
echo "source ${IMPALA_HOME}/bin/impala-config-local.sh" >> ~/.bashrc
Change the build command in bin/bootstrap_development.sh to whatever you like:
${IMPALA_HOME}/buildall.sh -noclean -skiptests -build_shared_libs -format
Then run bin/bootstrap_development.sh.
You'll be prompted for some input; just accept the default values and it will work.
I am making a test file. I need to have a docker image and run it like this:
docker run www.google.com
Every time that URL changes, I need to pass it into a file inside the container. Is that possible?
Sure. You need a custom docker image but this is definitely possible.
Let's say you want to execute the command "ping -c 3" and pass it the parameter you provide on the command line.
You can build a custom image with the following Dockerfile:
FROM alpine:latest
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
ENTRYPOINT /entrypoint.sh
The entrypoint.sh file contains the following:
#!/bin/sh
ping -c 3 "$WEBSITE"
Then, you have to build your image by running:
docker build -t pinger .
Now, you can run your image with this command:
docker run --rm -e WEBSITE=www.google.com pinger
By changing the value of the WEBSITE env variable in the last step, you get what you requested.
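If you would rather keep something closer to your original docker run www.google.com form, a variant (just a sketch, reusing the hypothetical pinger image name) is to use the exec form of ENTRYPOINT and read the URL from the first argument instead of an environment variable:

FROM alpine:latest
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
# Exec form, so arguments after the image name are passed to the script
ENTRYPOINT ["/entrypoint.sh"]

with an entrypoint.sh of:

#!/bin/sh
# $1 is whatever follows the image name on the docker run command line
ping -c 3 "$1"

and run it as:

docker run --rm pinger www.google.com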
I just solved it by adding this:
--env="url=test"
to the docker run command, but I guess your way of doing it is better.
Thank you
My problem is that I have a cluster server with Torque PBS and want to use it to run a sequence comparison with the program rapsearch.
The normal RapSearch command is:
./rapsearch -q protein.fasta -d database -o output -e 0.001 -v 10 -x t -z 32
Now I want to run it with 2 nodes on the cluster-server.
I've tried:
echo "./rapsearch -q protein.fasta -d database -o output -e 0.001 -v 10 -x t -z 32" | qsub -l nodes=2
but nothing happened.
Do you have any suggestions? Where am I going wrong? Help please.
Standard output (and error output) files are placed in your home directory by default; take a look. You are looking for a file named STDIN.e[numbers]; it will contain the error message.
However, I see that you're using ./rapsearch but are not really being explicit about what directory you're in. Your problem is therefore probably a matter of changing directory into the directory that you submitted from. When your terminal is in the directory of the rapsearch executable, try echo "cd \$PBS_O_WORKDIR && ./rapsearch [arguments]" | qsub [arguments] to submit your job to the cluster.
Other tips:
You could add rapsearch to your path if you use it often. Then you can use it like a regular command anywhere. It's a matter of adding the line export PATH=/full/path/to/rapsearch/bin:$PATH to your .bashrc file.
Create a submission script for use with qsub. Here is a good example.
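Since the linked example is not reproduced here, a minimal sketch of such a submission script (call it rapsearch.pbs; the job name and resource requests are placeholders you should adjust) could look like:

#!/bin/bash
#PBS -N rapsearch
#PBS -l nodes=2:ppn=16
#PBS -l walltime=24:00:00
#PBS -j oe

# Run from the directory the job was submitted from
cd "$PBS_O_WORKDIR"

./rapsearch -q protein.fasta -d database -o output -e 0.001 -v 10 -x t -z 32

You would then submit it with qsub rapsearch.pbs; stdout and stderr land in rapsearch.o[jobid] (joined because of -j oe), which is where to look for errors.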