Cannot find ./bin scripts when running Nextflow with a container - nextflow

I have a Nextflow script that calls scripts located in the ./bin directory next to where I invoke the Nextflow script. When I run the workflow without a container, Nextflow finds these scripts and executes them. However, when I run Nextflow with a container, the scripts cannot be found, even though I attempted to add them to the container image.
I suspect I am either not adding the executables to the container properly or not referencing them properly in the Nextflow script or config file. Any help is appreciated.
Here is an example process:
process clean {
    // Remove common contaminants from fastq(s) using tapioca script
    // see: https://github.com/ncgr/tapioca

    input:
    path fastq_file from fastq_raw
    val x from machine_name

    output:
    path 'out.fastq' into clean_out
    stdout ch1

    script:
    """
    echo "Number of reads in $fastq_file"
    grep -c "^#" $fastq_file
    tap_contam_analysis --db ${dbdir}/phix174 --pct 80 ${fastq_file} > hits.txt
    echo "PhiX filtering completed for ${fastq_file}"
    """
}
Note that the "tap_contam_analysis" script is a perl script located in ./bin from where I invoke the Nextflow script.
Here are the relevant parts of my Docker file. Note that I attempted to modify the $PATH in hopes that would fix the issue...no luck:
FROM ubuntu:18.04
ARG DEBIAN_FRONTEND=noninteractive
WORKDIR /usr/src
# Copy everything for this Nextflow workflow (Python and Perl scripts)
COPY . .
ENV PATH=${PATH}:/usr/src/bin
Finally, here is my nextflow.config file:
process {
    container = 'nf_se_demux_to_bam_bai_denovo'
}

The executables just need to be added somewhere in your container's $PATH. Folks often like to use /usr/local/bin for this, but you can check other locations with:
docker run --rm ubuntu:18.04 bash -c 'echo $PATH'
Your Dockerfile, therefore, might look like this:
FROM ubuntu:18.04
ARG DEBIAN_FRONTEND=noninteractive
WORKDIR /usr/src
COPY ./bin /usr/local/bin
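If the scripts are not already executable in the repository, a RUN chmod +x step may also be needed. It is worth confirming the result before wiring it into Nextflow; a quick sanity check, assuming the image is built under the same name referenced in nextflow.config:

docker build -t nf_se_demux_to_bam_bai_denovo .
docker run --rm nf_se_demux_to_bam_bai_denovo which tap_contam_analysis

If which prints a path, the script is on the container's $PATH and the process should be able to call it, provided the container engine is actually enabled (e.g. docker.enabled = true in nextflow.config, or -with-docker on the command line).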

Related

Running PMD in GitLab CI Script doesn't work unless echo command is added after the script runs

This is an interesting issue. I have a GitLab project, and I've created a .gitlab-ci.yml to run PMD so that it scans my code after every commit. The .gitlab-ci.yml file looks like this:
image: "node:latest"
stages:
- preliminary-testing
apex-code-scan:
stage: preliminary-testing
allow_failure: false
script:
- install_java
- install_pmd
artifacts:
paths:
- pmd-reports/
####################################################
# Helper Methods
####################################################
.sfdx_helpers: &sfdx_helpers |
function install_java() {
local JAVA_VERSION=11
local JAVA_INSTALLATION=openjdk-$JAVA_VERSION-jdk
echo "Installing ${JAVA_INSTALLATION}"
apt update && apt -y install $JAVA_INSTALLATION
}
function install_pmd() {
local PMD_VERSION=6.52.0
local RULESET_PATH=ruleset.xml
local OUTPUT_DIRECTORY=pmd-reports
local SOURCE_DIRECTORY=force-app
local URL=https://github.com/pmd/pmd/releases/download/pmd_releases%2F$PMD_VERSION/pmd-bin-$PMD_VERSION.zip
# Here I would download and unzip the PMD source code. But for now I have the PMD source already in my project for testing purposes
# apt update && apt -y install unzip
# wget $URL
# unzip -o pmd-bin-$PMD_VERSION.zip
# rm pmd-bin-$PMD_VERSION.zip
echo "Installed PMD!"
mkdir -p $OUTPUT_DIRECTORY
echo "Going to run PMD!"
ls
echo "Start"
pmd-bin-$PMD_VERSION/bin/run.sh pmd -d $SOURCE_DIRECTORY -R $RULESET_PATH -f xslt -P xsltFilename=pmd_report.xsl -r $OUTPUT_DIRECTORY/pmd-apex.html
echo "Done"
rm -r pmd-bin-$PMD_VERSION
echo "Remove pmd"
}
before_script:
- *sfdx_helpers
When I try to run this pipeline, it fails right after PMD starts.
However, if I make a small change to PMD's .sh file and add an echo command at the very end, the pipeline succeeds.
PMD bin/run.sh before (doesn't work):
...
java ${HEAPSIZE} ${PMD_JAVA_OPTS} $(jre_specific_vm_options) -cp "${classpath}" "${CLASSNAME}" "$@"
PMD bin/run.sh after (does work):
...
java ${HEAPSIZE} ${PMD_JAVA_OPTS} $(jre_specific_vm_options) -cp "${classpath}" "${CLASSNAME}" "$@"
echo "Done1" # This is the last line in the file
I don't have the slightest idea why this is the case. Does anyone know why adding this echo command at the end of the .sh file would cause the pipeline to succeed? I could keep the echo command in place, but I would like to understand why it behaves this way. I don't want to be that guy who just leaves a comment saying "Hey, don't touch this line of code; I don't know why, but without it the whole thing fails." Thank you!
PMD exits with a specific exit code depending on whether it found violations or not; see https://pmd.github.io/latest/pmd_userdocs_cli_reference.html#exit-status
I guess your PMD run finds some violations, and PMD exits with exit code 4, which is not a success exit code.
In general, this is used to make the CI build fail when any PMD violations are present, forcing you to fix the violations before you get a green build.
If that is not what you want, e.g. you only want to report the violations but not fail the build, then you need to add the following command line option:
--fail-on-violation false
Then PMD will exit with exit code 0, even when there are violations.
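Applied to the install_pmd helper above, the call might then look like this (assuming the PMD version in use accepts the flag in this form):

pmd-bin-$PMD_VERSION/bin/run.sh pmd -d $SOURCE_DIRECTORY -R $RULESET_PATH -f xslt -P xsltFilename=pmd_report.xsl -r $OUTPUT_DIRECTORY/pmd-apex.html --fail-on-violation false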
So it appears that the java command PMD runs returns a non-zero exit code for some reason (even though the script itself is successful). Because I added an echo command at the end of that bash script, the last command in the script returned a success exit code, which is why the GitLab CI pipeline succeeded when the echo command was there.
To work around the non-zero exit code returned by the java PMD command, I changed this line in my .gitlab-ci.yml file to catch the non-zero exit code and proceed:
function install_pmd() {
    # ... For brevity I'm just including the line that was changed in this method
    pmd-bin-$PMD_VERSION/bin/run.sh pmd -d $SOURCE_DIRECTORY -R $RULESET_PATH -f xslt -P xsltFilename=pmd_report.xsl -r $OUTPUT_DIRECTORY/pmd-apex.html || echo "PMD Returned Exit Code"
    # ...
}
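Note that || echo ... masks every failure, not just the violations case. A stricter variant, sketched here under the assumption that exit code 4 (the "violations found" status from the PMD docs linked above) is the only code you want to tolerate, would be:

pmd-bin-$PMD_VERSION/bin/run.sh pmd -d $SOURCE_DIRECTORY -R $RULESET_PATH -f xslt -P xsltFilename=pmd_report.xsl -r $OUTPUT_DIRECTORY/pmd-apex.html || rc=$?
# re-raise anything that is not the "violations found" exit code
if [[ "${rc:-0}" -ne 0 && "${rc:-0}" -ne 4 ]]; then exit "${rc}"; fi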

Variable Substitution with Bamboo in Dockerfile

My Dockerfile looks like the following:
from httpd:${bamboo.test.tag}
COPY index.html /usr/local/apache2/htdocs/
In Bamboo I have a task with the following script:
docker build --no-cache -t myproj/my .
When running the job, I get the following error:
build 26-Sep-2022 10:42:26 Step 1/2 : from httpd:${bamboo.test.tag}
error 26-Sep-2022 10:42:26 failed to process "httpd:${bamboo.test.tag}": missing ':' in substitution
How can I substitute the tag?
This is actually a problem with how you are using the Dockerfile.
Docker knows nothing about Bamboo variables, so it cannot expand ${bamboo.test.tag} inside your Dockerfile. You need to pass the value as a build argument on the docker build command line and then declare it with the ARG keyword inside the Dockerfile.
Your Dockerfile would look like this:
ARG IMAGE_TAG
FROM httpd:${IMAGE_TAG}
COPY index.html /usr/local/apache2/htdocs/
And you would need to change your docker build command to:
docker build --no-cache --build-arg IMAGE_TAG=${bamboo.test.tag} -t myproj/my .
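To sanity-check the Dockerfile change outside of Bamboo, the same build can be run locally with a literal tag (2.4 here is just an example value standing in for bamboo.test.tag):

docker build --no-cache --build-arg IMAGE_TAG=2.4 -t myproj/my .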

gitlab-runner doesn't run ENTRYPOINT scripts in Dockerfile

I use GitLab CI in my project. I have created an image and pushed it to the GitLab container registry.
To create the image and register it in the GitLab container registry, I have created a Dockerfile.
Dockerfile:
...
ENTRYPOINT [ "scripts/entry-gitlab-ci.sh" ]
CMD "app"
...
entry-gitlab-ci.sh:
#!/bin/bash
set -e

if [[ $@ == 'app' ]]; then
    echo "Initialize image"
    rake db:drop
    rake db:create
    rake db:migrate
fi

exec "$@"
The image is created successfully, but when the gitlab-runner pulls and runs the created image, it does not execute the entry-gitlab-ci.sh script.
What is the problem?
Image entrypoints definitely run in GitLab CI with the docker executor, both for services and for jobs, as long as this has not been overridden by the job configuration.
There are two key problems if you're trying to use this image as your job image:
GitLab overrides the command for the image, so your if condition will never match here.
Your entrypoint should be prepared to run a shell script. So you should use something like exec /bin/bash, not exec "$@", for a job image.
Per the documentation:
The runner expects that the image has no entrypoint or that the entrypoint is prepared to start a shell command.
So your entrypoint might look something like this:
#!/usr/bin/env bash
# gitlab-entrypoint-script
echo "doing something before running commands"

if [[ -n "$CI" ]]; then
    echo "this block will only execute in a CI environment"
    echo "now running script commands"
    # this is how GitLab expects your entrypoint to end, if provided
    # it will execute the job's script commands from stdin
    exec /bin/bash
else
    echo "Not in CI. Running the image normally"
    exec "$@"
fi
This assumes you are using the docker executor and that the runner is using a Docker version >= 17.06.
You can also explicitly set the entrypoint for job images and service images in the job's image: configuration. This may be useful, for example, if your image normally has an entrypoint and you don't want to build it with GitLab CI in mind, or if you want to use a public image whose entrypoint is not compatible.
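For example, overriding the entrypoint in the job configuration might look like this (the image name is just a placeholder):

my-job:
  image:
    name: registry.example.com/group/project/my-image:latest
    entrypoint: [""]
  script:
    - ./scripts/entry-gitlab-ci.sh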
From my experience and struggles, I couldn't get GitLab to run the entrypoint automatically. The same goes for trying to get a login shell working easily to pick up environment variables. Instead, you have to run the script manually from the CI:
# .gitlab-ci.yml
build:
  image: your-image-name
  stage: build
  script:
    - /bin/bash ./scripts/entry-gitlab-ci.sh

Create default files for conan without install

I'm creating a Docker image as a build environment where I can mount a project and build it. For the build I use CMake and Conan. The Dockerfile of this image:
FROM alpine:3.9
RUN ["apk", "add", "--no-cache", "gcc", "g++", "make", "cmake", "python3", "python3-dev", "linux-headers", "musl-dev"]
RUN ["pip3", "install", "--upgrade", "pip"]
RUN ["pip3", "install", "conan"]
WORKDIR /project
Files like
~/.conan/profiles/default
are created after I call
conan install ..
meaning these files end up in the container and not in the image. The default behavior of Conan is to set
compiler.libcxx=libstdc++
I'd like to run something like
RUN ["sed", "-i", "s/compiler.libcxx=libstdc++/compiler.libcxx=libstdc++11/", "~/.conan/profiles/default"]
to change the libcxx value, but this file does not exist at that point. The only way I have found to make Conan create the default profile is to install something.
Currently I'm running this container with
docker run --rm -v $(dirname $(realpath $0))/project:/project build-environment /bin/sh -c "\
rm -rf build && \
mkdir build && \
cd build && \
conan install -s compiler.libcxx=libstdc++11 .. --build missing && \
cmake .. && \
cmake --build . ; \
chown -R $(id -u):$(id -u) /project/build \
"
but I need to remove -s compiler.libcxx=libstdc++11 as it should be dependent on the image and not fixed by the build script.
Is there a way to initialize Conan inside the image and edit the configuration without installing something? Currently I'm planning to write the whole configuration myself, but that seems like too much, as I want to use the default configuration and change only one line.
You can also create an image from a running container. Try setting up Conan in a running container and then create an image from it. Since the setup happens in the running container, the resulting image will carry those defaults. To create that image you can follow this link:
https://docs.docker.com/engine/reference/commandline/commit/
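A rough sketch of that approach, reusing the build-environment image name from the question and assuming a Conan 1.x install (where conan profile new default --detect creates the default profile without installing any package):

docker run --name conan-init build-environment /bin/sh -c "\
    conan profile new default --detect && \
    conan profile update settings.compiler.libcxx=libstdc++11 default"
docker commit conan-init build-environment:conan-configured
docker rm conan-init

The same two conan profile commands could also be added as a RUN step in the Dockerfile itself, which avoids the commit step entirely.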

How to test docker image with external script

Assume that I have a code repo. That code repo has the following files in it:
Dockerfile
test.sh
Let's say I now build a docker image from the Dockerfile:
docker build -t my-image .
Now I want to execute test.sh in the context of my-image in a container, yet I did not add test.sh to the docker image during the build.
How do I run the docker image and execute test.sh in it? Do I have to mount the repo as a volume first, or is there a quicker way?
A couple of options:
Copy it in (docker cp test.sh <container_id>:<path_file_should_go_inside>), but then you have to run the file as a separate step (see the sketch after this list)
Mount it in (docker run -v $(pwd)/test.sh:<path_file_should_go_inside> my-image <path_file_should_go_inside>)
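For the copy-in route, the separate steps might look roughly like this (the container name and target path are placeholders, and the image is assumed to have a shell and sleep available):

docker run -d --name test-run my-image sleep infinity   # keep a container alive
docker cp test.sh test-run:/tmp/test.sh                 # copy the script in
docker exec test-run sh /tmp/test.sh                    # run it inside the container
docker rm -f test-run                                   # clean up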
If test.sh is not part of the image, then you will have to mount the local repository as a volume:
docker run -v /path/to/local/repo:/tmp my-image /tmp/test.sh