Bcftools 1.16 able to add F_MISSING tag? - snakemake

I tried adding the F_MISSING tag using bcftools 1.16. When I run this command:
bcftools +fill-tags input.vcf.gz -- -t 'F_MISSING' | bcftools view -i 'INFO/F_MISSING<0.25' -Oz -o output.vcf.gz
I get the following error:
Error parsing "--tags F_MISSING": the tag "F_MISSING" is not supported
This command runs fine using bcftools 1.15. However, version 1.15 gives complications with other packages I use in my snakefile. Do you know of an alternative way to add F_MISSING using bcftools 1.16?
I installed bcftools 1.16 in a newly created conda env using conda install -c bioconda bcftools, as indicated on https://anaconda.org/bioconda/bcftools
When I type bcftools +fill-tags --version, I get:
bcftools 1.9 using htslib 1.9
plugin at 1.9 using htslib 1.9
##SOLUTION##
Indeed, the issue was that conda was not installing the most recent version of bcftools.
I solved it by changing the .condarc file to include solely the following lines:
channels:
- conda-forge
- bioconda
- defaults
The order is crucial as well.
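The version output in the question is the clue here: the environment reports bcftools 1.9, not 1.16. Versions compare component by component, so 1.9 is older than 1.16 even though it sorts later as plain text. A minimal sketch of that check, assuming the one-line output format shown in the question; the helper names parse_version and reported_version are hypothetical:

```python
import re


def parse_version(text):
    """Turn a dotted version such as '1.16' into a tuple of ints: (1, 16)."""
    return tuple(int(part) for part in text.split("."))


def reported_version(line):
    """Extract the bcftools version from a line such as
    'bcftools 1.9 using htslib 1.9' (format copied from the question)."""
    match = re.search(r"bcftools\s+(\d+(?:\.\d+)*)", line)
    if match is None:
        raise ValueError(f"no bcftools version found in: {line!r}")
    return parse_version(match.group(1))


installed = reported_version("bcftools 1.9 using htslib 1.9")
required = parse_version("1.16")

# Component-wise comparison: (1, 9) < (1, 16), so the installed build is too old.
print(installed < required)
# Plain string comparison gets this wrong, because "9" sorts after "1".
print("1.9" < "1.16")
```

Running a check like this right after creating the environment makes it obvious when the solver has silently picked an old build.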

I'm only giving a partial answer here:
However, version 1.15 gives complications with other packages I use in my snakefile.
You could work around this by making snakemake use a dedicated conda environment for the rule(s) needing bcftools 1.15. E.g.:
rule fill_tags:
    input:
        ...
    output:
        ...
    conda:
        "envs/bcftools-1.15.yaml"
    shell:
        r"""
        bcftools +fill-tags {input.vcf} -- -t 'F_MISSING' \
        | bcftools view -i 'INFO/F_MISSING<0.25' -Oz -o {output.vcf}
        """
Where envs/bcftools-1.15.yaml contains something like:
dependencies:
- bcftools=1.15
Then run snakemake with the --use-conda flag.

GraphDB Docker Container Fails to Run: adoptopenjdk/openjdk11:alpine

When using the standard Dockerfile available here, GraphDB fails to start with the following output:
Could not find any executable java binary. Please install java in your PATH or set JAVA_HOME
Looking into it, the Dockerfile uses adoptopenjdk/openjdk11:alpine, which was recently updated to Alpine 3.14.
If I switch to an older Docker image (or use adoptopenjdk/openjdk12:alpine), GraphDB starts without a problem.
How can I fix this while still using the latest version of adoptopenjdk/openjdk11:alpine?
Below is the Dockerfile:
FROM adoptopenjdk/openjdk11:alpine
# Build time arguments
ARG version=9.1.1
ARG edition=ee
ENV GRAPHDB_PARENT_DIR=/opt/graphdb
ENV GRAPHDB_HOME=${GRAPHDB_PARENT_DIR}/home
ENV GRAPHDB_INSTALL_DIR=${GRAPHDB_PARENT_DIR}/dist
WORKDIR /tmp
RUN apk add --no-cache bash curl util-linux procps net-tools busybox-extras wget less && \
    curl -fsSL "http://maven.ontotext.com/content/groups/all-onto/com/ontotext/graphdb/graphdb-${edition}/${version}/graphdb-${edition}-${version}-dist.zip" > \
    graphdb-${edition}-${version}.zip && \
    bash -c 'md5sum -c - <<<"$(curl -fsSL http://maven.ontotext.com/content/groups/all-onto/com/ontotext/graphdb/graphdb-${edition}/${version}/graphdb-${edition}-${version}-dist.zip.md5) graphdb-${edition}-${version}.zip"' && \
    mkdir -p ${GRAPHDB_PARENT_DIR} && \
    cd ${GRAPHDB_PARENT_DIR} && \
    unzip /tmp/graphdb-${edition}-${version}.zip && \
    rm /tmp/graphdb-${edition}-${version}.zip && \
    mv graphdb-${edition}-${version} dist && \
    mkdir -p ${GRAPHDB_HOME}
ENV PATH=${GRAPHDB_INSTALL_DIR}/bin:$PATH
CMD ["-Dgraphdb.home=/opt/graphdb/home"]
ENTRYPOINT ["/opt/graphdb/dist/bin/graphdb"]
EXPOSE 7200
The issue comes from an update in the base image. A few weeks ago AdoptOpenJDK switched its Alpine images to Alpine 3.14, which has some issues with older container runtimes (runc). The issue is described in the release notes: https://wiki.alpinelinux.org/wiki/Release_Notes_for_Alpine_3.14.0
Updating your Docker will fix the issue. However, if you don't wish to update your Docker, there's a workaround.
Some additional info:
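The cause of the issue is that containers based on Alpine 3.14 running on older Docker versions have problems with the test flag "-x": a check such as if [ -x /opt/java/openjdk/bin/java ] returns false, although java is there and is executable. The Alpine 3.14 release notes attribute this to musl now using the faccessat2 syscall, which older Docker and libseccomp versions block.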
You can work around this for now as follows:
1. Pull the GraphDB distribution.
2. Unzip it.
3. Open "setvars.in.sh" in the bin folder.
4. Find and remove the if block around line 32:
   if [ ! -x "$JAVA" ]; then
       echo "Could not find any executable java binary. Please install java in your PATH or set JAVA_HOME"
       exit 1
   fi
5. Zip it again and provide it in the Dockerfile instead of pulling it from maven.ontotext.com.
6. Pass the zip to the Dockerfile with ADD.
You can check the GraphDB free edition's Dockerfile for a reference on how to pass the zip file to the Dockerfile: https://github.com/Ontotext-AD/graphdb-docker/blob/master/free-edition/Dockerfile
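Before patching the distribution, it can help to confirm that a container shows the "-x" symptom described above. The following is a rough diagnostic sketch, not verified against the affected images: it compares a file's permission bits with the result of an access check, on the assumption that the failing shell test goes through a similar access call. The function names are hypothetical.

```python
import os
import stat
import tempfile


def executable_by_mode(path):
    """True if any execute bit is set in the file's mode (read via stat)."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))


def executable_by_access(path):
    """True if the OS access check reports the file as executable.

    Effective IDs are used where the platform supports it, which roughly
    mirrors how a shell evaluates [ -x file ] (an assumption)."""
    use_effective = os.access in os.supports_effective_ids
    return os.access(path, os.X_OK, effective_ids=use_effective)


def looks_affected(path):
    """Mode bits say 'executable' but the access check disagrees."""
    return executable_by_mode(path) and not executable_by_access(path)


if __name__ == "__main__":
    # Probe with a throwaway file; inside the container you would pass
    # the real path instead, e.g. /opt/java/openjdk/bin/java.
    with tempfile.NamedTemporaryFile(delete=False) as handle:
        probe = handle.name
    os.chmod(probe, 0o755)
    try:
        print("mode bits say executable:", executable_by_mode(probe))
        print("access check says executable:", executable_by_access(probe))
        print("looks affected:", looks_affected(probe))
    finally:
        os.remove(probe)
```

If looks_affected returns True for the java binary, updating Docker or applying the workaround above is the likely fix.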

Problems running ImageDataBunch in Deepnote

I'm having trouble running this line of code in Deepnote, does anyone know why?
data = ImageDataBunch.from_folder(path, train="train", valid ="test",ds_tfms=get_transforms(), size=(256,256), bs=32, num_workers=4).normalize()
The error says:
NameError: name 'ImageDataBunch' is not defined
And previously, I have imported the Fastai library. So I don't get it!
The FastAI setup in Deepnote is not that straightforward. It's best to use a custom environment where you set stuff up in a Dockerfile and everything works afterwards in the notebook. I am not sure if the ImageDataBunch or whatever you're trying to do works the same way in FastAI v1 and v2, but here are the details for v1.
This is a Dockerfile which sets up the FastAI environment via conda:
# This is Dockerfile
FROM deepnote/python:3.9
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda.sh
RUN bash ~/miniconda.sh -b -p $HOME/miniconda
ENV PATH $HOME/miniconda/bin:$PATH
ENV PYTHONPATH $HOME/miniconda
RUN $HOME/miniconda/bin/conda install python=3.9 ipykernel -y
RUN $HOME/miniconda/bin/conda install -c fastai -c pytorch fastai -y
RUN $HOME/miniconda/bin/python -m ipykernel install --user --name=conda
ENV DEFAULT_KERNEL_NAME "conda"
After that, you can test the fastai imports in the notebook:
import fastai
from fastai.vision import *
print(fastai.__version__)
ImageDataBunch
And if you download and unpack this sample MNIST dataset, you should be able to load the data like you suggested:
data = ImageDataBunch.from_folder(path, train="train", valid ="test",ds_tfms=get_transforms(), size=(256,256), bs=32, num_workers=4).normalize()
Feel free to check out or clone my Deepnote project to continue working on this.

Invalid argument --model_config_file_poll_wait_seconds

I'm trying to start tensorflow-serving with the following two options, as shown in the documentation:
docker run -t --rm -p 8501:8501 \
-v "$(pwd)/models/:/models/" tensorflow/serving \
--model_config_file=/models/models.config \
--model_config_file_poll_wait_seconds=60
The container does not start because it does not recognize the argument --model_config_file_poll_wait_seconds.
unknown argument: --model_config_file_poll_wait_seconds=60
usage: tensorflow_model_server
I'm on the latest docker image, 1.14.0 and the line is taken straight from the documentation
https://www.tensorflow.org/tfx/serving/serving_config
Does this argument even work?
Many thanks.
It seems https://www.tensorflow.org/tfx/serving/serving_config describes code that has not yet been released in a new version, which is odd. I will ask about that.
That page is generated from this source:
https://github.com/tensorflow/serving/blob/master/tensorflow_serving/g3doc/serving_config.md, which mentions the --model_config_file_poll_wait_seconds flag.
However, the same document for 1.14.0 has no mention of the flag:
https://github.com/tensorflow/serving/blob/1.14.0/tensorflow_serving/g3doc/serving_config.md
Try using the nightly tensorflow serving image and see if it works.
docker run -t --rm -p 8501:8501 \
-v "$(pwd)/models/:/models/" tensorflow/serving:nightly \
--model_config_file=/models/models.config \
--model_config_file_poll_wait_seconds=60
Just tried. Tensorflow Serving 2.1.0 supports it while 1.14.0 doesn't.

Singularity container from conda environment

I want to build a container from my conda environment following this post. However, I get the following error: '/bin/sh: 1: cannot create ~/.bashrc: Directory nonexistent'. I am using a vagrant VM to build my image and would be grateful for any help.
Editing the .bashrc, aside from failing, will not be helpful as the shell loaded by singularity is explicitly --norc. You want to use the $SINGULARITY_ENVIRONMENT variable in %post to have the values available.
Something along these lines:
%post
    # You may need to install some prerequisites that your host system has installed outside of conda, e.g.
    # apt update && apt install -y build-essential make zlib1g-dev
    ENV_NAME=$(head -1 environment.yml | cut -d' ' -f2)
    echo ". /opt/conda/etc/profile.d/conda.sh" >> $SINGULARITY_ENVIRONMENT
    echo "conda activate $ENV_NAME" >> $SINGULARITY_ENVIRONMENT
    . /opt/conda/etc/profile.d/conda.sh
    conda env create -f environment.yml -p /opt/conda/envs/$ENV_NAME
I listed a few libraries that you probably have installed on your current machine but that might not be installed in the slim docker image. You can install them via apt or conda, depending on your preference. If the build does fail on a missing library, the fix will be specific to your environment.yml and host OS, so you'll have to iterate until the build succeeds.
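Note that the ENV_NAME line in the snippet above takes the second space-separated field of the first line of environment.yml, so it only works when "name:" is the very first line. A minimal sketch of a more tolerant lookup, assuming a flat top-level "name:" key; the function name and the sample file content are hypothetical, and a real YAML parser would be more robust:

```python
def env_name_from_yaml_text(text):
    """Return the value of the top-level 'name:' key, or None if absent."""
    for line in text.splitlines():
        # Only unindented lines are considered, so nested keys are ignored.
        if line.startswith("name:"):
            # Drop any trailing comment, then surrounding whitespace and quotes.
            value = line[len("name:"):].split("#", 1)[0].strip()
            return value.strip("'\"") or None
    return None


# Hypothetical file content in which 'name:' is not the first line.
sample = """channels:
  - conda-forge
name: myenv
dependencies:
  - python=3.9
"""

print(env_name_from_yaml_text(sample))
```

With the sample above, head -1 would return the channels line, whereas this lookup still finds the environment name.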

Scripts installed by the deb package have wrong prefix

Building our own deb packages we've run into the issue of having to patch manually some scripts so they get the proper prefix.
In particular:
- We're building mono.
- We're using the official tarballs.
- The scripts that end up with the wrong prefix are mcs, xbuild, nunit-console4, etc.
An example of a wrong script:
#!/bin/sh
exec /root/7digital-mono/mono/bin/mono \
--debug $MONO_OPTIONS \
/root/7digital-mono/mono/lib/mono/2.0/nunit-console.exe "$@"
What should be the correct end result:
#!/bin/sh
exec /usr/bin/mono \
--debug $MONO_OPTIONS \
/usr/lib/mono/2.0/nunit-console.exe "$@"
The workaround we're using in our build-package script before calling dpkg-buildpackage:
sed -i s,`pwd`/mono,/usr,g $TARGET_DIR/bin/mcs
sed -i s,`pwd`/mono,/usr,g $TARGET_DIR/bin/xbuild
sed -i s,`pwd`/mono,/usr,g $TARGET_DIR/bin/nunit-console
sed -i s,`pwd`/mono,/usr,g $TARGET_DIR/bin/nunit-console2
sed -i s,`pwd`/mono,/usr,g $TARGET_DIR/bin/nunit-console4
Now, what is the CORRECT way to fix this? Full debian package creation scripts here.
Disclaimer: I know there are preview packages of Mono 3 here! But those don't work for Squeeze.
The proper way is to not call ./configure --prefix=$TARGET_DIR.
That tells all the binaries, scripts, etc. that the installed files will end up in ${TARGET_DIR}, whereas they really should end up in /usr.
Instead, configure with --prefix=/usr and use the DESTDIR variable (as in make install DESTDIR=${TARGET_DIR}) to redirect the installation target at install time: files will end up in ${TARGET_DIR}/${prefix} but will only have ${prefix} "built in".
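The difference between the two settings can be illustrated with a small path calculation. This is only a sketch of the usual DESTDIR convention, not of the mono build itself: the function names are hypothetical, and the staging directory is the example path from the broken script in the question.

```python
import posixpath


def staged_path(destdir, prefix, relative):
    """Where 'make install DESTDIR=...' physically writes a file.

    DESTDIR is prepended to the absolute prefix, so the prefix's leading
    slash is stripped before joining."""
    return posixpath.join(destdir, prefix.lstrip("/"), relative)


def embedded_path(prefix, relative):
    """The path that gets baked into generated scripts: prefix only."""
    return posixpath.join(prefix, relative)


destdir = "/root/7digital-mono/mono"  # example staging directory (assumption)
prefix = "/usr"

# The file lands under the staging directory while packaging...
print(staged_path(destdir, prefix, "bin/mono"))
# ...but scripts refer to the final location on the target system.
print(embedded_path(prefix, "bin/mono"))
```

The packaging step then picks up everything under the staging directory, and the generated wrapper scripts need no sed patching because they never saw the staging path.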