Docker images with architecture optimisation?

Some libraries, such as BLAS/LAPACK or certain optimisation libraries, are optimised for the local machine architecture at compile time. Let's take OpenBLAS as an example. There are two ways to create a Docker container with OpenBLAS:
1. Use a Dockerfile in which you specify a git clone of the OpenBLAS library together with all the necessary compilation flags and build commands.
2. Pull and run someone else's image of Ubuntu + OpenBLAS from Docker Hub.
Option (1) guarantees that OpenBLAS is built and optimised for your machine. What about option (2)? As a Docker novice, I see images as something fixed and static, so running such an image would not be optimised for my machine (which might be AMD-based instead of the maintainer's Intel CPU). What left me confused is that the image ipython/scipyserver does clone the latest OpenBLAS master from GitHub during the build.
I seem to misunderstand the concept of Docker images and/or automated builds, and I would very much appreciate a clarification.

No, I think you're pretty much right.
The image you link to is an automated build, so OpenBLAS is compiled using the kernel and architecture of the Docker build server. I did notice that the build script sets the following flags when building OpenBLAS:
DYNAMIC_ARCH=1 NO_AFFINITY=1 NUM_THREADS=32
These presumably make the install portable, at a performance cost. An alternative would be to have a separate image for each of the various build configurations.
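If you want the fully optimised variant, option (1) amounts to building OpenBLAS yourself on the target machine. A minimal sketch of the build steps, assuming a /opt/openblas install prefix (these would go into RUN instructions in a Dockerfile):

# Build OpenBLAS for the machine the build runs on. Omitting
# DYNAMIC_ARCH=1 lets the build auto-detect the local CPU and
# optimise for it (the opposite of the portable flags above).
git clone https://github.com/xianyi/OpenBLAS.git
cd OpenBLAS
make -j"$(nproc)" NO_AFFINITY=1
make PREFIX=/opt/openblas install

Note that with this approach the optimisation is fixed at docker build time, on the machine running the build, which is exactly why a pre-built image pulled from Docker Hub cannot carry the same guarantee.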

Related

Problem building GNURadio in custom environment

I am trying to build the latest GNURadio package on my development system. Unfortunately, this system's configuration is tightly controlled, and I can't just install new software packages on it, as it is used to develop a product and all development systems are kept in lockstep. We are currently on an older version of RedHat.
While I cannot modify the system includes, I can download and use newer versions of packages locally (in non-system directories), as long as that doesn't affect the product build/debug environment. Normally this isn't a problem.
However, when building GNURadio I found that our development platforms use an older version of the Boost libraries than is required to build GNURadio. So I got the latest version of Boost and extracted it into my local (home) directory. I found several sets of directions for what I thought was instructing CMake to use additional include directories. Unfortunately, this hasn't worked with the Boost libraries: CMake keeps complaining that it finds the older version of Boost and not the newer one I have extracted locally.
I have tried using
-DCMAKE_CXX_STANDARD_INCLUDE_DIRECTORIES=<dir>
and
-DCMAKE_CXX_STANDARD_INCLUDE_DIRECTORIES_BEFORE=<dir>
and this had no effect. I then tried adding the following to the top-level CMakeLists.txt file:
SET(CMAKE_INCLUDE_DIRECTORIES_BEFORE ON)
SET(CMAKE_CXX_STANDARD_INCLUDE_DIRECTORIES <dir>)
or even
include_directories(BEFORE <dir>)
Again, no joy.
I did a bit of digging and found that there is a GrBoost.cmake module with an additional configuration variable for the Boost directory, so I added this:
list(PREPEND BOOST_LIBRARYDIR "<dir>")
to the top of the file. Again, no luck.
I've never used CMake before (and I'm not really keen on learning yet another build system if I don't have to; our company just switched to Bazel, and I am coming up to speed on that), so I am flying blind here.
What do I have to do to get CMake to look in my local directory to find the Boost stuff I downloaded?
OK. As so often happens, just after asking the question I was able to find an answer.
It turns out that there is a variable you can set on the CMake command line (-DCMAKE_PREFIX_PATH=<dir>) to specify additional base paths to search for CMake config files. I just added this to the command line and my local Boost was found just fine.
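For reference, the invocation was along these lines (the Boost path and source directory are hypothetical; point the variable at wherever you extracted the newer Boost):

# Make CMake search the locally extracted Boost before the system one.
cmake -DCMAKE_PREFIX_PATH="$HOME/boost_1_78_0" path/to/gnuradio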
I wasn't even aware that Boost came with such config files. Live and learn.
@vre's comment would probably have worked just as well (maybe better, in fact).

Parallel STL algorithms requiring TBB or not?

I remember that g++ 9.3.0 required linking against libtbb to use the parallel STL algorithms; otherwise it gave compilation errors. I have a Docker container using an Ubuntu image where I have installed only g++-10, and I can use the algorithms without writing -ltbb. How can I explain this? I have searched everywhere and there is no libtbb anywhere, so I am assuming g++-10 no longer needs it? Where can I read some documentation about the fact that it no longer needs it, or that it is required up to version x.x.x? Thanks for clarifying.
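Not an authoritative answer, but a quick way to investigate: compile a minimal program that actually uses an execution policy, deliberately without -ltbb, and inspect the result. A sketch (the file name is hypothetical):

# Minimal parallel-STL program; with g++ 9 this required -ltbb to link.
cat > par_test.cpp <<'EOF'
#include <algorithm>
#include <execution>
#include <vector>

int main() {
    std::vector<int> v(1000000, 1);
    // Request the parallel execution policy.
    std::sort(std::execution::par, v.begin(), v.end());
}
EOF
g++-10 -std=c++17 par_test.cpp -o par_test   # note: no -ltbb
ldd par_test | grep -i tbb || echo "binary has no libtbb dependency"

If the binary has no libtbb dependency yet still uses several cores on a large input, some other backend is executing the policy; if it runs on a single core, the library has silently fallen back to serial execution, which would equally explain the absence of a link error.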

Best way to write a setup script for a multi-language project package that includes Anaconda, Atom, Node.js, etc.?

I am designing an environment for productive research, i.e. writing, data analysis, publication, etc.
In order to share the final results with others, I need to find a way to package this and to set up the local installation.
The project depends on Anaconda, so conda as a package manager is available.
It also includes
Pandoc and some Pandoc packages; some will have to be fetched directly from GitHub because certain versions are not available via conda-forge (doable in conda)
Atom and Atom packages; they should be installed and configured by my script (this works on the CLI via the apm package manager)
Node.js and Mermaid and a few other JS packages, which require npm calls
Some file-system-level operations, like deleting parts of packages of which I only need a portion, creating symlinks and aliases, etc.
Maybe some Python code for reading or modifying YAML/JSON/INI files.
The main project will reside in a GitHub repository. It will be fine for users to clone it from there and start a build script locally.
My idea is to write a Bash shell script (a minimal sketch follows the list) that
creates a conda environment based on requirements.yaml for everything that can be done this way
installs other parts using CLI commands (wget/curl etc.)
performs all necessary modifications using CLI commands, maybe with a few short Python scripts (e.g. for reading or changing JSON or YAML files).
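A minimal sketch of such a script; the file names, environment name, package names, download URL, and helper script below are all placeholders, not part of any real project:

#!/usr/bin/env bash
set -euo pipefail  # abort on errors, unset variables, failed pipes

# 1. Conda-managed parts: create the environment (or update it if it exists).
conda env create -f requirements.yaml -n research-env \
  || conda env update -f requirements.yaml -n research-env

# 2. Non-conda tools via their own CLIs.
apm install markdown-preview-enhanced
npm install -g @mermaid-js/mermaid-cli

# 3. Direct downloads, e.g. a Pandoc filter not on conda-forge.
curl -L -o "$HOME/.local/bin/some-pandoc-filter" \
  "https://github.com/example/some-filter/releases/latest/download/filter"
chmod +x "$HOME/.local/bin/some-pandoc-filter"

# 4. File-system-level tweaks: share one set of binaries across installations.
ln -sfn "$HOME/tools/atom" ".tools/atom"

# 5. Small Python helpers for reading/modifying config files.
conda run -n research-env python scripts/patch_config.py

Each numbered step mirrors one of the bullets above, and the script stays transparent because every action is an ordinary CLI command the user could also run by hand.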
My local usage will be on macOS Big Sur; Linux should be supported, and Windows compatibility would be nice to have.
Before I start:
Is this approach viable? I think it will be pretty transparent, but of course also a bit proprietary.
Docker is likely overkill for my purpose, and I have also read that execution will be slow on macOS.
The same environment will likely be installed multiple times on the same user's machine, so it is important that I can control, e.g., the reuse of existing packages and files via aliases or symlinks. It is not important that the multiple installations be decoupled for the non-Python/non-conda parts (e.g. Atom, Node.js, and Mermaid could be the same binaries for all installations; just the set of Python packages might vary per installation).
Thanks for your expertise!

MXE: compile on Linux for Linux

So I saw the project http://mxe.cc/ and tried it; it seems very easy to compile things for Windows with it. I tried to hack it a little to compile binaries for Linux instead, because if it compiles for another system so easily, how hard can it be to compile for the host? 90% of the packages seem to build out of the box, but there are some errors, and therefore I cannot build. I want to ask: how should I configure MXE correctly to build for the Linux host? I know this is not supported, but I don't think it should be that hard, because we build from source anyway. And there are next to no modifications to the downloaded sources either (in a Windows build, that is).
For people who might ask why I don't want to use shared libraries: I basically want to have two options:
dpkg package for user with dependencies specified (the linux way)
single standalone static executable
Any suggestions? Or maybe there's a whole other guide on how to build things from scratch on Linux (avoiding a lot of manual work, like MXE does)?
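For concreteness, here is roughly what the two distribution targets look like once the binaries build at all; this is independent of MXE, and every name, version, and dependency below is a placeholder:

# Option 2: a single standalone static executable.
# (Fully static glibc builds have caveats; musl-based toolchains
# are often friendlier for this.)
gcc -static -O2 -o myapp main.c

# Option 1: a minimal .deb with its dependencies declared.
mkdir -p pkg/DEBIAN pkg/usr/bin
cp myapp pkg/usr/bin/
cat > pkg/DEBIAN/control <<'EOF'
Package: myapp
Version: 1.0
Architecture: amd64
Maintainer: Placeholder <dev@example.com>
Depends: libc6 (>= 2.17)
Description: Placeholder description
EOF
dpkg-deb --build pkg myapp_1.0_amd64.deb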

Which documentation should I use to develop & test makefile-based applications with Yocto?

I have installed Yocto and successfully ran the host-prepare.sh script. To develop makefile-based applications, what should I do next: install a toolchain, the ADT, or kernel and filesystem images? Is there any documentation that describes the process step by step?
In many cases, you do not need to install Yocto at all; what you need is just an appropriate toolchain, along with a filesystem image, to develop and test a simple program with QEMU (runqemu, which launches the QEMU emulator, is part of the toolchain). These toolchains and filesystem images are available to download and ready to use. Refer to the 'Using Pre-Built Binaries and QEMU' section in this documentation.
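Assuming you go the pre-built route, the workflow is roughly the following; every path, version, and machine name here is a placeholder, so substitute the ones from the artifacts you actually downloaded:

# 1. Install the downloaded cross-toolchain (a self-extracting installer).
./poky-glibc-x86_64-core-image-minimal-armv5te-toolchain-2.x.sh

# 2. Source the environment script it installs; this exports CC, CXX,
#    CFLAGS, LDFLAGS, etc., pointing at the cross compiler.
source /opt/poky/2.x/environment-setup-armv5te-poky-linux-gnueabi

# 3. Build the makefile-based application with those variables.
make CC="$CC" CFLAGS="$CFLAGS" LDFLAGS="$LDFLAGS"

# 4. Boot the pre-built kernel and filesystem image under QEMU to test.
runqemu qemuarm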
In many cases, one do not need to install Yocto at all, what is needed is just appropriate toolchain along with filesystem image to develop and test simple program with qemu (runqemu - which is emulator - a part of toolchain). These toolchains and filesystem images are available to download and ready to use. Refer 'Using Pre-Built Binaries and QEMU' section in this documentation.