Automation of compiling dependency trees for different architectures

Please don't consider my question a rant.
Every package in a typical Linux distro has a dependency tree. Suppose we want a specific package for a specific architecture (amd64, i386, mips, sparc, powerpc); I may need it for practising the assembly languages of these architectures while still using high-level libraries. If the package depends on some libraries, we also need those libraries built for that architecture. My idea is to make a tool that tries to compile the whole dependency tree of libraries (excluding scripts and programs such as Python), starting with building the cross-compiler and libc.
The tool must be aware of different build systems such as Autotools, CMake, and Meson, and it is intended to work on Debian. Source packages will be downloaded with "apt source", "apt download" will be used to assess whether a package is a library, and the one-level dependencies of a package will be listed with "apt-cache show". Dependencies will be installed in non-standard locations (under a common parent directory) and environment variables (such as C_INCLUDE_PATH and LIBRARY_PATH) will be used to tell the toolchain where they are. I plan to implement this in Perl.
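To make this concrete, here is a minimal sketch of these per-package steps in Perl (the example package name, the prefix directory and the helper subroutine names are just placeholders, and error handling is omitted):

    #!/usr/bin/perl
    # Minimal sketch of the per-package steps described above; the example
    # package name, the prefix directory and the helper names are placeholders.
    use strict;
    use warnings;

    my $package = shift // 'zlib1g-dev';        # example package
    my $prefix  = "$ENV{HOME}/crossdeps";       # common parent directory for all dependencies

    # Point the toolchain at the non-standard locations; these variables are
    # inherited by every build command this script runs.
    $ENV{C_INCLUDE_PATH}  = "$prefix/include";
    $ENV{LIBRARY_PATH}    = "$prefix/lib";
    $ENV{PKG_CONFIG_PATH} = "$prefix/lib/pkgconfig";

    # List the one-level dependencies with "apt-cache show".
    my ($depends) = grep { /^Depends:/ } `apt-cache show $package`;
    my @deps;
    if (defined $depends) {
        $depends =~ s/^Depends:\s*//;
        for my $entry (split /,\s*/, $depends) {
            my ($name) = $entry =~ /^([A-Za-z0-9.+-]+)/;   # drop version constraints and alternatives
            push @deps, $name if defined $name;
        }
    }
    print "One-level dependencies of $package: @deps\n";

    # Guess whether a package is a library: fetch its .deb with "apt download"
    # and look for shared objects in the file list.
    sub looks_like_library {
        my ($pkg) = @_;
        system('apt', 'download', $pkg) == 0 or return 0;
        my ($deb) = glob("${pkg}_*.deb");
        return 0 unless defined $deb;
        return scalar grep { m{/lib.*\.so} } `dpkg -c $deb`;
    }

    # Detect the build system of a source tree unpacked by "apt source" and
    # configure it to install under the common prefix.
    sub configure_package {
        my ($srcdir) = @_;
        if (-e "$srcdir/meson.build") {
            return system('meson', 'setup', "$srcdir/_build", $srcdir, "--prefix=$prefix") == 0;
        }
        if (-e "$srcdir/CMakeLists.txt") {
            return system('cmake', '-S', $srcdir, '-B', "$srcdir/_build",
                          "-DCMAKE_INSTALL_PREFIX=$prefix") == 0;
        }
        if (-e "$srcdir/configure") {
            return system("cd '$srcdir' && ./configure --prefix='$prefix'") == 0;
        }
        return 0;   # unknown build system
    }

    for my $dep (@deps) {
        print "$dep looks like a library\n" if looks_like_library($dep);
    }

The real tool would of course recurse over the dependencies, run the actual build and install steps, and record failures.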
I am aware that many packages fail to compile on the first attempts, before we know which options to use and what must be fixed on our system. Since the tool does not know in advance which package it is compiling, it cannot adjust the build accordingly; we can only guess what problems might turn up. There are projects that compile a dependency tree (Proton, for example), but there it is known in advance what is being compiled (FAudio and GStreamer). So I expect the task to be difficult, with successful compilations being only a small fraction of all attempts; if it were easy, something like this would probably already exist. The bigger the dependency tree, the higher the chance of failure.
I am wondering whether to propose this topic as my graduation project.
My questions are: has something like this already been invented or thought of? Is it a difficult task?

Related

Is there an easy way to find what the minimum required version for CMake should be?

Is there a convenient way to find what the minimum compatible CMake version should be (besides testing it with every single version)? I'm looking for a tool that will parse my CMakeLists.txt, find all the features I'm using, look up the CMake version they were added in, and spit out the maximum. A quick look through cmake --help didn't show an option to do this. Is there an external tool that will do this for me?
As of now, there is no such tool, because there are a number of problems with creating one.
Most importantly, CMake has never made any promise of forward compatibility. Listfiles authored with a newer version of CMake have never been guaranteed to work with older versions, regardless of what cmake_minimum_required setting appears. This is due to several factors: new features added, improved logic in Find modules and compiler detection, and so on: basically anything that doesn't break old code but makes newer code more intelligent and robust, even without source changes.
Thus, a tool that only checked for new features (like generator expressions) would miss out on changes to other parts of the overall system.
This means any such tool would have to model CMake so closely that it would be easier to simply automate running an old CMake version and testing the build. If you feel you need to do this, you should automate it yourself.
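If you do decide to automate it, the script can be very small; here is a rough sketch in Perl (the version list and the directory layout for the prebuilt CMake binaries are placeholders, and a real check would also build the project and run its tests):

    #!/usr/bin/perl
    # Rough sketch: try configuring a project with several prebuilt CMake
    # versions (oldest first) and report the first one that succeeds.
    use strict;
    use warnings;
    use File::Temp qw(tempdir);

    my @versions   = qw(3.16.9 3.22.6 3.28.3);   # placeholders, all >= 3.13 so -S/-B work
    my $source_dir = shift // '.';

    for my $version (@versions) {
        my $cmake    = "$ENV{HOME}/cmake-versions/$version/bin/cmake";   # assumed layout
        my $builddir = tempdir(CLEANUP => 1);
        if (system($cmake, '-S', $source_dir, '-B', $builddir) == 0) {
            print "CMake $version configures this project successfully\n";
            last;
        }
        print "CMake $version failed\n";
    }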
Taking a step back, CMake is amazingly, ludicrously easy to upgrade and you can save yourself and others a lot of backwards compatibility headaches like this by simply sidestepping the issue. Use a recent version, declare it as a minimum and encourage your users to upgrade. On Linux, Kitware provides statically linked executables for x86 and arm that require nothing besides libc. I have never heard of these executables not working. I use them on old Raspberry Pis. I have yet to see any remotely valid reason to support versions of CMake older than a year or so.

Is CMake an equivalent of npm?

I am totally new to CMake and compiled languages for that matter. I have seen this question and answer. But I still don't fully understand what CMake is.
I am coming from a Node.js/JavaScript environment, so if I knew the CMake equivalent in the Node.js/JavaScript world, it would really help me understand what it is. So... is CMake an equivalent of npm?
No, citing from Wikipedia:
CMake is a cross-platform free and open-source software tool for managing the build process of software using a compiler-independent method. It supports directory hierarchies and applications that depend on multiple libraries. It is used in conjunction with native build environments such as Make, Qt Creator, Ninja, Apple's Xcode, and Microsoft Visual Studio. It has minimal dependencies, requiring only a C++ compiler on its own build system.
JavaScript is an interpreted language, which means Node.js or the browser reads the code and executes it directly. C, for example, is built by a compiler (which reads and understands the code before execution) into machine code (which does not need to be interpreted, because it is the native language of your processor) and can therefore be executed faster. CMake simplifies calling the compiler and linking libraries (somewhat like setting up require) for all your files, and more. Although, admittedly, running babel, webpack and others via npm run is sometimes also called 'building'.

Why does Bazel's rules_closure download platform-specific binaries instead of sources?

I noticed on the rules_closure repository (used by tensorflow when building it with //tensorflow/tools/pip_package:build_pip_package) that there are rules to build some dependencies like nodejs and protoc through the filegroup_external interface.
What is the reason for not building them from scratch like the other dependencies?
I ask because this approach compromises portability, as it needs to list the binaries for each platform that tries to build tensorflow (and it is even worse when there is no binary ready for your platform).
This build configuration works deterministically, out of the box, with no system dependencies, on recent Linux/Mac/Windows systems with Intel CPUs, and incurs no additional build latency. Our goal has been to optimize for the best build experience for what's in our support matrix. I agree with you that an escape hatch should exist for other systems. Feel free to open an issue with the rules_closure project and CC: #jart so we can discuss further how to solve that.

Autotools vs CMake for both Windows and Linux compilation

I have been looking for pros & cons of Autotools and CMake. But I would like to know opinions from people having used one (or both) of these tools for projects.
I used Autotools very basically a year ago, and I know that one of its good points is that it relies on portable shell scripting, so nothing extra needs to be installed to run the generated configure script. But it looks too Unix-oriented, and it would not be possible to run the configure file on Windows.
I now have to choose a build-system tool for an open-source project that will have to be compiled for at least Linux and Windows. It is written in C++ and uses a Qt GUI front-end; the rest of it is "generic".
Thanks for your help.
Updated 16th of January 2019: Refined advice as tools evolve.
I have used autotools before for a considerable amount of time.
Currently I make intensive use of meson and cmake only when I need it.
Some personal advice:
For big teams, stick to CMake if you want to make use of the generators for Xcode. If you do not need that, I would use Meson directly. Meson, as of version 0.49, also supports finding CMake configuration files (though I have not yet tested how well this works). Also, Visual Studio seems to be sufficiently well supported at this point, though, again, I have not tried it myself. The advantage of CMake is its Visual Studio integration.
Drop Autotools. Meson already covers everything well, and its cross-compilation model is amazingly understandable. In CMake, last time I checked, everything was quite a bit more difficult.
I have also tried SCons, Waf, and tup.
The most full-featured cross-platform system is CMake, but Meson's DSL will be easier to use for people used to Python and similar languages. Meson is also starting to support Visual Studio (there are VS2015 and VS2017 generators, though I have not tried them lately), and some projects, such as GStreamer, already have experimental support for it; GStreamer is built on Windows with Meson as well. As of Meson 0.37.1 the generators needed some work, but they are being improved and the current version is already 0.40.
Meson
Pros:
The DSL does not get in the way at all. In fact, it is very nice and familiar, based on Python.
Well-thought-out cross-compilation support.
The objects are all strongly typed: you cannot easily make string-substitution mistakes, since objects are entities such as 'dependency', 'include directory', etc.
It is very obvious how to add a module for one of your tools.
Cross-compilation seems more straightforward to use.
Really well thought out. The designer and main author of Meson clearly knows what he is talking about when it comes to designing a build system.
Very, very fast, especially in incremental builds.
The documentation is ten times better than what you can find for CMake. Visit http://mesonbuild.com and you will find a tutorial, how-tos and a good reference. It is not perfect, but it is really discoverable.
Cons:
Not as mature as CMake, though I consider it already fully usable for C++.
Not as many modules available, though GNOME, Qt and the common ones are already there.
Project generators: the VS generator does not seem to work that well as of now. CMake's project generators are far more mature.
Has a python3 + ninja dependency.
CMake
Pros:
Generates projects for many different IDEs. This is a very nice feature for teams.
Plays well with Windows tools, unlike Autotools.
Mature, almost a de facto standard.
Microsoft is working on CMake integration for Visual Studio.
Cons:
It does not follow any well known standard or guidelines.
No uninstall target.
The DSL is weird; when you start doing comparisons and the like, and dealing with the strings-vs-lists issue or escape characters, you will make many mistakes, I am pretty sure.
Cross compilation sucks.
Autotools
Pros:
Most powerful system for cross-compilation, IMHO.
The generated scripts don't need anything other than make, a shell and, if you need to build, a compiler.
The command-line is really nice and consistent.
A standard in unix world, lots of docs.
Really powerful command line: changing installation directories, uninstalling, renaming binaries...
If you target unix, packaging sources with this tool is really convenient.
Cons:
It won't play well with Microsoft tools. A real showstopper.
The learning curve is... well... But actually I can say that CMake was not that easy either.
The use of recursive make is pervasive in legacy projects. Automake supports non-recursive builds, but it's not a very widely used approach.
About the learning curve, there are two very good sources to learn from:
The website here
The book here
The first source will get you up and running faster. The book is a more in-depth discussion.
Of SCons, Waf and tup: SCons and tup are more like make, while Waf is more like CMake and the Autotools. I tried Waf instead of CMake at first. I think it is over-engineered in the sense that it has a full OOP API. The scripts didn't look short at all, and the working-directory handling and related things were really confusing to me. In the end, I found that the Autotools and CMake were a better choice. My favourite of these three build systems is tup.
Tup
Pros
Really correct.
Insanely fast. You should try it to believe it.
The scripting language relies on a very easy idea that can be understood in 10 minutes.
Cons
It does not have a full-featured config framework.
I couldn't find a way to make targets such as doc, since they generate files I don't know about in advance, and outputs must be listed before they are generated; at least, that is my conclusion for now. This was a really annoying limitation, if it really is one, since I am not sure.
All in all, the only things I am considering right now for new projects are CMake and Meson. When I have a chance I will also try tup, but it lacks the config framework, which makes things more complex when you need all of that. On the other hand, it is really fast.
I would not recommend autotools for Windows. Use CMake.
Why? Windows doesn't have a native sh.exe, and the emulation is slow. It's also very easy to get configury stuff wrong. I'm not saying it's impossible in CMake, but CMake surely abstracts more away, so you worry about less. CMake documentation can be a bit hard to read, but once it's set up, you should be fine for all toolchains ever supported by CMake. CMake also integrates testing, packaging etc...
Autotools is slow on Windows, does not work easily with MSVC, and has weird quirks on Windows (and other OSes) that are hard to debug and hard to fix. libtool also sucks on Windows, where it often refuses even to build a shared library when you think it should and could. Toolchain relocation issues are also prevalent with libtool, which may look at the wrong files in a user's toolchain. CMake is a lot easier in this regard. It assumes normal things about the target platform and creates generic and good build instructions.
Also, CMake has coloured output :) and nice progress percentages.
PS: I just have some experience with CMake and autotools on Windows as a user. CMake tends to work, autotools tends to bite your ear off when you're not looking, and smile at you when it fails due to some strange error...

Designing a GPL library with weak dependencies on proprietary libs, best approaches?

I'm planning to write a C library which will act as an umbrella "wrapper" around several other libs. Some of the libraries will be GPL and some will be proprietary. Moreover, some of the libraries may not be available at compile time, so I plan to have autotools detect them during configure. I'm also wondering if I should build in support for these weak dependencies and then also detect them at run-time -- particularly for the proprietary libs. Here's why:
Without going into specifics, the library is intended to provide an API for talking to various devices, some of which don't have open-source drivers. Currently it's difficult to program for these devices because there is no standard, easily available API to use. Each vendor provides its own. There are a few other APIs available that attempt to wrap them, but they are, by and large:
C++-only.
Designed for a Windows environment, with *nix as an afterthought.
Fail to build unless you have dependencies in the right places, i.e., complete lack of a proper configure/build system.
Most importantly, designed in such a way that they often link directly to proprietary libs, making me almost 100% sure it would be impossible to get these APIs into Debian.
Therefore my end goal is to build a very simple and straightforward C API that has a chance in hell of making it into distros, so that people can actually write programs for these devices with a simple apt-get.
My question is, how should I best design the library to be GPL-compatible and Debian-friendly, but still be able to call out to proprietary libs when necessary?
Ideally I'd like the user to be able to apt-get a program using this library, and then as long as the vendor's user-level driver is installed to the expected place, everything should work out of the box.
My concern is two-fold:
having dependencies on optional, proprietary libs means the binary distro of the library can't be compiled to dynamically link to these libs, since they may or may not be available.
the user should not have to install dependencies for devices he does not have, open or proprietary.
How do other packages handle this problem of linking to proprietary libs and having run-time weak dependencies? Is dlopen the right way to go for everything? Should I dlopen only the proprietary stuff? What are the reasons why, or the cases in which, Debian might reject such a package?
Lastly, I realize this probably isn't the right forum for this question about Debian policy, so can anyone point me to a better place to ask it?
Thanks.
I have no relationship to Debian and cannot speak about their policies. However, for your framework, this seems a reasonable approach:
Define a simple header file that expresses the functionality you need from these plugins
Create a useful GPL/LGPL/BSD plugin that uses that interface
Have your main program load that using libdl, as you mentioned (if your main program is GPL, you need to have a licence exception to allow linking proprietary plugins)
Submit those for inclusion in Debian, and don't mention the proprietary stuff
The main point is that your plugin system should be useful for free software, and not just be a Trojan horse to allow proprietary code to be loaded.
Using dlopen does not change the fact that you are writing a program to deliberately link proprietary libraries and GPL libraries at the same time; it just shifts the linking from compile time to run time. While the common consensus is that the GPL does not cover dynamic linking at runtime in this way, it is not safe legal advice to rely on that understanding.
The way I would solve the problem is to write a program with a single generic API for plugins (which can use dlopen; the key is that you have not specifically written this program to link to proprietary libraries). The program must be under a free license that is compatible with all the plugins you eventually want it to be used with (i.e. LGPL, or GPL with an exception for that API). Then write separate plugins for the GPL libraries and the proprietary libraries, and distribute them separately.
If only one plugin can be loaded at a time, then there is no legal problem. If it is necessary to allow more than one plugin at once, then you need to be careful to separate your distribution. As the GPL is a distribution license, what the end users do is not a concern.