Why does Bazel's rules_closure download platform-specific binaries instead of sources? - tensorflow

I noticed in the rules_closure repository (used by TensorFlow when building it with //tensorflow/tools/pip_package:build_pip_package) that some dependencies, like nodejs and protoc, are pulled in as prebuilt binaries through the filegroup_external interface.
What is the reason for not building them from source like the other dependencies?
I ask because this approach compromises portability: a binary has to be listed for every platform that might build TensorFlow (and it is even worse when no prebuilt binary exists for your platform).

This build configuration works deterministically, out of the box, with no system dependencies, on recent Linux/Mac/Windows systems with Intel CPUs, and incurs no additional build latency. Our goal has been to optimize for the best build experience for what's in our support matrix. I agree with you that an escape hatch should exist for other systems. Feel free to open an issue with the rules_closure project and CC @jart so we can discuss how to solve that.

Related

Automation of compiling dependency trees for different architectures

Please don't consider my question a rant.
Every package in a typical Linux distro has a dependency tree. Suppose we want a specific package on a specific architecture (amd64, i386, mips, sparc, powerpc); I may need it for learning the assembly languages of these architectures together with high-level libraries. If this package depends on some libraries, we must have those libraries for that architecture as well. My idea is to make a tool that tries to compile the whole dependency tree of libraries (excluding scripts and programs like Python), beginning with compiling the cross-compiler and libc. This tool must be aware of different build systems like Autotools, CMake and Meson. The system is intended to work on Debian: source packages will be downloaded via "apt source", "apt download" will be used to assess whether a package is a library, and the first-level dependencies of a package will be listed using "apt-cache show". Dependencies will be installed in non-standard locations (under a common parent directory) and environment variables (like C_INCLUDE_PATH and LIBRARY_PATH) will be used to tell builds where they are. I am going to implement this in Perl.
I am aware that many package compilations fail on the first attempts, while we do not yet know which options to use and what must be fixed on our system. Since we do not know in advance which package we are compiling, we cannot tailor the compilation; we can only guess at what may turn up. There are projects that compile a dependency tree (like Proton), but there it is known in advance what is being compiled (FAudio and GStreamer). So I think the task may be difficult, and successful compilations may be only a small fraction of the total; if it were easy, such a tool would probably already exist. The bigger the dependency tree, the bigger the chance of failure.
I wonder whether to propose this topic for my graduation project. My questions are: has something like this already been invented or attempted? Is it a difficult task?
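As a rough illustration of the plumbing described above (in Python rather than the Perl the question proposes), here is a minimal sketch; the prefix path is purely hypothetical, and PKG_CONFIG_PATH is not mentioned in the question but most builds also need it:

```python
import os
import subprocess

# Hypothetical common parent directory where every library gets installed.
PREFIX = os.path.expanduser("~/deptree/prefix")

def direct_dependencies(package):
    """List the first-level Depends of a Debian package via `apt-cache show`."""
    out = subprocess.run(["apt-cache", "show", package],
                         capture_output=True, text=True, check=True).stdout
    for line in out.splitlines():
        if line.startswith("Depends:"):
            # Entries look like "libfoo1 (>= 1.2), debconf | debconf-2.0";
            # keep the bare name of the first alternative of each entry.
            return [e.split("|")[0].split()[0]
                    for e in line[len("Depends:"):].split(",") if e.strip()]
    return []

def fetch_source(package, workdir):
    """Download the package's source tree with `apt source` (needs deb-src lines in sources.list)."""
    subprocess.run(["apt", "source", package], cwd=workdir, check=True)

def build_env():
    """Environment variables pointing builds at the non-standard install prefix."""
    env = os.environ.copy()
    env["C_INCLUDE_PATH"] = os.path.join(PREFIX, "include")
    env["LIBRARY_PATH"] = os.path.join(PREFIX, "lib")
    env["PKG_CONFIG_PATH"] = os.path.join(PREFIX, "lib", "pkgconfig")
    return env
```

The hard part the question identifies (driving Autotools, CMake or Meson builds that fail in package-specific ways) is not covered here; this only shows the Debian metadata and environment handling.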

Embedded Linux and Cross Compiling

I'm just now starting to learn embedded Linux system development and I'm wondering how to go about a few things. Mainly, I have questions about cross compiling. I know what cross compiling is, but I'm wondering how to actually go about the whole process when it comes to writing the makefile and deploying the application to the board (mainly the makefile part, though).
I've researched a good amount online and found that a ton of different things have to be set, whether in regard to the toolchain, the processor, etc. Are there any good resources for learning and mastering this topic, or could anyone explain the best way to go about it?
EDIT:
I'm not asking how to cross compile in general. I'm asking about cross compiling already existing applications (e.g. OpenCV, Samba, etc.) for a target system from the host system (especially when the application provides no documentation for the process, which is common).
Basically you just need a specialized embedded Linux build system that takes care of the cross-compilation process. Take a look at Buildroot, for example. In its package folder you'll find package recipe examples.
For your own software's build process you can take a look at CMake. The libuci recipe shows how to use CMake-based projects in Buildroot.
This answer is based on my own experience, so it's up to you to judge whether it suits your needs.
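For cross compiling an existing autotools-based application by hand (outside Buildroot), the usual pattern is to set CC/CXX to the cross compiler and pass --host to configure. A minimal sketch as a Python driver follows; the toolchain prefix and install path are assumptions, not something from the question:

```python
import os
import subprocess

# Hypothetical toolchain prefix and staging prefix; adjust for your board and toolchain.
CROSS = "arm-linux-gnueabihf"
STAGING = os.path.expanduser("~/targetfs/usr")

def cross_build_autotools(srcdir):
    """Configure, build and install an autotools-based package with a cross toolchain."""
    env = os.environ.copy()
    env["CC"] = CROSS + "-gcc"
    env["CXX"] = CROSS + "-g++"
    subprocess.run(["./configure", "--host=" + CROSS, "--prefix=" + STAGING],
                   cwd=srcdir, env=env, check=True)
    subprocess.run(["make"], cwd=srcdir, env=env, check=True)
    subprocess.run(["make", "install"], cwd=srcdir, env=env, check=True)
```

Buildroot package recipes automate exactly these steps (plus dependency handling), which is why the answer above recommends it.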
I learned everything about embedded Linux from these guys: http://free-electrons.com/
They not only offer free docs but also courses on successfully running your box with a custom Linux distro. In my case, I managed to embed uClinux on a board with an MMU-less 32-bit CPU and 32 MB of RAM. The Linux image occupied just 1 MB.

Language for cross-platform install script

In writing an install script, I quickly found that I'd have cross-platform issues, and bash scripts are hard to maintain. I decided to look for a cleaner solution that's more cross-platform.
The goal is to have an intelligent script sniff out components of the user's system and have as little user interaction as possible. That being stated, I thought about these languages:
Python - cross-platform, and many other programs rely on it, so it may already be present
JavaScript - Node.js is required by part of my application, but it's a little clunky for exec calls
Are there any languages that would be a better fit for this application?
Requirements:
Available on all platforms
May be distributed as part of my application if small enough
Little to no version variation, so Ruby is out
*nix only for now, but eventually will be run on Windows
Maintainable
Clear syntax (Perl is out)
Modular (if I sniff the OS, I can include separate OS-specific code)
Capable of downloading files (unmet dependencies)
Capable of relatively complex scripting tasks
Testing for used HTTP ports
Reading and parsing files for configuration data
Checking for permissions and handling directories with insufficient privileges
Open source
Python can do all of those things:
Available on all platforms (Mac, Linux, Windows, and more)
May be distributed as part of my application if small enough (You can make binaries with cx_freeze, if needed)
Little to no version variation, so Ruby is out (Python is pretty static when it comes to version changes)
*nix only for now, but eventually will be run on Windows (It comes pre-installed on Mac, and ships with just about any Linux distro. Binaries don't need the interpreter to run)
Maintainable
Clear syntax (Perl is out) (Python is very easy to read, but that's up to you to decide)
Modular (if I sniff the OS, I can include separate OS-specific code) (Modules are just files in Python)
Capable of downloading files (unmet dependencies) (urllib2 takes care of that, and it's in the standard library; in Python 3 it's urllib.request)
Open source (Yep)
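To make those claims concrete, here is a minimal sketch of the sniffing tasks from the requirements list (OS detection, HTTP-port test, download, permission check), written for Python 3 with urllib.request; the function names are illustrative, not from any particular library:

```python
import os
import platform
import socket
import urllib.request

def detect_os():
    """Sniff the host OS so OS-specific modules can be imported."""
    return platform.system()  # 'Linux', 'Darwin', 'Windows', ...

def port_in_use(port, host="127.0.0.1"):
    """Check whether a TCP port (e.g. an HTTP port) is already bound."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1)
        return sock.connect_ex((host, port)) == 0

def download(url, dest):
    """Fetch an unmet dependency over HTTP(S)."""
    urllib.request.urlretrieve(url, dest)

def can_write(path):
    """Check for sufficient privileges before touching an install directory."""
    return os.access(path, os.W_OK)

if __name__ == "__main__":
    print(detect_os(), port_in_use(80), can_write("/usr/local"))
```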
Ant will do what you need. It is OS-independent and handles both compiling and installing.

Autotools vs CMake for both Windows and Linux compilation

I have been looking for pros & cons of Autotools and CMake. But I would like to know opinions from people having used one (or both) of these tools for projects.
I used Autotools very basically a year ago, and I know that one of its good points is that it relies on shell scripting: it does not need to be installed to run and uses portable shell scripts. But it looks too Unix-oriented, and it would not be possible to run the configure script on Windows.
I now have to choose a build system for an open source project that will have to be compiled for at least Linux and Windows. It is written in C++ and uses a Qt GUI front-end; the rest of it is "generic".
Thanks for your help.
Updated 16th of January 2019: Refined advice as tools evolve.
I used Autotools for a considerable amount of time in the past.
Currently I make intensive use of Meson, and of CMake only when I need it.
Some personal advice:
For big teams, stick to CMake if you want to make use of the generators for Xcode. If you do not need that, I would use Meson directly. Meson, as of version 0.49, also supports finding CMake configuration files (though I have not yet tested how well this works). Also, Visual Studio seems to be sufficiently well supported at this point in time, though, again, I have not tried it myself. The advantage of CMake is its Visual Studio integration.
Drop Autotools. Meson already covers everything well, and its cross-compilation model is remarkably understandable. In CMake, last time I checked, everything was considerably more difficult.
I have also tried SCons, Waf, and Tup.
The most full-featured, cross-platform system is CMake, but Meson's DSL will be easier to use for people used to Python and similar languages. Meson is also starting to support Visual Studio (a VS2015 generator), and some projects, such as GStreamer, already have experimental support for it; GStreamer is compiled on Windows with Meson as well. Right now there are VS2015 and VS2017 generators, but I have not tried them myself lately. As of Meson 0.37.1 they needed some work, but they are being improved and the current version is already 0.40.
Meson
Pros:
The DSL does not get in the way at all. In fact, it is very nice and familiar, based on Python.
Well-thought-out cross-compilation support.
The objects are all strongly typed: you cannot easily make string-substitution mistakes, since objects are entities such as 'dependency', 'include directory', etc.
It is very obvious how to add a module for one of your tools.
Cross-compilation seems more straightforward to use.
Really well thought out. The designer and main author of Meson knows very well what he is talking about when it comes to designing a build system.
Very, very fast, especially in incremental builds.
The documentation is ten times better than what you can find for CMake. Visit http://mesonbuild.com and you will find a tutorial, how-tos, and a good reference. It is not perfect, but it is really discoverable.
Cons:
Not as mature as CMake, though I consider it already fully usable for C++.
Not as many modules available, though GNOME, Qt, and the common ones are already there.
Project generators: the VS generator does not seem to work that well as of now. CMake's project generators are far more mature.
Has a Python 3 + Ninja dependency.
CMake
Pros:
Generates projects for many different IDEs. This is a very nice feature for teams.
Plays well with Windows tools, unlike Autotools.
Mature; almost a de facto standard.
Microsoft is working on CMake integration for Visual Studio.
Cons:
It does not follow any well known standard or guidelines.
No uninstall target.
The DSL is weird: when you start to do comparisons and the like, and run into the strings-vs-lists issue or escape characters, you will make many mistakes, I am pretty sure.
Cross compilation sucks.
Autotools
Pros:
Most powerful system for cross-compilation, IMHO.
The generated scripts don't need anything other than make, a shell and, if you need to build something, a compiler.
The command-line is really nice and consistent.
A standard in unix world, lots of docs.
A really powerful command line: changing installation directories, uninstalling, renaming binaries...
If you target unix, packaging sources with this tool is really convenient.
Cons:
It won't play well with Microsoft tools. A real showstopper.
The learning curve is... well... But actually I can say that CMake was not that easy either.
The use of recursive make is pervasive in legacy projects. Automake supports non-recursive builds, but it's not a very widely used approach.
About the learning curve, there are two very good sources to learn from:
The website here
The book here
The first source will get you up and running faster. The book is a more in-depth discussion.
Of SCons, Waf, and Tup: SCons and Tup are more like make, while Waf is more like CMake and the Autotools. I tried Waf instead of CMake at first. I think it is over-engineered in the sense that it has a full OOP API. The scripts didn't look short at all, and the working-directory handling and related things were really confusing to me. In the end, I found that Autotools and CMake were a better choice. My favourite of these three build systems is Tup.
Tup
Pros
Really correct.
Insanely fast. You should try it to believe it.
The scripting language relies on a very easy idea that can be understood in 10 minutes.
Cons
It does not have a full-featured config framework.
I couldn't find a way to make targets such as doc, since they generate files I don't know about in advance, and outputs must be listed before they are generated; at least, that's my conclusion for now. If that is really the case, it is a really annoying limitation, but I am not sure.
All in all, the only things I am considering right now for new projects are CMake and Meson. When I have a chance I will also try Tup, but it lacks the config framework, which makes things more complex when you need all of that. On the other hand, it is really fast.
I would not recommend autotools for Windows. Use CMake.
Why? Windows doesn't have a native sh.exe, and the emulation is slow. It's also very easy to get configury stuff wrong. I'm not saying it's impossible in CMake, but CMake surely abstracts more away, so you worry about less. CMake documentation can be a bit hard to read, but once it's set up, you should be fine for all toolchains ever supported by CMake. CMake also integrates testing, packaging etc...
Autotools is slow on Windows, does not work easily with MSVC, and has weird quirks on Windows (and other OSes) that are hard to debug and hard to fix. libtool also sucks on Windows, where it often refuses to even build a shared library when you think it should and could. Toolchain relocation issues are also prevalent with libtool, which may look at the wrong files in a user's toolchain. CMake is a lot easier in this regard. It assumes normal things about the target platform and creates generic and good build instructions.
Also, CMake has coloured output :) and nice progress percentages.
PS: I just have some experience with CMake and autotools on Windows as a user. CMake tends to work, autotools tends to bite your ear off when you're not looking, and smile at you when it fails due to some strange error...

Designing a GPL library with weak dependencies on proprietary libs, best approaches?

I'm planning to write a C library which will act as an umbrella "wrapper" around several other libs. Some of the libraries will be GPL and some will be proprietary. Moreover, some of the libraries may not be available at compile time, so I plan to have autotools detect them during configure. I'm also wondering if I should build in support for these weak dependencies and then also detect them at run-time -- particularly for the proprietary libs. Here's why:
Without going into specifics, the library is intended to provide an API for talking to various devices, some of which don't have open source drivers. Currently it's difficult to program for these devices because there is no standard, easily available API to use; each vendor provides its own. There are a few other APIs available that attempt to wrap them, but they are by and large:
C++-only.
Designed for a Windows environment, with *nix as an afterthought.
Fail to build unless you have dependencies in the right places, i.e., complete lack of a proper configure/build system.
Most importantly, designed in such a way that they often link directly to proprietary libs, making me almost 100% sure it would be impossible to get these APIs into Debian.
Therefore my end goal is to build a very simple and straightforward C API that has a chance in hell of making it into distros, so that people can actually write programs for these devices with a simple apt-get.
My question is, how should I best design the library to be GPL-compatible and Debian-friendly, but still be able to call out to proprietary libs when necessary?
Ideally I'd like the user to be able to apt-get a program using this library, and then as long as the vendor's user-level driver is installed to the expected place, everything should work out of the box.
My concern is two-fold:
having dependencies on optional, proprietary libs means the binary distro of the library can't be compiled to dynamically link to these libs, since they may or may not be available.
the user should not have to install dependencies for devices he does not have, open or proprietary.
How do other packages handle this problem of linking to proprietary libs and having run-time weak dependencies? Is dlopen the right way to go for everything? Should I dlopen only the proprietary stuff? What are reasons why or cases when Debian might reject such a package?
Lastly, I realize this probably isn't the right forum for this question about Debian policy, so can anyone point me to a better place to ask it?
Thanks.
I have no relationship to Debian and cannot speak about their policies. However, for your framework, this seems a reasonable approach:
Define a simple header file that expresses the functionality you need from these plugins
Create a useful GPL/LGPL/BSD plugin that uses that interface
Have your main program load that using libdl, as you mentioned (if your main program is GPL, you need to have a licence exception to allow linking proprietary plugins)
Submit those for inclusion in Debian, and don't mention the proprietary stuff
The main point is that your plugin system should be useful for free software, and not just be a Trojan horse to allow proprietary code to be loaded.
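The question is about a C library, but as a quick illustration of the run-time weak-dependency idea, here is a Python sketch using ctypes (which wraps dlopen on POSIX); the vendor library name is hypothetical:

```python
import ctypes
import ctypes.util

def load_optional_driver(name):
    """Try to dlopen an optional (possibly proprietary) driver library.

    Returns None when the library is not installed, so the rest of the
    program keeps working without that device's support.
    """
    path = ctypes.util.find_library(name)
    if path is None:
        return None
    try:
        return ctypes.CDLL(path)  # dlopen() under the hood on POSIX
    except OSError:
        return None

# Hypothetical vendor driver; absent on most systems, which is the point.
driver = load_optional_driver("acme_device")
if driver is None:
    print("acme_device driver not found; that device's support is disabled")
```

In the C library itself the equivalent would be dlopen/dlsym with a NULL check, so a missing vendor driver simply disables that backend instead of breaking the whole library.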
Using dlopen does not change the fact that you are writing a program to deliberately link to proprietary libraries and GPL libraries at the same time; it just shifts the linking from compile time to run time. While the common consensus among the masses is that the GPL does not cover dynamic linking at runtime in this way, it is not safe legal advice to rely on such a common understanding.
The way I would solve the problem is to write a program with a single generic API for plugins (which can use dlopen, but the key is that you have not specifically written this program to link to proprietary libraries). The program must be under a free license that is compatible with all the plugins you eventually want it to be used with (i.e. LGPL, or GPL with an exception for that API). Then write separate plugins for the GPL libraries and the proprietary libraries, and distribute them separately. If only one plugin can be loaded at a time, then there is no legal problem. If it is necessary to allow more than one plugin at once, then you need to be careful to separate your distribution. As the GPL is a distribution license, what end users do is not a concern.