Why are the source file names not human readable? - raku

I installed Perl 6 with rakudobrew and wanted to browse the installed files, but found only a list of hex-named files in ~/.rakudobrew/moar-2018.08/install/share/perl6/site/sources as well as ~/.rakudobrew/moar-2018.08/install/share/perl6/sources/.
E.g.
> ls ~/.rakudobrew/moar-2018.08/install/share/perl6/sources/
09A0291155A88760B69483D7F27D1FBD8A131A35 AAC61C0EC6F88780427830443A057030CAA33846
24DD121B5B4774C04A7084827BFAD92199756E03 C57EBB9F7A3922A4DA48EE8FCF34A4DC55942942
2ACCA56EF5582D3ED623105F00BD76D7449263F7 C712FE6969F786C9380D643DF17E85D06868219E
51E302443A2C8FF185ABC10CA1E5520EFEE885A1 FBA542C3C62C08EB82C1F4D25BE7B4696F41B923
522BE83A1D821D8844E8579B32BA04966BAB7B87 FE7156F9200E802D3DB8FA628CF91AD6B020539B
5DD1D8B49C838828E13504545C427D3D157E56EC
The files contain the sources of packages, but this does not feel very accessible. What is the rationale for that?

In Perl 6, the mechanism for loading modules and caching their compilations is pluggable. Rakudo Perl 6 comes with two main mechanisms for this.
One is a file-system based repository, and it's used with things like -Ilib. This resolves modules simply using paths on disk. Whenever a module is loaded, it first has to check that the module's sources have not changed, in order to re-compile them if they have. This is ideal for development, but such checks take time. Furthermore, this doesn't allow for having multiple versions of the same module available and picking the one matching the specification in the use statement. Again, ideal for development, when you just want it to use your latest changes, but less so for installation of modules from the ecosystem.
The other is an installation repository. Here, specific versions of modules are installed and precompiled. It is expected that all interactions with such a repository will be done through the API or tools using the API (for example, zef locate Some::Module). It's assumed that once a specific version of a module has been installed, it is immutable. Thus, no checks need to be done against the source, and loading can go straight to the compiled version of the module.
Thus, the installation repository is not intended for direct human consumption. The SHA-1s are primarily an implementation convenience; an alternative scheme could have been used in return for a bit more effort (and may well be used in the future). However, the SHA-1s do also create the appearance of something that wasn't intended for direct manipulation - which is indeed the case: editing a source file in there will have no immediate effect, and will probably have confusing effects the next time the compiler is upgraded to a new version.

Related

When should I use find_package

I am learning CMake, and I find it hard to understand when I should use find_package.
For separate compilation, we need to let the compiler know where to find the header files, and this can be done with target_include_directories. For linking, we need to let the linker know where the implementation is, and this can be done with target_link_libraries. It seems like that is all we need to do to compile a project. Could anyone explain why and when we should use find_package?
If a package you intend to use allows for the use of find_package, you should use it. If a package comes with a working configuration script, it'll encourage you to use the library the way it's intended to be used, and it will likely come with a simple way to add the required include directories and dependencies.
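As a minimal sketch (assuming a hypothetical package Foo that ships a config script and exports an imported target Foo::Foo), the consuming project then boils down to:
find_package(Foo REQUIRED)                     # finds FooConfig.cmake or a FindFoo.cmake module
add_executable(app main.cpp)
target_link_libraries(app PRIVATE Foo::Foo)    # pulls in include dirs and link dependencies
The imported target carries its usage requirements, so no separate target_include_directories call is needed for the library's headers.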
When is it possible to use find_package?
There needs to be either a configuration script (<PackageName>Config.cmake or packagename-config.cmake) that gets installed with the package, or a find script (Find<PackageName>.cmake). The latter in some cases even comes with the CMake installation instead of with the installed package; see CMake's find modules.
Should you create missing scripts yourself?
There are several benefits in creating a package configuration script yourself, even if a package doesn't come with an existing configuration or find script:
The scripts separate the information about libraries from the logic used to create your own targets. The use of the two commands find_package and target_link_libraries is concise, and any logic you may need to collect and apply information like dependencies, include directories, minimal versions of the C++ standard to use, etc. would probably take up much more space in your CMakeLists.txt files, thus making them harder to understand.
It makes the library used easy to replace. Basically, all it takes to go with a different version of the same package is to modify CMAKE_PREFIX_PATH, CMAKE_MODULE_PATH or the package-specific <PackageName>_ROOT variables. If you ever want to try out different versions of the same library, this is incredibly useful.
The logic is reusable. If you need to use the same functionality in a different project, it takes little effort to reuse the same logic. Even if a library is only used within a single project, but in multiple places, the use of find_package can help keep the logic for "importing" a lib close to its use (see also the first bullet point).
There can be multiple versions of the same library, with automatic selection of applicable ones. Note that this requires the use of a version file, but this file allows you to specify whether a version of the package is suitable for the current project. This also allows for checking the target architecture, etc., which is helpful when cross compiling or when providing both 32 and 64 bit versions of a library on Windows: if a version file indicates a mismatch, the search for a suitable version simply continues with different paths instead of failing fatally on the first mismatch. (A sketch of such a version file follows below.)
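As an illustration, a hand-written version file could look roughly like the following. This is only a sketch: the package name Foo, the version number and the 64-bit check are assumptions, not taken from any real package.
Contents of a hypothetical FooConfigVersion.cmake:
set(PACKAGE_VERSION 1.2.0)
if(PACKAGE_FIND_VERSION VERSION_GREATER PACKAGE_VERSION)
  set(PACKAGE_VERSION_COMPATIBLE FALSE)        # requested version is newer than what this install provides
else()
  set(PACKAGE_VERSION_COMPATIBLE TRUE)
endif()
if(NOT CMAKE_SIZEOF_VOID_P EQUAL 8)
  set(PACKAGE_VERSION_UNSUITABLE TRUE)         # reject e.g. a 32-bit consumer of a 64-bit build
endif()
When find_package considers this installation and the version file sets PACKAGE_VERSION_UNSUITABLE, the search simply moves on to the other candidate paths.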
You will probably find CMake's guide on using dependencies helpful. It describes find_package and alternatives, and when each one is relevant / useful. Here's an excerpt from the section on find_package (italics added):
A package needed by the project may already be built and available at some location on the user's system. That package might have also been built by CMake, or it could have used a different build system entirely. It might even just be a collection of files that didn't need to be built at all. CMake provides the find_package() command for these scenarios. It searches well-known locations, along with additional hints and paths provided by the project or user. It also supports package components and packages being optional. Result variables are provided to allow the project to customize its own behavior according to whether the package or specific components were found.
find_package requires that the package provide CMake support in the form of specific files that describe the package's contents to CMake. Some library authors provide this support (the most desirable scenario for you, the package consumer), some don't but are prominent enough that CMake itself comes with such files for those packages, or, in the worst case, there is no CMake support at all, in which case you can either do something to get either of the previous good outcomes, or perform some kludges to get the job done (i.e. define the targets yourself in your project's CMake config).
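That last resort might look something like the following (a rough sketch with a hypothetical library foo; the variable and target names are made up):
find_path(FOO_INCLUDE_DIR foo.h)               # locate the header
find_library(FOO_LIBRARY foo)                  # locate the library file
add_library(Foo::Foo UNKNOWN IMPORTED)
set_target_properties(Foo::Foo PROPERTIES
  IMPORTED_LOCATION "${FOO_LIBRARY}"
  INTERFACE_INCLUDE_DIRECTORIES "${FOO_INCLUDE_DIR}")
After this, Foo::Foo can be consumed with target_link_libraries just like a target coming from a proper config script.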

Force CMake to install targets to architecture-specific directories?

I'm currently having this issue with the Google Protobuf Library, but it is a recurring problem and will likely occur with many if not all 3rd-party packages that I want to build and install from source.
I'm developing for Windows, and we need to be able to generate both 32-bit and 64-bit versions of our DLLs. It was relatively straightforward to get CMake to install our own modules to architecture-specific subdirectories, e.g. D:\libraries\bin\i686 and D:\libraries\lib\i686 (and similarly for the x86_64 versions). But I'm having trouble achieving the same thing with 3rd-party libraries such as Protobuf.
I could, of course, use distinct CMAKE_INSTALL_PREFIX and CMAKE_PREFIX_PATH combinations (e.g. D:\libraries-i686 and D:\libraries-x86_64), and I will probably end up doing just that, but it bothers me that there doesn't seem to be a better alternative. The docs for find_package() clearly show that the search procedure does attempt architecture-specific search paths, so why do the CMake files of popular libraries not generally seem to support installing to architecture-specific subdirectories?
Or could it be that it is just a matter of setting the right CMAKE_XXX variable?
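For concreteness, the fallback with distinct prefixes would amount to something like this (the exact paths and placeholders are only illustrative):
cmake -DCMAKE_INSTALL_PREFIX=D:/libraries-i686 <path-to-protobuf-source>      (build and install the 32-bit libraries)
cmake -DCMAKE_PREFIX_PATH=D:/libraries-i686 <path-to-my-project>              (configure a 32-bit consumer)
with the same pair using D:/libraries-x86_64 for the 64-bit builds.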
Thanks to @arrowd for pointing me in the right direction, I now have my answer, though it is not exactly what I had hoped for.
CMAKE_LIBRARY_OUTPUT_DIRECTORY and CMAKE_RUNTIME_OUTPUT_DIRECTORY, however, specify the build output directories, not the install directories. As it turns out though, there are variables for the install directories too, called CMAKE_INSTALL_BINDIR and CMAKE_INSTALL_LIBDIR - they are actually plainly visible (along with plenty more) in the cmake-gui interface when "Advanced" is checked.
I tried setting those two manually (to bin\i686 and lib\i686), and it works: the Protobuf INSTALL target copies the files where I wanted to have them, i.e. where the CMake script of my consumer project will find them in an architecture-safe manner.
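Concretely, the configure step then looks roughly like this (a sketch; the prefix and source path are illustrative):
cmake -DCMAKE_INSTALL_PREFIX=D:/libraries -DCMAKE_INSTALL_BINDIR=bin/i686 -DCMAKE_INSTALL_LIBDIR=lib/i686 <path-to-protobuf-source>
Since the two directories are given as relative paths, they end up under CMAKE_INSTALL_PREFIX, which is what produces the architecture-specific layout.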
I'm not sure how I feel about this - I would have preferred something like a CMAKE_INSTALL_ARCHITECTURE or CMAKE_ARCHITECTURE_SUBDIR variable that CMake would automatically append to relevant install paths. The solution above requires overriding defaults that I would prefer to leave untouched.
Under the circumstances, my fallback approach might still be the better option. That approach however requires that the choice of architecture be made very early on, typically when running the script that initializes the CMake-specific environment variables that will be passed to cmake when configuring build directories. And it's worse when using cmake-gui, which requires the user to set all directories manually.
In the end, I'm still undecided.

How do you make it so that cpack doesn't add required libraries to an RPM?

I'm trying to convert our build system at work over to cmake and have run into an interesting problem with the RPMs that it generates (via cpack): It automatically adds all of the dependencies that it thinks your RPM has to its list of required libraries.
In general, that's great, but in my case, it's catastrophic. Unfortunately, the development packages that we build end up getting installed with one of our home-grown tools that uses rpm to install them into a separate RPM database from the system one. It's stupid, but I can't change it. What this means is that all of the system libraries that any normal library will rely on (like libc or libpthread) aren't in the RPM database that is being used with our development packages. So, if an RPM for one of our development packages lists system libraries as being required, then we can't install it, as rpm will think that they're not installed (since they're listed in the normal database rather than the one that it's being told to use when installing our packages). Our current build stuff handles this just fine, because it doesn't list any system libraries as dependencies in the RPMs, but cpack automatically populates the RPM's list of required libraries and puts the system libraries in there. I need a way to stop it from doing so.
I tried setting CPACK_RPM_PACKAGE_REQUIRES to "", but that has no effect. The RPM cpack generates still ends up with the system libraries listed as being required. All I can think of doing at this point is to copy the RPM cpack generator and hack it up to do what I want and use that instead of the standard one, but I'd prefer to avoid that. Does anyone have any idea how I could get cpack to stop populating the RPM with required libraries?
See bottom of
http://www.rpm.org/max-rpm/s1-rpm-depend-auto-depend.html
The autoreqprov Tag — Disable Automatic Dependency Processing
There may be times when RPM's automatic dependency processing is not desired. In these cases, the autoreqprov tag may be used to disable it. This tag takes a yes/no or 0/1 value. For example, to disable automatic dependency processing, the following line may be used:
AutoReqProv: no
EDIT:
In order to set this in CMake, you need to do set(CPACK_RPM_PACKAGE_AUTOREQPROV " no"). The extra space seems to be required in front of (or behind) the "no" in order for it to work. It seems that the RPM module for cpack has a bug which won't let you set some of its variables to anything shorter than 3 characters long.
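For context, in a CMakeLists.txt this sits alongside the usual CPack setup, roughly like this (the package name is made up):
set(CPACK_PACKAGE_NAME "mypackage")
set(CPACK_GENERATOR "RPM")
set(CPACK_RPM_PACKAGE_AUTOREQPROV " no")   # note the leading-space workaround described above
include(CPack)
The variable has to be set before include(CPack) for it to take effect.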
To add to Mark Lakata's answer above, there's a snapshot of the "Maximum RPM" doc
http://www.rpm.org/max-rpm-snapshot/s1-rpm-depend-auto-depend.html
that also adds:
The autoreq and autoprov tags can be used to disable automatic processing of requirements or "provides" only, respectively.
And at least with my version of CPackRPM, there seem to be similar variables you can set, e.g.
set(CPACK_RPM_PACKAGE_AUTOREQ " no")
to only disable the automatic dependency processing of 'Requires'.

CMake: build library used by multiple projects

I have a directory containing several tools which I use for independent projects, e.g.:
CommonTools
+ Tool A
+ Tool B
+ Tool C
Tool B depends on Tool A, but Tool A can be used independently from Tool B. I think I have two options:
I can install the tools under a system directory (e.g. for Windows, C:\Program Files). This is not necessarily a good thing, given that some of my programs are meant to be used in the same directory as the one they are shipped in (because I don't have sufficient rights to write to a system directory). Besides, I still need to locate the header files to compile projects that use those tools.
I could use find_library to locate them. Then I run into the following problem: find_library(A) won't work until I've actually built A, so I can't cmake CommonTools (because Tool B requires Tool A). I could call cmake from make, but that looks rather convoluted...
I can put relative paths to Tool A in Tool B & only use find_library for other projects. Unfortunately, this relative path changes depending on whether I'm building CommonTools or Tool B.
What are your thoughts on this? Thanks!
As I wanted to be able to perform one-step builds, this is what I ended up doing.
I distinguish the submodules of the module I'm currently building from external dependencies & third-party tools. Each (sub)module is only responsible for building itself. This means that all external dependencies & third-party tools must be already installed or available in binary + header form from a server. As a corollary, it means that a missing dependency is a binary which should be available from a given server but isn't.
Submodules are added using add_subdirectory, which means that if any of them is not available, the configuration step will fail with an explicit message.
External dependencies & third-party tools are located using find_package. The HINT location is an option which must be provided by the user performing the build (this gives an indication of the module's dependencies to the user. If any of them is not found, a binary is downloaded from a given location using ExternalProject_Add. The <module>_FOUND, <module>_LIBRARIES & <module>_INCLUDE_DIRS variables must be set manually in the CMakeLists.txt file, but given a proper directory layout on the server side (e.g. <module>-<version>-<platform>/include & <module>-<version>-<platform>/binaries), it can be done in a consistent way (e.g. using a macro). There again, if no binaries are found on the server, the configuration step will fail with an explicit message.
All of this means that the continuous integration server will correctly detect any missing dependencies (i.e. components which should be on the server but aren't or submodules which are not under version control) at configuration time rather than at build time, while still allowing one-step builds.
I hope this can be of some use to others.
PS: as a side-note to Google Test users: "gtest must be recompiled for each module because every user needs to compile his tests using the same compiler flags used to compile the installed Google Test libraries; otherwise he may run into undefined behaviors (if you compile Google Test and your test code using different compiler flags, they may see different definitions of the same class/function/variable)". This means you actually need (in my case) to run an ExternalProject_Add command in every module, because each module contains its own tests.

How to use Ivy/Ant to build using intermediate artifacts

I am trying to revise my build process to use ant with apache ivy for my personal projects. These consist of a few shared modules, and a few application modules that depend on the shared modules. For the sake of this post, let's simplify and say I have a shared module (common), and an application module (application) which depends on common. Each module has its own effective svn repository:
svn_repo_1/common/trunk
/branches
/tags
svn_repo_2/application/trunk
/branches
/tags
I check out the relevant revision into a common workspace, in a flat structure:
workspace/common
workspace/application
In general, application will depend on a published version of common, so there will be no need to build common when building application.
However, when I need to add new functionality to common that is required by application, I would then like application to depend on the latest common build from my workspace (without needing to publish common to my repository).
I assumed this is what latest.integration meant (i.e. changing application's ivy.xml to specify latest.integration for the common revision). My intention was to use the ivy buildlist task to find the local modules that needed to be built before application could be built. This does not work however, because the buildlist task seems to include the common/build.xml entry regardless of whether application's ivy.xml file specifies latest.integration or some other published revision.
I would appreciate any suggestions. I am struggling with ivy's documentation and samples, so any real-world examples would also be helpful. Note: I am not interested in a Maven solution here.
Wow, this is truly deja vu! Go back to some of my first questions on this site from 3 - 4 months ago and they're almost all Ivy-related! I empathize with you 100% that Ivy is a difficult beast to learn and tame, but after using it professionally for a few months now, I'll never develop without it again. So my first piece of advice: keep going. Sooner or later, what little (practical) documentation you find on Apache Ivy will all start to make sense and fall into place.
I can understand there may be extenuating reasons for why you don't want to publish your common to your repo. However, if you are a newcomer to transitive dependency management, the first piece of practical advice I can give you is that you should always publish your JARs/WARs/whatever to your repo, not to an intermediary "integration" area local to your workspace.
The reason for this is simple: Ivy only has the ability to crawl the repositories you define in your settings file (basically). If you deliberately keep a JAR like common outside of one of these defined repositories, then: (a) Ivy has no way to resolve transitive dependencies (its primary job), and (b) "downstream" (dependent) JARs fail to be dynamically updated every time you tweak common. Thus, using Ivy but not publishing your JARs is a bit counter-productive; I'm surprised Ivy even includes it as a feature.
I guess I would need to understand your motivation for not publishing common. If you're simply having problems getting the ivy:publish task to work, no worries - I can provide plenty of examples to help get you started. But if there are some other reasons, then I ask you to consider this solution: set up multiple repositories.
Perhaps you have one "primary" repository where mostly everything gets published; and then you have a "secondary" or "intermediary" repository where you publish common to whenever it makes sense (for you) to do that. You can then configure your Ant build with two different publish tasks, such as publish-main and publish-integration.
That way you get the best of both worlds: you get your intermediary staging area, and you get to keep everything inside of Ivy's powerful control.