Strange CMake shared lib linking issue on Linux - objective-c

I'm seeing a weird linking issue on Linux with a cross-platform library project that uses CMake to create both an OS X Framework and a Linux shared library from the same source tree. The cross-platform aspect of that project has worked well in the past (up to about two years ago), but since then, we have exclusively done development work on OS X. The reason for the temporary Linux abandonment was a developer shortage: all those who remained used OS X - there was no technical reason for not building the source on Linux for some years.
And with one potentially relevant exception (more on this later), there have been no fundamental changes to our source in the meantime. But of course Linux has advanced, so there were some minor snags at first when we went back there: things like the new compiler versions complaining about things they had not complained about in the past (questionable casts, void pointer voodoo, and such). These issues were resolved in short order.
The entire source tree now compiles again on Mint 17.1 with some definitely harmless remaining warnings. But linking fails with a rather bizarre message:
Linking CXX shared library lib<ourLibName>.so
CMakeFiles/<file1>.c.o:1:1: error: stray '\177' in program
CMakeFiles/<file1>.c.o:1:1: error: stray '\2' in program
CMakeFiles/<file2>.c.o:1:1: error: stray '\213' in program
(and so on, thousands of times, with seemingly random values in the quotes
for all the object files in the library)
To me, this looks like the linker is accidentally trying to compile the object files one more time, instead of linking them. Switching between gcc and clang made no difference.
As I already said, there was one potentially relevant structural change to the project since it last compiled under Linux: it used to be a combination of only C and Objective-C sources. It now contains C, Objective-C as well as Objective-C++ source. On OS X, this change has not caused any issues whatsoever, and it is very hard for me to imagine that this addition of some .mm files is causing what we are seeing here. But still - weirder things have happened.
Also, there is a popular issue with several articles on stackoverflow about erroneously including unicode characters in C/C++ programs. This is not the problem here - no such messages appear during the actual compilation. The circus only starts once linking should happen.
The source tree is far too large to post, and the CMake files are also fairly involved, nested, and large (i.e. impossible to include here). To add insult to injury, they have worked fine in the past, on Ubuntu 10.10. Which I don't have around anymore, to test if the current tree still works there (that would have been far too easy, I guess). The relevant commands in the CMakeList that generate the library under Linux are
set_target_properties(
<ourLibName>
PROPERTIES
VERSION 2.0
SOVERSION 2
)
target_link_libraries(
<ourLibName>
${our_other_link_libraries}
)
install (
TARGETS
<ourLibName>
DESTINATION
lib
)
which still looks o.k. to me at first glance. How do I proceed here? I'm out of ideas on what to try next.
P.S. versions of the software involved: Cmake 2.8.11, gcc 4.8.2, clang 3.4-1ubuntu3.

Turns out the root of the problem was simple: one of the project devs who is no longer with us had apparently made an attempt to build the Objective-C++ version of the source on Linux, a year or two ago. From his abortive attempt, there was a sensible-looking leftover in the Linux-only part of the CMake compiler flags:
else ( APPLE )
set( CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -x objective-c++" )
endif ( APPLE )
The kicker is of course the -x objective-c++ flag. That does not cause any harm during compilation, except that you get tons of unnecessary warnings. But as these flags are also passed to the linker, it forces the poor thing into treating all the object files as ObjC++ input. That flag should never have been there: CMake is smart enough to handle a mixture of C, ObjC and ObjC++ right out of the box. Once the flag is removed, everything works as expected.
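A minimal sketch of how such a flag could be kept away from the link step, should a per-file language override ever be needed. The COMPILE_FLAGS source-file property applies only when that file is compiled, whereas CMAKE_CXX_FLAGS also ends up on the link command line. The file name src/Bridge.mm is hypothetical, and as noted above, this particular project needed no override at all.

```cmake
# Hypothetical list of Objective-C++ sources; the real project's file names differ.
set(objcxx_sources src/Bridge.mm)

# Do NOT append "-x objective-c++" to CMAKE_CXX_FLAGS: those flags are reused
# when the compiler driver is invoked to link, so every object file would be
# re-read as Objective-C++ source code.
# If an override were ever needed, scope it to the compile step of specific files:
set_source_files_properties(${objcxx_sources}
    PROPERTIES COMPILE_FLAGS "-x objective-c++")

# Print the full compile and link command lines during the build, which makes
# a stray flag like this one easy to spot.
set(CMAKE_VERBOSE_MAKEFILE ON)
```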

Related

Find out why cmake adds specific link flags

I have a big project built with CMake. It mostly works.
But recently some combination of compilation server and test server broke. Investigation found that the final compile/link command calls gcc (...) -licudata -licui18n -licuuc (...), which introduces a dependency on a shared library that is not present on the test server.
How do I find out what in my project (my library, imported library, found library, whatever) adds those 3 flags to compile command?
I don't add them explicitly, so something is done automagically and I want to find it. compile_commands.json doesn't have them because linking flags don't belong in it. CMakeCache.txt has those flags in some obscure variable PC_LIBXML_STATIC_LIBRARIES:INTERNAL but removing them there doesn't affect compile/link command.
Note that this question is not about dealing with libicu specifically but about a method for investigation in general (though comments about eventual known problems with libicu would be appreciated too).
I found out that the dependency graphs created by cmake can have more detail than was configured for our project. Here are all the options: https://cmake.org/cmake/help/latest/module/CMakeGraphVizOptions.html I expect GRAPHVIZ_EXTERNAL_LIBS and GRAPHVIZ_SHARED_LIBS are the most important ones to set to true.
We enabled everything that was possible to enable, filtered out nothing, and the resulting graph was massive (too big for xdot; luckily .dot files are human readable), but it showed that Boost::regex uses those 3 libraries.
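As a sketch of the setup described above: CMake reads a file named CMakeGraphVizOptions.cmake from the top-level source or build directory when it is run with --graphviz. The output name deps.dot in the comment is an arbitrary choice, and only a subset of the documented options is shown.

```cmake
# CMakeGraphVizOptions.cmake, placed in the top-level source or build directory.
# Generate the graph with:  cmake --graphviz=deps.dot <path-to-source>

# Include every kind of node that might carry the unexpected dependency.
set(GRAPHVIZ_EXECUTABLES   TRUE)
set(GRAPHVIZ_STATIC_LIBS   TRUE)
set(GRAPHVIZ_SHARED_LIBS   TRUE)
set(GRAPHVIZ_MODULE_LIBS   TRUE)
set(GRAPHVIZ_EXTERNAL_LIBS TRUE)

# Leave the ignore list empty so that nothing is filtered out of the graph.
set(GRAPHVIZ_IGNORE_TARGETS "")
```

With a graph too large to render, searching the generated .dot file for the library name (for example icuuc) and then following the edges that point at that node is one way to find which target pulls it in.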

CMake precompiled headers issue with mixed C/C++ project

Environment
cmake version 3.21.1 running on macOS 10.15.7, clang version string Apple clang version 12.0.0 (clang-1200.0.32.29).
Introduction
I have a project for a library written in C, however its unit tests are written in C++ using Google Test. The library implements different algorithms, with one target for each different algorithm. For each target, say A, there is a corresponding A_tests target. Say there are 5 targets, A through E.
Due to ever-increasing build times, I'm trying to add Google Test's "gtest/gtest.h" header as a precompiled header, evidently only for C++. To avoid repeatedly recompiling the same header, I added the following entry to one of my targets, say A_tests:
target_precompile_headers(A_tests PRIVATE [["gtest/gtest.h"]])
Note that A_tests is composed entirely of C++ files.
For each of the other targets (X = B, C, D, E), I added the following:
target_precompile_headers(X_tests REUSE_FROM A_tests)
The issue
Now this works fine for, say, X = B and C, which are also pure C++ targets. However, D_tests has a C file in it in addition to the various C++ files. When configuring the project with CMake, I get the following error:
CMake Error in CMakeLists.txt:
Unable to resolve full path of PCH-header
'/Users/.../my-lib/build/CMakeFiles/A_tests.dir/cmake_pch.h'
assigned to target D_tests, although its path is supposed to be
known!
Indeed, at my-lib/build/CMakeFiles/A_tests.dir, there is a cmake_pch.hxx file but not a cmake_pch.h file.
Root cause
Eventually, after an investigation that involved running CMake under a debugger, I found out it had to do with the presence of a C file in D_tests, along with the lack of C files in A_tests. (Note: the PCH must be compiled inside A_tests, since A is the only mandatory target in the library -- B through E may all be disabled through CMake options.)
Attempts to fix
My first attempt was to add a dummy C file to A_tests to ensure that a C PCH is created as well. Although this ensures the error goes away, this is the content of the cmake_pch.h (note this is the C version of the file, as opposed to the separate C++ version which is cmake_pch.hxx):
/* generated by CMake */
#pragma clang system_header
#include "gtest/gtest.h"
I can't imagine any good things will come out of force-including a C++ header in C files (even if that's not an error, it will at the very least slow down the compilation by including the PCH in files where it makes no sense to do so).
After some more experimentation, I got an acceptable result by changing the target_precompile_headers() entry in A_tests to the following, using generator expressions:
target_precompile_headers(A_tests PRIVATE
"$<$<COMPILE_LANGUAGE:CXX>:\"gtest/gtest.h\">"
"$<$<COMPILE_LANGUAGE:C>:<stddef.h$<ANGLE-R>>")
In principle this solution is acceptable -- having a C PCH with stddef.h is not really a problem, since it's a small and harmless header, and moreover there are very few C files in the X_tests targets and anyway C compilation is blazingly-fast.
However, I'm still bothered by the fact that I must add some C header to prevent an error. I even tried changing the relevant part of the statement above to "$<$<COMPILE_LANGUAGE:C>:>", but then I get a different error: target_precompile_headers called with invalid arguments.
The question
Can I modify my script to communicate to CMake that, for target D_tests, only a C++ PCH should be used, even though there are C files in that target?
Failing that, is it possible to create an empty C PCH, say by a suitable modification of the generator expression above?

What does the -specs argument do in arm-none-eabi-gcc?

I was having trouble with the linker for the embedded arm gcc compiler, and I found a tutorial somewhere online saying that I could fix my linker errors in arm-none-eabi-gcc by including the argument -specs=nosys.specs, which worked for me, and it was able to compile my code.
My chip is an ATSAM7SE256 microcontroller, which to my understanding is an arm7tdmi processor using the armv4t and thumb instruction sets, and I've been compiling my code using:
arm-none-eabi-gcc -march=armv4t -mtune=arm7tdmi -specs=nosys.specs -o <exe_name>.elf <input_files>
And the code compiles with no issue, but I have no idea if it's doing what I think it's doing.
What is the significance of a spec file? What other values can you set with -specs=, and in what situations would you want to? Is nosys.specs the value I want for a completely embedded arm microcontroller?
It is documented at: https://gcc.gnu.org/onlinedocs/gcc-11.1.0/gcc/Overall-Options.html#Overall-Options
It is a file containing switches to override standard defaults for various build components such as the compiler, assembler and linker. For example it can be used to replace the default C library.
I have never seen it used; typically bare-metal embedded system builds explicitly specify -nostdlib and then explicitly link the required library. It could be used in environment-specific builds to link other default code such as an RTOS, I guess. Personally I'd rather make all that explicit on the command line than hide it in a file somewhere.
Essentially it applies the switches specified in the file as if they were defaults, so can be used to define defaults for specific build and execution environments.
The format of the specs file is documented at https://gcc.gnu.org/onlinedocs/gcc-11.1.0/gcc/Spec-Files.html#Spec-Files
Without seeing both the linker errors and the content of the nosys.specs file in this case it is difficult to say how or why it solved your linker problem. The alternative solution of course would be to apply whatever switches are in the specs file directly.

Portable whole-archive linking in CMake

If you want to link a static library into a shared library or executable while keeping all the symbols visible (e.g. so you can dlopen it later to find them), a non-portable way to do this on Linux/BSD is to use the flag -Wl,--whole-archive. On macOS, the equivalent flag is -Wl,-force_load,<library>; on Windows it's apparently /WHOLEARCHIVE.
Is there a portable way to do this in CMake?
I know I can add linker flags with target_link_libraries. I can detect the OS. However, since the macOS version of this includes the library name in the same string as the flag (no spaces), I think this messes with CMake's usual handling of link targets and so on. The more compatible I try to make this, the more I have to bend over backwards to make it happen.
And this is without even getting into more unusual compilers like Intel, PGI, Cray, IBM, etc. Those may not be compilers that people commonly deal with, but in some domains dealing with them is basically unavoidable.
Are there any better options?
flink.cmake will help you.
target_force_link_libraries(<target>
<PRIVATE|PUBLIC|INTERFACE> <item>...
[<PRIVATE|PUBLIC|INTERFACE> <item>...]...
)
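For reference, CMake 3.24 and later also ship a built-in link feature for this, separate from flink.cmake: the $<LINK_LIBRARY:WHOLE_ARCHIVE,...> generator expression. The following is a minimal sketch, assuming CMake 3.24 or newer. The targets plugin_core and host and their source files are hypothetical. CMake defines the feature only for toolchains it knows about, so the less common compilers mentioned in the question may still need manual handling.

```cmake
cmake_minimum_required(VERSION 3.24)
project(whole_archive_demo C)

# Hypothetical static library whose symbols must all survive linking.
add_library(plugin_core STATIC plugin_core.c)

# Hypothetical executable that looks the symbols up at run time.
add_executable(host main.c)

# CMake expands the WHOLE_ARCHIVE feature to the linker-specific form
# (--whole-archive, -force_load or /WHOLEARCHIVE) for supported toolchains.
target_link_libraries(host PRIVATE
    "$<LINK_LIBRARY:WHOLE_ARCHIVE,plugin_core>")
```

Because the library is still passed as a target name, CMake's usual dependency tracking for link targets stays intact.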

How to reuse Fortran modules without copying source or creating libraries

I'm having trouble understanding if/how to share code among several Fortran projects without building libraries or duplicating source code.
I am using Eclipse/Photran with the Intel compiler (ifort) on a linux system, but I believe I'm having a bigger conceptual problem with modules than with the specific tools.
Here's a simple example: In ~/workspace/cow I have a source directory (src) containing cow.f90 (the PROGRAM) and two modules m_graze and m_moo in m_graze.f90 and m_moo.f90, respectively. This project builds and links properly to create the executable 'cow'. The executable and modules (m_graze.mod and m_moo.mod) are stored in ~/workspace/cow/Debug and object files are stored under ~/workspace/cow/Debug/src
Later, I create ~/workspace/sheep and have src/sheep.f90 as the program and src/m_baa.f90 as the module m_baa. I want to 'use m_graze, only: ruminate' in sheep.f90 to get access to the ruminate() subroutine. I could just copy m_graze.f90 but that could lead to code getting out of sync and doesn't take into account any dependencies m_graze might have. For these reasons, I'd rather leave m_graze in the cow project and compile and link sheep.f90 against it.
If I try to compile the sheep project, I'll get an error like:
error #7002: Error in opening the compiled module file. Check INCLUDE paths. [M_GRAZE]
Under Properties:Project References for sheep, I can select the cow project. Under Properties:Fortran Build:Settings:Intel Compiler:Preprocessor I can add ~/workspace/cow/Debug (location of the module files) to the list of include directories so the compiler now finds the cow modules and compiles sheep.f90. However the linker dies with something like:
Building target: sheep
Invoking: Intel(R) Fortran Linker
ifort -L/home/me/workspace/cow/Debug -o "sheep" ./src/sheep.o
./src/sheep.o: In function `sheep':
/home/me/workspace/sheep/src/sheep.f90:11: undefined reference to `m_graze_mp_ruminate_'
This would normally be solved by adding libraries and library paths to the linker settings except there are no appropriate libraries to link to (this is Fortran, not C.)
The cow project was perfectly capable of compiling and linking together cow.f90, m_graze.f90 and m_moo.f90 into an executable. Yet while the sheep project can compile sheep.f90 and m_baa.f90 and can find the module m_graze.mod, it can't seem to find the symbols for m_graze even though all the requisite information is present on the system for it to do so.
It would seem to be an easy matter of configuration to get the linker portion of ifort to find the missing pieces and put them together but I have no idea what magic words need to be entered where in the Photran UI to make this happen.
I confess an utter lack of interest and competence in C and the C build process and I'd rather avoid the diversion of creating libraries (.a or .so) unless that's the only way to make this work.
Ultimately, I'm looking for a pure Fortran solution to this problem so I can keep a single copy of the source code and don't have to manually maintain a pile of custom Makefiles.
So can this be done?
Apologies if this has already been documented somewhere; Google is only showing me simple build examples, how to create modules, and how to link with existing libraries. There don't seem to be (m)any examples of code reuse with modules that don't involve duplicating source code.
Edit
As respondents have pointed out, the .mod files are necessary but not sufficient; either object code (in the form of m_graze.o) or static or shared libraries must be specified during the linking phase. The .mod files describe the interface to the object code/library but both are necessary to build the final executable.
For an oversimplified toy problem such as this, that's sufficient to answer the question as posed.
In a larger project with more complex dependencies (in my case, 80+KLOC of F90 linking to the MKL version of LAPACK95), the IDE or toolchain may lack sufficient automatic or user-interface facilities to make sharing a single canonical set of source files a viable strategy. The choice seems to be between risking duplicate source files getting out of sync, giving up many of the benefits of an IDE (i.e. avoiding manual creation of make/CMake/SCons files), or, in all likelihood, both. While a revision control system and good code organization can help, it's clear that sharing a single canonical set of source files among projects is far from easy given the current state of Eclipse.
Some background which I suspect you already know: Typically (including ifort) compiling the source code for a Fortran module results in two outputs - a "mod" file that contains a description of the Fortran entities that the module defines that the compiler needs to find whenever it sees a USE statement for the module, and object code for the linker that implements the procedures and variable storage, etc., that the module defines.
Your first error (the one you solved) is because the compiler couldn't find the mod file.
The second error is because the linker hasn't been told about the object code that implements the stuff that was in the source file with the module. I'm not an Eclipse user by any means, but a brute force way of specifying that is just to add the object file (xxxxx/Debug/m_graze.o) as an additional linker option (Fortran Build > Settings, under Intel Fortran Linker > Command Line). (Other tool chains have explicit "additional object file" properties for their link stage - there may well be a better way of doing this for the Intel chain.)
For more involved examples you would typically create a library out of the shared code. That's not really C specific; the only Fortran aspect is that the library's archive of object code needs to be provided alongside the mod files that the Fortran compiler generates.
Yes the object code must be provided. E.g., when you install libnetcdf-dev in Debian (apt-get install libnetcdf-dev), there is a /usr/include/netcdf.mod file that is included.
You can now use all netcdf routines in your Fortran code. E.g.,
program main
use netcdf
...
end
but you'll have to link to the netcdf shared (or static) library, i.e.,
gfortran -I/usr/include/ main.f90 -lnetcdff
However, as user MSB mentioned the mod file can only be used by gfortran that comes with the distribution (apt-get install gfortran). If you want to use any other compiler (even a different version that you may have installed yourself) then you'll have to build netcdf yourself using that particular compiler.
So creating a library is not a bad solution.
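A minimal sketch of that library approach, written as a CMake project rather than the Eclipse/Photran setup from the question. It assumes a CMakeLists.txt in a common parent directory that contains the cow and sheep trees. The target name graze and the modules output directory are hypothetical.

```cmake
cmake_minimum_required(VERSION 3.10)
project(farm Fortran)

# Compile the shared module once, from its single canonical location.
add_library(graze STATIC cow/src/m_graze.f90)

# Put the generated m_graze.mod in a known directory and export that
# directory to every target that links against graze.
set_target_properties(graze PROPERTIES
    Fortran_MODULE_DIRECTORY "${CMAKE_BINARY_DIR}/modules")
target_include_directories(graze PUBLIC "${CMAKE_BINARY_DIR}/modules")

# sheep receives both the .mod search path (needed at compile time)
# and the object code for m_graze (needed at link time).
add_executable(sheep sheep/src/sheep.f90 sheep/src/m_baa.f90)
target_link_libraries(sheep PRIVATE graze)
```

This keeps one copy of m_graze.f90 and addresses both errors from the question: the missing module file at compile time and the undefined reference at link time.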