minimal Fortran file that compiles with any compiler - cmake

CMake developers recommend adding a dummy Fortran file to tell CMake that static libraries need to be linked with Fortran libraries (for example, when linking a C program with LAPACK).
My first thought was to use an empty dummy.f, but ifort 9.0 won't compile it.
What is the minimal portable dummy Fortran file?
Is the old Intel compiler the only one that has a problem with an empty file?

Same error with GFortran and Absoft Fortran. Actually, you need a "program" block to build an executable.
This dummy.f would work:
program dummy
end
According to the standard, "A Fortran program must contain one main program and may contain any number of the other kinds of program units" (see "Fortran 95 Handbook" by Adams et al., section 2.1.1, p.19).
Or, in the Fortran 95 standard itself, section 2.2.1, p.12:
A program consists of exactly one main program unit and any number
(including zero) of other kinds of program units. The set of program
units may include any combination of the different kinds of program
units in any order as long as there is only one main program unit.

Related

CMake precompiled headers issue with mixed C/C++ project

Environment
cmake version 3.21.1 running on macOS 10.15.7, clang version string Apple clang version 12.0.0 (clang-1200.0.32.29).
Introduction
I have a project for a library written in C, however its unit tests are written in C++ using Google Test. The library implements different algorithms, with one target for each different algorithm. For each target, say A, there is a corresponding A_tests target. Say there are 5 targets, A through E.
Due to ever-increasing build times, I'm trying to add Google Test's "gtest/gtest.h" header as a precompiled header, evidently only for C++. To avoid repeatedly recompiling the same header, I added the following entry to one of my targets, say A_tests:
target_precompile_headers(A_tests PRIVATE [["gtest/gtest.h"]])
Note that A_tests is composed entirely of C++ files.
For each of the other targets (X = B, C, D, E), I added the following:
target_precompile_headers(X_tests REUSE_FROM A_tests)
The issue
Now this works fine for, say, X = B and C, which are also pure C++ targets. However, D_tests has a C file in it in addition to the various C++ files. When configuring the project with CMake, I get the following error:
CMake Error in CMakeLists.txt:
Unable to resolve full path of PCH-header
'/Users/.../my-lib/build/CMakeFiles/A_tests.dir/cmake_pch.h'
assigned to target D_tests, although its path is supposed to be
known!
Indeed, at my-lib/build/CMakeFiles/A_tests.dir, there is a cmake_pch.hxx file but not a cmake_pch.h file.
Root cause
Eventually, after an investigation that involved running CMake under a debugger, I found out it had to do with the presence of a C file in D_tests, along with the lack of C files in A_tests. (Note: the PCH must be compiled inside A_tests, since A is the only mandatory target in the library -- B through E may all be disabled through CMake options.)
Attempts to fix
My first attempt was to add a dummy C file to A_tests to ensure that a C PCH is created as well. Although this ensures the error goes away, this is the content of the cmake_pch.h (note this is the C version of the file, as opposed to the separate C++ version which is cmake_pch.hxx):
/* generated by CMake */
#pragma clang system_header
#include "gtest/gtest.h"
I can't imagine any good things will come out of force-including a C++ header in C files (even if that's not an error, it will at the very least slow down the compilation by including the PCH in files where it makes no sense to do so).
After some more experimentation, I got an acceptable result by changing the target_precompile_headers() entry in A_tests to the following, using generator expressions:
target_precompile_headers(A_tests PRIVATE
"$<$<COMPILE_LANGUAGE:CXX>:\"gtest/gtest.h\">"
"$<$<COMPILE_LANGUAGE:C>:<stddef.h$<ANGLE-R>>")
In principle this solution is acceptable -- having a C PCH with stddef.h is not really a problem, since it's a small and harmless header; moreover, there are very few C files in the X_tests targets, and C compilation is blazingly fast anyway.
However, I'm still bothered by the fact that I must add some C header to prevent an error. I even tried changing the relevant part of the statement above to "$<$<COMPILE_LANGUAGE:C>:>", but then I get a different error: target_precompile_headers called with invalid arguments.
The question
Can I modify my script to communicate to CMake that, for target D_tests, only a C++ PCH should be used, even though there are C files in that target?
Failing that, is it possible to create an empty C PCH, say by a suitable modification of the generator expression above?

What is the difference between using DFWIN or IFWIN module for the Intel's oneAPI Fortran compiler?

I compiled an old piece of software in which several "system" modules are used, all starting with D (DFLIB, DFWIN, DFWINTY, ...). I noticed that the online oneAPI manual documents similar modules starting with "I" instead of "D". At first sight there doesn't seem to be much difference (the software compiles with either DFWIN or IFWIN). Is a D module the debug version of the respective I module?
A module in the DF* and MSF* series of names is the same as the corresponding module in the IF* series. These name series exist to ease migration between the compilers from the different corporate entities.
The source code for those compiler provided modules is provided with your compiler installation (e.g. search the installation tree for IFWINTY.f90). If you inspect the source for the IFWINTY module (just as an example) you will find that it has source for lots of type and constant declarations. DFWINTY and MSFWINTY, on the other hand, just consist of USE IFWINTY (and lots of licence boilerplate).

What are the possible values for the LANGUAGE variable in CMAKE

I haven't been able to find a list of possible values for the LANGUAGE variable on the CMAKE.org site or anywhere else. Would someone please enumerate the values CMAKE recognises? I specifically need to specify Objective C++.
Just take a look at all the CMakeDetermine<Language>Compiler.cmake scripts CMake ships with.
This would result - in alphabetic order - in the following you could put in the enable_language() call:
ASM
ASM-ATT
ASM-MASM
ASM-NASM
C
CSharp
CUDA
CXX
Fortran
Java
OBJC (Objective C)
OBJCXX (Objective C++)
RC (Windows Resource Compiler)
Swift
Evaluated with CMake Version 3.16
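The lookup described above can be automated. The following is a minimal Python sketch, assuming you know where your CMake installation keeps its Modules directory (that location varies by platform and is not given here); the function name cmake_languages is made up for this illustration.

```python
import re
from pathlib import Path

# Matches e.g. "CMakeDetermineCXXCompiler.cmake" and captures "CXX".
# Helper scripts such as "CMakeDetermineCompilerId.cmake" do not match.
PATTERN = re.compile(r"^CMakeDetermine(.+)Compiler\.cmake$")


def cmake_languages(modules_dir):
    """Return the sorted language names found in a CMake Modules directory.

    modules_dir is an assumption of this sketch: pass the Modules folder
    of your own CMake installation.
    """
    names = set()
    for path in Path(modules_dir).iterdir():
        match = PATTERN.match(path.name)
        if match:
            names.add(match.group(1))
    return sorted(names)
```

Calling cmake_languages() on the Modules folder and printing the result should give a list similar to the one above for the installed CMake version; newer versions may list more languages.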
References
enable_language()
Generic rule from makefile to CMake
Update for CMake 3.16 and later: CMake added native support for Objective-C in version 3.16. The corresponding language strings are OBJC and OBJCXX. Thanks to squareskittles for pointing this out.
Original answer: The support for languages varies across platforms.
Currently CMake supports C, CXX and Fortran out of the box on most platforms. There is also support for certain Assemblers on some platforms. For a complete list, check out the contents of the Modules/Platform folder.
The idea is that the language given to the LANGUAGE field of the project command or the enable_language command is just a string, which CMake then uses together with the language-dependent variables to set up the build system. The Platform scripts shipping with CMake do this configuration for C and C++. In theory, one can add their own language simply by setting the correct variables (although this is quite involved and I do not know of anyone ever successfully doing this).
As for adding support for Objective-C: Since most toolchains use the same compiler for C and Objective-C, you do not need to configure a new language. Simply compile your code as if it was plain C and add the appropriate compiler flags for Objective-C support.
Unfortunately, this is not very comfortable to use and can easily break in corner cases. But until CMake adds explicit support for Objective-C as a first class language, I'm afraid this is as good as it gets.

How to reuse Fortran modules without copying source or creating libraries

I'm having trouble understanding if/how to share code among several Fortran projects without building libraries or duplicating source code.
I am using Eclipse/Photran with the Intel compiler (ifort) on a linux system, but I believe I'm having a bigger conceptual problem with modules than with the specific tools.
Here's a simple example: In ~/workspace/cow I have a source directory (src) containing cow.f90 (the PROGRAM) and two modules m_graze and m_moo in m_graze.f90 and m_moo.f90, respectively. This project builds and links properly to create the executable 'cow'. The executable and modules (m_graze.mod and m_moo.mod) are stored in ~/workspace/cow/Debug and object files are stored under ~/workspace/cow/Debug/src
Later, I create ~/workspace/sheep and have src/sheep.f90 as the program and src/m_baa.f90 as the module m_baa. I want to 'use m_graze, only: ruminate' in sheep.f90 to get access to the ruminate() subroutine. I could just copy m_graze.f90 but that could lead to code getting out of sync and doesn't take into account any dependencies m_graze might have. For these reasons, I'd rather leave m_graze in the cow project and compile and link sheep.f90 against it.
If I try to compile the sheep project, I'll get an error like:
error #7002: Error in opening the compiled module file. Check INCLUDE paths. [M_GRAZE]
Under Properties:Project References for sheep, I can select the cow project. Under Properties:Fortran Build:Settings:Intel Compiler:Preprocessor I can add ~/workspace/cow/Debug (location of the module files) to the list of include directories so the compiler now finds the cow modules and compiles sheep.f90. However the linker dies with something like:
Building target: sheep
Invoking: Intel(R) Fortran Linker
ifort -L/home/me/workspace/cow/Debug -o "sheep" ./src/sheep.o
./src/sheep.o: In function `sheep':
/home/me/workspace/sheep/src/sheep.f90:11: undefined reference to `m_graze_mp_ruminate_'
This would normally be solved by adding libraries and library paths to the linker settings except there are no appropriate libraries to link to (this is Fortran, not C.)
The cow project was perfectly capable of compiling and linking together cow.f90, m_graze.f90 and m_moo.f90 into an executable. Yet while the sheep project can compile sheep.f90 and m_baa.f90 and can find the module m_graze.mod, it can't seem to find the symbols for m_graze even though all the requisite information is present on the system for it to do so.
It would seem to be an easy matter of configuration to get the linker portion of ifort to find the missing pieces and put them together but I have no idea what magic words need to be entered where in the Photran UI to make this happen.
I confess an utter lack of interest and competence in C and the C build process and I'd rather avoid the diversion of creating libraries (.a or .so) unless that's the only way to make this work.
Ultimately, I'm looking for a pure Fortran solution to this problem so I can keep a single copy of the source code and don't have to manually maintain a pile of custom Makefiles.
So can this be done?
Apologies if this has already been documented somewhere; Google is only showing me simple build examples, how to create modules, and how to link with existing libraries. There don't seem to be (m)any examples of code reuse with modules that don't involve duplicating source code.
Edit
As respondents have pointed out, the .mod files are necessary but not sufficient; either object code (in the form of m_graze.o) or static or shared libraries must be specified during the linking phase. The .mod files describe the interface to the object code/library but both are necessary to build the final executable.
For an oversimplified toy problem such as this, that's sufficient to answer the question as posed.
In a larger project with more complex dependencies (in my case, 80+KLOC of F90 linking to the MKL version of LAPACK95), the IDE or toolchain may lack sufficient automatic or user-interface facilities to make sharing a single canonical set of source files a viable strategy. The choice seems to be between risking duplicate source files getting out of sync, giving up many of the benefits of an IDE (i.e. avoiding manual creation of make/CMake/SCons files), or, in all likelihood, both. While a revision control system and good code organization can help, it's clear that sharing a single canonical set of source files among projects is far from easy given the current state of Eclipse.
Some background, which I suspect you already know: typically (including with ifort), compiling the source code for a Fortran module produces two outputs. The first is a "mod" file describing the Fortran entities that the module defines; the compiler needs to find it whenever it sees a USE statement for the module. The second is object code for the linker, which implements the procedures, variable storage, etc., that the module defines.
Your first error (the one you solved) is because the compiler couldn't find the mod file.
The second error is because the linker hasn't been told about the object code that implements the stuff that was in the source file with the module. I'm not an Eclipse user by any means, but a brute force way of specifying that is just to add the object file (xxxxx/Debug/m_graze.o) as an additional linker option (Fortran Build > Settings, under Intel Fortran Linker > Command Line). (Other tool chains have explicit "additional object file" properties for their link stage - there may well be a better way of doing this for the Intel chain.)
For more involved examples you would typically create a library out of the shared code. That's not really C-specific; the only Fortran aspect is that the library's archive of object code needs to be provided alongside the mod files that the Fortran compiler generates.
Yes the object code must be provided. E.g., when you install libnetcdf-dev in Debian (apt-get install libnetcdf-dev), there is a /usr/include/netcdf.mod file that is included.
You can now use all netcdf routines in your Fortran code. E.g.,
program main
use netcdf
...
end
but you'll have to link to the netcdf shared (or static) library, i.e.,
gfortran -I/usr/include/ main.f90 -lnetcdff
However, as user MSB mentioned, the mod file can only be used by the gfortran that comes with the distribution (apt-get install gfortran). If you want to use any other compiler (even a different version that you may have installed yourself), then you'll have to build netcdf yourself using that particular compiler.
So creating a library is not a bad solution.

What is the difference in byte code like Java bytecode and files and machine code executables like ELF?

What are the differences between byte code binary executables such as Java class files, Parrot bytecode files or CLR files, and machine code executables such as ELF, Mach-O and PE?
What are the distinctive differences between the two?
For example, which part of the class file corresponds to the .text area in the ELF structure?
Or: they all have headers, but the ELF and PE headers contain the architecture while the class file does not.
(Figures: Java class file, ELF file, PE file.)
Byte code is, as imulsion noted, an intermediate step, right before compilation into machine code. Because the last step is left to load time (and often runtime, as is the case with Just-In-Time (JIT) compilation), byte code is architecture independent: the runtime (CLR for .NET or JVM for Java) is responsible for mapping the byte code opcodes to their underlying machine code representation.
By comparison, native code (Windows: PE, PE32+, OS X/iOS: Mach-O, Linux/Android/etc: ELF) is compiled code, suited for a particular architecture (Android/iOS: ARM, most else: Intel 32-bit (i386) or 64-bit). These formats are all very similar, but still require sections (or, in Mach-O parlance, "Load Commands") to set up the memory structure of the executable as it becomes a process (old DOS supported the ".com" format, which was a raw memory image). For all of the above you can say, roughly, the following:
Sections with a "." are created by the compiler, and are "default" or expected to have default behavior
The executable has the main code section, usually called "text" or ".text". This is native code, which can run on the specific architecture
Strings are stored in a separate section. These are used for hard-coded output (what you print out) as well as symbol names.
Symbols - which are what the linker uses to put together the executable with its libraries (Windows: DLLs, Linux/Android: Shared Objects, OS X/iOS: .dylibs or frameworks) are stored in a separate section. Usually there is also a "PLT" (Procedure Linkage Table) which enables the compiler to simply put in stubs to the functions you call (printf, open, etc), that the linker can connect when the executable loads.
Import table (in Windows parlance; in ELF this is the .dynamic section, in OS X an LC_LOAD_DYLIB load command) is used to declare additional libraries. If those aren't found when the executable is loaded, the load fails, and you can't run it.
Export table (for libraries/dylibs/etc.) holds the symbols which the library (or in Windows, even an .exe) can export so that others can link with them.
Constants are usually in what you see as the ".rodata".
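To make the header difference concrete, here is a rough Python sketch that classifies a file by its leading bytes and reads the machine field where one exists. The function name describe and the two small machine tables are choices made for this illustration; the tables cover only a few common values. It assumes it is handed enough leading bytes for the fields it reads (a few hundred bytes of a real file is usually plenty), and it does not try to tell Java class files apart from Mach-O universal binaries, which share the same 0xCAFEBABE magic.

```python
import struct

# Small, incomplete lookup tables (assumption: only a few common values).
ELF_MACHINES = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64"}
PE_MACHINES = {0x014C: "x86", 0x8664: "x86-64", 0xAA64: "ARM64"}


def describe(data):
    """Classify the leading bytes of a file and report its target, if any."""
    if data[:4] == b"\x7fELF":
        # EI_DATA (byte 5) gives the byte order: 1 = little, 2 = big endian.
        order = "<" if data[5] == 1 else ">"
        (machine,) = struct.unpack_from(order + "H", data, 18)  # e_machine
        return "ELF, machine: " + ELF_MACHINES.get(machine, hex(machine))
    if data[:2] == b"MZ":
        # The 32-bit value at offset 0x3C points to the "PE\0\0" signature.
        (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
        if data[pe_offset:pe_offset + 4] == b"PE\0\0":
            (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
            return "PE, machine: " + PE_MACHINES.get(machine, hex(machine))
        return "DOS MZ executable"
    if data[:4] in (b"\xce\xfa\xed\xfe", b"\xcf\xfa\xed\xfe"):
        return "Mach-O (little-endian)"
    if data[:4] == b"\xca\xfe\xba\xbe":
        # A class file stores only a format version, no target architecture.
        minor, major = struct.unpack_from(">HH", data, 4)
        return f"Java class file, version {major}.{minor}, no machine field"
    return "unknown"


# Synthetic headers, just long enough to exercise two of the branches.
elf_sample = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9) + struct.pack("<HH", 2, 62)
class_sample = b"\xca\xfe\xba\xbe" + struct.pack(">HH", 0, 52)

print(describe(elf_sample))    # ELF, machine: x86-64
print(describe(class_sample))  # Java class file, version 52.0, no machine field
```

On a real file you would pass something like the first 512 bytes read in binary mode.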
Hope this helps. Really, your question was vague..
TG
Byte code is a 'halfway' step. The Java compiler (javac) turns the source code into byte code. Machine code is the next step: the JVM takes the byte code, turns it into machine code (which the processor can execute) and then runs your program. Computers cannot execute source code directly, and javac does not translate straight into machine code, so Java needs this halfway step.
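As a toy illustration of that halfway step, here is a sketch of a tiny stack machine in Python. The opcodes are invented for this example and have nothing to do with real JVM or CLR opcodes; the point is only that the byte sequence itself is independent of the processor, and that a runtime decides how each opcode is actually carried out.

```python
# Hypothetical opcodes for a toy stack machine (not real JVM opcodes).
PUSH, ADD, MUL, HALT = 0x01, 0x02, 0x03, 0xFF


def run(bytecode):
    """Interpret the toy byte code; return the value on top of the stack."""
    stack = []
    pc = 0  # index of the next opcode to execute
    while True:
        op = bytecode[pc]
        if op == PUSH:
            # PUSH is followed by a one-byte operand.
            stack.append(bytecode[pc + 1])
            pc += 2
        elif op == ADD:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
            pc += 1
        elif op == MUL:
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
            pc += 1
        elif op == HALT:
            return stack[-1]
        else:
            raise ValueError(f"unknown opcode {op:#x} at offset {pc}")


# Byte code for (2 + 3) * 4; the same bytes run on any machine with run().
program = bytes([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT])
result = run(program)
print(result)  # 20
```

A real virtual machine does far more (verification, JIT compilation, garbage collection), but the division of labour is the same: the file carries opcodes, and the runtime maps them to native operations.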
Note that ELF binaries don't necessarily need to be machine/arch specific per se.
The interesting piece is the "interpreter" header field (the PT_INTERP program header): it holds the path name of a loader program that is executed instead of the actual binary. This loader is then responsible for loading the actual program, loading and linking libraries, etc. This is how e.g. ld.so comes in.
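For reference, the interpreter path can be read straight out of the file. The sketch below handles only 64-bit little-endian ELF images (an assumption made to keep it short) and looks for the PT_INTERP program header; the function name elf_interpreter is made up for this example.

```python
import struct

PT_INTERP = 3  # program header type that names the requested loader


def elf_interpreter(data):
    """Return the PT_INTERP path of a 64-bit little-endian ELF image, or None.

    Assumption of this sketch: other ELF classes and byte orders are
    rejected rather than handled.
    """
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("not a 64-bit little-endian ELF image")
    (phoff,) = struct.unpack_from("<Q", data, 32)           # e_phoff
    phentsize, phnum = struct.unpack_from("<HH", data, 54)  # e_phentsize, e_phnum
    for index in range(phnum):
        base = phoff + index * phentsize
        (p_type,) = struct.unpack_from("<I", data, base)
        if p_type == PT_INTERP:
            (p_offset,) = struct.unpack_from("<Q", data, base + 8)
            (p_filesz,) = struct.unpack_from("<Q", data, base + 32)
            raw = data[p_offset:p_offset + p_filesz]
            # The stored path is NUL-terminated.
            return raw.rstrip(b"\0").decode()
    return None
```

Feeding it the bytes of a dynamically linked executable returns the loader path recorded at link time; on a glibc-based x86-64 Linux system this is commonly /lib64/ld-linux-x86-64.so.2. A statically linked binary has no such header, and the function returns None.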
Theoretically one could create an ELF binary that holds java bytecode (or a complete jar). This just needs some appropriate "interpreter" program which starts up a JVM and loads the code from the binary into it.
Not sure whether this actually has been done before, but certainly possible.
The same can be done with pretty much any non-native code.
It also could serve for direct multiarch support via some VM like qemu:
Let the target platform (libc+linker scripts) put the arch name into the interpreter program name (eg. /lib/ld.so.x86_64, /lib/ld.so.armhf, ...).
Then, on a particular arch (eg. x86_64), the one with native arch name will point to the original ld.so, while the others point to some special one that calls up something like qemu-system-XXX.