In CMake, how do I deal with generated source files whose number and names are not known beforehand?

Imagine a code generator which reads an input file (say a UML class diagram) and produces an arbitrary number of source files which I want handled in my project (to keep the picture simple, let's assume the code generator just produces .cpp files).
The problem is that the number of generated files depends on the input file and thus is not known when writing the CMakeLists.txt file, or even during CMake's configure step. E.g.:
>>> code-gen uml.xml
generate class1.cpp..
generate class2.cpp..
generate class3.cpp..
What's the recommended way to handle generated files in such a case? You could use FILE(GLOB ...) to collect the file names after running code-gen the first time, but this is discouraged: CMake would not know any of the files on the first run, and later it would not notice when the number of files changes.
I could think of some approaches but I don't know if CMake covers them, e.g.:
(somehow) define a dependency from an input file (uml.xml in my example) to a variable (a list of the generated file names)
in case the code generator can be convinced to tell which files it generates, its output could be used to build a list of the generated file names (this would lead to similar problems, but at least I would not have to use GLOB, which might pick up old files)
just define a custom target which runs the code generator and handles the output files without CMake (I don't like this option)
Update: this question targets a similar problem, but it only asks how to glob generated files and does not address how to re-configure when the input file changes.

Together with Tsyvarev's answer and some more googling, I came up with the following CMakeLists.txt, which does what I want:
cmake_minimum_required(VERSION 3.6)
project(generated)

set(IN_FILE "${CMAKE_SOURCE_DIR}/input.txt")
set_property(DIRECTORY APPEND PROPERTY CMAKE_CONFIGURE_DEPENDS "${IN_FILE}")

execute_process(
    COMMAND python3 "${CMAKE_SOURCE_DIR}/code-gen" "${IN_FILE}"
    WORKING_DIRECTORY "${PROJECT_BINARY_DIR}"
    INPUT_FILE "${IN_FILE}"
    OUTPUT_VARIABLE GENERATED_FILES
    OUTPUT_STRIP_TRAILING_WHITESPACE
)
add_executable(generated main.cpp ${GENERATED_FILES})
It turns an input file (input.txt) into output files using code-gen and compiles them.
execute_process runs at configure time, and the set_property() call makes sure CMake is re-run when the input file changes.
Note: in this example the code generator must print a CMake-friendly (semicolon-separated) list of the generated files on stdout, which is fine if you can modify the code generator. FILE(GLOB ...) would do the trick too, but that would surely lead to problems (e.g. old generated files also being compiled, colleagues complaining about your code, etc.).
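If your generator prints one file name per line rather than a semicolon-separated list, the captured output can be converted at configure time. A minimal sketch, assuming GENERATED_FILES holds the newline-separated stdout captured above:
# Sketch: turn newline-separated generator output into a CMake list.
string(REPLACE "\n" ";" GENERATED_FILES "${GENERATED_FILES}")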
PS: I don't like to answer my own questions - If you come up with a nicer or cleaner solution in the next couple of days I'll take yours!

Related

Bazel Checkers Support

What options does Bazel provide for creating new targets, or extending existing ones, that call C/C++ code checkers such as
qac
cppcheck
iwyu
?
Do I need to use a genrule or is there some other target rule for that?
Is https://bazel.build/versions/master/docs/be/extra-actions.html my only viable choice here?
In security-critical software industries, such as aviation and automotive, it is very common to use the results of these calls to collect so-called "metric reports".
In these cases, calls to such linters must produce outputs that are further processed by the build actions of the metric-report collectors. For such cases, I cannot find a useful way of reusing Bazel's "extra actions". Any ideas?
I've written something which uses extra actions to generate a compile_commands.json file used by clang-tidy and other tools, and I'd like to do the same kind of thing for iwyu when I get around to it. I haven't used those other tools, but I assume they fit the same pattern too.
The basic idea is to run an extra action which generates some output for each file (aka C/C++ compilation command), and then find all the output files afterwards (outside of Bazel) and aggregate them. A reasonably complete example is here for reference. Basically, the action listener (written in Python) decodes the extra action proto and extracts the source files, compiler options, etc:
import sys

# extra_actions_base_pb2 is generated from Bazel's extra_actions_base.proto.
import extra_actions_base_pb2

# Parse the serialized ExtraActionInfo proto that Bazel wrote for this action.
action = extra_actions_base_pb2.ExtraActionInfo()
with open(sys.argv[1], 'rb') as f:
    action.MergeFromString(f.read())

# Extract the C++ compile details from the extension field.
cpp_compile_info = action.Extensions[extra_actions_base_pb2.CppCompileInfo.cpp_compile_info]
compiler = cpp_compile_info.tool
options = ' '.join(cpp_compile_info.compiler_option)
source = cpp_compile_info.source_file
output = cpp_compile_info.output_file
print('%s %s -c %s -o %s' % (compiler, options, source, output))
If you give the extra action an output template, then it can write that output to a file. If you give the output files distinctive names, you can find them all in the output tree and merge them together however you want.
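For reference, here is a sketch of the BUILD wiring that hooks such a listener into every C++ compile; the target names and the tool label are hypothetical, and the exact attribute values may need adjusting for your setup:
# Hypothetical BUILD snippet: run the extractor after every C++ compile action.
action_listener(
    name = "compile_command_listener",
    mnemonics = ["CppCompile"],
    extra_actions = [":compile_command_action"],
)

extra_action(
    name = "compile_command_action",
    tools = ["//tools:extract_compile_command"],  # e.g. a py_binary wrapping the script above
    out_templates = ["$(ACTION_ID).compile_command.txt"],
    cmd = "$(location //tools:extract_compile_command) $(EXTRA_ACTION_FILE) " +
          "> $(output $(ACTION_ID).compile_command.txt)",
)
The listener is then enabled with --experimental_action_listener=//path/to:compile_command_listener on the bazel build command line.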
A more sophisticated option is to use bazel query --output=proto and write code to calculate the extra action output filenames of the targets you're interested in from there. That requires writing more code, but you don't have problems with old output files in the output tree that are accidentally included when aggregating.
FWIW, Aspects are another possibility. However, I think extra actions work acceptably for this.

How to use the program's exit status at compile time?

This question is subsequent to my previous one: How to integrate such kind of source generator into CMake build chain?
Currently, the C source file is generated from XS in this way:
set_source_files_properties(${CMAKE_CURRENT_BINARY_DIR}/${file_src_by_xs}
    PROPERTIES GENERATED 1)
add_custom_target(${file_src_by_xs}
    COMMAND ${XSUBPP_EXECUTABLE} ${XSUBPP_EXTRA_OPTIONS} ${lang_args} ${typemap_args}
            ${file_xs} >${CMAKE_CURRENT_BINARY_DIR}/${file_src_by_xs}
    WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}
    DEPENDS ${file_xs} ${files_xsh} ${_XSUBPP_TYPEMAP_FILES}
    COMMENT "generating source from XS file ${file_xs}"
)
The GENERATED property lets CMake skip checking for the existence of this source file at configure time, and add_custom_target makes xsubpp re-run on every build. The reason for always re-running is that xsubpp generates an incomplete source file even when it fails, so the build might otherwise continue with an incomplete source file.
I found it time-consuming to always re-run the source generator and recompile the result. So I want it to re-run only when the dependent XS files are modified. However, if I do that, the incomplete generated source file must be deleted on failure.
So my question is: is there any way to remove the generated file only when the program exits abnormally at compile time?
Or, more generically: is there any way to run a command depending on another command's exit status at compile time?
You can always write a wrapper script in your favorite language, e.g. Perl or Ruby, that runs xsubpp and deletes the output file if the command failed. That way you can be sure that if it exists, it is correct.
In addition, I would suggest that you use the OUTPUT keyword of add_custom_command to tell CMake that the file is a result of executing the command. (And, if you do that, you don't have to set the GENERATED property manually.)
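For illustration, a sketch of that combination; xsubpp_wrapper.pl is a hypothetical wrapper that forwards its arguments to xsubpp and deletes the output file when xsubpp exits with a non-zero status:
add_custom_command(
    OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/${file_src_by_xs}
    COMMAND perl ${CMAKE_CURRENT_SOURCE_DIR}/xsubpp_wrapper.pl
            ${XSUBPP_EXECUTABLE} ${file_xs}
            ${CMAKE_CURRENT_BINARY_DIR}/${file_src_by_xs}
    DEPENDS ${file_xs} ${files_xsh} ${_XSUBPP_TYPEMAP_FILES}
    WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}
    COMMENT "generating source from XS file ${file_xs}"
)
Because the file is declared as OUTPUT, CMake sets its GENERATED property automatically and the command re-runs only when its dependencies change.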
Inspired by @Lindydancer's answer, I achieved this with multiple COMMANDs in one custom command, so no external wrapper script is needed.
set(source_file_ok ${source_file}.ok)
add_custom_command(
    OUTPUT ${source_file} ${source_file_ok}
    DEPENDS ${xs_file} ${xsh_files}
    COMMAND rm -f ${source_file_ok}
    COMMAND xsubpp ...... >${source_file}
    COMMAND touch ${source_file_ok}
)
add_library(${xs_lib} ${source_file})
add_dependencies(${xs_lib} ${source_file} ${source_file_ok})
The custom command has three COMMANDs. The OK file exists only when xsubpp succeeded, and this file is added as a dependency of the library. When xsubpp fails, the missing OK file forces the custom command to run again on the next build.
The only flaw is portability: not every OS has touch and rm, so the names of these two commands would have to be chosen according to the OS type.
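A portable sketch of the same idea routes those helper steps through CMake's command-line tool mode, so no platform-specific touch or rm is needed:
COMMAND ${CMAKE_COMMAND} -E remove -f ${source_file_ok}
COMMAND xsubpp ...... >${source_file}
COMMAND ${CMAKE_COMMAND} -E touch ${source_file_ok}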

Proper way to call a found executable in a custom command?

I have a program on my computer, let's say C:/Tools/generate_v23_debug.exe
I have a FindGenerate.cmake file which allows CMake to find that exact path to the executable.
So in my CMake code, I do:
find_program(Generate)
if (NOT Generate_FOUND)
    message(FATAL_ERROR "Generator not found!")
endif()
So CMake has found the executable. Now I want to call this program in a custom command statement. Should I use COMMAND Generator or COMMAND ${GENERATOR_EXECUTABLE}? Will both of these do the same thing? Is one preferred over the other? Is name_EXECUTABLE a variable that CMake will define (it's not in the FindGenerate.cmake file), or is it something specific to someone else's example code I'm looking at? Will COMMAND Generator be expanded to the correct path?
add_custom_command(
    OUTPUT blahblah.txt
    COMMAND Generator inputfile1.log
    DEPENDS Generator
)
find_program stores its result into the variable given as a first argument. You can verify this by inserting some debug output:
find_program(GENERATOR Generate)
message(${GENERATOR})
Note that find_program does not set any additional variables beyond that. In particular, you mentioned Generate_FOUND and GENERATOR_EXECUTABLE in your question and neither of those gets introduced implicitly by the find_program call.
The second mistake in your code is the use of the DEPENDS option of add_custom_command. DEPENDS is used to model inter-target dependencies at build time, not to manipulate control flow in the CMakeLists. For example, an additional custom command can DEPEND on the output of your command (blahblah.txt), but a custom command cannot DEPEND on the result of a previous find operation.
A working example might look something like this:
find_program(GENERATOR Generate)
if(NOT GENERATOR)
    message(FATAL_ERROR "Generator not found!")
endif()
add_custom_command(
    OUTPUT blahblah.txt
    COMMAND ${GENERATOR} inputfile1.log
)
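Note that a custom command only runs when something depends on its OUTPUT. A minimal sketch of such a consumer (the target name is made up):
add_custom_target(run_generator ALL
    DEPENDS blahblah.txt
)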
P.S.: You asked why the code examples were not properly formatted in your question. You indented everything correctly, but you need an additional newline between normal text and code paragraphs. I edited your question accordingly.

CMake: add dependency to add_custom_command dynamically

I have a CMake project with many subprojects.
Each of them can use a function I provide to generate a small text file with certain information (by calling add_custom_command).
At the final step I'd like to combine all those files into one big text file.
I've created a custom command which searches for created files (all in one place) and merges them.
The problem is that I'd like to make this final step depend on all of the small steps made in the subprojects, while I don't actually know how many files will be provided.
My final command looks like:
add_custom_command(OUTPUT combination.txt
    COMMAND create combination.txt from all files from /path/)
and my create-small-text-file-for-each-subproject command looks like:
add_custom_command(OUTPUT /path/${sub_project_name}.txt
    COMMAND create /path/${sub_project_name}.txt)
And when I create those small files, I'd like to do something to make combination.txt depend on /path/${sub_project_name}.txt.
So I wish I could write:
add_dependency(combination.txt /path/${sub_project_name}.txt)
However, this only works for targets.
I've also tried set_source_files_properties with OBJECT_DEPENDS, but it doesn't seem to work (maybe it is intended to be used with the .cpp files of library/executable targets?).
The last way I see to make it work is to use a cache variable which accumulates all those small files' paths and is then used like this:
add_custom_command(OUTPUT combination.txt
    COMMAND create combination.txt from all files from /path/
    DEPENDS ${all_small_files_list})
but this is the last thing I want to do.
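For reference, a sketch of a variant that avoids the cache variable (the property names and target naming scheme are made up): each subproject wraps its file in a small target and records both in global properties, and the final step lists them in its DEPENDS, which gives the ordering via the targets and the rebuild trigger via the files:
# In each subproject, next to its add_custom_command:
add_custom_target(${sub_project_name}_txt DEPENDS /path/${sub_project_name}.txt)
set_property(GLOBAL APPEND PROPERTY SMALL_FILES /path/${sub_project_name}.txt)
set_property(GLOBAL APPEND PROPERTY SMALL_FILE_TARGETS ${sub_project_name}_txt)

# In the top-level CMakeLists.txt, after all add_subdirectory() calls:
get_property(small_files GLOBAL PROPERTY SMALL_FILES)
get_property(small_file_targets GLOBAL PROPERTY SMALL_FILE_TARGETS)
add_custom_command(OUTPUT combination.txt
    COMMAND create combination.txt from all files from /path/
    DEPENDS ${small_files} ${small_file_targets})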
Instead of using add_custom_command you could use add_custom_target with a correct dependency definition (so that it is not built every time).
add_custom_target(project
    COMMAND touch project.txt)
add_custom_target(project2
    COMMAND touch project2.txt)
add_custom_target(combination
    COMMAND cat project.txt project2.txt > combination.txt)
add_dependencies(combination project2)
add_dependencies(combination project)
add_executable(t t.c)
add_dependencies(t combination)
Again: make sure you're using the DEPENDS argument of add_custom_target to create a real dependency chain, so that a project target, and thus the combination target, goes out of date.
UPDATE: I was too hasty. In fact CMake (at least up to 2.8.9) handles dependencies as follows: with a call to add_dependencies you cannot add a dependency on the OUTPUT of a custom command, in other words on a (generated) file. With add_dependencies you can only add targets as created by add_custom_target. However, in an add_custom_target you can depend on the output of an add_custom_command by using the DEPENDS directive. That said, this makes it work:
add_custom_command(OUTPUT project.txt
    COMMAND uptime >> project.txt
    MAIN_DEPENDENCY t2.c)
add_custom_target(project DEPENDS project.txt)
add_custom_target(combination
    COMMAND cat project.txt project2.txt > combination.txt)
add_dependencies(combination project)
This will make the combination target always be regenerated as it has no MAIN_DEPENDENCY or DEPENDS, but the usage of add_dependencies is allowed.
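If the always-rebuild behaviour of combination is unwanted, a sketch of the usual fix (assuming project.txt and project2.txt are both produced by add_custom_command rules in the same directory) is to move the concatenation into a custom command with its own DEPENDS and let the target only wrap the output file:
add_custom_command(OUTPUT combination.txt
    COMMAND cat project.txt project2.txt > combination.txt
    DEPENDS project.txt project2.txt)
add_custom_target(combination DEPENDS combination.txt)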

SCONS: making a special script builder depend on output of another builder

I hope the title clarifies what I want to ask because it is a bit tricky.
I have an SCons SConscript for every subdir, as follows (doing it on Linux, if it matters):
src_dir
    compiler
        SConscript
        yacc srcs
    scripts
        legacy_script
    data
        SConscript
        data files for the yacc
I use a variant_dir without copy, for example:
SConscript('src_dir/compiler/SConscript', variant_dir = 'obj_dir', duplicate = 0)
The resulting obj_dir after building the yacc is:
obj_dir
    compiler
        compiler_compiler.exe
Now here is the deal.
I have another SConscript in the data dir that needs to do two things:
1. compile the data with the yacc-built compiler
2. take the output of the compiler and run it through the legacy_script I can't change
(the legacy_script takes the output of the compiled data and builds some .h files for other software to depend on)
Number 1 is achieved easily:
linux_env.Command(['output1', 'output2'], 'data/data_files', 'compiler_compiler.exe data_files output1 output2')
My problem is number 2: how do I make the script runner depend on the outputs of another target?
And just to clarify, I need SCons to run (and only if compiler_output changes):
src_dir/script/legacy_script obj_dir/data/compiler_output obj_dir/some_dir/script_output
(the script usage is: legacy_script input_file output_file)
I hope I made myself clear, feel free to ask some more questions...
I've had a similar problem recently when I needed to compile Cheetah Templates first, which were then used from another Builder to generate HTML files from different sources.
If you define the build output of the first builder as source for the second builder, SCons will run them in the correct order and only if intermediate files have changed.
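A minimal sketch of that idea applied to this case; the target and path names are taken from the question and may need adjusting:
# The compiler step: Command() returns a list of target Nodes.
compiled = linux_env.Command(['output1', 'output2'],
                             'data/data_files',
                             'compiler_compiler.exe $SOURCE $TARGETS')

# The legacy_script step: using the compiler's first output as the source makes
# SCons order the two steps correctly and re-run this one only when it changes.
linux_env.Command('script_output',
                  compiled[0],
                  'src_dir/scripts/legacy_script $SOURCE $TARGET')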
Wolfgang