Documenting CMake scripts

I find myself in a situation where I would like to accurately document a host of custom CMake macros and functions and was wondering how to do it.
The first thing that comes to mind is simply using the built-in comment syntax and documenting the scripts inline, like so:
# -----------------------------
# [FUNCTION_NAME | MACRO_NAME]
# -----------------------------
# ... description ...
# -----------------------------
This is fine. However, I'd like to employ common doc generators, for instance doxygen, to also generate external documentation that can be read by anyone without looking at the implementation (which is a common scenario).
One way would be to write a simple parser that generates a corresponding C/C++ header with the appropriate signatures and documentation directly from the CMake script, which could then be processed by doxygen or comparable tools. One could also maintain such a header by hand, which is obviously tedious and error prone.
Is there any other way to employ a documentation generator with CMake scripts?
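The header-generation idea from the question could be sketched roughly as follows. This is only a minimal illustration, assuming that each function or macro is directly preceded by a block of # comments; the names cmake_to_header and cmake_arg are hypothetical placeholders, not part of CMake or doxygen.

```python
import re

# Matches "function(name arg1 arg2)" or "macro(name ...)", case-insensitively.
SIGNATURE = re.compile(r"^\s*(function|macro)\s*\(\s*(\w+)([^)]*)\)", re.IGNORECASE)


def cmake_to_header(cmake_source: str) -> str:
    """Turn '#' comment blocks that directly precede a function/macro
    into Doxygen-style C declarations (a sketch, not a full CMake parser)."""
    out = []
    pending = []  # comment text collected since the last non-comment line
    for line in cmake_source.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            text = stripped.lstrip("#").strip()
            if text.strip("-"):  # skip empty lines and "-----" separators
                pending.append(text)
            continue
        match = SIGNATURE.match(line)
        if match and pending:
            kind = match.group(1).lower()
            name = match.group(2)
            args = match.group(3).split()
            # "cmake_arg" is a made-up placeholder type so the output is valid C.
            params = ", ".join(f"cmake_arg {arg}" for arg in args)
            out.append("/**")
            out.extend(f" * {text}" for text in pending)
            out.append(f" * (CMake {kind})")
            out.append(" */")
            out.append(f"void {name}({params});")
            out.append("")
        pending = []  # any other line breaks the link between comment and signature
    return "\n".join(out)
```

The resulting text could be written to a .h file and listed as doxygen input. Arguments handled through ARGN or keyword parsing would not show up in the signature, so they would still need to be described in the comment itself.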

Here is the closest I could get. The following was tested with CMake 2.8.10. CMake 3.0 is currently under development and will get a new documentation system based on Sphinx and reStructuredText. I guess that this will bring new ways to document your modules.
CMake 2.8 can extract documentation from your modules, but only documentation at the beginning of the file is considered. All documentation is added as CMake comments, beginning with a single #. Lines beginning with a double ## will be ignored (so you can add comments to your documentation). The end of the documentation is marked by the first non-comment line (e.g. an empty line).
The first line gives a brief description of the module. It must start with - and end with a period . or a blank line.
# - My first documented CMake module.
# description
or
# - My first documented CMake module
#
# description
In HTML, lines starting with two or more spaces (after the #) are formatted with a monospace font.
Example:
# - My custom macros to do foo
#
# This module provides the macro foo().
# These macros serve to demonstrate the documentation capabilities of CMake.
#
# FOO( [FILENAME <file>]
# [APPEND]
# [VAR <variable_name>]
# )
#
# The FOO() macro can be used to do foo or bar. If FILENAME is given,
# it even writes baz.
MACRO( FOO )
...
ENDMACRO()
To generate documentation for your custom modules only, call
cmake -DCMAKE_MODULE_PATH:STRING=. --help-custom-modules test.html
Setting CMAKE_MODULE_PATH allows you to define additional directories to search for modules. Otherwise, your modules need to be in the default CMake location. --help-custom-modules limits the documentation generation to custom, non-CMake-standard modules. If you give a filename, the documentation is written to that file, otherwise to stdout. If the filename has a recognized extension, the documentation is formatted accordingly.
The following formats are possible:
.html for HTML documentation
.1 to .9 for man page
.docbook for Docbook
anything else: plain text
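The extraction rules described above (leading # block only, ## lines skipped, first non-comment line ends the documentation, first line starting with -) can be illustrated with a small sketch. This is not how CMake implements it; extract_module_doc is a hypothetical helper, and it simplifies by treating the whole first line as the brief description.

```python
def extract_module_doc(cmake_source: str):
    """Return (brief, body_lines) from the leading comment block of a module,
    following the CMake 2.8 conventions described in this answer."""
    doc = []
    for line in cmake_source.splitlines():
        if not line.startswith("#"):
            break  # the first non-comment line ends the documentation
        if line.startswith("##"):
            continue  # "##" lines are comments inside the documentation
        doc.append(line[1:])  # drop the leading "#", keep the indentation
    if not doc or not doc[0].lstrip().startswith("-"):
        return None, []  # no "# - brief" line, so nothing to extract
    brief = doc[0].lstrip()[1:].strip()
    # Remove the single separator space after "#"; extra spaces remain and
    # mark the lines that would be rendered in monospace.
    body = [text[1:] if text.startswith(" ") else text for text in doc[1:]]
    return brief, body
```

Body lines that still start with two or more spaces after this step correspond to the lines the HTML output would show in a monospace font.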

Related

Standalone CMake script to cut off file contents by delimiters

I have a project where one repeatable task to do involves manipulating files' contents.
Until now I used a Python script for it, but recently I discovered I can use standalone CMake scripts ("standalone" here means they can be invoked outside of the configure/build/test/etc. workflow). As my project already uses CMake for project management, I concluded I can save others the trouble of installing a Python interpreter (welcome, Windows users!) and use CMake project-wide.
Part of my script needs to read a file and cut off everything that appears before "[START-HERE]" and after "[END-HERE]" lines. I am stuck with that part and don't know how to implement it. How can it be done?

You could combine file(READ) with if(MATCHES) to accomplish this. The former is used to read the file, the latter allows you to check for the occurrence of a regular expression and to extract a capturing group:
foo.cmake
#[===[
Params:
INPUT_FILE : the path to the file to read
#]===]
file(READ "${INPUT_FILE}" FILE_CONTENTS)
if (FILE_CONTENTS MATCHES "(^|[\r\n])\\[START-HERE\\][\r\n]+(.*)[\r\n]+\\[END-HERE\\]")
# todo: use extracted match stored in CMAKE_MATCH_2 for your own logic
message("Content: '${CMAKE_MATCH_2}'")
else()
message(FATAL_ERROR "[START-HERE]...[END-HERE] doesn't occur in the input file '${INPUT_FILE}'")
endif()
foo.txt
Definetly not
[START-HERE]
working
[END-HERE]
Try again!
Output:
> cmake -D INPUT_FILE=foo.txt -P foo.cmake
Content: 'working'

For the part where you are stuck, here's one approach using the string, file, and math commands:
file(READ data.txt file_str)
string(FIND "${file_str}" "[START-HERE]" start_offset)
# message("${start_offset}")
math(EXPR start_offset "${start_offset}+12")
# message("${start_offset}")
string(FIND "${file_str}" "[END-HERE]" end_offset)
math(EXPR substr_len "${end_offset}-${start_offset}")
# message("${substr_len}")
string(SUBSTRING "${file_str}" "${start_offset}" "${substr_len}" trimmed_str)
# message("${trimmed_str}")
You could also probably do it by using the file(STRINGS) command, which reads lines of a file into an array, and then use the list(FIND) command. The approach shown above has the advantage of working if your delimiters are not on their own lines.
As @fabian shows in their answer post, you can also do this using a regular expression with if(MATCHES) like this:
file(READ "${INPUT_FILE}" FILE_CONTENTS)
if (FILE_CONTENTS MATCHES "(^|[\r\n])\\[START-HERE\\][\r\n]+(.*)[\r\n]+\\[END-HERE\\]")
# todo: use extracted match stored in CMAKE_MATCH_2 for your own logic
message("Content: '${CMAKE_MATCH_2}'")
else()
message(FATAL_ERROR "[START-HERE]...[END-HERE] doesn't occur in the input file '${INPUT_FILE}'")
endif()
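For comparison, the line-based alternative mentioned above (reading lines into a list and searching for the delimiters) has the following shape. The sketch is in Python, the language of the script being replaced, and may serve as a reference when checking the output of the CMake version; cut_between is a hypothetical helper name.

```python
def cut_between(lines, start="[START-HERE]", end="[END-HERE]"):
    """Return the lines strictly between the first start marker and the
    next end marker. Both markers must be on lines of their own."""
    begin = lines.index(start) + 1  # raises ValueError if the marker is missing
    finish = lines.index(end, begin)  # only search after the start marker
    return lines[begin:finish]


text = "Definitely not\n[START-HERE]\nworking\n[END-HERE]\nTry again!\n"
print(cut_between(text.splitlines()))  # prints ['working']
```

As noted in the answer, this only works when each delimiter occupies a whole line, which is the trade-off against the offset-based approach.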

How to extract parts of complex configuration with CONFIG generator expression in CMake

In our project, we have a large number of configurations stemming from a large variety of target hardware types multiplied by few modes.
To avoid unneeded details let's just assume that the configurations have the form <hw>_<mode> where
<hw> is one of: A, B or C,
<mode> is one of: 1, 2 or 3.
Furthermore, to remain close to actual case let's assume that A_3 and C_1 are unsupported exceptions. (However, I don't think it matters here.)
Which leaves us with 3 x 3 - 2 = 7 supported configurations.
Now, we would like to make settings (amongst others also the path to compiler and sysroot) depend on the configuration. Also, some sources should be included only in some configurations. And we would prefer to do it based on parts of the configuration.
For example, we would like to use /this/g++ for all A_* configurations and /that/g++ for all other. Or we would like to add mode2.cpp file for all *_2 configurations but not others.
It is a simple task if we use CMAKE_BUILD_TYPE. We can split it with a regex (string(REGEX MATCH)) and have variables with each part. Then a simple if does the job.
However, such approach is not friendly with multi-config generators (it seems currently those are only Visual Studio and Xcode). To play nicely with multi-config generators, AFAIK, we would have to use generator expressions.
The problem is, however, that I see no way to extract parts of the configuration (CONFIG) in generator expressions.
For example, I can do this:
add_executable(my_prog
source_1.cpp
# ...
source_n.cpp
$<$<CONFIG:A_2>:mode2.cpp>
$<$<CONFIG:B_2>:mode2.cpp>
$<$<CONFIG:C_2>:mode2.cpp>
)
but this doesn't look like a maintainable approach considering that sooner or later we will be adding new hardware types (or removing obsolete ones).
Is there any way to do some form of matching in generator expression?
The only workaround I found out so far is to use an approach like this:
set(CONFIG_IS_MODE_2 $<OR:$<CONFIG:A_2>,$<CONFIG:B_2>,$<CONFIG:C_2>>)
add_executable(my_target
source_1.cpp
# ...
source_n.cpp
$<${CONFIG_IS_MODE_2}:mode2.cpp>
)
which at least allows centralizing those expressions, so that when a new hardware type is added there is a single place to update. However, there are still many variables to update.
Is there any better solution?

With target_sources() command and a function() you could still use a regex to match your configurations.
This would look something like in this example code:
cmake_minimum_required(VERSION 3.0)
project(TestConfigRegEx)
function(my_add_sources_by_config_regex _target _regex)
foreach(_config IN LISTS CMAKE_CONFIGURATION_TYPES CMAKE_BUILD_TYPE)
if (_config MATCHES "${_regex}")
target_sources(${_target} PRIVATE $<$<CONFIG:${_config}>:${ARGN}>)
endif()
endforeach()
endfunction()
file(WRITE main.cpp "int main() { return 0; }")
file(WRITE modeRelease.cpp "")
add_executable(my_target main.cpp)
my_add_sources_by_config_regex(my_target Release modeRelease.cpp)
But that gives me an error from CMake version 3.11.1 Visual Studio 15 2017 generator side:
Target "my_target" has source files which vary by configuration. This is
not supported by the "Visual Studio 15 2017" generator.
Config "Debug":
.../main.cpp
Config "Release":
.../main.cpp
.../modeRelease.cpp
Strangely enough, it still generates the solution.
Alternatives
The classic one would be adding a define containing the configuration and handle the differences in the C/C++ code with #if checks
You differentiate not per configuration but with additional targets (like my_target and my_target_2)
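To make the selection logic of the regex-based function easier to follow, here is a small model of it outside CMake. It only builds the generator-expression strings and does not involve CMake at all; sources_by_config_regex is a hypothetical name, and Python's regex dialect differs from CMake's for anything beyond simple patterns like the one used here.

```python
import re


def sources_by_config_regex(configs, regex, sources):
    """Return one $<$<CONFIG:cfg>:src;...> expression for every
    configuration whose name matches the (unanchored) regex."""
    joined = ";".join(sources)
    return [f"$<$<CONFIG:{cfg}>:{joined}>" for cfg in configs if re.search(regex, cfg)]


# The configuration matrix from the question: 3 x 3 minus two exceptions.
configs = [f"{hw}_{mode}" for hw in "ABC" for mode in "123"]
configs = [cfg for cfg in configs if cfg not in ("A_3", "C_1")]

print(len(configs))
print(sources_by_config_regex(configs, r"_2$", ["mode2.cpp"]))
```

With the seven supported configurations, the pattern _2$ selects A_2, B_2 and C_2, which are the three expressions written out by hand in the question.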

In CMake how do I deal with generated source files whose number and names are not known beforehand?

Imagine a code generator which reads an input file (say a UML class diagram) and produces an arbitrary number of source files which I want to be handled in my project. (to draw a simple picture let's assume the code generator just produces .cpp files).
The problem is now that the number of files generated depends on the input file and thus is not known when writing the CMakeLists.txt file or even in CMake's configure step. E.g.:
>>> code-gen uml.xml
generate class1.cpp..
generate class2.cpp..
generate class3.cpp..
What's the recommended way to handle generated files in such a case? You could use FILE(GLOB.. ) to collect the file names after running code-gen the first time but this is discouraged because CMake would not know any files on the first run and later it would not recognize when the number of files changes.
I could think of some approaches but I don't know if CMake covers them, e.g.:
(somehow) define a dependency from an input file (uml.xml in my example) to a variable (list with generated file names)
in case the code generator can be convinced to tell which files it generates the output of code-gen could be used to create a list of input file names. (would lead to similar problems but at least I would not have to use GLOB which might collect old files)
just define a custom target which runs the code generator and handles the output files without CMake (don't like this option)
Update: This question targets a similar problem but just asks how to glob generated files which does not address how to re-configure when the input file changes.

Together with Tsyvarev's answer and some more googling I came up with the following CMakeLists.txt which does what I want:
cmake_minimum_required(VERSION 3.6)
project(generated)
set(IN_FILE "${CMAKE_SOURCE_DIR}/input.txt")
set_property(DIRECTORY APPEND PROPERTY CMAKE_CONFIGURE_DEPENDS "${IN_FILE}")
execute_process(
COMMAND python3 "${CMAKE_SOURCE_DIR}/code-gen" "${IN_FILE}"
WORKING_DIRECTORY ${PROJECT_BINARY_DIR}
INPUT_FILE "${IN_FILE}"
OUTPUT_VARIABLE GENERATED_FILES
OUTPUT_STRIP_TRAILING_WHITESPACE
)
add_executable(generated main.cpp ${GENERATED_FILES})
It turns an input file (input.txt) into output files using code-gen and compiles them.
execute_process is being executed in the configure step and the set_property() command makes sure CMake is being re-run when the input file changes.
Note: in this example the code-generator must print a CMake-friendly list on stdout which is nice if you can modify the code generator. FILE(GLOB..) would do the trick too but this would for sure lead to problems (e.g. old generated files being compiled, too, colleagues complaining about your code etc.)
PS: I don't like to answer my own questions - If you come up with a nicer or cleaner solution in the next couple of days I'll take yours!
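For illustration, a code generator that cooperates with the execute_process() call above could look roughly like this. The input format is an assumption made up for the sketch (one class name per whitespace-separated token), and generate is a hypothetical function name; the only part taken from the answer is that the script prints a ;-separated list on stdout, which CMake reads as a list.

```python
import sys
from pathlib import Path


def generate(input_text: str, out_dir: Path) -> str:
    """Write one .cpp file per name found in the input and return the
    generated paths as a ';'-separated, CMake-friendly list."""
    paths = []
    for name in input_text.split():
        path = out_dir / f"{name}.cpp"
        path.write_text(f"// generated for {name}\n")
        paths.append(path.as_posix())  # forward slashes on every platform
    return ";".join(paths)


if __name__ == "__main__" and len(sys.argv) > 1:
    # The input file path arrives as the first argument; output files go to
    # the working directory, which the CMake call sets to the build directory.
    sys.stdout.write(generate(Path(sys.argv[1]).read_text(), Path.cwd()))
```

Nothing other than the list should be printed on stdout, since the whole output ends up in GENERATED_FILES; diagnostics would have to go to stderr.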

CMake variable expansion using "@" vs. "${}"

Consider the following:
SET(TEST_DIR, "test")
INSTALL(PROGRAMS scripts/foo.py DESTINATION ${TEST_DIR})
INSTALL(PROGRAMS scripts/foo.py DESTINATION @TEST_DIR@)
The first INSTALL command does not work. The second does. Why is that? What is the difference between those two? I have not found any reference to @@ expansion except in the context of creation of configuration files. Everything else only uses ${} expansion.
UPDATE: OK, obvious bug in the above. My SET() command has an extraneous comma. Removing it, such that it looks like:
SET(TEST_DIR "test")
results in both @@ and ${} expansions working. Still wondering (a) what is the meaning of @@ as opposed to ${}, and (b) why only the former worked with my incorrect SET() statement.

According to the documentation for the configure_file() command, when configuring a file both the ${VAR} form and the @VAR@ form will be replaced with VAR's value. Based on your experience above and some testing I did, both forms are also replaced when CMake evaluates your CMakeLists.txt. Since this is not documented, I would recommend against using the @VAR@ form in your CMakeLists.txt.
Note that when using configure_file() you can restrict replacement to only the @VAR@ form by using the @ONLY argument.

As far as I know, the @VAR@ syntax is only used when replacing variables with the configure_file command.
Note that the configure_file command allows for an extra option @ONLY. Using it you can specify that only the @VAR@'s are replaced, but that the ${VAR}'s are kept.
As an example, this can be useful when generating e.g. a CMake file which is later to be used with CMake again. E.g. when building your project, the @VAR@ entries will be replaced when using configure_file. After you have distributed your project and someone else uses the generated UseProject.cmake file, the ${VAR} entries will be replaced.
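The effect of @ONLY on the two reference forms in configure_file() can be modelled in a few lines. This is a simplified sketch of the substitution behaviour, not CMake's implementation; configure and the sample template are made up for illustration, and undefined variables are replaced with an empty string as the configure_file() documentation describes.

```python
import re


def configure(text, variables, at_only=False):
    """Replace @VAR@ references always, and ${VAR} references only when
    at_only is False (a rough model of configure_file with/without @ONLY)."""
    def lookup(match):
        return variables.get(match.group(1), "")  # undefined -> empty string

    text = re.sub(r"@(\w+)@", lookup, text)
    if not at_only:
        text = re.sub(r"\$\{(\w+)\}", lookup, text)
    return text


template = 'set(PROJECT_DIR "@PROJECT_DIR@")\ninclude(${PROJECT_DIR}/helpers.cmake)\n'
print(configure(template, {"PROJECT_DIR": "/opt/proj"}, at_only=True))
```

With at_only=True the ${PROJECT_DIR} reference survives into the generated file, so it is expanded later, when that file is itself processed by CMake.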

How to document Visual Basic with Doxygen

I am trying to use some Doxygen filter for Visual Basic in Windows.
I started with Vsevolod Kukol filter, based on gawk.
There are not so many directions.
So I started using his own commented VB code VB6Module.bas and, by means of his vbfilter.awk, I issued:
gawk -f vbfilter.awk VB6Module.bas
This outputs C-like code on stdout. Therefore I redirected it to a file with:
gawk -f vbfilter.awk VB6Module.bas>awkout.txt
I created this Doxygen test.cfg file:
PROJECT_NAME = "Test"
OUTPUT_DIRECTORY = test
GENERATE_LATEX = NO
GENERATE_MAN = NO
GENERATE_RTF = NO
CASE_SENSE_NAMES = NO
INPUT = awkout.txt
QUIET = NO
JAVADOC_AUTOBRIEF = NO
SEARCHENGINE = NO
To produce the documentation I issued:
doxygen test.cfg
Doxygen complains that the "name 'VB6Module.bas' supplied as the second argument in the \file statement is not an input file." I removed the comment @file VB6Module.bas from awkout.txt. The warning stopped, but in both cases the documentation produced was just a single page with the project name.
I tried also the alternative filter by Basti Grembowietz in Python vbfilter.py. Again without documentation, again producing errors and without any useful output.

After trials and errors I solved the problem.
I was unable to convert a .bas file into a format that I could pass to Doxygen as input.
Anyway, following the suggestions of user @doxygen, I was able to create a Doxygen config file such that it can interpret the .bas file comments properly.
Given the file VB6Module.bas (by the Doxygen-VB-Filter author, Vsevolod Kukol), commented with Doxygen style adapted for Visual Basic, I wrote the Doxygen config file, test.cfg, as follows:
PROJECT_NAME = "Test"
OUTPUT_DIRECTORY = test
GENERATE_LATEX = NO
GENERATE_MAN = NO
GENERATE_RTF = NO
CASE_SENSE_NAMES = NO
INPUT = readme.md VB6Module.bas
QUIET = YES
JAVADOC_AUTOBRIEF = NO
SEARCHENGINE = NO
FILTER_PATTERNS = "*.bas=vbfilter.bat"
where:
readme.md is any Markdown file that can be used as the main documentation page.
vbfilter.bat contains:
@echo off
gawk.exe -f vbfilter.awk "%1%"
vbfilter.awk by the filter author is assumed to be in the same folder as the input files to be documented and obviously gawk should be in the path.
Running:
doxygen test.cfg
everything runs smoothly, apart from two apparently innocuous warnings:
gawk: vbfilter.awk:528: warning: escape sequence `\[' treated as plain `['
gawk: vbfilter.awk:528: warning: escape sequence `\]' treated as plain `]'
Now test\html\index.html contains the proper documentation as extracted from the .bas and the Markdown files.

Alright I did some work:
You can download this .zip file. It contains:
MakeDoxy.bas The macro that makes it all happen
makedoxy.cmd A shell script that will be executed by MakeDoxy
configuration Folder that contains doxygen and gawk binaries which are needed to create the doxygen documentation as well as some additional filtering files which were already used by the OP.
source Folder that contains example source code for doxygen
How To Use:
Note: I tested it with Excel 2010
Extract VBADoxy.zip somewhere (referenced as <root> from now on)
Import MakeDoxy.bas into your VBA project. You can also import the files from source or use your own doxygen-documented VBA code files but you'll need at least one documented file in the same VBA project.
Add "Microsoft Visual Basic for Applications Extensibility 5.3" or higher to your VBA Project References (did not test it with lower versions). It's needed for the export-part (VBProject, VBComponent).
Run macro MakeDoxy
What is going to happen:
You will be asked for the <root> folder.
You will be asked if you want to delete <root>\source afterwards. It is okay to delete those files. They will not be removed from your VBA Project.
MakeDoxy will export all .bas, .cls and .frm files to the location <root>\source\<modulename>\<modulename>(.bas|.cls|.frm)
cmd.exe will be commanded to run makedoxy.cmd and to delete <root>\source if you have chosen to do so, which altogether will result in your desired documentation.
A logfile MakeDoxy.bas.log will be re-created each time MakeDoxy is executed.
You can play with configuration\vbdoxy.cfg a little if you want to change doxygen's behavior.
There is still some room for improvements but I guess this is something one can work with.
