CMake: match beginning of string

I am getting some compile definitions from an external library. Unfortunately, they provide a list that sometimes starts with a leading semi-colon. For example:
;-Dfoo;-Dbar
I think this is crashing the build command later in the process. I thought that I could simply remove potential leading semi-colons with this regex:
string(REGEX REPLACE "^;" "" stripped_defs ${defs})
but the problem is that CMake seems to be ignoring the caret ^, which signifies the start of the string, so all semi-colons are deleted. That is, I am getting the output
-Dfoo-Dbar
when I want
-Dfoo;-Dbar

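As Sergei points out, the problem is that my defs variable was being interpreted as a list, not a string. Left unquoted, ${defs} is split at the semi-colons into separate arguments (empty elements are dropped), and string(REGEX REPLACE) concatenates all of its input arguments before matching, so the semi-colons were already gone by the time the regex ran. All I need to do to force the string interpretation is to add quotes. Specifically, instead of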
string(REGEX REPLACE "^;" "" stripped_defs ${defs})
I should have had
string(REGEX REPLACE "^;" "" stripped_defs "${defs}")
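To see the two calls side by side, here is a minimal sketch that can be run as a standalone script with cmake -P. The value of defs is hard-coded to mirror the example in the question, and the result variable names are made up for illustration.

```cmake
# Hypothetical input mirroring the list from the question.
set(defs ";-Dfoo;-Dbar")

# Unquoted: ${defs} expands to the two arguments -Dfoo and -Dbar,
# which string() concatenates before matching, so no ";" is left.
string(REGEX REPLACE "^;" "" unquoted_result ${defs})
message(STATUS "unquoted: ${unquoted_result}")   # -- unquoted: -Dfoo-Dbar

# Quoted: the whole value is passed as one argument, so "^;" only
# strips the leading semi-colon.
string(REGEX REPLACE "^;" "" quoted_result "${defs}")
message(STATUS "quoted:   ${quoted_result}")     # -- quoted:   -Dfoo;-Dbar
```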

Rather than using a regular expression here, my preferred approach would be to delete the empty elements with list operations:
set(stripped_defs ${defs})
list(REMOVE_ITEM stripped_defs "")
This may involve one more command, but it's easier to understand what the snippet does.

Related

CMake - How does the if() command treat a symbol? As string or as variable?

I am not sure whether the CMake if() command will treat a symbol in the condition clause as a variable or as a string literal, so I did some experiments.
Script1.cmake
cmake_minimum_required(VERSION 3.15)
set(XXX "YYY") #<========== HERE!!
if(XXX STREQUAL "XXX")
message("condition 1 is true") # If reach here, XXX is treated as string
elseif(XXX STREQUAL "YYY")
message("condition 2 is true") # If reach here, XXX is treated as variable
endif()
The output is:
condition 2 is true
So I came to conclusion 1 below.
For a symbol in the condition clause:
If the symbol is defined as a variable before, CMake will treat it as variable and use its value for evaluation.
If the symbol is not defined as a variable before, CMake will treat it literally as a string.
Then I did another experiment.
set(ON "OFF")
if(ON)
message("condition 3 is true") # If reach here, ON is treated as a constant.
else()
message("condition 4 is true") # If reach here. ON is treated as a variable.
endif()
The output is:
condition 3 is true
So, although ON is explicitly defined as a variable, the if command still treats it as a constant with a TRUE value. This directly contradicts my conclusion 1.
So how can I know for sure whether the CMake if() command will treat a symbol as a string or as a variable?
ADD 1 - 11:04 AM 7/11/2019
It seems the if(<constant>) form takes precedence over the other forms of the if() statement. From the CMake documentation:
if(<constant>)
True if the constant is 1, ON, YES, TRUE, Y, or a non-zero number.
False if the constant is 0, OFF, NO, FALSE, N, IGNORE, NOTFOUND, the
empty string, or ends in the suffix -NOTFOUND. Named boolean constants
are case-insensitive. If the argument is not one of these specific
constants, it is treated as a variable or string and the following
signature is used.
So for now, I have to refer to the above rule first before applying my conclusion 1.
(This may be an answer, but I am not sure enough yet.)
Welcome to the wilderness of CMake symbol interpretation.
If the symbol exists as a variable, then the expression is evaluated with the value of the variable. Otherwise, the name of the variable (or literal, as you said) is evaluated instead.
The behavior becomes a little more predictable if you add the ${ and } sequences: the variable is then expanded to its value before if() sees the argument. If the variable doesn't exist or has not been assigned a value, it expands to an empty string, which if() evaluates as false, like the other false constants you quoted in the latter part of your post.
I believe this is done this way for backwards compatibility, which CMake is really good about. For most of the quirky things CMake does, it's usually in the name of backwards compatibility.
As for the inconsistent behavior you mentioned in the "ON" variable, this is probably due to the precedence in which CMake processes the command arguments. I would have to figure that the constants are parsed before the symbol lookup occurs.
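The cases discussed so far can be put into one small script, runnable with cmake -P. This is only a sketch of the behavior as I understand it. It relies on policy CMP0054 being NEW, which cmake_minimum_required(VERSION 3.15) enables, and the variable values are the ones from the question.

```cmake
cmake_minimum_required(VERSION 3.15)

set(XXX "YYY")
set(ON "OFF")

# 1. Boolean constants are recognized first, so the variable named ON
#    is never looked up.
if(ON)
  message(STATUS "if(ON): constant, true")
endif()

# 2. An unquoted name that is a defined variable is dereferenced.
if(XXX STREQUAL "YYY")
  message(STATUS "unquoted XXX: dereferenced to YYY")
endif()

# 3. With CMP0054 NEW, a quoted argument is never dereferenced.
if("XXX" STREQUAL "YYY")
  message(STATUS "quoted XXX: dereferenced")
else()
  message(STATUS "quoted XXX: stayed a literal string")
endif()

# 4. ${ON} is expanded to OFF before if() runs, so this is if(NOT OFF).
if(NOT ${ON})
  message(STATUS "\${ON}: expanded to OFF, false")
endif()
```

Run as written, each of the four message() calls in the taken branches should print, the third one from its else() branch.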
So when it comes to knowing/predicting how an if statement will evaluate, my best answer is experience. The CMake source tree and logic are one magnificent, nasty beast.
There have been discussions about adding an alternative language (perhaps one with a functional paradigm), but it's quite a large undertaking.

Add space separated string to cmake `include_directories`

I have a space-separated string that represents include directories I'd like to add. Let's call it ${MYSTRING}, and let's say it contains the string my/dir1 my/dir2 my/dir3.
Using:
include_directories(${MYSTRING})
Results in an incorrect makefile, as the CXX_FLAGS that is added is:
-Imy/dir1 my/dir2 my/dir3
Rather than:
-Imy/dir1 -Imy/dir2 -Imy/dir3
Is there any way I can work around this? The string is generated via an external command, and I'd rather not have to depend on external tools such as sed.
Use separate_arguments which takes a space-separated string of values and turns it into a list:
set(MY_LIST ${MYSTRING})
separate_arguments(MY_LIST)
include_directories(${MY_LIST})
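As a variation on the snippet above, separate_arguments also has a form that reads from a string and writes to a different variable, which avoids the intermediate set(). The sketch below is runnable with cmake -P. The value of MYSTRING is hard-coded here as a stand-in for the output of the external command, and the foreach loop only prints what would be passed to include_directories.

```cmake
# Hypothetical input, standing in for the output of the external command.
set(MYSTRING "my/dir1 my/dir2 my/dir3")

# Split on whitespace using UNIX shell rules; MYSTRING itself is untouched.
separate_arguments(MY_LIST UNIX_COMMAND "${MYSTRING}")

list(LENGTH MY_LIST count)
message(STATUS "${count} directories")   # -- 3 directories

foreach(dir IN LISTS MY_LIST)
  message(STATUS "would add: ${dir}")
endforeach()
```

In a real CMakeLists.txt, the foreach loop would simply be replaced by include_directories(${MY_LIST}).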

What's the difference between parenthesis $() and curly bracket ${} syntax in Makefile?

Is there any differences in invoking variables with syntax ${var} and $(var)? For instance, in the way the variable will be expanded or anything?
There's no difference – they mean exactly the same (in GNU Make and in POSIX make).
I think that $(round brackets) look tidier, but that's just personal preference.
(Other answers point to the relevant sections of the GNU Make documentation, and note that you shouldn't mix the syntaxes within a single expression)
The Basics of Variable References section of the GNU make documentation states no difference:
To substitute a variable's value, write a dollar sign followed by the
name of the variable in parentheses or braces: either $(foo) or
${foo} is a valid reference to the variable foo.
As already correctly pointed out, there is no difference, but be wary not to mix the two kinds of delimiters, as it can lead to cryptic errors like the one in the GNU make example by unomadh.
From the GNU make manual on the Function Call Syntax (emphasis mine):
[…] If the arguments themselves contain other function calls or variable references, it is wisest to use the same kind of delimiters for all the references; write $(subst a,b,$(x)), not $(subst a,b,${x}). This is because it is clearer, and because only one type of delimiter is matched to find the end of the reference.
The ${} style lets you test the make rules in the shell, if you have the corresponding environment variables set, since that is compatible with bash.
Actually, it seems to be fairly different:
, = ,
list = a,b,c
$(info $(subst $(,),-,$(list))_EOL)
$(info $(subst ${,},-,$(list))_EOL)
outputs
a-b-c_EOL
md/init-profile.md:4: *** unterminated variable reference. Stop.
But so far I have only found this difference when the variable name inside ${...} itself contains a comma. I first thought ${...} was expanding the comma not as part of the value, but it turns out I'm not able to hack it this way. I still don't understand this. If anyone has an explanation, I'd be happy to hear it!
It makes a difference if the expression contains unbalanced brackets:
${info ${subst ),(,:-)}}
$(info $(subst ),(,:-)))
->
:-(
*** insufficient number of arguments (1) to function 'subst'. Stop.
For plain variable references the two forms are equivalent; the choice matters only for function calls, or for variable names that themselves contain brackets (a bad idea).

How do I exclude a single file from a cmake `file(GLOB ... )` pattern?

My CMakeLists.txt contains this line:
file(GLOB lib_srcs Half/half.cpp Iex/*.cpp IlmThread/*.cpp Imath/*.cpp IlmImf/*.cpp)
and the IlmImf folder contains b44ExpLogTable.cpp, which I need to exclude from the build.
How to achieve that?
You can use the list function to manipulate the list, for example:
list(REMOVE_ITEM <list> <value> [<value> ...])
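In your case, something like this should work. Note that file(GLOB) returns absolute paths unless RELATIVE is given, so the item to remove has to be spelled as an absolute path too:
list(REMOVE_ITEM lib_srcs "${CMAKE_CURRENT_SOURCE_DIR}/IlmImf/b44ExpLogTable.cpp")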
FILTER is another option which could be more convenient in some cases:
list(FILTER <list> <INCLUDE|EXCLUDE> REGEX <regular_expression>)
This line excludes every item ending with the required filename:
list(FILTER lib_srcs EXCLUDE REGEX ".*b44ExpLogTable\\.cpp$")
Here is the regex specification for CMake:
The following characters have special meaning in regular expressions:
^      Matches at the beginning of input
$      Matches at the end of input
.      Matches any single character
[ ]    Matches any character(s) inside the brackets
[^ ]   Matches any character(s) not inside the brackets
-      Inside brackets, specifies an inclusive range between
       characters on either side e.g. [a-f] is [abcdef]
       To match a literal - using brackets, make it the first
       or the last character e.g. [+*/-] matches basic
       mathematical operators.
*      Matches preceding pattern zero or more times
+      Matches preceding pattern one or more times
?      Matches preceding pattern zero or once only
|      Matches a pattern on either side of the |
()     Saves a matched subexpression, which can be referenced
       in the REGEX REPLACE operation. Additionally it is saved
       by all regular expression-related commands, including
       e.g. if( MATCHES ), in the variables CMAKE_MATCH_(0..9).
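The two list operations can be compared in a small script, runnable with cmake -P. It is only a sketch: the hard-coded list stands in for the result of file(GLOB ...), the /proj prefix and the ImfHeader.cpp entry are made up for illustration, and the absolute paths reflect the fact that file(GLOB) returns absolute paths unless RELATIVE is given.

```cmake
cmake_minimum_required(VERSION 3.15)

# Hypothetical stand-in for what file(GLOB lib_srcs ...) would return.
set(lib_srcs
  "/proj/Half/half.cpp"
  "/proj/IlmImf/b44ExpLogTable.cpp"
  "/proj/IlmImf/ImfHeader.cpp")

# REMOVE_ITEM needs the element spelled exactly as it appears in the list.
set(by_item ${lib_srcs})
list(REMOVE_ITEM by_item "/proj/IlmImf/b44ExpLogTable.cpp")

# FILTER matches a regex, so the directory prefix does not matter.
set(by_regex ${lib_srcs})
list(FILTER by_regex EXCLUDE REGEX "/b44ExpLogTable\\.cpp$")

# Both lists now hold half.cpp and ImfHeader.cpp only.
message(STATUS "REMOVE_ITEM: ${by_item}")
message(STATUS "FILTER:      ${by_regex}")
```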
Try this in CMakeLists.txt:
install(DIRECTORY ${CMAKE_SOURCE_DIR}/
DESTINATION ${CMAKE_INSTALL_PREFIX}
COMPONENT copy-files
PATTERN ".git*" EXCLUDE
PATTERN "*.in" EXCLUDE
PATTERN "*/build" EXCLUDE)
add_custom_target(copy-files
COMMAND ${CMAKE_COMMAND} -D COMPONENT=copy-files
-P cmake_install.cmake)
$ cmake <src_path> -DCMAKE_INSTALL_PREFIX=<install_path>
$ cmake --build . --target copy-files
I have an alternative solution worth noting: mark the source as a header file.
This way it will not be part of the build process, but it will still be visible in the IDE (verified on Visual Studio and Xcode):
set_source_files_properties(IlmImf/b44ExpLogTable.cpp
PROPERTIES HEADER_FILE_ONLY TRUE)
I use this when a source file is platform-specific. It is great because, when a symbol has to be modified in many places while working on one platform, the sources specific to other platforms remain visible in the IDE and can be updated too.
For that I've created a helper function which works great in my current project.
I haven't used this method with file(GLOB) yet.

Using CMake's include_directories command with white spaces

I am using CMake to build my project and I have the following line:
include_directories(${LLVM_INCLUDE_DIRS})
which, after evaluating LLVM_INCLUDE_DIRS, evaluates to:
include_directories(C:\Program Files\LLVM\include)
The problem is that this is being considered two include directories, "C:\Program" and "Files\LLVM\include".
Any idea how I can solve this problem? I tried using quotation marks, but it didn't work.
EDIT: It turned out that the problem was in the file llvm-3.0\share\llvm\cmake\LLVMConfig.cmake. The following lines set the paths without quotation marks; once I enclosed the paths in quotation marks, the problem was solved:
set(LLVM_INSTALL_PREFIX C:/Program Files/LLVM)
set(LLVM_INCLUDE_DIRS ${LLVM_INSTALL_PREFIX}/include)
set(LLVM_LIBRARY_DIRS ${LLVM_INSTALL_PREFIX}/lib)
In CMake,
unquoted whitespace separates arguments (much like ; separates list items),
evaluating variable names basically replaces the variable name with its content and
\ is an escape character (to get the symbol, it needs to be escaped as well)
So, in your example, include_directories(C:\\Program Files\\LLVM\\include) is the same as
include_directories( C:\\Program;Files\\LLVM\\include)
that is, a list with two items. To avoid this, either
escape the whitespace as well:
include_directories( C:\\Program\ Files\\LLVM\\include) or
surround the path with quotation marks:
include_directories( "C:\\Program Files\\LLVM\\include")
Obviously, the second option is the better choice as it is
simpler and easier to read and
can be used with variable evaluation like in your example (since the result of the evaluation is then surrounded by quotation marks and thus treated as a single item)
include_directories("${LLVM_INCLUDE_DIRS}")
This works as well, if LLVM_INCLUDE_DIRS is a list of multiple directories because the items in this list will then be explicitly separated by ; so that there is no need for unquoted whitespace as implicit list item separator.
Side note:
When using hard-coded path names (for whatever reason) in my CMake files, I usually use forward slashes as directory separators, as this works on Windows as well and avoids the need to escape all backslashes.
This is more likely to be an error at the point where LLVM_INCLUDE_DIRS is set rather than a problem with include_directories.
To check this, try calling include_directories("C:\\Program Files\\LLVM\\include") - it should work correctly.
The problem seems to be that LLVM_INCLUDE_DIRS was constructed without using quotation marks. Try for example running this:
set(LLVM_INCLUDE_DIRS C:\\Program Files\\LLVM\\include)
message("${LLVM_INCLUDE_DIRS}")
set(LLVM_INCLUDE_DIRS "C:\\Program Files\\LLVM\\include")
message("${LLVM_INCLUDE_DIRS}")
The output is:
C:\Program;Files\LLVM\include
C:\Program Files\LLVM\include
Note the semi-colon in the first output line. This is a list with 2 items.
So the way to fix this is to modify the way in which LLVM_INCLUDE_DIRS is created.
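To check where the split actually happens, the following sketch counts the arguments a command receives. It is runnable with cmake -P. The count_args function and the variable names are hypothetical helpers introduced for illustration, and forward slashes are used to avoid escaping.

```cmake
cmake_minimum_required(VERSION 3.15)

# Hypothetical helper: reports how many arguments follow the label.
function(count_args label)
  list(LENGTH ARGN n)
  message(STATUS "${label}: ${n} argument(s)")
endfunction()

# Unquoted in set(): the space splits the path into a two-item list.
set(unquoted_set C:/Program Files/LLVM/include)
# Quoted in set(): the path is stored as a single string.
set(quoted_set "C:/Program Files/LLVM/include")

count_args("unquoted set, unquoted use" ${unquoted_set})  # 2 argument(s)
count_args("quoted set, unquoted use" ${quoted_set})      # 1 argument(s)
count_args("quoted set, quoted use" "${quoted_set}")      # 1 argument(s)
```

The second call suggests that a space inside a variable's value does not split the argument when the variable is expanded; only semi-colons do. That fits the diagnosis above that the damage is done where the variable is set, not where it is used.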