CMake syntax: how to negate if(<constant>) and if(<variable|string>) - cmake

CMake's if command [1] supports several signatures, starting with
if(<constant>)
if(<variable|string>)
if(NOT <expression>)
How to negate the first two?
If the CMake documentation is correct (which in my experience is far from certain), then my question boils down to:
How to convert a constant, a variable, or a string X into an expression, with the additional requirement that X is to be evaluated as a boolean?
[1] https://cmake.org/cmake/help/latest/command/if.html

Actually, <expression> is just a placeholder for anything that can be passed to if(). The documentation even introduces its list of if() signatures with "Possible expressions are". So the first two forms are negated by putting NOT in front of them:
if(NOT <constant>) # negates 'if(<constant>)'
if(NOT <variable|string>) # negates 'if(<variable|string>)'
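As a minimal sketch of both negations, the script below can be run with `cmake -P`. The file name and the variable names FEATURE_ENABLED and UNDEFINED_NAME are made up for illustration.

```cmake
# negate.cmake (hypothetical file name); run with: cmake -P negate.cmake
cmake_minimum_required(VERSION 3.15)

set(FEATURE_ENABLED OFF)   # hypothetical variable holding a false constant

if(NOT YES)                # negates if(<constant>)
  message("not reached")
else()
  message("NOT YES is false")
endif()

if(NOT FEATURE_ENABLED)    # negates if(<variable>)
  message("FEATURE_ENABLED holds a false value")
endif()

if(NOT UNDEFINED_NAME)     # a name that is not a defined variable counts as false
  message("UNDEFINED_NAME is not defined")
endif()
```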

Related

In what order does CMake evaluate OR and AND in compound 'if' condition?

The CMake documentation states:
The following syntax applies to the condition argument of the if,
elseif and while() clauses.
Compound conditions are evaluated in the following order of
precedence: Innermost parentheses are evaluated first. Next come unary
tests such as EXISTS, COMMAND, and DEFINED. Then binary tests such as
EQUAL, LESS, LESS_EQUAL, GREATER, GREATER_EQUAL, STREQUAL, STRLESS,
STRLESS_EQUAL, STRGREATER, STRGREATER_EQUAL, VERSION_EQUAL,
VERSION_LESS, VERSION_LESS_EQUAL, VERSION_GREATER,
VERSION_GREATER_EQUAL, and MATCHES. Then the boolean operators in the
order NOT, AND, and finally OR.
But the following prints 'FALSE':
cmake_minimum_required(VERSION 3.22)
project(Test)
if(YES OR NO AND NO)
  message("TRUE")
else()
  message("FALSE")
endif()
I'd expect the expression to evaluate as YES OR (NO AND NO). What's going on?
This is unfortunately not a bug in the implementation, but in the documentation. CMake is (mis)designed to evaluate AND and OR at the same precedence, and from left to right.
See the MR that will update the documentation here: https://gitlab.kitware.com/cmake/cmake/-/merge_requests/6970
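Given that AND and OR share one precedence level and associate left to right, explicit parentheses are the straightforward way to get the grouping the question expected. A small sketch, runnable with `cmake -P`:

```cmake
cmake_minimum_required(VERSION 3.22)

# Without parentheses this is grouped as (YES OR NO) AND NO.
if(YES OR NO AND NO)
  message("TRUE")
else()
  message("FALSE")   # this branch runs
endif()

# Parentheses force the intended grouping.
if(YES OR (NO AND NO))
  message("TRUE")    # this branch runs
else()
  message("FALSE")
endif()
```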

CMake - How does the if() command treat a symbol? As string or as variable?

I am not sure whether the CMake if() command treats a symbol in the condition clause as a variable or as a string literal. So I did some experiments.
Script1.cmake
cmake_minimum_required(VERSION 3.15)
set(XXX "YYY") #<========== HERE!!
if(XXX STREQUAL "XXX")
  message("condition 1 is true") # If we reach here, XXX is treated as a string
elseif(XXX STREQUAL "YYY")
  message("condition 2 is true") # If we reach here, XXX is treated as a variable
endif()
The output is:
condition 2 is true
So I came to conclusion 1 below.
For a symbol in the condition clause:
If the symbol is defined as a variable before, CMake will treat it as variable and use its value for evaluation.
If the symbol is not defined as a variable before, CMake will treat it literally as a string.
Then I did another experiment.
set(ON "OFF")
if(ON)
  message("condition 3 is true") # If we reach here, ON is treated as a constant.
else()
  message("condition 4 is true") # If we reach here, ON is treated as a variable.
endif()
The output is:
condition 3 is true
So, although ON is explicitly defined as a variable, the if command still treats it as a constant with a TRUE value. This directly contradicts my previous conclusion 1.
So how can I know for sure whether the CMake if() command will treat a symbol as a string or as a variable?
ADD 1 - 11:04 AM 7/11/2019
It seems the if(<constant>) form takes precedence over the other forms of the if() command. From the documentation:
if(<constant>)
True if the constant is 1, ON, YES, TRUE, Y, or a non-zero number.
False if the constant is 0, OFF, NO, FALSE, N, IGNORE, NOTFOUND, the
empty string, or ends in the suffix -NOTFOUND. Named boolean constants
are case-insensitive. If the argument is not one of these specific
constants, it is treated as a variable or string and the following
signature is used.
So for now, I have to refer to the above rule first before applying my conclusion 1.
(This may be an answer, but I am not sure enough yet.)
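To make the quoted rule concrete, here is a small sketch that feeds a handful of the listed constants to if(). The loop variable name and the sample value Foo-NOTFOUND are illustrative; the first six values should take the true branch and the rest the false branch.

```cmake
cmake_minimum_required(VERSION 3.15)

# Named constants are case-insensitive; none of these names is set as a variable.
foreach(value ON yes True Y 1 42 OFF no N 0 IGNORE NOTFOUND Foo-NOTFOUND)
  if(${value})   # expands to if(ON), if(yes), ... before evaluation
    message("${value} -> true")
  else()
    message("${value} -> false")
  endif()
endforeach()
```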
Welcome to the wilderness of CMake symbol interpretation.
If the symbol exists as a variable, then the expression is evaluated with the value of the variable. Otherwise, the name of the variable (or literal, as you said) is evaluated instead.
The behavior becomes a little more consistent if you wrap the name in ${ and }. The reference is then replaced by the variable's value before if() sees its arguments. If the variable doesn't exist, the reference expands to the empty string, which is one of the false constants you quoted in the latter part of your post. Note that if() may still treat an unquoted result as a variable name and look it up again.
I believe this is done this way for backwards compatibility, which CMake is really good about. For most of the quirky things CMake does, it's usually in the name of backwards compatibility.
As for the inconsistent behavior you mentioned in the "ON" variable, this is probably due to the precedence in which CMake processes the command arguments. I would have to figure that the constants are parsed before the symbol lookup occurs.
So when it comes to knowing/predicting how an if statement will evaluate, my best answer is experience. The CMake source tree and logic is one magnificent, nasty beast.
There have been discussions about adding an alternative language (perhaps one with a functional paradigm), but it's quite a large undertaking.
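One way to make the lookup rules predictable is to rely on policy CMP0054, which is not mentioned in the thread above: with cmake_minimum_required(VERSION 3.1) or later it is set to NEW, and if() then only interprets unquoted arguments as variable names. A sketch reusing the XXX variable from the question, with an extra illustrative variable YYY:

```cmake
cmake_minimum_required(VERSION 3.15)   # sets CMP0054 to NEW

set(XXX "YYY")
set(YYY "ZZZ")   # extra variable to expose accidental double lookups

if(XXX STREQUAL "YYY")       # unquoted XXX -> value "YYY"; quoted "YYY" stays a string
  message("unquoted name: true")
endif()

if("XXX" STREQUAL "YYY")     # quoted "XXX" is the literal string XXX
  message("not reached")
else()
  message("quoted name: false")
endif()

if(${XXX} STREQUAL "YYY")    # expands to if(YYY STREQUAL "YYY"); YYY -> "ZZZ"
  message("not reached")
else()
  message("unquoted reference: false")
endif()

if("${XXX}" STREQUAL "YYY")  # quoted reference: compares the value, no second lookup
  message("quoted reference: true")
endif()
```

Quoting both sides is the form that never triggers a second lookup under this policy.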

Numeric only variable name in CMake

What was the reason to allow numeric only variable names in CMake?
It makes the following code frustrating (the if condition becomes true):
set(1 3)
set(2 3)
if(1 EQUAL 2)
  message("hi there")
endif()
And an even more likely usage (the if condition also becomes true):
set(1 2)
... # later on, or even in another file:
set(var1 1)
if(${var1} EQUAL 2)
  message("hi there")
endif()
PS: I understand why variable references without ${} are allowed inside IF/WHILE. But the possibility of numeric-only variable names makes IFs more error-prone...
Answer from Brad King at CMake issue tracker:
For reference, variable names are arbitrary strings, e.g.
set(var "almost anything here")
set("${var}" value)
message(STATUS "${${var}}")
Allowing numeric-only names is a side effect of that.
Certainly they can be used in confusing ways. Disallowing them, even
if only for if() evaluation, would require a policy.
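A possible defensive pattern, assuming policy CMP0054 is NEW (CMake 3.1 or later; the policy is not part of the quoted answer), is to quote both operands so that if() compares the values and never looks up a variable named 1 or 2:

```cmake
cmake_minimum_required(VERSION 3.15)   # sets CMP0054 to NEW

set(1 2)
set(var1 1)

if(${var1} EQUAL 2)        # expands to if(1 EQUAL 2); variable 1 holds 2
  message("unquoted: true")
endif()

if("${var1}" EQUAL "2")    # quoted operands are compared as the numbers 1 and 2
  message("not reached")
else()
  message("quoted: false")
endif()
```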

CMake function to convert string to C string literal

Is there a built-in function to convert a string to a C string literal? For example:
set(foo [[Hello\ World"!\]])
convert_to_cstring_literal(bar "${foo}")
message("${bar}") # Should print (including quotes): "Hello\\ World\"!\\"
I mean I can do this with considerable effort with regexes, but if there's a built-in function it would be a lot nicer.
So, I actually gave up on this and used a different trick: C++ raw string literals. It's not 100% guaranteed of course (an input containing the closing delimiter sequence would break it), so don't use it on untrusted input (not sure why you would have any in CMake though). But it should be fine for most purposes.
set(foo "R\"#?#:#?#(${foo})#?#:#?#\"")
Turning my comment into an answer
Slightly modifying the CMake's function _cpack_escape_for_cmake from CPack.cmake I was able to successfully test the following:
cmake_minimum_required(VERSION 3.0) # bracket arguments [[...]] require CMake 3.0 or later
project(CStringLiteral)
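function(convert_to_cstring_literal var value)
  # Prefix every backslash and double quote with a backslash.
  # ($ is left alone: \$ is not a standard C escape sequence.)
  string(REGEX REPLACE "([\\\"])" "\\\\\\1" escaped "${value}")
  # Wrap the escaped text in double quotes and return it to the caller.
  set("${var}" "\"${escaped}\"" PARENT_SCOPE)
endfunction()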
set(foo [[Hello\ World"!\]])
convert_to_cstring_literal(bar "${foo}")
message("${bar}") # prints "Hello\\ World\"!\\"
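For comparison, here is a sketch of the same idea using plain string(REPLACE) calls instead of a regex, which also lets newlines be turned into \n. This is an alternative technique, not the answer's method; the function name to_c_literal and the variable names are hypothetical, and other control characters are not handled.

```cmake
cmake_minimum_required(VERSION 3.15)

# Hypothetical helper: escapes backslashes, double quotes and newlines.
function(to_c_literal out_var value)
  string(REPLACE "\\" "\\\\" escaped "${value}")    # \ -> \\ (must run first)
  string(REPLACE "\"" "\\\"" escaped "${escaped}")  # " -> \"
  string(REPLACE "\n" "\\n" escaped "${escaped}")   # newline -> \n
  set("${out_var}" "\"${escaped}\"" PARENT_SCOPE)
endfunction()

set(foo [[Hello\ World"!\]])
to_c_literal(bar "${foo}")
message("${bar}") # prints "Hello\\ World\"!\\"
```

The backslash replacement has to come first; otherwise the backslashes added for the quotes would themselves be doubled.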

When should I wrap variables with ${...} in CMake?

I wonder why often variables in CMake are wrapped with a dollar sign and curly brackets. For example, I saw this call in a CMake tutorial.
include_directories(${PROJECT_BINARY_DIR})
But from what I tried, this does the same thing.
include_directories(PROJECT_BINARY_DIR)
When is the wrapping with ${...} needed and what does it mean? Why are variables often wrapped with this even if it makes no difference?
Quoting the CMake documentation:
A variable reference has the form ${variable_name} and is evaluated
inside a Quoted Argument or an Unquoted Argument. A variable reference
is replaced by the value of the variable, or by the empty string if
the variable is not set.
In other words, writing PROJECT_BINARY_DIR refers, literally, to the string "PROJECT_BINARY_DIR". Encapsulating it in ${...} gives you the contents of the variable with the name PROJECT_BINARY_DIR.
Consider:
set(FOO "Hello there!")
message(FOO) # prints FOO
message(${FOO}) # prints Hello there!
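Two further consequences of the quoted rule, sketched with made-up variable names: a reference to an unset variable expands to the empty string, and references can be nested, with the inner one expanded first.

```cmake
cmake_minimum_required(VERSION 3.15)

unset(MISSING)                 # hypothetical names throughout
set(TARGET_VAR "GREETING")
set(GREETING "Hello there!")

message("[${MISSING}]")        # prints []
message("${${TARGET_VAR}}")    # inner reference expands to GREETING; prints Hello there!
message("TARGET_VAR")          # no ${}: prints TARGET_VAR
```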
As you have probably guessed already, include_directories(PROJECT_BINARY_DIR) simply attempts to add a subdirectory literally named PROJECT_BINARY_DIR to the include directories. Most compilers silently ignore an include directory that does not exist, which might have given you the impression that it works as expected.
A popular source of confusion comes from the fact that if() does not require explicit dereferencing of variables:
set(FOO TRUE)
if(FOO)
  message("Foo was set!")
endif()
Again the documentation explains this behavior:
if(<constant>)
True if the constant is 1, ON, YES, TRUE, Y, or a non-zero number. False if the constant is 0, OFF, NO, FALSE, N, IGNORE, NOTFOUND, the
empty string, or ends in the suffix -NOTFOUND. Named boolean constants
are case-insensitive. If the argument is not one of these constants,
it is treated as a variable.
if(<variable>)
True if the variable is defined to a value that is not a false constant. False otherwise. (Note macro arguments are not variables.)
In particular, one can come up with weird examples like:
unset(BLA)
set(FOO "BLA")
if(FOO)
  message("if(<variable>): True")
else()
  message("if(<variable>): False")
endif()
if(${FOO})
  message("if(<constant>): True")
else()
  message("if(<constant>): False")
endif()
This takes the TRUE branch in the if(FOO) case and the FALSE branch in the if(${FOO}) case. In the second case, ${FOO} is expanded to BLA before if() evaluates its arguments, so CMake goes looking for a variable named BLA to perform the check on. That variable is not defined, hence we end up in the FALSE branch.
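The double lookup is easier to see when BLA is actually defined. In this sketch (same variable names as above, with BLA given explicit values) the result of if(${FOO}) follows the value of BLA, while if(FOO) only depends on FOO itself:

```cmake
cmake_minimum_required(VERSION 3.15)

set(FOO "BLA")

set(BLA OFF)
if(FOO)        # FOO holds "BLA", which is not a false constant
  message("if(FOO): true")
endif()
if(${FOO})     # expands to if(BLA); BLA holds OFF
  message("not reached")
else()
  message("if(\${FOO}) with BLA=OFF: false")
endif()

set(BLA ON)
if(${FOO})     # expands to if(BLA); BLA now holds ON
  message("if(\${FOO}) with BLA=ON: true")
endif()
```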
It's per-case and poorly defined. You just have to look it up.
There are other places besides if() where you don't have to use ${} to get the contents of a variable. Yikes.