How to use empty entries in CMake lists as arguments

When using a CMake list to specify multiple arguments to a function, empty list entries are not passed on as arguments to the function. In some cases an empty string is needed as an argument. Is there a way to achieve this?
If I run this
set(CMAKE_EXECUTE_PROCESS_COMMAND_ECHO STDOUT)
set(LIST_VAR
"ONE"
"TWO"
""
"FOUR"
)
execute_process(
COMMAND
${CMAKE_COMMAND} -E echo
${LIST_VAR}
)
execute_process(
COMMAND
${CMAKE_COMMAND} -E echo
"ONE"
"TWO"
""
"FOUR"
)
with
cmake -P test.cmake
I get:
'cmake' '-E' 'echo' 'ONE' 'TWO' 'FOUR'
ONE TWO FOUR
'cmake' '-E' 'echo' 'ONE' 'TWO' '' 'FOUR'
ONE TWO FOUR
In the first variant the third, empty argument is swallowed, which is really annoying in cases where an empty argument may be possible or needed and the arguments are prepared as CMake lists.
In my application I call a script rather than cmake -E echo, and that script expects empty arguments in certain situations.
Also, the empty entries are - of course - not put in literally as in this simplified example. Instead of "" I have something like "${MIGHT_BE_EMPTY}".
Is there a way to safely transport empty strings as list entries to function arguments?
If not, is there a good work-around for this problem?
E.g. transform unset variables to something like a space (" ") which might be equivalent to an empty argument for the called script?

The problem is that there are no lists in CMake, only strings. There is a convention that some CMake commands understand: if a variable contains unquoted ; (semicolon) characters, it is a list whose elements are separated from each other by ;. And there is a rule that creates such a list out of multiple strings separated by whitespace.
So this command:
set(LIST_VAR
"ONE"
"TWO"
""
"FOUR"
)
Creates a variable with the following content: ONE;TWO;;FOUR which is a list in the CMake book. Now there is another rule (reverse) which CMake uses when it expands unquoted variables:
Each non-empty element is given to the command invocation as an
argument.
So you can't use empty elements and have CMake lists propagate them across the script. You can use something non-empty like an empty element, though. For example, you could use the space character to mean the element is empty if it wouldn't mess with your other data:
set(EMPTY " ")
set(LIST_VAR
"ONE"
"TWO"
"${EMPTY}"
"FOUR"
)
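And if you can't modify the list by explicitly providing an ${EMPTY} element, you can add it to the existing list like this: string(REPLACE ";;" ";${EMPTY};" LIST_VAR "${LIST_VAR}"). Note that a single pass only handles isolated empty elements in the middle of the list; two consecutive empty elements (;;;) or an empty element at the very start or end are not fully replaced.
Finally, if you need to modify not a list but some particular, potentially empty variable, then you can use the following (quoting the expansion also covers the case where the variable is not set at all):
if("${MIGHT_BE_EMPTY}" STREQUAL "")
set(MIGHT_BE_EMPTY "${EMPTY}")
endif()
I use the EMPTY variable because it is convenient; you can drop it and use whatever character you like instead. It won't make a difference.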
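To make the rule quoted above a bit more concrete, here is a small Python sketch that models how an unquoted variable reference is turned into arguments. It is only a model, not CMake itself: the function names expand_unquoted and protect_empty are made up for this illustration, and escaped semicolons (\;) are ignored for simplicity.

```python
def expand_unquoted(value):
    # Model of CMake expanding an unquoted ${VAR}:
    # split at ';' and pass on only the non-empty elements.
    return [element for element in value.split(";") if element != ""]


def protect_empty(value, placeholder=" "):
    # Replace every empty element with a placeholder so it survives expansion.
    return ";".join(element if element != "" else placeholder
                    for element in value.split(";"))


# The string that set(LIST_VAR "ONE" "TWO" "" "FOUR") stores.
list_var = "ONE;TWO;;FOUR"

args_plain = expand_unquoted(list_var)
args_protected = expand_unquoted(protect_empty(list_var))

print(args_plain)      # ['ONE', 'TWO', 'FOUR']
print(args_protected)  # ['ONE', 'TWO', ' ', 'FOUR']
```

Because the placeholder is applied per element, this version also covers consecutive, leading and trailing empty elements, which a single replacement of ";;" does not.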

Related

Cmake match beginning of string

I am getting some compile definitions from an external library. Unfortunately, they provide a list that sometimes starts with a leading semi-colon. For example:
;-Dfoo;-Dbar
I think this is crashing the build command later in the process. I thought that I could simply remove potential leading semi-colons with this regex:
string(REGEX REPLACE "^;" "" stripped_defs ${defs})
but the problem is that CMake seems to be ignoring the caret ^, which signifies the start of the string, with the consequence that all semicolons are deleted. That is, I am getting the output
-Dfoo-Dbar
when I want
-Dfoo;-Dbar
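As Sergei points out, the problem is that my defs variable was being interpreted as a list, not a string. Unquoted, ${defs} is split at the semicolons into separate arguments (the empty first element is dropped), and string(REGEX REPLACE) concatenates all of its input arguments before matching, so the semicolons were already gone before the regex ran. All I need to do to force the string interpretation is to add quotes. Specifically, instead of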
string(REGEX REPLACE "^;" "" stripped_defs ${defs})
I should have had
string(REGEX REPLACE "^;" "" stripped_defs "${defs}")
Rather than using a regular expression, my preferred approach in this case would be to use list operations to delete empty elements:
set(stripped_defs ${defs})
list(REMOVE_ITEM stripped_defs "")
This may involve one more command, but it's easier to understand what the snippet does.
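The difference between the unquoted and the quoted call can be modelled in a few lines of Python. This is a sketch under simplifying assumptions, not CMake's implementation: expand_unquoted and regex_replace are hypothetical helpers, and Python's re module stands in for CMake's regex engine.

```python
import re


def expand_unquoted(value):
    # Unquoted ${defs}: split at ';', empty elements are dropped.
    return [element for element in value.split(";") if element]


def regex_replace(pattern, replacement, *inputs):
    # Model of string(REGEX REPLACE ...): all input arguments are
    # concatenated before the regular expression is applied.
    return re.sub(pattern, replacement, "".join(inputs))


defs = ";-Dfoo;-Dbar"

unquoted_result = regex_replace(r"^;", "", *expand_unquoted(defs))
quoted_result = regex_replace(r"^;", "", defs)

print(unquoted_result)  # -Dfoo-Dbar
print(quoted_result)    # -Dfoo;-Dbar
```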

How do I convert a CMake semicolon-separated list to newline-separated?

E.g:
set (txt "Hello" "There" "World")
# TODO
message ("${txt}") # Should print "Hello\nThere\nWorld" (i.e. each list item on a new line)
What do I put in place of TODO?
CMake's lists are semicolon-delimited. So "Hello" "There" "World" is internally represented as Hello;There;World. So a simple solution is to replace semicolons with newlines:
string (REPLACE ";" "\n" txt "${txt}")
This works in this example; however, let's try a more complicated example:
set (txt "" [[\;One]] "Two" [[Thre\;eee]] [[Four\\;rrr]])
The [[ ]] is a raw string so the \'s are passed through into CMake's internal representation of the list unchanged. The internal representation is: ;\;One;Two;Thre\;eee;Four\\;rrr. We'd expect it to print:
<blank line>
;One
Two
Thre;eee
Four\\;rrr
I'm not actually 100% sure about the Four\\;rrr one but I think it is right. Anyway with our naive implementation we actually get this:
<blank line>
\
One
Two
Thre\
eee
Four\\
rrr
It's because it doesn't know to not convert actual semicolons that are escaped. The solution is to use a regex:
string (REGEX REPLACE "([^\\\\]);" "\\1\n" txt "${txt}")
I.e. only replace ; if it is preceded by a non-\ character (and put that character, captured by the parentheses, back in the replacement). This almost works, but it doesn't handle the first empty element because the semicolon isn't preceded by anything. The final answer is to allow the start of string too:
string (REGEX REPLACE "(^|[^\\\\]);" "\\1\n" txt "${txt}")
Oh and the \\\\ is because one level of escaping is removed by CMake processing the string literal, and another by the regex engine. You could also do this:
string (REGEX REPLACE [[(^|[^\\]);]] "\\1\n" txt "${txt}")
But I don't think that is clearer.
Maybe there is a simpler method than this but I couldn't find it. Anyway, that, Ladies and Gentlemen, is why you should never use strings as your only type, or do in-band string delimiting. Still, could have been worse - at least they didn't use spaces as a separator like Bash!
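For comparison, the same substitution can be tried out in Python. This is just a sketch: the helper name semicolons_to_newlines is made up, and Python's re module is not CMake's regex engine, so treat it as an illustration of the pattern rather than a guarantee about CMake's behavior.

```python
import re


def semicolons_to_newlines(cmake_list):
    # Replace ';' with a newline unless it is preceded by a backslash.
    # Group 1 keeps the character before the ';' (or nothing at the start).
    return re.sub(r"(^|[^\\]);", r"\1\n", cmake_list)


# Internal representation of the more complicated example above.
txt = r";\;One;Two;Thre\;eee;Four\\;rrr"

lines = semicolons_to_newlines(txt).split("\n")
for line in lines:
    print(repr(line))
```

Note that the escaped semicolons are left in the output exactly as they were (still with their backslash); the substitution only decides where the line breaks go.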
I just wanted to add some alternatives that simply rely on the fact that message() adds a newline at the end by itself:
Just using foreach() to iterate over the list:
set (txt "Hello" "There" "World")
foreach(line IN LISTS txt)
message("${line}")
endforeach()
A function()-based alternative I came up with looks more complicated:
function(message_cr line)
message("${line}")
if (ARGN)
message_cr(${ARGN})
endif()
endfunction()
set(txt "Hello" "There" "World")
message_cr(${txt})
The more generalized versions of those approaches would look like:
foreach() with strings
set(txt "Hello" "There" "World")
foreach(line IN LISTS txt)
string(APPEND multiline "${line}\n")
endforeach()
message("${multiline}")
function() with strings
function(stringify_cr var line)
if (ARGN)
stringify_cr(${var} ${ARGN})
endif()
set(${var} "${line}\n${${var}}" PARENT_SCOPE)
endfunction()
set(txt "Hello" "There" "World")
stringify_cr(multiline ${txt})
message("${multiline}")
If you don't like the additional newline at the end add string(STRIP "${multiline}" multiline).

How do I exclude a single file from a cmake `file(GLOB ... )` pattern?

My CMakeLists.txt contains this line:
file(GLOB lib_srcs Half/half.cpp Iex/*.cpp IlmThread/*.cpp Imath/*.cpp IlmImf/*.cpp)
and the IlmImf folder contains b44ExpLogTable.cpp, which I need to exclude from the build.
How to achieve that?
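You can use the list command to manipulate the list, for example:
list(REMOVE_ITEM <list> <value> [<value> ...])
In your case, maybe something like this will work. Note that file(GLOB) returns absolute paths unless RELATIVE is given, so the value to remove has to be an absolute path too:
list(REMOVE_ITEM lib_srcs "${CMAKE_CURRENT_SOURCE_DIR}/IlmImf/b44ExpLogTable.cpp")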
FILTER is another option which could be more convenient in some cases:
list(FILTER <list> <INCLUDE|EXCLUDE> REGEX <regular_expression>)
This line excludes every item ending with the required filename:
list(FILTER lib_srcs EXCLUDE REGEX ".*b44ExpLogTable\\.cpp$")
Here is the regex specification for CMake:
The following characters have special meaning in regular expressions:
^       Matches at the beginning of input
$       Matches at the end of input
.       Matches any single character
[ ]     Matches any character(s) inside the brackets
[^ ]    Matches any character(s) not inside the brackets
-       Inside brackets, specifies an inclusive range between
        characters on either side e.g. [a-f] is [abcdef]
        To match a literal - using brackets, make it the first
        or the last character e.g. [+*/-] matches basic
        mathematical operators.
*       Matches preceding pattern zero or more times
+       Matches preceding pattern one or more times
?       Matches preceding pattern zero or once only
|       Matches a pattern on either side of the |
()      Saves a matched subexpression, which can be referenced
        in the REGEX REPLACE operation. Additionally it is saved
        by all regular expression-related commands, including
        e.g. if( MATCHES ), in the variables CMAKE_MATCH_(0..9).
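The two list operations discussed above can be compared with a small Python model. It is a sketch, not CMake: the helper names are invented, the paths are hypothetical examples of what file(GLOB) might return, and Python's re module stands in for CMake's regex engine.

```python
import re


def list_remove_item(items, value):
    # Model of list(REMOVE_ITEM ...): removes only exact string matches.
    return [item for item in items if item != value]


def list_filter_exclude(items, regex):
    # Model of list(FILTER ... EXCLUDE REGEX ...): drops items the regex matches.
    pattern = re.compile(regex)
    return [item for item in items if not pattern.search(item)]


# Hypothetical absolute paths, as file(GLOB) typically produces them.
lib_srcs = [
    "/src/IlmImf/ImfHeader.cpp",
    "/src/IlmImf/b44ExpLogTable.cpp",
    "/src/Imath/ImathVec.cpp",
]

by_relative_name = list_remove_item(lib_srcs, "IlmImf/b44ExpLogTable.cpp")
by_regex = list_filter_exclude(lib_srcs, r".*b44ExpLogTable\.cpp$")

print(len(by_relative_name))  # 3, nothing removed: the relative name is not an exact match
print(by_regex)               # the b44ExpLogTable.cpp entry is gone
```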
Try this in CMakeLists.txt:
install(DIRECTORY ${CMAKE_SOURCE_DIR}/
DESTINATION ${CMAKE_INSTALL_PREFIX}
COMPONENT copy-files
PATTERN ".git*" EXCLUDE
PATTERN "*.in" EXCLUDE
PATTERN "*/build" EXCLUDE)
add_custom_target(copy-files
COMMAND ${CMAKE_COMMAND} -D COMPONENT=copy-files
-P cmake_install.cmake)
$ cmake <src_path> -DCMAKE_INSTALL_PREFIX=<install_path>
$ cmake --build . --target copy-files
I have an alternative solution worth noting: mark the source as a header file.
This way it will not be part of the build process, but will still be visible in the IDE (verified in Visual Studio and Xcode):
set_source_files_properties(b44ExpLogTable.cpp
PROPERTIES HEADER_FILE_ONLY TRUE)
I use this when some source file is platform-specific. It is great because, if some symbol has to be modified in many places while working on one platform, the other platforms' source files remain visible and can be updated too.
For that I've created a helper function which works great in my current project.
I haven't used this method with file(GLOB) yet.

CMake: difference between ${} and "${}"

What is the difference, in cmake, between something like:
set(any_new_var ${old_var})
and
set(any_new_var "${old_var}")
Is there any important difference? When do I have to use one or the other form?
For example, I try with the next mini test
# test.cmake
# Variable 'a' isn't defined.
set(hola "${a}")
# message(${hola})
message("${hola}")
The output of this mini-test (cmake -P test.cmake) is an empty line (because 'a' isn't defined). If I uncomment the first message, cmake throws an error:
CMake Error at prueba.cmake:6 (message):
message called with incorrect number of arguments
Why doesn't the second case throw an error, but print an empty line instead?
In CMake strings can be interpreted as lists. The rule is simple: to form the list split the string at semicolons. For example, the string value one;two;three can be thought of as a list of three elements: one, two, and three.
To invoke a command you write the command name and some words between parentheses. However, these words do not correspond to the arguments the command receives in a one-to-one fashion. Each word becomes zero or more arguments, and all the arguments get concatenated together.
Unless a word is quoted, it is treated as a list and is expanded to multiple arguments. A quoted word always becomes a single argument.
For example, assume that X is bound to one;two;three, Y is bound to the empty string, and Z is bound to foo. The following command invocation has three words, but the command receives four arguments:
some_command(${X} ${Y} ${Z})
# The command receives four arguments:
# 1. one
# 2. two
# 3. three
# 4. foo
If we had quoted the words, the command would have received three arguments:
some_command("${X}" "${Y}" "${Z}")
# The command receives three arguments:
# 1. one;two;three
# 2. (the empty list)
# 3. foo
To return to your original question: the message command can receive a varying number of arguments. It takes all its arguments, concatenates them together into one string, and then prints that string. For some unknown reason it does not accept zero arguments, though.
The behavior message has with multiple arguments is not very useful, so you tend to use a single quoted argument with it:
set(SOURCES foo.c foo.h)
message(${SOURCES}) # prints foo.cfoo.h
message("${SOURCES}") # prints foo.c;foo.h
Also, when set receives multiple arguments it builds a string of the arguments separated by semicolons. The variable is then set to that string.
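The word-to-argument rule described in this answer can be sketched in Python. This is a simplified model with invented names (expand_words, message); each word is given as its text after variable substitution plus a flag saying whether it was quoted.

```python
def expand_words(words):
    # words: list of (text, quoted) pairs.
    args = []
    for text, quoted in words:
        if quoted:
            # A quoted word always becomes exactly one argument.
            args.append(text)
        else:
            # An unquoted word is split as a list; empty elements vanish.
            args.extend(element for element in text.split(";") if element)
    return args


def message(*args):
    # Model of message(): refuses zero arguments, otherwise concatenates them.
    if not args:
        raise ValueError("message called with incorrect number of arguments")
    return "".join(args)


X, Y, Z = "one;two;three", "", "foo"

unquoted_args = expand_words([(X, False), (Y, False), (Z, False)])
quoted_args = expand_words([(X, True), (Y, True), (Z, True)])

print(unquoted_args)  # ['one', 'two', 'three', 'foo']
print(quoted_args)    # ['one;two;three', '', 'foo']
print(message(*expand_words([("foo.c;foo.h", False)])))  # foo.cfoo.h
print(message(*expand_words([("foo.c;foo.h", True)])))   # foo.c;foo.h
```

In this model, an unquoted reference to an undefined variable contributes no arguments at all, which is why the unquoted message call in the question ends up with zero arguments.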

Escaping Double Quotes in Batch Script

How would I go about replacing all of the double quotes in my batch file's parameters with escaped double quotes? This is my current batch file, which expands all of its command line parameters inside the string:
@echo off
call bash --verbose -c "g++-linux-4.1 %*"
It then uses that string to make a call to Cygwin's bash, executing a Linux cross-compiler. Unfortunately, I'm getting parameters like these passed in to my batch file:
"launch-linux-g++.bat" -ftemplate-depth-128 -O3 -finline-functions
-Wno-inline -Wall -DNDEBUG -c
-o "C:\Users\Me\Documents\Testing\SparseLib\bin\Win32\LinuxRelease\hello.o"
"c:\Users\Me\Documents\Testing\SparseLib\SparseLib\hello.cpp"
Where the first quote around the first path passed in is prematurely ending the string being passed to GCC, and passing the rest of the parameters directly to bash (which fails spectacularly.)
I imagine if I can concatenate the parameters into a single string then escape the quotes it should work fine, but I'm having difficulty determining how to do this. Does anyone know?
The escape character in batch scripts is ^. But for double-quoted strings, double up the quotes:
"string with an embedded "" character"
eplawless's own answer simply and effectively solves his specific problem: it replaces all " instances in the entire argument list with \", which is how Bash requires double-quotes inside a double-quoted string to be represented.
To generally answer the question of how to escape double-quotes inside a double-quoted string using cmd.exe, the Windows command-line interpreter (whether on the command line - often still mistakenly called the "DOS prompt" - or in a batch file): See bottom for a look at PowerShell.
tl;dr:
The answer depends on which program you're calling:
You must use "" when passing an argument to a(nother) batch file and you may use "" with applications created with Microsoft's C/C++/.NET compilers (which also accept \"), which on Windows includes Python, Node.js, and PowerShell (Core) 7+'s CLI (pwsh) but not Windows PowerShell's (powershell.exe):
Example: foo.bat "We had 3"" of rain."
The following applies to targeting batch files only:
"" is the only way to get the command interpreter (cmd.exe) to treat the whole double-quoted string as a single argument (though that won't matter if you simply pass all arguments through to another program, with %*)
Sadly, however, not only are the enclosing double-quotes retained (as usual), but so are the doubled escaped ones, so obtaining the intended string is a two-step process; e.g., assuming that the double-quoted string is passed as the 1st argument, %1:
set "str=%~1" removes the enclosing double-quotes; set "str=%str:""="%" then converts the doubled double-quotes to single ones.
Be sure to use the enclosing double-quotes around the assignment parts to prevent unwanted interpretation of the values.
\" is required - as the only option - by many other programs (e.g., Ruby, Perl, PHP, as well as programs that use the CommandLineToArgv Windows API function to parse their command-line arguments), but its use from cmd.exe is neither robust nor safe:
\" is what many executables and interpreters either require - including Windows PowerShell - when passed strings from the outside, on the command line - or, in the case of Microsoft's compilers, support as an alternative to "" - ultimately, though, it's up to the target program to parse the argument list.
Example: foo.exe "We had 3\" of rain."
However, use of \" can break calls and at least hypothetically result in unwanted, arbitrary execution of commands and/or input/output redirections:
The following characters present this risk: & | < >
For instance, the following results in unintended execution of the ver command; see further below for an explanation and the next bullet point for a workaround:
foo.exe "3\" of snow" "& ver."
For calling the Windows PowerShell CLI, powershell.exe, \"" and "^"" are robust, but limited alternatives (see section "Calling PowerShell's CLI ..." below).
If you must use \" from cmd.exe, there are only 3 safe approaches, which are, however, quite cumbersome (tip of the hat to T S for his help):
Using (possibly selective) delayed variable expansion in your batch file, you can store literal \" in a variable and reference that variable inside a "..." string using !var! syntax - see T S's helpful answer.
The above approach, despite being cumbersome, has the advantage that you can apply it methodically and that it works robustly, with any input.
Only with LITERAL strings - ones NOT involving VARIABLES - do you get a similarly methodical approach: categorically ^-escape all cmd.exe metacharacters: " & | < > and - if you also want to suppress variable expansion - %:
foo.exe ^"3\^" of snow^" ^"^& ver.^"
Otherwise, you must formulate your string based on recognizing which portions of the string cmd.exe considers unquoted due to misinterpreting \" as closing delimiters:
in literal portions containing shell metacharacters: ^-escape them; using the example above, it is & that must be ^-escaped:
foo.exe "3\" of snow" "^& ver."
in portions with %...%-style variable references: ensure that cmd.exe considers them part of a "..." string and that the variable values do not themselves have embedded, unbalanced quotes - which is not even always possible.
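The two escaping conventions from this summary can be written down as small Python helpers. This is a sketch with invented function names; it only produces the quoted argument text that the target program expects and does nothing to protect against cmd.exe interpreting metacharacters such as & | < >.

```python
def quote_backslash_style(arg):
    # Escape embedded double quotes as \" and wrap the result in double quotes.
    out = ['"']
    backslashes = 0
    for ch in arg:
        if ch == "\\":
            backslashes += 1
        elif ch == '"':
            # Backslashes directly before a quote are doubled, then the quote is escaped.
            out.append("\\" * (2 * backslashes) + '\\"')
            backslashes = 0
        else:
            out.append("\\" * backslashes + ch)
            backslashes = 0
    # Trailing backslashes are doubled so they do not escape the closing quote.
    out.append("\\" * (2 * backslashes) + '"')
    return "".join(out)


def quote_doubled_style(arg):
    # Escape embedded double quotes as "" and wrap the result in double quotes.
    return '"' + arg.replace('"', '""') + '"'


text = 'We had 3" of rain.'
print("foo.exe " + quote_backslash_style(text))  # foo.exe "We had 3\" of rain."
print("foo.bat " + quote_doubled_style(text))    # foo.bat "We had 3"" of rain."
```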
Background
Note: This is based on my own experiments. Do let me know if I'm wrong.
POSIX-like shells such as Bash on Unix-like systems tokenize the argument list (string) before passing arguments individually to the target program: among other expansions, they split the argument list into individual words (word splitting) and remove quoting characters from the resulting words (quote removal). The target program is handed an array of individual, verbatim arguments, i.e. with syntactic quotes removed.
By contrast, the Windows command interpreter apparently does not tokenize the argument list and simply passes the single string comprising all arguments - including quoting chars. - to the target program.
However, some preprocessing takes place before the single string is passed to the target program: ^ escape chars. outside of double-quoted strings are removed (they escape the following char.), and variable references (e.g., %USERNAME%) are interpolated first.
Thus, unlike in Unix, it is the target program's responsibility to parse the arguments string and break it down into individual arguments with quotes removed.
Thus, different programs can require differing escaping methods and there's no single escaping mechanism that is guaranteed to work with all programs - https://stackoverflow.com/a/4094897/45375 contains excellent background on the anarchy that is Windows command-line parsing.
In practice, \" is very common, but NOT SAFE from cmd.exe, as mentioned above:
Since cmd.exe itself doesn't recognize \" as an escaped double-quote, it can misconstrue later tokens on the command line as unquoted and potentially interpret them as commands and/or input/output redirections.
In a nutshell: the problem surfaces, if any of the following characters follow an opening or unbalanced \": & | < >; for example:
foo.exe "3\" of snow" "& ver."
cmd.exe sees the following tokens, resulting from misinterpreting \" as a regular double-quote:
"3\"
of
snow" "
rest: & ver.
Since cmd.exe thinks that & ver. is unquoted, it interprets it as & (the command-sequencing operator), followed by the name of a command to execute (ver. - the . is ignored; ver reports cmd.exe's version information).
The overall effect is:
First, foo.exe is invoked with the first 3 tokens only.
Then, command ver is executed.
Even in cases where the accidental command does no harm, your overall command won't work as designed, given that not all arguments are passed to it.
Many compilers / interpreters recognize ONLY \" - e.g., the GNU C/C++ compiler, Perl, Ruby, PHP, as well as programs that use the CommandLineToArgv Windows API function to parse their command-line arguments - and for them there is no simple solution to this problem.
Essentially, you'd have to know in advance which portions of your command line are misinterpreted as unquoted, and selectively ^-escape all instances of & | < > in those portions.
By contrast, use of "" is SAFE, but is regrettably only supported by Microsoft-compiler-based executables and batch files (in the case of batch files, with the quirks discussed above), which notably excludes PowerShell - see next section.
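The misinterpretation described in this section can be reproduced with a deliberately simplified Python model of how cmd.exe tracks quoting. The function name is invented, and real cmd.exe parsing has many more rules (variable expansion, parentheses, and so on); the sketch only toggles the quote state on every double quote, ignores backslashes, and treats ^ as an escape character outside quotes.

```python
def unquoted_metacharacters(command_line):
    # Return the metacharacters (& | < >) that this model sees as unquoted.
    found = []
    in_quotes = False
    i = 0
    while i < len(command_line):
        ch = command_line[i]
        if ch == "^" and not in_quotes:
            i += 2  # the caret escapes the following character
            continue
        if ch == '"':
            # A preceding backslash means nothing to cmd.exe: the state just flips.
            in_quotes = not in_quotes
        elif ch in "&|<>" and not in_quotes:
            found.append(ch)
        i += 1
    return found


print(unquoted_metacharacters(r'foo.exe "3\" of snow" "& ver."'))         # ['&']
print(unquoted_metacharacters(r'foo.exe "3"" of snow" "& ver."'))         # []
print(unquoted_metacharacters(r'foo.exe "3\" of snow" "^& ver."'))        # []
print(unquoted_metacharacters(r'foo.exe ^"3\^" of snow^" ^"^& ver.^"'))   # []
```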
Calling PowerShell's CLI from cmd.exe or POSIX-like shells:
Note: See the bottom section for how quoting is handled inside PowerShell.
When invoked from the outside - e.g., from cmd.exe, whether from the command line or a batch file:
PowerShell [Core] v6+ now properly recognizes "" (in addition to \"), which is both safe to use and whitespace-preserving.
pwsh -c " ""a &  c"".length " doesn't break and correctly yields 6
Windows PowerShell (the legacy edition whose latest and final version is 5.1) recognizes only \" or """, the latter being the most robust choice from cmd.exe, in the form "^""" (even though internally PowerShell uses ` as the escape character in double-quoted strings and also accepts "" - see bottom section), as discussed next:
Calling Windows PowerShell from cmd.exe / a batch file:
"" breaks, because it is fundamentally unsupported:
powershell -c " ""ab c"".length " -> error "The string is missing the terminator"
\" and """ work in principle, but aren't safe:
powershell -c " \"ab  c\".length " works as intended: it outputs 5 (note the 2 spaces)
But it isn't safe, because cmd.exe metacharacters break the command, unless escaped:
powershell -c " \"a&  c\".length " breaks, due to the &, which would have to be escaped as ^&
\"" is safe, but normalizes interior whitespace, which can be undesired:
powershell -c " \""a&  c\"".length " outputs 4(!), because the 2 spaces are normalized to 1.
"^"" is the best choice for Windows PowerShell specifically (credit goes to Venryx for discovering this approach), and "" for PowerShell (Core) 7+:
Windows PowerShell: powershell -c " "^""a&  c"^"".length " works: doesn't break - despite & - and outputs 5, i.e., correctly preserved whitespace.
PowerShell Core: pwsh -c " ""a&  c"".length "
See this answer for more information.
On Unix-like platforms (Linux, macOS), when calling PowerShell [Core]'s CLI, pwsh, from a POSIX-like shell such as bash:
You must use \", which, however is both safe and whitespace-preserving:
$ pwsh -c " \"a&  c\".length " # OK: 5
# Alternative, with '...' quoting: no escaping of " needed.
$ pwsh -c ' "a&  c".length ' # OK: 5
Related information
^ can only be used as the escape character in unquoted strings - inside double-quoted strings, ^ is not special and treated as a literal.
CAVEAT: Use of ^ in parameters passed to the call statement is broken (this applies to both uses of call: invoking another batch file or binary, and calling a subroutine in the same batch file):
^ instances in double-quoted values are inexplicably doubled, altering the value being passed: e.g., if variable %v% contains literal value a^b, call :foo "%v%" assigns "a^^b"(!) to %1 (the first parameter) in subroutine :foo.
Unquoted use of ^ with call is broken altogether in that ^ can no longer be used to escape special characters: e.g., call foo.cmd a^&b quietly breaks (instead of passing literal a&b to foo.cmd, as would be the case without call) - foo.cmd is never even invoked(!), at least on Windows 7.
Escaping a literal % is a special case, unfortunately, which requires distinct syntax depending on whether a string is specified on the command line vs. inside a batch file; see https://stackoverflow.com/a/31420292/45375
The short of it: Inside a batch file, use %%. On the command line, % cannot be escaped, but if you place a ^ at the start, end, or inside a variable name in an unquoted string (e.g., echo %^foo%), you can prevent variable expansion (interpolation); % instances on the command line that are not part of a variable reference are treated as literals (e.g, 100%).
Generally, to safely work with variable values that may contain spaces and special characters:
Assignment: Enclose both the variable name and the value in a single pair of double-quotes; e.g., set "v=a & b" assigns literal value a & b to variable %v% (by contrast, set v="a & b" would make the double-quotes part of the value). Escape literal % instances as %% (works only in batch files - see above).
Reference: Double-quote variable references to make sure their value is not interpolated; e.g., echo "%v%" does not subject the value of %v% to interpolation and prints "a & b" (but note that the double-quotes are invariably printed too). By contrast, echo %v% passes literal a to echo, interprets & as the command-sequencing operator, and therefore tries to execute a command named b.
Also note the above caveat re use of ^ with the call statement.
External programs typically take care of removing enclosing double-quotes around parameters, but, as noted, in batch files you have to do it yourself (e.g., %~1 to remove enclosing double-quotes from the 1st parameter) and, sadly, there is no direct way that I know of to get echo to print a variable value faithfully without the enclosing double-quotes.
Neil offers a for-based workaround that works as long as the value has no embedded double quotes; e.g.:
set "var=^&')|;,%!"
for /f "delims=" %%v in ("%var%") do echo %%~v
cmd.exe does not recognize single-quotes as string delimiters ('...') - they are treated as literals and cannot generally be used to delimit strings with embedded whitespace; also, it follows that the tokens abutting the single-quotes and any tokens in between are treated as unquoted by cmd.exe and interpreted accordingly.
However, given that target programs ultimately perform their own argument parsing, some programs such as Ruby do recognize single-quoted strings even on Windows; by contrast, C/C++ executables and Perl do not recognize them.
Even if supported by the target program, however, it is not advisable to use single-quoted strings, given that their contents are not protected from potentially unwanted interpretation by cmd.exe.
Quoting from within PowerShell:
Windows PowerShell is a much more advanced shell than cmd.exe, and it has been a part of Windows for many years now (and PowerShell Core brought the PowerShell experience to macOS and Linux as well).
PowerShell works consistently internally with respect to quoting:
inside double-quoted strings, use `" or "" to escape double-quotes
inside single-quoted strings, use '' to escape single-quotes
This works on the PowerShell command line and when passing parameters to PowerShell scripts or functions from within PowerShell.
(As discussed above, passing an escaped double-quote to PowerShell from the outside requires \" or, more robustly, \"" - nothing else works).
Sadly, when invoking external programs from PowerShell, you're faced with the need to both accommodate PowerShell's own quoting rules and to escape for the target program:
This problematic behavior is also discussed and summarized in this answer; the experimental PSNativeCommandArgumentPassing feature introduced in PowerShell Core 7.2.0-preview.5 - assuming it becomes an official feature - will fix this at least for those external programs that accept \".
Double-quotes inside double-quoted strings:
Consider string "3`" of rain", which PowerShell-internally translates to literal 3" of rain.
If you want to pass this string to an external program, you have to apply the target program's escaping in addition to PowerShell's; say you want to pass the string to a C program, which expects embedded double-quotes to be escaped as \":
foo.exe "3\`" of rain"
Note how both `" - to make PowerShell happy - and the \ - to make the target program happy - must be present.
The same logic applies to invoking a batch file, where "" must be used:
foo.bat "3`"`" of rain"
By contrast, embedding single-quotes in a double-quoted string requires no escaping at all.
Single-quotes inside single-quoted strings do not require extra escaping; consider '2'' of snow', which is PowerShell's representation of 2' of snow.
foo.exe '2'' of snow'
foo.bat '2'' of snow'
PowerShell translates single-quoted strings to double-quoted ones before passing them to the target program.
However, double-quotes inside single-quoted strings, which do not need escaping for PowerShell, do still need to be escaped for the target program:
foo.exe '3\" of rain'
foo.bat '3"" of rain'
PowerShell v3 introduced the magic --% option, called the stop-parsing symbol, which alleviates some of the pain, by passing anything after it uninterpreted to the target program, save for cmd.exe-style environment-variable references (e.g., %USERNAME%), which are expanded; e.g.:
foo.exe --% "3\" of rain" -u %USERNAME%
Note how escaping the embedded " as \" for the target program only (and not also for PowerShell as \`") is sufficient.
However, this approach:
does not allow for escaping % characters in order to avoid environment-variable expansions.
precludes direct use of PowerShell variables and expressions; instead, the command line must be built in a string variable in a first step, and then invoked with Invoke-Expression in a second.
An alternative workaround that addresses this problem is to call via cmd /c with a single argument containing the entire command line:
cmd /c "foo.exe `"3\`" of rain`" -u $env:USERNAME"
Thus, despite its many advancements, PowerShell has not made escaping easier when calling external programs - on the contrary. It has, however, introduced support for single-quoted strings.
If you don't mind installing a third-party module (authored by me), the Native module (Install-Module Native) offers backward- and forward-compatible helper function ie, which obviates the need for the extra escaping and contains important accommodations for high-profile CLIs on Windows:
# Simply prepend 'ie' to your external-program calls.
ie foo.exe '3" of rain' -u $env:USERNAME
Google eventually came up with the answer. The syntax for string replacement in batch is this:
set v_myvar=replace me
set v_myvar=%v_myvar:ace=icate%
Which produces "replicate me". My script now looks like this:
@echo off
set v_params=%*
set v_params=%v_params:"=\"%
call bash -c "g++-linux-4.1 %v_params%"
Which replaces all instances of " with \", properly escaped for bash.
As an addition to mklement0's excellent answer:
Almost all executables accept \" as an escaped ". Safe usage in cmd however is almost only possible using DELAYEDEXPANSION.
To explicitly send a literal " to some process, assign \" to an environment variable, and then use that variable whenever you need to pass a quote. Example:
SETLOCAL ENABLEDELAYEDEXPANSION
set q=\"
child "malicious argument!q!&whoami"
Note SETLOCAL ENABLEDELAYEDEXPANSION seems to work only within batch files. To get DELAYEDEXPANSION in an interactive session, start cmd /V:ON.
If your batch file doesn't work with DELAYEDEXPANSION, you can enable it temporarily:
::region without DELAYEDEXPANSION
SETLOCAL ENABLEDELAYEDEXPANSION
::region with DELAYEDEXPANSION
set q=\"
echoarg.exe "ab !q! & echo danger"
ENDLOCAL
::region without DELAYEDEXPANSION
If you want to pass dynamic content from a variable that contains quotes that are escaped as "" you can replace "" with \" on expansion:
SETLOCAL ENABLEDELAYEDEXPANSION
foo.exe "danger & bar=region with !dynamic_content:""=\"! & danger"
ENDLOCAL
This replacement is not safe with %...% style expansion!
In the OP's case, bash -c "g++-linux-4.1 !v_params:"=\"!" is the safe version.
If for some reason even temporarily enabling DELAYEDEXPANSION is not an option, read on:
Using \" from within cmd is a little bit safer if one always needs to escape special characters, instead of just sometimes. (It's less likely to forget a caret, if it's consistent...)
To achieve this, one precedes any quote with a caret (^"); quotes that should reach the child process as literals must additionally be escaped with a backslash (\^"). ALL shell metacharacters must be escaped with ^ as well, e.g. & => ^&; | => ^|; > => ^>; etc.
Example:
child ^"malicious argument\^"^&whoami^"
Source: Everyone quotes command line arguments the wrong way, see "A better method of quoting"
To pass dynamic content, one needs to ensure the following:
The part of the command that contains the variable must be considered "quoted" by cmd.exe (This is impossible if the variable can contain quotes - don't write %var:""=\"%). To achieve this, the last " before the variable and the first " after the variable are not ^-escaped. cmd-metacharacters between those two " must not be escaped. Example:
foo.exe ^"danger ^& bar=\"region with %dynamic_content% & danger\"^"
This isn't safe, if %dynamic_content% can contain unmatched quotes.
If the string is already within quotes then use another quote to nullify its action.
echo "Insert tablename(col1) Values('""val1""')"
On Windows 10 21H1:
If I want to run the Everything application from a batch (.bat) file, I use """ inside a double-quoted argument:
"C:\Program Files\Everything\Everything.exe" -search "<"""D:\My spaced folder""" | """Z:\My_non_spaced_folder"""> <*.jpg | *.jpeg | *.avi | *.mp4>"
Hope it helps.