I'm using cmake to compare two files like this:
cmake -E compare_files file1 file2
The trouble is that file1 and file2 have different line endings. I'm using cmake because I already use that for my build; the above command is in my testing.
I don't need anything special at this point. Just a way to tell the user that the files are different. (Hopefully there are no differences.) If there are differences, I'll just report it at this point and manually inspect more closely.
If there is a convenient way of reporting (e.g., print to screen or write to a file) then I'm open to suggestions on how to make that happen. But I'm really just interested in knowing if there are differences and different line endings are unimportant.
Is there a flag or an option that I'm missing that will ignore the difference in line endings?
Unfortunately it seems to be an unsupported feature (issue is "tracked" here, doesn't look like it will ever be handled).
But there's a workaround: you can use configure_file to create copies of the files with uniform line endings before starting the comparison. For example:
configure_file(<input> <output> NEWLINE_STYLE CRLF)
Note that the option COPYONLY is not compatible with NEWLINE_STYLE, so you'll have to make sure configure_file doesn't perform any unintended variable substitutions.
As of CMake 3.14, you can do:
cmake -E compare_files --ignore-eol file1 file2
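For CMake versions older than 3.14, or for a quick check outside the build, a similar comparison can be sketched in POSIX shell. This is not how compare_files --ignore-eol is implemented; same_ignoring_eol is a hypothetical helper that strips every carriage return before comparing, which assumes the files contain no meaningful bare CR characters.

```shell
# Hypothetical helper: exit status 0 when the two files differ only in
# line endings, 1 when their content differs, 2 if temp files can't be made.
same_ignoring_eol() {
    tmp_a=$(mktemp) && tmp_b=$(mktemp) || return 2
    # Assumption: dropping every CR is acceptable for these files.
    tr -d '\r' < "$1" > "$tmp_a"
    tr -d '\r' < "$2" > "$tmp_b"
    cmp -s "$tmp_a" "$tmp_b"
    status=$?
    rm -f "$tmp_a" "$tmp_b"
    return "$status"
}

# Usage: report a difference, then inspect manually.
# same_ignoring_eol file1 file2 || echo "file1 and file2 differ"
```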
I have a project where one repeatable task to do involves manipulating files' contents.
Until now I used a Python script for it, but recently I discovered I can use standalone CMake scripts ("standalone" here means they can be invoked outside of the configure/build/test/etc. workflow). As my project already uses CMake for project management, I concluded I could spare others the trouble of installing a Python interpreter (hello, Windows users!) and use CMake project-wide.
Part of my script needs to read a file and cut off everything that appears before "[START-HERE]" and after "[END-HERE]" lines. I am stuck with that part and don't know how to implement it. How can it be done?
You could combine file(READ) with if(MATCHES) to accomplish this. The former reads the file; the latter lets you check for the occurrence of a regular expression and extract a capturing group:
foo.cmake
#[===[
Params:
INPUT_FILE : the path to the file to read
#]===]
file(READ "${INPUT_FILE}" FILE_CONTENTS)
if (FILE_CONTENTS MATCHES "(^|[\r\n])\\[START-HERE\\][\r\n]+(.*)[\r\n]+\\[END-HERE\\]")
# todo: use extracted match stored in CMAKE_MATCH_2 for your own logic
message("Content: '${CMAKE_MATCH_2}'")
else()
message(FATAL_ERROR "[START-HERE]...[END-HERE] doesn't occur in the input file '${INPUT_FILE}'")
endif()
foo.txt
Definetly not
[START-HERE]
working
[END-HERE]
Try again!
Output:
> cmake -D INPUT_FILE=foo.txt -P foo.cmake
Content: 'working'
For the part where you are stuck, here's one approach using the string, file, and math commands:
file(READ data.txt file_str)
string(FIND "${file_str}" "[START-HERE]" start_offset)
# message("${start_offset}")
math(EXPR start_offset "${start_offset}+12")
# message("${start_offset}")
string(FIND "${file_str}" "[END-HERE]" end_offset)
math(EXPR substr_len "${end_offset}-${start_offset}")
# message("${substr_len}")
string(SUBSTRING "${file_str}" "${start_offset}" "${substr_len}" trimmed_str)
# message("${trimmed_str}")
You could also probably do it by using the file(STRINGS) command, which reads lines of a file into an array, and then use the list(FIND) command. The approach shown above has the advantage of working if your delimiters are not on their own lines.
As @fabian shows in their answer post, you can also do this using a regular expression with if(MATCHES) like this:
file(READ "${INPUT_FILE}" FILE_CONTENTS)
if (FILE_CONTENTS MATCHES "(^|[\r\n])\\[START-HERE\\][\r\n]+(.*)[\r\n]+\\[END-HERE\\]")
# todo: use extracted match stored in CMAKE_MATCH_2 for your own logic
message("Content: '${CMAKE_MATCH_2}'")
else()
message(FATAL_ERROR "[START-HERE]...[END-HERE] doesn't occur in the input file '${INPUT_FILE}'")
endif()
I have a directory of almost a thousand html files. Each file needs to be split up into multiple text files, based on a recurring pattern (a heading). I am on a windows machine, using GnuWin32 tools.
I've found a way to do this, for a single file:
csplit 1.html -b "%04d.txt" /"Words in heading"/ {*}
But I don't know how to repeat this operation over the entire set of HTML files. This:
csplit *.html -b "%04d.txt" /"Words in heading"/ {*}
doesn't work, and neither does this:
for %i in (*.html) do csplit *.html -b "%04d.txt" /"Words in heading"/ {*}
Both result in an invalid pattern error. Help would be much appreciated!
The order of options and arguments is important with csplit, and it won’t accept multiple files. Its help output gets you there:
% csplit --help
Usage: csplit [OPTION]... FILE PATTERN...
I’m surprised your first example works for the single file. It really should be changed to:
% csplit -b "%04d.txt" 1.html "/Words in heading/" "{*}"
         ^^^^^^^^^^^^^ ^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^
         OPTS/ARGS     FILE   PATTERNS
Notice also that I changed your quoting so that it surrounds each whole argument. You probably also need to quote your last argument, "{*}".
I’m not sure what shell you’re using, but if that for-loop syntax is appropriate, then the fixed command should work in the loop.
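As a sketch of such a loop, here is a version for a POSIX shell (for example under MSYS or Cygwin on Windows; the cmd.exe for %i syntax differs). It assumes GNU csplit. The function name split_all_html and the per-file prefix passed with -f are illustrative additions: without a distinct prefix, the pieces of every input file would be written to the same output names and overwrite each other.

```shell
# Hypothetical helper: split every .html file in a directory at each line
# matching a pattern. Pieces of NAME.html become NAME_0000.txt, NAME_0001.txt, ...
split_all_html() {
    dir=$1
    pattern=$2
    for f in "$dir"/*.html; do
        # Skip the unexpanded glob when the directory has no .html files.
        [ -e "$f" ] || continue
        # -s: don't print piece sizes; -f: per-file output prefix;
        # -b: suffix format; "{*}": repeat the pattern as often as possible.
        csplit -s -f "${f%.html}_" -b "%04d.txt" "$f" "/$pattern/" "{*}"
    done
}

# Usage:
# split_all_html . "Words in heading"
```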
I am writing a package using the GNU build system. The documentation hence is in the texinfo format. As a result, executing make converts the texinfo file into the info format, and executing make pdf automatically produces a pdf file.
In the texinfo file, I have something like this:
@verbatim
awk '{...}' data.txt
@end verbatim
However, in the pdf, the "basic" single quotes (U+0027) in the awk command above are transformed into "curvy" single quotes (U+2019) so that, if one copies and pastes the command from the pdf into a terminal, bash complains ("syntax error"). This forces the user to edit the command they just pasted. The same problem occurs if I replace @verbatim with @example. I searched the texinfo manual but couldn't find a way to specify apostrophes. I am using texinfo version 5.2.
Karl Berry (via the bug-texinfo mailing list) told me to add 2 lines to my texi file (more info):
@codequoteundirected on
@codequotebacktick on
as well as add the latest version of texinfo.tex to my package.
I want (GNU) make to rebuild when variables change. How can I achieve this?
For example,
$ make project
[...]
$ make project
make: `project' is up to date.
...like it should, but then I'd prefer
$ make project IMPORTANTVARIABLE=foobar
make: `project' is up to date.
to rebuild some or all of project.
Make wasn't designed to track variable content, but Reinier's approach shows us a workaround. Unfortunately, using a variable's value as a file name is both insecure and error-prone. Fortunately, Unix tools can help us encode the value properly:
IMPORTANTVARIABLE = a trouble
# GUARD is a function which calculates md5 sum for its
# argument variable name. Note, that both cut and md5sum are
# members of coreutils package so they should be available on
# nearly all systems.
GUARD = $(1)_GUARD_$(shell echo $($(1)) | md5sum | cut -d ' ' -f 1)
foo: bar $(call GUARD,IMPORTANTVARIABLE)
	@echo "Rebuilding foo with $(IMPORTANTVARIABLE)"
	@touch $@
$(call GUARD,IMPORTANTVARIABLE):
	rm -rf IMPORTANTVARIABLE*
	touch $@
Here your target effectively depends on a special file named $(NAME)_GUARD_$(VALUEMD5), which is safe to refer to and has an (almost) 1-to-1 correspondence with the variable's value. Note that call and shell are GNU Make extensions.
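The name that GUARD produces can be previewed in a plain shell, which makes it easier to see why awkward values are harmless. This is a minimal sketch using the same md5sum and cut pipeline; guard_name is a hypothetical helper, not part of the Makefile above. Since make passes the value to echo unquoted, the hash may differ from make's for values with unusual whitespace.

```shell
# Hypothetical helper mirroring the GUARD function:
# $1 is the variable name, $2 is the variable's value.
guard_name() {
    # Hash the value so spaces or slashes never end up in the file name.
    hash=$(printf '%s\n' "$2" | md5sum | cut -d ' ' -f 1)
    printf '%s_GUARD_%s\n' "$1" "$hash"
}

# Usage: different values give different guard file names.
# guard_name IMPORTANTVARIABLE "a trouble"
# guard_name IMPORTANTVARIABLE "foobar"
```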
You could use empty files to record the last value of your variable by using something like this:
someTarget: IMPORTANTVARIABLE.$(IMPORTANTVARIABLE)
	@echo Remaking $@ because IMPORTANTVARIABLE has changed
	touch $@
IMPORTANTVARIABLE.$(IMPORTANTVARIABLE):
	@rm -f IMPORTANTVARIABLE.*
	touch $@
After your make run, there will be an empty file in your directory whose name starts with IMPORTANTVARIABLE. and has the value of your variable appended. This basically contains the information about what the last value of the variable IMPORTANTVARIABLE was.
You can add more variables with this approach and make it more sophisticated using pattern rules -- but this example gives you the gist of it.
You probably want to use ifdef or ifeq depending on what the final goal is. See the manual here for examples.
I might be late with an answer, but here is another way of doing such a dependency with Make conditional syntax (works on GNU Make 4.1, GNU bash, Bash on Ubuntu on Windows version 4.3.48(1)-release (x86_64-pc-linux-gnu)):
1 ifneq ($(shell cat config.sig 2>/dev/null),prefix $(CONFIG))
2 .PHONY: config.sig
3 config.sig:
4 	@(echo 'prefix $(CONFIG)' >config.sig &)
5 endif
In the sample above we track the $(CONFIG) variable by writing its value to a signature file. The rule for that file (a target named after the file itself) is only defined when the value recorded in the signature file differs from the current value of $(CONFIG). Note the prefix on lines 1 and 4: it is needed to distinguish the case where the signature file doesn't exist yet.
Of course, consumer targets specify config.sig as a prerequisite.
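The comparison on line 1 amounts to "rewrite the signature only when the recorded value differs from the current one". A minimal shell sketch of that check, outside of make, might look like this; update_sig is a hypothetical helper and not part of the Makefile above.

```shell
# Hypothetical helper: rewrite the signature file only when the value changed,
# so its timestamp stays untouched otherwise.
# Returns 0 when the file was (re)written, 1 when it was already current.
update_sig() {
    sig_file=$1
    value=$2
    # The "prefix " marker keeps an empty value distinct from a missing file.
    if [ "$(cat "$sig_file" 2>/dev/null)" = "prefix $value" ]; then
        return 1
    fi
    printf 'prefix %s\n' "$value" > "$sig_file"
}

# Usage:
# update_sig config.sig "$CONFIG" && echo "config.sig updated"
```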
I need to write a custom command that runs whenever file A is newer than file B.
How do I do this in CMake?
Sounds like you want something similar to this:
add_custom_command(OUTPUT B
COMMAND ${CMAKE_COMMAND} -Dinput=A -P script_that_generates_B.cmake
DEPENDS A
)
Where "B" is the full path to the output file, "A" is the full path to some input file, and the command is something that runs at build time to produce B whenever A changes.
In order for the rule producing B to be executed at build time, something else must depend on B also. It should appear either as a DEPENDS of an add_custom_target that is in "all" or as a source file to an add_library or add_executable command to trigger the command to run.
EDIT:
You can also use the
if(file1 IS_NEWER_THAN file2)
construct at CMake configure time, if necessary. The documentation of the IF command is rather lengthy, but searching on this page for IS_NEWER_THAN yields this nugget:
"True if file1 is newer than file2 or if one of the two files doesn't exist. Behavior is well-defined only for full paths."