script to run a certain program with input from a given directory - testing

So I need to run a bunch of (maven) tests with testfiles being supplied as an argument to a maven task.
Something like this:
mvn clean test -Dtest=<filename>
And the test files are usually organized into different directories. So I'm trying to write a script which would execute the above 'command' and automatically feed the name of all files in a given dir to the -Dtest.
So I started out with a shellscript called 'run_test':
#!/bin/sh
if test $# -lt 1; then
echo "$0: insufficient arguments on the command line." >&2
echo "usage: $0 directory" >&2
exit 1
fi
for file in allFiles <<<<<<< what should I put here? Can I somehow iterate thru the list of all files' name in the given directory put the file name here?
do mvn clean test -Dtest="$file"
done
exit $?
The part where I got stuck is how to get a list of filenames.
Thanks,

Assuming $1 contains the directory name (validation of the user input is a separate issue), then
for file in "$1"/*
do
[ -f "$file" ] && mvn clean test -Dtest="$file"
done
will run the command on every regular file in that directory. If you want to recurse into subdirectories, you need the find command instead (note that this simple form breaks on file names containing whitespace):
for file in $(find "$1" -type f)
do
etc...
done

#! /bin/sh
# Set IFS to newline to minimise problems with whitespace in file/directory
# names. If we also need to deal with newlines, we will need to use
# find -print0 | xargs -0 instead of a for loop.
IFS="
"
if ! [ -d "${1}" ]; then
echo "Please supply a directory name" >&2
exit 1
else
# We use find rather than glob expansion in case there are nested directories.
# We sort the filenames so that we execute the tests in a predictable order.
for pathname in $(find "${1}" -type f | LC_ALL=C sort); do
mvn clean test -Dtest="${pathname}" || break
done
fi
# exit $? would be superfluous (it is the default)
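The comment in the script above mentions find -print0 for names that may contain newlines. Here is a minimal sketch of that variant, assuming bash and a find/sort that support -print0 and -z (the GNU tools do). The function name run_tests and the MVN override variable are made up for illustration; they are not part of Maven or of the scripts above.

```shell
#!/usr/bin/env bash
# run_tests DIR
# Runs "mvn clean test -Dtest=<file>" once per regular file below DIR, in
# sorted order, stopping at the first failure.
# MVN is a hypothetical override so the loop can be tried with e.g. MVN=echo.
run_tests() {
  local dir=$1 pathname
  # The file list arrives NUL-delimited on fd 3, so stdin stays free for mvn
  # and names containing spaces or newlines are passed through unchanged.
  while IFS= read -r -d '' -u 3 pathname; do
    "${MVN:-mvn}" clean test -Dtest="$pathname" || return
  done 3< <(find "$dir" -type f -print0 | LC_ALL=C sort -z)
}
```

A dry run such as MVN=echo run_tests some_dir prints the commands instead of executing them, which is a cheap way to check the file list first.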


How to ls the first file (path included) of every subfolder recursively?

Suppose these are the files:
folder1/11.txt
folder1/12.txt
folder1/levela/11a1.txt
folder1/levela/11a2.txt
folder1/levela/levelb/11b1.txt
folder1/levela/levelb/11b2.txt
folder2/21.txt
folder2/22.txt
folder2/levela/21a1.txt
folder2/levela/21a2.txt
folder2/levela/levelb/21b1.txt
folder2/levela/levelb/21b2.txt
folder3/a/b/c/d/e/deepfile1.txt
folder3/a/b/c/d/e/deepfile2.txt
Is there a way (for example using ls, find or grep or any gnuwin32 commands) to show the 1st file from every subfolder please?
Desired output:
folder1/11.txt
folder1/levela/11a1.txt
folder1/levela/levelb/11b1.txt
folder2/21.txt
folder2/levela/21a1.txt
folder2/levela/levelb/21b1.txt
folder3/a/b/c/d/e/deepfile1.txt
Thank you.
I suggest this solution:
find . -type f -printf "%p %h\n" | sort -k2,2 -k1,1 | uniq --skip-fields=1 | awk '{print $1}'
Explanation:
find . -type f -printf "%p %h\n"
This find command searches for all regular files under the current directory and prints, for each file, its relative path, a space, and its containing directory. (-printf is a GNU find extension.)
sort -k2,2 -k1,1
Sorts the list lexicographically by the 2nd field (the directory), then by the 1st field (the path), so that all files of a given directory end up adjacent and in order.
To see the intermediate result, try:
find . -type f -printf "%p %h\n" | sort -k2,2 -k1,1
uniq --skip-fields=1
Ignores the 1st field when comparing lines, so of each run of lines with the same directory (field #2) only the first is kept.
awk '{print $1}'
Prints only the first field, the file's relative path.
Note that this pipeline assumes paths contain no whitespace, because a space is used as the field separator.
A bash script:
script.sh
#!/bin/bash
declare -A filesArr # declare an associative array mapping directory -> first file
for currFile in $(find "$1" -type f); do # main loop: scan all files under $1
currDir=$(dirname "$currFile") # get the current file's directory
if [[ -z ${filesArr["$currDir"]} ]]; then # if the current directory is not yet stored in filesArr
filesArr[$currDir]="$currFile" # store the directory with the current file
fi
if [[ ${filesArr["$currDir"]} > "$currFile" ]]; then # if current file < stored file in array
filesArr[$currDir]="$currFile" # set the stored file to be the current file
fi
done
for currFile in "${filesArr[@]}"; do # loop over the array to output one file per directory
echo "$currFile"
done
Running script.sh on /tmp folder
chmod a+x script.sh
./script.sh /tmp
BTW: the sort and uniq pipeline above is much faster.
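For directories or files with spaces in their names, something along these lines might work instead. It is only a sketch: first_file_per_dir is a made-up name, and it still assumes that no path contains a newline. Because all files of one directory share the same "dir/" prefix, the first one seen in a plain byte-order sort is also the first by file name.

```shell
#!/usr/bin/env bash
# first_file_per_dir [DIR]
# Prints, for every directory below DIR (default: .), the path of the file
# that sorts first within that directory.
first_file_per_dir() {
  find "${1:-.}" -type f | LC_ALL=C sort | awk -F/ '
    {
      # Directory part = whole line minus the trailing "/filename".
      dir = substr($0, 1, length($0) - length($NF) - 1)
      if (!(dir in seen)) { seen[dir] = 1; print }
    }'
}
```

Splitting on / rather than on spaces is what makes names with blanks safe here.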

how to remove a pattern from many files

this is my file.
...
</script>
<!--START: Google Analytics --->
<script type="text/javascript"
src="../src/goog/ga_body.js"></script>
<!--END: Google Analytics --->
</body>
</html>
...
How do I delete everything from <!--START: Google Analytics ---> to <!--END: Google Analytics --->, inclusive? So effectively this:
<!--START: Google Analytics --->
<script type="text/javascript"
src="../src/goog/ga_body.js"></script>
<!--END: Google Analytics --->
will be gone, i.e. the 4 lines will be replaced with nothing, and this will be left:
</script>
<nothing here 4 lines deleted>
</body>
</html>
I am looking at doing it in bash so maybe sed and awk might be my best bet, although python might be better.
EDIT1
This is something I have written before, but it is probably very poor coding, I will work off this find2PatternsAndDeleteTextInBetween.sh:
#Here I want to find 2 patterns and delete what's in between
#this example works
#these are the 2 patterns I want to find: Start and End
#have to use some escape characters here for this to show properly
# have to use \n for it to appear in this format
#<!-- Start of StatCounter Code for DoYourOwnSite -->
# text would go here
#<!-- End of StatCounter Code for DoYourOwnSite -->>
#b="<!-- Start of StatCounter Code for DoYourOwnSite -->"
#b2="<!-- End of StatCounter Code for DoYourOwnSite -->"
#p1="PATTERN-1"
#p2="PATTERN-2"
p1="<!-- Start of StatCounter Code for DoYourOwnSite -->"
p2="<!-- End of StatCounter Code for DoYourOwnSite -->"
fname="*.html"
num_of_files_pattern1=$(grep -l -- "$p1" $fname | wc -l) # number of files containing pattern 1
echo "fname(s) to apply the sed to:"
echo $fname
echo "num_of_files_pattern1 is:"
echo $num_of_files_pattern1
echo "Pattern1 is equal to:"
echo $p1
echo "Pattern2 is equal to:"
echo $p2
#this is current dir where the script is
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
echo "DIR is equal to:"
echo $DIR
#cd to the dir where I want to copy the files to:
cd "$DIR"
# this will find the pattern <\head> in all the .html files and place "This should appear before the closing head tag" this before it
# it will also make a backup with .bak extension
#sed -i.bak '/<\\head>/i\This should appear before the closing head tag' *.html
echo "sed on the file"
# this does the head part
#sed '/PATTERN-1/,/PATTERN-2/d' *.txt # this works
#sed "/$p1/,/$p2/d" *.txt # this works
#sed "/$p1/,/$p2/d" $fname # this works
sed -i.bak "/$p1/,/$p2/d" $fname # this works
EDIT2
This is what I ended up with, but there is a more robust answer below:
# ------------------------------------------------------------------
# [author] find2PatternsAndDeleteTextInBetween.sh
# Description
# Here I want to find 2 patterns and delete what's in between
# this example works
#
# EXAMPLE:
# this is the 2 patterns I want to find Start and End
# <!-- Start of StatCounter Code for DoYourOwnSite -->
# text would go here
# <!-- End of StatCounter Code for DoYourOwnSite -->>
#
# ------------------------------------------------------------------
p1="<!--START: Google Analytics --->"
p2="<!--END: Google Analytics --->"
fname=".html"
echo "fname(s) to apply the sed to:"
echo *"$fname"
echo -e "\n"
echo "Pattern1 is equal to:"
echo -e "$p1\n"
echo "Pattern2 is equal to:"
echo -e "$p2\n"
echo -e "PWD is: $PWD\n"
echo "sed on the file"
#sed '/PATTERN-1/,/PATTERN-2/d' *.txt # this works
#sed "/$p1/,/$p2/d" *.txt # this works
#sed "/$p1/,/$p2/d" $fname # this works
sed -i.bak "/$p1/,/$p2/d" *"$fname" # this works
sed is the right tool for this task:
$ sed -i'.bak' '/<!--START/,/<!--END/d' file
If other lines carry similar tags, make the patterns more specific by including more of the marker text.
For multiple files, for example file1, ..., file4:
$ for f in file{1..4}; do sed -i'.bak' '/<!--START/,/<!--END/d' "$f"; done
Something to consider:
$ awk '/<!--(START|END): Google Analytics --->/{f=!f;next} !f' file
...
</script>
</body>
</html>
...
Judging by the script in your question it sounds like you already know how to use sed to remove the range of interest from a single file (sed -i.bak "/$p1/,/$p2/d" $fname), but are looking for a robust way to process multiple files in a script (assumes bash):
#!/usr/bin/env bash
# cd to the dir. in which this script is located.
# CAVEAT: Assumes that the script wasn't invoked through a *symlink*
# located in a different dir.
cd -- "$(dirname -- "$BASH_SOURCE")" || exit
fpattern='*.html' # specify source-file globbing pattern
shopt -s nullglob # make sure that globbing expands to nothing if nothing matches
fnames=( $fpattern ) # expand to matching files and store in array
num_of_files_matching_pattern=${#fnames[@]} # count matching files
(( num_of_files_matching_pattern > 0 )) || exit # abort, if no files match
printf '%s\n%s\n' "Running from:" "$PWD"
printf '%s\n%s\n' "Pattern matching the files to process:" "$fpattern"
printf '%s\n%s\n' "# of matching files:" "$num_of_files_matching_pattern"
# Determine the range-endpoint-identifier-line regular expressions.
# CAVEAT: Make sure you escape any regular-expression metacharacters you want
# to be treated as *literals*.
p1='^<!--START: Google Analytics --->$'
p2='^<!--END: Google Analytics --->$'
# Remove the range identified by its endpoints from all matching input files
# and save the original files with extension '.bak'
sed -i'.bak' "/$p1/,/$p2/d" "${fnames[@]}" || exit
As an aside: I suggest not using suffix .sh in your script filename:
The shebang line inside the file is sufficient to tell the system what shell/interpreter to pass the script to.
Not specifying a suffix leaves you free to change the implementation later (e.g., to Python) without breaking existing programs that rely on your script.
In the case at hand, assuming that use of bash is actually acceptable, .sh would be misleading, because it suggests a sh-features-only script.
Determining the running script's true directory, even when the script is invoked via a symlink located in a different directory:
If you can assume a Linux platform (or at least GNU readlink), use:
dirname -- "$(readlink -e -- "$BASH_SOURCE")"
Otherwise, a more elaborate solution with a helper function is required - see this answer of mine.
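Regarding the caveat about escaping regular-expression metacharacters: if the marker lines are known exactly, one way to sidestep escaping altogether is to compare whole lines as plain strings. A rough sketch under that assumption (delete_block is a hypothetical helper, and the markers must match the entire line, including any leading whitespace):

```shell
#!/usr/bin/env bash
# delete_block START END FILE
# Prints FILE without the lines from the one equal to START through the one
# equal to END, inclusive. The markers are compared as literal strings and are
# passed through the environment so awk does not interpret backslashes.
delete_block() {
  local start=$1 end=$2 file=$3
  BLOCK_START=$start BLOCK_END=$end awk '
    $0 == ENVIRON["BLOCK_START"] { skipping = 1 }
    !skipping                    { print }
    $0 == ENVIRON["BLOCK_END"]   { skipping = 0 }
  ' "$file"
}
```

This writes to standard output; to change a file in place you would redirect to a temporary file and move it over the original after keeping a backup, much like sed -i.bak does.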

SSH - Loop through lines from txt file and delete files

I have a .txt file and on each line is a different file location e.g.
file1.zip
file2.zip
file3.zip
How can I open that file, loop through each line and rm -f filename on each one?
Also, will deleting it throw an error if the file doesn't exist (has already been deleted) and if so how can I avoid this?
EDIT: The file names may have spaces in them, so this needs to be catered for as well.
You can use a for loop with cat to iterate through the lines:
IFS=$'\n'; \
for file in `cat list.txt`; do \
if [ -f "$file" ]; then \
rm -f "$file"; \
fi; \
done
The if [ -f "$file" ] check tests whether the file exists and is a regular file (not a directory). If the check fails, the entry is skipped.
The IFS=$'\n' at the top sets the delimiter to newlines only; this allows you to process file names containing spaces.
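xargs -d '\n' -n1 echo < test.txt
Replace 'echo' with rm -f or any other command. The -d '\n' option (GNU xargs) makes each line a single argument, so names with spaces survive. You can also use cat test.txt |
See 'man xargs' for more info.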
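Another option that avoids both changing IFS globally and relying on GNU xargs is a plain while read loop. A small sketch (delete_listed_files is just an illustrative name); since rm -f stays silent and returns success for names that no longer exist, no separate existence check is needed:

```shell
#!/usr/bin/env bash
# delete_listed_files LIST
# Removes every file named in LIST, one name per line. Spaces in names are
# preserved; names containing newlines are not supported.
delete_listed_files() {
  local list=$1 name
  # "|| [ -n "$name" ]" also handles a last line without a trailing newline.
  while IFS= read -r name || [ -n "$name" ]; do
    [ -n "$name" ] || continue   # skip blank lines
    rm -f -- "$name"             # -f: no error if the file is already gone
  done < "$list"
}
```

Usage would be delete_listed_files list.txt.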

How to capture CMake command line arguments?

I want to record the arguments passed to cmake in my generated scripts. E.g., "my-config.in" will be processed by cmake, it has definition like this:
config="@CMAKE_ARGS@"
After cmake, my-config will contain a line something like this:
config="-DLINUX -DUSE_FOO=y -DCMAKE_INSTALL_PREFIX=/usr"
I tried CMAKE_ARGS, CMAKE_OPTIONS, but failed. No documents mention this. :-(
I don't know of any variable which provides this information, but you can generate it yourself (with a few provisos).
Any -D arguments passed to CMake are added to the cache file CMakeCache.txt in the build directory and are reapplied during subsequent invocations without having to be specified on the command line again.
So in your example, if you first execute CMake as
cmake ../.. -DCMAKE_INSTALL_PREFIX:PATH=/usr
then you will find that subsequently running simply
cmake .
will still have CMAKE_INSTALL_PREFIX set to /usr
If what you're looking for from CMAKE_ARGS is the full list of variables defined on the command line from every invocation of CMake then the following should do the trick:
get_cmake_property(CACHE_VARS CACHE_VARIABLES)
foreach(CACHE_VAR ${CACHE_VARS})
get_property(CACHE_VAR_HELPSTRING CACHE ${CACHE_VAR} PROPERTY HELPSTRING)
if(CACHE_VAR_HELPSTRING STREQUAL "No help, variable specified on the command line.")
get_property(CACHE_VAR_TYPE CACHE ${CACHE_VAR} PROPERTY TYPE)
if(CACHE_VAR_TYPE STREQUAL "UNINITIALIZED")
set(CACHE_VAR_TYPE)
else()
set(CACHE_VAR_TYPE :${CACHE_VAR_TYPE})
endif()
set(CMAKE_ARGS "${CMAKE_ARGS} -D${CACHE_VAR}${CACHE_VAR_TYPE}=\"${${CACHE_VAR}}\"")
endif()
endforeach()
message("CMAKE_ARGS: ${CMAKE_ARGS}")
This is a bit fragile as it depends on the fact that each variable which has been set via the command line has the phrase "No help, variable specified on the command line." specified as its HELPSTRING property. If CMake changes this default HELPSTRING, you'd have to update the if statement accordingly.
If this isn't what you want CMAKE_ARGS to show, but instead only the arguments from the current execution, then I don't think there's a way to do that short of hacking CMake's source code! However, I expect this isn't what you want since all the previous command line arguments are effectively re-applied every time.
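If you only need the recorded -D options outside of CMake, the same HELPSTRING marker can be read straight from CMakeCache.txt, where each entry is preceded by its help string as a // comment line. A rough shell sketch under that assumption (cmake_cli_vars is a hypothetical helper name, and it is just as fragile as the CMake loop above, for the same reason):

```shell
#!/usr/bin/env bash
# cmake_cli_vars [CACHE_FILE]
# Prints one -D option per cache entry whose help string marks it as having
# been specified on the command line. Defaults to ./CMakeCache.txt.
cmake_cli_vars() {
  awk '
    prev_was_cli_help {
      line = $0
      sub(/:UNINITIALIZED=/, "=", line)   # drop the placeholder type
      print "-D" line
    }
    { prev_was_cli_help = ($0 == "//No help, variable specified on the command line.") }
  ' "${1:-CMakeCache.txt}"
}
```

Run from the build directory, this lists entries such as -DCMAKE_INSTALL_PREFIX:PATH=/usr, one per line.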
One way to store CMake command line arguments, is to have a wrapper script called ~/bin/cmake (***1) , which does 2 things:
create ./cmake_call.sh that stores the command line arguments
call the real cmake executable with the command line arguments
~/bin/cmake # code is shown below
#!/usr/bin/env bash
#
# Place this file into this location: ~/bin/cmake
# (with executable rights)
#
# This is a wrapper for cmake!
# * It calls cmake -- see last line of the script
# It also:
# * Creates a file cmake_call.sh in the current directory (build-directory)
# which stores the cmake-call with all its cmake-flags etc.
# (It also stores successive calls to cmake, so that you have a trace of all your cmake calls)
#
# You can simply reinvoke the last cmake commandline with: ./cmake_call.sh !!!!!!!!!!
#
# cmake_call.sh is not created
# when cmake is called without any flags,
# or when it is called with flags such as --help, -E, -P, etc. (refer to NON_STORE_ARGUMENTS -- you might need to modify it to suit your needs)
SCRIPT_PATH=$(readlink -f "$BASH_SOURCE")
SCRIPT_DIR=$(dirname "$SCRIPT_PATH")
#http://stackoverflow.com/a/13864829
if [ -z ${SUDO_USER+x} ]; then
# var SUDO_USER is unset
user=$USER
else
user=$SUDO_USER
fi
#http://stackoverflow.com/a/34621068
path_append () { path_remove $1 $2; export $1="${!1}:$2"; }
path_prepend() { path_remove $1 $2; export $1="$2:${!1}"; }
path_remove () { export $1="`echo -n ${!1} | awk -v RS=: -v ORS=: '$1 != "'$2'"' | sed 's/:$//'`"; }
path_remove PATH ~/bin # when calling cmake (at the bottom of this script), do not invoke this script again!
# when called with no arguments, don't create cmake_call.sh
if [[ $# -eq 0 ]]; then
cmake "$@"
exit
fi
# variable NON_STORE_ARGUMENTS stores flags which, if any are present, cause cmake_call.sh to NOT be created
read -r -d '' NON_STORE_ARGUMENTS <<'EOF'
-E
--build
#-N
-P
--graphviz
--system-information
--debug-trycompile
#--debug-output
--help
-help
-usage
-h
-H
--version
-version
/V
--help-full
--help-manual
--help-manual-list
--help-command
--help-command-list
--help-commands
--help-module
--help-module-list
--help-modules
--help-policy
--help-policy-list
--help-policies
--help-property
--help-property-list
--help-properties
--help-variable
--help-variable-list
--help-variables
EOF
NON_STORE_ARGUMENTS=$(echo "$NON_STORE_ARGUMENTS" | head -c -1 `# remove last newline` | sed "s/^/^/g" `#begin every line with ^` | tr '\n' '|')
#echo "$NON_STORE_ARGUMENTS" ## for debug purposes
## store all the args
ARGS_STR=
for arg in "$@"; do
if cat <<< "$arg" | grep -E -- "$NON_STORE_ARGUMENTS" &> /dev/null; then # don't use echo "$arg" ....
# since echo "-E" does not do what you want here,
# but cat <<< "-E" does what you want (print minus E)
# do not create cmake_call.sh
cmake "$@"
exit
fi
# concatenate to ARGS_STR
ARGS_STR="${ARGS_STR}$(echo -n " \"$arg\"" | sed "s,\($(pwd)\)\(\([/ \t,:;'\"].*\)\?\)$,\$(pwd)\2,g")"
# replace $(pwd) followed by
# / or
# whitespace or
# , or
# : or
# ; or
# ' or
# "
# or nothing
# with \$(pwd)
done
if [[ ! -e $(pwd)/cmake_call.sh ]]; then
echo "#!/usr/bin/env bash" > $(pwd)/cmake_call.sh
# escaping:
# note in the HEREDOC below, \\ means \ in the output!!
# \$ means $ in the output!!
# \` means ` in the output!!
cat <<EOF >> $(pwd)/cmake_call.sh
#http://stackoverflow.com/a/34621068
path_remove () { export \$1="\`echo -n \${!1} | awk -v RS=: -v ORS=: '\$1 != "'\$2'"' | sed 's/:\$//'\`"; }
path_remove PATH ~/bin # when calling cmake (at the bottom of this script), do not invoke ~/bin/cmake but real cmake!
EOF
else
# remove bottom 2 lines from cmake_call.sh
sed -i '$ d' $(pwd)/cmake_call.sh
sed -i '$ d' $(pwd)/cmake_call.sh
fi
echo "ARGS='${ARGS_STR}'" >> $(pwd)/cmake_call.sh
echo "echo cmake \"\$ARGS\"" >> $(pwd)/cmake_call.sh
echo "eval cmake \"\$ARGS\"" >> $(pwd)/cmake_call.sh
#echo "eval which cmake" >> $(pwd)/cmake_call.sh
chmod +x $(pwd)/cmake_call.sh
chown $user: $(pwd)/cmake_call.sh
cmake "$@"
Usage:
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Debug -DCMAKE_INSTALL_PREFIX=$(pwd)/install ..
This will create cmake_call.sh with the following content:
#!/usr/bin/env bash
#http://stackoverflow.com/a/34621068
path_remove () { export $1="`echo -n ${!1} | awk -v RS=: -v ORS=: '$1 != "'$2'"' | sed 's/:$//'`"; }
path_remove PATH ~/bin # when calling cmake (at the bottom of this script), do not invoke ~/bin/cmake but real cmake!
ARGS=' "-DCMAKE_BUILD_TYPE=Debug" "-DCMAKE_INSTALL_PREFIX=$(pwd)/install" ".."'
echo cmake "$ARGS"
eval cmake "$ARGS"
The 3rd last line stores the cmake arguments.
You can now reinvoke the exact command-line that you used by simply calling:
./cmake_call.sh
Footnotes:
(***1) ~/bin/cmake is usually in the PATH because of ~/.profile. When creating ~/bin/cmake the very 1st time, it might be necessary to log out and back in, so that .profile sees ~/bin.
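If all you need is the "record the call so it can be replayed" part, a much smaller sketch is possible using bash's printf %q, which quotes each argument so that the shell reads it back unchanged. This is an illustration only, not a replacement for the wrapper above: record_cmake_call is a made-up name, it overwrites cmake_call.sh rather than keeping a history, and it does not substitute $(pwd).

```shell
#!/usr/bin/env bash
# record_cmake_call ARGS...
# Writes ./cmake_call.sh, a script that re-runs cmake with exactly ARGS.
record_cmake_call() {
  {
    printf '#!/usr/bin/env bash\n'
    printf 'cmake'
    # %q quotes each argument for safe re-reading by bash.
    if (( $# > 0 )); then printf ' %q' "$@"; fi
    printf '\n'
  } > cmake_call.sh
  chmod +x cmake_call.sh
}
```

For example, record_cmake_call -DCMAKE_BUILD_TYPE=Debug .. followed by ./cmake_call.sh would invoke cmake with those two arguments.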
A very Linux specific way of achieving the same objective:
if(${CMAKE_SYSTEM_NAME} STREQUAL Linux)
file(STRINGS /proc/self/status _cmake_process_status)
# Grab the PID of the parent process
string(REGEX MATCH "PPid:[ \t]*([0-9]*)" _ ${_cmake_process_status})
# Grab the absolute path of the parent process
file(READ_SYMLINK /proc/${CMAKE_MATCH_1}/exe _cmake_parent_process_path)
# Compute CMake arguments only if CMake was not invoked by the native build
# system, to avoid dropping user specified options on re-triggers.
if(NOT ${_cmake_parent_process_path} STREQUAL ${CMAKE_MAKE_PROGRAM})
execute_process(COMMAND bash -c "tr '\\0' ' ' < /proc/$PPID/cmdline"
OUTPUT_VARIABLE _cmake_args)
string(STRIP "${_cmake_args}" _cmake_args)
set(CMAKE_ARGS "${_cmake_args}"
CACHE STRING "CMake command line args (set by end user)" FORCE)
endif()
message(STATUS "User Specified CMake Arguments: ${CMAKE_ARGS}")
endif()

Can someone help explain this code? It is a shell script for creating a checksum list

#!/bin/bash
# create a list of checksums
cat /dev/null > MD5SUM
for i in */*/*.sql ; do test -e $i && md5sum $i >>MD5SUM ; done
Then this command is used to check to see if anything has changed:
md5sum -c MD5SUM
It works fine and everything. I just don't really understand how. Say if I wanted to make a checksum list of all the files in my home directory $HOME how can I do that? What does the */*/*.sql part of the for loop mean? I'm assuming that is to display SQL files only but how can I modify that? Say I wanted all files in the directory? Why is it not just *.sql ? What does the rest of the for loop do in this case?
Let's go through it part by part:
cat /dev/null > MD5SUM
This simply empties ("erases") any MD5SUM file/list created by a previous run.
for i in */*/*.sql;
This iterates over the .sql files that sit exactly 2 directories below your current folder. If you have the folders
~/a/b
~/c/d
~/e/f
and you run your script in your home folder (~), every "*.sql" file inside directories b, d and f will have its checksum calculated and appended to the file MD5SUM in the current directory:
do test -e $i && md5sum $i >>MD5SUM ; done
Now Answering your questions:
Say if I wanted to make a checksum list of all the files in my home directory $HOME how can I do that?
I would use the find command with the -exec option:
find "$HOME" -maxdepth 1 -type f -name '*.sql' -exec md5sum {} \;
What does the */*/*.sql part of the for loop mean?
As explained above, it matches only .sql files that are exactly 2 directories deep.
I'm assuming that is to display SQL files only but how can I modify that? Say I wanted all files in the directory?
Change
for i in */*/*.sql;
to
for i in */*/*;
or, for all files directly in your home directory,
find "$HOME" -maxdepth 1 -type f -exec md5sum {} \;
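To checksum every file at any depth rather than at one fixed level, a sketch along these lines might do. It assumes GNU md5sum; make_checksums and check_checksums are made-up names, and any file called MD5SUM is skipped so the list does not try to checksum itself.

```shell
#!/usr/bin/env bash
# make_checksums DIR
# Writes DIR/MD5SUM with one checksum line per regular file below DIR.
make_checksums() {
  local dir=$1
  (
    cd "$dir" || exit
    # "{} +" passes many files to each md5sum call instead of one at a time.
    find . -type f ! -name MD5SUM -exec md5sum {} + > MD5SUM
  )
}

# check_checksums DIR
# Succeeds only if every file listed in DIR/MD5SUM is unchanged.
check_checksums() {
  local dir=$1
  ( cd "$dir" && md5sum -c --quiet MD5SUM )
}
```

Running make_checksums "$HOME" once and check_checksums "$HOME" later corresponds to the two steps in the question.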
Why is it not just *.sql ? What does the rest of the for loop do in this case?
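As explained above: *.sql would match only files in the current directory, whereas */*/*.sql matches files two levels down. In the loop body, test -e $i skips the literal, unexpanded pattern when nothing matches, and md5sum $i >>MD5SUM appends that file's checksum to the list.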
Hope it helps =)