AWK - suppress stdout on system() function - awk

I'm currently writing a shell script that will be given a directory, then output an ls of that directory with the return code from a C program appended to each line. The C program only needs to be called for regular files.
The problem I'm having is that output from the C program is cluttering up the output from awk, and I can't get stdout to redirect to /dev/null inside of awk. I have no use for the output, I just need the return code. Speed is definitely a factor, so if you have a more efficient solution I'd be happy to hear it. Code follows:
directory=$1
ls -i --full-time $directory | awk '
{
rc = 0
if (substr($2,1,1) == "-") {
dbType=system("cprogram '$directory'/"$10)
}
print $0 " " rc
}
'

awk is not shell, so you can't just use a shell variable inside an awk script; and in shell, always quote your variables. Try this:
directory="$1"
ls -i --full-time "$directory" | awk -v dir="$directory" '
{
rc = 0
if (substr($2,1,1) == "-") {
rc = system("cprogram \"" dir "/" $10 "\" >/dev/null")
}
print $0, rc
}
'
Oh and, of course, don't actually do this. See http://mywiki.wooledge.org/ParsingLs.
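The redirection inside the system() string is the key point: it runs in the shell that system() spawns, so the command's stdout is discarded while its exit status still comes back to awk. A minimal standalone check (the `echo noisy` here is just a stand-in for any chatty command):

```shell
#!/bin/sh
# The command's stdout is silenced by the redirection inside the string,
# but system() still returns its exit status to awk.
rc=$(echo | awk '{ r = system("echo noisy >/dev/null"); print r }')
echo "$rc"
```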
I just spent a minute thinking about what your script is actually doing. Rather than trying to use awk as a shell and parse the output of ls, it looks like the solution you REALLY want would be more like:
directory="$1"
find "$directory" -maxdepth 1 -type f -print |
while IFS= read -r dirFile
do
op=$(ls -i --full-time "$dirFile")
cprogram "$dirFile" >/dev/null
rc="$?"
printf "%s %s\n" "$op" "$rc"
done
and you could probably save a step by using the -printf arg for find to get whatever info you're currently using ls for.
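That might look something like the sketch below. It assumes GNU find (for -printf), assumes no spaces in file names, and uses `wc -c` purely as a stand-in for the user's cprogram, since only the exit status matters:

```shell
#!/bin/sh
# Sketch: GNU find prints the inode, permissions, size and timestamp itself,
# so the separate ls call (and parsing its output) goes away entirely.
list_with_rc() {
  find "$1" -maxdepth 1 -type f -printf '%i %M %s %T+ %p\n' |
  while IFS= read -r line; do
    path=${line##* }          # last field is the path (assumes no spaces)
    wc -c "$path" >/dev/null  # stand-in for: cprogram "$path" >/dev/null
    printf '%s %s\n' "$line" "$?"
  done
}
```

Calling `list_with_rc "$directory"` then prints one line per regular file with the exit status appended.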

Related

find pattern in multiple files and perform some action on them

I have 2 files - file1.txt and file2.txt.
I want to set a condition such that a command is run on both files only if the pattern "xyz" is present in both. Even if one file fails to have that pattern, the command shouldn't run. Also, I need to have both files passed to the grep or awk command at the same time, as I am using this code inside another workflow language.
I wrote some code with grep, but it performs the action even if the pattern is present in only one of the files, which is not what I want. Please let me know if there is a better way to do this.
if grep "xyz" file1.txt file2.txt; then
my_command file1.txt file2.txt
else
echo " command cannot be run on these files"
fi
Thanks!
This awk should work for you:
awk -v s='xyz' 'FNR == NR {
if ($0 ~ s) {
++p
nextfile
}
next
}
FNR == 1 {
if (!p) exit 1
}
{
if ($0 ~ s) {
++p
exit
}
}
END {
exit p < 2
}' file1 file2
This will exit with 0 if the given string is found in both files; otherwise it will exit with 1.
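For instance, on a hypothetical pair of throwaway files where "xyz" appears in both, the exit status is 0 and the success branch fires:

```shell
#!/bin/sh
# Demo: "xyz" is in both files, so the awk script exits 0.
printf 'abc\nxyz\n' > file1
printf 'def\nxyz\n' > file2
if awk -v s='xyz' '
  FNR == NR { if ($0 ~ s) { ++p; nextfile }; next }
  FNR == 1  { if (!p) exit 1 }
  { if ($0 ~ s) { ++p; exit } }
  END { exit p < 2 }
' file1 file2; then
  result="found in both"
else
  result="missing in at least one"
fi
echo "$result"
rm -f file1 file2
```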
Salvaging code from a deleted answer by Cyrus:
if grep -q "xyz" file1.txt && grep -q "xyz" file2.txt; then
echo "xyz was found in both files"
else
echo "xyz was found in one or no file"
fi
If you need to run a single command, save this as a script, and run that script in your condition.
#!/bin/sh
grep -q "xyz" "$1" && grep -q "xyz" "$2"
If you save this in a directory in your PATH and call it grepboth (don't forget to chmod a+x grepboth when you save it), your condition can now be written
grepboth file1.txt file2.txt
Or perhaps grepall to accept a search expression and a list of files;
#!/bin/sh
what=$1
shift
for file; do
grep -q "$what" "$file" || exit
done
This could be used as
grepall "xyz" file1.txt file2.txt
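To see the grepall logic in action without installing anything on PATH, the same loop can be inlined as a shell function (using return instead of exit, since it's a function here) and run against two throwaway files:

```shell
#!/bin/sh
# Same loop as the grepall script, as a function: succeed only if every
# listed file contains the pattern.
grepall() {
  what=$1
  shift
  for file; do
    grep -q "$what" "$file" || return 1
  done
}
printf 'xyz here\n' > f1.txt
printf 'no match\n' > f2.txt
grepall xyz f1.txt f1.txt && r1=yes || r1=no
grepall xyz f1.txt f2.txt && r2=yes || r2=no
echo "$r1 $r2"
rm -f f1.txt f2.txt
```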

Change a string using sed or awk

I have some files which have the wrong time and date, but the filename contains the correct time and date, so I'm trying to write a script to fix this with the touch command.
Example of filename:
071212_090537.jpg
I would like this to be converted to the following format:
1712120905.37
Note, the year is listed as 07 in the filename, even if it is 17 so I would like the first 0 to be changed to 1.
How can I do this using awk or sed?
I'm quite new to awk and sed, and programming in general. I have tried to search for a solution and instructions, but haven't managed to figure out how to solve this.
Can anyone help me?
Thanks. :)
Take your example:
awk -F'[_.]' '{$0=$1$2;sub(/^./,"1");sub(/..$/,".&")}1'<<<"071212_090537.jpg"
will output:
1712120905.37
If you want the file to be renamed, you can let awk generate the mv origin new command, and pipe the output to sh, like: (comments inline)
listYourFiles |                                   # list your files as input to awk
awk -F'[_.]' '{ o=$0; $0=$1$2; sub(/^./,"1"); sub(/..$/,".&")
                printf "mv %s %s\n", o, $0 }' |   # this will print "mv ori new"
sh                                                # this will execute the mv command
It's completely unnecessary to call awk or sed for this, you can do it in your shell. e.g. with bash:
$ f='071212_090537.jpg'
$ [[ $f =~ ^.(.*)_(.*)(..)\.[^.]+$ ]]
$ echo "1${BASH_REMATCH[1]}${BASH_REMATCH[2]}.${BASH_REMATCH[3]}"
1712120905.37
This is probably what you're trying to do:
for old in *.jpg; do
[[ $old =~ ^.(.*)_(.*)(..)\.[^.]+$ ]] || { printf 'Warning, unexpected old file name format "%s"\n' "$old" >&2; continue; }
new="1${BASH_REMATCH[1]}${BASH_REMATCH[2]}.${BASH_REMATCH[3]}"
[[ -f "$new" ]] && { printf 'Warning, new file name "%s" generated from "%s" already exists, skipping.\n' "$new" "$old" >&2; continue; }
mv -- "$old" "$new"
done
You need that test for new already existing, since an old of 071212_090537.jpg or 171212_090537.jpg (or various other values) would create the same new of 1712120905.37.
I think sed really is the easiest solution:
You could do this:
▶ for f in *.jpg ; do
new_f=$(sed -E 's/([0-9]{6})_([0-9]{4})([0-9]{2})\.jpg/\1\2.\3.jpg/' <<< "$f")
mv -- "$f" "$new_f"
done
For more info:
You probably need to read an introductory tutorial on regular expressions.
Note that the -E option to sed allows use of extended regular expressions, allowing a more readable and convenient expression here.
Use of <<< is a Bashism known as a "here-string". If you are using a shell that doesn't support it, A <<< "$b" can be rewritten as echo "$b" | A.
Testing:
▶ touch 071212_090538.jpg 071212_090539.jpg
▶ ls -1 *.jpg
071212_090538.jpg
071212_090539.jpg
▶ for f in *.jpg ; do
new_f=$(sed -E 's/([0-9]{6})_([0-9]{4})([0-9]{2})\.jpg/\1\2.\3.jpg/' <<< $f)
mv $f $new_f
done
▶ ls -1
0712120905.38.jpg
0712120905.39.jpg
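Note that the output above keeps the leading 0, whereas the question asks for it to become 1, and the actual goal is feeding touch -t. A sed variant covering both (a sketch; it assumes touch's [[CC]YY]MMDDhhmm[.SS] stamp format and a sed with -E, as used above):

```shell
#!/bin/sh
# Sketch: build the touch -t stamp from the name, flipping the leading 0 to 1.
f='071212_090537.jpg'
touch "$f"   # throwaway stand-in for the real photo
stamp=$(printf '%s\n' "$f" |
  sed -E 's/^0([0-9]{5})_([0-9]{4})([0-9]{2})\.jpg$/1\1\2.\3/')
touch -t "$stamp" "$f"   # mtime becomes 2017-12-12 09:05:37
echo "$stamp"
rm -f "$f"
```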

cat, grep & awk - both while read line & while read file in 1 loop?

Hi,
Thanks to a lot of searching on stackoverflow (great resource!) the last couple of days, I succeeded in this, and even solved a follow-up issue where the output doubled its lines every time I ran the command, thanks to an awk command that removes duplicate lines.
I'm pretty far in my search, but am missing 1 option.
Using both MacosX and linux by the way.
What I'm trying to do is parse through my notes (all plain text .md files), searching for words/tags listed in a text file (called greplist.txt), and writing matched lines to separate text files with the same name as the search word/tag (e.g. #computer.md).
A selection of the contents of greplist.txt:
#home
#computer
#Next
#Waiting
example contents of 2 .md files:
school.md:
* find lost schoolbooks #home
* do homework #computer
fun.md
* play videogame #computer
With this terminal command (which works great, but isn't perfect yet):
$ cat greplist.txt | while read line; do grep -h "$line" *.md >> $line.md.tmp; mv $line.md.tmp $line.md; awk '!x[$0]++' < $line.md > $line.md.tmp && mv $line.md.tmp $line.md ;done
Results
The result for #computer.md :
* do homework #computer
* play videogame #computer
And #home.md would look like this
* find lost schoolbooks #home
So far so great! Already really, really happy with this. Especially since, with the added moving/renaming of the files, it is also possible for me to add extra tasks/lines to the #tag .md files and have them included without being overwritten the next time I run the command. Awesomecakes!
Now the only thing I miss is that I wish the output in the #tag .md files would also list the source filename (without extension) in brackets behind each matched task (so that nvalt can use this as an internal link).
So the desired output of example #computer.md would become:
* do homework #computer [[school]]
* play videogame #computer [[fun]]
I tried playing around with -l and -H in the grep command instead of -h, but the output just gets messy somehow. (I haven't even tried adding the brackets yet!)
Another thing I tried was this, but it doesn't seem to do anything. It does, however, probably illustrate what I'm trying to accomplish.
$ cat greplist.txt | while read line; do grep -h "$line" *.md | while read filename; do echo "$filename" >> $line.md.tmp; mv $line.md.tmp $line.md; awk '!x[$0]++' < $line.md > $line.md.tmp && mv $line.md.tmp $line.md ;done
So the million Zimbabwean dollar question is: How to do this. I tried and tried, but this is above my skill level atm. Very eager to find out the solution!
Thanks in advance.
Daniel Dennis de Wit
The outlined solution seems like a fairly long-winded way to write the code. This script uses sed to write an awk script, and then runs awk so that it reads its program from standard input and applies it to all the .md files whose names don't start with #.
sed 's!.*!/&/ { name=FILENAME; sub(/\\.md$/, "", name); printf "%s [[%s]]\\n", $0, name > "&.md" }!' greplist.txt |
awk -f - [!#]*.md
The version of awk on Mac OS X will read its program from standard input; so will GNU awk. So the technique of writing the program to a pipe and having awk read it from that pipe works with those versions. If the worst comes to the worst, you'll have to save the output of sed into a temporary file, have awk read the program from the temporary file, and then remove the temporary file. It would be straightforward to replace the sed with awk, so you'd have one awk process writing an awk program and a second awk process executing it.
The generated awk code looks like:
/#home/ { name=FILENAME; sub(/\.md$/, "", name); printf "%s [[%s]]\n", $0, name > "#home.md" }
/#computer/ { name=FILENAME; sub(/\.md$/, "", name); printf "%s [[%s]]\n", $0, name > "#computer.md" }
/#Next/ { name=FILENAME; sub(/\.md$/, "", name); printf "%s [[%s]]\n", $0, name > "#Next.md" }
/#Waiting/ { name=FILENAME; sub(/\.md$/, "", name); printf "%s [[%s]]\n", $0, name > "#Waiting.md" }
The use of ! in the sed script is simply the choice of a character that doesn't appear in the generated script. Determining the basename of the file on each line is not 'efficient'; if your files are big enough, you can add a line such as:
{ if (FILENAME != oldname) { name = FILENAME; sub(/\.md$/, "", name); oldname = FILENAME } }
to the start of the awk script (how many ways can you think of to do that?). You can then drop the per-line setting of name.
Do not attempt to run the program on the #topic.md files; it leads to confusion.
Try this one:
grep -f greplist.txt *.md | awk ' match($0, /(.*).md:(.*)(#.*)/, vars) { print vars[2], "[[" vars[1] "]]" >> vars[3]".md.out"} '
What it does:
grep will output the lines in the .md files that match the patterns in greplist.txt:
fun.md:* play videogame #computer
school.md:* find lost schoolbooks #home
school.md:* do homework #computer
finally awk will move the file name to the back in the format you want and append each line to the corresponding #*.md.out file:
* play videogame #computer [[fun]]
* find lost schoolbooks #home [[school]]
* do homework #computer [[school]]
I added the .out on the file name so that the next time you execute the command it will not include the #* files.
Note that I'm not sure if the awk script will work on the Mac OS X awk.
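In case that's a concern, here is a sketch of a portable variant that avoids gawk's three-argument match() by splitting on the ".md:" prefix that grep adds (the demo files mirror the question's example, and the .out suffix is kept for the same reason as above):

```shell
#!/bin/sh
# Portable sketch: no gawk-only match(s, re, arr); works with POSIX awk.
printf '#home\n#computer\n' > greplist.txt
printf '* find lost schoolbooks #home\n* do homework #computer\n' > school.md
printf '* play videogame #computer\n' > fun.md
grep -f greplist.txt *.md | awk '
  {
    i = index($0, ".md:")              # boundary between filename and line
    name = substr($0, 1, i - 1)        # e.g. "fun"
    rest = substr($0, i + 4)           # e.g. "* play videogame #computer"
    tag = rest
    sub(/.*#/, "#", tag)               # keep the last #word on the line
    sub(/[ \t].*/, "", tag)
    print rest " [[" name "]]" >> (tag ".md.out")
  }'
```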

search for variable in multiple files within the same script

I have a script which reads every line of a file and outputs based on a certain match:
function tohyphen (o) {
split (o,a,"to[-_]")
split (a[2],b,"-")
if (b[1] ~ / /) { k=""; p=""; }
else { k=b[1]; p=b[2] }
if (p ~ / /) { p="" }
return k
}
print k, "is present in", FILENAME
What I need to do is check if the value of k is present in, say, about 60 other files and print those filenames, while ignoring the file it was originally reading. I'm currently doing this with grep, but calling grep so many times drives CPU usage up. Is there a way I can do this within the awk script itself?
You can try something like this with GNU awk:
gawk '/pattern to search/ { print FILENAME; nextfile }' *.files
You can replace your pipeline grep "$k" *.cfg | grep "something1" | grep "something2" | cut -d -f2,3,4 with the following single awk script:
awk -v k="$k" '$0 ~ k && /something1/ && /something2/ {print $2, $3, $4}' *.cfg
You mention printing the filename in your question, in this case:
awk -v k="$k" '$0 ~ k && /something1/ && /something2/ {print FILENAME; nextfile}' *.cfg
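To also honor the "ignore the file it was originally reading" requirement, the current file can be skipped inside awk by comparing FILENAME (a sketch, assuming an awk with nextfile as above; the file names and k value here are made up for the demo):

```shell
#!/bin/sh
# Sketch: report every file containing k, except the one it came from.
printf 'hello world\n' > a.cfg
printf 'nothing here\n' > b.cfg
printf 'world again\n'  > c.cfg
k='world' src='a.cfg'
found=$(awk -v k="$k" -v skip="$src" '
  FILENAME == skip { nextfile }
  $0 ~ k { print FILENAME; nextfile }
' *.cfg)
echo "$found"
rm -f a.cfg b.cfg c.cfg
```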

choose the newest file and use getline to read it

Having problems with a small awk script. I'm trying to choose the newest of some log files and then use getline to read it. The problem is that it doesn't work if I don't send the script some input first.
This works:
echo | myprog.awk
this does not:
myprog.awk
myprog.awk
BEGIN{
#find the newest file
command="ls -alrt | tail -1 | cut -c59-100"
command | getline logfile
close(command)
}
{
while((getline<logfile)>0){
#do the magic
print $0
}
}
Your problem is that while your program selects the logfile fine, the main { } block is executed once for every line of input, and you gave it no input file, so it defaults to standard input. I don't know awk very well myself, so I don't know how to change the input (if possible) from within an awk script; I would do this instead:
#! /bin/awk -f
BEGIN{
# find the newest file
command = "ls -1rt | tail -1 "
command | getline logfile
close(command)
while ((getline < logfile) > 0) {
# do the magic
print $0
}
}
or maybe
alias myprog.awk='awk "{ print \$0 }" "`ls -1rt | tail -1`"'
Again, this may be a little dirty. We'll wait for a better answer. :-)
Never parse ls. See this for the reason.
Why do you need to use getline? Let awk do the work for you.
#!/bin/bash
# get the newest file
files=(*) newest=${files[0]}
for f in "${files[@]}"; do
if [[ $f -nt $newest ]]; then
newest=$f
fi
done
# process it with awk
awk '{
# do the magic
print $0
}' "$newest"
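The newest-file selection can also be wrapped in a small function for reuse; a sketch (test's -nt operator is not strictly POSIX, but dash, bash and ksh all support it):

```shell
#!/bin/sh
# Sketch: pick the newest of the files given as arguments, no ls parsing.
newest_file() {
  newest=$1
  for f in "$@"; do
    if [ "$f" -nt "$newest" ]; then newest=$f; fi
  done
  printf '%s\n' "$newest"
}
```

With this, the awk step becomes e.g. `awk '{ print }' "$(newest_file *.log)"`.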