Reading from file -- awk - awk

I would like to read a file like this
1.23213213
0.12321321
-1.12321321
0.23232322
into a variable, or array to use it somewhere in the main {} code.
But I would like to use it if this file exists. How can I check if it already exists or not, and if not, then do not use that variable or array?

I don't understand completely what you want to achieve, but perhaps something like this can be useful to you:
It process the file line by line and saves each one in an array, the key is the line number so you keep the order. In the END section check how many lines were processed and get if the file had content.
awk '{ line[ FNR ] = $0 } END { if ( FNR > 0 ) { print "File" } else { print "NO file" } }' infile
EDIT to comment:
But in awk you can process many files from command line.
BEGIN {
...
}
## Processing of first file in command line.
FNR == NR {
a[ FNR ] = $0
next
}
## Processing of second file in command line
FNR < NR {
## Check if array 'a' has the values you want and use them
## 'for(...)variable += a[i]' or whatever.
}
Run script like:
awk -f script.awk first_file.txt second_file.txt
But if first_file.txt doesn't exists, awk will complain with an error.

Related

In AWK, skip the rest of the current action?

Thanks for looking.
I have an AWK script with something like this;
/^test/{
if ($2 == "2") {
# What goes here?
}
# Do some more stuff with lines that match test, but $2 != "2".
}
NR>1 {
print $0
}
I'd like to skip the rest of the action, but process the rest of the patterns/actions on the same line.
I've tried return but this isn't a function.
I've tried next but that skips the rest of the patterns/actions for the current line.
For now I've wrapped the rest of the ^test action in the if statement's else, but I was wondering if there was a better approach.
Not sure this matters but I am using gawk on OSX, installed via brew (for better compatibility with my target OS).
Update (w/solution):
Edits: Expanded code sample based on #karakfa's answer.
BEGIN{
keepLastLine = 1;
}
/^test/ && !keepLastLine{
printLine = 1;
print $0;
next;
}
/^test/ && keepLastLine{
printLine = 0;
next;
}
/^foo/{
# This is where I have the rest of my logic (approx 100 lines),
# including updates to printLine and keepLastLine
}
NR>1 {
if (printLine) {
print $0
}
}
This will work for me, I even like it better that what I was thinking of.
However I do wonder what if my keepLastLine condition was only accessible in a for loop?
I gather from what #karakfa has said, there isn't a control structure for exiting only an action, and continuing with other patterns, so that would have to be implemented with a flag of some sort (not unlike #RavinderSingh13's answer).
If I got it correct could you please try following. I am creating a variable named flag here which will be chedked if condition inside test block for checking if 2nd field is 2 is TRUE then it will be SET. When it is SET so rest of statements in test BLOCK will NOT be executed. Also resetting flag's value before read starts for a line too.
awk '
{
found=""
}
/^test/{
if ($2 == "2") {
# What goes here?
found=1
}
if(!found){
# Do some more stuff with lines that match test, but $2 != "2".
}
}
NR>1 {
print $0
}' Input_file
Testing of code here:
Let's say following is the Input_file:
cat Input_file
file
test 2 file
test
abcd
After running code following we will get following output, where if any line is having test keyword and NOT having $2==2 then also it will execute statements outside of test condition.
awk '
{
found=""
}
/^test/{
if ($2 == "2") {
print "# What goes here?"
found=1
}
if(!found){
print "Do some more stuff with lines that match test, but $2 != 2"
}
}
NR>1 {
print $0
}' Input_file
# What goes here?
test 2 file
Do some more stuff with lines that match test, but $2 != 2
test
abcd
the magic keyword you're looking for is else
/^test/{ if($2==2) { } # do something
else { } # do something else
}
NR>1 # {print $0} is implied.
for some reason if you don't want to use else just move up condition one up (flatten the hierarchy)
/^test/ && $2==2 { } # do something
/^test/ && $2!=2 { } # do something else
# other action{statement}s

I have three text file and I want to merge (print) them in one file. using awk programme

I have three text file and I want to merge (print) them in one file. using awk programme. I used the following code to print or call two different text file, and it is work perfectly. but if I have three or four text file it does not work. any idea, help
BEGIN { #1 text file
} # This line is closing the BEGIN
{
if (FNR != NR)
print $0
}
END {
print ""
} # Closing END
BEGIN { # 2 text file
} # This line is closing the BEGIN
{
if (FNR == NR)
print $0
}
END {
you don't need awk for this, cat is the right tool
$ cat file1 file2 file3 > mergedfile
but, of course awk will do as well
$ awk 1 file1 file2 file3 > mergedfile

Awk: I want to use the input filename to generate an output file with same name different extension

I have a script that looks like this:
#! /bin/awk -f
BEGIN { print "start" }
{ print $0 }
END { print "end" }
Call the script like this: ./myscript.awk test.txt
Pretty simple - takes a file and adds "start" to the start and "end" to the end.
Now I want to take the input filename, lets call it test.txt, and print the output to a file called test.out.
So I tried to print the input filename:
BEGIN { print "fname: '" FILENAME "'" }
But that printed: fname: '' :(
The rest I can figure out I think, I have this following to print to a hard-coded filename:
#! /bin/awk -f
BEGIN { print "start" > "test.out" }
{ print $0 >> "test.out" }
END { print "end" >> "test.out" }
And that works great.
So the questions are:
how do I get the input filename?
Assuming somehow I get the input file name in a variable, e.g. FILENAME which contains "test.txt" how would I make another variable, e.g. OUTFILE, which contains "test.out"?
Note: I will be doing much more awk processing so please don't suggest to use sed or other languages :))
Try something like this:
#! /bin/awk -f
BEGIN {
file = gensub(".txt",".out","g",ARGV[1])
print "start" > file
}
{ print $0 >> file }
END {
print "end" >> file
close(file)
}
I'd suggest to close() the file too in the END{} statement. Good call to Sundeep for pointing out that FILENAME is empty in BEGIN.
$ echo 'foo' > ip.txt
$ awk 'NR==1{op=FILENAME; sub(/\.[^.]+$/, ".log", op); print "start" > op}
{print > op}
END{print "end" > op}' ip.txt
$ cat ip.log
start
foo
end
Save FILENAME to a variable, change the extension using sub and then print as required
From gawk manual
Inside a BEGIN rule, the value of FILENAME is "", because there are no input files being processed yet
If you're using GNU awk (gawk), you can use the patterns BEGINFILE and ENDFILE
awk 'BEGINFILE{
outfile=FILENAME;
sub(".txt",".out",outfile);
print "start" > outfile
}
ENDFILE{
print "stop" >outfile
}' file1.txt file2.txt
You can then use the variable outfile your the main {...} loop.
Doing so will allow you to process more that 1 file in a single awk command.

Insert a line at the end of an ini section only if it doesn't exist

I have an smb.conf ini file which is overwritten whenever edited with a certain GUI tool, wiping out a custom setting. This means I need a cron job to ensure that one particular section in the file contains a certain option=value pair, and insert it at the end of the section if it doesn't exist.
Example
Ensure that hosts deny=192.168.23. exists within the [myshare] section:
[global]
printcap name = cups
winbind enum groups = yes
security = user
[myshare]
path=/mnt/myshare
browseable=yes
enable recycle bin=no
writeable=yes
hosts deny=192.168.23.
[Another Share]
invalid users=nobody,nobody
valid users=nobody,nobody
path=/mnt/share2
browseable=no
Long-winded solution using awk
After a long time struggling with sed, I concluded that it might not be the right tool for the job. So I moved over to awk and came up with this:
#!/bin/sh
file="smb.conf"
tmp="smb.conf.tmp"
section="myshare"
opt="hosts deny=192.168.23."
awk '
BEGIN {
this_section=0;
opt_found=0;
}
# Match the line where our section begins
/^[ \t]*\['"$section"'\][ \t]*$/ {
this_section=1;
print $0;
next;
}
# Match lines containing our option
this_section == 1 && /^[ \t]*'"$opt"'[ \t]*$/ {
opt_found=1;
}
# Match the following section heading
this_section == 1 && /^[ \t]*\[.*$/ {
this_section=0;
if (opt_found != 1) {
print "\t'"$opt"'";
}
}
# Print every line
{ print $0; }
END {
# In case our section is the very last in the file
if (this_section == 1 && opt_found != 1) {
print "\t'"$opt"'";
}
}
' $file > $tmp
# Overwrite $file only if $tmp is different
diff -q $file $tmp > /dev/null 2>&1
if [ $? -ne 0 ]; then
mv $tmp $file
# reload smb.conf here
else
rm $tmp
fi
I can't help feeling that this is a long script to achieve a simple task. Is there a more efficient/elegant way to insert a property in an ini file using basic shell tools like sed and awk?
Consider using Python 3's configparser:
#!/usr/bin/python3
import sys
from configparser import SafeConfigParser
cfg = SafeConfigParser()
cfg.read(sys.argv[1])
cfg['myshare']['hosts deny'] = '192.168.23.';
with open(sys.argv[1], 'w') as f:
cfg.write(f)
To be called as ./filename.py smb.conf (i.e., the first parameter is the file to change).
Note that comments are not preserved by this. However, since a GUI overwrites the config and doesn't preserve custom options, I suspect that comments are already nuked and that this is not a worry in your case.
Untested, should work though
awk -vT="hosts deny=192.168.23" 'x&&$0~T{x=0}x&&/^ *\[[^]]+\]/{print "\t\t"T;x=0}
/^ *\[myshare\]/{x++}1' file
This solution is a bit awkward. It uses the INI section header as the record separator. This means that there is an empty record before the first header, so when we match the header we're interested in, we have to read the next record to handle that INI section. Also, there are some printf commands because the records still contain leading and trailing newlines.
awk -v RS='[[][^]]+[]]' -v str="hosts deny=192.168.23." '
{printf "%s", $0; printf "%s", RT}
RT == "[myshare]" {
getline
printf "%s", $0
if (index($0, str) == 0) print str
printf "%s", RT
}
' smb.conf
RS is the awk variable that contains the regex to split the text into records.
RT is the awk variable that contains the actual text of the current record separator.
With GNU awk for a couple of extensions:
$ cat tst.awk
index($0,str) { found = 1 }
match($0,/^\s*\[([^]]+).*/,a) {
if ( (name == tgt) && !found ) { print indent str }
name = a[1]
found = 0
}
{ print; indent=gensub(/\S.*/,"","") }
.
$ awk -v tgt="myshare" -v str="hosts deny=192.168.23." -f tst.awk file
[global]
printcap name = cups
winbind enum groups = yes
security = user
[myshare]
path=/mnt/myshare
browseable=yes
enable recycle bin=no
writeable=yes
hosts deny=192.168.23.
[Another Share]
invalid users=nobody,nobody
valid users=nobody,nobody
path=/mnt/share2
browseable=no
.
$ awk -v tgt="myshare" -v str="fluffy bunny" -f tst.awk file
[global]
printcap name = cups
winbind enum groups = yes
security = user
[myshare]
path=/mnt/myshare
browseable=yes
enable recycle bin=no
writeable=yes
hosts deny=192.168.23.
fluffy bunny
[Another Share]
invalid users=nobody,nobody
valid users=nobody,nobody
path=/mnt/share2
browseable=no

How to append lines to a new file with AWK

I am trying to append lines to some new files with awk in this way:
#!/usr/bin/awk -f
BEGIN {
FS = "[ \t|]"; }
{
print $5 "\t" $13 "\t" $14 >> "./bed/" $5 ".bed";
}
END {
}
New file is created with filename derived from a field of awk input file (5th field). I am unable to execute this script since it fails with
awk: ./blast2bed.awk:6: (FILENAME=blastout000 FNR=1) fatal: can't redirect to `./bed/AY517392.1.bed' (No such file or directory)
Any hints?
Thanks
The directory bed has to exist so create it first with mkdir bed either before you run your script or in the BEGIN block. You should also add brackets around the output file:
print $5"\t"$13"\t"$14 >> ("./bed/"$5".bed")
Notes: You don't need to end lines with ; if you have a single statement per line and the BEGIN and END blocks are optional.