Using awk to print the word on the second column of the second line of a file only if it ends in `.local`?

I'd like to print the word on the second column of the second line of a file, only if it ends in .local.
How can I achieve this using awk?
Right now I have awk 'FNR==2{print $2}', but this prints the word no matter what.

Add a regex check on $2 to the condition:
awk 'FNR == 2 && $2 ~ /\.local$/ { print $2 }'
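A quick way to check this condition is to pipe a small sample into awk. The sketch below runs the same program on two inputs; the sample contents and host names are made up for illustration.

```shell
# Hypothetical sample: the second line's second column ends in .local
sample='id name
1 printer.local
2 server.example.com'

# Prints $2 of line 2 only when it ends in ".local"
match_out=$(printf '%s\n' "$sample" | awk 'FNR == 2 && $2 ~ /\.local$/ { print $2 }')

# Same program, but line 2 now holds a name that does not end in ".local"
nomatch_out=$(printf 'id name\n1 server.example.com\n' | awk 'FNR == 2 && $2 ~ /\.local$/ { print $2 }')

printf 'match: [%s]\nno match: [%s]\n' "$match_out" "$nomatch_out"
```

The dot is escaped so that it matches a literal `.`, and `$` anchors the match to the end of the field.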

Related

awk to extract days from line

I have the following csv file
238013750030646-2;;"Default";"2020-10-01 00:40:36";;"opening";0;3591911;283940640
238013750030646-2;;"Default";"2020-10-03 00:40:36";;"closing line";0;89320;283940640
238013750030646-2;;"something-else";"2020-10-04 00:40:36";;"started";0;0;283940640
238013750030646-2;;"default else";"2020-10-08 05:42:06";;"opening";0;2410;283940640
I'm trying to store each line in a file named after the day in its date. The date is in the 4th column, so the first line ("2020-10-01 00:40:36") should go to output-01.csv, the second line to output-03.csv, and so on.
This awk command
awk -F";|-" -vOFS='\t' '{print > "output-"$7".csv"}' testing.csv
half works but fails on line 3 because of the - in the 3rd column, and line 4 because of the in the 3rd column - this produces output-10.csv
Is there a way to run the awk command twice? Then I could extract the date using the ; separator and split it on -.
Using gawk (the three-argument match() is a GNU extension); this also handles an unsorted file:
awk 'match($0,/([0-9]{4})-([0-9]{2})-([0-9]{2})/,arr){
file=sprintf("output-%s.csv",arr[3]);
if(!seen[file]++){
print >file;
next
}
}{
print >>file;
close(file);
}' infile
Explanation:
awk 'match($0,/([0-9]{4})-([0-9]{2})-([0-9]{2})/,arr){ # match for regex
file=sprintf("output-%s.csv",arr[3]); # file variable using array arr value, 3rd index
if(!seen[file]++){ # if not seen file name before in array seen
print >file; # print content to file
next # go to next line
}
}{
print >>file; # append content to file
close(file); # close file
}' infile
Try this:
$ awk -F';' -v OFS='\t' '{split($4,a,/[- ]/); file = "output-"a[3]".csv";
$1=$1; print > file; close(file)}' testing.csv
split($4,a,/[- ]/) this will split 4th field further based on space or - characters, saved in array a
file = "output-"a[3]".csv" output filename
$1=$1 since there's no other command changing contents of input line, this is needed to rebuild input line, otherwise OFS will not be applied
print > file print input line to required file
close(file) calling close, useful if there are too many file names
You can also use file = "output-" substr($4,10,2) ".csv" instead of split if the 4th column is consistent as shown in the sample.
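The substr variant can be tried end to end with a sketch like the one below. It assumes the 4th column always has the quoted "YYYY-MM-DD hh:mm:ss" layout from the sample, writes into a temporary directory (the `workdir` and `dir` names are hypothetical), and leaves each line unchanged instead of rebuilding it with tabs. The third input row is made up so that one day occurs twice.

```shell
# Throwaway directory so the output-*.csv files do not clutter anything
workdir=$(mktemp -d)

# Two sample rows from the question plus a made-up second row for day 01
cat > "$workdir/testing.csv" <<'EOF'
238013750030646-2;;"Default";"2020-10-01 00:40:36";;"opening";0;3591911;283940640
238013750030646-2;;"something-else";"2020-10-04 00:40:36";;"started";0;0;283940640
238013750030646-2;;"default else";"2020-10-01 05:42:06";;"opening";0;2410;283940640
EOF

# Split only on ";" so "-" and spaces in other columns cannot shift the fields.
# $4 includes its opening quote, so the day starts at character 10.
awk -F';' -v dir="$workdir" '{
    file = dir "/output-" substr($4, 10, 2) ".csv"
    print >> file      # append, because a day can appear more than once
    close(file)        # keep the number of open files small
}' "$workdir/testing.csv"

day01_lines=$(wc -l < "$workdir/output-01.csv" | tr -d ' ')
day04_lines=$(wc -l < "$workdir/output-04.csv" | tr -d ' ')
echo "output-01.csv: $day01_lines line(s), output-04.csv: $day04_lines line(s)"
```

Because the files are opened in append mode, rerunning the command against the same directory adds the rows again; remove the old output files first if that matters.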
With your shown samples, please try following, written and tested in GNU awk.
awk '
match($0,/[0-9]{4}(-[0-9]{2}){2}/){
outputFile="output-" substr($0,RSTART+8,RLENGTH-8) ".csv"
print >> (outputFile)
close(outputFile)
}
' Input_file
Explanation: Adding detailed explanation for above.
awk ' ##Starting awk program from here.
match($0,/[0-9]{4}(-[0-9]{2}){2}/){ ##using match function to match yyyy-mm-dd here in line.
outputFile="output-" substr($0,RSTART+8,RLENGTH-8) ".csv" ##Building outputFile name from the day part (dd) of the matched date.
print >> (outputFile) ##Printing current line into outputFile here.
close(outputFile) ##Closing output file to avoid too many files opened error.
}
' Input_file ##Mentioning Input_file name here.
To do this efficiently you should sort on the key field first:
awk -F';' '{print $4, NR, $0}' file |
sort -k1,1 -k3,3n |
awk '
{ curr=$1; sub(/([^ ]+ ){3}/,"") }
curr != prev { close(out); out="output-" (++c) ".csv"; prev=curr }
{ print > out }
'
$ head output*.csv
==> output-1.csv <==
238013750030646-2;;"Default";"2020-10-01 00:40:36";;"opening";0;3591911;283940640
==> output-2.csv <==
238013750030646-2;;"Default";"2020-10-03 00:40:36";;"closing line";0;89320;283940640
==> output-3.csv <==
238013750030646-2;;"something-else";"2020-10-04 00:40:36";;"started";0;0;283940640
==> output-4.csv <==
238013750030646-2;;"default else";"2020-10-08 05:42:06";;"opening";0;2410;283940640
The above will work using any awk+sort in any shell on every Unix box. See the many similar examples on this site for an explanation.

why does my awk code that should print word only on specific condition actually prints for all lines?

I am trying to learn awk. As a simple exercise, I want to print the third word of each line whose first word is PERMNO and ignore every other line.
the code I am using is
awk '{if ($1 = "PERMNO"){ print $3}}' ddoutput.txt
Right now this prints third word from every line. But I expect it to print only the third word when the first word of line is PERMNO. What am I missing?
With $1 = "PERMNO" you're assigning PERMNO to first field, which always evaluates to true. You should use the == operator like:
awk '{if($1=="PERMNO"){print $3}}' file
Or more awkish:
awk '$1=="PERMNO"{print $3}' file
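The difference between the two operators is easy to see side by side. In this sketch the three input lines are a made-up stand-in for ddoutput.txt.

```shell
# Hypothetical stand-in for ddoutput.txt
data='PERMNO x 10001
DATE x 20200101
PERMNO x 10002'

# Assignment: $1 = "PERMNO" overwrites the field and is always true
assign_out=$(printf '%s\n' "$data" | awk '{ if ($1 = "PERMNO") print $3 }')

# Comparison: only lines whose first field is PERMNO are printed
compare_out=$(printf '%s\n' "$data" | awk '$1 == "PERMNO" { print $3 }')

printf 'with =:\n%s\nwith ==:\n%s\n' "$assign_out" "$compare_out"
```

With `=` all three third fields come out, including the one from the DATE line; with `==` only the two PERMNO lines are printed.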

awk to parse field if specific value is in another

In the awk below I am trying to split $2 on the _ only if $3 is a specific value (ID). I am reading that parsed value into an array and will use it as a key in a lookup. The awk does execute, but the entire line 2 (the line with ID in $3) is printed, not just the desired part. The print statement is only there to inspect the result while testing and will not be part of the script. Thank you :).
awk
awk -F'\t' '$3=="ID"
f="$(echo $2|cut -d_ -f1,1)"
{
print $f
}' file
file tab-delimited
R_Index locus type
17 chr20:31022959 NON
18 chr11:118353210-chr9:20354877_KMT2A-MLLT3.K8M9 ID
desired
$f = chr11:118353210-chr9:20354877
The question is not completely clear, but please try the following.
awk '{split($2,array,"_");if(array[2]=="KMT2A-MLLT3.K8M9"){print array[1]}}' Input_file
Or, if you want to change the 2nd field's value and still print all lines, try the following.
awk '{split($2,array,"_");if(array[2]=="KMT2A-MLLT3.K8M9"){$2=array[1]}} 1' Input_file
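If the intent is to key on $3 being ID, as the question describes, rather than on the text after the underscore, a minimal sketch could look like this. It assumes the file really is tab-delimited; the array name `part` is arbitrary.

```shell
# Tab-delimited sample modeled on the question's file
input=$(printf 'R_Index\tlocus\ttype\n17\tchr20:31022959\tNON\n18\tchr11:118353210-chr9:20354877_KMT2A-MLLT3.K8M9\tID\n')

# For rows whose third field is ID, split $2 on "_" and keep the first piece
parsed=$(printf '%s\n' "$input" | awk -F'\t' '$3 == "ID" { split($2, part, "_"); print part[1] }')

echo "$parsed"
```

Doing the split inside awk avoids the shell-style `$(echo ... | cut ...)` from the original attempt, which awk does not interpret as a command substitution.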

AWK - get value between two strings over multiple lines

input.txt:
>block1
111111111111111111111
>block2
222222222222222222222
>block3
333333333333333333333
AWK command:
awk '/>block2.*>/' input.txt
Expected output
222222222222222222222
However, AWK is returning nothing. What am I misunderstanding?
Thanks!
If you want to print the line after the line containing >block2, then you could use:
awk '/^>block2$/ { nr=NR+1 } NR == nr { print }'
Track the record number plus 1 when you find the match; when the current record number matches the remembered one, print the current record.
If you want all the lines between the line >block2 and >block3, then you'd use:
awk '/^>block2$/,/^>block3/ {if ($0 !~ /^>block[23]$/) print }'
For all lines between the two markers, if the line doesn't match either marker, print it. The output is the same with the sample data file.
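The original command prints nothing because awk tests the pattern against one line at a time, and no single line contains >block2 followed by another >. The sketch below runs the original attempt and both commands above on the question's sample.

```shell
# Sample from the question
input='>block1
111111111111111111111
>block2
222222222222222222222
>block3
333333333333333333333'

# Original attempt: no single line matches, so the result is empty
orig_out=$(printf '%s\n' "$input" | awk '/>block2.*>/')

# Line right after the marker
next_out=$(printf '%s\n' "$input" | awk '/^>block2$/ { nr = NR + 1 } NR == nr { print }')

# Everything between the two markers, markers excluded
range_out=$(printf '%s\n' "$input" | awk '/^>block2$/,/^>block3/ { if ($0 !~ /^>block[23]$/) print }')

printf 'original: [%s]\nnext line: [%s]\nrange: [%s]\n' "$orig_out" "$next_out" "$range_out"
```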
another awk
$ awk 'c&&c--; /^>block2/{c=1}' file
222222222222222222222
c specifies how many lines you want to print after the match. If you want the text between two markers:
$ awk '/^>block3/{exit} s; /^>block2/{s=1}' file
222222222222222222222
If there are multiple instances and you want them all, just change exit to s=0.
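As a sketch of the s=0 variant, here is a made-up input in which >block2 occurs twice; both sections are printed and the unrelated line in between is skipped.

```shell
# Made-up input in which >block2 occurs twice
input='>block2
aaa
>block3
skip
>block2
bbb
>block3'

# s is set after a >block2 line and cleared on >block3, so every
# block2..block3 section is printed, not only the first one
all_out=$(printf '%s\n' "$input" | awk '/^>block3/ { s = 0 } s; /^>block2/ { s = 1 }')

printf '%s\n' "$all_out"
```

The order of the three rules matters: the flag is cleared before the print rule and set after it, so neither marker line is printed.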
You probably meant:
$ awk '/>/{f=/^>block2$/;next} f' file
222222222222222222222

AWK -Print the next to last field of each line of input file

I have an input file whose content is constantly updated, with a varying number of fields per line.
What I am trying to do is print the next-to-last field of each line of the input file
to a new file:
awk '{print $(NF-1)}' outputfile
error:
and
awk: (FILENAME=- FNR=2) fatal: attempt to access field -1
Need help. Thanks in advance
On lines with no fields (blank lines, or all whitespace) NF is 0, so that evaluates to $(-1). Also if there's only one field your code will print $0 which may not be what you want.
awk 'NF>=2 {print $(NF-1)}'
Should be awk 'NF > 1 { print $(NF - 1); }'
awk 'NF { print $(NF - 1) }' is not correct. When NF == 1 it'll print $0 which is not next to the last field.
Another awk one-liner (golfing a bit):
awk 'NF>1&&$0=$(NF-1)'
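One caveat with the golfed form is that the assignment doubles as the condition, so a line whose next-to-last field is 0 or empty is silently dropped. The sketch below compares it with the guarded version on made-up input; the golfed program is parenthesized here only for readability.

```shell
# Made-up input: three fields, one field, a blank line, and a line whose
# next-to-last field is 0
input='a b c
single

x 0 z'

# Guarded version: lines with fewer than two fields are skipped
guarded=$(printf '%s\n' "$input" | awk 'NF > 1 { print $(NF - 1) }')

# Golfed version: the value assigned to $0 is also the truth value of the
# pattern, so the field "0" counts as false and that line is not printed
golfed=$(printf '%s\n' "$input" | awk 'NF > 1 && ($0 = $(NF - 1))')

printf 'guarded:\n%s\ngolfed:\n%s\n' "$guarded" "$golfed"
```

The guarded version prints both b and 0, while the golfed one prints only b, so the explicit print is the safer choice when fields can be numeric.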