How can a variable in an awk command be used to select a column of a file in an if condition?
For example, say I want to read column 2 of the sample file below, where the shell variable fcolumn1 holds the value 2, startdate holds 2014-09-22 00:00:00, and enddate holds 2014-09-23 00:00:00.
abcd,2016-04-23 02:35:34,sdfsdfsd
sdsd,2016-04-22 02:35:34,sdfsdfsd
Below command works:
awk -v startdate="$startdate" -v enddate="$enddate" -F"," '
{
if ($2>=startdate && $2<enddate)
{
print $2
}
}'
The expectation is to make $2 dynamic, as below:
awk -v startdate="$startdate" -v enddate="$enddate" -v "fcolumn1=${fcolumn1}" -F"," '
{
if (fcolumn1 != "")
{
if ($fcolumn1>=startdate && $fcolumn1<enddate)
{
print $fcolumn1
}
}
}'
First, the if block is superfluous, since awk programs follow this (simplified) structure:
CONDITION { ACTIONS } CONDITION {ACTIONS} ...
You can write the condition without the if statement:
awk '$2>=startdate && $2<enddate { print $2 }' file
If you want to make the actual column number configurable via a variable, note that you can address a column using a variable in awk, like this:
awk -v col=2 '{print $col}'
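Putting this together for the question's command, a sketch might look like the following. The file name is hypothetical, and the date range is adjusted here to values that actually bracket the 2016 sample lines (the question's 2014 dates would match nothing):

```shell
# Hypothetical sample file from the question
printf '%s\n' 'abcd,2016-04-23 02:35:34,sdfsdfsd' \
              'sdsd,2016-04-22 02:35:34,sdfsdfsd' > sample.csv

# Assumed values; the dates are chosen so that one line matches
startdate='2016-04-22 00:00:00'
enddate='2016-04-23 00:00:00'
fcolumn1=2

# Pass the column number as a plain awk variable and index with $col
awk -v startdate="$startdate" -v enddate="$enddate" \
    -v col="$fcolumn1" -F',' '
    col != "" && $col >= startdate && $col < enddate { print $col }
' sample.csv
# prints: 2016-04-22 02:35:34
```

Because both sides of the comparison are strings in a fixed-width date format, awk's lexical string comparison gives the right chronological order.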
I have a file "test"
Below is the content
235788###235788###20200724_103122###SUCCESS
235791###235791###20200724_105934###SUCCESS
235833###235833###20200724_130652###FAILURE
235842###235842###20200724_132721###FAILURE
235852###235852###20200724_134607###FAILURE
235791###235791###20200724_105934###SUCCESS
If the last line of this file begins with 235791, then replace the string "SUCCESS" with "FAILURE" on just that line.
Expected Output
235788###235788###20200724_103122###SUCCESS
235791###235791###20200724_105934###SUCCESS
235833###235833###20200724_130652###FAILURE
235842###235842###20200724_132721###FAILURE
235852###235852###20200724_134607###FAILURE
235791###235791###20200724_105934###FAILURE
Below is the sample code
id=235791
last_build_id=$(tail -1 test | awk -F'###' '{print $1}')
if [ "$id" = "$last_build_id" ]; then
    sed -i '$s/SUCCESS/FAILURE/' test
fi
I would like to avoid these many lines and use one line shell command using regex groups or in any other simple way.
sed might be easier here
$ sed -E '$s/(^235791#.*)SUCCESS$/\1FAILURE/' file
You can add -i for an in-place update.
To pass id as a variable
$ id=235791; sed -E '$s/(^'$id'#.*)SUCCESS$/\1FAILURE/' file
Ideally you should double-quote "$id", but if you're sure about its contents you may get away without it.
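For example, a fully quoted version (with a hypothetical two-line file in the question's format) would be:

```shell
# Hypothetical sample data shaped like the question's file
printf '%s\n' '235788###a###SUCCESS' \
              '235791###b###SUCCESS' > build.log
id=235791
# "$id" is double-quoted; the rest of the script stays single-quoted
sed -E '$s/(^'"$id"'#.*)SUCCESS$/\1FAILURE/' build.log
# last line becomes: 235791###b###FAILURE
```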
With GNU sed
sed -E '${/^235791\>/ s/SUCCESS$/FAILURE/}' file
Or with the BSD sed on MacOS
sed -E '${/^235791#/ s/SUCCESS$/FAILURE/;}' file
When working with "the last X in the file", it's often easier to reverse the file and work with "the first X":
tac file | awk '
BEGIN {FS = OFS = "###"}
NR == 1 && $1 == 235791 && $NF == "SUCCESS" {$NF = "FAILURE"}
1
' | tac
You could try the following, written and tested with the shown samples in GNU awk. You don't need many commands for this one; it can be done in a single awk invocation.
One-liner form of the code:
awk -v id="$your_shell_variable" 'BEGIN{ FS=OFS="###" } NR>1{print prev} {prev=$0} END{if($1==id && $NF=="SUCCESS"){$NF="FAILURE"}; print}' Input_file > temp && mv temp Input_file
Explanation: a detailed explanation of the above.
awk -v id="$your_shell_variable" ' ##Starting awk program from here.
BEGIN{ FS=OFS="###" } ##Setting input and output field separators to ###.
NR>1{ ##From the 2nd line onwards, do following.
print prev ##Printing prev here.
}
{
prev=$0 ##Assigning current line to prev here.
}
END{ ##Starting END block of this program from here.
if($1==id && $NF=="SUCCESS"){ ##Checking condition: if first field equals id and last field is SUCCESS then do following.
$NF="FAILURE" ##Setting last field FAILURE here.
}
print ##Printing last line here.
}
' Input_file > temp && mv temp Input_file ##Mentioning Input_file name here.
2nd solution: As per Ed's comment, some awks don't support $1 and $NF in the END section, so if the above doesn't work for you, please try the more generic solution that follows.
One-liner form of the solution (since it was specifically asked for):
awk -v id="$your_shell_variable" 'BEGIN{ FS=OFS="###" } NR>1{print prev} {prev=$0} END{num=split(prev,array,"###");if(array[1]==id && array[num]=="SUCCESS"){array[num]="FAILURE"};for(i=1;i<=num;i++){val=(val?val OFS:"")array[i]};print val}' Input_file > temp && mv temp Input_file
Detailed (non-one-liner) form:
awk -v id="$your_shell_variable" '
BEGIN{ FS=OFS="###" }
NR>1{
print prev
}
{
prev=$0
}
END{
num=split(prev,array,"###")
if(array[1]==id && array[num]=="SUCCESS"){
array[num]="FAILURE"
}
for(i=1;i<=num;i++){
val=(val?val OFS:"")array[i]
}
print val
}
' Input_file > temp && mv temp Input_file
$ awk -v val='235791' '
BEGIN { FS=OFS="###" }
NR>1 { print prev }
{ prev=$0 }
END {
$0=prev
if ($1 == val) {
$NF="FAILURE"
}
print
}
' file
235788###235788###20200724_103122###SUCCESS
235791###235791###20200724_105934###SUCCESS
235833###235833###20200724_130652###FAILURE
235842###235842###20200724_132721###FAILURE
235852###235852###20200724_134607###FAILURE
235791###235791###20200724_105934###FAILURE
I have two files - FileA and FileB. FileA has 10 fields with 100 lines. If Field1 and Field2 match, Field3 should be changed. FileB has 3 fields. I am reading in FileB with a while loop to match the two fields and to get the value that should be use for field 3.
while IFS=$'\t' read hostname interface metric; do
awk -v var1=${hostname} -v var2=${interface} -v var3=${metric} '{if ($1 ~ var1 && $2 ~ var2) $3=var3; print $0}' OFS="\t" FileA.txt
done < FileB.txt
At each iteration, this prints all of FileA.txt, including the single line that changed. I only want it to print the line that was changed.
Please Help!
It's a smell to be calling awk once for each line of file B. You should be able to accomplish this task with a single pass through each file.
Try something like this:
awk -F'\t' -v OFS='\t' '
# first, read in data from file B
NR == FNR { values[$1 FS $2] = $3; next }
# then, output modified lines from matching lines in file A
($1 FS $2) in values { $3 = values[$1 FS $2]; print }
' fileB fileA
I'm assuming that you actually want to match with string equality instead of ~ pattern matching.
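A quick sanity check of the one-pass approach, using tiny hypothetical stand-ins for the two files:

```shell
# Miniature fileB (lookup: host, interface, metric) and fileA (data)
printf 'host1\teth0\t10\n' > mini_fileB
printf 'host1\teth0\t99\tup\nhost2\teth1\t5\tdown\n' > mini_fileA

awk -F'\t' -v OFS='\t' '
NR == FNR { values[$1 FS $2] = $3; next }
($1 FS $2) in values { $3 = values[$1 FS $2]; print }
' mini_fileB mini_fileA
# prints only the matched line, with field 3 replaced:
# host1   eth0    10      up
```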
I only want it to print the line that was changed.
Simply put your print $0 statement inside the if clause body:
'{if ($1 ~ var1 && $2 ~ var2) { $3=var3; print $0 }}'
or even shorter:
'$1~var1 && $2~var2{ $3=var3; print $0 }'
I have a text file that is comma delimited. The first line is a list of field names, and subsequent lines contain data. I'll get new versions of the file, and I want to extract all the values from a particular column by name rather than by column number. (I.e. the column I want may be in different positions in different versions of the file.)
For example, here are two files:
foo,bar,interesting,junk
1,2,gold,ramjet
2,25,diamonds,superfluous
and
foo,bar,baz,interesting,junk,morejunk
5,3,smurf,platinum,garbage,scrap
6,2.5,mushroom,sodium,liverwurst,eew
I'd like a single script that will go through multiple files, extracting the minerals in the "interesting" column. :-)
What I've got so far is something that works on ONE file, but I know that awk is more elegant than this. How do I clean this up and make it work on multiple files at once?
BEGIN {
FS=",";
}
NR == 1 {
for(i=1; i<=NF; i++) {
if($i=="interesting") {
col=i;
}
}
}
NR > 1 {
print $col;
}
You're pretty darn close already. Just use FNR instead of NR, for "File NR".
#!/usr/bin/awk -f
BEGIN { FS="," }
FNR==1 {
for (col=1;col<=NF;col++)
if ($col=="interesting")
next
}
{ print $col }
Or if you like:
#!/usr/bin/awk -f
BEGIN { FS="," }
FNR==1 { for (col=1;$col!="interesting";col++); next }
{ print $col }
Or if you prefer one-liners:
$ awk -F, -v txt="interesting" 'FNR==1{for(c=1;$c!=txt;c++);next} {print $c}' file1 file2
Of course, be careful that you actually have the specified column, or you may find yourself in an endless loop. You can probably figure out the extra condition that saves you from that risk.
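One way to write that guard (a sketch, with a hypothetical input file): bound the loop at NF and exit with an error when the header lacks the column:

```shell
# Abort instead of looping forever when the column is missing
printf '%s\n' 'foo,bar,interesting' '1,2,gold' > cols.csv
awk -F, -v txt="interesting" '
FNR == 1 {
    for (c = 1; c <= NF && $c != txt; c++) ;
    if (c > NF) { print "no such column: " txt > "/dev/stderr"; exit 1 }
    next
}
{ print $c }' cols.csv
# prints: gold
```

Change txt to a name that isn't in the header to exercise the error path.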
Note that in awk, you only need to terminate commands with semicolons if they are followed by another command. Thus, you would do this:
command1; command2
But you can drop the semicolon if you separate commands with newlines:
command1
command2
Do it this way:
$ cat tst.awk
BEGIN { FS=OFS="," }
FNR==1 { for (i=1;i<=NF;i++) f[$i]=i; next }
{ print $(f["interesting"]) }
$ awk -f tst.awk file1 file2
gold
diamonds
platinum
sodium
Creating a name->value array is always the best approach when it's applicable. It keeps every part of the code simple and decoupled from the rest of the code, and it sets you up for doing other things like changing the order of the fields when you output the results, e.g.:
$ cat tst.awk
BEGIN { FS=OFS="," }
FNR==1 { for (i=1;i<=NF;i++) f[$i]=i; next }
{ print $(f["junk"]), $(f["interesting"]), $(f["bar"]) }
$ awk -f tst.awk file1 file2
ramjet,gold,2
superfluous,diamonds,25
garbage,platinum,3
liverwurst,sodium,2.5
This is the wrong syntax, but if I wanted to do something where the action is performed if and only if both conditions are met, how would I do it in awk?
I've tried:
awk '{if($1<=28.25&&$2<=28.25){print $0}}'
but failed.
Awk uses { print $0 } as a default action, so you can simply write:
awk -F, '$1 <= 28.25 && $2 <= 28.25' inputFile
An omitted action is equivalent to { print $0 }
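For instance, with hypothetical comma-separated input:

```shell
# Only lines where both fields are <= 28.25 survive
printf '%s\n' '28,30' '20,20' | awk -F, '$1 <= 28.25 && $2 <= 28.25'
# prints: 20,20
```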
I'm new to this and a little in the dark, so if my title is off the mark please correct me. I'm trying to set a variable in awk from one file, and then invoke the script on a different file.
ex:
sqlinsert writes to fields.txt
I execute:
cat textfile | ./awkscript
awkscript pulls 'fields' var from fields.txt while running on textfile
Here is what I have. I'm using getline, and that isn't what I'm looking for. I want it to grab the value from the first line of a separate file.
#!/opt/local/bin/gawk -f
BEGIN {
printf "Enter field lengths: "
getline fields < "-"
print fields
}
BEGIN {FIELDWIDTHS = fields; OFS="|"}
{
{ for (i=1;i<=NF;i++) sub(/[ \t]*$/,"",$i) }
# { for (i=1;i<=NF;i++) sub(/^[ \t]*/,"",$i) }
print
}
What I was looking for was this:
cat textfile | generic.awk -v fields='10 1 21 21 4'
The -v option can also be used multiple times:
cat textfile | generic.awk -v field1="10" -v field2="1" -v field3="21" -v field4="21" -v field5="4"
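Note that FIELDWIDTHS is a gawk extension. If you need the same idea in any POSIX awk, a portable sketch (widths still passed with -v, hypothetical one-line sample input) is to slice each record with substr():

```shell
# Split a fixed-width line by a space-separated width list, join with "|"
printf '%s\n' 'ab x1234' |
awk -v fields='3 1 4' '
BEGIN { n = split(fields, w, " ") }
{
    pos = 1; out = ""
    for (i = 1; i <= n; i++) {
        f = substr($0, pos, w[i])
        sub(/[ \t]*$/, "", f)           # trim trailing blanks, as in the script
        out = out (i > 1 ? "|" : "") f
        pos += w[i]
    }
    print out
}'
# prints: ab|x|1234
```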