Awk splitting a string and comparison - awk

I have a string like AS|REQ|XYZ|value=12 which I am splitting with:
awk -F\| '{print $4}' | awk -F"=" '{print $2}'
This gives the value 12.
But for the string DF|REG|EXP|value=, it comes back blank.
What I need is this: if the fourth column contains value= and the value is blank, throw an error. Can this be done in an awk command?
Thanks

@JamesBrown has the right answer to your question as asked, but given the input you posted, all you need to produce the output you want is:
awk -F'=' '{print ($NF=="" ? "Error" : $NF)}' file
If that's NOT all you need then edit your question to show some more truly representative sample input and expected output.

You could be more specific about what you mean by throwing an error. If you want the program to exit with a non-zero exit code, use if and exit with a value:
$ awk 'BEGIN{exit}'
$ echo $?
0
$ awk 'BEGIN{exit 1}'
$ echo $?
1
$ awk -F\| '{split($4,a,"="); if(a[2]=="") exit 1; else print a[2]}' foo
12
$ echo $?
1
or just print an error message and continue execution:
$ awk -F\| '{split($4,a,"="); print (a[2]==""?"ERROR":a[2])}' foo
12
ERROR
Test data used above:
$ cat foo
AS|REQ|XYZ|value=12
DF|REG|EXP|value=

Something like this perhaps?
awk -F\| '{print $4}' | awk -F"=" '{if ($2 == "") print "ERROR: Empty Value"; else print $2}'

The command below may work for you. If the value field holds a value, it prints that value; if the field is blank, it prints "error". The string was placed in test.txt:
awk -F\| '{if($4!="value=") {gsub("value=","",$4);print $4} else print "error" }' test.txt

Something like this -
cat f
AS|REQ|XYZ|value=12
AS|REQ|XYZ|value=
awk -F'[|=]' '{if($4 == "value" && $5 == "") {print "Error Found at Line: " NR} else {print $0}}' f
AS|REQ|XYZ|value=12
Error Found at Line: 2
It searches for value in the 4th column and a blank in the 5th column (with [|=] as the field separator, the text after = becomes field 5).
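The answers above either stop at the first blank value or print a marker in place of it. A minimal sketch that combines both ideas, assuming the input is a file and that "throw error" means a message on stderr plus a non-zero exit status once all lines have been read. The function name check_values is made up for illustration, and writing to "/dev/stderr" assumes an awk or system that supports that path:

```shell
# check_values FILE: print the text after "value=" in field 4 of each line.
# Blank values are reported on stderr; the exit status is 1 if any were seen.
check_values() {
    awk -F'|' '
        {
            split($4, kv, "=")     # kv[1] is the key, kv[2] the value (may be empty)
            if (kv[1] == "value" && kv[2] == "") {
                printf "line %d: empty value\n", NR > "/dev/stderr"
                bad = 1            # remember the failure but keep reading
                next
            }
            print kv[2]
        }
        END { exit bad + 0 }       # 0 when every value was present, 1 otherwise
    ' "$1"
}
```

Called on the two-line sample file, this prints 12 on stdout, reports line 2 on stderr, and returns status 1, so a calling script can test the result with if or ||.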

Related

Need to retrieve a value from an HL7 file using awk

In a Linux script program, I've got the following awk command for other purposes and to rename the file.
cat $edifile | awk -F\| '
{ OFS = "|"
print $0
} ' | tr -d "\012" > $newname.hl7
While this is happening, I'd like to grab the 5th field of the MSH segment and save it for later use in the script. Is this possible?
If not, how could I do it later or earlier on?
Example of the segment.
MSH|^~\&|business1|business2|/u/tmp/TR0049-GE-1.b64|routing|201811302126||ORU^R01|20181130212105810|D|2.3
What I want to do is retrieve the path and file name in MSH 5 and concatenate it to the end of the new file.
I've used this to capture the data but no luck. If fpth is getting set, there is no evidence of it and I don't have the right syntax for an echo within the awk phrase.
cat $edifile | awk -F\| '
{ OFS = "|"
{fpth=$(5)}
print $0
} ' | tr -d "\012" > $newname.hl7
any suggestions?
Thank you!
Try
filename=$(awk -F'|' '{print $5}' "$edifile" | head -1)
You can skip the piping through head if the file is a single line
First of all, the awk command in your first piece of code does nothing useful:
$ cat $edifile | awk -F\| ' { OFS = "|"; print $0 }' | tr -d "\012" > $newname.hl7
This is totally equivalent to
$ cat $edifile | tr -d "\012" > $newname.hl7
because OFS is only applied when awk rebuilds $0, and that happens only when you assign to a field.
Example:
$ echo "a|b|c" | awk -F\| '{OFS="/"; print $0}'
a|b|c
$ echo "a|b|c" | awk -F\| '{OFS="/"; $1=$1; print $0}'
a/b/c
I understand that you have an HL7 file containing a single line that starts with the string "MSH", and that you want to store the 5th field of that line. This is achieved in the following way:
fpth=$(awk -v outputfile="${newname}.hl7" '
BEGIN{FS="|"; ORS="" }
($1 == "MSH"){ print $5 }
{ print $0 > outputfile }' $edifile)
I have set ORS to an empty string, which has the same effect as tr -d "\012". The above will work very nicely if you only have a single MSH in your file.
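To try this on the sample segment, the same idea can be wrapped in a small function. This is only a sketch: the name flatten_hl7 and its two arguments are made up for illustration, and it prints field 5 of the first MSH segment only, so extra MSH lines would not be glued together in the captured value.

```shell
# flatten_hl7 INPUT OUTPUT: write INPUT to OUTPUT with the newlines removed
# and print field 5 of the first MSH segment on stdout.
flatten_hl7() {
    awk -v outputfile="$2" '
        BEGIN { FS = "|"; ORS = "" }
        $1 == "MSH" && !seen++ { print $5 }   # stdout, captured by the caller
        { print $0 > outputfile }             # empty ORS joins the lines
    ' "$1"
}
```

A caller would capture the path with fpth=$(flatten_hl7 "$edifile" "$newname.hl7") and could then append it to the new file with printf '%s' "$fpth" >> "$newname.hl7".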

awk: print each column of a file into separate files

I have a file with 100 columns of data. I want to print the first column and the i-th column into 99 separate files. I am trying to use
for i in {2..99}; do awk '{print $1" " $i }' input.txt > data${i}; done
But I am getting errors
awk: illegal field $(), name "i"
input record number 1, file input.txt
source line number 1
How to correctly use $i inside the {print }?
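The following single awk command may help here too. It appends with >> because each file is closed after every write, and reopening a closed file with > would truncate it each time:
awk -v start=2 -v end=99 '{for(i=start;i<=end;i++){f=("file" i); print $1,$i >> f; close(f)}}' Input_file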
An all awk solution. First test data:
$ cat foo
11 12 13
21 22 23
Then the awk:
$ awk '{for(i=2;i<=NF;i++) print $1,$i > ("data" i)}' foo
and results:
$ ls data*
data2 data3
$ cat data2
11 12
21 22
The for loop iterates from 2 to the last field. If there are more fields than you want to process, change NF to the number you'd like. If, for some reason, a hundred open files would be a problem on your system, put the print into a block and add a close call:
$ awk '{for(i=2;i<=NF;i++){f=("data" i); print $1,$i >> f; close(f)}}' foo
If you want to keep the approach you were trying:
for i in {2..99}; do
awk -v x=$i '{print $1" " $x }' input.txt > data${i}
done
Note:
the -v switch of awk passes a shell value into an awk variable
$x is the nth column, where n is the value of your variable x
Note 2: this is not the fastest solution; a single awk call is faster. I just tried to correct your logic. Ideally, take the time to understand awk; it is never wasted time.

How do I pass a variable into AWK FNR?

#!/bin/bash
export num=50
echo $num
awk -v awk_num=$num 'FNR==2, FNR==$awknum {print $1;}' big_report > short_report
I have a big_report file. The desired output is column 1 of rows 2 to 50 of big_report, written to short_report. However, when I run the above, short_report includes column 1 of all lines instead of only rows 2-50.
I would really appreciate it if anyone could help! Thanks!!!
Like this:
awk -v awk_num=$num 'FNR==2, FNR==awk_num {print $1}' big_report > short_report
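The reason the original failed: inside awk, $awknum is a field reference, and awknum (without the underscore) is an unset variable, so it means $0. The end condition then compares FNR with the whole line, which almost never matches, and the range never closes. A small sketch of the corrected form, with a hypothetical function name first_col_range:

```shell
# first_col_range LAST FILE: print column 1 of lines 2..LAST of FILE.
first_col_range() {
    # awk_num is an awk variable, so it is written without a leading "$".
    # The quoted "$1" and "$2" outside the program are the shell arguments.
    awk -v awk_num="$1" 'FNR == 2, FNR == awk_num { print $1 }' "$2"
}
```

With this in place, first_col_range "$num" big_report > short_report would do what the script intended, assuming num is at least 2.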

How to extract only numbers with awk

Hello, I have the following output:
replication complete (rid=969811 lid=969811)
or sometimes:
no change of listener transaction id for last 0 checks (rid=971489 lid=970863)
Now I want to use awk to get only the numbers from rid and lid. The following works only with the first form:
|awk -F'[^0-9]*' '{print $2-$3}'
$ echo "no change of listener transaction id for last 0 checks (nid=971491 lid=970876)" |
awk -F'[()]' '{gsub(/[^0-9 ]/,"",$2); print $2}'
971491 970876
$ echo "no change of listener transaction id for last 0 checks (nid=971491 lid=970876)" |
awk -F'[()= ]' '{print $(NF-3), $(NF-1)}'
971491 970876
$ echo "no change of listener transaction id for last 0 checks (nid=971491 lid=970876)" |
awk -F'[()= ]' '{for (i=1;i<=NF;i++) m[$i]=$(i+1); print m["nid"], m["lid"]}'
971491 970876
$ echo "no change of listener transaction id for last 0 checks (nid=971491 lid=970876)" |
awk '{gsub(/.*\(|[^0-9 ]+|\).*$/,"")}1'
971491 970876
etc., etc. The right one for you really depends on what else you plan to do with the text.
Hmm, I now see in your question that you MIGHT want to print the subtraction of one number from the other instead of printing the numbers as I thought. Here's one way based on the above:
$ echo "no change of listener transaction id for last 0 checks (nid=971491 lid=970876)" |
awk -F'[()= ]' '{print $(NF-3) - $(NF-1)}'
615
Alternatives left as an exercise!
You can use this awk, if your goal is to work with rid and lid values.
awk -F\(rid=\|lid=\) '{print $2-$3}' yourfile
(OR)
awk 'BEGIN{FS="(rid=|lid=)"} {print $2-$3}' yourfile
awk -F'=' '{print int($2)-int($3)}'
This works because awk converts a string to a number using only its leading numeric part, so int("971489 lid") is 971489.
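The conversion can be checked on both message forms from the question. The sketch below wraps the one-liner in a function whose name, rid_minus_lid, is made up for illustration; it assumes the line contains exactly the two = signs shown in the samples.

```shell
# rid_minus_lid: read lines on stdin and print rid - lid for each one.
rid_minus_lid() {
    # With "=" as the separator, $2 looks like "971489 lid" and $3 like
    # "970863)". int() uses only the leading digits of each string.
    awk -F'=' '{ print int($2) - int($3) }'
}
```

Piping the "no change ..." sample line through rid_minus_lid prints 626, and the "replication complete ..." line prints 0.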
Another solution, which needs GNU awk 4.0 or later, uses FPAT (see "Defining Fields by Content" in the gawk manual):
echo "no change of listener transaction id for last 0 checks (rid=971489 lid=970863)" |
gawk -vFPAT='[0-9]+' '{print $(NF-1), $NF}'
you get,
971489 970863
echo "replication complete (rid=969811 lid=969811)" |
gawk -vFPAT='[0-9]+' '{print $(NF-1), $NF}'
you get,
969811 969811
Note: if you want to do the subtraction:
echo "replication complete (rid=969811 lid=969811)" |
gawk -vFPAT='[0-9]+' '{print $(NF-1)-$NF}'
you get,
0

awk + export value to awk

The following program needs to print the words
First
Second
Third
But because the i variable inside awk does not get its value from the “for” loop, it prints all the words:
First second third
First second third
First second third
How can I fix the awk command so that it prints the “first” word first, the “second” word second, and so on?
THX
Yael
program:
for i in 1 2 3
do
echo "first second third" | awk '{print $i}'
done
You can change your code like this:
for i in 1 2 3
do
echo "first second third" | awk -v i=$i '{print $i}'
done
The -v option passes the shell variable i into awk.
You can also just change the record separator (RS) to get the same result:
echo "first second third" | awk 'BEGIN{RS=" "} {print $1}'
But I'm not sure if that's what you're looking for.
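The reason the original loop printed the whole line each time: the awk program is in single quotes, so the shell never expands $i, and inside awk i is an unset variable, which makes $i the same as $0. A minimal sketch of the -v fix as a reusable function, where the name nth_word is made up for illustration:

```shell
# nth_word N: print the N-th whitespace-separated word of each stdin line.
nth_word() {
    # n is set from the shell argument; inside awk, $n means "field number n".
    awk -v n="$1" '{ print $n }'
}
```

For example, echo "first second third" | nth_word 2 prints second, and calling it inside the for loop with "$i" prints one word per iteration.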
You could do:
for a in First Second Third
do
awk 'BEGIN { print ARGV[1] }' $a
done
Or you could do:
for a in First Second Third
do
awk -v arg=$a 'BEGIN { print arg }'
done
Don't do the unnecessary. The shell for loop is not needed! Just do it with awk!
$ echo "first second third" | awk '{for(i=1;i<=NF;i++)print $i}'
Or you could use:
echo "first second third" | awk -F " " -v OFS="\n" '{print $1,$2,$3}'