Pass variable as parameter to awk file in TCL - variables

I want to pass a variable to an awk script from Tcl, along the lines of the code below. Is there any way to do this?
// out.tr is the input file and to_node is the variable to be passed
my_code.tcl:
set to_node 13
exec awk -f check_ack.awk out.tr $to_node
check_ack.awk:
BEGIN {}
{
# variable passed should be saved in node[i] --> But this does not work
for ( i = 0; i < ARGC; i++ ) {
node[i] = ARGV[i]
}
status = $7
if(FILENAME=="trace.tr") {
if(node[1] > -1 && $1 == "s" node[1] == $3 && status == "ack") {
print "Node" " "node[1]" " "has not sent acknowledgement" > "failed.txt"
}
}
}
END {
exit 0
}

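Pass the variable with -v, before any input file names:
awk -v var="my value" -f some/script.awk input_file
Inside the script, var is then an ordinary awk variable; if you print it, you'll see it contains my value. Operands that follow the script, such as $to_node in your exec line, are treated as input file names, which is why the ARGV approach did not work as written.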
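As a rough sketch of how -v could be used for the trace file in the question: the three-line trace and its column layout are made up for illustration, and only to_node and the use of $1, $3 and $7 come from the question.

```shell
# Hypothetical sample trace: event, time, node, three filler columns, status.
trace=$(mktemp)
printf '%s\n' \
  's 0.10 13 x x x ack' \
  's 0.20 7 x x x ack' \
  'r 0.30 13 x x x ack' > "$trace"

to_node=13

# -v assigns the awk variable before the script starts reading input.
result=$(awk -v node="$to_node" '
  $1 == "s" && $3 == node && $7 == "ack" { print "Node " node " matched on line " NR }
' "$trace")

echo "$result"
rm -f "$trace"
```

From Tcl, the equivalent call would be exec awk -v node=$to_node -f check_ack.awk out.tr, with the script reading node instead of ARGV.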

Swapping / rearranging of columns and its values based on inputs using Unix scripts

Team,
I have a requirement to select and reorder the columns of CSV files based on an input list of column names.
Example:
The data file (source file) always has the same standard columns, for example:
PRODUCTCODE,SITE,BATCHID,LV1P_DESCRIPTION
MK3,Biberach,15200100_3,Biologics Downstream
MK3,Biberach,15200100_4,Sciona Upstream
MK3,Biberach,15200100_5,Drag envois
MK3,Biberach,15200100_8,flatsylio
MK3,Biberach,15200100_1,bioCovis
These columns (PRODUCTCODE,SITE,BATCHID,LV1P_DESCRIPTION) are standard for all source files. I am looking for a way to generate a new file that contains only the columns we choose, in the order we choose.
Note: the source/data file is always comma-delimited.
Example: if I pass PRODUCTCODE,BATCHID as input, I would like only those columns and their data extracted from the source file and written to a new file.
Something like: script_name <output_column> <Source_File_name> <target_file_name>
Target file example:
PRODUCTCODE,BATCHID
MK3,15200100_3
MK3,15200100_4
MK3,15200100_5
MK3,15200100_8
MK3,15200100_1
If I pass output_column as "LV1P_DESCRIPTION,PRODUCTCODE", the output file should look like this:
LV1P_DESCRIPTION,PRODUCTCODE
Biologics Downstream,MK3
Sciona Upstream,MK3
Drag envois,MK3
flatsylio,MK3
bioCovis,MK3
It would be great if anyone could help with this.
I tried an awk script I found on a website, but it did not work as expected, and since I don't know Unix well I am having difficulty modifying it.
awk code:
BEGIN {
FS = ","
}
NR==1 {
split(c, ca, ",")
for (i = 1 ; i <= length(ca) ; i++) {
gsub(/ /, "", ca[i])
cm[ca[i]] = 1
}
for (i = 1 ; i <= NF ; i++) {
if (cm[$i] == 1) {
cc[i] = 1
}
}
if (length(cc) == 0) {
exit 1
}
}
{
ci = ""
for (i = 1 ; i <= NF ; i++) {
if (cc[i] == 1) {
if (ci == "") {
ci = $i
} else {
ci = ci "," $i
}
}
}
print ci
}
The code above is saved as Remove.awk and is called by another script as shown below:
var1="BATCHID,LV2P_DESCRIPTION"
## input field values used for testing
awk -f Remove.awk -v c="${var1}" RESULT.csv > test.csv
The following GNU awk solution should meet your objectives:
awk -F, -v flds="LV1P_DESCRIPTION,PRODUCTCODE" 'BEGIN { split(flds,map,",") } NR==1 { for (i=1;i<=NF;i++) { map1[$i]=i } } { printf "%s",$map1[map[1]];for(i=2;i<=length(map);i++) { printf ",%s",$map1[map[i]] } printf "\n" }' file
Explanation:
awk -F, -v flds="LV1P_DESCRIPTION,PRODUCTCODE" '    # Pass the fields to print as the variable flds
BEGIN {
    split(flds, map, ",")              # Split flds into the array map using , as the delimiter
}
NR==1 {
    for (i=1; i<=NF; i++) {
        map1[$i] = i                   # Loop through the header and create an array map1 with the column header as the index and the column number as the value
    }
}
{
    printf "%s", $map1[map[1]]         # Print the first field specified (index 1 of map)
    for (i=2; i<=length(map); i++) {
        printf ",%s", $map1[map[i]]    # Loop through the other fields specified, printing their contents
    }
    printf "\n"
}' file
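length(array) is not available in every awk, so here is a sketch of the same idea that relies on the return value of split instead. The input is the question's sample cut down to two rows; the names cols, want and pos are illustrative and not taken from the answer above.

```shell
src=$(mktemp)
printf '%s\n' \
  'PRODUCTCODE,SITE,BATCHID,LV1P_DESCRIPTION' \
  'MK3,Biberach,15200100_3,Biologics Downstream' \
  'MK3,Biberach,15200100_4,Sciona Upstream' > "$src"

out=$(awk -F, -v cols="LV1P_DESCRIPTION,PRODUCTCODE" '
  BEGIN { n = split(cols, want, ",") }                  # n = number of requested columns
  NR == 1 { for (i = 1; i <= NF; i++) pos[$i] = i }     # header name -> column number
  {
    line = $(pos[want[1]])
    for (i = 2; i <= n; i++) line = line "," $(pos[want[i]])
    print line
  }
' "$src")

echo "$out"
rm -f "$src"
```

A requested name that is missing from the header leaves pos[...] empty, and $(pos[...]) then evaluates to the whole line, so a real script would probably want to check for that case first.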

Sum up from line "A" to line "B" from a big file using awk

aNumber|bNumber|startDate|timeZone|duration|currencyType|cost|
22677512549|778|2014-07-02 10:16:35.000|NULL|NULL|localCurrency|0.00|
22675557361|76457227|2014-07-02 10:16:38.000|NULL|NULL|localCurrency|10.00|
22677521277|778|2014-07-02 10:16:42.000|NULL|NULL|localCurrency|0.00|
22676099496|77250331|2014-07-02 10:16:42.000|NULL|NULL|localCurrency|1.00|
22667222160|22667262389|2014-07-02 10:16:43.000|NULL|NULL|localCurrency|10.00|
22665799922|70110055|2014-07-02 10:16:45.000|NULL|NULL|localCurrency|20.00|
22676239633|433|2014-07-02 10:16:48.000|NULL|NULL|localCurrency|0.00|
22677277255|76919167|2014-07-02 10:16:51.000|NULL|NULL|localCurrency|1.00|
This is the input I have in a CSV file (a sample of millions of lines).
I want to sum up duration by date.
The catch is that I only want to sum the first 1000000 lines.
The awk program I'm using is:
test.awk
BEGIN { FS = "|" }
NR>1 && NR<=1000000
FNR == 1{ next }
{
sub(/ .*/,"",$3)
key=sprintf("%10s",$3)
duration[key] += $5 } END {
printf "%-10s %16s,"dAccused","Duration"
for (i in duration) {
printf "%-4s %16.2f i,duration[i]
}}
I run my script as
$awk -f test.awk 'file'
The script doesn't seem to honor my condition NR>1 && NR<=1000000.
Any suggestions, please?
You're looking for this:
BEGIN { FS = "|" }
1 < NR && NR <= 1000000 {
    sub(/ .*/, "", $3)
    key = sprintf("%10s", $3)
    duration[key] += $5
}
END {
    printf "%-10s %16s\n", "dAccused", "Duration"
    for (i in duration) {
        printf "%-4s %16.2f\n", i, duration[i]
    }
}
A lot of errors become obvious with proper indentation.
The reason you saw 1,000,000 lines printed is this:
NR>1 && NR<=1000000
That is a condition with no action block. The default action is to print the current record if the condition is true. That's why you see a lot of awk one-liners end with the number 1: it is an always-true condition, so every record gets printed.
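A small sketch of that default-print behaviour, using a made-up three-line input:

```shell
# A pattern with no action prints every record for which it is true.
bare=$(printf 'header\nrow1\nrow2\n' | awk 'NR > 1')

# With an action block attached, nothing is printed unless the action prints.
counted=$(printf 'header\nrow1\nrow2\n' | awk 'NR > 1 { n++ } END { print n }')

echo "$bare"
echo "$counted"
```

The first command echoes row1 and row2; the second prints only the count.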
You didn't post any expected output and your duration field is always NULL, so it's still not clear what output you really want, but this is probably the right approach:
$ cat tst.awk
BEGIN { FS = "|" }
NR==1 { for (i=1;i<NF;i++) f[$i] = i; next }
{
sub(/ .*/,"",$(f["startDate"]))
sum[$(f["startDate"])] += $(f["duration"])
}
NR==1000000 { exit }
END { for (date in sum) print date, sum[date] }
$ awk -f tst.awk file
2014-07-02 0
Instead of discarding your header line, it uses it to create an array f[] that maps the field names to their positions in each line, so instead of having to hard-code that duration is field 5 you just reference it as $(f["duration"]). The loop stops at i<NF because every line ends with |, which leaves an empty last field.
Any time your input file has a header line, don't discard it - use it so your script is not coupled to the order of fields in your input file.

awk 1 unexpected character '.' suddenly appeared

The script was working. I added some comments, renamed it, and submitted it. Today my instructor told me it doesn't work and gives the error awk 1 unexpected character '.'
The script is supposed to read a name from the command line and print the student information for that name.
I just checked it and, surprisingly, I get the same error.
I am supposed to run it with a command like this:
scriptName -v name="aname" -f filename
What is causing this problem, and which part of my code is responsible?
#!/usr/bin/awk
BEGIN{
tmp=name;
nameIsValid;
if (name && tolower(name) eq ~/^[a-z]+$/ )
{
inputName=tolower(name)
nameIsValid++;
}
else
{
print "you have not entered the student name"
printf "Enter the student's name: "
getline inputName < "-"
tmp=inputName;
if (tolower(inputName) eq ~/^[a-z]+$/)
{
tmpName=inputName
nameIsValid++
}
else
{
print "Enter a valid name!"
exit
}
}
inputName=tolower(inputName)
FS=":"
}
{
if($1=="Student Number")
{
split ($0,header,FS)
}
if ($1 ~/^[0-9]+$/ && length($1)==8)
{
split($2,names," ")
if (tolower(names[1]) == inputName || tolower(names[2])==inputName )
{
counter++
for (i=1;i<=NF;i++)
{
printf"%s:%s ",header[i], $i
}
printf "\n"
}
}
}
END{
if (counter == 0 && nameIsValid)
{
printf "There is no record for the %-10s\n" , tmp
}
}
Here are the steps to fix the script:
Get rid of all those spurious NULL statements (trailing semi-colons at the end of lines).
Get rid of the unset variable eq (it is NOT an equality operator!) from all of your comparisons.
Cleanup the indenting.
Get rid of that first non-functional nameIsValid; statement.
Change printf "\n" to the simpler print "".
Get rid of the useless ,FS arg to split().
Change name && tolower(name) ~ /^[a-z]+$/ to just the second part of that condition since if that matches then of course name is populated.
Get rid of all of those tolower()s and use character classes instead of explicit a-z ranges.
Get rid of the tmp variable.
Simplify your BEGIN logic.
Get rid of the unnecessary nameIsValid variable completely.
Make the awk body a bit more awk-like
And here's the result (untested since no sample input/output posted):
BEGIN {
    if (name !~ /^[[:alpha:]]+$/) {
        print "you have not entered the student name"
        printf "Enter the student's name: "
        getline name < "-"
    }
    if (name ~ /^[[:alpha:]]+$/) {
        inputName = tolower(name)
        FS = ":"
    }
    else {
        print "Enter a valid name!"
        exit
    }
}
$1 == "Student Number" { split($0, header) }
$1 ~ /^[[:digit:]]+$/ && length($1) == 8 {
    split(tolower($2), names, " ")
    if (names[1] == inputName || names[2] == inputName) {
        counter++
        for (i = 1; i <= NF; i++) {
            printf "%s:%s ", header[i], $i
        }
        print ""
    }
}
END {
    if (counter == 0 && inputName) {
        printf "There is no record for the %-10s\n", name
    }
}
I changed the shebang line to:
#!/usr/bin/awk -f
and then stopped passing -f on the command line. It is working now.
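A likely explanation, assuming the usual Unix behaviour of appending the script path to the interpreter line: with #!/usr/bin/awk and no -f, awk receives the script's path (for example ./scriptName) as the program text, and the . in it is the unexpected character. The sketch below reproduces both cases. The file name greet.awk and its one-line program are hypothetical, and the shebang is emulated by calling awk directly so the example does not depend on where awk is installed.

```shell
dir=$(mktemp -d)
printf 'BEGIN { print "hello " name }\n' > "$dir/greet.awk"

# What "#!/usr/bin/awk" effectively runs: the path itself is parsed as awk code.
if awk "$dir/greet.awk" -v name=sam </dev/null >/dev/null 2>&1; then
  without_f=ok
else
  without_f=error
fi

# What "#!/usr/bin/awk -f" effectively runs: the path is read as a script file.
with_f=$(awk -f "$dir/greet.awk" -v name=sam)

echo "without -f: $without_f"
echo "with -f:    $with_f"
rm -rf "$dir"
```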
Run the script in the following way:
awk -f script_name.awk input_file.txt
This seems to suppress the warnings and errors.
In my case, the problem was that I had changed the IFS variable to IFS="," as suggested in this answer for splitting a string into an array. After I reset IFS, my code worked.
IFS=', '
read -r -a array <<< "$string"
IFS=' ' # reset IFS

Parsing errors in awk blocks

awk 'BEGIN
{
INPUTFILE ='XXX'; iterator =0;
requestIterator =0;
storageFlag =T;
printFlag =F;
currentIteration =F;
recordCount =1;
while (getline < "'"$INPUTFILE"'")
{
requestArray[requestIterator]++;
requestIterator++;
}
}
if ($1 ~ /RequestId/)
{
FS = "=";
if($2 in requestArray)
{
storage[iterator] =$0;
printFlag =T;
next
}
else
{
storageFlag =F;
next
}
}
else
{
if((storageFlag =='T' && $0 != "EOE"))
{
storage[iterator]=$0; iterator++;
}
else {if(storageFlag == 'F')
{
next
}
else
{
if(printFlag == 'T')
{
for(details in storage)
{
print storage[details] >> FILE1;
delete storage[details];
}
printFlag =F;
storageFlag =T;
next
}
}'
I am getting syntax errors from the code above. Could you please help me?
awk: BEGIN{INPUTFILE =XXXX;iterator =0;requestIterator =0;storageFlag =T;printFlag =F;currentIteration =F;recordCount =1;while (getline < ""){requestArray[requestIterator]++;requestIterator++;}}if ($1 ~ /RequestId/){FS = "=";if($2 in requestArray){storage[iterator] =$0;printFlag =T;next}else{storageFlag =F;next}}else{if((storageFlag ==T && $0 != EOE)){storage[iterator]=$0;iterator++;}else{if(storageFlag == F){next}else{if(printFlag == T){for(details in storage){print storage[details] >> XXXX;delete storage[details];}printFlag = F;storageFlag =T;next}}}}
awk: ^ syntax error
awk: ^ syntax error
Quotes are the problem. The first single quote in INPUTFILE ='XXX' is parsed by the shell as closing the one before BEGIN, and from then on all the parsing is broken. The same applies to the 'T' and 'F' comparisons further down.
Either fix the quoting or put the awk program into a separate file rather than writing it inline.
# STARTING POINT - known bad
awk 'BEGIN { INPUTFILE ='XXX'; iterator =0; ... '
This has to be rewritten so that no single quotes appear inside the outer pair. awk string literals use double quotes anyway:
awk 'BEGIN { INPUTFILE ="XXX"; iterator =0; ... '
Alternatively, use double quotes outside, escape the inner double quotes, and escape every $ so the shell does not expand $1, $0 and so on:
awk "BEGIN { INPUTFILE =\"XXX\"; iterator =0; ... "
If a literal single quote really has to reach awk, note that a backslash does not escape anything inside single quotes. Close the quoted string, add an escaped quote, and reopen it:
awk 'BEGIN { q ="'\''"; iterator =0; ... '
All of these problems go away if you put the awk script into a separate file rather than inlining it in the shell. You can use whatever quotes awk accepts and the shell never sees them.
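Two further points, offered as my reading of the error output rather than something the answer above covers: awk expects the opening brace on the same line as BEGIN, and statements such as if have to sit inside an action block. A minimal sketch of an inline program that respects both and passes the shell value in with -v instead of splicing quotes; the id list, the input lines and the names idfile and wanted are all hypothetical.

```shell
ids=$(mktemp)
printf '%s\n' 'A1' 'B2' > "$ids"     # hypothetical list of wanted request ids

# Shell values go in with -v; inside the program only double quotes are used.
hits=$(printf '%s\n' 'RequestId=A1' 'RequestId=C3' 'RequestId=B2' |
  awk -F= -v idfile="$ids" '
    BEGIN {                                   # brace on the same line as BEGIN
      while ((getline id < idfile) > 0) wanted[id] = 1
      close(idfile)
    }
    $1 == "RequestId" {                       # tests live in a pattern or inside an action
      if ($2 in wanted) print $2
    }
  ')

echo "$hits"
rm -f "$ids"
```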

awk system not setting variables properly

I am having an issue getting the output of grep (run via system() in nawk) assigned to a variable.
nawk '{
CITIZEN_COUNTRY_NAME = "INDIA"
CITIZENSHIP_CODE=system("grep "CITIZEN_COUNTRY_NAME " /tmp/OFAC/country_codes.config | cut -d # -f1")
}'/tmp/*****
The value IND is displayed on the console, but when I printf the value of CITIZENSHIP_CODE it is 0. Can you please help me here?
printf("Country Tags|%s|%s\n", CITIZEN_COUNTRY_NAME ,CITIZENSHIP_CODE)
Contents of country_codes.config file
IND#INDIA
IND#INDIB
CAN#CANADA
system() returns the exit status of the called command; the command's output goes straight to the terminal and is not returned to awk (or nawk). To capture the output, use getline instead. For example, you might rewrite your script to read the config file directly:
awk '{
    file = "/tmp/OFAC/country_codes.config";
    CITIZEN_COUNTRY_NAME = "INDIA";
    FS = "#";
    while ((getline < file) > 0) {    # > 0 stops the loop at EOF and on a read error
        if ($0 ~ CITIZEN_COUNTRY_NAME) {
            CITIZENSHIP_CODE = $1;
        }
    }
    close(file);
}'
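If the grep pipeline itself should be kept, its output can be read with the command | getline var form. This is only a sketch: the temporary config file mirrors the one in the question, the names cfg, country and cmd are illustrative, and the quoting assumes the values contain no shell metacharacters.

```shell
cfg=$(mktemp)
printf '%s\n' 'IND#INDIA' 'IND#INDIB' 'CAN#CANADA' > "$cfg"

code=$(awk -v cfg="$cfg" -v country="INDIA" 'BEGIN {
    # Build the shell command as a string and run it through a pipe.
    cmd = "grep \"#" country "\" \"" cfg "\" | cut -d \"#\" -f1"
    if ((cmd | getline result) > 0)   # read the first line the command prints
        print result
    close(cmd)
}')

echo "$code"
rm -f "$cfg"
```

Unlike system(), the pipe form hands the command's output to awk, so result holds the code rather than an exit status.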
Pre-load the config file with awk:
nawk '
NR == FNR {
    split($0, x, "#")
    country_code[x[2]] = x[1]
    next
}
{
    CITIZEN_COUNTRY_NAME = "INDIA"
    if (CITIZEN_COUNTRY_NAME in country_code) {
        value = country_code[CITIZEN_COUNTRY_NAME]
    } else {
        value = "null"
    }
    print "found " value " for country name " CITIZEN_COUNTRY_NAME
}
' country_codes.config filename