awk - send variable to .awk script and use as conditional - variables

I want to send a variable to my external .awk script to use as a conditional. The following script is however not working.
Here first is the command:
awk -v myVar="optA" -f /users/test/fixit.awk /users/test/input.txt > /users/test/output.txt
The sample fixit.awk script is:
BEGIN { printf "TITLE:\nDocuments \n", myVar, FS=" "; }
if (myVar="optA")
printf myVar
else
printf "OptB"
Can someone please help diagnose the problem?

An awk assignment is also an expression with the return value of what was assigned. If you write
if (myVar = "optA")
you actually check the return value of the assignment, optA, which always evaluates to "true" because it is a non-empty string. You want
if (myVar == "optA")
for comparison instead of assignment.
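To make the difference concrete, here is a minimal sketch. The shell function name compare_vs_assign is made up for this illustration and is not part of the original script; it passes its first argument to awk with -v and runs both forms of the test:

```shell
# Hypothetical helper: pass a value in with -v and test it both ways.
compare_vs_assign() {
    awk -v myVar="$1" 'BEGIN {
        # == compares and leaves myVar untouched
        if (myVar == "optA")
            print "== matched"
        else
            print "== did not match"
        # = assigns "optA" and then tests the assigned (non-empty) value
        if (myVar = "optA")
            print "= branch taken, myVar is now " myVar
    }'
}

compare_vs_assign optB
# == did not match
# = branch taken, myVar is now optA
```

Even though optB was passed in, the second test succeeds and silently overwrites the variable.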
Also, you can't have "naked" statements like this. Your if/else clause either has to be part of the BEGIN block:
BEGIN {
printf "TITLE:\nDocuments \n", myVar, FS=" "
if (myVar=="optA")
printf myVar
else
printf "OptB"
}
to execute once, or in a separate block if it should be executed for every single line (less likely, though):
BEGIN { printf "TITLE:\nDocuments \n", myVar, FS=" " }
{
if (myVar=="optA")
printf myVar
else
printf "OptB"
}
As an aside, the way you use printf doesn't make much sense: you could either do
print "TITLE:\nDocuments\n" myVar
or
printf "TITLE:\nDocuments\n%s\n", myVar
And for printf myVar or printf "OptB", unless you explicitly don't want that newline, you can as well use print myVar and print "OptB".
And finally, that FS= assignment looks a bit out of place and is probably not needed as " " is the default value of FS.
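Putting these suggestions together, the script could look roughly like the sketch below. It assumes the if/else is meant to run once (so it lives in BEGIN), and it is written as an inline program inside a made-up shell function run_fixit rather than in fixit.awk, purely so it can be tried quickly:

```shell
# Hypothetical wrapper; the awk program is what would go into fixit.awk.
run_fixit() {
    awk -v myVar="$1" 'BEGIN {
        printf "TITLE:\nDocuments\n"   # fixed text only, so no extra arguments
        if (myVar == "optA")           # comparison, not assignment
            print myVar                # print appends the newline itself
        else
            print "OptB"
    }'
}

run_fixit optA
# TITLE:
# Documents
# optA
```

With any other value for myVar the last line becomes OptB.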

Related

Else syntax error when nesting array formula

I am receiving a syntax error on "else" for this script:
{for (i=8;i<=NF;i+=3)
{if ($0~"=>") # if-else statement designed to flag file / directory transfers
print "=> flag,"$1"," $2","$3","$4 ","$5","$6","$7"," $(i)","$(i+1)","$(i+2);
{split ($(i+2), array, "/");
for (x in array)
{j++;
a[j] =j;
printf (array[x] ",");}
printf ("%s\n", "");}
else
print "no => flag,"$1"," $2","$3","$4 ","$5","$6","$7"," $(i)","$(i+1)","$(i+2)
}
}
Can't figure out why. If I delete the array block (starting with split()), all is well. But I need to scan the contents of $(i+2), so cutting it does me no good.
Also, if anyone has guidance on a good list of how to interpret error messages, that would be great.
Thanks for your advice.
EDIT: here is the above script laid out with sensible formatting:
{
for (i=8;i<=NF;i+=3) {
if ($0~"=>") # if-else statement designed to flag file / directory transfers
print "=> flag,"$1"," $2","$3","$4 ","$5","$6","$7"," $(i)","$(i+1)","$(i+2);
{
split ($(i+2), array, "/");
for (x in array) {
j++;
a[j] =j;
printf (array[x] ",");
}
printf ("%s\n", "");
}
else
print "no => flag,"$1"," $2","$3","$4 ","$5","$6","$7"," $(i)","$(i+1)","$(i+2)
}
}
First things first: you didn't post samples of input and expected output, so I couldn't test this at all. Could you please try the following; I am assuming you run it as a .awk script file. Also note that these are mostly syntax/cosmetic changes, NOT changes to the logic, since no background was given on the problem.
BEGIN{
  OFS=","
}
{
  for (i=8;i<=NF;i+=3){
    if ($0~/=>/){
      print "=> flag,"$1,$2,$3,$4,$5,$6,$7,$(i),$(i+1),$(i+2)
      split ($(i+2), array, "/");
      for(x in array){
        j++;
        a[j] =j;
        printf (array[x] ",")
      }
      printf ("%s\n", "")
    }
    else{
      print "no => flag",$1,$2,$3,$4,$5,$6,$7,$(i),$(i+1),$(i+2)
    }
  }
}
Problems fixed in OP's attempt:
The opening { of a multi-statement if or for body has to follow the condition directly (I put it at the end of that line, for better visibility). In your attempt the { came after the first print, so the if body was only that print, the { ... } block was a separate statement, and the else had no if left to attach to; that is what triggers the syntax error.
Since you are matching against a regexp, I changed $0~"=>" to the regexp constant $0~/=>/.
Added a BEGIN section that sets OFS (the output field separator) to , so you no longer need to print "," between values; separating them with , in the print statement does the trick.
Fixed the indentation, so it is clear where each loop and condition is closed.
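The bracing rule can be seen in a much smaller program. The sketch below is not the OP's script: the function name flag_lines, the sample input and the use of the last field are assumptions made for illustration. It also walks the split() result by index, because the order of for (x in array) is not specified:

```shell
# Hypothetical reduced example: braces make the whole block the if body,
# so the else has an if to attach to.
flag_lines() {
    awk '{
        if ($0 ~ /=>/) {
            n = split($NF, parts, "/")      # n is the number of pieces
            printf "flag"
            for (k = 1; k <= n; k++)        # index loop keeps the pieces in order
                printf ",%s", parts[k]
            printf "\n"
        } else {
            print "no flag," $0
        }
    }'
}

printf 'a => x/y/z\nplain line\n' | flag_lines
# flag,x,y,z
# no flag,plain line
```

Removing the braces around the first branch reproduces the syntax error at else.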

How can I store the length of a line into a var within awk script?

I have this simple awk script with which I attempt to check the amount of characters in the first line.
If the first line has more or fewer than 10 characters I want to store the number
of characters in a var.
Somehow the first print statement works but storing that result in a var doesn't.
Please help.
I tried removing the dollar sign, "thelength=(length($0))",
and removing the parentheses, "thelength=length($0)", but it doesn't print anything...
Thanks!
#!/bin/ksh
awk ' BEGIN {FS=";"}
{
if (NR==1)
if(length($0)!=10)
{
print(length($0))
thelength=$(length($0))
print "The length of the first line is: ",$thelength;
exit 1;
}
}
END { print "STOP" }' $1
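Two issues, both caused by mixing ksh and awk syntax ...
$( ... ) is not command substitution inside awk: $(length($0)) is a field reference (the field whose number is the line length), and it is not needed to obtain the length; use thelength=length($0)
awk variables do not take a leading $ when being referenced ($thelength means "field number thelength"); use print ... ,thelength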
So your code becomes:
#!/bin/ksh
awk ' BEGIN {FS=";"}
{
if (NR==1)
if(length($0)!=10)
{
print(length($0))
thelength=length($0)
print "The length of the first line is: ",thelength;
exit 1;
}
}
END { print "STOP" }' $1
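Why the original printed nothing after the label can be shown with a short sketch. The function name field_vs_var and the sample line are made up for this illustration, and the default field separator is used instead of ";":

```shell
# Hypothetical demo: n holds the length, $n is the field with that number.
field_vs_var() {
    awk '{
        n = length($0)
        print "n = " n            # the variable itself
        print "$n = [" $n "]"     # field number n, empty when the line has fewer fields
    }'
}

printf 'a b c\n' | field_vs_var
# n = 5
# $n = []
```

The line has three fields, so field 5 is empty, which matches the blank result seen with $thelength.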

Delete a variable in awk

I wonder if it is possible to delete a variable in awk. For an array, you can say delete a[2] and the index 2 of the array a[] will be deleted. However, for a variable I cannot find a way.
The closest I get is to say var="" or var=0.
But then, it seems that the default value of a non-existing variable is 0 or False:
$ awk 'BEGIN {if (b==0) print 5}'
5
$ awk 'BEGIN {if (!b) print 5}'
5
So I also wonder if it is possible to distinguish between a variable that is set to 0 and a variable that has not been set, because it seems not to:
$ awk 'BEGIN {a=0; if (a==b) print 5}'
5
There is no operation to unset/delete a variable. The only time a variable becomes unset again is at the end of a function call when it's an unused function argument being used as a local variable:
$ cat tst.awk
function foo( arg ) {
if ( (arg=="") && (arg==0) ) {
print "arg is not set"
}
else {
printf "before assignment: arg=<%s>\n",arg
}
arg = rand()
printf "after assignment: arg=<%s>\n",arg
print "----"
}
BEGIN {
foo()
foo()
}
$ awk -f tst.awk file
arg is not set
after assignment: arg=<0.237788>
----
arg is not set
after assignment: arg=<0.291066>
----
so if you want to perform some actions A then unset the variable X and then perform actions B, you could encapsulate A and/or B in functions using X as a local var.
Note though that the default value is zero or null, not zero or false, since its type is "numeric string".
You test for an unset variable by comparing it to both null and zero:
$ awk 'BEGIN{ if ((x=="") && (x==0)) print "y" }'
y
$ awk 'BEGIN{ x=0; if ((x=="") && (x==0)) print "y" }'
$ awk 'BEGIN{ x=""; if ((x=="") && (x==0)) print "y" }'
If you NEED to have a variable you delete then you can always use a single-element array:
$ awk 'BEGIN{ if ((x[1]=="") && (x[1]==0)) print "y" }'
y
$ awk 'BEGIN{ x[1]=""; if ((x[1]=="") && (x[1]==0)) print "y" }'
$ awk 'BEGIN{ x[1]=""; delete x; if ((x[1]=="") && (x[1]==0)) print "y" }'
y
but IMHO that obfuscates your code.
What would be the use case for unsetting a variable? What would you do with it that you can't do with var="" or var=0?
An unset variable expands to "" or 0, depending on the context in which it is being evaluated.
For this reason, I would say that it's a matter of preference and depends on the usage of the variable.
Given that we use a + 0 (or the slightly controversial +a) in the END block to coerce the potentially unset variable a to a numeric type, I guess you could argue that the natural "empty" value would be "".
I'm not sure that there's too much to read in to the cases that you've shown in the question, given the following:
$ awk 'BEGIN { if (!"") print 5 }'
5
("" is false, unsurprisingly)
$ awk 'BEGIN { if (b == "") print 5 }'
5
(unset variable evaluates equal to "", just the same as 0)
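The a + 0 coercion mentioned above matters mostly in END blocks, where a counter may never have been incremented. A minimal sketch, with a made-up function name count_matches and made-up input:

```shell
# Hypothetical counter: a is only touched when a line contains "x".
count_matches() {
    awk '/x/ { a++ }
         END {
             print "raw: [" a "]"          # unset a prints as the empty string
             print "coerced: " (a + 0)     # forced to a number, so unset prints as 0
         }'
}

printf 'no\nnope\n' | count_matches
# raw: []
# coerced: 0
```

When at least one line matches, both lines show the same count.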

converting awk command line to awk file

This works:
awk -F2 '{if (NF > 1) { if (substr($1,0,2) == "..") printf ("%.2f 2%s", ((50*length($1))/1000) , $2); else printf("%s2%s",$1,$2); for (i=3;i<=NF;i++) printf("2%s",$i) } else if (substr($1,0,2) == "..") printf("%.2f",((50*length($1))/1000)); else printf("%s",$1); printf("\n");}' debugconsole > debugconsoleWithCount
But when I make the file countdots.awk as follows:
BEGIN {
if (NF > 1)
{
if (substr($1,0,2) == "..")
printf ("%.2f 2%s", ((50*length($1))/1000) , $2);
else
printf("%s2%s",$1,$2)
for (i=3;i<=NF;i++)
printf("2%s",$i)
}
else
if (substr($1,0,2) == "..")
printf("%.2f",((50*length($1))/1000));
else printf("%s",$1);
printf("\n");
}
and run it like this:
awk -F2 -f countdots4.awk debugconsole > debugconsoleWithCount
I get an empty debugconsoleWithCount file.
A BEGIN block in awk is executed only once, before the first input record is read.
An END rule is executed only once, after all the input is read.
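In your transformation you put everything in a BEGIN block, so it runs before any input is read and NF, $1, $2, etc. are not yet set. It therefore prints nothing useful, and since a program consisting only of BEGIN never reads its input, you get an essentially empty file. If you remove the BEGIN keyword, so the block becomes an ordinary rule that runs for every line, it should work fine.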
BEGIN and END blocks are not mandatory so you don't have to keep them in your awk script. BEGIN is often used to print titles, headers, initializing variables to particular values etc. END block is often used to do final processing after entire input is read.
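The three kinds of blocks can be seen side by side in a small sketch. The function name show_blocks and the two input lines are assumptions made for this illustration:

```shell
# Hypothetical demo of when each block runs.
show_blocks() {
    awk 'BEGIN { print "BEGIN: before any input is read" }
         { print "line " NR ": NF=" NF }          # main rule, once per input line
         END { print "END: read " NR " lines" }'
}

printf 'a b\nc d e\n' | show_blocks
# BEGIN: before any input is read
# line 1: NF=2
# line 2: NF=3
# END: read 2 lines
```

Only the middle rule sees the fields of each record, which is where the per-line logic from the one-liner belongs.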
Basically this has to do with not understanding BEGIN and END.
I took out BEGIN and it does what I want.
Should I delete this from stackoverflow? let me know.

Using variables to initialize regular expressions in awk

I want to initialize a variable with a regular expression, and then use it for pattern matching. The results are not what I expect. So for example I have,
BEGIN {
item_code_pattern=/ITM-CD-10/ ;
}
$0 ~ $item_code_pattern{ print ; }
I see that records which do not have pattern as ITM-CD-10 are also coming in the output.
Please suggest what should be the correct boolean expression before the block.
Thanks
You want to use a regular string:
awk '
BEGIN {
item_code_pattern = "ITM-CD-10" ;
}
$0 ~ item_code_pattern { print ; }
'
The /pattern/ construct checks whether $0 matches the given pattern, so your original code is equivalent to saying:
item_code_pattern = $0 ~ "ITM-CD-10"
Since $0 is empty in the BEGIN section, item_code_pattern is set to 0.
You need to drop the $ and the / symbols (and there's no need for a BEGIN block, just assign the variable on the command line):
awk '$0 ~ item_code_pattern' item_code_pattern=ITM-CD-10
When you use $, some versions of awk will emit an error while others will silently convert the variable to an integer value of 0 so that $item_code_pattern is exactly the same as $0, and the code $0 ~ $item_code_pattern is the tautology $0 ~ $0.
If you insist on using a BEGIN block, the syntax is:
BEGIN { item_code_pattern="ITM-CD-10" }
$0 ~ item_code_pattern
Note that { print } is the default action when a pattern is given without one, so it is redundant.
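As a quick check of the corrected form, here is a minimal sketch. The function name match_code and the two sample records are made up, and the pattern is passed with -v rather than as a trailing assignment:

```shell
# Hypothetical helper: the pattern arrives as a plain string and is used
# as a dynamic regexp on the right-hand side of ~ (no $ and no slashes).
match_code() {
    awk -v item_code_pattern="$1" '$0 ~ item_code_pattern'
}

printf 'ITM-CD-10 widget\nITM-CD-20 gadget\n' | match_code 'ITM-CD-10'
# ITM-CD-10 widget
```

Only the record containing the code is printed; with $item_code_pattern both records would come through.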