Is the below expression valid in Yacc - yacc

In Yacc (or Bison), is the below expression syntactically valid?
sentence : noun verb {
/* some action here which uses only $1 , $2 */
}
predicate {
/*some action which uses $1,$2,$3,$4 */
}

Yes, it is valid.
The first action is a mid-rule action. It itself has a semantic value, which will be $3, so the comment in the second action should include $4 (the value of predicate).

Related

Bison Grammar %type and %token

Why is it that I have to use $<nVal>4 explicitly in the below grammar snippet?
I thought the %type <nVal> expr line would remove the need so that I can simply put $4?
Is it not possible to use a different definition for expr so that I can?
%union
{
int nVal;
char *pszVal;
}
%token <nVal> tkNUMBER
%token <pszVal> tkIDENT
%type <nVal> expr
%%
for_statement : tkFOR
tkIDENT { printf( "I:%s\n", $2 ); }
tkEQUALS
expr { printf( "A:%d\n", $<nVal>4 ); } // Why not just $4?
tkTO
expr { printf( "B:%d\n", $<nVal>6 ); } // Why not just $6?
step-statement
list
next-statement;
expr : tkNUMBER { $$ = $1; }
;
Update following rici's answer. This now works a treat:
for_statement : tkFOR
tkIDENT { printf( "I:%s\n", $2 ); }
tkEQUALS
expr { printf( "A:%d\n", $5 /* $<nVal>5 */ ); }
tkTO
expr { printf( "B:%d\n", $8 /* $<nVal>8 */ ); }
step-statement
list
next-statement;
Why is it that I have to use $<nVal>4 explicitly in the below grammar snippet?
Actually, you should use $5 if you want to refer to the expr. $4 is the tkEQUALS, which has no declared type, so any use must be explicitly typed. $3 is the previous midrule action, which has no value since $$ is not assigned in that action.
By the same logic, the second expr is $8; $6 is the second midrule action, which also has no value (and no type).
See the Bison manual:
The mid-rule action itself counts as one of the components of the rule. This makes a difference when there is another action later in the same rule (and usually there is another at the end): you have to count the actions along with the symbols when working out which number n to use in $n.
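As a counting aid only (this is not a Bison feature), the numbering rule quoted above can be sketched with a short awk one-liner run from the shell. The helper name number_components is hypothetical. It expects the right-hand side of a rule with one component per line, each mid-rule action written as {...}, and prints the $n index each component receives.

```shell
# Hypothetical helper: number rule components the way $n counts them.
# Input: one component per line; mid-rule actions count as components too.
number_components() {
  awk '{ printf "$%d  %s\n", NR, $0 }'
}

# Components of the for_statement rule, up to the second expr.
printf '%s\n' tkFOR tkIDENT '{...}' tkEQUALS expr '{...}' tkTO expr |
  number_components
```

With this input the two expr lines come out as $5 and $8, which agrees with the indexes used in the corrected grammar.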

What does it mean when an awk script has code outside curly braces?

Going through an awk tutorial, I came across this line
substr($0,20,5) == "HELLO" {print}
which prints a line if there is a "HELLO" string starting at 20th char.
Now I thought curly braces were necessary at the start of an awk script and an 'if' for this to work, but it works without nevertheless.
Can someone explain how it evaluates?
If you have:
{ action }
...then that action runs on every line. By contrast, if you have:
condition { action }
...then that action runs only against lines for which the condition is true.
Finally, if you have only a condition, then the default action is print:
NR % 2 == 0
...will thus print every other line.
You can similarly have multiple pairs in a single script:
condition1 { action1 }
condition2 { action2 }
{ unconditional_action }
...and can also have BEGIN and END blocks, which run at the start and end of execution.
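A minimal sketch that puts each kind of rule described above into one program may help. The function name rule_kinds and the four-line sample input are made up for illustration.

```shell
# Hypothetical demo function: one rule of each kind in a single awk program.
rule_kinds() {
  awk '
    BEGIN       { print "start" }          # runs once, before any input
    /b/         { print "has b: " $0 }     # pattern plus action
    NR % 2 == 0                            # pattern only: default action is print
                { seen++ }                 # action only: runs on every line
    END         { print "lines: " seen }   # runs once, after all input
  '
}

printf 'a\nb\nc\nd\n' | rule_kinds
```

Note that a pattern-only rule ends at the newline, so the braces on the following line start a separate, unconditional rule. For an action to belong to a pattern, its opening brace has to be on the same line as the pattern.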

awk: "default" action if no pattern was matched?

I have an awk script which checks for a lot of possible patterns, doing something for each pattern. I want something to be done in case none of the patterns was matched. i.e. something like this:
/pattern 1/ {action 1}
/pattern 2/ {action 2}
...
/pattern n/ {action n}
DEFAULT {default action}
Where, of course, the "DEFAULT" line is not awk syntax; I wish to know if there is such a syntax (like there usually is in switch/case statements in many programming languages).
Of course, I can always add a "next" command after each action, but this is tedious in case I have many actions, and more importantly, it prevents me from matching the line to two or more patterns.
You could invert the match using the negation operator ! so something like:
!/pattern 1|pattern 2|pattern n/{default action}
But that's pretty nasty for n>2. Alternatively you could use a flag:
{f=0}
/pattern 1/ {action 1;f=1}
/pattern 2/ {action 2;f=1}
...
/pattern n/ {action n;f=1}
f==0{default action}
GNU awk has switch statements:
$ cat tst1.awk
{
switch($0)
{
case /a/:
print "found a"
break
case /c/:
print "found c"
break
default:
print "hit the default"
break
}
}
$ cat file
a
b
c
d
$ gawk -f tst1.awk file
found a
hit the default
found c
hit the default
Alternatively with any awk:
$ cat tst2.awk
/a/ {
print "found a"
next
}
/c/ {
print "found c"
next
}
{
print "hit the default"
}
$ awk -f tst2.awk file
found a
hit the default
found c
hit the default
Use the "break" or "next" as/when you want to, just like in other programming languages.
Or, if you like using a flag:
$ cat tst3.awk
{ DEFAULT = 1 }
/a/ {
print "found a"
DEFAULT = 0
}
/c/ {
print "found c"
DEFAULT = 0
}
DEFAULT {
print "hit the default"
}
$ gawk -f tst3.awk file
found a
hit the default
found c
hit the default
It's not exactly the same semantics as a true "default" though, so its usage like that could be misleading. I wouldn't normally advocate using all-upper-case variable names, but lower case "default" would clash with the gawk keyword, so the script wouldn't be portable to gawk in future.
As mentioned above by tue, my understanding of the standard approach in Awk is to put next at each alternative and then have a final action without a pattern.
/pattern1/ { action1; next }
/pattern2/ { action2; next }
{ default-action }
The next statement will guarantee that no more patterns are considered for the line in question. And the default-action will always happen if the previous ones don't happen (thanks to all the next statements).
There is no "maintenance free" solution for a DEFAULT branch in awk.
The first possibility I would suggest is to end each pattern-match branch with a 'next' statement, so that it acts like a break statement. Then add a final action at the end that matches everything; that is the DEFAULT branch.
The other possibility would be:
set a flag for each branch that has a pattern match (i.e. your non-default branches)
e.g. start your actions with NONDEFAULT=1;
Add a last action at the end (the default branch) and give it the condition NONDEFAULT==0 instead of a regular expression match.
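The flag approach can be sketched as follows. The function name flag_default and the variable name matched are illustrative only (matched plays the role of NONDEFAULT above), and the sample input is chosen so that one line matches two patterns, which is the case that a next after every action would rule out.

```shell
# Hypothetical demo: a flag-based default branch that still lets one line
# trigger several rules.
flag_default() {
  awk '
             { matched = 0 }                          # reset the flag for each line
    /a/      { print "found a in " $0; matched = 1 }
    /c/      { print "found c in " $0; matched = 1 }
    !matched { print "default for " $0 }              # runs only if no rule fired
  '
}

printf 'ac\nb\n' | flag_default
```

The reset rule has to come first and the default rule last, because awk evaluates the rules in the order they are written.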
A fairly clean, portable workaround is using an if statement:
Instead of:
pattern1 { action1 }
pattern2 { action2 }
...
one could use the following:
{
if ( pattern1 ) { action1 }
else if ( pattern2 ) { action2 }
else { here is your default action }
}
As mentioned above, GNU awk has switch statements, but other awk implementations don't, so using switch would not be portable.
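As a rough, runnable version of the if/else chain, assuming the same four-line sample input (a, b, c, d) used in the earlier answers; the function name classify is made up for this sketch.

```shell
# Hypothetical demo: a single unconditional rule holding an if/else chain,
# where the final else serves as the default branch.
classify() {
  awk '{
    if ($0 ~ /a/)      print "found a"
    else if ($0 ~ /c/) print "found c"
    else               print "hit the default"
  }'
}

printf 'a\nb\nc\nd\n' | classify
```

Unlike separate pattern rules, the chain stops at the first branch that matches, so a line containing both a and c would only report a.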

Using variables to initialize regular expressions in awk

I want to initialize a variable with a regular expression, and then use it for pattern matching. Results do not come as expected . So for example I have,
BEGIN {
item_code_pattern=/ITM-CD-10/ ;
}
$0 ~ $item_code_pattern{ print ; }
I see that records which do not have pattern as ITM-CD-10 are also coming in the output.
Please suggest what should be the correct boolean expression before the block.
Thanks
You want to use a regular string:
awk '
BEGIN {
item_code_pattern = "ITM-CD-10" ;
}
$0 ~ item_code_pattern { print ; }
'
The /pattern/ construct checks whether $0 matches the given pattern, so your original code is equivalent to saying:
item_code_pattern = $0 ~ "ITM-CD-10"
Since $0 is empty in the BEGIN section, item_code_pattern is set to 0.
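A quick way to see this for yourself, as a small sketch with made-up sample input: assign the bare regex constant to a variable and print the result.

```shell
# In BEGIN, $0 is empty, so the bare regex constant evaluates to 0.
awk 'BEGIN { p = /ITM-CD-10/; print "in BEGIN:", p }'

# On an input line it evaluates to 1 or 0, depending on whether $0 matches.
printf 'ITM-CD-10 widget\nother\n' | awk '{ p = /ITM-CD-10/; print p ": " $0 }'
```

In both cases the variable ends up holding a number (the result of the match), not the pattern itself.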
You need to drop the $ and the / symbols (and there's no need for a BEGIN block, just assign the variable on the command line):
awk '$0 ~ item_code_pattern' item_code_pattern=ITM-CD-10
When you use $, some versions of awk will emit an error while others will silently convert the variable to an integer value of 0 so that $item_code_pattern is exactly the same as $0, and the code $0 ~ $item_code_pattern is the tautology $0 ~ $0.
If you insist on using a BEGIN block, the syntax is:
BEGIN { item_code_pattern="ITM-CD-10" }
$0 ~ item_code_pattern
Note that { print } is the default action when a pattern is given without one, so it is redundant.
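Putting the pieces together, a minimal sketch might look like this. The wrapper function match_lines and the sample input are hypothetical. It uses -v rather than a trailing assignment, the practical difference being that a -v value is already set when a BEGIN block runs.

```shell
# Hypothetical helper: print the input lines that match a regex supplied
# as a plain string in the first argument.
match_lines() {
  awk -v item_code_pattern="$1" '$0 ~ item_code_pattern'
}

printf 'ITM-CD-10 widget\nITM-CD-20 gadget\n' | match_lines 'ITM-CD-10'
```

Only the first sample line is printed. The string held in the variable is interpreted as a regular expression at match time, so any regex metacharacters in it keep their special meaning.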

Embedding Code in Yacc

I'm writing a yacc file as part of a compiler.
I have the following error:
lang_grammar.y:143.54-55: $2 of `ClassDeclaration' has no declared type
lang_grammar.y:143.69-70: $4 of `ClassDeclaration' has no declared type
lang_grammar.y:143.84-85: $6 of `ClassDeclaration' has no declared type
occurring on this line in my .y file:
CLASS { /* code will be embedded here */ } ID EXTENDS ID '{' ClassBody '}'
{ $$.classDeclaration = new ClassDeclaration($2.identifier, $4.identifier, $6.classBody); }
When I remove the inner embedded code:
CLASS ID EXTENDS ID '{' ClassBody '}'
{ $$.classDeclaration = new ClassDeclaration($2.identifier, $4.identifier, $6.classBody); }
It works just fine.
Are there limitations to embedding code within yacc? I was under the impression that this was possible.
Thanks.
I think you have used the wrong indexes. In the first version, the embedded code is also counted as a component, like this:
CLASS { /* code will be embedded here */ } ID EXTENDS ID '{' ClassBody '}'
$1    $2                                   $3 $4      $5 $6  $7        $8
So the action codes should be
{ $$.classDeclaration = new ClassDeclaration($3.identifier, $5.identifier, $7.classBody); }