FFL: second identifier from where clause not recognized

I believe my syntax is correct on this one, but for some reason the Anura FFL parser is not recognizing the second identifier choice defined in my where clause. What am I missing?
def(class creature creature, class game_state game) ->commands [
  if(creature.choices,
    if(choice < size(player.deck), [
      set(player.deck, player.deck[0:choice] + player.deck[choice+1:]),
      game.crypt.spawn_cards(creature.summoner, [card]),
      set(creature.effects_tracking['Buried Treasure'], card),
    ] where card=player.deck[choice]
    ) where player=game.players[creature.summoner],
            choice=creature.choices[0]
  ),
]
It gives me this error:
formula.cpp:1067 ASSERTION FAILED: Unknown identifier 'choice' :
if(choice < size(player.deck), [
^-----^
Note: if I change it to where a=... where b=... instead of where a=... , b=... then it parses.

The comma is being interpreted as an argument separator for if() -- it's ambiguous, and the parser cannot tell which you intended. You have to use parens to disambiguate it, though I recommend just using the where...where syntax, as it's much more reliable. Commas are too open to problems like this, so the comma syntax in where clauses is deprecated.
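To make the ambiguity concrete, here is a toy sketch in Python. It is not Anura's parser; split_args is a hypothetical helper that only assumes arguments are separated by commas that sit outside any brackets. Under that assumption, the comma form of the where clause leaks an extra argument into if(), while the chained and parenthesized forms do not.

```python
def split_args(text):
    """Split an argument list on commas that are not nested in (), [] or {}.

    Hypothetical helper for illustration only; a real FFL parser works on
    tokens, not raw characters.
    """
    args, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
        elif ch == "," and depth == 0:
            # A top-level comma always ends the current argument.
            args.append(text[start:i].strip())
            start = i + 1
    args.append(text[start:].strip())
    return args


# The text between the parentheses of if(...), in three spellings.
comma_form = "cond, body where a=1, b=2"
chained_form = "cond, body where a=1 where b=2"
paren_form = "cond, (body where a=1, b=2)"

for form in (comma_form, chained_form, paren_form):
    print(len(split_args(form)), split_args(form))
```

In the comma form, b=2 ends up as a third argument to if() rather than as part of the where clause, which is why the identifier it was meant to define is unknown inside the body.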

Related

ANTLR4: parse number as identifier instead as numeric literal

I have a situation where I have to treat an integer as an identifier.
The underlying language syntax (unfortunately) allows this.
grammar excerpt:
grammar Alang;
...
NLITERAL : [0-9]+ ;
...
IDENTIFIER : [a-zA-Z0-9_]+ ;
Example code, that has to be dealt with:
/** declaration block **/
Method 465;
...
In the code example above, because NLITERAL has to be placed before IDENTIFIER, the lexer picks 465 as NLITERAL.
What is a good way to deal with such situations?
(Ideally, avoiding application code within grammar, to keep it runtime agnostic)
I found similar questions on SO, not exactly helpful though.
There's no good way to make 465 produce either an NLITERAL token or an IDENTIFIER token depending on context (you might be able to use lexer modes, but that's probably not a good fit for your needs).
What you can do rather easily, though, is to allow NLITERALs in addition to IDENTIFIERs in certain places. So you could define a parser rule
methodName: IDENTIFIER | NLITERAL;
and then use that rule instead of IDENTIFIER where appropriate.
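The same idea can be sketched outside ANTLR. The Python below is only an illustration under stated assumptions: tokenize imitates a lexer that prefers the longest match and breaks ties by rule order, and parse_method_decl is a hypothetical stand-in for a parser rule that uses methodName. For simplicity the word Method is lexed as an IDENTIFIER here rather than as its own keyword token.

```python
import re

# Token rules in priority order, mirroring the grammar excerpt above.
TOKEN_SPEC = [
    ("NLITERAL", re.compile(r"[0-9]+")),
    ("IDENTIFIER", re.compile(r"[a-zA-Z0-9_]+")),
    ("SEMI", re.compile(r";")),
    ("WS", re.compile(r"\s+")),
]


def tokenize(text):
    """Longest match wins; on a tie the earlier rule wins."""
    pos, tokens = 0, []
    while pos < len(text):
        best = None
        for name, pattern in TOKEN_SPEC:
            m = pattern.match(text, pos)
            if m and (best is None or m.end() > best[1].end()):
                best = (name, m)
        if best is None:
            raise SyntaxError(f"unexpected character at {pos}: {text[pos]!r}")
        name, m = best
        if name != "WS":  # whitespace is skipped
            tokens.append((name, m.group()))
        pos = m.end()
    return tokens


def parse_method_decl(text):
    """Parse 'Method <methodName> ;' where methodName: IDENTIFIER | NLITERAL."""
    tokens = tokenize(text)
    if len(tokens) != 3:
        raise SyntaxError("expected: Method <name> ;")
    (kw_kind, kw), (name_kind, name), (semi_kind, _) = tokens
    if (kw_kind, kw) != ("IDENTIFIER", "Method"):
        raise SyntaxError("expected the word Method")
    if name_kind not in ("IDENTIFIER", "NLITERAL"):
        raise SyntaxError("expected a method name")
    if semi_kind != "SEMI":
        raise SyntaxError("expected ';'")
    return name


print(tokenize("Method 465;"))
print(parse_method_decl("Method 465;"))
```

The lexer still reports 465 as NLITERAL; the name is accepted only because the rule for the method name allows both token types.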

Rules for barewords

Barewords can be used on the left-hand side of Pair declarations (this is not documented yet; I'm addressing that right now, but I want to get everything right). However, I have not found anywhere what is and what is not going to be considered a bareword key.
This seems to work
say (foo'bar-baz => 3); # OUTPUT: «foo'bar-baz => 3␤»
This does not
say (foo-3 => 3); # OUTPUT: «(exit code 1) ===SORRY!=== Error while compiling /tmp/jorTNuKH9V␤Undeclared routine:␤ foo used at line 1␤␤»
So it apparently follows the same syntax as the ordinary identifiers. Is that correct? Am I missing something here?
There are no barewords in Perl 6 in the sense that they exist in Perl 5, and the term isn't used in Perl 6 at all.
There are two cases that we might call a "bare identifier":
1. An identifier immediately followed by zero or more horizontal whitespace characters (\h*), followed by the characters =>. This takes the identifier on the left as a pair key, and the term parsed after the => as a pair value. This is an entirely syntactic decision; the existence of, for example, a sub or type with that identifier will not have any influence.
2. An identifier followed by whitespace (or some other statement separator or terminator). If there is already a type of that name, then it is compiled into a reference to the type object. Otherwise, it will always be taken as a sub call. If no sub declaration of that name exists yet, it will be considered a call to a post-declared sub, and an error produced at CHECK-time if a sub with that name isn't later declared.
These two cases are only related in the sense that they are both cases of terms in the Perl 6 grammar, and that they both look to parse an identifier, which follows the standard rules linked in the question. Which wins is determined by Longest Token Matching semantics; the restriction that there may only be horizontal whitespace between the identifier and => exists to make sure that the identifier, whitespace, and => will together be counted as the declarative prefix, and so case 1 will always win over case 2.
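As a rough illustration of how the two examples in the question come apart, here is a Python sketch. It is not how the Perl 6 grammar is implemented: IDENTIFIER is a simplified approximation of the identifier rules (an apostrophe or hyphen is allowed only when a letter or underscore follows), FAT_ARROW treats only spaces and tabs as horizontal whitespace, and classify is a hypothetical helper name.

```python
import re

# Simplified identifier: starts with a letter or underscore; an apostrophe
# or hyphen may appear inside only if followed by a letter or underscore.
IDENTIFIER = re.compile(r"[^\W\d]\w*(?:['\-][^\W\d]\w*)*")

# Horizontal whitespace (approximated as spaces and tabs), then =>.
FAT_ARROW = re.compile(r"[ \t]*=>")


def classify(source):
    """Return (kind, identifier) for the identifier at the start of source."""
    ident = IDENTIFIER.match(source)
    if ident is None:
        return ("not-an-identifier", "")
    if FAT_ARROW.match(source, ident.end()):
        # Case 1: purely syntactic, the identifier becomes a pair key.
        return ("pair-key", ident.group())
    # Case 2: resolved later as a type object or a sub call.
    return ("type-or-sub-call", ident.group())


print(classify("foo'bar-baz => 3"))
print(classify("foo-3 => 3"))
```

In the second call the identifier stops at foo, because the hyphen is followed by a digit, so what follows it is not => and the pair-key case never applies. That matches the "Undeclared routine: foo" error in the question.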

How to solve ambiguity with keywords as identifiers in Grammar-Kit

I've been trying to write the graphql language grammar for grammarkit and I've found myself really stuck on an ambiguity issue for quite some time now. Keywords in graphql (such as: type, implements, scalar ) can also be names of types or fields. I.E.
type type implements type {}
At first I defined these keywords as tokens in the BNF, but that would make the case above invalid. But if I write these keywords directly in the rules, it results in an ambiguity in the grammar.
An example of an issue I'm seeing with the grammar below occurs if you define something like this
directive #foo on Baz | Bar
scalar Foobar #cool
the PSI viewer tells me that at the position of #cool it's expecting a DirectiveAddtlLocation, which is a rule I don't even reference in the scalar rule. Is anyone familiar with Grammar-Kit who has encountered something like this? I'd really appreciate some insight. Thank you.
Here's an excerpt of grammar for the error example I mentioned above.
{
  tokens=[
    LEFT_PAREN='('
    RIGHT_PAREN=')'
    PIPE='|'
    AT='#'
    IDENTIFIER="regexp:[_A-Za-z][_0-9A-Za-z]*"
    WHITE_SPACE = 'regexp:\s+'
  ]
}
Document ::= Definition*
Definition ::= DirectiveTypeDef | ScalarTypeDef
NamedTypeDef ::= IDENTIFIER
// I.E. #foo #bar(a: 10) #baz
DirectivesDeclSet ::= DirectiveDecl+
DirectiveDecl ::= AT TypeName
// I.E. directive #example on FIELD_DEFINITION | ARGUMENT_DEFINITION
DirectiveTypeDef ::= 'directive' AT NamedTypeDef DirectiveLocationsConditionDef
DirectiveLocationsConditionDef ::= 'on' DirectiveLocation DirectiveAddtlLocation*
DirectiveLocation ::= IDENTIFIER
DirectiveAddtlLocation ::= PIPE? DirectiveLocation
TypeName ::= IDENTIFIER
// I.E. scalar DateTime #foo
ScalarTypeDef ::= 'scalar' NamedTypeDef DirectivesDeclSet?
Once your grammar sees directive #TOKEN on IDENTIFIER, it consumes a sequence of DirectiveAddtlLocation. Each of those consists of an optional PIPE followed by an IDENTIFIER. As you note in your question, the GraphQL "keywords" are really just special cases of identifiers. So what's probably happening here is that, since you allow any token as an identifier, scalar and Foobar are both being consumed as DirectiveAddtlLocation and it's never actually getting to see a ScalarTypeDef.
# Parses the same as:
directive #foo on Bar | Baz | scalar | Foobar
#cool # <-- ?????
You can get around this by listing out the explicit set of allowed directive locations in your grammar. (You might even be able to get pretty far by just copying the grammar in Appendix B of the GraphQL spec and changing its syntax.)
DirectiveLocation ::= ExecutableDirectiveLocation | TypeSystemDirectiveLocation
ExecutableDirectiveLocation ::= 'QUERY' | 'MUTATION' | ...
TypeSystemDirectiveLocation ::= 'SCHEMA' | 'SCALAR' | ...
Now when you go to parse:
directive #foo on QUERY | MUTATION
# "scalar" is not a directive location, so the DirectiveTypeDef must end
scalar Foobar #cool
(For all that the "identifier" vs. "keyword" distinction is a little weird, I'm pretty sure the GraphQL grammar isn't actually ambiguous; in every context where a free-form identifier is allowed, there's punctuation before a "keyword" could appear again, and in cases like this one there's unambiguous lists of not-quite-keywords that don't overlap.)
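The effect of restricting directive locations can be shown with a toy parser. This Python sketch is not Grammar-Kit output; parse_locations is a hypothetical helper that mirrors DirectiveLocation DirectiveAddtlLocation* with an optional PIPE, KNOWN_LOCATIONS is only a subset of the locations listed in the GraphQL spec, and # stands in for the AT token as in the excerpt above.

```python
import re

TOKEN = re.compile(r"[_A-Za-z][_0-9A-Za-z]*|[#|]")
IDENT = re.compile(r"[_A-Za-z][_0-9A-Za-z]*")

# Subset of the directive locations named in the GraphQL spec.
KNOWN_LOCATIONS = {"QUERY", "MUTATION", "FIELD", "SCHEMA", "SCALAR", "OBJECT"}


def any_identifier(token):
    return IDENT.fullmatch(token) is not None


def known_location(token):
    return token in KNOWN_LOCATIONS


def parse_locations(tokens, pos, is_location):
    """Consume a location, then zero or more (optional '|' + location)."""
    if pos >= len(tokens) or not is_location(tokens[pos]):
        raise SyntaxError("expected a directive location")
    locations = [tokens[pos]]
    pos += 1
    while pos < len(tokens):
        nxt = pos + 1 if tokens[pos] == "|" else pos  # the PIPE is optional
        if nxt < len(tokens) and is_location(tokens[nxt]):
            locations.append(tokens[nxt])
            pos = nxt + 1
        else:
            break
    return locations, pos


def locations_after_on(source, is_location):
    tokens = TOKEN.findall(source)
    start = tokens.index("on") + 1
    locations, pos = parse_locations(tokens, start, is_location)
    return locations, tokens[pos:]


loose = "directive #foo on Baz | Bar scalar Foobar #cool"
strict = "directive #foo on QUERY | MUTATION scalar Foobar #cool"
print(locations_after_on(loose, any_identifier))
print(locations_after_on(strict, known_location))
```

With any identifier accepted, the location list swallows scalar and Foobar and stops only at #. With the explicit set, it stops at scalar, so a scalar definition can start there.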

Another implicit token error - how to tweak definitions to address it

I am aware of what an implicit token definition error in the parser means, but I am having difficulty getting rid of it. (v4)
Stripped-down statements:
enum_decl : GTYPE_ENUM ID LSQUARE STRING STRING* RSQUARE SEMI ;
string_decl: GTYPE_STRING ID (COMMA ID)* SEMI ;
In string_decl, that error appears on SEMI.
In enum_decl, the same error is on RSQUARE.
GTYPE_ENUM, ID, etc. are all defined and accepted correctly in the lexer section.
Have you tried typing in just that tiny section to find a small test case that doesn't work? Without a grammar to test, there's nothing we can do. It is either a bug or a problem with your grammar.

Antlr: ignore keywords in specific context

I'm constructing an English-like domain specific language with ANTLR. Its keywords are context-sensitive. (I know it sounds dirty, but it makes a lot of sense for the non-programmer target users.) For example, the usual logical operators such as or and not are to be treated as identifiers when surrounded in brackets, [like this and this]. My current approach looks like this:
bracketedStatement
    : '[' bracketedWord+ ']'
    ;
bracketedWord
    : (~(']'))+
    ;
This, when combined with lexical definitions such as the following:
AND: 'and' ;
OR: 'or' ;
produces the warning "Decision can match input such as "{AND..PROCESS, RPAREN..'with'}" using multiple alternatives: 1, 2". I'm clearly creating ambiguity for ANTLR, but I don't know how to resolve it. How do I fix this?
For anyone who finds this, check out this Stack Overflow question. It clarifies how to use the negation symbol correctly.