ANTLR4 adaptivePredict going to wrong Semantic Predicate?

I have a grammar with semantic predicates somewhat simplified like this:
startrule:
    {<condition1 in C++>}? rule1 |
    {<condition2 in C++>}? rule2
;
rule1:
    {<condition1.1 in C++>}? rule1_statement1 |
    {<condition1.2 in C++>}? rule1_statement2
;
rule2:
    {<condition2.1 in C++>}? rule2_statement1 |
    {<condition2.2 in C++>}? rule2_statement2
;
If condition1 or condition2 evaluates to true, it correctly goes to rule1 or rule2. So the semantic predicates are working so far, but here is the problem I'm seeing, for example:
rule2 is executed
condition2.1 is false
condition2.2 is true (it should go to rule2_statement2)
When I look at the generated C++ code, I see this line:
switch (getInterpreter<atn::ParserATNSimulator>()->adaptivePredict(_input, 531, _ctx)) {
There is then a case for each corresponding statement. When the code is executed, even if condition2.1 is false, it enters the case for rule2_statement1 (instead of the case for rule2_statement2). So it seems as if the semantic predicates are not working there.
Since that case contains a check for the condition like this:
if (!(condition2.1)) throw FailedPredicateException(this, "condition2.1");
It throws a FailedPredicateException. My ErrorStrategy's recover just calls the DefaultErrorStrategy's recover, which eventually crashes because LL1Analyzer::_LOOK throws an out-of-range exception.
Any hint as to why some semantic predicates appear not to be working? rule2_statement1 and rule2_statement2 have the same tokens but different embedded actions.
Regards,
JZ

Never mind... I had an issue in my C++ code with the conditions.
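For anyone landing here with the same symptom, it may help to picture the two places a predicate gets evaluated: once while the alternative is being chosen, and once more by the generated check on entering the chosen case (the `if (!(condition2.1)) throw ...` line quoted above). The TypeScript sketch below is a deliberately simplified model of that pattern, not ANTLR's implementation. The names `GatedAlternative`, `predictAlternative` and `enterAlternative` are made up for illustration, and the "true on the first call only" condition is just a hypothetical example of a buggy condition, not necessarily what went wrong in the question.

```typescript
// Simplified model of predicate-gated alternatives; not ANTLR's implementation.
interface GatedAlternative {
  name: string;
  predicate: () => boolean;
}

// Step 1 (prediction): pick the first alternative whose predicate holds, or -1.
function predictAlternative(alts: GatedAlternative[]): number {
  for (let i = 0; i < alts.length; i++) {
    if (alts[i].predicate()) {
      return i;
    }
  }
  return -1;
}

// Step 2 (entry): re-check the predicate, like the generated
// "if (!(condition2.1)) throw FailedPredicateException(...)" line.
function enterAlternative(alt: GatedAlternative): string {
  if (!alt.predicate()) {
    throw new Error("failed predicate: " + alt.name);
  }
  return alt.name;
}

// Stable conditions: prediction and entry agree.
const stableAlts: GatedAlternative[] = [
  { name: "rule2_statement1", predicate: () => false },
  { name: "rule2_statement2", predicate: () => true },
];
const chosen = enterAlternative(stableAlts[predictAlternative(stableAlts)]);

// Hypothetical buggy condition: true on the first call only.
let evaluations = 0;
const unstableAlts: GatedAlternative[] = [
  { name: "rule2_statement1", predicate: () => evaluations++ === 0 },
  { name: "rule2_statement2", predicate: () => true },
];
let failure = "";
try {
  enterAlternative(unstableAlts[predictAlternative(unstableAlts)]);
} catch (e) {
  failure = e instanceof Error ? e.message : String(e);
}
```

With stable conditions the second alternative is chosen and entered. With the unstable one, the first alternative is predicted and then fails its own check on entry, so logging what each condition returns at both points is a cheap first debugging step.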

Syntax for class filter patterns

I don't understand the syntax of the patterns used in IntelliJ IDEA's menu Run | View Breakpoints... | Catch class filter | Include | Add pattern... and probably many other places in the IDE.
Why does the pattern com.myname.*Tests not match com.myname.mypackage.Tests?
Include pattern               Is com.myname.mypackage.Tests included?
com.myname.mypackage.Tests    true
com.myname.*                  true
*Tests                        true
com.myname.*Tests             false
Note that I also unsuccessfully tried:
com.myname.**.Tests
com.myname.*.Tests
com.myname..*.Tests
com.myname.**.*.Tests
Unfortunately this is not clear from the UI or documentation:
A * in the middle of a pattern is not supported. Regular expressions here are limited to exact matches and patterns that begin with * or end with *, for example, *.Foo or java.*.
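To make the rule concrete, here is a small TypeScript sketch that models it: a leading * means "ends with", a trailing * means "starts with", and anything else is an exact match. This only mirrors the behaviour described above and is not IntelliJ's actual implementation. The function name `matchesClassFilter` is my own.

```typescript
// Models the rule described above; NOT IntelliJ's actual implementation.
function matchesClassFilter(pattern: string, className: string): boolean {
  if (pattern.charAt(0) === "*") {
    // Leading wildcard: the class name must end with the rest of the pattern.
    const suffix = pattern.slice(1);
    return className.length >= suffix.length &&
      className.slice(className.length - suffix.length) === suffix;
  }
  if (pattern.charAt(pattern.length - 1) === "*") {
    // Trailing wildcard: the class name must start with the rest of the pattern.
    const prefix = pattern.slice(0, -1);
    return className.slice(0, prefix.length) === prefix;
  }
  // No leading or trailing wildcard: exact match only.
  // A '*' in the middle is treated as a literal character, so it never matches.
  return pattern === className;
}

const testedClass = "com.myname.mypackage.Tests";
const filterResults = [
  "com.myname.mypackage.Tests",
  "com.myname.*",
  "*Tests",
  "com.myname.*Tests",
].map(p => matchesClassFilter(p, testedClass));
// filterResults -> [true, true, true, false]
```

The four results line up with the table in the question, including the surprising false for com.myname.*Tests.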

ANTLR4 Correctly continuing to parse sections after error

I'm trying to write some tooling (validation, possibly autocomplete) for a SQL-esque query language. However, the parser handles invalid/incomplete input in a way that makes it more difficult to work with.
I've reduced my scenario to its simplest reproducible form. Here is my minimized grammar:
grammar SOQL;
WHITE_SPACE : ( ' '|'\r'|'\t'|'\n' ) -> channel(HIDDEN) ;
FROM : 'FROM' ;
SELECT : 'SELECT' ;
/********** SYMBOLS **********/
COMMA : ',' ;
ID: ( 'A'..'Z' | 'a'..'z' | '_' | '$') ( 'A'..'Z' | 'a'..'z' | '_' | '$' | '0'..'9' )* ;
soql_query: select_clause from_clause;
select_clause: SELECT field ( COMMA field )*;
from_clause: FROM table;
field : ID;
table : ID;
When I run the following code (using antlr4ts, but it should be similar to any other port):
const input = 'SELECT ID, Name, Website, Contact, FROM Account'; //invalid trailing ,
let inputStream = new ANTLRInputStream(input);
let lexer = new SOQLLexer(inputStream);
let tokenStream = new CommonTokenStream(lexer);
let parser = new SOQLParser(tokenStream);
let qry = parser.soql_query();
let select = qry.select_clause();
console.log('FIELDS: ', select.field().map(field => field.text));
console.log('FROM: ', qry.from_clause().text);
Console Log
line 1:35 extraneous input 'FROM' expecting ID
line 1:47 mismatched input '<EOF>' expecting 'FROM'
FIELDS: Array(5) ["ID", "Name", "Website", "Contact", "FROMAccount"]
FROM:
I get errors (which is expected), but I was hoping it would still be able to correctly pick out the FROM clause.
It was my understanding that, since FROM is an identifier, it's not a valid field in the select_clause (maybe I'm just misunderstanding)?
Is there some way to set up the grammar or parser so that it will continue on to properly identify the FROM clause in this scenario (and other common WIP query states)?
> It was my understanding since FROM is a identifier, it's not a valid field in the select_clause (maybe I'm just misunderstanding)?
All the parser sees is a discrete stream of typed tokens coming from the lexer. The parser has no intrinsic way to tell if a token is intended to be an identifier, or for that matter, have any particular semantic nature.
In designing a fault-tolerant grammar, plan the parser to be fairly permissive to syntax errors and expect to use several tree-walkers to progressively identify and where possible resolve the syntax and semantic ambiguities.
Two ANTLR features particularly useful to this end include:
1) implement a lexer TokenFactory and custom token, typically extending CommonToken. The custom token provides a convenient space for flags and logic for identifying the correct syntactic/semantic use and expected context for a particular token instance.
2) implement a parser error strategy, extending or expanding on the DefaultErrorStrategy. The error strategy will allow modest modifications to the parser operation on the token stream when an attempted match results in a recognition error. If the error cannot be fully resolved and appropriately fixed upon examining the surrounding (custom) tokens, at least those same custom tokens can be suitably annotated to ease problem resolution during the subsequent tree-walks.
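As a rough illustration of the "permissive first pass, annotate the tokens" idea, here is a hand-written TypeScript sketch. It does not use antlr4ts at all: the tiny lexer, the `AnnotatedToken` shape (standing in for a custom token with room for flags) and `parseTolerant` are all hypothetical names of mine. The recovery policy (treat FROM as the point to resynchronize on) is an assumption about what the tooling wants, not something the default ANTLR strategy does. For what it's worth, the log in the question looks consistent with the default strategy dropping the unexpected FROM token and then matching Account as a field.

```typescript
// Hand-written illustration, not antlr4ts; every name here is hypothetical.
type SoqlTokenType = "SELECT" | "FROM" | "COMMA" | "ID";

interface AnnotatedToken {
  type: SoqlTokenType;
  text: string;
  note?: string; // space for flags, as a custom token class would provide
}

interface TolerantResult {
  fields: string[];
  table: string | null;
  problems: string[];
  tokens: AnnotatedToken[];
}

function lexSoql(input: string): AnnotatedToken[] {
  const tokens: AnnotatedToken[] = [];
  // Anything that is not a comma or an identifier (e.g. whitespace) is skipped.
  const pattern = /,|[A-Za-z_$][A-Za-z_$0-9]*/g;
  let match = pattern.exec(input);
  while (match !== null) {
    const text = match[0];
    const type: SoqlTokenType =
      text === "," ? "COMMA" : text === "SELECT" ? "SELECT" : text === "FROM" ? "FROM" : "ID";
    tokens.push({ type, text });
    match = pattern.exec(input);
  }
  return tokens;
}

function parseTolerant(input: string): TolerantResult {
  const tokens = lexSoql(input);
  const result: TolerantResult = { fields: [], table: null, problems: [], tokens };
  let i = 0;
  if (i < tokens.length && tokens[i].type === "SELECT") {
    i++;
  } else {
    result.problems.push("missing SELECT");
  }
  // Read the field list, using FROM as the point to resynchronize on.
  let needField = true;
  let danglingComma: AnnotatedToken | null = null;
  for (; i < tokens.length && tokens[i].type !== "FROM"; i++) {
    const tok = tokens[i];
    if (tok.type === "ID" && needField) {
      result.fields.push(tok.text);
      needField = false;
      danglingComma = null;
    } else if (tok.type === "COMMA" && !needField) {
      needField = true;
      danglingComma = tok;
    } else {
      tok.note = "unexpected";
      result.problems.push("unexpected '" + tok.text + "'");
    }
  }
  if (danglingComma !== null) {
    // Annotate the token instead of aborting, so later passes can report it.
    danglingComma.note = "dangling";
    result.problems.push("dangling ',' after last field");
  }
  if (i < tokens.length && tokens[i].type === "FROM") {
    const next = i + 1 < tokens.length ? tokens[i + 1] : undefined;
    if (next !== undefined && next.type === "ID") {
      result.table = next.text;
    } else {
      result.problems.push("missing table name");
    }
  } else {
    result.problems.push("missing FROM");
  }
  return result;
}

const recovered = parseTolerant("SELECT ID, Name, Website, Contact, FROM Account");
// recovered.fields -> ["ID", "Name", "Website", "Contact"]; recovered.table -> "Account"
```

In a real ANTLR setup the same decisions would live in a custom error strategy and token class rather than in a hand-rolled loop. The point is only that the trailing comma gets flagged while the FROM clause is still recognised.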

SQL simulator on Prolog

I need to write a SQL simulator in Prolog. I need to simulate some functions like create_table, insert, delete, update_table, select_table, drop_table, show_table, etc. I am learning how to use assert and retract, but I'm getting some errors in my first predicate, create_table(N,A), where N is the name of the table and A is a list of its attributes.
An example is create_table("student",["name","lastname","age"]). This will create my table named "student" with the attributes ["name","lastname","age"].
I was searching for how to work with assert, and I see that I need to declare the predicate dynamic before calling assert, so my code is:
len([],0).
len([_|T],N) :- len(T,X), N is X+1.
create_table(_, []):-!.
create_table(N, Atributos):- len(Atributos, L), dynamic N/L, assert(N(Atributos)).
But I get the error ":7: Syntax error: Operator expected" on this line:
create_table(N, Atributos):- len(Atributos, L), dynamic N/L, assert(N(Atributos)).
What am I doing wrong? Excuse me, I don't speak English well.
From the error message, it seems you're using SWI-Prolog.
Your code could be as follows (note that len/2 can be replaced by the builtin length/2):
create_table(N, Atributos) :-
    length(Atributos, L),
    dynamic(N/L),
    T =.. [N|Atributos], % this missing 'constructor' was the cause of the error message
    assert(T). % this is problematic
There is an important 'design' problem: the CREATE TABLE SQL statement works at metadata level.
If you do, for instance,
?- assertz(student('Juan', 'Figueira', 20)).
pretending that the predicate student/3 holds table data, you're overlapping data and metadata.
Using dynamic/1 and assert is a tricky, non-logical aspect of Prolog, and dynamically creating dynamic predicates is really unusual. Fundamentally, you cannot have a Prolog query with the predicate name as a variable, e.g.
query(N,X) :- N=student, N(X).
My suggestion is you remove a layer of complexity and have one dynamic predicate 'table', and assert your SQL tables as new 'table' clauses i.e.
:- dynamic table/2.
:- assertz(table(student,['Joe','Young',18])).
query(N,X) :- table(N,X).
:- query(student,[Name,Lastname,Age]).
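The same "one generic store, table name as plain data" design can be sketched outside Prolog. The TypeScript below is only an analogy to the table/2 suggestion above. Keeping the attribute lists in a separate `tableSchemas` array is my own assumption about how to hold the metadata that CREATE TABLE works on, and all function names are illustrative.

```typescript
// Sketch of the "one generic store" design; all names are illustrative.
type Cell = string | number;

interface TableSchema { name: string; attributes: string[]; }
interface TableFact { table: string; values: Cell[]; }

const tableSchemas: TableSchema[] = []; // metadata: what CREATE TABLE records
const tableFacts: TableFact[] = [];     // data: the analogue of the table/2 facts

function findSchema(name: string): TableSchema | undefined {
  return tableSchemas.filter(s => s.name === name)[0];
}

function createTable(name: string, attributes: string[]): void {
  if (findSchema(name) !== undefined) {
    throw new Error("table already exists: " + name);
  }
  tableSchemas.push({ name, attributes });
}

function insertRow(name: string, values: Cell[]): void {
  const schema = findSchema(name);
  if (schema === undefined || schema.attributes.length !== values.length) {
    throw new Error("unknown table or wrong number of values: " + name);
  }
  tableFacts.push({ table: name, values });
}

// The table name is an ordinary argument, like N in query(N,X) :- table(N,X).
function selectAll(name: string): Cell[][] {
  return tableFacts.filter(f => f.table === name).map(f => f.values);
}

createTable("student", ["name", "lastname", "age"]);
insertRow("student", ["Joe", "Young", 18]);
const students = selectAll("student");
// students -> [["Joe", "Young", 18]]
```

Because the table name is just a value, creating or dropping a table never has to define or remove a predicate (or, here, a function) at run time.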

ANTLR: return always the same number of children

I have the following rule:
statement : TOKEN1 opt1=TOKEN2? opt2=TOKEN3 TOKEN4 -> ^(TOKEN1 $opt1? $opt2);
The AST generated by this rule will have one or two children (depending on if
opt1 was defined or not).
I need to have always a fixed number of children (in this case 2). I know that
this can be achieved by doing the following (UNDEFINED is an imaginary token):
statement : TOKEN1 opt2=TOKEN3 TOKEN4 -> ^(TOKEN1 UNDEFINED $opt2)
          | TOKEN1 opt1=TOKEN2 opt2=TOKEN3 TOKEN4 -> ^(TOKEN1 $opt1 $opt2);
This is fine for just one optional token. The problem is when I have a higher
number of optional tokens. A lot of rules must be written in order to catch all
possible combinations. How can this issue be solved in an elegant way?
I'm using ANTLR 3.4/C target by the way.
Thanks,
T.
You could do this:
grammar G;
tokens {
    CHILD1;
    CHILD2;
    CHILD3;
}
...
statement
    : ROOT t2=TOKEN2? t3=TOKEN3? t4=TOKEN4?
      -> ^(ROOT ^(CHILD1 $t2?) ^(CHILD2 $t3?) ^(CHILD3 $t4?))
    ;
which will cause the AST to always have 3 child nodes (each of which may or may not have a token as a child itself).
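The resulting tree shape can be pictured with a small TypeScript model. This is not ANTLR-generated code, and `AstNode`, `wrapOptional` and `buildStatement` are hypothetical names. Each optional token gets its own always-present wrapper node, so the root's child count never changes.

```typescript
// Plain-data model of the tree the rewrite rule above produces; names are illustrative.
interface AstNode { label: string; children: AstNode[]; }

// The wrapper node always exists; it is empty when the optional token is absent.
function wrapOptional(label: string, token?: string): AstNode {
  const children: AstNode[] = token === undefined ? [] : [{ label: token, children: [] }];
  return { label, children };
}

function buildStatement(t2?: string, t3?: string, t4?: string): AstNode {
  return {
    label: "ROOT",
    children: [
      wrapOptional("CHILD1", t2),
      wrapOptional("CHILD2", t3),
      wrapOptional("CHILD3", t4),
    ],
  };
}

// Only the second optional token is present.
const onlyThird = buildStatement(undefined, "TOKEN3", undefined);
// onlyThird.children.length -> 3
```

A tree walker can then address children by fixed position and only has to check whether the wrapper at that position is empty.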

ANTLR ambiguous reference - how to get output?

So I have a rule for statement which can lead to more statements:
statement returns[String txt]
    : '{'{
        $txt="{";
      }
      (statement{
        $txt+=$statement.txt;
      })*
      '}'{
        $txt+="}";
      }
    | ... //more rules // ...
    ;
I am getting
reference $statement is ambiguous; rule statement is enclosing rule and referenced in the production (assuming enclosing rule)
but don't know how to resolve it. Somehow I would need to tell ANTLR that I need the return txt of statement inside parent statement. Please help me out :)
If you use $statement, ANTLR doesn't know if you mean the rule itself, or the statement inside ( ... )*.
Try something like this:
statement returns[String txt]
    : '{'{
        $txt="{";
      }
      (s=statement{
        $txt+=$s.txt;
      })*
      '}'{
        $txt+="}";
      }
    | ...
    ;