ANTLR AST and Visitor Pattern - antlr

I am trying to understand how to apply the Visitor pattern once I have an AST generated from an ANTLR grammar. Is it a different approach from traversing a tree?
For example, assume I have the following AST (from this question):
If I want to append the FUNDEF id to all of its VARDECL ids, my approach is to do a tree traversal to find every VARDECL inside BLOCK, add a new child with (FUNDEF id + old_name), and remove the old one.
Is this the correct approach, or would the visitor pattern somehow work better? If the visitor pattern, what would sample code look like in this specific case?
Thanks in advance!

I doubt the visitor pattern would help here. It is used when you have a data structure and take this to every node for changes. A typical case for this as described in "Head First Design Patterns" by Freeman & Freeman is ordering a Coffee. This is your object that visits each compositor for additional information (milk, sugar).
Your original idea of using a tree walk to modify the AST is probably the best option you have.
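As a rough illustration of such a tree walk, here is a minimal Python sketch. It does not use the ANTLR runtime: the Node class, the function name prefix_vardecls, and the assumption that the first child of both FUNDEF and VARDECL is the ID node are all hypothetical stand-ins for whatever your real AST looks like.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical AST node: a token type, its text, and child nodes."""
    kind: str
    text: str = ""
    children: list = field(default_factory=list)


def prefix_vardecls(node, fun_id=None, in_block=False):
    """Walk the tree and rename every VARDECL id found inside a BLOCK
    of a FUNDEF to (FUNDEF id + old_name)."""
    if node.kind == "FUNDEF":
        # Assumption: the first child of FUNDEF is its ID node.
        fun_id = node.children[0].text
        in_block = False
    elif node.kind == "BLOCK":
        in_block = True
    elif node.kind == "VARDECL" and fun_id is not None and in_block:
        # Assumption: the first child of VARDECL is its ID node.
        old = node.children[0]
        # Replace the old ID child with a new one carrying the new name.
        node.children[0] = Node("ID", fun_id + old.text)
    for child in node.children:
        prefix_vardecls(child, fun_id, in_block)


tree = Node("FUNDEF", children=[
    Node("ID", "f"),
    Node("BLOCK", children=[
        Node("VARDECL", children=[Node("ID", "x")]),
        Node("VARDECL", children=[Node("ID", "y")]),
    ]),
])
prefix_vardecls(tree)
```

The enclosing FUNDEF id is passed down as an argument, so nested function definitions would each prefix only their own declarations.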

String - Matching Automaton

I am trying to find the occurrences of s in d, where s = "infinite" and d = "ininfinintefinfinite ", using a finite automaton. The first step is to construct a state diagram for s, but I am having trouble identifying where the pattern occurs. I am quite confused about this. If someone could explain this topic a little, it would be really helpful.
You should be able to use regular expressions to accomplish your goal. Every programming language I've seen supports regular expressions, which should make your task much easier. You can test your regexes at https://regexr.com/, which also has a reference sheet. For your example, the regex /(infinite)/ should do the trick.
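If you do need the automaton itself rather than a regex, the following Python sketch may help. It builds the transition table in the straightforward textbook way: from state q on character ch, go to the length of the longest prefix of the pattern that is a suffix of pattern[:q] + ch. The function names build_dfa and find_all are my own, and a non-empty pattern is assumed.

```python
def build_dfa(pattern):
    """Return delta, where delta[q][ch] is the next state.
    State q means 'the last q characters read match pattern[:q]'."""
    m = len(pattern)
    alphabet = set(pattern)
    delta = [dict() for _ in range(m + 1)]
    for q in range(m + 1):
        for ch in alphabet:
            seen = pattern[:q] + ch
            k = min(m, q + 1)
            # Shrink k until pattern[:k] is a suffix of what was seen.
            while k > 0 and not seen.endswith(pattern[:k]):
                k -= 1
            delta[q][ch] = k
    return delta


def find_all(pattern, text):
    """Return the start index of every occurrence of pattern in text."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    delta = build_dfa(pattern)
    m = len(pattern)
    state = 0
    hits = []
    for i, ch in enumerate(text):
        # Characters outside the pattern's alphabet reset to state 0.
        state = delta[state].get(ch, 0)
        if state == m:
            hits.append(i - m + 1)
    return hits


print(find_all("infinite", "ininfinintefinfinite "))  # prints [12]
```

Tracing d by hand shows why the early "infini" prefix does not count: after "infini" the automaton is in state 6, the next character is "n" rather than "t", and it falls back to state 2 (the prefix "in") instead of starting over.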

Why is Filter structural while Interpreter is behavioral?

The Filter and Interpreter design patterns both look similarly task-oriented. Filter filters a list by some criteria, while Interpreter does much the same for a single element.
But I wonder why Filter is structural and Interpreter is behavioral. Does anyone have an idea?
Although it is true that they are both "task-oriented", these two patterns serve different purposes. Let's take the example of an SQLTable class.
The Filter pattern can filter, remove, or hide data; more generally, it affects the structure of your database but does not modify its behavior at all.
Example 1: Once filtered, it is just a new SQLTable with fewer/more rows and maybe fewer/more columns.
The Interpreter pattern is a behavioral pattern in the sense that it modifies the behavior of an object (often represented with the help of a structural pattern such as Composite). The difference lies in interpreting the structure so that it behaves differently.
Example 2: Once interpreted as a CSV table, your SQLTable can now be exported to a PDF file.
I guess your misunderstanding comes from the fact that both are applied to a structure in order to create something else, but the actual difference lies in their intent rather than in their concrete implementations, which are relatively close in practice.

SQL queries to their natural language description

Are there any open source tools that can generate a natural language description of a given SQL query? If not, some general pointers would be appreciated.
I don't know much about NLP, so I am not sure how difficult this is, although I saw from some previous discussion that the reverse conversion is still an active area of research. It might help to say that the SQL tables I will be handling are not arbitrary but my own, which means that I know the exact semantics of each table and its columns.
I can devise two approaches:
SQL was intended to be "legible" to non-technical people. A naïve and simpler way would be to perform a series of replacements directly on the SQL query: "SELECT" -> "display"; "X=Y" -> "when the field X equals the value Y"... In this approach, functions may be problematic.
Use a SQL parser and a series of templates to realize the parsed structure in textual form: "(SELECT (SUM(X)) (FROM (Y)))" -> "(display (the summation of (X)) (in the table (Y)))"...
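The second approach can be sketched in a few lines of Python. This assumes the query has already been parsed into nested tuples of the form (head, child, ...); the TEMPLATES table and the realize function are hypothetical names, and a real tool would take its tree from an actual SQL parser.

```python
# One template per node type; {0}, {1}, ... are filled with the
# realized text of the node's children, in order.
TEMPLATES = {
    "SELECT": "display {0} {1}",
    "SUM": "the summation of {0}",
    "FROM": "in the table {0}",
}


def realize(node):
    """Turn a parsed node into text by filling in its template."""
    if isinstance(node, str):
        return node  # leaf: a column or table name
    head, *children = node
    return TEMPLATES[head].format(*(realize(c) for c in children))


tree = ("SELECT", ("SUM", "X"), ("FROM", "Y"))
print(realize(tree))  # prints: display the summation of X in the table Y
```

Because you know the exact semantics of your own tables, leaf names could also be looked up in a dictionary of human-readable descriptions before being inserted into the templates.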
ANTLR has a SQL grammar you can use: https://github.com/antlr/grammars-v4/blob/master/sqlite/SQLite.g4 and there are a couple of SQL parsers:
http://www.sqlparser.com/sql-parser-java.php
https://github.com/facebook/presto/tree/master/presto-parser/src/main
http://db.apache.org/derby/
Parsing is a core process in executing a SQL query; see this for more information: https://decipherinfosys.wordpress.com/2007/04/19/parsing-of-sql-statements/
There is a new project (which I am part of) called JustQuery.Me that intends to do just that with NLP and Google's SyntaxNet. You can go to https://github.com/justquery-me/justqueryme for more info. Also, sign up for the mailing list at justqueryme-development@googlegroups.com and we will notify you when we have a proof of concept ready.

move subtree from one part of AST to another

I am working on a tool to convert Oracle SQL to ANSI SQL. I have a grammar that will parse both Oracle SQL and ANSI SQL.
I want to extract the Oracle outer join expressions from the where clause part of the AST and insert new join clauses at the end of the from clause part of the AST for the matching select or subquery.
Can a tree parser with rewrite rules do this type of tree transformation?
i.e. take an AST generated from Oracle SQL
SELECT
a.columna, b.columnb
FROM
tablea a,
tableb b
WHERE
a.columna2 (+) = b.columnb2 (+)
AND
a.columna3 = 'foo'
AND
b.columnb3 = 'bar'
and transform it to an AST for ANSI SQL
SELECT
a.columna, b.columnb
FROM
tablea a FULL OUTER JOIN tableb b ON (a.columna2 = b.columnb2 )
WHERE
a.columna3 = 'foo'
and
b.columnb3 = 'bar'
NOTE1: the table references for tablea and tableb are deleted from the FROM clause and replaced with a JOIN clause referring to the same tables and table aliases.
NOTE2: the Oracle join condition is identified as a FULL OUTER JOIN by the presence of the OuterJoinIndicator (+) on both sides of the sql_condition comparison.
NOTE3: the join condition comparison is deleted from the WHERE clause and used to construct the join clause ON condition [with the OuterJoinIndicator(s) removed].
Yes, this is quite possible, especially since you have a grammar that recognizes both Oracle and ANSI SQL. I once wrote a translator from AREV BASIC to Visual BASIC and did many similar transformations.
In my project I used ANTLR 2 and wrote a master tree grammar which did nothing but completely walk the tree according to all rules in my grammars. I then used ANTLR 2's subclassing to override specific rules to do the transformations. I liked this as it let me build up the translation in passes and keep all my expression handling in one pass, control structures in another pass, etc.
ANTLR 3 does not provide grammar subclassing, so you won't be able to use that approach. You will need a complete tree grammar to print out your resulting tree. Personally, I would write that tree grammar first and get it working properly. Then I would copy that grammar and strip all the actions out but put in the option to rewrite the AST. Then modify the rules you need for your transformation. If you do many transformations you may want to use multiple passes, one tree grammar for each pass. You may have a pass or two that does analysis to help drive the later passes. On my BASIC translation project I did control flow analysis, data flow analysis and dead code removal as analysis passes.
If you want help writing the specific transformation you'll need to share your tree grammar. There are quite a few tree grammar idioms to wrap your head around. Terence's ANTLR 3 book would be a valuable purchase if you need help there. If you haven't written the tree grammar yet then post questions when you get stuck. Choosing the correct root nodes is important. If you want to get an idea of how to build trees and tree parsers, you can look at my C grammar. It is ANTLR 2, but the tree building concepts are the same. http://www.antlr3.org/grammar/cgram/grammars/
Do you need to retain comments and formatting? That adds another layer of complexity, for which I would recommend creating another question.
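Independent of ANTLR, the transformation described in the question's three notes can be sketched on a toy AST. The Python below is only an illustration under simplifying assumptions: the classes Cond, Table, Join and Select and the function move_full_outer_joins are hypothetical, the WHERE clause is a flat list of AND-ed comparisons, and only the case with (+) on both sides is handled.

```python
from dataclasses import dataclass


@dataclass
class Cond:
    left: str                  # e.g. "a.columna2"
    right: str                 # e.g. "b.columnb2" or a literal
    left_outer: bool = False   # True if "(+)" follows the left operand
    right_outer: bool = False  # True if "(+)" follows the right operand


@dataclass
class Table:
    name: str
    alias: str


@dataclass
class Join:
    kind: str
    left: Table
    right: Table
    on: Cond


@dataclass
class Select:
    columns: list
    from_items: list  # Table or Join nodes
    where: list       # Cond nodes, implicitly AND-ed


def find_table(from_items, operand):
    """Return the plain table reference whose alias qualifies operand."""
    alias = operand.split(".")[0]
    for item in from_items:
        if isinstance(item, Table) and item.alias == alias:
            return item
    return None


def move_full_outer_joins(query):
    """Return a new Select with Oracle-style outer join conditions
    moved from WHERE into JOIN nodes at the end of FROM."""
    from_items = list(query.from_items)
    where = []
    for cond in query.where:
        left = find_table(from_items, cond.left)
        right = find_table(from_items, cond.right)
        if (cond.left_outer and cond.right_outer
                and left and right and left is not right):
            # NOTE1: drop both table references, append a join instead.
            from_items = [t for t in from_items
                          if t is not left and t is not right]
            # NOTE3: reuse the comparison without the (+) indicators.
            on = Cond(cond.left, cond.right)
            from_items.append(Join("FULL OUTER JOIN", left, right, on))
        else:
            where.append(cond)
    return Select(query.columns, from_items, where)
```

The function builds a new Select rather than mutating the input, which keeps each pass easy to test in isolation if you end up with several passes.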
If you have two different grammars, you are likely to discover that the "small differences" in the grammars lead to quite considerably different ASTs for these clauses, and so your real problem is to convert the tree structure for one into the tree structure for another. And you'll have to do this piecewise for the whole tree because such differences are spread all across the grammars. YMMV.
ANTLR's tree parser will pretty likely let you recognize arbitrary fragments; these are certainly cues for generating equivalents in the other grammar's AST. But you'll have to write lots of such fragments, and code the corresponding routines to assemble the equivalent tree node by node. As a general rule, for a large grammar (such as Oracle SQL) this can be quite a lot of work. You can do it this way.
An alternative is program transformation systems. These are tools that allow you to write surface syntax patterns (e.g., phrases in Oracle SQL and ANSI SQL) to code and apply your transformations directly. Writing transformations this way is considerably easier IMHO. You'd end up writing something like this:
source domain Oracle.
target domain ANSISQL.
rule xlate_Oracle_SELECT(c: columns, t1: table, t2: table,
c1: column, c2: column,
more_conditions: conditions):SQL_phrase
"SELECT \c FROM \t1, \t2 WHERE \c1 (+) = \c2 (+) and \more_conditions";
=>
"SELECT \c FROM \t1 FULL OUTER JOIN \t2 on ( \c1 = \c2 ) WHERE \more_conditions";
(The backslash-IDs are pattern variables which can match an arbitrary subtree of the declared syntax type legal in that location.)
The reason this works is that the transformation tool parses the first pattern with the first grammar, and so gets a tree it can match on trees of the first grammar, and similarly parses the second pattern using the second grammar, getting a replacement tree that follows the rules of the second grammar. The transformation engine matches the tree for first pattern, and substitutes the tree for the second. So such a rule transforms a small set of blue tree nodes from the blue tree, to a small set of green nodes of the desired tree type. The color analogy should make it clear that you have to translate all the blue nodes into green ones if you want an accurate translation.
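To make the match-and-substitute step concrete, here is a toy Python sketch. It is not how any particular transformation system is implemented: trees are plain nested tuples, pattern variables are strings starting with "?", and the names match, substitute and rewrite are my own. A real tool would obtain both pattern trees by parsing the surface syntax with the two grammars.

```python
def is_var(x):
    return isinstance(x, str) and x.startswith("?")


def match(pattern, tree, env):
    """Try to match pattern against tree, recording bindings in env."""
    if is_var(pattern):
        if pattern in env:
            return env[pattern] == tree  # repeated variable must agree
        env[pattern] = tree
        return True
    if isinstance(pattern, tuple) and isinstance(tree, tuple):
        return (len(pattern) == len(tree)
                and all(match(p, t, env) for p, t in zip(pattern, tree)))
    return pattern == tree


def substitute(template, env):
    """Instantiate the replacement tree with the recorded bindings."""
    if is_var(template):
        return env[template]
    if isinstance(template, tuple):
        return tuple(substitute(t, env) for t in template)
    return template


def rewrite(tree, lhs, rhs):
    """Apply the rule lhs => rhs at the topmost nodes where it matches."""
    env = {}
    if match(lhs, tree, env):
        return substitute(rhs, env)
    if isinstance(tree, tuple):
        return tuple(rewrite(child, lhs, rhs) for child in tree)
    return tree


LHS = ("select", "?c", ("from", "?t1", "?t2"),
       ("where", ("and", ("outer_eq", "?c1", "?c2"), "?more")))
RHS = ("select", "?c",
       ("from", ("full_outer_join", "?t1", "?t2", ("eq", "?c1", "?c2"))),
       ("where", "?more"))
```

Each variable binds a whole subtree, so the conditions bound to "?more" are carried over unchanged regardless of how large they are.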
You'd need additional rules to translate the various subclauses just to paper over the differences in the grammar, e.g.,:
rule translate column(t: IDENTIFIER, c: IDENTIFIER, ):table->table
"\t.\c" -> " \toSQLidentifier\(\t\).\toSQLidentifier\(\c\)";
This would handle differences in how the two languages spell identifiers, by calling a custom function toSQLidentifier that does string hacking.
I don't think ANTLR supports this kind of transformation rule. You can simulate it all with lots of code.
You might avoid some of this if you have one "union" grammar for both languages (which is what you imply), but this usually gets you a highly ambiguous grammar and that's a huge amount of trouble. If you have succeeded in this, then you only have to apply translation rules where the languages differ (e.g., everything is a blue node).
You can also hack it: scan the tree left to right; prettyprint the parts that are equivalent (figuring this out is harder than it looks), where they differ, prettyprint a substitution. This is a very fragile way to do this.

Checking logical implication relationships between OWL expressions?

I have a simple question which I suspect has no simple answer. Essentially, I want to check whether it is true that one OWL expression (#B) follows on logically from another (#A) - in other words I want to ask: is it true that #A -> #B?
The reason for this is that I'm writing a matching algorithm for an application which matches structures in a knowledge base (represented by the #KnowledgeStructure class) to a structure which describes the needs of the current application state (#StateRequirement). Both structures have properties whose string values represent OWL expressions over the state of a third kind of structure (#Model). These are: #KnowledgeStructure.PostCondition, which expresses how applying the knowledge structure to #Model will transform #Model; and #StateRequirement.GoalCondition, which expresses the #Model state that the application aims to achieve. I therefore want to see whether the #KnowledgeStructure will satisfy the #StateRequirement by checking that the #KnowledgeStructure.PostCondition produces the desired #StateRequirement.GoalCondition. I could express this abstractly as: (#KnowledgeStructure.PostCondition => #StateRequirement.GoalCondition) => Match(#KnowledgeStructure, #StateRequirement). Less confusingly, I could express this as: ((#A -> #B) -> Match(#A, #B)), where both #A and #B are valid OWL expressions.
In the general case I would like to be able to express the following rule: "If it is true that the expression #B follows from #A, then the expression Match(#A, #B) is also true".
Essentially, my question is this: how do I pose or realise such a rule in OWL? How do I test whether one expression follows from another expression? Also, are existing reasoners sufficiently powerful to determine the relation #A -> #B between two expressions if this relation is not explicitly stated?
I'm not 100% sure that I fully understood the question, but from what I grasped I would approach the situation this way.
First of all, I refer to Java because all the libraries I know are meant for that language. Secondly, I don't think that OWL on its own is able to satisfy your goal: it can represent rules and axioms, but it does not provide reasoning. You need a reasoner, so you need to build a program that uses one and does the additional processing that I sketch below:
1) You didn't mention it, but I guess you have an underlying ontology with respect to which you need to prove your consequence relation (what you denote with the symbol "->"). If the ontology is not explicit, maybe it can be extracted or composed from the textual expressions you mentioned in the question.
2) You need a library for ontology manipulation. I suggest the OWL API from Manchester University; it is very powerful and simple. The tutorial under the "documentation" section gives an overview of the main functionality, including the use of reasoners (the example shows HermiT, but the principle holds for any other reasoner).
3) At this point you need to check whether the ontology is consistent (otherwise anything can be derived, as often happens with false premises).
4) Add the following axiom to the ontology (you can build it directly in Java and let the reasoner work on the in-memory representation, with no need to serialize back) and check for consistency: A \sqsubseteq B. Under the associated interpretation this means A^I \subseteq B^I, so it is equivalent to A => B (they have the same truth table).
5) At this point you can add the axiom Match(A,B), where A and B are your class expressions and Match is a Role/Relation that relates all the class expressions for which the second is a consequence of the first.
6) After a number of repetitions of these steps you may want to serialize the result and store it, which again can be achieved quite simply from the in-memory representation using the OWL API.
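The check in step 4 can be made precise with the standard reduction of subsumption to unsatisfiability. This is a sketch under the usual description-logic semantics; the symbol \mathcal{O} for the ontology is my own notation.

```latex
\mathcal{O} \models A \sqsubseteq B
\quad\Longleftrightarrow\quad
A \sqcap \lnot B \ \text{is unsatisfiable with respect to } \mathcal{O}
```

The ontology merely staying consistent after A \sqsubseteq B is added only shows that the axiom is compatible with \mathcal{O}, which is weaker than saying it follows from \mathcal{O}. In the OWL API, the OWLReasoner interface offers isEntailed (for an axiom) and isSatisfiable (for a class expression), either of which should be usable for this test, provided the chosen reasoner supports the axiom type.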
For some basics about Description Logics (the logic underpinning OWL ontologies) you can refer to A Description Logic Primer (2012), Horrocks et al., and to Foundations of Description Logics (2011), Rudolph.
I'm not a logician or a DL expert, so please verify all the information I provided and feel free to correct me :)