antlr4: conditional code gen handling in visitor


I am writing a transpiler (own language -> JS) using ANTLR (JavaScript target, with a visitor).
The focus here is on variations in target code generation.
An earlier SO post described a solution to a simpler situation. This one is different, primarily because one rule is reused inside another.
Grammar:
grammar Alang;
...
variableDeclaration : DECL (varDecBl1 | varDecBl2) EOS;
varDecBl1 : ID AS TYP;
varDecBl2 : CMP ID US (SGST varDecBl1+ SGFN);
...
DECL : 'var-declare' ;
EOS : ';' ;
SGST : 'segment-start:' ;
SGFN : 'segment-finish' ;
AS : 'as';
CMP : 'cmp';
US : 'using';
TYP
: 'str'
| 'int'
| 'bool'
;
ID : [a-zA-Z0-9_]+ ;
Two different cases of source code need to be handled differently.
Source code 1:
var-declare fooString as str;
target of which needs to come out as: var fooString;
Source code 2:
var-declare cmp barComplex using
segment-start:
barF1 as int
barF2 as bool
barF3 as str
segment-finish;
target of this one has to be: var barComplex = new Map();
(as simple var declaration can't handle value type)
code generation done using:
visitVarDecBl1 = function(ctx) {
this.targetCode += `var ${ctx.getChild(0).getText()};`;
return this.visitChildren(ctx);
};
....
visitVarDecBl2 = function(ctx) {
this.targetCode += `var ${ctx.getChild(1).getText()} = new Map();`;
return this.visitChildren(ctx);
};
(targetCode accumulates the generated target code.)
The above works for case 1.
For case 2, generation continues beyond var barComplex = new Map(); because varDecBl2 reuses the rule varDecBl1, so visiting its children invokes the visitVarDecBl1 code generation again for each segment.
Wrong outcome:
var barComplex = new Map();
var barF1; // <---- wrong one coming from visitVarDecBl1
var barF2; // <---- wrong
var barF3; // <---- wrong
To deal with this, one approach I want to try is to make visitVarDecBl1 conditional on the parent ctx.
If parent = variableDeclaration, target code = var ${ctx.getChild(0).getText()};.
If parent = varDecBl2, skip code generation.
But I can't find the invoking rule within the ctx payload in a form that I can string-compare.
Using something like ctx.parentCtx gives me [378 371 204 196 178 168] (hashes?).
Input is welcome, including proposals of a better approach, if any.

In case anyone comes looking with a similar situation at hand, I am posting the answer myself.
I dealt with it by extracting the parent rule name myself.
const util = require("util");
...
// parentCtx object -> string
const parentCtxStr = util.inspect(ctx.parentCtx);
// snip out parent rule from payload, e.g.: 'varDecBl2 {…, ruleIndex: 16, …}'
const snip = parentCtxStr.slice(0,parentCtxStr.indexOf("{")-1); // extracts varDecBl2 for you
}
Now use the value of snip to branch as described above.
Of course, this couples the code to how the inspected payload is structured, and it risks breaking if that structure changes in the future.
I'd rather get the same information through the API, though I couldn't find it.
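A possible way to avoid parsing the util.inspect output is sketched below. It assumes that each rule context exposes a numeric ruleIndex and that the generated parser exposes a ruleNames array, which is how I understand the JavaScript runtime to work; please verify this against your ANTLR version. The ruleNames table, the index values, the parentRuleName helper and the mock contexts are all made up for illustration, so that the snippet runs without a generated parser.

```javascript
// Stand-in for the generated parser's rule-name table (assumed to be
// available as parser.ruleNames in real code); the indices are made up.
const ruleNames = ["variableDeclaration", "varDecBl1", "varDecBl2"];

// Hypothetical helper: name of the rule that invoked ctx, or null at the root.
function parentRuleName(ctx, names) {
  const parent = ctx.parentCtx;
  return parent ? names[parent.ruleIndex] : null;
}

// Minimal visitor-like object showing the conditional code generation.
const gen = {
  targetCode: "",
  visitVarDecBl1(ctx) {
    // Emit only for a direct declaration, not for a segment of varDecBl2.
    if (parentRuleName(ctx, ruleNames) === "variableDeclaration") {
      this.targetCode += `var ${ctx.getChild(0).getText()};`;
    }
  },
};

// Hand-built mock contexts standing in for a real parse tree.
const mockCtx = (id, parentRuleIndex) => ({
  parentCtx: { ruleIndex: parentRuleIndex },
  getChild: () => ({ getText: () => id }),
});

gen.visitVarDecBl1(mockCtx("fooString", 0)); // parent: variableDeclaration
gen.visitVarDecBl1(mockCtx("barF1", 2));     // parent: varDecBl2, skipped
console.log(gen.targetCode); // var fooString;
```

An alternative that needs no parent lookup at all may be to return from visitVarDecBl2 without calling this.visitChildren(ctx), so that the nested varDecBl1 nodes are never visited.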

Related

Is there a way to get back source code from antlr4ts parse tree after modifications ctx.removeLastChild/ctx.addChild? [duplicate]

I want to keep white space when I call text attribute of token, is there any way to do it?
Here is the situation:
We have the following code
IF L > 40 THEN;
ELSE
IF A = 20 THEN
PUT "HELLO";
In this case, I want to transform it into:
if (!(L>40)){
if (A=20)
put "hello";
}
The rule in Antlr is that:
stmt_if_block: IF expr
THEN x=stmt
(ELSE y=stmt)?
{
if ($x.text.equalsIgnoreCase(";"))
{
WriteLn("if(!(" + $expr.text +")){");
WriteLn($stmt.text);
WriteLn("}");
}
}
But the result looks like:
if(!(L>40))
{
ifA=20put"hello";
}
The reason is that the whitespace in $stmt was removed. I was wondering if there is any way to keep this whitespace.
Thank you so much
Update: If I add
SPACE: [ ] -> channel(HIDDEN);
The space will be preserved, and the result would look like below, many spaces between tokens:
IF SUBSTR(WNAME3,M-1,1) = ')' THEN M = L; ELSE M = L - 1;

This is the C# extension method I use for exactly this purpose:
public static string GetFullText(this ParserRuleContext context)
{
if (context.Start == null || context.Stop == null || context.Start.StartIndex < 0 || context.Stop.StopIndex < 0)
return context.GetText(); // Fallback
return context.Start.InputStream.GetText(Interval.Of(context.Start.StartIndex, context.Stop.StopIndex));
}
Since you're using java, you'll have to translate it, but it should be straightforward - the API is the same.
Explanation: Get the first token, get the last token, and get the text from the input stream between the first char of the first token and the last char of the last token.

@Lucas's solution, but in Java, in case you have trouble translating it:
private String getFullText(ParserRuleContext context) {
if (context.start == null || context.stop == null || context.start.getStartIndex() < 0 || context.stop.getStopIndex() < 0)
return context.getText();
return context.start.getInputStream().getText(Interval.of(context.start.getStartIndex(), context.stop.getStopIndex()));
}

Looks like InputStream is not always updated after removeLastChild/addChild operations. This solution helped me for one grammar, but it doesn't work for another.
Works for this grammar.
Doesn't work for modern groovy grammar (for some reason inputStream.getText contains old text).
I am trying to implement function name replacement like this:
enterPostfixExpression(ctx: PostfixExpressionContext) {
// Get identifierContext from ctx
...
const token = CommonTokenFactory.DEFAULT.createSimple(GroovyParser.Identifier, 'someNewFnName');
const node = new TerminalNode(token);
identifierContext.removeLastChild();
identifierContext.addChild(node);
UPD: I used visitor pattern for the first implementation

Ability to explicit define "rule element labels"

Following https://github.com/antlr/antlr4/blob/master/doc/parser-rules.md#rule-element-labels is there a way to explicitly add a field to a rule context object?
My use case is a sequence of dots and identifiers:
dotIdentifierSequence
: identifier dotIdentifierSequenceContinuation*
;
dotIdentifierSequenceContinuation
: DOT identifier
;
Often we want to deal with the "full path" of the dotIdentifierSequence structure. Atm this means using DotIdentifierSequenceContext#getText. However, DotIdentifierSequenceContext#getText walks the tree visiting each sub-node collecting the text.
Rule labels as discussed on that doc page would let me do:
dotIdentifierSequence
: i:identifier c+=dotIdentifierSequenceContinuation*
;
and add fields i and c to the DotIdentifierSequenceContext. However to get the full structure's text I'd still have to visit the i node and each c node.
What would be awesome is to be able to define a "full sequence text" String field for both DotIdentifierSequenceContext and DotIdentifierSequenceContinuationContext.
Is that in any way possible today?

The only way I could find to do this was the following:
dotIdentifierSequence
returns [String fullSequenceText]
: (i=identifier { $fullSequenceText = _localctx.i.getText(); }) (c=dotIdentifierSequenceContinuation { $fullSequenceText += _localctx.c.fullSequenceText; })*
;
dotIdentifierSequenceContinuation
returns [String fullSequenceText]
: DOT (i=identifier { $fullSequenceText = "." + _localctx.i.getText(); })
;
Which works, but unfortunately makes the grammar quite unreadable.

Rust Arc/Mutex Try Macro Borrowed Content

I'm trying to do several operations with a variable that is shared across threads, encapsulated in an Arc<Mutex>. As not all of the operations may be successful, I'm trying to use the try! macro, or the ? operator, to auto-propagate the errors.
Here's a minimum viable example of my code:
lazy_static! {
static ref BIG_NUMBER: Arc<Mutex<Option<u32>>> = Arc::new(Mutex::new(Some(174)));
}
pub fn unpack_big_number() -> Result<u32, Box<Error>> {
let big_number_arc = Arc::clone(&BIG_NUMBER);
let mutex_guard_result = big_number_arc.lock();
let guarded_big_number_option = mutex_guard_result?;
let dereferenced_big_number_option = *guarded_big_number_option;
let big_number = dereferenced_big_number_option.unwrap();
// do some subsequent operations
let result = big_number + 5;
// happy path
Ok(result)
}
You will notice that in the line where I declare guarded_big_number_option, I have a ? at the end. This line is throwing the following compiler error (which it does not when I replace the ? with .unwrap()):
error[E0597]: `big_number_arc` does not live long enough
--> src/main.rs:32:30
|
7 | let mutex_guard_result = big_number_arc.lock();
| ^^^^^^^^^^^^^^ borrowed value does not live long enough
...
17 | }
| - borrowed value only lives until here
|
= note: borrowed value must be valid for the static lifetime...
Now the thing is, it is my understanding that I'm not trying to use big_number_arc beyond its lifetime. I'm trying to extract a potential PoisonError contained within the result. How can I properly extract that error and make this propagation work?
Additionally, if it's any help, here's a screenshot of the type annotations that my IDE, CLion, automatically adds to each line:

The lock() function returns LockResult<MutexGuard<T>>. The documentation says the following:
Note that the Err variant also carries the associated guard, and it can be acquired through the into_inner method
so you're essentially trying to return a reference to a local variable (wrapped into PoisonError struct), which is obviously incorrect.
How to fix it? You can convert this error to something with no such references, for example to String:
let guarded_big_number_option = mutex_guard_result.map_err(|e| e.to_string())?;

Why does this documentation example fail? Is my workaround an acceptable equivalent?

While exploring the documented example raised in this perl6 question that was asked here recently, I found that the final implementation option - (my interpretation of the example is that it provides three different ways to do something) - doesn't work. Running this;
class HTTP::Header does Associative {
has %!fields handles <iterator list kv keys values>;
sub normalize-key ($key) { $key.subst(/\w+/, *.tc, :g) }
method EXISTS-KEY ($key) { %!fields{normalize-key $key}:exists }
method DELETE-KEY ($key) { %!fields{normalize-key $key}:delete }
method push (*@_) { %!fields.push: @_ }
multi method AT-KEY (::?CLASS:D: $key) is rw {
my $element := %!fields{normalize-key $key};
Proxy.new(
FETCH => method () { $element },
STORE => method ($value) {
$element = do given $value».split(/',' \s+/).flat {
when 1 { .[0] } # a single value is stored as a string
default { .Array } # multiple values are stored as an array
}
}
);
}
}
my $header = HTTP::Header.new;
say $header.WHAT; #-> (Header)
$header<Accept> = "text/plain";
$header{'Accept-' X~ <Charset Encoding Language>} = <utf-8 gzip en>;
$header.push('Accept-Language' => "fr"); # like .push on a Hash
say $header<Accept-Language>.perl; #-> $["en", "fr"]
... produces the expected output. Note that the third last line with the X meta-operator assigns a literal list (built with angle brackets) to a hash slice (given a flexible definition of "hash"). My understanding is this results in three separate calls to method AT-KEY each with a single string argument (apart from self) and therefore does not exercise the default clause of the given statement. Is that correct?
When I invent a use case that exercises that part of the code, it appears to fail;
... as above ...
$header<Accept> = "text/plain";
$header{'Accept-' X~ <Charset Encoding Language>} = <utf-8 gzip en>;
$header{'Accept-Language'} = "en, fr, cz";
say $header<Accept-Language>.perl; #-> ["en", "fr", "cz"] ??
# outputs
(Header)
This Seq has already been iterated, and its values consumed
(you might solve this by adding .cache on usages of the Seq, or
by assigning the Seq into an array)
in block at ./hhorig.pl line 20
in method <anon> at ./hhorig.pl line 18
in block <unit> at ./hhorig.pl line 32
The error message provides an awesome explanation - the topic is a sequence produced by the split and is now spent and hence can't be referenced in the when and/or default clauses.
Have I correctly "lifted" and implemented the example? Is my invented use case of several language codes in the one string wrong or is the example code wrong/out-of-date? I say out-of-date as my recollection is that Seq's came along pretty late in the perl6 development process - so perhaps, this code used to work but doesn't now. Can anyone clarify/confirm?
Finally, taking the error message into account, the following code appears to solve the problem;
... as above ...
STORE => method ($value) {
my @values = $value».split(/',' \s+/) ;
$element = do given @values.flat {
when 1 { $value } # a single value is stored as a string
default { @values } # multiple values are stored as an array
}
}
... but is it an exact equivalent?

That code works now (Rakudo 2018.04) and prints
$["en", "fr", "cz"]
as intended. It was probably a bug which was eventually solved.

Queuing system for actionscript

Is there an actionscript library providing a queuing system?
This system would have to allow me to pass the object, the function I want to invoke on it and the arguments, something like:
Queue.push(Object, function_to_invoke, array_of_arguments)
Alternatively, is it possible to (de-)serialize a function call? How would I evaluate the 'function_to_invoke' with the given arguments?
Thanks in advance for your help.

There's no specific queue or stack type data structure available in ActionScript 3.0 but you may be able to find a library (CasaLib perhaps) that provides something along those lines.
The following snippet should work for you but you should be aware that since it references the function name by string, you won't get any helpful compiler errors if the reference is incorrect.
The example makes use of the rest parameter which allows you to specify an array of arbitrary length as the arguments for your method.
function test(... args):void
{
trace(args);
}
var queue:Array = [];
queue.push({target: this, func: "test", args: [1, 2, "hello world"] });
queue.push({target: this, func: "test", args: ["apple", "pear", "hello world"] });
for (var i:int = 0; i < queue.length; i ++)
{
var queued:Object = queue[i];
queued.target[queued.func].apply(null, queued.args);
}

Sure, that works similarly to JavaScript:
const name:String = 'addChild'
, container:Sprite = new Sprite()
, method:Function = container.hasOwnProperty(name) ? container[name] : null
, child:Sprite = new Sprite();
if (method)
method.apply(this, [child]);
So a query method could look like:
function queryFor(name:String, scope:*, args:Array = null):void
{
const method:Function = scope && name && scope.hasOwnProperty(name) ? scope[name] : null
if (method)
method.apply(this, args);
}
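Since the lookup-and-apply pattern is described as similar to JavaScript, here is a minimal sketch of the whole queue in JavaScript (not ActionScript). It stores the function reference itself rather than its name, so a misspelled method shows up where the call is enqueued instead of when the queue is drained. The names enqueue, drain and counter are made up for this illustration.

```javascript
// Pending calls: each entry keeps the receiver, the function, and its arguments.
const queue = [];

function enqueue(target, func, args = []) {
  queue.push({ target, func, args });
}

// Invokes the queued calls in FIFO order and returns their results.
function drain() {
  const results = [];
  while (queue.length > 0) {
    const { target, func, args } = queue.shift();
    results.push(func.apply(target, args));
  }
  return results;
}

const counter = {
  total: 0,
  add(a, b) {
    this.total += a + b;
    return this.total;
  },
};

enqueue(counter, counter.add, [1, 2]);
enqueue(counter, counter.add, [10, 20]);
console.log(drain()); // [ 3, 33 ]
```

Passing the target as the first argument of apply keeps this bound to the intended object for each queued call.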
