Stringtemplate template arguments always evaluated to a string - antlr

I'm using Antlr 4 RC1 (the complete jar) to parse a grammar, build a custom ast, and generate code from that AST with stringtemplate4 (I use the stringtemplate classes in the antlr jar).
Inside a template I call another template with a list of beans e.g.
<subtemplate(myListArg=parm.listOfBeans)>
then inside subtemplate I get a list of Strings (each string is one of the beans evaluated to a String). But what I need is the list of java beans (e.g. simple java object with properties) because I want to process the properties of the beans and not the value of the beans, e.g.
<subtemplate(myListArg)> ::= <<
<myListArg: {x | {... <x.someProperty> ...}>
>>
Looks to me like parameters of a template are always evaluated to strings. Is that the intended behaviour? If yes what should else should I write?

StringTemplate 4 does not render the members of an array or List to strings while calling a subtemplate, as verified by the following. Edit: Despite claims to the contrary in the comments on this post, I reiterate that ST4 does not render the members of an array or List to strings while calling a subtemplate. The element types stored in the array or list make no difference.
start(class) ::= <<
<fields(class.fields)>
>>
fields(fieldsArray) ::= <<
<fieldsArray:{field | <field.name>}; separator="\n">
>>
If you create an instance of the start template and pass Integer.class for the class parameter, you get the following output:
MIN_VALUE
MAX_VALUE
TYPE
SIZE
One of the following must be occurring in your application:
myListArg is not actually a list of strings (i.e. you are getting unexpected output, but not for the reason listed here)
You have a ModelAdaptor registered for the type of parm which is returning a list of strings for the property listOfBeans
The type of parm has a getListOfBeans method which returns a list of strings
Item (3) does not hold, and the type of parm has a listOfBeans field which is a list of strings
Edit: Regarding the question of lists vs. arrays, I performed the unit test above with passing new Clazz(Integer.class) instead of just Integer.class to the start template:
private static class Clazz {
private final Class<?> clazz;
public Clazz(Class<?> clazz) {
this.clazz = clazz;
}
public List<Field> getFields() {
return Arrays.asList(clazz.getFields());
}
}

Related

Using Kotlin Symbol Processing API to compile String expression

I have a requirement where I need to make operation on Strings. This operation is expensive at runtime, and need to be made only on static expression. Therefore I want to do it at compile time.
Since the operation need to be made a lot of time, in many different modules, having to copy paste, doing manually the conversion, to past the result in the source code is long and slow.
I would like to add an annotation on the Strings which is required to convert, and have a file generated with all the results.
An example could be
val original = #Convert "Hello world"
functionRequiringTheConvertedStringValue(DATA.get(original))
original being equal to "Hello world" and the Data.get function giving the result.
With the DATA class being a static file containing all the string converted and able to fetch them.
Therefore I want to use an annotation processing, but when reading the examples, and the examples, I can't find a way to achieve the logic I want to implement.
I created a SymbolProcessor, but I am not able to get any expression, I can only get functions (KSFunctionDeclarationImpl) and classes (KSClassDeclarationImpl).
...
override fun process(resolver: Resolver): List<KSAnnotated> {
val symbols = resolver.getSymbolsWithAnnotation("com.example.annotation.Convert")
symbols.forEach {
logger.warn("class: ${it.javaClass.name}")
}
val ret = symbols.filter { !it.validate() }.toList()
return ret
}
When I use the annotation on an expression, I do not get it using the getSymbolsWithAnnotation.
My question being, how to get the expression and their value which are annotated by the annotation I created.
Edit:
Seems to be an open issued : https://github.com/google/ksp/issues/1194

How to convert dynamically input user expression to java code?

I read the byte buddy and javassist doc and I would like do not know if is possible to convert a string like:
get foos where name == toto
to
data.getFoos().stream()
.filter( f -> f.name.equals( "toto" ) )
.collect( Collectors.toSet() )
A regex could capture the expression as:
final Pattern query = Pattern.compile("get (\\w+) where (\\w+) ([=!]+) (\\w+)");
final Scanner scanner = new Scanner(System.in);
final Matcher matcher = query.matcher(input);
matcher.group(1) // foos -> Foo and foos -> getFoos()
matcher.group(2) // field to use as filter
matcher.group(3) // symbol == / !=
matcher.group(4) // thing to match
convert get foos to getFoos()
check from Foo class if name field exists
if fields name is not an instance of Number.class translate == to .equals
make the expression
loop and print results
I read some examples without able to found a such thing. So I come here to get your light. thanks
Both Byte Buddy and Javassist generate byte code, not Java code. Javassist does however have functionality to translate String-contained source code to byte code from your inputs. However, the source code level is at Java 4 level such that you cannot use lambdas.
I do however wonder if this is the correct approach for your problem. Rather, I would suggest you to programmatically resolve a stream from the arguments. You could amplify this by creating a custom aPI to transform your arguments into the stream in question.

Serialization of ANTLR ParseTree

I have a generated grammar that does two things:
Check the syntax of a domain specific language
Evaluate input against that domain specific language
These two functions are separate, lets call them validate() and evaluate().
The validate() function builds the tree from a String input while ensuring it meets the requirements of the BNF for the language. The evaluate() function plugs in values to that tree to get a result (usually true or false).
What the code is currently doing is running validate() each time on the input, just to generate the tree that evaluate() uses. Some of the inputs take up to 60 seconds to be checked. What I would LIKE to do is serialize the results of validate() (assuming it meets the syntax requirements), store the serialized form in the backend database, and just load it from the database as part of evaluate().
I noticed that I can execute the method toStringTree() on the parse tree, and retrieve a LISP style tree. However, can I restore a LISP style tree to an ANTLR parse tree? If not, can anyone recommend another way to serialize and store the generated parse tree?
Thanks for any help.
Jason
ANTLR 4's ParseRuleContext data structure (the specific implementation of ParseTree used by generated parsers to represent grammar rules in the parse tree) is not serializable by default. Open issue #233 on the project issue tracker covers the feature request. However, based on my experience with many applications using ANTLR for parsing, I'm not convinced serializing the parse trees would be useful in the long run. For each problem serializing the parse tree is meant to address, a better solution already exists.
Another option is to store a hash of the last known valid file in the database. After you use the parser to create a parse tree, you could skip the validation step if the input file has the same hash as the last time it was validated. This leverages two aspects of ANTLR 4:
For the same input file, running the parser twice will produce the same parse tree.
The ANTLR 4 parser is extremely fast in almost all cases (e.g. the Java grammar can process around 20MB of source per second). The remaining cases tend to be caused by poorly structured grammar rules that the new parser interpreter feature in ANTLRWorks 2.2 can analyze and make suggestions for improvement.
If you need performance beyond what you get with this, then a parse tree isn't the data structure you should be using. StringTemplate 4's enormous performance advantage over StringTemplate 3 came primarily from the fact that the interpreter switched from using ASTs (equivalent to parse trees for this reasoning) to a linear bytecode representation/interpreter. The ASTs for ST4 would never need to be serialized for performance reasons because the bytecode would be serialized instead. In fact, the C# port of StringTemplate 4 provides exactly this feature.
If the input data to your grammar is made of several independent blocks, you could try to store the string of each block separately, and run the parsing process again for each block independently, using a ThreadPool for example.
Say for example your input data is a set of method declarations:
int add(int a, int b) {
return a+b;
}
int mul(int a, int b) {
return a*b;
}
...
and the grammar is something like:
methodList : methodDeclaration methodList
|
;
methodDeclaration : // your method declaration rules...
The first run of the parser just collects each method text and store it. The parser starts the process at the methodList rule.
void visitMethodList(MethodListContext ctx) {
if(ctx.methodDeclaration() != null) {
String methodStr = formatParseTree(ctx.methodDeclaration(), " ");
// store methodStr for later parsing
}
// visit next method list item, if any
if(ctx.methodList() != null) {
visit(ctx.methodList());
}
}
The second run launch the parsing of each method declaration (in a separate thread for example). For this, the parser starts at the methodDeclaration rule.
void visitMethodDeclaration(MethodDeclarationContext ctx) {
// parse the method block
}
The reason why the text of a methodDeclaration rule is formatted if because calling directly ctx.methodDeclaration().getText() would combine the text of all child nodes AntLR doc, possibly making it unusable for parsing again. If white space is a token separator in the grammar, then adding one space between tokens should not change the parse tree.
String formatParseTree(ParseTree tree, String separator) {
StringBuilder builder = new StringBuilder();
for(int i = 0; i < tree.getChildCount(); i ++) {
ParseTree child = tree.getChild(i);
if(child instanceof TerminalNode) {
builder.append(child.getText());
builder.append(separator);
} else if(child instanceof RuleContext) {
builder.append(formatParseTree(child, separator));
}
}
return builder.toString();
}

Why does Javassist insist on looking for a default annotation value when one is explicitly specified?

I am using Javassist to add and modify annotations on a package-info "class".
In some cases, I need to deal with the following edge case. Someone has (incorrectly) specified an #XmlJavaTypeAdapters annotation on the package-info package, but has not supplied a value attribute (which is defined as being required). So it looks like this:
#XmlJavaTypeAdapters // XXX incorrect; value() is required, but javac has no problem
package com.foobar;
import javax.xml.bind.annotation.adapters.XmlJavaTypeAdapters;
In Javassist, this comes through slightly oddly.
The javassist.bytecode.annotation.Annotation representing the #XmlJavaTypeAdapters annotation does not have a member value (getMemberValue("value") returns null), as expected.
It is of course possible to add a value() member value, and that is what I've done:
if (adaptersAnnotation.getMemberValue("value") == null) {
final ArrayMemberValue amv = new ArrayMemberValue(new AnnotationMemberValue(constantPool), constantPool);
adaptersAnnotation.addMemberValue("value", amv);
annotationsAttribute.addAnnotation(adaptersAnnotation);
}
In the code snippet above, I've created a new member value to hold an array of annotations, because the value() attribute of #XmlJavaTypeAdapters is an array of #XmlJavaTypeAdapter. I've specified its array type by trying to divine the Zen-like documentation's intent—it seems that if you supply another MemberValue that this MemberValue will somehow serve as the array's type. In my case I want the type of the array to be #XmlJavaTypeAdapter, which is an annotation, so the only kind of MemberValue that seemed appropriate was AnnotationMemberValue. So I've created an empty one of those and set it as the array type.
This works fine as far as it goes, as long as you stay "within" Javassist.
However, something seems to have gone wrong. If I ask Javassist to convert all of its proprietary annotations into genuine Java java.lang.annotation.Annotations, then when I try to access the value() attribute of this #XmlJavaTypeAdapters annotation, Javassist tells me that there is no default value. Huh?
In other words, that's fine—indeed there is not—but I have specified what I had hoped was a zero-length array (that is, the default value shouldn't be used; my explicitly specified zero-length array should be used instead):
final List<Object> annotations = java.util.Arrays.asList(packageInfoClass.getAnnotations());
for (final Object a : annotations) {
System.out.println("*** class annotation: " + a); // OK; one of these is #XmlJavaTypeAdapters
System.out.println(" ...of type: " + a.getClass()); // OK; resolves to XmlJavaTypeAdapters
System.out.println(" ...assignable to java.lang.annotation.Annotation? " + java.lang.annotation.Annotation.class.isInstance(a)); // OK; returns true
if (a instanceof XmlJavaTypeAdapters) {
final XmlJavaTypeAdapters x = (XmlJavaTypeAdapters)a;
System.out.println(" ...value: " + java.util.Arrays.asList(x.value())); // XXX x.value() throws an exception
}
}
So why is Javassist looking for a default value in this case?
My larger issue is of course to handle this (unfortunately somewhat common) case where #XmlJavaTypeAdapters is specified with no further information on it. I need to add a value member value that can hold an array of #XmlJavaTypeAdapter annotations. I can't seem to figure out how to accomplish this with Javassist. As always, all help appreciated.
For posterity, it appears that in this particular case (to avoid a NullPointerException and/or a RuntimeException), you need to do this:
if (adaptersAnnotation.getMemberValue("value") == null) {
final ArrayMemberValue amv = new ArrayMemberValue(constantPool);
amv.setValue(new AnnotationMemberValue[0]);
adaptersAnnotation.addMemberValue("value", amv);
annotationsAttribute.addAnnotation(adaptersAnnotation);
}
Note in particular that I deliberately omit the array type when building the ArrayMemberValue (including one of any kind will result in an exception). Then I explicitly set its value to an empty array of type AnnotationMemberValue. Any other combination here will result in an exception.
Additionally, and very oddly, the last line in that if block is critical. Even though in this particular case the annotation itself was found, and so hence was already present in the AnnotationsAttribute, you must re-add it. If you do not, you will get a RuntimeException complaining about the lack of a default value.
I hope this helps other Javassist hackers.

Interpreting a variable number of tree nodes in ANTLR Tree Grammar

Whilst creating an inline ANTLR Tree Grammar interpreter I have come across an issue regarding the multiplicity of procedure call arguments.
Consider the following (faulty) tree grammar definition.
procedureCallStatement
: ^(PROCEDURECALL procedureName=NAME arguments=expression*)
{
if(procedureName.equals("foo")) {
callFooMethod(arguments[0], arguments[1]);
}elseif(procedureName.equals("bar")) {
callBarMethod(arguments[0], arguments[1], arguments[2]);
}
}
;
My problem lies with the retrieval of the given arguments. If there would be a known quantity of expressions I would just assign the values coming out of these expressions to their own variable, e.g.:
procedureCallStatement
: ^(PROCEDURECALL procedureName=NAME argument1=expression argument2=expression)
{
...
}
;
This however is not the case.
Given a case like this, what is the recommendation on interpreting a variable number of tree nodes inline within the ANTLR Tree Grammar?
Use the += operator. To handle any number of arguments, including zero:
procedureCallStatement
: ^(PROCEDURECALL procedureName=NAME argument+=expression*)
{
...
}
;
See the tree construction documentation on the antlr website.
The above will change the type of the variable argument from typeof(expression) to a List (well, at least when you're generating Java code). Note that the list types are untyped, so it's just a plain list.
If you use multiple parameters with the same variable name, they will also create a list, for example:
twoParameterCall
: ^(PROCEDURECALL procedureName=NAME argument=expression argument=expression)
{
...
}
;