Java Dynamic Code Generation with support for generics - java-compiler-api

Is there any tool which provides Java dynamic code generation and that also supports generics?
Javassist, for example, is the kind of tool that I need, but it does not support generics.
I wrote a small library that uses the Java 6 Compiler API; however, as far as I know, it depends on the JDK. Is there a way to specify another compiler? Or to ship with my application only the parts that I need in order to invoke the Java Compiler API?

It seems you can manipulate and read generic info with Javassist. See
http://www.mail-archive.com/jboss-user@lists.jboss.org/msg101222.html
[jboss-user] [Javassist user questions] - Re: Altering Generics Information of Methods using Javassist
SimonRinguette
Thu, 20 Dec 2007 12:22:14 -0800
I have done further reading on how this is implemented by the compiler and
finally found out the answer I was looking for.
You can definitely do that with Javassist. The key class is
javassist.bytecode.SignatureAttribute.
From a CtMethod, I obtained the MethodInfo and added a Signature attribute to it. You can do it with something like:
CtMethod method = ....
MethodInfo methodInfo = method.getMethodInfo();
SignatureAttribute signatureAttribute = new SignatureAttribute(
    methodInfo.getConstPool(),
    "()Ljava/util/List<Ljava/lang/String;>;");
methodInfo.addAttribute(signatureAttribute);
If you're more interested in reading the signature with the generics inside, you
can use methodInfo.getAttribute(SignatureAttribute.tag).
I hope this helped.
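For orientation, the string passed to SignatureAttribute above is the JVM generic signature of a no-argument method returning List<String>. The following is only a sketch using core reflection (not Javassist) to show the same information being read back from a method that javac compiled; the class and method names are made up for illustration.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Type;
import java.util.Arrays;
import java.util.List;

class GenericSignatureDemo {
    // Hypothetical method. javac stores its generic signature,
    // ()Ljava/util/List<Ljava/lang/String;>;, in the class file.
    public static List<String> names() {
        return Arrays.asList("a", "b");
    }

    // Returns the generic return type of names() as reflection reports it.
    static Type genericReturnType() {
        try {
            Method m = GenericSignatureDemo.class.getMethod("names");
            return m.getGenericReturnType();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(genericReturnType().getTypeName());
    }
}
```

A method whose class file lacks the Signature attribute would report only the erased type (java.util.List) here, which is why adding the attribute matters for generated code.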

If you are comfortable with writing bytecode then ASM is quite a good library for that kind of thing. That will let you generate a class file on the fly without having to worry about the nitty-gritty of the classfile format. You can then use a classloader to dynamically load it into your application.

If I recall correctly, it is sufficient to have tools.jar in the classpath in order to use the Java compiler at runtime.
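As a rough sketch of what using the compiler at runtime looks like, the code below asks javax.tools.ToolProvider for the system compiler, compiles a small generic class into a temporary directory, and loads it. The class name Generated, the source text, and the temp-directory layout are assumptions for illustration. ToolProvider.getSystemJavaCompiler() returns null when no compiler is available, which is the JDK dependency the question mentions.

```java
import java.io.IOException;
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

class RuntimeCompileDemo {
    // Hypothetical generated source; note the generic return type.
    static final String SOURCE =
        "import java.util.*;\n"
        + "public class Generated {\n"
        + "    public List<String> names() { return Arrays.asList(\"a\", \"b\"); }\n"
        + "}\n";

    // Compiles SOURCE and returns the generic return type of names() as text,
    // or null when no system compiler is available (for example on a plain JRE).
    static String compileAndInspect() {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            return null;
        }
        try {
            Path dir = Files.createTempDirectory("gen");
            Path file = dir.resolve("Generated.java");
            Files.write(file, SOURCE.getBytes(StandardCharsets.UTF_8));
            // A zero status means the compilation succeeded.
            int status = compiler.run(null, null, null, "-d", dir.toString(), file.toString());
            if (status != 0) {
                throw new IllegalStateException("compilation failed");
            }
            try (URLClassLoader loader = new URLClassLoader(new URL[] {dir.toUri().toURL()})) {
                Class<?> cls = loader.loadClass("Generated");
                Method m = cls.getMethod("names");
                return m.getGenericReturnType().getTypeName();
            }
        } catch (IOException | ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(compileAndInspect());
    }
}
```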

Actually, Javassist can handle generics using SignatureAttribute.
SignatureAttribute.Type retType = new SignatureAttribute.BaseType(Void.TYPE.getName());
SignatureAttribute.Type[] argType = getArgType();
SignatureAttribute.MethodSignature signature = new SignatureAttribute.MethodSignature(null, argType, retType, null);
method.setGenericSignature(signature.encode());
This project has a lot of very good examples. Hope they are helpful.

Related

How can I get the public elements from a Rust module?

In Node.js, I could get an array of the objects in foo with
Object.keys(require("foo"));
Is there any way I could do the same thing in Rust?
mod foo;
getobjs(foo);
No, there is no way to do this. This level of introspection of compile-time information simply doesn't exist at runtime. The concept of a module doesn't even exist.
If you are interested in compile-time information, you can do such things as build and view the docs (cargo doc --open) to see all the public items of the entire crate. You can probably also view the crate's documentation online before you use it.
There are also tools like the Rust Language Server which provide this type of information (and more) to editors and IDEs.

How to keep around a piece of context built at compile time for later use at run time?

I'm aware this might be a broad question (there's no specific code for you to look at), but I'm hoping I'd get some insights as to what to do, or how to approach the problem.
To keep things simple, suppose the compiler that I'm writing performs these three steps:
parse (and bind all variables)
typecheck
codegen
Also, the language that I'm building the compiler for wants to support late analysis/late binding (i.e., it has a function that takes a String, which is to be compiled and executed as a piece of source code at run time).
Now during parse-phase, I have a piece of context that I need to keep around till run-time for the sole benefit of the aforementioned function (because it needs to parse and typecheck its argument in that context).
So the question, how should I do this? What do other compilers do?
Should I just serialise the context object to disk (codegen for it) and resurrect it during run-time or something?
Thanks
Yes, you'll need to emit the type information (or other context, you weren't very specific) in your object/executable files, so that your eval can read it at runtime. You might look at Java's .class file format for inspiration; Java doesn't have eval as such, but you can dynamically spin new bytecode at runtime that must be linked in a type-safe manner. David Conrad's comment is spot-on: this information can also be used to implement reflection, if your language has such a feature.
That's as much as I can help you without more specifics.
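To make the "emit the context, read it back at run time" idea concrete, here is a minimal sketch using plain Java serialization. It assumes the context is just a map from variable names to type names, which is an invented simplification; a real compiler would define its own format and embed the bytes in its object or executable file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.LinkedHashMap;
import java.util.Map;

class ContextRoundTrip {
    // "Compile time": serialize a hypothetical symbol table (variable name -> type name)
    // into bytes that the compiler could embed in its output.
    static byte[] emit(Map<String, String> symbolTypes) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            LinkedHashMap<String, String> copy = new LinkedHashMap<>(symbolTypes);
            out.writeObject(copy);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    // "Run time": rebuild the table so an eval-style function can
    // parse and type-check its argument against it.
    @SuppressWarnings("unchecked")
    static Map<String, String> restore(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Map<String, String>) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> context = new LinkedHashMap<>();
        context.put("x", "Int");
        context.put("name", "String");
        Map<String, String> atRuntime = restore(emit(context));
        System.out.println(atRuntime);
    }
}
```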

How does Groovy work in a dynamic way?

I have built a small compiler for a statically typed language. After understanding how a static language works, I'm having trouble getting my head around dynamic languages like Groovy.
While constructing my compiler, I learned that once I generate the machine-level code there is no way of changing it (i.e., at run time).
But how does Groovy do this magical stuff like inferring type in statements like:
def a = "string"
a.size()
As far as I can tell, Groovy has to determine that a is a String before running the line a.size(). It seems to do so at compile time (while constructing the AST)! But the language is called dynamic.
I'm confused; please help me figure this out.
Thanks.
Groovy doesn't simply "call" a method, but dispatches it through the meta-object protocol. The method invocation is sent as a message to the object, which can respond to it or not. With dynamic typing, the object's type doesn't matter, only whether it responds to that message. This is called duck typing.
You can see it (though not easily) when you decompile Groovy code. You can compile using groovyc and decompile using another tool. I recommend jd-gui. You won't see the method being called explicitly, because of Groovy's method caching (it is done this way to achieve Groovy's neat performance).
For a simple script like:
def a = "abcdefg"
println a.substring(2)
This will be the generated code:
CallSite[] arrayOfCallSite = $getCallSiteArray();
Object a = "abcdefg";
return arrayOfCallSite[1].callCurrent(
    this, arrayOfCallSite[2].call(a, Integer.valueOf(2)));
return null;
And the method call is "dispatched" to the object, not called directly. This is a similar concept to Smalltalk's and Ruby's method dispatch. It is because of that mechanism that you can intercept method calls and property access on Groovy objects.
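The idea of sending a message by name rather than calling a statically bound method can be approximated in plain Java with reflection. This is only a sketch of the concept, not how Groovy's call sites or meta-object protocol are implemented; the send helper and its matching rule (name plus argument count) are assumptions for illustration.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

class DynamicDispatchDemo {
    // Sends the "message" (method name plus arguments) to the receiver.
    // The target method is looked up at run time, so only the receiver's
    // ability to respond matters, not its declared type.
    static Object send(Object receiver, String name, Object... args) {
        for (Method m : receiver.getClass().getMethods()) {
            if (!m.getName().equals(name) || m.getParameterCount() != args.length) {
                continue;
            }
            try {
                return m.invoke(receiver, args);
            } catch (IllegalArgumentException e) {
                // Argument types did not fit this overload; try the next one.
            } catch (IllegalAccessException | InvocationTargetException e) {
                throw new IllegalStateException(e);
            }
        }
        throw new UnsupportedOperationException("receiver does not respond to " + name);
    }

    public static void main(String[] args) {
        Object a = "abcdefg";
        System.out.println(send(a, "substring", 2));
        System.out.println(send(a, "length"));
    }
}
```

Because the lookup happens on every call, a real implementation would cache the resolved method per call site, which is the role the CallSite array plays in the decompiled code.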
Since Groovy 2, Groovy code can be statically compiled, thus acting like your compiler.

Why do some Java methods in core libraries end with numbers?

It's common in a lot of classes in the JDK; just a few examples:
java.util.Properties
load0
store0
java.lang.Thread
start0
stop0
setPriority0
Usually they are private native methods (as in the Thread class), but sometimes they are just private (as in the Properties class).
I'm just curious whether anybody knows if there is any history behind that.
I believe they are named like that because functions with the same names already exist in the class; the 0 suffix distinguishes the native helper functions from the public ones.
In java.util.Properties, load and store exist alongside load0 and store0.
The 0 after the method name distinguishes public and private methods that share the same name.
The start method calls the start0 method.
Methods that end with 0 are private.
Those that do not end with a number are public.
You can check this in any of the libraries.
The use of zero suffixes on method names is just a convention to deal with cases where you have a public API method and a corresponding private method. In the Java SE libraries, this is commonly used for the native methods that provide the underlying functionality implemented by the classes. (You can see what is going on by looking at the OpenJDK source code.)
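The convention can be sketched as follows. The Worker class is hypothetical; in the JDK the 0-suffixed method is often declared native, whereas here it is an ordinary private method so the example runs on its own.

```java
class Worker {
    private boolean started;
    private final StringBuilder log = new StringBuilder();

    // Public API method: validates state, then delegates to the private helper.
    public synchronized void start() {
        if (started) {
            throw new IllegalStateException("already started");
        }
        started = true;
        start0();
    }

    // Private counterpart with the 0 suffix; it does the underlying work.
    private void start0() {
        log.append("started");
    }

    public String log() {
        return log.toString();
    }

    public static void main(String[] args) {
        Worker w = new Worker();
        w.start();
        System.out.println(w.log());
    }
}
```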
But your questions are:
Why do some Java methods in core libraries end with numbers?
Because someone thought it would be a good idea. It is not strictly necessary, since they typically could have overloaded the public methods instead. And since the zero-suffixed methods are private, their naming should not be relevant beyond the class and its native implementation.
I'm just curious whether anybody knows if there is any history behind that.
There is no mention of this convention in the original Java Style Guide. In fact, I think it predates Java. I vaguely recall seeing it in C libraries in 4.x BSD Unix. That was the mid-1980s. And I wouldn't be surprised if they adopted it from somewhere else.

How do you implement C#4's IDynamicObject interface?

To implement "method-missing"-semantics and such in C# 4.0, you have to implement IDynamicObject:
public interface IDynamicObject
{
    MetaObject GetMetaObject(Expression parameter);
}
As far as I can figure out, IDynamicObject is actually part of the DLR, so it is not new. But I have not been able to find much documentation on it.
There are some very simple example implementations out there (e.g. here and here), but could anyone point me to more complete implementations or some real documentation?
Especially, how exactly are you supposed to handle the "parameter"-parameter?
The short answer is that the MetaObject is what's responsible for actually generating the code that will be run at the call site. The mechanism that it uses for this is LINQ expression trees, which have been enhanced in the DLR. So instead of starting with an object, it starts with an expression that represents the object, and ultimately it's going to need to return an expression tree that describes the action to be taken.
When playing with this, please remember that the version of System.Core in the CTP was taken from a snapshot at the end of August. It doesn't correspond very cleanly to any particular beta of IronPython. A number of changes have been made to the DLR since then.
Also, for compatibility with the CLR v2 System.Core, releases of IronPython starting with either beta 4 or beta 5 now rename everything that's in the System namespace to be in the Microsoft namespace instead.
If you want an end-to-end sample including source code, resulting in a dynamic object that stores values for arbitrary properties in a Dictionary, then my post "A first look at Duck Typing in C# 4.0" could be right for you. I wrote that post to show how a dynamic object can be cast to statically typed interfaces. It has a complete working implementation of a Duck that is an IDynamicObject and can act like an IQuack.
If you need more information, contact me on my blog and I will help you along as best I can.
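For readers coming from the JVM side, a loose analogue of the Dictionary-backed duck described above can be sketched with java.lang.reflect.Proxy. This is not the DLR or IDynamicObject API; it is a swapped-in Java technique that shows the same idea of one handler answering arbitrary property calls behind a statically typed interface. The Quack interface and newDuck helper are invented for the example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

class DuckProxyDemo {
    // Hypothetical statically typed interface the dynamic object should act as.
    interface Quack {
        String getName();
        void setName(String name);
    }

    // Creates an object whose property values live in a map and whose
    // getter and setter calls are resolved by name when they are invoked.
    static Quack newDuck() {
        final Map<String, Object> properties = new HashMap<>();
        InvocationHandler handler = (proxy, method, args) -> {
            String name = method.getName();
            if (name.startsWith("set") && args != null && args.length == 1) {
                properties.put(name.substring(3), args[0]);
                return null;
            }
            // For methods without parameters the args array is null.
            if (name.startsWith("get") && args == null) {
                return properties.get(name.substring(3));
            }
            throw new UnsupportedOperationException(name);
        };
        return (Quack) Proxy.newProxyInstance(
            Quack.class.getClassLoader(), new Class<?>[] {Quack.class}, handler);
    }

    public static void main(String[] args) {
        Quack duck = newDuck();
        duck.setName("Donald");
        System.out.println(duck.getName());
    }
}
```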
I just blogged about how to do this here:
http://mikehadlow.blogspot.com/2008/10/dynamic-dispatch-in-c-40.html
Here is what I have figured out so far:
The Dynamic Language Runtime is currently maintained as part of the IronPython project. So that is the best place to go for information.
The easiest way to implement a class supporting IDynamicObject seems to be to derive from Microsoft.Scripting.Actions.Dynamic and override the relevant methods, for instance the Call-method to implement function call semantics. It looks like Microsoft.Scripting.Actions.Dynamic hasn't been included in the CTP, but the one from IronPython 2.0 looks like it will work.
I am still unclear on the exact meaning of the "parameter"-parameter, but it seems to provide context for the binding of the dynamic-object.
This presentation also provides a lot of information about the DLR:
Deep Dive: Dynamic Languages in Microsoft .NET by Jim Hugunin.