When is invokedynamic actually useful (besides lazy constants)? - dynamic

TL;DR
Please provide a piece of code written in some well known dynamic language (e.g. JavaScript) and how that code would look like in Java bytecode using invokedynamic and explain why the usage of invokedynamic is a step forward here.
Background
I have googled and read quite a lot about the not-that-new-anymore invokedynamic instruction which everyone on the internet agrees on that it will help speed dynamic languages on the JVM. Thanks to stackoverflow I managed to get my own bytecode instructions with Sable/Jasmin to run.
I have understood that invokedynamic is useful for lazy constants and I also think that I understood how the OpenJDK takes advantage of invokedynamic for lambdas.
Oracle has a small example, but as far as I can tell the usage of invokedynamic in this case defeats the purpose as the example for "adder" could much simpler, faster and with roughly the same effect expressed with the following bytecode:
aload whereeverAIs
checkcast java/lang/Integer
aload whereeverBIs
checkcast java/lang/Integer
invokestatic IntegerOps/adder(Ljava/lang/Integer;Ljava/lang/Integer;)Ljava/lang/Integer;
because for some reason Oracle's bootstrap method knows that both arguments are integers anyway. They even "admit" that:
[..]it assumes that the arguments [..] will be Integer objects. A bootstrap method requires additional code to properly link invokedynamic [..] if the parameters of the bootstrap method (in this example, callerClass, dynMethodName, and dynMethodType) vary.
Well yes, and without that interesing "additional code" there is no point in using invokedynamic here, is there?
So after that and a couple of further Javadoc and Blog entries I think that I have a pretty good grasp on how to use invokedynamic as a poor replacement when invokestatic/invokevirtual/invokevirtual or getfield would work just as well.
Now I am curious how to actually apply the invokedynamic instruction to a real world usecase so that it actually is some improvements over what we could with "traditional" invocations (except lazy constants, I got those...).

Actually, lazy operations are the main advantage of invokedynamic if you take the term “lazy creation” broadly. E.g., the lambda creation feature of Java 8 is a kind of lazy creation that includes the possibility that the actual class containing the code that will be finally invoked by the invokedynamic instruction doesn’t even exist prior to the execution of that instruction.
This can be projected to all kind of scripting languages delivering code in a different form than Java bytecode (might be even in source code). Here, the code may be compiled right before the first invocation of a method and remains linked afterwards. But it may even become unlinked if the scripting language supports redefinition of methods. This uses the second important feature of invokedynamic, to allow mutable CallSites which may be changed afterwards while supporting maximal performance when being invoked frequently without redefinition.
This possibility to change an invokedynamic target afterwards allows another option, linking to an interpreted execution on the first invocation, counting the number of executions and compiling the code only after exceeding a threshold (and relinking to the compiled code then).
Regarding dynamic method dispatch based on a runtime instance, it’s clear that invokedynamic can’t elide the dispatch algorithm. But if you detect at runtime that a particular call-site will always call the method of the same concrete type you may relink the CallSite to an optimized code which will do a short check if the target is that expected type and performs the optimized action then but branches to the generic code performing the full dynamic dispatch only if that test fails. The implementation may even de-optimize such a call-site if it detects that the fast path check failed a certain number of times.
This is close to how invokevirtual and invokeinterface are optimized internally in the JVM as for these it’s also the case that most of these instructions are called on the same concrete type. So with invokedynamic you can use the same technique for arbitrary lookup algorithms.
But if you want an entirely different use case, you can use invokedynamic to implement friend semantics which are not supported by the standard access modifier rules. Suppose you have a class A and B which are meant to have such a friend relationship in that A is allowed to invoke private methods of B. Then all these invocations may be encoded as invokedynamic instructions with the desired name and signature and pointing to a public bootstrap method in B which may look like this:
public static CallSite bootStrap(Lookup l, String name, MethodType type)
throws NoSuchMethodException, IllegalAccessException {
if(l.lookupClass()!=A.class || (l.lookupModes()&0xf)!=0xf)
throw new SecurityException("unprivileged caller");
l=MethodHandles.lookup();
return new ConstantCallSite(l.findStatic(B.class, name, type));
}
It first verifies that the provided Lookup object has full access to A as only A is capable of constructing such an object. So sneaky attempts of wrong callers are sorted out at this place. Then it uses a Lookup object having full access to B to complete the linkage. So, each of these invokedynamic instructions is permanently linked to the matching private method of B after the first invocation, running at the same speed as ordinary invocations afterwards.

Related

Finding the Pharo documentation for the compile & evaluate methods, etc. in the compiler class

I've got an embarrassingly simple question here. I'm a smalltalk newbie (I attempt to dabble with it every 5 years or so), and I've got Pharo 6.1 running. How do I go about finding the official standard library documentation? Especially for the compiler class? Things like the compile and evaluate methods? I don't see how to perform a search with the Help Browser, and the method comments in the compiler class are fairly terse and cryptic. I also don't see an obvious link to the standard library API documentation at: http://pharo.org/documentation. The books "Pharo by Example" and "Deep into Pharo" don't appear to cover that class either. I imagine the class is probably similar for Squeak and other smalltalks, so a link to their documentation for the compiler class could be helpful as well.
Thanks!
There are several classes that collaborate in the compilation of a method (or expression) and, given your interest in the subject, I'm tempted to stimulate you even further in their study and understanding.
Generally speaking, the main classes are the Scanner, the Parser, the Compiler and the Encoder. Depending on the dialect these may have slightly different names and implementations but the central idea remains the same.
The Scanner parses the stream of characters of the source code and produces a stream of tokens. These tokens are then parsed by the Parser, which transforms them into the nodes of the AST (Abstract Syntax Tree). Then the Compiler visits the nodes of the AST to analyze them semantically. Here all variable nodes are classified: method arguments, method temporaries, shared, block arguments, block temporaries, etc. It is during this analysis where all variables get bound in their corresponding scope. At this point the AST is no longer "abstract" as it has been annotated with binding information. Finally, the nodes are revisited to generate the literal frame and bytecodes of the compiled method.
Of course, there are lots of things I'm omitting from this summary (pragmas, block closures, etc.) but with these basic ideas in mind you should now be ready to debug a very simple example. For instance, start with
Object compile: 'm ^3'
to internalize the process.
After some stepping into and over, you will reach the first interesting piece of code which is the method OpalCompiler >> #compile. If we remove the error handling blocks this methods speaks for itself:
compile
| cm |
ast := self parse.
self doSemanticAnalysis.
self callPlugins.
cm := ast generate: self compilationContext compiledMethodTrailer
^cm
First we have the #parse message where the parse nodes are created. Then we have the semantic analysis I mentioned above and finally #generate: produces the encoding. You should debug each of these methods to understand the compilation process in depth. Given that you are dealing with a tree be prepared to navigate thru a lot of visitors.
Once you become familiar with the main ideas you may want to try more elaborated -yet simple- examples to see other objects entering the scene.
Here are some simple facts:
Evaluation in Smalltalk is available everywhere: in workspaces, in
the Transcript, in Browsers, inspectors, the debugger, etc.
Basically, if you are allowed to edit text, most likely you will
also be allowed to evaluate it.
There are 4 evaluation commands
Do it (evaluates without showing the answer)
Print it (evaluates and prints the answer next to the expression)
Inspect it (evaluates and opens an inspector on the result)
Debug it (opens a debugger so you can evaluate your expression step by step).
Your expression can contain any literal (numbers, arrays, strings, characters, etc.)
17 "valid expression"
Your expression can contain any message.
3 + 4.
'Hello world' size.
1 bitShift: 28
Your expression can use any Global variable
Object new.
Smalltalk compiler
Your expression can reference self, super, true, nil, false.
SharedRandom globalGenerator next < 0.2 ifTrue: [nil] ifFalse: [self]
Your expression can use any variables declared in the context of the pane where you are writing. For example:
If you are writing in a class browser, self will be bound to the current class
If you are writing in an inspector, self is bound to the object under inspection. You can also use its instances variables in the expression.
If you are in the debugger, your expression can reference self, the instance variables, message arguments, temporaries, etc.
Finally, if you are in a workspace (a.k.a. Playground), you can use any temporaries there, which will be automatically created and remembered, without you having to declare them.
As near as I can tell, there is no API documentation for the Pharo standard library, like you find with other programming languages. This seems to be confirmed on the Pharo User's mailing list: http://forum.world.st/Essential-Documentation-td4916861.html
...there is a draft version of the ANSI standard available: http://wiki.squeak.org/squeak/uploads/172/standard_v1_9-indexed.pdf
...but that doesn't seem to cover the compiler class.

keep around a piece of context built during compile-time for later use in runtime?

I'm aware this might be a broad question (there's no specific code for you to look at), but I'm hoping I'd get some insights as to what to do, or how to approach the problem.
To keep things simple, suppose the compiler that I'm writing performs these three steps:
parse (and bind all variables)
typecheck
codegen
Also the language that I'm building the compiler for wants to support late-analysis/late-binding (ie., it has a function that takes a String, which is to be compiled and executed as a piece of source-code during runtime).
Now during parse-phase, I have a piece of context that I need to keep around till run-time for the sole benefit of the aforementioned function (because it needs to parse and typecheck its argument in that context).
So the question, how should I do this? What do other compilers do?
Should I just serialise the context object to disk (codegen for it) and resurrect it during run-time or something?
Thanks
Yes, you'll need to emit the type information (or other context, you weren't very specific) in your object/executable files, so that your eval can read it at runtime. You might look at Java's .class file format for inspiration; Java doesn't have eval as such, but you can dynamically spin new bytecode at runtime that must be linked in a type-safe manner. David Conrad's comment is spot-on: this information can also be used to implement reflection, if your language has such a feature.
That's as much as I can help you without more specifics.

Why does the JVM have both `invokespecial` and `invokestatic` opcodes?

Both instructions use static rather than dynamic dispatch. It seems like the only substantial difference is that invokespecial will always have, as its first argument, an object that is an instance of the class that the dispatched method belongs to. However, invokespecial does not actually put the object there; the compiler is the one responsible for making that happen by emitting the appropriate sequence of stack operations before emitting invokespecial. So replacing invokespecial with invokestatic should not affect the way the runtime stack / heap gets manipulated -- though I expect that it will cause a VerifyError for violating the spec.
I'm curious about the possible reasons behind making two distinct instructions that do essentially the same thing. I took a look at the source of the OpenJDK interpreter, and it seems like invokespecial and invokestatic are handled almost identically. Does having two separate instructions help the JIT compiler better optimize code, or does it help the classfile verifier prove some safety properties more efficiently? Or is this just a quirk in the JVM's design?
Disclaimer: It is hard to tell for sure since I never read an explicit Oracle statement about this, but I pretty much think this is the reason:
When you look at Java byte code, you could ask the same question about other instructions. Why would the verifier stop you when pushing two ints on the stack and treating them as a single long right after? (Try it, it will stop you.) You could argue that by allowing this, you could express the same logic with a smaller instruction set. (To go further with this argument, a byte cannot express too many instructions, the Java byte code set should therefore cut down wherever possible.)
Of course, in theory you would not need a byte code instruction for pushing ints and longs to the stack and you are right about the fact that you would not need two instructions for INVOKESPECIAL and INVOKESTATIC in order to express method invocations. A method is uniquely identified by its method descriptor (name and raw argument types) and you could not define both a static and a non-static method with an identical description within the same class. And in order to validate the byte code, the Java compiler must check whether the target method is static nevertheless.
Remark: This contradicts the answer of v6ak. However, a methods descriptor of a non-static method is not altered to include a reference to this.getClass(). The Java runtime could therefore always infer the appropriate method binding from the method descriptor for a hypothetical INVOKESMART instruction. See JVMS §4.3.3.
So much for the theory. However, the intentions that are expressed by both invocation types are quite different. And remember that Java byte code is supposed to be used by other tools than javac to create JVM applications, as well. With byte code, these tools produce something that is more similar to machine code than your Java source code. But it is still rather high level. For example, byte code still is verified and the byte code is automatically optimized when compiled to machine code. However, the byte code is an abstraction that intentionally contains some redundancy in order to make the meaning of the byte code more explicit. And just like the Java language uses different names for similar things to make the language more readable, the byte code instruction set contains some redundancy as well. And as another benefit, verification and byte code interpretation/compilation can speed up since a method's invocation type does not always need to be inferred but is explicitly stated in the byte code. This is desirable because verification, interpretation and compilation are done at runtime.
As a final anecdote, I should mention that a class's static initializer <clinit> was not flagged static before Java 5. In this context, the static invocation could also be inferred by the method's name but this would cause even more run time overhead.
There are the definitions:
http://docs.oracle.com/javase/specs/jvms/se5.0/html/Instructions2.doc6.html#invokestatic
http://docs.oracle.com/javase/specs/jvms/se5.0/html/Instructions2.doc6.html#invokespecial
There are significant differences. Say we want to design an invokesmart instruction, which would choose smartly between inkovestatic and invokespecial:
First, it would not be a problem to distinguish between static and virtual calls, since we can't have two methods with same name, same parameter types and same return type, even if one is static and second is virtual. JVM does not allow that (for a strange reason). Thanks raphw for noticing that.
First, what would invokesmart foo/Bar.baz(I)I mean? It may mean:
A static method call foo.Bar.baz that consumes int from operand stack and adds another int. // (int) -> (int)
An instance method call foo.Bar.baz that consumes foo.Bar and int from operand stack and adds int. // (foo.Bar, int) -> (int)
How would you choose from them? There may exist both methods.
We may try to solve it by requiring foo/Bar.baz(Lfoo/Bar;I) for the static call. However, we may have both public static int baz(Bar, int) and public int baz(int).
We may say that it does not matter and possibly disable such situation. (I don't think that it is a good idea, but just to imagine.) What would it mean?
If the method is static, there are probably no additional restrictions. On the other hand, if the method is not static, there are some restrictions: "Finally, if the resolved method is protected (§4.6), and it is either a member of the current class or a member of a superclass of the current class, then the class of objectref must be either the current class or a subclass of the current class."
There are some further differences, see the note about ACC_SUPER.
It would mean that all the referenced classes must be loaded before bytecode verification. I hope this is not necessary now, but I am not 100% sure.
So, it would mean very inconsistent behavior.

Will the Hotspot VM inline functions as necessary?

I am converting some C++ code to Java and I was wondering what I can do about the inlined functions. Can I assume that functions will be inlined by the VM (as an when necessary) and just not worry about this? How do I profile to observe this behaviour? Suppose there is a main outer function, and I throw a for loop around it and cause a million invocations. Should I expect to see an improvements as the VM inlines more and more?
Yes Java does inline method calls. The inlining is performed by the JIT compiler, so you won't see it by examining the bytecode files.
Whether inlining actually occurs for a given method call will depend on the size of the method body, and whether the call is inlineable. (If a method call involves dispatching ... after the JVM has a bunch of global optimization designed to remove unnecessary dispatching ... then it cannot be inlined.)
The same applies to your example with your outer main function. It depends on how big the method body is. On the other hand, if the method takes a significant time to execute, then the relative importance of the optimization decreases correspondingly.
My advice is to not worry about things like this at this stage. Just write the code clearly and simply, and let the JIT compiler deal with the problem of optimizing. When your application is working, you can profile it and see if there are any "hot spots" in the code that are worthwhile optimizing by hand.
But I should be able to see this in something like Visual VM right? I mean initially no inlining, then more and more stuff is inlined so the average time for the outer method is slightly reduced.
It may be observable and it may not, depending on the amount time spent in making the calls relative to executing the method bodies. (Profiling often relies on sampling the program counter. The reported times may be inaccurate if the number of samples for a given region of code is too small ... and for other reasons.)
It also depends on the JVM you are using. Not all JVMs will re-optimize code that they have previously optimized.
Finally, there is a way to get the JVM to dump the native code output by the JIT compiler. That will give you a definitive answer as to what has been inlined ... if you are prepared to read the machine instructions.

What OCaml standard library types cannot be marshalled?

I have a failure Marshaling a data structure (error abstract type (Custom)). There is one known abstract type in use, namely Big_int. However that Marshals fine. There is no custom C code in the application. Apart from Nums, Unix library is also used (however I don't believe there are any active objects of that type). We're Marshal'ing with Closures.
Two (only) third party libraries are in use: OCS Scheme (Scheme interpreter, pure Ocaml) and Dypgen (extensible GLR parser, also pure Ocaml). The problem is with a new feature of Dypgen, saving a dynamically extended parser.
The Ocaml error message is next to useless (it doesn't identify which abstract type with Custom tag is the culprit).
We suspected Lexbuf as the culprit because it contains a closure over an Ocaml channel, and can't be Marshal'ed, but it seems this isn't the problem. So my question is:
Which standard library components can't be Marshall'd?
Weak arrays cannot be marshaled. I am not familiar with OCS Scheme, but I would expect an interpreter for a garbage-collected language written in OCaml to use weak pointers (they let you piggy-back on OCaml's memory management).
In OCaml's defense, I do not think that the Custom method block contains the name of the type (retrospectively, that seems like a good thing to have).
EDIT: Yep:
$ grep Weak ~/Downloads/ocs-1.0.3/src/*.ml
/Users/pascal/Downloads/ocs-1.0.3/src/ocs_sym.ml:module SymTable = Weak.Make (HashSymbol)
EDIT2:
As pointed out by ygrek, there is room for a name in the custom method block. I should also clarify that weak arrays are not custom values, since my answer seemed to imply that. Weak arrays have the Abstract tag and are chained using the first word of data so that the garbage collector can traverse them in special weak-pointer-related phases of the collection cycle.