How is the result of invokedynamic stored? - jvm

Java 8 introduces support for first-class functions, which allows assigning functions to variables. Such a variable must be of a functional interface type (an interface with exactly one abstract method).
So, considering an example of an interface I and a class A with the following definition:
interface I { int foo(); }

class A implements I {
    public int foo() { return 7; }
    public static int bar() { return 11; }
}
We can assign to a variable of type I either an instance of A or a method reference to the method bar of A. Both can be stored in variables of type I:
I i1 = new A();
I i2 = A::bar;
If we analyze the bytecode resulting from the compilation of the previous code, we get:
0: new #2 // class A
3: dup
4: invokespecial #3 // Method A."<init>":()V
7: astore_1
8: invokedynamic #4, 0 // InvokeDynamic #0:foo:()LI;
13: astore_2
For i1 = new A(); it is clear that the corresponding instruction 7: astore_1 stores an instance of A, which is compatible with I. But for i2 = A::bar we are storing the result of 8: invokedynamic #4, 0.
So, does that mean that the result of an invokedynamic is always an instance of the target type, i.e. the type of the variable to which we assign the method reference?

Each invokedynamic bytecode refers to a corresponding CONSTANT_InvokeDynamic_info structure in the constant pool. This structure contains a Method Descriptor that is used to derive the types of the arguments and the type of return value for this invokedynamic instruction.
In your example the method descriptor is ()LI; computed during source-to-bytecode translation.
8: invokedynamic #4, 0 // InvokeDynamic #0:foo:()LI;
^^^^^
It means that this particular bytecode expects no arguments and always produces the result of type I.
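For illustration, the same descriptor can be expressed with the java.lang.invoke.MethodType API, which is also what the JVM passes to the bootstrap method when it links the call site. This is only a sketch; the nested interface I merely stands in for the I of the question:

import java.lang.invoke.MethodType;

public class DescriptorDemo {
    interface I { int foo(); }

    public static void main(String[] args) {
        // ()LI; : no parameter types, return type I
        MethodType siteType = MethodType.methodType(I.class);
        System.out.println(siteType.parameterCount()); // 0
        System.out.println(siteType.returnType());     // the functional interface I
    }
}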

The result of the invokedynamic instruction, the way Java 8’s lambda expressions and method references use it, is indeed an instance of the target functional interface.
It isn’t the result of the invokedynamic instruction that the JVM remembers, but the CallSite returned by the bootstrap method (in the case of the new Java 8 features, one of the two methods of the LambdaMetafactory).
The CallSite instances linked to an invokedynamic instruction encapsulate behavior, not a particular result value. The actual behavior provided by the LambdaMetafactory is intentionally unspecified to provide a wide degree of freedom but the current implementation exhibits two different behaviors.
For non-capturing lambda expressions, the behavior is to return a single instance which has been created during the bootstrapping of the invokedynamic instruction. This can be done by wrapping a constant method handle, which simply returns that instance, in a ConstantCallSite. In this case, subsequent executions of the invokedynamic instruction will evaluate to this same instance.
For lambda expressions which capture values, the instruction will be linked against a constructor or factory method of a generated class which accepts the captured values. Subsequent executions of the invokedynamic instruction will then behave like an ordinary object construction, creating a new instance of the class that implements the target interface each time.
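A small sketch of the observable difference follows; the helper methods are only for illustration, and the result reflects the current LambdaMetafactory implementation rather than anything guaranteed by the specification:

import java.util.function.IntSupplier;
import java.util.function.IntUnaryOperator;

public class LambdaIdentityDemo {
    static IntSupplier nonCapturing() {
        return () -> 42;            // captures nothing
    }

    static IntUnaryOperator capturing(int offset) {
        return x -> x + offset;     // captures offset
    }

    public static void main(String[] args) {
        // Non-capturing: the call site is typically a ConstantCallSite wrapping
        // one instance, so repeated executions yield the same object.
        System.out.println(nonCapturing() == nonCapturing()); // usually true

        // Capturing: the call site is linked to a factory taking the captured
        // value, so each execution creates a new instance.
        System.out.println(capturing(1) == capturing(1));     // usually false
    }
}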


Resolving the method of an ABAP dynamic call: order of types considered

I try to provoke a behaviour described in the ABAP Keyword Documentation 7.50 but fail. It's given with Alternative 2 of CALL METHOD - dynamic_meth:
CALL METHOD oref->(meth_name) ...
Effect
... oref can be any class reference variable ... that points to an object that contains the method ... specified in meth_name. This method is searched for first in the static type, then in the dynamic type of oref
I use the test code as given below. The static type of oref is CL1, the dynamic type CL2. Shouldn't then the dynamic CALL METHOD statement call the method M in CL1?
REPORT ZU_DEV_2658_DYNAMIC.

CLASS CL1 DEFINITION.
  PUBLIC SECTION.
    METHODS M.
ENDCLASS.

CLASS CL1 IMPLEMENTATION.
  METHOD M.
    write / 'original'.
  ENDMETHOD.
ENDCLASS.

CLASS CL2 DEFINITION INHERITING FROM CL1.
  PUBLIC SECTION.
    METHODS M REDEFINITION.
ENDCLASS.

CLASS CL2 IMPLEMENTATION.
  METHOD M.
    write / 'redefinition'.
  ENDMETHOD.
ENDCLASS.

START-OF-SELECTION.
  DATA oref TYPE REF TO cl1.   " static type is CL1
  CREATE OBJECT oref TYPE cl2. " dynamic type is CL2

  oref->m( ).                  " writes 'redefinition' - that's ok
  CALL METHOD oref->('M').     " writes 'redefinition' - shouldn't that be 'original'?
Update:
I'd like to respond to the (first four) comments on my original question. Because of the lengthy code snippet, I answer by augmenting my post, not by comment.
It is true that the behaviour of the code snippet in the original question is standard OO behaviour. It's also true that for calls with a static method name and class, types are resolved as given by the link. But then:
Why does the ABAP Keyword Documentation make the statement I've linked?
Calls with dynamic method names do search for the method name in the dynamic type, as demonstrated by the following code piece. That's certainly not standard OO behaviour.
My question was: Apparently, the search mechanism differs from the one described. Is the description wrong, or am I missing something?
REPORT ZU_DEV_2658_DYNAMIC4.

CLASS CL_A DEFINITION.
ENDCLASS.

CLASS CL_B DEFINITION INHERITING FROM CL_A.
  PUBLIC SECTION.
    METHODS M2 IMPORTING VALUE(caller) TYPE c OPTIONAL PREFERRED PARAMETER caller.
ENDCLASS.

CLASS CL_B IMPLEMENTATION.
  METHOD M2.
    write / caller && ' calls b m2'.
  ENDMETHOD.
ENDCLASS.

START-OF-SELECTION.
  DATA orefaa TYPE REF TO cl_a.
  CREATE OBJECT orefaa TYPE cl_a. " static and dynamic type is CL_A
* orefaa->m2( 'orefa->m2( )' ).                              " syntax error: method m2 is unknown
* CALL METHOD orefaa->('M2') EXPORTING caller = 'CALL METHOD orefa->("M2")'. " results in exception: method m2 is unknown

  DATA orefab TYPE REF TO cl_a.   " static type is CL_A
  CREATE OBJECT orefab TYPE cl_b. " dynamic type is CL_B
* orefab->m2( 'orefab->m2( )' ).                             " results in syntax error: method m2 is unknown

  CALL METHOD orefab->('M2') EXPORTING caller = 'CALL METHOD orefab->("M2")'. " succeeds
You are actually answering your own question there, aren't you?
In your first example, you use call method to invoke the method m on a variable that's typed as cl1. The runtime looks up the class cl1 and finds the requested method m there. It then calls that method. However, the object your variable points to is actually of type cl2, a sub-class of cl1 that overrides that method m. So the call effectively reaches that redefinition of the method, not the super-class's original implementation. As you and the commenters sum it up: this is standard object-oriented behavior.
Note how in essence this has nothing to do at all with the static-vs-dynamic statement you quote from the documentation. The method m is statically present in cl1, so there is no dynamic lookup involved whatsoever. I assume you were looking for a way to probe the meaning of this statement, but this example doesn't address it.
However, your second example then precisely hits the nail on the head. Let me rewrite it again with different names to talk it through. Given an empty super class super_class:
CLASS super_class DEFINITION.
ENDCLASS.
and a sub-class sub_class that inherits it:
CLASS sub_class DEFINITION
    INHERITING FROM super_class.
  PUBLIC SECTION.
    METHODS own_method.
ENDCLASS.
Now, as super_class is empty, sub_class does not inherit any methods from it; instead, we add a method own_method specifically to this sub-class.
The following statement sequence then demonstrates exactly what's special with the dynamic calling:
DATA cut TYPE REF TO super_class.
cut = NEW sub_class( ).
CALL METHOD cut->('OWN_METHOD').
" runs sub_class->own_method
The runtime encounters the call method statement. It first inspects the static type of the variable cut, which is super_class. The requested method own_method is not present there. If this was all that happened, the call would now fail with a method-not-found exception. If we wrote a hard-coded cut->own_method( ), we wouldn't even get this far - the compiler would already reject this.
However, with call method the runtime continues. It determines the dynamic type of cut as being sub_class. Then it looks whether it finds an own_method there. And indeed, it does. The statement is accepted and the call is directed to own_method. This additional effort that's happening here is exactly what's described in the documentation as "This method is searched for first in the static type, then in the dynamic type of oref".
What we're seeing here is different from hard-coded method calls, but it is also not "illegal". In essence, the runtime here first casts the variable cut to its dynamic type sub_class, then looks up the available methods again. As if we had written DATA(casted) = CAST sub_class( cut ). casted->own_method( ). I cannot say why the runtime acts this way. It feels like the kind of relaxed behavior we usually find in ABAP when statements evolve throughout their lifetime and need to remain backwards-compatible.
There is one detail that needs additional addressing: the tiny word "then" in the documentation. Why is it important to say that it first looks in the static type, then in the dynamic type? In the example above, it could simply say "and/or" instead.
Why this detail may be important is described in my second answer to your question, which I posted some days ago. Let me sum it up briefly again here, so that this answer is complete. Given an interface with a method some_method:
INTERFACE some_interface PUBLIC.
  METHODS some_method RETURNING VALUE(result) TYPE string.
ENDINTERFACE.
and a class that implements it, but also adds another method of its own, with the exact same name some_method:
CLASS some_class DEFINITION PUBLIC.
  PUBLIC SECTION.
    INTERFACES some_interface.
    METHODS some_method RETURNING VALUE(result) TYPE string.
ENDCLASS.

CLASS some_class IMPLEMENTATION.
  METHOD some_interface~some_method.
    result = `Executed the interface's method`.
  ENDMETHOD.

  METHOD some_method.
    result = `Executed the class's method`.
  ENDMETHOD.
ENDCLASS.
Which one of the two methods is now called by CALL METHOD cut->('some_method')? The order in the documentation describes it:
DATA cut TYPE REF TO some_interface.
cut = NEW some_class( ).

DATA result TYPE string.
CALL METHOD cut->('SOME_METHOD')
  RECEIVING
    result = result.

cl_abap_unit_assert=>assert_equals(
  act = result
  exp = `Executed the interface's method` ).
Upon encountering the call method statement, the runtime checks the static type of the variable cut first, which is some_interface. This type has a method some_method. The runtime thus will continue to call this method. This, again is standard object orientation. Especially note how this example calls the method some_method by giving the string some_method alone, although its fully qualified name is actually some_interface~some_method. This is consistent with the hard-coded variant cut->some_method( ).
If the runtime acted the other way around, inspecting the dynamic type first, and the static type afterwards, it would act differently and call the class's own method some_method instead.
There is no way to call the class's own some_method through the interface reference, by the way. Although the documentation suggests that the runtime would consider cut's dynamic type some_class in a second step, it also adds that "In the dynamic case too, only interface components can be accessed and it is not possible to use interface reference variable to access any type of component."
The only way to call the class's own method some_method is by changing cut's type:
DATA cut TYPE REF TO some_class.
cut = NEW some_class( ).

DATA result TYPE string.
CALL METHOD cut->('SOME_METHOD')
  RECEIVING
    result = result.

cl_abap_unit_assert=>assert_equals(
  act = result
  exp = `Executed the class's method` ).
This is about interface implementations rather than class inheritance. What the ABAP language help means is this:
Suppose you have an interface that declares a method
INTERFACE some_interface PUBLIC.
  METHODS some_method RETURNING VALUE(result) TYPE string.
ENDINTERFACE.
and a class that implements it, but also declares a method of its own with the same name:
CLASS some_class DEFINITION PUBLIC.
  PUBLIC SECTION.
    INTERFACES some_interface.
    METHODS some_method RETURNING VALUE(result) TYPE string.
ENDCLASS.

CLASS some_class IMPLEMENTATION.
  METHOD some_interface~some_method.
    result = `Executed the interface's method`.
  ENDMETHOD.

  METHOD some_method.
    result = `Executed the class's method`.
  ENDMETHOD.
ENDCLASS.
then a dynamic call on a reference variable typed with the interface will choose the interface method over the class's own method
METHOD prefers_interface_method.
  DATA cut TYPE REF TO zfh_some_interface.
  cut = NEW zfh_some_class( ).

  DATA result TYPE string.
  CALL METHOD cut->('SOME_METHOD')
    RECEIVING
      result = result.

  cl_abap_unit_assert=>assert_equals(
    act = result
    exp = `Executed the interface's method` ).
ENDMETHOD.
This is actually the exact same behavior we are observing with regular calls to methods, i.e. if we provide the method's name in the code, not in a variable.
Only if the runtime cannot find a method with the given name in the static type will it start looking for a method with that name in the dynamic type. This is different from regular method calls, where the compiler will reject the missing some_interface~ and require us to add an alias for this to work.
By the way, as some people brought it up in the comments, the "static" here does not refer to CLASS-METHODS, as opposed to "instance" methods. "Static type" and "dynamic type" refer to different things; see the section Static Type and Dynamic Type in the help article Assignment Rules for Reference Variables.

What is a baseless method in Nim?

I'm new to the language. When trying to compile a new object type with a method (where the first argument is an instance of my new type), the compiler warned me like this:
Warning: use {.base.} for base methods; baseless methods are deprecated [UseBase]
Base methods correspond to what would be the base class for a method in a single-dispatch language. The base method is the most general application of a method to one or more classes. If you are dispatching on just a single argument, the base method should be associated with the type that would normally be the base class containing the method.
This warning typically happens to me when I define a method on a derived type -- thinking that I'm overriding behavior from a base type -- but the method signature is wrong and I'm effectively not overriding any method, hence the warning.
e.g.,
type
  Base = ref object of RootObj
  Derived = ref object of Base

method doSomething(b: Base, n: int) {.base.} =
  discard

# !!! This method gets the warning because it's not overriding the base
# !!! doSomething method, due to the different parameter types
method doSomething(d: Derived, n: string) =
  discard

Where is the definition of a class stored in memory as opposed to the instance?

This question is merely out of interest and trying to understand something about memory management in object-oriented languages. It is not specific to one language, but I just want to understand as a general principle.
What I want to know is how the declaration of an object reference is stored, compared to an instance of that type.
When you declare an object in OO source code, e.g. in Java, without instantiating it:
String s;
How does this get stored? How does the memory usage of this definition differ from when the object is actually instantiated:
s = new String("abc");
? Is there a general principle that applies to all OO languages in terms of how memory is allocated or do different language implementers use different techniques for allocating memory?
Normally, when we declare a reference like String s;, it is created as an ordinary variable, just like an int or a float, except that this kind of variable holds a memory address (a concept similar to pointers in the C language). When we then write s = new String("abc");, an object is created on the heap and its address is assigned to the reference variable s.
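A minimal Java sketch of that distinction:

public class ReferenceDemo {
    public static void main(String[] args) {
        String s;                 // just a reference slot; no object exists yet
        // System.out.println(s); // would not even compile: s is not definitely assigned
        s = new String("abc");    // allocates a String object on the heap and stores
                                  // its address in the reference variable s
        System.out.println(s);    // abc
    }
}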
In Java bytecode, all objects are stored as objects, and explicit type checking is added when needed. So, for example, this Java function
public Integer getValue(Object number) {
    int i = ((Number) number).intValue();
    return new Integer(i);
}
is translated to a bytecode like this:
(accepts java.lang.Object, returns java.lang.Integer)
-read the first argument as an Object
-if the value is not a Number, raise an exception
-call the virtual method toInt(java.lang.Integer) of the value
and remember the int result
-use the value as an int argument
-instantiate a new java.lang.Integer
-call the constructor(int) of java.lang.Integer on the new number,
getting an Object back
[since the declared return value of Number.toInt is the same
as the return value of our function, no type checking is needed]
-return the value
So, the types of unused variables get stripped out by the compiler. The types of public and protected fields are stored with their class.
The runtime type of an object is stored with the object itself. In C++, it is a pointer to the virtual method table. In Java, it is a reference in the object header to the loaded class (for example, a compressed class pointer in HotSpot).
The Java class file stores the names of all classes it depends on in a similar table (part of the constant pool). Only the class names are stored there; all field descriptions then point into this table.
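The fact that the runtime type travels with the object rather than with the variable can be observed directly, for example:

public class RuntimeTypeDemo {
    public static void main(String[] args) {
        Object o = "abc";                         // static type Object, runtime type String
        System.out.println(o.getClass());         // class java.lang.String
        System.out.println(o instanceof String);  // true - checked against the stored runtime type
    }
}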
So, when you write String s = new String("abc") (or even String s = "abc"), your class stores:
it is dependent on the class java.lang.String in the table of dependencies
"abc" in the table of String literals
your method loading a String literal by ID
(in the first case) your method calling a constructor of its first dependent class (String) with the first dependent class (String) as an argument.
the compiler can prove storing the new String in a String variable is safe, so it skips the type checking.
A class can be loaded as soon as it is referenced, or as late as its first use (in which case it is referred to by its depending class and its ID within that class). I think the latter is always the case nowadays; the sketch after the list below illustrates the lazy case.
When a class is loaded:
- its class loader is asked to retrieve the class by its name
- (in the case of the system loader) the class loader looks
  for the corresponding file in the program JAR, in the system library
  and in all libraries referenced
- the byte stream is then decoded into a structure in memory
- (in the case of early loading) all dependent classes are loaded recursively
  if not already loaded
- it is stored in the class table
- (in the case of late loading) its static initialiser is run
  (possibly loading more classes in the process)
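Here is the promised sketch of the lazy case. Strictly speaking it observes class initialization, which the JVM likewise defers until the first active use:

public class LazyLoadingDemo {
    static class Heavy {
        static { System.out.println("Heavy loaded and initialized"); }
        static int answer() { return 42; }
    }

    public static void main(String[] args) {
        System.out.println("before first use");
        // Only this first active use triggers loading/initialization of Heavy,
        // so its static initializer prints after the line above.
        System.out.println(Heavy.answer());
    }
}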
In C++, none of this class loading takes place, as all user classes and most libraries are compiled into the program as little more than virtual method tables and the corresponding methods. System functions (not classes) can still be stored in a DLL (on Windows) or a similar file and loaded at runtime. If type checking is implied by an explicit cast, it is performed via the virtual method table (RTTI). Also note that C++ did not have a runtime type checking mechanism at all for a long time.

CIL instruction "isinst <valuetype>"

The ECMA Common Language Infrastructure documentation says this about the CIL "isinst class" instruction:
Correct CIL ensures that class is a valid typeref or typedef or typespec token indicating a class, and
that obj is always either null or an object reference.
This implies that a valuetype is not allowed, right? But mscorlib.dll contains a method System.RuntimeTypeHandle::Equals(object obj) with the following instruction:
IL_0001: isinst System.RuntimeTypeHandle
And System.RuntimeTypeHandle is a valuetype. Can anybody put me right here?
Have a look at the declaration of RuntimeTypeHandle:
.class public sequential ansi serializable sealed beforefieldinit RuntimeTypeHandle
extends System.ValueType
implements System.Runtime.Serialization.ISerializable
Although RuntimeTypeHandle is declared as a struct, its representation in CIL is a kind of special class. In other words, you can think of structs as special classes that inherit from System.ValueType and whose fields are laid out in a defined order.
With that in mind, isinst can be used with RuntimeTypeHandle. As far as I can tell, isinst is not limited to reference types at all, as long as there is a class representing the type.
Let's say we write in C#:
var i = 4;
var b = i is Int32;
We get a compiler warning
Warning: The given expression is always of the provided ('int') type.
What happens? We assign 4 to i, so i becomes an int. On the next line, i is boxed to its corresponding reference type (class), which is why the warning is obvious. We could even write
var b = i is int;
I hope this contributes some clarification on this topic.

What's the difference between Polymorphism and Multiple Dispatch?

...or are they the same thing? I notice that each has its own Wikipedia entry: Polymorphism, Multiple Dispatch, but I'm having trouble seeing how the concepts differ.
Edit: And how does Overloading fit into all this?
Polymorphism is the facility that allows a language/program to decide at runtime which method to invoke, based on the types of the parameters sent to that method.
The number of parameters used by the language/runtime determines the 'type' of polymorphism supported by a language.
Single dispatch is a type of polymorphism where only one parameter is used (the receiver of the message - this, or self) to determine the call.
Multiple dispatch is a type of polymorphism wherein multiple parameters are used to determine which method to call. In this case, the receiver as well as the types of the method parameters are used to decide which method to invoke.
So you can say that polymorphism is the general term and multiple and single dispatch are specific types of polymorphism.
Addendum: Overloading happens at compile time. It uses the type information available during compilation to determine which overload to call. Single/multiple dispatch happens at runtime.
Sample code:
using NUnit.Framework;

namespace SanityCheck.UnitTests.StackOverflow
{
    [TestFixture]
    public class DispatchTypes
    {
        [Test]
        public void Polymorphism()
        {
            Baz baz = new Baz();
            Foo foo = new Foo();

            // overloading - parameter type is known during compile time
            Assert.AreEqual("zap object", baz.Zap("hello"));
            Assert.AreEqual("zap foo", baz.Zap(foo));

            // virtual call - single dispatch. Baz is used.
            Zapper zapper = baz;
            Assert.AreEqual("zap object", zapper.Zap("hello"));
            Assert.AreEqual("zap foo", zapper.Zap(foo));

            // C# doesn't support multiple dispatch, so it doesn't
            // know that oFoo is actually of type Foo.
            //
            // In languages with multiple dispatch, the runtime type of oFoo
            // would also be used, so Baz.Zap(Foo) would be called
            // instead of Baz.Zap(object).
            object oFoo = foo;
            Assert.AreEqual("zap object", zapper.Zap(oFoo));
        }

        public class Zapper
        {
            public virtual string Zap(object o) { return "generic zapper"; }
            public virtual string Zap(Foo f) { return "generic zapper"; }
        }

        public class Baz : Zapper
        {
            public override string Zap(object o) { return "zap object"; }
            public override string Zap(Foo f) { return "zap foo"; }
        }

        public class Foo { }
    }
}
With multiple dispatch, a method can take multiple arguments, and which implementation is used depends on the type of each argument. The order in which the types are evaluated depends on the language. In Lisp, each type is checked from first to last.
Languages with multiple dispatch make use of generic functions, which are just function declarations; they are not the same as generic methods, which use type parameters.
Multiple dispatch allows for subtyping polymorphism of arguments for method calls.
Single dispatch also allows for a more limited kind of polymorphism (using the same method name for objects that implement the same interface or inherit the same base class). It's the classic example of polymorphism, where you have methods that are overridden in subclasses.
Beyond that, generics provide parametric type polymorphism (i.e., the same generic interface to use with different types, even if they're not related — like List<T>: it can be a list of any type and is used the same way regardless).
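A tiny Java illustration of that parametric flavour (the names here are made up for the example):

import java.util.List;

public class ParametricDemo {
    // One implementation works uniformly for any element type T,
    // without knowing or inspecting the type at all.
    static <T> T firstOrNull(List<T> list) {
        return list.isEmpty() ? null : list.get(0);
    }

    public static void main(String[] args) {
        System.out.println(firstOrNull(List.of("a", "b"))); // a
        System.out.println(firstOrNull(List.of(1, 2, 3)));  // 1
    }
}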
Multiple Dispatch is more akin to function overloading (as seen in Java/C++), except the function invoked depends on the run-time type of the arguments, not their static type.
I've never heard of Multiple Dispatch before, but after glancing at the Wikipedia page it looks a lot like MD is a type of polymorphism, when used with the arguments to a method.
Polymorphism is essentially the concept that an object can be seen as any type that is its base. So if you have a Car and a Truck, they can both be seen as a Vehicle. This means you can call any Vehicle method on either one.
Multiple dispatch looks similar, in that it lets you call methods with arguments of multiple types; however, I don't see certain requirements in the description. First, it doesn't appear to require a common base type (not that I could imagine implementing THAT without void*), and you can have multiple objects involved.
So instead of calling the Start() method on every object in a list (which is a classic polymorphism example), you can call a StartObject(Object C) method defined elsewhere and code it to check the argument type at run time and handle it appropriately. The difference here is that the Start() method must be built into the class, while the StartObject() method can be defined outside of the class so the various objects don't need to conform to an interface.
This could be nice if the Start() method needed to be called with different arguments. Maybe Car.Start(Key carKey) vs. Missile.Start(int launchCode)
But both could be called as StartObject(theCar) or StartObject(theMissile)
Interesting concept...
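A rough Java sketch of that StartObject idea (the default key and launch code are invented for the example). A language with multiple dispatch would pick the right method from the argument's runtime type automatically, whereas here we have to check it by hand:

public class StartObjectDemo {
    static class Key { }
    static class Car     { void start(Key carKey)     { System.out.println("car started"); } }
    static class Missile { void start(int launchCode) { System.out.println("missile launched"); } }

    // Defined outside the classes; dispatches on the argument's runtime type by hand.
    static void startObject(Object o) {
        if (o instanceof Car) {
            ((Car) o).start(new Key());   // hypothetical default key
        } else if (o instanceof Missile) {
            ((Missile) o).start(0);       // hypothetical default launch code
        } else {
            throw new IllegalArgumentException("don't know how to start " + o);
        }
    }

    public static void main(String[] args) {
        startObject(new Car());     // car started
        startObject(new Missile()); // missile launched
    }
}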
if you want the conceptual equivalent of a method invocation
(obj_1, obj_2, ..., obj_n)->method
to depend on each specific type in the tuple, then you want multiple dispatch. Polymorphism corresponds to the case n=1 and is a necessary feature of OOP.
Multiple dispatch is based on polymorphism. The typical polymorphism encountered in C++, C#, VB.NET, etc. uses single dispatch -- i.e., the function that gets called depends only on a single class instance. Multiple dispatch relies on multiple class instances.
Multiple Dispatch is a kind of polymorphism. In Java/C#/C++, there is polymorphism through inheritance and overriding, but that is not multiple dispatch, which is based on two or more arguments (not just this, as in Java/C#/C++).