JVM crashes/OutOfMemoryError when Java Instrumentation retransforms a class many times

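I wrote a Java agent with a class that implements the ClassFileTransformer method below:
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.IllegalClassFormatException;
import java.security.ProtectionDomain;
class MyTransformer implements ClassFileTransformer {
@Override
public byte[] transform(ClassLoader loader, String classNameWithSlash,
Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
byte[] classfileBuffer) throws IllegalClassFormatException {
// Returning null tells the JVM that no transformation was performed.
return null;
}
}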
In order to be able to retransform a loaded class, I registered the transformer as retransformation-capable:
inst.addTransformer(new MyTransformer(), true);
Functionally it works, but there is a problem.
When testing with Oracle JDK 1.8.0_102 (on 64-bit Linux), I enabled tracing of class loading and unloading by adding the arguments -XX:+TraceClassLoading -XX:+TraceClassUnloading to the JVM startup arguments. I can see a class-loading trace line each time inst.retransformClasses(aClass) is invoked, e.g.
[Loaded myapp.MyClass from __VM_RedefineClasses__]
This is fine for class loading, but when I retransform the same class again, I see this 'loading' trace line again and never see any trace related to 'unloading'.
So the problem is: if I call inst.retransformClasses() a certain number of times, the JVM crashes! When I increase the setting for -XX:MetaspaceSize, the number of calls I can make before the JVM crashes increases as well. The two numbers are highly correlated.
So, the question: is this a JVM bug, or is it the agent developer's job to unload a class being transformed?
Note: The JVM crashes (or complains about running out of memory in Metaspace) even if my transformer does no transformation at all and simply returns null.

Classes are loaded by default when their functionality is first used, but unloading does not work the same way. Class unloading is done by the GC, which performs some heuristic checks before unloading classes. This is not a bug. The JVM is working as designed.
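To see how class counts and Metaspace usage evolve while experimenting, the standard java.lang.management beans can be polled from inside the JVM. The following is only a minimal sketch: the class name ClassUnloadingProbe and its methods are made up for illustration, and the memory pool name "Metaspace" is an assumption that holds for HotSpot on Java 8 and later but may differ on other JVMs.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not part of the original agent.
final class ClassUnloadingProbe {

    // Number of classes currently loaded in this JVM.
    static int loadedClassCount() {
        return ManagementFactory.getClassLoadingMXBean().getLoadedClassCount();
    }

    // Bytes used in the pool named "Metaspace" (assumed HotSpot name), or -1 if no such pool exists.
    static long metaspaceUsedBytes() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if ("Metaspace".equals(pool.getName())) {
                MemoryUsage usage = pool.getUsage();
                return usage == null ? -1 : usage.getUsed();
            }
        }
        return -1;
    }

    // One-line summary suitable for logging before and after a batch of retransformations.
    static String snapshot() {
        ClassLoadingMXBean classes = ManagementFactory.getClassLoadingMXBean();
        return "loaded=" + classes.getLoadedClassCount()
                + " totalLoaded=" + classes.getTotalLoadedClassCount()
                + " unloaded=" + classes.getUnloadedClassCount()
                + " metaspaceUsed=" + metaspaceUsedBytes();
    }
}
```

Logging ClassUnloadingProbe.snapshot() before and after a series of retransformClasses calls shows whether the unloaded counter or the Metaspace usage moves at all between garbage collections.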

Related

The format of class data loaded in method area of jvm by the class loader?

What is the format of the class-related information that the JVM's class loader loads into the method area? Is that class information present in machine language (the native instruction set)?
The JVM's execution engine converts bytecode into native instructions. So I would like to know whether the class-related information loaded into the method area by class loaders is already in the native instruction set, or whether it is stored in some other format and later converted to native instructions by the JVM's execution engine.

Javassist NotFoundException when getting java.io.Serializable with JDK9

I have the following code:
private static CtClass resolveCtClass(String clazz) throws NotFoundException {
ClassPool pool = ClassPool.getDefault();
return pool.get( clazz );
}
When running under JDK8, if this method is called using java.io.Serializable, it works, but when running under the JDK9 environment, it throws the NotFoundException.
Is there something I overlooked here?
This no longer happens with the current EA builds of Java 9. Class files are now always locatable, even if they are encapsulated in a module.
This is a consequence of Java 9's module encapsulation where non-exported resources are no longer available via the ClassLoader API. Under the covers, Javassist calls
ClassLoader.getSystemClassLoader().findResource("java/io/Serializable.class");
to get hold of the class file for Serializable. It then parses this class file and represents the information similarly to the Java reflection API but without loading the class such that it can be edited prior to loading it.
Until Java 8, this class file was accessible because most class loaders look up a class file before loading it, so the above call returned a URL pointing to the file. Since Java 9, resources of named modules are only available via the new API method findResource(String, String), where the additional argument names the module containing the class.
The short answer is: Javassist no longer works with Java 9, and neither will any of its dependent projects. This is a known issue with the current Java 9 implementation and will hopefully be fixed prior to release.
(I never used Javassist so I'm just shooting in the dark, here...)
The documentation of ClassPool says:
If get() is called on this object, it searches various sources represented by ClassPath to find a class file and then it creates a CtClass object representing that class file.
This seems to be bound to the concept of the class path. Looking at ClassPath and CtClass supports that assumption.
If that is the case, then Javassist might just not be fit to look into JDK 9's brand new modules.
If my guess is correct, you should not be able to get any JDK class from the pool. This should be easily verifiable.
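The lookup at the heart of this discussion can be checked without Javassist, using only the JDK. The sketch below asks the system class loader for the class file of a given type as a resource; the class name ClassFileLookup and its locate method are invented for this example, and this is not a copy of Javassist's own code.

```java
import java.net.URL;

// Hypothetical helper for illustration only.
final class ClassFileLookup {

    // Returns the URL of the .class resource for the given type, or null if it cannot be located.
    static URL locate(Class<?> type) {
        String resource = type.getName().replace('.', '/') + ".class";
        return ClassLoader.getSystemResource(resource);
    }

    public static void main(String[] args) {
        // Prints whatever URL the running JDK reports; the URL scheme differs between JDK versions.
        System.out.println(locate(java.io.Serializable.class));
    }
}
```

If locate returns null for a JDK class on a given runtime, any tool that relies on reading class files as resources would fail in the same way as described above.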

When is class side initialize sent?

I am curious about when the class side initialize messages are sent in Smalltalk (Pharo and Squeak particularly). Is there any specified order? Is it at least safe to assume all other classes packaged with it have already been loaded and compiled, or does the system eagerly initialize (send initialize before even finishing loading and compiling the other classes)?
The class-side initialize is never sent by the system. During development you do it manually (which is why many of these methods have a "self initialize" comment).
When exporting code of a class into a changeset, the exporter puts a send of initialize at the very end, so it gets executed when the class is loaded into another system.
This behavior is mimicked by Monticello. When loading a class for the first time, or when the code of the initialize method was changed, it is executed. That is because conceptually MC builds a changeset on-the-fly containing the difference of what is in the image already and what the package to be loaded contains. If that diff includes a class-side initialize method, it will be executed when loading that package version.
As you asked about loading and compiling, I'm assuming you mean when loading code...
When loading a package or changeset, class-side #initialize methods are called after all code is installed. While you cannot count on a specific order, you can assume that all classes and methods from that package are loaded.
As Bert pointed out, if you were not loading but implementing class-side #initialize, you'd have to send the message yourself.
One way to know for sure is to test it yourself. Smalltalk systems make this kind of thing a little more approachable than many other systems. Just define your own MyTestClass, and then implement your own class-side (that's important) initialize method so that you can discover for yourself when it fires, how often it fires, etc.
initialize
Transcript show: 'i have been INITIALIZED!!! Muwahahahah!!!'
Make sure it works by opening a Transcript and running
MyTestClass initialize
from a Workspace. Now you can play with filing it out and back in, Monticello loading, whatever, and see when it runs.

Find Self-Referential Code in IntelliJ

In IntelliJ, when code is not used anywhere it will be "grayed out." Is there any way to see whether a set of classes isn't used anywhere?
I have this set of classes with references to each other so IntelliJ is counting this set of classes as being used. In this case I know the code is useless but it would be nice to have the ability to automatically detect this sort of thing. The logic to do this isn't amazingly difficult... Does anyone know if this is possible in IntelliJ?
This "greyed out" mark simply reflects declaration usages in other source code files or framework configuration files. Declaration usage search cannot detect orphan clusters of classes as these classes are formally referenced.
There is a technique that may help here: define some root set of entry points (main() methods, web.xml declarations, etc.) and trace all the references, effectively building a graph of used classes/methods. Once the graph is complete, you can treat unvisited classes as dead code. This is pretty similar to how a tracing garbage collector marks the objects reachable from its roots. It is quite difficult and resource-consuming to do as on-the-fly code analysis, so IntelliJ implements it as a separate inspection that you run manually.
To demonstrate it let's create a fresh project containing the following code:
public class Main {
public static void main(String[] args) {
System.out.println(new Used());
}
}
class Used {}
class ObviouslyUnused {}
class TrickyUnused1 {
TrickyUnused1() {
System.out.println(new TrickyUnused2());
}
}
class TrickyUnused2 {
TrickyUnused2() {
System.out.println(new TrickyUnused1());
}
}
In the editor we can see that only ObviouslyUnused is greyed out. Let's run an "Unused declaration" inspection.
And here we go: the inspection shows that our unused self-referencing class cluster is not reachable.
You should be aware, though, that there are always means of referencing code in implicit ways: reflection, native calls, runtime code generation, SPI implementations, references from framework configuration files, etc. So no static analysis tool can be 100% accurate when detecting dead code.
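The entry-point tracing idea can be sketched in a few lines of Java. This is not how IntelliJ implements its inspection; it is a toy breadth-first traversal over a hand-written reference graph that mirrors the demo classes above, and the names DeadCodeSketch, sampleGraph and unreachable are invented for illustration.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of reachability-based dead code detection.
final class DeadCodeSketch {

    // Hand-written "who references whom" graph for the demo classes.
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> references = new HashMap<>();
        references.put("Main", Arrays.asList("Used"));
        references.put("Used", Collections.<String>emptyList());
        references.put("ObviouslyUnused", Collections.<String>emptyList());
        references.put("TrickyUnused1", Arrays.asList("TrickyUnused2"));
        references.put("TrickyUnused2", Arrays.asList("TrickyUnused1"));
        return references;
    }

    // Returns every class in the graph that cannot be reached from the entry points.
    static Set<String> unreachable(Map<String, List<String>> references, Set<String> entryPoints) {
        Set<String> visited = new HashSet<>(entryPoints);
        Deque<String> queue = new ArrayDeque<>(entryPoints);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            for (String target : references.getOrDefault(current, Collections.<String>emptyList())) {
                // add() returns true only the first time a class is seen, so cycles terminate.
                if (visited.add(target)) {
                    queue.add(target);
                }
            }
        }
        Set<String> dead = new TreeSet<>(references.keySet());
        dead.removeAll(visited);
        return dead;
    }
}
```

Calling DeadCodeSketch.unreachable(DeadCodeSketch.sampleGraph(), Collections.singleton("Main")) reports the mutually referencing pair along with the obviously unused class, because neither is reachable from Main. The implicit references listed above (reflection, configuration files and so on) are exactly what such a graph would miss.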

Why do we synchronize lazy singletons but not eager ones?

Typical lazy singleton:
public class Singleton {
private static Singleton INSTANCE;
private Singleton() {
}
public static synchronized Singleton getInstance() {
if(INSTANCE == null)
INSTANCE = new Singleton();
return INSTANCE;
}
}
Typical eager singleton:
public class Singleton {
private static Singleton INSTANCE = new Singleton();
public static Singleton getInstance() {
return INSTANCE;
}
}
Why are we not concerned about synchronization with eager singletons, but have to worry about synchronization with their lazy cousins?
Eager instantiation does not require explicit synchronisation for sharing the reference of the field because the JVM will have handled it for us already as part of the class loading mechanism.
To elaborate, before a class becomes available to any thread for use, it will have been loaded, verified and initialised. The compiler moves the static field assignment into the class initialiser, which runs during this initialisation stage, and the rules of the Java Memory Model ensure that all threads that access the class see its fully initialised state. This means that the JVM will have handled any hardware barriers etc. for us.
That said, I would recommend marking the eager initialisation final. This will make your intent clearer and the compiler will enforce that the eager initialisation never changes. If it did, then concurrency control would be required again.
private static final Singleton INSTANCE = new Singleton();
FYI If you are interested, section 5.5 of the Java Virtual Machine Specification covers this in much more detail. A couple of choice snippets from the spec are
"Because the Java Virtual Machine is multithreaded, initialization of a class or interface requires careful synchronization"
"For each class or interface C, there is a unique initialization lock LC"
Steps 9 and 10: "Next, execute the class or interface initialization method of C"
"If the execution of the class or interface initialization method completes normally, then acquire LC, label the Class object for C as fully initialized, notify all waiting threads, release LC, and complete this procedure normally."
By the time step 10 of the spec completes, the static fields will have been set, and the lock (LC) ensures that only one thread performs the initialisation and that the result is shared correctly.
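The guarantee described above can be observed with a small experiment: several threads race to call getInstance on an eager singleton, and the constructor still runs exactly once. This is only a sketch. The names EagerSingleton, CONSTRUCTED and EagerSingletonDemo are made up for the example, and the constructor counter exists purely for the demonstration.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Eager singleton with a counter (declared first so it exists before INSTANCE is created).
final class EagerSingleton {
    static final AtomicInteger CONSTRUCTED = new AtomicInteger();
    private static final EagerSingleton INSTANCE = new EagerSingleton();

    private EagerSingleton() {
        CONSTRUCTED.incrementAndGet();
    }

    // No synchronized keyword: class initialisation already published INSTANCE safely.
    static EagerSingleton getInstance() {
        return INSTANCE;
    }
}

final class EagerSingletonDemo {

    // Starts the given number of threads at roughly the same time and counts the distinct instances they observe.
    static int distinctInstances(int threads) {
        Set<EagerSingleton> seen = ConcurrentHashMap.newKeySet();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                seen.add(EagerSingleton.getInstance());
            });
            workers[i].start();
        }
        start.countDown();
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return seen.size();
    }
}
```

Whichever thread first touches EagerSingleton triggers the class initialisation while the others wait on the initialization lock, so every thread ends up with the same reference.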
An eager singleton is initialized when the class is first initialized by the JVM, and this happens only once. With an unsynchronized lazy singleton, however, if two clients call the getInstance method from two threads at the same time, two singletons may be created.
Because in the latter example the instance of the Singleton is always present when getInstance is called, there is nothing to synchronize. That is in contrast to the first example, where the instance isn't necessarily initialized yet. In that case getInstance contains a critical section (the if and its body) which needs to be protected (e.g. by synchronization) against simultaneous access.