Is the JVM ever re-compiling compiled code?

Is the JVM (any of them) ever re-compiling code that has already been compiled at runtime?

It depends on what you mean by re-compiling, but the HotSpot VM will discard code that relies on optimistic assumptions when they are proven to be wrong or no longer relevant. See deoptimization:
Deoptimization is the process of changing an optimized stack frame to an unoptimized one. With respect to compiled methods, it is also the process of throwing away code with invalid optimistic optimizations, and replacing it by less-optimized, more robust code.
The fourth point is particularly interesting:
If a class is loaded that invalidates an earlier class hierarchy analysis, any affected method activations, in any thread, are forced to a safepoint and deoptimized.
This applies to optimistic method inlining as described in this paper:
A class hierarchy analysis (CHA) is used to detect virtual call sites where currently only one suitable method exists. This method is then optimistically inlined. If a class is loaded later that adds another suitable method and, therefore, the optimistic assumption no longer holds, the method is deoptimized.
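As a hedged illustration of that last point, here is a minimal Java sketch (all class and method names are invented) in which HotSpot may first inline a monomorphic virtual call based on CHA and must later deoptimize the caller once a second implementation is loaded. Running it with -XX:+PrintCompilation typically shows the compiled caller being marked "made not entrant" around the time the second class is loaded:

```java
// Hypothetical example: Shape, Square, Circle and Cha are invented names.
interface Shape {
    double area();
}

class Square implements Shape {
    public double area() { return 4.0; }
}

// Not touched until after the hot loop, so HotSpot initially sees Square as
// the only Shape implementation and may inline area() at the call site below.
class Circle implements Shape {
    public double area() { return 3.0; }
}

public class Cha {
    // Initially a monomorphic call site: CHA justifies inlining Square.area().
    static double callSite(Shape s) { return s.area(); }

    public static void main(String[] args) throws Exception {
        Shape square = new Square();
        double total = 0;
        for (int i = 0; i < 1_000_000; i++) {
            total += callSite(square); // gets JIT-compiled with area() inlined
        }
        // Loading and using a second implementation invalidates the CHA
        // assumption; HotSpot deoptimizes callSite and recompiles it.
        Shape circle = (Shape) Class.forName("Circle")
                .getDeclaredConstructor().newInstance();
        for (int i = 0; i < 1_000_000; i++) {
            total += callSite(circle);
        }
        System.out.println(total);
    }
}
```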

Related

Kotlin Arrow EffectScope.shift() implementation throws exception?

When looking at Arrow's documentation about functional error handling, one of the reasons listed for avoiding exceptions is their performance cost (referencing "The hidden performance costs of instantiating Throwables").
So it is suggested to model errors/failures as an Effect.
When building an Effect to interrupt a computation, the shift() method should be used (and under the hood it is also what is used to "unwrap" the effects through the bind() method).
Looking at the shift() method implementation, it seems that its magic is done by... throwing an exception. This means that exceptions are created not only when we want to signal an error, but also to "unwrap" any missing Option, the Left instance of an Either, and all the other effect types exposed by the library.
What I'm not getting is: is there some optimization done to avoid the issues with "the hidden performance costs of instantiating Throwables", or are they in the end not a real problem?
The argument that this is the biggest reason for using typed errors on the JVM is probably an overstatement; there are better reasons for using typed errors. Exceptions are not typed, so they are not tracked by the compiler. This is what we want to avoid if we care about type-safety or purity. This will be better reflected in the documentation for 2.x.x.
Avoiding the performance penalty can be a benefit in hot-loops, but in general application programming it can probably be neglected.
However, to answer your question about how this is dealt with in Kotlin and Arrow:
In Kotlin, cancellation of coroutines works through CancellationException, so this mechanism is required for correct interoperation with the Kotlin language. You can find more details in the Arrow 2.x.x Raise design document.
It's possible to remove the performance penalty of exceptions, which is also what Arrow is doing (except for a small regression in a single version, which was fixed in the next release).
An example of this can also be found in the official KotlinX Coroutines library, which applies the same technique of disabling stack traces for JobCancellationException.
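The core of that technique can be sketched in plain Java (the class name is invented): a control-flow exception opts out of stack-trace capture, which is where most of the cost of instantiating a Throwable lies.

```java
// Hypothetical control-flow exception; the name is illustrative.
public class ControlFlowSignal extends RuntimeException {
    public ControlFlowSignal(String message) {
        // writableStackTrace = false skips the expensive fillInStackTrace()
        // call; enableSuppression = false avoids tracking suppressed exceptions.
        super(message, null, false, false);
    }
}
```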

When creating lots of ByteBuddy classes, do I need to acquire locks of any kind?

I am creating several ByteBuddy classes (using DynamicTypeBuilder) and loading them. The creation of these classes and the loading of them happens on a single thread (the main thread; I do not spawn any threads myself nor do I submit anything to an ExecutorService) in a relatively simple sequence.
I have noticed that running this in a unit test several times in a row yields different results. Sometimes the classes are created and loaded fine. Other times I get errors from the generated bytecode when it is subsequently used (often in the general area where I am using withArgumentArrayElements, if it matters; ArrayIndexOutOfBoundsExceptions and the like); other times again this all works fine (with the same inputs).
This feels like a race condition, but as I said I'm not spawning any threads. Since I am not using threads, only ByteBuddy (or the JDK) could be. I am not sure where that would be. Is there a ByteBuddy synchronization mechanism I should be using when creating and loading classes with DynamicTypeBuilder.make() and getLoaded()? Maybe some kind of class resolution is happening (or not happening!) on a background thread or something at make() time, and I am accidentally somehow preventing it from completing? Maybe if I'm going to use these classes immediately (I am) I need to supply a different TypeResolutionStrategy? I am baffled, as should be clear, and cannot figure out why a single-threaded program with the same inputs should produce generated classes that behave differently from run to run.
My pattern for loading these classes is:
Try to load the (normally non-existent) class using Class#forName(name, true, Thread.currentThread().getContextClassLoader()).
If (when) that fails, create the ByteBuddy-generated class and load it using the usual ByteBuddy recipes.
If that fails, it would be only because some other thread might have created the class already. In this unit test, there is no other thread. In any case, if a failure were to occur here, I repeat step 1 and then throw an exception if the load fails.
Are there any ByteBuddy-specific steps I should be taking in addition or instead of these?
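For concreteness, here is a hedged Java sketch of the three-step pattern just described (the class name, the generated type, and the choice of loading strategy are all illustrative assumptions, not the asker's actual code):

```java
import net.bytebuddy.ByteBuddy;
import net.bytebuddy.dynamic.loading.ClassLoadingStrategy;

public class LoadOrDefine {
    public static Class<?> loadOrDefine(String name) throws ClassNotFoundException {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        try {
            return Class.forName(name, true, loader);             // step 1: try to load
        } catch (ClassNotFoundException notYetDefined) {
            try {
                return new ByteBuddy()                            // step 2: generate and load
                        .subclass(Object.class)
                        .name(name)
                        .make()
                        .load(loader, ClassLoadingStrategy.Default.INJECTION)
                        .getLoaded();
            } catch (RuntimeException | LinkageError raceOrDuplicate) {
                // step 3: another definer may have won the race; retry the
                // plain load and let ClassNotFoundException propagate if
                // that also fails.
                return Class.forName(name, true, loader);
            }
        }
    }
}
```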
Phew! I think I can chalk this up to a bug in my code (thank goodness). Briefly, what looked like concurrency issues was (most likely) an issue with accidentally shared class names and HashMap iteration order: when one particular subclass was created-and-then-loaded, the other would simply be loaded (not created), and vice versa. The net effect was behavior that looked like that of a race condition.
Byte Buddy is fully thread-safe. But it does attempt to create a class every time you invoke load, which is a fairly expensive operation. To avoid this, Byte Buddy offers the TypeCache mechanism that allows you to implement an efficient cache.
Note that libraries like cglib offer automatic caching. Byte Buddy does not do this, since such a cache uses all inputs as keys and references them statically, which can easily create memory leaks. Also, the keys are rather inefficient, which is why Byte Buddy chose this approach.
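A minimal sketch of the TypeCache in use, assuming a String key and a trivial generated type (both are illustrative choices, not a prescribed pattern):

```java
import net.bytebuddy.ByteBuddy;
import net.bytebuddy.TypeCache;

// Hypothetical generator: the cache ensures a class is only generated once
// per (class loader, key) pair; repeated calls return the cached Class.
public class CachedGenerator {
    private final TypeCache<String> cache = new TypeCache<>(TypeCache.Sort.SOFT);

    public Class<?> generate(ClassLoader loader, String key) {
        // findOrInsert only invokes the callable when no class is cached
        // for this loader and key, avoiding repeated, expensive generation.
        return cache.findOrInsert(loader, key, () -> new ByteBuddy()
                .subclass(Object.class)
                .name("com.example.Generated$" + key) // illustrative name
                .make()
                .load(loader)
                .getLoaded());
    }
}
```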

What are the specifics about the continuations upon which Raku(do) relies?

The topic of delimited continuations was barely discussed among programming language enthusiasts in the 1990s and 2000s. It has recently been re-emerging as a major thing in programming language discussions.
My hope is that someone can at least authoritatively say whether the continuations underlying Rakudo (as contrasted with Raku) do or don't have each of the six characteristics listed below. I say a bit more about the sort of answer I'm hoping for after the list.
Quoting verbatim (with a formatting touch up) from an online message[1] written by the person driving the work on adding continuations to the JVM:
Asymmetric: When the continuation suspends or yields, the execution returns to the caller (of Continuation.run()). Symmetric continuations don't have the notion of a caller. When they yield, they must specify another continuation to transfer the execution to. Neither symmetric nor asymmetric continuations are more powerful than one another, and each could be used to simulate the other.
Stackful: The continuation can be suspended at any depth in the call stack, rather than only in the same subroutine where the delimited context begins, as is the case when the continuation is stackless (e.g. in C#). I.e. the continuation has its own stack rather than just a single subroutine frame. Stackful continuations are more powerful than stackless ones.
Delimited: The continuation captures the execution context that starts with a specific call (in our case, the body of a certain runnable) rather than the entire execution state all the way up to main(). Delimited continuations are strictly more powerful than undelimited ones (http://okmij.org/ftp/continuations/undelimited.html), the latter considered "not practically useful" (http://okmij.org/ftp/continuations/against-callcc.html).
Multi-prompt: Continuations can be nested, and anywhere in the call stack, any of the enclosing continuations can be suspended. This is similar to the nesting of try/catch blocks, and throwing an exception of a certain type that unwinds the stack up to the nearest catch that handles it rather than just the nearest catch. An example of nested continuations can be using a Python-like generator inside a virtual thread. The generator code can do a blocking IO call, which will suspend the enclosing thread continuation, and not just the generator: https://youtu.be/9vupFNsND6o?t=2188
One-shot/non-reentrant: Every time we continue a suspended continuation its state is mutated, and we cannot continue it from the same suspension state multiple times (i.e. we can't go back in time). This is unlike reentrant continuations, where every time we suspend them, a new immutable continuation object that represents a particular suspension point is returned. I.e. the continuation is a single point in time, and every time we continue it we go back to that state. Reentrant continuations are strictly more powerful than non-reentrant ones; i.e. they can do things that are strictly impossible with just one-shot continuations.
Cloneable: If we are able to clone a one-shot continuation we can provide the same ability as reentrant continuations. Even though the continuation is mutated every time we continue it, we can clone its state before continuing to create a snapshot of that point in time that we can return to later.
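The try/catch analogy in the "Multi-prompt" point above can be made concrete in plain Java. This is only a hedged illustration of the unwinding behaviour the analogy relies on (the exception class names are invented), not an actual continuation API:

```java
// Throwing unwinds to the nearest enclosing handler of a matching type,
// skipping handlers of other types, just as suspending a nested continuation
// targets the nearest enclosing prompt of the right kind.
class OuterSignal extends RuntimeException {}
class InnerSignal extends RuntimeException {}

public class NestedHandlers {
    public static void main(String[] args) {
        try {                            // outer "prompt"
            try {                        // inner "prompt"
                throw new OuterSignal(); // skips the inner handler below...
            } catch (InnerSignal e) {
                System.out.println("inner handler");
            }
        } catch (OuterSignal e) {        // ...and lands here
            System.out.println("outer handler");
        }
    }
}
```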
As I understand it, continuations aren't directly exposed in Raku, so perhaps the correct answer related to Raku (as against Rakudo) would be "there are no continuations". But that's not clear to me, so in the following, in which I describe what I'm hoping might be in an answer if I'm very lucky, I'll pretend it makes some sense to talk about them in the context of both Raku and Rakudo as two distinct realms.
Here's the sort of answer I'm imagining would be possible (though I'm just somewhat wildly guessing at what is actually true):
"As a "100 year" language design, Raku's current underlying semantic [execution?] model requires, at minimum, stackless one-shot multi prompt delimited continuations.
From a theoretic pov, Raku's design can never expand to require that continuations are cloneable but it could theoretically expand to require they are stackful.
Rakudo implements the currently required continuation semantics.
MoarVM has support for these semantics built in, and could realistically track the theoretically possible expansions of requirements if Raku's design ever so expands.
The JVM and JS backends have suitable shims that achieve the same thing, albeit at a cost to performance. It seems plausible that the JVM backend could switch to using continuations that are native to the JVM if it comes to pass that it gets them, provided of course that they meet requirements, but my current impression is that it would likely realistically be perhaps a decade away, or more, before we would need to consider crossing that bridge."
(Or something vaguely like that.)
If an answer also provided a bit more detail on something like the above, perhaps some code links, that would be a particularly awesome addition.
Similarly, if an answer included a couple brief examples of how this continuation power surfaces in current Raku features, and a speculation about how it might one day, say 10 years from now, surface in other features, that would make an answer an over-the-top brilliant one.
PS. Thank you to #Larry, who understood things deeply enough to know continuations needed to be part of the picture; to Stefan O'Rear for his contributions, including the initial implementations of what I think are one-shot multi-prompt delimited continuations; and to jnthn for making the dream come true.
Footnotes
1 There is work underway to introduce continuations as a first class construct to the JVM. A key driver of this effort is Ron Pressler. The above is based on a message he wrote in November.
Rakudo uses continuations as an implementation strategy for two features:
gather/take - for implementing lazy iterators
Making await on the thread pool non-blocking
The characteristics of the continuations implemented follow the requirements of these language features. I'll go through them in a slightly different order than above, because it makes the explanation easier.
Stackful - yes, because we need to be able to do the take or await at any depth in the callstack relative to the gather or the thread pool worker's work loop. For example, you could write a recursive graph traversal algorithm inside of a gather and then take each encountered node. For await, this is at the heart of the difference between Raku's await and await as seen in many other languages: you don't have to refactor all the way up the call stack.
Delimited - yes. The continuation reset operation installs a tag (or "prompt"), and when we do a continuation control operation, we slice the stack at this delimiter. I can't imagine how you'd implement the Raku features involved without them being delimited.
Multi-prompt - yes, this is required because you can be iterating one data source provided by a gather inside of another gather's implementation, or do an await inside of a gather.
Asymmetric - after the continuation has been taken, execution continues after the reset instruction. In the await case, we go and find another task in the worker task queue, and in the take case we're back in the pull-one method of the iterator and can return the taken value. I think this approach fits well in a language where only a few features use continuations.
One-shot/non-reentrant - yes, and at least in MoarVM the memory safety of the runtime depends on this property. It is enforced by an atomic compare-and-swap operation, so if two threads were to race to invoke the continuation, only one could ever succeed (see the sketch after this list). No Raku features need the additional complexity that reentrant continuations would imply.
Cloneable - no, because no Raku features need it. In theory this isn't too awful to implement in MoarVM in terms of saying "yes, we can do it", but I suspect it raises a lot of questions, such as "how deep should the clone go?". If you just cloned all the invocation records and similar, you'd still share Scalar containers, Arrays, etc. between the clones.
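For illustration, the compare-and-swap enforcement of the one-shot property can be sketched in Java (the class is invented for this answer; MoarVM's actual implementation lives in its C runtime):

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration of one-shot enforcement via compare-and-swap:
// however many threads race to invoke(), at most one CAS can succeed.
final class OneShotContinuation {
    private final AtomicBoolean consumed = new AtomicBoolean(false);
    private final Runnable resumption;

    OneShotContinuation(Runnable resumption) { this.resumption = resumption; }

    void invoke() {
        // Only the first caller flips false -> true; all others fail fast.
        if (!consumed.compareAndSet(false, true)) {
            throw new IllegalStateException("continuation already invoked");
        }
        resumption.run();
    }
}
```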
As I understand it - though I follow from a distance - the JVM continuations are at least partly aimed at the same design space that the Raku await mechanism is in, and so I'd be surprised if they didn't end up providing what Raku needs. This would clearly simplify compilation of Raku code to the JVM (currently it does the global CPS transform as it does code generation, which curiously turned out simpler than I expected), and it'd almost certainly perform much better too, because the transform required probably obscures quite a few things from the perspective of the JIT compiler.
So far as code goes, you can see the current continuations implementation, which uses the continuation data structure which in turn has various bits of memory management. At the time of writing, these have all been significantly refactored as part of the new callstack representation required by ongoing dispatcher work; those changes do make working with continuations more efficient, but don't change the overall set of operations.

Java Monitors: Does having a Java monitor with Synchronised Methods Avoid Deadlocks?

Basically, if I have lots of synchronized methods in a monitor, will this effectively avoid deadlocks?
In general, no, it does not guarantee the absence of deadlocks. Please have a look at the code examples at Deadlocks and Synchronized methods and Deadlock in Java: two classes, A and B, with only synchronized methods generate a perfect deadlock.
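A minimal sketch of that situation (the class names A and B follow the description above; the method bodies are illustrative): each thread enters one object's monitor and then blocks forever waiting for the other's.

```java
// Two classes whose methods are all synchronized can still deadlock:
// thread 1 holds a's monitor and waits for b's, thread 2 the reverse.
class A {
    synchronized void callB(B b) { b.last(); } // holds a, needs b
    synchronized void last() {}
}

class B {
    synchronized void callA(A a) { a.last(); } // holds b, needs a
    synchronized void last() {}
}

public class DeadlockDemo {
    public static void main(String[] args) {
        A a = new A();
        B b = new B();
        new Thread(() -> { for (;;) a.callB(b); }).start(); // locks a, then b
        new Thread(() -> { for (;;) b.callA(a); }).start(); // locks b, then a
        // Both threads soon block permanently; a thread dump shows the cycle.
    }
}
```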
Also, in my opinion, your wording "Java monitor with synchronised methods", although conceptually correct, deviates slightly from the terminology accepted in Java. For example, the java.lang.Object.wait() javadoc puts it the following way:
"The current thread must own this object's monitor"
That implicitly suggests that the object and the monitor are not the same thing. Instead, the monitor is something we don't directly see or address.

How do I debug singletons in Objective-C

My app contains several singletons (following this tutorial). I've noticed, however, that when the app crashes because of a singleton, it becomes nearly impossible to figure out where the crash came from. The app breaks at the main function with an EXC_BAD_ACCESS even though the problem lies in one of the singleton objects. Is there a guide to how I would debug my singleton objects if they were problematic?
if you don't want to change your design (as recommended in my other post), then consider the usual debugging facilities: assertions, unit tests, zombie tests, memory tests (GuardMalloc, scribbling), etc. this should identify the vast majority of issues one would encounter.
of course, you will have some restrictions regarding what you can and cannot do - notably regarding what cannot be tested independently using unit tests.
as well, reproducibility may be more difficult in some contexts when/if you are dealing with a complex global state because you have created several enforced singletons. when the global state is quite large and complex, testing these types independently may not be fruitful in all cases, since the bug may appear only in a complex global state found in your app (when 4 singletons interact in a specific manner). if you have isolated the issue to interactions of multiple singleton instances (e.g. MONAudioFileCache and MONVideoCache), placing these objects in a container class will allow you to introduce coupling, which will help diagnose this. although increasing coupling is normally considered a bad thing, this doesn't really increase coupling (it already exists as components of the global state); it simply concentrates existing global state dependencies when the state of these singletons affects other components of the mutable global state.
if you still insist on using singletons, these may help:
either make them thread safe or add some assertions to verify mutations happen only on the main thread (for example). too many people assume an object with atomic properties implies the object is thread safe. that is false.
encapsulate your data better, particularly that which mutates. for example: rather than passing out an array your class holds for the client to mutate, have the singleton class add the object to the array it holds. if you truly must expose the array to the client, then return a copy (see the sketch after this list). this is just basic OOD, but many objc devs expose the majority of their ivars, disregarding the importance of encapsulation.
if it's not thread safe and the class is used in a multithreaded context, make the class (not the client) implement proper thread safety.
design singletons' error checking to be particularly robust. if the programmer passes an invalid argument or misuses the interface - just assert (with a nice message about the problem/resolution).
do write unit tests.
detach state (e.g. if you can remove an ivar easily, do it)
reduce complexity of state.
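a hedged sketch of the defensive-copy advice above, written in Java for brevity (the names are invented; the same shape applies to returning a copy of an NSMutableArray in objective-c):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical singleton-style cache illustrating the encapsulation advice:
// clients hand items in; they never get a reference to the mutable array.
final class ItemCache {
    private final List<String> items = new ArrayList<>();

    // mutation happens inside the class, not by handing out the array
    public synchronized void add(String item) { items.add(item); }

    // if the contents must be exposed, return an unmodifiable copy
    public synchronized List<String> snapshot() {
        return Collections.unmodifiableList(new ArrayList<>(items));
    }
}
```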
if something is still impossible to debug after writing/testing with thorough assertions, unit tests, zombie tests, memory tests (GuardMalloc, scribbling), etc., you are writing programs which are too complex (so divide the complexity among multiple classes), or the requirements do not match the actual usage. if you're at that point, you should definitely refer to my other post. the more complex the global variable state, the more time it will take to debug, and the less you can reuse and test your programs when things do go wrong.
good luck
I scanned the article, and while it had some good ideas it also had some bad advice, and it should not be taken as gospel.
And, as others have suggested, if you have a lot of singleton objects it may mean that you're simply keeping too much state global/persistent. Normally only one or two of your own should be needed (in addition to those that other "packages" of one sort or another may implement).
As to debugging singletons, I don't understand why you say it's hard - no worse than anything else, for the most part. If you're getting EXC_BAD_ACCESS it's because you've got some sort of addressing bug, and that's nothing specific to singleton schemes (unless you're using a very bad one).
Macros make debugging difficult because the lines of code they incorporate can't have breakpoints put in them. Deep six macros, if nothing else. In particular, the SYNTHESIZE_SINGLETON_FOR_CLASS macro from the article is interfering with debugging. Replace the call to this macro function with the code it generates for your singleton class.
ugh - don't enforce singletons. just create normal classes. if your app needs just one instance, create one and add it to something which is created once, such as your app delegate.
most cocoa singleton implementations i've seen should not have been singletons.
then you will be able to debug, test, create, mutate and destroy these objects as usual.
the good part is of course that the majority of your global variable pains will disappear when you implement these classes as normal objects.