JVM garbage collection algorithm

I know there are different garbage collection algorithms, such as copying collection, mark-compact collection, and incremental collection. Which algorithm does the JVM use, and why are there different algorithms available?

First off, there is more than one version of the JVM.
I believe most major JVMs use generational garbage collection by default, although they may also use a hybrid strategy.
Here are some links on major JVMs that use generational garbage collection:
OJVM Generational collection
Hotspot JVM
Here is a great article I found that indicates JRockit uses a marking strategy:
Comparison of three Major JVM's

Different garbage collectors have different strengths and weaknesses; the important characteristics are throughput, pause times, and parallelization. Which garbage collectors are used or available depends on the JDK version, the JVM mode (client or server), and a large number of configuration settings. Keep in mind that GC technology evolves. Here are some useful links:
The Garbage-First Garbage Collector
Java SE 6 Performance White Paper
Java Tuning White Paper
Java HotSpot VM Options
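Since the collectors in use depend on the JDK version and settings, one way to check what a particular JVM actually picked is to ask it at runtime. A minimal sketch using the standard java.lang.management API; the class name GcInfo is just for illustration, and the collector names printed differ between JVM vendors, versions, and flags:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of any JDK.
class GcInfo {
    // Returns the names of the collectors the running JVM reports.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and time are cumulative since JVM start.
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
    }
}
```

A generational setup typically shows up as two entries, one for the young generation and one for the old generation. Running the same class with different collector flags should show the names change.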

As the JVM evolves, more collection algorithms appear to address the shortcomings of earlier ones. JDK 5.0 has four types of collector: serial, throughput, concurrent, and train.

Related

Why does a GraalVM (SubstrateVM) native image use so much less memory at runtime than a corresponding JIT build?

Why does a GraalVM (SubstrateVM) native image of a Java application consume much less memory at runtime than the same application run normally on the JVM?
And why can't the normal JIT be made to similarly consume a small amount of memory?

GraalVM native images don't include the JIT compiler or the related infrastructure -- so there's no need to allocate memory for JIT, for the internal representation of the program to JIT it (for example a control flow graph), no need to store some of the class metadata, etc.
So it's unlikely that a JIT which actually does useful work can be implemented with the same zero overhead.
It could be possible to create an economical implementation of the virtual machine that uses less memory than HotSpot, especially if you only measure the default configuration and do not compare setups where you control how much memory the JVM is allowed to use. However, one needs to realize that it will either be an incremental improvement on the existing implementations or a different choice on some trade-off, because the existing JVM implementations are actually really, really good.

Why does IntelliJ not release memory after closing a project?

I had three projects open. One of them, Spark, was very large. After closing Spark there was NO difference in memory usage as reported by the OS X Activity Monitor. Note: all projects were opened within the same IntelliJ instance.
IntelliJ is in fact using just over 4 GB, and I now have only two projects open. Those two projects take up only 1.5 GB if I shut down IntelliJ and start it up again.
So what can I do to "encourage" IntelliJ to release the memory it is using? It is running very slowly (it cannot keep up with my typing, for example).
Update: I just closed the larger of the two remaining projects. STILL no reduction in memory usage. The remaining project is a single Python file, so IntelliJ should be using under 512 MB at this point!

Following up on @PeterGromov's answer, it seems difficult to get the memory back. In addition, @KevinKrumwiede mentioned -XX:MaxHeapFreeRatio, which appears to be an avenue worth trying.
Here are a couple of those ideas taken a bit further, from "Does GC release back memory to OS?":
The HotSpot JVM does release memory back to the OS, but does so reluctantly.
You can make it more aggressive by setting -XX:GCTimeRatio=19 -XX:MinHeapFreeRatio=20 -XX:MaxHeapFreeRatio=30, which will allow it to spend more CPU time on collecting and constrain the amount of allocated-but-unused heap memory after a GC cycle.
Additionally, with Java 9 the -XX:-ShrinkHeapInSteps option can be used to apply the shrinking caused by the previous two options more aggressively. Relevant OpenJDK bug.
Do note that shrinking ability and behavior depend on the chosen garbage collector. For example, G1 only gained the ability to yield back unused chunks in the middle of the heap with jdk8u20.
So if heap shrinking is needed, it should be tested for a particular JVM version and GC configuration.
and from "How to free memory in Java?":
To extend upon the answer and comment by Yiannis Xanthopoulos and Hot Licks (sorry, I cannot comment yet!), you can set VM options like this example:
-XX:+UseG1GC -XX:MinHeapFreeRatio=15 -XX:MaxHeapFreeRatio=30
In my JDK 7 this will then release unused VM memory if more than 30% of the heap becomes free after GC when the VM is idle. You will probably need to tune these parameters.
While I didn't see it emphasized in the link below, note that some garbage collectors may not obey these parameters, and by default Java may pick one of these for you should you happen to have more than one core (hence the UseG1GC argument above).
I am going to add -XX:MaxHeapFreeRatio to IJ and report back on whether it helps.
Our application presently runs only on Java 7, so the first approach above is not yet viable, but there is hope since our app is moving to JDK 8 soon.
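To test whether a given JVM version and GC configuration really hands heap back, it helps to watch the committed heap size rather than the used size. A rough sketch using only java.lang.Runtime; the class name HeapWatch, the 64 MB allocation, and the one-second wait are arbitrary choices for illustration:

```java
// Hypothetical test harness, not part of any JDK or IDE.
class HeapWatch {
    // Committed heap: memory the JVM currently holds for the Java heap.
    static long committedBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    // Used heap: committed heap minus the part that is currently free.
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    static void report(String label) {
        System.out.println(label
                + ": used=" + usedBytes() / (1024 * 1024) + " MB"
                + ", committed=" + committedBytes() / (1024 * 1024) + " MB");
    }

    public static void main(String[] args) throws InterruptedException {
        report("start");
        byte[][] garbage = new byte[64][];
        for (int i = 0; i < garbage.length; i++) {
            garbage[i] = new byte[1024 * 1024]; // 1 MB each
        }
        report("after allocating 64 MB");
        garbage = null;   // drop the only reference
        System.gc();      // only a request; the JVM may ignore it
        Thread.sleep(1000);
        report("after GC");
    }
}
```

Run it once with default settings and once with the flags quoted above (for example -XX:MinHeapFreeRatio=20 -XX:MaxHeapFreeRatio=30). If "committed" stays high after the GC while "used" drops, the heap was collected but not shrunk, which is what the OS-level memory figure reflects.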

https://www.jetbrains.com/help/idea/status-bar.html
I used this:
Shows the current heap level and memory usage. Visibility of this section in the Status bar is defined by the Show memory indicator check box in the Appearance page of the Settings/Preferences dialog. It is not shown by default.
Click the memory indicator to run the garbage collector.

The underlying Java virtual machine readily grows its heap but rarely shrinks it. So even if the IDE no longer needs all of that heap after all projects are closed, the memory is still allocated and counted as used by the OS.

Why manual memory management?

Are there any plans for auto memory management?
What are the advantages of manually managing memory... does it conserve memory in the long run?
I have noticed that .NET Windows applications are very sluggish. Is this partly due to the garbage collector not working correctly?

Are there any plans for auto memory management?
On Mac — There's garbage collection already on 10.5.
On iPhone — No (as of 4.0).
What are the advantages of manually managing memory... does it conserve memory in the long run?
See When NOT to use garbage collection?

The main advantage of manual memory management is that you can specialize the memory management for your application, making it optimal and allowing "easy" optimization (for size and speed).
Automatic memory management is helpful when that level of control is not necessary, and even the C++ committee acknowledges that (there are plans to add an optional garbage collector to C++). But sometimes you really need to control what is happening behind the scenes, because you have a broader view of the application than any compiler or garbage collector.
Having the choice between both is certainly very powerful, but it is not available in most languages.

With respect to real-time systems, garbage collection can have negative effects on the responsiveness of the program. In their book Small Memory Software, Weir and Noble discuss some of these issues and you can read about it at the end of this section of their book.
In many cases programmers simply choose to write their own memory management routines that address these issues.

Why is the JVM stack-based and the Dalvik VM register-based?

I'm curious, why did Sun decide to make the JVM stack-based and Google decide to make the DalvikVM register-based?
I suppose the JVM can't really assume that a certain number of registers are available on the target platform, since it is supposed to be platform independent. Therefore it just postpones register allocation etc. to the JIT compiler. (Correct me if I'm wrong.)
So the Android guys thought, "hey, that's inefficient, let's go for a register-based VM right away..."? But wait, there are many different Android devices, so what number of registers did Dalvik target? Are the Dalvik opcodes hardcoded for a certain number of registers?
Do all current Android devices on the market have about the same number of registers? Or, is there a register re-allocation performed during dex-loading? How does all this fit together?

There are a few attributes of a stack-based VM that fit in well with Java's design goals:
A stack-based design makes very few assumptions about the target hardware (registers, CPU features), so it's easy to implement a VM on a wide variety of hardware.
Since the operands for instructions are largely implicit, the object code will tend to be smaller. This is important if you're going to be downloading the code over a slow network link.
Going with a register-based scheme probably means that Dalvik's code generator doesn't have to work as hard to produce performant code. Running on an extremely register-rich or register-poor architecture would probably handicap Dalvik, but that's not the usual target - ARM is a very middle-of-the-road architecture.
I had also forgotten that the initial version of Dalvik didn't include a JIT at all. If you're going to interpret the instructions directly, then a register-based scheme is probably a winner for interpretation performance.
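To make the trade-off concrete, here is a toy sketch of both styles evaluating d = a + b * c. The opcodes and encodings are invented for illustration and are neither real JVM bytecode nor real Dalvik bytecode. The stack version needs more instructions, but each one carries at most one operand; the register version needs fewer instructions, but each one names three registers.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy interpreters; opcodes and layouts are made up for this example.
class VmStyles {
    static final int LOAD = 0, STORE = 1, ADD = 2, MUL = 3;

    // Stack code: {opcode} or {opcode, localSlot}. Returns instructions executed.
    static int runStack(int[][] code, int[] locals) {
        Deque<Integer> stack = new ArrayDeque<>();
        int executed = 0;
        for (int[] ins : code) {
            switch (ins[0]) {
                case LOAD:  stack.push(locals[ins[1]]); break;
                case STORE: locals[ins[1]] = stack.pop(); break;
                case ADD:   stack.push(stack.pop() + stack.pop()); break;
                case MUL:   stack.push(stack.pop() * stack.pop()); break;
                default: throw new IllegalArgumentException("bad opcode " + ins[0]);
            }
            executed++;
        }
        return executed;
    }

    // Register code: {opcode, dst, src1, src2}. Returns instructions executed.
    static int runRegister(int[][] code, int[] regs) {
        int executed = 0;
        for (int[] ins : code) {
            switch (ins[0]) {
                case ADD: regs[ins[1]] = regs[ins[2]] + regs[ins[3]]; break;
                case MUL: regs[ins[1]] = regs[ins[2]] * regs[ins[3]]; break;
                default: throw new IllegalArgumentException("bad opcode " + ins[0]);
            }
            executed++;
        }
        return executed;
    }

    // Slots: a=0, b=1, c=2, d=3, temporary=4.
    static int[][] stackProgram() {
        return new int[][] {
            {LOAD, 0}, {LOAD, 1}, {LOAD, 2}, {MUL}, {ADD}, {STORE, 3}
        };
    }

    static int[][] registerProgram() {
        return new int[][] {
            {MUL, 4, 1, 2},   // t = b * c
            {ADD, 3, 0, 4}    // d = a + t
        };
    }

    public static void main(String[] args) {
        int[] locals = {2, 3, 4, 0, 0};
        int n = runStack(stackProgram(), locals);
        System.out.println("stack:    " + n + " instructions, d=" + locals[3]);

        int[] regs = {2, 3, 4, 0, 0};
        n = runRegister(registerProgram(), regs);
        System.out.println("register: " + n + " instructions, d=" + regs[3]);
    }
}
```

Both versions compute d = 14 for a=2, b=3, c=4, using 6 stack instructions versus 2 register instructions. In an interpreter every instruction costs one dispatch through the switch, which is why fewer, wider instructions can pay off when there is no JIT. The exact ratio here is an artifact of the toy example, not a measurement.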

I can't find a reference, but I think Sun opted for the stack-based bytecode approach because it makes it easy to run the JVM on an architecture with few registers (e.g. IA32).
In Dalvik VM Internals from Google I/O 2008, the Dalvik creator Dan Bornstein gives the following arguments for choosing a register-based VM on slide 35 of the presentation slides:
Register Machine
Why?
avoid instruction dispatch
avoid unnecessary memory access
consume instruction stream efficiently (higher semantic density per instruction)
and on slide 36:
Register Machine
The stats
30% fewer instructions
35% fewer code units
35% more bytes in the instructions stream
but we get to consume two at a time
According to Bornstein this is "a general expectation what you could find when you convert a set of class files to dex files".
The relevant part of the presentation video starts at 25:00.
There is also an insightful paper titled "Virtual Machine Showdown: Stack Versus Registers" by Shi et al. (2005), which explores the differences between stack- and register-based virtual machines.

I don't know why Sun decided to make the JVM stack-based. Erlang's virtual machine, BEAM, is register-based for performance reasons, and Dalvik also seems to be register-based for performance reasons.
From Pro Android 2:
Dalvik uses registers as primarily units of data storage instead of the stack. Google is hoping to accomplish 30 percent fewer instructions as a result.
And regarding the code size:
The Dalvik VM takes the generated Java class files and combines them into one or more Dalvik Executables (.dex) files. It reuses duplicate information from multiple class files, effectively reducing the space requirement (uncompressed) by half from traditional .jar file. For example, the .dex file of the web browser app in Android is about 200k, whereas the equivalent uncompressed .jar version is about 500k. The .dex file of the alarm clock is about 50k, and roughly twice that size in its .jar version.
And as I remember, Computer Architecture: A Quantitative Approach also concludes that a register machine performs better than a stack-based machine.

Points to consider when designing or coding small-footprint deliverables

Please post the points one should keep in mind when designing or coding small-footprint deliverables for embedded systems.
I am not giving compiler or platform details, as I want generic information, but any specific information on Linux-based OSes is also welcome.

Depends on how low you want to get. I'm currently coding for fiscal printers, and there's no OS, and the main rule is no dynamic memory allocation. The funny thing is that I still convinced the crew to code fully modern C++ ;).
Actually there are a few rules we decided upon:
no dynamic allocation
hence, no STL
no exception handling (obvious reasons)

There isn't a general answer, only ones specific to language/platform ... but
Small memory footprint ...
Don't use Java, C#/mono, PHP, Perl, Python or anything with garbage collection
Get as close to the metal as feasible; use C
Do a lot of profiling to see where memory is getting allocated, if you are using dynamic allocation
Ensure you prevent heap fragmentation by allocating sensible chunks and sizes of the heap
Avoid recursive functions, especially those that use malloc(). It is better to allocate a chunk once and pass a pointer around.
Use free() ;)
Ensure your types are no bigger than required
Turn on compiler optimizations
There will be more.

For a really low footprint, consider writing assembly directly.
We all know that Hello World in C or C++ is 20 KB+ (because of all the default libraries that get linked). In assembly this overhead is gone. As pointed out in the comments, one can reduce the standard libraries quite a bit. However, the fact remains that the code density you can get when coding in assembly is much higher than what a compiler will generate from a higher-level language. So for code where every byte matters, use assembly.
Also, when programming devices with less capable processors, assembly language might be your only way to make the program fast enough to run in real time, for instance to control machines.

When faced with such constraints, it is advisable to pre-allocate memory in order to guarantee that the system will work under load. A design pattern such as "object pooling" can be used to share resources within the system.
The C language enables tight resource (i.e. memory & compute cycles) control. It should be strongly considered.
Avoid recursion as it is easy to abuse and can result in stack overflow conditions.
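The object pooling idea can be sketched as follows. This is only an illustration of the pattern, written in Java for brevity; the class name BufferPool and its methods are made up, and in C the same idea would typically be a static array plus an in-use flag per slot. All storage is allocated once at start-up, and acquire never allocates, so memory use under load is known in advance.

```java
// Hypothetical fixed-size pool; not taken from any library.
class BufferPool {
    private final byte[][] buffers;  // all storage allocated up front
    private final boolean[] inUse;   // one flag per buffer

    BufferPool(int count, int size) {
        buffers = new byte[count][size];
        inUse = new boolean[count];
    }

    // Returns a free buffer, or null if the pool is exhausted. Never allocates.
    byte[] acquire() {
        for (int i = 0; i < buffers.length; i++) {
            if (!inUse[i]) {
                inUse[i] = true;
                return buffers[i];
            }
        }
        return null;
    }

    // Hands a buffer back to the pool; arrays from elsewhere are ignored.
    void release(byte[] buf) {
        for (int i = 0; i < buffers.length; i++) {
            if (buffers[i] == buf) {
                inUse[i] = false;
                return;
            }
        }
    }

    public static void main(String[] args) {
        BufferPool pool = new BufferPool(2, 256);
        byte[] a = pool.acquire();
        byte[] b = pool.acquire();
        System.out.println("third acquire is null: " + (pool.acquire() == null));
        pool.release(a);
        System.out.println("buffer reused: " + (pool.acquire() == a));
    }
}
```

The caller has to handle the exhausted case explicitly, which is the point: the failure mode under load is a visible null rather than an allocation failure or fragmentation somewhere deep in the system.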
