Is there any advantage in setting Xms and Xmx to the same value? - jvm

Usually I set -Xms512m and -Xmx1g so that when JVM starts it allocates 512MB and gradually increases heap to 1GB as necessary. But I see these values set to same say 1g in a dedicated server instance. Is there any advantage for the having both set to the same value?

Well there are couple of things.
Program will start with -Xms value and if the value is lesser it will eventually force GC to occur more frequently
Once the program reaches -Xms heap, jvm request OS for additional memory and eventually grabs -Xmx that requires additional time leading to performance issue, you might as well set it to that at the beginning avoiding jvm to request additional memory.
It is very nicely answered here - https://developer.jboss.org/thread/149559?_sscc=t

From Oracle Java SE 8 docs:
https://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/sizing.html
By default, the virtual machine grows or shrinks the heap at each
collection to try to keep the proportion of free space to live objects
at each collection within a specific range. This target range is set
as a percentage by the parameters -XX:MinHeapFreeRatio=<minimum> and
-XX:MaxHeapFreeRatio=<maximum>, and the total size is bounded below by -Xms<min> and above by -Xmx<max>. Setting -Xms and -Xmx to the same value increases predictability by removing the most important sizing
decision from the virtual machine. However, the virtual machine is
then unable to compensate if you make a poor choice.
if the value of -Xms and -Xmx is same JVM will not have to adjust the heap size and that means less work by JVM and more time to your application. but if the chosen value is a poor choice for -Xms then some of the memory allocated will never be used because the heap will never shrink and if it is a poor choice for -Xmx you will get OutOfMemoryError.

AFAIK One more reason, is that expansion of heap is a stop-the-world event; setting those to the same value will prevent that.

There are some advantages.
if you know the size is going to grow to the maximum, e.g. in a benchmark, you may as well start with the size you know you need.
you can get better performance giving the program more memory that it might naturally give itself. YMWV
In general, I would make the Xms a value I am confident it will use, and the double this for head room for future use cases or situations we haven't tested for. i.e. a size we don't expect but it might use.
In short, the maximum is the point you would rather the program fail than use any more.

Application will suffer frequent GC with lower -Xms value.
Every time asking for more memory from OS with consume time.
Above all, if your application is performance critical then you would certainly want to avoid memory pages swapping out to/from disk as this will cause GC consuming more time. To avoid this, memory can be locked. But if Xms and Xmx are not same then memory allocated after initial allocation will not be locked.

Related

JVM Option MaxMetaspaceSize - What is Optimal value to set?

I have application, have set around 8GB of Max heap memory, it does lots of message processing. when I am taking jmap -heap for my application, it shows me very huge MaxMetaspaceSize memory, something like below,
MaxMetaspaceSize = 17592186044415 MB
If I have to put the limit on MaxMetaspaceSize, what could be optimal value for that?
Note* - I have read on pros and cons of putting limit of MaxMetaspaceSize
Measure how much metaspace your program needs once it has settled into a steady state. Add some safety margin, then configure that as your max.
Should your program change behavior, e.g. load additional classes without unloading old ones you'll be notified about that through OOMEs.
it shows me very huge MaxMetaspaceSize memory
Note that that is just the upper limit enforced by the JVM, not the actual size.

Configuring Lucene Index writer, controlling the segment formation (setRAMBufferSizeMB)

How to set the parameter - setRAMBufferSizeMB? Is depending on the RAM size of the Machine? Or Size of Data that needs to be Indexed? Or any other parameter? could someone please suggest an approach for deciding the value of setRAMBufferSizeMB.
So, what we have about this parameter in Lucene javadoc:
Determines the amount of RAM that may be used for buffering added
documents and deletions before they are flushed to the Directory.
Generally for faster indexing performance it's best to flush by RAM
usage instead of document count and use as large a RAM buffer as you
can. When this is set, the writer will flush whenever buffered
documents and deletions use this much RAM.
The maximum RAM limit is inherently determined by the JVMs available
memory. Yet, an IndexWriter session can consume a significantly larger
amount of memory than the given RAM limit since this limit is just an
indicator when to flush memory resident documents to the Directory.
Flushes are likely happen concurrently while other threads adding
documents to the writer. For application stability the available
memory in the JVM should be significantly larger than the RAM buffer
used for indexing.
By default, Lucene uses 16 Mb as this parameter (this is the indication to me, that you shouldn't have that much big parameter to have fine indexing speed). I would recommend you to tune this parameter by setting it let's say to 500 Mb and checking how well your system behave. If you will have crashes, you could try some smaller value like 200 Mb, etc. until your system will be stable.
Yes, as it stated in the javadoc, this parameter depends on the JVM heap, but for Python, I think it could allocate memory without any limit.

How to properly assign huge heap space for JVM

Im trying to work around an issue which has been bugging me for a while. In a nutshell: on which basis should one assign a max heap space for resource-hogging application and is there a downside for tit being too large?
I have an application used to visualize huge medical datas, which can eat up to several gigabytes of memory if several imaging volumes are opened size by side. Caching the data to be viewed is essential for fluent workflow. The software is supported with windows workstations and is started with a bootloader, which assigns the heap size and launches the main application. The actual memory needed by main application is directly proportional to the data being viewed and cannot be determined by the bootloader, because it would require reading the data, which would, ultimately, consume too much time.
So, to ensure that the JVM has enough memory during launch we set up xmx as large as we dare based, by current design, on the max physical memory of the workstation. However, is there any downside to this? I've read (from a post from 2008) that it is possible for native processes to hog up excess heap space, which can lead to memory errors during runtime. Should I maybe also sniff for free virtualmemory or paging file size prior to assigning heap space? How would you deal with this situation?
Oh, and this is my first post to these forums. Nice to meet you all and be gentle! :)
Update:
Thanks for all the answers. I'm not sure if I put my words right, but my problem rose from the fact that I have zero knowledge of the hardware this software will be run on but would, nevertheless, like to assign as much heap space for the software as possible.
I came to a solution of assigning a heap of 70% of physical memory IF there is sufficient amount of virtual memory available - less otherwise.
You can have heap sizes of around 28 GB with little impact on performance esp if you have large objects. (lots of small objects can impact GC pause times)
Heap sizes of 100 GB are possible but have down sides, mostly because they can have high pause times. If you use Azul Zing, it can handle much larger heap sizes significantly more gracefully.
The main limitation is the size of your memory. If you heap exceeds that, your application and your computer will run very slower/be unusable.
A standard way around these issues with mapping software (which has to be able to map the whole world for example) is it break your images into tiles. This way you only display the image which is one the screen (or portions which are on the screen) If you need to be able to zoom in and out you might need to store data at two to four levels of scale. Using this approach you can view a map of the whole world on your phone.
Best to not set JVM max memory to greater than 60-70% of workstation memory, in some cases even lower, for two main reasons. First, what the JVM consumes on the physical machine can be 20% or more greater than heap, due to GC mechanics. Second, the representation of a particular data entity in the JVM heap may not be the only physical copy of that entity in the machine's RAM, as the OS has caches and buffers and so forth around the various IO devices from which it grabs these objects.

Estimating available RAM left with safety margin in C (STM32F4)

I am currently developing application for STM32F407 using STM32CubeMx and Keil uVision. I know that dynamic memory allocation in embedded systems is mostly discouraged, but from spot to spot on internet I can find some arguments in favor of it.
Due to my inventors soul I wanted to try to do it, but do it safely. Let's assume I'm creating a dynamically allocated fifo for incoming UART messages, holding structs composed of the msg itself and its' length. However I wouldn't like to consume all the heap size doing so, therefore I want to check how much of it I have left: Me new (?) idea is to try temporarily allocating some big chunk of memory (say 100 char) - if it's successful, I accept the incoming msg, if not - it means that I'm running out of heap and ignore the msg (or accept it and dequeue the oldest). After checking I of course free the temp memory.
A few questions arise in my mind:
First of all, does it make sens at all? Do you think, basic on your experience, that it could be usefull and safe?
I couldn't find precise info about what exactly shares RAM in ES (I know about heap, stack and volatile vars) so my question is: providing that answer to 1. isn't "hell no go home", what size of the temp memory checker would you pick for the mentioned controller?
About the micro itself - it has 192kB RAM, however in the Drivers\CMSIS\Device\ST\STM32F4xx\Source\Templates\arm\startup_stm32f407xx.s file only 512B+1024B are allocated for heap and stack - isn't that very little, leaving the whooping, remaining 190kB for volatile vars? Would augmenting the heap size to, say 50kB be sensible? If yes, do I do it directly in this file or it's a better practice to do it somewhere else?
Probably for some of you "safe dynamic memory" and "embedded" in one post is both schocking and dazzling, but keep in mind that this is experimenting and exploring new horizons :) Thanks and greetings.
Keil uVision describes only the IDE. If you are using KEil MDK-ARM which implies ARM's RealView compiler then you can get accurate heap information using the __heapstats() function.
__heapstats() is a little strange in that rather than simply returning a value it outputs heap information to a formatted output stream facilitated by a function pointer and file descriptor passed to it. The output function must have an fprintf() like interface. You can use fprintf() of course, but that requires that you have correctly retargetted the stdio
For example the following:
typedef int (*__heapprt)(void *, char const *, ...);
__heapstats( (__heapprt)fprintf, stdout ) ;
outputs for example:
4180 bytes in 1 free blocks (avge size 4180)
1 blocks 2^11+1 to 2^12
Unfortunately that does not really achieve what you need since it outputs text. You could however implement your own function to capture the data in memory and parse the result. You may only need to capture the first decimal digit characters and discard anything else, except that the amount of free memory and the largest allocatable block are not necessarily the same thing of course. Fragmentation is indicated by the number or free blocks and their average size. You can perhaps guarantee to be able to allocate at least an average sized block.
The issue with dynamic allocation in embedded systems are to do with handling memory exhaustion and, in real-time systems, the non-deterministic timing of both allocation and deallocation using the default malloc/free implementations. In your case you might be better off using a fixed-block allocator. You can implement such an allocator by creating a static array of memory blocks (or by dynamically allocating them from the heap at start-up), and placing a pointer to each block on a queue or linked list or stack structure. To allocate you simply remove a pointer from the queue/list/stack, and to free you place a pointer back. When the available blocks structure is empty, memory is exhausted. It is entirely deterministic, and because it is your implementation can be easily monitored for performance and capacity.
With respect to question 3. You are expected to adjust the heap and system stack size to suit your application. Most tools I have used have a linker script that automatically allocates all available memory not statically allocated, allocated to a stack or reserved for other purposes to the heap. However MDK-ARM does not do that in the default linker scripts but rather allocates a fixed size heap.
You can use the linker map file summary to determine how much space is unused and manually expand the heap. I usually do that leaving a small amount of unused space to account for maintenance when the amount of statically allocated data may increase. At some point however; you end up running out of memory, and the arcane error messages from the linker may not make it obvious that your heap is just too big. It is possible to override the default linker script and provide your own, and no doubt possible then to automatically size the heap - though I have never taken the trouble to try it.
Okay I have tested my idea with dynamic heap free space checking and it worked well (although I didn't perform long-run tests), however #Clifford answer and this article convinced me to abandon the idea of dynamic allocation. Eventually I implemented my own, static heap with pages (2d array), occupied pages indicator (0-1 array of size of number of pages) and fifo of structs consisting of pointer to the msg on my static heap (actually just the index of the array) and length of message (to determine how many contiguous pages it occupies). 95% of msg I receive should take up only one page, 5% - 2 or 3 pages, so fragmentation is still possible, but at least I keep a tight rein on it and it affects only the part of memory assigned to this module of the code (in other words: the fragmentation doesn't leak to other parts of the code). So far it has worked without any problems and for sure is faster because the lookup time is O(n*m), n - number of pages, m - the longest page possible, but taking into consideration the laws of probability it goes down to O(n). Moreover n is always a lot smaller the number of all allocation units in memory, so way less to look for.

How can I change maximum available heap size for a task in FreeRTOS?

I'm creating a list of elements inside a task in the following way:
l = (dllist*)pvPortMalloc(sizeof(dllist));
dllist is 32 byte big.
My embedded system has 60kB SRAM so I expected my 200 element list can be handled easily by the system. I found out that after allocating space for 8 elements the system is crashing on the 9th malloc function call (256byte+).
If possible, where can I change the heap size inside freeRTOS?
Can I somehow request the current status of heap size?
I couldn't find this information in the documentation so I hope somebody can provide some insight in this matter.
Thanks in advance!
(Yes - FreeRTOS pvPortMalloc() returns void*.)
If you have 60K of SRAM, and configTOTAL_HEAP_SIZE is large, then it is unlikely you are going to run out of heap after allocating 256 bytes unless you had hardly any heap remaining before hand. Many FreeRTOS demos will just keep creating objects until all the heap is used, so if your application is based on one of those, then you would be low on heap before your code executed. You may have also done something like use up loads of heap space by creating tasks with huge stacks.
heap_4 and heap_5 will combine adjacent blocks, which will minimise fragmentation as far as practical, but I don't think that will be your problem - especially as you don't mention freeing anything anywhere.
Unless you are using heap_3.c (which just makes the standard C library malloc and free thread safe) you can call xPortGetFreeHeapSize() to see how much free heap you have. You may also have xPortGetMinimumEverFreeHeapSize() available to query how close you have ever come to running out of heap. More information: http://www.freertos.org/a00111.html
You could also define a malloc() failed hook (http://www.freertos.org/a00016.html) to get instant notification of pvPortMalloc() returning NULL.
For the standard allocators you will find a config option in FreeRTOSConfig.h .
However:
It is very well possible you run out of memory already, depending on the allocator used. IIRC there is one that does not free() any blocks (free() is just a dummy). So any block returned will be lost. This is still useful if you only allocate memory e.g. at startup, but then work with what you've got.
Other allocators might just not merge adjacent blocks once returned, increasing fragmentation much faster than a full-grown allocator.
Also, you might loose memory to fragmentation. Depending on your alloc/free pattern, you quickly might end up with a heap looking like swiss cheese: Many holes between allocated blocks. So while there is still enough free memory, no single block is big enough for the size required.
If you only allocate blocks that size there, you might be better of using your own allocator or a pool (blocks of fixed size). Thaqt would be statically allocated (e.g. array) and chained as a linked list during startup. Alloc/free would then just be push/pop on a stack (or put/get on a queue). That would also be very fast and have complexity O(1) (interrupt-safe if properly written).
Note that normal malloc()/free() are not interrupt-safe.
Finally: Do not cast void *. (Well, that's actually what standard malloc() returns and I expect that FreeRTOS-variant does the same).