Valgrind massif - measure total memory allocations per callstack

I record memory allocations using valgrind massif and use ms_print to create a document of snapshots that shows me which callstack currently holds how much memory, right?
I want to measure which callstacks have allocated the most over the whole program run, which means that deallocated memory should also be taken into account when calculating the weight of a callstack.
Is this possible?
Regards

When a tool (such as memcheck, massif, ...) replaces the memory allocation functions (malloc, free, ...), valgrind provides the option:
  --xtree-memory=none|allocs|full   profile heap memory in an xtree [none]
                                    and produces a report at the end of the execution
                                    none: no profiling, allocs: current allocated
                                    size/blocks, full: profile current and cumulative
                                    allocated size/blocks and freed size/blocks
  --xtree-memory-file=<file>        xtree memory report file [xtmemory.kcg.%p]
So, if you use --xtree-memory=full, you will get a file that you can visualise with kcachegrind. The resulting file details, among other things, what is currently allocated, and what was allocated and then freed.
See http://www.valgrind.org/docs/manual/manual-core.html#manual-core.xtree
for more details.
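As a minimal example (./myprog is a placeholder for your program; the %p in the default report file name expands to the process id):
valgrind --tool=massif --xtree-memory=full ./myprog
kcachegrind xtmemory.kcg.<pid>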

Related

How to get currently allocation counts in Vulkan?

I'm writing a memory manager in my project to manage Vulkan memory allocation. In practice, the allocation count should stay below maxMemoryAllocationCount, so I count all allocations in my app and check on each allocation whether the count would exceed maxMemoryAllocationCount.
However, I think this design has a bug, because other apps could also allocate memory from the same device, so I would need the allocation count as seen by the device, but I didn't find any API for that.
So am I missing something, or is maxMemoryAllocationCount application-local?
other apps could also allocate memory from the same device
No, they cannot.
They can allocate memory from the same physical device. But they cannot allocate memory from the same VkDevice object. Such objects are specific to the process and cannot be shared. The allocations can be shared, but not the devices themselves (note that a shared allocation counts against the limit on all devices that can access it).
The specification is very clear that this is bound to a specific VkDevice:
The maximum number of valid memory allocations that can exist simultaneously within a VkDevice may be restricted by implementation-or-platform-dependent limits. The maxMemoryAllocationCount feature describes the number of allocations that can exist simultaneously before encountering these internal limits.
When the specification says "device", unless it makes it clear otherwise, it means "VkDevice", not "actual GPU".
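A minimal sketch of per-VkDevice allocation tracking along these lines (names such as TrackedAllocate and gAllocationCount are illustrative, not part of the Vulkan API):

#include <vulkan/vulkan.h>
#include <atomic>

// Filled once from VkPhysicalDeviceLimits::maxMemoryAllocationCount.
static uint32_t gMaxAllocations = 0;
// Allocations currently live in this process's VkDevice; the limit is per
// VkDevice, so counting only our own allocations is sufficient.
static std::atomic<uint32_t> gAllocationCount{0};

void InitAllocationLimit(VkPhysicalDevice physicalDevice) {
    VkPhysicalDeviceProperties props;
    vkGetPhysicalDeviceProperties(physicalDevice, &props);
    gMaxAllocations = props.limits.maxMemoryAllocationCount;
}

VkResult TrackedAllocate(VkDevice device, const VkMemoryAllocateInfo* info,
                         VkDeviceMemory* memory) {
    if (gAllocationCount.load() >= gMaxAllocations)
        return VK_ERROR_TOO_MANY_OBJECTS;     // would exceed the device limit
    VkResult res = vkAllocateMemory(device, info, nullptr, memory);
    if (res == VK_SUCCESS)
        ++gAllocationCount;
    return res;
}

void TrackedFree(VkDevice device, VkDeviceMemory memory) {
    vkFreeMemory(device, memory, nullptr);
    --gAllocationCount;
}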

valgrind generated very large xtree

valgrind 3.13 added support for xtree (http://valgrind.org/docs/manual/dist.news.html).
I used it with massif:
valgrind --tool=massif --xtree-memory=full --xtree-memory-file=xtmemory.ms.%p
It then generated a 16G file, which is too large for massif-visualizer to load. What is the best practice for using xtree-memory?
The massif report contains non-detailed snapshots and some detailed snapshots. The detailed snapshots only show the allocated memory. The stack traces that are below the massif threshold are regrouped together (i.e. no detail is given for the stack traces below the threshold).
The xtree requested with --xtree-memory=full contains 6 different pieces of data for each stack trace: the currently allocated bytes/blocks, the total allocated bytes/blocks, and the total freed bytes/blocks. There is no threshold filtering for this xtree report, so if your application has a lot of stack traces that each do only a small proportion of the allocs or frees, the massif xtree report will contain a lot more data than the snapshots.
Instead of using the .ms format for --xtree-memory-file, you might rather use the .kcg format and examine it with kcachegrind: the kcachegrind format is more efficient at storing huge amounts of stack traces.
See http://www.valgrind.org/docs/manual/manual-core.html#manual-core.xtree
for some more background information.
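Concretely (./myprog stands for your program; as noted above, a .kcg file name selects the kcachegrind format, and %p expands to the process id):
valgrind --tool=massif --xtree-memory=full --xtree-memory-file=xtmemory.kcg.%p ./myprog
kcachegrind xtmemory.kcg.<pid>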

How to measure Valgrind's memory usage?

I have an application written in C which uses the zmalloc (borrowed from Redis) memory wrapper to keep track of the total dynamic allocated memory by my program. I am also using Valgrind on Linux to find memory leaks and invalid memory accesses.
The problem is that zmalloc and top show totally different memory usage reports when I am using Valgrind. This makes me think that Valgrind itself is consuming too much memory.
How do I measure Valgrind's memory usage?
valgrind tools such as memcheck or helgrind use a lot of memory for tracking various aspects of your program. So it is normal that top shows a lot more memory than what your program allocates itself.
If you want to have an idea about the memory used by valgrind, you can do:
valgrind --stats=yes ...
The lines following
------ Valgrind's internal memory use stats follow ------
will give some info about valgrind memory usage.
Use valgrind --profile-heap=yes ... to get detailed memory use.
Note that if you do not use the standard malloc library, you might need the option --soname-synonyms=... for tools such as memcheck or helgrind to work properly.
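For example (./myprog is a placeholder; the second line shows the form documented in the valgrind manual for an allocator that lives in the main executable or a statically linked library):
valgrind --stats=yes --profile-heap=yes ./myprog
valgrind --stats=yes --soname-synonyms=somalloc=NONE ./myprog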

Activity monitor - Memory usage when profiling / not profiling

Any idea why my app's memory usage does not increase while using the Instruments profiler (searching for leaks), but does when I'm not using any profiler? To the tune of 1 MB per operation performed. Instruments does not show any leaks.
OS memory management is a complex thing. It is likely that when you free memory it is not returned immediately to the system, but instead it is still "attached" to your process to make any future allocations your application needs more efficient. Although it is recorded as part of your process's memory space, it would be marked as unused, and when the system is running out of memory (or when your application exits), it would then reclaim the unused memory from your application.
If Instruments isn't reporting any leaks, you should be fine.

How to determine the amount of memory used by unmanaged code

I'm working against a large COM library (ArcObjects) and I'm trying to pinpoint a memory leak.
What is the most reliable way to determine the amount of memory used by unmanaged code/objects?
What performance counters can be used?
Use UMDH to get a snapshot of your memory heap; run it twice, then use the tool to show all the allocations that occurred between the two snapshots. This is great for helping you track down which areas might be leaking.
This article explains it in simple terms.
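A typical UMDH session looks roughly like this (MyApp.exe and the PID 1234 are placeholders; gflags and umdh ship with the Debugging Tools for Windows):
gflags /i MyApp.exe +ust
umdh -p:1234 -f:before.log
(exercise the suspected leak in the application)
umdh -p:1234 -f:after.log
umdh before.log after.log -f:diff.log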
I suggest you use a CComPtr<> to wrap your objects, not forgetting that you must release it before passing it into a function that returns a raw pointer reference (as the cast operator will be used to get the pointer, which then gets overwritten).
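A minimal sketch of that pattern (IFoo and GetFoo are hypothetical stand-ins, not ArcObjects types):

#include <atlbase.h>   // CComPtr

// Hypothetical interface used for illustration only.
struct IFoo : public IUnknown {
    virtual HRESULT STDMETHODCALLTYPE DoSomething() = 0;
};

// Hypothetical factory; a real one would create a concrete COM object
// and return a new reference through *out.
HRESULT GetFoo(IFoo** out) { *out = nullptr; return E_NOTIMPL; }

void UseFoo()
{
    CComPtr<IFoo> foo;
    if (SUCCEEDED(GetFoo(&foo)))       // fills in the raw pointer slot
        foo->DoSomething();

    // Reusing the same smart pointer as an out parameter: release it first,
    // otherwise the reference it already holds would be overwritten and
    // leaked (debug builds of ATL assert on this).
    foo.Release();
    if (SUCCEEDED(GetFoo(&foo)))
        foo->DoSomething();
}                                      // remaining reference released here automatically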
The 'Virtual Bytes' counter for a process represents the total amount of memory the process has reserved. If you have a memory leak then this will trend upwards.
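If you want to watch that counter programmatically rather than in Perfmon, a small sketch using the PDH API might look like this (the instance name "MyApp" is a placeholder for the process you are watching):

#include <windows.h>
#include <pdh.h>
#include <stdio.h>
#pragma comment(lib, "pdh.lib")

int main()
{
    PDH_HQUERY query = NULL;
    PDH_HCOUNTER counter = NULL;
    if (PdhOpenQueryW(NULL, 0, &query) != ERROR_SUCCESS)
        return 1;
    // "MyApp" is the perfmon instance name of the process being watched.
    if (PdhAddEnglishCounterW(query, L"\\Process(MyApp)\\Virtual Bytes", 0, &counter) != ERROR_SUCCESS)
        return 1;
    for (int i = 0; i < 60; ++i) {     // sample once a second for a minute
        PdhCollectQueryData(query);
        PDH_FMT_COUNTERVALUE value;
        PdhGetFormattedCounterValue(counter, PDH_FMT_LARGE, NULL, &value);
        wprintf(L"Virtual Bytes: %lld\n", value.largeValue);
        Sleep(1000);
    }
    PdhCloseQuery(query);
    return 0;
}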