I am using WebLogic Server and am trying to get JFR recordings for it. The command-line arguments I use are:
-XX:FlightRecorderOptions=defaultrecording=true,dumponexit=true,dumponexitpath=/my/path,repository=/some/path
There are 2 disadvantages here:
1) A maximum of 3 JFR files is kept, and any data older than that is lost.
2) When there is an OOM, I execute a script that kills the server with signal 11 (SIGSEGV). This does not dump the recording that is currently in progress.
How do I go about getting the data at the time of crash and retain all the JFR data? Space is not an issue here. If I specify maxage=0, then the JFR is never dumped. If I specify maxsize, the files are deleted once the limit is reached.
I assume JDK 7/8, since it is 2018 and you are on WLS, which means recordings can only be dumped from the Java shutdown hook. Try SIGTERM instead of signal 11, since a SIGSEGV kills the process without running the shutdown hook:
kill -15 <pid>
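If you can catch the process before it dies, you can also dump the running recording on demand with jcmd; the pid and file name below are placeholders, and the exact JFR.dump parameters vary a little between JDK 8 and later releases (JFR.check lists the active recordings and their names):
$ jcmd <pid> JFR.check
$ jcmd <pid> JFR.dump name=<recording name> filename=/my/path/ondemand.jfr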
In JDK 9 and later, an emergency dump can also be written (in native code) if the JVM crashes. The file is placed in the directory where the Java process was started and is called hs_err_pidXXX.jfr.
JDK 10 added support for Old Object Sample events, which can be used to diagnose memory leaks. If the application exits due to an OutOfMemoryError, it will write an Old Object Sample event with paths to GC roots (regardless of whether you have enabled the event or not). This should provide enough information to track down the memory leak.
JDK 11.0.3 or later contains a command-line tool, jfr, which can be used to print the contents of a recording file:
$ jfr print --events OldObjectSample hs_oom_pidXXX.jfr
By looking at the allocationTime you can see when objects were allocated. Memory leaks are typically allocated throughout the lifetime of the application, so if you ignore the early samples (static objects) and the late samples (short-lived objects), you are likely to find a leaking object and its path to the GC root. Just follow the reference chain until you find a reference that should not be there.
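If you would rather inspect the samples programmatically than eyeball the jfr print output, here is a minimal sketch using the jdk.jfr.consumer API (assuming JDK 9 or later; the file name is just a placeholder for the emergency dump):

import java.nio.file.Path;
import java.nio.file.Paths;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

public class PrintOldObjectSamples {
    public static void main(String[] args) throws Exception {
        // Placeholder file name; point this at the actual hs_oom_pidXXX.jfr / hs_err_pidXXX.jfr file.
        Path file = Paths.get(args.length > 0 ? args[0] : "hs_oom_pid1234.jfr");
        try (RecordingFile recording = new RecordingFile(file)) {
            while (recording.hasMoreEvents()) {
                RecordedEvent event = recording.readEvent();
                if ("jdk.OldObjectSample".equals(event.getEventType().getName())) {
                    // allocationTime shows when the sampled object was allocated;
                    // the object field carries the reference chain back to the GC root.
                    System.out.println(event.getInstant("allocationTime") + " " + event.getValue("object"));
                }
            }
        }
    }
}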
Our client uses automation software called ActiveBatch (by Advanced Systems Concepts, Inc.). They are currently using ActiveBatch v8 and are now in the process of migrating the automated jobs to the newer ActiveBatch v11.
Most of the jobs have no problems coping with the newer software and are running OK as of this writing. However, there is one job that is unable to run, or rather to initialize in the first place. This job runs OK on v8. Whenever it is run on v11, it produces this error message:
%ABAT-W-CREPRCERR, error creating batch process for job %1
Quite self-explanatory; it means the process for this particular job was not created. The user manual states that the job's log file might explain why the error occurred. The problem is, the log file is not very helpful, as it only shows the characters below:
ï»¿
Further reading states that this is the byte order mark (BOM) for UTF-8. I don't know much about this stuff, but since the log file only contains those characters, I'm not sure they're helpful at all.
Another thing: if I run the job manually (running the EXE via Windows Explorer), no problems are encountered and it succeeds. The job, by the way, is a PowerBuilder 9 application.
I configured Debug Diag on production, where I set a crash rule for a specific app pool with the action type "Log Stack Trace". The problem is that it's generating dump files that are very large, approximately 700 MB each. I'm not sure why these files are so large. Is there a way to truncate them?
When you use the "Log Stack Trace" option, the call stack for the exception is logged to a text file (not a dump file) that Debug Diagnostic generates for the process to which it is attached. I am assuming that the dump is being generated because your process is crashing with a second-chance exception (that is, if you didn't change anything else in the default crash rule).
If you look at the name of the dump file, you should be able to identify the exact condition under which the dump was generated.
I tried to search for this but couldn't find exactly what I am looking for, so could someone please explain IDEA 14's "capture memory snapshot" feature?
It was added in version 14 to make it more convenient to report memory problems.
Snippet from How to report IntelliJ IDEA performance problems:
In case of memory related issues (memory usage goes high, garbage is not collected, etc) please use the Memory snapshot button in the menu near the CPU snapshot button. If it's not possible to get the snapshot because of the application crashing with OutOfMemory errors, please add the
-XX:+HeapDumpOnOutOfMemoryError
option to the IntelliJ IDEA JVM options. On the next OOM error the .hprof dump will be produced and saved by the JVM (usually in the application working directory, which is IDEA_HOME\bin).
Upload this dump to our FTP as described above in the CPU snapshot section.
Please note that the memory snapshot may contain sensitive source code from your project.
If you are uploading to a public service, use some password protection or encryption. The JetBrains FTP server is write-only and you don't need to protect files uploaded there.
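To illustrate the option mentioned in the quote, a minimal sketch of the relevant lines in the IDEA VM options file (on Windows typically IDEA_HOME\bin\idea64.exe.vmoptions; the dump path below is just an example and must point to a writable directory):
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=C:\dumps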
Additional link:
Reporting performance problems
Does anyone know what APIs Apple is using for its Get Info panel to determine free space in Lion? All of the code I have tried fails to report the same available space that Apple is reporting; even Quick Look isn't displaying the same space that Get Info shows. This seems to happen after I delete a bunch of files and then attempt to read the available space.
When I use NSFileManager -> NSFileSystemFreeSize I get 42918273024 bytes
When I use NSURL -> NSURLVolumeAvailableCapacityKey I get 42918273024 bytes
When I use statfs -> buffer.f_bsize * buffer.f_bfree I get 43180417024 bytes
statfs gets similar results to Quick Look, but how do I match Get Info?
You are probably seeing the effect of local Time Machine snapshot backups. The following quotes are from the Apple Support article OS X Lion: About Time Machine's "local snapshots" on portable Macs:
Time Machine in OS X Lion includes a new feature called "local snapshots" that keeps copies of files you create, modify or delete on your internal disk. Local snapshots complement regular Time Machine backups (that are stored on your external disk or Time Capsule), giving you a "safety net" for times when you might be away from your external backup disk or Time Capsule and accidentally delete a file.
The article finishes by saying:
Note: You may notice a difference in available space statistics between Disk Utility, Finder, and Get Info inspectors. This is expected and can be safely ignored. The Finder displays the available space on the disk without accounting for the local snapshots, because local snapshots will surrender their disk space if needed.
It looks like all the programmatic methods of measuring available disk space that you have tried give the true free space value on the disk, not the space that can be made available by removing local Time Machine backups. I doubt command line tools like df have been made aware of local Time Machine backups either.
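If you have a test machine handy, one rough way to confirm this is to temporarily turn local snapshots off and check whether the numbers converge; the tmutil verbs below apply to OS X 10.7 through 10.12 and require root:
$ sudo tmutil disablelocal
$ sudo tmutil enablelocal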
This is a bit of a workaround, not a real API, but the good old Unix command df -H will get you the same information as the Get Info panel; you just need to select the line for your disk and parse the output.
The df program has many other options that you might want to explore. In this particular case the -H switch tells the program to spit out the numbers in human readable format and to use base 10 sizes.
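For example, assuming the standard macOS column order (Filesystem, Size, Used, Avail, ...), something like this pulls out just the available-space figure for the root volume:
$ df -H / | awk 'NR==2 {print $4}'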
Take a look here on how to run command lines from within an app and get the output inside your program: Execute a terminal command from a Cocoa app
I believe the underpinnings of both df and the Get Info panel are very likely the same thing.
Background
We have a .NET WinForms application written in C# that interfaces to a handheld store scanner via a console application. The console application is written in good ol' VB6-- no managed code there. The VB6 application consists of several COM objects.
The .NET WinForms application refreshes the data in the scanner by invoking the console application with the right parameters. When the console application starts, it pops up a modal form reminding the user to place the handheld device into its cradle.
Problem
A customer has a bizarre situation in which the call to start the console application appears to hang before it displays the reminder form. If the user presses any key-- even something innocent like Shift or Alt-- the application unfreezes, and the reminder form appears. While it is hung, the CPU usage of the console application is very high.
We have obtained a memory dump from the command line application using ProcDump. I have some experience debugging managed dump files, but this VB 6 dump is strange to me.
We captured several full memory dumps in a row. In some of them, there appear to be COM glue stacks. For example, several dump files show a call stack like this:
msvbm60!BASIC_DISPINTERFACE_GetTICount
msvbm60!_vbaStrToAnsi
msvbm60!IIDIVbaHost
msvbm60!rtcDoEvents
msvbm60!IIDIVbaHost
msvbm60!BASICCLASS_QueryInterface
[our code which I think is trying to create and invoke a COM object]
It doesn't help that the only symbols I have are from our code. The Microsoft symbol server does not have a PDB file for msvbm60.dll (or at least not from their version which is 6.0.98.2).
Questions
I suspect there may be some COM threading issue that happens only on their system.
1) How can I determine the thread state of each thread in a dump file? If this were a managed dump file, I would look at !threads and then !threadstate to figure out the thread states. There is no managed code, so I can't use sos.dll. I didn't see any hints using ~ and !teb.
2) Is there a way to see what COM objects have been created in a dump file? Again, in a managed dump, I can do a !dumpheap to get a list of managed objects. Is there something similar I can find for COM objects?
3) Can I determine the threading model of COM objects in the dump file?
You can dump the thread state by using this command:
~*
This will not display 'background' as a state; you will only see running, frozen, or suspended.
I'm not sure how you can get information from COM objects; I have never tried, but I will investigate and get back to you. Regarding the threading model, it will be difficult to infer that without painful monitoring of the application state while stepping through. Even then, when you step through, all other threads will run unless you use .bpsync 1, which syncs all threads to the current one, but that could cause a hang (e.g. the GUI thread has now been told to freeze). So I think it will be difficult unless you have access to the source code.
I can only answer question 1. Use !runaway to find the thread or threads consuming the CPU. To get all thread stacks use ~*kb1000.
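For reference, a typical first pass over a hung or high-CPU dump with those commands might look like this (the 7 flag asks !runaway for user, kernel, and elapsed time; the frame count is arbitrary):
!runaway 7       $$ which threads have burned the most CPU and elapsed time
~                $$ list all threads with their suspend/freeze status
~* kb 100        $$ stack backtrace, with the first three parameters, for every thread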