Flink standalone cluster: SIGSEGV crashing TaskManager JVM

We have a simple standalone session cluster (20 TaskManagers) running several Flink streaming jobs.
Periodically (perhaps a couple of times a month) one of our TaskManagers dies with a SIGSEGV error:
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f8302eab031, pid=3947, tid=0x00007f82876f6700
#
# JRE version: OpenJDK Runtime Environment (8.0_272-b10) (build 1.8.0_272-8u272-b10-0+deb9u1-b10)
# Java VM: OpenJDK 64-Bit Server VM (25.272-b10 mixed mode linux-amd64 )
# Problematic frame:
# V [libjvm.so+0x5b6031]
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# //hs_err_pid3947.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
#
hs_err_pid content is here
As I understand it, the problem is somewhere in native code, and I suspect it could be a RocksDB error (we use RocksDB as the state backend in our jobs).
All the information I can find online about similar errors is pretty old, e.g. https://issues.apache.org/jira/browse/FLINK-8309
We use Flink 1.10.0.
I would be glad of any help or advice; I have no idea how to localize the problem.
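One possible first step in localizing a crash like this is to read the "Problematic frame" line of the report. In hs_err logs the one-letter prefix follows the legend printed under "Native frames": J=compiled Java code, j=interpreted, Vv=VM code, C=native code. The frame above is marked V, so the faulting instruction was inside libjvm.so itself rather than directly inside a JNI library; that alone neither confirms nor rules out RocksDB. The sketch below is a small, hypothetical helper (ProblematicFrame and its method names are assumptions for illustration, not a Flink or JDK API) that classifies such a line. It only handles the bracketed library+offset form shown in these reports.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Flink or the JDK: it classifies the
// "Problematic frame" line of a HotSpot fatal error report.
final class ProblematicFrame {
    // Matches lines such as "# V [libjvm.so+0x5b6031]" or
    // "# C [CoreFoundation+0x13f43b] _CFRelease+0x434".
    private static final Pattern FRAME = Pattern.compile(
        "^#?\\s*([JjVvC])\\s+\\[(.+)\\+(0x[0-9a-fA-F]+)\\]\\s*(.*)$");

    // Maps the one-letter prefix to the meaning given in the hs_err legend.
    static String kind(char letter) {
        switch (letter) {
            case 'J': return "compiled Java code";
            case 'j': return "interpreted Java code";
            case 'V':
            case 'v': return "VM code";
            case 'C': return "native code";
            default:  return "unknown";
        }
    }

    // Returns a one-line summary, or "unrecognized frame" if the line
    // does not look like a bracketed problematic-frame entry.
    static String describe(String line) {
        Matcher m = FRAME.matcher(line.trim());
        if (!m.matches()) {
            return "unrecognized frame";
        }
        String summary = kind(m.group(1).charAt(0)) + " in " + m.group(2)
            + " at offset " + m.group(3);
        String symbol = m.group(4).trim();
        return symbol.isEmpty() ? summary : summary + " (" + symbol + ")";
    }

    public static void main(String[] args) {
        System.out.println(describe("# V [libjvm.so+0x5b6031]"));
        System.out.println(describe("# C [CoreFoundation+0x13f43b] _CFRelease+0x434"));
    }
}
```

For the two sample lines, main prints "VM code in libjvm.so at offset 0x5b6031" and "native code in CoreFoundation at offset 0x13f43b (_CFRelease+0x434)". Collecting this summary across several hs_err files may show whether the crashes share a frame.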

Related

How to close a Karate.robot session running on a Windows machine, getting some thread error

I have a question regarding karate.robot: is there any method or function to shut down, close, or quit a Karate.robot session, like driver.quit or close?
It seems some threads are occupied; I am getting the following error many times:
#
# A fatal error has been detected by the Java Runtime Environment:
#
# EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x0000000065a03e06, pid=11236, tid=8836
#
# JRE version: Java(TM) SE Runtime Environment (8.0_25-b18) (build 1.8.0_25-b18)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.25-b02 mixed mode windows-amd64 compressed oops)
# Problematic frame:
# V [jvm.dll+0x4c3e06]
#
# Failed to write core dump. Minidumps are not enabled by default on client versions of Windows
#
[thread 6836 also had an error]
[.error occurred during error reporting , id 0xc0000005]
#
# If you would like to submit a bug report, please visit:
# http://bugreport.sun.com/bugreport/crash.jsp
#
Dll Process Attached
Loading jawt.dll
Dll Process Detach
Process finished with exit code 1
No, we don't see a need. You typically start the Robot instance and it stays up until the end of your entire suite. Maybe you should try installing the 64-bit or 32-bit JDK.
You are welcome to contribute code to improve anything if required. So far no one has reported any problems like this. Maybe you are trying to do things in parallel threads, which is not supported. Provide a way to replicate if you can: https://github.com/intuit/karate/wiki/How-to-Submit-an-Issue
EDIT - one area you can help us investigate is whether we need to do more to release JNA resources after a Scenario in this method.
Also see this answer: Java JNA: JRE crashes after application completes

Java 7 supported Application crashes on Mojave

My application is built on:
jdk1.7.0_76
JavaFx2.2.76_b13
NetBeans IDE
It ran successfully on every release up to macOS High Sierra.
When I try to run this application on Mojave using NetBeans, it crashes with the following error:
Launching <fx:deploy> task from /Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/../lib/ant-javafx.jar
jfx-deployment-script:
jfx-deployment:
jar:
objc[8382]: Class JavaLaunchHelper is implemented in both /Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/bin/java (0x1018244c0) and /Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/jli/./libjli.dylib (0x10b4f3480). One of the two will be used. Which one is undefined.
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGILL (0x4) at pc=0x00007fff4200543b, pid=8382, tid=775
#
# JRE version: Java(TM) SE Runtime Environment (7.0_80-b15) (build 1.7.0_80-b15)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode bsd-amd64 compressed oops)
# Problematic frame:
# C [CoreFoundation+0x13f43b] _CFRelease+0x434
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /Users/rahulsharma/NetBeansProjects/CreatFXMLTst/hs_err_pid8382.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
Java Result: 134
debug:
jfxsa-debug:
BUILD SUCCESSFUL (total time: 17 seconds)
We're seeing the exact same crash in Firefox (illegal instruction at that address); it's probably an issue in CoreFoundation:
Firefox crashes @ CoreFoundation+0x13f43b

SIGSEGV - Fatal Error in JavaFX Application - libjvm.so [duplicate]

#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007ff17a60c678, pid=4219, tid=140673779791616
#
# JRE version: Java(TM) SE Runtime Environment (8.0-b124) (build 1.8.0-ea-b124)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.0-b66 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V [libjvm.so+0x665678] jni_invoke_nonstatic(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*)+0x38
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /media/data/K's World/javaFX/ChatApp/hs_err_pid4219.log
Compiled method (c1) 16675 988 3 java.util.concurrent.atomic.AtomicBoolean::set (14 bytes)
total in heap [0x00007ff16535ef50,0x00007ff16535f2a0] = 848
relocation [0x00007ff16535f070,0x00007ff16535f0a0] = 48
main code [0x00007ff16535f0a0,0x00007ff16535f1c0] = 288
stub code [0x00007ff16535f1c0,0x00007ff16535f250] = 144
metadata [0x00007ff16535f250,0x00007ff16535f258] = 8
scopes data [0x00007ff16535f258,0x00007ff16535f268] = 16
scopes pcs [0x00007ff16535f268,0x00007ff16535f298] = 48
dependencies [0x00007ff16535f298,0x00007ff16535f2a0] = 8
#
# If you would like to submit a bug report, please visit:
# http://bugreport.sun.com/bugreport/crash.jsp
#
I am writing a chat app in JavaFX, and I am using the Eclipse IDE.
My application was running well, but it suddenly stopped and I don't know why.
It sounds like you're running JavaFX with Java 8 on Linux, and you've run into this bug:
https://bugs.openjdk.java.net/browse/JDK-8141687
App crashes while starting Main.class in JavaFx
java version "1.8.0_60"
Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)
ADDITIONAL OS VERSION INFORMATION : Mint17.2 Cinnamon 64Bit
SUGGESTION: Try a different version of Java/JavaFX.
Run sudo update-alternatives --config java to see what alternatives are already present on your system. I would downgrade to Java 1.7 if possible.
https://askubuntu.com/questions/272187/setting-jdk-7-as-default
If there are no suitable candidates, use apt-get install openjdk-7-jdk:
https://www.digitalocean.com/community/tutorials/how-to-install-java-on-ubuntu-with-apt-get
I had the same issue (except that it was java-8-oracle build 101) and found out why it was happening:
I have a login screen that appears before my main application. That screen is closed after login and, apparently, closing it (or even hiding it) and then showing a new window makes the application crash.

JVM Crash Problematic Frame: Canonicalizer::do_If

I am consistently facing a JVM crash when enabling hot deploy, using the following Java options at startup:
JAVA_OPTS -Xmx4096m -XX:MetaspaceSize=512m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=crash -XX:ThreadStackSize=512 -XX:+UseConcMarkSweepGC -XX:ParallelGCThreads=5 -XX:NewRatio=2 -XX:+UnlockDiagnosticVMOptions -XX:-UseLoopPredicate -Xdebug -Xrunjdwp:transport=dt_socket,address=$DEBUG_PORT,server=y,suspend=n -XX:NewRatio=2 -Dspringloaded.synchronize=true
JAVA_OPTS=`echo $JAVA_OPTS -Dspringloaded.synchronize=true -javaagent:springloaded-1.2.1.jar -noverify`
Environment: JDK 1.8 U 66, RHEL 6.7
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007faee9a1e27c, pid=27208, tid=140379827795712
#
# JRE version: Java(TM) SE Runtime Environment (8.0_66-b17) (build 1.8.0_66-b17)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.66-b17 mixed mode linux-amd64 )
# Problematic frame:
# V [libjvm.so+0x35027c] Canonicalizer::do_If(If*)+0x1c
#
# Core dump written. Default location: core.27208
#
# An error report file with more information is saved as:
# hs_err_pid27208.log
# [ timer expired, abort... ]
I've noticed both -javaagent and -noverify in the Java options list.
It looks like the springloaded agent generates invalid bytecode while bytecode verification is explicitly turned off. Unsurprisingly, this may lead to unpredictable results, including a JVM crash.
This is not a JVM problem, but most likely a bug in the springloaded agent. Try removing the -noverify option.
-XX:-TieredCompilation may also work around this particular problem, but don't expect the application to work correctly if the bytecode fails verification. It's better to stay away from buggy agent libraries.
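To catch this combination before it causes a crash, a startup check along the following lines could help. It is only a sketch: VerifierCheck and its methods are hypothetical names, and the only JDK API used is RuntimeMXBean.getInputArguments(), which returns the arguments the running JVM was started with. It looks for an agent option together with either spelling of the verification switch.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical startup check, not part of springloaded or the JDK.
final class VerifierCheck {
    // True if any argument disables bytecode verification.
    static boolean verificationDisabled(List<String> jvmArgs) {
        return jvmArgs.contains("-noverify") || jvmArgs.contains("-Xverify:none");
    }

    // True if any argument attaches a Java agent.
    static boolean hasJavaAgent(List<String> jvmArgs) {
        for (String arg : jvmArgs) {
            if (arg.startsWith("-javaagent:")) {
                return true;
            }
        }
        return false;
    }

    // The combination the answer warns about: an instrumenting agent
    // running while the verifier is switched off.
    static boolean riskyCombination(List<String> jvmArgs) {
        return hasJavaAgent(jvmArgs) && verificationDisabled(jvmArgs);
    }

    public static void main(String[] args) {
        // Arguments the current JVM was actually started with.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        if (riskyCombination(jvmArgs)) {
            System.err.println("Warning: -javaagent is used with bytecode verification disabled");
        }
    }
}
```

The check is kept as a pure function over a list of strings so that it can also be run against options copied out of a startup script, not just the live JVM.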
4.2.1 Crash in HotSpot Compiler Thread or Compiled Code
If the fatal error log indicates that the crash occurred in a compiler
thread, then it is possible (but not always the case) that you have
encountered a compiler bug. Similarly, if the crash is in compiled
code then it is possible that the compiler has generated incorrect
code.
In the case of the HotSpot Client VM (-client option), the compiler
thread appears in the error log as CompilerThread0. With the HotSpot
Server VM there are multiple compiler threads and these appear in the
error log file as CompilerThread0, CompilerThread1, and AdapterThread.
Below is a fragment of an error log for a compiler bug that was
encountered and fixed during the development of J2SE 5.0. The log file
shows that the HotSpot Server VM is used and the crash occurred in
CompilerThread1. In addition, the log file shows that the Current
CompileTask was the compilation of the java.lang.Thread.setPriority
method.
An unexpected error has been detected by HotSpot Virtual Machine:
:
Java VM: Java HotSpot(TM) Server VM (1.5-internal-debug mixed mode)
:
--------------- T H R E A D ---------------
Current thread (0x001e9350): JavaThread "CompilerThread1" daemon [_thread_in_vm, id=20]
Stack: [0xb2500000,0xb2580000), sp=0xb257e500, free space=505k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0xc3b13c]
:
Current CompileTask:
opto: 11 java.lang.Thread.setPriority(I)V (53 bytes)
--------------- P R O C E S S ---------------
Java Threads: ( => current thread )
0x00229930 JavaThread "Low Memory Detector" daemon [_thread_blocked, id=21]
=>0x001e9350 JavaThread "CompilerThread1" daemon [_thread_in_vm, id=20]
:
In this case there are two potential workarounds:
The brute force approach: change the configuration so that the application is run with the -client option to specify the HotSpot Client VM.
Assume that the bug only occurs during the compilation of the setPriority method and exclude this method from compilation.
The first approach (to use the -client option) might be trivial to
configure in some environments. In others, it might be more difficult
if the configuration is complex or if the command line to configure
the VM is not readily accessible. In general, switching from the
HotSpot Server VM to the HotSpot Client VM also reduces the peak
performance of an application. Depending on the environment, this
might be acceptable until the actual issue is diagnosed and fixed.
The second approach (exclude the method from compilation) requires
creating the file .hotspot_compiler in the working directory of the
application. Below is an example of this file:
exclude java/lang/Thread setPriority
In general the format of this file is exclude CLASS METHOD, where
CLASS is the class (fully qualified with the package name) and METHOD
is the name of the method. Constructor methods are specified as
<init> and static initializers are specified as <clinit>.
Note - The .hotspot_compiler file is an unsupported interface. It is
documented here solely for the purposes of troubleshooting and finding
a temporary workaround.
Once the application is restarted, the compiler will not attempt to
compile any of the methods listed as excluded in the .hotspot_compiler
file. In some cases this can provide temporary relief until the root
cause of the crash is diagnosed and the bug is fixed.
In order to verify that the HotSpot VM correctly located and processed
the .hotspot_compiler file that is shown in the example above, look
for the following log information at runtime. Note that the file name
separator is a dot, not a slash.
Excluding compile: java.lang.Thread::setPriority
Source
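As a small illustration of the file format quoted above, the following sketch builds an exclude entry and the matching runtime message from a class name and a method name. HotspotCompilerExclude is a hypothetical helper, not a JDK API; the only facts it relies on are the "exclude CLASS METHOD" format with slashes in the file and the dotted "Excluding compile:" message, both taken from the quoted text.

```java
// Hypothetical helper for producing .hotspot_compiler entries; the file
// format comes from the quoted guide, the class itself is illustrative.
final class HotspotCompilerExclude {
    // Builds the line to put in .hotspot_compiler, for example
    // "exclude java/lang/Thread setPriority".
    static String excludeLine(String className, String methodName) {
        return "exclude " + className.replace('.', '/') + " " + methodName;
    }

    // Builds the message the guide says to look for at runtime, where the
    // separator in the class name is a dot rather than a slash.
    static String expectedLogLine(String className, String methodName) {
        return "Excluding compile: " + className.replace('/', '.') + "::" + methodName;
    }

    public static void main(String[] args) {
        System.out.println(excludeLine("java.lang.Thread", "setPriority"));
        System.out.println(expectedLogLine("java.lang.Thread", "setPriority"));
    }
}
```

The first printed line is what would go into the .hotspot_compiler file in the application's working directory; the second is the text to search for in the output to confirm the file was picked up.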
Agree with @apangin. In your program you are doing bytecode instrumentation (-javaagent) but also specifying -noverify. When verification is turned off, you may end up with such crashes.
You should not use -noverify or -Xverify:none during bytecode instrumentation.
For those of you unfamiliar with bytecode verification, it is simply part of the JVM's classloading process that checks the code for certain dangerous and disallowed behavior. You can (but shouldn't) disable this protection on many JVMs by adding -Xverify:none or -noverify to the Java command line. https://blogs.oracle.com/buck/entry/never_disable_bytecode_verification_in

How to debug the crashes of a JNI-interfaced lib?

There is an error in my JNI lib, outside of the JVM.
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007ff9037cd8de, pid=24387, tid=140708181948160
#
# JRE version: Java(TM) SE Runtime Environment (8.0_31-b13) (build 1.8.0_31-b13)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.31-b07 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C [_MyLib-ExitOnDelete-3797760937319876478.so+0x408de] MyLib::MyLibStringImpl::inc()+0xc
#
It seems to happen when I try to create a temp file and then delete it on exit of the function, but the crash points to my string impl for increasing a counter: inc() {return count +=1;}. The C++ lib runs fine by itself. Only in the JNI lib does it crash often.
Any insights?