Java OOM: no heap dump created

I'm getting an out-of-memory (OOM) error in a long-running application (3–5 hours), with the following symptoms:
-XX:+HeapDumpOnOutOfMemoryError produces an empty dump, which looks like this known JVM bug:
http://bugs.java.com/view_bug.do?bug_id=6784422
Exception in thread "[STANDBY] ExecuteThread: '21' for queue:
'weblogic.kernel.Default (self-tuning)'" java.lang.OutOfMemoryError:
getNewTla
Are there any other JVM options I can add to find out the exact cause of the above symptoms? The application owner refuses to allow me to increase -Xmx, -Xns, or -Xms, or to change anything else; I am limited to collecting more data.
jrockit-jdk1.6.0

Since you are using JRockit, you need to bump up the TLA (thread-local area) size:
-XXtlaSize:min=10k,preferred=256k
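For example (an illustrative launch only; in WebLogic these options are usually passed through the domain start scripts, and the classpath and main class here are placeholders):

java -XXtlaSize:min=10k,preferred=256k -XX:+HeapDumpOnOutOfMemoryError -cp app.jar com.example.Main

Keeping -XX:+HeapDumpOnOutOfMemoryError in place alongside the new TLA setting lets you confirm whether the dump is still empty after the change.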

Related

WebLogic thread dump analysis: HttpSession is invalid

Hi, I am very new to thread dump analysis, so please excuse me. There was an increase in the WebLogic queue; while trying to analyze it, I got the error below several (more than 1000) times.
Can anybody help me understand what the issue is in the entry below, and how the sessions got invalidated? Is this a potential issue?
<[ACTIVE] ExecuteThread: '173' for queue: 'weblogic.kernel.Default (self-tuning)'> <guest> <> <> <1472479201177> <BEA-100025> <An unexpected error occurred in HTTP session timeout callback while deleting sessions.
java.lang.IllegalStateException: HttpSession is invalid
at weblogic.servlet.internal.session.SessionData.getInternalAttribute(SessionData.java:790)
at weblogic.servlet.internal.session.SessionData.getInternalAttribute(SessionData.java:785)
at weblogic.servlet.internal.session.SessionContext.invalidateSession(SessionContext.java:1067)
at weblogic.servlet.internal.session.SessionContext.access$400(SessionContext.java:45)
at weblogic.servlet.internal.session.SessionContext$SessionInvalidator.cleanupExpiredSessions(SessionContext.java:1001)
at weblogic.servlet.internal.session.SessionContext$SessionInvalidator$1.run(SessionContext.java:903)
at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321)
at weblogic.security.service.SecurityManager.runAs(SecurityManager.java:120)
at weblogic.servlet.internal.session.SessionContext$SessionInvalidator.timerExpired(SessionContext.java:897)
at weblogic.timers.internal.TimerImpl.run(TimerImpl.java:284)
at weblogic.work.SelfTuningWorkManagerImpl$WorkAdapterImpl.run(SelfTuningWorkManagerImpl.java:550)
at weblogic.work.ExecuteThread.execute(ExecuteThread.java:263)
at weblogic.work.ExecuteThread.run(ExecuteThread.java:221)

NullPointerException during WebSphere MQ connection from OSB

From an OSB project I connect to IBM WebSphere MQ in BINDING mode. Sending messages asynchronously from the Business Service to the MQ queue works fine, but I keep getting an NPE.
Could someone please tell me what I am doing wrong :) and advise? I am new to OSB and MQ. The error is below:
<AdminServer> <[ACTIVE] ExecuteThread: '23' for queue: 'weblogic.kernel.Default (self-tuning)'>
<<WLS Kernel>> <> <d4c01266a9822b8f:-5e045fa4:154e15afad0:-8000-000000000000222d> <1464087403232> <BEA-000802> <ExecuteRequest failed
java.lang.NullPointerException.
java.lang.NullPointerException
at java.util.concurrent.ConcurrentHashMap.hash(ConcurrentHashMap.java:209)
at java.util.concurrent.ConcurrentHashMap.containsKey(ConcurrentHashMap.java:836)
at com.bea.wli.sb.resources.mqconnection.MQConnectionFacade.getMQConnectionContext(MQConnectionFacade.java:70)
at com.bea.wli.sb.transports.mq.MQTransportTimerListener.timerExpired(MQTransportTimerListener.java:222)
at weblogic.timers.internal.TimerImpl.run(TimerImpl.java:284)
at weblogic.work.SelfTuningWorkManagerImpl$WorkAdapterImpl.run(SelfTuningWorkManagerImpl.java:550)
at weblogic.work.ExecuteThread.execute(ExecuteThread.java:263)
at weblogic.work.ExecuteThread.run(ExecuteThread.java:221)
A java.lang.NullPointerException is being thrown because a null key has been passed into the containsKey(Object) method of a ConcurrentHashMap, which does not permit null keys. See the Javadoc:
https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ConcurrentHashMap.html#containsKey(java.lang.Object)
You need to work out why the code in com.bea.wli.sb.resources.mqconnection.MQConnectionFacade.getMQConnectionContext(MQConnectionFacade.java:70) is trying to look up an object using a null key. If this code belongs to Oracle, you may need to engage their support teams.
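As a minimal standalone illustration of that Javadoc contract (a sketch, not the OSB code; the map contents are made up), ConcurrentHashMap rejects null keys:

import java.util.concurrent.ConcurrentHashMap;

public class NullKeyDemo {
    public static void main(String[] args) {
        ConcurrentHashMap<String, String> connections = new ConcurrentHashMap<>();
        connections.put("QM1", "connected");
        System.out.println(connections.containsKey("QM1")); // prints: true
        // Throws java.lang.NullPointerException (the exact frame varies by
        // JDK version), matching the containsKey failure in the trace above:
        connections.containsKey(null);
    }
}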

WebLogic server log fills up with <BEA-010213>

My WebLogic server log fills up with this error message:
####<Mar 2, 2015 11:38:57 AM MST> <Info> <EJB> <max75demo> <MAXIMOSERVER> <[ACTIVE] ExecuteThread: '4' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <> <> <1425321537312> <BEA-010213> <Message-Driven EJB: JMSContQueueProcessor-1's transaction was rolled back. The transaction details are: Xid=BEA1-73C4DAA3AC1F8569980C(21198065),Status=Rolled back. [Reason=weblogic.transaction.internal.AppSetRollbackOnlyException: setRollbackOnly called on transaction],numRepliesOwedMe=0,numRepliesOwedOthers=0,seconds since begin=0,seconds left=60,XAServerResourceInfo[WLStore_mydomain_cqinstore]=(ServerResourceInfo[WLStore_mydomain_cqinstore]=(state=rolledback,assigned=MAXIMOSERVER),xar=WLStore_mydomain_cqinstore21478888,re-Registered = false),SCInfo[mydomain+MAXIMOSERVER]=(state=rolledback),OwnerTransactionManager=ServerTM[ServerCoordinatorDescriptor=(CoordinatorURL=MAXIMOSERVER+10.0.0.11:80+mydomain+t3+, XAResources={WLStore_mydomain_sqinstore, WLStore_mydomain_sqoutstore, WSATGatewayRM_MAXIMOSERVER_mydomain, WLStore_mydomain_cqinstore},NonXAResources={})],CoordinatorURL=MAXIMOSERVER+10.0.0.11:80+mydomain+t3+).>
These logs consume disk space at 5 MB/s, causing my small drive to fill up quickly. The only recent change out of the ordinary was that I synced the machine's time with a time server and changed the time zone. I have since cleared out the tmp folder and restarted the server, but to no avail. I'm running WebLogic 10.3.3.0.
Is there something I can do to prevent these errors from occurring?
Thanks!
Thanks for the tip.
In my case, a co-worker had made some changes to an enterprise product on the host I was using (without informing me), which was likely causing these error messages, since the change involved JMS queues. It had nothing to do with the time change.

JobTracker - High memory and native thread usage

We are running Hadoop on GCE with HDFS as the default file system, and data input/output from/to GCS.
Hadoop version: 1.2.1
Connector version: com.google.cloud.bigdataoss:gcs-connector:1.3.0-hadoop1
Observed behavior: the JobTracker (JT) accumulates threads in a waiting state, leading to an OOM:
2015-02-06 14:15:51,206 ERROR org.apache.hadoop.mapred.JobTracker: Job initialization failed:
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:714)
at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:949)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1371)
at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.initialize(AbstractGoogleAsyncWriteChannel.java:318)
at com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl.create(GoogleCloudStorageImpl.java:275)
at com.google.cloud.hadoop.gcsio.CacheSupplementedGoogleCloudStorage.create(CacheSupplementedGoogleCloudStorage.java:145)
at com.google.cloud.hadoop.gcsio.GoogleCloudStorageFileSystem.createInternal(GoogleCloudStorageFileSystem.java:184)
at com.google.cloud.hadoop.gcsio.GoogleCloudStorageFileSystem.create(GoogleCloudStorageFileSystem.java:168)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopOutputStream.<init>(GoogleHadoopOutputStream.java:77)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.create(GoogleHadoopFileSystemBase.java:655)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:564)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:545)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:452)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:444)
at org.apache.hadoop.mapred.JobHistory$JobInfo.logSubmitted(JobHistory.java:1860)
at org.apache.hadoop.mapred.JobInProgress$3.run(JobInProgress.java:709)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:706)
at org.apache.hadoop.mapred.JobTracker.initJob(JobTracker.java:3890)
at org.apache.hadoop.mapred.EagerTaskInitializationListener$InitJob.run(EagerTaskInitializationListener.java:79)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
After looking through the JT logs I found these warnings:
2015-02-06 14:30:17,442 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #0 from primary datanode xx.xxx.xxx.xxx:50010
java.io.IOException: Call to /xx.xxx.xxx.xxx:50020 failed on local exception: java.io.IOException: Couldn't set up IO streams
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1150)
at org.apache.hadoop.ipc.Client.call(Client.java:1118)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
at com.sun.proxy.$Proxy10.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.checkVersion(RPC.java:422)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:414)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:392)
at org.apache.hadoop.hdfs.DFSClient.createClientDatanodeProtocolProxy(DFSClient.java:201)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3317)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2200(DFSClient.java:2783)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2987)
Caused by: java.io.IOException: Couldn't set up IO streams
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:642)
at org.apache.hadoop.ipc.Client$Connection.access$2200(Client.java:205)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1249)
at org.apache.hadoop.ipc.Client.call(Client.java:1093)
... 9 more
Caused by: java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:714)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:635)
... 12 more
This appears to be similar to the Hadoop bug reported here: https://issues.apache.org/jira/browse/MAPREDUCE-5606
I tried the proposed solution of disabling saving job logs into the output path, and it solved the problem at the expense of missing logs :)
I also ran jstack on the JT, and it showed hundreds of threads in WAITING or TIMED_WAITING state, like this:
"pool-52-thread-1" prio=10 tid=0x00007feaec581000 nid=0x524f in Object.wait() [0x00007fead39b3000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x000000074d86ba60> (a java.io.PipedInputStream)
at java.io.PipedInputStream.read(PipedInputStream.java:327)
- locked <0x000000074d86ba60> (a java.io.PipedInputStream)
at java.io.PipedInputStream.read(PipedInputStream.java:378)
- locked <0x000000074d86ba60> (a java.io.PipedInputStream)
at com.google.api.client.util.ByteStreams.read(ByteStreams.java:181)
at com.google.api.client.googleapis.media.MediaHttpUploader.setContentAndHeadersOnCurrentRequest(MediaHttpUploader.java:629)
at com.google.api.client.googleapis.media.MediaHttpUploader.resumableUpload(MediaHttpUploader.java:409)
at com.google.api.client.googleapis.media.MediaHttpUploader.upload(MediaHttpUploader.java:336)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:343)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:460)
at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel$UploadOperation.run(AbstractGoogleAsyncWriteChannel.java:354)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Locked ownable synchronizers:
- <0x000000074d864918> (a java.util.concurrent.ThreadPoolExecutor$Worker)
It appears the JT is having a hard time keeping up with its communication to GCS via the GCS connector.
Please advise.
Thank you
At the moment, every open FSDataOutputStream in the GCS connector for Hadoop consumes a thread until it's closed, because a separate thread needs to run the "resumable" HttpRequests while the user of the OutputStream writes bytes intermittently. In most cases (such as individual Hadoop tasks), there's only ever one long-lived output stream, and possibly a few shorter-lived ones for writing small metadata/marker files, etc.
In general, there are two possible causes for the OOM you're running into:
You have lots of queued up jobs; every submitted job holds an unclosed OutputStream, and thus consumes a "waiting" thread. However, since you mention you only need to queue up ~10 jobs, this shouldn't be the root cause.
Something is causing a "leak" of the PrintWriter objects, originally created in logSubmitted and added to fileManager. Typically, terminal events (like logFinished) will correctly close() all the PrintWriters before removing them from the map via markCompleted, but in theory there may be bugs here or there which can cause one of the OutputStreams to leak without being close()'d. For example, while I haven't had a chance to verify this assertion, it seems that an IOException while trying to do something like logMetaInfo will "removeWriter" without closing it.
I've verified that, at least under normal circumstances, the OutputStreams seem to get closed correctly, and my sample JobTracker shows a clean jstack after having successfully run a lot of jobs.
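To make the leak theory concrete, here is a minimal sketch (hypothetical class and method names, not the actual JobHistory code) of how an error path that removes a writer from the map without closing it strands the stream and, with the GCS connector, its upload thread:

import java.io.PrintWriter;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class WriterRegistry {
    // Stands in for the fileManager map holding per-job PrintWriters.
    private final Map<String, PrintWriter> fileManager = new ConcurrentHashMap<>();

    public void register(String jobId, PrintWriter writer) {
        fileManager.put(jobId, writer);
    }

    // Healthy terminal path: close() releases the underlying output
    // stream and the background thread servicing it.
    public void markCompleted(String jobId) {
        PrintWriter writer = fileManager.remove(jobId);
        if (writer != null) {
            writer.close();
        }
    }

    // Buggy error path: the writer disappears from the map but is never
    // closed, so the stream's background thread leaks.
    public void removeWriterOnError(String jobId) {
        fileManager.remove(jobId); // missing close() -> leaked thread
    }
}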
TL;DR: there are some working theories as to why some resource may leak and ultimately prevent necessary threads from being created. In the meantime, you should consider changing hadoop.job.history.user.location to some HDFS location, as a way to preserve the job logs instead of placing them on GCS.
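A minimal sketch of that workaround in mapred-site.xml (the property name comes from the paragraph above; the HDFS URI is a placeholder for your cluster):

<property>
  <name>hadoop.job.history.user.location</name>
  <value>hdfs://namenode:8020/user/hadoop/job-history</value>
</property>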

What is going wrong with my ETL process?

I'm using GoodData's CloudConnect (based on CloverETL) to read a massive JSON file and write certain elements to a .csv.
Unfortunately, I'm seeing the error pasted below in the console log. Am I running out of memory because of some other error, or is not having enough memory itself the actual error?
ERROR [WatchDog_0] - Component [JSONReader:JSONREADER1] finished with status ERROR.
Java heap space
ERROR [WatchDog_0] - Error details:
org.jetel.exception.JetelRuntimeException: Component [JSONReader:JSONREADER1] finished with status ERROR.
at org.jetel.graph.Node.createNodeException(Node.java:543)
at org.jetel.graph.Node.run(Node.java:522)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.Exception: java.lang.OutOfMemoryError: Java heap space
at org.jetel.component.TreeReader$StreamConvertingXPathProcessor.checkThrownException(TreeReader.java:766)
at org.jetel.component.TreeReader$StreamConvertingXPathProcessor.manageThread(TreeReader.java:757)
at org.jetel.component.TreeReader$StreamConvertingXPathProcessor.processInput(TreeReader.java:732)
at org.jetel.component.TreeReader.execute(TreeReader.java:412)
at org.jetel.graph.Node.run(Node.java:493)
... 1 more
Caused by: java.lang.OutOfMemoryError: Java heap space
at net.sf.saxon.tinytree.TinyTree.condense(TinyTree.java:379)
at net.sf.saxon.tinytree.TinyBuilder.close(TinyBuilder.java:177)
at net.sf.saxon.event.ReceivingContentHandler.endDocument(ReceivingContentHandler.java:219)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endDocument(AbstractSAXParser.java:745)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:515)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:848)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:777)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:649)
at net.sf.saxon.event.Sender.sendSAXSource(Sender.java:404)
at net.sf.saxon.event.Sender.send(Sender.java:193)
at net.sf.saxon.event.Sender.send(Sender.java:50)
at net.sf.saxon.Configuration.buildDocument(Configuration.java:2973)
at net.sf.saxon.sxpath.XPathExpression.evaluate(XPathExpression.java:154)
at org.jetel.component.tree.reader.xml.XmlXPathEvaluator.iterate(XmlXPathEvaluator.java:79)
at org.jetel.component.tree.reader.XPathPushParser.handleContext(XPathPushParser.java:104)
at org.jetel.component.tree.reader.XPathPushParser.parse(XPathPushParser.java:84)
at org.jetel.component.TreeReader$StreamConvertingXPathProcessor$PipeParser.work(TreeReader.java:827)
at org.jetel.graph.runtime.CloverWorker.run(CloverWorker.java:87)
... 1 more
This looks like the second case: the error is caused by insufficient memory for your task.
The error occurred while evaluating (one of) your JSONReader component(s).
The JSON seems to be really huge, so you should consider splitting the task into smaller ones if possible.
Did you run your transformation locally or on the GoodData server?
It is really hard to advise anything specific without knowing the details.
Try using JSONExtract instead of JSONReader; it uses less memory, yet it also reads JSON files.
From the respective help documents:
JSONReader uses DOM, so the whole input is stored in memory and therefore the component can be memory-greedy.
JSONExtract uses SAX instead of DOM, so it uses less memory than JSONReader.
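As a minimal Java illustration of the DOM-versus-SAX trade-off those help documents describe (standard JAXP APIs rather than CloverETL internals; the input path is a placeholder):

import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

public class DomVsSax {
    public static void main(String[] args) throws Exception {
        File input = new File("big.xml"); // placeholder input file

        // DOM (the JSONReader approach): the whole document is materialized
        // as an in-memory tree, so memory use grows with input size.
        DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(input);

        // SAX (the JSONExtract approach): the document streams through event
        // callbacks, so memory use stays roughly constant.
        SAXParserFactory.newInstance().newSAXParser().parse(input, new DefaultHandler());
    }
}

The stack trace above shows the same distinction at work: the OOM occurs while Saxon builds its in-memory TinyTree from the parsed input.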