WebLogic server log fills up with <BEA-010213>

My Weblogic server log fills up with this error message:
####<Mar 2, 2015 11:38:57 AM MST> <Info> <EJB> <max75demo> <MAXIMOSERVER> <[ACTIVE] ExecuteThread: '4' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <> <> <1425321537312> <BEA-010213> <Message-Driven EJB: JMSContQueueProcessor-1's transaction was rolled back. The transaction details are: Xid=BEA1-73C4DAA3AC1F8569980C(21198065),Status=Rolled back. [Reason=weblogic.transaction.internal.AppSetRollbackOnlyException: setRollbackOnly called on transaction],numRepliesOwedMe=0,numRepliesOwedOthers=0,seconds since begin=0,seconds left=60,XAServerResourceInfo[WLStore_mydomain_cqinstore]=(ServerResourceInfo[WLStore_mydomain_cqinstore]=(state=rolledback,assigned=MAXIMOSERVER),xar=WLStore_mydomain_cqinstore21478888,re-Registered = false),SCInfo[mydomain+MAXIMOSERVER]=(state=rolledback),OwnerTransactionManager=ServerTM[ServerCoordinatorDescriptor=(CoordinatorURL=MAXIMOSERVER+10.0.0.11:80+mydomain+t3+, XAResources={WLStore_mydomain_sqinstore, WLStore_mydomain_sqoutstore, WSATGatewayRM_MAXIMOSERVER_mydomain, WLStore_mydomain_cqinstore},NonXAResources={})],CoordinatorURL=MAXIMOSERVER+10.0.0.11:80+mydomain+t3+).>
These logs consume disk space at 5 MB/s, causing my small drive to fill up quickly. The only recent out-of-the-ordinary change was that I synced the machine's time with a time server and changed the time zone. I have since cleared out the tmp folder and restarted the server, but to no avail. I'm running WebLogic 10.3.3.0.
Is there something I can do to prevent these errors from occurring?
Thanks!

Thanks for the tip.
In my case, a co-worker had made some changes to an enterprise product on the host I was using (without informing me). Since the change involved JMS queues, it was the likely cause of these error messages. It had nothing to do with the time change.

Related

ServerFailureTriggerMBean.MaxStuckThreadTime & ServerFailureTriggerMBean.StuckThreadCount strange behaviour

I'm seeing strange behaviour with some WebLogic parameters.
I have a Java EE batch job that runs for more than 10 minutes on a WebLogic server and fails with an exception like
com.ibm.jbatch.container.exception.BatchContainerRuntimeException:
java.lang.InterruptedException
After some investigation, I found that the property MaxStuckThreadTime is set to 600 seconds (the default value) and the property StuckThreadCount is set to 25 (it was 0 in the past, without any issue).
If I understand correctly, this means the server should fail if and only if at least 25 threads have been busy for more than 600 seconds.
But I have at most 10 threads running at the same time on the server.
I ran some tests in my dev environment, and as soon as one thread is stuck (busy for 10 minutes), the InterruptedException is thrown. Is this the expected behaviour?
I don't have permission to modify those values in production.
So any idea for working around this error is welcome.
In the documentation, I found:
StuckThreadCount = The number of stuck threads after which the server is transitioned into FAILED state.
MaxStuckThreadTime = Sets the value of the MaxStuckThreadTime attribute.
So, from my point of view, the InterruptedException should only appear if both conditions are fulfilled, but I have the impression that a single stuck thread is enough to interrupt the batch.
Am I correct in saying that MaxStuckThreadTime is only taken into account if StuckThreadCount is different from 0?
Thanks in advance for your help.
Edit:
I tried to implement the proposal below, but so far without success.
In my weblogic-ejb-jar.xml, I added the following:
<work-manager>
    <name>BatchWorkManager</name>
    <ignore-stuck-threads>true</ignore-stuck-threads>
</work-manager>
<managed-executor-service>
    <name>batch-job-executor</name>
    <dispatch-policy>BatchWorkManager</dispatch-policy>
    <long-running-priority>10</long-running-priority>
</managed-executor-service>
and in my batch, I added
@Resource(name = "BatchWorkManager")
WorkManager myWM;
and I call my batch like this
@Override
public String process() throws Exception {
    myWM.schedule(new MyWork("MyBatchName"));
    return BatchStatus.COMPLETED.toString();
}
After a few minutes (the duration defined by the MaxStuckThreadTime parameter), the job is put into the FAILED status.
If I debug the code, I see these values in the work manager:
stuckThreadActions = null name = "NO STUCK THREAD ACTIONS !"
stuckThreads = {BitSet#36226} "{}"
It seems the work manager is set up correctly (NO STUCK THREAD ACTIONS ! is what I want).
So I still don't understand why the batch is failing.
Any help is welcome.
For information, here is the stack trace I receive:
###<Apr 21, 2022, 12:40:00,793 PM CEST> <com.ibm.jbatch.container.impl.BatchletStepControllerImpl> <[STUCK] ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)'> <> <33ef2b10-13cc-45be-bf47-e06daf40042c-0000003b> <1650537600793> <[severity-value: 16] [rid: 0:1] [partition-id: 0] [partition-name: DOMAIN] > <Caught exception executing step: com.ibm.jbatch.container.exception.BatchContainerRuntimeException: java.lang.InterruptedException
at com.ibm.jbatch.container.impl.PartitionedStepControllerImpl.executeAndWaitForCompletion(PartitionedStepControllerImpl.java:407)
at com.ibm.jbatch.container.impl.PartitionedStepControllerImpl.invokeCoreStep(PartitionedStepControllerImpl.java:297)
at com.ibm.jbatch.container.impl.BaseStepControllerImpl.execute(BaseStepControllerImpl.java:144)
at com.ibm.jbatch.container.impl.ExecutionTransitioner.doExecutionLoop(ExecutionTransitioner.java:112)
at com.ibm.jbatch.container.impl.JobThreadRootControllerImpl.originateExecutionOnThread(JobThreadRootControllerImpl.java:110)
at com.ibm.jbatch.container.util.BatchWorkUnit.run(BatchWorkUnit.java:80)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at weblogic.work.concurrent.TaskWrapper.call(TaskWrapper.java:151)
at weblogic.work.concurrent.future.AbstractFutureImpl.runTask(AbstractFutureImpl.java:391)
at weblogic.work.concurrent.future.AbstractFutureImpl.doRun(AbstractFutureImpl.java:436)
at weblogic.work.concurrent.future.ManagedFutureImpl.run(ManagedFutureImpl.java:28)
at weblogic.invocation.ComponentInvocationContextManager._runAs(ComponentInvocationContextManager.java:348)
at weblogic.invocation.ComponentInvocationContextManager.runAs(ComponentInvocationContextManager.java:333)
at weblogic.work.LivePartitionUtility.doRunWorkUnderContext(LivePartitionUtility.java:54)
at weblogic.work.PartitionUtility.runWorkUnderContext(PartitionUtility.java:41)
at weblogic.work.SelfTuningWorkManagerImpl.runWorkUnderContext(SelfTuningWorkManagerImpl.java:640)
at weblogic.work.ExecuteThread.execute(ExecuteThread.java:406)
at weblogic.work.ExecuteThread.run(ExecuteThread.java:346)
Caused by: java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2048)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at com.ibm.jbatch.container.impl.PartitionedStepControllerImpl.executeAndWaitForCompletion(PartitionedStepControllerImpl.java:402)
... 17 more
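For reference when reading this trace: the "Caused by" section ends in LinkedBlockingQueue.take(), a blocking call that throws InterruptedException when the waiting thread is interrupted. The trace does not show what issued the interrupt. The following is only a minimal JDK-only sketch of that mechanism; it does not reproduce WebLogic's stuck-thread handling, and the class and method names are hypothetical.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicReference;

final class InterruptedTakeDemo {

    /** Blocks a worker on an empty queue, interrupts it, and returns what it caught. */
    static Throwable interruptBlockedTake() {
        LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>();
        AtomicReference<Throwable> caught = new AtomicReference<>();

        Thread worker = new Thread(() -> {
            try {
                queue.take(); // nothing is ever put on the queue, so this waits
            } catch (InterruptedException e) {
                caught.set(e); // same exception type as in the "Caused by" section
            }
        });

        worker.start();
        worker.interrupt(); // stands in for whatever interrupts the job thread
        try {
            worker.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException("demo thread was itself interrupted", e);
        }
        return caught.get();
    }

    public static void main(String[] args) {
        System.out.println(interruptBlockedTake());
    }
}
```

The point of the sketch is that the exception comes from the interrupt delivered to the waiting thread, not from the queue itself, so the question to answer is which component interrupts the job thread.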
You could configure a new work manager for running the batch job and configure stuck threads to be ignored, or launch the batch job as a long-running request.
A work manager can be configured globally via the WebLogic console, or locally for each deployed application. To define a work manager in an application, you can configure it in the weblogic.xml (or the equivalent for EAR files) packaged with your deployment. For example, I have this in my weblogic.xml file to define a work manager that ignores stuck threads:
<?xml version="1.0" encoding="UTF-8"?>
<weblogic-web-app xmlns="http://xmlns.oracle.com/weblogic/weblogic-web-app" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://xmlns.oracle.com/weblogic/weblogic-web-app http://xmlns.oracle.com/weblogic/weblogic-web-app/1.4/weblogic-web-app.xsd">
    ...
    <work-manager>
        <name>batch-job-wm</name>
        <max-threads-constraint>
            <name>batch-job-max-threads</name>
            <count>10</count>
        </max-threads-constraint>
        <ignore-stuck-threads>true</ignore-stuck-threads>
    </work-manager>
    <managed-executor-service>
        <name>batch-job-executor</name>
        <dispatch-policy>batch-job-wm</dispatch-policy>
        <long-running-priority>10</long-running-priority>
        <max-concurrent-long-running-requests>10</max-concurrent-long-running-requests>
    </managed-executor-service>
    <resource-env-description>
        <resource-env-ref-name>concurrent/batch-job-executor</resource-env-ref-name>
        <resource-link>batch-job-executor</resource-link>
    </resource-env-description>
    ...
</weblogic-web-app>
I reference that managed-executor-service in my web.xml...
<?xml version="1.0" encoding="UTF-8"?>
<web-app xmlns="http://java.sun.com/xml/ns/javaee" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://java.sun.com/xml/ns/javaee
         http://java.sun.com/xml/ns/javaee/web-app_3_0.xsd" version="3.0">
    ...
    <resource-env-ref>
        <resource-env-ref-name>concurrent/batch-job-executor</resource-env-ref-name>
        <resource-env-ref-type>javax.enterprise.concurrent.ManagedExecutorService</resource-env-ref-type>
    </resource-env-ref>
</web-app>
In my web application, I can then access that task executor as follows...
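import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.task.TaskExecutor;
import org.springframework.scheduling.concurrent.DefaultManagedTaskExecutor;

@Configuration
public class ResourceConfig {
    // Looks up the container-managed executor declared in web.xml via JNDI.
    @Bean
    public TaskExecutor batchTaskExecutor() {
        DefaultManagedTaskExecutor taskExecutor = new DefaultManagedTaskExecutor();
        taskExecutor.setJndiName("java:comp/env/concurrent/batch-job-executor");
        return taskExecutor;
    }
}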
When a batch job is launched using that work manager, WebLogic ignores any stuck threads and the servers show as healthy even for long-running tasks.
An enhancement to this is to launch the batch job as a long-running task. I think this will cause WebLogic to create a new thread for the task instead of taking a thread from the work manager thread pool. WebLogic also won't consider a thread assigned to a long-running task to be stuck.
To launch a long-running task, you need to set the LONGRUNNING_HINT execution property to true in the ManagedTask that is launched. For more details, see the following:
https://docs.oracle.com/javaee/7/api/javax/enterprise/concurrent/ManagedTask.html#LONGRUNNING_HINT
https://docs.oracle.com/javaee/7/api/javax/enterprise/concurrent/ManagedExecutorService.html
https://blogs.oracle.com/weblogicserver/post/concurrency-utilities-support-in-weblogic-server-1221-part-one-managedexecutorservice
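As a rough sketch of the long-running hint described above: the ManagedTask Javadoc describes LONGRUNNING_HINT as an execution property whose value is "true" or "false". The snippet below uses only the JDK so that it can run outside a container. The class name LongRunningHintSketch and the method longRunningProperties are hypothetical, and the literal key is assumed to equal the value of the constant ManagedTask.LONGRUNNING_HINT; inside the application you would reference the constant itself.

```java
import java.util.HashMap;
import java.util.Map;

final class LongRunningHintSketch {

    // Assumed to match javax.enterprise.concurrent.ManagedTask.LONGRUNNING_HINT;
    // spelled out here only so the sketch compiles without the Java EE API jar.
    static final String LONGRUNNING_HINT = "javax.enterprise.concurrent.LONGRUNNING_HINT";

    /** Builds the execution properties that mark a task as long running. */
    static Map<String, String> longRunningProperties() {
        Map<String, String> properties = new HashMap<>();
        properties.put(LONGRUNNING_HINT, "true");
        return properties;
    }

    public static void main(String[] args) {
        // In a container, this map would be returned from
        // ManagedTask.getExecutionProperties(), or passed to
        // ManagedExecutors.managedTask(task, properties, null)
        // before the task is submitted to the ManagedExecutorService.
        System.out.println(longRunningProperties());
    }
}
```

Whether the hint is honoured, and how it interacts with the max-concurrent-long-running-requests setting shown earlier, is up to the server, so it is worth checking the thread name in the logs after deploying.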

HANA hdbindexserver start issue after power outage

There was a power outage for our 5+1 node HANA cell cluster.
After we booted the servers, we tried to start the HANA DB.
During HDB start as SIDADM, we see the following on nodes 2, 3, 4 and 5:
FAIL: process hdbindexserver HDB Indexserver not running
So of course we tried starting hdbindexserver by hand as SIDADM:
cd /usr/sap/SIDADM/HDB0x/exe; ./hdbindexserver
But this just produces an error:
/usr/sap/SIDADM/HDB0x/foobar003/trace> cat indexserver_alert_foobar003.trc
...
[14268]{-1}[-1/-1] 2017-10-09 19:55:34.593776 e TrexNet Communication.cpp(00501) : no internal interface found
[14287]{-1}[-1/-1] 2017-10-09 19:56:01.428226 e Checkpoint CheckpointMgr.cc(00244) : Skip versions garbage collection savepoint: transaction distribution work failure: snapshot timestamp synchronization failed
[14287]{-1}[-1/-1] 2017-10-09 19:56:22.467184 e Row_Engine transdtx.cc(01410) : Unexpected ltt exception thrown: transaction distribution work failure (at foobar/ptime/storage/tm/transdtx.cc:1410 )
[14287]{-1}[-1/-1] 2017-10-09 19:56:22.467427 f PersistenceLayer PersistenceController.cpp(00679) : startup failed exception 1: no.71000145 (ptime/storage/tm/transdtx.cc:1512)
snapshot timestamp synchronization failed
...
The IPs are up. There is 1 TB of RAM.
The question: what could cause hdbindexserver to fail to start?
It looks like the indexserver process wasn't able to bind to the internal network interface again:
Communication.cpp(00501) : no internal interface found
I'd look into the other trace files and the system log to check whether the configured network interface is up and available.
It seems the persistence storage (the disk where the data and log files reside) is not responding in time, so the operation times out. Can you check whether you can access the data files and log files from the server?
Also check whether network I/O or disk I/O is slow on that server, causing the synchronization to time out.
You can try stopping the system completely and then bringing HDB up on just that server first, to check whether the above issue persists.

WebLogic thread dump analysis: HttpSession is invalid

Hi, I am very new to thread dump analysis, so please bear with me. There was an increase in the WebLogic queue, and while trying to analyze it I found the error below, repeated more than 1000 times.
Can anybody help me understand what the issue below is, and how the sessions got invalidated? Is this a potential problem?
<[ACTIVE] ExecuteThread: '173' for queue: 'weblogic.kernel.Default (self-tuning)'> <guest> <> <> <1472479201177> <BEA-100025> <An unexpected error occurred in HTTP session timeout callback while deleting sessions.
java.lang.IllegalStateException: HttpSession is invalid
at weblogic.servlet.internal.session.SessionData.getInternalAttribute(SessionData.java:790)
at weblogic.servlet.internal.session.SessionData.getInternalAttribute(SessionData.java:785)
at weblogic.servlet.internal.session.SessionContext.invalidateSession(SessionContext.java:1067)
at weblogic.servlet.internal.session.SessionContext.access$400(SessionContext.java:45)
at weblogic.servlet.internal.session.SessionContext$SessionInvalidator.cleanupExpiredSessions(SessionContext.java:1001)
at weblogic.servlet.internal.session.SessionContext$SessionInvalidator$1.run(SessionContext.java:903)
at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321)
at weblogic.security.service.SecurityManager.runAs(SecurityManager.java:120)
at weblogic.servlet.internal.session.SessionContext$SessionInvalidator.timerExpired(SessionContext.java:897)
at weblogic.timers.internal.TimerImpl.run(TimerImpl.java:284)
at weblogic.work.SelfTuningWorkManagerImpl$WorkAdapterImpl.run(SelfTuningWorkManagerImpl.java:550)
at weblogic.work.ExecuteThread.execute(ExecuteThread.java:263)
at weblogic.work.ExecuteThread.run(ExecuteThread.java:221)

NullPointerException during WebSphere MQ connection from OSB

From an OSB project I connect to IBM WebSphere MQ in binding mode. Sending messages asynchronously from the Business Service to the MQ queue works fine, but I keep getting an NPE.
Could someone please tell me what I am doing wrong :) and advise? I am new to OSB and MQ. The error is below:
<AdminServer> <[ACTIVE] ExecuteThread: '23' for queue: 'weblogic.kernel.Default (self-tuning)'>
<<WLS Kernel>> <> <d4c01266a9822b8f:-5e045fa4:154e15afad0:-8000-000000000000222d> <1464087403232> <BEA-000802> <ExecuteRequest failed
java.lang.NullPointerException.
java.lang.NullPointerException
at java.util.concurrent.ConcurrentHashMap.hash(ConcurrentHashMap.java:209)
at java.util.concurrent.ConcurrentHashMap.containsKey(ConcurrentHashMap.java:836)
at com.bea.wli.sb.resources.mqconnection.MQConnectionFacade.getMQConnectionContext(MQConnectionFacade.java:70)
at com.bea.wli.sb.transports.mq.MQTransportTimerListener.timerExpired(MQTransportTimerListener.java:222)
at weblogic.timers.internal.TimerImpl.run(TimerImpl.java:284)
at weblogic.work.SelfTuningWorkManagerImpl$WorkAdapterImpl.run(SelfTuningWorkManagerImpl.java:550)
at weblogic.work.ExecuteThread.execute(ExecuteThread.java:263)
at weblogic.work.ExecuteThread.run(ExecuteThread.java:221)
A java.lang.NullPointerException is being thrown because a null key has been passed into the containsKey(Object) method of a ConcurrentHashMap. See the Javadoc for this:
https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ConcurrentHashMap.html#containsKey(java.lang.Object)
You need to work out why the code at com.bea.wli.sb.resources.mqconnection.MQConnectionFacade.getMQConnectionContext(MQConnectionFacade.java:70) is trying to look up an object using a null key. If this code belongs to Oracle, you might need to engage their support team.
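The behaviour described above is easy to confirm in isolation. The following is a minimal JDK-only sketch, not the OSB code: the class name NullKeyDemo and both helper methods are hypothetical, and the null check only illustrates what a caller could do if it controlled the key.

```java
import java.util.concurrent.ConcurrentHashMap;

final class NullKeyDemo {

    /** Returns true if containsKey(null) throws, as the Javadoc specifies. */
    static boolean containsKeyThrowsOnNull() {
        ConcurrentHashMap<String, Object> contexts = new ConcurrentHashMap<>();
        try {
            contexts.containsKey(null); // ConcurrentHashMap rejects null keys
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    /** Defensive lookup: treats a null key as "not present" instead of failing. */
    static boolean safeContainsKey(ConcurrentHashMap<String, Object> map, String key) {
        return key != null && map.containsKey(key);
    }

    public static void main(String[] args) {
        System.out.println(containsKeyThrowsOnNull());
        System.out.println(safeContainsKey(new ConcurrentHashMap<>(), null));
    }
}
```

In the trace the lookup happens inside a timer listener, so the null key most likely originates from whatever value that listener passes in, which is why the configuration feeding it is worth checking first.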

Java OOM: no heapdump created

I'm getting an out-of-memory (OOM) error in a long-running application (3~5 hours) with the following symptoms:
-XX:+HeapDumpOnOutOfMemoryError produces an empty dump
http://bugs.java.com/view_bug.do?bug_id=6784422
Exception in thread "[STANDBY] ExecuteThread: '21' for queue: 'weblogic.kernel.Default (self-tuning)'" java.lang.OutOfMemoryError: getNewTla
Are there any other JVM options I can add to find out the exact cause of the above symptoms? The application owner refuses to let me increase -Xmx, -Xns or -Xms, or change anything else, except to collect more data.
jrockit-jdk1.6.0
Since you are using JRockit, you need to increase the TLA (thread-local area) size:
-XXtlaSize:min=10k,preferred=256k