Why are my WebLogic clustered MDB app deployments in warning state? - weblogic

I have a WebLogic cluster on which I've deployed numerous topics and applications that use them. My applications uniformly show themselves in a Warning status. Looking at Monitoring on the deployment, I see the MDB application connects to Server #1, but on server #2 it shows this:
MDB application appName is NOT connected to messaging system.
My JMS Server is targetted to a migratable target, which is in turn targetted to the #1 server and has a cluster identified. And messages sent to either server all flow as expected. I just don't know why these deployments show in a Warning state.
WebLogic 11g

This can be avoided by using the parameter below
<start-mdbs-with-application>false</start-mdbs-with-application>
In the weblogic-application.xml, Setting start-mdbs-with-application to false forces MDBs to defer starting until after the server instance opens its listen port, near the end of the server boot up process.
If you want to perform startup tasks after JMS and JDBC services are available, but before applications and modules have been activated, you can select the Run Before Application Deployments option in the Administration Console (or set the StartupClassMBean’s LoadBeforeAppActivation attribute to “true”).
If you want to perform startup tasks before JMS and JDBC services are available, you can select the Run Before Application Activations option in the Administration Console (or set the StartupClassMBean’s LoadBeforeAppDeployments attribute to “true”).
Refer :http://docs.oracle.com/cd/E13222_01/wls/docs81/ejb/message_beans.html
this is applicable for the versions till 12c and later

I don't like unanswered questions, so I'm going to answer this one.
The problem is resolved, though I was not involved in its resolution. At present the problem only exists for the length of time it takes the JMS subsystem to fully initialize. During that period (with many queues, it can take a while) the JNDI system throws errors and the apps are truly in warning state. Once the JMS is fully initialized, everything goes green.
My belief is that someone corrected something in the JMS Server / Cluster config. I'll never know what it was.

Related

Weblogic migratable JMS consumer doesn't follow the service to the new managed server if the old server remains running

I have a JMS service targeted at a migratable target (using an Auto-Migrate Exactly-Once policy) in a cluster which consists of 2 managed servers, at any point of time the service is hosted at one of them and the consumer (which is targeted at the cluster) is supposed to receive messages seamlessly no matter where the service is hosted.
When I manually switch the host of the migratable target (clicking migrate), without turning the hosting managed server off, the consumer fails to receive messages sent to the queues, unless I turn off the previous hosting managed server forcing the consumer to the new host.
I can rule out sender problems, I can see the messages in the queue right after them being sent.
I'll be grateful if anyone can advice on how to configure either the consumer or the migratable service to work seamlessly when migration happens.
I think that may just be a misunderstanding of how migration works. The docs state Auto-Migrate Exactly-Once:
indicates that if at least one Managed Server in the candidate list
is running, then the JMS service will be active somewhere in the
cluster if servers should fail or are shut down (either gracefully or
forcibly). For example, a migratable target hosting a path service
should use this option so if its hosting server fails or is shut down,
the path service will automatically migrate to another server and so
will always be active in the cluster. Note that this value can lead to
target grouping. For example, if you have five exactly-once migratable
targets and only one server member is started, then all five
migratable targets will be activated on that server member.
The docs also state:
Manual Service Migration—the manual migration of pinned JTA and
JMS-related services (for example, JMS server, SAF agent, path
service, and custom store) after the host server instance fails
Your server/service has neither failed or shut down, you are forcing it to migrate with a healthy host still running, so it has not met the criteria for migration.
See more here as well.
I have some experience that sounds reminiscent of what you're looking at. There was some WLS-specific capability around recognizing reconfiguration in JMS destinations as part of their clustered server design.
In one case I had to call a WLS-specific method: weblogic.jms.extensions.WLSession.setExceptionListener(). This was on their implementation of the JMS Session interface. This is analogous to the standard JMS Connection.setExceptionListener().
With this WLS-specific capability, the WLSession.setExceptionListener() callback would occur at a point where the consuming client should tear down and re-establish the connection / session / consumer in reaction to a reconfiguration (migration) that had happened.

Behavior of WL.server.createEventSource on a Worklight Cluster Environment

Let's assume I have a cluster of 2 worklight servers sharing the same WL runtime.
On that runtime, I've installed a application with a adapter that is a create event source function.
Just like this IBM article.
https://www.ibm.com/developerworks/community/blogs/worklight/entry/configuring_a_polling_event_source_to_send_push_notifications?lang=en
My question is, what will happen on a cluster environment.
Will repeated work ensue?
By other words, would my two WL Servers will be pooling for events?
Or perhaps that functionality is writing a task on the WL DB that the WL Servers poll regularly to check for work if no instance is taking care of it, so that only a server at a time would be "the event source"?
I'm working with IBM Worklight 6.2 and Websphere Liberty Profile 8.5.5
Thanks in advance!
Here's my attempt to answer this after some consultation:
My question is, what will happen on a cluster environment. Will
repeated work ensue? By other words, would my two WL Servers will be
pooling for events?
While the Worklight Servers share the same runtime, they are still considered as 2 instances. This means that each of them will attempt to perform a polling action. This is considered OK.
However, it is important to note that the backend system that is being polled should likely be smart enough to handle such a situation where 2 polling attempts are done for the same message.
If the backend doesn't know how to handle polling properly, the same message can be pulled more than once. This is true even of you have a single eventsource running. So this is something to keep in mind.

Using a web service to drop message onto an ActiveMQ Queue fails on failover

I have a two activeMQ(5.6.0) brokers. They use a shared kaha database so only one can be 'running' at once.
I have a (asp.net) webservice that puts a message on a queue, locally if I start and stop the brokers the webservice fails over correctly
when I test with the brokers on seperate machines it sometimes works but often I get "socketException: Connection reset" errors and the message is lost.
The connection string I am using is below. Note that I am aware NMS does not understand the priority backup command but I have left it there for the future.
failover:(tcp://MACHINE1:61616,tcp://MACHINE2:62616)?transport.initialReconnectDelay=1000&transport.timeout=10000&randomize=false&priorityBackup=true
How can I make my fail over between brokers fool proof?
The shared Kaha database was on a simple share. Currently activeMQ (or windows) cannot reliably get or release the lock in this configuration. The shared database must sit on a 'real' SAN so that both instances of the queue software see the database as being on a local filestore not a network location.
See this page for more info http://activemq.apache.org/shared-file-system-master-slave.html

BizTalk connectivity issue to SQL during VM snapshot

We have one VM for BizTalk and a separate VM for the SQL backend. We are using Veeam for backups which basically kicks off a snapshot of the VM. When this snapshot is being finalized on the SQL VM, BizTalk services on the application server fail. Usually they restart automatically but sometimes this requires manual intervention to start the services. The error below is logged on the BizTalk server.
Is there any timeout setting or config changes that will allow BizTalk services to stay up during the snapshot process?
An error occurred that requires the BizTalk service to terminate. The most common causes are the following:
1) An unexpected out of memory error.
OR
2) An inability to connect or a loss of connectivity to one of the BizTalk databases.
The service will shutdown and auto-restart in 1 minute. If the problematic database remains unavailable, this cycle will repeat.
Error message: [DBNETLIB][ConnectionRead (recv()).]General network error. Check your network documentation.
Error source:
BizTalk host name: BizTalkServerApplication
Windows service name: BTSSvc$BizTalkServerApplication
We experienced the same situation and error with both BizTalk 2009 and BizTalk 2013, each set up with two App servers and one SQL DB server.
When our VMware does the final step of the Snapshot backup on the Application servers, it freezes the application server for about 10 seconds, preventing it from receiving packets. On SQL Server 2008 and 2012, it by default will send out keep-alive packets to the clients every 30 seconds (30,000 ms). If the SQL server fails to receive a response back from the App server, it will send out 5 retries (default setting) of the keep-alive request 1 second (1,000 ms) apart. If SQL still does not receive the response back, it will terminate the connection, which will cause the BizTalk hosts on the App server to reset, and in our case, when our German-made ERP system sends its EDI documents over to BizTalk during that reset period, the transmission will fail.
We trapped the issue by running NetMon on the DB and App servers, waiting for the next error message. Upon inspection, we see the five SQL keep-alive packets being sent to the App servers 1 second apart, and at the same time there were NO packets at all received on the Application server. At first guess, one might think they were "just dropped network packets", which is rarely the case. We then made the correlation to the timing of the VM Snapshots, and now confirm each time the snapshot finishes each day, the App servers freeze.
As a Short-to-mid-term workaround, we raised the number of retries SQL attempts before declaring a connection dead, (5 by default), by adding the registry value TcpMaxDataRetransmissions and setting it to 30 (thus 30 seconds before SQL declares the client unresponsive). This has masked the problem for now for us, and use at your own discretion.
We are also looking at an Agent-based version of the VM Snapshot, which may alleviate the condition of freezing the server.
Is there any timeout setting or config changes that will allow BizTalk services to stay up during the snapshot process?
Not that I am aware of, however you might want to Google config options in the btsntsvc.exe.config file which is located in your BizTalk installation directory.
All messages that pass through BizTalk are written to the BizTalkMsgBoxDb and its other databases are involved if you are running tracking, BAM etc. The only service that can cache 'stuff' and handle a database outage is the Enterprise Single Sign-On (ESSO) Service. BizTalk therefore needs a persistent connection to the database server to remain 'up', hence why your Host Instance (BizTalkServerApplication) is stopping - it simply wouldn't be able to process messages if the database wasn't there.
I would add that your approach to back-ups probably isn't supported by Microsoft and I would further suggest that you seriously consider whether an approach that takes your database server offline during the backup is viable?
BizTalk has a pretty robust backup solution for its various databases built into the product, and I would recommend that you take a look at using this supported method.
If you do need to take snapshots of the database system - say once a night - you might want to consider stopping the BizTalk Host Instances, performing the snapshot, and then re-starting the Host Instances through some scripted task.
You might also want to consider checking whether there are any hotfixes for your version of BizTalk Server included in a Cumulative Update that might help address your problem.

MQ With WLS Foreign Server

I am facing two issues when i try to connect to MQ which is deployed on a Remote Server from Weblogic Server(WLS) by creating a Foreign Server.
1. When I try to connect to MQ Queuemanager in Bindings mode(after importing the .Bindings file) i keep getting the below error in WLS console:
java.lang.UnsatisfiedLinkError: no mqjbnd05 in java.library.path
If i Switch the Transport to Client i keep getting:
JMSWMQ0018: Failed to connect to queue manager '' with connection mode 'Client' and host name 'localhost'. Check the queue manager is started and if running in client mode, check there is a listener running. Please see the linked exception for more information.
Has anyone seen this, and are there any performance implications which dictate the use of client over bindings and vice versa?
TIA
Finally i was able to resolve this, i had to recreate the .bindings file in the client mode, with changes to the IVTsetup.bat which is most likely present in
C:\Program Files\IBM\WebSphere MQ\java\bin, I had to run this
def qcf(psQCF) TRANSPORT(CLIENT) HOST(SMEKA) PORT(1415) CHANNEL(ps_SRV_CHANNEL) QMGR(psQM)
to generate the .bindings file.
Refer to this link for more details:
http://publib.boulder.ibm.com/infocenter/wbihelp/v6rxmx/index.jsp?topic=/com.ibm.wbia_adapters.doc/doc/peoplesoft/peopleso103.htm
Where the question states that I try to connect to MQ which is deployed on a Remote Server from Weblogic Server I assume this means that WLS and WMQ are on two different hosts. If that is the case, then a bindings mode connection (which relies on shared memory segments) won't work.
The client mode connection appears to be using a CF that is pointed to localhost rather than the IP or hostname of the WMQ server. This would work for an application on the same host as the queue manager but not when the app and QMgr are on separate servers.
As far as choosing between client and bindings mode, the answer is that if the QMgr is local use bindings. This provides highest reliability, best performance and XA transactionality. When using client mode, two-phase XA commit is not supported without the Extended Transactional Client. Per the JMS specification, there is an ambiguity that can exist if an app loses the connection during a COMMIT call. Depending on how the app handles this it's possible to end up with duplicate messages. (The JMS spec refers to these as "functionally duplicate.") This ambiguity is much less likely to occur with a bindings mode connection since there is no network latency and not even any traversal of the IP stack or interface. So use bindings mode where possible.
UPDATE:
Removed note about Extended Transactional Client being a chargeable component. As of April 24th, XTC is free of charge for all versions of WMQ on all platforms.