Platform Services hanging - Endeca

We are using Endeca 11.x and recently did an OS upgrade on the Endeca servers. Now we are facing an EAC communication issue between the ITL and CAS servers. I restarted Platform Services on both; the job runs for 1 hour and then throws an EAC communication exception. Other than that, there is no error in any log file. After going through the catalina file, I found that Platform Services stops responding and gives a Socket Timeout exception after 1 hour.
Any pointers on what the issue could be, or what I should check, would be really helpful.
Thanks
Neha

Related

Why can't MobileFirst 7.1 auto-recover / reconnect after the network is lost / disconnected?

IBM MobileFirst 7.1 does not auto-recover after a network failure / loss of connection, even though all services/connections are back to normal.
We have a clustered / farm setup with 2 web and app servers (Tomcat). Both app servers are able to serve incoming transactions. We had an incident in which there was a network failure/lost connection, and during that time all transactions were pointing to 1 app server. Although all connections went back to normal, this 1 app server was still unable to connect to the configuration DB. What we did was turn off the failed server and try the app, which then pointed to the other app server, and the app worked. We then restarted the failed app server and tested the app, and it is now accepting transactions. The question is: why does it not auto-recover, so that the Tomcat service needs to be restarted? Is MobileFirst 7.1 designed/built to behave this way (no auto-recovery)?
The expectation is that it should auto-recover.
Please help and advise on what can be checked/adjusted.
Thanks in advance.
Best regards,
Jonathan
The default DB configuration (datasource configuration) provided with MFP is not designed to auto-recover when there is a DB connectivity issue. You should be able to configure MFP for auto-reconnect by providing the correct datasource configuration. See this article on how this is done for different app servers: https://www.techpaste.com/2016/04/jndi-autoreconnect-java-application-servers/
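The idea behind an auto-reconnecting datasource can be sketched independently of MFP: before a connection is handed out, a cheap validation query is run against it, and a connection that fails the check is discarded and replaced. Tomcat's JDBC pool exposes this through settings such as `validationQuery` and `testOnBorrow`. The Python below is only an illustration of that check-then-replace logic, using an in-memory SQLite database as a stand-in for the configuration DB; the class name `ValidatingSource` and its methods are hypothetical and are not part of MFP or Tomcat.

```python
import sqlite3


class ValidatingSource:
    """Hypothetical helper: hands out a connection, replacing it when validation fails."""

    def __init__(self, connect, validation_query="SELECT 1"):
        self._connect = connect                # factory that opens a new connection
        self._validation_query = validation_query
        self._conn = None

    def get(self):
        if self._conn is not None:
            try:
                # "Test on borrow": run a cheap query before reusing the connection.
                self._conn.execute(self._validation_query).fetchone()
                return self._conn
            except sqlite3.Error:
                # The cached connection is stale (closed or broken); drop it.
                self._conn = None
        # No usable connection is cached, so open a fresh one.
        self._conn = self._connect()
        return self._conn


if __name__ == "__main__":
    source = ValidatingSource(lambda: sqlite3.connect(":memory:"))
    first = source.get()
    first.close()                  # simulate the connection being lost
    second = source.get()          # validation fails, a new connection is opened
    print(second is not first)
```

Without the validation step, the closed connection would be handed out again and every request would keep failing until the process is restarted, which matches the behaviour described in the question.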

WebLogic wls-wsat component in Payara: CVE-2017-10271

Question: I thought WebLogic and GlassFish / Payara were completely different servers that do not share any common code/component. How did I run into a WebLogic CVE when using Payara?
Configuration: Both our development and production systems run on Payara:
Payara 4.1.1.171.1 Full edition
Oracle Java 1.8.0_144
CentOS 7
Symptoms:
We have illegal connections to the URLs /wls-wsat/CoordinatorPortType11 and wls-wsat/ParticipantPortType, under Anonymous authentication, despite having Apache Shiro as the security system.
We have an unknown Python program running in Production. Nothing has been found in Development so far.
Payara Development has shut down once, and one deployment failed and left Payara stopped (start-domain was required). Payara Production has shut down once. All of this happened for unknown reasons; in particular, there were at most one or two users doing nothing special at the moments of shutdown.
What I can (not) do:
After seeing this and reading this, I think the problem is solved for WebLogic systems, but I don't know the mapping GlassFish version <-> WebLogic version, if it exists.
Unless I missed something big, I haven't found anything relating CVE-2017-10271 to Payara.
We are planning to upgrade to Payara 4.1.2.174 shortly, but I have no guarantee it will fix this issue.
I'm trying to check how Shiro can block such connections.
I'm asking this question to find out whether there is any relationship between WebLogic and GlassFish/Payara before opening an issue on the Payara GitHub. I tried unsuccessfully to run the Python script; I don't know Python :(
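One way to get a clearer picture of the connections described above is to list every request to a /wls-wsat/ path in the server's access log, so the source address and response status of each one can be reviewed. The sketch below assumes a plain-text access log with one request per line; the sample lines, the IP address and the function name `find_wls_wsat_hits` are made up for illustration and do not come from the original post.

```python
import re

# Matches any request path under /wls-wsat/ (the URLs named in the question).
WLS_WSAT = re.compile(r"/wls-wsat/\S*", re.IGNORECASE)


def find_wls_wsat_hits(lines):
    """Return (line_number, matched_path) for every line that requests /wls-wsat/."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = WLS_WSAT.search(line)
        if match:
            hits.append((number, match.group(0)))
    return hits


if __name__ == "__main__":
    # Hypothetical log lines; a real run would read the access log file instead.
    sample = [
        '203.0.113.5 - - "GET /index.xhtml HTTP/1.1" 200 5120',
        '203.0.113.5 - - "POST /wls-wsat/CoordinatorPortType11 HTTP/1.1" 404 1082',
        '203.0.113.5 - - "POST /wls-wsat/ParticipantPortType HTTP/1.1" 404 1082',
    ]
    for number, path in find_wls_wsat_hits(sample):
        print(number, path)
```

The output only shows that the URLs were requested; the response status recorded on each matching line is what indicates whether anything was actually served at that path.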

Mono/Unity3D disconnecting from a WCF duplex service after a while

I have a client using Unity3D which connects to a WCF duplex service. The client connects and receives data all the time, but after a random number of minutes it just stops receiving data. The service stays stable and can be reconnected to without any problem, and no error is thrown.
I have set up other test clients using .NET 3.5 and the exact same code I use on Mono, and those clients stay connected pretty much forever.
Does WCF in Mono have known connectivity issues? How can I prevent that from happening?
I was facing the same problem until I copied another version of System.ServiceModel.dll and System.Runtime.Serialization.dll into my Unity Asset (or plugin) folder. You can find these assemblies here: https://www.dropbox.com/sh/z05gp6zsqhshvpx/S-Wywb7NDh
(Running with Unity 4.2.2f1)
Have a good day !

Error in log after Windows restart when subscriber is installed as a service

I'm using NServiceBus 2.0 in pub/sub mode.
My subscribers are installed as a Windows service.
However, after a computer restart I always get the following problem in the log: "Problem in peeking a message from queue: ServiceNotAvailable".
After digging into the source code I found that this is a custom NServiceBus error and that it occurs in the MsmqTransport class. It seems that my subscriber's service is started before the MSMQ service. But this should be impossible, because the subscriber's service has MSMQ as a dependency.
After some time the service starts and works correctly, but I have several megabytes of errors in the log. And sometimes the service does not start at all.
Can anyone help me? I'm using Windows 7. MSMQ is installed with the NServiceBus utils.
You need to configure your service to be dependent on the MSMQ service. This should be taken care of automatically if you're using the NServiceBus host.
Installing a Windows Service with dependencies
I have seen the same problem. Actually, the impact was even worse for us, since we used log4net with an SmtpAppender. It took down the mail server, ouch! It seems this is fixed in NSB 3: it sets the number of worker threads to zero and logs "please reboot service". You can even execute your own code when the error occurs, by configuring a lambda with OnCriticalError. We ended up patching the NSB 2 code, since we haven't upgraded to NSB 3 yet: handling MSMQExceptions, logging, and stopping the process on error code ServiceNotAvailable, as NSB already does when you don't have the correct rights to the queue. You should probably stop the service on any MSMQException except IOTimeout.
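The patch described above boils down to one rule in the receive loop: a timeout while peeking is normal and the loop continues, while any other queue error is reported once and the loop stops instead of logging the same failure over and over. The sketch below shows that control flow only. It is not NServiceBus code; the exception classes, `run_receive_loop`, and the `None` sentinel used to end the demo are all hypothetical names chosen for illustration.

```python
class QueueTimeout(Exception):
    """Stand-in for a peek timeout (no message arrived in time)."""


class QueueUnavailable(Exception):
    """Stand-in for a fatal queue error such as ServiceNotAvailable."""


def run_receive_loop(peek, on_critical_error):
    """Process messages until a fatal queue error occurs; return how many were handled."""
    handled = 0
    while True:
        try:
            message = peek()
        except QueueTimeout:
            continue                      # an empty queue is expected, keep polling
        except QueueUnavailable as error:
            on_critical_error(error)      # report once, then stop instead of looping
            return handled
        if message is None:               # sentinel used only to end this sketch
            return handled
        handled += 1


if __name__ == "__main__":
    events = iter(["a", QueueTimeout(), "b", QueueUnavailable("ServiceNotAvailable")])

    def fake_peek():
        event = next(events)
        if isinstance(event, Exception):
            raise event
        return event

    errors = []
    count = run_receive_loop(fake_peek, errors.append)
    print(count, [str(e) for e in errors])
```

Stopping after the first fatal error means a single log entry (or a single mail from an SMTP appender) is produced, rather than several megabytes of repeated errors.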

ODBC connection re-establishes after application pool recycle

I have a web service application which connects to databases through the ODBC SQL Native Client and SQL Server drivers. All of a sudden the application stopped connecting to the database, throwing error 08001. But when I recycled the application pool, it started working again. Now it is happening intermittently and has become a headache for me. It can't be a memory problem, as it once happened immediately after an app pool recycle, but it was again corrected by one more app pool recycle. I don't know what is happening, as none of the error logs give any clue :(. Please help me...
The first step is to be able to diagnose what is going on. You cannot fix what you cannot measure. To do this, I would enable pooling in the data source console for the driver, then add the counters to the performance monitor to see what the connection pool is doing.
I'm not sure what the relationship between IIS application pool processes and ODBC connections is, but we are seeing some unexpected behaviour in this area. Also, the ODBC connection performance counters are visible if I connect to the driver through a locally installed console application, but I cannot see any performance counter activity for connections made via the web service app pool in IIS. Odd!