WebLogic Deployment Health Status through WLST

I need to monitor the health status of deployments on WebLogic Server. I can get the server health status, the thread pool status, and the overall server HealthState, but I couldn't find a deployment health status. I can get CurrentStatus, but that is the deployment status (Prepared, Active, etc.), not its health. Please help.

The health state lives on the individual server runtimes and cannot be found on the overall domain runtime; I think the admin server just aggregates all health states.
To get the health state of a single deployment, you can use the following code after connecting to the admin server over WLST:
domainRuntime()  # switch to the domain runtime MBean tree (only available on the admin server)
appBean = getMBean("ServerRuntimes/my_server1/ApplicationRuntimes/my_app1")
print appBean.getHealthState()  # prints the HealthState object (state OK, WARN, CRITICAL, FAILED or OVERLOADED)
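If you want the health of every deployment on every server in one pass, a minimal sketch (domainRuntimeService is a variable WLST defines for you after connecting; the server and application names printed come from your own domain):
for server in domainRuntimeService.getServerRuntimes():
    for app in server.getApplicationRuntimes():
        # one line per deployment: server name, app name, current health state
        print server.getName(), app.getName(), app.getHealthState()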

WebSphere 9 ND node agent stopped and the applications are still working. How/why?

This is WebSphere 9 ND. I've stopped the node agent, and the serverStatus.sh script reports that it is down: ADMU0509I: The Node Agent "nodeagent" cannot be reached. Why are the applications still authenticating users and apparently still working?
See this article explaining the basic concepts of IBM WebSphere Application Server Network Deployment.
node agent
A node agent manages all managed processes on a WebSphere Application Server on a node by communicating with the Network Deployment Manager to coordinate and synchronize the configuration. A node agent performs management operations on behalf of the Network Deployment Manager. The node agent represents the node in the management cell. Node agents are installed with WebSphere Application Server base, but are not required until the node is added to a cell in a Network Deployment environment.
application server
The application server is the primary component of WebSphere. The server runs a Java™ virtual machine, providing the runtime environment for the application's code. The application server provides containers that specialize in enabling the execution of specific Java application components.
Apps are deployed to the application server, not to the node agent. The role of the node agent is to perform management operations on behalf of the Deployment Manager.
So if the node agent is stopped, you only lose the ability to manage the servers running under that node; it will not stop already running application servers or applications deployed to servers on that node.
You can validate this by grepping for the server name (e.g. server1) in the list of running processes:
ps -ef | grep java | grep servername
Sample output (for an app server) is given below:
wasadmin 12345 98765 2 13:18 pts/0 00:04:57 /opt/ibm/WebSphere/AppServer/java/8.0/bin/java -Dosgi.install.area=/opt/ibm/WebSphere/AppServer <collapsed text> cellname nodename servername
where:
wasadmin is the OS username running the application server on that node.
12345 is the PID of the application server running on that node.
98765 is the PID of the parent process (the node agent). This will be "1" if the node agent is stopped.

GridGain console load balance

I have a three-node GridGain cluster and am also running the GridGain Web Console agent and the Web Console on all three nodes. It is all hosted on Windows Server.
I would like to load balance my Web Console. The problem is that I don't know how to share the user registration database, which it stores in a work directory. Can I use an external database to store all that information so that my cluster uses the same database?
There is a problem with the Web Console agent as well. How do I share the tokens stored in default.properties?
There is no definitive guide on how to cluster the Web Console for high availability.
Can someone please guide me on how to form a cluster for the Web Console, sharing its user store and tokens?
Thanks
If you are looking for multi-cluster support, take a look at documentation:
https://www.gridgain.com/docs/web-console/latest/multi-cluster-support
If you are looking for agent fault tolerance: just start several agents. The first agent will process all messages; the others will be in hot-standby mode.
If you are looking for connection fault tolerance between the agent and the cluster (if the cluster node that serves as the agent's connection point fails, the Web Console will lose its connection to the cluster), just specify several node addresses as a comma-separated list in the "node-uri" parameter (in default.properties or as a command-line argument).
For example:
node-uri=http://192.168.0.1:8080,http://192.168.0.2:8080,http://192.168.0.3:8080
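Putting this together, the agent's default.properties might look like the sketch below; the token and addresses are placeholders for whatever your own Web Console issues (tokens are generated per user in the Web Console, so running identical agents on all three nodes assumes they share the same token):
# Hypothetical values; substitute your own token and addresses.
tokens=1a2b3c4d-0000-0000-0000-000000000000
server-uri=http://webconsole.example.com:3000
node-uri=http://192.168.0.1:8080,http://192.168.0.2:8080,http://192.168.0.3:8080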
Hope this helps.

Workday integration jobs failing

We run integration jobs between Workday and on-premise applications. The integration jobs run and drop or fetch data files on an SFTP server on our site, running Bitvise SFTP server.
The integrations connect to our SFTP server through a virtual account and public keys.
These jobs had been running fine until a couple of days ago, when some of the integrations started to fail; the connections to the SFTP server seem to be successful at first, but they get terminated right away. We'll appreciate any help with this.
Thanks.
This issue was fixed after UTM (unified threat management) on the Palo Alto firewall was disabled.
Here is the network admin's explanation:
"The solution was to remove the IPS/IDS check from the rule that allows the Workday servers to SFTP in.
My guess is that after Palo Alto's dynamic updates (Antivirus/WildFire), it started to cut the working session all of a sudden."

How does a GlassFish cluster find active IIOP endpoints?

I am curious about something and have been searching for an answer without any result. In the GlassFish documentation it is written:
If the GlassFish Server instance on which the application client is
deployed participates in a cluster, the GlassFish Server finds all
currently active IIOP endpoints in the cluster automatically. However,
a client should have at least two endpoints specified for
bootstrapping purposes, in case one of the endpoints has failed.
but I am wondering how this list is built.
I've done some tests with a stand-alone client that runs in its own JVM and makes RMI calls to an application deployed in a GlassFish cluster. I can see from the logs that the IIOP endpoint list is completed automatically and set as the com.sun.appserv.iiop.endpoints system property, but if I stop a server instance or start another one while the client is running, the list remains the one that was created when the JVM started.
GlassFish clustering is managed by the Group Management Service (GMS), which usually uses UDP multicast but can use TCP where multicast is not available.
See section 4 "Administering GlassFish Server Clusters" in the HA Administration Guide (PDF)
The Group Management Service (GMS) enables instances to participate in a cluster by
detecting changes in cluster membership and notifying instances of the changes. To
ensure that GMS can detect changes in cluster membership, a cluster's GMS settings
must be configured correctly.
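Regarding the bootstrapping advice in the quoted passage, the endpoint list can also be supplied to the client JVM up front through the same system property you observed in the logs; a sketch with placeholder host names (3700 is the default GlassFish IIOP listener port, and my-client.jar is hypothetical):
java -Dcom.sun.appserv.iiop.endpoints=host1:3700,host2:3700 -jar my-client.jar
GMS-driven discovery then completes this list with the endpoints that are actually alive in the cluster when the client starts, which matches the behaviour you describe: the list is built once at JVM startup.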

Windows Server 2008 VM - network services failing

I would really appreciate another perspective on an issue we have been experiencing.
The environment:
We have a small subset of VMs (5 Windows Server 2008 R2 VMs) hosted on a Windows Server 2012 cluster of 8 physical hosts, which supports hundreds of VMs across various OS versions (2008/2012, etc.).
The issue:
Servers within the subset of VMs experience widespread network service failures. The failure presents itself as a loss of connectivity for a large number of network-related services running on the VMs (including certain critical network-dependent applications).
The impacts:
Server remains online.
Inability to RDP to the servers via Domain Accounts (Local accounts are fine).
Windows event logs associated with a Netlogon failure: Event ID 5719 - This computer was not able to set up a secure session with a domain controller in domain DOWNERGROUP due to the following: The RPC server is unavailable. This may lead to authentication problems.
Windows event logs associated with a Group Policy failure: Event ID 1054 - The processing of Group Policy failed. Windows could not obtain the name of a domain controller. This could be caused by a name resolution failure. Verify that your Domain Name System (DNS) is configured and working correctly.
Widespread agent failure (AV, monitoring, application) - lack of connectivity to centralised management servers.
The resolution(s): stopping an agent service. Strangely, it is not limited to a specific agent: if we stop agent A, the server comes back to life; but if we instead stop agent B, the server also comes back to life with agent A still running. Restarting the VM also resolves the issue.
Note that these events do not appear on other VMs hosted on the same host at the time of the outage. Also note that the guest is located on the same host before, during, and after the outage.
We have investigated the suspicion that there may be an issue with dynamic port range allocation, with the server possibly getting into a bottlenecked state. We have implemented the "MaxUserPort" and "TcpTimedWaitDelay" registry parameters and set them to 65k and 30 respectively.
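For reference, both values live under the TCP/IP parameters key and can be set from an elevated command prompt; a sketch, assuming "65k" means the usual maximum of 65534 (both changes take effect after a reboot):
reg add HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters /v MaxUserPort /t REG_DWORD /d 65534 /f
reg add HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters /v TcpTimedWaitDelay /t REG_DWORD /d 30 /f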
Also note that when an outage occurs, it does not always affect the same VMs in the group. Often it is 2, 3, 4, or all of the servers.
I'm really just asking whether anyone recognises these symptoms and can relate them to possible causes for our situation.
Any help/discussion would be appreciated.
Well, this turned out to be an interesting resolution.
We discovered that one of our server agents, while not actually showing open ports in netstat, had over 40,000 handles, growing linearly over time.
We had to enable the "Handles" column in Task Manager to see this information.
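A quick way to see the same information from the command line is to rank processes by handle count; a sketch using stock PowerShell cmdlets:
Get-Process | Sort-Object Handles -Descending | Select-Object -First 10 Name, Handles
A handle count that grows steadily and never drops, as ours did, is the classic signature of a handle leak.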
This was the miracle post...
http://blogs.technet.com/b/kimberj/archive/2012/07/06/sever-quot-hangs-quot-and-ephemeral-port-exhaustion-issues.aspx