Jboss Mod_cluster - apache

I have a JBoss cluster with two nodes (A and B) plus one Apache HTTP Server running mod_cluster on a separate server.
If one of the nodes goes down, mod_cluster can't connect to the other one.
So, if nodeA crashes, I can't access the JBoss application at http://apache_server/myapp, but I can at http://nodeb/myapp, and vice versa.
I searched Google, and almost everything I found says this is related to sessions, but I can't find what is wrong with my config. (mod_cluster was configured with the Load Balancer Configuration Tool.)
NodeA Log
15/05/2016 07:45:22,741 ERROR [org.jgroups.protocols.TCP] (http-/nodeA:8080-90) failed sending message to jbossnodeb:jbossnodeb/web (4148 bytes): java.net.SocketException: Socket closed, cause: null
15/05/2016 07:45:22,790 ERROR [org.jgroups.protocols.TCP] (OOB-6464,shared=tcp) failed sending message to jbossnodeb:jbossnodeb/web (4141 bytes): java.net.SocketException: Broken pipe, cause: null
NodeB Log
15/05/2016 07:45:23,126 ERROR [org.jgroups.protocols.TCP] (OOB-4949,shared=tcp) failed sending message to jbossnodea:jbossnodea/web (79 bytes): java.net.SocketException: Broken pipe, cause: null
15/05/2016 07:45:53,457 WARN [org.jgroups.protocols.TCP] (Timer-1,shared=tcp) null: no physical address for jbossnodea:jbossnodea/web, dropping message
Apache mod_cluster server log
[Sun May 15 07:45:04 2016] [error] (70007)The timeout specified has expired: proxy: read response failed from (null) (nodeA_IP)
[Sun May 15 07:45:34 2016] [error] (70007)The timeout specified has expired: ajp_cping_cpong: apr_socket_recv failed
[Sun May 15 07:45:38 2016] [error] ajp_handle_cping_cpong: ajp_ilink_receive failed
[Sun May 15 07:45:38 2016] [error] (70007)The timeout specified has expired: proxy: AJP: cping/cpong failed to (null) (nodeA_IP)
[Sun May 15 07:45:44 2016] [error] (70007)The timeout specified has expired: ajp_cping_cpong: apr_socket_recv failed
[Sun May 15 07:45:44 2016] [error] (70007)The timeout specified has expired: proxy: dialog to nodeA_IP:8009 (nodeA_IP) failed
[Sun May 15 07:45:44 2016] [error] ajp_read_header: ajp_ilink_receive failed
[Sun May 15 07:45:44 2016] [error] (70007)The timeout specified has expired: proxy: dialog to nodeA_IP:8009 (nodeA_IP) failed
[Sun May 15 07:45:44 2016] [error] (70007)The timeout specified has expired: proxy: dialog to nodeA_IP:8009 (nodeA_IP) failed
[Sun May 15 07:45:45 2016] [error] ajp_read_header: ajp_ilink_receive failed
[Sun May 15 07:45:45 2016] [error] (70007)The timeout specified has expired: proxy: dialog to (null) (nodeA_IP) failed
[Sun May 15 07:45:45 2016] [error] ajp_read_header: ajp_ilink_receive failed
[Sun May 15 07:45:45 2016] [error] (70007)The timeout specified has expired: proxy: dialog to (null) (nodeA_IP) failed
[Sun May 15 07:45:45 2016] [error] ajp_read_header: ajp_ilink_receive failed
[Sun May 15 07:45:45 2016] [error] proxy: CLUSTER: (balancer://clusterjboss). All workers are in error state
Config apache mod_cluster
AdvertiseGroup 225.0.1.107:23364
KeepAliveTimeout 60
ManagerBalancerName clusterjboss
ServerAdvertise On
AdvertiseFrequency 5
EnableMCPMReceive
CreateBalancers 0
AllowDisplay On
ProxyPass / balancer://clusterjboss/ stickysession=JSESSIONID|jsessionid nofailover=On

Visibility
JBoss worker instances must be able to contact your EnableMCPMReceive VirtualHost
Your JBoss worker instances report their IP address and AJP port to the Apache HTTP Server
Your Apache HTTP Server must be able to contact them back on those reported addresses
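The visibility requirements above can be checked with a plain TCP connection test. The following is a minimal sketch, not part of mod_cluster; the function name can_connect is hypothetical, and the hosts and ports you pass in are assumptions that you should replace with the addresses your nodes actually report.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Run from the Apache host, can_connect("nodeA_IP", 8009) checks the AJP port that appears in the Apache log above. Run from each JBoss node against the address and port of your EnableMCPMReceive VirtualHost to check the other direction. A successful connect only shows that the port is reachable, not that AJP or MCPM traffic is handled correctly.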
JGroups, Infinispan, Domains, Clustering
mod_cluster, i.e. the modcluster subsystem, has nothing to do with any of the aforementioned. The subsystem is completely oblivious to whether a cluster has been formed or whether your instances are in a domain, which is itself irrelevant to whether your instances are in a cluster in the first place. Don't bother with JGroups messages while investigating mod_cluster configuration.
Although, if your JGroups cluster is broken...
Infinispan, i.e. the distributed or replicated cache of your web session data in this case, relies on JGroups for forming a cluster and for exchanging messages within it. If your instances cannot form a cluster or fail to exchange messages, you might experience a loss of session data on failover.
For example: the Apache HTTP Server mod_cluster balancer decides to send a request with JSESSIONID yadayadaXXX.worker-1 to worker-2, because worker-1 is down. Due to a network configuration error, worker-1 and worker-2 have never correctly formed a cluster, so worker-2 does not have the session data of worker-1. The result is that the web application creates a new session, i.e. your client loses their context, e.g. a shopping cart (the popular showcase).
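The failover scenario above can be modeled roughly as follows. This is a simplified sketch, not mod_cluster's actual routing code: it assumes the route is whatever follows the last dot in the JSESSIONID value, and the function names route_from_session_id and pick_worker are hypothetical.

```python
from typing import List, Optional, Tuple


def route_from_session_id(session_id: str) -> Optional[str]:
    """Return the route suffix after the last '.', or None if there is none."""
    _, sep, route = session_id.rpartition(".")
    return route if sep and route else None


def pick_worker(session_id: str, available: List[str]) -> Tuple[Optional[str], bool]:
    """Pick a worker for the request.

    Returns (worker, sticky), where sticky is True only when the request
    went to the worker named in the session ID.
    """
    route = route_from_session_id(session_id)
    if route in available:
        return route, True
    if available:
        # Failover: the session survives only if the chosen worker has a
        # replicated copy of it, which requires a working JGroups cluster.
        return sorted(available)[0], False
    return None, False
```

With worker-1 down, pick_worker("yadayadaXXX.worker-1", ["worker-2"]) returns ("worker-2", False). Whether the client keeps their session then depends entirely on whether worker-2 received the replicated session data.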
ProxyPass
Don't use it unless you have something specific in mind. The whole point of mod_cluster is that it creates all proxy directives in memory, on the fly dynamically as your worker nodes and their web applications come and go. You start fiddling with additional ProxyPass directives if you want to:
react to special error codes from a special web application, e.g. to treat HTTP codes that are supposed to mean an error as valid and vice versa
serve static content directly from the Apache HTTP Server and not from worker nodes, e.g. pictures...
load balance some contexts to mod_cluster-aware JBoss worker nodes and some contexts to non-mod_cluster servers, e.g. another Apache HTTP Server running Drupal in PHP...
ManagerBalancerName
It is not clear to me why you would need to change it. If you change the default value, you also have to alter balancer="new_value" in your JBoss modcluster subsystem configuration. What it actually does is tell mod_cluster in the Apache HTTP Server to create additional, separately named proxy balancers internally. One could then use ProxyPass directives to tweak them separately. Do you need to tweak them? Judging by the rest of your config, I am convinced you do not. For example, session stickiness is configured in the modcluster subsystem on the JBoss nodes; worker nodes report it to the Apache HTTP Server balancer.
HTH, -K-

Possible changes that need to be made in the domain configuration (in JBoss AS 7 / EAP 6 the first two elements normally live in host.xml, the third in domain.xml):
1. Under <domain-controller>, add <remote host="<ip-address-of-master-node>" port="<port>" security-realm="ManagementRealm"/>
2. Under <servers>, add <server name="slave-node" group="server-group" auto-start="true">
3. Under the mod-cluster subsystem, add <mod-cluster-config advertise-socket="modcluster" proxy-list="<ip-address>:<port-in-mod-cluster-config>" connector="ajp">
In mod-cluster configuration:
1. Allow from all
2. ManagerBalancerName server-group (exact name as above)
Also, are you using any virtualization or containers? To deal with session replication problems in such cases, you might need to try out "sticky session".

Related

WebServers cannot connect to app server ELB - AWS

I have a simple deployment with some webservers connected to an AWS ELB. This ELB in turn has some application servers behind it.
The webservers are unable to connect to the application server ELB. The httpd error log is full of:
[Thu Dec 22 15:28:05.897273 2016] [proxy:error] [pid 10188] (70007)The timeout specified has expired: AH00957: HTTP: attempt to connect to 54.254.179.37:80 (elblinkhere) failed
[Thu Dec 22 15:28:05.897348 2016] [proxy:error] [pid 10188] AH00959: ap_proxy_connect_backend disabling worker for (elblinkhere) for 60s
[Thu Dec 22 15:28:05.897361 2016] [proxy_http:error] [pid 10188] [client 10.0.0.54:13789] AH01114: HTTP: failed to make connection to backend: elblinkhere
I have checked whether this is an SELinux issue, but that does not seem to be the case.
I have also read a large number of threads on the internet about this and not come across any solutions.
My question(s):
1. What other methods can I use to resolve this?
2. How do I resolve this?
Did you configure your ELB as external, and did you also open the necessary port in the ELB's security group?

Apache2 Request on SSL waits until time-out expired to return data

I am working with a server that I recently inherited from a departed developer. The server returns XML documents via a RESTful interface over an SSL port. For small documents, the data is returned quickly. For larger documents (say, over 1 MB), the server waits until the server time-out value is exhausted and then returns the data.
I know this because if I set the time-out value to five minutes the data will be returned to a browser in a little over 300 seconds. If I drop the time-out value to two minutes, it will be returned in about 120 seconds. If I drop it to 10 seconds, then the data is returned in about 10 seconds.
Now, if I set my VirtualHost to port 80, the data is returned almost instantly, which is what I expect.
There are a number of diagnostics in the apache log files such as:
[Thu Apr 28 16:46:44.234689 2016] [ssl:info] [pid 22606] (70014)End of file found: [client 172.26.61.243:62030] AH01991: SSL input filter read failed.
[Thu Apr 28 16:46:44.237818 2016] [ssl:debug] [pid 22509] ssl_engine_io.c(1212): (70014)End of file found: [client 172.26.61.243:62030] AH02007: SSL handshake interrupted by system [Hint: Stop button pressed in browser?!]
[Thu Apr 28 16:46:44.569913 2016] [ssl:debug] [pid 22426] ssl_engine_io.c(1212): (70007)The timeout specified has expired: [client 172.26.61.243:62031] AH02007: SSL handshake interrupted by system [Hint: Stop button pressed in browser?!]
I do not know if these are relevant nor where to look for a solution. I have searched the internet, Apache and SSL documentation and found nothing relevant or useful.

mod_jk does not pick up a hostname's new IP when the IP changes in DNS

In Apache, mod_jk does not pick up the new IP of a hostname when the IP changes in DNS.
Version of apache:
Server version: Apache/2.2.15 (Unix)
Server built: Aug 2 2013 08:02:15
Version mod_jk: 1.2.37
Example:
workers.properties
worker.portalconsultoras_prd.type=ajp13
worker.portalconsultoras_prd.host=hostexample.com.br
worker.portalconsultoras_prd.port=8009
This configuration works fine.
But when the IP of the hostname changes in DNS, mod_jk starts failing to connect. Here is the mod_jk log:
[Wed Sep 18 12:00:33 2013] [5315:140659824723936] [info] jk_open_socket::jk_connect.c (627): connect to 107.xx.xx.220:8009 failed (errno=115)
[Wed Sep 18 12:00:33 2013] [5315:140659824723936] [info] ajp_connect_to_endpoint::jk_ajp_common.c (995): Failed opening socket to (107.xx.xxx.220:8009) (errno=115)
[Wed Sep 18 12:00:33 2013] [5315:140659824723936] [error] ajp_send_request::jk_ajp_common.c (1630): (portalconsultoras_prd) connecting to backend failed. Tomcat is probably not started or is listening on the wrong port (errno=115)
I would like an Apache configuration that avoids this problem.
While searching Google for solutions, I turned on "HostnameLookups", but it is inefficient.
Thanks!

Apache Tomcat and Mod_jk

We have been running Apache with Tomcat using mod_jk for about a month now without issues. This morning I started seeing the error below in the mod_jk log files.
I am fairly new to mod_jk and am not sure how to increase the number of connections, see the number of active connections, and/or kill off connections that are idle or dead.
Any ideas/help would be much appreciated.
[Thu Sep 19 11:02:42 2013] [1644:11984] [warn] ajp_get_endpoint::jk_ajp_common.c (3177): Unable to get the free endpoint for worker Worker1 from 10 slots
[Thu Sep 19 11:02:42 2013] [1644:11984] [error] jk_handler::mod_jk.c (2726): Could not get endpoint for worker=Worker1
[Thu Sep 19 11:02:42 2013] [1644:11984] [info] jk_handler::mod_jk.c (2788): Service error=0 for worker=Worker1
So it turns out this issue was a by-product of another configuration issue. We had different Railo contexts configured to point to the same set of shared directories, and some of the contexts mapped to directories within the root context, which caused Java thread locks.

Apache Proxy Error

I am getting the following error intermittently on my server:
**Proxy Error**
The proxy server received an invalid response from an upstream server.
The proxy server could not handle the request GET /.
Reason: Error reading from remote server
The error logs show the following:
[Sun Feb 06 03:06:00 2011] [error] [client 82.43.154.57] proxy: Error reading from remote server returned by /login, referer: https://demo.XXXXX.us/
[Sun Feb 06 03:06:30 2011] [error] [client 82.43.154.57] (70007)The timeout specified has expired: proxy: error reading status line from remote server XXXXX.us
[Sun Feb 06 03:06:30 2011] [error] [client 82.43.154.57] proxy: Error reading from remote server returned by /
[Sun Feb 06 03:13:31 2011] [error] [client 82.43.154.57] (70007)The timeout specified has expired: proxy: error reading status line from remote server XXXXX.us
[Sun Feb 06 03:13:31 2011] [error] [client 82.43.154.57] proxy: Error reading from remote server returned by /
I have read a lot of posts suggesting connection timeout settings in tomcat and environment settings in Apache. I have set the following in httpd.conf:
<VirtualHost *>
SetEnv force-proxy-request-1.0 1
SetEnv proxy-nokeepalive 1
</VirtualHost>
I have also set the following in tomcat server.xml:
<Connector port="9080" maxHttpHeaderSize="8192"
maxThreads="150" minSpareThreads="25" maxSpareThreads="75"
enableLookups="false" redirectPort="9443" acceptCount="100"
connectionTimeout="60000" disableUploadTimeout="true" />
Also, once the error occurs, I have to start a new browser for the error to disappear as it continues to show even on a refresh. Secondly, I am using htaccess to rewrite the url. Don't know if this has any impact on the error?
EDIT>
My server runs with about 150 MB of free memory at normal times, and this can drop quite low, but not at the exact times of the above error. Would this cause such an error?
I would appreciate any ideas people have.
Thank you.
This was an issue with Pear Mailer.
We were using Pear Mailer which uses a queue to stack emails ready for sending with a cron job. There was an error in the Pear script which was being called on every action on our site (making posts, sending messages etc..). Pear was crashing which in turn crashed the browser resulting in the above errors.
Disabling Pear resolved the problem, and tweaking the code got it working again.
It took so long to find the issue as we never thought Pear Mailer could cause such a response.
We had a similar problem on our server after a MySQL crash, and the only solution was to restart the server.