I have an ActiveMQ installation with master/slave failover.
Master and slave coordinate ownership of the broker lock using the lease-database-locker.
Master and slave run on two different machines, and the database is located on a third machine.
Failover and client reconnection work properly on a forced shutdown of the master broker: the slave takes over and the clients reconnect thanks to their failover transport setting.
The problems start when I simulate a network outage on the master broker only. This is done with an iptables DROP rule on the master for packets going to the database.
The master then realizes that it can no longer connect to the database. The slave starts up, since its network connection is still alive.
From the logs it seems that the clients still try to reconnect to the non-responding master.
As I understand it, the master should inform the clients that it has lost its connection, so that they fail over and reconnect to the slave.
But this is not happening.
The clients do reconnect to the slave if I re-establish the DB connection by re-enabling the master's network connection to the database; the master then gives up being the master.
I have set a queryTimeout on the lease-database-locker.
I have set updateClusterClients=true on the transport connector.
I have set a validationQueryTimeout of 10s on the DB connection.
I have set testOnBorrow on the DB connection.
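For reference, these settings live on the pooled JDBC data source bean that the persistence adapter points at. A minimal sketch, assuming a commons-dbcp2 BasicDataSource (driver class, URL and credentials are placeholders):
<bean id="mysql-ds-db01-st" class="org.apache.commons.dbcp2.BasicDataSource" destroy-method="close">
  <!-- connection details are placeholders -->
  <property name="driverClassName" value="com.mysql.jdbc.Driver"/>
  <property name="url" value="jdbc:mysql://db-host:13306/activemq"/>
  <property name="username" value="activemq"/>
  <property name="password" value="secret"/>
  <!-- the validation settings mentioned above -->
  <property name="validationQuery" value="SELECT 1"/>
  <property name="validationQueryTimeout" value="10"/>
  <property name="testOnBorrow" value="true"/>
</bean>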
Is there a way to force the master to inform the clients to fail over in this particular case?
After some digging I found the trick.
The broker was not informing the clients due to a missing ioExceptionHandler configuration.
The documentation can be found here
http://activemq.apache.org/configurable-ioexception-handling.html
I needed to specify
<bean id="ioExceptionHandler" class="org.apache.activemq.util.LeaseLockerIOExceptionHandler">
<property name="stopStartConnectors"><value>true</value></property>
<property name="resumeCheckSleepPeriod"><value>5000</value></property>
</bean>
and tell the broker to use the Handler
<broker xmlns="http://activemq.apache.org/schema/core" ....
ioExceptionHandler="#ioExceptionHandler" >
In order to produce an error on network outages I also had to set a queryTimeout on the lease query:
<jdbcPersistenceAdapter dataDirectory="${activemq.base}/data" dataSource="#mysql-ds-db01-st" lockKeepAlivePeriod="3000">
  <locker>
    <lease-database-locker lockAcquireSleepInterval="10000" queryTimeout="8" />
  </locker>
</jdbcPersistenceAdapter>
This will produce an SQL exception if the query takes too long due to a network outage.
I tested the network outage by dropping packets to the database with an iptables rule:
/sbin/iptables -A OUTPUT -p tcp --destination-port 13306 -j DROP
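To re-establish the database connection afterwards, the same rule can simply be removed again (same port as above):
/sbin/iptables -D OUTPUT -p tcp --destination-port 13306 -j DROP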
Sounds like your client doesn't have the address of the slave in its URI, so it doesn't know where to reconnect to. The master broker doesn't inform the client where the slave is, because it doesn't know whether there are slaves or where on the network they might be; and even if it did, that would be unreliable depending on the conditions that caused the master broker to drop in the first place.
You need to provide the client with the connection information for the master and the slave in the failover URI.
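For example, a client connection URI listing both brokers could look like this (hostnames and ports are placeholders); the client then decides itself where to reconnect:
failover:(tcp://master-host:61616,tcp://slave-host:61616)?randomize=false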
I would like to set up a basic 3-node Redis Sentinel setup using the new TLS features of Redis 6. Unfortunately, it doesn't seem like Redis 6 Sentinel is smart enough to speak TLS to clients.
Does anyone know of a way to do this, or if it's not possible, if there are any mentions online about adding support for this in the future? It seems a shame to have these nice TLS features and not be able to use them with Redis' own tools.
I am aware that in the past people have used Stunnel to do this. With TLS support added to Redis, I am only interested in doing this if it can be done without third-party additions.
My setup:
3 Redis servers (6.0-rc, last pulled last week), running TLS with the test certs as specified in the Redis docs - one master and 2 replicas
3 Sentinels (6.0-rc, also last pulled last week), not running TLS on their ports (I would like to, but that's a secondary problem)
What I've Tried:
Pointing Sentinel to the Redis TLS port - this results in lots of TLS errors in Redis' logs about incorrect TLS version received, as Sentinel is not speaking TLS to Redis. Since it fails, Sentinel thinks the master is down.
Adding "https://" in the Sentinel config in front of the master IP - this results in Sentinel refusing to run, saying it can't find the master hostname.
Adding TLS options to Sentinel - this results in Sentinel trying to talk TLS on its ports, but not to clients, which doesn't help. I couldn't find any options specifically about making Sentinel speak TLS to clients.
Pointing Sentinel to the Redis non-TLS port (not ideal; I would rather only have the TLS port open) - this results in Sentinel reporting the wrong (non-TLS) port for the master to the simple Python client I'm testing with (it literally just tries to get master info from Sentinel), and I want the client to talk to Redis over TLS for obvious reasons.
Adding the "replica-announce-port" directive to Redis, with Sentinel still pointed to the non-TLS port - this fails in 2 ways: the master port is still reported incorrectly as the non-TLS port (apparently because the master is not a replica, so the directive does not apply), and Sentinel now thinks the replicas are both down (because the TLS port is announced, replicas are auto-discovered, and Sentinel can't speak to the replicas on the TLS port).
I am aware of this StackOverflow question (Redis Sentinel and TLS) - it is old and asks about Redis 4, so it's not the same.
I did figure this out and forgot to post the answer earlier: The piece I was missing was that I needed to set the tls-replication yes option on both the Redis and Sentinel servers.
Previously, I had only set it on the Redis servers, as they were the only ones that needed to do replication over TLS. But for some reason, that particular option is what is needed to actually make Sentinel speak TLS to Redis.
So overall, for TLS options, both sides of the equation needed:
tls-port <port>
port 0
tls-auth-clients yes
tls-ca-cert-file <file>
tls-key-file <file>
tls-cert-file <file>
tls-replication yes
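Put together, a sentinel.conf sketch using these options could look like this (certificate paths, addresses, master name and quorum are placeholders):
port 0
tls-port 26379
tls-replication yes
tls-auth-clients yes
tls-ca-cert-file /path/to/ca.crt
tls-cert-file /path/to/sentinel.crt
tls-key-file /path/to/sentinel.key
sentinel monitor mymaster 10.0.0.1 6379 2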
Try adding the tls-port option to sentinel.conf, as it seems to enable TLS support in general; the documentation states the same. For me, the two statements below, added to sentinel.conf on top of the rest of the TLS configuration, actually did the trick.
tls-port 26379
port 0
We had multiple clients configured to talk to this cluster of Aerospike nodes. Even after we removed the configuration from all the clients we are aware of, some read/write requests are still coming to this cluster, as shown in AMC.
I looked at the log file generated in /var/log/aerospike/aerospike.log, but could not find any relevant information.
Update
The netstat command mentioned in the answer by #kporter shows the number of connections, with statuses like ESTABLISHED, TIME_WAIT, and CLOSE_WAIT. But that does not mean those connections are currently being used for get/set operations. How do I get the IPs from which Aerospike operations are currently being performed?
Update 2 (Solved)
As mentioned in the comments on #kporter's answer, a tcpdump capture on the culprit client showed packets still being sent to the Aerospike cluster that was no longer referenced in the config file. This was happening even though the AMC for that cluster no longer showed any read/write TPS.
I later found that this traffic stopped after restarting the nginx service on the client. Note that the client's config file now references a new Aerospike cluster, and packets sent to that cluster did not stop after the nginx restart. This is odd, but it worked.
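For anyone else chasing this kind of stray traffic: a capture along the following lines on the suspected client is enough to see whether it is still talking to the old cluster (interface and node address are placeholders):
tcpdump -i any -nn host 10.0.0.10 and port 3000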
Clients connect to Aerospike over port 3000.
The following command, when run on the server nodes, shows the addresses of hosts connecting to the server over port 3000:
netstat --tcp --numeric-ports | grep 3000
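If you only want the unique client addresses behind those connections, something like this can be piped on top (assumes IPv4 and the standard netstat column layout, where column 4 is the local address and column 5 the foreign address):
netstat --tcp --numeric-ports | awk '$4 ~ /:3000$/ {split($5, a, ":"); print a[1]}' | sort -u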
I am seeing strange behavior in ActiveMQ with network connectors. Here is the setup:
Broker A listening for connections via a nio transport connector on 61616
Broker B establishing a duplex connection to broker A
A producer on A sends messages to a known queue, say q1
A consumer on B subscribes to the queue q1
I can clearly see that the duplex connection is established but the consumer on B doesn't receive any message.
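For context, the duplex bridge on broker B is declared in the usual way, roughly like this (broker A's hostname is a placeholder):
<networkConnectors>
  <networkConnector name="bridge-to-A" uri="static:(tcp://broker-a-host:61616)" duplex="true"/>
</networkConnectors>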
In JConsole I can see that broker A is sending messages to the network consumer, up to the value of the prefetch limit (1000 messages), which seems fine. The "DispatchedQueue", "DispatchedQueueSize", and, more importantly, the "MessageCountAwaitingAck" counters all have the same value: they are stuck at 1000.
On the broker B, the queue size is 0.
At the system level, I can clearly see an established connection between broker A and broker B:
# On broker A (192.168.x.x)
$ netstat -t -p -n
tcp 89984 135488 192.168.x.x:61616 172.31.x.x:57270 ESTABLISHED 18591/java
# On broker B (172.31.x.x)
$ netstat -t -p -n
tcp 102604 101144 172.31.x.x:57270 192.168.x.x:61616 ESTABLISHED 32455/java
Weird thing: the Recv-Q and Send-Q on both brokers A and B seem to hold data not read by the other side. They don't increase or decrease; they are just stuck at these values.
The ActiveMQ logs on both sides don't say much, even at TRACE level.
It seems like neither broker A nor broker B is sending acks for the messages to the other side.
How is that possible? What's a potential cause and fix?
EDIT: I should add that I'm using an embedded ActiveMQ 5.13.4 on both sides.
Thanks.
I'm using Redis 3.2.0 with replication enabled, but "info replication" reports:
master_link_status:down
Redis log shows:
Connecting to MASTER master_host:6379
MASTER <-> SLAVE sync started
...
Timeout connecting to the MASTER
Connecting to MASTER master_host:6379
...
Ping and telnet to port 6379 of the master host from the slave host both succeed.
So I suspect the Redis process on the slave host is trying to connect to the master host via the wrong network interface (the slave host has multiple network interfaces).
Can I specify the network interface used for Redis replication?
When Redis connects to the master host, the client socket is bound to the address specified as the first argument of the "bind" parameter.
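In other words, on the slave you can list the interface you want replication traffic to originate from as the first address in redis.conf (addresses below are just examples):
# slave's redis.conf: outgoing connections to the master will be made from 10.0.1.5
bind 10.0.1.5 127.0.0.1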
I have a Redis cluster of three instances powered by Redis Sentinel, running as [master, slave, slave].
An HAProxy instance is also running to route traffic to the master node; the two slaves are read-only and used by other applications.
It was easy to configure HAProxy to select the master node when the same auth key was used for all instances, but now every instance has an auth key different from the others.
#listen redis-16
bind ip_address:6379 name redis
mode tcp
default_backend bk_redis_16
backend bk_redis_16
# mode tcp
option tcp-check
tcp-check connect
tcp-check send AUTH\ auth_key\r\n
tcp-check send PING\r\n
tcp-check expect string +PONG
tcp-check send info\ replication\r\n
tcp-check expect string role:master
tcp-check send QUIT\r\n
tcp-check expect string +OK
server R1 ip_address:6379 check inter 1s
server R2 ip_address:6380 check inter 1s
server R3 ip_address:6381 check inter 1s
So the above config works only when we have one password across {R1, R2, R3}. How do I configure HAProxy for different passwords?
I mean, how do I make HAProxy use each server's own auth key, like the following:
R1 : abc
R2 : klm
R3 : xyz
You have two primary options:
Set up an HA Proxy config for each set of servers which have different passwords.
Set up HA Proxy to not use auth but rather pass all connections through transparently.
You have other problems with the setup you list. Your read-only slaves will not have a role of "master". Thus even if you could assign each a different password, your check would refuse the connection. Also, in the case of a partition your check will allow split-brain conditions.
When using HA Proxy in front of a Sentinel-managed Redis pod[1], if you want HA Proxy to figure out where to route connections, you must have it check all Sentinels to ensure that the instance it routes to is the one a majority of Sentinels agree is the master. Otherwise you can suffer from split-brain, where two or more instances report themselves as the master. There is actually a moment after a failover when you can see this happen.
If your master goes down and a slave is promoted, when the master comes back up it will report itself as master until Sentinel detects the master and reconfigures it to be a slave. During this time your HA Proxy check will send writes to the original master. These writes will be lost when Sentinel reconfigures it to be a slave.
For the case of option 1:
You can either run separately configured instances of HA Proxy, or set up multiple front ends and back ends (paired up) in one instance. Personally I'd go with multiple instances of HA Proxy, as it lets you manage them without them interfering with each other.
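A rough sketch of one such front end / back end pair, with the back end carrying its own AUTH key in its health check (the front end port is a placeholder; repeat the pair for R2 and R3 with their keys):
frontend redis-r1
    bind ip_address:6391
    mode tcp
    default_backend bk_redis_r1
backend bk_redis_r1
    mode tcp
    option tcp-check
    tcp-check connect
    tcp-check send AUTH\ abc\r\n
    tcp-check expect string +OK
    tcp-check send PING\r\n
    tcp-check expect string +PONG
    server R1 ip_address:6379 check inter 1s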
For the case of option 2:
You'll need to glue Sentinel's notification mechanism to HA Proxy being reconfigured. This can easily be done using a script triggered on Sentinel to reach out and reconfigure HA Proxy on the switch-master event. The details on doing this are at http://redis.io/topics/sentinel and more directly at the bottom of the example file found at http://download.redis.io/redis-stable/sentinel.conf
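The relevant hook is Sentinel's client-reconfig-script directive; a minimal sketch, with the master name and script path as placeholders:
# sentinel.conf on each Sentinel: run this script whenever a failover promotes a new master
sentinel client-reconfig-script mymaster /usr/local/bin/reconfigure-haproxy.sh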
In a Redis Pod + Sentinel setup with direct connectivity the clients are able to gather the information needed to determine where to connect to. When you place a non-transparent proxy in between them your proxy needs to be able to make those decisions - or have them made for it when topology changes occur - on behalf of the client.
Note: what you describe is not a Redis cluster, it is a replication setup. A Redis cluster is entirely different. I use the term "pod" to apply to a replication based setup.