IO thread error: 1595 (Relay log write failure: could not queue event from master) - replication

Slave status:
Last_IO_Errno: 1595
Last_IO_Error: Relay log write failure: could not queue event from master
Last_SQL_Errno: 0
From the error log:
[ERROR] Slave I/O for channel 'db12': Unexpected master's heartbeat data: heartbeat is not compatible with local info; the event's data: log_file_name toku10-bin.000063 log_pos 97223067, Error_code: 1623
[ERROR] Slave I/O for channel 'db12': Relay log write failure: could not queue event from master, Error_code: 1595
I tried restarting the slave IO thread many times, but the error remains the same.

We need to keep starting the io_thread manually whenever it stops; I suspect this is a bug in Percona.
I have written a simple shell script, scheduled every 10 minutes, that checks whether the io_thread is running and, if not, runs START SLAVE IO_THREAD FOR CHANNEL 'db12';. It's working as of now; a minimal sketch follows.
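A sketch of that watchdog, assuming the mysql client can log in via credentials in ~/.my.cnf and that only channel 'db12' needs watching (both are assumptions; adjust the channel name and paths for your setup):

#!/bin/sh
# Restart the replication IO thread for channel 'db12' if it has stopped.
# Assumes ~/.my.cnf provides the login; the channel name is illustrative.
IO_RUNNING=$(mysql -N -e "SHOW SLAVE STATUS FOR CHANNEL 'db12'\G" | awk '/Slave_IO_Running:/ {print $2}')
if [ "$IO_RUNNING" != "Yes" ]; then
    mysql -e "START SLAVE IO_THREAD FOR CHANNEL 'db12';"
fi

Scheduled from cron: */10 * * * * /path/to/check_io_thread.sh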


"ERR WAIT cannot be used with slave instances" with Redisson

<redisson.version>3.16.3</redisson.version>
readMode: SLAVE
Not sure why I am seeing this error.
Caused by: org.redisson.client.RedisException: ERR WAIT cannot be used with slave instances. Please also note that since Redis 4.0 if a slave is configured to be writable (which is not the default) writes to slaves are just local and are not propagated.. channel: [id: 0x90c58500, L:/172.16.189.108:43192 - R:10.112.94.132/10.112.94.132:15003] command: (WAIT), promise: RedissonPromise [promise=ImmediateEventExecutor$ImmediatePromise#35eeb35e(incomplete)], params: [1, 1000]
at org.redisson.client.handler.CommandDecoder.decode(CommandDecoder.java:370)
at org.redisson.client.handler.CommandDecoder.decodeCommandBatch(CommandDecoder.java:271)
at org.redisson.client.handler.CommandDecoder.decodeCommand(CommandDecoder.java:210)
at org.redisson.client.handler.CommandDecoder.decode(CommandDecoder.java:137)
at org.redisson.client.handler.CommandDecoder.decode(CommandDecoder.java:113)
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:502)
at io.netty.handler.codec.ReplayingDecoder.callDecode(ReplayingDecoder.java:366)

I have a RabbitMQ container that regularly hits "unexpected_frame" exceptions. What does that mean?

I have an application that is pushing data into RabbitMQ and then some other apps are subscribing to the different exchanges.
But recently, I keep having errors like this after a few hours:
2020-07-09 12:45:12.670 [error] <0.23578.1> Error on AMQP connection <0.23578.1> (172.18.0.5:48230 ->
172.18.0.3:5672, vhost: '/', user: 'guest', state: running), channel 6:
operation basic.publish caused a connection exception unexpected_frame:
"expected content header for class 60, got non content"
2020-07-09 12:45:12.674 [info] <0.23578.1> closing AMQP connection <0.23578.1> (172.18.0.5:48230 ->
172.18.0.3:5672, vhost: '/'
On the client side, I get messages like this:
"Already closed: The AMQP operation was interrupted: AMQP close-reason, initiated by Peer,
code=505, text='UNEXPECTED_FRAME - expected content body, got non content body frame instead',
classId=60, methodId=40"
This is running in a Docker container.
What could this error be about?
You are sharing one channel across threads for concurrent publishing. AMQP channels are not thread-safe: a publish sends a method frame, a content header frame, and content body frames that must stay contiguous on the wire, and concurrent publishes interleave them, which is exactly what unexpected_frame reports. Serialize access to the channel:
lock (ch) { ch.BasicPublish(exchange, routingKey, props, body); }
Alternatively, give each publishing thread its own channel.

RabbitMQ Ack Timeout

I'm using the RPC pattern to process my objects with RabbitMQ.
To clarify: I have an object, I want its processing to finish, and only after that should the ack be sent to the RPC client.
The ack has a default timeout of about 3 minutes in my setup.
My processing takes a long time.
How can I change this timeout for the ack of each object, or how else should I handle long-running processing like this?
Modern versions of RabbitMQ have a delivery acknowledgement timeout:
In modern RabbitMQ versions, a timeout is enforced on consumer delivery acknowledgement. This helps detect buggy (stuck) consumers that never acknowledge deliveries. Such consumers can affect a node's on-disk data compaction and potentially drive nodes out of disk space.
If a consumer does not ack its delivery for more than the timeout value (30 minutes by default), its channel will be closed with a PRECONDITION_FAILED channel exception. The error will be logged by the node that the consumer was connected to.
The error message will be:
Channel error on connection <####> :
operation none caused a channel exception precondition_failed: consumer ack timed out on channel 1
The timeout is 30 minutes (1,800,000 ms) by default (see note 1) and is configured by the consumer_timeout parameter in rabbitmq.conf.
Note 1: the timeout was 15 minutes (900,000 ms) before RabbitMQ 3.8.17.
If you run RabbitMQ in Docker, you can mount rabbitmq.conf as a volume, create the file inside that volume, and set consumer_timeout there.
For example, in docker-compose:
version: "2.4"
services:
rabbitmq:
image: rabbitmq:3.9.13-management-alpine
network_mode: host
container_name: 'you name'
ports:
- 5672:5672
- 15672:15672 ----- if you use gui for rabbit
volumes:
- /etc/rabbitmq/rabbitmq.conf:/etc/rabbitmq/rabbitmq.conf
You then need to create the rabbitmq.conf file on your host in /etc/rabbitmq/ (the path mounted above); an example follows.
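For example, a minimal rabbitmq.conf that raises the ack timeout to one hour (the value is illustrative; size it to your actual processing time):

consumer_timeout = 3600000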
Documentation with the available parameters: https://github.com/rabbitmq/rabbitmq-server/blob/v3.8.x/deps/rabbit/docs/rabbitmq.conf.example

ActiveMQ DB persistence does not reconnect when an HA cluster DB failover occurs

Whenever an HA failover occurs, the ActiveMQ broker gets stuck with the messages it holds and cannot recover by itself.
Messages are processed successfully once we restart ActiveMQ.
A bean was created to stop and start the connectors in case of IOExceptions:
bean id="ioExceptionHandler" class="org.apache.activemq.util.DefaultIOExceptionHandler"
property name="ignoreSQLExceptions"value=false property
property name="stopStartConnectors" value=true property
bean
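Per the ActiveMQ configuration examples, the handler bean also has to be referenced from the broker element in activemq.xml; a rough sketch (the brokerName is illustrative):

<broker xmlns="http://activemq.apache.org/schema/core" brokerName="localhost" ioExceptionHandler="#ioExceptionHandler">
  ...
</broker>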
We get "connection is closed" exceptions when this failover occurs, as below:
Initiating stop/restart of transports on BrokerService[localhost] due to IO exception, java.io.IOException: The connection is closed. | org.apache.activemq.util.DefaultIOExceptionHandler | ActiveMQ Transport: tcp:///hostname:52272#8501
java.io.IOException: The connection is closed.
It then tries to restart the transport connectors as a result of this configuration, but is not able to continue further:
INFO | waiting for broker persistence adapter checkpoint to succeed before restarting transports | org.apache.activemq.util.DefaultIOExceptionHandler | IOExceptionHandler: restart transports
Please let us know if any configuration is required for the broker to restart and process the messages it holds.

DataNode and NodeManager are not starting in pseudo-cluster mode (Apache Hadoop)

The DataNode and NodeManager are not starting in pseudo-cluster mode (Apache Hadoop).
I am seeing this error in the log file:
2017-08-22 17:15:08,403 ERROR org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Unexpected error starting NodeStatusUpdater
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved SHUTDOWN signal from Resourcemanager ,Registration of NodeManager failed, Message from ResourceManager: NodeManager from archit doesn't satisfy minimum allocations, Sending SHUTDOWN signal to the NodeManager.
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:278)
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:197)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:272)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:496)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:543)
2017-08-22 17:15:08,404 INFO org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl failed in state STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException:
The "ResourceManager: NodeManager from archit doesn't satisfy minimum allocations" error is seen when node on which node manager is being started does not have enough resources w.r.t yarn.scheduler.minimum-allocation-vcores and yarn.scheduler.minimum-allocation-mb configurations.
Reduce values of yarn.scheduler.minimum-allocation-vcores and yarn.scheduler.minimum-allocation-mb then restart yarn.
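A minimal yarn-site.xml sketch of that change (the values are illustrative; set them at or below what the NodeManager host can actually offer):

<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>512</value>
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-vcores</name>
  <value>1</value>
</property>

Then restart YARN (stop-yarn.sh, then start-yarn.sh) and start the NodeManager again.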