Python program using the STOMP protocol to connect to ActiveMQ keeps disconnecting

Below are the connection parameters the Python program uses to connect to ActiveMQ:
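import stomp

# config_params is defined elsewhere in the program (not shown in the question)
broker_url = config_params.items('BROKERS')
conn = stomp.Connection12(
    broker_url,
    reconnect_sleep_initial=20.0,
    reconnect_sleep_increase=2.0,
    reconnect_attempts_max=10,
    heartbeats=(60000, 60000),  # milliseconds: (client will send, client wants to receive)
)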
So the ReadCheckInterval and WriteCheckInterval are both set to 1 minute for the connection. It looks like heartbeats are being missed. I am trying to figure out whether they are being missed by the client or by the ActiveMQ server. Can someone help me?
Below are the logs from the Python program:
2020-02-25 12:27:16,141 - INFO - Attempting connection to host
2020-02-25 12:27:16,142 - INFO - Established connection to host
2020-02-25 12:27:16,142 - INFO - Starting receiver loop
2020-02-25 12:27:16,143 - DEBUG - Sending frame: ['STOMP', '\n', 'accept-version:1.2\n', 'client-id:\n', 'heart-beat:60000,60000\n']
2020-02-25 12:27:16,143 - DEBUG - Received frame: 'CONNECTED', headers={'server': 'ActiveMQ/5.15.2', 'heart-beat': '60000,60000'}, body=''
2020-02-25 12:27:16,143 - DEBUG - Sending frame: ['SUBSCRIBE', '\n', 'ack:auto\n', 'activemq.subscriptionName:subscriber\n']
2020-02-25 12:30:16,144 - DEBUG - Received frame: 'heartbeat', headers={}, body=None
2020-02-25 12:30:16,145 - ERROR - disconnected from broker, will attempt to reconnect...
2020-02-25 12:30:16,145 - INFO - Receiver loop ended
2020-02-25 12:30:16,320 - INFO - Attempting connection to host
2020-02-25 12:30:16,321 - INFO - Established connection to host
2020-02-25 12:30:16,321 - INFO - Starting receiver loop
2020-02-25 12:30:16,321 - DEBUG - Sending frame: ['STOMP', '\n', 'accept-version:1.2\n', 'client-id:\n', 'heart-beat:60000,60000\n']
2020-02-25 12:30:16,322 - DEBUG - Received frame: 'CONNECTED', headers={'server': 'ActiveMQ/5.15.2', 'heart-beat': '60000,60000'}, body=''
2020-02-25 12:30:16,322 - DEBUG - Sending frame: ['SUBSCRIBE', '\n', 'ack:auto\n', 'activemq.subscriptionName:subscriber\n']
I see both the client and the server failing to send heartbeats to each other. Below is a log in which the client missed sending a heartbeat. The connection is established at 12:03:32. The client sends its STOMP frame with the heart-beat header at 12:03:32 and then subscribes to the ActiveMQ destination. It keeps receiving messages, so there is activity, until 12:12:08. A period of inactivity follows until 12:13:32 (more than 60 seconds), and the connection is terminated. Is the problem that the ActiveMQ server is not tolerant enough of missed heartbeats from the client? Would increasing the client's heartbeat interval to 120 seconds help in this case?
2020-02-26 12:03:32,498 - INFO - Established connection to host, port 61613
2020-02-26 12:03:32,499 - INFO - Sending frame: 'STOMP', headers={'heart-beat': '60000,60000'}
2020-02-26 12:03:32,512 - INFO - Received frame: 'CONNECTED', headers={'heart-beat': '60000,60000'}
2020-02-26 12:03:32,513 - INFO - Sending frame: 'SUBSCRIBE'
2020-02-26 12:04:27,924 - INFO - Received frame: 'MESSAGE'
.
.
2020-02-26 12:12:08,475 - INFO - Received frame: 'MESSAGE'
2020-02-26 12:13:32,519 - INFO - Received frame: 'heartbeat'
2020-02-26 12:13:32,548 - ERROR - disconnected from broker
I also see the server failing to send heartbeats, with the client then getting a heartbeat timeout error. I am thinking of disabling heartbeats from the server by setting the heartbeat configuration to (120000,0). Any suggestions?

After some testing, it turned out that even a delay of a few milliseconds in the client heartbeat caused the broker to close the connection.
To address exactly this, ActiveMQ 5.9.0 added the transport.hbGracePeriodMultiplier option (default=1), which multiplies the heartbeat timeout by the configured value. The JIRA issue under which this feature was implemented is:
https://issues.apache.org/jira/browse/AMQ-4674
I've also disabled the broker-to-client heartbeat by setting the heartbeats to (60000,0), as it was redundant.
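As a rough illustration of why a slightly late heartbeat matters, the sketch below models the broker's read-check window as the client's heartbeat interval multiplied by the grace multiplier. This is a simplified model, not ActiveMQ's actual timer code; the function names, the 5 ms delay, and the multiplier value 1.5 are assumptions chosen for the example.

```python
def broker_read_window_ms(client_heartbeat_ms, grace_multiplier=1.0):
    """Longest client silence the broker tolerates in this simplified model."""
    return client_heartbeat_ms * grace_multiplier


def connection_survives(silence_ms, client_heartbeat_ms, grace_multiplier=1.0):
    """True if the client's silence still fits inside the broker's window."""
    return silence_ms <= broker_read_window_ms(client_heartbeat_ms, grace_multiplier)


# A heartbeat that arrives 5 ms late (hypothetical delay) with the default multiplier of 1:
print(connection_survives(60005, 60000))        # False
# The same delay with an illustrative multiplier of 1.5 (window becomes 90000 ms):
print(connection_survives(60005, 60000, 1.5))   # True
```

On the broker side, the option belongs on the STOMP transport connector URI as a query parameter, for example `?transport.hbGracePeriodMultiplier=1.5`.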

The first value in the CONNECT heart-beat header is the 'will send' value for client heartbeats to the broker. The client should attempt to maintain a consistent heartbeat at the indicated interval, which is defined as the
"smallest number of milliseconds between heart-beats that it can guarantee".
The broker allows some grace period based on that value; if the client has not sent a heartbeat or any other frame by then, the connection is closed. In the trace given, the client is not sending any heartbeats or other wire-level activity, so the broker drops the connection.
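The negotiation described above can be sketched as follows, using the rule from the STOMP 1.2 specification: each direction is disabled if either side advertises 0 for it, and otherwise uses the larger of the two values. The function name is hypothetical, and the header strings are the ones seen in the logs plus the (60000,0) setting discussed earlier.

```python
def negotiate_heartbeats(client_header, server_header):
    """Return (client_to_broker_ms, broker_to_client_ms); 0 means disabled.

    client_header is the CONNECT/STOMP frame value "cx,cy";
    server_header is the CONNECTED frame value "sx,sy".
    """
    cx, cy = (int(v) for v in client_header.split(","))
    sx, sy = (int(v) for v in server_header.split(","))
    # Client -> broker: the client must be able to send (cx) and the broker must want it (sy).
    client_to_broker = max(cx, sy) if cx and sy else 0
    # Broker -> client: the broker must be able to send (sx) and the client must want it (cy).
    broker_to_client = max(sx, cy) if sx and cy else 0
    return client_to_broker, broker_to_client


print(negotiate_heartbeats("60000,60000", "60000,60000"))  # (60000, 60000)
print(negotiate_heartbeats("60000,0", "60000,60000"))      # (60000, 0)
```

With "60000,0" the client still promises a heartbeat every 60 seconds but no longer expects any from the broker, which removes one of the two timeouts that can fire.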

Related

Using Bitronix manager, XA not working with my custom DevKit adapter

I developed a custom connector with DevKit. The connector acts as a source: it connects to an EJB, extracts the data, and sends it to another endpoint.
I am using Bitronix as the transaction manager.
I used the code below to register my EJB in the Mule transaction context.
public static void registerXaResource(MuleContext muleContext) {
    EJBClientTransactionContext txContext = EJBClientTransactionContext.create(
            muleContext.getTransactionManager(), getSynchronizationRegistry());
    EJBClientTransactionContext.setGlobalContext(txContext);
    XaResourceProducer.registerXAResource("dummyResource", new DummyXaResource());
}

/**
 * @return the TransactionSynchronizationRegistry obtained from Bitronix
 */
private static TransactionSynchronizationRegistry getSynchronizationRegistry() {
    return TransactionManagerServices.getTransactionSynchronizationRegistry();
}
After that, the next endpoint is JMS, configured with XA and "always join".
But it does not behave as an XA transaction.
It looks like Bitronix is delisting the JMS resource.
2019-12-11 16:59:48,398 [Receiving Thread] DEBUG bitronix.tm.resource.jms.DualSessionWrapper - choosing XA session
2019-12-11 16:59:48,410 [Receiving Thread] DEBUG bitronix.tm.resource.jms.DualSessionWrapper - looking for producer based on a MessageProducerConsumerKey on ActiveMQQueue[sampleReplyQueue]
2019-12-11 16:59:48,410 [Receiving Thread] DEBUG bitronix.tm.resource.jms.DualSessionWrapper - found no producer based on a MessageProducerConsumerKey on ActiveMQQueue[sampleReplyQueue], creating it
2019-12-11 16:59:48,411 [Receiving Thread] DEBUG bitronix.tm.resource.jms.DualSessionWrapper - choosing XA session
2019-12-11 16:59:48,447 [Receiving Thread] DEBUG bitronix.tm.resource.jms.DualSessionWrapper - closing a DualSessionWrapper in state ACCESSIBLE of a JmsPooledConnection of pool 1605822565-inboundtest-JMS in state ACCESSIBLE with underlying connection org.apache.activemq.artemis.jms.client.ActiveMQXAConnection@207dd1b7
2019-12-11 16:59:48,447 [Receiving Thread] DEBUG bitronix.tm.resource.common.TransactionContextHelper - delisting a DualSessionWrapper in state ACCESSIBLE of a JmsPooledConnection of pool 1605822565-inboundtest-JMS in state ACCESSIBLE with underlying connection org.apache.activemq.artemis.jms.client.ActiveMQXAConnection@207dd1b7 from a Bitronix Transaction with GTRID [31363035383232353635000000002582E13C00000001], status=ACTIVE, 1 resource(s) enlisted (started Thu Jan 08 12:18:54 IST 1970)
2019-12-11 16:59:48,447 [Receiving Thread] DEBUG bitronix.tm.resource.common.TransactionContextHelper - resource is not in enlisting global transaction context: a DualSessionWrapper in state ACCESSIBLE of a JmsPooledConnection of pool 1605822565-inboundtest-JMS in state ACCESSIBLE with underlying connection org.apache.activemq.artemis.jms.client.ActiveMQXAConnection@207dd1b7
2019-12-11 16:59:48,447 [Receiving Thread] DEBUG bitronix.tm.resource.common.TransactionContextHelper - requeuing a DualSessionWrapper in state ACCESSIBLE of a JmsPooledConnection of pool 1605822565-inboundtest-JMS in state ACCESSIBLE with underlying connection org.apache.activemq.artemis.jms.client.ActiveMQXAConnection@207dd1b7 from a Bitronix Transaction with GTRID [31363035383232353635000000002582E13C00000001], status=ACTIVE, 1 resource(s) enlisted (started Thu Jan 08 12:18:54 IST 1970)
2019-12-11 16:59:48,447 [Receiving Thread] DEBUG bitronix.tm.resource.common.TransactionContextHelper - resource is not in enlisting global transaction context: a DualSessionWrapper in state ACCESSIBLE of a JmsPooledConnection of pool 1605822565-inboundtest-JMS in state ACCESSIBLE with underlying connection org.apache.activemq.artemis.jms.client.ActiveMQXAConnection@207dd1b7
According to the logs, the JMS resource does not take part in the transaction that I begin.
Otherwise, what is the right way to implement XA in a custom Mule connector?
DevKit doesn't support transactions. Registering the resource in that way is probably not enough to fully implement an XA transaction.
The SDK for Mule 4 does support transactions, though I understand this is not the version you are interested in.

Asterisk: "TLS clean shutdown alert reading data" after 120s in SIP call

I am using a Secure SIP trunk provided by Twilio to implement an IVR. I have set it up per Twilio's Asterisk configuration guide, installed SRTP to /usr/local/lib, and applied the configuration in https://wiki.asterisk.org/wiki/display/AST/Secure+Calling+Tutorial.
The problem is that any call longer than 2 minutes cannot be ended cleanly and causes Asterisk to restart.
sip.conf (using chan_sip, not pjsip):
[general]
; other configuration lines removed
tlsenable=yes
tlsbindaddr=0.0.0.0
tlscertfile=/etc/pki/tls/private/pbx.pem
tlscafile=/etc/pki/tls/private/gd_bundle-g2-g1.crt
tlscipher=ALL
tlsclientmethod=tlsv1
tlsdontverifyserver=yes
[twilio-trunk](!)
type=peer
context=from-twilio ;Which dialplan to use for incoming calls
dtmfmode=rfc4733
canreinvite=no
insecure=port,invite
transport=tls
qualify=yes
encryption=yes
media_encryption=sdes
I can make and receive calls just fine, and I have confirmed the calls are encrypted both via wireshark and confirmation from Twilio's own support queue.
At exactly 120 seconds into every call, this debug pops up:
[Dec 6 13:14:39] DEBUG[30015]: iostream.c:157 iostream_read: TLS clean shutdown alert reading data
[Dec 6 13:14:39] DEBUG[30015]: chan_sip.c:2905 sip_tcptls_read: SIP TCP/TLS server has shut down
The call continues to flow bi-directionally, the caller never knows there is a problem until they hit a hangup in context, i.e. h,1,Hangup(). Then Asterisk is restarted (new PID) and the caller hangs in limbo for another 5 minutes before the call times out with a fast busy. Twilio confirms they see the BYE and return an ACK at the point of the Hangup.
I was on 13.11 and updated to 15.1.3, same result. Calls longer than 120s result in TLS message in debug and Asterisk restarts.
No Google query results out there. Twilio hasn't been real helpful. Can anyone shed some light on what is happening and where I need to look next?
More logs:
[Dec 8 10:18:48] DEBUG[4993][C-00000001]: channel.c:5551 set_format: Channel SIP/twilio0-00000000 setting write format path: gsm -> ulaw
[Dec 8 10:18:48] DEBUG[4993][C-00000001]: res_rtp_asterisk.c:4017 rtp_raw_write: Difference is 2472, ms is 329
[Dec 8 10:18:48] DEBUG[4993][C-00000001]: channel.c:3192 ast_settimeout_full: Scheduling timer at (50 requested / 50 actual) timer ticks per second
-- <SIP/twilio0-00000000> Playing 'IVR/omnicare_9d_account.gsm' (language 'en')
[Dec 8 10:18:48] DEBUG[4993][C-00000001]: res_rtp_asterisk.c:4928 ast_rtcp_interpret: Got RTCP report of 64 bytes from 34.203.250.7:10475
[Dec 8 10:18:53] DEBUG[4993][C-00000001]: res_rtp_asterisk.c:4928 ast_rtcp_interpret: Got RTCP report of 64 bytes from 34.203.250.7:10475
[Dec 8 10:18:55] DEBUG[4992]: iostream.c:157 iostream_read: TLS clean shutdown alert reading data
[Dec 8 10:18:55] DEBUG[4992]: chan_sip.c:2905 sip_tcptls_read: SIP TCP/TLS server has shut down
[Dec 8 10:18:58] DEBUG[4993][C-00000001]: channel.c:3192 ast_settimeout_full: Scheduling timer at (0 requested / 0 actual) timer ticks per second
[Dec 8 10:18:58] DEBUG[4993][C-00000001]: channel.c:3192 ast_settimeout_full: Scheduling timer at (0 requested / 0 actual) timer ticks per second
[Dec 8 10:18:58] DEBUG[4993][C-00000001]: channel.c:3192 ast_settimeout_full: Scheduling timer at (0 requested / 0 actual) timer ticks per second
[Dec 8 10:18:58] DEBUG[4993][C-00000001]: channel.c:5551 set_format: Channel SIP/twilio0-00000000 setting write format path: ulaw -> ulaw
[Dec 8 10:18:58] DEBUG[4993][C-00000001]: res_rtp_asterisk.c:4928 ast_rtcp_interpret: Got RTCP report of 64 bytes from 34.203.250.7:10475
[Dec 8 10:19:01] DEBUG[4914]: cdr.c:4305 ast_cdr_engine_term: CDR Engine termination request received; waiting on messages…
Asterisk uncleanly ending (0).
Executing last minute cleanups
== Destroying musiconhold processes
[Dec 8 10:19:01] DEBUG[4914]: res_musiconhold.c:1627 moh_class_destructor: Destroying MOH class 'default'
[Dec 8 10:19:01] DEBUG[4914]: cdr.c:1289 cdr_object_finalize: Finalized CDR for SIP/twilio0-00000000 - start 1512749813.880448 answer 1512749813.881198 end 1512749941.201797 dispo ANSWERED
== Manager unregistered action DBGet
== Manager unregistered action DBPut
== Manager unregistered action DBDel
== Manager unregistered action DBDelTree
[Dec 8 10:19:01] DEBUG[4914]: asterisk.c:2157 really_quit: Asterisk ending (0).
Check your firewall logs. We've had issues with sessions being torn down by firewalls that thought the NAT entries were stale/old.
You can also try configuring Asterisk to send keep-alive packets by setting the options qualify=yes and nat=yes in the sip.conf entry for that user/trunk, or inside the RTP stream with rtpkeepalive=<secs>. The best docs I could find for sip.conf are the example config on GitHub.
I dug in the source code for the text "TLS clean shutdown alert reading data", which pointed me to some OpenSSL docs which suggest a clean/normal closure (which I'm guessing was caused by your firewall):
The TLS/SSL connection has been closed. If the protocol version is SSL 3.0 or higher, this result code is returned only if a closure alert has occurred in the protocol, i.e. if the connection has been closed cleanly. Note that in this case SSL_ERROR_ZERO_RETURN does not necessarily indicate that the underlying transport has been closed.

Getting error while connecting to restcomm server bind 0x0000000F: "System ID invalid"

I have two different systems running the Restcomm SMSC gateway. I'm trying to configure one as the server side and the other as the client side. Logs from both machines are given below:
SERVER ERROR
10:40:32,975 INFO [SmppServerConnector] (SmppManagement) New channel from [172.31.130.126:33712]
10:40:33,020 INFO [UnboundSmppSession] (SmppManagement.UnboundSession.172.31.130.126:33712) received PDU: (bind_transceiver: 0x0000002F 0x00000009 0x00000000 0x00000001) (body: systemId [jazz] password [jazz123] systemType [] interfaceVersion [0x34] addressRange (0x00 0x00 [^[0-9a-zA-Z]*])) (opts: )
10:40:33,022 ERROR [DefaultSmppServerHandler] (SmppManagement.UnboundSession.172.31.130.126:33712) Received BIND request but no ESME configured for SystemId=jazz Host=172.31.130.126 Port=33712 SmppBindType=TRANSCEIVER
10:40:33,022 WARN [UnboundSmppSession] (SmppManagement.UnboundSession.172.31.130.126:33712) Bind request rejected or failed for connection [172.31.130.126:33712] with error [SMPP processing error [0x0000000F]]
10:40:33,029 INFO [UnboundSmppSession] (SmppManagement.UnboundSession.172.31.130.126:33712) send PDU: (bind_transceiver_resp: 0x00000022 0x80000009 0x0000000F 0x00000001 result: "System ID invalid") (body: systemId [RestCommSMSC]) (opts: (sc_interface_version: 0x0210 0x0001 [34]))
10:40:33,034 INFO [UnboundSmppSession] (SmppManagement.UnboundSession.172.31.130.126:33712) Connection closed with [172.31.130.126:33712]
CLIENT ERROR
10:40:33,012 INFO [DefaultSmppSession] (Thread-45) sync send PDU: (bind_transceiver: 0x0000002F 0x00000009 0x00000000 0x00000001) (body: systemId [jazz] password [jazz123] systemType [] interfaceVersion [0x34] addressRange (0x00 0x00 [^[0-9a-zA-Z]*])) (opts: )
10:40:33,015 INFO [DefaultSmppSession] (Thread-45) write bytes: [0000002f0000000900000000000000016a617a7a006a617a7a31323300003400005e5b302d39612d7a412d5a5d2a00]
10:40:33,063 INFO [DefaultSmppSession] (jazz) read bytes: [00000022800000090000000f0000000152657374436f6d6d534d5343000210000134]
10:40:33,070 INFO [DefaultSmppSession] (jazz) received PDU: (bind_transceiver_resp: 0x00000022 0x80000009 0x0000000F 0x00000001 result: "System ID invalid") (body: systemId [RestCommSMSC]) (opts: (sc_interface_version: 0x0210 0x0001 [34]))
10:40:33,071 ERROR [SmppSessionHandlerInterfaceImpl] (jazz) Rx : fireChannelUnexpectedlyClosed for SmppSessionImpl=client Default handling is to discard an unexpected channel closed
10:40:33,072 ERROR [SmppClientOpsThread] (Thread-45) Exception when trying to bind client SMPP connection for ESME systemId=jazz
com.cloudhopper.smpp.type.SmppBindException: Unable to bind [error: 0x0000000F "System ID invalid"]
at com.cloudhopper.smpp.impl.DefaultSmppSession.bind(DefaultSmppSession.java:341)
at com.cloudhopper.smpp.impl.DefaultSmppClient.doBind(DefaultSmppClient.java:215)
at com.cloudhopper.smpp.impl.DefaultSmppClient.bind(DefaultSmppClient.java:196)
at org.restcomm.smpp.SmppClientOpsThread.initiateConnection(SmppClientOpsThread.java:292)
at org.restcomm.smpp.SmppClientOpsThread.run(SmppClientOpsThread.java:129)
at java.lang.Thread.run(Thread.java:745)
Server Bind
ESME name=server systemId=jazz state=CLOSED password=jazz123 host=127.0.0.1 port=-1 networkId=21 chargingEnabled=false bindType=TRANSCEIVER systemType=
Client Bind
ESME name=client systemId=jazz state=CLOSED password=jazz123 host=172.31.130.101 port=2776 networkId=21 chargingEnabled=false bindType=TRANSCEIVER systemType=
As you can see, the SystemID is the same on both sides, but I'm still getting this error. Any help will be appreciated.
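To confirm what actually travelled over the wire, the response bytes from the client log can be decoded by hand. The sketch below unpacks the 16-byte SMPP header (four big-endian 32-bit integers) and the null-terminated system_id that follows. The function and dictionary names are hypothetical, and the lookup tables cover only the values that appear in these logs.

```python
import struct

# Only the command IDs and status codes that show up in the logs above.
COMMAND_NAMES = {0x00000009: "bind_transceiver", 0x80000009: "bind_transceiver_resp"}
# 0x0000000F is ESME_RINVSYSID ("Invalid System ID") in SMPP 3.4.
STATUS_NAMES = {0x00000000: "ESME_ROK", 0x0000000F: "ESME_RINVSYSID"}


def decode_bind_resp(hex_bytes):
    """Decode the SMPP header and the system_id C-string of a bind response."""
    raw = bytes.fromhex(hex_bytes)
    length, command_id, status, sequence = struct.unpack(">IIII", raw[:16])
    # The body of a bind response starts with a null-terminated system_id.
    system_id = raw[16:].split(b"\x00", 1)[0].decode("ascii")
    return {
        "command_length": length,
        "actual_length": len(raw),
        "command": COMMAND_NAMES.get(command_id, hex(command_id)),
        "status": STATUS_NAMES.get(status, hex(status)),
        "sequence_number": sequence,
        "system_id": system_id,
    }


# "read bytes" line from the client log
response = "00000022800000090000000f0000000152657374436f6d6d534d5343000210000134"
for field, value in decode_bind_resp(response).items():
    print(field, "=", value)
```

The decoded status matches the "System ID invalid" result in both logs, and the system_id in the body (RestCommSMSC) is the server's own identifier, not the client's. The rejection therefore comes from the server-side lookup reported in the "no ESME configured for SystemId=jazz Host=172.31.130.126" line rather than from a malformed PDU.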

Why do my RabbitMQ cluster connections and channels stay in flow status?

I have been testing my 3-node RabbitMQ cluster these days.
I use the Java PerfTest tool for the test:
[root@server-42 bin]$ ./runjava com.rabbitmq.perf.PerfTest -x1 -y1 -e testex -H 'amqp://username:password@123.123.123.2/test' -t topic -k sample.info -s 1500 -i 20
id: test-154506-639, starting consumer #0
id: test-154506-639, starting consumer #0, channel #0
id: test-154506-639, starting producer #0
id: test-154506-639, starting producer #0, channel #0
id: test-154506-639, time: 20.000s, sent: 8913 msg/s, received: 8804 msg/s, min/avg/max latency: 6317/251907/727492 microseconds
id: test-154506-639, time: 40.004s, sent: 8993 msg/s, received: 8991 msg/s, min/avg/max latency: 157294/256691/387926 microseconds
id: test-154506-639, time: 60.011s, sent: 9029 msg/s, received: 9019 msg/s, min/avg/max latency: 146744/255631/384696 microseconds
id: test-154506-639, time: 80.017s, sent: 8946 msg/s, received: 8972 msg/s, min/avg/max latency: 164969/259147/723908 microseconds
id: test-154506-639, time: 100.019s, sent: 8971 msg/s, received: 8949 msg/s, min/avg/max latency: 164012/258115/353767 microseconds
I find that my RabbitMQ connections and channels stay in flow status.
Why is that? Is there any way to increase the performance?
I thought flow status exists to keep a publisher from sending messages too quickly, in case the server cannot queue them fast enough.
But the sending rate in my test does not seem high at all, so why are they still in flow status?
Can anyone help? Thanks in advance.
Flow Control:
RabbitMQ will reduce the speed of connections which are publishing too quickly for queues to keep up.
If you want to learn more about credit flow, you can read this doc, in particular:
To see how credit_flow and its settings affect publishing, let’s see how internal messages flow in RabbitMQ. Keep in mind that RabbitMQ is implemented in Erlang, where processes communicate by sending messages to each other.
You can try to increase the credit_flow parameter.
In my case, I was getting this due to a lack of memory: too many unacked messages fill up memory, which puts the connection into the flow state.
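To make the credit mechanism concrete, here is a toy model of credit-based flow control between one sender and one receiver. It is loosely modeled on the pair of values RabbitMQ uses for credit flow (an initial credit and a "grant more credit after N processed messages" count), but the class, its method names, and the small numbers are assumptions for illustration, not RabbitMQ's implementation.

```python
from collections import deque


class CreditFlowLink:
    """Toy sender/receiver pair: the sender blocks once its credit runs out."""

    def __init__(self, initial_credit, more_credit_after):
        self.credit = initial_credit
        self.more_credit_after = more_credit_after
        self.mailbox = deque()          # messages waiting at the receiver
        self.processed_since_grant = 0

    @property
    def in_flow(self):
        """True while the sender is out of credit (the 'flow' state)."""
        return self.credit == 0

    def send(self, message):
        """Send one message; returns False if the sender is currently blocked."""
        if self.in_flow:
            return False
        self.credit -= 1
        self.mailbox.append(message)
        return True

    def process_one(self):
        """Receiver handles one message and grants credit back in batches."""
        if not self.mailbox:
            return
        self.mailbox.popleft()
        self.processed_since_grant += 1
        if self.processed_since_grant == self.more_credit_after:
            self.credit += self.more_credit_after
            self.processed_since_grant = 0


link = CreditFlowLink(initial_credit=4, more_credit_after=2)
print([link.send(i) for i in range(5)])  # [True, True, True, True, False]
print(link.in_flow)                      # True
link.process_one()
link.process_one()
print(link.in_flow)                      # False
```

The sender enters flow whenever the receiver is slower than the sender, regardless of the absolute rate, so a connection can show flow status even at a publish rate that looks modest.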

PDI Error occured while trying to connect to the database

I got the following error while executing a PDI job.
I do have the MySQL driver in place (libext/JDBC). Can someone tell me what could cause the failure?
Despite the error connecting to the DB, my DB is up and I can access it from the command prompt.
Error occured while trying to connect to the database
Error connecting to database: (using class org.gjt.mm.mysql.Driver)
Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
ERROR 03-08 11:05:10,595 - stepname- Error initializing step [Update]
ERROR 03-08 11:05:10,595 - stepname - Step [Update.0] failed to initialize!
INFO 03-08 11:05:10,595 - stepname - Finished reading query, closing connection.
ERROR 03-08 11:05:10,596 - stepname - Unable to prepare for execution of the transformation
ERROR 03-08 11:05:10,596 - stepname - org.pentaho.di.core.exception.KettleException:
We failed to initialize at least one step. Execution can not begin!
Thanks
Is this a long-running query by any chance? Alternatively, in the PDI world it can happen because your step starts when the transformation starts and waits for something to do; if nothing comes along before the net write timeout expires, you'll see this error.
If so, your problem is caused by a MySQL timeout, which frequently needs to be increased from its default of 10 minutes.
See here:
http://wiki.pentaho.com/display/EAI/MySQL