erlang failed to resolve ipv6 addresses using parameter from rabbitmq - rabbitmq

I'm using rabbitmq cluster in k8s which has only pure ipv6 address. inet return nxdomain error when parsing the k8s service name.
The paramter passed to erlang from rabbitmq is:
RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="+A 128 -kernel inetrc '/etc/rabbitmq/erl_inetrc' -proto_dist inet6_tcp"
RABBITMQ_CTL_ERL_ARGS="-proto_dist inet6_tcp"
erl_inetrc: |-
{inet6, true}.
when rabbitmq using its plugin rabbit_peer_discovery_k8s to invoke k8s api:
2019-10-15 07:33:55.000 [info] <0.238.0> Peer discovery backend does not support locking, falling back to randomized delay
2019-10-15 07:33:55.000 [info] <0.238.0> Peer discovery backend rabbit_peer_discovery_k8s does not support registration, skipping randomized start
up delay.
2019-10-15 07:33:55.000 [debug] <0.238.0> GET https://kubernetes.default.svc.cluster.local:443/api/v1/namespaces/tazou/endpoints/zt4-crmq
2019-10-15 07:33:55.015 [debug] <0.238.0> Response: {error,{failed_connect,[{to_address,{"kubernetes.default.svc.cluster.local",443}},{inet,[inet]
,nxdomain}]}}
2019-10-15 07:33:55.015 [debug] <0.238.0> HTTP Error {failed_connect,[{to_address,{"kubernetes.default.svc.cluster.local",443}},{inet,[inet],nxdom
ain}]}
2019-10-15 07:33:55.015 [info] <0.238.0> Failed to get nodes from k8s - {failed_connect,[{to_address,{"kubernetes.default.svc.cluster.local",443}}
,
{inet,[inet],nxdomain}]}
2019-10-15 07:33:55.016 [error] <0.237.0> CRASH REPORT Process <0.237.0> with 0 neighbours exited with reason: no case clause matching {error,"{fa
iled_connect,[{to_address,{\"kubernetes.default.svc.cluster.local\",443}},\n {inet,[inet],nxdomain}]}"} in rabbit_mnesia:init_from
_config/0 line 167 in application_master:init/4 line 138
2019-10-15 07:33:55.016 [info] <0.43.0> Application rabbit exited with reason: no case clause matching {error,"{failed_connect,[{to_address,{\"kub
ernetes.default.svc.cluster.local\",443}},\n
in k8s console, the address could be resolved:
[rabbitmq]# nslookup -type=AAAA kubernetes.default.svc.cluster.local
Server: 2019:282:4000:2001::6
Address: 2019:282:4000:2001::6#53
kubernetes.default.svc.cluster.local has AAAA address fd01:abcd::1
the inet could return ipv6 address.
kubectl exec -ti zt4-crmq-0 rabbitmqctl eval 'inet:gethostbyname("kubernetes.default.svc.cluster.local").'
{ok,{hostent,"kubernetes.default.svc.cluster.local",[],inet6,16,
[{64769,43981,0,0,0,0,0,1}]}}
as I know, plugin call httpc:request to invoke k8s api. I don't know what's the gap between httpc:request and inet:gethostbyname. I also don't what's used by httpc:request to resolve the address of hostname.
I query for the rabbitmq plugin, It's said that rabbitmq plugin don't aware how erlang resovlve the address. https://github.com/rabbitmq/rabbitmq-peer-discovery-k8s/issues/55.
Anything else I could set for erl_inetrc so that erlang could resolve the ipv6 address? what did i miss to config? or how could i debug from erlang side? I'm new to erlang.
B.R,
Tao

Related

X-Ray Daemon don't receive any data from envoy

I have a service running a task definition with three containers:
service itself
envoy
x-ray daemon
And I want to trace and monitor my services interacting with each other with x-ray.
But I don't see any data in x-ray.
I can see the request logs and everything in the envoy logs but there are no error messages about missing connection to the x-ray daemon.
Envoy container has three env variables:
APPMESH_VIRTUAL_NODE_NAME = mesh/mesh-name/virtualNode/service-virtual-node
ENABLE_ENVOY_XRAY_TRACING = 1
ENVOY_LOG_LEVEL = trace
The x-ray daemon is pretty plain and has just a name and an image (amazon/aws-xray-daemon:1).
But when looking in the logs of the x-ray dameon, there is only the following:
2022-05-31T14:48:05.042+02:00 2022-05-31T12:48:05Z [Info] Initializing AWS X-Ray daemon 3.0.0
2022-05-31T14:48:05.042+02:00 2022-05-31T12:48:05Z [Info] Using buffer memory limit of 76 MB
2022-05-31T14:48:05.042+02:00 2022-05-31T12:48:05Z [Info] 1216 segment buffers allocated
2022-05-31T14:48:05.051+02:00 2022-05-31T12:48:05Z [Info] Using region: eu-central-1
2022-05-31T14:48:05.788+02:00 2022-05-31T12:48:05Z [Error] Get instance id metadata failed: RequestError: send request failed
2022-05-31T14:48:05.788+02:00 caused by: Get http://169.254.169.254/latest/meta-data/instance-id: dial tcp xxx.xxx.xxx.254:80: connect: invalid argument
2022-05-31T14:48:05.789+02:00 2022-05-31T12:48:05Z [Info] Starting proxy http server on 127.0.0.1:2000
As far as I read, the error you can see in these logs doesn't affect the functionality (https://repost.aws/questions/QUr6JJxyeLRUK5M4tadg944w).
I'm pretty sure I'm missing a configuration or access right.
It's running already on staging but I set this up several weeks ago and I don't find any differences between the configurations.
Thanks in advance!
In my case, I made a copy-paste mistake by copying trailing line break into the name of the environment variable ENABLE_ENVOY_XRAY_TRACING which wasn't visible in the overview and only inside the text field.

Bad CONNECT when trying to subscribe to message queue

I'm completely new to RabbitMQ and now I'm looking for a configuration error. The client doesn't receive any messages from RabbitMQ and I debugged it as far as possible.
Frontend messages:
Message 1:
CONNECT
login:frontend_listener
passcode:xxx
accept-version:1.0,1.1,1.2
heart-beat:20000,0
Message 2:
ERROR
message:Bad CONNECT
content-type:text/plain
version:1.0,1.1,1.2
content-length:30
Virtual host '/' access denied
There are two vHosts: / and someVhost and there are different users like frontend_listener. Now I found a way to access the log file.
RabbitMQ log file:
2020-02-11 15:50:53.579 [warning] <0.798.0> STOMP login failed for user "frontend_listener"
2020-02-11 15:50:53.579 [error] <0.798.0> STOMP error frame sent:
Message: "Bad CONNECT"
Detail: "Access refused for user 'frontend_listener'\n"
Server private detail: none
...
2020-02-11 15:51:25.349 [info] <0.850.0> Creating user 'frontend_listener'
2020-02-11 15:51:30.374 [info] <0.857.0> Setting permissions for 'frontend_listener' in 'someVhost' to '$', '$', 'client-notification.*'
2020-02-11 15:51:54.980 [warning] <0.867.0> STOMP login failed - not_allowed (vhost access not allowed)~n
2020-02-11 15:51:54.980 [error] <0.867.0> STOMP error frame sent:
Message: "Bad CONNECT"
Detail: "Virtual host '/' access denied"
Server private detail: none
2020-02-11 15:52:56.427 [warning] <0.875.0> STOMP login failed - not_allowed (vhost access not allowed)~n
It reads like the permissions are wrong. Can someone help me out interpreting that correctly?
I try to read it: User frontend_listener wants to access the vHost /, but it hasn't sufficient permissions (don't know what $ here mean other than a part of regular expression). The thing is, that I don't know if that is the correct vHost. How do I find out the URL of each vHost?
I'm asking this because I believe that the mapping to the vHost is wrong or something is missing.
Edit:
After adding host: 'someVhost' to my stomp-config.ts I was able to subscribe to the queues. Now I get the following error in the log:
2020-02-12 16:32:25.913 [error] <0.5159.1> Channel error on connection <0.5149.1> (127.0.0.1:58136 -> 127.0.0.1:15674, vhost: 'someVhost', user: 'frontend_listener'), channel 1:
operation basic.consume caused a channel exception access_refused: access to queue 'stomp-subscription-SZ3-PO1-PbZroPol-WXSQw' in vhost 'someVhost' refused for user 'frontend_listener'
2020-02-12 16:32:26.022 [error] <0.5145.1> STOMP error frame sent:
Message: access_refused
On the frontend I don't get a message or error.
You need to also pass host information in the STOMP CONNECT frame..
this is what the specifications says and clients MUST set this header
host : The name of a virtual host that the client wishes to connect to. It is recommended clients set this to the host name that the socket was established against, or to any name of their choosing. If this header does not match a known virtual host, servers supporting virtual hosting MAY select a default virtual host or reject the connection.
So this is how your CONNET frame should look
CONNECT
login:frontend_listener
passcode:xxx
accept-version:1.0,1.1,1.2
host: someVhost
heart-beat:20000,0

Zookeeper and ActiveMQ LevelDB replication non reliable

In my current project we are trying to set up an activeMQ cluster with LevelDB replication. Our configuration has a ZooKeeper ensemble of three nodes and an ActiveMQ cluster of three nodes.
The following is the configuration used for activeMQ: (of course the hostname is different for each node in the cluster)
<persistenceAdapter>
<replicatedLevelDB
replicas="3"
bind="tcp://0.0.0.0:0"
hostname="activemq1"
zkAddress="zk1:2181,zk2:2181,zk3:2181"
zkPath="/activemq/leveldb-stores"
/>
</persistenceAdapter>
We start up three instances of zookeeper and three instances of activemq. We observe that the zookeeper leader gets correctly elected. But in activeMQ cluster Master election is not happening. Go through the log we came to know that there is a authentication problem with zookeeper. (as per the log, I am having less knowledge in zookeeper/activemq). Herewith I pasted the logs for reference.
INFO: Loading '/opt/activemq//bin/env'
INFO: Using java '/usr/bin/java'
INFO: Starting in foreground, this is just for debugging purposes (stop process by pressing CTRL+C)
INFO: Creating pidfile /data/activemq/activemq.pid
Java Runtime: Oracle Corporation 1.8.0_91 /usr/lib/jvm/java-8-openjdk-amd64/jre
Heap sizes: current=62976k free=59998k max=932352k
JVM args: -Xms64M -Xmx1G -Djava.util.logging.config.file=logging.properties -Djava.security.auth.login.config=/opt/activemq/conf.tmp/login.config -Dcom.sun.management.jmxremote -Djava.awt.headless=true -Djava.io.tmpdir=/opt/activemq//tmp -
Dactivemq.classpath=/opt/activemq/conf.tmp:/opt/activemq//../lib/: -Dactivemq.home=/opt/activemq/ -
Dactivemq.base=/opt/activemq/ -Dactivemq.conf=/opt/activemq/conf.tmp -Dactivemq.data=/data/activemq
Extensions classpath:[/opt/activemq/lib,/opt/activemq/lib/camel,/opt/activemq/lib/optional,/opt/activemq/lib/web,/opt/activemq/lib/extra]
ACTIVEMQ_HOME: /opt/activemq
ACTIVEMQ_BASE: /opt/activemq
ACTIVEMQ_CONF: /opt/activemq/conf.tmp
ACTIVEMQ_DATA: /data/activemq
Loading message broker from: xbean:activemq.xml
INFO | Refreshing org.apache.activemq.xbean.XBeanBrokerFactory$1#7823a2f9: startup date [Sat Jun 17 09:15:51 UTC 2017]; root of context hierarchy
INFO | JobScheduler using directory: /data/activemq/localhost/scheduler
INFO | Using Persistence Adapter: Replicated LevelDB[/data/activemq/leveldb, ip-172-20-44-97.ec2.internal:2181,ip-172-20-45-105.ec2.internal:2181,ip-172-20-48-226.ec2.internal:2181//activemq/leveldb-stores]
INFO | Starting StateChangeDispatcher
INFO | Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
INFO | Client environment:host.name=activemq-m1n59
INFO | Client environment:java.version=1.8.0_91
INFO | Client environment:java.vendor=Oracle Corporation
INFO | Client environment:java.home=/usr/lib/jvm/java-8-openjdk-amd64/jre
INFO | Client environment:java.class.path=/opt/activemq//bin/activemq.jar
INFO | Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
INFO | Client environment:java.io.tmpdir=/opt/activemq//tmp
INFO | Client environment:java.compiler=<NA>
INFO | Client environment:os.name=Linux
INFO | Client environment:os.arch=amd64
INFO | Client environment:os.version=4.4.65-k8s
INFO | Client environment:user.name=root
INFO | Client environment:user.home=/root
INFO | Client environment:user.dir=/tmp
INFO | Initiating client connection, connectString=ip-172-20-44-97.ec2.internal:2181,ip-172-20-45-105.ec2.internal:2181,ip-172-20-48-226.ec2.internal:2181 sessionTimeout=2000 watcher=org.apache.activemq.leveldb.replicated.groups.ZKClient#4b41dd5c
WARN | SASL configuration failed: javax.security.auth.login.LoginException: No JAAS configuration section named 'Client' was found in specified JAAS configuration file: '/opt/activemq/conf.tmp/login.config'. Will continue connection to Zookeeper server without SASL authentication, if Zookeeper server allows it.
WARN | unprocessed event state: AuthFailed
INFO | Opening socket connection to server ip-172-20-45-105.ec2.internal/172.20.45.105:2181
WARN | Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)[:1.8.0_91] at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)[:1.8.0_91] at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)[zookeeper-3.4.6.jar:3.4.6-1569965] at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)[zookeeper-3.4.6.jar:3.4.6-1569965]
WARN | SASL configuration failed: javax.security.auth.login.LoginException: No JAAS configuration section named 'Client' was found in specified JAAS configuration file: '/opt/activemq/conf.tmp/login.config'. Will continue connection to Zookeeper server without SASL authentication, if Zookeeper server allows it.
INFO | Opening socket connection to server ip-172-20-48-226.ec2.internal/172.20.48.226:2181
WARN | unprocessed event state: AuthFailed
WARN | Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)[:1.8.0_91] at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)[:1.8.0_91] at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361) [zookeeper-3.4.6.jar:3.4.6-1569965] at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)[zookeeper-3.4.6.jar:3.4.6-1569965]
WARN | SASL configuration failed: javax.security.auth.login.LoginException: No JAAS configuration section named 'Client' was found in specified JAAS configuration file: '/opt/activemq/conf.tmp/login.config'. Will continue connection to Zookeeper server without SASL authentication, if Zookeeper server allows it.
INFO | Opening socket connection to server ip-172-20-44-97.ec2.internal/172.20.44.97:2181
WARN | unprocessed event state: AuthFailed
WARN | Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)[:1.8.0_91] at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)[:1.8.0_91] at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)[zookeeper-3.4.6.jar:3.4.6-1569965] at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)[zookeeper-3.4.6.jar:3.4.6-1569965]
Please help to get out from this problem.
If anyone having idea of deploying Zookeeper with ActiveMQ cluster in Kubernetes please share your ideas. since we are trying to deploy it in Kubernetes.

Open MQ broker won't start

We're running Glassfish 4.1.1 (Payara) with mq 5.1.1. It's a HA setup with load balancer and cluster.
Glassfish is running ok. Problem is that MQ won't start.
I think that a remote MQ is starting. I can do imqcmd list bkr -b and I get successful results.
However when I do imqcmd list bkr (or imqcmd list jmx, without -b hostname) I get:
Host Primary Port
-------------------------
localhost 7676
WARNING: [C4003]: Error occurred on connection creation [localhost:7676]. - cause: java.net.SocketException: Connection reset
Error while connecting to the broker on host 'localhost' and port '7676'.
I'd like to get rid of the error, and see my network ip instead of localhost.
Also GF server.log gives this:
[2017-04-12T11:54:46.516-0400] [Payara 4.1] [SEVERE] [rardeployment.start_failed] [javax.enterprise.resource.resourceadapter.com.sun.enterprise.connectors] [tid: _ThreadID=42 _ThreadName=admin-listener(2)] [timeMillis: 1492012486516] [levelValue: 1000] [[
RAR6035 : Resource adapter start failed.
javax.resource.spi.ResourceAdapterInternalException: java.security.PrivilegedActionException: javax.resource.spi.ResourceAdapterInternalException: MQJMSRA_RA4001: start:Aborting:Exception starting EMBEDDED broker=Broker failed to start
at com.sun.enterprise.connectors.jms.system.ActiveJmsResourceAdapter.startResourceAdapter(ActiveJmsResourceAdapter.java:557)
at com.sun.enterprise.connectors.ActiveOutboundResourceAdapter.init(ActiveOutboundResourceAdapter.java:130)
...
Caused by: java.lang.RuntimeException: Broker failed to start
at com.sun.messaging.jmq.jmsclient.runtime.impl.BrokerInstanceImpl.start(BrokerInstanceImpl.java:205)
at com.sun.messaging.jms.blc.EmbeddedBrokerRunner.start(EmbeddedBrokerRunner.java:331)
at com.sun.messaging.jms.blc.LifecycleManagedBroker.start(LifecycleManagedBroker.java:457)
... 92 more
Caused by: java.io.IOException: [B3297]: Unable to make directory <mydirectory>/imq/instances/imqbroker/etc
at com.sun.messaging.jmq.jmsserver.Broker.initializePasswdFile(Broker.java:376)
I'm wondering where the directory that it is unable to make is configured.
I've been debugging this for days. I need to know where to configure the ip for the embedded broker. I also need to know where to set up the jmxrmi url.
any help would be appreciated. Thanks!
I found the solution to this problem. We had a broken symlink to the openmq application directory, within the Glassfish application directory. On domain startup, Glassfish could not find mq and therefore could not start the embedded broker. Once we fixed the symlink, the embedded broker started up on glassfish domain startup (asadmin start-domain).
I knew the embedded broker was not starting because the "imq" folder was not being created in <domaindir>/
Check for those broken symlinks!!

activemq failover using multiple instances in master slave mode on same linux machine

I have setup ActiveMQ mulitple instances to achieve failover in master slave mode in windows.
While setting up the same i just created 3 instances under bin folder without changing any port and started all 3 instances one by one. First instance became master and remaining were in slave mode until I stopped master instance.
Now I am trying to achieve the same in Linux environment. First instance starts successfully but when I start second instance in a different window it throws below error:
ERROR | Failed to start Apache ActiveMQ ([instance2, ID:132vm6-57227-1478597606120-0:1], java.io.IOException: Transport Connector could not be registered in JMX: java.io.IOException: Failed to bind to server socket: tcp://0.0.0.0:61616?maximumConnections=1000&wireFormat.maxFrameSize=104857600 due to: java.net.BindException: Address already in use)
INFO | Apache ActiveMQ 5.14.0 (instance2, ID:132vm6-57227-1478597606120-0:1) is shutting down
INFO | Connector openwire stopped
INFO | Connector amqp stopped
INFO | Connector stomp stopped
INFO | Connector mqtt stopped
INFO | Connector ws stopped
INFO | PListStore:[/opt/apache-activemq-5.14.0/bin/instance2/data/instance2/tmp_storage] stopped
INFO | Stopping async queue tasks
INFO | Stopping async topic tasks
INFO | Stopped KahaDB
INFO | Apache ActiveMQ 5.14.0 (instance2, ID:132vm6-57227-1478597606120-0:1) uptime 0.585 seconds
INFO | Apache ActiveMQ 5.14.0 (instance2, ID:132vm6-57227-1478597606120-0:1) is shutdown
INFO | Closing org.apache.activemq.xbean.XBeanBrokerFactory$1#4233871a: startup date [Tue Nov 08 15:03:24 IST 2016]; root of context hierarchy
WARN | Exception thrown from LifecycleProcessor on context close
java.lang.IllegalStateException: LifecycleProcessor not initialized - call 'refresh' before invoking lifecycle methods via the context: org.apache.activemq.xbean.XBeanBrokerFactory$1#4233871a: startup date [Tue Nov 08 15:03:24 IST 2016]; root of context hierarchy
at org.springframework.context.support.AbstractApplicationContext.getLifecycleProcessor(AbstractApplicationContext.java:357)[spring-context-4.1.9.RELEASE.jar:4.1.9.RELEASE]
at org.springframework.context.support.AbstractApplicationContext.doClose(AbstractApplicationContext.java:884)[spring-context-4.1.9.RELEASE.jar:4.1.9.RELEASE]
at org.springframework.context.support.AbstractApplicationContext.close(AbstractApplicationContext.java:843)[spring-context-4.1.9.RELEASE.jar:4.1.9.RELEASE]
at org.apache.activemq.hooks.SpringContextHook.run(SpringContextHook.java:30)[activemq-spring-5.14.0.jar:5.14.0]
at org.apache.activemq.broker.BrokerService.stop(BrokerService.java:875)[activemq-broker-5.14.0.jar:5.14.0]
at org.apache.activemq.xbean.XBeanBrokerService.stop(XBeanBrokerService.java:122)[activemq-spring-5.14.0.jar:5.14.0]
at org.apache.activemq.broker.BrokerService.start(BrokerService.java:629)[activemq-broker-5.14.0.jar:5.14.0]
at org.apache.activemq.xbean.XBeanBrokerService.afterPropertiesSet(XBeanBrokerService.java:73)[activemq-spring-5.14.0.jar:5.14.0]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)[:1.7.0_65]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)[:1.7.0_65]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)[:1.7.0_65]
at java.lang.reflect.Method.invoke(Method.java:606)[:1.7.0_65]
I am using ActiveMQ 5.14 version.
If anybody has encountered a similar issue, kindly provide your inputs.
To get multiple instances of ActiveMQ running on the same machine, you need to change the ports that they try to open. There are (at least) 3 ports that need to be changed:
The transportConnector ports that accept messaging traffic. These are defined in theactivemq.xml file. Typically you only need the openwire one - this is 61616 by default; I usually change this in the other ActiveMQ instances to 61626, 61636 etc. You can usually comment out the others if you don't intend to use them.
The Jetty HTTP port. This is defined in the jetty.xml file. The default is 8161, set the next ones to 8162, 8163 etc.
The JMX port. This one's a bit tricky, as you need to stick a piece of config into the activemq.xml to explicitly define it as follows:
<managementContext>
<managementContext createConnector="true" connectorPort="1099"/>
</managementContext>
You can then change this to 1199, 1299 on the other instances. Hope this helps.