Apache Ignite Off Heap memory filling up

Apache Ignite Off Heap memory filling up - ignite

We have the following ignite cluster setup configuration:
Apache Ignite version : 2.7.5
Ignite persistence is enabled (true)
2 node cluster in partitioned mode
RAM - 210 GB per node 
JVM xms and xmx 20G
Off Heap Memory Max: 120GB
Number of records - 160 million 
I can see the following node metrics:
[03:13:31,126][INFO][db-checkpoint-thread-#146%GridA%][GridCacheDatabaseSharedManager] Checkpoint finished [cpId=df22db5b-6ffa-4f5d-b6da-d0e36c0492af, pages=1512, markPos=FileWALPointer [idx=6659, fileOff=249851578, len=49197], walSegmentsCleared=0, walSegmentsCovered=[], markDuration=26ms, pagesWrite=13ms, fsync=312ms, total=351ms]
[03:14:05,346][INFO][grid-timeout-worker-#67%GridA%][IgniteKernal%GridA]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
^-- Node [id=25a3a57c, name=GridA, uptime=16 days, 22:40:01.512]
^-- H/N/C [hosts=10, nodes=10, CPUs=172]
^-- CPU [cur=1.17%, avg=4.94%, GC=0%]
^-- PageMemory [pages=30333907]
^-- Heap [used=3889MB, free=81.01%, comm=20480MB]
^-- Off-heap [used=119880MB, free=2.68%, comm=123179MB]
^-- sysMemPlc region [used=0MB, free=99.99%, comm=99MB]
^-- metastoreMemPlc region [used=0MB, free=99.82%, comm=99MB]
^-- Default_Region region [used=119880MB, free=2.44%, comm=122880MB]
^-- TxLog region [used=0MB, free=100%, comm=99MB]
^-- Ignite persistence [used=253233MB]
^-- sysMemPlc region [used=0MB]
^-- metastoreMemPlc region [used=unknown]
^-- Default_Region region [used=253233MB]
^-- TxLog region [used=0MB]
^-- Outbound messages queue [size=0]
^-- Public thread pool [active=0, idle=0, qSize=0]
^-- System thread pool [active=0, idle=6, qSize=0]
Does the ignite node require restart or should page replacement trigger and free up some offheap space?
Edit-2: as you can see that off heap memory free space is ~ 2.5 % and still page replacement(PR) hasn't been triggered. Could not find anything on the topic as to when PR will be triggered. Will it be triggered at free space = 0% ? Is there a possibility that my ignite node would shutdown if free space reaches 0%? Any implications on query performance when page replacement triggers eventually?

In case of enabled persistence after a data region is filled up Page replacement is triggered.

Related

RabbitMQ Free disk space is insufficient

I try to add a message to a queue with Java client but rabbitmq keeps me blocked.
Official documentation says at https://www.rabbitmq.com/disk-alarms.html :
When free disk space drops below a configured limit (50 MB by default), an alarm will be triggered and all producers will be blocked.
My disc space looks like this
So, I set the disk space in the config file:
disk_free_limit.absolute = 1000MB
but it does not increment it. Disk space still looks like above.
Also log file says this:
2022-01-17 16:17:34.538000+03:00 [info] <0.399.0> Enabling free disk space monitoring
2022-01-17 16:17:34.538000+03:00 [info] <0.399.0> Disk free limit set to 1000MB
2022-01-17 16:17:34.844000+03:00 [info] <0.399.0> Free disk space is insufficient. Free bytes: 40. Limit: 1000000000
2022-01-17 16:17:34.844000+03:00 [info] <0.223.0> Running boot step code_server_cache defined by app rabbit
2022-01-17 16:17:34.844000+03:00 [warning] <0.395.0> disk resource limit alarm set on node 'rabbit#BLG2A-V1-BB0268'.
2022-01-17 16:17:34.844000+03:00 [warning] <0.395.0>
2022-01-17 16:17:34.844000+03:00 [warning] <0.395.0> **********************************************************
2022-01-17 16:17:34.844000+03:00 [warning] <0.395.0> *** Publishers will be blocked until this alarm clears ***
2022-01-17 16:17:34.844000+03:00 [warning] <0.395.0> **********************************************************
2022-01-17 16:17:34.844000+03:00 [warning] <0.395.0>
How can I increment disk space?
My setup:
OS: Windows 10
RabbitMQ: 3.9.12
Erlang/OTP: 24.2

The alarm is telling you that your server only has 50MB of space left on the disk which RabbitMQ is trying to write to.
The disk_free_limit setting doesn't control how much disk is allocated, it controls how much disk is expected - if you set it to 1000MB, the alarm will be triggered as soon as there is only 1000MB left, rather than waiting until there is only 50MB left.
Making more disk space available is the same as it would be for any other program:
Delete other things that are using up your disk space - e.g. make sure log files are compressed and deleted after a certain amount of time
Configure RabbitMQ to use a different disk or partition, if you already have one that's bigger
Install a larger disk if it's a physical host, or allocate a larger disk image if it's a VM

This issue will be fixed in 3.9.13
https://github.com/rabbitmq/rabbitmq-server/pull/3970
NOTE: the RabbitMQ team monitors the rabbitmq-users mailing list and only sometimes answers questions on StackOverflow.

I changed RabbitMQ version to 3.9.11. Now, it is fixed.
I guess, version 3.9.12 was the problem.

Apache Ignite Topology Snapshot refreshed .Net

I start the Apache Ingnite node, a server node, and another client node.
My scenario is: Close the client node, and how to update the service node Topology Snapshot at the same time.
Now, the Topology Snapshot is refreshed only when the NodeFailed event is received by the server after 20 seconds.
What method or configuration on the server side can receive the NodeFailed event immediately or refresh the Topology Snapshot?
This is server log:
[09:08:50,522][WARNING][disco-event-worker-#45%ignite-instance-f69c161b-9f38-4576-b52b-ef3077ba3156%][GridDiscoveryManager] Node FAILED: TcpDiscoveryNode [id=5f346db2-50fd-4d83-b518-a09690569274, consistentId=5f346db2-50fd-4d83-b518-a09690569274, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.40.1, 192.168.50.135, 192.168.65.1], sockAddrs=HashSet [DESKTOP-1BLUS7R/192.168.40.1:0, /[0:0:0:0:0:0:0:1]:0, /127.0.0.1:0, /192.168.65.1:0, /192.168.50.135:0], discPort=0, order=3, intOrder=3, lastExchangeTime=1602810475243, loc=false, ver=2.8.1#20200521-sha1:86422096, isClient=true]
[09:08:50,525][INFO][disco-event-worker-#45%ignite-instance-f69c161b-9f38-4576-b52b-ef3077ba3156%][GridDiscoveryManager] Topology snapshot [ver=5, locNode=f6d3f760, servers=1, clients=0, state=ACTIVE, CPUs=6, offheap=1.5GB, heap=2.0GB]
[09:08:50,525][INFO][disco-event-worker-#45%ignite-instance-f69c161b-9f38-4576-b52b-ef3077ba3156%][GridDiscoveryManager] ^-- Baseline [id=0, size=1, online=1, offline=0]
[

Can reduce the service node attribute ClientFailureDetectionTimeout, increase the server check the frequency of client nodes.The default is 30 seconds.
//
// 摘要:
//Gets or sets the failure detection timeout used by Apache.Ignite.Core.Discovery.Tcp.TcpDiscoverySpi
//and Apache.Ignite.Core.Communication.Tcp.TcpCommunicationSpi for client nodes.
[DefaultValue(typeof(TimeSpan), "00:00:30")]
public TimeSpan ClientFailureDetectionTimeout { get; set; }

Ignite tomee-webprofile-7.1.0-> Start error reporting at Tomee

[1 image description here][1][2 image description here][2]
3 image description hereFailed to unmarshal discovery data for component: 1
class org.apache.ignite.IgniteCheckedException: Failed to deserialize object with given class loader: TomEEWebappClassLoader
context: chinawork
delegate: false
[16:27:05] ver. 2.7.0#20181201-sha1:256ae401
[16:27:05] 2018 Copyright(C) Apache Software Foundation
[16:27:05]
[16:27:05] Ignite documentation: http://ignite.apache.org
[16:27:05]
[16:27:05] Quiet mode.
[16:27:05] ^-- Logging by 'JavaLogger [quiet=true, config=null]'
[16:27:05] ^-- To see **FULL** console log here add -DIGNITE_QUIET=false or "-v" to ignite.{sh|bat}
[16:27:05]
[16:27:05] OS: Windows 10 10.0 amd64
[16:27:05] VM information: Java(TM) SE Runtime Environment 1.8.0_152-b16 Oracle Corporation Java HotSpot(TM) 64-Bit Server VM 25.152-b16
[16:27:05] Please set system property '-Djava.net.preferIPv4Stack=true' to avoid possible problems in mixed environments.
[16:27:05] Initial heap size is 126MB (should be no less than 512MB, use -Xms512m -Xmx512m).
[16:27:05] Configured plugins:
[16:27:05] ^-- None
[16:27:05]
[16:27:05] Configured failure handler: [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED]]]]
[16:27:06] Message queue limit is set to 0 which may lead to potential OOMEs when running cache operations in FULL_ASYNC or PRIMARY_SYNC modes due to message queues growth on sender and receiver sides.
[16:27:06] Security status [authentication=off, tls/ssl=off]
[16:27:07] REST protocols do not start on client node. To start the protocols on client node set '-DIGNITE_REST_START_ON_CLIENT=true' system property.
十二月 11, 2018 4:27:13 下午 org.apache.ignite.logger.java.JavaLogger error
严重: Failed to unmarshal discovery data for component: 1
class org.apache.ignite.IgniteCheckedException: Failed to deserialize object with given class loader: TomEEWebappClassLoader
context: cnf-soa
delegate: false
----------> Parent Classloader:
java.net.URLClassLoader#f6f4d33
at org.apache.ignite.marshaller.jdk.JdkMarshaller.unmarshal0(JdkMarshaller.java:147)
at org.apache.ignite.marshaller.AbstractNodeNameAwareMarshaller.unmarshal(AbstractNodeNameAwareMarshaller.java:94)
at org.apache.ignite.marshaller.jdk.JdkMarshaller.unmarshal0(JdkMarshaller.java:161)
at org.apache.ignite.marshaller.AbstractNodeNameAwareMarshaller.unmarshal(AbstractNodeNameAwareMarshaller.java:82)
at org.apache.ignite.spi.discovery.tcp.internal.DiscoveryDataPacket.unmarshalData(DiscoveryDataPacket.java:280)
at org.apache.ignite.spi.discovery.tcp.internal.DiscoveryDataPacket.unmarshalGridData(DiscoveryDataPacket.java:123)
at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.onExchange(TcpDiscoverySpi.java:2006)
at org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.processNodeAddFinishedMessage(ClientImpl.java:2181)
at org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.processDiscoveryMessage(ClientImpl.java:2060)
at org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.body(ClientImpl.java:1905)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at org.apache.ignite.spi.discovery.tcp.ClientImpl$1.body(ClientImpl.java:304)
at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
Caused by: java.io.InvalidClassException: javax.cache.configuration.MutableConfiguration; local class incompatible: stream classdesc serialVersionUID = 201306200821, local class serialVersionUID = 201405
at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:687)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1880)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1746)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1880)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1746)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2037)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1568)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2282)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2206)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2064)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1568)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:428)
at java.util.HashMap.readObject(HashMap.java:1409)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1158)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2173)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2064)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1568)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2282)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2206)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2064)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1568)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:428)
at org.apache.ignite.marshaller.jdk.JdkMarshaller.unmarshal0(JdkMarshaller.java:139)
... 12 more
[16:27:14] Performance suggestions for grid 'igniteCosco' (fix if possible)
[16:27:14] To disable, set -DIGNITE_PERFORMANCE_SUGGESTIONS_DISABLED=true
[16:27:14] ^-- Enable G1 Garbage Collector (add '-XX:+UseG1GC' to JVM options)
[16:27:14] ^-- Specify JVM heap max size (add '-Xmx<size>[g|G|m|M|k|K]' to JVM options)
[16:27:14] ^-- Set max direct memory size if getting 'OOME: Direct buffer memory' (add '-XX:MaxDirectMemorySize=<size>[g|G|m|M|k|K]' to JVM options)
[16:27:14] ^-- Disable processing of calls to System.gc() (add '-XX:+DisableExplicitGC' to JVM options)
[16:27:14] Refer to this page for more performance suggestions: https://apacheignite.readme.io/docs/jvm-and-system-tuning
[16:27:14]
[16:27:14] To start Console Management & Monitoring run ignitevisorcmd.{sh|bat}
[16:27:14]
[16:27:14] Ignite node started OK (id=9d93bb08, instance name=igniteCosco)
[16:27:14] Topology snapshot [ver=2, locNode=9d93bb08, servers=1, clients=1, state=ACTIVE, CPUs=8, offheap=3.1GB, heap=7.1GB]
十二月 11, 2018 4:27:15 下午 org.apache.ignite.logger.java.JavaLogger error
严重: Failed to send message: TcpDiscoveryClientMetricsUpdateMessage [super=TcpDiscoveryAbstractMessage [sndNodeId=null, id=1ac606c9761-9d93bb08-2ba3-4234-807b-941605b3597b, verifierNodeId=null, topVer=0, pendingIdx=0, failedNodes=null, isClient=true]]
java.net.SocketException: Socket is closed
at java.net.Socket.getSendBufferSize(Socket.java:1215)
at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.socketStream(TcpDiscoverySpi.java:1480)
at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.writeToSocket(TcpDiscoverySpi.java:1606)
at org.apache.ignite.spi.discovery.tcp.ClientImpl$SocketWriter.body(ClientImpl.java:1362)
at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
十二月 11, 2018 4:27:25 下午 org.apache.ignite.logger.java.JavaLogger error
严重: Failed to reconnect to cluster (consider increasing 'networkTimeout' configuration property) [networkTimeout=5000]
2018-12-11 16:27:25.768 [localhost-startStop-1] ERROR cjf.web.CommonServlet - 加载初始化资源文件[/cjf/config/cjfinit.properties]失败.
javax.cache.CacheException: class org.apache.ignite.IgniteClientDisconnectedException: Failed to execute dynamic cache change request, client node disconnected.
at org.apache.ignite.internal.processors.cache.GridCacheUtils.convertToCacheException(GridCacheUtils.java:1337)
at org.apache.ignite.internal.IgniteKernal.getOrCreateCache(IgniteKernal.java:3310)
at cjf.init.InitIgniteCache.intercept(InitIgniteCache.java:148)
at cjf.common.responsibility.DefaultActionInvocation.invoke(DefaultActionInvocation.java:26)
at cjf.init.CjfClusterInterceptor.intercept(CjfClusterInterceptor.java:37)
at cjf.common.responsibility.DefaultActionInvocation.invoke(DefaultActionInvocation.java:26)
at cjf.init.CjfMailInterceptor.intercept(CjfMailInterceptor.java:34)
at cjf.common.responsibility.DefaultActionInvocation.invoke(DefaultActionInvocation.java:26)
at cjf.init.InitSsoInterceptor.intercept(InitSsoInterceptor.java:52)
at cjf.common.responsibility.DefaultActionInvocation.invoke(DefaultActionInvocation.java:26)
at cjf.init.InitServletInterceptor.intercept(InitServletInterceptor.java:33)
at cjf.common.responsibility.DefaultActionInvocation.invoke(DefaultActionInvocation.java:26)
at cjf.init.InitCjfInterceptor.intercept(InitCjfInterceptor.java:50)
at cjf.common.responsibility.DefaultActionInvocation.invoke(DefaultActionInvocation.java:26)
at cjf.init.SysCacheInterceptor.intercept(SysCacheInterceptor.java:129)
at cjf.common.responsibility.DefaultActionInvocation.invoke(DefaultActionInvocation.java:26)
at cjf.web.CommonServlet.initCaches(CommonServlet.java:111)
at cjf.web.CommonServlet.init(CommonServlet.java:58)
at javax.servlet.GenericServlet.init(GenericServlet.java:158)
at org.apache.catalina.core.StandardWrapper.initServlet(StandardWrapper.java:1144)
at org.apache.catalina.core.StandardWrapper.loadServlet(StandardWrapper.java:1091)
at org.apache.catalina.core.StandardWrapper.load(StandardWrapper.java:983)
at org.apache.catalina.core.StandardContext.loadOnStartup(StandardContext.java:4978)
at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5290)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:754)
at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:730)
at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:734)
at org.apache.catalina.startup.HostConfig.deployDirectory(HostConfig.java:1140)
at org.apache.catalina.startup.HostConfig$DeployDirectory.run(HostConfig.java:1875)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.ignite.IgniteClientDisconnectedException: Failed to execute dynamic cache change request, client node disconnected.
at org.apache.ignite.internal.util.IgniteUtils$15.apply(IgniteUtils.java:948)
at org.apache.ignite.internal.util.IgniteUtils$15.apply(IgniteUtils.java:944)
... 35 common frames omitted
Caused by: org.apache.ignite.internal.IgniteClientDisconnectedCheckedException: Failed to execute dynamic cache change request, client node disconnected.
at org.apache.ignite.internal.processors.cache.GridCacheProcessor.onDisconnected(GridCacheProcessor.java:1173)
at org.apache.ignite.internal.IgniteKernal.onDisconnected(IgniteKernal.java:3949)
at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:821)
at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.lambda$onDiscovery$0(GridDiscoveryManager.java:604)
at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body0(GridDiscoveryManager.java:2667)
at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body(GridDiscoveryManager.java:2705)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
... 1 common frames omitted
1 image description here 2 image description here
3 image description here
IgniteConfiguration:
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
<property name="clientMode" value="true"/>
<property name="igniteInstanceName" value="igniteTest"/>
<property name="discoverySpi">
<bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
<property name="ipFinder">
<bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
<property name="addresses">
<list>
<value>127.0.0.1:47500..47510</value>
</list>
</property>
</bean>
</property>
</bean>
</property>
</bean>

TomEE has lib/javaee-api-7.0-1.jar library that contains javax-cache version 1.1 while Ignite depends on javax-cache 1.0.
You need to eliminate this dependency issue.It makes sense to exclude java-cache by setting openejb.classloader.forced-skip=javax.cache in system.properties.

Looks like you have put some type in Discovery data which is not present on other nodes.
I can see that you have "local class incompatible". Is it possible that you jave javax-cache 1.0 on one node but javax-cache 1.1 on another? It could cause the problem that you are observing.

apache ignite node not able to join in the cluster

I'm using apacheignite:2.5.0 docker image deployed in 2 different
ec2-instances and using static IP finder config below is the config file, one of the node is unable to join in the cluster. I have attached logs also please find
below its accepting connection and disconnecting , i ran docker container with --net=host so conatainer attach all ports to host machine and all ports are opened in security group
#
**>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:util="http://www.springframework.org/schema/util"
xsi:schemaLocation="
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans.xsd
http://www.springframework.org/schema/util
http://www.springframework.org/schema/util/spring-util.xsd">
<bean abstract="false" id="ignite.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
<property name="discoverySpi">
<bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
<property name="ipFinder">
<bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
<property name="addresses">
<list>
<value>34.241.10.9:47500</value>
</list>
</property>
</bean>
</property>
</bean>
</property>
</bean>
</beans>**
[12:59:25,309][INFO][disco-event-worker-#37][GridDiscoveryManager] Added new node to topology: TcpDiscoveryNode [id=07b55edb-cdb7-45eb-bfd6-36fe9c5f5f15, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 172.17.0.1, 172.18.0.1, 172.31.29.3], sockAddrs=[/172.31.29.3:47500, /172.17.0.1:47500, /0:0:0:0:0:0:0:1%lo:47500, /127.0.0.1:47500, /172.18.0.1:47500], discPort=47500, order=312, intOrder=157, lastExchangeTime=1529067545288, loc=false, ver=2.4.0#20180305-sha1:aa342270, isClient=false]
[12:59:25,309][INFO][disco-event-worker-#37][GridDiscoveryManager] Topology snapshot [ver=312, servers=2, clients=0, CPUs=6, offheap=3.8GB, heap=2.0GB]
[12:59:25,309][INFO][disco-event-worker-#37][GridDiscoveryManager] Data Regions Configured:
[12:59:25,309][INFO][disco-event-worker-#37][GridDiscoveryManager] ^-- default [initSize=256.0 MiB, maxSize=710.0 MiB, persistenceEnabled=false]
[12:59:25,309][INFO][exchange-worker-#38][time] Started exchange init [topVer=AffinityTopologyVersion [topVer=312, minorTopVer=0], crd=true, evt=NODE_JOINED, evtNode=07b55edb-cdb7-45eb-bfd6-36fe9c5f5f15, customEvt=null, allowMerge=true]
[12:59:25,309][WARNING][disco-event-worker-#37][GridDiscoveryManager] Node FAILED: TcpDiscoveryNode [id=07b55edb-cdb7-45eb-bfd6-36fe9c5f5f15, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 172.17.0.1, 172.18.0.1, 172.31.29.3], sockAddrs=[/172.31.29.3:47500, /172.17.0.1:47500, /0:0:0:0:0:0:0:1%lo:47500, /127.0.0.1:47500, /172.18.0.1:47500], discPort=47500, order=312, intOrder=157, lastExchangeTime=1529067545288, loc=false, ver=2.4.0#20180305-sha1:aa342270, isClient=false]
[12:59:25,310][INFO][exchange-worker-#38][GridDhtPartitionsExchangeFuture] Finished waiting for partition release future [topVer=AffinityTopologyVersion [topVer=312, minorTopVer=0], waitTime=0ms, futInfo=NA]
[12:59:25,310][INFO][exchange-worker-#38][time] Finished exchange init [topVer=AffinityTopologyVersion [topVer=312, minorTopVer=0], crd=true]
[12:59:25,310][INFO][disco-event-worker-#37][GridDiscoveryManager] Topology snapshot [ver=313, servers=1, clients=0, CPUs=2, offheap=0.69GB, heap=1.0GB]
[12:59:25,310][INFO][disco-event-worker-#37][GridDiscoveryManager] Data Regions Configured:
[12:59:25,310][INFO][disco-event-worker-#37][GridDiscoveryManager] ^-- default [initSize=256.0 MiB, maxSize=710.0 MiB, persistenceEnabled=false]
[12:59:25,310][INFO][disco-event-worker-#37][GridDhtPartitionsExchangeFuture] Coordinator received all messages, try merge [ver=AffinityTopologyVersion [topVer=312, minorTopVer=0]]
[12:59:25,311][INFO][disco-event-worker-#37][GridCachePartitionExchangeManager] Merge exchange future [curFut=AffinityTopologyVersion [topVer=312, minorTopVer=0], mergedFut=AffinityTopologyVersion [topVer=313, minorTopVer=0], evt=NODE_FAILED, evtNode=07b55edb-cdb7-45eb-bfd6-36fe9c5f5f15, evtNodeClient=false]
[12:59:25,311][INFO][disco-event-worker-#37][GridDhtPartitionsExchangeFuture] finishExchangeOnCoordinator [topVer=AffinityTopologyVersion [topVer=312, minorTopVer=0], resVer=AffinityTopologyVersion [topVer=313, minorTopVer=0]]
[12:59:25,311][INFO][disco-event-worker-#37][GridDhtPartitionsExchangeFuture] Finish exchange future [startVer=AffinityTopologyVersion [topVer=312, minorTopVer=0], resVer=AffinityTopologyVersion [topVer=313, minorTopVer=0], err=null]
[12:59:25,312][INFO][exchange-worker-#38][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=313, minorTopVer=0], evt=NODE_JOINED, node=07b55edb-cdb7-45eb-bfd6-36fe9c5f5f15]
[12:59:25,315][INFO][grid-timeout-worker-#23][IgniteKernal]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
^-- Node [id=225f750c, uptime=01:42:00.504]
^-- H/N/C [hosts=1, nodes=1, CPUs=2]
^-- CPU [cur=0.17%, avg=0.4%, GC=0%]
^-- PageMemory [pages=200]
^-- Heap [used=73MB, free=92.47%, comm=981MB]
^-- Non heap [used=53MB, free=96.47%, comm=55MB]
^-- Outbound messages queue [size=0]
^-- Public thread pool [active=0, idle=6, qSize=0]
^-- System thread pool [active=0, idle=8, qSize=0]
[12:59:25,320][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery accepted incoming connection [rmtAddr=/34.241.7.9, rmtPort=53627]
[12:59:25,320][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery spawning a new thread for connection [rmtAddr=/34.241.7.9, rmtPort=53627]
[12:59:25,320][INFO][tcp-disco-sock-reader-#628][TcpDiscoverySpi] Started serving remote node connection [rmtAddr=/34.241.7.9:53627, rmtPort=53627]
[12:59:25,325][INFO][tcp-disco-sock-reader-#628][TcpDiscoverySpi] Finished serving remote node connection [rmtAddr=/34.241.7.9:53627, rmtPort=53627
[12:59:30,332][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery accepted incoming connection [rmtAddr=/34.241.7.9, rmtPort=50418]
[12:59:30,332][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery spawning a new thread for connection [rmtAddr=/34.241.7.9, rmtPort=50418]
[12:59:30,332][INFO][tcp-disco-sock-reader-#629][TcpDiscoverySpi] Started serving remote node connection [rmtAddr=/34.241.7.9:50418, rmtPort=50418]
[12:59:30,334][INFO][tcp-disco-sock-reader-#629][TcpDiscoverySpi] Finished
2nd ignite node logs
[12:13:12,850][INFO][main][TcpCommunicationSpi] Successfully bound communication NIO server to TCP port [port=47100, locHost=0.0.0.0/0.0.0.0, selectorsCnt=4, selectorSpins=0, pairedConn=false]
[12:13:12,869][WARNING][main][TcpCommunicationSpi] Message queue limit is set to 0 which may lead to potential OOMEs when running cache operations in FULL_ASYNC or PRIMARY_SYNC modes due to message queues growth on sender and receiver sides.
[12:13:12,888][WARNING][main][NoopCheckpointSpi] Checkpoints are disabled (to enable configure any GridCheckpointSpi implementation)
[12:13:12,918][WARNING][main][GridCollisionManager] Collision resolution is disabled (all jobs will be activated upon arrival).
[12:13:12,919][INFO][main][IgniteKernal] Security status [authentication=off, tls/ssl=off]
[12:13:13,275][INFO][main][ClientListenerProcessor] Client connector processor has started on TCP port 10800
[12:13:13,328][INFO][main][GridTcpRestProtocol] Command protocol successfully started [name=TCP binary, host=0.0.0.0/0.0.0.0, port=11211]
[12:13:13,369][INFO][main][IgniteKernal] Non-loopback local IPs: 172.17.0.1, 172.18.0.1, 172.31.29.3, fe80:0:0:0:10f0:92ff:fea1:d09f%vethee2519f, fe80:0:0:0:42:19ff:fe73:ee80%docker_gwbridge, fe80:0:0:0:42:e6ff:fe14:144a%docker0, fe80:0:0:0:4b3:6ff:fe01:7ee0%eth0, fe80:0:0:0:64f4:8bff:fe83:7e97%vethdae9948, fe80:0:0:0:9474:a1ff:fe6b:3368%vethcb2500f
[12:13:13,370][INFO][main][IgniteKernal] Enabled local MACs: 02421973EE80, 0242E614144A, 06B306017EE0, 12F092A1D09F, 66F48B837E97, 9674A16B3368
[12:13:13,429][INFO][main][TcpDiscoverySpi] Successfully bound to TCP port [port=47500, localHost=0.0.0.0/0.0.0.0, locNodeId=07b55edb-cdb7-45eb-bfd6-36fe9c5f5f15]
[12:13:18,555][WARNING][main][TcpDiscoverySpi] Node has not been connected to topology and will repeat join process. Check remote nodes logs for possible error messages. Note that large topology may require significant time to start. Increase 'TcpDiscoverySpi.networkTimeout' configuration property if getting this message on the starting nodes [networkTimeout=5000]
[12:18:20,925][WARNING][main][TcpDiscoverySpi] Node has not been connected to topology and will repeat join process. Check remote nodes logs for possible error messages. Note that large topology may require significant time to start. Increase 'TcpDiscoverySpi.networkTimeout' configuration property if getting this message on the starting nodes [networkTimeout=5000]
[12:23:22,710][WARNING][main][TcpDiscoverySpi] Node has not been connected to topology and will repeat join process. Check remote nodes logs for possible error messages. Note that large topology may require significant time to start. Increase 'TcpDiscoverySpi.networkTimeout' configuration property if getting this message on the starting nodes [networkTimeout=5000]
[12:28:23,988][WARNING][main][TcpDiscoverySpi] Node has not been connected to topology and will repeat join process. Check remote nodes logs for possible error messages. Note that large topology may require significant time to start. Increase 'TcpDiscoverySpi.networkTimeout' configuration property if getting this message on the starting nodes [networkTimeout=5000]
[12:33:25,004][WARNING][main][TcpDiscoverySpi] Node has not been connected to topology and will repeat join process. Check remote nodes logs for possible error messages. Note that large topology may require significant time to start. Increase 'TcpDiscoverySpi.networkTimeout' configuration property if getting this message on the starting nodes [networkTimeout=5000]
[12:38:25,815][WARNING][main][TcpDiscoverySpi] Node has not been connected to topology and will repeat join process. Check remote nodes logs for possible error messages. Note that large topology may require significant time to start. Increase 'TcpDiscoverySpi.networkTimeout' configuration property if getting this message on the starting nodes [networkTimeout=5000]
[12:43:26,831][WARNING][main][TcpDiscoverySpi] Node has not been connected to topology and will repeat join process. Check remote nodes logs for possible error messages. Note that large topology may require significant time to start. Increase 'TcpDiscoverySpi.networkTimeout' configuration property if getting this message on the starting nodes [networkTimeout=5000]
[12:48:27,916][WARNING][main][TcpDiscoverySpi] Node has not been connected to topology and will repeat join process. Check remote nodes logs for possible error messages. Note that large topology may require significant time to start. Increase 'TcpDiscoverySpi.networkTimeout' configuration property if getting this message on the starting nodes [networkTimeout=5000]

If you are using same config file for starting 2 nodes, then try to use localPortRange in DiscoverySpi.

RabbitMQ consumes memory and shuts

I just installed OpenStack Juno using devstack, and observed that RabbitMQ (package rabbitmq-server-3.1.5-10 installed by yum) is not stable, i.e. it quickly eats up the memory and shuts down; there is 2G of RAM. Below is the messages from logs and 'systemctl status' before the daemon died:
=INFO REPORT==== 18-Dec-2014::01:25:40 ===
vm_memory_high_watermark clear. Memory used:835116352 allowed:835212083
=WARNING REPORT==== 18-Dec-2014::01:25:40 ===
memory resource limit alarm cleared on node rabbit#node
=INFO REPORT==== 18-Dec-2014::01:25:40 ===
accepting AMQP connection <0.27011.5> (10.0.0.11:55198 -> 10.0.0.11:5672)
=INFO REPORT==== 18-Dec-2014::01:25:41 ===
vm_memory_high_watermark set. Memory used:850213192 allowed:835212083
=WARNING REPORT==== 18-Dec-2014::01:25:41 ===
memory resource limit alarm set on node rabbit#node.
**********************************************************
*** Publishers will be blocked until this alarm clears ***
**********************************************************
rabbitmqctl[770]: ===========
rabbitmqctl[770]: nodes in question: [rabbit#node]
rabbitmqctl[770]: hosts, their running nodes and ports:
rabbitmqctl[770]: - node: [{rabbitmqctl770,40089}]
rabbitmqctl[770]: current node details:
rabbitmqctl[770]: - node name: rabbitmqctl770#node
rabbitmqctl[770]: - home dir: /var/lib/rabbitmq
rabbitmqctl[770]: - cookie hash: FftrRFUESg4RKWsyb1cPqw==
systemd[1]: rabbitmq-server.service: control process exited, code=exited status=2
systemd[1]: Unit rabbitmq-server.service entered failed state.
I know about set_vm_memory_high_watermark, but it doesn't solve the issue. I want to ensure that the daemon doesn't shut down abruptly. I wonder if someone saw this before and could advise?
Thanks.
UPDATE
Upgraded to version 3.4.2 taken directly from www.rabbitmq.com/download.html The new version doesn't consume RAM that fast and tends to work longer then previous version, but eventually still eats out all the memory and shuts.

I think the number of connections in the servers are increasing and they are being held like that without closing that's why it is consuming more memory. When the usage of RAM increases beyond the watermark rabbitmq server won't accept any network request. Either you have to close the connections which all are opened or you have to increase the RAM of the system. But increasing the RAM will only reduce the problem for some time but you'll face the problem again it is better to close the connections.

try to use CloudAMQP instead of installing locally. This will be fixed then. firstly create a rabbitMQ account here. https://customer.cloudamqp.com/signup.
then create your queue there and connect with your application.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Apache Ignite Off Heap memory filling up - ignite

In case of enabled persistence after a data region is filled up Page replacement is triggered.

Related

RabbitMQ Free disk space is insufficient

Apache Ignite Topology Snapshot refreshed .Net

Ignite tomee-webprofile-7.1.0-> Start error reporting at Tomee

apache ignite node not able to join in the cluster

RabbitMQ consumes memory and shuts

Categories

Resources