Coordinator status is unavailable in Infinispan cluster

We are using an Infinispan cluster with a jgroups-tcp configuration in one of our setups. We sometimes observe that the coordinator status in jconsole shows as unavailable, but this does not affect any functionality. However, I'm curious as to why the status shows as unavailable. Any pointers would be appreciated.

This is probably because jconsole cannot display the coordinator attribute's type. For example, if the type is org.jgroups.Address, that class would have to be added to the classpath used by jconsole.
Actually, I'm not sure about that last statement, but searching for jconsole and custom types might be helpful.
This is not JGroups-specific.
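If that is indeed the cause, the JConsole documentation describes putting the extra classes on jconsole's own classpath, along these lines (the jgroups.jar path is an assumption for this setup):

```sh
jconsole -J-Djava.class.path=$JAVA_HOME/lib/jconsole.jar:$JAVA_HOME/lib/tools.jar:/path/to/jgroups.jar
```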

Related

Apache Ignite node startup error - Joining node doesn't have stored group keys

I made some changes in my dev Ignite cluster to enable persistence. Now when I start my cluster (2 nodes, version 2.9.0), the first node starts just fine but the second one doesn't seem to, and the first node shows the error below in its log:
[14:18:59] Joining node doesn't have stored group keys [node=4f20534b-1e44-46af-b81a-34d35807abd8]
I saw a similar question, whose answer mentions TDE, or transparent data encryption. But I have not enabled data encryption anywhere in my config.
What could be the problem? Please help.
It's a known issue that Ignite checks a list of known encryption keys for cache groups even when TDE is turned off.
It doesn't actually affect the node start; just an info message is printed.
You can find the discussion regarding that here.

Akka.net / Cluster - How to "Heal" the topology when the leader dies?

I set up a basic test topology with Petabridge Lighthouse and two simple test actors that communicate with each other. This works well so far, but there is one problem: Lighthouse (or the underlying Akka.Cluster) makes one of my actors the leader, and when a node is not shut down gracefully (e.g. when something crashes badly or I simply hit "Stop" in VS), Lighthouse is no longer usable. Tons of exceptions scroll by and it must be restarted.
Is it possible to configure Akka.Cluster (.NET) in such a way that the rest of the topology elects a new leader and carries on?
There are two things to point out here. One is that if there is a serious risk of your lighthouse node going down, you should probably have more than one: the akka.cluster.seed-nodes setting can take multiple addresses, the only requirement being that all nodes, including the lighthouses, list them in the same order. This way, if one lighthouse goes down, another can still take over its role.
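A minimal sketch of such a configuration, assuming two Lighthouse nodes (the system name MyCluster, the host names, and port 4053 are all hypothetical):

```hocon
akka {
  cluster {
    # Every node in the cluster, including the lighthouses themselves,
    # must list the seed nodes in exactly this order.
    seed-nodes = [
      "akka.tcp://MyCluster@lighthouse1:4053",
      "akka.tcp://MyCluster@lighthouse2:4053"
    ]
  }
}
```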
The other thing is that when a node becomes unreachable (either because the process crashed or the network connection is unavailable), by default the Akka.NET cluster won't down that node. You need to tell it how it should behave when that happens:
At any point you can provide your own implementation of the IDowningProvider interface, which will be triggered after a certain period of node inactivity is reached. Then you can decide manually what to do. To use it, add the fully qualified type name to the following setting: akka.cluster.downing-provider = "MyNamespace.MyDowningProvider, MyAssembly". An example downing provider implementation can be seen here.
You can specify akka.cluster.auto-down-unreachable-after = 10s (or another time value) to give an unreachable node a timeout to rejoin - if it doesn't rejoin before the timeout fires, it will be kicked out of the cluster. The only risk here is a cluster split brain: in certain situations a network failure between machines can split your cluster in two, and if that happens with auto-down set up, the two halves may consider each other dead. In this case you could end up having two separate clusters instead of one.
Starting from the next release (Akka.Cluster 1.3.3), a new Split Brain Resolver feature will be available. It will allow you to configure more advanced strategies for how to behave in case of network partitions and machine crashes.

Apache Ignite node segmented

From time to time a node gets segmented. It happens in a cluster with ~40 nodes, and only on one node at a time. A few times it happened while there was some heavy GC work going on; on the other hand, I have seen similar heavy GC work going on with no node getting segmented. I have tuned the failure detection timeout to be bigger than the max GC pause I was experiencing, but that didn't help - the failure detection timeout is almost 2x bigger than the max GC pause. How can I figure out whether this is really GC or network issues?
I doubt it is related to networking, as other nodes would fail as well. When the process gets restarted it works fine, so I would rule out network issues.
Where can I look at the code which produces EVT_NODE_SEGMENTED?
I debugged the IgniteConfiguration object and saw that segResolvers are null/empty, so I have no clue where the event is published.
S3-based discovery is used; not sure it matters here (Ignite 1.9).
I wonder under which conditions such an event is produced - when a node is unable to connect to the majority of the other nodes, or to all of them?
This event occurs when a node disconnects and can't connect back; see ClientImpl.java and ServerImpl.java.
Look at the logs of the segmented node; there should be something like a "Node is out of topology (probably, due to short-time network problems)" message, so you can figure out the exact problem.
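If you want to catch the event programmatically on each node (for example, to line it up with GC logs from the same time window), a minimal sketch using Ignite's Java events API might look like this. Note that Ignite records no events unless they are explicitly enabled via includeEventTypes:

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.events.Event;
import org.apache.ignite.events.EventType;

public class SegmentationWatcher {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();
        // Events are disabled by default; enable the segmentation event.
        cfg.setIncludeEventTypes(EventType.EVT_NODE_SEGMENTED);

        Ignite ignite = Ignition.start(cfg);

        // Log the event locally so it can be correlated with GC pauses
        // and network monitoring data.
        ignite.events().localListen((Event evt) -> {
            System.err.println("Segmentation event: " + evt);
            return true; // true = keep this listener registered
        }, EventType.EVT_NODE_SEGMENTED);
    }
}
```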

Monitoring glassfish session failover?

On a two-instance, single-node test cluster I wanted to get a list of which sessions are active on which instance, and then stop/kill an instance and get some information about the failover process - I want to see it happening.
I've read that it's considered a reasonable strategy to have multiple instances on a single node for "don't put all your eggs in one basket" reasons, so if an instance went bad I can see a need to figure out the session-to-instance mapping.
I've read all the docs I can think of reading but have not seen anything that does what I want. I'm at a disadvantage because ever since running the create-cluster command from asadmin, the admin console simply won't load (it tries to, but after 10 minutes it still hasn't loaded the login page).
Any suggestions? Is JMS something to look at here? I'm running GlassFish 3.1.2.
Thanks.

Weblogic + daemon thread

Hoping someone may be able to help me with a solution for starting a background thread to monitor my database connection. Our application is deployed on WebLogic 9.2, and I wondered if there was a way to start a thread running while the application is running? Thanks.
I'm attempting to monitor my database to ensure I can switch databases should my connection fail. For this reason, I'm looking for an easy solution to run a background task.
Even though in many application servers you can, you're not supposed to create your own threads in a Java EE server; see Why is spawning threads in Java EE container discouraged? for some background and workarounds.
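One container-friendly workaround on WebLogic is the CommonJ timer API, which runs scheduled callbacks on container-managed threads. A rough sketch, assuming the TimerManager is declared as a resource-ref in your deployment descriptor and bound at java:comp/env/tm/default (the 30-second interval is arbitrary):

```java
import javax.naming.InitialContext;
import commonj.timers.Timer;
import commonj.timers.TimerListener;
import commonj.timers.TimerManager;

public class DbMonitorStarter {
    public void start() throws Exception {
        // Assumed JNDI name; the TimerManager must be declared as a
        // resource-ref in web.xml/ejb-jar.xml for this lookup to work.
        TimerManager tm = (TimerManager)
            new InitialContext().lookup("java:comp/env/tm/default");

        // Check the database every 30 seconds on a container-managed thread.
        tm.scheduleAtFixedRate(new TimerListener() {
            public void timerExpired(Timer timer) {
                // ping the datasource here and trigger failover if needed
            }
        }, 0L, 30000L);
    }
}
```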
Depending on what condition you want to check for and what action you want to take, you can use the WebLogic Diagnostic Framework (WLDF). You could have it send a JMS message when it detects a certain condition and then do whatever you want with an MDB.
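As a hypothetical sketch of that second part, an MDB listening on the queue the WLDF notification targets could look like the following (the queue name is an assumption, and on WebLogic 9.2 you would wire the destination through an EJB 2.1 deployment descriptor rather than the EJB 3 annotation shown here):

```java
import javax.ejb.MessageDriven;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.TextMessage;

// Listens on the queue that the WLDF watch/notification sends its
// JMS notifications to ("jms/wldfAlerts" is an assumed name).
@MessageDriven(mappedName = "jms/wldfAlerts")
public class WldfAlertMdb implements MessageListener {
    public void onMessage(Message msg) {
        try {
            if (msg instanceof TextMessage) {
                // React to the detected condition, e.g. switch datasources.
                System.out.println("WLDF alert: " + ((TextMessage) msg).getText());
            }
        } catch (JMSException e) {
            throw new RuntimeException(e);
        }
    }
}
```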
Update your question with the condition & action you want to take and I can provide more details.
Generally speaking, starting your own threads isn't advisable.
UPDATE:
By your description I'm guessing you don't use JNDI or WebLogic datasources. It would be better if you used WebLogic datasources for connection pooling. WebLogic can detect that a connection in the pool is bad and recreate it before handing it to your application.
If you are referring to two different databases, then WebLogic has a multi-datasource option with failover capability. What you should do is configure two datasources - one primary and one secondary - and then create a multi-datasource to wrap them. You then use the JNDI name of the multi-datasource in your application. Obviously, if you do this you need to make sure the data is consistent between the two DB instances.
This does not make your application WebLogic-specific since it would just be a change to a JNDI name. WebLogic takes care of the rest.
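For illustration, the application would only ever refer to the multi-datasource's JNDI name (here jdbc/myMultiDS, an assumed name):

```java
import java.sql.Connection;
import javax.naming.InitialContext;
import javax.sql.DataSource;

public class Dao {
    public void doWork() throws Exception {
        // Look up the multi-datasource; WebLogic decides whether the
        // connection comes from the primary or the failover pool.
        InitialContext ctx = new InitialContext();
        DataSource ds = (DataSource) ctx.lookup("jdbc/myMultiDS");

        Connection con = ds.getConnection();
        try {
            // use the connection as usual
        } finally {
            con.close(); // return the connection to the pool
        }
    }
}
```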