Apache Ignite: Transaction support and cache definition

We are experimenting with Apache Ignite as a read- and write-through caching layer for distributed applications. The need is to weave a cache layer over the aggregates we depend on. The individual constituent entities that these aggregates comprise are managed entities maintained by an EntityManager.
Two Questions:
Does Apache Ignite participate in container-managed transactions out of the box?
To understand the answer to Q1, I ran the small experiment described below. Any insight into what induces the behaviour below?
Aggregates: Strategy and StrategyParam, a one-to-many mapping.
Individual entities: Strategy and StrategyParam (both managed by JPA/Hibernate).
CacheStore definition based on the EntityManager, e.g. the write method:
@Override
public void write(Cache.Entry<? extends Long, ? extends StrategyAggregate> entry) throws CacheWriterException {
    em.merge(entry.getValue().getStrategy());
    entry.getValue().getStrategyParamList().forEach(strategyParam -> em.merge(strategyParam));
}
Now when we start the first node with the above cache definition, the transactional behaviour works as expected: after the method completes, both the cache and the database are updated, and I can read the changes from the cache.
But as soon as a second node joins the cluster, the same API fails with
"no entitymanager available ..." followed by a stack trace stating that the transaction has been rolled back. Reads from the cache and direct reads via the EntityManager still work fine.
Stack trace:
Caused by: javax.cache.integration.CacheWriterException: javax.persistence.TransactionRequiredException: No EntityManager with actual transaction available for current thread - cannot reliably process 'merge' call
... 79 common frames omitted
Caused by: javax.persistence.TransactionRequiredException: No EntityManager with actual transaction available for current thread - cannot reliably process 'merge' call
at org.springframework.orm.jpa.SharedEntityManagerCreator$SharedEntityManagerInvocationHandler.invoke(SharedEntityManagerCreator.java:285) ~[spring-orm-4.3.25.RELEASE.jar:4.3.25.RELEASE]
at com.sun.proxy.$Proxy102.merge(Unknown Source) ~[na:na]
at StrategyAggregateCacheStore.write(StrategyAggregateCacheStore.java:47) ~[classes/:na]
at org.apache.ignite.internal.processors.cache.store.GridCacheStoreManagerAdapter.put(GridCacheStoreManagerAdapter.java:585) ~[ignite-core-2.11.0.jar:2.11.0]
... 78 common frames omitted
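The error on the second node suggests that, there, the CacheStore write runs on an Ignite system thread with no Spring-managed transaction bound to it, so the shared EntityManager proxy rejects the merge. One workaround is to make the store own its persistence rather than borrow the container's. Below is a minimal sketch under that assumption, reusing the aggregate types from the question; the persistence unit name "strategyPU" is hypothetical, and the store opens, commits, and closes its own resource-local EntityManager per write:

import javax.cache.Cache;
import javax.cache.integration.CacheLoaderException;
import javax.cache.integration.CacheWriterException;
import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.EntityTransaction;
import javax.persistence.Persistence;
import org.apache.ignite.cache.store.CacheStoreAdapter;

public class SelfContainedStrategyCacheStore extends CacheStoreAdapter<Long, StrategyAggregate> {
    // One factory per store instance; each write uses its own short-lived EntityManager.
    private final EntityManagerFactory emf = Persistence.createEntityManagerFactory("strategyPU");

    @Override
    public void write(Cache.Entry<? extends Long, ? extends StrategyAggregate> entry) throws CacheWriterException {
        EntityManager em = emf.createEntityManager();
        EntityTransaction tx = em.getTransaction();
        try {
            tx.begin();
            em.merge(entry.getValue().getStrategy());
            entry.getValue().getStrategyParamList().forEach(em::merge);
            tx.commit();
        } catch (RuntimeException e) {
            if (tx.isActive())
                tx.rollback();
            throw new CacheWriterException(e);
        } finally {
            em.close();
        }
    }

    @Override
    public StrategyAggregate load(Long key) throws CacheLoaderException {
        return null; // loading left out for brevity; the same open/close pattern applies
    }

    @Override
    public void delete(Object key) throws CacheWriterException {
        // deletion left out for brevity
    }
}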

Related

How can I run various tests for Quarkus Kafka Streams with Testcontainers?

Following the steps described here https://quarkus.io/guides/kafka#testing-using-a-kafka-broker it is possible to define Quarkus tests using a "real" Kafka broker.
@QuarkusTest instantiates all the resources needed, including KafkaStreams, and during the individual tests (@Test) we can limit ourselves to producing records for the input topics and consuming results from the output topics.
The current stream topology includes groupBy, aggregation, and join steps, among others.
The problem is that, after the first test, all other tests see "dirty aggregates". A kafkaStreams.cleanUp() might solve the problem, but it produces an error:
Caused by: java.lang.IllegalStateException: Cannot clean up while running.
at org.apache.kafka.streams.KafkaStreams.cleanUp(KafkaStreams.java:1486)
at eu.reply.lea.visibility.unieuro.stream.TopologyProducerIT.setup(TopologyProducerIT.java:70)
at eu.reply.lea.visibility.unieuro.stream.TopologyProducerIT_Bean.create(Unknown Source)
at eu.reply.lea.visibility.unieuro.stream.TopologyProducerIT_Bean.get(Unknown Source)
at eu.reply.lea.visibility.unieuro.stream.TopologyProducerIT_Bean.get(Unknown Source)
at io.quarkus.arc.impl.InstanceImpl.getBeanInstance(InstanceImpl.java:225)
at io.quarkus.arc.impl.InstanceImpl.getInternal(InstanceImpl.java:211)
at io.quarkus.arc.impl.InstanceImpl.get(InstanceImpl.java:97)
... 73 more
The question is: what is the correct approach to KafkaStreams testing in Quarkus? The "traditional" approach of performing a test, rolling back, and continuing with the next one does not seem applicable.
The following approach also fails:
// test 1
kafkaStreams.close();
kafkaStreams.cleanUp();
kafkaStreams.start();
// test 2
kafkaStreams.close();
kafkaStreams.cleanUp();
kafkaStreams.start();
// ...
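One alternative worth considering, sketched below inside a test method: drive the topology with TopologyTestDriver from kafka-streams-test-utils instead of a live KafkaStreams instance. Each driver gets its own isolated state, so there are no "dirty aggregates" to clean up between tests. The names buildTopology, "input" and "output" are hypothetical stand-ins for however the application exposes its topology and topics:

import java.util.Properties;
import java.util.UUID;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.TestInputTopic;
import org.apache.kafka.streams.TestOutputTopic;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.TopologyTestDriver;

@Test
void aggregatesStartFromCleanState() {
    Topology topology = buildTopology(); // hypothetical: however the app builds its Topology

    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "test-" + UUID.randomUUID());
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "dummy:1234"); // never contacted by the driver

    try (TopologyTestDriver driver = new TopologyTestDriver(topology, props)) {
        TestInputTopic<String, String> in =
                driver.createInputTopic("input", new StringSerializer(), new StringSerializer());
        TestOutputTopic<String, String> out =
                driver.createOutputTopic("output", new StringDeserializer(), new StringDeserializer());

        in.pipeInput("key", "value");
        // assert on out.readKeyValuesToList() ...
    }
}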

Infinispan clustered lock performance does not improve with more nodes?

I have a piece of code that is essentially executing the following with Infinispan in embedded mode, using version 13.0.0 of the -core and -clustered-lock modules:
@Inject
lateinit var lockManager: ClusteredLockManager

private fun getLock(lockName: String): ClusteredLock {
    lockManager.defineLock(lockName)
    return lockManager.get(lockName)
}

fun createSession(sessionId: String) {
    tryLockCounter.increment()
    logger.debugf("Trying to start session %s. trying to acquire lock", sessionId)
    Future.fromCompletionStage(getLock(sessionId).lock()).map {
        acquiredLockCounter.increment()
        logger.debugf("Starting session %s. Got lock", sessionId)
    }.onFailure {
        logger.errorf(it, "Failed to start session %s", sessionId)
    }
}
I take this piece of code and deploy it to Kubernetes, running it in six pods distributed over six nodes in the same region. The code exposes createSession with random GUIDs through an API. The API is called to create sessions in chunks of 500, with a k8s service in front of the pods so the load gets balanced across them. I notice that the time to acquire a lock grows linearly with the number of sessions: in the beginning it is around 10 ms, at about 20,000 sessions it takes about 100 ms, and the trend continues steadily.
I then take the same code and run it with twelve pods on twelve nodes. To my surprise, the performance characteristics are almost identical to the six-pod run. I've been digging into the code but still haven't figured out why. Is there a good reason why Infinispan does not seem to perform better with more nodes here?
For completeness the configuration of the locks are as follows:
val global = GlobalConfigurationBuilder.defaultClusteredBuilder()
global.addModule(ClusteredLockManagerConfigurationBuilder::class.java)
    .reliability(Reliability.AVAILABLE)
    .numOwner(1)
and looking at the code, the clustered locks use a DIST_SYNC cache, which should spread the load of the cache across the different nodes.
UPDATE:
The two counters in the code above are simply Micrometer counters. It is through them and Prometheus that I can see how lock creation starts to slow down.
It is correctly observed that there is one lock created per session id; this is by design and what we'd like. Our use case is that we want to ensure that a session is running in at least one place. Without going too deep into detail, this can be achieved by ensuring that at least two pods try to acquire the same lock. The Infinispan library is great in that it tells us directly when the lock holder dies, without any additional chattiness between pods, which means we have a "cheap" way of ensuring that execution of the session continues when one pod is removed.
After digging deeper into the code I found the following in CacheNotifierImpl in the core library:
private CompletionStage<Void> doNotifyModified(K key, V value, Metadata metadata, V previousValue,
      Metadata previousMetadata, boolean pre, InvocationContext ctx, FlagAffectedCommand command) {
   if (clusteringDependentLogic.running().commitType(command, ctx, extractSegment(command, key), false).isLocal()
         && (command == null || !command.hasAnyFlag(FlagBitSets.PUT_FOR_STATE_TRANSFER))) {
      EventImpl<K, V> e = EventImpl.createEvent(cache.wired(), CACHE_ENTRY_MODIFIED);
      boolean isLocalNodePrimaryOwner = isLocalNodePrimaryOwner(key);
      Object batchIdentifier = ctx.isInTxScope() ? null : Thread.currentThread();
      try {
         AggregateCompletionStage<Void> aggregateCompletionStage = null;
         for (CacheEntryListenerInvocation<K, V> listener : cacheEntryModifiedListeners) {
            // Need a wrapper per invocation since converter could modify the entry in it
            configureEvent(listener, e, key, value, metadata, pre, ctx, command, previousValue, previousMetadata);
            aggregateCompletionStage = composeStageIfNeeded(aggregateCompletionStage,
                  listener.invoke(new EventWrapper<>(key, e), isLocalNodePrimaryOwner));
         }
The lock library uses a clustered listener on the entry-modified event, and that listener uses a filter so it is only notified when the key for the lock is modified. But it seems the core library still has to check this condition for every registered listener, and that list of course becomes very long as the number of sessions grows. I suspect this is the reason, and if so, it would be really awesome if the core library supported a keyed filter so that it could keep these listeners in a hash map instead of walking the whole list.
I believe you are creating a clustered lock per session id. Is this what you need? What is the acquiredLockCounter? We are about to deprecate the "lock" method in favour of "tryLock" with a timeout, since the lock method will block forever if the clustered lock is never acquired. Do you ever unlock the clustered lock in another piece of code? Sharing a complete reproducer of the code would be very helpful for us. Thanks!
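For reference, a minimal sketch of the tryLock variant mentioned above (Java here, assuming the same lockManager and sessionId as in the question; the one-second timeout is illustrative). Unlike lock(), it completes with false rather than blocking forever when another pod holds the lock:

import java.util.concurrent.TimeUnit;
import org.infinispan.lock.api.ClusteredLock;

ClusteredLock lock = lockManager.get(sessionId);
lock.tryLock(1, TimeUnit.SECONDS).thenAccept(acquired -> {
    if (acquired) {
        // got the lock: run the session and call lock.unlock() when done
    } else {
        // held by another pod: schedule a retry instead of waiting forever
    }
});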

Apache Geode debug Unknown pdx type=2140705

I start a gfsh client and connect to Geode. There is a lot of data in myRegion, and to check through it I run:
query --query="select * from /myRegion"
I am getting the response:
Result : false
startCount : 0
endCount : 20
Message : Unknown pdx type=2140705
How does one troubleshoot / debug this problem?
UPDATE: The error in the Geode server log is:
[info 2018/07/04 10:53:07.275 BST IsGeode <Function Execution Processor1> tid=0x48] Exception occurred:
java.lang.IllegalStateException: Unknown pdx type=1318971
at org.apache.geode.internal.InternalDataSerializer.readPdxSerializable(InternalDataSerializer.java:3042)
at org.apache.geode.internal.InternalDataSerializer.basicReadObject(InternalDataSerializer.java:2859)
at org.apache.geode.DataSerializer.readObject(DataSerializer.java:2961)
at org.apache.geode.internal.util.BlobHelper.deserializeBlob(BlobHelper.java:90)
at org.apache.geode.internal.cache.EntryEventImpl.deserialize(EntryEventImpl.java:1911)
at org.apache.geode.internal.cache.EntryEventImpl.deserialize(EntryEventImpl.java:1904)
at org.apache.geode.internal.cache.PreferBytesCachedDeserializable.getDeserializedValue(PreferBytesCachedDeserializable.java:73)
at org.apache.geode.internal.cache.LocalRegion.getDeserialized(LocalRegion.java:1269)
at org.apache.geode.internal.cache.LocalRegion$NonTXEntry.getValue(LocalRegion.java:8771)
at org.apache.geode.internal.cache.EntriesSet$EntriesIterator.moveNext(EntriesSet.java:179)
at org.apache.geode.internal.cache.EntriesSet$EntriesIterator.next(EntriesSet.java:134)
at org.apache.geode.cache.query.internal.CompiledSelect.doNestedIterations(CompiledSelect.java:837)
at org.apache.geode.cache.query.internal.CompiledSelect.doIterationEvaluate(CompiledSelect.java:699)
at org.apache.geode.cache.query.internal.CompiledSelect.evaluate(CompiledSelect.java:423)
at org.apache.geode.cache.query.internal.CompiledSelect.evaluate(CompiledSelect.java:53)
at org.apache.geode.cache.query.internal.DefaultQuery.executeUsingContext(DefaultQuery.java:558)
at org.apache.geode.cache.query.internal.DefaultQuery.execute(DefaultQuery.java:385)
at org.apache.geode.cache.query.internal.DefaultQuery.execute(DefaultQuery.java:319)
at org.apache.geode.management.internal.cli.functions.DataCommandFunction.select(DataCommandFunction.java:247)
at org.apache.geode.management.internal.cli.functions.DataCommandFunction.select(DataCommandFunction.java:202)
at org.apache.geode.management.internal.cli.functions.DataCommandFunction.execute(DataCommandFunction.java:147)
at org.apache.geode.internal.cache.MemberFunctionStreamingMessage.process(MemberFunctionStreamingMessage.java:185)
at org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:374)
at org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:440)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at org.apache.geode.distributed.internal.DistributionManager.runUntilShutdown(DistributionManager.java:662)
at org.apache.geode.distributed.internal.DistributionManager$9$1.run(DistributionManager.java:1108)
at java.lang.Thread.run(Thread.java:748)
You can tell the immediate cause from the stack trace.
A PDX-serialized stream contains a type id, which is a reference into a repository of type metadata maintained by a GemFire cluster. In this case, the serialized data of the object contained a type id that is not in the cluster's metadata repository.
So the question becomes: what serialized that object, and why did it use an invalid type id?
The only way I've seen this happen before is when a cluster is fully restarted and the PDX metadata goes away, either because it was not persistent or because it was deleted (by clearing out the locator working directory, for example).
GemFire clients cache the mapping between a type and its type id. This allows them to serialize objects quickly without continually looking up the type id from the server. Client connections can persist across cluster restarts, and when a client reconnects it does not flush the cached information; it continues to write objects using its cached type id.
So the combination of a cluster restart that loses the PDX metadata and a client that is not restarted (e.g. an app server) is the only way I have seen this happen. Does this match your scenario?
If so, one of the best ways to avoid this is to persist your PDX metadata and never delete it.
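A minimal sketch of what that looks like when the server cache is created programmatically; the disk store name "pdxStore" is hypothetical, and the same settings are available as attributes of the pdx element in cache.xml:

import org.apache.geode.cache.Cache;
import org.apache.geode.cache.CacheFactory;

// Persist PDX type metadata so type ids survive a full cluster restart.
Cache cache = new CacheFactory()
        .setPdxPersistent(true)
        .setPdxDiskStore("pdxStore") // the named disk store must already be defined
        .create();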

Corda notary ClassNotFoundException: Malformed transaction, OUTPUTS_GROUP at index 0 cannot be deserialised

When running an InitiatingFlow/InitiatedBy between two nodes, my notary node threw an error: java.lang.Exception: Malformed transaction, OUTPUTS_GROUP at index 0 cannot be deserialised
And a bit further down the trace: Caused by: java.lang.ClassNotFoundException: xxx.xxx.xxx.shared.states.OrderItemState
Including the 'shared' CorDapp, where this state is defined, in my notary fixes the issue, but I don't understand why this is necessary.
I was able to send other states back and forth between the nodes just fine without including that CorDapp.
The only difference is that OrderItemState is a LinearState where the other ones were FungibleAssets; should I look for the answer there?
I assume you're using a validating notary. A validating notary is one that checks that the transaction is valid, as well as checking that it does not contain a double-spend attempt. This has a cost in terms of privacy. See https://docs.corda.net/key-concepts-notaries.html#validation.
If you look at the code that sends the transaction to the notary in NotaryFlow.Client, you can see that a validating notary is sent the entire transaction, and therefore needs the CorDapp defining the involved states in its cordapps folder:
if (serviceHub.networkMapCache.isValidatingNotary(notaryParty)) {
    subFlow(SendTransactionWithRetry(session, stx))
    session.receive<List<TransactionSignature>>()
}
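Conversely, a non-validating notary only sees a filtered version of the transaction (state references and signatures rather than the states themselves), so it never needs the state classes. If validation at the notary is not required, a flow can select one explicitly; a minimal sketch (Java, inside a FlowLogic, using the same NetworkMapCache API the snippet above relies on; the selection logic is illustrative):

import net.corda.core.identity.Party;
import net.corda.core.transactions.TransactionBuilder;

// Pick the first non-validating notary from the network map (illustrative).
Party notary = getServiceHub().getNetworkMapCache().getNotaryIdentities().stream()
        .filter(n -> !getServiceHub().getNetworkMapCache().isValidatingNotary(n))
        .findFirst()
        .orElseThrow(() -> new IllegalStateException("no non-validating notary available"));
TransactionBuilder builder = new TransactionBuilder(notary);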

Using a JNDI connection pool as the DataNucleus PersistenceManagerFactory

I am developing a web application using DataNucleus as the DAO layer (mainly for historical reasons). It runs inside Payara Server (a GlassFish 4 fork).
It works fine, but now I'd like to use a JNDI DB connection pool to obtain the PersistenceManagerFactory for DataNucleus.
From the documentation, it seems that the following code should suffice:
pmf = JDOHelper.getPersistenceManagerFactory("jdbc/HxWmDb", context);
but this way I get an error starting the application (DbSession is the class which implements the DAO layer, and the error line is exactly the one above):
Caused by: java.lang.ClassCastException
at com.sun.corba.ee.impl.javax.rmi.PortableRemoteObject.narrow(PortableRemoteObject.java:262)
at javax.rmi.PortableRemoteObject.narrow(PortableRemoteObject.java:150)
at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:1791)
at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:1755)
at ejb.DbSession.<init>(DbSession.java:119)
...
Caused by: java.lang.ClassCastException: com.sun.gjc.spi.jdbc40.DataSource40 cannot be cast to org.omg.CORBA.Object
at com.sun.corba.ee.impl.javax.rmi.PortableRemoteObject.narrow(PortableRemoteObject.java:245)
Any suggestions?
A little update, as requested by DN1:
As a first approach, I tried exactly what is described in the link:
Properties properties = new Properties();
properties.setProperty("datanucleus.ConnectionFactoryName","jdbc/HxWmDb");
PersistenceManagerFactory pmf = JDOHelper.getPersistenceManagerFactory(properties);
And the error is, as already said, that a connection URL is still required:
Caused by: org.datanucleus.exceptions.NucleusException: You haven't specified persistence property 'datanucleus.ConnectionURL'
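Incidentally, the ClassCastException in the first attempt comes from the two-argument JDOHelper.getPersistenceManagerFactory(name, context), which performs a JNDI lookup expecting a PersistenceManagerFactory and instead found the DataSource. A minimal sketch of the properties-based route, assuming DataNucleus's JDO implementation is on the classpath (the property values are illustrative):

import java.util.Properties;
import javax.jdo.JDOHelper;
import javax.jdo.PersistenceManagerFactory;

Properties props = new Properties();
// Bootstrap DataNucleus explicitly rather than relying on implementation discovery.
props.setProperty("javax.jdo.PersistenceManagerFactoryClass",
        "org.datanucleus.api.jdo.JDOPersistenceManagerFactory");
// Use the container's pooled DataSource instead of a datanucleus.ConnectionURL.
props.setProperty("datanucleus.ConnectionFactoryName", "jdbc/HxWmDb");
PersistenceManagerFactory pmf = JDOHelper.getPersistenceManagerFactory(props);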