Apache Ignite - Partition Map exchange causes deadlock when used with write through enabled cache - ignite

We have Ignite running in server mode in our JVM. Ignite is going into deadlock in following scenario. I have added the thread stack at the end of this question
a.Create a cache with write through enabled
b.In CacheWriter.write() implementation
1.Wait for a second to for step c to be invoked
2.Try to read from another cache
c. While step b is executing Trigger a thread which will create a new
cache.
d.On executing above scenario, Ignite is going into deadlock as
1.Readlock has been acquired by cache.put() operation
2.When cache creation is triggered in separate thread, Partition Map Exchange is also started
3.PME tries to acquire all 16 locks , but wait as one Read lock is already acquire
4.While reading from cache, cache.get() can not complete as it waits for current Partition Map Exchange to complete
We have face this issue in production and above scenario is just its reproducer. Write Through implementation is just trying to read from cache and cache creation is happening in totally different thread
Why Ignite is blocking all cache.get() operation for PME when it does not even have all required locks? Shouldn’t the call be blocked only after PME operation has all the locks?
why PME stops everything? If I create cache A then only related operation for cache A or its cache group should be stopped
Also is there any solution to solve this deadlock?
Thread executing cache.put() and write through
"main" #1 prio=5 os_prio=0 tid=0x0000000003505000 nid=0x43f4 waiting on condition [0x000000000334b000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
at org.apache.ignite.internal.processors.cache.GridCacheAdapter.get(GridCacheAdapter.java:4870)
at org.apache.ignite.internal.processors.cache.GridCacheAdapter.repairableGet(GridCacheAdapter.java:4830)
at org.apache.ignite.internal.processors.cache.GridCacheAdapter.get(GridCacheAdapter.java:1463)
at org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.get(IgniteCacheProxyImpl.java:1128)
at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.get(GatewayProtectedCacheProxy.java:688)
at ReadWriteThroughInterceptor.write(ReadWriteThroughInterceptor.java:70)
at org.apache.ignite.internal.processors.cache.GridCacheLoaderWriterStore.write(GridCacheLoaderWriterStore.java:121)
at org.apache.ignite.internal.processors.cache.store.GridCacheStoreManagerAdapter.put(GridCacheStoreManagerAdapter.java:585)
at org.apache.ignite.internal.processors.cache.GridCacheMapEntry$AtomicCacheUpdateClosure.update(GridCacheMapEntry.java:6468)
at org.apache.ignite.internal.processors.cache.GridCacheMapEntry$AtomicCacheUpdateClosure.call(GridCacheMapEntry.java:6239)
at org.apache.ignite.internal.processors.cache.GridCacheMapEntry$AtomicCacheUpdateClosure.call(GridCacheMapEntry.java:5923)
at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Invoke.invokeClosure(BPlusTree.java:4041)
at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Invoke.access$5700(BPlusTree.java:3935)
at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invokeDown(BPlusTree.java:2039)
at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invoke(BPlusTree.java:1923)
at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke0(IgniteCacheOffheapManagerImpl.java:1734)
at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke(IgniteCacheOffheapManagerImpl.java:1717)
at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.invoke(IgniteCacheOffheapManagerImpl.java:441)
at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.innerUpdate(GridCacheMapEntry.java:2327)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateSingle(GridDhtAtomicCache.java:2553)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.update(GridDhtAtomicCache.java:2016)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1833)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1692)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.sendSingleRequest(GridNearAtomicAbstractUpdateFuture.java:300)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.map(GridNearAtomicSingleUpdateFuture.java:481)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.mapOnTopology(GridNearAtomicSingleUpdateFuture.java:441)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.map(GridNearAtomicAbstractUpdateFuture.java:249)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.update0(GridDhtAtomicCache.java:1147)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.put0(GridDhtAtomicCache.java:615)
at org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2571)
at org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2550)
at org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.put(IgniteCacheProxyImpl.java:1337)
at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.put(GatewayProtectedCacheProxy.java:868)
at com.eqtechnologic.eqube.cache.tests.readerwriter.WriteReadThroughTest.writeToCache(WriteReadThroughTest.java:54)
at com.eqtechnologic.eqube.cache.tests.readerwriter.WriteReadThroughTest.lambda$runTest$0(WriteReadThroughTest.java:26)
at com.eqtechnologic.eqube.cache.tests.readerwriter.WriteReadThroughTest$$Lambda$1095/2028767654.execute(Unknown Source)
at org.junit.jupiter.api.AssertDoesNotThrow.assertDoesNotThrow(AssertDoesNotThrow.java:50)
at org.junit.jupiter.api.AssertDoesNotThrow.assertDoesNotThrow(AssertDoesNotThrow.java:37)
at org.junit.jupiter.api.Assertions.assertDoesNotThrow(Assertions.java:3060)
at WriteReadThroughTest.runTest(WriteReadThroughTest.java:24)
PME thread waiting for locks
"exchange-worker-#39" #56 prio=5 os_prio=0 tid=0x0000000022b91800 nid=0x450 waiting on condition [0x000000002866e000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000076e73b428> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireInterruptibly(AbstractQueuedSynchronizer.java:897)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1222)
at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lockInterruptibly(ReentrantReadWriteLock.java:998)
at org.apache.ignite.internal.util.StripedCompositeReadWriteLock$WriteLock.lock0(StripedCompositeReadWriteLock.java:192)
at org.apache.ignite.internal.util.StripedCompositeReadWriteLock$WriteLock.lockInterruptibly(StripedCompositeReadWriteLock.java:172)
at org.apache.ignite.internal.util.IgniteUtils.writeLock(IgniteUtils.java:10487)
at org.apache.ignite.internal.processors.cache.distributed.dht.topology.GridDhtPartitionTopologyImpl.updateTopologyVersion(GridDhtPartitionTopologyImpl.java:272)
at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.updateTopologies(GridDhtPartitionsExchangeFuture.java:1269)
at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:1028)
at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:3370)
at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:3197)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
at java.lang.Thread.run(Thread.java:748)

Technically, you have answered your question on your own, that is great work, to be honest.
You are not supposed to have blocking methods in your write-through cache store implementation that might get in conflict with PME or cause pool starvation.
You have to remember that PME is a show-stopper mechanism: the entire user load is stopped. In short, that is required to ensure ACID guarantees. The lock indeed is divided into multiple parts to speed up the processing, i.e. allowing up to 16 threads to perform cache operations concurrently. But a PME does need exclusive control over the cluster, thus it acquires a write lock over all the threads.
Shouldn’t the call be blocked only after PME operation has all the
locks?
Yes, that's indeed how it's supposed to work. But in your case, PME tries to get the write lock, whereas the read lock is there, therefore it's waiting for its completion, and all further read locks are being queued after the write lock.
Also is there any solution to solve this deadlock?
move cache-related logic out of the CacheStore. Ideally, do not start caches dynamically, since that triggers PME. Have them created in advance if possible
check if other mechanisms like continuous-queries or entry processo would work.
But still, it all depends on your use case.

I don't think creating a cache inside the cache store will work. From the documentation for CacheWriter:
A CacheWriter is used for write-through to an external resource.
(Emphasis mine.)
Without knowing your use case, it's difficult to suggest an alternative approach, but creating your caches in advance or using a continuous query as a trigger works in similar situations.

Related

How to re-queue celery task until after lock is released

I have a number of celery tasks that currently use a non-blocking redis lock to ensure atomic edits of a cluster of db tables for a particular user. The timing of the tasks is unpredictable and the frequency is high but sporadic. For example some are triggered by web hooks, other by user interaction.
Right now I am using autoretry_for and retry_backoff to respond to failures to acquire the non-blocking lock. This is sub-optimal since it leaves lots of idle time once the lock has been released. I can't use a blocking lock because I need to let other tasks that don't require the lock to still run.
What I need is a way to re-run any tasks that failed to acquire the non-blocking lock as soon as possible after the lock is released. It doesn't matter if some handful of non-locking tasks are run in the meantime, I just need the tasks that failed to acquire to run reasonably soon but certainly without idle time.
How can this be done? It's like I want to add the tasks to some kind of global celery chain, but post-hoc, and then fire any waiting chains every time the non-blocking lock is released. Or something like that. Which user each tasks are crunching is calculated from the task's arguments, it is not passed in through the arguments (so key-based queues won't work).

Weblogic Stuck thread impacts other runnable threads in it

I am using Weblogic 10.3.6 with 8 managed servers configured with session timeout as 600 seconds. I have an issue with my application that when a session gets timed out in 600 seconds(I am receiving as STUCK alerts which is also configured) I am facing slowness in my application. My question is,
Will all threads be impacted because of one STUCK thread(STUCK thread
was due to DB transaction timeout)
I assume it will not be, but wanted to confirm.
Depends on your application. In general no, but if for example the stuck thread is holding a lock on an object (database, file, etc.) called by other requests, these may be affected too. Also, depending on what the stuck thread is doing, it may use excessive resources (cpu, memory, disk, etc.). I suggest to investigate why the thread is taking so long and if it's possible to

Can a process terminate after I/O without returning to the CPU?

I have a question about the following diagram from Operating Systems Concepts: http://unboltingbinary.in/wp-content/uploads/2015/04/image028.jpg
This diagram seems to imply that after every I/O operation, the process is placed back on the ready queue before being sent to the CPU again. However, is it possible for a process to terminate after I/O but before being sent to the ready queue?
Suppose we have a program that computes a number and then writes it to storage. In this case, does the process really need to return to the CPU after the I/O operation? It seems to me that the process should be allowed to terminate right after I/O. That way, there would be no need for a context switch.
Once one process has successfully executed a termination request on another, the threads of the terminated process should never run again, no matter what state they were in - blocked on I/O, blocked on inter-thread comms, running on a core, sleeping, whatever - they all must be stopped immediately if running and all be put in a state where they will never run again.
Anything else would be a security issue - terminated threads should not be given execution at all, (else it may not be possible to terminate the process).
Process termination requires the cpu. Changes to kernel mode structures on process exit, returning memory resources, etc. all require the cpu.
A process simply just does not evaporate. The term you want here is process rundown - I think.

In Hazelcast, is it possible to use clustered locks that do _not_ care about the local thread that performs the lock/unlock operations?

Hazelcast locks (such as http://www.hazelcast.com/docs/1.9.4/manual/multi_html/ch02s07.html) as I understand it behave the same way as the Java concurrency primitives but across the cluster. The makes it possible to use to synchronize between thread in the local process as well as over the cluster.
However, is there any way I can opt out of this behaviour? In my current project, I need a way of coordinating unique ownership of a resource across the cluster but want to aquire and release this ownership from multiple points in my application - can I do this in some way that does not involve dedicating a thread to managing this lock in my process?
The Semaphore is your friend because it doesn't have a concept of ownership. It uses permits that can be acquired; thread x can acquire permit 1, but thread y can release permit 1. If you initialize the semaphore with a single permit, you will get your mutual exclusion.
ISemaphore s = hazelcastInstance.getSemaphore("foo");
s.init(1);
s.acquire();
And in another thread you can release this permit by:
ISemaphore s = hazelcastInstance.getSemaphore("foo");
s.release();

Apache Tomcat Threads in WAITING state while thread pool increases

I am trying to analyse thread dumps I have taken from my tomcat server. One of the thread dumps was taken after a couple of minutes of uptime and shows a thread pool of about 70, with several in WAITING state. I left a script hitting the server overnight and when I took another thread dump in the morning. When comparing the two dumps I can see that the threadpool has increased to from 70 threads to 90 threads. I can also see that the same threads are in a WAITING state between one dump and the other, while 20 new threads are added. Would this suggest that there is some bug in my application or is this standard behavior? I am wondering why the threads that are in waiting are not being re-used and instead new threads being created. I am assuming that the threads have not been re-used at all from one dump to another because in the dump file it reports them as "waiting on " where the number in <> is the same from one dump to another, is this assumption correct?
For example, from my initial thread dump I see this:
"http-8000-40" - Thread t#74
java.lang.Thread.State: WAITING
at java.lang.Object.wait(Native Method)
- waiting on <4fd24389> (a org.apache.tomcat.util.net.JIoEndpoint$Worker)
at java.lang.Object.wait(Object.java:485)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.await(JIoEndpoint.java:458)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:484)
at java.lang.Thread.run(Thread.java:662)
Locked ownable synchronizers:
- None
and then I can see the same thread in the dump of the following morning in the same state and waiting on the same object: (I am assuming this from the numbers in "<>")
"http-8000-40" - Thread t#74
java.lang.Thread.State: WAITING
at java.lang.Object.wait(Native Method)
- waiting on <4fd24389> (a org.apache.tomcat.util.net.JIoEndpoint$Worker)
at java.lang.Object.wait(Object.java:485)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.await(JIoEndpoint.java:458)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:484)
at java.lang.Thread.run(Thread.java:662)
Locked ownable synchronizers:
- None
Tomcat needs to spend some time managing threads and other resources even after your webapp's code completes processing a request. In order to keep up with the load, Tomcat will allocate new threads if enough aren't available.
If you have 70 total threads and 70 simultaneous requests, all should be well. If one request (of 70) completes (that is, the client has received all the data) and another is made before Tomcat is fully-done with the request-processor thread, another thread will be allocated to handle the new request resulting on a thread pool of size=71.
This can happen many times because it's not deterministic due to context switches, GC pauses, etc. that can interfere with exact timing of everything happening on the server.