Infinispan 8: blocking state during iteration through cache

I am using Infinispan 8.2.11. While iterating over the cache with cache.entrySet().iterator(), the thread gets stuck and does not move. Here is the thread dump I collected:
"EJB default - 32" #586 prio=5 os_prio=0 tid=0x000055ce2f619000 nid=0x2853 runnable [0x00007f8780c7a000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000006efb93ba8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2163)
at org.infinispan.stream.impl.DistributedCacheStream$IteratorSupplier.get(DistributedCacheStream.java:754)
at org.infinispan.util.CloseableSuppliedIterator.getNext(CloseableSuppliedIterator.java:26)
at org.infinispan.util.CloseableSuppliedIterator.hasNext(CloseableSuppliedIterator.java:32)
at org.infinispan.stream.impl.RemovableIterator.getNextFromIterator(RemovableIterator.java:34)
at org.infinispan.stream.impl.RemovableIterator.hasNext(RemovableIterator.java:43)
at org.infinispan.commons.util.Closeables$IteratorAsCloseableIterator.hasNext(Closeables.java:93)
at org.infinispan.stream.impl.RemovableIterator.getNextFromIterator(RemovableIterator.java:34)
at org.infinispan.stream.impl.RemovableIterator.hasNext(RemovableIterator.java:43)
at org.infinispan.commons.util.IteratorMapper.hasNext(IteratorMapper.java:26)
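For reference, my iteration looks roughly like this (a minimal sketch; the cache name, value type, and class name are illustrative, not the real ones):

import java.util.Iterator;
import java.util.Map;
import org.infinispan.Cache;
import org.infinispan.manager.EmbeddedCacheManager;

public class CacheIterationSketch {
    // Iterate over all entries of a distributed cache, as described above.
    static void iterate(EmbeddedCacheManager cacheManager) {
        Cache<String, Object> cache = cacheManager.getCache("myCache");
        Iterator<Map.Entry<String, Object>> it = cache.entrySet().iterator();
        while (it.hasNext()) { // the thread in the dump above is parked inside hasNext()
            Map.Entry<String, Object> entry = it.next();
            System.out.println(entry.getKey()); // process the entry ...
        }
    }
}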
I found an article in the JBoss Community Archive which describes a similar issue: https://developer.jboss.org/thread/271158. There was a fix delivered in Infinispan 9 which I believe resolves this problem: ISPN-9080
Is it possible to backport this fix to Infinispan 8? Unfortunately, I can't upgrade the Infinispan version in my project.

Unfortunately, we do not maintain such older versions. The suggested approach is to update to a more recent version. If that is not possible, you could try patching the older version yourself, since the changes are available: https://github.com/infinispan/infinispan/pull/5924/files
Also note that this does not fix the actual issue; it only fixes a symptom of it. The actual issue is that the newest topology was not installed for some reason, but the original poster was not able to provide sufficient information to ascertain why.

In any case, I would recommend thinking about updating your Infinispan version.
Version 11 is the current stable release, so by staying on 8 you miss many fixes and risk running into problems that have already been resolved.
The problem in your case is that something happened in your cluster and cluster topology updates were missed.
If you have a stable cluster, this problem will not happen.
So if you can find the cause of the 'unstable' cluster (it could be intentional stopping/starting of nodes) and that cause is acceptable, you can guard against it.

Related

Q) Crash in OpenJ9 libcrypto due to memory violation in CRYPTO_memcmp

The Context of my question
My server software is based on
openjdk version "1.8.0_242"
Nearly every 2-3 weeks my server process crashes.
In the Java dump file I see that when doing
HttpsURLConnection conn = (HttpsURLConnection)myurl.openConnection();
there is a memory violation in the libcrypto-1_1 DLL at
4XENATIVESTACK CRYPTO_memcmp+0xe8ef8 (0x00007FFA122A5C18 [libcrypto-1_1-x64+0x185c18])
So to me it looks like CRYPTO_memcmp triggers
1XHEXCPCODE Windows_ExceptionCode: C0000005
My Question
Did anybody observe a similar crash with OpenJ9, or have an idea about the root cause?
Many thanks in advance
Reinhold
I have been thinking about my implementation, which is a server, and of course I am using multithreading.
So I need to double-check whether libcrypto is thread-safe.
If not, that could explain the crash.
But the real question is: is the method myurl.openConnection() really thread-safe?
As far as I know it should be thread-safe.
I will update with my findings as soon as I find something.
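To make the pattern concrete, this is roughly what I mean (a minimal sketch; the endpoint URL, pool size, and class name are illustrative, not my real code): every worker thread opens its own HttpsURLConnection, so no connection instance is shared across threads.

import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import javax.net.ssl.HttpsURLConnection;

public class ConnectionPerThreadSketch {
    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(8); // illustrative size
        for (int i = 0; i < 8; i++) {
            pool.submit(() -> {
                try {
                    // Each task opens its own connection; nothing is shared between threads.
                    URL myurl = new URL("https://example.com/api"); // hypothetical endpoint
                    HttpsURLConnection conn = (HttpsURLConnection) myurl.openConnection();
                    try (InputStream in = conn.getInputStream()) {
                        while (in.read() != -1) {
                            // consume the response ...
                        }
                    }
                } catch (IOException e) {
                    e.printStackTrace(); // handle/log
                }
            });
        }
        pool.shutdown();
    }
}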
I think you've encountered OpenJ9 issue 8373, which is fixed in the 0.23 release (jdk8u275-b01).

Can a change to package-lock.json ever affect the deployment?

I'm reading the NPM docs about package-lock.json and my interpretation is that a committed change to it can never cause issues in the deployed version.
During the roll-out we run npm install, which creates (or overwrites) the lock file anyway. In my mind, the lock file is more of a receipt of the state of the world at install time rather than a pointer to how the installation should be performed.
However, I haven't been successful convincing my team that it is so. They feel uneasy relying on the statement above (not contradicting it nor arguing against it, just not entirely convinced to the degree that they would bet a testicle on it).
Is it at all possible that package-lock.json might affect the actual installation?
Since I'm new with the company, my track record of 10+ years has limited impact. And I'm myself humbly considering that even though the lock file never caused me any issues before, my experience might be irrelevant if the local environment is configured in a way I'm not familiar with yet. So I'm too cautious to bet my reputation as we're about to make a very important release.
In my mind, the lock file is more of a receipt of the state of the world at install time rather than a pointer to how the installation should be performed.
Maybe I am interpreting your statement incorrectly, but package-lock.json is, in a way, a pointer for future installations. See the general documentation on lock files (a different link from the one you shared); the following statement from that doc might be helpful:
This file describes an exact, and more importantly reproducible node_modules tree. Once it's present, any future installation will base its work off this file, instead of recalculating dependency versions off package.json.
For example, if package.json requests a range such as ^1.2.0 and the committed lock file pins 1.2.3, a fresh npm install (npm 5 and later) will install 1.2.3 instead of re-resolving to a newer 1.x, so a change to the lock file can indeed change what gets deployed.
A read of the following discussion on this topic might be helpful to you too. Thanks!

What happens if an MPI process crashes?

I am evaluating different multiprocessing libraries for a fault tolerant application. I basically need any process to be allowed to crash without stopping the whole application.
I can do it using the fork() system call. The limit here is that the process can be created on the same machine, only.
Can I do the same with MPI? If a process created with MPI crashes, can the parent process keep running and eventually create a new process?
Is there any alternative (possibly multiplatform and open source) library to get the same result?
As reported here, MPI 4.0 will have support for fault tolerance.
If you want collectives, you're going to have to wait for MPI-3.something (as High Performance Mark and Hristo Iliev suggest).
If you can live with point-to-point, and you are a patient person willing to raise a bunch of bug reports against your MPI implementation, you can try the following:
disable the default MPI error handler (i.e. replace the default MPI_ERRORS_ARE_FATAL with MPI_ERRORS_RETURN on your communicators)
carefully check every single return code from your MPI calls
keep track in your application of which ranks are up and which are down. Oh, and when they go down they can never come back. But you're unable to use collectives anyway (see my opening statement), so that's not a huge deal, right?
Here's an old paper (back when Bill still worked at Argonne; I think it's from 2003):
http://www.mcs.anl.gov/~lusk/papers/fault-tolerance.pdf. It lays out the kinds of fault-tolerant things one can do in MPI. Perhaps such a "constrained MPI" might still work for your needs.
If you're willing to go for something research quality, there are two implementations of a potential fault tolerance chapter for a future version of MPI (MPI-4?). The proposal is called User Level Failure Mitigation. There's an experimental version in MPICH 3.2a2 and a branch of Open MPI that also provides the interfaces. Both are far from production quality, but you're welcome to try them out. Just know that since this isn't in the MPI Standard, the function prefixes are not MPI_*. For MPICH, they're MPIX_*; for the Open MPI branch, they're OMPI_* (though I believe they'll be changing theirs to MPIX_* soon as well).
As Rob Latham mentioned, there will be lots of work you'll need to do within your app to handle failures, though you don't necessarily have to check all of your return codes. You can/should use MPI error handlers as callbacks to simplify things. There is information and there are examples in the spec, available along with the Open MPI branch.

Intellij IDEA 12 deadlock and lost changes

While working in the IDE, it suddenly stops responding: the whole IDE becomes inactive and cannot be operated, while showing high CPU usage. If you kill the process from the Windows Task Manager and relaunch the IDE, everything modified since your last save is lost. This problem occurs every now and then.
My environments:
Windows 7, Intel i7, 16GB RAM, IDEA 12.1.6 with auto save enabled.
Has anyone come across this problem before? It's too bad, as my changes are lost and I have to redo them after restarting.
You'll want to upgrade to v12.1.6 as 12.1.5 had a major bug in it that was fixed in 12.1.6. The bug prevented compiling of code in some circumstances. 12.1.6 was released only a few days after 12.1.5. That may not be the cause of your issue, but is still good advice.
Other than that, the 12.1.x line has been very stable. I think your issue is an isolated case, as I have not seen any mention of it in the IntelliJ IDEA forums or here. Oftentimes such deadlocks are caused by third-party plug-ins. Take a look in the logs (Help > Show Log) to see if there is any information that explains the hang. Also, if IDEA becomes unresponsive, it automatically writes thread dumps in the log directory. Those may have some information.
If you experience the issue again, you may want to disable any third-party plug-ins to see if that resolves it. If it happens frequently, you can take a CPU snapshot as described in this document and submit it to JetBrains.
Lastly, I recommend you tweak the following setting: File > Settings > [IDE Settings] > General > "Save files automatically if application is idle for x sec." Set it to 15 or 30 seconds (you don't want to go too low). This will help reduce any loss of work in the event of a hang (which, after 10 years of daily IDEA use, I can attest is very rare).

How to make sure Solr/Lucene won't die with java.lang.OutOfMemoryError?

I'm really puzzled why it keeps dying with java.lang.OutOfMemoryError during indexing even though it has a few GBs of memory.
Is there a fundamental reason why it needs manual tweaking of config files / jvm parameters instead of it just figuring out how much memory is available and limiting itself to that? No other programs except Solr ever have this kind of problem.
Yes, I can keep tweaking JVM heap size every time such crashes happen, but this is all so backwards.
Here's the stack trace of the latest such crash, in case it is relevant:
SEVERE: java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOfRange(Arrays.java:3209)
at java.lang.String.<init>(String.java:216)
at org.apache.lucene.index.TermBuffer.toTerm(TermBuffer.java:122)
at org.apache.lucene.index.SegmentTermEnum.term(SegmentTermEnum.java:169)
at org.apache.lucene.search.FieldCacheImpl$StringIndexCache.createValue(FieldCacheImpl.java:701)
at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:208)
at org.apache.lucene.search.FieldCacheImpl.getStringIndex(FieldCacheImpl.java:676)
at org.apache.lucene.search.FieldComparator$StringOrdValComparator.setNextReader(FieldComparator.java:667)
at org.apache.lucene.search.TopFieldCollector$OneComparatorNonScoringCollector.setNextReader(TopFieldCollector.java:94)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:245)
at org.apache.lucene.search.Searcher.search(Searcher.java:171)
at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:988)
at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:884)
at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:341)
at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:182)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
at java.lang.Thread.run(Thread.java:619)
Looking at the stack trace, it looks like you are performing a search, and sorting by a field. If you need to sort by a field, internally Lucene needs to load up all the values of all the terms in the field into memory. If the field contains a lot of data, then it is very possible that you may run out of memory.
I'm not certain there is a steadfast way to ensure you won't run into OutOfMemoryErrors with Lucene. The problem you are facing is related to the use of the FieldCache. From the Lucene API: "Maintains caches of term values." If your terms exceed the amount of memory allocated to the JVM, you'll get the error.
The documents are being sorted at org.apache.lucene.search.FieldComparator$StringOrdValComparator.setNextReader(FieldComparator.java:667), which takes up as much memory as is needed to store the sorted field's terms for the index.
You'll need to review the projected size of the sortable fields and adjust the JVM settings accordingly; the sketch below shows the kind of sorted query that populates this cache.
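To make the mechanism concrete, here is a minimal sketch against a Lucene 2.9/3.x-style API (the index path, field name, and class name are assumptions) of the kind of sorted search that populates the FieldCache:

import java.io.File;
import java.io.IOException;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.MatchAllDocsQuery;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;

public class SortedSearchSketch {
    public static void main(String[] args) throws IOException {
        // Open a searcher over an existing index; the path is illustrative.
        IndexSearcher searcher = new IndexSearcher(FSDirectory.open(new File("/path/to/index")));
        // Sorting by a string field forces Lucene to load every term of that
        // field into the FieldCache, which is what exhausts the heap above.
        Sort sort = new Sort(new SortField("title", SortField.STRING));
        TopDocs hits = searcher.search(new MatchAllDocsQuery(), 10, sort);
        System.out.println("total hits: " + hits.totalHits);
        searcher.close();
    }
}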
A wild guess: the documents you are indexing are very large.
Lucene by default only indexes the first 10,000 terms of a document to avoid OutOfMemory errors; you can overcome this limit, see setMaxFieldLength (sketched below).
Also, you could call optimize() and close() on the IndexWriter as soon as you are done with processing.
A definite way is to profile and find the bottleneck =]
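A minimal sketch of the calls mentioned above, against the pre-4.0 IndexWriter API (the index path, analyzer choice, field contents, and class name are illustrative assumptions):

import java.io.File;
import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class IndexingSketch {
    public static void main(String[] args) throws IOException {
        // MaxFieldLength.UNLIMITED lifts the default 10,000-terms-per-document limit.
        IndexWriter writer = new IndexWriter(
                FSDirectory.open(new File("/path/to/index")),
                new StandardAnalyzer(Version.LUCENE_30),
                IndexWriter.MaxFieldLength.UNLIMITED);

        Document doc = new Document();
        doc.add(new Field("body", "some large text ...", Field.Store.NO, Field.Index.ANALYZED));
        writer.addDocument(doc);

        writer.optimize(); // merge segments once indexing is finished
        writer.close();    // release resources as soon as you are done
    }
}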
Are you using post.jar to index data? That jar has a bug in Solr 1.2/1.3, I think (but I don't know the details). Our company has fixed it internally, and it should also be fixed in the latest trunk (Solr 1.4/1.5).
I was using this Java:
$ java -version
java version "1.6.0"
OpenJDK Runtime Environment (build 1.6.0-b09)
OpenJDK 64-Bit Server VM (build 1.6.0-b09, mixed mode)
Which was running out of heap space, but then I upgraded to this Java:
$ java -version
java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)
And now it works fine, on a huge dataset, with lots of term facets.
For me it worked after restarting the Tomcat server.
Navigate to C:\Bitnami\solr-4.7.2-0\apache-solr\scripts
Open serviceinstall.bat (with Notepad++ or another editor)
Either add or update the following properties: ++JvmOptions=-Xms1024M ++JvmOptions=-Xmx1024M
From the command prompt in that directory, run serviceinstall.bat REMOVE
Then run serviceinstall.bat INSTALL
Hope that helps!
An old question, but since I stumbled upon it:
The string FieldCache is a lot more compact from Lucene 4.0 onwards, so a lot more can fit in memory.
The FieldCache is still an in-memory structure, though, so it can't prevent an OOME by itself.
For fields which need sorting or faceting, one should try DocValues to overcome this problem (see the sketch below). DocValues work with numeric and non-analyzed string values, and I presume many sorting/faceting use cases will have one of these value types.
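For illustration, here is a minimal Lucene 4.x-style sketch of indexing a field with doc values for sorting or faceting (the field names, values, and class name are assumptions); in Solr the rough equivalent is marking the field with docValues="true" in schema.xml:

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.NumericDocValuesField;
import org.apache.lucene.document.SortedDocValuesField;
import org.apache.lucene.document.StringField;
import org.apache.lucene.util.BytesRef;

public class DocValuesSketch {
    static Document buildDoc() {
        Document doc = new Document();
        // Indexed value, used for matching and filtering.
        doc.add(new StringField("category", "books", Field.Store.YES));
        // Column-stride doc values, used for sorting/faceting without the FieldCache.
        doc.add(new SortedDocValuesField("category", new BytesRef("books")));
        doc.add(new NumericDocValuesField("price", 1999L));
        return doc;
    }
}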