I am using the solution below for removing stopwords while applying Stanford NLP.
https://github.com/jconwell/coreNlp
This project depends on an old version of Lucene (3.6.2).
I need to migrate this code to Lucene 5.5.2 in order to use its latest features.
While trying to fix the file below,
https://github.com/jconwell/coreNlp/blob/master/src/main/java/intoxicant/analytics/coreNlp/StopwordAnnotator.java
I observed that the following classes are no longer available in Lucene 5.5.2:
import org.apache.lucene.analysis.CharArraySet;
import org.apache.lucene.analysis.StopAnalyzer;
I could not find information on the replacement classes in the Lucene documentation.
If anybody knows the right classes to use from the latest Lucene release, please let me know.
These are the classes to use in Lucene 5.5.2:
import org.apache.lucene.analysis.util.CharArraySet;
import org.apache.lucene.analysis.core.StopAnalyzer;
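For reference, a minimal sketch of how the relocated classes can be used in 5.5.2; the custom stopword is just an example, and note that the 5.x CharArraySet no longer takes a Version argument:

import org.apache.lucene.analysis.core.StopAnalyzer;
import org.apache.lucene.analysis.util.CharArraySet;

public class StopwordSetExample {
    public static CharArraySet buildStopWords() {
        // ENGLISH_STOP_WORDS_SET is unmodifiable, so copy it before adding entries
        CharArraySet stopWords = CharArraySet.copy(StopAnalyzer.ENGLISH_STOP_WORDS_SET);
        stopWords.add("customstopword"); // example custom entry
        return stopWords;
    }
}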
I am not able to find the class below in Hibernate Search 6:
import org.apache.lucene.search.TermRangeQuery;
Is there any similar class available?
This is an Apache Lucene class, not a Hibernate Search class. And this class still exists in the version of Lucene (8) used by Hibernate Search 6, so I don't understand what your problem is exactly.
In any case... you should probably use the Hibernate Search DSL and its range predicate.
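For illustration, a minimal sketch of the range predicate in the Hibernate Search 6 DSL; Book and its pageCount field are hypothetical, and the javax.persistence import assumes Hibernate Search 6.0/6.1:

import java.util.List;
import javax.persistence.EntityManager;
import org.hibernate.search.mapper.orm.Search;
import org.hibernate.search.mapper.orm.session.SearchSession;

public class RangeQueryExample {
    // Book is a hypothetical indexed entity with an integer "pageCount" field
    public static List<Book> booksBetween(EntityManager entityManager, int min, int max) {
        SearchSession session = Search.session(entityManager);
        return session.search(Book.class)
                .where(f -> f.range().field("pageCount").between(min, max))
                .fetchHits(20);
    }
}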
Currently I work with text search using Jena and Lucene. I have a problem with Apache Lucene, specifically with org.apache.jena.query.text. I wrote the imports like this:
import org.apache.jena.query.text.EntityDefinition;
import org.apache.jena.query.text.TextDatasetFactory;
import org.apache.jena.query.text.TextIndexConfig;
Those three imports cannot be resolved. I am using Lucene 8.4.0.
What should I do? I think it is because of the Lucene version, but I'm not sure.
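Those classes are part of Jena's jena-text module, not of Lucene itself, so the org.apache.jena:jena-text artifact (matching your Jena version) has to be on the classpath alongside Lucene. A minimal sketch of how the three classes fit together, assuming jena-text is available:

import org.apache.jena.query.Dataset;
import org.apache.jena.query.DatasetFactory;
import org.apache.jena.query.text.EntityDefinition;
import org.apache.jena.query.text.TextDatasetFactory;
import org.apache.jena.query.text.TextIndexConfig;
import org.apache.jena.vocabulary.RDFS;
import org.apache.lucene.store.ByteBuffersDirectory;

public class TextDatasetExample {
    public static Dataset createTextDataset() {
        Dataset base = DatasetFactory.create();
        // Index rdfs:label; "uri" and "text" are the Lucene field names
        EntityDefinition entDef = new EntityDefinition("uri", "text", RDFS.label);
        TextIndexConfig config = new TextIndexConfig(entDef);
        // In-memory Lucene index; ByteBuffersDirectory exists since Lucene 8.0
        return TextDatasetFactory.createLucene(base, new ByteBuffersDirectory(), config);
    }
}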
We are working as a team to create a Persian search engine.
I am doing the "indexing" part.
I worked with Solr and indexed some English documents to see if it works.
It worked! So it's time for the Persian indexer. I adapted some code for PersianAnalyzer a little bit (extending the stopword set, for instance) and it can index the documents. Now I want to import the externally indexed Persian documents into the core, to check the indexing process and run a search query on them. How can I do this and import these indexed documents into the core?
I am kind of in a hurry, so I will appreciate any help.
thanks,
Mahshid
You have several options:
the quickest option to get content from a file would be to use the Solr DataImportHandler;
another option would be to write a custom crawler/indexer (see the SolrJ sketch after this list), but that would require time;
if you need a web crawler instead, you can use Apache Nutch.
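As a rough illustration of the custom-indexer route, here is a minimal SolrJ sketch that pushes one Persian document into a core; the core name persian_core and the field names are hypothetical and must match your schema (assumes SolrJ 6+):

import java.io.IOException;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class PersianIndexerExample {
    public static void main(String[] args) throws SolrServerException, IOException {
        try (HttpSolrClient solr = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/persian_core").build()) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "doc-1");
            doc.addField("content_fa", "متن فارسی برای نمایه‌سازی"); // Persian sample text
            solr.add(doc);
            solr.commit(); // make the document visible to searches
        }
    }
}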
Does anybody know why there is no QueryParser, nor IndexWriter.MaxFieldLength(25000), among other things, in the Lucene 4.0 snapshot?
I'm having a hard time porting the code to this newer version, even though I'm following the code as given here: http://search-lucene.com/jd/lucene/overview-summary.html
How do I find the missing packages, and how do I get them? The snapshot jar doesn't contain all the features.
thanks
Lucene has been re-architected, and some classes which used to be in the core module are now in submodules. You will now find the QueryParser stuff in the queryparser submodule. Similarly, lots of useful analyzers, tokenizers and token filters have been moved to the analysis submodule.
Regarding IndexWriter, the maximum field length option has been deprecated; it is now recommended to wrap an analyzer with LimitTokenCountAnalyzer (in the analysis submodule) instead.
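A minimal sketch of the replacement, assuming Lucene 4.0 with the analyzers-common jar on the classpath; the 25000 limit mirrors the old MaxFieldLength value:

import java.io.File;
import java.io.IOException;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.miscellaneous.LimitTokenCountAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class WriterExample {
    public static IndexWriter openWriter(File indexDir) throws IOException {
        Directory dir = FSDirectory.open(indexDir);
        // Replaces IndexWriter.MaxFieldLength(25000): the analyzer caps tokens per field
        Analyzer analyzer = new LimitTokenCountAnalyzer(
                new StandardAnalyzer(Version.LUCENE_40), 25000);
        return new IndexWriter(dir, new IndexWriterConfig(Version.LUCENE_40, analyzer));
    }
}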
I am upgrading Lucene 2.4.1 to 3.0.2 in my Java web project.
In the Lucene APIs I found that Field.Store.COMPRESS is not present in 3.0.2, so
what can I use in place of Field.Store.COMPRESS?
Sometimes the field data is so large that I have to compress it.
Lucene made the decision not to compress fields, as compression was really slow and not Lucene's forte. The Javadocs say:
Please use CompressionTools instead. For string fields that were previously indexed and stored using compression, the new way to achieve this is: First add the field indexed-only (no store) and additionally using the same field name as a binary, stored field with CompressionTools.compressString(java.lang.String).
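A minimal sketch of that pattern against the Lucene 3.0.x API; the field name body is just an example:

import java.util.zip.DataFormatException;
import org.apache.lucene.document.CompressionTools;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

public class CompressedFieldExample {
    // Replaces Field.Store.COMPRESS: index the text without storing it, then
    // store a compressed binary copy under the same field name
    public static Document makeDoc(String text) {
        Document doc = new Document();
        doc.add(new Field("body", text, Field.Store.NO, Field.Index.ANALYZED));
        doc.add(new Field("body", CompressionTools.compressString(text)));
        return doc;
    }

    public static String readBody(Document doc) throws DataFormatException {
        return CompressionTools.decompressString(doc.getBinaryValue("body"));
    }
}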