Lucene: difference between OpenMode.CREATE_OR_APPEND and deleteDocuments

I am pretty new to the Lucene search engine and want to know what OpenMode.CREATE_OR_APPEND and deleteDocuments do. Also, the indexSearcher.search method can apparently accept either a Term or a Query as a parameter to fetch documents. Can you help me understand in which scenarios I need to use a Term and in which a Query?

The OpenMode does not affect the behavior of deleteDocuments. It only affects what happens when you open the IndexWriter:
CREATE - Creates a new index. If one already exists, it will be overwritten.
CREATE_OR_APPEND - Uses an existing index, or creates it if none currently exists.
APPEND - Uses an existing index. If none currently exists, throws an IOException.
I'm not aware of any IndexSearcher.search method that takes a Term as an argument. If you can link to what you are referring to, that might be helpful.
However, if you want to search for a term, you can just use TermQuery.
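The three modes listed above boil down to a small decision table. Here is a toy model of it in plain Java. It is not Lucene code: OpenModeSketch, Mode and Outcome are made-up names for illustration only. In Lucene itself the mode is set on the writer configuration with IndexWriterConfig.setOpenMode before the IndexWriter is opened.

```java
// Toy model of the open modes described above; NOT Lucene code.
// OpenModeSketch, Mode and Outcome are hypothetical names.
final class OpenModeSketch {
    enum Mode { CREATE, CREATE_OR_APPEND, APPEND }

    enum Outcome {
        NEW_INDEX_CREATED,
        EXISTING_INDEX_OVERWRITTEN,
        EXISTING_INDEX_REUSED,
        FAILS_WITH_IO_EXCEPTION
    }

    /** What opening a writer does, given the mode and whether an index already exists. */
    static Outcome onOpen(Mode mode, boolean indexExists) {
        switch (mode) {
            case CREATE:
                return indexExists ? Outcome.EXISTING_INDEX_OVERWRITTEN : Outcome.NEW_INDEX_CREATED;
            case CREATE_OR_APPEND:
                return indexExists ? Outcome.EXISTING_INDEX_REUSED : Outcome.NEW_INDEX_CREATED;
            case APPEND:
                return indexExists ? Outcome.EXISTING_INDEX_REUSED : Outcome.FAILS_WITH_IO_EXCEPTION;
            default:
                throw new IllegalArgumentException("unknown mode: " + mode);
        }
    }

    public static void main(String[] args) {
        // Print the full decision table: every mode with and without an existing index.
        for (Mode mode : Mode.values()) {
            for (boolean exists : new boolean[] {false, true}) {
                System.out.println(mode + ", indexExists=" + exists + " -> " + onOpen(mode, exists));
            }
        }
    }
}
```

Note that nothing in the table mentions deletions: deleteDocuments acts on whatever index the writer ended up with, whichever mode was used to open it.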

Related

How to tell what index is trying to be used for ContentSearchManager.GetIndex(SitecoreIndexableItem)

So, ContentSearchManager.GetIndex(SitecoreIndexableItem) is returning null. I'm pretty sure we might be missing an index. When using the Sitecore master database, everything works fine, but with the web database it returns null.
I guess the question is: is there a way to know which index GetIndex is trying to retrieve when it returns null?
If you haven't overridden the default Sitecore logic for getting the index, Sitecore checks all the indexes which are registered in the configuration and, for each of them, checks whether the SitecoreIndexableItem passed to
ContentSearchManager.GetIndex(SitecoreIndexableItem)
is not excluded from that index.
Then it simply returns the first matching index.
So the answer to your question is: Sitecore checks all indexes to see whether they are a match for your item.
You may want to look through your logs for an error like this:
"There is no appropriate index for {indexable.AbsolutePath} - {indexable.Id}. You have to add an index crawler that will cover this item"
This may help you find which item is not indexed at all.

Solr 5.3 implementation processes docs but doesn't return results

I have recently set up a local instance of Solr 5.3 in an effort to get it going for my company. As an initial test case I've set up a Data Import Handler (DIH) that returns PDFs stored within a file directory. When I execute the full import in the admin tool, the DIH processes all the files within the directory, and I'm able to run a general query (*:*) which returns all indexed fields for every record in the index.
When I switch to a specific query using a word definitely contained within the files, however, Solr returns no results. What connection am I not making here?
I can provide excerpts from the schema, solrconfig, and custom data config if needed, but I don't want to oversaturate this post.
The answer I came up with involved a simple newbie mistake combined with something I wasn't anticipating.
1) First, I didn't have my field set to indexed="true". I set that. Yeesh, it stinks being new to this!
2) I needed to make a change to solrconfig.xml for the core in question. Thanks to this article, I was able to determine that I needed to add a default field in the /select requestHandler. Uncommenting the relevant line in solrconfig and changing the field name did the trick; I no longer need to supply the field name in df to return results.
My carryover question for anyone coming across this question in the future is whether this latter point is the proper way to go about using default fields. I see in schema.xml that the schema-level default field setting is deprecated (or heading that direction) in 5.3.0. So is it alright to define df in solrconfig instead?
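For anyone who wants to test the field before touching solrconfig.xml, df can also be passed as an ordinary request parameter on /select. Below is a minimal sketch in plain Java that only builds such a URL. The base URL uses Solr's default port, while the core name "pdfs" and the field name "content" are assumptions made up for this example, as are the class and method names.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Sketch only: builds a /select URL with an explicit df (default field) parameter.
// SolrSelectUrlSketch and selectUrl are hypothetical names.
final class SolrSelectUrlSketch {
    /** Returns baseUrl/core/select?q=...[&df=...]; df is omitted when defaultField is null. */
    static String selectUrl(String baseUrl, String core, String query, String defaultField) {
        StringBuilder url = new StringBuilder(baseUrl)
                .append('/').append(core)
                .append("/select?q=").append(encode(query));
        if (defaultField != null) {
            url.append("&df=").append(encode(defaultField));
        }
        return url.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed core "pdfs" and field "content"; replace with your own names.
        System.out.println(selectUrl("http://localhost:8983/solr", "pdfs", "invoice", "content"));
        // Without df, Solr falls back to whatever default the request handler defines.
        System.out.println(selectUrl("http://localhost:8983/solr", "pdfs", "invoice", null));
    }
}
```

If the query returns results with df in the URL but not without it, the missing piece is the default field in the request handler, which matches point 2 above.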

Lucene not giving results when specifying field

I have a database which I have indexed in Lucene (using PyLucene) by section (specified by markup in the document) using Lucene's fields. This index seems to work fine. I can search it using the default field, which is simply the entire document, and get reasonable results.
The problem is, when I search it using a specific section (not the default), I expect to get a certain number of results back (as specified by IndexSearcher.search(query, results)), but instead it might simply return nothing. So my question is: how can I get it to return a ranked list with the number of results I specify?
The only place I specify the field is in the QueryParser, by calling:
QueryParser(Version.LUCENE_CURRENT, field, StandardAnalyzer)
I would verify the index using Luke (which is something I do often when modifying my index strategy).

Why are my Lucene Document results empty?

I'm running a simple test: trying to index something and then search for it. I index a simple document, but then when I search for a string in it, I get back what looks to be an empty document (it has no fields). Lucene seems to be doing something, because if I search for a word that's not in the document, it returns 0 results.
Any reason why Lucene would reliably return a document when it finds one that matches the given query, and yet that document has nothing in it?
More details:
I'm actually running Lucandra (Lucene + Cassandra). That certainly may be a relevant detail, but I'm not sure.
The fields are set to Field.Store.YES and Field.Index.ANALYZED.
Interestingly, I'm able to get this to work just fine on my local machine, but when we put it on our main server (which is a multi-node Cassandra setup), I get the behavior described above. So this seems like probably the relevant detail, but unfortunately, I see no error message to clue me in to what specifically is causing it.
I'm unsure whether this will work with Lucandra, but have you tried opening the index using Luke? Viewing the index contents with Luke might help.
It's hard to tell what the problem is since you only provide a very abstract description. However, it sounds a bit like you are not storing the field value in the index. There are different modes for indexing a field. One option determines whether the original value is stored in the index to retrieve it later:
http://lucene.apache.org/java/3_0_0/api/core/org/apache/lucene/document/Field.Store.html
See also the description of the enclosing class Field
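To illustrate why a search can match a document that then comes back with no fields, here is a toy inverted index in plain Java. It is not Lucene or Lucandra code, and StoredFieldSketch and its methods are made-up names. The only point is that indexing a value (so it can be found) and storing it (so it can be returned) are separate decisions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy model of "indexed" versus "stored"; NOT Lucene code. All names are hypothetical.
final class StoredFieldSketch {
    private final Map<String, List<Integer>> postings = new HashMap<>();      // "field:word" -> doc ids
    private final List<Map<String, String>> storedFields = new ArrayList<>(); // doc id -> stored values

    /** Starts a new, empty document and returns its id. */
    int newDocument() {
        storedFields.add(new LinkedHashMap<>());
        return storedFields.size() - 1;
    }

    /** Indexes every whitespace-separated word; keeps the original value only if store is true. */
    void addField(int docId, String name, String value, boolean store) {
        for (String word : value.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (word.isEmpty()) {
                continue;
            }
            List<Integer> docs = postings.computeIfAbsent(name + ":" + word, k -> new ArrayList<>());
            if (!docs.contains(docId)) {
                docs.add(docId);
            }
        }
        if (store) {
            storedFields.get(docId).put(name, value);
        }
    }

    /** Returns the ids of documents whose field contains the word. */
    List<Integer> search(String field, String word) {
        return postings.getOrDefault(field + ":" + word.toLowerCase(Locale.ROOT), new ArrayList<>());
    }

    /** Returns only the values that were stored for the document. */
    Map<String, String> document(int docId) {
        return storedFields.get(docId);
    }

    public static void main(String[] args) {
        StoredFieldSketch index = new StoredFieldSketch();
        int doc = index.newDocument();
        index.addField(doc, "body", "hello lucene", false); // indexed but not stored
        System.out.println("hits: " + index.search("body", "lucene"));
        System.out.println("retrieved fields: " + index.document(doc));
    }
}
```

In this model the search finds the document, a search for an absent word finds nothing, and the retrieved document is empty, which is the same symptom as in the question. Since the question says the fields are set to be stored, it is worth checking that the stored values actually reach the index on the server, for instance with Luke.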
Read: http://anismiles.wordpress.com/2010/05/27/lucandra-an-inside-story/

How do I use native Lucene Query Syntax?

I read that Lucene has an internal query language where one specifies field:value pairs and combines them using boolean operators.
I read all about it on their website and it works just fine in Luke. I can do things like
field1:value1 AND field2:value2
and it will return seemingly correct results.
My problem is: how do I pass this whole Lucene query into the API? I've seen QueryParser, but I have to specify a field. Does this mean I still have to manually parse my input string, fields, values, parentheses, etc., or is there a way to feed the whole thing in and let Lucene do its thing?
I'm using Lucene.NET, but since it's a method-by-method port of the original Java, any advice is appreciated.
Are you asking whether you need to force your user to enter the field? If so, the query parser has a default field. Here's a little more info. As long as you have a default field that will do the job, they don't need to specify fields.
If you're asking how to get a Query object from the String, you need the parse method. It understands about fields, and the default field, etc. mentioned earlier. You just need to make sure that the query parser and the index builder are both using the same analysis.
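The role of the default field can be shown with a deliberately simplified sketch in plain Java. It is not Lucene's parser and ignores parentheses, quoting and the operators themselves. DefaultFieldSketch, resolve and the field name "contents" are made up for illustration. All it does is attach the default field to any term that does not name one, which is the part of the behavior described above.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified illustration of default-field resolution; NOT Lucene's query parser.
// DefaultFieldSketch and resolve are hypothetical names.
final class DefaultFieldSketch {
    /** Returns one "field:value" clause per term, using defaultField where no field is given. */
    static List<String> resolve(String query, String defaultField) {
        List<String> clauses = new ArrayList<>();
        for (String token : query.trim().split("\\s+")) {
            if (token.isEmpty() || token.equals("AND") || token.equals("OR") || token.equals("NOT")) {
                continue; // operators are skipped: this sketch only shows field resolution
            }
            clauses.add(token.contains(":") ? token : defaultField + ":" + token);
        }
        return clauses;
    }

    public static void main(String[] args) {
        // The first term names its field explicitly, the second falls back to the default.
        System.out.println(resolve("field1:value1 AND value2", "contents"));
    }
}
```

With the real query parser you pass the whole string, operators and parentheses included, to the parse method and get a Query back; the field given to the constructor only matters for terms that have no field prefix.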