I'm trying to search for a particular set of keywords (keyword1, keyword2, or keyword3) in a particular field. I'm doing it with this query:
http://localhost:8983/solr/gettingstarted_shard2_replica2/browse?q=keyword1 keyword2 keyword3&qf=field1
However, when I run this, it finds keyword2 in another field, field2, and returns that row as well! As far as I understand, the qf=field1 parameter limits the search for all the keywords to just field1, right?
Where am I incorrect? Is it because of the schema that I have defined?
My schema config is:
<field name="field1" type="text_general" indexed="true"/>
<field name="field2" type="strings" indexed="false"/>
Disclaimer: I'm the author of Solr Query Debugger Google Chrome plugin.
I suggest using this debugger to see what is actually executed and to understand why your query behaves so strangely.
Just execute the Solr query in your browser and then start the Solr Query Debugger plugin.
On the plugin page you'll see Debug and Echo tabs that explain what Solr executed. In the Explain tab you'll see the score explanations structured as a tree.
Are you using the standard (default) query parser or the eDisMax one? If standard (most likely), then you need to use the df parameter.
The qf parameter is used with eDisMax, but then you also need to set defType=edismax.
Enabling the debug flag will show you which fields the search is actually issued against.
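To make the difference concrete, here is a minimal Python sketch that builds both variants of the request. The helper name build_solr_url is hypothetical, and the base URL is simply the one from the question; q, df, qf, defType, and debugQuery are the Solr request parameters involved.

```python
from urllib.parse import urlencode

# Base URL taken from the question; adjust the core and handler to your setup.
BASE_URL = "http://localhost:8983/solr/gettingstarted_shard2_replica2/browse"

def build_solr_url(keywords, field, use_edismax, base_url=BASE_URL):
    """Build a request URL restricted to one field (hypothetical helper)."""
    params = {"q": " ".join(keywords)}
    if use_edismax:
        # qf is only honoured by the (e)DisMax parsers, so select one explicitly.
        params["defType"] = "edismax"
        params["qf"] = field
    else:
        # The standard parser uses df as the default field for unqualified terms.
        params["df"] = field
    # Ask Solr to include the parsed query in the response.
    params["debugQuery"] = "true"
    return base_url + "?" + urlencode(params)

keywords = ["keyword1", "keyword2", "keyword3"]
standard_url = build_solr_url(keywords, "field1", use_edismax=False)
edismax_url = build_solr_url(keywords, "field1", use_edismax=True)
print(standard_url)
print(edismax_url)
```

With debugQuery enabled, the parsed query in the debug section of the response should show whether the terms were expanded against field1 only or against some other default field.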
In my Solr, I get this result after running analysis for indexing. I have a number of documents containing the phrase Machine Learning, but it seems like something broke and the indexing chain didn't complete. Can I find a workaround for this?
The field definition for the value being searched is: <field name="Skills" type="text_general" indexed="true" stored="true"/>
EDIT 1:
Analysis with Query:
I'm guessing that the "SF" is a Stemming filter - the filter will remove common endings to allow 'machine' to match 'machines', storing 'machin' as the common term in the index. As long as stemming is performed both when indexing and when querying, you should get the result you're looking for.
The EdgeNGramFilter stores a token for each extra letter in the token, so you get a token (that will match a query token) for each additional letter (where your filter seems to be configured for 3 as the minimum ngram size).
If you're not performing stemming when searching as well, the query machine will not find any terms matching, since the token after indexing has been stored as machin.
Use both the "query" and "index" sections on the analysis page to see how each part is parsed and processed, and to see why they don't end up with the same terms on both sides (the end tokens on both sides are compared, and if they're the same, there's a match - this is shown with a slightly darker background in the interface IIRC).
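The mismatch described above can be sketched in a few lines of Python. This is a toy model, not Solr's analysis code: toy_stem is a made-up suffix stripper standing in for the real stemming filter, and the chain order (lowercase, stem, edge n-grams with a minimum size of 3) is an assumption based on the description, since the screenshots are not available.

```python
def toy_stem(token):
    """Made-up stemmer: strips a few common endings (not Solr's algorithm)."""
    for suffix in ("es", "e", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def edge_ngrams(token, min_size=3):
    """Prefixes of the token, from min_size letters up to the full token."""
    return [token[:n] for n in range(min_size, len(token) + 1)]

def index_chain(text):
    """Index side: lowercase, stem, then store every edge n-gram."""
    terms = set()
    for token in text.lower().split():
        terms.update(edge_ngrams(toy_stem(token)))
    return terms

def query_chain(text, stem):
    """Query side: lowercase, and stem only if requested."""
    return [toy_stem(t) if stem else t for t in text.lower().split()]

indexed = index_chain("Machine Learning")
print(sorted(indexed))

# Without query-time stemming, "machine" is looked up but only "machin" was stored.
print(all(t in indexed for t in query_chain("machine", stem=False)))
# With stemming on both sides, the query term becomes "machin" and matches.
print(all(t in indexed for t in query_chain("machine", stem=True)))
```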
I am not sure what your first image stands for, but your two images show different token filter orders.
As a side note on the stem filter: the kstem token filter is a high-performance filter for English. All terms must already be lowercased (use the lowercase filter) for this filter to work correctly.
Your first image shows LCF (LowercaseFilter) as the first token filter. But your second image shows the stem filter running first, followed by the LCF (LowercaseFilter), and that is not going to work.
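Why the order matters can be illustrated with a small Python sketch. The stemmer here is a toy that only recognises lowercase suffixes, which mimics the stated requirement that terms be lowercased first; it is not the KStem algorithm, and run_chain is a hypothetical helper.

```python
def toy_stem(token):
    """Toy stemmer that only recognises lowercase suffixes (not KStem)."""
    for suffix in ("es", "e", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def run_chain(token, filters):
    """Apply each filter to the token in the given order."""
    for token_filter in filters:
        token = token_filter(token)
    return token

# Lowercase first: the stemmer sees "machines" and can strip the ending.
lowercase_then_stem = run_chain("MACHINES", [str.lower, toy_stem])
# Stem first: the stemmer sees "MACHINES", recognises nothing, and the
# later lowercase step leaves the unstemmed "machines".
stem_then_lowercase = run_chain("MACHINES", [toy_stem, str.lower])

print(lowercase_then_stem, stem_then_lowercase)
```

If the index and query chains end up with different orders, the two sides can produce different terms for the same input, which is the kind of mismatch the analysis page makes visible.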
We have a scenario where we are trying to perform accurate name matching of Items using SOLR.
Query Parameter: Apple
SOLR Indexed Word: Apple-D
In our business case, "Apple" and "Apple-D" are totally different items and therefore SOLR shouldn't return the match.
Is there an option to achieve the same?
You need to change the fieldType used for the field. Use the String fieldType for your field.
The String fieldType makes sure that values are stored by Solr exactly as they are.
It won't apply any analysis to the value, and it won't create any tokens from it.
With the String type applied, Apple and Apple-D are indexed as different tokens, since no tokenizing takes place. This will help you achieve the exact match.
Once you change the fieldType, re-index your data.
You can use the Solr analysis tool to check how the field is indexed and queried.
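The difference between the two field types can be mimicked in plain Python. This is only a rough stand-in: tokenize_like_text_field approximates a tokenized text field by splitting on non-alphanumeric characters and lowercasing, which is an assumption about the asker's current field type, and the function names are hypothetical.

```python
import re

def tokenize_like_text_field(value):
    """Rough stand-in for a tokenized text field: split and lowercase."""
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", value) if t]

def text_field_matches(query, indexed_value):
    """Match if every query token appears among the indexed tokens."""
    indexed_terms = set(tokenize_like_text_field(indexed_value))
    return all(t in indexed_terms for t in tokenize_like_text_field(query))

def string_field_matches(query, indexed_value):
    """String-style field: the whole value is one term, compared verbatim."""
    return query == indexed_value

print(tokenize_like_text_field("Apple-D"))          # tokens of the indexed value
print(text_field_matches("Apple", "Apple-D"))       # tokenized field
print(string_field_matches("Apple", "Apple-D"))     # string field
print(string_field_matches("Apple-D", "Apple-D"))   # string field, exact value
```

In the tokenized case the query Apple matches because Apple-D was split into separate tokens; in the string case only the complete value matches.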
Note: Whenever you ask a question like this, make sure to share your schema.xml.
I am using Solr 4.1.0 and I'm facing a strange issue. If I search a field for a value, whether exact or with a wildcard, I get 0 search results. On the other hand, if I give just the field name and a * in place of the value, I get all the results.
Also, if I search in the text field, i.e. the field into which I have copied the values of all my fields, I get the correct output. text is, by default, my catch-all for all fields. feature is a field which has the value Butter.
So what is happening is that if I search the actual field with the exact value, or even with the starting letter and a *, I get no results, while if I search the text field, which is a catch-all field, I'm able to retrieve the value. Yet if I search the feature field using *, I get the complete result list correctly.
You can view the logs for text:Butter here, logs for feature:Butter here, logs for feature:B* here and logs for feature:* here
I'm facing this issue with this particular field only. Any pointers to what could be the reason behind this strange problem?
If you search without the field name, Solr is going to search in the default search field.
So make sure the field you want to search on is set as the default search field.
If you are using the dismax query handler, you can add the fields to the qf parameter.
Also, for wildcard queries, check the Analyzers documentation.
On wildcard and fuzzy searches, no text analysis is performed on the search word.
Because no analysis is done at query time for wildcard searches, lowercasing and stemming are applied only at index time, not at query time.
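A small Python sketch of that effect, under the assumption that the field lowercases values at index time (the question's schema is not shown, so this is a guess). The function names are hypothetical and fnmatchcase merely stands in for wildcard matching against the indexed terms.

```python
from fnmatch import fnmatchcase

def analyze_for_index(value):
    """Assumed index-time analysis: lowercasing only."""
    return value.lower()

def wildcard_search(pattern, terms):
    """The pattern is used as typed, with no analysis applied to it."""
    return [t for t in terms if fnmatchcase(t, pattern)]

# "Butter" is stored as "butter" once the index-time analysis has run.
indexed_terms = [analyze_for_index("Butter")]

print(wildcard_search("B*", indexed_terms))  # capital B never occurs in the index
print(wildcard_search("b*", indexed_terms))  # lowercase prefix matches
print(wildcard_search("*", indexed_terms))   # matches every term
```

Under this assumption, lowercasing the wildcard term on the client side before sending the query would be one way to test whether analysis is the cause.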
I have an Objective-C implementation of XMPP where I am trying to search for users. I use a predictable JID naming system where users' JIDs follow the pattern 'fbFACEBOOK_ID'.
I initially tried to query directly for matching JIDs but found that XMPP doesn't seem to support that, so instead I had users set their JID in their email field.
The following XML IQ works correctly when only one query is included, but fails to get any results when there is more than one. Is this not the correct syntax for searching for more than one term at once?
<iq type="set" from="hag66@shakespeare.lit/pda" to="search.shakespeare.lit" id="search2" xml:lang="en">
<query xmlns="jabber:iq:search"><email>*fb000000001*</email></query>
<query xmlns="jabber:iq:search"><email>*fb000000002*</email></query>
<query xmlns="jabber:iq:search"><email>*fb000000003*</email></query>
<query xmlns="jabber:iq:search"><email>*fb000000004*</email></query>
<query xmlns="jabber:iq:search"><email>*fb000000005*</email></query>
</iq>
See also: XMPP Query Group Chat (MUC) directory using search term
EDIT: I have tried using one query and multiple email elements instead with no luck
EDIT2: So, it doesn't seem like this is possible?
<iq> elements MUST have one and only one child element, so that won't work. XEP-0055: Jabber Search doesn't define any way to search for multiple terms specifically, so it seems that you are out of luck.
Instead of sending several query elements, try putting everything in a single query tag. It worked for me.
<query xmlns="jabber:iq:search">
<email>abc@gmail.com</email>
<email>bbc@gmail.com</email>
</query>
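For reference, a stanza with exactly one query child can be assembled and checked with Python's standard library. This is only a structural sketch: build_search_iq is a hypothetical helper, the addresses are the ones from the question, and whether a given server actually honours several email elements is not something XEP-0055 defines.

```python
import xml.etree.ElementTree as ET

SEARCH_NS = "jabber:iq:search"

def build_search_iq(emails, to="search.shakespeare.lit", stanza_id="search2"):
    """Build an <iq type="set"> with a single <query> child (hypothetical helper)."""
    iq = ET.Element("iq", {"type": "set", "to": to, "id": stanza_id})
    # Writing xmlns as a plain attribute keeps the output free of ns0: prefixes.
    query = ET.SubElement(iq, "query", {"xmlns": SEARCH_NS})
    for email in emails:
        ET.SubElement(query, "email").text = email
    return iq

iq = build_search_iq(["*fb000000001*", "*fb000000002*"])
stanza = ET.tostring(iq, encoding="unicode")
print(stanza)

# Structural check: the <iq> has one child, which holds both <email> elements.
print(len(iq), len(iq[0]))
```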
If I have a multiValued field of type text and I put the values [cat,dog,green,blue] in it, is there a way to tell, when I execute a query against that field for dog, that it was in the 1st element position of that multiValued field?
Assumption: client does not have any pre-knowledge of what the field type of the field being queried is. (i.e. Solr must provide the answer and the client can't post process the return doc to figure it out because it would not know how SOLR matched the query to the result).
Disclosure: I posted to solr-user list and am getting no traction so I post here now.
Currently, there's no out-of-the-box functionality provided in Solr which tells you the position of a value in a multiValue field.
Hopefully I understand your question correctly.
If you want to get field index or value there is an ugly workaround:
You could add the index directly in the value e.g. store "1; car", "2; test" and so on. Then use highlighting. When reading the returned fields simply skip the text before the semicolon.
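A minimal sketch of that workaround on the client side, using the values from the question. The function names are hypothetical, and the "N; value" layout simply follows the format suggested above.

```python
def encode_with_position(values):
    """Prefix each value with its 1-based position before indexing."""
    return [f"{i}; {v}" for i, v in enumerate(values, start=1)]

def decode_with_position(stored):
    """Split a stored or highlighted value back into (position, value)."""
    index, _, value = stored.partition(";")
    return int(index), value.strip()

stored_values = encode_with_position(["cat", "dog", "green", "blue"])
print(stored_values)

# Suppose highlighting reports that "2; dog" was the matching value.
position, value = decode_with_position("2; dog")
print(position, value)
```

Note that any highlighting markup added around the match would have to be stripped before decoding; that step is left out here.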
But if you want to query only one type:
You can avoid the multivalued approach and simply store each value as item_i and query via item_1. To query against all items regardless of the type, you need to use the copyField directive in schema.xml.
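On the client side, this amounts to turning the list into one field per position before sending the document. The sketch below assumes the schema accepts fields named item_1, item_2, and so on (for example through a dynamic field, which is not shown in the answer); to_positional_fields is a hypothetical helper.

```python
def to_positional_fields(values, prefix="item_"):
    """Map each value to its own field, named after its 1-based position."""
    return {f"{prefix}{i}": v for i, v in enumerate(values, start=1)}

doc = to_positional_fields(["cat", "dog", "green", "blue"])
print(doc)

# A query such as item_2:dog then only matches documents where "dog"
# sits in the second position.
matching_fields = [name for name, v in doc.items() if v == "dog"]
print(matching_fields)
```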
The Lucene API allows for this, but I'm not sure if Solr does out of the box. In Lucene you can use the IndexReader.termPositions(Term term) method.