The field value is 1 or true in solr search results - lucene

I have a field that is indexed as a string in Solr's schema.xml; it comes from a boolean (tinyint) column in a MySQL database.
My queries used to match this field with the value 1. Then, without any change on my side, that query stopped returning the correct results; it worked again once I searched for true instead of 1. Now the situation has reversed: true fails and 1 works.
What exactly is the problem here? Do I need to change the field type in schema.xml to integer?
Thank you in advance.

Since it's a string field, we can't possibly know how you indexed it. It could be "true" / "false" or "1" / "0" or "on" / "off", etc. Or even a mix of these, maybe you have some documents with "true" and some with "1".
If it's semantically a boolean field I recommend using the boolean fieldType, e.g.:
<field name="inStock" type="boolean" indexed="true" stored="true" />
For this to work, you need the boolean fieldType declared (it is declared in the default schema):
<fieldType name="boolean" class="solr.BoolField" sortMissingLast="true" omitNorms="true"/>
Remember to rebuild the index after this change.
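If the source data really does contain a mix of spellings, one option is to normalize the values on the client before re-indexing. A minimal sketch, assuming the raw values arrive as strings or integers; the helper name to_solr_bool and the accepted spellings are illustrative and not part of Solr:

```python
# Hypothetical helper: map the spellings mentioned above onto the two
# literals normally sent to a boolean field ("true" / "false").
TRUE_VALUES = {"1", "true", "on"}
FALSE_VALUES = {"0", "false", "off"}


def to_solr_bool(raw):
    """Return "true" or "false" for a raw column value, or raise ValueError."""
    text = str(raw).strip().lower()
    if text in TRUE_VALUES:
        return "true"
    if text in FALSE_VALUES:
        return "false"
    # Fail loudly instead of silently indexing an unexpected spelling.
    raise ValueError(f"unrecognized boolean value: {raw!r}")
```

Raising on unknown input is a deliberate choice here: it surfaces inconsistent rows at index time rather than at query time.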

Related

Solr 7.1: Querying Double field for any value not possible with * anymore

I recently upgraded from Solr 6.6 to 7.1 and cannot query Double fields for any value anymore using
q: test_d:*
(zero results although the field is set). However,
q: test_d:[* TO *]
works. This seems to affect all numeric field types (tested for Integers, Floats and Doubles). For String, Text, Boolean fields the single asterisk works just fine like before.
Is it possible to reconfigure Solr to restore the old behavior, or do I have to rewrite all queries and introduce a switch for numeric field types? Until now, no distinction by field value type was needed (which is good!).
Minimal Working Example
Use the example-DIH-solr core supplied with the Solr distributable, push the document
{"id":"foo","test_b":true,"test_i":42,"test_f":42.0,"test_d":42.0}
and use
q: test_b:*
q: test_d:*
q: test_i:*
q: test_f:*
Only the query for the Boolean field will yield a result.
The definition of the double field type changed between these versions. To restore the old behaviour, you can change the dynamic field to this:
<dynamicField name="*_d" type="double" indexed="true" stored="true"/>
and add back the double field type definition to the schema:
<fieldType name="double" class="solr.TrieDoubleField" precisionStep="0" positionIncrementGap="0"/>
This worked in the past, but most likely by accident; see https://issues.apache.org/jira/browse/SOLR-11746 for the Solr issue that tracks this.
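If changing the field type back is not an option, the alternative raised in the question is to rewrite queries on the client. A rough sketch, assuming the caller knows which field names are numeric; the function name rewrite_exists_query is hypothetical, and the regular expression does not try to handle quoted phrases or escaped characters:

```python
import re

# Matches "<field>:*" where the asterisk is the entire value,
# i.e. not "field:foo*" and not "field:[* TO *]".
_BARE_WILDCARD = re.compile(r"\b(\w+):\*(?=\s|\)|$)")


def rewrite_exists_query(query, numeric_fields):
    """Replace field:* with field:[* TO *] for the given numeric fields."""

    def repl(match):
        field = match.group(1)
        if field in numeric_fields:
            return f"{field}:[* TO *]"
        # Leave string, text and boolean fields untouched.
        return match.group(0)

    return _BARE_WILDCARD.sub(repl, query)
```

With the field names from the example above, rewrite_exists_query("test_b:* AND test_i:*", {"test_i", "test_f", "test_d"}) only rewrites the test_i clause.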

Apache Solr undefined field score field in function query

I am using Solr 4.10. I need to change the relevance of documents based on a boost field and the document score. I have learned that I should use a function query for this. The boost field is defined in the schema as follows:
<field name="boost" type="float" stored="true" indexed="false" default="1.0"/>
My first question: can function queries be used on fields that are only stored (not indexed)?
When I use the schema above with the following query
http://localhost:8983/solr/select?q=bank&df=keywords&fl=id&sort=pow(score,%20boost)%20asc
I get an error like
sort param could not be parsed as a query, and is not a field that exists in the index:
Then I changed the schema to
<field name="boost" type="float" stored="true" indexed="true" default="1.0"/>
That problem went away, but a new error appeared for the query
http://localhost:8983/solr/select?q=bank&df=keywords&fl=id,pow(score,%20boost)
The following error appeared:
<lst name="error">
<str name="msg">undefined field: "score"</str>
<int name="code">400</int>
</lst>
Where am I going wrong?
Am I right to change the attributes of the boost field?
I would recommend using a boost function and sorting just by score (this is the default, so no sort parameter is needed).
bf=linear(boost,100,0)
You may use other functions, depending on your use case.
Check the Solr documentation on function queries for the options.
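To show how the bf parameter fits into a full request, here is a small sketch that builds the URL. It assumes the edismax query parser is selected (bf is a dismax/edismax parameter, which the original query did not set) and that the boost field is indexed; the function name build_boosted_query is hypothetical:

```python
from urllib.parse import urlencode


def build_boosted_query(base_url, text, default_field):
    """Build a select URL that adds linear(boost,100,0) as a boost function."""
    params = {
        "q": text,
        "df": default_field,
        "defType": "edismax",         # assumption: needed for bf to apply
        "bf": "linear(boost,100,0)",  # linear(x,m,c) evaluates to m*x + c
        "fl": "id,score",             # score is requested as a pseudo-field
    }
    return f"{base_url}?{urlencode(params)}"
```

Because results are sorted by score by default, no sort parameter is included.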

Search Predicate Builder

I am using Lucene search with Sitecore 7.2 and using predicate builder to search for data. I have included a computed field in the index which is a string. When I search on that field using .Contains(mystring), it fails when there is 'and' present in mystring. If there is no 'and' in the mystring it works.
Can you suggest anything?
By default, when the field and the query are processed, Lucene strips out so-called "stop words" such as "and" and "the".
If you don't want this behaviour, you can add an entry to the fieldMap section of your configuration to tell Sitecore how to process the field:
<fieldNames hint="raw:AddFieldByFieldName">
<field fieldName="YOURFIELDNAME" storageType="YES" indexType="UN_TOKENIZED" vectorType="NO" boost="1f" type="System.String" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider">
<analyzer type="Sitecore.ContentSearch.LuceneProvider.Analyzers.LowerCaseKeywordAnalyzer, Sitecore.ContentSearch.LuceneProvider" />
</field>
...
</fieldNames>
This example tells Sitecore not to tokenize that field and to convert everything to lowercase. You can switch to different analyzers to get the results you want.
As another combination, you can try setting indexType to TOKENIZED while still using the LowerCaseKeywordAnalyzer. UN_TOKENIZED means that your string is processed as a single token, which may not be what you want.
I have solved it, taking a hint from @Stephen Pope's reply. To make your computed field untokenized, you have to add it to both raw:AddFieldByFieldName and AddComputedIndexField.
See the link below:
http://www.sitecore.net/Community/Technical-Blogs/Martina-Welander-Sitecore-Blog/Posts/2013/09/Sitecore-7-Search-Tips-Computed-Fields.aspx
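The effect of stop-word removal versus keyword-style analysis can be illustrated with a toy model. This is not Lucene or Sitecore code; the stop-word set and both function names are made up for the illustration:

```python
# Toy model only: these functions imitate the *effect* of the two
# analysis styles discussed above, not the real analyzers.
STOP_WORDS = {"and", "the", "a", "of"}  # illustrative subset


def tokenized_with_stop_words(text):
    """Split on whitespace, lowercase, and drop stop words."""
    return [t for t in text.lower().split() if t not in STOP_WORDS]


def lowercase_keyword(text):
    """Keep the whole value as a single lowercased token."""
    return [text.lower()]
```

With the first function, "Research and Development" loses its "and", so a search for the full string no longer lines up with what was indexed; with the second, the value survives as one token.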

Solr no search results on new field

I added a multivalue field to schema.xml as follows:
<field name="fieldsharedsite" type="string" indexed="true" stored="false" multiValued="true" />
<field name="fieldsharedchannelnew" type="string" indexed="true" stored="false" multiValued="true" />
When I look at a document's contents, I get the following result:
<fieldsharedsite><item key="0">33</item></fieldsharedsite>
<fieldsharedchannelnew><item key="0">52</item></fieldsharedchannelnew>
so I am sure fieldsharedchannelnew is in the results
When I do the following search:
q=fieldsharedsite:33
I do get the document
but when I do
q=fieldsharedchannelnew:52
I don't get any results.
fieldsharedsite has been here for a while and I'm trying to add fieldsharedchannelnew.
I reindexed all the content, but that did not help.
If I look at the schema browser, I have for fieldsharedsite:
Field Type: string
Properties: Indexed, Multivalued, Omit Norms, Sort Missing Last
Schema: Indexed, Multivalued, Omit Norms, Sort Missing Last
Index: (unstored field)
Index Analyzer: org.apache.solr.schema.FieldType$DefaultAnalyzer
Query Analyzer: org.apache.solr.schema.FieldType$DefaultAnalyzer
Docs: 902
and for fieldsharedchannnelnew I have:
Field Type: string
Properties: Indexed, Multivalued, Omit Norms, Sort Missing Last
Index Analyzer: org.apache.solr.schema.FieldType$DefaultAnalyzer
Query Analyzer: org.apache.solr.schema.FieldType$DefaultAnalyzer
What step did I miss in adding the fieldsharedchannelnew index? Why is it not returning any results when I search for it?
The schema browser output for the field fieldsharedchannnelnew does not indicate that it is populated in any documents:
the Docs information is missing, whereas fieldsharedsite shows that it exists in 902 docs.
Field Type: string
Properties: Indexed, Multivalued, Omit Norms, Sort Missing Last
Index Analyzer: org.apache.solr.schema.FieldType$DefaultAnalyzer
Query Analyzer: org.apache.solr.schema.FieldType$DefaultAnalyzer
When I search for a document contents, I get the following result:
<fieldsharedsite><item key="0">33</item></fieldsharedsite>
<fieldsharedchannelnew><item key="0">52</item></fieldsharedchannelnew>
Since the fields are not stored, they would not be returned with the results.
Is this the data you are feeding to Solr? How do the fields appear in the results?
Are you using the value as is, or do you want to use a copyField?
You could mark the fields as stored, reindex the contents, and check whether they are returned with the results and whether the schema browser shows the Docs information.
If so, you should be able to search on them as well.
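Another way to check whether any documents carry the field is a range query with rows=0, reading numFound from the response. A small sketch that only builds the request URL; the function name field_presence_url is hypothetical and the base URL is whatever select endpoint your core exposes:

```python
from urllib.parse import urlencode


def field_presence_url(base_url, field):
    """Build a URL whose numFound counts the documents that have `field`."""
    params = {
        "q": f"{field}:[* TO *]",  # matches any document with a value
        "rows": 0,                 # only the count is of interest
        "wt": "json",
    }
    return f"{base_url}?{urlencode(params)}"
```

Comparing the count for fieldsharedsite with the count for fieldsharedchannelnew shows quickly whether the new field was populated at all.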

What is the use of "multiValued" field type in Solr?

I'm new to Apache Solr. Even after reading the documentation, I find it difficult to understand the function and use of the multiValued field type property.
How does Solr internally treat a field that is marked as multiValued?
What is the difference in indexing between a field that is multiValued and one that is not?
Can somebody explain this with a good example?
Doc says:
multiValued=true|false
True if this field may contain multiple values per document, i.e. if it can appear multiple times in a document
A multivalued field is useful when a field can have more than one value. An easy example is tags: a document can have multiple tags that need to be indexed. If the tags field is multivalued, the Solr response returns a list instead of a single string value. Note that you need to submit one field element per tag value, like:
<field name="tags">tag1</field>
<field name="tags">tag2</field>
...
<field name="tags">tagn</field>
Once all the values are indexed, you can search or filter results by any value, e.g. you can find all documents with tag1 using a query like
q=tags:tag1
or use the tags to filter results like
q=query&fq=tags:tag1
multiValued defines in the schema whether a field is allowed to have more than one value.
For instance:
if I have a field called id with multiValued=false, indexing a document such as this:
doc {
id : [ 1, 2]
...
}
would cause an exception to be thrown in the indexing thread, and the document would not be indexed (schema validation fails).
On the other hand, if I do have multiple values for a field, I would set multiValued=true to guarantee that indexing is done correctly, for example:
doc {
id : 1
keywords: [ hello, world ]
...
}
In this case you would define "keywords" as a multiValued field.
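The rule described above can be mimicked with a toy validator. This is only an illustration of the behaviour, not Solr's implementation; the SCHEMA dict and the validate function are invented for the sketch:

```python
# Toy schema: field name -> whether more than one value is allowed.
SCHEMA = {
    "id": {"multiValued": False},
    "keywords": {"multiValued": True},
}


def validate(doc, schema=SCHEMA):
    """Raise ValueError if a single-valued field receives several values."""
    for name, value in doc.items():
        if name not in schema:
            raise ValueError(f"undefined field: {name}")
        has_many = isinstance(value, list) and len(value) > 1
        if has_many and not schema[name]["multiValued"]:
            raise ValueError(f"multiple values for non-multiValued field: {name}")
    return True
```

Here validate({"id": 1, "keywords": ["hello", "world"]}) passes, while validate({"id": [1, 2]}) raises, mirroring the two documents shown above.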
I use multivalued fields only with copyFields, so think of it this way: every field is single-valued unless it is a copyField destination. For example, I have the following fields:
<field name="id" type="string" indexed="true" stored="true"/>
<field name="name" type="string" indexed="true" stored="true"/>
<field name="subject" type="string" indexed="true" stored="true"/>
<field name="location" type="string" indexed="true" stored="true"/>
I want to query a single field and still search all four fields above, so we need copyField: first create a new field called 'all', then copy everything into it.
<field name="all" type="text" indexed="true" stored="true" multiValued="true"/>
<copyField source="*" dest="all"/>
Now the field 'all' needs to be multivalued.
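Why the destination has to be multivalued becomes clear from a toy imitation of the copy step. This sketch only models the outcome of copying every source field into one destination; the function name apply_copy_field is hypothetical and this is not how Solr implements copyField:

```python
def apply_copy_field(doc, dest="all"):
    """Collect the values of every other field into the `dest` field."""
    copied = []
    for name, value in doc.items():
        if name == dest:
            continue
        # A source may hold one value or a list of values.
        copied.extend(value if isinstance(value, list) else [value])
    return {**doc, dest: copied}
```

A document with the four single-valued fields id, name, subject and location ends up with four values in 'all', which a single-valued field could not hold.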