Solr: no search results on new field

I added a multivalue field to schema.xml as follows:
<field name="fieldsharedsite" type="string" indexed="true" stored="false" multiValued="true" />
<field name="fieldsharedchannelnew" type="string" indexed="true" stored="false" multiValued="true" />
When I look at a document's contents, I get the following result:
<fieldsharedsite><item key="0">33</item></fieldsharedsite>
<fieldsharedchannelnew><item key="0">52</item></fieldsharedchannelnew>
so I am sure fieldsharedchannelnew is in the results
When I do the following search:
q=fieldsharedsite:33
I do get the document
but when I do
q=fieldsharedchannelnew:52
I don't get any results.
fieldsharedsite has been here for a while and I'm trying to add fieldsharedchannelnew.
I did reindex all the content, but it did not help the search.
If I look at the schema browser, I have for fieldsharedsite:
Field Type: string
Properties: Indexed, Multivalued, Omit Norms, Sort Missing Last
Schema: Indexed, Multivalued, Omit Norms, Sort Missing Last
Index: (unstored field)
Index Analyzer: org.apache.solr.schema.FieldType$DefaultAnalyzer
Query Analyzer: org.apache.solr.schema.FieldType$DefaultAnalyzer
Docs: 902
and for fieldsharedchannnelnew I have:
Field Type: string
Properties: Indexed, Multivalued, Omit Norms, Sort Missing Last
Index Analyzer: org.apache.solr.schema.FieldType$DefaultAnalyzer
Query Analyzer: org.apache.solr.schema.FieldType$DefaultAnalyzer
What step did I miss in adding the fieldsharedchannelnew index? Why is it not returning any results when I search for it?

The schema browser output for the field fieldsharedchannnelnew does not indicate that it is populated in any documents.
The Docs information is missing, whereas fieldsharedsite shows that it exists in 902 docs.
Field Type: string
Properties: Indexed, Multivalued, Omit Norms, Sort Missing Last
Index Analyzer: org.apache.solr.schema.FieldType$DefaultAnalyzer
Query Analyzer: org.apache.solr.schema.FieldType$DefaultAnalyzer
When I search for a document contents, I get the following result:
<fieldsharedsite><item key="0">33</item></fieldsharedsite>
<fieldsharedchannelnew><item key="0">52</item></fieldsharedchannelnew>
Because the fields are not stored (stored="false"), they would not be returned with the search results.
Is this the data you are feeding to Solr? How do the fields appear in the results?
Are you using the value as is, or do you want to use a copyField?
You could mark the fields as stored, reindex the content, and check whether they are returned with the results and whether the schema browser shows the Docs information.
If so, you should be able to search on the field as well.
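One way to check whether any indexed document actually carries a value in the new field is an existence query of the form field:[* TO *]. The following is a minimal sketch, assuming a core reachable at a hypothetical base URL and the standard /select handler; the function name is made up for illustration, and the code only builds the URL without sending anything.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def field_presence_url(base_url, field):
    """Build a /select URL that matches every document with any value in `field`.

    base_url is a hypothetical core URL such as http://localhost:8983/solr/mycore.
    rows=0 asks only for the hit count (numFound), not for the documents.
    """
    params = {"q": f"{field}:[* TO *]", "rows": 0, "wt": "json"}
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    for name in ("fieldsharedsite", "fieldsharedchannelnew"):
        print(field_presence_url("http://localhost:8983/solr/mycore", name))
```

If numFound is 0 for fieldsharedchannelnew but not for fieldsharedsite, the value is probably not reaching the index under that field name.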

Related

How do I get empty fields in SOLR indexed for a schemaless collection?

How do I get empty fields in SOLR indexed? I am using solr 7.2.0
I am using schemaless SOLR to try to index everything as string, but for files with empty fields, those fields do not get indexed. Is there a way to get them to show up?
col1,col2,col3
a,,1
d,e,
g,h,3
For example, the first row shows up as
{
"col1":"a",
"col3":"1",
}
I'm trying to also get col2 to show up.
In my solrconfig.xml I have this:
<dynamicField name="*" type="text_general" indexed="true" stored="true" required="true" default="" />
and I have removed all traces of the remove-blank processor from my config. I've reloaded and deleted/recreated my collection multiple times. Is there a solution for this?
The CSV import module has its own option to keep empty fields - f.<field name>.keepEmpty=true.
If you don't give that option, the CSV handler will never give the empty field value to the next step in your indexing process.
Passing f.col2.keepEmpty=true as a URL argument should at least give you a better starting point.
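As a rough illustration of that URL argument, the sketch below builds an update URL with one keepEmpty parameter per field. The base URL, the collection name and the function name are assumptions for the example; the CSV body itself would be POSTed to this URL with a CSV content type such as text/csv.

```python
from urllib.parse import urlencode


def csv_update_url(base_url, collection, keep_empty_fields):
    """Build an /update URL that asks the CSV loader to keep empty values.

    base_url and collection are hypothetical placeholders.
    One f.<field>.keepEmpty=true parameter is added per listed field.
    """
    params = [("commit", "true")]
    params += [(f"f.{name}.keepEmpty", "true") for name in keep_empty_fields]
    return f"{base_url}/{collection}/update?{urlencode(params)}"


if __name__ == "__main__":
    print(csv_update_url("http://localhost:8983/solr", "mycollection", ["col2", "col3"]))
```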
Maybe preprocess your CSV file like this:
s/,,/, ,/g
That is, add a space between the two commas (you will have to handle an empty last value separately, though; there is a regex for that).
Then try again. Right now Solr reads the value as nonexistent; making it a space gives it a better chance of making it through, and it should not change search results (unless you have some unusual analysis chains).
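The same preprocessing idea can be sketched with Python's csv module instead of a regex, which also covers consecutive empty cells and an empty last column. This is only an illustration under the assumption that a single space is an acceptable filler; the function name is hypothetical.

```python
import csv
import io


def pad_empty_cells(csv_text, filler=" "):
    """Return csv_text with every empty cell replaced by `filler`."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        # Keep non-empty cells untouched; substitute the filler for empty ones.
        writer.writerow([cell if cell != "" else filler for cell in row])
    return out.getvalue()


if __name__ == "__main__":
    sample = "col1,col2,col3\na,,1\nd,e,\ng,h,3\n"
    print(pad_empty_cells(sample))
```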

Search Predicate Builder

I am using Lucene search with Sitecore 7.2 and using predicate builder to search for data. I have included a computed field in the index which is a string. When I search on that field using .Contains(mystring), it fails when there is 'and' present in mystring. If there is no 'and' in the mystring it works.
Can you suggest anything?
By default, when the field and the query are processed, Lucene strips out so-called "stop words" such as "and" and "the".
If you don't want this behaviour, you can add an entry to the fieldMap section of your configuration to tell Sitecore how to process the field:
<fieldNames hint="raw:AddFieldByFieldName">
<field fieldName="YOURFIELDNAME" storageType="YES" indexType="UN_TOKENIZED" vectorType="NO" boost="1f" type="System.String" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider">
<analyzer type="Sitecore.ContentSearch.LuceneProvider.Analyzers.LowerCaseKeywordAnalyzer, Sitecore.ContentSearch.LuceneProvider" />
</field>
...
</fieldNames>
This example tells Sitecore not to tokenize that field and to put everything into lowercase. You can switch to a different analyzer to get the results you want.
You can try setting the indexType to TOKENIZED but still using the LowerCaseKeywordAnalyzer as another combination. UN_TOKENIZED will mean that your string will be processed as a single token which may not be what you want.
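To make the difference concrete, here is a toy model in plain Python. It is not Lucene or Sitecore code: the stop-word list and both function names are invented for illustration, and real analyzers do more than this. It only shows why "and" disappears under a stop-word filter but survives when the whole value is kept as one lowercased token.

```python
# Hypothetical, deliberately tiny stop-word list for the illustration.
STOP_WORDS = {"and", "the", "a", "of"}


def stopword_filtered_tokens(text):
    """Toy stand-in for a tokenizing analyzer that drops stop words."""
    return [word for word in text.lower().split() if word not in STOP_WORDS]


def lowercase_keyword_token(text):
    """Toy stand-in for a keyword analyzer: one lowercased token, nothing removed."""
    return [text.lower()]


if __name__ == "__main__":
    value = "Salt and Pepper"
    print(stopword_filtered_tokens(value))
    print(lowercase_keyword_token(value))
```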
I have solved it, taking a hint from @Stephen Pope's reply. In order to make your computed field untokenized, you have to add it to both raw:AddFieldByFieldName and AddComputedIndexField.
See the link below:
http://www.sitecore.net/Community/Technical-Blogs/Martina-Welander-Sitecore-Blog/Posts/2013/09/Sitecore-7-Search-Tips-Computed-Fields.aspx

What is the use of "multiValued" field type in Solr?

I'm new to Apache Solr. Even after reading the documentation part, I'm finding it difficult to clearly understand the functionality and use of the multiValued field type property.
What does Solr do internally with a field that is marked as multiValued?
What is the difference in indexing in Solr between a field that is multiValued and those that are not?
Can somebody explain with a good example?
The docs say:
multiValued=true|false
True if this field may contain multiple values per document, i.e. if it can appear multiple times in a document.
A multivalued field is useful when a field can have more than one value. An easy example is tags: a document can have multiple tags that need to be indexed. If the tags field is multivalued, the Solr response returns a list instead of a single string value. One point to note is that you need to submit one field element per value of tags, like:
<field name="tags">tag1</field>
<field name="tags">tag2</field>
...
<field name="tags">tagn</field>
Once you have all the values indexed, you can search or filter results by any value; e.g. you can find all documents with tag1 using a query like
q=tags:tag1
or use the tags to filter the results, like
q=query&fq=tags:tag1
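As a small sketch of what "one field element per value" looks like when generated programmatically, the code below builds an XML add message with Python's standard library. The id field and the function name are assumptions for the example; only the repeated tags elements come from the answer above.

```python
import xml.etree.ElementTree as ET


def build_add_xml(doc_id, tags):
    """Build an <add><doc> message with one <field name="tags"> per tag value."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    # "id" is an assumed unique-key field name for this illustration.
    ET.SubElement(doc, "field", name="id").text = doc_id
    for tag in tags:
        ET.SubElement(doc, "field", name="tags").text = tag
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    print(build_add_xml("1", ["tag1", "tag2"]))
```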
multiValued defines in the schema whether the field is allowed to have more than one value.
For instance:
if I have a field called id which is multiValued=false, indexing a document such as this:
doc {
id : [ 1, 2]
...
}
would cause an exception to be thrown in the indexing thread and the document will not be indexed (schema validation will fail).
On the other hand if I do have multiple values for a field I would want to set multiValued=true in order to guarantee that indexing is done correctly, for example:
doc {
id : 1
keywords: [ hello, world ]
...
}
In this case you would define "keywords" as a multiValued field.
I use multivalued fields only with copyFields. Think of it this way: all fields are single-valued unless they are the destination of a copyField. For example, I have the following fields:
<field name="id" type="string" indexed="true" stored="true"/>
<field name="name" type="string" indexed="true" stored="true"/>
<field name="subject" type="string" indexed="true" stored="true"/>
<field name="location" type="string" indexed="true" stored="true"/>
If I want to query a single field and still search all 4 fields above, I need to use copyField: first create a new field called 'all', then copy everything into 'all'.
<field name="all" type="text" indexed="true" stored="true" multiValued="true"/>
<copyField source="*" dest="all"/>
Now the field 'all' needs to be multivalued.

The field value is 1 or true in solr search results

I have one field that is indexed as a string in Solr's schema.xml; it comes from a boolean (tinyint) column in a MySQL database.
In my query, I search against this field using 1. Without any change on my side, this query stopped returning correct results. After I used true instead of 1, it worked again. Now it is the other way round: it fails with true but works with 1.
What's the exact problem here? Do I need to change the field type in schema.xml to integer?
Thank you in advance.
Since it's a string field, we can't possibly know how you indexed it. It could be "true" / "false" or "1" / "0" or "on" / "off", etc. Or even a mix of these, maybe you have some documents with "true" and some with "1".
If it's semantically a boolean field I recommend using the boolean fieldType, e.g.:
<field name="inStock" type="boolean" indexed="true" stored="true" />
For this to work, you need the boolean fieldType declared (it is declared in the default schema):
<fieldType name="boolean" class="solr.BoolField" sortMissingLast="true" omitNorms="true"/>
Remember to rebuild the index after this change.
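If the source data really does mix representations such as "1" and "true", one option is to normalize the value before it is sent to Solr. The sketch below assumes a particular set of accepted spellings; both sets and the function name are hypothetical and should be adjusted to what the database actually produces.

```python
# Assumed spellings for the illustration; extend or trim to match the real data.
TRUE_VALUES = {"1", "true", "on", "yes"}
FALSE_VALUES = {"0", "false", "off", "no"}


def to_boolean_string(raw):
    """Map a mixed representation (1, "1", "true", True, ...) to "true" or "false"."""
    text = str(raw).strip().lower()
    if text in TRUE_VALUES:
        return "true"
    if text in FALSE_VALUES:
        return "false"
    raise ValueError(f"unrecognised boolean value: {raw!r}")


if __name__ == "__main__":
    for value in (1, "0", "TRUE", False):
        print(value, "->", to_boolean_string(value))
```

Raising on unknown input is a deliberate choice here, so that unexpected values surface during indexing rather than as confusing search results later.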

Location aware search

I am trying out location-aware search with the spatial example found at
http://www.ibm.com/developerworks/java/library/j-spatial/#indexing.approaches.
The schema.xml has a geohash field, but this field is not present in any of the .osm files (present in data folder) used to index. I am not able to understand how the value is assigned to it, so that when I give this query
http://localhost:8983/solr/select/?q=_val_:"recip (ghhsin(geohash(44.79, -93), geohash, 3963.205), 1, 1, 0)"^100
the result set has the geohash value retrieved. How is this happening? Please help me.
The Solr wiki has a pretty good page on how spatial search can be done with Solr 1.5+.
To summarize, your schema defines 'geohash' typed fields:
<fieldtype name="geohash" class="solr.GeoHashField"/>
<field name="destination" type="geohash" indexed="true" stored="true"/>
Data feeders pass in geohashed coordinates:
<field name="destination">cbj1pb56p4b</field> <!-- 45.17614 -93.87341 -->
You probably should go back to using simple latitude and longitude coordinates to start off with. There are better docs for it.
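For readers wondering how a latitude/longitude pair turns into a string like the one above, here is a sketch of the standard geohash encoding (alternating longitude and latitude bisection, five bits per base-32 character). It illustrates the general algorithm only; the function name is made up, and whether the encoding happens in the data feeder or inside Solr depends on the setup.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=11):
    """Encode a latitude/longitude pair as a geohash string of `precision` characters."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half
        else:
            bits <<= 1
            rng[1] = mid  # keep the lower half
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


if __name__ == "__main__":
    print(geohash_encode(45.17614, -93.87341))
```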