How to index and search custom fields using Lucene or Hibernate Search?

How can I index and search custom fields using Lucene or Hibernate Search? I cannot find a way to index the custom fields because they are dynamic.
'Custom fields' here means fields that users can edit; they are not hard-coded.
Any help would be appreciated!

Query of Custom Fields
Just use the projection API:
FullTextQuery hibernateQuery = fullTextSession
.createFullTextQuery(luceneQuery)
.setProjection("myField1", "myField2");
List results = hibernateQuery.list();
Using projections you get to read any field as long as it's STORED.
If it matches some property name of your indexed entities it will be materialized after being converted to the appropriate type (if you have a TwoWayFieldBridge); if not you will get the String value.
If for some reason you need to bypass this conversion, or just want to have fun decoding the raw Lucene Document, you can open an IndexReader directly.
Indexing Custom Fields
When defining a FieldBridge you get to add as many fields as you like to the indexed Document, and you can name each of them as you like.
The method parameter name is a hint - useful for example to scope the field name - but you can ignore it.
An example FieldBridge implementation writing multiple fields is the DateSplitBridge in the documentation.
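As a rough illustration of the naming idea above, here is a minimal plain-Java sketch. It does not use Hibernate Search types; the class name, the toIndexFields helper and the "custom" prefix are all hypothetical. It only shows how a bridge could turn a map of user-defined fields into index field names scoped by the name hint. In a real FieldBridge, each resulting entry would be added to the Lucene Document as its own field.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: plain Java, no Hibernate Search or Lucene classes involved.
class CustomFieldNamingSketch {

    // 'name' plays the role of the field-name hint a bridge receives;
    // here it is used as a prefix to scope every user-defined field.
    static Map<String, String> toIndexFields(String name, Map<String, String> customFields) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : customFields.entrySet()) {
            fields.put(name + "." + entry.getKey(), entry.getValue());
        }
        return fields;
    }

    public static void main(String[] args) {
        Map<String, String> userDefined = new LinkedHashMap<>();
        userDefined.put("color", "red");
        userDefined.put("size", "XL");
        // Each entry would become one field in the indexed document.
        System.out.println(toIndexFields("custom", userDefined));
    }
}
```

Because the field names are computed at indexing time, users can add new custom fields without any change to the entity mapping.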

Related

Sitecore: Full text search using lucene

I'm using Sitecore 8 and I'm looking for a way to run a full-text search across all my Sitecore content. I have a solution in place, but I feel there's got to be a better way to do this.
My approach:
I have a computed field that merges all text fields into a single field. Before I execute a search, I tokenize my search text and build an ORed predicate to match on that field.
I do not like this approach because it gets really complicated if I need to boost items that match the title over those that match the body, i.e. I lose field-level boosting.
FYI: my code is very similar to this SO post.
Thanks
Sitecore already maintains a full text field, _content, that contains all the text fields. You can run your search against that. You can even create computed fields that add to _content (such as the datasource content example here).
So assuming you are building a LINQ query for your full text search, and have already filtered on templates, latest version, location, etc., adding your search terms to the query would look something like this:
var terms = SearchTerm.Split();
var currentExpression = PredicateBuilder.True<SiteSearchResultItem>();
foreach (var term in terms)
{
    // Content is mapped to _content
    currentExpression = PredicateBuilder.And(currentExpression, x => x.Content.Contains(term));
}
query = query.Where(currentExpression);
Typically you would want to AND search terms rather than ORing them.
You are right that field level boosting is lost in this. In the end, Lucene is not a great solution for creating a quality full-text site search. If this is an important requirement, you may want to look at Coveo or even something like a Google Site Search.

SOLR indexed item has extra word which is not available in query parameter - how to identify those cases?

We have a scenario where we are trying to perform accurate name matching of Items using SOLR.
Query Parameter: Apple
SOLR Indexed Word: Apple-D
In our business case, "Apple" and "Apple-D" are totally different items and therefore SOLR shouldn't return the match.
Is there an option to achieve the same?
You need to change the fieldType used for the field. Use the string fieldType (solr.StrField) for your field.
The string fieldType makes sure that Solr stores and indexes the value exactly as it is.
It does not apply any analysis to the value, and it does not split it into tokens.
With the string type applied, Apple and Apple-D are indexed as two different tokens because no tokenizing takes place. This gives you the exact match.
Once you change the fieldType, re-index your data.
You can use the Solr analysis tool to check how the field is indexed and queried.
Note: Whenever you ask a question like this, share your schema.xml.
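To see why an analyzed field matches here while a string field does not, consider the following plain-Java sketch. It does not use Solr. The tokenize helper is a hypothetical stand-in for a text analyzer that lowercases and splits on non-alphanumeric characters, which is roughly what a typical text fieldType does; analyzedMatch and exactMatch are illustrative names, not Solr APIs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Illustration only: simulates analyzed vs. un-analyzed matching without Solr.
class ExactMatchSketch {

    // Hypothetical stand-in for a text analyzer: lowercase, split on non-alphanumerics.
    static List<String> tokenize(String value) {
        return Arrays.stream(value.toLowerCase(Locale.ROOT).split("[^a-z0-9]+"))
                .filter(token -> !token.isEmpty())
                .collect(Collectors.toList());
    }

    // Analyzed field: the query matches if all of its tokens occur among the indexed tokens.
    static boolean analyzedMatch(String indexed, String query) {
        return tokenize(indexed).containsAll(tokenize(query));
    }

    // string-style field: the whole value is a single token, so only an identical value matches.
    static boolean exactMatch(String indexed, String query) {
        return indexed.equals(query);
    }

    public static void main(String[] args) {
        System.out.println(analyzedMatch("Apple-D", "Apple")); // true: "Apple-D" was split into "apple" and "d"
        System.out.println(exactMatch("Apple-D", "Apple"));    // false: the values differ
        System.out.println(exactMatch("Apple", "Apple"));      // true
    }
}
```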

How to prevent a field from not analyzing in lucene

I want some fields, like URLs, to be indexed and stored but not analyzed. The Field class had a constructor for this:
Field(String name, String value, Field.Store store, Field.Index index)
But this constructor has been deprecated since Lucene 4, and it is suggested to use StringField or TextField objects instead. They don't have any constructor that lets me specify how the field is indexed. So can it be done?
The correct way to index and store an un-analyzed field, as a single token, is to use StringField. It is designed to handle atomic strings, like ID numbers, URLs, etc. You can specify whether it is stored, similarly to Lucene 3.x.
For example:
new StringField("myUrl", "http://stackoverflow.com/questions/19042587/how-to-prevent-a-field-from-not-analyzing-in-lucene", Field.Store.YES)
Hello, you are totally right in what you are saying. With the new fields provided by Lucene you cannot achieve what you want.
You can either continue using the Field class as you described, or implement your own field by implementing the IndexableField interface. There you can decide for yourself what behavior you want your field to have.

hibernate search multiple fields based on language

I'm interested in switching from database full-text search to Lucene. I'm using Hibernate, so I guess it would be smart to use Hibernate Search. I have a problem though.
Each record has a list of info entries and titles in different languages, and I need to be able to search both within a single language and across all languages.
I could probably do it in plain Lucene, but I don't know how well it would work with our current transactions. So using Hibernate Search and Hibernate to manage the index would be much better.
Is it possible to create such fields in the index to search the way I described?
class Record {
    List<Info> infos;
}
class Info {
    String title;
    String infoText;
    String langCode;
}
Can I do it like this? Create getters in Record such as:
public String getEnglishTitle(){...}
public String getFullInfos(){...}
And then put index annotations on these getters so that the necessary fields end up in the index?
I would write a custom FieldBridge for the infos property. Then you have full control over which fields you add to the index; e.g., you could use text. as a prefix for the field names. This should allow you to decide dynamically which language to search. Remember that you have to think about the analyzers too. A custom per-field analyzer would work.
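A minimal plain-Java sketch of the field-naming part of such a bridge follows. It uses no Hibernate Search classes, and the "<property>.<langCode>" naming scheme, the class name and the toIndexFields helper are assumptions made for illustration only. In a real FieldBridge, each entry of the resulting map would be written to the Lucene Document as a separate field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: plain Java, no Hibernate Search types involved.
class LanguageFieldSketch {

    // Mirrors the Info class from the question.
    static class Info {
        final String title;
        final String infoText;
        final String langCode;

        Info(String title, String infoText, String langCode) {
            this.title = title;
            this.infoText = infoText;
            this.langCode = langCode;
        }
    }

    // Groups values under language-specific field names such as "title.en".
    // The "<property>.<langCode>" scheme is an assumption of this sketch.
    static Map<String, List<String>> toIndexFields(List<Info> infos) {
        Map<String, List<String>> fields = new LinkedHashMap<>();
        for (Info info : infos) {
            fields.computeIfAbsent("title." + info.langCode, key -> new ArrayList<>()).add(info.title);
            fields.computeIfAbsent("infoText." + info.langCode, key -> new ArrayList<>()).add(info.infoText);
        }
        return fields;
    }

    public static void main(String[] args) {
        List<Info> infos = Arrays.asList(
                new Info("Hello", "A greeting", "en"),
                new Info("Hallo", "Ein Gruss", "de"));
        System.out.println(toIndexFields(infos));
    }
}
```

With names like these, a single-language search targets one field (for example title.en), while a search over all languages queries every language-specific field.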

In Lucene, using a Standard Analyzer, I want to make fields with spaces searchable

In Lucene, using a StandardAnalyzer, I want to make fields with spaces searchable.
I set Field.Index.NOT_ANALYZED and Field.Store.YES while using the StandardAnalyzer.
When I look at my index in LUKE, the fields are as I expected, a field and a value such as:
location -> 'New York'.
Here I found that I can use the KeywordAnalyzer to find this value using the query:
location:"New York".
But I want to add another term to the query. Let's say I have a body field which contains the normalized and analyzed terms created by the StandardAnalyzer. Using the KeywordAnalyzer for this field I get different results than when I use the StandardAnalyzer.
How do I combine two Analyzers in one QueryParser, where one Analyzer works for some fields and another one for other fields? I thought of creating my own Analyzer which could behave differently depending on the field, but I have no clue how to do it.
PerFieldAnalyzerWrapper lets you apply different analyzers for different fields.
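The idea behind it is a lookup from field name to analyzer, with a default analyzer for every other field. The sketch below shows only that dispatch idea in plain Java; it is not the Lucene API. KEYWORD and STANDARD are simplified, hypothetical stand-ins for KeywordAnalyzer and StandardAnalyzer, and the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Illustration only: per-field analyzer dispatch without any Lucene classes.
class PerFieldDispatchSketch {

    // Stand-in for a keyword-style analyzer: the whole value is one token.
    static final Function<String, List<String>> KEYWORD = Collections::singletonList;

    // Stand-in for a standard-style analyzer: lowercase, split on non-alphanumerics.
    static final Function<String, List<String>> STANDARD = value ->
            Arrays.stream(value.toLowerCase(Locale.ROOT).split("[^a-z0-9]+"))
                    .filter(token -> !token.isEmpty())
                    .collect(Collectors.toList());

    private final Function<String, List<String>> defaultAnalyzer;
    private final Map<String, Function<String, List<String>>> perField;

    PerFieldDispatchSketch(Function<String, List<String>> defaultAnalyzer,
                           Map<String, Function<String, List<String>>> perField) {
        this.defaultAnalyzer = defaultAnalyzer;
        this.perField = new HashMap<>(perField);
    }

    // Uses the analyzer registered for the field, or the default if none is registered.
    List<String> analyze(String field, String value) {
        return perField.getOrDefault(field, defaultAnalyzer).apply(value);
    }

    public static void main(String[] args) {
        PerFieldDispatchSketch wrapper = new PerFieldDispatchSketch(
                STANDARD, Collections.singletonMap("location", KEYWORD));
        System.out.println(wrapper.analyze("location", "New York")); // [New York]
        System.out.println(wrapper.analyze("body", "New York"));     // [new, york]
    }
}
```

In Lucene itself, the wrapper would be handed to the QueryParser so that location is treated as a single keyword while body goes through the StandardAnalyzer.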