Lucene: Can I run a query against only a few specific docs of the collection?

Can I run a query against only a few specific docs of the collection?
Can I filter the built collection according to the content of the documents' fields?
For example, I would like to query only documents having field2 = "abc".
Thanks.

Sure, use a Filter. See http://lucene.apache.org/java/3_0_1/api/core/org/apache/lucene/search/QueryWrapperFilter.html
The code will look something like:
QueryParser qp = ...
Filter filter = new QueryWrapperFilter(qp.parse("field2:abc"));
// pass filter to searcher.search()

Related

How to get many terms matched using Hibernate Search query DSL?

When I search for "cars blue" I get every result that matches "cars" or "blue", but I need results that match both. I've read about setting a defaultOperator to AND, but I can't find where to do that.
I also can't use PhraseQuery, because the order of the terms in the search query is irrelevant.
This is my code so far. Thanks!
// create the query using Hibernate Search query DSL
QueryBuilder queryBuilder = fullTextEntityManager.getSearchFactory()
    .buildQueryBuilder().forEntity(Articulo.class).get();
// a very basic query by keywords
BooleanJunction<BooleanJunction> bool = queryBuilder.bool();
bool.must(queryBuilder.keyword()
    .onFields("description")
    .matching(text)
    .createQuery()
);
Query query = bool.createQuery();
FullTextQuery jpaQuery =
    fullTextEntityManager.createFullTextQuery(query, Articulo.class);
return jpaQuery.getResultList();
Note: I'm using Hibernate Search 5.6.4
I think you're looking for the Simple query string feature.
See http://docs.jboss.org/hibernate/stable/search/reference/en-US/html_single/#_simple_query_string_queries for more details about it.
The documentation includes an example using .withAndAsDefaultOperator():
Query luceneQuery = mythQB
    .simpleQueryString()
    .onField("history")
    .withAndAsDefaultOperator()
    .matching("storm tree")
    .createQuery();
This blog post explaining the rationale of this feature might be helpful too: http://in.relation.to/2017/04/27/simple-query-string-what-about-it/ .
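To make the difference between the two default operators concrete, here is a minimal plain-Java sketch. It does not use Hibernate Search or Lucene; the class and method names are hypothetical, and it assumes simple whitespace tokenization with lowercasing, which is much cruder than a real analyzer. With AND semantics a document matches only if every query term is present, in any order; with OR semantics one matching term is enough.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only: models AND vs OR term matching on plain strings.
final class AndOperatorSketch {

    // Split on whitespace and lowercase; a real analyzer does far more than this.
    static Set<String> tokens(String text) {
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        if (trimmed.isEmpty()) {
            return new HashSet<>();
        }
        return new HashSet<>(Arrays.asList(trimmed.split("\\s+")));
    }

    // AND semantics: every query term must occur in the document, order is irrelevant.
    static boolean matchesAll(String document, String query) {
        return tokens(document).containsAll(tokens(query));
    }

    // OR semantics: at least one query term must occur in the document.
    static boolean matchesAny(String document, String query) {
        Set<String> docTokens = tokens(document);
        for (String term : tokens(query)) {
            if (docTokens.contains(term)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(matchesAll("blue vintage cars", "cars blue"));
        System.out.println(matchesAll("red cars", "cars blue"));
        System.out.println(matchesAny("red cars", "cars blue"));
    }
}
```

In the question's terms, "cars blue" under OR behaves like matchesAny, while .withAndAsDefaultOperator() asks for the matchesAll behaviour.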

Using Wildcard Sql for searching a word in a TextField

To make it clearer, I have these fields:
Columntobesearch
aword1 bword1
aword2 bword2
aword3 bword4
What I want to do is search using the SQL wildcard, so I did this:
%searchbox%
I placed wildcards on both ends of my search term, but it only matches the first word in the field.
When I search for 'aword' all of the rows are shown, but when I search for 'bword' nothing is shown. Please help.
Here is my full code:
$Input=Input::all();
$makethis=Input::flash();
$soptions=Input::get('soptions');
$searchbox=Input::get('searchbox');
$items = Gamefarm::where('roost_hen', '=', Input::get('sex'))
    ->where($soptions, 'LIKE', '%' . $searchbox . '%')
    ->paginate(12);
If you use MySQL, you can try this:
<?php
$q = Input::get('searchbox');
$results = DB::table('table')
    ->whereRaw("MATCH(columntobesearch) AGAINST(? IN BOOLEAN MODE)",
        array($q)
    )->get();
Of course, you need to prepare your table for full-text search in your migration file with
DB::statement('ALTER TABLE table ADD FULLTEXT search(columntobesearch)');
That said, this is neither the most scalable nor the most efficient way to do full-text search.
For scalable and reliable full-text search, I strongly recommend looking at Elasticsearch and using one of the Laravel packages that integrate with it.
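One caveat with MATCH ... AGAINST ... IN BOOLEAN MODE is that it matches whole words, so a bare 'bword' would not find 'bword1'. In boolean mode a leading + marks a word as required and a trailing * acts as a prefix wildcard. The following is a minimal sketch, written in Java for illustration, of turning user input into such a string before binding it to the ? placeholder. The class and method names are hypothetical, and dropping every character that is not a letter, digit, or underscore is an assumption made to keep user input from being read as boolean operators.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds a MySQL boolean-mode search string from free text.
final class BooleanModeQuerySketch {

    // Turns "aword bword" into "+aword* +bword*":
    // every word is required (+) and matched as a prefix (*).
    static String toPrefixQuery(String userInput) {
        List<String> terms = new ArrayList<>();
        for (String raw : userInput.trim().split("\\s+")) {
            // Keep only letters, digits and underscores so operator characters are removed.
            String word = raw.replaceAll("[^\\p{L}\\p{N}_]", "");
            if (!word.isEmpty()) {
                terms.add("+" + word + "*");
            }
        }
        return String.join(" ", terms);
    }

    public static void main(String[] args) {
        System.out.println(toPrefixQuery("bword"));
        System.out.println(toPrefixQuery("aword bword"));
    }
}
```

The result is still passed as a bound parameter, as in the answer above, not concatenated into the SQL. MySQL's minimum word length and stopword settings still apply to the individual words.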

Lucene query with filter "without property"

I need to write a Lucene query/filter to get objects without a specific property.
I tried with ... ISNULL:"cm:param_name" but it didn't work.
Edit: I have added a new property to an aspect, but objects that haven't been updated yet don't have it among their listed properties (checked with the node browser).
With a query like "cm:*", you should only receive documents that have the field "cm" plus content. Note that you have to allow leading wildcard queries by the query parser with setAllowLeadingWildcard(true).
Also check out this post, which deals with a reversed version of your problem:
Find all Lucene documents having a certain field
Can you please be clearer about what "without property" means? Do you mean that you do not want to specify the field like "field:value" and instead set the filter to "value"?
EDIT
Are you generating these field names dynamically, or is this the only field name that can have its value missing? If there is only one field that may or may not appear in your document, you could populate it with a default value when it's missing and then search for that value. Otherwise, you could try a negated range query like this: NOT foo:[* TO *]. This should match all documents without a value in the foo field. For performance reasons, in the second case the field should be indexed as a string field (not analyzed).
I managed to get this done with .. AND NOT (#namespace\:property:"")
In Java with Lucene 3.6.2, the FieldValueFilter with negation enabled can be used (although this is not quite what the question asked):
import org.apache.lucene.search.FieldValueFilter;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.MatchAllDocsQuery;
import org.apache.lucene.search.TopDocs;
final IndexSearcher indexSearcher = getIndexSearcher(); // wherever that comes from
final TopDocs topdocs = indexSearcher.search(new MatchAllDocsQuery(), new FieldValueFilter("cm", true), Integer.MAX_VALUE);
You can use ISUNSET and/or ISNULL for this scenario.
ISUNSET:"cm:title"
ISNULL:"cm:title"

Find typo with Lucene

I would like to use Lucene to index and search text. The text can contain mistyped words, names, etc. What is the simplest way of getting Lucene to find a document containing
"this is Licene"
when the user searches for
"Lucene"?
This is only for a demo app, so we need the simplest solution.
Lucene's fuzzy queries are based on Levenshtein edit distance.
Use a fuzzy query in the QueryParser, with syntax like:
Lucene~0.5
Or create a FuzzyQuery, passing in the maximum number of edits, something like:
Query query = new FuzzyQuery(new Term("field", "lucene"), 1);
Note: FuzzyQuery, in Lucene 4.x, does not support edit distances greater than 2.
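To see why a maximum of 1 edit is enough for the example in the question, here is a minimal plain-Java sketch of the classic Levenshtein distance. It is not Lucene's implementation (FuzzyQuery uses automata internally); the class name is hypothetical, and the inputs are lowercased by hand on the assumption that the analyzer lowercases indexed terms.

```java
// Hypothetical illustration only: textbook dynamic-programming Levenshtein distance.
final class EditDistanceSketch {

    // Returns the minimum number of single-character insertions,
    // deletions and substitutions needed to turn a into b.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        // "licene" and "lucene" differ by a single substitution (i -> u).
        System.out.println(levenshtein("licene", "lucene"));
    }
}
```

Since "licene" is one substitution away from "lucene", a FuzzyQuery with a maximum of 1 edit should be able to reach it.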
Another option you could try is using the Lucene SpellChecker:
http://lucene.apache.org/core/6_4_0/suggest/org/apache/lucene/search/spell/SpellChecker.html
It works out of the box and is very easy to use:
SpellChecker spellchecker = new SpellChecker(spellIndexDirectory);
// To index a field of a user index:
spellchecker.indexDictionary(new LuceneDictionary(my_lucene_reader, a_field));
// To index a file containing words:
spellchecker.indexDictionary(new PlainTextDictionary(new File("myfile.txt")));
String[] suggestions = spellchecker.suggestSimilar("misspelt", 5);
By default, it uses LevensteinDistance, but you can provide your own customized edit distance.

How can I search Dynamics CRM using OrganizationService for a specific record?

If I know the accountId of a record I can do something like this:
Dim cols As New ColumnSet(New String() {"name",
"address1_postalcode",
"lastusedincampaign"})
Dim retrievedAccount As Account = _orgService.Retrieve("account", _accountId, cols).ToEntity(Of Account)()
But what if I don't know the accountId and instead want to search for a record based on some other factor? Say, returning all records with "John" as the first name?
You have to use the RetrieveMultiple method with a QueryExpression.
See this link for some examples
You could also use Linq to CRM, or Fetch XML.
You can use QueryExpression as explained here:
http://msdn.microsoft.com/en-us/library/gg328300.aspx.
You can write your own ConditionExpression or FetchXml for the QueryExpression.
For more complicated queries I like to use FetchXml. You can do an Advanced Find and then download the generated FetchXml, or use one of the online tools such as Fetch Xml Builder to generate it first.
Hope that helps.