Lucene query in Liferay - lucene

How can I create a Lucene query for the condition: className AND (“Apple-Orange” OR “Apple Banana” OR “Apple Shake”)?
I tried:
BooleanQuery specialityQuery = BooleanQueryFactoryUtil.create(searchContext);
specialityQuery.setQueryConfig(searchContext.getQueryConfig());
specialityQuery.add(contextQuery, BooleanClauseOccur.MUST);
BooleanQuery idFilter = BooleanQueryFactoryUtil.create(searchContext);
for (String speciality : specialities) {
    TermQuery termQuery = TermQueryFactoryUtil.create(searchContext, "fruit", speciality);
    idFilter.add(termQuery, BooleanClauseOccur.SHOULD);
    LOGGER.info(" Term Query " + idFilter);
}
specialityQuery.add(idFilter, BooleanClauseOccur.MUST);
But this doesn’t work. Any other ideas?
P.S - Question posted on Liferay as well - https://www.liferay.com/community/forums/-/message_boards/message/48383912
Tina

Related

Hibernate Search DSL and Lucene query on Multiple Fields

I'm not really sure how involved this might be, but could someone help me with the problem below?
I'm trying to implement search functionality in my project based on employee first and last name. I have used Spring Data REST and Hibernate Search for this purpose.
@Transactional
public List<EmployeeSearchDTO> search(String searchText) {
    FullTextEntityManager fullTextEntityManager = org.hibernate.search.jpa.Search
            .getFullTextEntityManager(entityManager);
    QueryBuilder qb = fullTextEntityManager.getSearchFactory().buildQueryBuilder().forEntity(Employee.class).get();
    org.apache.lucene.search.Query luceneQuery = qb.keyword().wildcard()
            .onFields("firstName", "middleName", "lastName").matching(searchText + "*").createQuery();
    javax.persistence.Query jpaQuery = fullTextEntityManager.createFullTextQuery(luceneQuery, Employee.class);
    List<Employee> result = jpaQuery.getResultList();
    // Wrap each matching entity in a DTO for the response.
    List<EmployeeSearchDTO> listOfDTO = new ArrayList<>();
    for (Employee employee : result) {
        listOfDTO.add(new EmployeeSearchDTO(employee));
    }
    return listOfDTO;
}
When I search for "john doe", I expect the results to match the two records below:
FirstName : John LastName : Doe
FirstName : johnathan LastName : Doe
But that is not the case: I can search only on FirstName ["john"] or LastName ["doe"], not on both together.
How do I solve this? Any pointers would be greatly appreciated. Thanks in advance.
You really want to create two queries, one against the first name and one against the last name, and then combine them via the SHOULD operator. Something like:
Query combinedQuery = querybuilder
    .bool()
    .should( firstNameQuery )
    .should( lastNameQuery )
    .createQuery();
This means you are looking for results where either of the queries matches.
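The SHOULD combination described above can be illustrated without Hibernate Search. The following is only a plain-Python sketch of the matching logic, not the Hibernate Search API: `prefix_query`, `should`, `search`, and the sample `EMPLOYEES` list are hypothetical names, and sending the first word to firstName and the last word to lastName is an assumption made for the example.

```python
from typing import Callable, Dict, List

Employee = Dict[str, str]
Query = Callable[[Employee], bool]


def prefix_query(field: str, prefix: str) -> Query:
    """Match when any lowercased word in `field` starts with `prefix`."""
    prefix = prefix.lower()

    def matches(employee: Employee) -> bool:
        words = employee.get(field, "").lower().split()
        return any(word.startswith(prefix) for word in words)

    return matches


def should(*queries: Query) -> Query:
    """Combine queries so a record matches when at least one of them does."""
    return lambda employee: any(query(employee) for query in queries)


def search(employees: List[Employee], search_text: str) -> List[Employee]:
    words = search_text.split()
    if not words:
        return []
    # Assumption: the first word targets firstName, the last word lastName.
    combined = should(
        prefix_query("firstName", words[0]),
        prefix_query("lastName", words[-1]),
    )
    return [employee for employee in employees if combined(employee)]


# Hypothetical sample data for illustration only.
EMPLOYEES = [
    {"firstName": "John", "lastName": "Doe"},
    {"firstName": "johnathan", "lastName": "Doe"},
    {"firstName": "Mary", "lastName": "Smith"},
]
```

Calling `search(EMPLOYEES, "john doe")` returns the John Doe and johnathan Doe records. Because the clauses are combined with SHOULD, a record matching only one of the two words would be returned as well; requiring both would mean replacing `any` with `all` inside `should`.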

SQL Compare Characters in two strings count total identical

I have two different systems, each holding customer records. Unfortunately, both let users type the business name free-hand, so I end up with mismatches like the example below.
Column A has a value of "St John Baptist Church"
Column B has a value of "John Baptist St Church"
What I need is a query that compares the two columns and finds the most closely matching values.
From there I plan to write a web app where someone can go through and validate all of the entries. I would include an example of what I have tried, but honestly I don't even know whether what I am asking for is possible. I would think it is, though; I am surely not the first one to attempt this.
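You could try a script along the lines of this PHP script to help you:
$words = array();       // id => words of every business name seen so far
$duplicates = array();  // id => ids of earlier rows made of the same words

// array_walk callback: $value is an earlier row's word list, $key is its id,
// and $current holds array(id, words) for the row being compared.
function _compare($value, $key, $current) {
    global $duplicates;
    list($id, $currentWords) = $current;
    // No difference in either direction means both names use the same words.
    if (!array_diff($currentWords, $value) && !array_diff($value, $currentWords)) {
        $duplicates[$id][] = $key;
    }
}

$mysqli = new mysqli('localhost', 'username', 'password', 'database');
$query = "SELECT id, business_name FROM table";
if ($result = $mysqli->query($query)) {
    while ($row = $result->fetch_object()) {
        // Strip punctuation, then split the name into words.
        $pattern = '#[^\w\s]+#i';
        $name = preg_replace($pattern, '', $row->business_name);
        $_words = explode(' ', $name);
        array_walk($words, '_compare', array($row->id, $_words));
        $words[$row->id] = $_words;
    }
    // Close the result set only after every row has been read.
    $result->close();
}
$mysqli->close();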
This is not tested but you need something like this, because I don't think this is possible with SQL alone.
---------- EDIT ----------
Or you could look into what the commenters recommend: Levenshtein distance in T-SQL.
Hope it helps, good luck!
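To make the two ideas in this answer concrete (comparing the words of each name, and Levenshtein distance), here is a small Python sketch. It is not the T-SQL implementation mentioned above; the function names are hypothetical, and lowercasing plus punctuation stripping are assumptions about how the names should be normalized.

```python
def normalize(name: str) -> list:
    """Lowercase the name, replace punctuation with spaces, split into words."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return cleaned.split()


def word_overlap(a: str, b: str) -> float:
    """Jaccard similarity of the two names' word sets (1.0 means same words)."""
    words_a, words_b = set(normalize(a)), set(normalize(b))
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


def levenshtein(a: str, b: str) -> int:
    """Dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep when equal)
            ))
        previous = current
    return previous[-1]


def sorted_word_distance(a: str, b: str) -> int:
    """Edit distance after sorting the words, so word order is ignored."""
    return levenshtein(" ".join(sorted(normalize(a))), " ".join(sorted(normalize(b))))
```

For the example in the question, `word_overlap("St John Baptist Church", "John Baptist St Church")` is 1.0 and `sorted_word_distance` of the same pair is 0, so the two spellings would be flagged as the same customer. Lower overlap or a larger distance could then be used to rank candidate pairs for manual validation.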

Lucene.net - query returning unwanted documents

All of my Lucene.net (2.9.2) documents have two fields:
categoryid
bodytext
bodytext is the default field, and is where all of the document's text is stored (using Field.Store.NO, Field.Index.ANALYZED, Field.TermVector.WITH_POSITIONS_OFFSETS).
categoryid is just a numeric field stored as text: Field.Store.YES, Field.Index.NOT_ANALYZED
When this query is executed, it only returns documents with that category ID: categoryid:1
However, when I perform this query: categoryid:1 foo bar it also returns documents from categories other than 1.
Why is this? And how can I force it to respect the original categoryid:N query term?
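By default, QueryParser joins terms with OR, so categoryid:1 foo bar matches any document that contains foo or bar, whatever its category. Do you want to require all of the entered words to be present in the matched documents? If so, set the default operator to AND: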
var analyzer = new StandardAnalyzer(Version.LUCENE_30);
var queryParser = new QueryParser(Version.LUCENE_30, "bodytext", analyzer);
// This ensures that all terms are required.
queryParser.DefaultOperator = QueryParser.Operator.AND;
var query = queryParser.Parse("categoryid:1 foo bar");
// query = "+categoryid:1 +bodytext:foo +bodytext:bar"
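The difference between the two default operators can be shown with a toy model. This Python sketch is not Lucene.net: `parse`, `matches`, `search`, and the sample `DOCS` are hypothetical, and the parser only understands whitespace-separated terms with optional `+` and `field:` prefixes. It follows the usual boolean-query rule that required clauses must all match, and that optional clauses only decide the match when there are no required ones.

```python
def parse(query, default_field, default_operator="OR"):
    """Toy parser returning a list of (occur, field, term) clauses."""
    clauses = []
    for token in query.split():
        required = default_operator == "AND"
        if token.startswith("+"):
            required, token = True, token[1:]
        field, sep, term = token.partition(":")
        if not sep:
            # No explicit field prefix: fall back to the default field.
            field, term = default_field, token
        clauses.append(("MUST" if required else "SHOULD", field, term))
    return clauses


def matches(doc, clauses):
    """All MUST clauses must hit; with no MUST clauses, any SHOULD hit is enough."""
    def hit(field, term):
        return term in doc.get(field, set())

    must = [(f, t) for occur, f, t in clauses if occur == "MUST"]
    should = [(f, t) for occur, f, t in clauses if occur == "SHOULD"]
    if must:
        return all(hit(f, t) for f, t in must)
    return any(hit(f, t) for f, t in should)


# Hypothetical documents: each field maps to the set of indexed terms.
DOCS = {
    "a": {"categoryid": {"1"}, "bodytext": {"foo", "bar"}},
    "b": {"categoryid": {"2"}, "bodytext": {"foo"}},
    "c": {"categoryid": {"1"}, "bodytext": {"baz"}},
}


def search(query, default_operator="OR"):
    clauses = parse(query, "bodytext", default_operator)
    return sorted(name for name, doc in DOCS.items() if matches(doc, clauses))
```

With the OR default, `search("categoryid:1 foo bar")` returns all three documents, including `b` from category 2. With `default_operator="AND"` only `a` matches. Writing the query as `+categoryid:1 foo bar` keeps the OR default but makes only the category mandatory, which returns `a` and `c`.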

Raven query returns 0 results for collection contains

I have a basic schema
Post {
    Labels: [
        { Text: "Mine" },
        { Text: "Incomplete" }
    ]
}
And I am querying raven, to ask for all posts with BOTH "Mine" and "Incomplete" labels.
queryable.Where(candidate => candidate.Labels.Any(label => label.Text == "Mine"))
.Where(candidate => candidate.Labels.Any(label => label.Text == "Incomplete"));
This results in a raven query (from Raven server console)
Query: (Labels,Text:Incomplete) AND (Labels,Text:Mine)
Time: 3 ms
Index: Temp/XWrlnFBeq8ENRd2SCCVqUQ==
Results: 0 returned out of 0 total.
Why is this? If I query for posts containing just "Incomplete", I get 1 result.
If I query for posts containing just "Mine", I get the same result. So why do I get 0 results when I query for both?
EDIT:
Ok - so I got a little further. The 'automatically generated index' looks like this
from doc in docs.FeedAnnouncements
from docLabelsItem in ((IEnumerable<dynamic>)doc.Labels).DefaultIfEmpty()
select new { CreationDate = doc.CreationDate, Labels_Text = docLabelsItem.Text }
So, I THINK the query was basically testing the SAME label for 2 different values. Bad.
I changed it to this:
from doc in docs.FeedAnnouncements
from docLabelsItem1 in ((IEnumerable<dynamic>)doc.Labels).DefaultIfEmpty()
from docLabelsItem2 in ((IEnumerable<dynamic>)doc.Labels).DefaultIfEmpty()
select new { CreationDate = doc.CreationDate, Labels1_Text = docLabelsItem1.Text, Labels2_Text = docLabelsItem2.Text }
Now my query (in Raven Studio) Labels1_Text:Mine AND Labels2_Text:Incomplete WORKS!
But, how do I address these phantom fields (Labels1_Text and Labels2_Text) when querying from Linq?
Adam,
You got the reason right. The default index would generate 2 index entries, and your query is executing on a single index entry.
What you want is to either use intersection, or create your own index like this:
from doc in docs.FeedAnnouncements
select new { Labels_Text = doc.Labels.Select(x=>x.Text)}
That would give you all of the labels' text in a single index entry, which you can then query.
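The effect of the two index shapes can be simulated in a few lines. This Python sketch is not RavenDB code: `entries_per_label`, `single_entry`, and `query_all` are hypothetical helpers, and it assumes that an AND query must be satisfied by one index entry, as the answer above describes.

```python
# The sample document from the question.
post = {"Labels": [{"Text": "Mine"}, {"Text": "Incomplete"}]}


def entries_per_label(doc):
    """One index entry per label, like the auto-generated index."""
    return [{"Labels_Text": {label["Text"]}} for label in doc["Labels"]]


def single_entry(doc):
    """One index entry holding every label text, like the suggested index."""
    return [{"Labels_Text": {label["Text"] for label in doc["Labels"]}}]


def query_all(entries, field, wanted):
    """True when a single entry contains all of the wanted values in `field`."""
    return any(set(wanted) <= entry[field] for entry in entries)
```

Here `query_all(entries_per_label(post), "Labels_Text", ["Mine", "Incomplete"])` is False, because no single entry holds both values, while the same query against `single_entry(post)` is True. Querying for just one label succeeds against either shape, which matches the behaviour reported in the question.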

City name query

I am a newbie to Lucene and am working on a city search API using Lucene.
If the user types san francisco as the search input, it should return only cities that match exactly, not San Jose, San Diego, etc.
How should I index city names in Lucene? And which Lucene analyzer and query class do I need to use?
Index your content with StandardAnalyzer. And then use PhraseQuery to search. For this, simply use the query string as "san francisco" with double quotes.
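What a phrase query checks can be sketched as follows. This is plain Python, not Lucene: `tokenize` is a crude stand-in for an analyzer (lowercase, split on whitespace), and `phrase_match` is a hypothetical helper that looks for the phrase's tokens at consecutive positions.

```python
def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for a real analyzer)."""
    return text.lower().split()


def phrase_match(document, phrase):
    """True when the phrase's tokens appear consecutively, in order, in the document."""
    doc_tokens = tokenize(document)
    phrase_tokens = tokenize(phrase)
    n = len(phrase_tokens)
    if n == 0:
        return False
    return any(
        doc_tokens[i:i + n] == phrase_tokens
        for i in range(len(doc_tokens) - n + 1)
    )
```

With this model, `phrase_match("San Francisco", "san francisco")` is True and `phrase_match("San Jose", "san francisco")` is False. Note that `phrase_match("South San Francisco", "san francisco")` is also True: a phrase match only requires the words to be adjacent somewhere in the field, so a strictly exact match on the whole city name may call for indexing the name as a single untokenized term instead.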
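<?php
// Assumes $con is an open mysqli connection.
if (isset($_POST['submit']) && $_POST['submit'] == 'submit') {
    $city = $_POST['city'];
    // Bind the city as a parameter instead of concatenating user input into the SQL.
    $stmt = mysqli_prepare($con, "SELECT * FROM libreary WHERE city LIKE ?");
    mysqli_stmt_bind_param($stmt, 's', $city);
    mysqli_stmt_execute($stmt);
    $rows = mysqli_stmt_get_result($stmt);
    // Collect the matching rows so they can be returned as JSON.
    $response = array();
    while ($row1 = mysqli_fetch_array($rows)) {
        $response[] = $row1;
    }
    $respon['Response'] = $response;
    print_r(json_encode($respon));
}
?>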