Lucene - Lexical error while parsing Proximity query

I wrote code for dynamic search on a database using Lucene.NET.
I started creating queries and finding the positions of the results, and it worked great!
But when I used proximity searches, I got an error:
Lexical error at line 1, column 72. Encountered: after : "\" "
My search function:
private static List<String> GeneralSearch(string txt, Table type)
{
    txt = "10~" + txt;
    string newQuery = "";
    foreach (var field in fields[type])
    {
        newQuery += field + ": " + txt + " OR ";
    }
    newQuery = newQuery.Substring(0, newQuery.Length - 4) + " ";
    parser.MultiTermRewriteMethod =
        MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE;
    BooleanQuery bq = new BooleanQuery();
    Query query = parser.Parse(newQuery);
    bq.Add(query, Occur.MUST);
    bq.Add(new TermQuery(new Term("tbl", type.ToString())), Occur.MUST);
    TopDocs hits = searcher.Search(bq, reader.MaxDoc);........
The "txt" variable contained a query like this:
txt= "I like to read"
The function creates a new query that searches all the fields of a specific table:
title: 10~"I like to read" OR content: 10~"I like to read"
I think my problem may be that the language alignment was right to left.
If you have an idea, it would help me!

I can't speak to the specific error; however, your query is malformed in two ways:
The slop (proximity) operator must trail the phrase, not lead it.
Literal phrase queries must be enclosed in double quotes.
It's wise to log the result of a query parse with Query.ToString(). Assuming StandardAnalyzer, your query is parsing to something like this:
(text:10~0.5 text:i text:like text:read) +tbl:somevalue
What you think is your slop is parsed as a fuzzy term query with the default minimum similarity of 0.5:
text:10~0.5
and what you thought was a phrase query is in reality parsing to multiple term queries because your phrase is not double quoted:
text:i text:like text:read
You want your raw query to look something like this:
text: "I like to read"~10
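As a rough sketch of how the corrected query string could be assembled for several fields, the helper below puts the phrase in double quotes and appends the slop after it. It is written in Java rather than C#, and the class and method names are hypothetical; only the field names title and content come from the question.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Lucene: it only builds the query string.
final class ProximityQueryBuilder {
    // Builds field:"phrase"~slop for every field and joins the clauses with OR.
    static String build(List<String> fields, String phrase, int slop) {
        // Escape backslashes and embedded double quotes so the phrase stays one quoted unit.
        String quoted = "\"" + phrase.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
        return fields.stream()
                .map(field -> field + ":" + quoted + "~" + slop)
                .collect(Collectors.joining(" OR "));
    }
}
```

Calling build with the fields title and content, the phrase I like to read, and a slop of 10 gives title:"I like to read"~10 OR content:"I like to read"~10, which can then be handed to the parser.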
Here's a nice guide regarding Lucene query syntax. Good luck!

Related

Cannot run simple query, getting "missing right parenthesis" error

What is wrong with this query?
select author_num from (henry_author where (contains(author_first,'Albert') > 0))
It keeps giving me an error saying that it is missing a right parenthesis.
SELECT author_num FROM henry_author WHERE author_first LIKE '%Albert%';
or, probably better to account for data inconsistencies:
SELECT author_num FROM henry_author WHERE UPPER(author_first) LIKE '%ALBERT%';
The % is a wildcard matching zero or more characters. So %ALBERT% means anything can be before or after 'ALBERT', which is effectively what your contains() function is doing.
UPPER is just a function that converts the string to upper case, which makes it easier to deal with potential data inconsistencies, i.e. someone typed in 'albert' instead of 'Albert', etc.
Since you're using JDBC, you might want to structure your query to use PreparedStatement which will allow you to parameterize your query like so:
final String sqlSelectAuthorNum = "SELECT author_num FROM henry_author WHERE UPPER(author_first) LIKE ?";
final PreparedStatement psSelectAuthorNum = conn.prepareStatement(sqlSelectAuthorNum);
// now execute your query someplace in your code.
// Upper-case the parameter so it matches UPPER(author_first).
psSelectAuthorNum.setString(1, "%" + authorName.toUpperCase() + "%");
final ResultSet rsAuthorNum = psSelectAuthorNum.executeQuery();
if (rsAuthorNum.isBeforeFirst()) {
    while (rsAuthorNum.next()) {
        int authorNumber = rsAuthorNum.getInt(1);
        // etc...
    }
}
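Because the query compares UPPER(author_first), the bound parameter has to be upper-cased as well, or a mixed-case name such as 'Albert' will not match. A minimal sketch of a helper that builds the pattern follows; the class and method names are made up for illustration, and it does not escape % or _ inside the input.

```java
import java.util.Locale;

// Hypothetical helper for building the LIKE parameter.
final class LikePatterns {
    // Wraps the text in % wildcards and upper-cases it to match UPPER(column).
    static String containsIgnoreCase(String text) {
        return "%" + text.toUpperCase(Locale.ROOT) + "%";
    }
}
```

It would be used as psSelectAuthorNum.setString(1, LikePatterns.containsIgnoreCase(authorName)).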

Lucene Search 2 fields

I am trying to find the best matching product (bounty paper towel) from a certain retailer. My query is the following, but it returns 0 hits.
BooleanQuery.Builder combine = new BooleanQuery.Builder();
Query q1 = new QueryParser("product", new StandardAnalyzer()).parse(QueryParser.escape("product:" + "bounty paper towel"));
combine.add(q1, BooleanClause.Occur.SHOULD); // find best name match
Query q2 = new QueryParser("retailer", new StandardAnalyzer()).parse(QueryParser.escape("retailer:" + "Target"));
combine.add(q2, BooleanClause.Occur.MUST); // Must from this retailer
searcher.search(combine.build(), hitsPerPage).scoreDocs;
Is there anything wrong with the way I build the query?
You are escaping things you don't want to escape. You pass the string "product:bounty paper towel" to the escape method, which will escape the colon, which you don't want to escape. In effect, that query, after escaping and analysis, will look like this:
product:product\:bounty product:paper product:towels
You should escape the search terms, not the entire query. Something like:
parser.parse("product:" + QueryParser.escape("bounty paper towels"));
Also, it looks like you are looking for a phrase query there, in which case it should be surrounded by quotes:
parser.parse("product:\"" + QueryParser.escape("bounty paper towels") + "\"");
The way you're building your boolean query looks fine. You could leverage the query parser syntax to accomplish the same thing, if you prefer, like this:
parser.parse(
    "product:\"" + QueryParser.escape("bounty paper towels") + "\""
    + " +retailer:" + QueryParser.escape("Target")
);
But again, there is nothing wrong with using BooleanQuery.Builder instead.
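To make the difference concrete, here is a small self-contained sketch. The escapeTerm method is a simplified stand-in for QueryParser.escape, not Lucene's implementation, and the class and method names are hypothetical. It shows that escaping the whole string also escapes the colon after the field name, while escaping only the search text leaves the field prefix intact.

```java
// Hypothetical helper; escapeTerm only approximates QueryParser.escape.
final class QueryStrings {
    // Characters with special meaning in the classic Lucene query syntax.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    // Prefixes each special character with a backslash.
    static String escapeTerm(String term) {
        StringBuilder sb = new StringBuilder();
        for (char c : term.toCharArray()) {
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Builds: product:"<escaped phrase>" +retailer:<escaped term>
    // Assumes the retailer is a single term; a multi-word retailer would need quotes too.
    static String productAtRetailer(String product, String retailer) {
        return "product:\"" + escapeTerm(product) + "\" +retailer:" + escapeTerm(retailer);
    }
}
```

Here escapeTerm("product:bounty") yields product\:bounty, which is the unwanted result, whereas productAtRetailer("bounty paper towels", "Target") yields product:"bounty paper towels" +retailer:Target.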
I used Lucene too many years ago, but let me try...
Rewrite your parse part as follows:
...
Query q1 = new QueryParser("product", new StandardAnalyzer())
    .parse("bounty paper towel");
...
Query q2 = new QueryParser("retailer", new StandardAnalyzer())
    .parse("Target");
...
So your query should contain only the search text, not the field name, since the field is already passed to the QueryParser constructor.

how to get search hits when at least one character present in field value using lucene search

How do I get search hits when at least one character searched is present in a field's value, using lucene search?
I get search hits only when I search with a complete word.
Example:
Hello world
In the above example, if I enter "Hello", I get a hit, but not if I enter "Hel".
Here is my code to get hits:
QueryParser parser = null;
Query query = null;
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT, new HashSet());
BooleanQuery.setMaxClauseCount(32767);
parser = new QueryParser("fieldname", analyzer);
parser.setAllowLeadingWildcard(true);
query = parser.parse("searchString");
TopDocs topResultDocs = searcher.search(query, null, 20);
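Append * to the search term to run a prefix query: Hel* will match Hello. Since your code already calls setAllowLeadingWildcard(true), *el* would also match inside a term, although leading wildcards can be slow on large indexes.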
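One way to apply this to user input is to add the wildcard to each word before handing the string to the parser. The following is only a sketch with hypothetical names; it assumes words are separated by whitespace and does not escape other query syntax characters.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical helper that turns plain input into a wildcard query string.
final class PrefixQueries {
    // Appends * to every whitespace-separated token that does not already end with one.
    static String toPrefixQuery(String input) {
        return Arrays.stream(input.trim().split("\\s+"))
                .filter(token -> !token.isEmpty())
                .map(token -> token.endsWith("*") ? token : token + "*")
                .collect(Collectors.joining(" "));
    }
}
```

With this, the input Hel becomes Hel*, and the result can be passed to parser.parse in place of the raw search string.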

how to search word from String field in Lucene Index

How do I search for a word in a Lucene index String field?
I have a Lucene index with a field TITLE, which contains document titles,
e.g. TV not working, Mobile not working.
I want to search for a particular word in the title.
The code below gives me results from the full content; if I change FULL_CONTENT to TITLE, I don't get any results.
Query qry = null;
qry = new QueryParser(FULL_CONTENT, new SimpleAnalyzer()).parse("not");
Searcher searcher = null;
searcher = new IndexSearcher(indexDirectory);
Hits hits = null;
hits = searcher.search(qry);
System.out.println(hits.length());
As "NOT" is a Lucene query syntax operator, that may be your problem.
The problem is that SimpleAnalyzer lowercases its tokens, so your query will be in lower case:
e.g. title:mobile.
StringField doesn't apply any analysis, so your text is indexed as is. If you change StringField to TextField, it will be analyzed by SimpleAnalyzer and converted to lower case in the index.
If you replace SimpleAnalyzer with WhitespaceAnalyzer, there is no lowercase filter, and it will work again (because your query doesn't get converted to lower case).

searching lucene index on multiple fields

I have an index with 2 content fields (analyzed, indexed & stored):
for example: name , hobbies. (The hobbies field can be added multiple times with different values).
I have another field that is only indexed (un_analyzed & not stored) used for filtering:
for example: country_code
Now, I want to build a query that will retrieve documents that match (as best as possible) to some "search" input field but only such documents where country_code has some exact value.
What would be the most suitable combination of query syntax and query parser to build such a query?
You can use the following query:
+country_code:india +(name:search_value OR hobbies:search_value)
Why don't you start with QueryParser? It might work for your use case, and it requires the least amount of effort.
It's not clear from your question, but let's assume you have a single input field ('search') and a combobox for the country code. You would then read those values and create a query:
// You don't have to use two parsers; you can do this using one.
QueryParser nameParser = new QueryParser(Version.LUCENE_CURRENT, "name", your_analyzer);
QueryParser hobbiesParser = new QueryParser(Version.LUCENE_CURRENT, "hobbies", your_analyzer);
BooleanQuery q = new BooleanQuery();
q.add(nameParser.parse(query), BooleanClause.Occur.SHOULD);
q.add(hobbiesParser.parse(query), BooleanClause.Occur.SHOULD);
/* Filtering by country code can be done using a BooleanQuery
 * or a filter, the difference will be how Lucene scores matches.
 * For example, using a filter:
 */
// countryCode is assumed to hold the value read from the combobox.
Filter countryCodeFilter = new QueryWrapperFilter(new TermQuery(new Term("country_code", countryCode)));
// And finally searching:
TopDocs topDocs = searcher.search(q, countryCodeFilter, 10);