Why does query_string return no results even when a keyword-analyzed field is included in the query? - lucene

I have a question about how a query is matched against documents.
When I search with the exact query "what is 1234?" against the keyword-analyzed field, I get no results.
But if I search for "what" against the snowball/standard-analyzed field, I get some results. I also tried escaping the spaces in the query, like "what\ is\ 1234?", and that also returned some results.
Which analyzer does query_string use by default? Does it run the user's query through an analyzer, or does it use the text exactly as the user entered it?
Please find my gist here: https://gist.github.com/kirubar/6369034

The reason the query string "what is 1234?" fails to find results isn't the Analyzer, it's the QueryParser.
query_string uses Lucene query syntax. The query parser will interpret that query as three separate queries. That is to say
"query" : "what is 1234?"
is the equivalent of:
"query" : "what OR is OR 1234?"
If you want to perform a phrase query, it needs to be enclosed in quotes, something like the following (I believe you will also need to set the analyzer to a KeywordAnalyzer so that the phrase isn't tokenized, which would again prevent matching):
"analyzer" : "keyword",
"query" : "\"what is 1234?\""
Or, better yet, don't even use a query_string query. Instead, use a term query, particularly when querying on a keyword field, like:
"term" : { "message_keyword" : "what is 1234?" }

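The mismatch described in the answer above can be illustrated with a small sketch. This is plain Java, not the Lucene or Elasticsearch API; the class and method names (KeywordFieldSketch, keywordTerms, parseClauses, matchesAny) are hypothetical, and the whitespace split is a simplifying assumption that ignores escaping and the fact that ? is also a wildcard character in the query syntax.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Conceptual sketch in plain Java; this is not the Lucene or Elasticsearch API.
final class KeywordFieldSketch {

    // A keyword-analyzed field keeps the whole input as one indexed term.
    static List<String> keywordTerms(String text) {
        return Collections.singletonList(text);
    }

    // Assumed simplification: unquoted query text is split on whitespace
    // into separate clauses (wildcards and escaping are ignored here).
    static List<String> parseClauses(String query) {
        return Arrays.asList(query.trim().split("\\s+"));
    }

    // OR semantics: the document matches if any clause equals an indexed term.
    static boolean matchesAny(List<String> indexedTerms, List<String> clauses) {
        for (String clause : clauses) {
            if (indexedTerms.contains(clause)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> indexed = keywordTerms("what is 1234?");
        // Three clauses ("what", "is", "1234?"), none equal to the single stored term.
        System.out.println(matchesAny(indexed, parseClauses("what is 1234?")));
        // One clause holding the whole string, as a term query would send it.
        System.out.println(matchesAny(indexed, Collections.singletonList("what is 1234?")));
    }
}
```

The second call mirrors the term query suggested above: the whole string is compared against the single stored term, so no parsing step can split it.
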
Related

In a Rails WHERE LIKE query, what does the percent sign mean?

In a simple search like this:
find.where('name LIKE ?', "%#{search}%")
I understand that #{search} is just string interpolation. What do the % symbols do?
The percent sign % is a wildcard in SQL that matches zero or more characters. Thus, if search is "hello", it would match strings in the database such as "hello", "hello world", "well hello world", etc.
Note that this is a part of SQL and is not specific to Rails/ActiveRecord. The queries it can be used with, and the precise behavior of LIKE, differ based on SQL dialect (MySQL, PostgreSQL, etc.).
search = 'something'
find.where('name LIKE ?', "%#{search}%")
In your DB it will be interpreted as
SELECT <fields> FROM finds WHERE name LIKE '%something%';
The percent sign in a like query is a wildcard. So, your query is saying "anything, followed by whatever is in the search variable, followed by anything".
Note that this use of the percent sign is part of the SQL standard and not specific to Rails or ActiveRecord. Also be aware that this kind of search does not scale well: your SQL database will be forced to scan through every row in the table looking for matches rather than being able to rely on an index.
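To make the "anything, then the search term, then anything" reading concrete, here is a minimal sketch that translates a LIKE pattern into a regular expression. It is plain Java rather than Ruby or SQL, the names LikeSketch, likeToRegex and like are hypothetical, and it assumes case-sensitive matching with no ESCAPE clause; real databases differ on both points depending on dialect and collation.

```java
import java.util.regex.Pattern;

// Conceptual sketch of LIKE wildcard semantics; not how any database implements LIKE.
final class LikeSketch {

    // '%' matches zero or more characters, '_' matches exactly one character,
    // and every other character is matched literally.
    static Pattern likeToRegex(String likePattern) {
        StringBuilder regex = new StringBuilder();
        for (char c : likePattern.toCharArray()) {
            if (c == '%') {
                regex.append(".*");
            } else if (c == '_') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        // DOTALL so that wildcards also span line breaks in the value.
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    // The whole value must match the pattern, as with SQL LIKE.
    static boolean like(String value, String likePattern) {
        return likeToRegex(likePattern).matcher(value).matches();
    }

    public static void main(String[] args) {
        String search = "hello";
        String pattern = "%" + search + "%";
        for (String name : new String[] {"hello", "hello world", "well hello world", "help"}) {
            System.out.println(name + " -> " + like(name, pattern));
        }
    }
}
```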

How to create a lucene query

I am writing some code that needs to match all the words in the search string, e.g. "Apple is red". I am currently using MultiFieldQueryParser, but the resulting query is (title:"apple ? red" body:"apple ? red"). I want the query to look exactly like the input string: "apple ? red" should become "apple is red". How do I do that?
Your query looks correct. The question mark in the output indicates a position increment, it doesn't indicate an actual term in the query.
The word "is" is removed from the query and the index by StandardAnalyzer, since it is a stop word in the default stop word set. StopFilter removes those terms, but increments the position to indicate where the term was removed to enable closer matching with phrase queries.
Unless you see an issue with the results of the query, there appears to be nothing wrong with it.
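The position-increment behaviour described above can be sketched in a few lines. This is plain Java, not Lucene's StopFilter; the class name StopPositionsSketch, its methods, and the three-word stop set are all illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Conceptual sketch of stop-word removal that preserves token positions.
final class StopPositionsSketch {

    // Small illustrative stop set (an assumption, not Lucene's default list).
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("is", "a", "the"));

    // Returns "term@position" for each token that survives stop-word removal.
    // The position counter still advances for removed tokens, leaving a gap.
    static List<String> analyze(String text) {
        List<String> kept = new ArrayList<>();
        int position = 0;
        for (String token : text.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!STOP_WORDS.contains(token)) {
                kept.add(token + "@" + position);
            }
            position++;
        }
        return kept;
    }

    // Renders the phrase with "?" standing in for each removed position.
    static String describePhrase(String text) {
        StringBuilder phrase = new StringBuilder();
        for (String token : text.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (phrase.length() > 0) {
                phrase.append(' ');
            }
            phrase.append(STOP_WORDS.contains(token) ? "?" : token);
        }
        return phrase.toString();
    }

    public static void main(String[] args) {
        System.out.println(analyze("Apple is red"));
        System.out.println(describePhrase("Apple is red"));
    }
}
```

Because "red" keeps position 2 rather than moving to position 1, a phrase query still requires exactly one token between "apple" and "red".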

Operator LIKE in LUCENE

I'm working with Lucene 4.6 and I'm trying to find records that contain "keyword1" in "field1" and "keyword2" in "field2".
I wrote following query:
Query q = MultiFieldQueryParser.parse(
    Version.LUCENE_46,
    new String[] {keyword1, keyword2},
    new String[] {"field1", "field2"},
    new StandardAnalyzer(Version.LUCENE_46)
);
That gives me some results but I want to have something like %keyword1% , %keyword2% in SQL.
Thanks for your answers. Suppose I have a field with the value "Lucene Game Lucene" and I'm looking for that document using the keyword "Game". I can't get that result using either keyword* or *keyword. Does anyone have any idea about this?
You can use WildcardQuery. Supported wildcards are *, which matches any character sequence (including the empty one), and ?, which matches any single character. \ is the escape character.
You can also put the wildcard at the start of the term, for example *nix, but that can be very slow on large indexes, because Lucene needs to scan the entire list of terms.
[edit]
If you need a leading wildcard in the query parser, make sure to call setAllowLeadingWildcard(true) on the QueryParser.
WildcardQuery in Lucene lets you search for keyword%. For the other way around, there is some work to be done during indexing: you need to index the terms in reversed form (in another field) and perform the query drowyek%.
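The reversed-term idea can be sketched without Lucene. In this minimal plain-Java example a sorted set stands in for the term dictionary of the extra reversed field; the class ReversedTermSketch and its methods are hypothetical names, not Lucene API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Conceptual sketch: answer "*suffix" queries with a prefix scan over reversed terms.
final class ReversedTermSketch {

    // Sorted set standing in for the term dictionary of the reversed field.
    private final TreeSet<String> reversedTerms = new TreeSet<>();

    static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    // At index time, store each term in reversed form.
    void add(String term) {
        reversedTerms.add(reverse(term));
    }

    // At query time, reverse the suffix and walk only the terms sharing that prefix.
    List<String> endingWith(String suffix) {
        String prefix = reverse(suffix);
        List<String> hits = new ArrayList<>();
        for (String candidate : reversedTerms.tailSet(prefix, true)) {
            if (!candidate.startsWith(prefix)) {
                break; // sorted order: no later term can share the prefix
            }
            hits.add(reverse(candidate));
        }
        return hits;
    }

    public static void main(String[] args) {
        ReversedTermSketch index = new ReversedTermSketch();
        index.add("unix");
        index.add("linux");
        index.add("nixos");
        System.out.println(index.endingWith("nux"));
    }
}
```

The scan stops at the first term that no longer shares the prefix, which is why this avoids walking the entire term list the way a leading wildcard does.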

Is it possible to write a lucene search query with multiple AND / OR terms on Lucene Fields?

According to this page:
http://lucene.apache.org/core/old_versioned_docs/versions/2_9_1/queryparsersyntax.html
you can do searches like
title:"The Right Way" AND text:go
But when I do something like:
title:"The Right Way" OR title:"Home"
I get no results, even though I know there are pages with the title "Home".
How do I build a Lucene query with multiple ORs/ANDs on the same field?
When debugging queries, I always use Luke. Luke lets you see exactly how Lucene interprets your query (as the ANDs and ORs are turned into SHOULDs and MUSTs).
If you print the Query returned by queryParser.parse("title:\"The Right Way\" OR title:\"Home\"") with a StandardAnalyzer, the result is [title:"the right way" title:home]

Setting wildcard queries as default for QueryParser

When my users enter a term like "word", I would like it to be treated as a wildcard query "word*" so that all terms beginning with "word" are found. Is there a way to tell the QueryParser to automatically create wildcard queries, or do I have to parse the query myself? This shouldn't be a problem for simple queries, but it may become tricky for more complex queries.
Unless I am missing something, a wildcard query for every query is usually inadvisable: it is very expensive and could cause a lot of problems. If you are trying to find results including variants of a stem (e.g. win -> winner, winning, etc.), you should consider an n-gram approach.
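As a rough illustration of the n-gram suggestion, the sketch below uses edge n-grams (prefixes only), which is one variant suited to "starts with" matching; the answer does not say which n-gram scheme it has in mind. It is plain Java, not a Lucene tokenizer, and the class EdgeNGramSketch, its methods, and the length limits are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Conceptual sketch: index prefixes at write time so a typed prefix is an exact lookup.
final class EdgeNGramSketch {

    // Maps each indexed prefix to the full terms it came from.
    private final Map<String, Set<String>> gramToTerms = new HashMap<>();

    // Prefixes of the term with lengths from minLen up to maxLen (capped at term length).
    static List<String> edgeNGrams(String term, int minLen, int maxLen) {
        List<String> grams = new ArrayList<>();
        int upper = Math.min(maxLen, term.length());
        for (int len = minLen; len <= upper; len++) {
            grams.add(term.substring(0, len));
        }
        return grams;
    }

    // Index time: register the term under each of its edge n-grams.
    void index(String term, int minLen, int maxLen) {
        for (String gram : edgeNGrams(term, minLen, maxLen)) {
            gramToTerms.computeIfAbsent(gram, k -> new TreeSet<>()).add(term);
        }
    }

    // Query time: a plain map lookup, with no wildcard expansion.
    Set<String> lookup(String typed) {
        return gramToTerms.getOrDefault(typed, Collections.emptySet());
    }

    public static void main(String[] args) {
        EdgeNGramSketch sketch = new EdgeNGramSketch();
        for (String term : new String[] {"winner", "winning", "window"}) {
            sketch.index(term, 2, 5);
        }
        System.out.println(sketch.lookup("win"));
    }
}
```

The trade-off is a larger index in exchange for cheap queries: the expansion cost is paid once when indexing rather than on every search.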