Hibernate Search with Lucene Query for first and last name together

I have attempted this with phrase, wildcard and keyword queries but nothing really works perfectly.
...
@Field(name = "firstLastName", index = org.hibernate.search.annotations.Index.YES, analyze = Analyze.NO, store = Store.NO)
public String getFirstLastName() {
    return this.firstLastName;
}
...
Now I want to query this field and return the correct results if a user types John Smith, Smith John or Smith Jo* or John Smi*....
junction = junction.should(qb.keyword().wildcard().onField("firstLastName")
.matching("John Smith*").createQuery());
If I search for just Smith or John given a keyword query, I get a hit. I am not analyzing the field as I didn't think I needed to but I tried it both ways with no success...

Several issues here:
You need to use an analyzer, if only to split the strings on whitespace. With Analyze.NO the whole value ("John Smith") is indexed as a single term. Define an analyzer and assign it to your field.
You can't use wildcard queries if you want the search strings to be analyzed: wildcard queries are not analyzed. Use an EdgeNGramFilter at index time instead.
This answer to a very similar question will probably help: Hibernate Search: How to use wildcards correctly?
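
To make the idea behind the two points above concrete, here is a minimal plain-Java sketch of what "split on whitespace, lowercase, then index edge n-grams" buys you. It does not use Hibernate Search or Lucene at all; NameIndexSketch, indexTerms and matches are hypothetical names invented for this illustration, and a real setup would configure a tokenizer, a lowercase filter and an edge n-gram filter in an analyzer definition instead.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Illustration only: imitates "whitespace tokenizer + lowercase + edge n-grams"
// in plain Java. This is NOT the Hibernate Search or Lucene API.
final class NameIndexSketch {

    // Index side: split on whitespace, lowercase, and emit every leading
    // prefix of each token ("john" -> "j", "jo", "joh", "john").
    static Set<String> indexTerms(String fullName) {
        Set<String> terms = new HashSet<>();
        for (String token : fullName.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            for (int end = 1; end <= token.length(); end++) {
                terms.add(token.substring(0, end));
            }
        }
        return terms;
    }

    // Query side: split and lowercase only (no n-grams, no wildcards).
    // Every query token must be one of the indexed terms.
    static boolean matches(String fullName, String userInput) {
        Set<String> terms = indexTerms(fullName);
        for (String token : userInput.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!terms.contains(token)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(matches("John Smith", "Smith Jo"));   // true
        System.out.println(matches("John Smith", "John Smi"));   // true
        System.out.println(matches("John Smith", "Jane Smith")); // false
    }
}
```

Because the prefixes are already stored as terms, the user input can be treated as ordinary keywords in any order, which is why no wildcard is needed at query time. The query side deliberately skips the n-gram step; otherwise "Smith Jo" would also be expanded and could match far more than intended.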

Related

How can I search lucene for "John J" and get people like "John Jameson" not just people with John?

For reasons out of my control, I must do this with a global search. I've taken to converting a search term "John J" into (John AND J), which works for anyone whose last name doesn't start with the same letter as their first.
How can I make the search for "John J" become "find all people who have John and then another, different J in the field"?
Thanks for your time.

You may want to try out Wildcard Query. For example:
Term term = new Term("secondName", "J*");
Query query = new WildcardQuery(term);
I am assuming you have different fields for first and second name. You can create a boolean query that combines the queries for the first and second names.
Documentation for WildcardQuery: http://lucene.apache.org/core/6_2_0/core/org/apache/lucene/search/WildcardQuery.html
I hope this helps.

Since you mentioned it is a type-ahead input, a PrefixQuery might help:
new PrefixQuery(new Term("lastName","J"));
This will return all documents with lastName starting with "J".
To get results where firstName starts with "John" and lastName starts with "J", you can have -
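// Both clauses are required, so a document must match both prefixes.
BooleanQuery.Builder booleanQueryBuilder = new BooleanQuery.Builder();
booleanQueryBuilder.add(new PrefixQuery(new Term("firstName", "John")), BooleanClause.Occur.MUST);
booleanQueryBuilder.add(new PrefixQuery(new Term("lastName", "J")), BooleanClause.Occur.MUST);
Query query = booleanQueryBuilder.build();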

lucene wildcard query with space

I have a Lucene index which has city names.
Say I want to search for 'New Delhi'. I have the string 'New Del', which I want to pass to the Lucene searcher, and I expect 'New Delhi' as output.
If I generate a query like Name:New Del*, it gives me all cities with 'New' and 'Del' in them.
Is there any way to create a Lucene wildcard query with spaces in it?
I referred to and tried a few of the solutions given at http://www.gossamer-threads.com/lists/lucene/java-user/5487

It sounds like you have indexed your city names with analysis. That will tend to make this more difficult. With analysis, "new" and "delhi" are separate terms, and must be treated as such. Searching over multiple terms with wildcards like this tends to be a bit more difficult.
The easiest solution would be to index your city names without tokenization (lowercasing might not be a bad idea though). Then you would be able to search with the query parser simply by escaping the space:
QueryParser parser = new QueryParser("defaultField", analyzer);
Query query = parser.parse("cityname:new\\ del*");
Or you could use a simple WildcardQuery:
Query query = new WildcardQuery(new Term("cityname", "new del*"));
With the field analyzed by the standard analyzer, you will need to rely on span queries, something like this:
SpanQuery queryPart1 = new SpanTermQuery(new Term("cityname", "new"));
SpanQuery queryPart2 = new SpanMultiTermQueryWrapper<>(new WildcardQuery(new Term("cityname", "del*")));
// slop 0, in order: "new" must be immediately followed by a term matching "del*"
Query query = new SpanNearQuery(new SpanQuery[] {queryPart1, queryPart2}, 0, true);
Or, you can use the surround query parser (which provides query syntax intended to provide more robust support of span queries), using a query like W(new, del*):
org.apache.lucene.queryparser.surround.parser.QueryParser surroundparser = new org.apache.lucene.queryparser.surround.parser.QueryParser();
SrndQuery srndquery = surroundparser.parse("W(new, del*)");
query = srndquery.makeLuceneQueryField("cityname", new BasicQueryFactory());

As I learnt from the thread you mentioned (http://www.gossamer-threads.com/lists/lucene/java-user/5487), you can either do an exact match including the space or treat each part with a wildcard.
So something like this should work: [New* Del*]

SQL wildcards via Ruby

I am trying to use a wildcard or regular expression to give some leeway with user input in retrieving information from a database in a simple library catalog program, written in Ruby.
The code in question (which currently works if there is an exact match):
puts "Enter the title of the book"
title = gets.chomp
book = $db.execute("SELECT * FROM books WHERE title LIKE ?", title).first
puts %Q{Title:#{book['title']}
Author:#{book['auth_first']} #{book['auth_last']}
Country:#{book['country']}}
I am using SQLite 3. In the SQLite terminal I can enter:
SELECT * FROM books WHERE title LIKE 'Moby%'
or
SELECT * FROM books WHERE title LIKE "Moby%"
and get (assuming there's a proper entry):
Title: Moby-Dick
Author: Herman Melville
Country: USA
I can't figure out any corresponding way of doing this in my Ruby program.
Is it not possible to use the SQL % wildcard character in this context? If so, do I need to use a Ruby regular expression here? What is a good way of handling this?
(Even putting the ? in single quotes ('?') will cause it to no longer work in the program.)
Any help is greatly appreciated.
(Note: I am essentially just trying to modify the sample code from chapter 9 of Beginning Ruby (Peter Cooper).)

The pattern you give to SQL's LIKE is just a string with optional pattern characters. That means that you can build the pattern in Ruby:
$db.execute("SELECT * FROM books WHERE title LIKE ?", "%#{title}%")
or do the string work in SQL:
$db.execute("SELECT * FROM books WHERE title LIKE '%' || ? || '%'", title)
Note that the case sensitivity of LIKE is database dependent. SQLite's LIKE is case insensitive by default for ASCII characters, so you don't have to worry about that until you switch databases. Different databases deal with this differently: some have a case-insensitive LIKE, some have a separate case-insensitive ILIKE, and some make you normalize the case yourself.

SQL query to bring all results regardless of punctuation with JSF

So I have a database with articles in them and the user should be able to search for a keyword they input and the search should find any articles with that word in it.
So for example, if someone were to search for the word Alzheimer's, I would want it to return articles with the word spelled either way, regardless of the apostrophe, so:
Alzheimer's
Alzheimers
should both be returned. At the moment it searches for the word exactly as it is spelled and won't bring back results if the stored text has punctuation.
So what I have at the minute for the query is:
private static final String QUERY_FIND_BY_SEARCH_TEXT = "SELECT o FROM EmailArticle o where UPPER(o.headline) LIKE :headline OR UPPER(o.implication) LIKE :implication OR UPPER(o.summary) LIKE :summary";
And the user's input is called 'searchText' which comes from the input box.
public static List<EmailArticle> findAllEmailArticlesByHeadlineOrSummaryOrImplication(String searchText) {
    // createQuery(String, Class) returns a TypedQuery, which avoids an unchecked cast on the result list
    TypedQuery<EmailArticle> query = entityManager().createQuery(QUERY_FIND_BY_SEARCH_TEXT, EmailArticle.class);
    String searchTextUpperCase = "%" + searchText.toUpperCase() + "%";
    query.setParameter("headline", searchTextUpperCase);
    query.setParameter("implication", searchTextUpperCase);
    query.setParameter("summary", searchTextUpperCase);
    List<EmailArticle> emailArticles = query.getResultList();
    return emailArticles;
}
So I would like to bring back all results for Alzheimer's regardless of whether there is an apostrophe or not. I think I have given enough information, but if you need more just say. I'm not really sure where to go with it or how to do it: is it possible to just replace or remove all punctuation, or just apostrophes, from a user search?

In my view, you should change your query.
You should alter your table and add a FULLTEXT index to your columns (headline, implication, summary).
You should also use MySQL's MATCH ... AGAINST rather than a LIKE query and, most importantly, read about the SOUNDEX() function, which matches words that sound alike.
All I can give you is a native query example:
SELECT o.* FROM email_article o
WHERE MATCH(o.headline, o.implication, o.summary) AGAINST('your-text')
   OR SOUNDEX(o.headline) LIKE SOUNDEX('your-text')
   OR SOUNDEX(o.implication) LIKE SOUNDEX('your-text')
   OR SOUNDEX(o.summary) LIKE SOUNDEX('your-text');
It won't give you results like a Google search, but it works to some extent. Let me know what you think.
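
On the narrower question of stripping apostrophes from the user's input, a minimal sketch could look like the following. SearchTextNormalizer and toLikePattern are hypothetical names, not part of the original code, and the sketch only covers the search-text side: the stored columns would need the same treatment (for example with the database's own string-replace function), otherwise a search for Alzheimers would still miss rows that store Alzheimer's.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of the original code base.
final class SearchTextNormalizer {

    // Removes straight (') and typographic (U+2019) apostrophes, upper-cases
    // the text, and wraps it in % so it can be bound to a LIKE parameter.
    static String toLikePattern(String searchText) {
        String stripped = searchText.replace("'", "").replace("\u2019", "");
        return "%" + stripped.toUpperCase(Locale.ROOT) + "%";
    }

    public static void main(String[] args) {
        System.out.println(toLikePattern("Alzheimer's")); // %ALZHEIMERS%
        System.out.println(toLikePattern("Alzheimers"));  // %ALZHEIMERS%
    }
}
```

The result would be passed to query.setParameter(...) in place of searchTextUpperCase. Only apostrophes are removed here; removing all punctuation is a broader choice that would also change how hyphenated words match.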

Replace space with dash before save using Rails 3

I am trying to save a name to the database. A single word (firstname) works fine, but when the user enters both firstname and lastname I want Rails to save it to the database as firstname-lastname instead of firstname lastname (with a space in between).
I know I should perhaps use a before_create filter, but I am not sure what this needs to look like. I want the validation to work too, i.e. no two people should be able to use the same name.
I am using Rails 3.

You can use ActiveSupport's inflector method parameterize on the string.
name = 'john smith' # => john smith
name.parameterize # => john-smith
Further, parameterize takes an option to use for the word-break, so you can replace the dash with an underscore like this:
name.parameterize("_") # => john_smith
An advantage of using parameterize is that it transliterates accented characters to their plain Latin equivalents, so...
name = "jöhanne såltveç"
name.parameterize # => johanne-saltvec
EDIT: As of Rails 5.0.0.1 the separator needs to be passed as an option. Therefore: name.parameterize(separator: '_')

Why don't you just have first_name and last_name columns in the db, and create your own validation rule to make sure the combination is unique (http://guides.rubyonrails.org/active_record_validations_callbacks.html#creating-custom-validation-methods)? You should also create a unique index over those two columns in your db.

Another option would be to use a regexp and replace all existing whitespace with a dash. You'd put something along the lines of:
self.firstname = firstname.gsub(/\s+/, '-')
in your model. The assignment is needed because gsub returns a new string rather than changing the attribute in place.
Note: in Ruby, \s matches any whitespace character, and the + makes sure that if someone enters a name with two or more adjacent spaces, the whole run is converted into a single dash rather than one dash per space.

The other answer is correct, but if you want to preserve case while parameterizing:
name = "Donald Duck"
name.parameterize(preserve_case: true) # => Donald-Duck
