Performance of an SQL query for finding users in database - sql

I am developing an application and I am using Spring Data JPA for database manipulation.
I am implementing a feature for finding friends with a searchbar. The way it is supposed to work is that when you enter some characters, a search is performed in the database to find and display 10 users that match the characters (the first name starts with it, the last name starts with it and it also should work if someone enters the full name).
I have written the code and it works the following way:
@Query(nativeQuery = true,
value="SELECT * FROM users u WHERE (" +
"UPPER(first_name) like CONCAT(UPPER(:query),'%') OR " +
"UPPER(last_name) like CONCAT(UPPER(:query),'%') OR " +
"UPPER(CONCAT(first_name,' ', last_name)) like CONCAT(UPPER(:query),'%')" +
") LIMIT 10")
List<User> findByQueryLikeName(@Param("query") String query);
However, when I look at this query, it seems like it might not be the best performance-wise. I do not know much about SQL performance, but I suspect it is slow because it combines three WHERE conditions with OR and also calls functions like UPPER and CONCAT a lot. I am using PostgreSQL.
Can you please assess if this kind of query is going to perform well on a big number of records? Can you try and explain why / why not? Do you have any tips on how to improve it?
Thanks a lot!

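I would rewrite the query like this:
SELECT * FROM users u
WHERE (' ' || u.first_name || ' ' || u.last_name) ILIKE '% ' || :query || '%'
LIMIT 10;
The leading space in the expression and the '% ' in the pattern make the search match at the start of either name as well as at the start of the full name.
You'd have to create a trigram index to support this:
CREATE EXTENSION pg_trgm;
CREATE INDEX ON users USING gin
((' ' || first_name || ' ' || last_name) gin_trgm_ops);
The index uses || rather than concat() because concat() is not IMMUTABLE and so cannot appear in an index expression. Unlike concat(), || yields NULL if either name is NULL, so wrap nullable columns in coalesce() in both the query and the index.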

As pointed out in comments it's probably so slow because you are using functions in the WHERE clause:
https://www.mssqltips.com/sqlservertip/1236/avoid-sql-server-functions-in-the-where-clause-for-performance/
Alternatively, you could add another field, fullName, to your entity and update it in setFirstName() and setLastName() like this:
void setFirstName(String first){
this.firstName = first;
this.fullName = first.toUpperCase() + " " + this.lastName.toUpperCase(); //TODO handle nulls
}
Then you could query by that fullName, which is already uppercased: uppercase your search string in Java and use a simple LIKE query:
public List<User> findByFullNameContaining(String searchText, Pageable pageable);
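As a minimal sketch of the "uppercase your search string in Java" step, the helper below normalizes the search-box input and turns it into a prefix pattern for LIKE. The class name NameSearchPatterns and the method toPrefixPattern are hypothetical and not part of Spring Data. It assumes the pattern is bound as a parameter to a query of the form `full_name LIKE :pattern` on PostgreSQL, where backslash is the default LIKE escape character.

```java
import java.util.Locale;

// Hypothetical helper (not a Spring Data API): turns raw search-box input
// into an uppercased LIKE prefix pattern with LIKE wildcards escaped.
final class NameSearchPatterns {

    private NameSearchPatterns() {}

    static String toPrefixPattern(String rawInput) {
        // Trim, collapse runs of whitespace, and uppercase so the input
        // lines up with a pre-uppercased "FIRST LAST" column.
        String normalized = rawInput.trim()
                .replaceAll("\\s+", " ")
                .toUpperCase(Locale.ROOT);

        StringBuilder pattern = new StringBuilder();
        for (char c : normalized.toCharArray()) {
            // Assumption: backslash is the LIKE escape character
            // (the PostgreSQL default).
            if (c == '\\' || c == '%' || c == '_') {
                pattern.append('\\');
            }
            pattern.append(c);
        }
        // A trailing % makes this a prefix ("starts with") match.
        return pattern.append('%').toString();
    }

    public static void main(String[] args) {
        System.out.println(toPrefixPattern("  john   sm ")); // JOHN SM%
        System.out.println(toPrefixPattern("50%_off"));      // 50\%\_OFF%
    }
}
```

A pattern with only a trailing wildcard can use an ordinary B-tree index on the stored column (in PostgreSQL this may need the text_pattern_ops operator class, depending on the collation), whereas a Containing-style query with a leading wildcard cannot.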

Related

Lucene Search 2 fields

I tried to search the best matching product (bounty paper towel) from a certain retailer, my query is the following, but the query returns 0 hit.
BooleanQuery.Builder combine = new BooleanQuery.Builder();
Query q1 = new QueryParser("product", new StandardAnalyzer()).parse(QueryParser.escape("product:" + "bounty paper towel"));
combine.add(q1, BooleanClause.Occur.SHOULD); // find best name match
Query q2 = new QueryParser("retailer", new StandardAnalyzer()).parse(QueryParser.escape("retailer:" + "Target"));
combine.add(q2, BooleanClause.Occur.MUST); // Must from this retailer
searcher.search(combine.build(), hitsPerPage).scoreDocs;
Is there anything wrong with the way I build the query?
You are escaping things you don't want to escape. You pass the string "product:bounty paper towel" to the escape method, which will escape the colon, which you don't want to escape. In effect, that query, after escaping and analysis, will look like this:
product:product\:bounty product:paper product:towels
You should escape the search terms, not the entire query. Something like:
parser.parse("product:" + QueryParser.escape("bounty paper towels"));
Also, it looks like you are looking for a phrase query there, in which case, it should be surrounded by quotes:
parser.parse("product:\"" + QueryParser.escape("bounty paper towels") + "\"");
The way you're building your boolean query looks fine. You could leverage the query parser syntax to accomplish the same thing, if you prefer, like this:
parser.parse(
"product:\"" + QueryParser.escape("bounty paper towels") + "\""
+ " +retailer:" + QueryParser.escape("Target")
);
But again, there is nothing wrong with BooleanQuery.Builder instead.
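To make the difference concrete, here is a small sketch that compares escaping the whole query string with escaping only the search terms. So that it runs without Lucene on the classpath, it uses a local stand-in for QueryParser.escape that backslash-escapes the query syntax characters. The class LuceneEscapeDemo and its methods are hypothetical names for illustration; in real code you would call QueryParser.escape itself.

```java
// Sketch only: escape() below is a stand-in for Lucene's QueryParser.escape,
// included so the example runs without Lucene. Use the real method in practice.
final class LuceneEscapeDemo {

    // Characters with special meaning in the classic query parser syntax.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    private LuceneEscapeDemo() {}

    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // What the question does: the field separator ':' gets escaped too,
    // so "product" is no longer treated as a field name.
    static String escapeWholeQuery(String field, String terms) {
        return escape(field + ":" + terms);
    }

    // What the answer suggests: escape only the user-supplied terms and
    // wrap them in quotes to form a phrase query on the given field.
    static String fieldPhrase(String field, String terms) {
        return field + ":\"" + escape(terms) + "\"";
    }

    public static void main(String[] args) {
        System.out.println(escapeWholeQuery("product", "bounty paper towel"));
        System.out.println(fieldPhrase("product", "bounty paper towel"));
    }
}
```

The first call prints product\:bounty paper towel, where the colon is now literal text, and the second prints product:"bounty paper towel", which the parser reads as a phrase on the product field.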
Used Lucene too many years ago, but let me try...
Rewrite your parsing part as follows:
...
Query q1 = new QueryParser("product", new StandardAnalyzer())
.parse("bounty paper towel");
...
Query q2 = new QueryParser("retailer", new StandardAnalyzer())
.parse("Target");
...
So your query should contain only target information, but not a column name - since it is already referenced before.

Rails Active Record complex query with UPPER and OR in sql clause

I want to query my psql database from my rails app using the example query string:
select * from venues where upper(regexp_replace(postcode, ' ', '')) = '#{postcode}' or name = '#{name}'
There are 2 aspects to this query:
the first is to compare against a manipulated value in the database (upper and regexp_replace) which I can do within an active record where method
the second is to provide the or condition which appears to require the use of ARel
I would appreciate some help in joining these together.
See squeel, it can handle OR queries and functions in a pretty human friendly way: https://github.com/activerecord-hackery/squeel & http://railscasts.com/episodes/354-squeel
for example:
[6] pry(main)> Page.where{(upper(regexp_replace(postcode, ' ', '')) == 'foo') | (name == 'bar')}.to_sql
=> "SELECT \"pages\".* FROM \"pages\" WHERE upper(regexp_replace(\"pages\".\"postcode\", ' ', '')) = 'foo'"
alternative is to code the query directly:
scope :funny_postcode_raw, lambda{ |postcode, name| where("upper(regexp_replace(postcode, ' ', '')) = ? or name = ?", postcode, name) }
I don't suggest going the Arel path, not worth it 99.9% of the time
NOTE: the OR operator for squeel is |

Best way to combine first_name and last_name columns in Ruby on Rails 3?

I have a method in my User model:
def self.search(search)
where('last_name LIKE ?', "%#{search}%")
end
However, it would be nice for my users to be able to search for both first_name and last_name within the same query.
I was thinking to create a virtual attribute like this:
def full_name
[first_name, last_name].join(' ')
end
But is this efficient on a database level? Or is there a faster way to retrieve search results?
Thanks for any help.
The virtual attribute in your example is just an instance method and cannot be used by find-like ActiveRecord methods to query the database.
The easiest way to retrieve search results is to modify the search method:
def self.search(search)
q = "%#{search}%"
where("first_name + ' ' + last_name LIKE ? OR last_name + ' ' + first_name LIKE ?", q, q)
end
where varchar concatenation syntax is compatible with your database of choice (MS SQL in my example).
The search functionality, in your example, is still going to run at the SQL level.
So, to follow your example, your search code might be:
def self.search_full_name(query)
q = "%#{query}%"
where('last_name LIKE ? OR first_name LIKE ?', q, q)
end
NOTE -- these sorts of LIKE queries, because they have a wildcard at the prefix, will be slow on large sets of data, even if they are indexed.
One way this can be implemented is by tokenizing (splitting) the search query and creating one where condition per each token:
def self.search(query)
conds = []
params = {}
query.split.each_with_index do |token, index|
conds.push "first_name LIKE :t#{index} OR last_name LIKE :t#{index}"
params[:"t#{index}"] = "%#{token}%"
end
where(conds.join(" OR "), params)
end
Also make sure you prevent SQL injection attacks.
However, it's better to use full-text searching tools, such as ElasticSearch and its Ruby gem named Tire to handle searches.
EDIT: Fixed the code.
A scope can be made to handle complex modes, here's an example from one project I'm working on:
scope :search_by_name, lambda { |q|
if q
case q
when /^(.+),\s*(.*?)$/
where(["(last_name LIKE ? or maiden_name LIKE ?) AND (first_name LIKE ? OR common_name LIKE ? OR middle_name LIKE ?)",
"%#{$1}%","%#{$1}%","%#{$2}%","%#{$2}%","%#{$2}%"
])
when /^(.+)\s+(.*?)$/
where(["(last_name LIKE ? or maiden_name LIKE ?) AND (first_name LIKE ? OR common_name LIKE ? OR middle_name LIKE ?)",
"%#{$2}%","%#{$2}%","%#{$1}%","%#{$1}%","%#{$1}%"
])
else
where(["(last_name LIKE ? or maiden_name LIKE ? OR first_name LIKE ? OR common_name LIKE ? OR middle_name LIKE ?)",
"%#{q}%","%#{q}%","%#{q}%","%#{q}%","%#{q}%"
])
end
else
{}
end
}
As you can see, I do a regex match to detect different patterns and build different searches depending on what is provided. As an added bonus, if nothing is provided, it returns an empty hash, which is effectively where(true) and returns all results.
As mentioned elsewhere, the db cannot index the columns when a wildcard is used on both sides like %foo%, so this could potentially get slow on very large datasets.

Basic - SQL Query to LINQ Query

I have been trying out some LINQ queries. Can someone please show me how to convert the following SQL query to LINQ:
SELECT *, firstname+' '+lastname AS FullName FROM Client WHERE age > 25;
Don't worry about the where part (I put it in for completeness); I'm more wondering how to achieve that first part.
Now I have come across something like this:
from c in dc.Clients select new {FullName = c.firstname + " "+c.lastname}
But I don't know how to get it to select everything else without specifying each column, i.e.:
{firstname = c.firstname, id = c.id ..... etc}
But I was hoping for another way of achieving that.
So I'm just wondering if someone could show me the right way, or another way, of accomplishing this :)
Thanks All :)
You have to select the actual item then refer to its properties. There's no way to expand the individual columns into the anonymous type.
var query = from c in dc.Clients
where c.Age > 25
select new
{
Client = c,
FullName = c.firstname + " " + c.lastname
};
foreach (var item in query)
{
// item.Client.Id
// item.FullName
// item.Client.FirstName
}
Selecting the actual item gives you access to the same properties you were using to construct the anonymous type. It's not a complete waste though if the query had more going on, such as a join with another table and including fields from that result in the anonymous type, along with the entire Client object.
You can't autogenerate every column with Linq2Sql or EF (you can, however, find a way to mimic this behavior with micro-ORMs like Dapper and Massive).
More conveniently, you can just select a new anonymous type with two fields, the full name and the client, like:
from c in dc.Clients
select new
{
FullName = c.firstname + " "+c.lastname,
Client = c
}
I would, however, recommend selecting just those properties that you really need. This forces you to think about how to compose your query and what the query is intended to do (and hence, select). Alternatively, you can just select the client and use an extension method to build the full name, like:
public static string GetFullName(this Client client){ return client.firstname + " " + client.lastname; }

NHibernate Criteria - How to filter on combination of properties

I needed to filter a list of results using the combination of two properties. A plain SQL statement would look like this:
SELECT TOP 10 *
FROM Person
WHERE FirstName + ' ' + LastName LIKE '%' + @Term + '%'
The ICriteria in NHibernate that I ended up using was:
ICriteria criteria = Session.CreateCriteria(typeof(Person));
criteria.Add(Expression.Sql(
"FirstName + ' ' + LastName LIKE ?",
"%" + term + "%",
NHibernateUtil.String));
criteria.SetMaxResults(10);
It works perfectly, but I'm not sure if it is the ideal solution since I'm still learning about NHibernate's Criteria API. What are the recommended alternatives?
Is there something besides Expression.Sql that would perform the same operation? I tried Expression.Like but couldn't figure out how to combine the first and last names.
Should I map a FullName property to the formula "FirstName + ' ' + LastName" in the mapping class?
Should I create a read only FullName property on the domain object then map it to a column?
You can do one of the following:
If you always work with the full name, it's probably best to have a single property
Create a query-only property for that purpose (see http://ayende.com/Blog/archive/2009/06/10/nhibernate-ndash-query-only-properties.aspx)
Do the query in HQL, which is better suited for free-form queries (it will probably be almost the same as your SQL)
Use a proper entity-based Criteria:
Session.CreateCriteria<Person>()
.Add(Restrictions.Like(
Projections.SqlFunction("concat",
NHibernateUtil.String,
Projections.Property("FirstName"),
Projections.Constant(" "),
Projections.Property("LastName")),
term,
MatchMode.Anywhere))
On the purely technical side I don't have an answer, but consider this:
Since you only have a single input field for the user to enter the term, you don't know whether they will enter 'foo bar' or 'bar foo', so I would recommend this:
ICriteria criteria = Session.CreateCriteria(typeof(Person));
criteria.Add(Expression.Like("FirstName",term, MatchMode.Anywhere) || Expression.Like("LastName",term, MatchMode.Anywhere));
criteria.SetMaxResults(10);