Sort by preference nhibernate or sql server - sql

I have a list of users in a table. When I search that table, I want the usernames that begin with the search key to appear at the top of the list, followed by the usernames that contain the search key elsewhere.
For example, consider the list of usernames:
rene
irene
adler
leroy
Argog
harry
evan
I am providing "suggestions" as the user types in a search box when they are trying to search for other users. If the user types va into the search box, more often than not they will be looking for the user vain, but because I'm sorting the users by username in ascending order, evan is always on top. What I want is an ordering like this:
searching for va
vain
evan
searching for re
rene
irene
searching for ar
Argog
harry
Of course, if they enter one more character it will be narrowed down further.
Thinking about it, this is what I want to do: put the usernames that start with the search key on top (if multiple usernames start with the search key, sort them alphabetically). Then, if the required number of usernames isn't complete, append the other usernames that contain the search key, alphabetically.
I'm paginating the results in SQL itself, and I'm using NHibernate QueryOver to perform the task. By "if the required number of usernames isn't complete", I mean that if the page size is 10 and I have only 7 usernames that start with the search key, the other usernames that contain the search key should fill the rest of the page.
I can't currently see a way to do all this in one query. Do I have to split the query into two parts and contact the db twice to get this done, or is there a way I can sort by the position of the string?
Any hints about how to do this efficiently would be very helpful. I can't even think of the query that would do this in plain SQL.
thanks.
Solution
The accepted answer pushed me in the right direction and this is what finally worked for me:
.OrderBy(Projections.Conditional(
    Restrictions.Where(r => r.Username.IsLike(searchKey + "%")),
    Projections.Constant(0),
    Projections.Constant(1))).Asc;

In plain SQL you could craft an ORDER BY clause such as:
ORDER BY CASE WHEN field LIKE 'VA%' THEN 0
              WHEN field LIKE '%VA%' THEN 1
              ELSE 2
         END
Of course you can use variables/field names instead.
Not sure as to the rest of your question.
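To see the CASE ordering combined with paging, here is a minimal sketch in Python using the standard-library sqlite3 module rather than SQL Server or NHibernate. The table name users, the column username, and the helper suggest are assumptions made for illustration, and the sample data adds vain to the usernames from the question. SQLite's LIKE is case-insensitive for ASCII by default; SQL Server behaves that way only under a case-insensitive collation.

```python
import sqlite3

def suggest(conn, key, page_size=10, page=0):
    """Return one page of usernames containing key, prefix matches first."""
    # Note: % or _ inside key are not escaped in this sketch.
    prefix = key + "%"
    anywhere = "%" + key + "%"
    rows = conn.execute(
        """
        SELECT username
        FROM users
        WHERE username LIKE ?
        ORDER BY CASE WHEN username LIKE ? THEN 0 ELSE 1 END,
                 username COLLATE NOCASE
        LIMIT ? OFFSET ?
        """,
        (anywhere, prefix, page_size, page * page_size),
    ).fetchall()
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT)")
names = ["rene", "irene", "adler", "leroy", "Argog", "harry", "evan", "vain"]
conn.executemany("INSERT INTO users VALUES (?)", [(n,) for n in names])

print(suggest(conn, "va"))
print(suggest(conn, "re"))
print(suggest(conn, "ar"))
```

Because the prefix matches and the "contains" matches come back from one ordered query, the page is filled with the remaining matches automatically and no second round trip is needed.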

QueryOver based on Goat_CO's idea:
session.QueryOver<YourClass>()
.OrderBy(
Projections.Conditional(
Restrictions.Like(Projections.Property<YourClass>(x => x.Pro),
searchString,
MatchMode.Anywhere),
Projections.Constant(0),
Projections.Constant(1)))
.Asc;

Related

SQL Server Efficient Search for LIKE '%str%'

In Sql Server, I have a table containing 46 million rows.
I want to search the "Title" column of the table. The search term may appear at any position in the field value.
For example:
Value in table: BROTHERS COMPANY
Search string: ROTHER
I want this search to match the given record. This is exactly what LIKE '%ROTHER%' does. However, LIKE '%%' patterns should not be used on large tables because of performance issues. How can I achieve this?
Though I don't know your requirements, your best approach may be to challenge them. Middle-of-the-string searches are usually not very practical. If you can get your users to perform prefix searches (broth%) then you can easily use Full Text's wildcard search (CONTAINS(*, '"broth*"')). Full Text can also handle suffix searches (%rothers) with a little extra work.
But when it comes to middle-of-the-string searches with SQL Server, you're stuck using LIKE. However you may be able to improve performance of LIKE by using a binary collation as explained in this article. (I hate to post a link without including its content but it is way too long of an article to post here and I don't understand the approach enough to sum it up.)
If that doesn't help and if middle-of-the-string searches are that important of a requirement then you should consider using a different search solution like Lucene.
Add Full-Text index if you want.
You can search the table using CONTAINS:
SELECT *
FROM YourTable
WHERE CONTAINS(TableColumnName, 'SearchItem')

SQL exact match within a pattern?

I am using qodbc (a QuickBooks database connector). It uses an ODBC-like SQL language.
I would like to find all the records where a field matches a pattern, but I have a slight dilemma.
The information in my field looks like this:
321-......02/25/10
321-1.....02/26/10
321-2.....03/25/10
321-3.....03/26/10
322-......04/25/10
322-1.....04/26/10
322-2.....05/25/10
322-3.....05/26/10
I would like my query to return only the rows where the pattern matches the first number. So if the user searches for '321' it will only show records that look like 321 but not those that have 321-1 or 321-3. Similarly if the user searched for 321-1 you would not see 321. (that's the easy part)
Right now I have
LIKE '321%'
This finds all of them, regardless of whether they are followed by dots or not. Is there a way I can limit the query to only the specific records, despite that field having more information than it should?
(P.S. I did not set up this system; it makes me wince to see two data points in one field.
I'm sorry if my title isn't right; suggest a new title if you can.)
LIKE '321%' AND NOT LIKE '321-%'
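In the sample rows every value has a dash after the first number, so it may help to put the separator and the filler dot into the pattern itself. The following is a sketch in Python with sqlite3, not qodbc, whose dialect may differ. The table items, the column ref, and the helper find_exact are hypothetical, and the sketch assumes the filler after the number is always a literal dot, as in the rows shown. In a LIKE pattern the dot is an ordinary character; only % and _ are wildcards.

```python
import sqlite3

ROWS = [
    "321-......02/25/10", "321-1.....02/26/10",
    "321-2.....03/25/10", "321-3.....03/26/10",
    "322-......04/25/10", "322-1.....04/26/10",
    "322-2.....05/25/10", "322-3.....05/26/10",
]

def find_exact(conn, number):
    """Match only rows whose leading number is exactly `number`."""
    # A bare number such as "321" is followed by "-." in the data;
    # a suffixed number such as "321-1" is followed directly by ".".
    pattern = number + (".%" if "-" in number else "-.%")
    rows = conn.execute(
        "SELECT ref FROM items WHERE ref LIKE ? ORDER BY ref", (pattern,)
    ).fetchall()
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (ref TEXT)")
conn.executemany("INSERT INTO items VALUES (?)", [(r,) for r in ROWS])

print(find_exact(conn, "321"))
print(find_exact(conn, "321-1"))
```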

Solr: How can I get all documents ordered by score with a list of keywords?

I have a Solr 3.1 database containing Emails with two fields:
datetime
text
For the query I have two parameters:
date of today
keyword array("important thing", "important too", "not so important, but more than average")
Is it possible to create a query to
get ALL documents of this day AND
sort them by relevancy, so that the email that contains the most of my keywords (important things) scores best?
The part with the date is not very complicated:
fq=datetime:[YY-MM-DDT00:00:00.000Z TO YY-MM-DDT23:59:59.999Z]
I know that you can boost the keywords this way:
q=text:"first keyword"^5 OR text:"second one"^2 OR text:"minus scoring"^0.5 OR text:"*"
But how do I use the keywords only to sort this list and get ALL entries, instead of doing a real query and getting only a few entries back?
Thanks for help!
You need to specify your terms in the main query and then change your date query to be a filter query on these results by adding the following.
fq=datetime:[YY-MM-DDT00:00:00.000Z TO YY-MM-DDT23:59:59.999Z]
So you should have something like this:
q=<terms go here>&fq=datetime:[YY-MM-DDT00:00:00.000Z TO YY-MM-DDT23:59:59.999Z]
Edit: A little more about filter queries (as suggested by rfreak).
From Solr Wiki - FilterQuery Guidance - "Now, what is a filter query? It is simply a part of a query that is factored out for special treatment. This is achieved in Solr by specifying it using the fq (filter query) parameter instead of the q (main query) parameter. The same result could be achieved leaving that query part in the main query. The difference will be in query efficiency. That's because the result of a filter query is cached and then used to filter a primary query result using set intersection."
These should be sorted by relevancy score already, that is just the default behavior of Solr. You can see the score by adding that field.
fl=*,score
If you use the Full Interface for Make A Query on the Admin Interface of your Solr installation at http://<yourserver:port#>/<instancename>/admin/form.jsp you will see where you can specify the filter query, fields, and other options. You can check out the Solr Wiki for more details on the options and how they are used.
I hope that this helps you.
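As a rough illustration of how the q, fq and fl parameters described above fit together, here is a small Python sketch that only builds the request parameters and does not contact a server. The field names text and datetime come from the question; the helper build_params, the example date, the boosts, and the rows value are made up for illustration. The trailing *:* clause is an addition of this sketch: it is Solr's match-all query, included so that mails containing none of the keywords are still returned, just with a lower score.

```python
from urllib.parse import urlencode

def build_params(day, boosted_phrases, rows=1000):
    """Build Solr request parameters for one day, scored by boosted phrases."""
    clauses = [
        'text:"{}"^{}'.format(phrase.replace('"', '\\"'), boost)
        for phrase, boost in boosted_phrases
    ]
    clauses.append("*:*")  # match-all, so every mail of the day is returned
    return {
        "q": " OR ".join(clauses),
        # The filter query restricts the result set without affecting scores.
        "fq": "datetime:[{0}T00:00:00.000Z TO {0}T23:59:59.999Z]".format(day),
        "fl": "*,score",   # return all fields plus the relevancy score
        "rows": rows,      # must be large enough to cover the whole day
    }

params = build_params(
    "2011-05-01",
    [("important thing", 5), ("important too", 2)],
)
query_string = urlencode(params)
print(params["q"])
print(params["fq"])
```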
You could do a first query for:
fq=datetime:[YY-MM-DDT00:00:00.000Z TO YY-MM-DDT23:59:59.999Z]
which gives all documents that match the range. Then, use CachingWrapperFilter for the second query to find documents in the DocSet from the first query which have at least one keyword. They will be relevance ranked per tf-idf. You may want to use ConstantScoreQuery for the first query to get the list of matching docids in the fastest possible way.
Sorting by relevance is default behavior on solr/lucene.
If the results are unsatisfactory, try putting the keywords in quotes.
//Edit: Following the answer from Paige Cook, use something like this:
q="important thing"&fq=datetime:[YY-MM-DDT00:00:00.000Z TO YY-MM-DDT23:59:59.999Z]
//2nd update: On reflection, quotes are not a good idea, because in that case you will only receive "important thing" mails, but no "important too" mails.
The point is which keywords you are using. Searching for -- important thing -- results in the highest scores for "important thing" mails, but Lucene does not know how to score "important too" or "not so important, but more than average" in relation to your keywords.
Another idea would be to search only for "important". But the field values "important thing" and "important too" get nearly the same score, because the searched keyword (in this case "important") makes up 50% of each field value.
So you probably have to change your keywords. It could work after changing "important too" into "also an important mail", to get the best ratio of the search word "important" to the field value, so that the shortest mail description gets the highest score.

Sort with one option forced to top of list

I have a PHP application that displays a list of options to a user. The list is generated from a simple query against SQL 2000. What I would like to do is have a specific option at the top of the list, and then have the remaining options sorted alphabetically.
For example, here's the options if sorted alphabetically:
Calgary
Edmonton
Halifax
Montreal
Toronto
What I would like the list to be is more like this:
**Montreal**
Calgary
Edmonton
Halifax
Toronto
Is there a way that I can do this using a single query? Or am I stuck running the query twice and appending the results?
SELECT name
FROM locations
ORDER BY
CASE
WHEN name = 'Montreal'
THEN 0
ELSE 1
END, name
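The CASE ordering above can be tried out quickly with a small sketch in Python using sqlite3 instead of SQL 2000. The table and column names follow the query above; the helper pinned_first and the in-memory database are only for illustration.

```python
import sqlite3

def pinned_first(conn, pinned):
    """Return all location names with `pinned` first, the rest alphabetical."""
    rows = conn.execute(
        """
        SELECT name
        FROM locations
        ORDER BY CASE WHEN name = ? THEN 0 ELSE 1 END, name
        """,
        (pinned,),
    ).fetchall()
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE locations (name TEXT)")
cities = ["Calgary", "Edmonton", "Halifax", "Montreal", "Toronto"]
conn.executemany("INSERT INTO locations VALUES (?)", [(c,) for c in cities])

print(pinned_first(conn, "Montreal"))
```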
SELECT name FROM options ORDER BY name = "Montreal" DESC, name;
Note: This works with MySQL, not SQL 2000 like the OP requested.
create table Places (
    Name varchar(30),
    Priority bit
)
select Name
from Places
order by Priority desc,
         Name
I had a similar problem on a website I built full of case reports. I wanted the case reports where the victim name is known to sort to the top, because they are more compelling. Conversely I wanted all the John Doe cases to be at the bottom. Since this also involved people's names, I had the firstname/lastname sorting problem as well. I didn't want to split it into two name fields because some cases aren't people at all.
My solution:
I have a "Name" field which is what is displayed. I also have a "NameSorted" field that is used in all queries but is never displayed. My input UI takes care of converting "LAST, FIRST" entered into the sorting field into the display version automatically.
Finally, to "rig" the sorting I simply put appropriate characters at the beginning of the sort field. Since I want stuff to come out at the end, I put "zzz" at the beginning. To sort at the top you could put "!" at the beginning. Again your editing UI can take care of this for you.
Yes, I admit it's a bit cheesy, but it works. One advantage for me is that I have to do more complex queries with joins in different places to generate pages versus RSS etc., and I don't have to keep remembering a complex expression to get the sorting right; it's always just a sort by the "NameSorted" field.
Click my profile to see the resulting website.
I ended up with this
SELECT locations.name
FROM locations
LEFT JOIN (VALUES ('Toronto', 1), ('Montreal', 2)) city (name, rank)
    ON locations.name = city.name
ORDER BY city.rank, locations.name;
Which may be overkill for this example but can be extended for more complex needs.
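One caveat with a ranking join like this: rows without a match in the ranking list get a NULL rank, and where NULLs sort depends on the database (SQL Server and SQLite put them first in ascending order, PostgreSQL puts them last). The sketch below, in Python with sqlite3, makes the intent explicit by sorting unranked rows after ranked ones. It is an adaptation rather than the query above: the column is called priority here, and a common table expression replaces the inline VALUES alias, which SQLite does not accept in that form.

```python
import sqlite3

QUERY = """
WITH city(name, priority) AS (VALUES ('Toronto', 1), ('Montreal', 2))
SELECT locations.name
FROM locations
LEFT JOIN city ON locations.name = city.name
ORDER BY CASE WHEN city.priority IS NULL THEN 1 ELSE 0 END,  -- ranked rows first
         city.priority,                                       -- then by rank
         locations.name                                       -- then alphabetical
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE locations (name TEXT)")
cities = ["Calgary", "Edmonton", "Halifax", "Montreal", "Toronto"]
conn.executemany("INSERT INTO locations VALUES (?)", [(c,) for c in cities])

ordered = [row[0] for row in conn.execute(QUERY)]
print(ordered)
```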

Need Pattern for dynamic search of multiple sql tables

I'm looking for a pattern for performing a dynamic search on multiple tables.
I have no control over the legacy (and poorly designed) database table structure.
Consider a scenario similar to a resume search where a user may want to perform a search against any of the data in the resume and get back a list of resumes that match their search criteria. Any field can be searched at anytime and in combination with one or more other fields.
The actual sql query gets created dynamically depending on which fields are searched. Most solutions I've found involve complicated if blocks, but I can't help but think there must be a more elegant solution since this must be a solved problem by now.
Yeah, so I've started down the path of dynamically building the sql in code. Seems godawful. If I really try to support the requested ability to query any combination of any field in any table this is going to be one MASSIVE set of if statements. shiver
I believe I read that COALESCE only works if your data does not contain NULLs. Is that correct? If so, no go, since I have NULL values all over the place.
As far as I understand (and I'm also someone who has written against a horrible legacy database), there is no such thing as dynamic WHERE clauses. It has NOT been solved.
Personally, I prefer to generate my dynamic searches in code. It makes testing convenient. Note: when you create your SQL queries in code, don't concatenate user input into them. Use your @variables!
The only alternative is to use the COALESCE function. Let's say you have the following table:
Users
-----------
Name nvarchar(20)
Nickname nvarchar(10)
and you want to search optionally for name or nickname. The following query will do this:
SELECT Name, Nickname
FROM Users
WHERE
    Name = COALESCE(@name, Name) AND
    Nickname = COALESCE(@nick, Nickname)
If you don't want to search for something, just pass in a null. For example, passing in "brian" for @name and null for @nick results in the following query being evaluated:
SELECT Name, Nickname
FROM Users
WHERE
Name = 'brian' AND
Nickname = Nickname
The coalesce operator turns the null into an identity evaluation, which is always true and doesn't affect the where clause.
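The concern about NULL data raised earlier can be checked directly: column = column is not true when the column is NULL, so the COALESCE form silently drops such rows. Below is a sketch in Python with sqlite3 that compares it with an explicit "parameter IS NULL OR column = parameter" test. The Users table follows the example above; the sample rows, the helper search, and the two query constants are assumptions for illustration, and SQLite's :name parameters stand in for SQL Server variables.

```python
import sqlite3

COALESCE_SQL = """
SELECT Name, Nickname FROM Users
WHERE Name = COALESCE(:name, Name)
  AND Nickname = COALESCE(:nick, Nickname)
"""

NULL_SAFE_SQL = """
SELECT Name, Nickname FROM Users
WHERE (:name IS NULL OR Name = :name)
  AND (:nick IS NULL OR Nickname = :nick)
"""

def search(conn, sql, name=None, nick=None):
    """Run one of the optional-filter queries; None means 'do not filter'."""
    return conn.execute(sql, {"name": name, "nick": nick}).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (Name TEXT, Nickname TEXT)")
conn.executemany(
    "INSERT INTO Users VALUES (?, ?)",
    [("brian", "bri"), ("brian", None), ("alice", "al")],
)

# The row ("brian", NULL) is lost by the COALESCE form, kept by the other.
print(search(conn, COALESCE_SQL, name="brian"))
print(search(conn, NULL_SAFE_SQL, name="brian"))
```

Either form keeps user input in bound parameters, so nothing is concatenated into the SQL text.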
Search and normalization can be at odds with each other. So probably the first thing would be to create some kind of "view" that shows all the searchable fields as a single row, with a single key that gets you the resume. Then you can put something like Lucene in front of that to give you a full-text index of those rows. The way that works is: you ask it for "x" in this view and it returns the key. It's a great solution and comes recommended by Joel himself on the podcast, within the first 2 months IIRC.
What you need is something like SphinxSearch (for MySQL) or Apache Lucene.
As you said in your example, let's imagine a resume that is composed of several fields:
Name,
Address,
Education (this could be a table on its own) or
Work experience (this could grow to its own table where each row represents a previous job)
So searching for a word in all those fields with WHERE rapidly becomes a very long query with several JOINS.
Instead you could change your frame of reference and think of the whole resume as what it is: a single document. You just want to search that document.
This is what tools like Sphinx Search do. They create a FULL TEXT index of your 'document', and then you can query Sphinx and it will tell you where in the database that record was found.
Really good search results.
Don't worry about these tools not being part of your RDBMS. It will save you a lot of headaches to use the appropriate model ("documents") instead of the incorrect one ("tables") for this application.
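To make the "resume as a single document" idea concrete, here is a toy sketch in plain Python. It is not how Sphinx or Lucene are implemented or queried; it only illustrates the principle of flattening every field of a resume into one bag of words and mapping each word back to the resume key. The sample resumes and the helpers tokenize, build_index and search are invented for illustration.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(documents):
    """Map every word found in any field to the keys of the resumes using it."""
    index = defaultdict(set)
    for key, fields in documents.items():
        for value in fields.values():
            for token in tokenize(value):
                index[token].add(key)
    return index

def search(index, query):
    """Return the keys of resumes that contain every word of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result

# Hypothetical resumes: one key, fields that would live in several tables.
resumes = {
    1: {"name": "Ann Smith", "education": "BSc Physics", "work": "Database developer"},
    2: {"name": "Bob Jones", "education": "MSc Physics", "work": "Search engineer"},
}
index = build_index(resumes)

print(search(index, "physics"))
print(search(index, "physics developer"))
```

The search returns only keys, which matches the description above: the index says which record matched, and the full resume is then loaded from the database by that key.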