Efficiently Handling Multiple Optional Constraints in Where Clause - sql-server-2005

This is somewhat of a sequel to Slow Exists Check. Alex's suggestion works and successfully avoids code repetition, but I still end up with a second issue. Consider the example below (from AlexKuznetsov). In it, I have two branches to handle one constraint. If I had two optional constraints, I would end up with four branches: the number of branches doubles with each constraint, so it grows exponentially.
On the other hand, if I use a Multi-Statement Table-valued function or otherwise use temporary tables, the SQL query optimizer is not able to assist me, so things become slow. I am somewhat distrustful of dynamic SQL (and I've heard it is slow, too).
Can anyone offer suggestions on how to add more constraints without adding lots of if statements?
Note: I have previously tried just chaining (@inpo is null or inpo = @inpo) conditions together, but this is very slow. Keep in mind that while the inpo = @inpo test can be handled via some sort of indexing black magic, the nullity test ends up being evaluated for every row in the table.
IF @inpo IS NULL BEGIN
    SELECT a, b, c
    FROM dbo.ReuseMyQuery(@i1)
    ORDER BY c;
END ELSE BEGIN
    SELECT a, b, c
    FROM dbo.ReuseMyQuery(@i1)
    WHERE inpo = @inpo
    ORDER BY c;
END
Variation Two: 2 constraints:
IF @inpo IS NULL BEGIN
    IF @inpo2 IS NULL BEGIN
        SELECT a, b, c
        FROM dbo.ReuseMyQuery(@i1)
        ORDER BY c;
    END ELSE BEGIN
        SELECT a, b, c
        FROM dbo.ReuseMyQuery(@i1)
        WHERE inpo2 = @inpo2
        ORDER BY c;
    END
END ELSE BEGIN
    IF @inpo2 IS NULL BEGIN
        SELECT a, b, c
        FROM dbo.ReuseMyQuery(@i1)
        WHERE inpo = @inpo
        ORDER BY c;
    END ELSE BEGIN
        SELECT a, b, c
        FROM dbo.ReuseMyQuery(@i1)
        WHERE inpo = @inpo AND
              inpo2 = @inpo2
        ORDER BY c;
    END
END

This is the best reference: http://www.sommarskog.se/dyn-search-2005.html

In such cases I use sp_executesql as described in Erland's article: Using sp_executesql
Whenever dynamic SQL is used, missing permissions can be a problem, so I keep a real network account for unit testing, add that account to the actual role, and impersonate it whenever I test dynamic SQL, as described here: Database Unit Testing: Impersonation
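For the two optional constraints in the question, a minimal sketch of that approach might look like the following (the parameter types are assumed to be int here; substitute the real ones). Only the predicates that are actually needed are concatenated, and the values themselves are still passed as parameters:
DECLARE @sql nvarchar(max)
SET @sql = N'SELECT a, b, c FROM dbo.ReuseMyQuery(@i1) WHERE 1 = 1'

IF @inpo IS NOT NULL
    SET @sql = @sql + N' AND inpo = @inpo'
IF @inpo2 IS NOT NULL
    SET @sql = @sql + N' AND inpo2 = @inpo2'

SET @sql = @sql + N' ORDER BY c;'

-- Unused parameters are simply ignored by the generated statement
EXEC sp_executesql @sql,
     N'@i1 int, @inpo int, @inpo2 int',
     @i1, @inpo, @inpo2;
Each distinct combination of supplied parameters produces its own query text and therefore its own cached plan, which is one of the benefits Erland describes.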

Here's a rough example. Modify the LIKE statements in the WHERE clause depending on whether you want "starts with", "contains", or an exact match in your query.
CREATE PROCEDURE dbo.test
    @name AS VARCHAR(50) = NULL,
    @address1 AS VARCHAR(50) = NULL,
    @address2 AS VARCHAR(50) = NULL,
    @city AS VARCHAR(50) = NULL,
    @state AS VARCHAR(50) = NULL,
    @zip_code AS VARCHAR(50) = NULL
AS
BEGIN
    SELECT [name],
           address1,
           address2,
           city,
           state,
           zip_code
    FROM my_table
    WHERE ([name] LIKE @name + '%' OR @name IS NULL)
      AND (address1 LIKE @address1 + '%' OR @address1 IS NULL)
      AND (address2 LIKE @address2 + '%' OR @address2 IS NULL)
      AND (city LIKE @city + '%' OR @city IS NULL)
      AND (state LIKE @state + '%' OR @state IS NULL)
      AND (zip_code LIKE @zip_code + '%' OR @zip_code IS NULL)
    ORDER BY [name]
END
GO

Select blah from foo
Where (@inpo1 is null or @inpo1 = inpo1)
and (@inpo2 is null or @inpo2 = inpo2)
Apparently this is too slow. Interesting.
Have you considered code generation? Lengthy queries with lots of duplication are only an issue if they have to be maintained directly.

I realise your question may be purely academic, but if you have real-world use cases, have you considered only providing optimised queries for the most common scenarios?

Related

Multi Columns filtration in stored procedure in SQL Server

I have several controls (assume 10: textboxes, dropdowns, radio buttons) in my Windows Forms application for filtering data. None of them is mandatory, so the user may filter the data with one control or more.
Now I have to create a stored procedure that filters the data based on their inputs.
Ex: if the user enters some text in one textbox control and leaves the remaining 9 controls empty, I have to filter the data based only on that textbox.
If the user enters some text in one textbox control and one dropdown, and leaves the remaining 8 controls empty, I have to filter the data based only on that textbox and dropdown.
What am I supposed to do?
In source code:
If the user entered/selected a value in a control, I pass it as a parameter; otherwise I pass NULL for the remaining parameters.
In the stored procedure:
I have a parameter for each of the 10 controls, and based on those parameters I am filtering the data.
IF @Param1 IS NULL AND @Param2 IS NULL AND @Param3 = 'SomeText'
BEGIN
    SELECT * FROM Table1 WHERE TableCOLUMN3 = @Param3
END
IF @Param1 IS NULL AND @Param2 = 'SomeText' AND @Param3 = 'SomeText'
BEGIN
    SELECT * FROM Table1 WHERE TableCOLUMN2 = @Param2 AND TableCOLUMN3 = @Param3
END
Note: I need to filter the data by matching each table column to its parameter; simply assume @Param1 maps to TableCOLUMN1, @Param2 to TableCOLUMN2, and so on, so the filter varies depending on which parameters have text.
If I do it like this my stored procedure will be enormous, because I have 10 parameters to check (for reference, I only showed 3 parameters in the sample above).
What I want is:
Since I have 10 parameters, I want to filter the data in the WHERE condition using only the parameters that actually have values (some text other than NULL).
Is there any other way to do this?
As long as you make your parameters default to NULL, and either don't pass in a value for the parameters you don't need or pass in a DBNull value, you can filter like this:
CREATE PROC dbo.SAMPLE
(
    @Param1 VARCHAR(255) = NULL,
    @Param2 VARCHAR(255) = NULL,
    @Param3 VARCHAR(255) = NULL,
    @Param4 VARCHAR(255) = NULL,
    @Param5 VARCHAR(255) = NULL,
    @Param6 VARCHAR(255) = NULL
)
AS
BEGIN
    SELECT *
    FROM Table1
    WHERE (@Param1 IS NULL OR TableCOLUMN1 = @Param1)
      AND (@Param2 IS NULL OR TableCOLUMN2 = @Param2)
      AND (@Param3 IS NULL OR TableCOLUMN3 = @Param3)
      AND (@Param4 IS NULL OR TableCOLUMN4 = @Param4)
      AND (@Param5 IS NULL OR TableCOLUMN5 = @Param5)
      AND (@Param6 IS NULL OR TableCOLUMN6 = @Param6)
    OPTION (RECOMPILE) -- as JamesZ suggested, to prevent caching
END
GO
EXEC dbo.SAMPLE @Param2 = 'SomeText' -- only filter where TableCOLUMN2 = @Param2
I would suggest something like this:
SELECT *
FROM TABLE1
WHERE TableCOLUMN1 = ISNULL(@Param1, TableCOLUMN1)
  AND TableCOLUMN2 = ISNULL(@Param2, TableCOLUMN2)
  AND TableCOLUMN3 = ISNULL(@Param3, TableCOLUMN3)
  AND TableCOLUMN4 = ISNULL(@Param4, TableCOLUMN4)
... and so on...
This will filter column1 on a value if you specify @Param1; otherwise it compares the column value with itself, which is true whenever the column is not NULL (rows where the column itself is NULL are filtered out, since NULL = NULL does not evaluate to true).
But this will only work if the unused @Param values are NULL.
If the table is big or you need to use indexes for fetching the rows, the problem with this kind of logic is that indexes can't really be used. There are basically two ways around it:
Add OPTION (RECOMPILE) to the end of the SELECT statement by @Ionic or @user1221684. This will cause the statement to be recompiled every time it is executed, which might be a lot of CPU overhead if it's called often.
Create dynamic SQL and call it using sp_executesql
Example:
DECLARE @sql nvarchar(max)
SET @sql = N'SELECT * FROM TABLE1 WHERE '
IF (@Param1 IS NOT NULL) SET @sql = @sql + N'TableCOLUMN1 = @Param1 AND '
IF (@Param2 IS NOT NULL) SET @sql = @sql + N'TableCOLUMN2 = @Param2 AND '
IF (@Param3 IS NOT NULL) SET @sql = @sql + N'TableCOLUMN3 = @Param3 AND '
-- Note: you're not concatenating the value of the parameter, just its name
SET @sql = @sql + N'1=1' -- This handles the last 'AND'
EXEC sp_executesql @sql,
     N'@Param1 varchar(10), @Param2 varchar(10), @Param3 varchar(10)',
     @Param1, @Param2, @Param3
As an extra option, you could do some kind of mix between your original idea and the totally dynamic one, so that at least the most common search criteria are handled statically and can be fetched efficiently.
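A sketch of that mixed approach, assuming (purely for illustration) that filtering on @Param2 alone is the most common case:
IF @Param2 IS NOT NULL AND @Param1 IS NULL AND @Param3 IS NULL
BEGIN
    -- Static branch for the most common search; can use an index on TableCOLUMN2 directly
    SELECT * FROM Table1 WHERE TableCOLUMN2 = @Param2
END
ELSE
BEGIN
    -- Fall back to the dynamic SQL above for the rarer combinations
    DECLARE @sql nvarchar(max)
    SET @sql = N'SELECT * FROM Table1 WHERE 1 = 1'
    IF @Param1 IS NOT NULL SET @sql = @sql + N' AND TableCOLUMN1 = @Param1'
    IF @Param2 IS NOT NULL SET @sql = @sql + N' AND TableCOLUMN2 = @Param2'
    IF @Param3 IS NOT NULL SET @sql = @sql + N' AND TableCOLUMN3 = @Param3'
    EXEC sp_executesql @sql,
         N'@Param1 varchar(10), @Param2 varchar(10), @Param3 varchar(10)',
         @Param1, @Param2, @Param3
END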
Normally every parameter will have a default value, for example an int parameter can default to zero. Using this you can write the condition; see the example SQL below.
create procedure [dbo].[sp_report_test](
    @pParam1 int,
    @pParam2 int,
    @pParam3 int,
    @pParam4 varchar(50)
)
AS
SELECT *
FROM [vw_report]
WHERE (@pParam1 <= 0 or Column1 = @pParam1) and
      (@pParam2 <= 0 or Column2 = @pParam2) and
      (@pParam3 <= 0 or Column3 = @pParam3) and
      (@pParam4 is null or len(@pParam4) <= 0 or Column4 = @pParam4);
GO

CASE WHEN in WHERE with LIKE condition instead of 1

I have a query with a bunch of OR's inside an AND in the where clause and I'm trying to replace them with CASE WHEN to see if it improves the performance.
The select query inside the stored procedure is something like:
DECLARE @word varchar(100) = '%word%' -- these are inputs (types assumed for this example)
DECLARE @type varchar(10) = 'type'
SELECT * FROM table1
WHERE SomeCondition1
  AND ( (@type = 'CS' AND col1 LIKE @word)
        OR
        (@type = 'ED' AND col2 LIKE @word)
        ....
      )
I'm trying to write this query as:
SELECT * FROM table1
WHERE SomeCondition1
  AND ( 1 = CASE WHEN @type = 'CS'
                 THEN col1 LIKE @word
                 WHEN @type = 'ED'
                 THEN col2 LIKE @word
            END )
But SQL Server 2012 gives the error 'Incorrect syntax near LIKE' for THEN col1 LIKE @word. If I replace THEN col1 LIKE @word with 1 then there are no complaints, but LIKE should return a 0 or 1 anyway.
I tried SELECT (col1 LIKE @word), extra parentheses, etc. with no success.
Is there a way to include LIKE in CASE WHEN in WHERE or should I just not bother if using CASE WHEN instead of the original IF's won't make any performance difference?
UPDATE:
This actually didn't make any difference performance-wise.
There is a lot of info online about these 'optional'-type stored procedures and how to avoid parameter sniffing performance issues.
This syntax should get you closer though:
AND CASE
        WHEN @type = 'CS' THEN col1
        WHEN @type = 'ED' THEN col2
    END LIKE @word
Just make sure the col1 and col2 datatypes are similar (don't mix INT and VARCHAR)
You should compare query plans between the two syntaxes to ascertain whether it even makes a difference. Your performance issue might be due more to parameter sniffing.
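If sniffing is the suspect, one quick check (a sketch only, reusing the question's names and omitting the SomeCondition1 placeholder) is to add a statement-level recompile and compare timings:
SELECT *
FROM table1
WHERE ( (@type = 'CS' AND col1 LIKE @word)
     OR (@type = 'ED' AND col2 LIKE @word) )
OPTION (RECOMPILE) -- compiles a fresh plan for the current parameter values on each run
If this form is consistently fast, the slow runs were probably reusing a plan compiled for unrepresentative parameter values.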
You can also try nested CASE statements, e.g. based on your latest post, something like:
1 = CASE WHEN @type = 'CandidateStatus'
         THEN (CASE WHEN co.[Description] LIKE @text THEN 1 END)
    ...
    END
Here's how I got it to work; now I just need to test whether it makes any difference to performance. @Nick.McDermaid's point about parameter sniffing is worth looking at.
1 = CASE WHEN @type = 'CandidateStatus'
         THEN (SELECT 1 WHERE co.[Description] LIKE @text)

sql full text search with nullable parameter

We would like to work with full-text search but keep the incoming parameters in our procedure nullable. In this example @Street can have a value like Heuvel or can be NULL.
Working this out in a single query is something that doesn't work well:
heuvel* gives all the rows that are expected.
* gives zero rows, while other threads on forums say that it should give all rows.
street = isnull(@street, street) gives all rows.
Knowing all the above, I would think the query below would work. It does, but it's VERY slow.
When I execute either side of the OR clause separately, the query is fast.
Declare @Street nvarchar(50) = '"heuvel*"'
Declare @InnerStreet nvarchar(50) = '"' + isnull(@Street, '') + '*"';
SELECT *
FROM Address
WHERE street = isnull(@street, street) or CONTAINS(street, @InnerStreet)
Now the question is: how do we work with full-text search and nullable parameters?
Try this. @street is either NULL or has a value.
declare @streetWildcard nvarchar(50) = '"' + @street + '*"'
SELECT *
FROM Address
WHERE (@street is NULL) or CONTAINS(street, @streetWildcard)
If the performance is still bad, then the execution plan may not be short-circuiting the WHERE condition. In that case, try this instead.
if (@street is NULL) begin
    SELECT *
    FROM Address
end
else begin
    declare @streetWildcard nvarchar(50) = '"' + @street + '*"'
    SELECT *
    FROM Address
    WHERE CONTAINS(street, @streetWildcard)
end

Sql function to make noun plural

Is there any function in SQL Server which changes a noun from singular to plural form?
SQL itself doesn't have anything like this - but you could try to use the .NET PluralizationService introduced in .NET 4 - the same functionality that the Entity Framework uses to pluralize/singularize table names to object names.
You would have to write a SQL-CLR assembly to tap into the pluralization services, but it definitely seems like a doable thing!
That function does not exist in SQL Server.
The function does not exist in SQL Server, as @aF mentioned. The one place I know of it existing is in Entity Framework 4+. The pluralization object can actually be instantiated and used.
SQL Server has the capacity to run CLR code through SQL CLR. The main problem is that SQL CLR is limited to the .NET Framework 3.5, so you would either need to write some .NET 4 code that operates on your tables from outside the database, or use a product like Reflector to reverse engineer a 3.5-compatible version and run it inside SQL Server.
CREATE FUNCTION dbo.Pluralize
(
    @noun nvarchar(50)
)
RETURNS nvarchar(50)
AS
BEGIN
    DECLARE @QueryString nvarchar(4000)
    SET @QueryString = N'FORMSOF(INFLECTIONAL,"' + @noun + N'")'
    RETURN
        (SELECT TOP 1 display_term
         FROM sys.dm_fts_parser(@QueryString, 1033, 0, 0))
END
GO
SELECT noun,
dbo.Pluralize(noun)
FROM (VALUES('cat'),
('mouse'),
('goose'),
('person'),
('man'),
('datum')) nouns(noun)
Returns
noun
------ ------------------------------
cat    cats
mouse  mice
goose  geese
person persons
man    men
datum  data
Unfortunately it just relies on the observation that the TOP 1 expansion term for the noun is the plural form. I doubt that this is documented anywhere.
DECLARE @PluralVersion nvarchar(128) = ''
DECLARE @TableName nvarchar(128) = 'MyTableName'
DECLARE @NounVersions TABLE (Term nvarchar(128) NOT NULL)
DECLARE @QueryString nvarchar(4000)
SET @QueryString = N'FORMSOF(INFLECTIONAL,"' + @TableName + N'")'
INSERT INTO @NounVersions
SELECT TOP 10 display_term FROM sys.dm_fts_parser(@QueryString, 1033, 0, 0)
SELECT TOP 1 @PluralVersion = Term FROM @NounVersions WHERE Term NOT LIKE '%''%' AND RIGHT(Term, 1) = 's'
SET @PluralVersion = UPPER(LEFT(@PluralVersion, 1)) + LOWER(SUBSTRING(@PluralVersion, 2, LEN(@PluralVersion)))
SELECT @PluralVersion
I know this is an old thread, but I thought I would post anyway. Based on what Martin started, I expanded it and so far it is doing an OK job; I would not say it is perfect. This is used for pluralizing table names, so the input is very controlled.
Do you want to use this for display purposes?
Something like "Your search returned 1 result" / "Your search returned 4 results" ?
If yes, I wouldn't do it like that.
Finding or writing a function that does this correctly for all special cases (let alone in several languages) is nearly impossible, and storing each needed text once in singular and once in plural form isn't much better.
At work I deal with multiple languages and lots of dynamically generated sentences like this, and I found that completely avoiding the singular/plural distinction is the easiest solution to manage:
"Number of results for this search: 1"
Nope, but it would be pretty easy to make a table for this if you have a limited set of words to check.
Example:
CREATE TABLE dbo.Plurals
(
id int IDENTITY,
singular varchar(100),
plural varchar(100)
)
INSERT INTO dbo.Plurals
VALUES
('cat', 'cats'),
('goose', 'geese'),
('man', 'men'),
('question', 'questions')
Alternatively, you could make the table hold just the exceptions, i.e. words that can't be pluralized with the simple addition of an s. Then you could do an EXISTS check on that table: if the word isn't there, add an s; if it is, look up the plural.
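A sketch of that exceptions-only idea, reusing the dbo.Plurals table from the example above; the function name dbo.PluralizeWord is made up for illustration:
CREATE FUNCTION dbo.PluralizeWord (@word varchar(100))
RETURNS varchar(101)
AS
BEGIN
    DECLARE @plural varchar(101)
    -- Look the word up in the exceptions table first
    SELECT @plural = plural FROM dbo.Plurals WHERE singular = @word
    -- Fall back to the naive rule when the word is not a known exception
    IF @plural IS NULL SET @plural = @word + 's'
    RETURN @plural
END
GO
SELECT dbo.PluralizeWord('goose'), dbo.PluralizeWord('question')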
I wrote this. Probably not perfect, but it works for me, and it returns the result with the same capitalization it was called with.
create function dbo.udfPlural(@Int int
    , @Item varchar(47))
returns varchar(50)
as
begin
    -- VARS...
    declare @Ninth char(1)
    declare @Tenth char(1)
    declare @Reply varchar(50)
    -- ONLY FOR PLURALS...
    if (@Int <> 1)
    begin
        -- LAST AND PENULTIMATE LETTERS...
        set @Ninth = ''
        set @Tenth = ''
        if (len(@Item) >= 2) set @Ninth = substring(@Item, len(@Item) - 1, 1)
        if (len(@Item) >= 1) set @Tenth = substring(@Item, len(@Item), 1)
        -- APPLY S/ES/IES...
        if (@Tenth = 's')
            set @Reply = @Item + 'es'
        else if (@Tenth = 'y')
            if (charindex(@Ninth, 'AEIOU') > 0)
                set @Reply = @Item + 's'
            else
                set @Reply = substring(@Item, 1, len(@Item) - 1) + 'ies'
        else
            set @Reply = @Item + 's'
    end
    else
        -- SINGULAR: leave the item unchanged
        set @Reply = @Item
    -- ASSEMBLE...
    set @Reply = ltrim(str(@Int)) + ' ' + @Reply
    -- DONE...
    return @Reply
end

SQL Precedence Matching

I'm trying to do precedence matching on a table within a stored procedure. The requirements are a bit tricky to explain, but hopefully this will make sense. Let's say we have a table called books, with id, author, title, date, and pages fields.
We also have a stored procedure that will match a query with ONE row in the table.
Here is the proc's signature:
create procedure match
    @pAuthor varchar(100)
    , @pTitle varchar(100)
    , @pDate varchar(100)
    , @pPages varchar(100)
as
...
The precedence rules are as follows:
First, try and match on all 4 parameters. If we find a match return.
Next try to match using any 3 parameters. The 1st parameter has the highest precedence here and the 4th the lowest. If we find any matches return the match.
Next we check if any two parameters match and finally if any one matches (still following the parameter order's precedence rules).
I have implemented this case-by-case. Eg:
select @lvId = id
from books
where author = @pAuthor
  and title = @pTitle
  and date = @pDate
  and pages = @pPages
if @@ROWCOUNT = 1 begin
    select @lvId
    return
end
select @lvId = id
from books
where author = @pAuthor
  and title = @pTitle
  and date = @pDate
if @@ROWCOUNT = 1 begin
    select @lvId
    return
end
....
However, for each new column in the table, the number of individual checks doubles. I would really like to generalize this to any number of columns; however, I'm having trouble coming up with a scheme.
Thanks for the read, and I can provide any additional information needed.
Added:
Dave and others, I tried implementing your code and it is choking on the first ORDER BY clause, where we add all the counts. It's giving me an invalid column name error. When I comment out the total count and order by just the individual aliases, the proc compiles fine.
Anyone have any ideas?
This is in Microsoft SQL Server 2005.
I believe that the answers you're working on are the simplest by far. But I also believe that in SQL Server they will always be full table scans. (In Oracle you could use bitmap indexes if the table didn't undergo a lot of simultaneous DML.)
A more complex solution but a much more performant one would be to build your own index. Not a SQL Server index, but your own.
Create a table (Hash-index) with 3 columns (lookup-hash, rank, Rowid)
Say you have 3 columns to search on. A, B, C
For every row added to Books you'll insert 7 rows into hash_index either via a trigger or CRUD proc.
First you'll
insert into hash_index
SELECT HASH(A & B & C), 7 , ROWID
FROM Books
Where & is the concatenation operator and HASH is a function
Then you'll insert hashes for A & B, A & C and B & C.
You now have some flexibility: you can give them all the same rank, or if A & B is a superior match to B & C you can give it a higher rank.
And then insert hashes for A by itself, and for B and C, with the same choice of rank... all the same number or all different... you can even say that a match on A is a better choice than a match on B & C. This solution gives you a lot of flexibility.
Of course, this will add a lot of INSERT overhead, but if DML on Books is low or performance is not relevant you're fine.
Now when you go to search, you'll create a function that returns a table of hashes for your @A, @B and @C. You'll have a small table of 7 values that you'll join to the lookup-hash in the hash-index table. This will give you every possible match and possibly some false matches (that's just the nature of hashes). You'll take that result and order descending on the rank column. Then take the first rowid back to the book table and make sure that all of the values of @A, @B and @C are actually in that row. On the off chance they're not and you've been handed a false positive, you'll need to check the next rowid.
Each of these operation in this "roll your own" are all very fast.
Hashing your 3 values into a small 7 row table variable = very fast.
joining them on an index in your Hash_index table = very fast index lookups
Looping over the result set will result in 1, or maybe 2 or 3, table accesses by rowid = very fast
Of course, all of these together could be slower than a full table scan... but a full table scan will continue to get slower and slower as the table grows. There will be a size at which the full scan is slower than this approach. You'll have to play with it.
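Here is roughly how that could look in T-SQL (a sketch only: the Books columns A, B, C and a key column BookId are assumptions, CHECKSUM stands in for the HASH function, and since SQL Server has no ROWID the key column plays that role; the ranks here are arbitrary and can be weighted however you like):
CREATE TABLE dbo.hash_index
(
    lookup_hash int     NOT NULL,
    [rank]      tinyint NOT NULL,
    BookId      int     NOT NULL
)
CREATE INDEX IX_hash_index ON dbo.hash_index (lookup_hash)

-- 7 rows per book: ABC, AB, AC, BC, A, B, C (normally maintained by a trigger or CRUD proc)
INSERT INTO dbo.hash_index (lookup_hash, [rank], BookId)
SELECT CHECKSUM(A, B, C), 7, BookId FROM dbo.Books UNION ALL
SELECT CHECKSUM(A, B),    6, BookId FROM dbo.Books UNION ALL
SELECT CHECKSUM(A, C),    5, BookId FROM dbo.Books UNION ALL
SELECT CHECKSUM(B, C),    4, BookId FROM dbo.Books UNION ALL
SELECT CHECKSUM(A),       3, BookId FROM dbo.Books UNION ALL
SELECT CHECKSUM(B),       2, BookId FROM dbo.Books UNION ALL
SELECT CHECKSUM(C),       1, BookId FROM dbo.Books

-- Search: hash the parameters the same way, take the best-ranked hit,
-- and re-check the real row to weed out hash collisions (false positives)
SELECT TOP 1 b.*
FROM dbo.hash_index h
JOIN dbo.Books b ON b.BookId = h.BookId
WHERE h.lookup_hash IN (CHECKSUM(@A, @B, @C), CHECKSUM(@A, @B), CHECKSUM(@A, @C),
                        CHECKSUM(@B, @C), CHECKSUM(@A), CHECKSUM(@B), CHECKSUM(@C))
  AND (b.A = @A OR b.B = @B OR b.C = @C)
ORDER BY h.[rank] DESC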
I don't have time to write out the query, but I think this idea would work.
For your predicate, use "author = @pAuthor OR title = @pTitle ...", so you get all candidate rows.
Use CASE expressions or whatever you like to create virtual columns in the result set, like:
SELECT CASE WHEN author = @pAuthor THEN 1 ELSE 0 END author_match,
...
Then add this order by and get the first row returned:
ORDER BY (author_match + title_match + date_match + page_match) DESC,
         author_match DESC,
         title_match DESC,
         date_match DESC,
         page_match DESC
You still need to extend it for each new column, but only a little bit.
You don't explain what should happen if more than one result matches any given set of parameters that is reached, so you will need to change this to account for those business rules. Right now I've set it to return books that match on later parameters ahead of those that don't. For example, a match on author, title, and pages would come before one that just matches on author and title.
Your RDBMS may have a different way of handling "TOP", so you may need to adjust for that as well.
SELECT TOP 1
    author,
    title,
    date,
    pages
FROM
    Books
WHERE
    author = @author OR
    title = @title OR
    date = @date OR
    pages = @pages
ORDER BY
    CASE WHEN author = @author THEN 1 ELSE 0 END +
    CASE WHEN title = @title THEN 1 ELSE 0 END +
    CASE WHEN date = @date THEN 1 ELSE 0 END +
    CASE WHEN pages = @pages THEN 1 ELSE 0 END DESC,
    CASE WHEN author = @author THEN 8 ELSE 0 END +
    CASE WHEN title = @title THEN 4 ELSE 0 END +
    CASE WHEN date = @date THEN 2 ELSE 0 END +
    CASE WHEN pages = @pages THEN 1 ELSE 0 END DESC
select id,
    CASE WHEN @pPages = pages
         THEN 1 ELSE 0
    END
    + CASE WHEN @pAuthor = author
           THEN 1 ELSE 0
      END
    /* + Do this for each attribute. If each of your
       attributes is just as important as the others
       (for example matching author is just as good as matching title)
       then leave the values alone; if different matches are more
       important then change the values */
    AS MatchRank
from books
where author = @pAuthor OR
      title = @pTitle OR
      date = @pDate
ORDER BY MatchRank DESC
Edited
When I run this query (modified only to fit one of my own tables) it works fine in SQL2005.
I'd recommend a WHERE clause, but you will want to play around with this to see the performance impact. You will need to use an OR clause, otherwise you will lose potential matches.
In regards to the ORDER BY clause failing to compile:
As recursive said (in a comment), aliases may not be used within expressions in ORDER BY clauses. To get around this I used a subquery which returns the rows, then ordered by the expression in the outer query. In this way I am able to use the aliases in the ORDER BY clause. A little slower but a lot cleaner.
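A sketch of that subquery workaround, applied to Dave's virtual-column idea from above (the alias names are illustrative):
SELECT TOP 1 id
FROM
(
    SELECT id,
           CASE WHEN author = @pAuthor THEN 1 ELSE 0 END AS author_match,
           CASE WHEN title  = @pTitle  THEN 1 ELSE 0 END AS title_match,
           CASE WHEN date   = @pDate   THEN 1 ELSE 0 END AS date_match,
           CASE WHEN pages  = @pPages  THEN 1 ELSE 0 END AS page_match
    FROM books
    WHERE author = @pAuthor OR title = @pTitle OR date = @pDate OR pages = @pPages
) AS matches
ORDER BY (author_match + title_match + date_match + page_match) DESC,
         author_match DESC, title_match DESC, date_match DESC, page_match DESC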
Okay, let me restate my understanding of your question: You want a stored procedure that can take a variable number of parameters and pass back the top row that matches the parameters in the weighted order of preference passed on SQL Server 2005.
Ideally, it will use WHERE clauses to prevent full table scans, take advantage of indices, and "short circuit" the search - you don't want to search all possible combinations if one can be found early. Perhaps we can also allow comparators other than =, such as >= for dates, LIKE for strings, etc.
One possible way is to pass the parameters as XML like in this article and use .Net stored procedures but let's keep it plain vanilla T-SQL for now.
This looks to me like a binary search on the parameters: Search all parameters, then drop the last one, then drop the second last one but include the last one, etc.
Let's pass the parameters as a delimited string since stored procedures don't allow for arrays to be passed as parameters. This will allow us to get a variable number of parameters in to our stored procedure without requiring a stored procedure for each variation of parameters.
In order to allow any sort of comparison, we'll pass the entire WHERE clause list, like so: title like '%something%'
Passing multiple parameters means delimiting them in a string. We'll use the tilde ~ character to delimit the parameters, like this: author = 'Chris Latta'~title like '%something%'~pages >= 100
Then it is simply a matter of doing a binary weighted search for the first row that meets our ordered list of parameters (hopefully the stored procedure with comments is self-explanatory but if not, let me know). Note that you are always guaranteed a result (assuming your table has at least one row) as the last search is parameterless.
Here is the stored procedure code:
CREATE PROCEDURE FirstMatch
    @SearchParams VARCHAR(2000)
AS
BEGIN
    DECLARE @SQLstmt NVARCHAR(2000)
    DECLARE @WhereClause NVARCHAR(2000)
    DECLARE @OrderByClause NVARCHAR(500)
    DECLARE @NumParams INT
    DECLARE @Pos INT
    DECLARE @BinarySearch INT
    DECLARE @Rows INT
    -- Create a temporary table to store our parameters
    CREATE TABLE #params
    (
        BitMask int,             -- Uniquely identifying bit mask
        FieldName VARCHAR(100),  -- The field name for use in the ORDER BY clause
        WhereClause VARCHAR(100) -- The bit to use in the WHERE clause
    )
    -- Temporary table identical to our result set (the books table) so intermediate results aren't output
    CREATE TABLE #junk
    (
        id INT,
        author VARCHAR(50),
        title VARCHAR(50),
        printed DATETIME,
        pages INT
    )
    -- I'll use tilde ~ as the delimiter that separates parameters
    SET @SearchParams = LTRIM(RTRIM(@SearchParams)) + '~'
    SET @Pos = CHARINDEX('~', @SearchParams, 1)
    SET @NumParams = 0
    -- Populate the #params table with the delimited parameters passed
    IF REPLACE(@SearchParams, '~', '') <> ''
    BEGIN
        WHILE @Pos > 0
        BEGIN
            SET @NumParams = @NumParams + 1
            SET @WhereClause = LTRIM(RTRIM(LEFT(@SearchParams, @Pos - 1)))
            IF @WhereClause <> ''
            BEGIN
                -- This assumes your field names don't have spaces and that you leave a space between the field name and the comparator
                INSERT INTO #params (BitMask, FieldName, WhereClause) VALUES (POWER(2, @NumParams - 1), LTRIM(RTRIM(LEFT(@WhereClause, CHARINDEX(' ', @WhereClause, 1) - 1))), @WhereClause)
            END
            SET @SearchParams = RIGHT(@SearchParams, LEN(@SearchParams) - @Pos)
            SET @Pos = CHARINDEX('~', @SearchParams, 1)
        END
    END
    -- Set the binary search to search from all parameters down to one in order of preference
    SET @BinarySearch = POWER(2, @NumParams)
    SET @Rows = 0
    WHILE (@BinarySearch > 0) AND (@Rows = 0)
    BEGIN
        SET @BinarySearch = @BinarySearch - 1
        SET @WhereClause = ' WHERE '
        SET @OrderByClause = ' ORDER BY '
        SELECT @OrderByClause = @OrderByClause + FieldName + ', ' FROM #params WHERE (@BinarySearch & BitMask) = BitMask ORDER BY BitMask
        SET @OrderByClause = LEFT(@OrderByClause, LEN(@OrderByClause) - 1) -- Remove the trailing comma
        SELECT @WhereClause = @WhereClause + WhereClause + ' AND ' FROM #params WHERE (@BinarySearch & BitMask) = BitMask ORDER BY BitMask
        SET @WhereClause = LEFT(@WhereClause, LEN(@WhereClause) - 4) -- Remove the trailing AND
        IF @BinarySearch = 0
        BEGIN
            -- If nothing found so far, return the top row in the order of the parameters' fields
            SET @WhereClause = ''
            -- Use the full order sequence of fields to return the results
            SET @OrderByClause = ' ORDER BY '
            SELECT @OrderByClause = @OrderByClause + FieldName + ', ' FROM #params ORDER BY BitMask
            SET @OrderByClause = LEFT(@OrderByClause, LEN(@OrderByClause) - 1) -- Remove the trailing comma
        END
        -- Find out if there are any results for this search
        SET @SQLstmt = 'SELECT TOP 1 id, author, title, printed, pages INTO #junk FROM books' + @WhereClause + @OrderByClause
        Exec (@SQLstmt)
        SET @Rows = @@ROWCOUNT
    END
    -- Stop the result set being eaten by the junk table
    SET @SQLstmt = REPLACE(@SQLstmt, 'INTO #junk ', '')
    -- Uncomment the next line to see the SQL you are producing
    --PRINT @SQLstmt
    -- This gives the result set
    Exec (@SQLstmt)
END
This stored procedure is called like so:
FirstMatch 'author = ''Chris Latta''~pages > 100~title like ''%something%'''
There you have it - a fully expandable, optimised search for the top result in weighted order of preference. This was an interesting problem and shows just what you can pull off with native T-SQL.
A couple of small issues with this:
it relies on the caller to know that they must leave a space after the field name for the parameter to work properly
you can't have field names with spaces in them - fixable with some effort
it assumes that the relevant sort order is always ascending
the next programmer that has to look at this procedure will think you're insane :)
Try this:
ALTER PROCEDURE match
    @pAuthor varchar(100)
    , @pTitle varchar(100)
    , @pDate varchar(100)
    , @pPages varchar(100)
-- exec match 'a title', 'b author', '1/1/2007', 15
AS
SELECT id,
    CASE WHEN author = @pAuthor THEN 1 ELSE 0 END
    + CASE WHEN title = @pTitle THEN 1 ELSE 0 END
    + CASE WHEN bookdate = @pDate THEN 1 ELSE 0 END
    + CASE WHEN pages = @pPages THEN 1 ELSE 0 END AS matches,
    CASE WHEN author = @pAuthor THEN 4 ELSE 0 END
    + CASE WHEN title = @pTitle THEN 3 ELSE 0 END
    + CASE WHEN bookdate = @pDate THEN 2 ELSE 0 END
    + CASE WHEN pages = @pPages THEN 1 ELSE 0 END AS score
FROM books
WHERE author = @pAuthor
   OR title = @pTitle
   OR bookdate = @pDate
   OR pages = @pPages
ORDER BY matches DESC, score DESC
However, this of course causes a table scan. You can avoid that by making it a union of a CTE and 4 WHERE clauses, one for each property - there will be duplicates, but you can just take the TOP 1 anyway.
EDIT: Added the WHERE ... OR clause. I'd feel more comfortable if it were
SELECT ... FROM books WHERE author = @pAuthor
UNION
SELECT ... FROM books WHERE title = @pTitle
UNION
...
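A fuller sketch of that UNION idea, combined with the match/score ordering above (the column list is filled in only for illustration):
SELECT TOP 1 id, author, title, bookdate, pages
FROM
(
    SELECT id, author, title, bookdate, pages FROM books WHERE author = @pAuthor
    UNION
    SELECT id, author, title, bookdate, pages FROM books WHERE title = @pTitle
    UNION
    SELECT id, author, title, bookdate, pages FROM books WHERE bookdate = @pDate
    UNION
    SELECT id, author, title, bookdate, pages FROM books WHERE pages = @pPages
) AS candidates
ORDER BY
    CASE WHEN author = @pAuthor THEN 1 ELSE 0 END +
    CASE WHEN title = @pTitle THEN 1 ELSE 0 END +
    CASE WHEN bookdate = @pDate THEN 1 ELSE 0 END +
    CASE WHEN pages = @pPages THEN 1 ELSE 0 END DESC,
    CASE WHEN author = @pAuthor THEN 8 ELSE 0 END +
    CASE WHEN title = @pTitle THEN 4 ELSE 0 END +
    CASE WHEN bookdate = @pDate THEN 2 ELSE 0 END +
    CASE WHEN pages = @pPages THEN 1 ELSE 0 END DESC
Each branch of the UNION can seek its own index, and the duplicates mentioned above are removed by UNION before TOP 1 picks the best-scoring row.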