Multi-column filtering in a stored procedure in SQL Server

My Windows Forms application has about 10 controls (text boxes, dropdowns, radio buttons) for filtering data. None of them is mandatory, so the user may filter with one control or with several.
I now have to create a stored procedure that filters the data based on those inputs.
Example: if the user enters text in one textbox and leaves the remaining 9 controls empty, I have to filter on that textbox only.
If the user fills in one textbox and one dropdown and leaves the remaining 8 controls empty, I have to filter on only that textbox and that dropdown.
What am I supposed to do?
In source code:
If the user entered or selected a value in a control, I pass it as a parameter; otherwise I pass null for that parameter.
In stored procedure:
I declared a parameter for each of the 10 controls to receive the values from the source code, and I filter the data based on those parameters.
IF @Param1 IS NULL AND @Param2 IS NULL AND @Param3 = 'SomeText'
BEGIN
SELECT * FROM Table1 WHERE TableCOLUMN3 = @Param3
END
IF @Param1 IS NULL AND @Param2 = 'SomeText' AND @Param3 = 'SomeText'
BEGIN
SELECT * FROM Table1 WHERE TableCOLUMN2 = @Param2 AND TableCOLUMN3 = @Param3
END
Note: each parameter filters one table column: @Param1 maps to TableCOLUMN1, @Param2 to TableCOLUMN2, and so on. The filter varies depending on which parameters have values.
If I do it like this, my stored procedure will become enormous, because I have 10 parameters to check (the sample above shows only 3 for reference).
What I want is:
Of the 10 parameters, only those that have a value (some text other than NULL) should be used in the WHERE condition.
Is there any other way to do this?

As long as you make your parameters default to NULL, and either omit the ones you don't need or pass DBNull for them, you can filter like this:
CREATE PROC dbo.SAMPLE
(
@Param1 VARCHAR(255) = NULL,
@Param2 VARCHAR(255) = NULL,
@Param3 VARCHAR(255) = NULL,
@Param4 VARCHAR(255) = NULL,
@Param5 VARCHAR(255) = NULL,
@Param6 VARCHAR(255) = NULL
)
AS
BEGIN
SELECT *
FROM Table1
WHERE (@Param1 IS NULL OR TableCOLUMN1 = @Param1)
AND (@Param2 IS NULL OR TableCOLUMN2 = @Param2)
AND (@Param3 IS NULL OR TableCOLUMN3 = @Param3)
AND (@Param4 IS NULL OR TableCOLUMN4 = @Param4)
AND (@Param5 IS NULL OR TableCOLUMN5 = @Param5)
AND (@Param6 IS NULL OR TableCOLUMN6 = @Param6)
OPTION (RECOMPILE) -- as JamesZ suggested, so a cached plan is not reused
END
EXEC dbo.SAMPLE @Param2 = 'SomeText' -- only filter where TableCOLUMN2 = @Param2
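The "parameter IS NULL OR column = parameter" pattern above can be tried out without a SQL Server instance. The following is a minimal sketch, assuming SQLite (through Python's standard `sqlite3` module) as a stand-in for SQL Server; the table `table1`, its columns, and the helper `filter_rows` are hypothetical names chosen for illustration.

```python
import sqlite3


def filter_rows(conn, col1=None, col2=None, col3=None):
    """Return the ids of rows matching every filter that is not None."""
    sql = """
        SELECT id
        FROM table1
        WHERE (:p1 IS NULL OR col1 = :p1)
          AND (:p2 IS NULL OR col2 = :p2)
          AND (:p3 IS NULL OR col3 = :p3)
        ORDER BY id
    """
    rows = conn.execute(sql, {"p1": col1, "p2": col2, "p3": col3}).fetchall()
    return [row[0] for row in rows]


# Hypothetical sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT, col3 TEXT)"
)
conn.executemany(
    "INSERT INTO table1 (col1, col2, col3) VALUES (?, ?, ?)",
    [("a", "x", "red"), ("a", "y", "blue"), ("b", "x", "red")],
)

print(filter_rows(conn))                        # no filter supplied
print(filter_rows(conn, col2="x"))              # one filter
print(filter_rows(conn, col1="a", col3="red"))  # two filters
```

Each unused filter is passed as `None` (SQL NULL), so its condition collapses to true and only the supplied filters restrict the result.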

I would suggest something like this:
SELECT *
FROM TABLE1
WHERE TableCOLUMN1=ISNULL(@Param1,TableCOLUMN1)
AND TableCOLUMN2=ISNULL(@Param2,TableCOLUMN2)
AND TableCOLUMN3=ISNULL(@Param3,TableCOLUMN3)
AND TableCOLUMN4=ISNULL(@Param4,TableCOLUMN4)
... and so on...
This filters TableCOLUMN1 on a value if you specify @Param1; otherwise it compares the column with itself, which is true for every row where the column is not NULL.
But this only works if the @Param values you don't use are NULL.
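One caveat of comparing a column with itself is worth seeing in action: a row whose column is NULL never satisfies `column = column`, so it is dropped even when no filter is supplied. The sketch below assumes SQLite via Python's `sqlite3` module as a stand-in (its `IFNULL` plays the role of T-SQL `ISNULL`); the table `t` and its data are hypothetical. SQL Server follows the same three-valued logic under its default ANSI_NULLS setting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, col1 TEXT)")
# Row 2 has a NULL in the filtered column.
conn.executemany("INSERT INTO t (col1) VALUES (?)", [("a",), (None,)])

param = None  # the filter was not supplied

# col1 = IFNULL(NULL, col1) becomes col1 = col1, which is not true when col1 is NULL.
isnull_style = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM t WHERE col1 = IFNULL(?, col1) ORDER BY id", (param,)
    )
]

# The explicit "parameter IS NULL OR ..." form keeps the NULL row.
or_style = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM t WHERE (? IS NULL OR col1 = ?) ORDER BY id", (param, param)
    )
]

print(isnull_style, or_style)
```

If the filtered columns are declared NOT NULL, the two forms return the same rows.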

If the table is big and you need indexes to fetch the rows, the problem with this kind of logic is that indexes can't really be used. There are basically two ways to deal with that:
Add OPTION (RECOMPILE) to the end of the SELECT statement from @Ionic or @user1221684. This causes the statement to be recompiled every time it is executed, which might be a lot of CPU overhead if it's called often.
Create dynamic SQL and call it using sp_executesql.
Example:
DECLARE @sql nvarchar(max)
SET @sql = N'SELECT * FROM TABLE1 WHERE '
IF (@Param1 IS NOT NULL) SET @sql = @sql + N'TableCOLUMN1=@Param1 AND '
IF (@Param2 IS NOT NULL) SET @sql = @sql + N'TableCOLUMN2=@Param2 AND '
IF (@Param3 IS NOT NULL) SET @sql = @sql + N'TableCOLUMN3=@Param3 AND '
-- Note: you're not concatenating the value of the parameter, just its name
SET @sql = @sql + N'1=1' -- this terminates the trailing 'AND'
EXEC sp_executesql @sql,
N'@Param1 varchar(10), @Param2 varchar(10), @Param3 varchar(10)',
@Param1, @Param2, @Param3
As an extra option, you could mix your original idea with the fully dynamic one, so that at least the most common search criteria are handled by dedicated statements and can be fetched efficiently.
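The same idea (build the statement from only the supplied filters, but bind the values as parameters instead of concatenating them) can be sketched on the client side. This is an illustration under assumptions: SQLite via Python's `sqlite3` stands in for SQL Server, and `ALLOWED_COLUMNS`, `build_query`, `search`, and the table layout are hypothetical names.

```python
import sqlite3

# Column names come from a fixed list, never from user input.
ALLOWED_COLUMNS = ("col1", "col2", "col3")


def build_query(filters):
    """Build SQL text plus a parameter list from the filters that are not None."""
    clauses, params = [], []
    for column in ALLOWED_COLUMNS:
        value = filters.get(column)
        if value is not None:
            clauses.append(f"{column} = ?")  # only a placeholder goes into the text
            params.append(value)             # the value travels separately
    sql = "SELECT id FROM table1"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql + " ORDER BY id", params


def search(conn, filters):
    sql, params = build_query(filters)
    return [row[0] for row in conn.execute(sql, params)]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT, col3 TEXT)"
)
conn.executemany(
    "INSERT INTO table1 (col1, col2, col3) VALUES (?, ?, ?)",
    [("a", "x", "red"), ("a", "y", "blue"), ("b", "x", "red")],
)

print(build_query({"col2": "x"}))
print(search(conn, {"col1": "a", "col3": "red"}))
```

Because each combination of filters produces its own statement text, the engine can plan each shape separately, which is the point of the dynamic approach.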

Normally every parameter will have a default value in the calling code; for example, an unset int is zero. You can use that value as the condition for skipping a filter. See the example SQL below.
create procedure [dbo].[sp_report_test](
@pParam1 int,
@pParam2 int ,
@pParam3 int,
@pParam4 varchar(50)
)
AS
SELECT
*
FROM [vw_report]
where
(@pParam1 <= 0 or Column1 = @pParam1) and
(@pParam2 <= 0 or Column2 = @pParam2) and
(@pParam3 <= 0 or Column3 = @pParam3) and
(@pParam4 is null or len(@pParam4) <= 0 or Column4 = @pParam4);
GO

Related

Multiple value parameter only showing first value

I have an SQL Server stored procedure that I'm trying to make an SSRS report from. The sp has a parameter in it. The parameter works in SSRS when calling a single value. However, when calling both values, only the first one is returned.
Here's something important: the 'Active' field is a bit data type, so this is most likely where the problem lies.
ALTER PROCEDURE [dbo].[USP_RptRC]
@Active VARCHAR(1)
AS
BEGIN
SELECT [Funding]
,[4thChar]
,REPLACE([Description], CHAR(13) + CHAR(10), '') [Description]
,[Comments]
,CAST(Active AS VARCHAR(1)) Active
,[IsDeleted]
,[LastModifiedBy]
,[LastModifiedDate]
FROM [RC]
WHERE Active IN (@Active)
END
First your parameter is only a length of 1. Second, since you are using 'Where In' you will need to do a little extra work. I had the same issue in the past and this article helped me get it working.
https://munishbansal.wordpress.com/2008/12/29/passing-multi-value-parameter-in-stored-procedure-ssrs-report/
Since @Active is a VARCHAR(1), which could be a CHAR or INT if you are always using 1 or 0, you can only pass in one of the values. If you want the ability to return where Active is 1 or 0, then do this:
@Active VARCHAR(1) = NULL
AS
BEGIN
SELECT [Funding]
,[4thChar]
,REPLACE([Description], CHAR(13) + CHAR(10), '') [Description]
,[Comments]
,CAST(Active AS VARCHAR(1)) Active
,[IsDeleted]
,[LastModifiedBy]
,[LastModifiedDate]
FROM [RC]
WHERE @Active IS NULL OR Active = @Active
Then in your SSRS report, just set the default value of your parameter to NULL, and/or in the Parameter settings, allow NULL values along with your explicit choices of 1 and 0. When NULL is selected, both 1 and 0 will be returned. Of course you can label the parameter choice in SSRS to "All" or "Inactive & Active" and pass NULL as the actual value.
T-SQL wise, '0,1' is not two values; it is one, and in the context where you use it, it is a single string value. But since you limited the parameter length to 1, the engine truncates everything after the first character, converting '0,1' into '0'. If you pass '1,whatever' you'll get '1' in your stored procedure. That's why "only the first one is returned".
But since Active is a bit column, you should declare @Active as bit. Then change your logic to include the appropriate subset of records when you receive NULL for @Active, like so:
ALTER PROCEDURE [dbo].[USP_RptRC]
@Active BIT
AS
BEGIN
SELECT [Funding]
,[4thChar]
,REPLACE([Description], CHAR(13) + CHAR(10), '') [Description]
,[Comments]
,CAST(Active AS VARCHAR(1)) Active
,[IsDeleted]
,[LastModifiedBy]
,[LastModifiedDate]
FROM [RC]
WHERE 1 = CASE WHEN @Active IS NULL THEN 1 ELSE
CASE WHEN Active = @Active THEN 1 ELSE -1 END
END
END

Building dynamic WHERE clause in stored procedure

I'm using SQL Server 2008 Express, and I have a stored procedure that does a SELECT from a table, based on parameters. I have nvarchar parameters and int parameters.
Here is my problem, my where clause looks like this:
WHERE [companies_SimpleList].[Description] Like @What
AND companies_SimpleList.Keywords Like @Keywords
AND companies_SimpleList.FullAdress Like @Where
AND companies_SimpleList.ActivityId = @ActivityId
AND companies_SimpleList.DepartementId = @DepartementId
AND companies_SimpleList.CityId = @CityId
These parameters are the filter values set by the user of my ASP.NET MVC 3 application. The int parameters may not be set, in which case their value will be 0. This is my problem: the stored procedure will then search for items that have 0 as CityId, for example, and so it returns a wrong result. So it would be nice to have a dynamic WHERE clause that depends on whether the value of each int parameter is greater than 0.
Thanks in advance
Try this instead:
WHERE 1 = 1
AND (@What IS NULL OR [companies_SimpleList].[Description] Like @What)
AND (@Keywords IS NULL OR companies_SimpleList.Keywords Like @Keywords)
AND (@Where IS NULL OR companies_SimpleList.FullAdress Like @Where)
...
If any of the parameters (@What, @Where, ...) is passed to the stored procedure as NULL, its condition is ignored. For the int parameters you can use 0 instead of NULL as the test value; the test then becomes something like @CityId = 0 OR ...
Try something like:
AND (companies_SimpleList.CityId = @CityId OR @CityId = 0)
Here is another easy-to-read solution for SQL Server >=2008
CREATE PROCEDURE FindEmployee
@Id INT = NULL,
@SecondName NVARCHAR(MAX) = NULL
AS
SELECT Employees.Id AS "Id",
Employees.FirstName AS "FirstName",
Employees.SecondName AS "SecondName"
FROM Employees
WHERE Employees.Id = COALESCE(@Id, Employees.Id)
AND Employees.SecondName LIKE COALESCE(@SecondName, Employees.SecondName) + '%'

SQL Data Filtering approach

I have a stored procedure that receives 3 parameters that are used to dynamically filter the result set
create proc MyProc
@Parameter1 int,
@Parameter2 int,
@Parameter3 int
as
select * from My_table
where
1 = case when @Parameter1 = 0 then 1 when @Parameter1 = Column1 then 1 else 0 end
and
1 = case when @Parameter2 = 0 then 1 when @Parameter2 = Column2 then 1 else 0 end
and
1 = case when @Parameter3 = 0 then 1 when @Parameter3 = Column3 then 1 else 0 end
return
The values passed for each parameter can be 0 (for all items) or non-zero for items matching on specific column.
I may have upwards of 20 parameters (example shown only has 3). Is there a more elegant approach to allow this to scale when the database gets large?
I am using something similar to your idea:
select *
from TableA
where
(@Param1 is null or Column1 = @Param1)
AND (@Param2 is null or Column2 = @Param2)
AND (@Param3 is null or Column3 = @Param3)
It is generally the same, but I used NULL as the neutral value instead of 0. It is more flexible in the sense that the data type of the @Param variables doesn't matter.
I use a slightly different method from some of the ones listed above, and I've not noticed any performance hit. Instead of passing in 0 for no filter, I would use NULL and make it the default value of the parameter.
Giving the parameters default values makes them optional, which lends itself to better readability when you're calling the procedure.
create proc myProc
@Parameter1 int = null,
@Parameter2 int = null,
@Parameter3 int = null
AS
select
*
from
TableA
where
column1 = coalesce(@Parameter1,column1)
and
column2 = coalesce(@Parameter2, column2)
and
column3 = coalesce(@Parameter3,column3)
That said, I may well try out the dynamic SQL method next time to see if I notice any performance difference.
Unfortunately, dynamic SQL is the best solution performance- and stability-wise. While all the other commonly used methods (@param IS NULL OR Col = @param, COALESCE, CASE...) work, they are unpredictable and unreliable: the execution plan can (and will) vary between executions, and you may find yourself spending many hours trying to figure out why your query performs really badly when it was working fine yesterday.

creating SQL command to return match or else everything else

I have three checkboxes in my application. If the user ticks a combination of the boxes, I want to return matches for the boxes ticked, and where a box is not checked I just want to return everything. Can I do this with a single SQL command?
I recommend doing the following in the WHERE clause;
...
AND (@OnlyNotApproved = 0 OR ApprovedDate IS NULL)
It is not one SQL command, but works very well for me. Basically the first part checks if the switch is set (checkbox selected). The second is the filter given the checkbox is selected. Here you can do whatever you would normally do.
You can build a SQL statement with a dynamic where clause:
string query = "SELECT * FROM TheTable WHERE 1=1 ";
if (checkBlackOnly.Checked)
query += "AND Color = 'Black' ";
if (checkWhiteOnly.Checked)
query += "AND Color = 'White' ";
Or you can create a stored procedure with variables to do this:
CREATE PROCEDURE dbo.GetList
@CheckBlackOnly bit
, @CheckWhiteOnly bit
AS
SELECT *
FROM TheTable
WHERE
(@CheckBlackOnly = 0 or (@CheckBlackOnly = 1 AND Color = 'Black'))
AND (@CheckWhiteOnly = 0 or (@CheckWhiteOnly = 1 AND Color = 'White'))
....
Sure. The example below assumes SQL Server, but you get the gist.
You could do it pretty easily using some dynamic SQL.
Let's say you were passing your checkboxes to a sproc as bit values.
DECLARE @cb1 bit
DECLARE @cb2 bit
DECLARE @cb3 bit
DECLARE @whereClause nvarchar(max)
SET @whereClause = '' -- start from an empty string; NULL + text would stay NULL
IF (@cb1 = 1)
SET @whereClause = @whereClause + ' AND col1 = 1'
IF (@cb2 = 1)
SET @whereClause = @whereClause + ' AND col2 = 1'
IF (@cb3 = 1)
SET @whereClause = @whereClause + ' AND col3 = 1'
DECLARE @sql nvarchar(max)
SET @sql = 'SELECT * FROM [Table] WHERE 1 = 1' + @whereClause
EXEC (@sql)
Sure you can.
If you compose your SQL SELECT statement in code, then you just have to generate one of two forms.
If nothing or everything is selected (check this in your language), you issue the unfiltered version:
SELECT ... FROM ...
If some checkboxes are checked, you add a WHERE clause to it:
SELECT ... FROM ... WHERE MyTypeID IN (3, 5, 7)
This is a single SQL command, but it differs depending on the selection, of course.
Now, if you would like to use one stored procedure to do the job, the implementation depends on the database engine, since you need to be able to pass multiple values. I would discourage using a procedure with just 3 plain parameters, because when you add another checkbox you will have to change the SQL procedure as well.
SELECT *
FROM table
WHERE value IN
(
SELECT option
FROM checked_options
UNION ALL
SELECT option
FROM all_options
WHERE NOT EXISTS (
SELECT 1
FROM checked_options
)
)
The inner subquery will return either the list of the checked options, or all possible options if the list is empty.
For MySQL, it will be better to use this:
SELECT *
FROM t_data
WHERE EXISTS (
SELECT 1
FROM t_checked
WHERE session = 2
)
AND opt IN
(
SELECT opt
FROM t_checked
WHERE session = 2
)
UNION ALL
SELECT *
FROM t_data
WHERE NOT EXISTS (
SELECT 1
FROM t_checked
WHERE session = 2
)
MySQL will notice IMPOSSIBLE WHERE on either of the SELECT's, and will execute only the appropriate one.
See this entry in my blog for performance detail:
Selecting options
If you pass a null into the appropriate values, then it will compare that specific column against itself. If you pass a value, it will compare the column against the value
CREATE PROCEDURE MyCommand
(
@Check1 BIT = NULL,
@Check2 BIT = NULL,
@Check3 BIT = NULL
)
AS
SELECT *
FROM Table
WHERE Column1 = ISNULL(@Check1, Column1)
AND Column2 = ISNULL(@Check2, Column2)
AND Column3 = ISNULL(@Check3, Column3)
The question did not specify a DB product or programming language. However it can be done with ANSI SQL in a cross-product manner.
Assuming a programming language that uses $var$ for variable insertion on strings.
On the server you get all selected values in a list, so if the first two boxes are selected you would have a GET/POST variable like
http://url?colors=black,white
so you build a query like this (pseudocode)
colors = POST['colors'];
colors_list = replace(colors, ',', "','"); // separate colors with single-quotes
sql = "WHERE ('$colors$' = '') OR (color IN ('$colors_list$'));";
and your DB will see:
WHERE ('black,white' = '') OR (color IN ('black','white')); -- some selections
WHERE ('' = '') OR (color IN ('')); -- nothing selected (matches all rows)
Which is a valid SQL query. The first condition matches any row when nothing is selected, otherwise the right side of the OR statement will match any row that is one of the colors. This query scales to an unlimited number of options without modification. The brackets around each clause are optional as well but I use them for clarity.
Naturally you will need to protect the string from SQL injection using parameters or escaping as you see fit. Otherwise a malicious value for colors will allow your DB to be attacked.
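To make the injection point concrete, here is a minimal sketch of the same "IN list or everything" query with one bound placeholder per selected value, so the values never become part of the SQL text. It assumes SQLite via Python's `sqlite3` module; the `items` table, its data, and the helper `rows_for_colors` are hypothetical.

```python
import sqlite3


def rows_for_colors(conn, colors):
    """Return ids for the selected colors, or every id when nothing is selected."""
    if not colors:
        return [row[0] for row in conn.execute("SELECT id FROM items ORDER BY id")]
    # One "?" per selected value; only placeholders are interpolated into the text.
    placeholders = ", ".join("?" for _ in colors)
    sql = f"SELECT id FROM items WHERE color IN ({placeholders}) ORDER BY id"
    return [row[0] for row in conn.execute(sql, list(colors))]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, color TEXT)")
conn.executemany(
    "INSERT INTO items (color) VALUES (?)", [("black",), ("white",), ("red",)]
)

print(rows_for_colors(conn, ["black", "white"]))
print(rows_for_colors(conn, []))
# A hostile value is compared as plain data and matches nothing.
print(rows_for_colors(conn, ["black' OR '1'='1"]))
```

The query still scales to any number of options, because the placeholder list is generated from the length of the selection.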

SQL Precedence Matching

I'm trying to do precedence matching on a table within a stored procedure. The requirements are a bit tricky to explain, but hopefully this will make sense. Let's say we have a table called books, with id, author, title, date, and pages fields.
We also have a stored procedure that will match a query with ONE row in the table.
Here is the proc's signature:
create procedure match
@pAuthor varchar(100)
,@pTitle varchar(100)
,@pDate varchar(100)
,@pPages varchar(100)
as
...
The precedence rules are as follows:
First, try and match on all 4 parameters. If we find a match return.
Next try to match using any 3 parameters. The 1st parameter has the highest precedence here and the 4th the lowest. If we find any matches return the match.
Next we check if any two parameters match and finally if any one matches (still following the parameter order's precedence rules).
I have implemented this case-by-case. Eg:
select @lvId = id
from books
where
author = @pAuthor
and title = @pTitle
and date = @pDate
and pages = @pPages
if @@rowCount = 1 begin
select @lvId
return
end
select @lvId = id
from books
where
author = @pAuthor
and title = @pTitle
and date = @pDate
if @@rowCount = 1 begin
select @lvId
return
end
....
However, for each new column in the table, the number of individual checks roughly doubles. I would really like to generalize this to X number of columns; however, I'm having trouble coming up with a scheme.
Thanks for the read, and I can provide any additional information needed.
Added:
Dave and others, I tried implementing your code and it is choking on the first ORDER BY clause, where we add all the counts. It's giving me an invalid column name error. When I comment out the total count and order by just the individual aliases, the proc compiles fine.
Anyone have any ideas?
This is in Microsoft SQL Server 2005.
I believe that the answers you're working on are the simplest by far. But I also believe that in SQL Server they will always be full table scans. (In Oracle you could use bitmap indexes if the table didn't undergo a lot of simultaneous DML.)
A more complex solution but a much more performant one would be to build your own index. Not a SQL Server index, but your own.
Create a table (Hash-index) with 3 columns (lookup-hash, rank, Rowid)
Say you have 3 columns to search on. A, B, C
For every row added to Books you'll insert 7 rows into hash_index either via a trigger or CRUD proc.
First you'll
insert into hash_index
SELECT HASH(A & B & C), 7 , ROWID
FROM Books
Where & is the concatenation operator and HASH is a function
then you'll insert hashes for A & B, A & C and B & C.
You now have some flexibility you can give them all the same rank or if A & B are a superior match to B & C you can give them a higher rank.
And then insert hashes for A by itself, and for B and C, with the same choice of rank: all the same number or all different. You can even say that a match on A is a higher choice than a match on B & C. This solution gives you a lot of flexibility.
Of course, this will add a lot of INSERT overhead, but if DML on Books is low or its performance is not relevant, you're fine.
Now when you go to search, you'll create a function that returns a table of hashes for your @A, @B and @C. You'll have a small table of 7 values that you join to the lookup-hash column in the hash-index table. This will give you every possible match and possibly some false matches (that's just the nature of hashes). You'll take that result and order it descending on the rank column. Then take the first rowid back to the Books table and make sure that all of the values of @A, @B and @C are actually in that row. On the off chance they're not and you've been handed a false positive, you'll need to check the next rowid.
Each of the operations in this "roll your own" approach is very fast.
Hashing your 3 values into a small 7 row table variable = very fast.
joining them on an index in your Hash_index table = very fast index lookups
Loop over result set will result in 1 or maybe 2 or 3 table access by rowid = very fast
Of course, all of these together could be slower than an FTS... But an FTS will continue to get slower and slower. There will be a size which the FTS is slower than this. You'll have to play with it.
I don't have time to write out the query, but I think this idea would work.
For your predicate, use "author = @pAuthor OR title = @pTitle ...", so you get all candidate rows.
Use CASE expressions or whatever you like to create virtual columns in the result set, like:
SELECT CASE WHEN author = @pAuthor THEN 1 ELSE 0 END author_match,
...
Then add this order by and get the first row returned:
ORDER BY (author_match+title_match+date_match+page_match) DESC,
author_match DESC,
title_match DESC,
date_match DESC,
page_match DESC
You still need to extend it for each new column, but only a little bit.
You don't explain what should happen if more than one result matches any given set of parameters that is reached, so you will need to change this to account for those business rules. Right now I've set it to return books that match on later parameters ahead of those that don't. For example, a match on author, title, and pages would come before one that just matches on author and title.
Your RDBMS may have a different way of handling "TOP", so you may need to adjust for that as well.
SELECT TOP 1
author,
title,
date,
pages
FROM
Books
WHERE
author = @author OR
title = @title OR
date = @date OR
pages = @pages
ORDER BY
CASE WHEN author = @author THEN 1 ELSE 0 END +
CASE WHEN title = @title THEN 1 ELSE 0 END +
CASE WHEN date = @date THEN 1 ELSE 0 END +
CASE WHEN pages = @pages THEN 1 ELSE 0 END DESC,
CASE WHEN author = @author THEN 8 ELSE 0 END +
CASE WHEN title = @title THEN 4 ELSE 0 END +
CASE WHEN date = @date THEN 2 ELSE 0 END +
CASE WHEN pages = @pages THEN 1 ELSE 0 END DESC
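The two-level ordering above (number of matching columns first, then a weighted tie-break of 8/4/2/1) can be checked on a small data set. This is a sketch under assumptions: SQLite via Python's `sqlite3` module stands in for SQL Server, so `LIMIT 1` replaces `TOP 1`; the column `pub_date`, the sample rows, and the helper `best_match` are hypothetical.

```python
import sqlite3

RANKED_QUERY = """
    SELECT id
    FROM books
    WHERE author = :a OR title = :t OR pub_date = :d OR pages = :p
    ORDER BY
        -- first key: how many columns match
        CASE WHEN author = :a THEN 1 ELSE 0 END +
        CASE WHEN title = :t THEN 1 ELSE 0 END +
        CASE WHEN pub_date = :d THEN 1 ELSE 0 END +
        CASE WHEN pages = :p THEN 1 ELSE 0 END DESC,
        -- second key: earlier parameters outweigh later ones
        CASE WHEN author = :a THEN 8 ELSE 0 END +
        CASE WHEN title = :t THEN 4 ELSE 0 END +
        CASE WHEN pub_date = :d THEN 2 ELSE 0 END +
        CASE WHEN pages = :p THEN 1 ELSE 0 END DESC
    LIMIT 1
"""


def best_match(conn, author, title, pub_date, pages):
    """Return the id of the best-ranked book, or None if nothing matches."""
    row = conn.execute(
        RANKED_QUERY, {"a": author, "t": title, "d": pub_date, "p": pages}
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE books (id INTEGER PRIMARY KEY, author TEXT, title TEXT,"
    " pub_date TEXT, pages INTEGER)"
)
conn.executemany(
    "INSERT INTO books (author, title, pub_date, pages) VALUES (?, ?, ?, ?)",
    [("Ann", "Alpha", "2001", 100), ("Ann", "Beta", "2002", 200), ("Bob", "Alpha", "2001", 300)],
)

# Books 1 and 3 both match on three columns; book 1 wins the tie via the author weight.
print(best_match(conn, "Ann", "Alpha", "2001", 300))
```

Adding a column means adding one term to the WHERE clause and one to each ORDER BY sum, rather than doubling the number of separate queries.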
select id,
CASE WHEN @pPages = pages
THEN 1 ELSE 0
END
+ CASE WHEN @pAuthor = author
THEN 1 ELSE 0
END
/* + Do this for each attribute. If each of your
attributes is just as important as the others,
for example matching author is just as good as matching title, then
leave the values alone; if different matches are more
important, then change the values */ AS MatchRank
from books
where author = @pAuthor OR
title = @pTitle OR
date = @pDate
ORDER BY MatchRank DESC
Edited
When I run this query (modified only to fit one of my own tables) it works fine in SQL2005.
I'd recommend a WHERE clause, but you will want to play around with this to see the performance impact. You will need to use OR conditions, otherwise you will lose potential matches.
Regarding the ORDER BY clause failing to compile:
As recursive said (in a comment), aliases may not be used within expressions in ORDER BY clauses. To get around this, I used a subquery which returned the rows, then ordered them in the outer query. In this way I am able to use the aliases in the ORDER BY clause. A little slower but a lot cleaner.
Okay, let me restate my understanding of your question: You want a stored procedure that can take a variable number of parameters and pass back the top row that matches the parameters in the weighted order of preference passed on SQL Server 2005.
Ideally, it will use WHERE clauses to prevent full table scans, take advantage of indexes, and "short circuit" the search: you don't want to search all possible combinations if a match can be found early. Perhaps we can also allow comparators other than =, such as >= for dates, LIKE for strings, etc.
One possible way is to pass the parameters as XML like in this article and use .Net stored procedures but let's keep it plain vanilla T-SQL for now.
This looks to me like a binary search on the parameters: Search all parameters, then drop the last one, then drop the second last one but include the last one, etc.
Let's pass the parameters as a delimited string since stored procedures don't allow for arrays to be passed as parameters. This will allow us to get a variable number of parameters in to our stored procedure without requiring a stored procedure for each variation of parameters.
In order to allow any sort of comparison, we'll pass the entire WHERE clause list, like so: title like '%something%'
Passing multiple parameters means delimiting them in a string. We'll use the tilde ~ character to delimit the parameters, like this: author = 'Chris Latta'~title like '%something%'~pages >= 100
Then it is simply a matter of doing a binary weighted search for the first row that meets our ordered list of parameters (hopefully the stored procedure with comments is self-explanatory but if not, let me know). Note that you are always guaranteed a result (assuming your table has at least one row) as the last search is parameterless.
Here is the stored procedure code:
CREATE PROCEDURE FirstMatch
@SearchParams VARCHAR(2000)
AS
BEGIN
DECLARE @SQLstmt NVARCHAR(2000)
DECLARE @WhereClause NVARCHAR(2000)
DECLARE @OrderByClause NVARCHAR(500)
DECLARE @NumParams INT
DECLARE @Pos INT
DECLARE @BinarySearch INT
DECLARE @Rows INT
-- Create a temporary table to store our parameters
CREATE TABLE #params
(
BitMask int, -- Uniquely identifying bit mask
FieldName VARCHAR(100), -- The field name for use in the ORDER BY clause
WhereClause VARCHAR(100) -- The bit to use in the WHERE clause
)
-- Temporary table identical to our result set (the books table) so intermediate results aren't output
CREATE TABLE #junk
(
id INT,
author VARCHAR(50),
title VARCHAR(50),
printed DATETIME,
pages INT
)
-- I'll use tilde ~ as the delimiter that separates parameters
SET @SearchParams = LTRIM(RTRIM(@SearchParams))+ '~'
SET @Pos = CHARINDEX('~', @SearchParams, 1)
SET @NumParams = 0
-- Populate the #params table with the delimited parameters passed
IF REPLACE(@SearchParams, '~', '') <> ''
BEGIN
WHILE @Pos > 0
BEGIN
SET @NumParams = @NumParams + 1
SET @WhereClause = LTRIM(RTRIM(LEFT(@SearchParams, @Pos - 1)))
IF @WhereClause <> ''
BEGIN
-- This assumes your field names don't have spaces and that you leave a space between the field name and the comparator
INSERT INTO #params (BitMask, FieldName, WhereClause) VALUES (POWER(2, @NumParams - 1), LTRIM(RTRIM(LEFT(@WhereClause, CHARINDEX(' ', @WhereClause, 1) - 1))), @WhereClause)
END
SET @SearchParams = RIGHT(@SearchParams, LEN(@SearchParams) - @Pos)
SET @Pos = CHARINDEX('~', @SearchParams, 1)
END
END
-- Set the binary search to search from all parameters down to one in order of preference
SET @BinarySearch = POWER(2, @NumParams)
SET @Rows = 0
WHILE (@BinarySearch > 0) AND (@Rows = 0)
BEGIN
SET @BinarySearch = @BinarySearch - 1
SET @WhereClause = ' WHERE '
SET @OrderByClause = ' ORDER BY '
SELECT @OrderByClause = @OrderByClause + FieldName + ', ' FROM #params WHERE (@BinarySearch & BitMask) = BitMask ORDER BY BitMask
SET @OrderByClause = LEFT(@OrderByClause, LEN(@OrderByClause) - 1) -- Remove the trailing comma
SELECT @WhereClause = @WhereClause + WhereClause + ' AND ' FROM #params WHERE (@BinarySearch & BitMask) = BitMask ORDER BY BitMask
SET @WhereClause = LEFT(@WhereClause, LEN(@WhereClause) - 4) -- Remove the trailing AND
IF @BinarySearch = 0
BEGIN
-- If nothing found so far, return the top row in the order of the parameters fields
SET @WhereClause = ''
-- Use the full order sequence of fields to return the results
SET @OrderByClause = ' ORDER BY '
SELECT @OrderByClause = @OrderByClause + FieldName + ', ' FROM #params ORDER BY BitMask
SET @OrderByClause = LEFT(@OrderByClause, LEN(@OrderByClause) - 1) -- Remove the trailing comma
END
-- Find out if there are any results for this search
SET @SQLstmt = 'SELECT TOP 1 id, author, title, printed, pages INTO #junk FROM books' + @WhereClause + @OrderByClause
Exec (@SQLstmt)
SET @Rows = @@RowCount
END
-- Stop the result set being eaten by the junk table
SET @SQLstmt = REPLACE(@SQLstmt, 'INTO #junk ', '')
-- Uncomment the next line to see the SQL you are producing
--PRINT @SQLstmt
-- This gives the result set
Exec (@SQLstmt)
END
This stored procedure is called like so:
FirstMatch 'author = ''Chris Latta''~pages > 100~title like ''%something%'''
There you have it - a fully expandable, optimised search for the top result in weighted order of preference. This was an interesting problem and shows just what you can pull off with native T-SQL.
A couple of small issues with this:
it relies on the caller to know that they must leave a space after the field name for the parameter to work properly
you can't have field names with spaces in them - fixable with some effort
it assumes that the relevant sort order is always ascending
the next programmer that has to look at this procedure will think you're insane :)
Try this:
ALTER PROCEDURE match
@pAuthor varchar(100)
,@pTitle varchar(100)
,@pDate varchar(100)
,@pPages varchar(100)
-- exec match 'a title', 'b author', '1/1/2007', 15
AS
SELECT id,
CASE WHEN author = @pAuthor THEN 1 ELSE 0 END
+ CASE WHEN title = @pTitle THEN 1 ELSE 0 END
+ CASE WHEN bookdate = @pDate THEN 1 ELSE 0 END
+ CASE WHEN pages = @pPages THEN 1 ELSE 0 END AS matches,
CASE WHEN author = @pAuthor THEN 4 ELSE 0 END
+ CASE WHEN title = @pTitle THEN 3 ELSE 0 END
+ CASE WHEN bookdate = @pDate THEN 2 ELSE 0 END
+ CASE WHEN pages = @pPages THEN 1 ELSE 0 END AS score
FROM books
WHERE author = @pAuthor
OR title = @pTitle
OR bookdate = @pDate
OR pages = @pPages
ORDER BY matches DESC, score DESC
However, this of course causes a table scan. You can avoid that by making it a union of a CTE and 4 WHERE clauses, one for each property - there will be duplicates, but you can just take the TOP 1 anyway.
EDIT: Added the WHERE ... OR clause. I'd feel more comfortable if it were
SELECT ... FROM books WHERE author = @pAuthor
UNION
SELECT ... FROM books WHERE title = @pTitle
UNION
...