How can I search a SQL database with multiple "%" wildcards?

I am trying to write a SQL query in SQL Server 2008 R2 that will allow a user to search a database table by a number of parameters. The way this should work is, my user enters his criteria and the query looks for all close matches, while ignoring those criteria for which the user did not enter a value.
I've written my query using LIKE and parameters, like so:
select item
from [item]
where a like @a and b like @b and c like @c ...
where 'a', 'b', and 'c' are table columns, and my @ parameters all default to '%' wildcards. This goes on for about twenty different columns, and that leads to my problem: if this query is run as is, with no input and just wildcards, it returns no results. Yet this table contains over 30,000 rows, so an all-wildcard query should return the whole table. Obviously I'm going about this the wrong way, but I don't know how to correct it.
I can't use 'contains' or 'freetext', as those look for whole words, and I need to match user input no matter where it occurs in the actual column value. I've tried breaking my query up into individual steps using 'intersect', but that doesn't change anything. Does anyone know a better way to do this?

To allow for null inputs, this is a good pattern:
select * from MyTable where ColA LIKE isnull(@a, ColA) AND ColB LIKE isnull(@b, ColB)
This avoids having to construct and execute a dynamic SQL statement (and creating possible SQL injection issues). Note that a row whose ColA is NULL still never matches, because NULL LIKE NULL is not true.

my @ parameters all default to '%' wildcards
Don't do this. Default them to null. The way to disregard empty parameters is with a short circuit:
(@a IS NULL OR a LIKE @a)
Depending on how you want to handle missing data in the column, you might want a third term, because a null column value never matches a LIKE pattern:
(@a IS NULL OR a IS NULL OR a LIKE @a)
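The short-circuit pattern above can be tried end to end with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for SQL Server (so `:a` plays the role of `@a`), and the `items` table and its rows are made up for illustration.

```python
import sqlite3

def search_items(conn, a=None, b=None):
    """Return item names matching every supplied pattern; None means 'ignore'."""
    sql = """
        SELECT item FROM items
        WHERE (:a IS NULL OR a LIKE :a)
          AND (:b IS NULL OR b LIKE :b)
        ORDER BY item
    """
    return [row[0] for row in conn.execute(sql, {"a": a, "b": b})]

# Hypothetical sample data, including NULLs in the searched columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (item TEXT, a TEXT, b TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [("bolt", "steel", "small"), ("nut", "brass", None), ("gear", None, "large")],
)

all_rows = search_items(conn)               # no criteria: every row comes back
steel_rows = search_items(conn, a="%tee%")  # only rows whose a contains "tee"
```

With no criteria, the rows holding NULLs are still returned, which is the behaviour the all-wildcard version was missing.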
How can I search a SQL database with multiple "%" wildcards?
Slowly. SQL is suboptimal for doing text comparisons. The best approach is to perform this search somewhere else, or at least structure your data to facilitate these kinds of queries. If you know you'll be performing a lot of these queries, consider redesigning your schema in the shape of a suffix tree. At an absolute bare minimum, do something so that every LIKE match is prefix-only (trailing wildcard), meaning LIKE 'xxx%' and never LIKE '%xxx' or LIKE 'x%x'. A leading wildcard precludes an index seek, and an embedded wildcard can seek only on the literal prefix before it. And put an index on every column you need to search.

Thanks for the guidance, all. It turns out that the table I'm querying can easily contain null values in the columns I'm searching against, so I expanded my query to say "where (a like @a or a is null) and ... " and it works now.

Personally, I'd do this in the application layer (assuming you have one), and build the query around the parameters the user supplies, eliminating the ones they don't.
For example, the following bit of code builds the query in SQL, where only the parameters the user has supplied (not null) are included in the where clause.
NOTE: This is very crude, as it doesn't take into account the AND's if the first parameter is null, and it doesn't remove the WHERE clause if no parameters are supplied. If you let me know what language your application layer is built in, I'll provide a better example. (This is purely pseudo-code!)
DECLARE @a VARCHAR(100) = '''%SomeValue%''', @b VARCHAR(100)= '''%AnotherValue%''', @c VARCHAR(100)
DECLARE @SQL VARCHAR(MAX) = 'SELECT * FROM MyTable WHERE'
IF @a IS NOT NULL
BEGIN
SET @SQL += ' ColA LIKE ' + @a
END
IF @b IS NOT NULL
BEGIN
SET @SQL += ' AND ColB LIKE ' + @b
END
IF @c IS NOT NULL
BEGIN
SET @SQL += ' AND ColC LIKE ' + @c
END
PRINT @SQL
--EXEC(@SQL)
Output:
SELECT * FROM MyTable WHERE ColA LIKE '%SomeValue%' AND ColB LIKE '%AnotherValue%'
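As a rough idea of what the application-layer version could look like, here is a minimal Python sketch. It is not the answerer's code: the `SEARCHABLE` whitelist and the `build_search` helper are hypothetical names, and `?` is the sqlite3/ODBC-style placeholder (other drivers use different markers). It handles the two gaps called out in the note above: the joining ANDs and the empty WHERE clause.

```python
# Hypothetical whitelist of searchable columns. Column names cannot be bound
# as parameters, so they must never come straight from user input.
SEARCHABLE = ("ColA", "ColB", "ColC")

def build_search(filters):
    """Return (sql, params) with one LIKE clause per non-empty filter."""
    clauses, params = [], []
    for column in SEARCHABLE:
        value = filters.get(column)
        if value:  # skip None and empty strings
            clauses.append(f"{column} LIKE ?")
            params.append(f"%{value}%")
    sql = "SELECT * FROM MyTable"
    if clauses:
        # join() inserts AND only between clauses, never before the first one
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params

sql, params = build_search({"ColA": "Some", "ColC": None, "ColB": "Another"})
```

The user's values travel only in `params`, so the SQL text itself never contains user input.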

Related

How do you pass values for a parameter by position when you need to check multiple values?

I created a stored procedure (spBalanceRange) with 2 optional parameters. They've been set to a default value and the sp works fine when I pass only 1 value per parameter by position. However, I have a situation where I'm trying to pass, by position, two strings immediately followed by a wildcard. I want the user to be able to search for Vendor names that start with either 'C%' or 'F%'. Here's the gist of the CREATE PROC statement:
CREATE PROC spBalanceRange
@VendorVar varchar(40) = '%',
@BalanceMin money = 1.0
...
Here's what I've tried so far, but it doesn't work:
EXEC spBalanceRange '(C%|F%)', 200.00;
EXEC spBalanceRange 'C%|F%', 200.00;
Is there a way to check for 2 or more string values with a wildcard when passed by position? Thanks.
EDIT: According to your comments you are looking for the first letter of a vendor's name only.
In this special case I could suggest an easy, not well performing but really simple approach. CHARINDEX returns a number greater than zero, if a character appears within a string. So you just have to pass in all your lookup-first-characters as a simple "chain":
DECLARE @DummyVendors TABLE(VendorName VARCHAR(100));
INSERT INTO @DummyVendors VALUES
('Camel Industries')
,('Fritz and Fox')
,('some other');
DECLARE @ListOfFirstLetters VARCHAR(100)='CF';
SELECT VendorName
FROM @DummyVendors AS dv
WHERE CHARINDEX(LEFT(dv.VendorName,1),@ListOfFirstLetters)>0
This was the former answer
Checking against more than one value needs either a dedicated list of compares
WHERE val=@prm1 OR val=@prm2 OR ... (you know the count before)
...or you use the IN-clause
WHERE LEFT(VendorName,1) IN ('C','F', ...)
...but you cannot pass the IN-list with a parameter like ... IN(@allValues)
You might think about a created TYPE to pass in all your values like a table and use an INNER JOIN as filter: https://stackoverflow.com/a/337864/5089204 (and a lot of other examples there...)
Or you might think of dynamic SQL: https://stackoverflow.com/a/5192765/5089204
And last but not least you might think of one of the many split string approaches. This is one of my own answers, section "dynamic IN-statement": https://stackoverflow.com/a/33658220/5089204
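When the list of values is available in application code, one way around the "no single parameter for an IN-list" limitation is to generate one placeholder per value. The following is only a sketch under assumptions: it uses Python's sqlite3 module rather than SQL Server (so `substr(..., 1, 1)` stands in for `LEFT(..., 1)`), and the `Vendors` table and `vendors_starting_with` helper are hypothetical.

```python
import sqlite3

def vendors_starting_with(conn, letters):
    """Return vendor names whose first character is one of `letters`."""
    if not letters:
        return []
    placeholders = ", ".join("?" for _ in letters)  # e.g. "?, ?"
    sql = (
        "SELECT VendorName FROM Vendors "
        f"WHERE substr(VendorName, 1, 1) IN ({placeholders}) "
        "ORDER BY VendorName"
    )
    return [row[0] for row in conn.execute(sql, list(letters))]

# Sample rows borrowed from the dummy vendor list above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Vendors (VendorName TEXT)")
conn.executemany(
    "INSERT INTO Vendors VALUES (?)",
    [("Camel Industries",), ("Fritz and Fox",), ("some other",)],
)
matches = vendors_starting_with(conn, ["C", "F"])
```

Only the number of placeholders is built dynamically; the values themselves are still bound as parameters.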
I'm answering my own question, and maybe other solutions exist but here is what had to happen with my stored procedure in order to pass variables by position:
CREATE PROC spBalanceRange
@VendorVar varchar(40) = '%',
@BalanceMin money = 1.0
AS
IF (@VendorVar = '%' AND @BalanceMin IS NULL OR @BalanceMin = '')
BEGIN
PRINT 'BalanceMin cannot be null.';
END
IF (@VendorVar = '%' AND @BalanceMin IS NOT NULL)
BEGIN
(sql statement using parameters)
END
-- [CF]% matches names starting with C or F (a comma inside the brackets would itself be matched)
EXEC spBalanceRange '[CF]%', 200.00;
That's what I know.

Complex query filter using Like() in T-SQL

I'm writing a SQL script that we want our accounting team to be able to edit, without dealing with engineering.
The general idea is to have a .sql script, which defines some variables at the top of the query, and then has several complex queries below it that use those variables.
The problem we have is that we want the accounting team to be able to specify the filter to use. For example:
DECLARE @year INT
DECLARE @month INT
DECLARE @filter VARCHAR(30);
SET @year = 2010
SET @month = 7
SET @filter = '%test%'
Here the team can change the month and the year that the subsequent queries return. They can also define ONE filter element, in this example, excluding any records where the username has the string 'test' in it.
My question is whether or not there is a way to specify OR's to a LIKE(). E.g., ideally we'd have the @filter variable as something like '%test%, or %other%'. Now I know that's not real syntax, but I'm wondering if there is syntax that lets me achieve that. I've scoured MSDN on the LIKE() syntax with no joy. Should I use some different query expression?
Probably the simplest thing to do would be to just have multiple parameters, though it's not pretty:
SET @filter_1 = '%test%'
SET @filter_2 = '%foo%'
SET @filter_3 = NULL
SET @filter_4 = NULL
SELECT *
FROM BAR
WHERE var LIKE @filter_1
OR var LIKE @filter_2
OR var LIKE @filter_3
OR var LIKE @filter_4
Unused filters are left NULL so that they never match. Defaulting them to '%' would make this OR'ed condition match every row.
You could also use dynamic SQL and a local table variable. Basically, create a local table with one column, allow them to change the INSERT statements into that table, then define a loop that iterates over the contents of that table to dynamically generate the LIKE clauses. It would work, but it would be a bit more code. The example above is quick and dirty, but I'd guess it's probably sufficient for what you need to do.
I'd use a join with a LIKE predicate. You can execute the following code sample in a query window to see how this works:
DECLARE @tblFilter TABLE
(sFilter nvarchar(MAX) NOT NULL);
INSERT @tblFilter
SELECT * FROM (VALUES ('%one%'), ('%two%'), ('%three%')) v(s);
DECLARE @tblData TABLE
(iId int NOT NULL PRIMARY KEY IDENTITY,
sData nvarchar(MAX));
INSERT @tblData(sData)
SELECT * FROM (VALUES ('one'), ('two three'), ('four')) v(s);
SELECT DISTINCT iId
FROM @tblData d
JOIN @tblFilter f ON d.sData LIKE f.sFilter;
I assume that the different query strings are in the @tblFilter table, which could be a TVP, coming from XML values, from comma-separated values, from a temp table or whatever.
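The same join can be run outside SQL Server as a quick check. This sketch assumes Python's sqlite3 module with ordinary tables in place of the table variables, and adds a NOT EXISTS variant (my addition, not part of the answer) for the case in the question where matching rows should be excluded rather than returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblFilter (sFilter TEXT NOT NULL);
    CREATE TABLE tblData (iId INTEGER PRIMARY KEY, sData TEXT);
    INSERT INTO tblFilter VALUES ('%one%'), ('%two%'), ('%three%');
    INSERT INTO tblData (sData) VALUES ('one'), ('two three'), ('four');
""")

# Rows that match at least one filter pattern.
matching_ids = [row[0] for row in conn.execute("""
    SELECT DISTINCT d.iId
    FROM tblData AS d
    JOIN tblFilter AS f ON d.sData LIKE f.sFilter
    ORDER BY d.iId
""")]

# Rows that match none of the patterns (the exclusion case).
unmatched_ids = [row[0] for row in conn.execute("""
    SELECT d.iId
    FROM tblData AS d
    WHERE NOT EXISTS (SELECT 1 FROM tblFilter AS f WHERE d.sData LIKE f.sFilter)
    ORDER BY d.iId
""")]
```

The DISTINCT matters because 'two three' matches two patterns and would otherwise appear twice.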

SQL query like filter

I need to execute a search query in SQL Server where I need to filter out data based upon a user input text field.
The problem is, this query needs to be executed on several tables (so I only know the table columns at runtime).
This is the query I have:
SELECT * FROM [BTcegeka.C2M].[dbo].[Lookup_Country] WHERE Name LIke '%test%'
Now the problem is I need to do the Like function on every column (I only know the columnname at runtime) in the table. I am calling this query from an ASP.NET website. The user selects a table from a dropdownlist and can then enter the search field.
This is what I really want to accomplish:
SELECT * FROM [BTcegeka.C2M].[dbo].[Lookup_Country] WHERE * LIke '%test%'
Obviously 'Where * Like' Fails. How can I accomplish this?
You can query all columns in a table like:
select name from sys.columns where object_id = object_id('YourTable')
Then you can construct a query that does a like for each column.
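To make the "look up the columns, then build one LIKE per column" idea concrete, here is a hedged sketch in Python against SQLite. `PRAGMA table_info` plays the role that `sys.columns` plays in SQL Server, and the `search_all_columns` helper and the sample rows are hypothetical.

```python
import sqlite3

def quote_ident(name):
    """Double-quote an identifier, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'

def search_all_columns(conn, table, term):
    """Return rows of `table` where any column contains `term`."""
    # Identifiers cannot be bound as parameters, so only accept real tables.
    known = {r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")}
    if table not in known:
        raise ValueError(f"unknown table: {table!r}")
    # Each table_info row is (cid, name, type, notnull, dflt_value, pk).
    columns = [r[1] for r in conn.execute(f"PRAGMA table_info({quote_ident(table)})")]
    where = " OR ".join(f"{quote_ident(c)} LIKE ?" for c in columns)
    sql = f"SELECT * FROM {quote_ident(table)} WHERE {where}"
    return conn.execute(sql, [f"%{term}%"] * len(columns)).fetchall()

# Made-up rows for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Lookup_Country (Code TEXT, Name TEXT)")
conn.executemany(
    "INSERT INTO Lookup_Country VALUES (?, ?)",
    [("BE", "Belgium"), ("TS", "Testland"), ("FR", "France")],
)
hits = search_all_columns(conn, "Lookup_Country", "test")
```

The search term is bound once per column, while the table and column names come only from the database's own metadata.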
Another approach is to create a calculated column called SearchField that contains a concatenation of all strings you'd like to search for. Then you can search like:
create table #tmp (id int identity, col1 varchar(10), col2 varchar(10),
SearchField as col1 + '|' + col2 persisted)
insert #tmp (col1, col2) values
('alfa', 'beta'),
('gamma', 'DELTA'),
('GAMMA', 'delta')
select * from #tmp where SearchField like '%alfa%'
Try using your SQL query like this.
SELECT * FROM [BTcegeka.C2M].[dbo].[Lookup_Country]
WHERE
COL1 LIke '%test%'
OR COL2 LIke '%test%'
OR COL3 LIke '%test%'
You may use AND instead of OR if your requirement needs that.
If you know the column names at run time, then you should build your query in .NET before passing it to SQL. You can build it with the correct column name. This way you can also account for the type of the column you search in.
Be careful, though: the path you chose is prone to SQL injection, so you should validate the query before sending it to SQL Server.
If you really need to do this, you can search the SQL Server metadata tables and find the description of the selected user table. Making good use of this data is easy, and you can build any SQL you want with this information, but performance may not be that good.
You have to use dynamic SQL to implement this. Your column name needs to be passed as a parameter to this stored procedure, or if you don't want to create a stored procedure, just declare one parameter, assign the value selected from the drop-down list to it, and use that in the query.
create procedure sp_dynamicColumn
(
@columnName sysname
)
as
begin
declare @DYNAMICSQL nvarchar(4000);
-- quotename() brackets the column name so it cannot inject extra SQL
SET @DYNAMICSQL = 'Select * from [BTcegeka.C2M].[dbo].[Lookup_Country] where '+ quotename(@columnName) + ' like ''%test%'''
EXECUTE SP_EXECUTESQL @DYNAMICSQL
end
go

Filtering With Multi-Select Boxes With SQL Server

I need to filter result sets from sql server based on selections from a multi-select list box. I've been through the idea of doing an instring to determine if the row value exists in the selected filter values, but that's prone to partial matches (e.g. Car matches Carpet).
I also went through splitting the string into a table and joining/matching based on that, but I have reservations about how that is going to perform.
Seeing as this is a seemingly common task, I'm looking to the Stack Overflow community for some feedback and maybe a couple suggestions on the most commonly utilized approach to solving this problem.
I solved this one by writing a table-valued function (we're using 2005) which takes a delimited string and returns a table. You can then join to that or use WHERE EXISTS or WHERE x IN. We haven't done full stress testing yet, but with limited use and reasonably small sets of items I think that performance should be ok.
Below is one of the functions as a starting point for you. I also have one written to specifically accept a delimited list of INTs for ID values in lookup tables, etc.
Another possibility is to use LIKE with the delimiters to make sure that partial matches are ignored, but you can't use indexes with that, so performance will be poor for any large table. For example:
SELECT
my_column
FROM
My_Table
WHERE
@my_string LIKE '%|' + my_column + '|%'
/*
Name: GetTableFromStringList
Description: Returns a table of values extracted from a delimited list
Parameters:
@StringList - A delimited list of strings
@Delimiter - The delimiter used in the delimited list
History:
Date Name Comments
---------- ------------- ----------------------------------------------------
2008-12-03 T. Hummel Initial Creation
*/
CREATE FUNCTION dbo.GetTableFromStringList
(
@StringList VARCHAR(1000),
@Delimiter CHAR(1) = ','
)
RETURNS @Results TABLE
(
String VARCHAR(1000) NOT NULL
)
AS
BEGIN
DECLARE
@string VARCHAR(1000),
@position SMALLINT
SET @StringList = LTRIM(RTRIM(@StringList)) + @Delimiter
SET @position = CHARINDEX(@Delimiter, @StringList)
WHILE (@position > 0)
BEGIN
SET @string = LTRIM(RTRIM(LEFT(@StringList, @position - 1)))
IF (@string <> '')
BEGIN
INSERT INTO @Results (String) VALUES (@string)
END
SET @StringList = RIGHT(@StringList, LEN(@StringList) - @position)
SET @position = CHARINDEX(@Delimiter, @StringList, 1)
END
RETURN
END
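For comparison, the delimiter-wrapped LIKE alternative mentioned in this answer can be sketched as follows. This is an illustration under assumptions, not the answerer's code: it uses Python's sqlite3 module (so `||` replaces `+` for concatenation), and the `filter_by_selection` helper and sample rows are hypothetical. It shows why wrapping both sides in delimiters stops "Car" from matching "Carpet".

```python
import sqlite3

def filter_by_selection(conn, selected, delimiter="|"):
    """Return values of my_column that equal one of `selected` as a whole."""
    # Wrap the joined list in delimiters so every value is bounded on both sides.
    wrapped = delimiter + delimiter.join(selected) + delimiter
    sql = (
        "SELECT my_column FROM My_Table "
        "WHERE ? LIKE '%' || ? || my_column || ? || '%' "
        "ORDER BY my_column"
    )
    return [r[0] for r in conn.execute(sql, (wrapped, delimiter, delimiter))]

# Made-up rows for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE My_Table (my_column TEXT)")
conn.executemany(
    "INSERT INTO My_Table VALUES (?)",
    [("Car",), ("Carpet",), ("Truck",), ("Van",)],
)
picked = filter_by_selection(conn, ["Carpet", "Truck"])
```

Two caveats apply to this sketch: a column value containing `%` or `_` would act as a wildcard, and SQLite's LIKE ignores ASCII case by default, so behaviour on SQL Server depends on the collation in use.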
I've been through the idea of doing an
instring to determine if the row value
exists in the selected filter values,
but that's prone to partial matches
(e.g. Car matches Carpet)
It sounds to me like you aren't including a unique ID, or possibly the primary key, as part of the values in your list box. Ideally each option will have a unique identifier that matches a column in the table you are searching on. If your listbox was like the one below, then you would be able to filter specifically for cars because you would get the unique value 3.
<option value="3">Car</option>
<option value="4">Carpet</option>
Then you just build a where clause that will allow you to find the values you need.
Updated, to answer comment.
How would I do the related join
considering that the user can select
an arbitrary number of options from
the list box? SELECT * FROM tblTable
JOIN tblOptions ON tblTable.FK = ? The
problem here is that I need to join on
multiple values.
I answered a similar question here.
One method would be to build a temporary table and add each selected option as a row to the temporary table. Then you would simply do a join to your temporary table.
If you want to simply create your sql dynamically you can do something like this.
SELECT * FROM tblTable WHERE option IN (selected_option_1, selected_option_2, selected_option_n)
I've found that a CLR table-valued function which takes your delimited string and calls Split on the string (returning the array as the IEnumerable) is more performant than anything written in T-SQL (it starts to break down when you have around one million items in the delimited list, but that's much further out than the T-SQL solution).
And then, you could join on the table or check with EXISTS.

Dynamic SQL - Search Query - Variable Number of Keywords

We are trying to update our classic asp search engine to protect it from SQL injection. We have a VB 6 function which builds a query dynamically by concatenating a query together based on the various search parameters. We have converted this to a stored procedure using dynamic sql for all parameters except for the keywords.
The problem with keywords is that there are a variable number of words supplied by the user and we want to search several columns for each keyword. Since we cannot create a separate parameter for each keyword, how can we build a safe query?
Example:
@CustomerId AS INT
@Keywords AS NVARCHAR(MAX)
@sql = 'SELECT event_name FROM calendar WHERE customer_id = @CustomerId '
--(loop through each keyword passed in and concatenate)
@sql = @sql + 'AND (event_name LIKE ''%' + @Keywords + '%'' OR event_details LIKE ''%' + @Keywords + '%'')'
EXEC sp_executesql @sql, N'@CustomerId INT', @CustomerId = @CustomerId
What is the best way to handle this while maintaining protection from SQL injection?
You may not like to hear this, but it might be better for you to go back to dynamically constructing your SQL query in code before issuing against the database. If you use parameter placeholders in the SQL string you get the protection against SQL injection attacks.
Example:
string sql = "SELECT Name, Title FROM Staff WHERE UserName=@UserId";
using (SqlCommand cmd = new SqlCommand(sql))
{
cmd.Parameters.Add("@UserId", SqlDbType.VarChar).Value = "smithj";
// ... assign cmd.Connection and execute the command here
}
You can build the SQL string depending on the set of columns you need to query and then add the parameter values once the string is complete. This is a bit of a pain to do, but I think it is much easier than having really complicated TSQL which unpicks lots of possible permutations of possible inputs.
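Applied to the keyword search from the question, that approach might look roughly like the sketch below. It is written in Python rather than C# purely for illustration, `build_keyword_query` is a hypothetical helper, and `?` is the sqlite3/ODBC-style placeholder (ADO.NET would use named parameters such as `@kw0`).

```python
def build_keyword_query(customer_id, keywords):
    """Return (sql, params) that searches two columns for every keyword."""
    sql = "SELECT event_name FROM calendar WHERE customer_id = ?"
    params = [customer_id]
    for word in keywords.split():
        # One parenthesised group per keyword; the keyword text itself is
        # only ever added to params, never to the SQL string.
        sql += " AND (event_name LIKE ? OR event_details LIKE ?)"
        pattern = f"%{word}%"
        params.extend([pattern, pattern])
    return sql, params

sql, params = build_keyword_query(7, "board meeting")
```

Because the SQL text depends only on how many keywords there are, a keyword containing quotes or SQL fragments ends up as harmless parameter data.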
You have 3 options here.
1. Use a function that converts lists to tables and join to it. So you will have something like this:
SELECT *
FROM calendar c
JOIN dbo.fnListToTable(@Keywords) k
ON c.keyword = k.keyword
2. Have a fixed set of params, and only allow a maximum of N keywords to be searched on:
CREATE PROC spTest
@Keyword1 varchar(100),
@Keyword2 varchar(100),
....
3. Write an escaping string function in TSQL and escape your keywords.
Unless you need it, you could simply strip out any character that's not in [a-zA-Z ] - most of those things won't be in searches and you should not be able to be injected that way, nor do you have to worry about keywords or anything like that. If you allow quotes, however, you will need to be more careful.
Similar to sambo99's #1, you can insert the keywords into a temporary table or table variable and join to it (even using wildcards) without danger of injection:
This isn't really dynamic:
SELECT DISTINCT event_name
FROM calendar
INNER JOIN #keywords
ON event_name LIKE '%' + #keywords.keyword + '%'
OR event_description LIKE '%' + #keywords.keyword + '%'
You can actually generate an SP with a large number of parameters instead of coding it by hand (set the defaults to '' or NULL depending on your preference in coding your searches). If you found you needed more parameters, it would be simple to increase the number of parameters it generated.
You can move the search to a full-text index outside the database like Lucene and then use the Lucene results to pull the matching database rows.
You can try this:
SELECT * FROM [tablename] WHERE [columnname] LIKE '%' + @keyword + '%'