Filtering With Multi-Select Boxes With SQL Server

I need to filter result sets from sql server based on selections from a multi-select list box. I've been through the idea of doing an instring to determine if the row value exists in the selected filter values, but that's prone to partial matches (e.g. Car matches Carpet).
I also went through splitting the string into a table and joining/matching based on that, but I have reservations about how that is going to perform.
Seeing as this is a seemingly common task, I'm looking to the Stack Overflow community for some feedback and maybe a couple suggestions on the most commonly utilized approach to solving this problem.

I solved this one by writing a table-valued function (we're using 2005) which takes a delimited string and returns a table. You can then join to that or use WHERE EXISTS or WHERE x IN. We haven't done full stress testing yet, but with limited use and reasonably small sets of items I think that performance should be ok.
Below is one of the functions as a starting point for you. I also have one written to specifically accept a delimited list of INTs for ID values in lookup tables, etc.
Another possibility is to use LIKE with the delimiters to make sure that partial matches are ignored, but you can't use indexes with that, so performance will be poor for any large table. For example:
SELECT
    my_column
FROM
    My_Table
WHERE
    @my_string LIKE '%|' + my_column + '|%'
/*
Name: GetTableFromStringList
Description: Returns a table of values extracted from a delimited list
Parameters:
    @StringList - A delimited list of strings
    @Delimiter - The delimiter used in the delimited list
History:
    Date       Name      Comments
    ---------- --------- ----------------------------------------------------
    2008-12-03 T. Hummel Initial Creation
*/
CREATE FUNCTION dbo.GetTableFromStringList
(
    @StringList VARCHAR(1000),
    @Delimiter CHAR(1) = ','
)
RETURNS @Results TABLE
(
    String VARCHAR(1000) NOT NULL
)
AS
BEGIN
    DECLARE
        @string VARCHAR(1000),
        @position SMALLINT

    -- Append a trailing delimiter so the loop below also captures the last item
    SET @StringList = LTRIM(RTRIM(@StringList)) + @Delimiter
    SET @position = CHARINDEX(@Delimiter, @StringList)

    WHILE (@position > 0)
    BEGIN
        SET @string = LTRIM(RTRIM(LEFT(@StringList, @position - 1)))

        IF (@string <> '')
        BEGIN
            INSERT INTO @Results (String) VALUES (@string)
        END

        -- Drop the extracted item (and its delimiter), then look for the next one
        SET @StringList = RIGHT(@StringList, LEN(@StringList) - @position)
        SET @position = CHARINDEX(@Delimiter, @StringList, 1)
    END

    RETURN
END
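For illustration, here is how the function might be used to filter on the list box selections - a minimal sketch reusing the My_Table/my_column names from the LIKE example above (@SelectedValues and its contents are illustrative):

-- @SelectedValues is assumed to hold the delimited selections from the list box
DECLARE @SelectedValues VARCHAR(1000)
SET @SelectedValues = 'Car,Truck,Van'

SELECT t.my_column
FROM My_Table t
INNER JOIN dbo.GetTableFromStringList(@SelectedValues, ',') f
    ON f.String = t.my_column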

I've been through the idea of doing an
instring to determine if the row value
exists in the selected filter values,
but that's prone to partial matches
(e.g. Car matches Carpet)
It sounds to me like you aren't including a unique ID, or possibly the primary key, as part of the values in your list box. Ideally each option will have a unique identifier that matches a column in the table you are searching. If your listbox were like the one below, you would be able to filter specifically for cars, because you would get the unique value 3.
<option value="3">Car</option>
<option value="4">Carpret</option>
Then you just build a where clause that will allow you to find the values you need.
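For instance, if the user selected both options above, the generated filter might look like this (the table and column names are illustrative, not from the question):

SELECT *
FROM Products
WHERE ProductTypeID IN (3, 4)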
Updated, to answer comment.
How would I do the related join
considering that the user can select
an arbitrary number of options from
the list box? SELECT * FROM tblTable
JOIN tblOptions ON tblTable.FK = ? The
problem here is that I need to join on
multiple values.
I answered a similar question here.
One method would be to build a temporary table and add each selected option as a row to the temporary table. Then you would simply do a join to your temporary table.
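A minimal sketch of the temporary-table approach, reusing the tblTable/FK names from the quoted comment (the option values are illustrative):

-- One row per selected option
CREATE TABLE #SelectedOptions (OptionID INT NOT NULL)
INSERT INTO #SelectedOptions (OptionID) VALUES (3)
INSERT INTO #SelectedOptions (OptionID) VALUES (4)

SELECT t.*
FROM tblTable t
INNER JOIN #SelectedOptions o ON o.OptionID = t.FK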
If you want to simply create your SQL dynamically, you can do something like this:
SELECT * FROM tblTable WHERE option IN (selected_option_1, selected_option_2, selected_option_n)

I've found that a CLR table-valued function which takes your delimited string and calls Split on the string (returning the array as the IEnumerable) is more performant than anything written in T-SQL (it starts to break down when you have around one million items in the delimited list, but that's much further out than the T-SQL solution).
And then, you could join on the table or check with EXISTS.
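A hedged usage sketch - dbo.SplitStringCLR is a hypothetical name for such a CLR table-valued function, not a built-in, and the table/column names reuse the earlier example:

-- Assumes a CLR TVF dbo.SplitStringCLR(@input, @delimiter) has been deployed
DECLARE @SelectedValues NVARCHAR(MAX)
SET @SelectedValues = N'Car,Truck'

SELECT t.my_column
FROM My_Table t
WHERE EXISTS (
    SELECT 1
    FROM dbo.SplitStringCLR(@SelectedValues, ',') s
    WHERE s.Value = t.my_column
)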

Related

Is it better to use Custom TABLE TYPE as parameter instead of SQL "IN" clause when passing a large comma separated value

I have a stored procedure that takes a comma-separated string as input, which can sometimes be quite large (approximately 8 thousand characters or more). In that situation, query performance sometimes goes down. I also think there is a limit on the character length inside the IN clause, because I sometimes get errors. Now, I need to know: is it better to use a custom TABLE TYPE as the parameter and use an INNER JOIN to find the result? If so, why? Here are my 2 stored procedures (minimal code):
CREATE TYPE [dbo].[INTList] AS TABLE(
[ID] [int] NULL
)
Procedure 1
CREATE PROCEDURE [report].[GetSKU]
    @list [INTList] READONLY
AS
SELECT sk.SKUID, sk.Code SCode, sk.SName
FROM SKUs sk
INNER JOIN @list sst ON sst.ID = sk.SKUID
Procedure 2
CREATE PROCEDURE [report].[GetSKU]
    @params VARCHAR(MAX)
AS
SELECT sk.SKUID, sk.Code SCode, sk.SName
FROM SKUs sk
WHERE CHARINDEX(',' + CAST(sk.SKUID AS VARCHAR(MAX)) + ',', @params) > 0
Now, which procedure is better to use?
Note: the original stored procedures do have a few more joins.
As this question raised quite some discussion in comments but did not get any viable answer, I'd like to add the major points in order to help future research.
This question is about: How do I pass a (large) list of values into a query?
In most cases, people need this either in a WHERE SomeColumn IN (SomeValueList) filter or to JOIN against the list with something like FROM MyTable INNER JOIN SomeValueList ON....
Very important is the SQL Server version: with v2016 we got two great tools, the native STRING_SPLIT() (not position-safe!) and JSON support.
Furthermore, and rather obvious, we have to think about the scales and values.
Do we pass in a simple list of some IDs or a huge list with thousands of values?
Do we talk about simple integers or GUIDs?
And what about text values, where we have to think about dangerous characters (like [ { " in JSON or < & in XML - there are many more...)?
What about CSV-lists, where the separating character might appear within the content (quoting / escaping)?
In some cases we might even want to pass several columns at once...
There are several options:
Table valued parameter (TVP, CREATE TYPE ...),
CSV together with string splitting functions (native since v2016, various home brewed, CLR...),
and text-based containers: XML or JSON (since v2016)
Table-valued parameter (TVP - the best choice)
A table-valued parameter (TVP) must be created in advance (this might be a drawback) but will behave as any other table once created. You can add indexes, you can use it in various use cases, and you do not have to bother about anything under the hood.
Sometimes we cannot use this due to missing rights to use CREATE TYPE...
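A minimal sketch, reusing the INTList type and the GetSKU procedure from the question above:

-- Fill the TVP and pass it to the procedure
DECLARE @ids dbo.INTList
INSERT INTO @ids (ID) VALUES (1), (2), (3)

EXEC report.GetSKU @list = @ids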
Character separated values (CSV)
With CSV we see three approaches:
Dynamic SQL: create a statement where the CSV list is simply stuffed into the IN() and execute this dynamically. This can be a very efficient approach, but it is open to various obstacles (no ad-hoc usage, injection threat, breaking on bad values...).
String splitting functions: there are tons of examples around... All of them have in common that the separated string will be returned as a list of items. Common issues here: performance, missing ordinal position, limits on the separator, handling of duplicate or empty values, handling of quoted or escaped values, handling of separators within the content. Aaron Bertrand did some great research about the various approaches to string splitting. Similar to TVPs, one drawback might be that this function must exist in the database in advance, or that we need to be allowed to execute CREATE FUNCTION if not.
Ad-hoc splitters: before v2016 the most-used approach was XML-based; since then we have moved to JSON-based splitters. Both use some string methods to transform the CSV string into 1) separated elements (XML) or 2) a JSON array. The result is queried by 1) XQuery (.value() and .nodes()) or 2) JSON's OPENJSON() or JSON_VALUE().
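A minimal JSON-based ad-hoc splitter as a sketch (SQL Server 2016+; the list here is numeric, so no quoting of items is needed):

DECLARE @csv NVARCHAR(MAX) = N'1,2,3,4'

-- OPENJSON returns the ordinal position in [key], which makes this position-safe
SELECT j.[key] AS position, j.[value] AS item
FROM OPENJSON(N'[' + @csv + N']') j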
Text based containers
We can pass the list as string, but within a defined format:
Using ["a","b","c"] instead of a,b,c allows for immediate usage of OPENJSON().
Using <x>a</x><x>b</x><x>c</x> instead allows for XML queries.
The biggest advantage here: Any programming language provides support for these formats.
Common obstacles like date and number formatting are solved implicitly. Passing JSON or XML is - in most cases - just a few lines of code.
Both approaches allow for type- and position-safe queries.
We can solve our needs without the need to rely on anything existing in advance.
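For example, passing the list already formatted as a JSON array (2016+):

DECLARE @json NVARCHAR(MAX) = N'["a","b","c"]'

-- [key] is the zero-based position within the array
SELECT j.[key] AS position, j.[value] AS item
FROM OPENJSON(@json) j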
For the very best performance you can use this function:
CREATE FUNCTION [dbo].StringSplit
(
    @String VARCHAR(MAX), @Separator CHAR(1)
)
RETURNS @Result TABLE(Value VARCHAR(MAX))
AS
BEGIN
    DECLARE @SeparatorPosition INT = CHARINDEX(@Separator, @String),
            @Value VARCHAR(MAX), @StartPosition INT = 1

    -- No separator at all: the whole input is a single value
    IF @SeparatorPosition = 0
    BEGIN
        INSERT INTO @Result VALUES(@String)
        RETURN
    END

    -- Append a trailing separator so the loop also captures the last item
    SET @String = @String + @Separator

    WHILE @SeparatorPosition > 0
    BEGIN
        SET @Value = SUBSTRING(@String, @StartPosition, @SeparatorPosition - @StartPosition)

        IF (@Value <> '')
            INSERT INTO @Result VALUES(@Value)

        SET @StartPosition = @SeparatorPosition + 1
        SET @SeparatorPosition = CHARINDEX(@Separator, @String, @StartPosition)
    END

    RETURN
END
This function returns a table - SELECT * FROM dbo.StringSplit('12,13,14,15,16', ',') - so you can join the function to your table or use IN in the WHERE clause.

FnSplit not working for SQL stored procedure to take multiple parameters

I have a stored procedure that currently takes in one value (ChainId) for a parameter. I am trying to allow the user to select multiple values of (ChainId). My WHERE statement is below. Could someone help point me in a better direction than the one I am going in now? Currently the query will run and return no data if I select multiple values for the parameter.
WHERE EndAuth IS NULL AND CL.CHIND IN (
    SELECT [Value] FROM dbo.FnSplit(@ChainId, ','))
ORDER BY CL.CHIND
This is a popular function in SQL Server, so I'll assume you're working with that. Make sure your parameter is of type Varchar(MAX). @ChainId is passed as your string (ideally from SSRS) and ',' is passed as your delimiter. In SSRS, if you have a text box for your users to manually enter multiple values, they will enter something like 'value1, value2, value3'.
Test this out:
Declare @Yes_No Varchar(Max)
Set @Yes_No = 'y,n'

Select @Yes_No
Select * from SplitString('y,n', ',')
Select * from SplitString(@Yes_No, ',')
Your results will be
y,n
----
y
n
----
y
n
The reason I say to use Varchar(Max) and not int or, for example, Varchar(10), is that a smaller type would truncate the string, so the function would never see all the values.
Try this:
Declare @Yes_No Varchar(1)
Set @Yes_No = 'y,n'

Select * from SplitString(@Yes_No, ',')
The result will be:
y
The reason is that the variable only holds a value of 1 character in length ('y,n' is truncated to 'y'), so that is all the function gets to split. As you can see, there isn't much to split.
This is just the way SSRS accepts parameters. FN_Split isn't necessarily a built-in function, but a widely popular one designed to allow you to pass multiple values in one string, with a pre-specified delimiter. So make sure you also go to your parameter in the report and specify that it will allow multiple values. You will also want to supply a list of potential values for your users to select from. You'll either do this by manually populating a small list or by providing another data source in the form of a stored procedure or table.

SQL How to Split One Column into Multiple Variable Columns

I am working on MSSQL, trying to split one string column into multiple columns. The string column has numbers separated by semicolons, like:
190230943204;190234443204;
However, some rows have more numbers than others, so in the database you can have
190230943204;190234443204;
121340944534;340212343204;134530943204
I've seen some solutions for splitting one column into a specific number of columns, but not a variable number of columns. The rows that have less data (2 strings separated by semicolons instead of 3) should have nulls in the third column.
Ideas? Let me know if I must clarify anything.
Splitting this data into separate columns is a very good start (comma-separated values are a heresy). However, a "variable number of properties" should typically be modeled as a one-to-many relationship.
CREATE TABLE main_entity (
    id INT PRIMARY KEY,
    other_fields INT
);

CREATE TABLE entity_properties (
    id INT PRIMARY KEY,
    main_entity_id INT NOT NULL,
    property_value INT,
    FOREIGN KEY (main_entity_id) REFERENCES main_entity(id)
);
entity_properties.main_entity_id is a foreign key to main_entity.id.
Congratulations, you are on the right path, this is called normalisation. You are about to reach the First Normal Form.
Beware, however: these properties should have a sensibly similar nature (i.e. all phone numbers, or addresses, etc.). Do not fall to the dark side (a.k.a. the Entity-Attribute-Value anti-pattern) and be tempted to throw all properties into the same table. If you can identify several types of attributes, store each type in a separate table.
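Once the data is normalized like this, reading an entity together with all of its properties is a plain join:

SELECT m.id, m.other_fields, p.property_value
FROM main_entity m
LEFT JOIN entity_properties p ON p.main_entity_id = m.id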
If these are all fixed length strings (as in the question), then you can do the work fairly simply (at least relative to other solutions):
select substring(col, 1 + 13*(n-1), 12) as val
from t join
     (select 1 as n union all select 2 union all select 3
     ) n
     on len(t.col) >= 13*(n.n - 1) + 12
This is a useful hack if all the entries are the same size (not so easy if they are of different sizes). Do, however, think about the data structure, because a semicolon (or comma) separated list is not a very good data structure.
If I were you, I would create a simple function that divides values separated by ';', like this:
IF EXISTS (SELECT * FROM sysobjects WHERE id = object_id(N'fn_Split_List') AND xtype IN (N'FN', N'IF', N'TF'))
BEGIN
    DROP FUNCTION [dbo].[fn_Split_List]
END
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE FUNCTION [dbo].[fn_Split_List](@List NVARCHAR(512))
RETURNS @ResultRowset TABLE ( [Value] NVARCHAR(128) PRIMARY KEY)
AS
BEGIN
    -- Wrap every item in CDATA so characters that are special to XML survive the conversion
    DECLARE @XML xml = N'<r><![CDATA[' + REPLACE(@List, ';', ']]></r><r><![CDATA[') + ']]></r>'
    INSERT INTO @ResultRowset ([Value])
    SELECT DISTINCT RTRIM(LTRIM(Tbl.Col.value('.', 'NVARCHAR(128)')))
    FROM @XML.nodes('//r') Tbl(Col)
    RETURN
END
GO
GO
Then simply call it in this way:
SET NOCOUNT ON
GO
DECLARE @RawData TABLE( [Value] NVARCHAR(256))
INSERT INTO @RawData ([Value])
VALUES ('1111111;22222222')
    ,('3333333;113113131')
    ,('776767676')
    ,('89332131;313131312;54545353')

SELECT SL.[Value]
FROM @RawData AS RD
CROSS APPLY [fn_Split_List] ([Value]) as SL

SET NOCOUNT OFF
GO
The result is as follows:
Value
1111111
22222222
113113131
3333333
776767676
313131312
54545353
89332131
Anyway, the logic in the function is not complicated, so you can easily put it anywhere you need.
Note: there is no limitation on how many values can be separated with ';', but there is a length limitation in the function, which you can raise to NVARCHAR(MAX) if you need.
EDIT:
As I can see, there are some rows in your example that will cause the function to return empty strings. For example:
number;number;
will return:
number
number
'' (empty string)
To clear them out, just add the following WHERE clause to the statement above, like this:
SELECT SL.[Value]
FROM @RawData AS RD
CROSS APPLY [fn_Split_List] ([Value]) as SL
WHERE LEN(SL.[Value]) > 0

SQL: SELECT number text based on a number

Background: I have an SQL database that contains a column (foo) of a text type, not integer. In that column I store integers in text form.
Question: Is it possible to SELECT the rows that contain (in the foo column) a number greater/less than n?
PS: I have a very good reason to store them as text form. Please refrain from commenting on that.
Update: (Forgot to mention) I am storing it in SQLite3.
SELECT foo
FROM Table
WHERE CAST(foo as int) > @n
select *
from tableName
where cast(textColumn as int) > 5
A simple CAST in the WHERE clause will work as long as you are sure that the data in the foo column is going to properly convert to an integer. If not, your SELECT statement will throw an error. I would suggest you add an extra step here and take out the non-numeric characters before casting the field to an int. Here is a link on how to do something similar:
http://blog.sqlauthority.com/2007/05/13/sql-server-udf-function-to-parse-alphanumeric-characters-from-string/
The only real modification you would need to do on this function would be to change the following lines:
PATINDEX('%[^0-9A-Za-z]%', @string)
to
PATINDEX('%[^0-9]%', @string)
The results from that UDF should then be castable to an int without it throwing an error. It will further slow down your query, but it will be safer. You could even put your CAST inside the UDF and make it one call. The final UDF would look like this:
CREATE FUNCTION dbo.UDF_ParseAlphaChars
(
    @string VARCHAR(8000)
)
RETURNS int
AS
BEGIN
    DECLARE @IncorrectCharLoc SMALLINT
    SET @IncorrectCharLoc = PATINDEX('%[^0-9]%', @string)

    -- Remove non-digit characters one at a time until only digits remain
    WHILE @IncorrectCharLoc > 0
    BEGIN
        SET @string = STUFF(@string, @IncorrectCharLoc, 1, '')
        SET @IncorrectCharLoc = PATINDEX('%[^0-9]%', @string)
    END

    RETURN CAST(@string as int)
END
GO
Your final SELECT statement would look something like this:
SELECT *
FROM Table
WHERE dbo.UDF_ParseAlphaChars(Foo) > 5
EDIT
Based upon the new information that the database is SQLite, the above probably won't work directly. I don't believe SQLite has native support for UDFs. You might be able to create a type of UDF using your programming language of choice (like this: http://www.christian-etter.de/?p=439)
The other option I see to safely get all of your data (an IsNumeric would exclude certain rows from your results, which might not be what you want) would probably be to create an extra column that has the int representation of the string. It is a little more dangerous in that you need to keep two fields in sync, but it will allow you to quickly sort and filter the table data.
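A minimal sketch of that extra-column approach in SQLite (the table and column names are illustrative):

-- Shadow column holding the integer representation of foo
ALTER TABLE my_table ADD COLUMN foo_int INTEGER;
UPDATE my_table SET foo_int = CAST(foo AS INTEGER);
CREATE INDEX idx_my_table_foo_int ON my_table(foo_int);

-- Filtering and sorting can now use the indexed int column
SELECT * FROM my_table WHERE foo_int > 5;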
SELECT *
FROM Table
WHERE CAST(foo as int) > 2000

How to write a stored procedure for searching (CSV)?

How can I write a stored procedure that searches for particular strings in a column of a table, given a set of strings (a CSV string)?
like: select * from xxx where tags like ('oscar','rahman','slumdog')
How can I write the procedure for that combination of tags?
To parse a comma-separated string into a table...
You could then apply this list to Oded's example to create the LIKE parts of the WHERE clause on the fly.
DECLARE @pos int, @currentLocation varchar(100), @input varchar(2048)
SELECT @pos = 0
SELECT @input = 'oscar,rahman,slumdog'
-- Append a trailing comma so the loop below also picks up the last tag
SELECT @input = @input + ','

CREATE TABLE #tempTable (temp varchar(100))

WHILE CHARINDEX(',', @input) > 0
BEGIN
    SELECT @pos = CHARINDEX(',', @input)
    SELECT @currentLocation = RTRIM(LTRIM(SUBSTRING(@input, 1, @pos - 1)))
    INSERT INTO #tempTable (temp) VALUES (@currentLocation)
    SELECT @input = SUBSTRING(@input, @pos + 1, 2048)
END

SELECT * FROM #tempTable

DROP TABLE #tempTable
First off, the use of like for exact matches is sub-optimal. Might as well use =, and if doing so, you can use the IN syntax:
select * from xxx
where tags IN ('oscar', 'rahman', 'slumdog')
I am guessing you are not looking for an exact match, but for any record where the tags field contains all of the tags.
This would be something like this:
select * from xxx
where tags like '%oscar%'
and tags like '%rahman%'
and tags like '%slumdog%'
This would not be very fast or performant, though.
Think about moving this kind of logic into your application, where it is faster and easier to do.
Edit:
Following the comments - there are lots of examples of how to parse delimited strings out there. You can put the parsed values in a table and use dynamic SQL to generate your query, for instance:
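A minimal sketch, assuming the tags have already been parsed into the #tempTable built in the earlier answer (the xxx table and tags column come from the question):

DECLARE @sql NVARCHAR(MAX)

-- Build one "and tags like '%...%'" predicate per parsed tag,
-- then strip the leading " and " with STUFF
SELECT @sql = N'select * from xxx where ' +
    STUFF((SELECT ' and tags like ''%' + temp + '%'''
           FROM #tempTable
           FOR XML PATH('')), 1, 5, '')

EXEC sp_executesql @sql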
But this will have bad performance, and SQL Server will not be able to cache query plans for this kind of thing. As I said above - think about moving this kind of logic to the application level.