Sane/fast method to pass variable parameter lists to SqlServer2008 stored procedure - sql

A fairly comprehensive query of the brain has turned up a thousand and one ways to pass variable length parameter lists that involve such methods as:
CLR based methods for parsing strings to lists of integers
Table valued functions that require the presence of a 'Numbers' table (wtf?)
Passing the data as XML
Our requirements are to pass two variable length lists of integers (~max 20 ints) to a stored procedure. All methods outlined above seem to smell funny.
Is this just the way it has to be done, or is there a better way?
Edit:
I've just found this, which may qualify this question as a dupe

Yes, I'd definitely look at Table Valued Parameters for this. As a side benefit, it may allow you to use a nice, clean set-based implementation for the innards of your procedure directly, without any data massaging required.
Here's another reference as well...
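As a rough sketch of what the table-valued parameter approach could look like for two lists of integers (SQL Server 2008 or later): the type name dbo.IntList, the procedure name dbo.TestPassTvp and the parameter names are made up for illustration and are not from the linked references.

```sql
-- Hypothetical names throughout; adjust to your own schema.
CREATE TYPE dbo.IntList AS TABLE (Value int NOT NULL PRIMARY KEY);
GO

CREATE PROCEDURE dbo.TestPassTvp
(
    @FirstIds  dbo.IntList READONLY,  -- table-valued parameters must be declared READONLY
    @SecondIds dbo.IntList READONLY
)
AS
BEGIN
    SET NOCOUNT ON;
    -- Inside the procedure the parameters behave like read-only table variables,
    -- so they can be joined or filtered on directly.
    SELECT f.Value AS FirstId, s.Value AS SecondId
    FROM @FirstIds AS f
    CROSS JOIN @SecondIds AS s;
END
GO

-- Calling it from T-SQL:
DECLARE @a dbo.IntList, @b dbo.IntList;
INSERT INTO @a (Value) VALUES (1), (2), (3);
INSERT INTO @b (Value) VALUES (10), (20);
EXEC dbo.TestPassTvp @FirstIds = @a, @SecondIds = @b;
```

The CROSS JOIN is only there to show both lists being used as tables; a real procedure would join them to its own data instead.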

Here is a fairly fast method to split strings using only T-SQL when your input parameter is just a string. You need to have a table and a function (as described below) already set up to use this method.
create this table:
CREATE TABLE Numbers (Number int not null primary key identity(1,1))
DECLARE @n int
SET @n=1
SET IDENTITY_INSERT Numbers ON
WHILE @n<=8000
BEGIN
    INSERT INTO Numbers (Number) values (@n)
    SET @n=@n+1
END
SET IDENTITY_INSERT Numbers OFF
create this function to split the string array (I have other versions, where empty sections are eliminated and ones that do not return row numbers):
CREATE FUNCTION [dbo].[FN_ListAllToNumberTable]
(
     @SplitOn char(1)       --REQUIRED, the character to split the @List string on
    ,@List    varchar(8000) --REQUIRED, the list to split apart
)
RETURNS
@ParsedList table
(
     RowNumber int          --position of the value within the list
    ,ListValue varchar(500) --the value itself, trimmed of surrounding spaces
)
AS
BEGIN
    --this will return empty rows, and row numbers
    INSERT INTO @ParsedList
        (RowNumber,ListValue)
    SELECT
        ROW_NUMBER() OVER(ORDER BY number) AS RowNumber
        ,LTRIM(RTRIM(SUBSTRING(ListValue, number+1, CHARINDEX(@SplitOn, ListValue, number+1)-number - 1))) AS ListValue
    FROM (
        SELECT @SplitOn + @List + @SplitOn AS ListValue
    ) AS InnerQuery
    INNER JOIN Numbers n ON n.Number < LEN(InnerQuery.ListValue)
    WHERE SUBSTRING(ListValue, number, 1) = @SplitOn
    RETURN
END
go
here is an example of how to split the parameter apart:
CREATE PROCEDURE TestPass
(
    @ArrayOfInts varchar(255) --pipe "|" separated list of IDs
)
AS
SET NOCOUNT ON
DECLARE @TableIDs TABLE (RowNumber int, IDValue int null)
INSERT INTO @TableIDs (RowNumber, IDValue)
    SELECT RowNumber, CASE WHEN LEN(ListValue)<1 THEN NULL ELSE ListValue END
    FROM dbo.FN_ListAllToNumberTable('|',@ArrayOfInts)
SELECT * FROM @TableIDs
go
this is based on: http://www.sommarskog.se/arrays-in-sql.html

Related

SQL How to Split One Column into Multiple Variable Columns

I am working on MSSQL, trying to split one string column into multiple columns. The string column has numbers separated by semicolons, like:
190230943204;190234443204;
However, some rows have more numbers than others, so in the database you can have
190230943204;190234443204;
121340944534;340212343204;134530943204
I've seen some solutions for splitting one column into a specific number of columns, but not a variable number of columns. The rows that have less data (2 series of strings separated by semicolons instead of 3) will have nulls in the third place.
Ideas? Let me know if I must clarify anything.
Splitting this data into separate columns is a very good start (comma-separated values are a heresy). However, a "variable number of properties" should typically be modeled as a one-to-many relationship.
CREATE TABLE main_entity (
    id INT PRIMARY KEY,
    other_fields INT
);
CREATE TABLE entity_properties (
    main_entity_id INT NOT NULL,
    property_value INT NOT NULL,
    -- composite key, so one main_entity row can own many property rows
    PRIMARY KEY (main_entity_id, property_value),
    FOREIGN KEY (main_entity_id) REFERENCES main_entity(id)
);
entity_properties.main_entity_id is a foreign key to main_entity.id.
Congratulations, you are on the right path, this is called normalisation. You are about to reach the First Normal Form.
Beware, however: these properties should have a sensibly similar nature (i.e. all phone numbers, or all addresses, etc.). Do not fall to the dark side (a.k.a. the Entity-Attribute-Value anti-pattern) and be tempted to throw all properties into the same table. If you can identify several types of attributes, store each type in a separate table.
If these are all fixed length strings (as in the question), then you can do the work fairly simply (at least relative to other solutions):
select substring(t.col, 1+13*(n.n-1), 12) as val
from t join
     (select 1 as n union all select 2 union all select 3
     ) n
     -- keep only the positions that actually exist in this row
     on len(t.col) >= 13*n.n - 1
This is a useful hack if all the entries are the same size (not so easy if they are of different sizes). Do, however, think about the data structure because semi-colon (or comma) separated list is not a very good data structure.
If I were you, I would create a simple function that splits values separated by ';', like this:
IF EXISTS (SELECT * FROM sysobjects WHERE id = object_id(N'fn_Split_List') AND xtype IN (N'FN', N'IF', N'TF'))
BEGIN
DROP FUNCTION [dbo].[fn_Split_List]
END
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE FUNCTION [dbo].[fn_Split_List](@List NVARCHAR(512))
RETURNS @ResultRowset TABLE ( [Value] NVARCHAR(128) PRIMARY KEY)
AS
BEGIN
    DECLARE @XML xml = N'<r><![CDATA[' + REPLACE(@List, ';', ']]></r><r><![CDATA[') + ']]></r>'
    INSERT INTO @ResultRowset ([Value])
    SELECT DISTINCT RTRIM(LTRIM(Tbl.Col.value('.', 'NVARCHAR(128)')))
    FROM @XML.nodes('//r') Tbl(Col)
    RETURN
END
GO
Then simply call it this way:
SET NOCOUNT ON
GO
DECLARE @RawData TABLE( [Value] NVARCHAR(256))
INSERT INTO @RawData ([Value] )
VALUES ('1111111;22222222')
,('3333333;113113131')
,('776767676')
,('89332131;313131312;54545353')
SELECT SL.[Value]
FROM @RawData AS RD
CROSS APPLY [fn_Split_List] ([Value]) as SL
SET NOCOUNT OFF
GO
The result is as follows:
Value
1111111
22222222
113113131
3333333
776767676
313131312
54545353
89332131
Anyway, the logic in the function is not complicated, so you can easily put it anywhere you need.
Note: There is no limit on how many values you can have separated with ';', but there are length limits in the function, which you can raise to NVARCHAR(MAX) if you need to.
EDIT:
As I can see, there are some rows in your example that will cause the function to return empty strings. For example:
number;number;
will return:
number
number
'' (empty string)
To clear them, just add the following where clause to the statement above like this:
SELECT SL.[Value]
FROM @RawData AS RD
CROSS APPLY [fn_Split_List] ([Value]) as SL
WHERE LEN(SL.[Value]) > 0

Sort nvarchar in SQL Server 2008

I have a table with this data in SQL Server :
Id
=====
1
12e
5
and I want to order this data like this:
id
====
1
5
12e
My id column is of type nvarchar(50) and I can't convert it to int.
Is this possible that I sort the data in this way?
As a general rule, if you ever find yourself manipulating parts of columns, you're almost certainly doing it wrong.
If your ID is made up of a numeric and alpha component and you need to fiddle with just the numeric bit, make it two columns and save yourself some angst. In that case, you have an integral id_numeric and a varchar id_alpha and your query is simply:
select cast(id_numeric as varchar(10)) + id_alpha as id
from mytable
order by id_numeric asc
Or, if you really must store that as a single column, create extra columns to hold the individual parts and use those for sorting and selection. But, in order to mitigate the problems in having duplicate data in a row, use triggers to ensure the data remains consistent:
select id
from mytable
order by id_numeric asc
You usually don't want to have to do this splitting on every select since that never scales well. By doing it as an update/insert trigger, you only do the splitting when needed (ie, when the data changes) and this cost is amortised across all the selects. That's a good idea because, in the vast majority of cases, databases are read far more often than they're written.
And it's perfectly normal practice to revert to lesser levels of normalisation for performance reasons, provided that you understand and mitigate the consequences.
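The trigger itself is not shown above, so here is one possible sketch of it. It assumes mytable has a unique id column plus the extra id_numeric (int) and id_alpha columns described earlier, and that the numeric part is always a leading run of digits; the trigger name is made up.

```sql
-- Assumed layout: mytable(id nvarchar(50) PRIMARY KEY, id_numeric int NULL, id_alpha nvarchar(50) NULL)
CREATE TRIGGER dbo.trg_mytable_split_id
ON dbo.mytable
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    IF NOT UPDATE(id) RETURN;  -- nothing to do unless id was written

    UPDATE t
    SET id_numeric = CASE WHEN p.pos > 1
                          THEN CONVERT(int, LEFT(t.id, p.pos - 1))  -- leading digits
                     END,                                           -- NULL when there are none
        id_alpha   = SUBSTRING(t.id, p.pos, LEN(t.id))              -- everything after the digits
    FROM dbo.mytable AS t
    INNER JOIN inserted AS i ON i.id = t.id
    -- position of the first non-digit; the appended 'x' guarantees a match
    CROSS APPLY (SELECT PATINDEX('%[^0-9]%', t.id + 'x') AS pos) AS p;
END
GO
```

With the sample data, '12e' would be stored as id_numeric 12 and id_alpha 'e', while '5' gets id_numeric 5 and an empty id_alpha. A digit run too long for int would raise a conversion error, so a wider type may be needed for real data.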
I'd actually use something along the lines of this function, though be warned that it's not going to be super-speedy. I've modified that function to return only the numbers:
CREATE FUNCTION dbo.UDF_ParseNumericChars
(
    @string VARCHAR(8000)
)
RETURNS VARCHAR(8000)
WITH SCHEMABINDING
AS
BEGIN
    DECLARE @IncorrectCharLoc SMALLINT
    -- find the first non-digit character, if any
    SET @IncorrectCharLoc = PATINDEX('%[^0-9]%', @string)
    WHILE @IncorrectCharLoc > 0
    BEGIN
        -- remove it and look for the next one
        SET @string = STUFF(@string, @IncorrectCharLoc, 1, '')
        SET @IncorrectCharLoc = PATINDEX('%[^0-9]%', @string)
    END
    RETURN @string
END
GO
Once you create that function, then you can do your sort like this:
SELECT YourMixedColumn
FROM YourTable
ORDER BY CONVERT(INT, dbo.UDF_ParseNumericChars(YourMixedColumn))
It can be sorted with the LEN function, as long as shorter values should always come first:
create table #temp (id nvarchar(50) null)
select * from #temp order by LEN(id), id

How hard would you try to make your SQL queries secure?

I am in a situation where I am given a comma-separated VarChar as input to a stored procedure. I want to do something like this:
SELECT * FROM tblMyTable
INNER JOIN /*Bunch of inner joins here*/
WHERE ItemID IN ($MyList);
However, you can't use a VarChar with the IN statement. There are two ways to get around this problem:
1. (The Wrong Way) Create the SQL query in a string, like so:
SET $SQL = '
SELECT * FROM tblMyTable
INNER JOIN /*Bunch of inner joins here*/
WHERE ItemID IN (' + $MyList + ');';
EXEC($SQL);
2. (The Right Way) Create a temporary table that contains the values of $MyList, then join that table in the initial query.
My question is:
Option 2 has a relatively large performance hit with creating a temporary table, which is less than ideal.
While Option 1 is open to an SQL injection attack, since my SPROC is being called from an authenticated source, does it really matter? Only trusted sources will execute this SPROC, so if they choose to bugger up the database, that is their prerogative.
So, how far would you go to make your code secure?
What database are you using? In SQL Server you can create a split function that splits a long string and returns a table in under a second. You use the table function call like a regular table in a query (no temp table necessary).
You need to create a split function, or if you have one just use it. This is how a split function can be used:
SELECT
*
FROM YourTable y
INNER JOIN dbo.yourSplitFunction(@Parameter) s ON y.ID=s.Value
I prefer the number table approach to split a string in TSQL but there are numerous ways to split strings in SQL Server, see the previous link, which explains the PROs and CONs of each.
For the Numbers Table method to work, you need to do this one time table setup, which will create a table Numbers that contains rows from 1 to 10,000:
SELECT TOP 10000 IDENTITY(int,1,1) AS Number
INTO Numbers
FROM sys.objects s1
CROSS JOIN sys.objects s2
ALTER TABLE Numbers ADD CONSTRAINT PK_Numbers PRIMARY KEY CLUSTERED (Number)
Once the Numbers table is set up, create this split function:
CREATE FUNCTION [dbo].[FN_ListToTable]
(
     @SplitOn char(1)       --REQUIRED, the character to split the @List string on
    ,@List    varchar(8000) --REQUIRED, the list to split apart
)
RETURNS TABLE
AS
RETURN
(
    ----------------
    --SINGLE QUERY-- --this will not return empty rows
    ----------------
    SELECT
        ListValue
    FROM (SELECT
              LTRIM(RTRIM(SUBSTRING(List2, number+1, CHARINDEX(@SplitOn, List2, number+1)-number - 1))) AS ListValue
          FROM (
              SELECT @SplitOn + @List + @SplitOn AS List2
          ) AS dt
          INNER JOIN Numbers n ON n.Number < LEN(dt.List2)
          WHERE SUBSTRING(List2, number, 1) = @SplitOn
    ) dt2
    WHERE ListValue IS NOT NULL AND ListValue!=''
);
GO
You can now easily split a CSV string into a table and join on it:
select * from dbo.FN_ListToTable(',','1,2,3,,,4,5,6777,,,')
OUTPUT:
ListValue
-----------------------
1
2
3
4
5
6777
(6 row(s) affected)
You can use the CSV string like this, no temp table necessary:
SELECT * FROM tblMyTable
INNER JOIN /*Bunch of inner joins here*/
WHERE ItemID IN (select ListValue from dbo.FN_ListToTable(',',$MyList));
I would personally prefer option 2: just because a source is authenticated does not mean you should let your guard down. You would leave yourself open to potential rights escalation, where an authenticated low-level user is still able to execute commands against the database that you had not intended.
The phrase you use, 'trusted sources': it might be better to take an X-Files approach and trust no one.
If someone buggers up the database you might still be getting a call.
A good option that is similar to option two is to use a function to create a table in memory from the CSV list. It is reasonably fast and offers the protections of option two. Then that table can be joined to the Inner Join, e.g.
CREATE FUNCTION [dbo].[simple_strlist_to_tbl] (@list nvarchar(MAX))
RETURNS @tbl TABLE (str varchar(4000) NOT NULL) AS
BEGIN
    DECLARE @pos int,
            @nextpos int,
            @valuelen int
    SELECT @pos = 0, @nextpos = 1
    WHILE @nextpos > 0
    BEGIN
        SELECT @nextpos = charindex(',', @list, @pos + 1)
        SELECT @valuelen = CASE WHEN @nextpos > 0
                                THEN @nextpos
                                ELSE len(@list) + 1
                           END - @pos - 1
        INSERT @tbl (str)
            VALUES (substring(@list, @pos + 1, @valuelen))
        SELECT @pos = @nextpos
    END
    RETURN
END
Then in the join:
tblMyTable INNER JOIN
simple_strlist_to_tbl(@MyList) list ON tblMyTable.itemId = list.str
Option 3 is to confirm each item in the list is in fact an integer before concatenating the string to your SQL statement.
Do this by parsing the input string (e.g., split into an array), loop through and convert each value to an int, and then recreate the list yourself before concatenating back to the SQL statement. This will give you reasonable assurance that SQL injection cannot occur.
It is safer to concatenate strings that have been created by your application, because you can do things like check for int, but it also means your code is written in a way that a subsequent developer may modify slightly, thus opening back up the risk of SQL injection, because they do not realize that is what your code is protecting against. Make sure you comment well what you are doing if you go this route.
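The answer above describes doing this check in application code. If the check has to live inside the stored procedure instead, a cruder T-SQL variant of the same idea is to reject any list that contains something other than digits and commas before concatenating it. This is only a sketch: the variable names are illustrative, and tblMyTable and ItemID are taken from the question.

```sql
DECLARE @MyList varchar(8000);
SET @MyList = '1,2,3';  -- illustrative input

-- Reject the list if it contains any character other than 0-9 or a comma.
-- The binary collation keeps the 0-9 range limited to the ASCII digits.
IF @MyList COLLATE Latin1_General_BIN LIKE '%[^0-9,]%'
BEGIN
    RAISERROR('The list may contain only digits and commas.', 16, 1);
    RETURN;
END;

-- Only reached when the list passed the check above.
DECLARE @SQL nvarchar(max);
SET @SQL = N'SELECT * FROM tblMyTable WHERE ItemID IN (' + @MyList + N');';
EXEC sp_executesql @SQL;
```

A list such as '1,,2' or an empty string still gets through the check and then fails with a syntax error, which is noisy but not an injection. As the answer says, comment the check well so that a later developer does not remove it.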
A third option: pass the values to the stored procedure in an array. Then you can either assemble the comma-separated string in your code and use the dynamic SQL option, or (if your flavour of RDBMS permits it) use the array directly in the SELECT statement.
Why don't you write a CLR split function that will do the whole job nice and easy? You can write user-defined table functions that return a table, doing the string splitting with the .NET infrastructure. Hell, in SQL 2008 you can even give them hints if they return the strings sorted in some way (ascending, for example), which can help the optimizer.
Or, if you can't use CLR integration, you have to stick with T-SQL, but I personally would go for the CLR solution.

SQL Server: any equivalent of strpos()?

I'm dealing with an annoying database where one field contains what really should be stored in two separate fields. So the column is stored something like "The first string~#~The second string", where "~#~" is the delimiter. (Again, I didn't design this, I'm just trying to fix it.)
I want a query to move this into two columns, that would look something like this:
UPDATE UserAttributes
SET str1 = SUBSTRING(Data, 1, STRPOS(Data, '~#~')),
str2 = SUBSTRING(Data, STRPOS(Data, '~#~')+3, LEN(Data)-(STRPOS(Data, '~#~')+3))
But I can't find that any equivalent to strpos exists.
Use CHARINDEX:
Select CHARINDEX ('S','MICROSOFT SQL SERVER 2000')
Result: 6
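Applied to the UPDATE from the question, a CHARINDEX-based version might look roughly like the sketch below. It assumes the '~#~' delimiter and the UserAttributes, Data, str1 and str2 names exactly as given in the question, and it skips rows that contain no delimiter.

```sql
UPDATE UserAttributes
SET str1 = LEFT(Data, CHARINDEX('~#~', Data) - 1),                -- text before the delimiter
    str2 = SUBSTRING(Data, CHARINDEX('~#~', Data) + 3, LEN(Data)) -- text after the 3-character delimiter
WHERE CHARINDEX('~#~', Data) > 0;                                 -- only rows that contain it
```

SUBSTRING stops at the end of the string, so passing LEN(Data) as the length is simply a convenient upper bound.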
The PatIndex function should give you the location of the pattern as a part of a string.
PATINDEX ( '%pattern%' , expression )
http://msdn.microsoft.com/en-us/library/ms188395.aspx
If you need your data in columns here is what I use:
create FUNCTION [dbo].[fncTableFromCommaString] (@strList varchar(8000))
RETURNS @retTable Table (intValue int) AS
BEGIN
    DECLARE @intPos int -- int rather than tinyint, so a comma beyond position 255 does not overflow
    WHILE CHARINDEX(',',@strList) > 0
    BEGIN
        SET @intPos=CHARINDEX(',',@strList)
        INSERT INTO @retTable (intValue) values (CONVERT(int, LEFT(@strList,@intPos-1)))
        SET @strList = RIGHT(@strList, LEN(@strList)-@intPos)
    END
    IF LEN(@strList)>0
        INSERT INTO @retTable (intValue) values (CONVERT(int, @strList))
    RETURN
END
Just replace ',' in the function with your delimiter (or maybe even parametrize it)

Filtering With Multi-Select Boxes With SQL Server

I need to filter result sets from sql server based on selections from a multi-select list box. I've been through the idea of doing an instring to determine if the row value exists in the selected filter values, but that's prone to partial matches (e.g. Car matches Carpet).
I also went through splitting the string into a table and joining/matching based on that, but I have reservations about how that is going to perform.
Seeing as this is a seemingly common task, I'm looking to the Stack Overflow community for some feedback and maybe a couple suggestions on the most commonly utilized approach to solving this problem.
I solved this one by writing a table-valued function (we're using 2005) which takes a delimited string and returns a table. You can then join to that or use WHERE EXISTS or WHERE x IN. We haven't done full stress testing yet, but with limited use and reasonably small sets of items I think that performance should be ok.
Below is one of the functions as a starting point for you. I also have one written to specifically accept a delimited list of INTs for ID values in lookup tables, etc.
Another possibility is to use LIKE with the delimiters to make sure that partial matches are ignored, but you can't use indexes with that, so performance will be poor for any large table. For example:
SELECT
my_column
FROM
My_Table
WHERE
@my_string LIKE '%|' + my_column + '|%'
/*
Name: GetTableFromStringList
Description: Returns a table of values extracted from a delimited list
Parameters:
@StringList - A delimited list of strings
@Delimiter - The delimiter used in the delimited list
History:
Date Name Comments
---------- ------------- ----------------------------------------------------
2008-12-03 T. Hummel Initial Creation
*/
CREATE FUNCTION dbo.GetTableFromStringList
(
    @StringList VARCHAR(1000),
    @Delimiter CHAR(1) = ','
)
RETURNS @Results TABLE
(
    String VARCHAR(1000) NOT NULL
)
AS
BEGIN
    DECLARE
        @string VARCHAR(1000),
        @position SMALLINT
    SET @StringList = LTRIM(RTRIM(@StringList)) + @Delimiter
    SET @position = CHARINDEX(@Delimiter, @StringList)
    WHILE (@position > 0)
    BEGIN
        SET @string = LTRIM(RTRIM(LEFT(@StringList, @position - 1)))
        IF (@string <> '')
        BEGIN
            INSERT INTO @Results (String) VALUES (@string)
        END
        SET @StringList = RIGHT(@StringList, LEN(@StringList) - @position)
        SET @position = CHARINDEX(@Delimiter, @StringList, 1)
    END
    RETURN
END
"I've been through the idea of doing an instring to determine if the row value exists in the selected filter values, but that's prone to partial matches (e.g. Car matches Carpet)"
It sounds to me like you aren't including a unique ID, or possibly the primary key, as part of the values in your list box. Ideally each option will have a unique identifier that matches a column in the table you are searching on. If your listbox was like the one below, then you would be able to filter specifically for cars because you would get the unique value 3.
<option value="3">Car</option>
<option value="4">Carpet</option>
Then you just build a where clause that will allow you to find the values you need.
Updated, to answer comment.
"How would I do the related join considering that the user can select an arbitrary number of options from the list box? SELECT * FROM tblTable JOIN tblOptions ON tblTable.FK = ? The problem here is that I need to join on multiple values."
I answered a similar question here.
One method would be to build a temporary table and add each selected option as a row to the temporary table. Then you would simply do a join to your temporary table.
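A minimal sketch of that temporary-table approach, assuming the selected option values are integers that match tblTable.FK (the #SelectedOptions table and its OptionId column are made-up names, and the inserted values are just the two options from the listbox example):

```sql
CREATE TABLE #SelectedOptions (OptionId int NOT NULL PRIMARY KEY);

-- One row per option the user selected (illustrative values)
INSERT INTO #SelectedOptions (OptionId) VALUES (3);
INSERT INTO #SelectedOptions (OptionId) VALUES (4);

-- Exact matches only, so 'Car' can never match 'Carpet'
SELECT t.*
FROM tblTable AS t
INNER JOIN #SelectedOptions AS s ON s.OptionId = t.FK;

DROP TABLE #SelectedOptions;
```

Because the join is on an integer key, the query can use an index on tblTable.FK, which the LIKE-based approaches cannot.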
If you want to simply create your sql dynamically you can do something like this.
SELECT * FROM tblTable WHERE option IN (selected_option_1, selected_option_2, selected_option_n)
I've found that a CLR table-valued function which takes your delimited string and calls Split on the string (returning the array as the IEnumerable) is more performant than anything written in T-SQL (it starts to break down when you have around one million items in the delimited list, but that's much further out than the T-SQL solution).
And then, you could join on the table or check with EXISTS.