split string in column - sql

I have data that has come over from a hierarchical database, and it often has columns that contain data that SHOULD be in another table, if the original database had been relational.
The column's data is formatted in pairs, with LABEL\VALUE with a space as the delimiter, like this:
LABEL1\VALUE LABEL2\VALUE LABEL3\VALUE
There is seldom more than one pair in a record, but there as many as three. There are 24 different possible Labels. There are other columns in this table, including the ID. I have been able to convert this column into a sparse array without using a cursor, with columns for ID, LABEL1, LABEL2, etc....
But this is not ideal for using in another query. My other option it to use a cursor, loop through the entire table once and write to a temp table, but I can't see to get it to work the way I want. I have been able to do it in just a few minutes in VB.NET, using a couple of nested loops, but can't manage to do it in T-SQL even using cursors. Problem is, that I would have to remember to run this program every time before I want to use the table it creates. Not ideal.
So, I read a row, split out the pairs from 'LABEL1\VALUE LABEL2\VALUE LABEL3\VALUE' into an array, then split them out again, then write the rows
ID, LABEL1, VALUE
ID, LABEL2, VALUE
ID, LABEL3, VALUE
etc...
I realize that 'splitting' the strings here is the hard part for SQL to do, but it just seems a lot more difficult that it needs to be. What am I missing?

Assuming that the data label contains no . characters, you can use a simple function for this:
CREATE FUNCTION [dbo].[SplitGriswold]
(
#List NVARCHAR(MAX),
#Delim1 NCHAR(1),
#Delim2 NCHAR(1)
)
RETURNS TABLE
AS
RETURN
(
SELECT
Val1 = PARSENAME(Value,2),
Val2 = PARSENAME(Value,1)
FROM
(
SELECT REPLACE(Value, #Delim2, '.') FROM
(
SELECT LTRIM(RTRIM(SUBSTRING(#List, [Number],
CHARINDEX(#Delim1, #List + #Delim1, [Number]) - [Number])))
FROM (SELECT Number = ROW_NUMBER() OVER (ORDER BY name)
FROM sys.all_objects) AS x
WHERE Number <= LEN(#List)
AND SUBSTRING(#Delim1 + #List, [Number], LEN(#Delim1)) = #Delim1
) AS y(Value)
) AS z(Value)
);
GO
Sample usage:
DECLARE #x TABLE(ID INT, string VARCHAR(255));
INSERT #x VALUES
(1, 'LABEL1\VALUE LABEL2\VALUE LABEL3\VALUE'),
(2, 'LABEL1\VALUE2 LABEL2\VALUE2');
SELECT x.ID, t.val1, t.val2
FROM #x AS x CROSS APPLY
dbo.SplitGriswold(REPLACE(x.string, ' ', N'ŏ'), N'ŏ', '\') AS t;
(I used a Unicode character unlikely to appear in data above, only because a space can be problematic for things like length checks. If this character is likely to appear, choose a different one.)
Results:
ID val1 val2
-- -------- --------
1 LABEL1 VALUE
1 LABEL2 VALUE
1 LABEL3 VALUE
2 LABEL1 VALUE2
2 LABEL2 VALUE2
If your data might have ., then you can just make the query a little more complex, without changing the function, by adding yet another character to the mix that is unlikely or impossible to be in the data:
DECLARE #x TABLE(ID INT, string VARCHAR(255));
INSERT #x VALUES
(1, 'LABEL1\VALUE.A LABEL2\VALUE.B LABEL3\VALUE.C'),
(2, 'LABEL1\VALUE2.A LABEL2.1\VALUE2.B');
SELECT x.ID, val1 = REPLACE(t.val1, N'ű', '.'), val2 = REPLACE(t.val2, N'ű', '.')
FROM #x AS x CROSS APPLY
dbo.SplitGriswold(REPLACE(REPLACE(x.string, ' ', 'ŏ'), '.', N'ű'), 'ŏ', '\') AS t;
Results:
ID val1 val2
-- -------- --------
1 LABEL1 VALUE.A
1 LABEL2 VALUE.B
1 LABEL3 VALUE.C
2 LABEL1 VALUE2.A
2 LABEL2.1 VALUE2.B

With only three values, you can manage to do this by brute force:
select (case when rest like '% %'
then left(rest, charindex(' ', rest) - 1)
else rest
end) as val2,
(case when rest like '% %'
then substring(col, charindex(' ', col) + 1, 1000)
end) as val3
from (select (case when col like '% %'
then left(col, charindex(' ', col) - 1)
else col
end) as val1,
(case when col like '% %'
then substring(col, charindex(' ', col) + 1, 1000)
end) as rest
from t
) t

Using the SQL split string function given at referenced SQL tutorial, you can split the label-value pairs as following
SELECT
id, max(label) as label, max(value) as value
FROM (
SELECT
s.id,
label = case when t.id = 1 then t.val else NULL end,
value = case when t.id = 2 then t.val else NULL end
FROM dbo.Split(N'LABEL1\VALUE1 LABEL2\VALUE2 LABEL3\VALUE3', ' ') s
CROSS APPLY dbo.Split(s.val, '\') t
) t
group by id
You can see that the split string function is called twice, first for splitting pairs from others. Then the second split function joined to previous one using CROSS APPLY splits labels from pairs

Related

SQL : extract next character from string where multiple separators exist

Azure MSSQL Database
I have a column that contains values stored per transaction. The string can contain up to 7 values, separated by a '-'.
I need to be able to extract the value that is stored after the 3rd '-'. The issue is that the length of this column (and the characters that come before the 3rd '-') can vary.
For example:
DIM VALUE
1. NHL--WA-S-MOSG-SER-
2. VDS----HAST-SER-
3. ---D---SER
Row 1 needs to return 'S'
Row 2 needs to return '-'
Row 3 needs to return 'D'
This is by no means an optimal solution, but it works in SQL Server. 😊
TempTable added for testing purposes. Maybe it gives you a hint as of where to start.
Edit: added reference for string_split function (works from SQL Server 2016 up).
CREATE TABLE #tempStrings (
VAL VARCHAR(30)
);
INSERT INTO #tempStrings VALUES ('NHL--WA-S-MOSG-SER-');
INSERT INTO #tempStrings VALUES ('VDS----HAST-SER-');
INSERT INTO #tempStrings VALUES ('---D---SER');
INSERT INTO #tempStrings VALUES ('A-V-D-C--SER');
SELECT
t.VAL,
CASE t.PART WHEN '' THEN '-' ELSE t.PART END AS PART
FROM
(SELECT
t.VAL,
ROW_NUMBER() OVER (PARTITION BY VAL ORDER BY (SELECT NULL)) AS IX,
value AS PART
FROM #tempStrings t
CROSS APPLY string_split(VAL, '-')) t
WHERE t.IX = 4; --DASH COUNT + 1
DROP TABLE #tempStrings;
Output is...
VAL PART
---D---SER D
A-V-D-C--SER C
NHL--WA-S-MOSG-SER- S
VDS----HAST-SER- -
If you always want the fourth element then using CHARINDEX is relatively straightforward:
DROP TABLE IF EXISTS #tmp;
CREATE TABLE #tmp (
rowId INT IDENTITY PRIMARY KEY,
xval VARCHAR(30) NOT NULL
);
INSERT INTO #tmp
VALUES
( 'NHL--WA-S-MOSG-SER-' ),
( 'VDS----HAST-SER-' ),
( '---D---SER' ),
( 'A-V-D-C--SER' );
;WITH cte AS
( -- Work out the position of the 3rd dash
SELECT
rowId,
xval,
CHARINDEX( '-', xval, CHARINDEX( '-', xval, CHARINDEX( '-', xval ) + 1 ) + 1 ) + 1 xstart
FROM #tmp t
), cte2 AS
( -- Work out the length for the substring function
SELECT rowId, xval, xstart, CHARINDEX( '-', xval, xstart) - (xstart) AS xlen
FROM cte
)
SELECT rowId, ISNULL( NULLIF( SUBSTRING( xval, xstart, xlen ), '' ), '-' ) xpart
FROM cte2
I also did a volume test at 1 million rows and this was by far the fastest method compared with STRING_SPLIT, OPENJSON, recursive CTE (the worst at high volume). As a downside this method is less extensible, say you want the second or fifth items for example.

How to extract the n-th value from the text field with delimitors

In SQL table I have a text column with value 'Yellow|Green|Blue' and another column with numeric value. This numeric value defines which part of the text column to be extracted. Values in the text column are separated with '|' separator.
For example:
If numeric value is 0, 1st part of the text field should be extracted: Yellow
If numeric value is 1, 2nd part of the text field should be extracted: Green
And so on.
Is there a way how to extract it dynamically ?
Meaning without using CASE statement like:
case when u.UD_2 =0 then 'Yellow' when u.UD_2=1 then 'Green' when u.UD_2=3 then 'Blue' end Kategorie
UPDATE: We are using SQL Server 2016
This should work for you, in the subquery extract each category to separate columns and after it, use a case statement to choose the needed category.
select case sep when 0 then x.[0] when 1 then x.[1] when 2 then x.[2] end as Kategorie
from (
select *
,LEFT(val, CHARINDEX('|', val) - 1) AS '0'
,LEFT(STUFF(SUBSTRING(val, CHARINDEX('|', val), LEN(val)), 1, 1, ''), CHARINDEX('|', STUFF(SUBSTRING(val, CHARINDEX('|', val), LEN(val)), 1, 1, '')) - 1) AS '1'
,SUBSTRING(SUBSTRING(val, CHARINDEX('|', val), LEN(val)), CHARINDEX('|', val) + 1, LEN(val)) AS '2'
from #test
)x
Sample data:
create table #test
(
val nvarchar(500),
sep int
)
insert into #test values
('Yellow|Green|Blue', 0),
('Yellow|Green|Blue', 1),
('Yellow|Green|Blue', 2)
Note: this only works if there are exact 3 values separated with |
UPDATE
And this is a dynamic way to achieve it, doesn't matter how many categories will be separated:
SELECT x.Kategorie
FROM (
SELECT DISTINCT node.s.value('.', 'NVARCHAR(500)') AS Kategorie
,ROW_NUMBER() OVER (PARTITION BY sep ORDER BY (SELECT NULL)) - 1 as rn
FROM (
SELECT sep
,CAST('<M>' + REPLACE(val, '|', '</M><M>') + '</M>' AS XML) AS Kategorie
FROM #test
) AS s
CROSS APPLY Kategorie.nodes('/M') AS node(s)
)x
JOIN #test AS t ON t.sep = x.rn
One possible approach is to split your text data into substrings and get each substring position.
Starting with SQL Server 2016 you may use STRING_SPLIT() to split a string, but in your case this is not an option, because this function returns a table with all substrings, but they are not ordered and the order of substrings is not guaranteed.
Again, if you use SQL Server 2016+, you may try to transform the text data into a valid JSON array using REPLACE() ('Yellow|Green|Blue' is transformed into '["Yellow","Green","Blue"]') and after that to use OPENJSON() with default schema to retrieve this JSON array as table, which has columns key, value and type (key column contains the index of the element in the specified array).
Input:
CREATE TABLE #Data (
TextValue nvarchar(max),
IndexValue int
)
INSERT INTO #Data
(TextValue, IndexValue)
VALUES
('Yellow|Green|Blue', 0),
('Yellow|Green|Blue', 1)
T-SQL:
SELECT d.TextValue, d.IndexValue, j.[value] AS [Value]
FROM #Data d
CROSS APPLY OPENJSON(CONCAT(N'["', REPLACE(d.TextValue, N'|', N'","'), N'"]')) j
WHERE d.IndexValue = j.[key]
Output:
---------------------------------------
TextValue IndexValue Value
---------------------------------------
Yellow|Green|Blue 0 Yellow
Yellow|Green|Blue 1 Green

SQL Server: Select rows with multiple occurrences of regex match in a column

I’m fairly used to using MySQL, but not particularly familiar with SQL Server. Tough luck, the database I’m dealing with here is on SQL Server 2014.
I have a table with a column whose values are all integers with leading, separating, and trailing semicolons, like these three fictitious rows:
;905;1493;384;13387;29;933;467;28732;
;905;138;3084;1387;290;9353;4767;2732;
;9085;14493;3864;130387;289;933;4767;28732;
What I am trying to do now is to select all rows where more than one number taken from a list of numbers appears in this column. So for example, given the three rows above, if I have the group 905,467,4767, the statement I’m trying to figure out how to construct should return the first two rows: the first row contains 905 and 467; the second row contains 905 and 4767. The third row contains only 4767, so that row should not be returned.
As far as I can tell, SQL Server does not actually support regex directly (and I don’t even know what managed code is), which doesn’t help. Even with regex, I wouldn’t know where to begin. Oracle seems to have a function that would be very useful, but that’s Oracle.
Most similar questions on here deal with finding multiple instances of the same character (usually singular) and solve the problem by replacing the string to match with nothing and counting the difference in length. I suppose that would technically work here, too, but given a ‘filter’ group of 15 numbers, the SELECT statement would become ridiculously long and convoluted and utterly unreadable. Additionally, I only want to match entire numbers (so if one of the numbers to match is 29, the value 29 would match in the first row, but the value 290 in the second row should not match), which means I’d have to include the semicolons in the REPLACE clause and then discount them when calculating the length. A complete mess.
What I would ideally like to do is something like this:
SELECT * FROM table WHERE REGEXP_COUNT(column, ';(905|467|4767);') > 1
– but that will obviously not work, for all kinds of reasons (the most obvious one being the nonexistence of REGEXP_COUNT outside Oracle).
Is there some sane, manageable way of doing this?
You can do
SELECT *
FROM Mess
CROSS APPLY (SELECT COUNT(*)
FROM (VALUES (905),
(467),
(4767)) V(Num)
WHERE Col LIKE CONCAT('%;', Num, ';%')) ca(count)
WHERE count > 1
SQL Fiddle
Or alternatively
WITH Nums
AS (SELECT Num
FROM (VALUES (905),
(467),
(4767)) V(Num))
SELECT Mess.*
FROM Mess
CROSS APPLY (VALUES(CAST(CONCAT('<x>', REPLACE(Col, ';', '</x><x>'), '</x>') AS XML))) x(x)
CROSS APPLY (SELECT COUNT(*)
FROM (SELECT n.value('.', 'int')
FROM x.x.nodes('/x') n(n)
WHERE n.value('.', 'varchar') <> ''
INTERSECT
SELECT Num
FROM Nums) T(count)
HAVING COUNT(*) > 1) ca2(count)
Could you put your arguments into a table (perhaps using a table-valued function accepting a string (of comma-separated integers) as a parameter) and use something like this?
DECLARE #T table (String varchar(255))
INSERT INTO #T
VALUES
(';905;1493;384;13387;29;933;467;28732;')
, (';905;138;3084;1387;290;9353;4767;2732;')
, (';9085;14493;3864;130387;289;933;4767;28732;')
DECLARE #Arguments table (Arg int)
INSERT INTO #Arguments
VALUES
(905)
, (467)
, (4767)
SELECT String
FROM
#T
CROSS JOIN #Arguments
GROUP BY String
HAVING SUM(CASE WHEN PATINDEX('%;' + CAST(Arg AS varchar) + ';%', String) > 0 THEN 1 ELSE 0 END) > 1
And example of using this with a function to generate the arguments:
CREATE FUNCTION GenerateArguments (#Integers varchar(255))
RETURNS #Arguments table (Arg int)
AS
BEGIN
WITH cte
AS
(
SELECT
PATINDEX('%,%', #Integers) p
, LEFT(#Integers, PATINDEX('%,%', #Integers) - 1) n
UNION ALL
SELECT
CASE WHEN PATINDEX('%,%', SUBSTRING(#Integers, p + 1, LEN(#Integers))) + p = p THEN 0 ELSE PATINDEX('%,%', SUBSTRING(#Integers, p + 1, LEN(#Integers))) + p END
, CASE WHEN PATINDEX('%,%', SUBSTRING(#Integers, p + 1, LEN(#Integers))) = 0 THEN RIGHT(#Integers, PATINDEX('%,%', REVERSE(#Integers)) - 1) ELSE LEFT(SUBSTRING(#Integers, p + 1, LEN(#Integers)), PATINDEX('%,%', SUBSTRING(#Integers, p + 1, LEN(#Integers))) - 1) END
FROM cte
WHERE p <> 0
)
INSERT INTO #Arguments (Arg)
SELECT n
FROM cte
RETURN
END
GO
DECLARE #T table (String varchar(255))
INSERT INTO #T
VALUES
(';905;1493;384;13387;29;933;467;28732;')
, (';905;138;3084;1387;290;9353;4767;2732;')
, (';9085;14493;3864;130387;289;933;4767;28732;')
;
SELECT String
FROM
#T
CROSS JOIN GenerateArguments('905,467,4767')
GROUP BY String
HAVING SUM(CASE WHEN PATINDEX('%;' + CAST(Arg AS varchar) + ';%', String) > 0 THEN 1 ELSE 0 END) > 1
You can achieve this using the like function for the regex and row_number to determine the number of matches.
Here we declare the column values for testing:
DECLARE #tbl TABLE (
string NVARCHAR(MAX)
)
INSERT #tbl VALUES
(';905;1493;384;13387;29;933;467;28732;'),
(';905;138;3084;1387;290;9353;4767;2732;'),
(';9085;14493;3864;130387;289;933;4767;28732;')
Then we pass your search parameters into a table variable to be joined on:
DECLARE #search_tbl TABLE (
search_value INT
)
INSERT #search_tbl VALUES
(905),
(467),
(4767)
Finally we join the table with the column to search for onto the search table. We apply the row_number function to determine the number of times it matches. We select from this subquery where the row_number = 2 meaning that it joined at least twice.
SELECT
string
FROM (
SELECT
tbl.string,
ROW_NUMBER() OVER (PARTITION BY tbl.string ORDER BY tbl.string) AS rn
FROM #tbl tbl
JOIN #search_tbl search_tbl ON
tbl.string LIKE '%;' + CAST(search_tbl.search_value AS NVARCHAR(MAX)) + ';%'
) tbl
WHERE rn = 2
You could build a where clause like this :
WHERE
case when column like '%;905;%' then 1 else 0 end +
case when column like '%;467;%' then 1 else 0 end +
case when column like '%;4767;%' then 1 else 0 end >= 2
The advantage is that you do not need a helper table. I don't know how you build the query, but the following also works, and is useful if the numbers are in a tsql variable.
case when column like ('%;' + #n + ';%') then 1 else 0 end

Split string and display below other column data using SQL Server [duplicate]

I have a table that looks like this:
ProductId, Color
"1", "red, blue, green"
"2", null
"3", "purple, green"
And I want to expand it to this:
ProductId, Color
1, red
1, blue
1, green
2, null
3, purple
3, green
Whats the easiest way to accomplish this? Is it possible without a loop in a proc?
Take a look at this function. I've done similar tricks to split and transpose data in Oracle. Loop over the data inserting the decoded values into a temp table. The convent thing is that MS will let you do this on the fly, while Oracle requires an explicit temp table.
MS SQL Split Function
Better Split Function
Edit by author:
This worked great. Final code looked like this (after creating the split function):
select pv.productid, colortable.items as color
from product p
cross apply split(p.color, ',') as colortable
based on your tables:
create table test_table
(
ProductId int
,Color varchar(100)
)
insert into test_table values (1, 'red, blue, green')
insert into test_table values (2, null)
insert into test_table values (3, 'purple, green')
create a new table like this:
CREATE TABLE Numbers
(
Number int not null primary key
)
that has rows containing values 1 to 8000 or so.
this will return what you want:
EDIT
here is a much better query, slightly modified from the great answer from #Christopher Klein:
I added the "LTRIM()" so the spaces in the color list, would be handled properly: "red, blue, green". His solution requires no spaces "red,blue,green". Also, I prefer to use my own Number table and not use master.dbo.spt_values, this allows the removal of one derived table too.
SELECT
ProductId, LEFT(PartialColor, CHARINDEX(',', PartialColor + ',')-1) as SplitColor
FROM (SELECT
t.ProductId, LTRIM(SUBSTRING(t.Color, n.Number, 200)) AS PartialColor
FROM test_table t
LEFT OUTER JOIN Numbers n ON n.Number<=LEN(t.Color) AND SUBSTRING(',' + t.Color, n.Number, 1) = ','
) t
EDIT END
SELECT
ProductId, Color --,number
FROM (SELECT
ProductId
,CASE
WHEN LEN(List2)>0 THEN LTRIM(RTRIM(SUBSTRING(List2, number+1, CHARINDEX(',', List2, number+1)-number - 1)))
ELSE NULL
END AS Color
,Number
FROM (
SELECT ProductId,',' + Color + ',' AS List2
FROM test_table
) AS dt
LEFT OUTER JOIN Numbers n ON (n.Number < LEN(dt.List2)) OR (n.Number=1 AND dt.List2 IS NULL)
WHERE SUBSTRING(List2, number, 1) = ',' OR List2 IS NULL
) dt2
ORDER BY ProductId, Number, Color
here is my result set:
ProductId Color
----------- --------------
1 red
1 blue
1 green
2 NULL
3 purple
3 green
(6 row(s) affected)
which is the same order you want...
You can try this out, doesnt require any additional functions:
declare #t table (col1 varchar(10), col2 varchar(200))
insert #t
select '1', 'red,blue,green'
union all select '2', NULL
union all select '3', 'green,purple'
select col1, left(d, charindex(',', d + ',')-1) as e from (
select *, substring(col2, number, 200) as d from #t col1 left join
(select distinct number from master.dbo.spt_values where number between 1 and 200) col2
on substring(',' + col2, number, 1) = ',') t
I arrived this question 10 years after the post.
SQL server 2016 added STRING_SPLIT function.
By using that, this can be written as below.
declare #product table
(
ProductId int,
Color varchar(max)
);
insert into #product values (1, 'red, blue, green');
insert into #product values (2, null);
insert into #product values (3, 'purple, green');
select
p.ProductId as ProductId,
ltrim(split_table.value) as Color
from #product p
outer apply string_split(p.Color, ',') as split_table;
Fix your database if at all possible. Comma delimited lists in database cells indicate a flawed schema 99% of the time or more.
I would create a CLR table-defined function for this:
http://msdn.microsoft.com/en-us/library/ms254508(VS.80).aspx
The reason for this is that CLR code is going to be much better at parsing apart the strings (computational work) and can pass that information back as a set, which is what SQL Server is really good at (set management).
The CLR function would return a series of records based on the parsed values (and the input id value).
You would then use a CROSS APPLY on each element in your table.
Just convert your columns into xml and query it. Here's an example.
select
a.value('.', 'varchar(42)') c
from (select cast('<r><a>' + replace(#CSV, ',', '</a><a>') + '</a></r>' as xml) x) t1
cross apply x.nodes('//r/a') t2(a)
Why not use dynamic SQL for this purpose, something like this(adapt to your needs):
DECLARE #dynSQL VARCHAR(max)
SET #dynSQL = 'insert into DestinationTable(field) values'
select #dynSQL = #dynSQL + '('+ REPLACE(Color,',',''',''') + '),' from Table
SET #dynSql = LEFT(#dynSql,LEN(#dynSql) -1) -- delete the last comma
exec #dynSql
One advantage is that you can use it on any SQL Server version

How do I expand comma separated values into separate rows using SQL Server 2005?

I have a table that looks like this:
ProductId, Color
"1", "red, blue, green"
"2", null
"3", "purple, green"
And I want to expand it to this:
ProductId, Color
1, red
1, blue
1, green
2, null
3, purple
3, green
Whats the easiest way to accomplish this? Is it possible without a loop in a proc?
Take a look at this function. I've done similar tricks to split and transpose data in Oracle. Loop over the data inserting the decoded values into a temp table. The convent thing is that MS will let you do this on the fly, while Oracle requires an explicit temp table.
MS SQL Split Function
Better Split Function
Edit by author:
This worked great. Final code looked like this (after creating the split function):
select pv.productid, colortable.items as color
from product p
cross apply split(p.color, ',') as colortable
based on your tables:
create table test_table
(
ProductId int
,Color varchar(100)
)
insert into test_table values (1, 'red, blue, green')
insert into test_table values (2, null)
insert into test_table values (3, 'purple, green')
create a new table like this:
CREATE TABLE Numbers
(
Number int not null primary key
)
that has rows containing values 1 to 8000 or so.
this will return what you want:
EDIT
here is a much better query, slightly modified from the great answer from #Christopher Klein:
I added the "LTRIM()" so the spaces in the color list, would be handled properly: "red, blue, green". His solution requires no spaces "red,blue,green". Also, I prefer to use my own Number table and not use master.dbo.spt_values, this allows the removal of one derived table too.
SELECT
ProductId, LEFT(PartialColor, CHARINDEX(',', PartialColor + ',')-1) as SplitColor
FROM (SELECT
t.ProductId, LTRIM(SUBSTRING(t.Color, n.Number, 200)) AS PartialColor
FROM test_table t
LEFT OUTER JOIN Numbers n ON n.Number<=LEN(t.Color) AND SUBSTRING(',' + t.Color, n.Number, 1) = ','
) t
EDIT END
SELECT
ProductId, Color --,number
FROM (SELECT
ProductId
,CASE
WHEN LEN(List2)>0 THEN LTRIM(RTRIM(SUBSTRING(List2, number+1, CHARINDEX(',', List2, number+1)-number - 1)))
ELSE NULL
END AS Color
,Number
FROM (
SELECT ProductId,',' + Color + ',' AS List2
FROM test_table
) AS dt
LEFT OUTER JOIN Numbers n ON (n.Number < LEN(dt.List2)) OR (n.Number=1 AND dt.List2 IS NULL)
WHERE SUBSTRING(List2, number, 1) = ',' OR List2 IS NULL
) dt2
ORDER BY ProductId, Number, Color
here is my result set:
ProductId Color
----------- --------------
1 red
1 blue
1 green
2 NULL
3 purple
3 green
(6 row(s) affected)
which is the same order you want...
You can try this out, doesnt require any additional functions:
declare #t table (col1 varchar(10), col2 varchar(200))
insert #t
select '1', 'red,blue,green'
union all select '2', NULL
union all select '3', 'green,purple'
select col1, left(d, charindex(',', d + ',')-1) as e from (
select *, substring(col2, number, 200) as d from #t col1 left join
(select distinct number from master.dbo.spt_values where number between 1 and 200) col2
on substring(',' + col2, number, 1) = ',') t
I arrived this question 10 years after the post.
SQL server 2016 added STRING_SPLIT function.
By using that, this can be written as below.
declare #product table
(
ProductId int,
Color varchar(max)
);
insert into #product values (1, 'red, blue, green');
insert into #product values (2, null);
insert into #product values (3, 'purple, green');
select
p.ProductId as ProductId,
ltrim(split_table.value) as Color
from #product p
outer apply string_split(p.Color, ',') as split_table;
Fix your database if at all possible. Comma delimited lists in database cells indicate a flawed schema 99% of the time or more.
I would create a CLR table-defined function for this:
http://msdn.microsoft.com/en-us/library/ms254508(VS.80).aspx
The reason for this is that CLR code is going to be much better at parsing apart the strings (computational work) and can pass that information back as a set, which is what SQL Server is really good at (set management).
The CLR function would return a series of records based on the parsed values (and the input id value).
You would then use a CROSS APPLY on each element in your table.
Just convert your columns into xml and query it. Here's an example.
select
a.value('.', 'varchar(42)') c
from (select cast('<r><a>' + replace(#CSV, ',', '</a><a>') + '</a></r>' as xml) x) t1
cross apply x.nodes('//r/a') t2(a)
Why not use dynamic SQL for this purpose, something like this(adapt to your needs):
DECLARE #dynSQL VARCHAR(max)
SET #dynSQL = 'insert into DestinationTable(field) values'
select #dynSQL = #dynSQL + '('+ REPLACE(Color,',',''',''') + '),' from Table
SET #dynSql = LEFT(#dynSql,LEN(#dynSql) -1) -- delete the last comma
exec #dynSql
One advantage is that you can use it on any SQL Server version