Joining tables containing comma delimited values - sql

I have three Excel sheets that I loaded into tables in SQL Server, and I need to join these tables. However, I believe (I have already tried) that a normal join won't work. I have a programming background but not much experience with SQL.
Table1
ID Data_column reference_number
1 some data 1528,ss-456
2 some data 9523
3 some data ss-952
4 some data null
Table2
ID Data_column
ss-456 some data
ss-952 some data
Table3
ID Data_column
1528 some data
9523 some data
In the case below, how can I join this row to both tables?
Table1
ID Data_column reference_number
1 some data 1528,ss-456

declare @t1 as table(
id int
,data_column varchar(20)
,reference_number varchar(20)
)
declare @t2 as table(
id varchar(20)
,data_column varchar(20)
)
declare @t3 as table(
id varchar(20)
,data_column varchar(20)
)
insert into @t1 values(1,'some data','1528,ss-456'),(2,'some data','9523'),(3,'some data','ss-952'),(4,'some data',null);
insert into @t2 values('ss-456','some data'),('ss-952','some data');
insert into @t3 values(1528,'some data'),(9523,'some data');
Quick solution
select * from @t1 t1
left outer join @t2 t2 on t1.reference_number like '%'+t2.id or t1.reference_number like t2.id+'%'
left outer join @t3 t3 on t1.reference_number like '%'+t3.id or t1.reference_number like t3.id+'%'
Result (left join):
id data_column reference_number id data_column id data_column
1 some data 1528,ss-456 ss-456 some data 1528 some data
2 some data 9523 NULL NULL 9523 some data
3 some data ss-952 ss-952 some data NULL NULL
4 some data NULL NULL NULL NULL NULL
Change 'left outer join' to 'inner join' to return only rows that have a match in both tables.

Clumsy design, clumsy solution:
SELECT *
FROM Table1
INNER JOIN Table2 ON ',' + Table1.reference_number + ',' LIKE '%,' + Table2.ID + ',%'
INNER JOIN Table3 ON ',' + Table1.reference_number + ',' LIKE '%,' + Table3.ID + ',%'
You must append leading and trailing commas to make sure that, for example, 1528,ss-456asdf does not match %ss-456%.
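A minimal sketch of why the wrapping commas matter. The table variables @refs and @keys and their sample rows are made up for this illustration; the second row contains 'ss-456' only as the start of a longer element.

```sql
-- Hypothetical sample data (not from the question).
DECLARE @refs TABLE (id int, reference_number varchar(30));
DECLARE @keys TABLE (id varchar(20));

INSERT INTO @refs VALUES (1, '1528,ss-456'), (2, '1528,ss-456asdf');
INSERT INTO @keys VALUES ('ss-456');

-- Unanchored pattern: matches both rows, including the false positive in row 2.
SELECT r.id, k.id AS matched_key
FROM @refs r
INNER JOIN @keys k ON r.reference_number LIKE '%' + k.id + '%';

-- Comma-wrapped pattern: matches only row 1, where 'ss-456' is a whole list element.
SELECT r.id, k.id AS matched_key
FROM @refs r
INNER JOIN @keys k ON ',' + r.reference_number + ',' LIKE '%,' + k.id + ',%';
```

Note that this still treats %, _ and [ inside a key as LIKE wildcards, so it is only safe when the keys cannot contain those characters.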

I see two problems here: the ID columns in tables 2 and 3 have inconsistent types, and table 1 aggregates several referenced keys in a single column. Here is an example of how to solve both problems. To split the reference_number column I used the STRING_SPLIT function.
Update:
I added a solution which should work with SQL Server 2012.
I assumed that you wish to join data from table 1 with table 2 or table 3, depending on where the referenced key exists. This is just my guess at what you wanted to achieve.
-- data preparing
declare @t1 as table(
id int
,data_column varchar(20)
,reference_number varchar(20)
)
declare @t2 as table(
id varchar(20)
,data_column varchar(20)
)
declare @t3 as table(
id int
,data_column varchar(20)
)
insert into @t1 values(1,'some data','1528,ss-456'),(2,'some data','9523'),(3,'some data','ss-952'),(4,'some data',null);
insert into @t2 values('ss-456','some data'),('ss-952','some data');
insert into @t3 values(1528,'some data'),(9523,'some data');
-- Solution example version >= 2016
with base as (
select t1.id,t1.data_column,f1.value from @t1 t1 outer apply string_split(t1.reference_number,',') f1)
select b.id,b.data_column,b.value,t2.data_column from base b join @t2 t2 on b.value = t2.id
union all
select b.id,b.data_column,b.value,t3.data_column from base b join @t3 t3 on try_cast(b.value as int ) = t3.id
union all
select b.id,b.data_column,b.value,null from base b where b.value is null;
-- Solution for SQL Version < 2016
with base as (
select t1.id,t1.data_column,f1.value from @t1 t1 outer apply(
SELECT Split.a.value('.', 'NVARCHAR(MAX)') value
FROM
(
SELECT CAST('<X>'+REPLACE(t1.reference_number, ',', '</X><X>')+'</X>' AS XML) AS String
) AS A
CROSS APPLY String.nodes('/X') AS Split(a)
) f1)
select b.id,b.data_column,b.value,t2.data_column from base b join @t2 t2 on b.value = t2.id
union all
select b.id,b.data_column,b.value,t3.data_column from base b join @t3 t3 on try_cast(b.value as int ) = t3.id
union all
select b.id,b.data_column,b.value,null from base b where b.value is null;

You will need a function to divide the comma-separated string into rows. If you don't have access to the built-in string_split() function (available from SQL Server 2016 with compatibility level 130 or higher), there are several alternatives to choose from here
CREATE TABLE table1(
ID INTEGER NOT NULL PRIMARY KEY
,Data_column VARCHAR(10) NOT NULL
,reference_number VARCHAR(11)
);
INSERT INTO table1(ID,Data_column,reference_number) VALUES
(1,'t1somedata','1528,ss-456')
, (2,'t1somedata','9523')
, (3,'t1somedata','ss-952')
, (4,'t1somedata',NULL);
CREATE TABLE table2(
ID VARCHAR(6) NOT NULL PRIMARY KEY
,Data_column VARCHAR(10) NOT NULL
);
INSERT INTO table2(ID,Data_column) VALUES
('ss-456','t2somedata'),
('ss-952','t2somedata');
CREATE TABLE table3(
ID VARCHAR(6) NOT NULL PRIMARY KEY
,Data_column VARCHAR(10) NOT NULL
);
INSERT INTO table3(ID,Data_column) VALUES
('1528','t3somedata'),
('9523','t3somedata');
I have used this split-string function, but you can use almost any of the many freely available ones.
CREATE FUNCTION dbo.SplitStrings_Moden
(
@List NVARCHAR(MAX),
@Delimiter NVARCHAR(255)
)
RETURNS TABLE
WITH SCHEMABINDING AS
RETURN
WITH E1(N) AS ( SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1),
E2(N) AS (SELECT 1 FROM E1 a, E1 b),
E4(N) AS (SELECT 1 FROM E2 a, E2 b),
E42(N) AS (SELECT 1 FROM E4 a, E2 b),
cteTally(N) AS (SELECT 0 UNION ALL SELECT TOP (DATALENGTH(ISNULL(@List,1)))
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E42),
cteStart(N1) AS (SELECT t.N+1 FROM cteTally t
WHERE (SUBSTRING(@List,t.N,1) = @Delimiter OR t.N = 0))
SELECT Item = SUBSTRING(@List, s.N1, ISNULL(NULLIF(CHARINDEX(@Delimiter,@List,s.N1),0)-s.N1,8000))
FROM cteStart s;
This is what the data looks like using the splitstring function:
select *
from table1
cross apply SplitStrings_Moden(reference_number,',')
ID | Data_column | reference_number | Item
-: | :---------- | :--------------- | :-----
1 | t1somedata | 1528,ss-456 | 1528
1 | t1somedata | 1528,ss-456 | ss-456
2 | t1somedata | 9523 | 9523
3 | t1somedata | ss-952 | ss-952
4 | t1somedata | null | null
and now joining to the other tables:
select
*
from (
select *
from table1
cross apply SplitStrings_Moden(reference_number,',')
) t1
left join table2 on t1.item = table2.id
left join table3 on t1.item = table3.id
where t1.item is not null
GO
ID | Data_column | reference_number | Item | ID | Data_column | ID | Data_column
-: | :---------- | :--------------- | :----- | :----- | :---------- | :--- | :----------
1 | t1somedata | 1528,ss-456 | 1528 | null | null | 1528 | t3somedata
1 | t1somedata | 1528,ss-456 | ss-456 | ss-456 | t2somedata | null | null
2 | t1somedata | 9523 | 9523 | null | null | 9523 | t3somedata
3 | t1somedata | ss-952 | ss-952 | ss-952 | t2somedata | null | null
db<>fiddle here

Try this: if reference_number always stores at most two IDs, you can use the approach below.
SELECT *
FROM(
SELECT ID,
data_column,
CASE WHEN PATINDEX ( '%,%', reference_number) > 0 THEN
SUBSTRING(reference_number, PATINDEX ( '%,%', reference_number)+1, LEN(reference_number))
ELSE reference_number END AS ref_col
FROM #table1
UNION
SELECT ID,
data_column,
CASE WHEN PATINDEX ( '%,%', reference_number) > 0 THEN
SUBSTRING(reference_number, 0, PATINDEX ( '%,%', reference_number))
END
FROM #table1) t1
LEFT JOIN #table2 t2 ON t2.id = t1.ref_col
LEFT JOIN #table3 t3 ON t3.id = t1.ref_col
WHERE t1.ref_col IS NOT NULL
OUTPUT:
ID data_column ref_col ID Data_column ID Data_column
1 some data 1528 NULL NULL 1528 some data
1 some data ss-456 ss-456 some data NULL NULL
2 some data 9523 NULL NULL 9523 some data
3 some data ss-952 ss-952 some data NULL NULL
4 some data null NULL NULL NULL NULL

You can get the desired result using the SUBSTRING and CHARINDEX functions on reference_number.
I upvoted the answer by 'is_oz', since I used his ready-made schema to test and build a query for you.
Below is the final query I built after several tries:
select * from abc
left join abc2 on abc2.id = case when charindex(',',abc.reference_number) > 0
then substring(abc.reference_number
,charindex(',',abc.reference_number)+1
,len(abc.reference_number)-(charindex(',',abc.reference_number)-1)
)
else abc.reference_number
end
left join abc3 on abc3.id = case when charindex(',',abc.reference_number) > 0
then substring(abc.reference_number
,0
,(charindex(',',abc.reference_number))
)
else abc.reference_number
end
As far as I understand your requirement, this returns all the matching rows from the two other tables. I hope it fulfills everything you asked for in your question. :)

Related

Concatenate from rows in SQL server

I want to concatenate from multiple rows
Table:
|id |Attribute |Value |
|--------|------------|---------|
|101 |Manager |Rudolf |
|101 |Account |456 |
|101 |Code |B |
|102 |Manager |Anna |
|102 |Cardno |123 |
|102 |Code |B |
|102 |Code |C |
The result I’m looking for is:
|id |Manager|Account|Cardno|Code |
|--------|-------|-------|------|----------|
|101 |Rudolf |456 | |B |
|102 |Anna | |123 |B,C |
I have the following code from a related question:
select
p.*,
a.value as Manager,
b.value as Account,
c.value as Cardno
from table1 p
left join table2 a on a.id = p.id and a.attribute = 'Manager'
left join table2 b on b.id = p.id and b.attribute = 'Account'
left join table2 c on c.id = p.id and c.attribute = 'Cardno'
However, it fails for the Code attribute with ID# 102, where both B and C values are present.
How can I update this to include both of those values in the same result?
If you are using SQL Server 2017 or later, string_agg() with PIVOT() is both easy to use and much faster (Query#1).
If you are using an older version of SQL Server, use Query#2, which concatenates the values with STUFF() and FOR XML PATH before applying PIVOT().
Schema:
create table table1 (id int, Attribute varchar(50) , Value varchar(50));
insert into table1 values(101 ,'Manager' ,'Rudolf');
insert into table1 values(101 ,'Account' ,'456');
insert into table1 values(101 ,'Code' ,'B');
insert into table1 values(102 ,'Manager' ,'Anna');
insert into table1 values(102 ,'Cardno' ,'123');
insert into table1 values(102 ,'Code' ,'B');
insert into table1 values(102 ,'Code' ,'C');
GO
Query#1 PIVOT() with STRING_AGG():
select *
from
(
select t1.id,t1.attribute,
string_agg(value,',') AS value
from table1 t1
group by t1.id,t1.attribute
) d
pivot
(
max(value)
for attribute in (manager,account,cardno,code)
) piv
Output:
|id |manager|account|cardno|code|
|---|-------|-------|------|----|
|101|Rudolf |456    |null  |B   |
|102|Anna   |null   |123   |B,C |
Query#2 PIVOT() with STUFF() and FOR XML PATH:
select *
from
(
select distinct t1.id,t1.attribute,
STUFF(
(SELECT ', ' + convert(varchar(10), t2.value, 120)
FROM table1 t2
where t1.id = t2.id and t1.attribute=t2.attribute
FOR XML PATH (''))
, 1, 2, '') AS value -- strip the leading ', ' (two characters)
from table1 t1
) d
pivot
(
max(value)
for attribute in (manager,account,cardno,code)
) piv
Output:
|id |manager|account|cardno|code|
|---|-------|-------|------|----|
|101|Rudolf |456    |null  |B   |
|102|Anna   |null   |123   |B, C|
db<>fiddle here
Another method via XML and XQuery.
It is for SQL Server 2008 onwards.
SQL
-- DDL and sample data population, start
DECLARE @tbl TABLE (ID INT, attribute VARCHAR(20), [Value] VARCHAR(30));
INSERT INTO @tbl (ID, attribute, Value) VALUES
(101,'Manager','Rudolf'),
(101,'Account','456'),
(101,'Code','B'),
(102,'Manager','Anna'),
(102,'Cardno','123'),
(102,'Code','B'),
(102,'Code','C');
-- DDL and sample data population, end
;WITH rs AS
(
SELECT ID, (
SELECT *
FROM @tbl AS c
WHERE c.id = p.id
FOR XML PATH('r'), TYPE, ROOT('root')
) AS xmldata
FROM @tbl AS p
GROUP BY id
)
SELECT ID
, COALESCE(xmldata.value('(/root/r[attribute="Manager"]/Value/text())[1]','VARCHAR(30)'),'') AS Manager
, COALESCE(xmldata.value('(/root/r[attribute="Account"]/Value/text())[1]','VARCHAR(30)'),'') AS Account
, COALESCE(xmldata.value('(/root/r[attribute="Cardno"]/Value/text())[1]','VARCHAR(30)'),'') AS Cardno
, COALESCE(REPLACE(xmldata.query('data(/root/r[attribute="Code"]/Value)').value('.', 'VARCHAR(MAX)'), SPACE(1), ','),'') AS Code
FROM rs
ORDER BY ID;
Output
+-----+---------+---------+--------+------+
| ID | Manager | Account | Cardno | Code |
+-----+---------+---------+--------+------+
| 101 | Rudolf | 456 | | B |
| 102 | Anna | | 123 | B,C |
+-----+---------+---------+--------+------+
UPD: STRING_AGG is available only in SQL Server 2017 and later.
You can solve this task using CTE and STRING_AGG function, for example:
declare
@t table (id int, Attribute varchar (100), [Value] varchar (100) )
insert into @t
values
(101, 'Manager', 'Rudolf'),
(101, 'Account', '456'),
(101, 'Code', 'B'),
(102, 'Manager', 'Anna'),
(102, 'Cardno', '123'),
(102, 'Code', 'B'),
(102, 'Code', 'C')
;with cte as
(
select id, Attribute
,STRING_AGG([Value], ', ') WITHIN GROUP (ORDER BY ID ASC) AS [Value]
from @t
group by ID, Attribute
)
select
max(p.ID) ID
,a.Value Manager
,isnull(b.Value, '') Account
,isnull(c.Value, '') Cardno
,isnull(e.Value, '') Code
from cte p
left join cte a on a.id =p.ID and a.attribute = 'Manager'
left join cte b on b.id = p.id and b.attribute = 'Account'
left join cte c on c.id = p.id and c.attribute = 'Cardno'
left join cte e on e.id = p.id and e.attribute = 'Code'
group by p.ID, a.Value,b.Value,c.Value,e.Value
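As a hedged alternative to the four self-joins, the same shape can be produced in one pass with conditional aggregation. This is only a sketch, assuming SQL Server 2017 or later for STRING_AGG; it relies on STRING_AGG skipping NULL inputs, so only 'Code' rows contribute to the list.

```sql
DECLARE @t TABLE (id int, Attribute varchar(100), [Value] varchar(100));
INSERT INTO @t VALUES
(101, 'Manager', 'Rudolf'),
(101, 'Account', '456'),
(101, 'Code', 'B'),
(102, 'Manager', 'Anna'),
(102, 'Cardno', '123'),
(102, 'Code', 'B'),
(102, 'Code', 'C');

SELECT id,
       -- One value per attribute: MAX picks the single non-NULL CASE result.
       ISNULL(MAX(CASE WHEN Attribute = 'Manager' THEN [Value] END), '') AS Manager,
       ISNULL(MAX(CASE WHEN Attribute = 'Account' THEN [Value] END), '') AS Account,
       ISNULL(MAX(CASE WHEN Attribute = 'Cardno'  THEN [Value] END), '') AS Cardno,
       -- Possibly several values: concatenate them; rows for other attributes yield NULL and are skipped.
       ISNULL(STRING_AGG(CASE WHEN Attribute = 'Code' THEN [Value] END, ',')
                  WITHIN GROUP (ORDER BY [Value]), '') AS Code
FROM @t
GROUP BY id
ORDER BY id;
```

With the sample rows this gives one row per id, with 'B,C' in the Code column for id 102.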

SQL server and STUFF with two tables

I'm facing a problem. I have two tables as below.
table 1
+----+------+
| ks | keys |
+----+------+
| 11 | 1122|
+----+------+
| 12 | 2211|
+----+------+
| 13 | 2233|
+----+------+
| 14 | 3322|
+----+------+
table 2
+----+----+-------+
| Id | ks | codes |
+----+----+-------+
| 1  | 11 | aaaaa |
+----+----+-------+
| 2  | 11 | bbbbb |
+----+----+-------+
| 3  | 12 | aaaaa |
+----+----+-------+
| 3  | 13 | ccccc |
+----+----+-------+
| 4  | 12 | bbbbb |
+----+----+-------+
I tried the following query to get my required output, but it did not work:
SELECT ks,
STUFF (
(SELECT ', ' + t2.codes as [text()]
from table2 as t2 where t1.ks = t2.ks FOR XML PATH('')
),1,1,''
) as "codes"
from table1 t1
group by ks;
I get this table as result:
+----+------+
| ks | codes|
+----+------+
| 11 | aaaa |
+----+------+
| 11 | bbbb |
+----+------+
| 12 | cccc |
+----+------+
| 12 | dddd |
+----+------+
The image below shows my required output:
[image: required result]
I did something wrong, but I do not know what it could be. Could someone help me? Thanks!
Try this. I think you posted the wrong output.
Create table #tbl (ks int , codes varchar(10))
Insert into #tbl values
(11 ,'aaaa'),
(12 ,'bbbb'),
(13 ,'cccc'),
(14 ,'dddd')
Create table #tbl2 (id int, ks int , codes varchar(10))
Insert into #tbl2 values
( 1 ,11 ,'aaaaa'),
( 2 ,11 ,'bbbbb'),
( 3 ,12 ,'aaaaa'),
( 3 ,13 ,'ccccc'),
( 4 ,12 ,'bbbbb')
;with cte as -- the preceding statement must be terminated before WITH
(Select t1.ks, t2.codes
from #tbl t1 join #tbl2 t2 on t1.ks = t2.ks)
Select ks, STUFF(
(SELECT ',' + codes FROM cte c1
where c1.ks = c2.ks FOR XML PATH ('')), 1, 1, ''
)
from cte c2
group by ks
Output:
ks
11 aaaaa,bbbbb
12 aaaaa,bbbbb
13 ccccc
I cannot say that I fully understand what is going on in your tables--especially given your output image appears to have no relation to your sample tables--but it looks like you want a comma-delimited list of sub-values from table2 that are associated with table1.
Here's a working example that I think addresses your need. You can use CROSS APPLY in these situations. Doing so allows you to return all values from table1 regardless of a matching record in table2.
DECLARE @table1 TABLE ( [ks] INT, [code] VARCHAR(10) );
DECLARE @table2 TABLE ( [id] INT, [ks] INT, [code] VARCHAR(10) );
-- populate table1 --
INSERT INTO @table1 (
[ks], [code]
)
VALUES
( 11, 'aaaa' )
, ( 12, 'bbbb' )
, ( 13, 'cccc' )
, ( 14, 'dddd' );
-- populate table two --
INSERT INTO @table2 (
[id], [ks], [code]
)
VALUES
( 1, 11, 'aaaaa' )
, ( 2, 11, 'bbbbb' )
, ( 3, 12, 'aaaaa' )
, ( 3, 13, 'ccccc' )
, ( 4, 12, 'bbbbb' );
SELECT
t1.ks, codes.codes
FROM @table1 t1
CROSS APPLY (
SELECT (
STUFF(
( SELECT ', ' + t2.code AS "text()" FROM @table2 t2 WHERE t2.ks = t1.ks FOR XML PATH ( '' ) )
, 1, 2, ''
)
) AS [codes]
) AS codes
ORDER BY
t1.ks;
Resulting Output:
ks codes
11 aaaaa, bbbbb
12 aaaaa, bbbbb
13 ccccc
14 NULL
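On SQL Server 2017 or later, the STUFF / FOR XML PATH pattern above can be replaced by STRING_AGG. The following is a sketch under that version assumption, reusing the same sample data; ordering the list by t2.id is my own choice, not something the question specifies.

```sql
DECLARE @table1 TABLE ( [ks] INT, [code] VARCHAR(10) );
DECLARE @table2 TABLE ( [id] INT, [ks] INT, [code] VARCHAR(10) );

INSERT INTO @table1 ( [ks], [code] )
VALUES ( 11, 'aaaa' ), ( 12, 'bbbb' ), ( 13, 'cccc' ), ( 14, 'dddd' );

INSERT INTO @table2 ( [id], [ks], [code] )
VALUES ( 1, 11, 'aaaaa' ), ( 2, 11, 'bbbbb' ), ( 3, 12, 'aaaaa' ), ( 3, 13, 'ccccc' ), ( 4, 12, 'bbbbb' );

-- LEFT JOIN keeps ks = 14, which has no rows in @table2;
-- STRING_AGG over only NULL inputs returns NULL for that group.
SELECT
    t1.ks,
    STRING_AGG(t2.code, ', ') WITHIN GROUP (ORDER BY t2.id) AS codes
FROM @table1 t1
LEFT JOIN @table2 t2 ON t2.ks = t1.ks
GROUP BY t1.ks
ORDER BY t1.ks;
```

This should return the same four rows as the output above, including the NULL for ks 14.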

Sql Query Join on Comma Separated Value

I have a table that has a composite key and a comma separated value. I need the single row split into one row for each comma separated element. I have seen similar questions and similar answers but have not been able to translate them into a solution for myself.
I'm running SQL Server 2008 R2.
| Key Part 1 | Key Part 2 | Key Part 3 | Values |
|------------------------------------------------------|
| A | A | A | PDE,PPP,POR |
| A | A | B | PDE,XYZ |
| A | B | A | PDE,RRR |
|------------------------------------------------------|
and I need this as output
| Key Part 1 | Key Part 2 | Key Part 3 | Values | Sequence |
|-------------------------------------------------------------------|
| A | A | A | PDE | 0 |
| A | A | A | PPP | 1 |
| A | A | A | POR | 2 |
| A | A | B | PDE | 0 |
| A | A | B | XYZ | 1 |
| A | B | A | PDE | 0 |
| A | B | A | RRR | 1 |
|-------------------------------------------------------------------|
Thanks
Geoff
Here is a simple inline approach if you don't have or want a Split/Parse UDF
Example
Select A.[Key Part 1]
,A.[Key Part 2]
,A.[Key Part 3]
,B.*
From YourTable A
Cross Apply (
Select [Values] = LTrim(RTrim(X2.i.value('(./text())[1]', 'varchar(max)')))
,[Sequence] = Row_Number() over (Order By (Select null))-1
From (Select x = Cast('<x>' + replace(A.[Values],',','</x><x>')+'</x>' as xml)) X1
Cross Apply x.nodes('x') X2(i)
) B
EDIT - If Open to a Table-Valued Function
The Query would Look Like This
Select A.[Key Part 1]
,A.[Key Part 2]
,A.[Key Part 3]
,[Values] = B.RetVal
,[Sequence] = B.RetSeq-1
From #YourTable A
Cross Apply [dbo].[udf-Str-Parse-8K](A.[Values],',') B
The UDF if Interested
CREATE FUNCTION [dbo].[udf-Str-Parse-8K] (@String varchar(max),@Delimiter varchar(25))
Returns Table
As
Return (
with cte1(N) As (Select 1 From (Values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) N(N)),
cte2(N) As (Select Top (IsNull(DataLength(@String),0)) Row_Number() over (Order By (Select NULL)) From (Select N=1 From cte1 a,cte1 b,cte1 c,cte1 d) A ),
cte3(N) As (Select 1 Union All Select t.N+DataLength(@Delimiter) From cte2 t Where Substring(@String,t.N,DataLength(@Delimiter)) = @Delimiter),
cte4(N,L) As (Select S.N,IsNull(NullIf(CharIndex(@Delimiter,@String,s.N),0)-S.N,8000) From cte3 S)
Select RetSeq = Row_Number() over (Order By A.N)
,RetVal = LTrim(RTrim(Substring(@String, A.N, A.L)))
From cte4 A
);
--Original Source http://www.sqlservercentral.com/articles/Tally+Table/72993/
--Select * from [dbo].[udf-Str-Parse-8K]('Dog,Cat,House,Car',',')
--Select * from [dbo].[udf-Str-Parse-8K]('John||Cappelletti||was||here','||')
If all CSV values are exactly 3 characters (as in your test data), you can use a tally table very efficiently by creating the exact number of rows needed up front (as opposed to creating a row for every character to find the delimiter)... because you already know the delimiter locations.
In this case, I'll use a tally function but you can use a fixed tally table as well.
Code for the tfn_Tally function...
SET QUOTED_IDENTIFIER ON
SET ANSI_NULLS ON
GO
CREATE FUNCTION dbo.tfn_Tally
/* ============================================================================
07/20/2017 JL, Created. Capable of creating a sequence of rows
ranging from -10,000,000,000,000,000 to 10,000,000,000,000,000
============================================================================ */
(
@NumOfRows BIGINT,
@StartWith BIGINT
)
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
WITH
cte_n1 (n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)), -- 10 rows
cte_n2 (n) AS (SELECT 1 FROM cte_n1 a CROSS JOIN cte_n1 b), -- 100 rows
cte_n3 (n) AS (SELECT 1 FROM cte_n2 a CROSS JOIN cte_n2 b), -- 10,000 rows
cte_n4 (n) AS (SELECT 1 FROM cte_n3 a CROSS JOIN cte_n3 b), -- 100,000,000 rows
cte_Tally (n) AS (
SELECT TOP (@NumOfRows)
(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1) + @StartWith
FROM
cte_n4 a CROSS JOIN cte_n4 b -- 10,000,000,000,000,000 rows
)
SELECT
t.n
FROM
cte_Tally t;
GO
How to use it in the solution...
-- create some test data...
IF OBJECT_ID('tempdb..#TestData', 'U') IS NOT NULL
DROP TABLE #TestData;
CREATE TABLE #TestData (
KeyPart1 CHAR(1),
KeyPart2 CHAR(1),
KeyPart3 CHAR(1),
[Values] varchar(50)
);
INSERT #TestData (KeyPart1, KeyPart2, KeyPart3, [Values]) VALUES
('A', 'A', 'A', 'PDE,PPP,POR'),
('A', 'A', 'B', 'PDE,XYZ'),
('A', 'B', 'A', 'PDE,RRR,XXX,YYY,ZZZ,AAA,BBB,CCC');
--==========================================================
-- solution query...
SELECT
td.KeyPart1,
td.KeyPart2,
td.KeyPart3,
x.SplitValue,
[Sequence] = t.n
FROM
#TestData td
CROSS APPLY dbo.tfn_Tally(LEN(td.[Values]) - LEN(REPLACE(td.[Values], ',', '')) + 1, 0) t
CROSS APPLY ( VALUES (SUBSTRING(td.[Values], t.n * 4 + 1, 3)) ) x (SplitValue);
And the results...
KeyPart1 KeyPart2 KeyPart3 SplitValue Sequence
-------- -------- -------- ---------- --------------------
A A A PDE 0
A A A PPP 1
A A A POR 2
A A B PDE 0
A A B XYZ 1
A B A PDE 0
A B A RRR 1
A B A XXX 2
A B A YYY 3
A B A ZZZ 4
A B A AAA 5
A B A BBB 6
A B A CCC 7
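The offset arithmetic in the solution query (t.n * 4 + 1) can be checked on a single string. This is a small sketch that swaps the tally function for an inline VALUES list; the variable @csv is introduced only for this illustration.

```sql
-- Each element is 3 characters plus 1 delimiter, so element n starts at n * 4 + 1.
DECLARE @csv varchar(50) = 'PDE,PPP,POR';

SELECT
    n.n AS [Sequence],
    n.n * 4 + 1 AS StartPosition,                      -- 1, 5, 9
    SUBSTRING(@csv, n.n * 4 + 1, 3) AS SplitValue      -- PDE, PPP, POR
FROM (VALUES (0), (1), (2)) AS n (n)
ORDER BY n.n;
```

The number of rows needed is the number of commas plus one, which is what the LEN / REPLACE expression passed to dbo.tfn_Tally computes.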
If the assumption that all of the CSV elements are the same number of characters is incorrect, you'd be better off using a traditional tally-based splitter. In that case, my recommendation is DelimitedSplit8K, written by Jeff Moden.
In that case, the solution query would look like this...
SELECT
td.KeyPart1,
td.KeyPart2,
td.KeyPart3,
SplitValue = dsk.Item,
[Sequence] = dsk.ItemNumber - 1
FROM
#TestData td
CROSS APPLY dbo.DelimitedSplit8K(td.[Values], ',') dsk;
And the result...
KeyPart1 KeyPart2 KeyPart3 SplitValue Sequence
-------- -------- -------- ---------- --------------------
A A A PDE 0
A A A PPP 1
A A A POR 2
A A B PDE 0
A A B XYZ 1
A B A PDE 0
A B A RRR 1
A B A XXX 2
A B A YYY 3
A B A ZZZ 4
A B A AAA 5
A B A BBB 6
A B A CCC 7
HTH, Jason
-- Create Table
Create table YourTable
(
p1 varchar(50),
p2 varchar(50),
p3 varchar(50),
pval varchar(50)
)
go
-- Insert Data
insert into YourTable values ('A','A','A','PDE,PPP,POR'),
('A','A','B','PDE,XYZ'),('A','B','A','PDE,RRR')
go
-- View Sample Data
SELECT p1, p2, p3 , pval FROM YourTable
go
-- Required Result
SELECT p1,p2,p3, LTRIM(RTRIM(Split.a.value('.', 'VARCHAR(100)'))) as Value1 , ROW_NUMBER() OVER(PARTITION BY id ORDER BY id ASC)-1 AS SequenceNo
FROM
(SELECT ROW_NUMBER() over (order by (SELECT NULL)) AS ID, p1,p2,p3, pval, CAST ('<M>' + REPLACE(pval, ',', '</M><M>') + '</M>' AS XML) AS Data from YourTable
) AS A
CROSS APPLY Data.nodes ('/M') AS Split(a)
go
-- Remove Temp created table
drop table YourTable
go
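For readers on a much newer version than the asker's 2008 R2: as far as I know, SQL Server 2022 and Azure SQL Database accept a third enable_ordinal argument to STRING_SPLIT, which returns each element's 1-based position. A sketch under that version assumption, using a table variable in place of YourTable:

```sql
DECLARE @YourTable TABLE (p1 varchar(50), p2 varchar(50), p3 varchar(50), pval varchar(50));

INSERT INTO @YourTable VALUES
('A','A','A','PDE,PPP,POR'),
('A','A','B','PDE,XYZ'),
('A','B','A','PDE,RRR');

-- The third argument (1) asks STRING_SPLIT to also return the ordinal column.
SELECT t.p1, t.p2, t.p3,
       s.value       AS Value1,
       s.ordinal - 1 AS SequenceNo   -- zero-based, as in the requested output
FROM @YourTable t
CROSS APPLY STRING_SPLIT(t.pval, ',', 1) s
ORDER BY t.p1, t.p2, t.p3, s.ordinal;
```

Unlike ROW_NUMBER() over an arbitrary order, the ordinal column is tied to the element's position in the source string.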

How to select all records from first table and only first matching record from second table?

From these two tables
FirstTable
Number|
1 |
2 |
3 |
4 |
SecondTable
Id | Number | Column2 | Column3
--------------------------------
1 | 1 | text1 | text11
2 | 2 | text2 | text12
3 | 3 | text3 | text13
4 | 3 | text4 | text14
5 | 2 | text5 | text15
How to select all records from first table and only first matching record from second table with nulls?
Result should be like this:
Result
Number | Column2 | Column3
--------------------------
1 | text1 | text11
2 | text2 | text12
3 | text3 | text13
4 | null | null
I tried:
SELECT FT.Number, ST.Column2, ST.Column3
FROM FirstTable FT LEFT JOIN
SecondTable ST ON FT.Number =
(
SELECT TOP 1 S2.Number FROM SecondTable S2 WHERE S2.Number = FT.Number
)
or
SELECT min(FT.Number), ST.Column2, ST.Column3
FROM FirstTable FT LEFT JOIN
SecondTable ST ON FT.Number = ST.Number
GROUP BY ST.Column2, ST.Column3
You can do this with Row_Number() in a sub-query like this:
SELECT T1.Number, T2.Column2, T2.Column3
FROM FirstTable T1
LEFT JOIN ( SELECT ID, NUMBER, Column2, Column3,
ROW_NUMBER() OVER (PARTITION BY Number ORDER BY ID ASC) as NumOrder
FROM SecondTable
) T2 ON T1.Number = T2.Number AND T2.NumOrder = 1
If you run just the sub-query you will see how this works -- it "flags" the rows of interest by having a value of 1. Then a simple join works.
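To make that concrete, here is the sub-query on its own against the question's sample rows. This is a sketch that uses a table variable in place of SecondTable.

```sql
DECLARE @SecondTable TABLE (Id int, Number int, Column2 varchar(31), Column3 varchar(31));

INSERT INTO @SecondTable VALUES
(1, 1, 'text1', 'text11'),
(2, 2, 'text2', 'text12'),
(3, 3, 'text3', 'text13'),
(4, 3, 'text4', 'text14'),
(5, 2, 'text5', 'text15');

-- NumOrder restarts at 1 for every distinct Number, counting up by Id.
SELECT Id, Number, Column2, Column3,
       ROW_NUMBER() OVER (PARTITION BY Number ORDER BY Id ASC) AS NumOrder
FROM @SecondTable
ORDER BY Number, Id;
```

Ids 1, 2 and 3 get NumOrder = 1 (the first row for each Number), while Ids 5 and 4 get NumOrder = 2, so the join condition T2.NumOrder = 1 discards them.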
You can do this variation of your first attempt:
SELECT FT.Number, ST.Column2, ST.Column3
FROM FirstTable FT LEFT JOIN
SecondTable ST ON ST.Id =
(
SELECT TOP 1 S2.Id FROM SecondTable S2 WHERE S2.Number = FT.Number
ORDER BY S2.Id
)
EDIT:
I ran the following script as a test:
DECLARE @FirstTable TABLE (
[Number] int
);
DECLARE @SecondTable TABLE (
Id int IDENTITY(1,1)
, Number int
, Column2 varchar(31)
, Column3 varchar(31)
)
INSERT INTO @FirstTable (Number) VALUES (1), (2), (3), (4);
INSERT INTO @SecondTable (Number, Column2, Column3) VALUES
(1, 'text1', 'text11')
, (2, 'text2', 'text12')
, (3, 'text3', 'text13')
, (3, 'text4', 'text14')
, (2, 'text5', 'text15')
;
SELECT FT.Number, ST.Column2, ST.Column3
FROM @FirstTable FT LEFT JOIN
@SecondTable ST ON ST.Id =
(
SELECT TOP 1 S2.Id FROM @SecondTable S2 WHERE S2.Number = FT.Number
ORDER BY S2.Id
);
And I got the following results:
Number Column2 Column3
1 text1 text11
2 text2 text12
3 text3 text13
4 NULL NULL
Which is exactly your desired result. If you are getting "too many rows" you must have made a mistake in the implementation.

MS SQL Set Group ID Without Looping

I would like to create a query in MS-SQL that produces a column containing an incrementing group number.
This is how I want my data to return:
Column 1 | Column 2 | Column 3
------------------------------
I | 1 | 1
O | 2 | 2
O | 2 | 3
I | 3 | 4
O | 4 | 5
O | 4 | 6
O | 4 | 7
O | 4 | 8
I | 5 | 9
O | 6 | 10
Column 1 is the I and O meaning In and Out.
Column 2 is the row Group (this should increment when Column 1 changes).
Column 3 is the Row-number.
So how can I write my query so that Column 2 increments every time Column 1 changes?
Firstly, to perform this kind of operation you need some column that can identify the order of the rows. If you have a column that determines this order, an identity column for example, it can be used to do something like this:
Runnable sample:
CREATE TABLE #Groups
(
id INT IDENTITY(1, 1) , -- added identity to provide order
Column1 VARCHAR(1)
)
INSERT INTO #Groups
( Column1 )
VALUES ( 'I' ),
( 'O' ),
( 'O' ),
( 'I' ),
( 'O' ),
( 'O' ),
( 'O' ),
( 'O' ),
( 'I' ),
( 'O' );
;
WITH cte
AS ( SELECT id ,
Column1 ,
1 AS Column2
FROM #Groups
WHERE id = 1
UNION ALL
SELECT g.id ,
g.Column1 ,
CASE WHEN g.Column1 = cte.Column1 THEN cte.Column2
ELSE cte.Column2 + 1
END AS Column2
FROM #Groups g
INNER JOIN cte ON cte.id + 1 = g.id
)
SELECT *
FROM cte
OPTION (MAXRECURSION 0) -- required to allow for more than 100 recursions
DROP TABLE #Groups
This code effectively loops through the records, comparing each row to the previous one and incrementing the value of Column2 whenever the value in Column1 changes.
If you don't have an identity column, then you might consider adding one.
Credit @AeroX:
With 30K records, the last line, OPTION (MAXRECURSION 0), is required to override the default limit of 100 recursions when using a recursive Common Table Expression (CTE). Setting it to 0 removes the limit.
This will work if you have SQL Server 2012 or later:
DECLARE @t table(col1 char(1), col3 int identity(1,1))
INSERT @t values
('I'), ('O'), ('O'), ('I'), ('O'), ('O'), ('O'), ('O'), ('I'), ('O')
;WITH CTE AS
(
SELECT
case when lag(col1) over (order by col3) = col1
then 0 else 1 end increase,
col1,
col3
FROM @t
)
SELECT
col1,
sum(increase) over (order by col3) col2,
col3
FROM CTE
Result:
col1 col2 col3
I 1 1
O 2 2
O 2 3
I 3 4
O 4 5
O 4 6
O 4 7
O 4 8
I 5 9
O 6 10
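The running sum works because the CTE marks every row where the value changes. The sketch below exposes that intermediate flag on the first four rows only; the previous_col1 column is added purely for illustration.

```sql
DECLARE @t TABLE (col1 char(1), col3 int IDENTITY(1,1));
INSERT @t VALUES ('I'), ('O'), ('O'), ('I');

SELECT
    col3,
    col1,
    LAG(col1) OVER (ORDER BY col3) AS previous_col1,   -- NULL on the first row
    -- On the first row the comparison with NULL is not true, so the ELSE branch yields 1.
    CASE WHEN LAG(col1) OVER (ORDER BY col3) = col1 THEN 0 ELSE 1 END AS increase
FROM @t
ORDER BY col3;
```

The increase column comes out as 1, 1, 0, 1, and summing it in col3 order gives the group numbers 1, 2, 2, 3.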