I have a stored procedure which currently selects from Users (U) where U has a booking that matches some criteria (IsPaid = 1 and MonthNo matches the value passed to the stored procedure).
--Users (U)--
UserID  Name
1       John
2       Bill
3       Tom
--Bookings (B)--
BookingID  UserID  MonthNo  IsPaid
5          1       2        1
6          1       3        1
7          1       4        0
8          2       2        1
9          2       3        1
10         2       4        1
11         3       4        1
ALTER PROCEDURE FindUsers...
@MonthNo
AS
...
WHERE B.IsPaid = 1 AND B.MonthNo = @MonthNo
So currently, if @MonthNo = 3, users 1 and 2 are returned, since each has a paid booking in month 3.
I now need to pass a list to the procedure and return only the users whose bookings match the whole range, i.e.:
ALTER PROCEDURE FindUsers...
@MonthNoCsv
AS
...
-- split month numbers somehow, then check whether all matching `Booking` rows match.
Pseudo code....
WHERE ALL(B.IsPaid = 1 AND B.MonthNo IN(CsvSplit(@MonthNoCsv)))
So if @MonthNoCsv is '2,3,4', only user 2 is returned, because they have paid bookings in months 2, 3 and 4.
Is this possible in SQL or would it be best to do secondary processing in the consuming application?
Everyone should have a good splitter. There are many options available. I did, however, supply the one I used.
The following lists two options: the first is an inline version, and the second uses a parse function. Both return the same results.
Option 1 - Without a Parse Function
Declare @MonthNoCsv varchar(50) = '2,3,4'
;with cte as (
Select RetSeq = Row_Number() over (Order By (Select null))
,RetVal = LTrim(RTrim(B.i.value('(./text())[1]', 'varchar(max)')))
From (Select x = Cast('<x>' + replace((Select replace(@MonthNoCsv,',','§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml).query('.')) as A
Cross Apply x.nodes('x') AS B(i)
)
Select U.*
from cte M
Join Bookings B on M.RetVal=B.MonthNo and B.IsPaid=1
Join Users U on U.UserID=B.UserID
Group By U.UserID,U.Name
Having Count(Distinct B.MonthNo)=(Select max(RetSeq) from cte)
Option 2 - With a Parse Function
Declare @MonthNoCsv varchar(50) = '2,3,4'
;with cte as (
Select * from [dbo].[udf-Str-Parse](@MonthNoCsv,',')
)
Select U.*
from cte M
Join Bookings B on M.RetVal=B.MonthNo and B.IsPaid=1
Join Users U on U.UserID=B.UserID
Group By U.UserID,U.Name
Having Count(Distinct B.MonthNo)=(Select max(RetSeq) from cte)
Both Return
UserID Name
2 Bill
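On the question of doing this in the consuming application instead: the same "every requested month has a paid booking" check could look roughly like the sketch below. It is Python rather than T-SQL, the tuples simply mirror the sample Bookings rows, and `users_with_all_months` is a made-up helper name, so treat it as an illustration of the logic rather than a drop-in replacement.

```python
from collections import defaultdict

# Sample rows copied from the Bookings table: (BookingID, UserID, MonthNo, IsPaid)
BOOKINGS = [
    (5, 1, 2, 1), (6, 1, 3, 1), (7, 1, 4, 0),
    (8, 2, 2, 1), (9, 2, 3, 1), (10, 2, 4, 1),
    (11, 3, 4, 1),
]

def users_with_all_months(bookings, month_csv):
    """Return sorted UserIDs that have a paid booking in every listed month."""
    wanted = {int(m) for m in month_csv.split(",") if m.strip()}
    paid = defaultdict(set)
    for _booking_id, user_id, month_no, is_paid in bookings:
        # Same filter as the join: paid bookings in one of the requested months
        if is_paid == 1 and month_no in wanted:
            paid[user_id].add(month_no)
    # Same idea as HAVING COUNT(DISTINCT MonthNo) = number of requested months
    return sorted(u for u, months in paid.items() if months == wanted)

print(users_with_all_months(BOOKINGS, "2,3,4"))
```

Doing it in SQL keeps the filtering next to the data; the application-side version only makes sense if the bookings are already being loaded anyway.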
The UDF if Interested
CREATE FUNCTION [dbo].[udf-Str-Parse] (@String varchar(max),@Delimiter varchar(10))
Returns Table
As
Return (
Select RetSeq = Row_Number() over (Order By (Select null))
,RetVal = LTrim(RTrim(B.i.value('(./text())[1]', 'varchar(max)')))
From (Select x = Cast('<x>' + replace((Select replace(@String,@Delimiter,'§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml).query('.')) as A
Cross Apply x.nodes('x') AS B(i)
);
--Thanks Shnugo for making this XML safe
--Select * from [dbo].[udf-Str-Parse]('Dog,Cat,House,Car',',')
--Select * from [dbo].[udf-Str-Parse]('John Cappelletti was here',' ')
--Select * from [dbo].[udf-Str-Parse]('this,is,<test>,for,< & >',',')
I have 2 strings
Declare @WhenDetails NVarchar(Max) ='07:00:0:0;1:00:1:0;6:00:1:0;10:00:1:0;'
Declare @Dosage NVarchar(Max) ='1.00;2.00;1.00;1.00'
I need to split these two strings and insert the pieces into a table, paired by position.
Example: 07:00:0:0 => 1.00, 1:00:1:0 => 2.00
Declare @TempDosageWhenDetails Table (RowID INT IDENTITY(1,1), PatientMedicationID INT, Dosage NVARCHAR(Max),WhenDetails NVARCHAR(Max))
insert @TempDosageWhenDetails(WhenDetails)
select x.items
from dbo.Split('07:00:0:0;1:00:1:0;6:00:1:0;10:00:1:0;', ';') x
I have created a table variable, split the string, and inserted my when details.
How can I fill the Dosage column as shown in the example?
Note: I might have any number of records to split; the values above are just an example.
Perhaps with a little JSON (assuming SQL Server 2016+)
Example
Declare @WhenDetails NVarchar(Max) ='07:00:0:0;1:00:1:0;6:00:1:0;10:00:1:0;'
Declare @Dosage NVarchar(Max) ='1.00;2.00;1.00;1.00'
Select RowID = A.[Key]+1
,PatientID = null
,Dosage = B.[Value]
,WhenDetails = A.[Value]
From (
Select *
From OpenJSON( '["'+replace(@WhenDetails,';','","')+'"]' )
) A
Join (
Select *
From OpenJSON( '["'+replace(@Dosage,';','","')+'"]' )
) B
on A.[Key]=B.[Key]
Returns
RowID PatientID Dosage WhenDetails
1 NULL 1.00 07:00:0:0
2 NULL 2.00 1:00:1:0
3 NULL 1.00 6:00:1:0
4 NULL 1.00 10:00:1:0
If it helps with the Visualization:
We convert the strings into a JSON array, then it is a small matter to join the results based on the KEY
If you were to
Select * From OpenJSON( '["'+replace(@WhenDetails,';','","')+'"]' )
The results would be
key value type
0 07:00:0:0 1
1 1:00:1:0 1
2 6:00:1:0 1
3 10:00:1:0 1
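The same pair-by-position idea, sketched in Python purely as an illustration (`pair_by_position` is a hypothetical helper, not something from the answer). It also makes one detail visible: the trailing `;` in the first string produces an extra empty element, which drops out of the pairing because the second string has nothing at that position.

```python
def pair_by_position(when_details, dosage, sep=";"):
    """Split both strings and pair the pieces by their position."""
    when_parts = when_details.split(sep)   # a trailing ';' yields a final '' element
    dose_parts = dosage.split(sep)
    # zip stops at the shorter list, much like the inner join on [Key]
    return [
        (key + 1, dose, when)               # RowID = [Key] + 1
        for key, (when, dose) in enumerate(zip(when_parts, dose_parts))
    ]

rows = pair_by_position("07:00:0:0;1:00:1:0;6:00:1:0;10:00:1:0;",
                        "1.00;2.00;1.00;1.00")
for row in rows:
    print(row)
```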
EDIT - XML Approach
Select RowID = A.RetSeq
,PatientID = null
,Dosage = B.RetVal
,WhenDetails = A.RetVal
From (
Select RetSeq = row_number() over (order by 1/0)
,RetVal = ltrim(rtrim(B.i.value('(./text())[1]', 'varchar(max)')))
From (Select x = Cast('<x>' + replace((Select replace(@WhenDetails,';','§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml).query('.')) as A
Cross Apply x.nodes('x') AS B(i)
) A
Join (
Select RetSeq = row_number() over (order by 1/0)
,RetVal = ltrim(rtrim(B.i.value('(./text())[1]', 'varchar(max)')))
From (Select x = Cast('<x>' + replace((Select replace(@Dosage,';','§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml).query('.')) as A
Cross Apply x.nodes('x') AS B(i)
) B
on A.RetSeq=B.RetSeq
I am using SQL Server 2008 R2 / 2014. I wish to find a SQL query that can do the following:
Rules:
Each [Group] must contain the digits 1 to 6 across its [Number] values to be a complete group.
[Name] in each [Group] must be unique.
Each row can be used only once.
Table before sorting is...
Name Number Group
---- ------ -----
A 1
B 6
A 123
C 3
B 4
C 23
D 45
D 4
C 56
A 12
D 56
After sorting, result I want is below or similar....
Name Number Group
---- ------ -----
A 1 1
C 23 1
D 45 1
B 6 1
A 123 2
D 4 2
C 56 2
A 12 3
C 3 3
B 4 3
D 56 3
What I tried before was to find a subgroup whose [Number] values together consist of 1-6, using the concatenation method below...
SELECT *
FROM [Table1] ST2
WHERE
SUBSTRING((SELECT ST1.[Number] AS [text()]
FROM [Table1] ST1
-- WHERE ST1.[Group] = ST2.[Group]
ORDER BY LEFT(ST1.[Number],1)
FOR XML PATH ('')), 1, 1000) = '123456'
Maybe you should check the ROW_NUMBER function.
select Name
, Number
, ROW_NUMBER () OVER(PARTITION BY Name ORDER BY Number) as [Group]
from [Table1]
If you have more than 6 rows with the same NAME value, this will return more groups. You can filter the additional groups out, since you are only interested in 6 groups with unique values in the NAME column.
I'm not sure if this can be done more simply or not, but here's my go at it...
Advance warning: this requires some means of splitting strings. Since you're not on 2016, I've included a function at the beginning of the script.
The bulk of the work is a recursive CTE that builds the Name and Number columns into comma delimited groups. We then reduce our working set to only the groups where the numbers would create 123456, split the groups and use ROW_NUMBER() OVER... to identify them, and then select based on the new data.
Demo: http://rextester.com/NEXG53500
CREATE FUNCTION [dbo].[SplitStrings]
(
@List NVARCHAR(MAX),
@Delimiter NVARCHAR(255)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
(
SELECT Item = y.i.value('(./text())[1]', 'nvarchar(4000)')
FROM
(
SELECT x = CONVERT(XML, '<i>'
+ REPLACE(@List, @Delimiter, '</i><i>')
+ '</i>').query('.')
) AS a CROSS APPLY x.nodes('i') AS y(i)
);
GO
CREATE TABLE #temp
(
name VARCHAR(MAX),
number INT
)
INSERT INTO #temp
VALUES
('a',1),
('b',6),
('a',123),
('c',3),
('b',4),
('c',23),
('d',45),
('d',4),
('c',56),
('a',12),
('d',56);
/*** Recursively build groups based on information from #temp ***/
WITH groupFinder AS
(
SELECT CAST(name AS VARCHAR(MAX)) AS [groupNames], CAST(number AS VARCHAR(max)) AS [groupNumbers] FROM #temp
UNION ALL
SELECT
cast(CONCAT(t.[Name],',',g.[groupNames]) as VARCHAR(MAX)),
CAST(CONCAT(CAST(t.[Number] AS VARCHAR(max)),',',CAST(g.[groupNumbers] AS VARCHAR(max))) AS VARCHAR(max))
FROM #temp t
JOIN groupFinder g
ON
g.groupNames NOT LIKE '%' + t.name+'%'
AND g.[groupNumbers] NOT LIKE '%' + CAST(t.number/100 AS VARCHAR(10)) +'%'
AND g.[groupNumbers] NOT LIKE '%' + CAST(t.number/10 AS VARCHAR(10)) +'%'
AND g.[groupNumbers] NOT LIKE '%' + CAST(t.number%10 AS VARCHAR(10)) +'%'
)
/*** only get groups where the numbers form 123456 ***/
, groupPruner AS
(
SELECT *, ROW_NUMBER() OVER (ORDER BY [groupNames]) AS [rn] FROM groupFinder WHERE REPLACE([groupNumbers],',','') = '123456'
)
/*** split the name group and give it identifiers ***/
, nameIdentifier AS
(
SELECT g.*, c1.[item] AS [Name], ROW_NUMBER() OVER (PARTITION BY [rn] ORDER BY (SELECT NULL)) AS [rn1]
FROM groupPruner g
CROSS APPLY splitstrings(g.groupnames,',') c1
)
/*** split the number group and give it identifiers ***/
, numberIdentifier AS
(
SELECT g.*, c1.[item] AS [Number], ROW_NUMBER() OVER (PARTITION BY [rn], [rn1] ORDER BY (SELECT NULL)) AS [rn2]
FROM nameIdentifier g
CROSS APPLY splitstrings(g.groupNumbers,',') c1
)
SELECT [Name], [Number], [rn] AS [Group]
--,groupnames, groupNumbers /*uncomment this line to see the groups that were built*/
FROM numberIdentifier
WHERE rn1 = rn2
ORDER BY rn, rn1
DROP TABLE #temp
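To make the grouping rules easier to follow, here is a small Python sketch of the same problem. Note that it uses a different technique from the recursive CTE above: a plain backtracking search that also enforces the "each row only once" rule by partitioning the rows. The function names (`complete_groups`, `partition`) are invented for this sketch, and it assumes every digit within a single [Number] is distinct, as in the sample data.

```python
ROWS = [("A", 1), ("B", 6), ("A", 123), ("C", 3), ("B", 4), ("C", 23),
        ("D", 45), ("D", 4), ("C", 56), ("A", 12), ("D", 56)]
FULL = set("123456")

def complete_groups(rows, start, digits, names, chosen):
    """Yield lists of row indexes that extend `chosen` to cover digits 1-6."""
    if digits == FULL:
        yield list(chosen)
        return
    for i in range(start, len(rows)):
        name, number = rows[i]
        row_digits = set(str(number))
        # Skip rows that repeat a name or a digit already in the group
        if name in names or row_digits & digits:
            continue
        chosen.append(i)
        yield from complete_groups(rows, i + 1, digits | row_digits,
                                   names | {name}, chosen)
        chosen.pop()

def partition(rows):
    """Return groups that use every row exactly once, or None if impossible."""
    if not rows:
        return []
    first_name, first_number = rows[0]
    for idxs in complete_groups(rows, 1, set(str(first_number)),
                                {first_name}, [0]):
        rest = [r for i, r in enumerate(rows) if i not in idxs]
        tail = partition(rest)
        if tail is not None:
            return [[rows[i] for i in idxs]] + tail
    return None

for group_no, group in enumerate(partition(ROWS) or [], start=1):
    print(group_no, group)
```

The search is exponential in the worst case, so this is only meant for small inputs like the sample.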
Is there any way to do this in less time? I am taking the summary column from my case table and splitting the data word by word into my words table using the following loop:
Example case table
CaseID | CaseNumber | Summary
1 111111 This is a summary
2 111112 This is Summary 2
DECLARE
@n int = 1
;
WHILE @n <= 1000
BEGIN
INSERT INTO words (caseID, caseNumber, pn, word)
SELECT caseID, caseNumber, pn, word FROM dbo.Split6(' ', (select summary
from
cases where caseID = @n)) where caseID = @n group by caseID,caseNumber, pn,
word
option (maxrecursion 0)
SET @n = @n+1;
END
GO
It works, but it is slow. Took 3 hours to break down 1000 cases. I have 100,000 cases. Is there a way I can do this more efficiently? Here is the split function I'm using:
Split6 function:
CREATE FUNCTION [dbo].[Split6] (
@sep CHAR(1)
,@s nVARCHAR(4000)
)
RETURNS TABLE
AS
RETURN (
WITH Pieces(caseID,caseNumber, pn, start, stop) AS (
SELECT cs.caseID
,cs.caseNumber
,1
,1
,CHARINDEX(@sep, @s)
FROM cases cs
UNION ALL
SELECT caseID
,caseNumber
,pn + 1
,stop + 1
,CHARINDEX(@sep, @s, stop + 1)
FROM Pieces
WHERE stop > 0
)
SELECT caseID
,caseNumber
,pn
,SUBSTRING(@s, start, CASE
WHEN stop > 0
THEN stop - start
ELSE 512
END) AS word
FROM Pieces
)
GO
You should avoid loops whenever possible.
The following uses a Parse/Split function in concert with a Cross Apply (use Outer Apply to show null values).
As far as performance goes... using a test sample of 100,000 records with an average of 5 words each, the execution time is 2.2 seconds.
Example
Declare @YourTable Table ([CaseID] varchar(50),[CaseNumber] varchar(50),[Summary] varchar(50))
Insert Into @YourTable Values
(1,111111,'This is a summary')
,(2,111112,'This is Summary 2')
Select A.CaseID
,A.CaseNumber
,B.*
From @YourTable A
Cross Apply [dbo].[udf-Str-Parse](A.Summary,' ') B
Returns
CaseID CaseNumber RetSeq RetVal
1 111111 1 This
1 111111 2 is
1 111111 3 a
1 111111 4 summary
2 111112 1 This
2 111112 2 is
2 111112 3 Summary
2 111112 4 2
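As a rough picture of what the Cross Apply produces, here is the same "split every row in one pass" idea written in Python. The tuples mirror the two sample cases and `explode_words` is a hypothetical name; it only illustrates the shape of the result and says nothing about SQL Server performance.

```python
# Sample rows mirroring the example case table: (CaseID, CaseNumber, Summary)
CASES = [
    (1, 111111, "This is a summary"),
    (2, 111112, "This is Summary 2"),
]

def explode_words(cases, sep=" "):
    """Return (CaseID, CaseNumber, RetSeq, RetVal) for every word of every case."""
    return [
        (case_id, case_number, seq, word.strip())
        for case_id, case_number, summary in cases        # one pass over the rows
        for seq, word in enumerate(summary.split(sep), start=1)
    ]

for row in explode_words(CASES):
    print(row)
```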
The UDF if Interested
CREATE FUNCTION [dbo].[udf-Str-Parse] (@String varchar(max),@Delimiter varchar(10))
Returns Table
As
Return (
Select RetSeq = Row_Number() over (Order By (Select null))
,RetVal = LTrim(RTrim(B.i.value('(./text())[1]', 'varchar(max)')))
From (Select x = Cast('<x>' + replace((Select replace(@String,@Delimiter,'§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml).query('.')) as A
Cross Apply x.nodes('x') AS B(i)
);
--Thanks Shnugo for making this XML safe
--Select * from [dbo].[udf-Str-Parse]('Dog,Cat,House,Car',',')
--Select * from [dbo].[udf-Str-Parse]('John Cappelletti was here',' ')
--Select * from [dbo].[udf-Str-Parse]('this,is,<test>,for,< & >',',')
EDIT - Another Parse/Split Function
The following TVF is slightly faster than the XML version, but limited to 8K. For example, on 5,000 sample records, with an average of 36 "words", it was 20ms faster than the XML version.
CREATE FUNCTION [dbo].[udf-Str-Parse-8K] (@String varchar(max),@Delimiter varchar(25))
Returns Table
As
Return (
with cte1(N) As (Select 1 From (Values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) N(N)),
cte2(N) As (Select Top (IsNull(DataLength(@String),0)) Row_Number() over (Order By (Select NULL)) From (Select N=1 From cte1 a,cte1 b,cte1 c,cte1 d) A ),
cte3(N) As (Select 1 Union All Select t.N+DataLength(@Delimiter) From cte2 t Where Substring(@String,t.N,DataLength(@Delimiter)) = @Delimiter),
cte4(N,L) As (Select S.N,IsNull(NullIf(CharIndex(@Delimiter,@String,s.N),0)-S.N,8000) From cte3 S)
Select RetSeq = Row_Number() over (Order By A.N)
,RetVal = LTrim(RTrim(Substring(@String, A.N, A.L)))
From cte4 A
);
--Original Source http://www.sqlservercentral.com/articles/Tally+Table/72993/
--Select * from [dbo].[udf-Str-Parse-8K]('Dog,Cat,House,Car',',')
--Select * from [dbo].[udf-Str-Parse-8K]('John||Cappelletti||was||here','||')
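For anyone trying to follow what the tally-style CTEs are doing, the steps can be sketched in Python roughly as below. `tally_split` is an invented name and the comments map each step to the CTE it imitates; it is a reading aid under those assumptions, not a port (for one thing, Python's `strip()` trims all whitespace, while LTrim/RTrim only trim spaces).

```python
def tally_split(text, delimiter):
    """Split `text` following the steps of the tally-table TVF (1-based positions)."""
    d = len(delimiter)
    # cte2 + cte3: an element starts at position 1 and right after each delimiter
    starts = [1] + [n + d for n in range(1, len(text) + 1)
                    if text[n - 1:n - 1 + d] == delimiter]
    result = []
    for seq, start in enumerate(starts, start=1):
        # cte4: the element runs up to the next delimiter, or to the end of the string
        nxt = text.find(delimiter, start - 1)   # 0-based index, -1 when absent
        end = len(text) if nxt == -1 else nxt
        result.append((seq, text[start - 1:end].strip()))
    return result

print(tally_split("Dog,Cat,House,Car", ","))
print(tally_split("John||Cappelletti||was||here", "||"))
```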
I've inherited a database of user profile information which has a column for personal interests. Multiple interests are separated by a pipe (|). In a SQL query, how can I split a field with this value: 2|27|33|14|15
To look like this:
2
27
33
14
15
The exact syntax depends on which DBMS you are using. Assuming you are using MSSQL (2016 or later), this is the general syntax:
STRING_SPLIT ( string , separator )
For example
DECLARE @string_to_be_split NVARCHAR(400) = '2|27|33|14|15'
SELECT value
FROM STRING_SPLIT(@string_to_be_split, '|')
WHERE RTRIM(value) <> '';
Edit - Could have sworn that I saw SQL Server
If not 2016, just about any Split/Parse Function will do.
Option 1 - With UDF
Declare @YourTable table (ID int,Interests varchar(250))
Insert Into @YourTable values
(1,'2|27|33|14|15')
Select A.ID
,B.*
From @YourTable A
Cross Apply [dbo].[udf-Str-Parse](A.Interests,'|') B
Option 2 - Without a UDF
Select A.ID
,B.*
From @YourTable A
Cross Apply (
Select RetSeq = Row_Number() over (Order By (Select null))
,RetVal = LTrim(RTrim(B.i.value('(./text())[1]', 'varchar(max)')))
From (Select x = Cast('<x>' + replace((Select replace(A.Interests,'|','§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml).query('.')) as X
Cross Apply x.nodes('x') AS B(i)
) B
Both Return
ID RetSeq RetVal
1 1 2
1 2 27
1 3 33
1 4 14
1 5 15
The UDF if Interested
CREATE FUNCTION [dbo].[udf-Str-Parse] (@String varchar(max),@Delimiter varchar(10))
Returns Table
As
Return (
Select RetSeq = Row_Number() over (Order By (Select null))
,RetVal = LTrim(RTrim(B.i.value('(./text())[1]', 'varchar(max)')))
From (Select x = Cast('<x>' + replace((Select replace(@String,@Delimiter,'§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml).query('.')) as X
Cross Apply x.nodes('x') AS B(i)
);
--Thanks Shnugo for making this XML safe
--Select * from [dbo].[udf-Str-Parse]('Dog,Cat,House,Car',',')
--Select * from [dbo].[udf-Str-Parse]('John Cappelletti was here',' ')
--Select * from [dbo].[udf-Str-Parse]('this,is,<test>,for,< & >',',')
It inserts properly; however, it sorts the ids while inserting.
If I execute the stored procedure with the parameters username = dynamic and id = 19,1,10,
then when I check the Favorites table I see:
INSERT INTO Favorites(username, id)
SELECT @username, i.item
FROM fnSplit(@id, ',') i
INNER JOIN dbo.Link f on f.id = i.item
WHERE id IS NOT NULL
More information about split function:
https://msdn.microsoft.com/en-us/library/mt684588.aspx
NOTE: I am using a different name for the function but it is the same thing
I believe your inner join is changing the order. Since you are only using it for filtering, you can change the inner join into a where exists. This should preserve the order:
INSERT INTO Favorites( username, id )
SELECT @username, i.item
FROM fnSplit(@id, ',') i
WHERE EXISTS
(
SELECT 1
FROM dbo.Link f
WHERE f.id = i.item AND f.id IS NOT NULL
)
Example
Declare @username varchar(50) = 'dynamic'
Declare @favorite varchar(50) = '19,1,10'
Insert Into Favorites (username,id)
Select @username,f.ID
From [dbo].[udf-Str-Parse](@favorite,',') i
Join dbo.Link f on f.id = i.RetVal
Where f.ID is not null
Order By RetSeq -- << Notice we added an Order By
If it helps with the visualization:
Select * From [dbo].[udf-Str-Parse]('19,1,10',',')
Returns
RetSeq RetVal
1 19
2 1
3 10
The TVF which will supply a Sequence (RetSeq)
CREATE FUNCTION [dbo].[udf-Str-Parse] (@String varchar(max),@Delimiter varchar(10))
Returns Table
As
Return (
Select RetSeq = Row_Number() over (Order By (Select null))
,RetVal = LTrim(RTrim(B.i.value('(./text())[1]', 'varchar(max)')))
From (Select x = Cast('<x>' + replace((Select replace(@String,@Delimiter,'§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml).query('.')) as A
Cross Apply x.nodes('x') AS B(i)
);
--Thanks Shnugo for making this XML safe
--Select * from [dbo].[udf-Str-Parse]('Dog,Cat,House,Car',',')
--Select * from [dbo].[udf-Str-Parse]('John Cappelletti was here',' ')
--Select * from [dbo].[udf-Str-Parse]('this,is,<test>,for,< & >',',')
--Performance On a 5,000 random sample -8K 77.8ms, -1M 79ms (+1.16), -- 91.66ms (+13.8)