SQL Server 2014 : Convert two comma separated string into two columns - sql

I have two comma-separated string which needs to be converted into a temptable with two columns synchronized based on the index.
If the input string as below
a = 'abc,def,ghi'
b = 'aaa,bbb,ccc'
then output should be
column1 | column2
------------------
abc | aaa
def | bbb
ghi | ccc
Let us say I have function fnConvertCommaSeparatedStringToColumn which takes in comma-separated string and delimiter as a parameter and returns a column with values. I use this on both strings and get two columns to verify if the count is the same on both sides. But it would be nice two have them in a single temp table. How can i do that?

Let us say I have function which ... returns a column with values.
At that point, the basic idea is to select the column and use the row_number() function with both of your strings. Then you can JOIN the two together using the row_number() result as the matching field for the join.

One method is a recursive CTE:
with cte as (
select convert(varchar(max), null) as a_part, convert(varchar(max), null) as b_part,
convert(varchar(max), 'abc,def,ghi') + ',' as a,
convert(varchar(max), 'aaa,bbb,ccc') + ',' as b,
0 as lev
union all
select convert(varchar(max), left(a, charindex(',', a) - 1)),
convert(varchar(max), left(b, charindex(',', b) - 1)),
stuff(a, 1, charindex(',', a), ''),
stuff(b, 1, charindex(',', b), ''),
lev + 1
from cte
where a <> '' and lev < 10
)
select a_part, b_part
from cte
where lev > 0;
Here is a db<>fiddle.

Here's something a bit sneaky you can try.
I don't have your bespoke function so have used the built-in string_split function (SQL2016+) - for quickly testing, but assuming the parameters are the same. Ideally, your bespoke function should return its own row number in which case you'd use that instead of a rownumber function.
declare #a varchar(20)='abc,def,ghi', #b varchar(20)='aaa,bbb,ccc';
with v as (
select a.value A,b.value B,
row_number() over(partition by a.value order by (select 1/0))Arn,
row_number() over(partition by b.value order by (select 1/0))Brn
from fnConvertCommaSeparatedStringToColumn (#a,',')a
cross apply fnConvertCommaSeparatedStringToColumn (#b,',')b
)
select A,B from v
where Arn=Brn

I would suggest getting a (set based) function that can split a string, based on a delimiter, that returns the ordinal position as well. For example DelimitedSplit8k_LEAD. Then you can trivially split the value, and JOIN on the ordinal position:
DECLARE #a varchar(100) = 'abc,def,ghi';
DECLARE #b varchar(100) = 'aaa,bbb,ccc';
SELECT A.Item AS A,
B.Item AS B
FROM dbo.delimitedsplit8k_lead(#a,',') A
FULL OUTER JOIN dbo.delimitedsplit8k_lead(#a,',') B ON A.ItemNumber = B.ItemNumber;
db<>fiddle
I use a FULL OUTER JOIN and then if either column has a NULL value you know that the 2 delimited lists don't have the same number of delimited values.

Related

How to combine return results of query in one row

I have a table that save personnel code.
When I select from this table I get 3 rows result such as:
2129,3394,3508,3534
2129,3508
4056
I want when create select result combine in one row such as:
2129,3394,3508,3534,2129,3508,4056
or distinct value such as:
2129,3394,3508,3534,4056
You should ideally avoid storing CSV data at all in your tables. That being said, for your first result set we can try using STRING_AGG:
SELECT STRING_AGG(col, ',') AS output
FROM yourTable;
Your second requirement is more tricky, and we can try going through a table to remove duplicates:
WITH cte AS (
SELECT DISTINCT VALUE AS col
FROM yourTable t
CROSS APPLY STRING_SPLIT(t.col, ',')
)
SELECT STRING_AGG(col, ',') WITHIN GROUP (ORDER BY CAST(col AS INT)) AS output
FROM cte;
Demo
I solved this by using STUFF and FOR XML PATH:
SELECT
STUFF((SELECT ',' + US.remain_uncompleted
FROM Table_request US
WHERE exclusive = 0 AND reqact = 1 AND reqend = 0
FOR XML PATH('')), 1, 1, '')
Thank you Tim

Alphanumeric sort on nvarchar(50) column

I am trying to write a query that will return data sorted by an alphanumeric column, Code.
Below is my query:
SELECT *
FROM <<TableName>>
CROSS APPLY (SELECT PATINDEX('[A-Z, a-z][0-9]%', [Code]),
CHARINDEX('', [Code]) ) ca(PatPos, SpacePos)
CROSS APPLY (SELECT CONVERT(INTEGER, CASE WHEN ca.PatPos = 1 THEN
SUBSTRING([Code], 2,ISNULL(NULLIF(ca.SpacePos,0)-2, 8000)) ELSE NULL END),
CASE WHEN ca.PatPos = 1 THEN LEFT([Code],
ISNULL(NULLIF(ca.SpacePos,0)-0,1)) ELSE [Code] END) ca2(OrderBy2, OrderBy1)
WHERE [TypeID] = '1'
OUTPUT:
FFS1
FFS2
...
FFS12
FFS1.1
FFS1.2
...
FFS1.1E
FFS1.1R
...
FFS12.1
FFS12.2
FFS.12.1E
FFS12.1R
FFS12.2E
FFS12.2R
DESIRED OUTPUT:
FFS1
FFS1.1
FFS1.1E
FFS1.1R
....
FFS12
FFS12.1
FFS12.1E
FFS12.1R
What am I missing or overlooking?
EDIT:
Let me try to detail the table contents a little better. There are records for FFS1 - FFS12. Those are broken into X subs, i.e., FFS1.1 - FFS1.X to FFS12.1 - FFS12.X. The E and the R was not a typo, each sub record has two codes associated with it: FFS1.1E & FFS1.1R.
Additionally I tried using ORDER BY but it sorted as
FFS1
...
FFS10
FFS2
This will work for any count of parts separated by dots. The sorting is alphanumerical for each part separately.
DECLARE #YourValues TABLE(ID INT IDENTITY, SomeVal VARCHAR(100));
INSERT INTO #YourValues VALUES
('FFS1')
,('FFS2')
,('FFS12')
,('FFS1.1')
,('FFS1.2')
,('FFS1.1E')
,('FFS1.1R')
,('FFS12.1')
,('FFS12.2')
,('FFS.12.1E')
,('FFS12.1R')
,('FFS12.2E')
,('FFS12.2R');
--The query
WITH Splittable AS
(
SELECT ID
,SomeVal
,CAST(N'<x>' + REPLACE(SomeVal,'.','</x><x>') + N'</x>' AS XML) AS Casted
FROM #YourValues
)
,Parted AS
(
SELECT Splittable.*
,ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS PartNmbr
,A.part.value(N'text()[1]','nvarchar(max)') AS Part
FROM Splittable
CROSS APPLY Splittable.Casted.nodes(N'/x') AS A(part)
)
,AddSortCrit AS
(
SELECT ID
,SomeVal
,(SELECT LEFT(x.Part + REPLICATE(' ',10),10) AS [*]
FROM Parted AS x
WHERE x.ID=Parted.ID
ORDER BY PartNmbr
FOR XML PATH('')
) AS SortColumn
FROM Parted
GROUP BY ID,SomeVal
)
SELECT ID
,SomeVal
FROM AddSortCrit
ORDER BY SortColumn;
The result
ID SomeVal
10 FFS.12.1E
1 FFS1
4 FFS1.1
6 FFS1.1E
7 FFS1.1R
5 FFS1.2
3 FFS12
8 FFS12.1
11 FFS12.1R
9 FFS12.2
12 FFS12.2E
13 FFS12.2R
2 FFS2
Some explanation:
The first CTE will transform your codes to XML, which allows to address each part separately.
The second CTE returns each part toegther with a number.
The third CTE re-concatenates your code, but each part is padded to a length of 10 characters.
The final SELECT uses this new single-string-per-row in the ORDER BY.
Final hint:
This design is bad! You should not store these values in concatenated strings... Store them in separate columns and fiddle them together just for the output/presentation layer. Doing so avoids this rather ugly fiddle...

ORDER BY specific numerical value in string [SQL]

Have a column ID that I would like to ORDER in a specific format. Column has a varchar data type and always has an alphabetic value, typically P in front followed by three to four numeric values. Possibly even followed by an underscore or another alphabetic value. I have tried few options and none are returning what I desire.
SELECT [ID] FROM MYTABLE
ORDER BY
(1) LEN(ID), ID ASC
/ (2) LEFT(ID,2)
OPTIONS TRIED (3) SUBSTRING(ID,2,4) ASC
\ (4) ROW_NUMBER() OVER (ORDER BY SUBSTRING(ID,2,4))
(5) SUBSTRING(ID,PATINDEX('%[0-9]%',ID),LEN(ID))
(6) LEFT(ID, PATINDEX('%[0-9]%', ID)-1)
Option 1 seems to be closest to what I am looking for except when an _ or Alphabetic values follow the numeric value. See results from Option 1 below
P100
P208
P218
P301
P305
P306
P4200
P4510
P4511
P4512
P5011
P1400A
P4125H
P4202A
P4507L
P4706A
P1001_2
P2103_B
P4368_RL
Would like to see..
P100
P208
P218
P301
P305
P306
P1001_2
P1400A
P2103_B
P4125H
P4200
P4202A
P4368_RL
P4507L
P4510
P4511
P4512
P4706A
P5011
ORDER BY
CAST(SUBSTRING(id, 2, 4) AS INT),
SUBSTRING(id, 6, 3)
http://sqlfiddle.com/#!6/9eecb7db59d16c80417c72d1e1f4fbf1/9464
And one that's still less complex than a getOnlyNumbers() UDF, but copes with varying length of numeric part.
CROSS APPLY
(
SELECT
tail_start = PATINDEX('%[0-9][^0-9]%', id + '_')
)
stats
CROSS APPLY
(
SELECT
numeric = CAST(SUBSTRING(id, 2, stats.tail_start-1) AS INT),
alpha = RIGHT(id, LEN(id) - stats.tail_start)
)
id_tuple
ORDER BY
id_tuple.numeric,
id_tuple.alpha
http://sqlfiddle.com/#!6/9eecb7db59d16c80417c72d1e1f4fbf1/9499
Finally, one that can cope with there being no number at all (but still assumes the first character exists and should be ignored).
CROSS APPLY
(
SELECT
tail_start = NULLIF(PATINDEX('%[0-9][^0-9]%', id + '_'), 0)
)
stats
CROSS APPLY
(
SELECT
numeric = CAST(SUBSTRING(id, 2, stats.tail_start-1) AS INT),
alpha = RIGHT(id, LEN(id) - ISNULL(stats.tail_start, 1))
)
id_tuple
ORDER BY
id_tuple.numeric,
id_tuple.alpha
http://sqlfiddle.com/#!6/9eecb7db59d16c80417c72d1e1f4fbf1/9507
This is a rather strange way to sort but now that I understand it I figured out a solution. I am using a table valued function here to strip out only the numbers from a string. Since the function returns all numeric characters I also need to check for the _ and only pass in the part of the string before that.
Here is the function.
create function GetOnlyNumbers
(
#SearchVal varchar(8000)
) returns table as return
with MyValues as
(
select substring(#SearchVal, N, 1) as number
, t.N
from cteTally t
where N <= len(#SearchVal)
and substring(#SearchVal, N, 1) like '[0-9]'
)
select distinct NumValue = STUFF((select number + ''
from MyValues mv2
order by mv2.N
for xml path('')), 1, 0, '')
from MyValues mv
This function is using a tally table. If you have one you can tweak that code slightly to fit. Here is my tally table. I keep it as a view.
create View [dbo].[cteTally] as
WITH
E1(N) AS (select 1 from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))dt(n)),
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS
(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
)
select N from cteTally
GO
Next of course we need to have some data to work. In this case I just created a table variable to represent your actual table.
declare #Something table
(
SomeVal varchar(10)
)
insert #Something values
('P100')
, ('P208')
, ('P218')
, ('P301')
, ('P305')
, ('P306')
, ('P4200')
, ('P4510')
, ('P4511')
, ('P4512')
, ('P5011')
, ('P1400A')
, ('P4125H')
, ('P4202A')
, ('P4507L')
, ('P4706A')
, ('P1001_2')
, ('P2103_B')
, ('P4368_RL')
With all the legwork and setup behind us we can get to the actual query needed to accomplish this.
select s.SomeVal
from #Something s
cross apply dbo.GetOnlyNumbers(case when charindex('_', s.SomeVal) = 0 then s.SomeVal else left(s.SomeVal, charindex('_', s.SomeVal) - 1) end) x
order by convert(int, x.NumValue)
This returns the rows in the order you listed them in your question.
You can break down ID in steps to extract the number. Then, order by the number and ID. I like to break down long string manipulation into steps using CROSS APPLY. You can do it inline (it'd be long) or bundle it into an inline TVF.
SELECT t.*
FROM MYTABLE t
CROSS APPLY (SELECT NoP = STUFF(ID, 1, 1, '')) nop
CROSS APPLY (SELECT FindNonNumeric = LEFT(NoP, ISNULL(NULLIF(PATINDEX('%[^0-9]%', NoP)-1, -1), LEN(NoP)))) fnn
CROSS APPLY (SELECT Number = CONVERT(INT, FindNonNumeric)) num
ORDER BY Number
, ID;
I think your best bet is to create a function that strips the numbers out of the string, like this one, and then sort by that. Even better, as #SeanLange suggested, would be to use that function to store the number value in a new column and sort by that.

T-SQL function to split string with two delimiters as column separators into table

I'm looking for a t-sql function to get a string like:
a:b,c:d,e:f
and convert it to a table like
ID Value
a b
c d
e f
Anything I found in Internet incorporated single column parsing (e.g. XMLSplit function variations) but none of them letting me describe my string with two delimiters, one for column separation & the other for row separation.
Can you please guiding me regarding the issue? I have a very limited t-sql knowledge and cannot fork those read-made functions to get two column solution?
You can find a split() function on the web. Then, you can do string logic:
select left(val, charindex(':', val)) as col1,
substring(val, charindex(':', val) + 1, len(val)) as col2
from dbo.split(#str, ';') s(val);
You can use a custom SQL Split function in order to separate data-value columns
Here is a sql split function that you can use on a development system
It returns an ID value that can be helpful to keep id and value together
You need to split twice, first using "," then a second split using ";" character
declare #str nvarchar(100) = 'a:b,c:d,e:f'
select
id = max(id),
value = max(value)
from (
select
rowid,
id = case when id = 1 then val else null end,
value = case when id = 2 then val else null end
from (
select
s.id rowid, t.id, t.val
from (
select * from dbo.Split(#str, ',')
) s
cross apply dbo.Split(s.val, ':') t
) k
) m group by rowid

SQL Customized search with special characters

I am creating a key-wording module where I want to search data using the comma separated words.And the search is categorized into comma , and minus -.
I know a relational database engine is designed from the principle that a cell holds a single value and obeying to this rule can help for performance.But in this case table is already running and have millions of data and can't change the table structure.
Take a look on the example what I exactly want to do is
I have a main table name tbl_main in SQL
AS_ID KWD
1 Man,Businessman,Business,Office,confidence,arms crossed
2 Man,Businessman,Business,Office,laptop,corridor,waiting
3 man,business,mobile phone,mobile,phone
4 Welcome,girl,Greeting,beautiful,bride,celebration,wedding,woman,happiness
5 beautiful,bride,wedding,woman,girl,happiness,mobile phone,talking
6 woman,girl,Digital Tablet,working,sitting,online
7 woman,girl,Digital Tablet,working,smiling,happiness,hand on chin
If search text is = Man,Businessman then result AS_ID is =1,2
If search text is = Man,-Businessman then result AS_ID is =3
If search text is = woman,girl,-Working then result AS_ID is =4,5
If search text is = woman,girl then result AS_ID is =4,5,6,7
What is the best why to do this, Help is much appreciated.Thanks in advance
I think you can easily solve this by creating a FULL TEXT INDEX on your KWD column. Then you can use the CONTAINS query to search for phrases. The FULL TEXT index takes care of the punctuation and ignores the commas automatically.
-- If search text is = Man,Businessman then the query will be
SELECT AS_ID FROM tbl_main
WHERE CONTAINS(KWD, '"Man" AND "Businessman"')
-- If search text is = Man,-Businessman then the query will be
SELECT AS_ID FROM tbl_main
WHERE CONTAINS(KWD, '"Man" AND NOT "Businessman"')
-- If search text is = woman,girl,-Working the query will be
SELECT AS_ID FROM tbl_main
WHERE CONTAINS(KWD, '"woman" AND "girl" AND NOT "working"')
To search the multiple words (like the mobile phone in your case) use the quoted phrases:
SELECT AS_ID FROM tbl_main
WHERE CONTAINS(KWD, '"woman" AND "mobile phone"')
As commented below the quoted phrases are important in all searches to avoid bad searches in the case of e.g. when a search term is "tablet working" and the KWD value is woman,girl,Digital Tablet,working,sitting,online
There is a special case for a single - search term. The NOT cannot be used as the first term in the CONTAINS. Therefore, the query like this should be used:
-- If search text is = -Working the query will be
SELECT AS_ID FROM tbl_main
WHERE NOT CONTAINS(KWD, '"working"')
Here is my attempt using Jeff Moden's DelimitedSplit8k to split the comma-separated values.
First, here is the splitter function (check the article for updates of the script):
CREATE FUNCTION [dbo].[DelimitedSplit8K](
#pString VARCHAR(8000), #pDelimiter CHAR(1)
)
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
)
,E2(N) AS (SELECT 1 FROM E1 a, E1 b)
,E4(N) AS (SELECT 1 FROM E2 a, E2 b)
,cteTally(N) AS(
SELECT TOP (ISNULL(DATALENGTH(#pString), 0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
)
,cteStart(N1) AS(
SELECT 1 UNION ALL
SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(#pString, t.N, 1) = #pDelimiter
),
cteLen(N1, L1) AS(
SELECT
s.N1,
ISNULL(NULLIF(CHARINDEX(#pDelimiter, #pString, s.N1),0) - s.N1, 8000)
FROM cteStart s
)
SELECT
ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
Item = SUBSTRING(#pString, l.N1, l.L1)
FROM cteLen l
Here is the complete solution:
-- search parameter
DECLARE #search_text VARCHAR(8000) = 'woman,girl,-working'
-- split comma-separated search parameters
-- items starting in '-' will have a value of 1 for exclude
DECLARE #search_values TABLE(ItemNumber INT, Item VARCHAR(8000), Exclude BIT)
INSERT INTO #search_values
SELECT
ItemNumber,
CASE WHEN LTRIM(RTRIM(Item)) LIKE '-%' THEN LTRIM(RTRIM(STUFF(Item, 1, 1 ,''))) ELSE LTRIM(RTRIM(Item)) END,
CASE WHEN LTRIM(RTRIM(Item)) LIKE '-%' THEN 1 ELSE 0 END
FROM dbo.DelimitedSplit8K(#search_text, ',') s
;WITH CteSplitted AS( -- split each KWD to separate rows
SELECT *
FROM tbl_main t
CROSS APPLY(
SELECT
ItemNumber, Item = LTRIM(RTRIM(Item))
FROM dbo.DelimitedSplit8K(t.KWD, ',')
)x
)
SELECT
cs.AS_ID
FROM CteSplitted cs
INNER JOIN #search_values sv
ON sv.Item = cs.Item
GROUP BY cs.AS_ID
HAVING
-- all parameters should be included (Relational Division with no Remainder)
COUNT(DISTINCT cs.Item) = (SELECT COUNT(DISTINCT Item) FROM #search_values WHERE Exclude = 0)
-- no exclude parameters
AND SUM(CASE WHEN sv.Exclude = 1 THEN 1 ELSE 0 END) = 0
SQL Fiddle
This one uses a solution from the Relational Division with no Remainder problem discussed in this article by Dwain Camps.
From what you've described, you want the keywords that are included in the search text to be a match in the KWD column, and those that are prefixed with a - to be excluded.
Despite the data existing in this format, it still makes most sense to normalize the data, and then query based on the existence or non existence of the keywords.
To do this, in very rough terms:-
Create two additional tables - Keyword and tbl_Main_Keyword. Keyword contains a distinct list of each of the possible keywords and tbl_Main_Keyword contains a link between each record in tbl_Main to each Keyword record where there's a match. Ensure to create an index on the text field for the keyword (e.g. the Keyword.KeywordText column, or whatever you call it), as well as the KeywordID field in the tbl_Main_Keyword table. Create Foreign Keys between tables.
Write some DML (or use a separate program, such as a C# program) to iterate through each record, parsing the text, and inserting each distinct keyword encountered into the Keyword table. Create a relationship to the row for each keyword in the tbl_main record.
Now, for searching, parse out the search text into keywords, and compose a query against the tbl_Main_Keyword table containing both a WHERE KeywordID IN and WHERE KeywordID NOT IN clause, depending on whether there is a match.
Take note to consider whether the case of each keyword is important to your business case, and consider the collation (case sensitive or insensitive) accordingly.
I would prefer cha's solution, but here's another solution:
declare #QueryParts table (q varchar(1000))
insert into #QueryParts values
('woman'),
('girl'),
('-Working')
select AS_ID
from tbl_main
inner join #QueryParts on
(q not like '-%' and ',' + KWD + ',' like '%,' + q + ',%') or
(q like '-%' and ',' + KWD + ',' not like '%,' + substring(q, 2, 1000) + ',%')
group by AS_ID
having COUNT(*) = (select COUNT(*) from #QueryParts)
With such a design, you would have two tables. One that defines the IDs and a subtable that holds the set of keywords per search string.
Likewise, you would transform the search strings into two tables, one for strings that should match and one for negated strings. Assuming that you put this in a stored procedure, these tables would be table-value parameters.
Once you have this set up, the query is simple to write:
SELECT M.AS_ID
FROM tbl_main M
WHERE (SELECT COUNT(*)
FROM tbl_keywords K
WHERE K.AS_ID = M.AS_ID
AND K.KWD IN (SELECT word FROM #searchwords)) =
(SELECT COUNT(*) FROM #searchwords)
AND NOT EXISTS (SELECT *
FROM tbl_keywords K
WHERE K.AS_ID = M.AS_ID
AND K.KWD IN (SELECT word FROM #minuswords))