SQL FOR XML PATH: get node count

I'm trying to use the 'For Xml Path' T-SQL to generate a comma separated list of values from a column. This seems to work great, but the problem is I would like to get a count of the items in the comma separated list. Here is an example of the code I am using to generate the comma separated list:
Create Table #List ([col] varchar)
Insert Into #List Select '1';
Insert Into #List Select '2';
Insert Into #List Select '3'
Select ',' + [col] From #List For Xml Path('')
This returns ,1,2,3 (the leading comma can be trimmed afterwards), but there is no way to get the count of 3 items. Any attempt to add a count just adds it to the XML. I combined this code with a CTE to get the count:
With CTE As (
Select
[col]
From
#List
)
Select
(Select ',' + [col] From #List For Xml Path('')) As [List],
Count(*) As [Count]
From
CTE
Is there an easier/cleaner way to get the count of nodes without using a CTE? It was pointed out that you can just duplicate the from clause inside the inner select and outside, but that requires keeping the from clauses in sync. I want to get both the list and count, but only have the from clause written once.

How about drawing data from the CTE instead of the temp table?
With CTE As (
Select
[col]
From
#List
-- Many joins
-- Complicated where clause
)
Select
(Select ',' + [col] From Cte For Xml Path('')) As [List],
Count(*) As [Count]
From
CTE
This will allow you to keep your joins and search predicates in one place.

You don't need the CTE; you can use the subquery approach directly:
SELECT
COUNT(*) AS [Count],
(SELECT ',' + [col] FROM #List FOR XML PATH('')) AS [List]
FROM #List
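If SQL Server 2017 or later is available (an assumption; the question does not state a version), STRING_AGG can build the list and COUNT(*) can count the rows in the same aggregate query, so the FROM clause is written only once. A minimal sketch against the #List table from the question:

```sql
-- Assumes SQL Server 2017+ (STRING_AGG) and the #List table created in the question.
SELECT
    STRING_AGG([col], ',') WITHIN GROUP (ORDER BY [col]) AS [List],
    COUNT(*) AS [Count]
FROM #List;
```

For the three sample rows this returns the list 1,2,3 and a count of 3. STRING_AGG skips NULL values while COUNT(*) counts every row, so the two can disagree if [col] is nullable; COUNT([col]) avoids that.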

Related

SQL Server : Compare same string

I have a select query, say:
select details,* from employee
The details column can hold a value like 'very good,very good, bad', with any number of comma-separated values.
I want to compare the text between the commas and remove duplicates.
The result needs to be like 'very good,bad'.
How can I implement this?
Thanks in advance.
I have created a scalar-valued function, fn_RemoveDuplicate, which takes a varchar as input and returns a varchar with no duplicates.
You can then use it as
select dbo.fn_RemoveDuplicate(details),* from employee
Create FUNCTION fn_RemoveDuplicate
(
@inputstring varchar(max)
)
RETURNS varchar(max)
AS
BEGIN
declare @test2 varchar(max)
declare @test1 xml =cast(@inputstring as xml)
SET @test2 ='<Details>'+ cast(('<detail><value1>'+replace(@inputstring,',' ,'</value1></detail><detail><value1>')+'</value1></detail>') as varchar(max))+'</Details>'
set @test1=cast(@test2 as xml)
DECLARE @Details varchar(max)
SET @Details = NULL
SELECT @Details = COALESCE(@Details + ',','') + [value1]
FROM (select distinct
t.x.value('value1[1]','Varchar(50)') as value1
from @test1.nodes('/Details/detail') t(x)) as p
return @Details
END
If you use SQL Server 2016 or later, the following query solves your problem:
select
e.*,
x.[expected_result]
from
employee e
cross apply
(select
stuff((
select
distinct
','+ltrim(rtrim(value))
from
string_split(e.details, ',')
for xml path(''))
,1 ,1 ,'') as [expected_result]) as x
I solved it with the string_split() and stuff() functions. The following links explain how they work:
STRING_SPLIT (Transact-SQL)
STUFF (Transact-SQL)
SQL Server CROSS APPLY and OUTER APPLY
Storing data as comma-separated values is not good practice. I strongly suggest changing the model if possible.
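To make the string_split() and stuff() combination easier to follow, here is a small sketch that runs the two steps on a single value. The variable @details and the alias cleaned are only for illustration, and STRING_SPLIT is assumed to be available (SQL Server 2016+ with database compatibility level 130 or higher).

```sql
-- @details is a hypothetical sample value taken from the question text.
DECLARE @details varchar(100) = 'very good,very good, bad';

-- Step 1: STRING_SPLIT returns one row per item in a column named value.
-- The third item keeps its leading space (' bad'), hence the LTRIM/RTRIM below.
SELECT value
FROM STRING_SPLIT(@details, ',');

-- Step 2: trim, de-duplicate, prefix each item with a comma, concatenate with
-- FOR XML PATH(''), then let STUFF delete the first character (the leading comma).
SELECT STUFF((
        SELECT DISTINCT ',' + LTRIM(RTRIM(value))
        FROM STRING_SPLIT(@details, ',')
        FOR XML PATH('')
    ), 1, 1, '') AS cleaned;
```

The second query returns the two distinct items as one comma-separated string. The order of the items is not guaranteed, and FOR XML entity-escapes characters such as & and <, so values containing them need the TYPE / .value() variant.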
The idea for the solution is to use a table valued function (fn_SplitString), and combine the resultant table based on distinct values.
The following query should do what you want:
SELECT
[ID],[Details],
[cleansedDetails] = (SELECT
STUFF((
SELECT
DISTINCT ','+LTRIM(RTRIM(ISNULL(ncValue,cvalue)))
FROM
fn_SplitString([Details], ',')
FOR XML PATH(''))
,1 ,1 ,''))
FROM [dbo].[tb_Employee]
In this db<>fiddle, you can find the DDL and DML for my example data and the definition of the table-valued function fn_SplitString. You can check how the code works in different scenarios.

Order Concatenated field

I have a field which is a concatenation of single letters. I am trying to order these strings within a view. These values can't be hard coded as there are too many. Is someone able to provide some guidance on the function to use to achieve the desired output below? I am using MSSQL.
Current output
CustID | Code
123 | BCA
Desired output
CustID | Code
123 | ABC
I have tried using a UDF
CREATE FUNCTION [dbo].[Alphaorder] (@str VARCHAR(50))
returns VARCHAR(50)
BEGIN
DECLARE @len INT,
@cnt INT =1,
@str1 VARCHAR(50)='',
@output VARCHAR(50)=''
SELECT @len = Len(@str)
WHILE @cnt <= @len
BEGIN
SELECT @str1 += Substring(@str, @cnt, 1) + ','
SET @cnt+=1
END
SELECT @str1 = LEFT(@str1, Len(@str1) - 1)
SELECT @output += Sp_data
FROM (SELECT Split.a.value('.', 'VARCHAR(100)') Sp_data
FROM (SELECT Cast ('<M>' + Replace(@str1, ',', '</M><M>') + '</M>' AS XML) AS Data) AS A
CROSS APPLY Data.nodes ('/M') AS Split(a)) A
ORDER BY Sp_data
RETURN @output
END
This works when calling it for a single row, i.e.
Select CustID, dbo.alphaorder(Code)
from dbo.source
where custid = 123
However, when I try to apply this to top(10) I receive the error
"Invalid length parameter passed to the LEFT or SUBSTRING function."
Keeping in mind that my source has ~4 million records, is this still the best solution?
Unfortunately I am not able to normalize the data into a separate table with records for each Code.
This doesn't rely on an id column to join with itself; performance is almost as fast
as the answer by @Shnugo:
SELECT
CustID,
(
SELECT
chr
FROM
(SELECT TOP(LEN(Code))
SUBSTRING(Code,ROW_NUMBER() OVER(ORDER BY (SELECT NULL)),1)
FROM sys.messages) A(Chr)
ORDER by chr
FOR XML PATH(''), TYPE
).value('.', 'varchar(max)') As CODE
FROM
source t
First of all: Avoid loops...
You can try this:
DECLARE @tbl TABLE(ID INT IDENTITY, YourString VARCHAR(100));
INSERT INTO @tbl VALUES ('ABC')
,('JSKEzXO')
,('QKEvYUJMKRC');
--the cte will create a list of all your strings separated in single characters.
--You can check the output with a simple SELECT * FROM SeparatedCharacters instead of the actual SELECT
WITH SeparatedCharacters AS
(
SELECT *
FROM @tbl
CROSS APPLY
(SELECT TOP(LEN(YourString)) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM master..spt_values) A(Nmbr)
CROSS APPLY
(SELECT SUBSTRING(YourString,Nmbr,1))B(Chr)
)
SELECT ID,YourString
,(
SELECT Chr As [*]
FROM SeparatedCharacters sc1
WHERE sc1.ID=t.ID
ORDER BY sc1.Chr
FOR XML PATH(''),TYPE
).value('.','nvarchar(max)') AS Sorted
FROM @tbl t;
The result
ID YourString Sorted
1 ABC ABC
2 JSKEzXO EJKOSXz
3 QKEvYUJMKRC CEJKKMQRUvY
The idea in short
The trick is the first CROSS APPLY. This will create a tally on-the-fly. You will get a resultset with numbers from 1 to n where n is the length of the current string.
The second apply uses this number to get each character one-by-one using SUBSTRING().
The outer SELECT reads from the original table, which means one row per ID, and uses a correlated sub-query to fetch all related characters. They are sorted and re-concatenated using FOR XML. You might add DISTINCT in order to avoid repeating characters.
That's it :-)
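To see what the two CROSS APPLY steps produce, the following sketch runs the same tally-and-SUBSTRING idea on a single value. The variable @s is a hypothetical sample, and master..spt_values serves only as a convenient row source, as in the answer above.

```sql
-- @s is a hypothetical sample value; any table with enough rows can replace master..spt_values.
DECLARE @s varchar(100) = 'BCA';

SELECT A.Nmbr, B.Chr
FROM (SELECT TOP (LEN(@s)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
      FROM master..spt_values) A(Nmbr)            -- numbers 1..LEN(@s)
CROSS APPLY (SELECT SUBSTRING(@s, A.Nmbr, 1)) B(Chr)  -- the character at that position
ORDER BY A.Nmbr;
-- Nmbr Chr
-- 1    B
-- 2    C
-- 3    A
```

The approach only works while the row source has at least as many rows as the longest string.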
Hint: SQL Server 2017+
Starting with SQL Server 2017 there is the new function STRING_AGG(). This makes the re-concatenation very easy:
WITH SeparatedCharacters AS
(
SELECT *
FROM @tbl
CROSS APPLY
(SELECT TOP(LEN(YourString)) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM master..spt_values) A(Nmbr)
CROSS APPLY
(SELECT SUBSTRING(YourString,Nmbr,1))B(Chr)
)
SELECT ID,YourString
,STRING_AGG(sc.Chr,'') WITHIN GROUP(ORDER BY sc.Chr) AS Sorted
FROM SeparatedCharacters sc
GROUP BY ID,YourString;
Considering that your table has a large number of rows (~4 million), I would suggest creating a persisted computed column in the table to store these values, because calculating them at run time in a view will lead to performance problems.
If you are not able to normalize, add this as a denormalized column to the existing table.
I think the error you are getting could be due to empty codes.
If LEN(@str) = 0
BEGIN
SET @output = ''
END
ELSE
BEGIN
... EXISTING CODE BLOCK ...
END
I suggest splitting the string into its characters using the referenced SQL function.
Then you can concatenate the characters back, this time ordered alphabetically.
Are you using SQL Server 2017? With SQL Server 2017 you can use the STRING_AGG string aggregation function to concatenate the split characters in order, as follows:
select
t.CustId, string_agg(strval, '') within GROUP (order by strval)
from CharacterTable t
cross apply dbo.SPLIT(t.code) s
where strval is not null
group by CustId
order by CustId
If you are not on SQL Server 2017, you can use the structure below, which concatenates with FOR XML PATH:
select
CustId,
STUFF(
(
SELECT
'' + strval
from CharacterTable ct
cross apply dbo.SPLIT(t.code) s
where strval is not null
and t.CustId = ct.CustId
order by strval
FOR XML PATH('')
), 1, 0, ''
) As concatenated_string
from CharacterTable t
order by CustId

Remove duplicates when converting multiple rows into CSV using SQL

I am creating a view where I need to pull all the rows of one column and convert into CSV format.
SELECT
SUBSTRING((SELECT ',' + CAST(s.marketConfigId AS varchar(MAX))
FROM [Metric].[MetricGoalDefMarket] s
WHERE [metricGoalDefId]=21
ORDER BY s.marketConfigId
FOR XML PATH('')),2,200000) AS marketConfigID
The query above produces the CSV, but it also includes duplicates. Running it gives the output:
marketConfigID
751,751,742,751,751,784,1850,737
How can I remove duplicates?
PS: As I am creating a view, I don't want to use functions; I have seen that dbo.DistinctList from here can remove duplicates.
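The duplicates need to be removed inside the subquery. SQL Server rejects an ORDER BY on a column that is not in the select list of a SELECT DISTINCT, so this version de-duplicates with GROUP BY, which keeps the numeric ordering:
SELECT SUBSTRING((SELECT ',' + CAST(s.marketConfigId AS varchar(MAX))
FROM [Metric].[MetricGoalDefMarket] s
WHERE [metricGoalDefId]=21
GROUP BY s.marketConfigId
ORDER BY s.marketConfigId
FOR XML PATH('')
), 2,200000) AS marketConfigID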
You don't need substring(); you can use stuff() instead (with DISTINCT and no ORDER BY):
SELECT STUFF ( (SELECT DISTINCT ',' + CAST(s.marketConfigId AS varchar(MAX))
FROM [Metric].[MetricGoalDefMarket] s
WHERE [metricGoalDefId]=21
FOR XML PATH('')
), 1, 1, ''
) AS marketConfigID
Have you tried DISTINCT?
Edit: Apologies, I think I misread your query. I think you want to fetch the marketConfigIds using a sub-select (applying DISTINCT there), rather than at the outer level as I did below:
SELECT DISTINCT
SUBSTRING((SELECT ',' + CAST(s.marketConfigId AS varchar(MAX))
FROM [Metric].[MetricGoalDefMarket] s
WHERE [metricGoalDefId]=21
ORDER BY s.marketConfigId
FOR XML PATH('')),2,200000) AS marketConfigID

What is the meaning of SELECT ... FOR XML PATH(' '),1,1)?

I am learning SQL, and in one of the questions here I saw this being used. Can somebody help me understand what xml path('') means in SQL? And yes, I browsed through web pages, but I didn't understand it well.
I don't get the STUFF part either. What does this piece of code do (only the select part)?
declare @t table
(
Id int,
Name varchar(10)
)
insert into @t
select 1,'a' union all
select 1,'b' union all
select 2,'c' union all
select 2,'d'
select ID,
stuff(
(
select ','+ [Name] from @t where Id = t.Id for XML path('')
),1,1,'')
from (select distinct ID from @t )t
There's no real technique to learn here. It's just a cute trick to concatenate multiple rows of data into a single string. It's more a quirky use of a feature than an intended use of the XML formatting feature.
SELECT ',' + ColumnName ... FOR XML PATH('')
generates a set of comma separated values, based on combining multiple rows of data from the ColumnName column. It will produce a value like ,abc,def,ghi,jkl.
STUFF(...,1,1,'')
Is then used to remove the leading comma that the previous trick generated, see STUFF for details about its parameters.
(Strangely, a lot of people tend to refer to this method of generating a comma separated set of values as "the STUFF method" despite the STUFF only being responsible for a final bit of trimming)
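The two steps can be run separately to see what each one contributes. This is a sketch using the sample data from the question; the ORDER BY clauses and the alias Names are additions to make the output deterministic and readable.

```sql
DECLARE @t TABLE (Id int, Name varchar(10));
INSERT INTO @t VALUES (1, 'a'), (1, 'b'), (2, 'c'), (2, 'd');

-- Step 1: an unnamed column plus PATH('') yields plain text with a leading comma.
SELECT ',' + Name FROM @t WHERE Id = 1 ORDER BY Name FOR XML PATH('');
-- ,a,b

-- Step 2: STUFF(string, start, length, replacement) deletes 1 character at position 1.
SELECT STUFF(',a,b', 1, 1, '');
-- a,b

-- Both steps combined, evaluated once per distinct Id.
SELECT t.Id,
       STUFF((SELECT ',' + Name
              FROM @t
              WHERE Id = t.Id
              ORDER BY Name
              FOR XML PATH('')), 1, 1, '') AS Names
FROM (SELECT DISTINCT Id FROM @t) t
ORDER BY t.Id;
-- 1  a,b
-- 2  c,d
```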
The SQL you are referencing is used for string concatenation in MSSQL.
It concatenates the rows, prepending a , to each value, using for xml path, which results in
,a,b,c,d. Then stuff replaces the first , with an empty string, thus removing it.
The ('') in for xml path removes the wrapper node that is otherwise created automatically around each row. Without it the output would look like <row>,a</row><row>,b</row><row>,c</row><row>,d</row>.
...
stuff(
(
select ',' + CAST(t2.Value as varchar(10)) from @t t2 where t1.id = t2.id
for xml path('')
)
,1,1,'') as Value
...
more on stuff
more on for xml path
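The effect of the PATH argument and of the column name is easiest to see side by side. A small sketch with a hypothetical two-row table variable:

```sql
DECLARE @t TABLE (Name varchar(10));
INSERT INTO @t VALUES ('a'), ('b');

-- Default PATH: every row is wrapped in a <row> element.
SELECT ',' + Name FROM @t ORDER BY Name FOR XML PATH;
-- <row>,a</row><row>,b</row>

-- Empty element name: the row wrapper disappears.
SELECT ',' + Name FROM @t ORDER BY Name FOR XML PATH('');
-- ,a,b

-- A named column becomes an element, so the expression must stay unnamed
-- for the concatenation trick to produce plain text.
SELECT ',' + Name AS Name FROM @t ORDER BY Name FOR XML PATH('');
-- <Name>,a</Name><Name>,b</Name>
```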

Splitting a variable length column in SQL server safely

I have a column (varchar(400)) in the following form in a SQL table:
Info
UserID=1123456,ItemID=6685642
The column is created via our point of sale application, and so I cannot do the normal thing of simply splitting it into two columns as this would cause an obscene amount of work. My problem is that this column is used to store attributes of products in our database, and so while I am only concerned with UserID and ItemID, there may be superfluous information stored here, for example :
Info
IrrelevantID=666,UserID=123124,AnotherIrrelevantID=1232342,ItemID=1213124.
What I want to retrieve is simply two columns, with no error raised if these attributes do not exist in the Info column:
UserID ItemID
123124 1213124
Would it be possible to do this effectively, with error checking, given that the lengths of the IDs are all variable, but all of the attributes are comma-separated and follow a uniform style (i.e. "UserID=number")?
Can anyone tell me the best way of dealing with my problem?
Thanks a lot.
Try this
declare @infotable table (info varchar(4000))
insert into @infotable
select 'IrrelevantID=666,UserID=123124,AnotherIrrelevantID=1232342,ItemID=1213124.'
union all
select 'UserID=1123456,ItemID=6685642'
-- convert info column to xml type
; with cte as
(
select cast('<info ' + REPLACE(REPLACE(REPLACE(info,',', '" '),'=','="'),'.','') + '" />' as XML) info,
ROW_NUMBER() over (order by info) id
from @infotable
)
select userId, ItemId from
(
select T.N.value('local-name(.)', 'varchar(max)') as Name,
T.N.value('.', 'varchar(max)') as Value, id
from cte cross apply info.nodes('//@*') as T(N)
) v
pivot (max(value) for Name in ([UserID], [ItemId])) p
SQL DEMO
You can try this split function: http://www.sommarskog.se/arrays-in-sql-2005.html
Assuming ItemID=1213124. is terminated with a dot.
Declare @t Table (a varchar(400))
insert into @t values ('IrrelevantID=666,UserID=123124,AnotherIrrelevantID=1232342,ItemID=1213124.')
insert into @t values ('IrrelevantID=333,UserID=222222,AnotherIrrelevantID=0,ItemID=111.')
Select
STUFF(
Stuff(a,1,CHARINDEX(',UserID=',a) + Len(',UserID=')-1 ,'' )
,CharIndex
(',',
Stuff(a,1,CHARINDEX(',UserID=',a) + Len(',UserID=')-1 ,'' )
)
,400,'') as UserID
,
STUFF(
Stuff(a,1,CHARINDEX(',ItemID=',a) + Len(',ItemID=')-1 ,'' )
,CharIndex
('.',
Stuff(a,1,CHARINDEX(',ItemID=',a) + Len(',ItemID=')-1,'' )
)
,400,'') as ItemID
from @t