Convert multiple XML nodes' data into varchar - sql

I want to convert an xml string like this :
'<orga_label>ORG1</orga_label><orga_label>ORG2</orga_label><orga_label>ORG3</orga_label>'
into a varchar like this :
'ORG1, ORG2, ORG3'
in t-sql in one query.
Is that possible?

You can keep it very simple and avoid XML methods here...
DECLARE @foo xml = '<orga_label>ORG1</orga_label><orga_label>ORG2</orga_label><orga_label>ORG3</orga_label>';
SELECT
REPLACE(
REPLACE(
REPLACE(
CONVERT(nvarchar(4000), @foo), '</orga_label><orga_label>', ', '
),
'<orga_label>', ''
),
'</orga_label>', ''
);
Edit: this has the advantage of not invoking the XML methods and processor.

It's better to use an XML parser in a scripting language like Ruby.
require 'rexml/document'
xml = REXML::Document.new(File.open("filename/filename.XML"))
# orga_label is an element, so read its text rather than an attribute
xml.each_element('//orga_label') do |sdobi|
  puts sdobi.text
end
If you really want to use SQL, it's a little more complex:
SELECT SUBSTRING( columnname, LOCATE( '<orga_label',columnname ) +12, LOCATE( '</', columnname) ) from tablename
If the substring is not right, try adjusting the numbers.

declare @xml xml = '<orga_label>ORG1</orga_label><orga_label>ORG2</orga_label><orga_label>ORG3</orga_label>';
select stuff((select
',' + s from (
select
a.b.value('(.)[1]', 'varchar(50)') s
from @xml.nodes('/orga_label') a(b)
) t
for xml path('')
),1,1,'');
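On SQL Server 2017 or later, the same shredding step can probably be combined with STRING_AGG instead of the STUFF / FOR XML PATH trick. This is only a sketch, assuming that version is available (the question does not say which one is in use); the alias labels is an illustrative name, and the order of the aggregated values is not guaranteed without a WITHIN GROUP clause.

```sql
DECLARE @xml xml = '<orga_label>ORG1</orga_label><orga_label>ORG2</orga_label><orga_label>ORG3</orga_label>';

-- Shred the fragment into one row per <orga_label>, then aggregate the rows.
-- STRING_AGG requires SQL Server 2017+ (assumption: not stated in the question).
SELECT STRING_AGG(t.s, ', ') AS labels
FROM (
    SELECT a.b.value('(.)[1]', 'varchar(50)') AS s
    FROM @xml.nodes('/orga_label') AS a(b)
) AS t;
```

Unlike the FOR XML version, this does not entity-encode characters such as & or < in the values, so no decoding step is needed.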

Related

How to encode XML in T SQL without the additional XML overhead

I have a database which (For whatever reason) has a column containing pipe delimited data.
I want to parse this data quickly, so I've thought of converting this column (nvarchar) into XML by replacing the pipes with XML elements and putting it into an XML data typed column somewhere else.
It works, except in the case where that column had a character that required encoding, such as a '<' character.
I found I could encode XML using FOR XML clause, however, that appears to inject some XML tags around the data.
For example: (this gives error on bad character)
SELECT CAST('<f>' + replace(value,'|','</f><f>') + '</f>' AS XML)
FROM TABLE
this gives the XML-encoded value, but wraps it in a "<value></value>" tag
SELECT value
FROM table
FOR XML PATH('')
Any ideas on how I can get the XML encoded value without this extra tag added, so I can convert the pipe format to XML after it's done (preferably in one swoop)?
EDIT: since people are asking, this is what 5 potential rows of data might look like
foo
foo|bar
foo|bar|1
foo||
baz|
And the results would be
Col1, Col2, Col3
foo,null,null
foo,bar,null
foo,bar,1
foo,null,null
baz,null,null
I'm achieving this by using the resulting XML type in a subquery such as: (it can be up to 4 columns or 3 pipes in any given row)
SELECT
*,
x.query('f[1]').value('.','nVarChar(2048)') Col1
,x.query('f[2]').value('.','nVarChar(2048)') Col2
,x.query('f[3]').value('.','nvarchar(2048)') Col3
,x.query('f[4]').value('.','nvarchar(2048)') Col4
FROM
(
SELECT *,
CAST('<f>' + REPLACE(Value,'|','</f><f>') + '</f>' AS XML) as x
FROM table
) y
@srutzky makes a great point. No, I do not need to do XML here at all. If I can find a fast & clean way to parse pipes in a set based operation, I'll do that. Will review the SQL# documentation...
SELECT CAST('<values><f>' +
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(value,'&','&amp;')
,'"','&quot;')
,'<','&lt;')
,'>','&gt;')
,'|','</f><f>') + '</f></values>' AS XML)
FROM TABLE;
You could try the following, BUT you need to make sure the content is "XML safe"; in other words, the content must not contain values which XML will reject (look into XML element content parsing).
Try the following... it's a test script to see if it does what you want.
UPDATE:
OK, it might help if I read the question all the way through... 2 steps: split the pipes, and then XML-encode all the split items. Try this:
Create the following function:
CREATE FUNCTION [dbo].[udf_SPLIT]
(
@s nvarchar(max),
@trimPieces bit,
@returnEmptyStrings bit,
@delimiter nvarchar(10)
)
RETURNS @t TABLE (val nvarchar(max))
AS
BEGIN
DECLARE @i int, @j int
SELECT @i = 0, @j = (LEN(@s) - LEN(REPLACE(@s,@delimiter,'')))
;WITH cte AS
(
SELECT i = @i + 1,
s = @s,
n = substring(@s, 0, charindex(@delimiter, @s)),
m = substring(@s, charindex(@delimiter, @s)+1, len(@s) - charindex(@delimiter, @s))
UNION ALL
SELECT i = cte.i + 1,
s = cte.m,
n = substring(cte.m, 0, charindex(@delimiter, cte.m)),
m = substring(cte.m, charindex(@delimiter, cte.m) + 1, len(cte.m)-charindex(@delimiter, cte.m))
FROM cte
WHERE i <= @j
)
INSERT INTO @t (val)
SELECT [pieces]
FROM (
SELECT CASE
WHEN @trimPieces = 1 THEN LTRIM(RTRIM(CASE WHEN i <= @j THEN n ELSE m END))
ELSE CASE WHEN i <= @j THEN n ELSE m END
END AS [pieces]
FROM cte
) t
WHERE (@returnEmptyStrings = 0 AND LEN(pieces) > 0)
OR (@returnEmptyStrings = 1)
OPTION (maxrecursion 0)
RETURN
END
Next, try the following to test...
DECLARE @str nvarchar(500) = 'test|<html>this</html>|boogie woogie| SDGDSFG| game<br /> on |working| this|'
SELECT REPLACE(
REPLACE(
REPLACE(
REPLACE([val],'&','&amp;')
,'"','&quot;')
,'<','&lt;')
,'>','&gt;')
AS [f]
FROM [dbo].[udf_SPLIT](@str,1,0,'|')
FOR XML PATH('')
If not totally correct, hopefully will put you on right path...
HTH
Dave
Your idea was absolutely OK: running your string through FOR XML lets the XML engine escape all special characters properly, so the XML you build after splitting should be valid.
If your string is stored in a column, you can avoid the automatically generated element name either by wrapping the column in an expression (something like '' + YourColumn) or by giving the column the alias AS [*]:
Try it like this:
DECLARE @str VARCHAR(100)='300|2€&ÄÖÜ|This is text -> should be text|2015-12-31';
SELECT @str FOR XML PATH('');
/*
300|2€&amp;ÄÖÜ|This is text -&gt; should be text|2015-12-31
*/
DECLARE @Xml XML=(SELECT CAST('<x>' + REPLACE((SELECT @str FOR XML PATH('')),'|','</x><x>')+'</x>' AS XML));
SELECT @Xml.value('/x[1]','int') AS IntTypeSave
,@Xml.value('/x[3]','varchar(max)') AS VarcharTypeSave
,@Xml.value('/x[4]','datetime') AS DateTypeSave;
/*
300 This is text -> should be text 2015-12-31 00:00:00.000
*/
SELECT X.value('.','varchar(max)') AS EachX
FROM @Xml.nodes('/x') AS Each(X);
/*
300
2€&ÄÖÜ
This is text -> should be text
2015-12-31
*/
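For a string stored in a column rather than a variable, the AS [*] alias mentioned above can be applied per row. The following is a minimal sketch, not the asker's actual schema: the table variable @t, its Value column and the sample rows are all hypothetical.

```sql
-- Hypothetical sample data; @t and its Value column are illustrative names.
DECLARE @t TABLE (Value nvarchar(100));
INSERT INTO @t (Value) VALUES (N'foo|bar|1'), (N'a<b|x&y');

SELECT t.Value,
       y.x.value('/f[1]', 'nvarchar(2048)') AS Col1,
       y.x.value('/f[2]', 'nvarchar(2048)') AS Col2,
       y.x.value('/f[3]', 'nvarchar(2048)') AS Col3
FROM @t AS t
CROSS APPLY (
    -- The inner SELECT ... AS [*] FOR XML PATH('') entity-encodes the value
    -- without adding a wrapper element; the pipes are then turned into <f> tags.
    SELECT CAST('<f>' + REPLACE((SELECT t.Value AS [*] FOR XML PATH('')), '|', '</f><f>') + '</f>' AS xml) AS x
) AS y;
```

Because the encoding happens before the pipes are replaced, characters such as < and & in the data should no longer break the cast to XML, and .value() returns them decoded.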

how to execute subquery without declaring XML?

Why does this query not execute?
SELECT [Value] = T.c.value('.','varchar(30)') FROM (SELECT '<s>'+ REPLACE ((select tag_id+',' from tbl_container_track for xml path('')),',','</s> <s>')+ '</s>').nodes('/s') T(c)
But this one works:
declare @X xml
SELECT @X = (SELECT '<s>'+ REPLACE ((select tag_id+',' from tbl_container_track for xml path('')),',','</s> <s>')+ '</s>')
SELECT [Value] = T.c.value('.','varchar(30)') FROM @X.nodes('/s') T(c)
Can someone help me simplify this without declaring @X?
Try this; you missed the CAST to the XML data type:
SELECT [Value] = T.c.value('.', 'varchar(30)')
FROM (SELECT Cast(( '<s>' + Replace ((SELECT tag_id+',' FROM tbl_container_track FOR xml path('')), ',', '</s> <s>')
+ '</s>' ) AS XML)) AS Data
CROSS APPLY Data.nodes('/s') T(c)

What is the meaning of SELECT ... FOR XML PATH(' '),1,1)?

I am learning SQL and came across this usage in one of the questions. Can somebody help me understand what FOR XML PATH('') means in SQL? I have browsed several web pages but still don't understand it well.
I also don't get the STUFF part. What does this piece of code do (only the SELECT part)?
declare @t table
(
Id int,
Name varchar(10)
)
insert into @t
select 1,'a' union all
select 1,'b' union all
select 2,'c' union all
select 2,'d'
select ID,
stuff(
(
select ','+ [Name] from @t where Id = t.Id for XML path('')
),1,1,'')
from (select distinct ID from @t )t
There's no real technique to learn here. It's just a cute trick to concatenate multiple rows of data into a single string. It's more a quirky use of a feature than an intended use of the XML formatting feature.
SELECT ',' + ColumnName ... FOR XML PATH('')
generates a set of comma separated values, based on combining multiple rows of data from the ColumnName column. It will produce a value like ,abc,def,ghi,jkl.
STUFF(...,1,1,'')
Is then used to remove the leading comma that the previous trick generated, see STUFF for details about its parameters.
(Strangely, a lot of people tend to refer to this method of generating a comma separated set of values as "the STUFF method" despite the STUFF only being responsible for a final bit of trimming)
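The two steps described above can be run separately to see what each one contributes. This is a small sketch reusing the question's sample data; the ORDER BY is an addition here so that the concatenation order is deterministic.

```sql
DECLARE @t TABLE (Id int, Name varchar(10));
INSERT INTO @t (Id, Name) VALUES (1, 'a'), (1, 'b'), (2, 'c'), (2, 'd');

-- Step 1: with an unnamed column and an empty PATH, the rows collapse into one string.
SELECT ',' + Name FROM @t WHERE Id = 1 ORDER BY Name FOR XML PATH('');  -- ,a,b

-- Step 2: STUFF deletes 1 character starting at position 1 and inserts '' in its place.
SELECT STUFF(',a,b', 1, 1, '');  -- a,b
```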
The SQL you are referencing is used for string concatenation in MSSQL.
It concatenates rows by prepending , to each value and using for xml path, giving the result
,a,b,c,d. Then stuff replaces the first , with an empty string, thus removing it.
The ('') in for xml path removes the wrapper node that would otherwise be created automatically around each row, giving something like <row>,a</row><row>,b</row>.
...
stuff(
(
select ',' + CAST(t2.Value as varchar(10)) from @t t2 where t1.id = t2.id
for xml path('')
)
,1,1,'') as Value
...
more on stuff
more on for xml path

Concatenating Values Within SQL XML field

I have a table in SQL Server 2012 which has an XML field. The field contains arrays (the number of elements is not constant) in the following format:
<values>
<value>A</value>
<value>B</value>
<value>C</value>
<value>D</value>
</values>
and I would like to turn it into a varchar like this:
'A;B;C;D'
I have tried:
SELECT myField.value('.', 'NVARCHAR(50)')
FROM myTable
which creates 'ABCD' but I don't know how to delimit it (In the real case they are not single character values).
Try this
DECLARE @myTable TABLE (id int,myField XML)
INSERT INTO @myTable(id,myField) VALUES(1,'<values>
<value>A</value>
<value>B</value>
<value>C</value>
<value>D</value>
</values>')
;WITH xmltable
AS
(
SELECT id, myField.v.value('.', 'varchar(200)') AS myField
FROM @myTable
CROSS APPLY myField.nodes('/values/value') AS myField(v)
)
)
SELECT STUFF((SELECT ';' + myField
FROM xmltable t2
WHERE t2.id = t1.id
FOR XML PATH('')),1,1,'') AS myField
FROM xmltable t1
GROUP BY id
I've thought of a hack to achieve this...
SELECT REPLACE(
REPLACE(
REPLACE(
CAST([myField] As NVARCHAR(MAX)),
'<values><value>',
''),
'</value></values>',
''),
'</value><value>',
';'
) As [Hacked]
FROM [myTable]
...but it makes me feel a bit dirty. There has to be a better way.
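A variation that avoids both the CTE and the string replacement is to shred and re-concatenate inside one correlated subquery. The sketch below is an assumption-laden illustration rather than a tested answer from this thread: @myTable and the alias joined are made-up names, and the values are assumed to come back in document order, which nodes() usually does but does not formally promise.

```sql
DECLARE @myTable TABLE (id int, myField xml);
INSERT INTO @myTable (id, myField)
VALUES (1, '<values><value>A</value><value>B</value><value>C</value><value>D</value></values>');

SELECT t.id,
       -- Shred this row's XML, prefix each value with ';', glue the pieces together,
       -- then strip the leading ';' with STUFF.
       STUFF((SELECT ';' + v.x.value('.', 'nvarchar(200)')
              FROM t.myField.nodes('/values/value') AS v(x)
              FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 1, '') AS joined
FROM @myTable AS t;
```

The TYPE directive followed by .value() keeps characters such as & and < in the values from coming back entity-encoded.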

TSQL Reverse FOR XML Encoding

I am using FOR XML in a query to join multiple rows together, but the text contains quotes, "<", ">", etc. I need the actual character instead of the encoded value like "&quot;" etc. Any suggestions?
Basically, what you're asking for is invalid XML, and luckily SQL Server will not produce it. You can take the generated XML and extract the content, and this operation will revert the escaped characters to their text representation. This revert normally occurs in the presentation layer, but it can occur in SQL Server itself, for instance by using XML methods to extract the content of the produced FOR XML output. For example:
declare @text varchar(max) = 'this text has < and >';
declare @xml xml;
set @xml = (select @text as [node] for xml path('nodes'), type);
select @xml;
select x.value(N'.', N'varchar(max)') as [text]
from @xml.nodes('//nodes/node') t(x);
I have a similar requirement to extract column names for use in PIVOT query.
The solution I used was as follows:
SELECT @columns = STUFF((SELECT '],[' + Value
FROM Table
ORDER BY Value
FOR XML PATH('')), 1, 2, '') + ']'
This produces a single string:
[Value 1],[Value 2],[Value 3]
I hope this points you in the right direction.
--something like this?
SELECT * INTO #Names FROM (
SELECT Name='<>&' UNION ALL
SELECT Name='ab<>'
) Names;
-- 1)
SELECT STUFF(
(SELECT ', ' + Name FROM #Names FOR XML PATH(''))
,1,2,'');
-- 2)
SELECT STUFF(
(SELECT ', ' + Name FROM #Names FOR XML PATH(''),TYPE).value('text()[1]','nvarchar(max)')
,1,2,'');
-- 2) is slower but will not return the encoded value.
Hope it helps.