TSQL Reverse FOR XML Encoding - sql

I am using FOR XML in a query to join multiple rows together, but the text contains quotes, "<", ">", etc. I need the actual characters instead of encoded values like "&quot;". Any suggestions?

Basically, what you're asking for is invalid XML, and luckily SQL Server will not produce it. You can take the generated XML and extract its content; that operation reverts the escaped characters to their text representation. This reversal normally happens in the presentation layer, but it can be done in SQL Server itself, for instance by using the XML methods to extract the content of the FOR XML output. For example:
declare @text varchar(max) = 'this text has < and >';
declare @xml xml;
set @xml = (select @text as [node] for xml path('nodes'), type);
select @xml;
select x.value(N'.', N'varchar(max)') as [text]
from @xml.nodes('//nodes/node') t(x);

I have a similar requirement to extract column names for use in PIVOT query.
The solution I used was as follows:
SELECT @columns = STUFF((SELECT '],[' + Value
FROM Table
ORDER BY Value
FOR XML PATH('')), 1, 2, '') + ']'
This produces a single string:
[Value 1],[Value 2],[Value 3]
I hope this points you in the right direction.

--something like this?
SELECT * INTO #Names FROM (
SELECT Name='<>&' UNION ALL
SELECT Name='ab<>'
) Names;
-- 1)
SELECT STUFF(
(SELECT ', ' + Name FROM #Names FOR XML PATH(''))
,1,2,'');
-- 2)
SELECT STUFF(
(SELECT ', ' + Name FROM #Names FOR XML PATH(''),TYPE).value('text()[1]','nvarchar(max)')
,1,2,'');
-- 2) is slower but will not return encoded value.
Hope it helps.
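If the server is SQL Server 2017 or later (an assumption; the question does not state a version), STRING_AGG concatenates rows without going through XML at all, so nothing gets entity-encoded in the first place. A minimal sketch, using a table variable @Names as a hypothetical stand-in for the #Names temp table above:

```sql
-- Hypothetical sample data mirroring the #Names example above.
DECLARE @Names TABLE (Name nvarchar(50));
INSERT INTO @Names (Name) VALUES (N'<>&'), (N'ab<>');

-- STRING_AGG (SQL Server 2017+) joins the values with the separator.
-- No XML is involved, so '<', '>' and '&' come back unchanged.
SELECT STRING_AGG(Name, N', ') WITHIN GROUP (ORDER BY Name) AS Names
FROM @Names;
```

On older versions the FOR XML PATH(''), TYPE plus .value() form shown in 2) remains the usual workaround.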

Related

How to encode XML in T SQL without the additional XML overhead

I have a database which (For whatever reason) has a column containing pipe delimited data.
I want to parse this data quickly, so I've thought of converting this column (nvarchar) into an XML by replacing the pipes with XML attributes and putting it into an XML data typed column somewhere else.
It works, except when the column contains a character that requires encoding, such as a '<' character.
I found I could encode XML using FOR XML clause, however, that appears to inject some XML tags around the data.
For example: (this gives error on bad character)
SELECT CAST('<f>' + replace(value,'|','</f><f>') + '</f>' AS XML)
FROM TABLE
This gives the XML-encoded value, but wraps it in a "<value></value>" tag:
SELECT value
FROM table
FOR XML PATH('')
Any ideas on how I can get the XML encoded value without this extra tag added, so I can convert the pipe format to XML after it's done (preferably in one swoop)?
EDIT: since people are asking, this is what 5 potential rows of data might look like
foo
foo|bar
foo|bar|1
foo||
baz|
And the results would be
Col1, Col2, Col3
foo,null,null
foo,bar,null
foo,bar,1
foo,null,null
baz,null,null
I'm achieving this by using the resulting XML type in a subquery such as the following (there can be up to 4 columns, or 3 pipes, in any given row):
SELECT
*,
x.query('f[1]').value('.','nVarChar(2048)') Col1
,x.query('f[2]').value('.','nVarChar(2048)') Col2
,x.query('f[3]').value('.','nvarchar(2048)') Col3
,x.query('f[4]').value('.','nvarchar(2048)') Col4
FROM
(
SELECT *,
CAST('<f>' + REPLACE(Value,'|','</f><f>') + '</f>' AS XML) as x
FROM table
) y
@srutzky makes a great point. No, I do not need to do XML here at all. If I can find a fast & clean way to parse pipes in a set based operation, I'll do that. Will review the SQL# documentation...
SELECT CAST('<values><f>' +
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(value,'&','&amp;')
,'"','&quot;')
,'<','&lt;')
,'>','&gt;')
,'|','</f><f>') + '</f></values>' AS XML)
FROM TABLE;
You could try the following, BUT you need to make sure the content is "XML safe"; in other words, the content must not contain values that the XML parser will reject (look into XML element content parsing).
Try the following; it's a test script to see if it does what you want.
UPDATE:
OK, it might help if I read the question all the way through. There are 2 steps: split on the pipes, then XML-encode all the split items. Try this:
Create the following function:
CREATE FUNCTION [dbo].[udf_SPLIT]
(
@s nvarchar(max),
@trimPieces bit,
@returnEmptyStrings bit,
@delimiter nvarchar(10)
)
RETURNS @t TABLE (val nvarchar(max))
AS
BEGIN
DECLARE @i int, @j int
SELECT @i = 0, @j = (LEN(@s) - LEN(REPLACE(@s,@delimiter,'')))
;WITH cte AS
(
SELECT i = @i + 1,
s = @s,
n = substring(@s, 0, charindex(@delimiter, @s)),
m = substring(@s, charindex(@delimiter, @s)+1, len(@s) - charindex(@delimiter, @s))
UNION ALL
SELECT i = cte.i + 1,
s = cte.m,
n = substring(cte.m, 0, charindex(@delimiter, cte.m)),
m = substring(cte.m, charindex(@delimiter, cte.m) + 1, len(cte.m)-charindex(@delimiter, cte.m))
FROM cte
WHERE i <= @j
)
INSERT INTO @t (val)
SELECT [pieces]
FROM (
SELECT CASE
WHEN @trimPieces = 1 THEN LTRIM(RTRIM(CASE WHEN i <= @j THEN n ELSE m END))
ELSE CASE WHEN i <= @j THEN n ELSE m END
END AS [pieces]
FROM cte
) t
WHERE (@returnEmptyStrings = 0 AND LEN(pieces) > 0)
OR (@returnEmptyStrings = 1)
OPTION (maxrecursion 0)
RETURN
END
Next, try the following to test:
DECLARE @str nvarchar(500) = 'test|<html>this</html>|boogie woogie| SDGDSFG| game<br /> on |working| this|'
SELECT REPLACE(
REPLACE(
REPLACE(
REPLACE([val],'&','&amp;')
,'"','&quot;')
,'<','&lt;')
,'>','&gt;')
AS [f]
FROM [dbo].[udf_SPLIT](@str,1,0,'|')
FOR XML PATH('')
If it's not totally correct, hopefully it will put you on the right path...
HTH
Dave
Your idea was absolutely OK: when you make XML out of your string, the XML engine converts all special characters properly. After splitting, the XML should be correct.
If your string is stored in a column, you can avoid the automatically generated element name either by turning the column into a computed expression (something like '' + YourColumn) or by giving the column the alias AS [*]:
Try it like this:
DECLARE @str VARCHAR(100)='300|2€&ÄÖÜ|This is text -> should be text|2015-12-31';
SELECT @str FOR XML PATH('');
/*
300|2€&amp;ÄÖÜ|This is text -&gt; should be text|2015-12-31
*/
DECLARE @Xml XML=(SELECT CAST('<x>' + REPLACE((SELECT @str FOR XML PATH('')),'|','</x><x>')+'</x>' AS XML));
SELECT @Xml.value('/x[1]','int') AS IntTypeSave
,@Xml.value('/x[3]','varchar(max)') AS VarcharTypeSave
,@Xml.value('/x[4]','datetime') AS DateTypeSave;
/*
300 This is text -> should be text 2015-12-31 00:00:00.000
*/
SELECT X.value('.','varchar(max)') AS EachX
FROM @Xml.nodes('/x') AS Each(X);
/*
300
2€&ÄÖÜ
This is text -> should be text
2015-12-31
*/
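The same idea can be applied to a whole column in one set-based statement. The following is only a sketch under assumptions: the table variable @t and its column Value are hypothetical stand-ins for the question's table, and the sample rows are invented. The AS [*] alias makes FOR XML PATH('') return just the encoded text with no wrapping element, and the text() step returns NULL for empty or missing pieces, which matches the expected result in the question.

```sql
-- Hypothetical sample data standing in for the question's table.
DECLARE @t TABLE (Value nvarchar(200));
INSERT INTO @t (Value)
VALUES (N'foo'), (N'foo|bar'), (N'a<b|c&d|1'), (N'foo||'), (N'baz|');

SELECT t.Value,
       -- text() yields NULL when the <f> element is empty or absent.
       x.doc.value('(/f[1]/text())[1]', 'nvarchar(2048)') AS Col1,
       x.doc.value('(/f[2]/text())[1]', 'nvarchar(2048)') AS Col2,
       x.doc.value('(/f[3]/text())[1]', 'nvarchar(2048)') AS Col3,
       x.doc.value('(/f[4]/text())[1]', 'nvarchar(2048)') AS Col4
FROM @t AS t
-- Step 1: let FOR XML encode the raw text; [*] suppresses the element name.
CROSS APPLY (SELECT (SELECT t.Value AS [*] FOR XML PATH(''))) AS e(encoded)
-- Step 2: turn the pipes into element boundaries and cast to xml.
CROSS APPLY (SELECT CAST(N'<f>' + REPLACE(e.encoded, N'|', N'</f><f>') + N'</f>' AS xml)) AS x(doc);
```

Because the encoding happens before the pipes are replaced, a value such as 'a<b' no longer breaks the CAST to xml.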

SQL for concatenating strings/rows into one string/row? (How to use FOR XML PATH with INSERT?)

I am concatenating several rows/strings in a table (on Microsoft SQL Server 2010) into a string by using a method as suggested here:
SELECT ',' + col FROM t1 FOR XML PATH('')
However, if I try to insert the resulting string as (single) row into another table like so:
INSERT INTO t2
SELECT ', ' + col FROM t1 FOR XML PATH('')
I receive this error message:
The FOR XML clause is not allowed in a INSERT statement.
t2 currently has a single column of type NVARCHAR(80). How can I overcome this problem, i.e. how can I collapse a table t1 with many rows into a table t2 with a single row that concatenates all the strings from t1 (with commas)?
Rather than XML PATH, why not do it like this?
DECLARE @Cols VARCHAR(8000)
SELECT @Cols = COALESCE(@Cols + ', ', '') +
ISNULL(col, 'N/A')
FROM t1
Insert into t2 values(@Cols);
You need to cast it back to an nvarchar() before inserting. I use the method below, which also deletes the first separator; because it uses the , type directive, it handles entities correctly.
insert into t2
select stuff((
select ', ' + col from t1
for xml path(''), type
).value('.', 'nvarchar(80)'), 1, 2, '')
So you concatenate all col values, each prefixed with comma + space, into an XML object. Then you take the .value() of the XQuery path ".", which means "take the node we are at, don't traverse anywhere". You cast it to nvarchar(80), and STUFF replaces the substring starting at position 1 with length 2 with an empty string ''. The 2 should therefore be replaced with the length of whatever separator you use.

Get values from XML tags with dynamically specified data fields

I have 2 tables:
Table1 has a list of XML tag names that I want to extract from an XML field. I simulate this by running this query
SELECT
'CLIENT'
UNION SELECT
'FEE'
UNION SELECT
'ADDRESS'
This results in a single column with 3 rows in it, the names of which will be used to extract corresponding data from XML tags.
The second table has a column called ClientData, it is in XML format and it has thousands of rows of data. My task is to extract values from XML tags specified in Table1, in this case I want values from 3 xml tags: Client, FEE and ADDRESS.
So, if the XML is this
<XML>
<CLIENT>some client</CLIENT>
<FEE>some fee</FEE>
<ADDRESS>some address</ADDRESS>
</XML>
After running a query I should get this:
Client, FEE, ADDRESS
some client, some fee, some address
Right now I have this query:
SELECT
coalesce(Cast(ClientData as xml).value('(/XML/CLIENT)[1]', 'varchar(max)'), ''),
coalesce(Cast(ClientData as xml).value('(/XML/FEE)[1]', 'varchar(max)'), ''),
coalesce(Cast(ClientData as xml).value('(/XML/ADDRESS)[1]', 'varchar(max)'), '')
FROM dbo.Table2 WITH(NOLOCK)
This gives me the necessary result, however it is not dynamic. Meaning, if I want to include a 4th xml value, lets say, PHONE, I would need to add coalesce(Cast(ClientData as xml).value('(/XML/PHONE)[1]', 'varchar(max)'), '') to the SELECT
My question is,
How do I make my query dynamic, so that instead of hardcoding the tag names that I want to extract from the XML in Table2, I use Table1 as the source of tag names to extract?
I hope my explanation was good enough :)
Thank you!
You can achieve this using dynamic SQL.
The TagsTable should hold all the possible tags;
we can then construct the SQL from the tag names and execute it:
create table TagsTable
( tagName varchar(256)
)
insert into TagsTable values ('CLIENT')
insert into TagsTable values ('FEE')
insert into TagsTable values ('ADDRESS')
declare @query nvarchar(max)
SELECT @query = STUFF((select ',' + 'coalesce(Cast(ClientData as xml).value(''(/XML/'
+ tagName + ')[1]'', ''varchar(max)''), '''') as ' + tagName +' '
FROM TagsTable
FOR XML PATH ('') ), 1,1,'')
SET @query = 'SELECT ' + @query + 'FROM dbo.Table2 WITH(NOLOCK)'
select @query
exec sp_executesql @query
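If one row per (record, tag) pair is acceptable instead of one column per tag, the lookup can be done without dynamic SQL by passing the tag name into the XQuery with sql:column(). This is only a sketch under assumptions: @Tags and @Data are hypothetical stand-ins for Table1 and Table2, the Id column is invented for illustration, and ClientData is assumed to already be of type xml (the question casts it).

```sql
-- Hypothetical stand-ins for Table1 and Table2 from the question.
DECLARE @Tags TABLE (tagName varchar(256));
INSERT INTO @Tags (tagName) VALUES ('CLIENT'), ('FEE'), ('ADDRESS');

DECLARE @Data TABLE (Id int IDENTITY(1,1), ClientData xml);
INSERT INTO @Data (ClientData)
VALUES (N'<XML><CLIENT>some client</CLIENT><FEE>some fee</FEE><ADDRESS>some address</ADDRESS></XML>');

-- sql:column() exposes the relational column inside the XQuery expression,
-- so the element to read is chosen per row of @Tags.
SELECT d.Id,
       t.tagName,
       COALESCE(d.ClientData.value(
           '(/XML/*[local-name() = sql:column("t.tagName")]/text())[1]',
           'varchar(max)'), '') AS tagValue
FROM @Data AS d
CROSS JOIN @Tags AS t;
```

The result is in tall form (Id, tagName, tagValue); turning it into one column per tag still needs PIVOT with a known column list or dynamic SQL as in the answer above.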

Convert multiples xml nodes data into varchar

I want to convert an xml string like this :
'<orga_label>ORG1</orga_label><orga_label>ORG2</orga_label><orga_label>ORG3</orga_label>'
into a varchar like this :
'ORG1, ORG2, ORG3'
in t-sql in one query.
Is that possible?
You can keep it very simple and avoid XML methods here...
DECLARE @foo xml = '<orga_label>ORG1</orga_label><orga_label>ORG2</orga_label><orga_label>ORG3</orga_label>';
SELECT
REPLACE(
REPLACE(
REPLACE(
CONVERT(nvarchar(4000), @foo), '</orga_label><orga_label>', ', '
),
'<orga_label>', ''
),
'</orga_label>', ''
);
Edit: this has the advantage of not invoking the XML methods and processor.
It would be better to use an XML parser in a scripting language like Ruby.
require 'rexml/document'
xml = REXML::Document.new(File.open("filename/filename.XML"))
xml.each_element('//(your element)') do |sdobi|
puts sdobi.attributes["orga_label"]
end
If you really want to use SQL, it's a little bit complex:
SELECT SUBSTRING( columnname, LOCATE( '<orga_label',columnname ) +12, LOCATE( '</', tablename) ) from tablename
If the substring is not right, try changing the offset number.
declare @xml xml = '<orga_label>ORG1</orga_label><orga_label>ORG2</orga_label><orga_label>ORG3</orga_label>';
select stuff((select
',' + s from (
select
a.b.value('(.)[1]', 'varchar(50)') s
from @xml.nodes('/orga_label') a(b)
) t
for xml path('')
),1,1,'');

SQL Query to List

I have a table variable in a stored procedure. What I want is to find all of the unique values in one column and join them in a comma-separated list. I am already in a stored procedure, so I can do it some way that way; however, I am curious if I can do this with a query. I am on SQL Server 2008. This query gets me the values I want:
SELECT DISTINCT faultType FROM @simFaults;
Is there a way (using CONCAT or something like that) where I can get the list as a single comma-separated value?
This worked for me on a test dataset.
DECLARE @MyCSV Varchar(200) = ''
SELECT @MyCSV = @MyCSV +
CAST(faulttype AS Varchar) + ','
FROM @Simfaults
GROUP BY faultType
SET @MyCSV = LEFT(@MyCSV, LEN(@MyCSV) - 1)
SELECT @MyCSV
The last part is needed to trim the trailing comma.
+1 to JNK. The other common way you will see, which doesn't require a variable, is:
SELECT DISTINCT faulttype + ','
FROM @simfaults
FOR XML PATH ('')
Note that if faulttype contains characters like "<", those will be XML-encoded. For simple values this will be OK.
This is how we do it:
create table #test (item int)
insert into #test
values(1),(2),(3)
select STUFF((SELECT ', ' + cast(Item as nvarchar)
FROM #test
FOR XML PATH('')), 1, 2, '')
Without the space after the comma, it would be:
select STUFF((SELECT ',' + cast(Item as nvarchar)
FROM #test
FOR XML PATH('')), 1,1, '')