How to read data from multiple XML files in SQL Server? - sql

Background :
I want to read data from multiple XML files (stored in a database) and return them in one result set. The basic working solution for a single XML file looks similar to this:
DECLARE @xml xml
SET @xml =
(SELECT TOP 1 convert(varchar(max), convert(varbinary(max), [XML_FILE]))
FROM [SOME_TABLE])
SELECT
b.value('(./SomeNode/text())[1]','nvarchar(100)') as [Some_Text],
b.value('(./SomeOtherNode/@VAL)[1]','int') as [Some_Val]
FROM @xml.nodes('Example/File') as a(b)
Obviously this won't work with a SELECT that returns many rows (many XML files). A sub-optimal solution could use a cursor (iterate over the rows -> push the data into a temporary table -> SELECT * FROM temporary_table); however, I believe that's not necessary and a more straightforward solution exists.
Question :
How can I fetch data from multiple XML files, obtained via a SELECT query, into a single result set without using a cursor?
FILE_NAME || Value 1 || Value 2 || ...
----------------------------------------------
XML_FILE_1 || Node1Value || Node2Value || ...
XML_FILE_2 || Node1Value || Node2Value || ...

I've found a solution thanks to @Shnugo's answer.
If the column that holds the XML is not of SQL Server's native XML type, a double CROSS APPLY is needed: the first one casts the value to XML, the second one shreds it with .nodes(). Example below:
DECLARE @mockup TABLE(ID INT IDENTITY, [XML_DATA] VARBINARY(MAX));
-- string literals are not implicitly converted to VARBINARY, so cast them explicitly
INSERT INTO @mockup VALUES(CAST('<Example><File><SomeNode>blah</SomeNode><SomeOtherNode VAL="1"/></File></Example>' AS VARBINARY(MAX)))
,(CAST('<Example><File><SomeNode>blub</SomeNode><SomeOtherNode VAL="2"/></File></Example>' AS VARBINARY(MAX)))
SELECT
ID,
b.value('(SomeNode/text())[1]','nvarchar(100)') as [Some_Text],
b.value('(SomeOtherNode/@VAL)[1]','int') as [Some_Val]
FROM @mockup
CROSS APPLY (SELECT CAST(convert(varbinary(max), [XML_DATA]) as XML)) as RAW_XML(xml_field)
CROSS APPLY RAW_XML.xml_field.nodes('Example/File') as a(b)

For sure the CURSOR approach is not needed and would be wrong entirely...
The general approach should be something like this:
SELECT
b.value('(./SomeNode/text())[1]','nvarchar(100)') as [Some_Text],
b.value('(./SomeOtherNode/@VAL)[1]','int') as [Some_Val]
FROM [SOME_TABLE]
CROSS APPLY [XML_FILE].nodes('Example/File') as a(b);
But some questions remain open:
Talking about XML files is a bit confusing... I assume that all these XMLs live in a table's column.
If so: do all these XMLs have the same structure? If not, you will need some kind of filtering.
Is the XML in your table's column already of the native XML type? Your example uses CONVERT extensively... You need a native XML in order to use .nodes().
If there's no native XML: do you have to deal with invalid / uncastable data?
Are there rows with no data that you want to see anyway? In this case you can try OUTER APPLY instead of CROSS APPLY.
For demonstration a running stand-alone mockup:
DECLARE @mockup TABLE(ID INT IDENTITY, [XML_FILE] XML);
INSERT INTO @mockup VALUES('<Example><File><SomeNode>blah</SomeNode><SomeOtherNode VAL="1"/></File></Example>')
,('<Example><File><SomeNode>blub</SomeNode><SomeOtherNode VAL="2"/></File></Example>')
SELECT
ID,
b.value('(SomeNode/text())[1]','nvarchar(100)') as [Some_Text],
b.value('(SomeOtherNode/@VAL)[1]','int') as [Some_Val]
FROM @mockup
CROSS APPLY [XML_FILE].nodes('Example/File') as a(b)
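The last two open questions (uncastable data and rows without data) can be sketched together. This is a minimal example under assumptions that are not part of the question: SQL Server 2012 or later (for TRY_CAST) and a hypothetical NVARCHAR(MAX) column named [XML_TEXT]. TRY_CAST returns NULL for text that is not well-formed XML instead of raising an error, and OUTER APPLY keeps the rows for which .nodes() finds nothing.

```sql
-- Hypothetical mockup: [XML_TEXT] is an assumed column name, not taken from the question
DECLARE @mockup TABLE (ID INT IDENTITY, [XML_TEXT] NVARCHAR(MAX));
INSERT INTO @mockup VALUES
 (N'<Example><File><SomeNode>blah</SomeNode><SomeOtherNode VAL="1"/></File></Example>')
,(N'<Example><File>broken')   -- not well-formed, TRY_CAST yields NULL
,(N'<Example/>');             -- well-formed, but contains no File node

SELECT
 m.ID,
 b.value('(SomeNode/text())[1]','nvarchar(100)') AS [Some_Text],
 b.value('(SomeOtherNode/@VAL)[1]','int') AS [Some_Val]
FROM @mockup AS m
-- step 1: try to get a native XML value; a failed cast becomes NULL instead of an error
CROSS APPLY (SELECT TRY_CAST(m.[XML_TEXT] AS XML)) AS x(xml_field)
-- step 2: OUTER APPLY keeps rows even when .nodes() returns nothing
OUTER APPLY x.xml_field.nodes('Example/File') AS a(b);
```

Rows 2 and 3 should come back with NULL in both value columns; with CROSS APPLY in step 2 they would be dropped from the result.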

Related

Stripping Values between two brackets {}

Good Afternoon,
I'm trying to query a column and extract the data between curly brackets. There may be multiple sets in the column, such as: {Abrasision} {None} {Bruise}
I use the query below, but it doesn't do exactly what I want, I think because it only looks for one of the two brackets. I want to get each value as a row in my result set and insert it into a table variable. I'm just having a little bit of trouble.
SELECT
LEFT(InjuryCategory, CHARINDEX('{', InjuryCategory)-1),
SUBSTRING(InjuryCategory, CHARINDEX('{', InjuryCategory)+1, LEN(InjuryCategory)-CHARINDEX('{', InjuryCategory)-CHARINDEX('{',REVERSE(InjuryCategory ))),
RIGHT(InjuryCategory, CHARINDEX('{', REVERSE(InjuryCategory))-1)
FROM TblVictim
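You may use STRING_SPLIT(), STUFF() and STRING_AGG() to get the expected results. Note that STRING_SPLIT() only exposes the position of each substring through the ordinal column enabled by the enable_ordinal parameter, which is available in Azure SQL Database, Azure SQL Managed Instance, Azure Synapse Analytics (serverless SQL pool only), and SQL Server 2022 (16.x) and later. Without it the order of the rows is not guaranteed, so STRING_AGG() may aggregate the values in a different order.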
Test data:
SELECT *
INTO tblVictim
FROM (
VALUES ('{Abrasision} {None} {Bruise}')
) t (InjuryCategory)
Statement:
SELECT STRING_AGG(STUFF(s.[value], 1, CHARINDEX('{', s.[value]), ''), ' ') AS Category
FROM tblVictim t
CROSS APPLY STRING_SPLIT(t.InjuryCategory, '}') s
WHERE s.[value] <> ''
Result:
Category
----------------------
Abrasision None Bruise
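Where enable_ordinal is available, the order of the aggregated values can be made explicit. The following is a sketch only, assuming SQL Server 2022 or later (or one of the Azure platforms named above); it uses an inline VALUES row instead of the tblVictim table so that it runs on its own.

```sql
-- Assumes a platform where STRING_SPLIT accepts the third argument (enable_ordinal = 1)
SELECT STRING_AGG(STUFF(s.[value], 1, CHARINDEX('{', s.[value]), ''), ' ')
         WITHIN GROUP (ORDER BY s.ordinal) AS Category   -- keep the original order
FROM (VALUES ('{Abrasision} {None} {Bruise}')) AS t (InjuryCategory)
CROSS APPLY STRING_SPLIT(t.InjuryCategory, '}', 1) AS s  -- 1 adds the ordinal column
WHERE s.[value] <> '';
```

To get one row per value instead of a single aggregated string, drop STRING_AGG and select the STUFF(...) expression together with s.ordinal.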
In newer versions of SQL Server (2017 and later), you can combine STRING_SPLIT and TRIM:
SELECT TRIM('{}' FROM s.[value]) AS Category
FROM TblVictim v
CROSS APPLY STRING_SPLIT(v.InjuryCategory, ' ') s
WHERE s.[value] <> '';
Quick and dirty, since this is delimited data, pretend it's XML. Setup:
DECLARE @tblVictim TABLE(ID INT IDENTITY, InjuryCategory NVARCHAR(MAX));
INSERT @tblVictim(InjuryCategory)
VALUES
('{Abrasision} {None} {Bruise}'),
('{Abrasision} {<5} {Bruise; very severe}');
Query:
WITH data AS (
SELECT ID, xml = CAST(REPLACE(REPLACE(InjuryCategory,
'{', '<i><![CDATA['),
'}', ']]></i>') AS XML
)
FROM @tblVictim
)
SELECT ID, node.value('text()[1]', 'nvarchar(max)')
FROM data
CROSS APPLY xml.nodes('i') AS nodes(node)
Note that this completely breaks down (with no easy fixes) if there are unbalanced delimiters.

SQL Server: display whole column only if substring found

Working with SQL Server 2016. I am constrained by the fact that we cannot create functions or stored procedures. I am trying to find %word% in many columns (75) across a table. Right now, I have a very large clump of
and (fieldname1 like '%word%'
or fieldname2 like '%word%'
or fieldname3 like '%word%') etc.
While cumbersome, this does provide me the correct results. However:
I am looking to simplify this and
in the select, I want to display the whole column if and only if it finds %word% (or even just the column name would work)
Thank you in advance for any thoughts.
--...slow...
declare @searchfor varchar(100) = '23';
select @searchfor as [thevalue],
thexml.query('for $a in (/*[contains(upper-case(.), upper-case(sql:variable("@searchfor")))])
return concat(local-name($a[1]), ",")').value('.', 'nvarchar(max)') as [appears_in_columns],
*
from
(
select *, (select o.* for xml path(''), type) as thexml
from sys.all_objects as o --table goes here
) as src
where thexml.exist('/*[contains(upper-case(.), upper-case(sql:variable("@searchfor")))]') = 1;
One option uses cross apply to unpivot the table and then search:
select v.*
from mytable t
cross apply (values
('fieldname1', fieldname1),
('fieldname2', fieldname2),
('fieldname3', fieldname3)
) v(fieldname, fieldvalue)
where v.fieldvalue like '%word%'
Note that if more than one column contains the search word, you will get several rows in the resultset. I am unsure how you want to handle this use case (there are options).
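One of those options is to collapse the matches into a single row per source row, listing the names of the columns that contain the word. This is a sketch under assumptions: the table variable, its id column and the sample data are hypothetical, and FOR XML PATH is used for the concatenation because STRING_AGG is not available in SQL Server 2016.

```sql
-- Hypothetical sample table; only the fieldnameN columns come from the question
DECLARE @mytable TABLE (id INT, fieldname1 VARCHAR(100), fieldname2 VARCHAR(100), fieldname3 VARCHAR(100));
INSERT INTO @mytable VALUES
 (1, 'a word here', 'nothing', 'another word')
,(2, 'nothing', 'nothing', 'nothing');

SELECT t.id, m.matching_columns
FROM @mytable AS t
CROSS APPLY (
    -- unpivot the columns of the current row, keep the matching ones,
    -- and concatenate their names into one comma-separated string
    SELECT STUFF((
        SELECT ',' + v.fieldname
        FROM (VALUES
            ('fieldname1', t.fieldname1),
            ('fieldname2', t.fieldname2),
            ('fieldname3', t.fieldname3)
        ) AS v(fieldname, fieldvalue)
        WHERE v.fieldvalue LIKE '%word%'
        FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 1, '')
) AS m(matching_columns)
WHERE m.matching_columns IS NOT NULL;  -- rows without any match are filtered out
```

With the sample data only row 1 should be returned, listing fieldname1 and fieldname3.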
SELECT OBJECT_NAME(id) ObjectName , [Text]
FROM syscomments
WHERE TEXT LIKE '%word%'

SQL Query for Attribute Value(s)

I have searched everywhere and seem to be having trouble for my specific issue. I am trying to parse xml values out of our database. The table is named 'Table.XMLfileData', with a column of XMLData. The current setup of that column is as such:
In the XML itself, all values are stored in attributes:
I want to be able to pull any piece of data out of each of these XML files. The query that I have found in my research should be something like this:
SELECT r.value('@first_name','varchar(60)')
FROM TableName
CROSS APPLY columnname.nodes('Vehicle_Loan/Applicants/Applicant/first_name') AS
x(r)
However I retrieve a blank or null value every time. I am new to this, what am I doing wrong?
.value(...) requires a single node to work with and XPath is case sensitive.
SELECT r.value('(./@first_name)[1]','varchar(60)')
FROM TableName
CROSS APPLY columnname.nodes('Vehicle_Loan/Applicants/Applicant') AS
x(r)
... working example ...
DECLARE @xml XML = N'
<Vehicle_Loan>
<Applicants>
<Applicant first_name="Matt" />
<Applicant first_name="Jim" />
</Applicants>
</Vehicle_Loan>
';
SELECT r.value('(./@first_name)[1]','varchar(60)') AS [FirstName]
FROM @xml.nodes('Vehicle_Loan/Applicants/Applicant') AS x(r)
... output ...
FirstName
---------------
Matt
Jim

SQL return list of ntext and convert it to XML

I have a query that returns a list of ntext values, each of which contains XML.
My question is how to convert each ntext value to XML and apply logic to it.
Query:
select a.content
from dbo.content as a
inner join dbo.xml_collection_tbl as b on a.xml_fg_id = b.xml_collection_id
where a.inherit_from='val1' and b.collection_title='val2' and a.content_table= 'val3'
result:
What I want to do here is check whether the rows returned by the query contain the value I am looking for, let's say the page title = "hello World".
I tried the query below, but it returns many empty rows along with the one correct row:
select cast(a.content_html as xml).query('(//root[pagetitle/text()="AAA"])') content_html1
from dbo.content as a
inner join dbo.xml_collection_tbl as b on a.xml_fg_id = b.xml_collection_id
where a.inherit_from='val1' and b.collection_title='val2' and a.content_table= 'val3'
Expected result: return only the one row that is not empty (row 54).
First of all: NTEXT, TEXT and IMAGE have been deprecated since SQL Server 2005 and will not be supported in future versions! Get rid of these types as soon as possible!
SQL Server does not store the XML as the text you see, but as a hierarchically structured tree. This makes the handling of an XML astonishingly fast (no parsing on string level!). Your approach has to parse each and every XML over and over, which is a very expensive operation! Change your XML's storage to the native XML type and you will be very happy with the new performance!
If you have to stick with this, you can try it like this:
DECLARE @t TABLE (ID INT IDENTITY, YourXML NTEXT);
INSERT INTO @t VALUES('<root><pagetitle>111</pagetitle></root>')
,('<root><pagetitle>aaa</pagetitle></root>')
,('<root><pagetitle>222</pagetitle></root>')
SELECT A.CastedXML
,B.pt.query('.')
FROM @t AS t
CROSS APPLY(SELECT CAST(YourXML AS XML) AS CastedXML) AS A
CROSS APPLY A.CastedXML.nodes('/root/pagetitle[text()="aaa"]') AS B(pt);
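If the goal is only to keep the rows whose XML contains the wanted title, filtering with .exist() in the WHERE clause is an alternative to shredding with .nodes(). This is a sketch with the same mockup data; the intermediate cast to NVARCHAR(MAX) is a cautious assumption, not something the question requires.

```sql
DECLARE @t TABLE (ID INT IDENTITY, YourXML NTEXT);
INSERT INTO @t VALUES('<root><pagetitle>111</pagetitle></root>')
,('<root><pagetitle>aaa</pagetitle></root>')
,('<root><pagetitle>222</pagetitle></root>');

SELECT t.ID, A.CastedXML
FROM @t AS t
-- cast once per row; going through NVARCHAR(MAX) first is a defensive choice
CROSS APPLY (SELECT CAST(CAST(t.YourXML AS NVARCHAR(MAX)) AS XML) AS CastedXML) AS A
-- .exist() returns 1 when the XPath finds at least one node
WHERE A.CastedXML.exist('/root/pagetitle[text()="aaa"]') = 1;
```

Only the row with the title aaa should be returned, and no empty rows appear because non-matching rows are removed instead of being projected as empty XML.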
Demo of XQuery expression https://learn.microsoft.com/en-us/sql/xquery/xquery-language-reference-sql-server to filter data
with sd as (
select cast(content_html as xml) as col
from (
values
('<root><pagetitle>FFF</pagetitle></root>')
,('<root><pagetitle>AAA</pagetitle></root>')
) as a(content_html)
)
select t.n.value('.[1]', 'varchar(100)') as content_html1
from sd
cross apply col.nodes('root/pagetitle[text()="AAA"]') t(n)

Compare Xml data in SQL

I have two tables with the same NVARCHAR field, which actually contains XML data.
In some cases the XML in this field is logically identical to that of a row in the other table but differs in attribute order, and therefore a string comparison does not return the correct result.
To find the identical XML fields, I need a comparison like:
cast('<root><book b="" c="" a=""/></root>' as XML)
= cast('<root><book a="" b="" c=""/></root>' as XML)
but I get this error message:
The XML data type cannot be compared or sorted, except when using the
IS NULL operator.
So what is the best way to find identical XML without casting it back to NVARCHAR?
Why cast it at all? Just plug them into an XML column in a temp table and run Xquery to compare them to the other table. EDIT: Included example of the comparison. There are many, many ways to run the query against the XML to get the rows that are the same - exactly how that query is written is going to depend on preference, requirements, etc. I went with a simple group by/count, but a self join could be used, WHERE EXISTS against the columns that are being searched for duplicates, you name it.
CREATE TABLE #Test (SomeXML NVARCHAR(MAX))
CREATE TABLE #XML (SomeXML XML)
INSERT #Test (SomeXML)
VALUES('<root><book b="b" c="c" a="a"/></root>')
,('<root><book a="a" b="b" c="c"/></root>')
INSERT #XML (SomeXML)
SELECT SomeXML FROM #Test;
WITH XMLCompare (a,b,c)
AS
(
SELECT
x.c.value('@a[1]','char(1)') AS a
,x.c.value('@b[1]','char(1)') AS b
,x.c.value('@c[1]','char(1)') AS c
FROM #XML
CROSS APPLY SomeXML.nodes('/root/book') X(C)
)
SELECT
a
,b
,c
FROM XMLCompare as a
GROUP BY
a
,b
,c
HAVING COUNT(*) >1
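When the attribute names are not known in advance, the same idea can be written generically: shred every attribute into a name/value pair and compare the two sets, which ignores attribute order. The following is only a sketch for two XML variables holding a single book element each; the variable names and sample values are assumptions.

```sql
DECLARE @x1 XML = N'<root><book b="2" c="3" a="1"/></root>';
DECLARE @x2 XML = N'<root><book a="1" b="2" c="3"/></root>';

WITH attrs1 AS (
    -- one row per attribute of the first document
    SELECT n.value('local-name(.)', 'nvarchar(128)') AS attr_name,
           n.value('.', 'nvarchar(4000)') AS attr_value
    FROM @x1.nodes('/root/book/@*') AS t(n)
),
attrs2 AS (
    -- one row per attribute of the second document
    SELECT n.value('local-name(.)', 'nvarchar(128)') AS attr_name,
           n.value('.', 'nvarchar(4000)') AS attr_value
    FROM @x2.nodes('/root/book/@*') AS t(n)
)
SELECT CASE
         -- any pair present on one side only means the documents differ
         WHEN EXISTS (SELECT attr_name, attr_value FROM attrs1
                      EXCEPT
                      SELECT attr_name, attr_value FROM attrs2)
           OR EXISTS (SELECT attr_name, attr_value FROM attrs2
                      EXCEPT
                      SELECT attr_name, attr_value FROM attrs1)
         THEN 0 ELSE 1
       END AS same_attributes;
```

This compares attributes only; documents with several book elements or with nested content would also need the element position or path included in the comparison.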