How to extract multiple XML elements into individual tables? - sql

I have XML data like this:
DECLARE @input XML =
'<LicensingReportProcessResult>
<LicensingReport>
<Address key="3845HoopaLnLasVegasNV89169-3350U.S.A.">
<LineOne>3845 Hoopa Ln</LineOne>
<CityName>Las Vegas</CityName>
<StateOrProvinceCode>NV</StateOrProvinceCode>
<PostalCode>89169-3350</PostalCode>
<CountryCode>U.S.A.</CountryCode>
</Address>
<Person key="PersonPRI711284842">
<ExternalIdentifier>
<TypeCode>NAICProducerCode</TypeCode>
<Id>8001585</Id>
</ExternalIdentifier>
<BirthDate>1961-07-29</BirthDate>
</Person>
</LicensingReport>
</LicensingReportProcessResult>'
My T-SQL code to extract one specific set of elements:
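-- extract into temp table (#Address must already exist with five matching columns)
INSERT INTO #Address
SELECT
    -- the address parts are child elements, not attributes, so read their text nodes
    Tbl.Col.value('(LineOne/text())[1]', 'NVARCHAR(100)'),
    Tbl.Col.value('(CityName/text())[1]', 'NVARCHAR(100)'),
    Tbl.Col.value('(StateOrProvinceCode/text())[1]', 'NVARCHAR(100)'),
    Tbl.Col.value('(PostalCode/text())[1]', 'NVARCHAR(100)'),
    Tbl.Col.value('(CountryCode/text())[1]', 'NVARCHAR(100)')
FROM
    @input.nodes('/LicensingReportProcessResult/LicensingReport/Address') Tbl(Col);
-- verify results
SELECT * FROM #Address;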
I want to insert the data for each element type into its own table: Address data into an Address table, Person data into a Person table, and so on. As new element types are added, I want their data saved into separate tables as well.
Can someone help?

Are you asking how to dynamically define new tables for the top-level XML elements in a document? You can do that with any XML serialization library that reads a document and returns its elements and attributes as a tree; from that metadata you can build a table definition and then execute it in SQL.
Also consider simply storing your data as XML, perhaps with a defined schema, and then writing queries or views that extract the various elements using XPath or the xml data type methods, as you already do, instead of extracting into physical tables.
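A minimal sketch of the "query the XML in place" idea, assuming the document from the question (trimmed here) sits in a variable. The column aliases (ElementName, PersonKey, IdTypeCode, ExternalId) are illustrative names, not from the question. The first query lists which element types currently appear under LicensingReport, which is one way to notice that a new element type has arrived; the second shreds Person on demand.

```sql
DECLARE @input XML = N'
<LicensingReportProcessResult>
  <LicensingReport>
    <Address key="3845HoopaLnLasVegasNV89169-3350U.S.A.">
      <LineOne>3845 Hoopa Ln</LineOne>
      <CityName>Las Vegas</CityName>
    </Address>
    <Person key="PersonPRI711284842">
      <ExternalIdentifier>
        <TypeCode>NAICProducerCode</TypeCode>
        <Id>8001585</Id>
      </ExternalIdentifier>
      <BirthDate>1961-07-29</BirthDate>
    </Person>
  </LicensingReport>
</LicensingReportProcessResult>';

-- 1) Discover the distinct element types directly under LicensingReport
SELECT DISTINCT n.value('local-name(.)', 'NVARCHAR(128)') AS ElementName
FROM @input.nodes('/LicensingReportProcessResult/LicensingReport/*') AS t(n);

-- 2) Shred one element type (Person) without copying it into a physical table
SELECT
    p.value('@key', 'NVARCHAR(100)')                                    AS PersonKey,
    p.value('(ExternalIdentifier/TypeCode/text())[1]', 'NVARCHAR(100)') AS IdTypeCode,
    p.value('(ExternalIdentifier/Id/text())[1]', 'NVARCHAR(100)')       AS ExternalId,
    p.value('(BirthDate/text())[1]', 'DATE')                            AS BirthDate
FROM @input.nodes('/LicensingReportProcessResult/LicensingReport/Person') AS t(p);
```

The second query could be wrapped in a view over a table that stores the XML, so each element type gets a "virtual table" without a separate load step.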

Related

Saving all nodes from a big XML to SQL Database table in XMLType column using SSIS efficiently

I have an XML file with many records as nodes in it. I need to save each record, in XML form, to a SQL Server table in a column of the xml data type.
I can perform this task in SSIS by using the "XML Task Editor" to count all the nodes, then a "For Loop Container" that reads each node value with the "XML Task Editor" and saves it to the database.
Another option is a Script Task that reads the XML file and saves each node in a loop.
Please suggest a better approach that is efficient with big files.
Below is a sample of the input XML file. I need to save each full "RECORD" node (3 records in the example below) in XML form in a SQL Server table that has a column of the xml data type.
I would suggest a two-step approach:
1. Use the SSIS Import Column Transformation in a Data Flow Task to load the entire XML file into a single row and column of a staging table.
2. Use a stored procedure to produce the individual RECORD XML fragments as separate rows and INSERT them into a final destination table.
SQL
DECLARE @staging_tbl TABLE (id INT IDENTITY PRIMARY KEY, xmldata XML);
INSERT INTO @staging_tbl (xmldata) VALUES
(N'<root>
<RECORD UI="F298AF1F"></RECORD>
<RECORD UI="4C6AAA65"></RECORD>
</root>');
-- INSERT INTO destination_table (ID, xml_record)
SELECT id
    , c.query('.') AS xml_record
FROM @staging_tbl
    CROSS APPLY xmldata.nodes('/root/RECORD') AS t(c);
Output
id xml_record
1 <RECORD UI="F298AF1F" />
1 <RECORD UI="4C6AAA65" />
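Step 2 might look roughly like the sketch below once the INSERT is switched on. The destination table variable and its columns (staging_id, record_ui, xml_record) are hypothetical stand-ins for a real destination table, and pulling the UI attribute into its own column is an optional extra, not something the question asked for.

```sql
DECLARE @staging_tbl TABLE (id INT IDENTITY PRIMARY KEY, xmldata XML);
-- Hypothetical destination; in practice this would be a permanent table
DECLARE @destination_tbl TABLE (staging_id INT, record_ui VARCHAR(20), xml_record XML);

INSERT INTO @staging_tbl (xmldata) VALUES
(N'<root>
<RECORD UI="F298AF1F"></RECORD>
<RECORD UI="4C6AAA65"></RECORD>
</root>');

-- One destination row per RECORD node, set-based, no per-node loop
INSERT INTO @destination_tbl (staging_id, record_ui, xml_record)
SELECT s.id
    , c.value('@UI', 'VARCHAR(20)') AS record_ui   -- attribute of the current RECORD
    , c.query('.')                  AS xml_record  -- the full RECORD fragment
FROM @staging_tbl AS s
    CROSS APPLY s.xmldata.nodes('/root/RECORD') AS t(c);

SELECT staging_id, record_ui, xml_record FROM @destination_tbl;
```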
You can use the nodes() method to return a rowset of nodes in the xml document. This is the simplest example:
select node_table.xml_node_column.query('.') node
from xmldocument
cross apply xmldocument.nodes('/root/RECORD') node_table(xml_node_column)
https://learn.microsoft.com/en-us/sql/t-sql/xml/nodes-method-xml-data-type?view=sql-server-ver16

Querying XML tag in SQL server

I have a table Student with a column studentStateinfo that contains XML values like the one below.
<params xmlns="">
<OldStudentID>1aedghe1d8ef</OldStudentID>
</params>
Now, when I query the Student table, I only want to check whether the studentStateinfo column contains XML data with the tag <OldStudentID>.
Use the exist() method (xml data type).
The example uses a variable; change that to a column for your table.
declare @X xml = '
<params xmlns="">
<OldStudentID>1aedghe1d8ef</OldStudentID>
</params>';
select @X.exist('/params/OldStudentID');
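Applied to a column, the same check might look like the sketch below. The table variable stands in for the real Student table, and the StudentID column and the second sample row are assumptions added for illustration. exist() returns a bit, so it is compared with 1 in the WHERE clause.

```sql
-- Stand-in for the real Student table (StudentID is an assumed key column)
DECLARE @Student TABLE (StudentID INT, studentStateinfo XML);

INSERT INTO @Student (StudentID, studentStateinfo) VALUES
(1, N'<params xmlns=""><OldStudentID>1aedghe1d8ef</OldStudentID></params>'),
(2, N'<params xmlns=""><SomethingElse>abc</SomethingElse></params>');

-- Only rows whose XML contains an OldStudentID element are returned
SELECT StudentID
FROM @Student
WHERE studentStateinfo.exist('/params/OldStudentID') = 1;
```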

SQL. XML values from column. How to get?

For example, I have a table "BigApple" with three columns.
The first column contains numbers,
the second column contains some text,
and the third column contains XML documents.
My question is: how do I get the value of a particular tag from the third column?
Use one of the XML methods on XML column https://msdn.microsoft.com/en-us/library/ms190798.aspx
In fact, if you have the same kind of XML data in the third column you can read specific tag values easily.
Please refer to examples on SQL XML query using a single XML variable
and example to query XML column in SQL database table using CROSS APPLY
Mao, how do you expect to get an answer that really helps you without showing your data? Getting data out of XML can be anything from trivial to really tricky. Do you need only one particular tag? Or are there several values? Nested data?
One example for a trivial read might be this:
CREATE TABLE #tmpTbl(Number INT, SomeText VARCHAR(100),SomeXML XML);
INSERT INTO #tmpTbl VALUES
(1,'Test1','<root><a>xmlA1</a><b>xmlB1</b></root>')
,(2,'Test2','<root><a>xmlA2</a><b>xmlB2</b></root>');
SELECT Number
,SomeText
,SomeXML.value('(/root/a)[1]','varchar(10)') AS Tag_a
FROM #tmpTbl;
GO
DROP TABLE #tmpTbl;
The result
Number SomeText Tag_a
1 Test1 xmlA1
2 Test2 xmlA2
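For the less trivial case of repeated or nested tags, a sketch along the following lines uses nodes() with CROSS APPLY to return one row per matching element. The <item> element and its id attribute are made up for illustration, since the question does not show the actual XML.

```sql
-- Table variable standing in for "BigApple"; the XML shape is an assumption
DECLARE @BigApple TABLE (Number INT, SomeText VARCHAR(100), SomeXML XML);

INSERT INTO @BigApple VALUES
 (1, 'Test1', '<root><item id="1">a</item><item id="2">b</item></root>')
,(2, 'Test2', '<root><item id="3">c</item></root>');

-- One output row per <item>, repeated alongside the relational columns
SELECT b.Number
      ,b.SomeText
      ,i.value('@id', 'INT')               AS ItemId
      ,i.value('text()[1]', 'VARCHAR(10)') AS ItemText
FROM @BigApple AS b
CROSS APPLY b.SomeXML.nodes('/root/item') AS t(i);
```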

sql server Xquery nodes value performance

I have a table with 25,000 rows. Table Audit (Id int identity(1,1), AdditionalInfo xml)
The sample data in AdditionalInfo column for a row looks like below
<Audit version="1">
<Context name="Event">
<Action name="OrganizationEventReceived">
<Input>
<Source type="SourceOrganizationId">77d2678b-ea4a-43ad-816b-c63edf206b08</Source>
<Target type="TargetOrganizationId">b98fd3ae-dbcb-4826-9d92-7e445ad61273,b98fd3ae-dbcb-4826-9d92-7e445ad61273,b98fd3ae-dbcb-4826-9d92-7e445ad61273</Target>
</Input>
</Action>
</Context>
</Audit>
I would like to shred the XML and collect the data in an output dataset with the following query.
SELECT Id,
    p.value('(@name)[1]', 'nvarchar (100)') AS TargetAction,
    p.value('(Input/Source/text())[1]', 'nvarchar (500)') AS Source,
    p.value('(Input/Target/text())[1]', 'nvarchar (max)') AS Target
FROM dbo.Audit CROSS APPLY AdditionalInfo.nodes('/Audit/Context/Action') AS AdditionalInfo(p)
The query performs badly: it takes 15 seconds to return the result set for just 25,000 rows. Is there a better way of doing it? I even tried putting primary and secondary XML indexes on the AdditionalInfo column. Please help and let me know of better SQL Server XQuery techniques to use.
Thanks,
Great question.
My recent task required parsing about 35,000 XML documents, with a valid document being ~20 kB.
More and larger XML files tend to fill up memory, and the run time grows much faster than linearly:
100 documents: 0:33
1000 documents: 25:00 😵‍💫
Try to distribute your work:
The Target value stores unstructured data, which consumes most of the computing power because of its data type and the varying length of its values
the depth of nodes in CROSS APPLY matters: avoid three node levels in nodes(), consider two levels and recursion (see below on split)
batch mode: process several documents at a time, WHERE id IN (1,2,3)
loop over a list of documents, FOR;
parse using local variables, such as DECLARE @xml_doc XML; SET @xml_doc = (SELECT xmldata FROM xmlsource WHERE id=1);
avoid exporting XML node content; only write result values
parse all elements separately: save the order of elements using the ROW_NUMBER() function, then LEFT JOIN all parts to the list of XML documents using some identifier, such as xml_id
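The "loop and parse using local variables" tips could be sketched as below, using a cursor to load one document at a time into an XML variable. The table variables and the two sample rows are stand-ins for dbo.Audit, and whether this beats the set-based CROSS APPLY depends on the data and server, so treat it as something to measure rather than a guaranteed improvement.

```sql
-- Stand-in for dbo.Audit with two small sample documents
DECLARE @Audit TABLE (Id INT IDENTITY(1,1) PRIMARY KEY, AdditionalInfo XML);
INSERT INTO @Audit (AdditionalInfo) VALUES
(N'<Audit version="1"><Context name="Event"><Action name="OrganizationEventReceived">
   <Input><Source type="SourceOrganizationId">S1</Source><Target type="TargetOrganizationId">T1,T2</Target></Input>
 </Action></Context></Audit>'),
(N'<Audit version="1"><Context name="Event"><Action name="OrganizationEventSent">
   <Input><Source type="SourceOrganizationId">S2</Source><Target type="TargetOrganizationId">T3</Target></Input>
 </Action></Context></Audit>');

DECLARE @result TABLE (Id INT, TargetAction NVARCHAR(100), Source NVARCHAR(500), Target NVARCHAR(MAX));
DECLARE @id INT, @xml_doc XML;

DECLARE doc_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT Id, AdditionalInfo FROM @Audit;

OPEN doc_cursor;
FETCH NEXT FROM doc_cursor INTO @id, @xml_doc;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Shred the single document held in the local variable
    INSERT INTO @result (Id, TargetAction, Source, Target)
    SELECT @id,
           p.value('@name', 'NVARCHAR(100)'),
           p.value('(Input/Source/text())[1]', 'NVARCHAR(500)'),
           p.value('(Input/Target/text())[1]', 'NVARCHAR(MAX)')
    FROM @xml_doc.nodes('/Audit/Context/Action') AS t(p);

    FETCH NEXT FROM doc_cursor INTO @id, @xml_doc;
END

CLOSE doc_cursor;
DEALLOCATE doc_cursor;

SELECT Id, TargetAction, Source, Target FROM @result;
```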

Retrieving multiple xml child node values

I have a column of type varchar(max) populated with xml nodes and values; as an example, the column data starts with <tag1> <tag2>value1</tag2><tag3>value2</tag3>... </tag1>. What I need to get out of this string is "value1 value2 value3... valueN" within one cell for every row in the table using static SQL or a stored procedure. The node tree isn't always the same, sometimes the path is <tagX><tagY>valueY</tagY>...</tagX>.
All of my experience with shredding xml is only used to get one specific value, property, or tag, not all values while retaining the column and row count. Currently I query then loop through the result set on my product's end and shred everything, but that's no longer an option due to recent changes.
It's possible to change the column to be of type xml, but if possible I'd like to avoid having to do so.
Cast the column to XML (or change it in the table to XML) and shred the xml on //* to get all nodes in a table. Then you can use for xml path to concat the values back together.
select (
       select ' '+X.N.value('text()[1]', 'varchar(max)')
       from (select cast(T.XMLCol as xml)) as T1(XMLCol)
         cross apply T1.XMLCol.nodes('//*') as X(N)
       for xml path(''), type
       ).value('substring(text()[1], 2)', 'varchar(max)')
from T
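To try the query without a real table, a self-contained sketch with two made-up rows might look like this. The table variable, the ID column, and the AllValues alias are assumptions; the two XML strings follow the shapes described in the question. Elements that have no text of their own (such as the outer tag) yield NULL and are skipped by FOR XML PATH, so each row should come back with only its leaf values joined by spaces.

```sql
-- Sample data: varchar(max) column holding XML text with differing node trees
DECLARE @T TABLE (ID INT, XMLCol VARCHAR(MAX));
INSERT INTO @T (ID, XMLCol) VALUES
(1, '<tag1><tag2>value1</tag2><tag3>value2</tag3></tag1>'),
(2, '<tagX><tagY>valueY</tagY></tagX>');

SELECT T.ID,
       (
       SELECT ' ' + X.N.value('text()[1]', 'varchar(max)')
       FROM (SELECT CAST(T.XMLCol AS XML)) AS T1(XMLCol)
         CROSS APPLY T1.XMLCol.nodes('//*') AS X(N)   -- every element, any depth
       FOR XML PATH(''), TYPE
       ).value('substring(text()[1], 2)', 'varchar(max)') AS AllValues  -- drop leading space
FROM @T AS T;
```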