XML to SQL - SQL Server - sql

I'm traversing an XML file to read its nodes and load them into SQL Server tables. The Root node has a Department node, which may in turn contain one or more <DeptID> elements. I want to select all the <DeptID> values as a SQL result set.
This is the XML I'm working with:
DECLARE @x XML='
<Root>
<Department>
<DeptID>D101</DeptID>
<DeptID>D102</DeptID>
</Department>
</Root>'
I'm using the SQL query below to get the data from the XML, but I can only read the first DeptID because I'm passing [1] in DeptID[1]. If I pass [2], I get the second value. In a real-life scenario, however, I won't know how many DeptID elements the XML contains, so I want a generic script that reads however many DeptID elements there are.
SELECT n.value('DeptID[1]','varchar(10)') AS DeptID FROM @x.nodes('/Root/Department') R(n)

You can use SQL Server's OPENXML method to return multiple elements as table rows, as follows.
Step 1: Suppose this is your sample XML data.
DECLARE @XML XML='
<ROOT>
<Customers>
<Customer CustomerID="C001" CustomerName="Arshad Ali">
<Orders>
<Order OrderID="10248" OrderDate="2012-07-04T00:00:00">
<OrderDetail ProductID="10" Quantity="5" />
<OrderDetail ProductID="11" Quantity="12" />
<OrderDetail ProductID="42" Quantity="10" />
</Order>
</Orders>
<Address> Address line 1, 2, 3</Address>
</Customer>
<Customer CustomerID="C002" CustomerName="Paul Henriot">
<Orders>
<Order OrderID="10245" OrderDate="2011-07-04T00:00:00">
<OrderDetail ProductID="11" Quantity="12" />
<OrderDetail ProductID="42" Quantity="10" />
</Order>
</Orders>
<Address> Address line 5, 6, 7</Address>
</Customer>
<Customer CustomerID="C003" CustomerName="Carlos Gonzlez">
<Orders>
<Order OrderID="10283" OrderDate="2012-08-16T00:00:00">
<OrderDetail ProductID="72" Quantity="3" />
</Order>
</Orders>
<Address> Address line 1, 4, 5</Address>
</Customer>
</Customers>
</ROOT>'
Step 2: Use the OPENXML method to read elements at any level, as follows.
DECLARE @hDoc AS INT, @SQL NVARCHAR (MAX)
EXEC sp_xml_preparedocument @hDoc OUTPUT, @XML
SELECT CustomerID, CustomerName, Address, OrderID, OrderDate, ProductID, Quantity
FROM OPENXML(@hDoc, 'ROOT/Customers/Customer/Orders/Order/OrderDetail')
WITH
(
CustomerID [varchar](50) '../../../@CustomerID',
CustomerName [varchar](100) '../../../@CustomerName',
Address [varchar](100) '../../../Address',
OrderID [varchar](1000) '../@OrderID',
OrderDate datetime '../@OrderDate',
ProductID [varchar](50) '@ProductID',
Quantity int '@Quantity'
)
EXEC sp_xml_removedocument @hDoc
GO
The steps above return one row per OrderDetail element, with the values of its parent Order and Customer repeated on each row.
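For comparison, the same row shape can probably be produced without a document handle by using the XML type's native .nodes() and .value() methods. This is only a sketch under assumptions: the sample XML is trimmed to a single customer to keep it short, and the aliases A(cust), B(ord) and C(det) are illustrative names, not part of the answer above.

```sql
-- Trimmed-down copy of the sample data (one customer, one order, two details)
DECLARE @XML XML = N'<ROOT>
<Customers>
<Customer CustomerID="C001" CustomerName="Arshad Ali">
<Orders>
<Order OrderID="10248" OrderDate="2012-07-04T00:00:00">
<OrderDetail ProductID="10" Quantity="5" />
<OrderDetail ProductID="11" Quantity="12" />
</Order>
</Orders>
<Address> Address line 1, 2, 3</Address>
</Customer>
</Customers>
</ROOT>';

-- Each APPLY steps one level down; values are read relative to the current node
SELECT cust.value('@CustomerID', 'varchar(50)') AS CustomerID,
       cust.value('@CustomerName', 'varchar(100)') AS CustomerName,
       cust.value('(Address/text())[1]', 'varchar(100)') AS Address,
       ord.value('@OrderID', 'varchar(1000)') AS OrderID,
       ord.value('@OrderDate', 'datetime') AS OrderDate,
       det.value('@ProductID', 'varchar(50)') AS ProductID,
       det.value('@Quantity', 'int') AS Quantity
FROM @XML.nodes('/ROOT/Customers/Customer') AS A(cust)
CROSS APPLY cust.nodes('Orders/Order') AS B(ord)
CROSS APPLY ord.nodes('OrderDetail') AS C(det);
```

Walking down with CROSS APPLY avoids the backward '../' navigation and needs no sp_xml_removedocument call afterwards.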

Try it like this:
DECLARE @x XML='
<Root>
<Department>
<DeptID>D101</DeptID>
<DeptID>D102</DeptID>
</Department>
</Root>';
SELECT d.value('text()[1]','varchar(10)') AS DeptID
FROM @x.nodes('/Root/Department/DeptID') A(d);
Your own code:
SELECT n.value('DeptID[1]','varchar(10)') AS DeptID
FROM @x.nodes('/Root/Department') R(n)
... follows the right idea, but .nodes() must return the repeating element, which is <DeptID>. Your approach returns one row per <Department> and then reads only the first <DeptID> within it.
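If a value from the parent is needed next to each repeating child, the two ideas can be combined: one .nodes() call per level, joined with CROSS APPLY. A minimal sketch, assuming a hypothetical Name attribute on <Department> and a second department, neither of which is in the original XML:

```sql
-- Hypothetical sample: the Name attribute and the second Department are invented for illustration
DECLARE @x XML = N'<Root>
<Department Name="Sales">
<DeptID>D101</DeptID>
<DeptID>D102</DeptID>
</Department>
<Department Name="IT">
<DeptID>D201</DeptID>
</Department>
</Root>';

-- Outer .nodes() returns one row per Department,
-- inner .nodes() returns one row per DeptID inside that Department
SELECT dept.value('@Name', 'varchar(20)') AS DeptName,
       id.value('text()[1]', 'varchar(10)') AS DeptID
FROM @x.nodes('/Root/Department') AS A(dept)
CROSS APPLY dept.nodes('DeptID') AS B(id);
```

With this sample the query yields three rows, one for each <DeptID>, regardless of how many there are per department.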

Related

OpenXML returning NULL

I am trying to import XML into my database with the following query, using OPENXML in Microsoft SQL Server:
DECLARE @xml XML;
DECLARE @y INT;
SET @xml
= '<ArrayOfArticle xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Article>
<ScriptId xmlns="https://test.com/">5135399</ScriptId>
<Title xmlns="https://test.com/">Stocks divided into two corners</Title>
<Mediatype xmlns="https://test.com/">News papeer</Mediatype>
<Abstract xmlns="https://test.com/">Foreign capital doubled this year.</Abstract>
<ScriptDate xmlns="https://test.com/">2017-12-30T00:00:00</ScriptDate>
<ScriptTypeId xmlns="https://test.com/">1</ScriptTypeId>
<ScriptType xmlns="https://test.com/">News general</ScriptType>
<Media xmlns="https://test.com/">Times</Media>
<ArticleUrl xmlns="https://test.com/">http://test.com</ArticleUrl>
<AnalysisResult xmlns="https://test.com/">
<Analysis>
<Regno>111</Regno>
<Name>New York Times</Name>
<Result>1</Result>
<ResultName>Positive</ResultName>
</Analysis>
<Analysis>
<Regno>222</Regno>
<Name>Washington Post</Name>
<Result>1</Result>
<ResultName>Negative</ResultName>
</Analysis>
</AnalysisResult>
<FacebookStats xmlns="https://test.com/">
<ShareCount xsi:nil="true" />
<LikeCount xsi:nil="true" />
<CommentCount xsi:nil="true" />
<TotalCount xsi:nil="true" />
</FacebookStats>
<MediaScore xmlns="https://test.com/">
<MediaScore>
<Regno>111</Regno>
<CompanyName>New York Times</CompanyName>
<MediaScoreID>2</MediaScoreID>
<Name>Neither</Name>
</MediaScore>
<MediaScore>
<Regno>222</Regno>
<CompanyName>Washington Post</CompanyName>
<MediaScoreID>2</MediaScoreID>
<Name>Neither</Name>
</MediaScore>
</MediaScore>
<Page xmlns="https://test.com/">26</Page>
<ProgramId xmlns="https://test.com/">0</ProgramId>
<ProgramTime xmlns="https://test.com/" xsi:nil="true" />
<ProgramLength xmlns="https://test.com/">0</ProgramLength>
<ProgramOrder xmlns="https://test.com/">0</ProgramOrder>
</Article>
</ArrayOfArticle>';
EXEC sp_xml_preparedocument @y OUTPUT, @xml;
SELECT *
FROM
OPENXML(@y, '/ArrayOfArticle/Article', 1)
WITH
(
ScriptId VARCHAR(20),
Title VARCHAR(30),
Mediatype VARCHAR(30)
);
However, the query returns only NULL values. What am I missing here? Would it be better to import the XML using SSIS instead? I'm not sure how much more detail I can give at this hour.
Do not use FROM OPENXML. This approach (together with the corresponding stored procedures to prepare and remove a document) is outdated and should not be used any more.
Try the XML type's native methods instead, in this case .value().
Your XML is rather odd with regard to namespaces. If its creation is under your control, you should try to clean up this namespace mess. The unusual thing is that your XML declares the same default namespace over and over.
You can use a deep search with // together with the namespace wildcard *:
--GetItEasyCheesy (not recommended)
SELECT @xml.value(N'(//*:ScriptId)[1]',N'int') AS ScriptId
,@xml.value(N'(//*:Title)[1]',N'nvarchar(max)') AS Title
,@xml.value(N'(//*:Mediatype)[1]',N'nvarchar(max)') AS Mediatype;
You can declare the namespace as the default, but in this case you must wildcard the outer elements, as they are not part of this namespace:
--Use a default namespace
WITH XMLNAMESPACES(DEFAULT 'https://test.com/')
SELECT @xml.value(N'(/*:ArrayOfArticle/*:Article/ScriptId/text())[1]',N'int') AS ScriptId
,@xml.value(N'(/*:ArrayOfArticle/*:Article/Title/text())[1]',N'nvarchar(max)') AS Title
,@xml.value(N'(/*:ArrayOfArticle/*:Article/Mediatype/text())[1]',N'nvarchar(max)') AS Mediatype;
The recommended approach is to bind the inner namespace to a prefix and use that prefix in the paths:
--Recommended
WITH XMLNAMESPACES('https://test.com/' AS ns)
SELECT @xml.value(N'(/ArrayOfArticle/Article/ns:ScriptId/text())[1]',N'int') AS ScriptId
,@xml.value(N'(/ArrayOfArticle/Article/ns:Title/text())[1]',N'nvarchar(max)') AS Title
,@xml.value(N'(/ArrayOfArticle/Article/ns:Mediatype/text())[1]',N'nvarchar(max)') AS Mediatype;
If your <ArrayOfArticle> contains more than one <Article>, you can use .nodes() to get all of them as a derived table. In this case the query is:
WITH XMLNAMESPACES('https://test.com/' AS ns)
SELECT art.value(N'(ns:ScriptId/text())[1]',N'int') AS ScriptId
,art.value(N'(ns:Title/text())[1]',N'nvarchar(max)') AS Title
,art.value(N'(ns:Mediatype/text())[1]',N'nvarchar(max)') AS Mediatype
FROM @xml.nodes(N'/ArrayOfArticle/Article') AS A(art);
Your XML contains namespaces, so I'd use XQuery to extract the data from it.
UPDATE: extracting additional elements
DECLARE @xml XML;
SET @xml
= '<ArrayOfArticle xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Article>
<ScriptId xmlns="https://test.com/">5135399</ScriptId>
<Title xmlns="https://test.com/">Stocks divided into two corners</Title>
<Mediatype xmlns="https://test.com/">News papeer</Mediatype>
<Abstract xmlns="https://test.com/">Foreign capital doubled this year.</Abstract>
<ScriptDate xmlns="https://test.com/">2017-12-30T00:00:00</ScriptDate>
<ScriptTypeId xmlns="https://test.com/">1</ScriptTypeId>
<ScriptType xmlns="https://test.com/">News general</ScriptType>
<Media xmlns="https://test.com/">Times</Media>
<ArticleUrl xmlns="https://test.com/">http://test.com</ArticleUrl>
<AnalysisResult xmlns="https://test.com/">
<Analysis>
<Regno>111</Regno>
<Name>New York Times</Name>
<Result>1</Result>
<ResultName>Positive</ResultName>
</Analysis>
<Analysis>
<Regno>222</Regno>
<Name>Washington Post</Name>
<Result>1</Result>
<ResultName>Negative</ResultName>
</Analysis>
</AnalysisResult>
<FacebookStats xmlns="https://test.com/">
<ShareCount xsi:nil="true" />
<LikeCount xsi:nil="true" />
<CommentCount xsi:nil="true" />
<TotalCount xsi:nil="true" />
</FacebookStats>
<MediaScore xmlns="https://test.com/">
<MediaScore>
<Regno>111</Regno>
<CompanyName>New York Times</CompanyName>
<MediaScoreID>2</MediaScoreID>
<Name>Neither</Name>
</MediaScore>
<MediaScore>
<Regno>222</Regno>
<CompanyName>Washington Post</CompanyName>
<MediaScoreID>2</MediaScoreID>
<Name>Neither</Name>
</MediaScore>
</MediaScore>
<Page xmlns="https://test.com/">26</Page>
<ProgramId xmlns="https://test.com/">0</ProgramId>
<ProgramTime xmlns="https://test.com/" xsi:nil="true" />
<ProgramLength xmlns="https://test.com/">0</ProgramLength>
<ProgramOrder xmlns="https://test.com/">0</ProgramOrder>
</Article>
</ArrayOfArticle>'
DECLARE @T TABLE (XmlCol XML)
INSERT INTO @T
SELECT @xml
-- One row per inner MediaScore; the article-level values are read two levels up
;WITH XMLNAMESPACES ('https://test.com/' as p1)
SELECT z.t.value ('../../p1:ScriptId[1]','varchar(100)') ScriptId,
z.t.value ('../../p1:Title[1]','varchar(100)') Title,
z.t.value ('../../p1:Mediatype[1]','varchar(100)') Mediatype,
z.t.value ('p1:CompanyName[1]', 'varchar(100)') CompanyName
FROM @T t
CROSS APPLY XmlCol.nodes ('/ArrayOfArticle/Article/p1:MediaScore/p1:MediaScore') z(t)
-- OPENXML variant: the third argument of sp_xml_preparedocument declares the namespace prefix
DECLARE @y INT
EXEC sp_xml_preparedocument @y OUTPUT, @xml,
'<ns xmlns:x="https://test.com/"/>'
SELECT *
FROM
OPENXML(@y, '/ArrayOfArticle/Article', 2)
WITH
(
[ScriptId] VARCHAR(20) 'x:ScriptId', --<< and so on
[Title] VARCHAR(30) 'x:Title',
Mediatype VARCHAR(30) 'x:Mediatype'
)
EXEC sp_xml_removedocument @y --<< lost in your code

How do you read an XML file in SQL?

I am using SQL Server 2014 to query XML text using XQuery. I can query XML that I paste into a variable, but I need to point to a local file, read it, and query it, and I cannot figure out how to do that. Below is what I currently have.
declare @xmldata xml
set @xmldata = '
<Orders>
<Order OrderID="100" OrderDate="1/30/2012">
<OrderDetail ProductID="1" Quantity="3">
<Price>350</Price>
</OrderDetail>
<OrderDetail ProductID="2" Quantity="8">
<Price>500</Price>
</OrderDetail>
<OrderDetail ProductID="3" Quantity="10">
<Price>700</Price>
</OrderDetail>
</Order>
<Order OrderID="200" OrderDate="2/15/2012">
<OrderDetail ProductID="4" Quantity="5">
<Price>120</Price>
</OrderDetail>
</Order>
</Orders>'
SELECT x.c.value('(OrderDetail)[2]', 'varchar(100)') as OrderDetail
FROM @xmldata.nodes('/Orders/Order') x(c)
-- XML file location: "C:\Users\User\X\Example.xml"
Thanks in advance!
Try this: https://msdn.microsoft.com/en-CA/library/ms191184.aspx
INSERT INTO T(XmlCol)
SELECT * FROM OPENROWSET(
BULK 'c:\SampleFolder\SampleData3.txt',
SINGLE_BLOB) AS x;
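To query the file contents directly rather than insert them into a table, the result of the same OPENROWSET call can be assigned to an XML variable. A minimal sketch, assuming the file path from the question, a file with the same structure as the sample in the question, and a SQL Server service account that can read that path (the file is read on the server, not on the client machine):

```sql
DECLARE @xmldata XML;

-- SINGLE_BLOB returns the whole file as one varbinary(max) value in a column named BulkColumn
SELECT @xmldata = CAST(x.BulkColumn AS XML)
FROM OPENROWSET(BULK 'C:\Users\User\X\Example.xml', SINGLE_BLOB) AS x;

-- One row per OrderDetail, with the OrderID of its parent Order
SELECT o.c.value('@OrderID', 'int') AS OrderID,
       d.c.value('@ProductID', 'int') AS ProductID,
       d.c.value('(Price/text())[1]', 'int') AS Price
FROM @xmldata.nodes('/Orders/Order') AS o(c)
CROSS APPLY o.c.nodes('OrderDetail') AS d(c);
```

The account running the statement also needs permission for bulk operations; if the call fails, checking that permission and the server-side path is a reasonable first step.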

How to convert nested XML into corresponding tables?

I have a complex nested XML (generated from a C# entity graph), for example:
<Customers>
<Customer>
<Id>1</Id>
<Number>12345</Number>
<Addresses>
<Address>
<Id>100</Id>
<street>my street</street>
<city>London</city>
</Address>
<Address>
<Id>101</Id>
<street>my street 2</street>
<city>Berlin</city>
</Address>
</Addresses>
<BankDetails>
<BankDetail>
<Id>222</Id>
<Iban>DE8439834934939434333</Iban>
</BankDetail>
<BankDetail>
<Id>228</Id>
<Iban>UK1237921391239123213</Iban>
</BankDetail>
</BankDetails>
<Orders>
<Order>
<OrderLine>
</OrderLine>
</Order>
</Orders>
</Customer>
</Customers>
Before saving the above XML data into the actual tables, I need to process it first. For this reason, I created corresponding table types. Each of these table types has an extra column (guid as ROWGUID), so that when I'm processing new data (with no primary key assigned yet) I can generate a unique key. I use this column to maintain relational integrity between the different table types.
What is the SQL syntax to convert the above nested XML into the corresponding tables, keeping in mind that child records must reference the generated parent guid?
Try it like this:
DECLARE @xml XML=
N'<Customers>
<Customer>
<Id>1</Id>
<AccountNumber>12345</AccountNumber>
<Addresses>
<Address>
<Id>100</Id>
<street>my street></street>
<city>London</city>
</Address>
<Address>
<Id>101</Id>
<street>my street></street>
<city>Berlin</city>
</Address>
</Addresses>
<BankDetails>
<BankDetail>
<Id>222</Id>
<Iban>DE8439834934939434333</Iban>
</BankDetail>
<BankDetail>
<Id>228</Id>
<Iban>UK1237921391239123213</Iban>
</BankDetail>
</BankDetails>
<Orders>
<Order>
<OrderLine />
</Order>
</Orders>
</Customer>
</Customers>';
--This query will create a table #tmpInsert with all the data
SELECT cust.value('Id[1]','int') AS CustomerID
,cust.value('AccountNumber[1]','int') AS CustomerAccountNumber
,addr.value('Id[1]','int') AS AddressId
,addr.value('street[1]','nvarchar(max)') AS AddressStreet
,addr.value('city[1]','nvarchar(max)') AS AddressCity
,bank.value('Id[1]','int') AS BankId
,bank.value('Iban[1]','nvarchar(max)') AS BankIban
,ord.value('OrderLine[1]','nvarchar(max)') AS OrderLine
INTO #tmpInsert
FROM @xml.nodes('/Customers/Customer') AS A(cust)
OUTER APPLY cust.nodes('Addresses/Address') AS B(addr)
OUTER APPLY cust.nodes('BankDetails/BankDetail') AS C(bank)
OUTER APPLY cust.nodes('Orders/Order') AS D(ord);
--Here you can check the content
SELECT * FROM #tmpInsert;
--Clean-Up
GO
DROP TABLE #tmpInsert
Once you've got all your data in the table, you can use a simple DISTINCT or GROUP BY, or if needed ROW_NUMBER() OVER(PARTITION BY ...), to select each set separately for the proper insert.
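The reason DISTINCT is needed is that sibling sets applied side by side multiply each other. The following sketch illustrates that under assumptions: the XML is a trimmed-down, hypothetical version of the sample, and #staging is an illustrative table name.

```sql
-- Hypothetical, trimmed-down sample: one customer with two addresses and two bank details
DECLARE @xml XML = N'<Customers>
<Customer>
<Id>1</Id>
<Addresses>
<Address><Id>100</Id><city>London</city></Address>
<Address><Id>101</Id><city>Berlin</city></Address>
</Addresses>
<BankDetails>
<BankDetail><Id>222</Id></BankDetail>
<BankDetail><Id>228</Id></BankDetail>
</BankDetails>
</Customer>
</Customers>';

-- Staging table: 2 addresses x 2 bank details = 4 rows for this customer
SELECT cust.value('(Id/text())[1]', 'int') AS CustomerID,
       addr.value('(Id/text())[1]', 'int') AS AddressId,
       addr.value('(city/text())[1]', 'nvarchar(100)') AS AddressCity,
       bank.value('(Id/text())[1]', 'int') AS BankId
INTO #staging
FROM @xml.nodes('/Customers/Customer') AS A(cust)
OUTER APPLY cust.nodes('Addresses/Address') AS B(addr)
OUTER APPLY cust.nodes('BankDetails/BankDetail') AS C(bank);

-- Each child set on its own: DISTINCT collapses the repeated combinations
SELECT DISTINCT CustomerID, AddressId, AddressCity
FROM #staging
WHERE AddressId IS NOT NULL;

SELECT DISTINCT CustomerID, BankId
FROM #staging
WHERE BankId IS NOT NULL;

DROP TABLE #staging;
```

Each of the two DISTINCT queries returns two rows here, which could then feed a separate INSERT per target table.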

Save XML with attribute to Table in SQL Server

Hi, I have XML data with attributes as input for SQL, and I need to insert it into my table.
The XML data is:
<?xml version="1.0" encoding="ISO-8859-1"?>
<MESSAGEACK>
<GUID GUID="kfafb30" SUBMITDATE="2015-10-15 11:30:29" ID="1">
<ERROR SEQ="1" CODE="28681" />
</GUID>
<GUID GUID="kfafb3" SUBMITDATE="2015-10-15 11:30:29" ID="1">
<ERROR SEQ="2" CODE="286381" />
</GUID>
</MESSAGEACK>
I want this to be inserted in the format below:
GUID    SUBMIT DATE          ID  ERROR SEQ  CODE
kfafb3  2015-10-15 11:30:29  1   1          28681
kfafb3  2015-10-15 11:30:29  1   1          2868
Please help.
Look into XPath and the xml data type methods on MSDN. This is one possible way:
declare @xml As XML = '...your XML string here...'
INSERT INTO YourTable
SELECT
guid.value('@GUID', 'varchar(100)') as 'GUID'
,guid.value('@SUBMITDATE', 'datetime') as 'SUBMIT DATE'
,guid.value('@ID', 'int') as 'ID'
,guid.value('ERROR[1]/@SEQ', 'int') as 'SEQ'
,guid.value('ERROR[1]/@CODE', 'int') as 'CODE'
FROM @xml.nodes('/MESSAGEACK/GUID') as x(guid)
Just paste this into an empty query window and execute it. Adapt it to your needs:
DECLARE @xml XML=
'<?xml version="1.0" encoding="ISO-8859-1"?>
<MESSAGEACK>
<GUID GUID="kfafb30" SUBMITDATE="2015-10-15 11:30:29" ID="1">
<ERROR SEQ="1" CODE="28681" />
</GUID>
<GUID GUID="kfafb3" SUBMITDATE="2015-10-15 11:30:29" ID="1">
<ERROR SEQ="2" CODE="286381" />
</GUID>
</MESSAGEACK>';
SELECT Msg.Node.value('@GUID','varchar(max)') AS [GUID] --The sample values are not real GUIDs; if the original values are, you could use uniqueidentifier instead of varchar(max)
,Msg.Node.value('@SUBMITDATE','datetime') AS SUBMITDATE
,Msg.Node.value('@ID','int') AS ID
,Msg.Node.value('(ERROR/@SEQ)[1]','int') AS [ERROR SEQ]
,Msg.Node.value('(ERROR/@CODE)[1]','int') AS CODE
FROM @xml.nodes('/MESSAGEACK/GUID') AS Msg(Node)

Parsing XML using TSQL

I'm trying to parse out the following XML with TSQL:
<Response xmlns="http://data.fcc.gov/api" status="OK" executionTime="9">
<Block FIPS="181770103002004" />
<County FIPS="18177" name="Wayne" />
<State FIPS="18" code="IN" name="Indiana" />
</Response>
Using the following script:
SELECT x.i.value('@name', 'varchar(200)') AS county
FROM @xml.nodes('Response/County') AS x(i)
But I get no results. Any help with what I'm doing wrong would be greatly appreciated.
Thanks!
Your XML namespace is messing things up. Either remove the xmlns="http://data.fcc.gov/api" from the Response element, or prefix your query with WITH XMLNAMESPACES (DEFAULT 'http://data.fcc.gov/api'):
;WITH XMLNAMESPACES ( DEFAULT 'http://data.fcc.gov/api')
SELECT x.i.value('@name', 'varchar(200)') AS county
FROM @xml.nodes('Response/County') AS x(i)
Or you can use wildcard namespaces in the query:
SELECT x.i.value('@name', 'varchar(200)') AS county
FROM @xml.nodes('*:Response/*:County') AS x(i)
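A further variant, sketched here as an assumption rather than as part of the answer above, binds the namespace to a prefix instead of making it the default. The prefix name fcc and the extra countyFips column are illustrative choices only.

```sql
DECLARE @xml XML = N'<Response xmlns="http://data.fcc.gov/api" status="OK" executionTime="9">
<Block FIPS="181770103002004" />
<County FIPS="18177" name="Wayne" />
<State FIPS="18" code="IN" name="Indiana" />
</Response>';

-- Elements live in the declared namespace and need the prefix;
-- the attributes are unqualified, so they are addressed without it
WITH XMLNAMESPACES ('http://data.fcc.gov/api' AS fcc)
SELECT x.i.value('@name', 'varchar(200)') AS county,
       x.i.value('@FIPS', 'varchar(10)') AS countyFips
FROM @xml.nodes('/fcc:Response/fcc:County') AS x(i);
```

An explicit prefix keeps the paths unambiguous if the document later mixes elements from several namespaces, which the wildcard form cannot distinguish.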
You can do it using OPENXML like this:
DECLARE @idoc INT
DECLARE @xml AS XML =
'<Response xmlns="http://data.fcc.gov/api" status="OK" executionTime="9">
<Block FIPS="181770103002004" />
<County FIPS="18177" name="Wayne" />
<State FIPS="18" code="IN" name="Indiana" />
</Response>'
EXEC sp_xml_preparedocument @idoc OUTPUT, @xml, N'<root xmlns:n="http://data.fcc.gov/api" />'
SELECT
Name AS County
FROM OPENXML (@idoc, '/n:Response/n:County', 1)
WITH
(
Name VARCHAR(255) '@name'
)
EXEC sp_xml_removedocument @idoc
GO