I have a complex nested XML (generated from a C# entity graph), for example:
<Customers>
<Customer>
<Id>1</Id>
<Number>12345</Number>
<Addresses>
<Address>
<Id>100</Id>
<Street>my street </street>
<city>London</city>
</Address>
<Address>
<Id>101</Id>
<street>my street 2</street>
<city>Berlin</city>
</Address>
</Addresses>
<BankDetails>
<BankDetail>
<Id>222</Id>
<Iban>DE8439834934939434333</Iban>
</BankDetail>
<BankDetail>
<Id>228</Id>
<Iban>UK1237921391239123213</Iban>
</BankDetail>
</BankDetails>
<Orders>
<Order>
<OrderLine>
</OrderLine>
</Order>
</Orders>
</Customer>
</Customers>
Before saving the above XML data into the actual tables, I need to process it first. For this reason, I created corresponding table types. Each of these table types have an extra column (guid as ROWGUID) so that if I'm processing new data (not yet assigned primary key) I generate a unique key. I use this column to keep the relational integrity between different table types.
What is the SQL syntax to convert the above nested XML to their corresponding tables, keeping in mind that child records must reference the generated parent guid?
Try it like this:
DECLARE #xml XML=
N'<Customers>
<Customer>
<Id>1</Id>
<AccountNumber>12345</AccountNumber>
<Addresses>
<Address>
<Id>100</Id>
<street>my street></street>
<city>London</city>
</Address>
<Address>
<Id>101</Id>
<street>my street></street>
<city>Berlin</city>
</Address>
</Addresses>
<BankDetails>
<BankDetail>
<Id>222</Id>
<Iban>DE8439834934939434333</Iban>
</BankDetail>
<BankDetail>
<Id>228</Id>
<Iban>UK1237921391239123213</Iban>
</BankDetail>
</BankDetails>
<Orders>
<Order>
<OrderLine />
</Order>
</Orders>
</Customer>
</Customers>';
--This query will create a table #tmpInsert with all the data
SELECT cust.value('Id[1]','int') AS CustomerID
,cust.value('AccountNumber[1]','int') AS CustomerAccountNumber
,addr.value('Id[1]','int') AS AddressId
,addr.value('street[1]','nvarchar(max)') AS AddressStreet
,addr.value('city[1]','nvarchar(max)') AS AddressCity
,bank.value('Id[1]','int') AS BankId
,bank.value('Iban[1]','nvarchar(max)') AS BankIban
,ord.value('OrderLine[1]','nvarchar(max)') AS OrderLine
INTO #tmpInsert
FROM #xml.nodes('/Customers/Customer') AS A(cust)
OUTER APPLY cust.nodes('Addresses/Address') AS B(addr)
OUTER APPLY cust.nodes('BankDetails/BankDetail') AS C(bank)
OUTER APPLY cust.nodes('Orders/Order') AS D(ord);
--Here you can check the content
SELECT * FROM #tmpInsert;
--Clean-Up
GO
DROP TABLE #tmpInsert
Once you've got all your data in the table, you can use simple DISTINCT, GROUP BY, if needed ROW_NUMBER() OVER(PARTITION BY ...) to select each set separately for the proper insert.
Related
I'm traversing an XML file to read nodes and fill into to SQL Server tables. I have a Root node having Department node which further may have one or more as element. I want to select all the possible values from in a SQL result set.
Please find below XML I'm referring:
DECLARE #x XML='
<Root>
<Department>
<DeptID>D101</DeptID>
<DeptID>D102</DeptID>
</Department>
</Root>'
I'm using below SQL Query to get the data from XML but I can read only first DeptID as I'm passing [1] inside DeptID[1]. If I pass [2] I can get thee second value. But in real life scenario, I won't be able to know how many DeptID would be there in the XML. So I want a generic script to read as many as DeptIDs comes in XML.
SELECT n.value('DeptID[1]','varchar(10)') AS DeptID FROM #x.nodes('/Root/Department') R(n)
You can use OpenXMl method of sql server to get more elements in table as follows.
Step 1: Suppose this is your sample XML data.
DECLARE #XML XML='
<ROOT>
<Customers>
<Customer CustomerID="C001" CustomerName="Arshad Ali">
<Orders>
<Order OrderID="10248" OrderDate="2012-07-04T00:00:00">
<OrderDetail ProductID="10" Quantity="5" />
<OrderDetail ProductID="11" Quantity="12" />
<OrderDetail ProductID="42" Quantity="10" />
</Order>
</Orders>
<Address> Address line 1, 2, 3</Address>
</Customer>
<Customer CustomerID="C002" CustomerName="Paul Henriot">
<Orders>
<Order OrderID="10245" OrderDate="2011-07-04T00:00:00">
<OrderDetail ProductID="11" Quantity="12" />
<OrderDetail ProductID="42" Quantity="10" />
</Order>
</Orders>
<Address> Address line 5, 6, 7</Address>
</Customer>
<Customer CustomerID="C003" CustomerName="Carlos Gonzlez">
<Orders>
<Order OrderID="10283" OrderDate="2012-08-16T00:00:00">
<OrderDetail ProductID="72" Quantity="3" />
</Order>
</Orders>
<Address> Address line 1, 4, 5</Address>
</Customer>
</Customers>
</ROOT>'
Step 2: Use of OPENXML method to get elements at any level as follows.
DECLARE #hDoc AS INT, #SQL NVARCHAR (MAX)
EXEC sp_xml_preparedocument #hDoc OUTPUT, #XML
SELECT CustomerID, CustomerName, Address, OrderID, OrderDate, ProductID, Quantity
FROM OPENXML(#hDoc, 'ROOT/Customers/Customer/Orders/Order/OrderDetail')
WITH
(
CustomerID [varchar](50) '../../../#CustomerID',
CustomerName [varchar](100) '../../../#CustomerName',
Address [varchar](100) '../../../Address',
OrderID [varchar](1000) '../#OrderID',
OrderDate datetime '../#OrderDate',
ProductID [varchar](50) '#ProductID',
Quantity int '#Quantity'
)
EXEC sp_xml_removedocument #hDoc
GO
Above steps will give you following Output.
Try it like this
DECLARE #x XML='
<Root>
<Department>
<DeptID>D101</DeptID>
<DeptID>D102</DeptID>
</Department>
</Root>';
SELECT d.value('text()[1]','varchar(10)') AS DeptID
FROM #x.nodes('/Root/Department/DeptID') A(d);
Your own code
SELECT n.value('DeptID[1]','varchar(10)') AS DeptID
FROM #x.nodes('/Root/Department') R(n)
... follows the right idea. But .nodes() must return the repeating element, which is <DeptID>. Your approach is looking for the first <DeptID> within <Department> actually
This is my first time working with XML files. I have been able to read the file into a table. Now I am trying to access the data elements to insert / update into ERP.
I am getting stuck using https://learn.microsoft.com/en-us/sql/t-sql/functions/openxml-transact-sql as a guide.
I am only attempting a single retrieval now, trying to keep things simple:
XML:
<TranscriptRequest xmlns="urn:org:pesc:message:TranscriptRequest:v1.0.0">
<TransmissionData>
<DocumentID xmlns="">88895-20170227180832302-ccd</DocumentID>
<CreatedDateTime xmlns="">2017-02-27T18:08:32.303-08:00</CreatedDateTime>
<DocumentTypeCode xmlns="">Request</DocumentTypeCode>
<TransmissionType xmlns="">Original</TransmissionType>
<Source xmlns="">
<Organization>
<DUNS>626927060</DUNS>
<OrganizationName>AVOW</OrganizationName>
</Organization>
</Source>
<Destination xmlns="">
<Organization>
<OPEID>3419</OPEID>
<OrganizationName>Charleston Southern University</OrganizationName>
</Organization>
</Destination>
<DocumentProcessCode xmlns="">PRODUCTION</DocumentProcessCode>
</TransmissionData>
<Request>
<CreatedDateTime xmlns="">2017-02-27T00:00:00.000-08:00</CreatedDateTime>
<Requestor xmlns="">
<Person>
<Birth>
<BirthDate>1985-01-01</BirthDate>
</Birth>
<Name>
<FirstName>Chad</FirstName>
<LastName>test2</LastName>
</Name>
<AlternateName>
<FirstName>Chad</FirstName>
<LastName>Walker</LastName>
<CompositeName>Walker, Chad</CompositeName>
</AlternateName>
<Contacts>
<Address>
<AddressLine>10260 west st</AddressLine>
<City>Denver</City>
<StateProvinceCode>CO</StateProvinceCode>
<PostalCode>80236</PostalCode>
</Address>
<Phone>
<CountryPrefixCode>1</CountryPrefixCode>
<AreaCityCode>303</AreaCityCode>
<PhoneNumber>8152848</PhoneNumber>
</Phone>
<Email>
<EmailAddress>cwalker#parchment.com</EmailAddress>
</Email>
</Contacts>
</Person>
</Requestor>
My SQL:
USE TMSEPRD
DECLARE #XML AS XML, #hDoc AS INT, #SQL NVARCHAR (MAX)
SELECT #XML = [XMLData] FROM [dbo].[CSU_Parchment_XMLwithOpenXML]
EXEC sp_xml_preparedocument #hDoc OUTPUT, #XML
SELECT BirthDate
FROM OPENXML(#hDoc, 'ROOT/Request/Requestor/Person/Birth/Birthdate')
WITH
(
BirthDate varchar(20) '../#BirthDate'
)
EXEC sp_xml_removedocument #hDoc
GO
Birthdate remains blank. So far I have tried eliminating the ../ in front of #Birthdate, I have tried removing the #, I have tried removing Birthdate from the path.
Using Xml Data Type Methods
DECLARE #XML XML='<TranscriptRequest xmlns="urn:org:pesc:message:TranscriptRequest:v1.0.0">
<TransmissionData>
<DocumentID xmlns="">88895-20170227180832302-ccd</DocumentID>
<CreatedDateTime xmlns="">2017-02-27T18:08:32.303-08:00</CreatedDateTime>
<DocumentTypeCode xmlns="">Request</DocumentTypeCode>
<TransmissionType xmlns="">Original</TransmissionType>
<Source xmlns="">
<Organization>
<DUNS>626927060</DUNS>
<OrganizationName>AVOW</OrganizationName>
</Organization>
</Source>
<Destination xmlns="">
<Organization>
<OPEID>3419</OPEID>
<OrganizationName>Charleston Southern University</OrganizationName>
</Organization>
</Destination>
<DocumentProcessCode xmlns="">PRODUCTION</DocumentProcessCode>
</TransmissionData>
<Request>
<CreatedDateTime xmlns="">2017-02-27T00:00:00.000-08:00</CreatedDateTime>
<Requestor xmlns="">
<Person>
<Birth>
<BirthDate>1985-01-01</BirthDate>
</Birth>
<Name>
<FirstName>Chad</FirstName>
<LastName>test2</LastName>
</Name>
<AlternateName>
<FirstName>Chad</FirstName>
<LastName>Walker</LastName>
<CompositeName>Walker, Chad</CompositeName>
</AlternateName>
<Contacts>
<Address>
<AddressLine>10260 west st</AddressLine>
<City>Denver</City>
<StateProvinceCode>CO</StateProvinceCode>
<PostalCode>80236</PostalCode>
</Address>
<Phone>
<CountryPrefixCode>1</CountryPrefixCode>
<AreaCityCode>303</AreaCityCode>
<PhoneNumber>8152848</PhoneNumber>
</Phone>
<Email>
<EmailAddress>cwalker#parchment.com</EmailAddress>
</Email>
</Contacts>
</Person>
</Requestor>
</Request>
</TranscriptRequest>';
SQL query:
with xmlnamespaces('urn:org:pesc:message:TranscriptRequest:v1.0.0' as ns)
select t.n.value('BirthDate[1]','date')
from #XML.nodes('ns:TranscriptRequest/ns:Request/Requestor/Person/Birth') t(n);
Namespaces
How is this XML generated? Is this under your control? What makes me wonder are various xmlns="" This is defining empty default namespaces over and over... Allthough there is one default NS xmlns="urn:org:pesc:message:TranscriptRequest:v1.0.0" in the first line, which seems to be correct.
For this namespace issue there are many solutions:
Create this without the empty xmlns - if this is under your control
Use REPLACE and cut away xmlns="" (be aware of the leading blank!) on string level, before you write this into your xml typed column / variable
use a wildcard (*:) for the namespaces of the first two levels
Define a namespace alias for the default and let the empty namespaces be the default for the lower levels
Structure
Your XML seems to be plain 1:1, only 1 source, 1 destination, one reques. And only 1 person within <Request> and again only 1 Name and so on below <Person>. There is a <Contacts> element, which sounds like 1:n, but it seems to include only 1 address, 1 phone... Reading plain 1:1 is easy...
I will use a CTE with two calls in .query() for an easy approach to solve the namespace issue with a wildcard. This allows to read the rest without namespaces:
some examples how to read your elements, the rest is up to you...
WITH levels AS
(
SELECT #xml.query(N'/*:TranscriptRequest/*:TransmissionData/*') AS td
,#xml.query(N'/*:TranscriptRequest/*:Request/*') AS rq
)
SELECT --TransmissionData
td.value(N'(DocumentID/text())[1]',N'nvarchar(max)') AS DocumentID
,td.value(N'(CreatedDateTime/text())[1]',N'datetime') AS td_CreatedDateTime
--more elemenst...
,td.value(N'(Source/Organization/DUNS/text())[1]',N'bigint') AS Source_Organization_DUNS
--more elements...
,td.value(N'(Destination/Organization/OPEID/text())[1]',N'bigint') AS Destination_Organization_OPEID
--Request
,rq.value(N'(CreatedDateTime/text())[1]',N'datetime') AS rq_CreatedDateTime
--Request-Person
,rq.value(N'(Requestor/Person/Name/FirstName/text())[1]',N'nvarchar(max)') AS Requestor_FirstName
--Contacts
,rq.value(N'(Requestor/Person/Contacts/Address/AddressLine/text())[1]',N'nvarchar(max)') AS Requestor_AddressLine
FROM levels
I'm new to XSD, so please help. I've created some XML using "for xml path" in SQL Server Management Studio 2008. It looks like:
I've read some literature, where after creating such an xml, I'm supposed to run this:
But it doesn't work. As I know after saving this file in xml and clicking it twice, it should open xml in browser. But it doesn't. What is wrong? primary key is ReferenceCode. I've created xml using this query:
select p.ReferenceCode
,p.LastName
,p.FirstName
,p.BirthDate
,p.BirthPlace
,(
select d.Type
,d.Series
,d.Number
,d.IssueDate
,d.IssueAuthority
from #Document d
where d.ReferenceCode = p.ReferenceCode
for xml path ('Document'),root('Documents'),Type
)
,(
select a.Type
,a.Street
from #Address a
where a.ReferenceCode = p.ReferenceCode
for xml path ('Address'),root('Addresses'),Type
)
,(
select h.Number
from #Phone h
where h.ReferenceCode = p.ReferenceCode
for xml path ('Phone'),root('Phone'),Type
)
from #Person p
for xml path ('Person'),root('Root')
Thanks in advance
<Root>
<Person>
<ReferenceCode>10000007462</ReferenceCode>
<LastName>Артамонова</LastName>
<FirstName>Галина</FirstName>
<BirthDate>1961-07-19</BirthDate>
<BirthPlace>РОССИЙСКАЯ ФЕДЕРАЦИЯ, д. Криуша Староюрьевского р-на Тамбовской обл.</BirthPlace>
<Documents>
<Document>
<Type>21</Type>
<Series>4508</Series>
<Number>685129</Number>
<IssueDate>2006-08-16</IssueDate>
<IssueAuthority>ОВД р-на Чертаново-Центральное г. Москвы,</IssueAuthority>
</Document>
</Documents>
<Addresses>
<Address>
<Type>1</Type>
<Street>Днепропетровская ул</Street>
</Address>
<Address>
<Type>1</Type>
<Street>Декабристов ул</Street>
</Address>
<Address>
<Type>2</Type>
<Street>Днепропетровская ул</Street>
</Address>
<Address>
<Type>2</Type>
<Street>Декабристов ул</Street>
</Address>
</Addresses>
<Phones>
<Phone>
<Number>907-09-33 </Number>
</Phone>
<Phone>
<Number>+7(903)1780367 </Number>
</Phone>
</Phones>
</Person>
</Root>
Thanks a lot, problem was - I can't create xsd with above xml, please help me to create it. Answers were - use another version, it was created and then deleted, encoding is wrong.
I found the answer myself, I needed just headings which are usually put above the xml. That much.
<Order>
<AmazonOrderID>1111-222-33</AmazonOrderID>
<MerchantOrderID>111-222-33</MerchantOrderID>
<PurchaseDate>2014-08-03T18:11:11+00:00</PurchaseDate>
<LastUpdatedDate>2014-08-03T18:11:14+00:00</LastUpdatedDate>
<OrderStatus>Pending</OrderStatus>
<SalesChannel>Amazon.in</SalesChannel>
<FulfillmentData>
<FulfillmentChannel>Amazon</FulfillmentChannel>
<ShipServiceLevel>Standard</ShipServiceLevel>
<Address>
<City>bangalore</City>
<State>karnataka</State>
<PostalCode>560038</PostalCode>
<Country>IN</Country>
</Address>
</FulfillmentData>
<OrderItem>
<ASIN>B00AW9A53Q</ASIN>
<SKU>SiM13</SKU>
<ItemStatus>Unshipped</ItemStatus>
<ProductName>Chicco 500ml Body Lotion</ProductName>
<Quantity>1</Quantity>
<ItemPrice>
<Component>
<Type>Principal</Type>
<Amount currency="INR">390.0</Amount>
</Component>
</ItemPrice>
</OrderItem>
</Order>
how can i import this xml data into relational tables. i have 2 tables one is Order and another is items.i wanna insert below mentioned line in order table with as primary key.
<AmazonOrderID>111-222-333</AmazonOrderID>
<MerchantOrderID>1111-3333-444</MerchantOrderID>
<PurchaseDate>2014-08-03T18:11:11+00:00</PurchaseDate>
<LastUpdatedDate>2014-08-03T18:11:14+00:00</LastUpdatedDate>
<OrderStatus>Pending</OrderStatus>
<SalesChannel>Amazon.in</SalesChannel>
and order item in items table with amazonid as foreign key
<OrderItem>
<ASIN>B00AW9A53Q</ASIN>
<SKU>SiM13</SKU>
<ItemStatus>Unshipped</ItemStatus>
<ProductName>Chicco 500ml Body Lotion</ProductName>
<Quantity>1</Quantity>
<ItemPrice>
<Component>
<Type>Principal</Type>
<Amount currency="INR">390.0</Amount>
</Component>
</ItemPrice>
</OrderItem>
</Order>
please anyone can provide solution for this..i will be very grateful.
Assuming you have your XML in a SQL variable of type XML called #Input, you could use this for the first part of your question:
INSERT INTO dbo.Order(AmazonOrderID, MerchantOrderID,
PurchaseDate, LastUpdateDate, OrderStatus, SalesChannel)
SELECT
AmazonOrderID = #input.value('(/Order/AmazonOrderID)[1]', 'varchar(25)'),
MerchantOrderID = #input.value('(/Order/MerchantOrderID)[1]', 'varchar(25)'),
PurchaseDate = #input.value('(/Order/PurchaseDate)[1]', 'datetimeoffset'),
LastUpdatedDate = #input.value('(/Order/LastUpdatedDate)[1]', 'datetimeoffset'),
OrderStatus = #input.value('(/Order/OrderStatus)[1]', 'varchar(20)'),
SalesChannel = #input.value('(/Order/SalesChannel)[1]', 'varchar(50)')
I'm not quite clear what you want to do with the second part of your question - are there potentially multiple <OrderItem> entries? What do you want to do about the nested <ItemPrice> structure?
And what do you mean by amazonid as foreign key - what AmazonId are you referring to?
Here is my SQL. I cannot seem to get one single value out of this thing. It only works if I remove all of the xmlns attributes.
I think the problem is that this xml contains 2 default namespaces, one attached to the Response element and one attached to the Shipment element.
DECLARE #xml XML
SET #xml = '<TrackResponse xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Response xmlns="http://www.ups.com/XMLSchema/XOLTWS/Common/v1.0">
<ResponseStatus>
<Code>1</Code>
<Description>Success</Description>
</ResponseStatus>
<TransactionReference />
</Response>
<Shipment xmlns="http://www.ups.com/XMLSchema/XOLTWS/Track/v2.0">
<InquiryNumber>
<Code>01</Code>
<Description>ShipmentIdentificationNumber</Description>
<Value>1ZA50209234098230</Value>
</InquiryNumber>
<ShipperNumber>A332098</ShipperNumber>
<ShipmentAddress>
<Type>
<Code>01</Code>
<Description>Shipper Address</Description>
</Type>
<Address>
<AddressLine>123 HWY X</AddressLine>
<City>SOMETOWN</City>
<StateProvinceCode>SW</StateProvinceCode>
<PostalCode>20291 1234</PostalCode>
<CountryCode>US</CountryCode>
</Address>
</ShipmentAddress>
<ShipmentWeight>
<UnitOfMeasurement>
<Code>LBS</Code>
</UnitOfMeasurement>
<Weight>0.00</Weight>
</ShipmentWeight>
<Service>
<Code>42</Code>
<Description>UPS GROUND</Description>
</Service>
<Package>
<TrackingNumber>1ZA50209234098230</TrackingNumber>
<PackageServiceOption>
<Type>
<Code>01</Code>
<Description>Signature Required</Description>
</Type>
</PackageServiceOption>
<Activity>
<ActivityLocation>
<Address>
<City>SOMEWHERE</City>
<StateProvinceCode>PA</StateProvinceCode>
<CountryCode>US</CountryCode>
</Address>
</ActivityLocation>
<Status>
<Type>X</Type>
<Description>Damage reported. / Damage claim under investigation.</Description>
<Code>UY</Code>
</Status>
<Date>20120424</Date>
<Time>125000</Time>
</Activity>
<Activity>
<ActivityLocation>
<Address>
<City>SOMEWHERE</City>
<StateProvinceCode>PA</StateProvinceCode>
<CountryCode>US</CountryCode>
</Address>
</ActivityLocation>
<Status>
<Type>X</Type>
<Description>All merchandise discarded. UPS will notify the sender with details of the damage.</Description>
<Code>GY</Code>
</Status>
<Date>20120423</Date>
<Time>115500</Time>
</Activity>
<PackageWeight>
<UnitOfMeasurement>
<Code>LBS</Code>
</UnitOfMeasurement>
<Weight>0.00</Weight>
</PackageWeight>
</Package>
</Shipment>
</TrackResponse>'
select Svc.Dsc.value('(/TrackResponse/Shipment/Service/Description)[1]', 'varchar(25)')
from #xml.nodes('/TrackResponse') as Svc(Dsc)
As #marc_s said, you are ignoring xml namespaces. Here is a sql fiddle example. This gives X, I think that is what you need. Read this article for more. Note : *:TrackResponse[1]/*: in the xpath
--Results: X
declare #xmlTable as table (
xmlData xml
)
insert into #xmlTable
select '<TrackResponse xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Response xmlns="http://www.ups.com/XMLSchema/XOLTWS/Common/v1.0">
...
</TrackResponse>'
;with xmlnamespaces(default 'http://www.ups.com/XMLSchema/XOLTWS/Track/v2.0')
select
x.xmlData.value('(/*:TrackResponse[1]/*:Shipment[1]/Package[1]/Activity[1]/Status[1]/Type[1])','varchar(100)') as all_snacks
from #xmlTable x
Two problems:
you're blatantly ignoring the XML namespace that's defined on the <shipment> element
your XQuery expression was a bit off
Try this:
-- define XML namespace
;WITH XMLNAMESPACES('http://www.ups.com/XMLSchema/XOLTWS/Track/v2.0' AS ns)
select
Svc.Dsc.value('(ns:Shipment/ns:Service/ns:Description)[1]', 'varchar(25)')
from
-- this already selects all <TrackResponse> nodes - no need to repeat that in
-- your above call to .value()
#xml.nodes('/TrackResponse') as Svc(Dsc)
Gives me a result of:
UPS GROUND