I have an XML file that I need to import into a SQL Server database. The XML file is built like this:
<report>
<deltagere>
<deltager>
<number>142555267</number>
<date>29-12-2006</date>
<name>
<name>
<from>01-05-2000</from>
<to>01-01-2003</to>
<text>foo</text>
</name>
<name>
<from>01-01-2003</from>
<to>29-12-2006</to>
<text>bzz</text>
</name>
</name>
<information>
<deltagertype>person</deltagertype>
<leader>John Smith</leader>
<status>Active</status>
</information>
<role>Responsible</role>
</deltager>
<deltager>
<number>4000134982</number>
<date>05-12-2007</date>
<name>
<name>
<from>07-07-2007</from>
<to>05-12-2007</to>
<text>bar</text>
</name>
</name>
<information>
<deltagertype>person</deltagertype>
<leader>Wolfgang Smith</leader>
<status>Active</status>
</information>
<role>Responsible</role>
</deltager>
...
</deltagere>
</report>
As you can see, the name element can hold multiple names. I have managed to import the XML into my database, but only with the first name element.
The code I have written so far is:
DECLARE @XmlFile XML
SELECT @XmlFile = BulkColumn
FROM OPENROWSET(BULK 'C:\input.xml', SINGLE_BLOB) x;
INSERT INTO dbo.deltagere(number, dato, nameFrom, nameTo, nameText, deltagertype, leader, deltagerStatus, deltagerRole)
SELECT
number = deltagere.value('(number)[1]', 'bigint'),
dato = deltagere.value('(date)[1]', 'varchar(10)'),
nameFrom = deltagere.value('(name/name/from)[1]', 'varchar(10)'),
nameTo = deltagere.value('(name/name/to)[1]', 'varchar(10)'),
nameText = deltagere.value('(name/name/text)[1]', 'varchar(30)'),
deltagertype = deltagere.value('(information/deltagertype)[1]', 'varchar(20)'),
leader = deltagere.value('(information/leader)[1]', 'varchar(50)'),
deltagerStatus = deltagere.value('(information/status)[1]', 'varchar(50)'),
deltagerRole = deltagere.value('(role)[1]', 'varchar(50)')
FROM
@XmlFile.nodes('/report/deltagere/deltager') AS XTbl(deltagere);
Which gives me this output:
| number | dato | nameFrom | nameTo | nameText | deltagertype | ...
| 142555267 | 29-12-2006 | 01-05-2000 | 01-01-2003 | foo | person | ...
| 4000134982 | 05-12-2007 | 07-07-2007 | 05-12-2007 | bar | person | ...
I would like to have a row for each name/name. So something like this:
-------------------------------------------------------
| number | dato | nameFrom | nameTo | nameText | deltagertype | ...
| 142555267 | 29-12-2006 | 01-05-2000 | 01-01-2003 | foo | person | ...
| 142555267 | 29-12-2006 | 01-01-2003 | 29-12-2006 | bzz | person | ...
| 4000134982 | 05-12-2007 | 07-07-2007 | 05-12-2007 | bar | person | ...
and so on.
I'm really lost on how to do this, so I hope someone has ideas on how to modify my code to allow this, or maybe a different approach to the problem.
Try this - you need to do a second .nodes() call to enumerate all <name> subnodes:
SELECT
number = deltagere.value('(number)[1]', 'bigint'),
dato = deltagere.value('(date)[1]', 'varchar(10)'),
-- NEW NEW NEW - read from `XC` pseudo columns to get 1-n names
nameFrom = XC.value('(from)[1]', 'varchar(10)'),
nameTo = XC.value('(to)[1]', 'varchar(10)'),
nameText = XC.value('(text)[1]', 'varchar(30)'),
deltagertype = deltagere.value('(information/deltagertype)[1]', 'varchar(20)'),
leader = deltagere.value('(information/leader)[1]', 'varchar(50)'),
deltagerStatus = deltagere.value('(information/status)[1]', 'varchar(50)'),
deltagerRole = deltagere.value('(role)[1]', 'varchar(50)')
FROM
@XmlFile.nodes('/report/deltagere/deltager') AS XTbl(deltagere)
CROSS APPLY
deltagere.nodes('name/name') AS XT2(XC)
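One thing to watch with CROSS APPLY: a <deltager> that has no name/name children at all drops out of the result. A minimal sketch, assuming you would rather keep such rows with NULL name columns, is to use OUTER APPLY instead. The @Sample variable and its trimmed-down XML are made up for illustration only.

```sql
-- Hypothetical inline sample; the second <deltager> has no names at all.
DECLARE @Sample XML = N'
<report>
  <deltagere>
    <deltager>
      <number>142555267</number>
      <name>
        <name><from>01-05-2000</from><to>01-01-2003</to><text>foo</text></name>
        <name><from>01-01-2003</from><to>29-12-2006</to><text>bzz</text></name>
      </name>
    </deltager>
    <deltager>
      <number>4000134982</number>
    </deltager>
  </deltagere>
</report>';

SELECT
    number   = d.value('(number/text())[1]', 'bigint'),
    nameFrom = n.value('(from/text())[1]', 'varchar(10)'),
    nameTo   = n.value('(to/text())[1]', 'varchar(10)'),
    nameText = n.value('(text/text())[1]', 'varchar(30)')
FROM @Sample.nodes('/report/deltagere/deltager') AS XTbl(d)
-- OUTER APPLY keeps the parent row even when no name/name node matches.
OUTER APPLY d.nodes('name/name') AS XT2(n);
```

With CROSS APPLY the second participant would not appear; with OUTER APPLY it should come back once, with NULL in the three name columns.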
I have an XML file that has a series of attributes. The attributes look something like the list below:
<Summary>
<MyAttributes AT001="ABC" AT002="123" AT003="456" AT004="DEF" ... />
</Summary>
I need to iterate over the attributes and add them into a SQL table that looks something like this:
| Name | Value |
| AT001 | ABC |
| AT002 | 123 |
| AT003 | 456 |
| AT004 | DEF |
| ... | ... |
Because the attribute list isn't fixed, I need to iterate over all the attributes to ensure each attribute gets added.
I typically can figure out how to do things in SQL, but this one has me stumped!
It is not clear what SQL you are using.
Here is how to do it in MS SQL Server by using its T-SQL and XQuery methods.
SQL
DECLARE @xml XML =
N'<Summary>
<MyAttributes AT001="ABC" AT002="123" AT003="456" AT004="DEF" />
</Summary>';
SELECT c.value('local-name(.)', 'VARCHAR(30)') AS attr_name
, c.value('.', 'VARCHAR(30)') AS attr_value
FROM @xml.nodes('/Summary/MyAttributes/@*') AS t(c);
Output
+-----------+------------+
| attr_name | attr_value |
+-----------+------------+
| AT001 | ABC |
| AT002 | 123 |
| AT003 | 456 |
| AT004 | DEF |
+-----------+------------+
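Since the goal was to add the attributes to a table, the same SELECT can feed an INSERT directly. This is only a sketch: the @Attributes table variable and its column sizes are assumptions standing in for whatever real table you have.

```sql
DECLARE @xml XML =
N'<Summary>
  <MyAttributes AT001="ABC" AT002="123" AT003="456" AT004="DEF" />
</Summary>';

-- Hypothetical target; a real table would replace this table variable.
DECLARE @Attributes TABLE (Name VARCHAR(30) NOT NULL, Value VARCHAR(30) NULL);

-- @* matches every attribute, so the list does not need to be known up front.
INSERT INTO @Attributes (Name, Value)
SELECT c.value('local-name(.)', 'VARCHAR(30)'),
       c.value('.', 'VARCHAR(30)')
FROM @xml.nodes('/Summary/MyAttributes/@*') AS t(c);

SELECT Name, Value FROM @Attributes;
```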
I would like to know how to extract multiple values from a single XML row. The problem is that this XML value sometimes has repeated groups of (name, id, email) child tags,
for example:
<foo>
<name>
Dacely Lara Camilo
</name>
<id>
001-1942098-2
</id>
<email>
myuncletouchme#gmail.com
</email>
</foo>
<foo>
<name>
Alba Elvira Castro
</name>
<id>
001-0327959-2
</id>
<email>
4doorsmorehorse#hotmail.com
</email>
</foo>
Or sometimes the data in that column can be like this:
<foo>
<name>
Nelson Antonio Jimenez
</name>
<id>
001-0329459-3
</id>
<email>
gsucastillo#tem.com
</email>
</foo>
<foo>
<name>
Emelinda Serrano
</name>
<id>
001-0261732-4
</id>
<email>
gucastillo#tem.com
</email>
</foo>
<foo>
<name>
Nelson Antonio Jimenez
</name>
<id>
001-0329259-3
</id>
<email>
gucastillo#tem.com
</email>
</foo>
<foo>
<name>
Emelinda Serrano
</name>
<id>
001-0268332-4
</id>
<email>
gucastillo#tem.com
</email>
</foo>
And I want all of them to be transposed to a single row, just like this:
My current code extracts only the first pair, in case it helps:
WITH BASEDATA (ID, SIGNATURE, X) AS (
SELECT TOP 50
A.ID_SIGNATURE,
A.SIGNATURE,
A.XML
FROM DWH.DIM_CORE_SIGNATURE A
)SELECT
ID,
A.value('(id)[1]', 'nvarchar(max)') AS ID_SIGNATURE,
A.value('(name)[1]', 'nvarchar(max)') AS NAME,
A.value('(email)[1]', 'nvarchar(max)') AS EMAIL
FROM BASEDATA
CROSS APPLY X.nodes('//foo') AS SIGNATURE(A)
Notable points:
The .nodes('/foo') call uses a better, more performant XPath expression than '//foo', because it does not have to search the whole document for matches.
It is better to use .value('(id/text())[1]', ...) for the same reason.
As @Lamu already suggested, it is better to use real data types instead of nvarchar(max) across the board.
SQL
-- DDL and sample data population, start
DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, xmldata XML);
INSERT INTO @tbl (xmldata) VALUES
(N'<foo>
<name>Dacely Lara Camilo</name>
<id>001-1942098-2</id>
<email>myuncletouchme#gmail.com</email>
</foo>
<foo>
<name>Alba Elvira Castro</name>
<id>001-0327959-2</id>
<email>4doorsmorehorse#hotmail.com</email>
</foo>')
, (N'<foo>
<name>Nelson Antonio Jimenez</name>
<id>001-0329459-3</id>
<email>gsucastillo#tem.com</email>
</foo>
<foo>
<name>Emelinda Serrano</name>
<id>001-0261732-4</id>
<email>gucastillo#tem.com</email>
</foo>
<foo>
<name>Nelson Antonio Jimenez</name>
<id>001-0329259-3</id>
<email>gucastillo#tem.com</email>
</foo>
<foo>
<name>Emelinda Serrano</name>
<id>001-0268332-4</id>
<email>gucastillo#tem.com</email>
</foo>');
-- DDL and sample data population, end
SELECT ID,
c.value('(id/text())[1]', 'char(13)') AS ID_SIGNATURE,
c.value('(name/text())[1]', 'nvarchar(30)') AS NAME,
c.value('(email/text())[1]', 'nvarchar(128)') AS EMAIL
FROM @tbl
CROSS APPLY xmldata.nodes('/foo') AS t(c);
Output
+----+---------------+------------------------+-----------------------------+
| ID | ID_SIGNATURE  | NAME                   | EMAIL                       |
+----+---------------+------------------------+-----------------------------+
| 1  | 001-1942098-2 | Dacely Lara Camilo     | myuncletouchme#gmail.com    |
| 1  | 001-0327959-2 | Alba Elvira Castro     | 4doorsmorehorse#hotmail.com |
| 2  | 001-0329459-3 | Nelson Antonio Jimenez | gsucastillo#tem.com         |
| 2  | 001-0261732-4 | Emelinda Serrano       | gucastillo#tem.com          |
| 2  | 001-0329259-3 | Nelson Antonio Jimenez | gucastillo#tem.com          |
| 2  | 001-0268332-4 | Emelinda Serrano       | gucastillo#tem.com          |
+----+---------------+------------------------+-----------------------------+
I am new to XML. I've figured out how to query and return values from the XML file (example below). However, my query only captures the first 'SerialNo' node, because the same "SerialNo" tag is repeated. The XML file has 4 serial numbers for SKU#TT234343, but the query only gives me the first one, Serial11111. I am totally stuck and don't know how to list all of those serial numbers.
I would like the query result for SKU#TT234343, listing all 4 serial numbers if possible.
Please help. Thanks!
The XML File looks like:
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<ROOT>
<ShipNotice version="1" >
<InvoiceDate>01/01/2015</InvoiceDate>
<InvoiceNumber>6868686</InvoiceNumber>
<ShipDate>02/02/2015</ShipDate>
<ShipTime>2306</ShipTime>
<PONumber>P444444</PONumber>
<PODate>03/03/2015</PODate>
<ShipCode>XXX</ShipCode>
<ShipDescription>FedEx Economy</ShipDescription>
<ShipTo>
<AddressName>ShipABC</AddressName>
<AddressContact>Name1</AddressContact>
<AddressLine1>2222 Street Name</AddressLine1>
<AddressLine2> </AddressLine2>
<City>AUSTIN</City>
<State>TX</State>
<ZipCode>78111</ZipCode>
</ShipTo>
<BillTo>
<AddressName>BillABC</AddressName>
<AddressContact>Name1</AddressContact>
<AddressLine1>1234 Street Name</AddressLine1>
<AddressLine2>-SUITE 111</AddressLine2>
<City>Los Angeles</City>
<State>CA</State>
<ZipCode>95136</ZipCode>
</BillTo>
<TotalWeight>324</TotalWeight>
<EmptyCartonWGT>0</EmptyCartonWGT>
<NumberOfCarton>1</NumberOfCarton>
<DirectShipFlag>D</DirectShipFlag>
<ShipFromWarehouse>88</ShipFromWarehouse>
<ShipFromZip>94538</ShipFromZip>
<ShipTrackNo>33333333</ShipTrackNo>
<EndUserPONumber>55555555</EndUserPONumber>
<CustomerSONumber/>
<Package sequence="1" >
<TrackNumber>666666666</TrackNumber>
<PackageWeight>324</PackageWeight>
<Item sequence="1" >
<SOLineNo>1</SOLineNo>
<MfgPN>XYZ1111111</MfgPN>
<SKU>TT234343</SKU>
<ShipQuantity>4</ShipQuantity>
<CustPOLineNo>1</CustPOLineNo>
<CustSOLineNo/>
<Description>Server1234</Description>
<CustPN/>
<UPC/>
<UnitPrice>1000</UnitPrice>
<EndUserPOLineNo>0</EndUserPOLineNo>
<SerialNo>Serial11111</SerialNo>
<SerialNo>Serial22222</SerialNo>
<SerialNo>Serial33333</SerialNo>
<SerialNo>Serial44444</SerialNo>
</Item>
<Item sequence="2" >
<SOLineNo>2</SOLineNo>
<MfgPN>XYZ222222</MfgPN>
<SKU>TT8848788</SKU>
<ShipQuantity>4</ShipQuantity>
<CustPOLineNo>2</CustPOLineNo>
<CustSOLineNo/>
<Description>GGG localization</Description>
<CustPN/>
<UPC/>
<UnitPrice>0.00</UnitPrice>
<EndUserPOLineNo>0</EndUserPOLineNo>
<SerialNo/>
</Item>
</Package>
</ShipNotice>
</ROOT>
The SQL Query:
DECLARE @XML AS XML, @hDoc AS INT, @SQL NVARCHAR (MAX)
EXEC sp_xml_preparedocument @hDoc OUTPUT, @xmlData
SELECT
InvoiceNumber, PONumber, PODate
, AddressName
, MfgPN, SerialNo
--, AddressContact, AddressLine1, AddressLine2, City, State, ZipCode
FROM OPENXML(@hDoc, '/ROOT/ShipNotice/Package/Item')
WITH
(
--- ################# Level 1 #################
InvoiceNumber [varchar](50) '../../InvoiceNumber',
PONumber [varchar](100) '../../PONumber',
PODate [varchar](100) '../../PODate',
--- ################# Level 2 #################
AddressName [varchar](100) '../../ShipTo/AddressName',
--- ################# Level 3 #################
MfgPN [varchar](100) 'MfgPN',
SerialNo [varchar](100) 'SerialNo'
)
You can try using the newer XQuery methods instead of OPENXML(). With XQuery, you can use the nodes() method to shred the XML on the elements that correspond to rows in the output, and value() to extract each element's value:
SELECT
shipnotice.value('InvoiceNumber[1]','varchar(20)') InvoiceNumber
, shipnotice.value('PONumber[1]','varchar(20)') PONumber
, shipnotice.value('PODate[1]','varchar(20)') PODate
, shipnotice.value('(ShipTo/AddressName)[1]','varchar(100)') AddressName
, item.value('MfgPN[1]','varchar(100)') MfgPN
, serialno.value('.','varchar(100)') SerialNo
FROM @XML.nodes('/ROOT/ShipNotice') as t(shipnotice)
OUTER APPLY shipnotice.nodes('Package/Item') as t2(item)
OUTER APPLY item.nodes('SerialNo') as t3(serialno)
Output:
| InvoiceNumber | PONumber | PODate | AddressName | MfgPN | SerialNo |
|---------------|----------|------------|-------------|------------|-------------|
| 6868686 | P444444 | 03/03/2015 | ShipABC | XYZ1111111 | Serial11111 |
| 6868686 | P444444 | 03/03/2015 | ShipABC | XYZ1111111 | Serial22222 |
| 6868686 | P444444 | 03/03/2015 | ShipABC | XYZ1111111 | Serial33333 |
| 6868686 | P444444 | 03/03/2015 | ShipABC | XYZ1111111 | Serial44444 |
| 6868686 | P444444 | 03/03/2015 | ShipABC | XYZ222222 | |
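If you only want the serial numbers for one SKU, as the question asks for TT234343, one option is to put a predicate in the nodes() path. The following is a sketch under that assumption; the @XML value is a cut-down, made-up version of the file that keeps only the elements the query touches.

```sql
-- Hypothetical trimmed-down sample of the ShipNotice document.
DECLARE @XML XML = N'
<ROOT>
  <ShipNotice>
    <Package>
      <Item>
        <SKU>TT234343</SKU>
        <SerialNo>Serial11111</SerialNo>
        <SerialNo>Serial22222</SerialNo>
        <SerialNo>Serial33333</SerialNo>
        <SerialNo>Serial44444</SerialNo>
      </Item>
      <Item>
        <SKU>TT8848788</SKU>
        <SerialNo/>
      </Item>
    </Package>
  </ShipNotice>
</ROOT>';

SELECT item.value('(SKU/text())[1]', 'varchar(50)') AS SKU,
       serialno.value('.', 'varchar(100)')          AS SerialNo
-- The [SKU = "..."] predicate keeps only the matching <Item> elements.
FROM @XML.nodes('/ROOT/ShipNotice/Package/Item[SKU = "TT234343"]') AS t(item)
-- One output row per <SerialNo> child of each matching item.
CROSS APPLY item.nodes('SerialNo') AS t2(serialno);
```

CROSS APPLY is used here on purpose: an item with no serial numbers is simply left out, whereas the OUTER APPLY in the answer above keeps it with an empty SerialNo.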
I am trying to learn XQuery and XPath in SQL Server.
I created a sample file and uploaded it to a Table with 2 columns ID, XMLDoc. The below code is within the document in the XMLDoc column so it is the only record in the column.
I am trying to query the file so it will show all the results in a table like a normal select statement would. How would you construct the select statement to select all the information like a select * ? How would you select one field like all suppliers? I would like to select the supplier, requestor for each item.
Here is the xml:
<tst:Document xmlns:tst ="http://www.w3.org/2001/XMLSchema" SchemaVersion="0.1" Classification="Test" UniqueIdentifier="1234" Title="Test">
<tst:Revision RevNumber="0" TimeStamp="2013-01-21T12:56:00">
<tst:Author Name="Me" Guid="1234" />
</tst:Revision>
<tst:Formats>
<tst:A12 Item="1">
<tst:Requestor Name="ADC" />
<tst:Supplier Name="BBC" />
<tst:Code>B</tst:Code>
<tst:IsRequirement>true</tst:IsRequirement>
<tst:IsInformation>false</tst:IsInformation>
<tst:Remarks>ADC (Random Input Section)</tst:Remarks>
<tst:Notes>Next Round.</tst:Notes>
<tst:Events>
<tst:SubTest Item="0">
<tst:BLDG>BLDG1</tst:BLDG>
<tst:BLDG2>BLDG2</tst:BLDG2>
<tst:Function>Testing</tst:Function>
<tst:Desciption>Normal Flow</tst:Desciption>
</tst:SubTest>
</tst:Events>
<tst:IsReady>true</tst:IsReady>
<tst:IsNotReady>false</tst:IsNotReady>
</tst:A12>
<tst:A12 Item="2">
<tst:Requestor Name="ADC" />
<tst:Supplier Name="BBC" />
<tst:Code>A</tst:Code>
<tst:IsRequirement>true</tst:IsRequirement>
<tst:IsInformation>false</tst:IsInformation>
<tst:Remarks>Requirement Not yet met.</tst:Remarks>
<tst:Notes>Ready.</tst:Notes>
<tst:Events>
<tst:SubTest Item="0">
<tst:BLDG>BLDG3</tst:BLDG>
<tst:BLDG2>BLDG4</tst:BLDG2>
<tst:TotalEvents>1</tst:TotalEvents>
<tst:Function>Development</tst:Function>
<tst:Desciption>Process Flow</tst:Desciption>
</tst:SubTest>
</tst:Events>
<tst:IsReady>true</tst:IsReady>
<tst:IsNotReady>false</tst:IsNotReady>
</tst:A12>
</tst:Formats>
</tst:Document>
Query I ran
I just got a return, but it is still showing it in xml form:
Select XMLDoc.query('/*/*/*/*[local-name()=("Requestor", "Supplier")]')
From XMLLoad
I updated the XML snippet; sorry, it had a typo! It will load now.
INSERT INTO TableName(ColumnName)
SELECT * FROM OPENROWSET(
BULK 'C:\Users\Filepath.xml',
SINGLE_BLOB) AS x;
SQL Fiddle
MS SQL Server 2008 Schema Setup:
create table XMLDoc (XMLLoad xml);
insert into XMLDoc(XMLLoad) values('
<tst:Document xmlns:tst ="http://www.w3.org/2001/XMLSchema" SchemaVersion="0.1" Classification="Test" UniqueIdentifier="1234" Title="Test">
<tst:Revision RevNumber="0" TimeStamp="2013-01-21T12:56:00">
<tst:Author Name="Me" Guid="1234" />
</tst:Revision>
<tst:Formats>
<tst:A12 Item="1">
<tst:Requestor Name="ADC" />
<tst:Supplier Name="BBC" />
<tst:Code>B</tst:Code>
<tst:IsRequirement>true</tst:IsRequirement>
<tst:IsInformation>false</tst:IsInformation>
<tst:Remarks>ADC (Random Input Section)</tst:Remarks>
<tst:Notes>Next Round.</tst:Notes>
<tst:Events>
<tst:SubTest Item="0">
<tst:BLDG>BLDG1</tst:BLDG>
<tst:BLDG2>BLDG2</tst:BLDG2>
<tst:Function>Testing</tst:Function>
<tst:Desciption>Normal Flow</tst:Desciption>
</tst:SubTest>
</tst:Events>
<tst:IsReady>true</tst:IsReady>
<tst:IsNotReady>false</tst:IsNotReady>
</tst:A12>
<tst:A12 Item="2">
<tst:Requestor Name="ADC" />
<tst:Supplier Name="BBC" />
<tst:Code>A</tst:Code>
<tst:IsRequirement>true</tst:IsRequirement>
<tst:IsInformation>false</tst:IsInformation>
<tst:Remarks>Requirement Not yet met.</tst:Remarks>
<tst:Notes>Ready.</tst:Notes>
<tst:Events>
<tst:SubTest Item="0">
<tst:BLDG>BLDG3</tst:BLDG>
<tst:BLDG2>BLDG4</tst:BLDG2>
<tst:TotalEvents>1</tst:TotalEvents>
<tst:Function>Development</tst:Function>
<tst:Desciption>Process Flow</tst:Desciption>
</tst:SubTest>
</tst:Events>
<tst:IsReady>true</tst:IsReady>
<tst:IsNotReady>false</tst:IsNotReady>
</tst:A12>
</tst:Formats>
</tst:Document>');
Query 1:
with xmlnamespaces('http://www.w3.org/2001/XMLSchema' as tst)
select A12.X.value('@Item', 'int') as A12,
A12.X.value('tst:Requestor[1]/@Name', 'varchar(25)') as Requestor,
A12.X.value('tst:Supplier[1]/@Name', 'varchar(25)') as Supplier,
A12.X.value('(tst:Code/text())[1]', 'varchar(25)') as Code,
A12.X.value('(tst:IsRequirement/text())[1]', 'bit') as IsRequirement,
A12.X.value('(tst:IsInformation/text())[1]', 'bit') as IsInformation,
A12.X.value('(tst:Remarks/text())[1]', 'varchar(50)') as Remarks,
A12.X.value('(tst:Notes/text())[1]', 'varchar(50)') as Notes,
ST.X.value('@Item', 'int') as SubTest,
ST.X.value('(tst:BLDG/text())[1]', 'varchar(25)') as BLDG,
ST.X.value('(tst:BLDG2/text())[1]', 'varchar(25)') as BLDG2,
ST.X.value('(tst:TotalEvents/text())[1]', 'int') as TotalEvents,
ST.X.value('(tst:Function/text())[1]', 'varchar(25)') as [Function],
ST.X.value('(tst:Desciption/text())[1]', 'varchar(50)') as Desciption
from XMLDoc as X
cross apply X.XMLLoad.nodes('/tst:Document/tst:Formats/tst:A12') as A12(X)
cross apply A12.X.nodes('tst:Events/tst:SubTest') as ST(X)
Results:
| A12 | REQUESTOR | SUPPLIER | CODE | ISREQUIREMENT | ISINFORMATION | REMARKS | NOTES | SUBTEST | BLDG | BLDG2 | TOTALEVENTS | FUNCTION | DESCIPTION |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 1 | ADC | BBC | B | 1 | 0 | ADC (Random Input Section) | Next Round. | 0 | BLDG1 | BLDG2 | (null) | Testing | Normal Flow |
| 2 | ADC | BBC | A | 1 | 0 | Requirement Not yet met. | Ready. | 0 | BLDG3 | BLDG4 | 1 | Development | Process Flow |
Check out value() and nodes().
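For the "select one field, like all suppliers" part of the question, a smaller sketch with the same two methods: point nodes() straight at the Supplier elements and read the Name attribute with value(). The @doc variable is an assumption, a reduced copy of the document that keeps only the Requestor and Supplier elements.

```sql
-- Hypothetical reduced copy of the document from the question.
DECLARE @doc XML = N'
<tst:Document xmlns:tst="http://www.w3.org/2001/XMLSchema">
  <tst:Formats>
    <tst:A12 Item="1"><tst:Requestor Name="ADC" /><tst:Supplier Name="BBC" /></tst:A12>
    <tst:A12 Item="2"><tst:Requestor Name="ADC" /><tst:Supplier Name="BBC" /></tst:A12>
  </tst:Formats>
</tst:Document>';

-- The prefix must be declared for the query, even though the XML declares it too.
WITH XMLNAMESPACES ('http://www.w3.org/2001/XMLSchema' AS tst)
SELECT S.X.value('@Name', 'varchar(25)') AS Supplier
FROM @doc.nodes('/tst:Document/tst:Formats/tst:A12/tst:Supplier') AS S(X);
```

This should return one row per Supplier element, so two rows for the reduced sample.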
I've got some basic XML stored as an XML datatype in SQL Server 2005. One record/row looks like this:
<doc>
<level1>
<level2>
<name>James</name>
<age>12</age>
</level2>
<level2>
<name>John</name>
<age>23</age>
</level2>
</level1>
</doc>
When I perform some basic T-SQL
SELECT TOP 1
DocumentXML.query('data(//doc/name)'),
DocumentXML.query('data(//doc/age)')
FROM [DBNAME].[dbo].[TBLNAME]
I get
ID | Name | Age
----------------------
1 | JamesJohn | 1223
How do I re-write the T-SQL so it displays as
ID | Name | Age
--------------------
1 | James | 12
2 | John | 23
Your example doesn't work for me; the second level2 opens with </level2>. And //doc/name doesn't exist; might be //doc/level1/level2/name.
Here's an example of how to retrieve a rowset from an XML:
declare @t table (id int identity, doc xml)
insert @t (doc) values (
'<doc>
<level1>
<level2>
<name>James</name>
<age>12</age>
</level2>
<level2>
<name>John</name>
<age>23</age>
</level2>
</level1>
</doc>')
SELECT x.a.value('(name)[1]','varchar(50)') as col1
, x.a.value('(age)[1]','varchar(50)') as col2
FROM @t t
cross apply
t.doc.nodes('//level2') x(a)
This prints:
col1 col2
James 12
John 23