How to search XML in SQL Server

I looked at some threads, but I think I'm missing something in Microsoft SQL Server (SSMS).
I have XML in a column defined with the XML data type that looks like this:
(I erased stuff before this not sure if it's needed)
<ItemGroupData ItemGroupOID="TEST" TransactionType="Insert">
<ItemData ItemOID="TEACHER" Value="145"/>
<ItemData ItemOID="AGE" Value="50" />
</ItemGroupData>
<ItemGroupData ItemGroupOID="TEST" TransactionType="Insert">
<ItemData ItemOID="TEACHER" Value="151"/>
<ItemData ItemOID="AGE" Value="42" />
</ItemGroupData>
There's stuff I truncated, but what is the best way to locate the XML document that contains teacher 145, given that the teacher can be in any of the ItemData groups?
I can find it like:
SELECT
CAST(XML AS nvarchar(max)) AS test
FROM
table1
WHERE
XML LIKE '%14%'
but I would like to learn other ways that don't involve casting, unless casting is the best way?

Yes, that "stuff before" that you erased will be very important! You need to build up XPath expressions to select individual pieces from the XML - and those depend on everything from the root on down! Also - you might have XML namespaces that are defined in the "stuff before" - which you need to respect to get any results.
Anyhoo - ASSUMING you have just a <root>....</root> node before your XML, you could get your desired result like this :
DECLARE @XmlTbl TABLE (ID INT NOT NULL, XmlData XML)
INSERT INTO @XmlTbl (ID, XmlData)
VALUES (1,
'<root><ItemGroupData ItemGroupOID="TEST" TransactionType="Insert">
<ItemData ItemOID="TEACHER" Value="145"/>
<ItemData ItemOID="AGE" Value="50" />
</ItemGroupData>
<ItemGroupData ItemGroupOID="TEST" TransactionType="Insert">
<ItemData ItemOID="TEACHER" Value="151"/>
<ItemData ItemOID="AGE" Value="42" />
</ItemGroupData></root>')
SELECT
t.ID,
XC.value('(@ItemGroupOID)[1]', 'varchar(50)') AS ItemGroupOID,
XC.value('(@TransactionType)[1]', 'varchar(50)') AS TransactionType,
XC2.value('@ItemOID', 'varchar(25)') AS ItemOID,
XC2.value('@Value', 'int') AS value
FROM
@XmlTbl t
CROSS APPLY -- "enumerate" the <ItemGroupData> nodes under <root>
XmlData.nodes('/root/ItemGroupData') AS XT(XC)
CROSS APPLY -- "enumerate" the <ItemData> subnodes
XC.nodes('ItemData') AS XT2(XC2)
WHERE
XC.value('(ItemData/@Value)[1]', 'int') = 145
This would return these results:
ID  ItemGroupOID  TransactionType  ItemOID  value
1   TEST          Insert           TEACHER  145
1   TEST          Insert           AGE      50
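If the goal is only to locate the rows that contain teacher 145, rather than to shred every attribute into columns, the xml data type's exist() method can act as a filter without any cast to nvarchar. The following is a minimal sketch, not a definitive solution: the table variable @Docs and its second sample row are made up for illustration, and the <root> wrapper is the same assumption about the truncated XML as in the answer above.

```sql
-- Hypothetical sample data; the <root> wrapper is an assumption about the truncated XML.
DECLARE @Docs TABLE (ID INT NOT NULL, XmlData XML);

INSERT INTO @Docs (ID, XmlData)
VALUES
    (1, '<root><ItemGroupData ItemGroupOID="TEST" TransactionType="Insert">
             <ItemData ItemOID="TEACHER" Value="145"/>
             <ItemData ItemOID="AGE" Value="50"/>
         </ItemGroupData></root>'),
    (2, '<root><ItemGroupData ItemGroupOID="TEST" TransactionType="Insert">
             <ItemData ItemOID="TEACHER" Value="151"/>
             <ItemData ItemOID="AGE" Value="145"/>
         </ItemGroupData></root>');

-- exist() returns 1 when the XPath matches at least one node in the document.
-- The predicate ties the value to the TEACHER item, so AGE="145" in row 2 does not match.
SELECT d.ID, d.XmlData
FROM @Docs AS d
WHERE d.XmlData.exist('/root/ItemGroupData/ItemData[@ItemOID="TEACHER" and @Value="145"]') = 1;
```

With this sample data only the row with ID 1 qualifies, which is the difference from a plain LIKE '%145%' search on the casted string.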

SQL Server supports the powerful XQuery language for working with the XML data type.
Please try the following solution.
It uses the XPath predicate [@Value=sql:variable("@TeacherValue")] to search directly in the XML data type.
The sql:variable("@TeacherValue") construct lets you pass a T-SQL variable into the XQuery expression as a parameter.
SQL Server also supports XML indexes for this kind of search.
SQL
-- DDL and sample data population, start
DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, XmlData XML);
INSERT INTO @tbl (XmlData) VALUES
(N'<root>
<ItemGroupData ItemGroupOID="TEST" TransactionType="Insert">
<ItemData ItemOID="TEACHER" Value="145"/>
<ItemData ItemOID="AGE" Value="50"/>
</ItemGroupData>
<ItemGroupData ItemGroupOID="TEST" TransactionType="Insert">
<ItemData ItemOID="TEACHER" Value="151"/>
<ItemData ItemOID="AGE" Value="42"/>
</ItemGroupData>
</root>');
-- DDL and sample data population, end
DECLARE @TeacherValue INT = 145;
SELECT t.ID
, p.value('@ItemGroupOID', 'VARCHAR(50)') AS ItemGroupOID
, p.value('@TransactionType', 'VARCHAR(50)') AS TransactionType
, c.value('@ItemOID', 'VARCHAR(25)') AS ItemOID
, c.value('@Value', 'INT') AS value
FROM @tbl AS t
CROSS APPLY XmlData.nodes('/root/ItemGroupData[ItemData[@ItemOID="TEACHER"
and @Value=sql:variable("@TeacherValue")]]') AS t1(p)
CROSS APPLY t1.p.nodes('ItemData') AS t2(c);
Output
ID  ItemGroupOID  TransactionType  ItemOID  value
1   TEST          Insert           TEACHER  145
1   TEST          Insert           AGE      50
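The XML indexes mentioned in this answer could be set up roughly as follows. This is a sketch under stated assumptions: the table dbo.XmlDocs and the index names are hypothetical, and whether the indexes actually help depends on the data and the queries, so it is worth measuring before and after.

```sql
-- Hypothetical permanent table; XML indexes cannot be created on table variables.
CREATE TABLE dbo.XmlDocs
(
    ID      INT IDENTITY NOT NULL
            CONSTRAINT PK_XmlDocs PRIMARY KEY CLUSTERED,
    XmlData XML NOT NULL
);

-- The primary XML index requires a clustered primary key on the base table.
CREATE PRIMARY XML INDEX PXML_XmlDocs_XmlData
    ON dbo.XmlDocs (XmlData);

-- Optional secondary XML index; FOR VALUE targets value-based predicates
-- such as [@Value = ...].
CREATE XML INDEX SXML_XmlDocs_XmlData_Value
    ON dbo.XmlDocs (XmlData)
    USING XML INDEX PXML_XmlDocs_XmlData
    FOR VALUE;
```

XML indexes can take considerably more storage than the XML itself, which is one reason to test them on a copy of the data first.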

Related

Parse out XML data into Rows in SQL Server

I am trying to parse WordPress data in our SQL Server from an Elasticsearch structure.
This is the contents of one field on one record;
<category domain="category" nicename="featured">Featured</category>
<category domain="post_tag" nicename="name1">Name 1</category>
<category domain="post_tag" nicename="name-2">Name 2</category>
<category domain="post_tag" nicename="different-name">Different Name</category>
<category domain="type" nicename="something-else">Something Else</category>
I'd like to parse this out as a table with the headers Domain, NiceName and Contents and a row for each of these nodes in the data. Something along these lines;
Domain    NiceName        Contents
category  featured        Featured
post_tag  name1           Name 1
post_tag  name-2          Name 2
post_tag  different-name  Different Name
type      something-else  Something Else
The number of nodes is different for each row in the data and can appear in any order. Currently the data is stored in a varchar data type but this can be modified if it's best to parse using something like XML.
It's recommended that you use the xml data type for storing XML data. But if you must store it in a varchar column you can use try_cast to cast it to XML (which results in null if it's not actually valid XML) and then work with it using the normal nodes(), query() and value() XML methods such as the following...
create table dbo.Records (
OneField varchar(max)
);
insert dbo.Records (OneField) values
('<category domain="category" nicename="featured">Featured</category>
<category domain="post_tag" nicename="name1">Name 1</category>
<category domain="post_tag" nicename="name-2">Name 2</category>
<category domain="post_tag" nicename="different-name">Different Name</category>
<category domain="type" nicename="something-else">Something Else</category>');
select
Category.value('@domain', 'varchar(50)') as [Domain],
Category.value('@nicename', 'varchar(50)') as [NiceName],
Category.value('(text())[1]', 'varchar(50)') as [Contents]
from dbo.Records R
cross apply (select try_cast(OneField as XML)) X(OneFieldXML)
cross apply OneFieldXML.nodes('/category') N(Category);
Domain    NiceName        Contents
category  featured        Featured
post_tag  name1           Name 1
post_tag  name-2          Name 2
post_tag  different-name  Different Name
type      something-else  Something Else

Performance Issue with cross apply while reading XML nodes for large dataset

Performance issue with XML cross apply:
DataTable has 1,300 rows and the xmldata field has 250 nodes per row, so the query shreds 1300 * 250 nodes to produce the output, and execution takes a while: about an hour to generate 325,000 rows. Has anybody faced a similar issue with a large dataset? Your help is highly appreciated.
Sample XML:
<dataModel>
<Colum1>
<value />
<displayText />
<controltype>textbox</controltype>
<label>Field1</label>
<controlid>4458575-b0d3-ff4d-01ac-5447e21234dd</controlid>
</Colum1>
<Colum2>
<value />
<displayText />
<controltype>textbox</controltype>
<label>Field2</label>
<controlid>5a5b7b7e-7b66-1f0d-a562-9d0660a74e11</controlid>
</Colum2>
....
</dataModel>
select t.c.value('(local-name(.))[1]', 'nvarchar(100)') as keyname ,
t.c.value('(controlid)[1]', 'nvarchar(200)') as controlid,
t.c.value('(label)[1]', 'nvarchar(500)') as label
from DataTable xmldata
CROSS APPLY xmldata .nodes('/dataModel/*') T(c)
Thanks
Your approach seems to be pretty straightforward. There's not much room for improvement...
The following is just a tiny change, but it might speed things up:
declare @tbl TABLE(ID INT IDENTITY, xmldata XML);
INSERT INTO @tbl VALUES
(N'<dataModel>
<Colum1>
<value />
<displayText />
<controltype>textbox</controltype>
<label>Field1</label>
<controlid>4458575-b0d3-ff4d-01ac-5447e21234dd</controlid>
</Colum1>
<Colum2>
<value />
<displayText />
<controltype>textbox</controltype>
<label>Field2</label>
<controlid>5a5b7b7e-7b66-1f0d-a562-9d0660a74e11</controlid>
</Colum2>
</dataModel>');
select t.c.value('(local-name(.))[1]', 'nvarchar(100)') as keyname ,
t.c.value('(controlid/text())[1]', 'nvarchar(200)') as controlid,
t.c.value('(label/text())[1]', 'nvarchar(500)') as label
from @tbl xmldata
CROSS APPLY xmldata.nodes('/dataModel/*') T(c);
I added /text() to your XPaths. Without it, the engine has to assume that the element may contain nested elements and collect all the text below it; with /text() it reads the element's own text node directly.
It would be kind to tell us how much difference /text() makes in your case, thx.
And important to know: one very expensive part of working with XML is the initial parsing. Make sure that the table's column is natively typed as xml and that your run-time measurement is not biased by any loading / reading / parsing action.
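The advice about native typing can be checked, and if necessary worked around, along these lines. This is only a sketch: dbo.DataTable is the table name taken from the question, while the ID column and the temp table #ParsedData are assumptions made for illustration.

```sql
-- 1) Check how the columns are actually declared (xml vs. a string type).
SELECT c.name AS column_name, ty.name AS type_name
FROM sys.columns AS c
JOIN sys.types   AS ty ON ty.user_type_id = c.user_type_id
WHERE c.object_id = OBJECT_ID(N'dbo.DataTable');

-- 2) If xmldata turns out to be a string type, parse every row exactly once
--    into a temp table with a native xml column (ID is an assumed key column).
SELECT d.ID, CAST(d.xmldata AS XML) AS xmldata
INTO #ParsedData
FROM dbo.DataTable AS d;

-- 3) Shred the already-parsed XML; timing this step separately shows how much
--    of the run time is parsing and how much is the actual shredding.
SELECT p.ID,
       n.c.value('(local-name(.))[1]',   'nvarchar(100)') AS keyname,
       n.c.value('(controlid/text())[1]', 'nvarchar(200)') AS controlid,
       n.c.value('(label/text())[1]',     'nvarchar(500)') AS label
FROM #ParsedData AS p
CROSS APPLY p.xmldata.nodes('/dataModel/*') AS n(c);
```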

How to get value from a node in XML via SQL Server

I've found several pieces of information online about this but I can't get it working for the life of me.
This is the XML I have:
I need to extract the ID & Name value for each node. There are a lot.
I tried to do this but it returns NULL:
select [xml].value('(/Alter/Object/ObjectDefinition/MeasureGroup/Partitions/Partition/ID)[1]', 'varchar(max)')
from test_xml
I understand the above would return only 1 record. My question is, how do I return all records?
Here's the XML text (stripped down version):
<Alter xmlns="http://schemas.microsoft.com/analysisservices/2003/engine" AllowCreate="true" ObjectExpansion="ExpandFull">
<ObjectDefinition>
<MeasureGroup xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<ID>ts_homevideo_sum_20140430_76091ba1-3a51-45bf-a767-f9f3de7eeabe</ID>
<Name>table_1</Name>
<StorageMode valuens="ddl200_200">InMemory</StorageMode>
<ProcessingMode>Regular</ProcessingMode>
<Partitions>
<Partition>
<ID>123</ID>
<Name>2012</Name>
</Partition>
<Partition>
<ID>456</ID>
<Name>2013</Name>
</Partition>
</Partitions>
</MeasureGroup>
</ObjectDefinition>
</Alter>
You need something like this:
DECLARE @MyTable TABLE (ID INT NOT NULL, XmlData XML)
INSERT INTO @MyTable (ID, XmlData)
VALUES (1, '<Alter xmlns="http://schemas.microsoft.com/analysisservices/2003/engine" AllowCreate="true" ObjectExpansion="ExpandFull">
<ObjectDefinition>
<MeasureGroup xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<ID>ts_homevideo_sum_20140430_76091ba1-3a51-45bf-a767-f9f3de7eeabe</ID>
<Name>table_1</Name>
<StorageMode valuens="ddl200_200">InMemory</StorageMode>
<ProcessingMode>Regular</ProcessingMode>
<Partitions>
<Partition>
<ID>123</ID>
<Name>2012</Name>
</Partition>
<Partition>
<ID>456</ID>
<Name>2013</Name>
</Partition>
</Partitions>
</MeasureGroup>
</ObjectDefinition>
</Alter>')
;WITH XMLNAMESPACES(DEFAULT 'http://schemas.microsoft.com/analysisservices/2003/engine')
SELECT
tbl.ID,
MeasureGroupID = xc.value('(ID)[1]', 'varchar(200)'),
MeasureGroupName = xc.value('(Name)[1]', 'varchar(200)'),
PartitionID = xp.value('(ID)[1]', 'varchar(200)'),
PartitionName = xp.value('(Name)[1]', 'varchar(200)')
FROM
@MyTable tbl
CROSS APPLY
tbl.XmlData.nodes('/Alter/ObjectDefinition/MeasureGroup') AS XT(XC)
CROSS APPLY
XC.nodes('Partitions/Partition') AS XT2(XP)
WHERE
tbl.ID = 1
First of all, you must respect and include the default XML namespace defined in the root of your XML document.
Next, you need to do a nested call to .nodes() to get all <MeasureGroup> and all contained <Partition> nodes, so that you can reach into those XML fragments and extract the ID and Name from them.
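This should then result in something like this as output:
ID  MeasureGroupID                                                  MeasureGroupName  PartitionID  PartitionName
1   ts_homevideo_sum_20140430_76091ba1-3a51-45bf-a767-f9f3de7eeabe  table_1           123          2012
1   ts_homevideo_sum_20140430_76091ba1-3a51-45bf-a767-f9f3de7eeabe  table_1           456          2013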
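The effect of the default namespace can be seen in isolation with a small test. The sketch below uses a cut-down, made-up document in the variable @x (the ID value mg1 is invented); only the namespace URI comes from the question.

```sql
-- Cut-down, hypothetical document that uses the same default namespace as the question.
DECLARE @x XML = N'<Alter xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <ObjectDefinition><MeasureGroup><ID>mg1</ID></MeasureGroup></ObjectDefinition>
</Alter>';

-- No namespace declared: the path looks for elements in no namespace,
-- nothing matches, and value() returns NULL.
SELECT @x.value('(/Alter/ObjectDefinition/MeasureGroup/ID)[1]', 'varchar(200)') AS WithoutNamespace;

-- Same path with the default namespace declared: returns mg1.
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/analysisservices/2003/engine')
SELECT @x.value('(/Alter/ObjectDefinition/MeasureGroup/ID)[1]', 'varchar(200)') AS WithNamespace;

-- Namespace wildcard: matches the elements regardless of namespace, also returns mg1.
SELECT @x.value('(/*:Alter/*:ObjectDefinition/*:MeasureGroup/*:ID)[1]', 'varchar(200)') AS WithWildcard;
```

The wildcard form is convenient for quick checks; declaring the namespace is the more explicit choice when the URI is known.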

How to get value from ntext (in xml format) column in sql

I have a column in my SQL database that is called Triggers_xml_data and its type is ntext. The column holds XML, and I am trying to get a value from a certain part of it. I have seen an example of this being done with a variable instead of a column, like this:
declare @fileContent xml
set @fileContent ='<my:Header>
<my:Requestor>Mehrlein, Roswitha</my:Requestor>
<my:RequestorUserName>SJM\MehrlR01</my:RequestorUserName>
<my:RequestorEmail>RMehrlein@SJM.com</my:RequestorEmail>
<my:HRContact>Roswita Mehrlein, Beatrice Porta</my:HRContact>
<my:Entity>SJM Germany</my:Entity>
<my:Department>HR/Administration</my:Department>
<my:PositionTitle>Sales Representative</my:PositionTitle>
<my:JobDescription>x0lGQRQAAAABAAAAAAAAAAAeAQAyAAAAVgBAAAAA=</my:JobDescription>
<my:PositionDepartment>Sales</my:PositionDepartment>'

;WITH XMLNAMESPACES ('http://schemas.microsoft.com/office/infopath/2003/myXSD/2005-08-29T12-58-51' as my)
select @fileContent.value('(//my:PositionDepartment)[1]', 'varchar(255)')
But I want to select my column like this:
Declare @filevalue xml
select de.triggers_xml_data
from dbo.DEPLOYMENT_ENVIRONMENT as de
But this is not working. I tried to use @filecontent.value('(//value)[1]','varchar(255)') and set the variable to the column value, and I have tried casting it, but I can't find a way to do this. Is this possible?
When I do this:
SELECT
CAST(
REPLACE(CAST(de.TRIGGERS_XML_DATA AS VARCHAR(MAX)), 'encoding="utf-16"', '')
AS XML).value('(triggers/triggerDefinition/config/item/value)[1]', 'NVARCHAR(max)') as Item, de.ENVIRONMENT_ID
from dbo.DEPLOYMENT_ENVIRONMENT as de
where de.ENVIRONMENT_ID = 19234819
I am getting a null value returned.
Here is an example of what my xml could look like:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<configuration xml:space="preserve">
<triggers>
<defined>true</defined>
<triggerDefinition>
<id>1</id>
<name>After successful deployment</name>
<userDescription/>
<isEnabled>true</isEnabled>
<pluginKey>com.atlassian.bamboo.triggers.atlassian-bamboo-triggers:afterSuccessfulDeployment</pluginKey>
<triggeringRepositories/>
<config>
<item>
<key>deployment.trigger.afterSuccessfulDeployment.triggeringEnvironmentId</key>
<value>19234819</value>
</item>
</config>
</triggerDefinition>
</triggers>
<bambooDelimiterParsingDisabled>true</bambooDelimiterParsingDisabled>
</configuration>
The XML, as you posted it, is not valid, so your code example does not work... A namespace prefix may not be used without a namespace declaration. Furthermore, your example is missing the closing Header tag...
I corrected this...
DECLARE @yourTbl TABLE(ID INT, YourXML NTEXT);
INSERT INTO @yourTbl VALUES
(1,N'<my:Header xmlns:my="DummyUrl">
<my:Requestor>Mehrlein, Roswitha</my:Requestor>
<my:RequestorUserName>SJM\MehrlR01</my:RequestorUserName>
<my:RequestorEmail>RMehrlein@SJM.com</my:RequestorEmail>
<my:HRContact>Roswita Mehrlein, Beatrice Porta</my:HRContact>
<my:Entity>SJM Germany</my:Entity>
<my:Department>HR/Administration</my:Department>
<my:PositionTitle>Sales Representative</my:PositionTitle>
<my:JobDescription>x0lGQRQAAAABAAAAAAAAAAAeAQAyAAAAVgBAAAAA=</my:JobDescription>
<my:PositionDepartment>Sales</my:PositionDepartment>
</my:Header>');
--Lazy approach
SELECT ID
,CAST(CAST(YourXml AS NVARCHAR(MAX)) AS XML).value(N'(//*:PositionDepartment)[1]','nvarchar(max)')
FROM @yourTbl;
--explicit approach
WITH XMLNAMESPACES('DummyUrl' AS my)
SELECT ID
,CAST(CAST(YourXml AS NVARCHAR(MAX)) AS XML).value(N'(/my:Header/my:PositionDepartment)[1]','nvarchar(max)')
FROM @yourTbl;
Some Background
If possible, you should not store XML in any format other than XML, and furthermore you should avoid NTEXT, as it has been deprecated since SQL Server 2005!
You have to cast NTEXT to NVARCHAR(MAX) first, then cast the result to XML. The second cast will break if the XML is not valid. That means: if the XML really is the way you posted it, this cannot work!
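To see which rows would break on that second cast without aborting the whole query, TRY_CAST can be used for the XML step. This is a sketch under assumptions: TRY_CAST requires SQL Server 2012 or later, and the table variable @Demo with its two rows is invented for illustration.

```sql
-- Hypothetical sample: row 1 is well-formed, row 2 has a missing closing tag.
DECLARE @Demo TABLE (ID INT, Doc NTEXT);
INSERT INTO @Demo VALUES
 (1, N'<a><b>ok</b></a>'),
 (2, N'<a><b>broken</a>');

-- NTEXT has to go through NVARCHAR(MAX) first; TRY_CAST then yields NULL
-- instead of raising an error when the string is not valid XML.
SELECT ID,
       TRY_CAST(CAST(Doc AS NVARCHAR(MAX)) AS XML) AS DocXml
FROM @Demo;
```

Rows where DocXml comes back NULL although Doc is not NULL are the ones that need the string-based treatment or a fix at the source.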
UPDATE: String-based approach, if XML does not work
If you cannot cast this to XML you might try this
--String based
WITH Casted AS
(
SELECT ID
,CAST(YourXML AS NVARCHAR(MAX)) AS TheXmlAsString
FROM @yourTbl
)
,WithPosition AS
(
SELECT Casted.*
,CHARINDEX(N'<my:PositionDepartment>',TheXmlAsString) + LEN(N'<my:PositionDepartment>') AS FirstLetter
FROM Casted
)
SELECT ID
,SUBSTRING(TheXmlAsString,FirstLetter,CHARINDEX('<',TheXmlAsString,FirstLetter)-FirstLetter)
FROM WithPosition
UPDATE 2
According to your edit, the following returns a NULL value. This is good, because it shows that the cast was successful.
SELECT
CAST(
REPLACE(CAST(de.TRIGGERS_XML_DATA AS VARCHAR(MAX)), 'encoding="utf-16"', '')
AS XML).value('(triggers/triggerDefinition/config/item/value)[1]',
'NVARCHAR(max)') as Item, de.ENVIRONMENT_ID
from dbo.DEPLOYMENT_ENVIRONMENT as de
where de.ENVIRONMENT_ID = 19234819
Try this (skip namespace with wildcard):
SELECT
CAST(
REPLACE(CAST(de.TRIGGERS_XML_DATA AS VARCHAR(MAX)), 'encoding="utf-16"', '')
AS XML).value('(*:triggers/*:triggerDefinition/*:config/*:item/*:value)[1]', 'NVARCHAR(max)') as Item, de.ENVIRONMENT_ID
from dbo.DEPLOYMENT_ENVIRONMENT as de
where de.ENVIRONMENT_ID = 19234819
And this should be even better:
SELECT
CAST(CAST(de.TRIGGERS_XML_DATA AS NVARCHAR(MAX)) AS XML).value('(*:triggers/*:triggerDefinition/*:config/*:item/*:value)[1]', 'NVARCHAR(max)') as Item, de.ENVIRONMENT_ID
from dbo.DEPLOYMENT_ENVIRONMENT as de
where de.ENVIRONMENT_ID = 19234819
UPDATE 3
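I'd rather cut away the whole XML declaration instead of replacing just the encoding. Note also that the path has to start at the root element <configuration>. Your posted example would go like this: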
DECLARE @DEPLOYMENT_ENVIRONMENT TABLE(ENVIRONMENT_ID INT, TRIGGERS_XML_DATA NTEXT);
INSERT INTO @DEPLOYMENT_ENVIRONMENT VALUES
(19234819,N'<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<configuration xml:space="preserve">
<triggers>
<defined>true</defined>
<triggerDefinition>
<id>1</id>
<name>After successful deployment</name>
<userDescription/>
<isEnabled>true</isEnabled>
<pluginKey>com.atlassian.bamboo.triggers.atlassian-bamboo-triggers:afterSuccessfulDeployment</pluginKey>
<triggeringRepositories/>
<config>
<item>
<key>deployment.trigger.afterSuccessfulDeployment.triggeringEnvironmentId</key>
<value>19234819</value>
</item>
</config>
</triggerDefinition>
</triggers>
<bambooDelimiterParsingDisabled>true</bambooDelimiterParsingDisabled>
</configuration>');
WITH Casted AS
(
SELECT CAST(de.TRIGGERS_XML_DATA AS NVARCHAR(MAX)) AS XmlAsString
FROM @DEPLOYMENT_ENVIRONMENT as de
where de.ENVIRONMENT_ID = 19234819
)
-- Cut everything up to and including the closing "?>" of the declaration;
-- LEN() as the length keeps documents longer than 8000 characters intact.
SELECT CAST(SUBSTRING(XmlAsString,CHARINDEX('?>',XmlAsString)+2,LEN(XmlAsString)) AS XML).value('(/*:configuration/*:triggers/*:triggerDefinition/*:config/*:item/*:value)[1]', 'NVARCHAR(max)') as Item
FROM Casted;

XML Output from SQL Server 2008

I am trying to create an XML output from SQL that has 3 nested statements but have pretty minimal experience in this area. The code I've written is below:
select
replace(replace(replace(
(
select ID as [@ID],
(select cast(Name as int) as [@Name],
(select num as [@Number]
from #tbl_new_claims_export
for xml path('Num'),root('Numbers'), type
)
from #tbl_new_claims_export
for xml path('LineItem'), type
)
from #tbl_new_claims_export
for XML PATH('Line'),ROOT('Lines')
),'><','>'+char(10)+'<'),'<Num', char(9)+'<Num'), '<Num>', char(9)+'<Num>') ;
I am trying to create an output that looks like this:
<Lines>
<Line ID ="1">
<LineItem Name ="Michael"/>
<Numbers>
<Num Number="24"</Num>
</Numbers>
</LineItem>
</Line>
For each Line, I want to see the Line, Name, and Number as shown above. However, it is showing multiple Names under each Line and then repeats the Number below. Can anybody help me troubleshoot this code?
Thanks.
Without sample data with 1:n examples and the expected output, this is like reading a crystal ball...
Anyway, this
SELECT
1 AS [Line/@ID]
,'Michael' AS [LineItem/@Name]
,24 AS [Numbers/Num/@Number]
FOR XML PATH('Lines')
will produce exactly the output you specify:
<Lines>
<Line ID="1" />
<LineItem Name="Michael" />
<Numbers>
<Num Number="24" />
</Numbers>
</Lines>
If you need further help, please specify a minimal and reduced test scenario. Best would be a fiddle or some pasteable code like
DECLARE @tbl TABLE(ID INT, col1 VARCHAR(MAX)/*more columns*/);
INSERT INTO @tbl VALUES (1,'test1')/*more values*/
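If the intention in the question was for LineItem to sit inside Line and for Numbers to sit inside LineItem, the column aliases can carry that nesting as well. The following is a minimal sketch with literal values for a single row; it makes no assumption about the real 1:n data, which would most likely need correlated subqueries with FOR XML ... TYPE.

```sql
-- Alias paths that share a prefix are merged into the same parent element.
-- Attributes (@...) of an element must be listed before its child elements.
SELECT
     1         AS [@ID]                            -- attribute on <Line>
    ,'Michael' AS [LineItem/@Name]                 -- attribute on <Line>/<LineItem>
    ,24        AS [LineItem/Numbers/Num/@Number]   -- nested below <LineItem>
FOR XML PATH('Line'), ROOT('Lines');
```

This yields a single <Line ID="1"> element under <Lines>, with <LineItem Name="Michael"> as its child and <Numbers><Num Number="24" /></Numbers> nested inside LineItem.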