How to get required XML element from not well formed XML data in SQL server - sql

In my SQL 2008 database table, I have a column named AUTHOR that contains XML data. The XML is not well formed and has data like the following:
<Author>
<ID>172-32-1176</ID>
<LastName>White</LastName>
<FirstName>Johnson</FirstName>
<Address>
<Street>10932 Bigge Rd.</Street>
<City>Menlo Park</City>
<State>CA</State>
</Address>
</Author>
Some rows have all of the above data and some have just one tag:
<ID>172-32-1176</ID>
I want to write a query that returns the ID as a column named identity.
I tried using AUTHOR.query('data(/Author/ID)') as identity, but it fails when the XML does not have an Author node.
Thanks,
Vijay

Have you tried something like /Author/ID|/ID? That is, try the first scenario and, if there is no match, the second.
(Note that the | operator is a set union operator, as described here.)

If nothing "certain" can be assumed about the XML except that a unique ID element contains the required identity value, then the following XPath expression selects the ID element:
//ID
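For reference, a minimal T-SQL sketch of the //ID approach might look like the following. The table variable @Authors and its two sample rows are made up for illustration. The [1] is there because .value() requires an expression that returns at most one node.

```sql
-- Hypothetical sample data: one full <Author> document and one bare <ID> fragment.
DECLARE @Authors TABLE (AUTHOR xml);

INSERT INTO @Authors (AUTHOR) VALUES
    (N'<Author><ID>172-32-1176</ID><LastName>White</LastName></Author>'),
    (N'<ID>172-32-1176</ID>');

-- //ID finds the ID element at any depth; [1] picks the first match
-- so that .value() receives a single node.
SELECT AUTHOR.value('(//ID)[1]', 'varchar(20)') AS AuthorID
FROM @Authors;
```

Both rows should come back with the same ID, whether or not the Author wrapper is present.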

Related

SQL get value from XML in tag, by tag value

I have the following XML:
<Main>
<ResultOutput>
<Name>TEST1</Name>
<Value>D028</Value>
</ResultOutput>
<ResultOutput>
<Name>TEST2</Name>
<Value>Accept</Value>
</ResultOutput>
<ResultOutput>
<Name>TEST3</Name>
<Value />
</ResultOutput>
</Main>
What I want is to get the value of the <Value> tag in SQL.
Basically, I want to get <Value> where <Name> has the value TEST1, for example.
This is what I have at the moment, but it depends on the position of the XML tag:
XMLResponse.value('(Main/ResultOutput/Value)[5]', 'nvarchar(max)')
The best way to do this is not to put extra where .value clauses, but to do it directly in XQuery.
Use [nodename] to filter by a child node; you can even nest such predicates. text() gets you the inner text of the node:
XMLResponse.value('(/Main/ResultOutput[Name[text()="TEST1"]]/Value/text())[1]', 'nvarchar(max)')
Below is an example using the sample XML in your question. You'll need to extend this to add namespace declarations and the proper xpath expressions that may be present in your actual XML as your query attempt suggests.
SELECT ResultOutput.value('Value[1]', 'nvarchar(100)')
FROM @xml.nodes('Main/ResultOutput') AS Main(ResultOutput)
WHERE ResultOutput.value('Name[1]', 'nvarchar(100)') = N'TEST1';
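If the name to look up is not a fixed literal, the same XQuery filter can be driven by a T-SQL variable. The sketch below is an addition to the answers above, not part of them: it uses the sql:variable() function, and the variables @xml and @name are hypothetical.

```sql
-- Sample document copied from the question.
DECLARE @xml xml = N'<Main>
  <ResultOutput><Name>TEST1</Name><Value>D028</Value></ResultOutput>
  <ResultOutput><Name>TEST2</Name><Value>Accept</Value></ResultOutput>
  <ResultOutput><Name>TEST3</Name><Value /></ResultOutput>
</Main>';

-- Hypothetical lookup key supplied from T-SQL.
DECLARE @name nvarchar(100) = N'TEST1';

-- The predicate selects the ResultOutput whose Name matches @name,
-- then takes the text of its Value child.
SELECT @xml.value(
    '(/Main/ResultOutput[Name = sql:variable("@name")]/Value/text())[1]',
    'nvarchar(max)') AS ResultValue;
```

For an empty element such as the Value of TEST3 there is no text node, so the result would be NULL rather than an empty string.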

Export SQL XML field to grid [duplicate]

I have something like the following XML in a column of a table:
<?xml version="1.0" encoding="utf-8"?>
<container>
<param name="paramA" value="valueA" />
<param name="paramB" value="valueB" />
...
</container>
I am trying to get the valueB part out of the XML via T-SQL.
So far I am getting the right node, but now I cannot figure out how to get the attribute.
select xmlCol.query('/container/param[@name="paramB"]') from LogTable
I figured I could just add /@value to the end, but then SQL Server tells me attributes have to be part of a node. I can find a lot of examples for selecting child nodes' attributes, but nothing on sibling attributes (if that is the right term).
Any help would be appreciated.
Try using the .value function instead of .query:
SELECT
xmlCol.value('(/container/param[@name="paramB"]/@value)[1]', 'varchar(50)')
FROM
LogTable
The XPath expression could potentially return a list of nodes, so you need to add a [1] to tell SQL Server to use the first of those entries (and yes, that list is 1-based, not 0-based). As the second parameter, you need to specify the type the value should be converted to; varchar(50) is just a guess here.
Marc
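To try this without the real table, a self-contained sketch along these lines should work. The table variable @LogTable stands in for LogTable and holds one made-up row. Note that attributes are addressed with @ in XPath. I left the <?xml ... encoding="utf-8"?> declaration out of the literal because, as far as I know, SQL Server rejects a utf-8 declaration inside an N'...' (UTF-16) string with an "unable to switch the encoding" error.

```sql
-- Hypothetical stand-in for LogTable with a single sample row.
DECLARE @LogTable TABLE (xmlCol xml);

INSERT INTO @LogTable (xmlCol) VALUES
    (N'<container>
         <param name="paramA" value="valueA" />
         <param name="paramB" value="valueB" />
       </container>');

-- Filter the param element by its name attribute, then read its value attribute.
SELECT xmlCol.value('(/container/param[@name="paramB"]/@value)[1]',
                    'varchar(50)') AS paramB
FROM @LogTable;
```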
Depending on the actual structure of your XML, it may be useful to put a view over it to make it easier to consume using 'regular' SQL, e.g.
CREATE VIEW vwLogTable
AS
SELECT
c.p.value('@name', 'varchar(10)') name,
c.p.value('@value', 'varchar(10)') value
FROM
LogTable
CROSS APPLY xmlCol.nodes('/container/param') c(p)
GO
-- now you can get all values for paramB as...
SELECT value FROM vwLogTable WHERE name = 'paramB'

SQL Date column in xml

StudentID ExamID 09/05/2017 08/05/2017 07/05/2017 06/05/2017 05/05/2017
123 AS12 12
123 AS13 13 23
When I convert the above using "FOR XML PATH, ELEMENTS" in a SQL statement, I get an error:
error:Column name '09/05/2017' contains an invalid XML identifier as
required by FOR XML; '2'(0x0032) is the first character at fault.
Is there any way to get XML in this format:
<row>
<StudentID>123</StudentID>
<LessonID>AS13</LessonID>
<09/05/2017>13</09/05/2017>
<08/05/2017>23</08/05/2017>
<07/05/2017></07/05/2017>
<06/05/2017></06/05/2017>
<05/05/2017></05/05/2017>
</row>
Storing your date-based values as columns of the student table is a very bad design. Whenever you have to add a column in order to add more data, the design is bad... This should be stored in a related side table, with a PIVOT query constructing this output format whenever you need it.
And: Avoid culture-specific date formats!!!
How should one know whether 06/05/2017 is the 6th of May or the 5th of June? Use ISO 8601, like 2017-05-06 (which makes it clear that you mean the 6th of May).
About your question: No, this is impossible!
XML does not allow an element name like '05/05/2017'. A name must start with a letter or an underscore, and several characters, such as the /, are forbidden...
Try to create your XML similar to
<row>
<StudentID>123</StudentID>
<LessonID>AS13</LessonID>
<Marks>
<Mark date="2017-05-09">13</Mark>
<Mark date="2017-05-08">23</Mark>
[... more of them ...]
</Marks>
</row>
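As a rough sketch of how that shape could be produced once the marks live in a side table: the table variable @Marks, its columns, and the sample rows below are all assumptions for illustration, not the asker's real schema.

```sql
-- Hypothetical normalized side table: one row per student, lesson and date.
DECLARE @Marks TABLE (StudentID int, LessonID varchar(10), MarkDate date, Mark int);

INSERT INTO @Marks (StudentID, LessonID, MarkDate, Mark) VALUES
    (123, 'AS12', '20170509', 12),
    (123, 'AS13', '20170509', 13),
    (123, 'AS13', '20170508', 23);

SELECT s.StudentID,
       s.LessonID,
       -- Nested FOR XML: the date becomes an attribute and the mark becomes
       -- the element text; the column alias Marks becomes the wrapper element.
       (SELECT m.MarkDate AS [@date],
               m.Mark     AS [text()]
        FROM @Marks AS m
        WHERE m.StudentID = s.StudentID
          AND m.LessonID  = s.LessonID
        ORDER BY m.MarkDate DESC
        FOR XML PATH('Mark'), TYPE) AS Marks
FROM (SELECT DISTINCT StudentID, LessonID FROM @Marks) AS s
ORDER BY s.LessonID
FOR XML PATH('row');
```

Because the dates are data rather than element names, adding a new exam date only means adding rows, and the date column serializes in ISO 8601 form.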
This error comes down to which characters are allowed in an XML name. Inside the angle brackets (<>), the slash (/) is a special character (it marks closing and empty-element tags), so it cannot be part of an element name, and a name cannot start with a digit either. The XML processor therefore rejects the column name and throws the error.
Additionally, you may want to consider how you model your objects in XML. The first group is the class; the class has many students, the students take many lessons, and each lesson has a grade (or, in this case, it looks like a lesson has many grades, not shown here).
<CLASS>
<STUDENT>
<StudentID>123</StudentID>
<LESSON>
<LessonID>AS12</LessonID>
<DATE>09/05/2017</DATE>
<GRADE>93.00</GRADE>
</LESSON>
<LESSON>
<LessonID>AS12</LessonID>
<DATE>08/05/2017</DATE>
<GRADE>93.00</GRADE>
</LESSON>
</STUDENT>
<STUDENT>
...
</STUDENT>
</CLASS>

FOR XML PATH in SQL server and [text()]

I found an article on the internet:
this url
Then I wrote a query like that and I get the same result.
But when I change AS [text()] to [name],
the result contains XML tags,
like this.
So my question is: what is [text()] in this code?
Thank you.
The other current answers don't explain much about where this is coming from, or just offer links to poorly formatted sites and don't really answer the question.
In many answers around the web for grouping strings there are the copy paste answers without a lot of explanation of what's going on. I wanted to better answer this question because I was wondering the same thing, and also give insight into what is actually happening overall.
tldr;
In short, this is syntax to help transform the XML output when using FOR XML PATH, which uses column names (or aliases) to structure the output. If you name your column text(), the data will be represented as text within the row tag.
<row>
My record's data
</row>
In the examples you see online for how to group strings and concat with , it may not be obvious (except for the fact that your query has that little for xml part) that you are actually building XML with a specific structure (or rather, lack of structure) by using FOR XML PATH (''). The ('') removes the wrapping row tags and just spits out the data.
The deal with AS [text()]
As usual, AS is acting to name or rename the column alias. In this example, you are aliasing this column as [text()]. The []s are simply SQL Server's standard column delimiters, often unneeded, except today since our column name has ()s. That leaves us with text() for our column name.
Controlling the XML Structure with Column Names
When you are using FOR XML PATH you are outputting an XML file and can control the structure with your column names. A detailed list of options can be found here: https://msdn.microsoft.com/en-us/library/ms189885.aspx
An example includes starting your column name with an @ sign, such as:
SELECT color as '@color', name
FROM #favorite_colors
FOR XML PATH
This would move this column's data to an attribute of the current xml row, as opposed to an item within it. You end up with
<row color="red">
<name>tim</name>
</row>
<row color="blue">
<name>that guy</name>
</row>
So then, back to [text()]. This is actually specifying an XPath Node Test. In the context of MS Sql Server, you can learn about this designation here. Basically it helps determine the type of element we are adding this data to, such as a normal node (default), an xml comment, or in this example, some text within the tag.
An example using a few moves to structure the output
SELECT
color as [@color]
,'Some info about ' + name AS [text()]
,name + ' likes ' + color AS [comment()]
,name
,name + ' has some ' + color + ' things' AS [info/text()]
FROM #favorite_colors
FOR XML PATH
Notice we are using a few designations in our column names:
@color: a tag attribute
text(): some text for this row tag
comment(): an xml comment
info/text(): some text in a specific xml tag, <info>
The output looks like this:
<row color="red">
Some info about tim
<!--tim likes red-->
<name>tim</name>
<info>tim has some red things</info>
</row>
<row color="blue">
Some info about that guy
<!--that guy likes blue-->
<name>that guy</name>
<info>that guy has some blue things</info>
</row>
Wrapping it up, how can these tools group and concat strings?
So, with the solutions we see for grouping strings together using FOR XML PATH, there are two key components.
AS [text()]: Writes the data as text, instead of wrapping it in a tag
FOR XML PATH (''): Renames the row tag to '', or rather, removes it entirely
This gives us "XML" (air quotes) output that is essentially just a string.
SELECT name + ', ' AS [text()] -- no 'name' tags
FROM #favorite_colors
FOR XML PATH ('') -- no row tag
returns
tim, that guy,
From there, it's just a matter of joining that data back to the larger dataset from which it came.
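As a hedged sketch of that last step, the usual pattern is a correlated subquery per group, with STUFF trimming the leading separator. The temp table definition and the extra row for 'sue' are assumptions added so the grouping is visible.

```sql
-- Assumed definition and sample rows for the temp table used above.
CREATE TABLE #favorite_colors (name varchar(50), color varchar(20));

INSERT INTO #favorite_colors (name, color) VALUES
    ('tim', 'red'),
    ('that guy', 'blue'),
    ('sue', 'red');

-- One output row per color; the subquery concatenates the matching names.
-- STUFF(..., 1, 2, '') removes the leading ', ' from the concatenated string.
SELECT c.color,
       STUFF((SELECT ', ' + f.name AS [text()]
              FROM #favorite_colors AS f
              WHERE f.color = c.color
              ORDER BY f.name
              FOR XML PATH('')), 1, 2, '') AS names
FROM (SELECT DISTINCT color FROM #favorite_colors) AS c
ORDER BY c.color;

DROP TABLE #favorite_colors;
```

Putting the separator in front of each name and stripping the first one avoids the trailing ", " seen in the simpler query. Keep in mind that characters such as & or < in the data come back entity-encoded with this form.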
When writing the SQL query, remove the alias name and you get plain text.
select name+',' aa from employee for xml path('')
With the alias on the column, the result comes back as XML with aa tags. Without it (aliasing only the outer subquery), you get the text:
select (select name+',' from employee for xml path('')) aa

Import Xml nodes as Xml column with SSIS

I'm trying to use the Xml Source to shred an XML source file however I do not want the entire document shredded into tables. Rather I want to import the xml Nodes into rows of Xml.
A simplified example would be to import the document below into a table called "people" with a column called "person" of type "xml". Looking at the XML Source, it seems suited to shredding the source XML into multiple records, which is not quite what I'm looking for.
Any suggestions?
<people>
<person>
<name>
<first>Fred</first>
<last>Flintstone</last>
</name>
<address>
<line1>123 Bedrock Way</line1>
<city>Drumheller</city>
</address>
</person>
<person>
<!-- more of the same -->
</person>
</people>
I didn't think that SSIS 2005 supported the XML datatype at all. I suppose it "supports" it as DT_NTEXT.
In any case, you can't use the XML Source for this purpose. You would have to write your own source component. That's not actually as hard as it sounds. Base it on the examples in Books Online. The processing would consist of moving to the first child node, then calling XmlReader.ReadSubtree to return a new XmlReader over just the next <person/> element. Then use your favorite XML API to read the entire <person/>, convert the resulting XML to a string, and pass it along down the pipeline. Repeat for all <person/> nodes.
Could you perhaps change your xml output so that the content of person is seen as a string? Use escape chars for the <>.
You could use a script task to parse it as well, I'd imagine.