Capturing mutliple XML strings with the same node names in SQL - sql

Weaving my way through the XML string world - I've come across this issue I'm having.
So I have two XML string that are super similar to each other - only thing is - is that they have different info inside the nodes.
XML string 1:
<DocumentElement>
<Readings>
<ReadingID>1</ReadingID>
<ReadingDate>2013-12-19T00:00:00-05:00</ReadingDate>
<Sys>120</Sys>
<Dia>80</Dia>
<PageNumber>4</PageNumber>
<AddedDate>2015-04-17T19:30:22.2255116-04:00</AddedDate>
<UpdateDate>2015-04-17T19:30:22.2255116-04:00</UpdateDate>
</Readings>
<Readings>
<ReadingID>2</ReadingID>
<ReadingDate>2014-01-10T00:00:00-05:00</ReadingDate>
<Sys>108</Sys>
<Dia>86</Dia>
<PageNumber>8</PageNumber>
<AddedDate>2015-04-17T19:32:08.5121747-04:00</AddedDate>
<UpdateDate>2015-04-17T19:32:08.5121747-04:00</UpdateDate>
</Readings>
</DocumentElement>
XML String 2:
<DocumentElement>
<Readings>
<ReadingID>1</ReadingID>
<ReadingDate>2013-12-20T00:00:00-05:00</ReadingDate>
<Sys>140</Sys>
<Dia>70</Dia>
<PageNumber>10</PageNumber>
<AddedDate>2015-04-17T19:30:22.2255116-04:00</AddedDate>
<UpdateDate>2015-04-17T19:30:22.2255116-04:00</UpdateDate>
</Readings>
</DocumentElement>
Now this is really just an example - I could have an infinite amount of strings just like this that I would want to pull data from. In this case I have two strings and I'm looking to extract all info on <Sys>, <Dia> and <ReadingDate>
I would also like to display this info in a table like this:
Reading Date | Sys | Dia
----------------------------
12/29/2013 | 120 | 80
----------------------------
1/10/2014 | 108 | 86
----------------------------
12/20/2013 | 140 | 70
I am totally unsure how to proceed with this - any and all help is appreciated!

Assuming those XML's are in an XML column named MyXmlColumn, in a table named MyTable*, you can try something like this :
SELECT
R.value('ReadingDate[1]', 'DATETIME') as ReadingDate
, R.value('Sys[1]', 'INT') as Sys
, R.value('Dia[1]', 'INT') as Dia
FROM MyTable t
CROSS APPLY t.MyXmlColumn.nodes('/DocumentElement/Readings') as readings(R)
SQL Fiddle
*: next time you should've provided these info in the first place

Related

How would I remove specific words or text from KQL query?

I have the following query which provides me with all the data I need exported but I would like text '' removed from my final query. How would I achieve this?
| where type == "microsoft.security/assessments"
| project id = tostring(id),
Vulnerabilities = properties.metadata.description,
Severity = properties.metadata.severity,
Remediations = properties.metadata.remediationDescription
| parse kind=regex id with '/virtualMachines/' Name '/providers/'
| where isnotempty(Name)
| project Name, Severity, Vulnerabilities, Remediations ```
You could use replace_string() (https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/replace-string-function) to replace any substring with an empty string

Parse XML data from a column in SQL Server

I'm trying to put together a report for a badge program. Some the data for custom fields we created are stored in a single column in the table as XML. I need to return a couple of the items in there to the report and am having a hard time getting the proper syntax to get it to parse out.
Aside from the XML, the query itself is simple:
SELECT
Person1.firstName
,Person1.lastName
,Person1.idNumber
,Person1.idNumber2
,Person1.idNumber3
,Person1.status
,Person1.customdata
FROM
Person1
the field "customdata" is the XML field that I need to pull Title, and 2 different dates out of. This is what the XML looks like:
<person1_7:CustomData xmlns:person1_7="http://www.badgepass.com/Person1_7">
<Title>IT Director</Title>
<Gaming_x0020_Level>Level 1</Gaming_x0020_Level>
<Gaming_x0020_Issue_x0020_Date>2021-02-18T12:00:00Z</Gaming_x0020_Issue_x0020_Date>
<Gaming_x0020_Expire_x0020_Date>2022-02-18T12:00:00Z</Gaming_x0020_Expire_x0020_Date>
<Betting_x0020_Level>Level 1</Betting_x0020_Level>
<Betting_x0020_Issue_x0020_Date>2021-02-18T12:00:00Z</Betting_x0020_Issue_x0020_Date>
<Betting_x0020_Expire_x0020_Date>2022-02-18T12:00:00Z</Betting_x0020_Expire_x0020_Date>
<BadgeType>Dual Employee</BadgeType>
<Gaming_x0020_Status>TEMP</Gaming_x0020_Status>
<Betting_x0020_Status>TEMP</Betting_x0020_Status>
</person1_7:CustomData>
I have tried a couple of different methods trying to follow the advice from How to query for Xml values and attributes from table in SQL Server? and then tried declaring a XML namespace with the following query:
WITH XMLNAMESPACES ('http://www.badgepass.com/Person1_7' as X)
SELECT
Person1.firstName
,Person1.lastName
,Person1.idNumber
,Person1.idNumber2
,Person1.idNumber3
,Person1.status
,Person1.customdata.value('(/X:person1_7:customdata/X:Title)[1]', 'varchar(100)') AS Title
FROM
Person1
So far all of my results keep returning "XQuery [Person1.customData.value()]: ")" was expected.
" I'm assuming I have a syntax issue that I'm overlooking as I've never had to manipulate XML with SQL before. Thank you in advance for any help.
Please try the following solution.
Notable points:
XQuery .nodes() method establishes a context so you can access any
XML element right away without long XPath expressions.
Use of the text() in the XPath expressions is for performance
reasons.
SQL
-- DDL and sample data population, start
DECLARE #person1 TABLE (firstname varchar(50), customdata xml);
INSERT INTO #person1(firstname, customdata) VALUES
('John', '<person1_7:CustomData xmlns:person1_7="http://www.badgepass.com/Person1_7">
<Title>IT Director</Title>
<Gaming_x0020_Level>Level 1</Gaming_x0020_Level>
<Gaming_x0020_Issue_x0020_Date>2021-02-18T12:00:00Z</Gaming_x0020_Issue_x0020_Date>
<Gaming_x0020_Expire_x0020_Date>2022-02-18T12:00:00Z</Gaming_x0020_Expire_x0020_Date>
<Betting_x0020_Level>Level 1</Betting_x0020_Level>
<Betting_x0020_Issue_x0020_Date>2021-02-18T12:00:00Z</Betting_x0020_Issue_x0020_Date>
<Betting_x0020_Expire_x0020_Date>2022-02-18T12:00:00Z</Betting_x0020_Expire_x0020_Date>
<BadgeType>Dual Employee</BadgeType>
<Gaming_x0020_Status>TEMP</Gaming_x0020_Status>
<Betting_x0020_Status>TEMP</Betting_x0020_Status>
</person1_7:CustomData>');
-- DDL and sample data population, end
WITH XMLNAMESPACES ('http://www.badgepass.com/Person1_7' as person1_7)
SELECT firstName
, c.value('(Title/text())[1]', 'VARCHAR(100)') AS Title
, c.value('(Gaming_x0020_Issue_x0020_Date/text())[1]', 'DATETIME') GamingIssueDate
FROM #person1
CROSS APPLY customdata.nodes('/person1_7:CustomData') AS t(c);
Output
+-----------+-------------+-------------------------+
| firstName | Title | GamingIssueDate |
+-----------+-------------+-------------------------+
| John | IT Director | 2021-02-18 12:00:00.000 |
+-----------+-------------+-------------------------+

HIVE-SQL_SERVER: HadoopExecutionException: Not enough columns in this line

I have a hive table with the following structure and data:
Table structure:
CREATE EXTERNAL TABLE IF NOT EXISTS db_crprcdtl.shcar_dtls
ID string,
CSK string,
BRND string,
MKTCP string,
AMTCMP string,
AMTSP string,
RLBRND string,
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/on/hadoop/dir/'
-------------------------------------------------------------------------------
ID | CSK | BRND | MKTCP | AMTCMP
-------------------------------------------------------------------------------
782 flatn,grpl,mrtn hnd,mrc,nsn 34555,56566,66455 38900,59484,71450
1231 jikl,bngr su,mrc,frd 56566,32333,45000 59872,35673,48933
123 unsrvl tyt,frd,vlv 25000,34789,33443 29892,38922,36781
Trying to push this data into the SQL Server. But while doing so, getting the following error message:
SQL Error [107090] [S0001]: HdfsBridge::recordReaderFillBuffer - Unexpected error encountered filling record reader buffer: HadoopExecutionException: Not enough columns in this line.
What I tried:
There's an online article where the author has documented similar kind of issues. I tried to implement one of them Looked in Excel and found two columns that had carriage returns but this also doesn't come handy.
Any suggestion/help would be really appreciated. Thanks
If I'm able to understand your issue, then it seems that your , separated data is getting divided into various columns rather one column on the SQL-SERVER, something like:
------------------------------
ID |CSK |BRND |MKTCP |AMTCMP
------------------------------
782 flatn grpl mrtn hnd mrc nsn 345 56566 66455 38900 59484 71450
1231 jikl bngr su mrc frd 56566 32333 45000 59872 35673 48933
123 unsrvl tyt frd vlv 25000 34789 33443 29892 38922 36781
So, if you look on Hive there are only 5 columns. While on SQL-SERVER the same. This I presume as you haven't shared the schema. But if that's the case, then you see that there are more than 5 values are being passed. While the schema definition is only of 5 columns.
So the error is populating.
Refer this Document by MS and try to create a FILE_FORMAT with FIELD_TERMINATOR ='\t',
like:
CREATE EXTERNAL FILE FORMAT <name>
WITH (   
FORMAT_TYPE = DELIMITEDTEXT,   
FORMAT_OPTIONS (        
FIELD_TERMINATOR ='\t',
| STRING_DELIMITER = string_delimiter
| First_Row = integer -- ONLY AVAILABLE SQL DW
| DATE_FORMAT = datetime_format
| USE_TYPE_DEFAULT = { TRUE | FALSE }
| Encoding = {'UTF8' | 'UTF16'} )
);
Hope that helps to resolve to your issue :)

Select on CLOB XML DB2

I have this data as CLOB field in DB2. I am converting the data to char using cast:
SELECT CAST(CLOBColumn as VARCHAR(32000))
FROM Schema.MyTable;
Here is how the result XML comes out from the above:
<TreeList TreeNo="ABC">
<Tree ErrorCode="INVALID_TREE" ErrorDescription="Tree doesn’t exist." TreeID="123456"/>
<Tree ErrorCode="INVALID_TREE" ErrorDescription="Tree doesn’t exist." TreeID="1234567"/>
</TreeList>
And this is how I expect my output
|TreeNo | TreeID | ErrorCode | ErrorDescription
|ABC | 123456 | INVALID_TREE | Tree doesn’t exist
|ABC | 1234567 | INVALID_TREE | Tree doesn’t exist
How do I achieve this?
You need to use the XMLTABLE function which allows to map XML data to a table. You can pass in XML-typed data and it works if you directly parse the CLOB to XML. The SELECT would look like the following (you get the idea):
SELECT x.*
FROM schema.mytable, XMLTABLE(
'$CLOBColumn/TreeList'
COLUMNS
TreeNo VARCHAR(10) PATH '#TreeNo',
TreeID INT PATH 'Tree[#TreeID]',
...
) AS x
;

xmlquery returns all values as one long line instead of separate entities

I am trying to query the telephone numbers from the following xml file
<xmlPhoneEntity>
<TelephoneEntity>
<number>123</number>
</TelephoneEntity>
<TelephoneEntity>
<number>456</number>
</TelephoneEntity>
<TelephoneEntity>
<number>789</number>
</TelephoneEntity>
</xmlPhoneEntity>
This XML is located in my DB - the table looks like this
id customer_id telephone blabla
1 111 xmlfile
2 222 xmlfile
etc
My sql query looks like this -
select xmlserialize(xmlquery('/xmlPhoneEntity/TelephoneEntity/number/text()
passing telephone"
the response: 123456789
I tried using /nodes() instead of /text() and the result is the same.
How do I separate the values ?