How to filter data from XML content using XPath queries to create a temporary table - sql

I am trying to write a SQL query that uses XPath to filter the data I need and put it in a temporary table.
Example:
<superStarsDoc>
<names>
<starname>
<preferredname>pref</preferredname>
<firstNm>Bradd</firstNm>
<lastNm>Pitt</lastNm>
</starname>
</names>
</superStarsDoc>
and I am trying to get something like this, but it is not working:
with data(firstName,lastName) as
(
unnest(xpath('/superStarsDoc/names/starname/firstNm[@firstNm="Bradd"]/text()',
(select xmlparse(document superstar_doc))))::text as firstName
,unnest(xpath('/superStarsDoc/names/starname/lastNm[@lastNm="Pitt"]lastNm="/text()',
(select xmlparse(document superstar_doc))))::text as lastName
from dbname.superstartable
)
I searched for a solution but did not find anything specific to my requirement; I don't have any attribute that points to that record exactly.
I tried the solution from the following question, but it does not work; I get a syntax error:
XPath 1.0 to find if an element's value is in a list of values
Note: I typed the code here because I cannot copy and paste my code exactly, so please excuse any typos.

You should probably fix your XPath. The element has no attributes, so compare its own value with "." instead:
/superStarsDoc/names/starname/firstNm[.="Bradd"]/text()
/superStarsDoc/names/starname/lastNm[.="Pitt"]/text()
Generic code:
with superstartable(superstar_doc) as (
values (
'<?xml version="1.0" encoding="UTF-8"?>
<superStarsDoc>
<names>
<starname>
<preferredname>pref</preferredname>
<firstNm>Bradd</firstNm>
<lastNm>Pitt</lastNm>
</starname>
</names>
</superStarsDoc>
'::xml)
)
SELECT
(xpath('/superStarsDoc/names/starname/firstNm[.="Bradd"]/text()', superstar_doc))[1] as "first-name",
(xpath('/superStarsDoc/names/starname/lastNm[.="Pitt"]/text()', superstar_doc))[1] as "last-name"
from superstartable
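Since the goal is a temporary table, the same XPath can feed a CREATE TEMP TABLE ... AS statement. The following is a minimal sketch, assuming PostgreSQL. The table name star_names and its column names are made up for illustration, and the inline sample row stands in for dbname.superstartable. It filters whole starname elements with a single predicate, so both conditions apply to the same person:

```sql
-- Sketch only: star_names, first_name and last_name are hypothetical names.
CREATE TEMP TABLE star_names AS
WITH superstartable(superstar_doc) AS (
    -- Inline sample row standing in for the real table.
    VALUES (
'<superStarsDoc>
<names>
<starname>
<preferredname>pref</preferredname>
<firstNm>Bradd</firstNm>
<lastNm>Pitt</lastNm>
</starname>
</names>
</superStarsDoc>'::xml)
)
SELECT
    -- Each "star" value is one <starname> element, so the path starts at /starname.
    (xpath('/starname/firstNm/text()', star))[1]::text AS first_name,
    (xpath('/starname/lastNm/text()', star))[1]::text AS last_name
FROM superstartable,
     -- Keep only the <starname> elements whose children match both values.
     unnest(xpath('/superStarsDoc/names/starname[firstNm="Bradd" and lastNm="Pitt"]',
                  superstar_doc)) AS star;

SELECT * FROM star_names;
```

With the sample document this should leave one row in star_names. Documents with no matching starname contribute no rows, because unnest of an empty array yields nothing.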

Related

Snowflake Get value from XML column

I am working in Snowflake
I need a specific value from XML
SELECT data_xml,REGEXP_SUBSTR(data_xml,'<pinLessNetworkBin>(.*?)(</pinLessNetworkBin>)',3) as network
FROM "DW"."DB"."TABLE"
My results for now
<pinLessNetworkBin>STAR</pinLessNetworkBin>
I just need the value inside the element.
Here is the XML:
<?xml version="1.0" encoding="UTF-8"?>
<ns0:FundingSource xmlns:ns0="www.url.com/be/example/payments/model/Concepts/FundingSource" Id="12887819260" extId="">
<id>3939</id>
<pinLessNetworkBin>STAR</pinLessNetworkBin>
</ns0:FundingSource>
How can I get that value?
Regards
The contents of an XML object are retrieved via GET(object, '$'), so for your regex result GET(parse_xml(network), '$') will give you the content. See GET.
Or, better, retrieve the pinLessNetworkBin element directly via XMLGET:
SELECT data_xml,
XMLGET(parse_xml(data_xml), 'pinLessNetworkBin') as pinLessNetworkBin
FROM "DW"."DB"."TABLE"
This gives you <pinLessNetworkBin>STAR</pinLessNetworkBin>, so you then need to fetch the contents:
SELECT data_xml,
get(XMLGET(parse_xml(data_xml), 'pinLessNetworkBin'), '$') as pinLessNetworkBin
FROM "DW"."DB"."TABLE"
This should give you 'STAR'.
See PARSE_XML.
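Put together, a self-contained check might look like the sketch below. It inlines a copy of the sample document (without the XML declaration) instead of reading from the table, and the cast to VARCHAR is an addition that turns the VARIANT returned by GET into a plain string:

```sql
-- Sketch only: the inline literal stands in for the data_xml column.
SELECT GET(
         XMLGET(
           PARSE_XML('<ns0:FundingSource xmlns:ns0="www.url.com/be/example/payments/model/Concepts/FundingSource" Id="12887819260" extId=""><id>3939</id><pinLessNetworkBin>STAR</pinLessNetworkBin></ns0:FundingSource>'),
           'pinLessNetworkBin'),   -- returns the <pinLessNetworkBin> element
         '$')::VARCHAR AS network; -- '$' is the element content
```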

PostgreSQL xpath with namespaces

I would like to know how to use the xpath function in the following example.
The XML is stored in a table called SR_DATA, in the field XMLDATA of type TEXT.
The structure of the XML document is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<modulo modelCodeScheme="DocType" modelCodeSchemeVersion="01" modelCodeValue="TYPE_20a" modelCodeMeaning="SCREENING" group="groupname" type="format" xmlns="http://www.expr.com/2008/FMSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<AAAAA modelCodeScheme="MAM" modelCodeSchemeVersion="1" modelCodeValue="AN_MAM_6" modelCodeMeaning="Family1" tipodato="booleano">
<![CDATA[false]]>
</AAAAA>
<BBBBB modelCodeScheme="MAM" modelCodeSchemeVersion="1" modelCodeValue="AN_MAM_8" modelCodeMeaning="Family2" tipodato="booleano">
<![CDATA[false]]>
</BBBBB>
</modulo>
Let's say I want to read the text of the element named AAAAA, so my query looks like this:
SELECT xpath('/modulo/AAAAA/text()', XMLDATA::xml) AS status
FROM SR_DATA;
My query doesn't raise any error, but the result set is empty. I suppose I have to map the namespaces, but I need a hint on how to do it.
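You need to pass the namespaces to the xpath function. The document declares a default namespace, so its elements only match when they are addressed through a prefix mapped to that namespace URI. The element contains multiple text nodes; you can combine them using the array_to_string function: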
SELECT TRIM(BOTH FROM array_to_string(xpath('/x:modulo/x:AAAAA/text()', XMLDATA::xml, ARRAY[
ARRAY['x', 'http://www.expr.com/2008/FMSchema']
]), ''))
FROM SR_DATA
-- false
Demo on db<>fiddle
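To see the effect of the namespace mapping in isolation, here is a small sketch, assuming PostgreSQL and a cut-down inline copy of the document in place of the real SR_DATA table. It counts the matches for the unprefixed and the prefixed path side by side; the column aliases are made up for illustration:

```sql
-- Sketch only: the CTE stands in for the real SR_DATA table.
WITH sr_data(xmldata) AS (
    VALUES ('<modulo xmlns="http://www.expr.com/2008/FMSchema"><AAAAA><![CDATA[false]]></AAAAA></modulo>'::text)
)
SELECT
    -- No namespace mapping: the path looks for elements in no namespace.
    cardinality(xpath('/modulo/AAAAA', xmldata::xml)) AS matches_without_ns,
    -- Prefix x mapped to the document's default namespace URI.
    cardinality(xpath('/x:modulo/x:AAAAA', xmldata::xml,
                      ARRAY[ARRAY['x', 'http://www.expr.com/2008/FMSchema']])) AS matches_with_ns
FROM sr_data;
```

The unprefixed path should report 0 matches and the prefixed one 1. The prefix itself is arbitrary; only the URI has to match the xmlns declaration.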

Extracting values from XML field with unusual XML using SQL

Hoping someone can help -
The XML format was put together with a very simple syntax, but for some reason I'm struggling to parse it using a standard 'value' type query.
I'm experienced with SQL, but only have limited experience in XML, and after 2 hours of frustration and much Googling, I thought I'd ask for my own sanity!
The data is stored as a text string, so I am converting it to XML before parsing:
<!-- Data config file -->
<Config>
<!-- keys-->
<foo value="bar"/>
<foo1 value="bar1"/>
<big_foo value="bar/bar.com"/>
<other value="f00"/>
</Config>
The query I'm using is:
SELECT
col.value('foo1[0]', 'nvarchar(max)') as VALUE
from
(
select
CAST((SELECT TOP 1 xml_text FROM dbo.xml_lookup)AS XML)
as Col)x
but this returns NULL rather than the expected "bar1".
Any idea where I'm going wrong?
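XQuery positions are 1-based, so foo1[0] selects nothing, and value is an attribute, so it has to be addressed with @value. The proper XPath would be:
col.value('(Config/foo1)[1]/@value', 'nvarchar(max)')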
sql fiddle demo
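If more than one key is needed, the nodes() method can shred the config into rows. This is a sketch, assuming SQL Server and a well-formed copy of the config with a closing Config tag. The variable @xml_text stands in for the xml_text column, and the column aliases are made up for illustration:

```sql
-- Sketch only: @xml_text stands in for dbo.xml_lookup.xml_text.
DECLARE @xml_text nvarchar(max) = N'<!-- Data config file -->
<Config>
<!-- keys-->
<foo value="bar"/>
<foo1 value="bar1"/>
<big_foo value="bar/bar.com"/>
<other value="f00"/>
</Config>';

-- Single value: positions are 1-based and attributes are addressed with @.
SELECT CAST(@xml_text AS xml).value('(/Config/foo1/@value)[1]', 'nvarchar(max)') AS foo1_value;

-- All keys at once: one row per child element of <Config>.
SELECT k.n.value('local-name(.)', 'nvarchar(128)') AS key_name,
       k.n.value('@value', 'nvarchar(max)') AS key_value
FROM (SELECT CAST(@xml_text AS xml) AS col) AS x
CROSS APPLY x.col.nodes('/Config/*') AS k(n);
```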

How to query xml value inside a xml column in SQL server

I have something like the following XML inside the [XMLValue] column of a table called "AlgorithmLog":
<?xml version="1.0" encoding="utf-8"?>
<AdapterInfo xmlns:i="http://www.w3.org/2001/XMLSchema-instance"
xmlns:d1p1="http://schemas.datacontract.org/2004/07/Adapters.Adapter.CloudTrader"
xmlns="http://schemas.datacontract.org/2004/07/Adapters.Adapter"
i:type="d1p1:AlgorithmStatusReport">
<SequenceNumber>0</SequenceNumber>
<TrackingGuid i:nil="true" />
<d1p1:Broker>Default</d1p1:Broker>
...
<d1p1:XMLValue>&lt;?xml version="1.0"?&gt;&lt;int xmlns="http://schemas.microsoft.com/2003/10/Serialization/"&gt;1900&lt;/int&gt;</d1p1:XMLValue>
</AdapterInfo>
and I want to get the value "1900" inside the node <d1p1:XMLValue>
So here is my query:
WITH XMLNAMESPACES('http://schemas.datacontract.org/2004/07/Adapters.Adapter' AS x,
'http://schemas.datacontract.org/2004/07/Adapters.Adapter.CloudTrader' As p,
'http://schemas.microsoft.com/2003/10/Serialization/'as w)
SELECT
XMLValue.query('(/x:AdapterInfo/p:XMLValue/w:int)[1]')AS [XMLVaule]
FROM AlgorithmLog
But it returns nothing.
Could anyone tell me what I did wrong or how I can do it?
Thank you.
Since you have "encoded" XML inside another XML node, and you cannot automatically cast to the XML datatype using the .value() XQuery method, it all gets a bit involved - but this seems to work for me:
;WITH XMLNAMESPACES('http://schemas.datacontract.org/2004/07/Adapters.Adapter' AS x,
'http://schemas.datacontract.org/2004/07/Adapters.Adapter.CloudTrader' As p,
'http://schemas.microsoft.com/2003/10/Serialization/'as w)
SELECT
CAST(XMLValue.value('(/x:AdapterInfo/p:XMLValue)[1]', 'varchar(2000)') AS XML).value('(w:int)[1]', 'int') AS [XMLValue]
FROM AlgorithmLog
WHERE ....... -- use whatever condition makes sense for you here
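The two steps (pull out the encoded string, then cast it to XML and query it) can be tried on their own. This is a sketch, assuming SQL Server and a cut-down copy of the document in which the inner XML is entity-encoded text. The variable @doc and the column aliases are made up for illustration:

```sql
-- Sketch only: @doc stands in for one row of AlgorithmLog.XMLValue.
DECLARE @doc xml = N'<AdapterInfo xmlns:d1p1="http://schemas.datacontract.org/2004/07/Adapters.Adapter.CloudTrader" xmlns="http://schemas.datacontract.org/2004/07/Adapters.Adapter">
<d1p1:XMLValue>&lt;?xml version="1.0"?&gt;&lt;int xmlns="http://schemas.microsoft.com/2003/10/Serialization/"&gt;1900&lt;/int&gt;</d1p1:XMLValue>
</AdapterInfo>';

WITH XMLNAMESPACES('http://schemas.datacontract.org/2004/07/Adapters.Adapter' AS x,
                   'http://schemas.datacontract.org/2004/07/Adapters.Adapter.CloudTrader' AS p,
                   'http://schemas.microsoft.com/2003/10/Serialization/' AS w)
SELECT
    -- Step 1: the element content is plain text that happens to look like XML.
    @doc.value('(/x:AdapterInfo/p:XMLValue)[1]', 'varchar(2000)') AS inner_text,
    -- Step 2: cast that text to XML and read the <int> element from it.
    CAST(@doc.value('(/x:AdapterInfo/p:XMLValue)[1]', 'varchar(2000)') AS xml)
        .value('(/w:int)[1]', 'int') AS inner_value;
```

The original query returned nothing because the outer document has no w:int element; it only has a string that contains the markup.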

Finding the relevant records using XQuery/XPath

I'm very very new to XQUERY/XPATH :) so I could very well be going about this the wrong way. I have a customer object serialized and stored in a database column in the following format.
<Customer xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Addresses>
<AddressBlock>
<AddressLine1>1234 SomeStreet Ave.</AddressLine1>
<City>SomeCity</City>
<State>SomeState</State>
<Zipcode>SomeZip</Zipcode>
</AddressBlock>
<AddressBlock>
<AddressLine1>5678 SomeOtherStreet Ave.</AddressLine1>
<City>SomeOtherCity</City>
<State>SomeOtherState</State>
<Zipcode>SomeOtherZip</Zipcode>
</AddressBlock>
</Addresses>
</Customer>
I'm looking for a way to select this record if AddressLine1 and City in the same AddressBlock contain certain keywords. I have the following statement, which almost does what I'm looking for.
select *
from users
where [UserData].exist('/Customer/Addresses/AddressBlock/AddressLine1/text()[contains(upper-case(.),"SOMESTREET")]')=1
and [UserData].exist('/Customer/Addresses/AddressBlock/City/text()[contains(upper-case(.),"SOMECITY")]')=1
My only problem is that this statement will also return the record if the first AddressBlock matches on AddressLine1 and the second AddressBlock matches on City.
You have to test both conditions in the same XQuery predicate, so that they apply to the same AddressBlock:
select *
from users
where [UserData].exist('/Customer/Addresses/AddressBlock
[contains(upper-case(AddressLine1[1]),"SOMESTREET") and
contains(upper-case(City[1]),"SOMECITY")]')=1
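
To check that the combined predicate really excludes the cross-block case, here is a sketch, assuming SQL Server. The table variable @users and the two shortened sample documents are made up for illustration. Row 1 has street and city in the same AddressBlock; row 2 has them in different blocks:

```sql
-- Sketch only: @users stands in for the real users table.
DECLARE @users TABLE (id int, UserData xml);

INSERT INTO @users (id, UserData) VALUES
(1, N'<Customer><Addresses>
<AddressBlock><AddressLine1>1234 SomeStreet Ave.</AddressLine1><City>SomeCity</City></AddressBlock>
</Addresses></Customer>'),
(2, N'<Customer><Addresses>
<AddressBlock><AddressLine1>1234 SomeStreet Ave.</AddressLine1><City>Elsewhere</City></AddressBlock>
<AddressBlock><AddressLine1>9 Other Rd.</AddressLine1><City>SomeCity</City></AddressBlock>
</Addresses></Customer>');

-- Both conditions sit inside one predicate on AddressBlock,
-- so a single block has to satisfy them together.
SELECT id
FROM @users
WHERE UserData.exist('/Customer/Addresses/AddressBlock
    [contains(upper-case(AddressLine1[1]),"SOMESTREET") and
     contains(upper-case(City[1]),"SOMECITY")]') = 1;
```

Only id 1 should come back. The original pair of exist() calls would also have returned id 2.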