XPath doesn't provide proper tag - sql

I'm trying to get tag "" from xml below.
If i execute request like this:
WITH x(col) AS (select'<document xmlns="http://example.com/digital/back/" xmlns:ns2="http://example.com/digital/back/complexId" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="">
<header>
<docId>13a2f29a28b12ecb</docId>
<dt>2018-12-10T11:59:48.112+03:00</dt>
</header>
<pay>
<reqTransfer id="154638">
<source>
<card>
<virtualCardNum>4B74C1EE187</virtualCardNum>
<bsc>VISA</bsc>
</card>
</source>
</reqTransfer>
</pay>
</document>
'::xml)
SELECT xpath('/document/pay/reqTransfer/source/card/bsc/text()', col) AS bsc
FROM x;
I get {}, but if I relpace the document start tag
<document xmlns="http://example.com/digital/back/" xmlns:ns2="http://example.com/digital/back/complexId" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="">
with <document> or even <document xmlns="">, I get { VISA } - that is right.
What should I do to replace <document xmlns="..."> with <document> or get { VISA } without replacement?

If you are working with XML namespaces, they are worth mentioning in your Xpath queries too, i.e. use
SELECT xpath('/d:document/d:pay/d:reqTransfer/d:source/d:card/d:bsc/text()', col,
ARRAY[ARRAY['d', 'http://example.com/digital/back/']]) AS bsc
http://sqlfiddle.com/#!17/9eecb/24719
See also:
how to ignore namespaces with XPath

Related

Create Smaller XML based on value of element

On Python 3.7, I am looking to create a subset of a XML. For example, the larger XML is:
<data>
<student>
<result>
<grade>A</grade>
</result>
<details>
<name>John</name>
<id>100</id>
<age>16</age>
<email>john#mail.com</email>
</details>
</student>
<student>
<result>
<grade>B</grade>
</result>
<details>
<name>Alice</name>
<id>101</id>
<age>17</age>
<email>alice#mail.com</email>
</details>
</student>
<student>
<result>
<grade>F</grade>
</result>
<details>
<name>Bob</name>
<id>102</id>
<age>16</age>
<email>bob#mail.com</email>
</details>
</student>
<student>
<result>
<grade>A</grade>
</result>
<details>
<name>Hannah</name>
<id>103</id>
<age>17</age>
<email>hannah#mail.com</email>
</details>
</student>
</data>
and am looking for a new XML like below, the condition to create a smaller subset depends on a list of ids in this case 101 and 102. All other student blocks will be deleted.
<data>
<student>
<result>
<grade>B</grade>
</result>
<details>
<name>Alice</name>
<id>101</id>
<age>17</age>
<email>alice#mail.com</email>
</details>
</student>
<student>
<result>
<grade>F</grade>
</result>
<details>
<name>Bob</name>
<id>102</id>
<age>16</age>
<email>bob#mail.com</email>
</details>
</student>
</data>
i.e. The output XML will depend on a list of id's, in this case ['101',102']
This is what I tried:
import lxml.etree
#Original Large XML
tree = etree.parse(open('students.xml'))
root = tree.getroot()
results = root.findall('student')
textnumbers = [r.find('details/id').text for r in results]
print(textnumbers)
required_ids = ['101','102']
wanted = tree.xpath("//student/details/[not(#id in required_ids)]")
for node in unwanted:
node.getparent().remove(node)
#New Smaller XML
tree.write(open('student_output.xml', 'wb'))
But I am getting an expected error of "Invalid expression" for
wanted = tree.xpath("//student/details/[not(#id in required_ids)]")
I know it's a read, but i am fairly new to Python, thanks in advance for your help.
I think you can do it like this:
from lxml import etree as ET
required_ids = ['101','102']
for event, element in ET.iterparse('students.xml'):
if element.tag == 'student' and not(element.xpath('.//id/text()')[0] in required_ids):
element.clear()
element.getparent().remove(element)
if element.tag == 'data':
ET.dump(element)
Instead of the dump you would of course want to write to a file, that is use
if element.tag == 'data':
tree = ET.ElementTree(element)
tree.write('student_output.xml')
Your attempt fails as you can't simply use a Python list variable in XPath and in is not an XPath 1.0 operator.

Find element or attribute value anywhere in XML

I am trying to find the value of an element / attribute regardless of where it exists in the XML.
XML:
<?xml version="1.0" encoding="UTF-8"?>
<cXML payloadID="12345677-12345567" timestamp="2017-07-26T09:11:05">
<Header>
<From>
<Credential domain="1212">
<Identity>01235 </Identity>
<SharedSecret/>
</Credential>
</From>
<To>
<Credential domain="1212">
<Identity>01234</Identity>
</Credential>
</To>
<Sender>
<UserAgent/>
<Credential domain="8989">
<Identity>10678</Identity>
<SharedSecret>Testing123</SharedSecret>
</Credential>
</Sender>
</Header>
<Request deploymentMode="Prod">
<ConfirmationRequest>
<ConfirmationHeader noticeDate="2017-07-26T09:11:05" operation="update" type="detail">
<Total>
<Money>0.00</Money>
</Total>
<Shipping>
<Description>Delivery</Description>
</Shipping>
<Comments>WO# generated</Comments>
</ConfirmationHeader>
<OrderReference orderDate="2017-07-25T15:22:11" orderID="123456780000">
<DocumentReference payloadID="5678-4567"/>
</OrderReference>
<ConfirmationItem quantity="1" lineNumber="1">
<ConfirmationStatus quantity="1" type="detail">
<ItemIn quantity="1">
<ItemID>
<SupplierPartID>R954-89</SupplierPartID>
</ItemID>
<ItemDetail>
<UnitPrice>
<Money currency="USD">0.00</Money>
</UnitPrice>
<Description>Test Descritpion 1</Description>
<UnitOfMeasure>QT</UnitOfMeasure>
</ItemDetail>
</ItemIn>
</ConfirmationStatus>
</ConfirmationItem>
<ConfirmationItem quantity="1" lineNumber="2">
<ConfirmationStatus quantity="1" type="detail">
<ItemIn quantity="1">
<ItemID>
<SupplierPartID>Y954-89</SupplierPartID>
</ItemID>
<ItemDetail>
<UnitPrice>
<Money currency="USD">0.00</Money>
</UnitPrice>
<Description>Test Descritpion 2</Description>
<UnitOfMeasure>QT</UnitOfMeasure>
</ItemDetail>
</ItemIn>
</ConfirmationStatus>
</ConfirmationItem>
</ConfirmationRequest>
</Request>
</cXML>
I want to get the value of the payloadID on the DocumentReference element. This is what I have tried so far:
BEGIN
Declare #Xml xml
Set #Xml = ('..The XML From Above..' as xml)
END
--no value comes back
Select c.value('(/*/DocumentReference/#payloadID)[0]','nvarchar(max)') from #Xml.nodes('//cXML') x(c)
--no value comes back
Select c.value('#payloadID','nvarchar(max)') from #Xml.nodes('/cXML/*/DocumentReference') x(c)
--check if element exists and it does
Select #Xml.exist('//DocumentReference');
I tried this in an xPath editor: //DocumentReference/#payloadID
This does work, but I am not sure what the equivalent syntax is in SQL
Calling .nodes() (like suggested in comment) is an unecessary overhead...
Better try it like this:
SELECT #XML.value('(//DocumentReference/#payloadID)[1]','nvarchar(max)')
And be aware, that XPath starts counting at 1. Your example with [0] cannot work...
--no value comes back
Select c.value('(/*/DocumentReference/#payloadID)[0]','nvarchar(max)') from...

How to handle For each condition in Data weaver: Mule

I'm getting struggle in looping the entries in data weaver. Below is the Input and the expected response.
Not sure how to make loop(I need to get RecordEntry and each entry with 'IndividualEntry') .
Input xml : Record entry tag in input xml is 3, but I might get many. So need to make a loop as dynamic.
<?xml version="1.0" encoding="UTF-8"?>
<Records>
<storenumber />
<calculated>false</calculated>
<subTotal>12</subTotal>
<RecordsEntries>
<RecordEntry>
<deliverycharge>30.0</deliverycharge>
<entryNumber>8</entryNumber>
<Value>true</Value>
</RecordEntry>
<RecordEntry>
<deliverycharge>20.0</deliverycharge>
<entryNumber>7</entryNumber>
<Value>false</Value>
</RecordEntry>
<RecordEntry>
<deliverycharge>1.0</deliverycharge>
<entryNumber>6</entryNumber>
<Value>false</Value>
</RecordEntry>
</RecordsEntries>
</Records>
Expected Response ( I'm expecting the below response)
<?xml version="1.0" encoding="UTF-8"?>
<orders>
<order>
<StoreID />
<Total>false</Total>
<IndividualEntry>
<Number>8</Number>
<DeliverCharge>30.0</DeliverCharge>
</IndividualEntry>
<IndividualEntry>
<Number>7</Number>
<DeliverCharge>20.0</DeliverCharge>
</IndividualEntry>
<IndividualEntry>
<Number>6</Number>
<DeliverCharge>1.0</DeliverCharge>
</IndividualEntry>
</order>
</orders>
My Data weaver Transformation as below
%dw 1.0
%output application/xml
---
{
orders: {
order: {
StoreID:payload.Records.storenumber,
Total: payload.Records.calculated,
IndividualEntry: payload.Records.RecordsEntries.*RecordEntry map {
Number:$.entryNumber,
DeliverCharge:$.deliverycharge
}
}
}
}
Currently I'm getting response as below ( I don't know how to make each Record entry as a IndividualEntry tag, and also here element tag is added in extra which is not required in my case)
<?xml version="1.0" encoding="UTF-8"?>
<orders>
<order>
<StoreID />
<Total>false</Total>
<IndividualEntry>
<element>
<Number>8</Number>
<DeliverCharge>30.0</DeliverCharge>
</element>
<element>
<Number>7</Number>
<DeliverCharge>20.0</DeliverCharge>
</element>
<element>
<Number>6</Number>
<DeliverCharge>1.0</DeliverCharge>
</element>
</IndividualEntry>
</order>
</orders>
Could any one help me in fix this. Thanks in advance.
One way to do it:
orders: {
order: {
StoreID: payload.Records.storenumber,
Total: payload.Records.calculated,
(payload.Records.RecordsEntries.*RecordEntry map {
IndividualEntry: {
Number:$.entryNumber,
DeliverCharge:$.deliverycharge
}
})
}
}
Inside an object when you put an expression between parenthesis that returns an array of key-value pairs it is evaluated and used to fill the object.
See section5.1.3. Dynamic elements in https://developer.mulesoft.com/docs/dataweave

Been trying to write an xquery code that converts a flat xml to 5-level hierarchy xml file

Would like to convert the following flax xml file to 5-level hierarchy xml structure using xquery, so far the all the xquery code i have written did not work.
<data>
<row>
<Year>1999</Year>
<Quarter>8</Quarter>
<Month>5</Month>
<Week>10</Week>
<Flight>6/11/1995</Flight>
<Un>WN</Un>
<Air>193</Air>
</row>
<data>
Out result i would like:
<data>
<row>
<Year>
<value>1999</value>
<Quarter>
<value>8</value>
<Month>
<value>5</value>
<Week>10</Week>
<Flight>6/11/1995</Flight>
<Un>WN</Un>
<Air>193</Air>
</Month>
</Quarter>
</Year>
</row>
<data>
It's unclear what XQuery processor you're using, or the exact schema of the data you will need to process, but here is an example of how to transform the data, assuming each row contains a unique set of entries:
let $data :=
<data>
<row>
<Year>1999</Year>
<Quarter>8</Quarter>
<Month>5</Month>
<Week>10</Week>
<Flight>6/11/1995</Flight>
<Un>WN</Un>
<Air>193</Air>
</row>
</data>
for $row in $data/row
return
element row {
element Year {
element value { $row/Year/data() },
element Quarter {
element value { $row/Quarter/data() },
element Month {
element value { $row/Month/data() },
$row/Week,
$row/Flight,
$row/Un,
$row/Air
}
}
}
}
If you want a single element for each year/quarter/month, use this code:
<data>
<row>{
for $year in //row/Year/data()
return
<Year>{
<value>{ $year }</value>,
for $quarter in //row[Year=$year]/Quarter/data()
return
<Quarter>{
<value>{ $quarter }</value>,
for $row in //row[Year=$year and Quarter=$quarter]
return
<Month>{
<value>{ Month/data() }</value>,
$row/*[not(local-name(.) = ('Month', 'Quarter', 'Year'))]
}</Month>
}</Quarter>
}</Year>
}</row>
</data>

Tips for finding prefixed tags in python lxml?

I am trying to using lxml's ElementTree etree to find a specific tag in my xml document.
The tag looks as follows:
<text:ageInformation>
<text:statedAge>12</text:statedAge>
</text:ageInformation>
I was hoping to use etree.find('text:statedAge'), but that method does not like 'text' prefix.
It mentions that I should add 'text' to the prefix map, but I am not certain how to do it. Any tips?
Edit:
I want to be able to write to the hr4e prefixed tags.
Here are the important parts of the document:
<?xml version="1.0" encoding="utf-8"?>
<greenCCD xmlns="AlschulerAssociates::GreenCDA" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:hr4e="hr4e::patientdata" xsi:schemaLocation="AlschulerAssociates::GreenCDA green_ccd.xsd">
<header>
<documentID root="18c41e51-5f4d-4d15-993e-2a932fed720a" />
<title>Health Records for Everyone Continuity of Care Document</title>
<version>
<number>1</number>
</version>
<confidentiality codeSystem="2.16.840.1.113883.5.25" code="N" />
<documentTimestamp value="201105300211+0800" />
<personalInformation>
<patientInformation>
<personID root="2.16.840.1.113883.3.881.PI13023911" />
<personAddress>
<streetAddressLine nullFlavor="NI" />
<city>Santa Cruz</city>
<state nullFlavor="NI" />
<postalCode nullFlavor="NI" />
</personAddress>
<personPhone nullFlavor="NI" />
<personInformation>
<personName>
<given>Benjamin</given>
<family>Keidan</family>
</personName>
<gender codeSystem="2.16.840.1.113883.5.1" code="M" />
<personDateOfBirth value="NI" />
<hr4e:ageInformation>
<hr4e:statedAge>9424</hr4e:statedAge>
<hr4e:estimatedAge>0912</hr4e:estimatedAge>
<hr4e:yearInSchool>1</hr4e:yearInSchool>
<hr4e:statusInSchool>attending</hr4e:statusInSchool>
</hr4e:ageInformation>
</personInformation>
<hr4e:livingSituation>
<hr4e:homeVillage>Putney</hr4e:homeVillage>
<hr4e:tribe>Oromo</hr4e:tribe>
</hr4e:livingSituation>
</patientInformation>
</personalInformation>
The namespace prefix must be declared (mapped to an URI) in the XML document. Then you can use the {URI}localname notation to find text:statedAge and other elements. Something like this:
from lxml import etree
XML = """
<root xmlns:text="http://example.com">
<text:ageInformation>
<text:statedAge>12</text:statedAge>
</text:ageInformation>
</root>"""
root = etree.fromstring(XML)
ageinfo = root.find("{http://example.com}ageInformation")
age = ageinfo.find("{http://example.com}statedAge")
print age.text
This will print "12".
Another way of doing it:
ageinfo = root.find("text:ageInformation",
namespaces={"text": "http://example.com"})
age = ageinfo.find("text:statedAge",
namespaces={"text": "http://example.com"})
print age.text
You can also use XPath:
age = root.xpath("//text:statedAge",
namespaces={"text": "http://example.com"})[0]
print age.text
I ended up having to use nested prefixes:
from lxml import etree
XML = """
<greenCCD xmlns="AlschulerAssociates::GreenCDA" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:hr4e="hr4e::patientdata" xsi:schemaLocation="AlschulerAssociates::GreenCDA green_ccd.xsd">
<personInformation>
<hr4e:ageInformation>
<hr4e:statedAge>12</hr4e:statedAge>
</hr4e:ageInformation>
</personInformation>
</greenCCD>"""
root = etree.fromstring(XML)
#root = etree.parse("hr4e_patient.xml")
ageinfo = root.find("{AlschulerAssociates::GreenCDA}personInformation/{hr4e::patientdata}ageInformation")
age = ageinfo.find("{hr4e::patientdata}statedAge")
print age.text