Tips for finding prefixed tags in python lxml? - lxml

I am trying to use lxml's ElementTree etree to find a specific tag in my XML document.
The tag looks as follows:
<text:ageInformation>
<text:statedAge>12</text:statedAge>
</text:ageInformation>
I was hoping to use etree.find('text:statedAge'), but that method does not like the 'text' prefix.
The error mentions that I should add 'text' to the prefix map, but I am not sure how to do that. Any tips?
Edit:
I want to be able to write to the hr4e prefixed tags.
Here are the important parts of the document:
<?xml version="1.0" encoding="utf-8"?>
<greenCCD xmlns="AlschulerAssociates::GreenCDA" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:hr4e="hr4e::patientdata" xsi:schemaLocation="AlschulerAssociates::GreenCDA green_ccd.xsd">
<header>
<documentID root="18c41e51-5f4d-4d15-993e-2a932fed720a" />
<title>Health Records for Everyone Continuity of Care Document</title>
<version>
<number>1</number>
</version>
<confidentiality codeSystem="2.16.840.1.113883.5.25" code="N" />
<documentTimestamp value="201105300211+0800" />
<personalInformation>
<patientInformation>
<personID root="2.16.840.1.113883.3.881.PI13023911" />
<personAddress>
<streetAddressLine nullFlavor="NI" />
<city>Santa Cruz</city>
<state nullFlavor="NI" />
<postalCode nullFlavor="NI" />
</personAddress>
<personPhone nullFlavor="NI" />
<personInformation>
<personName>
<given>Benjamin</given>
<family>Keidan</family>
</personName>
<gender codeSystem="2.16.840.1.113883.5.1" code="M" />
<personDateOfBirth value="NI" />
<hr4e:ageInformation>
<hr4e:statedAge>9424</hr4e:statedAge>
<hr4e:estimatedAge>0912</hr4e:estimatedAge>
<hr4e:yearInSchool>1</hr4e:yearInSchool>
<hr4e:statusInSchool>attending</hr4e:statusInSchool>
</hr4e:ageInformation>
</personInformation>
<hr4e:livingSituation>
<hr4e:homeVillage>Putney</hr4e:homeVillage>
<hr4e:tribe>Oromo</hr4e:tribe>
</hr4e:livingSituation>
</patientInformation>
</personalInformation>

The namespace prefix must be declared (mapped to a URI) in the XML document. Then you can use the {URI}localname notation to find text:statedAge and other elements. Something like this:
from lxml import etree
XML = """
<root xmlns:text="http://example.com">
<text:ageInformation>
<text:statedAge>12</text:statedAge>
</text:ageInformation>
</root>"""
root = etree.fromstring(XML)
ageinfo = root.find("{http://example.com}ageInformation")
age = ageinfo.find("{http://example.com}statedAge")
print(age.text)
This will print "12".
Another way of doing it:
ageinfo = root.find("text:ageInformation",
namespaces={"text": "http://example.com"})
age = ageinfo.find("text:statedAge",
namespaces={"text": "http://example.com"})
print(age.text)
You can also use XPath:
age = root.xpath("//text:statedAge",
namespaces={"text": "http://example.com"})[0]
print(age.text)

I ended up having to use nested prefixes:
from lxml import etree
XML = """
<greenCCD xmlns="AlschulerAssociates::GreenCDA" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:hr4e="hr4e::patientdata" xsi:schemaLocation="AlschulerAssociates::GreenCDA green_ccd.xsd">
<personInformation>
<hr4e:ageInformation>
<hr4e:statedAge>12</hr4e:statedAge>
</hr4e:ageInformation>
</personInformation>
</greenCCD>"""
root = etree.fromstring(XML)
#root = etree.parse("hr4e_patient.xml")
ageinfo = root.find("{AlschulerAssociates::GreenCDA}personInformation/{hr4e::patientdata}ageInformation")
age = ageinfo.find("{hr4e::patientdata}statedAge")
print(age.text)
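Since the edit to the question asks about writing to the hr4e-prefixed tags, here is a minimal sketch of one way to do it, assuming the trimmed-down document from above. The prefix "g" for the default namespace is my own choice and exists only in the lookup dict, not in the document. The fallback import is only there so the sketch also runs where lxml is not installed; the calls used here exist in both modules.

```python
try:
    from lxml import etree
except ImportError:  # assumption: the stdlib module is an acceptable stand-in
    import xml.etree.ElementTree as etree

XML = """
<greenCCD xmlns="AlschulerAssociates::GreenCDA" xmlns:hr4e="hr4e::patientdata">
<personInformation>
<hr4e:ageInformation>
<hr4e:statedAge>12</hr4e:statedAge>
</hr4e:ageInformation>
</personInformation>
</greenCCD>"""

# Prefixes in this dict are local to the search paths; "g" stands in for
# the document's default (unprefixed) namespace.
NS = {"g": "AlschulerAssociates::GreenCDA", "hr4e": "hr4e::patientdata"}

root = etree.fromstring(XML)
age = root.find("g:personInformation/hr4e:ageInformation/hr4e:statedAge",
                namespaces=NS)
age.text = "13"  # writing is just assigning to .text on the found element

# Look the element up again to confirm the change is in the tree
reread = root.find(".//hr4e:statedAge", namespaces=NS)
print(reread.text)  # prints 13
```

To persist the change you would still have to serialize the tree yourself, for example with etree.tostring(root).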

Related

Karate: Match repeating element in xml

I'm trying to match a repeating element in an XML response against a Karate schema.
XML message
* def xmlResponse =
"""
<Envelope>
<Header/>
<Body>
<Response>
<Customer>
<keys>
<primaryKey>1111111</primaryKey>
</keys>
<simplePay>false</simplePay>
</Customer>
<serviceGroupList>
<serviceGroup>
<name>XXXX</name>
<count>1</count>
<parentName>DDDDD</parentName>
<pendingCount>0</pendingCount>
<pendingHWSum>0.00</pendingHWSum>
</serviceGroup>
<serviceGroup>
<name>ZZZZZ</name>
<count>0</count>
<parentName/>
<pendingCount>3</pendingCount>
<pendingHWSum>399.00</pendingHWSum>
</serviceGroup>
</serviceGroupList>
</Response>
</Body>
</Envelope>
"""
I want to match each serviceGroup against the following Karate schema
Given def serviceGroupItem =
"""
<serviceGroup>
<name>##string</name>
<count>##string</count>
<parentName>##string</parentName>
<pendingCount>##string</pendingCount>
<pendingHWSum>##string</pendingHWSum>
</serviceGroup>
"""
This is how I tried
* xml serviceGroupListItems = get xmlResponse //serviceGroupList
* match each serviceGroupListItems == serviceGroupItem
But it doesn't work. Any idea how I can make it work?
You have to match each serviceGroup.
* xml serviceGroupListItems = get xmlResponse //serviceGroupList
* match each serviceGroupListItems.serviceGroupList.serviceGroup == serviceGroupItem.serviceGroup

XPath doesn't provide proper tag

I'm trying to get the <bsc> tag from the XML below.
If I execute a query like this:
WITH x(col) AS (select'<document xmlns="http://example.com/digital/back/" xmlns:ns2="http://example.com/digital/back/complexId" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="">
<header>
<docId>13a2f29a28b12ecb</docId>
<dt>2018-12-10T11:59:48.112+03:00</dt>
</header>
<pay>
<reqTransfer id="154638">
<source>
<card>
<virtualCardNum>4B74C1EE187</virtualCardNum>
<bsc>VISA</bsc>
</card>
</source>
</reqTransfer>
</pay>
</document>
'::xml)
SELECT xpath('/document/pay/reqTransfer/source/card/bsc/text()', col) AS bsc
FROM x;
I get {}, but if I replace the document start tag
<document xmlns="http://example.com/digital/back/" xmlns:ns2="http://example.com/digital/back/complexId" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="">
with <document> or even <document xmlns="">, I get { VISA } - that is right.
What should I do to replace <document xmlns="..."> with <document> or get { VISA } without replacement?
If the document uses XML namespaces, your XPath query has to reference them too. Bind a prefix to the default namespace URI and pass the mapping as the third argument, i.e. use
SELECT xpath('/d:document/d:pay/d:reqTransfer/d:source/d:card/d:bsc/text()', col,
ARRAY[ARRAY['d', 'http://example.com/digital/back/']]) AS bsc
http://sqlfiddle.com/#!17/9eecb/24719
See also:
how to ignore namespaces with XPath
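The behavior is not specific to PostgreSQL. As a rough illustration in Python (standard library only, with a shortened copy of the document, so the trimming and the variable names are my assumptions), an unprefixed path finds nothing while the same path with a bound prefix finds the element:

```python
import xml.etree.ElementTree as ET

# Shortened version of the document from the question
DOC = """<document xmlns="http://example.com/digital/back/">
<pay><reqTransfer id="154638"><source><card>
<virtualCardNum>4B74C1EE187</virtualCardNum>
<bsc>VISA</bsc>
</card></source></reqTransfer></pay>
</document>"""

root = ET.fromstring(DOC)

# "d" is an arbitrary prefix bound to the document's default namespace URI
NS = {"d": "http://example.com/digital/back/"}

# Unprefixed names only match elements in no namespace, so this is empty
plain = root.findall("pay/reqTransfer/source/card/bsc")

# The same path with the prefix bound matches the element
prefixed = root.findall("d:pay/d:reqTransfer/d:source/d:card/d:bsc", NS)

print(len(plain), [e.text for e in prefixed])  # prints: 0 ['VISA']
```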

How to handle For each condition in Data weaver: Mule

I'm struggling to loop over the entries in DataWeave. Below are the input and the expected response.
I'm not sure how to write the loop (I need to take each RecordEntry and emit it as an 'IndividualEntry').
Input XML: there are 3 RecordEntry tags in this input, but there may be many more, so the loop needs to be dynamic.
<?xml version="1.0" encoding="UTF-8"?>
<Records>
<storenumber />
<calculated>false</calculated>
<subTotal>12</subTotal>
<RecordsEntries>
<RecordEntry>
<deliverycharge>30.0</deliverycharge>
<entryNumber>8</entryNumber>
<Value>true</Value>
</RecordEntry>
<RecordEntry>
<deliverycharge>20.0</deliverycharge>
<entryNumber>7</entryNumber>
<Value>false</Value>
</RecordEntry>
<RecordEntry>
<deliverycharge>1.0</deliverycharge>
<entryNumber>6</entryNumber>
<Value>false</Value>
</RecordEntry>
</RecordsEntries>
</Records>
Expected Response ( I'm expecting the below response)
<?xml version="1.0" encoding="UTF-8"?>
<orders>
<order>
<StoreID />
<Total>false</Total>
<IndividualEntry>
<Number>8</Number>
<DeliverCharge>30.0</DeliverCharge>
</IndividualEntry>
<IndividualEntry>
<Number>7</Number>
<DeliverCharge>20.0</DeliverCharge>
</IndividualEntry>
<IndividualEntry>
<Number>6</Number>
<DeliverCharge>1.0</DeliverCharge>
</IndividualEntry>
</order>
</orders>
My DataWeave transformation is as follows:
%dw 1.0
%output application/xml
---
{
orders: {
order: {
StoreID:payload.Records.storenumber,
Total: payload.Records.calculated,
IndividualEntry: payload.Records.RecordsEntries.*RecordEntry map {
Number:$.entryNumber,
DeliverCharge:$.deliverycharge
}
}
}
}
Currently I'm getting the response below. (I don't know how to turn each RecordEntry into its own IndividualEntry tag, and an extra element tag is added that I don't need.)
<?xml version="1.0" encoding="UTF-8"?>
<orders>
<order>
<StoreID />
<Total>false</Total>
<IndividualEntry>
<element>
<Number>8</Number>
<DeliverCharge>30.0</DeliverCharge>
</element>
<element>
<Number>7</Number>
<DeliverCharge>20.0</DeliverCharge>
</element>
<element>
<Number>6</Number>
<DeliverCharge>1.0</DeliverCharge>
</element>
</IndividualEntry>
</order>
</orders>
Could anyone help me fix this? Thanks in advance.
One way to do it:
orders: {
order: {
StoreID: payload.Records.storenumber,
Total: payload.Records.calculated,
(payload.Records.RecordsEntries.*RecordEntry map {
IndividualEntry: {
Number:$.entryNumber,
DeliverCharge:$.deliverycharge
}
})
}
}
Inside an object, when you put an expression between parentheses that returns an array of key-value pairs, it is evaluated and its pairs are used to fill the object.
See section 5.1.3, Dynamic elements, in https://developer.mulesoft.com/docs/dataweave

How to insert XML to SQL

I am trying to load an XML file into SQL Server 2008.
My XML:
<ItemList>
<Section Index="0" Name="cat0">
<Item Index="0" Slot="0" />
<Item Index="1" Slot="0" />
</Section>
<Section Index="1" Name="cat1">
<Item Index="33" Slot="0" />
<Item Index="54" Slot="0" />
</Section>
<Section Index="2" Name="cat2">
<Item Index="55" Slot="0" />
<Item Index="78" Slot="0" />
</Section>
</ItemList>
SQL Column :
Name = Section Name,
Cat = Section Index,
Index = Item Index,
Slot = Item Slot.
My Example :
DECLARE @input XML = 'MY XML file'
SELECT
Name = XCol.value('@Name','varchar(25)'),
Cat = XCol.value('@Index','varchar(25)'),
[Index] = 'Unknown', /* Index from <Item>*/
Slot = 'Unknown' /* Slot from <Item> */
FROM @input.nodes('/ItemList/Section') AS test(XCol)
I don't know how to add values from "Item".
Thank you very much!
You can do it like this:
select
Name = XCol.value('../@Name','varchar(25)'),
Cat = XCol.value('../@Index','varchar(25)'),
[Index] = XCol.value('@Index','varchar(25)'),
Slot = XCol.value('@Slot','varchar(25)')
from
@input.nodes('/ItemList/Section/Item') AS test(XCol)
Key idea: shred the data one level deeper, at /ItemList/Section/Item rather than /ItemList/Section. That way you can read the attributes of Item directly and still reach the attributes of the parent element (Section in your case) by specifying ../@Attribute_Name
A different approach from the previous answer: CROSS APPLY with the child Item nodes:
SELECT
Name = XCol.value('@Name','varchar(25)'),
Cat = XCol.value('@Index','varchar(25)'),
[Index] = XCol2.value('@Index','varchar(25)'),
Slot = XCol2.value('@Slot','varchar(25)')
FROM @input.nodes('/ItemList/Section') AS test(XCol)
CROSS APPLY XCol.nodes('Item') AS test2(XCol2)
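If it helps to see the row shape both queries are aiming for, here is a small sketch of the same flattening in Python rather than T-SQL (standard library only; the shortened sample and the tuple layout Name, Cat, Index, Slot are my assumptions based on the column list in the question). The nested loop plays the role of CROSS APPLY: one output row per Item, repeated with its parent Section's attributes.

```python
import xml.etree.ElementTree as ET

# Shortened version of the XML from the question
XML = """<ItemList>
<Section Index="0" Name="cat0">
<Item Index="0" Slot="0" />
<Item Index="1" Slot="0" />
</Section>
<Section Index="1" Name="cat1">
<Item Index="33" Slot="0" />
</Section>
</ItemList>"""

root = ET.fromstring(XML)

rows = [
    # (Name, Cat, Index, Slot)
    (section.get("Name"), section.get("Index"), item.get("Index"), item.get("Slot"))
    for section in root.findall("Section")  # like nodes('/ItemList/Section')
    for item in section.findall("Item")     # like CROSS APPLY XCol.nodes('Item')
]

for row in rows:
    print(row)
```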

How To Parse XDocument (Ebay) Items (List) Using Visual Basic?

For Each oXElement In oXDocument.Descendants("searchResult")
sTitle = oXElement.Element("title").Value
Next
I have also tried:
For Each oXElement In oXDocument.Elements(searchResults)
sTitle = oXElement.Element("title").Value
Next
I am having trouble getting hold of nodes, and also understanding how you are supposed to work with XDocument nodes.
My ultimate goal is to create an eBay object model from all of the eBay elements and their attributes. For that I need to refer to the XML tags somehow, and this is where I would appreciate your advice or a sample that would let me proceed with parsing this XML response.
Thank you all very much for any help.
PS: I have searched for similar questions and found a few of the same kind, but I still could not get my parsing to work.
<findItemsByProductResponse xmlns="http://www.ebay.com/marketplace/search/v1/services">
<ack>Success</ack>
<version>1.12.0</version>
<timestamp>2013-06-02T22:42:04.500Z</timestamp>
<searchResult count="5">
<item>
<itemId>370821427802</itemId>
<title>
Modern Database Management 11E by Hoffer, Ramesh, Topi 11th (Int'l Edition)
</title>
<globalId>EBAY-US</globalId>
<primaryCategory>
<categoryId>2228</categoryId>
<categoryName>Textbooks, Education</categoryName>
</primaryCategory>
<galleryURL>
http://thumbs3.ebaystatic.com/m/meSAqCRbXecSjZjO1833dWQ/140.jpg
</galleryURL>
<viewItemURL>
http://www.ebay.com/itm/Modern-Database-Management-11E-Hoffer-Ramesh-Topi-11th-Intl-Edition-/370821427802?pt=US_Texbook_Education
</viewItemURL>
<productId type="ReferenceID">143649496</productId>
<paymentMethod>PayPal</paymentMethod>
<autoPay>true</autoPay>
<location>Malaysia</location>
<country>MY</country>
<shippingInfo>
<shippingServiceCost currencyId="USD">0.0</shippingServiceCost>
<shippingType>Free</shippingType>
<shipToLocations>Worldwide</shipToLocations>
<expeditedShipping>true</expeditedShipping>
<oneDayShippingAvailable>false</oneDayShippingAvailable>
<handlingTime>1</handlingTime>
</shippingInfo>
<sellingStatus>
<currentPrice currencyId="USD">54.07</currentPrice>
<convertedCurrentPrice currencyId="USD">54.07</convertedCurrentPrice>
<sellingState>Active</sellingState>
<timeLeft>P20DT10H47M20S</timeLeft>
</sellingStatus>
<listingInfo>
<bestOfferEnabled>false</bestOfferEnabled>
<buyItNowAvailable>false</buyItNowAvailable>
<startTime>2013-05-24T09:25:25.000Z</startTime>
<endTime>2013-06-23T09:29:24.000Z</endTime>
<listingType>StoreInventory</listingType>
<gift>false</gift>
</listingInfo>
<returnsAccepted>true</returnsAccepted>
<condition>
<conditionId>1000</conditionId>
<conditionDisplayName>Brand New</conditionDisplayName>
</condition>
<isMultiVariationListing>false</isMultiVariationListing>
<topRatedListing>true</topRatedListing>
</item>
<item>...</item>
<item>...</item>
<item>...</item>
<item>...</item>
</searchResult>
<paginationOutput>
<pageNumber>1</pageNumber>
<entriesPerPage>5</entriesPerPage>
<totalPages>3</totalPages>
<totalEntries>14</totalEntries>
</paginationOutput>
<itemSearchURL>
http://www.ebay.com/ctg/143649496?LH_BIN=1&_ddo=1&_incaucbin=0&_ipg=5&_pgn=1
</itemSearchURL>
</findItemsByProductResponse>
You have to use an XNamespace instance when querying your XML:
Dim ns = XNamespace.Get("http://www.ebay.com/marketplace/search/v1/services")
Then add it to every Descendants, Elements, Element, Attribute, Attributes, etc. call you make:
For Each oXElement In oXDocument.Descendants(ns + "searchResult")
sTitle = oXElement.Element(ns + "title").Value
Next
For Each oXElement In oXDocument.Elements(ns + searchResults)
sTitle = oXElement.Element(ns + "title").Value
Next
Two things. First, you fell into the trap that catches 90% of the people with problems using LINQ to XML. You forgot the namespace. You can use the following which works in C# or VB:
Dim ns = XNamespace.Get("http://www.ebay.com/marketplace/search/v1/services")
VB also lets you use an Imports statement for an XML namespace, just as you import other .NET namespaces at the top of your file. The advantage of this option is that if you have a schema in your project, you get IntelliSense over the XML structure while building your query.
Imports <xmlns:eb="http://www.ebay.com/marketplace/search/v1/services">
The second issue is that the title element is not a direct child of searchResult; it is nested one level deeper, inside item. Here's a sample that uses the Imports for the namespace. I'm using the VB XML literal syntax for descendants (...) for contrast with anyone giving you a C#-biased answer ;-)
Public Class XmlTest
Public Sub TestXml()
Dim data = <findItemsByProductResponse xmlns="http://www.ebay.com/marketplace/search/v1/services">
<ack>Success</ack>
<version>1.12.0</version>
<timestamp>2013-06-02T22:42:04.500Z</timestamp>
<searchResult count="5">
<item>
<itemId>370821427802</itemId>
<title>
Modern Database Management 11E by Hoffer, Ramesh, Topi 11th (Int'l Edition)
</title>
<globalId>EBAY-US</globalId>
<primaryCategory>
<categoryId>2228</categoryId>
<categoryName>Textbooks, Education</categoryName>
</primaryCategory>
<galleryURL>
http://thumbs3.ebaystatic.com/m/meSAqCRbXecSjZjO1833dWQ/140.jpg
</galleryURL>
<viewItemURL>
http://www.ebay.com/itm/Modern-Database-Management-11E-Hoffer-Ramesh-Topi-11th-Intl-Edition-/370821427802?pt=US_Texbook_Education
</viewItemURL>
<productId type="ReferenceID">143649496</productId>
<paymentMethod>PayPal</paymentMethod>
<autoPay>true</autoPay>
<location>Malaysia</location>
<country>MY</country>
<shippingInfo>
<shippingServiceCost currencyId="USD">0.0</shippingServiceCost>
<shippingType>Free</shippingType>
<shipToLocations>Worldwide</shipToLocations>
<expeditedShipping>true</expeditedShipping>
<oneDayShippingAvailable>false</oneDayShippingAvailable>
<handlingTime>1</handlingTime>
</shippingInfo>
<sellingStatus>
<currentPrice currencyId="USD">54.07</currentPrice>
<convertedCurrentPrice currencyId="USD">54.07</convertedCurrentPrice>
<sellingState>Active</sellingState>
<timeLeft>P20DT10H47M20S</timeLeft>
</sellingStatus>
<listingInfo>
<bestOfferEnabled>false</bestOfferEnabled>
<buyItNowAvailable>false</buyItNowAvailable>
<startTime>2013-05-24T09:25:25.000Z</startTime>
<endTime>2013-06-23T09:29:24.000Z</endTime>
<listingType>StoreInventory</listingType>
<gift>false</gift>
</listingInfo>
<returnsAccepted>true</returnsAccepted>
<condition>
<conditionId>1000</conditionId>
<conditionDisplayName>Brand New</conditionDisplayName>
</condition>
<isMultiVariationListing>false</isMultiVariationListing>
<topRatedListing>true</topRatedListing>
</item>
<item>...</item>
<item>...</item>
<item>...</item>
<item>...</item>
</searchResult>
<paginationOutput>
<pageNumber>1</pageNumber>
<entriesPerPage>5</entriesPerPage>
<totalPages>3</totalPages>
<totalEntries>14</totalEntries>
</paginationOutput>
</findItemsByProductResponse>
For Each el In data...<eb:searchResult>
Console.WriteLine(el...<eb:title>.Value)
Next
End Sub
End Class
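For anyone comparing across languages, the two points above (the namespace, and title being a descendant rather than a direct child of searchResult) can be sketched in Python as well. This uses only the standard library and a heavily shortened copy of the response; the prefix "eb" and the variable names are illustrative assumptions, not part of the eBay API.

```python
import xml.etree.ElementTree as ET

# Heavily shortened version of the response from the question
XML = """<findItemsByProductResponse xmlns="http://www.ebay.com/marketplace/search/v1/services">
<searchResult count="1">
<item>
<itemId>370821427802</itemId>
<title>
Modern Database Management 11E by Hoffer, Ramesh, Topi 11th (Int'l Edition)
</title>
</item>
</searchResult>
</findItemsByProductResponse>"""

NS = {"eb": "http://www.ebay.com/marketplace/search/v1/services"}

root = ET.fromstring(XML)
search_result = root.find("eb:searchResult", NS)

# title is not a direct child of searchResult, so a child lookup finds nothing
direct = search_result.find("eb:title", NS)

# A descendant search (.//) reaches it through item; strip() drops the
# newlines around the text
titles = [t.text.strip() for t in search_result.findall(".//eb:title", NS)]

print(direct)
print(titles)
```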