Parsing using NSXMLParser - objective-c

I am being returned some XML from a web service. Basically, the xml looks like this:
<response>
<data>
More XML here, but it's escaped with XML entities
</data>
</response>
So, as you can see, I have XML that is valid, but the content of the data tag is escaped with XML entities. What's the best (most efficient) way for me to feed this into the parser?
What I am doing right now is this: when I get the data from the web service, I convert it into an NSString, replace the escaped XML entities with the real characters, convert the result back into NSData, and then feed it into the parser. This doesn't seem like a very good solution, so I was wondering if there's a better way to do it.
Thanks.
Alright, here's the xml that I am getting:
<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/"><s:Header><ActivityId CorrelationId="d39007b5-ee69-41c7-a61d-831b456f9ea3" xmlns="http://schemas.microsoft.com/2004/09/ServiceModel/Diagnostics">aa88d1cd-253c-48d1-abeb-62a880bea806</ActivityId></s:Header><s:Body xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><LoginResponse xmlns="http://MSS"><LoginResult>&lt;LoginInformation&gt;
&lt;User&gt;
&lt;UserID&gt;612&lt;/UserID&gt;
&lt;UserName&gt;Demo User&lt;/UserName&gt;
&lt;Email&gt;mssdev#mss-mail.com&lt;/Email&gt;
&lt;CompanyID&gt;17034&lt;/CompanyID&gt;
&lt;CompanyName&gt;PlanET Demonstration Agency&lt;/CompanyName&gt;
&lt;/User&gt;
&lt;/LoginInformation&gt;</LoginResult></LoginResponse></s:Body></s:Envelope>
As you can see, everything in <LoginResult> is escaped.

That's kind of horrifying. Why don't you just take the contents of that tag (i.e. the data with the entities) and pass it through another NSXMLParser? The first parser will have decoded the entities, so the second parser will be presented with the decoded version of the contents of that tag.
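To make the two-pass idea concrete, here is a minimal sketch in Python rather than Objective-C, using the standard library's ElementTree. The payload in outer_xml is a made-up, shortened stand-in for the real response, so treat the element names as assumptions. With NSXMLParser the shape would be the same: collect the characters reported for the tag, then hand that string to a second parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical payload: the inner document is entity-escaped inside <data>.
outer_xml = (
    "<response><data>"
    "&lt;User&gt;&lt;UserID&gt;612&lt;/UserID&gt;"
    "&lt;UserName&gt;Demo User&lt;/UserName&gt;&lt;/User&gt;"
    "</data></response>"
)

# First pass: the parser decodes the entities, so .text is plain XML text.
outer = ET.fromstring(outer_xml)
inner_xml = outer.find("data").text

# Second pass: parse the decoded text as a document of its own.
inner = ET.fromstring(inner_xml)
user_id = inner.findtext("UserID")
user_name = inner.findtext("UserName")
```

No manual search-and-replace of entities is needed, because the first parser does the unescaping as part of normal parsing.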

Exception cx_st_match_element when deserializing XML?

I'm having trouble getting a simple transformation for XML to work in ABAP. I keep getting the exception cx_st_match_element when I try to execute it on a test XML document inside of a report.
I have the following XML that I want to transform into an ABAP internal table:
<?xml version="1.0" encoding="UTF-8"?>
<studenten xmlns="http://www.foo.be/bar/preinschrijvingsflow"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.foo.be/bar/preinschrijvingsflow bar_studenten.xsd">
<student>
<barGuid>1</barGuid>
<familienaam>Doe</familienaam>
<voornaam>John</voornaam>
<geslacht>1</geslacht>
<nationaliteit>BE</nationaliteit>
<geboortedatum>1995-11-18</geboortedatum>
<geboorteplaats>Antwerpen</geboorteplaats>
<email>John.Doe#gmail.com</email>
<straatNummer>Grote Markt 1 bus 0102</straatNummer>
<postcode>1000</postcode>
<gemeente>Brussel</gemeente>
<land>BE</land>
<telefoonnummer>+32123456789</telefoonnummer>
<academiejaar>2021</academiejaar>
</student>
</studenten>
To this end I defined the following simple transformation I called zc_tr_student:
<?sap.transform simple?>
<tt:transform
xmlns="http://www.foo.be/bar/preinschrijvingsflow"
xmlns:tt="http://www.sap.com/transformation-templates"
xmlns:ddic=" http://www.sap.com/abapxml/types/dictionary">
<tt:root name="studenten" type="ddic:ZCTT_bar_STUDENT"/>
<tt:template>
<studenten>
<tt:loop ref=".studenten" name="studenten">
<student>
<barGuid tt:value-ref="$studenten.bar_guid"/>
<familienaam tt:value-ref="$studenten.familienaam"/>
<voornaam tt:value-ref="$studenten.voornaam"/>
<geslacht tt:value-ref="$studenten.geslacht"/>
<nationaliteit tt:value-ref="$studenten.nationaliteit"/>
<geboortedatum tt:value-ref="$studenten.geboortedatum"/>
<geboorteplaats tt:value-ref="$studenten.geboorteplaats"/>
<email tt:value-ref="$studenten.email"/>
<straat_nummer tt:value-ref="$studenten.straat_nummer"/>
<postcode tt:value-ref="$studenten.postcode"/>
<gemeente tt:value-ref="$studenten.gemeente"/>
<land tt:value-ref="$studenten.land"/>
<telefoonnummer tt:value-ref="$studenten.telefoonnummer"/>
<academiejaar tt:value-ref="$studenten.academiejaar"/>
</student>
</tt:loop>
</studenten>
</tt:template>
</tt:transform>
In the tt:value-ref attributes I refer to the field in the DDIC line type of the ABAP internal table that corresponds to the tag in the XML.
If I call this simple transformation from an ABAP report like this:
call transformation zc_tr_student
source xml lv_bxml
result studenten = p_student.
the cx_st_match_element exception is thrown.
I validated both the syntax of the file and its adherence to the schema. The XML file and the XSD file are present in the same directory on the application server. I have no idea why the ST fails as the cx_st_match_element instance does not have any useful information except that it expected a studenten element. That element is clearly present in the XML file as the root element.
I'm inexperienced with defining simple transformations and I can't spot my error myself. Thank you in advance for your help,
Joshua

How to parse simple xml file

>> ? xml
No information on xml
There's parse-xml but it seems to me that it was for Rebol2.
I've searched for xml scripts in rebol.org and found xml-object.r that seemed to me like the most up to date from all searches.
I know about altxml, too, but the examples given are for html.
So, I'd like to ask about my choices if I want to parse and use the information in more than 1 GB of files with this simplified structure:
<?xml version="1.0" encoding="Windows-1252" standalone="yes"?>
<SalesFile xmlns="urn:StandardSalesFile-1.0">
<Header>
<SalesFileVersion>1.01</SalesFileVersion>
<DateCreation>2014-04-30</DateCreation>
</Header>
<SalesInvoices>
<Invoice>
<InvoiceNo>INV 1/1</InvoiceNo>
<DocumentStatus>
<InvoiceStatus>N</InvoiceStatus>
<InvoiceStatusDate>2014-01-03T17:57:59</InvoiceStatusDate>
</DocumentStatus>
</Invoice>
<Invoice>
<InvoiceNo>INV 2/1</InvoiceNo>
<DocumentStatus>
<InvoiceStatus>N</InvoiceStatus>
<InvoiceStatusDate>2014-01-03T17:59:12</InvoiceStatusDate>
</DocumentStatus>
</Invoice>
</SalesInvoices>
</SalesFile>
Is Rebol3 going to have a parse-xml tool? Should I use xml-object? If so, how? It's still beyond my novice level of the language. Any other options?
There is also a Rebol 3 library by Christopher Ross-Gill called alt-xml.
http://www.ross-gill.com/page/XML_and_REBOL
This can translate the XML to either a block! or object! representation.
Your question states that these XML files are large and may not fit in main memory. I would suggest that creating 1GB XML files is not best practice as many parsers, including this one, do attempt to load the files into memory.
To support this you will have to chunk the files yourself by using open on the file and copy/part chunks out of the file. This is a bit messy, but it will work.
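As a rough illustration of the chunking idea, here is a sketch in Python rather than Rebol. It reads the file in fixed-size pieces and feeds them to an incremental parser, so the whole document is never held as one string. The function name iter_invoice_numbers and the chunk size are assumptions made for the example. The in-memory sample stands in for a real file opened in binary mode.

```python
import io
import xml.etree.ElementTree as ET

def iter_invoice_numbers(stream, chunk_size=64 * 1024):
    """Yield each InvoiceNo while reading the stream in fixed-size chunks."""
    ns = "{urn:StandardSalesFile-1.0}"
    parser = ET.XMLPullParser(events=("end",))
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        parser.feed(chunk)
        # Handle every element that was completed by this chunk.
        for _event, elem in parser.read_events():
            if elem.tag == ns + "Invoice":
                yield elem.findtext(ns + "InvoiceNo")
                elem.clear()  # drop the finished invoice's children
    parser.close()

# Tiny stand-in for a large file; a small chunk size forces several reads.
sample = io.BytesIO(
    b'<SalesFile xmlns="urn:StandardSalesFile-1.0"><SalesInvoices>'
    b"<Invoice><InvoiceNo>INV 1/1</InvoiceNo></Invoice>"
    b"<Invoice><InvoiceNo>INV 2/1</InvoiceNo></Invoice>"
    b"</SalesInvoices></SalesFile>"
)
numbers = list(iter_invoice_numbers(sample, chunk_size=16))
```

The emptied Invoice elements stay attached to their parent, so memory still grows a little with each invoice, just far more slowly than with a full in-memory tree.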
One way to make this cleaner is to use parse as per HostileFork's suggestion and modify the series as you parse it. Parse is very flexible in this regard.
Ideally parse would be able to work directly on port! objects, but this is only a future wish list item at the moment.
Do you really need to deal with the XML file as structure? If not, have you considered just using PARSE?
(Warning: the following is untested, I'm just presenting the concept.)
Invoices: copy []
parse my-doc [
<?xml version="1.0" encoding="Windows-1252" standalone="yes"?>
thru <SalesFile xmlns="urn:StandardSalesFile-1.0">
thru <Header>
thru <SalesFileVersion> copy SalesFileVersion to </SalesFileVersion>
</SalesFileVersion>
thru <DateCreation> copy DateCreation to </DateCreation>
</DateCreation>
thru </Header>
thru <SalesInvoices>
any [
thru <Invoice>
(Invoice: make object! [No: Status: StatusDate: none])
thru <InvoiceNo> copy InvoiceNo to </InvoiceNo>
</InvoiceNo>
(Invoice/No: InvoiceNo)
thru <DocumentStatus>
thru <InvoiceStatus> copy InvoiceStatus to </InvoiceStatus>
</InvoiceStatus>
(Invoice/Status: InvoiceStatus)
thru <InvoiceStatusDate> copy InvoiceStatusDate to </InvoiceStatusDate>
</InvoiceStatusDate>
(Invoice/StatusDate: InvoiceStatusDate)
thru </DocumentStatus>
thru </Invoice>
(append Invoices Invoice)
]
thru </SalesInvoices>
thru </SalesFile>
to end
]
If you know you have well-formed XML and don't want a dependency on a library for processing clunky-ol' XML, Rebol can get pretty far and clear with PARSE. As TAG! is just a subclass of string, you can make things look relatively literate. And it's much more lightweight to just work with the strings.
Though if structural manipulations are required, you'll need something that makes a DOM. Altxml is the go-to right now, AFAIK.
(Hmm...I had a name for the pattern copy x to <foo> <foo> that escapes me at the moment, but this is a good case for it.)
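For readers who don't know Rebol, the "skip thru the opening tag, copy up to the closing tag" pattern can be approximated in Python with plain string searches. This is only a sketch under the same assumption as above (well-formed input with no attributes on the tags of interest). The helper name between and the shortened sample document are made up for the illustration.

```python
def between(text, open_tag, close_tag, start=0):
    """Return (content, next_position) for the first open_tag...close_tag
    pair at or after start, or (None, start) if no complete pair is found."""
    begin = text.find(open_tag, start)
    if begin == -1:
        return None, start
    begin += len(open_tag)
    end = text.find(close_tag, begin)
    if end == -1:
        return None, start
    return text[begin:end], end + len(close_tag)

# Shortened stand-in for the sales file from the question.
doc = (
    "<SalesInvoices>"
    "<Invoice><InvoiceNo>INV 1/1</InvoiceNo>"
    "<InvoiceStatus>N</InvoiceStatus></Invoice>"
    "<Invoice><InvoiceNo>INV 2/1</InvoiceNo>"
    "<InvoiceStatus>N</InvoiceStatus></Invoice>"
    "</SalesInvoices>"
)

invoices = []
pos = 0
while True:
    body, pos = between(doc, "<Invoice>", "</Invoice>", pos)
    if body is None:
        break
    number, _ = between(body, "<InvoiceNo>", "</InvoiceNo>")
    status, _ = between(body, "<InvoiceStatus>", "</InvoiceStatus>")
    invoices.append({"No": number, "Status": status})
```

Like the PARSE version, this never builds a tree. It only pulls out the strings it is told to look for.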
Another option is %Rebol-Dom.r or %rebol-dom-mdlparser.r.
If you're willing to use Rebol2 with parse to seek thru to the node name and then copy a chunk of data, you could feed that to Rebol-Dom.r getnodename "salesInvoice" and append that node element to a block repeatedly.

performing calculations with values in xml CLOB which has identical tags using sql

I've got a table (event_archive), and one of its columns (event_xml) holds CLOB data in XML format, as below. Is there a way to use SQL to sum the values of the "xx" tags? Please help, as I'm completely baffled. Even simply extracting the values is a problem, as there are two "xx" tags within the same root. Thanks in advance.
<?xml version="1.0" encoding="UTF-8"?>
<event type="CALCULATION">
<source_id>INTERNAL</source_id>
<source_participant/>
<source_role/>
<source_start_pos>1</source_start_pos>
<destination_participant/>
<destination_role/>
<event_id>123456</event_id>
<payload>
<cash_point reference="abc12345">
<adv_start>20120907</adv_start>
<adv_end>20120909</adv_end>
<conf>1234</conf>
<profile>3</profile>
<group>A</group>
<patterns>
<pattern id="00112">
<xx>143554.1</xx>
<yyy>96281.6</yyy>
<adv>875</adv>
</pattern>
<pattern id="00120">
<xx>227606.1</xx>
<yyy>97539.8</yyy>
<adv>18181</adv>
</pattern>
</patterns>
</cash_point>
</payload>
</event>
Different databases handle XML differently. There's no standard way of dealing with XML payloads via raw, standard SQL. So you'll need to look at your actual DB implementation to find out what support they have.
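If the database at hand turns out to have no usable XML functions, one database-independent fallback is to fetch the CLOB and do the sum in application code. This is a different technique from the SQL the question asks for. The sketch below is in Python and assumes the column value has already been read into a string. The variable event_xml holds a trimmed copy of the sample document. Decimal is used so the sum is not affected by floating-point rounding.

```python
from decimal import Decimal
import xml.etree.ElementTree as ET

# Trimmed copy of the sample event; in practice this would be the CLOB text.
event_xml = """<event type="CALCULATION">
  <payload>
    <cash_point reference="abc12345">
      <patterns>
        <pattern id="00112"><xx>143554.1</xx><yyy>96281.6</yyy></pattern>
        <pattern id="00120"><xx>227606.1</xx><yyy>97539.8</yyy></pattern>
      </patterns>
    </cash_point>
  </payload>
</event>"""

root = ET.fromstring(event_xml)

# iter() visits every <xx> element, however many patterns there are.
xx_values = [Decimal(node.text) for node in root.iter("xx")]
total = sum(xx_values, Decimal(0))
```

Having two "xx" tags under the same root is not a problem here, because each one is visited as its own element.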

XML Import with "alternate" form or xml formatting

I have successfully imported an XML file, parsing elements into table attributes, using this XML data formatting:
<PN>
<guid>aaaa</guid>
<dataInput>0</dataInput>
<deleted>false</deleted>
<customField1></customField1>
<customField2></customField2>
<customField3></customField3>
<description></description>
<name>name1</name>
<ccid>CC007814</ccid>
<productIds>bbbb</productIds>
</PN>
but it errors when I input XML in this format:
<PN guid="aaaa"
deleted="false"
customField1=""
customField2=""
customField3=""
description=""
modified="2010-10-20T00:00:00.001"
created="2010-05-20T18:07:10.416"
name="name1"
ccid="CC006035"
productIds="bbbb"/>
Is this latter form usable? Any help would be appreciated. Thanks.
It's usable, but you're looking at the difference between using tags (your first example) and attributes (your second example). Your processing is slightly different.
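To show what "slightly different" means in practice, here is a small Python sketch (the question doesn't say which import tool is in use, so this is only an illustration). In the first form the values are the text of child elements. In the second they are attributes of the PN tag itself. The function name read_pn and the shortened samples are assumptions.

```python
import xml.etree.ElementTree as ET

# Shortened versions of the two forms from the question.
element_form = "<PN><guid>aaaa</guid><name>name1</name><ccid>CC007814</ccid></PN>"
attribute_form = '<PN guid="aaaa" name="name1" ccid="CC006035"/>'

def read_pn(xml_text):
    """Return a PN record as a dict, whichever of the two forms is used."""
    pn = ET.fromstring(xml_text)
    record = dict(pn.attrib)       # attribute form: values live on the tag
    for child in pn:               # element form: values live in child tags
        record[child.tag] = child.text or ""
    return record

from_elements = read_pn(element_form)
from_attributes = read_pn(attribute_form)
```

An importer written only for the first form walks the child elements and finds none in the second form, which is one plausible reason for the error.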

Import Xml nodes as Xml column with SSIS

I'm trying to use the XML Source to shred an XML source file; however, I do not want the entire document shredded into tables. Rather, I want to import the XML nodes into rows of XML.
A simplified example would be to import the document below into a table called "people" with a column called "person" of type "xml". Looking at the XML Source, it seems suited to shredding the source XML into multiple records, which is not quite what I'm looking for.
Any suggestions?
<people>
<person>
<name>
<first>Fred</first>
<last>Flintstone</last>
</name>
<address>
<line1>123 Bedrock Way</line1>
<city>Drumheller</city>
</address>
</person>
<person>
<!-- more of the same -->
</person>
</people>
I didn't think that SSIS 2005 supported the XML datatype at all. I suppose it "supports" it as DT_NTEXT.
In any case, you can't use the XML Source for this purpose. You would have to write your own. That's not actually as hard as it sounds. Base it on the examples in Books Online. The processing would consist of moving to the first child node, then calling XmlReader.ReadSubtree to return a new XmlReader over just the next <person/> element. Then use your favorite XML API to read the entire <person/>, convert the resulting XML to a string, and pass it along down the pipeline. Repeat for all <person/> nodes.
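The loop described above (take each <person/> in turn, turn it back into a string, pass it on) looks roughly like this. The sketch is in Python with ElementTree rather than .NET's XmlReader, so it shows only the shape of the processing, not SSIS component code. The second person in people_xml is invented sample data, and rows stands in for the pipeline output.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the second person is invented for the illustration.
people_xml = """<people>
  <person><name><first>Fred</first><last>Flintstone</last></name></person>
  <person><name><first>Wilma</first><last>Flintstone</last></name></person>
</people>"""

root = ET.fromstring(people_xml)

rows = []
for person in root.findall("person"):
    person.tail = None  # drop the whitespace that followed the element
    # Serialize just this subtree; each string becomes one row's XML value.
    rows.append(ET.tostring(person, encoding="unicode"))
```

Each entry in rows is a complete, standalone <person> document that could be written to an XML-typed column.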
Could you perhaps change your xml output so that the content of person is seen as a string? Use escape chars for the <>.
You could use a script task to parse it as well, I'd imagine.