Using NSXMLParser to parse xml with html inside nodes - objective-c

I am trying to read an XML file that contains embedded HTML, which I want to show in a UIWebView.
I am using NSXMLParser to parse the incoming XML file, and it works like a charm. However, now I want to parse HTML that is included inside the tags.
For instance, this is parsed just fine:
<item>
<letter>a</letter>
<word>word 1</word>
<description>Yada yada</description>
</item>
However this is making the parser crash:
<item>
<letter>a</letter>
<word>word 1</word>
<description><p>Yada</p></description>
</item>
since the parser obviously treats the HTML tags as XML nodes.
How could I change my parsing to handle the embedded HTML?
The code in my class that parses the XML handles these nodes in parser:foundCharacters:
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string{
    // save the characters for the current item...
    if ([currentElement isEqualToString:@"letter"]) {
        [currentLetter appendString:string];
    } else if ([currentElement isEqualToString:@"word"]) {
        [currentWord appendString:string];
    } else if ([currentElement isEqualToString:@"description"]) {
        [currentDescription appendString:string];
    }
}
And this on didEndElement
- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName{
    if ([elementName isEqualToString:@"item"]) {
        // save values to an item, then store that item into the array...
        [item setObject:currentLetter forKey:@"letter"];
        [item setObject:currentWord forKey:@"word"];
        [item setObject:currentDescription forKey:@"description"];
        [stories addObject:[item copy]];
    }
}

Wrap the HTML content of the XML elements in CDATA sections. That way the parser treats it as plain character data instead of markup and does not try to interpret the tags. (Don't confuse CDATA with PCDATA: parsed character data won't help, but CDATA itself should do the trick.)
You could then use a second parser to parse that embedded HTML.
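As a minimal sketch of the CDATA approach, assuming the feed can be changed so that the description reads `<description><![CDATA[<p>Yada</p>]]></description>`: the delegate below implements `parser:foundCDATA:`, which NSXMLParser calls with the raw bytes of each CDATA section. `DescriptionCollector` and `ParseSampleDescription` are hypothetical names used only for this illustration, and the sketch collects every CDATA section in the document rather than only those inside `<description>`.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical delegate for illustration: gathers the text of all CDATA sections.
@interface DescriptionCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableString *html;
@end

@implementation DescriptionCollector
- (instancetype)init {
    if ((self = [super init])) {
        _html = [NSMutableString string];
    }
    return self;
}

// Called once per CDATA section; the block is delivered as UTF-8 encoded data.
- (void)parser:(NSXMLParser *)parser foundCDATA:(NSData *)CDATABlock {
    NSString *chunk = [[NSString alloc] initWithData:CDATABlock
                                            encoding:NSUTF8StringEncoding];
    if (chunk) {
        [self.html appendString:chunk];
    }
}
@end

// Parses a small sample document and returns the HTML found inside CDATA.
static NSString *ParseSampleDescription(void) {
    NSString *xml = @"<item><letter>a</letter><word>word 1</word>"
                    @"<description><![CDATA[<p>Yada</p>]]></description></item>";
    DescriptionCollector *collector = [[DescriptionCollector alloc] init];
    NSXMLParser *parser =
        [[NSXMLParser alloc] initWithData:[xml dataUsingEncoding:NSUTF8StringEncoding]];
    parser.delegate = collector;
    [parser parse];
    return [collector.html copy];
}
```

The returned string can be handed to the web view as is, or passed to a second parser if the HTML itself needs to be inspected.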

Related

NSXML Parser and loading a level cocos2D

I am trying to load my levels in objective-c through xml.
Here is the style of my xml that is output from my level editor.
<level>
<entity>
<name>grey</name>
<id>5</id>
<body>false</body>
<x>0.0</x>
<y>0.0</y>
<rotation>0.0</rotation>
</entity>
<entity>
<name>grey</name>
<id>5</id>
<body>false</body>
<x>0.0</x>
<y>0.0</y>
<rotation>0.0</rotation>
</entity>
</level>
Each game object is represented in the XML as its own entity (a sprite once it is loaded into the game). So when the parsing takes place, each entity, or CCSprite, should have its own x and y.
I would like the parser to traverse the XML document so that each entity has a name, x, y, and rotation. I'm not concerned with the other elements yet.
I have been trying for hours to get things up and running with NSXMLParser, but no luck thus far. Here is what I have tried so far.
NSString* currentElement;
- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName{
    currentElement = elementName;
}
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string{
    if([currentElement isEqualToString:@"name"]){
        NSLog(@"Name found: %@", string);
    }
}
This works for some of the elements, but the problem is that foundCharacters is called multiple times and sometimes the name string comes back blank.
What I am trying to do is this: when a "name" element is found, I want to call a method that adds the current entity to the scene, taking the "name", "x", and "y" elements as parameters.

feed reader using nsxmlparserdelegate

I am creating an app that shows the latest feeds from a particular web service, and I am using the NSXMLParserDelegate protocol for this purpose. I read the Apple documentation and tried some tutorials too, but something seems to be going wrong somewhere: I don't understand how didEndElement and foundCharacters work. I want to display the image, title, content, and pub-date of each post. I am a newbie to XML parsing. Here is my viewcontroller.h (I have parsed only the title element in the following code):
@property(nonatomic,strong)NSString *currentElement;
@property(nonatomic,strong)NSString *currentTitle;
@property(nonatomic,strong)NSMutableArray *titles;
viewDidLoad
NSURL *url=[NSURL URLWithString:@"http://www.forbes.com/fast/feed"];
NSXMLParser *parser=[[NSXMLParser alloc]initWithContentsOfURL:url];
[parser setDelegate:self];
[parser parse];
NSLog(@"%d",titles.count);
didStartElement
-(void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict
{
    self.currentElement=elementName;
    if ([self.currentElement isEqualToString:@"title"])
    {
        self.currentTitle=[NSMutableString alloc];
        titles=[[NSMutableArray alloc]init];
        titles=[attributeDict objectForKey:@"title"];
    }
}
foundCharacters
-(void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict
{
    self.currentElement=elementName;
    if ([self.currentElement isEqualToString:@"title"])
    {
        self.currentTitle=[NSMutableString alloc];
        titles=[[NSMutableArray alloc]init];
        titles=[attributeDict objectForKey:@"title"];
    }
}
didEndElement
-(void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
{
    if([self.currentElement isEqualToString:@"title"])
    {
        NSLog(@"%@",self.currentTitle);
    }
}
Doubts
1) Where am I supposed to declare my titles array so that I can add individual title objects to it?
What is the use of [attributeDict objectForKey:] in didStartElement? It returned null in my program.
2) What does the foundCharacters delegate method actually do? What does it append?
3) After didEndElement, why does control go to didStartElement and not to foundCharacters?
4) Finally, should I actually use the NSXMLParserDelegate protocol for XML parsing, or do others like TouchXML, TBXML, and the other libraries covered on raywenderlich make a difference?
I am sorry for the long post, but I haven't found any satisfying answers online regarding my queries. I used breakpoints and figured out how the delegate methods are called back and forth, but I need some enlightening answers. Thanks, and sorry.
1. Declare your array before you start parsing. Whenever you meet an element (an XML tag), record which element it is (for example, set a BOOL in the class so that you can recognize which element you are reading).
2. Found characters are the characters found as the value of a tag. If you know which element you are reading (from your instance variables), append this string to a temporary NSMutableString and add it to the array only when the element has ended.
3. Because the parser doesn't start finding other characters until a new tag is reached.
Example
I see that you are confused. Let's say that you have this XML code:
<person> mickey mouse </person>
When you meet the <person> tag, the element starts; then you find characters (not necessarily the entire string, possibly just a part of it) until the string ends; then, when you meet the </person> tag, the element has ended.
In parser:foundCharacters:, just append the found characters to an NSMutableString; in didEndElement you know which element ended and can set a variable to the string you collected.
image:
image => the characters at didEnd are the name of the link; the URL is in the attributes passed in didStart
title-tag:
didStart => HTML started; every tag is HTML until the didEnd of title-tag
content:
didStart => HTML started; every tag is HTML until the didEnd of content
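The accumulate-then-store pattern described in this answer could be sketched roughly as follows. `TitleCollector` and `TitlesInXMLString` are hypothetical names introduced for this illustration, and the sketch assumes that `<title>` elements contain only text.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical delegate for illustration: gathers the text of every <title>.
@interface TitleCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray *titles;
@property (nonatomic, strong) NSMutableString *currentText; // nil outside <title>
@end

@implementation TitleCollector
- (instancetype)init {
    if ((self = [super init])) {
        _titles = [NSMutableArray array]; // created once, before parsing starts
    }
    return self;
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict {
    if ([elementName isEqualToString:@"title"]) {
        // Start a fresh buffer for this element's text.
        self.currentText = [NSMutableString string];
    }
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    // May be called several times per element; a nil buffer ignores the text.
    [self.currentText appendString:string];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    if ([elementName isEqualToString:@"title"] && self.currentText) {
        // The element is complete, so the buffer now holds its whole value.
        [self.titles addObject:[self.currentText copy]];
        self.currentText = nil;
    }
}
@end

// Parses the given XML string and returns the collected titles in document order.
static NSArray *TitlesInXMLString(NSString *xml) {
    TitleCollector *collector = [[TitleCollector alloc] init];
    NSXMLParser *parser =
        [[NSXMLParser alloc] initWithData:[xml dataUsingEncoding:NSUTF8StringEncoding]];
    parser.delegate = collector;
    [parser parse];
    return [collector.titles copy];
}
```

Allocating the array once in `init`, instead of inside `didStartElement`, keeps earlier titles from being discarded each time a new `<title>` begins.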

Parsing XML CDATA Blocks

I'm attempting to parse an XML file (using NSXMLParser) from the website librarything.com. This is the first file I have ever parsed, but for the most part it seems fairly straightforward. My problem occurs when trying to parse a CDATA block: the method parser:foundCDATA: isn't called, and I can't understand why. I know my parser is set up properly because the parser:foundCharacters: method works fine. The XML data I am trying to parse looks like this http://www.librarything.com/services/rest/1.1/?method=librarything.ck.getwork&isbn=030788743X&apikey=d231aa37c9b4f5d304a60a3d0ad1dad4 and the CDATA block occurs inside the element with the attribute name "description".
Any help as to why the method is not being called would be greatly appreciated!
EDIT: I ran the parser:foundCharacters: method on the description CDATA block and it returned "<". I'm assuming this means that the parser is not seeing the CDATA tag correctly. Is there anything that can be done on my end to fix this?
It appears that the CDATA content in the <fact> tags is being returned incrementally over multiple callbacks to parser:foundCharacters:. In the class that conforms to NSXMLParserDelegate, try building up the CDATA by appending it to an NSMutableString instance, like so:
(Note: here _currentElement is an NSString property and _factString is an NSMutableString property.)
- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qualifiedName attributes:(NSDictionary *)attributeDict {
    self.currentElement = elementName;
    if ([_currentElement isEqualToString:@"fact"]) {
        // Make a new mutable string to store the fact string
        self.factString = [NSMutableString string];
    }
}
- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    if ([elementName isEqualToString:@"fact"]) {
        // If fact string starts with CDATA tags then just get the CDATA without the tags
        NSString *prefix = @"<![CDATA[";
        if ([_factString hasPrefix:prefix]) {
            NSString *cdataString = [_factString substringWithRange:NSMakeRange((prefix.length+1), _factString.length - 3 -(prefix.length+1))];
            // Do stuff with CDATA here...
            NSLog(@"%@", cdataString);
            // No longer need the fact string so make a new one ready for next XML CDATA
            self.factString = [NSMutableString string];
        }
    }
}
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    if ([_currentElement isEqualToString:@"fact"]) {
        // If we are at a fact element, append the string
        // CDATA is returned to this method in more than one go, so build the string up over time
        [_factString appendString:string];
    }
}

How to handle tags inside other tags in NSXMLParser

I have a file:
<xml>
<component>something
<system>somethingDeeper
<value>somethingDeepest</value>
</system>
</component>
<component>somethinfDifferent
<value>somethingDifferentDeeper</value>
</component>
<value>somethingNew</value>
</xml>
I want to distinguish what is inside another tag (e.g. <system>) from what is not. How can I do this with NSXMLParser? I currently use BOOL ivars, but there are a lot of tags and this is not as elegant as I want it to be. I know that NSXMLParser is a SAX parser, and I understand what that means.
In the above example, I will enter the didEndElement method three times with elementName equal to value.
Is there a more elegant way to distinguish which entry came from inside a <component> tag and which did not?
You could keep an array of the tag names you are currently within:
NSMutableArray *tagNameStack;
Each time you enter a new element you add it to this array. Each time you leave an element you remove it again, i.e.
- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qualifiedName attributes:(NSDictionary *)attributeDict {
    [tagNameStack addObject:elementName];
    // Your code here
}
and
- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    // Your code here
    [tagNameStack removeLastObject];
}
So, when you are parsing somethingDeepest, your tagNameStack array would be
@"xml", @"component", @"system", @"value"
You can use this stack to decide where you are in the XML, i.e.
- (void) parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    if ([tagNameStack containsObject:@"system"]) {
        // Deal with the value inside the system tag
    } else if ([tagNameStack containsObject:@"component"]) {
        // Deal with the value inside the component tag
    } else {
        // This must be the value inside the XML tag
    }
}
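One possible variation on the stack idea, sketched here under assumptions: instead of testing `containsObject:` for each ancestor, join the stack into a path such as `xml/component/system/value` and key each value by that path. `PathCollector`, `valuesByPath`, and `ValuesByPathInXMLString` are hypothetical names used for this illustration; the sketch only records `<value>` elements and assumes each distinct path occurs once.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical delegate for illustration: stores each <value> under its tag path.
@interface PathCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray *tagNameStack;
@property (nonatomic, strong) NSMutableDictionary *valuesByPath;
@property (nonatomic, strong) NSMutableString *text;
@end

@implementation PathCollector
- (instancetype)init {
    if ((self = [super init])) {
        _tagNameStack = [NSMutableArray array];
        _valuesByPath = [NSMutableDictionary dictionary];
        _text = [NSMutableString string];
    }
    return self;
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict {
    [self.tagNameStack addObject:elementName];
    // Reset the buffer so it only holds text belonging to the newest element.
    self.text = [NSMutableString string];
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    [self.text appendString:string];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    if ([elementName isEqualToString:@"value"]) {
        // The stack still contains "value" here, so the path ends in /value.
        NSString *path = [self.tagNameStack componentsJoinedByString:@"/"];
        NSCharacterSet *ws = [NSCharacterSet whitespaceAndNewlineCharacterSet];
        self.valuesByPath[path] = [self.text stringByTrimmingCharactersInSet:ws];
    }
    [self.tagNameStack removeLastObject];
}
@end

// Parses the given XML string and returns a dictionary of path -> value text.
static NSDictionary *ValuesByPathInXMLString(NSString *xml) {
    PathCollector *collector = [[PathCollector alloc] init];
    NSXMLParser *parser =
        [[NSXMLParser alloc] initWithData:[xml dataUsingEncoding:NSUTF8StringEncoding]];
    parser.delegate = collector;
    [parser parse];
    return [collector.valuesByPath copy];
}
```

Comparing whole paths avoids one BOOL per tag, at the cost of building a string for every recorded element.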

How does NSXMLParser differentiate between different elements?

I just did a tutorial on NSXMLParser. What I am completely at a loss at is how NSXMLParser differentiates between different elements. To me it seems undefined.
This is my XML
<?xml version="1.0" encoding="UTF-8"?>
<Prices>
<Price id="1">
<name>Rosie O'Gradas</name>
<Beer>4.50</Beer>
<Cider>4.50</Cider>
<Guinness>4</Guinness>
</Price>
</Prices>
And this is my Parser
-(void) parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict {
    if ([elementName isEqualToString:@"Prices"]) {
        app.listArray = [[NSMutableArray alloc] init];
        NSLog(@"The Prices Count");
    }
    else if ([elementName isEqualToString:@"Price"]) {
        thelist = [[List alloc] init];
        thelist.drinkID = [[attributeDict objectForKey:@"id"]integerValue];
    }
}
-(void) parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    if (!currentElementValue) {
        currentElementValue = [[NSMutableString alloc]initWithString:string];
    } else {
        [currentElementValue appendString:string];
    }
}
-(void) parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    if ([elementName isEqualToString:@"Prices"]) {
        return;
    }
    if ([elementName isEqualToString:@"Price"]) {
        [app.listArray addObject:thelist];
        thelist = nil;
    } else {
        [thelist setValue:currentElementValue forKey:elementName];
        currentElementValue = nil;
    }
}
I did notice that the names of the properties in the data object are the same as the element names in the XML, so I understood that much at least.
What I am at a loss about is where these properties are assigned their values.
At the beginning it initializes the data object with
thelist = [[List alloc] init];
(List is the data object.) But then it does the first thing that I don't understand:
thelist.drinkID = [[attributeDict objectForKey:@"id"]integerValue];
Because it is in an if statement, won't it get overwritten every time the parser finds an id attribute? Or is the 'thelist' assignment creating multiple objects?
In foundCharacters I really have no idea what is going on. As far as I can tell, the foundCharacters string is every bit of text inside the elements, so currentElementValue is really just a bundle of strings appended together (but I can't tell, because for some reason I can't NSLog it).
From there, in the didEndElement section, I wonder if this is the correct interpretation of the code.
if ([elementName isEqualToString:@"Price"]) {
    [app.listArray addObject:thelist];
    thelist = nil;
}
I understand that every time the parser hits the end of a Price element, the object 'thelist' is added to the app.listArray array (declared in another class).
But here is the bit where my lack of understanding of the earlier method takes effect:
else {
    [thelist setValue:currentElementValue forKey:elementName];
    currentElementValue = nil;
}
What are they doing here? From what I see, currentElementValue is just a jumble of characters from the XML file. How is it organized? By the element name?
One more question (sorry for the length): why isn't the element name case sensitive? I was experimenting and found that it wasn't, even though both languages are case sensitive.
If I interpret your question correctly, it is just about understanding code that is working fine.
In your XML, the Price element with id=1 has 4 child elements: name, Beer, Cider and Guinness.
The foundCharacters method will find the characters inside these 4 XML tags, i.e. what is written between <name> and </name>, <Beer> and </Beer>, etc. In your case this is the string Rosie O'Gradas for name, then the string 4.50 for Beer, etc.
When characters are found, the method first checks whether a container string exists; if not, it creates one as currentElementValue. If it does exist, it appends the found characters.
What happens next, logically? The parser hits the didEndElement method, in the first case for the tag </name>. There it assigns the collected text in currentElementValue to the key @"name" and puts this key-value pair into the list. The list is of type List, which is defined somewhere else, but it seems to be essentially an NSDictionary.
Because currentElementValue has been stored successfully, it is set to nil, so that the check for its existence works the next time foundCharacters is hit.
Clear?
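To make the `setValue:forKey:` step concrete, here is a small sketch of key-value coding on a plain object whose property names match the XML element names. The real `List` class is not shown in the question, so `PriceList` below is a hypothetical stand-in with only a few of the properties, and `MakeSamplePriceList` is an illustrative helper.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical stand-in for the List class; property names mirror the XML tags.
@interface PriceList : NSObject
@property (nonatomic, assign) NSInteger drinkID;
@property (nonatomic, copy) NSString *name;
@property (nonatomic, copy) NSString *Beer;
@end

@implementation PriceList
@end

// Builds one object the way the delegate methods would for a single <Price>.
static PriceList *MakeSamplePriceList(void) {
    // didStartElement for <Price id="1">: a new object is created per element,
    // so each Price gets its own drinkID rather than overwriting a shared one.
    PriceList *list = [[PriceList alloc] init];
    list.drinkID = [@"1" integerValue];

    // didEndElement for </name> and </Beer>: the element name is used as the key,
    // and key-value coding calls the matching setter (setName:, setBeer:).
    [list setValue:@"Rosie O'Gradas" forKey:@"name"];
    [list setValue:@"4.50" forKey:@"Beer"];
    return list;
}
```

Because key-value coding looks for a setter named `set<Key>:` with the first letter of the key capitalized, the keys `beer` and `Beer` both resolve to `setBeer:`. That may be why the element names appeared not to be case sensitive in the experiment, although the question does not show enough to confirm it.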