how to find the most specific class for an individual with pellet+owlapi

I want to perform classification with an ontology and the Pellet reasoner. Pellet has a function (namely realization()) to find the most specific class for an individual. I have tried it, but it doesn't work; could anybody provide some help or give me some examples?
<Class rdf:about="http://webmind.dico.unimi.it/CARE/locont.owl#HavingDrink">
    <equivalentClass>
        <Class>
            <intersectionOf rdf:parseType="Collection">
                <rdf:Description rdf:about="http://webmind.dico.unimi.it/CARE/locont.owl#PersonalActivity"/>
                <Restriction>
                    <onProperty rdf:resource="http://webmind.dico.unimi.it/CARE/locont.owl#hasActor"/>
                    <allValuesFrom>
                        <Class>
                            <intersectionOf rdf:parseType="Collection">
                                <rdf:Description rdf:about="http://webmind.dico.unimi.it/CARE/locont.owl#Person"/>
                                <Restriction>
                                    <onProperty rdf:resource="http://webmind.dico.unimi.it/CARE/locont.owl#hasCurrentSymbolicLocation"/>
                                    <someValuesFrom rdf:resource="http://webmind.dico.unimi.it/CARE/locont.owl#Kitchen"/>
                                </Restriction>
                                <Restriction>
                                    <onProperty rdf:resource="http://webmind.dico.unimi.it/CARE/locont.owl#usingArtifact"/>
                                    <someValuesFrom>
                                        <Class>
                                            <unionOf rdf:parseType="Collection">
                                                <rdf:Description rdf:about="http://webmind.dico.unimi.it/CARE/locont.owl#CupsCupboard"/>
                                                <rdf:Description rdf:about="http://webmind.dico.unimi.it/CARE/locont.owl#Fridge"/>
                                            </unionOf>
                                        </Class>
                                    </someValuesFrom>
                                </Restriction>
                            </intersectionOf>
                        </Class>
                    </allValuesFrom>
                </Restriction>
            </intersectionOf>
        </Class>
    </equivalentClass>
    <rdfs:subClassOf rdf:resource="http://webmind.dico.unimi.it/CARE/locont.owl#PersonalActivity"/>
</Class>
For example, HavingDrink is one of the activity classes. Now I create an individual and its object property assertions:
String file = "file:///home/uqjwen/workspace/Owlapi/snapshot.owl#";
OWLOntologyManager manager = OWLManager.createOWLOntologyManager();
OWLOntology ont = manager.loadOntology(IRI.create(file));
OWLDataFactory fac = manager.getOWLDataFactory();
PrefixManager pm = new DefaultPrefixManager(IRI.create("http://webmind.dico.unimi.it/CARE/locont.owl").toString());

////////// create hasActor property //////////
OWLNamedIndividual currentActivity = fac.getOWLNamedIndividual("#alice_activity", pm);
OWLNamedIndividual alice = fac.getOWLNamedIndividual("#alice", pm);
OWLObjectProperty hasActor = fac.getOWLObjectProperty("#hasActor", pm);
OWLObjectPropertyAssertionAxiom propertyAssertion = fac
        .getOWLObjectPropertyAssertionAxiom(hasActor, currentActivity, alice);
manager.addAxiom(ont, propertyAssertion);

////////// create hasCurrentSymbolicLocation //////////
OWLNamedIndividual kitchen = fac.getOWLNamedIndividual("#Kitchen", pm);
OWLObjectProperty hasLocation = fac
        .getOWLObjectProperty("#hasCurrentSymbolicLocation", pm);
OWLObjectPropertyAssertionAxiom locationAssertion = fac
        .getOWLObjectPropertyAssertionAxiom(hasLocation, alice, kitchen);
manager.addAxiom(ont, locationAssertion);

////////// create usingArtifact //////////
OWLNamedIndividual cc = fac.getOWLNamedIndividual("#cups_cupboard", pm);
OWLObjectProperty usingArtifact = fac
        .getOWLObjectProperty("#usingArtifact", pm);
OWLObjectPropertyAssertionAxiom artifactAssertion = fac
        .getOWLObjectPropertyAssertionAxiom(usingArtifact, alice, cc);
manager.addAxiom(ont, artifactAssertion);

OWLNamedIndividual fridge = fac.getOWLNamedIndividual("#fridge", pm);
artifactAssertion = fac
        .getOWLObjectPropertyAssertionAxiom(usingArtifact, alice, fridge);
manager.addAxiom(ont, artifactAssertion);

////////// reason //////////
PelletReasoner reasoner = PelletReasonerFactory.getInstance().createReasoner(ont);
System.out.println(reasoner.isConsistent());
reasoner.getKB().classify();
reasoner.getKB().realize();
NodeSet<OWLClass> types = reasoner.getTypes(currentActivity, true);
It is supposed to return the HavingDrink class, but it does not.

The first thing to check is that the ontology contains the classes and individuals you expect it to contain—you can do this by saving the ontology to System.out in the code, so that you're sure of what is loaded:
manager.saveOntology(ont, new SystemOutDocumentTarget());
Then make sure the IRIs being resolved by the prefix manager match the IRIs in the ontology:
OWLNamedIndividual currentActivity = fac.getOWLNamedIndividual("#alice_activity", pm);
System.out.println("expected individual " + currentActivity.getIRI());
Once these possible sources of error are out of the way, you need to verify that the type you expect is actually inferrable from the ontology: we cannot see the rest of the ontology, and there might be important information there that changes the expected result.
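For instance, you can ask the reasoner directly whether the class assertion you expect is entailed. This is only a minimal sketch; it assumes the fac, pm, reasoner and currentActivity objects created in your code, and that HavingDrink resolves to the same IRI as in the ontology:
// sketch: check whether the expected type is entailed (uses the objects from the code above)
OWLClass havingDrink = fac.getOWLClass("#HavingDrink", pm);
OWLAxiom expected = fac.getOWLClassAssertionAxiom(havingDrink, currentActivity);
System.out.println("HavingDrink entailed? " + reasoner.isEntailed(expected));
If this prints false, the problem is in the modelling or the data, not in how you read the types back.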
Edit:
From the ontology, the definition for HavingDrink (in functional syntax) is:
EquivalentClasses(:HavingDrink
    ObjectIntersectionOf(
        ObjectAllValuesFrom(:hasActor
            ObjectIntersectionOf(
                :Person
                ObjectSomeValuesFrom(:hasCurrentSymbolicLocation :Kitchen)
                ObjectSomeValuesFrom(:usingArtifact ObjectUnionOf(:Fridge :CupsCupboard))))
        :PersonalActivity))
In order for something to be classified as a HavingDrink activity, its actors must use an artifact of type Fridge or CupsCupboard, and I cannot see type assertions for those individuals in your ontology. It's quite a complex definition, so I would start by checking whether alice_activity is an instance of the separate parts of the intersection and ensuring each of them is satisfied. The middle term is not satisfied, as far as I can tell.
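One rough way to check the separate parts is to build each class expression with the data factory and test it against the reasoner. This is just a sketch; it reuses fac, pm, reasoner and currentActivity from the question, uses the entity names from the snippet above, and only covers part of the inner intersection for illustration:
// Sketch: test membership in individual conjuncts of the HavingDrink definition
OWLClass personalActivity = fac.getOWLClass("#PersonalActivity", pm);
OWLClass person = fac.getOWLClass("#Person", pm);
OWLClass kitchenClass = fac.getOWLClass("#Kitchen", pm);
OWLObjectProperty actorProp = fac.getOWLObjectProperty("#hasActor", pm);
OWLObjectProperty locationProp = fac.getOWLObjectProperty("#hasCurrentSymbolicLocation", pm);
// the allValuesFrom conjunct, here restricted to the Person and Kitchen parts only
OWLClassExpression actorPart = fac.getOWLObjectAllValuesFrom(actorProp,
        fac.getOWLObjectIntersectionOf(person,
                fac.getOWLObjectSomeValuesFrom(locationProp, kitchenClass)));
System.out.println("PersonalActivity? " + reasoner.isEntailed(
        fac.getOWLClassAssertionAxiom(personalActivity, currentActivity)));
System.out.println("actor restriction? " + reasoner.isEntailed(
        fac.getOWLClassAssertionAxiom(actorPart, currentActivity)));
Keep in mind that an allValuesFrom restriction can only be inferred if the reasoner knows the asserted actors are the only ones, so an unclosed hasActor role (and untyped individuals such as cups_cupboard and fridge) can also prevent the classification.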

Thanks Joshua and Ignazio, I finally found the problem. When I defined the ontology in Protégé myself, it worked well, so I conclude that there might be problems with the definition of the ontology here: webmind.di.unimi.it/care/snapshot.owl.

Related

Adding a Retokenize pipe while training NER model

I am currently attempting to train an NER model centered around Property Descriptions. I got a fully trained model to function to my liking; however, I now want to add a retokenize pipe to the model so that I can set up the model to train other things.
From here, I am having issues getting the retokenize pipe to actually work. Here is the definition:
def retok(doc):
    ents = [(ent.start, ent.end, ent.label) for ent in doc.ents]
    with doc.retokenize() as retok:
        string_store = doc.vocab.strings
        for start, end, label in ents:
            retok.merge(
                doc[start:end],
                attrs=intify_attrs({'ent_type': label}, string_store))
    return doc
I am adding it into my training like this:
nlp.add_pipe(retok, after="ner")
and I am adding it into the Language Factories like this:
Language.factories['retok'] = lambda nlp, **cfg: retok(nlp)
The issue I keep getting is "AttributeError: 'English' object has no attribute 'ents'". I assume I am getting this error because the parameter being passed through this function is not a Doc but the NLP model itself. I am not really sure how to get a Doc to flow into this pipe during training, and at this point I don't really know where to go from here to get the pipe to function the way I want.
Any help is appreciated, thanks.
You can potentially use the built-in merge_entities pipeline component: https://spacy.io/api/pipeline-functions#merge_entities
The example copied from the docs:
texts = [t.text for t in nlp("I like David Bowie")]
assert texts == ["I", "like", "David", "Bowie"]
merge_ents = nlp.create_pipe("merge_entities")
nlp.add_pipe(merge_ents)
texts = [t.text for t in nlp("I like David Bowie")]
assert texts == ["I", "like", "David Bowie"]
If you need to customize it further, the current implementation of merge_entities (v2.2) is a good starting point:
def merge_entities(doc):
    """Merge entities into a single token.
    doc (Doc): The Doc object.
    RETURNS (Doc): The Doc object with merged entities.
    DOCS: https://spacy.io/api/pipeline-functions#merge_entities
    """
    with doc.retokenize() as retokenizer:
        for ent in doc.ents:
            attrs = {"tag": ent.root.tag, "dep": ent.root.dep, "ent_type": ent.label}
            retokenizer.merge(ent, attrs=attrs)
    return doc
P.S. You are passing nlp to retok() below, which is where the error is coming from:
Language.factories['retok'] = lambda nlp, **cfg: retok(nlp)
See a related question: Spacy - Save custom pipeline

Increase the length of text returned by Highlighter

Initially, I thought setMaxDocCharsToAnalyze(int) would increase the output length, but it does not.
Currently the output generated by my search (the String fragment) is less than a line long and hence makes no sense as a preview.
Can the output generated by getBestFragment() be increased, by some mechanism, to at least one sentence or more? It does not matter if it ends up being one and a half sentences, but I need it to be long enough to make some sense.
Indexing:
Document document = new Document();
document.add(new TextField(FIELD_CONTENT, content, Field.Store.YES));
document.add(new StringField(FIELD_PATH, path, Field.Store.YES));
indexWriter.addDocument(document);
Searching
QueryParser queryParser = new QueryParser(FIELD_CONTENT, new StandardAnalyzer());
Query query = queryParser.parse(searchQuery);
QueryScorer queryScorer = new QueryScorer(query, FIELD_CONTENT);
Fragmenter fragmenter = new SimpleSpanFragmenter(queryScorer);
Highlighter highlighter = new Highlighter(queryScorer); // Set the best scorer fragments
highlighter.setMaxDocCharsToAnalyze(100000); // "HAS NO EFFECT"
highlighter.setTextFragmenter(fragmenter);

// STEP B
File indexFile = new File(INDEX_DIRECTORY);
Directory directory = FSDirectory.open(indexFile.toPath());
IndexReader indexReader = DirectoryReader.open(directory);

// STEP C
System.out.println("query: " + query);
ScoreDoc scoreDocs[] = searcher.search(query, MAX_DOC).scoreDocs;
for (ScoreDoc scoreDoc : scoreDocs)
{
    //System.out.println("1");
    Document document = searcher.getDocument(scoreDoc.doc);
    String title = document.get(FIELD_CONTENT);
    TokenStream tokenStream = TokenSources.getAnyTokenStream(indexReader,
            scoreDoc.doc, FIELD_CONTENT, document, new StandardAnalyzer());
    String fragment = highlighter.getBestFragment(tokenStream, title); // Increase the length of this String; this is the output
    System.out.println(fragment + "-------");
}
Sample Output
query: +Content:canada +Content:minister
|Liberal]] [[Prime Minister of Canada|Prime Minister]] [[Pierre Trudeau]] led a [[Minority-------
. Thorson, Minister of National War Services, Ottawa. Printed in Canada Description: British lion-------
politician of the [[New Zealand Labour Party| Labour Party]], and a cabinet minister. He represented-------
|}}}| ! [[Minister of Finance (Canada)|Minister]] {{!}} {{{minister-------
, District of Franklin''. Ottawa: Minister of Supply and Services Canada, 1977. ISBN 0660008351
25]], [[1880]] – [[March 4]], [[1975]]) was a [[Canada|Canadian]] provincial and federal-------
-du-Quebec]] region, in [[Canada]]. It is named after the first French Canadian to become Prime-------
11569347, Cannon_family_(Canada) ::: {{for|the American political family|Cannon family-------
minister of [[Guyana]] and prominent Hindu politician in [[Guyana]]. He also served, at various times-------
11559743, Mohammed_Hussein_Al_Shaali ::: '''Mohammed Hussein Al Shaali''' is the former Minister-------
The Fragmenter is the piece that controls this behavior. You can pass an int into the SimpleSpanFragmenter constructor to control the size of the fragments it produces (in bytes). The default size is 100. For example, to double that:
Fragmenter fragmenter = new SimpleSpanFragmenter(queryScorer, 200);
As far as splitting on sentence boundaries, there isn't a fragmenter for that, out of the box. Someone posted their implementation of one here. It's an extremely naive implementation, but you may find it helpful if you want to go down that particular rabbit hole.
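If the real problem is that a single fragment gives too little context, another option is to ask the Highlighter for several fragments and join them yourself. A small sketch, used in place of the getBestFragment call in your loop (the token stream can only be consumed once, so use one call or the other):
// Sketch: request up to 3 best fragments and join them with an ellipsis
String preview = highlighter.getBestFragments(tokenStream, title, 3, " ... ");
System.out.println(preview + "-------");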

Pellet internalReasonerException in OWL API

I have been stuck on this for a day; would someone be kind enough to help?
I have loaded an ontology which imports SWEET (Semantic Web for Earth and Environmental Terminology). I ran a SPARQL query on it and got this error: "Object Property hasLowerBound is used with a hasValue restriction where the value is a literal: "0"^^integer". (hasLowerBound, as far as I can check in SWEET, is a datatype property.)
How can I solve this problem?
Here is the code I wrote and the error I got. Thank you so much for your help!
public class load {
    public static void main(String[] args) throws OWLOntologyCreationException {
        // Get hold of an ontology manager
        OWLOntologyManager manager = OWLManager.createOWLOntologyManager();
        File file = new File("G:/Protege/owlfiles/Before_Gather.owl");
        // Load the local copy
        OWLOntology loadMODIS = manager.loadOntologyFromOntologyDocument(file);
        PelletReasoner reasoner =
                PelletReasonerFactory.getInstance().createNonBufferingReasoner(loadMODIS);
        KnowledgeBase kb = reasoner.getKB();
        PelletInfGraph graph =
                new org.mindswap.pellet.jena.PelletReasoner().bind(kb);
        InfModel model = ModelFactory.createInfModel(graph);
        String PREFIX =
                "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>" +
                "PREFIX owl: <http://www.w3.org/2002/07/owl#>" +
                "PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>" +
                "PREFIX seaice: <http://www.semanticweb.org/SeaIceOntology#>" +
                "PREFIX repr: <http://sweet.jpl.nasa.gov/2.3/reprDataFormat.owl#>" +
                "PREFIX realmCryo: <http://sweet.jpl.nasa.gov/2.3/realmCryo.owl#>" +
                "PREFIX relaMath: <http://sweet.jpl.nasa.gov/2.3/relaMath.owl#>";
        String SELECT = "select ?dataset ";
        String WHERE = "where {" +
                "?dataset relaMath:hasLowerBound " + "\"0\"^^xsd:integer" +
                "}";
        QueryExecution qe = SparqlDLExecutionFactory.create(
                QueryFactory.create(PREFIX + SELECT + WHERE), model);
        ResultSet rs = qe.execSelect();
        ResultSetFormatter.out(System.out, rs);
        rs = null;
        qe.close();
        reasoner.dispose();
        //OWLReasonerSPARQLEngine sparqlEngine = new OWLReasonerSPARQLEngine(new MinimalPrintingMonitor());
        //sparqlEngine.execQuery(str.toString(), dataset);
        System.out.println("Loaded ontology: " + loadMODIS);
    }
}
Exception in thread "main" org.mindswap.pellet.exceptions.InternalReasonerException: Object Property hasLowerBound is used with a hasValue restriction where the value is a literal: "0"^^integer
at org.mindswap.pellet.tableau.completion.rule.SomeValuesRule.applySomeValuesRule(SomeValuesRule.java:204)
at org.mindswap.pellet.tableau.completion.rule.SomeValuesRule.apply(SomeValuesRule.java:64)
at org.mindswap.pellet.tableau.completion.rule.AbstractTableauRule.apply(AbstractTableauRule.java:64)
at org.mindswap.pellet.tableau.completion.SROIQStrategy.complete(SROIQStrategy.java:157)
at org.mindswap.pellet.ABox.isConsistent(ABox.java:1423)
at org.mindswap.pellet.ABox.isConsistent(ABox.java:1260)
at org.mindswap.pellet.KnowledgeBase.consistency(KnowledgeBase.java:1987)
at org.mindswap.pellet.KnowledgeBase.isConsistent(KnowledgeBase.java:2061)
at org.mindswap.pellet.jena.PelletInfGraph.prepare(PelletInfGraph.java:258)
at org.mindswap.pellet.jena.PelletInfGraph.prepare(PelletInfGraph.java:241)
at com.clarkparsia.pellet.sparqldl.jena.SparqlDLExecutionFactory.create(SparqlDLExecutionFactory.java:113)
at com.clarkparsia.pellet.sparqldl.jena.SparqlDLExecutionFactory.create(SparqlDLExecutionFactory.java:261)
at com.clarkparsia.pellet.sparqldl.jena.SparqlDLExecutionFactory.create(SparqlDLExecutionFactory.java:226)
at loadMODIS.load.main(load.java:78)
hasLowerBound is parsed as a data property and an annotation property.
Pellet is checking for a data property case and assumes that if a property is not a data property it must be an object property. This is always the case for OWL 2 ontologies, but this ontology is not being parsed as an OWL 2 compatible ontology - punning of an annotation property and a data property is not allowed.
I'm not sure yet whether the problem is in the ontology or in the OWLAPI parsing of it.
Edit: It's a parsing issue. hasLowerBound is declared in relaMath.owl as a data property. However, relaMath.owl imports reprMath.owl, which uses hasLowerBound but does not declare it, and reprMath.owl imports relaMath.owl, so there is a cyclic import there.
The problem is that, during parsing:
- relaMath.owl is parsed, import is found, reprMath.owl import is resolved; no declarations are parsed yet.
- reprMath.owl is parsed, import is found. relaMath.owl is already being parsed, so call does nothing. All entities declared in relaMath.owl are included while parsing reprMath.owl. PROBLEM: the entities have not been parsed yet, so this set is empty. hasLowerBound is found in reprMath but no declaration exists yet. Therefore OWLAPI defaults to AnnotationProperty.
- relaMath.owl parsing continues, declaration is found.
Final result: Any ontology importing relaMath.owl has illegal punning for hasLowerBound. OWLAPI bug.
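You can see the result of this directly on the loaded ontology. A small sketch, using the manager and loadMODIS ontology from the question and the relaMath IRI taken from the prefixes above (method names assume the OWL API 3.x signatures that Pellet 2 is built against; the boolean argument asks for the imports closure):
// Sketch: inspect how hasLowerBound is typed in the imports closure after parsing
IRI lb = IRI.create("http://sweet.jpl.nasa.gov/2.3/relaMath.owl#hasLowerBound");
System.out.println("data property? " + loadMODIS.containsDataPropertyInSignature(lb, true));
System.out.println("annotation property? " + loadMODIS.containsAnnotationPropertyInSignature(lb, true));
System.out.println("object property? " + loadMODIS.containsObjectPropertyInSignature(lb, true));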
Workaround: Add data property declaration to reprMath.owl:
<owl:DatatypeProperty rdf:about="#hasLowerBound"/>
This might need to be done in more than one ontology.

The ':' character, hexadecimal value 0x3A, cannot be included in a name

I saw this question already, but I didn't see an answer.
So I get this error:
The ':' character, hexadecimal value 0x3A, cannot be included in a name.
On this code:
XDocument XMLFeed = XDocument.Load("http://feeds.foxnews.com/foxnews/most-popular?format=xml");
XNamespace content = "http://purl.org/rss/1.0/modules/content/";
var feeds = from feed in XMLFeed.Descendants("item")
select new
{
Title = feed.Element("title").Value,
Link = feed.Element("link").Value,
pubDate = feed.Element("pubDate").Value,
Description = feed.Element("description").Value,
MediaContent = feed.Element(content + "encoded")
};
foreach (var f in feeds.Reverse())
{
....
}
An item looks like that:
<rss>
<channel>
....items....
<item>
<title>Pentagon confirms plan to create new spy agency</title>
<link>http://feeds.foxnews.com/~r/foxnews/most-popular/~3/lVUZwCdjVsc/</link>
<category>politics</category>
<dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/" />
<pubDate>Tue, 24 Apr 2012 12:44:51 PDT</pubDate>
<guid isPermaLink="false">http://www.foxnews.com/politics/2012/04/24/pentagon-confirms-plan-to-create-new-spy-agency/</guid>
<content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[|http://global.fncstatic.com/static/managed/img/Politics/panetta_hearing_030712.jpg<img src="http://feeds.feedburner.com/~r/foxnews/most-popular/~4/lVUZwCdjVsc" height="1" width="1"/>]]></content:encoded>
<description>The Pentagon confirmed Tuesday that it is carving out a brand new spy agency expected to include several hundred officers focused on intelligence gathering around the world.&amp;#160;</description>
<dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2012-04-4T19:44:51Z</dc:date>
<feedburner:origLink>http://www.foxnews.com/politics/2012/04/24/pentagon-confirms-plan-to-create-new-spy-agency/</feedburner:origLink>
</item>
....items....
</channel>
</rss>
All I want is to get "http://global.fncstatic.com/static/managed/img/Politics/panetta_hearing_030712.jpg", and before that to check whether content:encoded exists.
Thanks.
EDIT:
I've found a sample that I can show and edit the code that tries to handle it..
EDIT2:
I've done it in the ugly way:
text.Replace("content:encoded", "contentt").Replace("xmlns:content=\"http://purl.org/rss/1.0/modules/content/\"","");
and then get the element in the normal way:
MediaContent = feed.Element("contentt").Value
The following code
static void Main(string[] args)
{
    var XMLFeed = XDocument.Parse(
        @"<rss>
        <channel>
        ....items....
        <item>
        <title>Pentagon confirms plan to create new spy agency</title>
        <link>http://feeds.foxnews.com/~r/foxnews/most-popular/~3/lVUZwCdjVsc/</link>
        <category>politics</category>
        <dc:creator xmlns:dc='http://purl.org/dc/elements/1.1/' />
        <pubDate>Tue, 24 Apr 2012 12:44:51 PDT</pubDate>
        <guid isPermaLink='false'>http://www.foxnews.com/politics/2012/04/24/pentagon-confirms-plan-to-create-new-spy-agency/</guid>
        <content:encoded xmlns:content='http://purl.org/rss/1.0/modules/content/'><![CDATA[|http://global.fncstatic.com/static/managed/img/Politics/panetta_hearing_030712.jpg<img src='http://feeds.feedburner.com/~r/foxnews/most-popular/~4/lVUZwCdjVsc' height='1' width='1'/>]]></content:encoded>
        <description>The Pentagon confirmed Tuesday that it is carving out a brand new spy agency expected to include several hundred officers focused on intelligence gathering around the world.&amp;#160;</description>
        <dc:date xmlns:dc='http://purl.org/dc/elements/1.1/'>2012-04-4T19:44:51Z</dc:date>
        <!-- <feedburner:origLink>http://www.foxnews.com/politics/2012/04/24/pentagon-confirms-plan-to-create-new-spy-agency/</feedburner:origLink> -->
        </item>
        ....items....
        </channel>
        </rss>");

    XNamespace contentNs = "http://purl.org/rss/1.0/modules/content/";
    var feeds = from feed in XMLFeed.Descendants("item")
                select new
                {
                    Title = (string)feed.Element("title"),
                    Link = (string)feed.Element("link"),
                    pubDate = (string)feed.Element("pubDate"),
                    Description = (string)feed.Element("description"),
                    MediaContent = GetMediaContent((string)feed.Element(contentNs + "encoded"))
                };

    foreach (var item in feeds)
    {
        Console.WriteLine(item);
    }
}

private static string GetMediaContent(string content)
{
    int imgStartPos = content.IndexOf("<img");
    if (imgStartPos > 0)
    {
        int startPos = content[0] == '|' ? 1 : 0;
        return content.Substring(startPos, imgStartPos - startPos);
    }
    return string.Empty;
}
results in:
{ Title = Pentagon confirms plan to create new spy agency, Link = http://feeds.f
oxnews.com/~r/foxnews/most-popular/~3/lVUZwCdjVsc/, pubDate = Tue, 24 Apr 2012 1
2:44:51 PDT, Description = The Pentagon confirmed Tuesday that it is carving out
a brand new spy agency expected to include several hundred officers focused on
intelligence gathering around the world. , MediaContent = http://global
.fncstatic.com/static/managed/img/Politics/panetta_hearing_030712.jpg }
Press any key to continue . . .
A few points:
- You never want to treat XML as text. In your case you removed the namespace declaration, but if the namespace had been declared inline (i.e. without binding it to a prefix) or a different prefix had been used, your code would not work, even though semantically both documents would be equivalent.
- Unless you know what's inside CDATA and how to treat it, you always want to treat it as text. If you know it's something else, you can treat it differently after parsing - see my elaboration on CDATA below for more details.
- To avoid NullReferenceExceptions when an element is missing, I used the explicit conversion operator (string) instead of invoking .Value.
- The XML you posted was not valid XML - the namespace URI for the feedburner prefix was missing.
This is no longer related to the problem, but it may be helpful for some folks, so I am leaving it in.
As far as the contents of the encoded element are concerned, they are inside a CDATA section. What's inside a CDATA section is not XML but plain text. CDATA is usually used to avoid having to encode the '<', '>' and '&' characters (without CDATA they would have to be encoded as &lt;, &gt; and &amp; so as not to break the XML document itself), but the XML processor treats the characters in the CDATA as if they were encoded (or, to be more correct, it encodes them). CDATA is convenient if you want to embed HTML, because textually the embedded content looks like the original, yet it won't break your XML if the HTML is not well-formed XML.
Since the CDATA content is not XML but text, it is not possible to treat it as XML. You will probably need to treat it as text and use, for instance, regular expressions. If you know it is valid XML, you can load the contents into an XElement again and process it. In your case you have mixed content, so it is not easy to do unless you use a little dirty hack. Everything would be easy if you had just one top-level element instead of mixed content. The hack is to add such an element to avoid all the hassle. Inside the foreach loop you can do something like this:
var mediaContentXml = XElement.Parse("<content>" + (string)item.MediaContent + "</content>");
Console.WriteLine((string)mediaContentXml.Element("img").Attribute("src"));
Again, it's not pretty and it is a hack, but it will work if the content of the encoded element is valid XML. The more correct way of doing this is to use XmlReader with ConformanceLevel set to Fragment and recognize all kinds of nodes appropriately to create a corresponding LINQ to XML node.
You should use XNamespace:
XNamespace content = "...";
// later in your code ...
MediaContent = feed.Element(content + "encoded")
See more details here.
(Of course, the string to be assigned to content is the same as in xmlns:content="...".)

vb.Net: How can I read a large XML file quickly?

I am trying to read this XML document.
An excerpt:
<datafile xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="wiitdb.xsd">
<WiiTDB version="20100217113738" games="2368"/>
<game name="Help Wanted: 50 Wacky Jobs (DEMO) (USA) (EN)">
<id>DHKE18</id>
<type/>
<region>NTSC-U</region>
<languages>EN</languages>
<locale lang="EN">
<title>Help Wanted: 50 Wacky Jobs (DEMO)</title>
<synopsis/>
</locale>
<developer>HUDSON SOFT CO., LTD.</developer>
<publisher>Hudson Entertainment, Inc.</publisher>
<date year="2009" month="" day=""/>
<genre>party</genre>
<rating type="ESRB" value="E10+">
<descriptor>comic mischief</descriptor>
<descriptor>mild cartoon violence</descriptor>
<descriptor>mild suggestive themes</descriptor>
</rating>
<wi-fi players="0"/>
<input players="2">
<control type="wiimote" required="true"/>
<control type="nunchuk" required="true"/>
</input>
<rom version="" name="Help Wanted: 50 Wacky Jobs (DEMO) (USA) (EN).iso" size="4699979776"/>
</game>
So far I have this:
Dim doc As XPathDocument
Dim nav As XPathNavigator
Dim iter As XPathNodeIterator
Dim lstNav As XPathNavigator
Dim iterNews As XPathNodeIterator

doc = New XPathDocument("wiitdb.xml")
nav = doc.CreateNavigator
iter = nav.Select("/WiiTDB/game") 'Your node name goes here

'Loop through the records in that node
While iter.MoveNext
    'Get the data we need from the node
    lstNav = iter.Current
    iterNews = lstNav.SelectDescendants(XPathNodeType.Element, False)
    'Loop through the child nodes
    txtOutput.Text = txtOutput.Text & vbNewLine & iterNews.Current.Name & ": " & iterNews.Current.Value
End While
It just skips the "While iter.MoveNext" part of the code. I tried it with a simple XML file, and it works fine.
I think your XPath query is off. WiiTDB is a self-closing element (the document root is datafile), so you need to look for /datafile/game or //game.
Use the System.Xml.Serialization namespace instead: create a dedicated, serializable class to hold the data you wish to load and define shared serialize / deserialize functions with strongly typed arguments to do the work for you.
As the structure of the new classes will closely follow that of your XML data, there should be no confusion as to which data is located where within a run time instance.
See my answer here for an idea of how to create a class from an example XML file.