I extract nodes from an XML document by calling -nodesForXPath:error:. Now I wonder whether it guarantees that the nodes are returned in the same order as they appear from top to bottom in the document (this is crucial in my case).
My XML looks something like this, and I retrieve the b tags with the XPath query:
<a>
<b>
...
</b>
<b>
...
</b>
</a>
Unfortunately the b tags do not have an explicit counter.
While the documentation for NSXMLNode doesn't state explicitly if order is preserved, I believe it will be because XML documents are inherently ordered. Also, a method that does not have a deterministic result set will usually have that fact stated; something that hasn't been done for NSXMLNode.
With that said, the only way to find out for sure is to run some tests on your data.
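One way to set up such a test, sketched here in Python with the standard library's ElementTree rather than NSXMLNode (so it shows the shape of the check, not the behaviour of -nodesForXPath:error: itself): compare the order of the query result with a plain top-to-bottom walk of the tree. The sample document and its id attributes are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the id attributes exist only to make the order visible.
SAMPLE = "<a><b id='1'/><c><b id='2'/></c><b id='3'/></a>"


def query_order(xml_text):
    """Return the ids of all b elements in the order the path query yields them."""
    root = ET.fromstring(xml_text)
    return [b.get("id") for b in root.findall(".//b")]


def traversal_order(xml_text):
    """Return the ids of all b elements found by a top-to-bottom walk of the tree."""
    root = ET.fromstring(xml_text)
    return [e.get("id") for e in root.iter() if e.tag == "b"]


if __name__ == "__main__":
    # If the two lists are equal, the query preserved document order for this input.
    print(query_order(SAMPLE) == traversal_order(SAMPLE))
```

The same comparison can be repeated against real data; a mismatch on any input would show that the order is not guaranteed.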
For example, I have a div tag that has two attributes.
class='hello#123' text='321#he#321llo#321'
<div> class='hello#123' text='321#he#321llo#321'></div>
Here, I want to write an XPath that matches on both the class and text attributes, but the numbers may change dynamically, i.e. "hello#123" may become "345" when we reload, and "321#he#321llo#321" may become "567#he#456llo#321".
Note: I need to write the XPath as a single expression, not as separate ones.
Assuming that you have the (corrected) two-attribute-HTML
<div class='hello#123' text='321#he#321llo#321'>...</div>
you can select it using the following, for example:
Using the contains() function
//div[contains(@class,'hello') and contains(@text,'#he#')]
This is quite specific and only applicable if the "hello" is always split in the same way
Using the translate() function to mask everything except the chars for "hello"
//div[translate(@class,'#0123456789','')='hello' and translate(@text,'#0123456789','')='hello']
This removes all # chars and digits and checks if the remaining string is "hello"
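To see what the two predicates accept, here is a rough sketch in Python. The standard library's ElementTree does not support contains() or translate(), so the predicates are mirrored in plain Python instead of being evaluated as XPath. The sample document is made up, and it assumes that only the digits change between reloads.

```python
import xml.etree.ElementTree as ET

# Deletes '#' and all digits, mirroring translate(@attr, '#0123456789', '').
STRIP = str.maketrans("", "", "#0123456789")

# Hypothetical sample: the first two divs should match, the third should not.
SAMPLE = """<body>
  <div class='hello#123' text='321#he#321llo#321'>first</div>
  <div class='hello#345' text='567#he#456llo#321'>second</div>
  <div class='other#123' text='321#he#321llo#321'>third</div>
</body>"""


def matches_contains(div):
    """Mirror of contains(@class,'hello') and contains(@text,'#he#')."""
    return "hello" in div.get("class", "") and "#he#" in div.get("text", "")


def matches_translate(div):
    """Mirror of the translate() predicate: strip '#' and digits, compare to 'hello'."""
    return (div.get("class", "").translate(STRIP) == "hello"
            and div.get("text", "").translate(STRIP) == "hello")


def select_divs(xml_text, predicate):
    """Return the text of every div element accepted by the predicate."""
    root = ET.fromstring(xml_text)
    return [d.text for d in root.iter("div") if predicate(d)]


if __name__ == "__main__":
    print(select_divs(SAMPLE, matches_contains))
    print(select_divs(SAMPLE, matches_translate))
```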
I guess combining these two approaches you will be able to create your own XPath expression fitting your needs. The patterns you provided were not fully clear, so this may only approach a good enough solution.
Can someone tell me which function in JDOM is similar to normalize() in DOM? I actually want to normalize the XML content and serialise it through XMLSerializer.
Thank You
Sam
Sandeep.
JDOM does not have a direct 'normalize' concept. Writing one would not be particularly hard, though. On the other hand, your intention is to output the XML in some format, and all the JDOM Output mechanisms will normalize the data for you.
So, for example, if you want to output the JDOM document as plain XML text, you can use the XMLOutputter class in org.jdom2.output and use an appropriate org.jdom2.output.Format instance (say, Format.getPrettyFormat() - do not use getRawFormat() as the raw formatter will not normalize the output at all).
In addition to outputting the JDOM document as text-based XML, you can also output to a DOM document, a SAX event stream, and even StAX streams. Each of these will produce a 'Normalized' output.
So, what you (probably) want to do is:
Document mydoc = .....;
XMLOutputter xout = new XMLOutputter(Format.getPrettyFormat());
xout.output(mydoc, somestream);
Rolf
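For readers unsure what the DOM normalize() from the question does, here is a small sketch using Python's xml.dom.minidom, which follows the same DOM API. It is not JDOM code; it only shows the effect being asked about, namely that adjacent text nodes are merged into one. The element name and text values are made up.

```python
from xml.dom import minidom


def build_fragmented():
    """Build <a> with two adjacent text nodes, as repeated appends can produce."""
    doc = minidom.parseString("<a/>")
    root = doc.documentElement
    root.appendChild(doc.createTextNode("foo"))
    root.appendChild(doc.createTextNode("bar"))
    return doc


def normalized_child_counts():
    """Return the number of child nodes before and after normalize(), plus the XML."""
    doc = build_fragmented()
    root = doc.documentElement
    before = len(root.childNodes)
    root.normalize()  # merges the adjacent text nodes into a single one
    after = len(root.childNodes)
    return before, after, root.toxml()


if __name__ == "__main__":
    print(normalized_child_counts())
```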
I'm extracting terms from the query by calling ExtractTerms() on the Query object that I get as the result of QueryParser.Parse(). I get a Hashtable, but each item appears as:
Key - term:term
Value - term:term
Why are the key and the value the same? And why is the term value duplicated and separated by a colon?
Do highlighters only insert tags, or do they do anything else? I want not only to get text fragments but also to highlight the source text (it's fairly big). I am trying to get the terms and insert tags by hand using their offsets, but I am not sure this is the right solution.
I think the answer to this question may help.
This is because .NET 2.0 doesn't have an equivalent of Java's HashSet. The .NET port uses Hashtables with the same object as both key and value. The colon you see is just the result of Term.ToString(): a Term is a field name plus the term text, and your field name is probably "term".
To highlight an entire document using the Highlighter contrib, use the NullFragmenter.
I have the following SQL query....
select AanID as '@name', '<![CDATA[' + Answer + ']]>' from AuditAnswers for XML PATH('str'), ROOT('root')
which works wonderfully but the column 'Answer' can sometimes have HTML markup in it. The query automatically escapes this HTML from the 'Answer' column in the generated XML. I don't want that. I will be wrapping this resulting column in CDATA so the escaping is not necessary.
I want the result to be this...
<str name="2"><![CDATA[<DIV><DIV Style="width:55%;float:left;">Indsfgsdfg]]></str>
instead of this...
<str name="2"><![CDATA[<DIV><DIV Style="width:55%;float:left;">In</str>
Is there a function or other mechanism to do this?
Selecting anything "FOR XML" escapes any pre-existing XML so that it will not break the consistency of the XmlDocument. The first example line you gave is considered to be improperly formed XML, and will not be able to be loaded by an XmlDocument object, as well as most parsers. I would consider restructuring what you're trying to do so that you can have a more efficient solution.
You can use for xml explicit and the cdata directive:
select
1 as tag,
null as parent,
AanID as [str!1!name],
Answer as [str!1!!cdata]
from AuditAnswers
for xml explicit
You can specify that the output be treated as CDATA when using EXPLICIT mode XML queries. See "Using EXPLICIT Mode" and "Example: Specifying the CDATA Directive".
What would be the benefit of having <![CDATA[ <div></div> ]]> over having an escaped <div></div> in your database output? To me, it looks like you would have a properly escaped HTML fragment in your XML output in both cases, and reading it back with a decent XML parser should give you the unescaped original version in both cases.
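A quick way to check that claim, sketched in Python with the standard library's ElementTree: parse one CDATA-wrapped and one entity-escaped version of the same fragment and compare the text the parser hands back. The fragment itself is a shortened, made-up stand-in for the Answer column.

```python
import xml.etree.ElementTree as ET

# Two serialisations of the same (made-up) HTML fragment.
WITH_CDATA = '<str name="2"><![CDATA[<DIV>text</DIV>]]></str>'
ESCAPED = '<str name="2">&lt;DIV&gt;text&lt;/DIV&gt;</str>'


def text_of(xml_text):
    """Parse the element and return its text content as the parser reports it."""
    return ET.fromstring(xml_text).text


if __name__ == "__main__":
    # Both forms yield the same unescaped string once parsed.
    print(text_of(WITH_CDATA) == text_of(ESCAPED))
```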
I have a JSON file/stream, and I would like to be able to run SQL-style selects against it.
Here is the file.
The file contains all the data I have. I would like to be able to show, let's say:
all the odeu_nom and odeu_desc values where categorie=Feuilles
If that can be done with PHP and JSON (eval), fine; please tell me how.
In SQL I would write: SELECT * from $json where categorie=Feuilles
P.S. I have found jsonpath, which is an XPath for JSON. Maybe that is another option?
P.S. #2: After some research I found another option: the JSON is the same as an array, so maybe I can filter the array and return only the entries I need. How do I do that?
It makes more sense to try and stick with XPath-style selectors (like jsonpath), rather than using SQL, even if you are more familiar with SQL.
The advantage of the "path" is that it is more readily expressive of the hierarchical structure implicit to XML/JSON, as opposed to SQL which requires using various joins to help it "get out of its rectangular/tabular prison".
Although I have never used jsonpath, from reading its summary page I believe that the following should produce all the odeu_nom values for objects whose categorie is 'Feuilles' (given the JSON input referred to in the question).
$.Liste_des_odeurs[?(@.categorie == 'Feuilles')].odeu_nom
which corresponds to the following XPath:
/Liste_des_odeurs[categorie='Feuilles']/odeu_nom
Et voila...
BTW, 'Jazz is not dead, it just smells funny' (F Zappa)
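The array-filtering idea from the question's second P.S. can also be done without any path library. Below is a minimal sketch in Python (the same approach carries over to a decoded array in PHP). The sample records are invented, and the top-level Liste_des_odeurs key holding a list of objects is an assumption taken from the path expression above, since the original file is not shown.

```python
import json

# Invented sample data; only the key names come from the question and answer.
SAMPLE = """{"Liste_des_odeurs": [
  {"odeu_nom": "Menthe", "odeu_desc": "fresh", "categorie": "Feuilles"},
  {"odeu_nom": "Rose", "odeu_desc": "floral", "categorie": "Fleurs"},
  {"odeu_nom": "Basilic", "odeu_desc": "green", "categorie": "Feuilles"}
]}"""


def select_where(records, **conditions):
    """Return the records whose fields equal every given condition (a SQL-like WHERE)."""
    return [r for r in records
            if all(r.get(key) == value for key, value in conditions.items())]


def project(records, *columns):
    """Keep only the named columns of each record (a SQL-like column list)."""
    return [{c: r.get(c) for c in columns} for r in records]


if __name__ == "__main__":
    odeurs = json.loads(SAMPLE)["Liste_des_odeurs"]
    feuilles = select_where(odeurs, categorie="Feuilles")
    print(project(feuilles, "odeu_nom", "odeu_desc"))
```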