Can a function be given after an XPath seperator? - xpath-2.0

Here is a XML.......
<Book id="isbn-0-671-21280-X">
<Title>How to Read a Book</Title>
<Subtitle>The Classic Guide to Intelligent Reading</Subtitle>
<Author>Mortimer J. Adler</Author>
<Author>Charles Van Doren</Author>
<Date>1940</Date>
<Publisher>Simon & Schuster</Publisher>
</Book>
I want to get the length of attribute id of Book
I wrote:
string-length(//Book/#id)
and one more:
//Book/string-length(#id)
Both are working, but I have doubts that the second way is fully correct.

In XPath 2 both ways are correct
You should not think about / as path anymore, but more like a map operator. Applying the function on the right to each element in the sequence to the left.
You can even do stuff like
//Book/(a-function-that-returns-nodes() | b-function-that-returns-nodes())/../..

There is a difference between those two kinds of calling a function. string-length(//some/expression) fetches all results of the path expression and then calculates the string length, //some/expression/string-length() will return the string length for every single matching result of that expression.
As your XML only includes one book, both versions will yield the same result. If there would be multiple books, the first query would return the summed up string length of all books, the second would return a string length for each book's identifier.
You could also write //book/#id/string-lenght() instead of //Book/string-length(#id) as there may not be more than one #id attribute, for other attributes (or subelements) the same as stated above would apply.

Related

How to write xpath for following example?

For example, I have div tag that has two attributes.
class='hello#123' text='321#he#321llo#321'
<div> class='hello#123' text='321#he#321llo#321'></div>
Here, I want to write xpath for both class and text attributes but numbers may change dynamically. ie., "hello#123" may become "345" when we reload. "321#he#321llo#321" may become "567#he#456llo#321".
Note: Need to write xpath in single line not separately.
Assuming that you have the (corrected) two-attribute-HTML
<div class='hello#123' text='321#he#321llo#321'>...</div>
you can select it using the following, for example:
Using the contains() function
//div[contains(#class,'hello') and contains(#text,'#he#')]
This is quite specific and only applicable if the "hello" is always split in the same way
Using the translate() function to mask everything except the chars for "hello"
//div[translate(#class,'#0123456789','')='hello' and translate(#text,'#0123456789','')='hello']
This removes all # chars and digits and checks if the remaining string is "hello"
I guess combining these two approaches you will be able to create your own XPath expression fitting your needs. The patterns you provided were not fully clear, so this may only approach a good enough solution.

Regex match SQL values string with multiple rows and same number of columns

I tried to match the sql values string (0),(5),(12),... or (0,11),(122,33),(4,51),... or (0,121,12),(31,4,5),(26,227,38),... and so on with the regular expression
\(\s*\d+\s*(\s*,\s*\d+\s*)*\)(\s*,\s*\(\s*\d+\s*(\s*,\s*\d+\s*)*\))*
and it works. But...
How can I ensure that the regex does not match a values string like (0,12),(1,2,3),(56,7) with different number of columns?
Thanks in advance...
As i mentioned in comment to the question, the best way to check if input string is valid: contains the same count of numbers between brackets, is to use client side programm, but not clear SQL.
Implementation:
List<string> s = new List<string>(){
"(0),(5),(12)", "(0,11),(122,33),(4,51)",
"(0,121,12),(31,4,5),(26,227,38)","(0,12),(1,2,3),(56,7)"};
var qry = s.Select(a=>new
{
orig = a,
newst = a.Split(new string[]{"),(", "(", ")"},
StringSplitOptions.RemoveEmptyEntries)
})
.Select(a=>new
{
orig = a.orig,
isValid = (a.newst
.Sum(b=>b.Split(new char[]{','},
StringSplitOptions.RemoveEmptyEntries).Count()) %
a.newst.Count()) ==0
});
Result:
orig isValid
(0),(5),(12) True
(0,11),(122,33),(4,51) True
(0,121,12),(31,4,5),(26,227,38) True
(0,12),(1,2,3),(56,7) False
Note: The second Select statement gets the modulo of sum of comma instances and the count of items in string array returned by Split function. If the result isn't equal to zero, it means that input string is invalid.
I strongly believe there's a simplest way to achieve that, but - at this moment - i don't know how ;)
:(
Unless you add some more constraints, I don't think you can solve this problem only with regular expressions.
It isn't able to solve all of your string problems, just as it cannot be used to check that the opening and closing of brackets (like "((())()(()(())))") is invalid. That's a more complicated issue.
That's what I learnt in class :P If someone knows a way then that'd be sweet!
I'm sorry, I spent a bit of time looking into how we could turn this string into an array and do more work to it with SQL but built in functionality is lacking and the solution would end up being very hacky.
I'd recommend trying to handle this situation differently as large scale string computation isn't the best way to go if your database is to gradually fill up.
A combination of client and serverside validation can be used to help prevent bad data (like the ones with more numbers) from getting into the database.
If you need to keep those numbers then you could rework your schema to include some metadata which you can use in your queries, like how many numbers there are and whether it all matches nicely. This information can be computed inexpensively from your server and provided to the database.
Good luck!

String Template: is it possible to get the n-th element of a Java List in the template?

In String Template one can easily get an element of a Java Map within the template.
Is it possible to get the n-th element of an array in a similar way?
According to the String Template Cheat Sheet you can easily get the first or second element:
You can combine operations to say things like first(rest(names)) to get second element.
but it doesn't seem possible to get the n-th element easily. I usually transform my list into a map with list indexes as keys and do something like
map.("25")
Is there some easier/more straightforward way?
Sorry, there is no mechanism to get a[i].
There is no easy way getting n-th element of the list.
In my opinion this indicates that your view and business logic are not separated enough: knowledge of what magic number 25 means is spread in both tiers.
One possible solution might be converting list of values to object which provides meaning to the elements. For example, lets say list of String represents address lines, in which case instead of map.("3") you would write address.street.

xPath last select element

Can someone help me to bring this code working? I have several select fields and I only want the last one in my variable.
variable = browser.elements_by_xpath('//div[#class="nested-field"]//select[last()]
Thanks!
This is a FAQ: The [] operator in XPath has higher precedence (priority) than the // pseudo-operator. This is why brackets must be used to change the default operator priorities. There are at least several similar questions with good explanations -- search for them and read and understand.
Instead of:
//div[#class="nested-field"]//select[last()]
Use:
(//div[#class="nested-field"]//select)[last()]
is the class attribute an exact match?
if the mark up is like this
<div class="nested-field other">
...
then you'll have to either match by the exact class or use xpath contains.

Extract terms from query for highlighting

I'm extracting terms from the query calling ExtractTerms() on the Query object that I get as the result of QueryParser.Parse(). I get a HashTable, but each item present as:
Key - term:term
Value - term:term
Why are the key and the value the same? And more why is term value duplicated and separated by colon?
Do highlighters only insert tags or to do anything else? I want not only to get text fragments but to highlight the source text (it's big enough). I try to get terms and by offsets to insert tags by hand. But I worry if this is the right solution.
I think the answer to this question may help.
It is because .Net 2.0 doesnt have an equivalent to java's HashSet. The conversion to .Net uses Hashtables with the same value in key/value. The colon you see is just the result of Term.ToString(), a Term is a fieldname + the term text, your field name is probably "term".
To highlight an entire document using the Highlighter contrib, use the NullFragmenter