I want to clearly understand what the difference is between the following XPath expressions: "//*[contains(.,'sometext')]" and "//*[contains(text(),'sometext')]".
From this great answer I understand that text() returns a set of individual nodes, while . in a predicate evaluates to the string concatenation of all text nodes.
OK, but when I'm using [contains(.,'sometext')] or [contains(text(),'sometext')], this should return the same number of elements, since in both cases we are checking for nodes that contain sometext themselves or in some of their children. Right? It shouldn't matter whether we check if any of an element's text nodes contains sometext or if the string concatenation of all its text nodes contains sometext. This should give the same number of matches.
However, if we test this, for example on this page, I see 104 matches for the XPath //*[contains(text(),'selenium')], while //*[contains(.,'selenium')] gives 442 matches.
So, what causes this difference?
Let me share my understanding using this xml.
<test>
<node>
selenium
<node2>
selenium
</node2>
</node>
<node>
selenium
</node>
</test>
First of all, the function text() returns a list of node objects.
The function contains() takes two arguments, the first of which is a string. So //*[contains(text(),'selenium')] will not always work. In XPath 2.0 it fails when text() supplies several nodes to contains().
In my example, the whitespace before the child elements also forms text nodes.
This is why your //*[contains(text(),'selenium')] query failed in my test. Browsers probably have some workaround for that to make things easier.
Now let's collapse that XML to get rid of the noise and look at how the two approaches differ:
<test><node>selenium<node2>selenium</node2></node><node>selenium</node></test>
1. Use text().
Here is what https://www.freeformatter.com/xpath-tester.html returns:
Element='<node>selenium<node2>selenium</node2>
</node>'
Element='<node2>selenium</node2>'
Element='<node>selenium</node>'
Since //* selects all elements in the tree, we get /test/node[1], /test/node[1]/node2 and /test/node[2], each of which has a text node containing selenium.
2. Now let's look at the . case:
Now it returns:
Element='<test>
<node>selenium<node2>selenium</node2>
</node>
<node>selenium</node>
</test>'
Element='<node>selenium<node2>selenium</node2>
</node>'
Element='<node2>selenium</node2>'
Element='<node>selenium</node>'
Why? Because first of all /test is converted to seleniumseleniumselenium. Then /test/node[1] is converted to seleniumselenium, then /test/node[1]/node2 is converted to selenium, and finally /test/node[2] is converted to selenium.
So this makes the difference. Depending on how complex your nesting is, the difference between the two approaches may be more or less significant.
This thread explains the difference between dot and text() pretty well: XPath: difference between dot and text()
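The counts above can be reproduced without an XPath engine. What follows is a minimal sketch in Python that emulates the two predicates on the collapsed document using only xml.etree.ElementTree. The helper names (text_nodes, string_value, contains_text, contains_dot) are made up for illustration, and contains_text assumes the XPath 1.0 rule that a node-set passed to contains() is reduced to its first node.

```python
import xml.etree.ElementTree as ET

DOC = "<test><node>selenium<node2>selenium</node2></node><node>selenium</node></test>"


def text_nodes(elem):
    """Return the text-node children of elem in document order."""
    # ElementTree keeps the text before the first child in .text and the
    # text after each child in that child's .tail.
    parts = [elem.text] + [child.tail for child in elem]
    return [p for p in parts if p]


def string_value(elem):
    """Concatenate all descendant text nodes, like '.' in a predicate."""
    return "".join(elem.itertext())


def contains_text(elem, needle):
    """Emulate contains(text(), needle) under XPath 1.0 (first node only)."""
    nodes = text_nodes(elem)
    return needle in (nodes[0] if nodes else "")


def contains_dot(elem, needle):
    """Emulate contains(., needle)."""
    return needle in string_value(elem)


root = ET.fromstring(DOC)
by_text = [e.tag for e in root.iter() if contains_text(e, "selenium")]
by_dot = [e.tag for e in root.iter() if contains_dot(e, "selenium")]
print(by_text)  # ['node', 'node2', 'node']
print(by_dot)   # ['test', 'node', 'node2', 'node']
```

The extra match for . is the test element: it has no text node of its own, but its string value is seleniumseleniumselenium.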
For example, I have div tag that has two attributes.
class='hello#123' text='321#he#321llo#321'
<div> class='hello#123' text='321#he#321llo#321'></div>
Here, I want to write an XPath for both the class and text attributes, but the numbers may change dynamically, i.e. "hello#123" may become "345" when we reload, and "321#he#321llo#321" may become "567#he#456llo#321".
Note: I need to write the XPath in a single line, not separately.
Assuming that you have the (corrected) two-attribute-HTML
<div class='hello#123' text='321#he#321llo#321'>...</div>
you can select it using the following, for example:
Using the contains() function
//div[contains(@class,'hello') and contains(@text,'#he#')]
This is quite specific and only applicable if "hello" is always split in the same way.
Using the translate() function to mask everything except the chars for "hello"
//div[translate(@class,'#0123456789','')='hello' and translate(@text,'#0123456789','')='hello']
This removes all # chars and digits and checks if the remaining string is "hello".
I guess combining these two approaches you will be able to create your own XPath expression fitting your needs. The patterns you provided were not fully clear, so this may only approach a good enough solution.
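To see what the translate() variant accepts, here is a rough Python sketch that mimics the masking with str.translate on parsed attributes. It is not an XPath evaluation; the function name is_hello_div and the sample values 'hello#987' and 'world#123' are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

# Same character set as the second argument of translate() above;
# an empty replacement string means these characters are removed.
MASK = str.maketrans("", "", "#0123456789")


def is_hello_div(elem):
    """Mimic translate(@class, ...)='hello' and translate(@text, ...)='hello'."""
    if elem.tag != "div":
        return False
    cls = elem.get("class", "").translate(MASK)
    txt = elem.get("text", "").translate(MASK)
    return cls == "hello" and txt == "hello"


before = ET.fromstring("<div class='hello#123' text='321#he#321llo#321'>...</div>")
after = ET.fromstring("<div class='hello#987' text='567#he#456llo#321'>...</div>")
other = ET.fromstring("<div class='world#123' text='321#he#321llo#321'>...</div>")

print(is_hello_div(before), is_hello_div(after), is_hello_div(other))  # True True False
```

Because every digit and # is stripped before the comparison, the match survives any change in the numbers, but it would not match a class that loses the letters "hello" entirely.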
In some places, I have seen an expression like the following:
//a[.='Assignment']
Generally the syntax looks like //tagName[@attributeKey='attributeValue'] or //tagName[text()='textValue'].
But what is the intention of . in the XPath //a[.='Assignment']?
I'll use XPath 2.0 terminology here: XPath 1.0 has different terminology but the effect of the expression is the same.
"." refers to the context item: that is, the a element that you are testing against the predicate. Its value here is a node. When a node is used as an argument of "=", it is atomized, which means (unless your code is schema-aware, which is unlikely) that the string value of the node is compared with the other operand of "=". The string value of an element is the concatenation of all its descendant text nodes.
It sounds like you don't have ready access to an XPath reference. There are quite a few good books that cover XPath, there are online tutorials (which are highly variable in quality) and the W3C specification itself (especially for XPath 1.0) is surprisingly easy reading.
My question is about the specifics of using dot and text() in XPath. For example, the following find_element lines return the same element:
driver.get('http://stackoverflow.com/')
driver.find_element_by_xpath('//a[text()="Ask Question"]')
driver.find_element_by_xpath('//a[.="Ask Question"]')
So what is the difference? What are the benefits and drawbacks of using . and text()?
There is a difference between . and text(), but this difference might not surface because of your input document.
If your input document looked like (the simplest document one can imagine given your XPath expressions)
Example 1
<html>
<a>Ask Question</a>
</html>
Then //a[text()="Ask Question"] and //a[.="Ask Question"] indeed return exactly the same result. But consider a different input document that looks like
Example 2
<html>
<a>Ask Question<other/>
</a>
</html>
where the a element also has a child element other that follows immediately after "Ask Question". Given this second input document, //a[text()="Ask Question"] still returns the a element, while //a[.="Ask Question"] does not return anything!
This is because the meaning of the two predicates (everything between [ and ]) is different. [text()="Ask Question"] actually means: return true if any of the text nodes of an element contains exactly the text "Ask Question". On the other hand, [.="Ask Question"] means: return true if the string value of an element is identical to "Ask Question".
In the XPath model, text inside XML elements can be partitioned into a number of text nodes if other elements interfere with the text, as in Example 2 above. There, the other element is between "Ask Question" and a newline character that also counts as text content.
To make an even clearer example, consider as an input document:
Example 3
<a>Ask Question<other/>more text</a>
Here, the a element actually contains two text nodes, "Ask Question" and "more text", since both are direct children of a. You can test this by running //a/text() on this document, which will return (individual results separated by ----):
Ask Question
-----------------------
more text
So, in such a scenario, text() returns a set of individual nodes, while . in a predicate evaluates to the string concatenation of all text nodes. Again, you can test this claim with the path expression //a[.='Ask Questionmore text'] which will successfully return the a element.
Finally, keep in mind that some XPath functions can only take one single string as an input. As LarsH has pointed out in the comments, if such an XPath function (e.g. contains()) is given a sequence of nodes in XPath 1.0, it will only process the first node and silently ignore the rest. In XPath 2.0 the same call raises an error instead.
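The two predicate meanings described above can be checked with a small Python sketch that uses only xml.etree.ElementTree to emulate them on Examples 2 and 3. This is an emulation, not a real XPath evaluation, and the helper names (text_nodes, any_text_node_equals, string_value_equals) are invented here.

```python
import xml.etree.ElementTree as ET


def text_nodes(elem):
    """Text-node children of elem: .text plus the .tail of each child."""
    parts = [elem.text] + [child.tail for child in elem]
    return [p for p in parts if p]


def any_text_node_equals(elem, value):
    """Emulate [text()=value]: true if at least one text-node child equals value."""
    return any(node == value for node in text_nodes(elem))


def string_value_equals(elem, value):
    """Emulate [.=value]: compare the concatenation of all descendant text."""
    return "".join(elem.itertext()) == value


example2 = ET.fromstring("<html>\n<a>Ask Question<other/>\n</a>\n</html>").find("a")
example3 = ET.fromstring("<a>Ask Question<other/>more text</a>")

print(any_text_node_equals(example2, "Ask Question"))          # True
print(string_value_equals(example2, "Ask Question"))           # False
print(text_nodes(example3))                                    # ['Ask Question', 'more text']
print(string_value_equals(example3, "Ask Questionmore text"))  # True
```

In Example 2 the string value is "Ask Question" followed by a newline, which is why the . comparison fails while the text() comparison succeeds.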
There is a big difference between dot (".") and text():
The dot (".") in XPath is called the "context item expression" because it refers to the context item. This can be a node (such as an element, attribute, or text node) or an atomic value (such as a string, number, or boolean). text(), by contrast, matches only the text nodes that are direct children of an element.
The dot (".") notation is the current node in the DOM, an object of type Node. Using text() as the argument of a string function only gets the text up to the first inner element. If the text you are looking for comes after the inner element, you must use the current node to search for the string and not text().
For example:
<a href="something.html">
<img src="filename.gif">
link
</a>
Here if you want to find anchor a element by using text link, you need to use dot ("."). Because if you use //a[contains(.,'link')] it finds the anchor a element but if you use //a[contains(text(),'link')] the text() function does not seem to find it.
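A short Python sketch can make the reason visible by listing the text-node children of the anchor. It assumes the img tag is self-closed so that the snippet parses as XML, and it assumes the XPath 1.0 behaviour of passing only the first text node to contains(); the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Assumption: <img> is written as <img .../> so that ElementTree can parse it.
DOC = '<a href="something.html">\n<img src="filename.gif"/>\nlink\n</a>'

a = ET.fromstring(DOC)

# Text-node children of <a>: the text before <img> and the text after it.
nodes = [a.text] + [child.tail for child in a]
print([repr(n) for n in nodes])

# contains(text(),'link') in XPath 1.0 only looks at the first text node.
first_has_link = "link" in (nodes[0] or "")

# contains(.,'link') looks at the concatenation of all descendant text.
dot_has_link = "link" in "".join(a.itertext())

print(first_has_link, dot_has_link)  # False True
```

The first text node is only the line break before the img element, so the word link, which sits in the second text node, is never examined.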
Hope it will help you..:)
The XPath text() function locates elements by their own text nodes, while dot (.) locates elements by text inside or outside their own text nodes. In the screenshot, the XPath text() function will only locate Success in DOM Example 2. It will not find success in DOM Example 1 because it's located between the tags.
In addition, the text() function will not find success in DOM Example 3 because success does not have a direct relationship to the element. Here's a video demo explaining the difference between text() and dot (.): https://youtu.be/oi2Q7-0ZIBg
What is the exact meaning of :: ?
And apart from parent, what else are the different things we can use?
By.xpath("parent::*/parent::*")
The shortest answer I can manage
:: separates an axis name from a node test in an XPath expression.
The longer answer
It does not make much sense to ask about the meaning of ":: in Selenium", because it's not a feature of Selenium. It belongs to XPath, which is a W3C specification in its own right and is used to navigate XML or XHTML documents.
By.xpath("  parent::*/parent::*  ")
^           ^                    ^
Selenium    XPath                Selenium
Selenium just happens to embed XPath in their web application framework (which is a good thing!).
So, I've taken the liberty to answer the question: What is the meaning of :: in XPath?
The meaning of :: in XPath
In XPath, :: does not mean anything on its own and only makes sense if there is
a valid XPath axis identifier to the left
a valid node test to the right
For example, parent::* is a valid XPath expression [1]. Here, parent is an XPath axis name, * is a node test [2], and :: marks the transition from the axis to the node test. Other possible axes are
ancestor            following-sibling
ancestor-or-self    namespace
attribute           parent
child               preceding
descendant          preceding-sibling
descendant-or-self  self
following
Of course those are not just names, they have a very clear-cut semantic dimension: each of them defines a unique way to navigate an XML document (or, rather, a tree-like representation of such a document). Their meaning is straightforward in most cases, for instance, following:: identifies something that "follows" the current context.
These tuples of axis and node test (or triples, also counting predicates) can be "chained together" with the binary / operator to form paths with several steps:
outermost-element/other/third
Navigating a simple document
<root>
<person>James Clark</person>
<person>Steve DeRose</person>
</root>
Naturally, navigation might depend very much on your current whereabouts. There are both absolute and relative path expressions. An example for an absolute path expression is
/child::root/child::person | abbreviated syntax: /root/person
As you can see, there is a / at the beginning of an absolute path expression. It stands for the document node (the outermost node of a tree, which is different from the outermost element node of a tree). Relative path expressions look like
child::person | abbreviated syntax: person
The relative path expression will only find the person element node if the current context is the root element node. Otherwise, it will fail to locate anything.
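The dependence on the current context can be tried out with Python's xml.etree.ElementTree, which supports only a limited subset of XPath, so the abbreviated form person is used here. This is a small sketch on the document above; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

DOC = "<root><person>James Clark</person><person>Steve DeRose</person></root>"
root = ET.fromstring(DOC)

# Context is the root element: the relative path "person" finds both children.
from_root = [p.text for p in root.findall("person")]

# Context is a person element: it has no person children, so nothing is found.
first_person = root.find("person")
from_person = first_person.findall("person")

print(from_root)    # ['James Clark', 'Steve DeRose']
print(from_person)  # []
```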
Your XPath expression
To sum up and use what we have learned so far:
By.xpath("parent::*/parent::*")
finds the element node that is the grandparent of the current node. The names of both the parent and the grandparent node do not matter (that's what * is for). There's no / at the beginning, so it must be a relative path.
[1] In fact, it is a location path, a special kind of XPath expression. Also, I have left out one important concept: predicates. Good things always come in threes, and XPath expressions come with an axis, a node test and with zero or more predicates.
[2] A node test must be either a name test (testing the name of a node) or a kind test (testing the kind of node). Find ample information about node tests in the relevant part of the XPath specification.
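As a rough illustration of what two parent steps do, the following Python sketch walks up twice using xml.etree.ElementTree. ElementTree elements do not store a reference to their parent, so the sketch builds a child-to-parent map; the document with an extra group level and the helper name grandparent are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with one more level of nesting than the one above.
DOC = "<root><group><person>James Clark</person></group></root>"
root = ET.fromstring(DOC)

# Map every element to its parent element.
parent_of = {child: parent for parent in root.iter() for child in parent}


def grandparent(elem):
    """Emulate parent::*/parent::* for elem; None if no such element exists."""
    parent = parent_of.get(elem)
    if parent is None:
        return None
    return parent_of.get(parent)


person = root.find("group/person")
group = root.find("group")

print(grandparent(person).tag)  # root
print(grandparent(group))       # None
```

The second call returns None because the parent of the outermost element is the document node, which is not an element and therefore is not matched by *.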
This is XPath syntax; you can do other things like:
child::*         Selects all element children of current node
attribute::*     Selects all attributes of current node
child::text()    Selects all text node children of current node
child::node()    Selects all children of current node
Check a tutorial, especially about axes:
http://www.w3schools.com/xpath/xpath_axes.asp
I came to know that string-length returns the number of characters in a string, e.g.
//input[string-length( [string] )]
If so, how do I get the characters using Selenium? It would be great if I could get any examples or working code for that from google.com.
Your query returns the n-th <input/> element of each context, where n is the length of [string].
If you want to return the string length of a single element, use string-length(//input). Be aware that with XPath 1.0 (which is the only version supported by Selenium), you cannot return the string length of multiple objects.
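Since XPath 1.0 cannot return one length per element, the loop has to happen in the host language. Below is a minimal Python sketch of that idea using xml.etree.ElementTree instead of a browser. The form markup and its value attributes are made up, and measuring the value attribute (rather than the element's string value) is an assumption about what is of interest for input elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical form; the value attributes are invented for illustration.
DOC = '<form><input value="abc"/><input value="hello"/></form>'
root = ET.fromstring(DOC)

inputs = root.findall(".//input")

# Analogue of string-length(//input/@value): only the first node counts.
first_length = len(inputs[0].get("value", "")) if inputs else 0

# One length per element has to be computed outside XPath 1.0.
all_lengths = [len(e.get("value", "")) for e in inputs]

print(first_length)  # 3
print(all_lengths)   # [3, 5]
```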