I have two xpath selectors that find exactly the same element, but I wonder which one is better from code, speed of execution, readability points of view etc.
First Xpath :
//*[#id="some_id"]/table/tbody/tr[td[contains(., "Stuff_01")]]//ancestor-or-self::td/input[#value="Stuff_02"]
Second Xpath:
//tr[td[#title="Stuff_01"]]//ancestor-or-self::td/input[#value="Stuff_02"]
The argument for example is that if the code of the page will be changed and for example some "tbody" will be moved that the first one won't work, is it true ?
So any way which variant of the code is better and why ?
I would appreciate an elaborate answer, because this is crucial to the workflow.
It is possible that neither XPath is ideal. Seeing the targeted HTML and a description of the selection goal would be needed to decide or to offer another alternative.
Also, as with all performance matters, measure first.
That said, performance is unlikely to matter, especially if you use an #id or other anchor point to hone in on a reduced subtree before further restraining the selection space.
For example, if there's only one elem with id of 1234 in the document, by using //elem[#id="1234"]/rest-of-xpath, you've eliminated the rest of the document as a performance/readability/robustness concern. As long as the subtree below elem is relatively tame (and it usually will be), you'll be fine regarding those concerns.
Also, yes, table//td is a fine way to abstract over whether tbody is present or not.
Related
I'm doing automated tests. And I use SelectorHub to find elements in a website. In some cases I get very long Relative Xpath as you see below:
//body/div[#id='app']/div[#class='...']/div[#role='...']/div[#class='...']/div[#class='...']/div[#class='...']/div/div/div[#class='']/textarea"));
As I understood it correctly that it will fail if the website changes change in the future because it has too many "DIV". Why then is it said that relative Xpath is reliable? I could not create a shorter path manually to find a reliable path.
Any XPath that works today on a particular HTML page H1 may or may not produce the same result when applied (in the future) to a different HTML page H2. If you want it to have the best chance of returning the same result, then you want to minimise its dependencies, and in particular, you want to avoid having dependencies on the properties of H1 that are most likely to change. That, of course, is entirely subjective. It can be said that the longer your path expression is, the more dependencies it has (that is, the greater the number of changes that might cause it to break). But that's not universally true: the expression (//*)[842] is probably the shortest XPath expression to locate a particular element, but it's also highly fragile: it's likely to break if the HTML changes. Expressions using id attributes (such as //p[#id='Introduction'] are often considered reasonably stable, but they break too if the id values change.
The bottom line is that this is entirely subjective. Writing XPath expressions that are resilient to change in the HTML content is an art, not a science. It can only be done by reading the mind of the person who designed the HTML page.
I'm writing a script to scrape some data off the web.
I've copied the XPaths for a few of the same elements on different pages directly from the browser, which produces //*[#id="priceblock_dealprice"].
However, they're all span elements. I don't know enough about how XPath works under the hood, but I'm assuming //span[#id="priceblock_dealprice"] would obviously be quicker since it only has to check the span elements? Is this true?
Is there any benefit to using * over, say, span in this specific context?
You are not likely to see a huge performance difference by changing * to span.
The bigger performance impact would be eliminating, or at least constraining, the descendant axis //.
With a descendant axis that starts at the root node, you are forcing the XPath engine to walk over the entire node tree and inspect each and every element, which can be expensive with large documents.
If you were to provide any clues about the structure, the engine can avoid a lot of unnecessary work, and should perform better.
For instance:
/html/body/section[2]/div//*[#id="priceblock_dealprice"]
Besides performance, the other considerations are maintenance and flexibility.
You might get better performance with a more specific XPath, but then changes to the page structure and element names might result in things not matching anymore. You will need to decide what is more important.
Yes, its better to use 'span' instead of *, but as it is having an ID, so instead of XPath, its better to use By.ID.
ID will be somewhat fast compared to Xpath.
Just thinking and a question round my mind. I use to code xaml using grid, easier for me, at least. Just define the grid and using its cells (span row and column and so on...).
Doing a web´s demo today I remember table´s tag is being obsolote and I am wondering, the same idea could be applied to grid in mobile´s app.
Thinking about it, absolute and relative layout are similar to div (web speaking) so we could deduce the right way to render the xaml should be using those kind of layouts...
What do you think guys? What advantages and deficiencies are coming on using those approaches? I want to get use to code thinking in the best way, not thinking about easyness of that (supposing, that benefit me in long terms. Maintainability, flexibility,...).
As usual, thanks for your help mates.
Well, the point is the next one. As web design do, using layout (relative and absolute) like div could be more complex at the beggining, but after the project is growing up, the benefits of using these approaches take relevance with the time.
If you need to change the style of the view, these layout are more flexible for doing it, just set the limits of the each element and you can add another element as well in a easy way.
But if you decided to use grid layout approach, and after, you need to change the style, the operation will be more complicated because you probably need to reset the definition of column and row for adding new elements and this implicate to recode every row and column for the other rows as well.
Hoping it helps when you are in design phase and you need to calculate risk and maintainability of your app.
Cheers mates...
I trying to get all elements which were changed After I click on element.
I try to do the next:
List<WebElement> lstWeb = driver.findElements(By.xpath("//*");
driver.findElement(By.id("ImprBtn"));
List<WebElement> lstWebAfter = driver.findElements(By.xpath("//*");
lstWebAfter.removeAll(lstWeb );
The problem is that it's taking a long time, because in each list I have more than 800 WebElements.
There is an efficient way to identify changes in DOM after I click on element?
In general, i think it's not good approach (comparing the whole DOM before and after some operation) for designing your test cases, it's probably happens because not 100% correct design of the cases.
Design your tests cases more carefully, prepare your expected result and the expected DOM changes (i.e. the change in the webapp) and compare them to the actual result.
To learn more about recommended design of automation please read about the PageObjects Pattern.
You can find nice implementation for it here (don't mind the language, just read the code :)).
If you still need solution to identify DOM changes, check those solutions:
Mutation observers.
Check this DOM monitoring suggestions.
More about mutation events (recommended by W3 for what you looking for).
You should narrow down the xpath query in order to more efficiently search the DOM.
If you're not very comfortable with raw xath, I have a helper library that let's you create xpath based on LINQ-esq syntax.
Sharing a link in case you find it helpful
http://www.unit-testing.net/CurrentArticle/How-to-Create-Xpath-From-Lambda-Expressions.html
In short, this is bad web development and UX:
But solving it by using CSS3 word breaking (code & demo) can lead to an 'awkward whitespace' situation, and strange cut-offs — here's an example of both:
Maybe it's not such a big deal, and the UX perspective of it is here, but let's look at the semantics of one of the solutions:
You could ... use the <wbr> element to indicate an optional word
break opportunity. This will tell the browser to insert a line break
as necessary to flow onto a new line inside the container.
The first question: is using <wbr> semantic HTML? (Does it at least degrade gracefully?)
In either case, it seems that being un-semantic in the general sense is a small price to pay for good UX functionality.
However, the second quesiton is about the big picture:
Are there any schema.org (microdata/RFDa) ramifications to consider when using <wbr> to split up an email address? Will it still be valid there?
The wbr element is defined in the HTML5 spec. So it's fine to use it. If it's used right (= according to the definition in the spec), you may call it also "semantic use".
I don't think that there would be any problems in combination with micordata/RDFa. Usually you'd provide the URL in an attribute anyway, which can't contain wbr elements of course: foo<wbr>#example<wbr>.com.
For element content I'd guess (didn't check though) that microdata/RDFa parsers should use the text content without markup resp. understand what is markup and what is text, otherwise e.g. a FOAF name would be <abbr>Dr.</abbr> Foo instead of Dr. Foo.
So you can bet that microdata/RDFa parsers know HTML ;), and therefor it shouldn't be a problem to use its elements.