How to create an Xpath in a tricky section of document (for me) for the purpose of using with Selenium Basic in VBA - selenium

I mention Selenium Basic because that is where the XPath will be used. I believe Selenium Basic is built on Selenium version 2, so it may not understand answers that rely on features of the latest Selenium. Please take that into account if necessary.
There are dynamic classes at play here.
Criteria for selection.
1. Class starting with 'NextToJump__eventWrapper' (the outer one) must be used.
2. Class starting with 'NextToJump__venue' must contain text = 'Ballarat'
3. Class starting with 'NextToJump__race' (and/or span) must contain text = 'Race 2'
I need to be able to click on the <a> tag that contains Points 2 and 3.
The best that I've been able to do (and checked) using ChroPath in Chrome Devtools is...
//div[starts-with(@class,'NextToJump__eventWrapper')]//descendant::*[contains(text(),'Ballarat')]
But note that there are 2 cases of Point 2 in the HTML but only 1 case that satisfies Points 2 and 3.
Thanks
<div class="NextToJump__eventWrapper--13zZJ">
<div>
<div class="NextToJump__raceEvent--bfMON" data-testid="next-to-jump-item">
<a class="Link__link--9x4YY" href="/racing-betting/greyhound-racing/crayford-am/20200708/race-1-1801951-58544404">
<div class="NextToJump__iconWrapper--1yG60"></div>
<div class="NextToJump__eventDetail--CUzdX">
<div class="NextToJump__venue--1jwWA">Ballarat</div>
<div class="NextToJump__race--3JydR"><span>Race 1</span></div>
</div>
<div class="NextToJump__countdown--EG8mR"><span class="Countdown__countdown--4vRpD Countdown__imminent--2yc2K">52s</span></div>
</a>
</div>
<div class="NextToJump__raceEvent--bfMON" data-testid="next-to-jump-item">
<a class="Link__link--9x4YY active" href="/racing-betting/greyhound-racing/rockhampton/20200708/race-4-1799474-58466521" aria-current="page">
<div class="NextToJump__iconWrapper--1yG60"></div>
<div class="NextToJump__eventDetail--CUzdX">
<div class="NextToJump__venue--1jwWA">Rockhampton</div>
<div class="NextToJump__race--3JydR"><span>Race 4</span></div>
</div>
<div class="NextToJump__countdown--EG8mR"><span class="Countdown__countdown--4vRpD Countdown__imminent--2yc2K">2m 52s</span></div>
</a>
</div>
<div class="NextToJump__raceEvent--bfMON" data-testid="next-to-jump-item">
<a class="Link__link--9x4YY" href="/racing-betting/greyhound-racing/ballarat/20200708/race-4-1799454-58465201">
<div class="NextToJump__iconWrapper--1yG60"></div>
<div class="NextToJump__eventDetail--CUzdX">
<div class="NextToJump__venue--1jwWA">Ballarat</div>
<div class="NextToJump__race--3JydR"><span>Race 2</span></div>
</div>
<div class="NextToJump__countdown--EG8mR"><span class="Countdown__countdown--4vRpD Countdown__imminent--2yc2K">5m 52s</span></div>
</a>
</div>
</div>
</div>

The xpath expression you need to use to select your target <a> tag is long and convoluted, but that's life....
[formatted for ease of reading, but you can use that in one line]
//a
  [ancestor::div[starts-with(@class,'NextToJump__eventWrapper')]]
  [.//div[.="Ballarat"]
    [starts-with(@class,'NextToJump__venue-')]
    [./following-sibling::div[.="Race 2"]
      [starts-with(@class,'NextToJump__race-')]
    ]
  ]
Edit:
In "plain English":
Find an <a> node which meets ALL these conditions: (i) it has an ancestor (not necessarily a parent) node which is a <div> whose class attribute value starts with NextToJump__eventWrapper; and (ii) it has a <div> descendant (not just a child) node whose text is Ballarat AND whose class attribute value starts with NextToJump__venue-, where that <div> descendant itself has a following sibling <div> whose text is Race 2 AND whose class attribute value starts with NextToJump__race-.
Yes, the word "plain" doesn't really fit here, but that's the closest I could get. I like xpath, and it's very powerful, but sometimes it's very hard to follow... As an aside, it would have been somewhat less cryptic if xquery was used instead of straight xpath.
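As a sanity check of the three criteria, here is a minimal Python sketch that applies the same selection logic to a trimmed copy of the question's markup. It is not Selenium and not the XPath itself: the standard library's ElementTree does not support starts-with() or following-sibling, so the predicates are spelled out in plain Python. The helper names (has_class_prefix, text_of, matches, find_links) are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the markup from the question (icon and countdown divs removed).
HTML = """
<div class="NextToJump__eventWrapper--13zZJ">
  <div>
    <div class="NextToJump__raceEvent--bfMON">
      <a class="Link__link--9x4YY" href="/racing-betting/greyhound-racing/crayford-am/20200708/race-1-1801951-58544404">
        <div class="NextToJump__eventDetail--CUzdX">
          <div class="NextToJump__venue--1jwWA">Ballarat</div>
          <div class="NextToJump__race--3JydR"><span>Race 1</span></div>
        </div>
      </a>
    </div>
    <div class="NextToJump__raceEvent--bfMON">
      <a class="Link__link--9x4YY" href="/racing-betting/greyhound-racing/ballarat/20200708/race-4-1799454-58465201">
        <div class="NextToJump__eventDetail--CUzdX">
          <div class="NextToJump__venue--1jwWA">Ballarat</div>
          <div class="NextToJump__race--3JydR"><span>Race 2</span></div>
        </div>
      </a>
    </div>
  </div>
</div>
"""

def has_class_prefix(elem, prefix):
    """True if the element's class attribute value starts with prefix."""
    return elem.get("class", "").startswith(prefix)

def text_of(elem):
    """String value of an element: all descendant text joined, then trimmed."""
    return "".join(elem.itertext()).strip()

def matches(link, venue, race):
    """True if link has a venue <div> descendant followed by a race <div> sibling."""
    for parent in link.iter():
        children = list(parent)
        for i, child in enumerate(children):
            if (child.tag == "div"
                    and has_class_prefix(child, "NextToJump__venue-")
                    and text_of(child) == venue):
                for sibling in children[i + 1:]:
                    if (sibling.tag == "div"
                            and has_class_prefix(sibling, "NextToJump__race-")
                            and text_of(sibling) == race):
                        return True
    return False

def find_links(root, venue, race):
    """All <a> elements under an eventWrapper <div> that satisfy both text criteria."""
    found = []
    for wrapper in root.iter("div"):
        if has_class_prefix(wrapper, "NextToJump__eventWrapper"):
            found.extend(a for a in wrapper.iter("a") if matches(a, venue, race))
    return found

root = ET.fromstring(HTML)
hrefs = [a.get("href") for a in find_links(root, "Ballarat", "Race 2")]
print(hrefs)
```

Only the second link should be reported, because the first one has the right venue but a different race.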

Related

Use GetElementsByClass to find all <div> elements by class name, nested inside a <p> element

I am creating a parser using Jsoup in Kotlin.
I need to get the inner text of a tag with class "ptrack-content" inside the tag with class "titleCard-synopsis".
When I call getElementsByClass on an element that was itself returned by an earlier getElementsByClass call, I get 0 elements.
Code:
class NetlifxHtmlParser {
val html = """
<div class="titleCardList--metadataWrapper">
<div class="titleCardList-title"><span class="titleCard-title_text">Map Her</span><span><span class="duration ellipsized">50m</span></span></div>
<p class="titleCard-synopsis previewModal--small-text">
<div class="ptrack-content">A hidden map rocks Hartley High as the students' sexcapades are publicly exposed. Caught as the culprit, Amerie becomes an instant social pariah.</div>
</p>
</div>
<div class="titleCardList--metadataWrapper">
<div class="titleCardList-title"><span class="titleCard-title_text">Renaissance Titties</span><span><span class="duration ellipsized">50m</span></span></div>
<p class="titleCard-synopsis previewModal--small-text">
<div class="ptrack-content">Amerie, the new outcast, receives a party invitation that gives her butterflies. But when she manages to show up, a bitter surprise awaits.</div>
</p>
</div>
""".trimIndent()
    fun parseEpisode() {
        val doc = Jsoup.parseBodyFragment(html)
        val titleCards = doc.getElementsByClass("titleCard-synopsis")
        println("Episode: count titleCard = > ${titleCards.count()}") // 2
        titleCards.forEachIndexed { index, element ->
            val ptrack = element.getElementsByClass("ptrack-content")
            println("Episode: count ptrack = > ${ptrack.count()}") // 0 !!
            println("inner html = > ${ptrack.html()}") // empty string !!
        }
    }
}
In the above code:
First, I extract the tags with class name titleCard-synopsis using doc.getElementsByClass("titleCard-synopsis"), which returns 2 elements.
Then, for each titleCard element, I try to extract the elements that have the class ptrack-content by calling the same getElementsByClass on that element, which returns an empty list.
Why is this happening?
My goal is to extract the description text for each title, which is stored in the tags inside the p tag with class titleCard-synopsis.
If I query "ptrack-content" directly on the document, it works fine, but this is a general class used in many places in the main HTML source (this is just a snippet).
Also note that if I call html() on an element of titleCards, I get an empty string rather than the inner div.
Please guide me to resolve the issue!
TL;DR
I need to get the inner text of a tag with class "ptrack-content" inside the tag with class "titleCard-synopsis".
I'm not really familiar with Kotlin, but this should produce the desired output:
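val doc = Jsoup.parseBodyFragment(html)
// "+" is the adjacent-sibling combinator: a .ptrack-content element directly after a .titleCard-synopsis element
val result = doc.select(".titleCard-synopsis + .ptrack-content")
result.forEachIndexed { index, element ->
    println(element.html())
}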
This is an interesting problem!
You basically have invalid HTML, and jsoup is smart enough to auto-correct it for you. Your HTML structure gets altered, and suddenly your query does not work.
This is the error:
<p class="titleCard-synopsis previewModal--small-text">
<div class="ptrack-content">A hidden map rocks Hartley High as the students' sexcapades are publicly exposed. Caught as the culprit, Amerie becomes an instant social pariah.</div>
</p>
You can't nest a <div> element inside a <p> element like that.
Paragraphs are block-level elements, and notably will automatically close if another block-level element is parsed before the closing </p> tag. [Source: <p>: The Paragraph element]
Also, look at Nesting block level elements inside the <p> tag... right or wrong?
This is how jsoup parses your tree:
<html>
<head></head>
<body>
<div class="titleCardList--metadataWrapper">
<div class="titleCardList-title">
<span class="titleCard-title_text">Map Her</span><span><span class="duration ellipsized">50m</span></span>
</div>
<p class="titleCard-synopsis previewModal--small-text"></p>
<div class="ptrack-content">
A hidden map rocks Hartley High as the students' sexcapades are publicly exposed. Caught as the culprit, Amerie becomes an instant social pariah.
</div>
<p></p>
</div>
<div class="titleCardList--metadataWrapper">
<div class="titleCardList-title">
<span class="titleCard-title_text">Renaissance Titties</span><span><span class="duration ellipsized">50m</span></span>
</div>
<p class="titleCard-synopsis previewModal--small-text"></p>
<div class="ptrack-content">
Amerie, the new outcast, receives a party invitation that gives her butterflies. But when she manages to show up, a bitter surprise awaits.
</div>
<p></p>
</div>
</body>
</html>
As you can see, elements with class titleCard-synopsis have no children with class ptrack-content.
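To make the consequence concrete, here is a minimal Python sketch that runs both lookups against a shortened copy of the repaired tree shown above (synopsis text abbreviated). It does not use jsoup; it uses the standard library's ElementTree, and the function names nested_matches and adjacent_matches are hypothetical. The nested lookup mirrors the question's getElementsByClass-inside-getElementsByClass approach, and the adjacent lookup mirrors the ".titleCard-synopsis + .ptrack-content" selector.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the tree as jsoup rebuilt it: the <div> is now a sibling of the <p>.
REPAIRED = """
<body>
  <div class="titleCardList--metadataWrapper">
    <p class="titleCard-synopsis previewModal--small-text"></p>
    <div class="ptrack-content">A hidden map rocks Hartley High.</div>
    <p></p>
  </div>
  <div class="titleCardList--metadataWrapper">
    <p class="titleCard-synopsis previewModal--small-text"></p>
    <div class="ptrack-content">Amerie receives a party invitation.</div>
    <p></p>
  </div>
</body>
"""

def classes(elem):
    """Class names of an element as a list."""
    return elem.get("class", "").split()

def nested_matches(root):
    """ptrack-content elements nested inside a titleCard-synopsis element."""
    return [
        d
        for p in root.iter("p") if "titleCard-synopsis" in classes(p)
        for d in p.iter("div") if "ptrack-content" in classes(d)
    ]

def adjacent_matches(root):
    """ptrack-content elements that directly follow a titleCard-synopsis sibling."""
    found = []
    for parent in root.iter():
        children = list(parent)
        for prev, cur in zip(children, children[1:]):
            if "titleCard-synopsis" in classes(prev) and "ptrack-content" in classes(cur):
                found.append(cur)
    return found

root = ET.fromstring(REPAIRED)
nested_count = len(nested_matches(root))
adjacent_texts = [e.text.strip() for e in adjacent_matches(root)]
print(nested_count)
print(adjacent_texts)
```

The nested lookup finds nothing, while the sibling lookup returns both synopsis texts, which matches the behaviour described in the question and the first answer.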

How to find xpath of an element which depends upon sibling class

I have the HTML code below:
<a class = sidetoolsdivider>
<div class = sideone > Test 1 </div>
<div class = sidetwo> </div>
</a>
<a class = sidetoolsdivider>
<div class = sideone > Test 2 </div>
<div class = sidetwo> </div>
</a>
...............
Here I need an XPath locator for the sidetwo div whose sibling sideone div has the text Test 1. There are many similar blocks, so the only way to tell them apart is the element text.
The XPath would be something like the ones below.
Since the element depends on the text, you can match on the text of the first div and then step to its sibling. normalize-space() is used in the first variant because the text in the sample has spaces around it.
//div[normalize-space()='Test 1']/following-sibling::div
Or
//div[contains(text(),'Test 1')]/following-sibling::div
Or
//div[contains(text(),'Test 1')]/following-sibling::div[@class='sidetwo']
This gets you the correct 'a': find an 'a' that contains the right sideone div (note the .//, which matches a descendant at any depth, not just a direct child).
"//a[.//div[@class='sideone' and normalize-space()='Test 1']]"
Then just get the sidetwo div. The complete XPath:
"//a[.//div[@class='sideone' and normalize-space()='Test 1']]//div[@class='sidetwo']"
This works even if there is more text inside the 'a' or the markup inside it gets more complex. normalize-space() is used rather than text() because the text in the sample has spaces around it.
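The "pick the parent by one child, then take the other child" idea can be tried out without a browser. The sketch below is an assumption-laden illustration: it uses Python's standard ElementTree, whose limited XPath subset has no "and" and no normalize-space(), so it uses the [div='Test 1'] child-text predicate instead of the expressions above. The sample is a cleaned-up version of the question's markup with quoted attributes and trimmed text, and the data-id attributes are invented only so the result can be told apart.

```python
import xml.etree.ElementTree as ET

# Cleaned-up sample; data-id is a hypothetical attribute added for this demo only.
DOC = """
<root>
  <a class="sidetoolsdivider">
    <div class="sideone">Test 1</div>
    <div class="sidetwo" data-id="first"></div>
  </a>
  <a class="sidetoolsdivider">
    <div class="sideone">Test 2</div>
    <div class="sidetwo" data-id="second"></div>
  </a>
</root>
"""

root = ET.fromstring(DOC)

# [div='Test 1'] keeps only <a> elements with a child <div> whose full text is "Test 1";
# the final step then selects the sidetwo <div> inside that same <a>.
hits = root.findall(".//a[div='Test 1']/div[@class='sidetwo']")
ids = [d.get("data-id") for d in hits]
print(ids)
```

Filtering on the shared parent first keeps the lookup tied to the right block even when many blocks with the same classes exist.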

How to find Label of a input field

I'm looking for a generic way to find the text before an input field, so that I know what to fill in the field, using XPath, a CSS selector or any other way possible.
<div>
<span>Full Name</span>
<input name="xddadN">
</div>
<div>
<span>Email</span>
<input name="xedadN">
</div>
Or
<div>
<div><label>Full Name</label></div>
<div><input name="xddadN"></div>
<div><label>Email</label></div>
<div><input name="xedadN"></div>
</div>
Or
<div>
<label>Full Name<br>
<span><input name="xddadN"></span>
</label>
</div>
<div>
<label>Full Name<br>
<span><input name="xddadN"></span>
</label>
</div>
You can try the XPath expression below to get the element that immediately precedes the input:
//input/preceding::*[1]
or, more specifically, for Full Name:
//input[@name="xddadN"]/preceding::*[1]
and for Email:
//input[@name="xedadN"]/preceding::*[1]
For the full name, use this XPath: //input[@name='xddadN']/preceding-sibling::span
Code:
String fullName = driver.findElement(By.xpath("//input[@name='xddadN']/preceding-sibling::span")).getText();
String email = driver.findElement(By.xpath("//input[@name='xedadN']/preceding-sibling::span")).getText();
You haven't mentioned which Selenium language binding you use, so I will use Java for the example.
First the Answer
Yes, you can use a generic way to find text before an input field as follows :
As per the HTML :
<div>
<span>Full Name</span>
<input name="xddadN">
</div>
<div>
<span>Email</span>
<input name="xedadN">
</div>
To retrieve the text Full Name from the <span> tag with respect to the <input> tag you can use :
String myText = driver.findElement(By.xpath("//input[@name='xddadN']/preceding::*[1]")).getAttribute("innerHTML");
Now the Pitfall
Without any visibility into your use case, in my opinion the generic approach is a pitfall that will cause a lot of chaos and uncertainty, for the following reasons:
Because the XPath jumps straight to the previous element, a small change in the HTML DOM (e.g. the inclusion of another <span> tag) will make your test cases fail.
In general, when constructing a locator strategy through CSS selectors or XPath, it is beneficial to include the <tagName> to narrow the element search. If the <tagName> is not included, your tests may need more time to locate the elements and act on them, which gives up some of the advantages of test automation.
Conclusion
As a best practice, always include the <tagName> when constructing a locator strategy through CSS selectors or XPath.
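For the "any other way possible" part of the question, one generic idea is to walk the markup in document order and remember the last non-empty text seen before each input. The sketch below is only an illustration of that idea using Python's standard html.parser, not Selenium; the class name LabelFinder and the function find_labels are made up here, and it shares the fragility described above, since any extra text between the label and the input would be picked up instead.

```python
from html.parser import HTMLParser

class LabelFinder(HTMLParser):
    """Maps each input's name attribute to the last non-empty text seen before it."""

    def __init__(self):
        super().__init__()
        self.last_text = None
        self.labels = {}

    def handle_data(self, data):
        # Remember the most recent text that is not just whitespace.
        text = data.strip()
        if text:
            self.last_text = text

    def handle_starttag(self, tag, attrs):
        # When an <input> starts, pair it with the text seen just before it.
        if tag == "input":
            name = dict(attrs).get("name")
            if name is not None:
                self.labels[name] = self.last_text

def find_labels(html):
    """Return {input name: preceding text} for every named input in html."""
    finder = LabelFinder()
    finder.feed(html)
    finder.close()
    return finder.labels

# First layout from the question.
SAMPLE = """
<div>
<span>Full Name</span>
<input name="xddadN">
</div>
<div>
<span>Email</span>
<input name="xedadN">
</div>
"""

labels = find_labels(SAMPLE)
print(labels)
```

Because it only looks at document order, the same function also copes with the label-wrapped layout from the question, where the text sits in a parent <label> rather than in a sibling <span>.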

Any suggestions to locate the below elements in Selenium

Example HTML below. I want to locate the elements which contain the text Person Attributes and Age.
<div id="ext-gen210" class="x-tool x-tool-toggle"></div>
<span id="ext-gen214" class="x-panel-header-text">Person Attributes</span>
</div>
<div id="ext-comp-1071" class=" DDView ListView" style="height: auto;">
<p class="dragItem "><i class="icon-Gen"></i>Age</p>
</div>
Note: I am looking for a solution without using xpath or id or className as they might change with every new release of my website.
I tried to locate them using
'name' --> By.name("Person Attributes") and By.name("Age") but both failed.
By.name would check for the name attribute. What you need is to check the text of an element using By.xpath:
By.xpath('//div[span/text() = "Person Attributes"]')
Or, you can also check that the id attribute starts with ext-gen:
By.xpath('//div[starts-with(@id, "ext-gen")]')

Not finding the Correct xpath

I'm trying to write a Python script to get some information from the Google products listed at the top right of the screen (the usual 6 pictures with price and seller).
I am using Python, PhantomJS and Selenium.
Doing a Google search for "red shoe", I want my script to return the prices. I get stuck at the step where I try to find the element containing the products. Am I missing something in my XPath?
import time
from selenium import webdriver

def getTopSongs(object):
    print("Working YETI")
    browser = webdriver.PhantomJS('c:/projects/phantomjs/phantomjs.exe')
    browser.get('http://google.com/search?q=red+shoe')
    time.sleep(5)  # crude wait for the page to finish loading
    title = browser.find_element_by_xpath('//div[contains(@class, "pla-unit")]/text()[contains(., "red")]/following::b').text
On Google's page, the element sits under a few nested divs:
<div id="rhs">
...
<div class="_Pwb">
<div class="_Ohb">
<div style="width:109px" class="pla-unit">
<div class="_PD">
<div class="pla-unit-img-container">
<div class="_Z5">
<div class="_vT"><a href="http://www.somewebsite.com">
<span class="rhsl4">Nina 'Forbes' Peep Toe Pump <b>Red</b> R...</span>
<span class="rhsg3 rhsl5">Nina 'Forbes' Peep Toe Pum...</span>
<span class="rhsg4">Nina 'Forbes' Peep Toe Pu...</span></a>
</div>
<div class="_QD"><b>$78.95</b></div>
<div class="_mC">
<span class="rhsl4 a">Nordstrom</span>
<span class="rhsg3 rhsl5 a">Nordstrom</span>
<span class="rhsg4 a">Nordstrom</span>
</div>
</div>
*Update:
I added more HTML. In this example I am looking to get the text from ($78.95) and (Nordstrom).
*Update
To clarify,
<div id="rhs">
is a unique element
There are however multiple (6) elements of:
<div style="width:109px" class="pla-unit">
The elements under each category have the same name and follow the same structure and substructures
ie, there are 6
<div class="_PD">
<div class="pla-unit-img-container">
<div class="_Z5">
<div class="_vD">
<div class="_QD">
<div class="_mC">
and so on.
The main objective is to get all of the elements but for purposes of debugging I was asking help to get the first one.
The xpath for a price unit using XPathChecker on Firefox is:
id('rhs_block')/x:div[1]/x:div/x:div/x:div/x:div[1]/x:div[1]/x:div[2]/x:div[2]/x:b
You can use ancestor:: to go back up then following-sibling:: to get elements at the same level that follow it.
I haven't tried this but give it a shot:
title = browser.find_element_by_xpath('//div[contains(@class, "pla-unit")]/text()[contains(., "red")]/ancestor::div/following-sibling::div[1]').text
Then to get to your div with class '_mC' you just change:
following-sibling::div[1]
to
following-sibling::div[2]
and get the text from the spans under that.
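Since the stated goal is eventually to read every product unit, an alternative to one long absolute path is to select each pla-unit first and then look up the price and seller relative to that unit. The sketch below illustrates that structure offline with Python's standard ElementTree on a simplified, well-formed fragment of the question's markup (the elided parts are left out, not guessed). The function name extract_units is hypothetical, and the exact-match @class predicates are a simplification that only works here because each div has a single class.

```python
import xml.etree.ElementTree as ET

# Simplified fragment based on the HTML in the question; only one unit is shown.
FRAGMENT = """
<div id="rhs">
  <div style="width:109px" class="pla-unit">
    <div class="_PD">
      <div class="_vT"><a href="http://www.somewebsite.com">
        <span class="rhsl4">Nina 'Forbes' Peep Toe Pump <b>Red</b> R...</span></a>
      </div>
      <div class="_QD"><b>$78.95</b></div>
      <div class="_mC">
        <span class="rhsl4 a">Nordstrom</span>
      </div>
    </div>
  </div>
</div>
"""

def extract_units(root):
    """Return a (price, seller) tuple for every pla-unit block under root."""
    units = []
    for unit in root.findall(".//div[@class='pla-unit']"):
        # Both lookups are relative to the current unit, not to the whole page.
        price = unit.find(".//div[@class='_QD']/b")
        seller = unit.find(".//div[@class='_mC']/span")
        units.append((
            price.text if price is not None else None,
            seller.text if seller is not None else None,
        ))
    return units

root = ET.fromstring(FRAGMENT)
units = extract_units(root)
print(units)
```

In Selenium the analogous structure would be to find the unit elements first and then search inside each one with a relative XPath that begins with ".//", so that each price stays paired with its own seller.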