How to scrape text inside a randomly generated href - Selenium

I was scraping a dynamic page with selenium and I got stuck getting text 1 and text 2 in the following example:
<span class="class number 1"> text 1 <a href="link 1"> text 2 </a> </span>
The same happens if the span is instead a div.
I managed to get text 1 with this Python line:
var = driver.find_element(By.CLASS_NAME, "class number 1").text
However, to get text 2, since link 1 is generated, say, randomly, I can't refer to the href in any way!
Any help is really appreciated.

Try this, it retrieves 'text 2':
driver.find_element(By.XPATH, ".//span[#class='class number 1']/a").text

Try using CSS Selectors
txt2 = driver.find_element(By.CSS_SELECTOR, "span[class='class number 1'] > a").text
# To extract both text node values at the same time, you can use innerHTML as follows:
driver.find_element(By.CSS_SELECTOR, "span[class='class number 1']").get_attribute('innerHTML')
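Once the innerHTML string is in hand, the two pieces can be separated without referring to the href at all. The following is a minimal sketch that uses Python's standard-library html.parser rather than Selenium; the helper name split_span_text and the sample markup (including its href value) are assumptions made for illustration.

```python
from html.parser import HTMLParser


class _SpanTextSplitter(HTMLParser):
    """Collects text outside and inside <a> tags separately."""

    def __init__(self):
        super().__init__()
        self._a_depth = 0     # number of <a> tags currently open
        self.own_text = []    # text that sits directly in the span
        self.link_text = []   # text that sits inside an <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._a_depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self._a_depth > 0:
            self._a_depth -= 1

    def handle_data(self, data):
        target = self.link_text if self._a_depth else self.own_text
        target.append(data)


def split_span_text(inner_html):
    """Return (text outside links, text inside links), whitespace-normalized."""
    parser = _SpanTextSplitter()
    parser.feed(inner_html)
    parser.close()
    own = " ".join("".join(parser.own_text).split())
    link = " ".join("".join(parser.link_text).split())
    return own, link


# Hypothetical innerHTML, as it might be returned by get_attribute('innerHTML').
sample = ' text 1 <a href="/x9f3k2"> text 2 </a> '
print(split_span_text(sample))  # ('text 1', 'text 2')
```

The href value is never inspected, so it does not matter how it was generated.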

Related

Finding tag by xpath by a text that has an inner tag inside

I've recently come across an issue.
I need to find a div tag on a page that contains specific text. The problem is that the text is divided into two parts by an inner link tag, so the HTML tree looks like this:
<html>
<...>
<div>
start of div text - part 1
<a/>
end of div text - part 2
</div>
<...>
</html>
To uniquely identify that div tag I'd need two parts of div text. Naturally, I would come up with something like this XPath:
.//div[contains(text(), 'start of div text') and contains(text(), 'end of div text')]
However, it doesn't work, the second part can not be found.
What would be the best approach to describe this kind of tag uniquely?
Try the XPath below to match the required div by its two text nodes:
//div[normalize-space(text())="start of div text - part 1" and normalize-space(text()[2])="end of div text - part 2"]
You were almost there. You simply need to replace the text() with . as follows:
//div[contains(., 'start of div text') and contains(., 'end of div text')]
Here is the snapshot of the validation :
This should work:
//div[contains(text(), 'start of div text') and contains(./a/text(), 'end of div text')]
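The difference between text() and . in the answers above can be checked offline. This is a rough sketch using Python's standard-library xml.etree.ElementTree (not Selenium, and not a full XPath engine). It relies on the XPath 1.0 rule that a node-set passed to contains() is converted to a string using only its first node, whereas . is the concatenated text of all descendants. The sample markup is an assumption modelled on the question.

```python
import xml.etree.ElementTree as ET

div = ET.fromstring(
    "<div>start of div text - part 1<a/>end of div text - part 2</div>"
)

# First text node of the div: what contains(text(), ...) effectively tests.
first_text_node = div.text

# Second text node: ElementTree stores it as the "tail" of the <a> child.
second_text_node = div.find("a").tail

# String value of the div: what contains(., ...) tests.
string_value = "".join(div.itertext())

print("end of div text" in first_text_node)  # False
print("end of div text" in string_value)     # True
```

This is why the asker's original expression found the first part but not the second.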
Well, if you have an HTML DOM tree like this:
<div id="container" class="someclass">
<div>
start of div text - part 1
<a/>
end of div text - part 2
</div>
</div>
then to extract the div text, you can write an XPath like this:
//div[@id='container']/child::div
P.S.: Writing an XPath that matches on the exact text you are trying to find is not a good way to write XPath.
If all you want is the div element of those child text elements, then you could isolate a piece of unique content from "part 1" and try the following:
//*[contains(., 'part 1')]/parent::div
This way you wouldn't have to think about the div's attributes.
However, this is usually not best practice. Ideally, you should use the following XPath in most cases:
//div[@id='some id' and contains(., 'part 1')]

How to extract the text from a child node which is within a <div> tag through Selenium and WebDriver?

I need to get the value 107801307 that is inside a specific Div but there are several other Divs in the path before getting into that DIV I need. Can anyone help?
Below is the image with the information that I need to extract from the DIV.
As per the HTML you have provided, to extract the text 107801307 you can use the following solution:
Java:
String myText = driver.findElement(By.xpath("//b[@class='label_tratamento'][contains(.,'Ban Claro')]//following::span[1]")).getAttribute("innerHTML");
Python:
myText = driver.find_element(By.XPATH, "//b[@class='label_tratamento'][contains(.,'Ban Claro')]//following::span[1]").get_attribute("innerHTML")
Research xpath locators to find the specific element you want.
Assuming you were using Java, the code would be:
webdriver.findElement(By.xpath("//b[text()='Ban Claro']/following::span")).getText();
or
webdriver.findElement(By.xpath("//b[@class='label_tratamento']/following::span")).getText();
Use:
driver.findElement(By.xpath("(XPATH OF DIV HERE)[INDEX OF THE DIV CONTAINING THE SPAN, e.g. 4]/span")).getText();

selenium webdriver find element in region

Working with automated testing, I have come across the following issue quite often: I want to find an element on the page, but the element has to be in a specific region of the page.
Take the following as an example:
I have a search field with type-ahead on the site. In my sidebar, I have the element I am searching for (let's call it "Selenium"), but that is not the element I am interested in. I want to see whether my type-ahead search delivers the expected result when searching for "Selenium".
<body>
<header>
<searchfield>
<searchresults>
<a .... >Selenium</a>
</searchresults>
</searchfield>
</header>
<aside>
...
<a .... >Selenium</a>
...
</aside>
</body>
If I search for the link text "Selenium" in Selenium WebDriver, it will find both entries: the one in the sidebar as well as the one in the search field.
Furthermore, I am not able to wait for the search result with the WaitForVisible command on the linkText, as Selenium will find the element in the sidebar and conclude that the element is present.
So my question is:
"With selenium webdriver, how do I search for an element within a specific region?"
Poking around with this issue, I came across the use of Xpath. With this I could create "areas" where I want to search for an element. As an example, I went from
html/body/div/aside/div[2]/ul/li
to
//div[contains(@class,'coworkerwidget')]/ul/li
Now the code is MUCH more dynamic, and less prone to errors if our frontend guys edit something in the future.
Regarding the search, I could now set up something like the following code
"//div[contains(@class, 'searchfield')]//div[contains(@title, 'searchfield') and contains(., '" + searchword + "')]"
First we specify that we want to look in the searchfield area:
//div[contains(@class, 'searchfield')]
I can then set some more criteria for the result I want to find:
//div[contains(@class, 'title') and contains(., '" + searchword + "')]
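Concatenating the search word straight into the expression breaks as soon as the word contains a single quote. Here is a minimal sketch in Python of building the same region-scoped XPath with safe quoting. The helper names xpath_literal and search_result_xpath are hypothetical; the class names 'searchfield' and 'title' are taken from the expressions above.

```python
def xpath_literal(value):
    """Return value as an XPath 1.0 string literal."""
    if "'" not in value:
        return "'" + value + "'"
    if '"' not in value:
        return '"' + value + '"'
    # Both quote types present: XPath 1.0 has no escape character,
    # so split on single quotes and glue the pieces back with concat().
    parts = value.split("'")
    return "concat(" + ", \"'\", ".join("'" + p + "'" for p in parts) + ")"


def search_result_xpath(searchword):
    """Build an XPath that only matches results inside the searchfield region."""
    region = "//div[contains(@class, 'searchfield')]"
    result = "//div[contains(@class, 'title') and contains(., {})]".format(
        xpath_literal(searchword)
    )
    return region + result


print(search_result_xpath("Selenium"))
```

The returned string can be passed to whichever XPath locator call the test already uses, and it never matches the identically named link in the sidebar because that link is outside the searchfield region.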
Some literature on the subject for further reading:
http://www.qaautomation.net/?p=388
http://www.w3schools.com/xsl/xpath_syntax.asp
Click a button with XPath containing partial id and title in Selenium IDE
Retrieve an xpath text contains using text()

Retrieve specific text from HTML snippet using Selenium WebDriver

I have the following HTML snippet, and I want to retrieve only the text 3 from it:
<div class="assigned" ng-repeat="counte">
3
<div class="ng-binding">Assigned</div>
</div>
But when I use the following, it returns 3Assigned:
driver.findElement(By.xpath("//div[@class='assigned']")).getText();
I only want to retrieve 3, how can I achieve that?
When you evaluate a node's text, you get the concatenated text of all of its text children, grandchildren, etc.
You can use the [1] indexer as a suffix to obtain just the first text node:
driver.findElement(By.xpath("(//div[@class='assigned']/text())[1]")).getText();
Try this XPath:
//div[@class='ng-binding']/preceding-sibling::text()
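Because getText() returns the parent's text together with that of its children, another common workaround is to subtract the children's text from the parent's text. A minimal sketch in plain Python, assuming both strings have already been read through WebDriver; own_text is a hypothetical helper name.

```python
def own_text(parent_text, child_texts):
    """Remove each child's text once from the parent's text and trim the rest."""
    remaining = parent_text
    for child in child_texts:
        remaining = remaining.replace(child, "", 1)
    return remaining.strip()


# parent_text would come from the outer div, child_texts from its inner div(s).
print(own_text("3\nAssigned", ["Assigned"]))  # 3
```

This is only a heuristic: if a child's text also occurs earlier in the parent's own text, the wrong occurrence is removed.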

CSS locator for corresponding xpath for selenium

Part of the HTML of the webpage I'm testing looks like this:
<div id="twoWideCallouts">
<div class="callout">
<a target="_blank" href="http://facebook.com">Facebook</a>
</div>
<div class="callout last">
<a target="_blank" href="http://youtube.com">Youtube</a>
</div>
I have to check using Selenium that when I click on the text, the URL opened is the same as the one given in href and not an error page.
Using XPath I've written the following command:
// i is the iterator
selenium.getAttribute("//div[contains(@class, 'callout')]["+i+"]/a/@href")
However, this is very slow and doesn't work for some of the links. From reading many answers and comments on this site I've learned that CSS locators are faster and cleaner to maintain, so I wrote it again as
css = div:contains(callout)
Firstly, I'm not able to reach the anchor tag.
Secondly, this page can have any number of divs where class = callout. Using xpathcount I can get the count of these, and I'll iterate over that count and perform the href check. How can something similar be done using a CSS locator?
Any help would be appreciated.
EDIT
I can click on the link using the locator css=div.callout a, but when I try to read the href value using
String str = "css=div.callout a[href]";
selenium.getAttribute(str);
I get the error "element not found". The console output is given below.
19:12:33.968 INFO - Command request: getAttribute[css=div.callout a[href], ] on session
19:12:33.993 INFO - Got result: ERROR: Element css=div.callout a[href not found on session
I tried to get the href attribute using XPath like this
"xpath=(//div[contains(@class, 'callout')])["+1+"]/a/@href" and it worked fine.
Please tell me what should be the corresponding CSS locator for this.
It should be -
css = div:contains(callout)
Did you notice ":" instead of "." you used?
For CSSCount this might help -
http://www.eviltester.com/index.php/2010/03/13/a-simple-getcsscount-helper-method-for-use-with-selenium-rc/
On a different note, did you see proposal of new selenium site on area 51 - http://area51.stackexchange.com/proposals/4693/selenium.
To read the attribute I used css=div.callout a@href and it worked. The problem was the square brackets around the attribute name.
For the first part of your question, anchor your identifier on the hyperlink:
css=a[href=http://youtube.com]
For achieving a count of elements in the DOM, based on CSS selectors, here's an excellent article.
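With the WebDriver API (rather than the Selenium RC API used in the question), counting and iterating can both be done from a single CSS query. A minimal sketch in Python, assuming a driver object that exposes Selenium's find_elements(by, value) and elements that expose get_attribute(name); callout_hrefs is a hypothetical helper name, and "css selector" is the string behind By.CSS_SELECTOR.

```python
CSS_SELECTOR = "css selector"  # same string as selenium's By.CSS_SELECTOR


def callout_hrefs(driver):
    """Return the href of every link inside a div with class 'callout'."""
    links = driver.find_elements(CSS_SELECTOR, "div.callout a")
    return [link.get_attribute("href") for link in links]
```

len(callout_hrefs(driver)) then gives the count, and the list itself can be looped over for the href check, so no separate count call is needed.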