I'm trying to write a Python script to get some information from Google's product listings shown at the top right of the screen (the usual 6 pictures with price and seller).
I am using Python, PhantomJS and Selenium
Doing a google search for "red shoe" I want my script to return the prices. I get stuck in the step where I try to even find the element containing the products. Am I missing something with my xpath?
def getTopSongs(object):
    print "Working YETI"
    browser = webdriver.PhantomJS('c:/projects/phantomjs/phantomjs.exe')
    browser.get('http://google.com/search?q=red+shoe')
    time.sleep(5)
    title = browser.find_element_by_xpath('//div[contains(@class, "pla-unit")]/text()[contains(., "red")]/following::b').text
On Google's page, the element I want sits under a few nested divs:
<div id="rhs">
...
<div class="_Pwb">
<div class="_Ohb">
<div style="width:109px" class="pla-unit">
<div class="_PD">
<div class="pla-unit-img-container">
<div class="_Z5">
<div class="_vT"><a href="http://www.somewebsite.com">
<span class="rhsl4">Nina 'Forbes' Peep Toe Pump <b>Red</b> R...</span>
<span class="rhsg3 rhsl5">Nina 'Forbes' Peep Toe Pum...</span>
<span class="rhsg4">Nina 'Forbes' Peep Toe Pu...</span></a>
</div>
<div class="_QD"><b>$78.95</b></div>
<div class="_mC">
<span class="rhsl4 a">Nordstrom</span>
<span class="rhsg3 rhsl5 a">Nordstrom</span>
<span class="rhsg4 a">Nordstrom</span>
</div>
</div>
*Update:
I added more HTML. In this example I am looking to get the text ($78.95) and (Nordstrom).
*Update
To clarify,
<div id="rhs">
is a unique element.
There are however multiple (6) elements of:
<div style="width:109px" class="pla-unit">
The elements under each category have the same name and follow the same structure and substructures
ie, there are 6
<div class="_PD">
<div class="pla-unit-img-container">
<div class="_Z5">
<div class="_vD">
<div class="_QD">
<div class="_mC">
and so on.
The main objective is to get all of the elements, but for debugging purposes I was asking for help getting the first one.
The xpath for a price unit using XPathChecker on Firefox is:
id('rhs_block')/x:div[1]/x:div/x:div/x:div/x:div[1]/x:div[1]/x:div[2]/x:div[2]/x:b
You can use ancestor:: to go back up then following-sibling:: to get elements at the same level that follow it.
I haven't tried this but give it a shot:
title = browser.find_element_by_xpath('//div[contains(@class, "pla-unit")]/text()[contains(., "red")]/ancestor::div/following-sibling::div[1]').text
Then to get to your div with class="_mC" you just change:
following-sibling::div[1]
to
following-sibling::div[2]
and get the text from the spans under that.
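As a rough, browser-free way to check the "work relative to each unit" idea, here is a minimal sketch using Python's standard-library xml.etree.ElementTree. It only supports a subset of XPath, so it uses exact @class matches instead of contains(). The trimmed sample markup, the second product and its values are assumptions made up for illustration, not real Google output.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed copy of the question's markup.
# It is kept well-formed so that the XML parser accepts it.
SAMPLE = """
<div id="rhs">
  <div class="pla-unit">
    <div class="_vT"><a href="http://www.somewebsite.com"><span class="rhsl4">Peep Toe Pump <b>Red</b></span></a></div>
    <div class="_QD"><b>$78.95</b></div>
    <div class="_mC"><span class="rhsl4 a">Nordstrom</span></div>
  </div>
  <div class="pla-unit">
    <div class="_vT"><a href="http://www.example.com"><span class="rhsl4">Example <b>Red</b> Shoe</span></a></div>
    <div class="_QD"><b>$10.00</b></div>
    <div class="_mC"><span class="rhsl4 a">Example Seller</span></div>
  </div>
</div>
"""


def extract_products(markup):
    """Return a (price, seller) pair for every pla-unit block."""
    root = ET.fromstring(markup)
    products = []
    for unit in root.findall(".//div[@class='pla-unit']"):
        # Both paths are relative to the current unit,
        # so each price stays paired with its own seller.
        price = unit.findtext(".//div[@class='_QD']/b")
        seller = unit.findtext(".//div[@class='_mC']/span")
        products.append((price, seller))
    return products


products = extract_products(SAMPLE)
print(products)
```

The same loop-over-units pattern should carry over to Selenium: fetch the units with find_elements_by_xpath, then run a relative XPath (starting with .//) on each unit.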
I mention Selenium Basic because that is where the XPath will be used. I believe Selenium Basic uses Selenium version 2, so it may not understand some or all answers that require the latest Selenium. Please take that into account if necessary.
There are dynamic classes at play here.
Criteria for selection.
1. Class starting with 'NextToJump__eventWrapper' (the outer one) must be used.
2. Class starting with 'NextToJump__venue' must contain text = 'Ballarat'
3. Class starting with 'NextToJump__race' (and/or span) must contain text = 'Race 2'
I need to be able to click on the <a> tag that contains Points 2 and 3.
The best that I've been able to do (and checked) using ChroPath in Chrome Devtools is...
//div[starts-with(@class,'NextToJump__eventWrapper')]//descendant::*[contains(text(),'Ballarat')]
But note that there are 2 cases of Point 2 in the HTML but only 1 case that satisfies Points 2 and 3.
Thanks
<div class="NextToJump__eventWrapper--13zZJ">
<div>
<div class="NextToJump__raceEvent--bfMON" data-testid="next-to-jump-item">
<a class="Link__link--9x4YY" href="/racing-betting/greyhound-racing/crayford-am/20200708/race-1-1801951-58544404">
<div class="NextToJump__iconWrapper--1yG60"></div>
<div class="NextToJump__eventDetail--CUzdX">
<div class="NextToJump__venue--1jwWA">Ballarat</div>
<div class="NextToJump__race--3JydR"><span>Race 1</span></div>
</div>
<div class="NextToJump__countdown--EG8mR"><span class="Countdown__countdown--4vRpD Countdown__imminent--2yc2K">52s</span></div>
</a>
</div>
<div class="NextToJump__raceEvent--bfMON" data-testid="next-to-jump-item">
<a class="Link__link--9x4YY active" href="/racing-betting/greyhound-racing/rockhampton/20200708/race-4-1799474-58466521" aria-current="page">
<div class="NextToJump__iconWrapper--1yG60"></div>
<div class="NextToJump__eventDetail--CUzdX">
<div class="NextToJump__venue--1jwWA">Rockhampton</div>
<div class="NextToJump__race--3JydR"><span>Race 4</span></div>
</div>
<div class="NextToJump__countdown--EG8mR"><span class="Countdown__countdown--4vRpD Countdown__imminent--2yc2K">2m 52s</span></div>
</a>
</div>
<div class="NextToJump__raceEvent--bfMON" data-testid="next-to-jump-item">
<a class="Link__link--9x4YY" href="/racing-betting/greyhound-racing/ballarat/20200708/race-4-1799454-58465201">
<div class="NextToJump__iconWrapper--1yG60"></div>
<div class="NextToJump__eventDetail--CUzdX">
<div class="NextToJump__venue--1jwWA">Ballarat</div>
<div class="NextToJump__race--3JydR"><span>Race 2</span></div>
</div>
<div class="NextToJump__countdown--EG8mR"><span class="Countdown__countdown--4vRpD Countdown__imminent--2yc2K">5m 52s</span></div>
</a>
</div>
</div>
</div>
The xpath expression you need to use to select your target <a> tag is long and convoluted, but that's life....
[formatted for ease of reading, but you can use that in one line]
//a
  [ancestor::div[starts-with(@class,'NextToJump__eventWrapper')]]
  [.//div[.="Ballarat"]
    [starts-with(@class,'NextToJump__venue-')]
    [./following-sibling::div[.="Race 2"]
      [starts-with(@class,'NextToJump__race-')]
    ]
  ]
Edit:
In "plain English":
Find an <a> node which meets ALL these conditions: (i) it has an ancestor (not just a parent) node which is a <div> whose class attribute value starts with NextToJump__eventWrapper; and (ii) it has a <div> descendant (not just a child) node which has Ballarat as its text AND whose class attribute value starts with NextToJump__venue-, where that <div> descendant itself has a following sibling <div> which has Race 2 as its text AND whose class attribute value starts with NextToJump__race-.
Yes, the word "plain" doesn't really fit here, but that's the closest I could get. I like xpath, and it's very powerful, but sometimes it's very hard to follow... As an aside, it would have been somewhat less cryptic if xquery was used instead of straight xpath.
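If it helps to see the same conditions outside XPath, here is a minimal sketch in plain Python using the standard-library xml.etree.ElementTree. ElementTree has no starts-with(), so the prefix checks are done with str.startswith. This is an illustration on a trimmed copy of the question's markup, not a Selenium replacement, and it only checks that both divs sit inside the same <a>; it does not enforce the following-sibling relationship. The helper names are made up.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's markup (icon and countdown divs dropped).
SAMPLE = """
<div class="NextToJump__eventWrapper--13zZJ">
  <div>
    <div class="NextToJump__raceEvent--bfMON">
      <a class="Link__link--9x4YY" href="/racing-betting/greyhound-racing/crayford-am/20200708/race-1-1801951-58544404">
        <div class="NextToJump__eventDetail--CUzdX">
          <div class="NextToJump__venue--1jwWA">Ballarat</div>
          <div class="NextToJump__race--3JydR"><span>Race 1</span></div>
        </div>
      </a>
    </div>
    <div class="NextToJump__raceEvent--bfMON">
      <a class="Link__link--9x4YY" href="/racing-betting/greyhound-racing/ballarat/20200708/race-4-1799454-58465201">
        <div class="NextToJump__eventDetail--CUzdX">
          <div class="NextToJump__venue--1jwWA">Ballarat</div>
          <div class="NextToJump__race--3JydR"><span>Race 2</span></div>
        </div>
      </a>
    </div>
  </div>
</div>
"""


def text_of(elem):
    """Full text of an element, including its descendants (like '.' in XPath)."""
    return "".join(elem.itertext()).strip()


def has_div(anchor, class_prefix, text):
    """True if the anchor contains a div with this class prefix and exact text."""
    return any(
        div.get("class", "").startswith(class_prefix) and text_of(div) == text
        for div in anchor.iter("div")
    )


def find_links(markup, venue, race):
    """Return the <a> elements inside an eventWrapper that match venue and race."""
    root = ET.fromstring(markup)
    links = []
    for wrapper in root.iter("div"):
        if not wrapper.get("class", "").startswith("NextToJump__eventWrapper"):
            continue
        for anchor in wrapper.iter("a"):
            if has_div(anchor, "NextToJump__venue", venue) and has_div(
                anchor, "NextToJump__race", race
            ):
                links.append(anchor)
    return links


matches = find_links(SAMPLE, "Ballarat", "Race 2")
print([a.get("href") for a in matches])
```

Only the second link satisfies both the venue and the race condition, which is the same reason the nested predicates in the XPath are needed.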
I have a requirement to verify field name and values. My code looks like
<div class="line info">
<div class="unit labelInfo TextMdB">
Reference #:
</div>
<div class="unit lastUnit">
701
</div>
</div>
</div>
<div class="line info">
<div class="unit labelInfo TextMdB">
Registered Date:
</div>
<div class="unit lastUnit">
05/05/2020
</div>
</div>
I gave my xpath as
"//div[@class='unit lastUnit']//preceding-sibling::div[@class='unit labelInfo TextMdB' and contains(text(),'Reference #:')]".
With this xpath I am able to reach the "Reference #:" field. But how do I verify that the field is displaying its value (in this case 701)?
Appreciate your response.
Thanks
You can first reach the Reference # label by using its text in the xpath, then use following-sibling to fetch the value div, and then use the getText() method (Java) or the text attribute (Python) to get 701.
(Edited answer after OP's comment)
If you want to check if the element is displayed on the page or not then you can fetch its list and check if the size of that list is greater than 0 or not.
You can do it like:
In Java:
List<WebElement> elementList = driver.findElements(By.xpath("//div[@class='line info']//div[contains(text(),'Reference #')]//following-sibling::div"));
if (elementList.size() > 0) {
    // Element is present on the UI
    // Finding its text
    String text = elementList.get(0).getText();
}
In python:
elementList = driver.find_elements_by_xpath("//div[@class='line info']//div[contains(text(),'Reference #')]//following-sibling::div")
if len(elementList) > 0:
    # Element is present
    # Printing its text
    print(elementList[0].text)
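To verify several label/value pairs at once, one option is to pair each label with the value div in the same row. Below is a minimal sketch of that pairing using the standard-library xml.etree.ElementTree on the HTML from the question (wrapped in a hypothetical outer div so it parses as one document). It is not Selenium code, and read_fields is a made-up helper name.

```python
import xml.etree.ElementTree as ET

# The question's markup, wrapped in an outer <div> (an assumption) so it is one tree.
SAMPLE = """
<div>
  <div class="line info">
    <div class="unit labelInfo TextMdB">
      Reference #:
    </div>
    <div class="unit lastUnit">
      701
    </div>
  </div>
  <div class="line info">
    <div class="unit labelInfo TextMdB">
      Registered Date:
    </div>
    <div class="unit lastUnit">
      05/05/2020
    </div>
  </div>
</div>
"""


def read_fields(markup):
    """Map each label text to the value shown next to it."""
    root = ET.fromstring(markup)
    fields = {}
    for line in root.findall(".//div[@class='line info']"):
        label = line.find("div[@class='unit labelInfo TextMdB']")
        value = line.find("div[@class='unit lastUnit']")
        if label is not None and value is not None:
            # strip() removes the newlines and indentation around the text.
            fields[(label.text or "").strip()] = (value.text or "").strip()
    return fields


fields = read_fields(SAMPLE)
print(fields)
```

With a dictionary like this, checking a field becomes a plain comparison such as fields["Reference #:"] == "701".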
I am trying to get price detail of particular project from URL but I am clueless.
Example: from the URL (https://www.99acres.com/ppc-2515-residential-apartment-mailer), for the project Eden Richmond Enclave, I want the price 14.22 to 31.15 Lac written to range A1.
Below is the code I tried:
Sub test()
    Set driver = CreateObject("Selenium.FirefoxDriver")
    driver.get "https://www.99acres.com/ppc-2515-residential-apartment-mailer"
    Range("A1") = driver.FindElementByXPath("//h1[contains(@class,'font-size15')][contains(text(),'Eden Richmond Enclave')][contains(@class,'product-lrg-box')]").Text
End Sub
Below is the HTML:
<div class="pro-text">
<div class="product-text-box">
<div class="product-heading"><span><img src="https://newprojects.99acres.com/projects/eden_group/eden_richmond_enclave/bb7ttfq9.gif">
<h1 class="font-size15">Eden Richmond Enclave <p>Narendrapur</p>
</h1>
</span> </div>
</div>
<div class="product-text-box">
<ul class="product-lrg-box">
<li> <span><strong><span class="rupee-font">₹ </span>14.22 to 31.15 Lac</strong></span></li>
<li><strong>499-1093 SQFT</strong></li>
<li><strong>1-3 BHK</strong></li>
<li style="width:20% !important;"><strong>December 2020</strong></li>
</ul>
<div id="tabs" class="tab-link tabs-menu tabs-menu-new">
<ul>
<li>e-Brochure</li>
<li>Amenities</li>
<!-- <li style="width:20% !important;">Floor Plan</li>-->
<li style="width:20% !important;">Directions</li>
</ul>
</div>
<span class="enquire-new-bt" id="294015-469203,151100-enquire-new-bt" data-val="2"> I am Interested </span> </div>
</div>
Perhaps try
.FindElementByCSS("ul.product-lrg-box span")
Depends whether you intend to repeat this process for other elements.
For the above, with the HTML you supplied, you get the price text (₹ 14.22 to 31.15 Lac).
Otherwise,
.FindElementByCSS("ul.product-lrg-box")
retrieves the entire string.
Without being able to view the page and knowing if you want to retrieve more elements you might consider
.FindElementsByCss("ul.product-lrg-box span")
and looping over the collection returned (if valid); or,
Try scraping with IE and using something like:
IE.document.querySelectorAll("ul.product-lrg-box span")
and then looping over the NodeList returned.
To extract the text 14.22 to 31.15 Lac you can use either of the following Locator Strategies:
XPath:
//h1[contains(.,'Eden Richmond Enclave')]//following::div[1]/ul[@class='product-lrg-box']/li//span/strong[not(@class='rupee-font')]
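To sanity-check the "find the heading, then read the price from the list next to it" logic without a browser, here is a minimal Python sketch using the standard-library xml.etree.ElementTree on a trimmed copy of the HTML above. It is an illustration only (the question itself uses VBA), price_for is a made-up helper name, and the rupee sign is written as the character reference &#8377; to keep the sample ASCII.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's HTML; &#8377; is the rupee sign.
SAMPLE = """
<div class="pro-text">
  <div class="product-text-box">
    <div class="product-heading">
      <h1 class="font-size15">Eden Richmond Enclave <p>Narendrapur</p></h1>
    </div>
  </div>
  <div class="product-text-box">
    <ul class="product-lrg-box">
      <li> <span><strong><span class="rupee-font">&#8377; </span>14.22 to 31.15 Lac</strong></span></li>
      <li><strong>499-1093 SQFT</strong></li>
      <li><strong>1-3 BHK</strong></li>
    </ul>
  </div>
</div>
"""


def price_for(markup, project):
    """Return the price text for the named project, or None if it is not found."""
    root = ET.fromstring(markup)
    for block in root.iter("div"):
        if block.get("class") != "pro-text":
            continue
        heading = block.find(".//h1")
        if heading is None or project not in "".join(heading.itertext()):
            continue
        # First <strong> under the first <li> holds the price.
        strong = block.find(".//ul[@class='product-lrg-box']/li//strong")
        if strong is None:
            return None
        parts = [strong.text or ""]
        for child in strong:
            # Skip the currency symbol span, but keep the text that follows it.
            if child.get("class") != "rupee-font":
                parts.append("".join(child.itertext()))
            parts.append(child.tail or "")
        return "".join(parts).strip()
    return None


price = price_for(SAMPLE, "Eden Richmond Enclave")
print(price)
```

The price text is the tail of the rupee-font span rather than the span's own text, which is why the sketch collects child.tail.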
I am learning Angular 2 using the ng-book2 book, and I was just playing around with the built-in directives.
I was reading about ngSwitch and stumbled upon this feature, where we can write multiple ngSwitchWhen with the same condition, as in the following code:
<ul [ngSwitch]="choice">
<li *ngSwitchWhen="1">First choice</li>
<li *ngSwitchWhen="2">Second choice</li>
<li *ngSwitchWhen="3">Third choice</li>
<li *ngSwitchWhen="4">Fourth choice</li>
<li *ngSwitchWhen="2">Second choice, again</li>
<li *ngSwitchDefault>Default choice</li>
</ul>
which will output the following result:
Second choice
Second choice, again
I wrote code as below:
<div [ngSwitch]="myVar">
<div *ngSwitchWhen="myVar==1">My Var is 1</div>
<div *ngSwitchWhen="myVar==2">My Var is 2</div>
<div *ngSwitchWhen="myVar==3">My Var is 3</div>
<div *ngSwitchWhen="myVar==3">Special feature of ng Swtich</div>
<div *ngSwitchDefault>My Var is {{myVar}}</div>
</div>
which does not print both lines for the repeated condition.
I thought my code was correct, but when I looked again at *ngSwitchWhen="myVar==3"
I found my mistake.
Strangely, though, it works properly except for the repeated conditions.
Is there any difference between these two conditions?
*ngSwitchWhen="2"
*ngSwitchWhen="myVar==3"
Which one to use?
ngSwitchWhen="2"
This expression compares the switch value, myVar, against the literal 2 (in effect, myVar == 2).
ngSwitchWhen="myVar==3"
Whereas this expression is evaluated first, so the comparison becomes myVar == (myVar == 3). The value inside the parentheses is true if myVar is 3 and false if not, and that boolean is what gets compared with myVar.
Example HTML below. I want to locate the elements that contain the text Person Attributes and Age.
<div id="ext-gen210" class="x-tool x-tool-toggle"></div>
<span id="ext-gen214" class="x-panel-header-text">Person Attributes</span>
</div>
<div id="ext-comp-1071" class=" DDView ListView" style="height: auto;">
<p class="dragItem "><i class="icon-Gen"></i>Age</p>
</div>
Note: I am looking for a solution without using xpath or id or className as they might change with every new release of my website.
I tried to locate them using
'name' --> By.name("Person Attributes") and By.name("Age") but both failed.
By.name checks the name attribute. What you need is to check the text of an element using By.xpath:
By.xpath('//div[span/text() = "Person Attributes"]')
Or, you can also check that the id attribute starts with ext-gen:
By.xpath('//div[starts-with(@id, "ext-gen")]')
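To try text-based matching outside the browser, here is a minimal Python sketch using the standard-library xml.etree.ElementTree, whose limited XPath subset supports [tag='text'] and, from Python 3.7, [.='text']. The outer wrapper and the ext-comp-1070 id are assumptions added so the fragment parses as one document; they are not in the original HTML.

```python
import xml.etree.ElementTree as ET

# The question's markup inside hypothetical wrapper divs (ids "root" and
# "ext-comp-1070" are made up) so that the fragment is well-formed.
SAMPLE = """
<div id="root">
  <div id="ext-comp-1070">
    <div id="ext-gen210" class="x-tool x-tool-toggle"></div>
    <span id="ext-gen214" class="x-panel-header-text">Person Attributes</span>
  </div>
  <div id="ext-comp-1071" class=" DDView ListView" style="height: auto;">
    <p class="dragItem "><i class="icon-Gen"></i>Age</p>
  </div>
</div>
"""

root = ET.fromstring(SAMPLE)

# A div that has a child <span> whose full text is "Person Attributes".
header_divs = root.findall(".//div[span='Person Attributes']")

# A <p> whose complete text content (descendants included) is "Age".
# The [.='text'] form needs Python 3.7 or later.
age_items = root.findall(".//p[.='Age']")

print([d.get("id") for d in header_divs])
print([p.get("class") for p in age_items])
```

Note that "Age" follows the empty <i> tag, so a check against the element's whole text content is needed rather than only its leading text.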