I'd like to extract a list of divs (including their children, for further processing) which contain one or more <p class="c8"> child tags, using BeautifulSoup 4, but I haven't had any luck using the CSS selector syntax. Can I use find_all and a boolean function, or is there a better way?
There are different ways to approach the problem. One is to locate all p elements with class="c8" and find the parent div element of each:
for p in soup.find_all("p", class_="c8"):
    div = p.find_parent("div")
You can also write a filter function that matches only div elements containing the desired child:
def filter_div(elm):
    # True only for div tags that contain at least one <p class="c8">
    return elm.name == "div" and elm.find("p", class_="c8") is not None

for div in soup.find_all(filter_div):
    pass  # do something with div
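Here is a minimal sketch of both approaches on a made-up document. The sample HTML and the ids a, b and c are assumptions for illustration, not from the question. A div with several matching p children would otherwise be returned once per p, so the first loop de-duplicates by object identity. The filter passes recursive=False so that only direct children count; drop it if any descendant should match.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, not taken from the question.
html = """
<div id="a"><p class="c8">x</p><p class="c8">y</p></div>
<div id="b"><p class="c1">z</p></div>
<div id="c"><p class="c8 extra">w</p></div>
"""
soup = BeautifulSoup(html, "html.parser")

# Approach 1: start from the matching p tags and walk up to the nearest div.
divs_via_parent = []
seen = set()
for p in soup.find_all("p", class_="c8"):
    div = p.find_parent("div")
    # Compare by identity so a div with two matching p tags is kept once.
    if div is not None and id(div) not in seen:
        seen.add(id(div))
        divs_via_parent.append(div)

# Approach 2: let find_all call a filter function on every tag.
def has_c8_child(tag):
    return (
        tag.name == "div"
        and tag.find("p", class_="c8", recursive=False) is not None
    )

divs_via_filter = soup.find_all(has_c8_child)

print([d["id"] for d in divs_via_parent])
print([d["id"] for d in divs_via_filter])
```

Each returned div is a full Tag object, so its children remain available for further processing.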
Related
I want to access the second of two elements with the same class name using CSS selectors.
1st element:
<a class="good">
2nd element:
<a class="good">
CSS selector I am using:
a.good
but this matches both of them.
How can I access the second one, or any one of them individually?
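You can use a structural pseudo-class such as :nth-child() or :nth-of-type(). These match elements based on their position among a group of siblings, for example a.good:nth-of-type(2). Note that the position counts all sibling a elements, not only those with class good.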
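As a rough illustration, the sketch below uses BeautifulSoup's CSS support to pick the second link in two ways: with a positional pseudo-class and by indexing the list of matches. The surrounding div, the href values and the link texts are made up for the example.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: two sibling links sharing the class "good".
html = (
    '<div>'
    '<a class="good" href="/one">first</a>'
    '<a class="good" href="/two">second</a>'
    '</div>'
)
soup = BeautifulSoup(html, "html.parser")

# Positional pseudo-class: the second <a> among its siblings, if it has class "good".
second_by_position = soup.select_one("a.good:nth-of-type(2)")

# Plain indexing: collect every match, then take the second (Python lists are 0-based).
second_by_index = soup.select("a.good")[1]

print(second_by_position.get_text(), second_by_index.get_text())
```

Indexing the result list is the more robust choice when other a elements without the class may sit between the two matches.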
Quick one: I am trying to avoid XPath and use CSS selectors instead, because of the performance issues XPath can have, so I would like to know the right approach for locating, for example, "A" in this list:
<div class="input-search-suggests" xpath="1">
<div class="input-search-suggests-item">A</div>
<div class="input-search-suggests-item">B</div>
<div class="input-search-suggests-item">C</div>
</div>
Currently I am locating A using XPath on a span, but it would be sufficient to locate all elements that share the class "input-search-suggests-item" and then pick A from that list.
@FindBy(xpath = "//span[contains(text(),'A')]")
CSS selectors cannot match on an element's text the way XPath can.
This means that for the XPath below,
xpath = "//span[contains(text(),'A')]"
you cannot write an equivalent CSS selector based on the text A.
Instead, to locate A using a CSS selector, you can use:
div.input-search-suggests > div.input-search-suggests-item
In Selenium, something like this:
@FindBy(css = "div.input-search-suggests > div.input-search-suggests-item")
Even though this selector matches three nodes, findElement returns the first matching web element.
You may also want to look at :nth-child(n)
div.input-search-suggests > :nth-child(1)
which uses a 1-based index to locate A, B, or C.
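The selectors above can be tried outside a browser. The sketch below uses BeautifulSoup as a stand-in for Selenium (an assumption made only so the example runs without a driver) and the HTML from the question. It shows the "locate all items, then pick A by its text" approach next to the :nth-child variant.

```python
from bs4 import BeautifulSoup

# Markup from the question, parsed offline instead of in a live browser.
html = """
<div class="input-search-suggests">
  <div class="input-search-suggests-item">A</div>
  <div class="input-search-suggests-item">B</div>
  <div class="input-search-suggests-item">C</div>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# CSS cannot filter on text, so select every item and filter in code.
items = soup.select("div.input-search-suggests > div.input-search-suggests-item")
item_a = next(el for el in items if el.get_text(strip=True) == "A")

# Alternatively, address the item by its position among the container's children.
first_item = soup.select_one("div.input-search-suggests > :nth-child(1)")

print(len(items), item_a.get_text(strip=True), first_item.get_text(strip=True))
```

In Selenium, the analogous pattern would be to call findElements with the CSS selector and compare getText() on each result.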
I need to get the value 107801307 that is inside a specific Div but there are several other Divs in the path before getting into that DIV I need. Can anyone help?
Below is the image with the information that I need to extract from the DIV.
As per the HTML you have provided, to extract the text 107801307 you can use the following solution:
Java:
String myText = driver.findElement(By.xpath("//b[@class='label_tratamento'][contains(.,'Ban Claro')]//following::span[1]")).getAttribute("innerHTML");
Python:
myText = driver.find_element_by_xpath("//b[@class='label_tratamento'][contains(.,'Ban Claro')]//following::span[1]").get_attribute("innerHTML")
Research xpath locators to find the specific element you want.
Assuming you were using Java, the code would be:
webdriver.findElement(By.xpath("//b[text()='Ban Claro']/following::span")).getText();
or
webdriver.findElement(By.xpath("//b[@class='label_tratamento']/following::span")).getText();
Use the following, where INDEX is the 1-based position of the div that contains the span (for example, 4):
driver.findElement(By.xpath("(XPATH OF DIV HERE)[INDEX]/span")).getText();
Let's say my HTML structure has a couple of similar div elements.
body/div/...
body/div/...
body/div/...
body/DIV/div[@class='class']
I would like to access the last one, that is, the one written in upper case.
The "//body/div/" selector will match many unrelated elements.
And "//div/div[@class='class']" will select the child div, not the upper-case parent.
Use the parent axis to select the parent of a given element:
//div/div[@class='class']/parent::*
If you want to be more explicit, you can also restrict the parent to a given tag name:
//div/div[@class='class']/parent::div
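The idea of matching the child first and then stepping up to its parent can be sketched with Python's standard-library ElementTree. ElementTree implements only a subset of XPath, so the sketch uses "..", the abbreviated form of the parent step, instead of parent::div. The document and its id attributes are hypothetical and exist only to make the result easy to check.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: three sibling divs, only the last one has the
# child <div class="class"> that identifies it.
doc = ET.fromstring("""
<body>
  <div id="first"><span/></div>
  <div id="second"><span/></div>
  <div id="target"><div class="class"/></div>
</body>
""")

# Match the distinctive child, then step up to its parent with "..".
parents = doc.findall(".//div[@class='class']/..")

print([p.get("id") for p in parents])
```

Selecting through the child avoids having to count sibling divs, so the locator keeps working if more divs are added before the target.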
I have three divs, each with the class myDiv (just as an example), and each div has an unordered list with list items inside it.
So I can write the XPath as
By.xpath("//div[@class='myDiv']/ul/li")
I want the first myDiv only, but this returns the results for all three divs.
How can I get only the contents of the first div? Please help me modify this XPath.
As discussed with the OP, the following are potential XPaths to go with:
//li[contains(@id,'100_deal')]
(//div[@class='gbwshoveler-content'])[position()=1]
//div[@id="deals-onethirtyfive-hero10903707629515"]//ul/li
(//div[@class='gbwshoveler-content'])[1]
You can get the first div using
By.xpath("//div[@class='myDiv'][1]/ul/li")
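One detail worth keeping in mind: in XPath, //div[@class='myDiv'][1] applies the index among the children of each parent, while (//div[@class='myDiv'])[1] takes the first match in the whole document. The two agree when the three divs are siblings. The sketch below mirrors the parenthesized form using the standard-library ElementTree, which supports only a subset of XPath. The document and the li texts are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with three sibling divs of class "myDiv".
doc = ET.fromstring("""
<body>
  <div class="myDiv"><ul><li>a1</li><li>a2</li></ul></div>
  <div class="myDiv"><ul><li>b1</li></ul></div>
  <div class="myDiv"><ul><li>c1</li></ul></div>
</body>
""")

# The unrestricted path returns the list items of all three divs.
all_items = doc.findall(".//div[@class='myDiv']/ul/li")

# Equivalent of (//div[@class='myDiv'])[1]/ul/li: take the first matching
# div in document order, then search only inside it.
first_div = doc.findall(".//div[@class='myDiv']")[0]
first_items = first_div.findall("ul/li")

print(len(all_items), [li.text for li in first_items])
```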