BeautifulSoup 4: select all divs with at least one child p tag with specific class - beautifulsoup

I'd like to extract a list of divs (including their children, for further processing) which contain one or more <p class="c8"> child tags, using BeautifulSoup 4, but I haven't had any luck using the CSS selector syntax. Can I use find_all and a boolean function, or is there a better way?

There are different ways to approach the problem. One is to locate all p elements with class="c8" and find the enclosing div of each:
for p in soup.find_all("p", class_="c8"):
    div = p.find_parent("div")
Note that find_parent returns the nearest ancestor div, and a div with several matching p elements is visited once per p.
You can also write a filter function that matches div elements containing the desired p:
def filter_div(elm):
    return elm.name == "div" and elm.find("p", class_="c8") is not None
for div in soup.find_all(filter_div):
    ...  # do something with div
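Since the question mentions CSS selectors, here is a minimal sketch of how the same filter might look with select(), assuming a BeautifulSoup version (4.7 or later) whose bundled selector engine supports the :has() pseudo-class. The sample markup and the helper name has_direct_c8 are made up for illustration. It also shows the difference between matching direct children only and matching descendants at any depth.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: "a" has a direct p.c8 child, "b" has none,
# and "c" only has a p.c8 nested one level deeper.
html = """
<div id="a"><p class="c8">one</p><p>two</p></div>
<div id="b"><p class="c9">three</p></div>
<div id="c"><section><p class="c8">nested</p></section></div>
"""
soup = BeautifulSoup(html, "html.parser")

# CSS: divs with a p.c8 as a direct child
direct = soup.select("div:has(> p.c8)")

# CSS: divs with a p.c8 anywhere below them
any_depth = soup.select("div:has(p.c8)")

# Equivalent filter function restricted to direct children
def has_direct_c8(tag):
    return (
        tag.name == "div"
        and tag.find("p", class_="c8", recursive=False) is not None
    )

direct_fn = soup.find_all(has_direct_c8)

print([d["id"] for d in direct])
print([d["id"] for d in any_depth])
```

Each returned div is a normal Tag, so its children remain available for further processing.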

Related

How to access 2nd element with same class name using css selectors

I want to access the 2nd element with same class name using css selectors.
1st element:
<a class="good">
2nd element:
<a class="good">
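CSS selector I am using:
a.good
but this matches both of them.
How can I access the 2nd one, or any one of them individually?
You can use a structural pseudo-class such as :nth-of-type(n) or :nth-child(n). These match elements based on their position among their siblings. For example, a.good:nth-of-type(2) matches an a element with class good that is the second a element under its parent. Note that the position is counted by element type, not by class.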
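A small sketch of the positional approach, run offline with BeautifulSoup's select() rather than in a browser. The markup is hypothetical and assumes both links are siblings under the same parent. Indexing into the full result list is shown as an alternative that does not depend on sibling position.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: two sibling links sharing the same class
html = (
    '<div>'
    '<a class="good" href="#1">first</a>'
    '<a class="good" href="#2">second</a>'
    '</div>'
)
soup = BeautifulSoup(html, "html.parser")

# Positional pseudo-class: the second <a> among its siblings, if it has class "good"
second_by_position = soup.select_one("a.good:nth-of-type(2)")

# Alternative: take all matches and index into the list (0-based)
second_by_index = soup.select("a.good")[1]

print(second_by_position.get_text(), second_by_index.get_text())
```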

Choose the correct element from the list of objects with the same className

Quick one: I am trying to avoid XPath and use CSS selectors instead, because of the performance issues XPath can have. I would like to know the right approach for locating, for example, "A" in the list:
<div class="input-search-suggests" xpath="1">
<div class="input-search-suggests-item">A</div>
<div class="input-search-suggests-item">B</div>
<div class="input-search-suggests-item">C</div>
</div>
Currently I am locating A using an XPath on span, but it would be sufficient to locate all elements that share the class "input-search-suggests-item" and then grab A from that list.
@FindBy(xpath = "//span[contains(text(),'A')]")
CSS selectors cannot match on an element's text the way XPath can.
This means that for the XPath below
xpath = "//span[contains(text(),'A')]"
you cannot write an equivalent CSS selector based on the text A.
Instead, to locate A with a CSS selector, you can use:
div.input-search-suggests > div.input-search-suggests-item
In Selenium, something like this:
@FindBy(css = "div.input-search-suggests > div.input-search-suggests-item")
Even though this selector matches 3 nodes, findElement returns the first matching web element.
You may also want to look at :nth-child(n)
div.input-search-suggests > :nth-child(1)
which uses an index to locate A, B, or C.
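The selectors above can be tried offline against the markup from the question. This sketch uses BeautifulSoup's select() instead of Selenium, which is an assumption made only so the example runs without a browser. It shows the three options: first match, match by index, and filtering the matched list by text in code, since CSS itself cannot do that.

```python
from bs4 import BeautifulSoup

# Markup from the question (without the injected xpath attribute)
html = """
<div class="input-search-suggests">
<div class="input-search-suggests-item">A</div>
<div class="input-search-suggests-item">B</div>
<div class="input-search-suggests-item">C</div>
</div>
"""
soup = BeautifulSoup(html, "html.parser")
selector = "div.input-search-suggests > div.input-search-suggests-item"

# All three suggestions, in document order
items = soup.select(selector)

# First match only, analogous to findElement
first = soup.select_one(selector)

# Index-based lookup with :nth-child (1-based)
second = soup.select_one("div.input-search-suggests > div:nth-child(2)")

# Text-based lookup has to happen in code, not in the selector
by_text = next(i for i in items if i.get_text(strip=True) == "A")

print(first.get_text(), second.get_text(), by_text.get_text())
```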

How to extract the text from a child node which is within a <div> tag through Selenium and WebDriver?

I need to get the value 107801307 that is inside a specific Div but there are several other Divs in the path before getting into that DIV I need. Can anyone help?
Below is the image with the information that I need to extract from the DIV.
As per the HTML you have provided, to extract the text 107801307 you can use the following solution:
Java:
String myText = driver.findElement(By.xpath("//b[@class='label_tratamento'][contains(.,'Ban Claro')]//following::span[1]")).getAttribute("innerHTML");
Python:
from selenium.webdriver.common.by import By  # find_element_by_xpath was removed in Selenium 4
myText = driver.find_element(By.XPATH, "//b[@class='label_tratamento'][contains(.,'Ban Claro')]//following::span[1]").get_attribute("innerHTML")
Look into XPath locators to find the specific element you want.
Assuming you are using Java, the code would be:
webdriver.findElement(By.xpath("//b[text()='Ban Claro']/following::span")).getText();
or
webdriver.findElement(By.xpath("//b[@class='label_tratamento']/following::span")).getText();
Use the following, where XPATH_OF_DIV_HERE is a placeholder for the XPath of the div and 4 is an example index of the div that contains the span:
driver.findElement(By.xpath("(XPATH_OF_DIV_HERE)[4]/span")).getText();

Selenium. child xpath selector as a pointer to parent element

Let's say my HTML structure has a couple of similar div elements.
body/div/...
body/div/...
body/div/...
body/DIV/div[@class='class']
I would like to access the last one, the one written in upper case.
The "//body/div/" selector will find many unrelated elements.
And "//div/div[@class='class']" will select the child div, not the upper-case parent.
Use the parent axis, or its abbreviation "..", to select the parent of a given element:
//div/div[@class='class']/..
If you want to be more verbose, you can also select the parent by a given tag name:
//div/div[@class='class']/parent::div
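The idea of stepping from a child back up to its parent can be tried offline with Python's xml.etree.ElementTree, which implements a limited XPath subset including the abbreviated ".." step. This is only a sketch: the markup and the id attributes are made up for illustration, and Selenium itself evaluates full XPath in the browser.

```python
import xml.etree.ElementTree as ET

# Hypothetical structure: three sibling divs, only the last one
# contains the child div with class="class".
doc = ET.fromstring(
    "<body>"
    "<div id='a'><span/></div>"
    "<div id='b'><span/></div>"
    "<div id='target'><div class='class'/></div>"
    "</body>"
)

# Select the child by its class, then step up to its parent with ".."
parents = doc.findall(".//div/div[@class='class']/..")

print([p.get("id") for p in parents])
```

Only the div that actually contains the class='class' child is returned, which is the element the question is after.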

Get first div contents only rather than all with same class

I have three divs, each with class myDiv (just as an example). Each div has an unordered list with list items inside it.
So I can write the XPath as
By.xpath("//div[@class='myDiv']/ul/li")
I want the first myDiv only.
But this gives results from all three divs. How can I get only the first div's contents? Please help me modify this XPath.
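As discussed with the OP, the following are potential XPaths to go with.
//li[contains(@id,'100_deal')]
(//div[@class='gbwshoveler-content'])[position()=1]
//div[@id="deals-onethirtyfive-hero10903707629515"]//ul/li
(//div[@class='gbwshoveler-content'])[1]
You can get the first div using
By.xpath("//div[@class='myDiv'][1]/ul/li")
This works when the three divs are siblings, because [1] here means "the first matching div under its parent". If they have different parents, wrap the expression in parentheses to take the first match in the whole document: By.xpath("(//div[@class='myDiv'])[1]/ul/li")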
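The difference between a positional predicate applied per parent and one applied to the whole result set is easy to miss, so here is a small sketch using lxml (an assumption; the original answers use Selenium). The markup is hypothetical and deliberately puts each myDiv under a different parent.

```python
from lxml import etree

# Hypothetical markup: each myDiv sits in its own <section>
doc = etree.fromstring(
    "<body>"
    "<section><div class='myDiv'><ul><li>a1</li><li>a2</li></ul></div></section>"
    "<section><div class='myDiv'><ul><li>b1</li></ul></div></section>"
    "<section><div class='myDiv'><ul><li>c1</li></ul></div></section>"
    "</body>"
)

# [1] is evaluated per parent: every div here is the first myDiv in its section
per_parent = doc.xpath("//div[@class='myDiv'][1]/ul/li/text()")

# Parentheses make [1] apply to the whole node set: only the first div overall
first_only = doc.xpath("(//div[@class='myDiv'])[1]/ul/li/text()")

print(per_parent)
print(first_only)
```

With this layout the unparenthesized form still returns items from all three divs, while the parenthesized form returns only the items of the first one.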