Getting text with BeautifulSoup

I am new to HTML and Python. I would like to do a web-scraping project, but I am having trouble getting some text out of an HTML document.
My HTML is shown below.
<ul class="icons">
<li><span class="img"><img src="https://google/meter-s.png" alt="" title="" class=""></span>71.63%</li>
<li><span class="img"><img src="https://google/money-s.png" alt="" title="" class=""></span>RM 799,000</li>
<li><span class="img"><img src="google/rental-s.png" alt="" title="" class=""></span>€ 2,000/mth</li>
<li><span class="img"><img src="https://google/yield-s.png" alt="" title="" class=""></span>3%</li>
</ul>
I would like to get the 3% text at the end of the list. I tried this:
soup.find(attrs={'li':None}).find('span', class_ = "img").next_sibling
but I end up getting only the first value in the list, which is 71.63%. I hope an expert on this topic can help me. Thank you.

I would just grab all of the li elements, select the last one, and grab the text from there. You can do that like this:
soup.find_all('li')[-1].text
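A slightly fuller version of that one-liner, as a minimal sketch. It assumes the bs4 package is installed, uses the built-in html.parser backend, and narrows the search to the ul with class "icons" so that li elements elsewhere on the page are ignored. The function name last_icon_text is made up for illustration.

```python
from bs4 import BeautifulSoup

# The markup from the question, copied as-is.
HTML = """
<ul class="icons">
<li><span class="img"><img src="https://google/meter-s.png" alt="" title="" class=""></span>71.63%</li>
<li><span class="img"><img src="https://google/money-s.png" alt="" title="" class=""></span>RM 799,000</li>
<li><span class="img"><img src="google/rental-s.png" alt="" title="" class=""></span>€ 2,000/mth</li>
<li><span class="img"><img src="https://google/yield-s.png" alt="" title="" class=""></span>3%</li>
</ul>
"""


def last_icon_text(html: str) -> str:
    """Return the text of the last <li> inside <ul class="icons">."""
    soup = BeautifulSoup(html, "html.parser")
    icons = soup.find("ul", class_="icons")   # first matching list only
    items = icons.find_all("li")              # every <li>, in document order
    return items[-1].get_text(strip=True)     # last one, whitespace trimmed


if __name__ == "__main__":
    print(last_icon_text(HTML))
```

With the markup above this prints 3%. Note that find() only ever returns the first match, so reaching the last item needs find_all() plus an index.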

Cypress - get an element in iframe

I solved the problem of getting into the iframe, but now I can't get my element. Maybe I'm searching the wrong way, but it has already taken me too much time and I don't know what to do next.
Source code:
<div id="ctl00_Telo_Dock_1005_C_ctl00_MainPage1_myPageVozidlo_inpDruhVozidla_ADX" class="inputCell" style="visibility:visible;display:inherit;">
<span id="ctl00_Telo_Dock_1005_C_ctl00_MainPage1_myPageVozidlo_lblDruhVozidla_ADX" class="labels labelC1_n W270">Druh vozidla:
</span>
<div id="ctl00_Telo_Dock_1005_C_ctl00_MainPage1_myPageVozidlo_cmbDruhVozidla_ADX" tabindex="13" class="RadDropDownList RadDropDownList_CMS_Black RadComboBoxInput" style="width:216px;height:23px;font-weight:bold;font-size:10pt;font-family:Arial;color:#396170;border-width:1px;border-style:Solid;border-color:#FDC267;background-color:#F9FBFC;">
<span class="rddlInner">
<span class="rddlFakeInput"></span>
<span class="rddlIcon"><!-- --></span>
</span>
<div class="rddlSlide" id="ctl00_Telo_Dock_1005_C_ctl00_MainPage1_myPageVozidlo_cmbDruhVozidla_ADX_DropDown" style="display:none;">
<div class="rddlPopup rddlPopup_CMS_Black">
<ul class="rddlList">
<li class="rddlItem rddlItemSelected"></li>
<li class="rddlItem">Osobní automobily</li>
<li class="rddlItem">Motocykly</li>
<li class="rddlItem">Užitkové automobily</li>
</ul>
</div>
</div>
<input id="ctl00_Telo_Dock_1005_C_ctl00_MainPage1_myPageVozidlo_cmbDruhVozidla_ADX_ClientState" name="ctl00_Telo_Dock_1005_C_ctl00_MainPage1_myPageVozidlo_cmbDruhVozidla_ADX_ClientState" type="hidden" />
</div>
</div>
My get function:
cy.get('#iframe-id')
.iframe('body #elementToFind')
.should('exist')
Thank you all for helping me.
Unfortunately, Cypress has some open issues with interacting with iframes. But here's a pretty straightforward workaround: https://github.com/cypress-io/cypress/issues/136#issuecomment-328100955.
Anyway, I believe this can work only if the outer page and the iframe are on the same domain, due to the same-origin limitation.

Unable to locate and click the check box

Please help me locate the check box and select it. There are multiple check boxes, and there is no way to locate them uniquely.
Here is the code for one such check box.
Thanks in advance for the help!
<div class="col-xs-2 col-sm-2 col-lg-2" style="height:65px">
<ul class="list-inline pull-right">
<li>
<md-input-container class="md-block">
<md-checkbox value="$index+1check" class="checkbox ng-valid ng-dirty ng-touched ng-empty" ng-model="item.Selectedd" ng-click="toggle($index+1, selected,item.TitleId,item)" icon,md-checkbox.md-checked._md-icon="{background-color: green;}" id="Cbk_List" role="checkbox" tabindex="0" aria-checked="false" aria-invalid="false" style=""><div class="_md-container md-ink-ripple" md-ink-ripple="" md-ink-ripple-checkbox=""><div class="_md-icon"></div></div><div ng-transclude="" class="_md-label">
</div></md-checkbox>
</md-input-container>
</li>
<li>
<div class="manageTitle_CirclCard">
<div class="ng-binding">2</div>
</div>
</li>
</ul>
</div>
You should try to locate it using a cssSelector with its tabindex attribute, as below:
#Cbk_List[tabindex='0']
Edited: If the checkbox element has no attribute with a unique value, try an XPath with an index. Assuming you want the first checkbox element, try the XPath below:
(.//*[@id = 'Cbk_List'])[1]
Note: In the XPath above, just change the index from 1 to that of the checkbox you want to find.

How to Click on a Text in Selenium Webdriver 2.x

I am not able to click on the HTML values below using the Selenium WebDriver click command from Java.
Here's my HTML. I have to click on PAAcctAcctRels, PAAcctActivityData, etc., as in the HTML.
I tried with linkText (driver.findElement(By.linkText("PAAcctAcctRels")).click();) and XPath (driver.findElement(By.xpath(".//*[@id='primaryNavLevel2Z6_G868H4S0K881F0AAEO37LG28N0']/div[1]/a")).click();)
<div id="primaryNavLevel2Z6_0G5A11K0KGF200AIUB98T20G52" class="dropdown_1columns">
<div class="col_1">
<a class="" href="?uri=nm:oid:Z6_0G5A11K0KGF200AIUB98T20G53">
<strong>
<span lang="en" dir="ltr">
PAAcctAcctRels
<span class="wpthemeAccess"> currently selected</span>
</span>
</strong>
</a>
</div>
<div class="col_1">
<a class="" href="?uri=nm:oid:Z6_0G5A11K0KGF200AIUB98T20GD4">
<span lang="en" dir="ltr">PAAcctActivityData</span>
</a>
</div>
<div class="col_1">
<a class="" href="?uri=nm:oid:Z6_0G5A11K0KGF200AIUB98T20GT1">
<span lang="en" dir="ltr">PAAcctAddrEmail</span>
</a>
</div>
Is there any other way to do this? Please let me know.
1. To click on the text 'PAAcctActivityData', you can use the code below:
driver.findElement(By.xpath("//span[.='PAAcctActivityData']")).click();
2. To click on the text 'PAAcctAddrEmail', you can use the code below:
driver.findElement(By.xpath("//span[.='PAAcctAddrEmail']")).click();
NOTE: The XPaths above locate the span elements whose text is exactly 'PAAcctActivityData' or 'PAAcctAddrEmail', respectively.
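To see what "exact text" means for these spans, here is a rough sketch that uses Python's standard-library ElementTree as a stand-in for the browser. ElementTree supports only a subset of XPath, so this illustrates the matching rule rather than what Selenium does internally. The markup is a trimmed copy of the question's HTML with a closing </div> added (an assumption) so it parses as XML, and the helper names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's markup; the final </div> is added here
# (an assumption) so that the fragment is well-formed XML.
MARKUP = """
<div id="primaryNavLevel2Z6_0G5A11K0KGF200AIUB98T20G52" class="dropdown_1columns">
  <div class="col_1">
    <a class="" href="?uri=nm:oid:Z6_0G5A11K0KGF200AIUB98T20G53">
      <strong>
        <span lang="en" dir="ltr">
          PAAcctAcctRels
          <span class="wpthemeAccess"> currently selected</span>
        </span>
      </strong>
    </a>
  </div>
  <div class="col_1">
    <a class="" href="?uri=nm:oid:Z6_0G5A11K0KGF200AIUB98T20GD4">
      <span lang="en" dir="ltr">PAAcctActivityData</span>
    </a>
  </div>
</div>
"""

ROOT = ET.fromstring(MARKUP)


def spans_with_exact_text(text):
    """Return span elements whose complete text content equals `text`."""
    return ROOT.findall(f".//span[.='{text}']")


def link_for(text):
    """Return the <a> element that has a child span with exactly this text."""
    return ROOT.find(f".//a[span='{text}']")


if __name__ == "__main__":
    print(len(spans_with_exact_text("PAAcctActivityData")))
    print(len(spans_with_exact_text("PAAcctAcctRels")))
    print(link_for("PAAcctActivityData").get("href"))
```

The first lookup finds one span. The second finds none, because the full text of that span also includes the surrounding whitespace and the nested ' currently selected' text, so an exact comparison against 'PAAcctAcctRels' fails.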
By.linkText("PAAcctAcctRels") won't work because that link contains more text (i.e. ' currently selected'), and the problem with your XPath is that it starts with .//
The following should work (I have avoided using * for performance):
By.xpath("//div[@id='primaryNavLevel2Z6_G868H4S0K881F0AAEO37LG28N0']/div[1]/a")
Try using //*[@id='primaryNavLevel2Z6_G868H4S0K881F0AAEO37LG28N0']/div[1]/a/span
as the XPath. Remove the initial '.' and add '/span' at the end.

Prestashop: Combinations not showing

I have a problem with combinations in my PrestaShop 1.6.0.9. Combinations are just not showing on the product page, for example here: http://b-bservis.cz.webar.cz/home/8-jolly-fix.html where there should be 3 combinations with different prices. I don't know what happened, but the color picker and the other (also default) combinations aren't working either.
Can anybody help me, please?
Thanks
Try activating the default theme, and navigate to the product page to see if the combinations show up. If they do show up, then you have a problem with your theme.
According to the source of his website, he's already using the default-bootstrap theme.
Also according to the source, the product attributes are present; they just don't show up on the product page.
<div id="attributes">
<div class="clearfix"></div>
<fieldset class="attribute_fieldset">
<label class="attribute_label">Balení </label>
<div class="attribute_list">
<ul id="color_to_pick_list" class="clearfix">
<li class="selected">
</li>
<li>
<a href="http://b-bservis.cz.webar.cz/zakladni-natery-penetracni/8-jolly-fix.html" id="color_26" name="" class="color_pick" title="">
</a>
</li>
<li>
<a href="http://b-bservis.cz.webar.cz/zakladni-natery-penetracni/8-jolly-fix.html" id="color_27" name="" class="color_pick" title="">
</a>
</li>
</ul>
<input type="hidden" class="color_pick_hidden" name="group_4" value="25" />
</div>
<!-- end attribute_list -->
</fieldset>
</div>
I had the same problem with an earlier version of PrestaShop; after digging for hours I finally solved the issue.
You can check my post on the PS forum and see if it helps.

vb.net split help required

I'm currently trying to grab an avatar from an HTML web source. The problem is that there are several img sources and containers that have the same name. Here's the part I need:
</div>
<div class="content no_margin">
<img src="http://www.gravatar.com/avatar/4787d9302360d807f3e6f94125f7754c?&d=mm&r=g&s=250" /><br />
<br />
<a class="link" href="http://sharefa.st/user/donkey">Uploads</a><br />
<a class="link" href="http://sharefa.st/user/donkey/favorites">Favorites</a><br />
</div>
</div>
<div id="content" class="left">
<div class="header">
Uploads
</div>
<div class="content no_margin">
<div class="profile_box">
<div class="profile_info">
Now the part I need to grab is:
<img src="http://www.gravatar.com/avatar/4787d9302360d807f3e6f94125f7754c?&d=mm&r=g&s=250" /><br />
this image. I'd be grateful for any help!
Try:
Dim wb As New WebBrowser
wb.Navigate("")
Do While wb.ReadyState <> WebBrowserReadyState.Complete
    Application.DoEvents()
Loop
wb.DocumentText = HtmlString 'Your Html
For Each img As HtmlElement In wb.Document.GetElementsByTagName("img")
    'InStr returns the match position, or 0 when "avatar" is not found
    If InStr(img.GetAttribute("src"), "avatar") > 0 Then
        MsgBox(img.GetAttribute("src"))
    End If
Next
You appear to be attempting to parse HTML 'by hand'. Please don't.
http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html
See this question for some alternatives: How do you parse an HTML in vb.net
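The parser-based idea is language-neutral. As a rough illustration, sketched here in Python rather than VB.NET, the standard-library html.parser can collect every img src and keep only those containing "avatar". The class and function names are made up, and the logo image in the sample is an invented URL added only to show why filtering is needed.

```python
from html.parser import HTMLParser

# Fragment from the question, plus one hypothetical non-avatar image
# (the logo URL is invented) to show that filtering is needed.
PAGE = """
<div class="content no_margin">
<img src="http://sharefa.st/images/logo.png" />
<img src="http://www.gravatar.com/avatar/4787d9302360d807f3e6f94125f7754c?&d=mm&r=g&s=250" /><br />
<a class="link" href="http://sharefa.st/user/donkey">Uploads</a><br />
</div>
"""


class ImgSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag it sees."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <img ... /> are routed here as well.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def avatar_sources(html):
    """Return the src of each <img> whose URL contains 'avatar'."""
    collector = ImgSrcCollector()
    collector.feed(html)
    collector.close()
    return [src for src in collector.sources if "avatar" in src]


if __name__ == "__main__":
    for src in avatar_sources(PAGE):
        print(src)
```

Because the parser handles tags and attributes itself, the repeated container names on the page stop mattering; only the img tags are inspected.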
Use a regular expression to find what you're looking for:
http://msdn.microsoft.com/en-us/library/twcw2f1c.aspx
The example code demonstrates pretty much your scenario.