This sql call,
<div class="dateOnAd"><?php print $row['dttm_modified']; ?></div>
outputs:
2012-05-22 15:07:28.
I need a way to split that output into separate elements, like this:
<div class="example-date">
<span class="day">31</span>
<span class="month">July</span>
<span class="year">2009</span>
</div>
<div class="example-date">
<span class="day"><?php echo date('d', strtotime($row['dttm_modified'])); ?></span>
<span class="month"><?php echo date('F', strtotime($row['dttm_modified'])); ?></span>
<span class="year"><?php echo date('Y', strtotime($row['dttm_modified'])); ?></span>
</div>
You can read more about the date() function in the PHP manual.
Hello everyone, I have pulled the information I want using BeautifulSoup, but I can't seem to get it printed out correctly to send to pandas and Excel.
html_f ='''
<li class="list-group-item">
<div>
<div class="tyler-toggle-controller open">
<p class="text-primary">
07/01/2022 Date
<span class="caret"> </span>
</p>
</div>
<div class="tyler-toggle-container row-buff" style="display: block; overflow: hidden;">
<p class="col-sm-12 col-md-12">
<span class="text-muted">Comment</span><br>
[1] Comments
</p>
</div>
</div>
</li>'''
My code used to pull the data I want:
soup = BeautifulSoup(html_f,'html.parser')
for child in soup.findAll('li',class_='list-group-item')[0]:
    print (child.text)
Here is the info it pulls, but it prints with a lot of extra whitespace:
07/01/2022 Date
Comment
[1] Comments
Ideally, I only need the top portion (date and File Date) printed out, but at the very least I need help getting it into a list format like:
07/01/2022 Date
Comment
[1] Comments
To get your information printed as expected in your question, you could use stripped_strings and iterate over its elements:
for e in soup.find_all('li',class_='list-group-item'):
    for t in list(e.stripped_strings):
        print(t)
Note: In new code use find_all() instead of old syntax findAll().
Example
html='''
<li class="list-group-item">
<div>
<div class="tyler-toggle-controller open">
<p class="text-primary">
07/01/2022 Date
<span class="caret">
</span>
</p>
</div>
<div class="tyler-toggle-container row-buff" style="display: block; overflow: hidden;">
<p class="col-sm-12 col-md-12">
<span class="text-muted">
Comment
</span>
<br/>
[1] Comments
</p>
</div>
</div>
</li>
'''
from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')  # name the parser explicitly to avoid a warning
for e in soup.find_all('li',class_='list-group-item'):
    for t in e.stripped_strings:
        print(t)
Output
07/01/2022 Date
Comment
[1] Comments
Since you mention pandas, you could also pick out each piece of information, clean it up, and append it to a list of dicts:
import pandas as pd

data = []
for e in soup.find_all('li',class_='list-group-item'):
    data.append({
        'date': e.p.text.strip().replace(' Date',''),
        'comment': e.select_one('.tyler-toggle-container br').next_sibling.strip()
    })
pd.DataFrame(data)
or
data = [{
    'date':soup.select_one('li.list-group-item .text-primary').text.strip().replace(' Date',''),
    'comment':soup.select_one('li.list-group-item .tyler-toggle-container br').next_sibling.strip()
}]
Output
date        comment
07/01/2022  [1] Comments
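If the end goal is a file that Excel can open, the same list of dicts can also be written out as CSV. The following is only a sketch using the standard library's csv module rather than pandas; the helper name rows_to_csv is hypothetical, and the sample row is copied from the output above.

```python
import csv
import io

# Same shape as the `data` list built above; the values come from the example output.
data = [{'date': '07/01/2022', 'comment': '[1] Comments'}]

def rows_to_csv(rows):
    """Serialize a list of dicts to CSV text (hypothetical helper, not part of bs4 or pandas)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=['date', 'comment'], lineterminator='\n')
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = rows_to_csv(data)
print(csv_text)
```

Writing csv_text to a .csv file gives something Excel can open directly; with pandas installed, the DataFrame built from the same list could be saved instead.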
So far so good. Here is my attempt:
doc='''
<li class="list-group-item">
<div>
<div class="tyler-toggle-controller open">
<p class="text-primary">
07/01/2022 Date
<span class="caret">
</span>
</p>
</div>
<div class="tyler-toggle-container row-buff" style="display: block; overflow: hidden;">
<p class="col-sm-12 col-md-12">
<span class="text-muted">
Comment
</span>
<br/>
[1] Comments
</p>
</div>
</div>
</li>
'''
from bs4 import BeautifulSoup
soup = BeautifulSoup(doc, 'html.parser')
text=[' '.join(child.get_text(strip=True).split(' ')).replace(' DateComment[1]',',') for child in soup.find_all('li',class_='list-group-item')]
print(text)
Output:
['07/01/2022, Comments']
Try one of these; they should work:
text=' '.join([' '.join(child.get_text(strip=True).split(' ')).replace(' DateComment[1]',',') for child in soup.find_all('li',class_='list-group-item')]).strip()
#Or
text= [' '.join(child.get_text(strip=True).split(' ')).replace(' DateComment[1]',',') for child in soup.find_all('li',class_='list-group-item')]
parts= [part.strip() for part in text[0].split(',')]#if you want to make a list
final_text= ','.join(parts)
I am using scrapy shell just to make sure my selectors for my spider are correct. I am able to get all other sections I need except this one p tag that contains the cross ref part numbers. I am scraping from this particular page here
When I try response.css('div.col-1-2-2 > div.rpr-help m-chm > div > p::text').extract() it returns blank
When I try response.css('div > p::text').extract() the results have the section I am looking for plus a bunch of data I do not want.
I have a feeling this is going to be a super easy answer, but I have no idea what I am missing here
This is a snippet of the html section of the page I am trying to scrape, the last 'p' tag starting with Part Number
<div class="col-1-2-2">
<div id="img-detail" style="text-align:center;">
<div id="img-detail-main">
<a id="ctl00_cphMain_imgenlarge" rel="nofollow" href="/detail-img.aspx?id=3094537&i=" class="cboxElement"><img id="ctl00_cphMain_iMain" src="https://cdn.appliancepartspros.com/images/product/cache/whirlpool-clutch-assembly-285785-ap3094537_01_l.jpg" style="border-width:0px;outline:none;">
<div class="img-overlay" style="display:none;"><img src="/images/play.png" style="height:107px;"></div>
<div id="main-text-overlay" style="display:none;"></div>
</a>
</div>
<div class="img-help">Click image to open expanded view</div>
<div id="img-detail-thumb">
<div class="a-button a-active">
<img id="ctl00_cphMain_rImgTh_ctl01_imgTh" src="https://cdn.appliancepartspros.com/images/product/cache/whirlpool-clutch-assembly-285785-ap3094537_01_tt.jpg" style="border-width:0px;">
</div>
<div class="a-button">
<img id="ctl00_cphMain_rImgTh_ctl02_imgTh" src="https://cdn.appliancepartspros.com/images/product/cache/whirlpool-clutch-assembly-285785-ap3094537_02_tt.jpg" style="border-width:0px;">
</div>
<div class="a-button">
<img id="ctl00_cphMain_rImgTh_ctl03_imgTh" src="https://cdn.appliancepartspros.com/images/product/cache/whirlpool-clutch-assembly-285785-ap3094537_03_tt.jpg" style="border-width:0px;">
</div>
<div class="a-button">
<img id="ctl00_cphMain_rImgTh_ctl04_imgTh" src="https://cdn.appliancepartspros.com/images/product/cache/whirlpool-clutch-assembly-285785-ap3094537_04_tt.jpg" style="border-width:0px;">
</div>
<div class="a-button">
<img id="ctl00_cphMain_rImgTh_ctl05_imgTh" src="https://cdn.appliancepartspros.com/images/product/cache/whirlpool-clutch-assembly-285785-ap3094537_05_tt.jpg" style="border-width:0px;">
</div>
<div class="a-button">
<img id="ctl00_cphMain_rImgTh_ctl06_imgTh" class="diagram" data-dcmt="Clutch assembly AP3094537 is number 5 on this diagram. This is to give you an idea of the appearance and the location of the part. Your appliance model may be slightly different." src="https://483cda5f439700fab03b-6195bc77e724f6265ff507b1dc015ddb.ssl.cf1.rackcdn.com/0029384112_4.gif" style="border-width:0px;">
</div>
<div class="a-button">
<img id="ctl00_cphMain_rImgTh_ctl07_imgTh" class="video" src="https://img.youtube.com/vi/7RS1l6t8efc/hqdefault.jpg" style="border-width:0px;">
<div class="img-overlay"><img src="/images/play.png"></div>
</div>
</div>
</div>
<div class="rpr-help m-chm">
<div class="header">
<h2 class="h6">Repair Help</h2>
</div><!-- /end .header -->
<div class="inner m-bsc">
<ul>
<li>Repair Video</li>
<li>Repair Q&A</li>
</ul>
</div>
<div>
<br>
<span class="h4">Cross Reference Information</span><br>
<p>Part Number 285785 (AP3094537) replaces 2670, 285331, 285380, 285422, 285540, 285761, 285785VP, 3350015, 3350114, 3350115, 3351342, 3351343, 387888, 388948, 388949, 3946794, 3946847, 3951311, 3951312, 62699, 63174, 63765, 64176, AH334641, EA334641, J27-662, LP326, PS334641.
<br>
</p>
</div>
</div>
</div>
Hope this works
response.xpath('//div[@class="col-1-2-2"]//p/text()').extract_first()
You can try this also, response.xpath('(//div[@class="rpr-help m-chm"]//p//text())[1]').get()
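The CSS attempt in the question probably returned nothing because div.rpr-help m-chm is read as an m-chm element nested inside div.rpr-help; two classes on one element are chained, as in div.rpr-help.m-chm. The XPath versions above compare the full class string instead. Below is a minimal sketch of that attribute match, using the standard library's ElementTree as a stand-in for Scrapy's selectors. The snippet is a shortened, well-formed copy of the question's HTML, not the live page.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed stand-in for the HTML in the question (assumption: not the live page).
snippet = '''
<div class="col-1-2-2">
  <div class="rpr-help m-chm">
    <div class="header"><h2 class="h6">Repair Help</h2></div>
    <div>
      <span class="h4">Cross Reference Information</span>
      <p>Part Number 285785 (AP3094537) replaces 2670, 285331.</p>
    </div>
  </div>
</div>
'''

root = ET.fromstring(snippet)

# [@class='...'] compares the whole attribute value, so both class names are spelled out.
paragraphs = root.findall(".//div[@class='rpr-help m-chm']//p")
cross_ref = paragraphs[0].text.strip()
print(cross_ref)

# Matching on only one of the two class names finds nothing with an exact comparison.
partial = root.findall(".//div[@class='rpr-help']//p")
```

In Scrapy itself, response.css('div.rpr-help.m-chm > div > p::text') would be the CSS counterpart to try.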
<li class="tabRow tabRowLeft" *ngFor="let gene of filteredgene = (seq.genes) | limitTo:filteredgene.length/2+filteredgene.length%2">
<div class="displayFlex" (click)="showGeneRecord(gene.geneName,'ATTRIBUTE','.addPopup.attributeRisk')">
<div class="tabCell">
<div class="cellItem displayFlex">
<h4 class="flex1">{{gene.geneName}}</h4>
</div>
</div>
<div class="tabCell">
<div class="cellItem displayFlex">
<h4 class="flex1">{{gene.geneScore}}</h4>
</div>
</div>
</div>
</li>
I am trying to store the piped result in filteredgene and then loop over filteredgene with *ngFor, but I get the error "Bindings cannot contain assignments". Does anyone know how to resolve this?
I have also created a custom pipe for limitTo.
Hey guys, I have an interesting problem. I am trying to set up a two-column post layout so that things like the read more link, post date, or author sit in the col-md-4, and the title and post content, along with a read more link, sit inside the col-md-8.
I suspect I am going about this in a strange way, as I am very rusty with PHP and WordPress, so any help would be appreciated. The odd part is that I have two Read More links, one of them with display:none; for whatever reason, if I remove that one, the Read More links go haywire on the page.
The code:
<?php
$myposts = get_posts('numberposts=5
&category=homeposts');
foreach($myposts as $post) :?>
<div class="col-md-8" style="background:#000;">
<h3><a href="<?php echo the_permalink($post->ID); ?>" title="<?php echo $post->post_title;?>">
<?php echo $post->post_title ?></a></h3>
<?php echo substr($post->post_content,0,500) ?>
<a class="btn btn-default" style="display:none;" role="button" href="<?php echo get_permalink(); ?>">Read More</a>
</div>
<div class="col-md-4" style="background:#000;"><a class="btn btn-default" role="button" href="<?php echo get_permalink(); ?>">Read More</a></div>
<?php endforeach; ?>
OK, so what is wrong with this version?
<?php
$myposts = get_posts('numberposts=5
&category=homeposts');
foreach($myposts as $post) :?>
<div class="col-md-8">
<h3><a href="<?php echo the_permalink($post->ID); ?>" title="<?php echo $post->post_title;?>">
<?php echo $post->post_title; ?></a></h3>
<?php echo $post->the_excerpt; ?>
<a class="btn btn-default" style="display:none;" role="button" href="<?php echo get_permalink(); ?>">Read More</a>
</div>
<?php endforeach; ?>
I am trying to get currency values from some HTML using Scrapy. The code is:
links = hxs.select('//a[@class="product-image"]/div[@class="price-box"]//span[@class="price"]/text()').extract()
And the HTML
<div>
<span>
<sub>
<li class="item first">
<a href="http://www.xtra-vision.ie/dvd-blu-ray/to-rent/new-release/dvd/pitch-perfect-dvd.html" title="Image for Pitch Perfect" class="product-image">
<span class="exclusive-star">
</span>
<img src="http://www.xtra-vision.ie/media/catalog/product/cache/3/small_image/124x173/5b02ab93946615b958c913185aae2414/i/w/iws_5167c10c906b57.33524324.JPG.jpg" alt="Image for Pitch Perfect" />
<h2 class="product-name">Pitch Perfect</h2>
<div class="price-box">
<span class="regular-price" id="product-price-5174">
<span class="price">
€15
<sub class="price-bit">.99</sub>
</span>
</span>
</div>
</a>
</li>
</sub>
</span>
</div>
The resulting price I get is \u20ac15\t\t\t\t\t\t
Is there some way I can extract 15.99 from this HTML using XPath?
I used a combination of xpath and Python so might not be quite what you were after, although this was mainly employed to get rid of the extraneous tabs added to the end of the "price".
price = hxs.select('//span[@class="price"]/text()').extract()
pricebit = hxs.select('//span[@class="price"]/sub[@class="price-bit"]/text()').extract()
totalprice = price + pricebit
totalstr = ''.join(totalprice).replace('\t','')
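If a numeric value is needed rather than a string, the joined text can be cleaned with a regular expression. This is a rough sketch only: the two lists are hypothetical values shaped like what the extract() calls might return for the HTML in the question, and to_amount is a made-up helper name.

```python
import re

# Hypothetical values shaped like the results of the two extract() calls above.
price = ['\u20ac15\t\t\t\t\t\t', '\n']
pricebit = ['.99']

def to_amount(parts):
    """Join the extracted text nodes, drop whitespace, and return the first number found."""
    joined = re.sub(r'\s+', '', ''.join(parts))
    match = re.search(r'\d+(?:\.\d+)?', joined)
    return float(match.group()) if match else None

amount = to_amount(price + pricebit)
print(amount)  # 15.99
```

Stripping all whitespace first handles newlines as well as tabs, and the pattern skips the leading currency symbol.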