I'm trying to display my product description, but when I render it, the descriptions appear side by side instead of one below the other.
So for example I'm getting
description 1 description 2
and what I'm trying to get is
description 1
description 2
When I save my description I save it like this
$description = "$description1. \r\n .$description2";
$product->description = $description;
$product->save();
and this is how I'm trying to render it in Vue
<p v-html="product.description"></p>
Have you tried using the "<br>" tag, similar to this:
$description = "$description1. <br> .$description2";
Haven't tested so syntax may be slightly different.
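Since the suggestion above is untested, here is the same idea as a minimal Python sketch rather than PHP: escape each description, then join them with <br> so that an HTML renderer such as v-html shows one per line. The function name join_descriptions is hypothetical and only used for illustration.

```python
from html import escape


def join_descriptions(*parts: str) -> str:
    """Join text fragments with <br> so an HTML renderer shows one per line."""
    cleaned = []
    for part in parts:
        # Escape first so user-supplied text cannot inject markup,
        # then turn any newlines inside the fragment into <br> as well.
        text = escape(part.strip())
        text = text.replace("\r\n", "<br>").replace("\n", "<br>")
        cleaned.append(text)
    return "<br>".join(cleaned)


print(join_descriptions("description 1", "description 2"))
```

An alternative that avoids storing markup at all is to keep the "\r\n" in the saved text and style the paragraph with the CSS rule white-space: pre-line, which makes the browser honour the newlines.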
I am trying to scrape an HTML file structured as follows using BeautifulSoup. Basically, each unit consists of:
one <h2></h2>
one <h3></h3>
more than one <p></p>
Something like the following:
<h2>January, 2020</h2>
<h3>facility</h3>
<p>text1-1</p>
<p>text1-2</p>
<h2>April, 2020</h2>
<h3>scientists</h3>
<p>text2-1</p>
<p>text2-2</p>
<h2>June, 2020</h2>
<h3>lawyers</h3>
<p>text3-1</p>
<h2>.....
I want to get text including the <p> tags between </h3> and the next <h2>. The result should be:
for row #1:
<p>text1-1</p>
<p>text1-2</p>
for row #2:
<p>text2-1</p>
<p>text2-2</p>
for row #3:
<p>text3-1</p>
Here is what I tried so far:
num_h2 = len(soup.find_all('h2'))
for i in range(0, num_h2):
    print('---------')
    print(i)
    p_string = ''
    sibling = soup.find_all('h3')[i].find_next_sibling('p').getText()
    if sibling:
        p_string += sibling
    else:
        break
    print(p_string)
The problem with this solution is that it only shows the content of the first <p> under each unit. I do not know how to find out how many <p> tags there are so that I can loop over them. Also, is there a better way to do this than using find_next_sibling()?
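Maybe CSS selectors can help:
for s in soup.select('h3'):
    # find_next_siblings() is the current name of the deprecated fetchNextSiblings()
    for ns in s.find_next_siblings():
        if ns.name == "h2":
            # the next unit starts here
            break
        if ns.name == "p":
            print(ns)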
Output:
<p>text1-1</p>
<p>text1-2</p>
<p>text2-1</p>
<p>text2-2</p>
<p>text3-1</p>
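The loop above prints every <p> in one flat stream. If the paragraphs are needed per row, as in the question, the same sibling walk can collect one list per <h3>. This is a minimal sketch run against the sample HTML from the question; the function name paragraphs_per_unit is hypothetical.

```python
from bs4 import BeautifulSoup

html = """
<h2>January, 2020</h2><h3>facility</h3><p>text1-1</p><p>text1-2</p>
<h2>April, 2020</h2><h3>scientists</h3><p>text2-1</p><p>text2-2</p>
<h2>June, 2020</h2><h3>lawyers</h3><p>text3-1</p>
"""


def paragraphs_per_unit(soup):
    """Return one list of <p> tags (as strings) for each <h3> in the document."""
    rows = []
    for h3 in soup.find_all("h3"):
        row = []
        for sibling in h3.find_next_siblings():
            if sibling.name == "h2":
                # The next unit starts here, so this row is complete.
                break
            if sibling.name == "p":
                row.append(str(sibling))
        rows.append(row)
    return rows


soup = BeautifulSoup(html, "html.parser")
rows = paragraphs_per_unit(soup)
for number, row in enumerate(rows, start=1):
    print(f"row #{number}:", "".join(row))
```

Because the inner loop stops at the next <h2>, there is no need to know in advance how many <p> tags each unit contains.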
This has puzzled me for a while now. I am trying to pull all of the text from the 'p' tags under the 'h2' tags named "New Fundings" and "New Funds".
The number of 'p' tags isn't consistent from page to page, so I was thinking of some sort of while loop, but what I tried didn't work. Each 'p' tag usually starts with the company name in a 'strong' tag, followed by descriptive text and further 'strong' tags naming who funded or invested.
Once I can parse it properly, the goal is to export the company name from the first 'strong' tag, together with the text that follows it and the investing companies/people (from the following 'strong' tags in the 'p' block), to do some data analysis.
Any help would be appreciated. Yes, I have looked through various other help pages, but the attempts I've made haven't been successful, so I came here.
import requests
page = requests.get("https://www.strictlyvc.com/2017/06/13/strictlyvc-june-12-2017/")
page
page.content
from bs4 import BeautifulSoup
soup = BeautifulSoup(page.content, 'html.parser')
entrysoup = soup.find(class_ = 'post-entry')
# Trying to pull the right paragraphs, but these only select the NEXT one. I want all of the tags under 'New Fundings' and 'New Funds' (basically, until the next tag that isn't either of those).
print(entrysoup.find('h2', text = 'New Fundings').find_next_sibling('p'))
print(entrysoup.find('h2', text = 'New Funds').find_next_sibling('p'))
# This was closer, but I wasn't sure how to get it to stop when it hit the non-New Fundings/New Funds tags
for strong_tag in entrysoup.find_all('strong'):
    print(strong_tag.text, strong_tag.next_sibling)
I think this is the best result I could get for now. If it's not what you want, let me know so I can fiddle with it some more. If it is, mark it as the answer :)
import requests
import bs4
page = requests.get("https://www.strictlyvc.com/2017/06/13/strictlyvc-june-12-2017/")
soup = bs4.BeautifulSoup(page.content, 'html.parser')
entrysoup = soup.find(class_ = 'post-entry')
Stop_Point = 'Also Sponsored By . . .'
for strong_tag in entrysoup.find_all('h2'):
    if strong_tag.get_text() == 'New Fundings':
        for sibling in strong_tag.next_siblings:
            if isinstance(sibling, bs4.element.Tag):
                print(sibling.get_text())
                if sibling.get_text() == Stop_Point:
                    break
                if sibling.name == 'div':
                    for children in sibling.children:
                        if isinstance(children, bs4.element.Tag):
                            if children.get_text() == Stop_Point:
                                break
                            print(children.get_text())
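For the export step described in the question (company name plus investors), a sibling walk that stops at the next 'h2' can also split each paragraph into its 'strong' parts. The sketch below runs on a small made-up sample rather than the live page, so the markup, the company names, and the helper name deals_under are all assumptions; the real page may nest some paragraphs inside a 'div', as the code above suggests.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that imitates the structure described in the question.
SAMPLE = """
<div class="post-entry">
  <h2>Top News</h2>
  <p>Unrelated paragraph.</p>
  <h2>New Fundings</h2>
  <p><strong>Acme</strong>, a widget maker, raised $5 million from
     <strong>Alpha Capital</strong> and <strong>Beta Ventures</strong>.</p>
  <p><strong>Globex</strong> raised seed funding from <strong>Gamma Partners</strong>.</p>
  <h2>New Funds</h2>
  <p><strong>Delta Fund</strong> closed a new vehicle.</p>
  <h2>People</h2>
  <p><strong>Someone</strong> changed jobs.</p>
</div>
"""


def deals_under(entry, heading_text):
    """Collect every <p> after the named <h2>, stopping at the next <h2>."""
    heading = entry.find("h2", string=heading_text)
    if heading is None:
        return []
    deals = []
    for sibling in heading.find_next_siblings():
        if sibling.name == "h2":
            break  # a different section starts here
        if sibling.name != "p":
            continue
        names = [s.get_text(strip=True) for s in sibling.find_all("strong")]
        if names:
            # First <strong> is taken as the company, the rest as investors.
            deals.append({
                "company": names[0],
                "investors": names[1:],
                "text": sibling.get_text(),
            })
    return deals


entry = BeautifulSoup(SAMPLE, "html.parser").find(class_="post-entry")
for section in ("New Fundings", "New Funds"):
    for deal in deals_under(entry, section):
        print(section, "|", deal["company"], "|", ", ".join(deal["investors"]))
```

Stopping at the next 'h2' removes the need for a hard-coded stop string, at the cost of assuming that every section begins with an 'h2' at the same nesting level.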
Suppose I have an HTML template that has some headings and text, like:
Heading 1
text......
Heading 2
text.....
Heading 3
text.....
Now I have to print this template to PDF, and the printout has to start with an index page that lists each heading together with its page number. In other words, the printout should look like this:
Heading 1 ....... 1 [page number]
Heading 2 ....... 2
Heading 3 ....... 3
Heading 1
text......
Heading 2
text.....
Heading 3
text.....
So what I want to know is how to determine the page number for a given piece of text in the HTML, i.e. which page Heading 1 ends up on, and likewise for the other headings.
Any suggestions or ideas are really appreciated.
pdfConverter.PdfFooterOptions.PageNumberTextFontSize = 10;
pdfConverter.PdfFooterOptions.ShowPageNumber = true;
It's done inside the body of this method:
private void AddFooter(PdfConverter pdfConverter)
{
    string thisPageURL = HttpContext.Current.Request.Url.AbsoluteUri;
    string headerAndFooterHtmlUrl = thisPageURL.Substring(0, thisPageURL.LastIndexOf('/')) + "/HeaderAndFooterHtml.htm";
    //enable footer
    pdfConverter.PdfDocumentOptions.ShowFooter = true;
    // set the footer height in points
    pdfConverter.PdfFooterOptions.FooterHeight = 60;
    //write the page number
    pdfConverter.PdfFooterOptions.TextArea = new TextArea(0, 30, "This is page &p; of &P; ",
        new System.Drawing.Font(new System.Drawing.FontFamily("Times New Roman"), 10, System.Drawing.GraphicsUnit.Point));
    pdfConverter.PdfFooterOptions.TextArea.EmbedTextFont = true;
    pdfConverter.PdfFooterOptions.TextArea.TextAlign = HorizontalTextAlign.Right;
    // set the footer HTML area
    pdfConverter.PdfFooterOptions.HtmlToPdfArea = new HtmlToPdfArea(headerAndFooterHtmlUrl);
    pdfConverter.PdfFooterOptions.HtmlToPdfArea.EmbedFonts = cbEmbedFonts.Checked;
}
See this page for more details
http://www.expertpdf.net/expertpdf-html-to-pdf-converter-headers-and-footers/
This is actually a pretty tricky problem which ExpertPDF would have to provide specific functionality to make possible.
My solution (not ExpertPDF) for this was to calculate the layout of the PDF first, collect the text to be used in the index for each page, and then calculate the layout of the index page(s). At that point I can number the pages (including the index pages) and then update the page numbers in the index. This is the only way to handle template pages which themselves span multiple pages, index text which wraps onto more than a single line, and indexes which span multiple pages.
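The numbering step of that approach can be sketched as follows. This is a simplified illustration, not ExpertPDF code: it assumes the layout pass has already produced a page count per section, that every section starts on a new page, and that each index entry takes exactly one line (so wrapped entries are not modelled). The names build_index and entries_per_index_page are hypothetical.

```python
import math


def build_index(sections, entries_per_index_page):
    """Return (heading, final_page_number) pairs for a document with a leading index.

    sections: list of (heading, page_count) pairs from the body layout pass.
    """
    # Pass 1: number the body on its own, starting at page 1.
    body_start = []
    page = 1
    for heading, page_count in sections:
        body_start.append((heading, page))
        page += page_count

    # Pass 2: lay out the index to learn how many pages it occupies.
    index_pages = math.ceil(len(sections) / entries_per_index_page)

    # Pass 3: the index sits in front of the body, so shift every page number.
    return [(heading, start + index_pages) for heading, start in body_start]


sections = [("Heading 1", 1), ("Heading 2", 3), ("Heading 3", 2)]
for heading, page_number in build_index(sections, entries_per_index_page=2):
    print(f"{heading} ....... {page_number}")
```

With three entries and two entries per index page, the index takes two pages, so the body headings land on pages 3, 4, and 7.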
Create a TextElement
TextElement te = new TextElement(xPos, yPos, width, "Page &p; of &P;", footerFont);
footerTemplate.AddElement(te);
The library will automatically replace the &p; tokens.
I need to scrape a p tag which is followed by an h3 tag but does not have a closing p tag. It looks like this:
<script ad>asdasdasd</script>
<p>Translation companies are
-----------------------
-----------------------
<h3 class="this_class">mind blown site</h3>
There is no </p> tag, so I cannot parse it completely. Now I have a few questions:
1) Can this be parsed using HtmlAgilityPack XPath?
2) I have a function that finds the text between two strings (getbetween), but I have a doubt: if I use "asdasdasd" and " is it 100% certain that VB.NET will use the script tag just above the h3? There are 2-3 identical "asdasdasd" lines.
3) Is there any other method you are aware of?
(I had to write this as code so the HTML does not get messed up.)
Regards,
It might be a good idea to post some more "real" html to really help you, at least the tags between the h3 and the p.
Anyway, this should get you the p-Tag from the h3-Tag.
HtmlDocument doc = new HtmlDocument();
doc.Load(... //Load the Html...
//Either of these lines will do
HtmlNode pNode = doc.DocumentNode.SelectSingleNode("//h3[@class='this_class']/preceding-sibling::p");
//HtmlNode pNode = doc.DocumentNode.SelectSingleNode("//h3[contains(text(),'mind blown site')]/preceding-sibling::p");
string pInnerHtml = pNode.NextSibling.InnerHtml; //Has the text "Translation companies are...."
So in general, to get all the nodes from the opening p tag to the start of a tag you don't want, you could do this:
var p = doc.DocumentNode.SelectSingleNode("//p");
var h3 = p.SelectSingleNode("following-sibling::h3[@class='this_class']");
var following = new List<string>();
for (var current = p.NextSibling; current != h3; current = current.NextSibling)
{
    following.Add(current.InnerText);
}
var innerText = String.Concat(following);
I've got a strange problem with content rendering.
I use the following code to grab the content:
lib.otherContent = CONTENT
lib.otherContent {
  table = tt_content
  select {
    pidInList = this
    orderBy = sorting
    where = colPos=0
    languageField = sys_language_uid
  }
  renderObj = COA
  renderObj {
    10 = TEXT
    10.field = header
    10.wrap = <h2>|</h2>
    20 = TEXT
    20.field = bodytext
    20.wrap = <div class="article">|</div>
  }
}
and everything works fine, except that I'd also like to use the predefined content element types other than plain text (Text with image, Images only, Bullet list, etc.).
The question is: what do I have to replace renderObj = COA and the rest between the braces with so that TYPO3 displays these properly?
Thanks,
I.
The available cObjects are more or less listed in TSRef, chapter 8.
TypoScript for rendering Text w/image can be found in typo3/sysext/css_styled_content/static/v4.3/setup.txt at line 724, and in the neighborhood you'll find e.g. bullets (below) and image (above), which is referenced in textpic at line 731. Variants of these are what you'll write in your renderObj.
You will find more details in the file typo3/sysext/cms/tslib/class.tslib_content.php, where e.g. text w/image is found at or around line 897 and is called IMGTEXT (do a case-sensitive search). See also around line 403 in typo3/sysext/css_styled_content/pi1/class.cssstyledcontent_pi1.php, where the newer css-based rendering takes place.