VB.NET WebBrowser: how to display only the content I want

For example, the page source is:
<html>
<head> balaba....
</head>
<body>
<div id="many_div">...</div>
<div id="main">
<div id="target">
.....balabala ...
</div>
</div>
</body></html>
How can I make my WebBrowser display only the div with the id "target"?
Thanks!

You'll need to manipulate the HTML of the page.
I would use HtmlAgilityPack to extract the part you want and rewrite it to the same or another file:
' Requires: Imports System.IO
Dim html = File.ReadAllText("c:\temp\htmlTest.htm")
Dim doc = New HtmlAgilityPack.HtmlDocument()
doc.LoadHtml(html)
Dim target = doc.GetElementbyId("target")
If target IsNot Nothing Then
    Dim body = doc.DocumentNode.SelectSingleNode("//body")
    ' Drop everything in the body, then put only the target div back
    body.RemoveAll()
    body.PrependChild(target)
    ' File.Create truncates an existing file; File.OpenWrite would leave stale bytes behind
    Using writer = File.Create("c:\temp\htmlTest2.htm")
        doc.Save(writer)
    End Using
End If
Now you just have to load this HTML file into the WebBrowser.
If you want to get the HTML directly from the internet or an intranet:
Dim client As New HtmlAgilityPack.HtmlWeb()
Dim doc As HtmlAgilityPack.HtmlDocument = client.Load("http://yoururl.com")
' The rest is the same as above (without File.ReadAllText and doc.LoadHtml)
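As a rough sketch of the last step, the extracted div can also be handed to the control directly through its DocumentText property, without writing a file first. The form class name TargetOnlyForm, the field name browser and the URL are placeholders that are not part of the answer above, and the sketch assumes a Windows Forms project that references HtmlAgilityPack.

```vbnet
Imports System
Imports System.Windows.Forms
Imports HtmlAgilityPack

' Hypothetical form that shows only the element with id "target".
Public Class TargetOnlyForm
    Inherits Form

    Private ReadOnly browser As New WebBrowser() With {.Dock = DockStyle.Fill}
    Private ReadOnly pageUrl As String

    Public Sub New(url As String)
        pageUrl = url
        Controls.Add(browser)
    End Sub

    Protected Overrides Sub OnLoad(e As EventArgs)
        MyBase.OnLoad(e)
        ' Fully qualified, because System.Windows.Forms also has an HtmlDocument type
        Dim doc As HtmlAgilityPack.HtmlDocument = New HtmlWeb().Load(pageUrl)
        Dim target As HtmlNode = doc.GetElementbyId("target")
        If target IsNot Nothing Then
            ' Wrap the fragment in a minimal page and hand it to the control
            browser.DocumentText = "<html><body>" & target.OuterHtml & "</body></html>"
        Else
            browser.DocumentText = "<html><body>Element 'target' not found.</body></html>"
        End If
    End Sub
End Class

Module Program
    <STAThread>
    Sub Main()
        ' Placeholder address, as in the snippet above
        Application.Run(New TargetOnlyForm("http://yoururl.com"))
    End Sub
End Module
```

Note that this variant drops the original head section, so stylesheets and relative links from the page are lost. If those matter, saving the modified document to a file and calling Navigate on it keeps them.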

Related

What should I do to avoid error message "Object variable or With Block Variable not set"

I am new to VBA and have written a small script to get data from a website, but I always get the error message "Object variable or With block variable not set". Please help me fix the script so that it displays the price "Rp 16.425".
Below are samples of both the HTML and my script.
The HTML:
<div class="S4G9wXKj">
<h3 data-testid="hSRPProdName" class="Ka_fasQS">Buku MATEMATIKA BILINGUAL Kelas X SMA/MA - Yrama Widya</h3>
<div itemprop="offers" itemtype="http://schema.org/Offer">
<span itemprop="price" class="_3fNeVBgQ">
<span data-testid="hSRPProdPrice">Rp 16.425</span>
</span>
<meta itemprop="priceCurrency" content="IDR">
<div class="ifULVzgL"><div class="_3z5GLerJ">
<span class="UY2SWg6T" data-testid="spnSRPProdTabShopLoc">Yogyakarta</span>
<span class="_1GDgKs4K">Yrama Widya Books</span></div></div>
<div class="_3dCNp1CF"></div></div></div>
And this is my script:
Dim XMLPage As New MSXML2.XMLHTTP60
Dim HTMLDoc As New MSHTML.HTMLDocument
Dim BHTMLDoc As New MSHTML.HTMLDocument
XMLPage.Open "GET", "https://www.tokopedia.com/search?st=product&q=buku&ob=3&pmin=16411&origin_filter=sort_price", False
XMLPage.send
BHTMLDoc.body.innerHTML = XMLPage.responseText
Dim i As Long
For i = 0 To 3
Debug.Print HTMLDoc.querySelectorAll("[data-testid=hsRPProdPrice] span").Item(i).innerText
Next

Html agility pack failed to scrape image

OK, I found some code that, according to the website it came from, scrapes an image from a div using HtmlAgilityPack in VB.NET.
I followed the procedure, but I get nothing.
This is the source HTML:
<div class='my-gallery'>
<!-- ONLY PREV NAVIGATION -->
<!-- ONLY PREV NAVIGATION -->
<img src='http://example.com/image.jpg' alt='image'/>
<!-- ONLY NEXT NAVIGATION -->
<!-- ONLY NEXT NAVIGATION -->
</div>
This is the VB.NET code I tried:
Public Sub getImg()
    Try
        Dim link As String = ("http://www.exmple.com")
        'download page from the link into an HtmlDocument
        Dim doc As HtmlDocument = New HtmlWeb().Load(link)
        Dim div As HtmlNode = doc.DocumentNode.SelectSingleNode("//div[@class='my-gallery']//img//src")
        If Not div Is Nothing Then
            PreviewBox.ImageLocation = (div.ToString)
        End If
    Catch ex As Exception
        MsgBox(ex.Message)
    End Try
End Sub
src is an attribute of the img element, not a child element, so the XPath should stop at img and the value should be read from the node's attributes, for example:
Dim img As HtmlNode = doc.DocumentNode.SelectSingleNode("//div[@class='my-gallery']//img")
If img IsNot Nothing Then
    Dim url As String = img.Attributes("src").Value
    PreviewBox.ImageLocation = url
End If
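A slightly more defensive variant of the same idea is sketched here. The module name GallerySketch and the function name FindGalleryImageUrl are made up for illustration, and the page address is a placeholder. The sketch only matches img elements that actually have a src attribute, and it resolves a relative src against the page address, which the snippet above does not attempt.

```vbnet
Imports System
Imports HtmlAgilityPack

Module GallerySketch
    ' Hypothetical helper: returns the absolute URL of the first <img> that has a
    ' src attribute inside <div class='my-gallery'>, or Nothing if there is none.
    Function FindGalleryImageUrl(pageUrl As String) As String
        Dim doc As HtmlDocument = New HtmlWeb().Load(pageUrl)
        Dim img As HtmlNode = doc.DocumentNode.SelectSingleNode("//div[@class='my-gallery']//img[@src]")
        If img Is Nothing Then Return Nothing

        ' GetAttributeValue returns the supplied default instead of throwing
        Dim src As String = img.GetAttributeValue("src", String.Empty)
        If src.Length = 0 Then Return Nothing

        ' Resolve relative paths such as "images/a.jpg" against the page address
        Return New Uri(New Uri(pageUrl), src).AbsoluteUri
    End Function

    Sub Main()
        ' Placeholder address, not a real gallery page
        Dim url As String = FindGalleryImageUrl("http://example.com")
        Console.WriteLine(If(url, "no image found"))
    End Sub
End Module
```

In the form from the question, the returned string would be assigned to PreviewBox.ImageLocation when it is not Nothing.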

VbScript hta - Open a new hta by link and retrieve correct filename

I have two files; "1.hta" and "2.hta".
"1.hta" contains a simple link to file "2.hta"
<a href="2.hta">2.hta</a>
"2.hta" contains a script to determine its own filename
FullName = replace(oApp.commandLine,chr(34),"") 'oApp = HT Application ID
arrFN=split(FullName,"\")
FileName = arrFN(ubound(arrFN))
SourceDir=replace(FullName,FileName,"")
"2.hta" works perfectly when started stand-alone: FileName = 2.hta
However, when I start "2.hta" via the link from "1.hta", I get FileName = 1.hta
I need a way to determine the correct filename. Or does an HTA always report the filename of the first (starting) instance?
You can try it like this, launching the second HTA as a separate process instead of navigating to it:
<html>
<head>
<title>HTA Launch another HTA</title>
<HTA:APPLICATION
SINGLEINSTANCE="yes"
WINDOWSTATE="maximize"
>
</head>
<SCRIPT Language="vbscript">
Sub Execute(File)
Dim ws
Set ws = CreateObject("wscript.shell")
ws.run chr(34) & File & chr(34)
End sub
</SCRIPT>
<body>
<h1>This is test hta 1 ONE</h1>
Start the HTA2
</body>
</html>

Both HttpWebRequest and HtmlAgilityPack are unable to get text from table

This is the original function I wrote to get the HTML of a web page and parse it with the same code I use for "IE.document".
The code works fine with some websites, but now I get an error on "doc.write". I think it is because the web page is encoded as "iso-8859-1" and the second column of the table I'm trying to parse uses a different encoding.
Function mWebRe(ByVal mUrl As String) As MSHTML.HTMLDocument
    Dim request As HttpWebRequest = WebRequest.Create(mUrl)
    request.Timeout = 10000
    Dim doc As MSHTML.IHTMLDocument2 = New MSHTML.HTMLDocument
    Try
        Dim response As HttpWebResponse = request.GetResponse()
        'this is the original code
        'Dim reader As StreamReader = New StreamReader(response.GetResponseStream())
        'this is an attempt without effects
        Dim reader As StreamReader = New StreamReader(response.GetResponseStream(), Encoding.GetEncoding("iso-8859-1"))
        Dim WebContent As String = reader.ReadToEnd() 'Here the text seems to be
        doc.clear()
        doc.write(WebContent) 'Here I get error on loading page
        doc.close()
        ' The following is a must do, to make sure that the data is fully load.
        While (doc.readyState <> "complete")
            Thread.Sleep(50)
        End While
    Catch ex As Exception
        Return Nothing
    End Try
    Return doc
End Function
I've tried to modify the code and also tried to use HtmlAgilityPack (I've never used it before), without success.
I need the content of the second table (it doesn't have an id), so I wrote the code below (it isn't able to get the correct inner text from the cells):
Dim web As HtmlAgilityPack.HtmlWeb = New HtmlWeb()
web.OverrideEncoding = Encoding.GetEncoding("ISO-8859-1")
Dim doc As HtmlAgilityPack.HtmlDocument = web.Load(mUrl)
For Each Table As HtmlNode In doc.DocumentNode.SelectNodes("//table")
    For Each Row As HtmlNode In Table.SelectNodes("//tr")
        For Each Cell As HtmlNode In Row.SelectNodes("//td")
            Dim mTxt As String = Cell.InnerText
        Next
    Next
Next
This is the start of the web page source code:
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
This is one of the rows I want to extract:
<tr>
<td class="tableValues" align="center" valign="top" >Mar 24/12/2013</td>
<td class="tableValues" align="left" valign="top" >Iscritto al Ruol<!--span-->o<!--i>4</i--></td>
<td class="tableValues" align="left" valign="top" ></td>
</tr>
I think that the second column has a different encoding, but I have no idea how to convert it to the correct text.
Any suggestion is appreciated.
I just solved it by inserting the code below into the HtmlAgilityPack version.
If anyone can suggest a better solution, I'll be grateful.
For Each Cell As HtmlNode In Row.SelectNodes("//td")
    Dim mTxt As String = Cell.InnerText
    If mTxt.Contains("&#") Then
        ' Decode numeric entities such as &#224;, then drop any HTML comments left in the text
        Dim StrOk As String = WebUtility.HtmlDecode(mTxt)
        StrOk = Regex.Replace(StrOk, "<!--.+?-->", String.Empty)
        Debug.Print(StrOk)
    End If
Next
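One possible tidy-up of this workaround is sketched below. It is only an illustration, and the helper name CleanCellText is made up. It strips the HTML comments before decoding the entities, so decoded text can never be mistaken for a comment, and it runs on every cell rather than only on cells that contain "&#". It uses nothing beyond System.Net.WebUtility and Regex.

```vbnet
Imports System
Imports System.Net
Imports System.Text.RegularExpressions

Module CellTextSketch
    ' Hypothetical helper: removes HTML comments, then decodes character entities.
    Function CleanCellText(raw As String) As String
        If String.IsNullOrEmpty(raw) Then Return String.Empty
        ' Singleline lets a comment span line breaks; .*? also matches empty comments
        Dim withoutComments As String = Regex.Replace(raw, "<!--.*?-->", String.Empty, RegexOptions.Singleline)
        Return WebUtility.HtmlDecode(withoutComments).Trim()
    End Function

    Sub Main()
        ' Cell text taken from the sample row in the question
        Console.WriteLine(CleanCellText("Iscritto al Ruol<!--span-->o<!--i>4</i-->")) ' Iscritto al Ruolo
        ' A numeric entity of the kind the workaround checks for
        Console.WriteLine(CleanCellText("Universit&#224;"))
    End Sub
End Module
```

In the loop above, the helper would be called as CleanCellText(Cell.InnerText). Separately, an XPath that starts with // always searches from the document root, so Row.SelectNodes("//td") returns every td in the page for every row; Row.SelectNodes(".//td") limits the search to the current row, and "(//table)[2]" selects the second table directly.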

How to click this button?

Hi all, I am a beginner in Visual Basic. Your help is highly appreciated.
How can I click this button on a web page?
<a class="buttonRed" href="what_would_you_do.html" onclick="this.blur();">
<span>Get Started</span>
</a>
Here is a short example that worked for me, tested with a simple HTML file:
ClickA.html:
<!DOCTYPE HTML>
<html lang="en">
<head>
<title><!-- Insert your title here --></title>
</head>
<body>
<a class="buttonRed" href="what_would_you_do.html" onclick="this.blur();">
<span>Get Started</span>
</a>
</body>
</html>
VBA standard module:
' Add References:
' - Microsoft HTML Object Library
' - Microsoft Internet Controls
Sub test()
    Dim Browser As InternetExplorer
    Dim Document As htmlDocument
    Set Browser = New InternetExplorer
    Browser.Visible = True
    Browser.navigate "C:\Temp\VBA\ClickA.html"
    ' Wait until the page has finished loading
    Do While Browser.Busy Or Browser.readyState <> READYSTATE_COMPLETE
        DoEvents
    Loop
    Set Document = Browser.Document
    Dim anchorElement As HTMLAnchorElement
    Set anchorElement = Document.getElementsByClassName("buttonRed").Item(Index:=1)
    anchorElement.Click
    Set Document = Nothing
    Set Browser = Nothing
End Sub
Actually, this is not a button; it is an anchor tag. Write your markup as
<a class="buttonRed" href="what_would_you_do.html">
<span>Get Started</span>
</a>
and it will navigate to the "what_would_you_do.html" page (if it is in the root).