VB get text from html element - vb.net

I need to get the text between two span tags on a web page using Visual Basic.
<span>Some Text</span>
I know there must be a way but I can't seem to find it.
This is for a website I do not own.

Give your span an ID and runat="server" attribute e.g.
<span id="xMySpan" runat="server">Some Text</span>
Then you will be able to retrieve it in server-side code, e.g.
Dim sVar As String = xMySpan.InnerHtml

Are you extracting this from the entire HTML document or just the quoted text above?
If it's just the above (and you've already filtered out the other HTML), then you can use a combination of LEFT() and RIGHT() to snip off the ends, or use REPLACE() to get rid of the two tags.
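As a minimal sketch of the REPLACE() idea, assuming the string holds exactly the quoted snippet and nothing else (the module name and variable names are made up for illustration):

```vb
Imports System

Module SpanTextSketch
    Sub Main()
        ' Assumption: the input is exactly the snippet quoted in the question.
        Dim html As String = "<span>Some Text</span>"
        ' Strip the opening and closing tags, leaving only the inner text.
        Dim text As String = html.Replace("<span>", "").Replace("</span>", "")
        Console.WriteLine(text)
    End Sub
End Module
```

This only works for a bare span with no attributes; anything more varied probably calls for a real HTML parser or the WebBrowser document model.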

What about assigning an ID to the span? If you do, then this works:
TextBox1.Text = _
WebBrowser1.Document.GetElementById("spanID").GetAttribute("innerText")
Using this format:
<span id="spanID">...</span>
EDIT: To filter by content:
$("span").filter(function(){
return $(this).html() == "a";
})
Will work with this:
<span>a</span>

I made this script; I hope it will be helpful.
I have:
A TextBox to get the YouTube URL [urlVideo]
A Button to load the page [btn_loadViews]
A WebBrowser control [webBrowser1]
and a Label to show the text [lb_views]
I'm not validating anything, so this is just an example of how I get text from websites.
If there's another way to do it, I would like to know it too. =)
Private Sub btn_loadViews_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btn_loadViews.Click
WebBrowser1.Navigate(urlVideo.Text)
WaitForPageLoad()
getViews()
End Sub
Private Sub getViews()
Try
Dim version = FileVersionInfo.GetVersionInfo("c:\windows\system32\ieframe.dll")
'Depending on the browser version, Google's server sends different pages,
'so detect the IE version here (compare the major version as a number, not as a string)
If version.ProductMajorPart < 8 Then
lb_views.Text = WebBrowser1.Document.GetElementById("vc").FirstChild.InnerText
Else
lb_views.Text = WebBrowser1.Document.GetElementById("watch7-views-info").FirstChild.InnerText
End If
Catch ex As Exception
MsgBox(ex.ToString)
Application.Exit()
End Try
End Sub
Private Property pageready As Boolean = False
Private Sub WaitForPageLoad()
AddHandler WebBrowser1.DocumentCompleted, New WebBrowserDocumentCompletedEventHandler(AddressOf PageWaiter)
While Not pageready
Application.DoEvents()
End While
pageready = False
End Sub
Private Sub PageWaiter(ByVal sender As Object, ByVal e As WebBrowserDocumentCompletedEventArgs)
If WebBrowser1.ReadyState = WebBrowserReadyState.Complete Then
pageready = True
RemoveHandler WebBrowser1.DocumentCompleted, New WebBrowserDocumentCompletedEventHandler(AddressOf PageWaiter)
End If
End Sub
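A note on version checks like the one in getViews: comparing version numbers as strings orders them character by character, so a two-digit major version can sort before a one-digit one. The sketch below illustrates the difference; the version string "10.0.19041" is a made-up example value, not something taken from ieframe.dll.

```vb
Imports System

Module VersionCompareSketch
    Sub Main()
        ' String comparison: the character "1" sorts before "8", so this is True.
        Console.WriteLine("10.0.19041" < "8")
        ' Comparison of parsed versions: 10.0 is not less than 8.0, so this is False.
        Console.WriteLine(New Version("10.0.19041") < New Version("8.0"))
    End Sub
End Module
```

FileVersionInfo also exposes the major version as an Integer through ProductMajorPart, which avoids parsing altogether.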

Dim WithEvents hDoc As HTMLDocument
Set hDoc = WebBrowser1.Document
Dim strValue As String
strValue = hDoc.getElementsByName("so").Item(0).Value

Related

InvokeMember("click") does not trigger WebBrowser.DocumentCompleted event

This old unresolved question seems closest to my issue: WebBrowser control not responding to InvokeMember("click")
The main difference from where that conversation petered out is that my webpage does respond to ContentPlaceHolder1_btnComtrsView.click() correctly, whereas in the original question it did not.
I am trying to pull up a lat/long result by clicking the "View" button after entering a value for COMTRS (use "M11S19E20" for example): https://www.earthpoint.us/TownshipsCaliforniaSearchByDescription.aspx
I've got the two separate DocumentCompleted event handlers working correctly. So is it possible to handle the event of the document updating after clicking View? My code does work if I click once to load the page and click the button, and then a second time to pull the data out.
WebBrowserTRS.ScriptErrorsSuppressed = True
AddHandler WebBrowserTRS.DocumentCompleted, AddressOf ClickViewButton
WebBrowserTRS.Navigate("https://www.earthpoint.us/TownshipsCaliforniaSearchByDescription.aspx")
Private Sub ClickViewButton(sender As Object, e As WebBrowserDocumentCompletedEventArgs)
If e.Url.ToString() = "about:blank" Then Return
RemoveHandler WebBrowserTRS.DocumentCompleted, AddressOf ClickViewButton
Dim trsDoc As HtmlDocument = WebBrowserTRS.Document
Dim elem_Input_Submit As HtmlElement = trsDoc.GetElementById("ContentPlaceHolder1_btnComtrsView")
trsDoc.GetElementById("ContentPlaceHolder1_Comtrs").InnerText = _comtrs
AddHandler WebBrowserTRS.DocumentCompleted, AddressOf GetLatLong
elem_Input_Submit.InvokeMember("click")
End Sub
Private Sub GetLatLong(sender As Object, e As WebBrowserDocumentCompletedEventArgs)
RemoveHandler WebBrowserTRS.DocumentCompleted, AddressOf GetLatLong
Dim trsDoc As HtmlDocument = WebBrowserTRS.Document
Dim centroidFound As Boolean = False
For Each el As HtmlElement In trsDoc.GetElementsByTagName("tr")
Dim val As String
For Each el1 As HtmlElement In el.GetElementsByTagName("TD")
val = el1.InnerText
If val IsNot Nothing AndAlso val.Contains("Centroid") Then
centroidFound = True
' ...
WebBrowserTRS = New WebBrowser
Return
End If
Next
Next
If Not centroidFound Then
MsgBox("Unable to parse the township and range.",
MsgBoxStyle.Information, "Error in location lookup")
End If
Cursor = Cursors.Default
toolstripViewMap.Enabled = True
End Sub

How to save checkbox value in xml using vb.net

I know this is probably a question that comes up quite often, but I haven't been able to work it out.
How can I save this checkbox value in an XML file using VB.NET?
Private Sub ChkVul_Click(sender As Object, e As EventArgs) Handles ChkVul.Click
If ChkVul.Checked = True Then
Me.pnlInsert.Visible = True
Else
Me.pnlInsert.Visible = False
End If
End Sub
Before I provide you with an answer, I wanted to recommend a slight code change. Since you are setting the Boolean value of pnlInsert.Visible based on a condition, which itself returns a Boolean value, simply get rid of the conditional check in the first place:
pnlInsert.Visible = ChkVul.Checked
Now to your question. What you are essentially asking is how to write a value to an XML file. Something to consider is that XML is only a markup language. At the end of the day, an XML file is simply a file that contains formatted text.
If you do not already have an XML file to read from, simply create a new instance of an XDocument. If you do have an XML file, then create a new instance of an XDocument by calling the static XDocument.Load method. Here is a function that takes in a file location and attempts to load an XDocument; if it is unable to, it returns a blank XDocument with a single <root> element:
Private Function LoadOrCreateXml(filename As String) As XDocument
Dim document = New XDocument()
document.Add(New XElement("root"))
Try
If (Not String.IsNullOrWhiteSpace(filename) AndAlso IO.File.Exists(filename)) Then
document = XDocument.Load(filename)
End If
Catch ex As Exception
' for the sake of this example, just silently fail
End Try
Return document
End Function
Now that you have an XDocument, it is just a matter of writing the value. You did not provide any details as to where the value should go or what the tag name should be, so I am going to assume that it should be a child of the <root /> element and the value would look something like this: <ChkVul>true/false</ChkVul>.
To do this, we will need to get the <root /> element by calling the Element method on the XDocument, and then call the Add method on the resulting XElement to add our node with the value:
Dim document = LoadOrCreateXml("my-xml-file.txt")
document.Element("root").Add(New XElement("ChkVul", ChkVul.Checked))
The final piece of all this is to write the in-memory XDocument back to the file. You can leverage the XDocument.Save method:
Dim filename = "my-xml-file.txt"
Dim document = LoadOrCreateXml(filename)
document.Element("root").Add(New XElement("ChkVul", ChkVul.Checked))
document.Save(filename)
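One caveat with the snippet above: each call to Add appends another <ChkVul> element, so saving repeatedly accumulates duplicates. A minimal sketch of one way around that, assuming a single <ChkVul> child under <root> is what is wanted, uses XElement.SetElementValue, which updates the child if it already exists and adds it otherwise. The module name is hypothetical, and the checkbox value is replaced by literal Booleans so the sketch runs without a form.

```vb
Imports System
Imports System.Linq
Imports System.Xml.Linq

Module SetElementValueSketch
    Sub Main()
        Dim document = New XDocument(New XElement("root"))
        Dim root = document.Element("root")
        ' The first call adds <ChkVul>; the second call updates it in place.
        root.SetElementValue("ChkVul", True)
        root.SetElementValue("ChkVul", False)
        ' Only one <ChkVul> element exists after both calls.
        Console.WriteLine(root.Elements("ChkVul").Count())
    End Sub
End Module
```

In the form, the literal would be ChkVul.Checked, followed by document.Save(filename) as before.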
Maybe this is what you have in mind. I tested this with other controls,
Public Class Form1
Private SavePath As String = IO.Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop),
"MyControls.xml") 'some valid path <-----------<<<
Private Sub Form1_FormClosing(sender As Object, e As FormClosingEventArgs) Handles Me.FormClosing
SaveCheckedState(ChkVul, "ChkVul", True)
End Sub
Private Sub Form1_Shown(sender As Object, e As EventArgs) Handles Me.Shown
If IO.File.Exists(SavePath) Then
Try
MyControls = XElement.Load(SavePath)
For Each el As XElement In MyControls.Elements
Dim ctrl() As Control = Me.Controls.Find(el.@name, True)
If ctrl.Length > 0 Then
Dim chkd As System.Reflection.PropertyInfo = ctrl(0).GetType().GetProperty("Checked")
Dim chkdV As Boolean = Boolean.Parse(el.@checked)
chkd.SetValue(ctrl(0), chkdV)
End If
Next
Catch ex As Exception
'todo
End Try
End If
End Sub
Private Sub ChkVul_CheckedChanged(sender As Object, e As EventArgs) Handles ChkVul.CheckedChanged
If ChkVul.Checked Then
Me.pnlInsert.Visible = True
Else
Me.pnlInsert.Visible = False
End If
SaveCheckedState(ChkVul, "ChkVul")
End Sub
Private Sub RadioButton1_CheckedChanged(sender As Object, e As EventArgs) Handles RadioButton1.CheckedChanged
SaveCheckedState(RadioButton1, "RadioButton1")
End Sub
Private Sub CheckBox1_CheckedChanged(sender As Object, e As EventArgs) Handles CheckBox1.CheckedChanged
SaveCheckedState(CheckBox1, "CheckBox1")
End Sub
Private MyControls As XElement = <controls>
</controls>
Private Sub SaveCheckedState(Ctrl As Control,
Name As String,
Optional Save As Boolean = False)
If Ctrl.GetType().GetProperty("Checked") IsNot Nothing Then
Dim chkd As System.Reflection.PropertyInfo = Ctrl.GetType().GetProperty("Checked")
Dim chkdV As Boolean = CBool(chkd.GetValue(Ctrl))
Dim ie As IEnumerable(Of XElement)
ie = From el In MyControls.Elements
Where el.@name = Ctrl.Name
Select el Take 1
Dim thisXML As XElement
If ie.Count = 0 Then
thisXML = <ctrl name=<%= Name %>></ctrl>
MyControls.Add(thisXML)
Else
thisXML = ie(0)
End If
thisXML.@checked = chkdV.ToString
End If
If Save Then
Try
MyControls.Save(SavePath)
Catch ex As Exception
'todo
' Stop
End Try
End If
End Sub
End Class

Saving pdf document from webbrowser control

I'm navigating in a WebBrowser control to a URL like this:
http://www.who.int/cancer/modules/Team%20building.pdf
It's shown in the WebBrowser control. What I want to do is download this PDF file to the computer. I have tried many ways:
Dim filepath As String
filepath = "D:\temp1.pdf"
Dim client As WebClient = New WebClient()
AddHandler client.DownloadFileCompleted, AddressOf client_DownloadFileCompleted
client.DownloadFileAsync(WebBrowserEx1.Url, filepath)
This one downloads a PDF, but the file is empty.
Also tried with
objWebClient.DownloadFile()
nothing changed.
I tried to show a save or print dialog;
WebBrowserEx1.ShowSaveAsDialog()
WebBrowserEx1.ShowPrintDialog()
but they didn't show any dialog. Maybe the last one fails because it doesn't wait for the PDF to load into the WebBrowser completely.
With HTML files there is no problem downloading, but with this .pdf file I think I didn't manage to wait for the file to be loaded into the browser. These functions:
Private Sub WaitForPageLoad(ByVal adimno As String)
If adimno = "1" Then
AddHandler WebBrowserEx1.DocumentCompleted, New WebBrowserDocumentCompletedEventHandler(AddressOf PageWaiter)
While Not pageReady
Application.DoEvents()
End While
pageReady = False
End If
End Sub
Private Sub PageWaiter(ByVal sender As Object, ByVal e As WebBrowserDocumentCompletedEventArgs)
If WebBrowserEx1.ReadyState = WebBrowserReadyState.Complete Then
pageReady = True
RemoveHandler WebBrowserEx1.DocumentCompleted, New WebBrowserDocumentCompletedEventHandler(AddressOf PageWaiter)
End If
End Sub
are not working in this situation; the code gets into an infinite loop.
So does anyone know how to wait for the PDF to load and then save it to the computer?
You could test the URL when DocumentCompleted fires and, if it's a .pdf, do the following and then navigate back, for example:
Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
WebBrowserEx1.Navigate("http://www.who.int/cancer/modules/Team%20building.pdf")
End Sub
Private Sub WebBrowserEx1_DocumentCompleted(ByVal sender As Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowserEx1.DocumentCompleted
If WebBrowserEx1.Url.ToString.Contains(".pdf") Then
Using webClient = New WebClient()
Dim bytes = webClient.DownloadData(WebBrowserEx1.Url.ToString) 'again variable here
File.WriteAllBytes(Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "TEST.pdf"), bytes) 'save to the desktop or another special folder; Environment.SpecialFolder lists the readily available user folders
End Using
'WebBrowserEx1.goback() 'could send browser back a page as well
End If
End Sub
You will need to make the file name "TEST" a variable instead of a static string, or else you will overwrite the same file each time. Perhaps:
WebBrowserEx1.DocumentTitle.ToString & ".pdf"
instead, which would save the file as a PDF named after the web page title. The only problem is that if the title contains illegal characters (ones Windows doesn't allow in file names), it will throw an exception, so that should be handled.
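One way to handle the illegal characters, sketched here under the assumption that replacing them with an underscore is acceptable, is to filter the title through Path.GetInvalidFileNameChars before building the file name. ToSafeFileName is a hypothetical helper name, and the sample title is made up.

```vb
Imports System
Imports System.IO

Module FileNameSketch
    ' Hypothetical helper: replaces every character that is not allowed
    ' in a file name on the current platform with an underscore.
    Function ToSafeFileName(title As String) As String
        Dim result As String = title
        For Each c As Char In Path.GetInvalidFileNameChars()
            result = result.Replace(c, "_"c)
        Next
        Return result
    End Function

    Sub Main()
        ' Made-up page title containing characters Windows rejects in file names.
        Console.WriteLine(ToSafeFileName("Team building: part 1/2") & ".pdf")
    End Sub
End Module
```

The set of invalid characters depends on the operating system, so the exact result can differ between platforms. An empty title would still need separate handling.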

Confusing System.NullReferenceException error on a defined variable - VB

Alright, I am new here, so I apologize in advance if I post incorrectly or am a little vague. My problem is that I get a NullReferenceException when I run my code, but while debugging, hovering the mouse over the problematic variable does show its value.
Here is the VB code that I am working with:
Private Sub Login_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles login.Click
status.Text = "Connecting...."
WebBrowser2.Navigate("http://*****.com/?op=login")
WebBrowser2.Document.GetElementById("loginUsername").InnerText = username.Text
WebBrowser2.Document.GetElementById("loginPassword").InnerText = password.Text
WebBrowser2.Document.GetElementById("loginSubmit").InvokeMember("click")
End Sub
Here is the snapshot of what is going on:
------------ EDIT : SOLUTION -------------------
WebBrowser2.Url = New Uri("http://*****.com/?op=login")
WaitForPageLoad() ' <---------- ADDED NEW FUNCTION TO WAIT FOR PAGE LOAD
WebBrowser2.Document.GetElementById("loginUsername").InnerText = username.Text
WebBrowser2.Document.GetElementById("loginPassword").InnerText = password.Text
WebBrowser2.Document.GetElementById("loginSubmit").InvokeMember("click")
status.Text = "Completed"
So I created a new function (credit goes to BGM in How to wait until WebBrowser is completely loaded in VB.NET?) called WaitForPageLoad(), which essentially loops until the page is ready and, once it is, removes the handler so the login succeeds and the page does not loop. Here is WaitForPageLoad():
Private Property pageready As Boolean = False
Private Sub WaitForPageLoad()
AddHandler WebBrowser2.DocumentCompleted, New WebBrowserDocumentCompletedEventHandler(AddressOf PageWaiter)
While Not pageready
Application.DoEvents()
End While
pageready = False
End Sub
Private Sub PageWaiter(ByVal sender As Object, ByVal e As WebBrowserDocumentCompletedEventArgs)
If WebBrowser2.ReadyState = WebBrowserReadyState.Complete Then
pageready = True
RemoveHandler WebBrowser2.DocumentCompleted, New WebBrowserDocumentCompletedEventHandler(AddressOf PageWaiter)
End If
End Sub
WebBrowser2.Navigate takes some time to load the document, but it is asynchronous. That means the following code is executed before the document has finished loading.
Consequently, in the next line, GetElementById cannot yet find the target element and returns Nothing. To prevent this, do not execute that code directly after calling Navigate; instead, create a handler for the event that is fired once the document has finished loading, and execute the code there. This is the DocumentCompleted event.
On that line in particular...
Document could be null
The result of GetElementById("loginUsername") could be null.
Why do you think that username is null?
I bet that WebBrowser2.Document.GetElementById("loginUsername") returns null.
The other possibility is that Document is null.

Vb.Net Wait for webbrowser finish navigate

How can I wait until the WebBrowser has loaded the page?
I tried:
webbrowser1.navigate(url)
msgbox("done")
This is the approach I used when I had the same problem. By adding a handler you don't need a timer doing unnecessary processing; instead, the event fires as soon as the document has loaded. Don't be fooled by the name DocumentCompleted; it actually waits for the web page to load.
AddHandler webbrowser1.DocumentCompleted, AddressOf WebpageLoaded
webbrowser1.Navigate(url)
Public Sub WebpageLoaded(sender As System.Object, e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs)
MessageBox.Show("Done")
End Sub
I'm not saying this is the best way to go, but it worked well for me :)
Do While wb.ReadyState <> WebBrowserReadyState.Complete
Application.DoEvents()
Loop
I inherit a new class from WebBrowser control:
Public Class WebBrowserSyncFW
Inherits WebBrowser
Public Async Function NavigateSync(ByVal urlString As String, Optional ByVal timeoutmillisec As Integer = 30000) As Task(Of Boolean)
Dim IsLoaded As Boolean = False
Me.ScriptErrorsSuppressed = True
Me.Navigate(urlString)
AddHandler Me.DocumentCompleted, Sub(sender As Object, e As WebBrowserDocumentCompletedEventArgs)
IsLoaded = True
End Sub
For i = 1 To timeoutmillisec \ 100
Await Task.Delay(100).ConfigureAwait(False)
If IsLoaded = True Then Return True
Next
Return False
End Function
End Class
Usage:
If Await WebBrowserSyncFW1.NavigateSync("http://www.youtube.com") Then
MsgBox("Page is loaded!", MsgBoxStyle.Information)
Else
MsgBox("Timeout!", MsgBoxStyle.Exclamation)
End If
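As an alternative to polling a flag every 100 ms, the wait-with-timeout part can be sketched with a TaskCompletionSource and Task.WhenAny. This is only a console sketch under stated assumptions: WaitOrTimeout is a hypothetical helper, and the delayed continuation stands in for a DocumentCompleted handler that would call TrySetResult in a real form.

```vb
Imports System
Imports System.Threading.Tasks

Module WaitWithTimeoutSketch
    ' Hypothetical helper: returns True if signal completes before the timeout, else False.
    Async Function WaitOrTimeout(signal As Task, timeoutMillisec As Integer) As Task(Of Boolean)
        ' WhenAny completes as soon as either the signal or the delay finishes.
        Dim finished As Task = Await Task.WhenAny(signal, Task.Delay(timeoutMillisec))
        Return finished Is signal
    End Function

    Sub Main()
        Dim loaded = New TaskCompletionSource(Of Boolean)()
        ' Stand-in for the page load: in a form, the DocumentCompleted
        ' handler would call loaded.TrySetResult(True) instead.
        Task.Delay(50).ContinueWith(Sub(t) loaded.TrySetResult(True))
        ' Blocking on .Result is fine in a console app; in a form, use Await instead.
        Console.WriteLine(WaitOrTimeout(loaded.Task, 1000).Result)
    End Sub
End Module
```

In the WebBrowser case the handler should be attached before calling Navigate and removed once it has fired, so that repeated navigations do not stack up handlers.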