How to build a for each with a customized Module - vb.net

Hi i have build a customized module do get elements by classname but i can not get work with for each function.
Well i try to get the loop on the code that call my elements by classname wich is a customized module
Normally its easy to get the loop but with this Customized Module , i can not find the way to this to work.
I am new yet on vb , and i can not find any example to do it wright
This is my code i need to perform a loop
I need some some thing like this
For Each cuele In ele
Next
But its give me an error
Expression is of type 'System.Windows.Forms.HtmlElement', which is not a collection type
I have try in a different way but i get a null exception
Imports System.Text
Imports System.Net
Imports System.IO
Imports System.Text.RegularExpressions
Public Class Form3
Dim WithEvents htmldoc As HtmlDocument
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
'TextBox1.Text = ""
WebBrowser1.Navigate(TextBox1.Text)
End Sub
Private Sub Button2_Click(sender As Object, e As EventArgs) Handles Button2.Click
htmldoc = WebBrowser1.Document
'WebBrowser1.Document.Forms(0).InvokeMember("click")
'WebBrowser1.Document.Forms(0).InvokeMember("submit")
Dim allelements As HtmlElementCollection = WebBrowser1.Document.All
For Each webpageelement As HtmlElement In allelements
'ListBox1.Items.Add(webpageelement.GetAttribute("title").ToString)
'ListBox1.Items.Add(webpageelement.GetAttribute("href").ToString)
Dim PageElement As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("time")
For Each CurElement As HtmlElement In PageElement
'TextBox1.Text = TextBox1.Text + CurElement.GetAttribute("Liga NOS") + Environment.NewLine
'ListBox2.Items.Add(CurElement.GetAttribute("datetime") + Environment.NewLine)
Next
Dim PageElement2 As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("td")
For Each CurElement As HtmlElement In PageElement2
'TextBox1.Text = TextBox1.Text + CurElement.GetAttribute("Liga NOS") + Environment.NewLine
'ListBox1.Items.Add(CurElement.GetAttribute("data-content") + Environment.NewLine)
Next
Dim PageElement3 As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("img")
For Each CurElement As HtmlElement In PageElement3
'TextBox1.Text = TextBox1.Text + CurElement.GetAttribute("Liga NOS") + Environment.NewLine
'ListBox3.Items.Add(CurElement.GetAttribute("alt") + Environment.NewLine)
Next
Next
For i As Integer = ListBox1.Items.Count - 1 To 0 Step -1
If ListBox1.GetItemText(ListBox1.Items(i)) = String.Empty Then
ListBox1.Items.RemoveAt(i)
End If
Next i
For i As Integer = ListBox2.Items.Count - 1 To 0 Step -1
If ListBox2.GetItemText(ListBox2.Items(i)) = String.Empty Then
ListBox2.Items.RemoveAt(i)
End If
Next i
Dim ele As HtmlElement = WebBrowser1.Document.GetElementsByClassName("league-data")(0)
'TextBox1.Text = ele.InnerText
ListBox2.Items.Add(ele.InnerText)
Dim ArrayTag As New ArrayList
For Each item As HtmlElement In
Form1.WebBrowser1.Document.GetElementsByTagName("div")
ArrayTag.Add(item.InnerText)
Next
ListBox2.Items.Add(ArrayTag(64))
End Sub
End Class

OK This might well work, but I can't test it. I've altered the extension a little, so that it creates a list rather than a HtmlElementCollection (It seems that these are read-only)
<System.Runtime.CompilerServices.Extension()>
Public Function GetElementsByClassName(Source As HtmlElementCollection, ClassName As String) As List(Of HtmlElement)
Dim output As New List(Of HtmlElement)
For i As Integer = 0 To Source.Count - 1
Try
If Source(i).GetAttribute("className") = ClassName Then
output.Add(Source(i))
End If
Catch ex As Exception
End Try
Next
Return output
End Function
And edited the form code so that it also deals with a list instead of the HtmlElementCollection
For Each cuele As HtmlElement In ele
ListBox2.Items.Add(cuele.InnerText)
Next
Hopefully this should work.
I also suggest that you use a list instead of an ArrayList as Lists are faster and strongly typed ..
Dim ArrayTag As New List(Of String)
For Each item As HtmlElement In Form1.WebBrowser1.Document.GetElementsByTagName("div")
ArrayTag.Add(item.InnerText)
Next
ListBox2.Items.Add(ArrayTag(64))

Related

VB .Net navigate each link and get a specific url

I'm having trouble using the web browser in VB .NET to navigate a list of URL's inside a multiline textbox, wait for them for load and get a specific URL (should contain a specific word).
What I have until now:
Private Sub Button4_Click(sender As Object, e As EventArgs) Handles Button4.Click
Dim webClient As New System.Net.WebClient
'Dim htmlDoc As New HtmlAgilityPack.HtmlDocument()
For Each strLine As String In TextBox4.Text.Split(vbCrLf)
Console.WriteLine("Turning False")
Console.WriteLine("----")
WebBrowser1.Navigate(strLine)
Console.WriteLine("Waiting for navigation - " & strLine)
Console.WriteLine("Navigated")
Next
For Each link In links
TextBox6.Text = TextBox6.Text & link & vbCrLf
Next
Dim lines As New List(Of String)(TextBox6.Lines)
For i As Integer = lines.Count - 1 To 1 Step -1
If lines(i) = lines(i - 1) Then
lines.RemoveAt(i)
End If
Next
TextBox6.Lines = lines.ToArray
End Sub
Private Sub WebBrowser1_DocumentCompleted(sender As Object, e As WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
For Each ele As HtmlElement In WebBrowser1.Document.Links
'Get whatever text there is in the 'href' attribute
Dim eletarget As String = ele.GetAttribute("href")
'Add it to the listbox
If eletarget.Contains("cfsecure") Then
links.Add(eletarget)
'Carry on to the next link
End If
Next
End Sub
First of all, it seems it doesn't wait until fully loaded. After that, I don't get any link from what I asked (don't know if it's related to the first problem I have or any mistake in my code getting the href with the word "cfsecure").
Private Sub WebBrowser1_DocumentCompleted(sender As Object, e As WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
For Each ele As HtmlElement In WebBrowser1.Document.Links
'Get whatever text there is in the 'href' attribute
Dim eletarget As String = ele.GetAttribute("href")
'Add it to the listbox
If eletarget.Contains("cfsecure") Then
links.Add(eletarget)
'Carry on to the next link
End If
Next
End Sub
What I'm missing here?

VB.NET How to get text from an HTML table?

I'm trying to access certain tables and values from a certain html page inside of a webbrowser control. Here's what i'm trying to access:
https://gyazo.com/c4312f860397d0f86ccce425d1fb3d48
In the end, I'm trying to access the value="100" inside of the input name="Server[players]". Is there any way to do this? I'm not using any external addons for visual studio or anything. I've already gotten this working:
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
Dim divs = WebBrowser1.Document.GetElementsByTagName("div")
For Each div As HtmlElement In divs
If div.GetAttribute("id") = ("statusdetail-ajax") Then
Dim status As String = div.InnerText
Label1.Text = status
End If
Next
Which just shows me the online/offline status. Any help is greatly appreciated!
You want the GetAttribute() method, like so:
Private Sub DisplayMetaDescription()
If (WebBrowser1.Document IsNot Nothing) Then
Dim Elems As HtmlElementCollection
Dim WebOC as WebBrowser = WebBrowser1
Elems = WebOC.Document.GetElementsByTagName("META")
For Each elem As HtmlElement In Elems
Dim NameStr As String = elem.GetAttribute("name")
If ((NameStr IsNot Nothing) And (NameStr.Length <> 0)) Then
If NameStr.ToLower().Equals("description") Then
Dim ContentStr As String = elem.GetAttribute("content")
MessageBox.Show("Document: " & WebOC.Url.ToString() & vbCrLf & "Description: " & ContentStr)
End If
End If
Next
End If
End Sub
Taken from:
https://msdn.microsoft.com/en-us/library/system.windows.forms.htmlelement.getattribute(v=vs.110).aspx
EDIT:
Here's the same concept adapted to your code:
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
Dim inputs = WebBrowser1.Document.GetElementsByTagName("input")
For Each input As HtmlElement In inputs
If input.GetAttribute("id") = ("Server_players") Then
Dim status As String = input.GetAttribute("value")
Label1.Text = status
End If
Next
End Sub
In other words, InnerText returns everything between elements, while GetAttribute() returns an attribute inside an element.
<element attribute="Value">Inner Text</element>

Can't login to website (VB.net browser)

I try to login here: http://www.swagbucks.com/p/login
I've tried the following methods:
Private Sub Button2_Click(sender As Object, e As EventArgs) Handles Button2.Click
For Each element As HtmlElement In WebBrowser1.Document.GetElementsByTagName("input")
If element.Id = "sbxJxRegEmail" Then
Dim i = 0
Dim strLine = TextBox2.Text
element.OuterText = strLine
End If
If element.Id = "sbxJxRegPswd" Then
Dim i = 0
Dim strLine = TextBox3.Text
element.OuterText = strLine
End If
Next
End Sub
and then I tried this
WebBrowser1.Document.GetElementById("sbxJxRegEmail").SetAttribute("value", TextBox2.Text)
WebBrowser1.Document.GetElementById("sbxJxRegPswd").SetAttribute("value", TextBox3.Text)
WebBrowser1.Document.GetElementById("loginBtn").Focus()
WebBrowser1.Document.GetElementById("loginBtn").InvokeMember("click")
What happens is that the text is entered, but on the browser, it's not really there. It's hard to explain, but basically the username and password are just floating on top of the text boxes like some sort of ghost text and when I then click it, it disappears.
How do I fix this?
It's possible that they are masked inputs, probably need to set the inner text of those fields like so...
WebBrowser1.Document.All("Username").SetAttribute("value", "myusername")
WebBrowser1.Document.All("Password").SetAttribute("value", "mypassword")
Dim elements As HtmlElementCollection = (WebBrowser1.Document.All.GetElementsByName("Password"))
For Each element As HtmlElement In elements
element.InnerText = "mypassword"
Next
'You may have to do the same for your username/email

VB.net multithreading for loop, parallel threads

I have a simple form with 2 RichTextBoxes and 1 button, the code grabs the url address from RichTextBox1 and phrases the page for the title field using regex and appends it to RichTextBox2. I want to multithread everything in such way that none of the url's are skipped and the thread numbers can be set ( according to the system free resources ) For example, let's say 10 threads to run in parallel. I searched everything and the best that I managed to do is run everything in a background worker and keep the GUI from freezing while working. A short code sample will be of much help, I am a beginner in VB.net.
Public Class Form1
Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
End Sub
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
For i = 0 To RichTextBox1.Lines.Length - 1
If RichTextBox1.Lines(i).Contains("http://") Then
Dim html As String = New System.Net.WebClient() _
.DownloadString(RichTextBox1.Lines(i))
Dim pattern As String = "(?<=\<title\>)([^<]+?)(?=\</title\>)"
Dim match As System.Text.RegularExpressions.Match = _
System.Text.RegularExpressions.Regex.Match(html, pattern)
Dim title As String = match.Value
RichTextBox2.AppendText(title & vbCrLf)
End If
Next
End Sub
End Class
Updated code ( throwing "Index was outside the bounds of the array." errors. )
Imports System
Imports System.Threading
Public Class Form1
Public Sub test(ByVal val1 As String, ByVal val2 As String)
Dim zrow As String
zrow = RichTextBox1.Lines(val1)
If zrow.Contains("http://") Then
Dim html As String = New System.Net.WebClient().DownloadString(zrow)
Dim pattern As String = "(?<=\<title\>)([^<]+?)(?=\</title\>)"
Dim match As System.Text.RegularExpressions.Match = System.Text.RegularExpressions.Regex.Match(html, pattern)
Dim title As String = match.Value
RichTextBox2.AppendText(val2 & title & vbCrLf)
End If
End Sub
Public Sub lastfor(ByVal number)
Dim start As Integer = number - 100
For x = start To number - 1
Try
test(x, x)
RichTextBox2.AppendText(x & RichTextBox1.Lines(x).Trim & vbCrLf)
Catch ex As Exception
'MsgBox(ex.Message)
RichTextBox3.AppendText(ex.Message & vbCrLf & vbCrLf)
End Try
Next
End Sub
Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
Control.CheckForIllegalCrossThreadCalls = False
End Sub
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
Dim TotalLines As String = RichTextBox1.Lines.Length - 1
Dim TotalThreads As Integer = 10
Dim LinesPerThread As Integer = TotalLines / TotalThreads
Dim increment As String = LinesPerThread
Dim zdata(TotalThreads) As String
For i = 0 To TotalThreads - 1
zdata(i) = increment
increment = increment + LinesPerThread
Next
Dim lst As New List(Of Threading.Thread)
For Each bump As String In zdata
Dim t As New Threading.Thread(Function(l As String)
'Do something with l
'Update GUI like this:
If bump = String.Empty Or bump Is Nothing Then
Else
lastfor(l)
'MsgBox(l)
End If
End Function)
lst.Add(t)
t.Start(bump)
Next
'test(1)
End Sub
End Class
There are two ways two achieve this:
First, if you are using .NET 4.0, you could use a Parallel.ForEach loop:
Parallel.ForEach(RichTextBox1.Lines, Function(line As String)
' Do something here
' To update the GUI use:
Me.Invoke(Sub()
' Update GUI like this...
End Sub)
Return Nothing
End Function)
The other way is to do this manually (and you will have slightly more control):
Dim lst As New List(Of Threading.Thread)
For Each line In RichTextBox1.Lines
Dim t As New Threading.Thread(Function(l As String)
'Do something with l
'Update GUI like this:
Me.Invoke(Sub()
'Update Gui...
End Sub)
End Function)
lst.Add(t)
t.Start(line)
Next
Both of these are very crude, but will get the job done.
EDIT:
Here is a sample code that will control the number of threads:
Dim lst As New List(Of Threading.Thread)
Dim n As Integer = 1 ' Number of threads.
Dim npl As Integer = RichTextBox1.Lines / n
Dim seg As New List(Of String)
For Each line In RichTextBox1.Lines
For i = npl - n To npl
seg.Add(RichTextBox1.Lines.Item(i))
Next
Dim t As New Threading.Thread(Function(l As String())
For Each lin In l
' TO-DO...
Next
'Do something with l
'Update GUI like this:
Me.Invoke(Sub()
'Update Gui...
End Sub)
End Function)
lst.Add(t)
t.Start(seg.ToArray())
Next
*The above code might have bugs.

Form reloading data and Class variables resetting in VB.net

This program basically goes to a websites and gets all the links in a specific div tag.
It then navigates to each link and gets the links in those pages as well.
However, after getting the first links successfully and traveling to the first website, it recheck the windows form data and resets the class variables to their original values, thus losing all the links.
Why does it reload the windows form data and how could I keep the previous data?
Imports System.Text.RegularExpressions
Public Class Form
Private stage As String = "Getting Page Links"
Dim PageUrls() As String = {}
Dim PageHtml() As String = {}
Private Sub Form_Load(sender As Object, e As System.EventArgs) Handles Me.Load
WebBrowser.Navigate("websiteurlhidden")
End Sub
Private Sub WebBrowser_DocumentCompleted(sender As Object, e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser.DocumentCompleted
If WebBrowser.ReadyState = WebBrowserReadyState.Complete Then
Try
Select Case stage
Case "Getting Page Links"
Dim htmlDocument As HtmlDocument = Me.WebBrowser.Document
Dim htmlElementCollection As HtmlElementCollection = htmlDocument.GetElementsByTagName("DIV")
For Each htmlElement As HtmlElement In HtmlElementCollection
Dim imgUrl As String = htmlElement.GetAttribute("classname")
If imgUrl = " nine " Then
Dim linkIndex As Integer = 0
Dim index2 As Integer = 0
For Each link As HtmlElement In htmlElement.GetElementsByTagName("a")
If linkIndex >= 26 Then
If Not String.IsNullOrEmpty(link.GetAttribute("href")) Then
ReDim Preserve PageUrls(index2)
PageUrls(index2) = link.GetAttribute("href")
' MessageBox.Show(link.GetAttribute("href"))
' MessageBox.Show(PageUrls(linkIndex))
index2 = index2 + 1
End If
End If
linkIndex = linkIndex + 1
Next
For Each str As String In PageUrls
' MessageBox.Show(str)
Next
stage = "Going through pages"
End If
Next
GoThroughPages()
Case "Going through pages"
Dim htmlDocument As HtmlDocument = Me.WebBrowser.Document
Dim htmlElementCollection As HtmlElementCollection = htmlDocument.GetElementsByTagName("DIV")
Dim linkIndex As Integer = 0
For Each htmlElement As HtmlElement In HtmlElementCollection
Dim imgUrl As String = htmlElement.GetAttribute("classname")
If imgUrl = " nine " Then
ReDim Preserve PageHtml(linkIndex)
'need to make permanent.
PageHtml(linkindex) = htmlElement.ToString()
Dim PageDownloadLinks = htmlElement.GetElementsByTagName("a")
End If
End
Next
' GoThroughPages()
Case Else
MessageBox.Show("case else")
End Select
Catch ex As Exception
' MessageBox.Show(ex.Message & " " & ex.ToString)
End Try
End If
End Sub
Private Sub GoThroughPages()
For linkIndex As Integer = 0 To PageUrls.Count - 1
MessageBox.Show(PageUrls(linkIndex))
WebBrowser.Navigate(PageUrls(linkIndex))
Delay(3)
While (WebBrowser.IsBusy)
Application.DoEvents()
End While
Next
End Sub
Sub Delay(ByVal dblSecs As Double)
Const OneSec As Double = 1.0# / (1440.0# * 60.0#)
Dim dblWaitTil As Date
Now.AddSeconds(OneSec)
dblWaitTil = Now.AddSeconds(OneSec).AddSeconds(dblSecs)
Do Until Now > dblWaitTil
Application.DoEvents() ' Allow windows messages to be processed
Loop
End Sub
End Class
The simple solution is to change:
Dim linkIndex As Integer = 0
to
Static linkIndex As Integer = 0
This will cause linkIndex to retain its value between calls and the ReDim Preserve PageHtml(linkIndex) will not be reset on each call.