Why VB.NET simple app takes forever to load several lines - vb.net

I have a super simple script that I am using to separate a long list of phone numbers we've gathered from donors over the years into separate area code files.
Obviously, when you have almost 1 million lines it's going to take a while - BUT - if I put in 1,000 it takes less than a second. Once I put in 1 million it takes 10 seconds to do only 5 lines. How could this be?
Imports System.IO
Public Class Form1
Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
System.Windows.Forms.Control.CheckForIllegalCrossThreadCalls = False
BackgroundWorker1.RunWorkerAsync()
End Sub
Private Sub BackgroundWorker1_DoWork(ByVal sender As System.Object, ByVal e As System.ComponentModel.DoWorkEventArgs) Handles BackgroundWorker1.DoWork
System.Windows.Forms.Control.CheckForIllegalCrossThreadCalls = False
Dim lines As String
lines = RichTextBox1.Lines.Count
Dim looper As String = 0
Dim path As String = "c:\acs\"
MsgBox("I have " + lines + " lines to do.")
If (Not System.IO.Directory.Exists(path)) Then
System.IO.Directory.CreateDirectory(path)
End If
Dim i As String = 0
For loops As Integer = 0 To RichTextBox1.Lines.Count - 1
Dim ac As String = RichTextBox1.Lines(looper).ToString
ac = ac.Substring(0, 3)
Dim strFile As String = path + ac + ".txt"
Dim sw As StreamWriter
sw = File.AppendText(strFile)
sw.WriteLine(RichTextBox1.Lines(looper))
sw.Close()
Label1.Text = String.Format("Processing item {0} of {1}", looper, lines)
looper = looper + 1
Next
MsgBox("done now")
End Sub
End Class

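Each time you read the RichTextBox.Lines property, the control has to split its entire content into a new array of lines. Your For loops As Integer = 0 To RichTextBox1.Lines.Count - 1 loop reads RichTextBox1.Lines(looper) twice per iteration, so with 1 million lines every iteration re-splits 1 million lines. That is the performance hit.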
Try to use:
For Each vsLine As String in RichTextBox1.Lines
instead. It will be a lot faster. Alternatively, if you must use a For loop, get the array once:
Dim vasLines() As String = RichTextBox1.Lines
For viLines As Integer = 0 To UBound(vasLines)
'....
Next
instead.

First, you're performing UI updates in your For loop. That will take time.
You're also updating the UI from a thread that is not the main thread, which might impact performance. You should not set the CheckForIllegalCrossThreadCalls property to False. Instead, update the UI properly using the ReportProgress method of the BackgroundWorker.
You are opening and closing a file on each iteration of the loop. That will take time as well.
I think a better method would be to add the data to a Dictionary(Of String, List(Of String)), with the area code as the key and a list holding all the numbers for that area code. Once the Dictionary is filled, loop through the keys and write the numbers out.
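The Dictionary idea could look roughly like the sketch below. The module and function names (AreaCodeSplitter, GroupByAreaCode, WriteGroups) are made up for illustration, and the sketch assumes, as the question's code does, that the first three characters of each line are the area code. In the form you would read RichTextBox1.Lines into a local array once and pass that array in.

```vbnet
Imports System.Collections.Generic
Imports System.IO

Module AreaCodeSplitter
    ' Groups lines by their first three characters (assumed to be the area code).
    Function GroupByAreaCode(lines As IEnumerable(Of String)) As Dictionary(Of String, List(Of String))
        Dim groups As New Dictionary(Of String, List(Of String))
        For Each line As String In lines
            ' Skip blank or too-short lines instead of letting Substring throw.
            If line Is Nothing OrElse line.Length < 3 Then Continue For
            Dim areaCode As String = line.Substring(0, 3)
            Dim bucket As List(Of String) = Nothing
            If Not groups.TryGetValue(areaCode, bucket) Then
                bucket = New List(Of String)
                groups.Add(areaCode, bucket)
            End If
            bucket.Add(line)
        Next
        Return groups
    End Function

    ' Writes each group in one call, so every file is opened once rather than once per line.
    Sub WriteGroups(groups As Dictionary(Of String, List(Of String)), folder As String)
        Directory.CreateDirectory(folder) ' does nothing if the folder already exists
        For Each pair As KeyValuePair(Of String, List(Of String)) In groups
            File.AppendAllLines(Path.Combine(folder, pair.Key & ".txt"), pair.Value)
        Next
    End Sub
End Module
```

With this shape the per-line work is only a dictionary lookup, and the file I/O happens once per area code at the end.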

Related

System.IO.File.ReadAllLines() method with progress bar in vb.net

I am reading a large text file in vb.net using System.IO.File.ReadAllLines(TextFileURL). Since the process takes a few seconds to finish, would there be a possibility to use a progress bar?
.
RawFile = System.IO.File.ReadAllLines(TextFileURL)
lines = RawFile.ToList
If arg = "" Then MsgBox("IMPORTER IS DONE")
.
There is no loop or anything that could be used to update the value of the progress bar. Any thoughts or workaround would be appreciated.
The following reads a pretty big .TXT file line by line and reports progress:
Code:
Imports System.IO
Public Class Form1
Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
End Sub
Private Async Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
Dim dialog As New OpenFileDialog
dialog.Filter = "Text|*.txt"
Dim result = dialog.ShowDialog()
If result <> DialogResult.OK Then
Return
End If
Dim stream = File.OpenRead(dialog.FileName)
Dim reader As New StreamReader(stream)
Dim percentage As Integer
While True
Dim line As String = Await reader.ReadLineAsync()
If line Is Nothing Then
Exit While
End If
' TODO do something with your line
Dim percentD As Double = 1D / stream.Length * stream.Position * 100D
Dim percentI As Integer = Math.Floor(percentD)
If percentI > percentage Then
ProgressBar1.Value = percentI
percentage = percentI
End If
End While
Await stream.DisposeAsync()
End Sub
End Class
Notes:
- this puts a burden on the stream, as ultimately reading a line is small data; try using a buffered stream to reduce pressure
- notice that I only report when the integer percentage is greater than the previous one; you'd overwhelm the UI when updating the progress bar otherwise
- there is trivial async usage; you may want to improve that overall
- the progress bar doesn't exactly reach 100%; I'll let you fix that, it's pretty easy to do
You can use ReadLines instead of ReadAllLines.
As the docs say, when you are working with very large files, ReadLines can be more efficient:
Dim lstOflines As New List(Of String)
For Each line As String In File.ReadLines(TextFileURL)
lstOflines.Add(line)
Next line
To get the total number of lines, you can make a guess based on the file size instead of processing the file twice.
Code for getting the file size (use it before you start processing):
Dim myFile As New FileInfo(TextFileURL)
Dim sizeInBytes As Long = myFile.Length
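One way to turn the file size into a progress estimate is to count the bytes of each line as it is read. This is only a sketch: ReadProgress, EstimatePercent and ReadWithProgress are hypothetical names, and the byte count assumes UTF-8 text with two-byte CR+LF line endings, so the percentage is approximate for other encodings or line endings.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Text

Module ReadProgress
    ' Converts bytes consumed so far into a whole percentage, capped at 100.
    Function EstimatePercent(bytesRead As Long, totalBytes As Long) As Integer
        If totalBytes <= 0 Then Return 100
        Return CInt(Math.Min(100L, bytesRead * 100L \ totalBytes))
    End Function

    ' Reads the file lazily and calls report only when the whole percentage changes.
    Function ReadWithProgress(filePath As String, report As Action(Of Integer)) As List(Of String)
        Dim totalBytes As Long = New FileInfo(filePath).Length
        Dim bytesRead As Long = 0
        Dim lastPercent As Integer = -1
        Dim result As New List(Of String)
        For Each line As String In File.ReadLines(filePath)
            result.Add(line)
            ' Assumption: UTF-8 content and a 2-byte CR+LF terminator per line.
            bytesRead += Encoding.UTF8.GetByteCount(line) + 2
            Dim percent As Integer = EstimatePercent(bytesRead, totalBytes)
            If percent <> lastPercent Then
                report(percent)
                lastPercent = percent
            End If
        Next
        Return result
    End Function
End Module
```

In a form, the report callback would set the progress bar value; if the read runs on a background thread, that update has to be marshalled back to the UI thread.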

Alternative Process

I have 2 buttons and a DataGridView with 2 Columns (0 & 1).
The 1st button transfers a randomly chosen cell from Column(1) to a TextBox. Then it stores that cell's value in variable (a), and the value of the cell next to it in Column(0) in variable (b).
Private Sub Button3_Click(sender As Object, e As EventArgs) Handles Button3.Click
Dim rnd As New Random
Dim x As Integer = rnd.Next(0, Form1.DataGridView1.Rows.Count)
Dim y As Integer = 1
Dim a As String = Form1.DataGridView1.Rows(x).Cells(y).Value
Dim b As String = Form1.DataGridView1.Rows(x).Cells(y - 1).Value
TextBox3.Text = a
End Sub
The 2nd button, however, is supposed to check whether another TextBox's text matches the string in variable (b). If so, it has to display a certain message, and so on...
Private Sub Button2_Click(sender As Object, e As EventArgs) Handles Button2.Click
If TextBox4.Text = b Then '<<< ISSUE HERE!
MsgBox("Correct! ^_^")
ElseIf TextBox4.Text = "" Then
MsgBox("You have to enter something first! O_o")
Else
MsgBox("Wrong! >,<")
End If
End Sub
The problem is that the variable (b) is surely not shared across the two "private" subs. And so, there is NOTHING to compare to in the 2nd button's sub! I presume that the solution here is to split the "randomization process" into a separate function, then execute it directly when the 1st button gets activated. Furthermore, that function's variables have to be SHARED somehow, and I certainly don't know how!
Thanks to Mr. Olivier, the code has been improved significantly! Yet I still encounter a "wrong" comparison issue, somehow!
Dim RND As New Random
Dim x As Integer
Private Function GetCell(ByVal rowIndex As Integer, ByVal cellIndex As Integer) As String
Return Form1.DataGridView1.Rows(rowIndex).Cells(cellIndex).Value
End Function
Private Sub btnRoll_Click(sender As Object, e As EventArgs) Handles btnRoll.Click
x = RND.Next(0, Form1.DataGridView1.Rows.Count)
tbxRoll.Text = GetCell(x, 1)
End Sub
Private Sub btnSubmit_Click(sender As Object, e As EventArgs) Handles btnSubmit.Click
If tbxSubmit.Text = GetCell(x, 0) Then
MsgBox("Correct! ^_^")
ElseIf tbxSubmit.Text = "" Then
MsgBox("You have to enter something first! O_o")
Else
MsgBox("Wrong! >,<")
End If
End Sub
Well, unbelievably, I read a guide about "comparison operations" in VB.net and tried out the first and most basic way to compare equality, which was to use the .Equals() method, and it worked like a charm! Thank God, everything works just fine now. ^_^
If tbxSubmit.Text.Equals(GetCell(x, 0)) Then
Alright now... This is going to sound weird! But, following Mr. Olivier's advice to debug the code, I wrapped the string I'm trying to compare in brackets and realized that it was being output after a line break! So I used the following function to remove the white-space from both of the comparison strings! And it bloody worked! This time for sure, though. ^_^
Function RemoveWhitespace(fullString As String) As String
Return New String(fullString.Where(Function(x) Not Char.IsWhiteSpace(x)).ToArray())
End Function
If RemoveWhitespace(tbxSubmit.Text) = RemoveWhitespace(GetCell(x, 0)) Then
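Note that RemoveWhitespace also strips spaces inside the text, so "New York" and "NewYork" would compare as equal. If the stray line break only appears at the start or end of the cell value, trimming may be enough. A minimal sketch, where CellCompare and SameText are hypothetical names:

```vbnet
Imports System

Module CellCompare
    ' Compares two strings after removing leading and trailing whitespace,
    ' which includes CR and LF characters. Nothing is treated as an empty string.
    Function SameText(typed As String, cellValue As String) As Boolean
        Dim typedText As String = If(typed, String.Empty).Trim()
        Dim cellText As String = If(cellValue, String.Empty).Trim()
        Return String.Equals(typedText, cellText, StringComparison.Ordinal)
    End Function
End Module
```

The comparison would then read If SameText(tbxSubmit.Text, GetCell(x, 0)) Then.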
Turn the local variables into class fields.
Dim rnd As New Random
Dim x As Integer
Dim y As Integer
Dim a As String
Dim b As String
Private Sub Button3_Click(sender As Object, e As EventArgs) Handles Button3.Click
x = rnd.Next(0, Form1.DataGridView1.Rows.Count)
y = 1
a = Form1.DataGridView1.Rows(x).Cells(y).Value
b = Form1.DataGridView1.Rows(x).Cells(y - 1).Value
TextBox3.Text = a
End Sub
These fields can now be accessed from every Sub, Function and Property.
Of course Button3_Click must be called before Button2_Click because the fields are initialized in the first method. If this is not the case then you should consider another approach.
Create a function for the Cell access
Private Function GetCell(ByVal rowIndex As Integer, ByVal cellIndex As Integer) _
As String
Return Form1.DataGridView1.Rows(rowIndex).Cells(cellIndex).Value
End Function
And then compare
If TextBox4.Text = GetCell(x, y - 1) Then
...
And don't store the values in a and b anymore. If y is always 1 then use the numbers directly.
If TextBox4.Text = GetCell(x, 0) Then
...
One more thing: give meaningful names to your buttons in the properties grid before creating the Click event handlers (e.g. btnRandomize). Then you will get meaningful names for those routines as well (e.g. btnRandomize_Click).
See:
- VB.NET Class Examples
- Visual Basic .NET/Classes: Fields

VB.NET How to receive two values from serial waiting for the second

So I have the code below that is connected to a pager. The problem is that sometimes a 'page message' will be split across two actual "pages", but if it's split it will always contain (part 1 of 2) or (part 2 of 2). My sub receives the page data and then calls the sub Parse_Page, which is fine for single-page messages.
I've tried to test with an If statement, say If PageData.Contains("(Part 1 of 2)"), and that's all well and good, but I need to store that string somewhere and somehow wait for the part 2 message to arrive before putting part 1 and part 2 together and calling the Parse_Page sub. I've tried various Ifs and arrays but I'm getting confused about what has to happen. Any ideas on how to do it?
Public Sub serial_DataReceived(ByVal sender As Object, ByVal e As System.IO.Ports.SerialDataReceivedEventArgs) Handles serial.DataReceived
Dim PageData As String = serial.ReadLine
Parse_Page(Nothing, Nothing)
End Sub
Simply use global (class-level) variables.
Dim PageData As String = String.Empty
Dim PageData1 As String = String.Empty
Dim PageData2 As String = String.Empty
Public Sub serial_DataReceived(ByVal sender As Object, ByVal e As System.IO.Ports.SerialDataReceivedEventArgs) Handles serial.DataReceived
If String.IsNullOrEmpty(PageData1) = True Then
PageData1 = serial.Readline
ElseIf String.IsNullOrEmpty(PageData2) = True Then
PageData2 = serial.Readline
PageData = PageData1 & PageData2
PageData1 = String.Empty
PageData2 = String.Empty
Parse_Page(Nothing, Nothing)
End If
End Sub
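The code above pairs up every two reads, so it treats single-page messages as halves too. A variation that looks at the (part 1 of 2) / (part 2 of 2) markers described in the question might look like the sketch below. PageAssembler and Accept are hypothetical names, the markers are matched case-insensitively as an assumption, and the markers are left in the combined text.

```vbnet
Imports System

Public Class PageAssembler
    ' Holds part 1 until part 2 arrives.
    Private pendingPart As String = Nothing

    ' Returns the complete message, or Nothing while still waiting for part 2.
    Public Function Accept(pageData As String) As String
        If pageData.IndexOf("(part 1 of 2)", StringComparison.OrdinalIgnoreCase) >= 0 Then
            pendingPart = pageData
            Return Nothing
        End If
        If pendingPart IsNot Nothing AndAlso
           pageData.IndexOf("(part 2 of 2)", StringComparison.OrdinalIgnoreCase) >= 0 Then
            Dim combined As String = pendingPart & pageData
            pendingPart = Nothing
            Return combined
        End If
        ' Single-page message (or a part 2 whose part 1 was never seen): pass it through.
        Return pageData
    End Function
End Class
```

In serial_DataReceived you would pass serial.ReadLine to Accept and call Parse_Page only when the result is not Nothing.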

HI, New to programming with textfiles loops and much more

i have a textfile which is in this format
and I am trying to use a StreamReader to help me loop each word into a text box. I am new to programming and really need help because all other examples are too complicated for me to understand.
this is what i am trying to do :
Dim objectreader As New StreamReader("filepath")
Dim linereader(1) As String
linereader = Split(objectreader.ReadLine, ",")
For i As Integer = 0 To UBound(linereader)
Spelling_Test.txtSpelling1.Text = linereader(0)
Spelling_Test.txtSpelling2.Text = linereader(0)
Next
but I only get the first line of the text file into a textbox. I need it to loop to the next line so I can write the next line in!
Your help would be much appreciated, and if possible, can you show it practically? If you don't understand what I am trying to do then please ask.
It is a little confusing what you are trying to do. It looks like your text file consists of a word and a hint on each line, but you only have one set of textboxes and 3 lines of information in your file.
This example shows you how to incrementally read your stream.
Imports System.IO
Public Class Form1
Dim objectreader As StreamReader
Dim linereader() As String
Private Sub Button1_Click(sender As System.Object, e As System.EventArgs) Handles Button1.Click
If IsNothing(objectreader) Then
objectreader = New StreamReader("C:\Temp\data.txt")
End If
linereader = Split(objectreader.ReadLine, ",")
If String.IsNullOrEmpty(linereader(0)) Then
objectreader.Close()
objectreader = Nothing
Else
txtSpelling1.Text = linereader(0) 'Word
txtSpelling2.Text = linereader(1) 'Hint
End If
End Sub
End Class
But in your case, I would probably just use the File.ReadAllLines Method
Private Sub Button2_Click(sender As System.Object, e As System.EventArgs) Handles Button2.Click
Dim result As String() = File.ReadAllLines("C:\Temp\data.txt")
For x = 0 To UBound(result)
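' Controls.Find returns an empty array when nothing matches, so check the length before indexing
Dim found As Control() = Controls.Find("txtSpelling" & (x + 1).ToString, True)
If found.Length > 0 Then
found(0).Text = Split(result(x), ",")(0)
End If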
Next
End Sub

What is the most efficient method for looping through a SortedList in VB 2008?

The code below shows me (I think) that the "for each" loop is about 10% faster than the "i to n" loop, but the "for each" loop creates 567k in new memory? Is this right? Which way is generally most efficient with regards to speed and memory usage?
If you want to run this code in VB just add a button and 2 labels to a form.
Public Class StateObject
Public WorkSocket As String = "FFFFFFFFFFFF"
Public BufferSize As Integer = 32767
Public Buffer(32767) As Byte
End Class
Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
For cnt As Integer = 1 To 250
Dim StateObjecter As New StateObject
ClientNetList.Add(cnt.ToString, StateObjecter)
Next
End Sub
Private ClientNetList As New SortedList(Of String, StateObject)
Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
Dim stop1 As New Stopwatch
Dim stop2 As New Stopwatch
Dim TotalMemory1 As Integer = GC.GetTotalMemory(False)
stop1.Start()
For cnt As Integer = 1 To 1000000
For i = 0 To ClientNetList.Count - 1
ClientNetList.Values(i).WorkSocket = "FFF"
Next
Next
stop1.Stop()
Dim TotalMemory2 As Integer = GC.GetTotalMemory(False)
MsgBox(TotalMemory2 - TotalMemory1)
TotalMemory1 = GC.GetTotalMemory(False)
Dim fff As Integer = GC.GetGeneration(ClientNetList)
stop2.Start()
For cnt As Integer = 1 To 1000000
For Each ValueType As StateObject In ClientNetList.Values
ValueType.WorkSocket = "FFF"
Next
Next
stop2.Stop()
Dim ffff As Integer = GC.GetGeneration(ClientNetList)
TotalMemory2 = GC.GetTotalMemory(False)
MsgBox(TotalMemory2 - TotalMemory1)
Label1.Text = "i: " & stop1.ElapsedMilliseconds
Label2.Text = "e: " & stop2.ElapsedMilliseconds
End Sub
On my system the "for i = 1 " loop was faster for the first test (the first click of the button of the program run) by about 20 percent. But the "for each" loop was a hair faster on subsequent tests. The "for each" loop took a little more memory, but this is temporary and will be eventually garbage collected.
The pros and cons of "for each" and "for i = " have been debated here. For each is nice because it works with structures other than arrays, and makes an object available. "For i =" has the advantage of designating the bounds and order of array items in the loop, and avoids a bug you can encounter with arrays:
Dim a(50) As Integer
Dim i As Integer
For Each i In a
i = 22
Next i
In this example, the array is never initialized to 22. The variable i is just a copy of an array element, and the original array element is not changed when i is assigned 22.
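For comparison, an index-based loop does write to the array elements themselves. A minimal sketch, with ArrayFill and FilledArray as hypothetical names:

```vbnet
Module ArrayFill
    ' Returns an array of the given size with every element set to value.
    Function FilledArray(size As Integer, value As Integer) As Integer()
        ' VB declares arrays by upper bound, so (size - 1) yields exactly "size" elements.
        Dim a(size - 1) As Integer
        For i As Integer = 0 To a.Length - 1
            a(i) = value ' assigns to the element itself, not to a copy
        Next
        Return a
    End Function
End Module
```

Keep in mind that Dim a(50) As Integer in the example above declares 51 elements, indexed 0 to 50.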