VB.net Reading a text file twice (Best practice - to close then re-open? Alternative?) - vb.net

I am writing an assembler for a self-taught course I am doing.
I have a text file that I read into a dictionary structure.
I then need to reread the same text file, but obviously I am already at the end of that file.
How do I reset to the beginning again? What is the best practice?
Thank you.

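You can use the BaseStream property to access the underlying stream (when reading a local file, this will be a FileStream), then set the stream's Position property to 0 to rewind it to the beginning. Because StreamReader keeps its own internal buffer, also call DiscardBufferedData so the reader does not serve stale buffered data after the seek.
Using Reader As New StreamReader("somefile.txt")
    Dim Contents As String = Reader.ReadToEnd()
    ' Rewind the underlying stream and drop the reader's stale buffer
    Reader.BaseStream.Position = 0
    Reader.DiscardBufferedData()
    Dim FirstLine As String = Reader.ReadLine()
End Using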
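If the file is small enough to hold in memory, which is usually the case for assembler source, an alternative to rewinding is to read the file once and walk the lines twice. The following is only a minimal sketch: the label syntax (a line ending in a colon) and the names `TwoPassSketch` and `BuildSymbolTable` are assumptions made for illustration, not something taken from the question.

```vbnet
Imports System.Collections.Generic
Imports System.IO

Module TwoPassSketch
    ' Pass 1: map each label (hypothetical syntax: a line ending in ":")
    ' to the index of the instruction that follows it.
    Function BuildSymbolTable(lines As IEnumerable(Of String)) As Dictionary(Of String, Integer)
        Dim symbols As New Dictionary(Of String, Integer)
        Dim address As Integer = 0
        For Each rawLine As String In lines
            Dim line As String = rawLine.Trim()
            If line.Length = 0 Then Continue For
            If line.EndsWith(":") Then
                symbols(line.TrimEnd(":"c)) = address
            Else
                address += 1
            End If
        Next
        Return symbols
    End Function

    Sub Main()
        ' "somefile.txt" is the same placeholder file name used above.
        Dim lines As String() = File.ReadAllLines("somefile.txt")
        Dim symbols As Dictionary(Of String, Integer) = BuildSymbolTable(lines)
        Console.WriteLine(symbols.Count & " labels found")

        ' Pass 2: walk the same array again; no rewinding is needed.
        For Each rawLine As String In lines
            Dim line As String = rawLine.Trim()
            If line.Length = 0 OrElse line.EndsWith(":") Then Continue For
            Console.WriteLine(line)
        Next
    End Sub
End Module
```

The trade-off is memory: the whole file stays in the array, whereas rewinding the stream keeps only one line at a time.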

Related

Will Putting This Onto A background Worker Stop This Issue

I have been trying to fix this for a number of days now without any success. I know I have created another post related to this issue but not sure if I should have continued with the other post rather than creating a new one as I am still quite new to how SO works so apologies if I have gone about this the wrong way.
Objective
Read a text file from disk, output the contents to a Textbox so I can then extract the last 3 lines from it. This is the only way I can think of doing this.
The text file is continuously being updated by another running program. I can still read it while it is in use, but I cannot write to it.
I am probing this file through a Timer which ticks every 1 second in order to get the latest information.
Now to the issue...
I have noticed that after some time my app becomes sluggish which is noticeable when I try to move it across the screen or resize it and the CPU usage starts to creep up to over 33%
My Thought Process
As reading the file is a continuous process, I was thinking that I could move it onto a BackgroundWorker, which from my understanding would put it on a different thread and take some load off the main GUI.
Am I barking up the wrong tree on this one?
I am reaching out to more advanced users before I start to get all the text books out on learning how to use the BackgroundWorker.
Here is the code I am using to Read the txt file and output it to a text Box. I have not included the code for extracting the last 3 lines because I don't think that part is causing the issue.
I think the issue is because I am constantly probing the source files every second with a timer but not 100% sure to be honest.
Dim strLogFilePath As String
strLogFilePath = "C:\DSD\data.txt"
Dim LogFileStream As FileStream
Dim LogFileReader As StreamReader
'Open file for reading
LogFileStream = New FileStream(strLogFilePath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite)
LogFileReader = New StreamReader(LogFileStream)
'populate text box with the contents of the txt file
Dim strRowText As String
strRowText = LogFileReader.ReadToEnd()
TextBox1.text = strRowText
'Clean Up
LogFileReader.Close()
LogFileStream.Close()
LogFileReader.Dispose()
LogFileStream.Dispose()
Firstly, you should use the Using keyword instead of manually disposing objects, because that way you are guaranteed that the object will get disposed, even if an unexpected exception occurs, for example:
' You can initialize variables in one line
Dim strLogFilePath As String = "C:\DSD\data.txt"
Using LogFileStream As New FileStream(strLogFilePath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite)
    ' Everything goes in here
End Using
You don't need the reader for my solution. The reading will be done manually.
Next, you need to read the last n lines (in your case, 3) of the stream. Reading the entire file when you're only interested in a few lines at the end is inefficient. Instead, you can start reading from the end until you've reached three (or any number of) line separators (based on this answer):
Function ReadLastLines(
    NumberOfLines As Integer, Encoding As System.Text.Encoding, FS As FileStream,
    Optional LineSeparator As String = vbCrLf
) As String
    Dim NewLineSize As Integer = Encoding.GetByteCount(LineSeparator)
    Dim NewLineCount As Integer = 0
    ' Scratch buffer that holds one candidate separator at a time
    Dim Buffer(NewLineSize - 1) As Byte
    ' Walk backwards one byte at a time so a separator at any offset is found
    For Position As Long = NewLineSize To FS.Length
        FS.Seek(-Position, SeekOrigin.End)
        FS.Read(Buffer, 0, Buffer.Length)
        ' Position = NewLineSize is a separator terminating the final line; do not count it
        If Position > NewLineSize AndAlso Encoding.GetString(Buffer) = LineSeparator Then
            NewLineCount += 1
            If NewLineCount = NumberOfLines Then
                ' FS.Position now sits just past the separator. VB array bounds are
                ' inclusive, so subtract 1 to get exactly the remaining byte count
                Dim ReturnBuffer(CInt(FS.Length - FS.Position) - 1) As Byte
                FS.Read(ReturnBuffer, 0, ReturnBuffer.Length)
                Return Encoding.GetString(ReturnBuffer)
            End If
        End If
    Next
    ' Handle case where number of lines in file is less than NumberOfLines
    FS.Seek(0, SeekOrigin.Begin)
    Buffer = New Byte(CInt(FS.Length) - 1) {}
    FS.Read(Buffer, 0, Buffer.Length)
    Return Encoding.GetString(Buffer)
End Function
Usage:
Using LogFileStream As New FileStream(strLogFilePath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite)
    ' Depending on system, you may need to supply an argument for the LineSeparator param
    Dim LastThreeLines As String = ReadLastLines(3, System.Text.Encoding.UTF8, LogFileStream)
    ' Do something with the last three lines
    MsgBox(LastThreeLines)
End Using
Note that I haven't tested this code, and I'm sure it can be improved. It may also not work for all encodings, but it sounds like it should be better than your current solution, and that it will work in your situation.
Edit: Also, to answer your question, IO operations should usually be performed asynchronously to avoid blocking the UI. You can do this using tasks or a BackgroundWorker. It probably won't make it faster, but it will make your application more responsive. It's best to indicate that something is loading before the task begins.
If you know when your file is being written to, you can set a flag to start reading, and then unset it when the last lines have been read. If it hasn't changed, there's no reason to keep reading it over and over.
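One way to avoid re-reading an unchanged file on every timer tick is to remember the length seen last time and skip the read when it has not moved. This is only a rough sketch under the assumption that the log is append-only, so any new data changes the length; an in-place rewrite that keeps the size the same would go unnoticed. The class name `LogChangeDetector` and its `HasChanged` method are made up for illustration.

```vbnet
Imports System.IO

Public Class LogChangeDetector
    Private ReadOnly _path As String
    Private _lastLength As Long = -1

    Public Sub New(path As String)
        _path = path
    End Sub

    ' Returns True when the file length differs from the length seen on the
    ' previous call (the first call always returns True).
    Public Function HasChanged() As Boolean
        ' FileShare.ReadWrite lets the other program keep writing while we look.
        Using fs As New FileStream(_path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite)
            If fs.Length = _lastLength Then Return False
            _lastLength = fs.Length
            Return True
        End Using
    End Function
End Class
```

In the timer's Tick handler you would call `HasChanged()` first and only read the last lines when it returns True.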

Avoid updating textbox in real time in vb.net

I have a very simple code in a VB.NET program to load all paths in a folder in a text box. The code works great, the problem is that it adds the lines in real time, so it takes about 3 minutes to load 20k files while the interface is displaying line by line.
This is my code:
Dim ImageryDB As String() = IO.Directory.GetFiles("c:\myimages\")
For Each image In ImageryDB
    txtbAllimg.AppendText(image & vbCrLf)
Next
How can I force my program to load the files in chunks or update the interface every second?
Thanks in advance
Yes, you can do that. You'll need to load the file names into an off-screen data structure of some kind rather than loading them directly into the control. Then you can periodically update the control to display whatever is loaded so far. However, I think you'll find that the slowness comes only from updating the control. Once you remove that part, there will be no need to update the control periodically during the loading process since it will be nearly instantaneous.
You could just load all of the file names into a string and then only set the text box to that string after it's been fully loaded, like this:
Dim imagePaths As String = ""
For Each image As String In Directory.GetFiles("c:\myimages\")
    imagePaths &= image & Environment.NewLine
Next
txtbAllimg.Text = imagePaths
However, that's not as efficient as using the StringBuilder:
Dim imagePaths As New StringBuilder()
For Each image As String In Directory.GetFiles("c:\myimages\")
    imagePaths.AppendLine(image)
Next
txtbAllimg.Text = imagePaths.ToString()
However, since the GetFiles method is already returning the complete list of paths to you as a string array, it would be even more convenient (and likely even more efficient) to just use the String.Join method to combine all of the items in the array into a single string:
txtbAllimg.Text = String.Join(Environment.NewLine, Directory.GetFiles("c:\myimages\"))
I know that this is not an answer to your actual question, but AppendText is slow. Using a ListBox and Adding the items to it is approx. 3 times faster. The ListBox also has the benefit of being able to select an item easily (at least more easily than a TextBox)
For Each image In ImageryDB
    Me.ListBox1.Items.Add(image)
Next
However, there is probably an even more useful and faster way to do this, using FileInfo:
Dim dir As New IO.DirectoryInfo("C:\myImages")
Dim fileInfoArray As IO.FileInfo() = dir.GetFiles()
Dim fileInfo As IO.FileInfo
For Each fileInfo In fileInfoArray
    Me.ListBox2.Items.Add(fileInfo.Name)
Next

File comparison in VB.Net

I need to know if two files are identical. At first I compared file sizes and creation timestamps, but that's not reliable enough. I have come up with the following code, that seems to work, but I'm hoping that someone has a better, easier or faster way of doing it.
Basically, what I am doing is streaming the file contents to byte arrays and comparing their MD5 hashes via System.Security.Cryptography.
Before that I do some simple checks though, since there is no reason to read through the files, if both file paths are identical, or one of the files does not exist.
Public Function CompareFiles(ByVal file1FullPath As String, ByVal file2FullPath As String) As Boolean
    If Not File.Exists(file1FullPath) Or Not File.Exists(file2FullPath) Then
        'One or both of the files does not exist.
        Return False
    End If
    If String.Compare(file1FullPath, file2FullPath, True) = 0 Then
        ' fileFullPath1 and fileFullPath2 points to the same file...
        Return True
    End If
    Dim MD5Crypto As New MD5CryptoServiceProvider()
    Dim textEncoding As New System.Text.ASCIIEncoding()
    Dim fileBytes1() As Byte, fileBytes2() As Byte
    Dim fileContents1, fileContents2 As String
    Dim streamReader As StreamReader = Nothing
    Dim fileStream As FileStream = Nothing
    Dim isIdentical As Boolean = False
    Try
        ' Read file 1 to byte array.
        fileStream = New FileStream(file1FullPath, FileMode.Open)
        streamReader = New StreamReader(fileStream)
        fileBytes1 = textEncoding.GetBytes(streamReader.ReadToEnd)
        fileContents1 = textEncoding.GetString(MD5Crypto.ComputeHash(fileBytes1))
        streamReader.Close()
        fileStream.Close()
        ' Read file 2 to byte array.
        fileStream = New FileStream(file2FullPath, FileMode.Open)
        streamReader = New StreamReader(fileStream)
        fileBytes2 = textEncoding.GetBytes(streamReader.ReadToEnd)
        fileContents2 = textEncoding.GetString(MD5Crypto.ComputeHash(fileBytes2))
        streamReader.Close()
        fileStream.Close()
        ' Compare byte array and return result.
        isIdentical = fileContents1 = fileContents2
    Catch ex As Exception
        isIdentical = False
    Finally
        If Not streamReader Is Nothing Then streamReader.Close()
        If Not fileStream Is Nothing Then fileStream.Close()
        fileBytes1 = Nothing
        fileBytes2 = Nothing
    End Try
    Return isIdentical
End Function
I would say hashing the file is the way to go; it's how I have done it in the past.
Use Using statements when working with streams and other disposable objects, as they are then cleaned up automatically.
Here is an example.
Public Function CompareFiles(ByVal file1FullPath As String, ByVal file2FullPath As String) As Boolean
    If Not File.Exists(file1FullPath) OrElse Not File.Exists(file2FullPath) Then
        'One or both of the files does not exist.
        Return False
    End If
    If file1FullPath = file2FullPath Then
        ' file1FullPath and file2FullPath point to the same file...
        Return True
    End If
    Try
        Dim file1Hash As String = hashFile(file1FullPath)
        Dim file2Hash As String = hashFile(file2FullPath)
        Return file1Hash = file2Hash
    Catch ex As Exception
        Return False
    End Try
End Function
Private Function hashFile(ByVal filepath As String) As String
    Using reader As New System.IO.FileStream(filepath, IO.FileMode.Open, IO.FileAccess.Read)
        Using md5 As New System.Security.Cryptography.MD5CryptoServiceProvider
            Dim hash() As Byte = md5.ComputeHash(reader)
            ' Hex-encode the hash; decoding raw hash bytes as Unicode text can be lossy
            Return BitConverter.ToString(hash)
        End Using
    End Using
End Function
This is what md5 is made for. You're doing it the right way. However, if you really want to improve it further, I can recommend some things to explore. The emphasis is on explore, because none of these are slam dunks. They may help, but they may also hurt, or they may be overkill. You'll need to evaluate them for your situation and determine (through testing) what will be the best solution.
The first recommendation is to compute the md5 hash without loading the entire file into RAM. The example is C#, but the VB.Net translation is fairly straightforward. If you're working with small files, then what you already have may be fine. However, for anything large enough to end up on .Net's Large Object Heap (85,000 bytes), you probably want to consider using the stream technique instead.
Additionally, if you're using a recent version of .Net, you might want to explore doing this asynchronously for each file. As a practical matter, I suspect you'll get best performance from what you have, as the disk I/O is likely to be the slowest part of this, and I'd expect traditional disks to perform best if you allow them to read from the files in sequence, rather than making your disk seek back and forth between the files. However, you may still be able to do better with asynchronous methods, especially if you follow the previous suggestion, because you can also await at the Read() call level, in addition to awaiting for the entire file. Also, if you're running this on an SSD, that would minimize the problems with seeks and could make an asynchronous solution a clear winner. One warning, though: this is a deep rabbit hole to chase... one that can be worthwhile, but you can also end up spending a lot of time on a YAGNI situation. This is the kind of thing, though, you might choose to explore once for a situation where you probably won't use it, so that you understand it well enough to know how it can help in the future for those situations when you do need it.
One more point is that, for the async recommendation to work, you need to isolate the hashing code into its own method... but you should probably do this anyway.
My final recommendation is to remove the File.Exists() checks. This is a tempting test, I know, but it's almost always wrong. Especially if you adopt the first recommendation, just open the streams near the top of the method using an option that fails if the file does not exist, and make your check on whether the stream opened or not.
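As a rough VB.NET illustration of the first and last recommendations together, the sketch below hashes straight from the stream and drops the File.Exists() checks in favour of catching the failure to open. The names `FileCompareSketch`, `HashStream` and `FilesAreIdentical` are invented for this example, and the early length comparison is an extra shortcut that the answer above does not mention.

```vbnet
Imports System.IO
Imports System.Linq
Imports System.Security.Cryptography

Module FileCompareSketch
    ' ComputeHash(Stream) reads the stream in chunks, so the whole file
    ' is never held in memory at once.
    Function HashStream(stream As Stream) As Byte()
        Using hasher As MD5 = MD5.Create()
            Return hasher.ComputeHash(stream)
        End Using
    End Function

    Function FilesAreIdentical(path1 As String, path2 As String) As Boolean
        Try
            ' FileMode.Open throws if the file is missing, which replaces File.Exists().
            Using fs1 As New FileStream(path1, FileMode.Open, FileAccess.Read, FileShare.Read),
                  fs2 As New FileStream(path2, FileMode.Open, FileAccess.Read, FileShare.Read)
                ' Files of different length cannot match, so skip hashing entirely.
                If fs1.Length <> fs2.Length Then Return False
                Return HashStream(fs1).SequenceEqual(HashStream(fs2))
            End Using
        Catch ex As FileNotFoundException
            Return False
        Catch ex As DirectoryNotFoundException
            Return False
        End Try
    End Function
End Module
```

Only the "not found" exceptions are swallowed here; other I/O errors (for example, access denied) propagate to the caller, which may or may not be what you want.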

Using streamreader I can read the next line of words, but can I read the previous one?

Can I read the previous line using StreamReader?
Dim previousfile As New StreamReader("file.txt")
' Peek returns -1 when there is nothing left to read
If previousfile.Peek <> -1 Then
    txtName.Text = previousfile.ReadLine
End If
Can anyone help?
You cannot read the previous line: StreamReader really is a forward-only type of reader. Once you have read a line, that's it; you cannot go back.
Why don't you hold the previous line in a temporary variable, or maybe use a FileStream, which has a Seek method that may be of some use to you?
Or why not read the entire contents into a collection of strings, for example by splitting on some delimiter?
You can't read backwards with a StreamReader, but if you read all of the lines in first you can traverse them however you like. This does mean reading the whole file in up front, which may be less efficient depending on your usage, but this method would do the job and give you an array:
Dim lines As String() = File.ReadAllLines("file.txt")
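The "temporary variable" suggestion can be sketched as follows. This is only an illustration: `LineBefore` is a hypothetical helper name, and it takes a TextReader so that it works with a StreamReader as well as an in-memory StringReader.

```vbnet
Imports System.IO

Module PreviousLineSketch
    ' Returns the line immediately before the first line equal to target,
    ' or Nothing if target is the first line or is never found.
    Function LineBefore(reader As TextReader, target As String) As String
        Dim previous As String = Nothing
        Dim current As String = reader.ReadLine()
        While current IsNot Nothing
            If current = target Then Return previous
            ' Remember this line before moving forward; the reader cannot go back.
            previous = current
            current = reader.ReadLine()
        End While
        Return Nothing
    End Function
End Module
```

With a file you would wrap the call in `Using reader As New StreamReader("file.txt")` and pass `reader` in. Unlike ReadAllLines, this keeps only two lines in memory at a time.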

VB can't string to arraylist when read from file

Either I'm missing something really obvious or something about VB is really messed up. I'm trying to read in from a file and add the lines to an ArrayList, which should be pretty simple. If I add strings to the ArrayList this way:
selectOptions.Add("Standard")
selectOptions.Add("Priority")
selectOptions.Add("3-Day")
selectOptions.Add("Overnight")
I have no problems
But when I do this it appears to end up empty which makes no sense to me.
Dim reader As StreamReader = My.Computer.FileSystem.OpenTextFileReader(path)
Dim line As String
Do
    line = reader.ReadLine
    selectOptions.Add(line)
Loop Until line Is Nothing
reader.Close()
I can MessageBox.Show line all day, so I know it is reading the file and the file isn't empty, and I have checked the type of line, which comes back as String. This makes no sense to me.
Checking for reader.EndOfStream in a While loop will probably work better:
Dim reader As New StreamReader(path)
Dim line As String
While Not reader.EndOfStream
    line = reader.ReadLine
    selectOptions.Add(line)
End While
reader.Close()
You can also get an exception if selectOptions isn't declared as a New ArrayList, if you properly have all your Options turned On.
Another thing to remember: if your code is in the form's Load handler, it won't throw an exception; it will just break out of the handler routine and load the form. This makes it really hard to find things like bad file names, badly declared objects, etc.
One thing I do is put suspect code in a button's Click handler and see what exceptions it throws there.
Of course this could all be moot if you use the File.ReadAllLines method and add it directly to the ArrayList:
selectOptions.AddRange(File.ReadAllLines(path))