Text file split in blocks vb.net - vb.net

I am trying to go through my text file and create a new file that will contain only the text I require. My current line looks like:
Car-1I
Colour-39
Cost-328
Dealer-28
Car-2
Colour-30
Cost-234
For each block of text I would like to read the first line, if the first line ends with an I, then read the next line, if that line contains a colour 39, then I would like to save the whole block of text to another file. If these two conditions aren't met, I dont want to save my values to the new text file.
Before anything about saving my values in classes are mentioned, these blocks of text can vary in size and values, so I dont always have a set range of values which is why i need to skip to the blank line
IO.File.WriteAllText("C:\Users\test2.txt", "") 'write to new file
Dim sKey As String
Dim sValue As Integer
For Each filterLine As String In File.ReadLines("C:\Users\test.txt")
sKey = Split(filterLine, ":")(0)
sValue = Split(filterLine, ":")(1)
If Not sValue.EndsWith("I") Then
ElseIf sValue.EndsWith("I") Then
End If
Next

Another method, using File.ReadLines to read lines of text from file. This method doesn't load all the text in memory, it reads from disc single lines of text, so it can also be useful when dealing with big files.
You could loop the IEnumerable collection it returns, but also use its GetEnumerator() method to control more directly when to move to the next line, or move more then one lines forward.
Its Enumerator.Current object returns the line of text currently read, Enumerator.MoveNext() moves to the next line.
A StringBuilder is used to store the strings when a match found. Strings are added to the StringBuilder object using its AppendLine() method.
This class is useful when dealing with strings that you need to store, compare and discard (or modify) quickly: since string are immutable, when you use String variables directly, especially in loops, you generate a whole lot of garbage that slows down any procedure quite a lot.
The blocks of text stored in the StringBuilder object are then written to a destination file using a StreamWriter with explicit encoding set to UTF-8 (writes the BOM). Its methods include asynchronous versions: WriteLine() can be replaced by awaitWriteLineAsync() to allow an async procedure.
Imports System.IO
Imports System.Text
Dim sourceFilePath = "<Path of the source file>"
Dim resultsFilePath = "<Path of the destination file>"
Dim sb As New StringBuilder()
Dim enumerator = File.ReadLines(sourceFilePath).GetEnumerator()
Using sWriter As New StreamWriter(resultsFilePath, False, Encoding.UTF8)
While enumerator.MoveNext()
If enumerator.Current.EndsWith("I") Then
sb.AppendLine(enumerator.Current)
enumerator.MoveNext()
If enumerator.Current.EndsWith("39") Then
While Not String.IsNullOrWhiteSpace(enumerator.Current)
sb.AppendLine(enumerator.Current)
enumerator.MoveNext()
End While
sWriter.WriteLine(sb.ToString())
End If
sb.Clear()
End If
End While
End Using

This will work:
Dim strFile As String = "c:\Test5\Source.txt"
Dim strOutFile As String = "c:\Test5\OutPut.txt"
Dim strOutData As String = ""
Dim SourceGroups As String() = Split(File.ReadAllText(strFile), vbCrLf + vbCrLf)
For Each sGroup As String In SourceGroups
Dim OneGroup() As String = Split(sGroup, vbCrLf)
If Strings.Right(OneGroup(0), 1) = "I" And (Strings.Right(OneGroup(1), 2) = "39") Then
If strOutData <> "" Then strOutData += (vbCrLf & vbCrLf)
strOutData += sGroup
End If
Next
File.WriteAllText(strOutFile, strOutData)

Something like this should work:
Dim base, i, c as Integer
Dim lines1$() = File.ReadLines("C:\Users\test.txt")
c = lines1.count
While i < c
if Len(RTrim(lines1(i))) Then
If Strings.Right(RTrim(lines1(i)), 1)="I" Then
base = i
i += 1
If Strings.Right(RTrim(lines1(i)), 2)="39" Then
While Len(RTrim(lines1(i))) 'skip to the next blank
i += 1
End While
' write lines1(from base to (i-1)) here
Else
While Len(RTrim(lines1(i)))
i += 1
End While
End If
Else
i += 1
End If
Else
i += 1
End If
End While

Related

Reading text from one file, to check other text file for matches

So I'm new to VB.NET, and I'm trying to read through 2 separate text files. File2 first & pulling a variable from it. I then want to take that variable, and check File1 to see if the string matches. Both files are relatively large, and I have to trim part of the string from the beginning, hence the multiple splits. So currently, this is what I have. But where I'm getting hung up is, how do I get it to check every line to see if it matches with the other file?
Public Function FileWriteTest()
Dim file1 = My.Computer.FileSystem.OpenTextFileReader(path)
Dim file2 = My.Computer.FileSystem.OpenTextFileReader(path)
Do Until file2.EndOfStream
Dim line = file2.ReadLine()
Dim noWhiteSpace As New String(line.Where(Function(x) Not Char.IsWhiteSpace(x)).ToArray())
Dim split1 = Split(noWhiteSpace, "=")
Dim splitStr = split1(0)
If splitStr.Contains(HowImSplittingText) Then
Dim parameter = Split(splitStr, "Module")
Dim finalParameter = parameter(1)
End If
'Here is where I'm not sure how to continue. I have trimmed the variable to where I would like to check with the other file.
'But i'm not sure how to continue from here.
Here are a couple of cleanup notes:
Get the lines of your first file by using IO.File.ReadAllLines (documentation)
Get the text of the second file by using IO.File.ReadAllText (documentation)
Use String.Replace (documentation) instead of Char.IsWhitespace, this is quicker and much more obvious as to what is going on
Use String.Split (documentation) instead of Split (because this is 2021 not 1994).
In terms of what you would do next, you would call String.IndexOf (documentation) on the second file's text, passing your variable value, to see if it returns a value greater than -1. If it does, then you know where at in the file the value exists.
Here is an example:
Dim file1() As String = IO.File.ReadAllLines("path to file 1")
Dim file2 As String = IO.File.ReadAllText("path to file 2")
For Each line In file1
Dim noWhiteSpace As String = line.Replace(" ", String.Empty)
Dim split1() As String = noWhiteSpace.Split("=")
If (splitStr.Length > 0) Then
Dim splitStr As String = split1(0)
If splitStr.Contains(HowImSplittingText) Then
Dim parameter() As String = splitStr.Split("Module")
If (parameter.Length > 0) Then
Dim finalParameter As String = parameter(0)
Dim index As Integer = file2.IndexOf(finalParameter)
If (index > -1) Then
' finalParameter is in file2
End If
End If
End If
End If
Next

Count words in an external file using delimiter of a space

I want to calculate the number of words in a text file using a delimiter of a space (" "), however I am struggling.
Dim counter = 0
Dim delim = " "
Dim fields() As String
fields = Nothing
Dim line As String
line = Input
While (SR.EndOfStream)
line = SR.ReadLine()
End While
Console.WriteLine(vbLf & "Reading File.. ")
fields = line.Split(delim.ToCharArray())
For i = 0 To fields.Length
counter = counter + 1
Next
SR.Close()
Console.WriteLine(vbLf & "The word count is {0}", counter)
I do not know how to open the file and to get the do this, very confused; would like an explanation so I can edit and understand from it.
You're going to be reading a file as the source of the data, so let's create a variable to refer to its filename:
Dim srcFile = "C:\temp\twolines.txt"
As you have shown already, a variable is needed to hold the number of words found:
Dim counter = 0
To read from the file, a StreamReader will do the job. Now, we look at the documenation for it (yes, really) and notice that it has a Dispose method. That means that we have to explicitly dispose of it after we've used it to make sure that no system resources are tied up until the computer is next rebooted (e.g there could be a "memory leak"). Fortunately, there is the Using construct to take care of that for us:
Using sr As New StreamReader(srcFile)
And now we want to iterate over the content of the file line-by-line until the end of the file:
While Not sr.EndOfStream
Then we want to read a line and find how many items separated by spaces it has:
counter += sr.ReadLine().Split({" "c}, StringSplitOptions.RemoveEmptyEntries).Length
The += operator is like saying "add n to a" instead of saying "a = a + n". The {" "c} is a literal array of the character " "c. The c tells it that is a character and not a string of one character. The StringSplitOptions.RemoveEmptyEntries means that if there was text of "one two" then it would ignore the extra spaces.
So, if you were writing a console program, it might look like:
Imports System.IO
Module Module1
Sub Main()
Dim srcFile = "C:\temp\twolines.txt"
Dim counter = 0
Using sr As New StreamReader(srcFile)
While Not sr.EndOfStream
counter += sr.ReadLine().Split({" "c}, StringSplitOptions.RemoveEmptyEntries).Length
End While
End Using
Console.WriteLine(counter)
Console.ReadLine()
End Sub
End Module
Any embellishments such as writing out what the number represents or error checking are left up to you.
With Path.Combine you don't have to worry about where the slashes or back slashes go. You can get the path of special folders easily using the Environment class. The File class of System.IO is shared so you don't have to create an instance.
Public Sub Main()
Dim p As String = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments), "Chapters.txt")
Debug.Print(Environment.SpecialFolder.MyDocuments.ToString)
Dim count As Integer = GetCount(p)
Console.WriteLine(count)
Console.ReadKey()
End Sub
Private Function GetCount(Path As String) As Integer
Dim s = File.ReadAllText(Path)
Return s.Split().Length
End Function
Use Split function, then Directly get the length of result array and add 1 to it.

VB.net How to remove quotes characters from a streamReader.readline.split()

I had built a project that read data from a report and it used to work fine but now for some reason the report puts every thing in to strings. So I want to modify my stream reader to remove or ignore the quotes as it reads the lines.
This is a snipet of the part that reads the lines.
Dim RawEntList As New List(Of Array)()
Dim newline() As String
Dim CurrentAccountName As String
Dim CurrentAccount As Account
Dim AccountNameExsists As Boolean
Dim NewAccount As Account
Dim NewEntry As Entrey
Dim WrongFileErrorTrigger As String
ListOfLoadedAccountNames.Clear()
'opens the file
Try
Dim sr As New IO.StreamReader(File1Loc)
Console.WriteLine("Loading full report please waite")
MsgBox("Loading full report may take a while.")
'While we have not finished reading the file
While (Not sr.EndOfStream)
'spliting eatch line up into an array
newline = sr.ReadLine.Split(","c)
'storring eatch array into a list
RawEntList.Add(newline)
End While
And then of course I iterate through the list to pull out information to populate objects like this:
For Each Entr In RawEntList
CurrentAccountName = Entr(36)
AccountNameExsists = False
For Each AccountName In ListOfLoadedAccountNames
If CurrentAccountName = AccountName Then
AccountNameExsists = True
End If
Next
You could just do
StringName.Replace(ControlChars.Quote, "")
or
StringName.Replace(Chr(34), "")
OR
streamReader.readline.split().Replace(Chr(34), "")
How about doing the replace before the split, after the readline? That should save iteration multiplication, or better yet (if possible), do a replace on the entire file (if the data is formatted in the way it can be done & you have enough memory) using the ReadAllText method of the File object, do your replace, then read the lines from memory to build your array (super fast!).
File.ReadAllText(path)

For some reason my program is not reading the file that I asked to be read

You can see thatt I've opened the file just below, but I've recently discovered that It is reading a file open above, which I have closed.
Dim TestNO As Integer
Dim myLines As New List(Of String)
Dim sb As StringBuilder
FileOpen(10, "F:\Computing\Spelling Bee\testtests.csv", OpenMode.Input)
Dim Item() As String = Split(fullline, ",")
Dim MaxVal As Integer = Integer.MaxValue
Do Until EOF(10)
fullline = LineInput(10)
If Item(7) > MaxVal Then
MaxVal = Item(7)
TestNO = MaxVal
End If
Loop
This is where I open and close my previous file.
Dim flag As Boolean = False
FileOpen(1, "F:\Computing\Spelling Bee\stdnt&staffdtls\stdnt&staffdtls.csv",
OpenMode.Input)
Do Until EOF(1)
fullline = LineInput(1)
Dim item() As String = Split(fullline, ",")
If enteredusername = item(0) And enteredpassword = item(1) Then
Console.WriteLine()
Console.Clear()
Console.WriteLine("Welcome," & item(3) & item(4))
Threading.Thread.Sleep(1000)
Console.Clear()
flag = True
If item(2) = "p" Then
FileClose(1)
pupilmenu()
ElseIf item(2) = "s" Then
FileClose(1)
staffmenu()
ElseIf item(2) = "a" Then
FileClose(1)
adminmenu()
FileOpen, EOF, and LineInput are all old VB6-style methods which are provided primarily for backwards compatibility. It would be far preferable to use the new .NET classes provided in the System.IO namespace. For instance, this same task is easily performed line this:
For Each line As String In File.ReadAllLines("F:\Computing\Spelling Bee\stdnt&staffdtls\stdnt&staffdtls.csv")
Dim fields() As String = line.Split(","c)
If (enteredUserName = fields(0)) And (enteredPassword = fields(1)) Then
' ...
End If
Next
Notice that I also used line.Split rather than Split(line), which is also an old VB6-style method. It's better to use the new String.Split method.
The File.ReadAllLines method opens the file, reads the entire contents of the file, closes the file, and then returns all of the data as an array of strings, with each item in the array being one line from the file. This is a very simple way to read an entire file. If it is a particularly large file, however, it would be better to use a FileStream object to read one line at a time.
Also, it's worth mentioning that reading a CSV file yourself can be complicated. They aren't always as simple as simply splitting on commas. For instance, the following is an example of a valid CSV line:
Bill, "Red, White, and Blue", Smith
As you can see, that line only contains three fields, but it contains four commas. Also, the quotation marks should not be considered as part of the value in the second field. The easiest way to read a CSV file is to use the TextFieldParser class, which handles all of those eccentricities.

How to avoid "Out Of Memory" exception when reading large files using File.ReadAllText(x)

This is my code to search for a string for all files and folders in the drive "G:\" that contains the string "hello":
Dim path = "g:\"
Dim fileList As New List(Of String)
GetAllAccessibleFiles(path, fileList)
'Convert List<T> to string array if you want
Dim files As String() = fileList.ToArray
For Each s As String In fileList
Dim text As String = File.ReadAllText(s)
Dim index As Integer = text.IndexOf("hello")
If index >= 0 Then
MsgBox("FOUND!")
' String is in file, starting at character "index"
End If
Next
This code will also results in memory leak/out of memory exception (as I read file as big as 5GB!). Perhaps it will bring the whole file to the RAM, then went for the string check.
Dim text As String = File.ReadAllText("C:\Users\Farizluqman\mybigmovie.mp4")
' this file sized as big as 5GB!
Dim index As Integer = text.IndexOf("hello")
If index >= 0 Then
MsgBox("FOUND!")
' String is in file, starting at character "index"
End If
But, the problem is: This code is really DANGEROUS, that may lead to memory leak or using 100% of the RAM. The question is, is there any way or workaround for the code above? Maybe chunking or reading part of the file and then dispose to avoid memory leak/out of memory? Or is there any way to minimize the memory usage when using the code? As I felt responsible for other's computer stability. Please Help :)
You should use System.IO.StreamReader, which reads line by line instead all the lines at the same time (here you have a similar post in C#); I personally never use ReadAll*** unless under very specific conditions. Sample adaptation of your code:
Dim index As Integer = -1
Dim lineCount As Integer = -1
Using reader As System.IO.StreamReader = New System.IO.StreamReader("C:\Users\Farizluqman\mybigmovie.mp4")
Dim line As String
line = reader.ReadLine
If (line IsNot Nothing AndAlso line.Contains("hello")) Then
index = line.IndexOf("hello")
Else
If (line IsNot Nothing) Then lineCount = line.Length
Do While (Not line Is Nothing)
line = reader.ReadLine
If (line IsNot Nothing) Then
lineCount = lineCount + line.Length
If (line.Contains("hello")) Then
index = lineCount - line.Length + line.IndexOf("hello")
Exit Do
End If
End If
Loop
End If
End Using
If index >= 0 Then
MsgBox("FOUND!")
' String is in file, starting at character "index"
End If