Count words in an external file using delimiter of a space - vb.net

I want to calculate the number of words in a text file using a delimiter of a space (" "), however I am struggling.
Dim counter = 0
Dim delim = " "
Dim fields() As String
fields = Nothing
Dim line As String
line = Input
While (SR.EndOfStream)
line = SR.ReadLine()
End While
Console.WriteLine(vbLf & "Reading File.. ")
fields = line.Split(delim.ToCharArray())
For i = 0 To fields.Length
counter = counter + 1
Next
SR.Close()
Console.WriteLine(vbLf & "The word count is {0}", counter)
I do not know how to open the file and get this to work. I am very confused and would like an explanation so that I can edit the code and learn from it.

You're going to be reading a file as the source of the data, so let's create a variable to refer to its filename:
Dim srcFile = "C:\temp\twolines.txt"
As you have shown already, a variable is needed to hold the number of words found:
Dim counter = 0
To read from the file, a StreamReader will do the job. If we look at the documentation for it (yes, really), we notice that it has a Dispose method. That means we should explicitly dispose of it after we've used it, so that the file handle and other system resources are released promptly instead of staying tied up until the garbage collector or the end of the process gets to them (e.g. the file could otherwise remain locked). Fortunately, there is the Using construct to take care of that for us:
Using sr As New StreamReader(srcFile)
And now we want to iterate over the content of the file line-by-line until the end of the file:
While Not sr.EndOfStream
Then we want to read a line and find how many items separated by spaces it has:
counter += sr.ReadLine().Split({" "c}, StringSplitOptions.RemoveEmptyEntries).Length
The += operator is like saying "add n to a" instead of saying "a = a + n". The {" "c} is a literal array containing the single character " "c. The c tells the compiler that it is a character and not a string of one character. The StringSplitOptions.RemoveEmptyEntries means that if the text were "one   two" (with several spaces between the words), the extra spaces would not produce empty entries that get counted as words.
So, if you were writing a console program, it might look like:
Imports System.IO
Module Module1
Sub Main()
Dim srcFile = "C:\temp\twolines.txt"
Dim counter = 0
Using sr As New StreamReader(srcFile)
While Not sr.EndOfStream
counter += sr.ReadLine().Split({" "c}, StringSplitOptions.RemoveEmptyEntries).Length
End While
End Using
Console.WriteLine(counter)
Console.ReadLine()
End Sub
End Module
Any embellishments such as writing out what the number represents or error checking are left up to you.
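The same count can also be written without an explicit loop. The following is only a sketch of that variant: the module and function names are made up for illustration, and treating tabs as separators in addition to spaces is an assumption that goes beyond the original question.

```vbnet
Imports System
Imports System.IO
Imports System.Linq

Module WordCountSketch
    ' Counts space- or tab-separated words in a file, ignoring repeated separators.
    ' File.ReadLines reads the file lazily, one line at a time.
    Function CountWords(fileName As String) As Integer
        Dim separators = {" "c, ControlChars.Tab}
        Return File.ReadLines(fileName).
            Sum(Function(line) line.Split(separators, StringSplitOptions.RemoveEmptyEntries).Length)
    End Function

    Sub Main()
        ' Same example path as in the program above.
        Console.WriteLine(CountWords("C:\temp\twolines.txt"))
    End Sub
End Module
```

File.ReadLines opens and closes the file itself, so no Using block is needed in this version.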

With Path.Combine you don't have to worry about where the slashes or backslashes go. You can get the path of special folders easily using the Environment class. The methods of the File class in System.IO are Shared, so you don't have to create an instance.
Public Sub Main()
Dim p As String = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments), "Chapters.txt")
Debug.Print(Environment.SpecialFolder.MyDocuments.ToString)
Dim count As Integer = GetCount(p)
Console.WriteLine(count)
Console.ReadKey()
End Sub
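Private Function GetCount(fileName As String) As Integer
Dim s = File.ReadAllText(fileName)
' An empty separator array splits on any whitespace; RemoveEmptyEntries skips repeated spaces and blank lines
Return s.Split(New Char() {}, StringSplitOptions.RemoveEmptyEntries).Length
End Function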

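Use the Split function, then take the length of the resulting array directly. That length is already the word count (it is the number of delimiters that is one less), so nothing needs to be added to it.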

Related

Get a specific value from the line in brackets (Visual Studio 2019)

I would like to ask for your help regarding my problem. I want to create a module for my program where it would read .txt file, find a specific value and insert it to the text box.
As an example I have a text file called system.txt which contains single line text. The text is something like this:
[Name=John][Last Name=xxx_xxx][Address=xxxx][Age=22][Phone Number=8454845]
What I want to do is get only the last name value "xxx_xxx", which can be different every time, and insert it into my form's text box.
I'm totally new to programming. I was looking at other examples but couldn't find anything that fits my situation exactly.
Here is what I could write so far, but I don't have any idea whether there is any logic in my code:
Dim field As New List(Of String)
Private Sub readcrnFile()
For Each line In File.ReadAllLines(C:\test\test_1\db\update\network\system.txt)
For i = 1 To 3
If line.Contains("Last Name=" & i) Then
field.Add(line.Substring(line.IndexOf("=") + 2))
End If
Next
Next
End Sub
You can get this down to a function with a single line of code:
Private Function readcrnFile(fileName As String) As IEnumerable(Of String)
Return File.ReadLines(fileName).Where(Function(line) RegEx.IsMatch(line, "[[[]Last Name=(?<LastName>[^]]+)]")).Select(Function(line) RegEx.Match(line, "[[[]Last Name=(?<LastName>[^]]+)]").Groups("LastName").Value)
End Function
But for readability/maintainability and to avoid repeating the expression evaluation on each line I'd spread it out a bit:
Private Function readcrnFile(fileName As String) As IEnumerable(Of String)
Dim exp As New RegEx("[[[]Last Name=(?<LastName>[^]]+)]")
Return File.ReadLines(fileName).
Select(Function(line) exp.Match(line)).
Where(Function(m) m.Success).
Select(Function(m) m.Groups("LastName").Value)
End Function
See a simple example of the expression here:
https://dotnetfiddle.net/gJf3su
Dim strval As String = " [Name=John][Last Name=xxx_xxx][Address=xxxx][Age=22][Phone Number=8454845]"
Dim strline() As String = strval.Split(New String() {"[", "]"}, StringSplitOptions.RemoveEmptyEntries) _
.Where(Function(s) Not String.IsNullOrWhiteSpace(s)) _
.ToArray()
Dim lastnameArray() = strline(1).Split("="c)
Dim lastname = lastnameArray(1)
Using your sample data...
I read the file and trim off the first and last bracket symbols. The small c following each of the 2 strings tells the compiler that it is a Char. The braces enclose an array of Char, which is what the Trim method expects.
Next we split the file text into an array of strings with the .Split method. We need to use an overload that accepts a String separator. Although the docs show Split(String, StringSplitOptions), I could only get it to work with a string array with a single element, Split(String(), StringSplitOptions), most likely because the Split(String, StringSplitOptions) overload is only available in .NET Core 2.0 and later, not in .NET Framework.
Then I loop through the string array called splits, checking for an element that starts with "Last Name=". As soon as we find it, we return the substring that starts at position 10 (positions are zero-based, and "Last Name=" is 10 characters long).
If no match is found, an empty string is returned.
Private Function readcrnFile() As String
Dim LineInput = File.ReadAllText("system.txt").Trim({"["c, "]"c})
Dim splits = LineInput.Split({"]["}, StringSplitOptions.None)
For Each s In splits
If s.StartsWith("Last Name=") Then
Return s.Substring(10)
End If
Next
Return ""
End Function
Usage...
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
TextBox1.Text = readcrnFile()
End Sub
You can easily split that line in an array of strings using as separators the [ and ] brackets and removing any empty string from the result.
Dim input As String = "[Name=John][Last Name=xxx_xxx][Address=xxxx][Age=22][Phone Number=8454845]"
Dim parts = input.Split(New Char() {"["c, "]"c}, StringSplitOptions.RemoveEmptyEntries)
At this point you have an array of strings and you can loop over it to find the entry that starts with the last name key, when you find it you can split at the = character and get the second element of the array
For Each p As String In parts
If p.StartsWith("Last Name") Then
Dim data = p.Split("="c)
field.Add(data(1))
Exit For
End If
Next
Of course, if you are sure that the second entry in each line is the Last Name entry then you can remove the loop and go directly for the entry
Dim data = parts(1).Split("="c)
A more sophisticated way to remove the for each loop with a single line is using some of the IEnumerable extensions available in the Linq namespace.
So, for example, the loop above could be replaced with
field.Add((parts.FirstOrDefault(Function(x) x.StartsWith("Last Name"))).Split("="c)(1))
As you can see, it is a lot more obscure, and probably not a good way to do it anyway, because there is no check for the case where the Last Name key is missing from the input string: FirstOrDefault then returns Nothing and the call to Split throws a NullReferenceException.
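One way to keep the LINQ lookup while covering the missing-key case is sketched below. The module and function names are hypothetical, and returning Nothing for a missing key is an assumption about what the caller wants.

```vbnet
Imports System
Imports System.Linq

Module LastNameSketch
    ' Returns the value of the "Last Name" entry, or Nothing when the key is absent.
    Function GetLastName(input As String) As String
        Dim parts = input.Split({"["c, "]"c}, StringSplitOptions.RemoveEmptyEntries)
        Dim entry = parts.FirstOrDefault(Function(x) x.StartsWith("Last Name="))
        If entry Is Nothing Then Return Nothing
        ' Take everything after the key, so a value that contains "=" is kept whole.
        Return entry.Substring("Last Name=".Length)
    End Function

    Sub Main()
        Console.WriteLine(GetLastName("[Name=John][Last Name=xxx_xxx][Address=xxxx]")) ' prints xxx_xxx
        Console.WriteLine(GetLastName("[Name=John]") Is Nothing) ' prints True
    End Sub
End Module
```

The caller can then test the result with Is Nothing before adding it to the list.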
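You should first know the difference between ReadAllLines() and ReadLines(): ReadAllLines() reads the whole file into a String array before returning, while ReadLines() returns an enumerable that reads one line at a time as you iterate, so you can stop as soon as you have found what you need.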
Then, here's an example using only two simple string manipulation functions, String.IndexOf() and String.Substring():
Sub Main(args As String())
Dim entryMarker As String = "[Last Name="
Dim closingMarker As String = "]"
Dim FileName As String = "C:\test\test_1\db\update\network\system.txt"
Dim value As String = readcrnFile(entryMarker, closingMarker, FileName)
If Not IsNothing(value) Then
Console.WriteLine("value = " & value)
Else
Console.WriteLine("Entry not found")
End If
Console.Write("Press Enter to Quit...")
Console.ReadKey()
End Sub
Private Function readcrnFile(ByVal entry As String, ByVal closingMarker As String, ByVal fileName As String) As String
Dim entryIndex As Integer
Dim closingIndex As Integer
For Each line In File.ReadLines(fileName)
entryIndex = line.IndexOf(entry) ' see if the marker is in our line
If entryIndex <> -1 Then
closingIndex = line.IndexOf(closingMarker, entryIndex + entry.Length) ' find first "]" AFTER our entry marker
If closingIndex <> -1 Then
' calculate the starting position and length of the value after the entry marker
Dim startAt As Integer = entryIndex + entry.Length
Dim length As Integer = closingIndex - startAt
Return line.Substring(startAt, length)
End If
End If
Next
Return Nothing
End Function

Is it possible to use String.Split() when NewLine is the delimiter?

I have a question which asks me to calculate something from an input file. The problem is, the lines in the file don't use any special character as delimiter, like , or |. I will show it down below.
Data Communication
20
Visual Basic
40
The output I need to write to another file should look like this:
Data communication 20
Visual Basic 40
Total Books : 60
The problem is, how can I specify the delimiter? Like when there is a symbol as in strArray = strLine.Split(","). Since there is nothing I can use as delimiter, how can I split the file content?
There's no real need to split the text in the input file, when you can read a file line by line using standard methods.
You can use, e.g., a StreamReader to read the lines from the source file, check whether the current line is just text or it can be converted to a number, using Integer.TryParse and excluding empty lines.
Here, when the line read is not numeric, it's added as a Key in a Dictionary(Of String, Integer), unless it already exists (to handle duplicate categories in the source file).
If the line represents a number, it's added to the Value corresponding to the category Key previously read, stored in a variable named previousLine.
This setup can handle initial empty lines, empty lines in the text body and duplicate categories, e.g.,
Data Communication
20
Visual Basic
40
C#
100
Visual Basic
10
Other stuff
2
C++
10000
Other stuff
1
If a number is instead found in the first line, it's treated as a category.
Add any other check to handle a different structure of the input file.
Imports System.IO
Imports System.Linq
Dim basePath = "[Path where the input file is stored]"
Dim booksDict = New Dictionary(Of String, Integer)
Dim currentValue As Integer = 0
Dim previousLine As String = String.Empty
Using sr As New StreamReader(Path.Combine(basePath, "Books.txt"))
While sr.Peek > -1
Dim line = sr.ReadLine().Trim()
If Not String.IsNullOrEmpty(line) Then
If Integer.TryParse(line, currentValue) AndAlso (Not String.IsNullOrEmpty(previousLine)) Then
booksDict(previousLine) += currentValue
Else
If Not booksDict.ContainsKey(line) Then
booksDict.Add(line, 0)
End If
End If
End If
previousLine = line
End While
End Using
Now, you have a Dictionary where the Keys represent categories and the related Value is the sum of all books in that category.
You can Select() each KeyValuePair of the Dictionary and transform it into a string that represents the Key and its Value (Category:Number).
Here, also OrderBy() is used, to order the categories alphabetically, in ascending order; it may be useful.
File.WriteAllLines is then called to store the strings generated.
In the end, a new string is appended to the file, using File.AppendAllText, to write the sum of all books in all categories. The Sum() method sums all the Values in the Dictionary.
Dim newFilePath = Path.Combine(basePath, "BooksNew.txt")
File.WriteAllLines(newFilePath, booksDict.
Select(Function(kvp) $"{kvp.Key}:{kvp.Value}").OrderBy(Function(s) s))
File.AppendAllText(newFilePath, vbCrLf & "Total Books: " & booksDict.Sum(Function(kvp) kvp.Value).ToString())
The output is:
C#:100
C++:10000
Data Communication:20
Other stuff:3
Visual Basic:50
Total Books: 10173
Sure. System.IO.File.ReadAllLines() will read the whole file and split it into an array based on newlines, so you'll get an array of 4 elements. You can process it with a flip-flop Boolean to get alternate lines, or you can try to parse each line as a number: if that works, it's a number, and if not, it's a string. If it's a number, take the string you remembered (using a variable) from the previous iteration of the loop:
Dim arr = File.ReadAllLines(...)
Dim isStr = True
Dim prevString = ""
For Each s As String In arr
If isStr Then
prevString = s
Else
Console.WriteLine($"The string is {prevString} and the number is {s}")
End If
'flip the boolean
isStr = Not isStr
Next s
I used File.ReadAllLines to get an array containing each line in the file. Since the file could be larger than the sample shown, I am using a StringBuilder. This saves having to throw away the old string and create a new one on each iteration of the loop.
I am using interpolated strings indicated by the $ preceding the quotes. This allows you to insert variables into the string surrounded by braces.
Note the Step 2 in the For loop. i will increment by 2 instead of the default 1.
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
Dim lines = File.ReadAllLines("input.txt")
Dim sb As New StringBuilder
Dim total As Integer
For i = 0 To lines.Length - 2 Step 2
sb.AppendLine($"{lines(i)} {lines(i + 1)}")
total += CInt(lines(i + 1))
Next
sb.AppendLine($"Total Books: {total}")
TextBox1.Text = sb.ToString
End Sub

Text file split in blocks vb.net

I am trying to go through my text file and create a new file that will contain only the text I require. My current file looks like:
Car-1I
Colour-39
Cost-328
Dealer-28

Car-2
Colour-30
Cost-234
For each block of text I would like to read the first line. If the first line ends with an I, then read the next line, and if that line contains colour 39, then I would like to save the whole block of text to another file. If these two conditions aren't met, I don't want to save the block to the new text file.
Before anyone suggests saving my values in classes: these blocks of text can vary in size and values, so I don't always have a set range of values, which is why I need to skip to the blank line.
IO.File.WriteAllText("C:\Users\test2.txt", "") 'write to new file
Dim sKey As String
Dim sValue As Integer
For Each filterLine As String In File.ReadLines("C:\Users\test.txt")
sKey = Split(filterLine, ":")(0)
sValue = Split(filterLine, ":")(1)
If Not sValue.EndsWith("I") Then
ElseIf sValue.EndsWith("I") Then
End If
Next
Another method uses File.ReadLines to read lines of text from the file. This method doesn't load all the text into memory; it reads single lines of text from disk, so it can also be useful when dealing with big files.
You could loop over the IEnumerable collection it returns, but you can also use its GetEnumerator() method to control more directly when to move to the next line, or to move more than one line forward.
The enumerator's Current property returns the line of text currently read, and its MoveNext() method moves to the next line.
A StringBuilder is used to store the strings when a match is found. Strings are added to the StringBuilder object using its AppendLine() method.
This class is useful when dealing with strings that you need to store, compare and discard (or modify) quickly: since strings are immutable, when you use String variables directly, especially in loops, you generate a whole lot of garbage that slows down any procedure quite a lot.
The blocks of text stored in the StringBuilder object are then written to a destination file using a StreamWriter with explicit encoding set to UTF-8 (which writes the BOM). Its methods include asynchronous versions: WriteLine() can be replaced by Await WriteLineAsync() to allow an async procedure.
Imports System.IO
Imports System.Text
Dim sourceFilePath = "<Path of the source file>"
Dim resultsFilePath = "<Path of the destination file>"
Dim sb As New StringBuilder()
Dim enumerator = File.ReadLines(sourceFilePath).GetEnumerator()
Using sWriter As New StreamWriter(resultsFilePath, False, Encoding.UTF8)
While enumerator.MoveNext()
If enumerator.Current.EndsWith("I") Then
sb.AppendLine(enumerator.Current)
enumerator.MoveNext()
If enumerator.Current.EndsWith("39") Then
While Not String.IsNullOrWhiteSpace(enumerator.Current)
sb.AppendLine(enumerator.Current)
enumerator.MoveNext()
End While
sWriter.WriteLine(sb.ToString())
End If
sb.Clear()
End If
End While
End Using
This will work:
Dim strFile As String = "c:\Test5\Source.txt"
Dim strOutFile As String = "c:\Test5\OutPut.txt"
Dim strOutData As String = ""
Dim SourceGroups As String() = Split(File.ReadAllText(strFile), vbCrLf + vbCrLf)
For Each sGroup As String In SourceGroups
Dim OneGroup() As String = Split(sGroup, vbCrLf)
If OneGroup.Length > 1 AndAlso Strings.Right(OneGroup(0), 1) = "I" AndAlso Strings.Right(OneGroup(1), 2) = "39" Then
If strOutData <> "" Then strOutData += (vbCrLf & vbCrLf)
strOutData += sGroup
End If
Next
File.WriteAllText(strOutFile, strOutData)
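The block-by-block idea can also be sketched without assuming a particular line ending, by collecting lines until a blank line is reached. This is only an illustration: FilterBlocks and AddIfMatch are hypothetical names, and testing the second line with EndsWith("39") mirrors the answers here rather than parsing the colour value, so a colour such as 139 would match as well.

```vbnet
Imports System
Imports System.Collections.Generic

Module BlockFilterSketch
    ' Returns the blocks whose first line ends with "I" and whose second line ends with "39".
    ' Blocks are runs of non-blank lines separated by blank lines.
    Function FilterBlocks(lines As IEnumerable(Of String)) As List(Of String)
        Dim result As New List(Of String)
        Dim block As New List(Of String)
        For Each line In lines
            If String.IsNullOrWhiteSpace(line) Then
                AddIfMatch(block, result)
                block.Clear()
            Else
                block.Add(line.TrimEnd())
            End If
        Next
        AddIfMatch(block, result) ' the last block may not be followed by a blank line
        Return result
    End Function

    Private Sub AddIfMatch(block As List(Of String), result As List(Of String))
        If block.Count >= 2 AndAlso block(0).EndsWith("I") AndAlso block(1).EndsWith("39") Then
            result.Add(String.Join(Environment.NewLine, block))
        End If
    End Sub

    Sub Main()
        ' In-memory sample; File.ReadLines(path) could be passed instead.
        Dim sample = {"Car-1I", "Colour-39", "Cost-328", "", "Car-2", "Colour-30"}
        For Each b In FilterBlocks(sample)
            Console.WriteLine(b)
            Console.WriteLine()
        Next
    End Sub
End Module
```

With the sample above, only the first block (Car-1I, Colour-39, Cost-328) is printed.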
Something like this should work:
Dim base, i, c As Integer
' ReadAllLines returns a String array; ReadLines returns an enumerable that cannot be indexed
Dim lines1 As String() = File.ReadAllLines("C:\Users\test.txt")
c = lines1.Length
While i < c
If Len(RTrim(lines1(i))) > 0 Then
If Strings.Right(RTrim(lines1(i)), 1) = "I" Then
base = i
i += 1
If i < c AndAlso Strings.Right(RTrim(lines1(i)), 2) = "39" Then
While i < c AndAlso Len(RTrim(lines1(i))) > 0 'skip to the next blank (or the end of the file)
i += 1
End While
' write lines1(from base to (i-1)) here
Else
While i < c AndAlso Len(RTrim(lines1(i))) > 0
i += 1
End While
End If
Else
i += 1
End If
Else
i += 1
End If
End While

Reading from text files in Visual Basic

This is the first challenge on Day 1 of the 2018 Advent of Code
(link: https://adventofcode.com/2018/day/1)
So I am trying to create a program that reads a long list of positive and negative numbers (e.g +1, -2, +3, etc.) and then add them up to create a total. I have researched some methods of file handling in Visual Basic, and have come up with the below method:
Sub Main()
Dim objStreamReader As StreamReader
Dim strLine As String = ""
Dim total As Double = 0
objStreamReader = New StreamReader(AppDomain.CurrentDomain.BaseDirectory & "frequencies.txt")
strLine = objStreamReader.ReadLine
Do While Not strLine Is Nothing
Console.WriteLine(strLine)
strLine = objStreamReader.ReadLine
total += strLine
Loop
Console.WriteLine(total)
objStreamReader.Close()
Console.ReadLine()
End Sub
Here is a link to the list of numbers: https://adventofcode.com/2018/day/1/input
It is not a syntax error I am getting but a logic error. The answer is somehow wrong, but I cannot seem to figure out where! I have tried to remove the signs from each number but that throws me a NullException error when it compiles.
So far I have come out with the answer 549, which the Advent of Code website rejects. Any ideas?
Make your life easier by using File.ReadLines(fileName) instead of dealing with StreamReader. Use Path.Combine instead of string concatenation to create a path. Path.Combine takes care of adding missing \ or removing extra ones etc.
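Your file might contain an extra empty line at its end, which does not convert to a number. Use Double.TryParse to make sure you have a valid number before adding it to the total. Note also that your loop reads the next line before adding, so the first number in the file is never added and the final Nothing is. Iterating with For Each avoids that. You should have Option Strict On anyway to enforce explicit conversions.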
Dim fileName = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "frequencies.txt")
Dim total As Double = 0
For Each strLine As String In File.ReadLines(fileName)
Console.WriteLine(strLine)
Dim n As Double
If Double.TryParse(strLine, n) Then
total += n
End If
Next
Console.WriteLine(total)
Console.ReadLine()
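Since the puzzle input consists of whole numbers with an explicit sign, the summing step can be sketched on its own as below. The module and function names are hypothetical, and using Integer instead of Double is an assumption based on the sample values (+1, -2, +3); Integer.TryParse accepts a leading + or - sign.

```vbnet
Imports System
Imports System.Collections.Generic

Module FrequencySketch
    ' Adds up signed integers such as "+1" or "-2", skipping lines that do not parse.
    Function SumFrequencies(lines As IEnumerable(Of String)) As Integer
        Dim total = 0
        For Each line In lines
            Dim n As Integer
            If Integer.TryParse(line, n) Then
                total += n
            End If
        Next
        Return total
    End Function

    Sub Main()
        ' The trailing empty string stands in for a blank last line in the file.
        Console.WriteLine(SumFrequencies({"+1", "-2", "+3", ""})) ' prints 2
    End Sub
End Module
```

Passing File.ReadLines(fileName) instead of the sample array would sum the real input.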
To append strings, use a StringBuilder:
Dim test As New StringBuilder()
test.Append("your string")
Repeated appends to a StringBuilder do not create a new string each time, so performance will not suffer.

Visual Basic Read File Line by Line storing each Line in str

I am trying to loop through the contents of a text file, reading it line by line. During the looping process there are several times I need to use the file's contents.
Dim xRead As System.IO.StreamReader
xRead = File.OpenText(TextBox3.Text)
Do Until xRead.EndOfStream
Dim linetext As String = xRead.ReadLine
Dim aryTextFile() As String = linetext.Split(" ")
Dim firstname As String = Val(aryTextFile(0))
TextBox1.Text = firstname.ToString
Dim lastname As String = Val(aryTextFile(0))
TextBox2.Text = lastname.ToString
Loop
Edit: What I am trying to do is read say the first five items in a text file perform some random processing then read the next 5 lines of the text file.
I would like to be able to use the lines pulled from the text file as separated string variables.
It is not clear why you would need to have 5 lines stored at any time, according to your code sample, since you are only processing one line at a time. If you think that doing 5 lines at once will be faster, that is unlikely: StreamReader buffers its reads internally, so both approaches will probably perform the same. Reading one line at a time is a much simpler pattern to use, so it is better to look into that first.
Still, here is an approximate version of the code that does processing every 5 lines:
Sub Main()
Dim bufferMaxSize As Integer = 5
Using xRead As New System.IO.StreamReader(TextBox3.Text)
Dim buffer As New List(Of String)
Do Until xRead.EndOfStream
If buffer.Count < bufferMaxSize Then
buffer.Add(xRead.ReadLine)
Continue Do
Else
PerformProcessing(buffer)
buffer.Clear()
End If
Loop
If buffer.Count > 0 Then
'if the file ends before the buffer is processed inside the loop,
'there will be a final batch of 1-5 records,
'which also needs to be processed
PerformProcessing(buffer)
End If
End Using
End Sub
Here is mine. Really easy. It just reads each location from the file and copies the Copy1 folder to those locations. This is my first program :) Really proud of it.
Imports System.IO
Module Module1
Sub Main()
For Each Line In File.ReadLines("C:\location.txt")
My.Computer.FileSystem.CopyDirectory("C:\Copy1", Line, True)
Next
Console.WriteLine("Done")
Console.ReadLine()
End Sub
End Module