Count occurance of specific words in a text file in vb.net - vb.net

I'm trying to count the number of an item in a text file, by counting each instance the item was entered into the file earlier on in the program.
I already have the text read from the file and in a text box. The problem is that my current code was just counting the characters in the textbox and not the number of times my desired word was in the file.
For Each desiredword As String In txtContentofFile.Text
intdesiredword = intdesiredword + 1
txtdesiredwordcount.Text = intdesiredword
Next
This counts the characters in the textbox instead of counting the number of desired words. I tried repeatedly before asking help and searched extensively, but I just don't understand what's wrong with my code. Please help :)

You can use Split Function :
C#:
int count = txtContentofFile.Text.Split(desiredword).Length - 1;
VB.net:
Dim count As Integer = txtContentofFile.Text.Split(desiredword).Length - 1

I prefer to use Regular Expressions in this type of situation. They are very tricky to understand but they are extremely powerful and typically faster than other string manipulation techniques.
Dim AllMatchResults As MatchCollection
Try
Dim RegexObj As New Regex(desiredword)
AllMatchResults = RegexObj.Matches(txtContentofFile.Text)
If AllMatchResults.Count > 0 Then
' Access individual matches using AllMatchResults.Item[]
Else
' Match attempt failed
End If
Catch ex As ArgumentException
'Syntax error in the regular expression
End Try
In your case you are looking for the value from AllMatchResults.Count.
Using a great Regular Expression tool like RegexBuddy to build and test the expressions is a great help too. (The above code snippet was generated by RegexBuddy!)

Try this:
Dim text As String = IO.File.ReadAllText("C:\file.txt")
Dim wordsToSearch() As String = New String() {"Hello", "World", "foo"}
Dim words As New List(Of String)()
Dim findings As Dictionary(Of String, List(Of Integer))
'Dividing into words
words.AddRange(text.Split(New String() {" ", Environment.NewLine()}, StringSplitOptions.RemoveEmptyEntries))
findings = SearchWords(words, wordsToSearch)
Console.WriteLine("Number of 'foo': " & findings("foo").Count)
Function used:
Private Function SearchWords(ByVal allWords As List(Of String), ByVal wordsToSearch() As String) As Dictionary(Of String, List(Of Integer))
Dim dResult As New Dictionary(Of String, List(Of Integer))()
Dim i As Integer = 0
For Each s As String In wordsToSearch
dResult.Add(s, New List(Of Integer))
While i >= 0 AndAlso i < allWords.Count
i = allWords.IndexOf(s, i)
If i >= 0 Then dResult(s).Add(i)
i += 1
End While
Next
Return dResult
End Function
You will have not only the number of occurances, but the index positions in the file, grouped easily in a Dictionary.

Try the following code
Function word_frequency(word_ As String, input As String) As Integer
Dim ct = 0
Try
Dim wLEN = word_.Length
Do While input.IndexOf(word_) <> -1
Dim idx = input.IndexOf(word_) + wLEN
ct += 1
input = input.Substring(idx)
Loop
Catch ex As Exception
End Try
Return ct
End Function

Related

Get a specific value from the line in brackets (Visual Studio 2019)

I would like to ask for your help regarding my problem. I want to create a module for my program where it would read .txt file, find a specific value and insert it to the text box.
As an example I have a text file called system.txt which contains single line text. The text is something like this:
[Name=John][Last Name=xxx_xxx][Address=xxxx][Age=22][Phone Number=8454845]
What i want to do is to get only the last name value "xxx_xxx" which every time can be different and insert it to my form's text box
Im totally new in programming, was looking for the other examples but couldnt find anything what would fit exactly to my situation.
Here is what i could write so far but i dont have any idea if there is any logic in my code:
Dim field As New List(Of String)
Private Sub readcrnFile()
For Each line In File.ReadAllLines(C:\test\test_1\db\update\network\system.txt)
For i = 1 To 3
If line.Contains("Last Name=" & i) Then
field.Add(line.Substring(line.IndexOf("=") + 2))
End If
Next
Next
End Sub
Im
You can get this down to a function with a single line of code:
Private Function readcrnFile(fileName As String) As IEnumerable(Of String)
Return File.ReadLines(fileName).Where(Function(line) RegEx.IsMatch(line, "[[[]Last Name=(?<LastName>[^]]+)]").Select(Function(line) RegEx.Match(line, exp).Groups("LastName").Value)
End Function
But for readability/maintainability and to avoid repeating the expression evaluation on each line I'd spread it out a bit:
Private Function readcrnFile(fileName As String) As IEnumerable(Of String)
Dim exp As New RegEx("[[[]Last Name=(?<LastName>[^]]+)]")
Return File.ReadLines(fileName).
Select(Function(line) exp.Match(line)).
Where(Function(m) m.Success).
Select(Function(m) m.Groups("LastName").Value)
End Function
See a simple example of the expression here:
https://dotnetfiddle.net/gJf3su
Dim strval As String = " [Name=John][Last Name=xxx_xxx][Address=xxxx][Age=22][Phone Number=8454845]"
Dim strline() As String = strval.Split(New String() {"[", "]"}, StringSplitOptions.RemoveEmptyEntries) _
.Where(Function(s) Not String.IsNullOrWhiteSpace(s)) _
.ToArray()
Dim lastnameArray() = strline(1).Split("=")
Dim lastname = lastnameArray(1).ToString()
Using your sample data...
I read the file and trim off the first and last bracket symbol. The small c following the the 2 strings tell the compiler that this is a Char. The braces enclosed an array of Char which is what the Trim method expects.
Next we split the file text into an array of strings with the .Split method. We need to use the overload that accepts a String. Although the docs show Split(String, StringSplitOptions), I could only get it to work with a string array with a single element. Split(String(), StringSplitOptions)
Then I looped through the string array called splits, checking for and element that starts with "Last Name=". As soon as we find it we return a substring that starts at position 10 (starts at zero).
If no match is found, an empty string is returned.
Private Function readcrnFile() As String
Dim LineInput = File.ReadAllText("system.txt").Trim({"["c, "]"c})
Dim splits = LineInput.Split({"]["}, StringSplitOptions.None)
For Each s In splits
If s.StartsWith("Last Name=") Then
Return s.Substring(10)
End If
Next
Return ""
End Function
Usage...
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
TextBox1.Text = readcrnFile()
End Sub
You can easily split that line in an array of strings using as separators the [ and ] brackets and removing any empty string from the result.
Dim input As String = "[Name=John][Last Name=xxx_xxx][Address=xxxx][Age=22][Phone Number=8454845]"
Dim parts = input.Split(New Char() {"["c, "]"c}, StringSplitOptions.RemoveEmptyEntries)
At this point you have an array of strings and you can loop over it to find the entry that starts with the last name key, when you find it you can split at the = character and get the second element of the array
For Each p As String In parts
If p.StartsWith("Last Name") Then
Dim data = p.Split("="c)
field.Add(data(1))
Exit For
End If
Next
Of course, if you are sure that the second entry in each line is the Last Name entry then you can remove the loop and go directly for the entry
Dim data = parts(1).Split("="c)
A more sophisticated way to remove the for each loop with a single line is using some of the IEnumerable extensions available in the Linq namespace.
So, for example, the loop above could be replaced with
field.Add((parts.FirstOrDefault(Function(x) x.StartsWith("Last Name"))).Split("="c)(1))
As you can see, it is a lot more obscure and probably not a good way to do it anyway because there is no check on the eventuality that if the Last Name key is missing in the input string
You should first know the difference between ReadAllLines() and ReadLines().
Then, here's an example using only two simple string manipulation functions, String.IndexOf() and String.Substring():
Sub Main(args As String())
Dim entryMarker As String = "[Last Name="
Dim closingMarker As String = "]"
Dim FileName As String = "C:\test\test_1\db\update\network\system.txt"
Dim value As String = readcrnFile(entryMarker, closingMarker, FileName)
If Not IsNothing(value) Then
Console.WriteLine("value = " & value)
Else
Console.WriteLine("Entry not found")
End If
Console.Write("Press Enter to Quit...")
Console.ReadKey()
End Sub
Private Function readcrnFile(ByVal entry As String, ByVal closingMarker As String, ByVal fileName As String) As String
Dim entryIndex As Integer
Dim closingIndex As Integer
For Each line In File.ReadLines(fileName)
entryIndex = line.IndexOf(entry) ' see if the marker is in our line
If entryIndex <> -1 Then
closingIndex = line.IndexOf(closingMarker, entryIndex + entry.Length) ' find first "]" AFTER our entry marker
If closingIndex <> -1 Then
' calculate the starting position and length of the value after the entry marker
Dim startAt As Integer = entryIndex + entry.Length
Dim length As Integer = closingIndex - startAt
Return line.Substring(startAt, length)
End If
End If
Next
Return Nothing
End Function

New to programming language, need help to read a .dat file into array Vs VB, console .net framework

I am trying to read from a data file called TXT.dat and store the circled values into a separate array's using the type of commands I have used. I cannot use streamreader,streamwriter this is what I am learning at university but we were taught to read an array not certain values as I will be tested using a file which has
2 factorial values from a single file and store int two arrays.
I have spent hours trying before going to paid sites who always answer with i/o streamreader which doesnt help
.dat file i created to practice
MY CODE
Sub Main()
Const i_val As Integer = 6
Dim j As Integer = 6 'loop readers
Dim Arayn_Fact(i_val - 1) As Double 'array for 2nd value per line
Dim Aray_Fact2n(j - 1) As Double 'array for 4th value per line
Read_Values(i_val - 1, Arayn_Fact)
End Sub
Sub Read_Values(ByVal i As Integer, ByRef _A() As Double)
Dim fid1 As Integer = FreeFile()
Dim fid2 As Integer = FreeFile() + 1
Dim tmp As Double
FileOpen(fid1, "TXT.dat", OpenMode.Input, OpenAccess.Read)
For i = 0 To 5 Step 1
Input(fid1, tmp)
Input(fid1, _A(i))
Next i
FileClose(fid1)
Console.ReadKey()
End Sub
Without getting too fancy, here's some very basic code to do what you've outlined:
Sub Main()
Dim fileName As String = "txt.dat"
Dim Aray_Fact2() As Integer 'array for 2nd value per line
Dim Aray_Fact4() As Integer 'array for 4th value per line
Dim list2 As New List(Of Integer)
Dim list4 As New List(Of Integer)
Dim lines() As String = System.IO.File.ReadAllLines(fileName)
For Each line As String In lines
Dim values() As String = line.Split(",")
If values.Length >= 3 Then
Dim f2, f4 As Integer
If Integer.TryParse(values(1), f2) AndAlso Integer.TryParse(values(3), f4) Then
list2.Add(f2)
list4.Add(f4)
Else
Console.WriteLine("Invalid value in line: " & line)
End If
Else
Console.WriteLine("Invalid number of entries in line: " & line)
End If
Next
Aray_Fact2 = list2.ToArray
Aray_Fact4 = list4.ToArray
Console.WriteLine("Aray_Fact2: " & String.Join(",", Aray_Fact2))
Console.WriteLine("Aray_Fact4: " & String.Join(",", Aray_Fact4))
Console.Write("Press Enter to Quit...")
Console.ReadLine()
End Sub
If you are completely against the use of Lists and ToArray, then you'd have to read the file TWICE. Once to determine how many lines are in the file so you can dimension the arrays to the correct size. Then another time to actually read the values and populate the arrays.
First, some style things. Passing the array ByRef is incorrect. Arrays are already reference types, and so passing ByVal still passes the reference to the array object. But don't even pass the array in the first place. It's better practice to let the Read_Values() method create and return the array to the calling function.
And no one uses that ancient/ganky FileOpen() API anymore. I mean that. NO. ONE. It's not even appropriate for teaching, as the Stream types are closer to what the low level operating system/file system are doing. If you can't use StreamReader and friends, there are still better options out there.
Sub Main()
Dim Array_Fact() As Double = Read_Values("TXT.dat", 1)
Dim Array_Fact2n() As Double = Read_Values("TXT.dat", 3)
Console.ReadKey(True)
End Sub
Function Read_Values(filePath As String, position As Integer) As Double()
Dim lines() As String = File.ReadAllLines(filePath) 'Look Ma, no StreamReader
Dim result(lines.Length - 1) As Double
For i As Integer = 0 To Lines.Length - 1
result(i) = Double.Parse(lines(i).Split(",")(position).Trim())
Next line
Return result
End Function
But what I'd really do is keep the 2nd and 4th values in the same collection and read everything in one pass through the file. This uses features like Generics, Tuples, Lambdas+Linq, and more that you won't get to for some time, but I feel like it's useful to show you where you're going:
Sub Main()
Dim Facts = Read_Values("TXT.dat")
Console.WriteLine(String.Join(",",Facts.Select(Function(f) $"({f.Item1}:{f.Item2})")))
Console.ReadKey(True)
End Sub
Function Read_Values(filePath As String) As IEnumerable(Of (Double, Double))
Return File.ReadLines(filePath).Select(
Function(line) 'Poor man's CSV. In real code, I'd pull in a CSV parser from NuGet
Dim fields = line.Split(",")
Return (Double.Parse(fields(1).Trim()), Double.Parse(fields(3).Trim()))
End Function
)
End Function

Searching Multiple strings with 1st criteria, then searching those returned values with a different criteria and so on

so..
I have a txt file with hundreds of sentences or strings.
I also have 4 comboboxes with options that a user can select from and
each combobox is part of a different selection criteria. They may or may not use all the comboboxes.
When a user selects an option from any combobox I use a For..Next statement to run through the txt file and pick out all the strings that contain or match whatever the user selected. It then displays those strings for the user to see, so that if they wanted to they could further narrow down the search from that point by using the 3 remaining comboboxes making it easier to find what they want.
I can achieve this by using lots of IF statements within the for loop but is that the only way?
No, there are other ways. You can leverage LINQ to get rid of some of those if statements:
Private _lstLinesInFile As List(Of String) = New List(Of String)
Private Function AddClause(ByVal qryTarget As IEnumerable(Of String), ByVal strToken As String) As IEnumerable(Of String)
If Not String.IsNullOrWhiteSpace(strToken) Then
qryTarget = qryTarget.Where(Function(ByVal strLine As String) strLine.Contains(strToken))
End If
Return qryTarget
End Function
Public Sub YourEventHandler()
'Start Mock
Dim strComboBox1Value As String = "Test"
Dim strComboBox2Value As String = "Stack"
Dim strComboBox3Value As String = String.Empty
Dim strComboBox4Value As String = Nothing
'End Mock
If _lstLinesInFile.Count = 0 Then
'Only load from the file once.
_lstLinesInFile = IO.File.ReadAllLines("C:\Temp\Test.txt").ToList()
End If
Dim qryTarget As IEnumerable(Of String) = (From strTarget In _lstLinesInFile)
'Assumes you don't have to match tokens that are split by line breaks.
qryTarget = AddClause(qryTarget, strComboBox1Value)
qryTarget = AddClause(qryTarget, strComboBox2Value)
qryTarget = AddClause(qryTarget, strComboBox3Value)
qryTarget = AddClause(qryTarget, strComboBox4Value)
Dim lstResults As List(Of String) = qryTarget.ToList()
End Sub
Keep in mind this is case sensitive so you may want to throw in some .ToLower() calls in there:
qryTarget = qryTarget.Where(Function(ByVal strLine As String) strLine.ToLower().Contains(strToken.ToLower()))
I think a compound If statement is the simplest:
Dim strLines() As String = IO.File.ReadAllText(strFilename).Split(vbCrLf)
Dim strSearchTerm1 As String = "Foo"
Dim strSearchTerm2 As String = "Bar"
Dim strSearchTerm3 As String = "Two"
Dim strSearchTerm4 As String = ""
Dim lstOutput As New List(Of String)
For Each s As String In strLines
If s.Contains(strSearchTerm1) AndAlso
s.Contains(strSearchTerm2) AndAlso
s.Contains(strSearchTerm3) AndAlso
s.Contains(strSearchTerm4) Then
lstOutput.Add(s)
End If
Next

Length of 2D String list does not return the right value

I have a problem when I use .count in my 2D String list. This is the code:
If File.Exists(fullPath) = True Then
Dim readText() As String = File.ReadAllLines(fullPath)
Dim s As String
accountCounter = 0
For Each s In readText
accountList.Add(New List(Of String))
accountList.Add(New List(Of String))
accountList.Add(New List(Of String))
accountList(accountCounter).Add(s.Split(",")(0))
accountList(accountCounter).Add(s.Split(",")(1))
accountList(accountCounter).Add(s.Split(",")(2))
accountCounter += 1
Next
print_logs(accountList.count)
End If
The result is this:
{{name,email,password},{name2,email2,password2},{name3,email3,password3},{name4,email4,password4}}
beacuse in the file there are the following lines:
name,email,password
name2,email2,password2
name3,email3,password3
name4,email4,password4
But data is not the problem, the real problem is the Count method, it returns (12). I think that it returns 4 * 3 result, because if I add this in the code:
print_logs(accountList(0).Count)
it correctly returns 3.
So, how can I just return 4?
In this code you create three new rows everytime you do an iteration... If there are four lines in your text files then you will create twelve...
Do this instead :
If File.Exists(fullPath) = True Then
Dim readText() As String = File.ReadAllLines(fullPath)
Dim s As String
accountCounter = 0
For Each s In readText
accountList.Add(New List(Of String))
accountList(accountCounter).Add(s.Split(",")(0))
accountList(accountCounter).Add(s.Split(",")(1))
accountList(accountCounter).Add(s.Split(",")(2))
accountCounter += 1
Next
print_logs(accountList.count)
End If
And if you want to make it even better :
If File.Exists(fullPath) = True Then
Dim readText() As String = File.ReadAllLines(fullPath)
For Each s As String In readText
Dim newList = New List(Of String)
newList.Add(s.Split(",")(0))
newList.Add(s.Split(",")(1))
newList.Add(s.Split(",")(2))
accountList.Add(newList)
Next
print_logs(accountList.count)
End If

Splitting a CSV Visual basic

I have a string like
Query_1,ab563372363_C/R,100.00,249,0,0,1,249,1,249,1e-132, 460
Query_1,ab563372356_C/R,99.60,249,1,0,1,249,1,249,5e-131, 455
in a file
in two separate lines. I am reading it from the textbox. I have to output ab563372363_C/R and ab563372356_C/R in a text box. I am trying to use the split function for that but its not working..
Dim splitString as Array
results = "test.txt"
Dim FileText As String = IO.File.ReadAllText(results) 'reads the above contents from file
splitString = Split(FileText, ",", 14)
TextBox2.text = splitString(1) & splitString(13)
for the above code, it just prints the whole thing.. What's wrong?
Try this
Private Function GetRequiredText() As List(Of String)
Dim requiredStringList As New List(Of String)
Dim file = "test.txt"
If FileIO.FileSystem.FileExists(file) Then
Dim reader As System.IO.StreamReader = System.IO.File.OpenText(file)
Dim line As String = reader.ReadLine()
While line IsNot Nothing
requiredStringList.Add(line.Split(",")(1))
line = reader.ReadLine()
End While
reader.Close()
reader.Dispose()
End If
Return requiredStringList
End Function
This will read the file line by line and add the item you require to a list of strings which will be returned by the function.
Returning a List(Of String) may be overkill, but it's quite simple to illustrate and to work with.
You can then iterate through the list and do what you need with the contents of the list.
Comments welcome!!
Also this might work...
Dim query = From lines In System.IO.File.ReadAllLines(file) _
Select lines.Split(",")(1)
this will return an IEnumerable(Of String)
Enjoy
First
Since you are reading the whole text, your FileText would be ending like this:
Query_1,ab563372363_C/R,100.00,249,0,0,1,249,1,249,1e-132,460
\r\n
Query_1,ab563372356_C/R,99.60,249,1,0,1,249,1,249,5e-131, 455
So when you are referencing to your splitStringwith those indexes (1, 13) your result might probably be wrong.
Second
Try to specify what kind of type your array is, Dim splitString as Array should be Dim splitString As String()
Third
Make your code more readable/maintainable and easy to edit (not only for you, but others)
The Code
Private const FirstIndex = 1
Private const SecondIndex = 12
Sub Main
Dim myDelimiter As Char
Dim myString As String
Dim mySplit As String()
Dim myResult1 As String
Dim myResult2 As String
myDelimiter = ","
myString += "Query_1,ab563372363_C/R,100.00,249,0,0,1,249,1,249,1e-132, 460"
myString += "Query_1,ab563372356_C/R,99.60,249,1,0,1,249,1,249,5e-131, 455"
mySplit = myString.Split(myDelimiter)
myResult1 = mySplit(FirstIndex)
myResult2 = mySplit(SecondIndex)
Console.WriteLine(myResult1)
Console.WriteLine(myResult2)
End Sub