How to extract same word from a paragraph - vb.net

I want to extract specific words from a paragraph. My paragraph is in RichTextBox1 and the words to be extracted are given in an array. My code is below:
Private Sub Button1_Click(sender As System.Object, e As System.EventArgs) Handles Button1.Click
Dim A(1) As Char
A(0) = " "
A(1) = ","
Dim B As String = RichTextBox1.Text
Dim x As String() = Nothing
Dim F As Array = {"SMUGGLING", "CROSSING", "INFILTRATION"}
x = B.Split(A)
For Each F In x
Label1.Text += F.Contains(x) & ControlChars.NewLine
Next
End Sub

I think your For Each loop is mixed up.
In the line below, F.Contains returns a Boolean, so the label would show True or False rather than the word itself.
Label1.Text += F.Contains(x) & ControlChars.NewLine
I also don't think F.Contains will work on an array, as .Contains is not a member of System.Array.
I would consider using a generic list instead.
Here is an example that uses a generic list.
Dim A(1) As Char
A(0) = CChar(" ")
A(1) = CChar(",")
Dim B As String = RichTextBox1.Text
Dim x As String() = Nothing
Dim F As List(Of String) = New List(Of String)
F.Add("SMUGGLING")
F.Add("CROSSING")
F.Add("INFILTRATION")
x = B.Split(A)
For Each word In x
If F.Contains(UCase(word)) Then
Label1.Text += word & ControlChars.NewLine
End If
Next
I have reworked your For Each loop so that F.Contains is the condition of an If statement, and the word is added to the label only when it returns True. Also, For Each F In x doesn't make sense, because F is already declared as your keyword collection, so I changed it to For Each word In x.
Hope this helps :)
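For reference, the same idea can be wrapped in a small function. This is only a sketch: the names KeywordExtractor and ExtractKeywords are made up for illustration, and instead of UCase it uses a HashSet with a case-insensitive comparer, which is a substitution on my part rather than what the answer above does. It also passes StringSplitOptions.RemoveEmptyEntries so that ", " does not produce empty words.

```vbnet
Imports System
Imports System.Collections.Generic

Module KeywordExtractor
    ' Hypothetical helper: returns every word in text that equals one of the
    ' keywords, ignoring case. Words are separated by spaces or commas.
    Function ExtractKeywords(text As String, keywords As IEnumerable(Of String)) As List(Of String)
        ' Case-insensitive lookup replaces the UCase(word) comparison.
        Dim lookup As New HashSet(Of String)(keywords, StringComparer.OrdinalIgnoreCase)
        Dim separators As Char() = {" "c, ","c}
        Dim found As New List(Of String)
        For Each word As String In text.Split(separators, StringSplitOptions.RemoveEmptyEntries)
            If lookup.Contains(word) Then found.Add(word)
        Next
        Return found
    End Function
End Module
```

In the button handler this could be used as Label1.Text = String.Join(ControlChars.NewLine, ExtractKeywords(RichTextBox1.Text, {"SMUGGLING", "CROSSING", "INFILTRATION"})).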

Related

Delete first letter of the words using visual basic

I have code that changes the last letter of each word to a dot. How do I change it so that the output contains the words I enter without their first letters?
For example:
Input: Hello,how are you?
Output: ello, ow re ou?
Here is my code:
Sub New5
dim s, ns as String
dim r as String
s = inputbox("Input text")
r = "Inputed text:" & chr(9) & s & chr(13)
for i = 2 to len(s)
if mid(s,i,1)=" " then ns = ns + "." else ns = ns + mid(s,i-1,1)
next i
ns = ns + "."
r = r & "Result of work:" & chr(9) & ns
MsgBox r
End Sub
For VB6:
Private Sub Convert()
Dim strIn as string
Dim strA() As String
Dim strOut As String
Dim iX As Integer
strIn = "Hello, how are you?"
strA = Split(strIn, " ")
For iX = 0 To UBound(strA)
strA(iX) = Mid$(strA(iX), 2)
Next
strOut = Join(strA, " ")
End Sub
Incidentally your libreoffice tag is also inappropriate as LibreOffice doesn't use the same language as vb6 or vba.
Sorry, just saw this was tagged vb6. This is a vb.net answer.
If you want to get rid of the first letter of each word, the first thing to do is get the words. String.Split will return an array based on the split character you provide. In this case that character is a space. The small c following the string tells the compiler that this is a Char literal rather than a String.
Now we can loop through each word and cut off the first letter. I am storing the shortened words in a List(Of String). You can get rid of the first letter by using Substring passing the start index. We want to start at the second letter so we pass 1. Indexes start at 0 for the first letter.
Finally, use String.Join to put everything back together.
Chr, Len, Mid, and MsgBox are all leftovers from VB6. They still work for backward compatibility, but it is a good idea to learn the equivalent .NET classes.
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
New5()
End Sub
Private Sub New5()
Dim input = InputBox("Input text")
Dim words = input.Split(" "c)
Dim ShortWords As New List(Of String)
For Each word In words
ShortWords.Add(word.Substring(1))
Next
Dim shortenedString = String.Join(" ", ShortWords)
MessageBox.Show(shortenedString)
End Sub
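One thing to watch for in the code above: if the input contains two spaces in a row, Split produces an empty string, and calling Substring(1) on an empty string throws an ArgumentOutOfRangeException. A minimal sketch of a guarded version follows. The names FirstLetterRemover and RemoveFirstLetters are hypothetical, and keeping the empty tokens (so the original spacing survives) is my assumption about what is wanted.

```vbnet
Imports System
Imports System.Collections.Generic

Module FirstLetterRemover
    ' Hypothetical helper: drops the first character of every space-separated word.
    Function RemoveFirstLetters(input As String) As String
        Dim words As String() = input.Split(" "c)
        Dim shortened As New List(Of String)
        For Each word As String In words
            If word.Length > 0 Then
                shortened.Add(word.Substring(1))
            Else
                ' Empty token caused by consecutive spaces: keep it unchanged.
                shortened.Add(word)
            End If
        Next
        Return String.Join(" ", shortened)
    End Function
End Module
```

Calling RemoveFirstLetters("Hello, how are you?") returns "ello, ow re ou?".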

Find all instances of a word in a string and display them in textbox (vb.net)

I have a string filled with the contents of a textbox (pretty large).
I want to search through it and display all occurrences of a word. In addition, I need the search result to include some characters before and after the search term, to give the context for the word.
The code below is part of a routine that takes keywords from a listbox one by one using For Each. It displays the first occurrence of a word together with the characters before and after it, and stops there. It also displays "No Match for: searched word" if the word is not found.
As stated in the subject of this question - I need it to search the whole string and display all matches for a particular word together with the surrounding characters.
Where = InStr(txtScrape.Text, Search)
If Where <> 0 Then
txtScrape.Focus()
txtScrape.SelectionStart = Where - 10
txtScrape.SelectionLength = Where + 50
Result = txtScrape.SelectedText
AllResults = AllResults + Result
Else
AllResults = AllResults + "No Match for: " & item
End If
I recommend splitting the string into sentences on punctuation characters such as , : ? . using String.Split(Char[]).
You can refer to the following code.
Public Class Form1
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
RichTextBox1.Text = ""
Dim Index As Integer
Dim longStr() As String
Dim str = TextBox3.Text
longStr = TextBox1.Text.Split(New Char() {CChar(":"), CChar(","), CChar("."), CChar("?"), CChar("!")})
Index = 0
For Each TheStr In longStr
If TheStr.Contains(str) Then
RichTextBox1.AppendText(longStr(Index) & vbCrLf)
End If
Index = Index + 1
Next
End Sub
Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
TextBox1.Text = "....."
End Sub
End Class
Try like this:
Dim ArrStr() As String
Dim Index As Integer
Dim TheStr As String
Dim MatchFound As Boolean
MatchFound = False
ArrStr = Split(txtScrape.text," ")
Index = 1
For Each TheStr In ArrStr
If TheStr = Search Then
Console.WriteLine(Index)
MatchFound = True
End If
Index = Index + 1
Next
Console.WriteLine(MatchFound)
The Console.WriteLine(Index) call inside the If statement prints the position of each matching word, and MatchFound is a Boolean that tells you whether any match was found.
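If the surrounding characters are needed (as the question asks), another option is to call String.IndexOf repeatedly and cut a window around each hit. The following is a rough sketch, not a drop-in replacement: ContextSearch, FindAllWithContext and the before/after parameters are names I made up, and the case-insensitive comparison is an assumption.

```vbnet
Imports System
Imports System.Collections.Generic

Module ContextSearch
    ' Hypothetical helper: returns one snippet per occurrence of search,
    ' including up to `before` characters ahead of it and `after` characters behind it.
    Function FindAllWithContext(text As String, search As String,
                                before As Integer, after As Integer) As List(Of String)
        Dim results As New List(Of String)
        If String.IsNullOrEmpty(search) Then Return results
        Dim pos As Integer = text.IndexOf(search, StringComparison.OrdinalIgnoreCase)
        While pos >= 0
            ' Clamp the window so it never runs past either end of the text.
            Dim start As Integer = Math.Max(0, pos - before)
            Dim finish As Integer = Math.Min(text.Length, pos + search.Length + after)
            results.Add(text.Substring(start, finish - start))
            ' Continue searching just after the current match.
            pos = text.IndexOf(search, pos + search.Length, StringComparison.OrdinalIgnoreCase)
        End While
        Return results
    End Function
End Module
```

With the question's numbers this would be FindAllWithContext(txtScrape.Text, Search, 10, 50). An empty result list corresponds to the "No Match for:" case.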

Using Functions in Visual Basic

The program I'm working on has two different functions, one that calculates the number of syllables in a text file, and another that calculates the readability of the text file based on the formula
206.835-85.6*(Number of Syllables/Number of Words)-1.015*(Number of Words/Number of Sentences)
Here are the problems I'm having:
I'm supposed to display the contents of the text file in a multi-line text box.
I'm supposed to display the answer I get from the function indexCalculation in a label below the text box.
I'm having trouble calling the function to actually have the program calculate the answer to be displayed in the label.
Here is the code I have so far.
Option Strict On
Imports System.IO
Public Class Form1
Private Sub ExitToolStripMenuItem_Click(sender As Object, e As EventArgs) Handles ExitToolStripMenuItem.Click
Me.Close()
End Sub
Private Sub OpenToolStripMenuItem_Click(sender As Object, e As EventArgs) Handles OpenToolStripMenuItem.Click
Dim open As New OpenFileDialog
open.Filter = "text files |project7.txt|All file |*.*"
open.InitialDirectory = Environment.GetFolderPath(Environment.SpecialFolder.DesktopDirectory)
If open.ShowDialog() = Windows.Forms.DialogResult.OK Then
Dim selectedFileName As String = System.IO.Path.GetFileName(open.FileName)
If selectedFileName.ToLower = "project7.txt" Then
Dim text As String = File.ReadAllText("Project7.txt")
Dim words = text.Split(" "c)
Dim wordCount As Integer = words.Length
Dim separators As Char() = {"."c, "!"c, "?"c, ":"c}
Dim sentences = text.Split(separators, StringSplitOptions.RemoveEmptyEntries)
Dim sentenceCount As Integer = sentences.Length
Dim vowelCount As Integer = 0
For Each word As String In words
vowelCount += CountSyllables(word)
Next
vowelCount = CountSyllables(text)
Label1.Show(indexCalculation(wordCount, sentenceCount, vowelCount))
Else
MessageBox.Show("You cannot use that file!")
End If
End If
End Sub
Function CountSyllables(word As String) As Integer
word = word.ToLower()
Dim dipthongs = {"oo", "ou", "ie", "oi", "ea", "ee", _
"eu", "ai", "ua", "ue", "au", "io"}
For Each dipthong In dipthongs
word = word.Replace(dipthong, dipthong(0))
Next
Dim vowels = "aeiou"
Dim vowelCount = 0
For Each c In word
If vowels.IndexOf(c) >= 0 Then vowelCount += 1
Next
If vowelCount = 0 Then
vowelCount = 1
End If
Return vowelCount
End Function
Function indexCalculation(ByRef wordCount As Integer, ByRef sentenceCount As Integer, ByRef vowelCount As Integer) As Integer
Dim answer As Integer = CInt(206.835 - 85.6 * (vowelCount / wordCount) - 1.015 * (wordCount / sentenceCount))
Return answer
End Function
End Class
Any suggestions would be greatly appreciated.
Here are my suggestions:
Update your indexCalculation function to take Integers, not Strings. That way you don't have to convert them to numbers.
Remove the extra variables you are not using. This will clean things up a bit.
Remove your StreamReader. You are already reading the text via File.ReadAllText.
Label1.Show(answer) should be changed to Label1.Show(indexCalculation(wordCount, sentenceCount, vowelCount)). Unless Label1 is something other than a regular Label, use Label1.Text = indexCalculation(wordCount, sentenceCount, vowelCount).ToString() instead (the .ToString() is needed because Option Strict is On).
Then for the vowelCount, you need to do the following:
Dim vowelCount as Integer = 0
For Each word as String in words
vowelCount += CountSyllables(word)
Next
Also, add logic to the CountSyllables function so that it returns 1 when the count is 0. If you don't want to include the last character in your vowel counting, use a For loop instead of a For Each loop and stop one character short.
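As a sketch of how the calculation itself might look once the counts are Integers: the version below passes the arguments by value and returns a Double so the fractional part is kept. Readability and FleschIndex are hypothetical names, and returning 0 for empty input is my assumption. The coefficient 85.6 is taken from the question as written; the published Flesch Reading Ease formula uses 84.6, so check which one your assignment expects.

```vbnet
Imports System

Module Readability
    ' Hypothetical helper implementing the formula quoted in the question.
    Function FleschIndex(wordCount As Integer, sentenceCount As Integer, syllableCount As Integer) As Double
        ' Assumption: avoid dividing by zero by reporting 0 for empty input.
        If wordCount = 0 OrElse sentenceCount = 0 Then Return 0.0
        ' In VB the / operator performs floating-point division even on Integers.
        Return 206.835 - 85.6 * (syllableCount / wordCount) - 1.015 * (wordCount / sentenceCount)
    End Function
End Module
```

The label could then be set with Label1.Text = FleschIndex(wordCount, sentenceCount, vowelCount).ToString("F1").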

Flesch Readability Index in Visual Basic

I'm working on a program that is supposed to perform the calculations for the Flesch Readability Index. The program is supposed to read in a text file "Project7.txt", it's then supposed to display the text in a multi-line text box and perform the following calculations:
Count the number of words in the file.
Count the number of syllables in the file.
Count the number of sentences in the file (a sentence can be ended by a ".", "?", "!", or ":").
The program is then supposed to plug the values into the following formula and display the result in a label (label1).
206.835-85.6*(Number of syllables/Number of words) - 1.015*(Number of words/Number of sentences)
Here is the code I have written so far.
Option Strict On
Imports System.IO
Public Class Form1
Private Sub ExitToolStripMenuItem_Click(sender As Object, e As EventArgs) Handles ExitToolStripMenuItem.Click
Me.Close()
End Sub
Private Sub OpenToolStripMenuItem_Click(sender As Object, e As EventArgs) Handles OpenToolStripMenuItem.Click
Dim open As New OpenFileDialog
open.Filter = "text files |project7.txt|All file |*.*"
open.InitialDirectory = Environment.GetFolderPath(Environment.SpecialFolder.DesktopDirectory)
If open.ShowDialog() = Windows.Forms.DialogResult.OK Then
Dim selectedFileName As String = System.IO.Path.GetFileName(open.FileName)
If selectedFileName.ToLower = "project7.txt" Then
Dim doc As String = ""
Dim line As String
Using reader As New StreamReader(open.OpenFile)
While Not reader.EndOfStream
doc += reader.ReadLine
Console.WriteLine(line)
End While
Dim text = File.ReadAllText("Project7.txt")
Dim words = text.Split(" "c)
Dim wordCount = words.Length
Dim separators As Char() = {"."c, "!"c, "?"c, ":"c}
Dim sentences = text.Split(separators, StringSplitOptions.RemoveEmptyEntries)
Dim sentenceCount = sentences.Length
End Using
Else
MessageBox.Show("You cannot use that file!")
End If
End If
End Sub
Function CountSyllables(word As String) As Integer
word = word.ToLower()
Dim dipthongs = {"oo", "ou", "ie", "oi", "ea", "ee", _
"eu", "ai", "ua", "ue", "au", "io"}
For Each dipthong In dipthongs
word = word.Replace(dipthong, dipthong(0))
Next
Dim vowels = "aeiou"
Dim vowelCount = 0
For Each c In word
If vowels.IndexOf(c) >= 0 Then vowelCount += 1
Next
Return vowelCount
End Function
End Class
Any suggestions are appreciated. Thanks in advance for the help.
Is the code always reporting one more sentence than there actually is?
If so take a look at this from the String.Split method MSDN docs:
When the Split function encounters two delimiters in a row, or a
delimiter at the beginning or end of the string, it interprets them as
surrounding an empty string ("")...
I'm sure your last sentence ends with your sentence delimiter, so your assignment to sentences is getting an extra, empty array element. See for yourself by setting a breakpoint on the line after the assignment and hovering your mouse over sentences to examine the contents of the array.
The fix is to call Split with the option to remove empty array values. To do that though you'll need to call the Split overload that takes an array of Char for the delimiters:
Replace this line:
Dim sentences = text.Split("."c, "!"c, "?"c, ":"c)
With this:
Dim separators As Char() = {"."c, "!"c, "?"c, ":"c}
Dim sentences = text.Split(separators, StringSplitOptions.RemoveEmptyEntries)
And you should be good.
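To illustrate the fix in isolation, here is a small sketch. SentenceCounter and CountSentences are hypothetical names. RemoveEmptyEntries only drops truly empty strings, so I have also added a Trim check to skip pieces that contain nothing but whitespace (for example a trailing space or line break after the final full stop); that extra check is my assumption, not part of the answer above.

```vbnet
Imports System

Module SentenceCounter
    ' Hypothetical helper: counts the non-blank pieces between sentence delimiters.
    Function CountSentences(text As String) As Integer
        Dim separators As Char() = {"."c, "!"c, "?"c, ":"c}
        Dim count As Integer = 0
        For Each piece As String In text.Split(separators, StringSplitOptions.RemoveEmptyEntries)
            ' Assumption: whitespace-only pieces are not sentences.
            If piece.Trim().Length > 0 Then count += 1
        Next
        Return count
    End Function
End Module
```

For "Hi there. How are you? Fine!" this returns 3, whereas a plain Split on the same delimiters yields 4 elements because of the empty string after the final "!".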

how to extract certain text from string

How do I filter/extract strings?
I have converted a PDF file into a String using iTextSharp and I display the text in RichTextBox1.
However, there is too much irrelevant text that I don't need in the RichTextBox.
Is there a way to display only the text I want, based on keywords, across the entire length of the text?
Example of the text that is displayed in RichTextBox1 after conversion of the PDF to text:
**774**
**Bos00232940
Bos00320491
Das1234
Das3216**
RAGE*
So the keywords would be "Bos", "Das", "774", and the new text displayed in RichTextBox1 would be as shown below, instead of the entire text above.
*Bos00232940
Bos00320491
Das1234
Das3216
774*
Here is what I have so far, but it doesn't work: it still displays the entire PDF in the RichTextBox.
Public Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
Dim pdffilename As String
pdffilename = TextBox1.Text
Dim filepath = "c:\temp\" & TextBox1.Text & ".pdf"
Dim thetext As String
thetext = GetTextFromPDF(filepath)
Dim lines() As String = System.Text.RegularExpressions.Regex.Split(thetext, Environment.NewLine)
Dim keywords As New List(Of String)
keywords.Add("Bos")
keywords.Add("Das")
keywords.Add("774")
Dim newTextLines As New List(Of String)
For Each line As String In lines
For Each keyw As String In thetext
If line.Contains(keyw) Then
newTextLines.Add(line)
Exit For
End If
Next
Next
RichTextBox1.Text = String.Join(Environment.NewLine, newTextLines.ToArray)
End Sub
SOLUTION
Thanks everyone for your help. Below is the code that worked and did exactly what I wanted it to do.
Public Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
Dim pdffilename As String
pdffilename = TextBox1.Text
Dim filepath = "c:\temp\" & TextBox1.Text & ".pdf"
Dim thetext As String
thetext = GetTextFromPDF(filepath)
Dim re As New Regex("[\t ](?<w>((774)|(Bos)|(Das))[a-z0-9]*)[\t ]", RegexOptions.ExplicitCapture Or RegexOptions.IgnoreCase Or RegexOptions.Compiled)
Dim Lines() As String = {thetext}
Dim words As New List(Of String)
For Each s As String In Lines
Dim mc As MatchCollection = re.Matches(s)
For Each m As Match In mc
words.Add(m.Groups("w").Value)
Next
Next
RichTextBox1.Text = String.Join(Environment.NewLine, words.ToArray)
End Sub
For Each Word As String In thetext.Split(" "c)
For Each key As String In keywords
If Word.StartsWith(key) Then
newTextLines.Add(Word)
Exit For ' Stop after the first matching keyword so the word is added only once.
End If
Next
Next
or using LINQ:
Dim q = From word In thetext.Split(" "c)
Where keywords.Any(Function(s) word.StartsWith(s))
Select word
RichTextBox1.Text = String.Join(Environment.NewLine, q.ToArray())
If you don't know the keywords in advance but know the context in which they occur, you can find them with a regular expression. Two very handy constructs let you find a pattern that follows or precedes another:
(?<=prefix)find finds a pattern that follows another.
find(?=suffix) finds a pattern that comes before another.
If your number keyword (774) always precedes " SIZE" you can find it like this: \w+(?=\sSIZE).
If the other keywords are always between "EX " and " DETAILS" you can find them like this: (?<=EX\s)(\w+\s)+(?=DETAILS).
You can put the whole thing together like this: \w+(?=\sSIZE)|(?<=EX\s)(\w+\s)+(?=DETAILS).
The disadvantage is that the keywords between "EX " and "DETAILS" will be returned as one match. But you can split the matches afterwards as in:
Const input As String = "2 3 3 4 4 A A B B SHEET 1 OF 1 774 SIZE SCALE 24.000-47.999 12.000-23.999 CON BAG WIRE 90in. EX Bos00232940 Bos00320491 Das1234 Das3216 DETAILS 1 2 RAGE"
Dim matches = Regex.Matches(input, "\w+(?=\sSIZE)|(?<=EX\s)(\w+\s)+(?=DETAILS)")
For Each m As Match In matches
Dim words = m.Value.Split(" "c)
For Each word As String In words
If word.Length > 0 Then ' Suppress the last empty word.
Console.WriteLine(word)
End If
Next
Next
Output:
774
Bos00232940
Bos00320491
Das1234
Das3216
How to do it with regular expression...
Dim re As New Regex("[\t ](?<w>((774)|(Bos)|(Das))[a-z0-9]*)[\t ]", RegexOptions.ExplicitCapture Or RegexOptions.IgnoreCase Or RegexOptions.Compiled)
Private Sub test()
Dim Lines() As String = {"2 3 3 4 4 A A B B SHEET 1 OF 1 774 SIZE SCALE 24.000-47.999 12.000-23.999 CON BAG WIRE 90in. EX Bos00232940 Bos00320491 Das1234 Das3216 DETAILS 1 2 RAGE"}
Dim words As New List(Of String)
For Each s As String In Lines
Dim mc As MatchCollection = re.Matches(s)
For Each m As Match In mc
words.Add(m.Groups("w").Value)
Next
Next
End Sub
Regex breakdown...
[\t ] Single tab or space (\s is an alternative that matches any whitespace)
(?<w> Start of the capture group named "w". This is the text returned later via m.Groups("w")
((774)|(Bos)|(Das)) One of the 3 keyword prefixes
[a-z0-9]* Any a-z or 0-9 character; * means any number of them
) End of capture group "w" from above
[\t ] Single tab or space after the word
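One caveat with this pattern, as far as I can tell: the leading and trailing [\t ] are part of each match, so when two keywords are separated by a single space, the space is consumed by the first match and the second keyword appears to be skipped. On the sample line above that would leave out Bos00320491 and Das3216. A possible variant, sketched below, uses the zero-width word boundary \b instead. KeywordRegex and ExtractByPrefix are made-up names, and this is an alternative of my own rather than the code from the accepted solution.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module KeywordRegex
    ' Hypothetical helper: returns every word that starts with 774, Bos or Das.
    Function ExtractByPrefix(text As String) As List(Of String)
        ' \b matches a word boundary without consuming any character,
        ' so neighbouring words separated by one space can both match.
        Dim re As New Regex("\b(?:774|Bos|Das)[a-z0-9]*\b", RegexOptions.IgnoreCase)
        Dim words As New List(Of String)
        For Each m As Match In re.Matches(text)
            words.Add(m.Value)
        Next
        Return words
    End Function
End Module
```

Note that \b also matches after punctuation, so a number such as 12.774 would produce a match for 774. If that matters for your PDF text, a stricter lookbehind and lookahead on whitespace may be a better fit.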