Need To Extract String After '#' & Before ' ' (Blank Space) Multiple Times - vb.net

I'm working on extracting twitter/facebook style mentions from a textbox.
So far, here's my code:
Dim a As String = TextBox1.Text + " "
Dim b As Char() = a.ToCharArray
Dim c As String
Dim l As Integer = TextBox1.Text.Length
Dim temp As Integer = 0
Dim nex As Integer = a.IndexOf(" ")
For i = 0 To l - 1
If b(i) = "#" Then
temp = 1
ElseIf temp = 1 Then
temp = 2
End If
If temp = 2 Then
c = a.Substring(i, nex).Trim() 'nex needs be replaced with next space on 2nd or nth loop
MsgBox(c)
temp = 0
nex = a.IndexOf(" ") + nex
End If
Next
Now this works great if the entered text is- #one #twwo #three. (If
the next strings are greater in length.) But doesn't work elsewhere.
Also, there is probably going to be content between two #mentions so I'm not willing to change the b(i).
I'm sure there's a much more efficient way to do this.
Thanks!!

This is a job for regex. The pattern #\w+ should do nicely.
For Each m As Match In Regex.Matches("#testing this is a #test", "#\w+")
Console.WriteLine(m.Value)
Next
Will print out #testing and #test.
This pattern basically means "Find everything that starts with an '#' followed by one or more 'word characters'."
Regex is a very powerful tool for searching strings, you can read more about it on MSDN.

Related

Word VBA: iterating through characters incredibly slow

I have a macro that changes single quotes in front of a number to an apostrophe (or close single curly quote). Typically when you type something like "the '80s" in word, the apostrophe in front of the "8" faces the wrong way. The macro below works, but it is incredibly slow (like 10 seconds per page). In a regular language (even an interpreted one), this would be a fast procedure. Any insights why it takes so long in VBA on Word 2007? Or if someone has some find+replace skills that can do this without iterating, please let me know.
Sub FixNumericalReverseQuotes()
Dim char As Range
Debug.Print "starting " + CStr(Now)
With Selection
total = .Characters.Count
' Will be looking ahead one character, so we need at least 2 in the selection
If total < 2 Then
Return
End If
For x = 1 To total - 1
a_code = Asc(.Characters(x))
b_code = Asc(.Characters(x + 1))
' We want to convert a single quote in front of a number to an apostrophe
' Trying to use all numerical comparisons to speed this up
If (a_code = 145 Or a_code = 39) And b_code >= 48 And b_code <= 57 Then
.Characters(x) = Chr(146)
End If
Next x
End With
Debug.Print "ending " + CStr(Now)
End Sub
Beside two specified (Why...? and How to do without...?) there is an implied question – how to do proper iteration through Word object collection.
Answer is – to use obj.Next property rather than access by index.
That is, instead of:
For i = 1 to ActiveDocument.Characters.Count
'Do something with ActiveDocument.Characters(i), e.g.:
Debug.Pring ActiveDocument.Characters(i).Text
Next
one should use:
Dim ch as Range: Set ch = ActiveDocument.Characters(1)
Do
'Do something with ch, e.g.:
Debug.Print ch.Text
Set ch = ch.Next 'Note iterating
Loop Until ch is Nothing
Timing: 00:03:30 vs. 00:00:06, more than 3 minutes vs. 6 seconds.
Found on Google, link lost, sorry. Confirmed by personal exploration.
Modified version of #Comintern's "Array method":
Sub FixNumericalReverseQuotes()
Dim chars() As Byte
chars = StrConv(Selection.Text, vbFromUnicode)
Dim pos As Long
For pos = 0 To UBound(chars) - 1
If (chars(pos) = 145 Or chars(pos) = 39) _
And (chars(pos + 1) >= 48 And chars(pos + 1) <= 57) Then
' Make the change directly in the selection so track changes is sensible.
' I have to use 213 instead of 146 for reasons I don't understand--
' probably has to do with encoding on Mac, but anyway, this shows the change.
Selection.Characters(pos + 1) = Chr(213)
End If
Next pos
End Sub
Maybe this?
Sub FixNumQuotes()
Dim MyArr As Variant, MyString As String, X As Long, Z As Long
Debug.Print "starting " + CStr(Now)
For Z = 145 To 146
MyArr = Split(Selection.Text, Chr(Z))
For X = LBound(MyArr) To UBound(MyArr)
If IsNumeric(Left(MyArr(X), 1)) Then MyArr(X) = "'" & MyArr(X)
Next
MyString = Join(MyArr, Chr(Z))
Selection.Text = MyString
Next
Selection.Text = Replace(Replace(Selection.Text, Chr(146) & "'", "'"), Chr(145) & "'", "'")
Debug.Print "ending " + CStr(Now)
End Sub
I am not 100% sure on your criteria, I have made both an open and close single quote a ' but you can change that quite easily if you want.
It splits the string to an array on chr(145), checks the first char of each element for a numeric and prefixes it with a single quote if found.
Then it joins the array back to a string on chr(145) then repeats the whole things for chr(146). Finally it looks through the string for an occurence of a single quote AND either of those curled quotes next to each other (because that has to be something we just created) and replaces them with just the single quote we want. This leaves any occurence not next to a number intact.
This final replacement part is the bit you would change if you want something other than ' as the character.
I have been struggling with this for days now. My attempted solution was to use a regular expression on document.text. Then, using the matches in a document.range(start,end), replace the text. This preserves formatting.
The problem is that the start and end in the range do not match the index into text. I think I have found the discrepancy - hidden in the range are field codes (in my case they were hyperlinks). In addition, document.text has a bunch of BEL codes that are easy to strip out. If you loop through a range using the character method, append the characters to a string and print it you will see the field codes that don't show up if you use the .text method.
Amazingly you can get the field codes in document.text if you turn on "show field codes" in one of a number of ways. Unfortunately, that version is not exactly the same as what the range/characters shows - the document.text has just the field code, the range/characters has the field code and the field value. Therefore you can never get the character indices to match.
I have a working version where instead of using range(start,end), I do something like:
Set matchRange = doc.Range.Characters(myMatches(j).FirstIndex + 1)
matchRange.Collapse (wdCollapseStart)
Call matchRange.MoveEnd(WdUnits.wdCharacter, myMatches(j).Length)
matchRange.text = Replacement
As I say, this works but the first statement is dreadfully slow - it appears that Word is iterating through all of the characters to get to the correct point. In doing so, it doesn't seem to count the field codes, so we get to the correct point.
Bottom line, I have not been able to come up with a good way to match the indexing of the document.text string to an equivalent range(start,end) that is not a performance disaster.
Ideas welcome, and thanks.
This is a problem begging for regular expressions. Resolving the .Characters calls that many times is probably what is killing you in performance.
I'd do something like this:
Public Sub FixNumericalReverseQuotesFast()
Dim expression As RegExp
Set expression = New RegExp
Dim buffer As String
buffer = Selection.Range.Text
expression.Global = True
expression.MultiLine = True
expression.Pattern = "[" & Chr$(145) & Chr$(39) & "]\d"
Dim matches As MatchCollection
Set matches = expression.Execute(buffer)
Dim found As Match
For Each found In matches
buffer = Replace(buffer, found, Chr$(146) & Right$(found, 1))
Next
Selection.Range.Text = buffer
End Sub
NOTE: Requires a reference to Microsoft VBScript Regular Expressions 5.5 (or late binding).
EDIT:
The solution without using the Regular Expressions library is still avoiding working with Ranges. This can easily be converted to working with a byte array instead:
Sub FixNumericalReverseQuotes()
Dim chars() As Byte
chars = StrConv(Selection.Text, vbFromUnicode)
Dim pos As Long
For pos = 0 To UBound(chars) - 1
If (chars(pos) = 145 Or chars(pos) = 39) _
And (chars(pos + 1) >= 48 And chars(pos + 1) <= 57) Then
chars(pos) = 146
End If
Next pos
Selection.Text = StrConv(chars, vbUnicode)
End Sub
Benchmarks (100 iterations, 3 pages of text with 100 "hits" per page):
Regex method: 1.4375 seconds
Array method: 2.765625 seconds
OP method: (Ended task after 23 minutes)
About half as fast as the Regex, but still roughly 10ms per page.
EDIT 2: Apparently the methods above are not format safe, so method 3:
Sub FixNumericalReverseQuotesVThree()
Dim full_text As Range
Dim cached As Long
Set full_text = ActiveDocument.Range
full_text.Find.ClearFormatting
full_text.Find.MatchWildcards = True
cached = full_text.End
Do While full_text.Find.Execute("[" & Chr$(145) & Chr$(39) & "][0-9]")
full_text.End = full_text.Start + 2
full_text.Characters(1) = Chr$(96)
full_text.Start = full_text.Start + 1
full_text.End = cached
Loop
End Sub
Again, slower than both the above methods, but still runs reasonably fast (on the order of ms).

how to find the number of occurrences of a substring within a string vb.net

I have a string (for example: "Hello there. My name is John. I work very hard. Hello there!") and I am trying to find the number of occurrences of the string "hello there". So far, this is the code I have:
Dim input as String = "Hello there. My name is John. I work very hard. Hello there!"
Dim phrase as String = "hello there"
Dim Occurrences As Integer = 0
If input.toLower.Contains(phrase) = True Then
Occurrences = input.Split(phrase).Length
'REM: Do stuff
End If
Unfortunately, what this line of code seems to do is split the string every time it sees the first letter of phrase, in this case, h. So instead of the result Occurrences = 2 that I would hope for, I actually get a much larger number. I know that counting the number of splits in a string is a horrible way to go about doing this, even if I did get the correct answer, so could someone please help me out and provide some assistance?
Yet another idea:
Dim input As String = "Hello there. My name is John. I work very hard. Hello there!"
Dim phrase As String = "Hello there"
Dim Occurrences As Integer = (input.Length - input.Replace(phrase, String.Empty).Length) / phrase.Length
You just need to make sure that phrase.Length > 0.
the best way to do it is this:
Public Function countString(ByVal inputString As String, ByVal stringToBeSearchedInsideTheInputString as String) As Integer
Return System.Text.RegularExpressions.Regex.Split(inputString, stringToBeSearchedInsideTheInputString).Length -1
End Function
str="Thisissumlivinginsumgjhvgsum in the sum bcoz sum ot ih sum"
b= LCase(str)
array1=Split(b,"sum")
l=Ubound(array1)
msgbox l
the output gives u the no. of occurences of a string within another one.
You can create a Do Until loop that stops once an integer variable equals the length of the string you're checking. If the phrase exists, increment your occurences and add the length of the phrase plus the position in which it is found to the cursor variable. If the phrase can not be found, you are done searching (no more results), so set it to the length of the target string. To not count the same occurance more than once, check only from the cursor to the length of the target string in the Loop (strCheckThisString).
Dim input As String = "hello there. this is a test. hello there hello there!"
Dim phrase As String = "hello there"
Dim Occurrences As Integer = 0
Dim intCursor As Integer = 0
Do Until intCursor >= input.Length
Dim strCheckThisString As String = Mid(LCase(input), intCursor + 1, (Len(input) - intCursor))
Dim intPlaceOfPhrase As Integer = InStr(strCheckThisString, phrase)
If intPlaceOfPhrase > 0 Then
Occurrences += 1
intCursor += (intPlaceOfPhrase + Len(phrase) - 1)
Else
intCursor = input.Length
End If
Loop
You just have to change the input of the split function into a string array and then delare the StringSplitOptions.
Try out this line of code:
Occurrences = input.Split({phrase}, StringSplitOptions.None).Length
I haven't checked this, but I'm thinking you'll also have to account for the fact that occurrences would be too high due to the fact that you're splitting using your string and not actually counting how many times it is in the string, so I think Occurrences = Occurrences - 1
Hope this helps
You could create a recursive function using IndexOf. Passing the string to be searched and the string to locate, each recursion increments a Counter and sets the StartIndex to +1 the last found index, until the search string is no longer found. Function will require optional parameters Starting Position and Counter passed by reference:
Function InStrCount(ByVal SourceString As String, _
ByVal SearchString As String, _
Optional ByRef StartPos As Integer = 0, _
Optional ByRef Count As Integer = 0) As Integer
If SourceString.IndexOf(SearchString, StartPos) > -1 Then
Count += 1
InStrCount(SourceString, _
SearchString, _
SourceString.IndexOf(SearchString, StartPos) + 1, _
Count)
End If
Return Count
End Function
Call function by passing string to search and string to locate and, optionally, start position:
Dim input As String = "Hello there. My name is John. I work very hard. Hello there!"
Dim phrase As String = "hello there"
Dim Occurrences As Integer
Occurrances = InStrCount(input.ToLower, phrase.ToLower)
Note the use of .ToLower, which is used to ignore case in your comparison. Do not include this directive if you do wish comparison to be case specific.
One more solution based on InStr(i, str, substr) function (searching substr in str starting from i position, more info about InStr()):
Function findOccurancesCount(baseString, subString)
occurancesCount = 0
i = 1
Do
foundPosition = InStr(i, baseString, subString) 'searching from i position
If foundPosition > 0 Then 'substring is found at foundPosition index
occurancesCount = occurancesCount + 1 'count this occurance
i = foundPosition + 1 'searching from i+1 on the next cycle
End If
Loop While foundPosition <> 0
findOccurancesCount = occurancesCount
End Function
As soon as there is no substring found (InStr returns 0, instead of found substring position in base string), searching is over and occurances count is returned.
Looking at your original attempt, I have found that this should do the trick as "Split" creates an array.
Occurrences = input.split(phrase).ubound
This is CaSe sensitive, so in your case the phrase should equal "Hello there", as there is no "hello there" in the input
Expanding on Sumit Kumar's simple solution, here it is as a one-line working function:
Public Function fnStrCnt(ByVal str As String, ByVal substr As String) As Integer
fnStrCnt = UBound(Split(LCase(str), substr))
End Function
Demo:
Sub testit()
Dim thePhrase
thePhrase = "Once upon a midnight dreary while a man was in a house in the usa."
If fnStrCnt(thePhrase, " a ") > 1 Then
MsgBox "Found " & fnStrCnt(thePhrase, " a ") & " occurrences."
End If
End Sub 'testit()
I don't know if this is more obvious?
Starting from the beginning of longString check the next characters up to the number characters in phrase, if phrase is not found start looking from the second character etc. If it is found start agin from the current position plus the number of characters in phrase and increment the value of occurences
Module Module1
Sub Main()
Dim longString As String = "Hello there. My name is John. I work very hard. Hello there! Hello therehello there"
Dim phrase As String = "hello There"
Dim occurences As Integer = 0
Dim n As Integer = 0
Do Until n >= longString.Length - (phrase.Length - 1)
If longString.ToLower.Substring(n, phrase.Length).Contains(phrase.ToLower) Then
occurences += 1
n = n + (phrase.Length - 1)
End If
n += 1
Loop
Console.WriteLine(occurences)
End Sub
End Module
I used this in Vbscript, You can convert the same to VB.net as well
Dim str, strToFind
str = "sdfsdf:sdsdgs::"
strToFind = ":"
MsgBox GetNoOfOccurranceOf( strToFind, str)
Function GetNoOfOccurranceOf(ByVal subStringToFind As String, ByVal strReference As String)
Dim iTotalLength, newString, iTotalOccCount
iTotalLength = Len(strReference)
newString = Replace(strReference, subStringToFind, "")
iTotalOccCount = iTotalLength - Len(newString)
GetNoOfOccurranceOf = iTotalOccCount
End Function
I know this thread is really old, but I got another solution too:
Function countOccurencesOf(needle As String, s As String)
Dim count As Integer = 0
For i As Integer = 0 to s.Length - 1
If s.Substring(i).Startswith(needle) Then
count = count + 1
End If
Next
Return count
End Function

Help Visual Basic mixing characters

I'm making an application that will change position of two characters in Word.
Imports System.IO
Module Module1
Sub Main()
Dim str As String = File.ReadAllText("File.txt")
Dim str2 As String() = Split(str, " ")
For i As Integer = 0 To str2.Length - 1
Dim arr As Char() = CType(str2(i), Char())
For ia As Integer = 0 To arr.Length() - 1 Step 2
Dim pa As String
pa = arr(ia + 1)
arr(ia + 1) = arr(ia)
arr(ia) = pa
Next ia
For ib As Integer = 0 To arr.Length - 1
Console.Write(arr(ib))
File.WriteAllText("File2.txt", arr(ib))
Next ib
File.WriteAllText("File2.txt", " ")
Console.Write(" ")
Next i
Console.Read()
End Sub
End Module
For example:
Input: ab
Output: ba
Input: asdasd asdasd
Output: saadds saadds
Program works good, it is mixing characters good, but it doesn't write text to the file. It will write text in console, but not in file.
Note: Program is working only with words that are divisible by 2, but it's not a problem.
Also, it does not return any error message.
Your code is overwriting the file that you have already written with a single space (" ") each time round.
You should only open the file once, and append to it using a stream writer:
Using output = File.CreateText("file2.txt")
' Put the for loop here.
End Using
There are some other things wrong with your code. Firstly, use For Each instead of For, this makes your code much more simple and readable. Secondly, try to avoid For loops altogether where possible. For instance, instead of iterating over the characters to output them one at a time, just create a new string from the char array, and write that:
Dim shuffledWord As New String(arr)
output.Write(shuffledWord)
Some of your types are plain wrong, i.e. you are using String in places instead of Char. You should always use Option Strict On. Then the compiler will not tolerate such code.
You should also prefer to use framework methods over VB-specific methods. This makes it easier to understand for C# programmers, and also makes it easier to translate and change (that is, use the Split method of strings instead of a free function, use ToCharArray instead of a cast to Char() …).
Finally, use meaningful variable names. str, str2 and arr are particularly cryptic because they don’t tell the reader of the code anything of interest about the variables.
Sub Main()
Dim text As String = File.ReadAllText("File.txt")
Dim words As String() = str.Split(" "c)
Using output = File.CreateText("file2.txt")
For Each word In words
dim wordChars = word.ToCharArray()
For i As Integer = 0 To wordChars.Length - 1 Step 2
Dim tmp As Char = wordChars(i + 1)
wordChars(i + 1) = wordChars(i)
arr(i) = tmp
Next
Dim shuffledWord As New String(wordChars)
output.Write(shuffledWord + " ")
Console.Write(huffledWord + " ")
Next
End Using
Console.Read()
End Sub

Extract characters from a long string and reformat the output to CSV by using keywords with VB.net

I am new to VB.Net 2008. I have a task to resolve, it is regading extracting characters from a long string to the console, the extracted text shall be reformatted and saved into a CSV file. The string comes out of a database.
It looks something like: UNH+RAM6957+ORDERS:D:96A:UN:EGC103'BGM+38G::ZEW+REQEST6957+9'DTM+Z05:0:805'DTM+137:20100930154
The values are seperated by '.
I can query the database and display the string on the console, but now I need to extract the
Keyword 'ORDERS' for example, and lets say it's following 5 Characters. So the output should look like: ORDERS:D:96A then I need to extract the keyword 'BGM' and its following five characters so the output should look like: BGM+38G:
After extracting all the keywords, the result should be comma seperated and look like:
ORDERS:D:96A,BGM+38G: it should be saved into a CSV file automatically.
I tried already:
'Lookup for containing KeyWords
Dim FoundPosition1 = p_EDI.Contains("ORDERS")
Console.WriteLine(FoundPosition1)
Which gives the starting position of the Keyword.
I tried to trim the whole thing around the keyword "DTM". The EDI variable holds the entire string from the Database:
Dim FoundPosition2 = EDI
FoundPosition2 = Trim(Mid(EDI, InStr(EDI, "DTM")))
Console.WriteLine(FoundPosition2)
Can someone help please?
Thank you in advance!
To illustrate the steps involved:
' Find the position where ORDERS is in the string.'
Dim foundPosition = EDI.IndexOf("ORDERS")
' Start at that position and extract ORDERS + 5 characters = 11 characters in total.'
Dim ordersData = EDI.SubString(foundPosition, 11)
' Find the position where BGM is in the string.'
Dim foundPosition2 = EDI.IndexOf("BGM")
' Start at that position and extract BGM + 5 characters = 8 characters in total.'
Dim bgmData = EDI.SubString(foundPosition2, 8)
' Construct the CVS data.'
Dim cvsData = ordersData & "," & bgmData
I don't have my IDE here, but something like this will work:
dim EDI as string = "UNH+RAM6957+ORDERS:D:96A:UN:EGC103'BGM+38G::ZEW+REQEST6957+9'DTM+Z05:0:805'DTM+137:20100930154"
dim result as string = KeywordPlus(EDI, "ORDER", 5) + "," _
+ KeywordPlus(EDI, "BGM", 5)
function KeywordPlus(s as string, keyword as string, length as integer) as string
dim index as integer = s.IndexOf(keyword)
if index = -1 then return ""
return s.substring(index, keyword.length + length)
end function
for the interrested people among us, I have put the code together, and created
a CSV file out of it. Maybe it can be helpful to others...
If EDI.Contains("LOC") Then
Dim foundPosition1 = EDI.IndexOf("LOC")
' Start at that position and extract ORDERS + 5 characters = 11 characters in total.'
Dim locData = EDI.Substring(foundPosition1, 11)
'Console.WriteLine(locData)
Dim FoundPosition2 = EDI.IndexOf("QTY")
Dim qtyData = EDI.Substring(FoundPosition2, 11)
'Console.WriteLine(qtyData)
' Construct the CSV data.
Dim csvData = locData & "," & qtyData
'Console.WriteLine(csvData)
' Creating the CSV File.
Dim csvFile As String = My.Application.Info.DirectoryPath & "\Test.csv"
Dim outFile As IO.StreamWriter = My.Computer.FileSystem.OpenTextFileWriter(csvFile, True)
outFile.WriteLine(csvData)
outFile.Close()
Console.WriteLine(My.Computer.FileSystem.ReadAllText(csvFile))
End IF
Have fun!

Cant figure out how to merge variables vb.net

I am creating a for each loop to take the words from a string and place them each into a text box. The program allows for up to "9" variables What I am trying to attempt is.
Foreach word in Words
i = i +1
Varible & i = word
txtCritical1.Text = variable & i
any ideas on a way to make this work?
Have a look through the MSDN article on For Each. It includes a sample using strings.
https://msdn.microsoft.com/en-us/library/5ebk1751.aspx
So i went with a simple if statement this does the trick. Each text box is filled in.
Dim details As String = EditEvent.EditDbTable.Rows(0).Item(13).ToString()
Dim words As String() = details.Split(New Char() {"«"})
Dim word As String
For Each word In words
i = i + 1
v = word
If i = 1 Then
txtCritical1.Text = v
ElseIf i = 2 Then
txtCritical2.Text = v
ElseIf ....
ElseIf i = 9 then
txtCritical2.text = v
Else
....
End If
Next