How to split a string in VBA by more than one character - vba

In C# one can easily split a split string by more than one character, one supplies an array of split characters. I was wondering what is best way to achieve this in VBA. I use VBA.Split typically but to split on more than one characters requires drilling in to the results and sub-splitting the elements. Then one has to re-dimension arrays etc. Quite painful.
Contraints
VBA responses only please. You may use .NET collection classes if you wish (yes they are creatable and callable in VBA). You may use JSON, XML as vessels for the list of split segments if you wish. You may use the humble VBA.Collection class if you wish, or even a Scripting.Dictionary. You may use even a fabricated recordset if you wish.
I know full well one can write a .NET asssembly to call the .NET String.Split method and expose assembly to VBA with COM interfaces but where is the challenge in that.

This should be fairly easy to do with a regular expression. If you match on the negation of the passed characters to split on, the matches will be the members of the output array. The upside to doing this is that the output array only needs to be sized once because you can get a count of the matches returned by the RegExp. The pattern is fairly simple to build - it boils down to something like [^abc]+ where 'a', 'b', and 'c' are the characters to split on. About the only thing that you need to do to prepare the expression is to escape a couple characters that have special meaning in that context inside a regular expression (I probably forgot some):
Private Function BuildRegexPattern(ByVal inputString As String) As String
Dim escapeTargets() As String
escapeTargets = VBA.Split("- ^ \ ]")
Dim returnValue As String
returnValue = inputString
Dim idx As Long
For idx = LBound(escapeTargets) To UBound(escapeTargets)
returnValue = Replace$(returnValue, escapeTargets(idx), "\" & escapeTargets(idx))
Next
BuildRegexPattern = "[^" & returnValue & "]+"
End Function
Once you have the pattern, it's just a simple matter of sizing the array and iterating over the matches to assign them (plus some other special case handling, etc.):
Public Function MultiSplit(ByVal toSplit As String, Optional ByVal delimiters As String = " ") As String()
Dim returnValue() As String
If toSplit = vbNullString Then
returnValue = VBA.Split(vbNullString)
Else
With New RegExp
.Pattern = BuildRegexPattern(IIf(delimiters = vbNullString, " ", delimiters))
.MultiLine = True
.Global = True
If Not .Test(toSplit) Then
'Only delimiters.
ReDim returnValue(Len(toSplit) - 1)
Else
Dim matches As Object
Set matches = .Execute(toSplit)
ReDim returnValue(matches.Count - 1)
Dim idx As Long
For idx = LBound(returnValue) To UBound(returnValue)
returnValue(idx) = matches(idx)
Next
End If
End With
End If
MultiSplit = returnValue
End Function

In my attempt, I replace all the other characters with space before splitting on space. (So I cheat a little.)
Private Function SplitByMoreThanOneChars(ByVal sLine As String)
'*
'* Brought to you by the Excel Development Platform Blog
'* http://exceldevelopmentplatform.blogspot.com/2018/11/
'*
'* Don't get excited, this splits by spaces only
'* we fake splitting by multiple characters by replacing those characters
'* with spaces
'*
Dim vChars2 As Variant
vChars2 = Array(" ", "<", ">", "[", "]", "(", ")", ";")
Dim sLine2 As String
sLine2 = sLine
Dim lCharLoop As Long
For lCharLoop = LBound(vChars2) To UBound(vChars2)
Debug.Assert Len(vChars2(lCharLoop)) = 1
sLine2 = VBA.Replace(sLine2, vChars2(lCharLoop), " ")
Next
SplitByMoreThanOneChars = VBA.Split(sLine2)
End Function

Related

Look for multiple words in string

Need to build a VBA function that simulates the REGEXP_INSTR or REGEXP_LIKE in Oracle. Those make possible to look for words in string without having to loop word by word.
I've found this code, that find Names that starts with "Mr|Mrs|Ms|Dr", that meant to be used like:
Function StringStarts(strCheck As String, options As String) As Boolean
With CreateObject("VBScript.RegExp")
.IgnoreCase = True
.Pattern = "^(" & options & ")\.*\b"
StringStarts = .Test(strCheck)
End With
End Function
Debug.print StringStarts("Dr leopoldo malmeida", "Mr|Mrs|Ms|Dr")
In fact, I need help to find if it's possible to alter this function, in order to find any word, or parts of words (case-insensitive matching), in the pattern on any location of the string to search. For example:
Debug.print StringStarts("Looking for multiple words", "Word|like|for")
That should return true: "Word found in 'words'"; "for was a complete match in string".
Most probably RegEx is not needed, because it is a bit slow. A simple for-each loop in a function would be ok:
Public Function PatternPresent(testedString As String, _
Optional pattern As String = "Mr|Mrs|Ms|Dr") As Boolean
Const separator = "|"
Dim patterns As Variant: patterns = Split(pattern, separator)
Dim myVar As Variant
For Each myVar In patterns
If InStr(1, testedString, myVar) Then
PatternPresent = True
Exit Function
End If
Next myVar
End Function

CountWord and CountVowel into Function in VB.NET

Hi I need to change WordCount and CountVowel procedures to functions and create a function to count number of consonants in a string.
I have done these two procedures but I cannot figure out how to do the last part. I am fairly new to programming.
My current code is given below:
Sub Main()
Dim Sentence As String
Console.WriteLine("Sentence Analysis" + vbNewLine + "")
Console.WriteLine("Enter a sentence, then press 'Enter'" + vbNewLine + "")
Sentence = Console.ReadLine()
Console.WriteLine("")
Call WordCount(Sentence)
Call VowelCount(Sentence)
Console.ReadLine()
End Sub
Sub WordCount(ByVal UserInput As String)
Dim Space As String() = UserInput.Split(" ")
Console.WriteLine("There are {0} words", Space.Length)
End Sub
Sub VowelCount(ByVal UserInput As String)
Dim i As Integer
Dim VowelNumber As Integer
Dim Vowels As String = "aeiou"
For i = 1 To Len(UserInput)
If InStr(Vowels, Mid(UserInput, i, 1)) Then
VowelNumber = VowelNumber + 1
End If
Next
Console.WriteLine("There are {0} vowels", VowelNumber)
End Sub
Thanks for your time
I would use the following three functions. Note that WordCount uses RemoveEmptyEntries avoids counting empty words when there are multiple spaces between words.
The other two functions count upper case vowels as vowels, rather than just lower case. They take advantage of the fact that strings can be treated as arrays of Char, and use the Count method to count how many of those Chars meet certain criteria.
Note that the designation of "AEIOU" as vowels may not be correct in all languages, and even in English "Y" is sometimes considered a vowel. You might also need to consider the possibility of accented letters such as "É".
Function WordCount(UserInput As String) As Integer
Return UserInput.Split({" "c}, StringSplitOptions.RemoveEmptyEntries).Length
End Function
Function VowelCount(UserInput As String) As Integer
Return UserInput.Count(Function(c) "aeiouAEIOU".Contains(c))
End Function
Function ConsonantCount(UserInput As String) As Integer
Return UserInput.Count(Function(c) Char.IsLetter(c) And Not "aeiouAEIOU".Contains(c))
End Function
To turn each of your Sub routines into a Function, you need to do three things. First, you need to change the Sub and End Sub keywords to Function and End Function, respectively. So:
Sub MyMethod(input As String)
' ...
End Sub
Becomes:
Function MyMethod(input As String)
' ...
End Function
Next, since it's a function, it needs to return a value, so your Function declaration needs to specify the type of the return value. So, the above example would become:
Function MyMethod(input As String) As Integer
' ...
End Function
Finally, the code in the function must actually specify what the return value will be. In VB.NET, that is accomplished by using the Return keyword, like this:
Function MyMethod(input As String) As Integer
Dim result As Integer
' ...
Return result
End Function
So, to apply that to your example:
Sub WordCount(ByVal UserInput As String)
Dim Space As String() = UserInput.Split(" ")
Console.WriteLine("There are {0} words", Space.Length)
End Sub
Would become:
Function WordCount(userInput As String) As Integer
Dim Space As String() = UserInput.Split(" ")
Return Space.Length
End Sub
Note, ByVal is the default, so you don't need to specify it, and parameter variables, by standard convention in .NET are supposed to be camelCase rather than PascalCase. Then, when you call the method, you can use the return value of the function like this:
Dim count As Integer = WordCount(Sentence)
Console.WriteLine("There are {0} words", count)
As far as counting consonants goes, that will be very similar to your VowelCount method, except that you would give it the list of consonants to look for instead of vowels.
You could use the Regex class. It's designed to search for substrings using patterns, and it's rather fast at it too.
Sub VowelCount(ByVal UserInput As String)
Console.WriteLine("There are {0} vowels", System.Text.RegularExpressions.Regex.Matches(UserInput, "[aeiou]", System.Text.RegularExpressions.RegexOptions.IgnoreCase).Count.ToString())
End Sub
[aeiou] is the pattern used when performing the search. It matches any of the characters you've written inside the brackets.
Example:
http://ideone.com/LEYC30
Read more about Regex:
MSDN - .NET Framework Regular Expressions
MSDN - Regular Expression Language - Quick Reference
VB is no longer a language I use frequently but I don't think I'm going to steer you wrong even without testing this out.
Sub Main()
Dim Sentence As String
Console.WriteLine("Sentence Analysis" + vbNewLine + "")
Console.WriteLine("Enter a sentence, then press 'Enter'" + vbNewLine + "")
Sentence = Console.ReadLine()
Console.WriteLine("")
'usually it's better just let the function calculate a value and do output elsewhere
'so I've commented your original calls so you can see where they used to be
'Call WordCount(Sentence)
Console.WriteLine("There are {0} words", WordCount(Sentence))
'Call VowelCount(Sentence)
Console.WriteLine("There are {0} vowels", VowelCount(Sentence))
Console.ReadLine()
End Sub
Function WordCount(ByVal UserInput As String) As Integer
Dim Space As String() = UserInput.Split(" ")
WordCount = Space.Length
'or just shorten it to one line...
'Return UserInput.Split(" ").Length
End Function
Function VowelCount(ByVal UserInput As String) As Integer
Dim i As Integer
Dim VowelNumber As Integer
Dim Vowels As String = "aeiou"
For i = 1 To Len(UserInput)
If InStr(Vowels, Mid(UserInput, i, 1)) Then
VowelNumber = VowelNumber + 1
End If
Next
VowelCount = VowelNumber
End Function
The most obvious change between a sub and a function is changing the keywords that wrap up the procedure. For this conversation let's just say that's one good word to use for encompassing both concepts since they're very similar and many languages don't really draw such a big distinction.
For Visual Basic's purposes a function needs to return something and that's indicated by the As Integer that I added to the end of both of the function declarations (can't remember if that's the right VB terminology.) Also in VB you return a value to the caller by assigning to the name of the function (also see edit below.) So I replaced those lines that were WriteLines with appropriate assignments. Last I moved those WriteLine statements up into Main. The arguments needed to be changed to use the function return values rather than the variables they originally referenced.
Hopefully I'm not doing your homework for you!
EDIT: Visual Basic underwent a lot of changes to the language during the move to .Net back in the early 2000's. I had forgotten (or possibly not even realized) that the new preferred choice for returning a value is now more in line with languages like C#. So rather than assigning values to WordCount and VowelCount you can just use Return. One difference between the two is that a Return will cause the sub/function to exit at that point even if there is other code afterward. This might be useful inside an if...end if for example. I'm hoping this helps you learn something rather than just being confusing.
EDIT #2: Now that I see the accepted answer and re-read the question it seems there was a small part about counting consonants that got overlooked. At this point I assume this was indeed a classroom exercise and the intended answer was possibly even to derive the consonant count by using the other functions.
Here you go.
Function WordCount(ByVal UserInput As String) As Integer
Dim Space As String() = UserInput.Split(" ")
Return Space.Length
End Function
Function VowelCount(ByVal UserInput As String) As Integer
Dim i As Integer
Dim VowelNumber As Integer
Dim Vowels As String = "aeiou"
For i = 1 To Len(UserInput)
If InStr(Vowels, Mid(UserInput, i, 1)) Then
VowelNumber = VowelNumber + 1
End If
Next
Return VowelNumber
End Function
Function ConsonantCount(ByVal UserInput As String) As Integer
Dim i As Integer
Dim ConsonantNumber As Integer
Dim Consonants As String = "bcdfghjklmnpqrstvwxyz"
For i = 1 To Len(UserInput)
If InStr(Consonants, Mid(UserInput, i, 1)) Then
ConsonantNumber = ConsonantNumber + 1
End If
Next
Return ConsonantNumber
End Function

How to find indexes for certain character in a string VB.NET

I'm beginner with VB.net.
How do I read indexes for certain character in a string? I read an barcode and I get string like this one:
3XXX123456-C-AA123456TY-667
From that code I should read indexes for character "-" so I can cut the string in parts later in the code.
For example code above:
3456-C
6TY-667
The length of the string can change (+/- 3 characters). Also the places and count of the hyphens may vary.
So, I'm looking for code which gives me count and position of the hyphens.
Thanks in advance!
Use the String.Splt method.
'a test string
Dim BCstring As String = "3XXX123456-C-AA123456TY-667"
'split the string, removing the hyphens
Dim BCflds() As String = BCstring.Split({"-"c}, StringSplitOptions.None)
'number of hyphens in the string
Dim hyphCT As Integer = BCflds.Length - 1
'look in the debuggers immediate window
Debug.WriteLine(BCstring)
'show each field
For Each s As String In BCflds
Debug.WriteLine(String.Format("{0,5} {1}", s.Length, s))
Next
'or
Debug.WriteLine(BCstring)
For idx As Integer = 0 To hyphCT
Debug.WriteLine(String.Format("{0,5} {1}", BCflds(idx).Length, BCflds(idx)))
Next
If all you need are the parts between hyphens then as suggested by dbasnett use the split method for strings. If by chance you need to know the index positions of the hyphens you can use the first example using Lambda to get the positions which in turn the count give you how many hyphens were located in the string.
When first starting out with .NET it's a good idea to explore the various classes for strings and numerics as there are so many things that some might not expect to find that makes coding easier.
Dim barCode As String = "3XXX123456-C-AA123456TY-667"
Dim items = barCode _
.Select(Function(c, i) New With {.Character = c, .Index = i}) _
.Where(Function(item) item.Character = "-"c) _
.ToList
Dim hyphenCount As Integer = items.Count
Console.WriteLine("hyphen count is {0}", hyphenCount)
Console.WriteLine("Indices")
For Each item In items
Console.WriteLine(" {0}", item.Index)
Next
Console.WriteLine()
Console.WriteLine("Using split")
Dim barCodeParts As String() = barCode.Split("-"c)
For Each code As String In barCodeParts
Console.WriteLine(code)
Next
Here is an example that'll split your string and allow you to parse through the values.
Private Sub TestSplits2Button_Click(sender As Object, e As EventArgs) Handles TestSplits2Button.Click
Try
Dim testString As String = "3XXX123456-C-AA123456TY-667"
Dim vals() As String = testString.Split(Convert.ToChar("-"))
Dim numberOfValues As Integer = vals.GetUpperBound(0)
For Each testVal As String In vals
Debug.Print(testVal)
Next
Catch ex As Exception
MessageBox.Show(String.Concat("An error occurred: ", ex.Message))
End Try
End Sub

How to extract numbers UNTIL a space is reached in a string using Excel 2010?

I need to pull the code from the following string: 72381 Test 4Dx for Worms. The code is 72381 and the function that I'm using does a wonderful job of pulling ALL the numbers from a string and gives me back 723814, which pulls the 4 from the description of the code. The actual code is only the 72381. The codes are of varying length and are always followed by a space before the description begins; however there are spaces in the descriptions as well. This is the function I am using that I found from a previous search:
Function OnlyNums(sWord As String)
Dim sChar As String
Dim x As Integer
Dim sTemp As String
sTemp = ""
For x = 1 To Len(sWord)
sChar = Mid(sWord, x, 1)
If Asc(sChar) >= 48 And _
Asc(sChar) <= 57 Then
sTemp = sTemp & sChar
End If
Next
OnlyNums = Val(sTemp)
End Function
If the first character in the description part of your string is never numeric, you could use the VBA Val(string) function to return all of the numeric characters before the first non-numeric character.
Function GetNum(sWord As String)
GetNum = Val(sWord)
End Function
See the syntax of the Val(string) function for full details of it's usage.
You're looking for the find function.. Example:
or in VBA instr() and left()
Since you know the pattern is always code followed by space just use left of the string for the number of characters to the first space found using instr. Sample in immediate window above. Loop is going to be slow, and while it may validate they are numeric why bother if you know pattern is code then space?
In similar situations in C# code, I leave the loop early after finding the first instance of a space character (32). In VBA, you'd use Exit For.
You can get rid of the function altogether and use this:
split("72381 Test 4Dx for Worms"," ")(0)
This will split the string into an array using " " as the split char. Then it shows us address 0 in the array (the first element)
In the context of your function if you are dead set on using one it is this:
Function OnlyNums(sWord As String)
OnlyNums = Split(sWord, " ")(0)
End Function
While I like the simplicity of Mark's solution, you could use an efficient parser below to improve your character by character search (to cope with strings that don't start with numbers).
test
Sub test()
MsgBox StrOut("72381 Test 4Dx")
End Sub
code
Function StrOut(strIn As String)
Dim objRegex As Object
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Pattern = "^(\d+)(\s.+)$"
If .test(strIn) Then
StrOut = .Replace(strIn, "$1")
Else
StrOut = "no match"
End If
End With
End Function

how to read all strings that are between 2 other strings

i have managed to find a string between 2 specified strings,
the only issue now is that it will only find one and then stop.
how am i possible to make it grab all the strings in a textbox?
the textbox is multiline and i have put a litle config in it.
now i want that the listbox will add all the strings that are between my 2 specified strings.
textbox3.text containts "<"
and textbox 4.text contains ">"
Public Function GetClosedText(ByVal source As String, ByVal opener As String, ByVal closer As String) As String
Dim intStart As Integer = InStr(source, opener)
If intStart > 0 Then
Dim intStop As Integer = InStr(intStart + Len(opener), source, closer)
If intStop > 0 Then
Try
Dim value As String = source.Substring(intStart + Len(opener) - 1, intStop - intStart - Len(opener))
Return value
Catch ex As Exception
Return ""
End Try
End If
End If
Return ""
End Function
usage:
ListBox1.Items.Add(GetClosedText(TextBox1.Text, TextBox3.Text, TextBox4.Text))
The easiest way (least lines of code) to do this would be to use a regular expression. For instance, to find all of that strings enclosed in pointy brackets, you could use this regular expression:
\<(?<value>.*?)\>
Here's what that all means:
\< - Find a string which starts with a < character. Since < has a special meaning in RegEx, it must be escaped (i.e. preceded with a backslash)
(?<value>xxx) - This creates a named group so that we can later access this portion of the matched string by the name "value". Everything contained in the name group (i.e. where xxx is), is considered part of that group.
.*? - This means find any number of any characters, up to, but not including whatever comes next. The . is a wildcard which means any character. The * means any number of times. The ? makes it non-greedy so it stops matching as soon as if finds whatever comes next (the closing >).
\> - Specifies that matching strings must end with a > character. Since > has a special meaning in RegEx, it must also be escaped.
You could use that RegEx expression to find all the matches, like this:
Dim items As New List(Of String)()
For Each i As Match In Regex.Matches(source, "\<(?<value>.*?)\>")
items.Add(i.Groups("value").Value)
Next
The trick to making it work in your scenario is that you need to dynamically specify the opening and closing characters. You can do that by concatenating them to the RegEx, like this:
Regex.Matches(source, opener & "(?<value>.*?)" & closer)
But the problem is, that will only work if source and closer are not special RegEx characters. In your example, they are < and >, which are special characters, so they need to be escaped. The safe way to do that is to use the Regex.Escape method, which only escapes the string if it needs to be:
Private Function GetClosedText(source As String, opener As String, closer As String) As String()
Dim items As New List(Of String)()
For Each i As Match In Regex.Matches(source, Regex.Escape(opener) & "(?<value>.*?)" & Regex.Escape(closer))
items.Add(i.Groups("value").Value)
Next
Return items.ToArray()
End Function
Notice that in the above example, rather than finding a single item and returning it, I changed the GetClosedText function to return an array of strings. So now, you can call it like this:
ListBox1.Items.AddRange(GetClosedText(TextBox1.Text, TextBox3.Text, TextBox4.Text))
I ssume you want to loop all openers and closers:
' always use meaningful variable/control names instead of
If TextBox3.Lines.Length <> TextBox4.Lines.Length Then
MessageBox.Show("Please provide the same count of openers and closers!")
Return
End If
Dim longText = TextBox1.Text
For i As Int32 = 0 To TextBox3.Lines.Length - 1
Dim opener = TextBox3.Lines(i)
Dim closer = TextBox4.Lines(i)
listBox1.Items.Add(GetClosedText(longText, opener , closer))
Next
However, you should use .NET methods as shown here:
Public Function GetClosedText(ByVal source As String, ByVal opener As String, ByVal closer As String) As String
Dim indexOfOpener = source.IndexOf(opener)
Dim result As String = ""
If indexOfOpener >= 0 Then ' default is -1 and indices start with 0
indexOfOpener += opener.Length ' now look behind the opener
Dim indexOfCloser = source.IndexOf(closer, indexOfOpener)
If indexOfCloser >= 0 Then
result = source.Substring(indexOfOpener, indexOfCloser - indexOfOpener)
Else
result = source.Substring(indexOfOpener) ' takes the rest behind the opener
End If
End If
Return result
End Function