How to extract numbers UNTIL a space is reached in a string using Excel 2010? - vba

I need to pull the code from the following string: 72381 Test 4Dx for Worms. The code is 72381 and the function that I'm using does a wonderful job of pulling ALL the numbers from a string and gives me back 723814, which pulls the 4 from the description of the code. The actual code is only the 72381. The codes are of varying length and are always followed by a space before the description begins; however there are spaces in the descriptions as well. This is the function I am using that I found from a previous search:
Function OnlyNums(sWord As String)
    Dim sChar As String
    Dim x As Integer
    Dim sTemp As String
    sTemp = ""
    For x = 1 To Len(sWord)
        sChar = Mid(sWord, x, 1)
        If Asc(sChar) >= 48 And _
           Asc(sChar) <= 57 Then
            sTemp = sTemp & sChar
        End If
    Next
    OnlyNums = Val(sTemp)
End Function

If the first character in the description part of your string is never numeric, you could use the VBA Val(string) function, which returns the number formed by the numeric characters before the first non-numeric character. Note that Val strips spaces before it reads the number, so a description that begins with a digit would still run into the code.
Function GetNum(sWord As String)
    GetNum = Val(sWord)
End Function
See the documentation of the Val(string) function for full details of its usage.

You're looking for the FIND function in a worksheet formula, or InStr() and Left() in VBA.
Since you know the pattern is always a code followed by a space, just take the left part of the string, up to the first space found with InStr. A loop is going to be slower, and while it may validate that the characters are numeric, why bother if you know the pattern is code then space?
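A minimal sketch of the InStr/Left approach described above, assuming the code is everything before the first space. The function name CodeBeforeSpace is made up for illustration and does not come from the original answer.

```vba
' Hypothetical helper: returns the text before the first space.
Function CodeBeforeSpace(ByVal sWord As String) As String
    Dim pos As Long
    pos = InStr(1, sWord, " ")          ' position of the first space, 0 if none
    If pos > 0 Then
        CodeBeforeSpace = Left$(sWord, pos - 1)
    Else
        ' Assumption: with no space at all, the whole string is the code.
        CodeBeforeSpace = sWord
    End If
End Function
```

In the Immediate window, ? CodeBeforeSpace("72381 Test 4Dx for Worms") prints 72381. On a worksheet, assuming the string is in cell A1, the equivalent formula would be =LEFT(A1,FIND(" ",A1)-1).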

In similar situations in C# code, I leave the loop early after finding the first instance of a space character (32). In VBA, you'd use Exit For.
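As a rough illustration of that idea, here is the question's loop with an early exit at the first space. The name NumsUntilSpace is hypothetical, and returning a String (rather than calling Val) is a choice made here so that leading zeros survive.

```vba
' Hypothetical variant of OnlyNums that stops at the first space.
Function NumsUntilSpace(ByVal sWord As String) As String
    Dim x As Long
    Dim sChar As String
    Dim sTemp As String
    For x = 1 To Len(sWord)
        sChar = Mid$(sWord, x, 1)
        If Asc(sChar) = 32 Then Exit For    ' 32 is the space character
        If Asc(sChar) >= 48 And Asc(sChar) <= 57 Then
            sTemp = sTemp & sChar           ' keep digits 0-9 only
        End If
    Next x
    NumsUntilSpace = sTemp
End Function
```

? NumsUntilSpace("72381 Test 4Dx for Worms") prints 72381, because the loop never reaches the 4 in the description.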

You can get rid of the function altogether and use this:
split("72381 Test 4Dx for Worms"," ")(0)
This splits the string into an array using " " as the delimiter, then returns element 0 of the array (the first element).
If you are dead set on using a function, it becomes this:
Function OnlyNums(sWord As String)
    OnlyNums = Split(sWord, " ")(0)
End Function

While I like the simplicity of Mark's solution, you could use the regular-expression parser below in place of your character-by-character search; it also copes with strings that don't start with numbers.
test
Sub test()
    MsgBox StrOut("72381 Test 4Dx")
End Sub
code
Function StrOut(strIn As String)
    Dim objRegex As Object
    Set objRegex = CreateObject("vbscript.regexp")
    With objRegex
        ' Leading digits, then whitespace, then the rest of the string
        .Pattern = "^(\d+)(\s.+)$"
        If .test(strIn) Then
            StrOut = .Replace(strIn, "$1")
        Else
            StrOut = "no match"
        End If
    End With
End Function

Related

Look for multiple words in string

I need to build a VBA function that simulates REGEXP_INSTR or REGEXP_LIKE in Oracle. Those make it possible to look for words in a string without having to loop word by word.
I've found this code, which finds names that start with "Mr|Mrs|Ms|Dr" and is meant to be used like this:
Function StringStarts(strCheck As String, options As String) As Boolean
    With CreateObject("VBScript.RegExp")
        .IgnoreCase = True
        .Pattern = "^(" & options & ")\.*\b"
        StringStarts = .Test(strCheck)
    End With
End Function
Debug.print StringStarts("Dr leopoldo malmeida", "Mr|Mrs|Ms|Dr")
What I actually need is to alter this function so that it finds any of the words in the pattern, or parts of words (case-insensitive matching), at any location in the string being searched. For example:
Debug.print StringStarts("Looking for multiple words", "Word|like|for")
That should return True: "Word" is found inside "words", and "for" is a complete match in the string.
RegEx is most probably not needed here, because it is a bit slow. A simple For Each loop in a function would be fine:
Public Function PatternPresent(testedString As String, _
        Optional pattern As String = "Mr|Mrs|Ms|Dr") As Boolean
    Const separator = "|"
    Dim patterns As Variant: patterns = Split(pattern, separator)
    Dim myVar As Variant
    For Each myVar In patterns
        ' vbTextCompare makes the search case-insensitive, as the question asks
        If InStr(1, testedString, myVar, vbTextCompare) > 0 Then
            PatternPresent = True
            Exit Function
        End If
    Next myVar
End Function
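For comparison, if the regular-expression route from the question is preferred, dropping the ^ anchor lets the pattern match anywhere in the string. This is only a sketch: the name StringContainsAny is hypothetical, and it assumes the options string contains no characters that are special in a regular expression.

```vba
' Hypothetical variant of StringStarts without the ^ anchor.
Function StringContainsAny(ByVal strCheck As String, ByVal options As String) As Boolean
    With CreateObject("VBScript.RegExp")
        .IgnoreCase = True                  ' case-insensitive matching
        .Pattern = "(" & options & ")"      ' e.g. "(Word|like|for)"
        StringContainsAny = .Test(strCheck)
    End With
End Function
```

Debug.Print StringContainsAny("Looking for multiple words", "Word|like|for") prints True, since "for" occurs as a word and "Word" occurs inside "words".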

How to split a string in VBA by more than one character

In C# one can easily split a string by more than one character: one supplies an array of split characters. I was wondering what the best way is to achieve this in VBA. I typically use VBA.Split, but splitting on more than one character requires drilling into the results and sub-splitting the elements. Then one has to re-dimension arrays, etc. Quite painful.
Constraints
VBA responses only please. You may use .NET collection classes if you wish (yes they are creatable and callable in VBA). You may use JSON, XML as vessels for the list of split segments if you wish. You may use the humble VBA.Collection class if you wish, or even a Scripting.Dictionary. You may use even a fabricated recordset if you wish.
I know full well one can write a .NET assembly to call the .NET String.Split method and expose the assembly to VBA with COM interfaces, but where is the challenge in that?
This should be fairly easy to do with a regular expression. If you match on the negation of the passed characters to split on, the matches will be the members of the output array. The upside to doing this is that the output array only needs to be sized once because you can get a count of the matches returned by the RegExp. The pattern is fairly simple to build - it boils down to something like [^abc]+ where 'a', 'b', and 'c' are the characters to split on. About the only thing that you need to do to prepare the expression is to escape a couple characters that have special meaning in that context inside a regular expression (I probably forgot some):
Private Function BuildRegexPattern(ByVal inputString As String) As String
    Dim escapeTargets() As String
    ' Backslash must be escaped first; otherwise the backslashes added
    ' for the other characters would themselves be doubled.
    escapeTargets = VBA.Split("\ - ^ ]")
    Dim returnValue As String
    returnValue = inputString
    Dim idx As Long
    For idx = LBound(escapeTargets) To UBound(escapeTargets)
        returnValue = Replace$(returnValue, escapeTargets(idx), "\" & escapeTargets(idx))
    Next
    BuildRegexPattern = "[^" & returnValue & "]+"
End Function
Once you have the pattern, it's just a simple matter of sizing the array and iterating over the matches to assign them (plus some other special case handling, etc.):
Public Function MultiSplit(ByVal toSplit As String, Optional ByVal delimiters As String = " ") As String()
    Dim returnValue() As String
    If toSplit = vbNullString Then
        returnValue = VBA.Split(vbNullString)
    Else
        ' Early binding: needs a reference to Microsoft VBScript Regular Expressions 5.5
        With New RegExp
            .Pattern = BuildRegexPattern(IIf(delimiters = vbNullString, " ", delimiters))
            .MultiLine = True
            .Global = True
            If Not .Test(toSplit) Then
                'Only delimiters.
                ReDim returnValue(Len(toSplit) - 1)
            Else
                Dim matches As Object
                Set matches = .Execute(toSplit)
                ReDim returnValue(matches.Count - 1)
                Dim idx As Long
                For idx = LBound(returnValue) To UBound(returnValue)
                    returnValue(idx) = matches(idx)
                Next
            End If
        End With
    End If
    MultiSplit = returnValue
End Function
In my attempt, I replace all the other characters with space before splitting on space. (So I cheat a little.)
Private Function SplitByMoreThanOneChars(ByVal sLine As String)
    '*
    '* Brought to you by the Excel Development Platform Blog
    '* http://exceldevelopmentplatform.blogspot.com/2018/11/
    '*
    '* Don't get excited, this splits by spaces only
    '* we fake splitting by multiple characters by replacing those characters
    '* with spaces
    '*
    Dim vChars2 As Variant
    vChars2 = Array(" ", "<", ">", "[", "]", "(", ")", ";")
    Dim sLine2 As String
    sLine2 = sLine
    Dim lCharLoop As Long
    For lCharLoop = LBound(vChars2) To UBound(vChars2)
        Debug.Assert Len(vChars2(lCharLoop)) = 1
        sLine2 = VBA.Replace(sLine2, vChars2(lCharLoop), " ")
    Next
    SplitByMoreThanOneChars = VBA.Split(sLine2)
End Function

CountWord and CountVowel into Function in VB.NET

Hi I need to change WordCount and CountVowel procedures to functions and create a function to count number of consonants in a string.
I have done these two procedures but I cannot figure out how to do the last part. I am fairly new to programming.
My current code is given below:
Sub Main()
    Dim Sentence As String
    Console.WriteLine("Sentence Analysis" + vbNewLine + "")
    Console.WriteLine("Enter a sentence, then press 'Enter'" + vbNewLine + "")
    Sentence = Console.ReadLine()
    Console.WriteLine("")
    Call WordCount(Sentence)
    Call VowelCount(Sentence)
    Console.ReadLine()
End Sub
Sub WordCount(ByVal UserInput As String)
    Dim Space As String() = UserInput.Split(" ")
    Console.WriteLine("There are {0} words", Space.Length)
End Sub
Sub VowelCount(ByVal UserInput As String)
    Dim i As Integer
    Dim VowelNumber As Integer
    Dim Vowels As String = "aeiou"
    For i = 1 To Len(UserInput)
        If InStr(Vowels, Mid(UserInput, i, 1)) Then
            VowelNumber = VowelNumber + 1
        End If
    Next
    Console.WriteLine("There are {0} vowels", VowelNumber)
End Sub
Thanks for your time
I would use the following three functions. Note that WordCount uses RemoveEmptyEntries to avoid counting empty words when there are multiple spaces between words.
The other two functions count upper case vowels as vowels, rather than just lower case. They take advantage of the fact that strings can be treated as arrays of Char, and use the Count method to count how many of those Chars meet certain criteria.
Note that the designation of "AEIOU" as vowels may not be correct in all languages, and even in English "Y" is sometimes considered a vowel. You might also need to consider the possibility of accented letters such as "É".
Function WordCount(UserInput As String) As Integer
    Return UserInput.Split({" "c}, StringSplitOptions.RemoveEmptyEntries).Length
End Function
Function VowelCount(UserInput As String) As Integer
    Return UserInput.Count(Function(c) "aeiouAEIOU".Contains(c))
End Function
Function ConsonantCount(UserInput As String) As Integer
    Return UserInput.Count(Function(c) Char.IsLetter(c) And Not "aeiouAEIOU".Contains(c))
End Function
To turn each of your Sub routines into a Function, you need to do three things. First, you need to change the Sub and End Sub keywords to Function and End Function, respectively. So:
Sub MyMethod(input As String)
' ...
End Sub
Becomes:
Function MyMethod(input As String)
' ...
End Function
Next, since it's a function, it needs to return a value, so your Function declaration needs to specify the type of the return value. So, the above example would become:
Function MyMethod(input As String) As Integer
' ...
End Function
Finally, the code in the function must actually specify what the return value will be. In VB.NET, that is accomplished by using the Return keyword, like this:
Function MyMethod(input As String) As Integer
    Dim result As Integer
    ' ...
    Return result
End Function
So, to apply that to your example:
Sub WordCount(ByVal UserInput As String)
    Dim Space As String() = UserInput.Split(" ")
    Console.WriteLine("There are {0} words", Space.Length)
End Sub
Would become:
Function WordCount(userInput As String) As Integer
    Dim Space As String() = userInput.Split(" ")
    Return Space.Length
End Function
Note, ByVal is the default, so you don't need to specify it, and parameter variables, by standard convention in .NET are supposed to be camelCase rather than PascalCase. Then, when you call the method, you can use the return value of the function like this:
Dim count As Integer = WordCount(Sentence)
Console.WriteLine("There are {0} words", count)
As far as counting consonants goes, that will be very similar to your VowelCount method, except that you would give it the list of consonants to look for instead of vowels.
You could use the Regex class. It's designed to search for substrings using patterns, and it's rather fast at it too.
Sub VowelCount(ByVal UserInput As String)
    Console.WriteLine("There are {0} vowels", System.Text.RegularExpressions.Regex.Matches(UserInput, "[aeiou]", System.Text.RegularExpressions.RegexOptions.IgnoreCase).Count.ToString())
End Sub
[aeiou] is the pattern used when performing the search. It matches any of the characters you've written inside the brackets.
Example:
http://ideone.com/LEYC30
Read more about Regex:
MSDN - .NET Framework Regular Expressions
MSDN - Regular Expression Language - Quick Reference
VB is no longer a language I use frequently but I don't think I'm going to steer you wrong even without testing this out.
Sub Main()
    Dim Sentence As String
    Console.WriteLine("Sentence Analysis" + vbNewLine + "")
    Console.WriteLine("Enter a sentence, then press 'Enter'" + vbNewLine + "")
    Sentence = Console.ReadLine()
    Console.WriteLine("")
    'usually it's better just let the function calculate a value and do output elsewhere
    'so I've commented your original calls so you can see where they used to be
    'Call WordCount(Sentence)
    Console.WriteLine("There are {0} words", WordCount(Sentence))
    'Call VowelCount(Sentence)
    Console.WriteLine("There are {0} vowels", VowelCount(Sentence))
    Console.ReadLine()
End Sub
Function WordCount(ByVal UserInput As String) As Integer
    Dim Space As String() = UserInput.Split(" ")
    WordCount = Space.Length
    'or just shorten it to one line...
    'Return UserInput.Split(" ").Length
End Function
Function VowelCount(ByVal UserInput As String) As Integer
    Dim i As Integer
    Dim VowelNumber As Integer
    Dim Vowels As String = "aeiou"
    For i = 1 To Len(UserInput)
        If InStr(Vowels, Mid(UserInput, i, 1)) Then
            VowelNumber = VowelNumber + 1
        End If
    Next
    VowelCount = VowelNumber
End Function
The most obvious change between a sub and a function is the pair of keywords that wrap the procedure. For this conversation, let's use "procedure" to cover both concepts, since they're very similar and many languages don't draw much of a distinction.
For Visual Basic's purposes a function needs to return something, and that's indicated by the As Integer that I added to the end of both function declarations (I can't remember if that's the right VB terminology). Also, in VB you return a value to the caller by assigning to the name of the function (see also the edit below). So I replaced the WriteLine lines with the appropriate assignments. Last, I moved those WriteLine statements up into Main, where their arguments had to change to use the function return values rather than the variables they originally referenced.
Hopefully I'm not doing your homework for you!
EDIT: Visual Basic underwent a lot of changes during the move to .NET back in the early 2000s. I had forgotten (or possibly never realized) that the preferred way to return a value is now more in line with languages like C#. So rather than assigning values to WordCount and VowelCount you can just use Return. One difference between the two is that a Return causes the sub/function to exit at that point even if there is other code afterward. This might be useful inside an If...End If, for example. I'm hoping this helps you learn something rather than just being confusing.
EDIT #2: Now that I see the accepted answer and re-read the question it seems there was a small part about counting consonants that got overlooked. At this point I assume this was indeed a classroom exercise and the intended answer was possibly even to derive the consonant count by using the other functions.
Here you go.
Function WordCount(ByVal UserInput As String) As Integer
    Dim Space As String() = UserInput.Split(" ")
    Return Space.Length
End Function
Function VowelCount(ByVal UserInput As String) As Integer
    Dim i As Integer
    Dim VowelNumber As Integer
    Dim Vowels As String = "aeiou"
    For i = 1 To Len(UserInput)
        If InStr(Vowels, Mid(UserInput, i, 1)) Then
            VowelNumber = VowelNumber + 1
        End If
    Next
    Return VowelNumber
End Function
Function ConsonantCount(ByVal UserInput As String) As Integer
    Dim i As Integer
    Dim ConsonantNumber As Integer
    Dim Consonants As String = "bcdfghjklmnpqrstvwxyz"
    For i = 1 To Len(UserInput)
        If InStr(Consonants, Mid(UserInput, i, 1)) Then
            ConsonantNumber = ConsonantNumber + 1
        End If
    Next
    Return ConsonantNumber
End Function

Strip out non-numeric characters in SELECT

In an MS Access 2007 project report, I have the following (redacted) query:
SELECT SomeCol FROM SomeTable
The problem is, that SomeCol apparently contains some invisible characters. For example, I see one result returned as 123456 but SELECT LEN(SomeCol) returns 7. When I copy the result to Notepad++, it shows as ?123456.
The column is set to TEXT. I have no control over this data type, so I can't change it.
How can I modify my SELECT query to strip out anything non-numeric. I suspect RegEx is the way to go... alternatively, is there a CAST or CONVERT function?
You mentioned using a regular expression for this. It is true that Access' db engine doesn't support regular expressions directly. However, it seems you are willing to use a VBA user-defined function in your query ... and a UDF can use a regular expression approach. That approach should be simple, easy, and faster performing than iterating through each character of the input string and storing only those characters you want to keep in a new output string.
Public Function OnlyDigits(ByVal pInput As String) As String
    Static objRegExp As Object
    If objRegExp Is Nothing Then
        Set objRegExp = CreateObject("VBScript.RegExp")
        With objRegExp
            .Global = True
            .Pattern = "[^\d]"
        End With
    End If
    OnlyDigits = objRegExp.Replace(pInput, vbNullString)
End Function
Here is an example of that function in the Immediate window with "x" characters as proxies for your invisible characters. (Any characters not included in the "digits" character class will be discarded.)
? OnlyDigits("x1x23x")
123
If that is the output you want, just use the function in your query.
SELECT OnlyDigits(SomeCol) FROM SomeTable;
There is no RegEx in Access, at least not in SQL. If you venture to VBA, you might as well use a custom StripNonNumeric VBA function in the SQL statement.
e.g. SELECT StripNonNumeric(SomeCol) as SomeCol from SomeTable
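Function StripNonNumeric(str)
    ' Variables declared so the function also compiles under Option Explicit
    Const keep = "0123456789"
    Dim outstr As String, strChar As String
    Dim i As Long
    outstr = ""
    For i = 1 To Len(str)
        strChar = Mid(str, i, 1)
        If InStr(keep, strChar) > 0 Then
            outstr = outstr & strChar
        End If
    Next
    StripNonNumeric = outstr
End Function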
You can do it all in a query, combining this question with your previous question, you get:
SELECT IIf(IsNumeric([atext]),
IIf(Len([atext])<4,Format([atext],"000"),
Replace(Format(Val([atext]),"#,###"),",",".")),
IIf(Len(Mid([atext],2))<4,Format(Mid([atext],2),"000"),
Replace(Format(Val(Mid([atext],2)),"#,###"),",","."))) AS FmtNumber
FROM Table AS t;
Public Function fExtractNumeric(strInput) As String
    ' Returns the numeric characters within a string in
    ' sequence in which they are found within the string
    Dim strResult As String, strCh As String
    Dim intI As Integer
    If Not IsNull(strInput) Then
        For intI = 1 To Len(strInput)
            strCh = Mid(strInput, intI, 1)
            Select Case strCh
                Case "0" To "9"
                    strResult = strResult & strCh
                Case Else
            End Select
        Next intI
    End If
    fExtractNumeric = strResult
End Function

How to get rid of the zero '0' numeric from a string?

BEFORE:
Johnson0, Yvonne
AFTER:
Johnson, Yvonne
String functions for Access can be found at http://www.techonthenet.com/access/functions/string/replace.php
In your example, code like
Replace("Johnson0", "0", "")
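will do the trick for the particular string Johnson0. If you need to remove the zero only when it is the last character, play with the additional start and count parameters described in the link above. Be aware that when start is supplied, Replace returns only the part of the string that begins at start, not the whole string.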
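If only a trailing zero should go, a Right$/Left$ check is one alternative to the start and count parameters. This is a sketch under that assumption, and the helper name TrimTrailingZero is hypothetical.

```vba
' Hypothetical helper: drops a single trailing "0" and leaves other zeros alone.
Function TrimTrailingZero(ByVal s As String) As String
    If Right$(s, 1) = "0" Then
        TrimTrailingZero = Left$(s, Len(s) - 1)
    Else
        TrimTrailingZero = s    ' nothing to remove (also covers an empty string)
    End If
End Function
```

? TrimTrailingZero("Johnson0") prints Johnson, while a string such as "J0hnson" comes back unchanged.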
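You can try executing the following query:
UPDATE table SET
columnName = REPLACE(columnName,'0','')
WHERE columnName LIKE "%0%";
This replaces every occurrence of "0" with "". Note that % is the ANSI-92 wildcard; in Access's default ANSI-89 query mode the pattern would be "*0*" instead.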
The answer you submitted clarifies your requirement. Based on that, you don't need to create a user-defined function if your Access version is 2000 or later. You can get the same result with the Replace() function.
MsgBox Replace("Johnson0, Yvonne", "0,", ",")
One approach is to create a custom function
See http://www.techonthenet.com/access/functions/misc/alphanumeric.php for an example. You could do something similar, but in the loop you would only keep the alpha characters.
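A minimal sketch of such a loop, written here to drop digits rather than to keep only letters, so that the comma and space in the sample name survive. The function name StripDigits is hypothetical and is not taken from the linked page.

```vba
' Hypothetical helper: keeps every character that is not a digit 0-9.
Function StripDigits(ByVal s As String) As String
    Dim i As Long
    Dim ch As String
    Dim result As String
    For i = 1 To Len(s)
        ch = Mid$(s, i, 1)
        ' In a Like pattern, "#" matches any single digit
        If Not (ch Like "#") Then result = result & ch
    Next i
    StripDigits = result
End Function
```

? StripDigits("Johnson0, Yvonne") prints Johnson, Yvonne.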
Public Sub xxx()
    MsgBox RemoveStr0("Johnson0, Yvonne")
End Sub
Public Function RemoveStr0(sString As String) As String
    Dim ipos As Long, sTemp As String
    ipos = InStr(1, sString, "0,")
    If ipos = 0 Then
        ' No "0," in the input: return it unchanged instead of failing in Mid$
        RemoveStr0 = sString
        Exit Function
    End If
    sTemp = Mid$(sString, 1, ipos - 1)
    sTemp = sTemp & Mid$(sString, ipos + 1)
    RemoveStr0 = sTemp
End Function
If you can pull it out to Java or another OO language, you can just do the matching with regexes.