Look for multiple words in string - vba

Need to build a VBA function that simulates the REGEXP_INSTR or REGEXP_LIKE in Oracle. Those make possible to look for words in string without having to loop word by word.
I've found this code, that find Names that starts with "Mr|Mrs|Ms|Dr", that meant to be used like:
Function StringStarts(strCheck As String, options As String) As Boolean
With CreateObject("VBScript.RegExp")
.IgnoreCase = True
.Pattern = "^(" & options & ")\.*\b"
StringStarts = .Test(strCheck)
End With
End Function
Debug.print StringStarts("Dr leopoldo malmeida", "Mr|Mrs|Ms|Dr")
In fact, I need help to find if it's possible to alter this function, in order to find any word, or parts of words (case-insensitive matching), in the pattern on any location of the string to search. For example:
Debug.print StringStarts("Looking for multiple words", "Word|like|for")
That should return true: "Word found in 'words'"; "for was a complete match in string".

Most probably RegEx is not needed, because it is a bit slow. A simple for-each loop in a function would be ok:
Public Function PatternPresent(testedString As String, _
Optional pattern As String = "Mr|Mrs|Ms|Dr") As Boolean
Const separator = "|"
Dim patterns As Variant: patterns = Split(pattern, separator)
Dim myVar As Variant
For Each myVar In patterns
If InStr(1, testedString, myVar) Then
PatternPresent = True
Exit Function
End If
Next myVar
End Function

Related

How to split a string in VBA by more than one character

In C# one can easily split a split string by more than one character, one supplies an array of split characters. I was wondering what is best way to achieve this in VBA. I use VBA.Split typically but to split on more than one characters requires drilling in to the results and sub-splitting the elements. Then one has to re-dimension arrays etc. Quite painful.
Contraints
VBA responses only please. You may use .NET collection classes if you wish (yes they are creatable and callable in VBA). You may use JSON, XML as vessels for the list of split segments if you wish. You may use the humble VBA.Collection class if you wish, or even a Scripting.Dictionary. You may use even a fabricated recordset if you wish.
I know full well one can write a .NET asssembly to call the .NET String.Split method and expose assembly to VBA with COM interfaces but where is the challenge in that.
This should be fairly easy to do with a regular expression. If you match on the negation of the passed characters to split on, the matches will be the members of the output array. The upside to doing this is that the output array only needs to be sized once because you can get a count of the matches returned by the RegExp. The pattern is fairly simple to build - it boils down to something like [^abc]+ where 'a', 'b', and 'c' are the characters to split on. About the only thing that you need to do to prepare the expression is to escape a couple characters that have special meaning in that context inside a regular expression (I probably forgot some):
Private Function BuildRegexPattern(ByVal inputString As String) As String
Dim escapeTargets() As String
escapeTargets = VBA.Split("- ^ \ ]")
Dim returnValue As String
returnValue = inputString
Dim idx As Long
For idx = LBound(escapeTargets) To UBound(escapeTargets)
returnValue = Replace$(returnValue, escapeTargets(idx), "\" & escapeTargets(idx))
Next
BuildRegexPattern = "[^" & returnValue & "]+"
End Function
Once you have the pattern, it's just a simple matter of sizing the array and iterating over the matches to assign them (plus some other special case handling, etc.):
Public Function MultiSplit(ByVal toSplit As String, Optional ByVal delimiters As String = " ") As String()
Dim returnValue() As String
If toSplit = vbNullString Then
returnValue = VBA.Split(vbNullString)
Else
With New RegExp
.Pattern = BuildRegexPattern(IIf(delimiters = vbNullString, " ", delimiters))
.MultiLine = True
.Global = True
If Not .Test(toSplit) Then
'Only delimiters.
ReDim returnValue(Len(toSplit) - 1)
Else
Dim matches As Object
Set matches = .Execute(toSplit)
ReDim returnValue(matches.Count - 1)
Dim idx As Long
For idx = LBound(returnValue) To UBound(returnValue)
returnValue(idx) = matches(idx)
Next
End If
End With
End If
MultiSplit = returnValue
End Function
In my attempt, I replace all the other characters with space before splitting on space. (So I cheat a little.)
Private Function SplitByMoreThanOneChars(ByVal sLine As String)
'*
'* Brought to you by the Excel Development Platform Blog
'* http://exceldevelopmentplatform.blogspot.com/2018/11/
'*
'* Don't get excited, this splits by spaces only
'* we fake splitting by multiple characters by replacing those characters
'* with spaces
'*
Dim vChars2 As Variant
vChars2 = Array(" ", "<", ">", "[", "]", "(", ")", ";")
Dim sLine2 As String
sLine2 = sLine
Dim lCharLoop As Long
For lCharLoop = LBound(vChars2) To UBound(vChars2)
Debug.Assert Len(vChars2(lCharLoop)) = 1
sLine2 = VBA.Replace(sLine2, vChars2(lCharLoop), " ")
Next
SplitByMoreThanOneChars = VBA.Split(sLine2)
End Function

Checking in a VBA sub if a cell contains a word from a set of words

I'm working on a cargo calculator that uses different algorithms depending on whether the cargo can be classified as pipes, plates or beams. I'm trying to get it to automatically detect the cargo segment (if it can; the operator will be able to manually select a segment if there's not enough data) based on the item's description and dimensions.
My initial thought was to set lists of keywords as arrays; if a spot of pseudocode can be indulged, I'm thinking something along these lines:
Pipes = {pipe, tube, conduit, duct}
Plates = {plate, sheet, panel}
Beams = {beam, rail, girder}
IF Description CONTAINS Pipes THEN Calc = "Pipes & Tubes"
I know it could be done with lots and lots of IF clauses, but using arrays or similar would make it easier to maintain the list as synonyms crop up - and of course makes the code tidier.
Any thoughts on a nice efficient way of doing this?
EDIT: To clarify, I'm not trying to see if a whole string is found in an array, I'm trying to check if any of the words in an array (or collection of words, however arranged) is found in a descriptive string. For instance, using the arrays above, "Steel sheets" should come back as being in the category "Plates" because the description contains "sheet".
EDIT: #R3uk found a solution that works nicely. Here's the code I ended up using:
In my declarations module:
Public aPipe As String ' Pipe synonym array
Public aPlate As String ' Plate synonym array
Public aBeam As String ' Beam synonym array
In my admin module:
aPipe = "pipe/tube/conduit/duct"
aPlate = "plate/sheet/panel"
aBeam = "beam/rail/girder/truss"
In the main importer module, in the import sub:
ImpCalcDetect ' Imported calculator segment detection (experimental)
And the bit itself, essentially unchanged from R3uk's answer but with a minor tweak to make it case-insensitive:
Sub ImpCalcDetect()
' Experimental calculator segment detection
If Contains_Keyword(LCase(wsCalc.Cells(iImportCounter, 2).Value), aPipe) Then wsCalc.Cells(iImportCounter, 3).Value = "Pipes"
If Contains_Keyword(LCase(wsCalc.Cells(iImportCounter, 2).Value), aPlate) Then wsCalc.Cells(iImportCounter, 3).Value = "Plates"
If Contains_Keyword(LCase(wsCalc.Cells(iImportCounter, 2).Value), aBeam) Then wsCalc.Cells(iImportCounter, 3).Value = "Beams"
End Sub
Function Contains_Keyword(Descr As String, KeyWordS As String) As Boolean
Dim A() As String, IsIn As Boolean, i As Integer
A = Split(KeyWordS, "/")
IsIn = False
For i = LBound(A) To UBound(A)
If InStr(1, Descr, A(i)) Then
IsIn = True
Exit For
Else
End If
Next i
Contains_Keyword = IsIn
End Function
Many thanks!
You could indeed use arrays, here is a version for strings where you just need to separate keywords with a slash / :
Sub Test_AndrewPerry()
Dim Pipes As String, Plates As String, Beams As String
Pipes = "pipe/tube/conduit/duct"
Plates = "plate/sheet/panel"
Beams = "beam/rail/girder"
If Contains_Keyword(Description, Pipes) Then
Calc = "Pipes & Tubes"
Else
'Nothing to do?
End If
End Sub
And the function to "decompress" the strings and test each keyword until it find a match :
Function Contains_Keyword(Descr As String, KeyWordS As String) As Boolean
Dim A() As String, IsIn As Boolean, i As Integer
A = Split(KeyWordS, "/")
IsIn = False
For i = LBound(A) To UBound(A)
If InStr(1, Descr, A(i)) Then
IsIn = True
Exit For
Else
End If
Next i
Contains_Keyword = IsIn
End Function

How to extract numbers UNTIL a space is reached in a string using Excel 2010?

I need to pull the code from the following string: 72381 Test 4Dx for Worms. The code is 72381 and the function that I'm using does a wonderful job of pulling ALL the numbers from a string and gives me back 723814, which pulls the 4 from the description of the code. The actual code is only the 72381. The codes are of varying length and are always followed by a space before the description begins; however there are spaces in the descriptions as well. This is the function I am using that I found from a previous search:
Function OnlyNums(sWord As String)
Dim sChar As String
Dim x As Integer
Dim sTemp As String
sTemp = ""
For x = 1 To Len(sWord)
sChar = Mid(sWord, x, 1)
If Asc(sChar) >= 48 And _
Asc(sChar) <= 57 Then
sTemp = sTemp & sChar
End If
Next
OnlyNums = Val(sTemp)
End Function
If the first character in the description part of your string is never numeric, you could use the VBA Val(string) function to return all of the numeric characters before the first non-numeric character.
Function GetNum(sWord As String)
GetNum = Val(sWord)
End Function
See the syntax of the Val(string) function for full details of it's usage.
You're looking for the find function.. Example:
or in VBA instr() and left()
Since you know the pattern is always code followed by space just use left of the string for the number of characters to the first space found using instr. Sample in immediate window above. Loop is going to be slow, and while it may validate they are numeric why bother if you know pattern is code then space?
In similar situations in C# code, I leave the loop early after finding the first instance of a space character (32). In VBA, you'd use Exit For.
You can get rid of the function altogether and use this:
split("72381 Test 4Dx for Worms"," ")(0)
This will split the string into an array using " " as the split char. Then it shows us address 0 in the array (the first element)
In the context of your function if you are dead set on using one it is this:
Function OnlyNums(sWord As String)
OnlyNums = Split(sWord, " ")(0)
End Function
While I like the simplicity of Mark's solution, you could use an efficient parser below to improve your character by character search (to cope with strings that don't start with numbers).
test
Sub test()
MsgBox StrOut("72381 Test 4Dx")
End Sub
code
Function StrOut(strIn As String)
Dim objRegex As Object
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Pattern = "^(\d+)(\s.+)$"
If .test(strIn) Then
StrOut = .Replace(strIn, "$1")
Else
StrOut = "no match"
End If
End With
End Function

Strip out non-numeric characters in SELECT

In an MS Access 2007 project report, I have the following (redacted) query:
SELECT SomeCol FROM SomeTable
The problem is, that SomeCol apparently contains some invisible characters. For example, I see one result returned as 123456 but SELECT LEN(SomeCol) returns 7. When I copy the result to Notepad++, it shows as ?123456.
The column is set to TEXT. I have no control over this data type, so I can't change it.
How can I modify my SELECT query to strip out anything non-numeric. I suspect RegEx is the way to go... alternatively, is there a CAST or CONVERT function?
You mentioned using a regular expression for this. It is true that Access' db engine doesn't support regular expressions directly. However, it seems you are willing to use a VBA user-defined function in your query ... and a UDF can use a regular expression approach. That approach should be simple, easy, and faster performing than iterating through each character of the input string and storing only those characters you want to keep in a new output string.
Public Function OnlyDigits(ByVal pInput As String) As String
Static objRegExp As Object
If objRegExp Is Nothing Then
Set objRegExp = CreateObject("VBScript.RegExp")
With objRegExp
.Global = True
.Pattern = "[^\d]"
End With
End If
OnlyDigits = objRegExp.Replace(pInput, vbNullString)
End Function
Here is an example of that function in the Immediate window with "x" characters as proxies for your invisible characters. (Any characters not included in the "digits" character class will be discarded.)
? OnlyDigits("x1x23x")
123
If that is the output you want, just use the function in your query.
SELECT OnlyDigits(SomeCol) FROM SomeTable;
There is no RegEx in Access, at least not in SQL. If you venture to VBA, you might as well use a custom StripNonNumeric VBA function in the SQL statement.
e.g. SELECT StripNonNumeric(SomeCol) as SomeCol from SomeTable
Function StripNonNumeric(str)
keep = "0123456789"
outstr = ""
For i = 1 to len(str)
strChar = mid(str,i,1)
If instr(keep,strChar) Then
outstr = outstr & strChar
End If
Next
StripNonNumeric = outstr
End Function
You can do it all in a query, combining this question with your previous question, you get:
SELECT IIf(IsNumeric([atext]),
IIf(Len([atext])<4,Format([atext],"000"),
Replace(Format(Val([atext]),"#,###"),",",".")),
IIf(Len(Mid([atext],2))<4,Format(Mid([atext],2),"000"),
Replace(Format(Val(Mid([atext],2)),"#,###"),",","."))) AS FmtNumber
FROM Table AS t;
Public Function fExtractNumeric(strInput) As String
' Returns the numeric characters within a string in
' sequence in which they are found within the string
Dim strResult As String, strCh As String
Dim intI As Integer
If Not IsNull(strInput) Then
For intI = 1 To Len(strInput)
strCh = Mid(strInput, intI, 1)
Select Case strCh
Case "0" To "9"
strResult = strResult & strCh
Case Else
End Select
Next intI
End If
fExtractNumeric = strResult
End Function

How can I check if filename contains a portion of a string in vb.net

I have a userform in 2008 vb express edition. A part number is created from user input via a concat string. I want to then check if a certain portion of the part number exists in the existing file names in a directory. Below is a more detailed explanation.
This is my code for creating a part number from the user input on the form.
L_PartNo.Text = String.Concat(CB_Type.Text, CB_Face.Text, "(", T_Width.Text, "x", T_Height.Text, ")", mount, T_Qty.Text, weep, serv)
I then have the following code to tell the user if the configuration (part no) they just created exists
L_Found.Visible = True
If File.Exists("Z:\Cut Sheets\TCS Products\BLANK OUT SIGN\" & (L_PartNo.Text) & ".pdf") Then
L_Found.Text = "This configuration exists"
Else
L_Found.Text = "This configuration does NOT exist"
End If
This is where I need help. The part no will look like this BX002(30x30)A1SS I want to compare 002(30x30) (just this part of the file name) to all the files in one directory. I want a yes or no answer to the existance and not a list of all matching files. The code below is everything I've tried, not all at the same time.
Dim b As Boolean
b = L_PartNo.Text.Contains(NewFace)
Dim NewFace As String = String.Concat(CB_Face.Text, "(", T_Width.Text, "x", T_Height.Text, ")")
Dim NewFace = L_PartNo.Text.Substring(2, 10)
If filename.Contains(NewFace) Then
lNewFace.Visible = False
Else
lNewFace.Visible = True
End If
The code below was a translation from the answer in C# but it does not work either
Dim contains As Boolean = Directory.EnumerateFiles(path).Any(Function(f) [String].Equals(f, "myfilethree", StringComparison.OrdinalIgnoreCase))
Here's an example of how you can do it without the fancy LINQ and Lambda which seem to be confusing you:
Public Function FileMatches(folderPath As String, filePattern As String, phrase As String) As Boolean
For Each fileName As String In Directory.GetFiles(folderPath, filePattern)
If fileName.Contains(phrase) Then
Return True
End If
Next
Return False
End Function
Or, if you need it to be case insensitive:
Public Function FileMatches(folderPath As String, filePattern As String, phrase As String) As Boolean
For Each fileName As String In Directory.GetFiles(folderPath, filePattern)
If fileName.ToLower().Contains(phrase.ToLower()) Then
Return True
End If
Next
Return False
End Function
You would call the method like this:
lNewFace.Visible = FileMatches(path, "*.pdf", NewFace)
Try this:
lNewFace.Visible = IO.Directory.GetFiles(path, "*.pdf").Where(Function(file) file. _
Substring(2, 10) = NewFace).FirstOrDefault Is Nothing
Consider that the substring function will throw an exception if its arguments exceed the length of the string it is parsing