Excel extract from brackets - vba

The code from Get the value between the brackets works well if the cell contains just one "(text)".
Unfortunately, in my rows there are many "Sample (sample1) (sample2)" format sentences and I need the last part.
Function GetParen(strIn As String) As String
Dim objRegex As Object
Dim objRegMC As Object
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Pattern = "\((.+?)\)"
If .Test(strIn) Then
Set objRegMC = .Execute(strIn)
GetParen = objRegMC(0).submatches(0)
Else
GetParen = "No match"
End If
End With
Set objRegex = Nothing
End Function
Could somebody help me to modify the code? Because if the cell contains "Text (text part1) (text part2)" The result what I get is "text part1" but I need the "text part2".
Thanks.

Why bother with regex ?? Consider the alternative:
Public Function GetParen(strIn As String) As String
Dim gather As Boolean, L As Long, i As Long
Dim CH As String
gather = False
L = Len(strIn)
For i = L To 1 Step -1
CH = Mid(strIn, i, 1)
If gather Then GetParen = CH & GetParen
If CH = ")" Then gather = True
If CH = "(" Then Exit For
Next i
GetParen = Mid(GetParen, 2)
End Function
EDIT#1:
Simpler:
Public Function GetParen2(strIn As String) As String
ary = Split(strIn, "(")
bry = Split(ary(UBound(ary)), ")")
GetParen2 = bry(0)
End Function

tiborjan:
I have a function that I wrote for just that reason. Here it is:
Function SubStr(s1 As String, sLeft As String, sL_Occ As Integer, sRight As String, sR_Occ As Integer) As String
'Cuts a piece of text from between two strings within another string
Dim LeftBound As Integer, RightBound As Integer, i As Integer
If sLeft = "Minimum" Then
LeftBound = 0
Else
For i = 1 To sL_Occ
LeftBound = InStr(LeftBound + 1, s1, sLeft, vbTextCompare)
Next i
End If
LeftBound = LeftBound + Len(sLeft) - 1 'measure from the right of the left bound
If sRight = "Maximum" Then
RightBound = Len(s1) + 1
Else
For i = 1 To sR_Occ
RightBound = InStr(RightBound + 1, s1, sRight, vbTextCompare)
Next i
End If
SubStr = Mid(s1, LeftBound + 1, RightBound - LeftBound - 1)
End Function
It has 5 parameters:
s1: the string that contains the text you want to parse
sLeft: the left-bound of the text you want to parse. Use "(" for your application.
sL_Occ: iteration number of sLeft. So if you wanted the second "(" that appears in s1, make this 2.
sRight: same as sLeft, but the right-bound.
sR_Occ: same as sL_Occ, but referring to the right-bound.
To make it easy: use this function code instead of yours. If you want to pull text from the second set of parentheses, use
s1 = "(P1) (P2) (P3) (P4)"
sLeft = "("
sL_Occ = 2
sRight = ")"
sR_Occ = 2
The return in the above would be "P2".
Hope that helps!
Matt, via ExcelArchitect.com

or how about simply
Function LastParam(ByVal str As String) As String
Dim arr() As String
arr = Split(str, "(")
LastParam = Split(arr(UBound(arr, 1)), ")")(0)
End Function

For completeness, you would only need minor changes to your code to make it work with your regex.
Set the Global flag to True, and return the last match from the match collection.
Function GetParen(strIn As String) As String
Dim objRegex As Object
Dim objRegMC As Object
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Global = True
.Pattern = "\((.+?)\)"
If .Test(strIn) Then
Set objRegMC = .Execute(strIn)
GetParen = objRegMC(objRegMC.Count - 1).submatches(0)
Else
GetParen = "No match"
End If
End With
Set objRegex = Nothing
End Function

A minor tweak to my earlier Regexpwill extract the last match.
test
Sub Test()
MsgBox GetParen("(Sample (sample1) (sample2)")
End Sub
code
Function GetParen(strIn As String) As String
Dim objRegex As Object
Dim objRegMC As Object
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Pattern = "\((.+?)\)"
.Global = True
If .Test(strIn) Then
Set objRegMC = .Execute(strIn)
GetParen = objRegMC(objRegMC.Count - 1).submatches(0)
Else
GetParen = "No match"
End If
End With
Set objRegex = Nothing
End Function

Related

locate exact word in excel VBA

I have more than one text with the word "123" inside Textabc
like a123b , c123erf and 123
but I just wanna locate the exact word "123"
Text_u1 = Mid(Textabc, InStr(Text, "123"))
I tried &123& but not working
Thanks
Option Explicit
Sub GetWord()
Dim Textabc As String, s As Variant, i As Variant, abc As String, sz As Long
Dim foundStart As Long, foundLen As Long
Textabc = "like a123b , c123erf and 123"
abc = "123"
s = Split(Textabc)
For i = 0 To UBound(s)
sz = sz + Len(s(i)) + 1
If Trim(s(i)) = abc Then
foundStart = sz - Len(s(i))
foundLen = Len(Textabc) - (sz - Len(s(i))) + 1
Debug.Print "Textabc length: " & Len(Textabc)
Debug.Print "foundStart: " & foundStart
Debug.Print "foundLen: " & foundLen
Debug.Print Mid(Textabc, foundStart, foundLen)
End If
Next
End Sub
Try one of these two, depending on what you want:
Public Sub TestMe()
Debug.Print InStr("a123b", "123")
Debug.Print Mid("a123b", InStr("a123b", "123"))
End Sub
You can try using Regular Expressions
Sub Test()
Dim regEx As Object
Dim str As String
Set regEx = CreateObject("vbscript.RegExp")
str = "a123b , c123erf and 123"
With regEx
.Global = True
.IgnoreCase = True
.Pattern = "\b(123)"
Debug.Print regEx.Execute(str)(0).FirstIndex + 1
End With
End Sub
This will return the position of the first match it finds that is equal to just 123 everything else will be ignored. If there are more then one match you will need to loop over the output of regEx.Execute(str) to get each position

Word occurences in VBA: how to speed up

I need to write a MS Word macro to count occurence of every word within a given document and print out the list like .
I did the macro and it works, but it is so sloooow, it takes several hours to get results for a document of 60000 words.
Could you please give me some advices/suggestions on how to make the macro run faster?
(I checked a similar question here WORD VBA Count Word Occurrences but still don't get it how to speed up and need my macro to be reviewed). Thank you.
Private Type WordStatData
WordText As String
WordCount As Integer
End Type
Option Base 1
'Check if the word is valid
Private Function IsValidWord(SomeString As String) As Boolean
Dim Retval As Boolean
Retval = True
If Not (InStr(SomeString, " ") = 0) Then Retval = False
If Not (InStr(SomeString, ".") = 0) Then Retval = False
If Not (InStr(SomeString, ",") = 0) Then Retval = False
If Not InStr(SomeString, "0") = 0 Then Retval = False
If Not InStr(SomeString, "1") = 0 Then Retval = False
If Not InStr(SomeString, "2") = 0 Then Retval = False
If Not InStr(SomeString, "3") = 0 Then Retval = False
If Not InStr(SomeString, "4") = 0 Then Retval = False
If Not InStr(SomeString, "5") = 0 Then Retval = False
If Not InStr(SomeString, "6") = 0 Then Retval = False
If Not InStr(SomeString, "7") = 0 Then Retval = False
If Not InStr(SomeString, "8") = 0 Then Retval = False
If Not InStr(SomeString, "9") = 0 Then Retval = False
IsValidWord = Retval
End Function
Private Sub CommandButton1_Click()
SpanishLCID = 3082 'The source text is in Spanish
ListBox1.Clear
Dim WordsTotal As Long
WordsTotal = ActiveDocument.Words.Count
TextBox1.Text = Str(WordsTotal)
Dim Wordfound As Boolean
Dim NewWord As String
Dim MyData() As WordStatData
ReDim Preserve MyData(1)
NewWord = ""
For i = 1 To WordsTotal
NewWord = Trim(StrConv(Trim(ActiveDocument.Words(i)), vbLowerCase, SpanishLCID))
'Check if the word is in the list
If IsValidWord(NewWord) Then
Wordfound = False
For j = 1 To UBound(MyData)
If StrComp(MyData(j).WordText, NewWord) = 0 Then
Wordfound = True: Exit For
End If
Next j
If Wordfound Then
MyData(j).WordCount = MyData(j).WordCount + 1
Else
ReDim Preserve MyData(UBound(MyData) + 1)
MyData(UBound(MyData)).WordText = NewWord
MyData(UBound(MyData)).WordCount = 1
End If
End If
Next i
'Printing out the word list
For i = 1 To UBound(MyData)
ListBox1.AddItem (MyData(i).WordText & "=" & Str(MyData(i).WordCount))
Next i
End Sub
Add a reference to the Microsoft Scripting Runtime (Tools -> References...). Then use the following:
Private Sub CommandButton1_Click()
Const SpanishLCID = 3082
Dim dict As New Scripting.Dictionary, word As Variant, fixedWord As String
Dim key As Variant
dict.CompareMode = SpanishLCID
For Each word In ActiveDocument.Words
fixedWord = Trim(StrConv(Trim(word), vbLowerCase, SpanishLCID))
If Not dict.Exists(fixedWord) Then
dict(fixedWord) = 1
Else
dict(fixedWord) = dict(fixedWord) + 1
End If
Next
ListBox1.Clear
For Each key In dict.Keys
ListBox1.AddItem key & "=" & dict(key)
Next
End Sub
NB. Word treats each punctuation symbol or paragraph as a new word. It may be advisable to specify another Dictionary or Collection with the strings that shouldn't be added to the dictionary, and test for those strings using .Exists before adding to the dictionary.
A more concise version of IsValidWord without regular expressions:
Function IsValidWord(s As String) As Boolean
Const validChars As String = "abcdefghijklmnopqrstuvwxyz"
Dim i As Integer, char As String * 1
For i = 1 To Len(s)
char = Mid(s, i, 1)
If InStr(1, validChars, char, vbTextCompare) = 0 Then Exit Function
Next
IsValidWord = True
End Function
and using regular expressions (add a reference to Microsoft VBScript Regular Expressions 5.5):
Dim regex As RegExp
Function IsValidWord2(s As String) As Boolean
If regex Is Nothing Then
Set regex = New RegExp
regex.Pattern = "[^a-z]"
regex.IgnoreCase = True
End If
IsValidWord2 = Not regex.Test(s)
End Function
and using regular expressions with replacement:
Function GetValidWord(s As String) As String
'GetValidWord("Introduction.......3") will return "Introduction"
If regex2 Is Nothing Then
Set regex2 = New RegExp
regex2.Pattern = "[^a-z]"
regex2.Global = True
regex2.IgnoreCase = True
End If
GetValidWord = regex2.Replace(s, "")
End Function
and you would use it as follows:
For Each word In ActiveDocument.Words
fixedWord = Trim(StrConv(Trim(word), vbLowerCase, SpanishLCID))
fixedWord = GetValidWord(fixedWord)
If Not dict.Exists(fixedWord) Then
NB: You might combine the language conversion and Trim into GetValidWord.

How to add slashes between the chars of a String?

I am trying to add slashes between characters in a string, e. g. hello -> h/e/l/l/o
I tried to use Replace(string, "", "/") but the String remained unchanged.
Try this:
Sub Main
dim initial_text as string
dim final_text as string
initial_text = "hello"
final_text = ""
Dim i As Integer
For i = 1 To len(initial_text)
if i = len(initial_text) then
final_text = final_text + Mid(initial_text,i,1)
else
final_text = final_text + Mid(initial_text,i,1) + "/"
end if
Next i
msgbox final_text
End Sub
EDIT
For study porpuse, I add another answer based on comments from (The next answer is a good answer too):
Dim s As String
s = "Hello"
#wqw
Debug.Print Left(Replace(StrConv(s, vbUnicode), Chr$(0), "/"), Len(s) * 2 - 1)
#Jeeped
Debug.Print Left(Join(Split(StrConv(s, vbUnicode), Chr$(0)), "/"), Len(s) * 2 - 1)
Here is an emulation of preg_replace function from PHP that I'm using which allows a simple regex to do the job. The look-ahead part gets rid of the last slash too.
Option Explicit
Private Sub Form_Load()
Dim s As String
s = "Hello"
Debug.Print preg_replace("(.)(?=.)", "$1/", s)
End Sub
Public Function preg_replace(sPattern As String, sReplace As String, sText As String) As String
Dim lIdx As Long
With CreateObject("VBScript.RegExp")
.Global = True
If Left$(sPattern, 1) = "/" Then
lIdx = InStrRev(sPattern, "/")
.Pattern = Mid$(sPattern, 2, lIdx - 2)
.IgnoreCase = (InStr(lIdx, sPattern, "i") > 0)
.MultiLine = (InStr(lIdx, sPattern, "m") > 0)
Else
.Pattern = sPattern
End If
preg_replace = .Replace(sText, sReplace)
End With
End Function

How to find the first instance of a "lower case character" in a cell?

How to find the first instance of a "lower case character" in a cell using VBA in Excel?
I have tried using ASCII values but that didn't work.
Try the following small UDF:
Public Function Findlower(rin As Range) As Long
Dim v As String, CH As String, i As Long
Findlower = 0
v = rin.Text
L = Len(v)
For i = 1 To L
If Mid(v, i, 1) Like "[a-z]" Then
Findlower = i
Exit Function
End If
Next i
End Function
It will return the position of the first instance of any lower case letter in a string:
You can use a RegExp in a UDF to avoid looping through each character:
Function FirstLower(strIn As String) as String
Dim objRegex As Object
Dim objRegM As Object
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Pattern = "[a-z]"
.ignorecase = False
If .test(strIn) Then
Set objRegM = .Execute(strIn)(0)
FirstLower = objRegM.firstindex + 1
Else
FirstLower = "no match"
End If
End With
End Function
I think you want to remove the first part in your string that is in lower case:
Public Function DeletFirstLowerPart(strTemp As String) As String
Dim strResult As String, i As Long, findLower As Boolean
strResult = ""
findLower = False
For i = 1 To Len(strTemp)
If (Mid(strTemp, i, 1) Like "[a-z]") Then
findLower = True
Else
If findLower = True Then
strResult = strResult & Mid(strTemp, i)
DeletFirstLowerPart = strResult
Exit Function
End If
strResult = strResult & Mid(strTemp, i, 1)
End If
Next i
DeletFirstLowerPart = strResult
End Function
DeletFirstLowerPart("ABCdefGHI") = "ABCGHI"
DeletFirstLowerPart("ABCdefGHIjkl") = "ABCGHIjkl"
Private Function DeleteLowerCasesLike(InputString As String) As String
Dim i As Integer
For i = 1 To Len(InputString)
If Mid(InputString, i, 1) Like "[a-z]" Then
InputString = Left(InputString, i - 1) & Mid(InputString, i + 1)
i = i - 1
End If
Next
DeleteLowerCasesLike = InputString
End Function
Another RegExp solution which needs to addig referance to Microsoft VBScript Regular Expressions 1.0 (In the VBA window, Tools->Referances menu)
Private Function DeleteLowerCasesRegExp(InputString As String)
Dim RE As New RegExp
With RE
.Global = True
.IgnoreCase = False
.Pattern = "[a-z]"
DeleteLowerCasesRegExp = .Replace(InputString, "")
End With
End Function
And another solution nor Like neither RegExp is used:
Private Function DeleteLowerCasesAsc(InputString As String) As String
Dim i As Integer
For i = 1 To Len(InputString)
If Mid(InputString, i, 1) = Empty Then Exit For
If Asc(Mid(InputString, i, 1)) >= 97 And Asc(Mid(InputString, i, 1)) <= 122 Then
InputString = Left(InputString, i - 1) & Mid(InputString, i + 1)
i = i - 1
End If
Next
DeleteLowerCasesAsc = InputString
End Function
Another solution in which replace function is used:
Private Function DeleteLowerCasesReplace(InputString As String) As String
Dim i As Integer
For i = 97 To 122
InputString = Replace(InputString, Chr(i), "")
Next
DeleteLowerCasesReplace = InputString
End Function

Extracting text to table without special characters

this script is going to extract each text between () from whole string. Now i would prepare it as a function because many string will be taking under that then i would like to add each extracted words inside table/or list. Each of extracted texts between () should be without special characters which sometimes could occur inside text and i would like to cut them of the string (i would like only to stay A-Z a-z 0-9 how could i do it all?
Sub Main()
Dim s$
s = "hsus(irt)bla dsd (got)(rifk)"
Debug.Print extract_value(s)
End Sub
Public Function extract_value$(s$)
Dim returnS$
Dim v
v = Split(s, Chr(40))
For Each Item In v
If InStr(Item, Chr(41)) Then
returnS = returnS & Chr(32) & Split(Item, ")")(0)
End If
Next
extract_value = Trim$(returnS)
End Function
When parsing you can use a helper function to check for just letters and numbers using the ASCII encoding.
Function validChr(chr As String) as Boolean
Dim ascCode As Integer
ascCode = Asc(chr)
If ascCode >= 65 And ascCode <= 90 Then 'Uppercase
validChr = True
ElseIf ascCode >= 97 And ascCode <= 122 Then 'Lowercase
validChr = True
ElseIf ascCode >= 48 And ascCode <= 57 Then 'Numbers
validChr = True
Else
validChr = False
End Function
You can also look at using RegEx.
This two stage `Regexp`
Converts hsus(ir%$%^t)bla dsd (g°ot)(rifk) to ir%$%^t g°ot rifk
In a single shot ir%$%^t g°ot rifkto irt got rifk
test sub
Sub Main()
Dim strTest As String
strTest = "hsus(ir%$%^t)bla dsd (g°ot)(rifk)"
MsgBox GrabIt(strTest)
End Sub
main sub
Function GrabIt(strIn As String) As String
Dim objRegex As Object
Dim objRegMC As Object
Dim objRegM As Object
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Pattern = "\((.*?)\)"
.Global = True
If .test(strIn) Then
Set objRegMC = .Execute(strIn)
For Each objRegM In objRegMC
GrabIt = GrabIt & Chr(32) & objRegM.submatches(0)
Next
End If
.Pattern = "[^\w|\s]+"
GrabIt = .Replace(GrabIt, vbNullString)
End With
End Function