Substitute wildcard characters (? and *) in Excel 2010 vba macro - vba

Using a macro in Excel 2010, I am trying to replace all "invalid" characters (as defined by a named range) with spaces.
Dim sanitisedString As String
sanitisedString = Application.WorksheetFunction.Clean(uncleanString)
Dim validCharacters As Range
Set validCharacters = ActiveWorkbook.Names("ValidCharacters").RefersToRange
Dim pos As Integer
For pos = 1 To Len(sanitisedString)
If WorksheetFunction.CountIf(validCharacters, Mid(sanitisedString, pos, 1)) = 0 Then
sanitisedString = WorksheetFunction.Replace(sanitisedString, pos, 1, " ")
End If
Next
It works for all characters except * and ?, because CountIf is interpreting those as wildcard characters.
I have tried escaping all characters in the CountIf, using:
If WorksheetFunction.CountIf(validCharacters, "~" & Mid(sanitisedString, pos, 1)) = 0
but this led to all characters being replaced, regardless of whether they are in the list or not.
I then tried doing two separate Substitute commands, placed after the for loop using "~*" and "~?":
sanitisedString = WorksheetFunction.Substitute(sanitisedString, "~*", " ")
sanitisedString = WorksheetFunction.Substitute(sanitisedString, "~?", " ")
but the * and ? still make it through.
What am I doing wrong?

Since there are only two wildcards to worry about, you can test for those explicitly:
Dim character As String
For pos = 1 To Len(sanitisedString)
character = Mid(sanitisedString, pos, 1)
If character = "*" Or character = "?" Then character = "~" & character
If WorksheetFunction.CountIf(validCharacters, character) = 0 Then
Mid$(sanitisedString, pos, 1) = " "
End If
Next
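Another option is to avoid CountIf altogether, so that no character is ever treated as a wildcard or escape. The following is only a minimal sketch: SanitiseWithList and validChars are hypothetical names, and it assumes the valid characters have first been concatenated into one string (for example by looping over the cells of the ValidCharacters range). Note that vbBinaryCompare makes the check case-sensitive, unlike CountIf; vbTextCompare could be used instead if case should be ignored.

```vba
' Hypothetical helper: replaces every character of s that does not
' occur in validChars with a space. No wildcard handling is needed
' because InStr compares characters literally.
Function SanitiseWithList(ByVal s As String, ByVal validChars As String) As String
    Dim pos As Long
    For pos = 1 To Len(s)
        If InStr(1, validChars, Mid$(s, pos, 1), vbBinaryCompare) = 0 Then
            Mid$(s, pos, 1) = " "   ' overwrite the invalid character in place
        End If
    Next pos
    SanitiseWithList = s
End Function
```

With this approach "*", "?" and "~" are handled like any other character: they survive only if they appear in validChars.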

Related

String.Replace() for quotation marks

I am trying to run the following line of code to replace the Microsoft Word quotes with ones our database can store. I need to work around users copying strings from Microsoft Word into my textareas.
instrText = instrText.Replace("“", """).Replace("”", """)
I am getting syntax errors for the number of arguments.
I have tried character escapes and a couple other ways of formatting the arguments with no luck.
This changes the 'smart' quotes from Word:
'non standard quotes
Dim Squotes() As Char = {ChrW(8216), ChrW(8217)} 'single
Dim Dquotes() As Char = {ChrW(8220), ChrW(8221)} 'double
'build test string
Dim s As String = ""
For x As Integer = 0 To Squotes.Length - 1
s &= x.ToString & Squotes(x) & ", "
Next
For x As Integer = 0 To Dquotes.Length - 1
s &= (x + Squotes.Length).ToString & Dquotes(x) & ", "
Next
'replace
For Each c As Char In Squotes
s = s.Replace(c, "'"c)
Next
For Each c As Char In Dquotes
s = s.Replace(c, ControlChars.Quote)
Next
Try the following:
Private Function CleanInput(input As String) As String
DisplayUnicode(input)
'8216 = &H2018 - left single-quote
'8217 = &H2019 - right single-quote
'8220 = &H201C - left double-quote
'8221 = &H201D - right double-quote
'Return input.Replace(ChrW(&H2018), Chr(39)).Replace(ChrW(&H2019), Chr(39)).Replace(ChrW(&H201C), Chr(34)).Replace(ChrW(&H201D), Chr(34))
Return input.Replace(ChrW(8216), Chr(39)).Replace(ChrW(8217), Chr(39)).Replace(ChrW(8220), Chr(34)).Replace(ChrW(8221), Chr(34))
End Function
Private Sub DisplayUnicode(input As String)
For i As Integer = 0 To input.Length - 1
Dim lngUnicode As Long = AscW(input(i))
If lngUnicode < 0 Then
lngUnicode = 65536 + lngUnicode
End If
Debug.WriteLine(String.Format("char: {0} Unicode: {1}", input(i).ToString(), lngUnicode.ToString()))
Next
Debug.WriteLine("")
End Sub
Usage:
Dim cleaned As String = CleanInput(TextBoxInput.Text)
Resources:
ASCII table
C# How to replace Microsoft's Smart Quotes with straight quotation marks?
How to represent Unicode character in VB.Net String literal?
Note: Also used Character Map in Windows.
You have a solution that works above, but in keeping with your original form:
instrText = instrText.Replace(ChrW(8220), """"c).Replace(ChrW(8221), """"c)

How to split an unicode-string to readable characters?

I have a VBA function that splits a string and adds a space between each character. It works fine only for ASCII strings, but I want to do the same for the Tamil language. Because Tamil is Unicode, the result is not readable: the function also splits off the auxiliary characters (upper dots, prefix and suffix auxiliary characters), which should not be separated in Tamil, Hindi, Kannada, Malayalam, or any other Indian language. So how do I write a function that splits a Tamil word into readable characters?
Function AddSpace(Str As String) As String
Dim i As Long
For i = 1 To Len(Str)
AddSpace = AddSpace & Mid(Str, i, 1) & " "
Next i
AddSpace = Trim(AddSpace)
End Function
Adding Space is not the important point of this question. Splitting the Unicode string into an array from any of those languages is the requirement.
For example, the word, "பார்த்து" should be separated as "பா ர் த் து", not as "ப ா ர ் த ் த ு". As you can see, the first two letters "பா" (ப + ா) are combined. If I try to manually put a space in between them, I can't do it in any word processor. If you want to test, please put it in Notepad and add space between each character. It won't allow you to separate as ("ப ா"). So "பார்த்து" should be separated as "பா ர் த் து". It is the correct separation in Tamil like languages. This is the one that I am struggling to achieve in VBA.
The Character Code table for Tamil is here.
Tamil, Hindi, and many other Indian languages have (1) consonants, (2) independent vowels, (3) dependent vowel signs, and (4) two-part dependent vowel signs. Of these four types, each of the first two is a separate letter, so there is no issue with them. The last two are dependent and should not be separated from the character they are joined to. For example, the letter பா (ப + ா) contains one independent letter (ப) and one dependent sign (ா).
If this information is not enough, please comment with what else I should post.
(Note: It is possible in C#.Net using the code from the MS link by @Codo)
You can assign a string to a Byte array, so the following might work:
Dim myBytes() As Byte
myBytes = "Tamilstring"
This generates two bytes for each character. You could then create a second byte array twice the size of the first by using Space$ to create a suitable string, and then use a For loop (Step 4) to copy two bytes at a time from the first array to the second. Finally, assign the byte array back to a string.
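The byte-array idea described above could be sketched roughly as follows. SpaceOutViaBytes is a hypothetical name, and this is only one reading of the description. Because it copies one UTF-16 code unit at a time, it still separates a dependent vowel sign from its consonant, so on its own it does not keep Tamil letters together.

```vba
' Hypothetical sketch: inserts a space after every UTF-16 code unit
' by copying bytes between two Byte arrays.
Function SpaceOutViaBytes(ByVal s As String) As String
    Dim src() As Byte, dst() As Byte
    Dim i As Long, result As String
    If Len(s) = 0 Then Exit Function
    src = s                      ' two bytes per character
    dst = Space$(Len(s) * 2)     ' pre-filled with spaces, twice as many characters
    For i = 0 To UBound(src) Step 2
        dst(i * 2) = src(i)          ' low byte of the character
        dst(i * 2 + 1) = src(i + 1)  ' high byte of the character
    Next i
    result = dst                 ' assign the bytes back to a string
    SpaceOutViaBytes = Left$(result, Len(result) - 1) ' drop the final space
End Function
```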
The problem you have is you are looking for what Unicode calls an extended grapheme cluster.
For a Unicode compatible regex engine that is simply /\X/
Not sure how you do that in VBA.
This refers to the link mentioned by @ScottCraner in the comments on the question and to the character codes for Tamil.
The result is written to cell A2. The dependent vowel signs (highlighted in yellow in the worksheet) are the ones used in the DepVow string.
Sub Split_Unicode_String()
'https://stackoverflow.com/questions/68774781/how-to-split-an-unicode-string-to-readable-characters
Dim my_string As String
'input string
Dim buff() As String
'array of input string characters
Dim DepVow As String
'Create string of Dependent vowel signs
Dim newStr As String
'result string with spaces as desired
Dim i As Long
my_string = Range("A1").Value
ReDim buff(Len(my_string) - 1) 'array of my_string characters
For i = 1 To Len(my_string)
buff(i - 1) = Mid$(my_string, i, 1)
Cells(1, i + 2) = buff(i - 1)
Cells(2, i + 2) = AscW(buff(i - 1)) 'used this for creating DepVow below
Next i
'Create string of Dependent vowel signs (extend with further code points as needed)
DepVow = ChrW$(3006) & ChrW$(3021) & ChrW$(3009)
newStr = ""
For i = LBound(buff) To UBound(buff)
If Len(newStr) > 0 And InStr(1, DepVow, buff(i), vbBinaryCompare) > 0 Then
'dependent sign: remove the trailing space so it joins the previous character
newStr = Left$(newStr, Len(newStr) - 1) & buff(i) & " "
Else
newStr = newStr & buff(i) & " "
End If
Next i
'result string in range A2
Cells(2, 1) = Left(newStr, Len(newStr) - 1)
End Sub
Try the algorithm below, which concatenates each mark character with the letter character before it.
Dim letters() As String, i As Long, ch As String
ReDim letters(0)
For i = 1 To Len(Str)
ch = Mid$(Str, i, 1)
If AscW(ch) > 3005 And AscW(ch) < 3022 Then
'mark character: append it to the previous letter
letters(UBound(letters) - 1) = letters(UBound(letters) - 1) & ch
Else
'letter character: start a new entry (the last element stays empty)
ReDim Preserve letters(UBound(letters) + 1)
letters(UBound(letters) - 1) = ch
End If
Next
MsgBox Join(letters, ", ") 'returns பா, ர், த், து,

How do I get all text from all cells to one variable?

I have a large range in which I need to find all numbers that are between four and six digits long.
I know I can use regex for this, but I don't want to loop over each cell and check them all.
What I need is something like selecting the range, pasting it into Notepad, and copying the result back into a variable.
This way I can run the regex against the variable and find all matches at once.
I don't need to know where the number was found, I just need the numbers.
Is there any way to copy the values to a string like this?
Dim text As String
text = ActiveSheet.Range("C9:IQ56").Value
fails because the data types are not compatible.
If I use a Variant, I get a two-dimensional array of rows and columns.
My attempt to join the array was not successful either.
text = ActiveSheet.Range("C9:IQ56").Value
textstring = ""
For i = 1 To UBound(text, 1)
textstring = textstring & " " & Join(text(i))
Next i
Any help with this?
Use Application.Index to process one row at a time:
text = ActiveSheet.Range("C9:IQ56").Value
textstring = ""
For i = 1 To UBound(text, 1)
textstring = textstring & " " & Join(application.Index(text,i,0))
Next i
There are two problems in your code, the declaration and the dimensions of the variable. Here is what you can do:
Dim Text() As Variant
Text = ActiveSheet.Range("C9:IQ56").Value
textstring = ""
For i = 1 To UBound(Text, 1)
For j = 1 To UBound(Text, 2)
textstring = textstring & " " & Text(i, j)
Next j
Next i
Similar approach with delimiters concatenating row strings after loop
I added a timer and the option to use separators (delimiters) for rows (e.g. "|") as well as for columns (e.g. ","). I also demonstrate a way to join all row strings at once after the loop via Application.Transpose(), just for the sake of the art, though this is neither faster nor slower than @Scott Craner's valid solution :+).
Code
Sub arr2txt()
Const SEPROWS As String = "|" ' << change to space or any other separator/delimiter
Const SEPCOLS As String = "," ' << change to space or any other separator/delimiter
Dim v
Dim textstring As String, i As Long
Dim t As Double: t = Timer ' stop watch
v = ActiveSheet.Range("C2:E2000").Value ' get data into 1-based 2-dim datafield array
For i = 1 To UBound(v, 1)
v(i, 1) = Join(Application.Index(v, i, 0), SEPCOLS)
Next i
textstring = Join(Application.Transpose(Application.Index(v, 0, 1)), SEPROWS)
Debug.Print Format(Timer - t, "0.00 seconds needed")
End Sub
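Once all values are in one string, the regex step mentioned in the question might look like the sketch below. FindNumbers and allText are hypothetical names. It uses late binding via CreateObject("VBScript.RegExp"), so no library reference is needed, and it assumes the numbers are plain digit runs separated by spaces or other non-word characters.

```vba
' Hypothetical helper: returns every run of 4 to 6 digits found in allText.
Function FindNumbers(ByVal allText As String) As Collection
    Dim re As Object, m As Object
    Dim result As New Collection
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True               ' return all matches, not just the first
    re.Pattern = "\b\d{4,6}\b"     ' whole digit runs of length 4 to 6 only
    For Each m In re.Execute(allText)
        result.Add m.Value
    Next m
    Set FindNumbers = result
End Function
```

For example, FindNumbers(textstring) could be called right after the loop that builds textstring; longer digit runs such as a seven-digit number are skipped because of the \b word boundaries.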

How to replace a character within a string

I'm trying to convert WText into its ASCII code and put it into a TextBox; Numencrypt. But I don't want to convert the spaces into ASCII code.
How do I replace the spaces with null?
Current code:
Dim withSpace As String = Numencrypt.Text
For h = 1 To lenText
wASC = wASC & CStr(Asc(Mid$(WText, h, 1)))
Next h
Numencrypt.Text = wASC
Numencrypt2.Text = Numencrypt2.Replace(Numencrypt.Text, " ", "")
By the way, the TextBox Numencrypt2 is the WText without a space inside it.
Without knowing whether or not you want the null character or empty string I did the following in a console app so I don't have your variables. I also used a string builder to make the string concatenation more performant.
Dim withSpaces = "This has some spaces in it!"
withSpaces = withSpaces.Replace(" "c, ControlChars.NullChar)
Dim wASC As New System.Text.StringBuilder
For h = 1 To withSpaces.Length
wASC.Append($"{AscW(Mid(withSpaces, h, 1))} ") ' Added a space so you can see the boundaries between the character codes.
Next
Dim theResult = wASC.ToString()
Console.WriteLine(theResult)
You will find that if you use ControlChars.NullChar as I have, each place where you had a space is represented by a zero. That position is skipped entirely if you use Replace(" ", "") instead.

How to find and copy specific text inside a string VBA?

I have a bunch of strings that I need to extract phone numbers from. How do I get them out of a string and paste them into a worksheet, knowing that they all have the format
(??) ????-???? where each ? is a digit from 0 to 9, and knowing that there could be multiple phone numbers inside the same string?
Example:
"Acreaves Alimentos. Rodovia Do Pacifico, (68) 3546-4754 Br 317, Km 8, S/N - Zona Rura... Brasileia - AC | CEP: 69932-000. (68) 3546-5544. Enviar "
would return (68) 3546-4754 and (68) 3546-5544
I have a snippet of code here which sets up a regular expression for the format you have specified and searches the string, then providing a msgbox for each instance it finds.
You need to ensure that you have added (using Tools->References) the Microsoft VBScript Regular Expressions 5.5 reference, or you will fail to create the RegExp object initially.
The regex pattern in this case is specified to allow a bracket (escaped with a \ since otherwise it has special meaning in a regular expression), then two digits, each of which can be 0-9, a close bracket (escaped again), \s to indicate a space, followed by 4 digits in the character set 0-9, a dash (escaped again) and the final four digits in the 0-9 set.
Don't forget to set the regex Global attribute to True so that it returns all matches.
sString = "Acreaves Alimentos. Rodovia Do Pacifico, (68) 3546-4754 Br 317, Km 8, S/N - Zona Rura... Brasileia - AC | CEP: 69932-000. (68) 3546-5544 . Enviar"
Dim oReg : Set oReg = New RegExp
oReg.Global = True
oReg.Pattern = "\([0-9]{2}\)\s[0-9]{4}\-[0-9]{4}"
Set Matches = oReg.Execute(sString)
For Each oMatch In Matches
MsgBox oMatch.Value
Next
This should do what you require, based on your details and the string you provided.
If the format actually stays the same throughout, you can try something like this:
a = "Acreaves Alimentos. Rodovia Do Pacifico, (68) 3546-4754 Br 317, Km 8, S/N - Zona Rura... Brasileia - AC | CEP: 69932-000. (68) 3546-5544. Enviar "
arrNums = Split(a, "(")
For i = 1 To UBound(arrNums)
num = "(" & Left(arrNums(i), 13)
Debug.Print num ' each phone number in turn
Next
This function will return an array containing the numbers:
Function ReturnNumbers(s As String) As Variant
' s is already declared as the parameter, so it must not be declared again
Dim a As Variant, r As Variant, i As Long
a = Split(s, "(")
ReDim r(1 To UBound(a, 1))
For i = 1 To UBound(a, 1)
r(i) = "(" & Left(a(i), 13)
Next
ReturnNumbers = r
End Function