Removing characters with .Replace in VBA for Excel

The following function was given to me in an answer to a question that I asked earlier today.
What I'm trying to do is remove a character from a string in Excel using VBA. However, whenever the function runs, it ends up erasing the stored value and returning a #VALUE! error. I cannot seem to figure out what is going on. Would anyone mind explaining, or suggesting an alternative?
Function ReplaceAccentedCharacters(S As String) As String
Dim I As Long
With WorksheetFunction
For I = 1 To Len(S)
Select Case Asc(Mid(S, I, 1))
' Extraneous coding removed. Leaving the examples which
' do work and the one that is causing the problem.
Case 32
S = .Replace(S, I, 1, "-")
Case 94
S = .Replace(S, I, 1, "/")
' This is the coding that is generating the error.
Case 34
S = .Replace(S, I, 1, "")
End Select
Next I
End With
ReplaceAccentedCharacters = S
End Function
When the string contains a " (or character code 34 in Decimal, 22 in Hexadecimal... I used both) it is supposed to remove the quotation mark. However, instead, Excel ignores it, and still returns the " mark anyway.
I then tried to go ahead and replace the .Replace() clause with another value.
Case 34
S = .Replace(S, I, 1, "/")
End Select
Using the code above, the script indeed does replace the " with a /.
I ended up finding the following example here in Stack Overflow:
https://stackoverflow.com/a/7386565/692250
In the answer given there, I see a code example almost identical to the one I gave, yet it still does nothing for me: Excel is still ignoring the quotation mark. I even went so far as to expand the definition with curly braces and still did not get anything.

Try this:
Function blah(S As String) As String
Dim arr, i
'array of [replace, with], [replace, with], etc
arr = Array(Chr(32), "-", Chr(94), "/", Chr(34), "")
For i = LBound(arr) To UBound(arr) Step 2
S = Replace(S, arr(i), arr(i + 1))
Next i
blah = S
End Function

This function was designed to replace one character with another, not to replace a character with nothing. When a character is replaced with nothing, the string gets shorter, but the loop's upper bound, Len(S), was fixed when the loop started. On the last iteration the counter therefore points at a character position greater than the length of the string. Mid returns an empty string there, and Asc("") raises an error. Other errors in the replacement routine can also occur whenever the length of the string changes while the code is running.
To modify the routine to replace a character with nothing, I would suggest the following:
In the Case statements:
Case 34
S = .Replace(S, I, 1, Chr(1))
And in the assignment statement:
ReplaceAccentedCharacters = Replace(S, Chr(1), "")
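Note that the VBA Replace function is different from WorksheetFunction.Replace: the former replaces every occurrence of a substring, while the latter replaces a given number of characters at a given position.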
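A different way around the shifting-index problem described above is to walk the string from the end towards the start, so that deleting a character never moves a position that has not been visited yet. The following is only a minimal sketch of that idea; the function name ReplaceCharsBackward is made up for illustration, and it uses plain Left$/Mid$ instead of WorksheetFunction.Replace.

```vba
' Hypothetical helper: same three cases as the question, iterating backwards.
Function ReplaceCharsBackward(ByVal S As String) As String
    Dim I As Long
    ' Going from Len(S) down to 1 means a deletion only affects
    ' positions that have already been processed.
    For I = Len(S) To 1 Step -1
        Select Case Asc(Mid$(S, I, 1))
            Case 32   ' space -> hyphen
                S = Left$(S, I - 1) & "-" & Mid$(S, I + 1)
            Case 94   ' caret -> slash
                S = Left$(S, I - 1) & "/" & Mid$(S, I + 1)
            Case 34   ' double quote -> removed
                S = Left$(S, I - 1) & Mid$(S, I + 1)
        End Select
    Next I
    ReplaceCharsBackward = S
End Function
```

For example, ReplaceCharsBackward("a b""c") should return a-bc. For whole-string substitutions like these, the array-based Replace loop shown earlier is probably simpler; the backward loop is mainly useful when the decision depends on the position.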

Related

How can I replace a string without the replace function?

I have a long string of random letters and I need to remove a couple of the front letters a few at a time. By using the replace function, if I replace a piece of string that then repeats later on, it removes the piece of string entirely from the long string instead of just the beginning.
Is there a way to remove a piece of string without using the replace function? The code below might clear up some of the confusion.
Dim protein As String
protein = "GLSDGEWQQVLNVWGKVEADIAGHGQEVLIRLFTGHPETLEKFDKFKHLKTEAEMKASEDLKKHGTVVLTALGGILKKKEGHHEAELKPLAQSHATKHKIPIKYLEFISDAIIHVLHSKHRPGDFGADAQGAMTKALELFRNDIAAKYKELGFQG"
Dim IndexPosition
For Each index In protein
If index = "K" Or index = "R" Then
IndexPosition = InStr(protein, index)
Dim NextPosition = IndexPosition + 1
Dim NextLetter = Mid(protein, NextPosition, 0)
If NextLetter <> "P" Then
Dim PortionToCutOut = Mid(protein, 1, IndexPosition)
protein = Replace(protein, PortionToCutOut, "")
Console.WriteLine(PortionToCutOut)
End If
End If
Next index
Regex might be a simpler way to solve this:
Regex.Replace(protein, "^(.*?)[KR][^P]", "$1")
It means "from the start of the string, for zero or more captured characters up to the first occurrence of K or R followed by anything other than P, replace it with (the captured string)"
GLSDGEWQQVLNVWGKVEADIAGHGQEVLIRLFTGHPETL
^^^^^^^^^^^^^^^^^
captured string||
               xx
Everything underlined with ^^^ is replaced by everything apart from the xx bit
It makes a single replacement, because that's what I interpreted you required when you said:
By using the replace function, if I replace a piece of string that then repeats later on, it removes the piece of string entirely from the long string instead of just the beginning
However if you do want to replace all occurrences of "K OR R followed by not P" it gets simpler:
Regex.Replace(protein, "[KR][^P]", "")
This is "K or R followed by anything other than P", replace with "nothing"
There are several issues with your code. The first issue that is likely to throw an exception is that you're modifying a collection in a For/Each loop.
The second issue that is less severe in immediate impact, but just as important in my opinion is that you're using almost exclusively legacy Visual Basic methods.
InStr should be replaced with IndexOf: https://learn.microsoft.com/en-us/dotnet/api/system.string.indexof
Mid should be replaced with Substring: https://learn.microsoft.com/en-us/dotnet/api/system.string.substring
The third issue is that you're not using the short-circuit operator OrElse in your conditional statement. Or will evaluate the right-hand side of your condition regardless of whether the left-hand side is true, whereas OrElse won't bother to evaluate the right-hand side if the left-hand side is true.
In terms of wanting to remove a piece of the String without using Replace, well you'd use Substring as well.
Consider this example:
Dim protein = "GLSDGEWQQVLNVWGKVEADIAGHGQEVLIRLFTGHPETLEKFDKFKHLKTEAEMKASEDLKKHGTVVLTALGGILKKKEGHHEAELKPLAQSHATKHKIPIKYLEFISDAIIHVLHSKHRPGDFGADAQGAMTKALELFRNDIAAKYKELGFQG"
Dim counter = 0
Do While counter < protein.Length - 2
counter += 1
Dim currentLetter = protein(counter)
Dim nextLetter = protein(counter + 1)
If (currentLetter = "K"c OrElse currentLetter = "R"c) AndAlso nextLetter <> "P"c Then
protein = protein.Substring(0, counter) & protein.Substring(counter + 1)
End If
Loop
Example: https://dotnetfiddle.net/vrhRdO

Excel VBA Using wildcard to replace string within string

I have a difficult situation and so far no luck in finding a solution.
My VBA collects number figures like $80,000.50. and I'm trying to get VBA to remove the last period to make it look like $80,000.50 but without using right().
The problem is after the last period there are hidden spaces or characters which will be a whole lot of new issue to handle so I'm just looking for something like:
replace("$80,000.50.",".**.",".**")
Is this possible in VBA?
I cant leave a comment so....
what about InStrRev?
Private Sub this()
Dim this As String
this = "$80,000.50."
this = Left(this, InStrRev(this, ".") - 1)
Debug.Print ; this
End Sub
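One thing to watch with the InStrRev approach: if the string contains no period at all, InStrRev returns 0 and Left(this, -1) raises a run-time error. A small sketch of a guarded version follows; the function name RemoveLastPeriod is hypothetical. Like the code above, it drops the last period and everything after it, which also discards any hidden trailing characters.

```vba
' Hypothetical helper: returns the text before the last period,
' or the input unchanged when there is no period.
Function RemoveLastPeriod(ByVal s As String) As String
    Dim pos As Long
    pos = InStrRev(s, ".")
    If pos > 0 Then
        RemoveLastPeriod = Left$(s, pos - 1)
    Else
        RemoveLastPeriod = s
    End If
End Function
```

Called as RemoveLastPeriod("$80,000.50. "), this should give $80,000.50.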
Mid + Find
You can use Mid and Find functions. Like so:
The Find will find the first dot . character. If all the values you are collecting are currency with 2 decimals, stored as text, this will work well.
The formula is: =MID(A2,1,FIND(".",A2)+2)
VBA solution
Function getStringToFirstOccurence(inputUser As String, FindWhat As String) As String
getStringToFirstOccurence = Mid(inputUser, 1, WorksheetFunction.Find(FindWhat, inputUser) + 2)
End Function
Other possible solutions, hints
Trim + Clean + Substitute(CHAR(160)): Chandoo - Untrimmable Spaces – Excel Formula
Ultimately, you can implement Regular expressions into Excel UDF: VBScript’s Regular Expression Support
How about:
Sub dural()
Dim r As Range
For Each r In Selection
s = r.Text
l = Len(s)
For i = l To 1 Step -1
If Mid(s, i, 1) = "." Then
r.Value = Mid(s, 1, i - 1) & Mid(s, i + 1)
Exit For
End If
Next i
Next r
End Sub
This will remove the last period and leave all the other characters intact.
EDIT#1:
This version does not require looping over the characters in the cell:
Sub qwerty()
Dim r As Range
For Each r In Selection
If InStr(r.Value, ".") > 0 Then r.Characters(InStrRev(r.Text, "."), 1).Delete
Next r
End Sub
Shortest Solution
Simply use the Val command. I assume this is meant to be a numerical figure anyway? Get rid of commas and the dollar sign, then convert to value, which will ignore the second point and any other trailing characters! Robustness not tested, but seems to work...
Dim myString as String
myString = "$80,000.50. junk characters "
' Remove commas and dollar signs, then convert to value.
Dim myVal as Double
myVal = Val(Replace(Replace(myString,"$",""),",",""))
' >> myVal = 80000.5
' If you're really set on getting a formatted string back, use Format:
myString = Format(myVal, "$#,##0.00")
' >> myString = $80,000.50
From the Documentation,
The Val function stops reading the string at the first character it can't recognize as part of a number. Symbols and characters that are often considered parts of numeric values, such as dollar signs and commas, are not recognized.
This is why we must first remove the dollar sign, and why it ignores all the junk after the second dot, or for that matter anything non numerical at the end!
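To make the behaviour described above concrete, here is a short sketch that can be run from the VBA editor. The names ParseCurrencyText and ValSketch are made up for illustration. Note also that Val only ever treats the period as the decimal separator, regardless of regional settings.

```vba
' Hypothetical wrapper around the Val approach shown above.
Function ParseCurrencyText(ByVal s As String) As Double
    ' Strip the symbols Val cannot read, then let Val stop at the
    ' first character that is not part of a number.
    ParseCurrencyText = Val(Replace(Replace(s, "$", ""), ",", ""))
End Function

Sub ValSketch()
    Debug.Print Val("80000.50. junk")                 ' 80000.5 (stops at the second period)
    Debug.Print Val("$80,000.50")                     ' 0 (stops immediately at the dollar sign)
    Debug.Print ParseCurrencyText("$80,000.50. junk") ' 80000.5
End Sub
```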
Working with Strings
Edit: I wrote this solution first but now think the above method is more comprehensive and shorter - left here for completeness.
Trim() removes leading and trailing spaces from a string (only ordinary spaces, not other hidden characters). Then you could simply use Left() to get rid of the last point...
' String with trailing spaces and a final dot
Dim myString as String
myString = "$80,000.50. "
' Get rid of whitespace at end
myString = Trim(myString)
' Might as well check if there is a final dot before removing it
If Right(myString, 1) = "." Then
myString = Left(myString, Len(myString) - 1)
End If
' >> myString = "$80,000.50"

How do I program a loop into a DDEPoke call on VBA?

I am attempting to program a loop into a DDEPoke call to a VBA-supported function known as OPC. This will enable me to write to a PLC (RSLogix 500) database from an excel spreadsheet.
This is the code:
Private Function Open_RsLinx()
On Error Resume Next
Open_RsLinx = DDEInitiate(RsLinx, C1)
If Err.Number <> 0 Then
MsgBox "Error Connecting to topic", vbExclamation, "Error"
OpenRSLinx = 0 'Return false if there was an error
End If
End Function
Sub CommandButton1_Click()
RsLinx = Open_RsLinx()
For i = 0 To 255
DDEPoke RsLinx, "N16:0", Cells(1 + i, 2)
Next i
DDETerminate RsLinx
End Sub
This code works and will, if there is a link set up with an OPC server (in this case through RSLinx) write data to the PLC.
The problem is that I can't get the part DDEPoke RsLinx, "N16:0", Cells(1 + i, 2) to write data, sequentially, from one excel cell to one element of the PLC's data array.
I tried to do DDEPoke RsLinx, "N16:i", Cells(1 + i, 2) and DDEPoke RsLinx, "N16:0+i", Cells(1 + i, 2) but neither has any effect and the program doesn't write anything at all.
How can I set up the code to get N16:0 to increment all the way up to N16:255 and then stop?
Break the variable i out of the string. Be careful with the type conversion though: Str() prepends a leading space to non-negative numbers, whereas CStr() does not. Thus, convert the number with Str(i), then wrap it with Trim() to make sure there are no extra spaces, and concatenate that result back onto your "N" string:
RsLinx = Open_RsLinx()
For i = 0 To 255
DDEPoke RsLinx, "N16:" & Trim(Str(i)), Cells(1 + i, 2)
Next i
The reason the i didn't work when it's inside the string is that in VBA, anything within a set of quotes is considered a literal string. Unlike some other languages (PHP comes to mind) where variables can be resolved within a string like that, VBA must have variables concatenated. Consider the following:
Dim s As String
s = "world"
Debug.Print "Hello s!"
This outputs the literal of Hello s! to the immediate window, because s is treated not as a variable, but as part of the literal string. The correct way is through concatenation:
Dim s As String
s = "world"
Debug.Print "Hello " & s & "!"
That outputs the expected Hello world! to the immediate window, because s is now treated as a variable and is resolved and concatenated.
If that were not the case, the following might be difficult to deal with:
Dim i As Integer
For i = 0 to 9
Debug.Print "this" & i
Next i
You would then have:
th0s0
th1s1
th2s2
th3s3
th4s4
'etc
That'd make things pretty difficult to manage in a lot of cases.
With all that said, there are some languages - notably PHP - where a string in double quotes ("") does in fact resolve a variable embedded in the string itself:
$i = 5;
echo "this is number $i";
VBA does not have this feature.
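As a quick illustration of the conversion point made above, the following sketch prints what Str() and CStr() produce and then builds the addresses with CStr(), which needs no Trim(). The procedure name AddressSketch is made up; nothing here talks to RSLinx, it only prints to the immediate window.

```vba
Sub AddressSketch()
    Dim i As Long
    ' Str() reserves a leading space for the sign of non-negative numbers.
    Debug.Print "[" & Str(5) & "]"    ' [ 5]
    ' CStr() does not.
    Debug.Print "[" & CStr(5) & "]"   ' [5]
    ' Building the element addresses by concatenation.
    For i = 0 To 3
        Debug.Print "N16:" & CStr(i)  ' N16:0, N16:1, N16:2, N16:3
    Next i
End Sub
```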
Hope it helps...

Word VBA: iterating through characters incredibly slow

I have a macro that changes single quotes in front of a number to an apostrophe (or close single curly quote). Typically when you type something like "the '80s" in word, the apostrophe in front of the "8" faces the wrong way. The macro below works, but it is incredibly slow (like 10 seconds per page). In a regular language (even an interpreted one), this would be a fast procedure. Any insights why it takes so long in VBA on Word 2007? Or if someone has some find+replace skills that can do this without iterating, please let me know.
Sub FixNumericalReverseQuotes()
Dim char As Range
Debug.Print "starting " + CStr(Now)
With Selection
total = .Characters.Count
' Will be looking ahead one character, so we need at least 2 in the selection
If total < 2 Then
Return
End If
For x = 1 To total - 1
a_code = Asc(.Characters(x))
b_code = Asc(.Characters(x + 1))
' We want to convert a single quote in front of a number to an apostrophe
' Trying to use all numerical comparisons to speed this up
If (a_code = 145 Or a_code = 39) And b_code >= 48 And b_code <= 57 Then
.Characters(x) = Chr(146)
End If
Next x
End With
Debug.Print "ending " + CStr(Now)
End Sub
Besides the two explicit questions (why is it so slow, and how to do it without iterating), there is an implied one: how to iterate properly through a Word object collection.
The answer is to use the object's Next property rather than accessing items by index.
That is, instead of:
For i = 1 to ActiveDocument.Characters.Count
'Do something with ActiveDocument.Characters(i), e.g.:
Debug.Print ActiveDocument.Characters(i).Text
Next
one should use:
Dim ch as Range: Set ch = ActiveDocument.Characters(1)
Do
'Do something with ch, e.g.:
Debug.Print ch.Text
Set ch = ch.Next 'Note iterating
Loop Until ch is Nothing
Timing: 00:03:30 vs. 00:00:06, more than 3 minutes vs. 6 seconds.
Found on Google, link lost, sorry. Confirmed by personal exploration.
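Applied to the quote-fixing macro from the question, the same Next-based iteration might look roughly like the sketch below. This is untested against real documents and is only meant to show the shape of the loop; the procedure name is made up, and it assumes, as the question's code does, that Asc returns 145 for the opening curly quote on the machine where it runs.

```vba
Sub FixNumericalReverseQuotesWithNext()
    Dim ch As Range
    Dim nxt As Range
    Dim lastPos As Long

    ' Looking ahead one character, so at least 2 are needed.
    If Selection.Characters.Count < 2 Then Exit Sub
    lastPos = Selection.End

    Set ch = Selection.Characters(1)
    Set nxt = ch.Next(Unit:=wdCharacter, Count:=1)
    Do While Not nxt Is Nothing
        ' Stop once the look-ahead character is past the selection.
        If nxt.Start >= lastPos Then Exit Do
        ' A straight or opening single quote followed by a digit.
        If (Asc(ch.Text) = 145 Or Asc(ch.Text) = 39) And nxt.Text Like "#" Then
            ch.Text = Chr(146)
        End If
        ' Advance by reusing the range already fetched, not by index.
        Set ch = nxt
        Set nxt = ch.Next(Unit:=wdCharacter, Count:=1)
    Loop
End Sub
```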
Modified version of @Comintern's "Array method":
Sub FixNumericalReverseQuotes()
Dim chars() As Byte
chars = StrConv(Selection.Text, vbFromUnicode)
Dim pos As Long
For pos = 0 To UBound(chars) - 1
If (chars(pos) = 145 Or chars(pos) = 39) _
And (chars(pos + 1) >= 48 And chars(pos + 1) <= 57) Then
' Make the change directly in the selection so track changes is sensible.
' I have to use 213 instead of 146 for reasons I don't understand--
' probably has to do with encoding on Mac, but anyway, this shows the change.
Selection.Characters(pos + 1) = Chr(213)
End If
Next pos
End Sub
Maybe this?
Sub FixNumQuotes()
Dim MyArr As Variant, MyString As String, X As Long, Z As Long
Debug.Print "starting " + CStr(Now)
For Z = 145 To 146
MyArr = Split(Selection.Text, Chr(Z))
For X = LBound(MyArr) To UBound(MyArr)
If IsNumeric(Left(MyArr(X), 1)) Then MyArr(X) = "'" & MyArr(X)
Next
MyString = Join(MyArr, Chr(Z))
Selection.Text = MyString
Next
Selection.Text = Replace(Replace(Selection.Text, Chr(146) & "'", "'"), Chr(145) & "'", "'")
Debug.Print "ending " + CStr(Now)
End Sub
I am not 100% sure on your criteria, I have made both an open and close single quote a ' but you can change that quite easily if you want.
It splits the string to an array on chr(145), checks the first char of each element for a numeric and prefixes it with a single quote if found.
Then it joins the array back to a string on chr(145), then repeats the whole thing for chr(146). Finally it looks through the string for an occurrence of a single quote AND either of those curled quotes next to each other (because that has to be something we just created) and replaces them with just the single quote we want. This leaves any occurrence not next to a number intact.
This final replacement part is the bit you would change if you want something other than ' as the character.
I have been struggling with this for days now. My attempted solution was to use a regular expression on document.text. Then, using the matches in a document.range(start,end), replace the text. This preserves formatting.
The problem is that the start and end in the range do not match the index into text. I think I have found the discrepancy - hidden in the range are field codes (in my case they were hyperlinks). In addition, document.text has a bunch of BEL codes that are easy to strip out. If you loop through a range using the character method, append the characters to a string and print it you will see the field codes that don't show up if you use the .text method.
Amazingly you can get the field codes in document.text if you turn on "show field codes" in one of a number of ways. Unfortunately, that version is not exactly the same as what the range/characters shows - the document.text has just the field code, the range/characters has the field code and the field value. Therefore you can never get the character indices to match.
I have a working version where instead of using range(start,end), I do something like:
Set matchRange = doc.Range.Characters(myMatches(j).FirstIndex + 1)
matchRange.Collapse (wdCollapseStart)
Call matchRange.MoveEnd(WdUnits.wdCharacter, myMatches(j).Length)
matchRange.text = Replacement
As I say, this works but the first statement is dreadfully slow - it appears that Word is iterating through all of the characters to get to the correct point. In doing so, it doesn't seem to count the field codes, so we get to the correct point.
Bottom line, I have not been able to come up with a good way to match the indexing of the document.text string to an equivalent range(start,end) that is not a performance disaster.
Ideas welcome, and thanks.
This is a problem begging for regular expressions. Resolving the .Characters calls that many times is probably what is killing you in performance.
I'd do something like this:
Public Sub FixNumericalReverseQuotesFast()
Dim expression As RegExp
Set expression = New RegExp
Dim buffer As String
buffer = Selection.Range.Text
expression.Global = True
expression.MultiLine = True
expression.Pattern = "[" & Chr$(145) & Chr$(39) & "]\d"
Dim matches As MatchCollection
Set matches = expression.Execute(buffer)
Dim found As Match
For Each found In matches
buffer = Replace(buffer, found, Chr$(146) & Right$(found, 1))
Next
Selection.Range.Text = buffer
End Sub
NOTE: Requires a reference to Microsoft VBScript Regular Expressions 5.5 (or late binding).
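For the late-binding variant mentioned in the note, a minimal sketch could look like this. It creates the regular expression object with CreateObject, so no reference needs to be set, and it lets the RegExp object's own Replace method do the substitution with a captured digit instead of looping over the matches. The function name FixQuotesLateBound is hypothetical, and the same formatting caveat applies as for the early-bound version, since it also works on plain text.

```vba
' Hypothetical late-bound variant: takes text, returns corrected text.
Function FixQuotesLateBound(ByVal source As String) As String
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    ' A straight or opening single quote, followed by a captured digit.
    re.Pattern = "[" & Chr$(145) & Chr$(39) & "](\d)"
    ' $1 puts the captured digit back after the closing quote.
    FixQuotesLateBound = re.Replace(source, Chr$(146) & "$1")
End Function
```

It could then be used as Selection.Range.Text = FixQuotesLateBound(Selection.Range.Text).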
EDIT:
The solution without using the Regular Expressions library is still avoiding working with Ranges. This can easily be converted to working with a byte array instead:
Sub FixNumericalReverseQuotes()
Dim chars() As Byte
chars = StrConv(Selection.Text, vbFromUnicode)
Dim pos As Long
For pos = 0 To UBound(chars) - 1
If (chars(pos) = 145 Or chars(pos) = 39) _
And (chars(pos + 1) >= 48 And chars(pos + 1) <= 57) Then
chars(pos) = 146
End If
Next pos
Selection.Text = StrConv(chars, vbUnicode)
End Sub
Benchmarks (100 iterations, 3 pages of text with 100 "hits" per page):
Regex method: 1.4375 seconds
Array method: 2.765625 seconds
OP method: (Ended task after 23 minutes)
About half as fast as the Regex, but still roughly 10ms per page.
EDIT 2: Apparently the methods above are not format safe, so method 3:
Sub FixNumericalReverseQuotesVThree()
Dim full_text As Range
Dim cached As Long
Set full_text = ActiveDocument.Range
full_text.Find.ClearFormatting
full_text.Find.MatchWildcards = True
cached = full_text.End
Do While full_text.Find.Execute("[" & Chr$(145) & Chr$(39) & "][0-9]")
full_text.End = full_text.Start + 2
full_text.Characters(1) = Chr$(146) ' closing single quote, as in the other methods
full_text.Start = full_text.Start + 1
full_text.End = cached
Loop
End Sub
Again, slower than both the above methods, but still runs reasonably fast (on the order of ms).

UDF to remove special characters, punctuation & spaces within a cell to create unique key for Vlookups

I hacked together the following User Defined Function in VBA that allows me to remove certain non-text characters from any given Cell.
The code is as follows:
Function removeSpecial(sInput As String) As String
Dim sSpecialChars As String
Dim i As Long
sSpecialChars = "\/:*?™""®<>|.&##(_+`©~);-+=^$!,'" 'This is your list of characters to be removed
For i = 1 To Len(sSpecialChars)
sInput = Replace$(sInput, Mid$(sSpecialChars, i, 1), " ")
Next
removeSpecial = sInput
End Function
This portion of the code obviously defines what characters are to be removed:
sSpecialChars = "\/:*?™""®<>|.&##(_+`©~);-+=^$!,'"
I also want to include a normal space character, " ", within this criteria. I was wondering if there is some sort of escape character that I can use to do this?
So, my goal is to be able to run this function, and have it remove all specified characters from a given Excel Cell, while also removing all spaces.
Also, I realize I could do this with a =SUBSTITUTE function within Excel itself, but I would like to know if it is possible in VBA.
Edit: It's fixed! Thank you simoco!
Function removeSpecial(sInput As String) As String
Dim sSpecialChars As String
Dim i As Long
sSpecialChars = "\/:*?™""®<>|.&## (_+`©~);-+=^$!,'" 'This is your list of characters to be removed
For i = 1 To Len(sSpecialChars)
sInput = Replace$(sInput, Mid$(sSpecialChars, i, 1), "") 'this will remove spaces
Next
removeSpecial = sInput
End Function
So after the advice from simoco I was able to modify my for loop:
For i = 1 To Len(sSpecialChars)
sInput = Replace$(sInput, Mid$(sSpecialChars, i, 1), "") 'this will remove spaces
Next
Now for every character in a given cell in my spreadsheet, the special characters are removed and replaced with nothing. This is essentially done by the Replace$ and Mid$ functions used together as shown:
sInput = Replace$(sInput, Mid$(sSpecialChars, i, 1), "") 'this will remove spaces
This code is executed once for every character in sSpecialChars, starting with the character at position 1, via my for loop; each pass removes all occurrences of that character from the cell's text.
Hopefully this answer benefits someone in the future if they stumble upon my original question.
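If the list of characters to remove keeps growing, an alternative worth considering is to turn the test around and keep only the characters that are wanted in the key. The sketch below assumes the key should consist of unaccented letters and digits only, which may not match every use case. The function name KeepAlphaNumeric is made up, and the Like ranges assume the default Option Compare Binary.

```vba
' Hypothetical whitelist variant: keeps A-Z, a-z and 0-9, drops everything else.
Function KeepAlphaNumeric(ByVal sInput As String) As String
    Dim i As Long
    Dim ch As String
    Dim result As String
    For i = 1 To Len(sInput)
        ch = Mid$(sInput, i, 1)
        If ch Like "[A-Za-z0-9]" Then result = result & ch
    Next i
    KeepAlphaNumeric = result
End Function
```

For example, KeepAlphaNumeric("A-1 (b)") should return A1b.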