In VBA, How can I trim any characters not just whitespaces? - vba

Due to some other sloppy coding, I need to remove leading or trailing vbCrLf or reduce any instances of more than one consecutive vbCrLf to a single line.
So how can I execute the trim function, but to trim a character other than " " ?

Here are functions to do just that
Function TrimAny(ByVal myString As String, myTrim As String) As String
If myString = "" Or myTrim = "" Then TrimAny = myString: Exit Function
While inStB(myString, myTrim & myTrim): myString = Replace(myString, myTrim & myTrim, myTrim): Wend
myString = TrimStart(myString, myTrim)
myString = TrimEnd(myString, myTrim)
TrimAny = myString
End Function
Function TrimEnd(ByVal myString As String, myTrim As String) As String
If myString = "" Or myTrim = "" Then TrimEnd = myString: Exit Function
While Right(myString, Len(myTrim)) = myTrim: myString = Left(myString, Len(myString) - Len(myTrim)): Wend
TrimEnd = myString
End Function
Function TrimStart(ByVal myString As String, myTrim As String) As String
If myString = "" Or myTrim = "" Then TrimStart = myString: Exit Function
While Left(myString, Len(myTrim)) = myTrim: myString = Right(myString, Len(myString) - Len(myTrim)): Wend
TrimStart = myString
End Function

Solution with RegExp:
Function CrLfTrim(ByVal text As String) As String
' set a reference to 'Microsoft VBScript Regular Expression 5.5' in Tools->References VBE menu
Static re As RegExp ' re is Static and stores the object between function calls, so there is no need to create it each time
If re Is Nothing Then ' it is only necessary to create re if it is Nothing (first time)
Set re = New RegExp
re.Global = True
End If
With re
.Pattern = "(^(\x0D\x0A)+)|((\x0D\x0A)+$)" 'set pattern to remove leading and trailing vbCrLf
text = .Replace(text, "") 'remove leading and trailing vbCrLf
.Pattern = "(\x0D\x0A){2,}" 'set the pattern for the internal sequential vbCrLf
CrLfTrim = .Replace(text, vbCrLf) ' replace the internal sequential vbCrLf with one vbCrLf
End With
End Function
Usage example:
Sub Example()
Dim txt As String: txt = vbCrLf & "Lorem ipsum dolor sit " & vbCrLf & vbCrLf & vbCrLf & "amet, consectetur " & vbCrLf & vbCrLf
Debug.Print "Before: [" & txt & "]"
Debug.Print "After: [" & CrLfTrim(txt) & "]"
End Sub
Prints:
Before: [
Lorem ipsum dolor sit
amet, consectetur
]
After: [Lorem ipsum dolor sit
amet, consectetur ]

Related

Replace string in VBA

sampleString = "Lorem ipsumxolor sit"
I want the immediate left and right characters of x to be blank. The desired output would be "Lorem ipsu x lor sit"
Using stringtext = replace(stringtext, "x", " x "), the output is Lorem ipsum x olor sit. However, the length of the string obviously increases and doesn't match the desired output.
Another limitation is that incase of sampleString = "Lorem ipsumxxxolor sit". I can't use stringtext = replace(stringtext, "x", " x ") as the output becomes Lorem ipsum x x x olor sit instead of the desired result Lorem ipsu xxx lor sit. I can use replace(stringtext, "xxx", " xxx ") but that would cause me to use multiple conditions instead of one single solution.
Is there an efficient way to deal with this?
Thank you!
efficient
Private Function SpaceOutExes(ByVal s As String) As String
SpaceOutExes = s
Dim i As Long
Dim PrevCharIsX As Boolean
PrevCharIsX = Left$(SpaceOutExes, 1) = "x"
For i = 2 To Len(SpaceOutExes)
If Mid$(SpaceOutExes, i, 1) = "x" Then
If Not PrevCharIsX Then Mid$(SpaceOutExes, i - 1, 1) = " "
PrevCharIsX = True
Else
If PrevCharIsX Then Mid$(SpaceOutExes, i, 1) = " "
PrevCharIsX = False
End If
Next
End Function
Dim sampleString As String
sampleString = "Lorem ipsumxolor sit"
Debug.Print SpaceOutExes(sampleString)
You need to cut the original string in pieces and put it together the way you like:
Option Explicit
Public Sub StringExample()
Dim SampleString As String
SampleString = "Lorem ipsumxolor sit"
Dim FoundPosition As Long
FoundPosition = InStr(SampleString, "x")
Dim ResultString As String
ResultString = Left$(SampleString, FoundPosition - 2) & " " & Mid$(SampleString, FoundPosition, 1) & " " & Mid$(SampleString, FoundPosition + 2)
End Sub
Output is then
Lorem ipsu x lor sit
Or for multiple x:
Public Sub StringExampleMulti()
Const Delimiter As String = "x"
Dim SampleString As String
SampleString = "Lorem ipsumxxxolor sit amxet, conse!"
Dim Splitted() As String
Splitted = Split(SampleString, "x")
Dim ResultString As String
Dim i As Long
For i = LBound(Splitted) To UBound(Splitted)
If Splitted(i) <> vbNullString Then
If i = LBound(Splitted) Then
ResultString = Left$(Splitted(i), Len(Splitted(i)) - 1) & " " & Delimiter
ElseIf i = UBound(Splitted) Then
ResultString = ResultString & Delimiter & " " & Right$(Splitted(i), Len(Splitted(i)) - 1)
Else
ResultString = ResultString & " " & Mid$(Splitted(i), 2, Len(Splitted(i)) - 2) & " "
End If
Else
ResultString = ResultString & Delimiter
End If
Next i
Debug.Print ResultString
End Sub
outputs:
Lorem ipsu xxx lor sit a x t, conse!

An app operating under Windows 10 finds a byteOrderMarkUtf8 in a string, but there isn't one in the string

I have an app that receives an XML string and tries to clean it up before processing. For some reason, under the Windows 10 operating system, the app thinks there is a byteOrderMarkUtf8 leading the string. There isn't one.
The first character is "<". The app removes the "<", and then removes the rest of the tag, too, creating an invalid XML.
This used to work under Windows 7.
In the code below, I have commented the offending section out.
Is there something about character encoding that has changed with Windows 10 that would be causing this?
Private Sub CleanXML(ByRef InString As String)
' This subroutine cleans trash characters out of XML streams
If (InString = "") Then
MessageBox.Show("Null String passed to CleanXML." & vbCr & _
"String Length: " & InString.Length & vbCr & _
"Instring: " & InString & vbCr)
End If
If (InString.Length = 0) Then
MessageBox.Show("String of 0 length or null String passed to CleanXML." & vbCr & _
"String Length: " & InString.Length & vbCr & _
"Instring: " & InString & vbCr)
End If
Dim CleanString As String = InString
CleanString = CleanString.Trim() ' Trim leading and trailing spaces
CleanString = CleanString.Replace("- ", "") ' Replace the dashes
CleanString = CleanString.Replace(" <", "<") ' Replace some white space
CleanString = CleanString.Replace(" <", "<") ' Replace some white space
CleanString = CleanString.Replace("-<", "<") ' Replace dash+lessthan with lessthan
CleanString = CleanString.Replace("- <", "<") ' Replace dash+space+lessthan with lessthan
CleanString = CleanString.Replace("&", "&") ' Replace & with &
Dim Tempstring As String = ""
If CleanString.Length > 0 Then
Try
Dim byteOrderMarkUtf8 = Encoding.UTF8.GetString(Encoding.UTF8.GetPreamble())
' This is the offending code that I have commented out.
'-------------------------------------------------------------
'If (CleanString.StartsWith(byteOrderMarkUtf8)) Then
' CleanString = CleanString.Remove(0, byteOrderMarkUtf8.Length)
'End If
'If (CleanString.EndsWith(byteOrderMarkUtf8)) Then
' CleanString = CleanString.Remove(CleanString.Length - 1, byteOrderMarkUtf8.Length)
'End If
'-------------------------------------------------------------
' Make sure the first and last characters are "<" and ">".
Tempstring = CleanString
Do Until (CleanString.StartsWith("<") Or (CleanString.Length = 0))
CleanString = CleanString.Remove(0, 1)
Loop
Do Until (CleanString.EndsWith(">") Or (CleanString.Length = 0))
CleanString = CleanString.Remove(CleanString.Length - 1, 1)
Loop
Catch ex As Exception
MessageBox.Show("Error in CleanXML." & vbCr & _
"String Length: " & CleanString.Length & vbCr & _
"Instring: " & InString & vbCr & _
"CleanString: " & CleanString & _
" Length: " & CleanString.Length.ToString)
MessageBox.Show(ex.Message + " Inner exception: " + ex.InnerException.Message)
MessageBox.Show(Tempstring)
End Try
Else
MessageBox.Show("Clean string of 0 length in CleanXML." & vbCr & _
"String Length: " & CleanString.Length & vbCr & _
"Instring: " & InString & vbCr & _
"CleanString: " & CleanString)
End If
' Remove any BOMs (Byte-Order Marks) from the string.
'Dim i As Integer = InStr(1, CleanString, byteOrderMarkUtf8)
'Do Until i = 0
' CleanString = CleanString.Remove(i - 1, byteOrderMarkUtf8.Length)
' i = InStr(i, CleanString, byteOrderMarkUtf8)
'Loop
InString = CleanString
End Sub

Why Doesn't VBA replace function work with CRLF in Word and Excel

I could have sworn I have stripped CRLF in the past but not sure why the following isn't working:
myString = "ABC" & vbCrLf & "DEF"
str1 = Replace(myString, vbLf, "")
str2 = Replace(str1, vbCrLf, "")
str3 = Replace(str2, vbNewLine, "")
MsgBox str3
The code above doesn't work the result is:
ABC
DEF
myString = "ABC" & vbCrLf & "DEF"
str1 = Replace(myString, Chr(13), "")
str2 = Replace(str1, Chr(10), "")
MsgBox str2
The code above does work the result is:
ABCDEF
Solution: Thanks # Mat for the answer (The problem on the first code was the order I was trying to remove the items)
VbCrLf & VbNewLine is the same and trying to remove the combo vbCr+VbLf after removing VbLf won't work
The premise is flawed:
myString = "ABC" & vbCrLf & "DEF"
The string is made of "ABC", vbCrLf, and "DEF".
vbCrLf is vbCr and vbLf, which on any Windows box is vbNewLine.
When you do:
str1 = Replace(myString, vbLf, "")
You replace vbLf and leave the vbCr character in place.
str2 = Replace(str1, vbCrLf, "")
Then you replace vbCrLf but vbLf is already gone so vbCrLf isn't in the string.
str3 = Replace(str2, vbNewLine, "")
Then you replace vbNewLine which is basically doing the exact same thing as the previous instruction, and the result is a string that's been stripped of vbLf but still contains vbCr.
This code works as expected:
Sub Test()
Dim foo As String
foo = "foo" & vbCrLf & "bar"
Debug.Print foo
foo = Replace(foo, vbNewLine, vbNullString)
Debug.Print foo
End Sub
As does this:
Sub Test()
Dim foo As String
foo = "foo" & vbNewLine & "bar"
Debug.Print foo
foo = Replace(foo, vbNewLine, vbNullString)
Debug.Print foo
End Sub
Or this:
Sub Test()
Dim foo As String
foo = "foo" & vbNewLine & "bar"
Debug.Print foo
foo = Replace(foo, vbCrLf, vbNullString)
Debug.Print foo
End Sub
Or even this:
Sub Test()
Dim foo As String
foo = "foo" & vbNewLine & "bar"
Debug.Print foo
foo = Replace(foo, vbCr, vbNullString)
foo = Replace(foo, vbLf, vbNullString)
Debug.Print foo
End Sub
Your second snippet works as intended, because you do remove both vbCr (Chr(13)) and vbLf (Chr(10)) characters. Simple as that.

VB Script in excel to separate a paragraph in lines of 38

I am working on taking a paragraph and separating that paragraph out in lines of 38 characters. What I was thinking was having a button for the employee to click and it will grab the information out of a certain cell, and place it in a text box of numerous lines that only have 38 characters, so the employee then can copy and paste it in our ERP system. The other kicker is I need a hard return after each line for it to probably be copied over. I was trying to use Str.Substring. Any help would greatly be appreciated.
So you need to take a paragraph and break it into a series of 38 character lines, separated by hard line breaks. The following code does so to the first cell in the worksheet (A1) and puts the clipboard, so all your users have to do is paste.
Sub main()
Dim x As Integer
Dim Output As String
Dim counter As Integer
Dim inputString As String
inputString = Cells(1, 1)
'We're going to go through the input string
'one character at a time
For x = 1 To Len(inputString)
'Each time through, we add one to our counter
counter = counter + 1
'When the counter reaches our breakpoint
If counter = 38 Then
'add a hard carriage return and line feed
Output = Output & vbCrLf
'Reset our counter
counter = 0
End If
'Append the current letter to the output
Output = Output & Mid(inputString, x, 1)
Next
'This puts the output string in the clipboard.
'You need to have a userform in your project for this to work
'or a reference to MSForms.
Dim cb As New DataObject
cb.SetText Output
cb.PutInClipboard
'This cleans up the clipboard object
Set cb = Nothing
End Sub
Here is a word wrap function that breaks the input string by space (replacing any new line characters) and fills the lines with whole words up to the specified margin. Unit tests attached to demonstrate typical usage.
Function getWordWrap(str As String, Optional margin As Long = 78) As String
Dim tokenArr() As String
Dim token As Variant
Dim line As String
Dim numChars As Long
tokenArr = Split(Replace(str, vbNewLine, " "))
For Each token In tokenArr
' if above margin - start a new line
If numChars > 0 And numChars + Len(token) + 1 > margin Then
getWordWrap = getWordWrap & Trim(line) & vbNewLine
numChars = 0: line = ""
End If
line = line & token & " "
numChars = Len(line) - 1
Next
getWordWrap = getWordWrap & Trim(line)
End Function
Sub testGetWordWrap()
' basic tests
Debug.Assert getWordWrap("Illum rerum qui.", 15) = "Illum rerum" & vbNewLine & "qui."
Debug.Assert getWordWrap("", 15) = ""
Debug.Assert getWordWrap("Illum rerum qui quia sequi excepturi adipisci iure.", 15) _
= "Illum rerum qui" & vbNewLine & "quia sequi" & vbNewLine & "excepturi" _
& vbNewLine & "adipisci iure."
' very long words
Debug.Assert getWordWrap("VeryVeryLongWord", 15) = "VeryVeryLongWord"
Debug.Assert getWordWrap("Illum rerum qui. VeryVeryLongWord", 15) _
= "Illum rerum" & vbNewLine & "qui." & vbNewLine & "VeryVeryLongWord"
Debug.Assert getWordWrap("Illum rerum qui. VeryVeryLongWord quia sequi", 15) _
= "Illum rerum" & vbNewLine & "qui." & vbNewLine & "VeryVeryLongWord" _
& vbNewLine & "quia sequi"
' pre-existing new lines
Debug.Assert getWordWrap("Illum rerum qui quia sequi" & vbNewLine & "excepturi adipisci iure.", 15) _
= "Illum rerum qui" & vbNewLine & "quia sequi" & vbNewLine & "excepturi" _
& vbNewLine & "adipisci iure."
' different margins
Debug.Assert getWordWrap("Illum rerum qui.", 1) = "Illum" & vbNewLine & "rerum" & vbNewLine & "qui."
Debug.Assert getWordWrap("Illum rerum qui.", 20) = "Illum rerum qui."
End Sub

Count the number of empty spaces in front and back of the string

I am reading a file line by line in Excel VBA.
I have some strings for example,
" ooo"
" ooo "
I want to find the number of empty spaces in the front of the string. If I use Trim, it is removing empty spaces from both back and front of the string.
You could use the LTrim and RTrim functions. - I would assume that is faster, than looping through the string and doing character comparisons.
Public Function NumberOfLeadingSpaces(ByVal theString As String) As Long
NumberOfLeadingSpaces = Len(theString) - Len(LTrim(theString))
End Function
Public Function NumberOfTrailingSpaces(ByVal theString As String) As Long
NumberOfTrailingSpaces = Len(theString) - Len(RTrim(theString))
End Function
Function test(s As String) As Integer
Dim str As String
str = "[abcdefghijklmnopqrstuvwxyz0123456789]"
Dim spaceCounter As Integer
For i = 1 To Len(s)
If Not Mid(s, i, 1) Like str Then
spaceCounter = spaceCounter + 1
Else
Exit For
End If
Next i
test = spaceCounter
End Function
By popular request: Why use this function instead of Trim, LTrim, etc?
Well, to summarize the full explanation, not all spaces can be removed with Trim. But they will be removed with this function.
Consider this example (I'll borrow PhilS' solution for illustrative purposes):
Sub testSpaceRemoval()
Dim str1 As String
str1 = " " & Chr(32) & Chr(160) & "a"
Debug.Print Chr(34) & str1 & Chr(34)
Debug.Print NumberOfLeadingSpaces(str1)
Debug.Print test(str1)
End Sub
Result:
"  a"
2
3
Here we can see that the string clearly contains 3 spaces, but the solution using LTrim only counted 2.
So, what to use?
Well, it depends. If you have a dataset where you know you won't get non-breaking characters, use Trim as much as you want! If you think you can get non-breaking characters, Trim alone will not be enough.
Characters to look out for are, quoted from the explanation linked above:
leading, trailing, or multiple embedded space characters (Unicode character set values 32 and 160), or non-printing characters (Unicode character set values 0 to 31, 127, 129, 141, 143, 144, and 157)
Trim can remove chr(32) (as demonstrated above) but not chr(160), because 32 is the regular space and 160 is a non-breaking space.
If you're a stickler for covering your behind, consider this total solution:
Function cleanSpecialCharacters(str As String) As String
bannedChars = Chr(127) & "," & Chr(129) & "," & Chr(141) & "," & Chr(143) & "," & Chr(144) & "," & Chr(157) & "," & Chr(160)
str = Application.WorksheetFunction.Clean(str)
str = Application.WorksheetFunction.Trim(str)
For Each c In Split(bannedChars, ",")
str = Replace(str, c, "")
Next
cleanSpecialCharacters = str
End Function
For OP's particular question, it would have to be a little more tailored.
Sub blanks()
cadena = Cells(1, 1)
i = Len(cadena)
Do Until Mid(cadena, i, 1) <> " "
If Mid(cadena, i, 1) = " " Then contador = contador + 1
i = i - 1
Loop
Cells(2, 1) = contador
End Sub
Sub main()
Dim strng As String
Dim i As Long
strng = " ooo "
i = 1
Do While Mid(strng, i, 1) = " "
i = i + 1
Loop
MsgBox "number of front empty spaces: " & i - 1
End Sub
or use LTrim function:
Sub main2()
Dim strng As String
strng = " ooo "
MsgBox "number of front empty spaces: " & Len(strng) - Len(LTrim(strng))
End Sub