Sampling StringName.SubString(p,1) for paragraph formatting character - vb.net

please try to specifically answer my question and not offer alternative approaches as I have a very specific problem that needs this ad-hoc solution. Thank you very much.
Automatically my code opens Word through VB.NET, opens the document, finds the table, goes to a cell, moves that cells.range.text into a String variable and in a For loop compares character at position p to a String.
I have tried Strings:
"^p", "^013", "U+00B6"
My code:
Dim nextString As String
'For each cell, extract the cell's text.
For p = 17 To word_Rng.Cells.Count
nextString = word_Rng.Cells(p).Range.Text
'For each text, search for structure.
For q = 0 To nextString.Length - 1
If (nextString.Substring(q, 1) = "U+00B6") Then
Exit For
End If
Next
Next
Is the structural data lost when assigning the cells text to a String variable. I have searched for formatting marks like this in VBA successfully in the past.

Assuming that your string contains the character, you can use ChrW to create the appropriate character from the hex value, and check for that:
If nextString.Substring(q, 1) = ChrW(&h00B6) Then
Exit For
End If
UPDATE
Here's a complete example:
Dim nextString = "This is a test " & ChrW(&H00B6) & " for that char"
Console.WriteLine(nextString)
For q = 0 To nextString.Length - 1
If nextString(q) = ChrW(&H00B6) Then
Console.WriteLine("Found it: {0}", q)
End If
Next
This outputs:
This is a test ¶ for that char
Found it: 15

Related

little boxes in front of text lines after putting contents of multiline textbox in Word document

I have a user form with multiline textbox. EnterKeyBehavior = True. I run the macro, the form opens, I type a few lines of text, pressing Enter after each line. The value from that textbox is put in a variable which is later put in the word document via Find/replace of placeholder text <>.
The text is put in the document and the first line in the multiline textbox entry looks OK, but every other line starts with a little empty box (some Ascii/unicode character I don't know). See pic.
I've tried to replace that little white box by doing a replace on vbCr, vbLf, vbCrLf, Char(10), Char(11), Char(13), from other posts I've found with those solutions, but none of them work for me. What's the fix?
This is what I have in the form right now and from where I'm trying to clean up the contents of that textbox before passing on the value to the variable that I then put in the word document.
Private Sub cmdCompInfoOK_Click()
txtCompanyInfo.Text = Replace(txtCompanyInfo.Text, Chr(11), "")
slkCompanyInfo = frmLtrAddress.txtCompanyInfo
Me.Hide
End Sub
The multiline field is called txtCompanyInfo.
slkCompanyInfo is the variable where I want to put the value from the textbox, and I'm declaring it as Public in the main module, which calls the userform.
Put the following code into a regular module, it will create a string with information about all characters in the input string.
Function DumpString(s As String, Optional specialCharsOnly As Boolean = False) As String
' Write all chars of String as ASCII-Value, concatenated as string
Dim i As Long
For i = 1 To Len(s)
Dim x As String
Dim a As Long, dumpMe As Boolean, c As String
a = AscW(Mid(s, i, 1))
dumpMe = Not specialCharsOnly Or a < 32 Or a > 128
If a = 13 Then
c = "<CR>"
ElseIf a = 10 Then
c = "<LF>"
ElseIf a = 9 Then
c = "<TAB>"
ElseIf a = 0 Then
c = "<NUL>"
Else
c = Mid(s, i, 1)
End If
x = a & "(" & c & ")"
If dumpMe Then
DumpString = DumpString & IIf(DumpString = "", "", ",") & x
End If
Next i
End Function
In your Event-Routine, put a statement like
Debug.Print DumpString(slkCompanyInfo)
If your input string is to long, you can set the 2nd parameter to True, in that case all regular characters are skipped.

Excel conversion of text containing ranges--numeric to alpha-numeric

I would like to convert a range of numbers (and single digits) from a number-only format to alpha-numeric format. Entire statement is in a single, excel cell and would like the converted version to be in a neighboring cell.
As an example:
Assuming 1-24=B1-B24
Assuming 25-48=C1-C24
INPUT—
screen 1-3,5,7-9,11-30,32-37,39-40,41,44-46
DESIRED OUTPUT (all acceptable)
screen B1-B3,B5,B7-B9,B11-C6,C8-C13,C15-C16,C17,C20-C22
OR
screen B1-B3,B5,B7-B9,B11-B24,C1-C6,C8-C13,C15-C16,C17,C20-C22
OR
screen B1-B3,B5,B7-B9,B11-B24
screen C1-C6,C8-C13,C15-C16,C17,C20-C22
Using excel functions is proving quite cumbersome so excel macro would be better. I've looked for examples of requested conversion but haven't found anything.
Any help is greatly appreciated.
Cheers,
Bob
Hey here is a solution that i tested out. Not sure if "screen" needs to be in the string or not. Let me know and I will tweak it if that's the case.
Its a user defined function. So drop this vba in a module and then go to a worksheet and type in "=AlphaConvert(" + the cell reference.
Assumption here is that only one cell will be referenced at a time.
Last this could easily be converted to a sub routine and probably run a bit faster than the function.
Public Function AlphaConvert(TargetCell As Range)
Dim v As Long
Dim vArr() As String
Dim i As Long
Dim iArr() As String
Dim a As String
vArr = Split(TargetCell.Value, ",")
For v = LBound(vArr) To UBound(vArr)
If InStr(vArr(v), "-") > 0 Then
iArr = Split(vArr(v), "-")
For i = LBound(iArr) To UBound(iArr)
If i = LBound(iArr) Then
a = AlphaCode(iArr(i))
Else
a = a & "-" & AlphaCode(iArr(i))
End If
Next i
vArr(v) = a
Else
vArr(v) = AlphaCode(vArr(v))
End If
If v = LBound(vArr) Then
AlphaConvert = vArr(v)
Else
AlphaConvert = AlphaConvert & "," & vArr(v)
End If
Next v
End Function
Private Function AlphaCode(Nbr As Variant)
Select Case Nbr
Case 1 To 24
AlphaCode = "B" & Nbr
Case Else
AlphaCode = "C" & Nbr - 24
End Select
End Function

Extracting Date/Time from comment cell

I have a comment field with cells containing text like this:
Cancelled by user at 2018-01-03 03:11:57 without charge
I want to get the date and time information, but it may not always be in the 3rd/4th from last spaces, otherwise I might try to do some sort of complicated split of the cell. Is there an "in cell" way extract the date time information? Or will this need a VBA script? I prefer the former, but I'm trying to make a macro to simplify my life anyway, so VBA would work too.
I'd propose the following formula:
=MID(A1,FIND("at 20",A1)+3,19)
This would require that the date is always preceded by the word 'at' and the date string starts with 20.
You can try this function. It splits the string checking for items that have the first letter numeric, and builds a result string of just the date information.
Public Function ParseForDate(sCell As String) As String
Dim vSplit As Variant
Dim nIndex As Integer
Dim sResult As String
vSplit = Split(sCell, " ")
For nIndex = 0 To UBound(vSplit)
If IsNumeric(Left$(vSplit(nIndex), 1)) Then
sResult = sResult & vSplit(nIndex) & " "
End If
Next
ParseForDate = Trim$(sResult)
End Function
If you wanted to use it in a formula it would look something like this:
=ParseForDate(A1)
To use it in a VBA routine:
Dim s as String
s = ParseForDate(Range("A1"))
Non-VBA solution: (this is assuming the date format is always the same for all cells)
= MAX(IFERROR(DATEVALUE(MID(A1,ROW(INDEX($A:$A,1):INDEX($A:$A,LEN(A1)-19)),20)),0))
+MAX(IFERROR(TIMEVALUE(MID(A1,ROW(INDEX($A:$A,1):INDEX($A:$A,LEN(A1)-19)),20)),0))
Note this is an array formula, so you must press Ctrl+Shift+Enter instead of just Enter when typing this formula.
You will obviously then need to format the cell as a date and time, but this formula gets the numerical value that Excel uses for its internal date and time system.
Using a regex will enable you to fetch the date and time, irrespective of its placement in the string. The following solution will work if the date and time are of the same format as shown in the example string.
Code:
Sub getDateTime()
Dim objReg, matches, str
str = Sheet1.Cells(1, 1).Value 'Change this as per your requirements
Set objReg = CreateObject("vbscript.regexp")
objReg.Global = True
objReg.Pattern = "\d{4}(?:-\d{2}){2}\s*\d{2}(?::\d{2}){2}"
If objReg.test(str) Then
Set matches = objReg.Execute(str)
strResult = matches.Item(0)
MsgBox strResult
End If
End Sub
Click for Regex Demo
Regex Explanation:
\d{4} - matches 4 digits representing the year
(?:-\d{2}){2} - matches - followed by 2 digits. {2} in the end repeats this match 2 times. Once for getting MM and the next time for DD
\s* - matches 0+ whitespaces to match the space between the Date and Time
\d{2} - matches 2 digits representing the HH
(?::\d{2}){2} - matches : followed by 2 digits. The {2} in the end repeats this match 2 times. First time for matching the :MM and the next time for matching the :SS
Screenshots:
Output:
This will be good for about 90 years (using cell C3 for example):
Sub GetDate()
Dim s As String
s = Range("C3").Comment.Text
arr = Split(s, " ")
For i = LBound(arr) To UBound(arr)
If Left(arr(i), 2) = "20" Then
msg = arr(i) & " " & arr(i + 1)
MsgBox msg
Exit Sub
End If
Next i
End Sub

Word VBA: iterating through characters incredibly slow

I have a macro that changes single quotes in front of a number to an apostrophe (or close single curly quote). Typically when you type something like "the '80s" in word, the apostrophe in front of the "8" faces the wrong way. The macro below works, but it is incredibly slow (like 10 seconds per page). In a regular language (even an interpreted one), this would be a fast procedure. Any insights why it takes so long in VBA on Word 2007? Or if someone has some find+replace skills that can do this without iterating, please let me know.
Sub FixNumericalReverseQuotes()
Dim char As Range
Debug.Print "starting " + CStr(Now)
With Selection
total = .Characters.Count
' Will be looking ahead one character, so we need at least 2 in the selection
If total < 2 Then
Return
End If
For x = 1 To total - 1
a_code = Asc(.Characters(x))
b_code = Asc(.Characters(x + 1))
' We want to convert a single quote in front of a number to an apostrophe
' Trying to use all numerical comparisons to speed this up
If (a_code = 145 Or a_code = 39) And b_code >= 48 And b_code <= 57 Then
.Characters(x) = Chr(146)
End If
Next x
End With
Debug.Print "ending " + CStr(Now)
End Sub
Beside two specified (Why...? and How to do without...?) there is an implied question – how to do proper iteration through Word object collection.
Answer is – to use obj.Next property rather than access by index.
That is, instead of:
For i = 1 to ActiveDocument.Characters.Count
'Do something with ActiveDocument.Characters(i), e.g.:
Debug.Pring ActiveDocument.Characters(i).Text
Next
one should use:
Dim ch as Range: Set ch = ActiveDocument.Characters(1)
Do
'Do something with ch, e.g.:
Debug.Print ch.Text
Set ch = ch.Next 'Note iterating
Loop Until ch is Nothing
Timing: 00:03:30 vs. 00:00:06, more than 3 minutes vs. 6 seconds.
Found on Google, link lost, sorry. Confirmed by personal exploration.
Modified version of #Comintern's "Array method":
Sub FixNumericalReverseQuotes()
Dim chars() As Byte
chars = StrConv(Selection.Text, vbFromUnicode)
Dim pos As Long
For pos = 0 To UBound(chars) - 1
If (chars(pos) = 145 Or chars(pos) = 39) _
And (chars(pos + 1) >= 48 And chars(pos + 1) <= 57) Then
' Make the change directly in the selection so track changes is sensible.
' I have to use 213 instead of 146 for reasons I don't understand--
' probably has to do with encoding on Mac, but anyway, this shows the change.
Selection.Characters(pos + 1) = Chr(213)
End If
Next pos
End Sub
Maybe this?
Sub FixNumQuotes()
Dim MyArr As Variant, MyString As String, X As Long, Z As Long
Debug.Print "starting " + CStr(Now)
For Z = 145 To 146
MyArr = Split(Selection.Text, Chr(Z))
For X = LBound(MyArr) To UBound(MyArr)
If IsNumeric(Left(MyArr(X), 1)) Then MyArr(X) = "'" & MyArr(X)
Next
MyString = Join(MyArr, Chr(Z))
Selection.Text = MyString
Next
Selection.Text = Replace(Replace(Selection.Text, Chr(146) & "'", "'"), Chr(145) & "'", "'")
Debug.Print "ending " + CStr(Now)
End Sub
I am not 100% sure on your criteria, I have made both an open and close single quote a ' but you can change that quite easily if you want.
It splits the string to an array on chr(145), checks the first char of each element for a numeric and prefixes it with a single quote if found.
Then it joins the array back to a string on chr(145) then repeats the whole things for chr(146). Finally it looks through the string for an occurence of a single quote AND either of those curled quotes next to each other (because that has to be something we just created) and replaces them with just the single quote we want. This leaves any occurence not next to a number intact.
This final replacement part is the bit you would change if you want something other than ' as the character.
I have been struggling with this for days now. My attempted solution was to use a regular expression on document.text. Then, using the matches in a document.range(start,end), replace the text. This preserves formatting.
The problem is that the start and end in the range do not match the index into text. I think I have found the discrepancy - hidden in the range are field codes (in my case they were hyperlinks). In addition, document.text has a bunch of BEL codes that are easy to strip out. If you loop through a range using the character method, append the characters to a string and print it you will see the field codes that don't show up if you use the .text method.
Amazingly you can get the field codes in document.text if you turn on "show field codes" in one of a number of ways. Unfortunately, that version is not exactly the same as what the range/characters shows - the document.text has just the field code, the range/characters has the field code and the field value. Therefore you can never get the character indices to match.
I have a working version where instead of using range(start,end), I do something like:
Set matchRange = doc.Range.Characters(myMatches(j).FirstIndex + 1)
matchRange.Collapse (wdCollapseStart)
Call matchRange.MoveEnd(WdUnits.wdCharacter, myMatches(j).Length)
matchRange.text = Replacement
As I say, this works but the first statement is dreadfully slow - it appears that Word is iterating through all of the characters to get to the correct point. In doing so, it doesn't seem to count the field codes, so we get to the correct point.
Bottom line, I have not been able to come up with a good way to match the indexing of the document.text string to an equivalent range(start,end) that is not a performance disaster.
Ideas welcome, and thanks.
This is a problem begging for regular expressions. Resolving the .Characters calls that many times is probably what is killing you in performance.
I'd do something like this:
Public Sub FixNumericalReverseQuotesFast()
Dim expression As RegExp
Set expression = New RegExp
Dim buffer As String
buffer = Selection.Range.Text
expression.Global = True
expression.MultiLine = True
expression.Pattern = "[" & Chr$(145) & Chr$(39) & "]\d"
Dim matches As MatchCollection
Set matches = expression.Execute(buffer)
Dim found As Match
For Each found In matches
buffer = Replace(buffer, found, Chr$(146) & Right$(found, 1))
Next
Selection.Range.Text = buffer
End Sub
NOTE: Requires a reference to Microsoft VBScript Regular Expressions 5.5 (or late binding).
EDIT:
The solution without using the Regular Expressions library is still avoiding working with Ranges. This can easily be converted to working with a byte array instead:
Sub FixNumericalReverseQuotes()
Dim chars() As Byte
chars = StrConv(Selection.Text, vbFromUnicode)
Dim pos As Long
For pos = 0 To UBound(chars) - 1
If (chars(pos) = 145 Or chars(pos) = 39) _
And (chars(pos + 1) >= 48 And chars(pos + 1) <= 57) Then
chars(pos) = 146
End If
Next pos
Selection.Text = StrConv(chars, vbUnicode)
End Sub
Benchmarks (100 iterations, 3 pages of text with 100 "hits" per page):
Regex method: 1.4375 seconds
Array method: 2.765625 seconds
OP method: (Ended task after 23 minutes)
About half as fast as the Regex, but still roughly 10ms per page.
EDIT 2: Apparently the methods above are not format safe, so method 3:
Sub FixNumericalReverseQuotesVThree()
Dim full_text As Range
Dim cached As Long
Set full_text = ActiveDocument.Range
full_text.Find.ClearFormatting
full_text.Find.MatchWildcards = True
cached = full_text.End
Do While full_text.Find.Execute("[" & Chr$(145) & Chr$(39) & "][0-9]")
full_text.End = full_text.Start + 2
full_text.Characters(1) = Chr$(96)
full_text.Start = full_text.Start + 1
full_text.End = cached
Loop
End Sub
Again, slower than both the above methods, but still runs reasonably fast (on the order of ms).

VBA Word: Inserting Text Form Fields at a Specified Location

I am new to vba and developing a document that prompts the user to select a variable number of values from a combo box list. After selecting the values, I want to insert them in order onto the document itself as a Text Form Field. Let me show you how I generally am trying to get it to work.
First, the user selects values:
[a]
[b]
[c]
And selects an "OK" button. Then, I am attempting to add these selected values into the word document starting at a bookmark. Value "a" should be inserted followed by a space character followed by a blank Text Form Field, followed by two carriage returns. In the end the result should look something like this:
[bookmark]
[a]'_'[blank_a]'^p'
'^p'
[b]'_'[blank_b]'^p'
'^p'
[c]'_'[blank_c]'^p'
'^p'
Where [bookmark] is an invisible bookmark, '_' is a space, and '^p' is a carriage return. Currently my code is as follows:
Dim myRange As Range
Set myRange = ActiveDocument.Range(Start:=ActiveDocument.Bookmarks("START").Range.Start, _
End:=ActiveDocument.Bookmarks("END").Range.End)
For i = 1 To NUMBER_OF_RESPONSES
Selection.FormFields.Add(myRange, wdFieldFormTextInput).Name = "question_" & i
Selection.FormFields.Add(myRange, wdFieldFormTextInput).Result = "response_" & i
Next i
Naturally, there are no insertions of literal spaces or carriage returns yet as I have not figured out how to do it. The result of this code is as follows:
[START][blank_c][c][blank_b][b][blank_a][a][END]
I would like this order reversed and for there to be the aforementioned formatting inserted. Any pointers on how to go about doing it?
I am not sure if I have missed something, but why not relying on simple paragraphs instead on Bookmarks? Here you have a code doing what you want and any other thing (you can modify the ranges of the paragraphs to perform as complex actions as you wish).
Dim curRange As Range
Dim start_i As Integer
Dim end_i As Integer
Dim NUMBER_OF_RESPONSES As Integer
NUMBER_OF_RESPONSES = 3
start_i = NUMBER_OF_RESPONSES + 1 '0
end_i = 1 'NUMBER_OF_RESPONSES
Set curParagraph = ActiveDocument.Paragraphs.First
curParagraph.Range.Text = "[START]"
i = start_i
Do
If (start_i < end_i) Then
i = i + 1
Else
i = i - 1
End If
Set curParagraph = curParagraph.Range.Paragraphs.Add
curParagraph.Range.Text = "[question_" & i & "][" & "response_" & i & "]"
Loop While (i <> end_i)
Set curParagraph = curParagraph.Range.Paragraphs.Add
curParagraph.Range.Text = "[END]"
Solution
Well now I feel silly for asking the question. The solution was pretty simple.
ActiveDocument.Bookmarks("START").Select
For i = 1 To NUMBER_OF_RESPONSES
Selection.Font.Size = 11
Selection.Font.Bold = True
Selection.FormFields.Add(Range:=Selection.Range, Type:=wdFieldFormTextInput) _
.Name = "question_" & i
Selection.Font.Bold = False
Selection.TypeText Text:=" "
Selection.FormFields.Add(Range:=Selection.Range, Type:=wdFieldFormTextInput) _
.Name = "response_" & i
Selection.TypeParagraph
Selection.TypeParagraph
Next i
So the real issue was placing the cursor in the right location:
ActiveDocument.Bookmarks("START").Select
From there I was able to use Selection to insert the desired FormFields and characters.
This link was pretty helpful.
And if you are reading this because you also are new and trying to learn what to do, check out how to record a macro. It's a good first step. Record the macro, view the code it generated, and use that code to guide your own development. Cool.