I have documents that have AutoTextList fields. Unlike most fields, Fields(i).Code does not show the entire code. The Display Text / Literal Text does not show up when you display the field code nor does it report out in VBA.
I am trying to pull out the components of this field and place them into string variables.
The syntax of the field, when written, is:
{ AUTOTEXTLIST "Literal text" \s ["Style name"] \t ["Tip text"] }
I wrote the following macro to get a look at what can be found and how.
Sub LookAtAutoTextListField()
    Dim i As Long
    With ActiveDocument
        For i = .Fields.Count To 1 Step -1
            If .Fields(i).Type = wdFieldAutoTextList Then
                Debug.Print "Display Text: " & .Fields(i).Result
                Debug.Print "Code: " & .Fields(i).Code
            End If
        Next i
    End With
End Sub
Here are four examples of the results:
1
Display Text: What I want
Code: AUTOTEXTLIST \s Normal \t ""How much""
2
Display Text: What I want
Code: AUTOTEXTLIST \s "Normal" \t ""How much""
3
Display Text: What I want
Code: AUTOTEXTLIST \s "Body Text" \t "Right-click to choose how much"
4
Display Text: What I want
Code: AUTOTEXTLIST \s "Body Text" \t ""How much""
I can easily put the Display text in a string.
The pop-up tip text can have slashes or not. When inserted using the insert field, it will have backslashes front and back. See all but example 3 above.
The style name may or may not have quotation marks around it. It must have them if the style name contains a space, but otherwise they are not required but can be used. See examples 1 and 2 above.
I would like to get the text between \s and \t stripping out quotation marks and the text following \t stripping out quotation marks and the backslashes.
I can find the location of the \s in the code using the Instr function and the left and right position of the \t in the code using Instr and InstrRev functions. I can get the length of what is reported for the code using the Len function.
I can use that information, together with the Left and Right functions to get the text between \s and \t (i.e. the Style) but need to strip out any Quotation marks which may or may not be present. I don't know how to do that.
I can do the same thing with the tip text, but don't know how to strip out any quotation marks and backslashes.
You could use code along the following lines. It is not well tested: for example, I haven't checked what happens with an option value that starts with a " but has no terminating " (it is probably terminated by the end-of-field marker). It should work as long as you know you are dealing with a "typical" field code, e.g. the sort that Word itself might create, where there is white space between the various elements of the field, the white space is all composed of regular spaces, and multi-word strings are enclosed by straight double quotation marks (Chr(34)).
FWIW here I don't see quite what you describe when entering this field type from the field dialog box. I never see \t ""How much"". If I enter the tip text without quotation marks in the dialog box, I see \t "How much" and the quotation marks are not displayed in the tip. If I enter the tip text with quotation marks in the dialog box, I see \t "\"How much\"" and the quotation marks are displayed in the tip.
Sub testgetfieldparts()
    ' pass an AUTOTEXTLIST field object
    Call getAutotextlistFieldParts(ActiveDocument.Fields(1))
End Sub
Sub getAutotextlistFieldParts(f As Word.Field)
    Const FieldCodeName As String = "AUTOTEXTLIST"
    Dim i As Long
    Dim s As String
    Dim StyleName As String
    Dim TipText As String
    If f.Type = WdFieldType.wdFieldAutoTextList Then
        s = f.Code
        i = InStr(1, UCase(s), FieldCodeName)
        s = Trim(Mid(s, i + Len(FieldCodeName)))
        i = InStr(1, LCase(s), "\s")
        Debug.Print "Style Name: ";
        If i = 0 Then
            Debug.Print "No \s option specified"
        Else
            StyleName = OptionValue(s, i + 2)
            If StyleName = "" Then
                Debug.Print "\s option specified but no name specified"
            Else
                Debug.Print StyleName
            End If
        End If
        i = InStr(1, LCase(s), "\t")
        Debug.Print "Tip Text: ";
        If i = 0 Then
            Debug.Print "No \t option specified"
        Else
            TipText = OptionValue(s, i + 2)
            If TipText = "" Then
                Debug.Print "\t option specified but no text specified"
            Else
                Debug.Print TipText
            End If
        End If
    Else
        Debug.Print "Not an " & FieldCodeName & " field."
    End If
End Sub
' s is passed ByVal so that trimming it here does not alter the caller's copy
' of the field code (VBA passes arguments ByRef by default).
Function OptionValue(ByVal s As String, ByVal iStart As Long) As String
    Dim c As String
    Dim i As Long
    Dim escape As Boolean
    OptionValue = ""
    escape = False
    s = Trim(Mid(s, iStart))
    c = Left(s, 1)
    Select Case c
        Case """"
            ' quoted value: read up to the closing quote, honouring \ escapes
            For i = 2 To Len(s)
                c = Mid(s, i, 1)
                If escape Then
                    OptionValue = OptionValue & c
                    escape = False
                Else
                    If c = """" Then
                        Exit For
                    Else
                        If c = "\" Then
                            escape = True
                        Else
                            OptionValue = OptionValue & c
                        End If
                    End If
                End If
            Next
        Case "\" ' for now, assume this is the start of another option
            '
        Case Else
            ' unquoted value: read up to the next space, quote or backslash
            OptionValue = OptionValue & c
            For i = 2 To Len(s)
                c = Mid(s, i, 1)
                If c = "\" Or c = " " Or c = """" Then
                    Exit For
                Else
                    OptionValue = OptionValue & c
                End If
            Next
    End Select
End Function
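For anyone who wants to try the same scanning logic outside Word, here is a rough Python sketch of it. It is not the VBA above, only the same idea under the same "typical field code" assumptions; the names option_value and field_parts are made up for this illustration, and like the VBA it would be fooled by a \s or \t that appears inside a quoted value.

```python
def option_value(code: str, start: int) -> str:
    """Read one switch value from `code`, beginning at index `start`."""
    rest = code[start:].lstrip(" ")
    if not rest:
        return ""
    if rest[0] == '"':
        # Quoted value: read to the closing quote, treating \ as an escape.
        out = []
        escape = False
        for ch in rest[1:]:
            if escape:
                out.append(ch)
                escape = False
            elif ch == '"':
                break
            elif ch == "\\":
                escape = True
            else:
                out.append(ch)
        return "".join(out)
    if rest[0] == "\\":
        # Another switch starts immediately, so this one has no value.
        return ""
    # Unquoted value: stop at the next space, backslash or quote.
    out = []
    for ch in rest:
        if ch in ("\\", " ", '"'):
            break
        out.append(ch)
    return "".join(out)


def field_parts(code: str) -> dict:
    """Return the \\s and \\t values; None means the switch is absent."""
    lowered = code.lower()
    parts = {}
    for switch, key in (("\\s", "style"), ("\\t", "tip")):
        pos = lowered.find(switch)
        parts[key] = None if pos < 0 else option_value(code, pos + len(switch))
    return parts


print(field_parts(r'AUTOTEXTLIST \s "Body Text" \t "\"How much\""'))
```

The printed dictionary holds Body Text for the style and "How much" (with its quotation marks kept, because they were escaped) for the tip.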
If you had to deal with what is actually allowed in a Word field code rather than what you will typically find, things get considerably more complicated, for at least the following reasons in Word's "field code language":
Word generally recognises 6 different Unicode characters as double quotation marks for enclosing strings, not just ". It doesn't recognise any single quotation marks for that purpose. You can use any of the 6 to start a string and any of the 6 to end one - they don't have to match in any way.
Word recognises at least 11 Unicode characters as "white space". The VBA Trim, LTrim and RTrim functions don't remove most of them.
Inside quoted strings, the backslash character \ generally acts as an escape character, so you can insert a \" in the middle of a string to get a " character. But outside quoted strings, backslash will generally be seen as the start of a new option and will act as a string terminator for any unquoted string.
You can put all sorts of non-text items in a field, e.g. inline images, content controls etc. It may not make sense in any given field to do that, but in a complete field parsing solution you might have to deal with such things. It isn't uncommon to use nested fields (e.g. you could specify the Style Name and Tip Text that way). In that case, the field result is generally treated as a "spaceless string", so even if the result contains spaces you do not have to put quotation marks around it. So you cannot treat the field result in the same way as you would treat its plain text result. There is more...
In some cases you don't actually need any white space in the field, e.g. I think {AUTOTEXTLIST\sMyStyle\tImportantTip} would work. So a parser shouldn't rely on the presence of white space to separate tokens in the field.
Word doesn't always (ever?) prevent you from having multiple instances of the same option. In some cases that's deliberate (e.g. it's how you can include multiple books in a CITATION field.). But say you had two \t options in your AUTOTEXTLIST. In that case Word uses the first one unless no tip text is given, i.e. \t tip1 \t tip2 should display tip1 but \t \t tip2 will display tip2. AFAICS \s Style1 \s Style2 uses Style1 but I couldn't work out what \s \s Style2 was doing.
And there are doubtless exceptions to all that lot too. I have some code that deals with some of those issues but it isn't complete or well-tested.
I have tried all day to replace this character I have in a string which has characters in it... and after googling, I found is called a "Control Sequence Introducer".
It looks like the hex code is 9B and the ASCII code is 155. (I think, from what I've read).
The string comes from a file which I read in. I have some null characters to replace, which is working fine, but just after that I've been working to remove this weird character.
In notepad++ when I do show all symbols, it looks like this:
I tried the following:
strLine = strLine.Replace(Chr(155), " ")
strLine = Replace(strLine , "9B", " ")
strLine = Replace(strLine , "›", " ")
strLine = Replace(strLine , Chr(155), " ")
strLine= Regex.Replace(strLine, "\c#", " ")
strLine= Regex.Replace(strLine, "\c_", " ")
strLine= Regex.Replace(strLine, "\c", " ")
strLine= Regex.Replace(strLine, "\cA", " ")
strLine= Regex.Replace(strLine, "\cZ", " ")
I found a good section in wikipedia
Control Sequence Introducer Character ANSI Control Sequences
With everything going on in that, perhaps I have the wrong hex code?
Does anybody know how to replace this character? I have googled this for awhile now and it's elusive to me how to solve this.
I did find it finally, using regex and the hex code!
strLine = Regex.Replace(strLine, "\x9B", " ")
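A possible explanation for why the Chr(155) attempts failed (this is my assumption, not something confirmed above): in VB.NET, Chr maps values 128 to 255 through the ANSI code page, and in Windows-1252 byte 155 is the "›" character (U+203A), not the control character U+009B that \x9B matches. If that is what happened, ChrW(155) should also work. The mapping itself is easy to check in Python; strip_csi and strip_csi_regex are names invented for this sketch.

```python
import re

CSI = "\u009b"  # C1 control character "Control Sequence Introducer"


def strip_csi(line: str, replacement: str = " ") -> str:
    """Replace every U+009B in `line` using a plain string replace."""
    return line.replace(CSI, replacement)


def strip_csi_regex(line: str, replacement: str = " ") -> str:
    """Same result via a regex, mirroring the \\x9B pattern used above."""
    return re.sub(r"\x9b", replacement, line)


# Byte 0x9B decodes differently depending on the code page:
as_cp1252 = bytes([0x9B]).decode("cp1252")   # U+203A, the "›" character
as_latin1 = bytes([0x9B]).decode("latin-1")  # U+009B, the control character

print(repr(as_cp1252), repr(as_latin1))
print(repr(strip_csi("before\u009bafter")))
```

So a replace that targets "›" leaves the control character untouched, which matches what the question describes.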
Can someone please guide which one of the below is the correct Data Annotation, if I want to allow just alphabets:
[RegularExpression(@"[a-zA-Z]*", ErrorMessage = "Invalid {0}")]
OR
[RegularExpression(@"^[a-zA-Z]*", ErrorMessage = "Invalid {0}")]
Both seem to be working. The difference is the ^ symbol.
^ Caret is a Position Anchor.
Position anchors do not match characters, but positions such as start-of-line, end-of-line, start-of-word and end-of-word.
In this case you need both ^ and $: start-of-line and end-of-line respectively. E.g., ^[0-9]+$ matches a string made up entirely of digits.
So you should go with,
[RegularExpression(@"^[a-zA-Z]*$", ErrorMessage = "Invalid {0}")]
Because you need strings that consist of alphabetical characters only from start to end, with no other characters such as symbols or numerals. Here are some examples that you can play with.
let str1 = 'abcDef';
let str2 = '123abcDef';
let str3 = 'abcDef123';
let str4 = 'abc123Def';
let my_regex = /^[a-zA-Z]*$/;
let your_regex = /[a-zA-Z]*/;
alert(str1 + " : " + my_regex.test(str1) + " with my regex");
alert(str2 + " : " + my_regex.test(str2) + " with my regex");
alert(str3 + " : " + my_regex.test(str3) + " with my regex");
alert(str4 + " : " + my_regex.test(str4) + " with my regex");
alert(str1 + " : " + your_regex.test(str1) + " with your regex");
alert(str2 + " : " + your_regex.test(str2) + " with your regex");
alert(str3 + " : " + your_regex.test(str3) + " with your regex");
alert(str4 + " : " + your_regex.test(str4) + " with your regex");
The Caret (^) character is also referred to by the following terms:
Terminology
hat, control, uparrow, chevron, circumflex accent
Usage
It has two uses in regular expressions:
To denote the start of the line
If used immediately after an opening square bracket ([^), it negates the set of allowed characters: [123] means the character 1, 2, or 3 is allowed, whilst [^123] means any character other than 1, 2, or 3 is allowed.
Character Escaping
To express a caret without special meaning, it should be escaped by preceding it with a backslash, i.e. \^.
You can find it here...
I think you want something more like this:
[RegularExpression(@"^[a-zA-Z]+$", ErrorMessage = "Invalid {0}")]
which says to match
the beginning of the string ^
followed by 1 or more alphabetic characters [a-zA-Z]+ (you do need more than zero, right?)
followed by the end of the string $
This doesn't allow other strings to match such as 123abc or abc123 because the anchors of ^ and $ prevent that.
In your first example, the pattern would allow an empty string and would also allow the cases I mentioned in the paragraph above. Your second example has the same problem when used as a plain regex search: because * permits zero characters, ^[a-zA-Z]* matches the empty prefix of 123abc, and without the $ marker it also accepts abc123.
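To see the effect of the anchors in isolation, here is a small Python sketch using plain regex search semantics. The constant names and the accepts helper are made up for this illustration. As far as I know, RegularExpressionAttribute additionally requires the match to span the whole value, which would explain why both patterns in the question appear to work, but treat that as an assumption to verify.

```python
import re

UNANCHORED = re.compile(r"[a-zA-Z]*")   # first pattern from the question
START_ONLY = re.compile(r"^[a-zA-Z]*")  # second pattern from the question
ANCHORED = re.compile(r"^[a-zA-Z]+$")   # anchored at both ends, one or more letters


def accepts(pattern: re.Pattern, text: str) -> bool:
    """True if the pattern matches anywhere in `text` (search semantics)."""
    return pattern.search(text) is not None


for text in ("abcDef", "123abc", "abc123", ""):
    print(
        repr(text),
        accepts(UNANCHORED, text),
        accepts(START_ONLY, text),
        accepts(ANCHORED, text),
    )
```

Only the fully anchored pattern rejects 123abc, abc123 and the empty string; the other two accept every input because * can match zero characters.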
If you want to take my solution and extend it beyond ASCII alphabetic characters, you can change [a-zA-Z]+ to \p{L}+, which should work universally in Unicode (but that seems like it might be more than you're looking for; just including for completeness).
Finally, [RegularExpression] uses the standard regex capability that has been part of .NET for quite some time, expressed in the Regular Expression Language - Quick Reference.
I mean we have vbnewline, vblf, vbcr, vbtab.
Is there vbspace?
I mean this may sound trivial. Just press the space bar or type " ".
The thing is,
Space is unseen. Code will be clearer if we put vbspace.
Also there are many kind of space. Non breaking space etc.
So is there such thing? Where can I see list of vb special characters?
You can use Space(), which returns a string consisting of the specified number of spaces:
Dim TestString As String
' Returns a string with 1 space.
TestString = Space(1)
' Returns a string with 10 spaces.
TestString = Space(10)
' Inserts 10 spaces between two strings.
TestString = "Hi" & Space(10) & "Sharen Eayrs"
No, there is not. This page lists all of the vbFoo constant values: https://msdn.microsoft.com/en-us/library/microsoft.visualbasic.constants_fields(v=vs.110).aspx
I disagree with your notion that "code will be clearer", but if you really want:
Public Const vbSpace As String = " "
I'm trying to convert a string that contains someones name as "Last, First" to "First Last".
This is how I am doing it now:
name = name.Trim
name = name.Substring(name.IndexOf(",") + 1, name.Length) & " " & name.Substring(0, name.IndexOf(",") - 1)
When I do this I get the following error:
ArgumentOutOfRangeException was unhandled
Index and length must refer to a location within the string
Parameter name: length
Can someone explain why I am getting this error and how I should be doing this?
You are getting error on this:
name.Substring(name.IndexOf(",") + 1, name.Length)
The length argument should be name.Length minus the number of characters up to and including the comma.
The best way for that is to split the string.
Dim oFullname as string = "Last, First"
Dim oStr() as string = oFullname.split(","c)
oFullname = oStr(1).trim & " " & oStr(0).trim
MsgBox (oFullname)
The second parameter for String.Substring is the length of the substring, not the end position. For this reason, you're always going to go out of bounds if you do str.Substring(n, str.Length) with n > 0 (which would be the whole point of a substring).
You need to subtract name.IndexOf(",") + 1 from name.Length in your first substring. Or just split the string, as the others have suggested.
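The start/length arithmetic can be sketched like this in Python. The substring helper is hypothetical: it only imitates a (start, length) call with a bounds check, since Python slices take an end index instead. Note also that the "- 1" in the question's second Substring call would cut the last letter off the surname; the length of the part before the comma is simply the comma's index.

```python
def substring(s: str, start: int, length: int) -> str:
    """Hypothetical helper imitating a (start, length) substring call."""
    if start < 0 or length < 0 or start + length > len(s):
        raise IndexError("start and length must refer to a location within the string")
    return s[start:start + length]


def first_last(name: str) -> str:
    """Convert 'Last, First' to 'First Last' using start/length arithmetic."""
    name = name.strip()
    comma = name.index(",")
    last = substring(name, 0, comma)  # everything before the comma
    first = substring(name, comma + 1, len(name) - (comma + 1))  # the remainder
    return f"{first.strip()} {last.strip()}"


print(first_last("Last, First"))

try:
    # Same mistake as in the question: passing the full length as `length`.
    substring("Last, First", 5, len("Last, First"))
except IndexError as exc:
    print("out of range:", exc)
```

The first call prints First Last; the second reproduces the out-of-range failure because start plus length runs past the end of the string.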
Simply put, you only need to split the string:
Dim originalName As String = "Last,First"
Dim parts = originalName.Split(","c)
Dim name As String = parts(1).Trim() & " " & parts(0).Trim()
If you're using the Unix command line--like the terminal on a Mac--you can do it like this:
Let's say that you have a file containing your last-comma-space-first type names like this:
Last1, First1
Last2, First2
Last3, First3
OK, now let's save it as last_comma_space_first.txt. At this point you can use this command I came up with for your particular problem:
sed -E 's/([A-Za-z0-9]+), ([A-Za-z0-9]+)/\2 \1/g' last_comma_space_first.txt > first_space_last.txt
You're done! Now, go check that first_space_last.txt file! ^_^ You should get the following:
First1 Last1
First2 Last2
First3 Last3
Tell your friends... Or don't...
This would work while keeping to the poster's format.
Name = "Doe,John"
Name = Replace(Name.Substring(Name.IndexOf(","), Name.Length - Name.IndexOf(",")) & " " & Name.Substring(0, Name.IndexOf(",")), ",", "")
Result Name = "John Doe"