I have encountered an odd problem while writing VBA macro for Ppt document.
In the code I would like to extract text in frames at the very top of each slide of my presentation.
The text is divided by newline character, so first I would like to search for newline character.
I am using Instr function to search for the position of a newline in a string.
The code is the following:
Sub SetTitle()
Dim sl As Slide
Dim sh As Shape
Dim trng As TextRange
Dim asptext() As Variant
For Each sl In ActivePresentation.Slides
For Each sh In sl.Shapes
If sh.Top = 5.403701 Then
Set trng = sh.TextFrame.TextRange
txt = trng.Text
Debug.Print txt
pos = InStr(1, Chr(13), txt)
Debug.Print pos
Debug.Print Asc(Mid(txt, 7, 1))
If pos <> 0 Then
Debug.Print pos
End If
End If
Next sh
Next sl
End Sub
In Immediate window I am getting the following results : https://i.stack.imgur.com/Ig41i.png
The debugger gives an error:
Runtime error '5': Invalid procedure call or argument
for the line
Debug.Print Asc(Mid(txt, 7, 1))
So I suppose that there is a problem in recognizing this NL character.
Do you have any idea why it is like this?
The error comes likely not from any NewLine character but from a string that is too short. Mid(txt, 7, 1) will return an empty string, and that is not a valid parameter for the Asc-function.
To check for Newline-characters, you can use the constants vbCr (same as Chr(13)) and vbLf (same as Chr(10)) and vbCrLf.
If you are unsure about the content of a string, you can use the following function:
Function DumpString(s As String) As String
Const Separator = ", "
' Write all chars of String as ASCII-Value
Dim i As Long
For i = 1 To Len(s)
Dim a As Long, c As String
a = AscW(Mid(s, i, 1))
If a = AscW(vbCr) Then
c = "<CR>"
ElseIf a = AscW(vbLf) Then
c = "<LF>"
ElseIf a = AscW(vbTab) Then
c = "<TAB>"
ElseIf a = 0 Then
c = "<NUL>"
Else
c = Mid(s, i, 1)
End If
DumpString = DumpString & IIf(DumpString = "", "", Separator) & a & "(" & c & ")"
Next i
End Function
Related
I have come across this code(not mine), what it actually does is insert a Line break after a character length has been determined.
Public Function LFNearSpace(InputStr As String, CharCnt As Long)
Dim SplitStrArr() As Variant
Dim SplitCnt As Long
Dim c As Long
Dim i As Long
Dim Lcnt As Long
Dim Rcnt As Long
Dim OutputStr As String
'Split string into Array
ReDim SplitStrArr(Len(InputStr) - 1)
For i = 1 To Len(InputStr)
SplitStrArr(i - 1) = Mid$(InputStr, i, 1)
Next
SplitCnt = 0
For c = LBound(SplitStrArr) To UBound(SplitStrArr)
SplitCnt = SplitCnt + 1
If SplitCnt = CharCnt Then
'get count to space nearest to the left and right of word
For i = c To LBound(SplitStrArr) Step -1
If SplitStrArr(i) = " " Then
Lcnt = i
Exit For
End If
Next i
For i = c To UBound(SplitStrArr)
If SplitStrArr(i) = " " Then
Rcnt = i
Exit For
End If
Next i
'add line feed to nearest space
If (Rcnt - c) < (c - Lcnt) Then
SplitStrArr(Lcnt) = Chr(10)
SplitCnt = c - Lcnt
ElseIf (Rcnt - c) = (c - Lcnt) Then
SplitStrArr(Rcnt) = Chr(10)
SplitCnt = c - Rcnt
End If
End If
Next c
'Finalize the output into a single string
LFNearSpace = Join(SplitStrArr, "")
End Function
So here's my condition:
Column Width: 75
Font Name: Arial
Font Size: 9
I am customizing it for a while to fit my conditions,as far as I can think of
Unfortunately, the function cuts(inserts line break) the word not in natural way for example:
I call it like this, well if I change the 105 value the output changes but I wanted to create a solution why the output is similar to the image below.
SomeStr = LFNearSpace(SomeStr, 105)
Worksheets("Sheet1").Range("A1").Value = SomeStr
Any thoughts? Thanks
Try this
With Columns(1)
.ColumnWidth = 75
.Font.Name = "Arial"
.Font.Size = 9
.WrapText = True
End With
below code will break string to two line on occurrence of space after 20 character.
dim inputstr as string = "This is my test input string. I hope it helps!"
dim breakafter as integer= 20
dim line1 as string,line2 as string
dim found as integer=InStr(breakafter, inputstr, " ", vbTextCompare) ' KNOW WHERE IS 1st space after 20 char(s)
line1= Left(inputstr,found ) ' get 1st part of text
line2 = Replace(inputstr, " ", environment.newline() , found, 1, vbTextCompare) ' get remaining text
msgbox line1 + iif(isnothing(line2),"",line2)
I was wondering how to remove duplicate names/text's in a cell. For example
Jean Donea Jean Doneasee
R.L. Foye R.L. Foyesee
J.E. Zimmer J.E. Zimmersee
R.P. Reed R.P. Reedsee D.E. Munson D.E. Munsonsee
While googling, I stumbled upon a macro/code, it's like:
Function RemoveDupes1(pWorkRng As Range) As String
'Updateby20140924
Dim xValue As String
Dim xChar As String
Dim xOutValue As String
Set xDic = CreateObject("Scripting.Dictionary")
xValue = pWorkRng.Value
For i = 1 To VBA.Len(xValue)
xChar = VBA.Mid(xValue, i, 1)
If xDic.exists(xChar) Then
Else
xDic(xChar) = ""
xOutValue = xOutValue & xChar
End If
Next
RemoveDupes1 = xOutValue
End Function
The macro is working, but it is comparing every letter, and if it finds any repeated letters, it's removing that.
When I use the code over those names, the result is somewhat like this:
Jean Dos
R.L Foyes
J.E Zimers
R.P edsDEMuno
By looking at the result I can make out it is not what I want, yet I got no clue how to correct the code.
The desired output should look like:
Jean Donea
R.L. Foye
J.E. Zimmer
R.P. Reed
Any suggestions?
Thanks in Advance.
Input
With the input on the image:
Result
The Debug.Print output
Regex
A regex can be used dynamically iterating on the cell, to work as a Find tool. So it will extract only the shortest match. \w*( OUTPUT_OF_EXTRACTELEMENT )\w*, e.g.: \w*(Jean)\w*
The Regex's reference must be enabled.
Code
Function EXTRACTELEMENT(Txt As String, n, Separator As String) As String
On Error GoTo ErrHandler:
EXTRACTELEMENT = Split(Application.Trim(Mid(Txt, 1)), Separator)(n - 1)
Exit Function
ErrHandler:
' error handling code
EXTRACTELEMENT = 0
On Error GoTo 0
End Function
Sub test()
Dim str As String
Dim objMatches As Object
Set objRegExp = CreateObject("VBScript.RegExp") 'New regexp
lastrow = ActiveSheet.Cells(ActiveSheet.Rows.Count, "A").End(xlUp).Row
For Row = 1 To lastrow
str = Range("A" & Row)
F_str = ""
N_Elements = UBound(Split(str, " "))
If N_Elements > 0 Then
For k = 1 To N_Elements + 1
strPattern = "\w*(" & EXTRACTELEMENT(CStr(str), k, " ") & ")\w*"
With objRegExp
.Pattern = strPattern
.Global = True
End With
If objRegExp.test(strPattern) Then
Set objMatches = objRegExp.Execute(str)
If objMatches.Count > 1 Then
If objRegExp.test(F_str) = False Then
F_str = F_str & " " & objMatches(0).Submatches(0)
End If
ElseIf k <= 2 And objMatches.Count = 1 Then
F_str = F_str & " " & objMatches(0).Submatches(0)
End If
End If
Next k
Else
F_str = str
End If
Debug.Print Trim(F_str)
Next Row
End Sub
Note that you can Replace the Debug.Print to write on the target
cell, if it is column B to Cells(Row,2)=Trim(F_str)
Explanation
Function
You can use this UDF, that uses the Split Function to obtain the element separated by spaces (" "). So it can get every element to compare on the cell.
Loops
It will loop from 1 to the number of elements k in each cell and from row 1 to lastrow.
Regex
The Regex is used to find the matches on the cell and Join a new string with the shortest element of each match.
This solution operates on the assumption that 'see' (or some other three-letter string) will always be on the end of the cell value. If that isn't the case then this won't work.
Function RemoveDupeInCell(dString As String) As String
Dim x As Long, ct As Long
Dim str As String
'define str as half the length of the cell, minus the right three characters
str = Trim(Left(dString, WorksheetFunction.RoundUp((Len(dString) - 3) / 2, 0)))
'loop through the entire cell and count the number of instances of str
For x = 1 To Len(dString)
If Mid(dString, x, Len(str)) = str Then ct = ct + 1
Next x
'if it's more than one, set to str, otherwise error
If ct > 1 Then
RemoveDupeInCell = str
Else
RemoveDupeInCell = "#N/A"
End If
End Function
My spreadsheet has a column with value like this string:
some text (text1) some test (text2) (text1)
How do I get all values between parentheses? The result I am looking for is:
text1, text2
Even if text1, text2... testn is present in the cell multiple times, I need it in the result only once.
I found a function GetParen here: Get the value between the brackets
It is helpful, but it gives the fist available value in the parentheses and ignores the rest.
It seems unwieldy to have one User Defined Function for individual entries and another for a collective result of all entries.
Paste the following into a standard module code sheet.
Function getBracketedText(str As String, _
Optional pos As Integer = 0, _
Optional delim As String = ", ", _
Optional dupes As Boolean = False)
Dim tmp As String, txt As String, a As Long, b As Long, p As Long, arr() As Variant
tmp = str
ReDim arr(1 To 1)
For b = 1 To (Len(tmp) - Len(Replace(tmp, Chr(40), vbNullString)))
p = InStr(p + 1, tmp, Chr(40))
txt = Trim(Mid(tmp, p + 1, InStr(p + 1, tmp, Chr(41)) - (p + 1)))
If UBound(Filter(arr, txt, True)) < 0 Or dupes Then '<~~ check for duplicates within the array
a = a + 1
ReDim Preserve arr(1 To a)
arr(UBound(arr)) = txt
End If
Next b
If CBool(pos) Then
getBracketedText = arr(pos)
Else
getBracketedText = Join(arr, delim)
End If
End Function
Use like any other native worksheet function. There are optional parameters to retrieve an individual element or a collection as well as changing the default <comma><space> delimiter.
This code works for me:
Sub takingTheText()
Dim iniP 'first parenthesis
Dim endP 'last parentehis
Dim myText 'the text
Dim txtLen
Dim i
Dim tmp
Dim j
myText = Range("A1").Value
txtLen = Len(myText)
j = 0
Do 'Loop in the text
i = i + 1 'a counter
iniP = InStr(1, myText, "(", 1) 'found the first occurence of the (
endP = InStr(1, myText, ")", 1) 'same as above
tmp = tmp & Right(Left(myText, i), 1) 'take the text garbage text
If i = iniP Then 'here comes the work
j = j + 1 'here take the cell index
myText = Replace(myText, tmp, "") 'remove the garbage text in front the first (
tmp = Left(myText, endP - iniP - 1) 'reuse the var to store the usefull text
Cells(1, 2).Value = Cells(1, 2).Value & Chr(10) & tmp 'store in the cell B1
'If you want to stored in separated cells use the below code
'Cells(j, 2).Value = tmp
myText = Replace(myText, tmp & ")", "", 1, 1) ' remove the garbage text from the main text
tmp = Empty 'empty the var
i = 0 'reset the main counter
End If
Loop While endP <> 0
End Sub
Result:
Please check and tellme if is ok.
Edit#1
Cells(1, 2).Value = Cells(1, 2).Value & Chr(10) & tmp this code store the text in separated lines inside the same cell, may be you want to use spaces between the resulting text because of chr(10) (also you can use chr(13)), then you can use Cells(1, 2).Value = Cells(1, 2).Value & " " & tmp, or use any other character instead the string inside the & symbols
I am searching for a solution to convert a double to a string, but the string should have a comma before the decimal place, not a point.
"One and a half" should look that way 1,5 (german notation).
Thanks for your help!!
A combination of CStr and Replace will do the job.
Function Dbl2Str(dbl As Double) As String
Dbl2Str = Replace(CStr(dbl), ".", ",")
End Function
Unfortunately in VBA, you can't easily write locale-independent code. That is, you can't specify a locale when you take a CStr cast.
One work around is to convert a double like 0.5 to a string and see what you end up with. If you end up with 0,5 then you're in German (etc.) locale, and you don't need to do anything else.
If you end up with 0.5 then you know you need to make a conversion. Then you just need to traverse your string, replacing dots with commas and vice versa (the vice versa bit is important in case your string has thousands delimiters). You can use Replace for that.
Following RubberDuck comment I ended up with this:
Function DblToStr(x As Double)
DblToStr = CStr(x)
If (Application.ThousandsSeparator = ".") Then
DblToStr = Replace(DblToStr, ".", "")
End If
If (Application.DecimalSeparator = ".") Then
DblToStr = Replace(DblToStr, ".", ",")
End If
End Function
something like this then
Dim somestring As String
Dim someDec As Double
someDec = 1.5
somestring = CStr(someDec)
somestring = Replace(somestring, ".", ",")
MsgBox (somestring)
Select the cells you wish to convert and run this small macro:
Sub changeIT()
For Each r In Selection
t = r.Text
If InStr(1, r, ".") > 0 Then
r.Clear
r.NumberFormat = "#"
r.Value = Replace(t, ".", ",")
End If
Next r
End Sub
Only those cells with "." in them will change and they will be Strings rather than Doubles
I checked the other answers but ended up writing my own solution to convert user inputs like 1500.5 into 1,500.50, using below code:
'
' Separates real-numbers by "," and adds "." before decimals
'
Function FormatNumber(ByVal v As Double) As String
Dim s$, pos&
Dim r$, i&
' Find decimal point
s = CStr(v)
pos = InStrRev(s, ".")
If pos <= 0 Then
pos = InStrRev(s, ",")
If pos > 0 Then
Mid$(s, pos, 1) = "."
Else
pos = Len(s) + 1
End If
End If
' Separate numbers into "r"
On Error Resume Next
i = pos - 3
r = Mid$(s, i, 3)
For i = i - 3 To 1 Step -3
r = Mid$(s, i, 3) & "," & r
Next i
If i < 1 Then
r = Mid$(s, 1, 2 + i) & "," & r
End If
' Store dot and decimal numbers into "s"
s = Mid$(s, pos)
i = Len(s)
If i = 2 Then
s = s & "0"
ElseIf i <= 0 Then
s = ".00"
End If
' Append decimals and return
FormatNumber = r & s
End Function
I am getting the impression that this is not possible in word but I figure if you are looking for any 3-4 words that come in the same sequence anywhere in a very long paper I could find duplicates of the same phrases.
I copy and pasted a lot of documentation from past papers and was hoping to find a simple way to find any repeated information in this 40+ page document there is a lot of different formatting but I would be willing to temporarily get rid of formatting in order to find repeated information.
To highlight all duplicate sentences, you can also use ActiveDocument.Sentences(i). Here is an example
LOGIC
1) Get all the sentences from the word document in an array
2) Sort the array
3) Extract Duplicates
4) Highlight duplicates
CODE
Option Explicit
Sub Sample()
Dim MyArray() As String
Dim n As Long, i As Long
Dim Col As New Collection
Dim itm
n = 0
'~~> Get all the sentences from the word document in an array
For i = 1 To ActiveDocument.Sentences.Count
n = n + 1
ReDim Preserve MyArray(n)
MyArray(n) = Trim(ActiveDocument.Sentences(i).Text)
Next
'~~> Sort the array
SortArray MyArray, 0, UBound(MyArray)
'~~> Extract Duplicates
For i = 1 To UBound(MyArray)
If i = UBound(MyArray) Then Exit For
If InStr(1, MyArray(i + 1), MyArray(i), vbTextCompare) Then
On Error Resume Next
Col.Add MyArray(i), """" & MyArray(i) & """"
On Error GoTo 0
End If
Next i
'~~> Highlight duplicates
For Each itm In Col
Selection.Find.ClearFormatting
Selection.HomeKey wdStory, wdMove
Selection.Find.Execute itm
Do Until Selection.Find.Found = False
Selection.Range.HighlightColorIndex = wdPink
Selection.Find.Execute
Loop
Next
End Sub
'~~> Sort the array
Public Sub SortArray(vArray As Variant, i As Long, j As Long)
Dim tmp As Variant, tmpSwap As Variant
Dim ii As Long, jj As Long
ii = i: jj = j: tmp = vArray((i + j) \ 2)
While (ii <= jj)
While (vArray(ii) < tmp And ii < j)
ii = ii + 1
Wend
While (tmp < vArray(jj) And jj > i)
jj = jj - 1
Wend
If (ii <= jj) Then
tmpSwap = vArray(ii)
vArray(ii) = vArray(jj): vArray(jj) = tmpSwap
ii = ii + 1: jj = jj - 1
End If
Wend
If (i < jj) Then SortArray vArray, i, jj
If (ii < j) Then SortArray vArray, ii, j
End Sub
SNAPSHOTS
BEFORE
AFTER
I did not use my own DAWG suggestion, and I am still interested in seeing if someone else has a way to do this, but I was able to come up with this:
Option Explicit
Sub test()
Dim ABC As Scripting.Dictionary
Dim v As Range
Dim n As Integer
n = 5
Set ABC = FindRepeatingWordChains(n, ActiveDocument)
' This is a dictionary of word ranges (not the same as an Excel range) that contains the listing of each word chain/phrase of length n (5 from the above example).
' Loop through this collection to make your selections/highlights/whatever you want to do.
If Not ABC Is Nothing Then
For Each v In ABC
v.Font.Color = wdColorRed
Next v
End If
End Sub
' This is where the real code begins.
Function FindRepeatingWordChains(ChainLenth As Integer, DocToCheck As Document) As Scripting.Dictionary
Dim DictWords As New Scripting.Dictionary, DictMatches As New Scripting.Dictionary
Dim sChain As String
Dim CurWord As Range
Dim MatchCount As Integer
Dim i As Integer
MatchCount = 0
For Each CurWord In DocToCheck.Words
' Make sure there are enough remaining words in our document to handle a chain of the length specified.
If Not CurWord.Next(wdWord, ChainLenth - 1) Is Nothing Then
' Check for non-printing characters in the first/last word of the chain.
' This code will read a vbCr, etc. as a word, which is probably not desired.
' However, this check does not exclude these 'words' inside the chain, but it can be modified.
If CurWord <> vbCr And CurWord <> vbNewLine And CurWord <> vbCrLf And CurWord <> vbLf And CurWord <> vbTab And _
CurWord.Next(wdWord, ChainLenth - 1) <> vbCr And CurWord.Next(wdWord, ChainLenth - 1) <> vbNewLine And _
CurWord.Next(wdWord, ChainLenth - 1) <> vbCrLf And CurWord.Next(wdWord, ChainLenth - 1) <> vbLf And _
CurWord.Next(wdWord, ChainLenth - 1) <> vbTab Then
sChain = CurWord
For i = 1 To ChainLenth - 1
' Add each word from the current word through the next ChainLength # of words to a temporary string.
sChain = sChain & " " & CurWord.Next(wdWord, i)
Next i
' If we already have our temporary string stored in the dictionary, then we have a match, assign the word range to the returned dictionary.
' If not, then add it to the dictionary and increment our index.
If DictWords.Exists(sChain) Then
MatchCount = MatchCount + 1
DictMatches.Add DocToCheck.Range(CurWord.Start, CurWord.Next(wdWord, ChainLenth - 1).End), MatchCount
Else
DictWords.Add sChain, sChain
End If
End If
End If
Next CurWord
' If we found any matching results, then return that list, otherwise return nothing (to be caught by the calling function).
If DictMatches.Count > 0 Then
Set FindRepeatingWordChains = DictMatches
Else
Set FindRepeatingWordChains = Nothing
End If
End Function
I have tested this on a 258 page document (TheStory.txt) from this source, and it ran in just a few minutes.
See the test() sub for usage.
You will need to reference the Microsoft Scripting Runtime to use the Scripting.Dictionary objects. If that is undesirable, small modifications can be made to use Collections instead, but I prefer the Dictionary as it has the useful .Exists() method.
I chose a rather lame theory, but it seems to work (at least if I got the question right cuz sometimes I'm a slow understander).
I load the entire text into a string, load the individual words into an array, loop through the array and concatenate the string, containing each time three consecutive words.
Because the results are already included in 3 word groups, 4 word groups or more will automatically be recognized.
Option Explicit
Sub Find_Duplicates()
On Error GoTo errHandler
Dim pSingleLine As Paragraph
Dim sLine As String
Dim sFull_Text As String
Dim vArray_Full_Text As Variant
Dim sSearch_3 As String
Dim lSize_Array As Long
Dim lCnt As Long
Dim lCnt_Occurence As Long
'Create a string from the entire text
For Each pSingleLine In ActiveDocument.Paragraphs
sLine = pSingleLine.Range.Text
sFull_Text = sFull_Text & sLine
Next pSingleLine
'Load the text into an array
vArray_Full_Text = sFull_Text
vArray_Full_Text = Split(sFull_Text, " ")
lSize_Array = UBound(vArray_Full_Text)
For lCnt = 1 To lSize_Array - 1
lCnt_Occurence = 0
sSearch_3 = Trim(fRemove_Punctuation(vArray_Full_Text(lCnt - 1) & _
" " & vArray_Full_Text(lCnt) & _
" " & vArray_Full_Text(lCnt + 1)))
With Selection.Find
.Text = sSearch_3
.Forward = True
.Replacement.Text = ""
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
Do While .Execute
lCnt_Occurence = lCnt_Occurence + 1
If lCnt_Occurence > 1 Then
Selection.Range.Font.Color = vbRed
End If
Selection.MoveRight
Loop
End With
Application.StatusBar = lCnt & "/" & lSize_Array
Next lCnt
errHandler:
Stop
End Sub
Public Function fRemove_Punctuation(sString As String) As String
Dim vArray(0 To 8) As String
Dim lCnt As Long
vArray(0) = "."
vArray(1) = ","
vArray(2) = ","
vArray(3) = "?"
vArray(4) = "!"
vArray(5) = ";"
vArray(6) = ":"
vArray(7) = "("
vArray(8) = ")"
For lCnt = 0 To UBound(vArray)
If Left(sString, 1) = vArray(lCnt) Then
sString = Right(sString, Len(sString) - 1)
ElseIf Right(sString, 1) = vArray(lCnt) Then
sString = Left(sString, Len(sString) - 1)
End If
Next lCnt
fRemove_Punctuation = sString
End Function
The code assumes a continuous text without bullet points.