How do I delete any more than 1 specific character in a paragraph while leaving the first apperance? - vba

Edit:
Spaces and tabs everywhere
Goal:
If there's 1 tab or less, check the next paragraph.
If there's 2 or more tabs, leave the first and delete the rest in that paragraph. Check next paragraph.
I want to leave the first paragraph marker so I can turn it into a table with the word in the first column, and the description/definition in the second column.
Attempted Method:
Look at each character in a paragraph and count the tabs. After counting tab characters, if there are more than the desired amount (currently one tab) then replace the first tab in the paragraph with a "^t" (tab character) and then any remaining tabs in that paragraph with "" (nothing, essentially deleting it).
Issues:
The script only sometimes works. It seems to get hung up when there are lines with less than 2 tabs per line.
The other times it'll delete the first tab in a line. The first occurrence should stay while the remainder leave.
The last line of my test word document sometimes isn't touched.
IF I run in repeatedly, it'll delete all tabs in the document except the last one in the last line.
Description:
I am an ugly coder, sorry. I'm self taught with loops, if/thens, and recording and modifying macros. This is not homework, it's so I don't have to manually go through 1,500 documents to format them since people use tabs to align text instead of the ruler bar. Pacman mode (show/hide non-printable characters mode) is very scary looking.
Sub TabFinder()
Dim oPara As Word.Paragraph 'paragraph
Dim var 'a counter for a FOR loop for finding tabs in a paragraph, represents a single character
Dim TabCounter As Long 'how many tabs will be in a paragraph
Dim oChar As Word.Characters 'characters in a paragraph
Dim StartHere As Long 'not currently used - thought I would try and save the location of the first tab to save it
Dim TabsFoundAndReplaced As Long 'how many times the program has replaced a tab in a line
Dim ReplaceText 'first tab will replace with a tab, 2nd and greater tab will be replaced with nothing
Dim ReplaceAmmount As Long 'counter for how many times tabs have been replaced per line
Dim TabsWantedPerLine As Long 'number of tabs we should keep in a line and not replace
TabsWantedPerLine = 1 'desired upper limit of tabs to stay in a line
Selection.HomeKey Unit:=wdStory 'go to the beginning of the document first for find feature
For Each oPara In ActiveDocument.Paragraphs 'for every paragraph in the document
TabCounter = 0
Set oChar = oPara.Range.Characters 'find the number of characters, and set oChar as the total
For var = 1 To oChar.Count 'for each character do the following
If oChar(var) <> "" Then 'stops an error
Select Case Asc(oChar(var)) 'stops an error
Case 9 '9 is tabs 'if there character oChar is a tab, then do the following
'If TabCounter = 0 Then StartHere = var 'not used currently, just a thought on how to save the tab with additional coding
TabCounter = TabCounter + 1 'counts tabs in the line
End Select
End If
Next
If TabCounter > TabsWantedPerLine Then 'if there are more tabs in a paragraph than upper limit, do the following (hopefully deleting any after the desired tabs
For ReplaceAmmount = 1 To TabCounter 'do this for each of the tabs in the paragraph
If ReplaceAmmount <= TabsWantedPerLine Then ReplaceText = "^t" Else ReplaceText = "" 'replaces the
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
.Text = "^t"
.Replacement.Text = ReplaceText 'replaces with a "^t" if first go around or "" if >1st go around
.Forward = True
.Wrap = wdFindStop
End With
With Selection
If .Find.Forward = True Then
.Collapse Direction:=wdCollapseStart
Else
.Collapse Direction:=wdCollapseEnd
End If
.Find.Execute replace:=wdReplaceOne
'.Find.Execute
Set oChar = oPara.Range.Characters 'since we have replaced some characters in the paragraph
End With
Next ReplaceAmmount
End If
Next 'for each oPara
End Sub

This code makes permanent changes to your data, so test it before using in real life.
This code seems to do what you want.
Sub TabFinder()
Dim i As Long
Const sFAKETAB As String = "|tab|"
For i = 1 To ThisDocument.Paragraphs.Count
'replace the first tab w a fake one
ThisDocument.Paragraphs(i).Range.Text = Replace(ThisDocument.Paragraphs(i).Range.Text, vbTab, sFAKETAB, 1, 1)
'replace all other tabs w nothing
ThisDocument.Paragraphs(i).Range.Text = Replace(ThisDocument.Paragraphs(i).Range.Text, vbTab, vbNullString)
'replace the fake tab w a real tab
ThisDocument.Paragraphs(i).Range.Text = Replace(ThisDocument.Paragraphs(i).Range.Text, sFAKETAB, vbTab)
Next i
End Sub
Here's the thing about using a For Each and changing the .Range.Text property: it doesn't work. It seems to reset which paragraph it's on. Kind of like when you delete something in a For Each loop and VBA loses track of where it is. So the above line are a little more verbose than 1) using For Each 2) using a With Block and/or 3) setting an object variable, but it works.
Update
Or you can replace the tabs with spaces, per Wayne's excellent comment.
Sub TabFinder3()
Dim i As Long
Dim j As Long
Const sFAKETAB As String = "|tab|"
Const lMAXTABS As Long = 5 'the maximum adjacent tabs you would have
For i = 1 To ThisDocument.Paragraphs.Count
'replace the first tab a fake one
ThisDocument.Paragraphs(i).Range.Text = Replace(ThisDocument.Paragraphs(i).Range.Text, vbTab, sFAKETAB, 1, 1)
'replace all other tabs w a space - multiple tabs replace wWith one space
For j = lMAXTABS To 1 Step -1
ThisDocument.Paragraphs(i).Range.Text = Replace(ThisDocument.Paragraphs(i).Range.Text, String(j, vbTab), Space(1))
Next j
'replace the fake tab w a real tab
ThisDocument.Paragraphs(i).Range.Text = Replace(ThisDocument.Paragraphs(i).Range.Text, sFAKETAB, vbTab)
Next i
End Sub

Let's try something like this which just uses some simple string functions.
First, iterate the paragraphs by index position (I couldn't make it work with a For Each oPara loop).
Capture each paragraph's text in a string variable: paraText
Use a Do ... Loop construct to replace the tabs with the string Replace function. The replacement text is assigned dynamically using the IIF function, which will use "^t" if TabCounter=1 or otherwise use the null string "". This loops until there are no tab characters in the string.
Once the paraText string has been thoroughly manipulated, simply assign that string to the oPara.Range.Text = paraText:
Option Explicit
Sub TabFinder()
Dim oPara As Word.Paragraph 'paragraph
Dim var 'a counter for a FOR loop for finding tabs in a paragraph, represents a single character
Dim TabCounter As Long 'how many tabs will be in a paragraph
Dim TabsWantedPerLine As Long 'number of tabs we should keep in a line and not replace
Dim paraText As String 'represents the string of each paragraph
TabsWantedPerLine = 1 'desired upper limit of tabs to stay in a line
Selection.HomeKey Unit:=wdStory 'go to the beginning of the document first for find feature
For var = 1 To ActiveDocument.Paragraphs.Count 'for every paragraph in the document
Set oPara = ActiveDocument.Paragraphs(var)
TabCounter = 0
paraText = oPara.Range.Text
'# Loop until there are no more tabs left to replace
Do Until InStr(paraText, Chr(9)) = 0
TabCounter = TabCounter + 1
'# replace tabs in our temporary string variable:
paraText = Replace(paraText, Chr(9), IIf(TabCounter = 1, "^t", vbNullString), , 1)
Loop
'# Assign our new string ot the paragraph range
oPara.Range.Text = paraText
Next var
End Sub
Updated from comments above
Backwards iteration over each character in paragraph, delete character IF it's a tab AND there is another tab somwhere to the left of it within the paragraph. This is functionally same as ensuring there is only one tab per paragraph, and only the first one should remain.
Example document:
Example output:
Here is the updated code. This would need more (probably a lot more) tweaking if you wanted to allow more than one tab per paragraph, but since the limit is 1, I think this works.
Note: This does not address indentation which Word recognizes distinctly from the paragraph's text.
Sub TabFinder()
Dim oPara As Word.Paragraph 'paragraph
Dim var 'a counter for a FOR loop for finding tabs in a paragraph, represents a single character
Dim i As Integer
Dim paraText As String 'represents the string of each paragraph
Selection.HomeKey Unit:=wdStory 'go to the beginning of the document first for find feature
LeaveFirstTab = False
For var = 1 To ActiveDocument.Paragraphs.Count 'for every paragraph in the document
Set oPara = ActiveDocument.Paragraphs(var)
For i = oPara.Range.Characters.Count To 1 Step -1
If Mid(oPara.Range.Text, i, 1) = Chr(9) Then
'As long as there's another tab to the left of this one, delete this one
If InStr(Left(oPara.Range.Text, i - 1), Chr(9)) > 1 Then
oPara.Range.Characters(i).Delete
End If
End If
Next
Next
End Sub
Updated to explain Mid() Function
The VBA Mid function takes a substring of specified length, starting at a specified position, from a string. Syntax is:
MID( string, start_position, number_of_characters )
So in the code above, we are taking the substring beggining at i (our character iterator) with a length of 1. This is basic way to iterate over characters in a string.
I just noticed though that there is a better/simpler way to do this with Word's object model. Instead of the Mid function, I think you can use oPara.Range.Characters(i) to return the same substring. I don't know why I didn't realize this earlier because we actually use that method two lines later when we do the .Delete!
So modified:
For i = oPara.Range.Characters.Count To 1 Step -1
If oPara.Range.Characters(i) = Chr(9) Then
'As long as there's another tab to the left of this one, delete this one
If InStr(Left(oPara.Range.Text, i - 1), Chr(9)) > 1 Then
oPara.Range.Characters(i).Delete
End If
End If
Next

Related

VBA Word - .Find "[space]" always find matches outside the selection range thus loops undefinitely

When converting a table from PDF to word, I ended up with a format similar to the following:
([space] is a space character)
Text [space.spacing 10pts] Text [space.spacing 30pts] Text
Text [space.spacing 14pts] Text [space.spacing 31pts] Text
Text [space.spacing 12pts] Text [space.spacing 33pts] Text
Instead of a regular table with 3 columns and 3 rows containing each « Text » such as below
Text
Text
Text
Text
Text
Text
Text
Text
Text
In other words, instead of creating a column, the PDF conversion has created a regular paragraph, mimicking columns by adjusting [spaces].spacing according to the length of the text within the column.
So my inital thought was that it should be possible to recreate a table by identifing the spacing of each space for each paragraph of the converted table, eventually replacing them with identifiable symbols so I can convert the text into a table later on.
My idea was somewhat the following :
' For each paragraph of the selected text (which is the converted table)
' Find all [space] within the paragraph range
' If a [space] is found, check its spacing
' 1st case : [space].spacing is <= 1 pts (so a normal space)
' Do nothing
' 2nd case : [space].spacing is >= 10 pts (so previous Text is supposed to be within a small column)
' insert ££ (symbol for small column)
' 3rd case [space].spacing is >= 30 pts (so previous Text is supposed to be within a small column)
' insert §§ (symbol for large column)
' Once all [space] are found within the current paragraph, do the same with the next paragraph, until the last paragraph of the selected text
My current code is the following :
Private Sub Test()
Dim RngSearch As Range
Dim RngCurrent As Range
Dim Paragraph As Paragraph
For Each Paragraph In ActiveDocument.Paragraphs
Set RngCurrent = Paragraph.Range
RngCurrent.Select 'For testing purposes
With RngCurrent.Find
.Text = " "
Do While RngCurrent.Find.Execute
RngCurrent.Select 'For testing purposes
Select Case RngCurrent.Font.Spacing
Case Is >= 30
RngCurrent.Font.Spacing = 1
RngCurrent.InsertAfter ("§§")
Case Is >= 10
RngCurrent.Font.Spacing = 1
RngCurrent.InsertAfter ("¤")
Case Else
' Do Nothing
End Select
Loop
End With
Next Paragraph
End Sub
So it kinda word with one issue : it loops infinitely. Each time the text is finished, it goes back again indefinitely.
I managed to track the issue to the following code :
With RngCurrent.Find
.Text = " "
Do While RngCurrent.Find.Execute
RngCurrent.Select
' Use Case function
Loop
End With
Without it, the looping through paragraphs works normally (it ends at the last paragraph)
For Each Paragraph In ActiveDocument.Paragraphs
Set RngCurrent = Paragraph.Range
RngCurrent.Select
' Code here
Next Paragraph
But once .find.text (" ") is injected, it actually doesn't look within each Paragraphs.Range anymore as I supposed Do While RngCurrent.Find.Execute should have established.
I feel like the solution is something very stupid, but I've been searching for the reason why or alternatives for 2 days now. Everytime, it stops acting as per my understading when I'm using .find(" ").
I already tried using .wrap = wdFindStop, but it stops at the first match within the paragraph, and goes to the next paragraph prematurely.
With RngCurrent.Find
.Text = " "
.wrap = wdFindStop
Do While RngCurrent.Find.Execute
RngCurrent.Select
' Use Case function
Loop
End With
Strangely .wrap = wdFindAsk doesn't ask me anything... maybe that means something.
I believe it's because there are always spaces within each paragraph ? So it can loops indefinitely?
You're way over-complicating things:
Sub MakeTable()
Application.ScreenUpdating = False
Dim i As Single
With Selection
i = .Characters.First.Font.Size
With .Find
.ClearFormatting
.Replacement.ClearFormatting
.Wrap = wdFindStop
.Text = " "
.Replacement.Text = "^t"
.Replacement.Font.Size = i
.Font.Size = 10
.Execute Replace:=wdReplaceAll
.Font.Size = 30
.Execute Replace:=wdReplaceAll
End With
.ConvertToTable Separator:=vbTab
End With
Application.ScreenUpdating = True
End Sub
So I finally found not exactly a solution but a workaround for anyone who may need a similar solution. Instead of using a .find =" ", I decided to go the "hard" path and check for every word in a paragraph (which in MS Word, seems to end with a [space] character). Then, I check for the last character of a word (which is often a space) if its spacing is superior to a value. It the case, do something.
For Each RngWord In Paragraph.Range.Words
Set RngChar = RngWord.Characters.Last
Select Case RngChar.Font.Spacing
Case Is > 300
RngChar.Font.Spacing = 1
RngChar.InsertAfter ("£")
Case Is > 100
RngChar.Font.Spacing = 1
RngChar.InsertAfter ("#")
Case Is > 15
RngChar.Font.Spacing = 1
RngChar.InsertAfter ("¤")
Case Else
' Do Nothing
End Select
Next RngWord
It does the job, and isn't that slow, but I guess there are better solution :)

Set TextRange to start at beginning of current line (PowerPoint 2007 VBA)

Given that the cursor is within some TextRange tr, I would like a Sub that takes tr as an input argument and selects (or returns) a TextRange that starts at the start of the current line containing tr.startand ends at the next instance of a "." or ":"). Ideally this would work with an arbitrary TextRange or with the current selection (ActiveWindow.Selection.TextRange). NOTE: it maybe be that tr.Length = 0 (nothing actually selected).
I've answered the question by implementing a loop through all paragraphs in the text frame to find the paragraph containing the cursor, then through the lines in that paragraph to find the line containing the cursor. Then selecting text in the line starting at the first character and extending until the first of a ".", ":" or the end of the line. Then applying the "style" to the selected text. That code is below (some comments follow the code).
I am still hoping for a more elegant solution that doesn't require searching.
Option Explicit
Sub StyleRunInApply()
' Apply the run-in style to current selection (assumed to be a text range). If
' no characters are selected, apply from the beginning of the current line to
' the first of "." or ":" or the end of the current line.
'
' The "run-in style" is defined to be bold with Accent2 color of the current
' master theme.
Dim iLine As Long
Dim lenth As Long
Dim line As TextRange
Dim pgf As TextRange
Dim tr As TextRange
Dim thme As OfficeTheme
Set tr = ActiveWindow.Selection.TextRange
If tr.Length = 0 Then
' Loop through pgfs in parent text frame to find our line--
' the first pgf that ends at or beyond the cursor.
For Each pgf In tr.Parent.TextRange.Paragraphs
If pgf.Start + pgf.Length > tr.Start Or _
pgf.Start + pgf.Length > tr.Parent.TextRange.Length Then GoTo L_foundPgf
Next pgf ' (If fall through, pgf will be the final pgf in the frame.)
L_foundPgf:
' Find last line in pgf that starts before the cursor.
While iLine < pgf.Lines.Count And pgf.Lines(iLine + 1).Start < tr.Start
iLine = iLine + 1
Wend
Set line = pgf.Lines(iLine)
' Now look in the line for a ":" or "." and reset tr from the start of
' the line up to and including the first of a ":" or "." or the end of
' line.
lenth = line.Length
If Not line.Find(":") Is Nothing Then
lenth = line.Find(":").Start - line.Start + 1
ElseIf Not line.Find(".") Is Nothing Then
If line.Find(".").Start - line.Start + 1 < lenth Then
lenth = line.Find(".").Start - line.Start + 1
End If
End If
Set tr = line.Characters(1, lenth)
End If
' Set the finally selected text to the style!
Set thme = ActivePresentation.SlideMaster.Theme
tr.Font.Color = thme.ThemeColorScheme(msoThemeAccent2)
tr.Font.Bold = True
End Sub 'StyleRunInApply
Three comments on the code:
Improvements welcome.
A variation that set the end position of the text to be selected rather than the length seems to be about the same in terms of comprehensibility, size, and elegance.
In defense of the GoTo: I use it only as part of a substitute for "missing" language features, in this case, an Exit For, and then, for such exits, only immediately following the Then, which is the reason for not having a block follow the Then.

How to write VBA to format sentence starting with // in Word 2016?

I have a 400+ page coding manual I use, and unfortunately turned off the green for all the comments in the manual. I can't undo it, as I hadn't noticed it until it was too late. Its ruined years of work.
How would I write VBA to parse the document finding sentences starting with // and ending in a Paragraph mark and change the color of them? Or assign a style to them?
Here is a start that I have cobbled together, I admire people who can write code without intellisence, its like trying to find your way blindfolded
Dim oPara As Word.Paragraph
Dim rng As Range
Dim text As String
For Each oPara In ActiveDocument.Paragraphs
If Len(oPara.Range.text) > 1 Then
Set rng = ActiveDocument.Range(oPara.Range.Start,oPara.Range.End)
With rng.Font
.Font.Color = wdColorBlue
End With
End If
Next
End Sub
The following seems to work:
Dim oPara As Word.Paragraph
Dim text As String
For Each oPara In ActiveDocument.Paragraphs
text = oPara.Range.text
'Check the left 2 characters for //
If Left(oPara.Range.text, 2) = "//" Then
oPara.Range.text = "'" & text
End If
Next
I assume you are using VBA so by placing a ' in front of // it will turn the line green. You could modify the code to replace // with ' if desired. The opera.range.text should grab the entire paragraph.

Loop through pages OR page breaks?

I'm basically trying to create a cumulative word count for documents that will put the number of words on each page into its footer and add it to the total words each page. After a lot of looking around, I found that Word doesn't really handle pages the same for everybody and so doesn't have any interface to access the individual pages through.
Now I'm trying to separate each page with page breaks so there's a clear delimiter between pages, but I still can't find how to loop through these. Any clues?
I'm going to post the code I have, but it's only for getting the word count currently. No proper attempts at cycling through page breaks because I don't know how.
Sub getPageWordCount()
Dim iPgNum As Integer
Dim sPgNum As String
Dim ascChar As Integer
Dim rngPage As Range
Dim iBeginPage As Integer
Dim iEndPage As Integer
' Go to start of document and make sure its paginated correctly.
Selection.HomeKey Unit:=wdStory, Extend:=wdMove
ActiveDocument.Repaginate
' Loop through the number of pages in the document.
For iPgNum = 2 To Selection.Information(wdNumberOfPagesInDocument)
sPgNum = CStr(iPgNum)
iBeginPage = Selection.Start
' Go to next page
Selection.GoTo wdGoToPage, wdGoToAbsolute, sPgNum
' and to the last character of the previous page...
Selection.MoveLeft wdCharacter, 1, wdMove
iEndPage = Selection.Start
' Retrieve the character code at insertion point.
Set rngPage = ActiveDocument.Range(iBeginPage, iEndPage)
MsgBox rngPage.ComputeStatistics(wdStatisticWords)
'rngPage.Footers(wdHeaderFooterPrimary).Range.Text = rngPage.ComputeStatistics(wdStatisticWords)
'ActiveDocument.Sections(2).Footers
' Check the character code for hard page break or text.
Next
' ActiveDocument.Sections(2).Footers(wdHeaderFooterPrimary).Range.Text = "bob" 'Testing
End Sub
Finally got it, managed to guess my way through it a bit, taking assorted bits from dark corners of the internet:
Sub getPageWordCount()
'Replace all page breaks with section breaks
Dim myrange1 As Range, myrangedup As Range
Selection.HomeKey wdStory
Selection.Find.ClearFormatting
With Selection.Find
Do While .Execute(findText:="^m", Forward:=True, _
MatchWildcards:=False, Wrap:=wdFindStop) = True
Set myrange = Selection.Range
Set myrangedup = Selection.Range.Duplicate
myrange.Collapse wdCollapseEnd
myrange.InsertBreak wdSectionBreakNextPage
myrangedup.Delete
Loop
End With
'Unlink all footers and insert word count for each section
Dim sectionCount, sectionNumber, i, sectionWordCount, cumulativeWordCount As Integer
sectionCount = ActiveDocument.Sections.Count
For sectionNumber = 1 To sectionCount
With ActiveDocument.Sections(sectionNumber)
sectionWordCount = .Range.ComputeStatistics(wdStatisticWords)
cumulativeWordCount = cumulativeWordCount + sectionWordCount
With .Footers.Item(1)
.LinkToPrevious = False
.Range.Text = "This page's word count: " + CStr(sectionWordCount) + " | Cumulative word count: " + CStr(cumulativeWordCount)
.Range.ParagraphFormat.Alignment = wdAlignParagraphCenter
End With
End With
Next
End Sub
And now I've just discovered that if I want to port this macro to an add-in for ease of use for non-techy users I have to write it in VB 2010 in Visual Studio where the API is different. Good luck me!
It sounds as if you have what you need, but I was working on an alternative that I may as well post because it does not require you to add page breaks or section breaks. But you would have to add the same nested field in each footer that appears in the document (I haven't done that part here, but it's not completely trivial because there may be multiple sections and multiple footers per section).
The field code you need to add (in addition to your 'This page's word count: ' text) is
{ DOCVARIABLE "s{ SECTION }p{ PAGE \*arabic }" }
As written, the method may break in some circumstances, e.g. if there are continuous section breaks. I haven't checked.
Sub createWordCounts()
Dim i As Integer
Dim rng As Word.Range
With ActiveDocument
For i = 1 To .Range.Information(wdActiveEndPageNumber)
Set rng = .GoTo(wdGoToPage, wdGoToAbsolute, i).Bookmarks("\page").Range
.Variables("s" & CStr(rng.Information(wdActiveEndSectionNumber)) & "p" & CStr(rng.Information(wdActiveEndAdjustedPageNumber))).Value = rng.ComputeStatistics(wdStatisticWords)
Set rng = Nothing
Next
End With
End Sub

trying to remove lines containing yellow highlighted text

i have a document that i'm reviewing with yellow highlighted text. i want a macro to go through and remove the lines that are highlighted. so far this is what i have:
Sub hilight()
'
' hilight Macro
' removes lines in yellow hi-lighter
Dim p As Paragraph
For Each p In ActiveDocument.Paragraphs
Dim holder As String
If p.Range.Text = highlighted_text Then
p.Range.Text = holder
End If
Next p
End Sub
i need to know how highlighted text property is given so i can replace highlighted_text
Here is a solution. Note that it only replaces paragraphs in which the entire paragraph is yellow highlighted, not just part of it. There are a few things worth pointing out:
Replacing a paragraph will also take out the trailing line break, so I include it in the placeholder text
Since a line break is being added, if you don't loop through the paragraphs backwards, it'll be an infinite loop (thus the step -1)
Dim all variables outside of a loop
Sub ReplaceYellowParagrahps()
Dim p As Paragraph
Dim i As Long, count As Long
Dim placeholderText As String
placeholderText = "holder" & vblf
count = ActiveDocument.Paragraphs.count
For i = count To 1 Step -1
With ActiveDocument.Paragraphs(i).Range
If .HighlightColorIndex = wdYellow Then
.Text = placeholderText
End If
End With
Next
End Sub