I'm trying to transfer info from the following format in Word to Excel columns "a", "b", "c", "d" while ignoring the number in front which is the index of the entry (21 in this case)
So far this is what I got but it's only for the bold text on the top left but I don't know how to get the other substrings. Any help with this will be appreciated.
Sub TheBoldAndTheExcelful()
Dim docCur As Document
Dim snt As Range
Dim i As Integer
'Requires a reference to the 'Microsoft Excel XX.0 Object Library'
Dim appXL As Excel.Application, xlWB As Excel.Workbook, xlWS As Excel.Worksheet
'This assumes excel is currently closed
Set appXL = CreateObject("Excel.Application")
appXL.Visible = True
Set xlWB = appXL.Workbooks.Add
Set xlWS = xlWB.Worksheets(1)
On Error GoTo ErrHandler
Application.ScreenUpdating = False
Set docCur = ActiveDocument
For Each snt In docCur.Sentences
If snt.Bold = True Then
i = i + 1
xlWS.Cells(i, 1).Value = snt.Text
End If
Next snt
ExitHandler:
Application.ScreenUpdating = True
Set snt = Nothing
Exit Sub
ErrHandler:
MsgBox Err.Description, vbExclamation
Resume ExitHandler
End Sub
In your example,
For Each snt In docCur.Sentences
If snt.Bold = True Then
i = i + 1
xlWS.Cells(i, 1).Value = snt.Text
End If
Next snt
Let's rewrite that first
For Each snt In docCur.Sentences
If snt.Bold = True Then
i = i + 1
xlWS.Cells(i, COLUMN_A).Value = snt.Text
End If
Next snt
You are only including the bold sentence (If snt.Bold = True), and writing to COLUMN_A alone.
What you want is the bold sentence and the three sentences that follow after it, and you want to write to four columns.
So change this section to:
' Dim j As Long ' - make sure to have already declared this, or just uncomment this line
For j = 1 to docCur.Sentences.Count ' perhaps docCur.Paragraphs instead?
If docCur.Sentences(j).Bold = True Then
i = i + 1
' used 1+n and j+n for ease of understanding, but you can make these constant with a real solution; or you could even put this in another loop if you wanted, e.g. For n = 0 to 3, ...
xlWS.Cells(i, 1+0).Value = docCur.Sentences(j+0).Text
xlWS.Cells(i, 1+1).Value = docCur.Sentences(j+1).Text
xlWS.Cells(i, 1+2).Value = docCur.Sentences(j+2).Text
xlWS.Cells(i, 1+3).Value = docCur.Sentences(j+3).Text
End If
Next j
Or, to maximise performance:
' Dim j As Long ' - make sure to have already declared this, or just uncomment this line
With docCur.Sentences ' perhaps docCur.Paragraphs instead?
For j = 1 To .Count
If .Item(j).Bold = True Then
i = i + 1
xlWS.Cells(i, 1).Resize(, 4).Value = Array(.Item(j + 0).Text, .Item(j + 1).Text, .Item(j + 2).Text, .Item(j + 3).Text)
End If
Next j
End With
Based on comments, changes:
Problem: "Also some sentences that I have go a little on the second line so technically there would be 5 sentences total since formatting. Any way to concatenate the two lines which actually should represent the same sentence?":
Solution: Concatenate with &:
Example:
Fourth item of Array(...) changes
from .Item(j + 3).Text
to .Item(j + 3).Text & .Item(j + 4).Text)
Problem: "Instead when creating the last column, everything ends in some funny looking crosses (like an Egyptian Ankh). Any idea how to remove those?":
Solution: Either remove the last character in the problem sentence using Left(string, Len(string)-1), or use Replace(string, [problem character], "")
Example:
Problem item (presuming sentence 4) in Array(...) changes
from .Item(j + 3).Text
to Left(.Item(j + 3).Text, Len(.Item(j + 3).Text) - 1)
Updated:
' Dim j As Long ' - make sure to have already declared this, or just uncomment this line
With docCur.Sentences ' perhaps docCur.Paragraphs instead?
For j = 1 To .Count
If .Item(j).Bold = True Then
i = i + 1
xlWS.Cells(i, 1).Resize(, 4).Value = Array(.Item(j + 0).Text, .Item(j + 1).Text, .Item(j + 2).Text, Left(.Item(j + 3).Text, Len(.Item(j + 3).Text) - 1) & .Item(j + 4).Text)
End If
Next j
End With
If this isn't a complete fix, please provide a sample file.
Related
I'm an Excel VBA newbie and i'm trying to get the duplicates rows to appends to the first occurence of that row.
Per exemple we have the table here
I would like to format data as here
The logic goes like this. Whenever we detect that the last name and the birth date are the same for the current and following line that mean we have a dependant and we need to append the dependant's data to the "Main"
I have started writing code but i'm not able to detect the dependants properly.
Below is what i have. please consider that i'm a real noob and i'm trying hard.
Sub formatData()
Dim sh As Worksheet
Dim rw As Range
Dim RowCount As Integer
'This variable is checked to see if we have a first occurence of a line
Dim firstOccurence
'Initialise the variables for that will be used to match the data
Dim LocationName
Dim PlanCode
Dim LastName
Dim FirstName
Dim dependantFirstName
Dim dependantLastName
Dim dependantBirthdate
RowCount = 0
firstOccurence = True
'Check if the spreadsheet already exist if not create it.
For i = 1 To Worksheets.Count
If Worksheets(i).Name = "Benefits Census Formatted" Then
exists = True
End If
Next i
If Not exists Then
'Create a new spreadsheet to add the data to
Set ws = Sheets.Add
Sheets.Add.Name = "Benefits Census Formatted"
End If
'Set the ActiveSheet to the one containing the original data
Set sh = Sheets("BENEFIT Census")
With ActiveSheet
LastRow = .Cells(.Rows.Count, "A").End(xlUp).Row
For Each rw In sh.Rows
'If the data of one cell is empty EXIT THE LOOP
If sh.Cells(rw.Row, 1).Value = "" Then
Exit For
End If
If rw.Row > 1 Then
'Afffecting the variables to the next loop so we can compare the values
nextLocationName = sh.Cells(rw.Row + 1, 1).Value
nextPlanCode = sh.Cells(rw.Row + 1, 2).Value
nextLastName = sh.Cells(rw.Row + 1, 3).Value
nextFirstName = sh.Cells(rw.Row + 1, 4).Value
nextEmploymentDate = sh.Cells(rw.Row + 1, 5).Value
nextBirthDate = sh.Cells(rw.Row + 1, 6).Value
nextDependantFirstName = sh.Cells(rw.Row + 1, 25).Value
nextDependantLastName = sh.Cells(rw.Row + 1, 26).Value
nextDependantBirthdate = sh.Cells(rw.Row + 1, 27).Value
Debug.Print LastName & " - " & FirstName & " ::: " & nextLastName & " - " & nextFirstName & " : " & rw.Row & " : " & firstOccurence
'First time you pass through the loop write the whole lane
If firstOccurence = True Then
'Affecting the variables to the current loops values
LocationName = sh.Cells(rw.Row, 1).Value
PlanCode = sh.Cells(rw.Row, 2).Value
LastName = sh.Cells(rw.Row, 3).Value
FirstName = sh.Cells(rw.Row, 4).Value
dependantFirstName = sh.Cells(rw.Row, 25).Value
dependantLastName = sh.Cells(rw.Row, 26).Value
dependantBirthdate = sh.Cells(rw.Row, 27).Value
'Write the current line
sh.Rows(rw.Row).Copy
'We copy the value into another sheet
Set ns = Sheets("Benefits Census Formatted")
LastRow = ns.Cells(ns.Rows.Count, "A").End(xlUp).Row + 1
ns.Rows(LastRow).PasteSpecial xlPasteValues
firstOccurence = False
Else
'We match the location with the plan code and the last name and first name of the user to find duplicates
If dependantFirstName <> nextDependantFirstName And PlanCode <> nextPlanCode And LastName <> nextLastName And FirstName <> nextFirstName Then
'We find a different dependant if the first name or the last name or the birthdate differs
'If Not (dependantFirstName <> nextDependantFirstName) Or Not (dependantLastName <> nextDependantLastName) Or Not (dependantBirthdate <> nextDependantBirthdate) Then
'We have a dependant Append it to the line
'append the user to the currentLine
'End If
Else
'If the dependantFirstName and the nextDependant First name doesn't match then on the next loop we print the full line
firstOccurence = True
End If
End If
RowCount = RowCount + 1
'End of if row > 2
End If
Next rw
End With
End Sub
This is the code I wrote for you. (Glad to see that so many others did, too. So you got a choice :-))
Sub TransscribeData()
' 25 Mar 2017
Dim WsS As Worksheet ' Source
Dim WsT As Worksheet ' Target
Dim TargetName As String
Dim LastRow As Long ' in WsS
Dim Rs As Long ' Source: row
Dim Rt As Long, Ct As Long ' Target: row / column
Dim Tmp As String
Dim Comp As String ' compare string
' Set Source sheet to the one containing the original data
Set WsS = Worksheets("BENEFIT Census")
LastRow = WsS.Cells(WsS.Rows.Count, NbcName).End(xlUp).Row
Application.ScreenUpdating = False
TargetName = "Benefits Census Formatted"
On Error Resume Next
Set WsT = Worksheets(TargetName) ' Set the Target sheet
If Err Then
' Create it if it doesn't exist
Set WsT = Worksheets.Add(After:=Worksheets(Worksheets.Count))
WsT.Name = TargetName
' insert the column captions here
End If
On Error GoTo 0
Rt = WsT.Cells(WsS.Rows.Count, NfdName).End(xlUp).Row
AddMain WsS, WsT, NbcFirstDataRow, Rt ' Rt is counting in the sub
For Rs = NbcFirstDataRow To LastRow - 1
With WsS.Rows(Rs)
Tmp = .Cells(NbcFname).Value & .Cells(NbcName).Value & .Cells(NbcDob).Value
End With
With WsS.Rows(Rs + 1)
Comp = .Cells(NbcFname).Value & .Cells(NbcName).Value & .Cells(NbcDob).Value
End With
If StrComp(Tmp, Comp, vbTextCompare) Then
AddMain WsS, WsT, Rs + 1, Rt
Else
Ct = WsT.Cells(Rt, WsT.Columns.Count).End(xlToLeft).Column
If Ct > NfdMain Then Ct = Ct + 1
With WsS.Rows(Rs + 1)
WsT.Cells(Rt, Ct + NfdRelate).Value = .Cells(NbcRelate).Value
WsT.Cells(Rt, Ct + NfdDepName).Value = .Cells(NbcDepName).Value
End With
End If
Next Rs
Application.ScreenUpdating = True
End Sub
The above code calls one Sub routine which you must add in the same code module which, by the way, should be a normal code module (by default "Module1" but you can rename it to whatever).
Private Sub AddMain(WsS As Worksheet, WsT As Worksheet, _
Rs As Long, Rt As Long)
' 25 Mar 2017
Rt = Rt + 1
With WsS.Rows(Rs)
WsT.Cells(Rt, NfdFname).Value = .Cells(NbcFname).Value
WsT.Cells(Rt, NfdName).Value = .Cells(NbcName).Value
WsT.Cells(Rt, NfdDob).Value = .Cells(NbcDob).Value
WsT.Cells(Rt, NfdMain).Value = "Main"
End With
End Sub
Observe that I inserted the word "Main" as hard text. You could also copy the content of the appropriate call in the Source sheet. This procedure only writes the first entry. Dependents are written by another code.
The entire code is controlled by two "enums", enumerations, one for each of the worksheets. Enums are the quickest way to assign names to numbers. Please paste these two enums at the top of your code sheet, before either of the procedures.
Private Enum Nbc ' worksheet Benefit Census
NbcFirstDataRow = 2 ' Adjust as required
NbcFname = 1 ' columns:
NbcName
NbcDob
NbcRelate
NbcDepName
End Enum
Private Enum Nfd ' worksheet Formatted Data
NfdFirstDataRow = 2 ' Adjust as required
NfdName = 1 ' columns:
NfdFname
NfdDob
NfdMain
NfdRelate = 0 ' Offset from NfdMain
NfdDepName
End Enum
Note that the rule of enums is that you can assign any integer to them. If you don't assign any number the value will be one higher than the previous. So, NfdMain = 4, followed by NfdRelate which has an assigned value of 0, followed by NfdDepName which has a value of 0 + 1 = 1.
The numbers in these enumerations are columns (and rows). You can control the entire output by adjusting these numbers. For example, "Main" is written into column NfdMain (=4 =D). Change the value to 5 and "Main" will appear in column 5 = E. No need to go rummaging in the code. Consider this a control panel.
In the formatted output I introduced a logic which is slightly different from yours. If you don't like it you can change it easily by modifying the enums. My logic has the family name as the main criterion in the first column (switched from the raw data). In column D I write "Main". But when there is a dependent I write the relationship in column D. Therefore only entries without any dependents will have "Main" in that column. For your first example, the formatted row will show Rasmond / Shawn / 01-01-1990 / Spouse / Jessica, Child 1 / Vanessa.
If you wish to keep the "Main and place "Spouse" in the next column, just set the enum NfdRelate = 1. With the "control panel" it's that simple.
I would use an approach using Dictionaries to collect and organize the data, and then output it. Judging both by your comments, and the code, there is a lot of stuff you haven't included. But the following code will take your original data, and output a table close to what you show -- some of the results ordering is different, but it is standardized (i.e. there is a relation listed with every dependent name.
In the dictionary, we use Last Name and Birthdate as the "key" so as to combine what you stated were the duplicates.
We define two Class objects
Dependent object which includes the Name and the Relation
Family object which includes the First and Last Names, and Birthdate as well as a collection (dictionary) of the dependent objects.
Once we have it organized, it is relatively simple to output it as we want.
For a discussion of Classes, you can do an Internet search. I would recommend Chip Pearson's Introduction to Classes
Be sure to read the notes in the code about renaming the class modules, and also setting a reference to Microsoft Scripting Runtime
Class1
Option Explicit
'Rename this module: cDependents
'set reference to Microsoft Scripting Runtime
Private pRelation As String
Private pDepName As String
Public Property Get Relation() As String
Relation = pRelation
End Property
Public Property Let Relation(Value As String)
pRelation = Value
End Property
Public Property Get DepName() As String
DepName = pDepName
End Property
Public Property Let DepName(Value As String)
pDepName = Value
End Property
Class2
Option Explicit
'rename this module: cFamily
'set reference to Microsoft Scripting Runtime
Private pFirstName As String
Private pLastName As String
Private pBirthdate As Date
Private pDependents As Dictionary
Public Property Get FirstName() As String
FirstName = pFirstName
End Property
Public Property Let FirstName(Value As String)
pFirstName = Value
End Property
Public Property Get LastName() As String
LastName = pLastName
End Property
Public Property Let LastName(Value As String)
pLastName = Value
End Property
Public Property Get Birthdate() As Date
Birthdate = pBirthdate
End Property
Public Property Let Birthdate(Value As Date)
pBirthdate = Value
End Property
Public Function ADDDependents(Typ, Nme)
Dim cD As New cDependents
Dim sKey As String
With cD
.DepName = Nme
.Relation = Typ
sKey = .Relation & Chr(1) & .DepName
End With
If Not pDependents.Exists(sKey) Then
pDependents.Add Key:=sKey, Item:=cD
End If
End Function
Public Property Get Dependents() As Dictionary
Set Dependents = pDependents
End Property
Private Sub Class_Initialize()
Set pDependents = New Dictionary
End Sub
Regular Module
Option Explicit
'set reference to Microsoft Scripting Runtime
Sub Family()
Dim wsSrc As Worksheet, wsRes As Worksheet, rRes As Range
Dim vSrc As Variant, vRes As Variant
Dim dF As Dictionary, cF As cFamily
Dim I As Long, J As Long
Dim sKey As String
Dim V As Variant, W As Variant
'Set source and results worksheets and results range
Set wsSrc = Worksheets("sheet1")
Set wsRes = Worksheets("sheet2")
Set rRes = wsRes.Cells(1, 1)
'read source data into array
With wsSrc
vSrc = .Range(.Cells(1, 1), .Cells(.Rows.Count, 1).End(xlUp)).Resize(columnsize:=5)
End With
'Collect and organize the family and dependent objects
Set dF = New Dictionary
For I = 2 To UBound(vSrc, 1)
Set cF = New cFamily
With cF
.FirstName = vSrc(I, 1)
.LastName = vSrc(I, 2)
.Birthdate = vSrc(I, 3)
.ADDDependents vSrc(I, 4), vSrc(I, 5)
sKey = .LastName & Chr(1) & .Birthdate
If Not dF.Exists(sKey) Then
dF.Add Key:=sKey, Item:=cF
Else
dF(sKey).ADDDependents vSrc(I, 4), vSrc(I, 5)
End If
End With
Next I
'Results will have two columns for each relation, including Main
' + three columns at the beginning
'get number of extra columns
Dim ColCount As Long
For Each V In dF
I = dF(V).Dependents.Count
ColCount = IIf(I > ColCount, I, ColCount)
Next V
ColCount = ColCount * 2 + 3
ReDim vRes(0 To dF.Count, 1 To ColCount)
vRes(0, 1) = "First Name"
vRes(0, 2) = "Last Name"
vRes(0, 3) = "Birthdate"
vRes(0, 4) = "Dependant"
vRes(0, 5) = "Dependant Name"
For J = 6 To UBound(vRes, 2) Step 2
vRes(0, J) = "Relation " & J - 5
vRes(0, J + 1) = "Dependant Name"
Next J
I = 0
For Each V In dF
I = I + 1
With dF(V)
vRes(I, 1) = .FirstName
vRes(I, 2) = .LastName
vRes(I, 3) = .Birthdate
J = 2
For Each W In .Dependents
J = J + 2
With .Dependents(W)
vRes(I, J) = .Relation
vRes(I, J + 1) = .DepName
End With
Next W
End With
Next V
Set rRes = rRes.Resize(rowsize:=UBound(vRes, 1) + 1, columnsize:=UBound(vRes, 2))
With rRes
.EntireColumn.Clear
.Value = vRes
With .Rows(1)
.Font.Bold = True
.HorizontalAlignment = xlCenter
End With
.EntireColumn.AutoFit
End With
End Sub
Source Data
Results
I had this project in Chemistry to supply a list of Compound elements
now I had found a website where it gives me a very long list of elements:
I had made this Code but it Doesn't Work
Sub move()
Dim list As Range
Set list = Range("A1:A2651")
For Each Row In list.Rows
If (Row.Font.Regular) Then
Row.Cells(1).Offset(-2, 1) = Row.Cells(1)
End If
Next Row
End Sub
Can you make it run for me? you can have your own algorithm ofc.
Assuming the list is constantly in the same format (i.e. Compound name, empty line, Compound Symbols, empty line) this quick code will work:
Sub move()
Dim x As Integer
x = 3
With ActiveSheet
Do Until x > 2651
.Cells(x - 2, 2).Value = .Cells(x, 1).Value
.Cells(x, 1).ClearContents
x = x + 4
Loop
End With
End Sub
After running you can then just sort columns A:B to remove the blanks.
After trying your original code I realised the problem was with the .regular property value. I've not seen .regular before, so swapped it to NOT .bold instead, and to ignore blank entries, then added the line for clearing the contents of the cell copied. This is most like the original code for reference:
Sub get_a_move_on()
Dim list As Range
Set list = ActiveSheet.Range("A1:A2561")
For Each Row In list.Rows
If Row.Font.Bold = False And Row.Value <> "" Then
Row.Cells(1).Offset(-2, 1) = Row.Cells(1)
Row.Cells(1).ClearContents
End If
Next Row
End Sub
P.S it's a list of compounds, not elements, there's only about 120 elements in the periodic table! ;)
Another way to retrieve the data you need via XHR and RegEx:
Sub GetChemicalCompoundsNames()
Dim sRespText As String
Dim aResult() As String
Dim i As Long
' retrieve HTML content
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://quizlet.com/18087424", False
.Send
sRespText = .responseText
End With
' regular expression for rows
With CreateObject("VBScript.RegExp")
.Global = True
.MultiLine = True
.IgnoreCase = True
.Pattern = "qWord[^>]*?>([\s\S]*?)<[\s\S]*?qDef[^>]*?>([\s\S]*?)<"
With .Execute(sRespText)
ReDim aResult(1 To .Count, 1 To 2)
For i = 1 To .Count
With .Item(i - 1)
aResult(i, 1) = .SubMatches(0)
aResult(i, 2) = .SubMatches(1)
End With
Next
End With
End With
' output to the 1st sheet
With Sheets(1)
.Cells.Delete
Output .Range("A1"), aResult
End With
End Sub
Sub Output(oDstRng As Range, aCells As Variant)
With oDstRng
.Parent.Select
With .Resize( _
UBound(aCells, 1) - LBound(aCells, 1) + 1, _
UBound(aCells, 2) - LBound(aCells, 2) + 1 _
)
.NumberFormat = "#"
.Value = aCells
.Columns.AutoFit
End With
End With
End Sub
Gives output (663 rows total):
I am using Excel 2010.
I have some working VBA code that compares two cells (from text, to text) and generates the redlined text into a third cell with strikethroughs on removed words, underlines on added words. This is not a straight combination of the contents of the cells.
The code works, but I think it can be more efficient with the use of multidimensional arrays to store things instead of using additional cells and recombining. But I am stuck on how to implement it. I would also like to determine where the breaking point is, especially for newer versions of Excel that I don't have yet, since the number of characters allowed in a cell seems to continually grow with every new release.
Comments are also welcome.
The working code:
Sub main()
Cells(3, 3).Clear
Call Redline(3)
End Sub
Sub Redline(ByVal r As Long)
Dim t As String
Dim t1() As String
Dim t2() As String
Dim i As Integer
Dim j As Integer
Dim f As Boolean
Dim c As Integer
Dim wf As Integer
Dim ss As Integer
Application.ScreenUpdating = False
t1 = Split(Range("A" + CStr(r)).Value, " ", -1, vbTextCompare)
t2 = Split(Range("B" + CStr(r)).Value, " ", -1, vbTextCompare)
t = ""
f = False
c = 4
ss = 0
If (Range("A" + CStr(r)).Value <> "") Then
If (Range("B" + CStr(r)).Value <> "") Then
j = 1
For i = LBound(t1) To UBound(t1)
f = False
For j = ss To UBound(t2)
If (t1(i) = t2(j)) Then
f = True
wf = j
Exit For
End If
Next j
If (Not f) Then
Cells(r, c).Value = t1(i)
Cells(r, c).Font.Strikethrough = True ' strikethrough this cell
c = c + 1
Else
If (wf = i) Then
Cells(r, c).Value = t1(i) ' aka t2(wf)
c = c + 1
ss = i + 1
ElseIf (wf > i) Then
For j = ss To wf - 1
Cells(r, c).Value = t2(j)
Cells(r, c).Font.Underline = xlUnderlineStyleSingle ' underline this cell
c = c + 1
Next j
Cells(r, c).Value = t1(i)
c = c + 1
ss = wf + 1
End If
End If
Next i
If (UBound(t2) > UBound(t1)) Then
For i = ss To UBound(t2)
Cells(r, c).Value = t2(i)
Cells(r, c).Font.Underline = xlUnderlineStyleSingle ' underline this cell
c = c + 1
Next i
End If
Else
t = Range("A" + CStr(r)).Value
End If
Else
t = Range("B" + CStr(r)).Value
End If
lc = Range("XFD" + CStr(r)).End(xlToLeft).Column
Call Merge_Cells(r, 4, lc)
Application.ScreenUpdating = True
End Sub
Sub Merge_Cells(ByVal r As Long, ByVal fc As Integer, ByVal lc As Long)
Dim i As Integer, c As Integer, j As Integer
Dim rngFrom As Range
Dim rngTo As Range
Dim lenFrom As Integer
Dim lenTo As Integer
Set rngTo = Cells(r, 3)
' copy the text over
For c = fc To lc
lenTo = rngTo.Characters.Count
Set rngFrom = Cells(r, c)
lenFrom = rngFrom.Characters.Count
If (c = lc) Then
rngTo.Value = rngTo.Text & rngFrom.Text
Else
rngTo.Value = rngTo.Text & rngFrom.Text & " "
End If
Next c
' now copy the formatting
j = 0
For c = fc To lc
Set rngFrom = Cells(r, c)
lenFrom = rngFrom.Characters.Count + 1 ' add one for the space after each word
For i = 1 To lenFrom - 1
With rngTo.Characters(j + i, 1).Font
.Name = rngFrom.Characters(i, 1).Font.Name
.Underline = rngFrom.Characters(i, 1).Font.Underline
.Strikethrough = rngFrom.Characters(i, 1).Font.Strikethrough
.Bold = rngFrom.Characters(i, 1).Font.Bold
.Size = rngFrom.Characters(i, 1).Font.Size
.ColorIndex = rngFrom.Characters(i, 1).Font.ColorIndex
End With
Next i
j = j + lenFrom
Next c
' wipe out the temporary columns
For c = fc To lc
Cells(r, c).Clear
Next c
End Sub
You can directly assign Excel Range object to VBA 2d-array and perform all that business logic operations on that array. It will provide substantial performance boost vs range iteration. The result values then can be inserted back into Excel worksheet column from that 2d-array.
Sample code snippet follows:
Sub Range2Array()
Dim arr As Variant
arr = Range("A:B").Value
'alternatively
'arr = Range("A:B")
'test
Debug.Print (arr(1, 1))
End Sub
Another useful technique is to assign Excel's UsedRange to VBA Array:
arr = ActiveSheet.UsedRange
Hope this may help. Best regards,
Sample code not quite right
I've got a spreadsheet with the following "original" and "changed" content:
Tesla to Begin Trial for Allowing Other Vehicles from Other Electric Vehicle Automakers to Use Tesla Superchargers
Tesla to Begin Trial for Allowing Other Vehicles from Other EV Auto Makers to Use Tesla Superchargers
Running your code, I got not-quite-right results.
The "original" text that is missing from the "changed" version is correctly shown with strikethrough, but the new text in the "changed" version is just ... missing.
Alternative approach
Poking around, it looks like you're trying to re-create MS Word's Track Changes formatting.
Why not just leverage Word?
The following VBA code does just that. This requires that your Excel VBA project has a reference to the Word object library. You can add this from within the VBA editor by clicking Tools → References, and selecting Microsoft Word XX.Y Object Library, where XX.Y is whatever version you have installed.
Public Sub CompareCells()
' ####################
' Basic Flow
'
' 1. Get the text content of the two cells to compare.
' 2. Get an open instance of MS Word, or spin up a new one.
' 3. Use Word's text-comparison features to generate the tracked-changes markup.
' 4. Copy that markup to the clipboard.
' 5. Then just paste that into our target cell.
' ####################
Const Src As String = "A" ' Column containing the original source text
Const Tgt As String = "B" ' Column containing the targeted text to compare
Const Cmp As String = "C" ' Column where we will put the marked-up comparison
Const RowToUse As Integer = 8 ' Rejigger as appropriate to your use case.
' 1.
Dim ThisSheet As Excel.Worksheet: Set ThisSheet = Excel.ActiveSheet
Dim StrSrc As String, StrTgt As String
StrSrc = ThisSheet.Range(Src & RowToUse).Value
StrTgt = ThisSheet.Range(Tgt & RowToUse).Value
' 2.
Dim Wd As Word.Application: Set Wd = GetApp("Word")
' 3.
Dim DocOrig As Word.Document, DocChgd As Word.Document, DocMarkup As Word.Document
Set DocOrig = Wd.Documents.Add(Visible:=False)
DocOrig.Content = StrSrc
Set DocChgd = Wd.Documents.Add(Visible:=False)
DocChgd.Content = StrTgt
Set DocMarkup = Wd.CompareDocuments(DocOrig, DocChgd, wdCompareDestinationNew)
' 4.
DocMarkup.Content.Copy
' 5.
ThisSheet.Range(Cmp & RowToUse).Select
ThisSheet.Paste
' Cleanup
DocOrig.Close savechanges:=False
DocChgd.Close savechanges:=False
DocMarkup.Close savechanges:=False
End Sub
Public Function GetApp(AppName As String) As Object
Dim app As Object
On Error GoTo Handler
Set app = GetObject(, AppName & ".Application")
Set GetApp = app
Exit Function
On Error GoTo 0
Handler:
If Err.Number > 0 And Err.Number <> 429 Then ' Unknown error, so error out
Err.Raise Err.Number, Err.Source, Err.Description, Err.HelpFile, Err.HelpContext
Exit Function
End If
DoEvents
' If we get here, there's no open app by that name, so start a new instance.
Set app = CreateObject(AppName & ".Application")
Set GetApp = app
End Function
When run using the same sample texts, I get the following:
This time, we get both the removed text in strikethrough, and the added text in underlining, with color coding as well.
I'm currently working on a userform within excel. It currently pulls a list from a database and pastes this into excel then references that data to autofill in textboxes when you select someones name.
What I am having trouble with is I also want to autofill the access to certain systems a staff member will have. Basically the spreadsheet contains all staff within the company and the access they have to certain systems consisting of 2 cells (System and entitlement, columns K and L). I have defined the range I wish to work with but I am now stuck.
How do I get excel to loop through the range and "Copy and paste" the data from each cell (eg K2 and l2) into text boxes in the userform. So what I want to happen is select someones name and it will automatically pull through all their access details and autofill some textboxes with that access and entitlement.
Current code I have is as per below.
Private Sub cboStaffNumber_Change()
Dim rngCell As Range
Dim rngNumber As Range
Dim lngRow As Long
Dim lngRangeStart As Long
Dim lngRangeEnd As Long
Dim lngLastRow As Long
Dim rngColumn As Range
Dim rngEntitlement As Range
Dim rngAccess As Range
Set rngNumber = Range("A2:A" & lngStaffDataLastRow)
'Fills in the Staff Name, OIA Template, Division, Job Title and WAP Code fields when a staff member is selected
If bCboBool = False Then
If Me.cboStaffNumber.ListIndex > 0 Then
For Each rngCell In rngNumber.Cells
If rngCell.Value = Val(cboStaffNumber.Value) Then
' lngRangeStart = rngCell.Row
bCboBool = True
Me.cboStaffName = rngCell.Offset(0, 1)
Me.txtOIATemplate = rngCell.Offset(0, 9)
Me.txtDivision = rngCell.Offset(0, 7)
Me.txtJobTitle = rngCell.Offset(0, 2)
Me.txtWAP = rngCell.Offset(0, 3)
Exit For
End If
Next rngCell
Else
Me.cboStaffName.Value = ""
Me.txtOIATemplate.Value = ""
Me.txtDivision.Value = ""
Me.txtJobTitle.Value = ""
Me.txtWAP.Value = ""
End If
End If
For lngRow = 2 To lngLastRow
If rngNumber.Cells(lngRow, 1).Value = Val(cboStaffNumber.Value) Then
lngRangeStart = lngRow
Exit For
End If
Next lngRow
' For lngRow = lngRangeStart To lngLastRow + 1
' If rngNumber.Cells(lngRow, 1).Value <> Val(cboStaffNumber.Value) Then
' lngRangeEnd = lngRow
' Exit For
' End If
' Next lngRow
'
' If lngRow <> 0 Then
' lngRangeEnd = lngRangeEnd - 1
' End If
'
' For rngAccess = lngRangeStart To lngRangeEnd
' Set rngCell = lngRangeStart.Cells(rngCell, 11)
' For Each rngCell In rngAccess
' Set txtAccess1 = rngCell
' Exit For
' Next rngAccess
bCboBool = False
End Sub`
Any help would be greatly appreciated.
Thanks
The below should give you the basics to loop through and update a variable with the values of each System and Access cell for the staff member based on their staff number. You'll need to change the values in [] to the named values in your form. It also works off of the original range which you defined 'rngNumber'. I've not tested it, but at a glance it should work. Let me know how it goes.
strSystem = ""
strAccess = ""
For Each rngCell In rngNumber.Cells
If rngCell.Value = Val(cboStaffNumber.Value) Then
strSystem = strSystem & rngCell.Offset(0,10).value & ", "
strAccess = strAccess & rngCell.Offset(0,11).value & ", "
End If
Next rngCell
If len(strSystem) > 0 then
strSystem = Left(strSystem, len(strSystem)-1)
End If
If len(strAccess) > 0 then
strAccess = Left(strAccess , len(strAccess)-1)
End If
Me.[txtSystemBox] = strSystem
Me.[txtAccessBox] = strAccess
I am getting the impression that this is not possible in word but I figure if you are looking for any 3-4 words that come in the same sequence anywhere in a very long paper I could find duplicates of the same phrases.
I copy and pasted a lot of documentation from past papers and was hoping to find a simple way to find any repeated information in this 40+ page document there is a lot of different formatting but I would be willing to temporarily get rid of formatting in order to find repeated information.
To highlight all duplicate sentences, you can also use ActiveDocument.Sentences(i). Here is an example
LOGIC
1) Get all the sentences from the word document in an array
2) Sort the array
3) Extract Duplicates
4) Highlight duplicates
CODE
Option Explicit
Sub Sample()
Dim MyArray() As String
Dim n As Long, i As Long
Dim Col As New Collection
Dim itm
n = 0
'~~> Get all the sentences from the word document in an array
For i = 1 To ActiveDocument.Sentences.Count
n = n + 1
ReDim Preserve MyArray(n)
MyArray(n) = Trim(ActiveDocument.Sentences(i).Text)
Next
'~~> Sort the array
SortArray MyArray, 0, UBound(MyArray)
'~~> Extract Duplicates
For i = 1 To UBound(MyArray)
If i = UBound(MyArray) Then Exit For
If InStr(1, MyArray(i + 1), MyArray(i), vbTextCompare) Then
On Error Resume Next
Col.Add MyArray(i), """" & MyArray(i) & """"
On Error GoTo 0
End If
Next i
'~~> Highlight duplicates
For Each itm In Col
Selection.Find.ClearFormatting
Selection.HomeKey wdStory, wdMove
Selection.Find.Execute itm
Do Until Selection.Find.Found = False
Selection.Range.HighlightColorIndex = wdPink
Selection.Find.Execute
Loop
Next
End Sub
'~~> Sort the array
Public Sub SortArray(vArray As Variant, i As Long, j As Long)
Dim tmp As Variant, tmpSwap As Variant
Dim ii As Long, jj As Long
ii = i: jj = j: tmp = vArray((i + j) \ 2)
While (ii <= jj)
While (vArray(ii) < tmp And ii < j)
ii = ii + 1
Wend
While (tmp < vArray(jj) And jj > i)
jj = jj - 1
Wend
If (ii <= jj) Then
tmpSwap = vArray(ii)
vArray(ii) = vArray(jj): vArray(jj) = tmpSwap
ii = ii + 1: jj = jj - 1
End If
Wend
If (i < jj) Then SortArray vArray, i, jj
If (ii < j) Then SortArray vArray, ii, j
End Sub
SNAPSHOTS
BEFORE
AFTER
I did not use my own DAWG suggestion, and I am still interested in seeing if someone else has a way to do this, but I was able to come up with this:
Option Explicit
Sub test()
Dim ABC As Scripting.Dictionary
Dim v As Range
Dim n As Integer
n = 5
Set ABC = FindRepeatingWordChains(n, ActiveDocument)
' This is a dictionary of word ranges (not the same as an Excel range) that contains the listing of each word chain/phrase of length n (5 from the above example).
' Loop through this collection to make your selections/highlights/whatever you want to do.
If Not ABC Is Nothing Then
For Each v In ABC
v.Font.Color = wdColorRed
Next v
End If
End Sub
' This is where the real code begins.
Function FindRepeatingWordChains(ChainLenth As Integer, DocToCheck As Document) As Scripting.Dictionary
Dim DictWords As New Scripting.Dictionary, DictMatches As New Scripting.Dictionary
Dim sChain As String
Dim CurWord As Range
Dim MatchCount As Integer
Dim i As Integer
MatchCount = 0
For Each CurWord In DocToCheck.Words
' Make sure there are enough remaining words in our document to handle a chain of the length specified.
If Not CurWord.Next(wdWord, ChainLenth - 1) Is Nothing Then
' Check for non-printing characters in the first/last word of the chain.
' This code will read a vbCr, etc. as a word, which is probably not desired.
' However, this check does not exclude these 'words' inside the chain, but it can be modified.
If CurWord <> vbCr And CurWord <> vbNewLine And CurWord <> vbCrLf And CurWord <> vbLf And CurWord <> vbTab And _
CurWord.Next(wdWord, ChainLenth - 1) <> vbCr And CurWord.Next(wdWord, ChainLenth - 1) <> vbNewLine And _
CurWord.Next(wdWord, ChainLenth - 1) <> vbCrLf And CurWord.Next(wdWord, ChainLenth - 1) <> vbLf And _
CurWord.Next(wdWord, ChainLenth - 1) <> vbTab Then
sChain = CurWord
For i = 1 To ChainLenth - 1
' Add each word from the current word through the next ChainLength # of words to a temporary string.
sChain = sChain & " " & CurWord.Next(wdWord, i)
Next i
' If we already have our temporary string stored in the dictionary, then we have a match, assign the word range to the returned dictionary.
' If not, then add it to the dictionary and increment our index.
If DictWords.Exists(sChain) Then
MatchCount = MatchCount + 1
DictMatches.Add DocToCheck.Range(CurWord.Start, CurWord.Next(wdWord, ChainLenth - 1).End), MatchCount
Else
DictWords.Add sChain, sChain
End If
End If
End If
Next CurWord
' If we found any matching results, then return that list, otherwise return nothing (to be caught by the calling function).
If DictMatches.Count > 0 Then
Set FindRepeatingWordChains = DictMatches
Else
Set FindRepeatingWordChains = Nothing
End If
End Function
I have tested this on a 258 page document (TheStory.txt) from this source, and it ran in just a few minutes.
See the test() sub for usage.
You will need to reference the Microsoft Scripting Runtime to use the Scripting.Dictionary objects. If that is undesirable, small modifications can be made to use Collections instead, but I prefer the Dictionary as it has the useful .Exists() method.
I chose a rather lame theory, but it seems to work (at least if I got the question right cuz sometimes I'm a slow understander).
I load the entire text into a string, load the individual words into an array, loop through the array and concatenate the string, containing each time three consecutive words.
Because the results are already included in 3 word groups, 4 word groups or more will automatically be recognized.
Option Explicit
Sub Find_Duplicates()
On Error GoTo errHandler
Dim pSingleLine As Paragraph
Dim sLine As String
Dim sFull_Text As String
Dim vArray_Full_Text As Variant
Dim sSearch_3 As String
Dim lSize_Array As Long
Dim lCnt As Long
Dim lCnt_Occurence As Long
'Create a string from the entire text
For Each pSingleLine In ActiveDocument.Paragraphs
sLine = pSingleLine.Range.Text
sFull_Text = sFull_Text & sLine
Next pSingleLine
'Load the text into an array
vArray_Full_Text = sFull_Text
vArray_Full_Text = Split(sFull_Text, " ")
lSize_Array = UBound(vArray_Full_Text)
For lCnt = 1 To lSize_Array - 1
lCnt_Occurence = 0
sSearch_3 = Trim(fRemove_Punctuation(vArray_Full_Text(lCnt - 1) & _
" " & vArray_Full_Text(lCnt) & _
" " & vArray_Full_Text(lCnt + 1)))
With Selection.Find
.Text = sSearch_3
.Forward = True
.Replacement.Text = ""
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
Do While .Execute
lCnt_Occurence = lCnt_Occurence + 1
If lCnt_Occurence > 1 Then
Selection.Range.Font.Color = vbRed
End If
Selection.MoveRight
Loop
End With
Application.StatusBar = lCnt & "/" & lSize_Array
Next lCnt
errHandler:
Stop
End Sub
Public Function fRemove_Punctuation(sString As String) As String
Dim vArray(0 To 8) As String
Dim lCnt As Long
vArray(0) = "."
vArray(1) = ","
vArray(2) = ","
vArray(3) = "?"
vArray(4) = "!"
vArray(5) = ";"
vArray(6) = ":"
vArray(7) = "("
vArray(8) = ")"
For lCnt = 0 To UBound(vArray)
If Left(sString, 1) = vArray(lCnt) Then
sString = Right(sString, Len(sString) - 1)
ElseIf Right(sString, 1) = vArray(lCnt) Then
sString = Left(sString, Len(sString) - 1)
End If
Next lCnt
fRemove_Punctuation = sString
End Function
The code assumes a continuous text without bullet points.