Excel 2013 - How to delete duplicate phrases from a single cell

I'm a novice when it comes to VBA, Macros and Modules, so please include specific steps. How do I delete duplicate phrases from a single cell, such as the following:
"Brotherhood Of Man - United We Stand Brotherhood Of Man - United We Stand"
I want to be left with:
"Brotherhood Of Man - United We Stand"

You can use a regex with a backreference to match a duplicated word or phrase. The pattern ^(.+)\s*\1$ matches a string that consists of a phrase followed immediately by a second copy of itself, with optional whitespace between the two copies.
Sub RemoveDuplicatePhrase()
    Const strText = "Brotherhood Of Man - United We Stand Brotherhood Of Man - United We Stand"
    Dim re
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "^(.+)\s*\1$"
    ' $1 is the first copy of the phrase, captured by (.+)
    If re.Test(strText) Then
        Debug.Print re.Replace(strText, "$1")
    End If
End Sub
Output:
Brotherhood Of Man - United We Stand
Edit, with respect to comments:
To check every cell in column A, add a subroutine to your worksheet that iterates each cell and sends the cell value to the regex parser. For example, to run it for the range A1:A100:
Sub UpdateCells()
    ' Create our regex. This will never change, so do it up front.
    Dim re
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "^(.+)\s*\1$"
    ' Check every cell in a particular range.
    Dim r As Range
    For Each r In Range("A1:A100")
        If re.Test(r.Value) Then
            r.Value = re.Replace(r.Value, "$1")
        End If
    Next
End Sub
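If you would rather call the regex from a worksheet formula than run a macro over a fixed range, the same pattern can be wrapped in a user-defined function. This is only a minimal sketch and not part of the original answer; the function name DedupePhrase and its parameter name are hypothetical. Paste it into a standard module (Alt+F11, then Insert > Module) and enter =DedupePhrase(A1) in a cell.

```vba
' Hypothetical helper: the name and signature are illustrative only.
Function DedupePhrase(ByVal cellText As String) As String
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "^(.+)\s*\1$"
    If re.Test(cellText) Then
        ' Keep only the first copy of the repeated phrase.
        DedupePhrase = re.Replace(cellText, "$1")
    Else
        ' No repeated phrase found: return the text unchanged.
        DedupePhrase = cellText
    End If
End Function
```

Because the function returns the input unchanged when nothing is duplicated, it should be safe to fill down a whole column.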

Related

Selection of particular text in two lines with different scenarios

I need to select particular text across two lines. I have been doing this with the following code:
Selection.Paragraphs(1).Range.Select
Selection.MoveRight Unit:=wdWord, Count:=2, Extend:=wdExtend
But the above code does not work for all four of the following scenarios. I'm looking for code that selects the first line and the second line up to the comma. I need the code to be as simple as possible; kindly help.
Scenario 1
Infraestructura Energetica Nova SAB De CV
IENOVA* MM, Buy
Scenario 2
Infraestructura Energetica Nova SAB De CV
IENOVA13 MM, Sell
Scenario 3
Infraestructura Energetica Nova SAB De CV
IENOVA* MM
Scenario 4 Edited
Nova SAB
IENOVA MM
Illustration with Picture:

The following works with the two paragraphs as separate ranges. The first paragraph is picked up unaltered and used as the starting point for getting the second paragraph.
Using the Instr function, it determines whether a comma is present - Instr returns 0 if there is none, otherwise a positive number.
If there is no comma, the paragraph mark is cut off. It's not clear whether you want this Chr(13); if you do, just comment out that line and the paragraph is picked up with no changes.
If there is a comma, the Range is collapsed to its starting point, then extended to the position of the comma, minus 1 (leaves out the comma).
The two strings are then concatenated for Debug.Print. Then the end point of the first Range is extended to the end point of the second Range, so that you have one Range (if that's what you need; that's not clear).
Sub SelectInfo()
    Dim rngLine1 As Word.Range
    Dim rngLine2 As Word.Range
    Dim isComma As Long
    Set rngLine1 = Selection.Range.Paragraphs(1).Range
    Set rngLine2 = rngLine1.Duplicate
    rngLine2.Collapse wdCollapseEnd
    Set rngLine2 = rngLine2.Paragraphs(1).Range
    isComma = InStr(rngLine2.Text, ",")
    If isComma = 0 Then
        'No comma, assume we don't want the last paragraph mark...
        rngLine2.MoveEnd wdCharacter, -1
    Else
        rngLine2.Collapse wdCollapseStart
        rngLine2.MoveEnd wdCharacter, isComma - 1
    End If
    Debug.Print rngLine1.Text & rngLine2.Text
    'Get a single Range instead of the string:
    rngLine1.End = rngLine2.End
End Sub

Taking your question literally:
...I'm in search of code which would output selection of first line and second line till 'comma'.
You can adjust the second line of your code as follows:
Selection.Paragraphs(1).Range.Select
Selection.MoveEndUntil ",", wdForward
This moves the end of the selection forward until it finds ",".
If, however, as in your scenarios, some of the selections do not contain a comma, the following will work:
Sub SelectionTest()
    Dim mySel As String
    With Selection
        .Paragraphs(1).Range.Select
        mySel = Selection
        If InStr(1, mySel, ",") Then
            .MoveEndUntil ",", wdForward
        Else
            .Extend "M"
            .Extend "M"
        End If
    End With
End Sub
This selects the paragraph, stores its text in the variable mySel, and uses the InStr function to test whether the string contains a comma. If it does, the macro runs the same code as above. If there is no comma, it extends the selection to the character "M" (upper-case M) and then extends it again to the next "M".
As indicated in your comment, the "MM" part of your text varies, so:
Sub SelectionTest()
    Dim mySel As String
    With Selection
        .Paragraphs(1).Range.Select
        .MoveDown Unit:=wdLine, Count:=1, Extend:=wdExtend
        mySel = Selection
        If InStr(1, mySel, ",") Then
            .Paragraphs(1).Range.Select
            .MoveEndUntil ","
        Else
            Exit Sub
        End If
    End With
End Sub
This selects the first paragraph and then extends the selection to the end of the second line. It stores the selected text in the variable mySel and uses the InStr function to test whether it contains a comma. If it does, the macro runs the same code as above; if there is no comma, it keeps the two lines selected and stops.
This keeps the code shorter than having an ElseIf statement for each country ("MM", "RO", "TI", etc.), but it relies on there being no text after the country code. Otherwise, follow the previous part of the answer and repeat the ElseIf for each country variable.
I tested this on all of your scenarios (by copying and pasting your scenario paragraphs into Word), and each one gave the same result as your 'target selection', as long as the cursor was at the start of the required paragraph when the code was run.
Alternatively you can omit the part specifying the comma and just use (perhaps adjust as required and put this within an if statement to allow for your variables):
With Selection
.Paragraphs(1).Range.Select
.Extend "M"
.Extend "M"
End With
This code will work based on what you've asked and provided in your question, but it may not be the most universal code in its current form.
There is some more info on the functions and methods used at the links below:
Selection.MoveEndUntil
Selection.Extend
InStr
Selection.MoveDown

Split Text and number IN VBA

I have column headers which I want to split:
Heading
XA 2009
WW YY 2010
XXA 2011
I Want output like
XA,
WW YY,
XXA
Earlier I was using the FIND function in Excel, which was working fine:
=MID("XA 2009",1,FIND(" ","XA 2009",FIND(" ","XA 2009")+1)-1)
OUTPUT AS XA,
WW YY
Now the requirement has changed, and I need to do this in VBA.
I tried to use InStr() instead of FIND, as FIND is not available in VBA:
Mid("XA 2009", 1, InStr(1, "XA 2009", " ", InStr(1, "XA 2009", "2")) - 1)
Now the output is XA,
WW instead of WW YY.
Can anyone suggest what I am doing wrong? I am pretty new to VBA.
I am using Excel 2013.

First, see the answer to the following SO question for the general approach and prerequisites for using regex search in VBA:
How to use Regular Expressions (Regex) in Microsoft Excel both in-cell and loops
Now, as for your specific requirement, try the following pattern:
(\D*)\s+\d*\s+(\D*)\s+\d*\s+(\D*)\s+\d*
It will work for your precise example, but if you need the input string to be a bit more general you might need to modify the pattern.
Some explanations:
\D* will match zero or more non-digit characters (letters, spaces and so on)
\s+ will match at least one whitespace character
\d* will match zero or more numerical digits
the parentheses are capture groups, so I used them to surround what you wish to extract from the input string.
If for example you know for sure that there's only one white-space character you can use:
[\s]
So the pattern might look like:
(\D*)[\s]\d*[\s](\D*)[\s]\d*[\s](\D*)[\s]\d*
Also, this is a great tool for online pattern testing:
https://regex101.com/
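To see what the three groups capture, the first pattern above can be run against a sample string and the submatches printed. This is a rough sketch; the procedure name ShowGroups and the assumption that the three headings arrive joined by single spaces are mine, not the asker's.

```vba
' Illustrative only: ShowGroups and the sample input are assumptions.
Sub ShowGroups()
    Dim re As Object
    Dim matches As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "(\D*)\s+\d*\s+(\D*)\s+\d*\s+(\D*)\s+\d*"
    Set matches = re.Execute("XA 2009 WW YY 2010 XXA 2011")
    If matches.Count > 0 Then
        ' SubMatches is zero-based: one entry per pair of parentheses.
        Debug.Print matches(0).SubMatches(0)   ' XA
        Debug.Print matches(0).SubMatches(1)   ' WW YY
        Debug.Print matches(0).SubMatches(2)   ' XXA
    End If
End Sub
```

Late binding through CreateObject is used here so the sketch runs without setting a library reference.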
This is the solution for your edited requirement:
In the VBA editor, go to Tools > References, tick the checkbox next to "Microsoft VBScript Regular Expressions 5.5", and press OK.
Add this code to the "ThisWorkbook" module:
Private Sub solution()
    Dim regEx As New RegExp
    Dim strPattern As String
    Dim strInput As String
    Dim myInput As Range
    Dim myOutput As Range
    Set myInput = ActiveSheet.Range("A1")
    Set myOutput = ActiveSheet.Range("A2")
    strPattern = "(\D*)[\s]\d*[\s](\D*)[\s]\d*[\s](\D*)[\s]\d*"
    strInput = myInput.Value
    With regEx
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        .Pattern = strPattern
    End With
    If regEx.Test(strInput) Then
        ' Write the three captured text groups, comma-separated, to the output cell.
        myOutput.Value = regEx.Replace(strInput, "$1, $2, $3")
    End If
End Sub

You probably had that formula in a specific cell, right? I think the following should do:
Range("yourcell").Formula = "=MID(""XA 2009"",1,FIND("" "",""XA 2009"",FIND("" "",""XA 2009"")+1)-1)"
Just replace "yourcell" with the address of the cell you had the formula in. So if you had it in cell A1, for example, it should be Range("A1"). Note that quotation marks inside a VBA string literal must be doubled.
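If the year is always the last space-separated token in a heading, plain string functions can also do the split directly in VBA, with neither FIND nor a regex. The sketch below swaps in InStrRev, which searches from the end of the string; the function name TextBeforeLastToken is hypothetical, and the "year comes last" rule is an assumption taken from the sample headings.

```vba
' Hypothetical helper; assumes the number is the last space-separated token.
Function TextBeforeLastToken(ByVal heading As String) As String
    Dim pos As Long
    heading = Trim$(heading)
    ' Position of the last space, or 0 if there is none.
    pos = InStrRev(heading, " ")
    If pos > 0 Then
        TextBeforeLastToken = Left$(heading, pos - 1)
    Else
        TextBeforeLastToken = heading
    End If
End Function
```

With the sample headings, TextBeforeLastToken("XA 2009") gives "XA" and TextBeforeLastToken("WW YY 2010") gives "WW YY".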

Pull Site Code from Location Name (VBA)

So I have a customer that needs a specific code isolated from the name of each location. I have the following formula that I have been manually editing, but I was wondering if there is a way to count the characters in a cell and pull the codes into a new cell.
Example Location Name: MRI-LENOX HILL RADIOLOGY 150/14101
=RIGHT(A1,FIND("/",A1)-19)
The code format is 0123/01234 (3 to 4 characters in front of the slash and 5 after)
Any help in this regard would be much appreciated.
Thanks,
Justin Hames

You can use a regex to find and extract the code from the cell value. For example:
With CreateObject("VBScript.RegExp")
    .Pattern = "\d{3,4}/\d{5}"
    If .Test(Range("A1")) Then
        Range("B1") = .Execute(Range("A1"))(0)
    End If
End With
This will extract the code from A1 and place it into B1.
Edit, with respect to comments:
To run on a range of cells:
Dim re
Set re = CreateObject("VBScript.RegExp")
re.Pattern = "\d{3,4}/\d{5}"
Dim r As Range
For Each r In Range("A1:A100")
    If re.Test(r) Then r.Offset(0, 1) = re.Execute(r)(0)
Next

How to breakdown text with a non-uniform delimiter?

I have this data in Excel:
But one of my clients needs it summarized per item in detail.
So above data needs to be converted to:
This way, client can analyze it per tracking and per item.
The text format is not really uniform since it is entered manually.
Some users use Alt+Enter to separate items, some use a space, and some don't bother separating them at all. What is consistent is that they put a hyphen (-) after the item, then the count (although the number does not always follow immediately; there can be spaces in between). Also, if the count of an item is one (1), they don't bother entering it at all (as seen on tracking IDU3004 for Apple Juice).
The only function I tried is the Split function which brings me closer to what I want.
But I am still having a hard time separating the individual array elements into what I expect.
So for example, IDU3001 in above after using Split (with "-" as delimiter) will be:
arr(0) = "Apple"
arr(1) = "20 Grape"
arr(2) = "5" & Chr(10) & "Pear" ~~> Just to show Alt+Enter
arr(3) = "3Banana"
arr(4) = "2"
Of course I can come up with a function to deal with each of the elements to extract numbers and items.
Actually I was thinking of using just that function and skip the Split altogether.
I was just curious that maybe there is another way out there since I am not well versed in Text manipulation. I would appreciate any idea that would point me to a possible better solution.

I suggest using a regular expression approach.
Here's a demo based on your sample data.
' Requires a reference to "Microsoft VBScript Regular Expressions 5.5"
Sub Demo()
    Dim re As RegExp
    Dim rMC As MatchCollection
    Dim rM As Match
    Dim rng As Range
    Dim rw As Range
    Dim Detail As String
    ' replace with the usual logic to get the range of interest
    Set rng = [A2:C2]
    Set re = New RegExp
    re.Global = True
    re.IgnoreCase = True
    re.Pattern = "([a-z ]+[a-z])\s*\-\s*(\d+)\s*"
    For Each rw In rng.Rows
        ' remove line breaks and leading/trailing spaces
        Detail = Trim$(Replace(rw.Cells(1, 3).Value, Chr(10), vbNullString))
        If Not Detail Like "*#" Then
            ' Last item has no - #, so add -1
            Detail = Detail & "-1"
        End If
        ' Break up string
        If re.Test(Detail) Then
            Set rMC = re.Execute(Detail)
            For Each rM In rMC
                ' output Items and Qty's to Immediate window
                Debug.Print rM.SubMatches(0), rM.SubMatches(1)
            Next
        End If
    Next
End Sub
Based on your comment, I have assumed that only the last item in a cell may be missing a -#.
Sample input
Apple Juice- 20 Grape -5
pear- 3Banana-2Orange
Produces this output
Apple Juice 20
Grape 5
pear 3
Banana 2
Orange 1

Word 2010 VBA miscounts words per sentence against itself

The macro below is supposed to pull the average words per sentence, then turn the text red in all sentences that are >=150% of that.
The problem is, it turns some shorter sentences red, as well. For example, it colored these sentences (edited to add: in the source doc, 150% of average length is 35 words):
31 words: The FSAIPs provide the basis for evaluation of the adequacy of the regulatory implementation of the design based on this assumed operational process and supports the preparation of prospective dose assessments.
29 words: (In accordance with 10 CFR 835.2, the equivalent dose rate criteria are applicable at 30 cm from the radiation source or 30 cm from any surface the radiation penetrates.)
(I'd share more examples, but this is a radiation control procedure on a Federal nuclear project, so I'm having to choose carefully.)
Those word counts for the sentences above are from the status bar at the bottom of the window. So Word appears to be counting the number of words differently depending on what part of Word is counting. I think.
Are there any suggestions on how to make the count more accurate, or at least the same for both situations? Oh, and a final note: it's not counting visible deleted words. It may be counting things like nonbreaking hyphens in some instances, but not in the ones shared above.
Sub Mark_Long()
'''''''''''''''''''
' Adapted from "Allen Wyatt's Word Tips, wordribbon.tips.net.
' I added to it so it pulls the avg sentence length from
' the readability stats, and only marks the sentences that are 150%
' of the average.
''''''''''''''''''''
Dim iMyCount As Integer
Dim iWords As Integer
Dim bTrackingAsWas As Boolean
If Not ActiveDocument.Saved Then
ActiveDocument.Save
End If
Set myRange = ActiveDocument.Content
wordval = myRange.ReadabilityStatistics(6).Value
bTrackingAsWas = ActiveDocument.TrackRevisions
'Turn off tracked changes
ActiveDocument.TrackRevisions = False
'Reset counter
iMyCount = 0
'Set number of words
iWords = (wordval * 1.5)
For Each MySent In ActiveDocument.Sentences
If MySent.Words.Count > iWords Then
MySent.Font.Color = wdColorRed
iMyCount = iMyCount + 1
End If
Next
'Restore tracked changes
ActiveDocument.TrackRevisions = bTrackingAsWas
'Report results
MsgBox iMyCount & " sentences longer than " & _
iWords & " words."
End Sub

You should call ComputeStatistics(wdStatisticWords) on each sentence's Range instead of using .Words.Count.
The first returns a filtered word count; the second returns an unfiltered count.
See:
http://www.vbaexpress.com/forum/archive/index.php/t-21723.html
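Applied to the loop from the question, the change might look like the sketch below. It is a trimmed-down illustration rather than a drop-in replacement: the procedure name MarkLongSentences is hypothetical, the tracked-changes handling and message box from the original macro are left out, and ReadabilityStatistics(6) is simply the index the question already uses.

```vba
' Illustrative sketch; MarkLongSentences is a hypothetical name.
Sub MarkLongSentences()
    Dim sent As Range
    Dim limit As Long
    ' 150% of the average sentence length, as in the question's macro.
    limit = CLng(ActiveDocument.Content.ReadabilityStatistics(6).Value * 1.5)
    For Each sent In ActiveDocument.Sentences
        ' ComputeStatistics is a method of Range, so it can be called
        ' on each sentence directly.
        If sent.ComputeStatistics(wdStatisticWords) > limit Then
            sent.Font.Color = wdColorRed
        End If
    Next sent
End Sub
```

Each item in ActiveDocument.Sentences is already a Range, so no extra .Range qualifier is needed before ComputeStatistics.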

The property .Words returns real words but also punctuation marks and paragraph marks. To get the real word count you can use this - a little bit weird - method.
Set dlg = Dialogs(wdDialogToolsWordCount)
For Each MySent In ActiveDocument.Sentences
MySent.Select
Set dlg = Dialogs(wdDialogToolsWordCount)
dlg.Execute
Count = dlg.Words
' Count is the number you are looking for
Next
You just simulate the 'Word Count' dialog.
