Search for proper column and find duplicates - macro amendment needed [VBA]

I have a VBA macro for Excel that finds duplicates. It works, but it is hard-coded to one column. I would like it to search the column headers in the first row and find the header called "Email" (ideally "Email*", as the header sometimes contains other words after "Email"). I also think the script does not adjust to the number of rows and is limited to 65536 values; I would prefer that it adjust to the number of values in the column. I have a similar VBA macro that does this perfectly, and I thought I could use it as an example to amend the one I am working on, but I failed. Could anyone help me make the proper amendments to the first macro?
VBA MACRO WHICH I WOULD LIKE TO AMEND:
Option Explicit
Sub DeleteDups()
Dim x As Long
Dim LastRow As Long
Sheets("test").Activate
LastRow = Range("A65536").End(xlUp).Row
For x = LastRow To 1 Step -1
If Application.WorksheetFunction.CountIf(Range("A1:A" & x), Range("A" & x).Text) > 1 Then
Range("A" & x).Interior.Color = RGB(255, 48, 48)
End If
Next x
End Sub
VBA MACRO WHICH WORKS FINE AND I WANTED TO USE AS AN EXAMPLE:
Function getAllColNum(ByVal rowNum As Long, ByVal searchString As Variant) As Object
Dim allColNum As Object
Dim i As Long
Dim j As Long
Dim width As Long
Set allColNum = CreateObject("Scripting.Dictionary")
colNum = 1
With ActiveSheet
width = .Cells(rowNum, .Columns.Count).End(xlToLeft).Column
For i = 1 To width
If InStr(UCase(Trim(.Cells(rowNum, i).Value)), UCase(Trim(searchString))) > 0 Then
allColNum.Add i, ""
End If '
Next i
End With
Set getAllColNum = allColNum
End Function
Sub GOOD_WORKS_No_Dots_at_End_of_Emails()
Dim strSearch As String
strSearch = "Email"
Dim colNum As Variant
Dim allColNum As Object
Sheets("Data").Activate
Dim LR As Long, i As Long
Set allColNum = getAllColNum(1, searchString)
For Each colNum In allColNum
LR = Cells(Rows.Count, colNum).End(xlUp).Row
For i = 1 To LR
With Range(Cells(i, colNum), Cells(i, colNum))
If Right(.Value, 1) = "." Then .Value = Left(.Value, Len(.Value) - 1)
End With
Next i
Next colNum
Sheets("Automation").Activate
MsgBox "No Dots at the end of email addresses - Done!"
End Sub
MY WORK SO FAR
Function getAllColNum(ByVal rowNum As Long, ByVal searchString As Variant) As Object
Dim allColNum As Object
Dim i As Long
Dim j As Long
Dim width As Long
Set allColNum = CreateObject("Scripting.Dictionary")
colNum = 1
With ActiveSheet
width = .Cells(rowNum, .Columns.Count).End(xlToLeft).Column
For i = 1 To width
If UCase(Trim(.Cells(rowNum, i).Value)) Like UCase(Trim(searchString)) Then
allColNum.Add i, ""
End If '
Next i
End With
Set getAllColNum = allColNum
End Function
Sub testing_testing()
Dim strSearch As String
strSearch = "Email"
Dim colNum As Variant
Dim allColNum As Object
Sheets("Data").Activate
Dim LR As Long, i As Long
Set allColNum = getAllColNum(1, searchString)
For Each colNum In allColNum
LR = Cells(Rows.Count, colNum).End(xlUp).Row
For i = 1 To LR
With Range(Cells(i, colNum), Cells(i, colNum))
If Application.WorksheetFunction.CountIf(Range("R1:A" & x), Range("R" & x).Text) > 1 Then
Range("A" & x).Interior.Color = RGB(255, 48, 48)
End With
End If
Next i
Next colNum
Sheets("Automation").Activate
MsgBox "Finding duplicates - Done!"
End Sub
This seems to be more complicated, and as I mentioned I have limited knowledge of VBA. However, I found a different script which might be easier to amend.
This macro finds the email address column and marks the whole column
Option Explicit
Sub GOOD_WORKS_Mark_Email_Duplicates()
Dim x As Long
Dim LastRow As Long
Sheets("test").Activate
LastRow = Range("A65536").End(xlUp).Row
For x = LastRow To 1 Step -1
If Application.WorksheetFunction.CountIf(Range("A1:A" & x), Range("A" & x).Text) > 1 Then
Range("A" & x).Interior.Color = RGB(255, 48, 48)
End If
Next x
MsgBox "Email duplicates have been marked as red cells. Check whether there are any red cells in the Email column."
End Sub
This one finds duplicates using the COUNTIF function, which works well for me. The only problem is that I run this macro from a button, where the range is hard-coded:
Sub Highlight_Duplicates(Values As Range)
Dim Cell
For Each Cell In Values
If WorksheetFunction.CountIf(Values, Cell.Value) > 1 Then
Cell.Interior.ColorIndex = 6
End If
Next Cell
End Sub
Then the action button:
Private Sub CommandButton1_Click()
Highlight_Duplicates (Sheets("Test").Range("C2:C92"))
End Sub
It is fine for me to run 1st macro and then the 2nd. However, I don't know how to get rid of Range in the action button. Any ideas?

In your getAllColNum function, change this:
If InStr(UCase(Trim(.Cells(rowNum, i).Value)), _
UCase(Trim(searchString))) > 0 Then
to this:
If UCase(Trim(.Cells(rowNum, i).Value)) Like UCase(Trim(searchString)) Then
and that will allow you to pass a wildcard header like "Email*" and get all matching columns.
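Combining the Like comparison above with the duplicate check from the question, a minimal sketch might look like the following. It assumes the headers are in row 1 of a sheet named "test" (the name used in the question) and that the data starts in row 2. HeaderMatches, FindHeaderColumn and HighlightEmailDuplicates are hypothetical names, not part of the original macros.

```vba
Option Explicit

' True when the header text matches the pattern (case-insensitive; Like wildcards allowed).
Function HeaderMatches(ByVal header As String, ByVal pattern As String) As Boolean
    HeaderMatches = UCase$(Trim$(header)) Like UCase$(Trim$(pattern))
End Function

' Returns the first column in headerRow whose header matches pattern, or 0 if none does.
Function FindHeaderColumn(ByVal ws As Worksheet, ByVal headerRow As Long, _
                          ByVal pattern As String) As Long
    Dim lastCol As Long, c As Long
    lastCol = ws.Cells(headerRow, ws.Columns.Count).End(xlToLeft).Column
    For c = 1 To lastCol
        If HeaderMatches(CStr(ws.Cells(headerRow, c).Value), pattern) Then
            FindHeaderColumn = c
            Exit Function
        End If
    Next c
End Function

Sub HighlightEmailDuplicates()
    Dim ws As Worksheet
    Dim emailCol As Long, lastRow As Long
    Dim rng As Range, cell As Range

    Set ws = ThisWorkbook.Worksheets("test")      ' assumption: sheet name taken from the question
    emailCol = FindHeaderColumn(ws, 1, "Email*")
    If emailCol = 0 Then
        MsgBox "No header starting with ""Email"" was found in row 1."
        Exit Sub
    End If

    ' Rows.Count adapts to the sheet's row limit instead of hard-coding 65536.
    lastRow = ws.Cells(ws.Rows.Count, emailCol).End(xlUp).Row
    If lastRow < 2 Then Exit Sub                  ' only the header row is present

    Set rng = ws.Range(ws.Cells(2, emailCol), ws.Cells(lastRow, emailCol))
    For Each cell In rng.Cells
        If Not IsEmpty(cell.Value) Then
            If Application.WorksheetFunction.CountIf(rng, cell.Value) > 1 Then
                cell.Interior.Color = RGB(255, 48, 48)
            End If
        End If
    Next cell
End Sub
```

With this in a standard module, the button handler no longer needs a hard-coded range: CommandButton1_Click can simply call HighlightEmailDuplicates. Note that this sketch colors every occurrence of a duplicated value, whereas the original DeleteDups loop leaves the first occurrence unmarked.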

Related

How to combine multiple macros and excel functions into a single macro that executes on button click?

I need to combine multiple macros to a single macro that executes on button click. Kindly excuse me if I write anything wrong since I am completely new to excel macros and vb.
Following is the scenario.
Steps:
Calculate total
Extract reference
Compare the Total values for matching references and mark them as "Complete" if the sum of Total for the matching references comes to 0.
(Explained...)
First I calculate the debit and credit amounts into a new column called Total. For this I initially used the SUM function; after that I tried the same using a macro that executes on button click.
(old macro)
Private Sub getTotal_Click()
With ActiveSheet
lastRow = .Cells(.Rows.Count, "A").End(xlUp).Row
End With
For i = 5 To lastRow
Range("K" & i).Value = Range("F" & i).Value + Range("G" & i).Value
Next i
End Sub
This was far more time-consuming (around 2 hours on 75k records) than using the formula (which finished in minutes). I still do not understand the reason for this. However, after modifying it based on Dy.Lee's answer below, it took only seconds to calculate the total.
(modified based on Dy.Lee's answer)
Private Sub getTotal_Click()
Dim vDB As Variant, vR() As Variant
Dim i As Long, n As Long, lastRow As Long
With ActiveSheet
lastRow = .Cells(.Rows.Count, "A").End(xlUp).Row
vDB = .Range("R5", "S" & lastRow)
n = UBound(vDB, 1)
ReDim vR(1 To n, 1 To 1)
For i = 1 To n
vR(i, 1) = vDB(i, 1) + vDB(i, 2)
Next i
.Range("AL5").Resize(n) = vR
End With
End Sub
Now moving on to the second macro which I used to extract a pattern from strings in a column D and E.
Function extractReference(cid_No As String, pm_source As String)
Dim regExp As Object, findMatches As Object, match As Object
Dim init_result As String: init_result = ""
Set regExp = CreateObject("vbscript.regexp")
With regExp
.Global = True
.MultiLine = False
.Pattern = "(?:^|\D)(\d{5,6})(?!\d)"
End With
Set findMatches = regExp.Execute(pm_source)
For Each match In findMatches
init_result = init_result + match.SubMatches.Item(0)
Next
If init_result <> "" Then
extractReference = cid_No & " | " & init_result
Else
extractReference = ""
End If
End Function
This macro was working fine.
Finally I used the following function after copying both the extracted reference and total to a new sheet and creating a datatable for that
=IF(ISBLANK([#Reference]), "", (IF((ROUND(SUMIFS([Total],[Reference],[#Reference]),2)=0), "complete", "")))
This also worked fine.
What I actually want now is to avoid creating any new data tables or sheets and perform all of this within the current sheet on a single button click. Is there any way to do that without making the macro time-consuming? Your help is highly appreciated!
Thanks in Advance
For the first part, try:
Private Sub getTotal_Click()
Dim lastRow As Long
Dim sumRange As Range
With ActiveSheet
lastRow = .Cells(.Rows.Count, "A").End(xlUp).Row
End With
Set sumRange = Range(Range("K5"), Range("K" & lastRow))
sumRange.FormulaR1C1 = "=RC[-5]+RC[-4]"
sumRange.Copy
sumRange.PasteSpecial Paste:=xlPasteValues
Application.CutCopyMode = False
End Sub
Also, if you still want to loop, note that addressing a cell with .Cells(1, 1) is generally faster than Range("A1").
You need to use a Variant array. It is faster.
Private Sub getTotal_Click()
Dim vDB As Variant, vR() As Variant
Dim i As Long, n As Long, lastRow As Long
With ActiveSheet
lastRow = .Cells(.Rows.Count, "A").End(xlUp).Row
vDB = .Range("f5", "g" & lastRow)
n = UBound(vDB, 1)
ReDim vR(1 To n, 1 To 1)
For i = 1 To n
vR(i, 1) = vDB(i, 1) + vDB(i, 2)
Next i
.Range("k5").Resize(n) = vR
End With
End Sub
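The same array approach can be extended to cover all three steps from the question in one pass, without a helper sheet or data table. The following is only a sketch under several assumptions: column D holds cid_No and column E holds pm_source, debit and credit are in F and G, the results go to K, L and M, and extractReference from the question is in the same project. CompletionFlags and ProcessAll are hypothetical names.

```vba
Option Explicit

' For parallel arrays of references and totals, returns an array (same bounds)
' holding "complete" where the totals for that reference sum to 0 (2 decimals).
Function CompletionFlags(refs As Variant, totals As Variant) As Variant
    Dim sums As Object, flags() As String
    Dim i As Long, key As String

    Set sums = CreateObject("Scripting.Dictionary")
    ReDim flags(LBound(refs) To UBound(refs))

    For i = LBound(refs) To UBound(refs)          ' pass 1: sum totals per reference
        key = CStr(refs(i))
        If Len(key) > 0 Then sums(key) = sums(key) + totals(i)
    Next i
    For i = LBound(refs) To UBound(refs)          ' pass 2: flag balanced references
        key = CStr(refs(i))
        If Len(key) > 0 Then
            If Round(sums(key), 2) = 0 Then flags(i) = "complete"
        End If
    Next i
    CompletionFlags = flags
End Function

Sub ProcessAll()
    Dim vSrc As Variant, vOut() As Variant, flags As Variant
    Dim refs() As String, totals() As Double
    Dim i As Long, n As Long, lastRow As Long

    With ActiveSheet
        lastRow = .Cells(.Rows.Count, "A").End(xlUp).Row
        If lastRow < 5 Then Exit Sub              ' data starts in row 5, as in the question
        vSrc = .Range("D5", "G" & lastRow).Value  ' assumption: D, E, F, G -> array columns 1..4
        n = UBound(vSrc, 1)
        ReDim refs(1 To n): ReDim totals(1 To n): ReDim vOut(1 To n, 1 To 3)

        For i = 1 To n
            totals(i) = vSrc(i, 3) + vSrc(i, 4)   ' debit + credit
            ' extractReference is the function from the question
            refs(i) = extractReference(CStr(vSrc(i, 1)), CStr(vSrc(i, 2)))
        Next i
        flags = CompletionFlags(refs, totals)

        For i = 1 To n
            vOut(i, 1) = totals(i): vOut(i, 2) = refs(i): vOut(i, 3) = flags(i)
        Next i
        .Range("K5").Resize(n, 3).Value = vOut    ' assumption: output columns K, L, M
    End With
End Sub
```

The dictionary plays the role of SUMIFS here: one pass accumulates the total per reference and a second pass flags the rows, so the sheet is read once and written once.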

Need help trimming spaces out of column

I am trying to figure out how to loop through the first column of my worksheet and take out the spaces so I can use VLOOKUP. Not sure how to do it in VBA. Here is what I have:
I can't figure out why it does not go onto the next sheet now? I can't just cycle through all of the sheets since they are different.
Sub trima()
Dim x As Integer
Dim numrows As Long
numrows = Range("A1",Range("A1").End(xlDown)).Rows.Count
Range("A1").Select
For x = 1 To numrows
Application.WorksheetFunction.trim (ActiveCell)
ActiveCell.Offset(1, 0).Select
Next
End Sub
Here you go:
Sub TrimA()
Dim v
v = [transpose(transpose(trim(a1:index(a:a,match("",a:a,-1)))))]
[a1].Resize(UBound(v)) = v
End Sub
UPDATE
If you want to update multiple sheets, you can utilize the above like so:
Sub DoTrims()
Sheet1.Activate: TrimA
Sheet2.Activate: TrimA
'etc.
End Sub
The Trim function does not work like that.
Instead, try something like:
Sub trima()
Dim numrows As Long
Dim vItem as Variant
Dim i As Long
numrows = Range("A1",Range("A1").End(xlDown)).Rows.Count
Application.ScreenUpdating = False
With ActiveSheet
For i = 1 To numrows
vItem = .Range("A" & i)
If vItem <> vbNullString then .Range("A" & i) = Application.WorksheetFunction.Trim(vItem)
Next
End With
Application.ScreenUpdating = True
End Sub
The following code will loop through ALL worksheets in the Workbook and perform the same trim on values in Column A:
Sub trimA()
Dim ws As Excel.Worksheet
Dim i As Long, numrows As Long
Dim vItem As Variant
Application.ScreenUpdating = False
For Each ws In Worksheets
With ws
numrows = .Range("A1", .Range("A1").End(xlDown)).Rows.Count
For i = 1 To numrows
vItem = .Range("A" & i)
If vItem <> vbNullString Then .Range("A" & i) = Application.WorksheetFunction.Trim(vItem)
Next i
End With
Next
Application.ScreenUpdating = True
End Sub
Using the Range.TextToColumns method should quickly clean all cells containing leading/trailing spaces.
This procedure can quickly convert text-that-looks-like-numbers to true numbers as well.
Dim c As Long
With Range("A1").CurrentRegion '<~~ set to the desired range of one or more columns
For c = 1 To .Columns.Count
.Columns(c).TextToColumns Destination:=.Columns(c), _
DataType:=xlFixedWidth, FieldInfo:=Array(0, 1)
Next c
End With
If the cells actually contain non-standard spacing such as the non-breaking space (common in data copied from a web page), then a Range.Replace call should be added as well.
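For the non-breaking space case, a minimal sketch could look like this. It assumes the offending character is code 160 and reuses the A1 current region from the example above; NormalizeSpaces and ReplaceNonBreakingSpaces are hypothetical names.

```vba
Option Explicit

' Replaces non-breaking spaces (character code 160) with ordinary spaces, then trims.
Function NormalizeSpaces(ByVal s As String) As String
    NormalizeSpaces = Trim$(Replace(s, ChrW$(160), " "))
End Function

Sub ReplaceNonBreakingSpaces()
    ' assumption: same block of cells as in the TextToColumns example
    With Range("A1").CurrentRegion
        .Replace What:=ChrW$(160), Replacement:=" ", LookAt:=xlPart, MatchCase:=False
    End With
End Sub
```

Running ReplaceNonBreakingSpaces before the TextToColumns loop turns the non-breaking spaces into ordinary ones, which the fixed-width pass can then strip. NormalizeSpaces does the same for a single string if you prefer to clean values in a loop or an array.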

Formatting a text file and exporting a text file

I have ranges of data in a text file, in lines like 102201906000-102201911999-23451, several thousand of them. I want to create a new text file that expands each range into individual numbers, like this:
102201906000 23451
102201906001 23451
102201906002 23451
till
102201911999 23451
The last number stays fixed.
I have written the following code.
Private Sub CommandButton21_Click()
Application.ScreenUpdating = False
Dim LastRow As Long
LastRow = Cells.Find("*", SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row
Dim lCol As Long
Dim rngEnd As String
Dim rng1 As String
Dim rng2 As String
Dim x As Long
Dim Num As Range
For Each Num In Range("A1:A" & LastRow)
lCol = ActiveSheet.UsedRange.Columns.Count
rngEnd = Split(Num, "-")(2)
rng1 = Split(Num, "-")(0) - 1
rng2 = Split(Num, "-")(1)
For x = 1 To rng2 - rng1
Cells(x, lCol + 1) = rng1 + x & " " & rngEnd
Next x
Next Num
Application.ScreenUpdating = True
End Sub
But because I have a huge amount of data, I am unable to use it properly.
Can I get some help creating the text file directly when I run a macro, without using the spreadsheet?
Waiting for expert advice.
Assuming that you have a text file and want to create another text file in which each line like 102201906000-102201911999-23451 is replaced by a series of lines like 102201906000 23451, it is more natural to use VBScript than straight VBA. The code below bypasses the need to pull the data into Excel. It is written as an Excel macro, so you still need to call it from Excel, but with minor modifications you can remove Excel from the loop completely and use pure VBScript.
To use it you have to add a reference to Microsoft Scripting Runtime to your project (Tools/References in the VBA editor).
Sub ExpandData(inName As String, outName As String)
Dim FSO As New FileSystemObject
Dim tsIn As TextStream
Dim tsOut As TextStream
Dim startNum, endNum, i, line 'variants
On Error GoTo err_handler
Set tsIn = FSO.OpenTextFile(inName, ForReading)
Set tsOut = FSO.OpenTextFile(outName, ForWriting, True)
Do While tsIn.AtEndOfStream = False
line = Split(tsIn.ReadLine, "-")
If UBound(line) = 2 Then
startNum = CDec(line(0))
endNum = CDec(line(1))
For i = startNum To endNum
tsOut.WriteLine i & " " & line(2)
Next i
End If
Loop
tsIn.Close
tsOut.Close
Exit Sub
err_handler:
Debug.Print "ExpandData failed: " & Err.Description
End Sub
Use it like this (inName must be different from outName):
Sub test()
ExpandData "C:\Programs\test.txt", "C:\Programs\testout.txt"
End Sub
I would attempt to write them all at once. Looping through and examining them individually should not be necessary if they are sequential.
Private Sub CommandButton21_Click()
Application.ScreenUpdating = False
Dim lr As Long, nmbr As Long, bgn As String, nd As String
With Worksheets("Sheet1") '<~~set this worksheet properly!
lr = .Cells(1, 1).SpecialCells(xlCellTypeLastCell).Row
nd = .Cells(1, 1).Value2
bgn = left(nd, 7)
nmbr = CLng(Mid(nd, 8, 5))
nd = right(nd, 5)
With .Cells(1, Columns.Count).End(xlToLeft).Offset(0, 1).Resize(lr, 1)
.Formula = "=""" & bgn & """&TEXT(ROW(" & Rows(nmbr).Address(0, 0) & "), ""00000_)"")&""" & nd & """"
.Value = .Value2
End With
End With
Application.ScreenUpdating = True
End Sub
This generates sequential numbers based on the ROW function. The prefix and suffix are peeled off the first value in A1 only once and are used afterwards as string variables.
      

Concatenate columns(user selected) and replace them with new column

I'm not an advanced VBA programmer. I'm working on an Excel macro that lets me select a range (using an input box) to clean the data on a worksheet (making it consistent with a MySQL schema). I get this file from another team, and:
1) the order of the columns is not fixed
2) the number of category levels (there are a few columns for categories, such as level1, level2, etc.) can be anywhere between 3 and 10.
I want to concatenate the columns for categories(in image level 1, level 2 etc.) using | as a separator and put the values in first category column(level1) while deleting remaining columns(level 2, level 3...[level 10]).
I removed some code from the end to reduce the length here but it still makes sense:
Sub cleanData()
Dim rngMyrange As Range
Dim cell As Range
On Error Resume Next
Do
'Cleans Status column
Set rngMyrange = Application.InputBox _
(Prompt:="Select Status column", Type:=8)
On Error GoTo 0
'Is a range selected? Exit sub if not selected
If rngMyrange Is Nothing Then
End
Else
Exit Do
End If
Loop
With rngMyrange 'with the range just selected
.Replace What:="Dead", Replacement:="Inactive", SearchOrder:=xlByColumns, MatchCase:=False
'I do more replace stuff here
End With
rngMyrange.Cells(1, 1) = "Status"
Do
'Concatenates Category Columns
Set rngMyrange = Application.InputBox _
(Prompt:="Select category columns", Type:=8)
On Error GoTo 0
'Is a range selected? Exit sub if not selected
If rngMyrange Is Nothing Then
End
Else
Exit Do
End If
Loop
With rngMyrange 'with the range just selected
'Need to concatenate the selected columns(row wise)
End With
rngMyrange.Cells(1, 1) = "Categories"
End Sub
Please do not suggest a UDF, I want to do this with macro. I must do this on files before importing them on SQL database, so a macro will be handy. Please ask if I failed to mention anything else.
EDIT: Image attached for illustration
UPDATE:
I now have working code, with help from vaskov17 on mrexcel, but it does not delete the columns the levels are taken from (level 2, level 3, etc.) so that the following columns shift left. The major challenge for me is to implement that code in my existing macro using a Range instead of Long values: I do not want to enter the start column and end column separately; instead I should be able to select a range as in my original macro. The code for that macro is below, please help me:
Sub Main()
Dim start As Long
Dim finish As Long
Dim c As Long
Dim r As Long
Dim txt As String
start = InputBox("Enter start column:")
finish = InputBox("Enter ending column:")
For r = 2 To Cells(Rows.Count, "A").End(xlUp).Row
For c = start To finish
If Cells(r, c).Text <> "" Then
txt = txt & Cells(r, c).Text & "|"
Cells(r, c).Clear
End If
Next
If Right(txt, 1) = "|" Then
txt = Left(txt, Len(txt) - 1)
End If
Cells(r, start) = txt
txt = ""
Next
End Sub
I have removed the input box for selecting the category columns. Since they are always named Level 1, Level 2, and so on, it is easier to find them automatically. That's why I added a FindColumns() Sub to your code. It assigns the first (fCol) and last (lCol) category columns to module-level variables.
ConcatenateColumns() concatenates the cells in each row using "|" as a separator.
DeleteColumns() deletes the other columns.
Cells(1, fCol).Value = "Category" renames Level 1 to Category, and Columns.AutoFit resizes all column widths to fit the text.
Code:
Option Explicit
Dim fCol As Long, lCol As Long
Sub cleanData()
Dim rngMyrange As Range
Dim cell As Range
On Error Resume Next
Do
'Cleans Status column
Set rngMyrange = Application.InputBox _
(Prompt:="Select Status column", Type:=8)
On Error GoTo 0
'Is a range selected? Exit sub if not selected
If rngMyrange Is Nothing Then
End
Else
Exit Do
End If
Loop
With rngMyrange 'with the range just selected
.Replace What:="Dead", Replacement:="Inactive", SearchOrder:=xlByColumns, MatchCase:=False
'I do more replace stuff here
End With
rngMyrange.Cells(1, 1) = "Status"
' Concatenate Category Columns
FindColumns
ConcatenateColumns
DeleteColumns
Cells(1, fCol).Value = "Category"
Columns.AutoFit
End Sub
Private Sub FindColumns()
Dim ws As Worksheet
Set ws = ActiveSheet
Dim i As Long, j As Long
For i = 1 To ws.Cells(1, Columns.Count).End(xlToLeft).Column
If StrComp(ws.Cells(1, i).Text, "Level 1", vbTextCompare) = 0 Then
For j = i To ws.Cells(1, Columns.Count).End(xlToLeft).Column
If InStr(1, ws.Cells(1, j).Text, "Level", vbTextCompare) Then
lCol = j
End If
Next j
fCol = i
Exit Sub
End If
Next i
End Sub
Private Sub ConcatenateColumns()
Dim rng As Range
Dim i As Long, j As Long
For i = 2 To Cells(Rows.Count, fCol).End(xlUp).Row
Set rng = Cells(i, fCol)
For j = fCol + 1 To lCol
rng = rng & "|" & Cells(i, j)
Next j
rng = "|" & rng & "|"
Set rng = Nothing
Next i
End Sub
Private Sub DeleteColumns()
Dim i As Long
For i = lCol To fCol + 1 Step -1
Columns(i).Delete Shift:=xlToLeft
Next i
End Sub
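If you would rather keep the range selection from the question instead of detecting the Level columns by name, a sketch along these lines may help. It assumes the user selects one contiguous block of category columns, that row 1 holds the headers, and that empty cells should be skipped. JoinNonEmpty and ConcatenateSelectedColumns are hypothetical names.

```vba
Option Explicit

' Joins the non-empty items of values with sep between them.
Function JoinNonEmpty(values As Variant, ByVal sep As String) As String
    Dim v As Variant, result As String
    For Each v In values
        If Len(CStr(v)) > 0 Then
            If Len(result) > 0 Then result = result & sep
            result = result & CStr(v)
        End If
    Next v
    JoinNonEmpty = result
End Function

Sub ConcatenateSelectedColumns()
    Dim rng As Range, ws As Worksheet
    Dim firstCol As Long, lastCol As Long, lastRow As Long
    Dim r As Long, c As Long
    Dim parts() As Variant

    On Error Resume Next                ' Cancel makes the Set statement fail
    Set rng = Application.InputBox(Prompt:="Select category columns", Type:=8)
    On Error GoTo 0
    If rng Is Nothing Then Exit Sub

    Set ws = rng.Worksheet
    firstCol = rng.Column
    lastCol = firstCol + rng.Columns.Count - 1      ' assumption: single contiguous selection
    lastRow = ws.UsedRange.Row + ws.UsedRange.Rows.Count - 1

    ReDim parts(1 To rng.Columns.Count)
    For r = 2 To lastRow                            ' row 1 holds the headers
        For c = firstCol To lastCol
            parts(c - firstCol + 1) = ws.Cells(r, c).Text
        Next c
        ws.Cells(r, firstCol).Value = JoinNonEmpty(parts, "|")
    Next r

    ' Remove the remaining category columns so later columns shift left.
    If lastCol > firstCol Then
        ws.Range(ws.Columns(firstCol + 1), ws.Columns(lastCol)).Delete Shift:=xlToLeft
    End If
    ws.Cells(1, firstCol).Value = "Categories"
End Sub
```

Unlike ConcatenateColumns above, this variant does not wrap the result in leading and trailing "|" characters and does not emit "||" for blank levels; whether that matches the import format is something to check against the database schema.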

VBA column looping

I have a large Excel file and I need to replace all values in 12 columns completely.
Right now, there is a formula in each one of the cells, and I need to replace that formula with my own.
How do I loop through all those columns when I know the start row but not the end row (the file is updated constantly)? The hack of using "A600000" seems like overkill.
I am new to VBA and some guidance would be really appreciated.
ActiveSheet.UsedRange is the range of all the used cells on the current sheet.
You can use ActiveSheet.UsedRange.Rows.Count and .Columns.Count to get the height and width of this range.
Here's a very crude Sub that hits every cell in the range:
Sub test()
Dim thisRange As Range
Dim x As Long, y As Long
Set thisRange = ActiveSheet.UsedRange
With thisRange
'Cells(y, x) is relative to the top-left cell of the used range
For y = 1 To .Rows.Count
For x = 1 To .Columns.Count
.Cells(y, x).Value = "Formula here"
Next x
Next y
End With
End Sub
But what you want may be different, can you be more specific?
The code below will accomplish what you need. You just need to supply startRow, the sheet name in .Sheets("Name"), and the column range for i. If the columns are all the same length, UsedRange will work fine as long as there are no stray cells with values outside and below the columns you are interested in. Otherwise, try this in your code (on a throwaway copy of your workbook):
Sub GetLastRowInColumn()
Dim ws As Excel.Worksheet
Set ws = ActiveWorkbook.Sheets("YOURSHEETNAMEHERE")
Dim startRow As Long
startRow = 1
Dim lastRow As Long
Dim i As Long
For i = 1 To 12 'Column 1 to Column 12 (Adjust Accordingly)
'Find the last used row in this column, then fill the column in one assignment
lastRow = ws.Cells(ws.Rows.Count, i).End(xlUp).Row
ws.Range(ws.Cells(startRow, i), ws.Cells(lastRow, i)).Formula = "=Max(1)" 'Sample Formula
Next
End Sub
EDIT : Fixed typo
The below function will build the range with varying length columns. Use the function to return the desired range and fill all related cells in one shot.
Function GetVariantColumnRange(MySheet As Excel.Worksheet, _
TopRow As Long, StartColumn As Long, LastColumn As Long) As Excel.Range
Dim topAddress As String
Dim bottomAddress As String
Dim addressString As String
Dim i As Long
For i = StartColumn To LastColumn
topAddress = MySheet.Cells(TopRow, i).Address
bottomAddress = MySheet.Cells(MySheet.Rows.Count, i).End(xlUp).Address
addressString = addressString & ", " & topAddress & ":" & bottomAddress
Next
addressString = Right(addressString, Len(addressString) - _
InStr(1, addressString, ", ", vbBinaryCompare))
Set GetVariantColumnRange = MySheet.Range(addressString)
End Function
Usage follows...
Sub Test()
Dim myrange As Range
Set myrange = GetVariantColumnRange(ThisWorkbook.Sheets(1), 1, 1, 12)
myrange.Select 'Just a visual aid. Remove from final code.
myrange.Formula = "=APF($Jxx, ""string1"", ""string2"")" 'quotes inside a VBA string must be doubled
End Sub