Delete duplicate entries in a column in excel 2003 vba - vba

Well the question is, I have got a column, for example column Y has many entries in it, nearly 40,000 and It increases everyweek. The thing is I have to check for duplicates in Y column and delete the entire row. Thus, Y column should have only unique entries.
Suppose I have 3,000 entries and after 1 week, i'll have about 3,500 entries. Now I have to check these newly added 500 columnn values not the 3,500 with the old + the new i.e 3,500 entries and delete the duplicated row. The old 3,000 shouldn't be deleted or changed. I have found macros but they do the trick for the entire column. I would like to filter the new 500 values.
Cells(2, "Q").Formula = "=COUNTIF(P$1:P1,P2)=0" 'I have used these formula
Range("Q2").Copy Destination:=Range("Q3:Q40109") 'it gives false for the duplicate values
I know we have to use countif for the duplicate entries. But what Iam doing is applying the formula and then search for false entries and then delete it. I belive applying formula and finding false and then deleting its little bit time consuming.
Sub DeleteDups()
Dim x As Long
Dim LastRow As Long
LastRow = Range("A65536").End(xlUp).Row
For x = LastRow To 1 Step -1
If Application.WorksheetFunction.CountIf(Range("A1:A" & x), Range("A" & x).Text) > 1 Then
Range("A" & x).EntireRow.Delete
End If
Next x
End Sub
This is what I found on google but i dont know where the error is. It is deleting all the columns if i set
For x = LastRow To 1 Step -1
For x = LastRow to step 3000 ' It is deleting all 500 columns but if it is -1 working fine
Any modifications need to be done for these function? or sugest me any good function that helps me. Check for the duplicate values of a selected column range from the entire column. I mean check 500 entires column values with the 3500 column entry values and delete the duplicates in 500 entries
Thanks in advance

This should be rather simple. You need to create 1 cell somewhere in your file that you will write the cell count for column Y each week after removing all dupes.
For example, say week1 you remove dupes and you are left with a range of Y1:Y100. Your function will put "100" somewhere in your file to reference.
Next week, your function will start looking from dupes from (cell with ref number) + 1, so Y:101 to end of column. After removing dupes, the function changes the ref cell to the new count.
Here is the code:
Sub RemoveNewDupes()
'Initialize for first time running this
If Len(Range("A1").Value) = 0 Then
Range("A1").Value = Range("Y" & Rows.count).End(xlUp).row
End If
If Range("A1").Value = 1 Then Range("A1").Value = 0
'Goodbye dupes!
ActiveSheet.Range("Y" & Range("A1").Value + 1 & ":Y" & _
Range("Y" & Rows.count).End(xlUp).row).RemoveDuplicates Columns:=1, Header:=xlNo
'Re-initialize the count for next time
Range("A1").Value = Range("Y" & Rows.count).End(xlUp).row
End Sub
*sorry no idea why auto-syntax highlighting makes this hard to read
Update:
Here is a way to do it in Excel 2003. The trick is to loop backwards through the column so that the loop isn't destroyed when you delete a row. I use a dictionary (which I'm famous for over-using) since it allows you to check easily for dupes.
Sub RemoveNewDupes()
Dim lastRow As Long
Dim dict As Object
Set dict = CreateObject("scripting.dictionary")
If Len(Range("A1").Value) = 0 Then
Range("A1").Value = 1
End If
lastRow = Range("Y" & Rows.count).End(xlUp).row
On Error Resume Next
For i = lastRow To Range("A1").Value Step -1
If dict.exists(Range("Y" & i).Value) = True Then
Range("Y" & i).EntireRow.Delete
End If
dict.Add Range("Y" & i).Value, 1
Next
Range("A1").Value = Range("Y" & Rows.count).End(xlUp).row
End Sub

How can Excel know that entries are "new"? (e.g. how can we know we only have to consider the 500 last rows)
Actually, if you already executed the macro last week, the first 3,000 rows won't have any duplicates so the current execution won't change these rows.
The code your described should nearly work. If we keep it and change it very slightly:
Sub DeleteDups()
Dim x As Long
Dim LastRow As Long
LastRow = Range("Q65536").End(xlUp).Row
For x = LastRow To 1 Step -1
'parse every cell from the bottom to the top (to still count duplicates)
' and check if duplicates thanks to the formula
If Range("Q" & x).Value Then Range("Q" & x).EntireRow.Delete
Next x
End Sub
[EDIT] Another (probably faster) solution: filter first the values and then delete the visible rows:
Sub DeleteDups()
ActiveSheet.UsedRange.AutoFilter Field:=17, Criteria1:="True" 'filter column Q for True values
ActiveSheet.Cells.SpecialCells(xlCellTypeVisible).EntireRow.Delete
End Sub
Couldn't test this last solution right here, sorry.

Here's an idea:
Sub test
LastRow = Range("A65536").End(xlUp).Row
For i = LastRow To 1 Step -1
If Not Range("a1:a" & whateverLastRowYouWantToUse ).Find(Range("a" & i).Value, , , , , xlPrevious) Is Nothing Then
Rows(i).Delete
End If
Next i
End Sub
It checks the entire range above the current cell for a single duplicate. If found, it the current row is deleted.
EDIT I just realized in your example, you said column Y, but in your code you are checking A. Not sure if the example was just a hypothetical, but wanted to make sure that wasn't the reason for the odd behavior.
Note, this is untested! Please save your workbook before trying this!

Related

"Next" not working in loop when IF statement is true

I am trying to check a cell which contains a unique ID (concatenation of several columns) in the current row and compare it to the row above it's unique ID. If the values are the same, I want to copy the current line and insert this copy in a new row located two rows above the current row. If the ID's are not the same, I want to do nothing and move to the next row on the sheet.
The code below runs without errors, and works correctly until an ID match is found. The code copies and inserts as I require, but it acts like it is starting the "IF" statement again for the same line (the "next" does not advance to analyze the next row). So I end up getting endless copies and inserts of the first line that has a matched Unique ID to the row above it.
Sub CopyAndInsert
Dim LastRow As Long
LastRow = Worksheets("Orders").Range("A" & Rows.Count).End(xlUp).Row
Set SelectionRNG = Worksheets("Orders").Range("A2:CX" & LastRow)
For Each rngrow In SelectionRNG.Rows
rngrow.Copy
If rngrow.Cells(1, 102) = rngrow.Cells(0, 102) Then 'checks if row CX is equal to the row above it (same column)
rngrow.Cells(2, 1).Offset(-2, 0).EntireRow.Insert
End If
Next
End Sub
Please note if the rows do not match in the unique ID column (CX) then the "next" does work as expected and continues onto the next row. The issue of not advancing only occurs when the ID's do match and the copied row is inserted.
Thanks in advance for any assistance!
Tying the loop to a variable and manually increasing it by 1 to skip the added row seems to work:
Sub CopyAndInsert()
Dim lastRow As Long, x As Long
lastRow = Worksheets("Orders").Range("A" & Rows.Count).End(xlUp).Row
For x = 3 To lastRow
If Cells(x, 102) = Cells(x - 1, 102) Then
Cells(x - 1, 1).EntireRow.Insert
Range("A" & x + 1 & ":" & "CX" & x + 1).Copy Cells(x - 1, 1)
x = x + 1
End If
Next x
End Sub

VBA Look for Duplicate, then assesses another cells value

I initially asked a question below.
Basically I want VBA to look at Column L:L. If the cell=program, and the cell below does not equal lathe I want the row above to be deleted. If the cell doesn't equal program continue looking until the end of the data.
Realized I needed to look at the data different, as I was losing rows that I needed to stay.
New logic, which I think will still use some of the old program, but
it needed to be sorted using another column. I need the VBA to look at
column E:E. If the cell in the row below is a duplicate of the cell
above, then look at column L in that row to see if the cell says
Program. If so the cell below should be Lathe. If not lathe delete the
Program Row, If it is Lathe leave both rows. If the Cells in Column E
are not duplicates, continue looking. EX. If E5=E6, If not continue
looking. If yes Look at L5 to see if it say Program. If so look at L6
for Lathe. If not delete ROW5.
This I what I received that answered teh first question which I think will still get used
Dim rngCheck as Range
Dim rngCell as Range
Set rngCheck = Range("L1", "L" & Rows.Count - 1)
For each rngCell in rngCheck
If rngCell.value = "Program" And rngCell.offset(1,0).value <> "lathe" then
rngCell.offset(-1,0).EntireRow.Delete
End if
Next rngCell
This should do it
For i = ThisWorksheet.Cells.SpecialCells(xlCellTypeLastCell).Row to 2 step -1
' that row do you mean the duplicate or the original (I am using original)
If ThisWorksheet.Cells(i, 5) = ThisWorksheet.Cells(i-1, 5) and _
ThisWorksheet.Cells(i-1, 12) = "Program" and ThisWorksheet.Cells(i, 12) <> "Lathe"
ThisWorksheet.Rows(i-1).EntireRow.Delete
End If
Next i
When deleting it is best to iterate from last to first. If prevent you from skipping rows.
Sub RemoveRows()
Dim x As Long
With Worksheets("Sheet1")
For x = .Range("E" & .Rows.Count).End(xlUp).Row To 2 Step -1
If .Cells(x, "E").Value = .Cells(x - 1, "E").Value And Cells(x - 1, "L").Value = "Program" Then
.Rows(x).Delete
End If
Next
End With
End Sub

Autofill to new last row from previously last row after data is added?

I have a database that is growing quite large. Every day I add about 100 rows of information.
I have about 20 columns that autofill with calculations, etc. A few of those columns pull from a VERY large file using Vlookups. That takes forever because it's pulling the entire column everyday because my current autofill macro starts in row 2.
Is there any way to write the macro so that it autofills from the previous "last row" so it's only autofilling 100 or so new rows instead of several thousand?
I have tried the following with no luck:
Range("BZ2").AutoFill Destination:=Range("BZ2:BZ" & LastRow) is an example of one of my VBA codes for an autofill I use. LastRow is the last row after the new data is pasted in. I would like it to ideally start at the OldLastRow which would be the last row before I paste in the new data. I tried Range("BZ" & OldLastRow).AutoFill Destination:=Range("BZ" & OldLastRow & ":BZ" & LastRow) without luck.
Try this.
Dim LastRow As Long
With ActiveSheet
'This will get you the OldLastRow. Add +1 to the end to get the first empty row if you need it.
'The "A" in .Rows.Count, "A" should be replaced with a column that will always have data in every row.
LastRow = .Cells(.Rows.Count, "A").End(xlUp).Row
.Range("BZ2").AutoFill Destination:=Range("BZ2:BZ" & LastRow)
'This will pause the macro until Calculations are complete.
Do While Application.CalculationState <> xlDone
DoEvents
Loop
End With
Ideally you would be doing the autofill from the point that you pasted your data in, so you wouldnt want to have "BZ2" anywhere in there if you just want to autofill new data.
you could try this one as well.
Dim LastRow As Long
With ActiveSheet
LastRow = .Cells(.Rows.Count, "BZ").End(xlUp).Row
.Range("BZ2").AutoFill Destination:=Range("BZ" & LastRow & ":BZ" & Cells(Rows.Count, "BY").End(xlUp).Row)
Do While Application.CalculationState <> xlDone
DoEvents
Loop
End With
This is assuming you only have one column of calculations. if not change the "BZ" & LastRow & ":BZ" to "BZ" & LastRow & ":CA" or whatever your last calculated column is.

How can I put consecutive numbers in an active range of cells using VBA?

I have a selected series of 400 cells in one common and want to number them. However, I want this to be dynamic so if I select a bigger range it will number them. Basically how do I number the active cells. I tried doing it using a macro but because of the way my data is structured there is nothing in the row to start and consequently I couldn't get an xldown function to work. Thanks in advance I really appreciate it.
This will fill either a single-column or single-row contiguous selection (i.e. a single area):
Sub Tester()
With Selection
If .Columns.Count = 1 Then
.Value = .Parent.Evaluate("Row(1:" & .Rows.Count & ")")
ElseIf .Rows.Count = 1 Then
.Value = Application.Transpose( _
.Parent.Evaluate("Row(1:" & .Columns.Count & ")"))
End If
End With
End Sub

How to delete duplicates from each row but not delete the whole column

For context, I have a worksheet with job titles in Column A, then in columns B onwards, it has the name of a person who does that job. However, as it is populated from multiple projects, some job titles have duplicates in the row, where the same person does the same job on different projects. What I need to do is delete the duplicates from each row. All solutions I have tried have given me errors.
Can anyone help?
The data gets to its current form using VBA, copying it from another sheet, so if it could be integrated into the copying, it wouldn't be a problem.
update:
So in an earlier part of the macro, I have to do this, but for a single column. This works:
For np = lastRow To 1 Step -1
If Application.WorksheetFunction.CountIf(Range("D1:D" & np), Range("D" & np).Text) > 1 Then
Range("D" & np).ClearContents
End If
Next np
However, when I tried to add another loop outside this one, and change the D1:D to a variable, I fall down.
And what I want it to do is go from the top 4 rows in the picture, to the bottom 4. I have 78 jobs, and some of them have 20 people including the duplicates http://picture.ms/images/2013/09/18/Capture1.png:
if you want to use two loops, how about something like this
Sub RemoveDuplicates()
Dim i As Integer
Dim j As Integer
For i = 1 To Range("A65536").End(xlUp).Row
For j = i + 1 To Range("A65536").End(xlUp).Row
If Cells(i, 1) = Cells(j, 1) Then
If i <> j Then
'just clear the duplicate cell...
Cells(j, 1).ClearContents
'or to delete the entire row if a duplicate is found...
'Cells(j, 1).EntireRow.Delete
End If
End If
Next
Next
End Sub
NB - I've used Range("A65536").End(xlUp).Row to get the last row of data in column A, in Excel2007 and above there more rows on a worksheet, but it sounds like your data is much less than this anyway