excel, VBA, duplicate values - vba

I have a large spreadsheet I was asked to edit. Basically wherever the data was pulled from, it created several duplicates of individuals names, country, start dates, and end dates. Would it be possible to get the start and stop dates in adjacent cells and remove the duplicate data?
I have provided a screen shot. Manually copying, pasting, and deleting would take a very long time since this spreadsheet has over 2300 rows with approximately 50% being duplicates that will need edited.
thanks

VBA shouldn't be necessary here, just add a new column with the formula:
=CONCATENATE(C1,D1)
Replace the column letters with your column letters for Start Date and End Date.
You can then use Excel's remove duplicates function on the new column (Data -> Remove Duplicates)

Using SQL is suitable.
Sub myQuery()
Dim strSQL As String
Dim strTable As String
Dim Ws As Worksheet
strTable = "[" & ActiveSheet.Name & "$]"
strSQL = "SELECT NAME, COUNTRY, MIN([Start Date]) as [Start Date] , max([End Date]) as [End Date] "
strSQL = strSQL & " FROM " & strTable & " "
strSQL = strSQL & " Where not isnull(NAME) "
strSQL = strSQL & " Group by NAME, COUNTRY "
Set Ws = Sheets.Add
exeSQL strSQL, Ws
End Sub
Sub exeSQL(strSQL As String, Ws As Worksheet)
Dim Rs As Object
Dim strConn As String
Dim i As Integer
strConn = "Provider=Microsoft.ACE.OLEDB.12.0;" & _
"Data Source=" & ThisWorkbook.FullName & ";" & _
"Extended Properties=Excel 12.0;"
Set Rs = CreateObject("ADODB.Recordset")
Rs.Open strSQL, strConn
If Not Rs.EOF Then
With Ws
.Range("a1").CurrentRegion.ClearContents
For i = 0 To Rs.Fields.Count - 1
.Cells(1, i + 1).Value = Rs.Fields(i).Name
Next
.Range("a" & 2).CopyFromRecordset Rs
.Columns.AutoFit
End With
End If
Rs.Close
Set Rs = Nothing
End Sub

Related

Extracting data from over a million records

I have an Excel file in which I have set up a connection with an Access database. In the Excel file I have a list of names in column A, and I want to search these names in the Access database and return back two fields from that database. I need to do this for around 200-300 names.
Here is my code:
N = Cells(Rows.Count, "A").End(xlUp).Row
Application.DisplayAlerts = False
strDB = ThisWorkbook.Path & "file.accdb"
Set objConnection = New ADODB.Connection
objConnection.Open "Provider=Microsoft.ACE.OLEDB.12.0; Data Source=" & strDB
For i = 2 To N
Dim rstTable As ADODB.Recordset
Set rstTable = New ADODB.Recordset
lookup = Range("A" & i).Value
strSQL = "SELECT NAME1,NAME2 FROM DATA WHERE [Field2]= """ & lookup & """;"
'Store query output
rstTable.Open Source:=strSQL, ActiveConnection:=objConnection
'Paste results to Transactions sheet
Worksheets("Sheet1").Range("B" & i).CopyFromRecordset rstTable
'Close the record set & connection
rstTable.Close
objConnection.Close
Next i
This works (kindof) but it takes an extremely long time and randomly crashes. Any ideas how to improve this?
Making sure there is a key on the lookup field will help. I would suggest making a copy of the workbook and test external data from Access or MS Query to see if that gives a performance gain over VBA.
When using MS Query or data from Access, you can modify the command text in the connection properties and use ? in the where clause to specify the parameter in the worksheet (so you don't lose that functionality).
I modified your SQL statement. Replace the Where [Field2] = "xxx" by Where [Field2] IN ("xxx", "yyy", "zzz").
N = Cells(Rows.Count, "A").End(xlUp).Row
Application.DisplayAlerts = False
strDB = ThisWorkbook.Path & "file.accdb"
Set objConnection = New ADODB.Connection
objConnection.Open "Provider=Microsoft.ACE.OLEDB.12.0; Data Source=" & strDB
For i = 2 To N
lookup = lookup & "'" & Range("A" & i).Value & "', "
Next i
lookup = left(lookup, len(lookup) - 2)
Dim rstTable As ADODB.Recordset
Set rstTable = New ADODB.Recordset
strSQL = "SELECT NAME1,NAME2 FROM DATA WHERE [Field2] IN (" & lookup & ");"
'Store query output
rstTable.Open Source:=strSQL, ActiveConnection:=objConnection
'Paste results to Transactions sheet
Worksheets("Sheet1").Range("B" & i).CopyFromRecordset rstTable
'Close the record set & connection
rstTable.Close
objConnection.Close
You close the connection after the first iteration, so your next iteration -- which does not have code to open the connection -- would fail. So you should move the objConnection.Close out of the loop.
But, even then, to execute the same kind of query over and over again, just with a different argument, can be done in one go, using the IN (...) syntax:
' Declare all your variables
Dim N As Long
Dim strDB As String
Dim objConnection As ADODB.Connection
Dim rstTable As ADODB.Recordset
Dim strSQL As String
N = Cells(Rows.Count, "A").End(xlUp).Row
Application.DisplayAlerts = False
strDB = ThisWorkbook.Path & "file.accdb"
Set objConnection = New ADODB.Connection
objConnection.Open "Provider=Microsoft.ACE.OLEDB.12.0; Data Source=" & strDB
' collect the values in comma-separated string
lookup = ""
For i = 2 To N
lookup = lookup & ",""" & Range("A" & i).Value & """"
Next i
' Chop off the first comma
lookup = Mid(lookup, 2)
' Perform a single query, but also select the Field2 value
Set rstTable = New ADODB.Recordset
strSQL = "SELECT Field2, NAME1,NAME2 FROM DATA WHERE [Field2] IN (" & lookup & ");"
' query output
rstTable.Open Source:=strSQL, ActiveConnection:=objConnection
' Retrieve values
While Not rstTable.EOF
lookup = rstTable.Fields(0).Value
' Locate in which row to put the result
For i = 2 To N
If lookup = Range("A" & i).Value Then
Range("B" & i).Value = rstTable.Fields(1).Value
Range("C" & i).Value = rstTable.Fields(2).Value
End If
Next i
rstTable.MoveNext
Loop
' Close the record set & connection
rstTable.Close
objConnection.Close
You can do what you described, but I think it's far more efficient to do this in Access itself. Just create a table with your names and do an Inner Join to the table you want to find 2 fields. Should take less than a minute, and probably less than 30 seconds.

Syntax error in dynamic SQL string

Please help to fix the following syntax error with Like statement. The query works with = but I need to use Like to search in the AAchange field. I think the problem is here "WHERE [AAchange] LIKE '" & "%" & _
but I'm not sure how to correct this syntax. Please see the code below:
Sub ColorNewVariant()
Dim PolicyNum As Variant
Dim bFound As Boolean
Dim cnn As ADODB.Connection 'dim the ADO collection class
Dim rs As ADODB.Recordset 'dim the ADO recordset class
Dim dbPath As String
Dim strSQL As String
Dim r As Range, cell As Range
Dim LastRow As Long
LastRow = ActiveSheet.Range("A" & Rows.Count).End(xlUp).Row
Application.EnableEvents = False
Set r = ThisWorkbook.ActiveSheet.Range("G3:G" & LastRow)
For Each cell In r
If cell.Value <> "" Then
PolicyNum = cell.Value
dbPath = PATH_MAIN & "\Report\MDL_IonTorrent.accdb"
Set cnn = New ADODB.Connection ' Initialise the collection class variable
'Connection class is equipped with a -method- Named Open
'--4 aguments-- ConnectionString, UserID, Password, Options
'ConnectionString formula--Key1=Value1;Key2=Value2;Key_n=Value_n;
cnn.Open "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & dbPath
strSQL = "SELECT [AAchange] " & _
"FROM [MDL_Table1] " & _
"WHERE [AAchange] LIKE '" & "%" & _
Replace(PolicyNum, """", """""", , , vbTextCompare) & _
""""
'Create the ADODB recordset object.
Set rs = New ADODB.Recordset 'assign memory to the recordset
'ConnectionString Open '--5 aguments--
'Source, ActiveConnection, CursorType, LockType, Options
rs.Open strSQL, cnn
bFound = Not rs.EOF
'Check if the recordset is empty.
'Close the recordet and the connection.
rs.Close
cnn.Close
'clear memory
Set rs = Nothing
Set cnn = Nothing
'Enable the screen.
If bFound Then
'MsgBox "Record exists."
Else
'MsgBox "Record not found."
'cell.Interior.ColorIndex = 8
cell.Interior.Color = RGB(255, 217, 218)
'cell.ClearComments
'cell.AddComment "New Variant"
'Fits shape around text
'cell.Comment.Shape.TextFrame.AutoSize = True
End If
End If
Next cell
Application.EnableEvents = True
End Sub
Change the quoting in your query's WHERE clause.
If you use single quotes to start and end the string value you build, you needn't bother with Replace() of double quotes within the PolicyNum value. That should make this task simpler and less confusing ...
strSQL = "SELECT [AAchange] " & _
"FROM [MDL_Table1] " & _
"WHERE [AAchange] LIKE '%" & PolicyNum & "'"
Debug.Print strSQL

Using inputs from Excel worksheet to run Access SQL query from vba

I have tables that are created each month to reflect that month's records. I have created vba code that runs a query in excel on multiple months to show changes, new adds, etc. However, I would like the user to be able to choose the two months they would like to compare from an excel drop down box. I am struggling to create dynamic SQL that can do this. Below is my attempted code
`Private Sub ADO_New()
Dim DBFullName As String
Dim Cnct As String, Src As String
Dim Connection As ADODB.Connection
Dim Recordset As ADODB.Recordset
Dim Col As Integer
Dim vCurrentMonth As Variant
Dim vPriorMonth As Variant
Dim wSummary As Worksheet
Set wSummary = Worksheets("Summary")
vCurrentMonth = wSummary.Range("Current_Month").Value
vPriorMonth = wSummary.Range("Prior_Month").Value
Worksheets("New").Cells.ClearContents
DBFullName = ThisWorkbook.Path & "\Guardian_CensusDB.accdb"
Set Connection = New ADODB.Connection
Cnct = "Provider=Microsoft.ACE.OLEDB.12.0;"
Cnct = Cnct & "Data Source=" & DBFullName & ";"
Connection.Open ConnectionString:=Cnct
Set Recordset = New ADODB.Recordset
With Recordset
Src = "SELECT * FROM [vCurrentMonth] LEFT JOIN [vPriorMonth] ON
[vCurrentMonth].[Plan Number] = [vPriorMonth].[Plan Number]" & _
"WHERE ((([vPriorMonth].[Plan Number]) Is Null))"
.Open Source:=Src, ActiveConnection:=Connection
For Col = 0 To Recordset.Fields.Count - 1
Sheets("New").Range("A1").Offset(0, Col).Value = _
Recordset.Fields(Col).Name
Next
Sheets("New").Range("A1").Offset(1, 0).CopyFromRecordset Recordset
End With
Set Recordset = Nothing
Connection.Close
Set Connection = Nothing
End Sub`
You need to concatenate the variables into your string:
Src = "SELECT * FROM [" & vCurrentMonth & "] LEFT JOIN [" & vPriorMonth & "] ON
[" & vCurrentMonth & "].[Plan Number] = [" & vPriorMonth & "].[Plan Number]" & _
"WHERE ((([" & vPriorMonth & "].[Plan Number]) Is Null))"

Excel vba - ADO inner join on data tables

I have two data tables in excel that I wish to join into a single set in my vba code. I have identified the ADO connector as the best way to do this, however using the query below, I get the following error
"Run time error -2147217904
No value given for one or more required parameters"
SELECT components.[name], InputData.Datatype
FROM [Rules$A5:F30] components
INNER JOIN [Rules$O5:R17] InputData ON components.[name] = InputData.[name]
WHERE components.RowId = 0 GROUP BY components.[name], InputData.Datatype
EDIT: The full code:
Dim cn As ADODB.Connection
Dim rs As ADODB.Recordset
Dim dataRows As Integer
strFile = ThisWorkbook.FullName
strCon = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & strFile _
& ";Extended Properties=""Excel 12.0;HDR=Yes;IMEX=1"";"
Set cn = CreateObject("ADODB.Connection")
Set rs = CreateObject("ADODB.Recordset")
cn.Open strCon
strsql = "SELECT components.[name], InputData.Datatype " _
+ " FROM [" + GetTableAddress("componentTable") _
+ "] components INNER JOIN [" + GetTableAddress("DataLocations") + "] InputData" _
+ " ON components.[name] = InputData.[name] " _
+ " WHERE components.RowId = " + CStr(RowId) + " GROUP BY components.[name], InputData.Datatype"
rs.Open strsql, cn
If Not rs.EOF Then
dataRows = rs.GetRows
and the GetTableAddress function
Private Function GetTableAddress(tableName)
Dim oSh As Worksheet
Dim oLo As ListObject
For Each oSh In ThisWorkbook.Worksheets
For Each oLo In oSh.ListObjects
If oLo.Name = tableName Then
GetTableAddress = Replace(oSh.ListObjects(tableName).Range.AddressLocal, "$", "")
GetTableAddress = oSh.Name + "$" + GetTableAddress
End If
Next
Next
End Function
If both data sets are in Excel, you should use vLookup to create the final table. It'll be easier for you and the benefit is that you can use syntax that you're already familiar with.
vLookup is essentially a table join. You can even use it with Application.WorksheetFunctions if you wish to do it that way.
Also, RecordSet.GetRows can return an array. You should probably use CInt(rs.GetString) if you're not expecting more than one value to be returned.

Tweaking a Excel Pivot Table to display a OrderNumber instead of a calculation?

Using Microsoft Excel 2007 (or 2002) is it possible to create pivot data like this?
Specifically I would like to know if I can display '01(Y 0)' as a non-calculated text value instead of just a SUM/COUNT/MAX/etc value.
With ADO
Dim cn As Object
Dim rs As Object
Dim strFile As String
Dim strCon As String
Dim strSQL As String
Dim s As String
Dim i As Integer, j As Integer
''This is not the best way to refer to the workbook
''you want, but it is very convenient for notes
''It is probably best to use the name of the workbook.
strFile = ActiveWorkbook.FullName
''Note that if HDR=No, F1,F2 etc are used for column names,
''if HDR=Yes, the names in the first row of the range
''can be used.
''This is the Jet 4 connection string, you can get more
''here : http://www.connectionstrings.com/excel
strCon = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & strFile _
& ";Extended Properties=""Excel 8.0;HDR=Yes;IMEX=1"";"
''Late binding, so no reference is needed
Set cn = CreateObject("ADODB.Connection")
Set rs = CreateObject("ADODB.Recordset")
cn.Open strCon
strSQL = "TRANSFORM First(t.[Order Number]) AS OrdNo " _
& "SELECT t.[Slot Number], t.Time " _
& "FROM [Sheet2$] t " _
& "GROUP BY t.[Slot Number], t.Time " _
& "PIVOT t.Company"
rs.Open strSQL, cn, 3, 3
''Pick a suitable empty worksheet for the results
With Worksheets("Sheet3")
For i = 1 To rs.Fields.Count
.Cells(1, i) = rs.Fields(i - 1).Name
Next
.Cells(2, 1).CopyFromRecordset rs
End With
''Tidy up
rs.Close
Set rs = Nothing
cn.Close
Set cn = Nothing