Greyhound data import to Excel macro formula - VBA

As part of a research project I need to extract as much data as possible from a webpage. The problem is that to access each table I have to follow lots of links, and I can't get that to work automatically.
The site is greyhound-data.com. As an example, I want to extract all the racing stats for every dog that raced in Swindon between 1st Jan 2017 and 28th Feb 2018. When I put that into the search engine I get 57236 races in a table. For each race I have to follow the link on the name of the race.
http://www.greyhound-data.com/d?racename=&country=13000&startmonth=3&endmonth=2&startdate=2017&enddate=2018&maxdist=unlimitied&class=any&order=dateD&x=2
My biggest problem is that I don't know how to follow the various links, and I don't know how to loop multiple times, once for each of the races in the original list.
I have created this simple macro:
Sub GetData()
Dim IE As Object
Dim doc As Object
Dim strURL As String
Dim I As Integer
For I = 1 To 9
strURL = "http://www.greyhound-data.com/d?racename=&country=13000&startmonth=3&endmonth=2&startdate=2017&enddate=2018&maxdist=unlimitied&class=any&order=dateD&x=" + Trim(Str(I))
Set IE = CreateObject("InternetExplorer.Application")
With IE
.navigate strURL
Do Until .ReadyState = 4: DoEvents: Loop
Do While .Busy: DoEvents: Loop
Set doc = IE.Document
GetAllTables doc
.Quit
End With
Next I
End Sub
Sub GetAllTables(doc As Object)
Dim ws As Worksheet
Dim rng As Range
Dim tbl As Object
Dim rw As Object
Dim cl As Object
Dim tabno As Long
Dim nextrow As Long
Dim I As Long
Set ws = Worksheets.Add
For Each tbl In doc.getElementsByTagName("TABLE")
tabno = tabno + 1
nextrow = nextrow + 1
Set rng = ws.Range("B" & nextrow)
rng.Offset(, -1) = "Table " & tabno
For Each rw In tbl.Rows
For Each cl In rw.Cells
rng.Value = cl.outerText
Set rng = rng.Offset(, 1)
I = I + 1
Next cl
nextrow = nextrow + 1
Set rng = rng.Offset(1, -I)
I = 0
Next rw
Next tbl
End Sub
It retrieves all the race data from the URL automatically, but I cannot make the next step. On each page there is a "name of the race" column, and I need to get all the data on the page linked from each row, because I need the information for 1st, 2nd and 3rd place.
Thanks for your time, I know it's a bit garbled!
My new code after the changes looks like this:
Sub GetData()
Dim IE As Object
Dim doc As Object
Dim strURL As String
Dim I As Integer
For I = 1 To 9
strURL = "http://www.greyhound-data.com/d?racename=&country=13000&startmonth=3&endmonth=2&startdate=2017&enddate=2018&maxdist=unlimitied&class=any&order=dateD&x=" + Trim(Str(I))
Set IE = CreateObject("InternetExplorer.Application")
With IE
.navigate strURL
Do Until .ReadyState = 4: DoEvents: Loop
Do While .Busy: DoEvents: Loop
Set doc = IE.Document
GetAllTables doc
.Quit
End With
Next I
End Sub
Sub GetAllTables(doc As Object)
Dim ws As Worksheet
Dim rng As Range
Dim tbl As Object
Dim rw As Object
Dim cl As Object
Dim tabno As Long
Dim nextrow As Long
Dim I As Long
Dim ThisLink As Object 'variable for <a> tags
Set ws = Worksheets.Add
For Each tbl In doc.getElementsByTagName("TABLE")
tabno = tabno + 1
nextrow = nextrow + 1
Set rng = ws.Range("B" & nextrow)
rng.Offset(, -1) = "Table " & tabno
For Each rw In tbl.Rows
For Each cl In rw.Cells
rng.Value = cl.outerText
Set rng = rng.Offset(, 1)
I = I + 1
Next cl
nextrow = nextrow + 1
Set rng = rng.Offset(1, -I)
I = 0
Next rw
Next tbl
I = Range("B" & Rows.Count).End(xlUp).Row 'last row with data
Do While Cells(I, 1).Value = "" 'will loop until first not blank found in column A (starting from last row of data, from end to start)
For Each ThisLink In doc.getElementsByTagName("a") 'we check all <a> tags
If ThisLink.innerText = Cells(I, 2).Value Then Cells(I, 1).Value = ThisLink.href 'If the innertext is the name of the race, in column A we add link
Next ThisLink
I = I - 1 'we decrease row position
Loop
End Sub
But it returns an empty table, as shown at this link: https://imageshack.us/i/poC4yhEZp

This code, after it has retrieved all your data, checks every race from the end of the list to the start and adds the related link for each race in column A.
Sub GetAllTables(doc As Object)
Dim ws As Worksheet
Dim rng As Range
Dim tbl As Object
Dim rw As Object
Dim cl As Object
Dim tabno As Long
Dim nextrow As Long
Dim I As Long
Dim ThisLink As Object 'variable for <a> tags
Set ws = Worksheets.Add
For Each tbl In doc.getElementsByTagName("TABLE")
tabno = tabno + 1
nextrow = nextrow + 1
Set rng = ws.Range("B" & nextrow)
rng.Offset(, -1) = "Table " & tabno
For Each rw In tbl.Rows
For Each cl In rw.Cells
rng.Value = cl.outerText
Set rng = rng.Offset(, 1)
I = I + 1
Next cl
nextrow = nextrow + 1
Set rng = rng.Offset(1, -I)
I = 0
Next rw
Next tbl
I = Range("B" & Rows.Count).End(xlUp).Row 'last row with data
Do While Cells(I, 1).Value = "" 'will loop until first not blank found in column A (starting from last row of data, from end to start)
For Each ThisLink In doc.getElementsByTagName("a") 'we check all <a> tags
If ThisLink.innerText = Cells(I, 2).Value Then Cells(I, 1).Value = ThisLink.href 'If the innertext is the name of the race, in column A we add link
Next ThisLink
I = I - 1 'we decrease row position
Loop
End Sub
In HTML all <a> tags look like this:
<a href="...">Henlow 26 Feb 2018 HT 5</a>
The href attribute contains the link related to the text between <a> and </a>. You can get it with a.href in VBA.
To get the text between <a> and </a> you can use a.innerText.
What I did is just a simple loop that checks every <a> tag. If the innerText matches the value in the cell, then I take the href attribute.
This code will get you all the links you want in your question. Just adapt the code to your needs (I pasted them in column A, but maybe you want to do something else with them).
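One possible adaptation, sketched under assumptions: instead of rescanning every <a> tag for each worksheet row, build a lookup from link text to href once and then fill column A in a single pass. The names BuildLinkLookup and FillLinks are hypothetical, and the sketch assumes the race names sit in column B, as in GetAllTables above.

```vba
' Hypothetical helper: maps the visible text of each <a> tag to its href.
Function BuildLinkLookup(doc As Object) As Object
    Dim links As Object
    Dim a As Object
    Dim linkText As String

    Set links = CreateObject("Scripting.Dictionary")
    For Each a In doc.getElementsByTagName("a")
        linkText = Trim(a.innerText & "")
        ' Skip empty texts and keep only the first link seen for a given text.
        If linkText <> "" Then
            If Not links.Exists(linkText) Then links.Add linkText, a.href & ""
        End If
    Next a
    Set BuildLinkLookup = links
End Function

' Hypothetical helper: writes the matching href into column A for each row
' whose column B text matches a link text and whose column A is still empty.
Sub FillLinks(ws As Worksheet, doc As Object)
    Dim links As Object
    Dim r As Long, lastRow As Long
    Dim cellText As String

    Set links = BuildLinkLookup(doc)
    lastRow = ws.Cells(ws.Rows.Count, "B").End(xlUp).Row
    For r = 1 To lastRow
        If ws.Cells(r, 1).Value = "" Then
            cellText = Trim(CStr(ws.Cells(r, 2).Value))
            If links.Exists(cellText) Then ws.Cells(r, 1).Value = links(cellText)
        End If
    Next r
End Sub
```

Calling FillLinks ws, doc at the end of GetAllTables would stand in for the Do While loop. The dictionary comparison is case-sensitive, so link text and cell text have to match exactly after trimming.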
You need 2 references for this code to work;
Microsoft HTML Object Library
Microsoft Internet Controls
And this is the final result:

recursive tree parsing with vba

Given the following spreadsheet of data: https://ethercalc.org/q7n9zwbzym5y
I have the following code that parses this and derives a tree from the parent-child relationships in the sheet. Note that every column occurs twice because the first instance of each column is for another type of data; I am only concerned with the populated columns. This is the desired output from the sheet above:
Code:
Sub performanceSheet(someParams)
' Write to "Performance" sheet
Dim w1 As Worksheet, w2 As Worksheet, wsSearch As Worksheet, wsData As Worksheet
Dim num_rows
Dim parent As Range, parentName As String
Dim parentRange As Range, childrenRange As Range
Dim childCount As Long
Dim p As Variant
Dim f1 As Range, f2 As Range
currRow = 8
Dim dict As Object
Set dict = CreateObject("Scripting.Dictionary")
Set w1 = wbk.Sheets("PositionsDB")
Set w2 = wbk.Sheets("Performance")
num_rows = w1.Cells(Rows.Count, 1).End(xlUp).row
'If there's no parentName column, we can't continue.
If w1.Rows(1).Find("portfolioName") Is Nothing Then Exit Sub
'find first instance
Set f1 = w1.Rows(1).Find("portfolioName", lookat:=xlWhole)
If Not f1 Is Nothing Then
'find second instance
Set f2 = f1.Offset(0, 1).Resize(1, w1.Columns.Count - f1.Column).Find("portfolioName", lookat:=xlWhole)
If Not f2 Is Nothing Then
'set range based on f2
Set parentRange = w1.Range(f2.Offset(1, 0), _
w1.Cells(Rows.Count, f2.Column).End(xlUp))
End If
End If
'If there's no Root level, how do we know where to start?
If parentRange.Find("Main") Is Nothing Then Exit Sub
For Each parent In parentRange
If Not dict.Exists(parent.Value) Then
childCount = Application.WorksheetFunction.CountIf(parentRange, parent.Value)
Set childrenRange = parent.Offset(, 2).Resize(childCount, 1)
dict.Add parent.Value, Application.Transpose(Application.Transpose(childrenRange.Value))
End If
Next
' Recursive method to traverse our dictionary, beginning at Root element.
Call PerformanceProcessItem("", "Main", dict, w2, 9)
wbk.Sheets("Performance").Columns("A:F").AutoFit
End Sub
Private Sub PerformanceProcessItem(parentName As String, name As String, dict As Object, ws As Worksheet, row_num As Long, Optional indent As Long = 0)
Dim output As String, v
Dim w2 As Worksheet
'Debug.Print WorksheetFunction.Rept(" ", indent) & name
'Debug.Print parentName & name
'write to sheet
ws.Cells(row_num, 3).Value = name
row_num = row_num + 1
If Not dict.Exists(name) Then
'we're at a terminal element, a child with no children.
Exit Sub
Else
For Each v In dict(name)
' ## RECURSION ##
Call PerformanceProcessItem(name, CStr(v), dict, ws, row_num, indent + 2)
Next
End If
End Sub
However, when creating this tree, it gets stuck in an infinite loop of "India" entries: after recognizing "Cash" as the terminal element of India, rather than exiting that subtree it creates another India and continues until overflow. Is there a logic error in my code? Hours of debugging haven't helped, and any input on where my logic is flawed would be appreciated.
I am assuming that "Main" and "Cash" will always be there. If not, we will have to tweak the code a little. I have commented the code, so you should not have a problem understanding it, but if you do, simply ask. I wrote this code quickly, so I am sure it can be optimized :)
Option Explicit
Dim sB As String
Dim tmpAr As Variant
Sub Sample()
Dim col As New Collection
Dim s As String
Dim ws As Worksheet
Dim lRow As Long, i As Long, j As Long
Dim itm As Variant, vTemp As Variant
Set ws = Sheet1 '<~~ Change this to the relevant sheet
With ws
'~~> Get Last Row of Col AA
lRow = .Range("AA" & .Rows.Count).End(xlUp).Row
'~~> Store Range AA:AC in an array
tmpAr = .Range("AA2:AC" & lRow).Value
End With
'~~> Create a unique collection of portfolioName
For i = LBound(tmpAr) To UBound(tmpAr)
If tmpAr(i, 1) = "Main" Then
On Error Resume Next
col.Add tmpAr(i, 3), CStr(tmpAr(i, 3))
On Error GoTo 0
End If
Next i
'~~> Sort the collection
For i = 1 To col.Count - 1
For j = i + 1 To col.Count
If col(i) > col(j) Then
vTemp = col(j)
col.Remove j
col.Add vTemp, vTemp, i
End If
Next j
Next i
s = "Main"
For Each itm In col
sB = vbTab & itm
s = s & vbNewLine & sB
sB = ""
GetParentChild itm, 2
If Trim(sB) <> "" Then _
s = s & vbNewLine & sB
Next itm
s = s & vbNewLine & vbTab & "Cash"
Debug.Print s
End Sub
Private Sub GetParentChild(strg As Variant, n As Integer)
Dim sTabs As String
Dim j As Long, k As Long
For k = 1 To n
sTabs = sTabs & vbTab
Next k
For j = LBound(tmpAr) To UBound(tmpAr)
If Trim(tmpAr(j, 1)) = Trim(strg) And Trim(tmpAr(j, 1)) <> "Cash" Then
sB = sB & sTabs & tmpAr(j, 3) & vbNewLine
GetParentChild tmpAr(j, 3), n + 1
End If
Next j
End Sub
This is what I got when I ran it on the data that you provided.

Undefined object defined issue

I have made really good progress with my macros for getting data from different URLs. With the code block below I get an error like "Undefined Object":
Sub GetData()
Dim IE As Object, doc As Object
Dim strURL As String, myDate As Date
Set IE = CreateObject("InternetExplorer.Application")
With IE
For myDate = CDate("01-05-2017") To CDate("05-05-2017")
strURL = "https://www.ukdogracing.net/racecards/" & Format(myDate, "dd-mm-yyyy") & "/monmore" ' Trim(Str(I))
.navigate strURL
Do Until .ReadyState = 4: DoEvents: Loop
Do While .Busy: DoEvents: Loop
Set doc = IE.Document
GetAllTables doc
Next myDate
.Quit
End With
End Sub
Sub GetAllTables(doc As Object)
Dim ws As Worksheet
Dim rng As Range
Dim tbl As Object
Dim rw As Object
Dim cl As Object
Dim tabno As Long
Dim nextrow As Long
Dim I As Long
Dim ThisLink As Object 'variable for <a> tags
Set ws = Worksheets.Add
For Each tbl In doc.getElementsByTagName("TABLE")
tabno = tabno + 1
nextrow = nextrow + 1
Set rng = ws.Range("B" & nextrow)
rng.Offset(, -1) = "Table " & tabno
For Each rw In tbl.Rows
For Each cl In rw.Cells
rng.Value = cl.outerText
Set rng = rng.Offset(, 1)
myDate = myDate + 1
Next cl
nextrow = nextrow + 1
Set rng = rng.Offset(1, -myDate)
myDate = 0
Next rw
Next tbl
myDate = Range("B" & Rows.Count).End(xlUp).Row 'last row with data
Do While Cells(myDate, 1).Value = "" 'will loop until first not blank found in column A (starting from last row of data, from end to start)
For Each ThisLink In doc.getElementsByTagName("a") 'we check all <a> tags
If ThisLink.innerText = Cells(myDate, 2).Value Then Cells(myDate, 1).Value = ThisLink.href 'If the innertext is the name of the race, in column A we add link
Next ThisLink
myDate = myDate - 1 'we decrease row position
Loop
End Sub
The issue is in this line:
Do While Cells(myDate, 1).Value = ""
The aim of this line is to continue getting data from the links if the page returns nothing. What is my mistake? Could you please help me? It is the last part of my macro work. Thanks in advance.
I think the error you are experiencing may have to do with the value of myDate. It can't be zero, because Cells(0, 1) is not a valid cell.
If I do this:
Option Explicit
Sub TEST()
Dim myDate As Date
myDate = 1
Do While Cells(myDate, 1).Value = ""
myDate = myDate - 1
Loop
End Sub
I get the "Application-defined or object-defined" error as soon as myDate reaches 0. I avoid this with:
Option Explicit
Sub TEST()
Dim myDate As Date
myDate = 1
Do While Cells(myDate, 1).Value = ""
myDate = myDate - 1
If Int(myDate) = 0 Then Exit Do
Loop
End Sub
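A variation on the same guard, offered as a sketch rather than a drop-in fix: keep the row index in its own Long variable (the hypothetical r below) instead of reusing myDate, and test the lower bound before touching the cell. VBA's And does not short-circuit, so the bound check and the cell check are kept in separate statements.

```vba
Option Explicit

' Walks upward from the last used row in column B until it finds a
' non-blank cell in column A, never going above row 1.
Sub FindLastFilledRowInColumnA()
    Dim r As Long

    r = ActiveSheet.Cells(ActiveSheet.Rows.Count, "B").End(xlUp).Row
    Do While r >= 1
        ' Only reached while r is a valid row number.
        If ActiveSheet.Cells(r, 1).Value <> "" Then Exit Do
        r = r - 1
    Loop

    ' r = 0 here means column A was blank all the way to the top.
    Debug.Print "Stopped at row " & r
End Sub
```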

Excel VBA - Looking for Strings in "Text Block" , then look in the next

I couldn't find a solution to my problem on the web.
I hope you can help me.
I've imported a long text file with various information. It looks like this:
id 5
name node1
UPS_serial_number
WWNN 500507680350BD
status online
IO_group_id 0
IO_group_name io_grp0
partner_node_id 4
partner_node_name node2
config_node yes
UPS_unique_id
port_id 500507680456454
port_status active
port_speed 8Gb
port_id 500507680545644
port_status active
port_speed 8Gb
id 4
name node2
UPS_serial_number
WWNN 500507680200DDE8
status online
IO_group_id 0
IO_group_name io_grp0
partner_node_id 4
partner_node_name node1
config_node yes
UPS_unique_id
port_id 5005076803594BDE
port_status active
port_speed 8Gb
port_id 500507680235486F
port_status active
port_speed 8Gb
.
.
.
It's almost in the right format, like this:
[string || value]
I want to look in the first block and get the values for name, id, WWPN, then copy them to another worksheet.
Then look in the second block and get the same values (name, id, WWPN) and copy them.
Then the next block, and the next, and so on.
I have the following code:
Sub find_test()
Dim rng As Range
Dim rngCell As Variant
Dim LR As Long
Dim tRow
LR = Cells(1, 1).End(xlDown).Row
Set rng = Range("A1:A" & LR)
For Each rngCell In rng.Cells
tRow = rngCell.Row
If StrComp(rngCell.Value, "name") = 0 Then 'Node 1 Service IP
Worksheets("temp").Range("E16").Value = Worksheets("lsnodecanister").Range("B" & tRow).Value
End If
Next
End Sub
The text blocks are mostly separated by an empty row.
Do you have any idea?
I hope that was understandable.
Thank you very much,
Best regards,
Kalain
Something like:
Sub SO1()
Dim lngRow As Long
Dim lngLastRowOfSection As Long
Dim rngFind As Range
Dim strName As String
lngRow = 1
Do Until Cells(lngRow + 1, 1).Value = ""
lngLastRowOfSection = Cells(lngRow, 1).End(xlDown).Row
Set rngFind = Range(Cells(lngRow, 1), Cells(lngLastRowOfSection, 1)).Find("name")
If Not rngFind Is Nothing Then
strName = rngFind.Offset(0, 1).Value
Debug.Print strName
End If
lngRow = Cells(lngLastRowOfSection, 1).End(xlDown).Row
If lngRow >= Rows.Count Then Exit Do
Loop
End Sub
I might have misunderstood the question. I think you meant that each line of your data has the name and value separated by a space. I modified your subroutine to read all values in column A into an array, and then I split each entry into columns B and C.
Sub find_test()
Dim rng As Range
Dim LR As Long
Dim tRow As Long
Dim myArray() As Variant, arrayCounter As Long
Dim lilStringArray As Variant, i As Long
'
LR = ActiveSheet.Cells(ActiveSheet.Rows.Count, "A").End(xlUp).Row
ReDim myArray(1 To 1)
arrayCounter = 1
myArray = Range("A1:A" & LR)
tRow = 1
For i = LBound(myArray) To UBound(myArray)
If myArray(i, 1) <> "" Then
lilStringArray = Split(myArray(i, 1), " ")
Range("B" & tRow).Value = lilStringArray(0)
On Error Resume Next
Range("C" & tRow).Value = lilStringArray(1)
On Error GoTo 0
Else
Range("C" & tRow).Value = ""
End If
tRow = tRow + 1
Next i
End Sub
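Neither snippet above copies the values block by block, so here is a minimal sketch of that step under stated assumptions: the keys are in column A and the values in column B of the sheet "lsnodecanister" (as in the question's code), every block starts with an "id" row (as in the sample data), and the output layout on the sheet "temp" is invented for illustration. The sample data contains WWNN rather than WWPN, so the sketch reads WWNN. CollectBlocks is a hypothetical name.

```vba
' Writes one output row per block: id, name, WWNN.
Sub CollectBlocks()
    Dim src As Worksheet, dst As Worksheet
    Dim r As Long, lastRow As Long, outRow As Long
    Dim key As String

    Set src = Worksheets("lsnodecanister")   ' source sheet, as in the question
    Set dst = Worksheets("temp")             ' target sheet; layout below is assumed

    dst.Range("A1:C1").Value = Array("id", "name", "WWNN")
    dst.Columns(3).NumberFormat = "@"        ' store long IDs as text
    outRow = 1
    lastRow = src.Cells(src.Rows.Count, "A").End(xlUp).Row

    For r = 1 To lastRow
        key = Trim(CStr(src.Cells(r, 1).Value))
        Select Case key
            Case "id"      ' first key of every block: start a new output row
                outRow = outRow + 1
                dst.Cells(outRow, 1).Value = src.Cells(r, 2).Value
            Case "name"
                If outRow > 1 Then dst.Cells(outRow, 2).Value = src.Cells(r, 2).Value
            Case "WWNN"
                If outRow > 1 Then dst.Cells(outRow, 3).Value = src.Cells(r, 2).Value
        End Select
    Next r
End Sub
```

Keying on the "id" row rather than on blank rows avoids depending on the separators, which the question says are only mostly present. Keys such as port_id do not match "id" because Select Case compares the whole string.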

Copy rows between two strings based on search condition

I have to search column B for a certain string and return a specific range of cells for every occurrence of the string in the file. I have code that finds all occurrences of the string, but I have difficulty copying the specific range of cells between Path and Owner into a new sheet. The catch is that the number of rows between Path and Owner varies.
Excel structure
(including expected results for search string Kevin).
Macro
Sub FindString()
Dim intS As Integer
Dim rngC As Range
Dim strToFind As String, FirstAddress As String
Dim wSht As Worksheet
Application.ScreenUpdating = True
intS = 1
Set wSht = Worksheets("Search Results")
strToFind = Range("I3").Value 'This is where I obtain the string to be searched
With ActiveSheet.Range("B1:B999999")
Set rngC = .Find(what:=strToFind, LookAt:=xlPart)
If Not rngC Is Nothing Then
FirstAddress = rngC.Address
Do
(
'need help to find copy rows from column B based on values in column A
)
intS = intS + 1
Set rngC = .FindNext(rngC)
Loop While Not rngC Is Nothing And rngC.Address <> FirstAddress
End If
End With
Please help or guide me, as I'm a newbie to Excel.
This code will display the paths found (variable sPath). It is untested:
Sub FindString()
'Dim intS As Integer
Dim rngC As Range
Dim strToFind As String, FirstAddress As String
Dim wSht As Worksheet, lRowPath As Long, lRowOwner As Long, i As Long, sPath As String
'Application.ScreenUpdating = True
'intS = 1
Set wSht = Worksheets("Search Results")
strToFind = Range("I3").Value 'This is where I obtain the string to be searched
'With ActiveSheet.Range("B1:B999999")
With ActiveSheet.Range("B:B")
Set rngC = .Find(what:=strToFind, LookAt:=xlPart)
If Not rngC Is Nothing Then
FirstAddress = rngC.Address
Do
' Find the "Path:" row above the found cell. Note: running past row 1 is not handled (e.g. Cells(-1, 1))
i = -1
Do Until InStr(1, rngC.Offset(i, -1).Value, "Path", vbTextCompare) > 0
i = i - 1
Loop
lRowPath = rngC.Row + i
' Find the Owner row above the found cell
i = -1
Do Until InStr(1, rngC.Offset(i, -1).Value, "Owner", vbTextCompare) > 0
i = i - 1
Loop
lRowOwner = rngC.Row + i
'need help to find copy rows from column B based on values in column A
sPath = ""
For i = lRowPath To lRowOwner - 1
sPath = sPath & ActiveSheet.Cells(i, "B").Value ' <-- Update
Next
Debug.Print "Searching " & strToFind; " --> " & sPath
'intS = intS + 1
Set rngC = .Find(what:=strToFind, After:=rngC, LookAt:=xlPart)
Loop Until rngC.Address = FirstAddress
End If
End With
End Sub
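The code above notes that running off the top of the sheet is not handled. A minimal sketch of a bounded upward search, assuming the markers are in a known column; the function name FindRowAbove and its parameters are hypothetical:

```vba
' Returns the row of the nearest cell above startRow (in column col) whose
' text contains marker, or 0 if no such cell exists down to row 1.
Function FindRowAbove(ws As Worksheet, startRow As Long, col As Long, marker As String) As Long
    Dim r As Long

    For r = startRow - 1 To 1 Step -1
        If InStr(1, CStr(ws.Cells(r, col).Value), marker, vbTextCompare) > 0 Then
            FindRowAbove = r
            Exit Function
        End If
    Next r
    ' Not found: the function falls through and returns 0.
End Function
```

With this helper, lRowPath could be obtained as FindRowAbove(ActiveSheet, rngC.Row, 1, "Path"), and a result of 0 can be treated as "no Path row above this match" instead of raising an error.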
I suggest you load everything to memory first, then do your searches and manipulations.
You could use a user-defined type to store info about your paths:
Type PathPermissionsType
pth As String
owner As String
users As Dictionary
End Type
Note: to use Dictionary you need to go to Tools>References and set a checkmark next to Microsoft Scripting Runtime.
You can load all your info using something like this:
Function LoadPathPermissions() As PathPermissionsType()
Dim rngHeaders As Range
Dim rngData As Range
Dim iPath As Long
Dim nPath As Long
Dim iRow As Long
Dim nRow As Long
Dim vHeaders As Variant
Dim vData As Variant
Dim pathPermissions() As PathPermissionsType
Set rngHeaders = Range("A1:A12") 'or wherever
Set rngData = rngHeaders.Offset(0, 1)
'Load everything to arrays
vHeaders = rngHeaders.Value
vData = rngData.Value
nRow = UBound(vData, 1)
nPath = WorksheetFunction.CountIf(rngHeaders, "Path:")
ReDim pathPermissions(1 To nPath)
iRow = 1
'Look for first "Path:" header.
Do Until InStr(vHeaders(iRow, 1), "Path") <> 0
iRow = iRow + 1
Loop
'Found "Path:" header.
For iPath = 1 To nPath
With pathPermissions(iPath)
'Now look for "Owner:" header, adding to the path until it is found.
Do Until InStr(vHeaders(iRow, 1), "Owner") <> 0
.pth = .pth & vData(iRow, 1)
iRow = iRow + 1
Loop
'Found "Owner:" header.
.owner = vData(iRow, 1)
'"User:" header is on next row:
iRow = iRow + 1
'Now add users to list of users:
Set .users = New Dictionary
Do Until InStr(vHeaders(iRow, 1), "Path") <> 0
.users.Add vData(iRow, 1), vData(iRow, 1)
iRow = iRow + 1
If iRow > nRow Then Exit Do ' End of data.
Loop
End With
Next iPath
LoadPathPermissions = pathPermissions
End Function
Example usage:
Dim pathPermissions() As PathPermissionsType
pathPermissions = LoadPathPermissions()
Then to get an array containing the paths for a given user:
Function GetPathsForUser(ByVal user As String, pathPermissions() As PathPermissionsType) As String()
Dim iPath As Long
Dim iPathsWithPermission As Long
Dim nPathsWithPermission As Long
Dim pathsWithPermission() As String
For iPath = LBound(pathPermissions) To UBound(pathPermissions)
If pathPermissions(iPath).users.Exists(user) Then nPathsWithPermission = nPathsWithPermission + 1
Next iPath
ReDim pathsWithPermission(1 To nPathsWithPermission)
iPathsWithPermission = 0
For iPath = LBound(pathPermissions) To UBound(pathPermissions)
If pathPermissions(iPath).users.Exists(user) Then
iPathsWithPermission = iPathsWithPermission + 1
pathsWithPermission(iPathsWithPermission) = pathPermissions(iPath).pth
End If
Next iPath
GetPathsForUser = pathsWithPermission
End Function
Example usage:
Dim pathPermissions() As PathPermissionsType
Dim pathsWithPermission() As String
pathPermissions = LoadPathPermissions()
pathsWithPermission = GetPathsForUser("Kevin", pathPermissions)
Now pathsWithPermission is an array containing the paths for which Kevin is listed as a user. Note that I haven't dealt with edge cases, such as Kevin not being a user for any path (in which case the ReDim in GetPathsForUser fails). That is up to you.
Finally you can write the contents of that array to your sheet.
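A minimal sketch of that last step, assuming the results should go into column A of a given sheet starting at row 1; the name WritePathsToSheet and the target layout are hypothetical:

```vba
' Writes each element of paths into column A of ws, one per row.
' Assumes paths has been allocated (at least one element).
Sub WritePathsToSheet(paths() As String, ws As Worksheet)
    Dim i As Long

    For i = LBound(paths) To UBound(paths)
        ws.Cells(i - LBound(paths) + 1, 1).Value = paths(i)
    Next i
End Sub
```

For example, WritePathsToSheet pathsWithPermission, Worksheets("Search Results") would list Kevin's paths on the sheet named in the question.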

Fetched table data from a webpage not displaying first column of table

I need to fetch the price table from this page:
http://www.kieskeurig.nl/objectief/canon/ef_100mm_f2_usm/prijzen/bezorgen/167557#prijzen
So far I have developed this code to get the data
Sub TableExample()
Dim IE As Object
Dim doc As Object
Dim strURL As String
strURL = "http://www.kieskeurig.nl/objectief/canon/ef_100mm_f2_usm/prijzen/bezorgen/167557#prijzen"
' replace with URL of your choice
Set IE = CreateObject("InternetExplorer.Application")
With IE
'.Visible = True
.navigate strURL
Do Until .readyState = 4: DoEvents: Loop
Do While .Busy: DoEvents: Loop
Set doc = IE.document
GetAllTables doc
.Quit
End With
End Sub
Sub GetAllTables(doc As Object)
' get all the tables from a webpage document, doc, and put them in a new worksheet
Dim ws As Worksheet
Dim rng As Range
Dim tbl As Object
Dim rw As Object
Dim cl As Object
Dim tabno As Long
Dim nextrow As Long
Dim I As Long
Set ws = Sheets("Sheet1")
For Each tbl In doc.getElementsByTagName("TABLE")
tabno = tabno + 1
nextrow = nextrow + 1
Set rng = ws.Range("B" & nextrow)
rng.Offset(, -1) = "Table " & tabno
For Each rw In tbl.Rows
For Each cl In rw.Cells
rng.Value = cl.innerText
Set rng = rng.Offset(, 1)
I = I + 1
Next cl
nextrow = nextrow + 1
Set rng = rng.Offset(1, -I)
I = 0
Next rw
Next tbl
ws.Cells.ClearFormats
End Sub
This code works for me,
but the problem is that the 1st column, i.e. the supplier data, is not displayed in the fetched table.
Can anyone please help me with this?
Replace your GetAllTables subroutine with the following:
Sub GetAllTables(doc As Object)
' get all the tables from a webpage document, doc, and put them in a new worksheet
Dim ws As Worksheet
Dim rng As Range
Dim tbl As Object
Dim rw As Object
Dim cl As Object
Dim tabno As Long
Dim nextrow As Long
Dim I As Long
Dim colno As Long
Dim classColl As Object
Dim imgTgt As Object
Set ws = Sheets("Sheet1")
For Each tbl In doc.getElementsByTagName("TABLE")
tabno = tabno + 1
nextrow = nextrow + 1
Set rng = ws.Range("B" & nextrow)
rng.Offset(, -1) = "Table " & tabno
For Each rw In tbl.Rows
colno = 1
For Each cl In rw.Cells
If colno = 1 And nextrow > 1 Then
Set classColl = doc.getElementsByClassName("shopLogoX")
Set imgTgt = classColl(nextrow - 2).getElementsByTagName("img")
rng.Value = imgTgt(0).getAttribute("alt")
Else
rng.Value = cl.innerText
End If
Set rng = rng.Offset(, 1)
I = I + 1
colno = colno + 1
Next cl
nextrow = nextrow + 1
Set rng = rng.Offset(1, -I)
I = 0
Next rw
Next tbl
ws.Cells.ClearFormats
End Sub
The change is actually very small. We use colno to track which column of the row we are in, and check whether we are in the first cell. If we are in the first column and not in the first row (the header row), we build a collection of the elements with class shopLogoX. These contain the img tags that have the alt attribute we want.
Tried, tested, and working. Let us know if this helps.
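As an alternative sketch (a different technique from the one above, not the answer's method): rather than indexing into the shopLogoX collection by row number, look for an img inside the cell itself whenever the cell has no text. This assumes the logo image sits inside the first table cell, which has not been checked against the page. CellText is a hypothetical helper name.

```vba
' Returns the cell's text, or the alt text of its first <img> if the
' cell has no visible text of its own.
Function CellText(cl As Object) As String
    Dim imgs As Object

    CellText = Trim(cl.innerText & "")
    If CellText = "" Then
        Set imgs = cl.getElementsByTagName("img")
        ' & "" turns a missing (Null) alt attribute into an empty string.
        If imgs.Length > 0 Then CellText = imgs(0).getAttribute("alt") & ""
    End If
End Function
```

Inside the loop over rw.Cells, rng.Value = CellText(cl) would then replace the If colno = 1 branch, and no row arithmetic is needed.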