Scraping and Looping with a VBA Macro - vba

The code below has a bug that I am unable to fix.
The code successfully copies the first 50 records from the table and pastes them into my Excel sheet, but it does not advance past page 1 to the remaining pages and repeat the process for the other records.
I'd like the code to copy and paste all 111,582 records into my Excel sheet, not just copy the first 50 records over and over.
Here is the code I have so far, which works to copy the data table on page one:
Sub LoopTest()
Dim ie As Object
Dim i As Long
Dim strText As String
Dim doc As Object
Dim hTable As Object
Dim hBody As Object
Dim hTR As Object
Dim hTD As Object
Dim tb As Object
Dim bb As Object
Dim tr As Object
Dim td As Object
Dim y As Long, z As Long, wb As Excel.Workbook, ws As Excel.Worksheet
Set wb = Excel.ActiveWorkbook
Set ws = wb.ActiveSheet
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = True
y = 1 'Column A in Excel
z = 1 'Row 1 in Excel
variable = 0
Here:
ie.Navigate "website" & variable
Do While ie.Busy: DoEvents: Loop
Do While ie.ReadyState <> 4: DoEvents: Loop
Set doc = ie.Document
Set hTable = doc.getElementsByClassName("conBody conList")
For Each tb In hTable
Set hBody = tb.getElementsByTagName("tbody")
For Each bb In hBody
Set hTR = bb.getElementsByTagName("tr")
For Each tr In hTR
Set hTD = tr.getElementsByTagName("td")
y = 1 ' Resets back to column A
For Each td In hTD
ws.Cells(z, y).Value = td.innerText
y = y + 1
Next td
DoEvents
z = z + 1
Next tr
Exit For
Next bb
Exit For
Next tb
variable = variable + 1
GoTo Here:
End Sub
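One way to bound the loop, rather than jumping back with GoTo forever, is to compute the page count from the figures in the question (111,582 records, 50 per page) and stop early if a page repeats the previous one. This is only a sketch under assumptions: the procedure name LoopAllPages is hypothetical, "website" is the question's placeholder, and it is assumed that the number appended to the URL is a zero-based page index. If the site expects a record offset instead, the step would need to be 50 rather than 1.

```vba
Sub LoopAllPages()
    Const TOTAL_RECORDS As Long = 111582   'from the question
    Const PAGE_SIZE As Long = 50           'records per page, from the question
    Dim ie As Object, tables As Object, tr As Object, td As Object
    Dim ws As Worksheet
    Dim pageCount As Long, page As Long, z As Long, y As Long
    Dim pageText As String, prevPageText As String

    pageCount = (TOTAL_RECORDS + PAGE_SIZE - 1) \ PAGE_SIZE   'integer division, rounded up
    Set ws = ActiveSheet
    Set ie = CreateObject("InternetExplorer.Application")
    ie.Visible = True
    z = 1

    For page = 0 To pageCount - 1
        ie.navigate "website" & page       'placeholder URL; page index is an assumption
        Do While ie.Busy Or ie.readyState <> 4
            DoEvents
        Loop

        Set tables = ie.document.getElementsByClassName("conBody conList")
        If tables.Length = 0 Then Exit For 'no data table on this page

        'If the table text is identical to the previous page, the site ignored the parameter
        pageText = tables.Item(0).innerText
        If page > 0 And pageText = prevPageText Then
            MsgBox "Page " & page & " repeats the previous page; check the URL parameter."
            Exit For
        End If
        prevPageText = pageText

        For Each tr In tables.Item(0).getElementsByTagName("tr")
            y = 1
            For Each td In tr.getElementsByTagName("td")
                ws.Cells(z, y).Value = td.innerText
                y = y + 1
            Next td
            z = z + 1
        Next tr
    Next page

    ie.Quit
End Sub
```

The repeat check is there because the symptom described (the first 50 records copied continuously) is what you would see if the site ignores the value appended to the URL.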

Related

Extract Table from result of searched webpage using VBA

I need to extract a table from the result page of a website that provides a VIN decoder. I tried the following code many times but failed to extract anything. I modified code that someone else uploaded; it raises no error, but it extracts nothing. The code needs to read the VIN from Sheets(1) and append it to the end of the URL. Also, the VIN decoder result page contains so many tables that I couldn't specify the one I need.
Sub WebScrape()
Dim ie As Object, i As Long, strText As String
Dim doc As Object, hTable As Object, hBody As Object, hTR As Object, hTD As Object, hHead As Object
Dim tb As Object, bb As Object, tr As Object, td As Object, hth As Object, hh As Object
Dim y As Long, z As Long, wb As Excel.Workbook, ws As Excel.Worksheet
For i = 2 To 386
pagenum = Sheets(1).Cells(i, 2).Value
Set wb = Excel.ActiveWorkbook
Set ws = wb.ActiveSheet
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = False
y = 1 'Column A in Excel
z = 1 'Row 1 in Excel
ie.navigate "https://en.vindecoder.pl/" & pagenum
Do While ie.Busy: DoEvents: Loop
Do While ie.readyState <> 4: DoEvents: Loop
Set doc = ie.document
Set hTable = doc.getElementsByTagName("table")
For Each tb In hTable
'tTable > thead > tr > th
Set hHead = tb.getElementsByTagName("tbody")
For Each hh In hHead
Set hTR = hh.getElementsByTagName("tr")
For Each tr In hTR
Set hth = tr.getElementsByTagName("th")
y = 1 ' Resets back to column A
For Each th In hth
Debug.Print th.innerText
ws.Cells(z, y).Value = th.innerText
y = y + 1
Next th
DoEvents
Next tr
Exit For
Next hh
'th.innerText
Set hBody = tb.getElementsByTagName("tbody")
For Each bb In hBody
Set hTR = bb.getElementsByTagName("tr")
For Each tr In hTR
Set hTD = tr.getElementsByTagName("td")
y = 1 ' Resets back to column A
For Each td In hTD
Debug.Print td.innerText
z = 2
ws.Cells(z, y).Value = td.innerText
y = y + 1
Next td
DoEvents
z = z + 1
Next tr
Exit For
Next bb
'Exit For
Next tb
ie.Quit
Next
End Sub

VBA - Error shows like "Object variable or With block variable not set "

I get the error "Object variable or With block variable not set".
The variable 'sathish' should hold values such as "ABC21A". Please suggest how to declare this variable. Also, this code needs to run 1000 times to fetch data. I do not have permission to access the tool from outside the office, so I can't provide the link.
Please see the code below. When I run it, I receive the error at these lines:
(appIE.document.getElementById("__tab_tabFilter").Click appIE.document.getElementById("filter_ReferenceCode").Value = sathish)
Sub macro()
Dim appIE As Object
Dim objElement As Object
Dim Y As Long
Dim obj As Object
Dim r As Long, c As Long, t As Long
Dim objCollection As Object
Dim eRow As Long
' Create InternetExplorer Object
Set appIE = CreateObject("InternetExplorer.Application")
With appIE
appIE.Visible = False
appIE.navigate "website"
Do While appIE.Busy Or appIE.ReadyState <> 4
DoEvents
Loop
appIE.Visible = True
appIE.document.getElementById("Username").Value = ""
appIE.document.getElementById("Password").Value = ""
appIE.document.getElementById("btnSubmit").Click
Number = Range("A1", Range("A1").End(xlDown)).Rows.Count
For Y = 1 To Number
sathish = Cells(Y, 1).Value
appIE.document.getElementById("__tab_tabFilter").Click
appIE.document.getElementById("filter_ReferenceCode").Value = sathish
appIE.document.getElementById("filter_btnSetFilter").Click
'Set IE = Nothing
Set objCollection = appIE.document.getElementsByTagName("TABLE")
For t = 10 To (objCollection.Length - 1)
For r = 0 To (objCollection(t).Rows.Length - 1)
eRow = Sheet2.Cells(Rows.Count, 1).End(xlUp).Offset(1, 0).Row
For c = 0 To (objCollection(t).Rows(r).Cells.Length - 1)
ThisWorkbook.Worksheets(2).Cells(eRow, c + 1) = objCollection(t).Rows(r).Cells(c).innerText
Next c
Next r
If t = objCollection.Length - 1 Then
Cells(eRow + 1, 1) = ""
End If
Next t
Next Y
End With
' Set IE = Nothing
End Sub
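A common cause of this error with Internet Explorer automation is that getElementById returns Nothing because the page has not finished loading after the previous Click, so the following .Click or .Value call has no object to act on. The sketch below polls for the element before using it. It is an assumption that this is the cause here, since the site cannot be inspected; the names WaitForElement and SetReferenceFilter are hypothetical. It also shows one way to declare the reference code as a String, which suits values like "ABC21A".

```vba
'Polls until the element with the given id exists or the timeout expires.
'Returns Nothing on timeout. Timer resets at midnight, which this sketch ignores.
Function WaitForElement(ie As Object, id As String, _
                        Optional timeoutSeconds As Double = 10) As Object
    Dim deadline As Double
    Dim el As Object
    deadline = Timer + timeoutSeconds
    Do
        DoEvents
        Set el = Nothing
        On Error Resume Next            'the document may be unavailable mid-navigation
        Set el = ie.document.getElementById(id)
        On Error GoTo 0
        If Not el Is Nothing Then Exit Do
    Loop While Timer < deadline
    Set WaitForElement = el
End Function

'Applies one reference code to the filter; returns False if an element never appeared.
Function SetReferenceFilter(appIE As Object, refCode As String) As Boolean
    Dim el As Object

    Set el = WaitForElement(appIE, "__tab_tabFilter")
    If el Is Nothing Then Exit Function
    el.Click

    Set el = WaitForElement(appIE, "filter_ReferenceCode")
    If el Is Nothing Then Exit Function
    el.Value = refCode

    Set el = WaitForElement(appIE, "filter_btnSetFilter")
    If el Is Nothing Then Exit Function
    el.Click

    SetReferenceFilter = True
End Function
```

Inside the For Y loop, the three getElementById lines could then become something like: Dim sathish As String, then sathish = CStr(Cells(Y, 1).Value), then If Not SetReferenceFilter(appIE, sathish) Then Exit For.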

Anyone know how to get macro (Excel VBA) to stop when there is no next button

Does anyone know how to get a macro (Excel VBA) to stop when there is no next button present? It should scrape x pages until the next button is no longer present.
Any help? Loop Until e.Value <> "Next Results"
Sub Test()
Dim ie As Object
Dim i As Long
Dim strText As String
Dim doc As Object
Dim hTable As Object
Dim hBody As Object
Dim hTR As Object
Dim hTD As Object
Dim tb As Object
Dim bb As Object
Dim tr As Object
Dim td As Object
Dim y As Long, z As Long, wb As Excel.Workbook, ws As Excel.Worksheet
Set wb = Excel.ActiveWorkbook
Set ws = wb.ActiveSheet
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = True
y = 1 'Column A in Excel
z = 1 'Row 1 in Excel
variable = 0
Here:
ie.navigate "http://games.espn.com/ffl/tools/projections?&seasonTotals=true&seasonId=2016&slotCategoryId=0&startIndex=" & variable
Do While ie.Busy: DoEvents: Loop
Do While ie.ReadyState <> 4: DoEvents: Loop
Set doc = ie.document
Set hTable = doc.getElementsByClassName("playerTableTable tableBody")
For Each tb In hTable
Set hBody = tb.getElementsByTagName("tbody")
For Each bb In hBody
Set hTR = bb.getElementsByTagName("tr")
For Each tr In hTR
Set hTD = tr.getElementsByTagName("td")
y = 1 ' Resets back to column A
For Each td In hTD
ws.Cells(z, y).Value = td.innerText
y = y + 1
Next td
DoEvents
z = z + 1
Next tr
Exit For
Next bb
Exit For
Next tb
variable = variable + 40
GoTo Here:
End Sub
I read somewhere that adding the code below might help, though it has yet to work for me.
buttonFound = True
While buttonFound
Set allLinks = ie.getElementsByTagName("a")
buttonFound = False
For Each btn In allLinks
If btn.innerText = "Next"
buttonFound = True
Set btnNext = btn
Exit For
End If
Next btn
btn.Click
End Sub
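Try this:
' FindNextButton()
' Intended to replace the unconditional GoTo Here: at the end of the macro
Set allLinks = ie.document.getElementsByTagName("a") ' links belong to the document, not to ie itself
For Each btn In allLinks
If btn.innerText = "Next" Then
btn.Click
GoTo Here
End If
Next btn
' No "Next" link was found, so the macro falls through to End Sub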
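The same idea can be written without GoTo by wrapping the page scrape in a Do loop that ends when no "Next" link is found. This is a sketch, not a tested solution: the names FindNextLink and ScrapeUntilNoNext are hypothetical, and it is assumed that the link text on the page is exactly "Next" (the question's own attempt mentions "Next Results", so the text may need adjusting).

```vba
'Returns the first <a> element whose text is "Next", or Nothing if there is none.
Function FindNextLink(doc As Object) As Object
    Dim lnk As Object
    For Each lnk In doc.getElementsByTagName("a")
        If Trim$(lnk.innerText) = "Next" Then   'link text is an assumption
            Set FindNextLink = lnk
            Exit Function
        End If
    Next lnk
End Function

Sub ScrapeUntilNoNext()
    Const BASE_URL As String = "http://games.espn.com/ffl/tools/projections?&seasonTotals=true&seasonId=2016&slotCategoryId=0&startIndex="
    Dim ie As Object, doc As Object, tables As Object
    Dim tr As Object, td As Object
    Dim ws As Worksheet
    Dim startIndex As Long, z As Long, y As Long

    Set ws = ActiveSheet
    Set ie = CreateObject("InternetExplorer.Application")
    ie.Visible = True
    z = 1

    Do
        ie.navigate BASE_URL & startIndex
        Do While ie.Busy Or ie.readyState <> 4
            DoEvents
        Loop
        Set doc = ie.document

        Set tables = doc.getElementsByClassName("playerTableTable tableBody")
        If tables.Length = 0 Then Exit Do       'no data table on this page

        For Each tr In tables.Item(0).getElementsByTagName("tr")
            y = 1                               'back to column A for each row
            For Each td In tr.getElementsByTagName("td")
                ws.Cells(z, y).Value = td.innerText
                y = y + 1
            Next td
            z = z + 1
        Next tr

        startIndex = startIndex + 40            'same step as the question's code
    Loop Until FindNextLink(doc) Is Nothing     'stop when the page has no "Next" link

    ie.Quit
End Sub
```

The loop keeps navigating by startIndex, as the question does, and uses the "Next" link only as a stop signal. Clicking the link would also work, but it would need its own wait for the page to finish loading.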

VBA returns [object] when using Doc.getElementsByTagName. Can't locate correct TagName from browser

I'm having trouble locating the correct tag name from Chrome. When I run the following code all that is returned is "[object]". Does anyone have any suggestions to pull in the correct field?
Here's the website I'm attempting to pull information from. Specifically, I need the Census Tract, but at this point, if someone could show me how to return any information at all, I will find the correct field myself.
Here's my code, which currently sends the output to a MsgBox just to find the right field. From there I will re-enable the For statement and make the URL in the VBA dynamic. Note that I've also attempted to use getElementsByName, but made no progress there either:
Sub censusTract()
Dim sht As Worksheet
Dim lastRow As Long
Set sht = ActiveWorkbook.Sheets("Sheet1")
lastRow = sht.Cells(sht.Rows.Count, "A").End(xlUp).Row
'For i = 2 To lastRow
Dim IE As New InternetExplorer
'IE.Visible = True
IE.navigate "http://geocoding.geo.census.gov/geocoder/geographies/address?street=1308+Lapwing+Rd&city=Edmond&state=OK&zip=73003&benchmark=4&vintage=4"
Do
DoEvents
Loop Until IE.readyState = READYSTATE_COMPLETE
Dim Doc As HTMLDocument
Set Doc = IE.document
Dim sDD As String
'sDD = Doc.getElementsByTagName("br")(0)
sDD = Doc.getElementsByName("pl_gov_census_geo_geocoder_domain_AddressResult")(0)
MsgBox sDD
'IE.Quit
'sht.Cells(i, 41).Value = sDD
'Next i
End Sub
You are very close. All you need to do is pull the correct data from the object. You can use one of the following:
sDD = Doc.getElementsByName("pl_gov_census_geo_geocoder_domain_AddressResult").item(0).innerText
sDD = Doc.getElementsByName("pl_gov_census_geo_geocoder_domain_AddressResult").item(0).innerHTML
sDD = Doc.getElementsByName("pl_gov_census_geo_geocoder_domain_AddressResult").item(0).outerHTML
Right now it's just pulling the object; adding innerText, innerHTML or outerHTML should pull your result.
Once you have that pulled, you may want to split the results into an array so you can do what you need with each of the elements. The code below should get you started.
Sub censusTract()
Dim sht As Worksheet
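Dim lastRow As Long, v As Variant, block As Variant, x As Integer, i As Long
Set sht = ActiveWorkbook.Sheets("Sheet1")
lastRow = sht.Cells(sht.Rows.Count, "A").End(xlUp).Row
i = 2 'row to write to while the For loop below is commented out; Cells(0, x) would fail
'For i = 2 To lastRow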
Dim IE As New InternetExplorer
'IE.Visible = True
IE.navigate "http://geocoding.geo.census.gov/geocoder/geographies/address?street=1308+Lapwing+Rd&city=Edmond&state=OK&zip=73003&benchmark=4&vintage=4"
Do
DoEvents
Loop Until IE.readyState = READYSTATE_COMPLETE
Dim Doc As HTMLDocument
Set Doc = IE.document
Dim sDD As String
'sDD = Doc.getElementsByTagName("br")(0)
sDD = Doc.getElementsByName("pl_gov_census_geo_geocoder_domain_AddressResult").Item(0).innerText
block = Split(sDD, vbLf)
x = 2 'start column
For Each v In block
If v <> "" Then
Cells(i, x).Value = v
x = x + 1
End If
Next v
'IE.Quit
'sht.Cells(i, 41).Value = sDD
'Next i
End Sub
Let me know if you have any questions.

Web-scraping in Excel with variable url (url extension)

I am fairly new to VBA in Excel. I have been trying to find out how to scrape web data conditionally, based on the value of one cell ("Guid"), and have not found a way to make the procedure dynamic. Right now I can only get it to retrieve data for one specific cell and print it in another specified cell. I believe I am just missing some kind of loop variable? (Aside from that, there is probably a more correct way of writing the code.)
Sub ie_open()
Dim wb As Workbook
Dim ws As Worksheet
Dim TxtRng As Range
Dim Guid As Range
Dim ie As Object
Dim URL As String
URL = "https://url.com/userpage="
Set wb = ActiveWorkbook
Set ws = wb.Sheets("Detail Report - Individuals")
Set Guid = ws.Range("E2")
Set TxtRng = ws.Range("F2")
Set ie = CreateObject("INTERNETEXPLORER.APPLICATION")
ie.NAVIGATE (URL + Guid)
ie.Visible = True
While ie.ReadyState <> 4
DoEvents
Wend
TxtRng = ie.document.getelementbyid("lbl_Location").innertext
End Sub
Thank you in advance.
Turn on a reference to the Microsoft HTML Object Library (go to Tools > References). You can also turn on a reference to Microsoft Internet Controls so you can declare ie as an InternetExplorer object rather than just an Object, but it's not necessary. Then you can loop through each element like this:
Sub ie_open()
Dim wb As Workbook
Dim ws As Worksheet
Dim TxtRng As Range
Dim Guid As Range
Dim ie As Object
Dim URL As String
'ADDED THIS
Dim sl As IHTMLElement
Dim r As Long
r = 1 'first output row; VBA cannot initialize a variable in the Dim statement
URL = "https://url.com/userpage="
Set wb = ActiveWorkbook
Set ws = wb.Sheets("Detail Report - Individuals")
Set Guid = ws.Range("E2")
Set TxtRng = ws.Range("F2")
Set ie = CreateObject("INTERNETEXPLORER.APPLICATION")
ie.NAVIGATE (URL + Guid)
ie.Visible = True
While ie.ReadyState <> 4
DoEvents
Wend
For each sl in ie.document.all
ws.cells(r, 1).value = sl.innertext
r = r + 1
Next
'TxtRng = ie.document.getelementbyid("lbl_Location").innertext
End Sub
Edit: forgot to increment the r variable in the loop, and I think it should be IE.Document.All instead of just IE.Document when initializing the loop
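The question also asks for a loop over the GUID cells themselves, rather than over the elements of one page. A minimal sketch of that follows, assuming the GUIDs sit in column E from row 2 down and each result goes into column F of the same row (the layout implied by E2 and F2 in the question). The procedure name ie_loop_guids is hypothetical, and the URL is the question's placeholder.

```vba
Sub ie_loop_guids()
    Const URL As String = "https://url.com/userpage="   'placeholder from the question
    Dim ws As Worksheet
    Dim ie As Object
    Dim el As Object
    Dim lastRow As Long, r As Long

    Set ws = ActiveWorkbook.Sheets("Detail Report - Individuals")
    lastRow = ws.Cells(ws.Rows.Count, "E").End(xlUp).Row   'last used row in column E
    Set ie = CreateObject("InternetExplorer.Application")
    ie.Visible = True

    For r = 2 To lastRow
        If Len(ws.Cells(r, "E").Value) > 0 Then            'skip empty GUID cells
            ie.navigate URL & ws.Cells(r, "E").Value
            Do While ie.Busy Or ie.readyState <> 4
                DoEvents
            Loop
            'getElementById returns Nothing when the id is absent, so test before reading
            Set el = ie.document.getElementById("lbl_Location")
            If Not el Is Nothing Then ws.Cells(r, "F").Value = el.innerText
        End If
    Next r

    ie.Quit
End Sub
```

One browser instance is reused for every row, which avoids the cost of starting Internet Explorer once per GUID.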