Scraping Data from Website using vba doesn´t work - vba

I want to create a list of player names scraping a website. The Internet Explorer starts but I get an run time error "438" - Object does not support this property or method.
Structure of webpage
My coding is as follows:
Option Explicit
Sub Kickbase()
Dim IE As New SHDocVw.InternetExplorer
Dim HTMLdoc As MSHTML.HTMLDocument
Dim HTMLPlayers As MSHTML.IHTMLElementCollection
Dim HTMLPlayer As MSHTML.IHTMLElement
Dim i As Integer
Dim HTMLfirstName As Object
Dim firstName As String
IE.Visible = True
IE.Navigate "https://play.kickbase.com/transfermarkt/kaufen"
Do While IE.ReadyState <> READYSTATE_COMPLETE
Loop
Application.Wait (Now + TimeValue("0:00:10"))
Set HTMLdoc = IE.Document
Set HTMLPlayers = HTMLdoc.getElementsByClassName("players")
For i = 0 To HTMLPlayers(0).getElementsByClassName("firstName").Length - 1
Set HTMLfirstName = HTMLPlayers(0).getElementsByClassName("firstName")
If Not HTMLfirstName Is Nothing Then
firstName = Trim(HTMLfirstName.innerText)
Else
firstName = "no_value"
End If
Debug.Print firstName
Next i
End Sub
I have activated the following libraries:

Since it's not possible to test the website on my own, the code below might not be the best way to do it but it should work:
Sub Kickbase()
Dim IE As New SHDocVw.InternetExplorer
Dim HTMLdoc As MSHTML.HTMLDocument
Dim HTMLPlayers As Object
Dim i As Integer
Dim firstName As String
IE.Visible = True
IE.navigate "https://play.kickbase.com/transfermarkt/kaufen"
Do While IE.readyState <> READYSTATE_COMPLETE
DoEvents
Loop
Application.Wait (Now + TimeValue("0:00:10"))
Set HTMLdoc = IE.document
Set HTMLPlayers = HTMLdoc.getElementsByClassName("playerName")
For i = 0 To HTMLPlayers(0).getElementsByClassName("firstName").Length - 1
firstName = Trim$(HTMLPlayers(0).getElementsByClassName("firstName")(i).innerText)
If firstName = vbNullString Then firstName = "no_value"
Debug.Print firstName
Next i
'=== Optional depending on your use case, remember to close IE or else it will remain there ===
'IE.Quit
'Set IE = Nothing
End Sub

Related

Type Mismatch on one machine

I wrote some code to scrape data from a website. I've tested it on 5 difference machines with different versions of excel and it all works fine. But on the intended users machine we get type mismatch error.The code fails at the last line below.
Sub LogIn()
Dim ie As SHDocVw.InternetExplorer
Dim iDoc As MSHTML.HTMLDocument
Dim ele As MSHTML.IHTMLElement
Dim eles As MSHTML.IHTMLElementCollection
Dim tableSection As MSHTML.IHTMLElement
Dim tableRow As MSHTML.IHTMLElement
Dim tableCell As MSHTML.IHTMLElement
Dim smallCell As MSHTML.IHTMLElement
Dim iCol As Integer
Dim iRow As Integer
Dim iCounter As Integer
iRow = 0
Do
iRow = iRow + 1
Loop Until Cells(iRow, 5) = ""
Range(Cells(1, 5), Cells(iRow, 6)).ClearContents
Set ie = New InternetExplorer
ie.Visible = False
ie.navigate ("https://www.howdidido.com/")
Do While ie.readyState <> READYSTATE_COMPLETE
Loop
Set iDoc = ie.document
any help greatly appreciated.
I have tried the following code and it is working alright. Maybe it can help you (seems as two loops and doEvents are needed for the ready state completes).
Dim iDoc As MSHTML.HTMLDocument
Dim iCol As Integer
Dim iRow As Integer
Dim iCounter As Integer
Dim i As Long
Dim URL As String
Dim IE As Object
Dim objElement As Object
Dim objCollection As Object
Set IE = CreateObject("InternetExplorer.Application")
'Set IE.Visible = True to make IE visible, or False for IE to run in the background
IE.Visible = True
'Define URL
URL = "https://www.automateexcel.com/excel/"
'Navigate to URL
IE.Navigate URL
' Statusbar let's user know website is loading
Application.StatusBar = URL & " is loading. Please wait..."
' Wait while IE loading...
'IE ReadyState = 4 signifies the webpage has loaded (the first loop is set to avoid inadvertently skipping over the second loop)
Do While IE.ReadyState = 4: DoEvents: Loop 'Do While
Do Until IE.ReadyState = 4: DoEvents: Loop 'Do Until
'Webpage Loaded
Application.StatusBar = URL & " Loaded"
Set iDoc = IE.Document
'Unload IE
Set IE = Nothing
Set objElement = Nothing
Set objCollection = Nothing

'Busy' method of 'IUWebBrowser2' object failed

When I launch the site via my code, there is an error of type "method document of object iwebbrowser2 failed " at the level of my variable "oDoc"
Private Function CreerNavigateur(ByVal mails As String)
Dim IE As Object
Dim oDoc As Object
Dim Htable, maTable As Object
Dim text As String
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = False
IE.navigate "https://csrtool-ssl.sso.infra.ftgroup/csrtool_web/Bricks/pg/osuit/pages/identity/IdentityAccountAndUsers?type=emailAlias&value=" & mails & "&tab_main=AccountInfo"
While IE.Busy Or IE.ReadyState <> READYSTATE_COMPLETE: DoEvents: Wend
Set oDoc = IE.Document
Set Htable = oDoc.getElementsByTagName("div")(1)
' MsgBox Htable.innerhtml
Set maTable = Htable.getElementsByTagName("span")
'MsgBox maTable(0).href
'myData = maTable(0).innertext
'MsgBox (myData)
IE.Quit
'On libère les variables
Set IE = Nothing
Set oDoc = Nothing
End Function
thank you for helping me to see my mistake
formalizing my answer try maximized ie window and updated code will be
Private Function CreerNavigateur(ByVal mails As String)
Dim ie As SHDocVw.InternetExplorer
Dim oDoc As Object
Dim Htable, maTable As Object
Dim text As String
Set ie = New SHDocVw.InternetExplorer
ie.Visible = False
ie.navigate "https://csrtool-ssl.sso.infra.ftgroup/csrtool_web/Bricks/pg/osuit/pages/identity/IdentityAccountAndUsers?type=emailAlias&value=" & mails & "&tab_main=AccountInfo"
While ie.Busy Or ie.ReadyState <> READYSTATE_COMPLETE: DoEvents: Wend
Set oDoc = ie.Document
Set Htable = oDoc.getElementsByTagName("div")(1)
'MsgBox Htable.innerhtml
Set maTable = Htable.getElementsByTagName("span")
'MsgBox maTable(0).href
'myData = maTable(0).innertext
'MsgBox (myData)
ie.Quit
'On libère les variables
Set ie = Nothing
Set oDoc = Nothing
End Function

Data Scraping using VBA From website doesn’t download

I am trying to download some data from website which starts download upon clicking.
But this code is not working, Can anyone help.
Dim ie As InternetExplorer
Dim htmldoc As MSHTML.IHTMLDocument
Dim HTMLInput As MSHTML.IHTMLElementCollection
Dim HTMLAs As MSHTML.IHTMLElementCollection
Dim HTMLA As MSHTML.IHTMLElement
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = True
'Navigate to webpage
Dim ieURL As String: ieURL = "http://erldc.org/final-schedule.aspx"
ie.navigate ieURL
Do While ie.readyState <> READYSTATE_COMPLETE
Loop
Set htmldoc = ie.document
Set HTMLInput = htmldoc.getElementsByTagName("a")
For Each HTMLA In HTMLAs
Debug.Print HTMLA.getAttribute("classname"), HTMLA.getAttribute("href"), HTMLA.getAttribute("rel")
If HTMLA.getAttribute("href") = "javascript:__doPostBack('ctl00$ContentPlaceHolder1$Calendar1','6420')" Then
HTMLA.Click
Exit For
End If
Next HTMLA
End Sub
Try pausing the macro for a few seconds, maybe 5 seconds, after the Do While/Loop that checks for the ReadyState...
Dim sngFinish As Single
Dim intPauseTime As Integer
intPauseTime = 5 'in seconds
sngFinish = timer + intPauseTime
Do While timer < sngFinish
DoEvents
Loop
Also, I would suggest that you check the Busy state of Internet Explorer, in addition to the ReadyState, and add DoEvents...
Do While ie.Busy Or ie.readyState <> READYSTATE_COMPLETE
DoEvents
Loop
Hope this helps!

Scraper has stopped working, probably a small issue but cannot find the issue

This scraper should return the data suggesting how many properties show up in the search.
It was working until I opened it this morning, the class hasn't changed but for some reason, it will not return any data into the cell stated.
Sub ZPLA2()
Const READYSTATE_COMPLETE = 4
Dim j As Integer
Dim ie As InternetExplorer
Dim Doc As IHTMLDocument
Dim xcolElements As IHTMLElementCollection
Dim ell As IHTMLElement
Dim pn As Integer
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = True
ie.Navigate "http://www.zoopla.co.uk/for-sale/property/london/w1/west-end-mayfair-soho-marylebone-south/?beds_max=0&beds_min=0&include_retirement_homes=true&include_shared_ownership=true&new_homes=include&price_max=200000&price_min=50000&q=w1&radius=20&results_sort=newest_listings&search_source=refine"
' Do
'DoEvents
'Loop Until Not ie.Busy And ie.readyState = READYSTATE_COMPLETE
Application.Wait (Now + TimeValue("0:00:07"))
Set Doc = ie.Document
Set xcolElements = Doc.getElementsByClassName("split3l result-count")
For Each ell In xcolElements
Sheet2.Range("d2").Value = ell.innerText
On Error GoTo skip
Next
skip:
ie.Quit
Set el = Nothing
Set xcolElements = Nothing
Set Doc = Nothing
Set ie = Nothing
End Sub
Replace
Set xcolElements = Doc.getElementsByClassName("split3l result-count")
with:
Set xcolElements = Doc.getElementsByClassName("listing-results-utils-count")
Remember to use IE Developer Tools not Chrome etc. as you are in fact utilizing the IE browser in VBA in the code above.

VBA Script pull data from website

I want to pull the data from http://www.buyshedsdirect.co.uk/ to get the most recent prices of specific items.
I have an excel spreadsheet with the following:
|A | B
1 |Item |Price
2 |bfd/garden-structures/arches/premier-arches-pergola
and the VBA script:
Dim ie As New InternetExplorer
Dim item As String
item = Sheet1.Range("A2").Value
Dim doc As HTMLDocument
ie.Visible = True
ie.navigate "http://www.buyshedsdirect.co.uk/" & item
Do
DoEvents
Loop Until ie.readyState = READYSTATE_COMPLETE
Set doc = ie.document
On Error Resume Next
output = doc.getElementByClass("NowValue").innerText
Sheet1.Range("B2").Value = output
ie.Quit
End Sub
i am new to VBA scripting and have no idea why it isnt pulling the value form the class "NowValue"
Any help would be appreciated :)
The On Error Resume Next line is stopping an error message from being displayed. That error message would be that there is no method on HTMLDocument called "getElementByClass". You probably want "getElementsByClassName" instead and will have to handle the fact that this returns a collection rather than a single element. Code like this would work:
Option Explicit
Sub foo()
Dim ie As New InternetExplorer
Dim item As String
item = Sheet1.Range("A2").Value
Dim doc As HTMLDocument
ie.Visible = True
ie.navigate "http://www.buyshedsdirect.co.uk/" & item
Do
DoEvents
Loop Until ie.readyState = READYSTATE_COMPLETE
Set doc = ie.document
Dim results As IHTMLElementCollection
Dim result As IHTMLElement
Dim output As String
Set results = doc.getElementsByClassName("NowValue")
output = ""
For Each result In results
output = output & result.innerText
Next result
Sheet1.Range("B2").Value = output
ie.Quit
End Sub
You would then find that there are multiple elements with class "NowValue" on that page. It looks as though the one you want might be enclosed in a div called "VariantPrice" so this code should work:
Option Explicit
Sub bar()
Dim ie As New InternetExplorer
Dim item As String
item = Sheet1.Range("A2").Value
Dim doc As HTMLDocument
ie.Visible = True
ie.navigate "http://www.buyshedsdirect.co.uk/" & item
Do
DoEvents
Loop Until ie.readyState = READYSTATE_COMPLETE
Set doc = ie.document
Dim results As IHTMLElementCollection
Dim results2 As IHTMLElementCollection
Dim result As IHTMLElement
Dim result2 As IHTMLElement
Dim output As String
Set results = doc.getElementsByClassName("VariantPrice")
output = ""
For Each result In results
Set results2 = result.getElementsByClassName("NowValue")
For Each result2 In results2
output = output & result2.innerText
Next result2
Next result
Sheet1.Range("B2").Value = output
ie.Quit
End Sub
edit: as the code above works perfectly for me but fails to work for the question asker, it may be the case that they are using an older version of Internet Explorer which does not support getElementsByClassName. It may be the case that using querySelector will work instead. To be certain, go to this QuirksMode page to determine exactly what your browser supports.
New code using querySelector:
Option Explicit
Sub bar()
Dim ie As New InternetExplorer
Dim doc As HTMLDocument
Dim result As IHTMLElement
Dim result2 As IHTMLElement
Dim item As String
item = Sheet1.Range("A2").Value
ie.Visible = True
ie.navigate "http://www.buyshedsdirect.co.uk/" & item
Do
DoEvents
Loop Until ie.readyState = READYSTATE_COMPLETE
Set doc = ie.document
Set result = doc.querySelector(".VariantPrice")
Set result2 = result.querySelector(".NowValue")
Sheet1.Range("B2").Value = result2.innerText
ie.Quit
End Sub
further edit: to make the macro loop through all of the entries in column A, here are the relevant bits to add or change:
Option Explicit
Sub bar()
Dim ie As New InternetExplorer
Dim doc As HTMLDocument
Dim result As IHTMLElement
Dim result2 As IHTMLElement
Dim item As String
Dim lRow As Long
ie.Visible = True
lRow = 2
item = Sheet1.Range("A" & lRow).Value
Do Until item = ""
ie.navigate "http://www.buyshedsdirect.co.uk/" & item
Do
DoEvents
Loop Until ie.readyState = READYSTATE_COMPLETE
Set doc = ie.document
Set result = doc.querySelector(".VariantPrice")
Set result2 = result.querySelector(".NowValue")
Sheet1.Range("B" & lRow).Value = result2.innerText
lRow = lRow + 1
item = Sheet1.Range("A" & lRow).Value
Loop
ie.Quit
End Sub