Access VBA IE getElementById "Object required" error 424

This is related to this question:
Scrape data from a table on a website without having to search for tags
This code worked up until a few days ago. The only change since then was moving to Windows 10, but that shouldn't affect it, should it?
Also, it seems that it scrapes the first record and then gives the error. However, if I click DEBUG, then STEP OUT, it works and goes on to the next record, where I again have to click DEBUG, STEP OUT, and so on. 81 times.
Other functions are called from it; they just scrape more data and put it in a table. I don't think they're the issue, but I can add them if needed.
The line that gives error 424 is:
If .Document.getElementById("middleContent_lbType").outerHTML Like "*General Acute Care Hospital*" Then
Here's the code:
Public Sub VisitPages()
DoCmd.RunSQL "DELETE FROM ScrapedFacs"
AutoID = 1
Dim ie As New InternetExplorer
'Set ie = New InternetExplorerMedium
With ie
.Visible = False
.navigate "http://healthapps.state.nj.us/facilities/acSetSearch.aspx?by=county"
While .Busy Or .ReadyState < 4: DoEvents: Wend
With .Document
.querySelector("#middleContent_cbType_1").Click
.querySelector("#middleContent_cbType_4").Click
.querySelector("#middleContent_btnGetList").Click
End With
While .Busy Or .ReadyState < 4: DoEvents: Wend
Dim list As Object, i As Long
Set list = .Document.querySelectorAll("#main_table [href*=doPostBack]")
For i = 0 To list.Length - 1
list.Item(i).Click
While .Busy Or .ReadyState < 4: DoEvents: Wend
If .Document.getElementById("middleContent_lbType").outerHTML Like "*General Acute Care Hospital*" Then
FacType = "General Acute Care Hospital"
ElseIf .Document.getElementById("middleContent_lbType").outerHTML Like "*Psychiatric Hospital*" Then
FacType = "Psychiatric Hospital"
End If
Address = Replace(Replace(Replace(.Document.getElementById("middleContent_lbAddress").outerHTML, "<span id=" & Chr(34) & "middleContent_lbAddress" & Chr(34) & ">", ""), "<br>", ", "), "</span>", "")
WriteTable .Document.getElementsByTagName("table")(3), .Document.getElementById("middleContent_lbName_county").innerText
'do stuff with new page
.Navigate2 .Document.URL '<== back to homepage
While .Busy Or .ReadyState < 4: DoEvents: Wend
Set list = .Document.querySelectorAll("#main_table [href*=doPostBack]") 'reset list (often required in these scenarios)
Next
' Stop '<== Delete me later
.Quit '<== Remember to quit application
End With
End Sub

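The error occurs because the code runs faster than the page loads, so getElementById returns Nothing and the call to .outerHTML fails.
Either add a sleep before the line (Sleep is the Windows API function; it must be declared in the module and takes milliseconds):
Sleep 1000 ' pause for one second
...Document.getElementById("middleContent_lbType")..
Or check whether a value was returned before using it:
Set obj = .Document.getElementById("middleContent_lbType")
If obj Is Nothing Then
Sleep 1000 ' element not there yet: wait, then try once more
Set obj = .Document.getElementById("middleContent_lbType")
End If
If obj.outerHTML Like "*General Acute Care Hospital*" Then
...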
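The second approach can be turned into a polling loop with a timeout instead of a single fixed pause. This is only a minimal sketch: GetElementWhenReady, its parameters, and the 100 ms / 10 s values are illustrative assumptions, not part of the original code. The Declare lines must sit at the top of the module.

```vba
#If VBA7 Then
    Private Declare PtrSafe Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
#Else
    Private Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
#End If

' Hypothetical helper: poll for an element by id until it exists or the timeout expires.
' Returns Nothing if the element never appears.
Private Function GetElementWhenReady(ByVal ie As Object, ByVal id As String, _
                                     Optional ByVal timeoutSec As Long = 10) As Object
    Dim el As Object
    Dim deadline As Date

    deadline = DateAdd("s", timeoutSec, Now)
    Do
        Set el = Nothing
        On Error Resume Next          ' .Document can be unavailable mid-navigation
        Set el = ie.Document.getElementById(id)
        On Error GoTo 0
        If Not el Is Nothing Then Exit Do
        DoEvents
        Sleep 100                     ' short pause between attempts
    Loop While Now < deadline

    Set GetElementWhenReady = el
End Function
```

Inside the With ie block it could be called as Set obj = GetElementWhenReady(ie, "middleContent_lbType"), followed by an If obj Is Nothing check before reading obj.outerHTML.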

Related

How to exit loop when scrollBy reaches the bottom of the web page?

I've written a script in VBA using IE to reach the bottom of a web page automatically. The web page displays its content in such a way that more products become visible as I scroll down. I have used .scrollBy within my script to handle the lazy load.
I don't understand how to stop scrolling when there are no more new products to load. I've used .scrollBy within a Do loop. How can I exit my loop when the scrolling is done and the browser reaches the bottom of the web page? Thanks in advance for any solution.
This is what I've tried so far:
Sub HandleLazyload()
Const URL As String = "https://www.inc.com/profile/sumup-payments-limited"
Dim IE As New InternetExplorer, HTML As HTMLDocument, post As Object
With IE
.Visible = True
.navigate URL
While .Busy = True Or .readyState < 4: DoEvents: Wend
Set HTML = .document
End With
Do
HTML.parentWindow.scrollBy 0, 99999
Application.Wait Now + TimeValue("00:00:03")
Set post = HTML.getElementsByTagName("article")
Loop ''I wish to break out of this loop when all the scrolling is done
IE.Quit
End Sub
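Try the following, where I use the rank to determine when to exit the loop: once item runs past the last rank that has been loaded, the lookup raises an error and the error handler ends the loop.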
Option Explicit
Public Sub HandleLazyload()
Const URL As String = "https://www.inc.com/profile/sumup-payments-limited"
Dim IE As New InternetExplorer, HTML As HTMLDocument
With IE
.Visible = True
.navigate URL
While .Busy = True Or .readyState < 4: DoEvents: Wend
Set HTML = .document
End With
Dim rank As Long, item As Long
item = 1
Do While Err.Number = 0
HTML.parentWindow.scrollBy 0, 99999
Application.Wait Now + TimeSerial(0, 0, 1)
On Error GoTo errhand
rank = Split(HTML.querySelectorAll(".rank dt ~ dd")(item).innerText, "#")(1)
item = item + 1
Loop
errhand:
Err.Clear
Debug.Print "Stopped at rank " & rank
'Your other code
'IE.Quit
End Sub
Notes:
CSS selector:
In case you want to know more about the CSS selector:
The selector below targets every dd element that follows a sibling dt element inside an element with the class rank (~ is the general sibling combinator).
HTML.querySelectorAll(".rank dt ~ dd")(item)
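Another possible exit condition, not taken from the answer above, is to stop once a scroll no longer adds any new article elements. The sketch below assumes each product is rendered as an article element (as the question's own code suggests) and that three seconds is enough for a batch to load; ScrollUntilNoNewArticles is a hypothetical name. It uses late binding, so it needs no library references, but Application.Wait is Excel-specific.

```vba
Public Sub ScrollUntilNoNewArticles()
    Const URL As String = "https://www.inc.com/profile/sumup-payments-limited"
    Dim IE As Object, HTML As Object
    Dim previousCount As Long, currentCount As Long

    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = True
    IE.navigate URL
    While IE.Busy Or IE.readyState < 4: DoEvents: Wend
    Set HTML = IE.document

    previousCount = -1
    currentCount = HTML.getElementsByTagName("article").Length
    ' Keep scrolling for as long as the previous scroll produced more articles.
    Do While currentCount > previousCount
        previousCount = currentCount
        HTML.parentWindow.scrollBy 0, 99999
        Application.Wait Now + TimeSerial(0, 0, 3)   ' assumed load time per batch
        currentCount = HTML.getElementsByTagName("article").Length
    Loop

    Debug.Print "Stopped after loading " & currentCount & " article elements"
    IE.Quit
End Sub
```

If a batch takes longer than the wait to arrive, the loop stops early, so the wait may need tuning for a slow connection.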

Need help to convert Internet Explorer based web scraping to XMLHTTP

I am trying to speed up some intranet web scraping as well as make it more reliable. I am just learning how to implement XMLHTTP and I need some advice on converting my code from IE-based scraping to XMLHTTP.
I have 2 subs in my module: one loads and navigates the intranet site (GetWebTable) and the other parses the data (GetOneTable) to return a table in Excel. The subs are as follows:
Sub GetWebTable(sAccountNum As String)
On Error Resume Next
Dim objIE As Object
Dim strBuffer As String
Dim thisCol As Integer
Dim iAcctCount As Integer
Dim iCounter As Integer
Dim iNextCounter As Integer
Dim iAcctCell As Integer
Dim thisColCustInfo As Integer
Dim iErrorCounter As Integer
If InStr(1, sAccountNum, "-") <> 0 Then
sAccountNum = Replace(sAccountNum, "-", "")
End If
If InStr(1, sAccountNum, " ") <> 0 Then
sAccountNum = Replace(sAccountNum, " ", "")
End If
iErrorCounter = 1
TRY_AGAIN:
'Spawn Internet Explorer
Set objIE = GetObject("new:{D5E8041D-920F-45e9-B8FB-XXXXXXX}")
DoEvents
With objIE
.Visible = False
.Navigate "http://intranetsite.aspx"
While .busy = True Or .readystate <> 4: DoEvents: Wend
While .Document.readyState <> "complete": DoEvents: Wend
.Document.getElementById("ctl00_MainContentRegion_tAcct").Value = sAccountNum
While .busy = True Or .readyState <> 4: DoEvents: Wend
While .Document.readyState <> "complete": DoEvents: Wend
.Document.getElementById("ctl00_MainContentRegion_btnRunReport").Click
While .busy = True Or .readyState <> 4: DoEvents: Wend
While .Document.readyState <> "complete": DoEvents: Wend
End With
thisCol = 53
thisColCustInfo = 53
GetOneTable objIE.Document, 9, thisCol
'Cleanup:
objIE.Quit
Set objIE = Nothing
GetWebTable_Error:
Select Case Err.Number
Case 0
Case Else
Debug.Print Err.Number, Err.Description
iErrorCounter = iErrorCounter + 1
objIE.Quit
Set objIE = Nothing
If iErrorCounter > 4 Then On Error Resume Next
GoTo TRY_AGAIN
'Stop
End Select
End Sub
Sub GetOneTable(varWebPageDoc, varTableNum, varColInsert)
Dim varDocElement As Object ' the elements of the document
Dim varDocTable As Object ' the table required
Dim varDocRow As Object ' the rows of the table
Dim varDocCell As Object ' the cells of the rows.
Dim Rng As Range
Dim iCellCount As Long
Dim iElemCount As Long
Dim iTableCount As Long
Dim iRowCount As Long
Dim iRowCounter As Integer
Dim bTableEndFlag As Boolean
bTableEndFlag = False
For Each varDocElement In varWebPageDoc.all
If varDocElement.nodeName = "TABLE" Then
iElemCount = iElemCount + 1
End If
If iElemCount = varTableNum Then
Set varDocTable = varDocElement
iTableCount = iTableCount + 1
iRowCount = iRowCount + 1
Set Rng = Worksheets("Sheet1").Cells(2, varColInsert)
For Each varDocRow In varDocTable.Rows
For Each varDocCell In varDocRow.Cells
If Left(varDocCell.innerText, 9) = "Total for" Then
bTableEndFlag = True
Exit For
End If
Rng.Value = varDocCell.innerText
Set Rng = Rng.Offset(, 1)
iCellCount = iCellCount + 1
Next varDocCell
iRowCount = iRowCount + 1
Set Rng = Rng.Offset(1, -iCellCount)
iCellCount = 0
Next varDocRow
Exit For
End If
Next varDocElement
Set varDocElement = Nothing
Set varDocTable = Nothing
Set varDocRow = Nothing
Set varDocCell = Nothing
Set Rng = Nothing
End Sub
Any thoughts?
HTML is not XML. XML is strictly enforced in terms of opening and closing tags, whilst HTML is famous for <br> tags without a closing </br>. You'd be very lucky if the HTML is XML compliant.
Anyway, if you want to use XMLHTTP for the HTTP request and still keep your IE-based web scraping code, then see this article http://exceldevelopmentplatform.blogspot.com/2018/01/vba-xmlhttp-request-xhr-does-not-parse.html It shows how to use XMLHTTP and then pass the response to MSHTML.
You can use MSHTML independently of IE; see the article "Use MSHTML to parse local HTML file without using Internet Explorer (Microsoft HTML Object Library)". If you read that, you will see that much of the code you write against the IE object model is in fact written against the MSHTML object model, and as such you can decouple and jettison IE. Enjoy!
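As a rough illustration of combining the two, the sketch below downloads a page with XMLHTTP and loads the response into a late-bound MSHTML document. FetchHtmlDocument is a hypothetical helper name, and a plain GET is assumed to be enough to retrieve the page.

```vba
' Hypothetical helper: download a page with XMLHTTP and parse it with MSHTML,
' without opening an Internet Explorer window.
Public Function FetchHtmlDocument(ByVal url As String) As Object
    Dim xhr As Object, doc As Object

    Set xhr = CreateObject("MSXML2.XMLHTTP.6.0")
    xhr.Open "GET", url, False              ' False = synchronous request
    xhr.send
    If xhr.Status <> 200 Then
        Err.Raise vbObjectError + 513, "FetchHtmlDocument", _
                  "HTTP status " & xhr.Status & " for " & url
    End If

    Set doc = CreateObject("htmlfile")      ' late-bound MSHTML document
    doc.body.innerHTML = xhr.responseText   ' parse the markup into a DOM
    Set FetchHtmlDocument = doc
End Function
```

The returned object exposes members such as .all and .getElementsByTagName, so it could be passed to GetOneTable in place of objIE.Document. It only contains the page as served; entering the account number and clicking the report button are not reproduced here, so the table index may differ.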
EDIT1: Don't forget you can ask your company's IT staff
You say it is an intranet site which implies internal to your company, you could ask the programmers who are responsible for that system for a direct API guide.
EDIT2: Folding in feedback about how to mimic a browser...
To mimic the browser you need to figure out the traffic that button clicks generate...
To watch network traffic I recommend you switch to Chrome as your browser. Then, on this web page, right-click and choose the "Inspect" menu option; this opens the Chrome Developer Tools. In Developer Tools select the Network tab, then click on a link on this page and you will see the traffic that is generated.
So, if you want to go pure XMLHTTP and leave browsers behind, you won't have buttons available to click, but you can observe the network traffic that happens when a button is clicked in a browser and then mimic it in code.
For example, in your comment you ask how to enter an account number and click the button. I'm guessing that clicking the button results in an HTTP request to something like http://example.com/dowork/mypage.asp?accountnumber=1233456&otherParams=true so the account number would be buried in the query parameters. Once you have that URL you can put it in your XMLHTTP request.
One potential problem is that the system designers may have chosen to hide account numbers in the body of an HTTP POST because it is sensitive/confidential data. However, Chrome Developer Tools is very good and should still yield that information, though you may have to poke around.
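If the account number does travel in a POST body, the request seen in the Network tab can be replayed along these lines. This is a sketch under assumptions: PostAccountLookup and the field name accountnumber are placeholders, and the real URL, field names, and any hidden fields have to be copied from the captured request. ASP.NET WebForms pages (.aspx) usually also expect hidden fields such as __VIEWSTATE.

```vba
' Hypothetical sketch: replay a form submission observed in the browser's Network tab.
Public Function PostAccountLookup(ByVal url As String, ByVal accountNumber As String) As String
    Dim xhr As Object

    Set xhr = CreateObject("MSXML2.XMLHTTP.6.0")
    xhr.Open "POST", url, False             ' False = synchronous request
    xhr.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
    ' "accountnumber" is a placeholder; use the field names (and any hidden
    ' fields) shown in the captured request body.
    xhr.send "accountnumber=" & accountNumber

    PostAccountLookup = xhr.responseText    ' raw HTML of the response
End Function
```

The returned HTML can then be loaded into an MSHTML document and parsed with the existing table code.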

VBA IE change dropdown value

I tried the code below on the Scripture look-up page at the following URL. How can I change the drop-down value from WEB to RV1909?
Dim Doc As HTMLDocument
Set Doc = IEApp.document
TestV2 = ""
TestV3 = ""
TestV2 = Doc.getElementsByClassName("app-list text-list")(0).innerText
Debug.Print "4b of 5: " & TestV2
IEApp.Doc.getElementsByClassName("app-list text-list").selectedIndex = 1
IEApp.Doc.getElementsByClassName("app-list text-list").FireEvent ("onchange")
TestV3 = Doc.getElementsByClassName("app-list text-list")(0).innerText
Debug.Print "4c of 5: " & TestV3
I tried many approaches from other posts; the following does not work either:
IEApp.Doc.getElementsByClassName("app-list text-list")(0).innerHTML = "RV1909"
Here is the screenshot of Chrome Inspector:
In this case it's not enough to just change the innerText of the div.app-list.text-list element. As you can see, that element is a simple div, not even a ul. Its value has to be changed by the scripts that run on click events. So first you need to click the div to display the entire list, and then click the RV1909 item. The example below shows how that could be done:
Sub Test()
Dim oIE As Object
Dim oDiv As Object
Dim oNode As Object ' loop variable for the list items
Set oIE = CreateObject("InternetExplorer.Application")
With oIE
.Visible = True
.Navigate "http://ebible.org/study/"
Do While .readyState <> 4 Or .Busy
DoEvents
Loop
With .Document
Do While .readyState <> "complete"
DoEvents
Loop
Set oDiv = .getElementsByClassName("app-list text-list")(0)
oDiv.Click
Do While .readyState <> "complete"
DoEvents
Loop
For Each oNode In .getElementsByClassName("text-chooser-main")(0).getElementsByClassName("text-chooser-abbr")
If oNode.innerText = "RV1909" Then
oNode.Click
Do While oDiv.innerText <> "RV1909"
DoEvents
Loop
Exit For
End If
Next
End With
End With
End Sub
Change the HTML div element <div class="app-list text-list">WEB</div> to <div class="app-list text-list">RV1909</div>

Capturing the IE windows using For Each window shell application

I need to capture the newly opened window and navigate further in it. I am able to capture it using the code below:
With objIE
Application.Wait (Now + TimeValue("00:00:05"))
Set TBL = .document.getElementById("payersitecredential-table")
Set PRV = TBL.getElementsByTagName("A"): PRV.Item(1).Click
'once the above code runs, it opens up an another window and i need to set that window as object.
Application.Wait (Now + TimeValue("00:00:20"))
End With
For Each WINDW In CreateObject("Shell.Application").Windows
If WINDW.Name = "Internet Explorer" Then
If InStr(WINDW.LocationURL, "newmmis") <> 0 Then
Set objIE1 = WINDW
Exit For
ElseIf InStr(WINDW.LocationURL, "https://www.example.com") <> 0 Then
Set ClsWindw = WINDW
ClsWindw.Quit
GoTo ThEnD
End If
End If
Next
With objIE1
Do While .Busy Or .readyState <> 4 'this is the place i get error as object required
DoEvents
Loop
Application.Wait (Now + TimeValue("00:00:05"))
.document.getElementById("InquireStatus").Click
End With
I have set a wait time of 60 seconds for the URL to load, but sometimes the URL takes longer than that, and then the object cannot be set. Could you help me figure this out and suggest a solution?
Please see this in support of my comment
Sub ex()
Dim ie As SHDocVw.InternetExplorer
Dim wins As New SHDocVw.ShellWindows
For Each ie In wins
Debug.Print ie.LocationURL, ie.ReadyState, ie.Busy ' LocationURL also works for File Explorer windows, which have no Document.URL
Next ie
End Sub
EDIT
I would use the above like so: the function below finds the IE window by URL, waits until it is ready, and returns it. For example, use Set IE = GetMeReadyIE("http://stackoverflow.com/questions/tagged/vba").
Function GetMeReadyIE(strURL As String) As SHDocVw.InternetExplorer
Dim ie As SHDocVw.InternetExplorer
Dim wins As New SHDocVw.ShellWindows
For Each ie In wins
If ie.LocationURL = strURL Then ' LocationURL avoids an error on File Explorer windows
While ie.ReadyState <> READYSTATE_COMPLETE Or ie.Busy
DoEvents
Wend
Set GetMeReadyIE = ie
Exit For
End If
Next ie
End Function
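Because the new window may not exist yet when the search runs, one option is to repeat the search until the window shows up or a timeout expires, rather than relying on a fixed Application.Wait. The sketch below is an assumption-laden variant of the idea above: WaitForIEWindow is a hypothetical name, it matches on part of the URL (as the question does with "newmmis"), and it uses late binding.

```vba
' Hypothetical helper: keep scanning the shell windows until one whose URL contains
' urlPart appears, then wait for it to finish loading. Returns Nothing on timeout.
Public Function WaitForIEWindow(ByVal urlPart As String, _
                                Optional ByVal timeoutSec As Long = 60) As Object
    Dim shellApp As Object, win As Object
    Dim deadline As Date

    Set shellApp = CreateObject("Shell.Application")
    deadline = DateAdd("s", timeoutSec, Now)

    Do
        For Each win In shellApp.Windows
            If Not win Is Nothing Then      ' the collection can contain empty entries
                If InStr(1, win.LocationURL, urlPart, vbTextCompare) > 0 Then
                    ' Wait for the page to load, but never past the deadline.
                    Do While (win.Busy Or win.readyState <> 4) And Now < deadline
                        DoEvents
                    Loop
                    Set WaitForIEWindow = win
                    Exit Function
                End If
            End If
        Next win
        DoEvents
    Loop While Now < deadline
End Function
```

Usage would be Set objIE1 = WaitForIEWindow("newmmis", 120), followed by an If objIE1 Is Nothing check before entering the With objIE1 block, which is where the "object required" error is otherwise raised.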

VBA Object Required Error Trying To Get InnerText

I am trying to write code that will go to a website, put in data, submit it, and then return the answer to a cell in Excel. When I step through it, it works fine, but when I just run it, I get run-time error 424: Object Required.
I tried looking for a good answer, but I am not quite grasping how to fix this. Where is my issue? How do I correct it?
Sub Distance()
Dim IE As Object
' Create InternetExplorer Object
Set IE = CreateObject("InternetExplorer.Application")
' Make visible
IE.Visible = True
' Go to site
IE.Navigate "http://www.distance-cities.com/"
' Wait while IE loading...
Do Until IE.READYSTATE = 4
DoEvents
Loop
IE.Document.getelementbyId("from").Value = "Stillwater, OK"
IE.Document.getelementbyId("to").Value = "Hollis, OK"
IE.Document.forms(0).submit
Do Until IE.READYSTATE = 4
DoEvents
Loop
'*Below is where I get my error
Sheet1.Range("E5").Value = IE.Document.getelementbyId("routemi").InnerText
IE.Quit
End Sub
I apologize if this is a bit messy.
Thanks for your help!
Something like this (untested):
Dim el as object
'...
Set el = WaitForElement(IE.Document, "routemi", 1)
If Not el is Nothing Then
Sheet1.Range("E5").Value = el.innerText
Else
Msgbox "Element 'routemi' not found!"
Exit Sub
End if
Utility function to get an element by id, waiting until it appears:
Function WaitForElement(doc As Object, id As String, maxWaitSec As Double) As Object
Dim rv As Object, t
t = Timer
Do
Set rv = doc.getElementById(id)
DoEvents
Loop While rv Is Nothing And (Timer - t) < maxWaitSec
Set WaitForElement = rv
End Function
You need proper waits at each point (after .Navigate2 and .Click) to allow the page to load. I use CSS selectors to target ids, as modern browsers are optimized for CSS. I also updated the page navigation method to .Navigate2.
Option Explicit
Public Sub Distance()
Dim ie As New InternetExplorer
With ie
.Visible = True
.Navigate2 "http://www.distance-cities.com/"
While .Busy Or .readyState < 4: DoEvents: Wend
With .document
.querySelector("#from").Value = "Stillwater, OK"
.querySelector("#to").Value = "Hollis, OK"
.querySelector("[type=submit]").Click
End With
While .Busy Or .readyState < 4: DoEvents: Wend
Debug.Print .document.querySelector("#routemi").innerText
.Quit
End With
End Sub