Macro single step works when routine doesn't - vba

I have been running this macro and it comes up with a 424 "Object required" error, but the macro works and I get the expected result when I single-step through it with F8.
Sub FileUpload()
Dim IEexp As InternetExplorer
Set IEexp = CreateObject("InternetExplorer.Application")
IEexp.Visible = False
IEexp.navigate "https://www.google.co.uk/?gws_rd=ssl#q=lenti+a+contatto+colorate"
Do While IEexp.ReadyState <> 4: DoEvents: Loop
Dim inputElement As HTMLDivElement
Set inputElement = IEexp.Document.getElementById("brs")
MsgBox inputElement.textContent
IEexp.Quit
Set IEexp = Nothing
End Sub
The error comes up on the Set inputElement = IEexp.Document.getElementById("brs") line.

You’re checking the browser’s ReadyState, but on some modern web pages parts of the DOM are only added after the browser reports that it is ready.
IE automation in VBA is quite primitive, and it sounds like you’re trying to access a node in the DOM before it exists, despite your best efforts to wait until the browser is ready. In some cases the timing can be off by a matter of milliseconds, which would also explain why single-stepping works: stepping gives the page extra time to finish.
Your quickest fix here is to simply add Application.Wait() in your loop to cause an actual time delay. A more elegant option might be to introduce a check in your loop and exit the loop when the desired DOM object actually exists. If you do this, there’s a danger of ending up in an infinite loop and so I would always recommend setting a maximum number of increments as a backup.
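The second suggestion could be sketched roughly as follows. This is only a sketch: WaitForElementById, maxTries and the one-second pause are names and values chosen here for illustration, and Application.Wait assumes the code runs in Excel.

```vba
' Hypothetical helper: polls for an element by id and gives up after maxTries attempts.
' Returns Nothing if the element never appears.
Function WaitForElementById(ByVal ie As Object, ByVal elementId As String, _
                            Optional ByVal maxTries As Long = 10) As Object
    Dim tries As Long
    Dim el As Object

    Do
        Set el = Nothing
        On Error Resume Next                 ' the document itself may not exist yet
        Set el = ie.Document.getElementById(elementId)
        On Error GoTo 0

        If Not el Is Nothing Then Exit Do    ' found it, stop waiting

        Application.Wait Now + TimeValue("00:00:01")   ' pause one second (Excel only)
        tries = tries + 1
    Loop While tries < maxTries              ' backup limit so the loop cannot run forever

    Set WaitForElementById = el
End Function
```

In the macro from the question, the direct getElementById call could then become Set inputElement = WaitForElementById(IEexp, "brs"), followed by a check that inputElement is not Nothing before reading textContent.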

Related

How to get the last child of an HTMLElement

I have written a macro in Excel that opens and parses a website and pulls the data from it. The trouble I'm having is once I'm done with all of the data on the current page I want to go to the next page. To do this I want to get the last child of the "result-stats" node. I found the lastChild function, and so came up with the following code:
'Checks to see if there is a next page
If html.getElementById("result-stats").LastChild.innerText = "Next" Then
html.getElementById("result-stats").LastChild.Click
End If
And here is the HTML that it is accessing:
<p id="result-stats">
949 results
<span class="optional"> (1.06 seconds)</span>
Modify search
Show more columns
Next
</p>
When I try to run this, I get an error. After a lot of searching I think I found the reason. According to what I read, getElementById returns an element and not a node. lastChild only works on nodes, which is why the function doesn't work here.
My question is this. Is there a clean and simple way to grab the last child of an element? Or is there a way to typecast an element to that of a node? I feel like I'm missing something obvious, but I've been at this way longer than I should have been. Any help anyone could provide would be greatly appreciated.
Thanks.
Here's a shell of how to do it. If my comments are not clear, ask away. I assumed knowledge of how to navigate to the page, wait for the browser, etc.
Sub ClickLink()
Dim IE As Object
Set IE = CreateObject("InternetExplorer.Application")
'load up page and all that stuff
'process data ...
'click link
Dim doc As Object
Set doc = IE.document
Dim sLink As Object
For Each sLink In doc.getElementsByTagName("a")
If sLink.innerText = "Next" Then 'may need to play with this if innerText doesn't work
sLink.Click
Exit For
End If
Next
End Sub
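If other links on the page could also read "Next", one variation is to restrict the search to the result-stats paragraph and look only at its last link. This assumes the "Next" control is an a element inside that paragraph, which the flattened HTML in the question does not confirm; ClickLastLinkIn is a hypothetical name.

```vba
' Hypothetical helper: clicks the last <a> inside the element with the given id
' if its text matches linkText. Returns True when a click was made.
Function ClickLastLinkIn(ByVal doc As Object, ByVal containerId As String, _
                         ByVal linkText As String) As Boolean
    Dim container As Object
    Dim links As Object
    Dim lastLink As Object

    Set container = doc.getElementById(containerId)
    If container Is Nothing Then Exit Function

    Set links = container.getElementsByTagName("a")
    If links.Length = 0 Then Exit Function

    Set lastLink = links.Item(links.Length - 1)   ' DOM collections are zero-based
    If Trim$(lastLink.innerText) = linkText Then
        lastLink.Click
        ClickLastLinkIn = True
    End If
End Function
```

A call such as ClickLastLinkIn(IE.document, "result-stats", "Next") returns False on the last page, which gives the calling loop a natural place to stop.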

IE source code placeholder control for my VBA scraper

I have the following code which opens an IE page, and fills in the fields with the value "caravan". However I only need the first field to be filled in with "caravan". I need the second one to be filled in with "2016" for example. I've had trouble with this task because I can't seem to uniquely identify each element within the input tag (to which all of the fields belong).
Here is my code:
Sub Quote()
Dim ie As Object
Set ie = CreateObject("InternetExplorer.Application")
ie.navigate ("https://insurance.qbe.com.au/portal/caravan/form/estimate")
ie.Visible = True
Do
DoEvents
Loop Until ie.readystate = 4
Application.Wait (Now + TimeValue("00:00:03"))
Do
DoEvents
Loop Until ie.readystate = 4
Set inputCollection = ie.document.getElementsByTagName("input")
For Each inputElement In inputCollection
inputElement.Value = "Caravan"
Next inputElement
End Sub
So it's taking each "inputElement" that is housed within the "input" tag, and where possible, it's making a corresponding field's display value be that of "caravan".
To illustrate why I'm having difficulty in uniquely identifying each field, here is the source of the first two fields (first one is for caravan type; second one is for caravan year-of-manufacture):
First one
Second one
So neither have an id. And both are within the "input" tag and both have the same classname. So I can't get-element-by-id or get-elements-by-classname. I've tried getting elements by classname in a wide range of ways and it simply does nothing (no error is produced and the web page isn't affected).
The only way I've managed to fill in a field is through using the code I have above. But, again, it's changing all the fields of course. I figure that the only thing I can really use to get my code to tell the two apart is the placeholder element of each one.
But how do I achieve this, seeing as you cannot "get element by placeholder"?
I've since tried to confirm that there's no way to use classname, with the following code modification:
Set inputCollection = ie.document.getElementsByTagName("input")
For Each inputElement In inputCollection
If ie.document.getElementsByClassName.Value = "ui-select-search ui-select-toggle ng-pristine ng-valid ng-touched" Then inputElement.Value = "Caravan"
Oh my! How exciting!! I finally found out how to do this after literally days of searching online. It always had to be something simple (but, alas, this isn't my area of expertise at all so it's always going to be really challenging for me). Anyway, this code works (and I expect I will need to put a fire-event line in soon):
Set inputCollection = ie.document.getElementsByTagName("input")
For Each inputElement In inputCollection
If inputElement.getAttribute("placeholder") = "Caravan type" Then
inputElement.Value = "Caravan"
Exit For
End If
Next inputElement
I was so unaware of "getAttribute" but it makes so much sense. If you don't have an id and some of the fields you are looking at have the same classname (as can often be the case), then you need to rely on other unique attributes and use this sort of code.
If you're wondering where I found out about this, I found this pretty cool Youtube channel, and here's the specific video that helped me:
https://www.youtube.com/user/alexcantu3/videos
Hope it helps someone else some day!
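The getAttribute approach can be wrapped in a small function so that each field is filled by its own placeholder text. This is a sketch only: SetInputByPlaceholder is a hypothetical name, and apart from "Caravan type" the placeholder strings on the real page are not shown above, so they would have to be read from the page source.

```vba
' Hypothetical helper: sets the value of the first <input> whose placeholder
' attribute equals placeholderText. Returns True if a matching field was found.
Function SetInputByPlaceholder(ByVal doc As Object, ByVal placeholderText As String, _
                               ByVal newValue As String) As Boolean
    Dim inputElement As Object

    For Each inputElement In doc.getElementsByTagName("input")
        ' getAttribute can return Null when the attribute is absent,
        ' so append an empty string before comparing
        If inputElement.getAttribute("placeholder") & "" = placeholderText Then
            inputElement.Value = newValue
            SetInputByPlaceholder = True
            Exit For
        End If
    Next inputElement
End Function
```

For the first field this would be called as SetInputByPlaceholder ie.document, "Caravan type", "Caravan"; the year field would need a second call with its own placeholder text and "2016".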

How to work with result collections from Selenium

Or how to work with collections (or arrays) in VBA.
The problem is most probably me, but I haven't found an answer yet.
I am trying to go through some pages on a website with Selenium-vba to find some data.
As usual, if there is more to display, the site shows a 'NEXT' button. The button has <a href ... > when the link is active; otherwise it's just plain text.
To test whether there is another page, I use findElementsByLinkText: either there is a link or the collection is empty, so this can be tested by the size of the collection.
This works so far.
But when I try to use the collection (aside from a for each loop) for further action I can't get it to operate.
This is the code:
Dim driver As New SeleniumWrapper.WebDriver
Dim By As New By, Assert As New Assert, Verify As New Verify, Waiter As New Waiter
On Error GoTo ende1
driver.Start "chrome", "http://www.domain.tld/"
driver.setImplicitWait 5000
driver.get "//......."
Set mynext = driver.findElementsByLinkText("Next")
if mynext.Count >0 Then
mynext(1).Click 'THIS STATEMENT DOES NOT WORK
End If
So please help me get past what I am convinced is a gap in my understanding.
How can I access an element from the collection?
My workaround so far is to execute
driver.findElementByLinkText("Next").Click
but this is unprofessional as it executes the query again.
The Next button is probably loaded asynchronously after the page has finished loading.
This implies that findElementsByLinkText("Next") returns no elements at the time it's called.
A way to handle this case is to suppress the error, adjust the timeout and test the returned element:
Dim driver As New Selenium.ChromeDriver
driver.Get "https://www.google.co.uk/search?q=selenium"
Set ele = driver.FindElementByLinkText("Next", Raise:=False, timeout:=1000)
If Not ele Is Nothing Then
ele.Click
End If
driver.Quit
The example above works with the latest release of SeleniumBasic:
https://github.com/florentbr/SeleniumBasic/releases/latest
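Building on the example above, walking through several result pages might look like the sketch below. It reuses only the calls shown in the answer (Get, FindElementByLinkText with Raise and timeout, Click, Quit); WalkResultPages, maxPages and the page-processing placeholder are assumptions added for illustration.

```vba
' Sketch: follow the "Next" link until it no longer appears or maxPages is reached.
Sub WalkResultPages()
    Const maxPages As Long = 5            ' hypothetical safety limit
    Dim driver As New Selenium.ChromeDriver
    Dim ele As Object
    Dim pageCount As Long

    driver.Get "https://www.google.co.uk/search?q=selenium"

    Do
        pageCount = pageCount + 1
        ' ... read whatever data is needed from the current page here ...

        ' Returns Nothing instead of raising an error when no link is found
        Set ele = driver.FindElementByLinkText("Next", Raise:=False, timeout:=1000)
        If ele Is Nothing Then Exit Do    ' no further page
        ele.Click
    Loop While pageCount < maxPages

    driver.Quit
End Sub
```

Because the element is looked up once per page and then reused for the click, the query is not executed twice, which was the concern with the workaround in the question.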

Extracting geo-coordinates for List of Addresses (VBA EXCEL)

I have a large list of locations for which I need the latitude and longitude, to use in another application. I've written some code that goes to a website and retrieves this information. The first coordinate is placed fine most of the time. However, not all the addresses are recorded in Excel; sometimes I get empty cells. I can't figure out what's wrong. Here is the code:
Sub GetCoordinates()
Dim IE As New InternetExplorer
Dim Doc As HTMLDocument
Dim rowNum As Integer
Dim lastRow As Integer
Dim tURL As String
Dim coordinates As String
IE.Visible = True
'This will be the number of rows with an address (5 is for testing purposes)
lastRow = 5
For rowNum = 1 To lastRow
'The URL is determined by the address in a cell (ex. in C2)
tURL = "http://dbsgeo.com/latlon/?" & Cells(rowNum, "C").Value
IE.navigate tURL
Do
'Wait until the page loads
Loop Until IE.readyState = READYSTATE_COMPLETE
Set Doc = IE.document
coordinates = Trim(Doc.getElementById("latlon").innerText)
Cells(rowNum, "B").Value = coordinates
Next rowNum
IE.Quit
End Sub
It's not that the code doesn't run. The problem is that I'll get coordinates for, say, 3 out of 8 addresses, in random order. There are even times when the wrong coordinates are recorded. I don't know why. Can someone tell me what's wrong with my code?
The problem seems to be that fields such as "latlon" are populated by JavaScript functions that don't run until after the page has loaded, so relying on readyState is not reliable in this instance.
Ordinarily, I'd say use the WinAPI Sleep function (put this at the top of your module; on 64-bit Office the declaration needs PtrSafe after Declare):
Public Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
Then, do your "waiting" loop like:
Do
Sleep 250
Loop Until IE.ReadyState = READYSTATE_COMPLETE And Not IE.Busy
This was still not working for me for this particular website, though. Sleep 1000 seemed to work, but the timing will vary with traffic, network speed, connectivity, available system resources, etc., so even that is not 100% reliable.
I also tried to find a way to emulate the XMLHTTPRequest (which is faster than using the browser directly) but this was also not working.
I looked at the other site. It is possible to fill forms and submit/click controls on a website using VBA, for the most part, but I never found those to be very reliable and didn't try that site.
I spent waaaaay too much time trying to force that solution to work, but it is a bad solution because it's a bad website that's hacked together by some 3rd party using Google's Geocode API.
You should just use Google's API directly.
Documentation:
https://developers.google.com/maps/articles/geocodingupgrade
You can construct a URL like:
http://maps.googleapis.com/maps/api/geocode/xml?address=Paris,+France
You can load that into an XML/DOM parser; I would recommend adding a VBA reference to Microsoft XML, v6.0 (aka MSXML2). This does not use a browser: it can load XML from a URL (like the one above) directly into memory.
With a DOM parser, use the DOMDocument's .GetElementsByTagName("location")(0) and then pull the lat/lon from the child nodes.
Let me know if you have trouble implementing that.
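A minimal sketch of that approach is below, using late binding so no reference has to be set. Several things here are assumptions rather than anything confirmed above: the function name GetLatLng, the lat and lng child element names inside location, and the simple space-to-plus encoding. Google's current service may also require an API key appended to the URL, which is not shown.

```vba
' Sketch: returns "lat,lng" for an address, or an empty string if nothing was found.
Function GetLatLng(ByVal address As String) As String
    Dim xml As Object
    Dim locations As Object
    Dim latNode As Object, lngNode As Object
    Dim url As String

    ' URL shape taken from the example above; only spaces are encoded here
    url = "http://maps.googleapis.com/maps/api/geocode/xml?address=" & _
          Replace(address, " ", "+")

    Set xml = CreateObject("MSXML2.DOMDocument.6.0")
    xml.async = False                      ' wait for the download to finish

    If Not xml.Load(url) Then Exit Function

    Set locations = xml.getElementsByTagName("location")
    If locations.Length = 0 Then Exit Function

    ' Assumed child element names: <lat> and <lng>
    Set latNode = locations.Item(0).selectSingleNode("lat")
    Set lngNode = locations.Item(0).selectSingleNode("lng")
    If latNode Is Nothing Or lngNode Is Nothing Then Exit Function

    GetLatLng = latNode.Text & "," & lngNode.Text
End Function
```

In the loop from the question, the browser steps could then be replaced by something like Cells(rowNum, "B").Value = GetLatLng(Cells(rowNum, "C").Value), with empty results left for a later retry.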

logic to loop until web page element is not nothing

*Using HTML Object library with vba
*(CAS is set as Browser instance (shdocvw))
Set HTMLDoc = CAS.document.frames("MainFrame").document 'pull the main frame
Do Until Not HTMLDoc Is Nothing
DoEvents
Loop
I don't think this is correct, since it only sets HTMLDoc once; if it is Nothing, the loop will keep checking a variable that never changes. A better way to go, IMHO, would be to check for an element and loop until that element exists, since the page can load while my elements, which are pulled from a DB, take half a second or so longer. I'm just not sure how to write the loop so that it keeps setting HTMLDoc and then keeps checking for an element within it to be not Nothing. (The point is that even if my wait timer isn't waiting long enough, the code should not proceed until the element exists.)
If you wanted to wait for a specific element:
Dim el As Object
Do
Set el = Nothing
On Error Resume Next
Set el = CAS.document.frames("MainFrame").document.getElementById("idHere")
On Error GoTo 0
DoEvents
Loop While el Is Nothing
You probably want to build in a maximum wait time though, so you don't loop endlessly if for some reason the element never appears.
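One way to add that maximum wait time is to compare against VBA's Timer function. The sketch below wraps the loop above in a function; WaitForFrameElement and maxSeconds are hypothetical names, and the ten-second default is an arbitrary choice.

```vba
' Sketch: waits up to maxSeconds for an element inside the "MainFrame" frame.
' Returns Nothing if the element has not appeared by then.
Function WaitForFrameElement(ByVal browser As Object, ByVal elementId As String, _
                             Optional ByVal maxSeconds As Double = 10) As Object
    Dim el As Object
    Dim startTime As Double
    Dim elapsed As Double

    startTime = Timer                      ' seconds since midnight

    Do
        Set el = Nothing
        On Error Resume Next               ' frame or document may not exist yet
        Set el = browser.document.frames("MainFrame").document.getElementById(elementId)
        On Error GoTo 0
        DoEvents

        elapsed = Timer - startTime
        If elapsed < 0 Then elapsed = elapsed + 86400   ' Timer resets at midnight
        If elapsed > maxSeconds Then Exit Do            ' give up after the limit
    Loop While el Is Nothing

    Set WaitForFrameElement = el
End Function
```

The caller would use Set el = WaitForFrameElement(CAS, "idHere") and treat a result of Nothing as a timeout rather than carrying on.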