Importing specific web data to Excel using VBA

I'm very much a beginner at VBA coding (web scripting is more my thing), but I need to create an Excel-based program that imports data from an intranet web application into a spreadsheet. Here's the gist of what I'm looking to set up...
In the spreadsheet the user will enter the following info: username, password, list of customer account numbers and a date range. The user will then click a "command button" that will make the following happen:
Open web based program, login (based on login/password typed into spreadsheet) and navigate to the account search screen.
Enter first customer account number into search field and click the "search" button to navigate to the specific customer account.
Navigate to the "search activity" screen, enter the date range and click the "search activity" button.
Pull the data from a specific column of the activity table and import the data to the spreadsheet.
If there are multiple pages of data there will be a "Next Results" button. A loop should click the button (if it exists) and pull the same column of data from each page until the button no longer exists (no more data).
Once there are no more pages of data (or if there is only one page) the macro will loop back and navigate to the account search screen and perform the same operations for each account in the list of accounts typed into the spreadsheet until there are no other accounts.
Once completed (all data successfully imported to the spreadsheet) it should close the IE window.
It's a little complicated, and I realize Excel/VBA is definitely not the best tool for these functions, but unfortunately it's what I have to use in this instance. I've been able to piece together some VBA that does almost everything above. The problem is that looping through the activity pages and pulling the data just will not work: I get a wide range of errors that only confuse me more. Sometimes it pulls data from the first page, clicks the "Next Results" button, gets to the next page and throws an error; sometimes it gets through two or three pages before throwing one. It doesn't make a lot of sense, but the most common error is "permission denied". Also, this code currently only pulls the data for one account. I was hoping that once I got it working for one account it would be simple to wrap the entire code in a loop that goes down the list of account numbers and does the same for each. I've been stuck on this for a number of weeks and I'm really ready to toss out the whole thing and start from scratch, so any help would be very much appreciated!
Below is the code I have so far...
Private Sub CommandButton1_Click()
' open IE, navigate to the desired page and loop until fully loaded
Set IE = New InternetExplorerMedium
my_url = "https://customerinfo/pages/login.jsp"
my_url2 = "https://customerinfo/pages/searchCustomer.jsp"
my_url3 = "https://customerinfo/pages/searchAccountActivity.jsp"
With IE
.Visible = True
.navigate my_url
Do Until Not .Busy And .readyState = 4
DoEvents
Loop
End With
' Input the userid and password
IE.document.getElementById("userId").Value = [B2]
IE.document.getElementById("password").Value = [B3]
' Click the "Login" button
IE.document.getElementById("action").Click
Do Until Not IE.Busy And IE.readyState = 4
DoEvents
Loop
' Navigate to Search screen
With IE
.navigate my_url2
Do Until Not .Busy And .readyState = 4
DoEvents
Loop
End With
' Input the account number & click search
IE.document.getElementById("accountNumber").Value = [B5]
IE.document.getElementById("action").Click
Do Until Not IE.Busy And IE.readyState = 4
DoEvents
Loop
With IE
.navigate my_url3
Do Until Not .Busy And .readyState = 4
DoEvents
Loop
End With
'Input search criteria
IE.document.getElementById("store").Value = [C7]
IE.document.getElementById("dateFromMonth").Value = [C10]
IE.document.getElementById("dateFromDay").Value = [B11]
IE.document.getElementById("dateFromYear").Value = [B12]
IE.document.getElementById("timeFromHour").Value = [B20]
IE.document.getElementById("timeFromMinute").Value = [B21]
IE.document.getElementById("dateToMonth").Value = [C15]
IE.document.getElementById("dateToDay").Value = [B16]
IE.document.getElementById("dateToYear").Value = [B17]
IE.document.getElementById("timeToHour").Value = [B24]
IE.document.getElementById("timeToMinute").Value = [B25]
IE.document.getElementById("action").Click
Do Until Not IE.Busy And IE.readyState = 4
DoEvents
Loop
'Pulls data from activity search
Dim TDelements As IHTMLElementCollection
Dim TDelement As HTMLTableCell
Dim r As Long, i As Long
Dim e As Object
Application.Wait Now + TimeValue("00:00:05")
Set TDelements = IE.document.getElementsByTagName("tr")
r = 0
For i = 1 To 1
Application.Wait Now + TimeValue("00:00:03")
For Each TDelement In TDelements
If TDelement.className = "searchActivityResultsOldContent" Then
Sheet1.Range("E1").Offset(r, 0).Value = TDelement.ChildNodes(8).innerText
r = r + 1
ElseIf TDelement.className = "searchActivityResultsNewContent" Then
Sheet1.Range("E1").Offset(r, 0).Value = TDelement.ChildNodes(8).innerText
r = r + 1
End If
Next
Application.Wait Now + TimeValue("00:00:02")
Set elems = IE.document.getElementsByTagName("input")
For Each e In elems
If e.Value = "Next Results" Then
e.Click
i = 0
Exit For
End If
Next e
Next i
Do Until Not IE.Busy And IE.readyState = 4
DoEvents
Loop
IE.Quit
End Sub

So, what happens after you click the "Next..." element? Let me describe an issue I encountered. Assume the code flow is as follows:
Create an IE instance and navigate to some URL, e.g. the first search results page.
Check whether the page is loaded and ready. Wait for it.
Create the DispHTMLElementCollection collection of the target elements, retrieved by .document.getElementsByTagName(), etc.
Loop through the elements of the collection, do some stuff.
Click on the "Next..." element. The issue is that in some cases the next page doesn't start downloading immediately after the click, due to some JS or XHR processing.
Make the conventional check that the next page is loaded and ready. This check lets execution continue without any delay: the download of the next page has not started yet, so the current page is mistaken for a loaded and ready next page. Fixed delays of a few seconds are not a reliable way to get a ready page.
Again create the DispHTMLElementCollection collection, but by mistake from the elements of the current page instead of the next page.
Loop through the elements of that collection. While the loop is in progress, the next page starts downloading. The collection still holds references to the objects, but the page those objects belonged to has been unloaded. Any attempt to access an element of the unloaded page, or an unresponsive document object, then raises "permission denied" errors.
My suggestion is to avoid clicking on "Next...": read the next page URL from the .href property of the "Next..." anchor <a> element, call IE.navigate with that URL, then check the page readiness.
Take a look at the example implementing that approach.
IMO the most efficient way is to use XHR, like this, this and this.
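A minimal sketch of the href-based paging described above. It assumes the next-page control is an <a> element whose text starts with "Next" and that carries a usable .href; in the question's page it is an <input> button, so this may not apply directly. The helper names WaitForPage, NextPageUrl and ScrapeAllPages are made up for illustration; the row class names and ChildNodes(8) come from the question's code.

```vba
' Sketch only: helper names are hypothetical; class names and column index are from the question.
Private Sub WaitForPage(ByVal ie As Object)
    ' 4 = READYSTATE_COMPLETE
    Do While ie.Busy Or ie.readyState <> 4
        DoEvents
    Loop
End Sub

' Returns the URL of the "Next..." anchor, or "" when the page has none.
' Assumption: the link text starts with "Next" (case-sensitive).
Private Function NextPageUrl(ByVal doc As Object) As String
    Dim a As Object
    For Each a In doc.getElementsByTagName("a")
        If Trim$(a.innerText) Like "Next*" Then
            NextPageUrl = a.href
            Exit Function
        End If
    Next a
End Function

Public Sub ScrapeAllPages(ByVal ie As Object)
    Dim tr As Object
    Dim url As String
    Dim r As Long

    Do
        WaitForPage ie

        ' Build the collection fresh for every page, after the page is ready.
        For Each tr In ie.document.getElementsByTagName("tr")
            If tr.className = "searchActivityResultsOldContent" _
               Or tr.className = "searchActivityResultsNewContent" Then
                Sheet1.Range("E1").Offset(r, 0).Value = tr.ChildNodes(8).innerText
                r = r + 1
            End If
        Next tr

        ' Read the URL before leaving the page, then navigate explicitly.
        url = NextPageUrl(ie.document)
        If Len(url) = 0 Then Exit Do
        ie.navigate url
    Loop
End Sub
```

The point of the design is that no element reference survives a page change: the URL is copied into a string before navigating, and the row collection is rebuilt only after the readiness check that follows an explicit navigate call.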

Related

Starting JavaScript query in IE from VBA

I am trying to develop a macro that will automatically check the format of a tracking number in a sheet, decide which courier site to use, and then get the status of the shipment. Until now it's going pretty well, since both TNT's and DHL's results have the tracking number in the address, but I got kind of stuck with UPS, where the form seems to be submitted via JavaScript.
I managed to trick it by focusing on the input box and making my macro send the keys tab, tab and enter after entering the number, but it's not so elegant and the success rate seems to be about 70% in my testing.
I tried all different variations of getelementsby..., but nothing really seemed to submit the form successfully.
Is there a way to call the JavaScript to start query, or maybe another way of submitting the form itself?
Added an edited chunk of code for testing as well as html for both what seems to be code for the button and for the JavaScript :
Sub ups()
    Dim objIE As Object
    Dim Website As String
    Dim Element As Object
    Dim cRng As String   ' tracking number, hardcoded as a plain string for testing
    cRng = "1Z30RY119130505965"
    Website = "https://www.ups.com/tracking/tracking.html"
    Set objIE = CreateObject("InternetExplorer.Application")
    With objIE
        .Visible = True
        .navigate Website
        Do While .Busy Or .ReadyState <> 4
            DoEvents
        Loop
        Set Element = .Document.GetElementsByName("trackNums")
        'a part of code I found useful for when previous Do
        'While fails and events start executing too early
        Dim btimer As Integer
        btimer = 0
        Do While Element.Item(3).Value = 0
            Application.Wait (Now + TimeValue("0:00:01"))
            btimer = btimer + 1
            If btimer = 4 Then
                Exit Do
            End If
        Loop
        Element.Item(3).Focus
        Element.Item(3).Value = cRng
        SendKeys "{TAB}"
        SendKeys "{TAB}"
        SendKeys "{~}"
    End With
End Sub
<div class="btnBar">
<input name="track.x" class="ups-cta ups-cta_primary" type="submit" value="Track"></input>
</div>
<script language="JavaScript1.2" type="text/javascript">
$(function)
{
Tracking.setLocaleString('en_KR');
Tracking.setLoadingText('Loading. Please wait a moment.');
Tracking.setErrorMessage3('The systen is unable to process your request at this time. Please try again later.');
Tracking.addInatructionTextForyinputPage("Enter up to 25 tracking or InfoNotice numbers, one per line.", 'y');
Thanks for any ideas in advance.

Return Address from Google term search to excel using VBA

I am familiar with StackOverflow but have just recently signed up. I am trying to search for a hotel on Google and return its address in Excel using VBA. Below is a picture of the information I am trying to return from Google. From my research, I was able to find VBA code that returns the result stats.
Would it be possible to modify my code and return the box at the top of my google search?
I would really appreciate your help! Below is the VBA I am using to return search results.
Sample Image - Red Roof Inn & Address
Sub SearchGoogle()
Dim ie As Object
Dim form As Variant
Dim button As Variant
Dim LR As Integer
Dim var As String
Dim var1 As Object
LR = Cells(Rows.Count, 1).End(xlUp).Row
For x = 2 To LR
var = Cells(x, 1).Value
Set ie = CreateObject("internetexplorer.application")
ie.Visible = True
With ie
.Visible = True
.navigate "http://www.google.co.in"
While Not .readyState = READYSTATE_COMPLETE
Wend
End With
'Wait some to time for loading the page
While ie.Busy
DoEvents
Wend
Application.Wait (Now + TimeValue("0:00:02"))
ie.document.getElementById("lst-ib").Value = var
'Here we are clicking on search Button
Set form = ie.document.getElementsByTagName("form")
Application.Wait (Now + TimeValue("0:00:02"))
Set button = form(0).onsubmit
form(0).submit
'wait for page to load
While ie.Busy
DoEvents
Wend
Application.Wait (Now + TimeValue("0:00:02"))
Set var1 = ie.document.getElementById("resultStats")
Cells(x, 2).Value = var1.innerText
ie.Quit
Set ie = Nothing
Next x
End Sub
Right now your code loads the page and then reads the text of the resultStats element.
So the section of your code that you will need to alter is:
Set var1 = ie.document.getElementById("resultStats")
Cells(x, 2).Value = var1.innerText
The first step is to understand the DOM of the HTML page you are working with, in this case Google's. I would suggest using a browser to explore the DOM, as that gives you a good idea of what the whole page is doing.
If you want to do this in a macro, you will need a path through the DOM that always takes you where you want to go. I would suggest keeping two pages with different searches open so that you can check your hypothesis as you go.
For example, the boxes you refer to seem to be located in an element with the class kp-header. Knowing this, you can build out your path through the DOM to return the text displayed on screen. Again, you will need to do your own investigation to find the best starting point for your search, as kp-header was just the first potentially helpful result I could find.
Please note that, depending on how quickly you load these pages, you may hit a limit from Google, as they discourage scraping. A better option, which avoids these limits and saves you from investigating all of Google's DOM, would be to use one of Google's APIs.
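As a rough sketch of the DOM approach above: the helper below returns the visible text of the first element carrying a given class. The function name FirstTextByClass is made up for illustration, and kp-header is only the class observed in this answer; Google's markup may differ or change, so treat both as assumptions.

```vba
' Sketch only: FirstTextByClass is a hypothetical helper, not part of any library.
' Returns the text of the first element with the given class, or "" if there is none.
Function FirstTextByClass(ByVal doc As Object, ByVal className As String) As String
    Dim hits As Object
    Set hits = doc.getElementsByClassName(className)
    If hits.Length > 0 Then
        FirstTextByClass = hits.Item(0).innerText
    End If
End Function
```

In the question's loop, the two resultStats lines could then become Cells(x, 2).Value = FirstTextByClass(ie.document, "kp-header"). An empty cell would mean the class was not found on that results page.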

VBA getelementsbyclassname use with getelementsbyid

I'm trying to write a short VBA code to click on a button on a website by searching first by its classname, and then by id:
Sub Autoclick()
Dim IE As Object
Set IE = CreateObject("InternetExplorer.Application")
IE.Navigate "http://stackoverflow.com/"
IE.Visible = True
While IE.Busy
DoEvents
Wend
IE.Document.getElementsByClassname("nav mainnavs").getElementById("nav-jobs").Click
End Sub
However the code would not click on the Jobs button. I understand that I can just use getelementbyid directly, but I'd still like to know why using getelementsbyclassname and getelementbyid together would not work.
Attached image contains the html code of the website.
Thanks a lot for any help!
I'd still like to know why using getelementsbyclassname and
getelementbyid together would not work.
Because the code IE.Document.getElementsByClassname("nav mainnavs").getElementById("nav-jobs") causes error 438: object doesn't support this property or method.
It doesn't work because getElementsByClassName("nav mainnavs") returns a collection of matching elements (here containing a div), and neither the collection nor a div has a getElementById method.
To search within the div by element id, you would probably need to loop through all of its elements and check their ids:
Dim mainnavs
Set mainnavs = IE.document.getElementsByClassName("nav mainnavs")
If mainnavs.Length > 0 Then
Dim mainnavItem ' this is a div element and it doesn't have .getElementById() method
Set mainnavItem = mainnavs.Item(0)
Dim itm
For Each itm In mainnavItem.all
If itm.ID = "nav-jobs" Then
itm.Click
Exit For
End If
Next itm
Else
MsgBox "getElementsByClassName('nav mainnavs') didn't return any elements"
End If
when I apply the same method to my company's website it keeps on
showing the message box "didn't return any elements"
This means that no element with that class name exists on the page at the moment the code runs.
Do you wait until the page is fully loaded?
While ie.Busy Or ie.readyState <> READYSTATE_COMPLETE: DoEvents: Wend
why would we do mainnavItem.all in the For loop
This is because there is no other way to search the div for an element by its id. Therefore we go through all the descendants of mainnavItem and check each id.

Web scraping - create object for IE

Sub Get_Data()
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = True
ie.Navigate "http://www.scramble.nl/military-database/usaf"
Do While ie.Busy
Application.Wait DateAdd("s", 1, Now)
Loop
SendKeys "03-3114"
SendKeys "{ENTER}"
End Sub
The code above searches for the keyboard-typed value 03-3114 and gets data in the table. If I'd like to search for a value which is already in cell A1 and scrape the values for "Code, Type, CN, Unit" from the table into the cell range ("B1:E1"), what should I do?
You are using SendKeys, which is highly unreliable :) Why not find the names of the textbox and the search button and interact with them directly, as shown below?
Sub Get_Data()
Dim ie As Object, objInputs As Object
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = True
ie.Navigate "http://www.scramble.nl/military-database/usaf"
Do While ie.readystate <> 4: DoEvents: Loop
'~~> Get the ID of the textbox where you want to output
ie.Document.getElementById("serial").Value = "03-3114"
'~~> Here we try to identify the search button and click it
Set objInputs = ie.Document.getElementsByTagName("input")
For Each ele In objInputs
If ele.Name Like "sbm" Then
ele.Click
Exit For
End If
Next
End Sub
Note: To understand how I got the names serial and sbm, refer to the explanation given just above the image below.
The code above searches for the keyboard-typed value 03-3114 and gets data in the table. If I'd like to search for a value which is already in cell A1 and scrape the values for "Code, Type, CN, Unit" from the table into the cell range ("B1:E1"), what should I do?
Directly put the value from A1 in lieu of the hardcoded value
ie.Document.getElementById("serial").Value = Sheets("Sheet1").Range("A1").Value
To get the values from the table, identify the elements of the table by right-clicking on it in the browser and choosing "Inspect" or "Inspect Element" (in Chrome it is just "Inspect").
I can give you the code but I want you to do it yourself. If you are still stuck then update the question with the code that you tried and then we will take it from there.
Interesting read: html parsing of cricinfo scorecards
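For the "values from the table" part, a hedged starting point rather than a finished solution: once you have identified the right table row by inspecting the page, a small helper can copy its cell texts across a worksheet row. The name WriteRowCells is made up for illustration, and which table and row to pass in is something you still have to work out from the page itself.

```vba
' Sketch only: WriteRowCells is a hypothetical helper.
' Copies up to maxCells cell texts of one HTML table row (a <tr> element)
' into consecutive worksheet cells starting at firstCell.
Sub WriteRowCells(ByVal tableRow As Object, ByVal firstCell As Range, ByVal maxCells As Long)
    Dim c As Long
    Dim n As Long

    n = tableRow.Cells.Length
    If n > maxCells Then n = maxCells

    For c = 0 To n - 1
        firstCell.Offset(0, c).Value = Trim$(tableRow.Cells.Item(c).innerText)
    Next c
End Sub
```

A call such as WriteRowCells someRow, Sheets("Sheet1").Range("B1"), 4 would fill B1:E1, where someRow stands for whichever <tr> element you located. If the four wanted columns are not the first four cells of the row, the indexing would need adjusting.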

VBA and Internet Explorer: fill in an input box

I'm new to making interactions between VBA and Internet Explorer, but I've read many things online and couldn't figure out the problem in my code. I just want to retrieve the 'Username' box on a website and add a value inside. So I retrieved all input boxes into a collection of HTML elements, but then that collection is empty:
Dim Collect As IHTMLElementCollection
With IE
.navigate "http:xxxxxxxxxx"
.Visible = True
End With
Do While IE.Busy
Loop
Set Collect = IE.document.getElementsByTagName("input")
MsgBox Collect.Length
End Sub
This will give a message box with "0". If I toggle a breakpoint before the end of the code and I "watch" the variable Collect, I can see there are 17 items inside, one of them being the username 'inputbox', with name 'tfUserName'. Can you help me please?
EDIT: I found that the problem comes from this code:
Do While IE.Busy
Loop
Which I replaced with this:
Do Until IE.readyState = READYSTATE_COMPLETE
DoEvents
Loop
And now everything works fine. Thank you for your responses.
Check the collection against Nothing instead to determine whether it contains elements:
If Not Collect Is Nothing Then
    Dim htmlElement As Object
    For Each htmlElement In Collect
        ' tfUserName is the input name mentioned in the question
        If htmlElement.Name = "tfUserName" Then
            htmlElement.setAttribute "value", "licou6"
            Exit For
        End If
    Next
End If
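The fix in the question's edit (waiting on readyState rather than only on Busy) can be wrapped in a reusable helper with a timeout, so that a page which never finishes loading does not hang the macro. This is a sketch: the name WaitUntilLoaded and the 30-second default are assumptions, not something from the original posts.

```vba
' Sketch only: WaitUntilLoaded is a hypothetical helper.
' Returns True once the page is loaded, False if timeoutSeconds elapsed first.
Function WaitUntilLoaded(ByVal ie As Object, _
                         Optional ByVal timeoutSeconds As Double = 30) As Boolean
    Dim started As Double
    Dim elapsed As Double

    started = Timer   ' seconds since midnight
    ' 4 = READYSTATE_COMPLETE
    Do While ie.Busy Or ie.readyState <> 4
        DoEvents
        elapsed = Timer - started
        If elapsed < 0 Then elapsed = elapsed + 86400   ' Timer wrapped at midnight
        If elapsed > timeoutSeconds Then Exit Function  ' result stays False
    Loop

    WaitUntilLoaded = True
End Function
```

Typical use would be If Not WaitUntilLoaded(IE) Then MsgBox "Page did not load": Exit Sub, placed before the call to getElementsByTagName.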