How to scrape data from newly opening webpage? - vba

I want to scrape data from a webpage that opens after clicking a Submit button.
On this website I fill the textbox that shows the placeholder #Tracking with the value 148459. After I click Submit, another page opens with the details. I want to scrape the data on that page and bring it into Excel using VBA.
My guess is that the Submit button uses the POST method, so I need help scraping the page via POST.
This is the code I have worked out so far:
Dim IE As New SHDocVw.InternetExplorer
Dim HTMLDoc As MSHTML.HTMLDocument
Dim HTMLInput As MSHTML.IHTMLElement
Dim HTMLButton As MSHTML.IHTMLElement
Dim HTMLButtons As MSHTML.IHTMLElementCollection
Dim HTMLAs As MSHTML.IHTMLElementCollection
Dim HTMLA As MSHTML.IHTMLElement
Dim txt As String
Dim delStat As String
txt = ActiveCell.Value
IE.Visible = True
'IE.Navigate "http://206.50.6.194/WebtrakWT/shipinquiry/quicktrack.aspx"
IE.Navigate "http://206.50.6.194/WebtrakWT/shipinquiry/ShipInfo.aspx?OrderNo=393874&Back=QuickTrack&TrackType=HousebillNo&TrackNo=" & txt
Do While IE.ReadyState <> READYSTATE_COMPLETE
Loop
Set HTMLDoc = IE.Document
'Set HTMLInput = HTMLDoc.getElementById("ddlTrackBy")
'HTMLInput.Value = "HousebillNo"
'Set HTMLInput = HTMLDoc.getElementById("txtInputNo")
'HTMLInput.Value = ActiveCell.Value
'Do While IE.ReadyState <> READYSTATE_COMPLETE
' DoEvents
' Loop
'Set HTMLButton = HTMLDoc.getElementById("btnSubmit")
'HTMLButton.Click
Do While IE.ReadyState <> READYSTATE_COMPLETE
DoEvents
Loop
Set HTMLAs = HTMLDoc.getElementsByTagName("span")
For Each HTMLA In HTMLAs
Debug.Print HTMLA.getAttribute("id"), HTMLA.innerHTML
Next HTMLA
delStat = HTMLDoc.getElementById("lblStatus").innerHTML
Debug.Print delStat
If delStat = "DELIVERED" Then
ActiveCell.Offset(0, 5).Value = "Delivered"
Call screenShot
Else
ActiveCell.Offset(0, 5).Value = "Not Delivered"
End If
IE.Quit
Set IE = Nothing
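For the POST idea in the question, a minimal sketch that uses MSXML2.XMLHTTP instead of automating IE might look like the following. The field names ddlTrackBy, txtInputNo and btnSubmit are assumptions borrowed from the element ids in the commented-out code above (the form's name attributes may differ), and UrlEncodeAscii, PostForm and DemoPostTracking are hypothetical helper names. ASP.NET pages usually also expect hidden fields such as __VIEWSTATE copied from the form page, which this sketch leaves out, so treat it as a starting point rather than a working answer for this site.

```vba
Option Explicit

' Percent-encode a string for an application/x-www-form-urlencoded body.
' ASCII input only; non-ASCII text would need UTF-8 encoding first.
Function UrlEncodeAscii(ByVal s As String) As String
    Dim i As Long, ch As String, result As String
    For i = 1 To Len(s)
        ch = Mid$(s, i, 1)
        Select Case Asc(ch)
            Case 48 To 57, 65 To 90, 97 To 122, 45, 46, 95, 126
                ' 0-9, A-Z, a-z and - . _ ~ are sent unchanged
                result = result & ch
            Case Else
                result = result & "%" & Right$("0" & Hex$(Asc(ch)), 2)
        End Select
    Next i
    UrlEncodeAscii = result
End Function

' Send a synchronous POST and return the response body as text.
Function PostForm(ByVal url As String, ByVal body As String) As String
    Dim http As Object
    Set http = CreateObject("MSXML2.XMLHTTP")
    http.Open "POST", url, False
    http.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
    http.send body
    PostForm = http.responseText
End Function

Sub DemoPostTracking()
    Dim body As String, doc As Object, statusLabel As Object
    ' Assumed field names (taken from element ids, not confirmed).
    body = "ddlTrackBy=" & UrlEncodeAscii("HousebillNo") & _
           "&txtInputNo=" & UrlEncodeAscii(CStr(ActiveCell.Value)) & _
           "&btnSubmit=Submit"
    ' Parse the returned HTML without opening a browser window.
    Set doc = CreateObject("htmlfile")
    doc.body.innerHTML = PostForm("http://206.50.6.194/WebtrakWT/shipinquiry/quicktrack.aspx", body)
    Set statusLabel = doc.getElementById("lblStatus")
    If statusLabel Is Nothing Then
        Debug.Print "lblStatus not found; the POST was probably not accepted"
    Else
        Debug.Print statusLabel.innerText
    End If
End Sub
```

If the server rejects the request, comparing the body above with what the browser actually sends (for example in the F12 network tab) should show which fields are missing.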

Related

Type Mismatch on one machine

I wrote some code to scrape data from a website. I've tested it on 5 different machines with different versions of Excel and it all works fine, but on the intended user's machine we get a type mismatch error. The code fails at the last line below.
Sub LogIn()
Dim ie As SHDocVw.InternetExplorer
Dim iDoc As MSHTML.HTMLDocument
Dim ele As MSHTML.IHTMLElement
Dim eles As MSHTML.IHTMLElementCollection
Dim tableSection As MSHTML.IHTMLElement
Dim tableRow As MSHTML.IHTMLElement
Dim tableCell As MSHTML.IHTMLElement
Dim smallCell As MSHTML.IHTMLElement
Dim iCol As Integer
Dim iRow As Integer
Dim iCounter As Integer
iRow = 0
Do
iRow = iRow + 1
Loop Until Cells(iRow, 5) = ""
Range(Cells(1, 5), Cells(iRow, 6)).ClearContents
Set ie = New InternetExplorer
ie.Visible = False
ie.navigate ("https://www.howdidido.com/")
Do While ie.readyState <> READYSTATE_COMPLETE
Loop
Set iDoc = ie.document
Any help greatly appreciated.
I have tried the following code and it works for me. Maybe it can help you (it seems that two loops with DoEvents are needed to wait until the ready state is complete).
Dim iDoc As MSHTML.HTMLDocument
Dim iCol As Integer
Dim iRow As Integer
Dim iCounter As Integer
Dim i As Long
Dim URL As String
Dim IE As Object
Dim objElement As Object
Dim objCollection As Object
Set IE = CreateObject("InternetExplorer.Application")
'Set IE.Visible = True to make IE visible, or False for IE to run in the background
IE.Visible = True
'Define URL
URL = "https://www.automateexcel.com/excel/"
'Navigate to URL
IE.Navigate URL
' Status bar lets the user know the website is loading
Application.StatusBar = URL & " is loading. Please wait..."
' Wait while IE loading...
'IE ReadyState = 4 signifies the webpage has loaded (the first loop is set to avoid inadvertently skipping over the second loop)
Do While IE.ReadyState = 4: DoEvents: Loop 'Do While
Do Until IE.ReadyState = 4: DoEvents: Loop 'Do Until
'Webpage Loaded
Application.StatusBar = URL & " Loaded"
Set iDoc = IE.Document
'Unload IE
Set IE = Nothing
Set objElement = Nothing
Set objCollection = Nothing

Internet scraping issue when using vba on secure site

I am trying to automate IE to pull data from a secure site; however, I keep getting the same error message: "Object required"
The debugger points to the line HTMLinput.Value = "test"
Everything before that works fine. I don't know what the problem is. I have verified the ID to make sure there are no mistakes.
Sub Brows()
Dim IE As New SHDocVw.InternetExplorerMedium
Dim Policy As Object
Dim certificate As Object
Dim HTMLDoc As MSHTML.HTMLDocument
Dim HTMLinput As MSHTML.IHTMLElement
IE.Visible = True
IE.navigate "secure website address"
Application.Wait Now + TimeValue("00:00:02")
Do While IE.ReadyState <> READYSTATE_COMPLETE
Loop
Set HTMLDoc = IE.Document
Set HTMLinput = HTMLDoc.getElementById("input ID")
HTMLinput.Value = "test"
End Sub
How to log into a web site using Excel and VBA.
Dim HTMLDoc As HTMLDocument
Dim oBrowser As InternetExplorer
Sub Login_2_Website()
Dim oHTML_Element As IHTMLElement
Dim sURL As String
On Error GoTo Err_Clear
sURL = "https://www.google.com/accounts/Login"
Set oBrowser = New InternetExplorer
oBrowser.Silent = True
oBrowser.timeout = 60
oBrowser.navigate sURL
oBrowser.Visible = True
Do
' Wait till the Browser is loaded
Loop Until oBrowser.readyState = READYSTATE_COMPLETE
Set HTMLDoc = oBrowser.Document
HTMLDoc.all.Email.Value = "sample@vbadud.com"
HTMLDoc.all.passwd.Value = "*****"
For Each oHTML_Element In HTMLDoc.getElementsByTagName("input")
If oHTML_Element.Type = "submit" Then oHTML_Element.Click: Exit For
Next
' oBrowser.Refresh ' Refresh If Needed
Err_Clear:
If Err <> 0 Then
Debug.Assert Err = 0
Err.Clear
Resume Next
End If
End Sub
The program requires references to the following:
1. Microsoft Internet Controls
2. Microsoft HTML Object Library
http://vbadud.blogspot.com/2009/08/how-to-login-to-website-using-vba.html

Upload a file into html browser using VBA

I am trying to upload a file to a web page. These are the steps I followed:
1. Open the web page http://www.htmlquick.com/reference/tags/input-file.html
2. Wait until the page has loaded.
3. On this page, I am uploading the file through the first “Upload a File” browse control.
4. Get the input element by the tag name “input”.
5. Click the “Browse” button, since the paste option is disabled.
6. Enter the file path in the “Choose File to Upload” window.
7. Press Enter.
After the 5th step, I am not able to enter the file path in the “Choose File to Upload” window; it looks like the macro does not support this.
Here is my code:
Sub File_Test()
Dim HTMLDoc As MSHTML.HTMLDocument
Dim HTMLButtons As MSHTML.IHTMLElementCollection
Dim HTMLButton As MSHTML.IHTMLElement
Dim ie As Object
Set ie = CreateObject("internetexplorer.application")
ie.Visible = True
ie.navigate "http://www.htmlquick.com/reference/tags/input-file.html"
Do While ie.readyState <> READYSTATE_COMPLETE
Loop
Set HTMLDoc = ie.document
Set HTMLButtons = HTMLDoc.getElementsByTagName("input")
For Each HTMLButton In HTMLButtons
If HTMLButton.Type = "file" Then
HTMLButton.Click
HTMLButton.Value = "C:\Documents\Test\Temp.txt"
Exit For
End If
Next
End Sub
Any suggestions?
===================================================
Here is another modified version of the code, but I am still not able to enter the file name:
Sub File_Test()
Dim HTMLDoc As MSHTML.HTMLDocument
Dim HTMLButtons As MSHTML.IHTMLElementCollection
Dim HTMLButton As MSHTML.IHTMLElement
Dim ie As Object
Dim WSshell
Set WSshell = CreateObject("WScript.Shell")
Set ie = CreateObject("internetexplorer.application")
ie.Visible = True
ie.navigate "http://www.htmlquick.com/reference/tags/input-file.html"
Do While ie.readyState <> READYSTATE_COMPLETE
Loop
Set HTMLDoc = ie.document
Set HTMLButtons = HTMLDoc.getElementsByTagName("input")
For Each HTMLButton In HTMLButtons
If HTMLButton.Type = "file" Then
HTMLButton.Click
With WSshell
Application.Wait (Now + TimeValue("0:00:10"))
.AppActivate "Choose File to Upload"
Application.Wait (Now + TimeValue("0:00:10"))
.SendKeys "C:\Documents\Test\Temp.txt"
Application.Wait (Now + TimeValue("0:00:10"))
.SendKeys "~" 'Enter
End With
Exit For
End If
Next
End Sub
Any idea why SendKeys is not working?

Data Scraping using VBA From website doesn’t download

I am trying to download some data from a website where the download starts when a link is clicked.
But this code is not working. Can anyone help?
Dim ie As InternetExplorer
Dim htmldoc As MSHTML.IHTMLDocument
Dim HTMLInput As MSHTML.IHTMLElementCollection
Dim HTMLAs As MSHTML.IHTMLElementCollection
Dim HTMLA As MSHTML.IHTMLElement
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = True
'Navigate to webpage
Dim ieURL As String: ieURL = "http://erldc.org/final-schedule.aspx"
ie.navigate ieURL
Do While ie.readyState <> READYSTATE_COMPLETE
Loop
Set htmldoc = ie.document
Set HTMLAs = htmldoc.getElementsByTagName("a") 'assign the collection that the loop below iterates
For Each HTMLA In HTMLAs
Debug.Print HTMLA.getAttribute("classname"), HTMLA.getAttribute("href"), HTMLA.getAttribute("rel")
If HTMLA.getAttribute("href") = "javascript:__doPostBack('ctl00$ContentPlaceHolder1$Calendar1','6420')" Then
HTMLA.Click
Exit For
End If
Next HTMLA
End Sub
Try pausing the macro for a few seconds, maybe 5 seconds, after the Do While/Loop that checks for the ReadyState...
Dim sngFinish As Single
Dim intPauseTime As Integer
intPauseTime = 5 'in seconds
sngFinish = timer + intPauseTime
Do While timer < sngFinish
DoEvents
Loop
Also, I would suggest that you check the Busy state of Internet Explorer, in addition to the ReadyState, and add DoEvents...
Do While ie.Busy Or ie.readyState <> READYSTATE_COMPLETE
DoEvents
Loop
Hope this helps!
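The two suggestions above (check Busy as well as ReadyState, and call DoEvents while waiting) can be folded into one reusable helper. This is only a sketch: WaitForPage and its timeoutSeconds parameter are hypothetical names, and the 30-second default is an arbitrary assumption. It uses the literal 4 for READYSTATE_COMPLETE so it also works with a late-bound browser object.

```vba
Option Explicit

' Wait until the browser is idle and fully loaded, or until the timeout expires.
' Returns True if the page finished loading, False if the wait timed out.
Function WaitForPage(ByVal browser As Object, _
                     Optional ByVal timeoutSeconds As Long = 30) As Boolean
    Dim deadline As Date
    deadline = DateAdd("s", timeoutSeconds, Now)

    Do While browser.Busy Or browser.ReadyState <> 4   ' 4 = READYSTATE_COMPLETE
        If Now > deadline Then Exit Function           ' result stays False
        DoEvents                                       ' keep Excel responsive
    Loop

    WaitForPage = True
End Function
```

A call such as If Not WaitForPage(ie, 20) Then Exit Sub would then replace the bare Do While/Loop after each navigate or Click, so a page that never finishes loading cannot hang the macro.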

Selecting value from a dropdown list on a webpage using vba

On this site I am able to select the country and language from the dropdown menus, but when I click the "Complete new application form" button, it says the fields are empty.
Any help would be appreciated.
Sub Test()
strURL = "https://visa.kdmid.ru/PetitionChoice.aspx"
With ie
.Visible = True
.navigate strURL
While .Busy
DoEvents
Wend
Set html = .document
'Country where you will apply for visa.
Set ctY = html.getElementById("ctl00$phBody$Country")
For i = 1 To ctY.Options.Length
If ctY.Options(i).Text = "NETHERLANDS" Then
ctY.selectedIndex = i
Exit For
End If
Next i
'Select Language
Set lnG = html.getElementById("ctl00$phBody$ddlLanguage")
For i = 1 To lnG.Options.Length
If lnG.Options(i).Text = "ENGLISH" Then
lnG.selectedIndex = i
Exit For
End If
Next i
'Click I have read instructions check box
html.getElementById("ctl00$phBody$cbConfirm").Click
'Click apply button
Set btnGo = html.forms(0).all("ctl00$phBody$btnNewApplication")
btnGo.Click
End With
End Sub
You are on the right track, but if you look at the HTML of the site there are actually two elements for the country selection. You got the first one, 'ctl00_phBody_Country', but that is just the drop-down; the selected value is actually stored in 'ctl00_phBody_cddCountry_ClientState'. The language section has a similar structure. Lastly, the accepted value is not just the country name you see in the drop-down: it is a combination of the country code from the drop-down and the country name.
See below for sample code:
Public Sub Test()
Dim IE As InternetExplorer
Dim HTMLDoc As HTMLDocument
Dim countryStr As String
Dim countryObj As HTMLObjectElement
Dim countryCodes As IHTMLElementCollection
Dim codeCounter As Long
Dim languageStr As String
Dim languageObj As HTMLObjectElement
Dim languageCodes As IHTMLElementCollection
countryStr = "Netherlands"
languageStr = "English"
Set IE = New InternetExplorer
With IE
.Visible = False
.Navigate "https://visa.kdmid.ru/PetitionChoice.aspx?AspxAutoDetectCookieSupport=1"
While .Busy Or .ReadyState <> READYSTATE_COMPLETE: Wend
Set HTMLDoc = IE.document
End With
Set countryObj = HTMLDoc.getElementById("ctl00_phBody_cddCountry_ClientState")
Set countryCodes = HTMLDoc.getElementById("ctl00_phBody_Country").getElementsByTagName("option")
For codeCounter = 0 To countryCodes.Length - 1
If countryCodes(codeCounter).innerText = UCase(countryStr) Then
countryObj.Value = countryCodes(codeCounter).Value & ":::" & countryCodes(codeCounter).innerText & ":::"
While IE.Busy Or IE.ReadyState <> READYSTATE_COMPLETE: Wend
Exit For
End If
Next
Set languageObj = HTMLDoc.getElementById("ctl00_phBody_cddLanguage_ClientState")
Set languageCodes = HTMLDoc.getElementById("ctl00_phBody_ddlLanguage").getElementsByTagName("option")
For codeCounter = 0 To languageCodes.Length - 1
If languageCodes(codeCounter).innerText = UCase(languageStr) Then
languageObj.Value = languageCodes(codeCounter).Value & ":::" & languageCodes(codeCounter).innerText & ":::"
While IE.Busy Or IE.ReadyState <> READYSTATE_COMPLETE: Wend
Exit For
End If
Next
HTMLDoc.getElementById("ctl00$phBody$cbConfirm").Click
While IE.Busy Or IE.ReadyState <> READYSTATE_COMPLETE: Wend
HTMLDoc.getElementById("ctl00_phBody_btnNewApplication").Click 'Launch Form
IE.Quit
Set IE = Nothing
End Sub
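To make the hidden-field format used in the code above easier to see, here it is isolated in a small helper. BuildClientState and DemoClientState are hypothetical names, and "nl" is a made-up option value used only for illustration; the real codes come from the option elements on the page.

```vba
Option Explicit

' Build the string written to a *_ClientState hidden field:
' the option's value, then its visible text, each followed by ":::".
Function BuildClientState(ByVal optionValue As String, _
                          ByVal optionText As String) As String
    BuildClientState = optionValue & ":::" & optionText & ":::"
End Function

Sub DemoClientState()
    ' "nl" is an illustrative placeholder, not the site's real country code.
    Debug.Print BuildClientState("nl", "NETHERLANDS")
End Sub
```

With a helper like this, both the country and the language loops could assign countryObj.Value and languageObj.Value through the same function call.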