ExcelVBA - HttpReq via MSXML2.XMLHTTP - fetch page after loading page - vba

I have a problem fetching data from an internal web-based data service (Cognos).
Basically I put together a GET request like "blah.com/cognosapi.dll?product=xxx&date=yyy...", send it to the server, and receive a web page that I can store as HTML and parse into my Excel form later.
I built a VBA program which worked quite well in the past, but the web service changed and now displays a "your report is running" page in between, which lasts from 1 to 30 seconds. So when I call my function I always download this "your report is running" page instead of the data. How can I catch the page that loads automatically after the "report is running" page?
This is the DownloadFile function, with the GET string and the target path as parameters.
Public Function DownloadFile(sSourceUrl As String, _
sLocalFile As String) As Boolean
Dim HttpReq As Object
Set HttpReq = CreateObject("MSXML2.XMLHTTP")
Dim HtmlDoc As New MSHTML.HTMLDocument
HttpReq.Open "GET", sSourceUrl, False
HttpReq.send
If HttpReq.Status = 200 Then
HttpReq.getAllResponseHeaders
HtmlDoc.body.innerHTML = HttpReq.responseText
Debug.Print HtmlDoc.body.innerHTML
End If
'Download the file. BINDF_GETNEWESTVERSION forces
'the API to download from the specified source.
'Passing 0& as dwReserved causes the locally-cached
'copy to be downloaded, if available. If the API
'returns ERROR_SUCCESS (0), DownloadFile returns True.
DownloadFile = URLDownloadToFile(0&, _
sSourceUrl, _
sLocalFile, _
BINDF_GETNEWESTVERSION, _
0&) = ERROR_SUCCESS
End Function
Thanks
David

Your answer finally gave me the missing link to solve my problem. I built the code into my DownloadFile function so that it keeps the IE object until the end and then closes it.
One error I found was that the readyState should be polled before anything is done with the HTML document.
Public Function DownloadFile(sSourceUrl As String, _
sLocalFile As String) As Boolean
Dim IE As InternetExplorer
Set IE = New InternetExplorer
Dim HtmlDoc As New MSHTML.HTMLDocument
Dim collTables As MSHTML.IHTMLElementCollection
Dim collSpans As MSHTML.IHTMLElementCollection
Dim objSpanElem As MSHTML.IHTMLSpanElement
Dim fnum As Integer
With IE
'Change to False if you don't want to see the browser window
.Visible = True
.Navigate (sSourceUrl)
'this waits for the page to be loaded
Do Until .readyState = 4: DoEvents: Loop
End With
'Set HtmlDoc = wait_for_html(sSourceUrl, "text/css")
Do
DoEvents 'keep Excel responsive while polling
Set HtmlDoc = IE.Document
'search for the "span" tags
Set collSpans = HtmlDoc.getElementsByTagName("span")
'the first span element contains...
Set objSpanElem = collSpans(0)
'... this text while the loading screen is displayed
Loop Until Not objSpanElem.innerHTML = "Your report is running."
'just grab the tables and leave the rest
Set collTables = HtmlDoc.getElementsByTagName("table")
fnum = FreeFile()
Open sLocalFile For Output As fnum ' save the file and add html and body tags
Print #fnum, "<html>"
Print #fnum, "<body>"
Print #fnum, collTables(15).outerHTML 'title
Print #fnum, collTables(17).outerHTML 'Date
Print #fnum, collTables(18).outerHTML 'Part, Operation etc.
Print #fnum, collTables(19).outerHTML 'Measurements
Print #fnum, "</body>"
Print #fnum, "</html>"
Close #fnum
IE.Quit 'close Explorer
DownloadFile = True
End Function

Since you're using a GET request, I'm assuming any required parameters can be provided in the URL string. In that case, you might be able to use InternetExplorer.Application, which should automatically update its Document property whenever the page refreshes. You could then set up a loop which periodically checks for some value (tag text, URL, etc...) that's unique to the desired page.
Here's a sample which loads a URL, then waits until the page's <title> tag is the desired value.
Public Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
Function wait_for_html(strURL As String, strDesiredText As String) As String
Dim IE As InternetExplorer
Set IE = New InternetExplorer
IE.Navigate (strURL)
While IE.ReadyState <> 4
Sleep 10
Wend
Dim objHtml As MSHTML.HTMLDocument
Dim collTitle As MSHTML.IHTMLElementCollection
Dim objTitleElem As MSHTML.IHTMLTitleElement
Do
Sleep 1000
Set objHtml = IE.Document
Set collTitle = objHtml.getElementsByTagName("title")
Set objTitleElem = collTitle(0)
Loop Until objTitleElem.Text = strDesiredText
wait_for_html = objHtml.body.innerHTML
End Function
The above needs references to Microsoft Internet Controls and Microsoft HTML Object Library.
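The loop in wait_for_html never exits if the expected title does not appear (for example, if the report fails). Below is a minimal sketch of the same polling idea with a timeout. WaitForTitle and its parameters are hypothetical names, not part of the original answer, and the sketch assumes the IE object has already been told to navigate. It is late-bound, so it does not need the two references.

```vba
Option Explicit

' Hypothetical helper (not from the original answer): polls an existing
' InternetExplorer object until the page title equals strDesiredTitle,
' or until lngTimeoutSec seconds have elapsed.
' Returns True if the title appeared, False on timeout.
' Assumes strDesiredTitle is not an empty string.
Public Function WaitForTitle(ByVal IE As Object, _
                             ByVal strDesiredTitle As String, _
                             ByVal lngTimeoutSec As Long) As Boolean
    Dim dtDeadline As Date
    Dim strTitle As String

    dtDeadline = DateAdd("s", lngTimeoutSec, Now)

    Do While Now < dtDeadline
        strTitle = ""
        ' The document can be unavailable while IE switches pages,
        ' so errors are ignored for this one read only.
        On Error Resume Next
        If IE.readyState = 4 Then strTitle = IE.document.Title  ' 4 = READYSTATE_COMPLETE
        On Error GoTo 0

        If strTitle = strDesiredTitle Then
            WaitForTitle = True
            Exit Function
        End If
        DoEvents    ' keep Excel responsive while polling
    Loop
    ' Falls through with the default return value False on timeout
End Function
```

A caller could then write something like `If WaitForTitle(IE, "My report", 60) Then ...`, where "My report" is a placeholder for whatever title the finished report page actually has.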

Related

Run my VBA code from a existing (opened) internet explorer page

I would like to run my VBA macro on a specific Internet Explorer page that I have already opened and logged into.
This is because I have to log in to my account and get past the captcha first.
I have to send this extremely repetitive message to over a couple hundred people over the course of the year.
The problem with the following code is that it opens a brand-new page, so I won't be able to get past the captcha.
Sub OpenNewIE() 'Sub header was missing from the post; the name is a placeholder
Dim IE As InternetExplorer
Set IE = CreateObject("InternetExplorer.Application")
IE.navigate ("website")
IE.Visible = True
End Sub
Solution
Sub TestGetIE()
Dim IE As Object
'GetIE runs the function we have created below
Set IE = GetIE("website opened in IE here ")
WaitFor IE 'WaitFor is a helper that is not shown in this post
End Sub
Function GetIE(sLocation As String) As Object
Dim objShell As Object, objShellWindows As Object, o As Object
Dim sURL As String
Dim retVal As Object
Set retVal = Nothing
Set objShell = CreateObject("Shell.Application")
Set objShellWindows = objShell.Windows
For Each o In objShellWindows
'Loop through all the open Internet Explorer windows
sURL = ""
'Read the URL of this window; the macro will work on the one that matches
sURL = o.LocationURL
If sURL Like sLocation & "*" Then
Set retVal = o
Exit For
End If
Next o
Set GetIE = retVal
End Function
From your description, I understand that you want the user to fill in the captcha manually and then continue automating the same page.
I suggest using two separate procedures for the two steps, both sharing the same IE object.
You can create a global IE object that both procedures can use.
First, run the login procedure.
After that, the user fills in the captcha manually.
Then run the second procedure to automate the rest of the page.
Sample code:
Public ie As Object
Sub login()
Set ie = CreateObject("internetexplorer.application")
ie.Visible = True
ie.navigate "Your_web_site_URL_here..."
Do While ie.Busy
Application.Wait DateAdd("s", 1, Now)
Loop
ie.document.getElementById("uname").Value = "user1"
ie.document.getElementById("pswd").Value = "12345"
End Sub
Sub second_part()
ie.document.getElementById("email").Value = "user1@abc.com"
ie.document.getElementById("age").Value = "30"
End Sub
UserForm1 code:
Private Sub CommandButton1_Click()
Call Module1.login
End Sub
Private Sub CommandButton2_Click()
Call Module1.second_part
End Sub
Calling the GetIE function shown above helped me solve it.
One thing you could do is loop with DoEvents until a certain element is present on the page. Run your code to navigate to the website, then loop with DoEvents until an element that only appears after the captcha is present. The code will keep looping until you are logged into the website. I like using SeleniumBasic with ChromeDriver, available on GitHub. It makes web scraping much easier, and you do not need to use Internet Explorer.
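The "loop until an element is present" idea can be sketched roughly as follows. WaitForElementById is a hypothetical helper name, and the element id passed to it is whatever id exists only on the page shown after the captcha, which is an assumption about the target site. A timeout is included so that the macro does not hang if the login never completes.

```vba
Option Explicit

' Hypothetical helper: waits until an element with the given id exists
' in the IE document, or until lngTimeoutSec seconds have passed.
' Returns the element, or Nothing on timeout.
Public Function WaitForElementById(ByVal IE As Object, _
                                   ByVal strId As String, _
                                   ByVal lngTimeoutSec As Long) As Object
    Dim dtDeadline As Date
    Dim objElem As Object

    dtDeadline = DateAdd("s", lngTimeoutSec, Now)

    Do While Now < dtDeadline
        Set objElem = Nothing
        ' The document may be unavailable in the middle of a navigation,
        ' so errors are ignored for this one lookup only.
        On Error Resume Next
        Set objElem = IE.document.getElementById(strId)
        On Error GoTo 0

        If Not objElem Is Nothing Then Exit Do
        DoEvents    ' lets the user interact with IE and Excel meanwhile
    Loop

    Set WaitForElementById = objElem
End Function
```

The calling code would check the result with `If Not elem Is Nothing Then` before continuing with the automated part.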

Excel VBA cannot get element and input data a website

I am trying to write a VBA script to input a keyword (for example: Amuse) into the "Map Positioning" textbox on the following website and click "go" to search for the place automatically.
CentaMap
Here is the HTML I found for the textbox:
<INPUT onkeyup=searchBoxTextChanged(this.value); id=qbyid style="FONT-SIZE: 10pt" maxLength=60 name=q autocomplete="off">
I know the normal way is to get the element by id and then set its value. However, I am stuck because I cannot get the textbox element with the following code:
Sub SubCentalineAutomation()
Dim myIE As InternetExplorer
Const url As String = "http://hk.centamap.com/gc/home.aspx?lg=en"
Set myIE = New InternetExplorer
myIE.navigate (url)
Do While myIE.readyState <> 4
DoEvents
Loop
myIE.Visible = True
myIE.document.getElementsByName("q")(0).Value = "Amuse"
End Sub
I also tried using getElementsById("qbyid") instead, but VBA cannot find the element that way either.
Can anybody help with this?
Needs Reference to
Microsoft Internet Controls
Microsoft HTML Object Library
Sub SubCentalineAutomation()
Dim myIE As InternetExplorer
Dim frame As MSHTML.HTMLFrameElement
Dim inp As MSHTML.HTMLInputElement
Const url As String = "http://hk.centamap.com/gc/home.aspx?lg=en"
Set myIE = New InternetExplorer
myIE.navigate (url)
Do While myIE.readyState <> 4
DoEvents
Loop
myIE.Visible = True
Set frame = myIE.document.getElementsByName("search")(0)
Set inp = frame.contentDocument.getElementsByName("q")(0)
inp.Value = "Amaze"
End Sub
Try getElementById("qbyid"), not getElementsById("qbyid").
If you simply want the suggestion, without the map navigation you can use:
Option Explicit
Public Sub GetInfo()
Dim sResponse As String, html As New HTMLDocument, searchTerm As String
searchTerm = "Amuse"
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "http://hk.centamap.com/gsearch/paddresssearch1.aspx?lg=en&search=" & searchTerm & "&ck=gbase&ft2=", False
.send
sResponse = StrConv(.responseBody, vbUnicode)
sResponse = Mid$(sResponse, InStr(1, sResponse, "<!DOCTYPE "))
With html
.body.innerHTML = sResponse
Debug.Print html.querySelector("a").innerText
End With
End With
End Sub
If you want to enter a value and then navigate to the first returned suggestion:
Option Explicit
Public Sub SubCentalineAutomation()
Dim myIE As InternetExplorer, html As New MSHTML.HTMLDocument, frame As MSHTML.HTMLFrameElement, form As MSHTML.HTMLFormElement
Const URL As String = "http://hk.centamap.com/gc/home.aspx?lg=en"
Set myIE = New InternetExplorer
With myIE
.navigate URL
.Visible = True
While .Busy Or .readyState < 4: DoEvents: Wend
Set html = .document
Set frame = html.getElementsByTagName("frame")(1)
Set form = frame.contentDocument.getElementsByTagName("form")(0)
form.getElementsByTagName("input")(1).Value = "Amuse"
form.getElementsByTagName("input")(2).Click
While .Busy Or .readyState < 4: DoEvents: Wend
Set frame = .document.getElementsByTagName("frame")(4)
Set frame = frame.contentDocument.getElementsByTagName("iframe")(0)
frame.contentDocument.getElementsByTagName("table")(0).getElementsByTagName("a")(0).Click
'.Quit '<== Remember to quit application
End With
End Sub

How Do I Test If Webpage Contains Certain Text

I'm trying to detect if a web page has certain text. For example, I want to see if this web page includes the following phrase: "Here is my code"
I can't ever get the "If Then" condition to be satisfied. Here's what I'm trying:
Const READYSTATE_COMPLETE = 4
Declare Function SetForegroundWindow Lib "user32" _
Alias "SetForegroundWindow" (ByVal Hwnd As Long)As Long
' Declare Internet Explorer object
Dim IE As SHDocVw.InternetExplorer
Dim strProgramName As String
Sub Main
' create instance of InternetExplorer
Set IE = New InternetExplorer
' using your newly created instance of Internet Explorer
With IE
SetForegroundWindow IE.HWND
.Visible = True
.Navigate2 "https://stackoverflow.com/questions/38355762/how-do-i-modify-web-scraping-code-to-loop-through-product-bullets-until-it-finds"
' Wait until page we are navigating to is loaded
Do While .Busy
Loop
Do
Loop Until .readyState = READYSTATE_COMPLETE
On Error Resume Next
If Err Then
Else
End If
Wait 2
If InStr(IE.document.body.innerHTML, "Here is my code") > 0 Then
MsgBox "Yessiree Bob"
Else
MsgBox "The text doesn't exist"
End If
Set IE = Nothing
' Tidy Up
End With
End Sub
I've also tried:
FindText = InStr(1, IE.document.body.innerHTML, "Here is my code")
If FindText > 0 Then
And
msg = IE.document.body.innerHTML
If InStr(msg, "Here is my code") > 0 Then
But nothing works. I've looked on Stack Overflow, but can't find this exact question.
Thanks in advance!
Use:
If InStr(IE.document.getElementById("body").innerHTML, "Here is my code") > 0 Then
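Independently of which element is read, note that InStr uses a case-sensitive binary comparison by default (unless the module has Option Compare Text), so a phrase that differs only in case is not found. A minimal sketch of a case-insensitive check follows. PageContainsText is a hypothetical helper name, and the strings in the demo are made-up sample data.

```vba
Option Explicit

' Hypothetical helper: True if strPhrase occurs anywhere in strPageText,
' ignoring upper/lower case.
Public Function PageContainsText(ByVal strPageText As String, _
                                 ByVal strPhrase As String) As Boolean
    PageContainsText = InStr(1, strPageText, strPhrase, vbTextCompare) > 0
End Function

Public Sub DemoPageContainsText()
    ' Sample strings only; in real use pass the page text read from IE
    Debug.Print PageContainsText("<p>Here is my CODE</p>", "here is my code")  ' True
    Debug.Print PageContainsText("<p>Nothing here</p>", "Here is my code")     ' False
End Sub
```

Because innerHTML includes markup, a phrase that is split by tags in the page source will not match. Passing the element's innerText instead of innerHTML may be the better choice in that case.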

vba code running but not fetching data

I am new to VBA.
I am trying to use the code below by David Zemens to fetch data from Yelp.
Option Explicit
Private Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
Sub find()
'Uses late binding, or add reference to Microsoft HTML Object Library
' and change variable Types to use intellisense
Dim ie As Object 'InternetExplorer.Application
Dim html As Object 'HTMLDocument
Dim Listings As Object 'IHTMLElementCollection
Dim l As Object 'IHTMLElement
Dim r As Long
Set ie = CreateObject("InternetExplorer.Application")
With ie
.Visible = False
.Navigate "http://www.yelp.com/search?find_desc=boutique&find_loc=New+York%2C+NY&ns=1&ls=3387133dfc25cc99#start=10"
' Don't show window
'Wait until IE is done loading page
Do While .readyState <> 4
Application.StatusBar = "Downloading information, Please wait..."
DoEvents
Sleep 200
Loop
Set html = .Document
End With
Set Listings = html.getElementsByTagName("LI") ' ## returns the list
For Each l In Listings
'## make sure this list item looks like the listings Div Class:
' then, build the string to put in your cell
If InStr(1, l.innerHTML, "media-block clearfix media-block-large main-attributes") > 0 Then
Range("A1").Offset(r, 0).Value = l.innerText
r = r + 1
End If
Next
Set html = Nothing
Set ie = Nothing
End Sub
Problem is that it's not getting any data from the source.
Regards
There's a lot of work to be done.
Here's something you can start with. Hopefully, you will be able to find the other pieces of information using the same logic. This prints the business names in the Immediate window. I found the business names in the description meta tag.
I've changed the sleep interval to 5 seconds so that IE can fully load and the rest of the code runs reliably. The original 200 milliseconds gave results only once every couple of runs. This probably depends on how fast your computer is, so 5 seconds should be fairly safe.
Option Explicit
Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
Sub find()
'Uses late binding, or add reference to Microsoft HTML Object Library
' and change variable Types to use intellisense
Dim returnstring As String 'this will hold the boutique names
Dim ie As Object 'InternetExplorer.Application
Dim html As Object 'HTMLDocument
Dim meta As Object 'IHTMLElementCollection
Dim l As Object 'IHTMLElement
Dim r As Long
Set ie = CreateObject("InternetExplorer.Application")
With ie
.Visible = False
.Navigate "http://www.yelp.com/search?find_desc=boutique&find_loc=New+York%2C+NY&ns=1&ls=3387133dfc25cc99#start=10"
' Don't show window
'Wait until IE is done loading page
Do While .readyState <> 4
Application.StatusBar = "Downloading information, Please wait..."
DoEvents
Sleep 5000
Loop
Set html = .Document
End With
Set meta = html.GetElementsByTagName("META") ' ## returns the META elements
Dim m As Object
For Each m In meta
If InStr(m.Content, "Reviews on Boutique in New York -") > 0 Then
returnstring = Replace(m.Content, "Reviews on Boutique in New York -", "")
End If
Next
Dim i As Integer
For i = 0 To UBound(Split(returnstring, ","))
Debug.Print (Split(returnstring, ",")(i))
Next
Set html = Nothing
Set ie = Nothing
End Sub

VBA - Addressing Internet Explorer tabs

Strangely enough I didn't find any information on the topic and I'm currently stuck at the point where I managed to open a new tab in an instance of IE by programmatically clicking a button, but I haven't the faintest clue of how to address the new tab in order to get information from there (the button basically brings up a new tab with the result of a search).
This is basically a straightforward question, but I'm including my code anyway:
Sub AddInfoFromIntranet()
Dim Ie As SHDocVw.InternetExplorer
Dim URL As String
Dim iFrames As MSHTML.IHTMLElementCollection
Dim iFrame As MSHTML.HTMLFrameElement
Dim Doc As MSHTML.HTMLDocument
Dim InputBox As MSHTML.IHTMLElementCollection, htmlButton, allTags, Tag
' Opens Intranet - yeah, sadly it's not a public web page
URL = "{My intranet website}"
Set Ie = New SHDocVw.InternetExplorer
With Ie
.navigate URL
.Visible = True
While .Busy Or .readyState <> READYSTATE_COMPLETE: DoEvents: Wend
Set Doc = .document
End With
' Gets top_window frame and navigates to it, then inserts the name to search
Set iFrames = Doc.getElementsByName("top_window")
If Not iFrames Is Nothing Then
Set iFrame = iFrames(0)
Ie.navigate URL & iFrame.src
While Ie.Busy Or Ie.readyState <> READYSTATE_COMPLETE: DoEvents: Wend
Set InputBox = Doc.getElementsByName("Nachnamevalue")
If Not InputBox Is Nothing Then InputBox(0).Value = "test"
' Clicks on "search"
Set allTags = Doc.getElementsByTagName("input")
For Each Tag In allTags
If Tag.Value = "suchen" Then
Tag.Click
Exit For
End If
Next
' Here a new tab is opened, must find info in this tab
While Ie.Busy Or Ie.readyState <> READYSTATE_COMPLETE: DoEvents: Wend
' HERE I HAVE NO CLUE WHAT TO WRITE. THE CODE ABOVE WORKS FLAWLESSLY
End If
Set Doc = Nothing
Set iFrames = Nothing
Set iFrame = Nothing
Set InputBox = Nothing
Set allTags = Nothing
Ie.Quit
Set Ie = Nothing
End Sub
Now, is there a way to address a tab by: 1) its name (and where do I find it) 2) its position in browser 3) the status (if it is "active") ?
Bonus questions: since I am new to VBA and Internet Explorer interaction, what exactly are the variables: htmlButton, allTags, Tag ? Also, could anyone explain if I need to set all the variables at the end to nothing, or I just need to set the Internet Explorer to nothing?
Thanks in advance!
See below for a function you can use to get an open IE document window - I don't think IE exposes any simple (VBA-accessible) API for working directly with tabs or determining whether a specific tab is active.
allTags is a collection of DOM elements with the tag name "input", and Tag is a single member of that collection.
You do not have to set objects to Nothing before exiting a Sub (though some people still do that) - the VBA runtime will take care of that for you.
Sub TestGetIE()
Dim IE As Object
Set IE = GetIE("http://stackoverflow.com")
If Not IE Is Nothing Then
IE.document.execCommand "Print", False, 0
End If
End Sub
'Get a reference to an open IE window based on its URL
Function GetIE(sLocation As String) As Object
Dim objShell As Object, objShellWindows As Object, o As Object
Dim sURL As String
Dim retVal As Object
Set retVal = Nothing
Set objShell = CreateObject("Shell.Application")
Set objShellWindows = objShell.Windows
For Each o In objShellWindows
sURL = ""
On Error Resume Next
'check the URL and if it's the one you want then
' assign it to the return value
sURL = o.document.Location
On Error GoTo 0
'Debug.Print sURL
If sURL Like sLocation & "*" Then
Set retVal = o
Exit For
End If
Next o
Set GetIE = retVal
End Function
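To find out what to pass to GetIE for the newly opened tab, it can help to list what the shell currently reports. The sketch below prints the title and URL of every window in the same Shell.Application collection that GetIE iterates over. ListIEWindows is a hypothetical name. The collection can also contain File Explorer windows, so some of the entries will not be browser tabs.

```vba
Option Explicit

' Hypothetical helper: prints the title and URL of each shell window
' (IE tabs and File Explorer windows) to the Immediate window.
Public Sub ListIEWindows()
    Dim objShellWindows As Object
    Dim o As Object
    Dim sURL As String
    Dim sTitle As String

    Set objShellWindows = CreateObject("Shell.Application").Windows

    For Each o In objShellWindows
        sURL = ""
        sTitle = ""
        ' Some entries may not expose these properties; skip errors
        On Error Resume Next
        sURL = o.LocationURL
        sTitle = o.LocationName
        On Error GoTo 0
        Debug.Print sTitle & " | " & sURL
    Next o
End Sub
```

Running this once after the search button has been clicked should show the URL of the result tab, which can then be used as the sLocation argument of GetIE.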