I've written a script in VBA that clicks a certain link (Draw a map) on a webpage. The click opens a new tab containing the information I want to grab. My script does all of this without errors: when run, it scrapes the title shown in the new tab, Make a Google Map from a GPS file.
My question: is there any alternative way to switch to the new tab other than a hardcoded search such as If IE.LocationURL Like "*" & "output_geocoder" Then?
This is my script:
Sub FetchInfo()
Const url As String = "http://www.gpsvisualizer.com/geocoder/"
Dim IE As New InternetExplorer, Html As HTMLDocument, R&
Dim winShell As New Shell
With IE
.Visible = True
.navigate url
While .Busy = True Or .readyState < 4: DoEvents: Wend
Set Html = .document
End With
Html.querySelector("input[value$='map']").Click
For Each IE In winShell.Windows
If IE.LocationURL Like "*" & "output_geocoder" Then
IE.Visible = True
While IE.Busy = True Or IE.readyState < 4: DoEvents: Wend
Set Html = IE.document
Exit For
End If
Next
Set post = Html.querySelector("h1")
MsgBox post.innerText
IE.Quit
End Sub
To execute the above script, add references to these libraries:
Microsoft Shell Controls And Automation
Microsoft Internet Controls
Microsoft HTML Object Library
Btw, there is nothing wrong with the above script. I only wish to know any better way to do the same.
This is the best I have so far with selenium
Option Explicit
Public Sub GetInfo()
Dim d As WebDriver
Set d = New ChromeDriver
Const url = "http://www.gpsvisualizer.com/geocoder/"
With d
.Start "Chrome"
.get url
.FindElementByCss("input[value$='map']").Click
.SwitchToNextWindow
.FindElementByCss("input.gpsv_submit").Click
MsgBox .Title
Stop
.Quit
End With
End Sub
A more specific alternative is to switch by window title:
.SwitchToWindowByTitle("GPS Visualizer: Draw a map from a GPS data file").Activate
.FindElementByCss("input.gpsv_submit").Click
tl;dr;
I will need to read up more on how robust .SwitchToNextWindow is.
FYI, you can get handles info with:
Dim hwnds As List
Set hwnds = driver.Send("GET", "/window_handles")
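For the IE version, one possible way to avoid hardcoding part of the URL is to record which windows are open just before the click and then look for one that was not there before. This is only a minimal sketch, assuming the new tab ends up at a URL that none of the previously open windows had. SnapshotUrls and FindNewWindow are hypothetical helper names, not part of any library.

```vba
' Record the URL of every shell window (IE tabs and Explorer windows) currently open.
Function SnapshotUrls() As Object
    Dim seen As Object, win As Object
    Set seen = CreateObject("Scripting.Dictionary")
    For Each win In CreateObject("Shell.Application").Windows
        If Not win Is Nothing Then seen(win.LocationURL) = True
    Next win
    Set SnapshotUrls = seen
End Function

' Poll until a window shows a URL that is not in the snapshot, or give up
' after timeoutSeconds and return Nothing.
Function FindNewWindow(ByVal knownUrls As Object, _
                       Optional ByVal timeoutSeconds As Long = 10) As Object
    Dim win As Object, deadline As Date
    deadline = DateAdd("s", timeoutSeconds, Now)
    Do While Now < deadline
        For Each win In CreateObject("Shell.Application").Windows
            If Not win Is Nothing Then
                ' Skip tabs that have not started navigating yet
                If Len(win.LocationURL) > 0 And win.LocationURL <> "about:blank" Then
                    If Not knownUrls.Exists(win.LocationURL) Then
                        Set FindNewWindow = win
                        Exit Function
                    End If
                End If
            End If
        Next win
        DoEvents
    Loop
End Function
```

In the IE script, this would mean calling Set known = SnapshotUrls() right before the .Click line and Set IE = FindNewWindow(known) right after it, then checking the result for Nothing before touching .document.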
I tried to open a website using the VBA code below:
Dim IE As Object
Dim doc As Object
Dim strURL As String
Set IE = CreateObject("InternetExplorer.Application")
With IE
.Visible = True
.navigate "https://Google.com"
Do Until .readyState = 4
DoEvents
Loop
End With
It works for websites like Google.
But when I try to open a specific site, such as my company's PLM, Agile (https://agileplm.XXXX.com/Agile/default/login-cms.jsp), it throws the error
"The remote server machine does not exist or is unavailable"
I can open the page in Internet Explorer manually, but the error is thrown when execution reaches the lines below:
Do Until .readyState = 4
DoEvents
Loop
Is this due to some protection on the site?
I used early binding on this and created the object InternetExplorerMedium rather than InternetExplorer and it seemed to work.
The issue is that Internet Explorer disconnects from VBA (in my case because of some internal security settings). The solution is to reconnect Internet Explorer to VBA:
Insert this code after the line at which IE stops responding; the existing Internet Explorer window will then be assigned to IEObject1.
Dim shellWins As SHDocVw.ShellWindows
Dim explorer As SHDocVw.InternetExplorer
Set shellWins = New SHDocVw.ShellWindows
For Each explorer In shellWins
If explorer.Name = "Internet Explorer" Then
Set IEObject1 = explorer
Debug.Print explorer.LocationURL
Debug.Print explorer.LocationName
End If
Next
Set shellWins = Nothing
Set explorer = Nothing
If you have several IE windows open then this code picks the last one. You can choose between the windows by using the URL or LocationName if you need several open windows.
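Choosing a window by URL could look roughly like the following. It is a sketch rather than a tested drop-in: FindIEWindow is a hypothetical helper name, and it uses late binding through Shell.Application instead of the SHDocVw reference used above.

```vba
' Return the first open shell window whose URL contains urlFragment,
' or Nothing if no window matches.
Function FindIEWindow(ByVal urlFragment As String) As Object
    Dim win As Object
    For Each win In CreateObject("Shell.Application").Windows
        If Not win Is Nothing Then
            ' vbTextCompare makes the match case-insensitive
            If InStr(1, win.LocationURL, urlFragment, vbTextCompare) > 0 Then
                Set FindIEWindow = win
                Exit Function
            End If
        End If
    Next win
End Function
```

A call such as Set IEObject1 = FindIEWindow("login-cms.jsp") would then reattach to the Agile login window, assuming its URL contains that fragment.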
Try testing with the example code below; it may help you solve your issue.
Sub Automate_IE_Load_Page()
'This will load a webpage in IE
Dim i As Long
Dim URL As String
Dim IE As Object
Dim objElement As Object
Dim objCollection As Object
'Create InternetExplorer Object
Set IE = CreateObject("InternetExplorer.Application")
'Set IE.Visible = True to make IE visible, or False for IE to run in the background
IE.Visible = True
'Define URL
URL = "https://agileplm.xxxx.com/Agile/default/login-cms.jsp"
'Navigate to URL
IE.Navigate URL
' Status bar lets the user know the website is loading
Application.StatusBar = URL & " is loading. Please wait..."
' Wait while IE loading...
'IE ReadyState = 4 signifies the webpage has loaded (the first loop is set to avoid inadvertently skipping over the second loop)
Do While IE.ReadyState = 4: DoEvents: Loop 'Do While
Do Until IE.ReadyState = 4: DoEvents: Loop 'Do Until
'Webpage Loaded
Application.StatusBar = URL & " Loaded"
'Unload IE
Set IE = Nothing
Set objElement = Nothing
Set objCollection = Nothing
End Sub
Reference:
Automate Internet Explorer (IE) Using VBA
If your issue persists, please provide more detail: are you unable to open the page, or do you get an error while checking the ready state of IE? We will try to provide further suggestions.
Dim IE As Object
Dim doc As Object
Dim strURL As String
Set IE = CreateObject("InternetExplorer.Application")
With IE
.Visible = True
.navigate "https://Google.com"
End With
For Each win In CreateObject("Shell.Application").Windows
If win.Name Like "*Internet Explorer" Then
Set IE = win: Exit For
End If
Next
With IE
Do Until .readyState = 4
DoEvents
Loop
End With
I've written a script in VBA, using IE, to click on the dots shown on a map in a web page. When a dot is clicked, a small box containing relevant information pops up.
Link to that website
I would like to parse the content of each box, which can be found using the class name contentPane. However, my main concern is making each box appear by clicking on those dots. When a box shows up, it looks like the image below.
This is the script I've tried so far:
Sub HitDotOnAMap()
Const Url As String = "https://www.arcgis.com/apps/Embed/index.html?webmap=4712740e6d6747d18cffc6a5fa5988f8&extent=-141.1354,10.7295,-49.7292,57.6712&zoom=true&scale=true&search=true&searchextent=true&details=true&legend=true&active_panel=details&basemap_gallery=true&disable_scroll=true&theme=light"
Dim IE As New InternetExplorer, HTML As HTMLDocument
Dim post As Object, I&
With IE
.Visible = True
.navigate Url
While .Busy = True Or .readyState < 4: DoEvents: Wend
Set HTML = .document
End With
Application.Wait Now + TimeValue("00:0:07") ''the following line zooms in the slider
HTML.querySelector("#mapDiv_zoom_slider .esriSimpleSliderIncrementButton").Click
Application.Wait Now + TimeValue("00:0:04")
With HTML.querySelectorAll("[id^='NWQMC_VM_directory_'] circle")
For I = 0 To .Length - 1
.item(I).Focus
.item(I).Click
Application.Wait Now + TimeValue("00:0:03")
Set post = HTML.querySelector(".contentPane")
Debug.Print post.innerText
HTML.querySelector("[class$='close']").Click
Next I
End With
End Sub
When I execute the above script, it appears to run smoothly, but nothing happens (no clicking takes place) and it doesn't throw any error either. Finally, it quits the browser gracefully.
This is what a box with information looks like when a dot gets clicked.
Although I've used hardcoded delays in my script, they can be fixed later, once the macro starts working.
Question: How can I click each of the dots on that map and collect the relevant information from the popped-up box? I only expect solutions that use Internet Explorer.
The data are not the main concern here. I would like to know how IE works in such cases so that I can deal with them in the future. I'm not looking for any solution other than IE.
There is no need to click on each dot. The JSON file behind the map has all the details, and you can extract whatever you require from it.
Installation of JsonConverter
Download the latest release
Import JsonConverter.bas into your project (Open VBA Editor, Alt + F11; File > Import File)
Add Dictionary reference/class
For Windows-only, include a reference to "Microsoft Scripting Runtime"
For Windows and Mac, include VBA-Dictionary
References to be added
Download the sample file here.
Code:
Sub HitDotOnAMap()
Const Url As String = "https://www.arcgis.com/sharing/rest/content/items/4712740e6d6747d18cffc6a5fa5988f8/data?f=json"
Dim IE As New InternetExplorer, HTML As HTMLDocument
Dim post As Object, I&
Dim data As String, colObj As Object
With IE
.Visible = True
.navigate Url
While .Busy = True Or .readyState < 4: DoEvents: Wend
data = .document.body.innerHTML
data = Replace(Replace(data, "<pre>", ""), "</pre>", "")
End With
Dim JSON As Object
Set JSON = JsonConverter.ParseJson(data)
Set colObj = JSON("operationalLayers")(1)("featureCollection")("layers")(1)("featureSet")
For Each Item In colObj("features")
For j = 1 To Item("attributes").Count - 1
Debug.Print Item("attributes").Keys()(j), Item("attributes").Items()(j)
Next
Next
End Sub
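Since the data comes from a plain JSON endpoint, the browser may not be needed for this step at all. The sketch below fetches the same URL over XHR, which also avoids stripping the pre tags that IE wraps around the JSON. It assumes the endpoint answers an ordinary GET without cookies or a login; FetchText is a hypothetical helper name.

```vba
' Fetch a URL synchronously and return the response body,
' or an empty string when the server does not answer with status 200.
Function FetchText(ByVal url As String) As String
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", url, False   ' False = synchronous request
        .send
        If .Status = 200 Then FetchText = .responseText
    End With
End Function
```

With this in place, the parsing line becomes Set JSON = JsonConverter.ParseJson(FetchText(Url)), and the rest of the loop stays as it is.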
I need a bit of help finding a form id on a public website (http://www.medicines.ie/). The site was updated and the previous id no longer works, since the element no longer has an id. What I am trying to do is open this site with VBA, enter the value from a specific cell in Excel into a form field (textbox) on the website, and press the search button. I am using the code below:
Sub Medicinesie()
Dim IE As Object
Set IE = CreateObject("INTERNETEXPLORER.APPLICATION")
IE.navigate "http://www.medicines.ie/"
IE.Visible = True
Do
DoEvents
Loop Until IE.ReadyState = READYSTATE_COMPLETE
IE.Document.getElementById("input").Value = Range("spc") '<---- spc is the name of the cell I am referencing
IE.Document.forms(0).submit
Do
DoEvents
Loop Until IE.ReadyState = READYSTATE_COMPLETE
End Sub
It looks like you could take advantage of how this website constructs its search URLs. The URL is built as:
http://www.medicines.ie/medicines?page=1&per-page=25&query= followed by whatever you would like to search for in the database.
Sub MedicineS()
Dim IE As Object
Set IE = CreateObject("INTERNETEXPLORER.APPLICATION")
Dim URL As String
URL = "http://www.medicines.ie/medicines?page=1&per-page=25&query=" & _
Range("spc")
IE.Visible = True
IE.navigate URL
Do While IE.readyState <> 4 '4 = READYSTATE_COMPLETE; the named constant is undefined with late binding
DoEvents
Loop
End Sub
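One caveat with building the URL by concatenation is that a search term containing spaces or characters such as & would break the query string. A possible safeguard, assuming Excel 2013 or later (where the EncodeURL worksheet function is available), is sketched below; BuildSearchUrl is a hypothetical helper name.

```vba
' Build the search URL with the term percent-encoded.
' Assumes Excel 2013+, where WorksheetFunction.EncodeURL exists.
Function BuildSearchUrl(ByVal searchTerm As String) As String
    Const BASE_URL As String = "http://www.medicines.ie/medicines?page=1&per-page=25&query="
    BuildSearchUrl = BASE_URL & Application.WorksheetFunction.EncodeURL(searchTerm)
End Function
```

The navigate call would then read IE.navigate BuildSearchUrl(Range("spc").Value).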
However, if you still prefer your own approach, keep in mind that the input you are looking for is the 4th input element in the page, so:
IE.Document.getElementsByTagName("input")(3).Value
and the button is the second, so:
IE.Document.getElementsByTagName("button")(1).Click
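Positional indexes like these tend to break whenever the page layout changes. A CSS selector may be a little more robust. The sketch below assumes the search box carries the class search__input, that IE renders the page in a document mode that supports querySelector, and that the named cell spc exists; none of this was checked against the live site.

```vba
Sub SearchByCssSelector()
    Dim IE As Object, box As Object
    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = True
    IE.navigate "http://www.medicines.ie/"
    Do While IE.Busy Or IE.readyState <> 4: DoEvents: Loop

    ' Assumed selector; adjust if the site's markup differs
    Set box = IE.document.querySelector("input.search__input")
    If box Is Nothing Then
        MsgBox "Search box not found; the selector may be out of date."
        Exit Sub
    End If

    box.Value = Range("spc").Value
    box.form.submit   ' submit the form that owns the search box
End Sub
```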
With the Selenium VBA wrapper installed and a reference added via Tools > References > Selenium Type Library:
Option Explicit
Public Sub test()
Dim d As WebDriver
Set d = New ChromeDriver '<== can change to internet explorer driver
With d
.Start "Chrome"
.Get "http://www.medicines.ie/"
.FindElementByCss("input.search__input").SendKeys "Aspirin" '<== Range("spc")
.FindElementByTag("form").Submit
Stop
'.Quit
End With
End Sub
I've written a script in VBA using IE to parse some links from a webpage. The catch is that the links are within an iframe. I've tweaked my code so that the script first finds the link of that iframe, navigates to that new page, and parses the required content from there. Doing it this way, I can get all the links.
Webpage URL: weblink
Successful approach (working one):
Sub Get_Links()
Dim IE As New InternetExplorer, HTML As HTMLDocument
Dim elem As Object, post As Object
With IE
.Visible = True
.navigate "put here the above link"
While .Busy = True Or .readyState < 4: DoEvents: Wend
Set elem = .document.getElementById("compInfo") ' this element is the iframe
.navigate elem.src
While .Busy = True Or .readyState < 4: DoEvents: Wend
Set HTML = .document
End With
For Each post In HTML.getElementsByClassName("news")
With post.getElementsByTagName("a")
If .Length Then R = R + 1: Cells(R, 1) = .Item(0).href
End With
Next post
IE.Quit
End Sub
I've seen a few sites where the iframe has no such link, so there I have no link to follow to track down the content.
In the approach linked below, I parsed content that sits within an iframe on another webpage. There was no link within that iframe to navigate to in order to locate the content, so I used contentWindow.document instead and found it working flawlessly.
Link to the working code of parsing Iframe content from another site:
contentWindow approach
However, my question is: why should I navigate to a new webpage to collect the links when I can see the content on the landing page? I tried using contentWindow.document, but it gives me an access denied error. How can I make the code below work using contentWindow.document, as I did above?
I tried like this but it throws access denied error:
Sub Get_Links()
Dim IE As New InternetExplorer, HTML As HTMLDocument
Dim frm As Object, post As Object
With IE
.Visible = True
.Navigate "put here the above link"
While .Busy = True Or .readyState < 4: DoEvents: Wend
Set HTML = .document
End With
''the code breaks when it hits the following line "access denied error"
Set frm = HTML.getElementById("compInfo").contentWindow.document
For Each post In frm.getElementsByClassName("news")
With post.getElementsByTagName("a")
If .Length Then R = R + 1: Cells(R, 1) = .Item(0).href
End With
Next post
IE.Quit
End Sub
I've attached an image to let you know which links (they are marked with pencil) I'm after.
These are the elements within which one such link (i would like to grab) is found:
<div class="news">
<span class="news-date_time"><img src="images/arrow.png" alt="">19 Jan 2018 00:01</span>
<a style="color:#5b5b5b;" href="/HomeFinancial.aspx?&cocode=INE117A01022&Cname=ABB-India-Ltd&srno=17019039003&opt=9">ABB India Limited - Press Release</a>
</div>
Image of the links of that page I would like to grab:
From the very first day while creating this thread I strictly requested not to use this url http://hindubusiness.cmlinks.com/Companydetails.aspx?cocode=INE117A01022 to locate the data. I requested any solution from this main_page_link without touching the link within iframe. However, everyone is trying to provide solutions that I've already shown in my post. What did I put a bounty for then?
You can see the links within the <iframe> in the browser but can't access them programmatically because of the same-origin policy: the frame's document is served from a different host than the parent page.
Here is an example showing how to retrieve the links using XHR and RegEx:
Option Explicit
Sub Test()
Dim sContent As String
Dim sUrl As String
Dim aLinks() As String
Dim i As Long
' Retrieve initial webpage HTML content via XHR
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://www.thehindubusinessline.com/stocks/abb-india-ltd/overview/", False
.Send
sContent = .ResponseText
End With
'WriteTextFile sContent, CreateObject("WScript.Shell").SpecialFolders("Desktop") & "\tmp\tmp.htm", -1
' Extract target iframe URL via RegEx
With CreateObject("VBScript.RegExp")
.Global = True
.MultiLine = True
.IgnoreCase = True
' Match the src attribute of the iframe that points at Companydetails
.Pattern = "<iframe[\s\S]*?src=""([^""]*?Companydetails[^""]*)""[^>]*>"
sUrl = .Execute(sContent).Item(0).SubMatches(0)
End With
' Retrieve iframe HTML content via XHR
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", sUrl, False
.Send
sContent = .ResponseText
End With
'WriteTextFile sContent, CreateObject("WScript.Shell").SpecialFolders("Desktop") & "\tmp\tmp.htm", -1
' Parse links via RegEx
With CreateObject("VBScript.RegExp")
.Global = True
.MultiLine = True
.IgnoreCase = True
' Process all anchors within div.news
.Pattern = "<div class=""news"">[\s\S]*?href=""([^""]*)"
With .Execute(sContent)
ReDim aLinks(0 To .Count - 1)
For i = 0 To .Count - 1
aLinks(i) = .Item(i).SubMatches(0)
Next
End With
End With
Debug.Print Join(aLinks, vbCrLf)
End Sub
RegEx is generally not recommended for HTML parsing, so take this with that caveat. The data being processed here is simple enough to be parsed with RegEx.
The output for me is as follows:
/HomeFinancial.aspx?&cocode=INE117A01022&Cname=ABB-India-Ltd&srno=17047038016&opt=9
/HomeFinancial.aspx?&cocode=INE117A01022&Cname=ABB-India-Ltd&srno=17046039003&opt=9
/HomeFinancial.aspx?&cocode=INE117A01022&Cname=ABB-India-Ltd&srno=17045039006&opt=9
/HomeFinancial.aspx?&cocode=INE117A01022&Cname=ABB-India-Ltd&srno=17043039002&opt=9
/HomeFinancial.aspx?&cocode=INE117A01022&Cname=ABB-India-Ltd&srno=17043010019&opt=9
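The extracted links are root-relative, so they need the site's origin prepended before they can be requested. A minimal sketch of that step follows. ToAbsoluteUrl is a hypothetical helper, and as a simplification it resolves every relative link against the site root rather than against the base page's directory.

```vba
' Turn a link into an absolute URL using the scheme and host of baseUrl.
' Simplification: relative links without a leading "/" are also resolved
' against the site root, not against the directory of baseUrl.
Function ToAbsoluteUrl(ByVal baseUrl As String, ByVal href As String) As String
    Dim schemeEnd As Long, hostEnd As Long, origin As String

    ' Already absolute: return unchanged
    If LCase$(href) Like "http://*" Or LCase$(href) Like "https://*" Then
        ToAbsoluteUrl = href
        Exit Function
    End If

    schemeEnd = InStr(1, baseUrl, "://")
    If schemeEnd = 0 Then
        ToAbsoluteUrl = href          ' base has no scheme; nothing to resolve against
        Exit Function
    End If

    ' The origin ends at the first "/" after the host name
    hostEnd = InStr(schemeEnd + 3, baseUrl, "/")
    If hostEnd = 0 Then
        origin = baseUrl
    Else
        origin = Left$(baseUrl, hostEnd - 1)
    End If

    If Left$(href, 1) = "/" Then
        ToAbsoluteUrl = origin & href
    Else
        ToAbsoluteUrl = origin & "/" & href
    End If
End Function
```

In the loop above, this would be used as aLinks(i) = ToAbsoluteUrl(sUrl, .Item(i).SubMatches(0)), with sUrl being the iframe URL that was just requested.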
I also tried to copy the content of the <iframe> from IE to clipboard (for further pasting to the worksheet) using commands:
IE.ExecWB OLECMDID_SELECTALL, OLECMDEXECOPT_DODEFAULT
IE.ExecWB OLECMDID_COPY, OLECMDEXECOPT_DODEFAULT
But those commands actually select and copy the main document, excluding the frame, unless I click on the frame manually. So this approach could work only if a click on the frame can be reproduced from VBA (frame node methods like .focus and .click didn't help).
Something like this should work. The key is to realize that the iframe is technically another document. Reviewing the iframe on the page you listed, you can easily use a web request to get at the data you need. As already mentioned, the reason you get an error is the same-origin policy. You could write something to get the src of the iframe and then do the web request as I've shown below, or use IE to scrape the page, get the src, and then load that page, which looks like what you have already done.
I would recommend the web request approach; Internet Explorer can get annoying, fast.
Code
Public Sub SOExample()
Dim html As Object 'To store the HTML content
Dim Elements As Object 'To store the anchor collection
Dim Element As Object 'To iterate the anchor collection
Set html = CreateObject("htmlFile")
With CreateObject("MSXML2.XMLHTTP")
'Navigate to the source of the iFrame, it's another page
'View the source for the iframe. Alternatively -
'you could navigate to this page and use IE to scrape it
.Open "GET", "https://stocks.thehindubusinessline.com/Companydetails.aspx?&cocode=INE117A01022", False 'False = synchronous, so .Status is valid right after .send
.send ""
'See if the request was ok, exit if there was an error
If Not .Status = 200 Then Exit Sub
'Assign the page's HTML to an HTML object
html.body.InnerHTML = .responseText
Set Elements = html.body.document.getElementByID("hmstockchart_CompanyNews1_updateGLVV")
Set Elements = Elements.getElementsByTagName("a")
For Each Element In Elements
'Print out the data to the Immediate window
Debug.Print Element.InnerText
Next
End With
End Sub
Results
ABB India Limited - AGM/Book Closure
Board of ABB India recommends final dividend
ABB India to convene AGM
ABB India to pay dividend
ABB India Limited - Outcome of Board Meeting
More ?
The simplest solution, as everyone suggested, is to go directly to the link. This takes the iframe out of the picture and makes it easier to loop through the links. But if you still don't like that approach, you need to go a bit deeper down the hole.
Below is a function from a library I wrote long back in VB.NET
https://github.com/tarunlalwani/ScreenCaptureAPI/blob/2646c627b4bb70e36fe2c6603acde4cee3354b39/Source%20Code/ScreenCaptureAPI/ScreenCaptureAPI/ScreenCapture.vb#L803
Private Function _EnumIEFramesDocument(ByVal wb As HTMLDocumentClass) As Collection
Dim pContainer As olelib.IOleContainer = Nothing
Dim pEnumerator As olelib.IEnumUnknown = Nothing
Dim pUnk As olelib.IUnknown = Nothing
Dim pBrowser As SHDocVW.IWebBrowser2 = Nothing
Dim pFramesDoc As Collection = New Collection
_EnumIEFramesDocument = Nothing
pContainer = wb
Dim i As Integer = 0
' Get an enumerator for the frames
If pContainer.EnumObjects(olelib.OLECONTF.OLECONTF_EMBEDDINGS, pEnumerator) = 0 Then
pContainer = Nothing
' Enumerate and refresh all the frames
Do While pEnumerator.Next(1, pUnk) = 0
On Error Resume Next
' Clear errors
Err.Clear()
' Get the IWebBrowser2 interface
pBrowser = pUnk
If Err.Number = 0 Then
pFramesDoc.Add(pBrowser.Document)
i = i + 1
End If
Loop
pEnumerator = Nothing
End If
_EnumIEFramesDocument = pFramesDoc
End Function
This is essentially a VB.NET version of the C++ code in the question below:
Accessing body (at least some data) in a iframe with IE plugin Browser Helper Object (BHO)
Now you just need to port it to VBA. The only problem you may have is finding the olelib reference; most of the rest is VBA compatible.
Once you get the collection of frame documents, find the one that belongs to your frame and use that one:
frames = _EnumIEFramesDocument(IE)
frames.Item(1).document.getElementsByTagName("A").length
I am trying to upload a .jpg file to a free online OCR site. I am using Excel VBA for this project:
Sub getOcrText()
Dim ocrAddress As String: ocrAddress = "http://www.free-online-ocr.com"
Dim picFile As String: picFile = "C:\Users\310217955\Documents\pdfdown\test.jpg"
Dim elementCollection As Variant
Dim IE As New InternetExplorerMedium
With IE
.Visible = True
.Navigate (ocrAddress)
Do While IE.Busy: DoEvents: Loop
Set elementCollection = IE.document.getElementsByName("fileUpload")
End With
IE.Quit
Set IE = Nothing
End Sub
However, when I run the code to check whether elementCollection receives any objects, I get a runtime error (automation error, unspecified error), even though the code successfully navigates to the desired webpage.
How do I overcome this error?
You need to change a couple lines.
First this one:
Dim IE As Object: Set IE = CreateObject("InternetExplorer.Application")
The second problem...
IE.Busy is not a sufficient test. Make that line the following instead:
Do While (IE.Busy Or IE.READYSTATE <> 4): DoEvents: Loop
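Because this combined check is needed after every navigation, it may be worth wrapping it in a helper that also gives up after a while instead of looping forever. The sketch below is one way to do that; WaitForLoad and its default 30-second timeout are illustrative choices, not part of the IE object model.

```vba
' Wait until the browser is idle and fully loaded.
' Returns True on success, False if timeoutSeconds elapsed first.
Function WaitForLoad(ByVal browser As Object, _
                     Optional ByVal timeoutSeconds As Long = 30) As Boolean
    Dim deadline As Date
    deadline = DateAdd("s", timeoutSeconds, Now)
    Do While browser.Busy Or browser.readyState <> 4   ' 4 = READYSTATE_COMPLETE
        DoEvents
        If Now > deadline Then Exit Function           ' WaitForLoad stays False
    Loop
    WaitForLoad = True
End Function
```

Usage would be along the lines of If Not WaitForLoad(IE) Then MsgBox "Page did not finish loading", placed directly after the .Navigate call.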