I have a VBA macro that opens IE and navigates to a webpage and executes a javascript function on the page. This javascript function accepts a SQL command as an argument that is sent to a database and an object is returned.
This works for most commands, but sometimes the command is very long and IE ends up freezing/crashing with the parentwindow.execscript or parentwindow.eval. I am not sure if the issue is related to the length of the string or some other syntax error in the command, but I am able to run the command successfully manually in console and a valid object is returned in a couple seconds.
Is there some other way to do this through XMLHTTP or maybe some more stable method to call this command through the browser? Example code of what I have is below:
dim returnedQuery as variant
dim sqlquery as string
sqlquery = VeryLongPieceofCodeToQueryDatabase
dim ieapp as object
set ieapp = new internetexplorermedium
ieapp.navigate url
call busy(ieapp)
Set returnedQuery = ieapp.document.parentWindow.execScript("javascriptfunction(" & sqlquery & ")")
If XMLHTTP is more reliable, would someone be able to direct me to where I can read up about it?
Related
I want to import the web page source code in excel what I see using View Page Source option in Chrome. But when I import it using below code, it doesn't import all content. The values that I'm looking for do not get displayed on web page.
I'm also unable to locate the element using getElementsByClassName or other methods.
Private Sub HTML_VBA_Excel()
Dim oXMLHTTP As Object
Dim sPageHTML As String
Dim sURL As String
'Change the URL before executing the code
sURL = "http://pntaconline/getPrDetails?entry=8923060"
'Extract data from website to Excel using VBA
Set oXMLHTTP = CreateObject("MSXML2.ServerXMLHTTP")
oXMLHTTP.Open "GET", sURL, False
oXMLHTTP.send
sPageHTML = oXMLHTTP.responseText
'Get webpage data into Excel
' If longer sourcecode mean, you need to save to a external text file or somewhere,
' since excel cell have some limits on storing max characters
ThisWorkbook.Sheets(1).Cells(1, 1) = sPageHTML
MsgBox "XMLHTML Fetch Completed"
End Sub
Data I want to import is IDs and Name:
So you need to understand the DOM in order to realize why this isnt loading everything.
XMLHTTP is going to load that specific resource you requested. A lot of web pages, sorry pretty much all web pages, load extra resources after the initial request is done.
If you're missing stuff, it's probably loaded on a different network request. So open up your DevTools in Chrome, make sure Network tab is recording, and watch how many network requests go in and out when you load your target page.
Essentially, this if you're using XMLHTTP, you'd have to simulate each of those to get the requests you want to scrape.
EDIT
So you're just kind of pasting the data response into Excel.
Better to create HTMLDocument variable then set the response from XMLHTTP to be the response like here: https://learn.microsoft.com/en-us/previous-versions/windows/desktop/ms762275(v=vs.85)
set xmlhttp = new ActiveXObject("Msxml2.XMLHTTP.3.0");
xmlhttp.open("GET", "http://localhost/books.xml", false);
xmlhttp.send();
Debug.print(xmlhttp.responseText);
Dim xString as String
xSring = xmlhttp.responseText
'search the xString variable
You can then split that response for the sheet or search it and extract the values in VBA memory, rather than print to the sheet.
You could also set the xString responseText as the innerHTML for a new HTMLDocument variable
Dim xHTML as HTMLDocument
Set xHTML.innertext = xString
first time post here cause for the first time I couldn't find an answer on Stack.
The program I'm fiddling with goes to an API website - gets the passport key - and uses that to load the info in XML and then parses it. The code is very long so I'm only posting the bit that results in the error.
myURL = "https://api.smartbidnet.com/project?PassportKey=" & PassportKey & "&ResultType=xml" 'This is the API URL where we use our passport key to get our data
IE.Navigate myURL 'Go to Webpage
Dim XDoc As Object
Set XDoc = CreateObject("MSXML2.DOMDocument")
XDoc.async = False
XDoc.validateOnParse = False
XDoc.Load (ThisWorkbook.Path & myURL) 'Here we will load the XML info from the web page
Set lists = XDoc.DocumentElement 'lists will be used to access XML nodes
Set Projects = lists.SelectNodes("//root/projects")
Now heres the weird part:
If I open excel by selecting "Blank Workbook" the code works just fine from the editor.
If I save the blank workbook and then try to run it or If I simply right click the desktop and open a blank sheet rather than workbook and try to run it. the last line results in
Runtime 91 error
Any help is of course greatly appreciated.
The bit of code before the above is simply navigating to the site to grab the passport key for the API and build the myURL variable.
I wrote a code to parse some Web tables.
I get some web tables into an IHTMLElementCollection using Internet Explorer with this code:
TabWeb = IE.document.getelementsbytagname("table")
Then I use a sub who gets an object containing the IHTMLElementCollection and some other data:
Private Sub TblParsing(ByVal ArrVal() As Object)
Dim WTab As mshtml.IHTMLElementCollection = ArrVal(0)
'some code
End sub
My issue is: if I simply "call" this code, it works correctly:
Call TblParsing({WTab, LiRow})
but, if I try to run it into a threadpool:
ThreadPool.QueueUserWorkItem(New WaitCallback(AddressOf TblParsing), {WTab, LiRow})
the code fails and give me multiple
System.UnauthorizedAccessException
This happens on (each of) these code rows:
Rws = WTab(RifWT("Disc")).Rows.Length
If Not IsError(WTab(6).Cells(1).innertext) Then
Ogg_W = WTab(6).Cells(1).innertext
My goal is to navigate to another web page while my sub perform parsing.
I want to clarify that:
1) I've tryed to send the entire HTML to the sub and get it into a webbrowser but it didn't work because it isn't possible to cast from System.Windows.Forms.HtmlElementCollection to mshtml.IHTMLElementCollection (or I wasn't able to do it);
2) I can't use WebRequest and similar: I'm forced to use InternetExplorer;
3) I can't use System.Windows.Forms.HtmlElementCollection because my parsing code uses Cells, Rows and so on that are unavailable (and I don't want to rewrite all my parsing code)
EDIT:
Ok, I modified my code using answer hints as below:
'This in the caller sub
Dim IE As Object = CreateObject("internetexplorer.application")
'...some code
Dim IE_Body As String = IE.document.body.innerhtml
ThreadPool.QueueUserWorkItem(New WaitCallback(AddressOf TblParsing_2), {IE_Body, LiRow})
'...some code
'This is the called sub
Private Sub TblParsing_2(ByVal ArrVal() As Object)
Dim domDoc As New mshtml.HTMLDocument
Dim domDoc2 As mshtml.IHTMLDocument2 = CType(domDoc, mshtml.IHTMLDocument2)
domDoc2.write(ArrVal(0))
Dim body As mshtml.IHTMLElement2 = CType(domDoc2.body, mshtml.IHTMLElement2)
Dim TabWeb As mshtml.IHTMLElementCollection = body.getElementsByTagName("TABLE")
'...some code
I get no errors but I'm not sure that it's all right because I tryed to use IE_Body string into webbrowser and it throws errors in the webpage (it shows a popup and I can ignore errors).
Am I using the right way to get Html from Internet Explorer into a string?
EDIT2:
I changed my code to:
Dim IE As New SHDocVw.InternetExplorer
'... some code
Dim sourceIDoc3 As mshtml.IHTMLDocument3 = CType(IE.Document, mshtml.IHTMLDocument3)
Dim html As String = sourceIDoc3.documentElement.outerHTML
ThreadPool.QueueUserWorkItem(New WaitCallback(AddressOf TblParsing_2), {html, LiRow})
'... some code
Private Sub TblParsing_2(ByVal ArrVal() As Object)
Dim domDoc As New mshtml.HTMLDocument
Dim domDoc2 As mshtml.IHTMLDocument2 = CType(domDoc, mshtml.IHTMLDocument2)
domDoc2.write(ArrVal(0))
Dim body As mshtml.IHTMLElement2 = CType(domDoc2.body, mshtml.IHTMLElement2)
Dim TabWeb As mshtml.IHTMLElementCollection = body.getElementsByTagName("TABLE")
But I get an error PopUp like (I tryed to translate it):
Title:
Web page error
Text:
Debug this page?
This page contains errors that might prevent the proper display or function properly.
If you are not testing the web page, click No.
two checkboxes
do not show this message again
Use script debugger built-in Internet Explorer
It's the same error I got trying to get Html text into a WebBrowser.
But, If I could ignore this error, I think the code could work!
While the pop is showing I get error on
Dim domDoc As New mshtml.HTMLDocument
Error text translated is:
Retrieving the COM class factory for component with CLSID {25336920-03F9-11CF-8FD0-00AA00686F13} failed due to the following error: The 8,001,010th message filter indicated that the application is busy. (Exception from HRESULT: 0x8001010A (RPC_E_SERVERCALL_RETRYLATER)).
Note that I've alredy set IE.silent = True
Edit: There was confusion as to what the OP meant by "Internet Explorer". I originally assumed that it meant the WinForm Webbrowser control; however the OP is creating the COM browser directly instead of using the .Net wrapper.
To get the browser document's defining HTML, you can cast the document against the mshtml.IHTMLDocument3 interface to expose the documentElement property.
Dim ie As New SHDocVw.InternetExplorer ' Proj COM Ref: Microsoft Internet Controls
ie.Navigate("some url")
' ... other stuff
Dim sourceIDoc3 As mshtml.IHTMLDocument3 = CType(ie.Document, mshtml.IHTMLDocument3)
Dim html As String = sourceIDoc3.documentElement.outerHTML
End Edit.
The following is based on my comment above. You use the WebBrowser.DocumentText property to create a mshtml.HTMLDocument.
Use this property when you want to manipulate the contents of an HTML page displayed in the WebBrowser control using string processing tools.
Once you extract this property as a String, there is no connection to the WebBrowser control and you can process the data in any thread you want.
Dim html As String = WebBrowser1.DocumentText
Dim domDoc As New mshtml.HTMLDocument
Dim domDoc2 As mshtml.IHTMLDocument2 = CType(domDoc, mshtml.IHTMLDocument2)
domDoc2.write(html)
Dim body As mshtml.IHTMLElement2 = CType(domDoc2.body, mshtml.IHTMLElement2)
Dim tables As mshtml.IHTMLElementCollection = body.getElementsByTagName("TABLE")
' ... do something
' cleanup COM objects
System.Runtime.InteropServices.Marshal.FinalReleaseComObject(body)
System.Runtime.InteropServices.Marshal.FinalReleaseComObject(tables)
System.Runtime.InteropServices.Marshal.FinalReleaseComObject(domDoc)
System.Runtime.InteropServices.Marshal.FinalReleaseComObject(domDoc2)
I've used this simple subroutine for loading HTML documents from the web for some time now with no problems:
Function GetSource(sURL As String) As Variant
' Purpose: To obtain the HTML text of a web page
' Receives: The URL of the web page
' Returns: The HTML text of the web page in a variant
Dim oXHTTP As Object, n As Long
Set oXHTTP = CreateObject("MSXML2.XMLHTTP")
oXHTTP.Open "GET", sURL, False
oXHTTP.send
GetSource = oXHTTP.responsetext
Set oXHTTP = Nothing
End Function
but I've run into a situation where it only loads part of a page most of the time (not always -- sometimes it loads all of the expected HTML code). If you SAVE the HTML of the page to another file on the web from a browser, the subroutine will always read it with no problem.
I'm guessing that the issue is timing -- that the dynamic page registers "done" while a script is still filling in details. Sometimes it completes in time, other times it doesn't.
Has anyone ever encountered this behavior before and surmounted it? It seems that there should be a way of capturing via the MSXML2.XMLHTTP object exactly what you'd get if went to the page and chose the save to HTML option.
If you'd like to see the behavior for yourself, here's a sample of a page that doesn't load consistently:
http://www.tiff.net/festivals/thefestival/programmes/specialpresentations/mr-turner
and here's a saved HTML file of that same page:
http://tofilmfest.ca/2014/film/fest/Mr_Turner.htm
Is there any known workaround for this?
I found a workaround that gives me what I want. I control Internet Explorer programmatically and invoke a three-second delay after I tell it to navigate to a page to enable the content to finish loading. Then I extract the HTML code by using an IHTMLElement from Microsoft's HTML library. It's not pretty, but it retrieves all of the HTML code for every page I've tried it with. If anybody has a better way accomplishing the same end, feel free to show off.
Function testbrowser() As Variant
Dim oIE As InternetExplorer
Dim hElm As IHTMLElement
Set oIE = New InternetExplorer
oIE.Height = 600
oIE.Width = 800
oIE.Visible = True
oIE.Navigate "http://www.tiff.net/festivals/thefestival/programmes/galapresentations/the-riot-club"
Call delay(3)
Set hElm = oIE.Document.all.tags("html").Item(0)
testbrowser = hElm.outerHTML
End Function
Sub delay(ByVal secs As Integer)
Dim datLimit As Date
datLimit = DateAdd("s", secs, Now())
While Now() < datLimit
Wend
End Sub
Following Alex's suggestion, here's how to do it without a brute force fixed delay:
Function GetHTML(ByVal strURL as String) As Variant
Dim oIE As InternetExplorer
Dim hElm As IHTMLElement
Set oIE = New InternetExplorer
oIE.Navigate strURL
Do While (oIE.Busy Or oIE.ReadyState <> READYSTATE_COMPLETE)
DoEvents
Loop
Set hElm = oIE.Document.all.tags("html").Item(0)
GetHTML = hElm.outerHTML
Set oIE = Nothing
Set hElm = Nothing
End Function
So I'm going round in circles trying to get this to work, I've been trying for two days and I just can't figure it out.
I have the following vb function that takes a created powershell script, and should run it in powershell. Everything works fine, until the point at which the command pipeline is invoked. At this point, no commands run.
As you can see, I have tried to add the Microsoft.Exchange.Management.PowerShell.E2010 snapin to the runspace, it didn't like that at all stating something along the lines of the snapin didnt exist (which it does), and also when I run the code as shown, no commands are recognised as valid. I even added the specific command "Add-PSSnapin" to try and load any Exchange snapins, but it states that "Add-PSSnapin" is not recognised as a valid command.
If I pause the program just before the commands are involked, I can see every command within the pipeline, in the correct format. If I copy and paste the command text in the pipeline directly into a powershell window, it runs fine.
My code is below, any suggestions welcome.
edit: I have also tried adding the line "Add-PSSnapin Ex" (with an asterisk each side of Ex - I cant figure the formatting out on this, sorry)
to try and load the Exchange PS Snapins as the first thing the script would run (opposed to setting this up in the runspace) but no luck
Private Function scriptRunner(ByVal scripttorun As String) As String
Dim initial As InitialSessionState = InitialSessionState.CreateDefault()
Dim result As String = ""
Dim lineFromScript As String = ""
Dim reader As New StreamReader(tempScript)
Dim rsConfig As RunspaceConfiguration = RunspaceConfiguration.Create()
Dim snapInException As New PSSnapInException
Dim strUserName As String = "DOMAIN\USER"
Dim strPassword As String = "PASSWORD"
Dim SecuredPSWD As New System.Security.SecureString()
For Each character As Char In strPassword
SecuredPSWD.AppendChar(character)
Next
Dim wsmConnectionInfo As WSManConnectionInfo
Dim strSystemURI As String = "http://SERVER.DOMAIN/powershell?serializationLevel=Full"
Dim strShellURI As String = "http://schemas.microsoft.com/powershell/Microsoft.Exchange"
Dim powerShellCredentials As PSCredential = New PSCredential(strUserName, SecuredPSWD)
wsmConnectionInfo = New WSManConnectionInfo(New Uri(strSystemURI), strShellURI, powerShellCredentials)
Dim runspace As Runspace = RunspaceFactory.CreateRunspace(wsmConnectionInfo)
Runspace.Open()
' runspace.RunspaceConfiguration.AddPSSnapIn("Microsoft.Exchange.Management.PowerShell.E2010", snapInException)
Dim pipeLine As Pipeline = runspace.CreatePipeline()
Dim command As Command = New Command("")
' TEST >> pipeLine.Commands.Add("Add-PSSnapin *Ex*")
Do While reader.Peek() <> -1
lineFromScript = Nothing
lineFromScript = reader.ReadLine()
pipeLine.Commands.Add(lineFromScript)
'command.Parameters.Add(lineFromScript)
'pipeLine.Commands.Add(command)
Loop
'' Run the contents of the pipeline
Dim psObjCollection As Collection(Of PSObject) = pipeLine.Invoke()
runspace.Close()
runspace.Dispose()
Return ""
End Function
I ended up working around the problem rather than fixing it.
I moved the script code into the vb.net application, and wrote each line to a file, i.e.
writer.WriteLine("Add-PSSnapin *Ex*")
Then I loaded the script through PowerShell as an application;
Dim exeStartInfo As System.Diagnostics.ProcessStartInfo
Dim exeStart As New System.Diagnostics.Process
exeStartInfo = New System.Diagnostics.ProcessStartInfo("C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe")
exeStartInfo.Arguments = ("-command work\scriptbuilder.ps1")
exeStartInfo.WorkingDirectory = "C:\ExchangeManager\"
exeStartInfo.UseShellExecute = False
exeStart.StartInfo = exeStartInfo
exeStart.Start()
exeStart.Close()
Not ideal but it got the job done.