How do I get an element ID from an object using VBA

I have looped through a web page and have a tdObj, which I declared as MSHTML.HTMLGenericElement.
I am using the code below to loop through the page:
Dim tds As MSHTML.IHTMLElementCollection
Dim trs As MSHTML.IHTMLElementCollection
Dim trObj As MSHTML.HTMLGenericElement
Dim tdObj As MSHTML.HTMLGenericElement
Set trs = ie.document.getElementsByTagName("tr")
For Each trObj In trs
Set tds = trObj.getElementsByTagName("td")
For Each tdObj In tds
If InStr(1, tdObj.innerText, NameA, 1) > 0 then
NT = 1
end if
If NT > 0 then NT = NT + 1
if NT = 6 then tdObj.click
next
next
The HTML of the object is:
<a name="gofo" class="button_link" id="gofo" type="button"
onclick="javascript:return disableLinks(this.href);"
href='javascript:submitLink(document.Form0,"gofo");'
renderer="uk.co.cls.tapestry.link.PortalLinkRenderer#2ce745">
runto
</a>
I want to click this particular tdObj, but neither tdObj.FireEvent("onclick") nor tdObj.Click works, and neither produces an error message.
I was thinking of obtaining the element ID from the tdObj, but how do I do this?
Thanks in advance.
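One possible approach, offered as a sketch rather than a confirmed fix: the ID of an MSHTML element can be read from its ID property, and the element that carries the onclick handler in the HTML above is the link, not the table cell. The snippet assumes that tdObj is the cell found by the loop in the question and that the link sits inside that cell; linkObj is an illustrative name.

```vba
' Sketch only: assumes tdObj is the cell found by the loop in the question
' and that the <a> link shown above is a child of that cell.
Dim linkObj As Object

Debug.Print tdObj.ID                       ' ID of the cell itself (may be empty)

If tdObj.getElementsByTagName("a").Length > 0 Then
    Set linkObj = tdObj.getElementsByTagName("a")(0)
    Debug.Print linkObj.ID                 ' ID of the link inside the cell
    linkObj.Click                          ' click the link rather than the cell
End If
```

If the ID is already known, ie.document.getElementById can fetch the link directly without looping over rows and cells.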

Related

Trying to interact with an existing browser window

I am trying to work with a page that's already open. I have managed to find and interact with the window, but I can't seem to click the search button. See the code below:
Dim ele As Object
Set idoc = ie.Document
Dim eles As mshtml.IHTMLElementCollection
Set eles = idoc.getElementsByTagName("button")
Set objshell = CreateObject("shell.application")
intwincnt = objshell.Windows.Count
For inTWinNo = 0 To (intwincnt - 1)
strwintitle = objshell.Windows(inTWinNo).Document.Title
If strwintitle = "Articles" Then
Set ie = objshell.Windows(inTWinNo).Document
Exit For
End If
Next
ie.getElementsByName("vin")(0).Value = "test"
Set ele = ie.Document.getElementsById("vin-button")
ele.Click
<button class="btn btn-primary" id="vin-utton"type="submit">DECODE</button>
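A minimal sketch of how attaching to an open window might look, under several assumptions: the browser is Internet Explorer, the page title is exactly "Articles", and the button ID is "vin-button" as used in the code above (the HTML shown spells it differently, so the real ID should be checked). It keeps a reference to the window object and reaches the DOM through its Document property, and it uses getElementById, since MSHTML has no getElementsById method. ClickDecodeButton, shellApp, win and pageTitle are illustrative names.

```vba
' Sketch only: assumes an open Internet Explorer window titled "Articles"
' and a button whose ID is "vin-button".
Sub ClickDecodeButton()
    Dim shellApp As Object
    Dim win As Object
    Dim ie As Object
    Dim pageTitle As String

    Set shellApp = CreateObject("Shell.Application")

    For Each win In shellApp.Windows
        ' File Explorer windows also appear here and have no Document.Title,
        ' so the title is read with errors suppressed.
        pageTitle = ""
        On Error Resume Next
        pageTitle = win.Document.Title
        On Error GoTo 0
        If pageTitle = "Articles" Then
            Set ie = win
            Exit For
        End If
    Next win

    If ie Is Nothing Then
        MsgBox "No window titled ""Articles"" was found."
        Exit Sub
    End If

    ' ie is the window object, so the DOM is reached through ie.Document.
    ie.Document.getElementsByName("vin")(0).Value = "test"
    ie.Document.getElementById("vin-button").Click
End Sub
```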

How to set a reference to a Chrome web page document using Selenium

I want to replace the htmldoc declaration from the HTML Object Library with something suitable for Selenium. I want to pass htmldoc as an argument to another subroutine. Here is the code:
Dim htmldoc As MSHTML.HTMLDocument
Dim htmldiv As Selenium.WebElement
Dim htmlul As Selenium.WebElement
Dim htmlAs As Selenium.WebElements
Dim htmlA As Selenium.WebElement
Dim TableName As String
URL = "https://www.whoscored.com/Statistics"
sel.Start "Chrome"
sel.Get URL
'set htmldoc= sel.document..... something....
Set htmldiv = sel.FindElementById("top-player-stats")
Set htmlul = sel.FindElementById("top-player-stats-options")
Set htmlAs = htmlul.FindElementsByTag("a")
For Each htmlA In htmlAs
TableName = htmlA.attribute("href")
htmlA.Click
GoToTable htmldoc, TableName
Next htmlA
End Sub
If you're trying to capture the entire HTML source, one option is to use
sel.PageSource
That might not behave as you expect, though, because of limitations in how the page source is generated (source: https://www.selenium.dev/selenium/docs/api/java/org/openqa/selenium/WebDriver.html#getPageSource()).
You could also try these after the page has fully loaded:
sel.ExecuteScript("return document.documentElement.innerHTML")
sel.ExecuteScript("return document.body.innerHTML")
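If the other subroutine has to keep receiving an MSHTML.HTMLDocument, one option is to copy the page source into such a document. This is only a sketch: it assumes the SeleniumBasic type library and the Microsoft HTML Object Library are both referenced, and PassPageToParser is an illustrative name. The copy is a static snapshot, so it would need refreshing after every click.

```vba
' Sketch only: assumes references to the SeleniumBasic type library
' and the Microsoft HTML Object Library.
Sub PassPageToParser()
    Dim sel As New Selenium.WebDriver
    Dim htmldoc As New MSHTML.HTMLDocument

    sel.Start "chrome"
    sel.Get "https://www.whoscored.com/Statistics"

    ' Copy the current markup into an MSHTML document so that routines
    ' expecting an HTMLDocument argument can keep their signature.
    htmldoc.body.innerHTML = sel.PageSource

    ' Simple check that the snapshot holds parsed elements.
    Debug.Print htmldoc.getElementsByTagName("a").Length

    sel.Quit
End Sub
```

Passing the driver itself (sel) to the subroutine and using its FindElement methods there is the other obvious design, and it avoids stale snapshots.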

Error when changing IE automation code to XML

I recently started working with XML automation and after changing some basic IE automation code over, I seem to be getting an error. Here's the HTML:
<tbody>
<tr class="group-2 first">
<td class="date-col">
<a href="/stats/matches/mapstatsid/48606/teamone-vs-merciless">
<div class="time" data-time-format="d/M/yy" data-unix="1498593600000">27/6/17</div>
</a>
</td>
......SOME MORE HTML HERE......
</tr>
......SOME MORE HTML HERE......
</tbody>
And here's the code I'm using in Excel VBA:
Sub readData()
Dim XMLPage As New MSXML2.XMLHTTP60
Dim html As New MSHTML.HTMLDocument
XMLPage.Open "GET", "https://www.hltv.org/stats/matches", False
XMLPage.send
If XMLPage.Status <> 200 Then MsgBox XMLPage.statusText
html.body.innerHTML = XMLPage.responseText
For Each profile In html.getElementsByTagName("tbody")(0).Children
Debug.Print profile.getElementsByClassName("date-col")(0).getElementsByTagName("a")(0).getAttribute("href") 'Run time error '438' here
Next
End Sub
I'm getting run-time error '438' on the Debug.Print line. It seems to happen when getting the class, but I'm unsure why. It works fine if I use this, for example:
Debug.Print profile.innertext
Worked for me:
Sub readData()
Dim XMLPage As New MSXML2.XMLHTTP60
Dim html As New MSHTML.HTMLDocument, links, a, i
XMLPage.Open "GET", "https://www.hltv.org/stats/matches", False
XMLPage.send
If XMLPage.Status <> 200 Then MsgBox XMLPage.statusText
html.body.innerHTML = XMLPage.responseText
Set links = html.querySelectorAll("td.date-col > a")
Debug.Print links.Length
For i = 0 To links.Length - 1
Debug.Print links(i).href
Next
Set links = Nothing
Set html = Nothing
End Sub
FYI: when I used For Each to loop over the links collection, Excel would reliably crash, so I'd stay with the loop shown.
profile refers to a row, and profile.cells(0) will refer to the first column in that row. So try...
profile.cells(0).getElementsByTagName("a")(0).getAttribute("href")
Also, profile should be declared as HTMLTableRow.
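Put together, that suggestion might look like the sketch below. It assumes html has been loaded as in the question's code and that every child of the first tbody is a table row; links is an illustrative name, and the Length check is an added guard for rows without a link.

```vba
' Sketch only: assumes html was filled from XMLPage.responseText as in the
' question, and that each child of the first tbody is a table row.
Dim profile As MSHTML.HTMLTableRow
Dim links As Object

For Each profile In html.getElementsByTagName("tbody")(0).Children
    ' Cells(0) is the first column of the row (the date-col cell here).
    Set links = profile.Cells(0).getElementsByTagName("a")
    If links.Length > 0 Then Debug.Print links(0).getAttribute("href")
Next profile
```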
The URL you are using isn't serving valid XML, but it's recoverable with some simple regex replacements. Once we have some valid XML, we can load that into a DOM document and use XPath to select the nodes as required:
Option Explicit
'Add references to:
' - MSXML v3
' - Microsoft VBScript Regular Expressions 5.5
Sub test()
Const START_MARKER As String = "<table class=""stats-table matches-table"">"
Const END_MARKER As String = "</table>"
With New MSXML2.XMLHTTP
.Open "GET", "https://www.hltv.org/stats/matches", False
.send
If .Status = 200 Then
'The HTML isn't valid XHTML, so we can't just use the http.XMLResponse DOMDocument
'Let's extract the HTML table
Dim tableStart As Long
tableStart = InStr(.responseText, START_MARKER)
Dim tableEnd As Long
tableEnd = InStr(tableStart, .responseText, END_MARKER)
Dim tableHTML As String
tableHTML = Mid$(.responseText, tableStart, tableEnd - tableStart + Len(END_MARKER))
'The HTML table has invalid img tags (let's add a closing tag with some regex)
With New RegExp
.Global = True
.Pattern = "(\<img [\W\w]*?)"">"
Dim tableXML As String
tableXML = .Replace(tableHTML, "$1"" />")
End With
'And load an XML document from the cleaned up HTML fragment
Dim doc As MSXML2.DOMDocument
Set doc = New MSXML2.DOMDocument
doc.LoadXML tableXML
End If
End With
If Not doc Is Nothing Then
'Use XPath to select the nodes we need
Dim nodes As MSXML2.IXMLDOMSelection
Set nodes = doc.SelectNodes("//td[@class='date-col']/a/@href")
'Enumerate the URLs
Dim node As IXMLDOMAttribute
For Each node In nodes
Debug.Print node.nodeTypedValue
Next node
End If
End Sub
Output:
/stats/matches/mapstatsid/48606/teamone-vs-merciless
/stats/matches/mapstatsid/48607/merciless-vs-teamone
/stats/matches/mapstatsid/48608/merciless-vs-teamone
/stats/matches/mapstatsid/48600/wysix-vs-fnatic-academy
/stats/matches/mapstatsid/48602/skitlite-vs-nexus
/stats/matches/mapstatsid/48604/extatus-vs-forcebuy
/stats/matches/mapstatsid/48605/extatus-vs-forcebuy
/stats/matches/mapstatsid/48599/planetkey-vs-gatekeepers
/stats/matches/mapstatsid/48603/gatekeepers-vs-planetkey
/stats/matches/mapstatsid/48595/wysix-vs-gambit
/stats/matches/mapstatsid/48596/kinguin-vs-playing-ducks
/stats/matches/mapstatsid/48597/spirit-academy-vs-tgfirestorm
/stats/matches/mapstatsid/48601/spirit-academy-vs-tgfirestorm
/stats/matches/mapstatsid/48593/fnatic-academy-vs-gambit
/stats/matches/mapstatsid/48594/alternate-attax-vs-nexus
/stats/matches/mapstatsid/48590/pro100-vs-playing-ducks
/stats/matches/mapstatsid/48583/extatus-vs-ex-fury
/stats/matches/mapstatsid/48589/extatus-vs-ex-fury
/stats/matches/mapstatsid/48584/onlinerol-vs-forcebuy
/stats/matches/mapstatsid/48591/forcebuy-vs-onlinerol
/stats/matches/mapstatsid/48581/epg-vs-veni-vidi-vici
/stats/matches/mapstatsid/48588/epg-vs-veni-vidi-vici
/stats/matches/mapstatsid/48592/veni-vidi-vici-vs-epg
/stats/matches/mapstatsid/48582/log-vs-gatekeepers
/stats/matches/mapstatsid/48586/gatekeepers-vs-log
/stats/matches/mapstatsid/48580/spraynpray-vs-epg
/stats/matches/mapstatsid/48579/quantum-bellator-fire-vs-spraynpray
/stats/matches/mapstatsid/48571/noxide-vs-masterminds
/stats/matches/mapstatsid/48572/athletico-vs-legacy
/stats/matches/mapstatsid/48578/node-vs-avant
/stats/matches/mapstatsid/48573/funky-monkeys-vs-grayhound
/stats/matches/mapstatsid/48574/grayhound-vs-funky-monkeys
/stats/matches/mapstatsid/48575/hegemonyperson-vs-eclipseo
/stats/matches/mapstatsid/48577/eclipseo-vs-hegemonyperson
/stats/matches/mapstatsid/48566/masterminds-vs-tainted-black
/stats/matches/mapstatsid/48562/grayhound-vs-legacy
/stats/matches/mapstatsid/48563/noxide-vs-riotous-raccoons
/stats/matches/mapstatsid/48564/avant-vs-dark-sided
/stats/matches/mapstatsid/48565/avant-vs-dark-sided
/stats/matches/mapstatsid/48567/eclipseo-vs-uya
/stats/matches/mapstatsid/48568/uya-vs-eclipseo
/stats/matches/mapstatsid/48560/uya-vs-new4
/stats/matches/mapstatsid/48561/new4-vs-uya
/stats/matches/mapstatsid/48559/jaguar-sa-vs-miami-flamingos
/stats/matches/mapstatsid/48558/spartak-vs-binary-dragons
/stats/matches/mapstatsid/48557/kungar-vs-spartak
/stats/matches/mapstatsid/48556/igamecom-vs-fragsters
/stats/matches/mapstatsid/48554/nordic-warthogs-vs-aligon
/stats/matches/mapstatsid/48555/binary-dragons-vs-kungar
/stats/matches/mapstatsid/48550/havu-vs-rogue-academy
Looking at the MSHTML.HTMLDocument reference, there is no getElementsByClassName method.
You will need to loop through each row in the tbody you are selecting, get the first td in that row, get the first link in that td, and read its href attribute. Alternatively, you could compare the class attribute of the td, but since it is the first element in the row there is no need to do that.

Setting a web table scraper by Msxml2.ServerXMLHTTP.6.0 on Excel VBA

I need to build a web data scraper that does the following:
Log in to the site: user, password, click the login button.
Click a second button.
Wait for the page to load; this page holds the table in question. The table is a call log that adds new content dynamically, so it is always refreshing.
I want to exclude a form from the table content and limit the rows pasted to Excel.
I got it working with InternetExplorer.Application code, but it is very slow, so I need to switch to MSXML2.XMLHTTP code.
Working InternetExplorer.Application Version:
Sub extractTablesData()
'we define the essential variables
Dim IE As Object, obj As Object
Dim r As Integer, c As Integer, t As Integer
Dim elemCollection As Object
'add the "Microsoft Internet Controls" reference in your VBA Project indirectly
Set IE = CreateObject("InternetExplorer.Application")
With IE
.Silent = True
.Visible = True
.navigate ("https://www.clickphone.ro")
' we ensure that the web page downloads completely before we fill the form automatically
While IE.readyState <> 4
DoEvents
Wend
Application.Wait Now + TimeValue("00:00:03")
Set HTMLDoc = IE.document
HTMLDoc.all.user.Value = "user or email" 'Enter your email/user id here
HTMLDoc.all.pass.Value = "xXXxXXXxxXXXxx" 'Enter your password here
'Login Button Click
With IE.document
Set elems = .getElementsByTagName("a")
For Each e In elems
If (e.getAttribute("class") = "orange_button") Then
e.Click
Exit For
End If
Next e
End With
'Needed Table page Button Click https://www.clickphone.ro/account/istoric_apel_in.html
While IE.readyState <> 4
DoEvents
Wend
Set iedoc = IE.document
Set elems = iedoc.getElementsByClassName("black")(12)
elems.Click
' again ensuring that the web page loads completely before we start scraping data
While IE.readyState <> 4
DoEvents
Wend
Application.Wait Now + TimeValue("00:00:05")
Set iedoc = IE.document
'Clearing any unnecessary or old data in Sheet1
ThisWorkbook.Sheets("Sheet1").Range("A1:K1000").ClearContents
'Scraping data and pasting it to Sheet1
Set elemCollection = IE.document.getElementsByTagName("table")
For t = 0 To (elemCollection.Length - 1)
For r = 0 To (elemCollection(t).Rows.Length - 1)
For c = 0 To (elemCollection(t).Rows(r).Cells.Length - 1)
ThisWorkbook.Worksheets(1).Cells(r + 1, c + 1) = elemCollection(t).Rows(r).Cells(c).innerText
Next c
Next r
Next t
End With
' cleaning up memory
Set IE = Nothing
End Sub
This is my attempt with MSXML2.XMLHTTP:
Option Explicit
'reference to Microsoft Internet Controls
'reference to Microsoft HTML Object Library
Sub Web_Table_Option_One()
Dim xml As Object
Dim html As Object
Dim objTable As Object
Dim result As String
Dim lRow As Long
Dim lngTable As Long
Dim lngRow As Long
Dim lngCol As Long
Dim ActRw As Long
Set xml = CreateObject("MSXML2.XMLHTTP.6.0")
Set html = CreateObject("htmlFile")
With xml
.Open "POST", "https://www.clickphone.ro/login.html", False
.setRequestHeader "Content-type", "application/x-www-form-urlencoded"
.send "userName=USER&password=XXXXxxxxXxxxxXXX"
.Open "GET", "https://www.clickphone.ro/account/istoric_apel_in.html", False
.setRequestHeader "Content-type", "text/xml"
.send
End With
html.body.innerHTML = xml.responseText
Set objTable = html.getElementsByTagName("table")
For lngTable = 0 To objTable.Length - 1
For lngRow = 0 To objTable(lngTable).Rows.Length - 1
For lngCol = 0 To objTable(lngTable).Rows(lngRow).Cells.Length - 1
ThisWorkbook.Sheets("Sheet2").Cells(ActRw + lngRow + 1, lngCol + 1) = objTable(lngTable).Rows(lngRow).Cells(lngCol).innerText
Next lngCol
Next lngRow
ActRw = ActRw + objTable(lngTable).Rows.Length + 1
Next lngTable
End Sub
HTML source code:
For user,pass,login button:
<form action="/login.html" id="toploginform" name="toploginform" method="post">
<script>
function processLoginForm(){
with (document.toploginform) {
if (user.value=="Email"){alert('Email/Parola incorecte!'); return false}
document.getElementById('toploginform').submit();
}
}
</script>
<fieldset>
<input name="userlogin" type="hidden" id="userlogin" value="true" />
<span class="text">
<input name="user" type="text" onFocus="if(this.value=='Email'){this.value=''}" onBlur="if(this.value==''){this.value='Email'}" value="Email">
</span> <span class="text">
<input name="pass" type="password" onFocus="if(this.value=='Password'){this.value=''}" onBlur="if(this.value==''){this.value='Password'}" value="Password">
</span>
<input name="authcode" type="hidden" id="authcode" value="false" />
<span>Login</span>
<span class="links">Am uitat parola<br/>
<input class="css-checkbox" id="checkbox2" type="checkbox" name="rememberpass" value="da" />
<label for="checkbox2" name="checkbox2_lbl" class="css-label lite-orange-check">Retin datele?</label>
</span>
</fieldset>
</form>
Table page button:
<br /> <img src="/images/sageata_orange.gif" width="7" height="8" /> <a class="black" href="/account/istoric_apel_in.html">Apeluri primite</a>
Table source code:
<table class="TabelDate" cellspacing="0">
<thead>
<tr>
<th width="130">Data</th>
<th>Sursa</th>
<th>Destinatie</th>
<th>Durata</th>
<th class="ultima">Status</th>
</tr>
</thead>
<tr class="u"> <td class="prima">19-03-2017 17:31:16</td><td><font color="green"><form name="form24-1489937476.41719" method="post" action="">0720145931 <a class="TipFB" href="#"><span class="tip"><span class="tipTitle">Suna inapoi la 0720145931</span><span class="tipBody">Click si serverul te va suna gratuit pe numarul tau 0371780434.<br />Dupa ce raspunzi se formeaza automat numarul 0720145931.</span><span class="tipArrow"></span></span><input type="image" name="btn_opentextbox" src="/images/phone_small.gif" value="Submit" /></a> <input name="numartel" type="hidden" id="numartel" value="0720145931" /></form></font></td><td align="center"><font color="green">0371780444</font></td><td align="center"><font color="green">00:00:07</font></td>
<td class="ultima" align="center"><font color="green">Apel preluat</font></td></tr> <tr class="gri"> <td class="prima">19-03-2017 17:30:48</td><td><font color="green"><form name="form24-1489937448.41715" method="post" action="">0728409617 <a class="TipFB" href="#"><span class="tip"><span class="tipTitle">Suna inapoi la 0728409617</span><span class="tipBody">Click si serverul te va suna gratuit pe numarul tau 0371780434.<br />Dupa ce raspunzi se formeaza automat numarul 0728409617.</span><span class="tipArrow"></span></span><input type="image" name="btn_opentextbox" src="/images/phone_small.gif" value="Submit" /></a> <input name="numartel" type="hidden" id="numartel" value="0728409617" /></form></font></td><td align="center"><font color="green">0371780655</font></td><td align="center"><font color="green">00:00:07</font></td>
I managed to partially resolve my problem. Now I can log in and retrieve the table I need with XMLHTTP. I'll post the working code here so everyone can use it (I don't take any credit for this code; I wrote it with help from different forums).
Option Explicit
'reference to Microsoft Internet Controls
'reference to Microsoft HTML Object Library
Sub CallLog()
Dim xml As Object
Dim html As Object
Dim objTable As Object
Dim result As String
Dim lRow As Long
Dim lngTable As Long
Dim lngRow As Long
Dim lngCol As Long
Dim ActRw As Long
Set xml = CreateObject("MSXML2.XMLHTTP.6.0")
Set html = CreateObject("htmlFile")
With xml
.Open "POST", "https://www.XXXXXX.xx/login.html", False
.setRequestHeader "Content-type", "application/x-www-form-urlencoded" 'send appropriate Headers
.send "userlogin=true&user=USERNAME&pass=PASSWORD&authcode=false" ' send login info
'MsgBox xml.responseText
.Open "GET", "https://www.XXXXXX.xx/account/callLog.html", False
.setRequestHeader "Content-type", "text/xml"
.send
End With
html.body.innerHTML = xml.responseText
Set objTable = html.getElementsByTagName("table")
For lngTable = 0 To objTable.Length - 1
For lngRow = 0 To objTable(lngTable).Rows.Length - 1
For lngCol = 0 To objTable(lngTable).Rows(lngRow).Cells.Length - 1
ThisWorkbook.Sheets("Sheet2").Cells(ActRw + lngRow + 1, lngCol + 1) = objTable(lngTable).Rows(lngRow).Cells(lngCol).innerText
Next lngCol
Next lngRow
ActRw = ActRw + objTable(lngTable).Rows.Length + 1
Next lngTable
End Sub
Now I'm left with two problems:
How can I get the child "table" from the parent "table" (the table I'm after is inside a bigger table, see the source code below)? I want to get only the first row, but excluding a "form" from the row (it's an href link).
Source Code
How can I get this continuously? The table is dynamic: it updates every time someone calls me, so the first row changes continuously.
Version 2.0 of my working code:
Option Explicit
'reference to Microsoft Internet Controls
'reference to Microsoft HTML Object Library
Sub CallLog()
Dim xml As Object
Dim html As Object
Dim objTable As Object
Dim result As String
Dim lRow As Long
Dim lngTable As Long
Dim lngRow As Long
Dim lngCol As Long
Dim ActRw As Long
Set xml = CreateObject("MSXML2.XMLHTTP.6.0")
Set html = CreateObject("htmlFile")
With xml
.Open "POST", "https://www.XXXXXX.xx/login.html", False
.setRequestHeader "Content-type", "application/x-www-form-urlencoded" 'send appropriate Headers
.send "userlogin=true&user=USERNAME&pass=PASSWORD&authcode=false" ' send login info
'MsgBox xml.responseText
.Open "GET", "https://www.XXXXXX.xx/account/callLog.html", False
.setRequestHeader "Content-type", "text/xml"
.send
End With
html.body.innerHTML = xml.responseText
Set objTable = html.getElementsByTagName("table")
ThisWorkbook.Sheets("LogClickPhone").Range("A2") = objTable(1).Rows(1).Cells(0).innerText
ThisWorkbook.Sheets("LogClickPhone").Range("B2") = objTable(1).Rows(1).Cells(1).innerText
ThisWorkbook.Sheets("LogClickPhone").Range("C2") = objTable(1).Rows(1).Cells(2).innerText
ThisWorkbook.Sheets("LogClickPhone").Range("D2") = objTable(1).Rows(1).Cells(3).innerText
ThisWorkbook.Sheets("LogClickPhone").Range("E2") = objTable(1).Rows(1).Cells(4).innerText
End Sub
I managed to get only the row I need, but it's very slow: it takes 38.5 seconds to complete. I think it would be better to use MSXML2.DOMDocument.6.0 to get the text I need, but I don't know how to do that.
Question:
How can I automate this code so that it runs every 60 seconds or so?
Thanks
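For the scheduling part, a common pattern in Excel VBA is a macro that reschedules itself with Application.OnTime. The sketch below assumes the CallLog macro above lives in a standard module of the same workbook; NextRun, StartCallLogTimer, RunCallLog and StopCallLogTimer are illustrative names. The NextRun declaration belongs at the top of the module, before any procedures.

```vba
' Sketch only: assumes CallLog (above) is in a standard module.
Private NextRun As Date

Sub StartCallLogTimer()
    ' Schedule the next run 60 seconds from now.
    NextRun = Now + TimeSerial(0, 0, 60)
    Application.OnTime EarliestTime:=NextRun, Procedure:="RunCallLog"
End Sub

Sub RunCallLog()
    CallLog
    StartCallLogTimer          ' reschedule after each run
End Sub

Sub StopCallLogTimer()
    ' Cancelling needs the exact time that was scheduled.
    On Error Resume Next       ' ignore the error raised if nothing is pending
    Application.OnTime EarliestTime:=NextRun, Procedure:="RunCallLog", Schedule:=False
    On Error GoTo 0
End Sub
```

Run StartCallLogTimer once to begin and StopCallLogTimer before closing the workbook. Because each run currently takes longer than half a minute, the effective interval is the run time plus 60 seconds.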

Retrieving data from the web using vba

I've just started using HTML. I'm reasonably capable in VBA but am having some problems linking the two.
I have passed a registration to a website and am trying to get the results.
The code used so far:
Dim HTMLDoc As HTMLDocument
Dim MyBrowser As InternetExplorer
Sub GetVehicleDetails()
Dim MyHTML_Element As IHTMLElement
Dim MyURL As String
Dim x As Integer
On Error GoTo Err_Clear
MyURL = "http://www.1stchoice.co.uk/find-a-part"
x = 0
Set MyBrowser = New InternetExplorer
MyBrowser.Silent = True
MyBrowser.navigate MyURL
MyBrowser.Visible = True
Do
Loop Until MyBrowser.readyState = READYSTATE_COMPLETE
Set HTMLDoc = MyBrowser.document
HTMLDoc.all.license_plate.Value = "LV11VYT"
For Each MyHTML_Element In HTMLDoc.getElementsByTagName("button") '("input")
'Get 2nd button
If MyHTML_Element.Title = "Continue" Then 'MyHTML_Element.Click: Exit For
x = x + 1
If x = 2 Then
MyHTML_Element.Click
End If
End If
Next
Err_Clear:
If Err <> 0 Then
Err.Clear
Resume Next
End If
End Sub
Now I need to wait until the page has refreshed and then get the result, but I'm not certain how to pull the result out.
The source code is:
<div id="block_subheader" class="block_editable block_wysiwyg">
<p>Almost there! <strong>TELL US</strong> which parts you need - <strong>ADD </strong>your contact details & receive <strong>No Obligation Quotes</strong><span style="font-weight: normal;"> to compare & </span><span style="font-weight: normal;"><strong>Save ££'s!</strong></span></p>
</div>
<div class="clear"></div>
<form id="step3" action="/find-a-part/step-3" method="post" enctype="multipart/form-data">
<div class="clearfix">
<h2>RENAULT MEGANE (X95) DYNAMIQUE TOMTOM DCI ECO 3 DOOR COUPE 1461cc (2011) DIESEL</h2>
<p>Not quite the vehicle you're searching for? Click here to specify the vehicle exactly</p>
</div>
I'm trying to get the Renault Megane details.
Can anyone help, please?
OK, I have got past this part but have run into another problem: when the page changes after the button is clicked, I need to update the HTML document to the new page, because when I use it in the code it pulls up the old source code.
I can get it to work, but only with a message box popping up to show the browser name.
Any suggestions?
Dim HTMLDoc As HTMLDocument
Dim MyBrowser As InternetExplorer
Sub GetVehicleDetails2()
Dim MyHTML_Element As IHTMLElement
Dim HTMLDoc As HTMLDocument, Doc As HTMLDocument
Dim MyURL As String, Vehicle As String
Dim x As Integer, y As Integer
On Error GoTo Err_Clear
MyURL = "http://www.1stchoice.co.uk/find-a-part"
x = 0
'open new explorer
Set MyBrowser = New InternetExplorer
MyBrowser.Silent = True
'navigate to page
MyBrowser.navigate MyURL
MyBrowser.Visible = True
'wait until ready
Do While MyBrowser.Busy Or _
MyBrowser.readyState <> 4
DoEvents
Loop
Do
Loop Until MyBrowser.readyState = READYSTATE_COMPLETE
Set HTMLDoc = MyBrowser.document
'enter registration in text box
HTMLDoc.all.license_plate.Value = "LV11VYT"
'click continue button
Set MyHTML_Element = HTMLDoc.getElementsByTagName("button")(1)
MyHTML_Element.Click
Set HTMLDoc = Nothing
'wait until page updated
Set Doc = MyBrowser.document
'Application.Wait (Now() + "00:00:05")
'does not work if you take this out
MsgBox MyBrowser.FullName
'find text returned with vehicle details
For Each MyHTML_Element In Doc.getElementsByTagName("form")
If MyHTML_Element.ID = "step3" Then
Vehicle = MyHTML_Element.innerText
MsgBox Vehicle
End If
Next
'close browser down
'MyBrowser.Quit
Err_Clear:
If Err <> 0 Then
Err.Clear
Resume Next
End If
End Sub
Using 2003 or 2007. I tried web queries, but I can't pass a value and use the Continue button.
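The message box probably helps only because it gives the new page time to load. A sketch of an explicit wait is shown below, assuming MyBrowser is the InternetExplorer object from the code above and that the result page contains the form with ID "step3" shown in the source. WaitForElement is an illustrative helper, not a built-in, and the 15-second timeout is an arbitrary choice.

```vba
' Sketch only: polls the browser until an element with the given ID exists
' or the timeout expires. WaitForElement is an illustrative name.
Function WaitForElement(browser As Object, elementId As String, _
                        Optional timeoutSeconds As Long = 15) As Object
    Dim deadline As Date
    Dim found As Object

    deadline = Now + TimeSerial(0, 0, timeoutSeconds)
    Do
        DoEvents                               ' let the browser keep loading
        If Not browser.Busy And browser.readyState = 4 Then
            On Error Resume Next               ' the document may be mid-navigation
            Set found = browser.document.getElementById(elementId)
            On Error GoTo 0
        End If
    Loop While found Is Nothing And Now < deadline

    Set WaitForElement = found                 ' Nothing if the timeout expired
End Function

' Possible use after MyHTML_Element.Click:
'   Dim resultForm As Object
'   Set resultForm = WaitForElement(MyBrowser, "step3")
'   If Not resultForm Is Nothing Then MsgBox resultForm.innerText
```

Reading MyBrowser.document again inside the loop, rather than reusing the earlier HTMLDoc variable, is what picks up the new page.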
Without wanting to start an argument over extracting an element from HTML with regex (vs. a parser), regex would be an easy way to extract the element you need, as it is well defined and you only need that one element.
You could do something like the following. I also provide an alternative that uses only InStr; it works for your example, but if many results are returned at once, or the syntax changes, regex would be more flexible:
Sub blah()
Dim testStr As String
'test string you provided in the Question -> substitute it for your HTML return
testStr = ActiveSheet.Cells(1, 1).Value
'Method 1: Use a simple InStr (fine for the example you provided, but if the bits you need to search for are more complicated you may need to use Regex instead)
Dim startLocation As Long, endLocation As Long
Dim extractedText As String
startLocation = InStr(1, testStr, "<h2>", vbTextCompare)
If Not startLocation > 0 Then
Exit Sub 'or move to next or whatever
Else
endLocation = InStr(startLocation, testStr, "</h2>", vbTextCompare)
extractedText = Mid(testStr, startLocation + 4, endLocation - startLocation - 4)
Debug.Print "Basic InStr method: "; extractedText
End If
'Method 2: Use Regex
'more flexible -> reference a Regex engine.
'This example uses Microsoft VBScript Regular Expressions 5.5
'That engine uses the same syntax as MS JavaScript regex
'See http://msdn.microsoft.com/en-us/library/1400241x.aspx for syntax
Dim regex As RegExp
Dim match As match
Set regex = New RegExp
With regex
.Pattern = "(?:<h2>)([\s\S]*?)(?=</h2>)"
'NB this regex engine does not support lookbehinds :-(
'so we have to extract the submatched group for what we want
'(vs. just using Match.Value)
.IgnoreCase = True
.MultiLine = True
For Each match In .Execute(testStr)
Debug.Print "Regex match: "; match.SubMatches.Item(0)
Next match
End With
End Sub
Output is:
Basic InStr method: RENAULT MEGANE (X95) DYNAMIQUE TOMTOM DCI ECO 3 DOOR COUPE 1461cc (2011) DIESEL
Regex match: RENAULT MEGANE (X95) DYNAMIQUE TOMTOM DCI ECO 3 DOOR COUPE 1461cc (2011) DIESEL