I have following HTML code, that I want to retrieve data from:
<div class="span4">
<div>
<label for="Game_type">Portal Games</label>
XXX
</div>
<div>
<label for="Game_Reference">Game reference</label>
22130903
</div>
<div>
<label for="Release_Date">Release Date</label>
2015-07-13
</div>
<div>
<label for="Prise">Prise</label>
USD 90,00
</div>
<div>
<label for="Game_Rank">Game Rank</label>
4
</div>
</div>
How I am able to get all those label values/at least one value into MsgBox?. (Later I will input them into Excel myself)
I have tried using following code to get first value:
Dim IE As Object
Set IE = CreateObject("INTERNETEXPLORER.APPLICATION")
'page address is stated in code
IE.navigate "page name"
IE.Visible = True
While IE.Busy
'Wait until IE is busy and loading page
DoEvents
Wend
Set gtype = IE.Document.getElementsByClassName("span4")(0).getElementsById("Game_type")
GtypeValue = gtype.Value
MsgBox (GtypeValue)
End Sub
I received run-time error "91:"
Object variable or With Block variable not set.
150904
Hopefully last one question, regarding this topic.Default code looks like
strCont = objIE.Document.getElementsByClassName("span4")(0).innerHTML
But I want to have a variable instead of "span4", in example Dim1= "span4"
I state following:
strCont = "objIE.Document.getElementsByClassName(" & Chr(34) & Dim1 & Chr(34) & ")(0).innerHTML"
It does not work, empty value in MsgBox. How can I make sure that this sting will be counted as exact code to be executed later in step:
Set objMatches = .Execute(strCont)
Why not to try regex for parsing?
Sub MsgGameType()
Dim objIE As Object
Dim strCont As String
Dim objMatches As Object
Dim objMatch As Object
Set objIE = CreateObject("InternetExplorer.Application")
'page address is stated in code
objIE.Navigate "page name"
objIE.Visible = True
Do While objIE.Busy Or Not objIE.readyState = 4
DoEvents
Loop
Do Until objIE.document.readyState = "complete"
DoEvents
Loop
strCont = objIE.document.getElementsByClassName("span4")(0).innerHtml
With CreateObject("VBScript.RegExp")
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = "<div>\s*<label for="".*?"">(.*?)</label>\s*(.+?)\s*?</div>"
Set objMatches = .Execute(strCont)
For Each objMatch In objMatches
MsgBox objMatch.SubMatches(0) & " = " & objMatch.SubMatches(1)
Next
End With
End Sub
See XHTML parsing with RegExp disclaimer.
Related
I would like to receive the following data for each player:
Purchase price (this value does not exist for each player!),
lowest price,
maximum price.
With my current coding I get as innertext all 3 values. But these are not separated! How can I evaluate these values individually?
If the purchase price does not exist, the placeholder "not found" should be output. How can I do this without an individual id or classname?
HTML structure is as follows:
<div class="historyPrice">
<div>
<strong>6,2 Mio</strong>
<span>Gekauft</span>
</div>
<div>
<strong>0,5 Mio</strong>
<span>Tiefstwert</span>
</div>
<div>
<strong>9,4 Mio</strong>
<span>Höchstwert</span>
</div>
</div>
It´s the same structure without "purchase Price" (in german "gekauft"). But the first "div" is missing.
<div class="historyPrice">
<div>
<strong>0,5 Mio</strong>
<span>Tiefstwert</span>
</div>
<div>
<strong>9,4 Mio</strong>
<span>Höchstwert</span>
</div>
</div>
My coding is as follows:
Sub PlayerValues()
Dim ie As New SHDocVw.InternetExplorer
Dim HTMLdoc As MSHTML.HTMLDocument
Dim HTMLPlayerValues As MSHTML.IHTMLElementCollection
Dim HTMLPlayerValue As Object
Dim URL1 As String
Dim currentRow As Long
URL1 = ws.Cells(currentRow, 11).Value
ie.Visible = False
ie.Navigate URL1
lastRow = ws.Cells(Rows.Count, 10).End(xlUp).Row
For currentRow = 7 To lastRow
Do While ie.ReadyState <> READYSTATE_COMPLETE
Loop
Application.Wait (Now + TimeValue("0:00:3"))
Set HTMLdoc = ie.Document
Set HTMLPlayerValues = HTMLdoc.getElementsByClassName("historyPrice")
For Each HTMLPlayerValue In HTMLPlayerValues
Debug.Print HTMLPlayerValue.innerText
Debug.Print "----------"
Next HTMLPlayerValue
Next currentRow
ie.Quit
End Sub
I am currently trying to click on a specific img button on a site that does not have an Id so I am trying to click based on the alt. Below is from the DOM Explorer, the alt is in Japanese characters.
<a href="/e-navi/members/statement/index.xhtml?tabNo=0">
<img width="112" height="23" alt="翌月以降" src="https://image.card.jp.rakuten-static.com/r-enavi/WebImages/enavi/common/tab05_after_o.gif">
</a>
The code I am using to try and click the link is:
pUnicodeString = ChrW(32716) & ChrW(26376) & ChrW(20197) & ChrW(38477)
Set tags = ie.document.getElementsByTagName("a")
For Each tagx In tags
If tagx.getAttribute("alt") = pUnicodeString Then
tagx.Click
End If
Next
NOTE: I have also tried it with Set tags = ie.document.getElementsByTagName("img") as well.
When running the code I get the "Permission Denied" error on If tagx.getAttribute("alt") = pUnicodeString Then.
I have been searching on many sites and have tried many peoples past examples but none worked. Below is my full code for reference.
Sub csvfetch()
Dim ie As SHDocVw.InternetExplorer
Dim doc As MSHTML.HTMLDocument
Dim div As HTMLDivElement
Dim url As String
Dim form As Variant
Dim button As Variant
Dim tags As Object
Dim tagx As Object
url = "https://www.rakuten-card.co.jp/e-navi/?l-id=corp_de_enavi_index_001"
Set ie = New SHDocVw.InternetExplorer
With ie
.Visible = True
.navigate url
While .Busy
DoEvents
Wend
End With
While ie.Busy
DoEvents
Wend
Application.Wait (Now + TimeValue("0:00:02"))
'temp user/pass
user = Sheets("Pass").Cells(2, 2)
pass = Sheets("Pass").Cells(2, 3)
ie.document.getElementById("u").Value = user
ie.document.getElementById("p").Value = pass
'login
Set form = ie.document.getElementsByTagName("form")
Set button = form(0).onsubmit
form(0).submit
While ie.Busy
DoEvents
Wend
Set ElementCol = ie.document.getElementsByClassName("rf-button-alt rce-white-button rf-mini")
ElementCol.Item(0).Click
While ie.Busy
DoEvents
Wend
pUnicodeString = ChrW(32716) & ChrW(26376) & ChrW(20197) & ChrW(38477)
Set tags = ie.document.getElementsByTagName("a")
For Each tagx In tags
If tagx.getAttribute("alt") = pUnicodeString Then
tagx.Click
End If
Next
Set ie = Nothing
'Unload UserForm1
End Sub
I am using this code to get product name from a page
code of page is
<div class="product-shop col-sm-7">
<div class="product-name">
<h1 >Claro Glass 1.5 L Rectangular Air Tight Food Container with Lid- Clear GMA0215A</h1>
</div>
my vba code is
Public Sub GetValueFromBrowser()
Dim ie As Object
Dim name As String
Do Until IsEmpty(ActiveCell)
ActiveCell.Offset(0, 1).Value = "RUNNING"
URL = Selection.Value
Set ie = CreateObject("InternetExplorer.Application")
With ie
.Visible = 0
.navigate URL
While .Busy Or .readyState <> 4
DoEvents
Wend
End With
Dim Doc As HTMLDocument
Set Doc = ie.document
ActiveCell.Offset(0, 1).Value = "ERROR"
name = Trim(Doc.getElementByClassName("product-name").innerText)
ActiveCell.Offset(0, 1).Value = name
ie.Quit
Loop
End Sub
error i am getting is
run-time error '438':
Object doesn't support this property or method
GetElementsByClassName method
You are missing an s in the name of the method getElementsByClassName.
Change this name = Trim(Doc.getElementByClassName("product-name").innerText)
To this name = Trim(Doc.getElementsByClassName("product-name")(0).innerText). Substitude the (0) for the item you are targeting.
It is still possible to define your own function getElementByClassName.
This function returns the very first element with given class name in the DOM document and Nothing when no element with this class name exist in the DOM document.
Public Function getElementByClassName(doc As MSHTML.HTMLDocument, className As String) As IHTMLElement
Set getElementByClassName = doc.querySelector("[class='" & className & "']")
End Function
Usage:
Dim elm As IHTMLElement
Set elm = getElementByClassName(doc, "product-name")
If Not elm Is Nothing Then
Debug.Print elm.innerText
End If
I found some code on Stack Exchange, of all places, and was able to modify it 95% of the way for my needs but one last issue keeps cropping up, all the DIVs in the parent DIV paste into one cell, I want them to post to individual cells in my worksheet. The code came from Stack Overflow user "Portland Runner" and the original post can be found here. The HTML I'm up against looks something like this:
<div class="right-header">
<div>Entry 1</div>
<div>Entry 2</div>
<div>Entry 3</div>
<div>Entry 4</div>
<div>Entry 5</div>
<div>Entry 6</div>
</div>
The child DIVs have no IDs, Classes, or Styles, just information surrounded by a lonesome DIV tag. This all gets dumped into a single cell where I'd like instead for it to be dumped into Al (Entry 1), B1 (Entry 2), C1 (Entry 3), etc. The original code is as follows:
Sub extract()
Dim IE As InternetExplorer
Dim html As HTMLDocument
Set IE = New InternetExplorerMedium
IE.Visible = False
IE.Navigate2 "C:\Users\john\Documents\Test.html"
' Wait while IE loading
Do While IE.Busy
Application.Wait DateAdd("s", 1, Now)
Loop
Set html = IE.document
Set holdingsClass = html.getElementsByClassName("right-header")
Dim results As Variant
results = Split(holdingsClass(0).textContent, vbLf)
cntr = 1
For i = LBound(results) To UBound(results)
If Trim(results(i)) <> "" Then
Select Case Right(Trim(results(i)), 1)
Case "<div>"
Range("B" & cntr) = CStr(Trim(results(i)))
Case "%"
Range("C" & cntr).Value = Trim(results(i))
cntr = cntr + 1
Case 0
Range("C" & cntr).Value = Trim(results(i))
Case Else
Range("A" & cntr).Value = Trim(results(i))
End Select
End If
Next i
Sheets("Sheet3").Range("A1").Value = holdingsClass(0).textContent
'Cleanup
IE.Quit
Set IE = Nothing
End Sub
Thank you any and all for your help!
Compiled but not tested:
Sub extract()
Dim IE As InternetExplorer
Dim topDiv, div, childDivs, tc As String, cntr
Set IE = New InternetExplorerMedium
IE.Visible = False
IE.Navigate2 "C:\Users\john\Documents\Test.html"
' Wait while IE loading
Do While IE.Busy
Application.Wait DateAdd("s", 1, Now)
Loop
Set topDiv = IE.document.getElementsByClassName("right-header")(0)
Set childDivs = topDiv.getElementsByTagName("div")
cntr = 2
For Each div In childDivs
tc = Trim(div.textContent)
If tc <> "" Then
Select Case Right(tc, 1)
Case "<div>"
'not sure whether you should be seeing HTML in textcontent...?
Range("B" & cntr) = CStr(tc)
Case "%"
Range("C" & cntr).Value = tc
cntr = cntr + 1
Case 0
Range("C" & cntr).Value = tc
Case Else
Range("A" & cntr).Value = tc
End Select
End If
Next div
Sheets("Sheet3").Range("A1").Value = topDiv.textContent
'Cleanup
IE.Quit
Set IE = Nothing
End Sub
This little script is supposed to go to the website
http://finra-markets.morningstar.com/BondCenter/Default.jsp
Insert under the tab "Search" inside the "Symbol/Cusip" box the number 111320AE7 and click on the "Show Results" button to get the results.
Sub SearchSite()
Dim beta
Dim objIE As InternetExplorer
Set objIE = New InternetExplorer
objIE.Visible = True
objIE.navigate "http://finra-markets.morningstar.com/BondCenter/Default.jsp"
Do While objIE.Busy = True Or objIE.readyState <> 4
DoEvents
Loop
objIE.document.getElementById("firscreener-cusip").Value = "111320AE7"
Set beta = objIE.document.getElementsByClassName("ms-finra-advanced-search-btn")(1)
beta.Click
Do While objIE.Busy = True Or objIE.readyState <> 4
DoEvents
Loop
'objIE.Quit
End Sub
I get Run-time error 91: Object variable or With block variable not set
The problem appears to be the beta.click line
I would appreciate some help.
Thanks a lot.
Beta is referencing the div that contains the buttons. The element arrays are base 0. They start at 0 not 1.
Sub SearchSite()
Dim beta, buttons, btnReset, btnSubmit
Dim objIE As InternetExplorer
Set objIE = New InternetExplorer
objIE.Visible = True
objIE.navigate "http://finra-markets.morningstar.com/BondCenter/Default.jsp"
Do While objIE.Busy = True Or objIE.readyState <> 4
Loop
objIE.document.getElementById("firscreener-cusip").Value = "111320AE7"
Set beta = objIE.document.getElementsByClassName("ms-finra-advanced-search-btn")(0)
' <div class="ms-finra-advanced-search-btn">
' <input class="button_blue" value="CLEAR CRITERIA" type="reset">
' <input class="button_blue" value="SHOW RESULTS" type="submit">
' </div>
Set buttons = beta.GetElementsByTagName("input")
WScript.Echo buttons(0).outerHTML
' <input class="button_blue" value="CLEAR CRITERIA" type="reset">
Set btnReset = buttons(0)
' <input class="button_blue" value="SHOW RESULTS" type="submit">
Set btnSubmit = buttons(0)
beta.btnSubmit
Do While objIE.Busy = True Or objIE.readyState <> 4
Loop
objIE.Quit
End Sub