I am writing the code in VBA.
How can I delete a specific element?
Thank you
Dim html As MSHTML.HTMLDocument
Set html = New MSHTML.HTMLDocument
Dim document As MSHTML.HTMLDocument
Set document = html.createDocumentFromUrl("http://example.com/", vbNullString)
' wait download
Do While document.readyState <> "complete"
Loop
' Insert I Can
Call document.DocumentElement.insertAdjacentHTML("afterend", "<div>test</div>")
' I want to delete a specific element here But I fail
' Call document.removeNode("specific element")
You cant remove elements, but you can remove nodes, which is more or less the same.
Function DelTagById(strData As String, strID As String) As String
On Local Error GoTo MyError
Dim HTMLDoc As HTMLDocument
Dim Node As IHTMLDOMNode
DelTagById = strData
If strID = "" Then GoTo MyExit
Set HTMLDoc = New HTMLDocument
HTMLDoc.body.innerHTML = strData
Set Node = HTMLDoc.getElementById(strID)
If Node Is Nothing Then GoTo MyExit
Node.parentNode.removeChild Node
DelTagById = HTMLDoc.body.innerHTML
MyExit:
Set Node = Nothing
Set HTMLDoc = Nothing
Exit Function
MyError:
'Handle Error
Resume MyExit
End Function
Related
I am trying to extract the 3/14/2017 from the following HTML code:
<div class="divListTableBodyCell" id="tdColumnPostDateCell">
<table class="tblListTableBodyCell">
<tr>
<td>
<div class="divListTableBodyLabel">3/14/2017</div>
I am using Excel VBA to do so and have the following code to try to test the information that I am pulling:
Sub CC()
Dim ie As Object
marker = 0
Set objShell = CreateObject("Shell.Application")
IE_count = objShell.Windows.Count
For x = 0 To (IE_count - 1)
On Error Resume Next
my_url = objShell.Windows(x).document.Location
my_title = objShell.Windows(x).document.Title
If my_title Like "Main" & "*" Then
Set ie = objShell.Windows(x)
marker = 1
Exit For
Else
End If
Next
extract1 = ie.document.getElementsByClassName("divListTableBodyLabel")(3).innerText
MsgBox extract1
There are multiple instances of divListTableBodyLabel on the page with various dates, so I am just seeing if I can get any of them to appear and then I can worry about getting the exact one I want. I have tried all of the id's or class names above and nothing returns?
Someone here gave me the following code to set an IE object, and it works fantastically.
Function GetIE(sLocation As String) As Object
Dim objShell As Object, objShellWindows As Object, o As Object
Dim sURL As String
Dim RetVal As Object
Set RetVal = Nothing
Set objShell = CreateObject("shell.application")
Set objShellWindows = objShell.Windows
For Each o In objShellWindows
sURL = ""
On Error Resume Next
'check the URL and if it's the one you want then
'assign it to the return value and exit the loop
sURL = o.document.location
On Error GoTo 0
If sURL Like "*" & sLocation & "*" Then
Set RetVal = o
Exit For
End If
Next o
Set GetIE = RetVal
End Function
So just Set IE = GetIE("www.google.com") and you are good to go. See if this helps any.
I'm trying to retrieve a value in one of the spanclasses in an online XML-file.
File: http://www.ecb.europa.eu/stats/eurofxref/eurofxref-daily.xml
I want to get to the USD rate but my code does not seem to loop through and span-classes, where is my mistake?
My code
Function response_Text(url As String) ' get the responsetext from an xml request
Dim xml As Object
Set xml = CreateObject("MSXML2.XMLHTTP")
With xml
.Open "Get", url, False
.send
response_Text = .responsetext
End With
Set xml = Nothing
End Function
Private Sub find_ClassElement(HTML_doc As MSHTML.HTMLDocument) ' return value inside span_class
Dim ticker As Variant
Dim XML_elements As MSHTML.IHTMLElementCollection
Dim XML_spanclass As MSHTML.HTMLSpanElement
Dim XML_targetElement As MSHTML.HTMLLIElement
Set XML_elements = HTML_doc.getElementsByClassName("line") **<--- something seems to be wrong here, the code does not loop through any span_classes after this point as intended ( the for statement is not being executed )**
For Each XML_spanclass In XML_elements
If InStr(XML_spanclass.innerHTML, "USD") > 0 Then
Debug.Print "success"
Set XML_targetElement = XML_spanclass.parentElement
Debug.Print CSng(XML_targetElement.getElementsByClassName("webkit-html-attribute-value")(0).innerHTML)
End If
Next
End Sub
Private Sub run() ' run the whole operation
Dim http_req As http_req: Set http_req = New http_req
Dim xml As MSHTML.HTMLDocument: Set xml = New MSHTML.HTMLDocument
Dim url As String: url = "http://www.ecb.europa.eu/stats/eurofxref/eurofxref-daily.xml"
xml.body.innerHTML = http_req.response_Text(url)
Call find_ClassElement(xml)
End Sub
There are no tags with a class of "line" so your collection is empty - nothing to loop through. Here's another way
Sub GetUSD()
Dim xHttp As MSXML2.XMLHTTP
Dim xDoc As MSXML2.DOMDocument
Dim xCube As MSXML2.IXMLDOMElement
Dim xCubes As MSXML2.IXMLDOMSelection
Dim sCurrency As String
'load the xml document
Set xDoc = New MSXML2.DOMDocument
xDoc.Load "http://www.ecb.europa.eu/stats/eurofxref/eurofxref-daily.xml"
'wait until it's completely loaded
Do
DoEvents
Loop Until xDoc.readyState = 4
'get all the cube tags
Set xCubes = xDoc.getElementsByTagName("Cube")
For Each xCube In xCubes
'some cube tags don't have attributes
On Error Resume Next
sCurrency = xCube.Attributes(0).Text
On Error GoTo 0
'if the first attribute is USD, get the second attribute
If sCurrency = "USD" Then
Debug.Print xCube.Attributes(1).Text
End If
Next xCube
End Sub
edit
I don't know xpath well enough to do this properly, but this works.
Sub GetUSD()
Dim xDoc As MSXML2.DOMDocument60
Dim xCube As MSXML2.IXMLDOMNode
Dim xCubes As MSXML2.IXMLDOMNodeList
Dim sCurrency As String
'load the xml document
Set xDoc = New MSXML2.DOMDocument60
xDoc.Load "http://www.ecb.europa.eu/stats/eurofxref/eurofxref-daily.xml"
'wait until it's completely loaded
Do
DoEvents
Loop Until xDoc.readyState = 4
'get all the cube tags
Set xCubes = xDoc.SelectNodes("//*")
For Each xCube In xCubes
On Error Resume Next
sCurrency = xCube.Attributes(0).NodeValue
On Error GoTo 0
'if the first attribute is USD, get the second attribute
If sCurrency = "USD" Then
Debug.Print xCube.Attributes(1).NodeValue
End If
Next xCube
End Sub
I am trying to instantiate the Root node of a Dom document. However i am naming it xbrl and this name is in the default namespace which is xmlns="http://www.xbrl.org/2003/instance"
According to a previous post-answer MSXML is buggy (answer of barrowc) when it comes to default namespaces. So i had to make some modification to my code. These where
objXMLDoc.LoadXML (objXMLHTTP.responseText)
replaced by
objXMLDoc.LoadXML objXMLHTTP.responseText
objXMLDoc.setProperty "SelectionNamespaces", "xmlns:r='http://www.xbrl.org/2003/instance'"
AND ALSO
Dim objXMLHTTP As New MSXML2.XMLHTTP
Dim objXMLDoc As New MSXML2.DOMDocument
replace by
Dim objXMLHTTP As New MSXML2.XMLHTTP60
Dim objXMLDoc As New MSXML2.DOMDocument60
The number 60 symbolizes the version 6.0
So when i made these modifications the macro worked without an error. But now it works only sometimes. When it doesn't it gives me a
Run-time error -2147467259(80004005)':
Reference to undeclared namespace prefix:'us-gaap.'
I cannot understand the reason the macro crashes and believe it is a bug.
Can you help?
For reasons of completeness the entire macro is submitted below
Sub READSITE()
Dim IE As InternetExplorer
Dim els, el, colDocLinks As New Collection
Dim lnk, res
Dim Ticker As String
Dim colXMLPaths As New Collection
Dim XMLElement As String
Set IE = New InternetExplorer
IE.Visible = False
Ticker = Worksheets("Sheet1").Range("A1").Value
LoadPage IE, "https://www.sec.gov/cgi-bin/browse-edgar?" & _
"action=getcompany&CIK=" & Ticker & "&type=10-Q" & _
"&dateb=&owner=exclude&count=20"
Set els = IE.Document.getelementsbytagname("a")
For Each el In els
If Trim(el.innertext) = "Documents" Then
colDocLinks.Add el.href
End If
Next el
For Each lnk In colDocLinks
LoadPage IE, CStr(lnk)
For Each el In IE.Document.getelementsbytagname("a")
If el.href Like "*[0-9].xml" Then
Debug.Print el.innertext, el.href
colXMLPaths.Add el.href
End If
Next el
Next lnk
XMLElement = Range("C1").Value
'For each link, open the URL and display the Debt Instrument Insterest Rate
For Each lnk In colXMLPaths
res = GetData(CStr(lnk), XMLElement)
With Worksheets("Sheet1").Cells(Rows.Count, 1).End(xlUp).Offset(1, 0)
.NumberFormat = "#"
.Value = Ticker
.Offset(0, 1).Value = lnk
.Offset(0, 2).Value = res
End With
Next lnk
End Sub
Function GetData(sURL As String, sXMLElement As String)
Dim strXMLSite As String
Dim objXMLHTTP As New MSXML2.XMLHTTP60
Dim objXMLDoc As New MSXML2.DOMDocument60
Dim objXMLNodexbrl As MSXML2.IXMLDOMNode
Dim objXMLNodeElement As MSXML2.IXMLDOMNode
Dim objXMLNodeStkhldEq As MSXML2.IXMLDOMNode
GetData = "?" 'No data from XML
objXMLHTTP.Open "GET", sURL, False '<<EDIT: GET the site
objXMLHTTP.send
objXMLDoc.LoadXML (objXMLHTTP.responseText)
objXMLDoc.setProperty "SelectionNamespaces", "xmlns:r='http://www.xbrl.org/2003/instance'"
Set objXMLNodexbrl = objXMLDoc.SelectSingleNode("r:xbrl")
Set objXMLNodeElement = objXMLNodexbrl.SelectSingleNode(sXMLElement)
If Not objXMLNodeElement Is Nothing Then
GetData = objXMLNodeElement.Text
End If
End Function
Sub LoadPage(IE As Object, url As String)
IE.Navigate url
Do While IE.Busy Or IE.ReadyState <> READYSTATE_COMPLETE
DoEvents
Loop
End Sub
It is also a strange and frustrating note, to see that if i alter my macro in the before-corrections status according to the modifications given to me by barrowc i can now see that the macro works!
I was able to terminate the bug swiftly by checking in Diff Checker the vast code with previous code samples i had. It seems all you need is to remove the 60 from these lines (god knows why...)
Dim objXMLHTTP As New MSXML2.XMLHTTP60
Dim objXMLDoc As New MSXML2.DOMDocument60
I'm trying to get the data to Excel of a div element with a specific class name, like:
<div class="myClass">
<span>Text1</span>
<span>Text2</span>
</div>
My VBA code:
Sub GetData()
Dim oHtml As HTMLDocument
Dim oElement As Object
Set oHtml = New HTMLDocument
With CreateObject("WINHTTP.WinHTTPRequest.5.1")
.Open "GET", "http://www.example.com/", False
.send
oHtml.body.innerHTML = .responseText
End With
For Each oElement In oHtml.getElementsByClassName("myClass")
Debug.Print oElement.Children(0).src
Next oElement
End Sub
This is returning the error:
Run-time error: '438': Object doesn't support this property or method. The error is on line Debug.Print ...
I have activated the following refereces:
Microsoft HTML Object Library
Microsoft Internet Controls
I want to be able to select the text on the first or second span and paste it on a cell, how can I do this?
This worked for me:
Dim oHtml As HTMLDocument
Dim oElement As Object
Set oHtml = New HTMLDocument
With CreateObject("WINHTTP.WinHTTPRequest.5.1")
.Open "GET", "http://www.example.com", False
.send
oHtml.body.innerHTML = .responseText
End With
Set dados = oHtml.getElementsByClassName("myClass")(0).getElementsByTagName("span")
i = 0
For Each oElement In dados
Sheets("Sheet1").Range("A" & i + 1) = dados(i).innerText
i = i + 1
Next oElement
Here's how I've done it in the past:
Set oElement = oHtml.getElementsByClassName("myClass")
i = 0
While i < oElement.Length
Debug.Print oElement(i).innerText
i = i + 1
Wend
you might want innerHtml instead?
I am writing some VBA code which manipulates an HTML document. The document is opened as text so that the HTML can be worked on. Thus:
Dim oWordDoc As Word.Document
Set oWordDoc = Documents.Open(FileName:=strFolder & "\" & strFileName, _
Format:=wdOpenFormatText)
The HTML contains some meta tags that I would like to be able to access by ID. This is the code I tried to attempt this:
Dim objHtmlDoc As HTMLDocument
Dim objMetaTag As HTMLMetaElement
Set objHtmlDoc = oWordDoc
Set objMetaTag = objHtmlDoc.getElementById("keywords")
However, I get a Type Mismatch error on the line:
Set objHtmlDoc = oWordDoc
I tried to set the objHtmlDoc to oWordDoc.content, and get the same error. Is there anyway that I can convert the Word.Document object to an HTMLDocument object so that I can set the HTMLDocument to be the Word.Document? Or will I have to develop my own getElementbyID function to perform this?
Thanks.
An Alternative that I was suggesting.
Sub Sample()
Dim objHtmlDoc As HTMLDocument
Dim objMetaTag As HTMLMetaElement
Dim IE As Object
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True
IE.navigate "C:\Sample.Htm"
Set objHtmlDoc = IE.Document
Set objMetaTag = objHtmlDoc.getElementById("keywords")
End Sub