Extract data from website as separate fields - vba

I am trying to use Excel to scrape some information from a website.
This is what the page source shows:
<tr class="even">
<td align="right">1</td>
<td>Acrobatic Maneuver</td>
<td>Instant</td>
<td>2W</td>
<td>Common</td>
<td>Winona Nelson</td>
<td><img src="http://magiccards.info/images/en.gif" alt="English" width="16" height="11" class="flag2"> Kaladesh</td>
</tr>
So I want to get every row that has the class even and extract the data between the <td></td> tags.
However, all I've found so far is this code:
Sub getcards()
Dim IE As Object
Dim i As Long
Dim objCollection As Object
' Create InternetExplorer Object
Set IE = CreateObject("InternetExplorer.Application")
' You can uncomment Next line To see form results
IE.Visible = False
' URL to get data from
IE.Navigate "http://magiccards.info/query?q=%2B%2Be%3Akld/en&v=list&s=issue"
' Statusbar
Application.StatusBar = "Loading, Please wait..."
' Wait while IE loading...
Do While IE.Busy
DoEvents
Application.Wait DateAdd("s", 1, Now)
Loop
On Error GoTo abort
Application.StatusBar = "Searching for value. Please wait..."
Dim dd As String
Set objCollection = IE.document.getElementsByClassName("even")
For i = 0 To objCollection.Length
dd = IE.document.getElementsByClassName("even")(i).innerText
MsgBox dd
Next i
abort:
' Show IE
IE.Visible = True
IE.Quit
' Clean up
Set IE = Nothing
Application.StatusBar = ""
End Sub
It does extract the data, but the output comes out run together: 1Acrobatic ManeuverInstant2WCommonWinona Nelson Kaladesh.
How can I make it treat each <td> as a separate field, so I can extract the values easily?

When you loop over i in objCollection, you are looping over every element with the class name "even" (the rows), not over the cells inside each row, so innerText returns the text of the whole row at once.
Add an inner loop over the <td> elements of each row. Note that the upper bound must be Length - 1, because the collections are zero-based.
Try this:
Dim c As Long
For i = 0 To objCollection.Length - 1
    ' Loop over the cells of the current row
    For c = 0 To objCollection(i).getElementsByTagName("td").Length - 1
        dd = objCollection(i).getElementsByTagName("td")(c).innerText
        MsgBox dd
    Next c
Next i
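Building on the nested loop above, here is a minimal sketch that writes each cell to the worksheet instead of showing a message box. It assumes IE has already finished loading the page from the question. The procedure name WriteCardRows and the choice to start at row 1 of the active sheet are illustrative assumptions, not part of the original answer.

```vba
' Sketch: copy every <td> of every "even" row into the active sheet.
' Assumes IE is an InternetExplorer.Application object that has finished loading.
' WriteCardRows is a hypothetical name chosen for this example.
Sub WriteCardRows(ByVal IE As Object)
    Dim rowElems As Object
    Dim cellElems As Object
    Dim i As Long, c As Long

    Set rowElems = IE.document.getElementsByClassName("even")
    For i = 0 To rowElems.Length - 1
        ' Fetch the cells of this row once, then walk them
        Set cellElems = rowElems(i).getElementsByTagName("td")
        For c = 0 To cellElems.Length - 1
            ' Sheet rows and columns are 1-based; DOM collections are 0-based
            ActiveSheet.Cells(i + 1, c + 1).Value = Trim$(cellElems(c).innerText)
        Next c
    Next i
End Sub
```

It could be called as WriteCardRows IE just before IE.Quit in the original procedure. Caching the cell collection in a variable avoids querying the document again on every pass of the inner loop.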

Related

Extract Value from a Webpage with VBA based on Tag

I need some help with my VBA code:
Private Sub CommandButton1_Click()
Dim IE As Object
' Create InternetExplorer Object
Set IE = CreateObject("InternetExplorer.Application")
' You can uncomment Next line To see form results
IE.Visible = False
' URL to get data from
IE.Navigate "https://www.avanza.se/aktier/om-aktien.html/5247/investor-b"
' Statusbar
Application.StatusBar = "Loading, Please wait..."
' Wait while IE loading...
Do While IE.Busy
Application.Wait DateAdd("s", 1, Now)
Loop
Application.StatusBar = "Searching for value. Please wait..."
Dim dd As String
dd = IE.Document.getElementsByClassName(" ")(0).innerText
MsgBox dd
' Show IE
IE.Visible = True
' Clean up
Set IE = Nothing
Application.StatusBar = ""
End Sub
I want to use a VBA macro to extract a value from a webpage (in my case "computer 17", as shown in the image).
I already followed this link (Trying to extract ONE value from a webpage with VBA in Excel) and it works, but in my case I have multiple classes nested inside each other.
Thank you very much.
This is something to get you started with the InnerText case:
Dim dd As Object
Set dd = IE.Document.getElementsByTagName("tr")
Dim obj As Object
For Each obj In dd
Debug.Print obj.innertext
Next
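Put the code above in place of dd = IE.Document.getElementsByClassName(" ")(0).innerText, and remove the earlier Dim dd As String, since dd is now declared As Object. It prints the innerText of every tr on the page to the Immediate window.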

VBA Internet Explorer clicking on Text Error

I am new to VBA and would appreciate help resolving this issue.
I am trying to click on some text on a SharePoint webpage in IE. I am able to navigate to the page in the browser, but I get a VBA error when clicking the text "Americas" (highlighted in yellow in the attached screenshot). I need help with the IE.Document part at the end of the VBA code below. I assume getElementById and getElementsByTagName are correct for the HTML below.
Error - Method Document of Object "IEwebBrowser" Failed
VBA Code:
Private Sub UploadFile()
Dim i As Long
Dim IE As Object
Dim Doc As Object
Dim objElement As Object
Dim objCollection As Object
Dim buttonCollection As Object
Dim AllSpanElements
Dim Span
' Create InternetExplorer Object
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True
' Send the form data To URL As POST binary request
IE.navigate "URL"
' Wait while IE loading...
While IE.Busy
DoEvents
Wend
' I AM GETTING ERROR HERE
Set AllSpanElements = IE.Document.getElementById("ext-gen1271").getElementsByTagName("div")
AllSpanElements.Click
Set IE = Nothing
Set objElement = Nothing
End Sub
HTML CODE
<table class="x-grid-table x-grid-table-resizer" border="0" cellspacing="0" cellpadding="0" style="width:10000px;"><tbody>
<tr class="x-grid-row x-grid-row-selected x-grid-row-focused" id="ext-gen1271">
<td class="x-grid-cell-treecolumn x-grid-cell x-grid-cell-treecolumn-1030 x-grid-cell-first">
<div class="x-grid-cell-inner " style="text-align: left; ;"><img src="data:image/gif;base64,R0lGODlhAQABAID/AMDA" class="x-tree-elbow-plus x-tree-expander">
<img src="data:image/gif;base64,R0lGODlhAQABAID/AMDAwAAAA" class="x-tree-icon x-tree-icon-parent ">
Americas</div>
</td>
</tr>
</tbody>
</table>
Give this a shot. I cleaned up the code slightly and took a different approach: iterate over every element on the page and click it when its InnerText contains "Americas".
InnerText may not be the right property to check; it might be Value or Title, so you will need to verify that.
Here's the code:
Private Sub UploadFile()
Dim IE As Object: Set IE = CreateObject("InternetExplorer.Application")
Dim Elements As Object
Dim Element As Object
With IE
.Visible = True
' Send the form data To URL As POST binary request
.navigate "URL"
' Wait while IE loading...
While .Busy Or .readystate <> 4
Application.Wait (Now() + TimeValue("00:00:01"))
DoEvents
Wend
End With
Set Elements = IE.Document.getElementsByTagName("*") ' * matches all elements
For Each Element In Elements
If InStr(1, Element.innerText, "Americas") > 0 Then ' If the element has the word Americas...click it
Element.Focus
Element.Click
Element.FireEvent ("OnClick")
End If
Next
'Clean up
Set IE = Nothing
End Sub
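Separately from the loop above, note that getElementsByTagName returns a collection, so the asker's AllSpanElements.Click cannot click a single element. A minimal sketch of indexing into the collection is shown below. It assumes IE has finished loading and that the row id is still "ext-gen1271" as in the posted HTML; ids of this form look auto-generated and may differ between page loads. ClickAmericas is a hypothetical name.

```vba
' Sketch: click the first <div> inside the row shown in the question's HTML.
' Assumes IE is a loaded InternetExplorer.Application object.
' The id "ext-gen1271" comes from the posted HTML and may not be stable.
Sub ClickAmericas(ByVal IE As Object)
    Dim rowElem As Object
    Dim divs As Object

    Set rowElem = IE.Document.getElementById("ext-gen1271")
    If rowElem Is Nothing Then
        MsgBox "Row not found; the id may have changed."
        Exit Sub
    End If

    Set divs = rowElem.getElementsByTagName("div")
    If divs.Length > 0 Then
        divs(0).Click   ' Click one element, not the whole collection
    End If
End Sub
```

If the id does change between loads, searching by text as in the answer above is the more robust choice.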

Extract one figure from website table

I am currently trying to use VBA to scrape a particular figure from a table on a website. Below is the surrounding HTML, taken from the inspect element panel in my browser.
<tr class="cmeRowBandingOff cmeTableRowHighlight">
<th scope="row">MAR 15</th>
<td>2056.50</td>
<td>2062.50</td>
<td>2042.25</td>
<td>2043.25</td>
<td><span>-12.50</span></td>
<td>2044.00</td>
<td class="cmeTableRight">1,351,989</td>
<td class="cmeTableRight">2,701,326</td>
</tr>
The VBA code I have extracts the whole row, from "MAR 15" all the way to "2,701,326". However, I only want to extract the figure "2044.00" into a cell or message box in Excel.
My current code is as follows:
Private Sub CommandButton1_Click()
Dim IE As Object
' Create InternetExplorer Object
Set IE = CreateObject("InternetExplorer.Application")
' You can uncomment Next line To see form results
IE.Visible = False
' URL to get data from
IE.Navigate "http://www.cmegroup.com/trading/equity-index/us-index/" _
& "e-mini-sandp500_quotes_settlements_futures.html"
' Statusbar
Application.StatusBar = "Loading, Please wait..."
' Wait while IE loading...
Do While IE.Busy
Application.Wait DateAdd("s", 5, Now)
Loop
Application.StatusBar = "Searching for value. Please wait..."
Dim dd As String
dd = IE.Document.getElementsByClassName("cmeRowBandingOff")(0).innerText
MsgBox dd
' Show IE
IE.Visible = True
' Clean up
Set IE = Nothing
Application.StatusBar = ""
End Sub
I know I need to change this:
dd = IE.Document.getElementsByClassName("cmeRowBandingOff")(0).innerText
but I am unsure how to go about it.
Could someone kindly help me alter the VBA code to get just 2044.00 on its own?
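Something like:
Dim rw, dd
Set rw = IE.Document.getElementsByClassName("cmeRowBandingOff")(0)
' The row holds one <th> followed by eight <td>; 2044.00 is the sixth <td> (zero-based index 5)
dd = rw.getElementsByTagName("td")(5).innerText
'or, counting the <th> as child 0 (childNodes may also count whitespace text nodes, depending on the document mode)
dd = rw.childNodes(6).innerText
'or, using the row's cells collection, which includes the <th> at index 0
dd = rw.Cells(6).innerText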

VBA Getting results in another url but same window

I'm working with VBA to fill in a form at a URL and submit it to get results.
When I submit the form with correct values via VBA, the results appear at another URL but in the same window.
The problem is that I don't know how to point the html reference at this new URL so I can start scraping data from it.
Here is my code:
'to refer to the running copy of Internet Explorer
Dim IE As InternetExplorer
'to refer to the HTML document returned
Dim html As HTMLDocument
'open Internet Explorer in memory, and go to website
Set IE = New InternetExplorer
IE.Visible = False
IE.navigate "http://url..."
'Wait until IE is done loading page
Do While IE.READYSTATE <> READYSTATE_COMPLETE
Application.StatusBar = "Connecting with http://url..."
DoEvents
Loop
'show text of HTML document returned
Set html = IE.document
'close down IE and reset status bar
Set IE = Nothing
Application.StatusBar = ""
Set txtArea = html.getElementsByTagName("textarea")(0)
txtArea.Value = txtArea_data
Set formSubmit = html.getElementsByName("submit")(1)
formSubmit.Click
'-------------Get results
'Dim html_results As HTMLDocument
IE.navigate "http://new_url" 'I'm not sure if I must do it this way...
'Wait until IE is done loading page
Do While IE.READYSTATE <> READYSTATE_COMPLETE
Application.StatusBar = "Connecting with http://new_url..."
DoEvents
Loop
Set html = IE.document
Dim trResults As IHTMLElementCollection
Set trResults = html.getElementsByClassName("tr")
MsgBox (trResults.Length) 'At this point, trResults always has 0 results...
Do you have any idea how to help me?
Thanks!
Here is a new approach, which still doesn't work.
I've used IE.Visible = True, so I could confirm that the results page contains many <tr> rows.
Dim rowNumber As Long
Dim txtArea_data As String
Dim txtArea As Object
Dim formSubmit As Object
Dim IE As InternetExplorer
Dim html As HTMLDocument
Set IE = New InternetExplorer
IE.Visible = True
IE.navigate "http://url..."
Do While IE.READYSTATE <> READYSTATE_COMPLETE
Application.StatusBar = "Connecting with http://url..."
DoEvents
Loop
Set html = IE.document
rowNumber = 4
For rowNumber = 4 To 120 'Rows.Count
txtArea_data = txtArea_data & Cells(rowNumber, 1).Value & Chr(10)
Next rowNumber
Set txtArea = html.getElementsByTagName("textarea")(0)
txtArea.Value = txtArea_data
Set formSubmit = html.getElementsByName("submit")(1)
formSubmit.Click
'Wait until results
Do While IE.Busy: DoEvents: Loop
'-------------Get results
'At this point the URL has changed to the results page, but in the same tab. In the other code I used IE.navigate "http://new_url...", but an "Invalid file" message appeared.
'-------------
'I assumed the html variable could be reloaded with the new results data, but nothing happens...
Set html = IE.document
Dim trResults As IHTMLElementCollection
Set trResults = html.getElementsByClassName("tr")
MsgBox (trResults.Length) 'At this point, MsgBox returns 0...
Set html = Nothing
Results page looks like:
<body>
<table id="tabla-a">
<thead>
<tr>
<th>...</th>
<th>...</th>
<th>...</th>
</tr>
</thead>
<tbody>
<tr>
<td>...</td>
<td>...</td>
<td>...</td>
</tr>
<tr>
<td>...</td>
<td>...</td>
<td>...</td>
</tr>
<tr>
<td>...</td>
<td>...</td>
<td>...</td>
</tr>
</tbody>
</table>
</body>
Is there anything else I can tell you to help find the problem?
Thanks!
EDIT
Oh my God! I've been using Set trResults = html.getElementsByClassName("tr") instead of Set trResults = html.getElementsByTagName("tr") !!!
Thanks for your time!
What should I do now? Edit the original question to include the final solution, or close the question entirely?
I don't use Stack Overflow much to ask questions...
Thanks!
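For reference, the fix from the EDIT can be sketched as a small helper that waits for the results page and then looks rows up by tag name. This is only an illustration under stated assumptions: the table id "tabla-a" is taken from the posted results page, the name CountResultRows is hypothetical, and the literal 4 stands for READYSTATE_COMPLETE so the code also works with late binding.

```vba
' Sketch: wait for the results page, then count the <tr> rows of table "tabla-a".
' Assumes IE is the InternetExplorer object that just submitted the form.
' CountResultRows is a hypothetical name chosen for this example.
Function CountResultRows(ByVal IE As Object) As Long
    Dim tbl As Object

    ' 4 = READYSTATE_COMPLETE; wait until IE is neither busy nor still loading
    Do While IE.Busy Or IE.readyState <> 4
        DoEvents
    Loop

    Set tbl = IE.document.getElementById("tabla-a")
    If tbl Is Nothing Then Exit Function   ' Returns 0 when the table is missing

    ' "tr" is a tag name, so getElementsByTagName is the right lookup
    CountResultRows = tbl.getElementsByTagName("tr").Length
End Function
```

Reading IE.document again after the wait is what picks up the new page; there is no need to call IE.navigate a second time, since the form submission already moved the same window to the results URL.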

I need to find and press a button on a webpage using VBA

I have written a macro that opens a webpage and finds the fields I need to put data into. I then want the macro to click a Prefill button and then click OK.
The page source code for the button is:
<input type="text" size="30" maxlength="40" name="name" id="name">
<input type="button" value="Prefill" onclick="prefill()">
I've been searching for answers all week. I think I have a basic understanding of how this is supposed to work (run a loop to search for the button), but I'm having no luck actually getting it to work.
Can someone show me the loop that will search my page for this button?
Thank you in advance.
Requested code so far:
Private Sub Populate_Click()
Dim i As Long
Dim IE As Object
Dim objElement As Object
Dim objCollection As Object
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True
IE.Navigate "website" 'make sure you've logged into the page
Do
DoEvents
Loop Until IE.READYSTATE = 3
Do
DoEvents
Loop Until IE.READYSTATE = 4
ActiveSheet.EnableCalculation = False
ActiveSheet.EnableCalculation = True
Call IE.document.getelementbyid("name").SetAttribute("value", ActiveSheet.Range("b2").Value)
Call IE.document.getelementbyid("aw_login").SetAttribute("value", ActiveSheet.Range("a2").Value)
Set objCollection = IE.document.getElementsByTagName("input")
i = 0
While i < objCollection.Length
If objCollection(i).Type = "button" And _
objCollection(i).Name = "Prefill" Then
Set objElement = objCollection(i)
End If
i = i + 1
Wend
objElement.Click
End Sub
Looks like you are pretty close. This is what I have used to do something similar:
With ie.document
Set elems = .getelementsbytagname("input")
For Each e In elems
If (e.getattribute("className") = "saveComment") Then
e.Click
Exit For
End If
Next e
End With
You will probably just have to change the If statement.
I also notice that your code checks .Name = "Prefill", but in your HTML snippet the button has no name attribute; "Prefill" is its value, so the test should be on .Value.
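Putting those two points together, a minimal sketch of the loop for the button in the question might look like this. It assumes IE has finished loading the page; ClickPrefill is a hypothetical name, and the Type and Value checks mirror the posted <input type="button" value="Prefill"> element.

```vba
' Sketch: find the <input type="button" value="Prefill"> element and click it.
' Assumes IE is a loaded InternetExplorer.Application object.
' ClickPrefill is a hypothetical name chosen for this example.
Sub ClickPrefill(ByVal IE As Object)
    Dim inputs As Object
    Dim elem As Object

    Set inputs = IE.document.getElementsByTagName("input")
    For Each elem In inputs
        ' Match on Value, because the button has no name attribute
        If LCase$(elem.Type) = "button" And elem.Value = "Prefill" Then
            elem.Click
            Exit For   ' Stop at the first match
        End If
    Next elem
End Sub
```

Clicking inside the loop avoids the original code's risk of calling objElement.Click when no element matched and objElement was never set.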