I am trying to pull data from this website: http://securities.stanford.edu/filings.html?page=1
Each "page" is a table with 21 items. There are 97 pages I would like to pull data from, but I am unable to automate it so that the macro cycles through all 97, and places the results every 21 rows, starting on cell A1. (sequence: a1, a22, a43, ect...)
this what I got, but I dont want to edit the code 97 time to get all the pages. Any idea how I could automate the task?
Sub Macro1()
' Macro1 Macro
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;http://securities.stanford.edu/filings.html?page=1", Destination:=Range( _
"A1"))
.Name = "filings.html?page=1"**
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlAllTables
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
end Sub
For x = 1 to 97
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;http://securities.stanford.edu/filings.html?page=" & x, Destination:=Range( _
"A" & (1 + ((x - 1) * 21)))
.Name = "filings.html?page=" & x
End With
Next
x contains page number and the cell is complicated to make it start at A1 rather than A21.
You could make it 0 to 96 and cell & (1 + (x + 21)) and the name and query x + 1.
I would abandon the 'from Web Query' method and delve into some xmlHTTP. For the following, you will have use the VBE's Tools ► References to add Microsoft HTML Object Library, Microsoft Internet Controls and Microsoft XML 6.0.
Option Explicit
Sub mcr_Collect_Filings()
Dim htmlBDY As HTMLDocument, xmlHTTP As New MSXML2.ServerXMLHTTP60
Dim rw As Long, pg As Long, iTH As Long, iTD As Long, iTR As Long
Dim eTBL As MSHTML.IHTMLElement
For pg = 1 To 99 '<-set to something reasonable; routine will kick out whehn it cannot find anything more
xmlHTTP.Open "GET", "http://securities.stanford.edu/filings.html?page=" & pg, False
xmlHTTP.setRequestHeader "Content-Type", "text/xml"
xmlHTTP.send
If xmlHTTP.Status <> "200" Then GoTo bm_CleanUp
Set htmlBDY = New HTMLDocument
htmlBDY.body.innerHTML = xmlHTTP.responseText
Set eTBL = htmlBDY.getElementById("records").getElementsByTagName("table")(0)
If eTBL Is Nothing Then GoTo bm_CleanUp
'skip the header row if on page 2 and above
With Sheet1 '<-worksheet codename
rw = .Cells(Rows.Count, 1).End(xlUp).Row
For iTR = (1 + (pg = 1)) To (eTBL.getElementsByTagName("tr").Length - 1)
For iTH = 0 To (eTBL.getElementsByTagName("tr")(iTR).getElementsByTagName("th").Length - 1)
.Cells(rw, 1).Offset(iTR, iTH) = _
eTBL.getElementsByTagName("tr")(iTR).getElementsByTagName("th")(iTH).innerText
Next iTH
For iTD = 0 To (eTBL.getElementsByTagName("tr")(iTR).getElementsByTagName("td").Length - 1)
.Cells(rw, 1).Offset(iTR, iTD) = _
eTBL.getElementsByTagName("tr")(iTR).getElementsByTagName("td")(iTD).innerText
Next iTD
Next iTR
End With
Next pg
bm_CleanUp:
Set eTBL = Nothing
Set htmlBDY = Nothing
Set xmlHTTP = Nothing
End Sub
The XMLHTTP is invisible so you have to know a little about the page and what to expect in the form of HTML code you are going to receive under different circumstances. A browser's Inspect Element command take care of that.
This is by far the fastest method in VBA. While you actually have more than 99 rows to retrieve, this went to 99 pages in 56.3 seconds. You might even speed that up a bit by turning off screen updating.
Related
I wrote a Web Query macro to import financial statements from Yahoo Finance based on the value in cell A1. It was working seamlessly for the past few weeks, but suddenly, it no longer returns any data (but does not generate an error). If anyone has any insights, I would appreciate your guidance. I have posted the code below--thank you!
Sub ThreeFinancialStatements()
On Error GoTo Explanation
Rows("2:1000").Select
Selection.ClearContents
Columns("B:AAT").Select
Range(Selection, Selection.End(xlToRight)).Select
Selection.ClearContents
Dim inTicker As String
inTicker = Range("A1")
ActiveSheet.Name = UCase(inTicker)
GetFinStats inTicker
Exit Sub
Explanation:
MsgBox "Please make sure you type a valid stock ticker symbol into cell A1 and are not trying to create a duplicate sheet." & _
vbLf & " " & _
vbLf & "Also, for companies with different classes of shares (e.g. Berkshire Hathaway), use a hyphen to designate the ticker symbol instead of a period (e.g. BRK-A)." & _
vbLf & " " & _
vbLf & "Please also note that not every company has three years of financial statements, so data may appear incomplete or missing for some companies.", _
, "Error"
Exit Sub
End Sub
Sub GetFinStats(inTicker As String)
'
' GetBalSheet Macro
'
'
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;http://finance.yahoo.com/q/bs?s=" & inTicker & "+Balance+Sheet&annual", Destination:= _
Range("$D$1"))
.Name = "bs?s=PEP+Balance+Sheet&annual"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlOverwriteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlSpecifiedTables
.WebFormatting = xlWebFormattingNone
.WebTables = "9"
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;http://finance.yahoo.com/q/is?s=" & inTicker & "+Income+Statement&annual", Destination _
:=Range("$J$1"))
.Name = "is?s=PEP+Income+Statement&annual"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlOverwriteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlSpecifiedTables
.WebFormatting = xlWebFormattingNone
.WebTables = "9"
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;http://finance.yahoo.com/q/cf?s=" & inTicker & "+Cash+Flow&annual", Destination:= _
Range("$P$1"))
.Name = "cf?s=PEP+Cash+Flow&annual"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlOverwriteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlSpecifiedTables
.WebFormatting = xlWebFormattingNone
.WebTables = "9"
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
Range("A3").Select
ActiveCell.FormulaR1C1 = "Current Ratio"
Range("A4").Select
ActiveCell.FormulaR1C1 = "Quick Ratio"
Range("A5").Select
ActiveCell.FormulaR1C1 = "Cash Ratio"
Range("A6").Select
Range("A7").Select
ActiveCell.FormulaR1C1 = "Revenue Growth Rate"
Range("A9").Select
Columns("A:A").ColumnWidth = 21.86
ActiveCell.FormulaR1C1 = "ROA"
Range("A10").Select
ActiveCell.FormulaR1C1 = "ROE"
Range("A11").Select
ActiveCell.FormulaR1C1 = "ROIC"
Range("B3").Select
ActiveCell.Formula = "=F11/F28"
Range("B4").Select
ActiveCell.Formula = "=(F11-F8)/F28"
Range("B5").Select
ActiveCell.Formula = "=F5/F28"
Range("B7").Select
ActiveCell.Formula = "=(L2/N2)^(1/2)-1"
Range("B9").Select
ActiveCell.Formula = "=L35/SUM(F12:F18)"
Range("B10").Select
ActiveCell.Formula = "=L35/F47"
Range("B11").Select
ActiveCell.Formula = "=L35/(F47+SUM(F29:F33))"
Range("B3").Select
Selection.NumberFormat = "0.00"
Range("B4").Select
Selection.NumberFormat = "0.00"
Range("B5").Select
Selection.NumberFormat = "0.00"
Range("B7").Select
Selection.NumberFormat = "0.00%"
Range("B9").Select
Selection.NumberFormat = "0.00%"
Range("B10").Select
Selection.NumberFormat = "0.00%"
Range("B11").Select
Selection.NumberFormat = "0.00%"
Range("A1").Select
End Sub
Your code is obviously working against a specific worksheet:
Rows("2:1000").Select
But what sheet is that? Only you can know that.
As written, it's whatever the active worksheet is, regardless of how much sense that makes.
Unqualified, these functions all implicitly refer to the ActiveSheet:
Range
Cells
Columns
Rows
Names
So you need to qualify them. And you do that by specifying a specific Worksheet object they should be working with - suppose that's DataSheet (I've no idea):
DataSheet.Rows("2:1000").Select
That would .Select the specified rows on the worksheet pointed to by the DataSheet object.
By why do you need to .Select it? This:
Rows("2:1000").Select
Selection.ClearContents
Could just as well be:
DataSheet.Rows("2:1000").ClearContents
Or better - assuming your data is formatted as a table (seems it looks like one anyway - so why not use the ListObjects API?):
DataSheet.ListObjects("DataTable").DataBodyRange.Delete
Sounds like that instruction has just replaced all the .Select and .ClearContents going on here. Note that .Select mimicks user action - the user clicking on a cell (or anything really) and selecting it. You have programmatic access to the entire object model - you never need to .Select anything!
Dim inTicker As String
inTicker = Range("A1")
Here you're implicitly reading from the active sheet, but you're also implicitly converting a Variant (the cell's value) into a String, which may or may not succeed. If A1 contains an error value (e.g. #REF!), the instruction fails.
With DataSheet.Range("A1")
If Not IsError(.Value) Then
inTicker = CStr(.Value)
Else
'decide what to do then
End If
End With
Your error-handling subroutine should at least Debug.Print Err.Number, Err.Description so that you have a bit of a clue about why things blew up. Right now it's assuming a reason for failure, and as you saw, Excel is full of traps.
Also you're using vbLf, but that's only half of a proper Windows newline character. Use vbNewLine if you're not sure what that is.
An Exit Sub instruction just before an End Sub token is completely useless.
Sub GetFinStats(inTicker As String)
The procedure is implicitly Public, and inTicker is implicitly passed ByRef. Kudos for giving it an explicit type!
This would be better:
Private Sub GetFinStats(ByVal inTicker As String)
With ActiveSheet.QueryTables
At least that's explicit about using the active sheet. But should it use the active sheet, or a specific sheet? And what happens to the query tables that were already there?
I strongly recommend you type this in the immediate pane:
?ThisWorkbook.Connections.Count
If the number is greater than the number of .QueryTables.Add calls you have in your procedure (likely), you have quite a problem there: I suspect you have over a hundred connections in the workbook, and clicking the "Refresh All" button takes forever to finish, and it's fairly possible that finance.yahoo.com is receiving dozens of requests from a single IP in a very limited amount of time, and refuses to serve them.
Delete all unused workbook connections. And then fix the implicit ActiveSheet references there too, and get rid of all these useless .Select calls:
With TheSpecificSheet
With .QueryTables.Add( ... )
End With
With .QueryTables.Add( ... )
End With
With .QueryTables.Add( ... )
End With
'assgin .Value, not .FormulaR1C1; you're not entering a R1C1 formula anyway
.Range("A3").Value = "Current Ratio"
.Range("A4").Value = "Quick Ratio"
.Range("A5").Value = "Cash Ratio"
End With
Consecutive .Select calls mean all but the last one serve a purpose, if any:
Range("A6").Select
Range("A7").Select
Again, don't assign ActiveCell when you can assign .Range("A7").Value directly.
And you can set number formats for a range of cells:
.Range("B3:B11").NumberFormat = "0.00%"
You can still retrieve the necessary data by parsing JSON response either from
https://finance.yahoo.com/quote/AAPL/financials(extracting data from HTML content, AAPL here just for example)
or via API
https://query1.finance.yahoo.com/v10/finance/quoteSummary/AAPL?lang=en-US®ion=US&modules=incomeStatementHistory%2CcashflowStatementHistory%2CbalanceSheetHistory%2CincomeStatementHistoryQuarterly%2CcashflowStatementHistoryQuarterly%2CbalanceSheetHistoryQuarterly%2Cearnings
You may use the below VBA code to parse response and output result. Import JSON.bas module into the VBA project for JSON processing. Here are Sub Test_query1_finance_yahoo_com() to get data via API and Test_finance_yahoo_com_quote to extract data from HTML content:
Option Explicit
Sub Test_query1_finance_yahoo_com()
Dim sSymbol As String
Dim sJSONString As String
Dim vJSON As Variant
Dim sState As String
sSymbol = "AAPL"
' Get JSON via API
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://query1.finance.yahoo.com/v10/finance/quoteSummary/" & sSymbol & "?lang=en-US®ion=US&modules=incomeStatementHistory%2CcashflowStatementHistory%2CbalanceSheetHistory%2CincomeStatementHistoryQuarterly%2CcashflowStatementHistoryQuarterly%2CbalanceSheetHistoryQuarterly%2Cearnings", False
.Send
sJSONString = .ResponseText
End With
' Parse JSON response
JSON.Parse sJSONString, vJSON, sState
If sState = "Error" Then
MsgBox "Invalid JSON"
Exit Sub
End If
' Pick core data
Set vJSON = vJSON("quoteSummary")("result")(0)
' Output
QuoteDataOutput vJSON
MsgBox "Completed"
End Sub
Sub Test_finance_yahoo_com_quote()
Dim sSymbol As String
Dim sJSONString As String
Dim vJSON As Variant
Dim sState As String
sSymbol = "AAPL"
' Get webpage HTML response
With CreateObject("Msxml2.XMLHTTP")
.Open "GET", "https://finance.yahoo.com/quote/" & sSymbol & "/financials", False
.Send
sJSONString = .ResponseText
End With
' Extract JSON from HTML content
sJSONString = "{" & Split(sJSONString, "root.App.main = {")(1)
sJSONString = Split(sJSONString, "}(this));")(0)
sJSONString = Left(sJSONString, InStrRev(sJSONString, "}"))
' Parse JSON response
JSON.Parse sJSONString, vJSON, sState
If sState = "Error" Then
MsgBox "Invalid JSON"
Exit Sub
End If
' Pick core data
Set vJSON = vJSON("context")("dispatcher")("stores")("QuoteSummaryStore")
' Output
QuoteDataOutput vJSON
MsgBox "Completed"
End Sub
Sub QuoteDataOutput(vJSON)
Const Transposed = True ' Output option
Dim oItems As Object
Dim vItem
Dim aRows()
Dim aHeader()
' Fetch main structures available from JSON object to dictionary
Set oItems = CreateObject("Scripting.Dictionary")
With oItems
.Add "IncomeStatementY", vJSON("incomeStatementHistory")("incomeStatementHistory")
.Add "IncomeStatementQ", vJSON("incomeStatementHistoryQuarterly")("incomeStatementHistory")
.Add "CashflowY", vJSON("cashflowStatementHistory")("cashflowStatements")
.Add "CashflowQ", vJSON("cashflowStatementHistoryQuarterly")("cashflowStatements")
.Add "BalanceSheetY", vJSON("balanceSheetHistory")("balanceSheetStatements")
.Add "BalanceSheetQ", vJSON("balanceSheetHistoryQuarterly")("balanceSheetStatements")
.Add "EarningsChartQ", vJSON("earnings")("earningsChart")("quarterly")
.Add "FinancialsChartY", vJSON("earnings")("financialsChart")("yearly")
.Add "FinancialsChartQ", vJSON("earnings")("financialsChart")("quarterly")
End With
' Output each data set to separate worksheet
For Each vItem In oItems
' Convert each data set to array
JSON.ToArray oItems(vItem), aRows, aHeader
' Output array to worksheet
With GetSheet((vItem))
.Cells.Delete
If Transposed Then
Output2DArray .Cells(1, 1), WorksheetFunction.Transpose(aHeader)
Output2DArray .Cells(1, 2), WorksheetFunction.Transpose(aRows)
Else
OutputArray .Cells(1, 1), aHeader
Output2DArray .Cells(2, 1), aRows
End If
.Columns.AutoFit
End With
Next
End Sub
Function GetSheet(sName As String, Optional bCreate = True) As Worksheet
On Error Resume Next
Set GetSheet = ThisWorkbook.Sheets(sName)
If Err Then
If bCreate Then
Set GetSheet = ThisWorkbook.Sheets.Add(, ThisWorkbook.Sheets(ThisWorkbook.Sheets.Count))
GetSheet.Name = sName
End If
Err.Clear
End If
End Function
Sub OutputArray(oDstRng As Range, aCells As Variant)
With oDstRng
.Parent.Select
With .Resize(1, UBound(aCells) - LBound(aCells) + 1)
.NumberFormat = "#"
.Value = aCells
End With
End With
End Sub
Sub Output2DArray(oDstRng As Range, aCells As Variant)
With oDstRng
.Parent.Select
With .Resize( _
UBound(aCells, 1) - LBound(aCells, 1) + 1, _
UBound(aCells, 2) - LBound(aCells, 2) + 1)
.NumberFormat = "#"
.Value = aCells
End With
End With
End Sub
Finally Sub QuoteDataOutput(vJSON) input is a JSON object, to make it clear how the necessary data is being extracted from it, you may save the JSON string to file, copy the contents and paste it to any JSON viewer for further study. I use online tool http://jsonviewer.stack.hu, target element structure is shown below:
The output for me is as follows (first worksheet shown):
There are 9 main sections, the relevant part of the data is extracted and output to 9 worksheets:
IncomeStatementY
IncomeStatementQ
CashflowY
CashflowQ
BalanceSheetY
BalanceSheetQ
EarningsChartQ
FinancialsChartY
FinancialsChartQ
Having that example you can extract the data you need from that JSON response.
It turns out that Yahoo ended the application from which the web query drew its data. Thank you for all your tips.
I am new in Excel VBA/Macro
I need to grab the specific part of the page, not the full page.
The bellow code works in complete page, but don't need the all parts of the page.
Sub GrabOutStandingTable()
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;http://dsebd.org/displayCompany.php?name=ABBANK", Destination:=Range( _
"$A$1"))
.CommandType = 0
.Name = "displayCompany.php?name=ABBANK"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlSpecifiedTables
.WebFormatting = xlWebFormattingNone
.WebTables = """company"""
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
Sheets.Add After:=ActiveSheet
End Sub
The table part headed as "Other Information of the Company" is in lower part of the page, which is that I am talking about. The macro should extract this part.
XHR request:
You could do a much faster browserless XHR request and simply target the table of interest, which is at position 23, if you collect elements by their (non-unique) company id.
I use querySelectorAll method to grab the matching nodes and then extract the table at index 23.
Note the additional sponsor info that is displayed in then code output.
Webpage view:
Sample code output:
Code:
Option Explicit
Public Sub GetTable()
Dim sResponse As String, hTable As Object, HTML As New HTMLDocument
Application.ScreenUpdating = False
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://dsebd.org/displayCompany.php?name=ABBANK", False
.send
sResponse = StrConv(.responseBody, vbUnicode)
End With
sResponse = Mid$(sResponse, InStr(1, sResponse, "<!DOCTYPE "))
With HTML
.body.innerHTML = sResponse
Set hTable = .querySelectorAll("#company")(23)
End With
WriteTable hTable
Application.ScreenUpdating = True
End Sub
Public Sub WriteTable(ByVal hTable As HTMLTable, Optional ByVal startRow As Long = 1, Optional ByVal ws As Worksheet)
If ws Is Nothing Then Set ws = ActiveSheet
Dim tSection As Object, tRow As Object, tCell As Object, tr As Object, td As Object, R As Long, C As Long, tBody As Object
R = startRow
With ws
Dim headers As Object, header As Object, columnCounter As Long
Set headers = hTable.getElementsByTagName("th")
For Each header In headers
columnCounter = columnCounter + 1
.Cells(startRow, columnCounter) = header.innerText
Next header
startRow = startRow + 1
Set tBody = hTable.getElementsByTagName("tbody")
For Each tSection In tBody 'HTMLTableSection
Set tRow = tSection.getElementsByTagName("tr") 'HTMLTableRow
For Each tr In tRow
R = R + 1
Set tCell = tr.getElementsByTagName("td")
C = 1
For Each td In tCell 'DispHTMLElementCollection
.Cells(R, C).Value = td.innerText 'HTMLTableCell
C = C + 1
Next td
Next tr
Next tSection
End With
End Sub
References:
VBE > Tools > References > HTML Object Library
The old Data > From Web won't handle this due to the way that website is structured. The data you need is nested deeply into other tables and consists of several tables.
Suggest using Power Query instead (no VBA required). Here is how to use Power Query in XL2013 for this. Use Excel's ribbon and find tab POWER QUERY.
Use menu option: POWER QUERY > From Web
From Web dialog displays. Enter your URL.
Click OK
The data you need is in Table 30. Find and click it, then click Load.
If we have XL2016 (office 365) we already have Power Query. If we have XL2010 or XL2013 we can download it from: https://www.microsoft.com/en-us/download/details.aspx?id=39379&CorrelationId=1441491e-917e-43de-8d6a-21f98287c3c2
I want to extract match data from this web site "http://bet.hkjc.com/football/index.aspx?lang=en" using the following code :
Sub Macro4()
' Macro4 Macro
' steve lau 在 28/04/2016 錄製的巨集
baseURL = "http://www.hkjc.com/chinese/news/redirect_odds_ch_football.asp"
baseName = "summary"
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;" & baseURL _
, Destination:=Range("A1"))
End With
With ActiveSheet.QueryTables.Add(Destination:=Range("A1"))
.Name = baseName
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=True
End With
End Sub
But nothing was returned. I think it may due to different frames in the web page. Could anyone can help to figure out how to extract the match details ?
Many thanks.
You can use the following script where I grab the table using a
.document.getElementById("footballmaincontent").getElementsByTagName("table")(2)
and then loop the rows and columns (cells within row) within the table.
Sample results on page on 14/06/2018
Matches output from script:
Code:
Option Explicit
Public Sub GetInfo()
Dim IE As New InternetExplorer, a As HTMLTable
Const URL = "http://bet.hkjc.com/football/index.aspx?lang=en"
Application.ScreenUpdating = True
With IE
.Visible = False
.navigate URL
While .Busy Or .readyState < 4: DoEvents: Wend
Set a = .document.getElementById("footballmaincontent").getElementsByTagName("table")(2)
Dim r As Long, c As Long, iRow As HTMLTableRow, iCell As HTMLTableCell
With ActiveSheet
For Each iRow In a.getElementsByTagName("tr")
For Each iCell In iRow.getElementsByTagName("td")
Select Case iCell.innerText
Case "Home", "Draw", "Away"
Case Else
c = c + 1: .Cells(r + 1, c) = iCell.innerText
End Select
Next iCell
c = 0: r = r + 1
Next iRow
End With
.Quit
End With
Application.ScreenUpdating = True
End Sub
References required (VBE > Tools > References)
HTML Object Library
Microsoft Internet Controls
I've been pretty new to Excel VBA, and I'm at some simple but stumping issues (might be from the overdose of coffee). My code currently pull tables from Yahoo Finance in a loop (so I can put in multiple tickers). What I'm trying to adjust is first inputting the tickers along a row instead of along a column - I tried changing all the "rows" of the lr1 line and that didnt do much. Secondly, I tried to figure out a way to pull the information so it's only the numbers, instead of having the entire table pulled out since its the same line items each time. It would help if I could also erase/overwrite my macro results everytime I click the refresh button.
The result would be as simple as putting in a ticker on one column at the top, and the numbers come out right underneath after hitting a button - I feel that I'm close, but no cigar.
This is the code that works, not the one riddled with as many mistakes.
Sub RefreshQuery()
On Error Resume Next
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
Dim DestinationCell As Range
Dim StockSymbol As String
Dim i As Long, lr1 As Long, lr2 As Long
lr1 = Range("B:B").Find("*", LookIn:=xlValues, SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row
For i = 1 To QueryTables.Count
QueryTables(i).Delete
Next i
Range("C:D").Clear
For i = 2 To lr1
lr2 = Range("D:D").Find("*", LookIn:=xlValues, SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row
Set DestinationCell = Cells((lr2 + 3), 4)
StockSymbol = Cells(i, 2).Value
Cells((lr2 + 2), 4).Value = "****" & StockSymbol & "****"
With QueryTables.Add(Connection:="URL;http://finance.yahoo.com/q/ks?s=" & StockSymbol & "+Key+Statistics", Destination:=DestinationCell)
.Name = "q/ks?s=" & StockSymbol & "+Key+Statistics"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlSpecifiedTables
.WebFormatting = xlWebFormattingNone
.WebTables = "8,9,10,11,12,13,14,15,16,17,18,19,20,21,25,26,27,29"
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
Next i
Application.ScreenUpdating = False
Application.Calculation = xlCalculationAutomatic
End Sub
Thanks for all the help!
For only the numbers you do a query on another sheet. On your main sheet type =SheetWithYahooPage!b5 where you want the number only. It's easier to just pull the whole page in.
Queries have their own refresh abilities. Why are you doing it in code? Queries can delete the data. There are two dialogs for Queries. One called Options where you select the page. And another button called Properties on the last dialog where you are asked where to insert the data.
While my code works for 10 loop iterations, it crashes for home = 30 or more. Can someone please provide me with a clue? Even weirder this code used to work fine... and is not working anymore.
Here's the code:
Sub datascrap_clean()
'
Application.Calculation = xlCalculationManual
Application.ScreenUpdating = False
Dim home As Integer
Dim output_rows As Integer
Dim output_columns As Integer
Dim date_columns As Integer
'Output rows and columns starting values
output_rows = 3
output_columns = 3
date_columns = 8
For home = 3 To 33
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;http://www.cqc.org.uk/directory/" & Sheets("Output").Cells(home, 1), Destination:=Range("$A$1") _
)
'.CommandType = 0
.Name = "Homes"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
For x = 20 To 250
Select Case Left(Cells(x, 1), 7)
'Is it a score?
Case Is = "Overall"
Sheets("Output").Cells(output_rows, output_columns).Value = Cells(x, 1)
output_columns = output_columns + 1
'Is it a date?
'Case Is = "Carried"
' Sheets("Output").Cells(output_rows, output_columns).Value = Cells(x, 1)
'date_columns = date_columns + 1
Case Else
End Select
Sheets(2).Select
Next x
'Clean sheet
ActiveSheet.Cells.Delete
'Reset column count
output_columns = 3
date_columns = 8
output_rows = output_rows + 1
Next home
MsgBox ("Done!")
End Sub
I had the same problem with creating QueryTable objects within a loop and having Excel hang at seemingly random times (usually after creating about 15 QueryTable objects). I noticed that the problem did not happen when I was in the VBE debugging and running with breakpoints inserted. So, in addition to deleting QueryTable objects after using them as suggested in a previous answer, I inserted a short delay at the start of my loop:
Application.Wait(Now + TimeValue("0:00:02"))
Was able to successfully run a case with ~300 QueryTable objects created with no hanging. Yes, a kludge, but it least provides a work around. Without the delay, I still got Excel to hang even after deleting the QueryTable objects.