Excel VBA script freezes Excel after running for a few minutes

My code works for 10 loop iterations, but it crashes once home reaches 30 or more. Can someone please give me a clue? Even weirder, this code used to work fine and no longer does.
Here's the code:
Sub datascrap_clean()
'
Application.Calculation = xlCalculationManual
Application.ScreenUpdating = False
Dim home As Integer
Dim output_rows As Integer
Dim output_columns As Integer
Dim date_columns As Integer
'Output rows and columns starting values
output_rows = 3
output_columns = 3
date_columns = 8
For home = 3 To 33
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;http://www.cqc.org.uk/directory/" & Sheets("Output").Cells(home, 1), Destination:=Range("$A$1") _
)
'.CommandType = 0
.Name = "Homes"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
For x = 20 To 250
Select Case Left(Cells(x, 1), 7)
'Is it a score?
Case Is = "Overall"
Sheets("Output").Cells(output_rows, output_columns).Value = Cells(x, 1)
output_columns = output_columns + 1
'Is it a date?
'Case Is = "Carried"
' Sheets("Output").Cells(output_rows, output_columns).Value = Cells(x, 1)
'date_columns = date_columns + 1
Case Else
End Select
Sheets(2).Select
Next x
'Clean sheet
ActiveSheet.Cells.Delete
'Reset column count
output_columns = 3
date_columns = 8
output_rows = output_rows + 1
Next home
MsgBox ("Done!")
End Sub

I had the same problem with creating QueryTable objects within a loop and having Excel hang at seemingly random times (usually after creating about 15 QueryTable objects). I noticed that the problem did not happen when I was in the VBE debugging and running with breakpoints inserted. So, in addition to deleting QueryTable objects after using them as suggested in a previous answer, I inserted a short delay at the start of my loop:
Application.Wait(Now + TimeValue("0:00:02"))
With that in place I was able to run a case with ~300 QueryTable objects created and no hanging. Yes, it's a kludge, but at least it provides a workaround. Without the delay, Excel still hung even after I deleted the QueryTable objects.
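A minimal sketch of how the two measures (deleting each QueryTable after use and pausing before each request) could be combined. The procedure name FetchPage and its parameters are hypothetical and not part of the original code:

```vba
' Hypothetical helper: import one page, then remove the QueryTable so
' that query definitions do not pile up on the sheet across iterations.
Public Sub FetchPage(ByVal ws As Worksheet, ByVal pageUrl As String)
    Dim qt As QueryTable

    ' Short pause before each request (the workaround described above).
    Application.Wait Now + TimeValue("0:00:02")

    Set qt = ws.QueryTables.Add(Connection:="URL;" & pageUrl, _
                                Destination:=ws.Range("A1"))
    With qt
        .WebSelectionType = xlEntirePage
        .WebFormatting = xlWebFormattingNone
        .Refresh BackgroundQuery:=False   ' wait until the import finishes
        .Delete                           ' drop the query definition; imported cells stay
    End With
End Sub
```

Inside the loop this could be called as FetchPage ActiveSheet, "http://www.cqc.org.uk/directory/" & Sheets("Output").Cells(home, 1) before the imported cells are read.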

Related

Web Scraping Data Destination

I am very new to using VBA in Excel. I have a list of hundreds of links that I want to scrape the data from (the links do not have nicely formatted tables, I have to scrape the raw data for what I need). I have a macro built that is working nicely, but the destination is not quite what I'm hoping for.
I want Excel to read each URL and then dump the data in the next column over, BUT I want each set of data to appear directly below the previous one, so that the data from all of the URLs ends up in the same column. Currently, my macro puts the data from each URL into its own column.
Example:
My URLs are listed in separate rows in column A. The macro reads A1 and dumps the data in B1. Then it shifts that column to the right (so it is now C1) and dumps the data from A2 into (the new) B2.
What I want it to do is read A1, and put the data in B1. Then, read A2 and put the data in B30 (if B29 was the last used row for the data from A1).
Hopefully this makes sense.
Here is the macro I currently have working:
Sub WebScraping()
Dim Erw, Frw, Lrw
Frw = 1
Lrw = Range("A" & Rows.Count).End(xlUp).Row
For Erw = Frw To Lrw
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;" & Range("A" & Erw).Value, Destination:=Range("B" & Erw))
.Name = ""
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
Next Erw
End Sub
And here are some examples of the URLs I am scraping:
http://www.washington.edu/students/timeschd/AUT2018/arctic.html
http://www.washington.edu/students/timeschd/AUT2018/hnrs.html
Thanks for any suggestions!
So the yellow URL is the first in the list, and the red URL is the second in the list. The yellow column to the right is where the macro places the data from the first URL, and the red column to the right is where it places the data from the second URL. But I want it to first place the data from the first URL in column B, then place the data from the second URL directly below that.
You need to write to the next available row in column B (or add some padding rows in between as well):
Option Explicit
Public Sub WebScraping()
Dim Erw As Long, Frw As Long, Lrw As Long, LrwB As Long
Frw = 1
Lrw = ActiveSheet.Range("A" & ActiveSheet.Rows.Count).End(xlUp).Row
For Erw = Frw To Lrw
LrwB = ActiveSheet.Range("B" & ActiveSheet.Rows.Count).End(xlUp).Row + 1
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;" & ActiveSheet.Range("A" & Erw).Value, Destination:=ActiveSheet.Range("B" & LrwB))
.Name = vbNullString
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
Next Erw
End Sub
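One edge case with this approach: on a completely empty column B, End(xlUp).Row is 1, so adding 1 makes the first import start at B2 rather than B1. A small helper along these lines could cover that; NextFreeRow is a hypothetical name, not something from the answer:

```vba
' Hypothetical helper: first unused row in a column,
' returning 1 when the column is completely empty.
Public Function NextFreeRow(ByVal ws As Worksheet, ByVal colLetter As String) As Long
    Dim lastCell As Range
    Set lastCell = ws.Range(colLetter & ws.Rows.Count).End(xlUp)
    If lastCell.Row = 1 And IsEmpty(lastCell.Value) Then
        NextFreeRow = 1
    Else
        NextFreeRow = lastCell.Row + 1
    End If
End Function
```

In the loop, the assignment would then read LrwB = NextFreeRow(ActiveSheet, "B").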

Extract match data from the web page "http://bet.hkjc.com/football/index.aspx?lang=en"

I want to extract match data from the web site "http://bet.hkjc.com/football/index.aspx?lang=en" using the following code:
Sub Macro4()
' Macro4 Macro
' Macro recorded by steve lau on 28/04/2016
baseURL = "http://www.hkjc.com/chinese/news/redirect_odds_ch_football.asp"
baseName = "summary"
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;" & baseURL _
, Destination:=Range("A1"))
End With
With ActiveSheet.QueryTables.Add(Destination:=Range("A1"))
.Name = baseName
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=True
End With
End Sub
But nothing was returned. I think it may be due to the different frames in the web page. Could anyone help me figure out how to extract the match details?
Many thanks.
You can use the following script, which grabs the table using
.document.getElementById("footballmaincontent").getElementsByTagName("table")(2)
and then loops over the rows and the columns (cells within each row) of the table.
Sample results on page on 14/06/2018
Matches output from script:
Code:
Option Explicit
Public Sub GetInfo()
Dim IE As New InternetExplorer, a As HTMLTable
Const URL = "http://bet.hkjc.com/football/index.aspx?lang=en"
Application.ScreenUpdating = False
With IE
.Visible = False
.navigate URL
While .Busy Or .readyState < 4: DoEvents: Wend
Set a = .document.getElementById("footballmaincontent").getElementsByTagName("table")(2)
Dim r As Long, c As Long, iRow As HTMLTableRow, iCell As HTMLTableCell
With ActiveSheet
For Each iRow In a.getElementsByTagName("tr")
For Each iCell In iRow.getElementsByTagName("td")
Select Case iCell.innerText
Case "Home", "Draw", "Away"
Case Else
c = c + 1: .Cells(r + 1, c) = iCell.innerText
End Select
Next iCell
c = 0: r = r + 1
Next iRow
End With
.Quit
End With
Application.ScreenUpdating = True
End Sub
References required (VBE > Tools > References):
Microsoft HTML Object Library
Microsoft Internet Controls

How to scrape data from Twitter from multiple URLs in a column of Excel

I need to basically fill two cells in each row with data from the respective URL present in the same row. When the query is made using macro it returns 5 pieces of data out of which I need only two.
I recorded two macros (one to query the page and the other to adjust the pieces of data I get), combined them, and wrapped them in a Do While loop.
But it returns an error on this line: myurl = "URL;" & Tabsheet.Cells(i, 6)
Below is the code:
Sub Macro1()
'
' Macro1 Macro
'
Sheets("Tabsheet").Select
Range("A1").Select
Dim i As Integer, myurl As String
i = 1
Do While i < 102
myurl = "URL;" & Tabsheet.Cells(i, 6)
With ActiveSheet.QueryTables.Add(Connection:= _
myurl, Destination:=ActiveCell.Offset(i, 8))
.Name = "Query" & i
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlSpecifiedTables
.WebFormatting = xlWebFormattingNone
.WebTables = "3"
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
Range("H105").Select
Selection.Cut
Range("I104").Select
Selection.Cut
Range("I103").Select
Selection.ClearContents
Range("H102").Select
i = i + 1
Loop
End Sub
Any suggestions will be extremely helpful.
This should work:
myurl = CStr("URL;" & Sheets("Tabsheet").Cells(i, 6))
You can't use TabSheet directly because you never set it as an object. You could have used it if you had done this first:
Dim TabSheet As Worksheet
Set TabSheet = ThisWorkbook.Sheets("TabSheet")
'------------------------------------
'--------Here comes your code--------
'------------------------------------
'And when it's done, don't forget to free what you have set
Set TabSheet = Nothing

Using From Web Query in a loop

I am trying to pull data from this website: http://securities.stanford.edu/filings.html?page=1
Each "page" is a table with 21 items. There are 97 pages I would like to pull data from, but I am unable to automate it so that the macro cycles through all 97 and places the results every 21 rows, starting at cell A1 (sequence: A1, A22, A43, etc.).
This is what I have so far, but I don't want to edit the code 97 times to get all the pages. Any idea how I could automate the task?
Sub Macro1()
' Macro1 Macro
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;http://securities.stanford.edu/filings.html?page=1", Destination:=Range( _
"A1"))
.Name = "filings.html?page=1"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlAllTables
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
End Sub
For x = 1 To 97
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;http://securities.stanford.edu/filings.html?page=" & x, Destination:=Range( _
"A" & (1 + ((x - 1) * 21))))
.Name = "filings.html?page=" & x
' ... keep the other property settings from the recorded macro here ...
.Refresh BackgroundQuery:=False
End With
Next x
x holds the page number. The row expression is a little convoluted so that the first page starts at A1 rather than further down the sheet.
Alternatively, you could loop from 0 to 96, use "A" & (1 + (x * 21)) for the cell, and use x + 1 for the name and the query page.
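The row arithmetic can also be isolated in a small function. This is only a sketch, and PageStartRow is a hypothetical name:

```vba
' Hypothetical helper: first worksheet row for a given 1-based page,
' assuming every page imports the same number of rows.
Public Function PageStartRow(ByVal pageNumber As Long, ByVal rowsPerPage As Long) As Long
    PageStartRow = 1 + (pageNumber - 1) * rowsPerPage
End Function
```

With 21 rows per page, pages 1, 2 and 3 map to rows 1, 22 and 43, which matches the A1, A22, A43 sequence in the question. The destination would then be Range("A" & PageStartRow(x, 21)).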
I would abandon the 'From Web' query method and delve into some XMLHTTP. For the following, you will have to use the VBE's Tools ► References to add Microsoft HTML Object Library, Microsoft Internet Controls and Microsoft XML, v6.0.
Option Explicit
Sub mcr_Collect_Filings()
Dim htmlBDY As HTMLDocument, xmlHTTP As New MSXML2.ServerXMLHTTP60
Dim rw As Long, pg As Long, iTH As Long, iTD As Long, iTR As Long
Dim eTBL As MSHTML.IHTMLElement
For pg = 1 To 99 '<-set to something reasonable; the routine will exit when it cannot find anything more
xmlHTTP.Open "GET", "http://securities.stanford.edu/filings.html?page=" & pg, False
xmlHTTP.setRequestHeader "Content-Type", "text/xml"
xmlHTTP.send
If xmlHTTP.Status <> 200 Then GoTo bm_CleanUp
Set htmlBDY = New HTMLDocument
htmlBDY.body.innerHTML = xmlHTTP.responseText
Set eTBL = htmlBDY.getElementById("records").getElementsByTagName("table")(0)
If eTBL Is Nothing Then GoTo bm_CleanUp
'skip the header row if on page 2 and above
With Sheet1 '<-worksheet codename
rw = .Cells(Rows.Count, 1).End(xlUp).Row
For iTR = (1 + (pg = 1)) To (eTBL.getElementsByTagName("tr").Length - 1)
For iTH = 0 To (eTBL.getElementsByTagName("tr")(iTR).getElementsByTagName("th").Length - 1)
.Cells(rw, 1).Offset(iTR, iTH) = _
eTBL.getElementsByTagName("tr")(iTR).getElementsByTagName("th")(iTH).innerText
Next iTH
For iTD = 0 To (eTBL.getElementsByTagName("tr")(iTR).getElementsByTagName("td").Length - 1)
.Cells(rw, 1).Offset(iTR, iTD) = _
eTBL.getElementsByTagName("tr")(iTR).getElementsByTagName("td")(iTD).innerText
Next iTD
Next iTR
End With
Next pg
bm_CleanUp:
Set eTBL = Nothing
Set htmlBDY = Nothing
Set xmlHTTP = Nothing
End Sub
XMLHTTP is invisible, so you have to know a little about the page and what HTML to expect back under different circumstances. A browser's Inspect Element command takes care of that.
This is by far the fastest method in VBA. While you actually have more than 99 rows to retrieve, this went through 99 pages in 56.3 seconds. You might even speed that up a bit by turning off screen updating.
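One way the screen-updating suggestion might look, as a sketch only. The wrapper name RunWithoutScreenUpdating is hypothetical; it calls the mcr_Collect_Filings routine shown above:

```vba
' Sketch: run the collection routine with screen updating off and
' restore the previous setting even if the routine raises an error.
Public Sub RunWithoutScreenUpdating()
    Dim previousSetting As Boolean
    previousSetting = Application.ScreenUpdating

    On Error GoTo Restore
    Application.ScreenUpdating = False
    mcr_Collect_Filings            ' the routine from the answer above
Restore:
    Application.ScreenUpdating = previousSetting
    If Err.Number <> 0 Then MsgBox "Stopped: " & Err.Description
End Sub
```

Saving and restoring the previous value, rather than forcing True at the end, keeps the wrapper from overriding a caller that had already switched screen updating off.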

Fetch specific table only from website into Excel

I need to fetch the table from http://www.zillow.com/homes/comps/67083361_zpid/ into Excel using VBA. I just want the table, nothing else. But when I'm using:
Set objIE = CreateObject("InternetExplorer.Application")
With objIE
.Visible = True
.Navigate "http://www.zillow.com/homes/comps/67083361_zpid/"
Do While .ReadyState <> 4: DoEvents: Loop
Debug.Print .document.Body.outerText
End With
it gives me text like:
4723 N 63rd Dr$63,50008/17/201241.752,0747,6751972$360.11
for each product which I can't analyze and store into different cells of Excel.
So is there a way I can fetch the page data in a manageable form? I am OK with looping over the results, and I can do additional processing to fill the row data into Excel properly.
I'd use the below since I find query tables slow and IE excruciatingly slow ;)
Sub GetData()
Dim x As Long, y As Long
Dim htm As Object
Set htm = CreateObject("htmlFile")
With CreateObject("msxml2.xmlhttp")
.Open "GET", "http://www.zillow.com/homes/comps/67083361_zpid/", False
.send
htm.body.innerhtml = .responsetext
End With
With htm.getelementbyid("comps-results")
For x = 0 To .Rows.Length - 1
For y = 0 To .Rows(x).Cells.Length - 1
Sheets(1).Cells(x + 1, y + 1).Value = .Rows(x).Cells(y).innertext
Next y
Next x
End With
End Sub
I have done it using the following code:
Sub FetchData()
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;http://www.zillow.com/homes/comps/67083361_zpid", Destination:=Range( _
"$A$1"))
.Name = "67083361_zpid"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
End Sub