Excel VBA Web Table & Input - vba

I'm pretty new to Excel VBA, and I'm stuck on some simple but stumping issues (might be from an overdose of coffee). My code currently pulls tables from Yahoo Finance in a loop (so I can put in multiple tickers). First, I'd like to input the tickers along a row instead of down a column. I tried changing all the "rows" in the lr1 line and that didn't do much. Second, I tried to figure out a way to pull only the numbers instead of the entire table, since it's the same line items each time. It would also help if I could erase/overwrite my macro results every time I click the refresh button.
The result would be as simple as putting a ticker at the top of a column and having the numbers come out right underneath after hitting a button. I feel that I'm close, but no cigar.
This is the code that works, not the one riddled with mistakes.
Sub RefreshQuery()
On Error Resume Next
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
Dim DestinationCell As Range
Dim StockSymbol As String
Dim i As Long, lr1 As Long, lr2 As Long
lr1 = Range("B:B").Find("*", LookIn:=xlValues, SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row
For i = QueryTables.Count To 1 Step -1 'delete backwards so no query is skipped
QueryTables(i).Delete
Next i
Range("C:D").Clear
For i = 2 To lr1
lr2 = Range("D:D").Find("*", LookIn:=xlValues, SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row
Set DestinationCell = Cells((lr2 + 3), 4)
StockSymbol = Cells(i, 2).Value
Cells((lr2 + 2), 4).Value = "****" & StockSymbol & "****"
With QueryTables.Add(Connection:="URL;http://finance.yahoo.com/q/ks?s=" & StockSymbol & "+Key+Statistics", Destination:=DestinationCell)
.Name = "q/ks?s=" & StockSymbol & "+Key+Statistics"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlSpecifiedTables
.WebFormatting = xlWebFormattingNone
.WebTables = "8,9,10,11,12,13,14,15,16,17,18,19,20,21,25,26,27,29"
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
Next i
Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic
End Sub
Thanks for all the help!

To get only the numbers, run the query on another sheet. Then, on your main sheet, type =SheetWithYahooPage!B5 where you want the number only. It's easier to just pull the whole page in.
Queries have their own refresh abilities, so why are you doing it in code? Queries can also delete the data. There are two dialogs for queries: one called Options, where you select the page, and another reached via the Properties button on the last dialog, where you are asked where to insert the data.
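For the row-wise ticker input and the clear-on-refresh parts of the question, a minimal sketch might look like the following. It assumes the tickers are typed across row 1 starting in column B and that results belong below that row; the sub name and that layout are assumptions for illustration, not part of the original macro.

```vba
Sub ClearAndListTickers()
    ' Assumption: tickers are in row 1, from column B rightwards, on the active sheet.
    Dim ws As Worksheet
    Dim lastCol As Long, c As Long, i As Long

    Set ws = ActiveSheet

    ' Remove old queries; loop backwards because deleting shifts the collection.
    For i = ws.QueryTables.Count To 1 Step -1
        ws.QueryTables(i).Delete
    Next i

    ' Wipe previous results but keep the ticker row itself.
    ws.Range(ws.Rows(2), ws.Rows(ws.Rows.Count)).Clear

    ' Last used column in row 1 (row-wise counterpart of the lr1 line).
    lastCol = ws.Cells(1, ws.Columns.Count).End(xlToLeft).Column

    For c = 2 To lastCol
        If Len(ws.Cells(1, c).Value) > 0 Then
            ' Placeholder: a query for this ticker would be added here,
            ' with its Destination somewhere below ws.Cells(2, c).
            Debug.Print ws.Cells(1, c).Value
        End If
    Next c
End Sub
```

The Debug.Print line only lists the tickers in the Immediate window; it marks where the QueryTables.Add call would go.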

Related

Web Scraping Data Destination

I am very new to using VBA in Excel. I have a list of hundreds of links that I want to scrape the data from (the links do not have nicely formatted tables, I have to scrape the raw data for what I need). I have a macro built that is working nicely, but the destination is not quite what I'm hoping for.
I want Excel to read each URL and then dump the data in the next column over, BUT I want each set of data to appear directly below the previous one, so that all of the data from all of the URLs ends up in the same column. Currently, my macro is putting the data from each URL into its own column.
Example:
My URLs are listed one per row in column A. The macro reads A1 and dumps the data in B1. Then it shifts that column to the right a bit (so it is now C1) and dumps the data from A2 into (the new) B2.
What I want it to do is read A1, and put the data in B1. Then, read A2 and put the data in B30 (if B29 was the last used row for the data from A1).
Hopefully this makes sense.
Here is the macro I currently have working:
Sub WebScraping()
Dim Erw, Frw, Lrw
Frw = 1
Lrw = Range("A" & Rows.Count).End(xlUp).Row
For Erw = Frw To Lrw
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;" & Range("A" & Erw).Value, Destination:=Range("B" & Erw))
.Name = ""
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
Next Erw
End Sub
And here are some examples of the URLs I am scraping:
http://www.washington.edu/students/timeschd/AUT2018/arctic.html
http://www.washington.edu/students/timeschd/AUT2018/hnrs.html
Thanks for any suggestions!
So the yellow URL is the first in the list, and the red URL is the second in the list. The yellow column to the right is where it is placing the data from the first URL, and the red column to the right is where it is placing the data from the second URL. But I want it to first place the data from the first URL in column B, then place the data from the second URL directly below that.
You need to write to the next available row in column B (or add some padding rows in between as well):
Option Explicit
Public Sub WebScraping()
Dim Erw As Long, Frw As Long, Lrw As Long, LrwB As Long
Frw = 1
Lrw = ActiveSheet.Range("A" & ActiveSheet.Rows.Count).End(xlUp).Row
For Erw = Frw To Lrw
LrwB = ActiveSheet.Range("B" & ActiveSheet.Rows.Count).End(xlUp).Row + 1
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;" & ActiveSheet.Range("A" & Erw).Value, Destination:=ActiveSheet.Range("B" & LrwB))
.Name = vbNullString
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
Next Erw
End Sub
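One edge case worth noting: when column B is still empty, End(xlUp) lands on row 1 and adding 1 gives row 2, so the first import starts at B2 rather than B1. A small helper along these lines could cover that. This is only a sketch; NextFreeRow is a hypothetical name, and it assumes the last imported row always has something in the checked column.

```vba
' Hypothetical helper: first empty row in a column, returning 1 when the column is blank.
Public Function NextFreeRow(ByVal ws As Worksheet, ByVal colLetter As String) As Long
    Dim lastCell As Range

    ' Walk up from the bottom of the sheet to the last filled cell in the column.
    Set lastCell = ws.Range(colLetter & ws.Rows.Count).End(xlUp)

    If lastCell.Row = 1 And IsEmpty(lastCell.Value) Then
        NextFreeRow = 1              ' nothing in the column yet
    Else
        NextFreeRow = lastCell.Row + 1
    End If
End Function
```

Inside the loop it would be used as LrwB = NextFreeRow(ActiveSheet, "B").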

How would I open a .csv file with VBA and read all the data?

When I extract my data from this one software into raw text, it separates the values with commas and quotation marks for the majority of the file, except that in certain cases the data contains a line break. When I save it as a .csv and open it, the data is cleanly formatted into a proper table.
However, if I attempt the same process with QueryTables, it processes the embedded enter character as a new row.
I have two different approaches in two different subprocedures.
The first is able to read the number of rows and columns properly, but since it uses the QueryTables method it reads that enter character as a new line.
The code for this approach is below:
Private Sub OpenCSVFile()
With ThisWorkbook
Set primeSheet = .Sheets.Add(After:=.Sheets(.Sheets.Count))
primeSheet.Name = "Temp"
End With
Set informationRange = primeSheet.Range("A1")
xAddress = informationRange.Address
With primeSheet.QueryTables.Add("TEXT;" & strPath, primeSheet.Range(xAddress))
.AdjustColumnWidth = False
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = False
.RefreshOnFileOpen = False
.RefreshStyle = xlOverwriteCells
.SaveData = False
.RefreshPeriod = 0
.TextFileStartRow = 1
.TextFileParseType = xlDelimited
.TextFileTextQualifier = xlTextQualifierDoubleQuote
.TextFileConsecutiveDelimiter = False
.TextFileTabDelimiter = False
.TextFileSemicolonDelimiter = False
.TextFileCommaDelimiter = True
.TextFileSpaceDelimiter = False
.TextFileTrailingMinusNumbers = True
.Refresh BackgroundQuery:=False
End With
rowLength = primeSheet.Cells(1, Columns.Count).End(xlToLeft).Column
colLength = primeSheet.Cells(Rows.Count, "A").End(xlUp).Row
End Sub
The other method keeps the formatting intact, the same as when I open the .csv files of my extracted data normally, but it is unable to read the number of rows and columns and returns a value of 1 for each.
Private Sub OpenCSV()
Set primeBook = Workbooks.Open(strPath)
With primeBook
rowLength = Sheet1.Cells(1, Columns.Count).End(xlToLeft).Column
colLength = Sheet1.Cells(Rows.Count, "A").End(xlUp).Row
End With
Set informationRange = Sheet1.Range("A1", Sheet1.Cells(colLength, rowLength))
End Sub
How would you recommend I address my issue and read the contents of the CSV file while ensuring there are no weird line breaks?
Would this work?
Private Sub OpenCSV()
Set primeBook = Workbooks.Open(strPath)
'Sheet1 is a codename in the macro workbook; address the opened CSV's own sheet instead
With primeBook.Worksheets(1)
rowLength = .UsedRange.Rows.Count
colLength = .UsedRange.Columns.Count
Set informationRange = .Range("A1", .Cells(rowLength, colLength))
End With
End Sub
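To actually read all the data once the file is open, one option is to copy the used range into a Variant array. The sketch below is self-contained and illustrative only: the sub name and csvPath are placeholders rather than names from the question, and it assumes the opened CSV's data starts at A1 on its single sheet.

```vba
Sub ReadCsvIntoArray()
    ' Assumption: csvPath is a placeholder; point it at a real file.
    Const csvPath As String = "C:\Temp\export.csv"
    Dim csvBook As Workbook
    Dim data As Variant
    Dim rowCount As Long, colCount As Long

    Set csvBook = Workbooks.Open(Filename:=csvPath, ReadOnly:=True)

    ' A CSV opens as a workbook with a single sheet, so Worksheets(1) is that data.
    With csvBook.Worksheets(1).UsedRange
        rowCount = .Rows.Count
        colCount = .Columns.Count
        data = .Value   ' 2-D, 1-based array when more than one cell is used
    End With

    csvBook.Close SaveChanges:=False
    Debug.Print rowCount & " rows x " & colCount & " columns"
End Sub
```

Because Workbooks.Open parses the file the same way as opening it by hand, quoted fields containing line breaks should stay in one cell, as described in the question.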

Pulling external data to next empty row as static, then repeating with next empty row? (Backing up Google Sheet data to Excel)

I have an Excel sheet that I would like to serve as the backup of a Google sheet, which I'll be clearing out periodically to prevent it from slowing down. I'm attempting to write a macro which, after a set period of time, will find the next empty row in the Excel sheet, activate the cell in column "A", and import the data from the Google Sheet. I don't want to "refresh" the data in Excel, because the plan is to delete the data in the Google Sheet every so often while the Excel sheet serves as a continuous record. I would simply like to pull the current Google Sheet data into the first cell of the next empty row, and schedule this to repeat.
Here's what I've been trying:
Sub addData()
newCell = Sheet1.Cells(Rows.Count, 1).End(xlUp).Offset(1, 0).Address
MsgBox newCell
Sheet1.QueryTables.Add(Connection:= _
"URL;googleSheetURL" _
, Destination:=Range(newCell))
.PostText = "transaction-data_1"
.Name = False
.FieldNames = False
.RefreshStyle = xlInsertDeleteCells
.RowNumbers = False
.FillAdjacentFormulas = False
.HasAutoFormat = False
.RefreshOnFileOpen = 2
.BackgroundQuery = False
.TablesOnlyFromHTML = True
.SaveData = True
.Refresh BackgroundQuery:=False
.UseListObject = False
End With
End Sub
Where googleSheetURL is replaced with the published link of the sheet.
I just keep getting errors, and debug mode highlights the .Refresh BackgroundQuery line. I disabled background refresh because I didn't want the queries to update once I pulled them. Does anyone have any insight?
This code doesn't compile. You're missing a With in front of Sheet1:
Sub addData()
newCell = Sheet1.Cells(Rows.Count, 1).End(xlUp).Offset(1, 0).Address
With Sheet1.QueryTables.Add(Connection:= _
"URL;googleSheetURL" _
, Destination:=Sheet1.Range(newCell))
.PostText = "transaction-data_1"
.Name = False
.FieldNames = False
.RefreshStyle = xlInsertDeleteCells
.RowNumbers = False
.FillAdjacentFormulas = False
.HasAutoFormat = False
.RefreshOnFileOpen = 2
.BackgroundQuery = False
.TablesOnlyFromHTML = True
.SaveData = True
.Refresh BackgroundQuery:=False
.UseListObject = False
End With
End Sub
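For the "static" and "schedule this to repeat" parts of the question, a rough sketch could combine Application.OnTime with deleting the query definitions after each pull. As far as I know, deleting a QueryTable removes only the query and leaves the imported cells in place, but treat that as an assumption to verify. ScheduleNextPull, DropQueries, and the one-hour interval are hypothetical choices.

```vba
' Hypothetical scheduler; "addData" is the macro name used above.
Sub ScheduleNextPull()
    ' Assumption: an hourly interval; adjust TimeValue as needed.
    Application.OnTime EarliestTime:=Now + TimeValue("01:00:00"), Procedure:="addData"
End Sub

' Removes the query definitions on Sheet1 so the pulled data is never refreshed again.
Sub DropQueries()
    Dim i As Long
    ' Loop backwards because deleting shifts the collection.
    For i = Sheet1.QueryTables.Count To 1 Step -1
        Sheet1.QueryTables(i).Delete
    Next i
End Sub
```

Calling DropQueries and then ScheduleNextPull at the end of addData would make each run leave plain values behind and queue the next one.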

VBA Excel: How to import CSV data into a range when the data array may differ

My macro imports CSV data saved from a DAQ app into range A2:AC12. The data "typically" includes 16 sets seen in the figure below from D2:D17. The set goes from 0,1,2,etc, and to the max of 15.
(FYI, I use A1 for fixed header info & E2:F17 is to translate the 0-15 set to 1-16 for graphing purposes).
However, the reason I say "typically" is that it's possible for the DAQ app user to choose a different set of runs or skip some runs! Here is an example where they skipped sets 0-4 and chose 5-15 only. If I imported that data with my macro as it currently is, this is what it would look like.
You can see here the problem where set 5 is at D2 (where set 0 should be). Additionally, a calculation array (N21:AC35) notices the empty spaces and results in DIV/0 error. Due to my limited knowledge of importing data, I use the most basic code for a simple import of the CSV into A2:
'Imports CSV Data
With ActiveSheet.QueryTables.Add(Connection:= _
"TEXT;" & myDirString, Destination:=Range("$A$2"))
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.TextFilePromptOnRefresh = False
.TextFilePlatform = 437
.TextFileStartRow = 1
.TextFileParseType = xlDelimited
.TextFileTextQualifier = xlTextQualifierDoubleQuote
.TextFileConsecutiveDelimiter = False
.TextFileTabDelimiter = False
.TextFileSemicolonDelimiter = False
.TextFileCommaDelimiter = True
.TextFileSpaceDelimiter = False
.TextFileColumnDataTypes = Array(1, 1, 1)
.TextFileTrailingMinusNumbers = True
.Refresh BackgroundQuery:=False
End With
So the main question is: how can I import the data (whether it is a 0-15 set, a 5-15 set, or a 2-4-6-8-etc. set) and still have it organized appropriately within those 0-15 rows? Should I rely on a FOR loop or IF statements somewhere? I'm perplexed about how to handle wide ranges of data like this and am only used to fixed data runs.
Ideally, I would expect it to look like this. I manually manipulated the data as an example, and I'm okay with just putting in "n/a" for the empty non-run spaces:
Reading your post and looking at the screenshots, it looks like you always expect something in the range A1:A17. If that's so, then you can run this after importing the data in your import macro:
For i = 2 To 17
If ActiveSheet.Range("d" & i) <> i - 2 Then
ActiveSheet.Range(i & ":" & i).Insert shift:=xlShiftDown
With ActiveSheet.Range("a" & i & ":ac" & i)
.Formula = "=na()"
.Value = .Value
End With
ActiveSheet.Range("d" & i).Value = i - 2
End If
Next i
'assuming your button is named 'Button 1'
'if you knew it was always the first shape made on the sheet then
'you could also access the button by index: Shapes(1)
With ActiveSheet.Shapes("Button 1")
.Left = ActiveSheet.Range("b18").Left
.Top = ActiveSheet.Range("b18").Top
End With
Edit
Added a with statement after the for loop to place the top left of your button at cell b18

using From Web Query in a loop

I am trying to pull data from this website: http://securities.stanford.edu/filings.html?page=1
Each "page" is a table with 21 items. There are 97 pages I would like to pull data from, but I am unable to automate it so that the macro cycles through all 97 and places the results every 21 rows, starting at cell A1 (sequence: A1, A22, A43, etc.).
This is what I've got, but I don't want to edit the code 97 times to get all the pages. Any idea how I could automate the task?
Sub Macro1()
' Macro1 Macro
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;http://securities.stanford.edu/filings.html?page=1", Destination:=Range( _
"A1"))
.Name = "filings.html?page=1"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlAllTables
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
End Sub
For x = 1 To 97
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;http://securities.stanford.edu/filings.html?page=" & x, Destination:=Range( _
"A" & (1 + ((x - 1) * 21))))
.Name = "filings.html?page=" & x
'keep the remaining property settings from the recorded macro here
.Refresh BackgroundQuery:=False
End With
Next x
x holds the page number; the (x - 1) in the cell arithmetic is there so that the first page lands on A1 rather than A22.
Alternatively, you could loop from 0 to 96, use "A" & (1 + (x * 21)) for the cell, and use x + 1 in the name and the query URL.
I would abandon the 'from Web Query' method and delve into some xmlHTTP. For the following, you will have to use the VBE's Tools ► References to add Microsoft HTML Object Library, Microsoft Internet Controls and Microsoft XML 6.0.
Option Explicit
Sub mcr_Collect_Filings()
Dim htmlBDY As HTMLDocument, xmlHTTP As New MSXML2.ServerXMLHTTP60
Dim rw As Long, pg As Long, iTH As Long, iTD As Long, iTR As Long
Dim eTBL As MSHTML.IHTMLElement
For pg = 1 To 99 '<-set to something reasonable; routine will kick out when it cannot find anything more
xmlHTTP.Open "GET", "http://securities.stanford.edu/filings.html?page=" & pg, False
xmlHTTP.setRequestHeader "Content-Type", "text/xml"
xmlHTTP.send
If xmlHTTP.Status <> "200" Then GoTo bm_CleanUp
Set htmlBDY = New HTMLDocument
htmlBDY.body.innerHTML = xmlHTTP.responseText
Set eTBL = htmlBDY.getElementById("records").getElementsByTagName("table")(0)
If eTBL Is Nothing Then GoTo bm_CleanUp
'skip the header row if on page 2 and above
With Sheet1 '<-worksheet codename
rw = .Cells(Rows.Count, 1).End(xlUp).Row
For iTR = (1 + (pg = 1)) To (eTBL.getElementsByTagName("tr").Length - 1)
For iTH = 0 To (eTBL.getElementsByTagName("tr")(iTR).getElementsByTagName("th").Length - 1)
.Cells(rw, 1).Offset(iTR, iTH) = _
eTBL.getElementsByTagName("tr")(iTR).getElementsByTagName("th")(iTH).innerText
Next iTH
For iTD = 0 To (eTBL.getElementsByTagName("tr")(iTR).getElementsByTagName("td").Length - 1)
.Cells(rw, 1).Offset(iTR, iTD) = _
eTBL.getElementsByTagName("tr")(iTR).getElementsByTagName("td")(iTD).innerText
Next iTD
Next iTR
End With
Next pg
bm_CleanUp:
Set eTBL = Nothing
Set htmlBDY = Nothing
Set xmlHTTP = Nothing
End Sub
The XMLHTTP request is invisible, so you have to know a little about the page and what to expect in the form of HTML code you are going to receive under different circumstances. A browser's Inspect Element command takes care of that.
This is by far the fastest method in VBA. While you actually have more than 99 rows to retrieve, this went through 99 pages in 56.3 seconds. You might even speed that up a bit by turning off screen updating.
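Turning off screen updating could be done with a small wrapper like the one below. It is a sketch only: RunCollectFilingsQuietly is a hypothetical name, it assumes mcr_Collect_Filings lives in the same project, and no timing gain has been measured for it.

```vba
Sub RunCollectFilingsQuietly()
    ' Suppress repainting while the cells are being filled.
    Application.ScreenUpdating = False
    On Error GoTo CleanUp

    mcr_Collect_Filings

CleanUp:
    ' Always restore the setting, whether or not the import failed.
    Application.ScreenUpdating = True
    If Err.Number <> 0 Then MsgBox "Import stopped: " & Err.Description
End Sub
```

Restoring ScreenUpdating under the CleanUp label means the setting is switched back on even if a request or a page lookup raises an error.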