Web Scraping Data Destination - vba

I am very new to using VBA in Excel. I have a list of hundreds of links that I want to scrape the data from (the links do not have nicely formatted tables, I have to scrape the raw data for what I need). I have a macro built that is working nicely, but the destination is not quite what I'm hoping for.
I want Excel to read each URL and then dump the data in the next column over. BUT I want each set of data to appear directly below the previous one, so that all of the data from all of the URLs ends up in the same column. Currently, my macro is putting the data from each URL into its own column.
Example:
My URLs are listed one per row in column A. The macro reads A1 and dumps the data in B1. Then it shifts that column to the right (so it is now C1) and dumps the data from A2 into (the new) B2.
What I want it to do is read A1, and put the data in B1. Then, read A2 and put the data in B30 (if B29 was the last used row for the data from A1).
Hopefully this makes sense.
Here is the macro I currently have working:
Sub WebScraping()
Dim Erw, Frw, Lrw
Frw = 1
Lrw = Range("A" & Rows.Count).End(xlUp).Row
For Erw = Frw To Lrw
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;" & Range("A" & Erw).Value, Destination:=Range("B" & Erw))
.Name = ""
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
Next Erw
End Sub
And here are some examples of the URLs I am scraping:
http://www.washington.edu/students/timeschd/AUT2018/arctic.html
http://www.washington.edu/students/timeschd/AUT2018/hnrs.html
Thanks for any suggestions!
So the yellow URL is the first in the list, and the red URL is the second in the list. The yellow column to the right is where it is placing the data from the first URL, and the red column to the right is where it is placing the data from the second URL. But I want it to first place the data from the first URL in column B, then place the data from the second URL directly below that.

You need to write each import to the next available row in column B (and optionally add some padding rows in between):
Option Explicit
Public Sub WebScraping()
Dim Erw As Long, Frw As Long, Lrw As Long, LrwB As Long
Frw = 1
Lrw = ActiveSheet.Range("A" & ActiveSheet.Rows.Count).End(xlUp).Row
For Erw = Frw To Lrw
LrwB = ActiveSheet.Range("B" & ActiveSheet.Rows.Count).End(xlUp).Row + 1
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;" & ActiveSheet.Range("A" & Erw).Value, Destination:=ActiveSheet.Range("B" & LrwB))
.Name = vbNullString
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
Next Erw
End Sub
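One detail of the loop above: when column B is completely empty, End(xlUp) lands on B1, so the "+ 1" sends the first import to B2 rather than B1. If that matters, a small helper along these lines could be used instead. This is only a sketch; the name NextFreeRow and its parameters are hypothetical and do not come from the original answer.

```vba
Option Explicit

' Hypothetical helper (not part of the original answer):
' returns the first empty row in the given column of ws,
' or 1 when the column contains no data at all.
Public Function NextFreeRow(ByVal ws As Worksheet, ByVal col As String) As Long
    Dim lastCell As Range

    ' Walk up from the bottom of the sheet to the last used cell in the column.
    Set lastCell = ws.Range(col & ws.Rows.Count).End(xlUp)

    If lastCell.Row = 1 And IsEmpty(lastCell.Value) Then
        ' Column is empty: start at the very first row.
        NextFreeRow = 1
    Else
        ' Otherwise write directly below the last used cell.
        NextFreeRow = lastCell.Row + 1
    End If
End Function
```

Inside the loop, the line that sets LrwB would then read LrwB = NextFreeRow(ActiveSheet, "B"); adding a constant to the result would leave padding rows between imports.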

Related

pasting scraped data from web to new worksheet and name the worksheet based on values in a column

I'm a novice in VBA. I was looking to export tables from HTML, and I have been successful in that.
My code will create a new worksheet to paste the tables from each URL. Currently the new worksheet is named in a sequence of 1,2,3... I want help to rewrite this part of the code.
I have two columns and multiple rows, 1st column is for the URL from where the data needs to be copied and the 2nd column contains worksheet name.
Example:
Column 1 :
URL;http://www.cricbuzz.com/cricket-series/2489/england-tour-of-india-2016-17/matches
Column 2:
Eng-India
When the data is copied from column 1, I would like the code to name the new worksheet based on the value in column 2.
Currently I use the below code:
Sub adds()
For x = 1 To 2
Worksheets("MATCH").Select
Worksheets("MATCH").Activate
mystr = "URL;http://www.cricbuzz.com/cricket-series/2489/england-tour-of-india-2016-17/matches"
mystr = Cells(x, 1)
Worksheets.Add(After:=Worksheets(Worksheets.Count)).Name = x
With ActiveSheet.QueryTables.Add(Connection:=mystr, Destination:=Range("$A$1"))
'CommandType = 0
.Name = "01000_1"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlSpecifiedTables
.WebFormatting = xlWebFormattingNone
.WebTables = "1,2,3,4,5,6,7,8,9,10"
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
Next x
End Sub
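One way the naming could be handled is to keep a reference to the newly added sheet and set its Name from column 2 of the same row. The sketch below assumes that column 1 of the "MATCH" sheet already holds the full connection string (including the "URL;" prefix, as in the example above) and that the names in column 2 are unique and free of the characters Excel forbids in sheet names (\ / ? * [ ] :). The procedure name AddNamedSheets is made up for illustration.

```vba
Option Explicit

' Sketch only: creates one sheet per row of "MATCH", names it from column 2,
' and imports the web tables from the connection string in column 1.
Public Sub AddNamedSheets()
    Dim src As Worksheet
    Dim ws As Worksheet
    Dim x As Long
    Dim lastRow As Long

    Set src = ThisWorkbook.Worksheets("MATCH")
    lastRow = src.Cells(src.Rows.Count, 1).End(xlUp).Row

    For x = 1 To lastRow
        ' Add the new sheet at the end and keep a reference to it.
        Set ws = ThisWorkbook.Worksheets.Add( _
            After:=ThisWorkbook.Worksheets(ThisWorkbook.Worksheets.Count))

        ' Sheet names are limited to 31 characters, so truncate to be safe.
        ws.Name = Left$(CStr(src.Cells(x, 2).Value), 31)

        With ws.QueryTables.Add( _
                Connection:=CStr(src.Cells(x, 1).Value), _
                Destination:=ws.Range("A1"))
            .WebSelectionType = xlSpecifiedTables
            .WebFormatting = xlWebFormattingNone
            .WebTables = "1,2,3,4,5,6,7,8,9,10"
            .Refresh BackgroundQuery:=False
        End With
    Next x
End Sub
```

Reading from src explicitly, rather than from whichever sheet happens to be active, also removes the need for the Select and Activate calls at the top of the loop.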

VBA: printing on the last row of excel

I am planning to extract data from a table on the web. The code below mostly comes from the built-in web query feature in Excel.
What I want is to find the last row with data and write the next import to the first blank row.
Sub Sub1()
ActiveSheet.Cells.Clear
Dim lastRow As Long
Dim i As Integer
For i = 1 To 2
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;http://www.aastocks.com/tc/stocks/quote/symbolsearch.aspx?page=" & i & " &order=symbol&seq=asc", Destination _
:=Range("A,lastRow")) '???? I got an error here, what I want is to detect the last row and print on the first blank row.???
'.CommandType = 0
.Name = "symbolsearch_1"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlSpecifiedTables
.WebFormatting = xlWebFormattingNone
.WebTables = "10"
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
lastRow = ActiveSheet.Cells(Rows.Count, "A").End(xlUp).Row + 1
Debug.Print lastRow
Next i
End Sub
This seems to work OK now. You were close, and you should be able to edit it as required.
Sub Sub1()
ActiveSheet.Cells.Clear
Dim lastRow As Long
Dim i As Integer
Dim sTicker As String
'set the first row to write to
lastRow = 2
For i = 1 To 2
sTicker = InputBox("Enter the code to search for")
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;http://www.aastocks.com/tc/stocks/quote/symbolsearch.aspx?page=" & sTicker & " &order=symbol&seq=asc", Destination _
:=Range("A" & lastRow)) 'uses the ticker value entered and fixed the destination part
'.CommandType = 0
.Name = "symbolsearch_1"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlSpecifiedTables
.WebFormatting = xlWebFormattingNone
.WebTables = "10"
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
lastRow = ActiveSheet.Cells(Rows.Count, "A").End(xlUp).Row + 1
Next i
End Sub

How to scrape data from Twitter from multiple URLs in a column of Excel

I basically need to fill two cells in each row with data from the URL in that same row. When the query is made using a macro, it returns 5 pieces of data, of which I need only two.
I recorded two macros (one to query the page and the other to adjust the data pieces I get), combined them, and wrapped them in a Do While loop.
But it is returning an error on this line: myurl = "URL;" & Tabsheet.Cells(i, 6)
Below is the code:
'
' Macro1 Macro
'
Sheets("Tabsheet").Select
Range("A1").Select
Dim i As Integer, myurl As String
i = 1
Do While i < 102
myurl = "URL;" & Tabsheet.Cells(i, 6)
With ActiveSheet.QueryTables.Add(Connection:= _
myurl, Destination:=ActiveCell.Offset(i, 8))
.Name = "Query" & i
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlSpecifiedTables
.WebFormatting = xlWebFormattingNone
.WebTables = "3"
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
Range("H105").Select
Selection.Cut
Range("I104").Select
Selection.Cut
Range("I103").Select
Selection.ClearContents
Range("H102").Select
i = i + 1
Loop
End Sub
Any suggestions will be extremely helpful.
This should work:
myurl = CStr("URL;" & Sheets("Tabsheet").Cells(i, 6))
You can't use TabSheet directly because you never set it as an object variable. You could have, if you had declared and set it first:
Dim TabSheet As Worksheet
Set TabSheet = ThisWorkbook.Sheets("TabSheet")
'------------------------------------
'--------Here comes your code--------
'------------------------------------
'And when it's done, don't forget to free what you have set
Set TabSheet = Nothing

Import data from web: repeat loop

I have this VBA script that works:
Sub Basic_Web_Query()
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;" & ActiveSheet.Range("A2").Value, Destination:=Range("E2"))
.Name = "q?s=goog_2"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlSpecifiedTables
.WebFormatting = xlWebFormattingNone
.WebTables = "2"
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
End Sub
I want the macro to do this in a loop, i.e. go to A3 with destination E3, A4 with destination E4, etc.
Any help will be appreciated.
Thanks!
I like to use Offset for these loops: Offset(rows down, columns right).
Negative values move up and left.
For example:
for p= 1 to 5
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;" & ActiveSheet.Range("A2").offset(p-1,0).Value, Destination:=Range("E2").offset(p-1,0))
'other code
End With
next p
Search for 'loop vba' and you will find plenty of examples. Here's one:
Dim rw As Long
For rw = 1 To 5
Range("E" & rw).Value = Range("A" & rw).Value
'other stuff
Next rw

Getting Worksheets renamed in excel after importing data from the web

I would like to download the daily prices of our stock exchange using the following VBA code. Although the code works, I can't seem to get the sheets renamed to the day for which each price list was obtained.
Dim DownloadDay As Date
DownloadDay = #3/4/2014#
Do While DownloadDay < #4/4/2014#
ActiveWorkbook.Worksheets.Add
Call website(Format(DownloadDay, "YYYYMMDD"))
'INCREMENT THE DAY
Sheets.Add.Name = "DownloadDay"
DownloadDay = DownloadDay + 1
Loop
End Sub
Sub website(sDate As String)
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;http://live.mystocks.co.ke/price_list/" & DownloadDay & "/", Destination:=Range("$A$1"))
.Name = DownloadDay
'To rename each work sheet with the corresponding day'
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlTables
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
End Sub
This line of VBA will set the name of your worksheet:
Sheets("Sheet2").Name = "NewName"
You cannot use the "/" character in a sheet name, though, and the variable holds a Date, not a String.
Try this:
.Name = CStr(Format(DownloadDay, "YYYYMMDD"))
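Putting those two points together, the loop might look something like the sketch below. It assumes the site expects the date in the same YYYYMMDD form that the question already passes to its website routine. The names DownloadPriceLists and ImportPriceList are hypothetical, and xlAllTables is used here as a stand-in for the question's selection type.

```vba
Option Explicit

' Sketch only: one new sheet per day, named after the day it covers.
Public Sub DownloadPriceLists()
    Dim downloadDay As Date
    Dim dayStamp As String
    Dim ws As Worksheet

    downloadDay = #3/4/2014#
    Do While downloadDay < #4/4/2014#
        ' Format the date once as a string; it contains no "/" characters,
        ' so it is safe to use both in the URL and as a sheet name.
        dayStamp = Format$(downloadDay, "yyyymmdd")

        Set ws = ActiveWorkbook.Worksheets.Add
        ws.Name = dayStamp

        ImportPriceList ws, dayStamp

        ' Move on to the next day.
        downloadDay = downloadDay + 1
    Loop
End Sub

' Imports the price list for one day into the given sheet.
Private Sub ImportPriceList(ByVal ws As Worksheet, ByVal dayStamp As String)
    With ws.QueryTables.Add( _
            Connection:="URL;http://live.mystocks.co.ke/price_list/" & dayStamp & "/", _
            Destination:=ws.Range("A1"))
        .WebSelectionType = xlAllTables
        .WebFormatting = xlWebFormattingNone
        .Refresh BackgroundQuery:=False
    End With
End Sub
```

Passing the worksheet and the formatted date as arguments avoids relying on ActiveSheet and on a variable that is only declared in the calling procedure.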