I am scraping data from a webpage. Once on the last page, how do I get the macro to end?
For instance, if there are 4 pages of data, how do I stop and display the data scrape through the 4 pages?
There are 4 pages of data. I receive the 1004 run time error, if in the code I say
Do While i < 5
...
'Macro to query Delinquency Status Search for DFB Counties
'Run Monday to pull data from Friday
Sub queryActivityDailyMforF()
Dim nextrow As Integer, i As Long
Dim dates
dates = Date - 3
Application.ScreenUpdating = False
Application.DisplayStatusBar = True
Do While i < 4
Application.StatusBar = "Processing Page " & i
nextrow = ActiveSheet.Cells(Rows.Count, "A").End(xlUp).Row + 1
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;https://www.myfloridalicense.com/delinquency_results.asp?SID=&page=" & i & "&county_1=16&county_1=21&county_1=23&county_1=32&county_1=36&county_1=41&county_1=46&county_1=53&county_1=54&county_1=57&county_1=60&county_1=66&status=NS&send_date=" & dates & "&search_1.x=1", _
Destination:=Range("A" & nextrow))
'.Name = _
"2015&search_1.x=40&search_1.y=11&date=on&county_1=AL&lic_num_del=&lic_num_rep=&status=NS&biz_name=&owner_name="
.FieldNames = False
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlSpecifiedTables
.WebFormatting = xlWebFormattingNone
.WebTables = "10"
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
'autofit columns
Columns("A:G").Select
Selection.EntireColumn.AutoFit
'check for filter, if not then turn on filter
ActiveSheet.AutoFilterMode = False
If Not ActiveSheet.AutoFilterMode Then
ActiveSheet.Range("D2").AutoFilter
End If
i = i + 1
End With
Loop
Application.StatusBar = False
'Align text left
Cells.Select
With Selection
.HorizontalAlignment = xlLeft
.VerticalAlignment = xlBottom
.WrapText = False
.Orientation = 0
.AddIndent = False
.IndentLevel = 0
.ShrinkToFit = False
.ReadingOrder = xlContext
.MergeCells = False
End With
If i = 0 Then Exit Sub
'Loop
End Sub
I'm sure there is a better answer involving querying the table or getting an xml version of the results from the webmaster of the site, and THIS IS NOT BEST PRACTICE but you could use the error handler. It should look like this, generically:
Sub Stuff()
Do While True
On Error Goto ErrHndl
ErrorProneCode
Loop
LoopExit:
On Error Resume 0
Non_ErrorProneCode
Exit Sub
ErrHndl:
If Err.Number = 1004 Then
Resume LoopExit
Else
Err.Raise Err.Number, Err.Description, Err.Source
End If
End Sub
In the sample above, if an error is throne by the ErrorProneCode sub, the script will jump to the ErrHndl: label. There we check for the 1004 error and if that's the error we find, we resume at the LoopExit: label, and continue on. It is advised that you disable your error handler by calling On Error Resume 0 once you are executing code which you do not expect to break.
Related
I am very new to using VBA in Excel. I have a list of hundreds of links that I want to scrape the data from (the links do not have nicely formatted tables, I have to scrape the raw data for what I need). I have a macro built that is working nicely, but the destination is not quite what I'm hoping for.
I want excel to read each url, and then dump the data in the next column over. BUT I want each set of data to appear directly below the previous. So I want all of the data from all of the URLs in the same column. Currently, my macro is putting the data from each URL into its own column.
Example:
My URLs are listed in each row separately in column A. The macro reads A1 and dumps the in Data B1. Then, it shifts that column to the right a bit (so it is now C1), and dumps the data from A2 into (the new) B2.
What I want it to do is read A1, and put the data in B1. Then, read A2 and put the data in B30 (if B29 was the last used row for the data from A1).
Hopefully this makes sense.
Here is the macro I currently have working:
Sub WebScraping()
Dim Erw, Frw, Lrw
Frw = 1
Lrw = Range("A" & Rows.Count).End(xlUp).Row
For Erw = Frw To Lrw
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;" & Range("A" & Erw).Value, Destination:=Range("B" & Erw))
.Name = ""
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
Next Erw
End Sub
And here are some examples of the URLs I am scraping:
http://www.washington.edu/students/timeschd/AUT2018/arctic.html
http://www.washington.edu/students/timeschd/AUT2018/hnrs.html
Thanks for any suggestions!
So the yellow URL is the first in the list, and the red URL is the second in the list. The yellow column to the right is where it is placing the data from the first URL, and the red column to the right is where it is placing the data from the second URL. But I want to it to first place the data from the first URL in Column B, then place the data from the second URL directly below that.
You need to write to the next available row in B (Or add some padding rows in between as well)
Option Explicit
Public Sub WebScraping()
Dim Erw As Long, Frw As Long, Lrw As Long, LrwB As Long
Frw = 1
Lrw = ActiveSheet.Range("A" & ActiveSheet.Rows.Count).End(xlUp).Row
For Erw = Frw To Lrw
LrwB = ActiveSheet.Range("B" & ActiveSheet.Rows.Count).End(xlUp).Row + 1
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;" & ActiveSheet.Range("A" & Erw).Value, Destination:=ActiveSheet.Range("B" & LrwB))
.Name = vbNullString
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
Next Erw
End Sub
There is a website, that can create thousands of .csv files that contain tables. The CSV files are based on the information the user asks.
I created an excel file with VBA script. The user enters the data to the excel file, then the VBA script generates the correct URL, and tries to get the required data from the .csv in that URL.
In my excel file, the user can ask for hundreds of .csv tables, and I want the user to be able to enter the hundreds of information kinds he wants, then run the VBA script and leave the computer to work on it.
I first do URL check, and if it is ok, I try to get the data in the .csv file in that URL.
most of the times, it works completely fine. Works fine for a case when HttpExists returns TRUE, and also works fine for a case that HttpExists returns FALSE (it just skips the current active cell and goes to the next one).
But there are a few times, that the URL check answers that the URL is fine (HttpExists returns TRUE), but when it tried to get the data, it opens a message box that says "sorry, we couldn't open 'url address' ". (message box for Run Time Error 1004) and then the VBA scripts terminates.
I would like to know how can I fix it. How can I just skip the current URL in case of error, instead of showing a message box that terminates the script run?
Sub my_method()
On Error GoTo handleCancel
Dim errorFlag As Boolean
.......
Do Until ActiveCell.Value = ""
errorFlag = True
URLstring= ....
........
If Not HttpExists(URLstring) Then
symbolStatus = "Data unavailable"
logAddress = updateLog("invalid URL " & ActiveCell.Value, logAddress, debugString)
Application.DisplayAlerts = False
Sheets(currentSymbol).Delete
Application.DisplayAlerts = True
Else
With Sheets(currentSymbol).QueryTables.Add(Connection:= _
"TEXT;" & URLstring _
, Destination:=Sheets(currentSymbol).Range(dataAddress))
.Name = ""
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.RefreshStyle = xlOverwriteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.TextFilePromptOnRefresh = False
.TextFilePlatform = 850
.TextFileStartRow = 2
.TextFileParseType = xlDelimited
.TextFileTextQualifier = xlTextQualifierDoubleQuote
.TextFileConsecutiveDelimiter = False
.TextFileTabDelimiter = False
.TextFileSemicolonDelimiter = False
.TextFileCommaDelimiter = True
.TextFileSpaceDelimiter = False
.TextFileColumnDataTypes = Array(2, 2, 2, 2, 2, 2, 9)
.TextFileTrailingMinusNumbers = True
.Refresh BackgroundQuery:=False
End With
.......
errorFlag = False
handleCancel:
ActiveCell.Offset(1, 0).Select
If errorFlag = True Then
symbolStatus = "Data unavailable"
logAddress = updateLog("invalid URL " & ActiveCell.Value,
logAddress, debugString)
Application.DisplayAlerts = False
Sheets(currentSymbol).Delete
Application.DisplayAlerts = True
End If
Loop
End Sub
Function HttpExists(sURL As String) As Boolean
Dim oXHTTP As Object
Set oXHTTP = CreateObject("MSXML2.ServerXMLHTTP")
If Not UCase(sURL) Like "HTTP:*" Then
sURL = "http://" & sURL
End If
On Error GoTo haveError
oXHTTP.Open "HEAD", sURL, False
oXHTTP.send
HttpExists = IIf(oXHTTP.status = 200, True, False)
Exit Function
haveError:
HttpExists = False
End Function
It sometimes goes out with a messagebox of Run Time Error 1004, and it happens in the line of:
With Sheets(currentSymbol).QueryTables.Add(Connection:= _
"TEXT;" & URL _
I would like it just to skip the current cell in a case of error, and go on with the next cell, without any messagebox and without crashing.
How can I fix it?
Thanks
See if this error handling structure works better. I eliminated parts that are unnecessary and adjusted to what should work, but I am not sure what code is in the ..... sections. Anyway, this should at least give you a general understanding. I commented a few things to explain more clearly in code.
Option Explicit
Sub my_method()
Do Until ActiveCell.Value = ""
'URLstring= ....
If Not HttpExists(URLstring) Then
LogError 'create sub since you do same thing twice
Else
On Error GoTo handleBadURL 'now is only time you need to move to actual error handling
With Sheets(currentSymbol).QueryTables.Add(Connection:= _
"TEXT;" & URLstring _
, Destination:=Sheets(currentSymbol).Range(dataAddress))
.Name = ""
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.RefreshStyle = xlOverwriteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.TextFilePromptOnRefresh = False
.TextFilePlatform = 850
.TextFileStartRow = 2
.TextFileParseType = xlDelimited
.TextFileTextQualifier = xlTextQualifierDoubleQuote
.TextFileConsecutiveDelimiter = False
.TextFileTabDelimiter = False
.TextFileSemicolonDelimiter = False
.TextFileCommaDelimiter = True
.TextFileSpaceDelimiter = False
.TextFileColumnDataTypes = Array(2, 2, 2, 2, 2, 2, 9)
.TextFileTrailingMinusNumbers = True
.Refresh BackgroundQuery:=False
End With
On Error Go To 0 'reset error handling (doesn't matter so much here, but good practice to always reset when not needed
End If
ActiveCell.Offset(1, 0).Select
Loop
Exit Sub 'leave sub when all is done (so it doesn't move to error handling code below
handleBadURL:
LogError 'created sub since you do same thing twice
Resume Next 'this statement will allow code to continue from point of error onward (the loop will keep going
End Sub
Sub LogError()
symbolStatus = "Data unavailable"
logAddress = updateLog("invalid URL " & ActiveCell.Value, logAddress, debugString)
Application.DisplayAlerts = False
Sheets(currentSymbol).Delete
Application.DisplayAlerts = True
End Sub
You need to add error handling to your code. Server timeout notices doesn't reflect an issue with your coding, but an issue with the server (which is out of your control, unless of course, you entered an incorrect URL).
In your code, you need to place On Error GoTo ErrHandler, make sure you have the error number, and since you are wanting to just resume to the next cell you can do something like this:
Sub Test()
On Error GoTo ErrHandler
'Your code
Exit Sub
ErrHandler:
If Err.Number = 123456 Then
'Get the code ready for the next cell, if necessary
Resume Next
Else
'Other Errs
End If
End Sub
I am planning to extract data from a table on the web. The code below was mostly from built-in web scraping function in Excel.
What I want is to find the last row with data and print on the first blank row.
Sub Sub1()
ActiveSheet.Cells.Clear
Dim lastRow As Long
Dim i As Integer
For i = 1 To 2
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;http://www.aastocks.com/tc/stocks/quote/symbolsearch.aspx?page=" & i & " &order=symbol&seq=asc", Destination _
:=Range("A,lastRow")) '???? I got an error here, what I want is to detect the last row and print on the first blank row.???
'.CommandType = 0
.Name = "symbolsearch_1"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlSpecifiedTables
.WebFormatting = xlWebFormattingNone
.WebTables = "10"
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
lastRow = ActiveSheet.Cells(Rows.Count, "A").End(xlUp).Row + 1
Debug.Print lastRow
Next i
End Sub
This seems to work ok now. You were close, you should be able to edit as required.
Sub Sub1()
ActiveSheet.Cells.Clear
Dim lastRow As Long
Dim i As Integer
Dim sTicker As String
'set the first row to write to
lastRow = 2
For i = 1 To 2
sTicker = InputBox("Enter the code to search for")
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;http://www.aastocks.com/tc/stocks/quote/symbolsearch.aspx?page=" & sTicker & " &order=symbol&seq=asc", Destination _
:=Range("A" & lastRow)) 'uses the ticker value entered and fixed the destination part
'.CommandType = 0
.Name = "symbolsearch_1"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlSpecifiedTables
.WebFormatting = xlWebFormattingNone
.WebTables = "10"
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
lastRow = ActiveSheet.Cells(Rows.Count, "A").End(xlUp).Row + 1
Next i
End Sub
I need to basically fill two cells in each row with data from the respective URL present in the same row. When the query is made using macro it returns 5 pieces of data out of which I need only two.
I recorded two macros(one to query the page and other to adjust the data pieces i get) and combined them and applied a do while loop.
But it is returning an error in this line - myurl = "URL;" & Tabsheet.Cells(i, 6)
Below is the code:
'
' Macro1 Macro
'
Sheets("Tabsheet").Select
Range("A1").Select
Dim i As Integer, myurl As String
i = 1
Do While i < 102
myurl = "URL;" & Tabsheet.Cells(i, 6)
With ActiveSheet.QueryTables.Add(Connection:= _
myurl, Destination:=ActiveCell.Offset(i, 8))
.Name = "Query" & i
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlSpecifiedTables
.WebFormatting = xlWebFormattingNone
.WebTables = "3"
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
Range("H105").Select
Selection.Cut
Range("I104").Select
Selection.Cut
Range("I103").Select
Selection.ClearContents
Range("H102").Select
i = i + 1
Loop
End Sub
Any suggestions will be extremely helpful.
This should work :
myurl = CStr("URL;" & Sheets("Tabsheet").Cells(i, 6))
You can use TabSheet directly as you didn't set it as an object, you could have if you did that before :
Dim TabSheet As Worksheet
Set TabSheet = ThisWorkbook.Sheets("TabSheet")
'------------------------------------
'--------Here come your code---------
'------------------------------------
'And when it's done, don't forget to free what you have set
Set TabSheet = Nothing
I have this VBA script that works:
Sub Basic_Web_Query()
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;" & ActiveSheet.Range("A2").Value, Destination:=Range("E2"))
.Name = "q?s=goog_2"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlSpecifiedTables
.WebFormatting = xlWebFormattingNone
.WebTables = "2"
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
End Sub
I want the macro to do this in a loop, i.e. go to A3 with destination E3, A4 with destination A4... etc.
any help will be appreciated
thansk!
I like to use 'offset' for these loops: OFFSET(rows down, columns right).
Negative values move up and left.
for example:
for p= 1 to 5
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;" & ActiveSheet.Range("A2").offset(p-1,0).Value, Destination:=Range("E2").offset(p-1,0))
'other code
End With
next p
google 'loop vba' and you will hit plenty of examples. Here's one:
Dim rw As Long
For rw = 1 To 5
Range("E" & rw).Value = Range("A" & rw).Value
'other stuff
Next rw