Extract Data from a Web Page - using VBA

Using VBA, I need to extract data from the web page http://emops.tse.com.tw/t21/sii/t21sc03_2011_9_e.htm
I am able to fetch all the data using the following code:
With ActiveSheet.QueryTables.Add(Connection:="URL;http://emops.tse.com.tw/t21/sii/t21sc03_2012_2_e.htm", Destination:=Range("$A$1"))
.Name = "67083361_zpid"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
But the problem is that I don't want the data from the whole page. I only want the table where the industry name is Electron (the last table in this case).
Is there a way to do that?

Change:
.WebSelectionType = xlEntirePage to .WebSelectionType = xlSpecifiedTables
Add:
.WebTables = "2" below .WebFormatting = xlWebFormattingNone
'You will have to use trial and error with the "2" to find the exact table you are wanting to grab
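Put together, the change might look like the sketch below. The names ImportOneTable, TryTableTwo and tableIndex are made up for illustration; the URL is the one from the question's code, and the right index still has to be found by trial and error.

```vba
' Hypothetical helper (not from the original post): imports one table
' from the page into the active sheet.
' tableIndex is the 1-based position of the table on the page, as text.
Sub ImportOneTable(ByVal tableIndex As String)
    With ActiveSheet.QueryTables.Add( _
            Connection:="URL;http://emops.tse.com.tw/t21/sii/t21sc03_2012_2_e.htm", _
            Destination:=ActiveSheet.Range("$A$1"))
        .WebSelectionType = xlSpecifiedTables   ' instead of xlEntirePage
        .WebFormatting = xlWebFormattingNone
        .WebTables = tableIndex                 ' e.g. "2"; adjust until the right table appears
        .Refresh BackgroundQuery:=False         ' wait for the import to finish
    End With
End Sub

' Example call: try table 2 first, then change the number if it is the wrong one.
Sub TryTableTwo()
    ImportOneTable "2"
End Sub
```

Running it on an empty sheet each time makes it easier to see which table a given index pulls in.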

Related

Run-time error 91 while getting data from password protected site using vba

Below is the piece of code I am using:
Set IE = CreateObject("InternetExplorer.Application")
With IE
.Visible = True
.Navigate "https:......"
Do Until .ReadyState = 4
DoEvents
Loop
.document.all.Item("sso_username").Value = "username"
.document.all.Item("ssopassword").Value = "password"
.document.forms(1).submit
End With
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;https:...." , Destination:=Range("A2"))
.Name = "q?s=goog_2"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlSpecifiedTables
.WebFormatting = xlWebFormattingNone
.WebTables = "1,2"
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
The error is raised on this line:
.document.all.Item("sso_username").Value = "username"
Can anybody tell me what I am missing?

Extracting tabular data from multiple web pages into Excel using VBA macros

I want to extract tabular data from multiple web pages into Excel using VBA macros. Currently I am using the code below, but it only handles the one web page hard-coded in it. I have a list of URLs to get data from, and the results need to be stacked vertically. Please suggest an approach.
Sub INDEXdata()
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;http://recorder.maricopa.gov/recdocdata/GetRecDataDetail.aspx?rec=19770000007" _
, Destination:=Range("$A$1"))
.Name = "rec=19770000006"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlSpecifiedTables
.WebFormatting = xlWebFormattingNone
.WebTables = "2,3"
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
End Sub
OK, I don't know how much programming background you have, and I also don't know what parts of the code you posted are specific to that one source url and target location.
But, something like this might work. I made the assumption that the url, the destination, and the name would change for each page you wanted to pull from.
What I did was take the part of the code that looked like it would be true for all of the source pages and destinations, and put that in its own parameterized subroutine. The original routine IndexData just specifies the URL and the destination, and the name, for each copy operation.
Sub IndexData()
' One call per page. A Sub called as a statement takes its arguments without parentheses.
GetData "http://recorder.maricopa.gov/recdocdata/GetRecDataDetail.aspx?rec=19770000007", _
"$A$1", _
"rec=19770000006"
GetData "http://somewhereelse.com/somedata.aspx?rec=12345", _
"$A$2", _
"rec=12345"
GetData "http://anotherurl.com/etc", _
"$A$3", _
"something"
End Sub
Sub GetData(url As String, destination As String, name As String)
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;" & url , Destination:=Range(destination))
.Name = name
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlSpecifiedTables
.WebFormatting = xlWebFormattingNone
.WebTables = "2,3"
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
End Sub
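If the URLs are kept in a worksheet rather than typed into the code, the same idea can be driven by a loop. The following is only a sketch under assumptions that are not in the original post: the URLs sit in column A of a sheet called "Urls", the results go to a sheet called "Data", and the sub is named GetAllPages. Each import starts on the first free row below the previous one, so the results stack vertically.

```vba
' Sketch only: the sheet names "Urls" and "Data" and the sub name are assumptions.
Sub GetAllPages()
    Dim urlSheet As Worksheet, dataSheet As Worksheet
    Dim lastUrlRow As Long, nextRow As Long, i As Long

    Set urlSheet = ThisWorkbook.Worksheets("Urls")
    Set dataSheet = ThisWorkbook.Worksheets("Data")

    ' Last row in column A of the URL list
    lastUrlRow = urlSheet.Cells(urlSheet.Rows.Count, "A").End(xlUp).Row

    For i = 1 To lastUrlRow
        ' First free row below whatever has been imported so far
        If Application.WorksheetFunction.CountA(dataSheet.Cells) = 0 Then
            nextRow = 1
        Else
            nextRow = dataSheet.UsedRange.Row + dataSheet.UsedRange.Rows.Count
        End If

        With dataSheet.QueryTables.Add( _
                Connection:="URL;" & urlSheet.Cells(i, "A").Value, _
                Destination:=dataSheet.Cells(nextRow, "A"))
            .WebSelectionType = xlSpecifiedTables
            .WebFormatting = xlWebFormattingNone
            .WebTables = "2,3"                  ' same tables as in the question
            .Refresh BackgroundQuery:=False     ' finish this page before starting the next
        End With
    Next i
End Sub
```

The synchronous refresh matters here: the next free row can only be worked out once the previous import has actually landed on the sheet.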

VBA Web page scrape finishes before page loads

I'm scraping btc-e.com from VBA (see code below) to fetch the prices of some cryptocurrencies. When I do it manually, by going to the Data tab and clicking From Web, it works fine, but when I do it in the macro I only get back "please wait..."
The page displays "please wait..." while it loads, and the macro assumes that is the entire page.
I have been looking for a way to make the macro wait for the full page load and can't find anything.
Any help would be appreciated.
Thanks
With ActiveSheet.QueryTables.Add(connection:="URL;https://btc-e.com", _
Destination:=Range("$A$1"))
.Name = "btc-e"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = False ' was true
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False ' True ' was false
.WebDisableRedirections = False 'True ' was false
.Refresh BackgroundQuery:=False
End With
You have to choose a specific table or area of the page you want scraped, otherwise it won't work. The reason is that you are automatically forwarded from the page you are trying to scrape to the page you can actually scrape.
When I chose to scrape the Sell Orders, this is the code I got from the macro recorder:
With ActiveSheet.QueryTables.Add(Connection:="URL;https://btc-e.com", _
Destination:=Range("$B$2"))
.Name = "btc-e"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlSpecifiedTables
.WebFormatting = xlWebFormattingNone
.WebTables = "3"
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
As you can see it takes the attribute ".WebTables" which chooses a specific portion of the site. You can choose the portion you want by activating the macro recorder, scraping it through the normal way, choosing the area you want and then looking at the value of WebTables in the resulting code.
Hope this helps!

How to NOT refresh backgroundQuery vba

Does anyone know how to stop the query table from constantly refreshing, so that it refreshes only once? The constant refresh is making my Excel spreadsheet run slowly.
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;" & FilePath, _
Destination:=temp.Range("A1"))
.Name = "Deloitte_2013_08"
' .CommandType = 0
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
Change this line:
.BackgroundQuery = True
to:
.BackgroundQuery = False
Use Application.ScreenUpdating to wrap your code.
Application.ScreenUpdating = False
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;" & FilePath, _
Destination:=temp.Range("A1"))
.Name = "Deloitte_2013_08"
' .CommandType = 0
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
Application.ScreenUpdating = True
You might also be interested in setting Application.Calculation to xlCalculationManual before a large operation and then setting it to xlCalculationAutomatic after you are done.
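A rough sketch of that pattern follows. RunImport is a hypothetical placeholder for the query-table code, and the sub names are made up. This version restores whatever calculation mode was active beforehand instead of always switching back to xlCalculationAutomatic, which is a small departure from the suggestion above.

```vba
' Sketch: wraps a slow operation with screen updating and recalculation turned off.
Sub ImportWithoutRecalc()
    Dim previousMode As XlCalculation
    previousMode = Application.Calculation      ' remember the user's setting

    On Error GoTo CleanUp
    Application.ScreenUpdating = False
    Application.Calculation = xlCalculationManual

    RunImport                                   ' the large operation

CleanUp:
    ' Runs on success and on error, so the settings are always restored
    Application.Calculation = previousMode
    Application.ScreenUpdating = True
    If Err.Number <> 0 Then MsgBox "Import failed: " & Err.Description
End Sub

' Hypothetical placeholder: put the QueryTables.Add ... End With block here.
Sub RunImport()
End Sub
```

Restoring the settings in one place means a failed refresh does not leave the workbook stuck in manual calculation.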

How to add data below existing data on sheet?

I recorded a macro and edited it. It scrapes particular data from web pages (found via links) and displays it on separate sheets.
I want to use the data (with VLOOKUP), but it is spread across different sheets, which makes it hard to write an accurate formula.
Every week I change the second line of the code
For x = 1 To 20
to
For x = 21 to ....
for example, because new links/data come out every week.
How do I find the last line to add the next lot of data below that?
Sub Update()
For x = 1 To 20
Worksheets("Links").Select
Worksheets("Links").Activate
mystr = Cells(x, 8)
mystr2 = Cells(x, 15)
Worksheets.Add(After:=Worksheets(Worksheets.Count)).Name = x
With ActiveSheet.QueryTables.Add(Connection:=mystr, Destination:=Range("$K$1"))
.Name = "report2_1"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlAllTables
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
Range("A1").Select
With ActiveSheet.QueryTables.Add(Connection:=mystr2, Destination:=Range("$A$1"))
.Name = "report6_1"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlAllTables
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
Next x
End Sub
Instead of
For x = 1 To 20
use
For x = Range("A" & Rows.Count).End(xlUp).Row To Range("A" & Rows.Count).End(xlUp).Row - 20 Step -1
which finds the last used cell in column A and iterates backwards from it to the row 20 rows above. Both bounds are inclusive, so this covers 21 rows; use - 19 if you want exactly 20.
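For the question's own layout, where the links sit in column H of the "Links" sheet (Cells(x, 8) in the code above), the last-row lookup could be sketched like this. The sub name and the batch size of 20 are assumptions for illustration, and the loop body only prints the row numbers it would process.

```vba
' Sketch: find the last used row in column H of "Links" and walk the newest 20 rows.
Sub ShowNewestLinkRows()
    Dim ws As Worksheet
    Dim lastRow As Long, firstRow As Long, x As Long

    Set ws = ThisWorkbook.Worksheets("Links")

    ' Jump up from the bottom of column H to the last non-empty cell
    lastRow = ws.Cells(ws.Rows.Count, "H").End(xlUp).Row

    ' Newest 20 rows, without going above row 1
    firstRow = Application.Max(1, lastRow - 19)

    For x = firstRow To lastRow
        Debug.Print "Would process row " & x & ": " & ws.Cells(x, 8).Value
    Next x
End Sub
```

Qualifying the cells with the worksheet avoids depending on whichever sheet happens to be active when the macro runs.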