I have the following two pieces of code for extracting a large table from a web service, one for a URL connection:
With ActiveSheet.QueryTables.Add(Connection:="URL;" & URL, Destination:=Cells(1,1))
.PostText = ""
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = False
.RefreshStyle = xlOverwriteCells
.SavePassword = False
.AdjustColumnWidth = False
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = False
.WebSingleBlockTextImport = True
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
.WorkbookConnection.Delete
End With
And one for a TEXT connection:
With ActiveSheet.QueryTables.Add(Connection:="TEXT;" & URL, Destination:=Cells(1,1))
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.RefreshStyle = xlOverwriteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = False
.RefreshPeriod = 0
.TextFilePromptOnRefresh = False
.TextFilePlatform = 850
.TextFileStartRow = 1
.TextFileParseType = xlDelimited
.TextFileTextQualifier = xlTextQualifierDoubleQuote
.TextFileConsecutiveDelimiter = False
.TextFileTabDelimiter = True
.TextFileSemicolonDelimiter = False
.TextFileCommaDelimiter = False
.TextFileSpaceDelimiter = False
.TextFileColumnDataTypes = Array(1, 1, 1, 1)
.TextFileTrailingMinusNumbers = True
.Refresh BackgroundQuery:=False
.WorkbookConnection.Delete
End With
Because I sometimes need to send parameters to the web service (either by PostText or in a long URL) the URL connection is more suitable for my purposes. However, for the same data set from the same web service (no parameters in this case), the refresh consistently takes 21 seconds with the URL connection, and only 12 seconds with the TEXT connection.
Is there a reason why the TEXT connection is so much faster? And is there anything I can do about the relative slowness of the URL connection?
I want to scrape data from a website using Excel. My problem is that each extraction is inserted into a new sheet, whereas I want newly scraped data to be inserted below the previous data. I have a sheet that lists the links of all the web pages to scrape. My VBA code is:
Sub adds()
For x = 1 To 316
Worksheets("Sheet3").Select
Worksheets("Sheet3").Activate
mystr = " "
mystr = Cells(x, 1)
Worksheets.Add(After:=Worksheets(Worksheets.Count)).Name = x
With ActiveSheet.QueryTables.Add(Connection:=mystr, Destination:=Range("$A$1"))
'CommandType = 0
.Name = "?q=حنان&daleel=mtn&do=search&page=2"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
Next x
End Sub
Below is the piece of code I am using:
Set IE = CreateObject("InternetExplorer.Application")
With IE
.Visible = True
.Navigate "https:......"
Do Until .ReadyState = 4
DoEvents
Loop
.document.all.Item("sso_username").Value = "username"
.document.all.Item("ssopassword").Value = "password"
.document.forms(1).submit
End With
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;https:...." , Destination:=Range("A2"))
.Name = "q?s=goog_2"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlSpecifiedTables
.WebFormatting = xlWebFormattingNone
.WebTables = "1,2"
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
The error is shown for
.document.all.Item("sso_username").Value = "username"
Can anybody tell me what I am missing?
Does anyone know how to stop the query table from constantly refreshing, so that it refreshes only once? The constant refresh is making my Excel spreadsheet run slowly.
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;" & FilePath, _
Destination:=temp.Range("A1"))
.Name = "Deloitte_2013_08"
' .CommandType = 0
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
Change this line:
.BackgroundQuery = True
to:
.BackgroundQuery = False
Use Application.ScreenUpdating to wrap your code.
Application.ScreenUpdating = False
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;" & FilePath, _
Destination:=temp.Range("A1"))
.Name = "Deloitte_2013_08"
' .CommandType = 0
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
Application.ScreenUpdating = True
You might also be interested in setting Application.Calculation to xlCalculationManual before a large operation and then setting it to xlCalculationAutomatic after you are done.
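The two suggestions above can be combined in one place. The following is only a sketch: the procedure name, the example.com URL and the destination cell are placeholders rather than anything from the question. It saves the current calculation mode, suspends screen updating and automatic calculation, runs one synchronous refresh, and restores both settings even if the refresh raises an error.

```vba
Sub ImportWithUpdatesSuspended()
    ' Hypothetical example: the URL and destination below are placeholders.
    Dim prevCalc As XlCalculation
    Dim errNum As Long
    Dim errMsg As String

    ' Remember the user's calculation mode so it can be restored later.
    prevCalc = Application.Calculation
    Application.ScreenUpdating = False
    Application.Calculation = xlCalculationManual

    On Error GoTo CleanUp

    With ActiveSheet.QueryTables.Add( _
            Connection:="URL;http://example.com/report.htm", _
            Destination:=ActiveSheet.Range("A1"))
        .BackgroundQuery = False          ' refresh synchronously
        .RefreshPeriod = 0                ' no timed refresh
        .WebSelectionType = xlEntirePage
        .WebFormatting = xlWebFormattingNone
        .Refresh BackgroundQuery:=False
    End With

CleanUp:
    ' Capture any error before restoring the application settings.
    errNum = Err.Number
    errMsg = Err.Description
    Application.Calculation = prevCalc
    Application.ScreenUpdating = True
    If errNum <> 0 Then MsgBox "Import failed: " & errMsg
End Sub
```

Restoring the saved mode rather than forcing xlCalculationAutomatic avoids changing the setting for users who already work in manual calculation mode.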
I recorded a macro and edited it. It scrapes particular data from web pages (found via links) and displays it on separate sheets.
I want to use the data (VLOOKUP) but the data is on different pages which makes it hard to get an accurate formula.
Every week I change the second line of the code
For x = 1 To 20
to
For x = 21 to ....
for example, because new links/data come out every week.
How do I find the last line to add the next lot of data below that?
Sub Update()
For x = 1 To 20
Worksheets("Links").Select
Worksheets("Links").Activate
mystr = Cells(x, 8)
mystr2 = Cells(x, 15)
Worksheets.Add(After:=Worksheets(Worksheets.Count)).Name = x
With ActiveSheet.QueryTables.Add(Connection:=mystr, Destination:=Range("$K$1"))
.Name = "report2_1"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlAllTables
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
Range("A1").Select
With ActiveSheet.QueryTables.Add(Connection:=mystr2, Destination:=Range("$A$1"))
.Name = "report6_1"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlAllTables
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
Next x
End Sub
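Instead of
For x = 1 To 20
use
For x = Range("A" & Rows.Count).End(xlUp).Row To Range("A" & Rows.Count).End(xlUp).Row - 20 Step -1
This finds the last used cell in column A and iterates backwards from it to the row 20 rows above. Note that this span covers 21 rows (use - 19 for exactly 20), and that the links in the question are read from columns 8 and 15 rather than column A, so the column letter may need adjusting.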
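To put every import on one sheet below the previous data, the same End(xlUp) technique can be applied to the destination sheet. This is a rough sketch under several assumptions: the "Data" sheet and the FIRST_NEW_ROW constant are hypothetical, column 8 of "Links" is assumed to hold complete connection strings (for example "URL;http://...") as in the question, and column A of the imported data is assumed to be filled down to its last row.

```vba
Sub AppendNewLinks()
    ' Sketch only. "Links" and column 8 come from the question;
    ' the "Data" sheet and FIRST_NEW_ROW are hypothetical.
    Const FIRST_NEW_ROW As Long = 1
    Dim links As Worksheet
    Dim target As Worksheet
    Dim lastLink As Long
    Dim nextRow As Long
    Dim x As Long

    Set links = Worksheets("Links")
    Set target = Worksheets("Data")

    ' Last row of column 8 (H) that contains a link.
    lastLink = links.Cells(links.Rows.Count, 8).End(xlUp).Row

    For x = FIRST_NEW_ROW To lastLink
        ' First free row below the data already in column A of the target.
        nextRow = target.Cells(target.Rows.Count, 1).End(xlUp).Row + 1

        With target.QueryTables.Add( _
                Connection:=CStr(links.Cells(x, 8).Value), _
                Destination:=target.Cells(nextRow, 1))
            .WebSelectionType = xlAllTables
            .WebFormatting = xlWebFormattingNone
            .RefreshStyle = xlOverwriteCells
            .Refresh BackgroundQuery:=False
            .Delete   ' drop the query definition; the imported cells remain
        End With
    Next x
End Sub
```

On an empty target sheet the first import starts in row 2, because End(xlUp) returns row 1 when the column is blank. With all data on one sheet, a VLOOKUP can reference a single range.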
Using VBA, I need to extract data from webpage http://emops.tse.com.tw/t21/sii/t21sc03_2011_9_e.htm
I am able to fetch all the data using following code:
With ActiveSheet.QueryTables.Add(Connection:="URL;http://emops.tse.com.tw/t21/sii/t21sc03_2012_2_e.htm", Destination:=Range("$A$1"))
.Name = "67083361_zpid"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
But the problem is that I don't want data from the whole page. I want data only from the table where the industry name is Electron (it is the last table in this case).
Is there a way to do this?
Change:
.WebSelectionType = xlEntirePage to .WebSelectionType = xlSpecifiedTables
Add:
.WebTables = "2" below .WebFormatting = xlWebFormattingNone
You will have to use trial and error with the "2" to find the exact table you want to grab.
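Applied to the code from the question, the two changes might look like the sketch below. The procedure name is made up for illustration, only the properties relevant to table selection are kept, and the index "2" is still a guess that needs adjusting until the Electron table comes back.

```vba
Sub ImportOneTable()
    ' The table index "2" is a guess, as in the answer above;
    ' change it until the wanted table is returned.
    With ActiveSheet.QueryTables.Add( _
            Connection:="URL;http://emops.tse.com.tw/t21/sii/t21sc03_2012_2_e.htm", _
            Destination:=ActiveSheet.Range("$A$1"))
        .FieldNames = True
        .BackgroundQuery = False
        .RefreshStyle = xlInsertDeleteCells
        .WebSelectionType = xlSpecifiedTables   ' instead of xlEntirePage
        .WebFormatting = xlWebFormattingNone
        .WebTables = "2"                        ' comma-separated table indexes
        .Refresh BackgroundQuery:=False
    End With
End Sub
```

WebTables accepts a comma-delimited list, so "1,2" would import the first two tables, as in the earlier login example.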