VBA web scraping: can't access table on web page - vba

I tried to scrape the price table on this website, https://www.energylive.cloud/, the same way I have done on other sites, but I can't get it to work (I don't have much experience with scraping). Thanks in advance!
Sub ej()
Dim XMLrequest As New MSXML2.XMLHTTP60
Dim HTMLdoc As New MSHTML.HTMLDocument
Dim HTMLtable As MSHTML.IHTMLElement
'Dim HTMLi As MSHTML.IHTMLElementCollection
Dim url As String
url = "https://www.energylive.cloud/"
XMLrequest.Open "GET", url, False
XMLrequest.send
If XMLrequest.Status <> 200 Then
MsgBox XMLrequest.Status & XMLrequest.statusText
End If
HTMLdoc.body.innerHTML = XMLrequest.responseText
'debug.print htmldoc.body.innerText 'I checked here but the table is not here
Set HTMLtable = HTMLdoc.getElementById("price_table")
'Debug.Print HTMLtable.ID
End Sub

The content you are looking for is not in that page's source; it is added dynamically. The data comes from a separate endpoint (the URL used in the script below), which returns static content that you can fetch with an XHR request. To find that URL, use Chrome dev tools or something similar: open dev tools, select the Network tab, reload the page, and watch the requests listed under All or XHR, where the link should appear.
VBA has no built-in JSON parser, so pulling fields out of a JSON response is not straightforward. The more common approach is to use a third-party JSON converter.
However, I've used regex here, which seems to grab the data correctly. When you run the script, you should get all the tabular content almost instantly.
Sub FetchTabularData()
Const mainUrl$ = "https://www.energylive.cloud/pwr-hour/get-index-averages?callback=%3F"
Dim I&, S$, Elem As Object, subElemName As Object
Dim subElemChange As Object, subElemPrice As Object
Dim subElemMtd As Object, subElemYtd As Object, R As Long: R = 1
Dim ws As Worksheet: Set ws = ThisWorkbook.Worksheets("Sheet1")
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", mainUrl, False
.send
S = .responseText
End With
ws.Range("A1:E1") = [{"Index","Value","Changes","Month To Date","Year To Date"}]
With CreateObject("VBScript.RegExp")
.Global = True
.MultiLine = True
.Pattern = "\[?{[\s\S]+?\},?"
Set Elem = .Execute(S)
For I = 0 To Elem.count - 1
.Pattern = "Index""\:""(.*?)"","
Set subElemName = .Execute(Elem(I))
.Pattern = "Value""\:""(.*?)\"","
Set subElemPrice = .Execute(Elem(I))
.Pattern = "Perc""\:""(.*?)"","
Set subElemChange = .Execute(Elem(I))
.Pattern = "Month-to-date""\:""(.*?)"","
Set subElemMtd = .Execute(Elem(I))
.Pattern = "Year-to-date""\:""(.*?)"""
Set subElemYtd = .Execute(Elem(I))
R = R + 1: ws.Cells(R, 1) = subElemName(0).submatches(0)
ws.Cells(R, 2) = subElemPrice(0).submatches(0)
ws.Cells(R, 3) = subElemChange(0).submatches(0) & "%"
ws.Cells(R, 4) = subElemMtd(0).submatches(0)
ws.Cells(R, 5) = subElemYtd(0).submatches(0)
Next I
End With
End Sub
P.S. You don't need to add any library references to run the above script. Just make sure your Excel workbook has a sheet named Sheet1.
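The regex approach above boils down to pulling one quoted value per key out of each JSON object. Below is a minimal sketch of that single step as a reusable helper. The function name `JsonField` and the sample string are illustrative assumptions, not the site's actual payload, and the helper only handles flat string values without escaped quotes or regex metacharacters in the key.

```vba
Option Explicit

' Hypothetical helper: returns the string value stored under `key`
' in a flat JSON object, or "" when the key is not found.
Public Function JsonField(ByVal json As String, ByVal key As String) As String
    With CreateObject("VBScript.RegExp")
        ' Regex seen by the engine: "key"\s*:\s*"([^"]*)"
        .Pattern = """" & key & """\s*:\s*""([^""]*)"""
        If .Test(json) Then JsonField = .Execute(json)(0).SubMatches(0)
    End With
End Function

Public Sub JsonFieldDemo()
    ' Made-up sample shaped like one element of a JSON array
    Const sample As String = "{""Index"":""ABC"",""Value"":""12.5"",""Perc"":""-0.4""}"
    Debug.Print JsonField(sample, "Index")   ' prints ABC
    Debug.Print JsonField(sample, "Perc")    ' prints -0.4
End Sub
```

Using `[^"]*` instead of `(.*?)` followed by a trailing comma means the last field of an object, which has no comma after it, needs no separate pattern.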

Related

How can I web scrape data outside of div tags

I'm having trouble scraping a value from https://next-episode.net/star-trek-picard. I'm looking to pull the season number from the element with the ID "previous_episode". The number seems to sit between two tags and I can't find a way to reference it. Previously I pulled all of the inner text of the previous_episode element and used regex functions to isolate the values I wanted, which gave me the following:
Previous Episode
Name:Broken Pieces
Date:
Thu Mar 12, 2020
Season:
1
Episode:8
Summary:Episode Summary
(The lines above have no blank lines between them in Excel. The formatting in the question box is placing them all on one line for some reason.)
Before tonight the number 1 was on the same line as "Season:", which allowed my regex function to work.
Alternatively, is there a regex pattern that would let me get the 1 on the line after "Season:"?
Dim XML_05 As New MSXML2.XMLHTTP60
Dim HTML_05 As New MSHTML.HTMLDocument
XML_05.Open "GET", Cells(Row, NextEpisodeURL).Value, False
XML_05.send
HTML_05.body.innerHTML = XML_05.responseText
Dim NETC_05 As MSHTML.IHTMLElementCollection
Dim NET_05 As MSHTML.IHTMLElement
Dim REC_05 As MSHTML.IHTMLElement
Dim CEC_05 As MSHTML.IHTMLElementCollection
Dim CE_05 As MSHTML.IHTMLElement
Dim REO_05 As VBScript_RegExp_55.RegExp
Dim MO_05 As Object
Dim SN_05() As String
Dim ENA_05() As String
Dim EN_05() As String
Dim LatestEpisodeName As String
Set NET_05 = HTML_05.getElementById("previous_episode")
Set REO_05 = New VBScript_RegExp_55.RegExp
REO_05.Global = True
REO_05.IgnoreCase = True
REO_05.Pattern = "(Name:(.*))"
Set MO_05 = REO_05.Execute(NET_05.innerText)
Debug.Print MO_05.Count
Debug.Print MO_05(0).Value
ENA_05 = Split(MO_05(0), ":")
Debug.Print ENA_05(1)
LatestEpisodeName = ENA_05(1)
REO_05.Pattern = "(Episode:([0-9]*))"
Set MO_05 = REO_05.Execute(NET_05.innerText)
Debug.Print MO_05.Count
Debug.Print MO_05(0).Value
EN_05 = Split(MO_05(0), ":")
Debug.Print EN_05(1)
Cells(Row, EpisodeNet).Value = EN_05(1)
REO_05.Pattern = "(Season:\s+([0-9]*))"
Set MO_05 = REO_05.Execute(NET_05.innerText)
Debug.Print MO_05.Count
Debug.Print MO_05(0).Value
SN_05 = Split(MO_05(0), ":")
Debug.Print SN_05(1)
'Trim returns a new string, so assign its result instead of discarding it
Cells(Row, SeasonNet).Value = Trim(SN_05(1))
Set NETC_05 = HTML_05.getElementById("next_episode").Children
Cells(Row, CountDown).Value = NETC_05(5).innerText
Debug.Print NETC_05(5).innerText
I suppose this is something that might help you get the required fields you wish to grab:
Sub FetchData()
Const Url$ = "https://next-episode.net/star-trek-picard"
Dim HTML As New HTMLDocument, post As Object
Dim rxp As New RegExp, R&
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", Url, False
.send
HTML.body.innerHTML = .responseText
End With
Set post = HTML.getElementById("previous_episode")
With rxp
.pattern = "(Name:(.*))"
If .Execute(post.innerText).Count > 0 Then
R = R + 1: Cells(R, 1) = .Execute(post.innerText)(0).SubMatches(0)
End If
.pattern = "(Season:\s*([0-9]*))"
If .Execute(post.innerText).Count > 0 Then
Cells(R, 2) = .Execute(post.innerText)(0).SubMatches(0)
End If
.pattern = "(Episode:([0-9]*))"
If .Execute(post.innerText).Count > 0 Then
Cells(R, 3) = .Execute(post.innerText)(0).SubMatches(0)
End If
End With
End Sub
Const Url$ = "https://next-episode.net/star-trek-picard"
Dim HTML As New HTMLDocument, post As Object
Dim rxp As New RegExp, R&
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", Url, False
.send
HTML.body.innerHTML = .responseText
End With
Set post = HTML.getElementById("previous_episode")
With rxp
.pattern = "(Name:(.*))"
If .Execute(post.innerText).Count > 0 Then
R = R + 1: Cells(R, 1) = .Execute(post.innerText)(0).SubMatches(0)
End If
.pattern = "(Season:\s+([0-9]*))"
If .Execute(post.innerText).Count > 0 Then
CleanString = Application.WorksheetFunction.Clean(.Execute(post.innerText)(0).SubMatches(0))
SeasonNumber = Split(CleanString, ":")
Cells(R, 2) = SeasonNumber(1)
End If
.pattern = "(Episode:([0-9]*))"
If .Execute(post.innerText).Count > 0 Then
Cells(R, 3) = .Execute(post.innerText)(0).SubMatches(0)
End If
End With
End Sub
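Because `\s` in VBScript regular expressions also matches line breaks, a single capture group around the digits can return the season number directly, whether it sits on the same line as "Season:" or on the next one. A minimal sketch, assuming the inner text looks like the sample in the question; the function name `SeasonNumber` and the demo string are illustrative, not part of the original code.

```vba
Option Explicit

' Hypothetical helper: returns the digits that follow "Season:",
' skipping any spaces or line breaks in between ("" if none found).
Public Function SeasonNumber(ByVal source As String) As String
    With CreateObject("VBScript.RegExp")
        .IgnoreCase = True
        .Pattern = "Season:\s*(\d+)"
        ' SubMatches(0) is the first capture group, i.e. only the digits
        If .Test(source) Then SeasonNumber = .Execute(source)(0).SubMatches(0)
    End With
End Function

Public Sub SeasonNumberDemo()
    ' Sample text with the number on the line after "Season:"
    Dim sample As String
    sample = "Date:" & vbCrLf & "Season:" & vbCrLf & "1" & vbCrLf & "Episode:8"
    Debug.Print SeasonNumber(sample)   ' prints 1
End Sub
```

Capturing only the digits avoids the later `Split` and `Clean` steps, since the label and the line break never make it into the result.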
It is cleaner and faster to use the DOM to your advantage. The nodes you want are the NextSibling nodes of the items matched by the fast CSS selector #previous_episode .subheadline. The selector returns a nodeList of the left-hand-side elements, e.g. "Name", "Date", etc. Simply navigate with NextSibling to jump to the right-hand-side elements, e.g. "Et in Arcadia Ego, Part 1".
The returned nodeList is zero-based and you can index into it to get individual items. Note that the second item you want is a textNode, so you must use its NodeValue property rather than innerText.
This is faster and more correct in terms of HTML parsing.
Option Explicit
Public Sub GetPriorEpisodeInfp()
Const URL = "https://next-episode.net/star-trek-picard"
Dim html As New mshtml.HTMLDocument, previousEpisodeItems As Object
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", URL, False
.send
html.body.innerHTML = .responseText
End With
Set previousEpisodeItems = html.querySelectorAll("#previous_episode .subheadline")
With ActiveSheet
.Cells(1, 1) = previousEpisodeItems.item(0).NextSibling.innerText
.Cells(1, 2) = previousEpisodeItems.item(2).NextSibling.NodeValue 'textNode
.Cells(1, 3) = previousEpisodeItems.item(3).NextSibling.innerText
End With
End Sub

Need to split the data in the cell into different columns and help in copying data from website to excel using vba

I am copying the traffic data from this website: https://www.cp24.com/mobile/commuter-centre/traffic
I have used the following code so far:
Sub main()
Dim IE As InternetExplorer
Dim i
Set IE = New InternetExplorer
IE.Navigate "https://www.cp24.com/mobile/commuter-centre/traffic"
Do
DoEvents
Loop Until IE.ReadyState = ReadyState_Complete
Dim Doc As HTMLDocument
Set Doc = IE.Document
Dim AllRoute As String
Set holdingsClass = Doc.getElementsByClassName("trafficWidget")
ActiveSheet.Range("A1").Value = holdingsClass(0).textContent
IE.Quit
End Sub
There are two problems I am facing:
1) It copies all the data in the trafficWidget class into one cell, so data is lost when the cell runs out of space.
2) I want a way to split the data; right now everything shows up in one cell.
It should look like this:
col.A col.B col.C col.D
HighwayName Current Ideal Delay
Any guidance would be appreciated.
Here you go using CSS selectors to target the information required.
Option Explicit
Sub Getinfo()
Dim http As New XMLHTTP60, html As New HTMLDocument '< XMLHTTP60 is for Excel 2016, so change according to your version, e.g. XMLHTTP for 2013
Const URL As String = "https://www.cp24.com/mobile/commuter-centre/traffic"
Application.ScreenUpdating = False
With http
.Open "GET", URL, False
.send
html.body.innerHTML = .responseText
End With
Dim routeNodeList As Object, currentNodeList As Object, idealNodeList As Object, delayNodeList As Object
With html
Set routeNodeList = .querySelectorAll(".location")
Set currentNodeList = .querySelectorAll(".current")
Set idealNodeList = .querySelectorAll(".ideal")
Set delayNodeList = .querySelectorAll(".delaymin")
End With
Dim i As Long
For i = 0 To routeNodeList.Length - 1
With ActiveSheet
.Cells(i + 2, 1) = routeNodeList.item(i).innerText
.Cells(i + 2, 2) = currentNodeList.item(i).innerText
.Cells(i + 2, 3) = idealNodeList.item(i).innerText
.Cells(i + 2, 4) = delayNodeList.item(i).innerText
End With
Next i
Application.ScreenUpdating = True
End Sub
References required (VBE > Tools > References):
Microsoft HTML Object Library and Microsoft XML (pick the version installed on your machine, e.g. v6.0)
Late bound version:
Option Explicit
Public Sub Getinfo()
Dim http As Object, html As Object, i As Long
Const URL As String = "https://www.cp24.com/mobile/commuter-centre/traffic"
Application.ScreenUpdating = False
With CreateObject("MSXML2.serverXMLHTTP")
.Open "GET", URL, False
.send
Set html = CreateObject("HTMLFile")
html.body.innerHTML = .responseText
End With
Dim counter As Long: counter = 1
With ActiveSheet
For i = 0 To html.all.Length - 1
Select Case html.all(i).className
Case "location"
counter = counter + 1
.Cells(counter, 1).Value = html.all(i).innerText
Case "current"
.Cells(counter, 2).Value = html.all(i).innerText
Case "ideal"
.Cells(counter, 3).Value = html.all(i).innerText
Case "delaymin"
.Cells(counter, 4).Value = html.all(i).innerText
End Select
Next i
End With
Application.ScreenUpdating = True
End Sub

How to import multiple pages into Excel from ESPN

I need to import data from ESPN Fantasy Football into Excel. This includes multiple pages and a web query does not work. The URL is http://games.espn.com/ffl/tools/projections?startIndex=0. When you click next, the URL becomes http://games.espn.com/ffl/tools/projections?startIndex=40 and 80 and so on. Basically, the startIndex increments to multiples of 40 when you click on the next page.
This is the code I've used so far, modified from the code provided by @Dee. It seems to add a new worksheet, but there is no data in it.
Private Const URL_TEMPLATE As String = _
"URL;http://games.espn.com/ffl/tools/projections?startIndex={0}"
Private Const NUMBER_OF_PAGES As Byte = 7
Sub test()
Dim page As Byte
Dim queryTableObject As QueryTable
Dim url As String
For page = 1 To NUMBER_OF_PAGES
url = VBA.Strings.Replace(URL_TEMPLATE, "{0}", page * 40)
Set queryTableObject = ActiveSheet.QueryTables.Add(Connection:=url, Destination:=ThisWorkbook.Worksheets.Add.[a1])
queryTableObject.WebSelectionType = xlSpecifiedTables
queryTableObject.WebTables = "14"
queryTableObject.Refresh
Next page
End Sub
The end result should be pulling all the columns from each page into a tab in one Excel sheet. Is there a way to do this? Please assist.
Thanks,
J
The Id of the table is not 14 but playertable_0. If you need to paste all the data into one single sheet, try the following modified code. The number of pages here is just a guess. You will have to either add code that checks whether, e.g., the Next element is present on the page and run the query in a Do loop until it is not, or simply guess the number of pages. HTH
Option Explicit
Private Const URL_TEMPLATE As String = _
"URL;http://games.espn.com/ffl/tools/projections?startIndex={0}"
Private Const NUMBER_OF_PAGES As Byte = 20
Private Const TableId As String = "playertable_0"
Sub test()
Dim page As Byte
Dim queryTableObject As QueryTable
Dim url As String
Dim ws As Worksheet
Dim target As Range
Dim lastRow As Long
Set ws = ThisWorkbook.Worksheets.Add
Set target = ws.[a1]
lastRow = 1
For page = 0 To NUMBER_OF_PAGES
url = VBA.Strings.Replace(URL_TEMPLATE, "{0}", page * 40)
Set queryTableObject = ActiveSheet.QueryTables.Add( _
Connection:=url, _
Destination:=target)
With queryTableObject
.BackgroundQuery = False ' Run the query synchronously
' to be able to compute the last row
.WebSelectionType = xlSpecifiedTables
.WebTables = TableId
.Refresh
End With
lastRow = ws.Cells.Find( _
what:="*", _
SearchDirection:=xlPrevious, _
SearchOrder:=xlByRows) _
.Row
Set target = ws.Cells(lastRow + 1, 1)
target.Select
Next page
End Sub
Give this a go. I hope it will fix all the issues you are having.
Sub Espn_Data()
Const URL = "http://games.espn.com/ffl/tools/projections?startIndex="
Dim http As New XMLHTTP60, html As New HTMLDocument, page As Long
Dim htmla As Object, tRow As Object, tCel As Object, ws As Worksheet
Dim c As Long, x As Long 'column and row counters for the output
For page = 0 To 120 Step 40 '''input the highest number replacing 120 to get them all
With http
.Open "GET", URL & page, False
.send
html.body.innerHTML = .responseText
End With
With ThisWorkbook
Set ws = .Sheets.Add(After:=.Sheets(.Sheets.Count))
ws.Name = page
End With
Set htmla = html.getElementsByClassName("playerTableTable tableBody")(0)
For Each tRow In htmla.Rows
For Each tCel In tRow.Cells
c = c + 1: ws.Cells(x + 1, c) = tCel.innerText
Next tCel
c = 0
x = x + 1
Next tRow
c = 0
x = 0
Next page
End Sub
The Web Query for ESPN doesn't work any longer with ESPN Fantasy Football... I don't know why as it only returns all of the page info except what is in the table... Don't insult my intelligence, I've tried all of the settings...
I have come to discover that Excel only allows IE 6 as a web browser as you can find this in the 'File/Options/Advanced/General' "Web Options" then Browsers and then scroll into Target Browsers.... There are your choices for Web Query to obtain Data.... ESPN does not support any of those browsers so like you, we're screwed until EXCEL updates our choices - Yes, I know... It's 2021 :(

Excel VBA Macro: Scraping data from site table that spans multiple pages

Thanks in advance for the help. I'm running Windows 8.1, I have the latest IE / Chrome browsers, and the latest Excel. I'm trying to write an Excel macro that pulls data from Stack Overflow (https://stackoverflow.com/tags). Specifically, I'm trying to pull the date on which the macro is run, the tag names, the number of tags, and the brief description of each tag. I have it working for the first page of the table, but not for the rest (there are 1132 pages at the moment). Right now, it overwrites the data every time I run the macro, and I'm not sure how to make it look for the next empty cell before running. Lastly, I'm trying to make it run automatically once per week.
I'd much appreciate any help here. Problems are:
Pulling data from the web table beyond the first page
Making it scrape data to the next empty row rather than overwriting
Making the Macro run automatically once per week
Code (so far) is below. Thanks!
Enum READYSTATE
READYSTATE_UNINITIALIZED = 0
READYSTATE_LOADING = 1
READYSTATE_LOADED = 2
READYSTATE_INTERACTIVE = 3
READYSTATE_COMPLETE = 4
End Enum
Sub ImportStackOverflowData()
'to refer to the running copy of Internet Explorer
Dim ie As InternetExplorer
'to refer to the HTML document returned
Dim html As HTMLDocument
'open Internet Explorer in memory, and go to website
Set ie = New InternetExplorer
ie.Visible = False
ie.navigate "http://stackoverflow.com/tags"
'Wait until IE is done loading page
Do While ie.READYSTATE <> READYSTATE_COMPLETE
Application.StatusBar = "Trying to go to StackOverflow ..."
DoEvents
Loop
'show text of HTML document returned
Set html = ie.document
'close down IE and reset status bar
Set ie = Nothing
Application.StatusBar = ""
'clear old data out and put titles in
'Cells.Clear
'put heading across the top of row 3
Range("A3").Value = "Date Pulled"
Range("B3").Value = "Keyword"
Range("C3").Value = "# Of Tags"
'Range("C3").Value = "Asked This Week"
Range("D3").Value = "Description"
Dim TagList As IHTMLElement
Dim Tags As IHTMLElementCollection
Dim Tag As IHTMLElement
Dim RowNumber As Long
Dim TagFields As IHTMLElementCollection
Dim TagField As IHTMLElement
Dim Keyword As String
Dim NumberOfTags As String
'Dim AskedThisWeek As String
Dim TagDescription As String
'Dim QuestionFieldLinks As IHTMLElementCollection
Dim TodaysDate As Date
Set TagList = html.getElementById("tags-browser")
Set Tags = html.getElementsByClassName("tag-cell")
RowNumber = 4
For Each Tag In Tags
'if this is the tag containing the details, process it
If Tag.className = "tag-cell" Then
'get a list of all of the parts of this question,
'and loop over them
Set TagFields = Tag.all
For Each TagField In TagFields
'if this is the keyword, store it
If TagField.className = "post-tag" Then
'store the text value
Keyword = TagField.innerText
Cells(RowNumber, 2).Value = TagField.innerText
End If
If TagField.className = "item-multiplier-count" Then
'store the integer for number of tags
NumberOfTags = TagField.innerText
'NumberOfTags = Replace(NumberOfTags, "x", "")
Cells(RowNumber, 3).Value = Trim(NumberOfTags)
End If
If TagField.className = "excerpt" Then
TagDescription = TagField.innerText
Cells(RowNumber, 4).Value = TagField.innerText
End If
TodaysDate = Format(Now, "MM/dd/yy")
Cells(RowNumber, 1).Value = TodaysDate
Next TagField
'go on to next row of worksheet
RowNumber = RowNumber + 1
End If
Next
Set html = Nothing
'do some final formatting
Range("A3").CurrentRegion.WrapText = False
Range("A3").CurrentRegion.EntireColumn.AutoFit
Range("A1:C1").EntireColumn.HorizontalAlignment = xlCenter
Range("A1:D1").Merge
Range("A1").Value = "StackOverflow Tag Trends"
Range("A1").Font.Bold = True
Application.StatusBar = ""
MsgBox "Done!"
End Sub
There's no need to scrape Stack Overflow when they make the underlying data available to you through things like the Data Explorer. Using this query in the Data Explorer should get you the results you need:
select t.TagName, t.Count, p.Body
from Tags t inner join Posts p
on t.ExcerptPostId = p.Id
order by t.count desc;
The permalink to that query is here and the "Download CSV" option which appears after the query runs is probably the easiest way to get the data into Excel. If you wanted to automate that part of things, the direct link to the CSV download of results is here
You can improve this to parse out exact elements, but as written it loops over all the pages and grabs all the tag info (everything next to a tag).
Option Explicit
Public Sub ImportStackOverflowData()
Dim ie As New InternetExplorer, html As HTMLDocument
Application.ScreenUpdating = False
With ie
.Visible = True
.navigate "https://stackoverflow.com/tags"
While .Busy Or .READYSTATE < 4: DoEvents: Wend
Set html = .document
Dim numPages As Long, i As Long, info As Object, item As Object, counter As Long
numPages = html.querySelector(".page-numbers.dots ~ a").innerText
For i = 1 To 2 ' numPages ''<==1 to 2 for testing; use to numPages
DoEvents
Set info = html.getElementById("tags_list")
For Each item In info.getElementsByClassName("grid-layout--cell tag-cell")
counter = counter + 1
Cells(counter, 1) = item.innerText
Next item
html.querySelector(".page-numbers.next").Click
While .Busy Or .READYSTATE < 4: DoEvents: Wend
Set html = .document
Next i
Application.ScreenUpdating = True
.Quit '<== Remember to quit application
End With
End Sub
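The other two parts of the question, appending below existing data and running weekly, can be sketched separately. This is a rough sketch under stated assumptions: column A is taken to always hold a value for every written row, the helper names `NextEmptyRow` and `ScheduleWeeklyImport` are hypothetical, and `Application.OnTime` only fires while the Excel session that scheduled it stays open.

```vba
Option Explicit

' Hypothetical helper: first empty row below the last used cell in column A.
' On a completely empty sheet this returns 2, because End(xlUp) stops at row 1.
Public Function NextEmptyRow(ByVal ws As Worksheet) As Long
    NextEmptyRow = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row + 1
End Function

' Hypothetical helper: asks Excel to run the import again in seven days.
' Adding 7 to Now moves the date part forward by seven days.
Public Sub ScheduleWeeklyImport()
    Application.OnTime Now + 7, "ImportStackOverflowData"
End Sub
```

In the question's code, starting with `RowNumber = NextEmptyRow(ActiveSheet)` instead of `RowNumber = 4` would append to earlier runs rather than overwrite them. For runs that must happen even when Excel is closed, an external scheduler that opens the workbook is a common alternative, though that setup is outside what this sketch covers.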
I'm not making use of the DOM, but I find it very easy to get by just searching between known tags. If the expressions you are looking for are too common, just tweak the code a bit so that it looks for a string after another string.
An example:
Public Sub ZipLookUp()
Dim URL As String, xmlHTTP As Object, html As Object, htmlResponse As String
Dim SStr As String, EStr As String, EndS As Long, StartS As Long 'Long: positions in large pages overflow Integer
Dim Zip4Digit As String
URL = "https://tools.usps.com/go/ZipLookupResultsAction!input.action?resultMode=1&companyName=&address1=1642+Harmon+Street&address2=&city=Berkeley&state=CA&urbanCode=&postalCode=&zip=94703"
Set xmlHTTP = CreateObject("MSXML2.XMLHTTP")
xmlHTTP.Open "GET", URL, False
On Error GoTo NoConnect
xmlHTTP.send
On Error GoTo 0
Set html = CreateObject("htmlfile")
htmlResponse = xmlHTTP.ResponseText
If Len(htmlResponse) = 0 Then 'a String is never Null, so test for an empty response instead
MsgBox ("Aborted Run - HTML response was empty")
Application.ScreenUpdating = True
GoTo End_Prog
End If
'Searching for a string within 2 strings
SStr = "<span class=""address1 range"">" ' first string
EStr = "</span><br />" ' second string
StartS = InStr(1, htmlResponse, SStr, vbTextCompare) + Len(SStr)
EndS = InStr(StartS, htmlResponse, EStr, vbTextCompare)
Zip4Digit = Left(Mid(htmlResponse, StartS, EndS - StartS), 4)
MsgBox Zip4Digit
GoTo End_Prog
NoConnect:
If Err = -2147467259 Or Err = -2146697211 Then MsgBox "Error - No Connection": GoTo End_Prog 'MsgBox Err & ": " & Error(Err)
End_Prog:
End Sub
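The "string between two strings" step can be pulled out into a small function so it can be reused for other fields. A minimal sketch; the name `TextBetween` and the sample HTML are illustrative assumptions, not taken from the USPS page.

```vba
Option Explicit

' Hypothetical helper: returns the text between the first occurrence of
' startTag and the next occurrence of endTag, or "" if either is missing.
Public Function TextBetween(ByVal source As String, ByVal startTag As String, _
                            ByVal endTag As String) As String
    Dim startPos As Long, endPos As Long
    startPos = InStr(1, source, startTag, vbTextCompare)
    If startPos = 0 Then Exit Function          ' start marker not found
    startPos = startPos + Len(startTag)         ' move past the start marker
    endPos = InStr(startPos, source, endTag, vbTextCompare)
    If endPos = 0 Then Exit Function            ' end marker not found
    TextBetween = Mid$(source, startPos, endPos - startPos)
End Function

Public Sub TextBetweenDemo()
    ' Made-up fragment used only to exercise the helper
    Const sample As String = "<span class=""zip"">94703-2636</span>"
    Debug.Print TextBetween(sample, "<span class=""zip"">", "</span>")   ' prints 94703-2636
End Sub
```

Checking both `InStr` results for 0 avoids the runtime error that `Mid` raises when a marker is absent and the computed length goes negative.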

Get website data from Urls using VBA

I have multiple URLs stored in an Excel sheet. I want to get the data that resides within a particular div tag. For one website it works fine:
Sub Cityline()
Dim IE As Object
Set IE = CreateObject("Internetexplorer.application")
IE.Visible = True
IE.navigate "http://Someurl.com/bla/bla/bla"
'Wait until the page has finished loading
Do While IE.Busy Or IE.ReadyState <> 4
DoEvents
Loop
Dim Doc As Object
Set Doc = IE.Document
Dim workout As String
workout = Doc.getElementsByClassName("CLASS_NAME_OF_DATA")(0).innertext
Range("A2") = workout
End Sub
I used the code below to loop through all the URLs, but it's not working:
Sub GetData()
Dim oHtm As Object: Set oHtm = CreateObject("HTMLFile")
Dim req As Object: Set req = CreateObject("msxml2.xmlhttp")
Dim oRow As Object
Dim oCell As Range
Dim url As String
Dim y As Long, x As Long
x = 1
For Each oCell In Sheets("sheet1").Range("A2:A340")
req.Open "GET", oCell.Offset(, 1).Value, False
req.send
With oHtm
.body.innerhtml = req.responsetext
With .getelementsbytagname("table")(1)
With Sheets(1)
.Cells(x, 1).Value = oCell.Offset(, -1).Value
.Cells(x, 2).Value = oCell.Value
End With
y = 3
For Each oRow In .Rows
Sheets(1).Cells(x, y).Value = oRow.Cells(1).innertext
y = y + 1
Next oRow
End With
End With
x = x + 1
Next oCell
End Sub
Can anyone suggest where I went wrong?
I also tried the approach from "Fetching Data from multiple URLs", but it doesn't work for me.
Please guide me on how to get data from all the URLs at once.
I'm new to SO, so apologies to the mods if this should be in comments (I couldn't get it to fit).
I agree with Silver's comments, but I thought I'd suggest a different approach that might help. If you have URLs in a column of cells, you could create a custom VBA function that will extract the relevant data out of the HTML. Just use this function in the cells to the right of your URL to return the relevant data from the HTML. An example is this:
Public Function GetHTMLData(SiteURL As String, FieldSearch As String) As String
Dim IE As Object
Dim BodyHTML As String
Dim FieldStart As Long 'Long: positions in large pages overflow Integer
Dim FieldEnd As Long
Set IE = CreateObject("InternetExplorer.Application")
With IE
.Navigate SiteURL
Do While .Busy Or .ReadyState <> 4
DoEvents
Loop
BodyHTML = IIf(StrComp(.Document.Title, "Cannot find server", vbTextCompare) = 0, _
vbNullString, .Document.body.innerhtml)
FieldStart = InStr(1, BodyHTML, FieldSearch) + Len(FieldSearch) + 12
FieldEnd = InStr(FieldStart, BodyHTML, "<")
GetHTMLData = Mid(BodyHTML, FieldStart, FieldEnd - FieldStart)
.Quit
End With
Set IE = Nothing
End Function
The function above has 2 input parameters: the URL and a string that will be searched for within the HTML. It will then return a string from within the HTML, starting from 12 characters after the searched parameter and ending at the following '<' within the HTML.
Hope that helps.