How can I import data from a child URL? - vba

I thought I figured this out over the weekend, but it actually doesn't work the way I thought it would. I have a confidential corporate SharePoint site that I work with. I can't post the link here, or any specific data, but the concept below will illustrate the point fine.
I have a parent URL that I want to import data from. Let's say this is the parent URL.
http://www.sharenet.co.za/v3/q_sharelookup.php
From there, I want to import data from a specific link. Let's say this is the link: 'Building & Construction Materials'
I think the best way to do this is some kind of InStr() function and search for the string. Then, if found, click the link and open the child URL. When the child URL opens, it looks something like this:
http://www.sharenet.co.za/v3/sharesfound.php?ssector=2353&exch=JSE&bookmark=Building%20&%20Construction%20Materials&scheme=default
I can't tell what the sector numbers will be ahead of time, so I can't use a specific URL. I need to reference it as the parent and child, or maybe IE1 and IE2. I want to import all data from the child URL, which in this example, looks like this.
Name       Full Name                                      Code   Sector
BUILDMX    BUILDMAX LIMITED                               BDM    2353
KAYDAV     KAYDAV GROUP LTD                               KDV    2353
AFRIMAT    AFRIMAT LTD                                    AFT    2353
Trellidor  Trellidor Hldgs Ltd                            TRL    2353
MASONITE   MASONITE (AFRICA) LIMITED                      MAS    2353
DAWN       DISTRIBUTION AND WAREHOUSING NETWORK LIMITED   DAW    2353
MAZOR      MAZOR GROUP LTD                                MZR    2353
PPC        PPC LIMITED                                    PPC    2353
PPCN       PPC Limited NPL                                PPCN   2353
Just to demonstrate how I tried to solve this, here is the script I used.
Sub ListLinks()
    'Set a reference to Microsoft Internet Controls
    Dim IeApp As InternetExplorer
    Dim sURL As String
    Dim IeDoc As Object
    Dim i As Long

    Set IeApp = New InternetExplorer
    IeApp.Visible = True

    sURL = "http://www.sharenet.co.za/v3/q_sharelookup.php"
    IeApp.Navigate sURL

    Do
        DoEvents    'keep Excel responsive while the page loads
    Loop Until IeApp.ReadyState = READYSTATE_COMPLETE

    Set IeDoc = IeApp.Document

    For i = 0 To IeDoc.Links.Length - 1
        Cells(i + 1, 1).Value = IeDoc.Links(i).href
    Next i

    Set IeApp = Nothing
End Sub
I thought this would work fine: list all the URLs, then loop through each one to import its data. The problem on my SharePoint site is that the href doesn't appear to have any relevance to the name of the hyperlink.
On the page you can see 'Building & Construction Materials' in a TD element. If I can reference that in the 1st browser, click the correct link to open a 2nd browser, and then reference that 2nd browser and scrape all of its TD elements, everything should work fine. Does anyone here know how to do that?

Good try on the code; you got it pretty close. The one area that needs fixing is where you get the list of items and loop through it. You had the right idea about how it should work, but the HTML element syntax is a little off, so it looks like you just need more practice with HTML objects. See the sample code below:
Public Sub sampleCode()
    Dim URL As String
    Dim XMLHTTP As MSXML2.XMLHTTP60
    Dim HTMLDoc_Main As HTMLDocument
    Dim HTMLDoc_Secondary As HTMLDocument
    Dim targetTable As Object       'the element carrying the "dataTable" class
    Dim links As IHTMLElementCollection
    Dim linkCounter As Long
    Dim searchText As String

    URL = "http://www.sharenet.co.za/v3/q_sharelookup.php"
    searchText = "Building & Construction Materials"

    Set XMLHTTP = New MSXML2.XMLHTTP60
    Set HTMLDoc_Main = New HTMLDocument

    With XMLHTTP
        .Open "GET", URL, False
        .send
        While .readyState <> 4: Wend
        HTMLDoc_Main.body.innerHTML = .responseText
    End With

    Set targetTable = HTMLDoc_Main.getElementsByClassName("dataTable")(0)
    Set links = targetTable.getElementsByTagName("a")

    For linkCounter = 0 To links.Length - 1
        With links(linkCounter)
            If InStr(1, .innerText, searchText) > 0 Then
                Set XMLHTTP = New MSXML2.XMLHTTP60
                Set HTMLDoc_Secondary = New HTMLDocument
                'Note: if the page uses relative links, .href may need the site's
                'base URL prepended before it can be requested.
                XMLHTTP.Open "GET", .href, False
                XMLHTTP.send
                While XMLHTTP.readyState <> 4: Wend
                HTMLDoc_Secondary.body.innerHTML = XMLHTTP.responseText
                'Parse HTMLDoc_Secondary
            End If
        End With
    Next

    Set XMLHTTP = Nothing
    Set HTMLDoc_Main = Nothing
    Set HTMLDoc_Secondary = Nothing
End Sub
A couple of notes: 1) I used XMLHTTP requests instead of IE because they are faster, so 2) you will need to add 'Microsoft HTML Object Library' and 'Microsoft XML, v6.0' to your references, and 3) I can see you are outputting to ranges cell by cell in your original code; if at all possible, avoid that. Populate an array and then dump its entire contents into your target sheet all at once to save time (see the sketch below).
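To make note 3 and the 'Parse HTMLDoc_Secondary placeholder concrete, here is a rough sketch of a parsing routine you could call from inside the loop. It assumes the child page keeps its results in the first element with class "dataTable" (borrowed from the parent page above; inspect the child page and adjust) and that you want the output on Sheet1. It reads every cell into an array and writes the array to the sheet in a single operation:
Sub ParseSecondaryToSheet(ByVal HTMLDoc_Secondary As HTMLDocument)
    'Sketch only: "dataTable" and "Sheet1" are assumptions, not taken from your site
    Dim resultTable As Object, rws As Object
    Dim buffer() As Variant
    Dim r As Long, c As Long

    Set resultTable = HTMLDoc_Secondary.getElementsByClassName("dataTable")(0)
    Set rws = resultTable.getElementsByTagName("tr")
    If rws.Length = 0 Then Exit Sub

    'Size the buffer from the first row, then fill it cell by cell
    ReDim buffer(1 To rws.Length, 1 To rws(0).Cells.Length)
    For r = 0 To rws.Length - 1
        For c = 0 To rws(r).Cells.Length - 1
            If c + 1 <= UBound(buffer, 2) Then buffer(r + 1, c + 1) = rws(r).Cells(c).innerText
        Next c
    Next r

    'One write to the sheet instead of one write per cell
    ThisWorkbook.Worksheets("Sheet1").Range("A1").Resize(UBound(buffer, 1), UBound(buffer, 2)).Value = buffer
End Sub
Call it right where the 'Parse HTMLDoc_Secondary comment sits, e.g. ParseSecondaryToSheet HTMLDoc_Secondary.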
Hope this helps,
TheSilkCode

Related

How to enter a website address using VBA and search

I know this may seem easy. I have already written some code to try to get this to work, but ran into one problem. The format of the link below is the same for all cities and states. As long as you type the name of the city ("City_Search") and the state ("State_Search"), you should be able to access the website with the information, as seen below.
I have attached the code I am using below. If anyone can assist me with the search I would appreciate it.
Sub SearchBot1()
    'dimension (declare or set aside memory for) our variables
    Dim objIE As InternetExplorer 'special object variable representing the IE browser
    Dim aEle As HTMLLinkElement 'special object variable for an <a> (link) element
    Dim HTMLinputs As MSHTML.IHTMLElementCollection

    'initiate a new instance of Internet Explorer and assign it to objIE
    Set objIE = New InternetExplorer

    'make IE browser visible (False would allow IE to run in the background)
    objIE.Visible = True

    'navigate IE to this web page (a pretty neat search engine really)
    objIE.navigate "https://datausa.io/profile/geo/" & Range("City_Search").Value & "-" & Range("State_Search").Value

    'wait here a few seconds while the browser is busy
    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
End Sub
The idea is for me to type any city into Excel and, once I run the macro, have it go to the site and pull up that town's data. I have added a link below as an example of the page I am looking to get when I search.
https://datausa.io/profile/geo/hoboken-nj/
You need to hyphenate cities that have spaces in their names, the state needs to be the correct abbreviation, and both are case sensitive, i.e. they need to be all lower case. So you need to add these hyphens, if missing, using a function like Replace in VBA to swap Chr$(32) with "-" (Chr$(45)), and potentially LCase$ to convert to lowercase.
You should also fully qualify the range with the worksheet you intend to use.
With data already in correct format in cell:
E.g. with los-angeles-ca or los-angeles-county-ca in a cell.
Option Explicit

Public Sub SearchBot1()
    Dim objIE As InternetExplorer, aEle As HTMLLinkElement
    Dim HTMLinputs As MSHTML.IHTMLElementCollection

    Set objIE = New InternetExplorer

    'e.g. https://datausa.io/profile/geo/los-angeles-ca/
    With objIE
        .Visible = True
        .navigate "https://datausa.io/profile/geo/" & Range("City_Search").Value & "-" & Range("State_Search").Value
        Do While .Busy = True Or .readyState <> 4: DoEvents: Loop
        Stop
        ' .Quit '<== Uncomment me to close browser at end
    End With
End Sub
Adding hyphens:
If you had los angeles, not los-angeles, in a cell:
Replace$(Range("City_Search").Value, Chr$(32), Chr$(45))
Lowercase and hyphen:
To be really safe you could convert to lowercase as well, to handle any upper case letters in the cell you are referencing. For Los Angeles use:
Replace$(LCase$(Range("City_Search").Value), Chr$(32), Chr$(45))
Option Explicit

Public Sub SearchBot1()
    Dim objIE As InternetExplorer, aEle As HTMLLinkElement
    Dim HTMLinputs As MSHTML.IHTMLElementCollection, ws As Worksheet

    Set ws = ThisWorkbook.Worksheets("Sheet1")
    Set objIE = New InternetExplorer

    'e.g. https://datausa.io/profile/geo/los-angeles-ca/
    With objIE
        .Visible = True
        .navigate "https://datausa.io/profile/geo/" & ws.Range("City_Search").Value & "-" & ws.Range("State_Search").Value
        Do While .Busy = True Or .readyState <> 4: DoEvents: Loop
        Stop
        ' .Quit '<== Uncomment me to close browser at end
    End With
End Sub
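Putting the two string fixes together, a small helper function (my own names, not from the question) keeps the URL building in one place; it assumes the named ranges City_Search and State_Search exist on the worksheet you pass in:
Public Function GeoProfileURL(ByVal ws As Worksheet) As String
    Dim city As String, state As String
    'Lower-case and hyphenate the city, lower-case the state abbreviation
    city = Replace$(LCase$(Trim$(ws.Range("City_Search").Value)), Chr$(32), Chr$(45))
    state = LCase$(Trim$(ws.Range("State_Search").Value))
    GeoProfileURL = "https://datausa.io/profile/geo/" & city & "-" & state
End Function
You would then navigate with .navigate GeoProfileURL(ws) inside the With objIE block.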
That gets you to the pages. What you do from there is up to you.
DID you know that this website has its own data-search API?
And you can also extract data using a background object instead of creating an Internet Explorer?
For instance:
Sub getCityData()
    ''' Create a background server connection
    Dim myCon As Object: Set myCon = CreateObject("MSXML2.ServerXMLHTTP.6.0")

    ''' Open a connection string with the DataUSA API and a basic request for (geo, place, population)
    myCon.Open "GET", "http://api.datausa.io/api/?show=geo&sumlevel=place&required=pop"
    myCon.send ''' Send the request

    ''' Dataset in the ResponseText is HUGE so for demo show first 5000 characters
    Sheet1.Range("A1").Value2 = Left(myCon.responseText, 5000)
End Sub
That will pull the ENTIRE DATA SET for every "place" in America, with its population for every year from 2013 onwards, in about a second. It will place the first 5000 characters of the dataset into cell A1 on Sheet1 (I recommend putting this in a new Excel file).
I don't have time to learn the site's API, but it seems to have good documentation on GitHub, and the responses come back in JSON format. If you really want to make a powerful Excel interface, use their API with background connections; they have a huge amount of data for the USA at your fingertips.
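Since the responses come back as JSON, a sensible first step is to dump one response to disk and study its structure before deciding how to parse it (for example with the open-source VBA-JSON library). A minimal sketch, reusing the endpoint above; the file name is my own and it assumes the workbook has been saved so ThisWorkbook.Path points somewhere useful:
Sub dumpCityData()
    Dim myCon As Object: Set myCon = CreateObject("MSXML2.ServerXMLHTTP.6.0")
    myCon.Open "GET", "http://api.datausa.io/api/?show=geo&sumlevel=place&required=pop", False
    myCon.send

    'Write the raw JSON to a file next to the workbook so you can inspect it
    Dim f As Integer: f = FreeFile
    Open ThisWorkbook.Path & "\datausa_response.json" For Output As #f
    Print #f, myCon.responseText
    Close #f
End Sub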

VBA doesn't read XMLHTTP request's response according to its tree structure

I have checked that both the browser-generated page and the VBA XMLHTTP request's string response have the same tree structure, with the a tag being a child of aside.
Unfortunately, when I want to return the bookie name, which is the title attribute of a, I get an error accessing the 1st child of aside. It turns out that I need to use code assuming the a tag is a sibling of aside to get it working:
Required reference: Microsoft HTML Object Library
Sub SendRequest()
    Dim XMLHTTP As Object: Set XMLHTTP = CreateObject("MSXML2.XMLHTTP.6.0")
    Dim htmlEle1 As IHTMLElement
    Dim htmlDoc As New HTMLDocument
    Dim urlName As String

    urlName = "https://www.oddschecker.com/golf/the-masters/2018-us-masters/winner"

    With XMLHTTP
        .Open "GET", urlName, False
        .send
        htmlDoc.body.innerHTML = .responseText
        For Each htmlEle1 In htmlDoc.getElementsByClassName("eventTableHeader")(0).Children
            If InStr(htmlEle1.className, "bookie-area") <> 0 Then
                Debug.Print htmlEle1.Children(1).getAttribute("title")
            End If
        Next htmlEle1
    End With
End Sub
Does this behavior have something to do with the fact that aside is HTML5 element and VBA thinks that it is a semi-closing tag?
So this took an awful lot of time to figure out. The issue is that you can't do it this way. When you create a new HTMLDocument, its documentMode is by default set to 5.
So when we write any HTML into it, it has no idea about these HTML5 tags and just applies its own corrections. This is as good as running an HTML5 site in an IE6 browser. Unfortunately, I could not find a way that would allow us to create/parse the document with a higher documentMode.
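A quick way to confirm this on your own machine (with the Microsoft HTML Object Library referenced, as in the question's code) is to create a document, feed it a snippet containing an aside, and print what the parser made of it:
Sub CheckDocumentMode()
    Dim d As New HTMLDocument
    d.body.innerHTML = "<aside><a title='bookie'>x</a></aside>"
    Debug.Print d.documentMode      'prints 5 here unless the emulation key below is set
    Debug.Print d.body.innerHTML    'shows how the legacy parser rewrote the aside markup
End Sub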
Update
Thanks to @FlorentB for pointing out that the emulation mode works for the MSHTML library as well. I was already aware of it from the question below:
Embedding Youtube Videos in webbrowser. Object doesn't support property or method
But I assumed it wouldn't work for the MSHTML library. I have now tested it by running the command below:
REG ADD "HKCU\Software\Microsoft\Internet Explorer\Main\FeatureControl\FEATURE_BROWSER_EMULATION" /v excel.exe /t REG_DWORD /d 11001 /f
And then ran the existing code, and it works.
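If you prefer not to run REG ADD by hand, the same value can be written from VBA before you scrape. This is just a sketch of the registry write above; it targets HKCU only (no admin rights needed), and Excel may need a restart before MSHTML picks it up:
Sub SetBrowserEmulation()
    'FEATURE_BROWSER_EMULATION = 11001 forces IE11 edge mode for excel.exe
    Dim sh As Object: Set sh = CreateObject("WScript.Shell")
    sh.RegWrite "HKCU\Software\Microsoft\Internet Explorer\Main\FeatureControl\" & _
                "FEATURE_BROWSER_EMULATION\excel.exe", 11001, "REG_DWORD"
End Sub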
Alternative approach
If setting the registry key needs to be avoided for any reason, then one can use the IE COM browser directly.
You can do this by adding a reference to Microsoft Internet Controls and then executing the code below:
Sub dothis()
    Dim htmlEle1 As IHTMLElement
    Dim htmlDoc As HTMLDocument
    Dim urlName As String
    Dim ie As InternetExplorerMedium

    urlName = "https://www.oddschecker.com/golf/the-masters/2018-us-masters/winner"

    Set ie = New InternetExplorerMedium
    ie.Visible = False
    ie.navigate2 urlName

    While ie.readyState <> READYSTATE_COMPLETE
        DoEvents
    Wend

    Set htmlDoc = ie.document
    Debug.Print (htmlDoc.documentMode)

    For Each htmlEle1 In htmlDoc.getElementsByClassName("eventTableHeader")(0).Children
        If InStr(htmlEle1.className, "bookie-area") <> 0 Then
            Debug.Print htmlEle1.Children(0).Children(0).getAttribute("title")
        End If
    Next htmlEle1
End Sub
And now you can see that the a tag is a child of aside.

VBA: Run-time error 424: Object required when trying to web scrape

I'm trying to update various fund sizes using morningstar.co.uk. The code worked fine until it suddenly stopped and gave an error:
"Run-time error 424: Object required".
The exact line where the error occurs is:
Set allData = IE.document.getElementById("overviewQuickstatsDiv").getElementsByTagName("tbody")(0)
The idea is to ultimately scan the whole "tbody" tag and look for the line "Fund Size" inside the "tr" and "td" tags. When "Fund Size" is found, the code should return the 3rd "td" tag (the actual fund size).
After this I'd add a loop to loop through a list of funds that I've got.
As the code stopped working completely, I haven't got this far yet. Here I'm just trying to check if the code returns the actual fund size.
Since there are not always 3 "td"-tags inside the "tr"-tags, I'll still have to construct some sort of IF-statement to fix that issue.
But for now I just want to know how I can get the code running again. I've spent a great deal of time searching for an answer, but since this seems to be a variable type problem, the solution depends on the situation.
I'm using Excel 2010 and Internet Explorer 11.
URL in easy form to copy-paste:
http://www.morningstar.co.uk/uk/funds/snapshot/snapshot.aspx?id=F0GBR04BKW
Sub testToScrapeWholeTbodyTag()
    'Microsoft Internet Controls
    'Microsoft HTML Object Library
    'Microsoft Shell Controls and Automation

    '======Opens URL======
    Dim IE As Object
    Set IE = CreateObject("internetexplorer.application")
    With IE
        .navigate "http://www.morningstar.co.uk/uk/funds/snapshot/snapshot.aspx?id=F0GBR04BKW"
        .Visible = False
    End With
    While IE.Busy
        DoEvents
    Wend

    '======Got from the internet, fixed a previous error. However, I'm not 100% sure what this does======
    Dim sh
    Dim eachIE As Object
    Do
        Set sh = New Shell32.Shell
        For Each eachIE In sh.Windows
            If InStr(1, eachIE.LocationURL, "http://www.morningstar.co.uk/uk/funds/snapshot/snapshot.aspx?id=F0GBR04BKW") Then
                Set IE = eachIE
                IE.Visible = False '"This is here because in some environments, the new process defaults to Visible."
                Exit Do
            End If
        Next eachIE
    Loop
    Set eachIE = Nothing
    Set sh = Nothing

    '======Looks for the "Fund Size"======
    'Trying to look for "Fund Size" inside a "tr"-tag and, if found, return the value in the 3rd "td"-tag
    Set allData = IE.document.getElementById("overviewQuickstatsDiv").getElementsByTagName("tbody")(0) 'Run-time error 424: Object required
    row1 = allData.getElementsByTagName("tr")(5).Cells(0).innerHTML
    row2 = allData.getElementsByTagName("tr")(5).Cells(1).innerHTML
    row3 = allData.getElementsByTagName("tr")(5).Cells(2).innerHTML
    If Left(row1, 9) = "Fund Size" Then
        Worksheets("Sheet3").Range("B3") = Split(row3, ";")(1)
    End If
    Debug.Print allData.getElementsByTagName("tr")(5).Cells(0).innerHTML '"Fund Size"
    Debug.Print allData.getElementsByTagName("tr")(5).Cells(2).innerHTML 'Actual fund size

    IE.Quit
    Set IE = Nothing
End Sub
EDIT:
Switched method. Now the problem is to get the fund size extracted. So the below code works as it is but I'd need to add a couple of lines to get the fund size out of it. This is my first time using this method so it may well be that I've just not understood some really basic thing. Still, I wasn't able to find a solution to this on my own.
Sub XMLhttpRequestTest()
    'Microsoft XML, v6.0
    'Microsoft HTML Object Library
    Dim HTMLDoc As New HTMLDocument
    Dim ohttp As New MSXML2.XMLHTTP60
    Dim myurl As String
    Dim TRelements As Object
    Dim TRelement As Object

    myurl = "http://www.morningstar.co.uk/uk/funds/snapshot/snapshot.aspx?id=F0GBR04BKW"

    ohttp.Open "GET", myurl, False
    ohttp.send
    HTMLDoc.body.innerHTML = ohttp.responseText

    With HTMLDoc.body
        Set TRelements = .getElementsByTagName("tr")
        For Each TRelement In TRelements
            Debug.Print TRelement.innerText
        Next
    End With
End Sub
You can use a CSS selector of
#overviewQuickstatsDiv td.line.text
and then select the element at index 4. In the selector, # means id and . means class name.
Option Explicit

Public Sub XMLhttpRequestTest()
    'Microsoft XML, v6.0
    'Microsoft HTML Object Library
    Dim HTMLDoc As New HTMLDocument, ohttp As New MSXML2.XMLHTTP60
    Const URL As String = "http://www.morningstar.co.uk/uk/funds/snapshot/snapshot.aspx?id=F0GBR04BKW"
    Dim TRelements As Object, TRelement As Object

    With ohttp
        .Open "GET", URL, False
        .send
        HTMLDoc.body.innerHTML = .responseText
        Debug.Print HTMLDoc.querySelectorAll("#overviewQuickstatsDiv td.line.text")(4).innerText
        'Your other stuff
    End With
End Sub
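If you would rather keep your original plan of scanning the rows for "Fund Size" instead of relying on a fixed index, here is a sketch along those lines (same references and URL as above; the assumption that the label sits in the first cell and the value in the third comes from your description, so verify it against the page):
Public Sub FundSizeByRowScan()
    Dim HTMLDoc As New HTMLDocument, ohttp As New MSXML2.XMLHTTP60
    Dim tr As Object

    ohttp.Open "GET", "http://www.morningstar.co.uk/uk/funds/snapshot/snapshot.aspx?id=F0GBR04BKW", False
    ohttp.send
    HTMLDoc.body.innerHTML = ohttp.responseText

    For Each tr In HTMLDoc.getElementById("overviewQuickstatsDiv").getElementsByTagName("tr")
        If tr.Cells.Length >= 3 Then                                        'skip rows without a value cell
            If InStr(1, tr.Cells(0).innerText, "Fund Size", vbTextCompare) > 0 Then
                Debug.Print tr.Cells(2).innerText                           'the actual fund size
                Exit For
            End If
        End If
    Next tr
End Sub
The Cells.Length check also covers the IF-statement you mentioned needing for rows that do not have three "td" tags.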

Extracting website data with Excel and VBA [duplicate]

I'm trying to scrape data from the website http://uk.investing.com/rates-bonds/financial-futures via VBA, e.g. real-time prices such as German 5 YR Bobl and US 30Y T-Bond. I have tried an Excel web query, but it only scrapes the whole website. I would like to scrape the rate only. Is there a way of doing this?
There are several ways of doing this. This is an answer that I write hoping that all the basics of Internet Explorer automation will be found when browsing for the keywords "scraping data from website", but remember that nothing beats your own research (if you don't want to stick to pre-written code that you're not able to customize).
Please note that this is just one way, and not the one I prefer in terms of performance (since it depends on the browser's speed), but it is good for understanding the rationale behind Internet automation.
1) If I need to browse the web, I need a browser! So I create an Internet Explorer browser:
Dim appIE As Object
Set appIE = CreateObject("internetexplorer.application")
2) I ask the browser to browse the target webpage. Through the ".Visible" property, I decide whether I want to see the browser doing its job or not. While building the code it is nice to have Visible = True, but once the code is working and scraping data it is nice not to see it every time, so Visible = False.
With appIE
.Navigate "http://uk.investing.com/rates-bonds/financial-futures"
.Visible = True
End With
3) The webpage will need some time to load. So I will wait while it's busy...
Do While appIE.Busy
DoEvents
Loop
4) Well, now the page is loaded. Let's say that I want to scrape the change of the US30Y T-Bond:
What I do is press F12 in Internet Explorer to see the webpage's code, and then, using the element-picker pointer, click on the element I want to scrape to see how I can reach it.
5) What I should do is straightforward. First of all, I will get, by its ID property, the tr element containing the value:
Set allRowOfData = appIE.document.getElementById("pair_8907")
Here I get a collection of td elements (specifically, tr is a row of data, and the td are its cells). We are looking for the 8th, so I will write:
Dim myValue As String: myValue = allRowOfData.Cells(7).innerHTML
Why did I write 7 instead of 8? Because the collection of cells starts from 0, so the index of the 8th element is 7 (8-1). Briefly analysing this line of code:
.Cells() makes me access the td elements;
innerHTML is the property of the cell containing the value we look for.
Once we have our value, which is now stored in the myValue variable, we can just close the IE browser and release the memory by setting it to Nothing:
appIE.Quit
Set appIE = Nothing
Well, now you have your value and you can do whatever you want with it: put it into a cell (Range("A1").Value = myValue), or into a label of a form (Me.label1.Text = myValue).
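For reference, here are steps 1 to 5 put together into a single routine, exactly as built above; the row id pair_8907 and the cell index 7 come from inspecting the page with F12 and will need updating if the site changes its markup:
Sub GetUS30YChange()
    Dim appIE As Object, allRowOfData As Object, myValue As String

    Set appIE = CreateObject("internetexplorer.application")
    With appIE
        .Navigate "http://uk.investing.com/rates-bonds/financial-futures"
        .Visible = True
    End With

    Do While appIE.Busy
        DoEvents
    Loop

    Set allRowOfData = appIE.document.getElementById("pair_8907")
    myValue = allRowOfData.Cells(7).innerHTML       'the "change" cell of the US 30Y T-Bond row

    appIE.Quit
    Set appIE = Nothing

    Range("A1").Value = myValue
End Sub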
I'd just like to point out that this is not how Stack Overflow works: here you post questions about specific coding problems, but you should do your own research first. The reason why I'm answering a question which does not show much research effort is that I see it asked several times and, back when I learned how to do this, I remember that I would have liked some better support to get started. So I hope that this answer, which is just a "study input" and not at all the best/most complete solution, can be a support for the next user having your same problem. Because I learned how to program thanks to this community, and I like to think that you and other beginners might use my input to discover the beautiful world of programming.
Enjoy your practice ;)
Other methods were mentioned so let us please acknowledge that, at the time of writing, we are in the 21st century. Let's park the local bus browser opening, and fly with an XMLHTTP GET request (XHR GET for short).
Wiki moment:
XHR is an API in the form of an object whose methods transfer data between a web browser and a web server. The object is provided by the browser's JavaScript environment.
It's a fast method for retrieving data that doesn't require opening a browser. The server response can be read into an HTMLDocument and the process of grabbing the table continued from there.
Note that JavaScript rendered/dynamically added content will not be retrieved, as there is no JavaScript engine running (which there is in a browser).
In the code below, the table is grabbed by its id, cr1.
In the helper sub, WriteTable, we loop the header cells (th tags), then the table rows (tr tags), and finally traverse each row, table cell (td tag) by table cell. As we only want data from the 2nd and 8th columns, a Select Case statement is used to specify what is written out to the sheet.
VBA:
Option Explicit

Public Sub GetRates()
    Dim html As HTMLDocument, hTable As HTMLTable '<== Tools > References > Microsoft HTML Object Library
    Set html = New HTMLDocument

    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", "https://uk.investing.com/rates-bonds/financial-futures", False
        .setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT" 'to deal with potential caching
        .send
        html.body.innerHTML = .responseText
    End With

    Application.ScreenUpdating = False
    Set hTable = html.getElementById("cr1")
    WriteTable hTable, 1, ThisWorkbook.Worksheets("Sheet1")
    Application.ScreenUpdating = True
End Sub

Public Sub WriteTable(ByVal hTable As HTMLTable, Optional ByVal startRow As Long = 1, Optional ByVal ws As Worksheet)
    Dim tSection As Object, tRow As Object, tCell As Object, tr As Object, td As Object, r As Long, C As Long, tBody As Object
    r = startRow: If ws Is Nothing Then Set ws = ActiveSheet

    With ws
        Dim headers As Object, header As Object, columnCounter As Long
        Set headers = hTable.getElementsByTagName("th")
        For Each header In headers
            columnCounter = columnCounter + 1
            Select Case columnCounter
                Case 2
                    .Cells(startRow, 1) = header.innerText
                Case 8
                    .Cells(startRow, 2) = header.innerText
            End Select
        Next header
        startRow = startRow + 1

        Set tBody = hTable.getElementsByTagName("tbody")
        For Each tSection In tBody
            Set tRow = tSection.getElementsByTagName("tr")
            For Each tr In tRow
                r = r + 1
                Set tCell = tr.getElementsByTagName("td")
                C = 1
                For Each td In tCell
                    Select Case C
                        Case 2
                            .Cells(r, 1).Value = td.innerText
                        Case 8
                            .Cells(r, 2).Value = td.innerText
                    End Select
                    C = C + 1
                Next td
            Next tr
        Next tSection
    End With
End Sub
You can use a WinHttpRequest object instead of Internet Explorer. It loads the data without the pictures and advertisements, instead of downloading the full webpage including the advertisements and pictures that make the Internet Explorer object heavy compared to the WinHttpRequest object.
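As a rough sketch of that suggestion (WinHttpRequest is part of Windows, so late binding works without extra references; the table id cr1 is borrowed from the answer above and may change if the site is redesigned):
Sub GetRatesViaWinHttp()
    Dim http As Object, html As Object

    Set http = CreateObject("WinHttp.WinHttpRequest.5.1")
    http.Open "GET", "https://uk.investing.com/rates-bonds/financial-futures", False
    http.send

    Set html = CreateObject("htmlfile")             'late-bound HTML document
    html.body.innerHTML = http.responseText

    'Rows(1).Cells(7) is only an example index; inspect the table and adjust
    Debug.Print html.getElementById("cr1").Rows(1).Cells(7).innerText
End Sub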
This question was asked long ago, but I thought the following information would be useful for newbies. You can easily get values from a class name like this:
Sub ExtractLastValue()
    Set objIE = CreateObject("InternetExplorer.Application")

    objIE.Top = 0
    objIE.Left = 0
    objIE.Width = 800
    objIE.Height = 600
    objIE.Visible = True

    objIE.Navigate ("https://uk.investing.com/rates-bonds/financial-futures/")

    Do
        DoEvents
    Loop Until objIE.readystate = 4

    MsgBox objIE.document.getElementsByClassName("pid-8907-last")(0).innerText
End Sub
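If you need several live values, the same idea can be wrapped in a small function that takes the class name (the pid-xxxx-last class names are something you read off the page with F12, so treat them as assumptions):
Function GetValueByClass(ByVal className As String) As String
    Dim objIE As Object
    Set objIE = CreateObject("InternetExplorer.Application")
    objIE.Visible = False
    objIE.Navigate "https://uk.investing.com/rates-bonds/financial-futures/"
    Do
        DoEvents
    Loop Until objIE.readyState = 4
    GetValueByClass = objIE.document.getElementsByClassName(className)(0).innerText
    objIE.Quit
End Function
For example: Debug.Print GetValueByClass("pid-8907-last").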
And if you are new to web scraping please read this blog post.
Web Scraping - Basics
There are also various techniques to extract data from web pages. This article explains a few of them with examples.
Web Scraping - Collecting Data From a Webpage
I modified some things that were popping up errors for me and ended up with this, which worked great to extract the data as I needed:
Sub get_data_web()
    Dim appIE As Object
    Set appIE = CreateObject("internetexplorer.application")

    With appIE
        .navigate "https://finance.yahoo.com/quote/NQ%3DF/futures?p=NQ%3DF"
        .Visible = True
    End With

    Do While appIE.Busy
        DoEvents
    Loop

    Set allRowofData = appIE.document.getElementsByClassName("Ta(end) BdT Bdc($c-fuji-grey-c) H(36px)")

    Dim i As Long
    Dim myValue As String
    Count = 1

    For Each itm In allRowofData
        For i = 0 To 4
            myValue = itm.Cells(i).innerText
            ActiveSheet.Cells(Count, i + 1).Value = myValue
        Next
        Count = Count + 1
    Next

    appIE.Quit
    Set appIE = Nothing
End Sub

need help scraping with excel vba

I need to scrape the title, product description and product code from <<<HERE>>> and save them into a worksheet. In this case those are:
"Catherine Lansfield Helena Multi Bedspread - Double"
"This stunning ivory bedspread has been specially designed to sit with the Helena bedroom range. It features a subtle floral design with a diamond shaped quilted finish. The bedspread is padded so can be used as a lightweight quilt in the summer or as an extra layer in the winter.
Polyester.
Size L260, W240cm.
Suitable for a double bed.
Machine washable at 30°C.
Suitable for tumble drying.
EAN: 5055184924746.
Product Code 116/4196"
I have tried different methods and none was good for me in the end. With the Mid and InStr functions the result was nothing, though it could be that my code was wrong. Sorry I do not give any code, because I have already messed it up many times and have had no result. I have tried to scrape the whole page with GetDatafromPage. It works well, but for different product pages the output goes to different rows, as the amount of elements changes from page to page. Also it's not possible to scrape only the chosen elements, so it is pointless to read values from fixed cells.
Another option, instead of using the InternetExplorer object, is the XMLHTTP object. Here is a similar example to kekusemau's, but using the XMLHTTP object to request the page and then loading the responseText from the XMLHTTP object into an htmlfile document.
Sub test()
    Dim xml As Object
    Set xml = CreateObject("MSXML2.XMLHTTP")
    xml.Open "Get", "http://www.argos.co.uk/static/Product/partNumber/1164196.htm", False
    xml.send

    Dim doc As Object
    Set doc = CreateObject("htmlfile")
    doc.body.innerhtml = xml.responsetext

    Dim name
    Set name = doc.getElementById("pdpProduct").getElementsByTagName("h1")(0)
    MsgBox name.innerText

    Dim desc
    Set desc = doc.getElementById("genericESpot_pdp_proddesc2colleft").getElementsByTagName("div")(0)
    MsgBox desc.innerText

    Dim id
    Set id = doc.getElementById("pdpProduct").getElementsByTagName("span")(0).getElementsByTagName("span")(2)
    MsgBox id.innerText
End Sub
This seems to be not too difficult. You can use Firefox to take a look at the page structure (right-click somewhere and click inspect element, and go on from there...)
Here is a simple sample code:
Sub test()
    Dim ie As InternetExplorer
    Dim x

    Set ie = New InternetExplorer
    ie.Visible = True
    ie.Navigate "http://www.argos.co.uk/static/Product/partNumber/1164196.htm"

    While ie.ReadyState <> READYSTATE_COMPLETE
        DoEvents
    Wend

    Set x = ie.Document.getElementById("pdpProduct").getElementsByTagName("h1")(0)
    MsgBox Trim(x.innerText)

    Set x = ie.Document.getElementById("genericESpot_pdp_proddesc2colleft").getElementsByTagName("div")(0)
    MsgBox x.innerText

    Set x = ie.Document.getElementById("pdpProduct").getElementsByTagName("span")(0).getElementsByTagName("span")(2)
    MsgBox x.innerText

    ie.Quit
End Sub
(I have a reference in Excel to Microsoft Internet Controls. I don't know if that is set by default; if not, you have to add it first to run this code.)