MS Access web scraping: VBA runtime error 91

I have a small MS Access (Office 365) program to perform web page scrapes and store the results in Access tables. The program looks up a web page then reads in any menu links. However, when trying to do this I keep receiving the error:
Object variable not set (Error 91)
And this line gets highlighted:
Set arrAreas = vDiv.getElementsByClassName("rmLink rmRootLink")
This is the full code:
Private Function fx_Read_List()
'------------< fx_Read_List() >------------
'< get Menu >
Dim vDiv As HTMLDivElement
Set vDiv = hdoc.getElementById("ctl00_ctlMenu")
'< get Menu-Area Links >
Dim arrAreas As IHTMLElementCollection
Set arrAreas = vDiv.getElementsByClassName("rmLink rmRootLink")
'</ get Menu-Area Links >
'< save in Array >
Dim arrLinks() As String
ReDim arrLinks(100)
Dim iLink As Integer
For iLink = 0 To arrAreas.length - 1
Dim aLink As HTMLAnchorElement
Set aLink = arrAreas(iLink)
arrLinks(iLink) = aLink.hRef
Next
ReDim Preserve arrLinks(iLink - 1)
'</ save in Array >
What am I missing?
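Error 91 on that line usually means `vDiv` is `Nothing`: `getElementById` returns `Nothing` when the document contains no element with that id, for example because the page has not finished loading or the menu sits inside a frame. Below is a minimal sketch of a guarded version. It assumes `hdoc` is passed in as an already loaded document and that the page renders in a document mode that supports `getElementsByClassName`. The name `ReadMenuLinks` and the late-bound `Object` types are illustrative choices, not part of the original code.

```vba
' Hypothetical helper: returns the hrefs of the root menu links,
' or an empty Collection if the menu container is missing.
Private Function ReadMenuLinks(ByVal hdoc As Object) As Collection
    Dim links As New Collection
    Dim vDiv As Object
    Dim arrAreas As Object
    Dim iLink As Long

    Set ReadMenuLinks = links
    If hdoc Is Nothing Then Exit Function

    ' getElementById yields Nothing when the id is not in the document
    Set vDiv = hdoc.getElementById("ctl00_ctlMenu")
    If vDiv Is Nothing Then
        Debug.Print "Menu container 'ctl00_ctlMenu' not found"
        Exit Function
    End If

    ' Safe to dereference vDiv from here on
    Set arrAreas = vDiv.getElementsByClassName("rmLink rmRootLink")
    For iLink = 0 To arrAreas.Length - 1
        links.Add arrAreas.Item(iLink).href
    Next iLink
End Function
```

If the message is printed, checking that the page has fully loaded before the call, and that the id really appears in the loaded HTML, would be the next step.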

Related

VBA : Finding nested elements in HTML Document

I am trying to scrape all the href links of products from this link.
I am using the following code to get the product links on the page:
Sub urlCatch()
Dim internet As Object
Dim internetdata As Object
Dim div_result As Object
Dim header_links As Object
Dim link, itm As Object
Dim url As String
Dim X As Variant
Set internet = CreateObject("InternetExplorer.Application")
internet.Visible = True
url = "http://www.dell.com/vg/p/desktops.aspx?c=vg&cs=vgdhs1&l=en&s=dhs&~ck=mn"
internet.Navigate url
Do Until internet.ReadyState >= 4
DoEvents
Loop
Set internetdata = internet.document
Set div_result = internetdata.getelementsbyclassname("categorySubNavigation").getelementsbyclassname("c4 seriesOptions")
Set header_links = div_result.getelementsbytagname("a")
For Each itm In header_links
Set link = itm.ChildNodes.Item(0)
Cells(Range("A" & Rows.Count).End(xlUp).Row + 1, 1) = link.href
Next
End Sub
I am getting an error at div_result: "object doesn't support this property or method".
This is the first time I am using an element to retrieve something, so I might be making mistakes which I am currently unaware of.
Please look into my code, and let me know what blunder I am making here.
I am told by one of my friends that I might need to use a regex to get all these links, but I wanted to get a hold of this method first.
Please give me some guidance. Thanks.
getElementsByTagName and the other fetch methods work on a single element, not on a collection. getElementsByClassName returns a collection, so you need another loop over its items.
Sub urlCatch()
Dim url As String
Dim internet As Object
Dim internetdata
Dim div_result
Dim links
Dim itm
Dim itm2
Set internet = CreateObject("InternetExplorer.Application")
internet.Visible = True
url = "http://www.dell.com/vg/p/desktops.aspx?c=vg&cs=vgdhs1&l=en&s=dhs&~ck=mn"
internet.Navigate url
Do
DoEvents
Loop Until internet.ReadyState >= 4 And Not internet.busy
Set internetdata = internet.document.body
Set div_result = internetdata.getelementsbyclassname("c4 seriesOptions")
For Each itm In div_result
Set links = itm.getElementsByTagName("A")
For Each itm2 In links
Cells(Range("A" & Rows.Count).End(xlUp).Row + 1, 1) = itm2.href
Next
Next
End Sub

DraftSight API - MoveEntities via VB.Net

I have a VB.Net program that will open a .dwg and then give the user the ability to add “part blocks” to the drawing.
However, I need to be able to shift all the components in the drawing at one time.
How do I “Select All” from the API in DraftSight? The code below is what I have got started.
'Connect to DraftSight
dsApp = GetObject(, "DraftSight.Application")
'Get active document
dsDoc = dsApp.GetActiveDocument()
If Not dsDoc Is Nothing Then
'Get model space
dsModel = dsDoc.GetModel()
'Get Sketch Manager
dsSketchManager = dsModel.GetSketchManager()
'Basic Command I want to use to Move Selection
' - MOVE
' - Specify Entities = SelectALL
' - Specify Entities = "Enter"
' - Specify from Point = insertX, insertY
' - Specify destination = 0,0
'Move instance parameters
Dim Move_Instance As ISketchManager
Dim insertX As Double
insertX = 5.0
Dim insertY As Double
insertY = 10.0
Dim insertZ As Double
insertZ = 0.0
Dim SelectALL As Object
'Move Selection
Move_Instance = dsApp.GetActiveDocument().GetModel().GetSketchManager.MoveEntities(insertX, insertY, insertZ, SelectALL)
Else
MsgBox("There are no open documents in DraftSight")
End If
After some trial and error, I decided that one way to do what I am looking for is to just use the “RunCommand” feature. Below is what I did:
Dim dsApp As DraftSight.Application
Dim dsDoc As DraftSight.Document
Dim dsModel As DraftSight.Model
Dim dsSketchManager As DraftSight.SketchManager
Dim dsViewManager As DraftSight.ViewManager
'Connect to DraftSight
dsApp = GetObject(, "DraftSight.Application")
'Abort any command currently running in DraftSight
'to avoid nested commands
dsApp.AbortRunningCommand()
'Get active document
dsDoc = dsApp.GetActiveDocument()
If Not dsDoc Is Nothing Then
'Get model space
dsModel = dsDoc.GetModel()
'Get Sketch Manager
dsSketchManager = dsModel.GetSketchManager()
Dim dsSheet As DraftSight.Sheet
Dim dsVarSheets As Object
dsVarSheets = dsDoc.GetSheets
dsSheet = dsVarSheets(1)
If dsSheet Is Nothing Then
Return
End If
'Get View Manager
dsViewManager = dsDoc.GetViewManager()
'Select All and Move to location 0,0
Dim Move As Integer
Move = dsApp.RunCommand("MOVE ALL -5.0,-10.0 0.0,0.0", False)
Else
MsgBox("There are no open documents in DraftSight")
End If

Collect images url and data from webpage table to Excel table

I need to collect data from a table on a webpage; some table cells contain images.
The code should copy the data to Excel and, if a cell contains an image, write the image's src link instead. My code is below, but it is not working: I don't know how to detect whether a cell has an image in it and add its src link to the Excel cell.
Sub extractData()
Dim IE As Object, obj As Object
Dim myYear As String
Dim r As Integer, c As Integer, t As Integer
Dim elemCollection As Object
Set IE = CreateObject("InternetExplorer.Application")
myYear = InputBox("Enter year")
With IE
.Visible = True
.navigate ("url")
While IE.ReadyState <> 4
DoEvents
Wend
For Each obj In IE.Document.All.Item("Year").Options
If obj.innerText = myYear Then
obj.Selected = True
End If
Next obj
IE.Document.getElementsByName("btn_search").Item.Click
Do While IE.busy: DoEvents: Loop
ThisWorkbook.Sheets("Sheet1").Range("A1:K2000").ClearContents
Set elemCollection = IE.Document.getElementsByTagName("TABLE")
For t = 0 To (elemCollection.Length - 9)
For r = 0 To (elemCollection(t).Rows.Length - 1)
For c = 0 To (elemCollection(t).Rows(r).Cells.Length - 1)
If elemCollection(t).Rows(r).Cells(c).innerText = "" Then
ThisWorkbook.Worksheets(1).Cells(r + 1, c + 1) = elemCollection(t).Rows(r).Cells(c).getAttribute("src")
Exit For
End If
Next
ThisWorkbook.Worksheets(1).Cells(r + 1, c + 1) = elemCollection(t).Rows(r).Cells(c).innerText
Next c
Next r
Next t
End With
Set IE = Nothing
End Sub
First, brush up on the HTML Document Object Model. There are tons of tutorials on how to use JavaScript to work with the DOM, and VBA is very similar (because the DOM doesn't change based on language, and VBA is closely related to VBScript, which is similar to JavaScript). Second, if you get an error but no line is highlighted when you click the Debug button, step through your code line by line with the F8 key. This will let you figure out where the error is occurring. Third, add a reference to the Microsoft HTML Object Library so you can use IntelliSense for code hints.
It's tough to give an actual solution without seeing the HTML source so instead I'll give some pointers:
Use IE.Document.Body.getElementsByTagName("TABLE") (note the addition of BODY to narrow the scope) to get a collection of every table on the page. With a reference to the HTML Objects Lib you can do something like this:
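Dim oTable As HTMLTable
Dim oCell As HTMLTableCell
Dim oImg As HTMLImg
Dim strSrc As String
'oTable must first be Set to one table from the collection above
For Each oCell In oTable.Cells
strSrc = ""
Set oImg = Nothing
On Error Resume Next
'getElementsByTagName returns a collection, so take its first item
Set oImg = oCell.getElementsByTagName("img")(0)
strSrc = oImg.src
On Error GoTo 0
If strSrc <> "" Then Debug.Print strSrc
Next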
This should (I did not test it) loop through every cell in a table and attempt to get an img element. If it fails, no biggie, just continue to the next cell. If you want to use late binding after you get it working, remove the HTML Obj Lib reference then simply dim everything as an object. Eg:
Dim oTable As Object 'HTMLTable
Dim oCell As Object 'HTMLTableCell
Dim oImg As Object 'HTMLImg
Dim strSrc As String

How to scrape data from the following table format VBA

I am trying to scrape the table contents from the first page to the last page of this webpage.
Using the code below I can scrape the table contents of page 1, but I don't know how to modify the code to get the data from the first page to the last.
Option Explicit
Sub NBAStats()
Dim IE As Object, obj As Object
Dim r As Integer, c As Integer, t As Integer
Dim elemCollection As Object
Set IE = CreateObject("InternetExplorer.Application")
With IE
.Visible = True
.navigate ("http://stats.nba.com/league/player/#!/")
While IE.ReadyState <> 4
DoEvents
Wend
Do While IE.busy: DoEvents: Loop
ThisWorkbook.Sheet1.Clear
Set elemCollection = IE.Document.getElementsByTagName("TABLE")
For t = 0 To (elemCollection.Length - 1)
For r = 0 To (elemCollection(t).Rows.Length - 1)
For c = 0 To (elemCollection(t).Rows(r).Cells.Length - 1)
ThisWorkbook.Worksheets(1).Cells(r + 1, c + 1) = elemCollection(t).Rows(r).Cells(c).innerText
Next c
Next r
Next t
End With
Set IE = Nothing
End Sub
Try to find the sitemap.xml of the website you are scraping. The sitemap.xml will have all the links present in the website.
Import that XML file into your Excel sheet, read each link, and fetch each table from it.
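As a rough sketch of the sitemap idea, the following reads every `<loc>` entry from a sitemap document with MSXML. The XML string is a made-up sample standing in for a downloaded sitemap.xml, and `ListSitemapUrls` is a hypothetical name. Not every site publishes a sitemap, so this only applies where one exists.

```vba
' Sketch: list every <loc> URL in a sitemap document.
' The XML below is a made-up sample; in practice it would be downloaded.
Sub ListSitemapUrls()
    Dim doc As Object
    Dim node As Object
    Dim sampleXml As String

    sampleXml = "<urlset xmlns=""http://www.sitemaps.org/schemas/sitemap/0.9"">" & _
                "<url><loc>http://example.com/page1</loc></url>" & _
                "<url><loc>http://example.com/page2</loc></url>" & _
                "</urlset>"

    Set doc = CreateObject("MSXML2.DOMDocument.6.0")
    doc.async = False
    ' Sitemaps declare a default namespace, so XPath needs a prefix for it
    doc.setProperty "SelectionNamespaces", _
        "xmlns:sm='http://www.sitemaps.org/schemas/sitemap/0.9'"

    If Not doc.loadXML(sampleXml) Then
        Debug.Print "Parse error: " & doc.parseError.reason
        Exit Sub
    End If

    ' Print each page URL; a real scraper would navigate to it instead
    For Each node In doc.selectNodes("//sm:loc")
        Debug.Print node.Text
    Next node
End Sub
```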
First, in my opinion VBA automation of Internet Explorer is highly unstable and not really practicable for production use. That also means that scraping data from web sites which provide it only for viewing in a browser is not really practicable for production use. If you are entitled to use the data, you should ask for another data source (XML or JSON, for example). If you are not entitled, you should not do it; the provider of the web site may not agree with it.
To be clear, I'm talking about web sites like this one, which provides its data through JavaScript only. If the data were in the HTML itself, you could get it via XMLHTTP. That is a different matter.
I will nevertheless provide a "solution". So you can't simply think "He is simply unable to do this, so he is saying you should not do that."
So you must analyze the site and pick out the elements you can click for navigation.
Option Explicit
Sub NBAStats()
Dim IE As Object
Dim r As Long, c As Long, t As Long, rSheet As Long, rStart As Long
Dim bReady As Boolean
Dim elementsTable As Object
Dim elementsPageNavRigth As Object
Dim elemPageNavRigth As Object
Dim elementsTableDiv As Object
ThisWorkbook.Worksheets(1).Cells.Clear
Set IE = CreateObject("InternetExplorer.Application")
With IE
.Visible = True
.navigate ("http://stats.nba.com/league/player/#!/")
Do While IE.busy
DoEvents
Loop
rSheet = 0
Do
Do While elementsTableDiv Is Nothing
Set elementsTableDiv = IE.Document.getElementsByClassName("table-responsive")
DoEvents
Loop
Do While elementsTableDiv(0) Is Nothing
DoEvents
Loop
Set elementsPageNavRigth = IE.Document.getElementsByClassName("page-nav right")
Set elemPageNavRigth = elementsPageNavRigth(0)
If elemPageNavRigth.className = "page-nav right disabled" Then bReady = True
'If rSheet = 0 Then rStart = 0 Else rStart = 1
Set elementsTable = elementsTableDiv(0).getElementsByTagName("TABLE")
For r = rStart To (elementsTable(0).Rows.Length - 1)
For c = 0 To (elementsTable(0).Rows(r).Cells.Length - 1)
ThisWorkbook.Worksheets(1).Cells(r + rSheet + 1, c + 1) = elementsTable(t).Rows(r).Cells(c).innerText
Next c
Next r
rSheet = rSheet + r
If Not elemPageNavRigth Is Nothing Then elemPageNavRigth.Click
Set elementsTableDiv = Nothing
Loop Until bReady Or elemPageNavRigth Is Nothing
End With
Set IE = Nothing
End Sub

Copying from Internet Explorer text area (box) but into more than a single cell

I'm currently trying to automate a postcode lookup website from postcodes stored and updated in Excel, and my code works perfectly up to the point where it has to copy the data once it's finished. For the life of me I can't figure out how to copy the data from the text area into Excel without it ALL ending up in one cell (Text to Columns doesn't really work either).
The website is : http://www.doogal.co.uk/DrivingDistances.php
Sub Geo2()
Dim sht As Worksheet
Dim IE As Object
'Dim ieDoc As HTMLDocument
Dim Item As Variant
Dim objElement As Object
Dim startLoc As String
Dim endLoc As String
Dim x As Integer
Dim objNotes As Object
Dim strNotes As String
Dim str As String
'Dim SignInButton As HTMLInputButtonElement
Set sht = ThisWorkbook.Sheets("Postcode")
Set IE = CreateObject("InternetExplorer.Application")
'Open IE
IE.Visible = True
IE.Navigate "http://www.doogal.co.uk/DrivingDistances.php"
'Wait until site is loaded
Do While IE.ReadyState <> 4 'READYSTATE_COMPLETE
DoEvents
Loop
IE.Document.getElementbyID("startLocs").Value = "dn1 5pq" 'random postcode
IE.Document.getElementbyID("endLocs").Value = "wf12 2fd" 'random postcode
IE.Document.getElementsByName("calculateFor").Item(1).Checked = True
IE.Document.getElementsByName("units").Item(1).Checked = True
IE.Document.getElementsByClassName("btn btn-primary").Item(0).Click
'I've tried without having it as an object and using .Value, but I get either only the first line or the entire thing rammed into one unusable string
'---- The code here is the problem ----
Set objNotes = IE.Document.getElementbyID("distances")
str = objNotes.Value
Do While IE.ReadyState <> 4 'READYSTATE_COMPLETE
DoEvents
Loop
End Sub
The following VBA function uses the Google Maps Directions API to calculate the driving distance in meters between two locations. The code is modified from a version submitted by barrowc on this similar question.
Make sure to add a reference in Excel to Microsoft XML, v6.0.
Function getDistance(origin As String, destination As String) As String
Dim xhrRequest As XMLHTTP60
Dim domDoc As DOMDocument60
Dim ixnlDistanceNode As IXMLDOMNode
Dim RequestString As String
Dim API_Key As String
' Insert your own Google Maps Directions API key here
API_Key = "XXXXXX"
' Read the data from the website
Set xhrRequest = New XMLHTTP60
RequestString = "https://maps.googleapis.com/maps/api/directions/xml?origin=" _
& origin & "&destination=" & destination & "&sensor=false&key=" & API_Key
xhrRequest.Open "GET", RequestString, False
xhrRequest.send
' Copy the results into a format we can manipulate with XPath
Set domDoc = New DOMDocument60
domDoc.LoadXML xhrRequest.responseText
' Select the node called value underneath the leg and distance parents.
' The distance returned is the driving distance in meters.
Set ixnlDistanceNode = domDoc.SelectSingleNode("//leg/distance/value")
getDistance = ixnlDistanceNode.Text
Set ixnlDistanceNode = Nothing
Set domDoc = Nothing
Set xhrRequest = Nothing
End Function
Please note that this code by itself violates the Terms of Use of Google's API. "The Google Maps Directions API may only be used in conjunction with displaying results on a Google map; using Directions data without displaying a map for which directions data was requested is prohibited."
Instead of putting the data all in one string, Split the string into an array, then loop through the array like this:
Set objNotes = IE.Document.getElementbyID("distances")
Dim x As Integer
Dim aDist() As String 'Split returns a String array; a Variant() array would raise a type mismatch
aDist = Split(objNotes.Value, vbNewLine) 'May need to be vbCr or vbLf or vbCrLf
For x = LBound(aDist) To UBound(aDist)
Debug.Print aDist(x)
Next x
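Since the goal was to spread the text over several cells, the same split can feed the worksheet directly. The sketch below normalises the line endings first, so it does not matter whether the text area uses vbCrLf, vbCr, or vbLf. `SplitLines` and `WriteLinesToSheet` are hypothetical names, and the sample string is made up to stand in for `objNotes.Value`.

```vba
' Hypothetical helper: split multi-line text into lines, whatever the line ending.
Function SplitLines(ByVal rawText As String) As String()
    ' Normalise CRLF and lone CR to LF, then split on LF
    rawText = Replace(rawText, vbCrLf, vbLf)
    rawText = Replace(rawText, vbCr, vbLf)
    SplitLines = Split(rawText, vbLf)
End Function

Sub WriteLinesToSheet()
    Dim parts() As String
    Dim i As Long
    Dim sample As String

    ' Made-up sample standing in for objNotes.Value
    sample = "first line of results" & vbCrLf & "second line of results"
    parts = SplitLines(sample)

    For i = LBound(parts) To UBound(parts)
        ' One line per row, starting at A1 of the active sheet
        ActiveSheet.Cells(i + 1, 1).Value = parts(i)
    Next i
End Sub
```

If each line also contains delimited fields, a second `Split` on that delimiter inside the loop would spread them across columns.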