How to get the meta name keywords in VBA

I am trying to get the meta name keywords from a webpage:
<meta name="keywords" content="Mitch Albom,For One More Day,Little, Brown Book Group,0751537535,Fiction / General,General & Literary Fiction,Modern & contemporary fiction (post c 1945),USA">
I need to read the value of its content attribute. Any help is appreciated.
Option Explicit
Sub GetData()
Dim ie As New InternetExplorer
Dim str As String
Dim wk As Worksheet
Dim webpage As New HTMLDocument
Dim item As HTMLHtmlElement
Set wk = Sheet1
str = wk.Range("Link").value
ie.Visible = True
ie.Navigate str
Do
DoEvents
Loop Until ie.ReadyState = READYSTATE_COMPLETE
Dim Doc As HTMLDocument
Set Doc = ie.Document
Dim kwd As String
kwd = Trim(Doc.getElementsByTagName("keywords").innerText)
MsgBox kwd
End Sub

A straightforward way to do that is to find the meta element whose name is "keywords" and read its Content property. You can do it like this:
Option Explicit
Sub GetData()
Dim ie As New InternetExplorer
Dim str As String
Dim wk As Worksheet
Dim webpage As New HTMLDocument
Dim item As HTMLHtmlElement
Set wk = Sheet1
str = wk.Range("Link").value
ie.Visible = True
ie.Navigate str
Do
DoEvents
Loop Until ie.ReadyState = READYSTATE_COMPLETE
'Find the proper meta element --------------
Const META_TAG As String = "META"
Const META_NAME As String = "keywords"
Dim Doc As HTMLDocument
Dim metaElements As Object
Dim element As Object
Dim kwd As String
Set Doc = ie.Document
Set metaElements = Doc.all.tags(META_TAG)
For Each element In metaElements
If element.Name = META_NAME Then
kwd = element.Content
End If
Next
MsgBox kwd
End Sub
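The loop above keeps scanning after it finds a match and compares the name case-sensitively. As a minimal sketch, the same lookup can be wrapped in a reusable function that stops at the first match and ignores case. The name GetMetaContent is hypothetical, and the document is passed late-bound (As Object) on the assumption that it comes from ie.Document:

```vba
' Hypothetical helper: returns the content attribute of the first
' <meta> element whose name matches metaName (case-insensitive),
' or an empty string when no such element exists.
Public Function GetMetaContent(ByVal doc As Object, ByVal metaName As String) As String
    Dim element As Object

    For Each element In doc.getElementsByTagName("meta")
        ' getAttribute can return Null for a missing attribute;
        ' appending vbNullString turns Null into "".
        If StrComp(element.getAttribute("name") & vbNullString, metaName, vbTextCompare) = 0 Then
            GetMetaContent = element.getAttribute("content") & vbNullString
            Exit Function
        End If
    Next element
End Function
```

Inside GetData it could be called as kwd = GetMetaContent(ie.Document, "keywords") once the page has finished loading.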

Related

Type Mismatch on one machine

I wrote some code to scrape data from a website. I've tested it on 5 different machines with different versions of Excel and it all works fine. But on the intended user's machine we get a type mismatch error. The code fails at the last line below.
Sub LogIn()
Dim ie As SHDocVw.InternetExplorer
Dim iDoc As MSHTML.HTMLDocument
Dim ele As MSHTML.IHTMLElement
Dim eles As MSHTML.IHTMLElementCollection
Dim tableSection As MSHTML.IHTMLElement
Dim tableRow As MSHTML.IHTMLElement
Dim tableCell As MSHTML.IHTMLElement
Dim smallCell As MSHTML.IHTMLElement
Dim iCol As Integer
Dim iRow As Integer
Dim iCounter As Integer
iRow = 0
Do
iRow = iRow + 1
Loop Until Cells(iRow, 5) = ""
Range(Cells(1, 5), Cells(iRow, 6)).ClearContents
Set ie = New InternetExplorer
ie.Visible = False
ie.navigate ("https://www.howdidido.com/")
Do While ie.readyState <> READYSTATE_COMPLETE
Loop
Set iDoc = ie.document
Any help is greatly appreciated.
I tried the following code and it works for me, so maybe it can help you. It seems that two loops with DoEvents are needed to wait reliably until the ready state is complete.
Dim iDoc As MSHTML.HTMLDocument
Dim iCol As Integer
Dim iRow As Integer
Dim iCounter As Integer
Dim i As Long
Dim URL As String
Dim IE As Object
Dim objElement As Object
Dim objCollection As Object
Set IE = CreateObject("InternetExplorer.Application")
'Set IE.Visible = True to make IE visible, or False for IE to run in the background
IE.Visible = True
'Define URL
URL = "https://www.automateexcel.com/excel/"
'Navigate to URL
IE.Navigate URL
' The status bar lets the user know the website is loading
Application.StatusBar = URL & " is loading. Please wait..."
' Wait while IE loading...
'IE ReadyState = 4 signifies the webpage has loaded (the first loop is set to avoid inadvertently skipping over the second loop)
Do While IE.ReadyState = 4: DoEvents: Loop 'Do While
Do Until IE.ReadyState = 4: DoEvents: Loop 'Do Until
'Webpage Loaded
Application.StatusBar = URL & " Loaded"
Set iDoc = IE.Document
'Unload IE
Set IE = Nothing
Set objElement = Nothing
Set objCollection = Nothing
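The two wait loops above never give up if the page fails to load. A minimal sketch of a wait helper with a timeout follows. It is a substitute for the two-loop pattern, not the answerer's method: it checks Busy as well as ReadyState, which is an assumption about what is sufficient for a given site. The name WaitForPage and the 30-second default are hypothetical:

```vba
' Hypothetical helper: blocks until the browser reports that it is idle
' and fully loaded, or raises an error after timeoutSeconds.
Public Sub WaitForPage(ByVal browser As Object, Optional ByVal timeoutSeconds As Long = 30)
    Const READY_COMPLETE As Long = 4   ' same value as READYSTATE_COMPLETE
    Dim deadline As Date

    deadline = DateAdd("s", timeoutSeconds, Now)

    Do While browser.Busy Or browser.ReadyState <> READY_COMPLETE
        DoEvents   ' keep Excel responsive while waiting
        If Now > deadline Then
            Err.Raise vbObjectError + 513, "WaitForPage", _
                      "Page did not finish loading within " & timeoutSeconds & " seconds."
        End If
    Loop
End Sub
```

It would be called right after IE.Navigate URL, for example WaitForPage IE, 60, before reading IE.Document.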

Scraping data from a website using VBA doesn't work

I want to create a list of player names by scraping a website. Internet Explorer starts, but I get run-time error 438: "Object doesn't support this property or method".
Structure of webpage
My coding is as follows:
Option Explicit
Sub Kickbase()
Dim IE As New SHDocVw.InternetExplorer
Dim HTMLdoc As MSHTML.HTMLDocument
Dim HTMLPlayers As MSHTML.IHTMLElementCollection
Dim HTMLPlayer As MSHTML.IHTMLElement
Dim i As Integer
Dim HTMLfirstName As Object
Dim firstName As String
IE.Visible = True
IE.Navigate "https://play.kickbase.com/transfermarkt/kaufen"
Do While IE.ReadyState <> READYSTATE_COMPLETE
Loop
Application.Wait (Now + TimeValue("0:00:10"))
Set HTMLdoc = IE.Document
Set HTMLPlayers = HTMLdoc.getElementsByClassName("players")
For i = 0 To HTMLPlayers(0).getElementsByClassName("firstName").Length - 1
Set HTMLfirstName = HTMLPlayers(0).getElementsByClassName("firstName")
If Not HTMLfirstName Is Nothing Then
firstName = Trim(HTMLfirstName.innerText)
Else
firstName = "no_value"
End If
Debug.Print firstName
Next i
End Sub
I have activated the following libraries:
I can't test against the website myself, so the code below may not be the best way to do it, but it should work:
Sub Kickbase()
Dim IE As New SHDocVw.InternetExplorer
Dim HTMLdoc As MSHTML.HTMLDocument
Dim HTMLPlayers As Object
Dim i As Integer
Dim firstName As String
IE.Visible = True
IE.navigate "https://play.kickbase.com/transfermarkt/kaufen"
Do While IE.readyState <> READYSTATE_COMPLETE
DoEvents
Loop
Application.Wait (Now + TimeValue("0:00:10"))
Set HTMLdoc = IE.document
Set HTMLPlayers = HTMLdoc.getElementsByClassName("playerName")
For i = 0 To HTMLPlayers(0).getElementsByClassName("firstName").Length - 1
firstName = Trim$(HTMLPlayers(0).getElementsByClassName("firstName")(i).innerText)
If firstName = vbNullString Then firstName = "no_value"
Debug.Print firstName
Next i
'=== Optional depending on your use case, remember to close IE or else it will remain there ===
'IE.Quit
'Set IE = Nothing
End Sub

Internet Explorer VBA Automation Error: The object Invoked has disconnected from its clients

I'm trying to write code that will read a value from Excel, look it up in an internal web-based system and store the results back in Excel. It reads the workbook with no problem and opens Internet Explorer with no problem, but when I then try to reference what's been opened, I get the above error. The line "ie.Navigate url" works, but the next line "Set DOC = ie.Document" generates the error. Any ideas on what's causing this? Here's my code:
Public Sub getClient()
Dim xOpen As Boolean
xOpen = False
Dim row As Long
Dim xL As Excel.Application
Set xL = New Excel.Application
xL.Visible = False
Dim wb As Excel.Workbook
Dim sh As Excel.Worksheet
'Change the name as needed, or put in some facility to input it or
'process multiples...
Dim filename As String
filename = "auditLookup.xlsx"
Set wb = xL.Workbooks.Open(getPath("Audit") + filename)
xOpen = True
Set sh = wb.Sheets(1)
Dim ie As Variant
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = True
Dim DOC As HTMLDocument
Dim idx As Integer
Dim data As String
Dim links As Variant
Dim lnk As Variant
Dim iRow As Long
iRow = 2 'Assume headers
Dim clientName As String
Dim clientID As String
Dim nameFound As Boolean
Dim idFound As Boolean
Dim url As String
While sh.Cells(iRow, 1) <> ""
'Just in case these IDs are ever prefixed with zeroes, I'm inserting
'some random character in front, but removing it of course when
'processing.
url = "https://.../" + mid(sh.Cells(iRow, 1), 2)
ie.navigate url
Set DOC = ie.Document
'Search td until we find "Name:" then the next td will be the name.
'Then search for "P1 ID (ACES):" and the next td will be that.
Set links = DOC.getElementsByTagName("td")
clientName = ""
clientID = ""
nameFound = False
idFound = False
For Each lnk In links
data = lnk.innerText
If nameFound Then
clientName = data
ElseIf idFound Then
clientID = data
End If
If nameFound And idFound Then
Exit For
End If
If data = "Name:" Then
nameFound = True
ElseIf data = "P1 ID (ACES):" Then
idFound = True
End If
Next
sh.Cells(iRow, 2) = clientName
sh.Cells(iRow, 3) = clientID 'separate column so the ID does not overwrite the name
iRow = iRow + 1
Wend
Set ie = Nothing
If xOpen Then
wb.Save
Set wb = Nothing
xL.Quit
Set xL = Nothing
Set sh = Nothing
xOpen = False
End If
Exit Sub
Changing to:
Dim ie As InternetExplorer
Set ie = New InternetExplorerMedium
...
Solved the problem. I also needed to add back the Do loop mentioned in the comments:
Do
DoEvents
Loop Until ie.ReadyState = READYSTATE_COMPLETE

VBA Script pull data from website

I want to pull the data from http://www.buyshedsdirect.co.uk/ to get the most recent prices of specific items.
I have an excel spreadsheet with the following:
  | A                                                   | B
1 | Item                                                | Price
2 | bfd/garden-structures/arches/premier-arches-pergola |
and the VBA script:
Dim ie As New InternetExplorer
Dim item As String
item = Sheet1.Range("A2").Value
Dim doc As HTMLDocument
ie.Visible = True
ie.navigate "http://www.buyshedsdirect.co.uk/" & item
Do
DoEvents
Loop Until ie.readyState = READYSTATE_COMPLETE
Set doc = ie.document
On Error Resume Next
output = doc.getElementByClass("NowValue").innerText
Sheet1.Range("B2").Value = output
ie.Quit
End Sub
I am new to VBA scripting and have no idea why it isn't pulling the value from the class "NowValue".
Any help would be appreciated :)
The On Error Resume Next line is stopping an error message from being displayed. That error message would be that there is no method on HTMLDocument called "getElementByClass". You probably want "getElementsByClassName" instead and will have to handle the fact that this returns a collection rather than a single element. Code like this would work:
Option Explicit
Sub foo()
Dim ie As New InternetExplorer
Dim item As String
item = Sheet1.Range("A2").Value
Dim doc As HTMLDocument
ie.Visible = True
ie.navigate "http://www.buyshedsdirect.co.uk/" & item
Do
DoEvents
Loop Until ie.readyState = READYSTATE_COMPLETE
Set doc = ie.document
Dim results As IHTMLElementCollection
Dim result As IHTMLElement
Dim output As String
Set results = doc.getElementsByClassName("NowValue")
output = ""
For Each result In results
output = output & result.innerText
Next result
Sheet1.Range("B2").Value = output
ie.Quit
End Sub
You would then find that there are multiple elements with class "NowValue" on that page. It looks as though the one you want is enclosed in a div with class "VariantPrice", so this code should work:
Option Explicit
Sub bar()
Dim ie As New InternetExplorer
Dim item As String
item = Sheet1.Range("A2").Value
Dim doc As HTMLDocument
ie.Visible = True
ie.navigate "http://www.buyshedsdirect.co.uk/" & item
Do
DoEvents
Loop Until ie.readyState = READYSTATE_COMPLETE
Set doc = ie.document
Dim results As IHTMLElementCollection
Dim results2 As IHTMLElementCollection
Dim result As IHTMLElement
Dim result2 As IHTMLElement
Dim output As String
Set results = doc.getElementsByClassName("VariantPrice")
output = ""
For Each result In results
Set results2 = result.getElementsByClassName("NowValue")
For Each result2 In results2
output = output & result2.innerText
Next result2
Next result
Sheet1.Range("B2").Value = output
ie.Quit
End Sub
Edit: the code above works perfectly for me but fails for the question asker, so they may be using an older version of Internet Explorer that does not support getElementsByClassName. In that case querySelector may work instead. To be certain, check the QuirksMode compatibility tables to see exactly what your browser supports.
New code using querySelector:
Option Explicit
Sub bar()
Dim ie As New InternetExplorer
Dim doc As HTMLDocument
Dim result As IHTMLElement
Dim result2 As IHTMLElement
Dim item As String
item = Sheet1.Range("A2").Value
ie.Visible = True
ie.navigate "http://www.buyshedsdirect.co.uk/" & item
Do
DoEvents
Loop Until ie.readyState = READYSTATE_COMPLETE
Set doc = ie.document
Set result = doc.querySelector(".VariantPrice")
Set result2 = result.querySelector(".NowValue")
Sheet1.Range("B2").Value = result2.innerText
ie.Quit
End Sub
further edit: to make the macro loop through all of the entries in column A, here are the relevant bits to add or change:
Option Explicit
Sub bar()
Dim ie As New InternetExplorer
Dim doc As HTMLDocument
Dim result As IHTMLElement
Dim result2 As IHTMLElement
Dim item As String
Dim lRow As Long
ie.Visible = True
lRow = 2
item = Sheet1.Range("A" & lRow).Value
Do Until item = ""
ie.navigate "http://www.buyshedsdirect.co.uk/" & item
Do
DoEvents
Loop Until ie.readyState = READYSTATE_COMPLETE
Set doc = ie.document
Set result = doc.querySelector(".VariantPrice")
Set result2 = result.querySelector(".NowValue")
Sheet1.Range("B" & lRow).Value = result2.innerText
lRow = lRow + 1
item = Sheet1.Range("A" & lRow).Value
Loop
ie.Quit
End Sub
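The querySelector versions above raise a run-time error if a page has no ".VariantPrice" or ".NowValue" element, because the result is Nothing. A minimal sketch of a guarded lookup follows. The name GetNestedText and the fallback parameter are hypothetical, and it assumes the browser's document mode supports querySelector:

```vba
' Hypothetical helper: returns the trimmed innerText of the first element
' matching innerSelector inside the first element matching outerSelector,
' or fallback when either element is missing.
Public Function GetNestedText(ByVal doc As Object, ByVal outerSelector As String, _
                              ByVal innerSelector As String, _
                              Optional ByVal fallback As String = "") As String
    Dim outer As Object
    Dim inner As Object

    GetNestedText = fallback

    Set outer = doc.querySelector(outerSelector)
    If outer Is Nothing Then Exit Function

    Set inner = outer.querySelector(innerSelector)
    If inner Is Nothing Then Exit Function

    GetNestedText = Trim$(inner.innerText)
End Function
```

In the loop it could replace the three lookup lines, for example Sheet1.Range("B" & lRow).Value = GetNestedText(ie.document, ".VariantPrice", ".NowValue", "not found"), so one missing price does not stop the whole run.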

Use VBA to click html button and then scrape refreshed data

I am trying to write a procedure which enters a date into an input box
<input name="Mdate" type="text" id="Mdate" size="30" value="" /></td>
clicks a submit button
<input type="submit" name="button" id="button" value="Submit" />
then scrapes the resulting data, which appears in the "a" tags.
<center>
<b>Tuesday, 6 January 2015</b><br />
Ruakaka
This data is not available until the submit button has been clicked. My attempt is posted in full below. The problem I seem to be having is that I am not able to access the modified HTML (modified by clicking submit). Can anyone provide any suggestions?
'dimension variables
Dim ie As InternetExplorer
Dim htmldoc As MSHTML.IHTMLDocument 'Document object
Dim inputs As MSHTML.IHTMLElementCollection 'Element collection for "input" tags
Dim eles1, eles2 As MSHTML.IHTMLElementCollection 'Element collection for th tags
Dim element As MSHTML.IHTMLElement 'input elements
Dim ele1, ele2 As MSHTML.IHTMLElement 'Header elements
'Open InternetExplorer
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = False 'make IE invisible
'Navigate to webpage
Dim ieURL As String: ieURL = "http://www.racenet.com.au/horse-racing-results/" 'set URL from which to retrieve racemeet and date data
ie.navigate ieURL 'navigate to URL
Do While ie.Busy Or ie.readyState <> 4 'wait for page to load
DoEvents
Loop
Set htmldoc = ie.document 'Document webpage
Set inputs = htmldoc.getElementsByTagName("input") 'Find all input tags
Dim dd, mm, yyyy As Integer
Dim startdate, enddate As Date
Dim i, j, k As Long
Dim raceMeet, raceURL As String
startdate = #1/1/2008#: enddate = Date - 1
Dim racemeetArr As Variant
ReDim racemeetArr(1 To 2, 1)
For i = startdate To enddate
dd = Day(i): mm = Month(i): yyyy = Year(i)
For Each element In inputs
If element.Name = "Mdate" Then
element.Value = yyyy & "-" & mm & "-" & dd
Else
If element.Name = "button" Then
element.Click
'insert scraper
Set eles1 = htmldoc.getElementsByTagName("a") 'Find all anchor tags
For Each ele1 In eles1
If InStr(ele1.href, "/horse-racing-results/") > 0 Then
raceMeet = ele1.innerText
raceURL = ele1.innerHTML
ReDim Preserve racemeetArr(1 To 2, UBound(racemeetArr, 2) + 1)
racemeetArr(1, UBound(racemeetArr, 2)) = raceMeet
racemeetArr(2, UBound(racemeetArr, 2)) = raceURL
End If
Next ele1
Else
End If
End If
Next element
Stop
Next i
ie.Quit
Insert a condition to wait while the page is loading.
The following rewrite successfully fetches data from the target page on my PC:
Private Sub CommandButton1_Click()
'dimension variables
Dim ie As InternetExplorer
Dim htmldoc As MSHTML.IHTMLDocument 'Document object
Dim inputs As MSHTML.IHTMLElementCollection 'Element collection for "input" tags
Dim eles1, eles2 As MSHTML.IHTMLElementCollection 'Element collection for th tags
Dim element As MSHTML.IHTMLElement 'input elements
Dim ele1, ele2 As MSHTML.IHTMLElement 'Header elements
'Open InternetExplorer
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = True 'make IE visible
'Navigate to webpage
Dim ieURL As String: ieURL = "http://www.racenet.com.au/horse-racing-results/" 'set URL from which to retrieve racemeet and date data
ie.navigate ieURL 'navigate to URL
Do While ie.Busy Or ie.readyState <> 4 'wait for page to load
DoEvents
Loop
Set htmldoc = ie.document 'Document webpage
Set inputs = htmldoc.getElementsByTagName("input") 'Find all input tags
Dim dd, mm, yyyy As Integer
Dim startdate, enddate As Date
Dim i, j, k As Long
Dim raceMeet, raceURL As String
startdate = #1/1/2008#: enddate = Date - 1
Dim racemeetArr As Variant
ReDim racemeetArr(1 To 2, 1)
For i = startdate To enddate
dd = Day(i): mm = Month(i): yyyy = Year(i)
For Each element In inputs
If element.Name = "Mdate" Then
element.Value = yyyy & "-" & mm & "-" & dd
Else
If element.Name = "button" Then
element.Click
Exit For
End If
End If
Next element
Do
DoEvents 'wait until the browser has loaded, keeping Excel responsive
Loop Until ie.readyState = READYSTATE_COMPLETE
'insert scraper
Set eles1 = htmldoc.getElementsByTagName("a") 'Find all anchor tags
For Each ele1 In eles1
If InStr(ele1.href, "/horse-racing-results/") > 0 Then
raceMeet = ele1.innerText
raceURL = ele1.innerHTML
ReDim Preserve racemeetArr(1 To 2, UBound(racemeetArr, 2) + 1)
racemeetArr(1, UBound(racemeetArr, 2)) = raceMeet
racemeetArr(2, UBound(racemeetArr, 2)) = raceURL
End If
Next ele1
Stop
Next i
ie.Quit
End Sub
Edit:
After analyzing the HTTP requests I managed to slim down the code a little bit (results can be queried directly without filling the form and submitting the page)
I am not a huge fan of expensive array ReDims, so I created a class instead, and save the results in a collection of that class (feel free to use it or not).
Add a new class module, call it clRaceMeet and paste this code:
Option Explicit
Private pMeet As String
Private pUrl As String
Public Property Let Meet(ByVal Val As String)
pMeet = Val
End Property
Public Property Get Meet() As String
Meet = pMeet
End Property
Public Property Let URL(ByVal Val As String)
pUrl = Val
End Property
Public Property Get URL() As String
URL = pUrl
End Property
Then, use this modified code version to scrape the data and dump it to the debugging window:
Option Explicit
Private Sub CommandButton1_Click()
'dimension variables
Dim ie As InternetExplorer
Dim ieURL As String
Dim dd As Integer
Dim mm As Integer
Dim yyyy As Integer
Dim startDate As Date
Dim endDate As Date
Dim i As Long
Dim htmlDoc As MSHTML.IHTMLDocument
Dim colLeftEleColl As MSHTML.IHTMLElementCollection
Dim colLeftEle As MSHTML.IHTMLElement
Dim centerEleColl As MSHTML.IHTMLElementCollection
Dim centerEle As MSHTML.IHTMLElement
Dim raceMeet As String
Dim raceURL As String
Dim objRaceMeet As clRaceMeet
Dim raceMeetColl As New Collection
'Open InternetExplorer
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = True
startDate = #1/1/2009#
endDate = Date - 1
For i = startDate To endDate
dd = Day(i)
mm = Month(i)
yyyy = Year(i)
ieURL = "http://www.racenet.com.au/horse-racing-results-search.asp?Mdate=" & yyyy & "-" & mm & "-" & dd
ie.navigate ieURL
Do
DoEvents 'wait until the browser has loaded, keeping Excel responsive
Loop Until ie.readyState = READYSTATE_COMPLETE
Set htmlDoc = ie.document
'insert scraper
Set colLeftEleColl = htmlDoc.getElementById("ColLeft").all
'Loop through elements of ColLeft div
For Each colLeftEle In colLeftEleColl
If colLeftEle.tagName = "CENTER" Then
Set centerEleColl = colLeftEle.all
'Loop through elements of <center> tag
For Each centerEle In centerEleColl
If centerEle.tagName = "A" Then
If InStr(centerEle.href, "/horse-racing-results/") > 0 Then
raceMeet = centerEle.innerText
raceURL = centerEle.href
Set objRaceMeet = New clRaceMeet
objRaceMeet.Meet = raceMeet
objRaceMeet.URL = raceURL
raceMeetColl.Add objRaceMeet
End If
End If
Next centerEle
Exit For
End If
Next colLeftEle
' Dump results to immediate window:
For Each objRaceMeet In raceMeetColl
Debug.Print objRaceMeet.Meet & " - " & objRaceMeet.URL
Next objRaceMeet
'Stop
Next i
ie.Quit
End Sub
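The URL above is assembled from unpadded day and month numbers, so 6 January 2015 becomes Mdate=2015-1-6. A minimal sketch of building the same query string with a zero-padded ISO date follows. The name BuildResultsUrl is hypothetical, and whether the site treats padded and unpadded dates the same way is an assumption:

```vba
' Hypothetical helper: builds the search URL for one day with a
' zero-padded ISO date (2015-01-06 rather than 2015-1-6).
Public Function BuildResultsUrl(ByVal raceDate As Date) As String
    Const BASE_URL As String = "http://www.racenet.com.au/horse-racing-results-search.asp?Mdate="

    BuildResultsUrl = BASE_URL & Format$(raceDate, "yyyy-mm-dd")
End Function
```

Inside the date loop it could be used as ie.navigate BuildResultsUrl(CDate(i)), which also removes the need for the separate dd, mm and yyyy variables.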
Happy betting! :)
I toyed around with the last one: the For Each loop that dumps the results has to go after the For...Next loop, not inside it. I also made it write the list to Sheet1, and it worked. I made a few minor adjustments, such as adding a variable to increment the cell row.
This code didn't produce the actual results, just the URLs; not sure if that is what you were aiming for.