I am using the below code to retrieve some data from websites.
Public Function giveMeValue(ByVal link As String) As String
Set htm = CreateObject("htmlFile")
With CreateObject("msxml2.xmlhttp")
.Open "POST", link, False
.send
htm.body.innerhtml = .responsetext
End With
With htm.getelementbyid("JS_topStoreCount")
giveMeValue = .innerText
End With
htm.Close
Set htm = Nothing
End Function
Sometimes the element with ID "JS_topStoreCount" doesn't exist and the function returns #VALUE!. How do I modify this function so that errors are returned as 0 and are highlighted in red?
I couldn't see the reason for the Do Loop, so I have removed it. I've also added an If statement that checks whether the HTML element is Nothing before assigning its text to the return value.
Public Function giveMeValue(ByVal link As String) As String
    Dim htm As Object, el As Object
    Set htm = CreateObject("htmlFile")
    With CreateObject("msxml2.xmlhttp")
        .Open "GET", link, False
        .send
        htm.body.innerhtml = .responsetext
    End With
    ' Look the element up once; getElementById returns Nothing when no element has that ID
    Set el = htm.getelementbyId("JS_topStoreCount")
    If Not el Is Nothing Then
        giveMeValue = el.innerText
    Else
        giveMeValue = "0"
    End If
    htm.Close
    Set htm = Nothing
End Function
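The function above covers the "return 0" half of the question. For the red highlighting, a function called from a worksheet cell generally cannot change cell formatting itself, so one option is a conditional-formatting rule on the cells that hold the formula. A minimal sketch, assuming the results are the text "0" as returned above; HighlightZeroResults and the range in the usage line are hypothetical names:

```vba
' Hypothetical helper: highlights cells in the given range whose value is the text "0".
' Assumes the cells hold =giveMeValue(...) formulas, which return a String.
Public Sub HighlightZeroResults(ByVal target As Range)
    With target.FormatConditions
        .Delete    ' note: this clears any existing conditional formats on the range
        ' Compare against the text "0", since giveMeValue returns a String
        With .Add(Type:=xlCellValue, Operator:=xlEqual, Formula1:="=""0""")
            .Interior.Color = vbRed
        End With
    End With
End Sub
```

It could be run once from the Immediate window or another macro, for example: HighlightZeroResults ThisWorkbook.Worksheets("Sheet1").Range("B2:B100")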
I've written a script in VBA to get only the links of the different properties under the title Single Family Homes on the right-hand side of a webpage. When I run my script, I get nothing, and no error either. The content I wish to grab is static and available within the page source, so XMLHttpRequest should do the trick.
Although the selector I've defined in my script seems to be correct, I still can't fetch the links of the different properties.
Webpage address
I've written:
Sub GetLinks()
Const link$ = "https://www.zillow.com/homes/for_sale/33125/house_type/12_zm/0_mmm/"
Dim oHttp As New XMLHTTP60, Html As New HTMLDocument
Dim I&
With oHttp
.Open "GET", link, False
.setRequestHeader "User-Agent", "Mozilla/5.0"
.send
Html.body.innerHTML = .responseText
With Html.querySelectorAll("article > a.list-card-info")
For I = 0 To .Length - 1
Sheet1.Range("A1").Offset(I, 0) = .item(I).getAttribute("href")
Next I
End With
End With
End Sub
Expected links are like:
https://www.zillow.com/homedetails/3446-NW-15th-St-Miami-FL-33125/43822210_zpid/
https://www.zillow.com/homedetails/1877-NW-22nd-Ave-Miami-FL-33125/43823838_zpid/
https://www.zillow.com/homedetails/1605-NW-8th-Ter-Miami-FL-33125/43825765_zpid/
How can I get all the links of the different properties from its landing page, using the link above?
Use the class of the child alone. Note that there are a number of other things I would like to change about the code, but I know you like to keep your structure/style.
Sub GetLinks()
Const link$ = "https://www.zillow.com/homes/for_sale/33125/house_type/12_zm/0_mmm/"
Dim oHttp As New XMLHTTP60, Html As New HTMLDocument
Dim I&
With oHttp
.Open "GET", link, False
.setRequestHeader "User-Agent", "Mozilla/5.0"
.send
Html.body.innerHTML = .responseText
With Html.querySelectorAll(".list-card-info")
For I = 0 To .Length - 1
Sheet1.Range("A1").Offset(I, 0) = .item(I).getAttribute("href")
Next I
End With
End With
End Sub
Some of the changes I might make:
Private Sub GetLinks()
Const LINK As String = "https://www.zillow.com/homes/for_sale/33125/house_type/12_zm/0_mmm/"
Dim http As MSXML2.XMLHTTP60, html As MSHTML.HTMLDocument
Dim i As Long, links As Object
Set http = New MSXML2.XMLHTTP60: Set html = New MSHTML.HTMLDocument
With http
.Open "GET", LINK, False
.setRequestHeader "User-Agent", "Mozilla/5.0"
.send
html.body.innerHTML = .responseText
End With
Set links = html.querySelectorAll(".list-card-info")
With ThisWorkbook.Worksheets("Sheet1")
For i = 0 To links.Length - 1
.Cells(i + 1, 1) = links.item(i).href
Next i
End With
End Sub
Currently I am able to extract values from a webpage, but I am having trouble extracting a JSON value.
I am using the following code to extract the other values.
On Error Resume Next
Set http = CreateObject("MSXML2.XMLHTTP")
http.Open "GET", url1234, False
http.Send
html.body.innerHTML = http.ResponseText
brand = html.body.innerText
'MsgBox (brand)
The above code does not extract the following values from this URL:
"" : {"0":"B0037RYT96","1":"B0152VYOQ2","2":"B0152WOT70","3":"B003W0NYKS","4":"B0152WOT8Y","5":"B00C2O7M1M","6":"B0037RMS6W","7":"B0037RMI0S","8":"B0152VYPXY"},
There isn't anything I can see in your code that attempts to extract this.
You could use a regex with an appropriate pattern to extract that string. Below, the string you are after is stored in the variable r.
EDIT:
Given your edit to the required string you can change the regex to:
\"dimensionToAsinMap\" :(.*)[^\r\n].*
Former answer:
Option Explicit
Public Sub GetData()
Dim s As String, r As String, re As Object
Set re = CreateObject("vbscript.regexp")
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://www.yoursite.com?tag=stackoverfl08-20", False
.send
s = .responseText
End With
With re
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = "(""dimensionToAsinMap"" :.(.|\n)*)[;^\r\n].*return dataToReturn"
If .test(s) Then
r = .Execute(s)(0).SubMatches(0)
Else
r = "No match"
End If
End With
End Sub
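Once the map has been captured, the individual values can be pulled out with a second pass. The following is only a sketch under assumptions: ExtractMapValues is a hypothetical helper, and it assumes the captured text has the shape shown in the question, i.e. "key":"value" pairs whose values are plain letters and digits.

```vba
' Hypothetical helper: returns the quoted values from text such as
' {"0":"B0037RYT96","1":"B0152VYOQ2"} as a Collection of Strings.
Public Function ExtractMapValues(ByVal mapText As String) As Collection
    Dim re As Object, m As Object
    Dim result As New Collection
    Set re = CreateObject("vbscript.regexp")
    With re
        .Global = True
        ' A colon, optional whitespace, then a quoted run of letters/digits:
        ' this is the value side of each "key":"value" pair
        .Pattern = ":\s*""([A-Za-z0-9]+)"""
        For Each m In .Execute(mapText)
            result.Add m.SubMatches(0)
        Next m
    End With
    Set ExtractMapValues = result
End Function
```

Calling ExtractMapValues(r) after GetData has filled r would give a collection that can be looped over with For Each. If the values can contain other characters, the character class in the pattern would need widening.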
I just want to check whether a file already exists on a SharePoint site. I used a Boolean function that worked perfectly at the beginning, but recently it has stopped working.
This is what I wrote:
Public Function checkFile(URLStr As String) As Boolean
Dim oHttpRequest As Object
If Len(Trim(URLStr)) = 0 Then checkFile = Empty: Exit Function
Set oHttpRequest = CreateObject("MSXML2.XMLHTTP.6.0")
With oHttpRequest
.Open "GET", URLStr, False ', [UserName], [Password]
.setRequestHeader "Cache-Control", "no-cache"
.setRequestHeader "Pragma", "no-cache"
.send
End With
If oHttpRequest.Status = 200 Then
checkFile = True
Else
checkFile = False
End If
Set oHttpRequest = Nothing
End Function
The error occurs at the .send line.
I don't understand why, and I can't find another way to check whether a file exists, since the file is on a SharePoint site.
How do I output the result of a WinHTTPRequest in Excel?
For example, the following code queries the stock quote of Apple from a webpage but it doesn't output anything:
Sub GetQuotes()
Dim XMLHTTP As Object, html As Object, pontod As Object
On Error Resume Next
Set oHtml = New HTMLDocument
With CreateObject("WINHTTP.WinHTTPRequest.5.1")
.Open "GET", "http://www.reuters.com/finance/stocks/overview?symbol=AAPL.O", False
.send
oHtml.body.innerHTML = .responseText
End With
'Price
Set pontod = oHtml.getElementsByClassName("sectionQuote nasdaqChange")(0).getElementsByTagName("span")(1)
MsgBox pontod.innerText
End Sub
While this runs perfectly for the name:
Sub GetQuotes2()
Dim XMLHTTP As Object, html As Object, pontod As Object
On Error Resume Next
Set oHtml = New HTMLDocument
With CreateObject("WINHTTP.WinHTTPRequest.5.1")
.Open "GET", "http://www.reuters.com/finance/stocks/overview?symbol=AAPL.O", False
.send
oHtml.body.innerHTML = .responseText
End With
'Name
Set pontod = oHtml.getElementById("sectionTitle").getElementsByTagName("h1")(0)
MsgBox pontod.innerText
End Sub
I'd like to be able to fetch the whole page and look for specific HTML elements in it, but how do I manage to see the whole response from the query?
As Jeeped said above, the method getElementsByClassName doesn't work on an XML request.
However, looking at the webpage you're trying to scrape, you can work around the issue by using this line:
Set pontod = oHtml.getElementById("headerQuoteContainer").getElementsByTagName("span")(1)
instead of this one:
Set pontod = oHtml.getElementsByClassName("sectionQuote nasdaqChange")(0).getElementsByTagName("span")(1)
As you can observe from the HTML structure of the webpage, your price is not only the second span element of the first div with class names sectionQuote and nasdaqChange, but also the second span element of the unique element with id headerQuoteContainer.
Hence, scraping it from there lets you avoid the invalid method getElementsByClassName (which is a valid HTML method, but not when the HTML is an XML response) in favor of the classic getElementById().
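Put together, the workaround might look like the sketch below. It is not the answerer's code: GetQuoteById is a hypothetical name, the element id and span index are taken from the answer and assume the page structure at the time, and the On Error Resume Next from the question is replaced with explicit Nothing checks so that a missing element is reported instead of silently producing no output.

```vba
' Sketch only: "headerQuoteContainer" and the span index come from the answer above
' and assume the page still has that structure.
Public Sub GetQuoteById()
    Dim oHtml As Object, container As Object, spans As Object
    Set oHtml = CreateObject("htmlfile")
    With CreateObject("WinHttp.WinHttpRequest.5.1")
        .Open "GET", "http://www.reuters.com/finance/stocks/overview?symbol=AAPL.O", False
        .send
        oHtml.body.innerHTML = .responseText
    End With
    Set container = oHtml.getElementById("headerQuoteContainer")
    If container Is Nothing Then
        MsgBox "headerQuoteContainer not found"
        Exit Sub
    End If
    Set spans = container.getElementsByTagName("span")
    If spans.Length > 1 Then
        MsgBox spans.Item(1).innerText    ' second span, zero-based index 1
    Else
        MsgBox "Expected span not found"
    End If
End Sub
```

To inspect the whole response while debugging, the responseText can also be written to the Immediate window with Debug.Print before it is parsed.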
I have a spreadsheet with hundreds of links that point to a server (with authentication) that can be accessed via the web. I've been searching for a link checker for the spreadsheet that would tell me which links are broken and which are OK. By broken I mean that the website does not get called up at all.
There are various solutions I have found around the web, none of which work for me. I'm boggled by this...
One example that I've tried to use and figure out is re-posted below.
As I've stepped through the code, I have come to realize that the oHTTP.send request brings back "Nothing". It does so for all links in the spreadsheet, regardless of whether the link works, or not.
Public Function CheckHyperlink(ByVal strUrl As String) As Boolean
Dim oHttp As New MSXML2.XMLHTTP30
On Error GoTo ErrorHandler
oHttp.Open "HEAD", strUrl, False
oHttp.send
If Not oHttp.Status = 200 Then CheckHyperlink = False Else CheckHyperlink = True
Exit Function
ErrorHandler:
CheckHyperlink = False
End Function
Any suggestions as to what might be wrong, or right, are highly appreciated!
A couple of possible causes:
Do you mean oHttp.Open "GET", strUrl, False instead of oHttp.Open "HEAD", strUrl, False?
Perhaps MSXML2.XMLHTTP30 is not available? You can declare an instance of MSXML2.XMLHTTPX as either early bound or late bound, which may affect which version you can use versus what is available (see, for example, http://word.mvps.org/FAQs/InterDev/EarlyvsLateBinding.htm).
For example:
Option Explicit
'Dim oHTTPEB As New XMLHTTP30 'For early binding enable reference Microsoft XML, v3.0
Dim oHTTPEB As New XMLHTTP60 'For early binding enable reference Microsoft XML, v6.0
Sub Test()
Dim chk1 As Boolean
Dim chk2 As Boolean
chk1 = CheckHyperlinkLB("http://stackoverflow.com/questions/11647297/xmlhttp-send-request-brings-back-nothing")
chk2 = CheckHyperlinkEB("http://stackoverflow.com/questions/11647297/xmlhttp-send-request-brings-back-nothing")
End Sub
Public Function CheckHyperlinkLB(ByVal strUrl As String) As Boolean
Dim oHTTPLB As Object
'late bound declaration of MSXML2.XMLHTTP30
Set oHTTPLB = CreateObject("Msxml2.XMLHTTP.3.0")
On Error GoTo ErrorHandler
oHTTPLB.Open "GET", strUrl, False
oHTTPLB.send
If Not oHTTPLB.Status = 200 Then CheckHyperlinkLB = False Else CheckHyperlinkLB = True
Set oHTTPLB = Nothing
Exit Function
ErrorHandler:
Set oHTTPLB = Nothing
CheckHyperlinkLB = False
End Function
Public Function CheckHyperlinkEB(ByVal strUrl As String) As Boolean
'early bound declaration of MSXML2.XMLHTTP60
On Error GoTo ErrorHandler
oHTTPEB.Open "GET", strUrl, False
oHTTPEB.send
If Not oHTTPEB.Status = 200 Then CheckHyperlinkEB = False Else CheckHyperlinkEB = True
Set oHTTPEB = Nothing
Exit Function
ErrorHandler:
Set oHTTPEB = Nothing
CheckHyperlinkEB = False
End Function
EDIT:
I tested the OP's link by opening it in a browser and have now discovered that it redirects to the login page, so I was actually testing a different link. The check is probably failing because the oHttp object has not been set to allow redirects. I know it's possible to enable redirects for WinHttp.WinHttpRequest.5.1 using the code below. I would need to investigate whether this also works for MSXML2.XMLHTTP30, though.
Option Explicit
Sub Test()
Dim chk1 As Boolean
chk1 = CheckHyperlink("http://portal.emilfrey.ch/portal/page/portal/toyota/30_after_sales/20_ersatzteile%20und%20zubeh%C3%B6r/10_zubeh%C3%B6r/10_produktbezogene%20informationen/10_aussen/10_felgen/10_asa-pr%C3%BCfberichte/iq/tab1357333/iq%20016660.pdf")
End Sub
Public Function CheckHyperlink(ByVal strUrl As String) As Boolean
Dim GetHeader As String
Const WinHttpRequestOption_EnableRedirects = 6
Dim oHttp As Object
Set oHttp = CreateObject("WinHttp.WinHttpRequest.5.1")
On Error GoTo ErrorHandler
oHttp.Option(WinHttpRequestOption_EnableRedirects) = True
oHttp.Open "HEAD", strUrl, False
oHttp.send
If Not oHttp.Status = 200 Then
CheckHyperlink = False
Else
GetHeader = oHttp.getAllResponseHeaders()
CheckHyperlink = True
End If
Exit Function
ErrorHandler:
CheckHyperlink = False
End Function
EDIT2:
MSXML2.XMLHTTP does allow redirects (although I believe MSXML2.ServerXMLHTTP doesn't). Redirects are allowed or disallowed depending on whether the redirect is cross-domain, cross-port, etc. (see details here: http://msdn.microsoft.com/en-us/library/ms537505(v=vs.85).aspx).
Since the redirect to the login page is cross-domain, IE zone policy applies. Open IE/Tools/Internet Options/Security/Custom Level and change 'Access data sources across domains' to ENABLED.
The OP's original code will now redirect properly.
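Since the failure in this thread turned out to be a redirect rather than a dead link, it can help to see the raw status code for each URL instead of a Boolean. The sketch below is one way to do that, not the OP's code: GetLinkStatus is a hypothetical name, it uses WinHttp.WinHttpRequest.5.1 with redirects switched off so that a 3xx status stays visible, and it returns 0 when the request itself raises an error.

```vba
' Hypothetical helper: returns the HTTP status code for a URL, or 0 if the request fails.
' Redirects are disabled on purpose so that a redirect to a login page shows up as 3xx.
Public Function GetLinkStatus(ByVal strUrl As String) As Long
    Const WinHttpRequestOption_EnableRedirects = 6
    Dim oHttp As Object
    On Error GoTo ErrorHandler
    Set oHttp = CreateObject("WinHttp.WinHttpRequest.5.1")
    oHttp.Open "HEAD", strUrl, False
    oHttp.Option(WinHttpRequestOption_EnableRedirects) = False
    oHttp.send
    GetLinkStatus = oHttp.Status
    Exit Function
ErrorHandler:
    GetLinkStatus = 0    ' e.g. host not found or connection refused
End Function
```

Used as a worksheet formula, for example =GetLinkStatus(A2), this lets links that redirect (3xx) or require authentication (401) be told apart from links that do not respond at all.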