I am pretty new to VBA. I am trying to scrape https://finance.yahoo.com/screener/predefined/Aluminum and get the sectors of all tickers in the list. The complex part (at least for me) is removing the "price" and "exchange" filters and sending the request. I am using the following code, but I cannot connect properly. Any suggestions? Many thanks.
Sub connect_to_web()
Dim XMLPage As New MSXML2.XMLHTTP60
Dim url As String
Dim query As String
url = "https://query1.finance.yahoo.com/v1/finance/screener?lang=en-US&region=US&formatted=true&corsDomain=finance.yahoo.com"
XMLPage.Open "Post", url, False
XMLPage.setRequestHeader "Content-Type", "application/x-www-form-urlencoded; charset=UTF-8"
XMLPage.send ""
Debug.Print XMLPage.Status
XMLPage.send ""
Debug.Print XMLPage.Status
End Sub
I am trying to get address data from a URL but am running into an error. I am a beginner in VBA and do not understand where the problem in my code is. I hope somebody can help me find the right solution.
Here is my code:
Public Sub IE_GetLink()
Dim sResponse As String, HTML As HTMLDocument
Dim url As String
Dim Re As Object
Set HTML = New HTMLDocument
Set Re = CreateObject("MSXML2.XMLHTTP")
'On Error Resume Next
url = "http://markexpress.co.in/network1.aspx?Center=360370&Tmp=1656224682265"
With Re
.Open "GET", url, False
.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
.send
sResponse = StrConv(.responseBody, vbUnicode)
End With
Dim Title As Object
With HTML
.body.innerHTML = sResponse
Title = .querySelectorAll("#colspan")(0).innerText
End With
MsgBox Title
End Sub
Please help me ...
Several things.
What is wrong with your code:
Title should be a String, because you are assigning the return value of .innerText to it. You have declared it as an Object, which would require the Set keyword (and the removal of the .innerText accessor).
colspan is an attribute, not an id, so your CSS selector list is incorrect.
Furthermore, looking at what the page actually does, there is a request for an additional document which actually has the info you need. You need to take the centre ID you already have and change the URI you make a request to.
Then, you want only the first td in the target table. Change your CSS selector list to target that.
Public Sub GetInfo()
Dim HTML As MSHTML.HTMLDocument
Dim re As Object
Set HTML = New MSHTML.HTMLDocument
Set re = CreateObject("MSXML2.XMLHTTP")
Dim url As String
Dim response As String
url = "http://crm.markerp.in/NetworkDetail.aspx?Center=360370&Tmp="
With re
.Open "GET", url, False
.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
.send
response = .responseText
End With
Dim info As String
With HTML
.body.innerHTML = response
info = .querySelector("#tblDisp td").innerText
End With
MsgBox info
End Sub
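The two fixes to the original code (String versus Object, and selecting by attribute rather than by id) can be seen in isolation in a small sketch. The HTML fragment and the `demo` id below are made up for illustration and are not the markup of the real page; the sketch assumes a reference to Microsoft HTML Object Library.

```vba
Public Sub SelectorSketch()
    ' Assumes a reference to Microsoft HTML Object Library.
    Dim html As MSHTML.HTMLDocument
    Set html = New MSHTML.HTMLDocument

    ' Made-up markup for illustration only (not the real page).
    html.body.innerHTML = "<table id=""demo""><tr><td colspan=""2"">Head office</td></tr></table>"

    ' An object reference must be assigned with Set.
    ' [colspan] matches on the attribute; #colspan would look for an id.
    Dim cell As Object
    Set cell = html.querySelector("#demo td[colspan]")

    ' .innerText returns a String, so it is assigned without Set.
    Dim title As String
    title = cell.innerText

    Debug.Print title
End Sub
```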
I need some help to download the stock table located in this URL:
I've tried the code below to at least grab the first row, but what the inspector shows as:
<a target="_blank" href="/equities/apple-computer-inc" title="Apple Inc">Apple</a>
comes back to me only as:
<A title={fullName} href="about:{pairLink}" target=_blank>{pairName}</A>
This is the code I've put together:
Sub table()
Dim XMLReq As New MSXML2.XMLHTTP60
Dim HTMLDoc As New MSHTML.HTMLDocument
Dim Tables As MSHTML.IHTMLElementCollection
Dim table As MSHTML.IHTMLElement
Dim TableRow As MSHTML.IHTMLElement
XMLReq.Open "GET", "https://es.investing.com/stock-screener/?sp=country::5|sector::a|industry::a|equityType::a|exchange::a|eq_market_cap::110630000,1990000000000%3Ceq_market_cap;2", False
XMLReq.send
If XMLReq.Status <> 200 Then
MsgBox "problem" & vbNewLine & XMLReq.Status & "- " & XMLReq.statusText
Exit Sub
End If
HTMLDoc.body.innerHTML = XMLReq.responseText
Set Tables = HTMLDoc.getElementsByTagName("Table")
For Each table In Tables
If table.className = "displayNone genTbl openTbl resultsStockScreenerTbl elpTbl " Then
For Each TableRow In table.getElementsByTagName("td")
Debug.Print TableRow.innerHTML
Next
End If
Next table
End Sub
Any help will be appreciated.
It looks like the data that fills the table is not in the page HTML itself. It is pulled in as JSON by a separate request that JavaScript on the page makes, which is why you only see the {placeholder} template.
That could make the response easier to handle, since you can use a JSON parser, but composing the correct request to get the data you want may be difficult. The owners of the website might not want you to do this, so they might not make it easy.
It looks like a POST request with a number of parameters and a cookie sent along. So you would need to re-create this POST request by adding all of the correct parameters and the correct cookie in the header. I would use a web debugging program such as Fiddler to see what is going on.
I was going to also suggest you check and see if that website provides an API but it looks like it doesn't?
EDIT:
I was actually able to get the JSON with the data you want by pretty much just copying the request used on the site:
Sub getdata()
Dim XMLReq As New MSXML2.XMLHTTP60
XMLReq.Open "POST", "https://es.investing.com/stock-screener/Service/SearchStocks", False
XMLReq.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
XMLReq.setRequestHeader "Accept", "application/json"
XMLReq.setRequestHeader "X-Requested-With", "XMLHttpRequest"
XMLReq.send "country%5B%5D=5&exchange%5B%5D=95&exchange%5B%5D=2&exchange%5B%5D=1&sector=5%2C12%2C3%2C8%2C9%2C1%2C7%2C6%2C2%2C11%2C4%2C10&industry=74%2C56%2C73%2C29%2C25%2C4%2C47%2C12%2C8%2C44%2C52%2C45%2C71%2C99%2C65%2C70%2C98%2C40%2C39%2C42%2C92%2C101%2C6%2C30%2C59%2C77%2C100%2C9%2C50%2C46%2C88%2C94%2C62%2C75%2C14%2C51%2C93%2C96%2C34%2C55%2C57%2C76%2C66%2C5%2C3%2C41%2C87%2C67%2C85%2C16%2C90%2C53%2C32%2C27%2C48%2C24%2C20%2C54%2C33%2C19%2C95%2C18%2C22%2C60%2C17%2C11%2C35%2C31%2C43%2C97%2C81%2C69%2C102%2C72%2C36%2C78%2C10%2C86%2C7%2C21%2C2%2C13%2C84%2C1%2C23%2C79%2C58%2C49%2C38%2C89%2C63%2C64%2C80%2C37%2C28%2C82%2C91%2C61%2C26%2C15%2C83%2C68&equityType=ORD%2CDRC%2CPreferred%2CUnit%2CClosedEnd%2CREIT%2CELKS%2COpenEnd%2CRight%2CParticipationShare%2CCapitalSecurity%2CPerpetualCapitalSecurity%2CGuaranteeCertificate%2CIGC%2CWarrant%2CSeniorNote%2CDebenture%2CETF%2CADR%2CETC%2CETN&eq_market_cap%5Bmin%5D=110630000&eq_market_cap%5Bmax%5D=1990000000000&pn=1&order%5Bcol%5D=eq_market_cap&order%5Bdir%5D=d"
If XMLReq.Status <> 200 Then
MsgBox "problem" & vbNewLine & XMLReq.Status & "- " & XMLReq.statusText
Exit Sub
End If
Debug.Print XMLReq.responseText
End Sub
So now you will just need to figure out how to parse the JSON response.
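VBA has no built-in JSON parser, so one option is a third-party module. The sketch below assumes the VBA-JSON module (JsonConverter.bas) has been imported into the project, along with its reference to Microsoft Scripting Runtime. The sample string and the key names `hits`, `name` and `last` are hypothetical stand-ins, not the actual shape of the response, so inspect the real responseText and adjust the keys.

```vba
Sub ParseJsonSketch()
    ' Assumes the third-party VBA-JSON module (JsonConverter.bas) is imported
    ' and Microsoft Scripting Runtime is referenced.
    Dim json As String
    Dim parsed As Object
    Dim item As Object

    ' Hypothetical sample; in practice this would be XMLReq.responseText.
    json = "{""hits"":[{""name"":""Example A"",""last"":1.5}," & _
           "{""name"":""Example B"",""last"":2.25}]}"

    ' JSON objects become Dictionaries, JSON arrays become Collections.
    Set parsed = JsonConverter.ParseJson(json)

    For Each item In parsed("hits")
        Debug.Print item("name"), item("last")
    Next item
End Sub
```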
I've written a script in VBA to get only the links of the different properties under the title Single Family Homes on the right-hand side of a webpage. When I run my script, I get nothing, and no error either. The content I wish to grab is static and available in the page source, so XMLHttpRequest should do the trick.
Although the selectors I've defined in my script seem to be correct, I still can't fetch the links of the different properties.
Webpage address
I've written:
Sub GetLinks()
Const link$ = "https://www.zillow.com/homes/for_sale/33125/house_type/12_zm/0_mmm/"
Dim oHttp As New XMLHTTP60, Html As New HTMLDocument
Dim I&
With oHttp
.Open "GET", link, False
.setRequestHeader "User-Agent", "Mozilla/5.0"
.send
Html.body.innerHTML = .responseText
With Html.querySelectorAll("article > a.list-card-info")
For I = 0 To .Length - 1
Sheet1.Range("A1").Offset(I, 0) = .item(I).getAttribute("href")
Next I
End With
End With
End Sub
Expected links are like:
https://www.zillow.com/homedetails/3446-NW-15th-St-Miami-FL-33125/43822210_zpid/
https://www.zillow.com/homedetails/1877-NW-22nd-Ave-Miami-FL-33125/43823838_zpid/
https://www.zillow.com/homedetails/1605-NW-8th-Ter-Miami-FL-33125/43825765_zpid/
How can I get all the links of the different properties from the landing page at the link above?
Use the class of the child alone. Note there are a number of other things I would like to change about the code but know you like to keep your structure/style.
Sub GetLinks()
Const link$ = "https://www.zillow.com/homes/for_sale/33125/house_type/12_zm/0_mmm/"
Dim oHttp As New XMLHTTP60, Html As New HTMLDocument
Dim I&
With oHttp
.Open "GET", link, False
.setRequestHeader "User-Agent", "Mozilla/5.0"
.send
Html.body.innerHTML = .responseText
With Html.querySelectorAll(".list-card-info")
For I = 0 To .Length - 1
Sheet1.Range("A1").Offset(I, 0) = .item(I).getAttribute("href")
Next I
End With
End With
End Sub
Some of the changes I might make:
Private Sub GetLinks()
Const LINK As String = "https://www.zillow.com/homes/for_sale/33125/house_type/12_zm/0_mmm/"
Dim http As MSXML2.XMLHTTP60, html As MSHTML.HTMLDocument
Dim i As Long, links As Object
Set http = New MSXML2.XMLHTTP60: Set html = New MSHTML.HTMLDocument
With http
.Open "GET", LINK, False
.setRequestHeader "User-Agent", "Mozilla/5.0"
.send
html.body.innerHTML = .responseText
End With
Set links = html.querySelectorAll(".list-card-info")
With ThisWorkbook.Worksheets("Sheet1")
For i = 0 To links.Length - 1
.Cells(i + 1, 1) = links.item(i).href
Next i
End With
End Sub
Checking some URLs with
Function SiteStatus(ByVal URL As String, SiteStatusText As String) As Long
Dim oHttp As New WinHttp.WinHttpRequest
oHttp.Option(WinHttpRequestOption_EnableRedirects) = False
oHttp.Open "GET", URL, False
oHttp.Send
SiteStatus = oHttp.Status
SiteStatusText = oHttp.StatusText
End Function
generally works fine. Only a few URLs throw VBA error -2147012744: "The server returned an invalid or unrecognized response."
Some of the real URLs I can open with the SHDocVw library and some I cannot, but that makes no difference to the error.
for instance:
http: //s2.excoboard.com/Courthouse_Steps_Mavens/150601/1831324
or
http: //www.geld-und-leben.com/anleitung/
or
http: //globalnews.ca/news/3025046/justin-timberlakes-illegal-voting-booth-selfie-is-under-review/
I still want to check the status of these sites.
How can I do that?
What's the point?
I know that this is old, but here is some VBA code that works.
Some of the values for oHttp.Option came from http://www.808.dk/?code-simplewinhttprequest
Sub test()
' add reference to Microsoft WinHTTP Services
Dim URL As String
URL = "http://globalnews.ca/news/3025046/justin-timberlakes-illegal-voting-booth-selfie-is-under-review"
Dim oHttp As WinHttpRequest
Set oHttp = New WinHttpRequest
oHttp.Open "GET", URL, False
oHttp.Option(WinHttpRequestOption_UserAgentString) = "http_requester/0.1"
oHttp.Option(WinHttpRequestOption_SslErrorIgnoreFlags) = &H3300 ' ignore all err, 0: accept no err
oHttp.Option(WinHttpRequestOption_EnableRedirects) = True
oHttp.Option(WinHttpRequestOption_EnableHttpsToHttpRedirects) = True
oHttp.send
Do While True
DoEvents
If oHttp.Status Then Exit Do
Loop
Debug.Print oHttp.Status
End Sub
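If some URLs still raise a run-time error, the status check can trap it rather than stop. The following is a minimal sketch, assuming a reference to Microsoft WinHTTP Services, version 5.1; `TrySiteStatus` is a hypothetical helper name, not something from the original code.

```vba
' Assumes a reference to Microsoft WinHTTP Services, version 5.1.
' TrySiteStatus is a hypothetical helper name used for illustration.
Function TrySiteStatus(ByVal URL As String, ByRef StatusText As String) As Long
    Dim oHttp As WinHttp.WinHttpRequest
    Set oHttp = New WinHttp.WinHttpRequest

    On Error GoTo RequestFailed
    oHttp.Open "GET", URL, False
    oHttp.Option(WinHttpRequestOption_UserAgentString) = "http_requester/0.1"
    oHttp.Send

    ' Normal case: return the HTTP status code (200, 301, 404, ...).
    TrySiteStatus = oHttp.Status
    StatusText = oHttp.StatusText
    Exit Function

RequestFailed:
    ' Transport-level failure: return the VBA error number instead,
    ' so the caller can tell it apart from an HTTP status code.
    TrySiteStatus = Err.Number
    StatusText = Err.Description
End Function
```

A caller can then treat a negative return value as "request failed" and anything else as an ordinary HTTP status.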
I have some code (given below) in Excel VBA that fetches the HTML source of a web page. The code works, but the HTML it shows is incomplete. When the line webpageSource = oHttp.ResponseText is executed, the variable webpageSource contains "DOCTYPE html PUBLIC ....... etc etc till the end /html", which is how it should be. Everything is correct up to here. But the next line, Debug.Print webpageSource, prints only about half the HTML, from "(adsbygoogle = window.adsbygoogle || []).push({}); ...... etc etc till the end /html". Why is that? I want to find some strings in the returned response text, but since the output is incomplete I am unable to do so. Can someone shed some light on this?
Thanks
Sub source()
Dim oHttp As New WinHttp.WinHttpRequest
Dim sURL As String
Dim webpageSource As String
sURL = "http://www.somewebsite.com"
oHttp.Open "GET", sURL, False
oHttp.send
webpageSource = oHttp.ResponseText
debug.print webpageSource
End Sub
EDIT:
I also tried .WaitForResponse, but it did not help :(
Debug.Print and/or the Immediate window have limitations. They do not seem to be documented anywhere, but they exist.
So try writing the webpageSource to a file:
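Sub source()
Dim oHttp As New WinHttp.WinHttpRequest
Dim sURL As String
Dim webpageSource As String
Dim FSO As Object
Dim oFile As Object
sURL = "http://www.google.com"
oHttp.Open "GET", sURL, False
oHttp.send
webpageSource = oHttp.ResponseText
Set FSO = CreateObject("Scripting.FileSystemObject")
' Overwrite any existing file and write as Unicode so non-ANSI characters do not raise an error
Set oFile = FSO.CreateTextFile("webpageSource.txt", True, True)
oFile.Write webpageSource
oFile.Close
' The file is created in the current directory; open it with the default text editor
Shell "cmd /C start webpageSource.txt"
End Sub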
Does the file contain all content?
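If the goal is only to find strings in the response, the search can run directly on the variable. The question notes that webpageSource itself holds the full HTML, so the truncation appears to affect only what the Immediate window displays. A minimal sketch follows, assuming a reference to Microsoft WinHTTP Services, version 5.1; the search term "DOCTYPE" is just an example.

```vba
Sub FindInSource()
    ' Assumes a reference to Microsoft WinHTTP Services, version 5.1.
    Dim oHttp As New WinHttp.WinHttpRequest
    Dim webpageSource As String
    Dim pos As Long

    oHttp.Open "GET", "http://www.somewebsite.com", False
    oHttp.send
    webpageSource = oHttp.ResponseText

    ' The length shows how much text the variable really holds,
    ' regardless of how much the Immediate window can display.
    Debug.Print "Length: " & Len(webpageSource)

    ' "DOCTYPE" is an example search term; vbTextCompare ignores case.
    pos = InStr(1, webpageSource, "DOCTYPE", vbTextCompare)
    If pos > 0 Then
        Debug.Print "Found at position " & pos
    Else
        Debug.Print "Not found"
    End If
End Sub
```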