EXCEL - Open all links in a new tab - vba

I have an Excel sheet that contains many links.
How do I open them all at once, each in a new tab, in my default browser?

Like this? It includes a basic check that each URL is valid. The advantage is that you can adapt it to log information about the response from the URL.
Option Explicit
Sub TEST()
Dim h As Hyperlink
For Each h In ActiveSheet.Hyperlinks
If UrlOK(h.Address) Then h.Follow
Next h
End Sub
Public Function UrlOK(ByVal url As String) As Boolean
Dim request As Object
Dim respCode As Long
On Error Resume Next
Set request = CreateObject("WinHttp.WinHttpRequest.5.1")
With request
.Open "GET", url, False
.Send
respCode = .Status
End With
If respCode = 200 Then UrlOK = True
On Error GoTo 0
End Function
Edit: Thanks to @Omegastripes for noting:
1) If you use MSXML2.XMLHTTP over WinHttp.WinHttpRequest.5.1 you get a more reliable result
Benefits include (amongst others):
A) Simplified code to open a URL.
B) Separate sessions do not impact each other.
C) Protected Mode IE Support
D) Credential Cache
2) Use HEAD over GET, in the request, to reduce network traffic
With a HEAD request, a server will only return the headers of a resource, rather than the resource itself.
So you could use a revised, more efficient function, as follows:
Public Function UrlOK(ByVal url As String) As Boolean
Dim request As Object
Dim respCode As Long
On Error Resume Next
Set request = CreateObject("MSXML2.XMLHTTP")
With request
.Open "HEAD", url, False
.Send
respCode = .Status
End With
If respCode = 200 Then UrlOK = True
On Error GoTo 0
End Function
[Image: the code in a standard module, showing where to place the cursor to execute the TEST sub.]

That's pretty easy in VBA
Sub OpenAll()
Dim H As Hyperlink
For Each H In ActiveWorkbook.ActiveSheet.UsedRange.Hyperlinks
H.Follow
Next
End Sub
If there are invalid URLs you can stop the code from erroring like this:
Sub OpenAll()
Dim H As Hyperlink
For Each H In ActiveWorkbook.ActiveSheet.Hyperlinks
On Error Resume Next
H.Follow
On Error GoTo 0
Next
End Sub

Related

How to test whether WinHttpRequest works or if corporate security issue causes error

I found the following code on an excel forum:
Function URLExists(url As String) As Boolean
Dim Request As Object
Dim ff As Integer
Dim rc As Variant
On Error GoTo EndNow
Set Request = CreateObject("WinHttp.WinHttpRequest.5.1")
With Request
.Open "GET", url, False
.Send
rc = .StatusText
End With
Set Request = Nothing
If rc = "OK" Then URLExists = True
Exit Function
EndNow:
End Function
I see no reason why it would not work with a legitimate URL. I have been testing it and keep getting FALSE returned. Opening the same URL in a browser works.
I suspect that this might have something to do with our corporate security settings. How can I test whether the code works or whether it is a security issue?
The function jumps to the EndNow label on .Send
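One way to narrow this down is to report the error instead of swallowing it. The following is a minimal sketch, not part of the original post: DescribeUrlCheck is a hypothetical name, and it assumes the same late-bound WinHttp.WinHttpRequest.5.1 object used in the question.

```vba
' Hypothetical diagnostic helper (name and return format are assumptions):
' reports what happened instead of silently returning False.
Public Function DescribeUrlCheck(ByVal url As String) As String
    Dim request As Object

    On Error GoTo Failed
    Set request = CreateObject("WinHttp.WinHttpRequest.5.1")
    With request
        .Open "GET", url, False
        .Send
        ' Reaching this line means a response came back from some server.
        DescribeUrlCheck = "HTTP " & .Status & " " & .StatusText
    End With
    Exit Function

Failed:
    ' A run-time error on .Send means no HTTP response was received at all.
    DescribeUrlCheck = "Error " & Err.Number & ": " & Err.Description
End Function
```

Calling Debug.Print DescribeUrlCheck("http://www.example.com/") from the Immediate window shows either an HTTP status line or the error number and description raised on .Send. An error rather than a status usually suggests a connection-level cause (for example a proxy or firewall) rather than a problem with the URL itself.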

How to handle errors efficiently to prevent misleading results?

I've written some code in VBA to find certain identities against some names on some websites. The code works well if everything is in the right order, that is, if the link is valid, the name matches one of the a tags, and the regex can find the identity. If any of the three searches fails, the script throws an error. I've marked the positions where the errors occur in my script below.
All I'm asking is for a way to handle the errors so that my script continues until all the links are exhausted.
As I do not have much knowledge of VBA, I tried On Error Resume Next to skip the errors. However, the results turn out to be a mess. I'm pasting a rough example of what I get when I use On Error Resume Next.
Sub Identity_Finder()
Dim http As New XMLHTTP60, html As New HTMLDocument
Dim post As Object, link As Variant, refined_links As String
Dim rxp As New RegExp, identity As Object
For Each link In [{"http://spltech.in/","http://www.unifrostindia.com/","http://www.unitfrostindia.com/","http://www.greenplanet.in/"}]
With http
.Open "GET", link, False
.send '''throws here the first error if the link is invalid
html.body.innerHTML = .responseText
End With
For Each post In html.getElementsByTagName("a")
If InStr(post.innerText, "certain_name") > 0 Then refined_links = post.href: Exit For
Next post
With http
.Open "GET", refined_links, False
.send ''throws another error here if no such link is found
End With
With rxp
.Pattern = "some_regex"
.Global = True
Set identity = .Execute(http.responseText)
End With
r = r + 1: Cells(r, 1) = link
Cells(r, 2) = identity(0) ''''throws another error here if no such identity is noticed
Next link
End Sub
What I get when using On Error Resume Next:
John executive
Mac lawyer
lulu lawyer
Robin lawyer
Cathy student
Expected output:
John executive
Mac lawyer
lulu
Robin
Cathy student
The fields that should be empty (when nothing is found) get filled in with the previous values when I use On Error Resume Next. How can I get around this misleading result? Thanks in advance.
The most efficient way to error trap in VBA is to
1) actually test the inputs / results before running them either through custom-made functions or built-in coding concepts or a mix of both.
2) Use VBA built-in error-handling if absolutely needed
Example 1
For example, you can guard this statement with a custom function that tests whether a URL is valid:
With http
.Open "GET", link, False
.send '''throws here the first error if the link is invalid
html.body.innerHTML = .responseText
End With
If ValidURL(link) Then
With http
.Open "GET", link, False
.send
html.body.innerHTML = .responseText
End With
End If
Where ValidURL is a function defined as:
Function ValidURL(ByVal URL As String) As Boolean
Dim result As Boolean
'I don't know how you would specify a valid link in your specific case
'but that code goes here
'a dummy example follows
result = Left(URL,7) = "http://"
ValidURL = result 'True or False
End Function
Example 2
I assume in this statement:
With http
.Open "GET", refined_links, False
.send ''throws another error here if no such link is found
End With
there is a specific error number (code) that is produced when no such link is found. Discover that number and use this code to bypass.
With http
.Open "GET", refined_links, False
On Error Resume Next
.Send
errNum = Err.Number 'capture now: On Error GoTo 0 resets Err
On Error GoTo 0
End With
If errNum <> 9999 Then 'replace with correct number
'continue with regex test
End If
PUTTING IT ALL TOGETHER
Finally putting that all together you can build like so, with minimal use of On Error Resume Next and no GoTo statements.
For Each link In [{"http://spltech.in/","http://www.unifrostindia.com/","http://www.unitfrostindia.com/","http://www.greenplanet.in/"}]
If ValidURL(link) Then
With http
.Open "GET", link, False
.send
html.body.innerHTML = .responseText
End With
For Each post In html.getElementsByTagName("a")
If InStr(post.innerText, "certain_name") > 0 Then refined_links = post.href: Exit For
Next post
With http
.Open "GET", refined_links, False
On Error Resume Next
.Send
errNum = Err.Number 'capture now: On Error GoTo 0 resets Err
On Error GoTo 0
End With
If errNum <> 9999 Then 'replace with correct number
With rxp
.Pattern = "some_regex"
.Global = True
Set identity = .Execute(http.responseText)
End With
'i will leave it to you on how to account for no pattern match
r = r + 1: Cells(r, 1) = link
Cells(r, 2) = identity(0) ''''throws another error here if no such identity is noticed
End If
End If
Next link
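For the "no pattern match" case left open above, one possible approach is to check the match count before reading the first match. This is only a sketch: FirstMatchOrEmpty and its parameter names are hypothetical, and it uses a late-bound VBScript.RegExp rather than the early-bound RegExp from the question.

```vba
' Hypothetical helper: returns the first regex match in sourceText, or an
' empty string when nothing matches, so the caller never reads item 0 of an
' empty match collection.
Public Function FirstMatchOrEmpty(ByVal sourceText As String, _
                                  ByVal regexPattern As String) As String
    Dim rxp As Object
    Dim matches As Object

    Set rxp = CreateObject("VBScript.RegExp")   ' late bound, no reference needed
    rxp.Pattern = regexPattern
    rxp.Global = True
    Set matches = rxp.Execute(sourceText)

    If matches.Count > 0 Then
        FirstMatchOrEmpty = matches(0).Value
    Else
        FirstMatchOrEmpty = vbNullString
    End If
End Function
```

In the loop this could be used as Cells(r, 2) = FirstMatchOrEmpty(http.responseText, "some_regex"). The repeated values in the question's output come from variables keeping their previous contents when a failing statement is skipped, so it may also help to reset refined_links to an empty string at the top of each iteration.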

Web-scraping across multipages without even knowing the last page number

Running my code against a site to crawl the titles of tutorials spread across several pages, I found it works flawlessly. I wrote the code so that it does not depend on the last page number in the URL but instead runs until http.Status <> 200. The code pasted below works in this case. However, when I tried another URL to see whether the loop would also stop on its own, the code fetched all the results but never exited. What is the workaround so that the code breaks out of the loop and stops the macro when it is done? Here is the working one:
Sub WiseOwl()
Const mlink = "http://www.wiseowl.co.uk/videos/default"
Dim http As New XMLHTTP60, html As New HTMLDocument
Dim post As Object
Do While True
y = y + 1
With http
.Open "GET", mlink & "-" & y & ".htm", False
.send
If .Status <> 200 Then
MsgBox "It's done"
Exit Sub
End If
html.body.innerHTML = .responseText
End With
For Each post In html.getElementsByClassName("woVideoListDefaultSeriesTitle")
With post.getElementsByTagName("a")
x = x + 1
If .Length Then Cells(x, 1) = .item(0).innerText
End With
Next post
Loop
End Sub
I found an approach that works for yellowpage. My updated script is able to parse yellowpage but stops before scraping the last page because there is no "Next Page" button there. I tried this:
"https://www.dropbox.com/s/iptqm79b0byw3dz/Yellowpage.txt?dl=0"
However, when I applied the same logic to a torrent site, it did not work:
"https://www.yify-torrent.org/genres/western/p-1/"
You can always rely on whether the elements exist or not. Here, for example, if you try to use the object you assigned the elements to when there are none, you will get:
Run-time error '91': Object variable or With block variable not set
This is the key you should be looking for to put an end to your code. Please see the below example:
Sub yify()
Const mlink = "https://www.yify-torrent.org/genres/western/p-"
Dim http As New XMLHTTP60, html As New HTMLDocument
Dim post As Object
Dim posts As Object
y = 1
Do
With http
.Open "GET", mlink & y & "/", False
.send
html.body.innerHTML = .responseText
End With
Set posts = html.getElementsByClassName("mv")
On Error GoTo Endofpage
Debug.Print Len(posts) 'to force Error 91
For Each post In posts
With post.getElementsByTagName("div")
x = x + 1
If .Length Then Cells(x, 1) = .Item(0).innerText
End With
Next post
y = y + 1
Endofpage:
Loop Until Err.Number = 91
Debug.Print "It's over"
End Sub

How to differentiate legitimate URL redirects from ISP redirects or other hijacks

I'm trying to use VBA to test URLs for bad links and redirects.
I'm able to obtain redirect URLs using HTTP request GET method for URLs that return a 3xx series .Status response using either MSXML2 or WinHttp libraries through VBA.
When I attempt this method for some URLs that have legitimate (intended by the URL site) redirects, I receive an error when trying to obtain the Request object.
For example, when this URL is entered into a browser:
http://www.teconnectivity.com
...the browser will ultimately arrive at this address:
http://www.te.com/en/home.html
However this VBA code will return Error -2147012744: "The server returned an invalid or unrecognized response". The error is thrown on the .Send statement.
Sub Test()
Debug.Print sGetRedirectURL("http://www.teconnectivity.com/")
End Sub
Private Function sGetRedirectURL(ByVal sURL As String) As String
Dim oRequest As WinHttp.WinHttpRequest
Dim sReturn As String
Set oRequest = New WinHttp.WinHttpRequest
On Error GoTo ErrProc
With oRequest
.Option(WinHttpRequestOption_EnableRedirects) = False
.Open "GET", sURL, False
.Send
If Left$(.Status, 1) = "3" Then
sReturn = .GetResponseHeader("Location")
End If
End With
On Error GoTo 0
ExitProc:
sGetRedirectURL = sReturn
Exit Function
ErrProc:
Debug.Print Err.Number & ": " & Err.Description
Resume ExitProc
End Function
How is my browser able to get to the destination: http://www.te.com/en/home.html, if an HTTP request with that URL returns an error?
I can use the browser's application to get the final destination URL like this:
Function sGetFinalURL(sURL As String) As String
Dim oAppIE As Object
Dim sReturn As String
Set oAppIE = CreateObject("InternetExplorer.Application")
With oAppIE
.Navigate sURL
Do While .Busy
Loop
sReturn = .document.url
.Quit
End With
Set oAppIE = Nothing
sGetFinalURL = sReturn
End Function
However, that approach will also return URLs for ISP redirects or other hijacks.
The approach I ended up taking for this was to use the browser application to try to navigate to a non-existent domain URL. If the browser arrives at a URL, then that means it has been redirected by the ISP or something other than the non-existent domain.
Private Function sGetNXDomainRedirect() As String
'--attempts to return the domain used to redirect non-existent
' domain urls. Used to distinguish legitimate redirects
' from ISP redirects or other hijacks
'--returns "NoRedirect" if the bogus url is not redirected.
Dim sReturn As String
Dim oIEapp As Object
'Dim uTest As udtHttpInfo 'unused in this function; udtHttpInfo is not defined in this excerpt
Const sBOGUS_URL As String = _
"http://wwXYXw.NXDomainToTest"
Set oIEapp = CreateObject("InternetExplorer.Application")
With oIEapp
.Navigate sBOGUS_URL
Do While .Busy
Loop
On Error Resume Next
sReturn = .document.Domain
On Error GoTo 0
.Quit
End With
Set oIEapp = Nothing
If Len(sReturn) = 0 Then sReturn = "NoRedirect"
sGetNXDomainRedirect = sReturn
End Function
Link to complete code on MrExcel.com forum
Once that "ISP Redirect" site is known, then the IE Application can be used to test for bad links or legitimate redirects. If the Navigation through the IE Application returns a domain other than the ISP Redirect domain, then I interpret that as a legitimate redirect.
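The comparison described above could be sketched as follows. The function name and parameters are hypothetical; sIspDomain is assumed to hold the value returned by sGetNXDomainRedirect, and sFinalDomain the domain reached through the IE Application.

```vba
' Hypothetical helper: True when the domain that was reached is not the
' ISP redirect domain (interpreted as a legitimate redirect).
Public Function bIsLegitimateRedirect(ByVal sFinalDomain As String, _
                                      ByVal sIspDomain As String) As Boolean
    ' Nothing was reached: leave the result False (treated as a bad link).
    If Len(sFinalDomain) = 0 Then Exit Function

    If sIspDomain = "NoRedirect" Then
        ' No ISP redirect was detected, so any arrival counts.
        bIsLegitimateRedirect = True
    Else
        ' Case-insensitive comparison against the known redirect domain.
        bIsLegitimateRedirect = (StrComp(sFinalDomain, sIspDomain, vbTextCompare) <> 0)
    End If
End Function
```

Calling sGetNXDomainRedirect once and reusing its result for every link avoids opening an extra IE instance per URL.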

XMLHTTP.send request brings back "Nothing"

I have a spreadsheet that has hundreds of links that point to a server (with authentication) that can be accessed via the web. I've been searching for a solution to a Link Checker in a spreadsheet that would tell me which links are broken and which are ok. By broken I mean that the website does not get called up at all.
There are various solutions I have found around the web, none of which work for me. I'm boggled by this...
One example that I've tried to use and figure out is re-posted below.
As I've stepped through the code, I have come to realize that the oHTTP.send request brings back "Nothing". It does so for all links in the spreadsheet, regardless of whether the link works, or not.
Public Function CheckHyperlink(ByVal strUrl As String) As Boolean
Dim oHttp As New MSXML2.XMLHTTP30
On Error GoTo ErrorHandler
oHttp.Open "HEAD", strUrl, False
oHttp.send
If Not oHttp.Status = 200 Then CheckHyperlink = False Else CheckHyperlink = True
Exit Function
ErrorHandler:
CheckHyperlink = False
End Function
Any suggestions as to what might be wrong, or right, is highly appreciated!
A couple of possible causes:
Do you mean oHttp.Open "GET", strUrl, False instead of oHttp.Open "HEAD", strUrl, False?
Perhaps MSXML2.XMLHTTP30 is not available? You can declare an instance of MSXML2.XMLHTTPX as either early bound or late bound, which may affect which version you can use versus what is available (example: http://word.mvps.org/FAQs/InterDev/EarlyvsLateBinding.htm)
For example:
Option Explicit
'Dim oHTTPEB As New XMLHTTP30 'For early binding enable reference Microsoft XML, v3.0
Dim oHTTPEB As New XMLHTTP60 'For early binding enable reference Microsoft XML, v6.0
Sub Test()
Dim chk1 As Boolean
Dim chk2 As Boolean
chk1 = CheckHyperlinkLB("http://stackoverflow.com/questions/11647297/xmlhttp-send-request-brings-back-nothing")
chk2 = CheckHyperlinkEB("http://stackoverflow.com/questions/11647297/xmlhttp-send-request-brings-back-nothing")
End Sub
Public Function CheckHyperlinkLB(ByVal strUrl As String) As Boolean
Dim oHTTPLB As Object
'late bound declaration of MSXML2.XMLHTTP30
Set oHTTPLB = CreateObject("Msxml2.XMLHTTP.3.0")
On Error GoTo ErrorHandler
oHTTPLB.Open "GET", strUrl, False
oHTTPLB.send
If Not oHTTPLB.Status = 200 Then CheckHyperlinkLB = False Else CheckHyperlinkLB = True
Set oHTTPLB = Nothing
Exit Function
ErrorHandler:
Set oHTTPLB = Nothing
CheckHyperlinkLB = False
End Function
Public Function CheckHyperlinkEB(ByVal strUrl As String) As Boolean
'early bound declaration of MSXML2.XMLHTTP60
On Error GoTo ErrorHandler
oHTTPEB.Open "GET", strUrl, False
oHTTPEB.send
If Not oHTTPEB.Status = 200 Then CheckHyperlinkEB = False Else CheckHyperlinkEB = True
Set oHTTPEB = Nothing
Exit Function
ErrorHandler:
Set oHTTPEB = Nothing
CheckHyperlinkEB = False
End Function
EDIT:
I tested the OP's link by opening it in a browser and have now discovered that it redirects to the login page, so I was testing a different link. The check is probably failing because the oHttp object has not been set to allow redirects. I know it's possible to enable redirects for WinHttp.WinHttpRequest.5.1 using the code below. I would need to investigate whether this also works for MSXML2.XMLHTTP30, though.
Option Explicit
Sub Test()
Dim chk1 As Boolean
chk1 = CheckHyperlink("http://portal.emilfrey.ch/portal/page/portal/toyota/30_after_sales/20_ersatzteile%20und%20zubeh%C3%B6r/10_zubeh%C3%B6r/10_produktbezogene%20informationen/10_aussen/10_felgen/10_asa-pr%C3%BCfberichte/iq/tab1357333/iq%20016660.pdf")
End Sub
Public Function CheckHyperlink(ByVal strUrl As String) As Boolean
Dim GetHeader As String
Const WinHttpRequestOption_EnableRedirects = 6
Dim oHttp As Object
Set oHttp = CreateObject("WinHttp.WinHttpRequest.5.1")
On Error GoTo ErrorHandler
oHttp.Option(WinHttpRequestOption_EnableRedirects) = True
oHttp.Open "HEAD", strUrl, False
oHttp.send
If Not oHttp.Status = 200 Then
CheckHyperlink = False
Else
GetHeader = oHttp.getAllResponseHeaders()
CheckHyperlink = True
End If
Exit Function
ErrorHandler:
CheckHyperlink = False
End Function
EDIT2:
MSXML2.XMLHTTP does allow redirects (although I believe MSXML2.ServerXMLHTTP doesn't). Redirects are allowed or disallowed depending on whether the redirect is cross-domain, cross-port, etc. (see details here: http://msdn.microsoft.com/en-us/library/ms537505(v=vs.85).aspx)
Since the redirect to the login page is cross-domain, IE zone policy applies. Open IE/Tools/Internet Options/Security/Custom Level and change 'Access data sources across domains' to ENABLED
The OP's original code will now follow the redirect properly.