OCR Captcha through VBA

I have been trying to access a website and OCR a captcha.
I have no idea how to proceed with the captcha part.
Can someone walk me through this?
1. Access the website
2. Download the captcha image (?)
3. OCR it
Any help is appreciated.

Downloading a file from the web can be done with the Microsoft.XMLHTTP object:
Dim myURL As String
myURL = "http://www.somesite.com/captcha.png"
Dim WinHttpReq As Object
Dim oStream As Object
Set WinHttpReq = CreateObject("Microsoft.XMLHTTP")
' Synchronous GET: Send returns once the response has arrived
WinHttpReq.Open "GET", myURL, False
WinHttpReq.Send
If WinHttpReq.Status = 200 Then
    ' Write the raw response bytes to disk through a binary stream
    Set oStream = CreateObject("ADODB.Stream")
    oStream.Open
    oStream.Type = 1 ' 1 = adTypeBinary
    oStream.Write WinHttpReq.ResponseBody
    oStream.SaveToFile "C:\captcha.png"
    oStream.Close
End If
Snippet Copied from here
However, to run OCR on the image you just downloaded, you'll need an ActiveX component.
After googling, I came up with this component; I haven't found anything free.

Related

WinHttp.WinHttpRequest.5.1 URL ENCODE

I'm trying to use the QRickit API:
https://qrickit.com/qrickit_apps/qrickit_api.php
to create a QR code for a Google Maps address in VBA.
To do this I have to send an HTTP request like this:
"http://qrickit.com/api/qr.php?d=http://google.com/maps?q=Via+Roma,+1+Milano&qrsize=150&t=p&e=m"
The API documentation says:
*For non-English and special characters, url encode your data first.
The problem is that I cannot manage to pass an encoded address to the API.
If I pass a string such as "Via+Roma" or "Via%20Roma", the URL stored in the generated QR code is always
http://maps.google.com/maps?q=Via Roma, 1 Milano
so the QR code image is created, but phones do not open Google Maps directly.
Can someone help me?
Here's the code:
Public Function f_QRCode(ByVal Address As String, ByVal Destination As String) As Boolean
    On Error GoTo Err_Handler
    Const ApiPath As String = "https://qrickit.com/api/qr.php?d=http://maps.google.com/maps?q="
    Dim WinHttpReq As Object '\\ Object used to download the report
    Dim fic As Integer
    Dim buffer() As Byte
    Dim URL As String
    '\\ Build the URL
    URL = ApiPath + "Via%20Roma%2C%%201%20Milano" + "&qrsize=150&t=p&e=m"
    '\\ Create the connection object
    Set WinHttpReq = CreateObject("WinHttp.WinHttpRequest.5.1")
    WinHttpReq.Open "POST", URL, False
    WinHttpReq.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
    WinHttpReq.send
    If WinHttpReq.Status = 200 Then
        fic = FreeFile
        Open Destination For Binary Lock Read Write As #fic
        buffer = WinHttpReq.responseBody
        Put #fic, , buffer
        Close #fic
        f_QRCode = True
    Else
        MsgBox "Error"
    End If
ExitHere:
    Erase buffer
    Set WinHttpReq = Nothing
    Exit Function
Err_Handler:
    Resume ExitHere
End Function
Their API accepts GET requests, and you're sending a POST.
Try:
URL = ApiPath + "Via%20Roma%2C%201%20Milano" + "&qrsize=150&t=p&e=m"
Set WinHttpReq = CreateObject("WinHttp.WinHttpRequest.5.1")
WinHttpReq.Open "GET", URL, False
WinHttpReq.send
I would add that you might consider using the EncodeURL worksheet function (Excel 2013 and later) for encoding:
Application.WorksheetFunction.EncodeURL("url")
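Where EncodeURL is not available (older Excel versions, or other Office hosts), percent-encoding can be done by hand. The following is only a sketch: UrlEncodeUtf8 and PercentByte are hypothetical helper names, not part of VBA or of the QRickit API, and characters outside the Basic Multilingual Plane (surrogate pairs) are not handled.

```vba
Option Explicit

' Hypothetical helper: percent-encodes everything except the RFC 3986
' unreserved characters (A-Z, a-z, 0-9, "-", ".", "_", "~"), using UTF-8 bytes.
Public Function UrlEncodeUtf8(ByVal rawText As String) As String
    Dim i As Long
    Dim code As Long
    Dim result As String

    For i = 1 To Len(rawText)
        ' AscW returns a signed Integer; mask it to get 0..65535
        code = AscW(Mid$(rawText, i, 1)) And &HFFFF&
        Select Case code
            Case 48 To 57, 65 To 90, 97 To 122, 45, 46, 95, 126
                result = result & ChrW$(code)            ' unreserved: keep as is
            Case Is < &H80
                result = result & PercentByte(code)      ' one UTF-8 byte
            Case Is < &H800
                result = result & PercentByte(&HC0 Or (code \ &H40)) _
                                & PercentByte(&H80 Or (code And &H3F))
            Case Else
                result = result & PercentByte(&HE0 Or (code \ &H1000)) _
                                & PercentByte(&H80 Or ((code \ &H40) And &H3F)) _
                                & PercentByte(&H80 Or (code And &H3F))
        End Select
    Next i
    UrlEncodeUtf8 = result
End Function

' Formats one byte value as "%XX"
Private Function PercentByte(ByVal b As Long) As String
    PercentByte = "%" & Right$("0" & Hex$(b), 2)
End Function
```

With this helper, UrlEncodeUtf8("Via Roma, 1 Milano") gives "Via%20Roma%2C%201%20Milano". If the API decodes its d parameter once before storing it in the QR code (an assumption that would explain the spaces seen in the question), it may help to encode the address first and then encode the whole Google Maps URL again before appending it after "d=".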

Http request VBA response

I'm taking my first steps with HTTP in VBA.
I have already managed to get the cookie value needed to make a request. The problem is that the file I want to download is not in the first response. When I analyse the traffic with HTTP Header Live, the browser receives several responses and only the last one is the file, a PDF that is generated after a query sent by the user. The only thing I get is the first response, which I display with a MsgBox. Can someone help me solve this problem? I have searched the web but haven't found a solution yet.
The code I am using is:
Sub Test()
    ' Late binding does not expose the WinHTTP enum names, so define the option value here
    Const WinHttpRequestOption_EnableRedirects As Long = 6
    Dim WinHttpReq As Object, oStream As Object
    Dim myURL As String ' assign the target URL before running
    Dim x As String, i As Long
    Set WinHttpReq = CreateObject("WinHTTP.WinHTTPrequest.5.1")
    ' First request: read the session cookie from the response headers
    With WinHttpReq
        .Open "POST", myURL, False ', "username", "password"
        .send
        x = .getResponseHeader("Set-Cookie")
        i = InStr(x, ";")
        x = Left(x, i - 1)
        MsgBox x
    End With
    ' Second request: send the cookie back and ask for the file
    With WinHttpReq
        .Open "GET", myURL, False
        .Option(WinHttpRequestOption_EnableRedirects) = True
        .SetRequestHeader "Content-Type", "application/pdf"
        .SetRequestHeader "Accept-Encoding", "gzip, deflate, br"
        .SetRequestHeader "Cookie", x
        .SetRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:62.0) Gecko/20100101 Firefox/62.0"
        .send
    End With
    MsgBox (WinHttpReq.getAllResponseHeaders())
    If WinHttpReq.Status = 200 Then
        Set oStream = CreateObject("ADODB.Stream")
        With oStream
            .Type = 1 ' 1 = adTypeBinary
            .Open
            .Write WinHttpReq.responseBody
            .SaveToFile "C:\Users\xxx\Desktop\file.pdf", 2 ' 2 = overwrite
            .Close
        End With
    End If
    Set WinHttpReq = Nothing
    Set oStream = Nothing
End Sub
Analyzing with Firefox, when I enter the URL I receive two responses with 200 OK. How can I get the second response? The site uses JavaScript that interprets the query I send and returns a PDF file. The file name changes according to the user and the query.
I have now reached the following point. I make a first request to get the cookie, then a second one to get an ETag from a different address. The problem is that when I make the third request to download the file, the filename is apparently generated by the server (Apache?) from the ETag and something else I have not been able to identify. The last five digits of the ETag differ in the filename.
For example:
ETag - 1543932096000
File - 1543932095115.xxxxx.address.com.5245.idp.pdf
Since I don't have the filename, I cannot download the file with an HTTP request.
Can anyone help?

Download file from URL throws status code 406

Hi guys, I have this sub to download a file from a URL, but every time I run it, WinHttpReq.Status contains 406.
Sub DownloadFile()
    Dim myURL As String
    myURL = "https://YourWebSite.com/?your_query_parameters"
    Dim WinHttpReq As Object
    Dim oStream As Object
    Set WinHttpReq = CreateObject("Microsoft.XMLHTTP")
    WinHttpReq.Open "GET", myURL, False, "username", "password"
    WinHttpReq.send
    If WinHttpReq.Status = 200 Then
        Set oStream = CreateObject("ADODB.Stream")
        oStream.Open
        oStream.Type = 1 ' 1 = adTypeBinary
        oStream.Write WinHttpReq.responseBody
        oStream.SaveToFile "C:\file.csv", 2 ' 1 = no overwrite, 2 = overwrite
        oStream.Close
    End If
End Sub
The 406 (Not Acceptable) status code means that the server understood the request but cannot produce a response in any of the formats the client said it accepts. As part of a request, a client sends headers listing the types of data it can use, and a 406 error is returned when none of the representations the server can deliver is in that list.
For example, if you ask the server for a GIF picture but it can only send plain text and PNG pictures, you will receive a 406 status code, meaning the server understood your request but cannot fulfill it.
So you should include an Accept header specifying which media type you want the server to send to your client (your VBA code), and that type must be one the server can actually deliver.
Example of how to send headers:
WinHttpReq.SetRequestHeader "Content-Type", "text/xml;charset=utf-8"
WinHttpReq.SetRequestHeader "Accept", "text/xml"
Of course, nobody can tell you which media type is the correct one, because we don't know your server or which media types it can provide.
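Applied to the sub from the question, this could look like the sketch below. It assumes the server can deliver CSV (suggested only by the .csv file name in the question); the media types in the Accept header are placeholders to adjust. Printing the Content-Type response header is one way to see what the server actually returned.

```vba
Option Explicit

Sub DownloadWithAcceptHeader()
    ' Placeholder URL copied from the question
    Const TargetUrl As String = "https://YourWebSite.com/?your_query_parameters"
    Dim req As Object

    Set req = CreateObject("MSXML2.XMLHTTP")
    req.Open "GET", TargetUrl, False
    ' Assumption: CSV is preferred, anything else is accepted as a fallback
    req.setRequestHeader "Accept", "text/csv, */*;q=0.8"
    req.send

    ' Show the status code and the media type the server chose
    Debug.Print req.Status, req.getResponseHeader("Content-Type")
End Sub
```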

Microsoft.XMLHTTP - how to prevent caching?

I want to retrieve some frequently changing JSON data from a Windows gadget, using the Microsoft.XMLHTTP ActiveXObject. The problem is that it returns a cached version of the page instead of requesting a new one.
I have no control over the server, and I can't use the usual hack of sending an extra parameter, because the server returns an error if I send any parameters.
I've googled this to death and the best information is in this Stack Overflow question, but none of the answers work for me; I haven't been able to find a way to use ServerXMLHTTP from gadget JavaScript. How do I either use ServerXMLHTTP, or prevent caching in some way other than adding a random parameter to the URL?
Try using POST instead of GET:
var request = new ActiveXObject("Microsoft.XMLHTTP");
request.onreadystatechange = callback;
request.open("POST", "server.php", true);
request.setRequestHeader("Content-Type", "application/x-www-form-urlencoded");
request.send();
references:
Prevent Chrome from caching AJAX requests
and my own experience.
Adding a random number worked for me in the context of VBA, e.g.
Randomize ' seed Rnd, otherwise it repeats the same sequence in every session
myURL = "http://acme.com/someapp/rest/someendpointurl?randombit=" & Rnd()
Dim WinHttpReq As Object
Set WinHttpReq = CreateObject("Microsoft.XMLHTTP")
WinHttpReq.Open "GET", myURL, False, "username", "password"
WinHttpReq.send
If WinHttpReq.status = 200 Then
    Set oStream = CreateObject("ADODB.Stream")
    oStream.Open
    oStream.Type = 1 ' 1 = adTypeBinary
    oStream.Write WinHttpReq.responseBody
    oStream.SaveToFile saveToPath, 2 ' 1 = no overwrite, 2 = overwrite
    oStream.Close
End If
Thank you Sean Cull your solution worked for me in the context of VBA.
(shame I'm not allowed to upvote it so reposting here for reference)
Can you not add a random element to the end of your URL ?
your http://acme.com/someapp/rest/someendpointurl?randombit=20171701143500
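When the URL itself cannot be changed, a commonly used alternative is to send request headers that ask for a fresh copy. The sketch below is an assumption-laden illustration, not a guaranteed fix: FetchUncached is a hypothetical function name, and whether the local cache and the server honour these headers depends on the environment.

```vba
Option Explicit

' Hypothetical helper: requests a URL while asking caches not to reuse a stored copy.
Public Function FetchUncached(ByVal targetUrl As String) As String
    Dim req As Object

    Set req = CreateObject("MSXML2.XMLHTTP")
    req.Open "GET", targetUrl, False
    ' Headers that commonly cause a cached response to be bypassed or revalidated
    req.setRequestHeader "Cache-Control", "no-cache"
    req.setRequestHeader "Pragma", "no-cache"
    req.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
    req.send

    ' Returns an empty string unless the server answered 200
    If req.Status = 200 Then FetchUncached = req.responseText
End Function
```

It could be called as FetchUncached("http://acme.com/someapp/rest/someendpointurl"), with no extra query parameter. Another option in VBA is to create "MSXML2.ServerXMLHTTP.6.0" instead, which is built on WinHTTP rather than WinINet and therefore does not use the browser cache.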

How to browse through box.com folders to select a file and download the selected file

I need an Excel macro (VBA) to select a file from box.com by iterating through the existing folders, and I also need to upload a file from my machine to a box.com folder using an Excel macro. I have been searching the net for a long time without success. Please help, or give me some ideas on how to achieve this.
Thanks in advance.
-Edit
I am using the code below to get an authentication token, but I get an error at .send(url). The error message is "The server name or address could not be resolved".
Function getAuthToken() As String
    Dim WinHttpReq As WinHttp.WinHttpRequest
    Dim api_key As String
    Dim strUrl As String
    Dim getTicket As String
    api_key = "{api_key}"
    Set WinHttpReq = New WinHttp.WinHttpRequest
    strUrl = "https://www.box.net/api/1.0/rest?action=get_ticket&api_key=" & api_key
    WinHttpReq.Open Method:="GET", url:=strUrl, async:=False
    WinHttpReq.Send
    getTicket = WinHttpReq.responseText
    Debug.Print getTicket
    getAuthToken = getTicket ' return the raw response to the caller
End Function
Not being a vba expert, I suspect that you'll get more answers if you tag your question with a vba tag. However, some quick scanning around shows that vba can call REST apis by doing something like this:
Dim MyURL As String
Dim objHTTP As Object
MyURL = "http://api.box.com/2.0/folders/0.xml"
Set objHTTP = CreateObject("MSXML2.ServerXMLHTTP")
With objHTTP
    .Open "GET", MyURL, False
    .setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
    .setRequestHeader "Authorization", "BoxAuth api_key=<your api key>&auth_token=<your auth token>"
    .send ' a GET request carries no body
End With
I'll defer to a real VBA expert, but something roughly along these lines should work.
Yeah, this is frustrating. I tried code like Peter's using both WinHttp.WinHttpRequest.5.1 and MSXML2.ServerXMLHTTP and with both I just get a zero-length string back. No error message or anything.
I installed cURL and tested the URL. It works fine there. The script below also works fine with a generic JSON web service, like jsonplaceholder.typicode.com.
All this makes me think that Box.com is receiving the message, detecting it is coming from a non-approved source, and returning nothing . . . probably for security reasons.
Option Explicit
Const URL As String = "https://api.box.com/2.0/folders/0 -H ""Authorization: Bearer MyToken"""
'Const URL As String = "https://jsonplaceholder.typicode.com/posts/1"
Sub Test()
    Dim winHTTP As Object
    ' Set winHTTP = CreateObject("WinHttp.WinHttpRequest.5.1")
    Set winHTTP = CreateObject("MSXML2.ServerXMLHTTP")
    winHTTP.Open "GET", URL
    winHTTP.setRequestHeader "Content-Type", "application/json"
    winHTTP.send
    Debug.Print winHTTP.ResponseText
    If Len(winHTTP.ResponseText) = 0 Then
        MsgBox "blank string returned"
    Else
        Dim objResponse As Object
        Set objResponse = JsonConverter.ParseJson(winHTTP.ResponseText) 'Converter from Tim Hall - https://github.com/VBA-tools/VBA-JSON
    End If
End Sub
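One thing that may be worth checking in the script above: -H is cURL's command-line flag for adding a request header, so inside a VBA string it becomes part of the URL instead of a header. A variant that sends the token as a real Authorization header might look like the sketch below. "MyToken" is the placeholder from the script above, ListBoxRootFolder is a hypothetical name, and whether Box then returns data has not been verified here.

```vba
Option Explicit

Sub ListBoxRootFolder()
    Const FolderUrl As String = "https://api.box.com/2.0/folders/0"
    Const AccessToken As String = "MyToken" ' placeholder, replace with a real token
    Dim req As Object

    Set req = CreateObject("MSXML2.ServerXMLHTTP.6.0")
    req.Open "GET", FolderUrl, False
    ' Equivalent of cURL's  -H "Authorization: Bearer MyToken"
    req.setRequestHeader "Authorization", "Bearer " & AccessToken
    req.send

    ' Print the status code as well, so an error response is not mistaken for "nothing"
    Debug.Print req.Status, req.responseText
End Sub
```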