I've been using the Google Vision API for a while to extract text from documents (PDFs), but I've just run into an issue. I created a long-running job and now need to check the job status. According to the documentation, the GET request should be:
GET https://vision.googleapis.com/v1/operations/operation-id
However, when I try that I get this response:
{
  "error": {
    "code": 400,
    "message": "Invalid operation id format. Valid format is either projects/*/operations/* or projects/*/locations/*/operations/*",
    "status": "INVALID_ARGUMENT"
  }
}
OK, no problem. Looking through the docs, and according to the error message, I should be able to do either of the following:
https://vision.googleapis.com/v1/projects/project-id/operations/1efec2285bd442df
Or:
https://vision.googleapis.com/v1/projects/project-id/locations/location-id/operations/1efec2285bd442df
My final code is a GET request using PHP cURL, like so:
$url = "https://vision.googleapis.com/v1/projects/myproject-id/operations/longrunningjobid";
// create a curl request
$ch = curl_init($url);
// define the parameters
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Authorization: Bearer $token", "Content-Type: application/json; charset=utf-8"));
// execute the request
$result = curl_exec($ch);
// close the connection
curl_close($ch);
echo $result;
I have tried several combinations of the URL to get this to work. My GCP project ID is correct and the job number is correct, but I feel the URL is not right. Any ideas?
The implementation was correct. However, I was using a regex earlier in the code and didn't realize that in PHP the \n character is escaped differently than in JavaScript.
So in JavaScript I was using \\n to escape it, but in PHP I needed to use \\\n.
This was causing the longrunningjobid to have one character too many.
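The pitfall can be reproduced with a minimal sketch. The operation id is the one from the question; the trailing newline and the cleanup step are hypothetical, just to illustrate how PHP quoting changes what the regex engine sees:

```php
<?php
// Hypothetical: an operation id that picked up a trailing newline is one
// character too long, which breaks the operations URL.
$raw = "1efec2285bd442df\n";   // double quotes: "\n" is a real newline

// In single quotes, '\n' stays as backslash + n, and PCRE then interprets
// that two-character sequence as a newline, so this strips the stray byte:
$jobId = preg_replace('/\n$/', '', $raw);

var_dump($jobId === "1efec2285bd442df"); // bool(true)
var_dump(strlen($raw), strlen($jobId));  // int(17) int(16)
```

A plain rtrim($raw) would achieve the same without a regex, since it strips trailing whitespace including newlines.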
Related
https://developers.google.com/sheets/api/reference/rest/v4/spreadsheets.values/batchUpdateByDataFilter
We are using the above method in our code. When we pass more than 50 or 100 records in the data array, we get a 400 Bad Request in response.
Can anyone tell us the limit on the total number of values we can pass to this method?
Here is my code:
$batchupdate = array("valueInputOption" => "RAW", "data" => $dataarray);
try {
    $requestBody = new Google_Service_Sheets_BatchUpdateValuesByDataFilterRequest($batchupdate);
    $response = $service->spreadsheets_values->batchUpdateByDataFilter($spreadsheetId, $requestBody);
} catch (Exception $e) {
    echo 'Message: ' . $e->getMessage();
}
Troubleshooting:
Problems with the Request
Until you attach a sanitized version of your request body, we cannot be sure about the root cause of the problem you are facing.
However, a 400 error means that the request you made is invalid, so the problem most likely lies there.
Check if your request object is formatted as detailed on the documentation.
Problems with the Client
If you are able to use the Try this API sidebar with the same request body, then the issue could be related to the PHP client.
Note: this check is language-independent. Create a JSON object that has the same structure as your request body.
If that's the case, we will need to see more of your code to verify that you are not using your valid request body in an invalid way (e.g. sending it encapsulated in another object).
By referencing the PHP Library documentation you can see the properties of the objects you can use.
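For reference, here is a minimal sketch of a request body that matches the documented BatchUpdateValuesByDataFilterRequest structure: each entry in "data" is a DataFilterValueRange, i.e. a dataFilter plus the values to write. The range and cell values below are placeholders:

```php
<?php
// Placeholder data: two rows of two columns, addressed by an A1 range.
$dataarray = array(
    array(
        "dataFilter"     => array("a1Range" => "Sheet1!A1:B2"),
        "majorDimension" => "ROWS",
        "values"         => array(
            array("r1c1", "r1c2"),
            array("r2c1", "r2c2"),
        ),
    ),
);

$batchupdate = array("valueInputOption" => "RAW", "data" => $dataarray);

// Serialising the array shows the JSON the API expects; pasting this into
// the "Try this API" sidebar helps separate a malformed request from a
// client-library problem.
echo json_encode($batchupdate, JSON_PRETTY_PRINT);
```

If this shape is accepted in the sidebar but the PHP call still fails, the problem is in how the client is invoked rather than in the data itself.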
Is it possible to get a list of all the photos or photo IDs by a specific user on Unsplash using the Unsplash API?
According to their API documentation it should be possible:
https://unsplash.com/documentation#list-a-users-photos
You can do that by sending an XHR GET request to their API endpoint:
GET /users/:username/photos
In addition, you can use the page and per_page parameters to increase or decrease the number of photos returned per request, and therefore get all the photos of that specific user in one request. I do not see an established hard limit on per_page in their documentation; the default is 10.
Actually, per_page is limited to 30.
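Since per_page is capped at 30, fetching everything means paging. A minimal sketch of the page arithmetic and the request URLs (the username is a placeholder, and actually sending the requests with an Authorization header is omitted here):

```php
<?php
// How many requests are needed to cover every photo, given the cap.
function unsplash_page_count($totalPhotos, $perPage = 30) {
    return (int) ceil($totalPhotos / $perPage);
}

// The URL for one page of a user's photos.
function unsplash_photos_url($username, $page, $perPage = 30) {
    return "https://api.unsplash.com/users/" . rawurlencode($username)
         . "/photos?per_page=$perPage&page=$page";
}

// e.g. an account with 750 photos needs ceil(750 / 30) = 25 requests,
// page=1 through page=25, each sent with an Authorization header.
echo unsplash_page_count(750) . "\n";            // 25
echo unsplash_photos_url('some-user', 1) . "\n";
```

Loop from page 1 through the computed page count and concatenate the decoded JSON arrays to obtain the full photo list.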
Here is my answer (I've been working with this in PowerShell).
I've created a script to get a bearer token, which is described here:
https://github.com/j0rt3g4/MyTechNetScript/blob/master/TechNet/Unsplash/Do-Oauth20forUnsplash.md#Create-an-account-on-unsplash
and here:
https://medium.com/@josegabrielortegac/how-to-fix-capture-one-software-c3e59b2924da
and downloaded here:
https://gallery.technet.microsoft.com/scriptcenter/Get-Bearer-Token-to-use-360f9ae2
or here (because TechNet will be closing soon):
https://github.com/j0rt3g4/MyTechNetScript/blob/master/TechNet/Unsplash/Do-Oauth20forUnsplash.ps1
My solution: after getting the bearer token from the script, you can use it in any app, because that particular token never expires. You just need to add an "Authorization: Bearer <token>" header, where <token> is the value returned by the script. Then work out the number of pages:
total number of pictures / 30 (images per page)
In my case my account currently has 750 pictures, so that gives 25 pages.
and run this:
$headers = New-Object "System.Collections.Generic.Dictionary[[String],[String]]"
$headers.Add("Authorization", "Bearer <bearer token value>")
$obj = @()
for ($i = 1; $i -le 25; $i++) {  # -le (not -lt), so that page 25 is also fetched
    $obj += Invoke-WebRequest -Uri "https://api.unsplash.com/users/j0rt/photos?per_page=30&page=$i&order_by=latest&stats=true" -Headers $headers | Select-Object -ExpandProperty Content | ConvertFrom-Json # | select Downloads,Views,Likes
}
We are creating a REST API and have implemented OAuth 2 using the Yii framework.
We are facing a strange issue: when we send the access token via the Authorization request header field, we get the expected output.
e.g.
curl -i -H "Accept:application/json" -H "Authorization: Bearer XXXXXX"
Whereas when we send the access token via a URI query parameter, we get an "Unauthorized" response.
e.g.
https://server.example.com/resource?access_token=XXXXXX&p=q
Any suggestions would be really helpful.
RFC 6750 (Bearer Token Usage) defines 3 ways to pass an access token to a protected resource endpoint.
Via Authorization header. (2.1. Authorization Request Header Field)
Via a form parameter access_token. (2.2. Form-Encoded Body Parameter)
Via a query parameter access_token. (2.3. URI Query Parameter)
Among the above, only the first way is mandatory for resource servers to support. It seems your resource server does not support the third way.
Addition for the comment:
Below is an example of supporting all three ways in PHP. See "3. Extract Access Token" in "Protected Resource" for details and for equivalent examples in Ruby and Java.
/**
 * Extract an access token from the current request.
 */
function extract_access_token()
{
    // The value of the Authorization header, if present.
    $header = isset($_SERVER['HTTP_AUTHORIZATION']) ? $_SERVER['HTTP_AUTHORIZATION'] : null;

    // If the value is in the format 'Bearer access-token'.
    if ($header != null && preg_match('/^Bearer[ ]+(.+)/i', $header, $captured))
    {
        // Return the token captured from the Authorization header
        // ($captured[0] is the whole match; [1] is the token itself).
        return $captured[1];
    }

    if ($_SERVER['REQUEST_METHOD'] == 'GET')
    {
        // Return the value of the 'access_token' query parameter.
        return $_GET['access_token'];
    }
    else
    {
        // Return the value of the 'access_token' form parameter.
        return $_POST['access_token'];
    }
}
I don't know Yii, but my guess is that the framework simply does not contain code like the above.
More specifically, I am looking at the Web Services of Commission Junction (http://help.cj.com/en/web_services/web_services.htm#Commission_Detail_Service.htm) and the Authorization key is supposed to be part of the "Header" for the request.
Would I be able to send the request with just a URL? For example (using the URI from their website): https://publisher-lookup.api.cj.com/v2/joined-publisher-lookup?Authorization=[developer key]&url=http%3A%2F%2Fwww.cj.com
Also, if anybody is familiar with Pentaho Data Integration v4.3 (PDI or Kettle), help with making this API call using PDI would be much appreciated (that is ultimately what I am trying to achieve).
Thank you!
For Firefox, there is an add-on to make a REST API call with headers: https://addons.mozilla.org/en-US/firefox/addon/modify-headers/
And Commission Junction outlines how to use it: http://www.cj.com/webservices/quick-start-guide
This depends entirely on Commission Junction, since they decide where they expect the key. They may or may not have provided a way to pass it in the URL, but that is determined by the implementation on their side and should be in their docs. It doesn't invalidate the REST pattern per se.
Sounds like you found out how to pass the params in the header anyway - so that's probably the way to go.
You should send the developer key as an HTTP header. Here is example code for Commission Junction (CJ) in PHP:
$ch = curl_init();
// Commissions posted in the last 24 hours.
curl_setopt($ch, CURLOPT_URL, "https://commission-detail.api.cj.com/v3/commissions?date-type=posting&start-date=" . date('Y-m-d', (time() - (24 * 3600))) . "&end-date=" . date('Y-m-d'));
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_HTTPGET, true); // note: the constant is CURLOPT_HTTPGET, not CURLOPT_GET
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); // disables certificate checks; avoid in production
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HTTPHEADER, array("authorization: " . $yourdeveloperkey));
$result = curl_exec($ch);
curl_close($ch);
print_r($result);
I have a bunch of URLs which are currently indexed in Google. Given those URLs, is there a way to figure out when was the last time Google crawled them ?
Manually, if I look the URL up in Google and open the 'cached' link, I can see the date when it was crawled. Is there a way to do this automatically? A Google API of some sort?
Thank you :)
Google doesn't provide an API for this type of data. The best way of tracking last crawled information is to mine your server logs.
In your server logs, you should be able to identify Googlebot by its typical user agent: Mozilla/5.0+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html). Then you can see which URLs Googlebot has crawled, and when.
If you want to be sure it's really Googlebot crawling those pages, you can verify it with a reverse DNS lookup. Bingbot also supports reverse DNS lookups.
If you don't want to manually parse your server logs, you can always use something like splunk or logstash. Both are great log processing platforms.
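Both steps can be sketched in PHP. The log line and its format are hypothetical (common log format with the user agent in the last field is assumed):

```php
<?php
// Step 1: a cheap filter for candidate Googlebot hits in an access log.
function is_googlebot_line($line) {
    return strpos($line, 'Googlebot') !== false;
}

// Step 2: verify that an IP really belongs to Google. The reverse lookup
// must resolve into googlebot.com or google.com, and the forward lookup
// of that hostname must return the same IP (the documented procedure).
function verify_googlebot($ip) {
    $host = gethostbyaddr($ip);
    if (!preg_match('/\.(googlebot|google)\.com$/', $host)) {
        return false;
    }
    return gethostbyname($host) === $ip;
}

// Hypothetical common-log-format line:
$line = '66.249.66.1 - - [10/Oct/2023:13:55:36 +0000] "GET /page HTTP/1.1" '
      . '200 2326 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"';

var_dump(is_googlebot_line($line)); // bool(true)
```

verify_googlebot() performs live DNS lookups, so call it only for distinct IPs and cache the results rather than checking every log line.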
Also note, that the "cached" date in the SERPs doesn't always necessarily match the last crawled date. Googlebot can crawl your pages multiple times after the "cached" date, but not update their cached version. You can think of "cached date" as more of a "last indexed" date, but that's not exactly correct either. In either case, if you ever need to get a page re-indexed, you can always use Google Webmaster Tools (GWT). There's an option in GWT to force Googlebot to re-crawl a page, and also re-index a page. There's a weekly limit of 50 or something like that.
<?php
$domain_name = $_GET["url"];

// Get the date of Googlebot's last access from Google's cache page.
function googlebot_lastaccess($domain_name)
{
    $request = 'http://webcache.googleusercontent.com/search?hl=en&q=cache:' . $domain_name . '&btnG=Google+Search&meta=';
    $data = getPageData($request);
    // The cache header reads "... as it appeared on <date>.<br>".
    $spl = explode("as it appeared on", $data);
    //echo "<pre>".$spl[0]."</pre>";
    if (count($spl) < 2) {
        return 0;
    }
    $spl2 = explode(".<br>", $spl[1]);
    $value = trim($spl2[0]);
    //echo "<pre>".$spl2[0]."</pre>";
    return (strlen($value) == 0) ? 0 : $value;
}

$content = googlebot_lastaccess($domain_name);
$date = substr($content, 0, strpos($content, 'GMT') + strlen('GMT'));
echo "Googlebot last access = " . $date . "<br />";

function getPageData($url) {
    if (function_exists('curl_init')) {
        $ch = curl_init($url); // initialize curl with the given url
        curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']); // add user agent
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // write the response to a variable
        if ((ini_get('open_basedir') == '') && (ini_get('safe_mode') == 'Off')) {
            curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects if any
        }
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5); // max. seconds to connect
        curl_setopt($ch, CURLOPT_FAILONERROR, 1); // stop when an HTTP error is encountered
        return @curl_exec($ch); // '@' suppresses warnings; the original '#' commented the call out
    }
    else {
        return @file_get_contents($url);
    }
}
?>
Just upload this PHP script and create a cron job for it.
You can test it like this: .../bot.php?url=http://www....
You can also check Googlebot's last visit using http://www.gbotvisit.com/