403 From Google Drive API When Using API Key - kotlin

I've got a Kotlin application that retrieves publicly available PDFs stored on Google Drive. To download the PDFs, I do the following:
import java.io.File
import java.io.FileOutputStream
import java.io.IOException
import java.io.InputStream
import java.net.URL
import java.net.URLConnection

@Throws(IOException::class)
fun download(url: String?, destination: File?) {
    val connection: URLConnection = URL(url).openConnection()
    connection.setConnectTimeout(60000)
    connection.setReadTimeout(60000)
    connection.addRequestProperty("User-Agent", "Mozilla/5.0")
    val output = FileOutputStream(destination, false)
    val buffer = ByteArray(2048)
    var read: Int
    val input: InputStream = connection.getInputStream()
    // Copy the response body to the destination file in 2 KB chunks
    while (input.read(buffer).also { read = it } > -1) output.write(buffer, 0, read)
    output.flush()
    output.close()
    input.close()
}
My url is of the form https://www.googleapis.com/drive/v3/files/${fileId}?key=<MY_KEY>&alt=media.
Google seems to be rejecting requests after it serves about 10 requests. I checked the API usage, and it says I get 20,000 requests per 100 seconds (https://developers.google.com/drive/api/guides/limits). I can see my requests on the API usage chart, so the API key is being recognized. I'm using 10-15 requests then getting the 403. It's not coming back as json, so here is the detailed message:
We're sorry...
... but your computer or network may be sending automated queries. To protect our users, we can't process your request right now.
See Google Help for more information.
I assume I'm missing something obvious. That HTML blob says "your computer or network may be sending automated queries", which is obviously what I'm trying to do.
Do I need to use a different method to pull a couple hundred PDFs from Drive?

You should be using OAuth2 to request that much data, tbh. However, if you insist on using an API key, try adding quotaUser and userIp as part of your request.
Standard Query Parameters
Note: Per-user quotas are always enforced by the Drive API, and the user's identity is determined from the access token passed in the request. The quotaUser and userIp parameters can only be used for anonymous requests against public files.
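As a rough sketch of what adding quotaUser to the request could look like: the snippet below just builds the download URL from the question with an extra quotaUser query parameter. The helper name build_download_url and the "worker-1" value are made up for illustration, and whether this actually avoids the 403 is not something the sketch can guarantee.

```python
from urllib.parse import urlencode

# Endpoint taken from the URL form shown in the question.
DRIVE_FILES_ENDPOINT = "https://www.googleapis.com/drive/v3/files"


def build_download_url(file_id: str, api_key: str, quota_user: str) -> str:
    """Build an anonymous download URL that also carries a quotaUser value.

    quota_user is an arbitrary string chosen by the caller (hypothetical here).
    """
    query = urlencode({"key": api_key, "alt": "media", "quotaUser": quota_user})
    return f"{DRIVE_FILES_ENDPOINT}/{file_id}?{query}"


if __name__ == "__main__":
    print(build_download_url("FILE_ID", "MY_KEY", "worker-1"))
```

The resulting string can be passed to the existing download function unchanged.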
If all the files are in the same directory you could use a service account and not have to worry about this error.
API keys vs. OAuth tokens.
An API key is created in the Google Cloud console. API keys are used to access public API endpoints only. They identify your application to Google and no more; you can access only public data, not private user data. How to create an api key
Access token + refresh token. These are the results of an OAuth2 authorization request by a user. Access tokens are short-lived: they work for an hour and then expire. They give you access to a user's data when you send an Authorization header containing the access token along with your request for data. Refresh tokens are long-lived and can be used to request a new access token on behalf of the user when the one you have has expired. Understand Oauth2 with curl
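To make the "authorization header with the access token" part concrete, here is a minimal sketch that attaches a bearer token to a request object without sending it. The helper name authorized_request is hypothetical, and obtaining the token (the OAuth2 flow itself) is assumed to have happened elsewhere.

```python
from urllib.request import Request


def authorized_request(url: str, access_token: str) -> Request:
    """Return a request that carries the access token as a Bearer credential."""
    request = Request(url)
    request.add_header("Authorization", f"Bearer {access_token}")
    return request


if __name__ == "__main__":
    req = authorized_request(
        "https://www.googleapis.com/drive/v3/files/FILE_ID?alt=media",
        "ACCESS_TOKEN",  # placeholder, not a real token
    )
    print(req.get_header("Authorization"))
```

When the token expires after its hour, the same request would be rebuilt with a fresh token obtained via the refresh token.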

Related

How to use the watchlist ID to access messages via the Stocktwits API?

I am trying to write a python script that collects messages from Stocktwits.
I managed to successfully access the API, with
params = {
    'client_id': 'my_consumer_key_here',
    'response_type': 'code',
    'redirect_uri': 'https://api.stocktwits.com/api/2/oauth/authorize',
    'scope': ['read', 'watch_lists'],
    'prompt': 1,
}
r = requests.get('https://api.stocktwits.com/api/2/oauth/authorize',params)
r.status_code # 200
The messages should be accessed via a watchlist (which I created on my account), like this
r = requests.get('https://api.stocktwits.com/api/2/streams/watchlist/<watchlist_id>.json',params)
To get at the watchlist-id, which I couldn't find on my account, I tried:
r = requests.get('https://api.stocktwits.com/api/2/watchlists.json',params)
r.status_code # 401
This should list the watchlists and their IDs. The '401' indicates that authorization for some reason doesn't work here the same way as above.
I changed the redirect_uri to 'https://api.stocktwits.com/api/2/watchlists.json' as well, frankly I'm not sure what it's good for.
The problem is likely that the authorization process is geared towards having many possible users accessing my app and via my app the API. What I want to do is much simpler.
I have never worked with an API before and most of the documentation is pretty opaque to me. So my question is whether I'm doing something obviously wrong and how I should proceed to get the watchlist_id and retrieve the messages.
You should add an access token to your requests for secured resources, like
https://api.stocktwits.com/api/2/streams/watchlists.json?access_token=<access_token>
Alternatively, you can add an authorization header to each request for protected resources.
You get the access_token from your authorize request:
r = requests.get('https://api.stocktwits.com/api/2/oauth/authorize',params)
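A small sketch of the query-parameter variant, assuming you already hold an access token: the helper below (with_access_token, a made-up name) appends access_token to whatever URL you give it while preserving any existing query string. It does not perform the authorization step itself.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_access_token(url: str, access_token: str) -> str:
    """Return url with an access_token query parameter appended."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query)
    query.append(("access_token", access_token))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    print(with_access_token(
        "https://api.stocktwits.com/api/2/watchlists.json",
        "ACCESS_TOKEN",  # placeholder value
    ))
```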

Using Sessions vs Tokens for API authentication

I have built a simple test API for a CakePHP application that will let a user login from a mobile device (or any device for that matter) and get a JSON response. This API could be used for a mobile app built in PhoneGap.
The login method looks like so:
public function login()
{
    if($this->request->is('post'))
    {
        // Use custom method in Model to find record with password params
        $findUser = $this->User->findUser(
            $_POST['username_or_email'],
            AuthComponent::password($_POST['password'])
        );
        // If a user exists and matches params
        if($findUser)
        {
            $this->User->id = $findUser['User']['id'];
            $this->autoRender = false;
            $this->response->type('json');
            $this->response->body(json_encode(array('authenticated'=>true,'message'=>__('You have been logged in successfully'))));
        }
        else
        {
            $this->autoRender = false;
            $this->response->type('json');
            $this->response->body(json_encode(array('authenticated'=>false,'message'=>__('Username or password is incorrect'))));
        }
    }
    else
    {
        $this->autoRender = false;
        $this->response->type('json');
        $this->response->body(json_encode(array('message'=>'GET request not allowed!')));
    }
}
The mobile device (or any API user) can send their login details, and they get a JSON response with authenticated set to true or false. This boolean is NOT used to give the user access; it only tells the mobile app whether they can see certain screens. They ONLY get data or can send data if the session exists!
As just stated, they are also actually logged into the API itself on the device so if they visit the website directly (from that device) they will have a session and see the same response for the JSON.
So essentially a user remains logged in for the duration of the session on the device they used to communicate with the server. This is different from a token, which would need to be passed with every request, whereas in this example they have a session.
Now the questions...
1. Is it bad practice for the user to be 'actually' logged into the API with a session like shown above? It seems like the most secure way to handle authentication for a device as it's using the same logic as the direct web root.
2. I've seen some APIs use access tokens instead which I've also implemented (user gets their token returned instead of the boolean and no session is created). But from what I can tell, this seems like more work as then I need to check for the access token against a user record every time a request is made.
edit
For the sake of clarity, I am not a supporter of REST, I AM a supporter of RESTful/RESTlike services. If you look at all of the APIs on the internet, very few actually stick to one standard. Whatever scheme you choose will depend on your specific problem-space. Just try to be secure and use intuitive design choices (i.e. don't name a service "cats" if it returns info about "dogs").
end edit
It is good practice in RESTful API's to manage some form of session/tokenizing scheme. Really the ideal (at least in my opinion, there are many schools of thought on this problem) setup involves rolling tokens.
If you are at all concerned with the security of your API, then permissions should be managed out of your database layer. Yes, this creates a bottleneck, BUT THAT IS ACTUALLY A GOOD THING. Needing to hit the database every single time to validate a client's token adds an extra step in the entire process. This slows down the API, which is actually desirable in a secure system. You don't want a malicious individual to be able to hit your API 3000 times a second, you want their requests to hang for a (somewhat) sizeable fraction of a second.
This is similar to how password-hashing schemes such as bcrypt or PBKDF2 work (a single plain MD5 pass, by contrast, is fast). They deliberately recompute the hash many times so that every guess is expensive. This helps to keep a malicious client from attempting to brute force a password (by making it take more time to test each variation of the password string). The same applies to your API.
The other benefit, is that if you DO have a malicious user trying to log in over and over again, if you are managing them from the database layer, then you can red flag their IP Address/username/what-have-you and just drop their requests at step 1.
Anyway, for a suggested process (with rolling tokens, you can cut out parts of this if it seems overkill, but this is hella secure):
1. User hits a 'login' service. This requires a username/password and returns two tokens, a Private Access Token and a Public Request Token (the server stores these tokens in the db).
2. The client stores these tokens in a secure place.
3. User accesses another endpoint to push/pull some data:
   - Request includes a timestamp
   - Request includes the Public Request Token
   - Request includes an Access Token => This token should be an MD5 hash of the string resulting from concatenating the timestamp string to the end of the Private Access Token string
4. The server takes the Public Request Token and uses it to look up the Private Access Token that was stored.
5. The server takes that Private Access Token, concatenates the timestamp string onto it, and takes the MD5 of this string.
6. If the new Access Token matches the one that the client sent the server, HURRAY, this client is validated, so push/pull the data.
7. (Optional) The server generates new tokens on every request, and returns them to the client. This way every transaction invalidates the old tokens, and if there was some kind of man-in-the-middle attack occurring, if the VALID user has already completed their request, the malicious user now has invalid tokens and can't start messing with your API. This scheme tries to ensure that a malicious user cannot expect to intercept a single communication between the server and the client and still gain access to the system. If they do, then the REAL user should immediately get invalidated tokens, which should then trigger their API client to hit the 'login' service AGAIN, getting new valid tokens. This once again kicks the malicious user out of the system.
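The steps above could be sketched roughly like this. It is only an illustration under a few assumptions: an in-memory dict stands in for the database, the function names are invented, and the freshness window (max_age_seconds) is an addition that the steps do not spell out. MD5 is used only because that is what the steps describe; an HMAC over SHA-256 would be the more common choice today.

```python
import hashlib
import hmac
import secrets
import time

# Stand-in for the database: Public Request Token -> Private Access Token.
TOKENS = {}


def issue_tokens():
    """'login' step: create and store a (public, private) token pair."""
    public_token = secrets.token_hex(16)
    private_token = secrets.token_hex(32)
    TOKENS[public_token] = private_token
    return public_token, private_token


def sign(private_token, timestamp):
    """Client side: MD5 of the private token with the timestamp appended."""
    return hashlib.md5((private_token + timestamp).encode("utf-8")).hexdigest()


def verify(public_token, timestamp, access_token, max_age_seconds=300):
    """Server side: look up the private token, recompute the hash, compare."""
    private_token = TOKENS.get(public_token)
    if private_token is None:
        return False
    try:
        age = abs(time.time() - float(timestamp))
    except ValueError:
        return False
    if age > max_age_seconds:  # assumed replay window, not part of the steps
        return False
    expected = sign(private_token, timestamp)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, access_token)


if __name__ == "__main__":
    public, private = issue_tokens()
    ts = str(int(time.time()))
    print(verify(public, ts, sign(private, ts)))
```

The optional rolling step would amount to calling issue_tokens() again after a successful verify, deleting the old entry, and returning the new pair in the response.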
This scheme is not 100% secure, no user access system ever will be. It can be made more secure by adding expiration dates on tokens. This scheme also has the added benefit that you can assign specific permissions to users/tokens (ie Read-Only access, only certain End-Points can be seen, etc)
This is not the only way you can do things, I would look up other Authentication Schemes and take what you want from each of them (OAUTH is a good place to start, then I'd look at Facebook/Twitter/Instagram)
Make your app log in every time, but not with a login/password pair as Swayok suggested. When you log in, the server generates a token and returns it to the client. The client then uses this token whenever it makes a request. On each request, the server checks whether the token is valid and, if so, executes the request.
This is very similar to how sessions work, in that server-side frameworks manage them internally and these tokens expire from time to time. However, as Swayok rightfully pointed out, you don't want a session, mainly because your RESTful API should have no state. You get the same utility without storing any user-specific data on the server and without logging the user in with every request.
Here's a good article on this, or you can try the Facebook Graph API explorer to see it in action
A RESTful API should not use sessions or keep any state on the server at all. Each request must authenticate the user.
Access tokens are great but also require additional handling.
The easiest way is to send authorisation data via HTTP Basic Auth ("Authorization" HTTP header)
http://www.httpwatch.com/httpgallery/authentication/
Mobile Applications can easily do that and it is easy to add this header for each request to API.
On server side:
$username = env('PHP_AUTH_USER');
$password = env('PHP_AUTH_PW');
And process user log-in with this data in ApiAppController->beforeFilter()
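For the client side, a minimal sketch of building that header by hand (most HTTP libraries can do this for you; the helper name basic_auth_header is made up here):

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Return the Authorization header for HTTP Basic Auth."""
    credentials = f"{username}:{password}".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {encoded}"}


if __name__ == "__main__":
    print(basic_auth_header("Aladdin", "open sesame"))
```

Note that Basic Auth only base64-encodes the credentials, so it should be used over HTTPS.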
To answer your questions
It's not a bad practice as long as you close their session when the app closes and recreate it when needed. It is the same as being logged in on a browser, where they would know they are logged in and have the facility to log out; the same should be available in the app as well, otherwise they might have closed the app without actually ending their session. You can handle this in several ways, for example by asking them to log out or by checking automatically when they close the app.
Tokens are an enhanced way of doing the above; however, you have to consider how secure the token is in transit, and the server needs to verify the token on each request. You said it seems like more work, and yes, it is more work. If you have time or money constraints and are asking whether the session style would harm your application in the future: it won't, as long as you stay in control of the session and don't leave the user without a way to end it. If you have the time, implement tokens; you will probably prefer them.

BigQuery Simple Api Authentication

I am trying to gain access to my BigQuery enabled Google API project using the .net Google APIs.
Using a console application, I am trying to authenticate first by supplying my simple API key in the URI, then just trying to get the list of projects.
The error I am receiving when I call Fetch() on the project list is: Login Required [401]
var bigqueryService = new BigqueryService{ Key = "MY-API_KEY" };
var projectList = bigqueryService.Projects.List().Fetch();
I am purposefully not using OAuth2 as we don't need any user data.
The API key simply identifies your app to the API console for quota purposes and other housekeeping. It's not authoritative for accessing BigQuery, as we do consider the BigQuery data as "user data."
If you're just trying to get an OAuth 2 access token for playing around quickly, you can use the OAuth 2 playground:
https://code.google.com/oauthplayground/
This token will be valid for one hour and can be copied/pasted as the access_token query parameter
Here's the scope for BigQuery to use in the playground:
https://www.googleapis.com/auth/bigquery
In the end, you'll either want to use the native client (out of band) flow:
https://developers.google.com/accounts/docs/OAuth2InstalledApp
Or the server-to-server (service accounts) flow:
https://developers.google.com/accounts/docs/OAuth2ServiceAccount
I don't have quick samples handy for those in .NET, but post another question on SO if you can't find them-- I'm sure someone will chip in!
You won't be able to use a simple API key - all authorization to the BigQuery API must happen via user interaction, or alternatively through a service account.

How to hide real URL with Google Cloud Storage?

Scenario: I place some files on Google web storage.
And I want only paid users to be able to download these files. So my question is: how do I hide the file location from paid users, to prevent them from sharing the URL with unpaid users?
So, is there a way to hide the real file location? Single-use or time-restricted URLs or any other?
Maybe hiding the URL is possible with other CDN providers, such as Microsoft Azure Storage or Amazon S3?
Amazon S3 provides query string authentication (usually referred to as pre-signed URLs) for this purpose, see Using Query String Authentication:
Query string authentication is useful for giving HTTP or browser
access to resources that would normally require authentication. The
signature in the query string secures the request. Query string
authentication requests require an expiration date. [...]
All AWS Software Development Kits (SDKs) provide support for this, here is an example using the GetPreSignedUrlRequest Class from the AWS SDK for .NET, generating a pre-signed URL expiring 42 minutes from now:
using (var s3Client = AWSClientFactory.CreateAmazonS3Client("AccessKey", "SecretKey"))
{
    GetPreSignedUrlRequest request = new GetPreSignedUrlRequest()
        .WithBucketName("BucketName")
        .WithKey("Key")
        .WithProtocol(Protocol.HTTP)
        .WithExpires(DateTime.Now.AddMinutes(42));
    string url = s3Client.GetPreSignedURL(request);
}
Azure Storage has the concept of a Shared Access Signature. It's basically the URL for a BLOB (file) with parameters that limit access. I believe it's nearly identical to the Amazon S3 query string authentication mentioned in Steffen Opel's answer.
Microsoft provides a .NET library for handling Shared Access Signatures. They also provide the documentation you would need to roll your own library.
You can use Signed URLs in Google Cloud Storage to do this:
https://developers.google.com/storage/docs/accesscontrol#Signed-URLs
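To show the general idea behind time-restricted signed URLs, here is a generic sketch. It is NOT the actual signing algorithm used by Google Cloud Storage, S3, or Azure; the parameter names (expires, signature), the message layout, and the helper names are all assumptions for illustration. The point is only that the server signs the URL plus an expiry time with a secret, and later rejects requests that are expired or whose signature does not match.

```python
import hashlib
import hmac
import time
from urllib.parse import urlencode


def _signature(base_url: str, secret: bytes, expires_at: int) -> str:
    # Sign the URL together with its expiry so neither can be altered.
    message = f"{base_url}\n{expires_at}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def sign_url(base_url: str, secret: bytes, expires_at: int) -> str:
    """Return base_url with expiry and signature query parameters attached."""
    query = urlencode({
        "expires": expires_at,
        "signature": _signature(base_url, secret, expires_at),
    })
    return f"{base_url}?{query}"


def is_valid(base_url, secret, expires_at, signature, now=None) -> bool:
    """Check that the link has not expired and the signature matches."""
    now = time.time() if now is None else now
    if now > expires_at:
        return False
    return hmac.compare_digest(_signature(base_url, secret, expires_at), signature)


if __name__ == "__main__":
    secret = b"server-side-secret"  # hypothetical key, never sent to clients
    print(sign_url("https://example.com/files/report.pdf", secret, int(time.time()) + 600))
```

A paid user can still share such a link while it is valid, so the expiry should be kept short.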
One way would be to create a Google Group containing only your paid users. Then, for the objects of interest, grant read permission to the group's email address (via the object's Access Control List). With that arrangement, only your paid members will be able to download these protected objects. If someone outside that group tries to access the URL, they'll get an access denied error.
After you set this up, you'll be able to control who can access your objects by editing your group membership, without needing to mess with object ACLs.
Here's an alternative that truly hides the S3 URL. Instead of creating a query string authenticated URL that has a limited viability, this approach takes a user's request, authorizes the user, fetches the S3 data, and finally returns the data to the requestor.
The advantage of this approach is that the user has no way of knowing the S3 URL and cannot pass the URL along to anyone else, as is the case in the query string authenticated URL during its validity period. The disadvantages to this approach are: 1) there is an extra intermediary in the middle of the S3 "get", and 2) it's possible that extra bandwidth charges will be incurred, depending on where the S3 data physically resides.
public void streamContent( User requestor, String contentFilename, OutputStream outputStream ) throws Exception {
    // is the requestor entitled to this content?
    Boolean isAuthorized = authorizeUser( requestor, contentFilename );
    if( isAuthorized ) {
        AWSCredentials myCredentials = new BasicAWSCredentials( s3accessKey, s3secretKey );
        AmazonS3 s3 = new AmazonS3Client( myCredentials );
        S3Object object = s3.getObject( s3bucketName, contentFilename );
        // stream the S3 object through the server to the requestor
        FileCopyUtils.copy( object.getObjectContent(), outputStream );
    }
}

How to identify a Google OAuth2 user?

I used Facebook login to identify users. When a new user comes, I store their userID in my database. Next time they come, I recognized their Facebook ID and I know which user it is in my database.
Now I am trying to do the same with Google's OAuth2, but how can I recognize the users?
Google sends me several codes and tokens (access_token, id_token, refresh_token), however none of them are constant. Meaning if I log out and log back in 2 minutes later, all 3 values have changed. How can I uniquely identify the user?
I am using their PHP client library: https://code.google.com/p/google-api-php-client/
As others have mentioned, you can send a GET to https://www.googleapis.com/oauth2/v3/userinfo, using the OAuth2 bearer token you just received, and you will get a response with some information about the user (id, name, etc.).
It's also worth mentioning that Google implements OpenID Connect and that this user info endpoint is just one part of it.
OpenID Connect is an authentication layer on top of OAuth2. When exchanging an authorization code at Google's token endpoint, you get an access token (the access_token parameter) as well as an OpenID Connect ID token (the id_token parameter).
The ID token is a JWT (JSON Web Token, https://datatracker.ietf.org/doc/html/draft-ietf-oauth-json-web-token); Google's access tokens, as far as I know, are opaque strings rather than JWTs.
If you decode the ID token, you'll get some assertions, including the id of the user. If you link this ID to a user in your DB, you can immediately identify them without having to do an extra userinfo GET (saves time).
As mentioned in the comments, these tokens are signed with Google's private key and you may want to verify the signature using Google's public key (https://www.googleapis.com/oauth2/v3/certs) to make sure they are authentic.
You can see what's in a JWT by pasting it at https://jwt.io/ (scroll down for the JWT debugger). The assertions look something like:
{
"iss":"accounts.google.com",
"id":"1625346125341653",
"cid":"8932346534566-hoaf42fgdfgie1lm5nnl5675g7f167ovk8.apps.googleusercontent.com",
"aud":"8932346534566-hoaf42fgdfgie1lm5nnl5675g7f167ovk8.apps.googleusercontent.com",
"token_hash":"WQfLjdG1mDJHgJutmkjhKDCdA",
"iat":1567923785,
"exp":1350926995
}
There are also libraries for various programming languages to programmatically decode JWTs.
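If you just want to peek at the assertions without a library, a minimal sketch looks like the following. It only base64url-decodes the middle segment and does NOT verify the signature, so it is fine for inspection but not for trusting the contents; the helper name decode_jwt_payload is made up.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload (second segment) of a JWT without verifying it."""
    _header_b64, payload_b64, _signature_b64 = token.split(".")
    # JWT segments drop base64 padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


if __name__ == "__main__":
    def _segment(obj) -> str:
        raw = json.dumps(obj).encode("utf-8")
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")

    # Fabricated token purely for demonstration; the signature part is fake.
    demo = ".".join([_segment({"alg": "none"}), _segment({"iss": "accounts.google.com"}), "sig"])
    print(decode_jwt_payload(demo))
```

For anything security-relevant, use a maintained JWT library that also checks the signature, issuer, audience, and expiry.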
PS: to get an up-to-date list of URLs and features supported by Google's OpenID Connect provider, you can check this URL: https://accounts.google.com/.well-known/openid-configuration.
I inserted this method into google-api-php-client/src/apiClient.php:
public function getUserInfo()
{
    $req = new apiHttpRequest('https://www.googleapis.com/oauth2/v1/userinfo');
    // XXX error handling missing, this is just a rough draft
    $req = $this->auth->sign($req);
    $resp = $this->io->makeRequest($req)->getResponseBody();
    return json_decode($resp, 1);
}
Now I can call:
$client->setAccessToken($_SESSION[ 'token' ]);
$userinfo = $client->getUserInfo();
It returns an array like this (plus e-mail if that scope has been requested):
Array
(
[id] => 1045636599999999999
[name] => Tim Strehle
[given_name] => Tim
[family_name] => Strehle
[locale] => de
)
The solution originated from this thread: https://groups.google.com/forum/#!msg/google-api-php-client/o1BRsQ9NvUQ/xa532MxegFIJ
It should be mentioned that the OpenID Connect API no longer returns an id attribute.
It's now the sub attribute which serves as a unique user identification.
See Google Dev OpenID Connect UserInfo
"Who is this?" is essentially a service; you have to request access to it as a scope and then make a request to the Google profile resource server to get the identity. See OAuth 2.0 for Login for the details.
Although JWTs can be validated locally with the public key (the Google APIs Client Library downloads and caches the public keys automatically), checking the token on Google's side via the https://www.googleapis.com/oauth2/v1/tokeninfo endpoint is necessary to find out whether access for the application has been revoked since the token was created.
Java version
OAuth2Sample.java