Query data from Google Sheets-based table in BigQuery via API using service account - google-bigquery

I can fetch data from native BigQuery tables using a service account.
However, I encounter an error when attempting to select from a Google Sheets-based table in BigQuery using the same service account.
from google.cloud import bigquery
client = bigquery.Client.from_service_account_json(
    json_credentials_path='creds.json',
    project='xxx',
)
# this works fine
print('test basic query: select 1')
job = client.run_sync_query('select 1')
job.run()
print('results:', list(job.fetch_data()))
print('-'*50)
# this breaks
print('attempting to fetch from sheets-based BQ table')
job2 = client.run_sync_query('select * from testing.asdf')
job2.run()
The output:
⚡ ~/Desktop ⚡ python3 bq_test.py
test basic query: select 1
results: [(1,)]
--------------------------------------------------
attempting to fetch from sheets-based BQ table
Traceback (most recent call last):
File "bq_test.py", line 16, in <module>
job2.run()
File "/usr/local/lib/python3.6/site-packages/google/cloud/bigquery/query.py", line 381, in run
method='POST', path=path, data=self._build_resource())
File "/usr/local/lib/python3.6/site-packages/google/cloud/_http.py", line 293, in api_request
raise exceptions.from_http_response(response)
google.cloud.exceptions.Forbidden: 403 POST https://www.googleapis.com/bigquery/v2/projects/warby-parker-1348/queries: Access Denied: BigQuery BigQuery: No OAuth token with Google Drive scope was found.
I've attempted to use oauth2client.service_account.ServiceAccountCredentials for explicitly defining scopes, including a scope for drive, but I get the following error when attempting to do so:
ValueError: This library only supports credentials from google-auth-library-python. See https://google-cloud-python.readthedocs.io/en/latest/core/auth.html for help on authentication with this library.
My understanding is that auth is handled via IAM now, but I don't see any roles to apply to this service account that have anything to do with drive.
How can I select from a sheets-backed table using the BigQuery python client?

I've run into the same issue and figured out how to solve it.
Exploring the google.cloud.bigquery.Client class, there is a class-level tuple SCOPE that is not updated by any argument or by any Credentials object, so its default value persists into the clients that use it.
To solve this, you can simply add a new scope URL to the google.cloud.bigquery.Client.SCOPE tuple.
In the following code I add the Google Drive scope to it:
from google.cloud import bigquery

# Add any scopes needed onto this scopes tuple.
# Note the trailing comma: without it, scopes would be a plain string and the
# += below would fail, since Client.SCOPE is a tuple.
scopes = (
    'https://www.googleapis.com/auth/drive',
)
bigquery.Client.SCOPE += scopes

client = bigquery.Client.from_service_account_json(
    json_credentials_path='/path/to/your/credentials.json',
    project='your_project_name',
)
With the code above you'll be able to query data from Sheets-based tables in BigQuery.
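For instance, the sheets-backed query from the question should then go through, using the same run_sync_query calls as the original snippet:
job = client.run_sync_query('select * from testing.asdf')
job.run()
print('results:', list(job.fetch_data()))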
Hope it helps!

I think you're right that you need to pass the Drive scope when authenticating. The scopes are passed here: https://github.com/GoogleCloudPlatform/google-cloud-python/blob/master/core/google/cloud/client.py#L126, and it seems the BigQuery client lacks the Drive scope: https://github.com/GoogleCloudPlatform/google-cloud-python/blob/master/bigquery/google/cloud/bigquery/client.py#L117. I suggest asking on GitHub. As a workaround, you can override the client credentials to include the Drive scope, but you'll need to use google.auth credentials from GoogleCloudPlatform/google-auth-library-python instead of oauth2client, as the error message suggests.
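For example, a minimal sketch of that workaround with google-auth (the key path and project ID are the placeholders from the question):
from google.cloud import bigquery
from google.oauth2 import service_account

# Build credentials that carry both the BigQuery and the Drive scope.
credentials = service_account.Credentials.from_service_account_file(
    'creds.json',  # placeholder, as in the question
    scopes=[
        'https://www.googleapis.com/auth/bigquery',
        'https://www.googleapis.com/auth/drive',
    ],
)
client = bigquery.Client(project='xxx', credentials=credentials)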

Related

Using a service account and JSON key which is sent to you to upload data into google cloud storage

I wrote a python script that uploads files from a local folder into Google cloud storage.
I also created a service account with sufficient permission and tested it on my computer using that service account JSON key and it worked.
Now I send the code and JSON key to someone else to run but the authentication fails on her side.
Are we missing any authentication through GCP UI?
import shutil
import subprocess

from google.cloud import storage


def config_gcloud():
    subprocess.run(
        [
            shutil.which("gcloud"),
            "auth",
            "activate-service-account",
            "--key-file",
            CREDENTIALS_LOCATION,
        ]
    )
    storage_client = storage.Client.from_service_account_json(CREDENTIALS_LOCATION)
    return storage_client


def file_upload(bucket, source, destination):
    storage_client = config_gcloud()
    ...
The error happens in config_gcloud and it says it is expecting str, path, ... but gets NoneType.
As I said, the code is fine and works on my computer. How can another person use it with the JSON key I sent her? She stored the JSON locally and the path to it is in the code.
CREDENTIALS_LOCATION is None instead of the correct path, hence it complaining about it being NoneType instead of str|Path.
Also you don't need that gcloud call, that would only matter for gcloud/gsutil commands, not python client stuff.
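A minimal sketch of how the script could guard against that, assuming the key path comes from an environment variable (the variable name GCS_KEY_PATH is hypothetical):
import os

from google.cloud import storage

# Hypothetical: read the key path from an environment variable so each machine
# can point at its own copy of the JSON key, and fail fast if it is missing.
CREDENTIALS_LOCATION = os.environ.get("GCS_KEY_PATH")
if not CREDENTIALS_LOCATION or not os.path.exists(CREDENTIALS_LOCATION):
    raise RuntimeError("Set GCS_KEY_PATH to the path of the service-account JSON key")

storage_client = storage.Client.from_service_account_json(CREDENTIALS_LOCATION)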
And please post the actual stack trace of the error next time, not just a paraphrased version of it.

Auth0. How to retrieve over 1000 users (and make this call via a python script than be run as a cron job)

I am trying to use Auth0 to get a list of users when my user list is >1000 (approx 2000)
So I understand a bit better now how this works after following the steps at:
https://auth0.com/docs/manage-users/user-migration/bulk-user-exports
There are three steps:
Use a POST call to the https://MY_DOMAIN/oauth/token endpoint to get an auth token (done)
Then take this token and insert it into the next POST call to the endpoint: https://MY_DOMAIN/api/v2/jobs/users-exports
Then take the job_id and insert it into the 3rd GET call to the endpoint: https://MY_DOMAIN/api/v2/jobs/MY_JOB_ID
But this just gives me a link to a document that I download. Essentially it is the same end result as using the User Import / Export extension.
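For reference, the three calls above look roughly like this with the requests library (a sketch under assumptions: MY_DOMAIN, the client credentials, and the connection_id are placeholders, and the export options may need adjusting for your tenant):
import time
import requests

DOMAIN = "MY_DOMAIN"  # placeholder tenant domain

# 1) Get a Management API token via the client-credentials grant.
token = requests.post(
    f"https://{DOMAIN}/oauth/token",
    json={
        "grant_type": "client_credentials",
        "client_id": "CLIENT_ID",          # placeholder
        "client_secret": "CLIENT_SECRET",  # placeholder
        "audience": f"https://{DOMAIN}/api/v2/",
    },
).json()["access_token"]
headers = {"Authorization": f"Bearer {token}"}

# 2) Start a bulk user-export job.
job = requests.post(
    f"https://{DOMAIN}/api/v2/jobs/users-exports",
    headers=headers,
    json={
        "connection_id": "CONNECTION_ID",  # placeholder
        "format": "json",
        "fields": [{"name": "email"}],
    },
).json()

# 3) Poll the job until it finishes; the result is a link to a file, not a list.
while True:
    status = requests.get(
        f"https://{DOMAIN}/api/v2/jobs/{job['id']}", headers=headers
    ).json()
    if status.get("status") in ("completed", "failed"):
        break
    time.sleep(2)
print(status.get("location"))  # signed URL of the exported document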
This is NOT what I want. I want to be able to call an endpoint and have it return a list of all the users (similar to the Retrieve Users with the Get Users Endpoint). I require it is done this way, so I can write a python script and run it as a cron job.
However, since I have over 1000 users, I am getting the below error when I call the GET /API/v2/users endpoint.
auth0.v3.exceptions.Auth0Error: 400: You can only page through the first 1000 records. See https://auth0.com/docs/users/search/v3/view-search-results-by-page#limitation
Can anyone help? Can this be done all the way I wish it to be?

I want to download bigquery result from my own python code and something wrong with my authentication

I am trying to download BigQuery results from GCP, following the instructions in the GCP documentation on GCP authentication. It tells me to create a service account, which I did; however, the output tells me that this service account has no permission to query the table:
google.api_core.exceptions.Forbidden: 403 Access Denied: Table dbd-sdlc-prod:HKG_NORMALISED.HKG_NORMALISED: User does not have permission to query table dbd-sdlc-prod:HKG_NORMALISED.HKG_NORMALISED.
This reminds me that the table I wished to query was provided by a third party; they granted permission to access these data for my Google account only. I wish to find a way to authenticate with my own account instead of a service account to download the query results. Is that possible, and how can I do it exactly?
Following are the roles for my test service account; I believe I have set them right, with the top role "Owner". Thanks in advance.
from google.cloud import bigquery

bqclient = bigquery.Client()

# Download query results.
query_string = """
SELECT
    Date_Time,
    Price,
    Volume,
    Market_VWAP,
    Qualifiers AS Qualifiers,
    Ex_Cntrb_ID,
    Qualifiers AS TradeCategory
FROM
    `dbd-sdlc-prod.HKG_NORMALISED.HKG_NORMALISED`
WHERE
    RIC = '1606.HK'
    AND (Date_Time BETWEEN TIMESTAMP('2016-07-11 00:00:00.000000') AND
         TIMESTAMP('2016-07-11 23:59:59.999999'))
    AND Type = "Trade"
    AND Volume > 0
    AND Price > 0
"""

dataframe = (
    bqclient.query(query_string)
    .result()
    .to_dataframe(
        # Optionally, explicitly request to use the BigQuery Storage API. As of
        # google-cloud-bigquery version 1.26.0 and above, the BigQuery Storage
        # API is used by default.
        create_bqstorage_client=True,
    )
)
print(dataframe.head())
If you are using the Google Cloud SDK, you can just run gcloud auth login and authenticate with your Google account.
If not, you will have to authenticate as an end user; here are the details and examples of how to do that.
In your code, you will have to add the code to authenticate your application. (Don't forget to do the other steps in the tutorial.)
Your code will be like this:
from google.cloud import bigquery
from google_auth_oauthlib import flow

# --- Authentication
launch_browser = True  # set to False to run the console-based flow instead

appflow = flow.InstalledAppFlow.from_client_secrets_file(
    "client_secrets.json", scopes=["https://www.googleapis.com/auth/bigquery"]
)
if launch_browser:
    appflow.run_local_server()
else:
    appflow.run_console()
credentials = appflow.credentials
project = 'user-project-id'
# ---

bqclient = bigquery.Client(project=project, credentials=credentials)

# Download query results.
query_string = """
SELECT
    Date_Time,
    Price,
    Volume,
    Market_VWAP,
    Qualifiers AS Qualifiers,
    Ex_Cntrb_ID,
    Qualifiers AS TradeCategory
FROM
    `dbd-sdlc-prod.HKG_NORMALISED.HKG_NORMALISED`
WHERE
    RIC = '1606.HK'
    AND (Date_Time BETWEEN TIMESTAMP('2016-07-11 00:00:00.000000') AND
         TIMESTAMP('2016-07-11 23:59:59.999999'))
    AND Type = "Trade"
    AND Volume > 0
    AND Price > 0
"""

dataframe = (
    bqclient.query(query_string)
    .result()
    .to_dataframe(
        # Optionally, explicitly request to use the BigQuery Storage API. As of
        # google-cloud-bigquery version 1.26.0 and above, the BigQuery Storage
        # API is used by default.
        create_bqstorage_client=True,
    )
)
print(dataframe.head())

Credentials Error when integrating Google Drive with BigQuery

I am using Google BigQuery and I want to integrate it with Google Drive. In BigQuery I am providing the Google spreadsheet URL to upload my data, and it updates fine, but when I write the query in the Google add-on (OWOX BI BigQuery Reports):
Select * from [datasetName.TableName]
I am getting an error:
Query failed: tableUnavailable: No suitable credentials found to access Google Drive. Contact the table owner for assistance.
I just faced the same issue in some code I was writing. It might not directly help you here, since it looks like you are not responsible for the code, but it might help someone else, or you can ask the person who does write the code you're using to read this :-)
So I had to do a couple of things:
Enable the Drive API for my Google Cloud Platform project in addition to BigQuery.
Make sure that your BigQuery client is created with both the BigQuery scope AND the Drive scope.
Make sure that the Google Sheets you want BigQuery to access are shared with the "...@appspot.gserviceaccount.com" account that your Google Cloud Platform project identifies itself as.
After that I was able to successfully query the Google Sheets backed tables from BigQuery in my own project.
What was previously said is right:
Make sure that your dataset in BigQuery is also shared with the Service Account you will use to authenticate.
Make sure your Federated Google Sheet is also shared with the service account.
The Drive API should be active as well
When using the OAuth client you need to inject both scopes, for Drive and for BigQuery
If you are writing Python:
With credentials = GoogleCredentials.get_application_default() you can't inject scopes (at least I didn't find a way :D), so build your request from scratch:
from googleapiclient.discovery import build
from httplib2 import Http
from oauth2client.service_account import ServiceAccountCredentials

scopes = (
    'https://www.googleapis.com/auth/drive.readonly',
    'https://www.googleapis.com/auth/cloud-platform',
)
credentials = ServiceAccountCredentials.from_json_keyfile_name(
    '/client_secret.json', scopes)
http = credentials.authorize(Http())

bigquery_service = build('bigquery', 'v2', http=http)
query_request = bigquery_service.jobs()

query_data = {
    'query': (
        'SELECT * FROM [test.federated_sheet]')
}

query_response = query_request.query(
    projectId='hello_world_project',
    body=query_data).execute()

print('Query Results:')
for row in query_response['rows']:
    print('\t'.join(field['v'] for field in row['f']))
This likely has the same root cause as:
BigQuery Credential Problems when Accessing Google Sheets Federated Table
Accessing federated tables in Drive requires additional OAuth scopes and your tool may only be requesting the bigquery scope. Try contacting your vendor to update their application?
If you're using pd.read_gbq() as I was, then this would be the best place to get your answer: https://github.com/pydata/pandas-gbq/issues/161#issuecomment-433993166
import pandas_gbq
import pydata_google_auth
import pydata_google_auth.cache

# Instead of get_user_credentials(), you could do default(), but that may not
# be able to get the right scopes if running on GCE or using credentials from
# the gcloud command-line tool.
credentials = pydata_google_auth.get_user_credentials(
    scopes=[
        'https://www.googleapis.com/auth/drive',
        'https://www.googleapis.com/auth/cloud-platform',
    ],
    # Use reauth to get new credentials if you haven't used the drive scope
    # before. You only have to do this once.
    credentials_cache=pydata_google_auth.cache.REAUTH,
    # Set auth_local_webserver to True to have a slightly more convenient
    # authorization flow. Note, this doesn't work if you're running from a
    # notebook on a remote server, such as with Google Colab.
    auth_local_webserver=True,
)

sql = """SELECT state_name
FROM `my_dataset.us_states_from_google_sheets`
WHERE post_abbr LIKE 'W%'
"""

df = pandas_gbq.read_gbq(
    sql,
    project_id='YOUR-PROJECT-ID',
    credentials=credentials,
    dialect='standard',
)

print(df)

Google Analytics Management API - Insert method - Insufficient permissions HTTP 403

I am trying to add users to my Google Analytics account through the API but the code yields this error:
googleapiclient.errors.HttpError: https://www.googleapis.com/analytics/v3/management/accounts/**accountID**/entityUserLinks?alt=json returned "Insufficient Permission">
I have Admin rights to this account - MANAGE USERS. I can add or delete users through the Google Analytics Interface but not through the API. I have also added the service account email to GA as a user. Scope is set to analytics.manage.users
This is the code snippet I am using in my add_user function which has the same code as that provided in the API documentation.
def add_user(service):
    try:
        service.management().accountUserLinks().insert(
            accountId='XXXXX',
            body={
                'permissions': {
                    'local': [
                        'EDIT',
                    ]
                },
                'userRef': {
                    'email': 'ABC.DEF@gmail.com'
                }
            }
        ).execute()
    except TypeError, error:
        # Handle errors in constructing a query.
        print 'There was an error in constructing your query : %s' % error
        return None
Any help will be appreciated. Thank you!!
The problem was that I was using a service account when I should have been using an installed application. I did not need a service account since I already had access with my own credentials. That did the trick for me!
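For anyone wanting to do the same, here is a minimal sketch of that installed-application flow, assuming an OAuth client secrets file from the Cloud console (the file name and scope choice are assumptions):
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build

# Hypothetical installed-app flow: this authenticates as your own Google
# account rather than a service account, so your existing GA admin rights apply.
flow = InstalledAppFlow.from_client_secrets_file(
    'client_secrets.json',  # OAuth client ID of type "Desktop app" (placeholder)
    scopes=['https://www.googleapis.com/auth/analytics.manage.users'],
)
credentials = flow.run_local_server()

service = build('analytics', 'v3', credentials=credentials)
# ...then call service.management().accountUserLinks().insert(...) as in the question.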
Also remember that you have to specify the scope you would like to use. The example here (a slightly altered version of Google's example) defines by default two scopes which would NOT allow inserting users (as they both give read-only permissions) and would result in "Error 403 Forbidden" when trying to do so.
The required scope is given in the code below:
from apiclient.discovery import build
from googleapiclient.errors import HttpError
from oauth2client.service_account import ServiceAccountCredentials


def get_service(api_name, api_version, scopes, key_file_location):
    """Get a service that communicates to a Google API.

    Args:
        api_name: The name of the api to connect to.
        api_version: The api version to connect to.
        scopes: A list of auth scopes to authorize for the application.
        key_file_location: The path to a valid service account JSON key file.

    Returns:
        A service that is connected to the specified API.
    """
    credentials = ServiceAccountCredentials.from_json_keyfile_name(
        key_file_location, scopes=scopes)

    # Build the service object.
    service = build(api_name, api_version, credentials=credentials)

    return service


def get_first_profile_id(service):
    # Use the Analytics service object to get the first profile id.

    # Get a list of all Google Analytics accounts for this user
    accounts = service.management().accounts().list().execute()

    if accounts.get('items'):
        # Get the first Google Analytics account.
        account = accounts.get('items')[0].get('id')
        # Do something, e.g. get account users & insert new ones
        # ...


def main():
    # Define the auth scopes to request.
    # Add here
    # https://www.googleapis.com/auth/analytics.manage.users
    # to be able to insert users as well:
    scopes = [
        'https://www.googleapis.com/auth/analytics.readonly',
        'https://www.googleapis.com/auth/analytics.manage.users.readonly',
    ]

    key_file_location = 'my_key_file.json'

    # Authenticate and construct service.
    service = get_service(
        api_name='analytics',
        api_version='v3',
        scopes=scopes,
        key_file_location=key_file_location)

    profile_id = get_first_profile_id(service)
    # print_results() and get_results() come from the original Google sample
    # that this snippet was adapted from.
    print_results(get_results(service, profile_id))


if __name__ == '__main__':
    main()
Regards,
HerrB92