Since today, our Airflow service has not been able to run queries in BigQuery. All jobs fail with the following message:
[2021-03-12 10:17:28,079] {taskinstance.py:1150} ERROR - Reason: 403 GET https://bigquery.googleapis.com/bigquery/v2/projects/waipu-app-prod/queries/e62030d7-36eb-4420-b482-b5327f4f6c7e?maxResults=0&timeoutMs=900&location=EU: Access Denied: BigQuery BigQuery: Permission denied while getting Drive credentials.
We haven't changed anything in recent days, so we are quite puzzled as to what the reason might be. Is this a temporary bug, or are there settings we should check?
Thanks & Best regards
Albrecht
I solved this by:
Giving the Airflow service account's email access to the Google Sheet that the BigQuery table is derived from
Adding https://www.googleapis.com/auth/cloud-platform,https://www.googleapis.com/auth/bigquery,https://www.googleapis.com/auth/drive to the scopes in the Airflow connection
Regenerating the service account JSON keyfile and pasting it into the Keyfile JSON field in the Airflow connection (see the sketch below for a programmatic equivalent)
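For anyone who prefers to manage this in code, here is a minimal sketch of registering such a connection programmatically. The connection id, project id and keyfile path are placeholders, and the exact extra field names can vary between versions of the Google provider:

```python
# Sketch: register a Google Cloud connection that includes the Drive scope.
# Assumes a connection with this id does not already exist.
import json

from airflow import settings
from airflow.models import Connection

extra = {
    "extra__google_cloud_platform__project": "my-gcp-project",          # placeholder
    "extra__google_cloud_platform__key_path": "/path/to/keyfile.json",  # placeholder
    "extra__google_cloud_platform__scope": ",".join([
        "https://www.googleapis.com/auth/cloud-platform",
        "https://www.googleapis.com/auth/bigquery",
        "https://www.googleapis.com/auth/drive",
    ]),
}

conn = Connection(
    conn_id="google_cloud_default",
    conn_type="google_cloud_platform",
    extra=json.dumps(extra),
)

session = settings.Session()
session.add(conn)
session.commit()
```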
Related
I'm attempting to use the Athena-express node module to query Athena.
Per the Athena-express docs:
"This IAM role/user must have AmazonAthenaFullAccess and AmazonS3FullAccess policies attached.
Note: As an alternative to granting AmazonS3FullAccess you could granularize and limit write access to a specific bucket. Just specify this bucket name during athena-express initialization"
Providing AmazonS3FullAccess to this microservice is a non-starter. What is the minimum set of privileges I can grant to the microservice and still get around the "Permission denied on S3 path: s3://..." errors I've been getting?
Currently, I've got the following (also sketched as a policy document after the list):
Output location: (I don't think the problem is here)
s3:AbortMultipartUpload, s3:CreateMultipartUpload, s3:DeleteObject, s3:Get*, s3:List*, s3:PutObject, s3:PutObjectTagging
on "arn:aws:s3:::[my-bucket-name]/tmp/athena" and "arn:aws:s3:::[my-bucket-name]/tmp/athena/*"
Data source location:
s3:GetBucketLocation
on "arn:aws:s3:::*"
s3:ListBucket
on "arn:aws:s3:::[my-bucket-name]"
s3:Get* and s3:List*
on "arn:aws:s3:::[my-bucket-name]/production/[path]/[path]" and "arn:aws:s3:::[my-bucket-name]/production/[path]/[path]/*"
The error message I get with the above is:
"Permission denied on S3 path: s3://[my-bucket-name]/production/[path]/[path]/v1/dt=2022-05-26/.hoodie_partition_metadata"
Any suggestions? Thanks!
It turned out that the bucket storing the data I needed to query was encrypted, which meant that the missing permission was kms:Decrypt.
By default, Athena writes the results of a query to an output location (which athena-express then retrieves). That output location was in the same encrypted bucket, so I also ended up giving my cron job kms:Encrypt and kms:GenerateDataKey.
I ended up using CloudTrail to figure out which permissions were causing my queries to fail.
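In policy terms, the fix amounts to adding a statement along these lines; the key ARN is a placeholder for whichever KMS key encrypts the bucket:

```python
# Additional statement implied by the fix above: let the role use the bucket's KMS key.
kms_statement = {
    "Effect": "Allow",
    "Action": ["kms:Decrypt", "kms:Encrypt", "kms:GenerateDataKey"],
    "Resource": ["arn:aws:kms:eu-west-1:123456789012:key/your-key-id"],  # placeholder
}
```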
I am trying to run a BigQuery query on Airflow with MWAA.
This query uses a table that is based on a Google Sheet. When I run it, I have the following error:
google.api_core.exceptions.Forbidden: 403 Access Denied: BigQuery BigQuery: Permission denied while getting Drive credentials.
I already have a working Google Cloud connection in Airflow with an admin service account.
Also:
This service account has access to the Google Sheet
I added https://www.googleapis.com/auth/drive to the scopes of the Airflow connection
I re-generated the JSON keyfile
Am I doing something wrong? Any idea what I can do to fix this problem?
Thanks a lot
I fixed my issue by creating a NEW Airflow connection. It's a new Google Cloud connection with exactly the same values as the default google_cloud_default connection. Now it works perfectly.
Hope it helps!
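As a minimal sketch, assuming the new connection was saved under a hypothetical id like google_cloud_new, a DAG task would point at it via gcp_conn_id:

```python
# Sketch: run the Sheet-backed query through the newly created connection.
# DAG id, project, dataset and table names are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

with DAG(
    dag_id="sheet_backed_query",
    start_date=datetime(2023, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    query_sheet_table = BigQueryInsertJobOperator(
        task_id="query_sheet_table",
        gcp_conn_id="google_cloud_new",  # hypothetical id of the new connection
        configuration={
            "query": {
                "query": "SELECT * FROM `my-project.my_dataset.sheet_backed_table`",
                "useLegacySql": False,
            }
        },
    )
```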
I'm quite new to the BigQuery world, so apologies if I'm asking a stupid question.
I'm trying to create a scheduled transfer job that imports data into BigQuery from Google Cloud Storage.
Unfortunately I always get the following error message:
Failed to start job for table MyTable with error PERMISSION_DENIED: Access Denied: BigQuery BigQuery: Permission denied while globbing file pattern.
I verified that I already have all the required permissions, but it still isn't working.
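For context, a scheduled transfer of this kind is defined roughly along these lines with the Data Transfer Service Python client; the project, dataset, table and bucket names below are placeholders:

```python
# Sketch: scheduled GCS -> BigQuery transfer via the Data Transfer Service client.
from google.cloud import bigquery_datatransfer

client = bigquery_datatransfer.DataTransferServiceClient()
parent = client.common_project_path("my-project")  # placeholder project id

transfer_config = bigquery_datatransfer.TransferConfig(
    destination_dataset_id="my_dataset",
    display_name="Nightly GCS load",
    data_source_id="google_cloud_storage",
    schedule="every 24 hours",
    params={
        "data_path_template": "gs://my-bucket/exports/*.csv",
        "destination_table_name_template": "MyTable",
        "file_format": "CSV",
        "skip_leading_rows": "1",
    },
)

created = client.create_transfer_config(parent=parent, transfer_config=transfer_config)
print(f"Created transfer config: {created.name}")
```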
I am trying to connect DataGrip to Google BigQuery exactly following the steps on the JetBrains blog (https://blog.jetbrains.com/datagrip/2018/07/10/using-bigquery-from-intellij-based-ide/) but I keep getting the same error message when I test the connection:
The specified database user/password combination is rejected: com.simba.googlebigquery.support.exceptions.GeneralException: EXEC_JOB_EXECUTION_ERR
And in that same pop-up box it is asking me to enter my login credentials for BigQuery. I shouldn't need to do this, as I created a service account in GCP with the correct level of access and used the JSON key stored on my local machine (and linked to this service account).
Any ideas?
Expected result: Successful connection
Actual result: Error message above
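One way to narrow this down (not part of the JetBrains instructions) is to check that the service account key can run a BigQuery job on its own, for example with the Python client; the key path and project id below are placeholders:

```python
# Sanity check: can this key run a query at all, independently of the JDBC driver?
from google.cloud import bigquery
from google.oauth2 import service_account

credentials = service_account.Credentials.from_service_account_file(
    "/path/to/service-account-key.json"  # the same key used in DataGrip
)
client = bigquery.Client(project="my-project", credentials=credentials)

rows = client.query("SELECT 1 AS ok").result()
print(list(rows))
```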
I am trying to run this PowerShell cmdlet:
Get-AzureRmDataLakeStoreChildItem -AccountName "xxxx" -Path "xxxxxx"
It fails with an access error, which does not really make sense because I have complete access to the ADLS account and can browse it in the Azure portal. It does not even work with an AzureRunAsConnection from an Automation account, yet it works perfectly for my colleague. What am I doing wrong?
Error:
Operation: LISTSTATUS failed with HttpStatus:Forbidden
RemoteException: AccessControlException LISTSTATUS failed with error
0x83090aa2 (Forbidden. ACL verification failed. Either the resource
does not exist or the user is not authorized to perform the requested
operation.).
[1f6e5d40-9be1-4682-84be-d538dfca0d19][2019-01-24T21:12:27.0252648-08:00]
JavaClassName: org.apache.hadoop.security.AccessControlException.
Last encountered exception thrown after 1 tries. [Forbidden (
AccessControlException LISTSTATUS failed with error 0x83090aa2
(Forbidden. ACL verification failed. Either the resource does not
exist or the user is not authorized to perform the requested
operation.).
I don't see any firewall restrictions on the ADLS account.
I resolved the problem by granting read and execute access to all parent folders in the path. Since ADLS uses the POSIX standard, permissions are not inherited from parent folders. So, even though the SPN (generated by the Automation account) I was using had read/execute access to the specific folder I was interested in, it did not have access to the other folders in that path.
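To illustrate the point, this small sketch lists every folder that needs an ACL entry for a given (placeholder) path: with POSIX-style ACLs, access to the leaf folder alone is not enough.

```python
# With POSIX-style ACLs, the SPN needs read/execute on each folder down to the
# target, not just on the leaf. Prints every directory that needs an ACL entry.
from pathlib import PurePosixPath

target = PurePosixPath("/clusters/prod/data/events")  # placeholder ADLS path

for folder in reversed(target.parents):
    print(f"grant r-x on {folder}")
print(f"grant r-x on {target}")
```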