Unable to create BigQuery Data Transfer job using a service account - google-bigquery

I am not able to create a data transfer job from a Google Play Store bucket to a Google Cloud Storage bucket using a service account that has permissions for both. I am able to create a transfer job using my project account, which only has access to the storage bucket, so I cannot use this in production.
By running the following:
bq mk --transfer_config --target_dataset=<my dataset> --display_name=<My Transfer Job> --params='{"bucket":"<playstore bucket>","table_suffix":"<my suffix>"}' --data_source=play --service_account <service account email> --service_account_credential_file $GOOGLE_APPLICATION_CREDENTIALS
I am getting error:
Unexpected exception in GetCredentialsFromFlags operation: Credentials
appear corrupt. Please delete the credential file and try your command
again. You can delete your credential file using "bq init
--delete_credentials".
I did a bq init and reran the bq command, but I get the same error.
I also activated the service account using the command below; still the same error.
gcloud auth activate-service-account <service account email> --key-file $GOOGLE_APPLICATION_CREDENTIALS
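As a quick check to rule out a genuinely corrupt key file, the sketch below prints the key's type and email and then tries an authenticated bq call (jq is only an assumption for convenience, any JSON viewer works; recent Cloud SDK versions let bq pick up the active gcloud account):

# Confirm the key file is a valid service-account JSON key
jq -r '.type, .client_email' "$GOOGLE_APPLICATION_CREDENTIALS"
# Expected: "service_account" followed by the service account email

# Make the activated service account the active identity for gcloud/bq
gcloud config set account <service account email>

# Sanity check that bq can authenticate as the service account
bq ls --transfer_config --transfer_location=us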

Related

AWS S3 CMD Error message An Error occurred (AccessDenied) when calling the PutObject operation: Access Denied

I have installed AWS CLI v2 (AWSCLIV2.msi) on my Windows desktop.
After signing in to my IAM account using the access key ID and secret access key, I try to upload a simple TXT document to my S3 bucket using the following command:
aws s3 cp 4CLItest.txt s3://bucketname
After entering the command, I receive the following error:
aws s3 cp 4CLItest.txt s3://bucketname
upload failed: .\4CLItest.txt to s3://bucketname/4CLItest.txt
An error occurred (AccessDenied) when
calling the PutObject operation: Access Denied
I'm running the following version of AWS CLI:
aws-cli/2.10.0 Python/3.9.11 Windows/10 exe/AMD64 prompt/off
I am able to upload to this bucket by dragging and dropping in the browser using the same IAM account.
However, I have a very large upload to perform and would like to use the CLI from the command prompt to do so.
Any help or advice would be greatly appreciated.
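As a first diagnostic, it can help to confirm which IAM principal the CLI is actually signing requests with, since the browser console and the CLI can easily end up using different identities; a short sketch reusing the file and bucket names from the question:

# Show which IAM principal the CLI calls are signed with
aws sts get-caller-identity

# Show where the credentials in use come from (env vars, profile, config file)
aws configure list

# Retry the upload with verbose logging to see the exact request being denied
aws s3 cp 4CLItest.txt s3://bucketname --debug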

Query from BigQuery using the local command line

I'm trying to query BigQuery from PowerShell. I've run gcloud init and logged in to my account.
The request was this:
bq query --use_legacy_sql=false 'SELECT customer_id FROM `demo1.customers1`'
Resulting with this error:
BigQuery error in query operation: Error processing job
'PROJECT-ID:bqjob': Access Denied:
BigQuery BigQuery: Permission denied while getting Drive credentials.
This worked when I ran it in Cloud Shell.
I've already created a service account and a key for the project. I tried to run this command, but it doesn't solve it:
gcloud auth activate-service-account SERVICE_ACCOUNT@DOMAIN.COM --key-file=D:/folder/key.json --project=MYPROJECT_ID
The account should have the OAuth scope for Drive in order to access Drive-backed tables; the command below can be used to authenticate with Drive access.
gcloud auth login --enable-gdrive-access
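A minimal sketch of the full sequence with user credentials (table name reused from the question); note that the account used must also have at least read access to the underlying Google Sheet:

# Re-authenticate with the Drive scope included
gcloud auth login --enable-gdrive-access

# Re-run the query; BigQuery can now obtain Drive credentials for the Sheets-backed table
bq query --use_legacy_sql=false 'SELECT customer_id FROM `demo1.customers1`'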

Airflow Permission denied while getting Drive credentials

I am trying to run a BigQuery query on Airflow with MWAA.
This query uses a table that is based on a Google Sheet. When I run it, I have the following error:
google.api_core.exceptions.Forbidden: 403 Access Denied: BigQuery BigQuery: Permission denied while getting Drive credentials.
I already have a working Google Cloud connection in Airflow with an admin service account.
Also:
This service account has access to the google sheet
I added https://www.googleapis.com/auth/drive to the scopes of the Airflow connection
I re-generated a JSON file
Am I doing something wrong? Any idea what I can do to fix this problem?
Thanks a lot
I fixed my issue by creating a NEW Airflow connection. It's a new Google Cloud connection with the exact same values as the default google_cloud_default connection. Now it works perfectly.
Hope it can help!
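For reference, a sketch of creating such a connection from the Airflow CLI instead of the UI; the connection id and key path are placeholders, and the exact extra field names are an assumption that varies by Google provider version:

airflow connections add 'google_cloud_with_drive' \
    --conn-type 'google_cloud_platform' \
    --conn-extra '{
        "extra__google_cloud_platform__key_path": "/path/to/key.json",
        "extra__google_cloud_platform__project": "my-project",
        "extra__google_cloud_platform__scope": "https://www.googleapis.com/auth/cloud-platform,https://www.googleapis.com/auth/drive"
    }'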

How to create a service account for a BigQuery dataset from the CLI

I've found instructions on how to generate credentials at the project level, but there aren't clear instructions on adding a service account to only a specific dataset using the CLI.
I tried creating the service account:
gcloud iam service-accounts create NAME
and then getting the dataset:
bq show \
--format=prettyjson \
project_id:dataset > path_to_file
and then adding a role to the access section
{
"role": "OWNER",
"userByEmail": "NAME#PROJECT.iam.gserviceaccount.com"
},
and then updating it. It seemed to work because I was able to create a table, but then I got an access denied error "User does not have bigquery.jobs.create permission in project" when I tried loading data into the table.
When I inspected the project in the Cloud Console, it seemed as if my service account had been added to the project rather than the dataset, which is not what I want, but it also does not explain why I don't have the correct permissions. In addition to owner permissions I tried assigning editor and admin permissions, neither of which solved the issue.
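One way to see exactly where the grant landed is to compare the project-level IAM bindings with the dataset's access list; a sketch using the placeholders from above:

# Project-level roles bound to the service account (empty if you only edited the dataset)
gcloud projects get-iam-policy PROJECT \
    --flatten="bindings[].members" \
    --filter="bindings.members:serviceAccount:NAME@PROJECT.iam.gserviceaccount.com" \
    --format="table(bindings.role)"

# Dataset-level entries live in the "access" array of this output
bq show --format=prettyjson project_id:dataset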
It is not possible for a service account to have permissions only at the dataset level and then run a query. When a query is invoked, it creates a job, and to create a job the service account must have the bigquery.jobs.create permission granted at the project level. See the documentation for the permissions required to run a job.
With this in mind, you need to add bigquery.jobs.create at the project level so you can run queries on the shared dataset.
NOTE: You can use any of the following pre-defined roles as they all have bigquery.jobs.create.
roles/bigquery.user
roles/bigquery.jobUser
roles/bigquery.admin
With my example I used roles/bigquery.user. See steps below:
Create a new service account (bq-test-sa@my-project.iam.gserviceaccount.com)
Get the permissions on my dataset using bq show --format=prettyjson my-project:mydataset > info.json
Add OWNER permission to service account in info.json
{
"role": "OWNER",
"userByEmail": "bq-test-sa#my-project.iam.gserviceaccount.com"
},
Update the permissions using bq update --source info.json my-project:mydataset
Check BigQuery > mydataset > "SHARE DATASET" to see if the service account was added.
Add role roles/bigquery.user to the service account using gcloud projects add-iam-policy-binding my-project --member=serviceAccount:bq-test-sa@my-project.iam.gserviceaccount.com --role=roles/bigquery.user
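To verify the setup end to end, you can run a query as the new service account; the key file path and table name below are placeholders:

# Authenticate as the service account and run a test query against the shared dataset
gcloud auth activate-service-account bq-test-sa@my-project.iam.gserviceaccount.com --key-file=/path/to/key.json
bq query --use_legacy_sql=false 'SELECT COUNT(*) FROM `my-project.mydataset.mytable`'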

ADLS to Azure Storage Sync Using AzCopy

Looking for some help to resolve the errors I'm facing. Let me explain the scenario. I'm trying to sync one of my ADLS Gen2 containers to Azure Blob Storage. I have AzCopy 10.4.3 and I'm using azcopy sync to do this. I'm using the command below:
azcopy sync 'https://ADLSGen2.blob.core.windows.net/testsamplefiles/SAMPLE' 'https://AzureBlobStorage.blob.core.windows.net/testsamplefiles/SAMPLE' --recursive
When I run this command I get the error below:
REQUEST/RESPONSE (Try=1/71.0063ms, OpTime=110.9373ms) -- RESPONSE SUCCESSFULLY RECEIVED
PUT https://AzureBlobStorage.blob.core.windows.net/testsamplefiles/SAMPLE/SampleFile.parquet?blockid=ZDQ0ODlkYzItN2N2QzOWJm&comp=block&timeout=901
X-Ms-Request-Id: [378ca837-d01e-0031-4f48-34cfc2000000]
ERR: [P#0-T#0] COPYFAILED: https://ADLSGen2.blob.core.windows.net/testsamplefiles/SAMPLE/SampleFile.parquet: 404 : 404 The specified resource does not exist.. When Staging block from URL. X-Ms-Request-Id: [378ca837-d01e-0031-4f48-34cfc2000000]
Dst: https://AzureBlobStorage.blob.core.windows.net/testsamplefiles/SAMPLE/SampleFile.parquet
REQUEST/RESPONSE (Try=1/22.9854ms, OpTime=22.9854ms) -- RESPONSE SUCCESSFULLY RECEIVED
GET https://AzureBlobStorage.blob.core.windows.net/testsamplefiles/SAMPLE/SampleFile.parquet?blocklisttype=all&comp=blocklist&timeout=31
X-Ms-Request-Id: [378ca84e-d01e-0031-6148-34cfc2000000]
So far I have checked and ensured the following:
I logged into correct tenant while logging into AzCopy
Storage Blob Data Contributor role was granted to my AD credentials
Not sure what else I'm missing, as the file exists in the source and I'm getting the same error. I tried with a SAS but received a different error. I cannot proceed with SAS due to the vendor policy, so I need to ensure this works with OAuth. Any input is really appreciated.
For the 404 error, you may check whether there is a typo in the command and whether the path /testsamplefiles/SAMPLE exists in both the source and destination accounts. Also, please note this tip from the docs:
Use single quotes in all command shells except for the Windows Command
Shell (cmd.exe). If you're using a Windows Command Shell (cmd.exe),
enclose path arguments with double quotes ("") instead of single
quotes ('').
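For example, in cmd.exe the command from the question would be quoted like this:

azcopy sync "https://ADLSGen2.blob.core.windows.net/testsamplefiles/SAMPLE" "https://AzureBlobStorage.blob.core.windows.net/testsamplefiles/SAMPLE" --recursive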
From the azcopy sync supported scenarios:
Azure Blob <-> Azure Blob (Source must include a SAS or is publicly
accessible; either SAS or OAuth authentication can be used for
destination)
We must include a SAS token in the source, but I tried the command below with AD authentication,
azcopy sync "https://[account].blob.core.windows.net/[container]/[path/to/blob]?[SAS]" "https://[account].blob.core.windows.net/[container]/[path/to/blob]"
but got the same 400 error as in the GitHub issue.
Thus, in this case, after my validation, you can use the command below to sync an ADLS Gen2 container to Azure Blob Storage without executing azcopy login. If you have logged in, you can run azcopy logout.
azcopy sync "https://nancydl.blob.core.windows.net/container1/sample?sv=xxx" "https://nancytestdiag244.blob.core.windows.net/container1/sample?sv=xxx" --recursive --s2s-preserve-access-tier=false