Using Airflow with BigQuery and the Cloud SDK gives error "User must be authenticated when user project is provided" - google-bigquery

I am trying to run Airflow locally. My DAG has a BigQueryOperator and I want to use the Cloud SDK for authentication. I run "gcloud auth application-default login" to get the JSON file with the credentials. When I test my DAG with the command:
airflow test testdag make_tmp_table 2019-02-13
I get the error message "User must be authenticated when user project is provided".
If, instead of the Cloud SDK, I use a service account that has admin rights to BigQuery, it works, but I need to authenticate through the Cloud SDK.
Does anyone know what this error message means, or how I can run Airflow using the Cloud SDK for authentication?
I have used the following source to understand how to run Airflow with BigQueryOperators locally:
https://medium.com/@jbencina/local-testing-with-google-cloud-composer-apache-airflow-75d4213d2893
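For reference, the local sequence I run looks roughly like this (a sketch; the AIRFLOW_HOME value is just an assumed local setup, and the credentials path is the default location gcloud writes to):
# obtain application-default credentials; by default gcloud stores them at
# ~/.config/gcloud/application_default_credentials.json
gcloud auth application-default login
# hypothetical local Airflow home, then test the single task
export AIRFLOW_HOME=~/airflow
airflow test testdag make_tmp_table 2019-02-13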

Either you are not working on the right project or you don't have permissions to do this job.
What I suggest is:
Check your current configuration by running:
gcloud auth list
Make sure that you have the right project and the right account set. If not, run this command to set them:
gcloud auth application-default login
You will be prompted with a link. Follow it and sign in with your account. After that you will see a verification code; copy it and paste it into your gcloud terminal.
The next thing to do is to make sure your account has permissions for the job you are trying to run. You probably need the role roles/composer.admin; if that doesn't work, add the primitive role roles/editor from your IAM console. But use that primitive role only for testing purposes; it is not advisable to use it for a production-level project.
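A rough sketch of those checks from the command line (PROJECT_ID and the member email are placeholders; granting the role through the IAM console works just as well):
# show which account is active and which project is configured
gcloud auth list
gcloud config get-value project
# re-create the application-default credentials if needed
gcloud auth application-default login
# grant the role to the account (placeholders; prefer the narrowest role that works)
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="user:you@example.com" \
  --role="roles/composer.admin"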

I solved it by deleting the credentials file produced when I ran:
gcloud auth application-default login
and then recreating the file.
Then it worked. So I had the right method; something was just broken in the credentials file.
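Concretely, that amounted to something like this (assuming the default credentials path on Linux/macOS):
# remove the stale application-default credentials file
rm ~/.config/gcloud/application_default_credentials.json
# recreate it
gcloud auth application-default login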

As @dlbech said:
This solution was not enough for me. I solved it by deleting the "quota_project_id": "myproject" line in the application_default_credentials.json file. I don't know why Airflow doesn't like the quota project ID key, but I tested it multiple times, and this was the problem.
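If you would rather not edit the file by hand, a small sketch using jq to drop that key (assuming the default credentials path; keep a backup of the file):
# strip the quota_project_id key from the application-default credentials
ADC=~/.config/gcloud/application_default_credentials.json
cp "$ADC" "$ADC.bak"
jq 'del(.quota_project_id)' "$ADC.bak" > "$ADC"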

Related

GitLab CI/CD: How to use a PAT

I am currently trying to build my first pipeline. The goal is to download the git repo to a server. In doing so, I ran into the problem that I have 2FA enabled on my account. When I run the pipeline I get the following error message:
remote: HTTP Basic: Access denied. The provided password or token is incorrect or your account has 2FA enabled and you must use a personal access token instead of a password.
Pipeline:
download_repo:
  script:
    - echo "Hallo"
As far as I understand, I have to use a PAT because I have 2FA enabled. But unfortunately I have not found any info on how to use the PAT.
To access one of your GitLab repositories from your pipeline, you should create a deploy token (as described in the token overview).
As noted here:
You get the deploy token username and password when you create a deploy token on the repository you want to clone.
You can also use a job token. The job token inherits the permissions of the user triggering the pipeline.
If your users have access to the repository you need to clone, you can use git clone https://gitlab-ci-token:${CI_JOB_TOKEN}@gitlab.example.com/<namespace>/<project>.
More details on the job token are here.
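Tying that back to the pipeline in the question, a minimal sketch of a job that clones another project with the job token (the host and the <namespace>/<project> path are placeholders):
download_repo:
  script:
    - git clone https://gitlab-ci-token:${CI_JOB_TOKEN}@gitlab.example.com/<namespace>/<project>.git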
The OP Assassinee adds in the comments:
The problem was that the agent could not access the repository.
I added the following item in the agent configuration:
clone_url = "https://<USER>:<PAT>@gitlab.example.com"
This makes it possible for the agent to access the repository.
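For GitLab, that agent setting would typically live in the runner's config.toml, roughly like this (runner name, host, <USER> and <PAT> stay placeholders; only the keys relevant to the clone URL are shown):
[[runners]]
  name = "my-runner"
  url = "https://gitlab.example.com"
  clone_url = "https://<USER>:<PAT>@gitlab.example.com"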

Execute BigQuery using the Python SDK from Jenkins

I have a Python program that executes BigQuery using a cloud service account successfully.
When I try to schedule the Python program using Jenkins, I see the below error.
The gcloud user has BigQuery editor, dataOwner and admin permissions on the table and dataset.
Log:
gcloud auth activate-service-account abc --key-file=****
Activated service account credentials for: [abc273721.iam.gserviceaccount.com]
gcloud config set project p1
Updated property p1.
403 Access Denied: Table XYZ: User does not have permission to query table
I see that you have provided all the required roles (bigquery.dataOwner and bigquery.admin), as mentioned here, but it looks like you also have to grant the service account access to the dataset.
Create a service account with the BigQuery Admin role and download the JSON key file (example: data-lab.json).
Use the command below:
gcloud auth activate-service-account "service-account" --key-file=data-lab.json --project="project-name"
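If the dataset itself still needs to be shared with the service account, a sketch with the bq tool (dataset name and service account email are placeholders; the same can be done in the console via dataset sharing):
# dump the dataset definition, add the service account to its "access" list, re-apply
bq show --format=prettyjson project-name:dataset_name > dataset.json
# add an entry such as:
#   {"role": "WRITER", "userByEmail": "my-sa@project-name.iam.gserviceaccount.com"}
bq update --source dataset.json project-name:dataset_name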

CloudFoundry CLI login not working (Credentials were rejected, please try again)

So far I have always been able to log in successfully via SSO.
cf login -a url --sso
I need another way to log in for my pipeline script and tried the following command:
cf login [-a API_URL] [-u USERNAME] [-p PASSWORD] [-o ORG] [-s SPACE]
This command does not work with my user, nor with a technical user to whom all necessary roles have been assigned (M D A). I get the following message:
API endpoint: url
Password>
Authenticating...
Credentials were rejected, please try again.
Does anyone know how to solve this problem?
Or maybe an alternative, such as a Gradle task, that can be executed in a Jenkins pipeline.
In the end, I want to automate deploying an artifact (to the cloud) with my Jenkins pipeline.
You provided the --sso flag, so you shouldn't see a password prompt. Instead you should be given a URL to get a token.
Maybe your CF has been misconfigured and does not support SSO yet. I tried to fix the CF CLI to avoid this but it was oddly rejected: https://github.com/cloudfoundry/cli/pull/1624
Try fixing your CF installation (it needs to provide some prompts), or skip the --sso flag.
Using --sso and -u/-p does not do the same thing on the backend, and there's no guarantee that a user who can log in through SSO is also set up to log in as a user stored directly in UAA. UAA has multiple origins from which users can be loaded, like SAML, LDAP and internal to UAA. When you use the --sso flag, you are typically logging in as a user from your company's SAML provider. When you use the -u/-p flags, it's typically LDAP or UAA, something UAA validates directly.
For what you are trying to do to work, you would need a user with an origin in SAML (for --sso) and a user with an origin in LDAP or UAA (internal), and technically those would be two separate users (even though they may have the same credentials).
At any rate, if you normally log in with the --sso flag and you want to automate work, what you really want is a UAA client set up with the client_credentials grant type. You can then use cf auth CLIENT_ID CLIENT_SECRET --client-credentials to automate logging in.
Typically you don't want your user account to be tied to pipelines and automated scripts anyway. If you leave the company and your user gets deactivated, then everything breaks :) You want a service account, and that is basically a client enabled with the client_credentials grant type in UAA.
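A sketch of what the non-interactive login could then look like in a pipeline step (API URL, client ID/secret, org, space and artifact path are placeholders; the client must first be created in UAA with the client_credentials grant type):
# log in with a UAA client instead of a user account
cf api https://api.example.com
cf auth "$CF_CLIENT_ID" "$CF_CLIENT_SECRET" --client-credentials
cf target -o my-org -s my-space
# deploy the artifact built by the pipeline
cf push my-app -p build/libs/my-artifact.jar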

bq command query on a Google Sheet table gives "Access Denied: BigQuery BigQuery: No OAuth token with Google Drive scope was found" error

I have a table connected to a Google Sheet. Querying this table from the web UI succeeds, but if I query it with the bq command, it echoes the error message:
Access Denied: BigQuery BigQuery: No OAuth token with Google Drive scope was found
I presume you are using the bq command line tool which comes with the Cloud SDK.
To use bq you had to procure credentials; most likely you used
gcloud auth login
By default these credentials do not get the Drive scope. You have to explicitly request it via
gcloud auth login --enable-gdrive-access
Now running bq to access Google Drive data should work.
From the comments: for some people, it seems to be necessary to run
gcloud auth revoke
before logging in again. (Deleting ~/.config/gcloud would also work, but is probably overkill.)
Chances are the 'https://www.googleapis.com/auth/drive.readonly' scope is missing in the credentials of your request.
For details, see:
Credentials Error when integrating Google Drive with
Run auth revoke and then auth login if the latter alone doesn't work:
gcloud auth revoke
gcloud auth login --enable-gdrive-access
Hi, I know what happened: before gcloud auth login --enable-gdrive-access, I needed to delete the ~/.config/gcloud folder, thanks!! – Karl Lin Sep 14 '17 at 12:32
Here's the complete answer based on Karl Lin's comment to the accepted answer.
rm -rf ~/.config/gcloud
gcloud auth login --enable-gdrive-access
I needed to delete ~/.config/gcloud or it wouldn't work.
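To verify afterwards, a test query against the Sheets-backed table should now succeed (the table name is a placeholder):
bq query --use_legacy_sql=false 'SELECT COUNT(*) FROM `myproject.mydataset.sheet_table`'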

How to make gcloud auth activate-service-account persist

I am using the bq command line tool to query a BigQuery table. Is there a way to get the service account authentication to persist after I log out of and back into the box that the query process is running on?
Steps I did:
I logged into the Linux box.
Authenticated the service account by running:
gcloud auth activate-service-account --key-file /somekeyFile.p12 someServiceAccount.gserviceaccount.com
Queried the BigQuery table; this works fine:
bq --project_id=formal-cascade-571 query "select * from dw_test.clokTest"
But then I logged out of the box and logged back in. When I query the BigQuery table again:
bq --project_id=formal-cascade-571 query "select * from dw_test.clokTest"
It gives me the error:
Your current active account [someServiceAccount.gserviceaccount.com] does not have any valid credentials.
Even when I pass in the private key file:
bq --service_account=someServiceAccount.gserviceaccount.com --service_account_credential_file=~/clok_cred.txt --service_account_private_key_file=/somekeyFile.p12 --project_id=formal-cascade-571 query "select * from dw_test.clokTest"
It gives the same error:
Your current active account [someServiceAccount.gserviceaccount.com] does not have any valid credentials.
So every time I need to re-authenticate my service account by:
gcloud auth activate-service-account
Is there a way to have the authenticated service account credential persist?
Thank you for your help.
I asked the GCloud devs and they mentioned a known bug where service accounts don't show up unless the environment variable CLOUDSDK_PYTHON_SITEPACKAGES is set.
Hopefully this will be fixed soon, but in the meantime, when you log in again, can you try running
export CLOUDSDK_PYTHON_SITEPACKAGES=1
and see if it then works?
You can run
gcloud auth list
to see what accounts there are credentials for; it should list your service account.
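If the activation still does not survive logging out and back in, one workaround is to re-run it from the shell profile so it happens on every login. A sketch, reusing the key file and account from the question:
# add to ~/.bashrc (or whichever profile the box uses)
export CLOUDSDK_PYTHON_SITEPACKAGES=1
gcloud auth activate-service-account someServiceAccount.gserviceaccount.com \
    --key-file=/somekeyFile.p12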
I fixed it by relaunching gcloud auth login. Google then asked me to open a webpage, which triggered the CLOUDSDK authorization, which I believe is linked to the solution shared by J. Tigani.