dbt Redshift IAM credentials in profiles.yaml

I am new to dbt and trying to get started. I have managed to configure profiles.yml to match the Redshift IAM authentication profile, but running dbt debug afterwards gives the following error:
> Database Error
> Unable to get temporary Redshift cluster credentials: An error occurred (ClusterNotFound) when calling the GetClusterCredentials operation: Cluster cluster_zzz not found.
However, I am able to connect fine from within DataGrip using the exact same IAM configuration.
dbt documentation I followed: https://docs.getdbt.com/reference/warehouse-profiles/redshift-profile
profiles.yml
risk:
  target: dev
  outputs:
    dev:
      type: redshift
      method: iam
      cluster_id: cluster_zzz
      host: cluster_zzz.d4dcyl2bbxyz.eu-west-1.redshift.amazonaws.com
      user: dbuser
      iam_profile: risk
      iam_duration_seconds: 900
      autocreate: true
      port: 5439
      dbname: db_name
      schema: dbname_schema
      threads: 1
      keepalives_idle: 0
Appreciate any help, as it would let me move forward with evaluating the tool. Thanks!

I managed to find the issue: I had to set the correct region in my AWS config. As we are using aws-azure-login, this setting didn't matter for my day-to-day logins:
region=eu-west-1
Looking through the implementation of the IAM feature helped me look in the right place.
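For reference, a minimal sketch of what the relevant entry in ~/.aws/config might look like; the profile name risk matches the iam_profile above, and whatever other keys your aws-azure-login setup writes are left out:

[profile risk]
region = eu-west-1

With the region set, the GetClusterCredentials call can locate the cluster, which is what the ClusterNotFound error above was complaining about.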

Related

eks assume role is not permitted on user

I'm trying to run an automated build in my AWS CodeBuild project. I have the following statement in my buildspec.yaml file:
aws eks update-kubeconfig --name ticket-api --region us-west-1 --role-arn arn:aws:iam::12345:role/service-role/CodeBuild-API-Build-service-role
I get the following error
An error occurred (AccessDenied) when calling the AssumeRole operation: User: arn:aws:sts::12345:assumed-role/CodeBuild-API-Build-service-role/AWSCodeBuild-99c25416-7046-416e-b5d9-4bff1f4992f3 is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::12345:role/service-role/CodeBuild-API-Build-service-role
I'm not sure what the role arn:aws:sts::12345:assumed-role/CodeBuild-API-Build-service-role/AWSCodeBuild-99c25416-7046-416e-b5d9-4bff1f4992f3 is, especially the part AWSCodeBuild-99c25416-7046-416e-b5d9-4bff1f4992f3. Is it something unique to the current user?
Also, I think the current user is already assigned to the role it's trying to assume.
If I run the same command without the role, aws eks update-kubeconfig --name ticket-api --region us-west-1, then it works, but when I try kubectl version after that I get the following error:
error: You must be logged in to the server (the server has asked for the client to provide credentials)
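For context, an illustrative sketch rather than something from the original thread: the arn:aws:sts::12345:assumed-role/CodeBuild-API-Build-service-role/AWSCodeBuild-... principal is the temporary session identity the build runs under, with the session name appended after the role name. The sts:AssumeRole call only succeeds if the target role's trust policy allows that caller (and, depending on how the principal is written, the caller may also need sts:AssumeRole in its own identity policy). A hedged example of such a trust policy on the target role, reusing the account ID and role name from the question:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::12345:role/service-role/CodeBuild-API-Build-service-role"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}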

Permission denied on S3 path - What are the minimum policies required on the data source to get athena-express to work?

I'm attempting to use the Athena-express node module to query Athena.
Per the Athena-express docs:
> This IAM role/user must have AmazonAthenaFullAccess and AmazonS3FullAccess policies attached.
> Note: As an alternative to granting AmazonS3FullAccess you could granularize and limit write access to a specific bucket. Just specify this bucket name during athena-express initialization.
Providing AmazonS3FullAccess to this microservice is a non-starter. What is the minimum set of privileges I can grant to the microservice and still get around the "Permission denied on S3 path: s3://..." errors I've been getting?
Currently, I've got the following:
Output location: (I don't think the problem is here)
s3:AbortMultipartUpload, s3:CreateMultipartUpload, s3:DeleteObject, s3:Get*, s3:List*, s3:PutObject, s3:PutObjectTagging
on "arn:aws:s3:::[my-bucket-name]/tmp/athena" and "arn:aws:s3:::[my-bucket-name]/tmp/athena/*"
Data source location:
s3:GetBucketLocation
on "arn:aws:s3:::*"
s3:ListBucket
on "arn:aws:s3:::[my-bucket-name]"
s3:Get* and s3:List*
on "arn:aws:s3:::[my-bucket-name]/production/[path]/[path]" and "arn:aws:s3:::[my-bucket-name]/production/[path]/[path]/*"
The error message I get with the above is:
"Permission denied on S3 path: s3://[my-bucket-name]/production/[path]/[path]/v1/dt=2022-05-26/.hoodie_partition_metadata"
Any suggestions? Thanks!
It turned out that the bucket storing the data I needed to query was encrypted, which meant that the missing permission to query was kms:Decrypt.
Athena by default outputs the results of a query to a location (which athena-express then retrieves). The location of the output was in that same encrypted bucket, so I also ended up giving my cronjob kms:Encrypt and kms:GenerateDataKey.
I ended up using CloudTrail to figure out which permissions were causing my queries to fail.
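For illustration, the addition described above roughly corresponds to an extra statement like the following in the service's policy; the key ARN is a placeholder, not something from the original post:

{
  "Effect": "Allow",
  "Action": [
    "kms:Decrypt",
    "kms:Encrypt",
    "kms:GenerateDataKey"
  ],
  "Resource": "arn:aws:kms:<region>:<account-id>:key/<key-id-used-by-the-bucket>"
}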

azdata erroring out due to config file error

All,
The admin set up a 3-node AKS cluster today. I got the kube config file updated by running the az command
az aks get-credentials --name AKSBDCClus --resource-group AAAA-Dev-RG --subscription AAAA-Subscription
I was able to run all the kubectl commands fine, but when I tried setting up the SQL Server 2019 BDC by running azdata bdc create, it gave me the error Failed to complete kube config setup.
Since it was something to do with azdata and kubectl, I checked the azdata logs, and this is what I see in azdata.log:
Loading default kube config from C:\Users\rgn\.kube\config
Invalid kube-config file. Expected all values in kube-config/contexts list to have 'name' key
Thinking the config file could have gotten corrupted, I tried running az aks get-credentials --name AKSBDCClus --resource-group AAAA-Dev-RG --subscription AAAA-Subscription again.
This time I got a whole lot of errors:
The client 'rgn@mycompany.com' with object id 'XXXXX-28c3-YYYY-ZZZZ-AQAQAQd' does not have authorization to perform action 'Microsoft.ContainerService/managedClusters/listClusterUserCredential/action' over scope '/subscriptions/Subscription-ID/resourceGroups/ResourceGroup-Dev-RG/providers/Microsoft.ContainerService/managedClusters/AKSCluster' or the scope is invalid. If access was recently granted, please refresh your credentials.
I logged out and logged back into Azure and retried, but got the same errors as above. I was even able to stop the VM scale set before I logged off for the day. Everything else works fine, but I'm unable to run the azdata script.
Can someone point me in the right direction?
Thanks,
rgn
Turns out that the config file was bad. I deleted the file and ran "az aks get-credentials" (after getting the necessary permissions to run it) and it worked. The size of the old config was 19 KB but the new one is 10 KB.
I guess I might have messed it up while testing "az aks get-credentials".
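For anyone hitting the same thing, a minimal PowerShell sketch of the recovery described above; paths and names are taken from the question, so adjust them to your environment:

# Back up the suspect kubeconfig rather than deleting it outright
move C:\Users\rgn\.kube\config C:\Users\rgn\.kube\config.bak

# Regenerate it (requires the listClusterUserCredential permission from the error above)
az aks get-credentials --name AKSBDCClus --resource-group AAAA-Dev-RG --subscription AAAA-Subscription

# Confirm every contexts entry now has a name key before rerunning azdata bdc create
kubectl config get-contexts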

gcloud compute ssh requires password even after using json key file for authentication

I am trying to authenticate gcloud using a JSON key, and even after doing everything as per the docs, it asks for a password when I run gcloud compute ssh root@production.
Here is a snapshot of the steps I performed.
1. Authorizing access to Google Cloud Platform with a service account
tahir@NX00510:~/www/helloworld$ gcloud auth activate-service-account 1055703200677-compute@developer.gserviceaccount.com --key-file=gcloud_key.json
Activated service account credentials for: [1055703200677-compute@developer.gserviceaccount.com]
2. Initializing the gcloud
tahir@NX00510:~/www/helloworld$ gcloud init
Welcome! This command will take you through the configuration of gcloud.
Settings from your current configuration [default] are:
compute:
  region: us-central1
  zone: us-central1-b
core:
  account: 1055703200677-compute@developer.gserviceaccount.com
  disable_usage_reporting: 'True'
  project: concise-hello-122320
Pick configuration to use:
[1] Re-initialize this configuration [default] with new settings
[2] Create a new configuration
Please enter your numeric choice: 1
Your current configuration has been set to: [default]
You can skip diagnostics next time by using the following flag:
gcloud init --skip-diagnostics
Network diagnostic detects and fixes local network connection issues.
Checking network connection...done.
Reachability Check passed.
Network diagnostic passed (1/1 checks passed).
Choose the account you would like to use to perform operations for
this configuration:
[1] 1055703200677-compute@developer.gserviceaccount.com
[2] Log in with a new account
Please enter your numeric choice: 1
You are logged in as: [1055703200677-compute@developer.gserviceaccount.com].
API [cloudresourcemanager.googleapis.com] not enabled on project
[1055703200677]. Would you like to enable and retry (this will take a
few minutes)? (y/N)? N
WARNING: Listing available projects failed: PERMISSION_DENIED: Cloud Resource Manager API has not been used in project 1055703200677 before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/cloudresourcemanager.googleapis.com/overview?project=1055703200677 then retry. If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry.
- '@type': type.googleapis.com/google.rpc.Help
  links:
  - description: Google developers console API activation
    url: https://console.developers.google.com/apis/api/cloudresourcemanager.googleapis.com/overview?project=1055703200677
Enter project id you would like to use: concise-hello-122320
Your current project has been set to: [concise-hello-122320].
Do you want to configure a default Compute Region and Zone? (Y/n)? n
Your Google Cloud SDK is configured and ready to use!
* Commands that require authentication will use 1055703200677-compute@developer.gserviceaccount.com by default
* Commands will reference project `concise-hello-122320` by default
Run `gcloud help config` to learn how to change individual settings
This gcloud configuration is called [default]. You can create additional configurations if you work with multiple accounts and/or projects.
Run `gcloud topic configurations` to learn more.
Some things to try next:
* Run `gcloud --help` to see the Cloud Platform services you can interact with. And run `gcloud help COMMAND` to get help on any gcloud command.
* Run `gcloud topic --help` to learn about advanced features of the SDK like arg files and output formatting
3. SSHing to gcloud
tahir@NX00510:~/www/helloworld$ gcloud compute ssh root@production
No zone specified. Using zone [us-central1-b] for instance: [production].
root@compute.1487950061407628967's password:
I don't know which password I should enter here; also, I believe it should not ask for a password in the first place because I have used a JSON key file for authentication.
Could you please help me fix this?
Thanks!

Google cloud dataproc failing to create new cluster with initialization scripts

I am using the below command to create a Dataproc cluster:
gcloud dataproc clusters create informetis-dev \
  --initialization-actions "gs://dataproc-initialization-actions/jupyter/jupyter.sh,gs://dataproc-initialization-actions/cloud-sql-proxy/cloud-sql-proxy.sh,gs://dataproc-initialization-actions/hue/hue.sh,gs://dataproc-initialization-actions/ipython-notebook/ipython.sh,gs://dataproc-initialization-actions/tez/tez.sh,gs://dataproc-initialization-actions/oozie/oozie.sh,gs://dataproc-initialization-actions/zeppelin/zeppelin.sh,gs://dataproc-initialization-actions/user-environment/user-environment.sh,gs://dataproc-initialization-actions/list-consistency-cache/shared-list-consistency-cache.sh,gs://dataproc-initialization-actions/kafka/kafka.sh,gs://dataproc-initialization-actions/ganglia/ganglia.sh,gs://dataproc-initialization-actions/flink/flink.sh" \
  --image-version 1.1 --master-boot-disk-size 100GB --master-machine-type n1-standard-1 \
  --metadata "hive-metastore-instance=g-test-1022:asia-east1:db_instance" \
  --num-preemptible-workers 2 --num-workers 2 --preemptible-worker-boot-disk-size 1TB \
  --properties hive:hive.metastore.warehouse.dir=gs://informetis-dev/hive-warehouse \
  --worker-machine-type n1-standard-2 --zone asia-east1-b --bucket info-dev
But Dataproc failed to create the cluster, with the following errors in the failure file:
cat
+ mysql -u hive -phive-password -e ''
ERROR 2003 (HY000): Can't connect to MySQL server on 'localhost' (111)
+ mysql -e 'CREATE USER '\''hive'\'' IDENTIFIED BY '\''hive-password'\'';'
ERROR 2003 (HY000): Can't connect to MySQL server on 'localhost' (111)
Does anyone have any idea what is behind this failure?
It looks like you're missing the --scopes sql-admin flag described in the initialization action's documentation; without it, the CloudSQL proxy can't authorize its tunnel into your CloudSQL instance.
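For example, adding the flag to the create command from the question would look something like this (most of the other flags are omitted here for brevity):

gcloud dataproc clusters create informetis-dev \
  --scopes sql-admin \
  --initialization-actions "gs://dataproc-initialization-actions/cloud-sql-proxy/cloud-sql-proxy.sh" \
  --zone asia-east1-b --bucket info-dev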
Additionally, aside from just the scopes, you need to make sure the default Compute Engine service account has the right project-level permissions in whichever project holds your CloudSQL instance. Normally the default service account is a project editor in the GCE project, so that should be sufficient when combined with the sql-admin scopes to access a CloudSQL instance in the same project, but if you're accessing a CloudSQL instance in a separate project, you'll also have to add that service account as a project editor in the project which owns the CloudSQL instance.
You can find the email address of your default compute service account under the IAM page for your project deploying Dataproc clusters, with the name "Compute Engine default service account"; it should look something like <number>@project.gserviceaccount.com.
I am assuming that you already created the Cloud SQL instance with something like this, correct?
gcloud sql instances create g-test-1022 \
--tier db-n1-standard-1 \
--activation-policy=ALWAYS
If so, then it looks like the error is in how the argument for the metadata is formatted. You have this:
--metadata "hive-metastore-instance=g-test-1022:asia-east1:db_instance”
Unfortunately, the zone looks to be incomplete (asia-east1 instead of asia-east1-b).
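If that's the issue, then given the --zone asia-east1-b elsewhere in your command, the flag would presumably become:

--metadata "hive-metastore-instance=g-test-1022:asia-east1-b:db_instance"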
Additionally, with that many initialization actions, you'll want to provide a pretty generous initialization action timeout so the cluster does not assume something has failed while your actions take a while to install. You can do that by specifying:
--initialization-action-timeout 30m
That will allow the cluster to give the initialization actions 30 minutes to bootstrap.
Around the time you reported this, an issue with the Cloud SQL proxy initialization action had been detected, and it most probably affected you.
Nowadays, it shouldn't be an issue.