Why can't an EMR Notebook connect to its cluster when running as the AWS account owner - amazon-emr

I have created an AWS EMR cluster and notebook using default settings.
When I open the notebook, the kernel won't launch. I get the message "Workspace is not attached to cluster".
The cluster is in a "Ready" state.
None of the kernels work (Python, Spark, PySpark).
The error occurs in both JupyterLab and classic Jupyter.
I switched to a different AWS account in which I had never run EMR and created a notebook there, requesting that a cluster be created for it. AWS launched the cluster, but the notebook gave the same error.
A clue
I looked at the log files created by a cluster where the notebook failed.
In the log file https://aws-logs-***.s3.amazonaws.com/elasticmapreduce/j-3SOK08VFSQDPO/node/i-04af0a3d2d6d96cac/daemons/emr-on-cluster-env/gateway.log.gz, I found the following:
Jupyter Enterprise Gateway 2.1.0 is available at http://127.0.0.1:9547
User 'root' is not authorized to start kernel 'Python 3'. Ensure KERNEL_USERNAME is set to an appropriate value and retry the request.
User 'root' is not authorized to start kernel 'PySpark'. Ensure KERNEL_USERNAME is set to an appropriate value and retry the request.
How I got the notebook kernel to work
Per the Stack Overflow post Notebooks on EMR (AWS): Failed to start kernel, I switched from the root AWS account to an IAM user. This worked with EMR 6.5.0.
My question
What changed when I launched the cluster as an IAM user? How could I have figured out that using the root user was the problem?
EMR is a black box to me. Thanks in advance for helping me understand the inner workings of this amazing technology.

This is the key issue:
User 'root' is not authorized to start kernel 'Python 3'. Ensure KERNEL_USERNAME is set to an appropriate value and retry the request.
User 'root' is not authorized to start kernel 'PySpark'. Ensure KERNEL_USERNAME is set to an appropriate value and retry the request.
You need to create a normal IAM user with EMR permissions, log in as that user, and start the notebook from there. Your main AWS account is the root account. I talked to AWS support and got my notebook running that way.
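In case it helps, here's a minimal sketch of that setup with the AWS CLI. The user name is an example, and AmazonEMRFullAccessPolicy_v2 is the AWS-managed EMR policy at the time of writing; check the IAM console for the currently recommended policy.
# Check which identity you are using; the root user shows an ARN
# like arn:aws:iam::123456789012:root, an IAM user shows .../user/NAME
aws sts get-caller-identity
# Create a regular IAM user (the name is an example)
aws iam create-user --user-name emr-notebook-user
# Attach an EMR policy (verify the current managed policy name first)
aws iam attach-user-policy \
    --user-name emr-notebook-user \
    --policy-arn arn:aws:iam::aws:policy/AmazonEMRFullAccessPolicy_v2
# Give the user a console password, then sign in as that user
# and open the notebook from there (placeholder password)
aws iam create-login-profile --user-name emr-notebook-user --password 'ChangeMe123!'
The sts call also answers the "how could I have figured it out" part: as far as I can tell, Jupyter Enterprise Gateway ships with root in its unauthorized_users set by default, which matches the log message above; if get-caller-identity reports an ARN ending in :root, that's the trigger.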

Related

gcloud compute ssh requires password even after using json key file for authentication

I am trying to authenticate gcloud using a JSON key, and even after doing everything per the docs, it asks for a password when I run gcloud compute ssh root@production.
Here is a snapshot of the steps I performed.
1. Authorizing access to Google Cloud Platform with a service account
tahir@NX00510:~/www/helloworld$ gcloud auth activate-service-account 1055703200677-compute@developer.gserviceaccount.com --key-file=gcloud_key.json
Activated service account credentials for: [1055703200677-compute@developer.gserviceaccount.com]
2. Initializing the gcloud
tahir@NX00510:~/www/helloworld$ gcloud init
Welcome! This command will take you through the configuration of gcloud.
Settings from your current configuration [default] are:
compute:
  region: us-central1
  zone: us-central1-b
core:
  account: 1055703200677-compute@developer.gserviceaccount.com
  disable_usage_reporting: 'True'
  project: concise-hello-122320
Pick configuration to use:
[1] Re-initialize this configuration [default] with new settings
[2] Create a new configuration
Please enter your numeric choice: 1
Your current configuration has been set to: [default]
You can skip diagnostics next time by using the following flag:
gcloud init --skip-diagnostics
Network diagnostic detects and fixes local network connection issues.
Checking network connection...done.
Reachability Check passed.
Network diagnostic passed (1/1 checks passed).
Choose the account you would like to use to perform operations for
this configuration:
[1] 1055703200677-compute@developer.gserviceaccount.com
[2] Log in with a new account
Please enter your numeric choice: 1
You are logged in as: [1055703200677-compute@developer.gserviceaccount.com].
API [cloudresourcemanager.googleapis.com] not enabled on project
[1055703200677]. Would you like to enable and retry (this will take a
few minutes)? (y/N)? N
WARNING: Listing available projects failed: PERMISSION_DENIED: Cloud Resource Manager API has not been used in project 1055703200677 before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/cloudresourcemanager.googleapis.com/overview?project=1055703200677 then retry. If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry.
- '@type': type.googleapis.com/google.rpc.Help
  links:
  - description: Google developers console API activation
    url: https://console.developers.google.com/apis/api/cloudresourcemanager.googleapis.com/overview?project=1055703200677
Enter project id you would like to use: concise-hello-122320
Your current project has been set to: [concise-hello-122320].
Do you want to configure a default Compute Region and Zone? (Y/n)? n
Your Google Cloud SDK is configured and ready to use!
* Commands that require authentication will use 1055703200677-compute@developer.gserviceaccount.com by default
* Commands will reference project `concise-hello-122320` by default
Run `gcloud help config` to learn how to change individual settings
This gcloud configuration is called [default]. You can create additional configurations if you work with multiple accounts and/or projects.
Run `gcloud topic configurations` to learn more.
Some things to try next:
* Run `gcloud --help` to see the Cloud Platform services you can interact with. And run `gcloud help COMMAND` to get help on any gcloud command.
* Run `gcloud topic --help` to learn about advanced features of the SDK like arg files and output formatting
3. SSHing to gcloud
tahir@NX00510:~/www/helloworld$ gcloud compute ssh root@production
No zone specified. Using zone [us-central1-b] for instance: [production].
root@compute.1487950061407628967's password:
I don't know which password to enter here, and I believe it shouldn't ask for a password in the first place, since I authenticated with a JSON key file.
Could you please help me fix this?
Thanks!
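A sketch of what I would expect to work here, assuming a standard Compute Engine VM: the service account key only authorizes API calls, while gcloud compute ssh generates and pushes an SSH key for your local username; direct root logins typically aren't provisioned that way, so the server falls back to password authentication.
# connect as a regular user; gcloud manages the SSH key for you
gcloud compute ssh production --zone us-central1-b
# then escalate on the instance if you need root
sudo -i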

batch job occasionally gets "does not have bigquery.jobs.create permission" error after multiple successful queries

I have a python batch job running as a service account using bq.cmd to load multiple datastore backups.
It had been running successfully for 2 years, but recently, in the middle of some runs (after multiple successful loads by the same user into the same dataset), it fails, repeatedly returning: "does not have bigquery.jobs.create permission".
Restarting the job, with no changes, usually succeeds.
bq.cmd load --quiet --source_format=DATASTORE_BACKUP --project_id=blah-blah --replace project-name:data_set_name.TableName gs://project-datastore-backup/2018-08-30-03_00_01/blahblah.TableName.backup_info
gcloud components are up to date.
Any suggestions welcome
There's a public bug report describing a similar issue that was resolved by recreating the service account. If you don't see any actual changes to the IAM permissions in the logs, I'd try a new service account.
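A hedged sketch of that workaround with the gcloud CLI; the account name and project id are placeholders:
# create a replacement service account
gcloud iam service-accounts create bq-batch-loader --display-name "BQ batch loader"
# grant it the role that includes bigquery.jobs.create
gcloud projects add-iam-policy-binding my-project \
    --member serviceAccount:bq-batch-loader@my-project.iam.gserviceaccount.com \
    --role roles/bigquery.jobUser
# issue a key and point the batch job's bq invocation at it
gcloud iam service-accounts keys create key.json \
    --iam-account bq-batch-loader@my-project.iam.gserviceaccount.com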

AWS CLI returning error 10060 connection error only on some call(s)

I am using AWS CLI on Windows 7:
C:\Users\auser>aws --version
aws-cli/1.11.121 Python/2.7.9 Windows/7 botocore/1.5.84
Some command works normally, for example:
aws ec2 describe-instances
But if I try aws organizations describe-account, I get:
C:\Users\auser>aws organizations describe-account --account-id XXXXXXXXXX
('Connection aborted.', error(10060, "Impossible to connect ..."))
Note: I am on an enterprise network behind a proxy, but I don't think that's the problem, since describe-instances works normally.
Could it be a permission problem? How could I check on my own?
This sounds like a networking issue, most probably on your enterprise's end. Without knowing how your network is set up, it's difficult to troubleshoot. Perhaps there is a maximum number of outbound connections allowed, and you sometimes run up against that limit.
If you want to check your permissions to AWS, assuming you have access to view IAM, you could review the policies attached to the user your AWS CLI is configured to act on behalf of.
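Two hedged checks that might narrow it down. A genuine permission problem would come back as an AccessDenied error from AWS rather than a Windows socket timeout like 10060, which points at the network; also, Organizations is a global service with its own endpoint (organizations.us-east-1.amazonaws.com), so a proxy that whitelists only regional EC2 endpoints could block it. The proxy host below is a placeholder:
REM route the CLI through the proxy explicitly (Windows cmd)
set HTTP_PROXY=http://proxy.example.com:8080
set HTTPS_PROXY=http://proxy.example.com:8080
REM confirm the identity in use and its attached policies
aws sts get-caller-identity
aws iam list-attached-user-policies --user-name auser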

Apache Zeppelin on EMR login error

I have set up an EMR 4.4 cluster with Zeppelin/Spark configured. I successfully managed to set up Zeppelin on localhost and was logged in as anonymous. I created a user and password and continued to work with my notebook. I later started a new cluster, and I am now presented with a login screen for Zeppelin which will not accept my username and password. Is there a way to flush the credentials, or to find out what I entered?
Many thanks!
Adding the port at the end allowed access:
http://127.0.0.1:8890/
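If anyone hits the login screen itself, Zeppelin's authentication is configured through Apache Shiro; a sketch of where to look on a standard EMR layout (paths and service commands may vary by release):
# on the master node, the [users] section of shiro.ini lists
# username = password, role entries that you can edit or reset
ssh -i mykey.pem hadoop@<master-public-dns>
less /etc/zeppelin/conf/shiro.ini
# restart Zeppelin so it picks up the change (upstart on older AMIs)
sudo stop zeppelin && sudo start zeppelin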

Why did my Elastic MapReduce job flow fail in AWS MapReduce?

I created a job flow in AWS Elastic MapReduce using the Contextual Advertising (Hive Script) sample: I chose 'Start interactive Hive Session', selected m1.small instances, proceeded without a VPC subnet id, and set Configure Hadoop in the Configure Bootstrap actions step.
The job flow goes into the starting state, and after 15-20 minutes it goes into the failed state; it never reaches the waiting state.
It shows "Last State Change Reason: User account is not authorized to call EC2".
I gave myself PowerUserAccess through IAM, and I have also attached the policies below:
1. AmazonEC2FullAccess
2. AmazonElasticMapReduceFullAccess
3. IAMFullAccess
Even with all these policies attached, it still shows "User account is not authorized to call EC2".
Please advise. Thanks.
EMR builds on other AWS services that your account also needs to be subscribed to. Granting IAM privileges to call EC2, S3, and EMR is not sufficient on its own.
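A hedged way to check whether the account itself can call EC2, independent of the EMR console; if the direct call fails with a not-authorized or not-subscribed message, the blocker is the account signup rather than the IAM policies:
# confirm which identity is making the calls
aws sts get-caller-identity
# call EC2 directly; an authorization error here reproduces the
# problem outside of EMR
aws ec2 describe-instances --region us-east-1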