Why does an Elastic MapReduce job flow fail in AWS MapReduce? - amazon-s3

I created a job flow in AWS MapReduce: a Contextual Advertising (Hive Script) job flow. I chose 'Start interactive Hive Session', selected m1.small instances, proceeded without a VPC subnet id, and kept the Configure Hadoop bootstrap action under Configure Bootstrap Actions.
The job flow enters the STARTING state and, after 15-20 minutes, moves to FAILED; it never reaches the WAITING state.
It shows "Last State Change Reason: User account is not authorized to call EC2".
I gave myself PowerUserAccess through IAM, and I have also attached the policies below to my user:
1. AmazonEC2FullAccess
2. AmazonElasticMapReduceFullAccess
3. IAMFullAccess
Even after attaching all these policies, it still shows "User account is not authorized to call EC2".
Please guide. Thanks.
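For reference, creating a comparable job flow programmatically looks roughly like the boto3 call below. The release label, roles, and instance settings are assumptions (recent EMR releases no longer support m1.small), so treat this as a sketch rather than an exact reproduction of the console flow.

import boto3  # AWS SDK for Python

emr = boto3.client("emr", region_name="us-east-1")

# Sketch of launching a Hive job flow (cluster); all names and sizes are
# illustrative. The two roles are the EMR defaults created by
# `aws emr create-default-roles`.
response = emr.run_job_flow(
    Name="contextual-advertising-hive",
    ReleaseLabel="emr-6.5.0",
    Applications=[{"Name": "Hive"}],
    Instances={
        "MasterInstanceType": "m5.xlarge",   # the question used m1.small
        "SlaveInstanceType": "m5.xlarge",
        "InstanceCount": 3,
        "KeepJobFlowAliveWhenNoSteps": True,  # stay in WAITING for interactive Hive
    },
    JobFlowRole="EMR_EC2_DefaultRole",  # instance profile the cluster nodes use
    ServiceRole="EMR_DefaultRole",      # role EMR uses to call EC2 on your behalf
    VisibleToAllUsers=True,
)
print(response["JobFlowId"])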

EMR builds on other AWS services that your account also needs to be subscribed to. Granting IAM privileges to call EC2, S3, and EMR is not sufficient on its own.
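Independent of the account-subscription issue, it can help to confirm what is actually attached to the user. A quick boto3 sketch; "my-user" is a placeholder:

import boto3

iam = boto3.client("iam")

# List the managed policies attached to the IAM user, e.g. to confirm
# AmazonEC2FullAccess and AmazonElasticMapReduceFullAccess really are there.
for policy in iam.list_attached_user_policies(UserName="my-user")["AttachedPolicies"]:
    print(policy["PolicyName"], policy["PolicyArn"])

# EMR also relies on its own default roles to call EC2 on your behalf; they can
# be created with the CLI:  aws emr create-default-roles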

Related

Why can't an EMR Notebook connect to its cluster when running as the AWS account owner

I have created an AWS EMR cluster and notebook using default settings.
When I open the notebook, the kernel won't launch. I get the message "Workspace is not attached to cluster".
The cluster is in a "Ready" state.
None of the kernels work (Python, Spark, PySpark).
The error occurs in both JupyterLab and Jupyter.
I switched to a different AWS account where I had never run EMR and created a notebook. I requested that a cluster be created. AWS launched a cluster, but gave the same error when I launched a notebook.
A clue
I looked at the log files created by a cluster where the notebook failed.
In the log file https://aws-logs-***.s3.amazonaws.com/elasticmapreduce/j-3SOK08VFSQDPO/node/i-04af0a3d2d6d96cac/daemons/emr-on-cluster-env/gateway.log.gz, I found the following:
Jupyter Enterprise Gateway 2.1.0 is available at http://127.0.0.1:9547
User 'root' is not authorized to start kernel 'Python 3'. Ensure KERNEL_USERNAME is set to an appropriate value and retry the request.
User 'root' is not authorized to start kernel 'PySpark'. Ensure KERNEL_USERNAME is set to an appropriate value and retry the request.
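For anyone who wants to pull that log without clicking through the console, a small boto3 sketch; the bucket name is a placeholder, since the real one is redacted above:

import gzip
import boto3

s3 = boto3.client("s3")

# Placeholder bucket patterned on the redacted log URL above.
bucket = "aws-logs-EXAMPLE"
key = ("elasticmapreduce/j-3SOK08VFSQDPO/node/i-04af0a3d2d6d96cac/"
       "daemons/emr-on-cluster-env/gateway.log.gz")

s3.download_file(bucket, key, "gateway.log.gz")
with gzip.open("gateway.log.gz", "rt") as log:
    for line in log:
        if "not authorized" in line:
            print(line.rstrip())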
How I got the notebook kernel to work
Per the Stack Overflow post Notebooks on EMR (AWS): Failed to start kernel, I switched from the root AWS account to an IAM user. This worked with EMR 6.5.0.
My question
What changed when I launched the cluster with an IAM account? How could I have figured out that using the root user is the problem?
EMR is a black box to me. Thanks in advance for helping me understand the inner workings of this amazing technology.
This is the key issue:
User 'root' is not authorized to start kernel 'Python 3'. Ensure KERNEL_USERNAME is set to an appropriate value and retry the request.
User 'root' is not authorized to start kernel 'PySpark'. Ensure KERNEL_USERNAME is set to an appropriate value and retry the request.
You need to create a normal IAM user with EMR permissions, log in as that user, and start the notebook from there. Your main AWS account is the root account. I talked to AWS support and got my notebook running that way.
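A minimal sketch of doing that with boto3; the user name, password, and managed-policy ARN are placeholders, so substitute whatever your organisation actually uses:

import boto3

iam = boto3.client("iam")

# Create a non-root IAM user to own the notebook session.
iam.create_user(UserName="emr-notebook-user")

# Attach an EMR policy (placeholder ARN -- pick the managed or custom policy
# your account standardises on).
iam.attach_user_policy(
    UserName="emr-notebook-user",
    PolicyArn="arn:aws:iam::aws:policy/AmazonEMRFullAccessPolicy_v2",
)

# Give the user console credentials, then sign in as this user (not root)
# before opening the EMR notebook.
iam.create_login_profile(
    UserName="emr-notebook-user",
    Password="CHANGE_ME_Str0ng!",
    PasswordResetRequired=True,
)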

View Spark UI for Jobs executed via Azure ADF

I am not able to view the Spark UI for Databricks jobs executed through a Notebook activity in Azure Data Factory.
Does anyone know which permissions need to be added to enable this?
--Update
Ensure you have cluster-level permissions in the PROD environment.
Job permissions
There are five permission levels for jobs: No Permissions, Can View, Can Manage Run, Is Owner, and Can Manage. Admins are granted the Can Manage permission by default, and they can assign that permission to non-admin users.
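If you prefer to inspect this outside the UI, the Databricks Permissions API can show who holds which level on a job. A rough sketch; the workspace URL, token, and job id are placeholders, so verify the endpoint against your workspace's API docs:

import requests

host = "https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder
token = "dapiXXXXXXXXXXXXXXXX"                                # placeholder PAT

# Read the access control list for a job (placeholder id 83).
resp = requests.get(
    f"{host}/api/2.0/permissions/jobs/83",
    headers={"Authorization": f"Bearer {token}"},
)
resp.raise_for_status()
for acl in resp.json().get("access_control_list", []):
    principal = acl.get("user_name") or acl.get("group_name") or acl.get("service_principal_name")
    print(principal, acl.get("all_permissions"))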
Next...
I am able to view completed jobs in the Spark UI without any additional setup.
It may be that what you are doing in the notebook does not constitute a Spark job.
Refer to Web UI and Monitoring and Instrumentation for more details.
Check out the Spark Glossary:
Note: Job - a parallel computation consisting of multiple tasks that gets spawned in response to a Spark action (e.g. save, collect); you'll see this term used in the driver's logs.
View cluster information in the Apache Spark UI
You can get details about active and terminated clusters. If you restart a terminated cluster, the Spark UI displays information for the restarted cluster, not the historical information for the terminated cluster.
So, if I run a notebook with a task that simply displays my mounts, or with a task that fails with an exception, it is not listed in the Spark UI.
#job/83 is not seen
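To make the distinction concrete, here is a rough illustration; it assumes a Databricks Python notebook, where spark, dbutils, and display are predefined:

# Driver-only call: dbutils.fs.mounts() schedules nothing on the cluster,
# so no job appears in the Spark UI.
display(dbutils.fs.mounts())

# Spark action: count() spawns a job, so this one does show up in the UI.
df = spark.range(1_000_000)
print(df.count())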

Permission denied while running azure databricks notebook as job scheduler

I am trying to run a job scheduler in Azure Databricks. While it runs various notebooks, it fails and shows the error below.
As mentioned by @santoznma in the comments, you don't have jobs access control enabled.
Enabling access control for jobs allows job owners to control who can view job results or manage runs of a job.
To enable it, follow these steps:
1. Go to the Admin Console.
2. Click the Workspace Settings tab.
3. Click the Cluster, Pool and Jobs Access Control toggle.
4. Click Confirm.
Refer - Enable jobs access control for your workspace - Azure Databricks
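Once the toggle is on, the job owner or an admin can grant the scheduling principal the Can Manage Run level, for example via the Permissions API. A sketch; host, token, job id, and user are placeholders, so confirm the endpoint for your workspace:

import requests

host = "https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder
token = "dapiXXXXXXXXXXXXXXXX"                                # placeholder PAT

# Grant CAN_MANAGE_RUN on job 123 to the user that the scheduler runs as.
resp = requests.patch(
    f"{host}/api/2.0/permissions/jobs/123",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "access_control_list": [
            {"user_name": "scheduler@example.com", "permission_level": "CAN_MANAGE_RUN"}
        ]
    },
)
resp.raise_for_status()
print(resp.json())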

Google Cloud Dataflow permission issues

Beginner in GCP here. I'm testing GCP Dataflow as part of an IoT project to move data from Pub/Sub to BigQuery. I created a Dataflow job from the topic page's "Export to BigQuery" button.
Apart from not being able to delete a Dataflow job, I am hitting the following issue:
As soon as the dataflow starts, I get the error:
Workflow failed. Causes: There was a problem refreshing your credentials. Please check: 1. Dataflow API is enabled for your project. 2. Make sure both the Dataflow service account and the controller service account have sufficient permissions. If you are not specifying a controller service account, ensure the default Compute Engine service account [PROJECT_NUMBER]-compute@developer.gserviceaccount.com exists and has sufficient permissions. If you have deleted the default Compute Engine service account, you must specify a controller service account. For more information, see: https://cloud.google.com/dataflow/docs/concepts/security-and-permissions#security_and_permissions_for_pipelines_on_google_cloud_platform. , There is no cloudservices robot account for your project. Please ensure that the Dataflow API is enabled for your project.
Here's where it's funny:
Dataflow API is definitely enabled, since I am looking at this from the Dataflow portion of the console.
Dataflow is using the default Compute Engine service account, which exists. The link it points to says that this account is created automatically and has broad access to the project's resources. Well, does it?
Dataflow eludes me. How can I restart a Dataflow job, or edit or delete it?
Please verify the checklist below:
The Dataflow API should be enabled; check under APIs & Services. If you have just enabled it, wait some time for the change to propagate.
The [project-number]-compute@developer.gserviceaccount.com and service-[project-number]@dataflow-service-producer-prod.iam.gserviceaccount.com service accounts should exist. If the dataflow-service-producer-prod account was not created, you can contact Dataflow support, or create it yourself and assign it the Cloud Dataflow Service Agent role. If you are using a Shared VPC, create it in the host project and assign the Compute Network User role.
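A quick way to check the second point is to look for those two accounts in the project's IAM policy. A sketch using the Google API client; the project number is a placeholder and it assumes Application Default Credentials are configured with a default project:

import google.auth
from googleapiclient import discovery

credentials, project_id = google.auth.default()
crm = discovery.build("cloudresourcemanager", "v1", credentials=credentials)

# Fetch the project-level IAM policy and look for the two service accounts.
policy = crm.projects().getIamPolicy(resource=project_id, body={}).execute()

project_number = "123456789012"  # placeholder: replace with your project number
wanted = {
    f"serviceAccount:{project_number}-compute@developer.gserviceaccount.com",
    f"serviceAccount:service-{project_number}@dataflow-service-producer-prod.iam.gserviceaccount.com",
}
for binding in policy.get("bindings", []):
    for member in binding.get("members", []):
        if member in wanted:
            print(binding["role"], member)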

batch job occasionally gets "does not have bigquery.jobs.create permission" error after multiple successful queries

I have a Python batch job running as a service account that uses bq.cmd to load multiple Datastore backups.
It had been running successfully for 2 years, but recently, in the middle of some runs (after multiple successful loads by the same user into the same dataset), it fails, continuously returning: "does not have bigquery.jobs.create permission".
Restarting the job, with no changes, usually succeeds.
bq.cmd load --quiet --source_format=DATASTORE_BACKUP --project_id=blah-blah --replace project-name:data_set_name.TableName gs://project-datastore-backup/2018-08-30-03_00_01/blahblah.TableName.backup_info
gcloud components are up to date.
Any suggestions welcome
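Since restarting with no changes usually succeeds, one stop-gap (not a fix) is to wrap the same load in a short retry loop. A sketch, assuming bq is on the PATH (use bq.cmd or shell=True on Windows):

import subprocess
import time

# Mirrors the bq load command above; retried a few times with backoff.
CMD = [
    "bq", "load", "--quiet", "--source_format=DATASTORE_BACKUP",
    "--project_id=blah-blah", "--replace",
    "project-name:data_set_name.TableName",
    "gs://project-datastore-backup/2018-08-30-03_00_01/blahblah.TableName.backup_info",
]

for attempt in range(1, 4):
    result = subprocess.run(CMD, capture_output=True, text=True)
    if result.returncode == 0:
        break
    print(f"attempt {attempt} failed: {result.stderr.strip()}")
    time.sleep(30 * attempt)  # back off before retrying
else:
    raise RuntimeError("bq load kept failing; see stderr above")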
There's a public bug report describing a similar issue that was resolved by recreating the service account. If you don't see any actual changes to the IAM permissions in the logs, I'd try a new service account.
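When the failure hits, it can also help to ask IAM directly whether the caller currently holds bigquery.jobs.create. A sketch with the Resource Manager testIamPermissions call, run with the same service account credentials the batch uses; the project id mirrors the placeholder in the question:

import google.auth
from googleapiclient import discovery

credentials, _ = google.auth.default()
crm = discovery.build("cloudresourcemanager", "v1", credentials=credentials)

# Ask which of the listed permissions the active credentials hold right now.
resp = crm.projects().testIamPermissions(
    resource="blah-blah",  # placeholder project id from the question
    body={"permissions": ["bigquery.jobs.create"]},
).execute()

if "bigquery.jobs.create" in resp.get("permissions", []):
    print("bigquery.jobs.create is present at the moment")
else:
    print("bigquery.jobs.create is missing -- consistent with the error")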