I am trying to run a job scheduler in Azure Databricks. While it is running various notebooks, it fails and shows the error below.
As mentioned by @santoznma in the comments, you don't have jobs access control enabled.
Enabling access control for jobs allows job owners to control who can view job results or manage runs of a job.
To enable it, follow the steps below:
1. Go to the Admin Console.
2. Click the Workspace Settings tab.
3. Click the Cluster, Pool and Jobs Access Control toggle.
4. Click Confirm.
Reference: Enable jobs access control for your workspace - Azure Databricks
Related
I am not able to view the Spark UI for Databricks jobs executed through a notebook activity in Azure Data Factory.
Does anyone know which permissions need to be added to enable this?
--Update
Ensure you have cluster-level permissions on the PROD environment.
Job permissions
There are five permission levels for jobs: No Permissions, Can View,
Can Manage Run, Is Owner, and Can Manage. Admins are granted the Can
Manage permission by default, and they can assign that permission to
non-admin users.
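As a rough illustration of how these permission levels might be assigned programmatically, here is a minimal sketch that only builds the request path and JSON body, without sending anything. The endpoint path, the `access_control_list` field, and the upper-case level identifiers are my recollection of the Databricks Permissions REST API and should be checked against the official reference; `build_job_permission_request` and the user name are hypothetical.

```python
import json

# Permission levels named in the quoted docs, mapped to the identifiers the
# Permissions REST API is assumed to use (verify against the API reference).
# "No Permissions" has no identifier: it is the absence of an entry.
JOB_PERMISSION_LEVELS = {
    "Can View": "CAN_VIEW",
    "Can Manage Run": "CAN_MANAGE_RUN",
    "Is Owner": "IS_OWNER",
    "Can Manage": "CAN_MANAGE",
}


def build_job_permission_request(job_id, user_name, level):
    """Return (path, body) for granting `level` on job `job_id` to `user_name`.

    Hypothetical helper: it only assembles the request, it does not call the API.
    """
    if level not in JOB_PERMISSION_LEVELS:
        raise ValueError(f"unknown permission level: {level!r}")
    path = f"/api/2.0/permissions/jobs/{job_id}"  # assumed endpoint path
    body = {
        "access_control_list": [
            {
                "user_name": user_name,
                "permission_level": JOB_PERMISSION_LEVELS[level],
            }
        ]
    }
    return path, body


# Example with a placeholder user; an admin would send this with their own
# workspace URL and token.
path, body = build_job_permission_request(83, "user@example.com", "Can View")
print(path)
print(json.dumps(body, indent=2))
```

Keeping the payload construction separate from the HTTP call makes it easy to review what would be granted before anything is sent.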
Next...
I am able to view the completed jobs in the Spark UI without any additional setup.
It may be that what you are doing in the notebook does not constitute a Spark job.
Refer to Web UI and Monitoring and Instrumentation for more details.
Check out the Spark Glossary:
Note: Job: A parallel computation consisting of multiple tasks that gets spawned in response to a Spark action (e.g. save, collect);
you'll see this term used in the driver's logs.
View cluster information in the Apache Spark UI
You can get details about active and terminated clusters. If you
restart a terminated cluster, the Spark UI displays information for
the restarted cluster, not the historical information for the
terminated cluster.
So if I run a notebook whose task simply displays my mounts, or whose task fails or throws an exception, it is not listed in the Spark UI.
#job/83 is not seen
Beginner in GCP here. I'm testing GCP Dataflow as part of an IoT project to move data from Pub/Sub to BigQuery. I created a Dataflow job from the "Export to BigQuery" button on the topic's page.
Apart from the issue that I can't delete a dataflow, I am hitting the following issue:
As soon as the dataflow starts, I get the error:
Workflow failed. Causes: There was a problem refreshing your credentials. Please check: 1. Dataflow API is enabled for your project. 2. Make sure both the Dataflow service account and the controller service account have sufficient permissions. If you are not specifying a controller service account, ensure the default Compute Engine service account [PROJECT_NUMBER]-compute@developer.gserviceaccount.com exists and has sufficient permissions. If you have deleted the default Compute Engine service account, you must specify a controller service account. For more information, see: https://cloud.google.com/dataflow/docs/concepts/security-and-permissions#security_and_permissions_for_pipelines_on_google_cloud_platform. , There is no cloudservices robot account for your project. Please ensure that the Dataflow API is enabled for your project.
Here's where it's funny:
Dataflow API is definitely enabled, since I am looking at this from the Dataflow portion of the console.
Dataflow is using the default Compute Engine service account, which exists. The link it points to says that this account is created automatically and has broad access to the project's resources. Well, does it?
Dataflow jobs elude me. How can I tell a Dataflow job to restart, or edit or delete it?
Please verify the checklist below:
1. The Dataflow API should be enabled; check under APIs & Services. If you have just enabled it, wait some time for the change to take effect.
2. The service accounts [project-number]-compute@developer.gserviceaccount.com and service-[project-number]@dataflow-service-producer-prod.iam.gserviceaccount.com should both exist. If the dataflow-service-producer-prod account did not get created, you can contact Dataflow support, or you can create it and assign it the Cloud Dataflow Service Agent role. If you are using a Shared VPC, create it in the host project and assign the Compute Network User role.
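To make the second check concrete, here is a small sketch that derives the two expected service account emails from a project number and reports which ones are missing from a list you supply (for example, copied from the IAM page). The helper names, the project number, and the sample listing are hypothetical; the email patterns are the ones given in the checklist above.

```python
def expected_dataflow_accounts(project_number):
    """Return the two service account emails named in the checklist."""
    return [
        # Default Compute Engine service account (the controller account).
        f"{project_number}-compute@developer.gserviceaccount.com",
        # Dataflow service agent account.
        f"service-{project_number}@dataflow-service-producer-prod.iam.gserviceaccount.com",
    ]


def missing_dataflow_accounts(project_number, existing_emails):
    """List the expected accounts that do not appear in `existing_emails`."""
    present = {email.strip().lower() for email in existing_emails}
    return [
        account
        for account in expected_dataflow_accounts(project_number)
        if account not in present
    ]


# Hypothetical project number and IAM listing, for illustration only.
existing = ["123456789012-compute@developer.gserviceaccount.com"]
for account in missing_dataflow_accounts("123456789012", existing):
    print("missing:", account)
```

With the sample listing above, only the Dataflow service agent account is reported as missing, which matches the "no cloudservices robot account" symptom in the question.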
I am facing an issue while running the "AzureActivity" query in Azure Monitor Logs. I am using a free trial subscription; I created a VM and tried to run the query, but it fails. Please see the screenshot below.
You can create a Log Analytics workspace and send the activity logs to it:
1. Create a Log Analytics workspace.
2. Go to the Azure portal -> your VM -> Activity log page, and click the Diagnostic settings button.
3. In Diagnostic settings, click the Add diagnostic setting button.
4. Send all the logs to the Log Analytics workspace.
Finally, try the query in that Log Analytics workspace.
I have an SSIS package that runs when I start it manually from Integration Services. But when I run it from a job, it runs but no data shows up. There seems to be some permission issue. Can somebody tell me what permissions are required for running a package from a SQL Server job?
State the error message.
If you are using a flat file connection manager, and that's where the error is occurring, click 'Start', then 'Computer', and check that you are mapped to that drive. If not, click the tab in the upper-right corner to map the drive; then, when you access the file through SSIS, you shouldn't get an error.
If the package runs successfully as a job using the SQL Server Agent then you have the permissions set right for the database side.
However, if you are accessing any external data such as flat files, make sure the agent is able to access those locations. Your Windows account may have permission to access them when you run the package in Visual Studio, but the agent service running the job requires those permissions too.
If this is not the case, can you clarify what your package does and share any messages you receive from the catalog reports, so I can help further?
I created a Contextual Advertising (Hive Script) job flow in AWS MapReduce. I chose 'Start interactive Hive Session', selected m1.small instances, proceeded without a VPC subnet ID, and chose Configure Hadoop under Configure Bootstrap Actions.
Now the job flow goes into the starting state, and after 15-20 minutes it goes into the failed state; it never reaches the waiting state.
It shows "Last State Change Reason: User account is not authorized to call EC2 "
I gave myself PowerUserAccess through IAM. I have also given myself the policies below:
1. AmazonEC2FullAccess
2. AmazonElasticMapReduceFullAccess
3. IAMFullAccess
Even after granting all these policies, it still shows "User account is not authorized to call EC2".
Please advise. Thanks.
EMR builds on other AWS services that you also need to subscribe to. Granting IAM privileges to call EC2, S3, and EMR is not sufficient.