YARN jobs getting stuck in ACCEPTED state despite memory available - hadoop-yarn

The cluster goes into a deadlock state and stops allocating containers even when GBs of RAM and vcores are available.
This happened only when we started a lot of jobs in parallel, most of which were Oozie jobs with many forked actions.

After a lot of searching and reading related questions and articles, we came across a property called maxAMShare for the YARN job scheduler (we are using the Fair Scheduler).
What does it mean?
The fraction of memory and vcores from the user's queue share that can be allotted to ApplicationMasters. Default value: 0.5 (50%). Source
How did it cause the deadlock?
When we start multiple Oozie jobs in parallel, each Oozie job and each of its forked actions first needs a couple of ApplicationMaster containers for the Oozie launchers, which then start the other containers that do the actual work of the action.
In our case, we were starting around 20-30 Oozie jobs in parallel, each with close to 20 forked actions. With each action requiring 2 ApplicationMasters, close to 800 containers were being held by the Oozie ApplicationMasters alone.
Because of this, we were hitting the default 50% maxAMShare limit for our user queue, and YARN would not create the new ApplicationMasters needed to run the actual jobs.
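The arithmetic above can be sketched as follows. This is only a rough illustration: the job, action and AM counts come from the post, while the 1 GiB per AM and the 1 TiB queue share are made-up values, and the helper names are hypothetical. The real scheduler also accounts for vcores and computes fair shares dynamically.

```python
def am_containers(jobs: int, actions_per_job: int, ams_per_action: int) -> int:
    """Number of ApplicationMaster containers requested at roughly the same time."""
    return jobs * actions_per_job * ams_per_action


def am_limit_reached(am_count: int, am_memory_mb: int,
                     queue_share_mb: int, max_am_share: float = 0.5) -> bool:
    """True if the AMs alone would need more memory than the queue's AM budget.

    A negative max_am_share disables the check, mirroring the -1.0 setting.
    """
    if max_am_share < 0:
        return False
    return am_count * am_memory_mb > max_am_share * queue_share_mb


low = am_containers(20, 20, 2)    # 20 jobs x 20 actions x 2 AMs = 800
high = am_containers(30, 20, 2)   # 30 jobs x 20 actions x 2 AMs = 1200

# Assumed values: 1 GiB per AM, 1 TiB of memory in the queue's share.
blocked = am_limit_reached(low, am_memory_mb=1024, queue_share_mb=1024 * 1024)
print(low, high, blocked)  # 800 1200 True
```

With these assumed sizes, even the low end of 800 AMs needs more than half of the queue's memory, so no further AMs can start until some finish.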
Solution?
One quick fix would be to disable the check by setting this property to -1.0. But this is not recommended: you can again end up allocating all or most of the resources to AMs, leaving very little for the real work.
The other option (which we went with) is to specify a separate queue for AMs in the Oozie configuration and then set the maxAMShare property to 1.0 for that queue. This way you can control how many resources are allocated to AMs without affecting the other jobs. Reference
<global>
    <configuration>
        <property>
            <name>oozie.launcher.mapred.job.queue.name</name>
            <value>root.users.oozie_am_queue</value>
        </property>
    </configuration>
</global>
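The queue named in that property also has to exist on the scheduler side with its own maxAMShare. A minimal sketch, assuming the usual nested queue layout of the Fair Scheduler allocation file (with the root queue left implicit), builds the fragment for root.users.oozie_am_queue. The helper name am_queue_allocation is made up for illustration, and any other limits you want on the queue would still need to be added by hand.

```python
import xml.etree.ElementTree as ET


def am_queue_allocation(parent: str, queue: str, max_am_share: float) -> str:
    """Build an allocation-file fragment for a child queue with its own maxAMShare."""
    allocations = ET.Element("allocations")
    parent_el = ET.SubElement(allocations, "queue", name=parent)
    queue_el = ET.SubElement(parent_el, "queue", name=queue)
    ET.SubElement(queue_el, "maxAMShare").text = str(max_am_share)
    return ET.tostring(allocations, encoding="unicode")


# Queue names taken from the Oozie property above (root.users.oozie_am_queue).
fragment = am_queue_allocation("users", "oozie_am_queue", 1.0)
print(fragment)
```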
Hope this will be a major time saver for people facing the same issue. There could be many other reasons for a deadlock too, which are already discussed in other questions on SO.

Related

Nlog in hangfire jobs deadlock

I am using Hangfire as a custom workflow engine in a .NET Core application, which works perfectly fine until I start logging. I log to a DB as a requirement, so it is a normal database target. I enabled async=true. Once I turn logging on, I start getting deadlocks. Everything uses DI. Again, all works fine without logging; once I add a rule to write the logs to the DB, I get a deadlock. It's not even a lot of concurrent jobs, only 11 in my test. Help pls?! Already lost a week on this and still stuck.
I am not sure it’s related, but I am also getting this at the same time: asynchronous exception: timeout flushing all targets

IBM APIConnect - task security-appID

I have an instance of APIConnect on premise.
Analyzing the logs, I have seen the task called "security-appID" moving from 10ms execution time to 200ms execution time.
What is the meaning of this task?
This task, I believe, offloads application security requests to other integrations if you have it configured that way. It does not necessarily have anything to do with API Connect; it is probably related to your Bluemix ID, dashboard or landing page and how that is set up. You can probably find more information about it in the BMX docs. https://console.dys0.bluemix.net/docs/services/appid/existing.html#adding-app-id-to-an-existing-app

Long running chef-client executions

I'm using an open-source Chef server managing about 150 nodes.
The Analytics/Reporting module is not activated on the Chef server due to resource constraints.
"chef-client" runs on all the nodes every 30 minutes.
How can I find out how much time each chef-client run takes to complete?
I'm trying to find the nodes that are slowest in completing their chef-client runs.
Chef Server doesn't store this information. You'll need to manage it yourself, possibly using a handler as linked above in the comments. A simple option would be to make a handler which stores the duration of the last run as a node attribute, but the sky is the limit. If you want something to help debug long runs once you find them, check out my poise-profiler cookbook.

YARN Architecture of Hadoop 2.0

From the link below to the Apache Hadoop site, I learned that
ApplicationMaster has the responsibility of negotiating appropriate
resource containers from the Scheduler (ResourceManager)
and also learned that
ApplicationsManager negotiating the first container for executing the
ApplicationMaster
Link : http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html
So here is my confusion.
If the ApplicationMaster has the responsibility to request containers from the ResourceManager, then who creates the first container, and what is the process for creating the first container that executes the ApplicationMaster?
Is there anyone who issues the request to create the first container?
What are the responsibilities of the first container? Does the first container only execute the ApplicationMaster, or does it also behave like the other resource containers?
Please let me know if anyone has any idea about this.
First of all, you are confusing the terms ApplicationManager and ApplicationMaster. They are not the same; have a look at my answer to understand the difference between the Application Manager and the Application Master in YARN.
Answers to your questions are given below:
YarnClient has the responsibility to submit the application to the ResourceManager. It sends an ApplicationSubmissionContext object to the ResourceManager, which represents all of the information the ResourceManager needs to launch the ApplicationMaster for an application.
Yes, YarnClient does that!
The first container is the ApplicationMaster. Its job is to request resources (containers) from the ResourceManager and make application-level decisions. If the ResourceManager provides a sufficient number of containers (as defined by the logic in your ApplicationMaster), the ApplicationMaster can go ahead and launch the application code in those containers. Furthermore, the ApplicationMaster keeps track of failed containers and either relaunches them or terminates the application (killing all other containers), again based on the logic of your ApplicationMaster.
To understand the internals of Hadoop YARN, I would suggest reading the YARN paper or, if you have more time, a book on Hadoop YARN.
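The order of events described in the answer can be sketched with a toy model. This is not the YARN API: the classes ToyResourceManager and ToyApplicationMaster and their methods are hypothetical stand-ins that only count containers, to show that the first container is granted at submission time for the ApplicationMaster, and that the ApplicationMaster then asks for the rest.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyResourceManager:
    """Toy stand-in for the ResourceManager; it only counts free containers."""
    free_containers: int
    log: List[str] = field(default_factory=list)

    def allocate(self, wanted: int, purpose: str) -> int:
        # Grant as many containers as are free, up to the number requested.
        granted = min(wanted, self.free_containers)
        self.free_containers -= granted
        self.log.append(f"{purpose}: asked {wanted}, granted {granted}")
        return granted

    def submit_application(self, app_name: str) -> "ToyApplicationMaster":
        # On submission, the first container is set aside for the ApplicationMaster.
        if self.allocate(1, f"first container (AM) for {app_name}") == 0:
            raise RuntimeError("no container free for the ApplicationMaster")
        return ToyApplicationMaster(app_name, self)


@dataclass
class ToyApplicationMaster:
    """Toy stand-in for an ApplicationMaster running in the first container."""
    app_name: str
    rm: ToyResourceManager

    def request_containers(self, wanted: int) -> int:
        # The AM, not the client, negotiates the containers for the actual work.
        return self.rm.allocate(wanted, f"task containers for {self.app_name}")


rm = ToyResourceManager(free_containers=4)
am = rm.submit_application("wordcount")  # plays the role of the client's submission
granted = am.request_containers(5)       # only 3 are left after the AM's container
print(granted, rm.free_containers)       # 3 0
for entry in rm.log:
    print(entry)
```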

How to submit code to a remote Spark cluster from IntelliJ IDEA

I have two clusters, one in a local virtual machine and another in a remote cloud. Both clusters are in Standalone mode.
My Environment:
Scala: 2.10.4
Spark: 1.5.1
JDK: 1.8.40
OS: CentOS Linux release 7.1.1503 (Core)
The local cluster:
Spark Master: spark://local1:7077
The remote cluster:
Spark Master: spark://remote1:7077
I want to do this:
Write code (just a simple word count) in IntelliJ IDEA locally (on my laptop), set the Spark Master URL to spark://local1:7077 or spark://remote1:7077, and then run my code in IntelliJ IDEA. That is, I don't want to use spark-submit to submit a job.
But I ran into a problem:
When I use the local cluster, everything goes well. Both running the code in IntelliJ IDEA and using spark-submit submit the job to the cluster, and the job finishes.
But when I use the remote cluster, I get a warning log:
TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
Note that it says sufficient resources, not sufficient memory!
And this log keeps printing, with no further action. Both spark-submit and running the code in IntelliJ IDEA give the same result.
I want to know:
Is it possible to submit code from IntelliJ IDEA to a remote cluster?
If so, does it need any configuration?
What are the possible reasons for my problem?
How can I handle this problem?
Thanks a lot!
Update
There is a similar question here, but I think my scenario is different. When I run my code in IntelliJ IDEA with the Spark Master set to the local virtual machine cluster, it works. But with the remote cluster I get the Initial job has not accepted any resources;... warning instead.
I want to know whether a security policy or a firewall could cause this?
Submitting code programmatically (e.g. via SparkSubmit) is quite tricky. At the least, there is a variety of environment settings and considerations, handled by the spark-submit script, that are quite difficult to replicate within a Scala program. I am still uncertain how to achieve it, and there have been a number of long-running threads within the Spark developer community on the topic.
My answer here is about a portion of your post: specifically the
TaskSchedulerImpl: Initial job has not accepted any resources; check
your cluster UI to ensure that workers are registered and have
sufficient resources
The reason is typically a mismatch between the memory and/or number of cores requested by your job and what is available on the cluster. Possibly when submitting from IJ the
$SPARK_HOME/conf/spark-defaults.conf
did not properly match the parameters required for your task on the existing cluster. You may need to update:
spark.driver.memory 4g
spark.executor.memory 8g
spark.executor.cores 8
You can check the Spark UI on port 8080 to verify that the resources you requested are actually available on the cluster.
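The mismatch described above can be sketched as a simple check. This is a rough illustration, not Spark's scheduling code: it assumes that an executor can only start on a worker that has at least the requested cores and memory free. The Worker class, the worker names and their sizes are made up; on a real cluster you would read these numbers from the master UI.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Worker:
    """Free resources of one worker, as you might read them off the master UI."""
    name: str
    free_cores: int
    free_memory_gb: int


def workers_that_fit(workers: List[Worker], executor_cores: int,
                     executor_memory_gb: int) -> List[str]:
    """Names of workers with enough free cores and memory for one executor."""
    return [
        w.name
        for w in workers
        if w.free_cores >= executor_cores and w.free_memory_gb >= executor_memory_gb
    ]


# Hypothetical cluster: two workers with different free resources.
workers = [Worker("worker1", free_cores=4, free_memory_gb=6),
           Worker("worker2", free_cores=8, free_memory_gb=16)]

# spark.executor.cores 8 and spark.executor.memory 8g, as in the settings above.
print(workers_that_fit(workers, executor_cores=8, executor_memory_gb=8))   # ['worker2']
# Asking for more cores than any worker has leaves the job waiting for resources.
print(workers_that_fit(workers, executor_cores=16, executor_memory_gb=8))  # []
```

An empty result corresponds to the situation where the "Initial job has not accepted any resources" warning keeps repeating.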